feat: support hcl #210

maunzCache · 2025-02-16T14:00:35+01:00

maunzCache commented

2025-02-16 14:00:35 +01:00

This PR should support HCL (HashiCorp Language) which is used in common infrastructure as code projects such as Terraform. Terraform uses a different file ending thus I'd like also support the *.tf extension.

The upstream grammar has an open PR to support mergiraf tree-sitter, so i'll leave this PR open until it got merged or I'll fix it in my fork.

This PR should support HCL (HashiCorp Language) which is used in common infrastructure as code projects such as Terraform. Terraform uses a different file ending thus I'd like also support the *.tf extension. The upstream grammar has an open PR to support mergiraf tree-sitter, so i'll leave this PR open until it got merged or I'll fix it in my fork.

🎉 1

wetneb commented

2025-02-16 23:12:25 +01:00

Nice, thanks a lot! It looks like your PR to update the crate is on its way to getting merged.
I see you added an example: could you perhaps turn that into some merging scenario(s), like the other tests in the examples/ directory?

Nice, thanks a lot! It looks like your PR to update the crate is on its way to getting merged. I see you added an example: could you perhaps turn that into some merging scenario(s), like the other tests in the `examples/` directory?

👍 1

maunzCache commented

2025-02-23 13:50:36 +01:00

@wetneb wrote in #210 (comment):

Nice, thanks a lot! It looks like your PR to update the crate is on its way to getting merged. I see you added an example: could you perhaps turn that into some merging scenario(s), like the other tests in the examples/ directory?

Can you give me a hint about what the difference of the failing/working test directories is? I was not able to comprehend it based on what the other examples provide.

@wetneb wrote in https://codeberg.org/mergiraf/mergiraf/pulls/210#issuecomment-2833348: > Nice, thanks a lot! It looks like your PR to update the crate is on its way to getting merged. I see you added an example: could you perhaps turn that into some merging scenario(s), like the other tests in the `examples/` directory? Can you give me a hint about what the difference of the failing/working test directories is? I was not able to comprehend it based on what the other examples provide.

wetneb commented

2025-02-23 14:02:07 +01:00

Yes, test cases that currently expected to pass go in the working directory. The ones in the failing directory are known failures that we could want to address at some point.

Yes, test cases that currently expected to pass go in the `working` directory. The ones in the `failing` directory are known failures that we could want to address at some point.

maunzCache force-pushed feat/hcl from fab56e1af9 to f110ecc7f2

2025-03-08 13:08:01 +01:00

Compare

maunzCache force-pushed feat/hcl from f110ecc7f2 to 7376c34be0

2025-03-08 13:09:21 +01:00

Compare

brc commented

2025-03-27 04:38:26 +01:00

First-time contributor

Thank you for this @maunzCache. Just discovered mergiraf and my initial use-case is for a huge Terraform mess. I was reading the mergiraf docs about adding a language and it appears we eventually want to define commutative parents and signatures as well (probably in a separate PR). Thanks again for getting this started! 👍

Thank you for this @maunzCache. Just discovered mergiraf and my initial use-case is for a huge Terraform mess. I was reading the mergiraf docs about [adding a language](https://mergiraf.org/adding-a-language.html) and it appears we eventually want to define commutative parents and signatures as well (probably in a separate PR). Thanks again for getting this started! 👍

maunzCache commented

2025-03-27 09:22:07 +01:00

@brc wrote in #210 (comment):

Thank you for this @maunzCache. Just discovered mergiraf and my initial use-case is for a huge Terraform mess. I was reading the mergiraf docs about adding a language and it appears we eventually want to define commutative parents and signatures as well (probably in a separate PR). Thanks again for getting this started! 👍

In case you have a merge conflict that makes for a good test, let me know. I am still a bit struggling with the testing.

@brc wrote in https://codeberg.org/mergiraf/mergiraf/pulls/210#issuecomment-3117926: > Thank you for this @maunzCache. Just discovered mergiraf and my initial use-case is for a huge Terraform mess. I was reading the mergiraf docs about [adding a language](https://mergiraf.org/adding-a-language.html) and it appears we eventually want to define commutative parents and signatures as well (probably in a separate PR). Thanks again for getting this started! :+1: In case you have a merge conflict that makes for a good test, let me know. I am still a bit struggling with the testing.

ada4a commented

2025-04-12 12:43:06 +02:00

@maunzCache wrote in #210 (comment):

In case you have a merge conflict that makes for a good test, let me know. I am still a bit struggling with the testing.

I mean the tests are normally there precisely to test commutative merging stuff, so if you haven't defined anything there then I'd say the tests are not strictly necessary.

I did go ahead and make the following sanity-checking test though:

# Base.hcl
variable "foo" {
  description = "hello world"
}

# Left.hcl
variable "foo" {
  description = "hello left"
}

# Right.hcl
variable "foo" {
  description = "hello right"
}

And got the following output:

# Expected.hcl
variable "foo" {
<<<<<<< LEFT
  description = "hello left"
||||||| BASE
  description = "hello world"
=======
  description = "hello right"
>>>>>>> RIGHT
}

Looks pretty sensible to me.

The compact output was somewhat more surprising:

# ExpectedCompact.hcl
variable "foo" {
  description = "
<<<<<<< LEFT
hello left
||||||| BASE
hello world
=======
hello right
>>>>>>> RIGHT
"
}

I would've expect the quotes to stay around the strings. But an identical test in Rust has the same behaviour, so I guess it's fine

@maunzCache wrote in https://codeberg.org/mergiraf/mergiraf/pulls/210#issuecomment-3118270: > In case you have a merge conflict that makes for a good test, let me know. I am still a bit struggling with the testing. I mean the tests are normally there precisely to test commutative merging stuff, so if you haven't defined anything there then I'd say the tests are not _strictly_ necessary. I did go ahead and make the following sanity-checking test though: ```hcl # Base.hcl variable "foo" { description = "hello world" } # Left.hcl variable "foo" { description = "hello left" } # Right.hcl variable "foo" { description = "hello right" } ``` And got the following output: ```hcl # Expected.hcl variable "foo" { <<<<<<< LEFT description = "hello left" ||||||| BASE description = "hello world" ======= description = "hello right" >>>>>>> RIGHT } ``` Looks pretty sensible to me. The compact output was somewhat more surprising: ```hcl # ExpectedCompact.hcl variable "foo" { description = " <<<<<<< LEFT hello left ||||||| BASE hello world ======= hello right >>>>>>> RIGHT " } ``` I would've expect the quotes to stay around the strings. But an identical test in Rust has the same behaviour, so I guess it's fine

ada4a commented

2025-04-12 12:46:34 +02:00

I did have one problem when rebasing your branch onto main though -- we have since updated to tree-sitter 0.25, but tree-sitter-hcl declares compatibility with 0.24 only. You might want to look into whether you could support 0.25 as well?

I did have one problem when rebasing your branch onto main though -- we have since updated to tree-sitter 0.25, but `tree-sitter-hcl` declares compatibility with 0.24 only. You might want to look into whether you could support 0.25 as well?

maunzCache commented

2025-04-12 12:58:56 +02:00

@ada4a wrote in #210 (comment):

I did have one problem when rebasing your branch onto main though -- we have since updated to tree-sitter 0.25, but tree-sitter-hcl declares compatibility with 0.24 only. You might want to look into whether you could support 0.25 as well?

Already on it https://github.com/tree-sitter-grammars/tree-sitter-hcl/pull/62

I think i removed my upstream branch for 0.24 so you will need to use https://github.com/maunzCache/tree-sitter-hcl/tree/feat/support-tree-sitter-0.25 instead.

Sorry, but there was no time for me to rebase this PR yet.

@ada4a wrote in https://codeberg.org/mergiraf/mergiraf/pulls/210#issuecomment-3795863: > I did have one problem when rebasing your branch onto main though -- we have since updated to tree-sitter 0.25, but `tree-sitter-hcl` declares compatibility with 0.24 only. You might want to look into whether you could support 0.25 as well? Already on it https://github.com/tree-sitter-grammars/tree-sitter-hcl/pull/62 I think i removed my upstream branch for 0.24 so you will need to use https://github.com/maunzCache/tree-sitter-hcl/tree/feat/support-tree-sitter-0.25 instead. Sorry, but there was no time for me to rebase this PR yet.

ada4a commented

2025-04-12 13:21:03 +02:00

Already on it https://github.com/tree-sitter-grammars/tree-sitter-hcl/pull/62

Nice!

I think i removed my upstream branch for 0.24 so you will need to use https://github.com/maunzCache/tree-sitter-hcl/tree/feat/support-tree-sitter-0.25 instead.

Yeah I just used the main branch for now, but good to know

Sorry, but there was no time for me to rebase this PR yet.

No need to apologize. The "conflicts" were fairly trivial anyway (namely, another language was added in the meantime)

> Already on it https://github.com/tree-sitter-grammars/tree-sitter-hcl/pull/62 Nice! > I think i removed my upstream branch for 0.24 so you will need to use https://github.com/maunzCache/tree-sitter-hcl/tree/feat/support-tree-sitter-0.25 instead. Yeah I just used the main branch for now, but good to know > Sorry, but there was no time for me to rebase this PR yet. No need to apologize. The "conflicts" were fairly trivial anyway (namely, another language was added in the meantime)

maunzCache commented

2025-04-16 09:08:15 +02:00

The 0.25 was merged so I will try to update this PR over the easter holidays. Please note https://github.com/tree-sitter-grammars/tree-sitter-hcl/issues/64 . There is no official build, new release or anything yet so i might need to include the dependency as git reference for it to work. If that is a problem for mergiraf, we might consider to postpone this PR further.

As i also updated the interface to align with the upstream way of creating tree-sitter-grammars, this PR cannot simply be rebased. Previously tree_sitter_hcl::language() would be used to reference the language, but now its properly migrated to tree_sitter_hcl::LANGUAGE.into() (if i did everything right at least)

The 0.25 was merged so I will try to update this PR over the easter holidays. Please note https://github.com/tree-sitter-grammars/tree-sitter-hcl/issues/64 . There is no official build, new release or anything yet so i might need to include the dependency as git reference for it to work. If that is a problem for mergiraf, we might consider to postpone this PR further. As i also updated the interface to align with the upstream way of creating tree-sitter-grammars, this PR cannot simply be rebased. Previously `tree_sitter_hcl::language()` would be used to reference the language, but now its properly migrated to `tree_sitter_hcl::LANGUAGE.into()` (if i did everything right at least)

wetneb referenced this pull request

2025-04-16 09:40:44 +02:00

Mark all strings as atomic #325

maunzCache force-pushed feat/hcl from 7376c34be0 to 8dea886673

2025-05-06 20:00:29 +02:00

Compare

maunzCache commented

2025-05-06 20:31:07 +02:00

Alright, i got the upstream changes back into my branch and added a simple test. While tinkering around with a more complex test, i got a panic in the test tool so it might be worth to investigate or i am just not good at writing tests ;)

I will open this PR for a review now. Thanks for being patient with this. Please feel free to recommend any changes. Unfortunately the upstream grammar project does not ship an artifact to cargo so i fixed the commit hash to what is required to be compatible with mergiraf for now. I also created a ticket to publish to cargo in the project, but i doubt this will happen soonish.

Alright, i got the upstream changes back into my branch and added a simple test. While tinkering around with a more complex test, i got a panic in the test tool so it might be worth to investigate or i am just not good at writing tests ;) I will open this PR for a review now. Thanks for being patient with this. Please feel free to recommend any changes. Unfortunately the upstream grammar project does not ship an artifact to cargo so i fixed the commit hash to what is required to be compatible with mergiraf for now. I also created a ticket to publish to cargo in the project, but i doubt this will happen soonish.

maunzCache changed title from ~~WIP: feat: support hcl~~ to feat: support hcl

2025-05-06 20:34:22 +02:00

ada4a reviewed

2025-05-06 20:52:13 +02:00

src/supported_langs.rs Outdated

					
				@ -688,1 +688,4 @@

				        },

				        LangProfile {

				            name: "HCL",

				            alternate_names: &[],

ada4a commented

2025-05-06 20:51:58 +02:00

Do you think it would make sense to add "terraform" to alternate_names? Is that a name that people (often) use to refer to this language?

alternate_names is used to recognize a manually specified language (#327) by its name

Do you think it would make sense to add "terraform" to `alternate_names`? Is that a name that people (often) use to refer to this language? `alternate_names` is used to recognize a manually specified language (#327) by its name

maunzCache commented

2025-05-06 21:03:19 +02:00

I am a bit divided on it.

The syntax is HCL and that is what its usually known for. Terraform adds a few keywords to the syntax which are not part of the HCL definition, but there is no "dialect" or "interpretation" called Terraform. If i'd compare it with SQL i'd say that MySQL and MSSQL would totally be alternate names - altough dialects.

If we'd add Terraform here, I'd also expect all other tools that use the HCL syntax to be part of the alternate name and then it defeats the purpose.

As i am writing this, i think i would not add it

I am a bit divided on it. The syntax is HCL and that is what its usually known for. Terraform adds a few keywords to the syntax which are not part of the HCL definition, but there is no "dialect" or "interpretation" called Terraform. If i'd compare it with SQL i'd say that MySQL and MSSQL would totally be alternate names - altough dialects. If we'd add Terraform here, I'd also expect all other tools that use the HCL syntax to be part of the alternate name and then it defeats the purpose. As i am writing this, i think i would not add it

ada4a commented

2025-05-06 21:07:41 +02:00

Thank you very much for the context! I don't have any experience with that ecosystem, but your argumentation sounds logical and so I agree with your conclusion:)

ada4a marked this conversation as resolved

ada4a commented

2025-05-06 20:56:32 +02:00

it might be worth to investigate

Definitely! Do you remember what test case caused the panic?

Thanks for being patient with this.

No problem at all:)

Please feel free to recommend any changes.

I'd say the PR looks pretty uncontroversial! It's only some more complex test cases that would warrant discussion imo

> it might be worth to investigate Definitely! Do you remember what test case caused the panic? > Thanks for being patient with this. No problem at all:) > Please feel free to recommend any changes. I'd say the PR looks pretty uncontroversial! It's only some more complex test cases that would warrant discussion imo

maunzCache commented

2025-05-06 21:18:30 +02:00

@ada4a wrote in #210 (comment):

it might be worth to investigate

Definitely! Do you remember what test case caused the panic?

It was a file descriptor error which would not write the debug output

$ helpers/inspect.sh examples/hcl/working/sanity_check/
DEBUG line-based merge took 5.816119ms
DEBUG re-constructing revisions from parsed merge took 1.874µs
DEBUG initializing the parser took 4.589µs
DEBUG parsing all three files took 527.914µs
DEBUG matching base to left
DEBUG matching base to right
DEBUG top-down phase yielded 0 matches
DEBUG top-down phase yielded 0 matches
DEBUG discarding match with similarity 0.35, close to threshold 0.4
DEBUG matching took 963.945µs
DEBUG matching took 1.048634ms
DEBUG matching left to right
DEBUG top-down phase yielded 0 matches
DEBUG matching took 473.591µs
DEBUG matching all three pairs took 1.891872ms
DEBUG constructing the classmapping took 233.63µs
DEBUG generating PCS triples

thread '<unnamed>' panicked at src/changeset.rs:173:34:
Unable to write changeset file: Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }

It happened because my test was wrong. I forgot parts in the Expected.hcl that i previously added to Left.hcl, so basically a wrong test.

@ada4a wrote in https://codeberg.org/mergiraf/mergiraf/pulls/210#issuecomment-4176029: > > it might be worth to investigate > > Definitely! Do you remember what test case caused the panic? It was a file descriptor error which would not write the debug output ```sh $ helpers/inspect.sh examples/hcl/working/sanity_check/ DEBUG line-based merge took 5.816119ms DEBUG re-constructing revisions from parsed merge took 1.874µs DEBUG initializing the parser took 4.589µs DEBUG parsing all three files took 527.914µs DEBUG matching base to left DEBUG matching base to right DEBUG top-down phase yielded 0 matches DEBUG top-down phase yielded 0 matches DEBUG discarding match with similarity 0.35, close to threshold 0.4 DEBUG matching took 963.945µs DEBUG matching took 1.048634ms DEBUG matching left to right DEBUG top-down phase yielded 0 matches DEBUG matching took 473.591µs DEBUG matching all three pairs took 1.891872ms DEBUG constructing the classmapping took 233.63µs DEBUG generating PCS triples thread '<unnamed>' panicked at src/changeset.rs:173:34: Unable to write changeset file: Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" } ``` It happened because my test was wrong. I forgot parts in the Expected.hcl that i previously added to Left.hcl, so basically a wrong test.

wetneb requested changes

2025-05-06 23:15:24 +02:00

Dismissed

wetneb left a comment

Thanks a lot for updating this!
Before merging this we'd need to change the dependency to an official release, otherwise it will prevent us from publishing Mergiraf on crates.io.
Concerning the test, it's a bit unusual that the expected test output is identical to a line-based merge - surely it should be possible to give an example where we're benefiting from the syntax-awareness, no?
At the moment, even if we replaced the HCL parser by a Python one, the test would still pass (except for the compact output, ok…)

Thanks a lot for updating this! Before merging this we'd need to change the dependency to an official release, otherwise it will prevent us from publishing Mergiraf on crates.io. Concerning the test, it's a bit unusual that the expected test output is identical to a line-based merge - surely it should be possible to give an example where we're benefiting from the syntax-awareness, no? At the moment, even if we replaced the HCL parser by a Python one, the test would still pass (except for the compact output, ok…)

maunzCache commented

2025-05-07 06:43:32 +02:00

@wetneb wrote in #210 (comment):

Thanks a lot for updating this! Before merging this we'd need to change the dependency to an official release, otherwise it will prevent us from publishing Mergiraf on crates.io.

Would it be enough to get a tagged branch or is it required to have a downloadable artifact at crate.io for it?

Concerning the test, it's a bit unusual that the expected test output is identical to a line-based merge - surely it should be possible to give an example where we're benefiting from the syntax-awareness, no? At the moment, even if we replaced the HCL parser by a Python one, the test would still pass (except for the compact output, ok…)

That is a valid suggestion.

@wetneb wrote in https://codeberg.org/mergiraf/mergiraf/pulls/210#issuecomment-4176788: > Thanks a lot for updating this! Before merging this we'd need to change the dependency to an official release, otherwise it will prevent us from publishing Mergiraf on crates.io. Would it be enough to get a tagged branch or is it required to have a downloadable artifact at crate.io for it? > Concerning the test, it's a bit unusual that the expected test output is identical to a line-based merge - surely it should be possible to give an example where we're benefiting from the syntax-awareness, no? At the moment, even if we replaced the HCL parser by a Python one, the test would still pass (except for the compact output, ok…) That is a valid suggestion.

maunzCache commented

2025-05-07 07:33:07 +02:00

I updated the test to what i think is a better example showcasing the syntax, but now it panicks again with the bad file descriptor. I must admit, that i am at my knowledge limit here und would appreciate any help.

$cargo --version
cargo 1.86.0 (adf9b6ad1 2025-02-28)

$ helpers/inspect.sh examples/hcl/working/sanity_check/
DEBUG line-based merge took 6.577884ms
DEBUG re-constructing revisions from parsed merge took 1.734µs
DEBUG initializing the parser took 3.346µs
DEBUG parsing all three files took 505.99µs
DEBUG matching base to left
DEBUG matching base to right
DEBUG top-down phase yielded 0 matches
DEBUG top-down phase yielded 0 matches
DEBUG matching took 885.153µs
DEBUG matching took 1.196698ms
DEBUG matching left to right
DEBUG top-down phase yielded 18 matches
DEBUG matching took 640.262µs
DEBUG matching all three pairs took 2.210101ms
DEBUG constructing the classmapping took 199.214µs
DEBUG generating PCS triples

thread '<unnamed>' panicked at src/changeset.rs:173:34:
Unable to write changeset file: Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
/home/user/Projekte/codeberg/maunzCache/mergiraf/helpers/run.sh: Zeile 15: 59522 Abgebrochen             (Speicherabzug geschrieben) ${script_dir}/../target/debug/mergiraf merge $1/Base.$ext $1/Left.$ext $1/Right.$ext -s BASE -x LEFT -y RIGHT $extra_args
------ RESULT ------
------ BASE --------
configuration {
  service_name = "bar-service"
}
------ LEFT --------
variable "foo" {
  type        = string
  description = "you know the drill"
  default     = "bar"
}

configuration {
  service_name = "${var.foo}-service"
}
------ RIGHT --------
variable "foo" {
  type        = list(string)
  description = "you know the drill"
  default     = ["baz"]
}

configuration {
  for_each = toset(var.foo)

  service_name = "${each.key}-service"
}
------ EXPECTED --------
variable "foo" {
  type        = list(string)
  description = "you know the drill"
  default     = ["bar", "baz"]
}

configuration {
  for_each = toset(var.foo)

  service_name = "${each.key}-service"
}
------ diff ------
*** /tmp/expected59506  2025-05-07 07:37:57.662419935 +0200
--- /tmp/out59506       2025-05-07 07:37:57.466419400 +0200
***************
*** 1,11 ****
- variable "foo" {
-   type        = list(string)
-   description = "you know the drill"
-   default     = ["bar", "baz"]
- }
- 
- configuration {
-   for_each = toset(var.foo)
- 
-   service_name = "${each.key}-service"
- }
--- 0 ----

What i noticed is that my expected output is not matching with what CI thinks and also my cargo version differs in the commit hash in CI. I assume that there is an issue with the grammar not recognizing the second value.

I updated the test to what i think is a better example showcasing the syntax, but now it panicks again with the bad file descriptor. I must admit, that i am at my knowledge limit here und would appreciate any help. ```sh $cargo --version cargo 1.86.0 (adf9b6ad1 2025-02-28) $ helpers/inspect.sh examples/hcl/working/sanity_check/ DEBUG line-based merge took 6.577884ms DEBUG re-constructing revisions from parsed merge took 1.734µs DEBUG initializing the parser took 3.346µs DEBUG parsing all three files took 505.99µs DEBUG matching base to left DEBUG matching base to right DEBUG top-down phase yielded 0 matches DEBUG top-down phase yielded 0 matches DEBUG matching took 885.153µs DEBUG matching took 1.196698ms DEBUG matching left to right DEBUG top-down phase yielded 18 matches DEBUG matching took 640.262µs DEBUG matching all three pairs took 2.210101ms DEBUG constructing the classmapping took 199.214µs DEBUG generating PCS triples thread '<unnamed>' panicked at src/changeset.rs:173:34: Unable to write changeset file: Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" } note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace /home/user/Projekte/codeberg/maunzCache/mergiraf/helpers/run.sh: Zeile 15: 59522 Abgebrochen (Speicherabzug geschrieben) ${script_dir}/../target/debug/mergiraf merge $1/Base.$ext $1/Left.$ext $1/Right.$ext -s BASE -x LEFT -y RIGHT $extra_args ------ RESULT ------ ------ BASE -------- configuration { service_name = "bar-service" } ------ LEFT -------- variable "foo" { type = string description = "you know the drill" default = "bar" } configuration { service_name = "${var.foo}-service" } ------ RIGHT -------- variable "foo" { type = list(string) description = "you know the drill" default = ["baz"] } configuration { for_each = toset(var.foo) service_name = "${each.key}-service" } ------ EXPECTED -------- variable "foo" { type = list(string) description = "you know the drill" default = ["bar", "baz"] } configuration { for_each = toset(var.foo) service_name = "${each.key}-service" } ------ diff ------ *** /tmp/expected59506 2025-05-07 07:37:57.662419935 +0200 --- /tmp/out59506 2025-05-07 07:37:57.466419400 +0200 *************** *** 1,11 **** - variable "foo" { - type = list(string) - description = "you know the drill" - default = ["bar", "baz"] - } - - configuration { - for_each = toset(var.foo) - - service_name = "${each.key}-service" - } --- 0 ---- ``` What i noticed is that my expected output is not matching with what CI thinks and also my cargo version differs in the commit hash in CI. I assume that there is an issue with the grammar not recognizing the second value.

wetneb commented

2025-05-07 08:26:39 +02:00

Would it be enough to get a tagged branch or is it required to have a downloadable artifact at crate.io for it?

I believe it's necessary that all dependencies are downloaded from crates.io, not from git. Arguably, using a tag name or branch name to refer to the desired version is even less stable than using a commit id, as the associated contents can change without notice.

I updated the test to what i think is a better example showcasing the syntax, but now it panicks again with the bad file descriptor.

The "Unable to write changeset file" error is probably coming from the fact that you haven't got a debug directory in the current working directory. I'll fix the script to make sure it creates this directory first. Edit: we already do mkdir -p debug in run.sh so it doesn't seem to be the source of the problem. On your side, you can also just run cargo test.

What i noticed is that my expected output is not matching with what CI thinks and also my cargo version differs in the commit hash in CI. I assume that there is an issue with the grammar not recognizing the second value.

As far as I can tell, mergiraf is genuinely unable to merge the example you provided. Does it work interactively for you when you just run it outside the test suite?

Here is a simpler test case that works (and takes advantage of the syntax awareness):

Base.hcl

configuration {
  default     = ["bar"]
}

Left.hcl

configuration {
  default     = ["foo", "bar"]
}

Right.hcl

configuration {
  default     = ["bar", "baz"]
}

Expected.hcl

configuration {
  default     = ["foo", "bar", "baz"]
}

> Would it be enough to get a tagged branch or is it required to have a downloadable artifact at crate.io for it? I believe it's necessary that all dependencies are downloaded from crates.io, not from git. Arguably, using a tag name or branch name to refer to the desired version is even less stable than using a commit id, as the associated contents can change without notice. > I updated the test to what i think is a better example showcasing the syntax, but now it panicks again with the bad file descriptor. <s>The "Unable to write changeset file" error is probably coming from the fact that you haven't got a `debug` directory in the current working directory. I'll fix the script to make sure it creates this directory first.</s> **Edit:** we already do `mkdir -p debug` in `run.sh` so it doesn't seem to be the source of the problem. On your side, you can also just run `cargo test`. > What i noticed is that my expected output is not matching with what CI thinks and also my cargo version differs in the commit hash in CI. I assume that there is an issue with the grammar not recognizing the second value. As far as I can tell, mergiraf is genuinely unable to merge the example you provided. Does it work interactively for you when you just run it outside the test suite? Here is a simpler test case that works (and takes advantage of the syntax awareness): #### Base.hcl ``` configuration { default = ["bar"] } ``` #### Left.hcl ``` configuration { default = ["foo", "bar"] } ``` #### Right.hcl ``` configuration { default = ["bar", "baz"] } ``` #### Expected.hcl ``` configuration { default = ["foo", "bar", "baz"] } ```

wetneb commented

2025-05-08 07:36:45 +02:00

By the way I can now reproduce the "bad file descriptor" issue: #369.

👍 1

wetneb commented

2025-05-10 14:45:05 +02:00

@maunzCache the "bad file descriptor" issue is fixed, rebasing your branch should hopefully solve the problem on your side.

maunzCache commented

2025-05-13 08:49:48 +02:00

The hcl grammar project is currently planning official deployments to crate.io so I will sit a bit longer on this one until they are good. Will check out the fix ASAP. Thanks also for the test recommendation, i was a bit stressed out when trying it again last week.

Can you give me a hint whether my (currently failing) test is related to a problem in the grammar or in mergiraf? I think it would be nice if it worked that way but I have not digged deeper why the behavior is as it is. I think i lack a bit of understanding on both parts to be honest. But we might not need to solve it in this PR.

The hcl grammar project is currently planning official deployments to crate.io so I will sit a bit longer on this one until they are good. Will check out the fix ASAP. Thanks also for the test recommendation, i was a bit stressed out when trying it again last week. Can you give me a hint whether my (currently failing) test is related to a problem in the grammar or in mergiraf? I think it would be nice if it worked that way but I have not digged deeper why the behavior is as it is. I think i lack a bit of understanding on both parts to be honest. But we might not need to solve it in this PR.

wetneb commented

2025-05-13 11:11:33 +02:00

I think it would be nice if it worked that way

Your test case currently generates two conflicts.

<<<<<<< LEFT
variable "foo" {
  type        = string
  description = "you know the drill"
  default     = "bar"
}
||||||| BASE
=======
variable "foo" {
  type        = list(string)
  description = "you know the drill"
  default     = ["baz"]
}
>>>>>>> RIGHT

This conflict is due to the left and right side adding similar, but different variable "foo" blocks on each side.
How is Mergiraf supposed to chose between string and list(string) as the value of type? Why should it pick list(string) more than string?

For unifying "bar" and ["baz"], we have the same problem. You would like to get ["bar", "baz"], but I don't see how Mergiraf could output this. If you had ["bar"] instead of "bar" on the left and you specified that those lists are commutative (is it always the case, wherever they appear?) by adding a commutative parent, then it could potentially solve it. At the moment, I think it would likely require that the list is also present in the base revision though.

Then comes the second conflict:

<<<<<<< LEFT
  service_name = "${var.foo}-service"
||||||| BASE
  service_name = "bar-service"
=======
  service_name = "${each.key}-service"
>>>>>>> RIGHT

Here again, you have ${var.foo} on the left side, bar in the base (which is even part of a larger text node, bar-service), and ${each.key} in the right. In your Expected.hcl file you specify that you expect this to be resolved to ${each.key}. Why?

> I think it would be nice if it worked that way Your test case currently generates two conflicts. ``` <<<<<<< LEFT variable "foo" { type = string description = "you know the drill" default = "bar" } ||||||| BASE ======= variable "foo" { type = list(string) description = "you know the drill" default = ["baz"] } >>>>>>> RIGHT ``` This conflict is due to the left and right side adding similar, but different `variable "foo"` blocks on each side. How is Mergiraf supposed to chose between `string` and `list(string)` as the value of `type`? Why should it pick `list(string)` more than `string`? For unifying `"bar"` and `["baz"]`, we have the same problem. You would like to get `["bar", "baz"]`, but I don't see how Mergiraf could output this. If you had `["bar"]` instead of `"bar"` on the left and you specified that those lists are commutative (is it always the case, wherever they appear?) by adding a commutative parent, then it could potentially solve it. At the moment, I think it would likely require that the list is also present in the base revision though. Then comes the second conflict: ``` <<<<<<< LEFT service_name = "${var.foo}-service" ||||||| BASE service_name = "bar-service" ======= service_name = "${each.key}-service" >>>>>>> RIGHT ``` Here again, you have `${var.foo}` on the left side, `bar` in the base (which is even part of a larger text node, `bar-service`), and `${each.key}` in the right. In your `Expected.hcl` file you specify that you expect this to be resolved to `${each.key}`. Why?

wetneb commented

2025-05-19 16:34:56 +02:00

Good news, the hcl grammar seems to have published a new release on crates.io! I believe this PR should be good to go once the new tests are in place and the dependency is changed to point to the released version.

maunzCache commented

2025-06-01 11:57:16 +02:00

@wetneb Thanks for the feedback on the tests. I must admit that when i started the PR i did not dig deep enough into mergiraf functionality nor git conflicts. Those were simply created from a gut feeling which is usually not the best. Not knowing these things made these tests what they are.

As you already mentioned, we finally got a release for the grammar on crates.io so i will be trying one more time to finalize this PR. I think it needs another rebase though.

Once again I am glad about the support I got in this PR. I was anxious about this in the beginning because of a lot unknowns, but your support helped me stay on it.

@wetneb Thanks for the feedback on the tests. I must admit that when i started the PR i did not dig deep enough into mergiraf functionality nor git conflicts. Those were simply created from a gut feeling which is usually not the best. Not knowing these things made these tests what they are. As you already mentioned, we finally got a release for the grammar on crates.io so i will be trying one more time to finalize this PR. I think it needs another rebase though. Once again I am glad about the support I got in this PR. I was anxious about this in the beginning because of a lot unknowns, but your support helped me stay on it.

maunzCache force-pushed feat/hcl from 287650f9ba to 1247ed4b49

2025-06-01 12:07:22 +02:00

Compare

maunzCache added 1 commit

2025-06-01 12:10:19 +02:00

fix: use proper cargo dependency. add missing property to LangProfile

/ test (pull_request) Failing after 2m17s

Details

4c18ed20f9

Signed-off-by: maunzCache <kevin.kortum@gmx.de>

maunzCache commented

2025-06-01 12:24:41 +02:00

I wanted to add a few details to the questions you asked. Looking at the HCL specs at https://github.com/hashicorp/hcl/blob/main/spec.md my test regarding the lists was faulty. It explains that lists must always be ordered. Mergiraf cannot guess the correct order of lists. My test is mixing that up with a similar tuple type called set, which allows unordered lists.

Edit: After looking into the parser output, i don't think its possible to support the behavior of unordered lists (sets). At least not in the way i would imagine it. Let's ditch it for now

On the second case: This is an overcomplicated example. My idea was to test for variable interpolation, but to make it meaningful i added the list type and a loop. In Terraform a loop entry (key-value) can be accessed via the "each" indentifier. While it would like to have a test for variable interpolation, putting it into a single test is not a good idea.

Edit2: Well, if not using the Terraform dialect, the each keyword might not be available. I think mixing HCL and the Terraform dialect is my main issue in this PR

I wanted to add a few details to the questions you asked. Looking at the HCL specs at https://github.com/hashicorp/hcl/blob/main/spec.md my test regarding the lists was faulty. It explains that lists must always be ordered. Mergiraf cannot guess the correct order of lists. My test is mixing that up with a similar tuple type called set, which allows unordered lists. Edit: After looking into the parser output, i don't think its possible to support the behavior of unordered lists (sets). At least not in the way i would imagine it. Let's ditch it for now On the second case: This is an overcomplicated example. My idea was to test for variable interpolation, but to make it meaningful i added the list type and a loop. In Terraform a loop entry (key-value) can be accessed via the "each" indentifier. While it would like to have a test for variable interpolation, putting it into a single test is not a good idea. Edit2: Well, if not using the Terraform dialect, the each keyword might not be available. I think mixing HCL and the Terraform dialect is my main issue in this PR

maunzCache added 1 commit

2025-06-01 14:04:00 +02:00

test: simplify test on tuple types

/ test (pull_request) Successful in 1m38s

Details

07faeae4a3

wetneb approved these changes

2025-06-01 15:18:23 +02:00

wetneb left a comment

Looks great, thanks a lot for getting back to this! :)

❤️ 1

wetneb merged commit 2d4bdd72ad into main