feat: support hcl #210
No reviewers
Labels
No labels
Compat/Breaking
Kind
Bad merge
Kind
Bug
Kind
Documentation
Kind
Enhancement
Kind
Feature
Kind
New language
Kind
Security
Kind
Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Status
Abandoned
Status
Blocked
Status
Need More Info
No milestone
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: mergiraf/mergiraf#210
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "maunzCache/mergiraf:feat/hcl"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This PR should support HCL (HashiCorp Language) which is used in common infrastructure as code projects such as Terraform. Terraform uses a different file ending thus I'd like also support the *.tf extension.
The upstream grammar has an open PR to support mergiraf tree-sitter, so i'll leave this PR open until it got merged or I'll fix it in my fork.
Nice, thanks a lot! It looks like your PR to update the crate is on its way to getting merged.
I see you added an example: could you perhaps turn that into some merging scenario(s), like the other tests in the
examples/
directory?@wetneb wrote in #210 (comment):
Can you give me a hint about what the difference of the failing/working test directories is? I was not able to comprehend it based on what the other examples provide.
Yes, test cases that currently expected to pass go in the
working
directory. The ones in thefailing
directory are known failures that we could want to address at some point.fab56e1af9
tof110ecc7f2
f110ecc7f2
to7376c34be0
Thank you for this @maunzCache. Just discovered mergiraf and my initial use-case is for a huge Terraform mess. I was reading the mergiraf docs about adding a language and it appears we eventually want to define commutative parents and signatures as well (probably in a separate PR). Thanks again for getting this started! 👍
@brc wrote in #210 (comment):
In case you have a merge conflict that makes for a good test, let me know. I am still a bit struggling with the testing.
@maunzCache wrote in #210 (comment):
I mean the tests are normally there precisely to test commutative merging stuff, so if you haven't defined anything there then I'd say the tests are not strictly necessary.
I did go ahead and make the following sanity-checking test though:
And got the following output:
Looks pretty sensible to me.
The compact output was somewhat more surprising:
I would've expect the quotes to stay around the strings. But an identical test in Rust has the same behaviour, so I guess it's fine
I did have one problem when rebasing your branch onto main though -- we have since updated to tree-sitter 0.25, but
tree-sitter-hcl
declares compatibility with 0.24 only. You might want to look into whether you could support 0.25 as well?@ada4a wrote in #210 (comment):
Already on it https://github.com/tree-sitter-grammars/tree-sitter-hcl/pull/62
I think i removed my upstream branch for 0.24 so you will need to use https://github.com/maunzCache/tree-sitter-hcl/tree/feat/support-tree-sitter-0.25 instead.
Sorry, but there was no time for me to rebase this PR yet.
Nice!
Yeah I just used the main branch for now, but good to know
No need to apologize. The "conflicts" were fairly trivial anyway (namely, another language was added in the meantime)
The 0.25 was merged so I will try to update this PR over the easter holidays. Please note https://github.com/tree-sitter-grammars/tree-sitter-hcl/issues/64 . There is no official build, new release or anything yet so i might need to include the dependency as git reference for it to work. If that is a problem for mergiraf, we might consider to postpone this PR further.
As i also updated the interface to align with the upstream way of creating tree-sitter-grammars, this PR cannot simply be rebased. Previously
tree_sitter_hcl::language()
would be used to reference the language, but now its properly migrated totree_sitter_hcl::LANGUAGE.into()
(if i did everything right at least)7376c34be0
to8dea886673
Alright, i got the upstream changes back into my branch and added a simple test. While tinkering around with a more complex test, i got a panic in the test tool so it might be worth to investigate or i am just not good at writing tests ;)
I will open this PR for a review now. Thanks for being patient with this. Please feel free to recommend any changes. Unfortunately the upstream grammar project does not ship an artifact to cargo so i fixed the commit hash to what is required to be compatible with mergiraf for now. I also created a ticket to publish to cargo in the project, but i doubt this will happen soonish.
WIP: feat: support hclto feat: support hcl@ -688,1 +688,4 @@
},
LangProfile {
name: "HCL",
alternate_names: &[],
Do you think it would make sense to add "terraform" to
alternate_names
? Is that a name that people (often) use to refer to this language?alternate_names
is used to recognize a manually specified language (#327) by its nameI am a bit divided on it.
The syntax is HCL and that is what its usually known for. Terraform adds a few keywords to the syntax which are not part of the HCL definition, but there is no "dialect" or "interpretation" called Terraform. If i'd compare it with SQL i'd say that MySQL and MSSQL would totally be alternate names - altough dialects.
If we'd add Terraform here, I'd also expect all other tools that use the HCL syntax to be part of the alternate name and then it defeats the purpose.
As i am writing this, i think i would not add it
Thank you very much for the context! I don't have any experience with that ecosystem, but your argumentation sounds logical and so I agree with your conclusion:)
Definitely! Do you remember what test case caused the panic?
No problem at all:)
I'd say the PR looks pretty uncontroversial! It's only some more complex test cases that would warrant discussion imo
@ada4a wrote in #210 (comment):
It was a file descriptor error which would not write the debug output
It happened because my test was wrong. I forgot parts in the Expected.hcl that i previously added to Left.hcl, so basically a wrong test.
Thanks a lot for updating this!
Before merging this we'd need to change the dependency to an official release, otherwise it will prevent us from publishing Mergiraf on crates.io.
Concerning the test, it's a bit unusual that the expected test output is identical to a line-based merge - surely it should be possible to give an example where we're benefiting from the syntax-awareness, no?
At the moment, even if we replaced the HCL parser by a Python one, the test would still pass (except for the compact output, ok…)
@wetneb wrote in #210 (comment):
Would it be enough to get a tagged branch or is it required to have a downloadable artifact at crate.io for it?
That is a valid suggestion.
I updated the test to what i think is a better example showcasing the syntax, but now it panicks again with the bad file descriptor. I must admit, that i am at my knowledge limit here und would appreciate any help.
What i noticed is that my expected output is not matching with what CI thinks and also my cargo version differs in the commit hash in CI. I assume that there is an issue with the grammar not recognizing the second value.
I believe it's necessary that all dependencies are downloaded from crates.io, not from git. Arguably, using a tag name or branch name to refer to the desired version is even less stable than using a commit id, as the associated contents can change without notice.
The "Unable to write changeset file" error is probably coming from the fact that you haven't got aEdit: we already dodebug
directory in the current working directory. I'll fix the script to make sure it creates this directory first.mkdir -p debug
inrun.sh
so it doesn't seem to be the source of the problem. On your side, you can also just runcargo test
.As far as I can tell, mergiraf is genuinely unable to merge the example you provided. Does it work interactively for you when you just run it outside the test suite?
Here is a simpler test case that works (and takes advantage of the syntax awareness):
Base.hcl
Left.hcl
Right.hcl
Expected.hcl
By the way I can now reproduce the "bad file descriptor" issue: #369.
@maunzCache the "bad file descriptor" issue is fixed, rebasing your branch should hopefully solve the problem on your side.
The hcl grammar project is currently planning official deployments to crate.io so I will sit a bit longer on this one until they are good. Will check out the fix ASAP. Thanks also for the test recommendation, i was a bit stressed out when trying it again last week.
Can you give me a hint whether my (currently failing) test is related to a problem in the grammar or in mergiraf? I think it would be nice if it worked that way but I have not digged deeper why the behavior is as it is. I think i lack a bit of understanding on both parts to be honest. But we might not need to solve it in this PR.
Your test case currently generates two conflicts.
This conflict is due to the left and right side adding similar, but different
variable "foo"
blocks on each side.How is Mergiraf supposed to chose between
string
andlist(string)
as the value oftype
? Why should it picklist(string)
more thanstring
?For unifying
"bar"
and["baz"]
, we have the same problem. You would like to get["bar", "baz"]
, but I don't see how Mergiraf could output this. If you had["bar"]
instead of"bar"
on the left and you specified that those lists are commutative (is it always the case, wherever they appear?) by adding a commutative parent, then it could potentially solve it. At the moment, I think it would likely require that the list is also present in the base revision though.Then comes the second conflict:
Here again, you have
${var.foo}
on the left side,bar
in the base (which is even part of a larger text node,bar-service
), and${each.key}
in the right. In yourExpected.hcl
file you specify that you expect this to be resolved to${each.key}
. Why?Good news, the hcl grammar seems to have published a new release on crates.io! I believe this PR should be good to go once the new tests are in place and the dependency is changed to point to the released version.
@wetneb Thanks for the feedback on the tests. I must admit that when i started the PR i did not dig deep enough into mergiraf functionality nor git conflicts. Those were simply created from a gut feeling which is usually not the best. Not knowing these things made these tests what they are.
As you already mentioned, we finally got a release for the grammar on crates.io so i will be trying one more time to finalize this PR. I think it needs another rebase though.
Once again I am glad about the support I got in this PR. I was anxious about this in the beginning because of a lot unknowns, but your support helped me stay on it.
287650f9ba
to1247ed4b49
I wanted to add a few details to the questions you asked. Looking at the HCL specs at https://github.com/hashicorp/hcl/blob/main/spec.md my test regarding the lists was faulty. It explains that lists must always be ordered. Mergiraf cannot guess the correct order of lists. My test is mixing that up with a similar tuple type called set, which allows unordered lists.
Edit: After looking into the parser output, i don't think its possible to support the behavior of unordered lists (sets). At least not in the way i would imagine it. Let's ditch it for now
On the second case: This is an overcomplicated example. My idea was to test for variable interpolation, but to make it meaningful i added the list type and a loop. In Terraform a loop entry (key-value) can be accessed via the "each" indentifier. While it would like to have a test for variable interpolation, putting it into a single test is not a good idea.
Edit2: Well, if not using the Terraform dialect, the each keyword might not be available. I think mixing HCL and the Terraform dialect is my main issue in this PR
Looks great, thanks a lot for getting back to this! :)