test(supported_langs): check that commutative parents' children all have signatures defined #507
No reviewers
Labels
No labels
Compat/Breaking
Kind
Bad merge
Kind
Bug
Kind
Documentation
Kind
Enhancement
Kind
Feature
Kind
New language
Kind
Security
Kind
Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Status
Abandoned
Status
Blocked
Status
Need More Info
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: mergiraf/mergiraf#507
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "comm-parents-check-have-all-sigs"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
for #489
8cc06cabf2
to76fb1168e8
@ -167,0 +181,4 @@
// `field_declaration` above -- is there a way to reuse it instead?
Field("declarator"),
Field("name"),
],
I'm surprised by the three components of this signature. I would expect that it's not possible to declare two records with the same name, so
vec![Field("name")]
would be enough, no?I don't really know if I'm being honest. I took this from
method_declaration
, which does need it because of overloading, but I guess structs/records/enums don't?Yes, not only they don't need it, but adding it will prevent the signatures to be equal if the records have different fields.
For instance, in a case like this:
we do want to flag those as being duplicates, because the Java compiler will not accept that.
With the signature definition you have, those two declarations would get different signatures, so we wouldn't detect that they are in conflict.
You're totally right, I'll go fix it
76fb1168e8
tod38fde0970
In Rust (I decided to work on a language I actually know something about^^),
let_declaration
s are marked as children of commutative parents of bothdeclaration_list
(e.g. inside an impl block) andsource_file
, but I don't think either of those is even allowed to include let-declarations?I think the reason we have this is that in the
node-types.json
, we have bothdeclaration_list
andsource_file
list_declaration_statement
as one of the possible child types, and the latter in turn haslet_declaration
as one of its kinds. So this is probably an oversimplification on the grammar side, which we need not follow?That's a good question. I think it's indeed a simplification. One could try making the grammar tighter and seeing if it still passes the test suite (but it's probably not worth adding more complexity to the parser for the sake of introducing this restriction).
Removing
let_declaration
s from the children groups would probably make sense.@ -262,9 +274,9 @@ pub static SUPPORTED_LANGUAGES: LazyLock<Vec<LangProfile>> = LazyLock::new(|| {
"type_item",
"function_item",
"function_signature_item",
"impl_item",
looks like this:
parses like this:
Firstly, this can very well be marked as a commutative child, since impl blocks are reorderable.
Secondly, if we do add them as a commutative child, what would their signature be?
Theoretically, nothing stops one from having two impl blocks for the same type in the same file -- though normally one would either:
HashMap::new
has no bounds, butHashMap::insert
requires the key to be hashable, so you put them in different blocks)So in order to detect identical impl blocks, we would need to consider all of:
impl Trait for Foo
is also animpl_item
?.. In that case, we'd also need to look at thetype
field, whereTrait
is stored in this case. Although in that case, having twoimpl X for Y
for the same (X,Y) is disallowedThat's a tricky case indeed. I guess we could change the grammar so that it parses
impl Trait for Foo
as a different node type. Then, for that node type, we could define a signature which bundles up:And those all together should be unique in a given file, no?
For the general
impl Foo { … }
case, we could perhaps also just usevec![vec![]]
as signature… (and maybe it's also a sensible temporary solution forimpl Trait for Foo
in the meantime)I mean I understand why the grammar might not want to concern itself with that -- highlighting and stuff don't really change just because of the
Trait for
part. But if it would be possible to reuse "impl_body
" in bothimpl_item
and "impl_trait_item
", then it could probably work I think. Though interestingly enough, not even the compiler distinguishes between the two, neither in the AST nor in the HIR.All of this makes me think that maybe we should just restrict ourselves to
vec![vec![]]
after all..26ca1386ac
tofdf564ce76
@ -332,2 +344,4 @@
// source_file
signature("use_declaration", vec![vec![Field("argument")]]),
signature("extern_crate_declaration", vec![vec![Field("name")]]),
signature("foreign_mod_item", todo!()),
this is similar to
impl_item
, in that one can have multipleextern "C" { ... }
items in the source code, and the only real way to find out if they are identical is to go through all the functions defined inside the block (and the foreign ABI convention ("C"
here) as well, of course)Then maybe use
vec![vec![]]
as signature? It means the whole node is treated as a signature, which would at least prevent isomorphic elements from being added.makes sense!
The reason I push these not-quite-ready commits is so that we could have one discussion thread for each signature (since I feel like there are many--different--non-trivial cases) -- of course, I would've preferred to be able to open threads here directly...
@ -348,0 +362,4 @@
// where_clause
signature(
"where_predicate",
vec![vec![Field("left")], vec![Field("bounds")]],
here's an example of its use:
the where clause in this:
gets parsed as this:
and we're asked to provide signatures for the
where_predicate
s.I initially went with the approach of
left
+ (list of allidentifier
s inbound
). But then I saw thatuse_declaration
uses the simpler approach, where it creates the signature from justargument
, which is everything that followsuse
. And so I thoughtwhere_predicate
could do the same for the right half of its signature.Although I'm not quite sure that the
argument
thing is safe to do in the first place? I think the following two use declarations:would technically have different
argument
s (and thus different signatures) because of the different formatting? At leastAst::ascii_tree
does show the signatures as with the formatting respectedWhat you did looks good to me. It's true that it will fail to identify bounds that are slightly different, but it's the best we can do given the system we have.
Concerning your example about
use
statements above, the deciding factor that would make the signatures different is the trailing comma (the formatting differences shouldn't make a difference since nodes in signatures are treated up to isomorphism).fdf564ce76
toabc6d55858
abc6d55858
toe7abb14911
e7abb14911
to0bb42380ca
0bb42380ca
toa5098e3472
@ -239,1 +282,4 @@
signature("delegation_specifier", vec![vec![]]),
// modifiers
signature(
"annotation",
So the Kotlin grammar is a bit annoying -- it seems to only ever use
ChildType
s, which in the particular case ofannotation
causes the following problem:An annotation normally looks like this:
but it can optionally include a use site target, for example:
And the way the grammar defines this
use_site_target
is as follows:3dea6dfa9c/grammar.js (L534-L537)
, so what we get from parsing is something like this:so we'd need to add an alternative sigdef path for all possible variants of
use_site_target
(similar to #540)Or, much more simply, change the sigdef to be
vec![vec![]]
, since there aren't that many moving parts there anyway...Well, not really, there's also this.. thing: https://kotlinlang.org/docs/annotations.html#arrays-as-annotation-parameters, so this:
gets parsed to this:
and so doesn't even pass into the sigdef I currently have, but also wouldn't quite work with the naive
vec![vec![]]
, because I think the parameters aren't important as far as the duplicate signature check goes?Although it looks like none of this is important, because there's a meta-attribute that can allow an attribute to be used multiple times: https://kotlinlang.org/api/core/kotlin-stdlib/kotlin.annotation/-repeatable/
This is a bit too much, I'll go touch some grass for a bit..
decided to go with plain
vec![vec![]]
, just like we do for annotations in Javab63c22a5bd
to0957e3a0a2
I would propose we merge what we've got so far (except the test of course), and then continue? Especially if you're planning to do a release soon-ish
Makes sense!
2bb1b592eb
toe0ff006144
Ok the extraction thing was a bit awkward.. Let's do this another way:
#[ignore = "not all signatures are added yet"]
to the test, and merge it#[ignore]
for testing, but shouldn't include that change in the final versionAlternatively, we could do my favorite thing: a UI test! For that, we could:
#[ignore]
to the testThis has the advantage that, when someone adds a new language, and doesn't add all the necessary signatures, they'll get a test failure that shows them which signatures are missing in particular.
In both cases, we might want to open a tracking issue for all the remaining signatures. There, for each language with missing signatures, we could tag the people who added that particular language, with the hope that it'd be easier for them to add those since they're more familiar with the particular language's semantics than.. me.
What do you think?
Sorry, I completely missed that last message…
Both solutions look good to me.
On the long term I would like to list all possible children of a commutative parent (not just the ones that are mentioned in the language profile) to do this check. But it's much more involved so we clearly shouldn't let that hold this PR back.
View command line instructions
Checkout
From your project repository, check out a new branch and test the changes.