feat(TypeScript): commutative merging for union and intersection types #531
No reviewers
Labels
No labels
Compat/Breaking
Kind
Bad merge
Kind
Bug
Kind
Documentation
Kind
Enhancement
Kind
Feature
Kind
New language
Kind
Security
Kind
Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Status
Abandoned
Status
Blocked
Status
Need More Info
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: mergiraf/mergiraf#531
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "wetneb/flatten_binary_operators"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #484.
This is the approach 2. that I mentioned there. I would have preferred approach 1., but it looks very hard to change the grammar accordingly. Arguably, not having to change the grammar at all is also a good thing. The added complexity to the parsing code looks manageable to me. What do you think?
I don't know if it's feature creep? It doesn't feel that far from mergiraf's core feature set to me.
Performance-wise, I've added a guard to avoid looking up all node types in the newly introduced
flattened_nodes
vector, which should also avoid reallocations when the vector gets filled. (The capacity computation could be made even tighter, by subtracting the number of children being expanded, but I'm not sure it's worth tracking that.)I'm admittedly not very confident with grammars, but I also wasn't able to come up with a good in-grammar solution -- after looked at my initial attempt at #484 a bit more, I realized that I somehow completely missed the fact that
type
s are nested and would need further flattening. So it probably is better to let the grammar create a correct nesting, and do just the flattening of the result ourselvessome small things to get out of the way
@ -347,6 +351,22 @@ impl<'a> AstNode<'a> {
let grammar_name = node.grammar_name();
// check if this nodes needs flattening.
s/nodes/node
@ -350,0 +357,4 @@
{
let mut flattened_children =
Vec::with_capacity(children.len() + children_with_identical_grammar_type);
for child in children.into_iter() {
isn't
into_iter
unnecessary here? I wonder why Clippy doesn't flag itIntuitively it makes sense to consume the existing children here, no? We want this variable to get dropped.
If I replace it by
&children
orchildren.iter()
thenflattened_children
becomes aVec<&&AstNode>
instead ofVec<&AstNode>
.No, I meant just
children
– that should iterate by value, just like.into_iter()
would@ -350,0 +359,4 @@
Vec::with_capacity(children.len() + children_with_identical_grammar_type);
for child in children.into_iter() {
if child.grammar_name == grammar_name {
flattened_children.extend(child.children.iter());
I think this should work?
@ -220,6 +220,7 @@ impl<'a> AstNode<'a> {
let mut children = Vec::new();
let mut field_to_children: FxHashMap<&'a str, Vec<&'a Self>> = FxHashMap::default();
let mut last_child_end = node.byte_range().start;
let mut children_with_identical_grammar_type = 0;
the name is not quite correct.. what we're counting is the number of children of children with an identical grammar type, but that's of course a mouthful, and I can't think of a simpler name..
@ -253,2 +254,4 @@
field_to_children.entry(field_name).or_default().push(child);
}
if cursor.node().grammar_id() == node.grammar_id() {
children_with_identical_grammar_type += child.children.len();
Couldn't we do
child.children.len().saturating_sub(1)
here to account for that? I think that a small comment could manage to explain this just fine.Though I guess
len
will never be 0 -- since the grammar type is of a foldable type, it kind of by definition has non-zero children. So we could replacesaturating_sub
with just a-
-- it would panic in debug mode, working as an assertion, but silently wrap in release mode, which is the expected behaviour of arithmetic it seems (though I don't really understand why)Makes a lot of sense! It's going to be even harder to find a fitting name for this variable, but I guess comments can do…
Maybe
children_added_by_flattening
or something..AstNode::internal_new
is already a death by thousand cuts... I wish we could create separate-out the different processing steps, but it seems difficult given that we only really create the object at the very end of the method. This flattening in particular looks like something that could be extracted into a method somewhat easily -- it only accesseschildren
and the children-with-identical-whatever, no?@ -30,2 +30,4 @@
/// See https://tree-sitter.github.io/tree-sitter/3-syntax-highlighting.html#language-injection
pub injections: Option<&'static str>,
/// List of node types that should be flattened
pub flattened_nodes: Vec<&'static str>,
This could probably be a slice, right? I don't know how well slices would play with a configuration file, but if it does turn out to be impossible, returning to vecs would be quite straightforward
Sure! But actually, I wonder if it's really necessary to introduce this as a new field. I wonder if there is any reason not to flatten all commutative parents by default. I'm curious to investigate which of our existing commutative parents are allowed to appear as direct children of themselves.
Another option would be to have this as a boolean field on the commutative parent definition, indicating whether it should be flattened or not.
I expect this flattening feature will be used mostly (if not only) to declare commutative parents, so that would likely make language profiles a bit tidier…
Hm, wouldn't that be problematic for nested lists, for example? Surely we don't want to flatten this:
into this?
Yes, but I find it weird I can't think of any example in our existing commutative parent in our existing language profiles…
In Python there is a syntax for sets, but we haven't added it as a commutative parent somehow. It would be a good concrete example of your case (apart from the fact that sets aren't hashable so
t = { 1, { 2, 3 } }
raises an exception)… I find that funny.But you're right, even if we don't have concrete examples, it wouldn't be very future-proof to just mark all commutative parents as associative.
One non-commutative type that could be flattenable (but not really associative) are method calls I think?
@ -426,0 +446,4 @@
Vec::with_capacity(children.len() + children_added_by_flattening);
for child in children {
if child.grammar_name == grammar_name {
flattened_children.extend(&child.children);
Actually, couldn't this even be just
child.children
? Since we're consumingchild
anyway, we might as well move the fileld out of itI'd find it quite impressive if it worked, but the borrow checker doesn't seem to appreciate the idea:
ah, right,
child
is still&AstNode
, so you can't move (a field) out of thatlooking good now!