fix: Prevent the commutative merging of all extra
elements #562
No reviewers
Labels
No labels
Compat/Breaking
Kind
Bad merge
Kind
Bug
Kind
Documentation
Kind
Enhancement
Kind
Feature
Kind
New language
Kind
Security
Kind
Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Status
Abandoned
Status
Blocked
Status
Need More Info
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: mergiraf/mergiraf#562
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "extra"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #467.
This will likely allow us to remove quite a few children groups from language profiles, as those were often added for the purpose of excluding comments from commutative merging.
fix: Prevent the commutative of allto fix: Prevent the commutative merging of allextra
elementsextra
elementsThe implementation is good already, but I think we could improve it even further
@ -53,2 +53,4 @@
/// An internal node id, guaranteed to be unique within the tree.
pub id: usize,
/// Whether this node is one that can appear anywhere in the source (such as a code comment)
pub is_extra: bool,
That's one expensive boolean you've got there 😅 (64 bits of space for 1 bit of info). Not like I can come up with something better though.. one could of course make
AstNode
into an enum ofRegular
/Extra
, but it would be such a massive headache to modify all the code touching AST nodes throughout the codebase to handle that changeAnyway, maybe we want to include it in the
Display
output while we're at it? Not sure quite how useful it would be thoughI was naively hoping that Rust would be able to magically make this compact by bundling it up with another field, but that was wishful thinking… If we want to shave some bits, I guess we could probably bundle this up in the
id
field (for instance as least significant bit), but I am not sure it's worth it.The ideal situation would be to be able to look up directly from the
LangProfile
whether a certain node type isextra
or not, but that's not information that's readily available sadly.@ -651,0 +654,4 @@
.any(|repr| repr.node.is_extra)
{
Err(
"One of the children is an extra node, cannot merge commutatively"
Hm, it's a bit unfortunate that this error only happens inside
commutatively_merge_lists
, because its description says that it should only be called if we know that the lists are safe to merge. So for example inbuild_subtree_from_changeset
, we:?
the errorWouldn't it be nicer if we could tell the caller that the children weren't commutative after all, so that they could still fall back onto some other strategy?
I think this is a case which could benefit from a more descriptive error type – something roughly like this:
That way, the caller could match on the error kind and do the best thing
It's a change I have thought about doing in #563 but I was reluctant because it's a change to the actual merging logic which could have unintended consequences. But I agree with you that it is worth doing.
Yeah it's probably best to leave this to a future PR -- could you please add a TODO comment?
I've opened a PR for it instead: #568
@ -651,1 +662,4 @@
}
})
.try_collect()?;
let Some(raw_separator) = commutative_parent.child_separator(&child_types) else {
I realize that this method is currently only called in this one place, but more generally, if we simplify our commutative children groups after this change, wouldn't it need to also check that the provided list of children doesn't contain any extra nodes? In fact, maybe the check should just be moved there..
I would be very keen to do that if we had a simple way to determine if a given node type is extra. At the moment we can only do it for a concrete node, sadly.
In the future, we might be able to parse the
node-types.json
file that is available from the crate of the parser and extract the node types marked asextra
from there. Ideally we'd do this at compilation time so that we don't embed big JSON blobs in our binary, just for them to be parsed at runtime. For this to work, we'll also need our tree-sitter parsers to be generated by a recent enough version of tree-sitter, because there was a bug in the generation of this information.If we go ahead with my current solution I'd open an issue about making this move, as it would indeed be more principled and save some memory.
Oh, right, I completely overlooked this.
Let's go with this approach then, but maybe also add a FIXME comment to
CommutativeParent::child_separator
?LGTM!
@ -0,0 +1,7 @@
<<<<<<< LEFT
off-topic (I think): I was wondering what this would look like in compact mode. I expected something like this:
but apparently it's this instead:
I wonder whether that's some sort of a bug in
MergedText