No spaces after lifetimes when a structured merge is rendered #170
Labels
No labels
Compat/Breaking
Kind
Bad merge
Kind
Bug
Kind
Documentation
Kind
Enhancement
Kind
Feature
Kind
New language
Kind
Security
Kind
Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Status
Abandoned
Status
Blocked
Status
Need More Info
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: mergiraf/mergiraf#170
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
I've just got this while working on #169. I'll leave a not-quite-minimized example for now, and come back to this later.
The main differences are as follows:
Result.rs
Expected.rs
&'sself
&'s self
&'bPCS
&'b PCS
Note that
--compact
has no effect on this --&'bPCS<'a>
just gets pulled out of the conflict, but still formatted incorrectlyWorking on this. Current minified version:
So it looks like it's important to have conflicts in both type parameters and regular parameters of the function -- otherwise, the whole parameters subtree gets printed as is?Ah, no, it's probably just that without the type parameter atRight.rs
, the left and right sides become identical, and Mergiraf resolves the conflict triviallyOk so I think this actually uncovered some deeper problems in the logic flow.
The issue manifests itself during pretty-printing. This is how all the calls to
MergedTree::pretty_print_recursive
look:(the
[1,1,1]
is an improvised call-stack, and after that comes theMergedTree
kind)The interesting step is 1.1.4.2.2: since the three sides were determined to nominally conflict, we have a
MergedTree::Conflict
. Because of that, when we then go to 1.1.4.2.3,previous_sibling
is set toNone
. We callMergedTree::add_preceding_whitespace
with that, therefore resort to looking up what the whitespace beforeself
was in the conflict sides. That turns out to be""
(the whitespace between&
andself
in BASE I guess?). So that's what we push tooutput
.The interesting thing is that when we
push_conflict
during 1.1.4.2.2, the method is smart enough to realize that this is not a real conflict at all, and turns it intoMergedText::Merged
. But it would be nice if this realization was made earlier. That way, we would've passedSome(PreviousSibling("'s"))
toMergedTree::add_preceding_whitespace
, and (I'm guessing here), the method would've returned the correct whitespace of" "
. The problem is that I'm not really sure where that would need to happen.PS: This actually causes another problem elsewhere -- after we construct the merge, we proceed to calculate its
conflict_{mass,count}
. And the fake conflict becomes the sole reason why the merge gets marked as non-perfect, which means that we go on to do the (now pure) structured merge all over again, just to get the same exact result. Avoiding this waste would be one more reason for us to tackle this.TreeBuilder::build_conflict
looks like a good place for the not-really-a-conflict detection (this part in particular)? It would look something like this, though I'm once again not knowledgeable enough about all the nodes and revs and stuff to fully work this out... @wetneb would you be able to help?I think I got it?
Thanks for the in-depth investigation! The last version looks like it would be a very safe addition, and it sounds like it would fix this particular case. I am not clear on whether we could also intervene on other parts of the algorithm with this example in mind. The part of the algorithm that adds whitespace between merged tokens has always felt very brittle to me - it really feels like a best-effort heuristic rather than a reliable algorithm, sadly.
One thing I have been considering for a while is to add a safety check by re-parsing the merged output with the tree-sitter parser and checking that the resulting tree matches what we expect. If it's not, we throw the result away and fall back on textual merging. It's a pretty fuzzy idea of course, but it feels like it could be a useful safety net to catch similar problems. Quite some of the bad merges that have been reported are related to incorrect whitespace/newlines between merged items resulting in different semantics, which feels like a really bad kind of failure. I am also not sure if we could run the tree-sitter parser on an output with conflicts in a meaningful way, but doing it on conflict-free merges could already be interesting since they are the ones that can most easily slip through review.
Of course, the scope of such an improvement feels a lot broader than this issue, so I think the fix you have above is definitively worth integrating regardless.
The
MergedText::add_conflict
method (and I think there was another one like that, though I can't find it right now) kind of reminds me of shotgun parsing -- in that, instead of eliminating the not-really-conflicts right when they're created, we seem to propagate (some of) them, and therefore end up needing to deal with them at a later point (or, probably, multiple ones) in the pipeline. I think a possible solution would be to experiment with erroring out atMergedText::add_conflict
when the left and right sides are identical, and from there trace back the source of this not-really-a-conflict.I very much agree.
The idea actually sounds pretty good!
I think we could do something similar to
ParsedMerge::reconstruct_revision
? We'd basicallyMergedTree::pretty_print
, but only take one of the sides when we encounter aMergedTree::Conflict
Glad to hear that! The only thing that worries me is the possibly-panicking
[0]
, but I think it really shouldn't fail given the context... I'd add anexpect
there, but I'm really afraid my understanding of the whole situation is too shaky for that... I'll think about it some more, but do let me know if you come up with something!Ah yeah I didn't catch the
[0]
- it does seem possible for it to panic, no? If both sides remove things and both lists are empty, we could potentially still end up in this case, I guess. But I see that it's annoying to handle because we do have to return something, given the typing constraints…At least returning an error would be better than panicking, and perhaps not so difficult since we are already returning a
Result
.Right! Totally missed that possibility
The only reason I wanted to go with a panic is because I thought the error case really is unreachable... but it in fact is, so erroring out could be an option indeed
Yeah. I'll see what I can come up with!
I mean an obvious solution would be to return an
Option<MergedTree>
in the happy case. But the return type ofbuild_conflict
warrants us a type complexity warning already. So I'll first deal with that, and come back to this right afterWait. Shouldn't
TreeBuilder::resolve_commutative_conflict
have handled this? It's called on the result ofbuild_conflict
, and itself callsTreeBuilder::commutative_or_line_based_fallback
, which in turn has checks for- oh, I think I've got it. This is what the last function looks like:pub(crate) fn line_based_local_fallback_for_revnode(
node: Leader<'a>,
class_mapping: &ClassMapping<'a>,
) -> MergedTree<'a> {
let base_src = class_mapping.node_at_rev(node, Revision::Base);
let left_src = class_mapping.node_at_rev(node, Revision::Left);
let right_src = class_mapping.node_at_rev(node, Revision::Right);
match (base_src, left_src, right_src) {
(None, None, None) => {
unreachable!("A node that does not belong to any revision, how curious!")
}
(_, Some(_), None) => MergedTree::new_exact(
node,
RevisionNESet::singleton(Revision::Left),
class_mapping,
),
(_, None, Some(_)) => MergedTree::new_exact(
node,
RevisionNESet::singleton(Revision::Right),
class_mapping,
),
(Some(_), None, None) => MergedTree::new_exact(
node,
RevisionNESet::singleton(Revision::Base),
class_mapping,
),
(base, Some(left), Some(right)) => {
let base_src = base.map_or(Cow::from(""), |base| base.unindented_source());
let left_src = left.unindented_source();
let right_src = right.unindented_source();
let line_based_merge = line_based_merge(&base_src, &left_src, &right_src, None);
MergedTree::LineBasedMerge {
node,
contents: line_based_merge.contents,
conflict_mass: line_based_merge.conflict_mass,
}
}
}
}
I think what's missing is a check in the last branch to see if
left
andright
are isomorphic, no? And it looks like that should be it. Let me check real quick.Oh, I was too quick... the first function is actually called
TreeBuilder::resolve_commutative_conflict
, and is, as the name suggests, only used in case the conflict has a commutative parent. But there shouldn't be anything stopping us from callingTreeBuilder::commutative_or_line_based_fallback
(well, without the "commutative" part), ourselves. I'll look at that later (I have an exam in an hour, so maybe it's time to prepare for that or something :p)So I thought about this some more.
Wouldn't that actually be impossible inside
build_conflict
? We only call the method when we have, well, a conflict -- in particular, when two sides diverge. But if they do, then it can't possibly be that one of two lists is empty, no?For this same reason I think it doesn't make sense to integrate the entire range of possibilities covered by
MergedTree::line_based_local_fallback_for_revnode
-- the latter mostly deals with cases where one of sides is empty, which we don't need for the reason explained above.But I think we could actually add the case of isomorphic left and right sides to
MergedTree::line_based_local_fallback_for_revnode
, and handle it the same way as inbuild_conflict
. This is how that'd look like: