feat(mergiraf solve): improve error output for jj users #318
Labels
No labels
Compat/Breaking
Kind
Bad merge
Kind
Bug
Kind
Documentation
Kind
Enhancement
Kind
Feature
Kind
New language
Kind
Security
Kind
Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Status
Abandoned
Status
Blocked
Status
Need More Info
No milestone
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: mergiraf/mergiraf#318
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "senekor/mergiraf:senekor/solve-error-output"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
@ -339,6 +339,27 @@ fn real_main(args: CliArgs) -> Result<i32, String> {
}
Err(e) => {
As said in #317 (comment), by the time we reach this, it might already be too late -- Mergiraf would think that a file with jj-style conflicts doesn't have conflicts at all (since they won't match the conflict regex), and happily return it in the
Ok
variant. So all of this should be happening sooner.Even if you don't want to go with the regex approach, I think the best option would be to modify this part of
ParsedMerge::parse
(the method that parses files with conflicts):// the 3 regexes each match more things than the last in this order:
// 1) diff2 -- by ignoring the base marker and base rev text
// 2) diff3_no_newline -- by, well, ignoring the final newline
// 3) diff3 -- only matches a diff3 conflict ending with a newline
//
// so we run them in the opposite order:
// 1) if diff3 matches, take that
// 2) if diff3_no_newline matches, take that
// 3) if diff2 matches, then we know that this isn't a misrecognized diff3, and bail out
let resolved_end = if let Some(occurrence) =
(diff3_captures.as_ref()).or(diff3_no_newline_captures.as_ref())
{
occurrence
.get(0)
.expect("whole match is guaranteed to exist")
.start()
} else if diff2conflict.is_match(remaining_source) {
return Err(PARSED_MERGE_DIFF2_DETECTED.to_owned());
} else {
remaining_source.len()
};
Namely, I think it would make sense to move the error handling added by this PR (probably packed in a helper function) to after we try to match for diff2.
That will of course make the no-conflict case slower, but it's a very-happy path anyway (meaning it's very fast elsewhere), so slowing it down a little here shouldn't be too bad
Here's the output I get from running
mergiraf solve Example.java
in the example-repo:In what situation would Mergiraf think there are no conflicts at all?
TLDR: it actually does in this one, and it's only by luck (and robust logic I guess) that everything doesn't blow up violently.
in this exact one -- see the verbose output:
I guess it might be a bit convoluted, so here's the important bit:
from_parsed_original
refers to what have parsed (withParsedMerge::parse
) from the input file. As you can see, Mergiraf reports no problems at all with this "solution", and that's precisely because it didn't recognize any of the conflicts.The only reason we don't happily return this clearly wrong "solution" is that we have a logic to look for a better one. So normally, we:
from_parsed_original
becomes this:
then parse those sides using tree-sitter and do a structured merge
3. if that fails, we resort to trying to get the original revisions (so before the merge was even attempted) from Git
4. And we return the best of those two solution only if it's better that what we started with.
But here, both of those fail:
Err
and so we abort structured resolution. The output above does mention this:.git
directory1. Therefore, this strategy fails as well. Again, see the logs:So it's but a pure coincidence that everything works out relatively peacefully after all.
That's why I propose handling the error as early as possible. If we get a "jujutsu detected" error from
ParsedMerge::parse
, we can avoid calling bothresolve_merge
andstructured_merge_from_git_revisions
(the first and second strategy, respectively). That'd not only be faster, but would also avoid the confusing warnings before the final, actually helpful, suggestionit's surprising to me that Jujutsu is bold enough to already do that by default, I thought it'd try to maintain compatibility for somewhat longer ↩︎
Now that I think of it, you could even just move the check to the very_beginning_ of
mergiraf solve
logic, so line 300 (inmergiraf.rs
), and hard-error-out right awayOkay, that makes sense. So, you propose one more regex in
ParsedMerge::parse
? Why not shell out tojj
earlier? Maybe the performance impact isn't so bad. And even git users are not expected to usemergiraf solve
regularly, right? The recommended way is to let git callmergiraf merge
automatically.TBH my regex experience is very limited, I'd prefer to use a parser combinator if I had to do something like detect conflict markers.
It's always been this way I think. The important piece of git compatibility is that you can push to a git remote, so you can choose to use jj while your coworkers stay on git. So it's about overcoming the network effects of a VCS. Existing local tooling working is also very useful, but colocating a repo also has a few downsides noted here, so it's not done by default.
Yeah I initially thought we'd need the former, but now I think we could actually do the latter, see my last comment (the small one)
Well it's recommended if you prefer doing everything manually. But yeah, I'd expect most users to have it done automatically. Though this is the other way around with Jujutsu, funnily enough.
I wanted to do that, too, but unfortunately failed.
I see, thanks for the explanation! And that link is exactly what I would've liked to see after "colocated" in the explanation comment you added (but we decided to ditch it so it's not important)
@ada4a wrote in #318 (comment):
done!
@ -342,0 +344,4 @@
// which case various issues can occur. For example, Jujutsu
// has its own style of conflict markers, which Mergiraf
// doesn't understand. Also, a Jujutsu repository may not
// be colocated, which means Mergiraf cannot find a .git
colocated?
A Jujutsu repository can either be a
.jj
directory with the backing git repo stored in.jj/repo/store/git
, or a.jj
and.git
directory side-by-side. The latter is what's usually recommended, because many git-related tools keep working.Yeah I think that comment would only be clear to someone who is already moderatly knowledgeable about Jujutsu. I'd replace "may not be colocated" with "may be configured to store
.git
as a subdirectory of.jj
instead of side-by-side (<link to a Jujutsu documentation article that talks about this possibility)".Does this imply that some git-related tools are in fact able to handle this? How? Now that I think of it, they probably read
GIT_DIR
instead of hard-coding it as.git
?Well, to my knowledge, we in Mergiraf don't hardcode it either. In fact I don't think we even look for it anywhere – on the rare occasions that we do need to interact with the repo, we just shell out to Git.
I felt like the comment was getting a bit wordy and confusing. The point about the
.git
directory isn't even that important, the conflict marker style is the main issue. I removed that part.Well, no. If a Jujutsu repo is not colocated, meaning the
.git
directory is stored within the.jj
repo, pretty much all tooling breaks because they don't recognize that you're in a repository in the first place. ReadingGIT_DIR
wouldn't help much, because Jujutsu doesn't set that for you. Shelling out to Git is perfectly fine and the right way to go in my view. Jujutsu actually used to uselibgit2
and has moved to shelling out for some things because its more reliable.I.. don't get it. If neither GIT_DIR is set, nor is
.git
in its correct location, how is Git-that-we-shell-out-to even able to do anything?It isn't. In what situation does mergiraf shell out to git? It must not do that for
mergiraf merge
, which is how jj calls it. Otherwise git would certainly fail.There's actually one at the very beginning! You can pass an envvar to disable Mergiraf – you can want to do that when running
git {merge,rebase}
. In that case, Mergiraf falls back togit merge-file
, the lower-level command that takes three files and, well, merges them – the function's calledfallback_to_git_merge_file
.So I guess that's quite an important case indeed, and what we'd like to do there is to check for Jujutsu and call some lower-level analogue of
jj resolve
instead ofgit merge-file
.It might be that there are more shell-outs, but I think they're mostly in testing, and one in
mergiraf solve
which is not relevant here as you sayI see. For that particular case, I don't think it's relevant to jujutsu. Since every call to mergiraf is made intentionally with
jj resolve --tool mergiraf
, there would be no point in disabling mergiraf, right?Hmm, I guess so... I want to ask whether there's a way set up a default
--tool
– but even if there were one, and you were to set Mergiraf as default, but wanted to use something different temporarily, you'd probably just specify it with--tool
, so no need for the envvar either.I'll try to take a look at the code tomorrow to check that there aren't any more shell-outs I forgot
Right. The closest thing to a default merge tool would be to add an alias for it, e.g. the config
will allow you to run
jj rmg [file]
instead, but that's still very intentionally using mergiraf.Marking this as resolved because we now error out before getting any opportunity to shell out to Git
@ -342,0 +351,4 @@
// because Jujutsu has a builtin configuration to resolve
// merge conflicts manually with Mergiraf, which is more
// robust in several ways. Let's inform them about it.
if let Ok(output) = Command::new("jj").arg("root").output() {
Result::is_ok_and
to collapse these twoif
sjj root
(if any) by piping stdout/err to/dev/null
:Applied the first suggestion 👍 capturing stdout and stderr is already the default for
.output()
,.spawn()
would be the alternative way to start a process that inherits stdout and stderr from its parent.Ah, didn't know the latter, thanks for the info!
ad25240eba
to766b1e3b0b
766b1e3b0b
to0fd0e22f16
0fd0e22f16
to548319e093
@ -298,2 +298,4 @@
stdout = keep;
}
// Check if user is using Jujutsu instead of Git, which can lead
// to issues. Jujutsu has its own style of conflict markers,
I think the detailed explanation is nice, but it's a shame that the user doesn't see any of it. I think it'd be good to argue why we think they should run
jj resolve
, and showing (a part of) this comment might be the way to do that. Here's roughly what I have in mind:The alignment at the right is kind of wonky, but I'm not sure we should hand-wrap the text to any particular width, or just leave everything on one line and have the terminal do the rest
updated 👍 It's a wordy error message, but users should only see it once, so it's fine. If you
I used
\n\
for indentation, another option would be to use the indoc crate.TIL about
indoc
– looks useful! Also didn't know about\n\
-s – I was trying to use , but it would eat the entire empty line after the first paragraph. Actually, isn't a\n\
only really needed on that empty line? I'm pretty sure on the other lines just\
should be enoughThere is a difference in behavior.
\
eats all whitespace following it, including any newlines. So without the\n
before it, the text would be on a single line, soft-wrapped by the terminal. With\n\
on the other hand, lines are hard-wrapped at the places we defined.With
\
, except for the empty line:With
\n\
on every line:It just depends on what we want.
I see, thanks! I think I'd have the first two lines as separate lines, then an empty line, then the rest soft-wrapped. What do you think?
I like that! updated 🙂
548319e093
to9314a97a63
9314a97a63
to189c727cee
189c727cee
todeec7771dd
deec7771dd
toad2b10c2a5
ad2b10c2a5
to20bff34cc3
20bff34cc3
toa8025468c7
I know we've had a lot of back-and-forth already, but I think it would be good to test this somehow... Ideally we'd have 3 versions: one where we don't have
.jj
at all, and one for each of colocated and non-colocated.jj
and.git
Sounds reasonable... but the test infrastructur doesn't seem to be in place? The "end-to-end" test for the
mergiraf solve
command callsresolve_merge_cascading
instead of the CLI. But the changes here are not in library code.We might consider pulling in
assert_cmd
to test the CLI truly end-to-end.Right; the way we test the binary right now is by putting tests inside
mergiraf.rs
and callingreal_main
from them, passing the corresponding CLI arguments as an array. I realize this is an awful approach, but we couldn't find a better one at the time.I haven't heard of
assert_cmd
(learning a lot today!) – I'll look into it, but if you already have experience with, feel free to start writing the test (or ask for anything you know we'll need in order to use it)Ah, got it. That doesn't seem so bad to me. Mergiraf doesn't have a hugely complicated CLI and probably never will. So, focusing on testing business logic plus a few simple, "low tech" CLI tests is fine in my view.
Yeah, that makes sense.
The other question is how we would get Jujutsu onto
PATH
for the two tests that'd requirejj root
to succeed. We do define our runtime test dependencies inflake.nix
:nativeCheckInputs = with pkgs; [git];
but we don't actually use the flake in CI 😅 So I guess we'll need to add a step in CI to install jj?
Right. Are you sure these tests are worth it? It's a pretty straight-forward if statement that slighly improves a rare error message.
Eh, I guess not.
Ok then that should be it from my side I think! We're good to go if @wetneb approves:)
a1b74479dd
toa8025468c7