Improve error reporting when working copy is not in a conflict state #423

xmo · 2025-06-03T09:47:22+02:00

xmo commented

2025-06-03 09:47:22 +02:00

When trying to resolve a conflict with mergiraf solve in a working copy which is not in a conflict state, mergiraf reports

error while retrieving Base revision for <path>:
git checkout-index: <path> does not exist at stage 1

This actually comes directly from git checkout-index but is confusing as it's not clear what the issue could be to a user not versed in checkout-index speficially (even if proficient with git). The underlying issue is that the index entry is not in a conflict state, and so there is no stage git can recover.

This can be caused by at least two things:

A new file with a conflict marker is added to the working copy (e.g. maybe work was done in an other working copy then the files were copied over conflict markers and all) (this is the version tested for simplicity)
The conflict is committed (à la jj I think?) e.g. some merge-adjacent operation was performed by tooling, and in order to report it the tooling just created a commit with the conflict markers (this is the version I hit because of $DAYJOB tooling)

In that case, mergiraf should not even attempt to call checkout-index, and should make the error clearer.

When trying to resolve a conflict with `mergiraf solve` in a working copy which is not in a conflict state, mergiraf reports ``` error while retrieving Base revision for <path>: git checkout-index: <path> does not exist at stage 1 ``` This actually comes directly from `git checkout-index` but is confusing as it's not clear what the issue could be to a user not versed in `checkout-index` speficially (even if proficient with git). The underlying issue is that the index entry is not in a conflict state, and so there is no stage git can recover. This can be caused by at least two things: - A new file with a conflict marker is added to the working copy (e.g. maybe work was done in an other working copy then the files were copied over conflict markers and all) (this is the version tested for simplicity) - The conflict is committed (à la jj I think?) e.g. some merge-adjacent operation was performed by tooling, and in order to report it the tooling just created a commit with the conflict markers (this is the version I hit because of $DAYJOB tooling) In that case, mergiraf should not even attempt to call `checkout-index`, and should make the error clearer.

xmo force-pushed error-reporting-state-not-conflicted from bc5470252d to 1707759cc2

2025-06-03 10:01:13 +02:00

Compare

xmo referenced this pull request

2025-06-03 12:18:17 +02:00

Call checkout-index just once per file #424

wetneb commented

2025-06-03 14:52:47 +02:00

I sympathize with the dissatisfaction with the current messaging, it's definitely worth improving.

When designing the mergiraf solve command, the retrieval of the revisions from Git was meant to be a purely optional step. If a conflict is solvable simply on the basis of the file with conflict markers, I wanted this command to just work on the file and not have to interact with Git at all (so that the resolution would even work outside of a Git work tree). The use case you have (working on a file with conflicts that has been committed into the repo) seems to also argue for this sort of flexibility. But I agree that the current error message doesn't convey this intention well.

The warning message could perhaps look more like this:

WARN Couldn't retrieve the original revisions from Git. This limits Mergiraf's ability to solve certain types of conflicts.

So I would not even display git's own error message there (it could be added as a separate logging statement at DEBUG level, perhaps).
Do you think this would give an appropriate user experience?

If so, then we probably don't even need the additional call to git ls-files and can simplify this PR to just an improvement of this logging statement.

I sympathize with the dissatisfaction with the current messaging, it's definitely worth improving. When designing the `mergiraf solve` command, the retrieval of the revisions from Git was meant to be a purely optional step. If a conflict is solvable simply on the basis of the file with conflict markers, I wanted this command to just work on the file and not have to interact with Git at all (so that the resolution would even work outside of a Git work tree). The use case you have (working on a file with conflicts that has been committed into the repo) seems to also argue for this sort of flexibility. But I agree that the current error message doesn't convey this intention well. The warning message could perhaps look more like this: ``` WARN Couldn't retrieve the original revisions from Git. This limits Mergiraf's ability to solve certain types of conflicts. ``` So I would not even display git's own error message there (it could be added as a separate logging statement at `DEBUG` level, perhaps). Do you think this would give an appropriate user experience? If so, then we probably don't even need the additional call to `git ls-files` and can simplify this PR to just an improvement of this logging statement.

xmo commented

2025-06-03 15:53:42 +02:00

I sympathize with the dissatisfaction with the current messaging, it's definitely worth improving.

No worry, I can't say it was as strong as "dissatisfaction" more interrogation and wondering if I was doing something wrong when I got the original console output, so I figured I'd look into it (then that I might as well make it clearer)

When designing the mergiraf solve command, the retrieval of the revisions from Git was meant to be a purely optional step. If a conflict is solvable simply on the basis of the file with conflict markers, I wanted this command to just work on the file and not have to interact with Git at all (so that the resolution would even work outside of a Git work tree). The use case you have (working on a file with conflicts that has been committed into the repo) seems to also argue for this sort of flexibility. But I agree that the current error message doesn't convey this intention well.

Aye, to be honest for my specific use case of having the conflicts committed I'd like to try and get the revision ids from the conflict markers and resolve that to the file contents, eventually, but it's a lot more involved than those relatively minor changes: as far as I could tell from looking around mergiraf already parses the conflict markers fully, so it'd be a matter of moving that around until it reaches extract_revision_from_git (or extract_all_revisions_from_git with #424 merged) and then requesting the blobs from git as a fallback to the fallback.

And I'm just realising now that there might be some overlap with the code which makes mergiraf work as a merge driver, I'll have to check how that's hooked up.

The warning message could perhaps look more like this:
WARN Couldn't retrieve the original revisions from Git. This limits Mergiraf's ability to solve certain types of conflicts.
So I would not even display git's own error message there (it could be added as a separate logging statement at DEBUG level, perhaps). Do you think this would give an appropriate user experience?

If so, then we probably don't even need the additional call to git ls-files and can simplify this PR to just an improvement of this logging statement.

That makes sense and sounds good.

> I sympathize with the dissatisfaction with the current messaging, it's definitely worth improving. No worry, I can't say it was as strong as "dissatisfaction" more interrogation and wondering if I was doing something wrong when I got the original console output, so I figured I'd look into it (then that I might as well make it clearer) > When designing the `mergiraf solve` command, the retrieval of the revisions from Git was meant to be a purely optional step. If a conflict is solvable simply on the basis of the file with conflict markers, I wanted this command to just work on the file and not have to interact with Git at all (so that the resolution would even work outside of a Git work tree). The use case you have (working on a file with conflicts that has been committed into the repo) seems to also argue for this sort of flexibility. But I agree that the current error message doesn't convey this intention well. Aye, to be honest for my specific use case of having the conflicts committed I'd like to try and get the revision ids from the conflict markers and resolve that to the file contents, eventually, but it's a lot more involved than those relatively minor changes: as far as I could tell from looking around mergiraf already parses the conflict markers fully, so it'd be a matter of moving that around until it reaches `extract_revision_from_git` (or `extract_all_revisions_from_git` with #424 merged) and then requesting the blobs from git as a fallback to the fallback. And I'm just realising now that there might be some overlap with the code which makes mergiraf work as a merge driver, I'll have to check how that's hooked up. > The warning message could perhaps look more like this: > > ```text > WARN Couldn't retrieve the original revisions from Git. This limits Mergiraf's ability to solve certain types of conflicts. > ``` > > So I would not even display git's own error message there (it could be added as a separate logging statement at `DEBUG` level, perhaps). Do you think this would give an appropriate user experience? > > If so, then we probably don't even need the additional call to `git ls-files` and can simplify this PR to just an improvement of this logging statement. That makes sense and sounds good.

wetneb commented

2025-06-03 17:10:38 +02:00

Aye, to be honest for my specific use case of having the conflicts committed I'd like to try and get the revision ids from the conflict markers and resolve that to the file contents, eventually, but it's a lot more involved than those relatively minor changes: as far as I could tell from looking around mergiraf already parses the conflict markers fully, so it'd be a matter of moving that around until it reaches extract_revision_from_git (or extract_all_revisions_from_git with #424 merged) and then requesting the blobs from git as a fallback to the fallback.

That would be pretty amazing… but as far as I remember those conflict markers are quite unstructured, and often contain things like HEAD which are ambiguous, no? I mean, as far as I am concerned, even if it only works in certain cases, I'd be fine with integrating something like that (if it doesn't interfere with the other cases), but it does feel like a lot of work to implement it.

> Aye, to be honest for my specific use case of having the conflicts committed I'd like to try and get the revision ids from the conflict markers and resolve that to the file contents, eventually, but it's a lot more involved than those relatively minor changes: as far as I could tell from looking around mergiraf already parses the conflict markers fully, so it'd be a matter of moving that around until it reaches `extract_revision_from_git` (or `extract_all_revisions_from_git` with #424 merged) and then requesting the blobs from git as a fallback to the fallback. That would be pretty amazing… but as far as I remember those conflict markers are quite unstructured, and often contain things like `HEAD` which are ambiguous, no? I mean, as far as I am concerned, even if it only works in certain cases, I'd be fine with integrating something like that (if it doesn't interfere with the other cases), but it does feel like a lot of work to implement it.

xmo commented

2025-06-04 07:47:57 +02:00

That would be pretty amazing… but as far as I remember those conflict markers are quite unstructured, and often contain things like HEAD which are ambiguous, no? I mean, as far as I am concerned, even if it only works in certain cases, I'd be fine with integrating something like that (if it doesn't interfere with the other cases)

Indeed but I was really just planning on a best effort aka try it out if the marker text looks like commit ids and that's it.

but it does feel like a lot of work to implement it.

Definitely not trivial, hence my having not even gotten started to look at it, and done some pretty basic QoL instead 😄

> That would be pretty amazing… but as far as I remember those conflict markers are quite unstructured, and often contain things like `HEAD` which are ambiguous, no? I mean, as far as I am concerned, even if it only works in certain cases, I'd be fine with integrating something like that (if it doesn't interfere with the other cases) Indeed but I was really just planning on a best effort aka try it out if the marker text looks like commit ids and that's it. > but it does feel like a lot of work to implement it. Definitely not trivial, hence my having not even gotten started to look at it, and done some pretty basic QoL instead 😄

wetneb commented

2025-06-04 08:28:20 +02:00

After all it's maybe not that much work given that the conflict labels are already accessible to the top-level merging functions…
And it reminds me that Git does something similar in git am:

       -3, --3way, --no-3way
           When the patch does not apply cleanly, fall back on 3-way merge if the patch records the identity of blobs
           it is supposed to apply to and we have those blobs available locally.  --no-3way can be used to override
           am.threeWay configuration variable. For more information, see am.threeWay in git-config(1).

So maybe it's not that backwards!!

After all it's maybe not that much work given that the conflict labels are already accessible to the top-level merging functions… And it reminds me that Git does something similar in `git am`: ``` -3, --3way, --no-3way When the patch does not apply cleanly, fall back on 3-way merge if the patch records the identity of blobs it is supposed to apply to and we have those blobs available locally. --no-3way can be used to override am.threeWay configuration variable. For more information, see am.threeWay in git-config(1). ``` So maybe it's not that backwards!!

xmo force-pushed error-reporting-state-not-conflicted from 1707759cc2 to 85d4423b20

2025-06-05 08:41:45 +02:00

Compare

xmo commented

2025-06-05 08:47:41 +02:00

@wetneb I've progressed on the error reporting, but I've hit a bit of a snag and I'm not sure there's a way around it: it might not be possible to get mergiraf to do a structured merge for missing stage 2 / missing stage 3 cases (outside of a merge driver context). That's because these are delete/modify and modify/delete conflicts (one branch deleted the file and the other updated it), and in those cases git just dumps the file from the non-deleted side in the working copy.

As a result, as far as I can tell when mergiraf is asked to solve the conflict it just sees a file with no conflicts and goes "there's not conflict there" and exits successfully, or something along those lines. It thus never even needs to look for the base revisions for the conflict.

@wetneb I've progressed on the error reporting, but I've hit a bit of a snag and I'm not sure there's a way around it: it might not be possible to get mergiraf to do a structured merge for missing stage 2 / missing stage 3 cases (outside of a merge driver context). That's because these are delete/modify and modify/delete conflicts (one branch deleted the file and the other updated it), and in those cases git just dumps the file from the non-deleted side in the working copy. As a result, as far as I can tell when mergiraf is asked to solve the conflict it just sees a file with no conflicts and goes "there's not conflict there" and exits successfully, or something along those lines. It thus never even needs to look for the base revisions for the conflict.

xmo force-pushed error-reporting-state-not-conflicted from 85d4423b20 to 437cf95f46

2025-06-05 08:48:40 +02:00

Compare

ada4a commented

2025-06-05 11:30:48 +02:00

in those cases git just dumps the file from the non-deleted side in the working copy

Hm, but when I do a rebase/merge and there is such a conflict, I do get it shown? Or is that exactly what you mean by "merge driver context"?

> in those cases git just dumps the file from the non-deleted side in the working copy Hm, but when I do a rebase/merge and there is such a conflict, I do get it shown? Or is that exactly what you mean by "merge driver context"?

xmo commented

2025-06-05 11:45:22 +02:00

in those cases git just dumps the file from the non-deleted side in the working copy

Hm, but when I do a rebase/merge and there is such a conflict, I do get it shown? Or is that exactly what you mean by "merge driver context"?

Git signals that there is a conflict, the file is in a conflicted state in the index, and I assume if there is a merge driver configured it will be called with all the bits (not sure of the precise calling conventions of the merge drivers in that case but I'd assume you get a signal that one of the file is missing), and mergiraf merge handles that somehow.

However the bit I'm currently working on is invoked by mergiraf solve, which in my understanding uses the file's conflict markers. So if there's no conflict markers it makes sense for mergiraf to assume there is no conflict and everything's fine.

I guess this is also used in the reporter thing so I could maybe create a test case for that tho.

> > in those cases git just dumps the file from the non-deleted side in the working copy > > Hm, but when I do a rebase/merge and there is such a conflict, I do get it shown? Or is that exactly what you mean by "merge driver context"? Git signals that there is a conflict, the file is in a conflicted state in the index, and I assume if there is a merge driver configured it will be called with all the bits (not sure of the precise calling conventions of the merge drivers in that case but I'd assume you get a signal that one of the file is missing), and `mergiraf merge` handles that somehow. However the bit I'm currently working on is invoked by `mergiraf solve`, which in my understanding uses the file's conflict markers. So if there's no conflict markers it makes sense for mergiraf to assume there is no conflict and everything's fine. I guess this is also used in the reporter thing so I could maybe create a test case for that tho.

wetneb commented

2025-06-05 12:36:00 +02:00

When one side deletes the file and the other modifies it, as far as I know:

git doesn't call the merge driver at all and just adds the modified file, marking it as conflicting -> so there is no real use case for mergiraf merge to handle this situation
I also don't see why one would want to use mergiraf solve on such a file - this isn't something mergiraf can help with as long as it's only able to deal with one file at a time.

So, as far as I can tell, the only case we should care about is the one where the file isn't present in the base revision, but is present in the left and right revisions. In other words, both sides add the same file with different contents. Although we can work on handling that gracefully in this area of revision retrieval from git, there are other hurdles in the downstream merging algorithm (such as matching) which mean that we don't do a good job at solving those cases in general. (See the test case examples/java/failing/left_right_class for instance).

When one side deletes the file and the other modifies it, as far as I know: * git doesn't call the merge driver at all and just adds the modified file, marking it as conflicting -> so there is no real use case for `mergiraf merge` to handle this situation * I also don't see why one would want to use `mergiraf solve` on such a file - this isn't something mergiraf can help with as long as it's only able to deal with one file at a time. So, as far as I can tell, the only case we should care about is the one where the file isn't present in the base revision, but is present in the left and right revisions. In other words, both sides add the same file with different contents. Although we can work on handling that gracefully in this area of revision retrieval from git, there are other hurdles in the downstream merging algorithm (such as matching) which mean that we don't do a good job at solving those cases in general. (See the test case `examples/java/failing/left_right_class` for instance).

xmo commented

2025-06-05 12:52:26 +02:00

When one side deletes the file and the other modifies it, as far as I know:

git doesn't call the merge driver at all and just adds the modified file, marking it as conflicting -> so there is no real use case for mergiraf merge to handle this situation

I also don't see why one would want to use mergiraf solve on such a file - this isn't something mergiraf can help with as long as it's only able to deal with one file at a time.

Right so sounds like handling / testing the case of a missing stage 2 / stage 3 revision at all is unnecessary.

I still need to see what happens in the bug reporter as it's the other user of that API. I guess it still has a use for this information order to not generate a bug report archive in the case of a delete/modify or modify/delete conflict, since this is unsupported / nonsensical for mergiraf?

But a difference should probably be made between "missing stage 1" and "missing stage 2 / 3", as it could make sense to create an archive in the case of a missing stage 1?

What about the case of an unindexed / stage 0 file? Maybe a path telling the user to just submit the file with conflict markers?

> When one side deletes the file and the other modifies it, as far as I know: > > * git doesn't call the merge driver at all and just adds the modified file, marking it as conflicting -> so there is no real use case for `mergiraf merge` to handle this situation > * I also don't see why one would want to use `mergiraf solve` on such a file - this isn't something mergiraf can help with as long as it's only able to deal with one file at a time. Right so sounds like handling / testing the case of a missing stage 2 / stage 3 revision at all is unnecessary. I still need to see what happens in the bug reporter as it's the other user of that API. I guess it still has a use for this information order to *not* generate a bug report archive in the case of a delete/modify or modify/delete conflict, since this is unsupported / nonsensical for mergiraf? But a difference should probably be made between "missing stage 1" and "missing stage 2 / 3", as it could make sense to create an archive in the case of a missing stage 1? What about the case of an unindexed / stage 0 file? Maybe a path telling the user to just submit the file with conflict markers?

wetneb commented

2025-06-05 13:12:01 +02:00

But a difference should probably be made between "missing stage 1" and "missing stage 2 / 3", as it could make sense to create an archive in the case of a missing stage 1?

Yes, I guess for the bug reporter we should ideally make an archive whatever revisions are missing, no? As soon as a user takes the time to want to report a buggy situation, I'd like to have as least hurdles towards that, and it's helpful for us to receive an archive with missing revisions, as we're then able to understand the situation better.

What about the case of an unindexed / stage 0 file? Maybe a path telling the user to just submit the file with conflict markers?

Yes we don't really have a satisfactory workflow in those cases. Submitting the file with conflict markers makes sense. It reminds me of #66 - the workflows around reviewing and reporting the output of mergiraf solve aren't ideal so far.

> But a difference should probably be made between "missing stage 1" and "missing stage 2 / 3", as it could make sense to create an archive in the case of a missing stage 1? Yes, I guess for the bug reporter we should ideally make an archive whatever revisions are missing, no? As soon as a user takes the time to want to report a buggy situation, I'd like to have as least hurdles towards that, and it's helpful for us to receive an archive with missing revisions, as we're then able to understand the situation better. > What about the case of an unindexed / stage 0 file? Maybe a path telling the user to just submit the file with conflict markers? Yes we don't really have a satisfactory workflow in those cases. Submitting the file with conflict markers makes sense. It reminds me of #66 - the workflows around reviewing and reporting the output of `mergiraf solve` aren't ideal so far.

xmo commented

2025-06-05 13:24:36 +02:00

But a difference should probably be made between "missing stage 1" and "missing stage 2 / 3", as it could make sense to create an archive in the case of a missing stage 1?

Yes, I guess for the bug reporter we should ideally make an archive whatever revisions are missing, no? As soon as a user takes the time to want to report a buggy situation, I'd like to have as least hurdles towards that, and it's helpful for us to receive an archive with missing revisions, as we're then able to understand the situation better.

An add/add conflict should definitely generate an archive, even if it's a known weak point currently, but if mergiraf can't work in the case of a delete/modify or modify/conflict telling that to the user seems more useful than having them report the issue then the issue being closed because it's essentially an impossible case though?

> > But a difference should probably be made between "missing stage 1" and "missing stage 2 / 3", as it could make sense to create an archive in the case of a missing stage 1? > > Yes, I guess for the bug reporter we should ideally make an archive whatever revisions are missing, no? As soon as a user takes the time to want to report a buggy situation, I'd like to have as least hurdles towards that, and it's helpful for us to receive an archive with missing revisions, as we're then able to understand the situation better. An add/add conflict should definitely generate an archive, even if it's a known weak point currently, but if mergiraf can't work in the case of a delete/modify or modify/conflict telling that to the user seems more useful than having them report the issue then the issue being closed because it's essentially an impossible case though?

wetneb commented

2025-06-05 15:12:08 +02:00

having them report the issue then the issue being closed because it's essentially an impossible case though?

As far as I am concerned, I want mergiraf to improve there, so I wouldn't want to dismiss bug reports about that, even if the issue is known. There is no fundamental/architectural reason why mergiraf can't handle those cases (Spork handles them, for instance).
But of course, if there is an option to give a suitable message to the user earlier on, that's great too.

> having them report the issue then the issue being closed because it's essentially an impossible case though? As far as I am concerned, I want mergiraf to improve there, so I wouldn't want to dismiss bug reports about that, even if the issue is known. There is no fundamental/architectural reason why mergiraf can't handle those cases (Spork handles them, for instance). But of course, if there is an option to give a suitable message to the user earlier on, that's great too.

xmo force-pushed error-reporting-state-not-conflicted from 437cf95f46 to eb5cf32896

2025-06-06 12:33:55 +02:00

Compare

xmo commented

2025-06-06 12:39:16 +02:00

As far as I am concerned, I want mergiraf to improve there, so I wouldn't want to dismiss bug reports about that, even if the issue is known. There is no fundamental/architectural reason why mergiraf can't handle those cases (Spork handles them, for instance). But of course, if there is an option to give a suitable message to the user earlier on, that's great too.

OK with that in mind, I updated the change to signal missing stages properly to the caller (by returning Option<GitTemporaryFile>). In the solver missing stages are replaced by empty strings as that seems to be the expected behaviour given the left_right_class example, and in the reporter it just leads the archive creator to not write anything (which ends up as an empty file in the zip).

Consequently, if checkout-index succeeds with a non-stage-0 file, and the output is well formed, the only message from mergiraf ought be the normal solve results.

In essence, the original proposal (improve solve messaging if the file we're trying to solve is not in a conflict state), but with the addition of missing stages not being considered errors anymore.

> As far as I am concerned, I want mergiraf to improve there, so I wouldn't want to dismiss bug reports about that, even if the issue is known. There is no fundamental/architectural reason why mergiraf can't handle those cases (Spork handles them, for instance). But of course, if there is an option to give a suitable message to the user earlier on, that's great too. OK with that in mind, I updated the change to signal missing stages properly to the caller (by returning `Option<GitTemporaryFile>`). In the solver missing stages are replaced by empty strings as that seems to be the expected behaviour given the `left_right_class` example, and in the reporter it just leads the archive creator to not write anything (which ends up as an empty file in the zip). Consequently, if `checkout-index` succeeds with a non-stage-0 file, and the output is well formed, the only message from mergiraf ought be the normal solve results. In essence, the original proposal (improve `solve` messaging if the file we're trying to solve is not in a conflict state), but with the addition of missing stages not being considered errors anymore.

xmo force-pushed error-reporting-state-not-conflicted from eb5cf32896 to 1d9fa49f66

2025-06-06 12:40:23 +02:00

Compare

ada4a reviewed

2025-06-10 11:45:19 +02:00

src/solve.rs Outdated

					
				@ -73,0 +74,4 @@

				        Err(FallbackMergeError::GitError(err)) => {

				            debug!("Error while extracting original revisions from Git: {err}");

				            warn!(

				                "Couldn't retrieve the original revisions from Git. This limits Mergiraf's ability to solve certain types of conflicts."

ada4a commented

2025-06-10 11:31:03 +02:00

I think having long strings like this can confuse rustfmt. Could you please break it apart with \?

warn!(
    "Couldn't retrieve the original revisions from Git. \
     This limits Mergiraf's ability to solve certain types of conflicts."
);

I think having long strings like this can confuse rustfmt. Could you please break it apart with `\`? ```rs warn!( "Couldn't retrieve the original revisions from Git. \ This limits Mergiraf's ability to solve certain types of conflicts." );

xmo commented

2025-06-10 13:48:00 +02:00

Sure.

Seems to mostly be an issue when the long string is part of a wider structure (struct literal, method chain, closure) though:

so I'm not sure it applies in this case.

Also if that's a worry you might want to enable

Possibly tune max_width as well if you find the default (100) too restrictive.

Note that wrap_comments straight up bails if the comment contains a URL, or is trailing, which then triggers error_on_unformatted. I'm not sure there's a way around that if a URL is longer than max_width + whatever intend the comment has.

Sure. Seems to mostly be an issue when the long string is part of a wider structure (struct literal, method chain, closure) though: - https://github.com/rust-lang/rustfmt/issues/3863 - https://github.com/rust-lang/rustfmt/issues/4800 - https://github.com/rust-lang/rustfmt/issues/6288 so I'm not sure it applies in this case. Also if that's a worry you might want to enable - [`format_strings`](https://rust-lang.github.io/rustfmt/?version=v1.8.0#format_strings) - [`wrap_comments`](https://rust-lang.github.io/rustfmt/?version=v1.8.0#wrap_comments) - [`error_on_unformatted`](https://rust-lang.github.io/rustfmt/?version=v1.8.0#error_on_unformatted) - [`error_on_line_overflow`](https://rust-lang.github.io/rustfmt/?version=v1.8.0#error_on_line_overflow) Possibly tune `max_width` as well if you find the default (100) too restrictive. Note that `wrap_comments` straight up bails if the comment contains a URL, or is trailing, which then triggers `error_on_unformatted`. I'm not sure there's a way around that if a URL is longer than `max_width` + whatever intend the comment has.

ada4a commented

2025-06-10 14:06:02 +02:00

Seems to mostly be an issue when the long string is part of a wider structure (struct literal, method chain, closure) though

Interesting. I'm pretty sure most of the cases of this we've had were in log macro arguments (since that's where one most often has long string literals), but I'm not sure whether there always was a surrounding structure of some kind.

Maybe match expressions? Though I think that one might also be just a frequency thing because the place we very often have logs is when matching on an expression and finding an invalid value.

Or maybe match expressions are generally pretty nested, and so the total length of the line that the log message is on exceeds some limit in rustfmt...

Also if that's a worry you might want to enable

Oh, didn't know about these, thanks! It seems like all of them require an unstable toolchain though, which is.. not ideal

> Seems to mostly be an issue when the long string is part of a wider structure (struct literal, method chain, closure) though Interesting. I'm pretty sure most of the cases of this we've had were in log macro arguments (since that's where one most often has long string literals), but I'm not sure whether there always was a surrounding structure of some kind. Maybe match expressions? Though I think that one might also be just a frequency thing because the place we very often have logs is when matching on an expression and finding an invalid value. Or maybe match expressions are generally pretty nested, and so the total length of the line that the log message is on exceeds some limit in rustfmt... > Also if that's a worry you might want to enable Oh, didn't know about these, thanks! It seems like all of them require an unstable toolchain though, which is.. not ideal

xmo commented

2025-06-10 14:21:50 +02:00

Oh, didn't know about these, thanks! It seems like all of them require an unstable toolchain though, which is.. not ideal

I'm not sure they do, I tried them locally and they just worked without needing to cargo +nightly so they seem to work on the stable toolchain.

From what I understand for rustfmt "unstable" doesn't mean unavailable but rather that they don't conform to the stability guarantee namely:

a newer version of Rustfmt cannot modify the successfully formatted output that was produced by a previous version of Rustfmt.

So e.g. you might have formatted code with format_strings in one version of rustfmt, and in the next it gets tweaked and will reformat the same string literal differently.

> Oh, didn't know about these, thanks! It seems like all of them require an unstable toolchain though, which is.. not ideal I'm not sure they do, I tried them locally and they just worked without needing to `cargo +nightly` so they seem to work on the stable toolchain. From what I understand for rustfmt "unstable" doesn't mean unavailable but rather that they don't conform to the stability guarantee namely: > a newer version of Rustfmt cannot modify the successfully formatted output that was produced by a previous version of Rustfmt. So e.g. you might have formatted code with `format_strings` in one version of rustfmt, and in the next it gets tweaked and will reformat the same string literal differently.

ada4a commented

2025-06-10 14:32:29 +02:00

Hm, here's what the "Configuring Rustfmt" section (at the top of https://rust-lang.github.io/rustfmt) says:

Each configuration option is either stable or unstable. Stable options can always be used, while unstable options are only available on a nightly toolchain and must be opted into. To enable unstable options, set unstable_features = true in rustfmt.toml or pass --unstable-features to rustfmt.

Hm, here's what the "Configuring Rustfmt" section (at the top of https://rust-lang.github.io/rustfmt) says: > Each configuration option is either stable or unstable. Stable options can always be used, while unstable options are only available on a nightly toolchain and must be opted into. To enable unstable options, set `unstable_features = true` in rustfmt.toml or pass `--unstable-features` to rustfmt.

xmo commented

2025-06-10 15:23:07 +02:00

Well I don't know what to tell you, I've just the stable toolchain and these settings work out of the box: https://asciinema.org/a/mgtj03E0ZT2S5qW0KI3iHt7Jr

My homedir has no hit for unstable_features outside of the cargo registry, and just 3 rustfmt.toml, two which just set the max_width and one which only sets the edition.

Maybe they always work on the command line but have to be enabled for on-disk configuration files?

Well I don't know what to tell you, I've just the stable toolchain and these settings work out of the box: https://asciinema.org/a/mgtj03E0ZT2S5qW0KI3iHt7Jr My homedir has no hit for `unstable_features` outside of the cargo registry, and just 3 rustfmt.toml, two which just set the `max_width` and one which only sets the `edition`. Maybe they always work on the command line but have to be enabled for on-disk configuration files?

xmo added 1 commit

2025-06-10 13:27:11 +02:00

Break up long string literal

/ test (pull_request) Successful in 1m0s

Details

55fac5257d

wetneb approved these changes

2025-06-11 13:45:02 +02:00

wetneb left a comment

Fantastic! As far as I am concerned, this is good to go. Thank you so much for this!

wetneb merged commit f2d10ced14 into main

2025-06-15 22:22:29 +02:00