feat(Git): support reading language from git attributes #599

Merged
wetneb merged 12 commits from attribute-based-language-selection into main 2025-09-30 14:30:09 +02:00
Owner

This is a continuation of #565, which I'm not allowed to push to directly.
I've added commits to address the points brought up by @ada4a.

There is still the open question of whether we want to honor the linguist gitattributes. It could be tackled here or in a follow-up PR.

When used as a merge driver, `mergiraf` only has the file extension as
guidance for language specification. This may not be suitable in all
cases (e.g., custom languages that are supported-language-adjacent or
`.in` template files that are mostly another language). Allow
specification of the language using the `mergiraf.language` attribute.
Inline find_impl
Some checks failed
/ test (pull_request) Failing after 19s
e4fc23458b
cargo fmt
All checks were successful
/ test (pull_request) Successful in 38s
42cab5454c
Owner

> There is still the open question of whether we want to honor the linguist gitattributes

I'd still prefer to _only_ use them, because they're kind of the standard -- if your file type doesn't get recognized, it'd be annoying to need to specify both `linguist-language` _and_ `mergiraf.language` imo
requested review from mathstuf 2025-09-17 21:33:06 +02:00
Member

@ada4a wrote in https://codeberg.org/mergiraf/mergiraf/pulls/599#issuecomment-7228570:

> > There is still the open question of whether we want to honor the linguist gitattributes
>
> I'd still prefer to _only_ use them, because they're kind of the standard -- if your file type doesn't get recognized, it'd be annoying to need to specify both `linguist-language` _and_ `mergiraf.language` imo

Eh. I'm not sure that one would say that `mergiraf` would always be 100% in agreement with `linguist`. For example, say `mergiraf` wants to add something like the more specific logic for `pyproject.toml` rather than a generic `toml` file. If Linguist doesn't differentiate, what could `mergiraf` do?

For bundled attributes like this, projects can do:

```
[attr]my-lang-props mergiraf.language=flubber linguist-language=flobber
*.flb my-lang-props
```
Owner

Yeah, the silly Cake thing is kind of an example for that. Maybe we could look for `mergiraf.language` first, and only then `linguist-language`
mathstuf left a comment
Member

I think `mergiraf.language` and then asking `linguist-language` makes sense.
src/git.rs Outdated
@ -97,0 +110,4 @@
.ok()
.filter(|output| output.status.success())?;
// Parse the output of git-check-attr, which looks like:
// <path> COLON SP <attribute> COLON SP <info> LF
Member

Well, it is `NUL` separators here with the `-z` flag above.
mathstuf marked this conversation as resolved
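For readers following the remark about `-z`: with that flag, `git check-attr` emits `<path> NUL <attribute> NUL <info> NUL` records instead of the colon-and-space form quoted in the code comment. The sketch below shows one way such output could be parsed. It is not the code from this PR; the names `AttrRecord` and `parse_check_attr_z` are hypothetical and chosen only for illustration.

```rust
/// One record of `git check-attr -z` output.
/// (Hypothetical type for illustration, not part of mergiraf.)
#[derive(Debug, PartialEq, Eq)]
pub struct AttrRecord<'a> {
    pub path: &'a str,
    pub attribute: &'a str,
    pub info: &'a str,
}

/// Parses `<path> NUL <attribute> NUL <info> NUL` records.
/// Returns `None` if the output is not valid UTF-8 or a record is truncated.
pub fn parse_check_attr_z(output: &[u8]) -> Option<Vec<AttrRecord<'_>>> {
    let text = std::str::from_utf8(output).ok()?;
    // Every field is NUL-terminated, so drop the final terminator before splitting.
    let text = text.strip_suffix('\0').unwrap_or(text);
    if text.is_empty() {
        return Some(Vec::new());
    }
    let fields: Vec<&str> = text.split('\0').collect();
    if fields.len() % 3 != 0 {
        return None;
    }
    Some(
        fields
            .chunks_exact(3)
            .map(|c| AttrRecord {
                path: c[0],
                attribute: c[1],
                info: c[2],
            })
            .collect(),
    )
}
```

Splitting on `NUL` rather than on `": "` means a path that itself contains a colon followed by a space is still parsed unambiguously.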
Author
Owner

Great! I'll be taking a small holiday from mergiraf, do feel invited to add further changes to this PR (the branch is in the upstream repo so that it's easy for everyone to write to it).

mathstuf force-pushed attribute-based-language-selection from c0e68aac36 to 78b80fd4c8 2025-09-22 01:28:26 +02:00 Compare
Member

I added support for `mergiraf.language` with fallback to `linguist-language`. Test suite updated to cover it.
mathstuf force-pushed attribute-based-language-selection from 78b80fd4c8 to 60e74cc0b6 2025-09-22 01:30:41 +02:00 Compare
@ -97,0 +124,4 @@
.filter(|value| value != "unspecified" && value != "set" && value != "unset")
};
read_attr("mergiraf.language").or_else(|| read_attr("linguist-language"))
Author
Owner

This makes two calls to `git check-attr`, but apparently `git check-attr` supports retrieving multiple attributes at once according to its man page. So I think it would be worth optimizing that further. Leaving that to a follow-up PR.
Owner

Could you please leave a TODO?

Author
Owner

I'll do it directly anyway

ada4a marked this conversation as resolved
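The single-call idea discussed in this thread could look roughly like the sketch below: both attributes are passed to one `git check-attr -z` invocation, and the `mergiraf.language`-then-`linguist-language` precedence is applied to the parsed records. This is an assumption about one possible shape, not the code that was merged; `ATTRIBUTES`, `check_attr_args`, and `language_from_output` are hypothetical names.

```rust
use std::ffi::OsString;
use std::path::Path;

/// Attributes to query, in order of precedence (as agreed in the discussion).
const ATTRIBUTES: [&str; 2] = ["mergiraf.language", "linguist-language"];

/// Builds the arguments for a single `git check-attr` call covering both attributes.
/// (Hypothetical helper; the caller would pass these to `std::process::Command::new("git")`.)
pub fn check_attr_args(path: &Path) -> Vec<OsString> {
    let mut args: Vec<OsString> = vec!["check-attr".into(), "-z".into()];
    args.extend(ATTRIBUTES.iter().map(|attr| OsString::from(*attr)));
    args.push("--".into());
    args.push(path.as_os_str().to_owned());
    args
}

/// Picks the language from NUL-separated `<path> <attribute> <info>` records,
/// preferring `mergiraf.language` and falling back to `linguist-language`.
pub fn language_from_output(output: &str) -> Option<&str> {
    let fields: Vec<&str> = output.split_terminator('\0').collect();
    ATTRIBUTES.iter().find_map(|wanted| {
        fields
            .chunks_exact(3)
            .find(|record| record[1] == *wanted)
            .map(|record| record[2])
            // These three values mean the attribute carries no usable language name.
            .filter(|value| !matches!(*value, "unspecified" | "set" | "unset"))
    })
}
```

Because the lookup goes by attribute name rather than by position, the result does not depend on the order in which `git` reports the two attributes.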
cargo fmt
All checks were successful
/ test (pull_request) Successful in 38s
dbcd30977d
wetneb merged commit 48caa54d96 into main 2025-09-30 14:30:09 +02:00
wetneb deleted branch attribute-based-language-selection 2025-09-30 14:30:18 +02:00
Reference: mergiraf/mergiraf#599