feat: manual language selection #327
No reviewers
Labels
No labels
Compat/Breaking
Kind
Bad merge
Kind
Bug
Kind
Documentation
Kind
Enhancement
Kind
Feature
Kind
New language
Kind
Security
Kind
Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Status
Abandoned
Status
Blocked
Status
Need More Info
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: mergiraf/mergiraf#327
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "wetneb/mergiraf:manual-language-selection-v2"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Continuation of #296 by @mathstuf.
Love the tests, and docs! There were still some things left unaddressed from #296 -- I mostly copied the comments from there verbatim
@ -84,0 +84,4 @@
### Manually specifying the file's language
You can use the `--language` option (short: `-L`) to specify the language of the files to merge.
It accepts both file extensions (`--language js`) and language names (`--language javascript`), as specified in the list of [supported languages](./languages.md).
it doesn't actually though?.. At least right now,
LangProfile::find_by_name
only looks atname
andalternate_names
, notextensions
Ah, great that you point this out! Wouldn't it make sense to also include extensions then? Arguably, they are language identifiers that users will be familiar with, no?
Definitely! I mean the only problem with that would be
.pl
, which is used both for Prolog and Perl. But:.pro
extension for Prolog because of thatOther than that, I think it's pretty much a no-brainer
@ -53,1 +53,4 @@
conflict_marker_size: Option<usize>,
/// Override automatic language detection.
#[arg(short = 'L', long)]
language: Option<String>,
#296 (comment):
@ -162,6 +163,7 @@ pub static SUPPORTED_LANGUAGES: LazyLock<Vec<LangProfile>> = LazyLock::new(|| {
},
LangProfile {
name: "Java properties",
#296 (comment):
@ -349,6 +354,7 @@ pub static SUPPORTED_LANGUAGES: LazyLock<Vec<LangProfile>> = LazyLock::new(|| {
},
LangProfile {
name: "Javascript",
alternate_names: &[],
#296 (comment):
#296 (comment):
Yes, I think that's a good case for including extensions then, no?
feat: Manual language selectionto feat: manual language selection5bc5517e72
toc010642734
mostly stylistic nits
@ -215,0 +216,4 @@
let repo_dir = tempfile::tempdir().expect("failed to create the temp dir");
let test_file = repo_dir.path().join(Path::new("file.txt"));
fs::copy(
Path::new("examples/java/working/demo/Base.java"),
you should be able to omit
Path::new
here and directly above, since the methods takeAsRef<Path>
, which is implemented for&str
@ -34,0 +38,4 @@
pub fn find_by_name(name: &str) -> Option<&'static Self> {
SUPPORTED_LANGUAGES.iter().find(|lang_profile| {
lang_profile.name.eq_ignore_ascii_case(name)
|| lang_profile
@ -34,0 +41,4 @@
|| lang_profile
.alternate_names
.iter()
.chain(lang_profile.extensions.iter())
You should be able to replace
.iter()
with&
here@ -237,0 +290,4 @@
#[test]
fn find_by_filename_or_name() {
assert_eq!(
LangProfile::find_by_filename_or_name(Path::new("file.json"), None)
you shouldn't need all the
Path::new
s in this test, sinceLangProfile::find_by_filename_or_name
acceptsAsRef<Path>
, so a&str
works on its own@ -237,0 +301,4 @@
.name,
"JSON"
);
LangProfile::find_by_filename_or_name(
expect_err
is kind of confusing -- I think usingassert!(/* ... */.is_err(), /* the message */)
might be clearer@ -41,3 +38,1 @@
fname_base.display()
);
return line_based_merge(contents_base, contents_left, contents_right, &settings);
let lang_profile = match LangProfile::find_by_filename_or_name(fname_base, language) {
couldn't you use
let Ok(lang_profile) = /* ... */ else { /* ... */ }
?I thought about it and the problem is that I need the error message in the
else
block… do you have a solution for that?Oh, completely missed that! It looks like that
match
is indeed required then, yeah.By the way,
message
already ends with a.
, so you'll end up with two of them – I think it makes more sense to remove the ones infind_by_filename_or_name
Good eyes! :)
@ -28,3 +26,1 @@
fname_base.display()
)
})?;
let lang_profile = if let Some(lang_name) = language {
can't this whole block be replaced with
LangProfile::detect_from_filename_or_name
?Very good catch, of course that's much better.
more of the same
@ -215,0 +223,4 @@
"parse",
"--language",
"java",
test_file.display().to_string().as_str(),
I'm pretty sure
.display()
is plain redundant here, sinceto_string
is implemented in terms of it -- so you can just remove that.You could also replace
to_string().as_str()
withto_str().unwrap()
(first one replaces non-Unicode characters with the?
thing, while the second one returnsNone
on non-UTF8 characters, both of which won't happen in this case), but it's not like one is really shorter than the other... Replacing.as_str()
with&
, on the other hand, will save some charactersI see that you used
to_str().unwrap()
inmergiraf.rs
, so maybe go with that for consistency... up to you@ -237,0 +301,4 @@
);
assert!(
LangProfile::find_by_filename_or_name(
Path::new("file.json"),
missed one
Path::new
here:)@ -237,0 +307,4 @@
.is_err(),
"If a language name is provided, the file name should be ignored"
);
LangProfile::find_by_filename_or_name(Path::new("file.unknown_extension"), None)
this could also be turned into an
assert!
, andPath::new
can be removed as well@ -237,0 +276,4 @@
assert_eq!(LangProfile::find_by_name("python").unwrap().name, "Python");
assert_eq!(LangProfile::find_by_name("py").unwrap().name, "Python");
assert_eq!(
LangProfile::find_by_name("Java properties").unwrap().name,
When I wrote about the need for tests for languages whose names contain spaces, I was thinking more about cli tests... but I mean the users will probably be able to realize that such names need quoting. So yeah, I guess that doesn't really need tests after all, and having this test is also good
One final change, and we can finally merge this. Thank you for bringing this to completion:)
@ -542,6 +559,7 @@ pub static SUPPORTED_LANGUAGES: LazyLock<Vec<LangProfile>> = LazyLock::new(|| {
},
LangProfile {
name: "Typescript (TSX)",
alternate_names: &["TSX"],
this one is redundant now that we match on extensions