WIP: feat: Gettext support (`*.po`, `*.pot`) #584
Reference: mergiraf/mergiraf#584
No description provided.
Interestingly, support for Gettext files is essentially useless without #576, because each entry in the file is normally preceded by a comment.
Also, this `comment` node is not marked as "extra" in the grammar. We will probably need some additional field in the language profiles to specify a set of node types that should also be bundled, even though they are not extras. This would also be useful for Python docstrings (assuming we can change the grammar appropriately).
Hm, looking at the spec, I don't think those comments are really extra; rather, they're optional attributes of the message they annotate. Bundling them together with the message looks to me like something that should be implemented by the grammar...
I agree, but this sentence makes me think that some comments might be allowed to appear as isolated lines:
Doesn't that sort of imply that the other comments (starting with `#`) are meant to be completely ignored by the program, and hence could appear in any order and on any line in the file?
For instance you could perhaps have something like this:
Ah, I see what you mean. I tried searching GitHub for some examples, and apparently what happens instead is that the `#` comments are put onto a dummy entry at the beginning of the file, like this:
Which would suggest that the comments can't be free-standing after all (even though your example does look plausible and helpful).
But that's of course no hard proof. I'm assuming that the Gettext format support was requested by someone? If so, I'd probably ask them about this, because they're likely more familiar with the language.
Support for this format wasn't asked for by anyone, but in the dataset of merge cases that I'm gathering, it's the most common file format that we don't handle yet.
I totally believe the comments are indeed bundlable in the grammar like you said for the overwhelming majority of PO files out there. We can actually use the dataset to check if the parsing errors increase if we change the grammar.
8ac9df1914 to 452d2d3398
So, thanks to the bundling feature, the motivating test case now passes, but running this on real examples of merge conflicts on `.po` files, I don't get very convincing results.
This seems to come mostly from cases like this:
There is no way for mergiraf to know how to resolve the conflict between `template.php:92` and `template.php:108`.
The desired behavior from a user standpoint would be to pick either of the two, because it doesn't matter: at the next update of the file, the line number will be corrected. What matters is the `msgid` to `msgstr` mapping, for which there is no conflict here.
So maybe mergiraf isn't a great fit for `.po` files.
Here are more detailed statistics:
`*.po`
Not exactly convincing.
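To make the point about reference comments concrete, here is a rough Python sketch (not part of mergiraf, and not how mergiraf parses anything) that compares two revisions of an entry by their `msgid` to `msgstr` mapping while keeping the `#:` references separate. The `parse_po` helper and the "Hello"/"Bonjour" entry are made up for illustration; only the `template.php:92` and `template.php:108` references come from the case above. It assumes a simplified PO syntax: no `msgctxt`, no plural forms, no escape handling.

```python
def unquote(token):
    """Strip the surrounding double quotes (escape sequences are left as-is)."""
    return token.strip()[1:-1]


def parse_po(text):
    """Hypothetical helper: return (mapping, references).

    mapping:    msgid -> msgstr
    references: msgid -> list of '#:' source references
    """
    mapping, references = {}, {}
    refs, msgid, msgstr, field = [], None, None, None

    def flush():
        # Store the entry collected so far, then reset the per-entry state.
        nonlocal refs, msgid, msgstr, field
        if msgid is not None:
            mapping[msgid] = msgstr or ""
            references[msgid] = refs
        refs, msgid, msgstr, field = [], None, None, None

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            flush()  # a blank line ends the current entry
        elif line.startswith("#:"):
            refs.extend(line[2:].split())  # e.g. "template.php:92"
        elif line.startswith("#"):
            continue  # other comment kinds are ignored in this sketch
        elif line.startswith("msgid "):
            msgid, field = unquote(line[6:]), "msgid"
        elif line.startswith("msgstr "):
            msgstr, field = unquote(line[7:]), "msgstr"
        elif line.startswith('"') and field == "msgid":
            msgid += unquote(line)  # continuation line of msgid
        elif line.startswith('"') and field == "msgstr":
            msgstr += unquote(line)  # continuation line of msgstr
        else:
            field = None  # unsupported keyword (msgctxt, plurals, ...)
    flush()
    return mapping, references


OURS = '#: template.php:92\nmsgid "Hello"\nmsgstr "Bonjour"\n'
THEIRS = '#: template.php:108\nmsgid "Hello"\nmsgstr "Bonjour"\n'

ours_map, ours_refs = parse_po(OURS)
theirs_map, theirs_refs = parse_po(THEIRS)

print(ours_map == theirs_map)    # True: the translations agree
print(ours_refs == theirs_refs)  # False: only the line references differ
```

A tool that knew this much about the format could take either side's reference when the mappings agree, which is the behavior described above as desirable. A generic syntactic merge has no way to know that the two line numbers are interchangeable.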
Pull request closed