# 🌍 Localization (L10n) in uutils coreutils
This guide explains how localization (L10n) is implemented in the **Rust-based coreutils project**, detailing the use of [Fluent](https://projectfluent.org/) files, runtime behavior, and developer integration.
## 🏗️ Architecture Overview
**English (US) locale files (`en-US.ftl`) are embedded directly in the binary**, ensuring that English always works regardless of how the software is installed. Other language locale files are loaded from the filesystem at runtime.
### Source Repository Structure
- **Main repository**: Contains English (`en-US.ftl`) locale files embedded in binaries
- **Translation repository**: [uutils/coreutils-l10n](https://github.com/uutils/coreutils-l10n) contains all other language translations
- **Online translation**: Strings are translated on Weblate at [weblate/rust-coreutils](https://hosted.weblate.org/projects/rust-coreutils/).
---
## 📁 Fluent File Layout
Each utility has its own set of translation files under:
```
src/uu/<utility>/locales/<locale>.ftl
```
Examples:
```
src/uu/ls/locales/en-US.ftl # Embedded in binary
src/uu/ls/locales/fr-FR.ftl # Loaded from filesystem
```
These files follow Fluent syntax and contain localized message patterns.
Besides English, French is the only locale kept in the main tree. It is there so that tests can run under a non-English locale and verify that localization works.
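The layout above can be expressed as a small path helper. This is only a sketch: `ftl_path` is a hypothetical name used for illustration and is not part of uucore.

```rust
use std::path::PathBuf;

/// Builds the in-tree path of a utility's Fluent file, following the
/// `src/uu/<utility>/locales/<locale>.ftl` layout.
/// Hypothetical helper, shown for illustration only.
fn ftl_path(utility: &str, locale: &str) -> PathBuf {
    PathBuf::from("src/uu")
        .join(utility)
        .join("locales")
        .join(format!("{locale}.ftl"))
}

fn main() {
    // Print the paths of the two locales that live in the main tree.
    for locale in ["en-US", "fr-FR"] {
        println!("{}", ftl_path("ls", locale).display());
    }
}
```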
---
## ⚙️ Initialization
Localization must be explicitly initialized at runtime using:
```
setup_localization(path)
```
This is typically done:
- In `src/bin/coreutils.rs` for **multi-call binaries**
- In `src/uucore/src/lib.rs` for **single-call utilities**
The string parameter determines the lookup path for Fluent files. **English always works** because it's embedded, but other languages need their `.ftl` files to be available at runtime.
---
## 🌐 Locale Detection
Locale selection is automatic and performed via:
```
fn detect_system_locale() -> Result<LanguageIdentifier, LocalizationError>
```
It reads the `LANG` environment variable (e.g., `fr-FR.UTF-8`), strips the encoding suffix (`.UTF-8`), and parses the remainder as a language identifier.
If parsing fails or `LANG` is not set, it falls back to:
```
const DEFAULT_LOCALE: &str = "en-US";
```
You can override the locale for a single invocation by setting `LANG` on the command line:
```
LANG=ja-JP ./target/debug/ls
```
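The detection steps described in this section (read `LANG`, strip the encoding, parse, fall back to `en-US`) can be sketched with the standard library alone. This is a simplified stand-in, not the uucore implementation: `locale_from_lang` is a hypothetical name, it returns a `String` instead of a `LanguageIdentifier`, and its validation rules are an assumption that is much looser than a real language-tag parser.

```rust
const DEFAULT_LOCALE: &str = "en-US";

/// Derives a locale tag from the value of `LANG`.
/// Hypothetical helper: the validation below is a rough approximation.
fn locale_from_lang(lang: Option<&str>) -> String {
    let Some(raw) = lang else {
        // `LANG` is not set.
        return DEFAULT_LOCALE.to_string();
    };
    // Drop the encoding suffix: "fr-FR.UTF-8" -> "fr-FR".
    let tag = raw.split('.').next().unwrap_or("");
    let mut parts = tag.split(|c: char| c == '-' || c == '_');
    // Assumed rule: the language subtag is 2 or 3 ASCII letters.
    let language_ok = parts
        .next()
        .map(|l| (2..=3).contains(&l.len()) && l.chars().all(|c| c.is_ascii_alphabetic()))
        .unwrap_or(false);
    // Assumed rule: remaining subtags are non-empty and alphanumeric.
    let rest_ok = parts.all(|p| !p.is_empty() && p.chars().all(|c| c.is_ascii_alphanumeric()));
    if language_ok && rest_ok {
        tag.to_string()
    } else {
        DEFAULT_LOCALE.to_string()
    }
}

fn main() {
    let lang = std::env::var("LANG").ok();
    println!("{}", locale_from_lang(lang.as_deref()));
}
```

Passing the value in as an `Option<&str>` rather than reading the environment inside the function keeps the parsing logic easy to test.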
---
## 📥 Retrieving Messages
A single macro, `translate!`, handles all translations.
It can be used in two ways:
### `translate!(id: &str) -> String`
Returns the message from the current locale bundle.
```
let msg = translate!("id-greeting");
```
If the message is not found in the current locale, the macro falls back to `en-US`. If it is missing there too, it returns the ID itself.
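The fallback chain (current locale, then `en-US`, then the ID) can be illustrated with plain hash maps standing in for Fluent bundles. The `Bundle` alias, the `lookup` function, and the sample strings are hypothetical and exist only for this sketch.

```rust
use std::collections::HashMap;

/// Stand-in for a Fluent bundle: message ID -> message text.
type Bundle = HashMap<&'static str, &'static str>;

/// Looks up `id` in the current locale, then in English,
/// and finally returns the ID itself if both lookups fail.
fn lookup(current: &Bundle, english: &Bundle, id: &str) -> String {
    current
        .get(id)
        .or_else(|| english.get(id))
        .map(|s| s.to_string())
        .unwrap_or_else(|| id.to_string())
}

fn main() {
    let fr = Bundle::from([("id-greeting", "Bonjour le monde !")]);
    let en = Bundle::from([("id-greeting", "Hello, world!"), ("welcome", "Welcome!")]);
    for id in ["id-greeting", "welcome", "missing-id"] {
        println!("{id} -> {}", lookup(&fr, &en, id));
    }
}
```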
---
### `translate!(id: &str, args: key-value pairs) -> String`
Supports variable interpolation and pluralization.
```
let msg = translate!(
    "error-io",
    "error" => std::io::Error::last_os_error()
);
```
Fluent message example:
```
error-io = I/O error occurred: { $error }
```
Variables must match the Fluent placeholder keys (`$error`, `$name`, `$count`, etc.).
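To make the key-matching rule concrete, here is a toy interpolator that replaces `{ $key }` placeholders by exact string match. It is not how Fluent resolves patterns; `interpolate` is a hypothetical helper, and leaving unmatched placeholders untouched is a choice made for this sketch only.

```rust
/// Replaces every `{ $key }` placeholder in `pattern` with the value
/// supplied under the same key. Placeholders without a matching key
/// are left as they are (toy behavior, not Fluent's).
fn interpolate(pattern: &str, args: &[(&str, &str)]) -> String {
    let mut out = pattern.to_string();
    for (key, value) in args {
        // Builds the literal text "{ $key }".
        let placeholder = format!("{{ ${key} }}");
        out = out.replace(&placeholder, value);
    }
    out
}

fn main() {
    let pattern = "I/O error occurred: { $error }";
    // The key "error" matches the `$error` placeholder.
    println!("{}", interpolate(pattern, &[("error", "disk full")]));
    // The key "user" does not match `$name`, so nothing is substituted.
    println!("{}", interpolate("Welcome, { $name }!", &[("user", "Alice")]));
}
```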
---
## 📦 Fluent Syntax Example
```
id-greeting = Hello, world!
welcome = Welcome, { $name }!
count-files = You have { $count ->
    [one] { $count } file
   *[other] { $count } files
}
```
Use plural rules and inline variables to adapt messages dynamically.
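The `count-files` selector above picks a variant from the plural category of `$count`. A hand-written equivalent for English integers might look like the sketch below. The function names are hypothetical, and the rule "exactly 1 is `one`, everything else is `other`" only covers English whole numbers; Fluent derives the category from locale-specific plural rules.

```rust
/// Plural category for an English integer count (simplified rule).
fn plural_category_en(count: u64) -> &'static str {
    if count == 1 { "one" } else { "other" }
}

/// Mirrors the `count-files` message by selecting a variant
/// based on the plural category.
fn count_files(count: u64) -> String {
    match plural_category_en(count) {
        "one" => format!("You have {count} file"),
        // `other` is the default variant, marked with `*` in the .ftl file.
        _ => format!("You have {count} files"),
    }
}

fn main() {
    for n in [0, 1, 5] {
        println!("{}", count_files(n));
    }
}
```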
---
## 🧪 Testing Localization
Run all localization-related unit tests with:
```
cargo test --lib -p uucore
```
Tests include:
- Loading bundles
- Plural logic
- Locale fallback
- Fluent parse errors
- Thread-local behavior
- ...
---
## 🧵 Thread-local Storage
The localization state is stored per thread using a `OnceLock`.
Each thread must call `setup_localization()` individually.
Initialization is **one-time-only** per thread; re-initialization results in an error.
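The one-time, per-thread behavior can be reproduced with a `thread_local!` `OnceLock` from the standard library. This is a minimal sketch of the pattern, not the uucore code: `LOCALE`, `init_locale`, and `current_locale` are hypothetical names, and a `String` stands in for the real localizer state.

```rust
use std::sync::OnceLock;
use std::thread;

thread_local! {
    // One independent cell per thread.
    static LOCALE: OnceLock<String> = OnceLock::new();
}

/// Initializes the current thread's locale. A second call on the
/// same thread fails because `OnceLock::set` only succeeds once.
fn init_locale(locale: &str) -> Result<(), String> {
    LOCALE.with(|cell| {
        cell.set(locale.to_string())
            .map_err(|_| "localization already initialized on this thread".to_string())
    })
}

/// Returns the current thread's locale, if it has been initialized.
fn current_locale() -> Option<String> {
    LOCALE.with(|cell| cell.get().cloned())
}

fn main() {
    println!("first init:  {:?}", init_locale("fr-FR"));
    println!("second init: {:?}", init_locale("ja-JP"));
    // A freshly spawned thread starts uninitialized.
    let in_worker = thread::spawn(current_locale).join().unwrap();
    println!("worker sees: {in_worker:?}");
}
```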
---
## 🧪 Development vs Release Mode
During development (`cfg(debug_assertions)`), paths are resolved relative to the crate source:
```
$CARGO_MANIFEST_DIR/../uu/<utility>/locales/
```
In release mode, locale files are looked up **relative to the executable** and in the standard installation locations:
```
<executable_dir>/locales/<utility>/
<prefix>/share/locales/<utility>/
~/.local/share/coreutils/locales/<utility>/
~/.cargo/share/coreutils/locales/<utility>/
/usr/share/coreutils/locales/<utility>/
```
If external locale files aren't found, the system falls back to embedded English locales.
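A release-mode lookup over the directories listed above could be sketched as follows. Several details are assumptions made for illustration: the function names are hypothetical, the list is treated as an ordered search path, and `<prefix>` and the home directory are passed in explicitly rather than discovered.

```rust
use std::path::{Path, PathBuf};

/// Lists candidate locale directories for `utility`, in the order
/// they appear in the documentation (assumed to be the search order).
fn candidate_locale_dirs(exe_dir: &Path, prefix: &Path, home: &Path, utility: &str) -> Vec<PathBuf> {
    vec![
        exe_dir.join("locales").join(utility),
        prefix.join("share/locales").join(utility),
        home.join(".local/share/coreutils/locales").join(utility),
        home.join(".cargo/share/coreutils/locales").join(utility),
        PathBuf::from("/usr/share/coreutils/locales").join(utility),
    ]
}

/// Returns the first candidate that exists as a directory, if any.
fn first_existing(dirs: &[PathBuf]) -> Option<&PathBuf> {
    dirs.iter().find(|d| d.is_dir())
}

fn main() {
    // Illustrative paths only.
    let dirs = candidate_locale_dirs(
        Path::new("/opt/uu/bin"),
        Path::new("/opt/uu"),
        Path::new("/home/alice"),
        "ls",
    );
    match first_existing(&dirs) {
        Some(dir) => println!("using {}", dir.display()),
        None => println!("no locale directory found, using embedded English"),
    }
}
```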
---
## 🔤 Unicode Isolation Handling
By default, the Fluent system wraps variables with Unicode directional isolate characters (`U+2068`, `U+2069`) to protect against visual reordering issues in bidirectional text (e.g., mixing Arabic and English).
In this implementation, isolation is **disabled** via:
```
bundle.set_use_isolating(false);
```
This improves readability in CLI environments by preventing extraneous characters around interpolated values.
Rendered output with isolation disabled (as in this implementation):
```
"Welcome, Alice!"
```
Fluent default (isolation enabled):
```
"Welcome, \u{2068}Alice\u{2069}!"
```
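The difference between the two outputs can be reproduced without Fluent by wrapping the interpolated value by hand. The `welcome` function below is a hypothetical stand-in that imitates the effect of the isolation setting on a single message.

```rust
const FSI: char = '\u{2068}'; // FIRST STRONG ISOLATE
const PDI: char = '\u{2069}'; // POP DIRECTIONAL ISOLATE

/// Renders the `welcome` message, optionally wrapping the
/// interpolated name in directional isolate characters.
fn welcome(name: &str, use_isolating: bool) -> String {
    if use_isolating {
        format!("Welcome, {}{}{}!", FSI, name, PDI)
    } else {
        format!("Welcome, {}!", name)
    }
}

fn main() {
    // `{:?}` escapes the invisible characters so they show up in a terminal.
    println!("{:?}", welcome("Alice", false));
    println!("{:?}", welcome("Alice", true));
}
```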
---
## 🔧 Embedded English Locales
English locale files are always embedded directly in the binary during the build process. This ensures that:
- **English always works** regardless of installation method (e.g., `cargo install`)
- **No runtime dependency** on external `.ftl` files for English
- **Fallback behavior** when other language files are missing
The embedded English locales are generated at build time and included in the binary, providing a reliable fallback while still supporting full localization for other languages when their `.ftl` files are available.
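The fallback to embedded English can be sketched as "try the filesystem, otherwise use a compiled-in string". In this illustration a `const` stands in for the embedded resource; the build-time generation step is not shown, and `load_ftl` and `EMBEDDED_EN_US` are hypothetical names.

```rust
use std::fs;
use std::path::Path;

// Stand-in for the English resource that is compiled into the binary.
const EMBEDDED_EN_US: &str = "id-greeting = Hello, world!\n";

/// Reads `<locales_dir>/<locale>.ftl` if it exists and is readable,
/// otherwise returns the embedded English resource.
fn load_ftl(locales_dir: &Path, locale: &str) -> String {
    let file = locales_dir.join(format!("{locale}.ftl"));
    fs::read_to_string(file).unwrap_or_else(|_| EMBEDDED_EN_US.to_string())
}

fn main() {
    // The directory does not exist, so the embedded text is used.
    let text = load_ftl(Path::new("/nonexistent-uu-sketch/locales/ls"), "fr-FR");
    print!("{text}");
}
```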