An iterator adapter that takes an iterator over char whose output is in
Normalization Form C (this precondition is not checked!) and yields chars
in one of two forms: either only the tone marks that wouldn’t otherwise fit
into windows-1258 are decomposed, or the text is decomposed into orthographic
units.
Use cases include preprocessing before encoding Vietnamese text into windows-1258 or converting precomposed Vietnamese text into a form that looks like it was written with the (non-IME) Vietnamese keyboard layout (e.g. for machine learning training or benchmarking purposes).
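To make the idea of detaching tone marks concrete, here is a minimal, self-contained sketch of the orthographic-units style of decomposition. It is not this crate's implementation. The names `split_tone`, `DetachTones`, `DetachTonesExt`, `detach_tones` and `detone_str` are hypothetical, and the lookup table covers only four letters for illustration. The real `decompose_vietnamese_tones` method covers the full Vietnamese repertoire, and its exact signature is not shown on this page.

```rust
/// Hypothetical helper: splits a precomposed Vietnamese letter into
/// (base letter, combining tone mark). Illustrative subset only.
fn split_tone(c: char) -> Option<(char, char)> {
    match c {
        '\u{1EA3}' => Some(('a', '\u{0309}')),        // ả -> a + hook above
        '\u{1EA1}' => Some(('a', '\u{0323}')),        // ạ -> a + dot below
        '\u{1EBF}' => Some(('\u{00EA}', '\u{0301}')), // ế -> ê + acute
        '\u{1EC7}' => Some(('\u{00EA}', '\u{0323}')), // ệ -> ê + dot below
        _ => None,
    }
}

/// Hypothetical adapter (not the crate's `DecomposeVietnamese`).
/// It buffers at most one pending combining mark between calls.
struct DetachTones<I: Iterator<Item = char>> {
    inner: I,
    pending: Option<char>,
}

impl<I: Iterator<Item = char>> Iterator for DetachTones<I> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // Emit a tone mark left over from the previous letter first.
        if let Some(mark) = self.pending.take() {
            return Some(mark);
        }
        let c = self.inner.next()?;
        match split_tone(c) {
            Some((base, mark)) => {
                self.pending = Some(mark);
                Some(base)
            }
            // Characters outside the table pass through unchanged.
            None => Some(c),
        }
    }
}

/// Hypothetical extension trait: adds `detach_tones` to every
/// iterator over `char` through a blanket implementation.
trait DetachTonesExt: Iterator<Item = char> + Sized {
    fn detach_tones(self) -> DetachTones<Self> {
        DetachTones { inner: self, pending: None }
    }
}

impl<I: Iterator<Item = char>> DetachTonesExt for I {}

/// Convenience wrapper for string input, assumed to be in NFC.
fn detone_str(s: &str) -> String {
    s.chars().detach_tones().collect()
}
```

With this sketch, `detone_str("Vi\u{1EC7}t")` produces the five chars V, i, ê, U+0323 (combining dot below), t. The circumflex stays attached to the base letter and only the tone mark is split off, which is what makes the result an orthographic unit plus tone rather than a full canonical decomposition.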
Structs§
- DecomposeVietnamese - An iterator adapter yielding char with tone marks detached.
Traits§
- IterDecomposeVietnamese - Trait that adds a decompose_vietnamese_tones method to iterators over char.