Yore
Rust library for decoding/encoding character sets according to OEM code pages.
Features
- Fast *
- Minimal memory usage due to `Cow` and `shrink_to_fit`
- Simple API
- Many supported code pages
- Supports code pages that redefine ASCII (< 0x80), for example '٪' in CP864
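
The memory claim in the list above rests on a common pattern: borrow the input when it needs no conversion and allocate only otherwise. Below is a minimal sketch of that idea using only the standard library. The function `decode_sketch` and the caller-supplied `high` table are hypothetical names for illustration and do not reflect yore's actual implementation.

```rust
use std::borrow::Cow;

/// Decode bytes of a toy single-byte code page (hypothetical, not yore's API).
/// `high` maps the bytes 0x80..=0xFF to characters; bytes below 0x80 are
/// treated as plain ASCII.
fn decode_sketch<'a>(bytes: &'a [u8], high: &[char; 128]) -> Cow<'a, str> {
    if bytes.is_ascii() {
        // Pure ASCII is already valid UTF-8, so borrow it without allocating.
        return Cow::Borrowed(std::str::from_utf8(bytes).expect("ASCII is valid UTF-8"));
    }
    // Reserve generously (3 UTF-8 bytes per input byte covers any BMP
    // character), then release the unused capacity once decoding is done.
    let mut out = String::with_capacity(bytes.len() * 3);
    for &b in bytes {
        if b < 0x80 {
            out.push(b as char);
        } else {
            out.push(high[(b - 0x80) as usize]);
        }
    }
    out.shrink_to_fit();
    Cow::Owned(out)
}
```

A code page that redefines bytes below 0x80, such as CP864, could not take the ASCII shortcut as written; it would need a lookup for the lower half as well.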
Usage
Add yore to Cargo.toml.

    [dependencies]
    yore = "0.3.0"
Examples
Use specific code page
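
    use yore::code_pages::{CP850, CP857};
    use yore::{DecodeError, EncodeError};

    // Vec contains ASCII "text"
    let bytes = vec![116, 101, 120, 116];
    // Vec contains ASCII "text " and codepoint 231
    let bytes_undefined = vec![116, 101, 120, 116, 32, 231];

    // Decoding CP850 cannot fail because every codepoint is defined
    assert_eq!(CP850.decode(&bytes), "text");

    // CP857 has undefined codepoints, so decoding returns a Result
    assert_eq!(CP857.decode(&bytes).unwrap(), "text");

    // "text " + codepoint 231, which is undefined in CP857
    assert!(matches!(CP857.decode(&bytes_undefined), Err(DecodeError { .. })));

    // Lossy decoding cannot fail because undefined codepoints get a fallback
    assert_eq!(CP857.decode_lossy(&bytes_undefined), "text \u{FFFD}");

    // Encoding
    assert_eq!(CP850.encode("text").unwrap(), bytes);
    assert!(matches!(CP850.encode("text 🦀"), Err(EncodeError { .. })));
    // Lossy encoding replaces unmappable characters with the given byte
    assert_eq!(CP850.encode_lossy("text 🦀", 231), bytes_undefined);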
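
To make the difference between strict and lossy conversion concrete, here is a self-contained sketch built on a toy code page that defines only ASCII plus byte 0x80. Everything in it (`Undefined`, `lookup`, `reverse`, and the three conversion functions) is hypothetical and written for illustration; it is not yore's implementation or its error type.

```rust
/// Error for a byte with no mapping: its offset and value.
/// Hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
struct Undefined {
    position: usize,
    value: u8,
}

/// Toy code page: ASCII maps to itself, 0x80 maps to 'Ç',
/// every other byte is undefined.
fn lookup(byte: u8) -> Option<char> {
    match byte {
        0x00..=0x7F => Some(byte as char),
        0x80 => Some('Ç'),
        _ => None,
    }
}

/// Inverse of `lookup` for encoding.
fn reverse(c: char) -> Option<u8> {
    match c {
        'Ç' => Some(0x80),
        c if c.is_ascii() => Some(c as u8),
        _ => None,
    }
}

/// Strict decoding stops at the first undefined byte and reports it.
fn decode_strict(bytes: &[u8]) -> Result<String, Undefined> {
    bytes
        .iter()
        .enumerate()
        .map(|(position, &value)| lookup(value).ok_or(Undefined { position, value }))
        .collect()
}

/// Lossy decoding substitutes U+FFFD for undefined bytes, so it cannot fail.
fn decode_lossy(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| lookup(b).unwrap_or('\u{FFFD}')).collect()
}

/// Lossy encoding substitutes `fallback` for unmappable characters.
fn encode_lossy(text: &str, fallback: u8) -> Vec<u8> {
    text.chars().map(|c| reverse(c).unwrap_or(fallback)).collect()
}
```

The strict variant suits data that must round-trip exactly, while the lossy variants trade fidelity for a conversion that always succeeds.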
Use trait object

    use yore::CodePage;
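
A trait object is useful when the code page is only known at runtime, for example when it comes from a file header or a command-line option. The sketch below shows the general pattern with a stand-in trait so that it compiles without any dependency. The `Decoder` trait, the `AsciiOnly` and `Latin1` types, and the `select` and `decode_with` functions are all hypothetical and are not part of yore.

```rust
/// Stand-in for a code page trait (hypothetical, not yore's `CodePage`).
trait Decoder {
    fn decode_lossy(&self, bytes: &[u8]) -> String;
}

struct AsciiOnly;
struct Latin1;

impl Decoder for AsciiOnly {
    // Non-ASCII bytes are replaced with U+FFFD.
    fn decode_lossy(&self, bytes: &[u8]) -> String {
        bytes
            .iter()
            .map(|&b| if b.is_ascii() { b as char } else { '\u{FFFD}' })
            .collect()
    }
}

impl Decoder for Latin1 {
    // ISO 8859-1 bytes coincide with the first 256 Unicode code points.
    fn decode_lossy(&self, bytes: &[u8]) -> String {
        bytes.iter().map(|&b| b as char).collect()
    }
}

/// Pick a decoder at runtime by name.
fn select(name: &str) -> Option<Box<dyn Decoder>> {
    match name {
        "ascii" => Some(Box::new(AsciiOnly)),
        "latin1" => Some(Box::new(Latin1)),
        _ => None,
    }
}

/// Works with any decoder through dynamic dispatch.
fn decode_with(decoder: &dyn Decoder, bytes: &[u8]) -> String {
    decoder.decode_lossy(bytes)
}
```

Functions that take `&dyn Decoder` stay independent of the concrete code page, at the cost of one virtual call per conversion.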
Supported code pages
| Identifier | Name | Description |
|---|---|---|
| 437 | IBM437 | OEM United States |
| 737 | ibm737 | OEM Greek (formerly 437G); Greek (DOS) |
| 775 | ibm775 | OEM Baltic; Baltic (DOS) |
| 850 | ibm850 | OEM Multilingual Latin 1; Western European (DOS) |
| 852 | ibm852 | OEM Latin 2; Central European (DOS) |
| 855 | IBM855 | OEM Cyrillic (primarily Russian) |
| 857 | ibm857 | OEM Turkish; Turkish (DOS) |
| 860 | IBM860 | OEM Portuguese; Portuguese (DOS) |
| 861 | ibm861 | OEM Icelandic; Icelandic (DOS) |
| 862 | DOS-862 | OEM Hebrew; Hebrew (DOS) |
| 863 | IBM863 | OEM French Canadian; French Canadian (DOS) |
| 864 | IBM864 | OEM Arabic; Arabic (864) |
| 865 | IBM865 | OEM Nordic; Nordic (DOS) |
| 866 | cp866 | OEM Russian; Cyrillic (DOS) |
| 869 | ibm869 | OEM Modern Greek; Greek, Modern (DOS) |
| 874 | windows-874 | Thai (Windows) |
| 1250 | windows-1250 | ANSI Central European; Central European (Windows) |
| 1251 | windows-1251 | ANSI Cyrillic; Cyrillic (Windows) |
| 1252 | windows-1252 | ANSI Latin 1; Western European (Windows) |
| 1253 | windows-1253 | ANSI Greek; Greek (Windows) |
| 1254 | windows-1254 | ANSI Turkish; Turkish (Windows) |
| 1255 | windows-1255 | ANSI Hebrew; Hebrew (Windows) |
| 1256 | windows-1256 | ANSI Arabic; Arabic (Windows) |
| 1257 | windows-1257 | ANSI Baltic; Baltic (Windows) |
| 1258 | windows-1258 | ANSI/OEM Vietnamese; Vietnamese (Windows) |
* Benchmarks
encoding_rs supports only a few of the encodings that oem_cp and yore support.
Furthermore, encoding_rs is focused on the streaming use case.
See the bench crate for details.