Little things that matter in language design
Posted Jun 14, 2013 0:22 UTC (Fri) by dvdeug (subscriber, #10998)
In reply to: Little things that matter in language design by wahern
Parent article: Little things that matter in language design
Grapheme indexing is not what everybody has been using for generations. Over 60 years of computing history, people working with scripts more complex than ASCII or Chinese have handled them in a number of ways, including character sets that explicitly encode combining characters (like ISO/IEC 6937) and the use of BS (backspace) with ASCII to overstrike a character like ^ onto the previous character.
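To make the combining-character point concrete, here is a rough Python sketch. The helper name count_base_characters is made up for illustration, and counting non-combining code points is only a crude stand-in for real grapheme segmentation (it ignores Hangul jamo, emoji sequences, and so on):

```python
import unicodedata

# "é" written two ways: one precomposed code point, or "e" followed by
# U+0301 COMBINING ACUTE ACCENT.
composed = "\u00e9"
decomposed = "e\u0301"

def count_base_characters(text):
    """Count code points that are not combining marks.

    Hypothetical helper: a crude approximation of a grapheme count,
    not a full implementation of Unicode text segmentation.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

# Code point indexing sees two items in the decomposed form...
print(len(composed), len(decomposed))
# ...while the base-character count sees one in both.
print(count_base_characters(composed), count_base_characters(decomposed))
# NFC normalization folds the decomposed form back into the composed one.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

So indexing by code point and indexing by what the user sees as a character give different answers even for a single accented letter.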
UTF-8 is so popular because for many purposes it's a quarter the size of UTF-32, and for normal data never worse than 3/4 the size. And as long as you're working with ASCII, you can generally ignore the differences. If people want UTF-32, it's easy to find.
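The size ratios are easy to check. A minimal Python sketch, where encoded_sizes is a made-up helper name and "utf-32-le" is used so that no byte order mark gets counted:

```python
def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-32 byte count) for text.

    Hypothetical helper for illustration; UTF-32 is encoded as
    little-endian without a BOM so only the characters are counted.
    """
    return len(text.encode("utf-8")), len(text.encode("utf-32-le"))

# ASCII: 1 byte per character in UTF-8 against 4 in UTF-32.
print(encoded_sizes("hello"))    # (5, 20)
# Japanese text in the BMP: 3 bytes per character against 4.
print(encoded_sizes("日本語"))    # (9, 12)
```

Characters outside the Basic Multilingual Plane take 4 bytes in both encodings, which I take to be why the 3/4 figure is qualified with "normal data".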