Thank you, Joe. I haven't had a chance to check the link. I was more familiar with UCS-2 than with either UTF-8 or UCS-4.

"UTF-8 was designed to transport UCS data from one machine to another with the ability to resync after dropping a character. In order to do that, some fancy bit-packing is done..."

Ah, Rube Goldberg must have expanded the scope of the project to include data compression/validation, it appears... ;-D (A quick sketch of that bit-packing follows the quoted message below.)

"Basic Multilingual Plane..."

Is this the same as "Base64" in XML-speak?

"UTF-16. ... This encoding gets REALLY bizarre REALLY quick, because there are issues of byte order."

Luckily, a 400 is both big-endian and little-endian (unless this is a different issue)...!

"It gets VERY complicated, but the semi-short version is that Unicode supports up to 1.1 million code points."

This goes back to my original question, which is: is there a NEED for 1.1 million code points? I'm somewhat familiar with Katakana, ancient hieroglyphs of various kinds, and such, but I don't see where these would require anything close to 1.1 million "code points." (Not saying there isn't a need; I just don't see what it would be.) Is "code point" not synonymous with character/glyph representation? Is the problem with glyph representations?

| -----Original Message-----
| [mailto:rpg400-l-bounces@xxxxxxxxxxxx]On Behalf Of Joe Pluta
: :
| For a little more on the encoding/decoding (although from a Mac
| standpoint), read here:
|
| http://www1.tip.nl/~t876506/utf8tbl.html
|
| Joe
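
To make the numbers above concrete, here is a minimal sketch in Python (not RPG, and not taken from Joe's link; utf8_encode is a made-up name for illustration) of the UTF-8 bit-packing, the resync trick, the ceiling behind "1.1 million code points," and the byte-order wrinkle that only affects UTF-16/UCS-2:

# Minimal illustrative sketch; utf8_encode is a hypothetical helper,
# not a standard API.

def utf8_encode(cp: int) -> bytes:
    """Hand-roll the UTF-8 bit-packing for one code point."""
    if cp < 0x80:                 # 1 byte:  0xxxxxxx (plain ASCII)
        return bytes([cp])
    if cp < 0x800:                # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | cp >> 6, 0x80 | cp & 0x3F])
    if cp < 0x10000:              # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | cp >> 12,
                      0x80 | cp >> 6 & 0x3F,
                      0x80 | cp & 0x3F])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | cp >> 18,
                  0x80 | cp >> 12 & 0x3F,
                  0x80 | cp >> 6 & 0x3F,
                  0x80 | cp & 0x3F])

# Resync: every continuation byte starts with the bits 10, so a decoder
# that lands mid-character just skips ahead to the next lead byte.
assert utf8_encode(0x4E9C) == '\u4e9c'.encode('utf-8')  # U+4E9C -> E4 BA 9C

# "Up to 1.1 million": code points run U+0000 through U+10FFFF.
print(0x10FFFF + 1)   # 1114112

# Byte order only bites UTF-16/UCS-2; UTF-8 has no byte-order issue:
print('\u4e9c'.encode('utf-16-be').hex())  # 4e9c
print('\u4e9c'.encode('utf-16-le').hex())  # 9c4e

And on the last question: a "code point" is just the abstract number assigned to a character, while a glyph is the shape a font draws for it, and the two aren't one-to-one (combining accents, for instance, let several code points render as a single glyph). The 1.1 million figure is the size of the numbering space, not a count of characters anyone has actually catalogued.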