



http://systeminetwork.com/article/unicode-and-system-i-where-do-i-go-and-how-do-i-get-there

What Is Unicode?
Unicode is the encoding standard replacing both ASCII and EBCDIC. Unicode defines a comprehensive set of unique numbers (code points) that represent (or map to) particular characters from all well-known written languages. Unicode supports languages such as English, Greek, German, Chinese, Japanese, and Korean. The unique numbers are the same regardless of the software platform or operating system used. Because Unicode includes all language characters, one field can contain characters from many different languages.
To help explain the differences between EBCDIC, ASCII, and Unicode, here is a small example. In EBCDIC code page 37 (commonly used in the U.S.), the dollar sign is assigned the hexadecimal value 5B, and the value 4A is the cent sign (the pound sterling sign sits at B1). By contrast, in EBCDIC code page 285 (commonly used in the U.K.), the dollar sign is assigned the hexadecimal value 4A, and the hexadecimal value 5B maps to the pound sterling sign. In the ASCII-based code page 819 (ISO 8859-1, commonly used in the U.S. and the U.K.), hexadecimal 5B maps to the open square bracket, and hexadecimal 4A maps to the capital letter J. This means that if you see the hexadecimal data 4A5B without knowing its code page, you cannot tell whether it represents the string ¢$, $£, or J[.
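The ambiguity can be reproduced in a few lines of Python. This is only a sketch: `cp037` and `latin-1` (ISO 8859-1, which corresponds to code page 819) are codecs that ship with Python, but Python has no built-in codec for code page 285, so the `CP285_SUBSET` table below is a hypothetical stand-in covering only the two bytes discussed here. Note that Python's `cp037` codec decodes 4A as the cent sign (¢).

```python
# Decode the same two bytes under three single-byte code pages.
raw = bytes.fromhex("4A5B")

# Both of these codecs ship with Python.
as_cp037 = raw.decode("cp037")    # EBCDIC, U.S.
as_cp819 = raw.decode("latin-1")  # ISO 8859-1, i.e. code page 819

# Assumption: Python has no codec for code page 285, so this two-entry
# table is a hand-written stand-in for just the bytes in the example.
CP285_SUBSET = {0x4A: "$", 0x5B: "£"}
as_cp285 = "".join(CP285_SUBSET[b] for b in raw)

for name, text in [("cp037", as_cp037), ("cp285", as_cp285), ("cp819", as_cp819)]:
    print(f"{name}: {text}")
```

The bytes never change; only the code page used to interpret them does, which is why the code page has to travel with the data.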
This simple example shows the basic encoding problem. To further complicate the issue, several more encodings are used for Chinese, Russian, and other languages. As a programmer, all you really want to know is what hexadecimal value you're supposed to use!
In contrast to the example above, each of these characters has exactly one value in Unicode. In UTF-16 they are '0024'x for $, '004A'x for J, '005B'x for [, and '00A3'x for £. Because every character has its own hexadecimal value, your applications do not have to know which code page the data originally came from.
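The UTF-16 values quoted here can be checked with a short Python sketch. The big-endian form (`utf-16-be`) is chosen so that the bytes print in the same order as the values in the text; the variable names are illustrative only.

```python
# Show the Unicode code point and UTF-16 (big-endian) bytes for each character.
chars = ["$", "J", "[", "£"]

utf16_hex = {ch: ch.encode("utf-16-be").hex().upper() for ch in chars}

for ch in chars:
    print(f"{ch}  U+{ord(ch):04X}  UTF-16BE: {utf16_hex[ch]}")

# Decoding needs no code page: the bytes 004A 005B can only mean "J[".
decoded = bytes.fromhex("004A005B").decode("utf-16-be")
print(decoded)
```

For these four characters the UTF-16 value is simply the code point written as four hexadecimal digits.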


This mailing list archive is Copyright 1997-2025 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
