On Wed, Oct 12, 2016 at 6:33 PM, Kevin Adler <kadler@xxxxxxxxxx> wrote:
While UTF-32 can encode all 1+ million Unicode code points in one code unit
(4 bytes), you have to be careful not to conflate a Unicode code point
with a "character." A character (in the abstract sense) may be made up of
multiple Unicode code points, which may further be encoded in multiple
code units.
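
To make the distinction above concrete, here is a minimal Python sketch that counts code points and code units for a few sample strings. The helper name count_units and the sample strings are illustrative assumptions, not anything from the original thread; the encoding and normalization calls are from the Python standard library.

```python
import unicodedata


def count_units(text):
    """Return how many code points and code units `text` occupies.

    `count_units` is a hypothetical helper written for this illustration.
    """
    return {
        "code_points": len(text),  # a Python str is a sequence of code points
        "utf8": len(text.encode("utf-8")),            # 1-byte code units
        "utf16": len(text.encode("utf-16-le")) // 2,  # 2-byte code units
        "utf32": len(text.encode("utf-32-le")) // 4,  # 4-byte code units
    }


# One abstract character ("e" with an acute accent) written as two code
# points: LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT.
decomposed = "e\u0301"
print(count_units(decomposed))

# The same abstract character as a single precomposed code point (U+00E9).
composed = unicodedata.normalize("NFC", decomposed)
print(count_units(composed))

# A single code point outside the Basic Multilingual Plane (U+1F600).
# It fits in one UTF-32 code unit but needs two UTF-16 code units.
print(count_units("\U0001F600"))
```

In UTF-32 the code unit count always equals the code point count, but neither number says how many characters a reader would perceive: the decomposed and composed strings display as the same single character while holding two code points and one code point respectively.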
There comes a point when further details are not very productive. If
you give people too much information at once, they can't absorb the
key points. We're at a stage where not enough people understand
Unicode even to a first approximation. If we can get to a place where
a critical mass of programmers subscribes to the misconception that a
Unicode code point is tantamount to a conceptual character, that will
already be significant progress from where we are now. And when we're
there, THEN we can refine the picture. To be quite frank, right now
most people are simply not ready.
Joel Spolsky tends to do a good job of finding a balance between
accessibility and rigorousness, ruthlessly simplifying (even
oversimplifying) when it's more important to drive home the first
approximation than to be absolutely correct. I highly recommend his
primer on Unicode:
http://www.joelonsoftware.com/articles/Unicode.html
John Y.