On Fri, Aug 14, 2015 at 12:52 PM, Vern Hamberg <vhamberg@xxxxxxxxxxxxxxx> wrote:
No - it displays the actual bytes in hex, as I said.
But that's nonsense. There's no such thing as "actual bytes" of
Unicode. This is why I was asking what was meant by "Unicode".
In the very earliest days of Unicode, it was thought that 16 bits
would be sufficient to represent every character that anyone would
want to represent. Back then, there was an easy, one-to-one
correspondence between the idealized, conceptual "Unicode code points"
and the most obvious 2-byte concrete representation (encoding) for
them, which was known then as UCS-2.
So people were lazy with their terminology and used "Unicode"
interchangeably to mean either the idealized concept or the concrete
representation.
But about 20 years ago (yes, it's been that long), the Unicode
standard was updated to be much more robust, effectively becoming a
21-bit concept: code points now run all the way up to U+10FFFF, which
is far more than 16 bits can hold. Ever since then, there has been a
very sharp and clear divide between conceptual code points and
concrete encodings.
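To make that distinction concrete, here is a quick Python sketch
(purely illustrative, nothing RDi-specific) showing that the same
code point turns into different "actual bytes" depending on which
encoding you choose:

    # One conceptual code point: U+20AC, the euro sign.
    ch = "\u20ac"
    print(f"Code point: U+{ord(ch):04X}")

    # The "actual bytes" depend entirely on the chosen encoding.
    for encoding in ("utf-8", "utf-16-be", "utf-32-be"):
        print(f"{encoding:>10}: {ch.encode(encoding).hex(' ')}")

That's one code point but three different byte sequences (e2 82 ac,
20 ac, and 00 00 20 ac), so "the actual bytes in hex" only means
something once you say which encoding you're talking about.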
I had hoped that the folks responsible for RDi had gotten that
distinction clear, so as not to perpetuate hazy or incorrect
understanding of Unicode. It's even worse than calling a POWER8
machine running IBM i an "AS/400".
John Y.