Re: Problem copying PC document to DDS file with accentuated characters -- MIDRANGE-L


On Wed, Mar 31, 2021 at 10:54 AM Vern Hamberg <vhamberg@xxxxxxxxxxxxxxx> wrote:

> Anyone confirm that some machine code stuff is done
> better with little-endian?

It mainly comes down to choices made at the hardware level. Compared
to other architectural design features, endianness is small potatoes:

https://softwareengineering.stackexchange.com/questions/95556/what-is-the-advantage-of-little-endian-format

> Now it is not required to use a BOM. I suppose one can identify which
> flavor of UTF-16 you have by checking whether the nulls fall in odd
> or even positions (counting bytes from 1): even is LE, odd is BE. I'm
> not sure how anyone else, such as Notepad++, does that.
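
The odd/even null check described above can be sketched in Python. This is only a rough illustration, not how any particular editor does it: the function name `guess_utf16_byte_order` is hypothetical, and the heuristic assumes the text is mostly in the ASCII/Latin-1 range, so that one byte of each 16-bit unit is 00.

```python
import codecs
from typing import Optional


def guess_utf16_byte_order(data: bytes) -> Optional[str]:
    """Return 'utf-16-le', 'utf-16-be', or None when there is no clear signal."""
    # A BOM, when present, settles the question.
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"

    # No BOM: count NUL bytes by position, counting bytes from 1.
    # ASCII-range text in LE is stored as (char, 00), so NULs land on
    # even positions; in BE it is (00, char), so they land on odd ones.
    nul_odd = data[0::2].count(0)   # positions 1, 3, 5, ...
    nul_even = data[1::2].count(0)  # positions 2, 4, 6, ...

    if nul_even > nul_odd:
        return "utf-16-le"
    if nul_odd > nul_even:
        return "utf-16-be"
    return None  # no NULs, or a tie: this heuristic cannot decide
```

Text that is mostly outside the Latin-1 range (CJK, for example) contains few 00 bytes, so the function returns None or a weak guess there; a real detector would combine several such signals.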

Same as anything else. You guess. You look for tell-tale signs that
have a high probability of indicating one encoding but not another.
The process you describe below is pretty much how all encoding
detectors work:

> But UTF-8 is another creature. It doesn't have endian flavors but does
> have a BOM, EF BB BF. It also is not required, and if it's absent, you
> have a real guessing game on your hands. On the i, what I did was try to
> copy the text file to one with CCSID 1208; if successful, I considered
> the contents to be UTF-8. Not great but mostly useful. There ARE certain
> byte sequences that, I will say, almost certainly mean the
> contents are UTF-8.
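
Off the IBM i, the same "try to convert it and see if it fails" test can be approximated with a strict UTF-8 decode. The sketch below is an illustration only; the function name `classify_utf8` and its return labels are made up for this example. It also separates pure 7-bit ASCII, which decodes cleanly as UTF-8 but proves nothing about the intended encoding.

```python
import codecs


def classify_utf8(data: bytes) -> str:
    """Classify bytes as 'not utf-8', 'utf-8 with BOM', 'ascii only', or 'utf-8'."""
    try:
        # Strict decoding rejects any byte sequence that is not well-formed UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"

    if data.startswith(codecs.BOM_UTF8):  # EF BB BF
        return "utf-8 with BOM"
    if data.isascii():
        # Valid UTF-8, but equally valid in most single-byte ASCII-based code pages.
        return "ascii only"
    # Well-formed multi-byte sequences are the strong hint: text in a
    # single-byte code page rarely happens to form them by accident.
    return "utf-8"
```

For example, `"café".encode("latin-1")` ends in a lone E9 byte, which is not well-formed UTF-8, so the function reports "not utf-8", while the UTF-8 encoding of the same word is reported as "utf-8".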

> Patrik, by the flag in metadata, do you mean the CCSID or the code page?

He probably meant CCSID, but to be honest, there isn't enough of a
difference between those two concepts for it to matter to most people.

John Y.

This mailing list archive is Copyright 1997-2026 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page.
