On Tue, Jun 19, 2018 at 6:42 PM, Buck Calabro <kc2hiz@xxxxxxxxx> wrote:
> On Tue, 19 Jun 2018 at 17:19, John Yeung <gallium.arsenide@xxxxxxxxx> wrote:
>> When you're inspecting a file using WRKLNK, option 5 (or,
>> equivalently, using DSPF directly on the file), what you get is a
>> character (not hex) display. In this mode, you CANNOT BE SURE what the
>> hex codes are.
>>
>> You ***MUST*** press F10 to get the hex display. I don't care if you
>> don't see the BOM in character mode. That doesn't matter. Press F10.
>> The extra bytes will be there. EFBBBF. No matter what the CCSID is,
>> those bytes will be there at the beginning. Those three bytes are the
>> BOM for UTF-8, and CHGATR has no effect whatsoever on the bytes.
>
> I don't want to speak for John, but I'm sure I missed out on why his
> advice is, in general, useful and important.

I don't think my advice is *generally* useful and important. But it's
pretty darn close to critical for the times when you really need to be
sure of the exact bytes in a stream file, and it seems that this is
often the case when trying to diagnose encoding issues.

> Likewise, it's /possible/ (but I haven't myself
> tested it) that DSPF, WRKLNK and friends will try to do the conversion
> for you as IBM i tries to display the text from the IFS file onto your
> display screen.

This absolutely happens. If you create a stream file with UTF-8
content, including BOM, and set its CCSID to 1208, then DSPF will not
show you the BOM. It will do you the "favor" of hiding it. Let's say
your stream file has the following content:

3 bytes for BOM, followed by the three letters 'foo', followed by
Windows-style newline. (You can create this with Notepad, for example.
You have to be careful to save it as UTF-8, and if you have to FTP it
to your i, be sure to force binary mode!)
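If Notepad is not handy, the same eight bytes can be produced with a few lines of Python. This is only a minimal sketch: the helper name make_sample_bytes and the temporary file location are made up for illustration, and getting the file onto the IFS (still in binary mode) is left out.

```python
import codecs
import os
import tempfile

# Hypothetical helper: builds the sample content described above
# (UTF-8 BOM, the letters 'foo', then a Windows-style CRLF newline).
def make_sample_bytes() -> bytes:
    return codecs.BOM_UTF8 + "foo\r\n".encode("utf-8")

sample = make_sample_bytes()

# Write in binary mode so nothing translates the BOM or the newline.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")
with open(path, "wb") as f:
    f.write(sample)

# Read it back, again in binary mode, and count the bytes.
with open(path, "rb") as f:
    on_disk = f.read()

print(len(on_disk))  # 8
```

Opening the file with "wb" and "rb" plays the same role as forcing binary mode in FTP: the bytes go through untouched.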

This file is 8 bytes long. If it has the proper CCSID of 1208, DSPF
will show you three visible characters, and the 'f' will be
left-aligned, as though it's the very first byte in the file.

But then press F10. Lo! You will see 16 hex characters, corresponding
to 8 bytes.
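For comparison, here is a small Python sketch of what a hex view of those same 8 bytes contains. It does not reproduce the DSPF screen layout; it only shows the hex digits you should expect to find there.

```python
import codecs

# The sample content: UTF-8 BOM, 'foo', CRLF.
data = codecs.BOM_UTF8 + b"foo\r\n"

# Two hex characters per byte.
hex_text = data.hex().upper()

print(hex_text)                   # EFBBBF666F6F0D0A
print(len(data), len(hex_text))   # 8 16
```

The leading EFBBBF is the BOM, 666F6F is 'foo', and 0D0A is the carriage return and line feed.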

If you use CHGATR to set the CCSID to, say, 1252, then the system
won't know to interpret the BOM as a BOM, and it will instead do its
best to actually render those bytes as characters, thus showing the
six visible characters 'ï»¿foo'. But once again, press F10 and the
bytes will be identical to what they were under 1208. Or indeed any
CCSID, because CHGATR has no effect whatsoever on actual bytes.
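The same effect can be imitated away from the i. In the Python sketch below, decoding with "utf-8-sig" stands in for a viewer that believes the file is UTF-8 and hides the BOM, while decoding with "cp1252" stands in for a viewer that believes the file is Windows-1252. That mapping onto CCSIDs 1208 and 1252 is an analogy I am assuming for illustration, not how DSPF is implemented.

```python
import codecs

# The sample content: UTF-8 BOM, 'foo', CRLF.
raw = codecs.BOM_UTF8 + b"foo\r\n"

# Interpreted as UTF-8 with BOM handling: the BOM is consumed silently.
as_utf8 = raw.decode("utf-8-sig")

# Interpreted as Windows-1252: each BOM byte becomes a visible character.
as_1252 = raw.decode("cp1252")

print(len(as_utf8.strip()))   # 3 visible characters
print(len(as_1252.strip()))   # 6 visible characters

# Neither interpretation touched the bytes themselves.
print(raw.hex().upper())      # EFBBBF666F6F0D0A
```

Only the interpretation changes between the two decodes; the variable raw is never modified, which is the same point as CHGATR changing the tag but not the data.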

I urge people to try this for themselves. Until you see it with your
very own eyes, you won't really understand that you CANNOT know what
the bytes are unless you are using hex mode. I know I was astonished.

John Y.

