Hello Greg,
Comments in-line:
On 8/11/2020 11:22 AM, Greg Wilburn wrote:
Thanks for the reply. For whatever reason I struggle with the whole encoding thing.
I guess I (incorrectly) assumed the base64 encoded element was UTF-8 because the XML document was UTF-8?
The purpose of base64 is to ensure the integrity of the underlying
binary value. To put it another way: Bytes go in (encoding), and the
exact same bytes come out (decoding).
For example, if I take the ISO-8859-1 string "Scott", it has a hex value
of 53 63 6f 74 74
If I base64 encode it, it will result in U2NvdHQ=
It doesn't matter whether the U2NvdHQ= is encoded as UTF-8, or ASCII, or
EBCDIC; it doesn't matter if it's single byte or double byte or any of
1000 other encodings. When it is decoded, it will go right back to the
same hex string of 53 63 6f 74 74
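
Here is a small Python sketch of that round trip (Python only because it is
easy to run anywhere; the helper name transport_and_decode and the list of
codecs are my own choices for illustration, with cp037 standing in for
"some flavor of EBCDIC"):

```python
import base64

# The ISO-8859-1 string "Scott" as raw bytes.
original = "Scott".encode("iso-8859-1")
print(original.hex(" "))   # 53 63 6f 74 74

# base64 turns those bytes into plain text.
b64_text = base64.b64encode(original).decode("ascii")
print(b64_text)            # U2NvdHQ=


def transport_and_decode(text, codec):
    """Store the base64 text in `codec`, read it back, then base64-decode it."""
    wire_bytes = text.encode(codec)           # how the text sits in a file or message
    return base64.b64decode(wire_bytes.decode(codec))


# The base64 text can travel in any character encoding, including EBCDIC.
results = {c: transport_and_decode(b64_text, c)
           for c in ("ascii", "utf-8", "utf-16", "cp037")}

for codec, decoded in results.items():
    print(codec, decoded.hex(" "))            # same five bytes every time
```

Whatever encoding carries the base64 text, the decoded bytes are identical
to the ones that went in.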
That's the purpose of base64 -- to allow data to retain the exact same
binary value, even if it is transferred over a text medium. The most
common purpose in the early days was to transfer photos in e-mail. In
an image file like a photo, the byte values don't represent letters or
text, they represent stuff like colors and pixels to be drawn on the
screen. If you translated those byte values from (for example) ASCII to
EBCDIC, the picture would become completely corrupt. Since e-mail is a
text medium, it was not safe to send pictures through e-mail until
base64 encoding made it possible. (Actually, there was an earlier
system called uuencoding that was used prior to base64, but is mostly
replaced by base64... but, you get the idea.)
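
To make the corruption concrete, here is a hedged Python sketch. It uses the
eight-byte PNG file signature as a stand-in for "a picture", and Python's
cp037 codec as a stand-in for an ASCII-to-EBCDIC translation; both are
assumptions chosen only for illustration:

```python
import base64

# The first eight bytes of a PNG file. They are not text.
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# What a text translation does: treat each byte as a character and
# re-encode it in EBCDIC. The byte values change, so the image is ruined.
translated = png_signature.decode("iso-8859-1").encode("cp037")
print(png_signature.hex(" "))
print(translated.hex(" "))

# The base64 route: encode to text first, let the *text* be translated to
# EBCDIC and back, then decode. The original bytes survive.
b64_text = base64.b64encode(png_signature).decode("ascii")
ebcdic_copy = b64_text.encode("cp037")
recovered = base64.b64decode(ebcdic_copy.decode("cp037"))
print(recovered == png_signature)
```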
So, yes... your XML document was UTF-8 (until you translated to EBCDIC,
anyway!) But that means the XML tags like <xxxxxx> were in UTF-8. The
encoded data (like my "U2NvdHQ=" example) was written in UTF-8
characters, but once decoded, it'll have the exact same byte values that
were input by whoever encoded it.
I took your (original) advice and removed the CCSID from my RPG variable that receives the base64_decode.
Then I changed my open() API to use CCSID = 0
I agree with removing the CCSID from the RPG variable. I don't agree
with using 0 with open().
fd = open(Stmf:flag:mode:0:0);
rc = write(fd: %addr(base64decoded): %len(%trim(base64decoded)));
So CCSID = 0 means "My job's flavor of EBCDIC". That is not right
unless whoever encoded the data was using the same flavor of EBCDIC
that you are, which is almost certainly not the case. Also, how these
CCSIDs are used will depend on what you have in the 'flag' parameter,
which you haven't shown us.
What you need to do is find out what the actual encoding of the ZPL
label was before it was base64 encoded. Then, you need to call open()
and tell it the CCSID that corresponds to that encoding. Do NOT use
O_TEXTDATA on the open() call: the data is already in that encoding,
so you don't want open() to translate it.
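
As a rough analogy only (Python's os.open() has no CCSID parameter and no
O_TEXTDATA flag, so this is not the IBM i call), the idea of "write the
decoded bytes exactly as they are" looks like this. The file name label.zpl
and the sample data are hypothetical:

```python
import base64
import os
import tempfile

# Bytes exactly as they come out of the base64 decoder.
decoded = base64.b64decode("U2NvdHQ=")

# Hypothetical output file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "label.zpl")

# Open for binary output: no text translation of any kind is requested.
# O_BINARY only exists on Windows, so fall back to 0 elsewhere.
flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
fd = os.open(path, flags, 0o644)
try:
    os.write(fd, decoded)      # write every decoded byte, untouched
finally:
    os.close(fd)

# Read the file back in binary mode to confirm nothing was changed.
with open(path, "rb") as f:
    on_disk = f.read()

print(on_disk == decoded)
```

The file ends up holding the same bytes the sender encoded. The CCSID you
give open() on IBM i is then just a label saying what those bytes mean.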
The CCSID of the IFS file is 37. I can view it using Notepad++ and WRKLNK (although this does indicate: Message . . . . : File CCSID not valid.
Cause . . . . . : The file Coded Character Set Identifier (CCSID) was 00037,
but the data in the file looks like ASCII. A CCSID of 00819 is being used.
Recovery . . . : If another CCSID is needed, use F15 to change to the
desired CCSID. )
So what's happening here is that you've obviously given it the wrong
CCSID, so it's guessing at CCSID 819 instead. CCSID 819 is ISO-8859-1.
But, this is a guess... just like UTF-8 was a guess... Rather than
take a guess at how it's encoded, ask whoever encoded it!
I'm not sure how would I "know" what encoding the string is once I have decoded it using base64_decode()?
Would this be indicated in the WSDL for the SOAP web service?
The XML response indicates <?xml version="1.0" encoding="utf-8"?>
Asking whoever is encoding the document is the only way I know to find
out the proper encoding.
The most commonplace ones would be ISO-8859-1, Windows-1252 or UTF-8.
Though, of course, we've already ruled out UTF-8. There's no point in
guessing... ask.
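
To show why guessing is unreliable, here is a small Python sketch. The
sample bytes are made up for illustration (they are not from your label);
the point is that the same bytes read differently under each candidate
encoding, and nothing in the bytes says which reading is correct:

```python
# Hypothetical decoded bytes: the letters "ZPL" wrapped in 0x93 and 0x94.
sample = b"\x93ZPL\x94"

readings = {}
for codec in ("iso-8859-1", "cp1252", "utf-8"):
    try:
        readings[codec] = sample.decode(codec)
    except UnicodeDecodeError:
        readings[codec] = None     # these bytes are not valid in this encoding

for codec, text in readings.items():
    print(codec, repr(text))

# iso-8859-1 : 0x93 and 0x94 become invisible control characters
# cp1252     : 0x93 and 0x94 become curly quotation marks
# utf-8      : the bytes are not valid UTF-8, so decoding fails
```

ISO-8859-1 will accept any byte sequence without complaint, so "it decoded
without an error" proves nothing.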