RE: Base64 decode -- RPG400-L

Scott,

Thanks... this is working great. I am using the length returned by base64_decode (I discovered that error in my testing this morning).

The filename contains a timestamp at the end, so no worries there.

Thanks to you and everyone else that helped! Hopefully I can retain some of this information.

Greg

-----Original Message-----
From: RPG400-L [mailto:rpg400-l-bounces@xxxxxxxxxxxxxxxxxx] On Behalf Of Scott Klement
Sent: Wednesday, August 12, 2020 10:52 AM
To: rpg400-l@xxxxxxxxxxxxxxxxxx
Subject: Re: Base64 decode

Hi Greg,

I would avoid using %LEN(%TRIM()). I would not assume the data is
EBCDIC text, and that means never using %TRIM on it.

The base64_decode routine returns the length, so there's no need to
re-calculate the length later, just use the one it returned.
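
To see why trimming is risky on decoded data, here is a minimal sketch. It is in Python rather than RPG purely for illustration, and the payload, buffer size, and variable names are all made up for the example. It imitates a fixed-length field padded with EBCDIC blanks (x'40') and compares the length the decoder reports with the length you'd get by trimming. Note that x'40' is also the "@" character in ASCII/UTF-8, so real data can legitimately end with that byte.

```python
import base64

BUFFER_SIZE = 32          # hypothetical size, stands in for char(65535)
EBCDIC_BLANK = b"\x40"    # EBCDIC blank; the same byte is "@" in ASCII/UTF-8

# Hypothetical decoded payload whose last byte happens to be 0x40 ("@").
payload = b"user@"
decoded = base64.b64decode(base64.b64encode(payload))

# The length the decoder reports is the true length of the data.
returned_len = len(decoded)

# Imitate a fixed-length RPG field: the data followed by EBCDIC blanks.
buffer = decoded.ljust(BUFFER_SIZE, EBCDIC_BLANK)

# Imitate the trailing-blank removal that a trim would perform.
trimmed_len = len(buffer.rstrip(EBCDIC_BLANK))

print(returned_len, trimmed_len)
print(buffer[:returned_len] == payload)   # slicing by the returned length
print(buffer[:trimmed_len] == payload)    # slicing by the trimmed length
```

In this sketch the trimmed length comes out one byte short, because the trim cannot tell padding from data. A real %TRIM would also strip leading blanks, which this example does not model.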

The open flags look okay. You'll want to make sure there's no old copy
of the file in that location, to be sure it doesn't pick up any old
settings.

-SK

On 8/11/20 4:26 PM, Greg Wilburn wrote:

Scott,

Thanks... The vendor did come back in the last few minutes and indicated that the data is UTF-8. I'm not sure I completely trust that, but that's what he said.

I was using a service program (I created for other things) to write this file, so that was a mistake... and you're right, I don't completely understand these APIs. I get the purpose of Base64 encoding... where I'm lost is writing to the IFS.

Here is what my service program was doing:
flag = o_wronly + o_creat + o_ccsid + o_excl +
       o_textdata + o_text_creat;
mode = s_irusr + s_iwusr + s_iroth + s_iwoth;
fd = open(inStmf:flag:mode:inCCSID:0);

So this was likely my issue.

So I'm thinking this code may provide better results... please correct me if I'm wrong.
1. No CCSIDs on any RPG variables
2. Write using CCSID 1208 without using O_TEXTDATA and O_TEXT_CREAT flags.

dcl-s base64encoded varchar(65535) inz;
dcl-s base64decoded char(65535) inz;
dcl-s label varchar(65535) inz;
dcl-s len uns(10) inz;

base64Encoded = xml.base64label;
len = base64_decode( %addr(base64Encoded:*data)
                   : %len(%trim(base64Encoded))
                   : %addr(base64Decoded)
                   : %size(base64Decoded)
                   );

label = %subst(base64decoded:1:len);
Stmf = '/gwilburn/myxmllabel.txt';
flag = o_wronly + o_creat + o_ccsid + o_excl ;
mode = s_irusr + s_iwusr + s_iroth + s_iwoth;
fd = open(Stmf:flag:mode:1208:0);
rc = write(fd: %addr(base64decoded): %len(%trim(base64decoded)));
rc = close(fd);

-----Original Message-----
From: RPG400-L [mailto:rpg400-l-bounces@xxxxxxxxxxxxxxxxxx] On Behalf Of Scott Klement
Sent: Tuesday, August 11, 2020 4:42 PM
To: rpg400-l@xxxxxxxxxxxxxxxxxx
Subject: Re: Base64 decode

Hello Greg,

Comments in-line:

On 8/11/2020 11:22 AM, Greg Wilburn wrote:

Thanks for the reply. For whatever reason I struggle with the whole encoding thing.
I guess I (incorrectly) assumed the base64 encoded element was UTF-8 because the XML document was UTF-8?

The purpose of base64 is to ensure the integrity of the underlying
binary value. To put it another way: Bytes go in (encoding), and the
exact same bytes come out (decoding).

For example, if I take the ISO-8859-1 string "Scott", it has a hex value
of 53 63 6f 74 74

If I base64 encode it, it will result in U2NvdHQ=

It doesn't matter whether the U2NvdHQ= is encoded as UTF-8, or ASCII, or
EBCDIC, it doesn't matter if it's single byte or double byte or any of
1000 other encodings. When it is decoded, it will go right back to the
same hex string of 53 63 6f 74 74
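
A quick way to check that claim is the sketch below. It uses Python rather than RPG just for illustration, with Python's cp037 codec standing in for "some flavor of EBCDIC"; the variable names are made up. It carries the base64 text in both UTF-8 and EBCDIC and decodes each copy.

```python
import base64

# The ISO-8859-1 bytes for "Scott": 53 63 6f 74 74
original = bytes.fromhex("53636f7474")

# Bytes go in, base64 text comes out.
encoded_text = base64.b64encode(original).decode("ascii")
print(encoded_text)  # U2NvdHQ=

# Carry that same base64 text in two different character encodings.
as_utf8 = encoded_text.encode("utf-8")
as_ebcdic = encoded_text.encode("cp037")
print(as_utf8 != as_ebcdic)  # the bytes on the wire differ

# Read each copy back as text in its own encoding, then base64-decode it.
decoded_from_utf8 = base64.b64decode(as_utf8.decode("utf-8"))
decoded_from_ebcdic = base64.b64decode(as_ebcdic.decode("cp037"))

# Both give back the exact bytes that were originally encoded.
print(decoded_from_utf8.hex(" "))
print(decoded_from_ebcdic.hex(" "))
```

The only requirement is that the base64 text itself is read in the encoding it was carried in. The decoded bytes are never translated.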

That's the purpose of base64 -- to allow data to retain the exact same
binary value, even if it is transferred over a text medium. The most
common purpose in the early days was to transfer photos in e-mail. In
an image file like a photo, the byte values don't represent letters or
text, they represent stuff like colors and pixels to be drawn on the
screen. If you translated those byte values from (for example) ASCII to
EBCDIC, the picture would become completely corrupt. Since e-mail is a
text medium, it was not safe to send pictures through e-mail until
base64 encoding made it possible. (Actually, there was an earlier
system called uuencoding that was used prior to base64, but it has mostly
been replaced by base64... but, you get the idea.)

So, yes... your XML document was UTF-8 (until you translated to EBCDIC,
anyway!) But that means the XML tags like <xxxxxx> were in UTF-8. The
encoded data (like my "U2NvdHQ=" example) was encoded with UTF-8
characters, but once decoded, it'll have the exact same byte values that
were input by whoever encoded it.

I took your (original) advice and removed the CCSID from my RPG variable that receives the base64_decode.
Then I changed my open() API to use CCSID = 0

I agree with removing the CCSID from the RPG variable. I don't agree
with using 0 with open().

fd = open(Stmf:flag:mode:0:0);
rc = write(fd: %addr(base64decoded): %len(%trim(base64decoded)));

So CCSID = 0 means "My job's flavor of EBCDIC". Which is not right
unless whoever encoded the data was using the same flavor of EBCDIC
that you are. That's almost certainly not the case. Also, how these
CCSIDs are used will depend on what you have in the 'flag' parameter,
which you haven't shown us.

What you need to do is find out what the actual encoding of the ZPL
label was before it was base64 encoded. Then, you need to call open()
and tell it the CCSID that corresponds to that encoding. Do NOT use
O_TEXTDATA on the open() call: the data is already in that encoding,
so you don't want the system to translate it.
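
As a rough analogy only (Python, not the IFS APIs, with a made-up label string and file names), the sketch below writes the same decoded bytes two ways: once untranslated, and once through a text layer that re-encodes them to EBCDIC on the way out. The text-mode write plays the part of an open() with O_TEXTDATA; cp037 and the ZPL-like sample are assumptions for the example.

```python
import os
import tempfile

# Hypothetical decoded label data, already in its final encoding (UTF-8 here).
decoded = "^XA^FDcafé^FS^XZ".encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "label_raw.txt")
    translated_path = os.path.join(tmp, "label_translated.txt")

    # Untranslated write: the bytes land in the file exactly as decoded.
    with open(raw_path, "wb") as f:
        f.write(decoded)

    # Translating write: the data is treated as text and re-encoded
    # (to EBCDIC cp037 in this sketch) before it reaches the file.
    with open(translated_path, "w", encoding="cp037", newline="") as f:
        f.write(decoded.decode("utf-8"))

    with open(raw_path, "rb") as f:
        raw_bytes = f.read()
    with open(translated_path, "rb") as f:
        translated_bytes = f.read()

print(raw_bytes == decoded)         # untranslated file matches the decoded data
print(translated_bytes == decoded)  # translated file does not
```

The file you want is the first one: the bytes that came out of the decoder, with the file tagged to say what encoding they are in.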

The CCSID of the IFS file is 37. I can view it using Notepad++ and WRKLNK (although this does indicate: Message . . . . : File CCSID not valid.
Cause . . . . . : The file Coded Character Set Identifier (CCSID) was 00037,
but the data in the file looks like ASCII. A CCSID of 00819 is being used.
Recovery . . . : If another CCSID is needed, use F15 to change to the
desired CCSID. )

So what's happening here is that you've obviously given it the wrong
CCSID, so it's guessing at CCSID 819 instead. CCSID 819 is ISO-8859-1.

But, this is a guess... just like UTF-8 was a guess... Rather than
take a guess at how it's encoded, ask whoever encoded it!

I'm not sure how I would "know" what encoding the string is once I have decoded it using base64_decode()?
Would this be indicated in the WSDL for the SOAP web service?
The XML response indicates <?xml version="1.0" encoding="utf-8"?>

Asking whoever is encoding the document is the only way I know to find
out the proper encoding.

The most commonplace ones would be ISO-8859-1, Windows-1252 or UTF-8.
Though, of course, we've already ruled out UTF-8. There's no point in
guessing... ask.
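
To illustrate why guessing is unreliable, here is a small sketch (Python, with a made-up byte string; not anything from the actual label). The same bytes decode without any error as both ISO-8859-1 and Windows-1252, yet produce different text, so the data alone cannot tell you which one the sender meant.

```python
# Hypothetical sample: "caf", then byte E9, a space, then byte 80.
sample = bytes.fromhex("636166e92080")

# Both single-byte encodings accept the data without complaint...
as_iso_8859_1 = sample.decode("iso-8859-1")
as_windows_1252 = sample.decode("cp1252")

# ...but they disagree about what byte 80 means.
print(as_iso_8859_1 == as_windows_1252)

# UTF-8 happens to reject this particular sample, which only rules
# out one candidate; it does not say which of the others is correct.
try:
    sample.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)
```

A successful decode proves nothing here, which is the reason to ask the sender instead.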