Re: MD5 hash with ASCII conversion -- RPG400-L

Hi Dan,

Yikes, there are so many things that can go wrong with the approach you're taking. It's certainly possible to get it to work, but I wonder whether you'd be better served by a different approach.

Here's the approach I would recommend (though I don't know much about the scenario, so I may be off-base):

1) Use CPYTOIMPF or some other tool to convert your PF to an ASCII file in the IFS. Get it exactly the way the Windows system wants it.

2) Run your MD5 against the IFS file.

3) FTP the file to the Windows box in BINARY mode so nothing can be changed in the data during transit.

So that's my recommendation. That will ensure that your MD5 is calculated on the same data that the Windows one will be. That's crucial, because the main purpose of an MD5 hash is to verify that the data is identical. Any change to the data will give you a different hash... which is why your current approach is so difficult.
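
To make the point concrete, here is a minimal sketch in Python (used purely for illustration; the helper name md5_of_file and the chunk size are my own assumptions, not anything from your program). It hashes a file's bytes exactly as they are stored, feeding them to the hash in chunks, so whatever is in the IFS file is what gets hashed:

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Return the lowercase hex MD5 of a file's bytes, exactly as stored."""
    digest = hashlib.md5()
    # Binary mode matters: no character translation, no line-ending changes.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the file is then transferred in BINARY mode, running the same kind of byte-for-byte hash on the Windows side should give the same value, because nothing touched the data in between.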

Your current approach is tricky because you have to try to calculate exactly what will be done to the file. In other words, you have to try to guess the future :) Then, change each record in your file to look like it will be in the future, and calculate the hash on that. Possible, but tricky.

If you persist with this approach, here are things to consider:

1) Translating EBCDIC to ASCII. You are already doing this, but I wonder if you've thought of everything? For example, if your job is CCSID 65535, RPG won't translate the file as it's read. But if the job's CCSID ever changes, RPG will translate the data as it's read. So you can't simply assume the data in your program will be in the CCSID of the file, and you can't assume it'll be in the CCSID of the job. It could be either. Plus, RPG uses the "mixed byte CCSID that corresponds to the job's CCSID", so you can't really use the job's CCSID directly anyway, even if you know the program will never run on a 65535 system. It's a bit tricky. Once you have the right EBCDIC CCSID, you also have to make sure you're using the right ASCII one.

Indeed, normally you would NOT want to translate data before calculating an MD5 -- because therein lies the road to madness. But in your case you have to because you're trying to predict what the file will end up as...

I glanced at your code, and you do not appear to be using iconv(). Wouldn't iconv() work better than QTQCVRT?
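
Here is a small Python sketch of why the choice of CCSID matters to the hash. It assumes that Python's cp037 and cp500 codecs are reasonable stand-ins for EBCDIC CCSIDs 37 and 500, and that latin_1 stands in for CCSID 819; the helper name ebcdic_to_ascii is made up for the example. The same three EBCDIC bytes translate to different ASCII bytes depending on which EBCDIC code page you assume, so the MD5 differs too:

```python
import hashlib

def ebcdic_to_ascii(data, ebcdic_codec="cp037", ascii_codec="latin_1"):
    """Translate EBCDIC bytes to single-byte ASCII/Latin-1 bytes."""
    return data.decode(ebcdic_codec).encode(ascii_codec)

record = bytes([0xC8, 0x89, 0x5A])           # "Hi!" when read as CCSID 37

as_037 = ebcdic_to_ascii(record, "cp037")    # x'5A' is "!" in code page 37
as_500 = ebcdic_to_ascii(record, "cp500")    # x'5A' is "]" in code page 500

hash_037 = hashlib.md5(as_037).hexdigest()
hash_500 = hashlib.md5(as_500).hexdigest()
same = (hash_037 == hash_500)                # False: one byte differs
```

Only a handful of characters (brackets, exclamation mark, and so on) differ between these two code pages, which is exactly what makes this kind of mismatch easy to miss in testing.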

2) Trimming trailing blanks... sounds like you're already doing this.

3) Adding CRLF. In ASCII, CR is x'0D' and LF is x'0A', so the pair is x'0D0A' (not x'0C0A'). FTP in ASCII mode will add CRLF to the end of every record; however, the FTP client MIGHT change that to just LF. That's unlikely if the client is Windows, but a Unix client would do it, and some clients offer that option on any platform.

There is no "end of file" character normally. Some very old PC software uses Ctrl-Z as end of file, but I don't think FTP does this, so it wouldn't be an issue for you.
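
Putting the three considerations together, here is a rough Python sketch of the "predict the future" calculation: trim trailing EBCDIC blanks (x'40'), translate, append the line ending, and feed each record to one running MD5. The function name predicted_md5, the codec names, and the sample records are assumptions for illustration only; cp037 and latin_1 stand in for CCSIDs 37 and 819.

```python
import hashlib

EBCDIC_BLANK = b"\x40"

def predicted_md5(records, eol=b"\r\n", ebcdic_codec="cp037"):
    """MD5 of the file as it is expected to look after an ASCII-mode transfer."""
    digest = hashlib.md5()
    for rec in records:
        trimmed = rec.rstrip(EBCDIC_BLANK)                 # drop trailing blanks
        ascii_bytes = trimmed.decode(ebcdic_codec).encode("latin_1")
        digest.update(ascii_bytes + eol)                   # CRLF (or LF) per record
    return digest.hexdigest()

# Two fixed-length (10-byte) EBCDIC records, blank padded.
records = ["AB".encode("cp037").ljust(10, EBCDIC_BLANK),
           "C".encode("cp037").ljust(10, EBCDIC_BLANK)]

with_crlf = predicted_md5(records)                # what a Windows client likely stores
with_lf = predicted_md5(records, eol=b"\n")       # what a Unix-style client might store
```

The two results differ even though the record data is identical, which is why you have to know exactly which line ending ends up on disk before the hashes can match.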

That's my $0.02 for now.

Dan wrote:

I have cobbled together an application to calculate the MD5 hash from a
native i file. I used Scott's code from iSeriesNetwork, which worked on IFS
files, converted that to use native files, and then added in the CCSID
conversion from Bruce's example found here:
http://archive.midrange.com/midrange-l/200006/msg01334.html
The current end result is shown below.

I am using WinHasher on the PC side to calculate the MD5 hash there.

The file is FTP'd from the i to a Windows server, and we need to ensure that
the file is unadulterated. By default, the FTP handles the conversion to
ASCII.

I have, so far, been unable to get a match on the MD5 hashes. One thing I
noticed is that the file on the Windows server has the linefeed/carriage
return characters after the last non-blank character in each line. The file
being transferred has fixed-length records, but the text data on each line
varies in length, so I calculate the length before each call to CIPHER.

Do I need to append linefeed/carriage return characters to the end of the
%trimmed string that I pass to CIPHER in my RPG program? If so, what are
they? I can't make out the hex representation in Textplorer. And it's been
way too many years since I've cared about linefeed/carriage return
characters. I'd hazard a guess that Textplorer is trying to show me
x'0C0A', but I can't say for certain.

Also, is there an "end of file" character that I need to consider.

TIA,
Dan

H DFTACTGRP(*NO) BNDDIR('QC2LE')

fDOC1PS if e disk

/copy md5klement,ifsio_h

D SetupConversn PR
D ToAscii 256a

*
* _CIPHER MI builtin. Allows access to cryptographic
* functions in the licensed internal code.
*
D cipher PR extproc('_CIPHER')
D receiver * value
D control * value
D source * value

D Convert PR extproc('_XLATEB')
D * value
D * value
D 10u 0 value

D HASH C const(5)
D MD5 C const(x'00')
D SHA1 C const(x'01')
D ONLY C const(x'00')
D FIRST C const(x'01')
D MIDDLE C const(x'02')
D FINAL C const(x'03')

*
* These control how the system creates a hash
*
D HashCtrl DS qualified
D Function 5I 0 inz(HASH)
D HashAlg 1A inz(MD5)
D Sequence 1A inz(FIRST)
D Length 10I 0 inz
D Output 1A inz(x'00')
D Reserved 7A inz(x'00000000000000')
D CtxPtr * inz(%addr(WorkArea))

D WorkArea S 160A inz(*loval)

*
* MI builtin to create a hex dump of a spot in memory
*
D hexdump PR EXTPROC('cvthc')
D output 40A
D input 20A
D output_len 10I 0 value

D ReadBuf S *
D BufSize s 10I 0
D Len s 10I 0
D fd s 10I 0
D BinHash S 20A inz(*loval)
D HexHash s 40A
D p_BinHash s * inz(%addr(BinHash))
D ToAscii s 256a

/free
*inlr = *on;

SetupConversn( ToAscii ) ;

hashCtrl.Sequence = FIRST;

dou %eof( DOC1PS ) ;
read DOC1PS ;
if not %eof( DOC1PS ) ;
hashCtrl.Length = %len( %trimr( DOC1OUT ) ) ;
Convert( %addr( DOC1OUT )
: %addr( ToAscii )
: %size( DOC1OUT ) ) ;
ReadBuf = %addr( DOC1OUT ) ;

cipher( %addr(p_BinHash)
: %addr(HashCtrl)
: %addr(ReadBuf) );

hashCtrl.Sequence = MIDDLE;
endif;
enddo;

// At the very end, call cipher with Sequence=FINAL
// to get back the MD5 hash

hashCtrl.Sequence = FINAL;
hashCtrl.Length = 0;

cipher( %addr(p_BinHash)
: %addr(HashCtrl)
: %addr(ReadBuf) );

// Apps that use MD5 hashes usually want them to be
// hexadecimal and lowercase. Here, that is done...

hexdump( HexHash : BinHash : %size(HexHash) );
HexHash = %xlate('ABCDEF': 'abcdef': HexHash);

dsply ('MD5 hash is ' + %subst(HexHash:1:32));

/end-free

*+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
* SetupConversn(): This sets up and gets the conversion table
* for converting EBCDIC (37) to ASCII (819)
*
* Taken from http://archive.midrange.com/midrange-l/200006/msg01334.html
*
*+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
P SetupConversn B
D SetupConversn PI
D r_To819 256a

D CCSID1 s 10i 0 inz(37)
D ST1 s 10i 0 inz(0)
D StartMap s 256
D L1 s 10i 0 inz(%size(StartMap))
D CCSID2 s 10i 0 inz(819)
D ST2 s 10i 0 inz(0)
D GCCASN s 10i 0 inz(0)
D L2 s 10i 0 inz(%size(To819))
D To819 s 256
D L3 s 10i 0
D L4 s 10i 0
D FB s 12
D ds
D x 5i 0
D LowX 2 2
* Get all single byte ebcdic hex values
C 0 do 255 x
C eval %subst(StartMap:x+1:1) = LowX
C enddo
* Get conversion table for 819 from 37
C call 'QTQCVRT'
C parm CCSID1
C parm ST1
C parm StartMap
C parm L1
C parm CCSID2
C parm ST2
C parm GCCASN
C parm L2
C parm To819
C parm L3
C parm L4
C parm FB
*
C eval r_To819 = To819

P E