Re: MD5 hash with ASCII conversion -- RPG400-L

I'm with Scott on this one, you should copy the data into a stream
file on the IFS before attempting to calculate an MD5 hash for it. If
it were me, I'd look into creating the file on the IFS as a .ZIP file,
then create the hash. I was thinking you could use the Java jar
utility directly on the PF, but a quick test didn't work; however that
doesn't mean it couldn't work if done differently from what I quickly
tried. I'm pretty sure the PKZIP product for the iSeries can do it.
It would be nice to skip the CPYTOIMPF given how slow it is.

I don't get what you mean when you say that using CPYTOIMPF would
double the number of transfers.

As far as the current process being set in concrete... give me a
break. Either sell a new, improved, and more reliable process to
management/users or let them live with what they've got. Why would
you bother to implement changes that would make the process more
fragile?

It seems to me that the real solution is to have a reliable,
guaranteed file transfer process (sometimes called Enterprise File
Transfer). I believe most such products have matured beyond just
simple transfers. Have you considered purchasing such a product? One
product I'm familiar with is Metastorm Integration Manager,
http://www.metastorm.com/products/metastorm_integration_manager.asp .
With such a product, you can automate processing on both sides, be
notified of problems, and have a complete audit trail; just to name a
few features. You might be able to find a Java-based open source
package.

HTH,
Charles

On Fri, Jul 10, 2009 at 1:36 AM, Dan <dan27649@xxxxxxxxx> wrote:

As usual Scott, you are generously thorough. You're right, yikes.
Comments & responses inline:

On Thu, Jul 9, 2009 at 7:14 PM, Scott Klement <rpg400-l@xxxxxxxxxxxxxxxx> wrote:

Hi Dan,

Yikes, there are so many things that can go wrong with the approach
you're taking. It's certainly possible to get it to work, but I'm
wondering if you wouldn't be better served by taking a different approach?

Here's the approach I would recommend (though, I don't know that much
about the scenario, so I may be off-base)

1) Use CPYTOIMPF or some other tool to convert your PF to an ASCII file
in the IFS. Get it exactly the way the Windows system wants it.

2) Run your MD5 against the IFS file.

3) FTP the file to the Windows box in BINARY mode so nothing can be
changed in the data during transit.

So that's my recommendation. That will ensure that your MD5 is
calculated on the same data that the Windows one will be. That's
crucial, because the main purpose of an MD5 hash is to verify that the
data is identical. Any change to the data will give you a different
hash... which is why your current approach is so difficult.

Your current approach is tricky because you have to try to calculate
exactly what will be done to the file. In other words, you have to try
to guess the future :) Then, change each record in your file to look
like it will be in the future, and calculate the hash on that.
Possible, but tricky.
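
To make the "any change gives a different hash" point concrete, here is a small Python sketch. The record content is made up, and it assumes the PF data is EBCDIC in CCSID 37 (Python's cp037 codec) and that the converted file ends each record with CRLF; your actual CCSID and record delimiter may well differ.

```python
import hashlib

record = "CUSTOMER 00042"  # made-up sample record

# Assumption: the record as stored in the PF, in EBCDIC (CCSID 37).
ebcdic_bytes = record.encode("cp037")
# Assumption: the record after conversion, in ASCII with a CRLF appended.
ascii_bytes = record.encode("ascii") + b"\r\n"

ebcdic_md5 = hashlib.md5(ebcdic_bytes).hexdigest()
ascii_md5 = hashlib.md5(ascii_bytes).hexdigest()

# The text is "the same", but the bytes are not, so the digests differ.
print(ebcdic_md5 == ascii_md5)  # False
```

Hashing the PF directly means reproducing every one of those byte-level differences by hand, for every record, before feeding the data to MD5.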

I *think* the CPYTOIMPF is a non-starter for several reasons:
1) The current process of transferring the files is set in production
concrete. It will be easier to work around this, with the approach I am
attempting. (A little background: Currently, the on-call person has to
"validate" these transfers manually, by opening up these files on the
Windows server in a text editor. The larger ones take about 20 minutes to
open. Once the file is opened, we "look" at the beginning, middle, and end
for problems. We've had FTP transfers that return a successful completion
status, but where we've had corruption in the file on the Windows box.
Needless to say, this is a time-consuming process that does not guarantee
we'll find problems that are lurking somewhere in the file.)
2) These are *huge* files we're moving, about half a GB, and about 20 of
those every day. To effectively double the number of transfers, I think,
would impact our promised timelines.
3) I believe we would still need to run another MD5 hash once the file
hits the final destination. Is there something about a binary transfer that
is more, um, reliable in terms of ensuring a complete transfer? To my
limited knowledge, there is no guarantee that the binary FTP transfer
between the IFS and the Windows server will give a perfect match (emphasis
on "guarantee").