Hello, Dan:

I would just like to add the following thoughts:

1. Although it is always possible that two slightly different source members could generate the same "check-sum" or "hash code", it is pretty unlikely.
2. If two members generate the same hash code or check-sum, they are probably the same (with, say, > 99% accuracy).
3. If two members generate a different hash code or check-sum, they are definitely NOT THE SAME.

So, with that in mind, I think it is fairly "safe" to use this type of hash-code or check-sum comparison, along with some other information that is easy to obtain. For example, use DSPFD ... TYPE(*MBR or *MBRLIST) OUTPUT(*OUTFILE) and compare the result to the same outfile created on the "master" machine, looking at obvious things like different member changed date/time stamps, a different number of records, etc.

So, I am suggesting that a hash code or check-sum can be used in combination with several other "sanity checks" (the member has the same number of records, the same record length, etc.) before deciding that two members are in all probability truly the same.

As others (Gene Gaunt) have suggested, the ultimate tool to compare members is the IBM CMPPFM command. Of course, this requires you to save an entire source physical file on one system, send it over to the other system, and restore it there (in another library), where you will do the compare.

NOTE: CMPPFM has a "nice" feature that many people are apparently unaware of: it can compare *ALL members in one source file with all members in another source file and report only the "differences", where differences include any members found in one file but not in the other, as well as members that exist in both files but contain different data.
The syntax for this multi-member comparison is:

    CMPPFM NEWFILE(newlib/QRPGSRC) +
           NEWMBR(*ALL) +
           OLDFILE(oldlib/*NEWFILE) +
           OLDMBR(*NEWMBR) +
           CMPTYPE(*LINE) +
           RPTTYPE(*SUMMARY) +
           OUTPUT(*PRINT)

Some of you may prefer to use RPTTYPE(*DIFF) ...

Also, some of you who have been around IBM (mainframe) systems for a long time may recognize the format of the output reports of the CMPPFM command; and, yes, it is essentially the old SUPER-C (SuperCompare) utility, ported from the mainframe to the AS/400. ;-)

SuperC was made available by IBM for OS/400, from about V1R3 to V2R3, as a PRPQ, before CMPPFM was integrated into PDM.

For those not familiar with SuperC, it uses a hash-coding technique to quickly isolate lines that are different in the two members. For more information about such source comparison techniques, I refer you to the (computer science) literature:

    A Technique for Isolating Differences Between Files,
    by Paul Heckel, (c) 1978,
    Communications of the ACM, Apr. 1978, Vol. 21, No. 4

The diff command provided with most Unix and Linux distributions relies on a similar line-hashing idea, although its core algorithm is different (a longest-common-subsequence search).

Regards,

Mark S. Waterbury

----- Original Message -----
From: <thomas@inorbit.com>
To: <mi400@midrange.com>
Sent: Thursday, May 09, 2002 2:51 PM
Subject: RE: [MI400] Generate hash code for a source member?

> Dan:
>
> First thing to keep in mind is that no hash value is going to be
> foolproof unless your hashes have as many significant characters
> as your largest members. Hashes can be pretty good, but they
> won't guarantee uniqueness.
>
> With that in mind, note that _most_ switched characters will
> indeed generate different hashes; the XFOOT was suggested over
> groups of 4 characters for 32 bits. "ABCD" is definitely a
> different 32-bit value from "BACD", e.g., 3250766788 vs.
> 3267478468. Similarly, "ABCD" and "EFGH" are together different
> from "ABCE" and "DFGH". That is, while many transpositions won't
> be caught, most of them will be caught.
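The arithmetic in the paragraph above can be checked with a short sketch. It is only an illustration, not anyone's actual implementation: the two quoted values match the EBCDIC (code page 37) encoding of the four characters read as a big-endian unsigned integer, so that encoding is assumed here. The names `word32` and `xfoot` are hypothetical, and padding short records with EBCDIC blanks is an assumption as well.

```python
def word32(text: str) -> int:
    """Read exactly 4 characters as one unsigned big-endian 32-bit value.

    Assumption: the characters are encoded in EBCDIC code page 37.
    """
    raw = text.encode("cp037")
    if len(raw) != 4:
        raise ValueError("expected exactly 4 characters")
    return int.from_bytes(raw, "big")


def xfoot(record: str) -> int:
    """Cross-foot a record: add up its 4-byte groups, wrapping at 32 bits."""
    raw = record.encode("cp037")
    raw += b"\x40" * (-len(raw) % 4)  # pad with EBCDIC blanks (assumption)
    total = 0
    for start in range(0, len(raw), 4):
        total = (total + int.from_bytes(raw[start:start + 4], "big")) & 0xFFFFFFFF
    return total


print(word32("ABCD"))                          # 3250766788
print(word32("BACD"))                          # 3267478468
print(xfoot("ABCDEFGH") != xfoot("ABCEDFGH"))  # True: this switch is caught
print(xfoot("ABCDEFGH") == xfoot("EFGHABCD"))  # True: swapped groups are missed
```

The last line shows the kind of transposition a plain sum cannot see: addition is commutative, so exchanging two whole 4-character groups leaves the total unchanged.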
>
> Switched records are slightly more trouble, but something such as
> RRN being used as a kind of seed value should help.
>
> The real question comes down to exactly how precise do you need
> this to be? Do you need to guarantee you'll catch every
> duplication or variation?
>
> Tom Liotta
>
> "Dan Bale" wrote
>
> > A simple XFOOT solution, I'm thinking, will not catch changes where
> > characters are switched, or where records are switched. I realize I could
> > introduce some logic to multiply each element in a record by a different
> > value, and do something similar by RRN, and then have to deal with overflow,
>
> --
> Tom Liotta
> The PowerTech Group, Inc.
> 19426 68th Avenue South
> Kent, WA 98032
> Phone 253-872-7788
> Fax 253-872-7904
> http://www.400Security.com
>
> _______________________________________________
> This is the MI Programming on the AS400 / iSeries (MI400) mailing list
> To post a message email: MI400@midrange.com
> To subscribe, unsubscribe, or change list options,
> visit: http://lists.midrange.com/cgi-bin/listinfo/mi400
> or email: MI400-request@midrange.com
> Before posting, please take a moment to review the archives
> at http://archive.midrange.com/mi400.
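The ideas discussed in this thread (weighting each element by its position, using the RRN as a seed, letting overflow wrap, and backing the checksum with a few sanity checks) could be combined roughly as follows. This is a minimal sketch under stated assumptions, not a tested utility: members are modelled as lists of byte strings, records are padded with ASCII blanks, and the names `record_sum`, `member_checksum`, `Fingerprint` and `fingerprint` are all hypothetical.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF  # keep every sum within 32 bits, so overflow simply wraps


def record_sum(record: bytes) -> int:
    """Weighted sum of the record's 4-byte groups (weights 1, 2, 3, ...)."""
    padded = record + b" " * (-len(record) % 4)  # blank padding is an assumption
    total = 0
    for weight, start in enumerate(range(0, len(padded), 4), start=1):
        group = int.from_bytes(padded[start:start + 4], "big")
        total = (total + weight * group) & MASK32
    return total


def member_checksum(records: list[bytes]) -> int:
    """Combine the record sums, using the relative record number as the weight."""
    total = 0
    for rrn, record in enumerate(records, start=1):
        total = (total + rrn * record_sum(record)) & MASK32
    return total


@dataclass(frozen=True)
class Fingerprint:
    """Checksum plus the cheap sanity checks: record count and record length."""
    record_count: int
    record_length: int
    checksum: int


def fingerprint(records: list[bytes]) -> Fingerprint:
    length = max((len(r) for r in records), default=0)
    return Fingerprint(len(records), length, member_checksum(records))


original = [b"AAAA", b"BBBB"]
swapped = [b"BBBB", b"AAAA"]   # same records, different order
edited = [b"AAAA", b"BBBC"]    # one character changed

print(fingerprint(original) == fingerprint(list(original)))  # True
print(fingerprint(original) == fingerprint(swapped))         # False
print(fingerprint(original) == fingerprint(edited))          # False
```

Equal fingerprints only mean the two members are probably the same; different fingerprints mean they are definitely different. A line-by-line compare such as CMPPFM remains the definitive check.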