|
-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of Roger Harman
Sent: Wednesday, September 27, 2006 6:58 PM
To: midrange-l@xxxxxxxxxxxx
Subject: RE: Pause technique

Are you saying that you never have failures? If so, I'd sure like to live on your planet <grin>.
Of course not.
Some processes (particularly when dealing with imported data) have potential weak spots and they need to be dealt with. I'd sure rather deal with the possibility of failure proactively than wait for an abend and have to figure out what needs to be unwound to start over.
Cleaning up imported data prior to processing it through your system is a separate issue and not what I'm talking about here. We are talking about processing after the data has been accepted into your application system.
We have a critical nightly job with about 50 distinct steps, one of which has about 15 sub-steps. The fact that we've planned for checkpoint restart capability doesn't imply to me that there is a design flaw. Rather, it implies that we've studied the issue and programmed defensively to deal with real world conditions.
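For anyone following along who hasn't built one of these, the driver CL for that kind of checkpoint restart can be as simple as recording the last completed step in a data area and jumping past it on a restart. This is only a rough sketch; the library, data area, and step/program names are all made up for illustration, and a real job would have far more steps:

   PGM
      DCL        VAR(&LASTSTEP) TYPE(*CHAR) LEN(10)

      /* Data area NIGHTCHK (a 10-character *CHAR, created once with  */
      /* CRTDTAARA) holds the name of the last step that completed.   */
      RTVDTAARA  DTAARA(APPLIB/NIGHTCHK) RTNVAR(&LASTSTEP)

      /* On a restart, jump past the steps that already completed.    */
      IF         COND(&LASTSTEP *EQ 'STEP010') THEN(GOTO CMDLBL(STEP020))
      IF         COND(&LASTSTEP *EQ 'STEP020') THEN(GOTO CMDLBL(STEP030))

   STEP010:   CALL       PGM(APPLIB/NIGHT010)
              CHGDTAARA  DTAARA(APPLIB/NIGHTCHK) VALUE('STEP010')

   STEP020:   CALL       PGM(APPLIB/NIGHT020)
              CHGDTAARA  DTAARA(APPLIB/NIGHTCHK) VALUE('STEP020')

   STEP030:   CALL       PGM(APPLIB/NIGHT030)
              /* The whole job finished, so clear the checkpoint.     */
              CHGDTAARA  DTAARA(APPLIB/NIGHTCHK) VALUE(' ')
   ENDPGM

Because the data area survives the job ending abnormally, the next submission picks up at the first step that never completed instead of rerunning everything.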
Let me ask you this: How many times have you had to restart that job this month/year? How many times has the same basic problem caused a restart? How many times have you said to yourself, "I could make a change to prevent the problem above," but not yet found the time to make the change? How do you determine that a checkpoint has been reached successfully? When a checkpoint fails, how do you fix the problem so that you can restart?

If you answered zero, never, never, automatically, and "it fixes itself," then congratulations; I'd say your shop is an exception. Most shops with an easily restartable and/or checkpointed multi-step job validate the checkpoints manually and end up using that restartability regularly, after fixing the data with DFU/SQL/DBU. That is what I mean when I say failure is treated as an expected and acceptable occurrence. Programming in such a manner isn't defensive in my mind; it's defective. The business process of validating the checkpoints, fixing the errors, and restarting merely limits the effects of the defective programming.

At the place I used to work, I too had a critical nightly job. Did I ever have a problem that required me to restart it? Yep, sure did, and let me tell you, it was a major PITA to do so. But in the seven years I was there it probably only had to be done three or four times, mostly early on when the system was new. Any problem that required a restart was tracked down and prevented from happening again at the source.
I assume you do range checks on input data or chain to master files to maintain data integrity? By doing so, are you not also assuming that failure (i.e., erroneous input data) is an expected and acceptable occurrence? Of course you are, and you're programming defensively to catch and correct those failures. I assume you use the MONMSG command in CL programs? Another example of defensive programming.
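For the archives, a bare-bones illustration of the kind of defensive CL being described, with a program-level MONMSG as a safety net and a command-level MONMSG for a condition the program expects. The file, library, and program names and the message text are invented for the example; a real program would monitor for the specific escapes it expects:

   PGM
      /* Program-level monitor: any CPF escape not handled below      */
      /* branches to the error handler instead of ending in a         */
      /* function check.                                              */
      MONMSG     MSGID(CPF0000) EXEC(GOTO CMDLBL(ERROR))

      /* An empty import file is expected now and then, so trap the   */
      /* CPF2817 escape from CPYF (copy ended because of error, e.g.  */
      /* nothing to copy) and carry on.                               */
      CPYF       FROMFILE(IMPLIB/ORDERSIN) TOFILE(APPLIB/ORDERSWRK) +
                   MBROPT(*REPLACE)
      MONMSG     MSGID(CPF2817) EXEC(DO)
         SNDPGMMSG  MSG('No orders to import tonight - post skipped')
         GOTO       CMDLBL(DONE)
      ENDDO

      CALL       PGM(APPLIB/POSTORDERS)
      GOTO       CMDLBL(DONE)

   ERROR:     SNDPGMMSG  MSGID(CPF9898) MSGF(QCPFMSG) +
                   MSGDTA('Nightly import failed - see the job log') +
                   MSGTYPE(*ESCAPE)

   DONE:      RETURN
   ENDPGM

The point of the two monitors is the difference between an expected condition you handle and keep going, and an unexpected one you surface to the caller immediately.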
I agree that you have to program defensively, and yes, my programs monitor for failures. I also validate data on input, but I'd argue that validation on input is offensive programming, and as the saying goes, the best defense is a good offense. Once the data is in your application system, it had better be good. Program defensively, but if your defenses are getting hit often, you need a better offense.

Charles Wilt
--
iSeries Systems Administrator / Developer
Mitsubishi Electric Automotive America
ph: 513-573-4343
fax: 513-398-1121