On 19/11/2006, at 1:52 AM, PaulMmn wrote:

> I take issue with your 'overly clean' job logs--  I -want- to see if
> a data area isn't created.  In our environment, every clue that
> provides insight into why a job had problems is important!  I know
> you'll say that a program that has problems doesn't belong in
> production (I agree), but the reality says that only a "user" can
> find the last bugs!

Which is exactly WHY all programs require graceful error handling. Most "user" errors should be found during User Acceptance Testing, but because unexpected things can happen in production, all code needs a way to deal with the unexpected. The default exception handler isn't good enough.

> In the case of the data area, if it doesn't exist and it should, or
> it does but it shouldn't, I -need- to know!

And you -would- know if you followed what I said.

In both these cases the exception is unexpected. Why? Because:
        1) in case A your code expects the data area to exist
        2) in case B your code expects the data area to not exist

Because these are unexpected exceptions, there should be no command-level MONMSG for either of these cases. The very presence of a command-level MONMSG says the exception being monitored IS expected.
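
A minimal sketch of that distinction, not taken from any real program. The names APPLIB, CTLDTA, WORKDTA and &LASTRUN are made up for illustration. It also assumes CPF1023 is the "data area already exists" escape from CRTDTAARA and CPF1015 is the "data area not found" escape from RTVDTAARA.

```cl
PGM
  DCL VAR(&LASTRUN) TYPE(*CHAR) LEN(8)

  /* EXPECTED: the work data area may be left over from an      */
  /* earlier call in this job, so the condition is monitored    */
  /* at the command level and simply ignored.                   */
  CRTDTAARA DTAARA(QTEMP/WORKDTA) TYPE(*CHAR) LEN(8)
  MONMSG MSGID(CPF1023)

  /* UNEXPECTED: the control data area must already exist.      */
  /* There is no command-level MONMSG, so a CPF1015 here is     */
  /* not hidden and stays in the job log.                       */
  RTVDTAARA DTAARA(APPLIB/CTLDTA) RTNVAR(&LASTRUN)
ENDPGM
```

As written, the second command falls through to the default exception handler if it fails. The point is only that the missing MONMSG is deliberate.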

Both these situations are correctly handled by a global MONMSG for CPF9999 with a corresponding error handler. Because the exception is unexpected, you leave these messages in the job log; otherwise you have no chance of diagnosing the problem. The error handler should at least resignal the exception, but it should probably also do any necessary diagnostic collection (such as DSPJOB, DSPJOBLOG, DMPCLPGM, etc.) and notification.
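
One way such a handler might look, as a sketch only. The label ERROR, the switch &INERROR, and the object names APPLIB and CTLDTA are illustrative assumptions. For brevity the handler sends a generic CPF9898 escape to the caller rather than truly resignalling the original message, and it leaves out notification.

```cl
PGM
  DCL VAR(&LASTRUN) TYPE(*CHAR) LEN(8)
  DCL VAR(&INERROR) TYPE(*LGL) VALUE('0')

  /* Global monitor: follows the DCLs, precedes every other command */
  MONMSG MSGID(CPF9999) EXEC(GOTO CMDLBL(ERROR))

  /* No command-level MONMSG: a missing data area is unexpected */
  RTVDTAARA DTAARA(APPLIB/CTLDTA) RTNVAR(&LASTRUN)
  /* ... normal processing ... */
  RETURN

ERROR:
  /* If the handler itself fails, stop rather than loop */
  IF COND(&INERROR) THEN(SNDPGMMSG MSGID(CPF9898) MSGF(QCPFMSG) +
       MSGDTA('Error handler failed') MSGTYPE(*ESCAPE))
  CHGVAR VAR(&INERROR) VALUE('1')

  /* Diagnostic collection; the original messages are not removed */
  /* and so remain in the job log                                 */
  DSPJOBLOG OUTPUT(*PRINT)
  DMPCLPGM

  /* Pass the failure up to the caller */
  SNDPGMMSG MSGID(CPF9898) MSGF(QCPFMSG) +
       MSGDTA('Unexpected error, see job log and dump') +
       MSGTYPE(*ESCAPE)
ENDPGM
```

The &INERROR switch is there because a failure inside the handler would trip the same global MONMSG again and could otherwise loop.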

My 'clean up' comments were specifically about command-level MONMSG statements (i.e., handled exceptions) and unnecessary completion messages.

The result is a graceful handling of the exception AND sufficient information to correctly diagnose the problem. In this example, however, the real cause is likely some other program or job that failed to create or delete the data area as necessary.

(I'll bet someone will throw up the example of a long-running batch job that crashes because of a missing something-or-other. Because the job has no exception handling at all, or at least no global exception handler, the default exception handler kicks in, the missing something-or-other can be created/added/restored/whatever, and the job is allowed to continue. I agree that's nice, but it's not necessary. A properly written long-running process will have a restart facility so it can pick up close to where the crash occurred.)

Simon Coulter.
   FlyByNight Software         AS/400 Technical Specialists

   Phone: +61 3 9419 0175   Mobile: +61 0411 091 400        /"\
   Fax:   +61 3 9419 0175                                   \ /
                 ASCII Ribbon campaign against HTML E-Mail  / \
