MIDRANGE dot COM Mailing List Archive



Home » MIDRANGE-L » November 2006

Re: monmsg question





On 19/11/2006, at 1:52 AM, PaulMmn wrote:

> I take issue with your 'overly clean' job logs--  I -want- to see if
> a data area isn't created.  In our environment, every clue that
> provides insight into why a job had problems is important!  I know
> you'll say that a program that has problems doesn't belong in
> production (I agree), but the reality says that only a "user" can
> find the last bugs!

Which is exactly WHY all programs require graceful error handling. Most "user" errors should be found during User Acceptance Test, but because unexpected things can happen in production, all code needs a way to deal with the unexpected. The default exception handler isn't good enough.


> In the case of the data area, if it doesn't exist and it should, or
> it does but it shouldn't, I -need- to know!

And you -would- know if you followed what I said.

In both these cases the exception is unexpected. Why? Because:
        1) in case A your code expects the data area to exist
        2) in case B your code expects the data area to not exist

Because these are unexpected exceptions, there should be no command-level MONMSG for either of these cases. The very presence of a command-level MONMSG says the exception being monitored IS expected.
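
To make the distinction concrete, here is a minimal CL sketch. The names MYLIB, MYDTAARA and &VALUE are hypothetical and used only for illustration. The first command is followed by a command-level MONMSG because a missing object is an expected condition at that point. The second has none, because the code assumes the data area exists.

```cl
PGM
    DCL        VAR(&VALUE) TYPE(*CHAR) LEN(10)

    /* Expected: the data area may not exist on a first run, so     */
    /* CPF9801 (object not found) is monitored at command level     */
    /* and handled right here.                                      */
    CHKOBJ     OBJ(MYLIB/MYDTAARA) OBJTYPE(*DTAARA)
    MONMSG     MSGID(CPF9801) EXEC(DO)
        CRTDTAARA  DTAARA(MYLIB/MYDTAARA) TYPE(*CHAR) LEN(10)
    ENDDO

    /* Unexpected: from here on the code assumes the data area      */
    /* exists, so there is deliberately no command-level MONMSG.    */
    /* A failure here is an unexpected exception.                   */
    RTVDTAARA  DTAARA(MYLIB/MYDTAARA) RTNVAR(&VALUE)
ENDPGM
```

Reading the source, the MONMSG after CHKOBJ tells the next programmer that "not found" is a normal outcome there. The bare RTVDTAARA says the opposite.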

Both these situations are correctly handled by a global MONMSG for CPF9999 with a corresponding error handler. Because this is an unexpected exception, you would leave these messages in the job log; otherwise you have no chance of diagnosing the problem. The error handler should at least resignal the exception, but it should probably also do any necessary diagnostic collection (such as DSPJOB, DSPJOBLOG, DMPCLPGM, etc.) and notification.
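
A global handler along those lines might look like the following sketch. It is one possible shape, not a definitive implementation. The label ERROR, the variable &INERROR, and the MYLIB/MYDTAARA names are assumptions made for illustration. As a simplification, it ends by sending a generic CPF9898 escape message to the caller rather than resending the original exception.

```cl
PGM
    DCL        VAR(&VALUE)   TYPE(*CHAR) LEN(10)
    DCL        VAR(&INERROR) TYPE(*LGL)  VALUE('0')

    /* Program-level (global) monitor. It must follow the DCLs and  */
    /* precede all other commands; EXEC may only contain a GOTO.    */
    MONMSG     MSGID(CPF9999) EXEC(GOTO CMDLBL(ERROR))

    /* Normal processing. No command-level MONMSG here, because a   */
    /* missing data area is NOT expected at this point.             */
    RTVDTAARA  DTAARA(MYLIB/MYDTAARA) RTNVAR(&VALUE)

    RETURN

ERROR:
    /* If the handler itself fails, give up with a function check   */
    /* instead of looping back into the handler.                    */
    IF         COND(&INERROR) THEN(SNDPGMMSG MSGID(CPF9999) +
                 MSGF(QCPFMSG) MSGTYPE(*ESCAPE))
    CHGVAR     VAR(&INERROR) VALUE('1')

    /* Diagnostic collection. The original messages are left in     */
    /* the job log untouched.                                       */
    DMPCLPGM
    DSPJOBLOG  OUTPUT(*PRINT)

    /* Tell the caller that this program failed (simplified         */
    /* resignal using the general-purpose message CPF9898).         */
    SNDPGMMSG  MSGID(CPF9898) MSGF(QCPFMSG) +
                 MSGDTA('Unexpected error, see job log and dump') +
                 MSGTYPE(*ESCAPE)
ENDPGM
```

Nothing in the handler removes messages, so the job log still shows the original escape message and the function check that followed it.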

My 'clean up' comments were specifically about command-level MONMSG statements (i.e., handled exceptions) and unnecessary completion messages.

The result is graceful handling of the exception AND sufficient information to correctly diagnose the problem. In this example, however, the real cause is likely some other program or job that failed to create or delete the data area as necessary.

(I'll bet someone will bring up the example of a long-running batch job that crashes because of a missing something-or-other. Because it has no exception handling at all, or at least no global exception handler, the default exception handler kicks in, the missing something-or-other can be created/added/restored/whatever, and the job is allowed to continue. I agree that's nice, but it's not necessary. A properly written long-running process will have a restart facility so it can pick up close to where the crash occurred.)

Regards,
Simon Coulter.
--------------------------------------------------------------------
   FlyByNight Software         AS/400 Technical Specialists

   http://www.flybynight.com.au/
   Phone: +61 3 9419 0175   Mobile: +61 0411 091 400        /"\
   Fax:   +61 3 9419 0175                                   \ /
                                                             X
                 ASCII Ribbon campaign against HTML E-Mail  / \
--------------------------------------------------------------------







