On 8 February 2013 21:01, Eric Lehti <elehti@xxxxxxxxxxxxxxxxxx> wrote:
> I would like to protect our system against run-away jobs.
We went a slightly different route with this. We have a background job
that takes a snapshot every 10 minutes and looks for jobs whose temp
storage or DASD usage is over a preset level, or appears to be
climbing too quickly. For high temp storage we put the job on
hold (the level is set way above anything seen in normal use) and send
a message to QSYSOPR. That gets reviewed, and if it looks legitimate
we up the warning level a bit and release the job. That's worked well
for several years now, and I've extended our in-house version to
report on excessive CPU time used. You could add in anything else
that's specific to your way of working. The activity is logged to a
message queue, so it's easy to look back to see timings, etc.
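
To make the checking logic concrete, here is a rough sketch in Python of
the comparison between two snapshots. It is not our actual code: the
names JobSnapshot, find_runaway_jobs, max_temp_mb and max_growth_mb are
made up for illustration, and collecting the snapshot, holding the job
and messaging QSYSOPR are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSnapshot:
    """One job's figures at snapshot time (hypothetical structure)."""
    job_name: str
    temp_storage_mb: int


def find_runaway_jobs(previous, current, max_temp_mb, max_growth_mb):
    """Return (job_name, reason) pairs for jobs that look like run-aways.

    previous and current are lists of JobSnapshot taken one interval
    apart (10 minutes in the setup described above). Both limits are
    assumed values that a site would tune for itself.
    """
    prev_by_name = {snap.job_name: snap for snap in previous}
    flagged = []
    for snap in current:
        # Absolute check: usage is over the preset level.
        if snap.temp_storage_mb > max_temp_mb:
            flagged.append((snap.job_name, "over limit"))
            continue
        # Rate check: usage grew too much since the last snapshot.
        # Jobs that were not in the previous snapshot are skipped.
        before = prev_by_name.get(snap.job_name)
        if before is not None:
            growth = snap.temp_storage_mb - before.temp_storage_mb
            if growth > max_growth_mb:
                flagged.append((snap.job_name, "climbing too quickly"))
    return flagged
```

Each flagged job would then be put on hold and reported to QSYSOPR, and
the limit raised a bit if the job turns out to be legitimate.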