On 26-Aug-2014 09:43 -0500, Stone, Joel wrote:
the issue is that the online system is brought down, along
with all online users. Unfortunately once in a while an independent
batch job named "BrokenJob" will be hung or MSGW status, with a lock
on a PF named "ImportantFile". Or maybe it is an FTP file transfer
in progress, or a web interface file lock.

For SWA to function, all access to the objects should be controlled, such that when /the online system is brought down/, that effect includes terminating *all* active accesses and preventing *any new* accesses until the checkpoint processing completes; that is the point of having an event that notifies when that task completes (SAVACTMSGQ). If access is as broadly available as alluded, then ending each of the FTP server, the web interface, and the batch subsystem(s) is probably easier than redesigning the accesses to be controlled so that they function in concert with SWA; redesigning the accesses is nevertheless the better choice, because that limits the scope of the required, or merely suspected-as-required, terminations.

So my question is, how can I identify and send an email or text
message stating something like "SWA cannot perform nightly backups
because BrokenJob has a lock on ImportantFile".

There is no way, except as a predictive response; either the SWA can function without any locking issues or the SWA will not function due to conflicting locks. A pre-processor that either records and notifies of locks found, or even just blindly ends jobs that were found to hold any locks, will be merely _assuming_ that the present condition is a reflection of a future condition. Unless the pre-processor also obtains any locks required for the save activity and holds those locks until the SWA reaches the checkpoint, other work on the system can still obtain new conflicting locks, so the SWA is not ensured to complete without locking issues, just as if the pre-processor had never been run. That is the nature of the beast called locks; they are temporal, irrespective of any apparent persistence in an anecdotal incident.
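The difference between a pre-processor that only looks and one that obtains and holds the lock can be sketched with an analogy. The following uses a Python threading.Lock as a stand-in for the lock on one object; it is not an IBM i interface, and the names object_lock, other_job_grabs_lock, check_then_save and hold_then_save are hypothetical, chosen only for illustration.

```python
import threading

# Stand-in for the lock on one object. This is an analogy only, not an IBM i API.
object_lock = threading.Lock()

def other_job_grabs_lock():
    """Hypothetical conflicting work: takes the lock if it is free."""
    return object_lock.acquire(blocking=False)

def check_then_save():
    """Pre-processor only looks; nothing is held between the look and the save."""
    looked_free = not object_lock.locked()
    intruder_holds = other_job_grabs_lock()        # arrives after the check
    save_ok = object_lock.acquire(blocking=False)  # the save now finds a conflict
    if save_ok:
        object_lock.release()
    if intruder_holds:
        object_lock.release()                      # clean up so the demo can be rerun
    return looked_free, save_ok

def hold_then_save():
    """Pre-processor obtains the lock and holds it until the 'checkpoint'."""
    object_lock.acquire()
    try:
        intruder_ok = other_job_grabs_lock()       # conflicting request is refused
        save_ok = True                             # save proceeds under the held lock
    finally:
        object_lock.release()
    return intruder_ok, save_ok
```

In the first function the check reports the object as free and the save still fails, because the conflicting lock arrived after the check. In the second, the conflicting request is the one that fails. Real save processing involves multiple objects and lock states, so this shows only the ordering problem, not a usable design.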

By leaving the locking to the SWA [with or without a pre-processor that does not obtain and hold the locks to prevent others from obtaining a conflicting lock], the errors in the joblog and/or logged with output file support for the save request might suffice. Some allocation messages were updated to include the name of a job holding a conflicting lock at the time the timeout transpired. Because locks are temporal, interrogating the locks after the fact does not reliably reveal what the conflict was; even if _a_ conflict persists at that later time, the current holder of the conflicting lock may not be the same holder that was active when the lock timeout transpired during the SWA.

Is there a DSPOBJLCK type command that can write to an outfile
(instead of a printer?)

I was fairly confident that some other replies had given the necessary links to code that would reveal some APIs that might be used to obtain the locks held against an object. Yet another reply suggested approaching from a different direction, starting from all jobs and then finding the locks each holds [albeit not what I would suggest for the scenario; and ignoring that the processing is akin to a game of whack-a-mole, no matter from which direction the approach is made].
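Whatever the source of the lock information, any notification built from it describes a moment that has already passed. A minimal sketch of wording a message that way follows, assuming the holders have already been gathered as (job name, lock state) pairs by some other means; the function describe_conflicts and its parameters are hypothetical, and how the list is obtained is outside the sketch.

```python
from datetime import datetime

def describe_conflicts(object_name, holders, taken_at):
    """Build a notification text from a snapshot of lock holders.

    'holders' is a list of (job_name, lock_state) tuples gathered elsewhere.
    'taken_at' is the datetime at which that snapshot was taken.
    """
    stamp = taken_at.strftime("%Y-%m-%d %H:%M:%S")
    if not holders:
        return f"No locks found on {object_name} as of {stamp}."
    # State the snapshot time so the reader does not mistake it for the present.
    lines = [f"Locks on {object_name} as of {stamp} (may have changed since):"]
    for job_name, lock_state in holders:
        lines.append(f"  {job_name} holds {lock_state}")
    return "\n".join(lines)

if __name__ == "__main__":
    snapshot = [("BrokenJob", "*SHRRD")]  # illustrative data, not real output
    print(describe_conflicts("ImportantFile", snapshot, datetime.now()))
```

Carrying the snapshot time in the text keeps the message honest about being a past observation rather than a statement of which job will block the next save.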
