Re: Multiple joblogs being created for some jobs -- MIDRANGE-L

On 15-Aug-2014 14:58 -0500, Steinmetz, Paul wrote:

After more research, I found this problem is NOT always occurring
for this job.

If a specific job and set of conditions can be identified for which the problem is consistently reproduced, then a Job Trace is a good place to continue the review of the issue. IIRC the message identifier appears as trace data [trace data, vs merely trace flow, must be included in the trace request] when the QMHSNMSG [I think that is the correct OS pgm] performs the work of sending a program message that would count against the overall number of messages sent in the job. The trace would also reveal any use of Display Job Log (DSPJOBLOG) [or any indirect method that causes the QMHJLOG program to be invoked].

Also found the source for the 3rd party vendor's program.
The job does send many messages to programs message q.
I believe these messages are filling up the program message, which
results in the job message q filling, which sometimes results in a
CPI2417, but not always.
The job log is not full, but job message.

Presumably that means to suggest that "The job log is not full, but the Job Message Queue is full."

Distinguishing the JobLog from the Job Message Queue (JMQ) is important in that scenario; the joblog being viewed as what is either the data spooled or directed to an output file, after the effects of Log-Level (LOG) filtering, and the JMQ as the unfiltered\comprehensive information regarding _all messaging for the job_ [since the last time the JMQ was cleared for re-use or wrapping]. The joblog could appear almost completely empty, even while the JMQ has reached its capacity; though one would hope that /something/ would appear in the spooled joblog to help one to identify what had caused the JMQ to become full. Thus the joblog could never be /full/, but the output device\file into which the /joblog/ is written can be /full/.
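To illustrate the distinction, here is a toy model in Python; it is not anything the OS provides. The capacity value, the message identifiers, and the single severity threshold are all made up for the illustration; the real Log-Level (LOG) filtering is more involved than a one-number cutoff.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    msgid: str      # hypothetical identifiers, not real OS messages
    severity: int


def spooled_joblog(jmq, min_severity):
    """Return the filtered view; the JMQ itself is left untouched."""
    return [m for m in jmq if m.severity >= min_severity]


CAPACITY = 1000  # hypothetical limit, chosen only for the illustration

# The JMQ holds every message sent in the job, regardless of any filtering.
jmq = [Msg("USR0001", 0) for _ in range(CAPACITY - 1)]
jmq.append(Msg("USR0002", 30))

joblog = spooled_joblog(jmq, min_severity=30)
print(len(jmq), len(joblog))  # 1000 1
```

The JMQ in that sketch is at its capacity, yet the filtered joblog shows a single entry; the low-severity messages that filled the queue are exactly the ones the filter hides.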

The recovery text for CPI2417 suggests that one might be able to infer the origin by reviewing "the messages in the job message queue to determine if there is a problem", but the /user/ really only has access to the Job Log vs the referenced Job Message Queue. Thus a possible Design Change Request (DCR) idea is to ask that when the OS decides to send the msg CPI2417, the output device for the joblog should always additionally include a "message" detailing an effective /map/ of the JMQ; perhaps with some counts\statistics that might assist one to visualize the origin of the "full" condition being diagnosed. Or perhaps by some other means, one could request that an effective /formatted dump/ of the JMQ be produced when that condition is diagnosed; a /dump/ in that case meaning something that can be comprehended by someone other than a programmer of the OS. Or the CPI2417 message itself might provide additional details to help one infer the origin, beyond simply what is found in the joblog output effected per the *PRTWRAP.

This job also sends to a dtaq.
If the other job on the system that reads this dataq is not running,
the dataq also fills up.
<ed: Job> S001NITE19 <ed: Usr> trp1
message - Storage limit exceeded for data queue PTMDTAQ001.

Presumably that is a reflection of the msg CPF950A "Storage limit exceeded for data queue &1 in &2.", with an origin from a request to call the Send Data Queue (QSNDDTAQ) API; that error could have been sent as a message or provided as feedback via a[n effective] return code.

Without the actual spooled joblog to show some context of the failure, the above comment is merely speculation.

However, I'm not sure if the filling of the dataq is related.

If the requester adding data queue entries does not temper the work\messages being added to the data queue in response to the "queue full" failure condition [i.e. back off to allow the /other job/, the dequeuing job, an opportunity to decrease the count of messages on the DtaQ], then the filling of the JMQ might easily be a side effect seen only on those occasions when the data queue was allowed to reach that storage limit.
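What tempering the sender might look like, as a minimal sketch in Python rather than in the vendor's language: a bounded queue.Queue stands in for the DtaQ, queue.Full stands in for the "queue full" failure, and the function name, the log list, and the retry parameters are all hypothetical. I have not seen how the vendor program reacts to the failure, so this shows only the general idea.

```python
import queue
import time


def send_with_backoff(dtaq, entry, log, max_tries=5, first_delay=0.01,
                      sleep=time.sleep):
    """Try to enqueue; on 'queue full', wait with a growing delay
    instead of retrying (and logging) immediately."""
    delay = first_delay
    for attempt in range(1, max_tries + 1):
        try:
            dtaq.put_nowait(entry)
            return True
        except queue.Full:
            if attempt == 1:
                # One message for the whole episode, not one per retry.
                log.append("queue full; backing off")
            sleep(delay)   # give the dequeuing job a chance to catch up
            delay *= 2
    log.append("gave up after %d tries" % max_tries)
    return False
```

The point of the design is that a full queue costs the sender at most two logged messages per entry; a sender that instead logs on every failed attempt, in a tight loop, is the sort that could fill its own JMQ while the reader job is down.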

My next question is does a job have more than one message q, one for
the joblog and one for program messages?

Each job has only one Job Message Queue object, which is a composite of all messages sent to any of the program message queues. The External (*EXT) program message queue is effectively active until the EOJ; while the job is active there may be many inactive program message queues, plus every program on the stack at any one moment has an active program message queue [that may be devoid of any messages]. For an ILE program, it is probably acceptable to substitute /procedure/ for /program/, because in many ways they are synonymous for how messages are sent and received [via call stack entries].
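A toy model of that composition, again in Python and again only an illustration: the class, its method names, and the program names PGMA and PGMB are invented, and keying the queues by program name glosses over the fact that messages really belong to call stack entries.

```python
class JobMessageQueue:
    """Toy model: one per job, holding every program message queue."""

    def __init__(self):
        self.queues = {"*EXT": []}   # *EXT exists for the life of the job
        self.active = {"*EXT"}

    def call(self, pgm):
        # A program entering the stack gets an active (maybe empty) queue.
        self.queues.setdefault(pgm, [])
        self.active.add(pgm)

    def end(self, pgm):
        # The queue becomes inactive; its messages still count for the job.
        self.active.discard(pgm)

    def send(self, to_queue, text):
        if to_queue not in self.active:
            raise ValueError("no active program message queue: " + to_queue)
        self.queues[to_queue].append(text)

    def total(self):
        # The job-wide count is the sum over all queues, active or not.
        return sum(len(q) for q in self.queues.values())


job = JobMessageQueue()
job.call("PGMA")
job.call("PGMB")
job.send("PGMA", "status from PGMB")
job.send("*EXT", "note to the external queue")
job.end("PGMB")
job.end("PGMA")
print(job.total())  # 2
```

Both programs have ended, so only *EXT remains active, yet the job-wide total still counts the message left on the now-inactive queue for PGMA.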

<<SNIP ZASNMS subroutine calling Y2SNMGC>>

Presumably the called program effects an invocation of the Send Program Message (QMHSNDPM) API; directly contributing to the total number of messages in the JMQ, and to the specific [active] program message queue [possibly *EXTernal, or even itself] to which the message(s) would get sent.