Peter Dow wrote:
One of my customers was running payroll and the job seemed to
hang. Looking at his call stack I saw:
Type  Program                 Stmt   /Inst  Procedure
      QCMD        QSYS                   /04F3
      INLPGMC0    QGPL        15400  /00D6
      GPRM950     GP#LIBR                   _QRNP_PEP_GPRM950
      GPRM950     GP#LIBR     1790          GPRM950
      PPCP400     HS#LIBR_PP                _CL_PEP
      PPCP400     HS#LIBR_PP  32900         PPCP400
      PPCU669     HS#LIBR_PP  3700   /001F
      QCPEX0FL    QSYS                   /0121
      QCPEXCON    QSYS                   /0AB7
      QCPGENIO    QSYS                   /01DF
      QDBGETM     QSYS                   /0533
      QWTPECTL    QSYS                   /013D
      QMHDLVMS    QSYS                   /01A3
      QMHDSMSS    QSYS                   /1772
      QWSGET      QSYS                   /065D
      QT3REQIO    QSYS                   /0253
which to me looks like it's waiting for a response from the user;
however, his (Rumba) session shows input inhibited. WRKACTJOB
shows the job in a DSPW status.
According to the stack, the job is sending a status message to the message line. If the job were awaiting a reply, the job status would be MSGW and the QMHRCVM [or similar name; i.e. a receive message reply processor] would be on the stack instead of QMHDSMSS [the display status message processor].
The joblog shows:
3700 - CPYF FROMFILE(PPPPCWK) TOFILE(QTEMP/PPBPCWK)
MBROPT(*REPLACE) CRTFILE(*YES)
INCREL((*IF WHOSP# *LT '00001')
(*OR WHOSP# *GT '99999'))
Physical file PPBPCWK created in library QTEMP.
Member PPPPCWK added to file PPBPCWK in QTEMP.
I checked the program, and it has a MONMSG after that CPYF:
MONMSG MSGID(CPF2869 CPF2817 CPC2957)
The issue is not specifically related to those messages, which can
be issued by the CPYF. Note also that a completion message cannot
be monitored; the CPYF does not show that CPC2957 would be issued
as an escape\monitor-capable message.
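As a rough illustration of that point, the sketch below keeps the MONMSG for the escape messages only and then looks at the last message on the program's own message queue to see whether CPC2957 arrived. This is an assumption about how one might detect the completion message, not what PPCU669 actually does; the variables &MSGID and &NOCOPY are hypothetical, and the sketch assumes the completion message is the last one sent to the program after the CPYF.

```cl
PGM
  DCL VAR(&MSGID)  TYPE(*CHAR) LEN(7)
  DCL VAR(&NOCOPY) TYPE(*LGL)  VALUE('0')  /* hypothetical flag */

  CPYF FROMFILE(PPPPCWK) TOFILE(QTEMP/PPBPCWK) +
       MBROPT(*REPLACE) CRTFILE(*YES) +
       INCREL((*IF WHOSP# *LT '00001') +
              (*OR WHOSP# *GT '99999'))
  /* Only escape messages can be monitored */
  MONMSG MSGID(CPF2869 CPF2817)

  /* Look at the last message on this program's queue, leaving it */
  /* in the joblog; assumed to be the CPYF completion message     */
  RCVMSG MSGTYPE(*LAST) RMV(*NO) MSGID(&MSGID)
  IF COND(&MSGID *EQ 'CPC2957') THEN(CHGVAR VAR(&NOCOPY) VALUE('1'))
ENDPGM
```

If other messages can follow the completion message in a given program, the received message ID would need to be checked in a loop rather than with a single RCVMSG.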
I checked the PPPPCWK file and all records have WHOSP# = 00001,
which means the above CPYF stmt would select nothing, which
should give a CPC2957 completion message (which it did when I
ran the CPYF manually).
Again, the issue has effectively nothing to do with the CPYF, so
there is little to infer from it. What might be of interest is
how /noisy/ the request was with its status messages.
I had him cancel his Rumba session (without canceling the job).
His job then showed status DSC (disconnected). He started Rumba
again, reconnected to the job, and it took off and completed
normally.
I believe the reconnection was functional because the device
had previously been disconnected due to the first error listed
below, i.e. the "not active" condition, and the device recovery
had been defined as *MSG for which the CPF509F-related messaging
allowed a monitored\handled recovery.
Device MISPC02S2 session not active.
Input or Output request failed. See message CPF5170.
Job connected again. Sign on information ignored.
Job has successfully connected after I/O error.
? C
Cancel reply received for message CPF509F.
Error while processing file QDDSPMSG in library QSYS.
No records copied from file PPPPCWK in PP#FILE. <= CPC2957
4100 - CLRPFM FILE(PPPPCWK)
Member PPPPCWK file PPPPCWK in PP#FILE cleared.
4500 - CPYF FROMFILE(QTEMP/PPBPCWK) TOFILE(PPPPCWK)
MBROPT(*REPLACE)
Empty member PPPPCWK in file PPBPCWK in library QTEMP is
not copied.
Copy command ended because of error. <= CPF2817
The CPF2817 was monitored for and the program finished normally.
From the joblog and continued processing, it looks like everything is probably working as designed.
The question is, why did it pause at the CPC2957 completion message?
The device disconnected, probably due to a communication error. The time it actually disconnected may not be the time of the "not active" message. The "not active" condition was detected at about the moment the CPC2957 completion message was sent, which suggests that the disconnected device was detected as a direct consequence of the completion message being sent to the UIM or DSPF; i.e. the loss of communication with the device probably was detected because sending the completion message is an attempt to perform I/O to the device via the UIM or DSPF.
Are they missing a PTF? Is it a Rumba problem? Something completely different?
While there may be a problem with the comm, the emulation, or something else, unless the devices are commonly disconnected, it
may just be a hiccup.
To further explain...
A program which performs no I\O to the display device can
continue processing unaffected by the loss of the device.
However, the job was sending status messages _asynchronously_ to the
virtual display device, sent from QDBGETM. When a status message is
sent to *EXT but cannot be delivered [to the QDDSPMSG message
subfile] because the device is in error, that is a recoverable
error. The /recoverable/ error for that I\O request generally does
not impair the processing of a program; however, the _completion_
message sent to the display file, UIM panel or menu, is not an I\O
that would be considered "recoverable" because it is not simply
/status/.
I am not sure whether *MSG might effect the I\O error [see the
QDEVRCYACN system value] for which the workstation support would
then await reconnect, but from the stack it would seem so. So anyhow
the device remained "input inhibited" due to a combination of the
QDEVRCYACN setting and the way the virtual device and emulation
software interact. That the job was able to be reattached via
that device indicates that things worked well. Review the system
value setting to decide if a different result might be
preferable.
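For reference, a minimal sketch of how the device recovery action could be reviewed and, if desired, overridden for a single job. The choice of *DSCMSG here is only an example of an alternative value, not a recommendation for this customer.

```cl
/* Show the system-wide device recovery action */
DSPSYSVAL SYSVAL(QDEVRCYACN)

/* Override it for the current job only; *DSCMSG is an example value */
CHGJOB DEVRCYACN(*DSCMSG)
```

Changing the value at the job level leaves other interactive jobs under the system value, which makes it easier to try a different setting for just the payroll session.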
FWiW, a CHGJOB STSMSG(*NONE) before performing the copy would
have prevented some wasted processing on sending the status
messages; i.e. what was seen active in the quoted stack.
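A minimal sketch of that suggestion, assuming the change is made inside the CL program around the CPYF. The variable &STSMSG is hypothetical and is used only to restore the job's prior setting afterward.

```cl
PGM
  DCL VAR(&STSMSG) TYPE(*CHAR) LEN(7)

  /* Remember the job's current status message handling */
  RTVJOBA STSMSG(&STSMSG)

  /* Suppress status messages for the duration of the copy */
  CHGJOB STSMSG(*NONE)

  CPYF FROMFILE(PPPPCWK) TOFILE(QTEMP/PPBPCWK) +
       MBROPT(*REPLACE) CRTFILE(*YES) +
       INCREL((*IF WHOSP# *LT '00001') +
              (*OR WHOSP# *GT '99999'))
  MONMSG MSGID(CPF2869 CPF2817)

  /* Restore the prior setting */
  CHGJOB STSMSG(&STSMSG)
ENDPGM
```

This only removes the status messages; the completion message is still sent, so it would not by itself avoid the I\O to a device that has already lost its session.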
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page.