On 06-Dec-2017 15:29 -0700, RIAAN RAS/TCM Software & Services/ZA wrote:
We have a client with a system running V5R4. The system crashed a
week ago with three cache batteries failed, one disk failure and
another disk logging errors. The system had at least <???>
abnormal shut downs. After the hardware was restored, did <???>
the backups fail due to damaged objects.
  Saves are designed to stop [with a fatal error] upon encountering 
objects diagnosed as damaged, because the integrity of the save would be 
questionable with regard to what was planned to be saved versus what 
could be, or already had been, dumped to media for that save request.  
An object found during the save to have been marked as damaged 
previously, before that save, can and will be omitted from the save [and 
a diagnostic is logged to indicate that the object was not included]; 
i.e. the feature ensures consistency between what is planned to be saved 
[by the OS] and what can actually be dumped to media [by the LIC].
Advised customer to perform a reclaim storage which failed.
  One phase of reclaim storage is "damage notification"; any damage 
found, or found previously such that an object was already marked as 
damaged, is reported to the [*sysopr message queue and the] history log.  
Note that the reclaim cannot find every object that may later prove to 
be damaged when that object is saved or when it is "used" for its 
intended purpose; i.e. some damage is not detectable by RCLSTG, given 
only the limited access/touch that the reclaim feature performs.
I IPL'ed the system and performed another RCLSTG. The "Read objects
from disk" failed after 68% with MCH3601 and MCH3202.
I ran a RCLSTG *DBXREF which completed successfully and restarted
the full RCLSTG. This time the "Read objects from disk"
step completed, and failed immediately after that.
All attempts since have been unsuccessful.
The CHKPRDOPT *OPSYS reported no errors.
Has anybody had a similar problem before? I cannot get any
information on the MCH3203 error
  Is that msg MCH3203, or msg MCH3202, or both?  Which one was fatal to 
the reclaim request, i.e. terminated it, with the apparent effect being 
the following [which implies MCH3202]?
CPF9999 Diagnostic 40 06/12/17 20:57:42.887600
  QMHUNMSG *N QUIMNDRV QSYS 060C
Thread . . . . : 00000004
Message . . . . : Function check. MCH3202 unmonitored by QRCLENUP
   at statement *N, instruction X'0060'.
  That is only a portion of the details from the "function check" 
message, not the msg MCH3202 itself; it is rather meaningless, except to 
imply that the actual failure, the preceding condition, was msgMCH3202 
T/QRCLENUP x/0060.  Even so, that comes without any further context, 
such as the Return Code (RC) that defines the Minor Code for the 
exception diagnosed by the Licensed Internal Code (LIC).  The symptom 
string derived from the above is merely the unhelpful:
    msgCPF9999 *FC F/QMHUNMSG rcMCH3202
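
  As a rough illustration only, the implied symptom [msgMCH3202 
T/QRCLENUP x/0060] could be pulled out of the function check text with a 
small script.  This sketch assumes the message wording matches the 
excerpt quoted above exactly; the function name and the regular 
expression are my own invention, not anything supplied by the OS.

```python
import re

# Pattern for the function-check text as quoted in the joblog excerpt.
# Assumption: the wording is exactly as in that excerpt.
FC_PATTERN = re.compile(
    r"Function check\.\s+(?P<msgid>[A-Z]{3}[0-9A-F]{4})\s+unmonitored by\s+"
    r"(?P<pgm>\S+)\s+at statement\s+(?P<stmt>\S+),\s+instruction\s+"
    r"X'(?P<inst>[0-9A-F]+)'"
)


def symptom_from_function_check(text):
    """Return a symptom string like 'msgMCH3202 T/QRCLENUP x/0060', or None."""
    # Collapse line wraps so the pattern can span the wrapped joblog text.
    flat = " ".join(text.split())
    match = FC_PATTERN.search(flat)
    if match is None:
        return None
    return f"msg{match['msgid']} T/{match['pgm']} x/{match['inst']}"


sample = """Message . . . . : Function check. MCH3202 unmonitored by QRCLENUP
   at statement *N, instruction X'0060'."""
print(symptom_from_function_check(sample))
```

  Note that this recovers only the "To program" side of the symptom; 
the "From program" and the RC are simply not present in that text.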
  The details for the apparent actual failing condition/message, the 
msgMCH3202 RC####, are recorded in a VLIC Log (VLog), if I recall 
correctly, as a VL0200####.  But details from a spooled joblog would 
show the "From program" [and instruction] as additional context, 
revealing what might be the origin of the difficulty for the LIC.  For 
example, the condition being issued from #dbdschk [symptom F/#dbdschk] 
might suggest a duplicate of APAR MA42142 [for which PTF MF55800 exists 
for LIC level V5R4M5].  That APAR, however, does not mention what the 
"To program" might be; so although the above FC reveals a symptom 
T/QRCLENUP, the APAR itself carries no similar implication.
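
  To help settle which of MCH3202, MCH3203 and MCH3601 actually appear, 
a copy of the spooled joblog saved as plain text could be scanned for 
the message identifiers.  A minimal sketch, assuming the joblog has 
already been exported to text by whatever means; the function name and 
the sample text are hypothetical:

```python
import re
from collections import Counter

# Matches standalone MCHxxxx identifiers; IDs fused into symptom strings
# such as "msgMCH3202" are deliberately not counted.
MCH_ID = re.compile(r"\bMCH[0-9A-F]{4}\b")


def count_mch_ids(joblog_text):
    """Count how often each MCHxxxx message ID occurs in the text."""
    return Counter(MCH_ID.findall(joblog_text))


# Hypothetical sample; a real joblog would be read from the exported file.
sample_joblog = """Function check. MCH3202 unmonitored by QRCLENUP
MCH3601 Escape
MCH3202 Escape"""
for msgid, hits in sorted(count_mch_ids(sample_joblog).items()):
    print(msgid, hits)
```

  An identifier that never shows up in the count, e.g. MCH3203, was 
probably a typo in the report rather than a second failure.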