On 21 Jan 2013 12:38, Smith, Mike wrote:
So, I'm running V6R1M0 <<SNIP>> went to look in
the QAUDJRN for some of the entries but didn't find anything.. ANYTHING.
From the time the system was IPL'd Sat/Sun there were no entries written.
QAUDCTL was still *AUDLVL
QAUDLVL still had *PGMFAIL, *AUTFAIL and *SECURITY.
I looked at QAUDJRN and verified that it had an active receiver.
A quick check of ASP1 and I was at 82%, so there was plenty of storage
(even though the machine would have crashed or been unresponsive).
I tested logging in with my user id and giving it a bad password. I
logged back in and checked QAUDJRN for a T/AF record for my ID,
nothing. I changed an object's owner and then changed it back. Again,
looked for my ID in QAUDJRN and nothing.
I ended up opening a ticket with IBM. The first thing they asked me
to do was go have the system create a new journal receiver. The only
entry was the J/PR for a new receiver. In the end they had me change
QAUDCTL to *NONE and then back to *AUDLVL and *NOQTEMP. That
kick-started it into collecting. I've asked them how this could happen but
haven't gotten a response yet.
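For reference, the recovery steps described above might look roughly like
the following CL commands. This is only a sketch, not a record of what was
actually keyed: it assumes the audit journal is QSYS/QAUDJRN and that the
profile running the commands has *AUDIT special authority.

```
/* Detach the current receiver and attach a newly generated one */
CHGJRN JRN(QSYS/QAUDJRN) JRNRCV(*GEN)

/* Turn auditing off, then back on with *AUDLVL and *NOQTEMP */
CHGSYSVAL SYSVAL(QAUDCTL) VALUE(*NONE)
CHGSYSVAL SYSVAL(QAUDCTL) VALUE('*AUDLVL *NOQTEMP')

/* After a failed sign-on attempt, look for new T/AF entries */
DSPJRN JRN(QSYS/QAUDJRN) JRNCDE((T)) ENTTYP(AF)
```

If the last command still shows nothing after a deliberate bad-password
attempt, auditing has presumably not resumed.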
Has this ever happened to anyone? This seems like a major problem. If
I hadn't happened to be working on a problem, it might have been
months before I noticed it. I guess it may have been quicker as we
have folks that look through those daily or every other day.
Not really looking for what the best values are for QAUDCTL/QAUDLVL
but more about the journaling stopping.
Was the message CPI2284 issued during that IPL? Does that message or
CPI2283 show anywhere in the history?
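One way to check, as a rough sketch: search the history log for both
message IDs. The PERIOD value here is an assumption chosen to search from
the beginning of the available log rather than only the current day;
narrow it to the IPL date if the log is large.

```
/* Search QHST for the two auditing messages named above */
DSPLOG LOG(QHST) PERIOD((*AVAIL *BEGIN)) MSGID(CPI2283 CPI2284)
```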
I can not recall for sure, but I think that was a side effect of some
of the audit ending issues I had diagnosed before. IIRC: As part of the
recovery of a failure to audit, which includes an attempt to turn off
auditing, the LIC is informed to do so first, and then the OS
indications are turned off; i.e. after the LIC is informed that auditing
is no longer active, what is visibly manifest to the user via system
values is then changed to reflect that auditing has been changed to
*NONE. Unfortunately the processing that turns off auditing runs in the
same process in which the failure to audit was detected, and if whatever
caused the failure was something about that process for which the OS
invocation to change the system value would also fail, then the
described effect would occur. I do not recall if the msg CPI2283 was
sent immediately or even periodically afterward in that scenario.
There would likely be a VLOG for the process that failed to turn off
auditing. There may even be a VLOG specifically for the LIC request to
stop auditing. Those would of course be logged sometime before the last
recorded audit entry. The most common issue I had investigated several
times [irrespective of the possibility the described effect was also a
symptom] was always due to a lack of PASA [Process Automatic Storage] in
the process. AFaIK if the "audit end action" system value QAUDENDACN
had been established as *PWRDWNSYS, the system would have powered down
with the SRC that indicated the issue; i.e. SRC B9003D10.