Early this morning, a 57B5 (5913) pair failed during an IPL.
Because our monitoring software is not running during the IPL, I initially did not see the alert.
Two additional IPLs followed. The LPAR continued to run, but performance was EXTREMELY poor.
The initial "call home" PAL entry did not reappear; only the count in the SAL entry increased, which I didn't see until later.
SAL entry
Status  Date   	   Time       SRC        Resource     Count   PLID
NEW     01/21/19  01:10:38   57B59076   DC05             3       
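Since the only sign of the repeat failures was the higher Count on this SAL entry, here is a minimal sketch of comparing two text snapshots of the log and flagging any entry whose count grew. It assumes the snapshots use the same column layout as the entry above; how the text is captured from the system is not covered, and the names parse_sal and increased are hypothetical.

```python
import re

# Assumed column layout, taken from the SAL entry shown above:
# Status  Date  Time  SRC  Resource  Count  [PLID]
SAL_LINE = re.compile(
    r"^(?P<status>\S+)\s+(?P<date>\d{2}/\d{2}/\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<src>[0-9A-F]{8})\s+(?P<resource>\S+)\s+(?P<count>\d+)"
)


def parse_sal(text):
    """Return {(src, resource): count} for each line that looks like a SAL entry."""
    counts = {}
    for line in text.splitlines():
        m = SAL_LINE.match(line.strip())
        if m:  # header and blank lines do not match and are skipped
            counts[(m["src"], m["resource"])] = int(m["count"])
    return counts


def increased(previous, current):
    """List (src, resource, old_count, new_count) for new or grown entries."""
    return sorted(
        (src, res, previous.get((src, res), 0), cnt)
        for (src, res), cnt in current.items()
        if cnt > previous.get((src, res), 0)
    )
```

Run against a snapshot taken before and after each IPL, increased() would list 57B59076 on DC05 as soon as its count went up, even though no new PAL entry was created.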
With IBM hardware support, we later discovered via SST ("6. Display disk hardware status") that the pair was "Performance degraded", which implied the disk controllers were running with ZERO cache.
Performance degraded-                              
  This state indicates the device is functional but  performance may be impacted due to other hardware  problems (such as a IOA cache problem). 
We identified the "suspect" card, which was NOT operational.
Powered off the slot, powered slot back on.
LPAR disk performance problem resolved.
I have had two similar failures over the years on a different LPAR with a different card pair.
Those failures did not occur during an IPL, but while the LPAR was running.
The difference in those two previous failures was that the card/slot was automatically reset by the code.
Previously, the error was an L2 cache error and the cards needed to do a reset for data integrity reasons.
The controllers went into a recovery that lasted 23 seconds; the LPAR was "suspended" during this period.
During the recovery, several applications failed, which then needed a manual reset/recycle.
1) How does one better monitor for these types of card/pair failures?
2) Why did the 2nd and 3rd IPLs not reset the card?
3) Why did the 2nd and 3rd IPLs not "call home" and create a new PAL entry?
4) Anyone else from the group experience similar card/pair failures?
Thank You
_____
Paul Steinmetz 
IBM i Systems Administrator 
Pencor Services, Inc. 
462 Delaware Ave 
Palmerton Pa 18071 
610-826-9117 work 
610-826-9188 fax 
610-349-0913 cell 
610-377-6012 home 
psteinmetz@xxxxxxxxxx 
http://www.pencor.com/
  