


Early this morning, I had a 57B5 (5913) pair fail during an IPL.
Because our monitoring software is not running during the IPL, I initially did not see the alert.
Two additional IPLs followed. The LPAR continued to run, but performance was EXTREMELY poor.
The initial "call home" PAL entry did not re-appear, only an increased count in the SAL entry, which I didn't see till later on.

SAL entry
Status  Date      Time      SRC       Resource  Count  PLID
NEW     01/21/19  01:10:38  57B59076  DC05      3
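
Since the only sign of the repeated failure was the rising Count on the existing SAL entry, one possible way to catch it is to compare counts between snapshots. Below is a minimal sketch in Python, assuming the SAL rows have already been exported as text in the column order shown above (how to export them is not covered here). The function names and the earlier count of 1 in the usage lines are hypothetical, for illustration only.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (SRC, Resource)


def parse_sal_line(line: str) -> Tuple[Key, int]:
    """Parse one SAL row laid out as: Status Date Time SRC Resource Count [PLID]."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"unexpected SAL row: {line!r}")
    _status, _date, _time, src, resource, count = fields[:6]
    return (src, resource), int(count)


def count_increases(previous: Dict[Key, int], rows: List[str]) -> List[str]:
    """Return a message for each entry whose count grew; update `previous` in place."""
    alerts = []
    for row in rows:
        key, count = parse_sal_line(row)
        old = previous.get(key, 0)
        if count > old:
            alerts.append(f"{key[0]} on {key[1]}: count {old} -> {count}")
        previous[key] = count
    return alerts


# Hypothetical earlier snapshot: the entry had been seen once before.
seen = {("57B59076", "DC05"): 1}
print(count_increases(seen, ["NEW 01/21/19 01:10:38 57B59076 DC05 3"]))
```

Run on a schedule after each IPL, a check like this would flag an increased count even when no new PAL entry appears.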

With IBM hardware support, it was later discovered via SST (6. Display disk hardware status) that the pair was "Performance degraded", which implied the disk controllers were running with ZERO cache.

Performance degraded-
This state indicates the device is functional but performance may be impacted due to other hardware problems (such as a IOA cache problem).

We identified the "suspect" card, which was NOT operational.
Powered off the slot, powered slot back on.
LPAR disk performance problem resolved.

Over the years, I had two similar failures on a different LPAR with a different card pair.
Those failures were not during an IPL, but while the LPAR was running.
The difference in those two previous failures was that the card/slot was automatically reset by the code.
Previously, the error was an L2 cache error, and the cards needed to do a reset for data integrity reasons.
The controllers went into a recovery that lasted 23 seconds; the LPAR was "suspended" during this period.
During the recovery, several applications failed, which then needed a manual reset/recycle.

1) How does one better monitor for these types of card/pair failures?
2) Why did the 2nd and 3rd IPLs not reset the card?
3) Why did the 2nd and 3rd IPLs not "call home" and create a new PAL entry?
4) Has anyone else in the group experienced similar card/pair failures?

Thank You
_____
Paul Steinmetz
IBM i Systems Administrator

Pencor Services, Inc.
462 Delaware Ave
Palmerton Pa 18071

610-826-9117 work
610-826-9188 fax
610-349-0913 cell
610-377-6012 home

psteinmetz@xxxxxxxxxx
http://www.pencor.com/










