-----Original Message-----
From: S. Kalyana Sundaram [SMTP:stgsys@squared.co.in]
Sent: Monday, November 24, 1997 4:39 AM
To: 'MIDRANGE-L@midrange.com'
Subject: IPL problems

We recently had a disk unit failure on our AS/400 (V2 R3). Before the crash, an IPL took barely 20 minutes. After disaster recovery, we accidentally applied all PTFs permanently. (All of these PTFs had been applied before, but only temporarily.) Now the system takes nearly 2 hours to IPL.

The IPL proceeds normally until SRC code C900 2AA0. At this code, it takes nearly an hour and a half (the normal time should be less than a minute). In the manual, SRC code C900 2AA0 indicates damaged object notification.

We are not able to understand the reason for this abnormal delay in the IPL. Can anybody please help us figure out the problem?

TIA,
GopiPriya,
Just some guesses here. Maybe your machine pool is set too small? There are several APARs relating to possible problems with this IPL step that have had PTFs issued. Of course, the standard advice here is to get off of V2R3 and install V3R2 with the latest PTF cum package. You don't mention what PTF level you are at for V2R3. See the APARs below that may address the issue:

Item SA48972
APAR Identifier ...... SA48972
Last Changed..97/02/24
OSP-F90 IPL RUNNING FOR OVER 90 HOURS.
Symptom ...... DD LOOP
Status ........... CLOSED DOC
Severity ................... 2
Date Closed ......... 97/02/24
Component .......... 5763SS100
Duplicate of ........
Reported Release ......... 310
Fixed Release ............
Component Name 5763 OS/400 ROC
Special Notice
Current Target Date ..
Flags
SCP ...................
Platform ............
Status Detail: APARCLOSURE - APAR is being closed.
PE PTF List:
PTF List:
Parent APAR:
Child APAR list:

ERROR DESCRIPTION:
* EQUIVALENT ABSTRACT: OSP-F90 IPL RUNNING FOR OVER 90 HOURS_
* See problem summary.

LOCAL FIX: Force a manual IPL, change major system values at IPL, and increase the QMCHPOOL value.

PROBLEM SUMMARY: An F90 with 512MB of main storage was installed from a D70 (192MB) SAVSYS and NONSYS save and then IPLed. The IPL failed to complete after 90 hours. SRCC9002AA0 was displayed for over 24 hours, CPU start/stop showed different TDEs running, and DST via F21 on the panel failed with 0000 0000 in the panel lights. QDBSRVXR is also suspected of attempting to update the cross-reference files after the NONSYS restore.

MSSD shows the machine pool too small and thrashing as the cause of the long IPL. The system values QMCHPOOL and QBASPOOL restored from the D70 SAVSYS were 35,000KB and 50,000KB, which caused the machine pool to be set too small on the F90. Even QPFRADJ was page faulting and so was unable to tune the system by adjusting pool sizes.

Backup Recovery Guide SC41-3304-01, page 11-12, needs to include the system values QMCHPOOL and QBASPOOL among those to be changed when restoring onto a different CPU, with QMCHPOOL set to 15% of main storage size. This will prevent incorrect resources in the machine pool, while the base pool will still get all unallocated storage.

PROBLEM CONCLUSION: The Basic B&R book will be changed for the next release to tell the customer that in the given circumstance the machine pool should be set to 15% of main storage.

Item SA36355
APAR Identifier ...... SA36355
Last Changed..95/03/15
OSP-SRCC9002AA0-LOOP IPL WITH DAMAGED OR DELETED RECOVERY OBJ.
Symptom ...... LP SRCC9002AA0
Status ........... CLOSED PER
Severity ................... 2
Date Closed ......... 94/08/30
Component .......... 5738SS1DB
Duplicate of ........
Reported Release ......... 230
Fixed Release ............ 999
Component Name 5738 OS/400 DAT
Special Notice
Current Target Date ..94/09/30
Flags
SCP ...................
Platform ............
Status Detail: SHIPMENT - Packaged solution is available for shipment.
PE PTF List:
PTF List: Release 230 : SF18234 available 94/09/27 (5038 )
Parent APAR:
Child APAR list:

ERROR DESCRIPTION:
******* (Do NOT alter/erase this or next 3 lines) *******
* EQUIVALENT ABSTRACT: OSP-SRCC9002AA0-LOOP IPL WITH DAMAGED/DELETED RECOVERY OBJ_
* THIS APAR WAS SYSROUTED TO R305 SA36786.
Customer initiated a normal IPL after completing an entire system save. The IPL looped at step SRCC9002AA0. Neither restarting the IPL nor installing the system resolved the problem. Processor light was solid during this step.

LOCAL FIX:

PROBLEM SUMMARY: Customer experienced an unexpected power failure but did successfully IPL and ran a couple of days without problems after the IPL. The customer was running under commitment control. A job that used a file that was active during the power failure was started and ran two days before being cancelled since it was in a loop. After completing an entire system save, the customer initiated a normal IPL. The IPL looped at step SRCC9002AA0. Neither restarting the IPL nor re-installing the system resolved the problem. Processor light was solid during this step.

PROBLEM CONCLUSION: The system encountered a damaged/deleted recovery object related to commitment control during IPL. The delete file code was not properly handling this situation and an infinite loop occurred. The code has been changed to prevent the infinite loop.

Item SA37155
APAR Identifier ...... SA37155
Last Changed..94/10/12
IPL- SRCC9002AA0 LOOPING FOR HOURS IN SCPF
Symptom ...... AB SRCC9002AA0
Status ........... CLOSED CAN
Severity ................... 2
Date Closed ......... 94/10/12
Component .......... 5763SS1DB
Duplicate of ........
Reported Release ......... 305
Fixed Release ............
Component Name 5763 OS/400 DAT
Special Notice
Current Target Date ..
Flags
SCP ...................
Platform ............
Status Detail: APARCLOSURE - APAR is being closed.
PE PTF List:
PTF List:
Parent APAR:
Child APAR list:

ERROR DESCRIPTION: 9406 300 scratch installed with OS/400 and user data and running OK. The WSC controller resource names were incorrect, so SRM data was cleared and the system re-IPLed, and it looped at SRCC9002AA0 for 4 hours in SCPF. Reload of VLIC and OS/400 gave the same symptom. CPU stop/start gave modules #ixmain and #ixxindx. MSSD shows SCPF as current task with CRE stack #cfszrel, qdbapmgr; ICB stack #pminpr2 qwciscfr qrcimpln qdbrcips #ixmain. User has to scratch install again to recover.

LOCAL FIX:

PROBLEM SUMMARY: A 9406 was scratch installed with OS/400 and user data. It came up and appeared to be running fine. However, it was discovered that the workstation controller resource names were incorrect. The SRM (System Resource Management) data was cleared and the system was re-IPLed. The system appeared to be hung at SRCC9002AA0 for 4 hours in SCPF.

PROBLEM CONCLUSION:

TEMPORARY FIX:

COMMENTS: During an IPL the system can be at SRCC9002AA0 for long periods of time (many, many hours). A list of access paths that have to be rebuilt is being generated. The most common reasons for an access path to have to be rebuilt are that the access path was open when an abnormal termination occurred, or that a file was restored but its access path has not yet been rebuilt. The list generation can be very time consuming because the algorithm used requires sequential searches. In V3R1M0 the algorithm was changed. This change will improve the performance at SRCC9002AA0. Also in V3R1M0, a change was made to greatly improve the performance of changing all of the sequence numbers, or changing them to *HLD or *OPN, for rebuilding the access paths at SRCC9002AB0. These changes are made on the EDTRBDAP screen during a manual IPL.

+---
| This is the Midrange System Mailing List!
| To submit a new message, send your mail to "MIDRANGE-L@midrange.com".
| To unsubscribe from this list send email to MIDRANGE-L-UNSUB@midrange.com.
| Questions should be directed to the list owner/operator: david@midrange.com
+---
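
To make the sizing rule in APAR SA48972 concrete, here is a small sketch of the arithmetic behind "QMCHPOOL set to 15% of main storage". It is only an illustration under stated assumptions: the function names are hypothetical, 1 MB is taken as 1024 KB, and the only figures from the APAR are the 192MB and 512MB main storage sizes, the restored 35,000KB QMCHPOOL value, and the 15% guideline. It does not read or change any system value.

```python
# Hypothetical helpers; the names are not from the APAR or from OS/400.
KB_PER_MB = 1024  # assumption: 1 MB = 1024 KB


def recommended_machine_pool_kb(main_storage_mb: int, percent: int = 15) -> int:
    """Machine pool size in KB as a whole-number percentage of main storage."""
    return main_storage_mb * KB_PER_MB * percent // 100


def pool_share_percent(pool_kb: int, main_storage_mb: int) -> float:
    """What percentage of main storage a given pool size represents."""
    return 100.0 * pool_kb / (main_storage_mb * KB_PER_MB)


if __name__ == "__main__":
    restored_qmchpool_kb = 35_000  # QMCHPOOL restored from the D70 SAVSYS (per the APAR)
    for model, main_storage_mb in (("D70", 192), ("F90", 512)):
        share = pool_share_percent(restored_qmchpool_kb, main_storage_mb)
        target = recommended_machine_pool_kb(main_storage_mb)
        print(f"{model}: {restored_qmchpool_kb} KB is {share:.1f}% of main storage; "
              f"15% would be {target} KB")
```

Under these assumptions, 35,000KB is roughly 17.8% of the D70's 192MB but only about 6.7% of the F90's 512MB, where 15% would be about 78,643KB. That is consistent with the APAR's finding that the restored value left the machine pool too small on the larger machine.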