Looks like I am entering the "high maintenance" era on my aging
BladeCenter S and my JS12 blade...I see messages in the AMM indicating
something is amiss in VIOS:
Blade_01 09/01/18 08:07:11 0x800000ff OS status (OEMOS
Reporting) OS detected fault. See OS error log for details.
Took a look at the error logs on the blade running VIOS and see this:
Detail Data
SYMPTOM CODE
256
SOFTWARE ERROR CODE
-9035
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'234'
FAILING MODULE
rpc.lockd
---------------------------------------------------------------------------
LABEL: SC_DISK_ERR2
IDENTIFIER: B6267342
Date/Time: Sat Sep 1 07:32:05 CDT 2018
Sequence Number: 4972999
Machine Id: 000270CAD400
Node Id: vios
Class: H
Type: PERM
WPAR: Global
Resource Name: hdisk1
Resource Class: disk
Resource Type: scsd
Location: U78A5.001.WIH1BE8-P1-D1
VPD:
Manufacturer................IBM-ESXS
Machine Type and Model......ST9146802SS
FRU Number..................42D0422
ROS Level and ID............42353239
Serial Number...............3NM6QMKN
EC Level....................H17923Y
Part Number.................42C0251
Device Specific.(Z0)........000005329F001002
Device Specific.(Z1)........0429B529
Device Specific.(Z2)........1000
Device Specific.(Z3)........000-0
Device Specific.(Z4)........0001
Device Specific.(Z5)........22
Device Specific.(Z6)........H17923Y
Description
DISK OPERATION ERROR
Probable Causes
DASD DEVICE
Failure Causes
DISK DRIVE
DISK DRIVE ELECTRONICS
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
Detail Data
PATH ID
0
SENSE DATA
0600 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0102 0000 7000 0200
0000 0018 0000 0000 0400 0000 0000 0000 0204 0000 FFFF FFFF FFFF FFFF 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0086 0000
0000 0012 001A
Taking a look at the devices I see some info that I can't quite make
full sense of:
$ lsvg -pv rootvg
rootvg:
PV_NAME   PV STATE   TOTAL PPs   FREE PPs   FREE DISTRIBUTION
hdisk0    active     546         440        109..102..11..109..109
hdisk1    missing    546         542        110..105..109..109..109
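
A quick sanity check on those numbers, as a minimal sketch: the two PV rows are pasted in by hand (with the wrapped FREE DISTRIBUTION field rejoined), and `pv_rows` and `pv_summary` are names made up for this example. It computes used PPs as TOTAL minus FREE and confirms that the distribution regions add up to the FREE PPs column.

```shell
# PV rows copied from the lsvg -pv output above (header line omitted,
# wrapped FREE DISTRIBUTION values rejoined).
pv_rows='hdisk0 active 546 440 109..102..11..109..109
hdisk1 missing 546 542 110..105..109..109..109'

# For each PV: used PPs = TOTAL - FREE, and check that the regions in
# FREE DISTRIBUTION sum to the FREE PPs column.
pv_summary() {
    awk '{
        n = split($5, region, "[.][.]")
        sum = 0
        for (i = 1; i <= n; i++) sum += region[i]
        printf "%s state=%s used=%d dist_sum=%d dist_ok=%s\n",
               $1, $2, $3 - $4, sum, (sum == $4 ? "yes" : "no")
    }'
}

printf '%s\n' "$pv_rows" | pv_summary
```

By this arithmetic hdisk0 carries 106 used PPs and hdisk1 only 4, which is not what a full mirror of rootvg would look like.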
hdisk1 is missing, presumably because it has failed....but I *thought*
hdisk0 and hdisk1 were mirrored. I can't seem to find evidence of
that. I ran lsdev to check the status:
$ lsdev -type disk
name status description
hdisk0 Available SAS Disk Drive
hdisk1 Available SAS Disk Drive
..........
Although hdisk1 reports as "missing" in lsvg, lsdev still shows it with
a status of "Available"...seems a little confusing. I guess Available
could mean "the power is on and it is spinning".....
And lsvg to see what the volume group reports:
$ lsvg -lv rootvg
rootvg:
LV NAME      TYPE     LPs  PPs  PVs  LV STATE      MOUNT POINT
hd5          boot     1    1    1    closed/syncd  N/A
hd6          paging   2    2    1    open/syncd    N/A
paging00     paging   4    4    1    open/syncd    N/A
hd8          jfs2log  1    1    1    open/syncd    N/A
hd4          jfs2     1    1    1    open/syncd    /
hd2          jfs2     21   21   1    open/syncd    /usr
hd9var       jfs2     3    3    1    open/syncd    /var
hd3          jfs2     14   14   1    open/syncd    /tmp
hd1          jfs2     40   40   1    open/syncd    /home
hd10opt      jfs2     13   13   1    open/syncd    /opt
hd11admin    jfs2     1    1    1    open/syncd    /admin
livedump     jfs2     1    1    1    open/syncd    /var/adm/ras/livedump
lg_dumplv    sysdump  4    4    1    open/syncd    N/A
fwdump       jfs2     3    3    1    closed/syncd  /var/adm/ras/platform
loglv00      jfslog   1    1    1    closed/syncd  N/A
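
The LPs and PPs columns are one way to test for a mirror: an LV with a single copy has PPs equal to LPs, while a two-way mirror has PPs equal to twice the LPs. A minimal sketch of that check follows, with the LV name, LPs, PPs and PVs columns pasted in by hand; `lv_rows` and `mirror_report` are names invented for this example.

```shell
# LV name, LPs, PPs and PVs columns copied from the lsvg -lv output above.
lv_rows='hd5 1 1 1
hd6 2 2 1
paging00 4 4 1
hd8 1 1 1
hd4 1 1 1
hd2 21 21 1
hd9var 3 3 1
hd3 14 14 1
hd1 40 40 1
hd10opt 13 13 1
hd11admin 1 1 1
livedump 1 1 1
lg_dumplv 4 4 1
fwdump 3 3 1
loglv00 1 1 1'

# Copies per LV = PPs / LPs. Count single-copy vs mirrored LVs and
# total up the PPs so the figure can be compared with lsvg -pv.
mirror_report() {
    awk '{
        copies = $3 / $2
        total_pps += $3
        if (copies > 1) mirrored++
        else single++
    }
    END {
        printf "single_copy=%d mirrored=%d total_pps=%d\n",
               single, mirrored, total_pps
    }'
}

printf '%s\n' "$lv_rows" | mirror_report
```

Every LV comes out as single-copy, and the 110 PPs in total match the used PPs from lsvg -pv (546 - 440 = 106 on hdisk0 plus 546 - 542 = 4 on hdisk1). So at least one unmirrored LV appears to have PPs on hdisk1; something like `lspv -lv hdisk1` should show which, assuming that flag is available at this VIOS level.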
Indications are that the VG isn't a mirrored pair. OR, the mirror has
"broken" and I am just seeing the contents of hdisk0....could that be?
I am assuming that the disk HAS failed and needs to be replaced. I
ordered a couple of disks (one extra for backup) and they should be
here in about a week. My concern is in properly replacing the failed
drive. The fact that it has failed seems to have had no effect on VIOS
and I am not "missing" any data. So I think the approach should be to
shut down the blade, replace the disk, make sure that it is "seen" by
the blade and then add it to the volume group as a mirrored drive. Does
that make sense? That *seems* like the way to go. I don't see any
exact matches but a couple of IBM support articles seem to be pointing
the way:
http://www-01.ibm.com/support/docview.wss?uid=isg3T1000426
http://www-01.ibm.com/support/docview.wss?uid=isg3T1020096
Not many BladeCenter savvy folks on this list but it's worth
checking.... Just want to take the correct approach to get the disk
replaced.
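
One possible shape for the replacement, as a dry-run sketch that only prints commands and runs nothing. The ordering is an assumption and has not been verified on a JS12, and `replacement_plan` is a made-up helper name. It also assumes the new drive comes back as hdisk1, which may not be the case. Since lsvg -pv shows 4 PPs still in use on hdisk1, reducevg may refuse until whatever LV lives there has been dealt with.

```shell
# Dry run only: prints a candidate command sequence for the padmin shell.
# Arguments: volume group, failed disk, surviving disk.
# Assumption: the replacement drive is configured under the same hdisk name.
replacement_plan() {
    vg=$1 bad=$2 good=$3
    cat <<EOF
lspv -lv $bad
reducevg $vg $bad
rmdev -dev $bad
# power down, swap the drive, boot VIOS
cfgdev
lspv
extendvg $vg $bad
mirrorios -defer $bad
bootlist -mode normal $good $bad
EOF
}

replacement_plan rootvg hdisk1 hdisk0
```

As far as I know, mirrorios reboots the VIOS when it finishes unless -defer is given, in which case the reboot still has to happen later, so either way it is worth scheduling a window for it.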