MIDRANGE dot COM Mailing List Archive




RE: RAID 6 rebuild time - can I slash it?




Larry,

Am I correct that you can have 1 hot spare for all your RAID sets, so Jeff would only have to remove 1 drive?

Paul

-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of DrFranken
Sent: Wednesday, March 19, 2014 4:30 PM
To: Midrange Systems Technical Discussion
Subject: Re: RAID 6 rebuild time - can I slash it?

I didn't see a release mentioned, but with IBM i 7.1 you can completely remove the two drives that will become hot spares in advance, with essentially zero impact to production and no outage. Determine which drives you want to be the hot spares, then in Service Tools, under Work with disk units, take them out of the ASP.

That will take some time but runs in the background. When done, they will appear at the bottom of WRKDSKSTS with no drive number. They are still RAID drives at that point, but that will change when you End RAID 5 in preparation for RAID 6. Since they are not in the ASP, they will be the ones chosen as hot spares. NOTE: In a Power5 the LS (load source) drive must be drive 5, 6, or 7 counting from the LEFT; likely it's drive 5. So pick either 6 or 7 to remove so it's a legit hot spare.

If you are on 6.1 then use STRASPBAL to *ENDALC followed by *MOVDTA to drain them as low as possible, likely to 1 or 2 percent. This will reduce the time needed in DST to remove them.

- Larry "DrFranken" Bolhuis

www.frankeni.com
www.iDevCloud.com
www.iInTheCloud.com

On 3/19/2014 3:58 PM, Jeff Crosby wrote:

I'm concerned about the amount of time we will be unprotected.

We're planning to do it all on 1 Saturday.

I don't think it's possible to move the data off the 2 drives that
will be the hot spares. One 8-drive RAID 5 set has the striping
across 4 drives (CEC). The other 8-drive RAID 5 set has the striping
across all 8 drives (expansion unit).



On Wed, Mar 19, 2014 at 3:34 PM, Evan Harris <auctionitis@xxxxxxxxx> wrote:

Hi Jeff

You didn't say (or I missed it), but are you moving the data off that
drive in advance? Hopefully that's a "yes".

Also, is your concern around the timing based on the amount of time
you will be unprotected or the amount of time the system will
potentially be out of service?




On Thu, Mar 20, 2014 at 8:26 AM, Jeff Crosby <jlcrosby@xxxxxxxxxxxxxxxx> wrote:

Paul,

Not adding any drives. Moving data off 1 drive to make the hot spare.


On Wed, Mar 19, 2014 at 2:46 PM, Steinmetz, Paul <PSteinmetz@xxxxxxxxxx> wrote:

Jeff,

Currently, you have 2 8-drive RAID 5 sets, a total of 16 drives. Are
you adding 2 new drives for your hot spare, or remaining at 2 8-drive
RAID 6 sets with hot spare?
If you're not adding drives, you will need to move data off 1 drive
for each of your RAID 5 sets, correct?

Paul

-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of Jeff Crosby
Sent: Wednesday, March 19, 2014 2:34 PM
To: Midrange Systems Technical Discussion
Subject: Re: RAID 6 rebuild time - can I slash it?

Our QPFRADJ is set to 3.

You give times for drives with NO data and times for drives WITH data.
The drives WITH data are my question. Does it make a difference,
for example, if the drive is 20% full vs 40% full?




On Wed, Mar 19, 2014 at 2:07 PM, Steinmetz, Paul <PSteinmetz@xxxxxxxxxx> wrote:

Jeff,

Disk controller, disk type (SSD, 15K or 10K), and disk size are all
factors. IBM has an internal tool that will estimate this.

Below are times from a disk project years back on a 9406-550.

Stop parity
20 minutes for 8 drives if no data on drives
4 hours / 40 minutes per drive if drives contain data

Start parity
Drives with no data - 5 minutes / drive
3 to 18 drives, all with no data, 5 minutes / drive; 15 to 90
minutes.
The time to start RAID on 3 drives vs 18 drives is not much
different = about 30 minutes with NO drives configured.
At least one drive with data, either customer data or LS - 40
minutes / drive
The time to start RAID on 3 drives vs 18 drives is not much
different = about 5.5 to 5.75 hours with some drives configured.

Also check QPFRADJ. Ours was set to 2. The machine pool needed more
memory and the adjuster was not giving it. I set QPFRADJ to 0 and
forced the machine pool to have more memory.
This seemed to help.
What happens at DST? Is it possible the machine pool needs more
memory there? To the best of my knowledge, there is no way of
changing it at DST.


Below is additional RAID detail.

Calculating RAID-5 and RAID-6 capacity

All drives in each RAID-5 or RAID-6 array must have the same
capacity. To calculate the usable capacity of each RAID-5 array,
simply subtract 1 from the actual number of drives in the array and
multiply the result by the individual drive capacity. For RAID-6,
subtract 2 and multiply by the capacity. Using the latest System i
controllers, the minimum number of drives in a single RAID-5 array
is three and in a single RAID-6 array is four. The maximum is the
lesser of 18 or the number of drives that can be placed in the
disk enclosure (12 for a #5095/0595 or 15 for a #5094/5294).
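
The capacity rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the quoted text; the function name, the enclosure_slots parameter, and the 70 GB drive size are assumptions for the example, since the thread never states the actual drive size.

```python
def usable_capacity_gb(drive_count: int, drive_gb: float, raid_level: int,
                       enclosure_slots: int = 18) -> float:
    """Usable capacity of one parity set, per the rule quoted above.

    raid_level is 5 or 6. enclosure_slots is a hypothetical parameter for
    the number of drives the enclosure can hold (12 or 15 in the quoted
    examples); the array maximum is the lesser of 18 and that number.
    """
    parity_drives = {5: 1, 6: 2}[raid_level]   # capacity given up to parity
    min_drives = {5: 3, 6: 4}[raid_level]      # minimums from the quoted text
    max_drives = min(18, enclosure_slots)
    if not min_drives <= drive_count <= max_drives:
        raise ValueError(
            f"RAID-{raid_level} needs {min_drives} to {max_drives} drives, "
            f"got {drive_count}"
        )
    return (drive_count - parity_drives) * drive_gb


# Hypothetical 70 GB drives in an 8-drive set, before and after conversion.
raid5 = usable_capacity_gb(8, 70.0, 5)
raid6 = usable_capacity_gb(8, 70.0, 6)
print(raid5, raid6)
```

With these assumed numbers, moving an 8-drive set from RAID-5 to RAID-6 gives up one more drive's worth of capacity, which is why the amount of data already on the system matters before the conversion.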

There is significant performance overhead involved in RAID-6
environments with heavy writes to disk application workload. A
RAID-5 environment requires four physical disk accesses for each
system issued write operation. A RAID-6 environment requires six
physical disk accesses for the single system write operation. This
is because two parity sets are maintained rather than just one. We
discuss this more fully in the next section ("The RAID-5 and
RAID-6 Write Penalty"). While it can protect against disk outages
(which should be relatively rare with the SCSI drives used by the
System i, but less so
for the Serial ATA (SATA) drives where RAID-6 was initially
implemented on other platforms), it does nothing to protect
against disk controller or other component outages. If you need
something greater than
RAID-5 protection, mirroring (with double capacity disks) is
usually preferable to RAID-6 (from both an availability and a
performance perspective).


"RAID levels 3, 4, 5, and 6 all use a similar parity-based
approach to data protection. Simple arithmetic is used to maintain
an extra drive (2 for RAID-6) that contains parity information. In
the implementation for RAID-5 and RAID-6 from IBM, it is the
capacity equivalent of one or two extra drive(s), and the parity
data is physically striped across multiple drives in the RAID
array. If a drive fails, simple arithmetic is used to reconstruct
the missing data. It is beyond the scope of this paper to discuss
this in more detail. It works. So well in fact, that the RAID-5 is
widely used throughout the industry."
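
The quoted paper does not spell out the "simple arithmetic". For single-parity RAID-5 it is conventionally a bytewise XOR across the stripe, and the sketch below shows that idea only. The block contents and the function name are made up for illustration, this is not IBM's controller code, and RAID-6's second parity uses a more involved calculation that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks (the RAID-5 parity calculation)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks from one stripe, each on a different drive.
data = [b"\x01\x02\x03", b"\x0f\x10\x11", b"\xaa\x55\xff"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] failed: XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR is its own inverse, the same routine both builds the parity and rebuilds any one missing block, which is why a single-parity set survives exactly one drive failure.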

The RAID-5 and RAID-6 Write Penalty

Parity information is used to provide disk protection in RAID-5 and
RAID-6 (also in RAID-3 and RAID-4, but because they are not used
on System i and function similarly to RAID-5, we do not discuss
them further). To maintain the parity, whenever data changes on
any of the disks, the parity must be updated. For RAID-5, this requires:
- An extra read of the data on the drive being updated (to
  determine the current data on the drive that will be changed).
  This is needed to allow the calculation of any needed parity
  changes.
- A read of the "parity information" (to retrieve the current
  parity data that will need to be updated), and
- After some quick math to determine the new parity values, a write
  of the updated parity information to the drive containing the
  parity.
This nets out to 2 extra reads and 1 extra write that all occur
behind the scenes and are not reported on the standard performance
reports, plus the actual writing of the data, for a total of 4 I/O
operations.

With RAID-6, these activities also occur, but for 2 sets of parity
drives. Therefore, in addition to the initial data read, there are
2 parity reads and 2 parity writes, and the actual write of the
data. This means three extra reads and two extra writes, plus the
actual data update, for a total of six I/O operations.

These extra activities have been termed the RAID Write Penalty.
System i RAID-5 subsystems reduce/eliminate the write penalty via a
true write cache (not just a write buffer). See Appendix B,
"Understanding disk write cache" on page 69 for more information.
This cache disconnects the physical disk write from the reporting
of the write complete back to the processor. The physical write
(usually) occurs after System i has been told that the write has
been performed. This hides the write penalty (actually write cache
allows the write penalty to overlap with other disk and system
activities so that it usually does not have a performance impact).
The write cache also supports greater levels of disk activity in
mirrored environments as well.
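
The I/O counts in the write-penalty discussion above follow one pattern, sketched here in Python. The function name and the dictionary layout are assumptions made for the example; the counts themselves come straight from the quoted text, and the sketch ignores the write cache that usually hides the penalty.

```python
def physical_ios_per_write(parity_sets: int) -> dict:
    """Physical disk accesses behind one system-issued write.

    parity_sets is 1 for RAID-5 and 2 for RAID-6.
    """
    reads = 1 + parity_sets    # old data, plus the old value of each parity
    writes = 1 + parity_sets   # new data, plus the new value of each parity
    return {"reads": reads, "writes": writes, "total": reads + writes}


raid5 = physical_ios_per_write(1)
raid6 = physical_ios_per_write(2)
print(raid5["total"], raid6["total"])
```

For RAID-5 this gives 2 reads and 2 writes (4 accesses), and for RAID-6 it gives 3 reads and 3 writes (6 accesses), matching the totals quoted above.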

Paul

-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of DrFranken
Sent: Wednesday, March 19, 2014 12:34 PM
To: Midrange Systems Technical Discussion
Subject: Re: RAID 6 rebuild time - can I slash it?

Mostly the size of the drives is the major factor.

Yes, data does need to be moved about to allow for the various RAID
stripes, so if the system will END UP more than 70% full then the
prepare step will take longer as well. If it's cool to delete that
130+ GB virtual tape then I would do so, as it can't hurt for sure!

- Larry "DrFranken" Bolhuis

www.frankeni.com
www.iDevCloud.com
www.iInTheCloud.com

On 3/19/2014 12:29 PM, Jeff Crosby wrote:

All,

On a Saturday of my choosing, our HW service provider is going to
take our System i 520 from 2 8-drive RAID 5 sets to 2 8-drive
RAID 6 sets w/hot spare. Yes I'm taking a SAVE21 before.

Building the RAID 6 will be time-consuming. My question is this:
Is the amount of time it will take based on the size of the
*drives*? Or on the amount of *data* on the drives?

I'm asking because we have a virtual tape drive (TAPVRT01) that
is taking up 123GB. If I understand virtual tape correctly, it
cannot be "shrunk."
I could remove that device (or whatever "piece" of the image
catalog structure) beforehand and recreate it afterward. I could
also do other cleanup, like old job logs, etc.

But only if it actually saves time. If the RAID 6 build time is
tied to drive size to where a cleanup won't make a difference,
then I won't bother.

Thanks.


--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.

--
Jeff Crosby
VP Information Systems
UniPro FoodService/Dilgard
P.O. Box 13369
Ft. Wayne, IN 46868-3369
260-422-7531
www.dilgardfoods.com

The opinions expressed are my own and not necessarily the opinion
of my company. Unless I say so.
--

Regards
Evan Harris

This mailing list archive is Copyright 1997-2014 by MIDRANGE dot COM and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited.