RE: RAID 6 rebuild time - can I slash it? -- MIDRANGE-L

Jeff,

Disk controller, disk type (SSD, 15K or 10K), and disk size are all factors.
IBM has an internal tool that will estimate this.

Below are times from a disk project years back on a 9406-550.

Stop parity
Drives with no data - 20 minutes for 8 drives
Drives with data - 4 hours / 40 minutes per drive

Start parity
Drives with no data - 5 minutes / drive
3 to 18 drives, all with no data - 5 minutes / drive, so 15 to 90 minutes. The time to start RAID on 3 drives vs. 18 drives is not much different: about 30 minutes with NO drives configured.
At least one drive with data, either customer data or load source (LS) - 40 minutes / drive. The time to start RAID on 3 drives vs. 18 drives is not much different: about 5.5 to 5.75 hours with some drives configured.

Also check QPFRADJ. Ours was set to 2. The machine pool needed more memory and the adjuster was not giving it. I set QPFRADJ to 0 and forced more memory into the machine pool. This seemed to help.
What happens at DST? It is possible the machine pool needs more memory there too, but to the best of my knowledge there is no way of changing it at DST.

Below is additional RAID detail.

Calculating RAID-5 and RAID-6 capacity
All drives in each RAID-5 or RAID-6 array must have the same capacity. To calculate the
usable capacity of each RAID-5 array, simply subtract 1 from the actual number of drives in
the array and multiply the result by the individual drive capacity. For RAID-6, subtract 2 and
multiply by the capacity. Using the latest System i controllers, the minimum number of drives
in a single RAID-5 array is three and in a single RAID-6 array is four. The maximum is the
lesser of 18 or the number of drives that can be placed in the disk enclosure (12 for a
#5095/0595 or 15 for a #5094/5294).
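
As a quick illustration of the capacity rule above, here is a minimal Python sketch. The function name and the 70 GB drive size are made up for the example; the minimums (3 and 4 drives) and the 18-drive ceiling are the ones quoted in the text, and the enclosure limit is not modelled.

```python
def usable_capacity_gb(drive_count, drive_gb, raid_level):
    """Usable capacity of one array whose drives all have the same size."""
    parity_equiv = {5: 1, 6: 2}[raid_level]   # drives' worth of capacity used for parity
    minimum = {5: 3, 6: 4}[raid_level]        # minimum array sizes quoted above
    if not minimum <= drive_count <= 18:
        raise ValueError(f"RAID-{raid_level} needs {minimum} to 18 drives")
    return (drive_count - parity_equiv) * drive_gb

# Hypothetical 8-drive set of 70 GB drives
print(usable_capacity_gb(8, 70, 5))  # 490
print(usable_capacity_gb(8, 70, 6))  # 420
```

So moving an 8-drive set from RAID-5 to RAID-6 gives up one more drive's worth of capacity.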

There is significant performance overhead in RAID-6 environments when the application
workload writes heavily to disk. A RAID-5 environment requires four physical disk accesses for
each system issued write operation. A RAID-6 environment requires six physical disk
accesses for the single system write operation. This is because two parity sets are
maintained rather than just one. We discuss this more fully in the next section ("The RAID-5
and RAID-6 Write Penalty"). While RAID-6 can protect against disk outages (which should be
relatively rare with the SCSI drives used by the System i, but less so for the Serial ATA (SATA)
drives where RAID-6 was initially implemented on other platforms), it does nothing to protect
against disk controller or other component outages. If you need something greater than
RAID-5 protection, mirroring (with double capacity disks) is usually preferable to RAID-6 (from
both an availability and a performance perspective).

"RAID levels 3, 4, 5, and 6 all use a similar parity-based approach to data protection. Simple
arithmetic is used to maintain an extra drive (2 for RAID-6) that contains parity information. In
the implementation for RAID-5 and RAID-6 from IBM, it is the capacity equivalent of one or
two extra drive(s), and the parity data is physically striped across multiple drives in the RAID
array. If a drive fails, simple arithmetic is used to reconstruct the missing data. It is beyond the
scope of this paper to discuss this in more detail. It works. So well, in fact, that RAID-5 is
widely used throughout the industry."
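
The quoted text only says "simple arithmetic". For single parity (RAID-5 style) that arithmetic is normally a byte-wise XOR, so here is a small Python sketch on that assumption. The three data strips are invented values, and the second RAID-6 parity, which uses different math, is not shown.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data strips belonging to one stripe
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] failed:
# XOR the surviving strips with the parity to get the missing strip back.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```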

The RAID-5 and RAID-6 Write Penalty
Parity information is used to provide disk protection in RAID-5 and RAID-6 (also in RAID-3
and RAID-4, but because they are not used on System i and function similarly to RAID-5, we
do not discuss them further). To maintain the parity, whenever data changes on any of the
disks, the parity must be updated. For RAID-5, this requires:
- An extra read of the data on the drive being updated (to determine the current data on the
drive that will be changed). This is needed to allow the calculation of any
needed parity changes.
- A read of the "parity information" (to retrieve the current parity data that will need to be
updated), and
- After some quick math to determine the new parity values, a write of the updated parity
information to the drive containing the parity.
This nets out to 2 extra reads and 1 extra write that all occur behind the scenes and are not
reported on the standard performance reports, plus the actual writing of the data, for a total of 4
I/O operations.
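
The RAID-5 steps above can be sketched in Python as follows. This assumes the "quick math" is XOR, which is the usual choice for single parity; the function names and the one-byte strips are made up for illustration, and a real controller works on physical blocks rather than Python lists.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def raid5_small_write(strips, parity, index, new_data):
    """Update one strip plus its parity; return (new_parity, physical_ios)."""
    ios = 0
    old_data = strips[index]      # extra read 1: data currently on the drive
    ios += 1
    old_parity = parity           # extra read 2: current parity information
    ios += 1
    # Quick math: back the old data out of the parity, fold the new data in.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    strips[index] = new_data      # the actual write of the data
    ios += 1
    ios += 1                      # extra write: updated parity to the parity drive
    return new_parity, ios

strips = [b"\x0f", b"\xf0", b"\x33"]   # hypothetical one-byte strips
parity = b"\xcc"                       # 0x0f ^ 0xf0 ^ 0x33
parity, ios = raid5_small_write(strips, parity, 1, b"\x55")
print(parity.hex(), ios)  # 69 4
```

Note that only the drive being updated and the parity drive are touched; the other data drives in the stripe are not read.
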
With RAID-6, these activities also occur, but for 2 sets of parity drives. Therefore, in addition
to the initial data read, there are 2 parity reads and 2 parity writes, and the actual write of the
data. This means three extra reads, and two extra writes, plus the actual data update for a
total of six I/O operations.
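
To put the four-versus-six figures in perspective, here is a rough Python sketch of the physical disk load for a given workload, before any write cache is taken into account. The workload numbers are invented, and it assumes each system read costs one physical access.

```python
# Physical disk accesses per system-issued write, as quoted above.
WRITE_PENALTY = {"RAID-5": 4, "RAID-6": 6}

def backend_ios_per_second(reads_per_sec, writes_per_sec, level):
    """Physical I/Os per second with nothing masking the write penalty."""
    return reads_per_sec + writes_per_sec * WRITE_PENALTY[level]

# Hypothetical workload: 1000 reads/sec and 500 writes/sec
for level in WRITE_PENALTY:
    print(level, backend_ios_per_second(1000, 500, level))
# RAID-5 3000
# RAID-6 4000
```
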
These extra activities have been termed the RAID Write Penalty. System i RAID-5
subsystems reduce/eliminate the write penalty via a true write cache (not just a write buffer).
See Appendix B, "Understanding disk write cache" on page 69 for more information.
This cache disconnects the physical disk write from the reporting of the write complete back
to the processor. The physical write (usually) occurs after System i has been told that the
write has been performed. This hides the write penalty (actually write cache allows the write
penalty to overlap with other disk and system activities so that it usually does not have a
performance impact). The write cache also supports greater levels of disk activity in mirrored
environments.

Paul

-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of DrFranken
Sent: Wednesday, March 19, 2014 12:34 PM
To: Midrange Systems Technical Discussion
Subject: Re: RAID 6 rebuild time - can I slash it?

Mostly the size of the drives is the major factor.

Yes, data does need to be moved about to allow for the various RAID stripes, so if the system will END UP more than 70% full then the prepare step will take longer as well. If it's cool to delete that 130+ GB virtual tape then I would do so, as it can't hurt for sure!

- Larry "DrFranken" Bolhuis

www.frankeni.com
www.iDevCloud.com
www.iInTheCloud.com

On 3/19/2014 12:29 PM, Jeff Crosby wrote:

All,

On a Saturday of my choosing, our HW service provider is going to take
our System i 520 from 2 8-drive RAID 5 sets to 2 8-drive RAID 6 sets
w/hot spare. Yes I'm taking a SAVE21 before.

Building the RAID 6 will be time-consuming. My question is this: Is
the amount of time it will take based on the size of the *drives*? Or
on the amount of *data* on the drives?

I'm asking because we have a virtual tape drive (TAPVRT01) that is
taking up 123GB. If I understand virtual tape correctly, it cannot be "shrunk."
I could remove that device (or whatever "piece" of the image catalog
structure) beforehand and recreate it afterward. I could also do
other cleanup, like old job logs, etc.

But only if it actually saves time. If the RAID 6 build time is tied
to drive size to where a cleanup won't make a difference, then I won't bother.

Thanks.
