Re: RAID 6 rebuild time - can I slash it? -- MIDRANGE-L

Hi Jeff

You didn't say (or I missed it), but are you moving the data off that drive
in advance? Hopefully that's a "yes".

Also, is your concern around the timing based on the amount of time you
will be unprotected or the amount of time the system will potentially be
out of service?

On Thu, Mar 20, 2014 at 8:26 AM, Jeff Crosby <jlcrosby@xxxxxxxxxxxxxxxx> wrote:

Paul,

Not adding any drives. Moving data off 1 drive to make the hot spare.

On Wed, Mar 19, 2014 at 2:46 PM, Steinmetz, Paul <PSteinmetz@xxxxxxxxxx> wrote:

Jeff,

Currently, you have 2 8-drive Raid5 sets, a total of 16 drives.
Are you adding 2 new drives for your hot spares, or remaining at 2 8-drive
Raid6 sets with hot spare?
If you're not adding drives, you will need to move data off 1 drive for
each of your Raid5 sets, correct?

Paul

-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of Jeff Crosby
Sent: Wednesday, March 19, 2014 2:34 PM
To: Midrange Systems Technical Discussion
Subject: Re: RAID 6 rebuild time - can I slash it?

Our QPFRADJ is set to 3.

You give times for drives with NO data and times for drives WITH data.
The drives WITH data are my question. Does it make a difference, for
example, if the drive is 20% full vs 40% full?

On Wed, Mar 19, 2014 at 2:07 PM, Steinmetz, Paul <PSteinmetz@xxxxxxxxxx> wrote:

Jeff,

Disk controller, disk type (SSD, 15K or 10K), and disk size are all
factors. IBM internals has a tool that will estimate this.

Below are times from a disk project years back on a 9406-550.

Stop parity
- 20 minutes for 8 drives if no data on drives
- 4 hours / 40 minutes per drive if drives contain data

Start parity
Drives with no data - 5 minutes / drive
3 to 18 drives, all with no data, 5 minutes / drive; 15 to 90 minutes.
The time to start Raid on 3 drives vs 18 drives is not much different
= about 30 minutes with NO drives configured.
At least one drive with data, either customer data or LS - 40 minutes
/ drive
The time to start Raid on 3 drives vs 18 drives is not much
different = about 5.5 to 5.75 hours with some drives configured.

Also check QPFRADJ. Ours was set to 2. The machine pool needed more
memory and the adjuster was not giving it. I set QPFRADJ to 0 and
forced the machine pool to have more memory.
This seemed to help.
What happens at DST? Is it possible the machine pool needs more
memory there? To the best of my knowledge, there is no way of changing
it at DST.

Below is additional Raid detail

Calculating RAID-5 and RAID-6 capacity
All drives in each RAID-5 or RAID-6 array must have the same capacity.
To calculate the usable capacity of each RAID-5 array, simply subtract
1 from the actual number of drives in the array and multiply the result
by the individual drive capacity. For RAID-6, subtract 2 and multiply
by the capacity. Using the latest System i controllers, the minimum
number of drives in a single RAID-5 array is three and in a single
RAID-6 array is four. The maximum is the lesser of 18 or the number of
drives that can be placed in the disk enclosure (12 for a #5095/0595 or
15 for a #5094/5294).
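
The capacity rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the quoted text. The function name and the 70 GB drive size are my own assumptions (the thread never states the drive size), so plug in your real numbers.

```python
def usable_capacity_gb(drive_count, drive_gb, raid_level):
    """Usable capacity of ONE parity set, following the rule quoted above.

    drive_gb is a hypothetical per-drive size; all drives in a set
    must be the same capacity.
    """
    parity_drives = {5: 1, 6: 2}[raid_level]  # RAID-5 gives up 1 drive, RAID-6 gives up 2
    min_drives = {5: 3, 6: 4}[raid_level]     # minimums from the quoted text
    if not min_drives <= drive_count <= 18:   # 18 is the quoted upper limit per array
        raise ValueError("drive count outside the supported range for this RAID level")
    return (drive_count - parity_drives) * drive_gb


# Assumed 70 GB drives, purely for illustration.
print(usable_capacity_gb(8, 70, 5))  # one 8-drive RAID-5 set
print(usable_capacity_gb(8, 70, 6))  # the same 8 drives as RAID-6
print(usable_capacity_gb(7, 70, 6))  # 7-drive RAID-6 set after giving up a hot spare
```

The point of the comparison is that going from RAID-5 to RAID-6 costs one more drive of capacity per set, and carving a hot spare out of a set costs another.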

There is significant performance overhead involved in RAID-6
environments with heavy writes to disk application workload. A RAID-5
environment requires four physical disk accesses for each system
issued write operation. A RAID-6 environment requires six physical
disk accesses for the single system write operation. This is because
two parity sets are maintained rather than just one. We discuss this
more fully in the next section ("The RAID-5 and RAID-6 Write
Penalty"). While it can protect against disk outages (which should be
relatively rare with the SCSI drives used by the System i, but less so
for the Serial ATA (SATA) drives where RAID-6 was initially
implemented on other platforms), it does nothing to protect against
disk controller or other component outages. If you need something
greater than RAID-5 protection, mirroring (with double capacity disks)
is usually preferable to RAID-6 (from both an availability and a
performance perspective).

"RAID levels 3, 4, 5, and 6 all use a similar parity-based approach to
data protection. Simple arithmetic is used to maintain an extra drive
(2 for RAID-6) that contains parity information. In the implementation
for RAID-5 and RAID-6 from IBM, it is the capacity equivalent of one
or two extra drive(s), and the parity data is physically striped
across multiple drives in the RAID array. If a drive fails, simple
arithmetic is used to reconstruct the missing data. It is beyond the
scope of this paper to discuss this in more detail. It works. So well
in fact, that the RAID-5 is widely used throughout the industry."
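
The quoted paper only says "simple arithmetic". As a rough illustration, single parity is commonly done with a byte-wise XOR, and the sketch below shows that idea for the RAID-5 case. It is not a description of what the IBM controllers actually do internally, the block contents are made up, and the second parity set used by RAID-6 needs different math that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per drive.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data)

# Pretend the second drive failed: XOR the survivors with the parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Rebuilding a failed drive means reading every surviving drive in the set, which is one reason drive size matters so much for rebuild time.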

The RAID-5 and RAID-6 Write Penalty
Parity information is used to provide disk protection in RAID-5 and
RAID-6 (also in RAID-3 and RAID-4, but because they are not used on
System i and function similarly to RAID-5, we do not discuss them
further). To maintain the parity, whenever data changes on any of the
disks, the parity must be updated. For RAID-5, this requires:
- An extra read of the data on the drive being updated (to determine
the current data on the drive that will be updated (changed)). This is
needed to allow the calculation of any needed parity changes.
- A read of the "parity information" (to retrieve the current parity
data that will need to be updated), and
- After some quick math to determine the new parity values, a write of
the updated parity information to the drive containing the parity.
This nets out to 2 extra reads and 1 extra write that all occur behind
the scenes and are not reported on the standard performance reports;
plus the actual writing of the data, for a total of 4 I/O operations.
With RAID-6, these activities also occur, but for 2 sets of parity
drives. Therefore, in addition to the initial data read, there are 2
parity reads and 2 parity writes, and the actual write of the data.
This means three extra reads, and two extra writes, plus the actual
data update for a total of six I/O operations.
These extra activities have been termed the RAID Write Penalty. System
i RAID-5 subsystems reduce/eliminate the write penalty via a true write
cache (not just a write buffer). See Appendix B, "Understanding disk
write cache" on page 69 for more information.
This cache disconnects the physical disk write from the reporting of
the write complete back to the processor. The physical write (usually)
occurs after System i has been told that the write has been performed.
This hides the write penalty (actually write cache allows the write
penalty to overlap with other disk and system activities so that it
usually does not have a performance impact). The write cache also
supports greater levels of disk activity in mirrored environments as
well.
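
The I/O counts described above can be tallied with a small Python sketch. The function name is mine, and the 1000 writes per second figure is a made-up workload used only to show the ratio. It counts physical disk operations before any help from the write cache.

```python
def physical_ios_per_write(parity_sets):
    """Physical disk operations behind one system-issued write.

    parity_sets is 1 for RAID-5 and 2 for RAID-6.
    Returns (reads, writes, total).
    """
    reads = 1 + parity_sets   # old data, plus the old parity from each parity set
    writes = 1 + parity_sets  # new data, plus the new parity for each parity set
    return reads, writes, reads + writes


system_writes_per_sec = 1000  # hypothetical workload, not from the thread

for name, parity_sets in (("RAID-5", 1), ("RAID-6", 2)):
    reads, writes, total = physical_ios_per_write(parity_sets)
    print(name, reads, "reads +", writes, "writes =", total,
          "per write;", total * system_writes_per_sec, "disk ops/sec")
```

For the same write workload, RAID-6 asks the disks for half again as many physical operations as RAID-5 (6 versus 4), which is the overhead the quoted text warns about for write-heavy applications.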

Paul

-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of DrFranken
Sent: Wednesday, March 19, 2014 12:34 PM
To: Midrange Systems Technical Discussion
Subject: Re: RAID 6 rebuild time - can I slash it?

Mostly the size of the drives is the major factor.

Yes data does need to be moved about to allow for the various RAID
stripes so if the system will END UP more than 70% full then the
prepare step will take longer as well. If it's cool to delete that
130+ GB virtual tape then I would do so as it can't hurt for sure!

- Larry "DrFranken" Bolhuis

www.frankeni.com
www.iDevCloud.com
www.iInTheCloud.com

On 3/19/2014 12:29 PM, Jeff Crosby wrote:

All,

On a Saturday of my choosing, our HW service provider is going to
take our System i 520 from 2 8-drive RAID 5 sets to 2 8-drive RAID 6
sets w/hot spare. Yes I'm taking a SAVE21 before.

Building the RAID 6 will be time-consuming. My question is this:
Is the amount of time it will take based on the size of the
*drives*? Or on the amount of *data* on the drives?

I'm asking because we have a virtual tape drive (TAPVRT01) that is
taking up 123GB. If I understand virtual tape correctly, it cannot
be "shrunk."

I could remove that device (or whatever "piece" of the image catalog
structure) beforehand and recreate it afterward. I could also do
other cleanup, like old job logs, etc.

But only if it actually saves time. If the RAID 6 build time is
tied to drive size to where a cleanup won't make a difference, then
I won't bother.

Thanks.

--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.

--
Jeff Crosby
VP Information Systems
UniPro FoodService/Dilgard
P.O. Box 13369
Ft. Wayne, IN 46868-3369
260-422-7531
www.dilgardfoods.com

The opinions expressed are my own and not necessarily the opinion of my
company. Unless I say so.