Don’t RAID-5/6 everything! – Part 1

It’s a beautiful Saturday morning … the sun is out, and the birds are chirping … and here I am, thinking about RAID-5/6. What’s wrong with me?

Anyway, have you ever wondered almost all your volumes are in a RAID-5/6 configuration? Like an obedient child, the answer would probably be “Oh, my vendor said it is good for me …”

In storage, the rule is applications-read, applications-write. And different applications have different behaviors but typically, they fall under 2 categories:

  • Random access
  • Sequential access

The next question to ask is how much Read/Writes ratio (or percentage) is in that Random Access behavior and how much of Read/Write ratio in Sequential Access behavior.

We usually pigeonhole transactional databases such as SQL Server, Oracle into OLTP-type characteristics with random access being the dominant access method. Similarly, email applications such as Exchange, Lotus and even SMTP into similar OLTP-type characteristics as well. We typically do a 2:1 or 3:1 ratio for OLTP-type applications with Read heavy and less of Writes. Data warehouse type of databases tend to be more sequential.

However, even within these OLTP applications, there are also sequential access behaviors as well, as the following table for a database shows:

Operation Random or Sequential Read/Write Heavy Block Size
DB-Log Random (Sequential in log recovery) Write Heavy unless you are doing log recovery 1KB – 64KB
DB-Data Files Random Read/Write mix dependent on load 4KB – 32KB
Batch insert Sequential Write Heavy 8KB – 128KB
Index scan Sequential Read Heavy 8KB – 128KB

We will look into 4 RAID-levels in this scenario and see how each RAID-level applies to an OLTP-type of environment. These RAID levels are RAID-0, RAID-1 (1+0, 0+1 included), RAID-5 and RAID-6.

RAID-0 is the baseline, with 1 x Read and 1 x Write being processed as per normal.

In RAID-1, it would require 2 x Writes and 1 x Read, because the write operation is mirrored. The RAID penalty is 2.

To avoid the cost of RAID-1, RAID-5 is almost always the RAID level of choice (unless you speak to those NetApp fellas). RAID-5 is a parity-based RAID and require 2 x Read (1 to read the data block and 1 to read the parity block) AND 2 x Write (1 to write the modified block and 1 to write the modified parity). Hence it has a RAID penalty of 4.

RAID-6 was to address the risk of RAID-5 because disk capacity are so freaking large now (3TB just came out). To rebuild a large-TB drive would take longer time and the RAID-5 volume is at risk if a second disk failure occurs. Hence, double parity RAID in RAID-6. But unfortunately, the RAID penalty for RAID-6 is 6!

To summarize the RAID write penalty,

RAID-level Number of I/O Reads
Number of I/O for Writes
RAID Write Penalty
0 1 1 1
1 (1+0, 0+1) 1 2 2
5 1 4 4
6 1 6 6

So, it is well known that RAID 0 has good performance for reads and writes but with absolutely no protection. RAID-1 would be good for random reads and writes but it is costly. RAID-5 is good for applications with a high ratio of sequential reads vs writes (2:1, 3:1 as mentioned), and RAID-6, errr … should be taken similarly as RAID-5 with some additional performance penalty.

With that in mind, a storage administrator must question why a particular RAID-level was proposed to the database or any like-applications.

I am going out to enjoy the Saturday now … and today, August 13th is the World’s Left-Handed Day. More about this RAID penalty and IOPS in my next entry.

Advertisements

About cfheoh

I am a technology blogger with 20 years of IT experience. I write heavily on technologies related to storage networking and data management because that is my area of interest and expertise. I introduce technologies with the objectives to get readers to *know the facts*, and use that knowledge to cut through the marketing hypes, FUD (fear, uncertainty and doubt) and other fancy stuff. Only then, there will be progress. I am involved in SNIA (Storage Networking Industry Association) and presently the Chairman of SNIA Malaysia. My day job is to run 2 companies - Storage Networking Academy and ZedFS Systems. Storage Networking Academy provides open-technology courses in storage networking (foundation to advanced) as well as Cloud Computing. We strives to ensure vendor-neutrality as much as we can in our offerings. At ZedFS Systems, we offer both storage infrastructure and software solution called Zed OpenStorage. We developed Zed OpenStorage based on the Solaris kernel, ZFS file system and DTrace storage analytics.

Posted on August 13, 2011, in Disks, RAID and tagged , , , , , , , . Bookmark the permalink. 3 Comments.

  1. I absolutely agree with you.

    I got a quote from an ISP on Thursday, my requirement has a heavily write intensive database and yet their “technical expert” revealed that the VM they provisioned me would have its .VMDK files (virtual drives) provisioned on a raid 6 array. They proudly told me that they use 15K 300GB drives, they must not understand they would achieve higher write performance and energy efficiency if they had at least some of their storage on larger 10K drives which were mirrored.

    [Provided you have decent hardware] For mirroring (raid 1) the penalty is less than 2x the latency of an IO because the raid controller issues the two writes simultaneously to the drives meaning that the entire IO takes the length of time of the slowest of the two I/Os to the drives.

    • Hi Alex

      Sorry for the late reply. My blog has migrated from WordPress to my own domain name – http://storagegaga.com.

      If you have a write intensive application, RAID-6 is possibly the least friendly of them all. With a write penalty of 6 (depending on the storage vendor), it doesn’t really matter if the HDDs’ RPM is 15K, because IOPS will max out real fast.

      As for energy efficiency, 15K RPM HDDs will draw more power and less green than 10K RPM HDDs. If you consider the variables of max capacity, RPM, power draw, 10K RPM HDDs are likely to be a better choice in the long run. The performance bump from 15K compared to 10K may not be significant enough when it comes to energy efficiency.

      Lastly you are absolutely right. Spend a bit more, and get RAID 1+0 for your DB.

      All the best. Thank you for reading my blog.

      /Chin-Fah

  1. Pingback: Don’t RAID-5/6 everything! – Part 1 (thnx cfheoh)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: