Whitepaper on RAID

RAID Review

RAID (Redundant Array of Inexpensive or Independent Disks) is an important component for servers on a critical enterprise or workgroup network. RAID provides hard drive systems that tolerate drive failures. How does it work? How do I administer it? What does it cost? What is the difference between RAID levels and RAID vendors?

RAID systems can be provided in IP Camera video recording systems to increase reliability.

RAID Background

RAID as a computer concept has been around for over twenty years. The computer science department at UC Berkeley first developed the RAID concept back in the 1980s. They used the word Inexpensive rather than today’s Independent.

Since then, changes in technology and the use of computers have made RAID more popular. Computers and hard disk drives became faster, smaller, and less expensive. The computer has become a critical part of an organization’s business. More and more data is stored in the computer, and this information must be available 24 hours per day and 7 days per week. Data accessibility and reliability have become key factors in the success of a business.

Accessibility is the essential component of RAID. RAID technology makes data more accessible by preventing downtime due to a hardware failure. RAID systems can sustain several bad sectors and even whole disk failures and continue running, all while remaining transparent to the end user.

But that accessibility comes at a price. How much? It depends on your total storage requirement, the type of redundancy, and how quickly you need to recover from a failure. The cost of this hardware should be measured against the cost of a failure and the cost of the resulting downtime. Some companies can sustain the loss of a disk drive or two and not suffer financially. Other companies, such as brokerages, measure downtime in minutes of revenue loss. For this class of customer, RAID with full redundancy is a must-have.

RAID systems not only increase reliability, they also increase available storage capacity. Kintronics manufactures RAID systems with Terabyte capacity. Remember when a 40 megabyte drive was overwhelmingly large?

RAID Levels

Six distinct RAID levels have been developed and agreed upon, voluntarily, by various manufacturers. These RAID levels are 0, 1, 2, 3, 4, and 5. Other levels are also used, such as level 10 (a combination of levels 1 and 0, in which data is striped across mirrored sets) and level 6 (level 5 with a second, independent parity block in each stripe).

A RAID system appears as a single large hard disk to the operating system. All of the computations associated with creating the RAID set are hidden from the operating system. RAID responds to standard disk commands such as read, write, and format.

RAID Level 0 stripes data across all disks without redundancy or parity. This Level maximizes data transfer rates and is good for handling large files. Spare drives are not useful on this Level.
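The striping described above can be illustrated with a minimal sketch. It assumes the stripe unit is a single block and that blocks are dealt out to the disks in round-robin order; the function name `locate_block` is hypothetical, and real controllers typically use larger, configurable stripe units.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes a stripe unit of one block, dealt round-robin across the disks.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    disk = logical_block % num_disks      # blocks rotate across the disks
    offset = logical_block // num_disks   # stripe (row) number on that disk
    return disk, offset


# Layout of the first eight logical blocks on a four-disk array.
layout = [locate_block(block, 4) for block in range(8)]
```

Because consecutive blocks land on different disks, a large sequential transfer can keep all of the drives busy at once, which is where the higher transfer rate comes from. Losing any one disk loses part of every large file, since no redundant copy exists.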

RAID Level 1 mirrors data across multiple disks. Data is duplicated on another set of drives. If one drive fails, then the data is still available on the other mirror. This Level has the highest cost per MB and is best suited for smaller capacity applications such as mirroring the boot drive. Typically only one drive is mirrored at a time. Spare drives are not useful on this Level.
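As a rough illustration of mirroring, the sketch below models each drive as an in-memory dictionary. The class name `MirroredSet` and its methods are hypothetical and exist only for this example; a real implementation also has to handle resynchronizing a replaced drive.

```python
class MirroredSet:
    """Toy model of RAID 1: every write goes to all healthy copies."""

    def __init__(self, copies: int = 2):
        self.disks = [dict() for _ in range(copies)]
        self.failed = set()

    def write(self, block: int, data: bytes) -> None:
        # Duplicate the data on every drive that is still working.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def fail(self, index: int) -> None:
        # Simulate a drive failure: its contents become unavailable.
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, block: int) -> bytes:
        # Serve the read from the first drive that is still healthy.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise OSError("all mirrors have failed")


mirror = MirroredSet()
mirror.write(0, b"boot sector")
mirror.fail(0)                 # first drive dies
recovered = mirror.read(0)     # data is still served by the second drive
```

The cost is visible in the model: two drives hold one drive's worth of data.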

RAID Level 2 bit interleaves data across multiple disks with parity information created using a Hamming code. A Hamming code detects errors that occur and determines which part is in error. RAID Level 2 specifies 39 disks with 32 disks of user storage and 7 disks of error recovery coding. This Level is not used in practice.
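To show how a Hamming code both detects an error and identifies its position, here is a small sketch using the Hamming(7,4) code: four data bits plus three check bits. This is a deliberately smaller code than the 32 + 7 disk arrangement mentioned above, chosen only to keep the example short. Each list element can be thought of as the bit stored on one disk; the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7).

    Check bits sit at positions 1, 2 and 4; data bits at 3, 5, 6 and 7.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based error position or 0 if none)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4   # the syndrome names the bad bit
    if error_position:
        c[error_position - 1] ^= 1          # flip the bit that is in error
    return c, error_position


def hamming74_data(codeword):
    """Extract the 4 data bits from a 7-bit codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1                              # corrupt the bit at position 5
fixed, position = hamming74_correct(damaged)
```

A single flipped bit is located and repaired without knowing in advance which position went bad. This code corrects only one error per codeword.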

RAID Levels 3 and 4 stripe data across multiple drives and write parity to a dedicated drive. Level 3 is typically implemented at the byte level, while Level 4 is typically implemented at the block level. These Levels combine the performance of RAID 0 with a redundancy feature. If a drive fails, its data can be reconstructed from the surviving drives and the parity drive. RAID 3 and 4 are best suited for large transfer sizes and rates where redundancy is important. The parity information is calculated at write time and can affect overall performance. Spare drives take over in the event of a drive failure.
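The recovery step can be sketched in a few lines. Parity in these levels is conventionally the bitwise exclusive-OR of the data blocks in a stripe, so XOR-ing the surviving blocks with the parity block yields the missing one. The helper name `xor_blocks` and the sample data are illustrative only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bitwise XOR of several equal-length byte blocks."""
    blocks = list(blocks)
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("blocks must be non-empty and of equal length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives plus one dedicated parity drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The second data drive fails; rebuild its block from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

The same calculation runs on every write to keep the parity block current, which is the write-time overhead noted above.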

RAID Level 5 stripes data and parity information at the block level across all the drives in the array. Parity rotates from drive to drive with each stripe rather than residing on a dedicated parity drive. Reads and writes may be performed concurrently. Level 5 also calculates parity during the write cycles, using an Exclusive-OR (XOR) operation. This Level is best suited for smaller data transfers. Spare drives take over in the event of a drive failure.
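The sketch below illustrates two ideas from this paragraph: parity moving to a different drive on each stripe, and the parity update performed on a small write. The rotation rule used here (parity starts on the last drive and moves one drive to the left per stripe) is one common convention and is an assumption; controllers differ in the exact layout. All function names are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Index of the drive holding parity for a given stripe (assumed rotation)."""
    return (num_disks - 1 - stripe) % num_disks


def stripe_layout(stripe: int, num_disks: int) -> list[str]:
    """Label each drive in a stripe as parity ("P") or data ("D0", "D1", ...)."""
    p = parity_disk(stripe, num_disks)
    labels, next_data = [], 0
    for disk in range(num_disks):
        if disk == p:
            labels.append("P")
        else:
            labels.append(f"D{next_data}")
            next_data += 1
    return labels


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting one data block (read-modify-write)."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# Parity placement for the first four stripes of a four-drive array.
layouts = [stripe_layout(stripe, 4) for stripe in range(4)]

# Small write: replace one block without reading the rest of the stripe.
blocks = [b"\x01", b"\x02", b"\x04"]
parity = b"\x07"                       # 0x01 ^ 0x02 ^ 0x04
new_parity = updated_parity(parity, blocks[1], b"\x08")
```

Because the parity block lives on a different drive for each stripe, writes to different stripes do not all queue up behind a single parity drive, which is why this level handles many small transfers better than Levels 3 and 4.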

Table 1 – Summary of RAID Levels

RAID Level | Description | Capacity | Example capacity, using eight (8) 4 TB disks
0 | Striping | # Drives x Drive Capacity | 32 TB
1 | Mirroring | (# Drives / 2) x Drive Capacity | 16 TB
2 | Hamming code parity | (# Drives - 1) x Drive Capacity | 28 TB
3 | Byte level parity | (# Drives - 1) x Drive Capacity | 28 TB
4 | Block level parity | (# Drives - 1) x Drive Capacity | 28 TB
5 | Interleave parity | (# Drives - 1) x Drive Capacity | 28 TB
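The capacity formulas in Table 1 can be checked with a short calculation. The function below simply encodes the table's formulas as given; its name and signature are hypothetical.

```python
def usable_capacity_tb(level: int, num_drives: int, drive_tb: float) -> float:
    """Usable capacity in TB for RAID levels 0-5, per the formulas in Table 1."""
    if level == 0:
        return num_drives * drive_tb             # striping, no redundancy
    if level == 1:
        return (num_drives / 2) * drive_tb       # half the drives hold copies
    if level in (2, 3, 4, 5):
        return (num_drives - 1) * drive_tb       # one drive's worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


# Reproduce the example column: eight 4 TB disks.
example = {level: usable_capacity_tb(level, 8, 4) for level in range(6)}
```

For eight 4 TB disks this gives 32 TB for Level 0, 16 TB for Level 1, and 28 TB for Levels 2 through 5, matching the table.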

RAID Implementation

RAID can be implemented in hardware or software. Software RAID solutions use the host computer’s CPU and memory to implement the RAID functions. As an example, Sun Microsystems’ Online Disk Suite implements RAID Level 0 or Level 1 in software, using the host processor. It takes advantage of the large cache memory available in the host computer.

RAID Levels 3 and 5 are usually implemented in hardware. The hardware RAID controller has a dedicated CPU to calculate parity and map the location of the files.

Hardware RAID is implemented using either an internal RAID board (like the Adaptec RAID systems) or an external RAID processor such as the CMD controllers. Internal RAID, like software RAID, is operating system dependent. It usually requires a driver to access and configure the RAID controller. On the plus side, the internal RAID controller can communicate faster than an external controller because it incorporates the SCSI adapter function, so access to the data avoids one communication layer. Also, an internal RAID controller is usually less expensive than a comparably equipped external controller.

There are some major disadvantages to an internal controller. First, if the controller fails, then the host computer must be shut down to repair or replace the board. If an external RAID controller fails, only the devices on that bus need to be turned off. Second, most internal controllers do not have expansion cards, so the size of the RAID set is limited. External controllers offer more flexibility in the number of drives they can address. Third, only one host can talk to an internal controller. Fourth, failover on an internal controller is difficult to configure at best.

RAID Benchmark

Known benchmarks are used to compare the performance of various RAID systems. A group of RAID controller manufacturers, disk drive manufacturers, and adapter board manufacturers agreed to create standards and formed a committee called the RAID Advisory Board (RAB). The RAB has developed and reviewed several benchmarks suitable for producing results that reflect the merits of the RAID controller alone, and not the test environment. The RAB can be reached via the Internet at http://www.raid-advisory.com/

A partial list of these benchmarks is IOBENCH, IOZONE, BONNIE, and IOGEN. Of these four, IOGEN and IOBENCH are used most often. All four tests are available free from various sites and are relatively easy to set up and run.

RAID Features and Functions

Comparative performance is just one measure of a RAID controller. The feature set, upgrade path, and ease of use all play a significant role in choosing the proper controller. Vendors advertise these three factors heavily and discuss them at length, with various slants for and against different products. The following table gives a brief list of features and benefits for RAID controllers.

Table 2 – Partial RAID Controller Feature Set

Feature | Benefit and Advantage | Disadvantage
Hot Spare Drives | Ability to recover from single disk failure | Drive runs continuously but stores no data until it is needed
Warm Spare Drives | Ability to recover from single disk failure | Delay in recovering because the drive must be started
Global Spare Drive | Ability to recover from single disk failure on any rank | Requires multiple ranks running on the RAID controller
Failover (Active-Passive) | Avoids RAID controller as the single point of failure | Other controller sits idle until needed
Failover (Active-Active) | Avoids RAID controller as the single point of failure | More expensive

Upgrade path and ease of use also should be considered when purchasing a RAID controller. How much user downtime is required to expand the RAID set? How much work does the systems administrator need to do for the upgrade? How is the RAID controller addressed? Is there a graphical user interface or does the systems administrator need to stand in front of the controller? These questions should be considered as well as performance. If the RAID controller is the fastest one on the market but impossible to configure, why use it?

Another set of features to consider when purchasing a RAID system is the physical packaging. Most MIS rooms have limited space and need to allocate precious rackmount inches efficiently. Also, the ease of upgrading and repairing the RAID system is an important purchasing factor. How many disk drives can this RAID cabinet hold? How easy is it to install another drive? Do I need extra power plugs? What about SCSI cables? How quickly can I swap out a bad disk, controller, power supply, or cooling fan?