Selection of RAID Levels

The previous section introduced RAID and its various levels.

This section discusses how to choose among those levels. Each level has its own advantages and drawbacks, so the right choice depends on what an application needs from its storage.

Several factors are taken into account when selecting a RAID level:

The monetary cost of the additional disk storage required.
The performance required, in terms of the number of input-output (I/O) operations.
The performance when a disk has failed.
The performance during a rebuild, i.e., while the data of a failed disk are being reconstructed on a new disk.

RAID system designers weigh all of these factors to select the level that best meets their requirements.
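
The first factor, storage cost, can be made concrete with a small calculation. The sketch below is illustrative only: the function names `usable_disks` and `overhead_fraction` are hypothetical, only levels 0, 1, 5, and 6 are modelled, and the minimum disk counts are assumptions of this sketch. It uses the standard capacity figures: striping keeps everything, mirroring keeps half, single parity gives up one disk's worth, and dual redundancy gives up two.

```python
def usable_disks(level: int, n_disks: int) -> int:
    """Return how many disks' worth of capacity is left for user data."""
    if level == 0:                      # striping only, no redundancy
        return n_disks
    if level == 1:                      # every disk has a mirror
        if n_disks % 2 != 0:
            raise ValueError("mirroring needs an even number of disks")
        return n_disks // 2
    if level == 5:                      # one disk's worth of parity
        if n_disks < 3:
            raise ValueError("this sketch assumes at least 3 disks for level 5")
        return n_disks - 1
    if level == 6:                      # two disks' worth of redundancy
        if n_disks < 4:
            raise ValueError("this sketch assumes at least 4 disks for level 6")
        return n_disks - 2
    raise ValueError(f"level {level} is not modelled in this sketch")


def overhead_fraction(level: int, n_disks: int) -> float:
    """Fraction of the raw capacity that is spent on redundancy."""
    return 1 - usable_disks(level, n_disks) / n_disks


# Compare the modelled levels for an array of eight disks.
for level in (0, 1, 5, 6):
    print(level, usable_disks(level, 8), overhead_fraction(level, 8))
```

With eight disks, mirroring gives up half of the raw capacity, while single parity gives up one eighth.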

Comparing RAID Levels

Let's discuss several comparison points that distinguish the levels from one another and make the choice between them easier.

RAID level 0 is a reasonable choice when data safety is not critical. It is therefore used in high-performance applications where that trade-off is acceptable.
Designers can choose RAID level 1 when rebuild time matters, because rebuilding is simplest at level 1: the data are simply copied from the other disk of the mirror. At the other levels, rebuilding the data of a failed disk requires accessing all the other disks in the array. Rebuild performance is an important factor in high-performance database systems. In fact, the time taken to rebuild the data may form a significant part of the repair time, so rebuild performance also influences the mean time to data loss.
RAID levels 2 and 4 are subsumed by levels 3 and 5, so the choice is effectively restricted to the remaining levels. Bit striping (level 3) is in turn inferior to block striping (level 5): block striping provides equally good data-transfer rates for large transfers while using fewer disks for small transfers. For small transfers the access time dominates, so the benefit of parallel reads diminishes. Level 3 may even perform worse than level 5 for a small transfer, because the transfer completes only when the corresponding sectors on all disks have been fetched. The average latency of the disk array thus approaches the worst-case latency of a single disk, which negates the benefit of the higher transfer rate.
RAID level 6 offers better reliability than RAID level 5, so designers can use it in applications where data safety is a major concern. However, many RAID implementations currently do not support level 6.
The choice between RAID level 1 and RAID level 5 is harder to make. RAID level 1 is well suited to applications such as storing the log files of a database system, because it offers the best write performance; none of the other redundant levels matches it in this respect. RAID level 5, on the other hand, has a lower storage overhead than level 1 but a higher time overhead for writes. Level 5 is therefore the better choice for applications in which data are read frequently but written rarely.
Disk-storage capacity has been growing year after year, and the cost per byte has been falling at the same rate. Consequently, for applications with moderate storage requirements, the monetary cost of the extra storage needed for mirroring has become relatively small. Access times, however, have improved at a much slower rate, while the number of input-output operations required per second has increased sharply. RAID level 5 needs additional I/O operations to write a single logical block, so it pays a time penalty on writes. RAID level 1 has therefore become the usual choice for applications with moderate storage requirements and high input-output requirements, while RAID level 5 remains attractive where storage cost dominates and writes are rare.
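
The write-cost difference between levels 1 and 5 noted above can be made concrete by counting I/O operations for a single-block update. The following is a toy in-memory model, not a real controller: blocks are `bytes` objects held in Python lists, and the names `xor_bytes`, `raid1_small_write`, and `raid5_small_write` are hypothetical. It assumes the common read-modify-write scheme, in which the new parity is the old parity XOR the old data XOR the new data.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def raid1_small_write(mirrors: list, new_block: bytes) -> int:
    """Overwrite the block on every mirror copy; return the I/O count."""
    for i in range(len(mirrors)):
        mirrors[i] = new_block          # one write per copy, no reads needed
    return len(mirrors)


def raid5_small_write(stripe: list, parity_index: int,
                      data_index: int, new_block: bytes) -> int:
    """Update one data block and its parity; return the I/O count."""
    old_data = stripe[data_index]       # read 1: old data block
    old_parity = stripe[parity_index]   # read 2: old parity block
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_block)
    stripe[data_index] = new_block      # write 1: new data block
    stripe[parity_index] = new_parity   # write 2: new parity block
    return 4


# Three data blocks followed by their XOR parity block.
stripe = [b"\x01\x01", b"\x02\x02", b"\x04\x04", b"\x07\x07"]
print(raid5_small_write(stripe, parity_index=3, data_index=1,
                        new_block=b"\x08\x08"))
print(raid1_small_write([b"old", b"old"], b"new"))
```

In this model a two-way mirror spends two writes on the update, while the parity scheme spends two reads and two writes. Real controllers may hide part of that cost with caching, which this sketch does not model.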

Together, these points summarize the features and capabilities of each RAID level and should help designers choose the appropriate level for storing their data.
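
The small-transfer argument against bit striping made earlier can be illustrated numerically. The sketch below rests on a strong simplifying assumption that is not in the text: each disk's access latency is modelled as an independent value drawn uniformly between zero and a worst-case latency. Under that assumption, a request that must wait for all n disks has an expected latency of worst × n / (n + 1), whereas a request served by one disk waits worst / 2 on average. The function names are hypothetical.

```python
import random


def expected_max_uniform(n_disks: int, worst_latency_ms: float) -> float:
    """Expected value of the largest of n independent Uniform(0, worst) draws."""
    return worst_latency_ms * n_disks / (n_disks + 1)


def simulate_small_read(n_disks: int, worst_latency_ms: float,
                        trials: int = 20000, seed: int = 0) -> float:
    """Average time until the slowest of n disks has fetched its sector."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The transfer completes only when every disk has responded.
        total += max(rng.uniform(0, worst_latency_ms) for _ in range(n_disks))
    return total / trials


worst = 9.0  # hypothetical worst-case latency in milliseconds
print("one disk (block striping):", expected_max_uniform(1, worst))
print("eight disks (bit striping):", expected_max_uniform(8, worst))
print("eight disks, simulated:", simulate_small_read(8, worst))
```

As the number of disks grows, the expected latency of the all-disk request moves toward the worst-case latency of a single disk, which is the effect described for level 3.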

Hardware Issues

In addition to the selection criteria above, several other issues arise when implementing RAID.

RAID can be implemented in two ways, in software or in hardware:

Software RAID

Software RAID systems implement RAID without any change at the hardware level. The RAID logic is implemented entirely in the system software.

Hardware RAID

Systems with special-purpose hardware for supporting RAID are known as Hardware RAID systems. Hardware RAID implementations often use non-volatile memory to record writes before executing them. After a power failure, the system retrieves the information about incomplete writes from the non-volatile memory and completes them.
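
The record-then-write idea can be sketched as follows. This is a minimal in-memory model under stated assumptions, not how any particular controller works: the class `ToyRaidController` and its `fail_before_disk` flag are hypothetical, the `nvram` list stands in for non-volatile memory that survives a power failure, and the `disk` dictionary stands in for the array.

```python
class ToyRaidController:
    """Toy model of a controller that logs writes before executing them."""

    def __init__(self):
        self.nvram = []   # pending writes; assumed to survive power loss
        self.disk = {}    # block number -> block contents

    def write(self, block_no: int, data: bytes, fail_before_disk: bool = False):
        self.nvram.append((block_no, data))      # 1. record the write first
        if fail_before_disk:
            raise RuntimeError("simulated power failure")
        self.disk[block_no] = data               # 2. execute the write
        self.nvram.remove((block_no, data))      # 3. retire the record

    def recover(self) -> int:
        """Complete every write still recorded; return how many were replayed."""
        pending = list(self.nvram)
        for block_no, data in pending:           # replay in original order
            self.disk[block_no] = data
        self.nvram.clear()
        return len(pending)


controller = ToyRaidController()
controller.write(1, b"old")
try:
    controller.write(1, b"new", fail_before_disk=True)
except RuntimeError:
    pass                                         # power is lost mid-write
print(controller.disk[1])                        # still the old contents
print(controller.recover(), controller.disk[1])  # write completed on recovery
```

Recording the write before touching the disk is what makes recovery possible: whatever is still in the log after a failure is exactly the set of writes that may be incomplete.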

Beyond the choice of implementation, several issues arise at the hardware level, as discussed below:

Latent Failure: Data loss may occur even when all writes have completed properly. There is a small chance that a sector in a disk becomes unreadable at some point, even though it was written successfully earlier. The causes range from manufacturing defects to data corruption on a track when an adjacent track is written repeatedly. Such loss of data that were written successfully earlier is known as a latent failure, or bit rot. If a latent failure is detected early, the data can be recovered from the remaining disks of the RAID organization. If it goes undetected, however, a single disk failure can lead to data loss when a sector on one of the other disks has a latent failure.
To reduce the chance of such data loss, RAID controllers rely on two techniques: scrubbing and hot swapping. In scrubbing, every sector of every disk is read during periods when the disks are idle. If a sector is found to be unreadable, its data are recovered from the remaining disks of the RAID organization and the sector is written back. If the sector is physically damaged, the disk controller remaps the logical sector address to a different physical sector on the disk.
In hot swapping, faulty disks are removed and replaced with new ones without turning the power off. This reduces the mean time to repair, because replacing the disk does not have to wait until the system can be shut down. In addition, many RAID implementations assign a spare disk to each array, which is used as the replacement as soon as a disk fails. Together, these two techniques reduce the chance of data loss.
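
The scrubbing pass described above can be sketched as follows. This is a toy model under stated assumptions: each disk is a Python list of sectors, `None` marks a sector that has become unreadable, and sectors at the same index across the disks are assumed to form an XOR-parity stripe. The names `xor_blocks` and `scrub` are hypothetical, and the remapping of physically damaged sectors is not modelled.

```python
def xor_blocks(blocks: list) -> bytes:
    """XOR a list of equal-length blocks together."""
    result = bytes(len(blocks[0]))
    for block in blocks:
        result = bytes(a ^ b for a, b in zip(result, block))
    return result


def scrub(disks: list) -> list:
    """Read every sector; repair unreadable ones from the rest of the stripe.

    Returns the (disk, sector) positions that were repaired.
    """
    repaired = []
    for sector in range(len(disks[0])):
        bad = [d for d in range(len(disks)) if disks[d][sector] is None]
        if len(bad) > 1:
            # Single parity cannot recover two losses in the same stripe.
            raise RuntimeError(f"stripe {sector} is unrecoverable")
        if bad:
            others = [disks[d][sector] for d in range(len(disks)) if d != bad[0]]
            disks[bad[0]][sector] = xor_blocks(others)   # recover, write back
            repaired.append((bad[0], sector))
    return repaired


# Two data disks and one parity disk; sector 1 of disk 1 has gone bad.
disks = [[b"\x01", b"\x02"],
         [b"\x04", None],
         [b"\x05", b"\x0a"]]
print(scrub(disks))
print(disks[1][1])
```

The error raised for two bad sectors in one stripe mirrors the risk described above: an undetected latent failure combined with another failure in the same stripe leaves nothing to recover from, which is why the scrub runs while everything else is still healthy.
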
The system power supply or the disk controller could become a single point of failure for the RAID system. To guard against this, good RAID implementations provide multiple redundant power supplies with battery backups, so that they keep functioning even if the power fails. For the disk controller, RAID implementations provide multiple disk interfaces.
An interconnection failure may likewise stop the RAID system from functioning. To avoid this, a good RAID implementation provides multiple disk interfaces as well as multiple interconnections between the RAID system and the computer system or network. As a result, the failure of a single component is not enough to stop the RAID system.
The concepts of RAID also apply beyond disks. In an array of tapes, the data can be recovered even if any one tape is damaged. When data are broadcast over wireless systems, a block of data is split into short units and broadcast along with a parity unit. If one of the units is not received, it can be reconstructed from the remaining units.
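
The broadcast scheme can be sketched in a few lines. This is an illustration under stated assumptions rather than a description of any real protocol: the block is zero-padded so that it splits into equal units, the parity unit is the XOR of all data units and is sent last, a lost unit is represented by `None`, and the names `split_with_parity` and `reassemble` are hypothetical.

```python
def split_with_parity(block: bytes, n_units: int) -> list:
    """Split a block into n_units equal units plus one XOR parity unit."""
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    unit_len = -(-len(block) // n_units)          # ceiling division
    padded = block.ljust(unit_len * n_units, b"\x00")
    units = [padded[i * unit_len:(i + 1) * unit_len] for i in range(n_units)]
    parity = bytes(unit_len)
    for unit in units:
        parity = bytes(a ^ b for a, b in zip(parity, unit))
    return units + [parity]


def reassemble(received: list, original_len: int) -> bytes:
    """Rebuild the block from received units, at most one of which is None."""
    received = list(received)
    missing = [i for i, unit in enumerate(received) if unit is None]
    if len(missing) > 1:
        raise ValueError("one parity unit can recover only one lost unit")
    if missing:
        present = [unit for unit in received if unit is not None]
        rebuilt = bytes(len(present[0]))
        for unit in present:                      # XOR of all received units
            rebuilt = bytes(a ^ b for a, b in zip(rebuilt, unit))
        received[missing[0]] = rebuilt
    return b"".join(received[:-1])[:original_len]  # drop parity and padding


message = b"broadcast"
units = split_with_parity(message, 4)
units[1] = None                                   # one unit is never received
print(reassemble(units, len(message)))
```

The receiver needs no retransmission as long as at most one unit per block is lost, which is the same single-failure guarantee that XOR parity gives a disk array.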