RAID

From Jonathan Gardner's Tech Wiki

Overview

RAID is one way of making many disks work as one. Depending on the level you choose, RAID can provide redundancy, which improves the lifetime of your data at the cost of extra hardware.

Types of RAID

Hardware RAID

Hardware RAID handles the RAID organization on a dedicated controller, at a much lower level than the CPU and memory. It is usually very fast. Depending on the technology used, it may be easy or hard to manage. Typically, hardware RAID is pretty expensive in time, expertise, and cost of the hardware.

Software RAID

Linux has software RAID capability built into the kernel. The kernel organizes reads and writes and dispatches them to the appropriate spot on the appropriate partitions.

Software RAID is much easier to manage, at the cost of some speed. It is also very cheap in terms of expertise, time, and hardware.

RAID 0: Striping

RAID 0 stripes data across several disks: the data is split into chunks, and consecutive chunks go to different disks. This decreases read and write times, since the disks in the array can work on different chunks in parallel.

However, RAID 0 decreases the lifespan of your data. If any one of the disks fails, you lose the whole array. As you add more disks to your array, the chance of losing your data increases accordingly.

For instance, let's say a particular kind of disk has a 10% chance of failing in a year. With a RAID 0 array of 3 such disks, the chance of one or more disks failing is 1 - 0.9 × 0.9 × 0.9, or roughly 27%, assuming the disks fail independently.
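The arithmetic behind that estimate can be sketched in a few lines of shell. This is only an illustration: raid0_loss is a made-up helper name, and it assumes every disk fails independently with the same yearly chance p.

```shell
#!/bin/sh
# raid0_loss P N  (hypothetical helper, not part of any tool)
# Chance that at least one of N independent disks fails: 1 - (1 - P)^N.
raid0_loss() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.4f\n", 1 - (1 - p) ^ n }'
}

raid0_loss 0.10 3   # three disks at 10% each, prints 0.2710
```

Try it with more disks to see how quickly the risk grows as the array gets wider.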

RAID 1: Mirroring

RAID 1 copies the data and puts it on each disk in the array. This increases the write time, since writes have to go to ALL of the disks, but it can decrease read times, since reads can come from ANY of the disks. You can read several files from the same partition at once, since the reads can be served by different disks.

RAID 1 increases the lifespan of your data. Your data is intact as long as any one of the disks is good.

Let's say each disk has a 10% chance of failing in a year. If you have a RAID 1 array of 3 disks, then the chance of losing your data is 0.1 × 0.1 × 0.1 = 0.1%, since all three disks have to fail (before any of them is replaced) for you to lose your data. As before, this assumes the disks fail independently.
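The same kind of back-of-the-envelope check works for mirroring. In this sketch raid1_loss is a made-up helper name, and the model assumes independent failures and no disk replacement during the year, which makes it pessimistic for a well-tended array.

```shell
#!/bin/sh
# raid1_loss P N  (hypothetical helper, not part of any tool)
# Chance that all N independent mirrors fail: P^N.
raid1_loss() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.4f\n", p ^ n }'
}

raid1_loss 0.10 3   # three mirrors at 10% each, prints 0.0010 (that is, 0.1%)
```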

By the way, you can use RAID 1 for backups, since each disk in the array will be a perfect copy of every other disk in the array. If you have a pluggable hard drive, just plug it in and add it to the array. When it has finished syncing, remove it from the array, unplug it, and there's a perfect copy!
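Here is one way the backup trick might look with mdadm. Treat it as a sketch under assumptions, not a tested procedure: /dev/md0, /dev/sdc1, and the helper names run and raid1_backup are all placeholders, and it assumes the array normally runs with a fixed number of mirrors, so the array has to be grown by one slot before the new disk is used as a mirror instead of sitting idle as a spare. The script only prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: refresh a removable backup disk through a RAID 1 array.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# raid1_backup ARRAY BACKUP_PARTITION PERMANENT_MIRRORS  (hypothetical helper)
raid1_backup() {
    array=$1
    backup=$2
    mirrors=$3

    run mdadm "$array" --add "$backup"                        # joins as a spare
    run mdadm --grow "$array" --raid-devices=$((mirrors + 1)) # make room so the spare becomes a mirror
    run mdadm --wait "$array"                                 # block until the sync has finished
    run mdadm "$array" --fail "$backup" --remove "$backup"    # detach the finished copy
    run mdadm --grow "$array" --raid-devices="$mirrors"       # back to the usual mirror count
}

raid1_backup /dev/md0 /dev/sdc1 2
```

Keep in mind that a disk pulled from a mounted, busy array holds the filesystem as it was at that instant, much like after a power cut, so it is safest to do this while the system is quiet.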

RAID 0+1: Striping and Mirroring

RAID 0+1 first stripes the data across the disks in a set, and then mirrors that whole striped set onto one or more other sets. A striped set is lost as soon as any one of its disks fails, so you lose data when every set has lost at least one disk.

RAID 1+0: Mirroring and Striping

RAID 1+0 first groups the disks into mirrored sets, and then stripes the data across those sets. You lose data only when all of the disks in the same mirrored set fail at the same time. With the same disks, RAID 1+0 usually survives more combinations of failures than RAID 0+1.
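To put rough numbers on the two nested layouts, here is a small sketch. It takes RAID 1+0 to mean a stripe across mirrored sets and RAID 0+1 to mean a mirror of striped sets, and it assumes disks fail independently with the same chance p. The function names are made up for this example.

```shell
#!/bin/sh
# raid10_loss P DISKS_PER_MIRROR MIRROR_SETS  (hypothetical helper)
# Data is lost if every disk in any one mirrored set fails.
raid10_loss() {
    awk -v p="$1" -v k="$2" -v m="$3" \
        'BEGIN { printf "%.4f\n", 1 - (1 - p ^ k) ^ m }'
}

# raid01_loss P DISKS_PER_STRIPE STRIPE_SETS  (hypothetical helper)
# Data is lost if every striped set has at least one failed disk.
raid01_loss() {
    awk -v p="$1" -v k="$2" -v m="$3" \
        'BEGIN { printf "%.4f\n", (1 - (1 - p) ^ k) ^ m }'
}

# Four disks at 10% each, arranged as two sets of two.
raid10_loss 0.10 2 2   # prints 0.0199
raid01_loss 0.10 2 2   # prints 0.0361
```

Under this simple model the same four disks are a bit less likely to lose data as RAID 1+0 than as RAID 0+1.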

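RAID 5: Striping and Parity

RAID 5 stripes data across the disks, but also keeps a parity block for each stripe, and the disk that holds the parity rotates from stripe to stripe. If any one disk fails, its contents can be rebuilt from the data and parity on the remaining disks. It is pretty cool, but I don't recommend it for your desktop. If you're writing a heavy-duty app, you may want to investigate this. You need at least 3 disks to make this work the right way.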
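The parity idea is just XOR, which a toy example can show. This sketch uses two single-byte "data blocks" with arbitrary values; a real array does the same thing over whole chunks and any number of data disks.

```shell
#!/bin/sh
# Two data blocks (single bytes here for brevity) and their parity.
d0=165                      # 0xA5, lives on disk 0
d1=60                       # 0x3C, lives on disk 1
parity=$(( d0 ^ d1 ))       # lives on disk 2

# Pretend disk 1 died: rebuild its block from the survivors.
rebuilt=$(( d0 ^ parity ))

echo "parity=$parity rebuilt_d1=$rebuilt"   # prints parity=153 rebuilt_d1=60
```

Because XOR undoes itself, any one missing block can be recovered from all the others, which is why RAID 5 survives exactly one disk failure.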

Setting Up Software RAID

Setting up software RAID is pretty easy. Your Linux installer should walk you through it. Before you begin, you need to think hard about what you want to do and draw out how your data will be split across disks.

Fixing Problems

If you're using RAID 1 and one of the disks fails, or you simply remove it from the machine (during a reboot), then you should get a friendly message telling you that.

After you boot up, check the state of the array. This works even if the RAID array holds your main partition. Run:

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Feb  8 09:14:11 2007
     Raid Level : raid1
     Array Size : 67802240 (64.66 GiB 69.43 GB)
  Used Dev Size : 67802240 (64.66 GiB 69.43 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Nov 26 01:47:03 2008
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 7978da8d:fa89e90a:fced7653:645531c2
         Events : 0.16944

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed

Pay attention to the bottom, where it tells you what is missing. Here, RaidDevice 1 is listed as removed, and the State line further up reads clean, degraded.
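If you would rather have a script spot the missing slot, something like the following could work. It is only a sketch: md_missing_slots is a made-up helper, and it assumes the device table looks like the output shown here, with a five-column row ending in "removed" for each missing slot.

```shell
#!/bin/sh
# md_missing_slots is a hypothetical helper, not part of mdadm.
# It reads `mdadm --detail` output on stdin and prints the RaidDevice
# number of every slot whose state is "removed".
md_missing_slots() {
    awk 'NF == 5 && $5 == "removed" { print $4 }'
}

# Demo on the last two table rows from the output above.
md_missing_slots <<'EOF'
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed
EOF
# prints 1
```

On a live system you would feed it the real thing: mdadm --detail /dev/md0 | md_missing_slots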

To add in the missing drive:

# mdadm /dev/md0 --add /dev/sdb2
mdadm: re-added /dev/sdb2

This will bring sdb2 back into the /dev/md0 RAID array. But if you do a --detail again, you'll see that it is out of sync: it is listed as a spare that is rebuilding, so it can't stand in for the good disk yet.

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Feb  8 09:14:11 2007
     Raid Level : raid1
     Array Size : 67802240 (64.66 GiB 69.43 GB)
  Used Dev Size : 67802240 (64.66 GiB 69.43 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Nov 26 01:50:09 2008
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 0% complete

           UUID : 7978da8d:fa89e90a:fced7653:645531c2
         Events : 0.16988 

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       2       8       18        1      spare rebuilding   /dev/sdb2

You don't have to tell the system to get your newly re-added partition up to speed. Don't worry, this is done automatically in the background with spare cycles. If you keep running --detail from time to time, you should see the rebuild status progress; how fast depends on how much work there is to do and how busy the RAID array is.

When it's all finished, you should see your array back to normal.
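Instead of eyeballing --detail every few minutes, you could pull out just the percentage. The helper below is a sketch with a made-up name, md_rebuild_percent, and it assumes the "Rebuild Status" line is formatted as in the output above. The kernel also reports rebuild progress in /proc/mdstat if you prefer to look there.

```shell
#!/bin/sh
# md_rebuild_percent is a hypothetical helper, not part of mdadm.
# It reads `mdadm --detail` output on stdin and prints the rebuild
# percentage, or nothing if no rebuild is in progress.
md_rebuild_percent() {
    sed -n 's/^ *Rebuild Status : \([0-9][0-9]*\)% complete.*$/\1/p'
}

# Demo on the status line from the output above.
printf ' Rebuild Status : 0%% complete\n' | md_rebuild_percent   # prints 0

# On a live system, a polling loop might look like this (not run here):
#   while pct=$(mdadm --detail /dev/md0 | md_rebuild_percent); [ -n "$pct" ]; do
#       echo "rebuild at $pct%"; sleep 60
#   done
```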

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Feb  8 09:14:11 2007
     Raid Level : raid1
     Array Size : 67802240 (64.66 GiB 69.43 GB)
  Used Dev Size : 67802240 (64.66 GiB 69.43 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Nov 26 02:17:52 2008
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 7978da8d:fa89e90a:fced7653:645531c2
         Events : 0.17402

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       18        1      active sync   /dev/sdb2

Pretty easy, huh? And who said Linux was hard???