Software RAID

Setup new RAID

  1. For Debian users, the RAID modules aren’t loaded early at boot by default. Add raid5 to /etc/modules (see the sketch after this list), then update the initramfs with:

    update-initramfs -u
    
  2. Prepare the hard drives by partitioning them. Create a single partition covering the drive but ending 100MiB before the end of the disk (in this example at 11444124MiB); the sketch after this list shows how to check the disk size first. This ensures that a replacement hard drive that doesn’t have exactly the same capacity will still be usable.

    parted /dev/sdc mkpart primary 1MiB 11444124MiB set 1 raid on print
    
  3. Create the RAID volume.

    • RAID level 5
    • Eight hard drives compose the RAID volume
    • The chunk size (the per-device stripe unit) is 512k
    mdadm --create --verbose /dev/md0 --level=5 --raid-devices=8 --chunk=512 /dev/sd[abcdefgh]1
    
  4. Save the configuration of the RAID.

    mdadm --detail --scan > /etc/mdadm.conf
    
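A minimal sketch of the surrounding commands for steps 1 to 3, assuming the device names of this example (/dev/sdc for one of the new drives, /dev/md0 for the array); run as root:

# Step 1: load the raid5 module at boot and rebuild the initramfs.
echo raid5 >> /etc/modules
update-initramfs -u

# Step 2: check the disk size in MiB before choosing the partition end
# (a brand-new disk may first need a partition table, e.g. "mklabel gpt").
parted /dev/sdc unit MiB print

# Step 3: verify the newly created array and check the initial sync.
mdadm --detail /dev/md0
cat /proc/mdstat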

Format new RAID

The new RAID array can be formatted directly or, for added flexibility, LVM can be set up on top of it.
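
If LVM is preferred, the physical volume, volume group and logical volume are created on the RAID device before formatting; a minimal sketch, assuming /dev/md0 as above (the names vg_data and lv_data are only examples):

pvcreate /dev/md0                        # declare the RAID volume as an LVM physical volume
vgcreate vg_data /dev/md0                # create a volume group on it
lvcreate -l 100%FREE -n lv_data vg_data  # one logical volume spanning all the space

The mkfs.xfs command below is then run on /dev/vg_data/lv_data instead of /dev/md0.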

To format the RAID volume, with or without LVM, it is critical to align the filesystem with the underlying RAID geometry, using the options described below for the XFS filesystem.

  • Install the xfsprogs package to format as XFS.
  • XFS options:
    • su: stripe unit (here 512k, the same as the chunk size above).
    • sw: stripe width, i.e. the number of data hard drives in the RAID, without counting the parity drives (N-1 for RAID 5, N-2 for RAID 6); here 8-1=7.
    • -L sets the filesystem label.
    • Replace /dev/md0 by the logical volume path if using LVM.
mkfs.xfs -L data -d su=512k,sw=7 /dev/md0
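
Once the filesystem is mounted (the mount point /data below is only an example), the alignment can be double-checked with xfs_info, which reports sunit and swidth in filesystem blocks:

mkdir -p /data
mount /dev/md0 /data    # or the logical volume path if using LVM
xfs_info /data          # with the default 4KiB block size, su=512k maps to sunit=128 and sw=7 to swidth=896 blks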

Maintain RAID

Replace failing drive

To identify the failing drive, list the drives with their serial numbers, as root:

lsblk -o NAME,MOUNTPOINT,HCTL,TYPE,SIZE,SERIAL

The failing drive/volume in this example is /dev/sda1.
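
To cross-check which member mdadm itself considers faulty (a sketch, using the array and device of this example):

mdadm --detail /dev/md0   # the failing member is listed with a "faulty" state
cat /proc/mdstat          # failed members are flagged with (F)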

  1. If the drive isn’t already detected as failing, manually set it faulty:

    mdadm /dev/md0 --fail /dev/sda1
    
  2. Remove the failing drive from the RAID:

    mdadm /dev/md0 --remove /dev/sda1
    
  3. Shut down the machine and replace the drive. To physically identify the failing drive, the HCTL (Host:Channel:Target:Lun for SCSI) column from lsblk indicates in which slot the drive is plugged.

  4. Partition the drive (see Setup step 2). Get the partition boundaries from another (already installed) drive using parted and create a partition with the same boundaries on the replacement drive (see the sketch after this list).

  5. Add the new drive to the RAID (identified in this example as /dev/sdb1):

    mdadm /dev/md0 --add /dev/sdb1
    
  6. RAID recovery can be followed with cat /proc/mdstat, as shown in the sketch below.
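
A minimal sketch for steps 4 and 6, assuming the replacement drive appears as /dev/sdb and reusing the partition boundaries of the setup example:

# Step 4: read the partition boundaries from a healthy member...
parted /dev/sdc unit MiB print
# ...and reproduce them on the replacement drive (a new disk may need "mklabel gpt" first).
parted /dev/sdb mkpart primary 1MiB 11444124MiB set 1 raid on

# Step 6: follow the rebuild.
watch cat /proc/mdstat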

Monitor

mdadm can execute a script each time an event occurs on any RAID. This script receives as parameters: i) the event, ii) the RAID device, and iii) possibly the component device concerned.

The script started by mdadm is configured in /etc/mdadm.conf:

PROGRAM /etc/mdadm_warning.sh

Then copy the script mdadm_warning.sh to /etc; a sketch of what such a script might contain is shown below.
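
The original mdadm_warning.sh is not reproduced on this page; assuming the goal is simply to log (and optionally mail) the event, it could look like this:

#!/bin/sh
# Arguments passed by mdadm: $1 = event, $2 = RAID device, $3 = component device (may be empty).
EVENT="$1"
MD_DEVICE="$2"
COMPONENT="$3"

logger -t mdadm_warning "mdadm event: ${EVENT} on ${MD_DEVICE} ${COMPONENT}"
# Optional: forward the event by mail (requires a mail command, e.g. from mailutils).
# echo "mdadm event: ${EVENT} on ${MD_DEVICE} ${COMPONENT}" | mail -s "mdadm: ${EVENT}" root

Remember to make the script executable with chmod +x /etc/mdadm_warning.sh.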

To test the script, run mdadm --monitor --scan --oneshot --test, which generates a test event for each array and passes it to /etc/mdadm_warning.sh.

About RAID

Literature

  • Mdadm Cheat Sheet
  • Partition alignment.
    • As noted here, the RAID volume can be used directly without partitioning it.
    • The theory of partition alignment is explained here, along with benchmarks, and here.
    • Setting up XFS on Hardware RAID here.

Benchmark

Test the new RAID with Bonnie++. This is useful, for example, to check that partition alignment is correct, since improper alignment degrades performance. With the RAID mounted on /data, replace user with the user of your choice.

bonnie++ -d /data -s 600G -n 0 -m first -f -b -u user -x 4 > /tmp/test.csv
bon_csv2html /tmp/test.csv > /tmp/test.html