Sunday, July 25, 2010

Replacing a Bad Drive in an md Software Mirror

I have a lot of Linux "md" RAID-1 arrays in my set-up. Every so often, one of the two drives in a RAID-1 starts behaving badly and I want to replace it. The "bad" drive hasn't stopped working outright, but it's causing I/O errors to be logged in /var/log/messages, or it fails a SMART self-test. My favorite method is to put the bad drive in "write-mostly" mode, add a third drive to the RAID-1, let it fully sync, and then remove the bad drive from the RAID-1.

Why put the drive in "write-mostly" mode? Because the one remaining "good" drive in the RAID-1 may not be so good. It might have unreadable sectors, and you will not discover that until the md software tries to read the entire "good" drive in order to copy it to the new third drive. A write-mostly drive is still read when md has no other option, so if this happens, unless one's luck is extremely bad, the md software will be able to read the affected sectors from the "bad" drive instead. So instead of taking the bad drive out of the mirror immediately, I tell md to use it as little as possible.

The details: say our RAID-1 is /dev/md9, the bad drive is /dev/sdX, and our new drive is /dev/sdN.

Step 1. echo writemostly > /sys/block/md9/md/dev-sdX/state

This tells the md software to avoid reading from the bad drive. It still gets written to as usual.
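To confirm that the flag took effect, look for the (W) marker that /proc/mdstat prints after a write-mostly member. Here is a minimal sketch of such a check. The function name is_write_mostly is my own invention, and it reads mdstat-formatted text on stdin so it can be tried on saved output as well as on the live file.

```sh
# Hypothetical helper: succeeds if member $2 of array $1 carries the (W)
# write-mostly marker in mdstat-formatted text read from stdin.
is_write_mostly() {
    awk -v md="$1" -v dev="$2" '
        $1 == md && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                # A member field looks like sdb[1] or sdb[1](W)
                name = $i
                sub(/\[.*/, "", name)
                if (name == dev && index($i, "(W)") > 0) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a live system (not run here):
#   is_write_mostly md9 sdX < /proc/mdstat && echo "sdX is write-mostly"
```

If the check fails right after Step 1, reading /sys/block/md9/md/dev-sdX/state directly is a reasonable second opinion.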

Step 2. mdadm --grow /dev/md9 -n 3

Grow the RAID-1 to be able to handle 3 drives.

Step 3. mdadm /dev/md9 -a /dev/sdN

Add the new drive to the array. Syncing starts at once.

Step 4. grep md9 -A 3 /proc/mdstat

Monitor the progress of the sync and continue when it's done. If you are unlucky, this step will never finish: you will see I/O errors in /var/log/messages, and the md software will restart the sync from scratch, over and over in an endless loop. This is where backups come into play as the safety net.
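Rather than re-running the grep by hand, the wait can be scripted. The sketch below is one way to do it, assuming the usual /proc/mdstat layout in which the progress line reads something like "recovery = 8.5% (...)". The function names md_sync_progress and wait_for_sync and the 60-second default interval are my own choices, not anything md provides.

```sh
# Hypothetical helper: print the sync state of array $1 from mdstat-formatted
# text on stdin, e.g. "recovery = 8.5%", or "idle" if no sync line is present.
md_sync_progress() {
    awk -v md="$1" '
        $1 == md && $2 == ":" { in_md = 1; next }   # start of our stanza
        in_md && NF == 0      { in_md = 0 }         # blank line ends it
        in_md && /^[^ \t]/    { in_md = 0 }         # so does the next array
        in_md && match($0, /(recovery|resync) *= *([0-9.]+%|DELAYED|PENDING)/) {
            print substr($0, RSTART, RLENGTH)
            found = 1
        }
        END { if (!found) print "idle" }
    '
}

# Hypothetical helper: poll every $2 seconds (default 60) until array $1
# no longer reports a recovery or resync.
wait_for_sync() {
    while [ "$(md_sync_progress "$1" < /proc/mdstat)" != "idle" ]; do
        sleep "${2:-60}"
    done
}

# Usage on a live system (not run here):
#   wait_for_sync md9 && echo "md9 is in sync"
```

mdadm also has a --wait option (mdadm --wait /dev/md9) that blocks until resync or recovery activity finishes. Either way, if the sync keeps restarting as described above, the wait will not return on its own, so keep an eye on /var/log/messages as well.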

Step 5. mdadm /dev/md9 -f /dev/sdX

Manually "fail" the bad drive.

Step 6. mdadm /dev/md9 -r /dev/sdX

Remove the bad drive from the array.

Step 7. mdadm --zero-superblock /dev/sdX

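Erase the "md" superblock on /dev/sdX to avoid any confusion that /dev/sdX might be part of an md array down the road. With the 0.90 and 1.0 metadata formats the superblock is written towards the end of the device; the 1.1 and 1.2 formats keep it near the start.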

Step 8. mdadm --grow /dev/md9 -n 2

Shrink the RAID-1 back down to 2 drives. Back to normal.
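For anyone who does this often, the eight steps can be collected into a small script. What follows is only a sketch of how I might lay it out: the function names, the split into two phases around the sync, and the DRY_RUN variable are all hypothetical. DRY_RUN defaults to 1, which only prints the commands, so nothing touches an array until you deliberately set it to 0.

```sh
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 1-3. Arguments: array name (md9), bad member (sdX), new member (sdN).
replace_phase1() {
    run sh -c "echo writemostly > /sys/block/$1/md/dev-$2/state"
    run mdadm --grow "/dev/$1" -n 3
    run mdadm "/dev/$1" -a "/dev/$3"
}

# Steps 5-8. Run this only after the sync from Step 4 has finished.
replace_phase2() {
    run mdadm "/dev/$1" -f "/dev/$2"
    run mdadm "/dev/$1" -r "/dev/$2"
    run mdadm --zero-superblock "/dev/$2"
    run mdadm --grow "/dev/$1" -n 2
}

# Usage (not run here):
#   replace_phase1 md9 sdX sdN
#   ... wait for the sync to complete (Step 4) ...
#   replace_phase2 md9 sdX sdN
```

The two phases are kept separate on purpose: the second one throws away the bad drive's copy of the data, so it should never run automatically while the sync is incomplete or looping.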


