joe shaw

kicking non-fresh information from google index

(Minor update below)

I have a RAID1 setup on my home machine. A few years back I bought two 100 GB drives because they were cheap, and I didn’t have that much data to store, so I RAIDed them together to get some data redundancy. I think I initially set it up in Fedora, but I’ve been using it in SUSE for quite some time. It all works great.

Some time ago, though, I started noticing that my RAID drives were out of sync. I was seeing this in my dmesg output:

md: raid1 personality registered for level 1
md: md0 stopped.
md: bind
md: bind
md: kicking non-fresh hdd1 from array!
md: unbind
md: export_rdev(hdd1)
raid1: raid set md0 active with 1 out of 2 mirrors

Google’s results for these messages were surprisingly unhelpful: mostly unanswered web forum posts. So I give you this blog post as a guide for your future problems, and mine.

First off, I have /dev/hdb1 and /dev/hdd1 in my RAID1 device, /dev/md0. That device is my /home partition. So step one is to log in as root and unmount /home:

# umount /home

The next step is to re-add the stale device:

# mdadm --manage /dev/md0 --add /dev/hdd1
mdadm: re-added /dev/hdd1

Then you can watch the /proc/mdstat file to follow the progress of the recovery:

Personalities : [raid1]
md0 : active raid1 hdd1[1] hdb1[0]
120623936 blocks [2/1] [U_]
[====>................] recovery = 21.5% (25939136/120623936) finish=48.5min speed=32360K/sec

unused devices: <none>
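
If you would rather not eyeball that file, here is a minimal sketch of two shell helpers that pull the interesting bits out of it. It assumes the mdstat layout shown above (a status like [U_] and a "recovery = N%" line under the array's entry); the function names and the MDSTAT variable are made up for this example and are not part of mdadm.

```shell
#!/bin/sh
# Sketch only: read rebuild state from mdstat-formatted text.
# MDSTAT is a hypothetical override so the helpers can be tried on a saved copy.
MDSTAT="${MDSTAT:-/proc/mdstat}"

# Print the recovery/resync percentage for an array (e.g. md0),
# or nothing if no rebuild is in progress.
md_recovery_percent() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":" { in_dev = 1; next }   # start of this array
        in_dev && NF == 0      { in_dev = 0 }          # blank line ends it
        in_dev && /(recovery|resync) *=/ {
            if (match($0, /[0-9.]+%/)) {
                print substr($0, RSTART, RLENGTH - 1)  # drop the % sign
            }
        }
    ' "$MDSTAT"
}

# Succeed if the array status (e.g. [U_]) shows a missing member.
md_is_degraded() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":" { in_dev = 1; next }
        in_dev && NF == 0      { in_dev = 0 }
        in_dev && /\[U*_[U_]*\]/ { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$MDSTAT"
}
```

With these defined, something like "while md_is_degraded md0; do md_recovery_percent md0; sleep 60; done" would print the percentage once a minute until both mirrors are back.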

You might also want to run mdadm in monitor mode while this is going on:

# mdadm --monitor /dev/md0
Dec 29 13:32:16: DegradedArray on /dev/md0 unknown device
Dec 29 13:44:16: Rebuild20 on /dev/md0 unknown device

The array stays usable during the rebuild, so you can remount the device and use it in the meantime.


Update (30 Dec 2006): lmb says that you don’t need to unmount the device to rebuild the array. I thought this might be true, but I had tried a bunch of different things before I was able to get it rebuilt, and I wasn’t sure if this was necessary. Jacob also says that there is a sysctl you can set to make it recover the RAID at a faster speed, but he wasn’t sure what it was. I only had 100 gigs to do, and it only took an hour, so that wasn’t a big problem for me. If anyone knows what it is, let me know and I’ll update the post.

Update (1 Jan 2007): The parameters for the RAID rebuild speed are /proc/sys/dev/raid/speed_limit_min and /proc/sys/dev/raid/speed_limit_max. Thanks Adam!
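
For reference, a small sketch of reading and raising those two limits from a shell. The paths are the ones given above; the function names, the RAID_SYSCTL_DIR variable, and the 50000 figure in the note below are assumptions made up for illustration. As far as I know the values are in kilobytes per second per device, and writing them on a real system needs root.

```shell
#!/bin/sh
# Sketch only: show and raise the md rebuild speed limits.
# RAID_SYSCTL_DIR is a hypothetical override for trying this on a scratch directory.
RAID_SYSCTL_DIR="${RAID_SYSCTL_DIR:-/proc/sys/dev/raid}"

# Print both limits as "name = value" lines.
show_raid_speed_limits() {
    for name in speed_limit_min speed_limit_max; do
        printf '%s = %s\n' "$name" "$(cat "$RAID_SYSCTL_DIR/$name")"
    done
}

# Raise the minimum rebuild speed to the given value.
set_raid_speed_limit_min() {
    echo "$1" > "$RAID_SYSCTL_DIR/speed_limit_min"
}
```

For example, running "set_raid_speed_limit_min 50000" as root and then "show_raid_speed_limits" should show the new minimum. The setting does not survive a reboot.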