Migrating from mdadm RAID 6 to btrfs RAID 5
Thursday, August 5 2021 · Reading time: 7 minutes · 1364 words · Tags: btrfs
Please note: This blog post is actually pretty boring. If you are expecting some btrfs RAID 5 or 6 secrets you’ll be disappointed!
At the end of 2013, after I had started working, I bought my first Synology NAS with two 3 TByte drives in a RAID 1. I quickly filled it up and ran out of storage space. I had a look at Synology devices with four or more slots, but they were too expensive for just doing storage tasks. Keep in mind that devices from that time only provided storage space over SMB or iSCSI and were not able to run Docker containers and the like. I decided to build a homeserver which would provide NAS features to my network and host additional services. So in mid-2014 I bought the hardware for it:
CPU: Intel Core i5-4440
Memory: 24 GByte
OS disk: 120 GByte SSD
Storage: 3x 3 TByte WD Red
Case: Fractal Design Define R5 (silenced)
I started out with a mdadm RAID 5. Whenever I needed more space I simply added a new disk and grew the RAID. At some point in time I decided I wanted more redundancy. I bought another WD Red and converted the RAID 5 to a RAID 6. At the end I was running 6x 3 TByte giving me 12 TByte of space.
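For reference, growing an mdadm array like that is roughly a two-step operation, followed by resizing the layers on top. A sketch with placeholder device and mapping names (your array, disk and LUKS names will differ):

```shell
# Add the new disk to the array; it joins as a spare first.
mdadm --add /dev/md0 /dev/sde1

# Grow the array onto it; mdadm reshapes in the background.
mdadm --grow /dev/md0 --raid-devices=5

# Afterwards the LUKS mapping and the ext4 filesystem on top
# have to be enlarged as well.
cryptsetup resize cryptdata
resize2fs /dev/mapper/cryptdata
```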
On the RAID md device I placed a LUKS partition for encryption, with ext4 on top of that. This setup ran pretty smoothly. I virtualized an OPNsense firewall on it, plus a couple of LXC containers running the services my network needed.
After all these years I finally came to the conclusion that I should renew the hardware. All the microcode updates had slowed down the Intel processor quite a bit. But I also wanted to get rid of ext4. Over the years, new filesystems had been developed with features I really liked, for example data checksums to detect and fix bit rot. There are currently two “usable” ones: btrfs and ZFS. Because I’m running Archlinux, ZFS was not an option as it would require fiddling with the AUR. So btrfs it was.
The goal: Get rid of mdadm + LUKS + ext4 and move on to a btrfs RAID 5.
After hours comparing hardware I decided to buy this combination:
CPU: AMD Ryzen 3 3200G (Fuck Intel!)
Memory: 16 GByte
OS disk: 128 GByte M.2 SSD
Storage: 3 x 10 TByte Seagate Exos X16
Case: Stayed the same
At first I wanted a CPU without integrated graphics. But apparently computers don’t boot without a GPU.
Yes! You heard right! I use btrfs RAID 5!
The btrfs status page (🖇️ 🔐) labels RAID56 as “unstable”. Apparently it has some issues with unclean shutdowns. Full disclosure: I did not truly understand the actual write hole problem, but I know that fixes are constantly merged into the kernel. And thanks to Arch I’m permanently receiving the newest kernel. Together with my backup (made from the old hard drives) I’m pretty confident I’ll be able to recover easily when something breaks. The files on my NAS are pretty static. I back up every time I update, which is every 4-6 weeks, depending on the urgency. Older files are in the backup, and newer files can normally be re-downloaded without any problems. Frequently changing and important files are backed up more often to the OS SSD. If my btrfs kicks the bucket I’m sure I can recover easily. Of course it would be annoying, but my life does not depend on it.
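Part of that confidence comes from scrubbing: after an unclean shutdown (or just periodically), a scrub reads every block, verifies its checksum and rewrites damaged copies from parity. A minimal sketch, assuming the filesystem is mounted at /mnt:

```shell
# Kick off a scrub; it runs in the background.
btrfs scrub start /mnt

# Check progress and the error counters while it runs.
btrfs scrub status /mnt
```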
My goal was to keep my homeserver up and running as long as possible during the migration. At first I wanted to hook up the three new drives into my homeserver, create a btrfs RAID 5 and rsync the data over. But I did not have enough free SATA ports and power connectors. So I came up with another plan.
While the system was still running, I pulled the two oldest drives from my mdadm RAID 6 (mounted at /mnt). As expected it went into degraded mode but remained operational. I put two of the new Seagate drives in and created a btrfs RAID 0, which I mounted at /mntNew. It took roughly 13 hours to copy the data over.
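The commands for that step looked roughly like this (the drive names are placeholders; note that on a multi-device btrfs the metadata defaults to RAID 1 while the data is striped):

```shell
# Create a two-device btrfs with striped (RAID 0) data.
mkfs.btrfs -d raid0 /dev/sdd /dev/sdg

# Mounting any member device mounts the whole filesystem.
mkdir -p /mntNew
mount /dev/sdd /mntNew

# Copy everything, preserving hard links, ACLs and xattrs.
rsync -aHAX --info=progress2 /mnt/ /mntNew/
```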
I unmounted the mdadm RAID 6 from /mnt, unmounted the btrfs RAID 0 from /mntNew and mounted it again at /mnt. All my services came up flawlessly. My btrfs was live. I closed the LUKS partition, stopped the mdadm RAID and pulled the remaining WD Red hard drives out.
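A sketch of that switchover, with hypothetical device and mapping names:

```shell
# Swap the mount points: the btrfs RAID 0 takes over /mnt.
umount /mnt
umount /mntNew
mount /dev/sdd /mnt          # any member device works

# Tear down the old stack.
cryptsetup close cryptdata   # the LUKS mapping name will differ
mdadm --stop /dev/md0        # the array device may differ too
```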
Rebalance from btrfs RAID 0 to RAID 5
Now with space for the last remaining Seagate drive I was able to convert to a RAID 5:
btrfs device add /dev/sdf1 /mnt
btrfs balance start -dconvert=raid5 /mnt
I had a look at the usage report and saw this:
Data,RAID0: Size:3.79TiB, Used:3.73TiB (98.47%)
   /dev/sdd1    1.90TiB
   /dev/sdg1    1.90TiB

Data,RAID5: Size:1.46TiB, Used:1.41TiB (96.52%)
   /dev/sdd1  749.00GiB
   /dev/sdg1  749.00GiB
   /dev/sdf1  749.00GiB
It created new RAID 5 block groups and moved the data over from the RAID 0 ones, chunk by chunk. btrfs is awesome!
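A report like that can be pulled at any time to watch the conversion make progress:

```shell
# Per-profile allocation, split by device: the RAID 0 block
# groups shrink while the RAID 5 ones grow.
btrfs filesystem usage /mnt

# The running balance can be monitored directly as well.
btrfs balance status /mnt
```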
Note: Up to this point the Intel CPU was still in service and no reboot had been done. I was able to do all of this live, without any interruption except the short time between shutting down the LXC containers on ext4 and restarting them on btrfs.
While the rebalance was running I prepared for the hardware swap. First of all, I created a bootable Archlinux USB stick. The operating system was still running off an SSD connected to one of the SATA ports. For the new mainboard I had bought an M.2 SSD to host the OS. My plan after the mainboard swap:
- Connect “old” SSD to the mainboard
- Boot from USB to a live Archlinux
- Copy the “old” SSD onto the “new” SSD with dd (1:1 copy)
- Unplug the old SSD
- Boot from M.2 SSD into the OS without any problem
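The dd step in particular deserves care, since a swapped if= and of= destroys the source disk. A sketch with placeholder device names (verify them first):

```shell
# Identify both disks; /dev/sda (old SATA SSD) and
# /dev/nvme0n1 (new M.2 SSD) are placeholders.
lsblk -o NAME,SIZE,MODEL

# Raw 1:1 copy, old SSD onto new SSD.
dd if=/dev/sda of=/dev/nvme0n1 bs=4M status=progress conv=fsync
```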
Problem: I was also switching the CPU vendor, something I had never done before. But the Arch Wiki (🖇️ 🔐) had me covered: I basically only needed to swap the microcode package. In case something went wrong I could simply boot from the old SSD again.
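On Arch the microcode swap boils down to exchanging one package and making sure the boot entry loads the right image; a sketch, assuming a standard mkinitcpio setup:

```shell
# Swap the vendor microcode package.
pacman -Rns intel-ucode
pacman -S amd-ucode

# If the bootloader entry references intel-ucode.img as an
# initrd, point it at amd-ucode.img instead, then rebuild
# the initramfs images.
mkinitcpio -P
```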
I don’t know how long the rebalance was running, but it finished overnight. The last step until completion was to shut down the computer and do the actual hardware swap. I don’t know what to tell you here, to be honest: rip out the old parts, put in the new parts, plug in the cables. Done :-)
Thanks for reading this pretty uninteresting blog post! I know you probably expected more. But after writing it I realized it’s actually kind of boring.
By now this setup has been running for three months without any problems.
Tips and tricks
If you plan to run a torrent application on your btrfs filesystem, consider disabling the copy-on-write (CoW) feature on your download folder. The torrent protocol transmits small chunks of the file in question out of order and stores them in a preallocated file, causing fragmentation. Every time such a chunk is written to disk, btrfs would create a copy, adding unnecessary overhead.
You can disable CoW for specific files and directories. Transmission, for example, stores its data in
/var/lib/transmission/Downloads. By issuing a
chattr +C /var/lib/transmission/Downloads you set the special attribute on the folder. Note that this does not affect already existing files; only files created in the directory afterwards inherit the flag. You should see a capital C in the output of
[root@homeserver transmission]# lsattr
Whenever a new file is created there, the flag is automatically transferred onto it and CoW is disabled for it. You should also apply this to the disk images of virtual machines.
You can read more about it here: https://btrfs.wiki.kernel.org/index.php/FAQ#Can_copy-on-write_be_turned_off_for_data_blocks.3F
Create files wasting storage!
As mentioned previously, btrfs is prone to fragmentation, which can lead to false “no space left on device” errors. To combat that, I created multiple files in a directory and filled them with random data:
for i in $(seq 1 30); do touch saver-$i.bin; chattr +C saver-$i.bin; dd if=/dev/urandom of=saver-$i.bin bs=20M count=100 status=progress; done
This will create 30 files of 2 GByte each. As stupid as this may sound: if you ever need space, there it is! Just delete one of the saver files.
You have a comment, a wish or an improvement? Write me an e-mail! The details are here.
🖇️ = link to another website
🔐 = website uses HTTPS (encrypted transport)