Linux/Fedora: Encrypt /home and swap over RAID with dm-crypt
© 2006 Justin Wells

Do you have important company files on your home PC that you can neither afford to lose nor let fall into the wrong hands? Perhaps you have personal and business emails or other files you must keep not only safe but also secure. How much would you suffer if your machine were stolen or seized, and fell into the hands of a competitor, a lawsuit-happy adversary, or both?

If you can't afford to lose your intellectual property either to a drive crash or to an adversary, this page is for you! In this guide I'll explain how to keep your private data (on your /home partition) safely mirrored over RAID1 to a second disk, all the while keeping it encrypted with modern, strong cryptography. While we're at it, we'll defend your swap space too, since your private files spend some time in RAM, and therefore, in swap.

This page explains how to set up encrypted RAID1 ext3 filesystems with dm-crypt, along with an encrypted RAID0 swap, on RedHat / Fedora Core 5, using the twofish encryption algorithm and dm-crypt's new ESSIV mode. A variety of benchmarks showing the performance of different cipher and block combinations are also provided, as well as init.d/rc.d scripts to set up the encrypted partitions on system boot.


The commands on this page format entire disk partitions and therefore have the potential to destroy all of the data on your system. You absolutely must back up all of your data before trying anything below.


Back up your data: You have been warned!


What you require
I set this up on Fedora Core 5 with a 2.6.16 Linux kernel. I believe you can use a kernel as old as 2.6.12 with this approach, but with anything older your system will not be secure using the approach that I have described. With older kernels you are better off using loop-AES instead of dm-crypt (see below).

The instructions and scripts below are specific to Fedora Core 5, but if you have a different Linux distribution it should mostly be the same idea--you may have to adapt in a few places; you will almost certainly have to adapt the init scripts.

To set up a RAID array you obviously need two or more disks. I have two and opted for a RAID1 setup for /home; if you have three or more disks you could opt for RAID5 instead. See below for more info on your various RAID choices.

In the following I assume you already have Fedora Core 5 installed on your root partition, and that you have two or more disks with enough empty space to create the RAID partitions for swap and /home.


How to set up your RAID arrays
I decided that I wanted to have my /home partition on a RAID1 (mirrored) between my two disks, and my swap partition on RAID0 (striped) between the two disks. See below for my reasoning--you may choose differently.

Here is how I set up the RAID partitions:

  1. Run Disk Druid or some other partitioning tool, and create two equal-sized partitions for your /home directory and two equal-sized partitions for your swap.
    • Write down what your partitions are for, in my case:
      • root: /dev/sda1 (I assume it's already there)
      • /home: /dev/hda2 and /dev/sda2
      • swap: /dev/hda1 and /dev/sda3
    • WARNING: Back up your data, especially anything in /home, because we are about to blow away /home, and you run some chance of losing everything if something goes wrong.

    • HINT: You will get a big speed increase if you ensure that the partitions within each RAID array are on separate channels (i.e. separate physical HD cables).

  2. Log in as root and set up the two raid arrays. Before you execute the following commands double check that you have got all the partition identifiers exactly right. An error here could cost you all of your data on all of your disks.
            # make /dev/md0 a RAID1 array for /home
            mdadm --create /dev/md0 --level 1 -n 2 /dev/sda2 /dev/hda2 
            mdadm --detail /dev/md0 # have a look
            # make /dev/md1 a RAID0 array for swap
            mdadm --create /dev/md1 --level 0 -n 2 /dev/hda1 /dev/sda3 
            mdadm --detail /dev/md1 # have a look

  3. Edit /etc/mdadm.conf and make sure that it is correct.
    • There should be two lines, one for each RAID array.
    • The UUID values MUST match the values returned by the mdadm --detail commands above.
    If mdadm.conf is not set up properly you may find that your system does not find both RAID arrays on a reboot.
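One way to double-check the UUIDs in step 3 is to compare them mechanically rather than by eye. The following is a minimal sketch, not part of the original procedure: the helper names uuid_from_detail and uuid_from_conf are made up for illustration, and the parsing assumes the usual "UUID : ..." line in mdadm --detail output and a "UUID=..." word on the ARRAY line in mdadm.conf.

```shell
#!/bin/sh
# Sketch: compare the UUID reported by `mdadm --detail` with mdadm.conf.
# Helper names are hypothetical; adjust the parsing if your output differs.

# Print the array UUID found in `mdadm --detail` output supplied on stdin.
uuid_from_detail() {
    sed -n 's/^[[:space:]]*UUID[[:space:]]*:[[:space:]]*\([0-9a-fA-F:]*\).*/\1/p' | head -n 1
}

# Print the UUID on the ARRAY line for device $1 in config file $2.
uuid_from_conf() {
    awk -v dev="$1" '$1 == "ARRAY" && $2 == dev {
        for (i = 3; i <= NF; i++)
            if (tolower($i) ~ /^uuid=/) { sub(/^[^=]*=/, "", $i); print $i; exit }
    }' "$2"
}

# Typical use, as root (commented out because it needs a real array):
#   [ "$(mdadm --detail /dev/md0 | uuid_from_detail)" = \
#     "$(uuid_from_conf /dev/md0 /etc/mdadm.conf)" ] && echo "md0 matches"
```

Note that mdadm --detail --scan prints ARRAY lines in mdadm.conf format, which can serve as a starting point for the file.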
Congratulations! You now have your RAID up and running. The next step is to set up encryption on the arrays.


How to set up a dm-crypt encrypted /home on your RAID1 array
The first step is to fill your new raid partition with pseudo random data. This will prevent an attacker from being able to tell how much of your filesystem is used vs. unused, and which parts have real data vs. empty data: the whole partition will look like random noise. It'll also over-write any data you may have had there previously (although look into "shred" and other tools if you need it really, really, really gone.)

Here is the command to fill a partition with random data. Be very careful that you have got the partition identifier exactly right: you are about to destroy data! Double check, then run this:

    dd if=/dev/urandom of=/dev/md0 bs=1M
Note that this command might take overnight to run if you have a big partition!
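To get a rough idea of how long "overnight" might be, divide the partition size by the rate at which your machine can produce random data. The sketch below is only arithmetic; the function name fill_hours and the sample figures (250 GiB at 8 MiB/s) are made up for illustration and are not measurements from my system.

```shell
#!/bin/sh
# Sketch: estimate the hours needed to fill a device with random data.
# $1 = device size in GiB, $2 = sustained write rate in MiB/s.
# A rate for your machine can be measured with something like:
#   dd if=/dev/urandom of=/dev/null bs=1M count=100
fill_hours() {
    awk -v gib="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", gib * 1024 / rate / 3600 }'
}

# Hypothetical example: a 250 GiB partition at 8 MiB/s.
fill_hours 250 8
```

With those made-up numbers the estimate comes to 8.9 hours.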

Next, make sure you have the twofish cipher installed in your kernel:

    modprobe twofish
Now set up an encrypting device mapper on your /dev/md0 partition (or whichever partition). You will be prompted for a password: Use a long sentence. Your "password" should have at least 20-40 characters of text, and preferably include some numbers as well. Think of something special to you, and make a 5-6 word or longer sentence about it that you will easily remember without ever having to write down.

If you choose an 8 character password an attacker who acquired your system could systematically try every possible 8 character combination until they broke in. Choose something stronger: the encryption is strong, don't let your password be the weakest link in the chain.
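The difference between a short password and a long one can be put in numbers. The sketch below computes an upper bound on the strength of a password in bits, assuming every character is chosen uniformly at random from the given alphabet; the function name entropy_bits is made up for illustration. A real sentence is much more predictable than random characters, so its true strength is considerably lower than this bound, which is one more reason to make it long.

```shell
#!/bin/sh
# Sketch: upper bound on password strength in bits.
# $1 = number of characters, $2 = size of the alphabet they are drawn from.
# Assumes uniformly random characters, which a natural sentence is not.
entropy_bits() {
    awk -v n="$1" -v a="$2" 'BEGIN { printf "%.1f\n", n * log(a) / log(2) }'
}

entropy_bits 8 95     # 8 random printable ASCII characters
entropy_bits 30 27    # 30 random lowercase letters or spaces
```

The first call prints 52.6 and the second 142.6.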

Here is the command to set up encryption on your /dev/md0 RAID device:

    cryptsetup -y -c twofish-cbc-essiv:sha256 create home /dev/md0
The -y causes it to ask for your password twice, to make sure you typed it properly. We chose cipher "twofish-cbc-essiv:sha256". If you want to use AES or serpent instead (see below) you would use something like "aes-cbc-essiv:sha256".

Note that we specified "ESSIV", which is not the default. It is important to specify ESSIV mode for all encrypted filesystems, as without ESSIV your system will be vulnerable to very serious watermark and known plaintext attacks (see the section on why dm-crypt below).
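Since a mapping created without ESSIV looks just the same from the outside, it can be worth confirming which cipher spec the device mapper actually recorded. The sketch below parses a line of dmsetup table output; the helper names cipher_from_table and uses_essiv are made up, and the parsing assumes the cipher spec is the word following "crypt" on that line.

```shell
#!/bin/sh
# Sketch: check that a dm-crypt mapping was created with an ESSIV cipher spec.
# Helper names are hypothetical.

# Print the cipher spec from a `dmsetup table` line for a crypt target (stdin).
cipher_from_table() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "crypt") { print $(i + 1); exit } }'
}

# Succeed if the cipher spec read from stdin names an ESSIV mode.
uses_essiv() {
    case "$(cipher_from_table)" in
        *-essiv:*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Typical use, as root (commented out because it needs a live mapping):
#   dmsetup table home | uses_essiv && echo "home uses ESSIV"
```

The command cryptsetup status home reports the cipher in a more readable form if you just want to look.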

Next we have to make a new filesystem on the encrypted device, with a regular mkfs.ext3 command:

    mkfs.ext3 -b 4096 -R stride=8 /dev/mapper/home
The precise options to this command will depend on your setup:
  • My benchmarking suggests 4k blocks have better performance under encryption, just as they ordinarily do (see below).
  • stride is a RAID tuning parameter. The filesystem block size times stride should equal your RAID stripe (chunk) size. With a RAID1 there is no stripe size; however my testing showed that stride=8 still provided some performance benefit over no stride. If you have RAID5 or RAID10 be sure to set stride correctly.
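For a striped array the stride follows directly from the rule above: divide the chunk size by the filesystem block size. A minimal sketch, where the function name stride_for and the 64 KiB chunk in the example are assumptions for illustration (mdadm --detail shows the real chunk size of your array):

```shell
#!/bin/sh
# Sketch: stride = RAID chunk size / filesystem block size.
# $1 = chunk size in KiB, $2 = filesystem block size in bytes.
stride_for() {
    echo $(( $1 * 1024 / $2 ))
}

# Hypothetical example: 64 KiB chunks with 4096-byte blocks.
stride_for 64 4096
```

That example prints 16, so the matching mkfs.ext3 option would be stride=16.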
Finally it's time to mount your new encrypted partition:
    mount -o noatime /dev/mapper/home /home
The "noatime" option prevents the filesystem from updating the metadata every time you access a file. If you read from many small files it has a potential performance advantage; it does mean you will be unable to tell the last time a file in your system was read (do you ever? I never do.)


Mounting your dm-crypt encrypted /home on boot
It isn't possible to mount this partition automatically! You need to type in a password. What I did was set up an init.d / rc.d script that would prompt for the password on boot. If no password is entered within 60 seconds the boot continues without the partition mounted--in that case you'll have to log in later as root and explicitly mount it.

Here is an init.d script that works for Fedora Core 5. You would install this in /etc/rc.d/init.d as encrypted_home:

Once you have installed that file add it to your boot process:
    ln -s /etc/rc.d/init.d/encrypted_home /etc/rc5.d/S50encrypted_home
    ln -s /etc/rc.d/init.d/encrypted_home /etc/rc3.d/S50encrypted_home

The script is configured to mount /dev/md0 on /home using the cipher "twofish-cbc-essiv:sha256". If you set up something differently then it won't work for you until you edit it to change those values.
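For readers adapting this to another system, here is a minimal sketch of what such an init script could look like. It is not my Fedora script: it is a stripped-down illustration built from the commands used earlier on this page, with a DRY_RUN switch (an addition for testing) that prints the commands instead of running them. The -t option asks cryptsetup to give up waiting for the password after the given number of seconds.

```shell
#!/bin/sh
# Sketch of an init script for an encrypted /home. Illustration only; the
# values below match the setup described on this page and must be edited
# for anything different.
DEVICE=/dev/md0
NAME=home
MOUNTPOINT=/home
CIPHER=twofish-cbc-essiv:sha256
TIMEOUT=60

# With DRY_RUN=1 commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

start() {
    run modprobe twofish
    # Prompt for the password; stop waiting after TIMEOUT seconds.
    run cryptsetup -t "$TIMEOUT" -c "$CIPHER" create "$NAME" "$DEVICE" || return 1
    # A mistyped password yields a mapping that cannot be mounted,
    # so tear the mapping down again if the mount fails.
    run mount -o noatime "/dev/mapper/$NAME" "$MOUNTPOINT" || {
        run cryptsetup remove "$NAME"
        return 1
    }
}

stop() {
    run umount "$MOUNTPOINT"
    run cryptsetup remove "$NAME"
}

# Dispatch only when called with an argument, as init does.
if [ "$#" -gt 0 ]; then
    case "$1" in
        start) start ;;
        stop)  stop ;;
        *)     echo "Usage: $0 {start|stop}" >&2; exit 1 ;;
    esac
fi
```

Running it as DRY_RUN=1 sh encrypted_home start shows the commands it would execute without touching any device.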

Your system should now boot properly--TEST IT! First try that init script a few times to see that you can start/stop the /home mount properly. If that works test a reboot--you want to find out now that there's a problem, not some day in two weeks when you have forgotten all about this.

Note: If your system does not boot properly, then when you hit the grub boot screen, (e)dit the kernel boot line and add a '1' at the end of the kernel line so that it boots to single user mode. This will get you back into the system as root so that you can undo the damage and figure out what went wrong.


Setting up an encrypted swap on your RAID0 array
The process for setting up an encrypted swap is similar in theory to setting up the encrypted /home, only we don't have to worry about a password so in practice it is much, much simpler. We will use a random password for swap each boot. That means that swap can never be "recovered" after a crash, but that's OK for swap.

I assume you have set up a RAID0 partition for your swap, as per the instructions above. Once you have done that, you can enable/disable it using this script, which you should install in your /etc/rc.d/init.d directory:

The script is configured to mount /dev/md1 as swap. If you have set up some different partition to mount as swap then you will have to edit the variables at the top of the script to match your values.

When you have a look at the script you will notice a few things:

  • It uses /dev/random as a password file (random password)
  • It uses cipher twofish-cbc-plain
Why does it use -plain instead of -essiv? We are using random passwords so there is little chance anyone will be able to look one up in a password dictionary, for one thing. Second, it DOES leave the swap open to watermark attacks. If that bothers you change "-plain" to "-essiv:sha256" and eliminate the watermark problem at a slight performance cost. In my opinion the odds of a watermarked file winding up in swap are low enough that I would rather have the slight improvement in swap performance that -plain buys. Decide for yourself.
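The core of such a swap script is only three commands. The sketch below is an illustration built from the description above, not my actual script; the DRY_RUN switch is an addition so that it can be tried without touching a device. The -d option tells cryptsetup to read the key from a file instead of prompting.

```shell
#!/bin/sh
# Sketch of an encrypted swap script. Illustration only; edit the values
# below to match your setup.
DEVICE=/dev/md1
NAME=swap
CIPHER=twofish-cbc-plain   # twofish-cbc-essiv:sha256 avoids watermarking
KEYFILE=/dev/random        # a fresh random key on every boot

# With DRY_RUN=1 commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

start() {
    run cryptsetup -c "$CIPHER" -d "$KEYFILE" create "$NAME" "$DEVICE" || return 1
    # The old contents are unreadable under the new key, so swap has to be
    # formatted afresh each time.
    run mkswap "/dev/mapper/$NAME" || return 1
    run swapon "/dev/mapper/$NAME"
}

stop() {
    run swapoff "/dev/mapper/$NAME"
    run cryptsetup remove "$NAME"
}

# Dispatch only when called with an argument, as init does.
if [ "$#" -gt 0 ]; then
    case "$1" in
        start) start ;;
        stop)  stop ;;
        *)     echo "Usage: $0 {start|stop}" >&2; exit 1 ;;
    esac
fi
```

If reading the key from /dev/random stalls early in boot while the kernel gathers entropy, /dev/urandom is the non-blocking alternative.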

Carrying on, you need to create two symlinks to this script, so that it will be picked up during your boot process:

    ln -s /etc/rc.d/init.d/encrypted_swap /etc/rc5.d/S50encrypted_swap
    ln -s /etc/rc.d/init.d/encrypted_swap /etc/rc3.d/S50encrypted_swap

Again, make sure you edit the script to match your setup, and then TEST the script on your system. Try it a few times from a root shell to verify swap is created and destroyed properly, and then verify that swap is properly created on a reboot.


Special risks you should be aware of
There are a few issues you should be aware of when choosing this approach:
  • Filesystem corruption on an encrypted volume is MORE DAMAGING than on a conventional volume. Entire blocks of encrypted data are likely to be left undecryptable by even a single bit error.

  • RAID1 protects you against a drive crash but not against a power failure. A RAID partition is likely to experience some loss of data on a power failure, since both drives power off simultaneously.
The combination of these two effects means that you should be fairly conservative in your choice of filesystem. Use ext3 with the default, conservative journaling options. Do not attempt to use a filesystem which journals only metadata (reiser, XFS) as that will compound the problem further. Do not attempt to mount an ext3 filesystem over RAID over loopback mounted files: you lose the write ordering guarantees upon which your journaled filesystem depends. If you must use loopback mounted files, switch back to plain old ext2 without any journal.


Why just /home and swap? Why not encrypt the root filesystem?
This is another one of those cases where I am trading off a bit of security to buy speed and convenience. An encrypted root filesystem would mean I could not remote reboot my machine, it would be slower, and in my particular situation, I don't think it would buy me that much additional security. There are other guides out there that will help you encrypt your whole root filesystem if it makes sense for you.

If I had a notebook computer I would have opted to encrypt the root filesystem. You carry notebooks around with you on business, and sometimes leave them unattended. It is possible for an attacker with physical access to your harddrive to mount it with a rescue disk, install trojaned versions of core system applications, and then later recover all your passwords and sensitive data. An encrypted root filesystem prevents this sort of attack, as the attacker can't modify the contents of the filesystem.

My system is a desktop in my home. I am not worried about someone sneaking in to install trojan programs, then sneaking out again. I am worried about people breaking in over the network, and I am worried about outright theft or seizure of my hardware. Therefore, I opted not to encrypt the root filesystem, but rather, to encrypt just my personal data, which is everything in the /home partition so that, if my machine is seized or stolen, that data is not accessible to the culprit.

None of this protects me from someone who breaks root on my system. If someone can do that while my system is up and running it's game over no matter what I encrypt. Similarly, if someone can seize my computer and then return it to me without my noticing, it's game over. I will have to rely on other means, such as an alarm system, and software packages like SELinux and Tripwire, to protect against such attacks.


Why dm-crypt instead of cryptoloop or loop-AES?
Aside from dm-crypt there are two other methods of encrypting filesystems: loop-aes, and cryptoloop. It is easy enough to dispense with cryptoloop: it is insecure, and it has been deprecated in favour of dm-crypt. There are fairly serious problems with cryptoloop:
  • Known plaintext attacks. Since no salt is used to encrypt blocks of data the NSA or some other resourceful attacker can build a database of all possible keys (or at least the most common few billion), locate blocks likely to contain known plaintext (such as filesystem superblock and inode tables) and look up your password quickly and efficiently in a massive password cracking database.

  • Watermark Attacks. Every block of data is encrypted the same way no matter where it is on the disk with cryptoloop so it is possible to "watermark" a file such as an mp3, mpeg, or document so that it has identical blocks of bits that repeat in a specific, known order, and which serve as a signature by which the file can be identified. Identical blocks will appear in the ciphertext as well (as some random value) which repeat in the same order as in the original, so that the publisher can prove with high certainty that your filesystem contains their watermarked file. It is quite easy to insert such watermarking into standard mp3's and documents, so this is a serious issue.

Very recent versions of dm-crypt, and most versions of Loop-AES, are free from both of these problems.

DM-Crypt resolved these problems with the introduction of ESSIV, which stands for "encrypted salt-sector initialization vector". The initialization vector for each sector is computed by encrypting the sector number under a salt derived from a hash of the key, so it is different for every sector on disk and cannot be predicted without the key. This defeats the known plaintext and watermark attacks described above: identical blocks of data encrypt to different ciphertext because their initialization vectors differ. Note that ESSIV is not the default, so dm-crypt is insecure by default; you must explicitly enable ESSIV mode to create a secure encrypted filesystem.
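The ESSIV computation itself is small enough to try by hand: hash the key to get a salt, then encrypt the sector number (as a little-endian number padded with zeros to one cipher block) under that salt. The sketch below does this with the openssl and sha256sum command line tools. It is an illustration only: AES-256 stands in for the cipher because openssl does not provide twofish, the key is a made-up string rather than a real volume key, and the function name essiv_iv is hypothetical. The kernel does all of this internally.

```shell
#!/bin/sh
# Sketch: compute an ESSIV-style IV for one sector.
# Illustration only: AES-256 stands in for the disk cipher, and the key is an
# arbitrary string. Requires the openssl and sha256sum commands.

# Print the IV (32 hex digits) for sector $2 under key string $1.
essiv_iv() {
    # salt = SHA-256 of the key, as 64 hex digits
    salt=$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)
    sector=$2
    {
        # the sector number as 8 little-endian bytes...
        i=0
        while [ "$i" -lt 8 ]; do
            printf "\\$(printf '%03o' $(( (sector >> (8 * i)) & 255 )))"
            i=$((i + 1))
        done
        # ...zero-padded to one 16-byte cipher block
        printf '\000\000\000\000\000\000\000\000'
    } | openssl enc -aes-256-ecb -K "$salt" -nopad | od -An -tx1 | tr -d ' \n'
    echo
}

essiv_iv "example key" 0
essiv_iv "example key" 1
```

The two calls print unrelated-looking values even though the sector numbers differ by one, which is what keeps identical plaintext in different sectors from producing identical ciphertext.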

Loop-AES used to be the only secure option for encrypting Linux filesystems. It is also the most stable and mature, and possibly the fastest. All in all, it ought to be the leader of the pack, except for one thing: through politics and in-fighting the Loop-AES developers have alienated the Linux kernel team, including Linus himself. As a result loop-AES is not supported by the mainline Linux kernel, nor is it supported by the Fedora Core 5 package manager. If you want to run Loop-AES you have to install custom versions of mount, losetup, and the loop.ko kernel module. This is the main reason why I do not run loop-AES: I rely heavily on my package manager for security updates and general maintenance, and I do not want to run the risk that the next yum upgrade will leave my system unusable.

On Debian systems you may fare better with loop-AES: I have heard that you can apt-get install loop-AES and get somewhere. I do not know how reliable this is. Even on a Debian system I would be a little leery of installing custom/hacked versions of mount, losetup, and loop.ko, although on the other hand loop-AES has been around for quite a while so the hacks are likely well tested by now.

If you are running a kernel older than 2.6.12 then loop-AES is still your only viable option. If you want a securely encrypted system with a 2.2 or 2.4 kernel your only option is to abandon your package manager and install and maintain the loop-AES utilities by hand.


Why twofish instead of serpent or AES?
When the U.S. government, through the National Institute of Standards and Technology (NIST), set out to search for a replacement for the previous encryption standard, DES, it launched a public process involving many cryptographers and evaluated a great many encryption algorithms. Eventually, through many rounds of expert consultation and review, the field was narrowed down to five really good, fast, flexible encryption algorithms:
  • Rijndael (now known as AES)
  • Serpent
  • Twofish
  • MARS
  • RC6
RC6 was abandoned for intellectual property reasons, and few of the reviewers liked MARS. The other three were serious contenders, and all are available to you in the modern Linux kernel. All three are fast, flexible, cryptographically strong, and very well reviewed. In the end, Rijndael was selected and it is now known as AES. Strong arguments can still be made in favour of twofish or serpent as well.

Which should you use? The standard answer is, of course, AES, because it's the new standard. However, for an encrypted filesystem speed is of more importance than it is in many situations--at least it is to me. I elected to choose between the three based on performance benchmarks, counting on the fact that all three of them are strong enough cryptographically to meet my needs.

Through many, many benchmarks (some of which are summarized below) twofish was consistently faster than either serpent or AES in my testing, on my system, with my hardware, and using the implementation of the algorithms that is in my kernel, 2.6.16. You might get different results on your hardware, and the results may change if faster versions of some of the algorithms are checked into newer kernels. For now I am confident in saying that, as of today, twofish is a good strong solid choice that is, currently, faster than the others, at least for me.


Why choose RAID1 for /home and RAID0 for swap?
The particular RAID configuration you will want to use depends on what you expect from your system. There are advantages and disadvantages of each RAID level:
  • RAID0 (striping) increases the speed of your filesystem but also increases your odds of losing data in a crash.

  • RAID1 (mirroring) decreases the speed of your filesystem, but reduces your risk of losing any data.

  • RAID5 and RAID10 offer speed and reliability benefits, but require more independent disks in your computer: at least three for RAID5 and at least four for RAID10.
With only two drives in my desktop my choice is between RAID1 and RAID0. If you have enough disks for it, you should consider RAID5. For /home there is no question which one is the better choice: RAID1. I absolutely do not want to lose my data. The performance advantage of RAID0 simply isn't worth the added risk. It probably isn't for you either, even if you think you are a speed freak. Think about it.

What you should use for swap is a more interesting question. If you swap to RAID0, as I do, your swap will be faster, but if you lose either drive you will probably suffer a system crash. On the other hand, if you swap to RAID1 your swap will be a lot slower, but if a non-root drive crashes your kernel has some chance of surviving it. If you mount root on RAID1 as well then you absolutely can survive a drive crash without loss of service.

For a desktop it is acceptable to me to suffer a system crash on a drive failure, providing my important personal data is not lost. I am even willing to have to reinstall the OS from CD to recover my system. I do not care so much about "loss of service". For a server machine the equation is entirely different--I would mount both root and swap on a RAID1 or RAID5. (I would probably still only encrypt application data, assuming a physically secure server room.)

As you can see you have some choices to make here: Do you want your system to stay up during a drive crash? If so, root and swap have to be on RAID1 or better.

Benchmarking AES, twofish, and serpent
Here are the benchmarks I have, summarized. They were collected by running bonnie++ three times against each of the algorithms with the following arguments:
     bonnie++ -b -s 10000 -n 100 -u justin
That forces a sync on every write (defeats buffering), and uses a 10,000 MB (roughly 10 GB) test file and about 100k small files (-s is given in megabytes and -n in multiples of 1024 files). If you want to try this loadtest on your system I have provided the loadtesting script I ran to generate the data. You will have to edit it! The loadtest takes quite a long time to run so it's useful to be able to start an iteration before you go to bed, then check it in the morning. That also ensures that your computer will be idle while the test runs :) Note that this script repeatedly destroys a partition on your disk--be careful!

Here's a summary of my testing--I took the best two of three runs for each algorithm and averaged the values. You can have a look at the actual bonnie output as well.
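The "best two of three" averaging is easy to reproduce on your own bonnie++ numbers. A minimal sketch, assuming higher values are better (true for the throughput and per-second columns used here); the function name best_two_avg is made up and the sample values are placeholders, not measurements:

```shell
#!/bin/sh
# Sketch: average the best (highest) two of three measurements.
# $1, $2, $3 = the three values for one column, e.g. block output in K/sec.
best_two_avg() {
    printf '%s\n' "$1" "$2" "$3" | sort -n | tail -n 2 |
        awk '{ sum += $1 } END { printf "%.1f\n", sum / 2 }'
}

# Placeholder values only.
best_two_avg 10 20 30
```

The placeholder call prints 25.0: the lowest run is dropped and the other two are averaged.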

Protocol    Block Output (K/sec)    Block Input (K/sec)    Seeks/sec    Rnd. Creates/sec

Oddly, all three were faster at writing than at reading data, the reverse of the usual hard-drive behaviour. In any case, twofish is the across-the-board winner: it is faster than AES on every measure. Serpent was also slower than twofish on reading, though in the same range for writing speed.

While we're on the subject of benchmarking, I also tested various block sizes and stride settings to determine the fastest configuration for twofish running under a RAID1 on my system. I ran fairly exhaustive tests, but aside from ridiculously small 1024-byte blocks, there weren't large differences in performance between different block and stride settings. There was a slight speedup using 4096-byte blocks with stride=8, but it wasn't huge. Nevertheless I'm using that value because, well, I have to use something, so I might as well use the slightly faster one!

