Slow ZFS on an SSD? Maybe you aren't trimming
28 July, 2024
A few months ago my desktop started getting slower. For a week it was semi-usable, but soon it was taking minutes to boot.
It was just barely responsive enough to run some benchmarks, which showed that CPU performance was fine but anything disk-related was excruciatingly slow.
What can go wrong with SSDs?
SSDs need to be told which blocks are no longer in use (this is what "trim" does) in order to maintain adequate performance.
The easiest way to do this is to have a timer periodically run fstrim on all mounted filesystems that support it. I had this in my NixOS configuration for many years:
{
  services.fstrim.enable = true;
}
What if the block device is encrypted?
I use full disk encryption with LUKS, which presents a virtual block device on which you mount a regular filesystem. For reasons that sound highly flimsy to me (I'm not a security expert), LUKS does not by default pass trim commands down to the underlying block device.
Thankfully I was aware of the footgun and again had this in my NixOS configuration:
{
  boot.initrd.luks.devices = {
    root = {
      device = "/dev/disk/by-uuid/fa0fcd31-bd41-41a2-a1ff-b122e8bc67c0";
      allowDiscards = true;
    };
  };
}
allowDiscards means allow trim on the SSD.
So what went wrong?
I'm glad my suspicion that the issue was trim-related was strong enough to make me investigate further.
I use ZFS on my machines because I like to keep my data, but I am a somewhat unenthusiastic user. Maybe ZFS was a good citizen in its native Solaris environment, but on Linux, OpenZFS has a big personality, and integrates extremely poorly with the rest of the system.
A quick web search confirmed that ZFS does not listen to regular fstrim, so I tried:
zpool trim zroot
A few hours later, my desktop was totally back to its regular snappiness!
Now I have the key line in my NixOS configuration:
{
  services.zfs.trim.enable = true;
}
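For reference, the three pieces from this post can be combined into one fragment. Treat this as a sketch rather than a drop-in config: the UUID is the one shown earlier and will differ on your machine, and the explicit interval lines are an assumption on my part, included only to show where the schedule can be changed (they take a systemd calendar expression).

```nix
{
  # Periodic fstrim for regular filesystems. ZFS does not respond to this.
  services.fstrim = {
    enable = true;
    interval = "weekly"; # assumed schedule, adjust to taste
  };

  # Let trim (discard) requests pass through LUKS to the SSD.
  boot.initrd.luks.devices.root = {
    device = "/dev/disk/by-uuid/fa0fcd31-bd41-41a2-a1ff-b122e8bc67c0";
    allowDiscards = true;
  };

  # Periodic `zpool trim` for ZFS pools.
  services.zfs.trim = {
    enable = true;
    interval = "weekly"; # assumed schedule, adjust to taste
  };
}
```

Both trim services are needed on a machine like this one only if it has non-ZFS filesystems as well as ZFS pools; the LUKS setting applies to whichever of them sits on the encrypted device.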