void-linux/void-runit

How should 03-filesystems import/mount ZFS volumes when zpool.cache doesn't exist?

ericonr opened this issue · 3 comments

The current version of 03-filesystems.sh only attempts to mount ZFS if /etc/zfs/zpool.cache exists. This behavior isn't explained in the commit message that added the ZFS block, and it has never been changed. I recently ran into trouble with it: zfsbootmenu set the rootfs for me, but my /home dataset wasn't being mounted. Even the zfsbootmenu guide for ZFS root says only that the cache speeds things up, not that it is essential if you split up certain volumes (I will add that information there too).

So what we have to determine is the best way for 03-filesystems.sh to decide whether it should attempt to mount ZFS volumes at all.

Pinging @ahesford @zdykstra @Vaelatern

I agree that 03-filesystems.sh should be more tolerant of importing pools in the absence of zpool.cache, but I'm not yet sure what the best route would be. At a minimum, I think making the entire ZFS mount logic conditional only on the existence of /usr/bin/zfs and /usr/bin/zpool would be acceptable, since that's the same flow for btrfs, dmraid and lvm. If people don't want potential slowdowns importing zfs pools, they can uninstall zfs.

It may or may not be ok to use zpool import -c /etc/zfs/zpool.cache -N -a if the cache exists, but handling the imports gracefully without a cache will require some thought. I put an ESP on all of my disks and use /dev/disk/by-partuuid to refer to partitions as vdevs. Others might use by-id or (shudder) by-partlabel. (I don't think the other options uniquely identify multiple disks, so they should never be recommended.) Would we have to pass all of these directories to zpool import? What kind of boot-time lag will that incur?
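For reference, zpool import accepts -d more than once, so a cacheless import could limit its scan to whichever /dev/disk/by-* directories actually exist. A rough sketch of how that argument list might be built; the helper name zpool_search_args and the candidate directory list are my own assumptions, not anything that exists in 03-filesystems.sh:

```shell
#!/bin/sh
# Hypothetical helper: print " -d DIR" for each candidate directory that exists.
zpool_search_args() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf ' -d %s' "$dir"
        fi
    done
}

# Assumed candidate list; which directories to include is a policy choice.
args=$(zpool_search_args /dev/disk/by-partuuid /dev/disk/by-id)

# Only print the command here; drop the echo to actually run the import.
echo "zpool import${args} -N -a"
```

Prefixing the real command with time on a test machine would be one way to get a feel for the boot-time cost of scanning each extra directory.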

Not having a pool cache file defined manifests itself in a few different ways.

  1. If root-on-ZFS is used, and there's a single pool on the system, the pool is already imported by the 90zfs dracut module. Importing in 03-filesystems.sh is a no-op. A cachefile doesn't help us here, and instead can introduce the problem that @ericonr found.

  2. If root-on-ZFS is used, and there are multiple pools on the system, only the pool used by the boot environment / root filesystem is imported and mounted. Subsequent pool imports and mounts now depend on zpool.cache being present before they can be automagically imported and mounted.

  3. If pools are used for non-root filesystems, then they won't ever be mounted or imported unless zpool.cache is present.

The official upstream OpenZFS systemd unit files prefer importing from the cache file if it is present, and otherwise fall back to importing by scanning. There's no preference that I can see for which by-disk directory should be used. In my opinion, if you want disks imported through a specific by-disk path, you should import the pool that way and then set a cache file. The cache file will preserve those device paths for future imports.
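To make the "import the pool and then set a cache file" step concrete, here is a minimal sketch for a non-root pool. The function name pin_pool_paths, the ZPOOL override, and the pool name "tank" are placeholders for illustration; the zpool subcommands and the cachefile property are the standard ones.

```shell
#!/bin/sh
# Sketch: re-import a pool through one by-* directory, then record it in the cache.
# ZPOOL can be overridden (e.g. ZPOOL=echo) to preview the commands without ZFS.
pin_pool_paths() {
    pool=$1
    dir=$2
    cache=${3:-/etc/zfs/zpool.cache}
    "${ZPOOL:-zpool}" export "$pool" &&
    "${ZPOOL:-zpool}" import -d "$dir" -N "$pool" &&
    "${ZPOOL:-zpool}" set "cachefile=$cache" "$pool"
}

# Preview for a placeholder pool named "tank"; unset ZPOOL to run for real.
ZPOOL=echo
pin_pool_paths tank /dev/disk/by-partuuid
```

This doesn't apply as written to the pool holding the running root filesystem, since that pool can't be exported while in use.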

I think, then, that 03-filesystems.sh should do the following:

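if [ -x /usr/bin/zpool ] && [ -x /usr/bin/zfs ]; then
  if [ -f /etc/zfs/zpool.cache ]; then
    # Import the pools recorded in the cache without mounting them (-N)
    zpool import -c /etc/zfs/zpool.cache -N -a
  else
    # No cache: scan for all pools and don't write a cache file
    zpool import -aN -o cachefile=none
  fi
  zfs mount -a
  zfs share -a
fi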

This solution looks good to me. It should clear up some ZFS boot issues that are not always obvious.