cannot import zroot: no such pool available
geekifan opened this issue · 10 comments
ZFSBootMenu build source
Release EFI
ZFSBootMenu version
2.3.0
Boot environment distribution
Debian 12
Problem description
ZFSBootMenu cannot import zroot. It does recognize zroot/ROOT/debian when I press the [ESCAPE] key at boot.
Steps to reproduce
Follow the instructions on the documentation website, with zfsutils-linux/bookworm-backports, zfs-dkms/bookworm-backports and zfs-initramfs/bookworm-backports installed during the installation.
The screen shot you've shown is from the initramfs in your Debian boot environment failing to import the pool, and not ZFSBootMenu. There's a failed device (/dev/sda) that should be investigated as the underlying cause for the pool import failure.
Thanks for your quick reply! I'm not familiar with the system boot process, and I'm not sure why Debian cannot recognize the disk /dev/sda. It is a brand-new disk on a brand-new machine. I know it is not your responsibility to help deal with this situation, but I would really appreciate it if you could provide any suggestions. :P
I'd recommend looking through the full dmesg output when you're in the busybox shell. You might be able to find something that stands out there. It's possible that this is related to kexec'ing into a new kernel - but this would be the first time we've seen it at the drive controller level.
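A minimal sketch of how the dmesg output could be narrowed down to storage-related messages in the rescue shell. The helper name `filter_storage_log` and the keyword list are assumptions for illustration, not something taken from ZFSBootMenu or Debian; adjust the pattern to whatever the controller actually logs.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that mention the
# storage stack. The keyword list is an assumption, not an official one.
filter_storage_log() {
    # -i: ignore case, -E: extended regex (both supported by busybox grep)
    grep -i -E 'megaraid|scsi|sd[a-z]|i/o error|timed out'
}

# Usage in the initramfs shell (not executed here):
#   dmesg | filter_storage_log
```

Reading from standard input keeps the filter usable on a saved log file as well as on live dmesg output.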
In my experience, MegaRAID cards are very finicky. I suspect it's not able to gracefully handle the kexec process. Do you have any options to use an onboard AHCI controller port?
Sadly :(, I cannot change the disk topology right now. That is to say, I cannot move the disk to an onboard SATA port. Does that mean I cannot use ZFSBootMenu on this machine?
It means that it might take a bit more effort to make it work. If you're interested in trying a few things, I can write up a few steps. Is the LSI exposing a single disk, or a RAID volume?
Thanks a lot! I am willing to try. The LSI is currently exposing a single disk in JBOD mode, which is the "passthrough" mode of this RAID controller.
By the way, I suspect the LSI controller fails to reinitialize (but I'm not sure). Maybe I can try adding a new teardown script that unbinds this RAID controller from the megaraid driver, like what ZBM does for USB controllers?
UPDATE 1: I tried to rebind the controller in teardown.d but it doesn't work. The disk is still not recognized even after a manual rebind.
UPDATE 2: I used the script below to reset the controller and it works. However, the reset takes about 2 minutes (so weird) and reports an error: write error: Inappropriate ioctl for device.
#!/bin/sh
# Teardown hook: unbind and reset every controller bound to megaraid_sas
SYS_MEGARAID=/sys/bus/pci/drivers/megaraid_sas

# Bound devices appear as symlinks named by their PCI address
for DEVPATH in "${SYS_MEGARAID}"/????:??:??.?; do
    # Skip the unexpanded pattern when no controller is bound
    [ -L "${DEVPATH}" ] || continue
    DEVICE="${DEVPATH#"${SYS_MEGARAID}"/}"

    echo "Tearing down Megaraid controller ${DEVICE}..."
    echo "${DEVICE}" > "${SYS_MEGARAID}/unbind"

    echo "Resetting Megaraid controller ${DEVICE}..."
    echo "1" > "/sys/bus/pci/devices/${DEVICE}/reset"
done
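Before running the script above, it may help to check which devices it would touch. The following is a read-only sketch under stated assumptions: the function name `list_megaraid_devices` and its optional sysfs-root argument are hypothetical and exist only to make the check easy to try out. It relies on the kernel exposing a `reset` file under a PCI device's sysfs directory only when a reset method is available for that device.

```shell
#!/bin/sh
# Sketch: list PCI devices bound to megaraid_sas and report whether a
# "reset" attribute exists for each. Nothing is unbound or reset.
# The optional first argument (sysfs root) is a hypothetical override.
list_megaraid_devices() {
    sys_root="${1:-/sys}"
    drv="${sys_root}/bus/pci/drivers/megaraid_sas"
    for devpath in "${drv}"/????:??:??.?; do
        # Bound devices appear as symlinks named by PCI address
        [ -L "${devpath}" ] || continue
        device="${devpath##*/}"
        if [ -e "${sys_root}/bus/pci/devices/${device}/reset" ]; then
            echo "${device} reset=yes"
        else
            echo "${device} reset=no"
        fi
    done
}
```

Calling `list_megaraid_devices` with no argument inspects the live `/sys` tree. A `reset=no` line would suggest that writing to the `reset` file is not possible for that controller.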