not possible to run enroot start when operating system is running on rootfs (stateless server boot)
If you run a stateless cluster (such as one deployed by warewulf) with root filesystem in RAM, for example:
root@node1:/tmp# df -h /
Filesystem Size Used Avail Use% Mounted on
rootfs 1001G 16G 985G 2% /
root@node1:/tmp# mount | grep rootfs
rootfs on / type rootfs (rw,size=1048948424k,nr_inodes=262237106,inode64)
root@node1:/tmp#
You cannot start enroot containers. This happens:
root@node1:/tmp# enroot start raf-ssd
enroot-switchroot: failed to switch root: /raid/enroot/raf-ssd: Invalid argument
root@node1:/tmp#
strace snippet:
pivot_root(".", ".") = -1 EINVAL (Invalid argument)
There seems to be a hard requirement for enroot to do a pivot_root syscall (enroot/bin/enroot-switchroot.c, line 207 in 09ae4b2).
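Unfortunately, pivot_root is not supported when the current root is the initial rootfs, which is the case for a stateless, memory-based root disk. The kernel returns EINVAL in that situation.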
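To check up front whether a node is affected, the mount table can be inspected for a root of type rootfs. The sketch below is only an illustration: root_is_rootfs is a hypothetical helper, not part of enroot, and it assumes input in /proc/mounts format (device, mount point, type, options).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of enroot.
 * Reads a mount table in /proc/mounts format and looks at the LAST entry
 * mounted on "/", since later mounts on "/" cover earlier ones.
 * Returns 1 if that entry has type "rootfs", 0 if it has another type,
 * and -1 if no entry for "/" was found. */
int root_is_rootfs(FILE *mounts)
{
    char line[4096];
    char dev[256], dir[1024], type[64];
    int result = -1;

    while (fgets(line, sizeof(line), mounts) != NULL) {
        /* Fields: device, mount point, filesystem type, then options. */
        if (sscanf(line, "%255s %1023s %63s", dev, dir, type) != 3)
            continue;
        if (strcmp(dir, "/") == 0)
            result = (strcmp(type, "rootfs") == 0);
    }
    return result;
}
```

A caller would open /proc/mounts with fopen and pass the stream in. A result of 1 suggests that pivot_root will fail with EINVAL on that node.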
The nvidia-container-cli binary provides a --no-pivot flag, which presumably works for docker, but there is no equivalent for enroot.
root@node1:/tmp# nvidia-container-cli --help | grep pivot
-n, --no-pivot Do not use pivot_root
root@node1:/tmp#
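For illustration, a no-pivot path could try pivot_root first and, on EINVAL, fall back to moving the new root mount over / and calling chroot, which is the sequence switch_root-style tools use to leave an initramfs. This is a rough sketch under assumptions: switch_root_sketch is a hypothetical name, this is not enroot's code nor how nvidia-container-cli implements --no-pivot, it assumes newroot is already a mount point in a private mount namespace, and it has not been checked against pyxis.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical sketch, not enroot's implementation.
 * Needs the privileges to mount and chroot inside the namespace.
 * Returns 0 on success, -1 with errno set on failure. */
int switch_root_sketch(const char *newroot)
{
    if (chdir(newroot) < 0)
        return -1;

    /* Preferred path: the same self-pivot seen in the strace output. */
    if (syscall(SYS_pivot_root, ".", ".") == 0) {
        /* The old root is now stacked on top of the new one at "/";
         * detaching "." removes the old root. */
        if (umount2(".", MNT_DETACH) < 0)
            return -1;
        return chdir("/");
    }

    /* Only fall back for EINVAL, the error reported on rootfs. */
    if (errno != EINVAL)
        return -1;

    /* Fallback: move the new root mount over "/" and chroot into it. */
    if (mount(".", "/", NULL, MS_MOVE, NULL) < 0)
        return -1;
    if (chroot(".") < 0)
        return -1;
    return chdir("/");
}
```

Moving the mount before calling chroot changes what is mounted at / in the mount namespace itself, whereas a bare chroot only changes the root of the calling process.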
Yeah, we don't support this for now. It should be fairly straightforward to change, though.
Hi, same error here. In my case I modified enroot-switchroot.c, replacing the pivot_root call with a chroot. That makes enroot work, but it also makes pyxis fail as follows:
$ srun --container-image=ubuntu grep PRETTY /etc/os-release
srun: job 14791 queued and waiting for resources
srun: job 14791 has been allocated resources
pyxis: importing docker image: ubuntu
pyxis: imported docker image: ubuntu
PRETTY_NAME="Rocky Linux 8.8 (Green Obsidian)"
It imports the container image but does not chroot inside it: the output shows Rocky Linux rather than Ubuntu.