rpm-ostree in supermin VM leaks rofiles-fuse mounts; prevents clean cache unmount
In #3844, we saw cosa build fail on the cache umount, hitting EBUSY when shutting down the supermin VM:
+ mount -o remount,ro /srv/cache
mount: /srv/cache: mount point is busy.
dmesg(1) may have more information after failed mount system call.
[ 321.026584] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00002000
...
Looking at what could be holding the cache busy, a ps aux shows lots of rofiles-fuse processes:
[2024-08-13T18:18:47.381Z] root 306 0.0 0.1 401092 3972 ? Ssl 15:46 0:00 rofiles-fuse --copyup usr /tmp/rpmostree-rofiles-fuse0QroLi
[2024-08-13T18:18:47.381Z] root 311 0.0 0.1 251564 2488 ? Ssl 15:46 0:00 rofiles-fuse --copyup etc /tmp/rpmostree-rofiles-fuseANAAgx
[2024-08-13T18:18:47.381Z] root 330 0.0 0.2 474828 4120 ? Ssl 15:46 0:00 rofiles-fuse --copyup usr /tmp/rpmostree-rofiles-fuseioSZ7r
[2024-08-13T18:18:47.381Z] root 334 0.0 0.1 251564 2712 ? Ssl 15:46 0:00 rofiles-fuse --copyup etc /tmp/rpmostree-rofiles-fusePPGBR9
...
These are left over from rpm-ostree compose running scriptlets. rpm-ostree should be unmounting them, but clearly something is going wrong. Failures to unmount are logged to the journal, but we don't have a journal in this environment.
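For anyone debugging this, here is a rough sketch of how the leaked mounts could be listed and lazily unmounted before the cache is remounted read-only. It is not necessarily what the workaround in #3844 does. The function names are made up for illustration, and matching on the /tmp/rpmostree-rofiles-fuse prefix is an assumption based on the ps output above.

```shell
# Hypothetical helpers, not part of cosa or rpm-ostree.

# Print mount points that look like leaked rpm-ostree rofiles-fuse mounts.
# $1: mounts table to read (defaults to /proc/mounts).
# Assumption: the mount points share the /tmp/rpmostree-rofiles-fuse prefix
# seen in the ps output.
list_rofiles_mounts() {
    awk '$2 ~ /^\/tmp\/rpmostree-rofiles-fuse/ { print $2 }' "${1:-/proc/mounts}"
}

# Lazily unmount every mount point found, reporting failures on stderr
# since there is no journal to log to in the VM.
unmount_rofiles_mounts() {
    list_rofiles_mounts "$@" | while read -r mnt; do
        umount -l "$mnt" || echo "failed to unmount $mnt" >&2
    done
}
```

Running list_rofiles_mounts on its own just before the remount would at least show which mounts were left behind, even without the journal.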
Added brutal workaround in #3844 for now, but I'd like to revert that at some point.
Opened coreos/rpm-ostree#5046 to have rpm-ostree log errors to stderr instead.
The easiest next step would probably be to take the RPMs spit out by CI in that PR and open a cosa PR that reverts the workaround and adds the rpm-ostree RPMs, to see if we get more information about the error.