threefoldtech/0-fs

re-mounting 0-fs twice using the same backend directory crashes in 0-OS

Closed this issue · 2 comments

I found this issue while working on threefoldtech/zos#793

It seems that when 0-fs is mounted on the same backend directory a second time, after a node reboot, the overlayfs is corrupted and returns I/O errors.

To reproduce:

  • create a container
  • reboot the node
  • same container is mounted again
  • container fs is corrupted

The same scenario doesn't give any error when tested on my laptop running Arch with Linux 5.6.13-arch1-1.

Here are some errors from dmesg: https://gist.github.com/zaibon/7c89c908069005eac9b72afdafed7617

As shown below, the read-only and read-write layers both seem healthy. Only the overlayfs mountpoint fails.

/mnt/e811f02b-ba2d-4c3f-b164-9d4c50b9d73b/9200-1 # cat ro/etc/ssh/ssh_host_ed25519_key.pub
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMN5nZ/h+ZN8MfP8EaBY8s1zaskxpc4LhYIhilFYNXvR root@4b75be6da8df
/mnt/e811f02b-ba2d-4c3f-b164-9d4c50b9d73b/9200-1 # cat rw/etc/ssh/ssh_host_ed25519_key.pub
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMN5nZ/h+ZN8MfP8EaBY8s1zaskxpc4LhYIhilFYNXvR root@4b75be6da8df
/mnt/e811f02b-ba2d-4c3f-b164-9d4c50b9d73b/9200-1 # cat /var/cache/modules/flistd/mountpoint/9200-1/etc/ssh/ssh_host_ed25519_key.pub
cat: can't open '/var/cache/modules/flistd/mountpoint/9200-1/etc/ssh/ssh_host_ed25519_key.pub': Input/output error
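
The manual check above can be repeated for any file with a small helper. This is only a sketch: `check_layers` is a hypothetical name, not part of 0-fs or 0-OS, and it assumes a POSIX shell. It tries to read one relative path from each directory passed to it and reports which reads fail.

```shell
#!/bin/sh
# check_layers REL_PATH DIR...
# Hypothetical helper (not part of 0-fs): try to read REL_PATH under each
# DIR and print "ok" or "FAIL" per directory. Returns 1 if any read failed.
check_layers() {
    rel="$1"
    shift
    failed=0
    for dir in "$@"; do
        # Read the whole file so I/O errors on the data also surface.
        if cat "$dir/$rel" >/dev/null 2>&1; then
            echo "ok   $dir/$rel"
        else
            echo "FAIL $dir/$rel"
            failed=1
        fi
    done
    return "$failed"
}
```

Run from the backend directory, it could be called as `check_layers etc/ssh/ssh_host_ed25519_key.pub ro rw /var/cache/modules/flistd/mountpoint/9200-1`. A `FAIL` on the mountpoint only, with `ok` on `ro` and `rw`, matches the situation described here.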

Found a bug report that seems to be related: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824407

I am starting to wonder whether the problem comes from the flist I was using to test. I tried to reproduce it with other flists and could not make it fail anymore.