awslabs/soci-snapshotter

[Bug] TestFuseOperationFailureMetrics fails on ARM64

sondavidb opened this issue · 1 comment

Description

We are seeing fairly consistent failures on ARM64 machines for TestFuseOperationFailureMetrics/image_with_valid-formatted_but_invalid-data_ztocs_causes_a_fuse_failure. Relevant log output:

{"level":"debug","msg":"FUSE operation","operation":"node.Lookup","path":"var/lib","time":"2024-04-08T23:32:54.407861163Z"}
{"level":"debug","msg":"FUSE operation","operation":"node.Lookup","path":"var/lib","time":"2024-04-08T23:32:54.408042352Z"}
time="2024-04-08T23:32:54Z" level=fatal msg="failed to copy xattrs: failed to list xattrs on /tmp/initialA0000000000/var/lib/rabbitmq: operation not supported"

I will look into root-causing this.

Steps to reproduce the bug

Run GO_TEST_FLAGS="-run TestFuseOperationFailureMetrics" make integration.

The last test (image_with_valid-formatted_but_invalid-data_ztocs_causes_a_fuse_failure) will fail.

Describe the results you expected

Tests pass

Host information

  1. OS: Ubuntu 22.04 ARM64
  2. Snapshotter Version: tip of main (ee1b7c2528b9b72a315966b9dd23e23b97c6b37f)
  3. Containerd Version: 1.6.30

Any additional context or information about the bug

No response

We root-caused this to #1314. TL;DR: disabling xattrs on a volume-mounted layer prevents the container from being created, which is what was happening with pinnedRabbitmqImage. This surfaced because of #1133, which automatically disables xattrs on layers that have no xattrs or whiteout/opaque directories.

We reverted that change, which fixed this issue, so we can close this now.
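For anyone reading this later, the kind of check #1133 is described as doing can be sketched roughly as follows. This is an illustrative assumption, not the snapshotter's actual code: `layerEntry` and `canDisableXattrs` are hypothetical names, and whiteouts are detected here by the OCI `.wh.` filename convention.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// layerEntry is a hypothetical, simplified view of one file in a layer.
type layerEntry struct {
	Name   string
	Xattrs map[string]string
}

const (
	whiteoutPrefix = ".wh."         // OCI whiteout file prefix
	opaqueMarker   = ".wh..wh..opq" // OCI opaque-directory marker
)

// isWhiteoutOrOpaque reports whether name is a whiteout file or an opaque
// directory marker. The opaque marker also carries the whiteout prefix; it is
// checked explicitly only for readability.
func isWhiteoutOrOpaque(name string) bool {
	base := path.Base(name)
	return base == opaqueMarker || strings.HasPrefix(base, whiteoutPrefix)
}

// canDisableXattrs returns true only when no entry has xattrs and no entry is
// a whiteout or opaque-directory marker.
func canDisableXattrs(entries []layerEntry) bool {
	for _, e := range entries {
		if len(e.Xattrs) > 0 || isWhiteoutOrOpaque(e.Name) {
			return false
		}
	}
	return true
}

func main() {
	plain := []layerEntry{{Name: "var/lib/rabbitmq"}}
	withWhiteout := []layerEntry{{Name: "var/lib/.wh.rabbitmq"}}
	fmt.Println("plain layer:", canDisableXattrs(plain))
	fmt.Println("layer with whiteout:", canDisableXattrs(withWhiteout))
}
```

Note that a check like this looks only at the layer's own contents. It knows nothing about how the layer is later used (for example, with a volume mounted on top), which is the situation that caused the failure here.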