open with O_RDWR not working correctly with overlay
smoser opened this issue · 6 comments
I'm trying to use squashfuse with overlay mounts and having a problem with 'openat' and 'AT_FDCWD'.
The real-world situation was running 'apt-get install', where an strace showed:
openat(AT_FDCWD, "/var/lib/dpkg/lock-frontend", O_RDWR|O_CREAT|O_NOFOLLOW, 0640) = -1 ENOSYS (Function not implemented)
I have a reproducer here: recreate.zip.
There are 3 files in the zip:
- test-openat.c : compile this with
cc -o test-openat test-openat.c
- simple.squashfs : simple squashfs file (1 dir and 2 files)
- test-squash-overlay.sh (run as root): script that does the following:
  - mount the squashfs image
  - mount an overlay filesystem with the squash mountpoint as its lowerdir
  - run 'test-openat' with CWD in the overlay mountpoint
The result is something like this:
% sudo ./test-squash-overlay.sh
$ mkdir -p /tmp/tmp.SqAn7lSrRw/mp /tmp/tmp.SqAn7lSrRw/workd /tmp/tmp.SqAn7lSrRw/upper /tmp/tmp.SqAn7lSrRw/writable
$ squashfuse -f -o debug,allow_other,ro /tmp/testx/simple.squashfs /tmp/tmp.SqAn7lSrRw/mp >/tmp/testx/squashfuse.log &
$ mount -t overlay -olowerdir=/tmp/tmp.SqAn7lSrRw/mp,upperdir=/tmp/tmp.SqAn7lSrRw/upper,workdir=/tmp/tmp.SqAn7lSrRw/workd xx-/tmp/tmp.SqAn7lSrRw/mp /tmp/tmp.SqAn7lSrRw/writable
+ cd /tmp/tmp.SqAn7lSrRw/writable
+ /tmp/testx/test-openat dir1/file2.txt dir1/file-noexist.txt file1.txt file-noexist.txt
dir1/file2.txt: -1: Function not implemented
dir1/file-noexist.txt: 3
file1.txt: 3
file-noexist.txt: 3
You can run test-squash-overlay.sh with the KERNELMOUNT environment variable set to 'true' and it will do a kernel mount rather than a squashfuse mount. If you do that, the overlay works as desired.
% sudo KERNELMOUNT=true ./test-squash-overlay.sh
$ mkdir -p /tmp/tmp.BKBVLI1z7b/mp /tmp/tmp.BKBVLI1z7b/workd /tmp/tmp.BKBVLI1z7b/upper /tmp/tmp.BKBVLI1z7b/writable
$ mount -o loop,ro /tmp/testx/simple.squashfs /tmp/tmp.BKBVLI1z7b/mp
$ mount -t overlay -olowerdir=/tmp/tmp.BKBVLI1z7b/mp,upperdir=/tmp/tmp.BKBVLI1z7b/upper,workdir=/tmp/tmp.BKBVLI1z7b/workd xx-/tmp/tmp.BKBVLI1z7b/mp /tmp/tmp.BKBVLI1z7b/writable
+ cd /tmp/tmp.BKBVLI1z7b/writable
+ /tmp/testx/test-openat dir1/file2.txt dir1/file-noexist.txt file1.txt file-noexist.txt
dir1/file2.txt: 3
dir1/file-noexist.txt: 3
file1.txt: 3
file-noexist.txt: 3
FWIW I get the same thing on 6.1.0-060100rc5-generic kernel with squashfuse 0.1.103
I was running 5.15.0-53-generic and squashfuse from the Ubuntu 22.04 package at 0.1.103-3.
I've also recreated with tag 0.1.105. 0.1.105 actually gives "Function not implemented" for both dir1/file2.txt and dir1/file-noexist.txt. As shown above, the 0.1.103-3 version "works" for the non-existing file dir1/file-noexist.txt and only fails for the existing file in dir1.
A bit simpler, we can reproduce with just open and O_RDWR. Using the following for test-openat.c we get the output below.
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to open each argument read-write and report the fd or the error. */
int main(int argc, char* argv[]) {
    int fd = 0, errors = 0;
    for (int i = 1; i < argc; i++) {
        /* No O_CREAT here, so no mode argument is needed. */
        fd = open(argv[i], O_RDWR);
        if (fd < 0) {
            printf("%s: %d: %s\n", argv[i], fd, strerror(errno));
            errors++;
        } else {
            printf("%s: %d\n", argv[i], fd);
            close(fd); /* only close descriptors that were actually opened */
        }
    }
    return errors;
}
log:
$ sudo ./test-squash-overlay.sh
$ mkdir -p /tmp/tmp.EJrNzd7Mrr/mp /tmp/tmp.EJrNzd7Mrr/workd /tmp/tmp.EJrNzd7Mrr/upper /tmp/tmp.EJrNzd7Mrr/writable
$ squashfuse -f -o debug,allow_other,ro /tmp/testx/simple.squashfs /tmp/tmp.EJrNzd7Mrr/mp >/tmp/testx/squashfuse.log &
$ mount -t overlay -olowerdir=/tmp/tmp.EJrNzd7Mrr/mp,upperdir=/tmp/tmp.EJrNzd7Mrr/upper,workdir=/tmp/tmp.EJrNzd7Mrr/workd xx-/tmp/tmp.EJrNzd7Mrr/mp /tmp/tmp.EJrNzd7Mrr/writable
+ cd /tmp/tmp.EJrNzd7Mrr/writable
+ /tmp/testx/test-openat dir1/file2.txt dir1/file-noexist.txt file1.txt file-noexist.txt
dir1/file2.txt: -1: Function not implemented
dir1/file-noexist.txt: -1: No such file or directory
file1.txt: -1: No such file or directory
file-noexist.txt: -1: No such file or directory
Hm, when I look at squashfuse debug output, it tells me:
unique: 6, opcode: LOOKUP (1), nodeid: 1, insize: 42, pid: 278341
LOOKUP /a
getattr /a
NODEID: 2
unique: 6, success, outsize: 144
unique: 8, opcode: LOOKUP (1), nodeid: 2, insize: 43, pid: 278341
LOOKUP /a/ab
getattr /a/ab
NODEID: 3
unique: 8, success, outsize: 144
unique: 10, opcode: OPEN (14), nodeid: 3, insize: 48, pid: 278341
open flags: 0x48000 /a/ab
open[93843198916512] flags: 0x48000 /a/ab
unique: 10, success, outsize: 32
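0x48000 is hex, which is 01100000 in octal. On x86-64 Linux that decodes to O_LARGEFILE|O_NOATIME with access mode O_RDONLY, not O_DIRECT|O_NONBLOCK (that would be octal 044000), and not the O_RDWR that the test passed.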
I built squashfuse against libfuse3, and then built a local copy of libfuse3.
I'm able to see the test case pass if I set LD_LIBRARY_PATH to point to my local libfuse3 build.
Based on @hallyn's comment above and the git log, I'm guessing the fix is libfuse/libfuse@4df0871. I have not bisected that though.
So I guess I'm going to close this issue, as it is not squashfuse's problem.
I will probably open an ubuntu bug and chase getting the fix in there. I'll provide a link at that point.
I filed Debian bug 1025706 and submitted a patch there. There is more info in project-stacker/stacker#350 (comment).
The tl;dr: this is fixed by linking against a recent version of fuse3. Ubuntu 22.10 has the fix. Backports of the fix are available in https://launchpad.net/~puzzleos/+archive/ubuntu/dev.