WebAssembly/wasi-libc

Directory listings should include preopens

SteveSandersonMS opened this issue · 4 comments

Transferred from bytecodealliance/wasmtime#6396


wasi-libc's behavior (and hence Wasmtime's behavior) for directory listings (via the WASI fd_readdir API?) is surprising and is inconsistent with Wasmer.

If there is a preopen for /blah, do you expect a directory listing for / to include /blah? Most people would do, because that's how all normal filesystems work. However:

  • This is not the case with wasi-libc/wasmtime. Each directory listing returns only the files that literally exist on the host within that directory, ignoring any other preopens that may be mapped into the guest directory.
  • But it is the case with wasmer - it correctly includes other preopens that have been mapped to the guest directory.

Repro code: https://gist.github.com/SteveSandersonMS/ff5f5cb91524bbcde24a168841e66f10
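
Separately from the linked gist (whose contents are not reproduced here), the question can be checked with a small helper along these lines. This is a minimal sketch that assumes only the standard `opendir`/`readdir` interface; the function name `dir_lists_entry` is made up for illustration.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: returns 1 if readdir() on `dir` reports an entry
 * called `name`, 0 if it does not, and -1 if `dir` cannot be opened. */
int dir_lists_entry(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);

    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int found = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (strcmp(entry->d_name, name) == 0) {
            found = 1;
            break;
        }
    }
    closedir(d);
    return found;
}
```

With both / and /blah preopened, `dir_lists_entry("/", "blah")` is the call whose result differs: the description above implies 0 under wasi-libc/Wasmtime (unless the host directory mapped to / happens to contain `blah`) and 1 under Wasmer.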

Existing application code could be broken in strange ways if typical filesystem invariants are not maintained (e.g., "a directory's parent always contains that directory").

After discussion at bytecodealliance/wasmtime#6396 with @bjorn3, it sounds like:

  • The ability to even see preopened directories as existing within a global file hierarchy (as opposed to being more like independent file hierarchies) is a feature of wasi-libc intended for compatibility with existing code
  • For compatibility, then, it makes sense to complete this picture and also simulate the existence of ancestor directories containing the preopens. For example, simulating that a preopen exists at /a/b/c entails also simulating that /a/b exists and contains c, etc, otherwise filesystem invariants are broken.
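
To make the second point concrete, here is a rough sketch of how the names to show in a listing could be derived from the preopen paths alone. It is not wasi-libc code: `virtual_entries` and `MAX_NAME` are hypothetical, and it assumes preopens are given as absolute guest paths without `.` or `..` components.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME 256

/* Hypothetical helper, not part of wasi-libc: collects the entry names that a
 * listing of `dir` would need to show so that every preopen stays reachable
 * from its ancestors. Returns the number of names written to `out`. */
size_t virtual_entries(const char *const *preopens, size_t n_preopens,
                       const char *dir,
                       char out[][MAX_NAME], size_t max_out)
{
    assert(preopens != NULL && dir != NULL && out != NULL);

    /* Drop trailing slashes so "/" behaves like "" and "/a/" like "/a". */
    size_t dir_len = strlen(dir);
    while (dir_len > 0 && dir[dir_len - 1] == '/')
        dir_len--;

    size_t count = 0;
    for (size_t i = 0; i < n_preopens; i++) {
        const char *p = preopens[i];

        /* The preopen must lie strictly below `dir`: same prefix, then '/'. */
        if (strncmp(p, dir, dir_len) != 0 || p[dir_len] != '/')
            continue;

        /* Only the next path component is visible in this listing. */
        const char *name = p + dir_len + 1;
        size_t name_len = strcspn(name, "/");
        if (name_len == 0 || name_len >= MAX_NAME)
            continue;

        /* Skip names already emitted, e.g. /a/b/c and /a/d both yield "a". */
        int seen = 0;
        for (size_t j = 0; j < count; j++) {
            if (strlen(out[j]) == name_len &&
                memcmp(out[j], name, name_len) == 0) {
                seen = 1;
                break;
            }
        }
        if (seen || count == max_out)
            continue;

        memcpy(out[count], name, name_len);
        out[count][name_len] = '\0';
        count++;
    }
    return count;
}
```

For preopens /a/b/c, /a/d and /blah, this yields `a` and `blah` for /, `b` and `d` for /a, and `c` for /a/b. A real implementation would still have to merge these names with whatever the host reports when the listed directory is itself inside a preopen.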

It sounds like you are asking for some kind of virtual root directory to be created by wasi-libc... and in the case where / is actually included in the pre-opens, that would then need to be somehow overlaid on top of the pre-opened root?

That sounds like a fair amount of extra complexity. I wonder if there is some way to specify more precisely the expected host behaviour to avoid this kind of inconsistency?

I'm not sure I have a strong opinion here but others might (cc: @sunfishcode, @pchickey). My bias would be towards simplicity, though. @SteveSandersonMS, the opinion I do have is that whatever we decide, we should encode the behavior in a test over at the wasi-testsuite.

From a capabilities perspective, it's a little closer to the spirit of capabilities to not make these paths appear in readdir. Programs should ideally be given paths of things they should open, rather than open-world exploring the filesystem to discover all of the things that exist within it. So I propose that readdir should not implicitly include preopen directories.

That said, I agree that wasi-libc could implement this on its own, and that some users would find it useful. Right now, wasi-libc's preopen lookup mechanism is very simple. If we added a full VFS tree to wasi-libc, it'd add some amount of complexity and code size that not everyone needs.
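
For a sense of what a simple lookup looks like, here is a simplified sketch of longest-prefix matching over a flat table of preopens. It is an illustration under assumed names (`struct preopen`, `find_preopen`), not the actual wasi-libc source, whose data structures and edge-case handling differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry; the real wasi-libc structures differ. */
struct preopen {
    const char *prefix; /* guest path, e.g. "/data" */
    int fd;             /* descriptor supplied by the host */
};

/* Returns the fd of the preopen whose prefix is the longest one containing
 * `path`, or -1 if none matches. On success, *relative points at the rest of
 * the path inside that preopen ("." if the path is the preopen itself). */
int find_preopen(const struct preopen *table, size_t n,
                 const char *path, const char **relative)
{
    assert(table != NULL && path != NULL && relative != NULL);

    int best_fd = -1;
    size_t best_len = 0;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(table[i].prefix);
        while (len > 0 && table[i].prefix[len - 1] == '/')
            len--; /* treat "/" as the empty prefix */

        /* The prefix must match whole path components. */
        if (strncmp(path, table[i].prefix, len) != 0)
            continue;
        if (path[len] != '/' && path[len] != '\0')
            continue;

        if (best_fd == -1 || len > best_len) {
            const char *rest = path + len;
            while (*rest == '/')
                rest++;
            best_fd = table[i].fd;
            best_len = len;
            *relative = (*rest != '\0') ? rest : ".";
        }
    }
    return best_fd;
}
```

With only /a/b/c preopened, a lookup of /a/b finds no match, which is the gap a full tree would have to fill by synthesizing the ancestor directories.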

So I propose that this be approached as a wasi-libc feature, which is built in a way that makes it optional, so that users can choose whether to use the simple small implementation or the larger full-tree implementation.

I think WebAssembly/wasi-filesystem#128 is a more elegant long-term solution to this. You would get the VFS tree for free, and it conceals unnecessary information from the guest (which directories are preopens and which are real files under preopens), which I think is generally a good thing. I think the default model of having preopens as siloed namespaces is needlessly unintuitive.