WebAssembly/wasi-libc

Is it possible to not use preopen?

konsumer opened this issue · 1 comments

I am working on a very simple browser host, written by hand, and I am having a tough time with preopens. I got some ideas from here, and they seem not to use them, but I can't figure out how to make wasi-libc expose that.

I have a synchronous VFS in the host (it's a zip + localstorage, using browserfs) and I can access any file as needed, but I think I am missing some key part of how wasi/wasi-libc tries to access files.

  • Here is the host I am working on. I can inject the zip file-list, too, if I have to, but I would prefer dynamic
  • Here is example wasm that is just trying to stat that file in the zip-file.

Here is an example:

my test code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

__attribute__((export_name("load")))
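void load() {
  struct stat st;
  // stat returns non-zero on failure, and st is not filled in that case
  if (stat("assets/cyberpunk.txt", &st) != 0) {
    perror("stat");
    return;
  }
  printf("Filesize: %d", (int)st.st_size);
}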

I am not using main and compile with this:

${WASI_SDK_PATH}/bin/clang --sysroot=${WASI_SDK_PATH}/share/wasi-sysroot -Wl,--no-entry -nostartfiles -Oz

The printf works fine with stdout:

fd_write (fd, iovsPtr, iovsLength, bytesWrittenPtr) {
  // Each iovec is two u32 values: buffer pointer, then buffer length.
  const iovs = new Uint32Array(wasi.memory.buffer, iovsPtr, iovsLength * 2)
  let text = ''
  let totalBytesWritten = 0
  for (let i = 0; i < iovsLength * 2; i += 2) {
    const offset = iovs[i]
    const length = iovs[i + 1]
    // The payload is raw bytes, so view it as unsigned before decoding.
    const textChunk = decoder.decode(new Uint8Array(wasi.memory.buffer, offset, length))
    text += textChunk
    totalBytesWritten += length
  }
  // Report the number of bytes written back to the caller (a u32).
  const dataView = new DataView(wasi.memory.buffer)
  dataView.setUint32(bytesWrittenPtr, totalBytesWritten, true)

  if (fd === FILENO_STDOUT) {
    console.log(text)
  }

  if (fd === FILENO_STDERR) {
    console.error(text)
  }
  
  // TODO: do more with fd for other files

  return WASI_ESUCCESS
}

I see it calling fd_prestat_get with fd:3:

fd_prestat_get (fd, bufPtr) {
  console.log('fd_prestat_get', { fd, bufPtr })
  return WASI_EBADF
}

If I have to preload all the files, that sort of works, but I am not sure how to tell my host "3 means assets/cyberpunk.txt". Without me telling the wasm, it just decided that is 3. Also, I am a bit concerned about files that are created from wasm, since the VFS can handle all that on the fly, but I will have to double-manage it (inserting things into some other out-of-vfs array, to keep them in sync.)

I'd rather just be able to call something on the fly when it asks for assets/cyberpunk.txt, and just know what the fd number means, in the host.

My goal is that people can write "regular looking C" and have it work in browser & native, with the same zip file (for read, and however I do a synchronous persistent filesystem for write). I think the actual filesystem part is covered; I am just having trouble with how to expose it to wasm built with wasi-libc.

Here is what I would like to do, in order of preference:

  • WASI calls out to a proposed fd and filename before it uses it, so I can map it in the host: "whenever wasm asks for 3, give it this file"
  • Call some host function with a pathname string, asking for its fd
  • preload the file-list with numbers they both know, then when new files are made, I will add these in the host
  • some build-flag to just disable preload behavior and call the other functions that use the filename (the path_ functions do this, right? Or maybe the other fd_ functions?)
  • some other way I am not imagining to map 3 to a filename.

Are any of these possible?

I think I kind of have my answer, looking here.

So I implemented these:

fd_prestat_get
fd_prestat_dir_name
fd_fdstat_get
path_filestat_get

Since I was leaving out main, wasi's _start was never triggered, and it is needed for the preopens to work. So I added main, dropped the clang flags that skipped it (-Wl,--no-entry -nostartfiles), and it worked. Here is the process for others:

  • _start triggers a loop over every fd number > 2, starting at 3
  • fd_prestat_get should return WASI_ESUCCESS if the fd is a good preopened fd (in your own system) and WASI_EBADF if it is not; the first WASI_EBADF ends the loop
  • fd_prestat_dir_name will be called for each good fd, so you can associate those fds with names from your own system
  • now the fds are associated with paths, and the path_ functions receive both the path and the fd, which is already linked up to your mapping
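The steps above can be sketched on the host side roughly like this. It is a minimal sketch, not the code from my host: createPreopenImports, getMemory and the preopens map are hypothetical names, and the struct layout (a u8 tag at offset 0, a u32 name length at offset 4) and errno values are my reading of WASI preview1. The directory name "/" in the usage note is only an example.

```javascript
// Errno and tag values as I understand them from WASI preview1.
const WASI_ESUCCESS = 0
const WASI_EBADF = 8
const WASI_PREOPENTYPE_DIR = 0

// Hypothetical helper: builds the two prestat imports from a Map of
// fd -> directory name. getMemory returns the instance's exported memory.
function createPreopenImports (getMemory, preopens) {
  const encoder = new TextEncoder()
  const names = new Map()
  for (const [fd, name] of preopens) {
    names.set(fd, encoder.encode(name))
  }

  return {
    // Called for fd 3, 4, 5, ... until it returns WASI_EBADF.
    fd_prestat_get (fd, bufPtr) {
      const name = names.get(fd)
      if (name === undefined) return WASI_EBADF
      const view = new DataView(getMemory().buffer)
      view.setUint8(bufPtr, WASI_PREOPENTYPE_DIR)
      view.setUint32(bufPtr + 4, name.length, true)
      return WASI_ESUCCESS
    },

    // Called after a successful fd_prestat_get, with a buffer to copy the name into.
    fd_prestat_dir_name (fd, pathPtr, pathLen) {
      const name = names.get(fd)
      if (name === undefined) return WASI_EBADF
      const target = new Uint8Array(getMemory().buffer, pathPtr, pathLen)
      target.set(name.subarray(0, pathLen))
      return WASI_ESUCCESS
    }
  }
}
```

Usage would be something like createPreopenImports(() => instance.exports.memory, new Map([[3, '/']])), merged into the wasi_snapshot_preview1 import object. Passing a getter instead of the memory itself avoids holding a stale buffer if the memory grows.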

It does the mapping for you, so now instead of 3, it's 6 (since the name was sent to wasm using fd_prestat_dir_name when it sent me 6.)
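For the stat call in my test, the host then gets the preopened fd plus a path relative to it. Here is a rough sketch of path_filestat_get under some assumptions: createPathFilestatGet, preopenDirs and vfs.stat are hypothetical names (vfs.stat stands in for whatever your VFS offers and returns { size, isDirectory } or null), and the 64-byte filestat layout (filetype at offset 16, nlink at 24, size at 32) and errno values are my reading of WASI preview1.

```javascript
// Errno and filetype values as I understand them from WASI preview1.
const WASI_ESUCCESS = 0
const WASI_EBADF = 8
const WASI_ENOENT = 44
const WASI_FILETYPE_DIRECTORY = 3
const WASI_FILETYPE_REGULAR_FILE = 4
const FILESTAT_SIZE = 64

// Hypothetical helper: preopenDirs is a Map of fd -> directory name,
// vfs.stat(path) is an assumed synchronous lookup in the host filesystem.
function createPathFilestatGet (getMemory, preopenDirs, vfs) {
  const decoder = new TextDecoder()

  return function path_filestat_get (fd, flags, pathPtr, pathLen, bufPtr) {
    const dir = preopenDirs.get(fd)
    if (dir === undefined) return WASI_EBADF

    const memory = getMemory()
    const relative = decoder.decode(new Uint8Array(memory.buffer, pathPtr, pathLen))
    // Join the preopen name and the relative path with a single slash.
    const fullPath = dir.replace(/\/+$/, '') + '/' + relative

    const info = vfs.stat(fullPath)
    if (!info) return WASI_ENOENT

    // Zero the struct, then fill in only the fields this sketch knows about.
    new Uint8Array(memory.buffer, bufPtr, FILESTAT_SIZE).fill(0)
    const view = new DataView(memory.buffer)
    view.setUint8(bufPtr + 16, info.isDirectory ? WASI_FILETYPE_DIRECTORY : WASI_FILETYPE_REGULAR_FILE)
    view.setBigUint64(bufPtr + 24, 1n, true) // nlink
    view.setBigUint64(bufPtr + 32, BigInt(info.size), true) // size
    return WASI_ESUCCESS
  }
}
```

The timestamps and the dev/ino fields are left at zero here, which was enough for reading st_size in my test but may not be enough for code that compares inodes or modification times.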

I will close this. Sorry for the noise.