ethanhs/python-wasm

Try to get WASI working here as well?

brettcannon opened this issue · 14 comments

So far the work has focused on Emscripten in the browser. Do we also want to try to tackle WASI in this repo?

I would love to work on WASI! Unfortunately, I'm not sure how we'd do that :/

We would need to either

  1. write an implementation of pthreads based on Web Workers and SharedArrayBuffer (WASI does not have its own pthread implementation), or

  2. re-introduce threadless builds to CPython.

Both of these are a lot of work. Progress on getting Emscripten working will probably carry over to WASI, so I figured that is the best use of time until we have a good idea of which of the two options to pursue.
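
For a sense of what a threadless build (option 2) would mean for Python-level code, here is a minimal sketch. It assumes, without having been verified on an actual WASI build, that such an interpreter raises `RuntimeError` when a thread is started. `call_with_optional_thread` is a hypothetical helper, not anything in CPython.

```python
import threading


def call_with_optional_thread(func, *args):
    """Run func(*args) in a worker thread, or inline if threads cannot start.

    Returns (value, threaded), where threaded says which path was taken.
    """
    box = {}

    def target():
        box["value"] = func(*args)

    try:
        worker = threading.Thread(target=target)
        worker.start()
    except RuntimeError:
        # Assumption: a threadless interpreter refuses to start threads
        # with RuntimeError, so fall back to a plain synchronous call.
        target()
        return box["value"], False
    worker.join()
    return box["value"], True
```

Code written this way keeps working on both kinds of builds, at the cost of losing concurrency on the threadless one.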

https://github.com/singlestore-labs/cpython/tree/wasm_v3.9.7 is supposedly a build of CPython 3.9.7 that works under WASI. I have not dived into it, though, to see what they had to do to make it work.

Otherwise https://leaningtech.com/cheerpx/ claims to have gotten it to work, but that's a very different approach.

That could work. However, I also know that wasi-libc itself is compiled as single-threaded, so I'm not sure whether that could cause any issues.

Would be interesting to try nonetheless!

OK, I think we might also need to rebuild wasi-libc even if we had pthread stubs. See this comment and the discussion here: WebAssembly/wasi-libc#209 (comment)

Another option may be to stub out pthreads at the level of https://github.dev/python/cpython/blob/main/Python/thread_pthread.h, so it isn't stubbing out pthreads per se, just removing our use of the header.

I think that is probably the best option; I've been toying with that idea in my head for a bit. It would probably require changes elsewhere as well, since a lot of places check whether HAVE_PTHREAD is 1 (time, signal, etc.). But it would definitely be a better long-term solution than stubbing pthreads, as we wouldn't need to rebuild wasi-libc and could keep things to just patches on CPython :)
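
One rough way to see which pthread-related feature checks a given interpreter was built with is to look at the build-time configuration that `sysconfig` exposes. This is only a sketch: `pthread_config_flags` is a hypothetical helper, and it assumes the relevant macros share the `HAVE_PTHREAD` prefix and are recorded in the config variables, which may not hold on every platform (the result can simply be empty).

```python
import sysconfig


def pthread_config_flags():
    """Return the HAVE_PTHREAD* build-time flags recorded for this interpreter."""
    config = sysconfig.get_config_vars()
    return {
        name: value
        for name, value in sorted(config.items())
        # Assumption: pthread feature checks all start with this prefix.
        if name.startswith("HAVE_PTHREAD")
    }


if __name__ == "__main__":
    for name, value in pthread_config_flags().items():
        print(f"{name} = {value}")
```

Comparing this output between a native build and a WebAssembly build would give a first list of the code paths that a pthread-free configuration has to cover.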

You say "patches", I say "change in CPython" 😉.

For me, I want WASI for three reasons:

  1. I can start to write VS Code extensions in Python thanks to Node having experimental support for WASI (heck, I can ship Python with the Python extension in a cross-platform manner this way as well)
  2. I'm really curious about the idea of using WASI as the code boundary instead of containers for security in cloud scenarios
  3. We could set up a (stable) buildbot for WebAssembly that uses WASI

And it's that last one that's critical to me, because that then makes WebAssembly a legitimate build target for CPython which then helps Pyodide and emscripten-based targets as well. And since I have plans to introduce platform support tiers, having a stable buildbot will be important someday. 😉 Hence I would want this upstream instead of as a series of patches.

Oh yeah, when I say patches I mean patches in the sense of "changes that would get upstreamed". They would live temporarily as patches, as I expect the changes to be more than a single PR.

> I can start to write VS Code extensions in Python thanks to Node having experimental support for WASI

Note that the existing scripts could be changed to generate Node-compatible Emscripten output by outputting to a .js file instead of .html. I haven't tried it and don't know how well it works, but it would be a way to get something that doesn't require a browser. That said, I definitely think WASI would be a nicer way of doing this.

> I'm really curious about the idea of using WASI as the code boundary instead of containers for security in cloud scenarios

As am I! This and reproducible computing are both very interesting concepts to me.

> We could set up a (stable) buildbot for WebAssembly that uses WASI

We could probably do this with Emscripten, but WASI would be better for sure.

> And it's that last one that's critical to me, because that then makes WebAssembly a legitimate build target for CPython ...

Would definitely like to see this happen. I will probably try to get CPython to build on WASI over the weekend :)

tiran commented

@brettcannon I looked into platform-specific compiler flags a while ago. Emscripten and WASI define some macros by default: https://gist.github.com/tiran/ee6e825e1388b21ca70f4fc645d8cbb9

#ifdef __wasm__
/* any WebAssembly target (both Emscripten and WASI) */
#endif
#ifdef __EMSCRIPTEN__
/* Emscripten only */
#endif
#ifdef __wasi__
/* WASI only */
#endif

import sys

if sys.platform == "emscripten":
    pass  # Emscripten-specific code path
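
The Python-side check can be extended to cover both WebAssembly flavors. A minimal sketch, assuming a WASI build reports `sys.platform == "wasi"` (as the wasmtime run later in this thread prints); `wasm_flavor` is a hypothetical helper name.

```python
import sys


def wasm_flavor(platform=sys.platform):
    """Return "emscripten" or "wasi" for WebAssembly builds, else None."""
    if platform == "emscripten":
        return "emscripten"
    if platform == "wasi":
        # Assumption: WASI builds identify themselves as "wasi".
        return "wasi"
    return None


if __name__ == "__main__":
    print(wasm_flavor())
```

Taking the platform string as a parameter makes the helper easy to exercise on a native interpreter.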

tiran commented

Yesterday I got a hacky WASI build to work using bubble gum, paper clips, and the pthread stubs from https://github.com/singlestore-labs/cpython/blob/wasm_v3.9.7/wasi-stubs/include/pthread_stubs.h

# wasmtime --mapdir=.::/python-wasm/cpython -- ./python.wasm -c "import sys; print(sys.platform)"
wasi
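
The same invocation can be driven from a script. A minimal sketch, assuming `wasmtime` is on `PATH` and accepts the `--mapdir` flag exactly as in the command above (other wasmtime releases may spell it differently); `wasmtime_command` and `run_if_available` are hypothetical helpers.

```python
import shutil
import subprocess


def wasmtime_command(wasm_path, code, mapdir=".::/python-wasm/cpython"):
    """Build the argument list for running Python code in python.wasm."""
    # The flag and mapping string are copied from the command in this thread.
    return ["wasmtime", f"--mapdir={mapdir}", "--", wasm_path, "-c", code]


def run_if_available(command):
    """Run command and return its stripped stdout, or None if the tool is missing."""
    if shutil.which(command[0]) is None:
        return None
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout.strip()


if __name__ == "__main__":
    cmd = wasmtime_command("./python.wasm", "import sys; print(sys.platform)")
    print(run_if_available(cmd))
```

Building the argument list separately from running it keeps the command easy to inspect on machines without wasmtime installed.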

I have created an upstream ticket, https://bugs.python.org/issue46315, and opened a PR with the non-hacky bits of my patch.

We've actually been continuing our work on the WASI Python build. We're going down the path of a compatibility layer called WASIX to fill the gaps between POSIX and WASI: https://github.com/singlestore-labs/wasix. We went in this direction so that we could also use that library to compile third-party libraries for Python extensions. Trying to patch every third-party library to bypass the missing features didn't seem like a viable option.

We now have a working WASI Python build, based on WASIX, that runs most of the Python unit test suite. Of course many tests fail since there are no threads, pipes, sockets, etc. yet, but the vast majority are passing now. I was actually planning on pinging your team with the changes we did have to make to the Python sources, to see if we could work together on this.

@kesmit13 Would you be able to share the source and/or a binary? I'd like to try running it with https://github.com/turbolent/w2c2 and implement support for WASIX.

@turbolent Our WASI Python repo is available at https://github.com/singlestore-labs/python-wasi. It's mostly just a patch to configure.ac and a script that sets up the C compiler settings.