Tracking issue for WebAssembly atomics

Question

alexcrichton opened this issue 4 years ago · 15 comments

This is an issue intended to track the state of WebAssembly atomics support in Rust. For the WebAssembly target there is the threads proposal in WebAssembly which adds a number of instructions and a new kind of memory to the WebAssembly specification. The new instructions largely deal with atomic memory operations (e.g. i32.atomic.rmw.add), but also deal with synchronization between threads (memory.atomic.notify). The threads proposal does not add an ability to spawn threads nor does it really define what threads are, but it's largely set up to have a wasm-instance-per-thread (not that this is super relevant for the standard library).

As of the time of this writing the WebAssembly threads proposal is at stage 2 of the phases process. It is shipping in Chrome and in Firefox, however.

Rust's support for this proposal boils down to a few things:

Primarily Rust/LLVM support the -Ctarget-feature=+atomics CLI flag to rustc. This causes codegen for atomic types like std::sync::atomic to use the atomic instructions.
Rust has support for the three synchronization intrinsics:
The Rust standard library implements mutexes differently based on whether the atomics feature is enabled for the library at compile time. Namely it has custom implementations of:
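As a minimal sketch of the kind of code the first point affects, the snippet below increments one shared AtomicU32 from several threads. The function name and the thread and iteration counts are illustrative only and do not come from the issue. On a wasm build with +atomics, a fetch_add like this is expected to lower to an atomic read-modify-write instruction rather than a plain load/add/store sequence. Because it uses std::thread, it runs as written only where threads can actually be spawned (for example natively), not on wasm32-unknown-unknown.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Increments one shared counter from `threads` threads, `per_thread` times each,
/// and returns the final value.
fn count_in_parallel(threads: u32, per_thread: u32) -> u32 {
    let counter = Arc::new(AtomicU32::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Atomic read-modify-write: no increments are lost,
                    // even though several threads race on the same location.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    // Joining every thread makes all of their increments visible here.
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}
```

Calling count_in_parallel(4, 1000) should always return 4000; with a non-atomic counter the result could be lower.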

In terms of toolchain, we are, as usual, inheriting a lot of the experience from LLVM. The WebAssembly target uses LLD as the linker, but a number of options are passed by default when we're generating an executable compiled with threads (currently detected with -Ctarget-feature=+atomics). We instruct LLD to create a "shared" memory (which allows the memory to be shared across multiple wasm instances, which is how threading works on the web and in other engines), specify a default maximum size for memory (this is required for shared memory, whereas normal wasm memories don't need to list a maximum), flag memory as being imported (since otherwise each instance would export a new memory and not share it!), and ensure that a few TLS/initialization-related symbols are exported.

The symbols are perhaps the most interesting part here, so to go into them in some more detail:

__wasm_init_memory - this function initializes all data segments of memory (e.g. copies from data into memory). It is intended to be called only once for the lifetime of a module, at the beginning.
__wasm_init_tls - this function is intended to be called once per instance and initializes thread-local information from a static area. The pointer to thread-local data is passed as the first argument. The memory behind that pointer must be sized and aligned according to __tls_size and __tls_align.
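The size and alignment requirement for the TLS pointer can be sketched as follows. This is only an illustration under stated assumptions: TlsBlock is a hypothetical helper, its tls_size and tls_align parameters stand in for the values of __tls_size and __tls_align (assumed here to be byte counts), and the actual call to __wasm_init_tls is left as a comment because it only exists in a linked wasm module.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

/// Hypothetical owner of one thread's TLS area.
struct TlsBlock {
    ptr: *mut u8,
    layout: Layout,
}

impl TlsBlock {
    /// Allocates a zeroed block of `tls_size` bytes aligned to `tls_align`.
    /// Returns `None` for an empty area, an invalid alignment, or a failed allocation.
    fn new(tls_size: usize, tls_align: usize) -> Option<Self> {
        if tls_size == 0 {
            return None; // Nothing to initialize.
        }
        // Rejects alignments that are zero or not a power of two.
        let layout = Layout::from_size_align(tls_size, tls_align).ok()?;
        // SAFETY: `layout` has a non-zero size.
        let ptr = unsafe { alloc_zeroed(layout) };
        if ptr.is_null() {
            None
        } else {
            Some(TlsBlock { ptr, layout })
        }
    }

    /// Pointer that a runtime would pass as the first argument of `__wasm_init_tls`.
    fn as_ptr(&self) -> *mut u8 {
        self.ptr
    }
}

impl Drop for TlsBlock {
    fn drop(&mut self) {
        // A real runtime would keep the block alive for as long as the thread runs.
        // SAFETY: `ptr` was allocated with exactly this layout.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}
```

A runtime following this sketch would create one TlsBlock per instance before running any code that touches thread-local state.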

Also as of today there is no dedicated target for wasm with atomics. The usage of -Ctarget-feature=+atomics was intended to help ship this feature ASAP on nightly Rust, but wasn't necessarily intended to be the final form of the feature. This means that if you want to use wasm and atomics you need to use Cargo's -Zbuild-std feature to recompile the standard library.

Overall threads, wasm, and Rust I feel are not in a great spot. I'm unfortunately not certain about how best to move things forward. One thing we could do is to simply stabilize everything as-is and call it a day. As can be seen with memory initialization, imports, and TLS, lots of pieces are missing and are quite manual. Additionally std::thread has no hope of ever working with this model!

In addition to the drawbacks previously mentioned, there's no way for TLS destructors to get implemented with any of this runtime support. The standard library ignores destructors registered on wasm and simply never runs them. Even if a runtime has a method of running TLS destructors, they don't have a way of hooking into the standard library to run the destructors.

I personally fear that the most likely scenario here is to simply stabilize what we have, bad user experience and all. Other possible alternatives (but not great ones?) might be:

Add new wasm target for threads, but add a target per "runtime". We might add one for wasm-bindgen, one for Wasmtime, etc. This can try to work around TLS destructor issues and make things much more seamless, but it would be an explosion of targets.
Coordinate with C and other toolchains to try to create a standard way to deal with wasm threads. For example we could standardize with C how modules are instantiated, TLS is handled, threads are intended to be spawned/exited, etc. This AFAIK isn't happening since I believe "Emscripten does its thing" and I don't think anyone else is trying to get something done in this space. Wasm-bindgen "works" but doesn't implement TLS destructors and it's still very manual and left up to users.

I originally started writing this issue to stabilize the core::arch intrinsics, but upon reflection there are so many other unanswered questions in this space I'm no longer certain this is the best course of action. In any case I wanted to write down my thoughts on the current state of things, and hopefully have a canonical place this can be discussed for Rust-related things.

Answer 1 · 2021-02-07T15:45:24.000Z

Is there anything that can be done short-term to improve the status quo? Happy to contribute if I can. I currently do some shared memory mechanics through SharedArrayBuffer (see https://github.com/wasm-rs/shared-channel). This helps with integrating with JavaScript right now, but I am wondering if native support for atomic intrinsics in wasm will give me a better way to deal with this shared memory (namely, accessing it directly and not through the SharedArrayBuffer/*Array API).

Answer 2 · 2021-02-07T17:00:21.000Z

Adding some thoughts here:

Add new wasm target for threads, but add a target per "runtime". We might add one for wasm-bindgen, one for Wasmtime, etc. This can try to work around TLS destructor issues and make things much more seamless, but it would be an explosion of targets.

What if this approach were partially modified?

A new wasm target would be added for threads, but instead of having a new target per runtime a common interface would be used. Rust could export functions where it needs support from the host. It would be up to wasm-bindgen, Wasmtime, and other runtimes to implement the appropriate functions as imports.

For example Rust could export __ruststd_spawn_thread(u32) and it'd be up to the runtime to implement that how they see fit (for wasm-bindgen a Web Worker could be spawned).
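One way to picture such an interface is the sketch below. It is not an existing API: spawn_via_host, thread_entry, and mock_host_spawn_thread are hypothetical names, and the host side is mocked with std::thread so the example runs natively. The idea is that the Rust side stores the closure, hands the host an opaque u32 token (all that a __ruststd_spawn_thread(u32)-style call could carry), and the host later invokes an entry point with that token on the new thread.

```rust
use std::sync::Mutex;
use std::thread;

/// A boxed closure waiting to be run on a new thread.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Pending jobs, indexed by the `u32` token handed to the host.
static JOBS: Mutex<Vec<Option<Job>>> = Mutex::new(Vec::new());

/// Mock of the host side of the interface. A real runtime would implement
/// this itself, e.g. by starting a Web Worker that calls `thread_entry`.
fn mock_host_spawn_thread(token: u32) {
    thread::spawn(move || thread_entry(token));
}

/// Hypothetical entry point the host calls on the freshly created thread.
fn thread_entry(token: u32) {
    // Take the job out of its slot so that it can run at most once.
    let job = {
        let mut jobs = JOBS.lock().unwrap();
        jobs.get_mut(token as usize).and_then(|slot| slot.take())
    };
    if let Some(job) = job {
        job();
    }
}

/// Stores the closure and asks the host to run it on a new thread.
fn spawn_via_host<F>(f: F)
where
    F: FnOnce() + Send + 'static,
{
    let token = {
        let mut jobs = JOBS.lock().unwrap();
        jobs.push(Some(Box::new(f)));
        (jobs.len() - 1) as u32
    };
    mock_host_spawn_thread(token);
}
```

For instance, spawn_via_host(move || tx.send(42).unwrap()) with a std::sync::mpsc sender lets the caller receive 42 once the host has run the job. The sketch never reuses slots and offers no join handle; a real design would need both.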

Answer 3 · 2021-02-08T16:07:30.000Z

Ah that's indeed a good point @kettle11! I would personally be wary of doing something Rust-specific, however, and would prefer that we work with other toolchain folks (e.g. C/C++/...) to figure out a standard set of APIs that could work for everyone. Most likely this seems like a WASI-level concern perhaps.

My fear is that if we do something Rust-specific it will be very tailor-made for just us and won't take other ecosystems' feedback into account, which I think is important to do to get a long-lasting convention we're confident in implementing in rust-lang/rust.

Answer 4 · 2021-09-22T15:53:09.000Z

Are there any discussions happening with other toolchains currently? I think letting this go stale is not the best idea, to be honest.

Answer 5 · 2023-01-22T15:43:14.000Z

Just want to ping the questions @ImUrX brought up. Access to threads is a major feature in the browser, and while workers provide access to parallelism, they're extremely hard to work with.

Answer 6 · 2023-01-23T17:44:28.000Z

Some of us have been working on wasi-threads, a proposal for allowing WASI hosts to spawn threads. Much of the work to figure out the implementation details has happened for the C language in the wasi-libc repository. As a part of all of this work, I opened rust-lang/compiler-team#574 to add a wasm32-wasi-threads target to the Rust toolchain as well. Once that is merged, which should be soon, it should be possible to solve some of the problems @alexcrichton brings up. (Note, though, that I'm talking here about standalone WebAssembly engines, not browsers; there is more work needed to have equivalent support in both places). Please let me know if you are interested in helping to build any of these pieces!

Answer 7 · 2023-01-28T21:15:19.000Z

I would be interested in helping with the browser piece, though I'm not sure how to get started.

Answer 8 · 2023-10-25T10:11:59.000Z

The WebAssembly threads proposal has reached phase 4 (standardization).

Wouldn't this be a good time to implement threads for the wasm target? std::thread for wasm32-unknown-unknown already exists, though most of the code is left as unsupported(). We all know that major browsers already support this.

Code for wasm thread

Many crates that depend on std::thread, such as tokio and rayon, are broken right now on wasm32-unknown-unknown because of this missing piece. Improving this part should open up whole new possibilities on the web.

Answer 9 · 2023-10-25T10:34:09.000Z

For WASI we already have wasm32-wasi-preview1-threads which fully supports std::thread.

For wasm32-unknown-unknown, implementing std::thread::spawn is not possible: WebAssembly doesn't have an instruction to spawn a new thread, and wasm32-unknown-unknown can't depend on any OS or JS interfaces to spawn one, as by definition it doesn't depend on any external interfaces. What you can do, however, is compile everything (including the standard library) with -Ctarget-feature=+atomics,+bulk-memory,+mutable-globals and then use e.g. wasm-bindgen to write the necessary JS interfacing code to spawn a new Web Worker which will run the thread. This only works in the browser, though, and it depends on wasm-bindgen, so libstd can't implement this itself. Every wasm-bindgen version is ABI-incompatible with the others, but libstd has to work with all wasm-bindgen versions and thus can't depend on wasm-bindgen itself. And wasm32-unknown-unknown is not a browser-specific target, so it has to work outside of the browser too.

Answer 10 · 2023-10-25T10:48:04.000Z

Many crates that depend on std::thread, such as tokio and rayon, are broken right now on wasm32-unknown-unknown because of this missing piece. Improving this part should open up whole new possibilities on the web.

Have you tried targeting wasm32-unknown-emscripten, @temeddix? I think this is the way forward until such time that browsers have support for WASI and/or the component model.

Alternatively there is a crate wasm-bindgen-rayon for using Rayon via wasm32-unknown-unknown that basically does what @bjorn3 described above.

Answer 11 · 2023-10-25T11:02:20.000Z

Thanks for the quick reply.

I haven't used wasm32-unknown-emscripten because it was known to be not actively maintained.

I am aware of those helper crates that try to bridge the gap, but in many cases they make me write the code twice, once for native and once for web. The biggest problem was that wasm-bindgen-rayon did not support the no-modules target.

Answer 12 · 2023-10-25T11:46:22.000Z

As far as I know, we can interop with JavaScript through extern "C" blocks. Would it be possible to first check whether we're in a JavaScript environment (a browser), and then depend on those C functions? This wouldn't need any wasm-bindgen machinery.

Answer 13 · 2023-10-25T12:01:22.000Z

extern "C" doesn't allow importing arbitrary JavaScript functions, only those explicitly provided by the user who runs the module. In addition, there is no way to pass arbitrary JavaScript objects without the externref proposal, which LLVM doesn't support. WebAssembly only supports i32, i64, f32, and f64 as function arguments and return values. Even with externref you can't construct a string (like the path of the JavaScript file which will run in a new Web Worker) from WebAssembly; you can only store one retrieved from JavaScript and pass it back. wasm-bindgen exists precisely because of these limitations. It works around them by generating arbitrary JavaScript wrappers based on instructions from the wasm module. The exact ABI between those JavaScript wrappers and the bindings in the wasm module is unstable.
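To make the numeric-only boundary concrete, here is a rough sketch of the usual workaround for passing a string to a host function: send a pointer and a length, and let the host read the bytes out of linear memory. Everything here is hypothetical: host_log is a mock written in Rust so that the example runs natively, whereas on wasm it would be a function declared in an extern "C" block and supplied by whoever instantiates the module, with both arguments being 32-bit integers.

```rust
use std::sync::Mutex;

/// Messages received by the mock host, standing in for JavaScript-side state.
static HOST_LOG: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Mock of a host-provided import. It only receives two numbers and has to
/// reconstruct the string from memory, just as JavaScript glue code would.
///
/// # Safety
/// `ptr` must point to `len` readable bytes.
unsafe fn host_log(ptr: *const u8, len: usize) {
    // SAFETY: guaranteed by the caller, see above.
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    HOST_LOG
        .lock()
        .unwrap()
        .push(String::from_utf8_lossy(bytes).into_owned());
}

/// Safe wrapper that lowers a `&str` to the (pointer, length) pair.
fn log(message: &str) {
    // SAFETY: the pointer and length describe `message`, which outlives the call.
    unsafe { host_log(message.as_ptr(), message.len()) }
}

/// Returns the most recent message the mock host received, if any.
fn last_logged() -> Option<String> {
    HOST_LOG.lock().unwrap().last().cloned()
}
```

This covers data going from wasm to the host only. Handing the module an opaque JavaScript object is the part that needs externref or generated glue of the kind described above.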

Answer 14 · 2023-10-25T12:03:40.000Z

I understand, thank you for your explanation. Perhaps I should focus on wasm32-wasi with browser polyfills instead.

Answer 15 · 2023-12-09T16:23:15.000Z

More than three years after this issue was opened, there's still no clear path to moving multithreaded Wasm forward on the web.

Rust should aim to be one of the best languages to use for Wasm on the web, and the absence of seamless threading threatens that. With each passing year we're accumulating more and more technical debt and hacks in the Rust ecosystem that bake in or work around the assumption that multithreading is not available on the web.

The ideal reality is that using multithreaded Rust on web requires as few code changes as possible.

Here are the two largest issues I see blocking that reality:

wasm32-unknown-unknown cannot implement std::thread::spawn because it would need to interact with the host environment. wasm32-wasi offers a solution to this.
The main thread cannot wait on web. wasm32-wasi will not help here.

So here's a proposed solution:

If the atomics flag is enabled on the wasm32-unknown-unknown target, provide a way for a library to implement std::thread, a bit like how a global allocator can be configured. This could be used by wasm-bindgen (or other libraries) to provide a threading implementation.
If the atomics flag is enabled on the wasm32-unknown-unknown target, change the behavior of futex_wait to busy-loop if a 'main_thread' flag is set. Maintain the current crash-by-default behavior otherwise.
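The busy-loop idea in the second point could look roughly like the sketch below. It is not the standard library's futex_wait: spin_wait and wait_for_worker are hypothetical names, and the demo uses std::thread so it runs natively. Instead of blocking, the waiter keeps re-reading the atomic until its value differs from the expected one.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Spins until `futex` no longer holds `expected`, then returns the new value.
fn spin_wait(futex: &AtomicU32, expected: u32) -> u32 {
    loop {
        let current = futex.load(Ordering::Acquire);
        if current != expected {
            return current;
        }
        // Tell the CPU that this is a spin loop; no blocking call is made.
        std::hint::spin_loop();
    }
}

/// Demo: a worker publishes a value and the calling thread spins until it sees it.
fn wait_for_worker() -> u32 {
    let flag = Arc::new(AtomicU32::new(0));
    let worker_flag = Arc::clone(&flag);
    let worker = thread::spawn(move || worker_flag.store(7, Ordering::Release));
    let seen = spin_wait(&flag, 0);
    worker.join().unwrap();
    seen
}
```

The trade-off is that the waiting thread burns CPU and never yields while it spins, which is presumably why the proposal limits this behavior to a thread that is not allowed to block.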

These two changes would allow a significant chunk of the multithreaded Rust ecosystem to just work with only compilation flags (once libraries remove their workarounds for the fact that web isn't multithreaded).