Why does wasi-threads require import-memory?
anuraaga opened this issue · 5 comments
I have been playing with threads support and noticed this point in the cmake file that wasi-threads requires import-memory
https://github.com/WebAssembly/wasi-sdk/blob/main/wasi-sdk-pthread.cmake#L12
Just to get an understanding of things, I was wondering why this is actually required? I have tried compiling the same module with it enabled and disabled and the difference seems to be
yes import-memory:
(import "env" "memory" (memory (;0;) 3))
(func $__wasm_init_memory (type 1)
i32.const 65056
i32.const 0
i32.const 3172
memory.fill)
(export "memory" (memory 0))
(start $__wasm_init_memory)
no import-memory:
(memory (;0;) 3)
(export "memory" (memory 0))
When the memory is imported, the module explicitly initializes it at start; when it is not imported, the module uses its own guest memory, which should be initialized automatically. In terms of atomics, both should work the same way, right?
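To make the `memory.fill` call in the import-memory output concrete, here is a minimal Python sketch of what that instruction does. The operands are pushed as destination, byte value, and length, so the call writes 3172 zero bytes starting at address 65056. The `memory_fill` helper and the 0xAA pre-fill are assumptions for illustration only; the pre-fill stands in for an imported memory whose prior contents the module does not control.

```python
PAGE_SIZE = 64 * 1024  # WebAssembly linear-memory page size in bytes


def memory_fill(memory: bytearray, dest: int, value: int, count: int) -> None:
    """Toy model of wasm `memory.fill`: trap if out of bounds, else set bytes."""
    if dest + count > len(memory):
        raise IndexError("out of bounds memory access")
    memory[dest:dest + count] = bytes([value & 0xFF]) * count


# A 3-page memory, matching `(memory (;0;) 3)`. It is pre-filled with 0xAA
# (hypothetical) to make the effect of the fill visible.
memory = bytearray(b"\xAA" * (3 * PAGE_SIZE))

# Same operands as $__wasm_init_memory: destination, byte value, length.
memory_fill(memory, 65056, 0, 3172)

start, end = 65056, 65056 + 3172
print(len(memory), start, end)
```

Only the bytes in the range 65056 to 68228 are zeroed; everything outside that range is left as it was.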
My hunch is that there is an assumption that multiple threads require multiple instantiated modules, with a shared memory they import. But I can imagine a runtime just instantiating a single module and allowing multiple threads to call functions in it with reactor mode - is my understanding correct that in such a scenario, import-memory isn't actually required?
Thanks, just want to confirm my understanding on this.
Your understanding is correct. As of today, wasi-threads is based on the "instance-per-thread" model. For more information see: https://github.com/WebAssembly/wasi-threads#design-choice-instance-per-thread
Ah thanks for that link, I was only looking at
https://github.com/WebAssembly/threads
I am mainly interested in having atomics available and calling the same Wasm function from multiple host threads, not guest threads. This also needs the -threads triple to have atomics available, even though the thread creation functions are not used. In this case, is there anything about LLVM or wasi-libc that is known not to function with a model of a single instance called from multiple host threads?
The very high-level context is that I am trying to remove the global lock and multiple module instantiation in https://github.com/wasilibs/go-re2 because they result in a lot more memory usage and reduced concurrency. I have some simple wat working OK, but not the actual compiled re2. If it's known that using a single module can't work, then I'll need to look at how to support a multiple module scheme.
Yes, I think there are things that would break if you tried to concurrently call into a single instance from multiple host threads. The most obvious one that comes to mind is that wasm globals are expected not to be shared in the current model. They are effectively TLS. We use wasm globals for things like __tls_base and __stack_pointer, so if you try to run wasm code in the same instance from two different threads, they will race to update the __stack_pointer and clobber each other.
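The clobbering can be illustrated with a small deterministic Python model. It is not how any runtime is implemented: `ToyInstance`, `call_pair`, the 16-byte frame size, and the stack addresses are all hypothetical. The model replays one unlucky interleaving in which both callers read the stack pointer before either writes it back, first with a single shared instance and then with one instance per thread over the same memory.

```python
class ToyInstance:
    """Toy wasm instance: the stack pointer is a per-instance global."""

    def __init__(self, stack_top: int):
        self.stack_pointer = stack_top  # stands in for __stack_pointer


def call_pair(inst_a: ToyInstance, inst_b: ToyInstance, memory: bytearray):
    """Replay two overlapping calls; each stores a local in its stack frame."""
    # Both callers read the stack pointer before either one updates it.
    sp_a = inst_a.stack_pointer
    sp_b = inst_b.stack_pointer
    # Each caller reserves a 16-byte frame and writes the global back.
    frame_a = sp_a - 16
    inst_a.stack_pointer = frame_a
    frame_b = sp_b - 16
    inst_b.stack_pointer = frame_b
    # Each caller stores a local, then later reads it back.
    memory[frame_a] = 0xA1
    memory[frame_b] = 0xB2
    return memory[frame_a], memory[frame_b]


# One instance entered by two host threads: both frames land on the same address.
shared = ToyInstance(stack_top=1024)
shared_result = call_pair(shared, shared, bytearray(4096))

# Instance per thread: shared memory, but separate globals and stack regions.
memory = bytearray(4096)
per_thread_result = call_pair(ToyInstance(1024), ToyInstance(2048), memory)

print(shared_result, per_thread_result)
```

In the shared case, caller A reads back caller B's value because both frames start at address 1008. With separate instances, each caller keeps its own frame even though the linear memory is the same.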
Also, because the pthread id itself is accessed via a wasm global, I think many of the pthread primitives would be broken, since all your "threads" would appear to be the same thread at the libc level (i.e. they would all report the same pthread_self()).
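A rough Python sketch of the identity problem, under the assumption that the thread id is derived purely from a per-instance global. `LibcInstance`, `collect_ids`, and the addresses are hypothetical and only model the idea, not wasi-libc's actual code.

```python
import threading


class LibcInstance:
    """Toy wasm instance whose thread identity lives in a per-instance global."""

    def __init__(self, tls_base: int):
        self.tls_base = tls_base  # stands in for the __tls_base wasm global

    def pthread_self(self) -> int:
        # The id depends only on the instance's global, not on the host thread.
        return self.tls_base


def collect_ids(instance_for_thread, n: int = 4) -> list:
    """Run n host threads and record what each one sees as pthread_self()."""
    ids = [None] * n

    def worker(i: int) -> None:
        ids[i] = instance_for_thread(i).pthread_self()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


# Single instance shared by every host thread: all ids are identical.
single = LibcInstance(tls_base=0x1000)
shared_ids = collect_ids(lambda i: single)

# Instance per thread: each instance has its own global, so ids differ.
per_thread_ids = collect_ids(lambda i: LibcInstance(tls_base=0x1000 + i * 0x100))

print(len(set(shared_ids)), len(set(per_thread_ids)))
```

With one shared instance, all four host threads report the same id, so any primitive that depends on telling threads apart has nothing to go on.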
Got it - thanks for the explanation, that clears things up a lot. Glad to have random memory corruption explainable rather than not explainable :)