WebAssembly/wasi-sdk

Why does wasi-threads require import-memory?

anuraaga opened this issue · 5 comments

I have been playing with threads support and noticed in the CMake file that wasi-threads requires import-memory:

https://github.com/WebAssembly/wasi-sdk/blob/main/wasi-sdk-pthread.cmake#L12

Just to get an understanding of things, I was wondering why this is actually required. I have tried compiling the same module with it enabled and disabled, and the difference seems to be:

With import-memory:

(import "env" "memory" (memory (;0;) 3))
  (func $__wasm_init_memory (type 1)
    i32.const 65056
    i32.const 0
    i32.const 3172
    memory.fill)
(export "memory" (memory 0))
(start $__wasm_init_memory)

Without import-memory:

(memory (;0;) 3)
(export "memory" (memory 0))

When the memory is imported, the module initializes the imported memory itself; when it is not imported, the module uses guest memory, which should be initialized automatically. In terms of atomics, both should work the same way, right?

My hunch is that there is an assumption that multiple threads require multiple instantiated modules, with a shared memory they import. But I can imagine a runtime just instantiating a single module and allowing multiple threads to call functions in it with reactor mode. Is my understanding correct that in such a scenario, import-memory isn't actually required?

Thanks, just want to confirm my understanding on this.

sbc100 commented

Your understanding is correct. As of today, wasi-threads is based on the "instance-per-thread" model. For more information see: https://github.com/WebAssembly/wasi-threads#design-choice-instance-per-thread
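To illustrate why the instance-per-thread model goes hand in hand with an imported memory, here is a minimal Python sketch. It is only a model, not how any runtime is implemented: `instantiate`, `write_then_read`, the address, and the value are hypothetical names chosen for illustration, and a `bytearray` stands in for linear memory.

```python
PAGE_SIZE = 65536  # size of one wasm page in bytes


def instantiate(imported_memory=None):
    """Return the memory a new instance would use (hypothetical helper).

    With an import, the instance uses the memory the host hands in. Without
    one, instantiation creates a fresh memory owned by that instance alone.
    """
    if imported_memory is not None:
        return imported_memory
    return bytearray(3 * PAGE_SIZE)  # 3 pages, matching the dumps above


def write_then_read(memory_a, memory_b, address=1024, value=42):
    """Instance A stores a byte; return what instance B loads from that address."""
    memory_a[address] = value
    return memory_b[address]


def with_import():
    """Two instances (one per thread) that import the same host-created memory."""
    shared = bytearray(3 * PAGE_SIZE)
    return write_then_read(instantiate(shared), instantiate(shared))


def without_import():
    """Two instances that each define their own memory."""
    return write_then_read(instantiate(), instantiate())


if __name__ == "__main__":
    print(with_import())     # B sees A's store
    print(without_import())  # B's private memory is still zeroed
```

In this model, instances that each define their own memory cannot see each other's writes, so a second instance could not act as a thread of the first.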

anuraaga commented

Ah, thanks for that link. I was only looking at

https://github.com/WebAssembly/threads

I am mainly interested in having atomics available and calling the same Wasm function from multiple host threads, not guest threads. This also needs the -threads target triple so that atomics are available, even though the thread creation functions are not used. In this case, is there anything about LLVM or wasi-libc that is known not to function with a model of a single instance called from multiple host threads?

The very high-level context is that I am trying to remove the global lock and multiple module instantiation in https://github.com/wasilibs/go-re2 because it results in a lot more memory usage and reduced concurrency. I have some simple wat working OK, but not the actual compiled re2. If it's known that using a single module can't work, then I'll need to look at how to support a multiple-module scheme.

sbc100 commented

Yes, I think there are things that would break if you tried to concurrently call into a single instance from multiple host threads. The most obvious one that comes to mind is that wasm globals are expected not to be shared in the current model. They are effectively TLS. We use wasm globals for things like __tls_base and __stack_pointer, so if you try to run wasm code in the same instance from two different threads, they will race to update __stack_pointer and clobber each other.
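The clobbering can be reproduced deterministically with a small Python model. This is a sketch under stated assumptions, not real compiler output: `Globals`, `interleaved_calls`, `FRAME_SIZE`, and the stack addresses are hypothetical, a `bytearray` stands in for linear memory, and the interleaving of the two function prologues is fixed by hand rather than left to a scheduler.

```python
FRAME_SIZE = 16  # hypothetical stack frame size in bytes


class Globals:
    """Stands in for one instance's wasm globals (only __stack_pointer here)."""

    def __init__(self, stack_pointer):
        self.stack_pointer = stack_pointer


def interleaved_calls(memory, globals_a, globals_b):
    """Run two calls, A and B, with their prologues interleaved.

    Returns the byte that A reads back from its own stack frame.
    """
    # Both callers read the stack pointer before either one writes it back.
    frame_a = globals_a.stack_pointer - FRAME_SIZE
    frame_b = globals_b.stack_pointer - FRAME_SIZE
    globals_a.stack_pointer = frame_a
    globals_b.stack_pointer = frame_b

    # Each caller stores a local in what it believes is its own frame.
    memory[frame_a] = 0xAA
    memory[frame_b] = 0xBB

    # A loads its local again.
    return memory[frame_a]


def single_instance():
    """Two host threads calling into one instance: one set of globals."""
    memory = bytearray(64)
    shared = Globals(stack_pointer=64)
    return interleaved_calls(memory, shared, shared)


def instance_per_thread():
    """Two instances sharing memory, each with its own globals and stack region."""
    memory = bytearray(64)
    return interleaved_calls(memory, Globals(64), Globals(32))


if __name__ == "__main__":
    print(hex(single_instance()))      # A's local was overwritten by B
    print(hex(instance_per_thread()))  # A's local is intact
```

With one set of globals, both calls compute the same frame address, so A reads back B's value. With per-instance globals pointing at separate stack regions, the frames do not overlap even though the memory is shared.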

sbc100 commented

Also, because the pthread id itself is accessed via a wasm global, I think many of the pthread primitives would be broken, since all your "threads" would appear to be the same thread at the libc level (i.e. they would all report the same pthread_self()).
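The same effect can be modeled for thread identity. In this hedged sketch, `Instance`, `ids_seen`, and the numeric ids are hypothetical stand-ins: the only assumption carried over from the comment above is that the id comes from per-instance state rather than from the host thread that happens to be executing.

```python
import threading


class Instance:
    """Stands in for a wasm instance; thread_id models the global libc reads."""

    def __init__(self, thread_id):
        self.thread_id = thread_id

    def pthread_self(self):
        # The id is taken from instance state, not from the calling host thread.
        return self.thread_id


def ids_seen(instances):
    """Call pthread_self once per host thread; host thread i uses instances[i]."""
    seen = [None] * len(instances)

    def worker(i):
        seen[i] = instances[i].pthread_self()

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(len(instances))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    one = Instance(1)
    print(ids_seen([one, one, one]))                         # every host thread reports the same id
    print(ids_seen([Instance(1), Instance(2), Instance(3)]))  # one distinct id per instance
```

Three host threads entering a single instance are indistinguishable at this level, which is why primitives keyed on the thread id would misbehave.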

anuraaga commented

Got it - thanks for the explanation, that clears things up a lot. Glad to have random memory corruption explainable rather than not explainable :)