What's the relationship between Wasm and WASI threading proposal?

Question

What's the relationship between Wasm and WASI threading proposal?

Becavalier opened this issue 5 years ago · 19 comments

Answer 1 · 2019-06-06T14:58:20.000Z

This proposal does not add any way for a wasm module to create its own threads; the wasm module must import this functionality from the host. Currently, this is achieved in a browser environment by importing JS functions that use the Web Worker API, as described in the overview. Since WASI is a portable host interface, I think it would make sense for it to define a portable thread-creation API as well (one that could be polyfilled in a browser with workers).

There is a longer-term plan to add thread-creation operators to core wasm, but, for various connected reasons, this will take a while because it ends up requiring not just linear memory, but also tables, globals and instances to be shared. Pure-wasm threads won't be a 100% replacement for host-created threads, though, since host-created threads can do extra host-specific stuff, like provide an event loop, so I don't think the WASI thread-creation APIs are just a stopgap; they'd have long-term utility.
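To make the "import thread creation from the host" arrangement concrete, here is a minimal toy model in Python. It is not a real wasm or WASI API: `Module`, `Instance`, `Host` and the import name `thread_spawn` are hypothetical names chosen for illustration, and a `bytearray` stands in for shared linear memory. The host spawns a thread by creating a fresh instance over the same memory, roughly how a Web Worker based polyfill would behave.

```python
import threading


class Module:
    """Toy stand-in for a compiled module: a list of imports plus an entry point."""

    def __init__(self, imports, start):
        self.imports = imports  # names the host must supply
        self.start = start      # callable: start(instance, arg)


class Instance:
    """One instantiation of a module; fails if an import is unresolved."""

    def __init__(self, module, memory, host_funcs):
        missing = [name for name in module.imports if name not in host_funcs]
        if missing:
            raise LookupError(f"unresolved imports: {missing}")
        self.module = module
        self.memory = memory    # the same object in every instance (shared)
        self.host = host_funcs


class Host:
    """Hypothetical embedder that owns thread creation."""

    def __init__(self, module, memory_size):
        self.module = module
        self.memory = bytearray(memory_size)
        self.threads = []

    def instantiate(self):
        return Instance(self.module, self.memory, {"thread_spawn": self.thread_spawn})

    def thread_spawn(self, arg):
        # New thread, new instance, same shared memory.
        instance = self.instantiate()
        thread = threading.Thread(target=self.module.start, args=(instance, arg))
        self.threads.append(thread)
        thread.start()

    def join_all(self):
        while self.threads:
            self.threads.pop().join()


def start(instance, arg):
    if arg == 0:
        # "Main" instance: it cannot create threads itself, so it asks the host.
        for worker_id in range(1, 4):
            instance.host["thread_spawn"](worker_id)
    else:
        # Worker instance: publish a result through shared memory.
        instance.memory[arg] = arg * 10


def run_demo():
    host = Host(Module(imports=["thread_spawn"], start=start), memory_size=8)
    start(host.instantiate(), 0)
    host.join_all()
    return list(host.memory[:4])
```

The module code never touches `threading` directly; everything goes through the imported function, which is the property that a portable host-level thread-creation API would preserve.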

Answer 2 · 2019-06-06T19:57:07.000Z

@lukewagner:

> There is a longer-term plan to add thread-creation operators to core wasm, but, for various connected reasons, this will take a while because it ends up requiring not just linear memory, but also tables, globals and instances to be shared.

Hm, why is that? I understand that more sharing would be useful, but to match whatever a host can do atm it shouldn't be necessary, I think. (However, in such a scenario it would be necessary to have a module instantiation instruction and first-class tables, globals, etc. to support it.)

Answer 3 · 2019-06-06T23:17:24.000Z

@rossberg That's a good question. I've been assuming, from previous discussions, that what we want is some sort of thread.create operator that would take a funcref or funcidx, create a thread, then call the function. With that design, the funcref's instance (or the instance calling thread.create if the funcidx form was used) would necessarily be shared between the thread calling thread.create and the new thread.

But it sounds like you're imagining a thread.create that instead takes a first-class module ref and an array of exportvals (anyrefs) such that the host creates a new thread, creates a new instance of the given module with the given import values, then calls the new instance's start function in the new thread. I guess that could work; but I'm not sure how much sooner it'd be made available. Eventually I think we'll need the funcref/funcidx versions too, though.

Answer 4 · 2019-06-07T06:19:03.000Z

@lukewagner, it's simpler actually: the same operator you describe plus a separate instantiate instruction. Because the spawned function must not be allowed to access non-shared state of its module, all it could effectively do would be instantiating a new module (with a separate instruction), similar to how it currently works on the host side.

That's how we model it in our memory model paper draft anyway. There, we have a fork instruction that requires a function of shared function type and a separate instantiate instruction. For wiring up imports/exports we simply reify externvals as anyref. We also introduce shared tables etc, but they are not needed to emulate the current host semantics.

Answer 5 · 2019-06-07T14:13:04.000Z

If it's a question of "to bytecode or not to bytecode" I think I would prefer that we not have bytecodes that create instances or deal with modules. The reason for that is that those necessitate types and first class values for modules and instances, which are necessarily embedder concepts. So they would be imported types or "standardized" reference types--though likely opaque. In the continuing spirit of not baking any non-trivial types into core wasm, then I think it's better that types for instances and modules remain embedder concepts that must be imported. That only leaves room for bytecodes that do not need to refer to these in a first class way, a variant of what Luke suggested. But, assuming we had thread.create that created something that can execute a given function (with arguments--with no implicit access to anything else, in the best case), then what is its return type? Presumably, a typed, opaque reference to some thread thing. Then we end up with the same problem.

So it seems like standardizing threading bytecodes is inevitably going to lead to a set of opaque reference types in any case.

Answer 6 · 2019-06-07T14:37:36.000Z

@rossberg When you say "the same operator you describe", do you mean the first version of thread.create I described, which takes a funcref or funcidx? How do you achieve the property that the spawned function is not allowed to access non-shared state of its instance? Also what do you mean by "spawned"?

Answer 7 · 2019-06-07T15:01:08.000Z

@titzer, you don't even need first-class instances, only first-class memories, tables, globals, such that the instantiate instruction directly maps imports to exports. Not high priority, but probably significantly easier than implementing shared-everything.

Depending on how dynamic we'd want to make the link-time type-checking, that wouldn't require fancy types either. In our threads paper we even use plain anyref, which is no worse than what we have in JS. You could polyfill that instruction with a call to JS imports today.

> But, assuming we had thread.create that created something that can execute a given function (with arguments--with no implicit access to anything else, in the best case), then what is its return type?

Does it have to return anything? You want to give it access to shared memory anyway (and other shared defs if we had them). That would be enough.

Answer 8 · 2019-06-07T15:17:40.000Z

@lukewagner:

> When you say "the same operator you describe", do you mean the first version of thread.create I described, which takes a funcref or funcidx?

Yep. In the paper we take a funcidx and parameters. (Using a funcref can easily be expressed with an auxiliary function.)

> How do you achieve the property that the spawned function is not allowed to access non-shared state of its instance?

Via validation (see paper, Appendix A if you're interested). In our system, function types have a shared attribute as well, and the instruction (we call it fork for no particular reason) requires a shared function as argument. Shared functions cannot access non-shared definitions.

Regardless of the details, a thread.create instruction will need something equivalent to (or stronger than) that restriction.
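The validation rule described here can be sketched with a toy checker. This is only an illustration under stated assumptions, not the formal rules from the paper: the instruction encoding (`(opcode, index)` pairs), the class names and the error messages are invented, and treating a call to a non-shared function as "accessing a non-shared definition" is one reading of the rule.

```python
from dataclasses import dataclass


class ValidationError(Exception):
    pass


@dataclass(frozen=True)
class FuncType:
    params: tuple = ()
    results: tuple = ()
    shared: bool = False  # the per-function-type attribute discussed above


@dataclass(frozen=True)
class Func:
    type: FuncType
    body: tuple = ()      # toy instructions: (opcode, index) pairs


@dataclass(frozen=True)
class Module:
    funcs: tuple
    global_shared: tuple = ()  # global_shared[i] is True if global i is shared


def validate(module):
    for i, func in enumerate(module.funcs):
        for op, idx in func.body:
            # fork may only target a function of shared type.
            if op == "fork" and not module.funcs[idx].type.shared:
                raise ValidationError(f"func {i}: fork target {idx} is not shared")
            if not func.type.shared:
                continue  # non-shared functions may touch anything
            # Shared functions may not touch non-shared definitions.
            if op in ("global.get", "global.set") and not module.global_shared[idx]:
                raise ValidationError(f"func {i}: non-shared global {idx}")
            if op == "call" and not module.funcs[idx].type.shared:
                raise ValidationError(f"func {i}: call to non-shared func {idx}")
    return True


SHARED = FuncType(shared=True)
UNSHARED = FuncType()

# func 0 is non-shared and forks func 1, which is shared and only uses a shared global.
ok = Module(
    funcs=(
        Func(UNSHARED, (("global.get", 0), ("fork", 1))),
        Func(SHARED, (("global.get", 1),)),
    ),
    global_shared=(False, True),
)
# Forking a non-shared function is rejected.
bad_fork = Module(funcs=(Func(UNSHARED, (("fork", 0),)),))
# A shared function writing a non-shared global is rejected.
bad_access = Module(funcs=(Func(SHARED, (("global.set", 0),)),), global_shared=(False,))
```

Because the check is purely static, the new thread needs no runtime guard: any function that reaches `fork` has already been shown not to touch non-shared state.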

> Also what do you mean by "spawned"?

Oh, executed in a new thread.

Answer 9 · 2019-06-07T15:44:23.000Z

@rossberg: generally, yes, it is good to get a handle on the spawned computation, e.g. to perhaps await it, join it, cancel it, etc.

Answer 10 · 2019-06-09T22:17:30.000Z

@rossberg Ah, interesting; I had been imagining that there was only a "shared" attribute on the whole module/instance, with that requirement propagating to its memories/tables/globals. Are there uses you can think of for having the "shared" attribute be per-function other than fork?

Answer 11 · 2019-06-10T10:26:19.000Z

@lukewagner, there might be use cases where a module has both shared and unshared exports. But the primary reason for putting the attribute on the function type is that we'd need to track it in function types anyway, because function references are first-class, so you don't know what module they come from.

Answer 12 · 2019-06-11T10:28:26.000Z

Yes, definitely makes sense to track that in the function reference type; I was mostly just asking about granularity (module vs. function).

Answer 13 · 2019-06-17T01:48:34.000Z

WAVM has some non-standard support for shared instances (and tables) at the C API level. One way that it differs from the shared functions @rossberg is talking about is that functions in shared instances can access non-shared globals. The semantics are equivalent to re-instantiating the module in each thread in the same compartment: non-shared mutable globals become thread-locals.
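As a rough illustration of the "non-shared mutable globals become thread-locals" semantics, here is a small Python model. It does not use WAVM's C API; `SharedInstance` and its methods are hypothetical, and `threading.local` plays the role of per-thread global storage that is lazily initialized from the module's initial values.

```python
import threading


class SharedInstance:
    """Toy instance whose non-shared mutable globals are per-thread copies."""

    def __init__(self, global_inits):
        self._inits = dict(global_inits)  # initial values from the module
        self._tls = threading.local()     # one set of globals per thread

    def _globals(self):
        # First access on a thread behaves like re-instantiating the globals.
        if not hasattr(self._tls, "values"):
            self._tls.values = dict(self._inits)
        return self._tls.values

    def global_get(self, name):
        return self._globals()[name]

    def global_set(self, name, value):
        self._globals()[name] = value


def run_demo():
    instance = SharedInstance({"counter": 0})
    instance.global_set("counter", 100)  # main thread's copy
    seen = {}

    def worker():
        seen["initial"] = instance.global_get("counter")  # fresh copy, not 100
        instance.global_set("counter", 7)
        seen["after"] = instance.global_get("counter")

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return seen["initial"], seen["after"], instance.global_get("counter")
```

In this model the worker starts from the initial value rather than the main thread's current value, and its writes are invisible to the main thread.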

IMO adding a way to directly create threads from WebAssembly is only superficially valuable, and the next step after this shared memory extension should be to tackle shared instances. If this is something browser folks don't want to take on yet, it might be possible to do it in a constrained way that can be polyfilled on web VMs.

Answer 14 · 2019-06-17T07:42:36.000Z

@lukewagner, my thinking was that if each function declares it anyway (as part of its type), then what's the use of also having a mode per module? Also, I always want to avoid per-module modes/flags, since they would get in the way of module merging, and thus modular (de)composition.
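The module-merging argument can be illustrated with a small sketch. The representation is invented for this example (a module is just a list of `Func` records, optionally paired with a module-wide flag); it is only meant to show why a per-function attribute composes trivially while a per-module mode does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:
    name: str
    shared: bool  # per-function attribute


def merge_per_function(a_funcs, b_funcs):
    # Each function carries its own attribute, so merging is concatenation.
    return list(a_funcs) + list(b_funcs)


def merge_per_module(a_funcs, a_shared, b_funcs, b_shared):
    # With a module-wide mode, modules in different modes have no merged form.
    if a_shared != b_shared:
        raise ValueError("cannot merge: module-level shared flags differ")
    return list(a_funcs) + list(b_funcs), a_shared


worker_lib = [Func("work", shared=True)]
ui_lib = [Func("render", shared=False)]

merged = merge_per_function(worker_lib, ui_lib)
```

Here `merged` keeps `work` shared and `render` non-shared, whereas `merge_per_module(worker_lib, True, ui_lib, False)` raises because no single flag describes the result.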

Answer 15 · 2019-06-17T07:47:46.000Z

@AndrewScheidecker, silently duplicating state seems dangerous, since it can arbitrarily break state invariants the module is assuming. I think that should at least be gated by some third form of sharing attribute, like TLS.

Answer 16 · 2019-06-17T11:21:03.000Z

> silently duplicating state seems dangerous, since it can arbitrarily break state invariants the module is assuming. I think that should at least be gated by some third form of sharing attribute, like TLS.

It's not silent; it's controlled by whatever host API is being used to create threads. If the host API is implemented on the web by re-instantiating the module in a new Web Worker, then WAVM can reproduce that behavior by creating a new context.

I do think it makes sense to add a thread-local sharing attribute alongside shared functions.

How does segment drop state interact with shared functions? Non-shared segments don't seem useful, so maybe segments should just be implicitly shared.

Answer 17 · 2019-06-17T11:30:36.000Z

> It's not silent, it's controlled by whatever host API is being used to create threads.

Sure, but the module itself has no way of controlling this and preventing a random client from breaking it that way. It is violating state encapsulation.

> How does segment drop state interact with shared functions?

Good question. I agree that they should probably be shared. They are typically accessed for relatively expensive operations only, so the additional synchronisation on retrieving the address shouldn't be prohibitive.

Answer 18 · 2019-06-17T22:33:04.000Z

@rossberg Motivating per-function via trivial-module-merging is a great point.

Answer 19 · 2020-11-10T16:42:15.000Z

What's the status of this?