golang/go

proposal: cmd/compile: go:wasmexport directive

slinkydeveloper opened this issue · 43 comments

The goal of this proposal is to add a new directive, called go:wasmexport, in a similar fashion to the go:wasmimport directive proposed in #38248. This proposal is a first step towards #25612 and #41715.

With this directive a user can export a function, so the engine that runs the module can invoke it:

//go:wasmexport hello_world
func helloWorld() {
  println("Hello world!")
}
(master)⚡ % wasm-nm -e sample/main.wasm
e run
e resume
e getsp
e hello_world

In this proposal I won't modify the actual Go ABI. In fact, thanks to this feature, users will be able to define their own extensions to the existing ABI. TinyGo already supports this: https://tinygo.org/webassembly/webassembly/

Like #38248, the go:wasmexport directive will not be covered by Go's compatibility promise as long as the wasm architecture itself is not considered stable.

If I call hello_world from the WebAssembly host, on which goroutine will it run?

tl;dr I think a new goroutine is fine

In my view of the problem (which might be incomplete), the goal of defining exports is to "signal" to the Go wasm module to execute a specific bit of code OR to signal the completion of an async result (similar to runtime.handleEvent, but abstracted from the JS promise handlers). Hence, my understanding of the execution flow is:

  1. When the export is invoked, it writes to some memory (e.g. sends on a channel, creates a goroutine, etc.) what has to be done. I'm not sure where this particular step has to be executed, but I guess a new goroutine is fine. We don't have to guarantee thread safety; we could just force the user to be careful about that.
  2. In the export, the user can (but doesn't have to) invoke something like runtime.Run() to run the scheduler up to the point where all goroutines are parked

With this execution model, there is no need for user-defined exports to interact with resume and stop. It is general enough because it allows all kinds of exports to be developed: you could implement both an application that runs continuously and an application that serves as an "event handler", where events are sent from the wasm runner to the wasm module.

From my understanding, most of the code needed to implement this execution model is already there: wasm_export_resume in runtime/rt0_js_wasm.s does pretty much what I'm describing.
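To make this concrete, here is a minimal sketch of what such an export could look like under this proposal. The export name signal_event is illustrative, and runtime.Run() is not an existing API; it stands in for the "run the scheduler until all goroutines are parked" primitive described in step 2.

package main

// Buffered channel filled by the host (through the export below) and
// drained by regular goroutines started from main.
var events = make(chan string, 16)

// Hypothetical export under this proposal: the host invokes it to hand an
// event to the module. Step 1 of the flow above: it only records the event.
//
//go:wasmexport signal_event
func signalEvent() {
	select {
	case events <- "event-from-host":
	default: // drop the event if the application is not keeping up
	}
	// Step 2 (optional): something like runtime.Run() would drive the
	// scheduler until every goroutine is parked again.
}

func main() {
	for ev := range events {
		println("handling", ev)
	}
}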

Your explanations do not yet paint a full picture for me.

  • So if I don't call runtime.Run(), then no other goroutine would be allowed to run except the exported function? If I do a go ..., then it would just be put on hold?
  • When does the call to the exported function return? What happens if the Go code blocks?

So if I don't call runtime.Run(), then no other goroutine would be allowed to run except the exported function? If I do a go ..., then it would just be put on hold?

Yes, this way you allow the application to have full control over its lifecycle

When does the call to the exported function return?

If you invoke runtime.Run(), it returns when there is nothing else to do (all goroutines parked). Otherwise, it returns immediately.

What happens if the Go code blocks?

The first question that comes to my mind is: define "code that blocks" in a wasm environment.
In any case, it will block the runner application too, like runtime.Run() does

Yes, this way you allow the application to have full control over its lifecycle

Possible, but quite a special situation for Go code. It would make code break in unexpected places. E.g. if you call some library that uses goroutines and channels to do something in parallel, it would deadlock, since the new goroutines wouldn't start and it would wait on the channel forever.

If you invoke runtime.Run(), it returns when there is nothing else to do (all goroutines parked).

So runtime.Run() would block until all goroutines are parked?

The first question that comes to my mind is: define "code that blocks" in a wasm environment.

That's exactly the issue. WebAssembly has no concept of blocking, but Go does. I guess you would need something similar to the semantics of https://golang.org/pkg/syscall/js/#FuncOf.

Possible, but quite a special situation for Go code. It would make code break in unexpected places. E.g. if you call some library that uses goroutines and channels to do something in parallel, it would deadlock, since the new goroutines wouldn't start and it would wait on the channel forever.

Yes, but please note that the user of this directive won't be the average user; it will be a library developer, somebody knowledgeable about the system where this application is going to run. Designing the module interface is often part of designing the "runner", or even the whole application itself. So the situation you described likely won't happen to the user of this directive.

If that makes people more comfortable, we can make the directive invoke runtime.Run() by default at the end of each export invocation, but because we're talking about a "power user" feature, it's probably better to leave that decision to the developer.

So runtime.Run() would block until all goroutines are parked?

Yes, that's how it works today

That's exactly the issue. WebAssembly has no concept of blocking, but Go does.

WebAssembly doesn't really need the concept of blocking: it's a VM; you decide, when you design the "interface", whether this machine is sync or async. My proposal here is to allow both, then the user picks what's best for their use case.

WebAssembly doesn't really need the concept of blocking: it's a VM; you decide, when you design the "interface", whether this machine is sync or async. My proposal here is to allow both, then the user picks what's best for their use case.

It does not seem that trivial to me. You need to somehow have a mapping between Go's semantics and WebAssembly's semantics. For example if your exported function is blocked on some channel, you have no way to make the WebAssembly call block until the channel is unblocked. What do we want to do in this case? Crash?

For example if your exported function is blocked on some channel, you have no way to make the WebAssembly call block until the channel is unblocked. What do we want to do in this case? Crash?

Why don't you? You just block the goroutine and leave the program stuck; that's what I expect from a deadlock mistake. The engine may then have a deadlock checker that stops the execution of a "blocked module".

Ok I made some graphs to better explain the use cases I have in mind.


Case 1) the module is intended to run continuously in its own thread (in this context, thread = host application thread); the host can signal some information to it from a separate thread. Note: it's up to the engine to guarantee safe concurrent access to the module memory, not to us. If the engine chosen by the user doesn't allow that, then this use case is not implementable.

[PlantUML diagram: case 1 execution flow]

The module has the "blocking" (from the host perspective) methodrun that blocks the application and run it. This method runs the goscheduler up to a signal to exit from the application, like a os.Exit(0). The module also serves methods to asynchronously signal events from the outside world, eg when the host wants to signal the event a, it invokes from a separate thread the export signal_a. In the golang side, the signal_a func will just write some memory (like send a message to a buffered channel) to signal the application and it won't run the scheduler (it doesn't make sense here), it will exit immediately (unless there's a programmer mistake here)


Case 2) The module is intended to be used just as a "signal handler", with some lifecycle methods (e.g. start and stop) to prepare the environment and tear it down. In this use case the module might potentially always run in the same thread, for example:

for ev := range eventsch {
  wasmModule.SignalEvent(ev)
}

[PlantUML diagram: case 2 execution flow]

Every time a new signal comes in, the host invokes the module export to write some memory, and then it runs all goroutines until all of them are parked. With this mechanism the module is effectively used just as a signal handler: it literally processes the stuff you want it to process and then exits.

Note that, with this execution model, there is no difference in how the user develops the application: the user will end up writing all their business logic in the start method "tree", and the developers of "wasm Go libraries" will take care of implementing exports that signal results to the user. For example, in start the user starts consuming a channel provided by the wasm library X. When a signal comes through an export, that channel is filled and the application continues execution.


The picture becomes even more complete if we take into account the ability to define imports (not part of this proposal). At that point, a library developer can offer users a method like wasmLibrary.SendSomethingAndWaitForReply(message). The library invokes an import, signaling to the host to send the message, and parks that goroutine. When the host receives the reply, it "wakes up" the module by invoking the proper export, and the application resumes execution, returning a value from wasmLibrary.SendSomethingAndWaitForReply(message).
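A rough sketch of that request/reply pattern, under the assumption that both directives exist and are usable from library code; the host module name, its send function, the reply_ready export and the fixed "reply" payload are all made up for illustration:

package wasmlib

//go:wasmimport host send
func hostSend(requestID uint32)

// Single-threaded wasm, so no locking around these.
var (
	pending = map[uint32]chan string{}
	nextID  uint32
)

// SendSomethingAndWaitForReply asks the host to do some work and parks the
// calling goroutine until the host delivers the reply via the export below.
func SendSomethingAndWaitForReply(message string) string {
	nextID++
	id := nextID
	ch := make(chan string, 1)
	pending[id] = ch
	_ = message // a real implementation would also pass the message bytes
	hostSend(id)
	return <-ch // park here until reply_ready is invoked
}

// Invoked by the host once the reply is ready: it wakes up the parked
// goroutine, then the scheduler runs until everything is parked again.
//
//go:wasmexport reply_ready
func replyReady(id uint32) {
	if ch, ok := pending[id]; ok {
		delete(pending, id)
		ch <- "reply"
	}
}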

I don't really see how you would implement case 1 with WebAssembly. There is no "halt until signal" instruction. Sure, the WebAssembly host could provide a function as a WebAssembly import that does this, but I haven't seen anywhere that the WebAssembly ecosystem wants to go in this direction. As the Go project we want to add support for patterns that have emerged as standards across the WebAssembly ecosystem, not invent our own ways and then expect the WebAssembly hosts to provide them.

There is no "halt until signal" instruction

The application runs up to the point where the scheduler is stopped (os.Exit() does that, right?). I implemented the very same pattern in Rust with async/await: when the halt signal was received, the application just stopped the "scheduler", causing the run method to quit.

As the Go project we want to add support for patterns that have emerged as standards across the WebAssembly ecosystem, not invent our own ways and then expect the WebAssembly hosts to provide them.

100% agree. It's not my intention here to define any pattern users should use. My intention in the previous comment was to explain some use cases for this go:wasmexport directive.

The point of this proposal is to provide users the low-level tools to implement the patterns they want. IMO the Go project should not choose a "de facto standard" pattern and implement it; it should just provide the tools to let users implement whatever pattern they want to follow.

when the halt signal was received

What do you mean by this? Is "receive signal" a call to some exported function? If there is only a single thread (which is the case right now), then on which thread is this function being called if the only thread you have is still running the run function?

You just block the goroutine and leave the program stuck; that's what I expect from a deadlock mistake.

So how exactly do you "leave the program stuck" with WebAssembly instructions? The only thing that comes to my mind is an infinite loop.

Sorry, but I'm getting a bit tired here. Have you read the WebAssembly specification? I feel like you're just arguing on a high abstraction level without actually explaining how this is supposed to work concretely.

Seems like we're missing each other in translation somewhere. Would you agree to a quick synchronous meeting? We're in the same timezone, so that should be easy. Maybe that helps us unblock this and/or gives me a good direction to clear things up.

I think you're missing the differences between Go and languages with a simpler runtime, like Rust or C:

  • Go has goroutines that get scheduled on one or more host threads
  • The goroutines each have their own Go stack, so there is not only the WebAssembly stack, but also the Go stacks
  • Go has a garbage collector

These features need proper orchestration. You can't "just call" a Go function from the WebAssembly host as you would call a Rust function. For example look at all the magic that Cgo does. You need to make sure that the Go stack is in a good state and that the runtime properly knows what is going on.

Are you talking about WebAssembly that only has a single thread (current state) or do you assume that the Threads Proposal has landed?

  • If you have only a single WebAssembly thread, then it is simply not possible to call some exported function while another exported function is still running Go code, because you simply have no thread to call it on.
  • If you presume the Threads Proposal, then I would like to suggest that we land it in Go first. This will be quite a significant amount of work and we will learn more about how Goroutines and WebAssembly threads interact. Btw: This would currently mean dropping support for Safari, which is an open question: #28360

The point of this proposal is to provide users the low-level tools to implement the patterns they want. IMO the Go project should not choose a "de facto standard" pattern and implement it; it should just provide the tools to let users implement whatever pattern they want to follow.

In short: Go's runtime is not low level enough for this to be possible. We can not support multiple patterns. We have to pick a single pattern and then spend quite some effort to make even this single pattern possible.

In short: Go's runtime is not low level enough for this to be possible. We can not support multiple patterns. We have to pick a single pattern and then spend quite some effort to make even this single pattern possible.

I see your point, makes sense to me.

I think I accidentally diverted the discussion to the wrong topic. I wonder: is it possible to start by supporting the use case 2 I proposed (i.e. call export -> run in its own goroutine -> run the Go scheduler until stalled), assuming just one thread? I think it's reasonable to implement with existing tools, it shouldn't require many changes to the existing Go wasm code (unless I'm missing something), and it doesn't preclude any future evolution of Wasm and/or Go's wasm support. It's also a good starting point from which to evolve towards eventual support of the Threads Proposal.

Yes, your "case 2" might be possible. It is quite similar to what the callbacks of syscall/js already do.

However, I still worry that the WebAssembly ecosystem needs to evolve some more before the Go project can properly build on top of existing standards. For example the interface types proposal seems very relevant.

With WASI there is already a somewhat clear target that we can work towards (run Go app on WASI host), so we did, but even in this case the go:wasmimport directive is only exposed internally at first. I want to avoid adding some go:wasmexport right now that people start to use and that then causes issues later, for example when we add support for WebAssembly threads.

Honestly I feel a bit in over my head right now. @ianlancetaylor You built a lot of Cgo, right? It seems to me like this is very related. Maybe you could give some thoughts.

I know something about cgo but I know very little about wasm. In particular I don't understand the wasm threading model. Since wasm doesn't currently support threads (right?), it seems very hard to know when a call to an exported function should return. As seems to be discussed above.

In cgo it is possible to call an exported function from C code running in a thread that was not started by Go. When this happens, the Go runtime will use an M, which has been preallocated for this purpose. That M will have an associated G. That M and G will be used for the duration of the exported function. When the exported function returns, the M and G will be returned to a cache for use by later calls to exported functions.
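For reference, the cgo mechanism being described is the //export directive; a minimal sketch, built with -buildmode=c-shared or -buildmode=c-archive so that C code (possibly on a thread not started by Go) can call HelloFromC:

package main

import "C"

import "fmt"

// The Go runtime picks up a preallocated M and G for the duration of this
// call when it is entered from a non-Go thread.
//
//export HelloFromC
func HelloFromC() {
	fmt.Println("called from C")
}

func main() {}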

We could presumably make all of that work in wasm too. But the part I don't understand is: what if the exported function starts a goroutine? And that goroutine is expected to continue running after the exported function returns? In a single-threaded environment like wasm, how can that work?

For example the interface types proposal seems very relevant.

The interface types proposal just extends the "standard" types; I think that this proposal is in fact the first necessary step towards supporting interface types later.

With WASI there is already a somewhat clear target that we can work towards (run Go app on WASI host)

Yes but run "go app continuously and use wasi as a kind of posix" approach doesn't represent at all most of the use cases of Wasm. All the "plugin" architectures, like the one I'm trying to create, are not represented by this execution model, where the module has to be called on-demand when an event is received. Look for example at what istio is doing with their own plugin system: https://github.com/proxy-wasm/spec

so we did, but even in this case the go:wasmimport directive is only exposed internally at first

I think that in a first iteration we can have it internally (maybe we can use it for run, resume and getsp?) and then promote it, like you did for go:wasmimport. It might help people experiment with it by forking the Go project and provide feedback on the feature, without exposing it to the wide audience.

And that goroutine is expected to continue running after the exported function returns?

Given we're under the assumption that the environment is fully sandboxed, that goroutine can only execute some code, and at some point it will need either to invoke an export or to wait for an external event (e.g. a message on a channel). Without giving control back to the host, it can't do very much except computation.
The single-threaded execution model I have in mind is: the host invokes an export, the export is executed, and it might start one or more new goroutines. After the export function completes, before giving control back to the host, you run the Go scheduler up to the point where all goroutines are parked. Once the scheduler has all goroutines parked, the export returns completely and hands control back to the host.

Because, from the host, I see the module as a bunch of functions, I just expect that a hypothetical handleEvent export returns after the event is processed. Whether behind that handleEvent there is a simple C function or a complex scheduling system with lightweight threads and channels, I don't care; it's up to the module to do that job.
Even better, I don't expect my module to be able to control my host's threads to run its own goroutines. To me this even sounds like an antipattern against the whole Wasm sandboxing philosophy.

@ianlancetaylor

I know something about cgo but I know very little about wasm. In particular I don't understand the wasm threading model. Since wasm doesn't currently support threads (right?), it seems very hard to know when a call to an exported function should return. As seems to be discussed above.

Yes, wasm currently does not support threads, but there is a proposal for wasm threads which is already supported by V8 and Firefox without any flags. It is currently in phase 2 of standardization. We could either wait for it to land or design a solution that only uses a single thread.

We could presumably make all of that work in wasm too. But the part I don't understand is: what if the exported function starts a goroutine? And that goroutine is expected to continue running after the exported function returns? In a single-threaded environment like wasm, how can that work?

Functions exported with syscall/js.FuncOf solve this by only returning when all goroutines are asleep, e.g. there is nothing further to compute and the Go code has to wait for external stimuli (exported function or timer) to continue.
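For illustration, a minimal example of that behavior using the existing syscall/js API; the handleEvent name and the worker logic are just an example:

//go:build js && wasm

package main

import "syscall/js"

var events = make(chan js.Value, 8)

func main() {
	// Worker goroutine: parked on the channel whenever there is nothing to do.
	go func() {
		for ev := range events {
			println("processing", ev.String())
		}
	}()

	// When JS calls handleEvent(...), the callback queues the event and the
	// scheduler runs until all goroutines (including the worker) are parked
	// again; only then does the call return to JS.
	js.Global().Set("handleEvent", js.FuncOf(func(this js.Value, args []js.Value) interface{} {
		events <- args[0]
		return nil
	}))

	select {} // keep the program alive so the callback stays registered
}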

We could either wait for it to land or design a solution that only uses a single thread.

@neelance I might be missing the point here because of my lack of knowledge of some Go internals, but I don't understand why exports in the single-threaded fashion I described are not "forward compatible" with eventual wasm threads support.
If at some point the wasm threads proposal lands, then you'll be able to create threads and run the Go runtime in a multithreaded environment (if the user wants it; e.g. in my use case it doesn't make sense for an isolated module to control threads), and you can either choose to run the runtime until all goroutines are parked or, leveraging the multithreaded Go wasm runtime, run the module continuously.

rsc commented

To summarize the discussion, it sounds like @neelance is making the argument that this feature is not particularly implementable.

I wouldn't say "not implementable", but there are open questions in my head and I am not spending much time right now to resolve them myself. What is needed is a proper proposal text that describes the intended behavior and also addresses future changes, so we can make an informed decision on this proposal together.

Would it help if I try to write down a design doc about this feature? Do you have any preference on the tool I should use to share the doc?

rsc commented

Putting on hold waiting for a design doc. Feel free to move back to Active once a doc is ready. Thanks.

Here is the design doc: golang/proposal#31

Change https://golang.org/cl/278692 mentions this issue: design: add go:wasmexport proposal

Design doc merged

fumin commented

(https://golang.org/pkg/syscall/js/#FuncOf) solve this by only returning when all goroutines are asleep, e.g. there is nothing further to compute and the Go code has to wait for external stimuli (exported function or timer) to continue.

@neelance Could you elaborate on the mechanisms FuncOf relies on to solve this background goroutine issue?
The source of syscall/js looks pretty thin, so I suspect the bulk of the logic is in the runtime package?
I skimmed through src/runtime/sys_wasm.go and friends, but still couldn't form a complete mental picture.
Your help would be appreciated.

I am interested in how syscall/js solves this background goroutine issue, because it is one of the hardest problems preventing plugins from unmapping memory and other runtime limitations.

@fumin I haven't worked on this for a while so my mental picture is not as good as it used to be. 😅 Feel free to talk to me on the Gophers Slack, maybe I can help you with finding answers to some specific questions.

I don't think this would be possible on the wasm side alone (with the current way go wasm works). From my understanding, the current js.Func callback works this way:

  1. The js.Func is called
  2. A currentEvent object on the javascript side is set with the target function and arguments
  3. The go runtime is resumed
  4. The go runtime grabs and reads the currentEvent, finds its target go function, and executes with the arguments
  5. When finished, the go runtime sets the currentEvent.result (or something like that) to the return value of the go func
  6. Go runtime returns execution to javascript
  7. Javascript returns currentEvent.result

1-2 and 7 are on the javascript side
3-6 are on the go-wasm side

Following the same method, this could be done. It would require a new js glue function (in wasm_exec.js) that could be used similarly to the following:

const go = new Go();
const result = go.Call("myFunc", argA, argB, argC);

But, because this requires the js glue anyway, I don't see how this is any easier / better than just doing the following:

Go side

js.Global().Set("myExportedFunc", js.FuncOf(myExportedFunc))

JS side

window.myExportedFunc(arg1, arg2, arg3)
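Spelled out, that alternative looks roughly like this (myExportedFunc and its argument handling are just an example):

//go:build js && wasm

package main

import "syscall/js"

// myExportedFunc sums its integer arguments; the return value becomes the
// JS return value of window.myExportedFunc(...).
func myExportedFunc(this js.Value, args []js.Value) interface{} {
	sum := 0
	for _, a := range args {
		sum += a.Int()
	}
	return sum
}

func main() {
	js.Global().Set("myExportedFunc", js.FuncOf(myExportedFunc))
	select {} // keep the Go program alive
}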

Plus, it's another weird magical comment

I propose something like this (inspired by wasm-bindgen's examples):

TinyGo also follows a similar approach.

package main

import "fmt"

//go:wasmexport
func hello(name string) {
	fmt.Println("Hello", name)
}

const hello = (name) => {
  // Obtain exported function
  const helloFunc = go._inst.exports["main·hello"];

  // Get program memory and memory allocator to allocate memory for a string
  const { malloc, mem } = go._inst.exports;

  // Allocate memory for characters and copy them in
  const buf = new TextEncoder('utf-8').encode(name);
  const byteArrRef = malloc(buf.length);
  const memArray = new Uint8Array(mem.buffer);
  memArray.set(buf, byteArrRef);

  // Allocate string struct (wasm memory is little-endian)
  const stringHeaderRef = malloc(16); // sizeof(reflect.StringHeader)
  const memView = new DataView(mem.buffer);
  memView.setUint32(stringHeaderRef, byteArrRef, true);     // reflect.StringHeader.Data
  memView.setUint32(stringHeaderRef + 8, buf.length, true); // reflect.StringHeader.Len

  // Call exported Go function.
  // 'wasm_export_resume' is called under the hood.
  helloFunc(stringHeaderRef);
};

I provided more details in similar issue #58584

Something like this could be done easily using the existing syscall/js + reflection (see this). That's why I don't see a point in adding go:wasmexport unless we plan to be able to call imports without the js glue entirely
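A sketch of that syscall/js + reflection idea (the goCall glue name, the registry and the int-only argument handling are all illustrative, not an existing API):

//go:build js && wasm

package main

import (
	"reflect"
	"syscall/js"
)

// Registry of Go functions callable from JS by name.
var exported = map[string]reflect.Value{}

func register(name string, fn interface{}) {
	exported[name] = reflect.ValueOf(fn)
}

func main() {
	register("add", func(a, b int) int { return a + b })

	// JS side: goCall("add", 1, 2) === 3
	js.Global().Set("goCall", js.FuncOf(func(this js.Value, args []js.Value) interface{} {
		fn := exported[args[0].String()]
		in := make([]reflect.Value, 0, len(args)-1)
		for _, a := range args[1:] {
			in = append(in, reflect.ValueOf(a.Int())) // ints only, for brevity
		}
		return fn.Call(in)[0].Int()
	}))

	select {}
}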

Of course, this also adds overhead because we are using reflection, but it is minor compared to switching between wasm and JavaScript.

@garet90 this might be OK for simple cases that are not sensitive to performance, but complex programs which require heavy computation or data processing (video, graphics, etc.) will benefit from a low-level, sugar-less FFI.

zetaab commented

This feature is needed if people want to switch from TinyGo to official Go. Before that, it is difficult to see it happening. This is a pretty old issue and I am wondering: is anyone looking into implementing this? As you can see, it is referenced in many issues where people are trying to switch from TinyGo to official Go, and this issue is a blocker.

A few of us who worked on the wasip1 implementation are preparing a new proposal around this functionality, we hope to have something to share soon :).

zetaab commented

@johanbrandhorst any timeline for that? :)

We're working in our spare time so it will take as long as it takes, unfortunately. We want this functionality as soon as you do, believe me 😁.

The new proposal for this functionality is available at #65199. I think we can close this and continue the discussion there.

EDIT: correct link 🤦🏻

zetaab commented

@johanbrandhorst you perhaps mean #65199 ?

Haha, oops, yes, thank you.

#65199 is implemented. Closing this as a dup. Thanks.