WebAssembly/multi-memory

Use-case for multi-memory: dart2wasm to implement ByteBuffer/TypedList objects with linear memory

eyebrowsoffire opened this issue · 8 comments

The dart2wasm compiler has so far used WasmGC for its object representations, meaning that the Dart runtime itself does not currently require any linear memory except to communicate with an external module. However, the Dart ByteBuffer and TypedList objects are implemented on top of a WasmGC array. The problem is that we sometimes want to pass these ByteBuffer or TypedList objects to browser APIs that take a JS TypedArray, and there is currently no way to create a TypedArray object from a WasmGC array. Because of this, we are looking at changing the implementation of ByteBuffer and TypedList to use regions of linear memory instead, from which a JS TypedArray can be created.
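To illustrate why linear memory helps here, the following is a minimal sketch of creating a JS TypedArray view over a region of memory without copying. It is not dart2wasm's actual code: `LinearMemory` and `viewOfRegion` are hypothetical names, and a plain `ArrayBuffer` stands in for the `buffer` of a real `WebAssembly.Memory`.

```typescript
// Hypothetical stand-in for a WebAssembly.Memory: anything exposing a `buffer`.
interface LinearMemory {
  readonly buffer: ArrayBuffer;
}

// Create a JS view over `length` bytes starting at byte offset `ptr`.
// No bytes are copied: the view aliases the memory's buffer.
function viewOfRegion(memory: LinearMemory, ptr: number, length: number): Uint8Array {
  return new Uint8Array(memory.buffer, ptr, length);
}

// Usage: a 64 KiB buffer (the size of one Wasm page) standing in for linear memory.
const memory: LinearMemory = { buffer: new ArrayBuffer(65536) };
const view = viewOfRegion(memory, 16, 4);
view.set([1, 2, 3, 4]);

// Writes through the view are visible in the underlying memory.
const firstByteInMemory = new Uint8Array(memory.buffer)[16];
```

One caveat with a real `WebAssembly.Memory`: growing the memory detaches the old `ArrayBuffer`, so views like this have to be recreated after a grow.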

The issue here is that without multi-memory support, dart2wasm cannot have its own linear memory and also import memory from an external module. We could import malloc and free from the external module and just use regions of the imported memory in this case, but we are not always guaranteed to be linking against an external module. This means we would have to maintain two separate implementations of linear-memory-backed ByteBuffer/TypedList objects, depending on whether we are importing memory from an external module or not. This is an unfortunate burden that multi-memory support would remove.

@eyebrowsoffire this is an interesting approach, I have a question about the garbage collection: if the ByteBuffer/TypedList is stored in linear memory, how can they be collected by the garbage collector?

osa1 commented

@xujuntwt95329 the linear memory allocation will need a reference count. When it's shared with a manually-managed language (e.g. C++), the ref count will be decremented when the array is freed. When it's shared with a managed language, a finalizer that decrements the ref count needs to be attached to the reference to the array. Whichever side decrements the ref count to zero needs to call the freeing function implemented by the memory allocator that allocated the array.
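A rough sketch of that reference-counting scheme, for illustration only. `Allocator` and `RefCountedRegion` are hypothetical names, not part of dart2wasm or any Wasm API; in practice the count would more likely live next to the allocation in linear memory.

```typescript
// Hypothetical allocator interface: in practice this would be the freeing
// function of whichever allocator handed out the region.
interface Allocator {
  free(ptr: number): void;
}

class RefCountedRegion {
  readonly ptr: number;
  private readonly allocator: Allocator;
  private refCount = 1; // the creator holds the first reference

  constructor(ptr: number, allocator: Allocator) {
    this.ptr = ptr;
    this.allocator = allocator;
  }

  // Called whenever a new reference to the region is introduced.
  retain(): void {
    if (this.refCount === 0) throw new Error("region already freed");
    this.refCount++;
  }

  // Called when a reference dies (explicit free in C++, finalizer in a managed language).
  release(): void {
    if (this.refCount === 0) throw new Error("region already freed");
    this.refCount--;
    // Whoever drops the count to zero calls the allocator's freeing function.
    if (this.refCount === 0) this.allocator.free(this.ptr);
  }
}

// Usage with a fake allocator that records freed pointers.
const freed: number[] = [];
const region = new RefCountedRegion(1024, { free: (p) => { freed.push(p); } });
region.retain();  // shared with a second owner
region.release(); // first owner is done
const freedAfterFirstRelease = freed.length;
region.release(); // second owner is done: the count hits zero and the region is freed
```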

@osa1 Thanks for your reply.
So the Wasm module needs to expose a function to the host for decrementing the ref count during finalization, right? But when the Wasm runtime is doing garbage collection it may stop the world, so how can we re-enter a Wasm function?

osa1 commented

@xujuntwt95329

> So the Wasm module needs to expose a function to the host for decrementing the ref count during finalization, right?

Yes.

> But when the Wasm runtime is doing garbage collection it may stop the world, so how can we re-enter a Wasm function?

Wasm GC doesn't support finalizers yet so the finalizer will have to be added in JS, as a JS function. The GC will be calling the JS function, which then can call Wasm functions.

> Wasm GC doesn't support finalizers yet so the finalizer will have to be added in JS, as a JS function. The GC will be calling the JS function, which then can call Wasm functions.

Thanks!

Sorry, I still have a question: as you mentioned, Wasm GC doesn't have finalizers yet, so if a Wasm managed object is going to be reclaimed, how does the Wasm runtime know whether it needs to call a JS function, and which JS function to call?

osa1 commented

No need to apologize! We're assuming that the host supports both Wasm and JS (like a browser does), that you can pass a Wasm GC object to JS (I think we do this by passing the reference as an externref), and that JS can attach a finalizer to that Wasm object. In dart2wasm we do this here. value is implicitly converted to an externref, implemented here.

When you share a reference to the linear-memory-backed Wasm GC object with another Wasm module (or JS, or some other host-supported language), you need to somehow let the allocating module know when the reference shared with the other module is no longer used. With managed languages you often do this with finalizers. In our case the managed language is JS, so we use a JS finalizer.
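As a hedged illustration of the JS side, here is a minimal sketch using the standard `FinalizationRegistry` API. It is not the dart2wasm implementation: `OwnerExports`, `decrementRefCount`, and `makeRegistry` are hypothetical names, and a plain JS object stands in for the Wasm GC object that JS would receive as an externref.

```typescript
// Hypothetical shape of the exports of the module that owns the allocation.
interface OwnerExports {
  decrementRefCount(ptr: number): void;
}

// Builds a registry whose cleanup callback forwards the held pointer to the
// owning module after the registered object has been garbage collected.
function makeRegistry(owner: OwnerExports): FinalizationRegistry<number> {
  return new FinalizationRegistry<number>((ptr) => owner.decrementRefCount(ptr));
}

// Usage with a fake owner that tracks ref counts in a Map.
const counts = new Map<number, number>([[2048, 1]]);
const owner: OwnerExports = {
  decrementRefCount(ptr) {
    counts.set(ptr, (counts.get(ptr) ?? 0) - 1);
  },
};
const registry = makeRegistry(owner);

// `wrapper` stands in for the Wasm GC object as seen from JS.
let wrapper: object | null = { ptr: 2048 };
// The held value is the pointer, not the wrapper itself: the registry keeps
// the held value alive, so holding the wrapper would prevent its collection.
registry.register(wrapper, 2048);
wrapper = null; // once unreachable, the engine may run the callback at some later point
```

JS engines do not guarantee when, or even whether, a finalization callback runs, so a scheme like this can delay freeing the linear memory or, in the worst case, never free it.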

If you share this reference with another Wasm GC module and you don't have JS support in your host, then I think you're out of luck until Wasm GC gets finalizer support.

Thanks for your patience and the detailed explanation! I think I understand now: the ownership of this linear-memory-backed Wasm GC object is actually transferred to the host JS object, am I correct?

This is interesting. By the way, I recently implemented a GC finalizer mechanism for WasmGC in WebAssembly Micro Runtime. It's a very simple implementation: the host can invoke some APIs to register a finalizer on a given Wasm object.

osa1 commented

This plan supports both transferring ownership and sharing. You just need to make sure that whenever you introduce a reference to the linear memory object you increment the reference count, and that when the reference dies you decrement it. The code that decrements the ref count to 0 needs to call the deallocation function.