WebAssembly/module-linking

Aliasing outer imported modules forces modules to have per-instance runtime state


One thing I tripped over recently when updating wasmtime's implementation as proposed in #26 is that, I believe, when you alias an outer module's imported module you force each instantiation to create new modules with new state.

For example a simple module like:

(module $PARENT
  (import "" (module $b))
  (module $a
    (instance (instantiate (module outer $PARENT $b))))

  (export "a" (module $a)))

Each time you instantiate $PARENT, the "a" export will have different state, because it has to reference whatever module was imported for that particular instantiation.

I was under the impression, however, that this was likely not intended. Does validation need to ensure that outer module aliases only refer to locally-defined modules? Or is this an intended feature and consequence for engines to implement?

The issue, as written, wasn't 100% clear to me, but after clarifying with Alex, here is my summary / reinterpretation:

Modules have historically been static, self-describing things, and if we allow inner modules to close over their outer modules' imports, then we lose that property. Is the module linking proposal intentionally trading away this property to gain module-level closures, or was this an oversight?

Good question! I think this is indeed intentional and valuable. One useful pattern it enables is preventing O(n^2) blowup of module types in module dependency trees. Although the syntax needs to be updated, I have an old gist that illustrates this -- in particular, see the use of the outer module's module imports by $B_wrap and $C_wrap.

Conceptually, I wouldn't think of this as giving modules state but, rather, think of modules as functions that produce tuples of values (instances) when applied (instantiated). Thus, modules are basically function values and, in your example, $a is an expression for producing a new module value each time $PARENT is applied (instantiated), analogous to, say, the following JS:

function PARENT(b) {
  function a() {
    let _ = b()
  }
  return {a}
}

In a JS engine, a gives rise to 3 related things: bytecode (created once, up front), a function value (created each time PARENT is applied), and a closure (created each time a is applied). I expect the new weirdness you're seeing is that, up until now, implementing wasm has only needed 2 of these 3 things: a wasm version of "bytecode" (a wasm module) and "closure" (a wasm instance), and now we need the middle thing (which we could call a "module value"). With module linking, all module (function) instantiation (application) happens up front, so this "module value" needn't be reified to outlive the instantiation procedure. However, just to paint the whole picture, at some point in the (farther off) future, I'm pretty sure we'll need a runtime instantiate core instruction, at which point "module values" become actual first-class values, distinct from (current, static) modules and (current, dynamic) instances.
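
To make the three-way split concrete, here is a rough JS model of it. This is only a sketch: A_CODE, instantiateParent, and instantiate are hypothetical names made up for illustration, not part of the proposal or of any engine's API, and plain objects stand in for modules and instances.

```javascript
// 1. "Bytecode": the static description of $a, created once and shared.
//    (Hypothetical stand-in for a compiled wasm module.)
const A_CODE = Object.freeze({ name: "a", outerAliases: ["b"] });

// 2. "Module value": instantiating $PARENT pairs the shared code with
//    whatever module was imported as $b, so each call yields a fresh value.
function instantiateParent(b) {
  const a = Object.freeze({ code: A_CODE, captured: { b } });
  return { a };
}

// 3. "Instance": instantiating a module value runs the captured $b.
//    Here a "module" $b is modeled as a function returning an instance object.
function instantiate(moduleValue) {
  return { inner: moduleValue.captured.b() };
}

// Two instantiations of $PARENT with different imports.
const p1 = instantiateParent(() => ({ tag: "b1" }));
const p2 = instantiateParent(() => ({ tag: "b2" }));

// The exported "a" values differ per instantiation but share one code object.
const sameCode = p1.a.code === p2.a.code;   // true
const sameModuleValue = p1.a === p2.a;      // false
const instanceTag = instantiate(p1.a).inner.tag;
```

In this model the per-instantiation cost is only the small { code, captured } pair; the compiled code itself is never duplicated, which is roughly what I'd expect an engine to do as well.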

Ok makes sense, just wanted to confirm! You're right that this came up during implementation where I realized that third thing will be needed, but it shouldn't be the end of the world to implement it!