WebAssembly/module-linking

Difficulties with inline modules in the text format

alexcrichton opened this issue · 2 comments

I've been working on a prototype implementation of this specification, mostly creating a text parser so far. This proposal has ended up being a massive change that greatly increases the complexity of the text-to-binary translator. I wanted to open this issue to discuss some of the problems I've hit, though I don't really expect a ton of change to come out of it. I think it's important to acknowledge the burden being placed on a text-to-binary translator, but that's not a reason to avoid new features!

The main problem with a text-to-binary translator is that it may need to infer the type of a nested module. For example:

(module 
  (module $foo (; ... ;) )
)

Here the type of $foo isn't listed. The text-to-binary translator will need to calculate the type of $foo and inject it into the parent module. This is greatly complicated in the presence of alias directives, where the child module can refer to the parent (or its own sub-children). Cycles are certainly one possibility that can arise, but more generally it means that resolving names to indices must be done extremely carefully.

One issue I've run into which is somewhat fundamental, however, is that in creating a type for $foo it's difficult to know where to place it in the type index space. Given the current text of this proposal, the first five sections (type, import, alias, module, instance) can all be interleaved. This means that in the text and binary format the type of $foo needs to positionally come before the definition of $foo. In our example above it means that we need to inject the type of $foo before that textual directive.

This is generally expected and not too bad, but it means that we can't actually know the index of later types:

(module 
  (module $foo (; ... ;) )
  (type $t (func))
)

Here the index of $t isn't actually known until the type of $foo is fully known. To make matters worse, knowing the full type of $foo may require knowing indices/types in the parent:

(module 
  (module $foo
    (alias $foo_t parent (type 1)) ;; what type does this refer to?
    (func (export "x") (type $t))
  )
  (type $t (func))
)
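
To make the index-shifting problem concrete, here is a minimal sketch of how a translator might assign type indices in the parent module. It is not taken from any real implementation: the function name `assign_type_indices`, the directive tuples, and the `injected_types_for` map (how many type entries get injected right before each nested module) are all hypothetical, and it assumes injected types are placed immediately before the module they describe.

```python
def assign_type_indices(items, injected_types_for):
    """Assign type indices to named types in a parent module.

    items: directives in textual order, each ("type", name) or ("module", name).
    injected_types_for: hypothetical map from nested-module name to the number
        of type entries the translator injects right before that module.
    """
    indices = {}
    next_index = 0
    for kind, name in items:
        if kind == "module":
            # Assumption: the inferred type(s) of a nested module occupy
            # index slots immediately before the module's definition.
            next_index += injected_types_for.get(name, 0)
        elif kind == "type":
            indices[name] = next_index
            next_index += 1
    return indices


# The parent module from the example: $foo comes first, then $t.
parent = [("module", "$foo"), ("type", "$t")]

# If nothing is injected for $foo, $t sits at index 0.
print(assign_type_indices(parent, {}))           # {'$t': 0}
# Once one type is injected for $foo, $t moves to index 1.
print(assign_type_indices(parent, {"$foo": 1}))  # {'$t': 1}
```

The index of $t depends on how many entries are injected for $foo, and that count is only known once $foo's type has been fully computed. So a child reference like `parent (type 1)` cannot be resolved before the child itself has been elaborated.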

Overall I'm not sure how best to specify what happens in these cases. I've personally had a lot of trouble grappling with what should and shouldn't be accepted with mutually referential modules, and so far nothing really feels "natural" as a way to specify what should and shouldn't be accepted, or how types should be elaborated. I wanted to make sure there was an issue on this in case others had ideas, though!

Yes, I agree this seems rather nasty, and would greatly complicate the text format. The simplest fix would be to always require nested modules to have an explicit type annotation, i.e., never infer it by desugaring.

I believe this has been resolved with #23 and other recent updates.