WebAssembly/binaryen

wasm-opt error when closed-world option enabled

hondogo opened this issue ยท 28 comments

Binaryen version: 119

When executing wasm-opt with enabled option closed-world got:
Fatal: Internal GlobalTypeRewriter build error: Heap type has an invalid supertype at index 1

Command that execute:
wasm-opt --enable-gc --enable-reference-types --enable-exception-handling --enable-bulk-memory --enable-nontrapping-float-to-int --closed-world --type-ssa input.wasm -O3 -o optimized.wasm

Execution is successful when performed without option closed-world.

In attachment there is input.zip file (archive with input.wasm) for which this command was executed.
input.zip

Reduced testcase:

(module
  (rec
    (type $parent (sub (struct (field (ref func)))))
    (type $child1 (sub $parent (struct (field (ref func)))))
    (type $child2 (sub $parent (struct (field (ref func)))))

    (type $func (func (param $child1 (ref $child1)) (param $child2 (ref $child2))))
  )

  (func $func (export "func") (type $func) (param $child1 (ref $child1)) (param $child2 (ref $child2))
    (drop
      (struct.new $parent
        (ref.func $func)
      )
    )
    (drop
      (struct.new $child1
        (ref.func $func)
      )
    )
    (drop
      (struct.new $child2
        (ref.func $func)
      )
    )
    (drop (struct.get $child2 0 (local.get $child2)))
  )
)

What happens here is that the entire rec group is public because of that function, which is exported, and which then makes the entire rec group public. GTO is not noticing that it is public and optimizing, which leads to making those struct.news all by new_defaults, since it wants to remove the fields. But one field is non-nullable, which errors, as the field is not actually removed - TypeUpdater does notice the type is public.

We can make GTO avoid public types, and I'll do that as a bugfix. However, separately, on this testcase that would mean GTO will do nothing at all, as there is just one huge rec group, which is a big loss. I was hoping that --minimize-rec-groups would help here, but it doesn't - running it on that reduced testcase does nothing. I was hoping it would split out the type $func which is not recursive with anything - @tlively do you know why it isn't splitting there?

#7017 resolves one issue here, but I also see an error on SignaturePruning.

@hondogo Separately from fixing these bugs, if you can avoid mixing the types of exported functions in the big rec group, then you will get much better results (as it will keep the big rec group private, and modifiable).

@kripken Thanks for advise!
This code was generated by Kotlin compiler.
I've forwarded your advise to Kotlin team.

--minimize-rec-groups does not modify public types, so it can't do anything here.

@tlively Oh, right, thanks. That can't help here then.

I guess in the long term if we add a way to mark a rec group as private then we could optimize using that (so a private rec group might contain public types, which we would not modify, but other ones we could).

Maybe, but even then modifying the types would change the module's external types, so it still wouldn't be safe. The real solution is for the producer to avoid this situation when originally producing the module.

Note, however, that Kotlin is currently forced into this situation where all the types are in the same rec group as the public types because it is a hack to work around failing our closed world validation. The real real solution is for us to make the changes described in #6965.

Wait, how does putting all the types in a single rec group work as a hack for a closed world validation issue?

The validation allows all types in the same rec group as a public function type to be public without complaint. This might be a bug, but I would say the existence of the validation is a bigger bug. Is a bug in a feature that is a bug a feature?

Is a bug in a feature that is a bug a feature?

๐Ÿคฃ

Yeah, I guess that validation is kind of weird. Removing it as part of #6965 sgtm. In fact, if it is blocking people right now, removing the validation by itself seems fine?

I'd be worried about getting an avalanche of fuzz bugs, but other than that, I agree that removing the validation as a first step would be nice.

Sounds good, see #7019 for removing that validation.

@kripken
Could you please hint to me which declarations in the original input.wasm led to the bug?
Could you please make a release with a fix?

@bashor One fix landed so far, another is open in #7018, and there is at least one more error after that lands that I haven't diagnosed yet. I can make a release when they all land.

I don't think there is anything to do on your side here. This code is just hitting some Binaryen bugs because we didn't have much real-world content that mixed public and private types in closed-world mode (e.g. Java uses closed-world but doesn't mix, and some other users mix but don't use closed-world). We'll just fix those bugs in Binaryen.

However, if you can avoid mixing public and private types in a single rec group, that will help. If closed-world validation errors were what forced you to do that, then #7019 should help once it lands.

By "mixed public and private types in a single rec group" I mean things like the example in the second comment. There the function type could have been in its own singleton rec group outside, which would have left all the private types together, where they could be optimized.

With #7022, #7018, #7019 (all not yet landed), the testcase here finally passed in closed world (even including the extra validation checks of pass-debug mode).

To get an idea of the benefit of closed world here, I ran -O3 --gufa -O3 --gufa -O3 -tnh with and without it. Code size is 6% smaller with closed world. (There may also be speed benefits.)

I also did an experiment where I

  • Moved the types of _initialize and startUnitTests out of the main rec group, and
  • Removed the exports of main and __callFunction_(()->Unit) (just moving their types wouldn't be enough, as their signatures contain other types).

As a result, there are practically no public types remaining.

When I then optimize with vs without --closed-world, that flag makes the binary 48% smaller - almost half the size.

@bashor Based on that I think there can be large benefits to emitting as few public types as possible in the Kotlin compiler.

@bashor Based on that I think there can be large benefits to emitting as few public types as possible in the Kotlin compiler.

In particular, this can be accomplished by using anyref instead of concrete reference types in the signatures of exported functions.

@kripken @tlively thanks, folks for the hints, we will investigate that.

@kripken, could you please publish a new release, since the PRs mentioned above are merged?

It looks like it is better to wait for the merging of #7031, right?

That's merged now ๐Ÿ‘

Great, thanks!
Could we please have a release now?

In particular, this can be accomplished by using anyref instead of concrete reference types in the signatures of exported functions.

@tlively using structref is also ok, right?

Moved the types of _initialize and startUnitTests out of the main rec group, and

@kripken aren't they already defined outside of the big rec group?

__callFunction_(()->Unit) couldn't be removed in this case since it's used to call kotlin FunctionX from JS function wrapper created by __convertKotlinClosureToJsClosure_(()->Unit).

@kripken, am I right that the original problem was caused by the exported main function and its type?

In particular, this can be accomplished by using anyref instead of concrete reference types in the signatures of exported functions.

@tlively using structref is also ok, right?

Yes, that would be fine as well.

(For the rest, @kripken is out of office today but will be back tomorrow)

@bashor Yes, it looks like $main was the actual problem. I removed all the type annotations from all exported functions earlier, but now I see that was more than was needed.

I added a release now, 120: https://github.com/WebAssembly/binaryen/releases/tag/version_120

Looks like all the issues here have been resolved. If there are any new issues found with public types, please open new issues.