WebAssembly/stringref

Stringref as a toolchain concept

wingo opened this issue · 3 comments

wingo commented

So, the CG does not currently have consensus to move stringref forward. It may in the future, but that's not where we're at right now.

For the moment, the Hoot compiler does emit stringref, but to ship to current WebAssembly hosts, it lowers strings to WTF-8 arrays, i.e. (array i8). For information on what kinds of use cases stringref fulfills for Hoot, see the lower-stringrefs pass: https://gitlab.com/spritely/guile-hoot/-/blob/main/module/wasm/lower-stringrefs.scm. There is a discussion of the tradeoffs at the top of that file.
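To make "WTF-8 array" concrete, here is a minimal sketch of the encoding itself, written in TypeScript rather than Wasm. The helper name `encodeWtf8` is hypothetical and this is not code from Hoot's lowering pass; it only illustrates how WTF-8 differs from strict UTF-8 by also accepting lone surrogates.

```typescript
// Hypothetical helper for illustration only; not taken from Hoot.
// Encodes a JavaScript string (a sequence of UTF-16 code units) as WTF-8 bytes.
function encodeWtf8(s: string): Uint8Array {
  const out: number[] = [];
  for (let i = 0; i < s.length; i++) {
    let cp = s.charCodeAt(i);
    // Combine a well-formed surrogate pair into one supplementary code point.
    if (cp >= 0xd800 && cp <= 0xdbff && i + 1 < s.length) {
      const lo = s.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        cp = 0x10000 + ((cp - 0xd800) << 10) + (lo - 0xdc00);
        i++; // the low surrogate has been consumed
      }
    }
    if (cp < 0x80) {
      out.push(cp);
    } else if (cp < 0x800) {
      out.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {
      // Lone surrogates land here and are encoded like any other
      // three-byte code point; strict UTF-8 would reject them.
      out.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      out.push(
        0xf0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f),
        0x80 | (cp & 0x3f),
      );
    }
  }
  return Uint8Array.from(out);
}
```

For example, `encodeWtf8("\ud800")` yields the three bytes `ED A0 80`, so a host string containing a lone surrogate survives the trip into an (array i8) without loss.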

So, in summary, it is possible to be productive with stringref on the toolchain level, even in the world where stringref is not available on the wasm hosts.

Pauan commented

> So, in summary, it is possible to be productive with stringref on the toolchain level, even in the world where stringref is not available on the wasm hosts.

But one of the major goals of stringref is to provide efficient interop with the host (e.g. JavaScript). Using an array i8 does not achieve that goal.

Another major goal of stringref is to avoid hardcoding a single encoding, but array i8 is a hardcoded encoding, so it also does not achieve that goal.

So array i8 cannot ever serve as a substitute for stringref.

wingo commented

  1. We do use stringref for the Hoot VM host, providing more efficient interop.
  2. Stringref has other toolchain uses, e.g. it facilitates optimal tree-shaking of literals.
  3. Of course if the host doesn't have stringref you have to do something. It's a private concern of the module though, and anything that works for the particular compiler toolchain is fine. In the case of Hoot, if we have to choose, we'll use WTF-8. But obviously we would prefer stringref in the host!

wingo commented

An update: replacing string with a concrete type, e.g. (array i8), works in some cases but isn't general. If the compiler/toolchain emits dynamic ref.test / br_on_cast checks against the replacement concrete type, or against any of its supertypes that aren't shared with string, then the replacement type doesn't refine string: such a check would succeed on a lowered string where it would have failed on a real one. In the case of WTF-8 arrays, those types are array i8, array, and eq.

In the case of the Hoot compiler, we only emit dynamic checks against i31 and specific subtypes of struct; refinement holds under this restriction.
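The refinement condition above can be checked mechanically on a toy model. The sketch below is in TypeScript, not Wasm, and assumes the type hierarchy implied by this comment: string sits directly under any, while array i8 sits under array, eq, and any. The names `HeapType`, `parent`, `isSubtype`, and `divergentTargets` are all made up for illustration.

```typescript
// Toy model of the heap types discussed in this thread (an assumption
// for illustration, not an implementation of Wasm validation).
type HeapType = "any" | "eq" | "i31" | "struct" | "array" | "array i8" | "string";

// Immediate supertype of each heap type; null marks the top type.
const parent: Record<HeapType, HeapType | null> = {
  any: null,
  eq: "any",
  i31: "eq",
  struct: "eq",
  array: "eq",
  "array i8": "array",
  string: "any",
};

// True if `sub` equals `sup` or has `sup` somewhere in its supertype chain.
function isSubtype(sub: HeapType, sup: HeapType): boolean {
  for (let t: HeapType | null = sub; t !== null; t = parent[t]) {
    if (t === sup) return true;
  }
  return false;
}

// Test targets for which a dynamic check gives a different answer once
// a string value has been replaced by a value of the `replacement` type.
function divergentTargets(replacement: HeapType): HeapType[] {
  const targets: HeapType[] = ["any", "eq", "i31", "struct", "array", "array i8"];
  return targets.filter(
    (t) => isSubtype("string", t) !== isSubtype(replacement, t),
  );
}

console.log(divergentTargets("array i8")); // [ 'eq', 'array', 'array i8' ]
```

Checks against i31 and struct do not appear in the result, which matches the restriction Hoot relies on.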