WebAssembly/stringref

`string.new*`, `string.const`, `string.concat` should produce non-nullable results

jakobkummerow opened this issue · 6 comments

Currently, the spec text for the various string creating instructions says that the result has type stringref, e.g.:

(string.new_wtf8 $memory $wtf8_policy ptr:address bytes:i32)
  -> str:stringref

(string.concat a:stringref b:stringref) -> stringref

With stringref being nullable (contrary to initial assumptions?), that doesn't make a whole lot of sense: I think the result type should be a non-nullable (ref string) instead.
(This would be similar to e.g. struct.new $t, which also returns a non-nullable (ref $t).)
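
To illustrate why narrowing the result type should be harmless for consumers, here is a minimal Python model of reference-type nullability. It is a simplified sketch, not spec text: `RefType` and `is_subtype` are hypothetical names made up for this example, and heap-type subtyping is ignored (only identical heap types match). Under those assumptions, a non-nullable (ref string) can flow anywhere a nullable stringref is expected, but not the other way around.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefType:
    """Hypothetical model of a Wasm reference type (illustration only)."""
    heap: str       # heap type name, e.g. "string"
    nullable: bool  # True for (ref null X), False for (ref X)

    def __str__(self) -> str:
        if self.nullable:
            return f"(ref null {self.heap})"
        return f"(ref {self.heap})"


def is_subtype(sub: RefType, sup: RefType) -> bool:
    """Simplified rule: same heap type, and nullability may only widen."""
    return sub.heap == sup.heap and (sup.nullable or not sub.nullable)


# stringref is shorthand for the nullable form; (ref string) is non-nullable.
STRINGREF = RefType("string", nullable=True)
REF_STRING = RefType("string", nullable=False)

# A non-null result from string.concat could feed a stringref parameter...
print(is_subtype(REF_STRING, STRINGREF))
# ...but a nullable stringref could not feed a (ref string) parameter
# without an explicit null check.
print(is_subtype(STRINGREF, REF_STRING))
```

In this model, code that already accepts stringref keeps validating when the string creating instructions return (ref string); only code that wants to rely on non-nullness needs the narrower type.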

A follow-up question that's less clear to me is what to do about the views.

We could specify the three string.as_* view creation instructions to return non-nullable (ref stringview_*) as well. That would be consistent with making string creating instructions return non-nullable (ref string), and is probably not controversial.

We could then go a step further, assume that this leaves nullable view references unused, and change all the view-consuming instructions to consume non-null views. Would that be desirable because it fits in well with the open questions around where to fit the views into the type system, which might get resolved by making them standalone (non-ref) types? Or would it be an unreasonable limitation for applications that want to store/pass optional string views?

I've gone ahead and implemented the changes I expect to be non-controversial: https://chromium-review.googlesource.com/c/v8/v8/+/3858236

Of course we can further iterate on the behavior if needed.

wingo commented

AFAIU this change would introduce a layering between the stringrefs proposal and GC where there was none before. Three options:

  1. Leave as it was: stringref and the views are all nullable by default, and nothing in this spec consumes non-nullable stringref values.
  2. Layer on top of GC, and replace all reference types with non-nullable types, except for string.eq.
  3. Some frankensolution that will be different depending on whether GC is enabled or not :)

Thoughts? ISTR the timeline for stringrefs eventually shipping (should it reach that stage) would be not dissimilar to GC, so perhaps it makes sense to take option (2).

wingo commented

Relatedly, if we made this change, it could make sense to remove the one-byte shorthand for the nullable stringref type, and instead require the two-byte format ((ref string) or (ref null string)) wherever we need it.

Where do you see a dependency on the GC proposal? Both (ref X) and (ref null X) are introduced by the (somewhat misleadingly named) "typed function references" proposal, which is:

  1. reasonably close to getting finalized, and
  2. a much smaller set of functionality than the GC proposal, so hypothetical engines that shy away from implementing GC can probably still be expected to implement function references

so depending on that is probably fine.

I don't feel strongly about the one-byte shorthand for stringref either way. (For "consistency", I expect making it mean non-nullable (ref string) will encounter opposition even if that would be the more useful shorthand to have available; it could be that the "consistency" argument will even be used against not having a shorthand at all, because all other reftypes have one.)

wingo commented

Ah, I had just imagined that typed function references was in practice part of GC. In my mind typed-function-references was stuck but I see that it has become unstuck; thanks for the correction!