Possible addition: string.new_ascii / string.new_ascii_array
jakobkummerow opened this issue
Sufficiently advanced engines tend to have a specialized internal string representation for ASCII-only strings (V8 certainly does; I'm pretty sure other existing engines do too) [1]. There are also some common use cases where a string being created is known to be in the ASCII range, such as number-to-string conversions. Of course it is possible to use `string.new_utf8[_array]` or `string.new_wtf16[_array]` in these situations, but both require the engine to scan the memory/array for non-ASCII elements before deciding which kind of string to allocate and copy the characters into, which has significant overhead [2]. We could avoid this by adding instructions that create strings from data known to be ASCII.
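To make that overhead concrete, here is a minimal engine-side sketch (hypothetical code, not V8's actual implementation) of the extra pass that `string.new_wtf16_array` forces on an engine with a one-byte representation:

```cpp
// Hypothetical engine-side sketch (not actual V8 code). Before an
// engine with a one-byte string representation can service
// string.new_wtf16_array, it must scan every code unit just to
// choose a representation:
#include <cstdint>
#include <cstddef>

enum class Repr { kOneByte, kTwoByte };

Repr ChooseRepr(const uint16_t* units, size_t len) {
  for (size_t i = 0; i < len; i++) {
    if (units[i] > 0x7F) return Repr::kTwoByte;  // non-ASCII found
  }
  return Repr::kOneByte;
}
// Only after this pass can the engine allocate a string of the right
// representation and copy the code units into it. An instruction that
// carries an ASCII guarantee would let it allocate a one-byte string
// directly (with whatever semantics the proposal picks for
// out-of-range input).
```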
There's partial, but only partial, overlap between this suggestion and #51, insofar as number-to-string conversions are a use case that could benefit from either instruction set addition but is unlikely to benefit from both. That said, if we decide, for example, that integer-to-string is sufficiently common and standard (in the sense that everyone does it the same way) to warrant its own instruction, whereas float-to-string is sufficiently uncommon and/or language-specific that we'll leave it up to languages to ship their own implementations, then the latter would still benefit from a `string.new_ascii_array` instruction. Also, there may well be common use cases aside from number conversion where the producer side knows it's creating ASCII strings.
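As one illustration of such a producer-side guarantee, consider a language runtime's own integer-to-string routine. This sketch is illustrative (the routine name and buffer layout are assumptions for this example, not part of the proposal), but every byte it writes is in the ASCII range by construction:

```cpp
// Illustrative producer-side sketch. A module converting an i32 to
// decimal digits knows statically that every byte it emits is '-'
// or '0'..'9', i.e. ASCII.
#include <cstdint>
#include <cstddef>

// Writes the characters to the end of a 12-byte buffer and returns
// the offset of the first one; the result occupies [pos, 12).
size_t I32ToAscii(int32_t value, uint8_t out[12]) {
  uint32_t v = value < 0 ? 0u - static_cast<uint32_t>(value)
                         : static_cast<uint32_t>(value);
  size_t pos = 12;
  do {
    out[--pos] = static_cast<uint8_t>('0' + v % 10);  // always < 0x80
    v /= 10;
  } while (v != 0);
  if (value < 0) out[--pos] = '-';
  return pos;
}
// The result could feed a string.new_ascii_array-style instruction
// with no validity scan, because the ASCII property holds by
// construction.
```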
I wouldn't mind adding such instructions to the MVP; I'm also fine with postponing them to a post-MVP follow-up.
[1] Strictly speaking, any form of "one-byte" string representation is relevant here, e.g. "Latin1"; ASCII is the lowest common denominator of these. In fact, in V8, our "one-byte" strings actually support the Latin1 range, yet I'm suggesting ASCII (i.e. character codes 0 through 127) for standardization here, because I believe that's the subset that maximizes the intersection of usefulness to applications and freedom of implementation choice to engines.
[2] To illustrate with specific numbers: on a particular microbenchmark I'm looking at, which converts 32-bit integers to strings, the score is 20 when I check for ASCII-only characters, and 27 (+35%) when I blindly copy i16 array elements to 2-byte string characters, which wastes memory. There may be potential for (minor?) improvements using SIMD instructions or similar for faster checking, but why bet on engine magic/heroics when it's so trivial to add a Wasm-level primitive that makes it easy and reliable to get high performance?
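For reference, the kind of faster checking alluded to in [2] might look like the following word-at-a-time sketch (illustrative only; V8's actual code may differ). It reduces the per-unit cost, but the O(n) scan itself remains, which is exactly what `string.new_ascii[_array]` would eliminate:

```cpp
// Sketch of faster ASCII checking: test four 16-bit code units per
// 64-bit load with a single mask, instead of one compare per unit.
// Illustrative only, not V8's actual implementation.
#include <cstdint>
#include <cstring>
#include <cstddef>

bool IsAsciiOnly(const uint16_t* units, size_t len) {
  size_t i = 0;
  for (; i + 4 <= len; i += 4) {
    uint64_t chunk;
    std::memcpy(&chunk, units + i, sizeof(chunk));
    // Any code unit > 0x7F sets a bit under the 0xFF80 mask of its lane.
    if (chunk & 0xFF80FF80FF80FF80ull) return false;
  }
  for (; i < len; i++) {
    if (units[i] > 0x7F) return false;  // scalar tail
  }
  return true;
}
```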