Rust chars are not UTF-8
Closed this issue · 1 comments
ChayimFriedman2 commented
> Rust char in F# is a char (.NET Char). Rust char is UTF-8, while in F# they are UTF-16.
Rust char is UTF-32 (that's not specified, although it is specified that it should be 4 bytes wide):
> Representation
> char is always four bytes in size. This is a different representation than a given character would have as part of a String ...
https://doc.rust-lang.org/std/primitive.char.html#representation
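A quick way to check the four-byte claim is a minimal sketch like the one below. The helper name char_widths is hypothetical and only for illustration; size_of_val, len_utf8 and len_utf16 are standard library calls.

```rust
use std::mem::size_of_val;

/// Hypothetical helper: returns (bytes used by the char value itself,
/// bytes needed in UTF-8, 16-bit units needed in UTF-16).
fn char_widths(c: char) -> (usize, usize, usize) {
    (size_of_val(&c), c.len_utf8(), c.len_utf16())
}

fn main() {
    // The char value is always four bytes; only the encoded forms vary.
    for c in ['a', 'é', '🦀'] {
        let (in_memory, utf8, utf16) = char_widths(c);
        println!(
            "{c}: U+{:04X}, {in_memory} bytes as char, {utf8} bytes in UTF-8, {utf16} units in UTF-16",
            c as u32
        );
    }
}
```

The first number stays at 4 for every character, while the UTF-8 and UTF-16 lengths depend on the code point.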
OTOH, Rust strings (String and str) are UTF-8 encoded, and actually represented with Vec<u8> (https://doc.rust-lang.org/src/alloc/string.rs.html#279-281).
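To see the UTF-8 bytes behind a string, something like this sketch should do. The helper names utf8_bytes and byte_and_char_counts are made up for the example; into_bytes, len, chars and from_utf8 are the standard String/str methods.

```rust
/// Hypothetical helper: the UTF-8 bytes backing a string, as an owned Vec<u8>.
fn utf8_bytes(s: &str) -> Vec<u8> {
    String::from(s).into_bytes()
}

/// Hypothetical helper: (length in UTF-8 bytes, number of chars).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    let s = "héllo";

    // len() counts bytes, not chars: 'é' is encoded as two bytes (0xC3 0xA9).
    let (bytes, chars) = byte_and_char_counts(s);
    println!("{s}: {bytes} bytes, {chars} chars");
    println!("{:02X?}", utf8_bytes(s));

    // The bytes convert back to a String because they are valid UTF-8.
    let back = String::from_utf8(utf8_bytes(s)).expect("valid UTF-8");
    assert_eq!(back, s);
}
```

Collecting the same text into a Vec<char> would take four bytes per character instead, which is the difference in representation the docs mention.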
Dhghomon commented
Oh yeah, silly me. I do have a note on String being Vec<u8> though.