elm/core

NaN integers can be returned by `Char.toCode`

alch-emi opened this issue · 2 comments

It seems like (as of version 0.19.1) it's possible to construct a NaN : Int through the expression

    Char.fromCode 0xd800 |> Char.toCode

After some research, this seems unintended (and I think NaN ints are unintended in general?). The expected return value would seem to be 0xFFFD (aka �), which is what you get when you feed most other invalid Unicode code points to this expression.
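A rough model of how the NaN could arise, written in TypeScript because Elm compiles to JavaScript. `fromCodeModel` and `toCodeModel` are hypothetical stand-ins, not code copied from elm/core. The sketch assumes that `Char.toCode` decodes a high surrogate by reading a second UTF-16 code unit, which does not exist when the surrogate stands alone.

```typescript
// Hypothetical model of Char.fromCode (an assumption, not the elm/core source):
// out-of-range inputs become U+FFFD, astral code points become a surrogate pair.
function fromCodeModel(code: number): string {
  if (code < 0 || code > 0x10ffff) {
    return "\uFFFD";
  }
  if (code <= 0xffff) {
    // Surrogate values 0xD800..0xDFFF pass through here unchanged.
    return String.fromCharCode(code);
  }
  const offset = code - 0x10000;
  return String.fromCharCode(
    Math.floor(offset / 0x400) + 0xd800,
    (offset % 0x400) + 0xdc00
  );
}

// Hypothetical model of Char.toCode (also an assumption).
function toCodeModel(ch: string): number {
  const first = ch.charCodeAt(0);
  if (0xd800 <= first && first <= 0xdbff) {
    // For a lone high surrogate, charCodeAt(1) is NaN, so the sum is NaN.
    return (first - 0xd800) * 0x400 + ch.charCodeAt(1) - 0xdc00 + 0x10000;
  }
  return first;
}

console.log(toCodeModel(fromCodeModel(0xd800)));   // NaN
console.log(toCodeModel(fromCodeModel(0x110000))); // 65533 (0xFFFD)
```

Under this model only lone high surrogates (0xD800 to 0xDBFF) produce NaN. Lone low surrogates come back unchanged, and out-of-range values map to 0xFFFD.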

I did a cursory search of other issues in this repo and it doesn't seem like anyone else has opened an issue for this, but please excuse me if I have missed something. oh i just read the duplicates policy! that's lovely!!

Thank you for your time!

OS: NixOS 24.11 on Linux 6.6.63

Occurs in the REPL and Firefox 133.0

Thanks for reporting this! To set expectations:

  • Issues are reviewed in batches, so it can take some time to get a response.
  • Ask questions in a community forum. You will get an answer quicker that way!
  • If you experience something similar, open a new issue. We like duplicates.

Finally, please be patient with the core team. They are trying their best with limited resources.

We found this behavior a while ago and wrote a small explainer:
https://github.com/stil4m/elm-syntax/blob/master/src/Char/Extra.elm#L370-L384

Funnily enough, without some form of this behavior, the elm-syntax parser would be many times slower because there is no other way to check for UTF-16 surrogates currently.
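To illustrate how a NaN result might double as a surrogate check, here is a sketch in TypeScript. It is not taken from the linked elm-syntax source. `toCodeModel` and `startsSurrogatePair` are hypothetical names, and the decoder is assumed to yield NaN for a lone high surrogate.

```typescript
// Hypothetical decoder (an assumption): reads a second UTF-16 code unit
// after a high surrogate, which is NaN when that unit is missing.
function toCodeModel(ch: string): number {
  const first = ch.charCodeAt(0);
  if (0xd800 <= first && first <= 0xdbff) {
    return (first - 0xd800) * 0x400 + ch.charCodeAt(1) - 0xdc00 + 0x10000;
  }
  return first;
}

// Slice out exactly one code unit and ask whether decoding it yields NaN.
// A NaN answer means the unit is a high surrogate, i.e. the character
// starting at `offset` occupies two code units.
function startsSurrogatePair(source: string, offset: number): boolean {
  const oneUnit = source.slice(offset, offset + 1);
  if (oneUnit.length === 0) {
    return false; // past the end of the string: nothing to decode
  }
  return Number.isNaN(toCodeModel(oneUnit));
}

const sample = "a\u{1F600}b"; // code units: 'a', 0xD83D, 0xDE00, 'b'
console.log(startsSurrogatePair(sample, 0)); // false
console.log(startsSurrogatePair(sample, 1)); // true
```

The check only recognizes the first half of a pair. A low surrogate decodes to its own code unit value in this model, so it is reported as false.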