fromCharCode BMP
jamesdbrock opened this issue · 5 comments
fromCharCode should return Nothing if the code is out of the Basic Multilingual Plane Char range, right?
purescript-strings/src/Data/Char.purs
Line 16 in 157e372
>>> show $ fromCharCode 65900
(Just 'Ŭ')
The Bounded instance for Char says that “Characters fall within the Unicode range,” but the documentation for Char says it is “guaranteed to contain one code unit.”
Oh interesting, it appears this is actually the line at fault:
https://github.com/purescript/purescript-enums/blob/170d959644eb99e0025f4ab2e38f5f132fd85fa4/src/Data/Enum.purs#L316-L318
It's using top and bottom for ints, not chars. I guess n >= toCharCode bottom && n <= toCharCode top might work?
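To make the suggested check concrete, here is a minimal TypeScript sketch of the same idea on the JavaScript side. The names `MIN_CHAR_CODE`, `MAX_CHAR_CODE`, and `fromCharCodeChecked` are hypothetical, and the bounds 0x0000 and 0xFFFF are an assumption about what `toCharCode bottom` and `toCharCode top` evaluate to for a single UTF-16 code unit. It is not the library's actual implementation.

```typescript
// Assumed stand-ins for `toCharCode bottom` and `toCharCode top` on Char
// (one UTF-16 code unit). These values are an assumption, not taken from the library.
const MIN_CHAR_CODE: number = 0x0000;
const MAX_CHAR_CODE: number = 0xffff;

// Hypothetical helper: returns null (playing the role of Nothing) when the
// code is outside the Char range, instead of letting it wrap around.
function fromCharCodeChecked(n: number): string | null {
  if (!Number.isInteger(n) || n < MIN_CHAR_CODE || n > MAX_CHAR_CODE) {
    return null;
  }
  return String.fromCharCode(n);
}

console.log(fromCharCodeChecked(65));    // in range
console.log(fromCharCodeChecked(65900)); // out of range, rejected
```

The point of comparing against the Char bounds rather than the Int bounds is that every Int passes an Int-range check, so the out-of-range case can never be reached.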
String.fromCharCode just does (code) % 0x10000 on the code, so what you're seeing is 65900 % 0x10000 = 0x16C.
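The wraparound can be checked directly. This short TypeScript sketch only uses the standard `String.fromCharCode`; the variable names are illustrative.

```typescript
// String.fromCharCode truncates each argument to 16 bits,
// which for non-negative integers is the same as taking it modulo 0x10000.
const code: number = 65900;
const wrapped: number = code % 0x10000;

const viaFromCharCode: string = String.fromCharCode(code);
const direct: string = String.fromCharCode(wrapped);

console.log(wrapped.toString(16));       // "16c"
console.log(viaFromCharCode === direct); // true
```

So the out-of-range input silently maps to U+016C, which is the 'Ŭ' shown in the original report.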
I've opened an issue in purescript-enums to track this. Should this issue be closed?
I think it’s reasonable for it to stay open until the upstream issue is addressed.
Technically, we still need a release of that library and then a dependency update here.
PR ready for approval: #163.