CodePoints uncons? Deprecate drop?
Closed this issue · 8 comments
I just noticed that instance stringStringLike uses CodeUnits for uncons.
purescript-parsing/src/Text/Parsing/Parser/String.purs
Lines 27 to 29 in 3d7976e
Doesn't that mean that anyChar will be wrong for astral characters?
purescript-parsing/src/Text/Parsing/Parser/String.purs
Lines 54 to 56 in 3d7976e
Also, the drop member of StringLike is now unused?
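For anyone following along, here is a minimal sketch of the astral-character concern. It assumes the JavaScript backend, where strings are UTF-16, and uses only `uncons` and `length` from `Data.String.CodeUnits` and `Data.String.CodePoints`. The module name and the `astral`, `byUnit`, `byPoint`, `unitLength`, and `pointLength` bindings are made up for illustration and are not part of this package.

```purescript
module Example.Astral where

import Data.Maybe (Maybe)
import Data.String.CodePoints (CodePoint)
import Data.String.CodePoints as CodePoints
import Data.String.CodeUnits as CodeUnits

-- U+1D54F lies outside the Basic Multilingual Plane, so on the JavaScript
-- backend it is stored as a surrogate pair (two UTF-16 code units).
astral :: String
astral = "𝕏"

-- Splits off only the first code unit, which here is a lone high surrogate.
byUnit :: Maybe { head :: Char, tail :: String }
byUnit = CodeUnits.uncons astral

-- Splits off the whole code point and leaves an empty tail.
byPoint :: Maybe { head :: CodePoint, tail :: String }
byPoint = CodePoints.uncons astral

-- Counts UTF-16 code units: 2 on the JavaScript backend.
unitLength :: Int
unitLength = CodeUnits.length astral

-- Counts code points: 1.
pointLength :: Int
pointLength = CodePoints.length astral
```

If a parser steps through its input with the code-unit `uncons`, a character like this one would presumably come out as two separate `Char` results, neither of which is a valid character on its own.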
I just hit the anyChar issue you found above, but with the purerl backend.
I'm looking through the code for this package and basically the whole thing is completely oriented to UTF-16 code units...
You might try using this instead, @drathier: https://pursuit.purescript.org/packages/purescript-string-parsers
The problem is that https://pursuit.purescript.org/packages/purescript-string-parsers/6.0.1/docs/Text.Parsing.StringParser.CodePoints#v:anyChar puts the result into a Char, so it isn't parsing code points after all. We'll be writing our own anyCodePoint and anyGrapheme later today.
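As a rough sketch of what the core step of an anyCodePoint could look like at the string level (this is not the implementation mentioned above, and it deliberately leaves out any parser type): `stepCodePoint` and `firstScalar` are hypothetical names, and the only library calls assumed are `uncons` from `Data.String.CodePoints` and `fromEnum` from `Data.Enum`.

```purescript
module Example.AnyCodePoint where

import Prelude

import Data.Enum (fromEnum)
import Data.Maybe (Maybe)
import Data.String.CodePoints (CodePoint)
import Data.String.CodePoints as CodePoints

-- Hypothetical helper: consume one full code point from the front of the
-- input. The head is a CodePoint rather than a Char, so an astral character
-- is not split into surrogate halves.
stepCodePoint :: String -> Maybe { head :: CodePoint, tail :: String }
stepCodePoint = CodePoints.uncons

-- Numeric value of the first code point, or Nothing for empty input.
-- For "𝕏" this is Just 120143 (0x1D54F), a value a single UTF-16 code unit
-- cannot represent.
firstScalar :: String -> Maybe Int
firstScalar input = map (\r -> fromEnum r.head) (stepCodePoint input)
```

A parser built on a step like this would need to return `CodePoint` (or `String`) rather than `Char`, which is the mismatch described above.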
@drathier I'd like to see your anyGrapheme parser. Link please? Do you think that this library should include anyGrapheme?