purescript/purescript-strings

CodePoints.uncons performance optimization?

jamesdbrock opened this issue · 3 comments

It seems to me that these lines in Data.String.CodePoints.uncons

cu0 = fromEnum (Unsafe.charAt 0 s)
cu1 = fromEnum (Unsafe.charAt 1 s)

are first slicing the first code unit into a Char string with the JavaScript charAt method

if (i >= 0 && i < s.length) return s.charAt(i);

and then converting that one-character string to an Int code unit via the fromEnum method of the boundedEnumChar instance, which calls the JavaScript charCodeAt method.

https://github.com/purescript/purescript-enums/blob/170d959644eb99e0025f4ab2e38f5f132fd85fa4/src/Data/Enum.js#L4

We could skip the intermediate string slice of the charAt method and call charCodeAt directly.
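
To make the idea concrete, here is a minimal JavaScript sketch of the two shapes. The function names (codeUnitViaCharAt, codeUnitDirect, unconsFast) are hypothetical and are not part of purescript-strings; the surrogate-pair decoding is the standard UTF-16 arithmetic, written here as an assumption about what a direct version could look like rather than as the library's actual implementation.

```javascript
// Hypothetical helpers for illustration only; not the purescript-strings API.

// Current shape: slice out a one-character string, then read its code unit.
function codeUnitViaCharAt(i, s) {
  return s.charAt(i).charCodeAt(0);
}

// Proposed shape: read the code unit directly, with no intermediate string.
function codeUnitDirect(i, s) {
  return s.charCodeAt(i);
}

// Decode the first code point of s straight from its UTF-16 code units.
// Returns null for the empty string, otherwise { head, tail }.
function unconsFast(s) {
  if (s.length === 0) return null;
  var cu0 = s.charCodeAt(0);
  // A lead surrogate followed by a trail surrogate forms one code point.
  if (s.length > 1 && cu0 >= 0xD800 && cu0 <= 0xDBFF) {
    var cu1 = s.charCodeAt(1);
    if (cu1 >= 0xDC00 && cu1 <= 0xDFFF) {
      return {
        head: (cu0 - 0xD800) * 0x400 + (cu1 - 0xDC00) + 0x10000,
        tail: s.substring(2)
      };
    }
  }
  // Otherwise the first code unit stands on its own (including lone surrogates).
  return { head: cu0, tail: s.substring(1) };
}
```

Both helpers return the same code unit for any in-range index; the only difference is whether a temporary one-character string is allocated along the way.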

Is it doing that because it makes it easier on other backends?

> Is it doing that because it makes it easier on other backends?

Maybe. That's a good point.

I tried swapping a “fast” CodePoints.uncons function into purescript-parsing and couldn't detect any speedup.