roc-lang/unicode

Improve grapheme.split testing

Quoted from Luke:

Coverage of the Unicode data file test points is pretty average. For example, it might only have a test that covers an emoji at the start of a string, but not in the middle or at the end, or before a CRLF, or after a Hangul sequence, etc.
So I'm reasonably confident there are a couple of edge cases we haven't caught that could end up crashing someone's code. It would be nice to get to a point where we are reasonably confident that is not going to happen.

This is a fun project for anyone who has the time and interest. Brendan's https://github.com/bhansconnect/roc-fuzz platform would be a great way to test this.
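
To make the coverage gap concrete, here is a minimal sketch of the kind of test this could involve: put each interesting sequence (emoji, CRLF, Hangul jamo, and so on) in every position relative to the others and check invariants that any correct splitter should satisfy. It is written in Python purely for illustration. `split_stub`, `PROBES`, `contexts`, `check_invariants`, and `run` are all hypothetical names, and `split_stub` is a stand-in that is not real grapheme segmentation. The real test would call the library's own splitter, possibly driven by roc-fuzz.

```python
import itertools
import re


def split_stub(text):
    """Stand-in splitter (NOT real grapheme segmentation).

    Keeps CR LF together and otherwise splits per code point.
    Replace this with the implementation under test.
    """
    return re.findall(r"\r\n|.", text, flags=re.DOTALL)


# Hypothetical probe set: short sequences known to be tricky for segmentation.
PROBES = {
    "ascii": "a",
    "emoji": "\U0001F600",
    "crlf": "\r\n",
    "hangul": "\u1100\u1161\u11A8",  # leading + vowel + trailing jamo
    "combining": "e\u0301",          # base letter + combining acute accent
}


def contexts(max_len=3):
    """Yield every ordering of 1..max_len probes, so each probe appears at
    the start, middle, and end, and next to every other probe."""
    for n in range(1, max_len + 1):
        for names in itertools.product(PROBES, repeat=n):
            yield names, "".join(PROBES[name] for name in names)


def check_invariants(split, text):
    """Properties that hold for any correct splitter, whatever the input."""
    parts = split(text)
    # Splitting must not lose, add, or reorder any text.
    assert "".join(parts) == text, repr(text)
    # No segment may be empty.
    assert all(parts), repr(text)
    # UAX #29 rule GB3: never break between CR and LF.
    for left, right in zip(parts, parts[1:]):
        assert not (left.endswith("\r") and right.startswith("\n")), repr(text)


def run(split=split_stub, max_len=3):
    """Check every generated context and return how many were checked."""
    count = 0
    for _names, text in contexts(max_len):
        check_invariants(split, text)
        count += 1
    return count
```

With the five probes above and `max_len=3`, `run()` checks 155 inputs. A fuzzer would go further by generating the contexts randomly instead of exhaustively, but the invariants stay the same. The first thing to look for is that the splitter never crashes.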

I have had plans to do this, but haven't had an opportunity yet.