Incorrect position check in IsUtf8?

Question

Opened this issue 6 years ago · 1 comment

It looks like a lot of the UTF-8 checker code on GitHub was copied from the same source and shares what appears to be the same bug: if a UTF-8 multi-byte sequence sits at the very end of the buffer, it is incorrectly marked as invalid. AFAICT, every check of the form position >= length - 2 (or whatever the offset is) should be position > length - 2. In most projects, the offending file is utf8checker.cs.
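
To make the off-by-one concrete, here is a minimal sketch in Python. It is not the code from utf8checker.cs; the function names and the seq_len parameter (total bytes in the sequence, lead byte included) are hypothetical and exist only for illustration. It assumes the checker tests whether a sequence starting at position still fits inside a buffer of the given length.

```python
def is_truncated_buggy(position: int, length: int, seq_len: int) -> bool:
    """Hypothetical form of the reported check: uses >=, so a sequence
    that ends exactly at the end of the buffer is treated as truncated."""
    return position >= length - seq_len


def is_truncated_fixed(position: int, length: int, seq_len: int) -> bool:
    """Proposed form: a sequence is truncated only if its last byte
    (index position + seq_len - 1) would fall past the end of the buffer."""
    return position > length - seq_len


# "aé" encodes to 3 bytes: 0x61, 0xC3, 0xA9. The 2-byte sequence for "é"
# starts at position 1 and ends exactly at the end of the buffer.
data = "aé".encode("utf-8")
lead_position = 1
seq_len = 2

# The >= form rejects this valid input; the > form accepts it.
rejected_by_buggy = is_truncated_buggy(lead_position, len(data), seq_len)
rejected_by_fixed = is_truncated_fixed(lead_position, len(data), seq_len)

# A genuinely truncated buffer (lead byte only) is still rejected by the fix.
truncated = data[:2]
truncated_rejected = is_truncated_fixed(lead_position, len(truncated), seq_len)
```

With seq_len = 2 these are exactly the two comparisons quoted above: position >= length - 2 versus position > length - 2. The same reasoning applies to 3-byte and 4-byte sequences with the offset adjusted accordingly.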

Answer 1 · 2019-03-17T05:51:45.000Z

FYI: I've just created a near drop-in IsUtf8 replacement; that project includes a bunch of unit tests.