Incorrect position check in IsUtf8?
pedasmith commented
A lot of the UTF-8 checker code on GitHub looks like it was copied from a common source, and it all appears to share the same bug: if a multi-byte UTF-8 sequence sits at the very end of the buffer, it is incorrectly marked as invalid. AFAICT, every check of the form position >= length - 2 (or whatever the offset is for that sequence length) should be position > length - 2. In most projects, the offending file is utf8checker.cs.
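To illustrate the off-by-one, here is a minimal Python sketch (not the C# from utf8checker.cs). The helper names has_room_buggy and has_room_fixed and the seq_len parameter are hypothetical. It assumes length is the number of valid bytes in the buffer and that the check guards a sequence of seq_len bytes starting at position. The last byte of such a sequence is at index position + seq_len - 1, which must be at most length - 1, so the sequence should be rejected only when position > length - seq_len.

```python
def has_room_buggy(position: int, length: int, seq_len: int) -> bool:
    """Bounds check as described above: rejects when position >= length - seq_len."""
    return not (position >= length - seq_len)


def has_room_fixed(position: int, length: int, seq_len: int) -> bool:
    """Proposed check: rejects only when position > length - seq_len."""
    return not (position > length - seq_len)


# U+00E9 encodes to the two bytes C3 A9, so this buffer ends with a
# complete 2-byte sequence.
buffer = "ab\u00e9".encode("utf-8")  # 4 bytes: 61 62 C3 A9
position = 2                          # index of the lead byte 0xC3

# The >= form rejects the sequence even though both of its bytes are present.
buggy_ok = has_room_buggy(position, len(buffer), 2)

# The > form accepts it, and still rejects a lead byte with nothing after it.
fixed_ok = has_room_fixed(position, len(buffer), 2)
truncated_ok = has_room_fixed(3, len(buffer), 2)
```

With these inputs the >= form reports that the trailing sequence does not fit, while the > form accepts it and still rejects a lead byte placed in the final position.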
pedasmith commented
FYI: I've just created a near drop-in IsUtf8 replacement; that project includes a bunch of unit tests.