Raku/nqp

[JVM] Streaming decoder should learn to cope with incomplete UTF-8 code points

usev6 opened this issue · 0 comments

usev6 commented

The streaming decoder on the JVM backend currently cannot cope with incomplete UTF-8 code points, i.e. multi-byte sequences whose bytes have not all arrived yet when decoding is attempted. This causes failing tests in roast, e.g. https://github.com/Raku/roast/blob/master/S32-io/IO-Socket-Async.t#115

The current implementation throws an exception if it tries to decode an incomplete code point. (I added that behaviour last year with 7a28dfc, because the previous implementation silently dropped data.)
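To illustrate the failure mode, here is a minimal standalone sketch using only the JDK's java.nio.charset API. It is not the NQP code: the class name IncompleteCodePointDemo and the helper strictDecodeFails are made up for this example. It shows that a strict one-shot decode rejects a buffer that ends in the middle of a multi-byte sequence, which is roughly the situation a streaming decoder ends up in when a chunk boundary splits a code point.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of NQP.
final class IncompleteCodePointDemo {

    /** Returns true if a strict, one-shot UTF-8 decode of the bytes fails. */
    static boolean strictDecodeFails(byte[] bytes) {
        try {
            // newDecoder() reports malformed input by default, and the
            // one-argument decode() treats the buffer as the complete input.
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // U+00E9 is encoded as the two bytes 0xC3 0xA9 in UTF-8.
        byte[] whole = "\u00e9".getBytes(StandardCharsets.UTF_8);
        // Simulate a chunk that ends after the first byte of the sequence.
        byte[] firstChunk = Arrays.copyOfRange(whole, 0, 1);

        System.out.println(strictDecodeFails(whole));      // prints "false"
        System.out.println(strictDecodeFails(firstChunk)); // prints "true"
    }
}
```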

I tried to figure out a better solution, but wasn't able to come up with anything simple, so I'm creating this issue for now. My plan for tackling the problem looks like this:

  1. look at MoarVM's implementation to find out exactly how the decoder is supposed to work
  2. add more tests for NQP (fudged for the JVM backend)
  3. improve implementation in src/vm/jvm/runtime/org/perl6/nqp/sixmodel/reprs/DecoderInstance.java
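
One possible direction for step 3, sketched under assumptions: keep the undecoded tail of each chunk and prepend it to the next one, relying on CharsetDecoder.decode(in, out, false), which leaves the bytes of an incomplete trailing sequence unconsumed in the input buffer. The class StreamingUtf8Sketch and its methods feed and pendingBytes are hypothetical names; this is not how DecoderInstance.java is structured, and it says nothing about how MoarVM's decoder behaves.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the NQP implementation.
final class StreamingUtf8Sketch {
    // A fresh decoder reports malformed input by default.
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    // Bytes of a trailing, not yet complete code point from earlier chunks.
    private byte[] pending = new byte[0];

    /** Feeds one chunk and returns the text that can be decoded so far. */
    String feed(byte[] chunk) {
        // Prepend the carried-over bytes to the new chunk.
        ByteBuffer in = ByteBuffer.allocate(pending.length + chunk.length);
        in.put(pending).put(chunk).flip();

        // UTF-8 never yields more UTF-16 chars than input bytes.
        CharBuffer out = CharBuffer.allocate(in.remaining());

        // endOfInput = false: an incomplete trailing sequence is not an
        // error; its bytes simply stay unconsumed in the input buffer.
        CoderResult result = decoder.decode(in, out, false);
        if (result.isError()) {
            // Genuinely malformed input is still reported.
            try {
                result.throwException();
            } catch (CharacterCodingException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Keep whatever was not consumed for the next call.
        pending = new byte[in.remaining()];
        in.get(pending);

        out.flip();
        return out.toString();
    }

    /** Number of bytes currently held back as an incomplete code point. */
    int pendingBytes() {
        return pending.length;
    }
}
```

With this sketch, feeding the first two bytes of the three-byte encoding of U+20AC returns an empty string and holds two bytes back; feeding the third byte afterwards returns the complete character. A real implementation would also need to decide what to do with held-back bytes when the stream ends.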