maraisr/meros

Emits incorrect results when utf-8 codepoint is split between two chunks

laverdet opened this issue · 0 comments

Describe the bug

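meros does not correctly decode a UTF-8 code point whose bytes are split across two chunks. Each chunk is decoded on its own, so the partial byte sequences come out as replacement characters (U+FFFD).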

You need to pass { stream: true } to TextDecoder#decode:
https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/decode#options

Furthermore, you must not share a decoder instance between streams, since a streaming TextDecoder is stateful: it buffers incomplete byte sequences between calls. If an instance is shared, concurrent streams will corrupt each other's output.
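A minimal sketch of the difference, using only the standard TextEncoder/TextDecoder APIs. The helper name decodeChunks is hypothetical and is not part of meros; it only illustrates one decoder per stream with { stream: true }.

```javascript
// The emoji from the reproduction is 4 bytes in UTF-8; split it in two.
const bytes = new TextEncoder().encode("🤔");
const first = bytes.subarray(0, 2);
const second = bytes.subarray(2);

// Without { stream: true } each call is treated as a complete input,
// so both partial sequences are replaced with U+FFFD.
const naive = new TextDecoder();
const broken = naive.decode(first) + naive.decode(second);

// Hypothetical helper (not meros API): one decoder per stream, streaming mode.
function decodeChunks(chunks) {
	const decoder = new TextDecoder(); // created per stream, never shared
	let text = "";
	for (const chunk of chunks) {
		// Incomplete trailing bytes are buffered until the next call.
		text += decoder.decode(chunk, { stream: true });
	}
	return text + decoder.decode(); // final call flushes any buffered bytes
}

const fixed = decodeChunks([first, second]);

// A shared streaming decoder leaks buffered bytes from one stream into another.
const shared = new TextDecoder();
shared.decode(first, { stream: true }); // stream A: incomplete sequence buffered
const leaked = shared.decode(new TextEncoder().encode("hi"), { stream: true }); // stream B

console.log(broken === "🤔"); // false
console.log(fixed === "🤔"); // true
console.log(leaked === "hi"); // false: stream A's leftover bytes corrupted stream B
```

The final decoder.decode() call with no arguments ends the stream, so a truncated sequence at the very end still surfaces as a replacement character instead of being dropped silently.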

To reproduce

import { meros } from "meros";

const stream = async function*() {
	const smiley = Buffer.from("🤔");
	yield Buffer.from("\r\n---\r\n\r\n");
	yield smiley.subarray(0, 2);
	yield smiley.subarray(2);
	yield Buffer.from("\r\n-----\r\n");
}();

const chunks = await meros(new Response(stream, {
	headers: {
		"content-type": 'multipart/mixed; boundary="-"',
	},
}));

for await (const chunk of chunks) {
	console.log(chunk);
}

Output:

{ headers: {}, body: '���', json: false }

Expected behavior

The body should decode to '🤔', even though the code point's bytes arrive in separate chunks.