The encoded data was not valid for encoding utf-8
denniske opened this issue · 1 comment
denniske commented
How to reproduce
https://aoe4world.com/dumps
File: Games - RM 1v1 - Season 3 - 44 MB
Grab the file URL and put it into the url variable.
const dump = await (await fetch(url)).arrayBuffer();
const compressed = new Uint8Array(dump);
const decompressed = decompressSync(compressed);
const origText = strFromU8(decompressed);
The problem
The following error occurs on line 4, in strFromU8(decompressed):
TypeError: The encoded data was not valid for encoding utf-8
at TextDecoder.decode (node:internal/encoding:448:14)
at strFromU8 (/Users/dennis/Projects/poc_collector/node_modules/fflate/lib/node.cjs:1780:19)
at HistoricalTask.<anonymous> (/Users/dennis/Projects/poc_collector/dist/collector/webpack:/src/task/historical.task.ts:79:39)
at Generator.next (<anonymous>)
at fulfilled (/Users/dennis/Projects/poc_collector/node_modules/tslib/tslib.js:166:62)
at processTicksAndRejections (node:internal/process/task_queues:95:5) {
code: 'ERR_ENCODING_INVALID_ENCODED_DATA'
}
I can download the file on macOS, extract it by double-clicking it in Finder, and then open it in Visual Studio Code without problems. Visual Studio Code shows UTF-8 in the status bar.
101arrowz commented
That file is 533MB decompressed; strings in JavaScript can be at most 512MB. You can try to solve this by converting to strings with streams and using a streaming JSON parser; let me know if you want more info on how to do that.
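For anyone landing here later, a minimal sketch of the "convert to strings with streams" half of that suggestion. It uses only the standard TextDecoder in streaming mode, so the decompressed bytes become many small strings instead of one huge one. createTextSink and onText are hypothetical names, not fflate API. The (chunk, final) shape is an assumption about how a streaming decompressor would hand over its output, so check it against the fflate docs before wiring it up. The streaming JSON parser that would consume the text pieces is not shown.

```typescript
// Sketch only: createTextSink is a hypothetical helper, not part of fflate.
// It turns a sequence of byte chunks into a sequence of small strings.
function createTextSink(
  onText: (text: string) => void,
): (chunk: Uint8Array, final: boolean) => void {
  const decoder = new TextDecoder("utf-8");
  return (chunk: Uint8Array, final: boolean): void => {
    // With stream: true the decoder holds back an incomplete multi-byte
    // sequence at the end of a chunk and completes it with the next chunk.
    // On the final chunk, stream is false so buffered bytes are flushed.
    const text = decoder.decode(chunk, { stream: !final });
    if (text.length > 0) onText(text);
  };
}

// Usage: "é" is two bytes in UTF-8 (0xC3 0xA9) and is split across the
// two chunks here, as can happen at any chunk boundary in a real stream.
const parts: string[] = [];
const sink = createTextSink((text) => parts.push(text));
sink(new Uint8Array([0x7b, 0x22, 0xc3]), false);
sink(new Uint8Array([0xa9, 0x22, 0x7d]), true);
console.log(parts.join("")); // prints {"é"}
```

In a real run, onText would pass each piece to the streaming JSON parser rather than collecting the pieces in an array, since joining them back together would hit the same string length limit.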