101arrowz/fflate

Truncated output of gunzip if SIZE footer is incorrect

karyon opened this issue · 5 comments

karyon commented

How to reproduce

import { gzipSync, gunzipSync } from 'fflate';

let orig = Uint8Array.from([1,2,3,4,5,6]);

let gzipped = gzipSync(orig);
console.assert(gzipped[gzipped.length-4] === 6) // size footer correctly contains 6

gzipped[gzipped.length-4] = 3; // write incorrect value into size footer

let gunzipped = gunzipSync(gzipped);
console.log(gunzipped); // output: Uint8Array(3) [1, 2, 3]

The problem
With an incorrect size footer, the output is truncated to that size. That is understandable behavior, but as far as I can tell it is not compliant with RFC 1952, which says nothing about truncating the output to the size stored in the footer.

The same would happen when supplying an out buffer in the options (which seems reasonable), and I guess the same would happen with files larger than 2^32 bytes (which I thought was documented somewhere but I can't find it anymore?).
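For reference, the size footer discussed here is the last four bytes of a GZIP member. RFC 1952 calls it ISIZE and defines it as the uncompressed length modulo 2^32, stored little-endian. A minimal sketch of reading it follows; `readSizeFooter` and `expectedSizeFooter` are hypothetical helper names, not part of fflate.

```javascript
// Hypothetical helper (not part of fflate): read ISIZE, the last 4 bytes
// of a GZIP member, as a little-endian unsigned 32-bit integer.
function readSizeFooter(gz) {
  const view = new DataView(gz.buffer, gz.byteOffset, gz.byteLength);
  return view.getUint32(gz.byteLength - 4, true);
}

// ISIZE only stores the length modulo 2^32, so it cannot represent
// inputs of 4 GiB or more exactly.
function expectedSizeFooter(uncompressedLength) {
  return uncompressedLength % 2 ** 32;
}
```

With the repro above, `readSizeFooter(gzipped)` would be expected to return 6 before the footer is overwritten and 3 afterwards.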

I understand where this behavior comes from, I just thought that even if implementation complexity and performance take priority over compliance, it would be valuable to document such deviations from the RFC.

Interestingly, the whole decompression is still performed without any errors, since out-of-bounds writes into TypedArrays are eaten silently. Thanks, JavaScript :D
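The silent behavior can be seen in isolation. A small sketch, independent of fflate: indexed writes past the end of a typed array are ignored, while `set()` with a source that does not fit throws a `RangeError`.

```javascript
const buf = new Uint8Array(3);

// Indexed writes past the end are silently dropped; the array does not grow.
buf[0] = 1;
buf[5] = 42;
console.log(buf.length); // 3
console.log(buf[5]);     // undefined

// By contrast, set() checks bounds and throws when the source does not fit.
let threw = false;
try {
  buf.set([1, 2, 3, 4]);
} catch (e) {
  threw = e instanceof RangeError;
}
console.log(threw); // true
```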

Added this to the docs. This will be in the TypeScript IntelliSense in v0.8.1.

karyon commented

Thank you for taking the time to look into this! If I understand your changes correctly, they document the case where a too-small out buffer is supplied. What about the repro code above, where the input data contains an incorrect size footer?

You're right, I forgot to add that to the docs. Will fix soon.

karyon commented

Bummer, I was hoping for a "real" fix ;) But thanks for taking the time, I appreciate it!

If you really want to, you can ignore the size in the GZIP footer by using the streaming API:

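import { Gunzip } from 'fflate';

// gzipData: a Uint8Array holding the complete GZIP stream
const out = [];

// Collect every decompressed chunk; `true` marks this push as the final one
new Gunzip(c => out.push(c)).push(gzipData, true);

// Reuse the chunk directly if there is exactly one; otherwise allocate
// a buffer large enough for all chunks combined
const result = out.length !== 1
  ? new Uint8Array(out.reduce((a, b) => a + b.byteLength, 0))
  : out[0];

// Copy the chunks into the combined buffer back to back
if (out.length !== 1) {
  let offset = 0;
  for (const chunk of out) {
    result.set(chunk, offset);
    offset += chunk.byteLength;
  }
}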

You can do something similar with AsyncGunzip, but since the chunks arrive asynchronously, you need to do the concatenation in the callback instead of outside.
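A sketch of what concatenating in the callback could look like. It assumes that the AsyncGunzip callback receives `(err, chunk, final)`; check the fflate documentation for the exact signature. `collectChunks` is a hypothetical helper, not part of fflate.

```javascript
// Hypothetical helper (not part of fflate). Returns a stream callback that
// buffers chunks and calls done(err, result) once the final chunk arrives.
// Assumes the callback signature (err, chunk, final).
function collectChunks(done) {
  const chunks = [];
  let total = 0;
  return (err, chunk, final) => {
    if (err) return done(err);
    chunks.push(chunk);
    total += chunk.byteLength;
    if (!final) return;
    // All chunks are in: copy them back to back into one buffer.
    const result = new Uint8Array(total);
    let offset = 0;
    for (const c of chunks) {
      result.set(c, offset);
      offset += c.byteLength;
    }
    done(null, result);
  };
}

// Intended use (assumed, not exercised here):
// new AsyncGunzip(collectChunks((err, data) => { /* use data */ })).push(gzipData, true);
```

Keeping the concatenation inside the callback means the result is only handed over once the final chunk has been seen.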