iipc/jwarc

GunzipChannel fails on payload with uncompressed size exceeding int_max


A gzip-compressed payload whose uncompressed size exceeds 2^31-1 (the maximum value of a signed 32-bit integer) causes GunzipChannel to fail with the following exception:

$> java -cp target/jwarc-0.13.1-SNAPSHOT.jar org.netpreserve.jwarc.tools.WarcTool extract --payload test-size-int-max-overflow-content-encoding-gzip.warc.gz 975
Exception in thread "main" java.util.zip.ZipException: gzip uncompressed size mismatch
        at org.netpreserve.jwarc.GunzipChannel.readTrailer(GunzipChannel.java:92)
        at org.netpreserve.jwarc.GunzipChannel.read(GunzipChannel.java:70)
        at org.netpreserve.jwarc.tools.ExtractTool.writeBody(ExtractTool.java:81)
        at org.netpreserve.jwarc.tools.ExtractTool.writePayload(ExtractTool.java:70)
        at org.netpreserve.jwarc.tools.ExtractTool.main(ExtractTool.java:156)
        at org.netpreserve.jwarc.tools.WarcTool.main(WarcTool.java:21)

The WARC file test-size-int-max-overflow-content-encoding-gzip.warc.gz (21 kB) contains a single record whose payload is 2^31 bytes when uncompressed.
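
One possible cause, sketched here as an assumption and not taken from the actual GunzipChannel source: the gzip trailer stores ISIZE as the uncompressed size modulo 2^32 (RFC 1952), so reading it into a signed Java int and comparing it directly against a 64-bit byte counter fails as soon as the size reaches 2^31. The class and method names below (GzipTrailerCheck, naiveSizeMatches, sizeMatches, readIsize) are hypothetical and exist only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class GzipTrailerCheck {
    /** Hypothetical buggy check: a signed 32-bit ISIZE is sign-extended before the comparison. */
    public static boolean naiveSizeMatches(int isize, long bytesWritten) {
        return isize == bytesWritten;
    }

    /** ISIZE is the size modulo 2^32, so only the low 32 bits of both values are compared. */
    public static boolean sizeMatches(int isize, long bytesWritten) {
        return (isize & 0xFFFFFFFFL) == (bytesWritten & 0xFFFFFFFFL);
    }

    /** Reads ISIZE from an 8-byte gzip trailer (CRC32 followed by ISIZE, both little-endian). */
    public static int readIsize(byte[] trailer) {
        return ByteBuffer.wrap(trailer).order(ByteOrder.LITTLE_ENDIAN).getInt(4);
    }

    public static void main(String[] args) {
        long uncompressedSize = 1L << 31; // 2^31 bytes, as in the test record
        // Trailer with a dummy CRC32 and ISIZE = 0x80000000 in little-endian byte order
        byte[] trailer = {0, 0, 0, 0, 0x00, 0x00, 0x00, (byte) 0x80};
        int isize = readIsize(trailer);

        System.out.println(naiveSizeMatches(isize, uncompressedSize)); // false
        System.out.println(sizeMatches(isize, uncompressedSize));      // true
    }
}
```

Masking both sides with 0xFFFFFFFFL also keeps the check valid for payloads of 2^32 bytes or more, where ISIZE wraps around.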