haskell/zlib

haskell library fails tests against C library zlib-1.2.11 includeing identit test: Exception: 'user error (Codec.Compression.Zlib: stream error)'

trofi opened this issue · 15 comments

trofi commented

Original report: https://bugs.gentoo.org/show_bug.cgi?id=613532
I can reproduce it on my machine as well thus can aid in debugging.

$ ./setup test --show-details=streaming
Running 1 test suites...
Test suite tests: RUNNING...
zlib tests
  property tests
    decompress . compress = id (standard):            FAIL (0.29s)
      *** Failed! Exception: 'user error (Codec.Compression.Zlib: stream error)' (after 1 test): 
      GZip
      CompressParams {compressLevel = CompressionLevel 4, compressMethod = Deflated, compressWindowBits = WindowBits 8, compressMemoryLevel = MemoryLevel 5, compressStrategy = DefaultStrategy, compressBufferSize = 121850, compressDictionary = Nothing}
      DecompressParams {decompressWindowBits = WindowBits 12, decompressBufferSize = 44823, decompressDictionary = Nothing, decompressAllMembers = False}
      ""
      Use --quickcheck-replay '0 TFGenR 000013DCC9212C1D000000001DCD6500000000000000E1EA00000003B9ACA000 0 1 1 0' to reproduce.
hvr commented

@trofi was there any significant (API?) change in the zlib 1.2.11 which could explain this?

PS: I just tried with zlib-1.2.9 which appears to be the first release that broke the decompress . compress tests. And unfortunately, zlib-1.2.9 contains over 3.5 years worth of changes:

Changes in 1.2.9 (31 Dec 2016)
- Fix contrib/minizip to permit unzipping with desktop API [Zouzou]
- Improve contrib/blast to return unused bytes
- Assure that gzoffset() is correct when appending
- Improve compress() and uncompress() to support large lengths
- Fix bug in test/example.c where error code not saved
- Remedy Coverity warning [Randers-Pehrson]
- Improve speed of gzprintf() in transparent mode
- Fix inflateInit2() bug when windowBits is 16 or 32
- Change DEBUG macro to ZLIB_DEBUG
- Avoid uninitialized access by gzclose_w()
- Allow building zlib outside of the source directory
- Fix bug that accepted invalid zlib header when windowBits is zero
- Fix gzseek() problem on MinGW due to buggy _lseeki64 there
- Loop on write() calls in gzwrite.c in case of non-blocking I/O
- Add --warn (-w) option to ./configure for more compiler warnings
- Reject a window size of 256 bytes if not using the zlib wrapper
- Fix bug when level 0 used with Z_HUFFMAN or Z_RLE
- Add --debug (-d) option to ./configure to define ZLIB_DEBUG
- Fix bugs in creating a very large gzip header
- Add uncompress2() function, which returns the input size used
- Assure that deflateParams() will not switch functions mid-block
- Dramatically speed up deflation for level 0 (storing)
- Add gzfread(), duplicating the interface of fread()
- Add gzfwrite(), duplicating the interface of fwrite()
- Add deflateGetDictionary() function
- Use snprintf() for later versions of Microsoft C
- Fix *Init macros to use z_ prefix when requested
- Replace as400 with os400 for OS/400 support [Monnerat]
- Add crc32_z() and adler32_z() functions with size_t lengths
- Update Visual Studio project files [AraHaan]

Changes in 1.2.8 (28 Apr 2013)
hvr commented

SO basically the problems seems to be that compressing the empty string "" fails now?

trofi commented

There was no existing API change (a few new functions were added).
But stream validation was updated. Like here:
madler/zlib@b516b4b

trofi commented

Looks like "" is a miniman shrunk case. If we filter null out tests still fail:

 prop_decompress_after_compress :: Format
                                -> CompressParams
                                -> DecompressParams
+                               -> BL.ByteString
                                -> Property
-prop_decompress_after_compress w cp dp =
+prop_decompress_after_compress w cp dp i =
    (w /= zlibFormat || decompressWindowBits dp >= compressWindowBits cp) &&
    -- Zlib decompression has been observed to fail with both compress and decompress
    -- window bits = 8. This seems to be contrary to the docs and to a quick reading
    -- of the zlib source code.
    (decompressWindowBits dp > compressWindowBits cp || decompressWindowBits dp > WindowBits 8) &&
-   decompressBufferSize dp > 0 && compressBufferSize cp > 0 ==>
-   liftM2 (==) (decompress w dp . compress w cp) id
+   decompressBufferSize dp > 0 && compressBufferSize cp > 0 && (not $ BL.null i) ==>
+   (decompress w dp $ compress w cp $ i) == i
zlib tests
  property tests
    decompress . compress = id (standard): FAIL (0.42s)
      *** Failed! Exception: 'user error (Codec.Compression.Zlib: stream error)' (after 31 tests and 10 shrinks): 
      GZip
      CompressParams {compressLevel = CompressionLevel 8, compressMethod = Deflated, compressWindowBits = WindowBits 8, compressMemoryLevel = MemoryLevel 8, compressStrategy = DefaultStrategy, compressBufferSize = 124143, compressDictionary = Nothing}
      DecompressParams {decompressWindowBits = WindowBits 15, decompressBufferSize = 293676, decompressDictionary = Nothing, decompressAllMembers = False}
      "\NUL"
hvr commented

I wonder if the Compress/Decompress parameters must match more closely w/ zlib-1.2.11 than they did before. specifically the window-bits

hvr commented

@trofi I tried reproducing the issue in GHCi w/ zlib-1.2.11 using the code from the quickcheck test... but I can't seem to reproduce this. Is this even deterministic?

hvr commented

We're not the only ones having problems w/ zlib-1.2.{9,10,11}:

Hmm, nasty.

So the ocaml people tracked their bug down to the fact that they were keeping the z_stream C struct in movable memory on the ocaml heap, and a new pointer sanity check in the zlib C code was failing. xavierleroy/camlzip#1 (comment)

However we already use a ForeignPtr and mallocForeignPtrBytes for the z_stream so relocation would not appear to be the problem we're having. Someone may need to track down exactly which new check in the C code is being triggered.

trofi commented

Tracked it down to a failing zlib function deflateInit2_. It frequently returns Z_STREAM_ERROR

This condition causes decompression error.

    if (memLevel < 1 || memLevel > MAX_MEM_LEVEL || method != Z_DEFLATED ||
        windowBits < 8 || windowBits > 15 || level < 0 || level > 9 ||
        strategy < 0 || strategy > Z_FIXED || (windowBits == 8 && wrap != 1)) {
        return Z_STREAM_ERROR;
    }

Namely windowBits == 8 && wrap != 1 fails. Values are usually around that check are usually

windowBits = 8; wrap = 0
windowBits = 8; wrap = 2

but the variables are tweaked before the check. Input value into the function is windowBits = 24.

This additional check was intruduced in madler/zlib@049578f

trofi commented

Skipping 8-bit window seems to make tests pass. A crude hack I used locally:

diff --git a/test/Test.hs b/test/Test.hs
index bbd8b94..81bd260 100644
--- a/test/Test.hs
+++ b/test/Test.hs
@@ -66,2 +66,3 @@ prop_decompress_after_compress w cp dp =
    -- of the zlib source code.
+   compressWindowBits cp /= WindowBits 8 &&
    (decompressWindowBits dp > compressWindowBits cp || decompressWindowBits dp > WindowBits 8) &&
@@ -76,2 +77,3 @@ prop_gziporzlib1 cp dp =
    decompressWindowBits dp > compressWindowBits cp &&
+   compressWindowBits cp /= WindowBits 8 &&
    decompressBufferSize dp > 0 && compressBufferSize cp > 0 ==>
@@ -85,2 +87,3 @@ prop_gziporzlib2 cp dp =
    decompressWindowBits dp >= compressWindowBits cp &&
+   compressWindowBits cp /= WindowBits 8 &&
    decompressBufferSize dp > 0 && compressBufferSize cp > 0 ==>
@@ -94,2 +97,3 @@ prop_gzip_concat cp dp input =
    decompressWindowBits dp >= compressWindowBits cp &&
+   compressWindowBits cp /= WindowBits 8 &&
    decompressBufferSize dp > 0 && compressBufferSize cp > 0 ==>

@trofi good work tracking it down! Would you like to make a PR for upstream?

@trofi actually having read the zlib change that you pointed me to, I've decided to do it slightly differently: just change the range of valid window bits from 8..15 to 9..15. It seems 8 just isn't supported at all: it fails for raw and gzip and is translated to 9 for zlib format.

hvr commented

@dcoutts @trofi regarding your observation that ws=8 is buggy, see also madler/zlib#94

@dcoutts - it seems that this ticket is now fixed? Given you changed the window bits range?