msievers/zstandard-ruby

Bug in streaming_decompress causes data corruption.


The buffer size used in streaming_decompress is too small, which can lead to data corruption in rare cases (facebook/zstd#918). Unfortunately, because the buffer size is exactly equal to the frame size, zstd raises no error (if you subtract or add 1 to the buffer size, you will see a destination-buffer-too-small error). The fix is to use a larger buffer. The zstd docs recommend calling ZSTD_decodingBufferSize_min() to determine a safe minimum buffer size for the buffer-less streaming API, but in practice I have found even that size to be too small. I propose using ZSTD_decodingBufferSize_min() * 2 as a generous but not gratuitous estimate.
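
For illustration, a minimal sketch of the proposed sizing (the module name, ffi_lib target, and helper below are my assumptions, not the gem's actual internals; ZSTD_decodingBufferSize_min is part of zstd's static/experimental API, so its availability depends on the linked libzstd):

```ruby
require "ffi"

# Hypothetical sketch of the proposed buffer sizing, not the gem's actual code.
# ZSTD_decodingBufferSize_min(windowSize, frameContentSize) is the safe-minimum
# helper zstd documents for its buffer-less streaming decompression API.
module ZstdSizing
  extend FFI::Library
  ffi_lib "zstd"
  attach_function :ZSTD_decodingBufferSize_min, [:ulong_long, :ulong_long], :size_t
end

def allocate_decompress_buffer(window_size, frame_content_size)
  safe_min = ZstdSizing.ZSTD_decodingBufferSize_min(window_size, frame_content_size)
  # Twice the documented minimum: generous headroom per this proposal, but not gratuitous.
  FFI::MemoryPointer.new(:char, safe_min * 2)
end
```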

Another thing to note: this gem uses zstd's so-called buffer-less streaming decompression API, but then builds the result up in a buffer anyway! That is a lot more effort for zero gain memory-usage-wise. You should consider updating the gem to use the normal streaming decompression API (ZSTD_decompressStream), which handles all the buffer management for you.
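
As a hedged sketch of what that could look like over FFI (the module name, error handling, and taking the whole compressed string up front are my own framing, not the gem's code; only stable zstd calls such as ZSTD_createDStream, ZSTD_initDStream, ZSTD_decompressStream, and ZSTD_DStreamOutSize are used):

```ruby
require "ffi"

# Hedged sketch only: an outline of switching to zstd's normal streaming
# decompression API, where zstd manages its own internal buffers.
module ZstdStream
  extend FFI::Library
  ffi_lib "zstd"

  # ZSTD_inBuffer / ZSTD_outBuffer from zstd.h
  class InBuffer < FFI::Struct
    layout :src, :pointer, :size, :size_t, :pos, :size_t
  end

  class OutBuffer < FFI::Struct
    layout :dst, :pointer, :size, :size_t, :pos, :size_t
  end

  attach_function :ZSTD_createDStream,    [],         :pointer
  attach_function :ZSTD_freeDStream,      [:pointer], :size_t
  attach_function :ZSTD_initDStream,      [:pointer], :size_t
  attach_function :ZSTD_decompressStream, [:pointer, OutBuffer.by_ref, InBuffer.by_ref], :size_t
  attach_function :ZSTD_DStreamOutSize,   [],         :size_t
  attach_function :ZSTD_isError,          [:size_t],  :uint
  attach_function :ZSTD_getErrorName,     [:size_t],  :string

  # Raise with zstd's error name if the returned code is an error.
  def self.check!(code)
    raise ZSTD_getErrorName(code) if ZSTD_isError(code) != 0
    code
  end

  def self.decompress(compressed)
    dstream = ZSTD_createDStream()
    check!(ZSTD_initDStream(dstream))

    src = FFI::MemoryPointer.new(:char, compressed.bytesize)
    src.put_bytes(0, compressed)

    out_size = ZSTD_DStreamOutSize()              # recommended output buffer size
    dst = FFI::MemoryPointer.new(:char, out_size)

    input = InBuffer.new
    input[:src]  = src
    input[:size] = compressed.bytesize
    input[:pos]  = 0

    result = "".b
    # Mirrors zstd's streaming_decompression.c example: with an output buffer of
    # ZSTD_DStreamOutSize(), every call can flush at least one full block, so
    # looping until the input is consumed drains the whole frame.
    while input[:pos] < input[:size]
      output = OutBuffer.new
      output[:dst]  = dst
      output[:size] = out_size
      output[:pos]  = 0

      check!(ZSTD_decompressStream(dstream, output, input))
      result << dst.read_string(output[:pos])
    end
    result
  ensure
    ZSTD_freeDStream(dstream) if dstream
  end
end
```

With something like ZstdStream.decompress(compressed_bytes), zstd owns all the buffer management, so the exact-frame-size corruption described above cannot occur in the first place.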

Other issues I found while debugging the above include: an invalid struct layout causing invalid memory accesses; an invalid argument type passed to raise (you need to call .read_string on the returned pointer to get a Ruby String); and, less seriously, a typo in ZSTANDARD_MAX_STREAMING_DECOMRPESS_BUFFER_SIZE (DECOMRPESS -> DECOMPRESS).
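
For the raise issue, a hypothetical illustration (the binding below is my own and assumes, per the description above, that the error-name function is attached with a :pointer return type):

```ruby
require "ffi"

# Hypothetical illustration of the raise fix, not the gem's actual binding:
# when ZSTD_getErrorName is attached with a :pointer return type, the pointer
# must be turned into a Ruby String (via #read_string) before raising.
module ErrorNames
  extend FFI::Library
  ffi_lib "zstd"
  attach_function :ZSTD_getErrorName, [:size_t], :pointer
end

def raise_zstd_error(code)
  name_ptr = ErrorNames.ZSTD_getErrorName(code)  # FFI::Pointer, not a String
  raise RuntimeError, name_ptr.read_string       # read_string yields a Ruby String
end
```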