GZipDecoder cannot process multi-member gzip data
Closed this issue · 0 comments
lizeyan commented
httpx version:
0.27.0
Current Behavior:
GZipDecoder only decodes the first member in gzip data.
Expected Behavior:
GZipDecoder should decompress multi-member gzip data (multiple gzip members concatenated together), just like gzip.decompress does.
Steps To Reproduce:
In [1]: import httpx
In [2]: raw_bytes = b'\x1f\x8b\x08\x00\x00\tn\x88\x00\xff\x00\x15\x00\xea\xff{"status": "success",\x03\x00\xeb\xdb\xa3\xb0\x15\x00\x00\x00\x1f\x8b\x08\x00\x00\tn\x88\x00\xff\x00\x08\x00\xf7\xff"data": \x03\x00\x1d\xb4\xe6\xc8\x08\x00\x00\x00\x1f\x8b\x08\x00\x00\tn\x88\x00\xff\x00#\x00\xdc\xff{"resultType":"matrix","result":[]}\x03\x00\x12\xb7\x95\x1b#\x00\x00\x00\x1f\x8b\x08\x00\x00\tn\x88\x00\xff\x00\x01\x00\xfe\xff}\x03\x00\x0c\xe2\xb6\xfc\x01\x00\x00\x00'
In [3]: from httpx._decoders import GZipDecoder
In [4]: GZipDecoder().decode(raw_bytes)
Out[4]: b'{"status": "success",'
In [5]: import gzip
In [6]: gzip.decompress(raw_bytes)
Out[6]: b'{"status": "success","data": {"resultType":"matrix","result":[]}}'
(The raw_bytes are from a Prometheus query request)
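The same behavior can be reproduced without httpx. The sketch below assumes that GZipDecoder wraps a single `zlib.decompressobj(zlib.MAX_WBITS | 16)`, as the proposed implementation further down suggests; the payload here is built locally with `gzip.compress` and is not the Prometheus data above.

```python
import gzip
import zlib

# Two independent gzip members concatenated into one payload.
members = [
    gzip.compress(b'{"status": "success",'),
    gzip.compress(b'"data": []}'),
]
payload = b"".join(members)

# gzip.decompress walks every member in the payload.
full = gzip.decompress(payload)

# A single decompress object stops at the end of the first member
# and leaves the remaining bytes in unused_data.
d = zlib.decompressobj(zlib.MAX_WBITS | 16)
first_only = d.decompress(payload)

print(full)        # b'{"status": "success","data": []}'
print(first_only)  # b'{"status": "success",'
print(d.eof, d.unused_data == members[1])  # True True
```

The second member is not lost; it sits in `unused_data`, which is what a multi-member-aware decoder would need to feed into a fresh decompress object.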
Anything else:
A possible implementation:
class GZipDecoder(ContentDecoder):
    ...
    def decode(self, data: bytes) -> bytes:
        decompressed_data = b""
        try:
            length = len(data)
            offset = 0
            while offset < length:
                chunk = self.decompressor.decompress(data[offset:])
                decompressed_data += chunk
                # Update the offset to the next member
                offset += len(data[offset:]) - len(self.decompressor.unused_data)
                if not self.decompressor.unused_data:
                    break
                else:
                    self.decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
            return decompressed_data
        except zlib.error as exc:
            raise DecodingError(str(exc)) from exc
I have tested it, and it handles this case.
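For anyone who wants to try the idea outside httpx, here is a stand-alone sketch of the same approach. `MultiMemberGzipDecoder` is a hypothetical name used only for illustration and is not an httpx class; it relies on the `eof` and `unused_data` attributes of zlib decompress objects and omits the `DecodingError` wrapping from the version above.

```python
import gzip
import zlib


class MultiMemberGzipDecoder:
    """Hypothetical stand-alone decoder for illustration; not part of httpx."""

    def __init__(self) -> None:
        self._decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)

    def decode(self, data: bytes) -> bytes:
        parts = []
        while data:
            if self._decompressor.eof:
                # The previous member is finished; start a fresh one.
                self._decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
            parts.append(self._decompressor.decompress(data))
            # Bytes past the end of the current member, if any.
            data = self._decompressor.unused_data
        return b"".join(parts)


payload = gzip.compress(b"abc") + gzip.compress(b"def") + gzip.compress(b"ghi")

# Whole payload in one call.
whole = MultiMemberGzipDecoder().decode(payload)

# Same payload fed in small pieces, as a streaming response would be.
streaming = MultiMemberGzipDecoder()
pieces = b"".join(
    streaming.decode(payload[i : i + 5]) for i in range(0, len(payload), 5)
)
```

Checking `eof` at the start of each iteration also covers the case where a member ends exactly at a chunk boundary, since the reset then happens on the next call to `decode`.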