Compare gzip decompressors
newsch commented
As the rest of the pipeline gets optimized, decompressing the gzip archives will likely become the bottleneck.
There are a number of "parallel" implementations of gzip, but none of the ones I've looked at are actually useful for this case.
Steps:
- Look for more implementations
- Compare decompression rates
- Compare with Rust gzip crates
What I've looked at so far:
Implementation | Notes |
---|---|
GNU gunzip | Baseline |
pigz | Moves reading, writing, and checksum work to separate threads, but decompression itself is single-threaded |
libdeflate | No streaming API, the whole input must fit in memory, so only practical for small files |
pugz | Hypothetically ideal, novel approach, but unstable and needs to load entire file into memory |
pgzip (Python) | Decompression is only parallelized when the file was compressed with the same tool |
pgzip (Golang) | Decompression only does single-threaded work on a separate thread |
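
For the "compare decompression rates" step, something like the following could serve as a starting point. It is only a rough sketch: the helper names (`stdlib_gzip`, `external_command`, `measure`) are made up for illustration, and it assumes each candidate can either be called in-process or run as a command that writes the decompressed stream to stdout.

```python
import gzip
import subprocess
import time

CHUNK = 1 << 20  # read 1 MiB at a time so the whole archive never sits in memory


def stdlib_gzip(path):
    """Stream-decompress with Python's gzip module; return the decompressed byte count."""
    total = 0
    with gzip.open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            total += len(chunk)
    return total


def external_command(argv):
    """Wrap a CLI decompressor: runs `argv + [path]` and counts the bytes it writes to stdout."""
    def run(path):
        total = 0
        with subprocess.Popen(argv + [path], stdout=subprocess.PIPE) as proc:
            while chunk := proc.stdout.read(CHUNK):
                total += len(chunk)
        # Leaving the `with` block waits for the process, so returncode is set here.
        if proc.returncode != 0:
            raise RuntimeError(f"{argv[0]} exited with status {proc.returncode}")
        return total
    return run


def measure(decompress, path, repeats=3):
    """Return (decompressed_bytes, best MiB/s over `repeats` runs)."""
    best = float("inf")
    size = 0
    for _ in range(repeats):
        start = time.perf_counter()
        size = decompress(path)
        best = min(best, time.perf_counter() - start)
    return size, size / max(best, 1e-9) / (1 << 20)
```

The baseline and pigz could then be plugged in as `external_command(["gzip", "-dc"])` and `external_command(["pigz", "-dc"])`, with `measure(...)` called on the same archive for each. Piping the output through Python adds some overhead of its own, so the numbers are only good for rough relative comparison, and the first run on a file may be skewed by the page cache.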
biodranik commented
Let's run it first in its current state and get real baseline numbers before prioritizing or digging deeper into any optimizations.