lichess-org/database

switch to pbzip2

niklasf opened this issue · 3 comments

pbzip2 - parallel bzip2 file compressor, v1.1.6
https://linux.die.net/man/1/pbzip2

The point is not so much to improve compression speed as to speed up decompression:

Files that are compressed with pbzip2 are broken up into pieces and each individual piece is compressed. This is how pbzip2 runs faster on multiple CPUs since the pieces can be compressed simultaneously. The final .bz2 file may be slightly larger than if it was compressed with the regular bzip2 program due to this file splitting (usually less than 0.2% larger). Files that are compressed with pbzip2 will also gain considerable speedup when decompressed using pbzip2.

Files that were compressed using bzip2 will not see speedup since bzip2 packages the data into a single chunk that cannot be split between processors.

Also important:

The output of this version is fully compatible with bzip2 v1.0.2 or newer (ie: anything compressed with pbzip2 can be decompressed with bzip2).

This means it should be safe to use, at a small cost in file size.
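A rough way to see why this works, sketched with Python's standard `bz2` module rather than pbzip2 itself: compress each piece as an independent bzip2 stream, concatenate the streams, and check that an ordinary decompressor reads the result back unchanged. The helper name `compress_in_pieces`, the 900,000-byte piece size, and the sample data are assumptions made for illustration; pbzip2's actual internals and defaults may differ.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Hypothetical piece size for this sketch; not taken from pbzip2.
CHUNK_SIZE = 900_000


def compress_in_pieces(data: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Compress each piece as its own bzip2 stream and concatenate the streams."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor() as pool:
        # map() preserves input order, so the streams stay in file order.
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)


if __name__ == "__main__":
    # Made-up PGN-like sample data, only for illustration.
    sample = b'[Event "Rated Blitz game"]\n1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 1-0\n\n' * 50_000

    single = bz2.compress(sample)        # one stream, like plain bzip2
    pieces = compress_in_pieces(sample)  # several concatenated streams

    # A standard decompressor accepts the concatenated streams as one file.
    assert bz2.decompress(pieces) == sample

    # Compare sizes to get a feel for the overhead of splitting.
    print(len(single), len(pieces))
```

Because every piece is a complete stream on its own, the pieces can be handled independently, which is what lets a parallel tool spread the work across CPUs. The size difference printed here depends heavily on the input, so it should not be read as a measurement of pbzip2's overhead on the real database files.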

Recompression completed.