dragetd/borgbench

max. useful lzma level?

Closed this issue · 3 comments

I usually advise "do not use too high a level with lzma", because beyond some level it becomes pointless as far as better compression is concerned (we do not feed enough data into lzma, usually ~2MB) and it just burns way too many CPU cycles.

But I don't know what maximum level we could recommend for a realistic mix of input data (the level above which it no longer yields better compression and just wastes more cycles).

Can you find out / make a nice graph for that?
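A minimal sketch of how such a measurement could be set up outside of borg, assuming Python's standard `lzma` module as a stand-in for borg's compressor and fixed-size 2 MiB chunks as a stand-in for borg's chunker (both are simplifications; `bench_lzma` and `iter_chunks` are hypothetical helper names, not part of borg or borgbench):

```python
import lzma
import time

# Assumed chunk size, taken from the "~2MB" figure mentioned above.
CHUNK_SIZE = 2 * 1024 * 1024


def iter_chunks(data, chunk_size=CHUNK_SIZE):
    """Yield consecutive fixed-size slices of data (the last one may be shorter)."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def bench_lzma(data, levels=range(0, 10), chunk_size=CHUNK_SIZE):
    """Compress data chunk by chunk at each lzma preset.

    Returns a list of (level, total_compressed_bytes, seconds) tuples.
    """
    rows = []
    for level in levels:
        started = time.perf_counter()
        total = sum(
            len(lzma.compress(chunk, preset=level))
            for chunk in iter_chunks(data, chunk_size)
        )
        rows.append((level, total, time.perf_counter() - started))
    return rows
```

Feeding it the bytes of a representative file and printing the rows gives a rough size/time curve per level. It only approximates what 'borg create' does, since chunking, hashing and I/O are left out.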

That depends heavily on the test corpus, I assume. I had JPGs, 1GB of a VM image, and a Linux root filesystem.

I guess the most logical approach would be a root filesystem. It has a lot of mixed data with some hardly compressible, some very compressible.

I'll try the Debian live-CD root filesystem extracted from this ISO: http://ftp.de.debian.org/debian-cd/8.5.0-live/amd64/iso-hybrid/debian-live-8.5.0-amd64-standard.iso

See:
https://raw.githubusercontent.com/dragetd/borgbench/master/compression/img/results_rootfs_lzma.png and
https://raw.githubusercontent.com/dragetd/borgbench/master/compression/img/results_rootfs_zlib.png

Note the different scales of the axes.

Interesting results:

  • lzma,0 begins nicely where zlib,8 ends. One even gets slightly better compression with lzma,1 in the same time it takes for zlib,8
  • no real compression gain after lzma,6
  • lzma,1 vs. lzma,6: less than 3% difference in resulting file size vs. a 400% change in 'borg create' time
  • with this test corpus, anything beyond lzma,4 would just waste a lot of time for hardly any compression gain
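The "no real gain after level N" reading can also be made mechanical. The following is only a sketch: `max_useful_level` is a hypothetical helper, and the 1% threshold is an assumed cut-off, not something derived from the plots. It takes measured (level, compressed bytes, seconds) rows and returns the highest level that still shrinks the output by at least the threshold compared to the last level that did so.

```python
def max_useful_level(rows, min_gain=0.01):
    """Pick the highest level that still gives a worthwhile size reduction.

    rows: iterable of (level, compressed_bytes, seconds) measurements.
    min_gain: minimum relative size reduction (0.01 = 1%) a level must
        achieve over the last accepted level; an assumed threshold.
    The seconds column is carried along but not used for the decision.
    """
    rows = sorted(rows)
    best_level, best_size, _ = rows[0]
    for level, size, _seconds in rows[1:]:
        # Accept a higher level only if it is smaller by at least min_gain.
        if size <= best_size * (1 - min_gain):
            best_level, best_size = level, size
    return best_level
```

A stricter or looser `min_gain` shifts the recommendation, so the result is a statement about the chosen threshold and corpus as much as about lzma itself.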

Thanks for testing this!