10XGenomics/lz4-rs

Significant decompression speed regression with rust 1.73 toolchain

Closed this issue · 2 comments

I'm doing some benchmarks with a large (1.7GB) CSV file, and it seems that upgrading from rust 1.71 to 1.73 has caused a significant performance regression for lz4::Decoder.

Some results:

Rust 1.72

lz4 1...
[lz4 1] Compression: 20.3244%, Time: 2.16s
Decompress in: 1.25s
lz4 4...
[lz4 4] Compression: 15.5764%, Time: 12.56s
Decompress in: 1.10s
lz4 7...
[lz4 7] Compression: 15.1194%, Time: 40.19s
Decompress in: 1.11s

Rust 1.73

lz4 1...
[lz4 1] Compression: 20.3244%, Time: 2.15s
^C⏎

rs-test on  master [?] is 󰏗 v0.1.0 via  v1.73.0 took 11m

^ I killed the benchmark during decompression because it took over 11 minutes.

benchmark code
    for level in [1, 4, 7] {
        println!("lz4 {}...", level);
        let mut in_f = std::fs::File::open(in_f_path).unwrap();
        let start_size = in_f.metadata().unwrap().len();
        let now = Instant::now();
        let out_path = format!("out-{}.lz4", level);

        let out_f = std::fs::File::create(&out_path).unwrap();
        let mut encoder = EncoderBuilder::new().level(level).build(out_f).unwrap();
        std::io::copy(&mut in_f, &mut encoder).unwrap();
        let (output, result) = encoder.finish();
        if let Err(e) = result {
            println!("{:#?}", e);
        }

        let elapsed = now.elapsed().as_millis();
        let end_size = output.metadata().unwrap().len();
        println!(
            "[lz4 {}] Compression: {:.4}%, Time: {:.2}s",
            level,
            100.0 * end_size as f32 / start_size as f32,
            elapsed as f32 / 1000.0,
        );

        let file_bytes = std::fs::read(&out_path).unwrap();
        let decode_start = Instant::now();
        let mut decoder = lz4::Decoder::new(Cursor::new(file_bytes)).unwrap();
        let mut uncompressed_buf: Vec<u8> = vec![];
        std::io::copy(&mut decoder, &mut uncompressed_buf)
            .map_err(|e| format!("Error decoding {} as lz4: {:#?}", out_path, e))
            .unwrap();
        println!(
            "Decompress in: {:.2}s",
            decode_start.elapsed().as_millis() as f32 / 1000.0
        );
        std::fs::write(format!("lz4-in-{}.csv", level), uncompressed_buf).unwrap();
    }

I was able to reproduce this on Linux with Rust 1.73, and a similarly sized file. Here's the backtrace:

#1  0x00005630f3b86148 in <alloc::vec::Vec<u8,A> as std::io::copy::BufferedWriterSpec>::copy_from ()
#2  0x00005630f3b8712f in lz4::test::speed_bug ()
#3  0x00005630f3b802a9 in core::ops::function::FnOnce::call_once ()
#4  0x00005630f3bc3d5f in core::ops::function::FnOnce::call_once () at library/core/src/ops/function.rs:250
#5  test::__rust_begin_short_backtrace () at library/test/src/lib.rs:626
...```

It appears to be back to norma speed in Rust 1.75, so I guess it was fixed? I can't find an obvious rust issue that matches this phenotype, so 🤷 .   Closing - please re-open if you have any more insights.

FWIW it was root caused here