zaeleus/noodles

"invalid start positions" when reading unmapped cram

Closed this issue · 2 comments

Hi!

I'm trying to read unmapped cram files (created with noodles_cram) and get an error trying the following

use noodles_cram as cram;
let mut reader = cram::io::reader::Builder::default()
    .build_from_path(cram_file)
    .unwrap();
let header = reader.read_header().expect("bad header");

for result in reader.records(&header) {
    let record = result.expect("Error reading record");
    println!("{:?}", record.sequence());
    std::process::exit(1);
}

the error:

thread 'main' panicked at /Users/hadrien/.cargo/registry/src/index.crates.io-6f17d22bba15001f/noodles-cram-0.67.0/src/data_container/slice.rs:270:10:
invalid start positions
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

However, if I convert my cram into bam (samtools view -b test.cram -o test.bam) the following code works fine:

use noodles_bam as bam;
let mut reader = bam::io::reader::Builder::default()
    .build_from_path(bam_file)
    .unwrap();
let header = reader.read_header().expect("bad header");

for result in reader.records() {
    let record = result.expect("error reading record");
    println!("{:?}", record.sequence());
    std::process::exit(1);
}
// prints Sequence("ATTTCCGAAAGGCCGACCCGGAACTGGATGCGCTCTATGCCGAACTGGTCGACACGGCCGATCAGCGCCACCGACATCGTGCCCTGATCGACCCAGCTGTCATGCCGGATCGCCGCCAGCACACCCAG")

Here is the first record in the file:

M03699:229:000000000-DLMLN:1:1101:15633:13	77	*	0	0	*	*	0	0	ATTTCCGAAAGGCCGACCCGGAACTGGATGCGCTCTATGCCGAACTGGTCGACACGGCCGATCAGCGCCACCGACATCGTGCCCTGATCGACCCAGCTGTCATGCCGGATCGCCGCCAGCACACCCAG	DDDDDFDCCDDFGGGGGGGGGGGGHHHHHHHGGGGHHHHHHGGGGHHHHHGGGGGGGGGGGGGGHHGGGGGHGGGGGHGHHGHHHHHHHGGGGGHHHHHHHHHHHHGGGGGGGGGGGGGGHHHHGGGH	CB:Z:E3-D7-H10-A5	CR:Z:CTGTGAAC-CTTCAGCA-CAACTTGC-TCTGGAAC

Is this a bug in noodles_cram or am I doing something wrong?

Best,
Hadrien

Thanks for the report; it's a bug.

This error occurs when an unplaced record has a downstream mate, and the template length needs to be calculated, which isn't possible.

It's now fixed in cad031c.

Wow that was fast! it works now, thanks a lot 🤩