nygenome/lancet

std::bad_alloc error when calling variant using lancet

Closed this issue · 7 comments

Hi, I read your Lancet paper and I am really excited to see that the tool can reduce STR false positives.
I've just installed lancet on my Linux machine. But when I ran it on my targeted sequencing data (the BAM files were generated with bwa mem):
lancet --num-threads 2 --active-region-off --tumor /path/to/the/indexed/tumor-bam --normal /path/to/the/indexed/normal-bam --ref humanhg38.fa --bed target.bed

The log output was:

Loaded 2290 from bedfile
20654 total windows to process
starting thread 1 on 10324 windows
starting thread 2 on 10322 windows
Process reads
Process reads

and then it threw this error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
and terminated.

I am no expert in C++, but I assume this may be related to a memory allocation problem.

Do you know why the error occurs, and what would you suggest to resolve it?

If you need any more details please let me know.

Thanks,
Yiming

Hi Yiming,

It is possible that the program ran out of memory.
Does the error occur as soon as the job is submitted? Can you check the memory usage?

You may want to test the code on a small region first, using the "--reg chr:start-end" option, to make sure that the input files (BAMs, FASTA, BED) are compatible.
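One common form of incompatibility between input files is a contig naming mismatch (for example "chr1" in the FASTA but "1" in the BED). The sketch below is only an illustration of that single check, not something lancet provides: the function name is hypothetical, and it assumes a samtools-style .fai index next to the FASTA and a plain tab-separated BED file.

```python
from typing import List


def contigs_missing_from_fai(fai_text: str, bed_text: str) -> List[str]:
    """Return BED contig names that do not appear in the FASTA index (.fai).

    Both arguments are file contents as strings. The first column of each
    .fai line is the sequence name; the first column of each BED line is
    the contig name.
    """
    fai_names = {
        line.split("\t", 1)[0] for line in fai_text.splitlines() if line.strip()
    }
    missing: List[str] = []
    for line in bed_text.splitlines():
        # Skip blank lines and the optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom = line.split()[0]
        if chrom not in fai_names and chrom not in missing:
            missing.append(chrom)
    return missing


if __name__ == "__main__":
    # Illustrative contents only; the offsets in the .fai lines are made up.
    fai = "chr1\t248956422\t6\t60\t61\nchr2\t242193529\t253105702\t60\t61\n"
    bed = "chr1\t6181470\t6199619\n1\t100\t200\n"
    print(contigs_missing_from_fai(fai, bed))
```

An empty result only rules out this one kind of mismatch; it says nothing about whether the BAMs were aligned to the same reference build.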

Hi Giuseppe,

Sorry for the late response.

> Does the error occur as soon as the job is submitted? Can you check the memory usage?

I think so; the error occurs less than 1 second after I run the command. Because the run is so short, I used /usr/bin/time to track the memory usage, and the maximum was less than 20 MB. The server I am using has more than 100 GB of memory, so I guess memory shouldn't be a problem.
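For anyone who wants to repeat this measurement from a script, here is a minimal sketch that runs a command and reports the peak resident memory of child processes, similar in spirit to what /usr/bin/time reports. It assumes Linux, where `ru_maxrss` is given in kilobytes; the function name is hypothetical, and the child command shown is a stand-in rather than a real lancet invocation.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(cmd):
    """Run cmd, wait for it, and return (returncode, peak_rss_mb).

    For RUSAGE_CHILDREN, ru_maxrss is the largest resident set size among
    all children this process has waited for so far (kilobytes on Linux),
    so call this from a fresh interpreter for a clean reading.
    """
    completed = subprocess.run(cmd)
    peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, peak_kb / 1024.0


if __name__ == "__main__":
    # Stand-in child process; replace the list with the real command line.
    code, peak_mb = run_and_report_peak_rss([sys.executable, "-c", "pass"])
    print(f"exit code {code}, peak RSS about {peak_mb:.1f} MB")
```

A negative return code such as -6 means the child was killed by a signal (SIGABRT on Linux), which is how an uncaught C++ exception ends the process. A very small peak together with std::bad_alloc would be consistent with a single oversized allocation request being refused, rather than memory gradually running out, though that is only a guess here.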

I tried testing on a small region in two ways.
The first time, I specified a small region with the --reg option:

lancet --tumor tumor.bam --normal normal.bam --ref hg38.fa --reg chr1:6181470-6199619

The second time, I extracted the reads covering that region using samtools view. The resulting tumor and control BAM files are only a few MB each and contain 89795 and 54227 reads, respectively. Then I tried again:

lancet --tumor tumor.chr1.6181470-6199619.bam --normal normal.chr1.6181470-6199619.bam --ref hg38.chr1.fa --reg chr1:6181470-6199619
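The region extraction described above can be sketched as follows. The exact samtools options used in this thread are not stated, so the flags below (`-b` for BAM output, `-o` for the output path) are an assumption about a typical invocation, and the function name is hypothetical. The input BAM must be indexed for a region query to work.

```python
import shlex


def region_extract_commands(in_bam, region, out_bam):
    """Build the samtools commands to extract one region and index the result.

    Returns a list of argument lists suitable for subprocess.run; nothing
    is executed here.
    """
    return [
        # Write reads overlapping the region to a new BAM file.
        ["samtools", "view", "-b", "-o", out_bam, in_bam, region],
        # Index the small BAM so downstream tools can query it by region.
        ["samtools", "index", out_bam],
    ]


if __name__ == "__main__":
    region = "chr1:6181470-6199619"
    for cmd in region_extract_commands(
        "tumor.bam", region, "tumor.chr1.6181470-6199619.bam"
    ):
        print(shlex.join(cmd))
```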

Both times, I ran into the same error:

X total windows to process
starting thread 1 on X windows
Process reads
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

The error happened after

Process reads

So I guess something goes wrong when lancet tries to process the reads in the BAM files.

Is there anything else I can do to help you narrow down the problem?

Thanks,
Yiming

Would you be willing to share the small BAMs extracted from your tumor/normal data? I can try to reproduce your error and find a fix.

Sure, that would be great. Thanks.
small_sample.tar.gz

The mapping was done with GRCh38.d1.vd1.fa.

I was able to run Lancet on your small BAMs without any problems. I have attached the output VCF file (test.vcf.zip). I used the exact same command as you:

lancet \
    --normal normal.chr1.6181470-6199619.bam \
    --tumor tumor.chr1.6181470-6199619.bam \
    --reg chr1:6181470-6199619 \
    --ref GRCh38.d1.vd1.fa

Not sure why it fails for you on the same data.
Can you run your small test again with the -v option and share the output? In verbose mode you may be able to identify the step at which the code fails.
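Once the run's output has been captured (stdout and stderr together, since the terminate message goes to stderr), a small helper like the one below can pull out the last lines printed before the crash. This is only a sketch: the function name is hypothetical, and the sample log is the non-verbose output already posted in this thread, because the format of the verbose output is not shown here.

```python
def lines_before_crash(log_text, marker="terminate called", context=3):
    """Return up to `context` log lines printed just before the crash marker.

    Returns an empty list when the marker does not occur in the log.
    """
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if marker in line:
            return lines[max(0, i - context):i]
    return []


if __name__ == "__main__":
    log = (
        "X total windows to process\n"
        "starting thread 1 on X windows\n"
        "Process reads\n"
        "terminate called after throwing an instance of 'std::bad_alloc'\n"
        "what(): std::bad_alloc\n"
    )
    for line in lines_before_crash(log, context=2):
        print(line)
```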

I tried the newest release, 1.0.6, and it works!
Before that, I was using lancet built from the latest code on the master branch. I guess some commits made after the release may have caused the problem.

Thanks,
Yiming

Great! Thank you for the update, Yiming.