nygenome/lancet

Problem with BAM chromosome labels

AldhairMedico opened this issue · 2 comments

Dear lancet developers,
I'm trying to run lancet on WES data, using your example code:

		for chrom in `seq 1 $NUMBER_OF_AUTOSOMES` X Y; do
			$lancet \
			--tumor $bam_sorted_dir/1B_sorted.bam \
			--normal $bam_sorted_dir/1A_sorted.bam \
			--ref $ref_genome \
			--reg $chrom \
			--num-threads 10 > $lancet_results_dir/'1A_1B_'${chrom}.vcf
		done

However, I didn't realize that my BAM chromosome labels were a mess:
Note: the BAMs were duplicate-marked, sorted, and indexed.
samtools idxstats 1A_sorted.bam | cut -f 1| head -n 10
NC_000001.11
NT_187361.1
NT_187362.1
NT_187363.1
NT_187364.1
NT_187365.1
NT_187366.1
NT_187367.1
NT_187368.1
NT_187369.1

And I got this error (obviously):
ERROR: chromosome label 1 not found in BAM header!
[W::fai_fetch] Reference 1:1- not found in FASTA file, returning empty sequence
Failed to fetch sequence in 1:1-
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_M_construct null not valid
Aborted (core dumped)

Is there any way to solve this problem with a lancet option, or should I redo my BAMs? (Other programs had no problems with these files.)

Your BAMs are probably fine. For WES data, it is best to provide as input (--bed option) the BED file listing the capture regions (from the capture kit used for sequencing) that should be analyzed. In addition, make sure that the chromosome labels in your BAM and BED files match (e.g., 21 vs. chr21). Hope this helps.
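One way to check the label match before running anything is to compare the two label lists directly. The following is only a sketch: `labels_missing_from` is a hypothetical helper name, and the file names in the usage comments are placeholders, not files from this thread.

```shell
# Hypothetical helper: print every label that appears in the first file
# but not in the second. Both files hold one chromosome label per line.
labels_missing_from() {
    # -F: treat labels as fixed strings, -x: match whole lines only,
    # -v: keep lines of "$1" that match no line of "$2".
    grep -v -x -F -f "$2" "$1" | sort -u
}

# Example usage (placeholder file names):
#   samtools idxstats tumor.bam | cut -f 1 > bam_labels.txt
#   grep -v -E '^(track|browser|#)' capture.bed | cut -f 1 > bed_labels.txt
#   labels_missing_from bed_labels.txt bam_labels.txt
# Any label printed is used in the BED file but absent from the BAM header.
```

An empty result means every BED label exists in the BAM. The same comparison should work for a region list such as `1 2 ... X Y` written one label per line.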

Thank you for replying. Indeed, the headers were not the same. I had to change my reference genome so that lancet and other programs no longer hit this error.
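For anyone who hits the same error and would rather not hard-code `1 ... X Y`, a possible alternative (not confirmed in this thread) is to take the region labels from the BAM itself. The sketch below assumes that labels starting with `NC_` are the assembled chromosomes (this would also include the mitochondrial sequence if present), that the reference FASTA uses the same labels as the BAM, and that lancet accepts such labels in `--reg`. `primary_labels` is a hypothetical helper name.

```shell
# Hypothetical helper: read `samtools idxstats` output on stdin and print
# only the labels in column 1 that start with NC_.
primary_labels() {
    cut -f 1 | grep '^NC_'
}

# Example usage, reusing the variables from the loop in the question:
#   for chrom in $(samtools idxstats "$bam_sorted_dir/1A_sorted.bam" | primary_labels); do
#       $lancet \
#           --tumor "$bam_sorted_dir/1B_sorted.bam" \
#           --normal "$bam_sorted_dir/1A_sorted.bam" \
#           --ref "$ref_genome" \
#           --reg "$chrom" \
#           --num-threads 10 > "$lancet_results_dir/1A_1B_${chrom}.vcf"
#   done
```

Because the labels come from the BAM header, the loop cannot request a chromosome the BAM does not contain.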