amplab/snap

-F option yielding incongruous output

cxr5298 opened this issue · 0 comments

While using SNAP with an Illumina AmpliSeq dataset, I get surprising output from SNAP's -F filter option.

When I run the data, I split the output into two buckets: one for aligned reads and one for unaligned reads. I would expect the read totals of the aligned and unaligned sets to add up to the read count of the run as a whole.

However, they don't: the combined total is greater than the original read count. Moreover, when I dive into the individual .sam files, there is overlap between the unaligned and aligned reads. I also looked at the supposedly unaligned reads, and they do in fact align to my original reference; it's just that SNAP is giving them MAPQ scores of 0. For the life of me, I cannot tell why this is happening. Unfortunately, I am not able to share any data, so I understand if any input you can offer is limited. I will share what I can below:

The call to SNAP:
~/project/snap-aligner paired ~/data/indx/ ~/data/sample1_R1.fastq ~/data/sample1_R2.fastq -F a -o sample1_aln.sam
~/project/snap-aligner paired ~/data/indx/ ~/data/sample1_R1.fastq ~/data/sample1_R2.fastq -F u -o sample1_unaln.sam

The indexed reference I'm using is a small database of only 48 sequences ranging from 95 to 225 bases in length.
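
For anyone who wants to reproduce the comparison without my data, here is a minimal sketch of how the two outputs can be checked for overlap. It relies only on the standard SAM columns (QNAME, FLAG, MAPQ) and FLAG bits, and it assumes that a read is identified by its name plus its mate number and that secondary and supplementary records should be ignored. The helper names `read_keys` and `compare` are hypothetical and are not part of SNAP.

```python
import sys

# FLAG bits as defined in the SAM specification.
UNMAPPED = 0x4
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def read_keys(lines):
    """Map (qname, mate) -> (flag, mapq) for the primary records in SAM lines."""
    reads = {}
    for line in lines:
        # Skip blank lines and header lines (those starting with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag, mapq = fields[0], int(fields[1]), int(fields[4])
        # Assumption: only primary records count toward the read total.
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if flag & FIRST_IN_PAIR:
            mate = 1
        elif flag & SECOND_IN_PAIR:
            mate = 2
        else:
            mate = 0  # unpaired read
        reads[(qname, mate)] = (flag, mapq)
    return reads


def compare(aligned_lines, unaligned_lines):
    """Summarize read counts and overlap between the two filtered outputs."""
    aligned = read_keys(aligned_lines)
    unaligned = read_keys(unaligned_lines)
    # Records in the "unaligned" file that do not carry the unmapped flag.
    mapped = [(f, q) for f, q in unaligned.values() if not f & UNMAPPED]
    return {
        "aligned": len(aligned),
        "unaligned": len(unaligned),
        "shared": len(aligned.keys() & unaligned.keys()),
        "union": len(aligned.keys() | unaligned.keys()),
        "mapped_in_unaligned_file": len(mapped),
        "mapq0_in_unaligned_file": sum(1 for _, q in mapped if q == 0),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_sam.py sample1_aln.sam sample1_unaln.sam
    with open(sys.argv[1]) as aln, open(sys.argv[2]) as unaln:
        for name, value in compare(aln, unaln).items():
            print(f"{name}\t{value}")
```

If "shared" is nonzero, the same read appears in both files, and "union" is the figure to compare against the original read count rather than the sum of the two totals.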