Puffaligner doesn't map read pairs to different references
apcamargo opened this issue · 2 comments
Hi,
There are some applications where it's important to identify read pairs whose mates map to different references. Even though Puffaligner maps reads independently ("(…) we consider the chaining and chain filtering for each end of the read separately."), I couldn't find any pair consisting of mates that map to different references.
In comparison, Bowtie2 maps ≈ 1.6% of the read pairs to different references with the same inputs.
Hi @apcamargo,
Thank you for your post. However, I am not sure if I understand the request clearly.
Would you mind explaining a little bit more?
Sure, @fataltes!
Here's Puffaligner's (using --bestStrata) samtools flagstat output:
214688504 + 0 in total (QC-passed reads + QC-failed reads)
50488220 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
125913389 + 0 mapped (58.65% : N/A)
164200284 + 0 paired in sequencing
82100142 + 0 read1
82100142 + 0 read2
83360444 + 0 properly paired (50.77% : N/A)
83360444 + 0 with itself and mate mapped
6101721 + 0 singletons (3.72% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
Here's Bowtie2's (using -k 15):
241492571 + 0 in total (QC-passed reads + QC-failed reads)
77292287 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
159016623 + 0 mapped (65.85% : N/A)
164200284 + 0 paired in sequencing
82100142 + 0 read1
82100142 + 0 read2
74243714 + 0 properly paired (45.22% : N/A)
77436030 + 0 with itself and mate mapped
4288306 + 0 singletons (2.61% : N/A)
2489036 + 0 with mate mapped to a different chr
2027014 + 0 with mate mapped to a different chr (mapQ>=5)
Puffaligner's "with mate mapped to a different chr" count is 0, meaning that no pair has mates that mapped to different references.
Essentially, I'm interested in alignments where the 7th field (RNEXT) is not =, for example:
HISEQ13:355:CBN0FANXX:7:1101:17319:1971 97 k147_2000503 17 38 150M k147_584177 66 0 CGGCGGACTAAGGCTCTATAATTTCAATTTTTCACCAGACTAAGTAATCCATGAAGAAACTCATTGCAGCACTGGCTTCCAGTGTTCTGGTGATGTCCGCCGCCGTCGCCCAGACGCTGCCGGCGCCGACCATCGCCGCCAAATCGTGGC =ABBGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGFGG>GEGGGGGGDGFGGCGDGDGGGGG<DGGGGGGGBGGGGGGGGGGGGGGGGGGGGGGGGGG@ AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:150 YT:Z:UP
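To make the criterion concrete, here is a minimal Python sketch of the check described above. It assumes plain tab-separated SAM records (for example from `samtools view`), and the function names `mate_on_different_reference` and `count_cross_reference_reads` are hypothetical helpers for illustration, not part of Puffaligner, Bowtie2, or samtools. It only approximates the "with mate mapped to a different chr" line of flagstat and may not reproduce its counts exactly.

```python
from typing import Iterable

# SAM FLAG bits used below (defined in the SAM specification).
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def mate_on_different_reference(sam_line: str) -> bool:
    """Return True if a SAM record's mate is aligned to a different reference."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname, rnext = fields[2], fields[6]  # 3rd field is RNAME, 7th is RNEXT
    # Only paired records with both mates aligned can qualify.
    if not flag & PAIRED or flag & (UNMAPPED | MATE_UNMAPPED):
        return False
    # "=" means the mate is on the same reference; "*" means unavailable.
    return rnext not in ("=", "*") and rnext != rname


def count_cross_reference_reads(lines: Iterable[str]) -> int:
    """Count qualifying records, skipping blank lines and @ header lines."""
    return sum(
        1
        for line in lines
        if line.strip()
        and not line.startswith("@")
        and mate_on_different_reference(line)
    )
```

Applied to the record above (flag 97, RNAME k147_2000503, RNEXT k147_584177), the first function returns True. A record whose 7th field is = returns False.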