Realigned output bam TLEN field plus/minus sign when FLAG == 147
Julie-Zhongyun-Huang opened this issue · 2 comments
Hi there!
We have recently become very interested in abra2 for fast and accurate reassembly/realignment of InDels.
When using other tools with the realigned BAM from abra2, we discovered the following potential issue. Please see this example read pair:
A00337:46:HHGVNDMXX:1:1441:31946:25316:CTGCAGTA:CTGCAGTA:GA:AA 147 chr16 3727646 60 139M = 3727648 139 TTCCTAGATGCCTGGATTTTCAGTACAAAAGGTCCAAGAACATGAAAGGGGAAAGGTGATGCTCTCACAATGCTACAAGCCCTCCACAAACTTCTCTAGCGTGTCCCCCGTGGTGTCCCCGACCAGGGACAGTTCGCTG :FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF YA:Z:chr16:3727129:964M MD:Z:48A88 RG:Z:4 NM:i:3 YM:i:2 YO:Z:chr16:3727648:-:2S137M AS:i:132 XS:i:23 YX:i:3
A00337:46:HHGVNDMXX:1:1441:31946:25316:CTGCAGTA:CTGCAGTA:GA:AA 99 chr16 3727648 60 8S130M = 3727646 -139 TTTTTATTCCTAGATGCCTGGATTTTCAGTACAAAAGGTCCAAGAACATGAAAGGGGAAAGGTGATGCTCTCACAATGCTACAAGCCCTCCACAAACTTCTCTAGCGTGTCCCCCGTGGTGTCCCCGACCAGGGACAG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF YA:Z:chr16:3727129:964M MD:Z:48A81 RG:Z:4 NM:i:1 AS:i:125 XS:i:23
In this example, for the FLAG == 147 read, POS (col 4, here 3727646) is less than PNEXT (col 8, here 3727648), and TLEN (col 9, here 139) has a plus sign.
However, in other BAM files that have not been realigned/reassembled, TLEN always has a minus sign in this situation (FLAG == 147 & POS < PNEXT).
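The condition described above (FLAG == 147, POS < PNEXT, positive TLEN) can be searched for in SAM text, for example the output of `samtools view`. The following is only a minimal sketch: the function name `positive_tlen_reverse_mates` is hypothetical, and it assumes the first nine mandatory SAM columns are separated by whitespace (tabs in a real SAM file, spaces in the records pasted here).

```python
def positive_tlen_reverse_mates(sam_lines, flag=147):
    """Yield (qname, pos, pnext, tlen) for records that have the given FLAG,
    POS < PNEXT, and a positive TLEN.

    Hypothetical helper for illustration; assumes whitespace-separated
    SAM text lines, one record per line.
    """
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.split(None, 9)
        if len(fields) < 9:
            continue  # not a complete alignment record
        qname = fields[0]
        rec_flag = int(fields[1])   # column 2: FLAG
        pos = int(fields[3])        # column 4: POS
        pnext = int(fields[7])      # column 8: PNEXT
        tlen = int(fields[8])       # column 9: TLEN
        if rec_flag == flag and pos < pnext and tlen > 0:
            yield qname, pos, pnext, tlen
```

Feeding it the two records from this issue would report only the FLAG == 147 read, since the FLAG == 99 mate has a negative TLEN.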
According to the SAM format specification, for TLEN the leftmost segment has a plus sign and the rightmost has a minus sign. For FLAG == 147 (second of a pair / reverse-complemented), when POS < PNEXT, the segment should still be the rightmost.
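To make "leftmost" and "rightmost" concrete for this pair, the reference span of each read can be derived from POS and CIGAR. The sketch below is an illustration only: the helper names `reference_end` and `template_span` are hypothetical, and it treats the CIGAR operations M, D, N, = and X as the ones that consume reference bases. It does not claim to reproduce how abra2 or any aligner chooses the TLEN sign.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance along the reference.
REF_CONSUMING = set("MDN=X")

def reference_end(pos, cigar):
    """Return the 1-based inclusive end coordinate of an alignment
    that starts at 1-based position pos (hypothetical helper)."""
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar)
                  if op in REF_CONSUMING)
    return pos + ref_len - 1

def template_span(pos_a, cigar_a, pos_b, cigar_b):
    """Return (start, end, length) of the region covered by both mates."""
    start = min(pos_a, pos_b)
    end = max(reference_end(pos_a, cigar_a), reference_end(pos_b, cigar_b))
    return start, end, end - start + 1

# The read pair from this issue:
print(reference_end(3727646, "139M"))    # FLAG 147 read ends at 3727784
print(reference_end(3727648, "8S130M"))  # FLAG 99 read ends at 3727777
print(template_span(3727646, "139M", 3727648, "8S130M"))
# (3727646, 3727784, 139)
```

For this pair the FLAG == 147 read both starts before and ends after its mate, and the combined span of 139 bases matches the absolute TLEN value in both records.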
Please don't hesitate to let me know if the TLEN sign should be modified.
Thanks a lot!!
Julie
I'm not sure I have a grasp on the issue here.
Based on a quick reading of the SAM spec, I could not find anything to support the following statement:
"For FLAG==147 (second of a pair / reverse-complemented), when POS < PNEXT, the segment should still be the rightmost."
Feel free to correct me if I am missing something and point me to where this is defined.
Also, it may be helpful to hear how this is impacting the downstream tools you are using with the realigned BAM. Thanks.