possorted_bam.bam versus phased_possorted_bam.bam
Hi,
I would be curious to hear whether you expect the possorted_bam.bam output by "LongRanger align" (rather than the phased_possorted_bam.bam output by "LongRanger wgs") to work adequately in linkedSV.
We are looking to replace LongRanger wgs because it is extremely time-consuming, and to our understanding LongRanger align provides a phased output as well. We do not yet understand the difference between the BAM files produced by the two pipelines.
Any insights would be much appreciated!
Best wishes,
Reto
Hello Reto,
Yes, I understand that the LongRanger pipeline is pretty slow.
I have not tested linkedSV on "possorted_bam.bam". Does the BAM file have a "BX" tag for the barcode and an "HP" tag for the haplotype?
Thanks,
Li
Hi Li,
Thanks a lot for getting back to me on this question! The plain possorted_bam.bam files do contain the BX tag. This is what e.g. HapCUT2 uses to infer phase (which is why I assumed the files carry phase information; in fact they carry the barcode information needed to infer phase, not the phase itself). However, as I saw when checking just now, these files do not have the HP tag. That explains the difference in naming, and why possorted_bam.bam will not work with linkedSV.
Thanks again & best wishes!
Reto
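
For anyone who wants to repeat the check discussed above, here is a minimal sketch of one way to see whether alignment records carry the BX and HP tags. It is not part of linkedSV or LongRanger. It assumes the records are available as SAM text lines (for example from `samtools view`), where optional fields start at column 12 and have the form TAG:TYPE:VALUE. The function name `count_sam_tags` and the two example records are hypothetical and made up for illustration.

```python
def count_sam_tags(sam_lines, tags=("BX", "HP")):
    """Return (number of records, {tag: records carrying that tag}).

    sam_lines: iterable of SAM text lines; header lines (starting
    with '@') and empty lines are skipped.
    """
    counts = {tag: 0 for tag in tags}
    n_records = 0
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue
        n_records += 1
        # The first 11 columns are mandatory; optional TAG:TYPE:VALUE
        # fields follow from column 12 onwards.
        optional = line.split("\t")[11:]
        present = {field.split(":", 1)[0] for field in optional}
        for tag in tags:
            if tag in present:
                counts[tag] += 1
    return n_records, counts


# Hypothetical records: the first has only a barcode (BX), the second
# additionally has a haplotype (HP) tag.
mandatory = ["0", "chr1", "100", "60", "4M", "*", "0", "0", "ACGT", "FFFF"]
barcode_only = "\t".join(["read1"] + mandatory + ["BX:Z:AAACACCAGACAATAC-1"])
barcode_and_hp = "\t".join(
    ["read2"] + mandatory + ["BX:Z:AAACACCAGACAATAC-1", "HP:i:1"]
)

print(count_sam_tags([barcode_only, barcode_and_hp]))
# prints (2, {'BX': 2, 'HP': 1})
```

On a real file, the same function could be fed `sys.stdin` with the output of `samtools view possorted_bam.bam | head -n 10000` piped in. An HP count of zero over a reasonably large sample would match what Reto reports for the "LongRanger align" output.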