vibansal/HapCUT2

Only a small part of SNPs can be phased

Closed this issue · 1 comment

Hi,
Firstly, thanks for developing this tool. It's really helpful!

I used HapCUT2 to phase 2826 heterozygous SNPs from Hi-C data. I called the SNPs from the Hi-C sequencing data with bcftools, removed the low-quality variants, and then ran HapCUT2 on a BAM file containing only the reads aligned to the variant sites (the whole BAM file was too large).
However, only 800 SNPs (~30%) were phased, which is much lower than I expected. Is that normal?
I also noticed that the depth and QUAL of the unphased SNPs are not low: most have DP>100 and QUAL>200, or even higher.
I wonder whether I did something wrong in these steps, and how I can improve the result.

Thanks,

Jiang

HapCUT2 uses the long-distance linkages present in Hi-C data for phasing. Limiting the phasing to a region could therefore discard some of those linkages and reduce the phasing completeness.
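
To illustrate the point about linkages, here is a small toy sketch in Python. It is not HapCUT2's algorithm (it ignores alleles, base qualities, and the likelihood model), and the function name `phaseable_snps` and the example fragments are made up for illustration. It only assumes that a SNP can be phased when at least one fragment (read pair) links it to another heterozygous SNP, and shows how dropping links to SNPs outside a region shrinks the set of SNPs that remain connected.

```python
from collections import defaultdict


def phaseable_snps(fragments):
    """Return the SNP ids that are linked to at least one other SNP.

    `fragments` is an iterable of tuples of SNP ids; each tuple stands for
    the heterozygous SNPs covered by one read pair (hypothetical input).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for frag in fragments:
        snps = list(frag)
        for s in snps:
            find(s)  # register every SNP, even if it ends up unlinked
        for other in snps[1:]:
            union(snps[0], other)

    # Count how many SNPs share each connected component.
    sizes = defaultdict(int)
    for s in list(parent):
        sizes[find(s)] += 1
    return {s for s in list(parent) if sizes[find(s)] > 1}


# Made-up example: SNPs 1-5 lie inside a region, SNPs 7-9 lie outside it.
all_fragments = [(1, 2), (2, 7), (3, 8), (4,), (5, 9)]
region = {1, 2, 3, 4, 5}

# Keep only the part of each fragment that falls inside the region.
restricted = [tuple(s for s in f if s in region) for f in all_fragments]

full_in_region = phaseable_snps(all_fragments) & region
restricted_in_region = phaseable_snps(restricted)

print(sorted(full_in_region))        # SNPs in the region linked using all fragments
print(sorted(restricted_in_region))  # SNPs still linked after restricting
```

In this toy example, SNPs 3 and 5 are linked only through partners outside the region, so they lose their only link once the fragments are restricted. Real data will behave differently in detail, but the same effect can lower the fraction of phased SNPs.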