marbl/CHM13

Reference bias issue for some validated variants?

xiaoguanghuan1 opened this issue · 1 comments

Hi, thank you so much for this tremendous contribution to human genome research.
We are currently calling variants from nanopore reads mapped to T2T as the reference, which works very well. As we dug further into some validated variants in a highly homologous region (SMN1 and SMN2), we found what looks like a reference bias issue. Please see the example in the image. On hg37 and hg38, this position is an SNV (A-->G) validated in HG002, but in T2T the base at this position is G, so no variant was called in HG002. The SNV is real, because we see it in HG003 (heterozygous) in the reads mapped to T2T. We are wondering whether an alternative haplotype was integrated into the current T2T genome while the haplotype consistent with hg38 was not, or whether this is simply a sequencing error in either hg38/37 or T2T. Could you please give some advice? Thank you!
image

Hello @xiaoguanghuan1,
It is likely that the CHM13 allele matches the HG002 allele at this locus, and that allele is what went into the assembly.
No hg38 haplotypes were considered during the assembly / polishing process.
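
To illustrate why the same site shows up as a variant against hg38 but not against T2T, here is a minimal sketch in Python. It is not part of any CHM13 tooling: the helper `genotype_against` is hypothetical, and the genotypes are assumptions (HG002 taken as homozygous G/G, HG003 as heterozygous A/G) inferred from the description above rather than from the actual calls.

```python
def genotype_against(ref_base, sample_alleles):
    """Express a diploid genotype relative to a given reference base.

    Returns (alt_alleles, gt), where gt uses VCF-style allele indices
    (0 = reference allele, 1.. = alternate alleles).
    """
    # Any sample allele that differs from the reference is an ALT allele.
    alts = sorted({a for a in sample_alleles if a != ref_base})
    index = {ref_base: 0}
    index.update({a: i + 1 for i, a in enumerate(alts)})
    gt = "/".join(str(i) for i in sorted(index[a] for a in sample_alleles))
    return alts, gt


# Assumed genotypes at the site (hypothetical, for illustration only).
samples = {"HG002": ("G", "G"), "HG003": ("A", "G")}
# Reference base at the site, as described in this issue.
references = {"hg38": "A", "T2T-CHM13": "G"}

for ref_name, ref_base in references.items():
    for sample, alleles in samples.items():
        alts, gt = genotype_against(ref_base, alleles)
        alt_field = ",".join(alts) if alts else "."
        print(f"{ref_name:10s} {sample}  REF={ref_base} ALT={alt_field} GT={gt}")
```

Under these assumptions, HG002 is homozygous for the alternate allele against hg38 but homozygous reference against T2T, so a caller emits no record there, while HG003 is still reported as heterozygous with REF and ALT swapped.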