shishenyxx/DeepMosaic

homopolymer and dinucleotide filter

Hi,
I noticed that in the Q&A you recommend that, for WGS variants, excluding annotated homopolymer and dinucleotide repeats removes false positives and increases the validation rate, but decreases sensitivity. However, I do not know what homopolymer=0 and dinucleotide=0 mean. Is a variant more reliable or less reliable as the value gets closer to zero?
What do you recommend?

Thanks

Hi huangyuanf,

Thank you for your question! The homopolymer and dinucleotide repeat annotations indicate whether a variant is close to a homopolymer or dinucleotide repeat in the genome. Selecting only variants with a "0" annotation avoids the false positives that arise near these repeats. However, it might also exclude some true positives, as homopolymers and dinucleotide repeats are also where polymerases tend to make mistakes.

So variants annotated with 0 are more reliable; you can try removing anything that is not equal to 0.
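
The filter described above can be sketched in a few lines of Python. This is only an illustration under assumptions: it assumes the variant table is tab-separated and that the two annotations live in columns named `homopolymer` and `dinucleotide`. Both column names and the helper `keep_non_repeat_variants` are hypothetical, so check the header of your own output file (the spelling there may differ) and pass the real names in.

```python
import csv
import io


def keep_non_repeat_variants(rows, homopolymer_col="homopolymer",
                             dinucleotide_col="dinucleotide"):
    """Keep only rows whose two repeat annotations are both 0.

    The default column names are assumptions for illustration;
    pass the names that appear in your own table header.
    """
    kept = []
    for row in rows:
        # Annotations are read as text, so convert before comparing to 0.
        if float(row[homopolymer_col]) == 0 and float(row[dinucleotide_col]) == 0:
            kept.append(row)
    return kept


# Toy table standing in for a real output file (made-up values).
example = (
    "chrom\tpos\thomopolymer\tdinucleotide\n"
    "chr1\t100\t0\t0\n"
    "chr1\t200\t1\t0\n"
    "chr2\t300\t0\t1\n"
)

rows = list(csv.DictReader(io.StringIO(example), delimiter="\t"))
filtered = keep_non_repeat_variants(rows)

for row in filtered:
    print(row["chrom"], row["pos"])
```

To apply this to a real file, replace `io.StringIO(example)` with an opened file handle. Requiring both annotations to be 0 is the strict version of the filter; as noted above, it trades some sensitivity for a higher validation rate.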

Best,

Xiaoxu