KolmogorovLab/hapdup

A few questions

Closed this issue · 2 comments

Hi hi,

  • During the haplotype polishing step, would it make sense to use --read-error in combination with --nano-corr if I am using corrected reads for the polishing? I am currently using the option, so I know Flye accepts it, but I am unsure whether it makes a difference.
  • Overall, do you think a full run of Pepper-MARGIN-DeepVariant rather than just Pepper-SNP would make a difference for segmental duplications?
  • Finally, would you recommend any extra step after Hapdup to further polish the haplotypes?

Thank you for your work on Hapdup. I recently used it on a 70x ONT (R9.4, Guppy 5 SUP) + 40x Illumina (2x151bp) assembly, and the resulting diploid assembly had a QV of 50 and a k-mer completeness of nearly 99% according to Merqury 1.3 (trio mode).

Guillaume

Hi,

  1. --read-error does not make any difference for polishing.
  2. Unlikely. PEPPER captures most of the informative SNPs, and Margin uses them for phasing. After that, the SNPs generated by PEPPER are not used. The final base-level quality primarily depends on (1) the accuracy of the read haplotype assignments and (2) Flye polisher performance.
  3. We haven't tested anything extensively. Theoretically, applying Medaka using phased reads might help, but you'd need to do two separate runs, one per haplotype, each using the haplotagged reads from the matching set. Alternatively, you could run PEPPER again, this time with DeepVariant, but it currently does not have an interface to start from a diploid assembly.
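
For the two-runs-per-haplotype idea in point 3, a minimal sketch of how the haplotagged reads could be split into per-haplotype sets is shown below. It is untested on real Hapdup output and rests on assumptions: the alignments are available as SAM text, and each read's haplotype is stored in the conventional `HP:i:<n>` optional tag. The function names are hypothetical and are not part of Hapdup, Margin, or Medaka.

```python
import re
from collections import defaultdict

# Assumption: haplotype assignments are stored in the HP:i:<n> optional tag.
HP_TAG = re.compile(r"^HP:i:(\d+)$")
# SAM header lines start with '@' plus a two-letter record type and a tab.
HEADER = re.compile(r"^@[A-Z]{2}\t")


def haplotype_of(sam_line):
    """Return the HP tag value of one SAM record as an int, or None if untagged."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start after the 11 mandatory SAM columns.
    for field in fields[11:]:
        match = HP_TAG.match(field)
        if match:
            return int(match.group(1))
    return None


def split_read_names_by_haplotype(sam_lines):
    """Group read names by haplotype; untagged reads are collected under None."""
    groups = defaultdict(set)
    for line in sam_lines:
        if not line.strip() or HEADER.match(line):
            continue  # skip blank lines and header records
        read_name = line.split("\t", 1)[0]
        groups[haplotype_of(line)].add(read_name)
    return dict(groups)


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6\tSO:coordinate",
        "read1\t0\tctg1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\tHP:i:1\tPS:i:100",
        "read2\t0\tctg1\t5\t60\t4M\t*\t0\t0\tACGT\tIIII\tPS:i:100\tHP:i:2",
        "read3\t0\tctg1\t9\t60\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    for hap, names in split_read_names_by_haplotype(example).items():
        print(hap, sorted(names))
```

Each resulting name set could then be used to extract the reads for the matching haplotype's polishing run. How to treat untagged reads (the `None` group), and reads whose supplementary alignments carry different tags, is left open here.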

Glad to hear that it's working well! I haven't really seen many examples of ONT + Illumina assemblies reaching QV50+.

Hi @fenderglass,

Thank you for your input, this was very helpful. Closing now.