marbl/verkko

Explore alternate correction strategies

skoren opened this issue · 0 comments

skoren commented

The current correction works well but is quite slow because of the cost of overlapping; it makes up >50% of the total verkko runtime.

Options include:

  • LJA with the --dimer-compress 1000000000,1000000000,1 option to avoid compressing dimers
  • hifiasm
  • faster identification of potential overlaps, followed by verification with edlib (aka the MHAP strategy)
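
To make the third option concrete, here is a rough sketch of the two-stage idea: a cheap sketch-based filter proposes candidate read pairs, and only those pairs get an exact edit-distance check. This is an illustration under stated assumptions, not verkko or MHAP code: the function names, k-mer size, sketch size, and thresholds are all hypothetical, and a plain dynamic-programming edit distance stands in for edlib so the example has no dependencies. It also compares whole reads globally, whereas a real overlapper would align suffix/prefix regions.

```python
import hashlib
from itertools import combinations


def kmers(seq, k):
    """Return the set of distinct k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq, k=8, size=32):
    """Bottom-s MinHash sketch: the `size` smallest k-mer hash values."""
    hashes = sorted(
        int.from_bytes(hashlib.blake2b(km.encode(), digest_size=8).digest(), "big")
        for km in kmers(seq, k)
    )
    return set(hashes[:size])


def sketch_similarity(a, b, size):
    """Estimate Jaccard similarity from two bottom-s sketches."""
    union_bottom = sorted(a | b)[:size]
    if not union_bottom:
        return 0.0
    shared = sum(1 for h in union_bottom if h in a and h in b)
    return shared / len(union_bottom)


def edit_distance(s, t):
    """Global Levenshtein distance (stand-in for an edlib alignment call)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # match / substitution
        prev = cur
    return prev[-1]


def find_overlaps(reads, k=8, size=32, min_similarity=0.2, max_error=0.1):
    """Stage 1: sketch filter. Stage 2: exact verification of candidates only."""
    sketches = {name: sketch(seq, k, size) for name, seq in reads.items()}
    accepted = []
    for a, b in combinations(sorted(reads), 2):
        if sketch_similarity(sketches[a], sketches[b], size) < min_similarity:
            continue  # cheap rejection, no alignment computed
        dist = edit_distance(reads[a], reads[b])
        if dist / max(len(reads[a]), len(reads[b])) <= max_error:
            accepted.append((a, b, dist))
    return accepted
```

The all-pairs loop over sketches is only for readability; the expected speedup of this strategy comes from replacing it with an index lookup over sketch hashes, so that most read pairs are never compared at all.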

Results from testing so far on HG002:

  • current correction - T2T ctg/scf 15/26, QV 53
  • LJA - T2T ctg/scf 4/10, QV 53
  • hifiasm - T2T ctg/scf 8/15, QV 45

This would seem to indicate that hifiasm is over-homogenizing the reads, confusing the final read assignment and thus the consensus. LJA was run with dimer compression, which likely hurt our ONT alignment for resolution, so it needs to be re-run.