lh3/miniasm

Assembling mixtures of PacBio and Nanopore reads

jflot commented

Dear Heng,
I am trying to assemble a mixture of PacBio and Nanopore reads using minimap2 and miniasm. What would be the best switches for the overlap step? I tried minimap2 -t 32 -X -Hk15 -w5 -Xp0 -m100 -g10000 --max-chain-skip 25 but it seems to take forever on my 32 Gbp long-read dataset... Would you recommend another combination of switches?
So far my .paf file is 334 Gbp and growing. Is it possible to find out what percentage of the overlap search is already done (so that I can know whether it is near completion or will still be running for weeks)? When I look at the .paf output I cannot find an obvious logic in the order of the reads (perhaps because of the multithreading?).
Thanks a lot in advance, and best regards,
Jean-François

lh3 commented

Don't use -Hk15; it makes mapping slow. When you map PacBio reads to Nanopore reads or vice versa, use the ava-ont preset.

With the latest minimap2-2.8 (just released), the recommended way is:

minimap2 -x ava-pb pacbio.fa pacbio.fa > pb-only.paf
minimap2 -x ava-ont nanopore.fa nanopore.fa > ont-only.paf
minimap2 -x ava-ont --dual=yes pacbio.fa nanopore.fa > ont-to-pb.paf

You then concatenate all three .paf files and run the assembly on the combined file. Minimap2-2.7 or earlier can do similar overlapping in principle, but the command line will be a little complicated, so just use the new release.
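
The concatenation and assembly step could look roughly like the sketch below. The input names are the ones from the commands above; all.paf, all.fa and assembly.gfa are hypothetical names picked for this example. It also assumes that miniasm should be given one read file containing both the PacBio and the Nanopore reads, since the combined .paf refers to both sets.

```shell
# Sketch only: all.paf, all.fa and assembly.gfa are hypothetical names.

# Merge the three overlap files into a single PAF file.
merge_overlaps() {
    cat pb-only.paf ont-only.paf ont-to-pb.paf > all.paf
}

# Assumption: miniasm needs the sequences of every read named in the PAF,
# so both read sets are combined into one file.
merge_reads() {
    cat pacbio.fa nanopore.fa > all.fa
}

# Assemble from the merged reads and overlaps; miniasm writes GFA to stdout.
assemble() {
    miniasm -f all.fa all.paf > assembly.gfa
}
```

To run the whole thing in one go: merge_overlaps && merge_reads && assemble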