marbl/verkko

verkko for human mitochondria assembly

Hi, we have been exploring verkko for a small project on human mitochondrial genome assembly. We noticed that some of the Python scripts filter out contigs shorter than 100 kbp, which is far larger than the mitochondrial genome (~16 kbp). Hacking those scripts more or less works for a PacBio HiFi-only assembly, and we managed to assemble the mitochondrial genome that way.

However, when we added Nanopore reads from the same sample (hybrid mode), the run errored out, and it was not obvious where to change those parameters.

Is there an easy way to lower the contig size filter so that small genomes like this can be assembled?

skoren commented

The filter should only remove short contigs that are also disconnected from other nodes in the graph. Typically, in full human genomes, the mito forms its own component and isn't removed by the length filter. Instead, verkko can select a representative and circularize it.
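To make the rule above concrete, here is a minimal sketch of a "drop only if short AND disconnected" filter. It is not verkko's actual code: the function name, the data layout, and the choice to ignore self-loops when deciding connectivity are all assumptions made for illustration.

```python
def filter_short_isolated(lengths, edges, min_length):
    """Keep every node except those that are both short and disconnected.

    lengths: dict mapping node name -> length in bp (hypothetical layout)
    edges: iterable of (node_a, node_b) pairs
    min_length: nodes shorter than this are dropped only if isolated
    """
    connected = set()
    for a, b in edges:
        # Assumption of this sketch: only links between distinct nodes
        # count as a connection; self-loops are ignored.
        if a != b:
            connected.add(a)
            connected.add(b)

    kept = []
    for name, length in lengths.items():
        if length >= min_length or name in connected:
            kept.append(name)
    return kept


# Toy graph: one long node, one short node linked to it, one short isolated node.
lengths = {"long_node": 5_000_000, "short_linked": 20_000, "mito": 16_500}
edges = [("long_node", "short_linked")]

kept_default = filter_short_isolated(lengths, edges, 100_000)
kept_lowered = filter_short_isolated(lengths, edges, 10_000)
```

With the 100 kbp threshold the isolated ~16.5 kbp node is dropped while the short but linked node survives; lowering the threshold to 10 kbp keeps all three.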

I guess if you have extracted only the mito reads, it might end up as a disconnected node. You can control the minimum length with the --discard-short parameter to verkko; set it to whatever threshold you need (e.g. --discard-short 10000).
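As a rough sketch of how the suggestion above could be applied to a hybrid run, the helper below only builds the command line and does not execute it. The output directory and read file names are hypothetical placeholders, and the helper itself is not part of verkko; -d, --hifi and --nano are assumed to be used as in verkko's documented usage, and --discard-short is the option named in the comment above.

```python
import shlex


def build_verkko_command(out_dir, hifi_reads, nano_reads=None, discard_short=10000):
    """Assemble a verkko argument list with a lowered short-contig threshold."""
    cmd = ["verkko", "-d", out_dir, "--hifi", *hifi_reads]
    if nano_reads:
        # Hybrid mode: add the Nanopore reads from the same sample.
        cmd += ["--nano", *nano_reads]
    # Threshold in bp; 10000 sits below the ~16 kbp mitochondrial genome.
    cmd += ["--discard-short", str(discard_short)]
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_verkko_command(
    "mito_asm",
    ["mito_hifi.fastq.gz"],
    nano_reads=["mito_ont.fastq.gz"],
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the paths point at real read files.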