marbl/canu

How to get only one contig by setting parameters?

Closed this issue · 2 comments

Hi, I assembled sequencing reads from two genomes, which should be the same, using canu (each assembled separately). My command is "canu -d 4Bcanu -p 4Bcanu genomeSize=2.5m useGrid=false -nanopore dbtrim_18.fastq.gz". For genome 1 I got only one contig, but for genome 2 I got 2 contigs. I expected to get only one contig. I reran canu for genome 2, but still got 2 contigs.
My question is: how do I get only one contig? Do I need to change some parameters in the command?
Thank you very much!!

There are a lot of reasons you may not get a single-contig assembly. It's possible the second genome has larger repeats, or that its reads are shorter, lower quality, etc., which would explain why you got two contigs instead of one. There are a few parameter sets listed in the FAQ: https://canu.readthedocs.io/en/latest/faq.html#my-assembly-continuity-is-not-good-how-can-i-improve-it but, as I said above, they may not help if the second genome is more complex. Post the full reports generated by canu from both runs so we can compare them and see if anything obviously looks off in one vs the other.

Thank you so much!!! I checked the QC analysis of the reads from the two genomes: the reads that gave me two contigs are much shorter and there are far fewer of them. Maybe this is the reason. Thanks again!
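
For anyone wanting to make the same comparison without a separate QC tool, here is a minimal sketch in Python that summarizes read count, read length, and approximate coverage for a FASTQ file. It is not part of canu; the function names (read_lengths, n50, summarize) are made up for illustration, and it assumes plain 4-line FASTQ records (no wrapped sequence lines) and the 2.5 Mb genome size from the command above.

```python
import gzip
from statistics import mean


def read_lengths(path):
    """Return the length of every read in a FASTQ file (gzipped or plain).

    Assumes each record is exactly 4 lines (header, sequence, '+', quality).
    """
    opener = gzip.open if path.endswith(".gz") else open
    lengths = []
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each record is the sequence
                lengths.append(len(line.strip()))
    return lengths


def n50(lengths):
    """Length L such that reads of length >= L contain at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # no reads


def summarize(path, genome_size=2_500_000):
    """Basic read statistics; genome_size is an assumption (2.5 Mb here)."""
    lengths = read_lengths(path)
    total = sum(lengths)
    return {
        "reads": len(lengths),
        "total_bases": total,
        "mean_length": mean(lengths) if lengths else 0,
        "n50": n50(lengths),
        "coverage": total / genome_size,
    }
```

Calling summarize() on the read file of each genome and putting the two results side by side should show whether one dataset has clearly fewer reads, a lower read N50, or lower coverage than the other, which is the kind of difference described above.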