shingocat/lrscaf

Ns after Scaffolding contigs containing no Ns

Closed this issue · 3 comments

After scaffolding a contigs FASTA file (containing no Ns), the output scaffolds.fasta file contained Ns, even though the input long reads contained no Ns.

If two contigs are being linked by a long read, then in the resulting scaffold, is the sequence between the two contigs copied over from the long read, or is it a stretch of Ns?

sanjitsbatra,

LRScaf is a scaffolder designed to improve draft assemblies using low-coverage TGS long reads. After the scaffolding procedure, the gap regions between contigs are stretches of Ns. Consensus sequences built for the gap regions from low-coverage TGS long reads are not accurate, so LRScaf does not try to fill the gaps with them.
As for the other question you asked, "The valid aligned records is 0": I think it might be the same as YiweiNiu's question. If you use minimap, you need to set the identity to 0.1 yourself in earlier versions of LRScaf. We have updated LRScaf (version 1.1.5) to automatically set the identity for minimap to 0.1 if the user does not set this parameter.
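To see where the gaps ended up in scaffolds.fasta, the runs of Ns can be listed per scaffold. The following is only a minimal sketch and not part of LRScaf; the function names `read_fasta`, `find_gaps`, and `gaps_per_record` are hypothetical, and it assumes a plain FASTA file in which every gap is written as a run of `N` (or `n`) characters.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_name, sequence) pairs from the lines of a FASTA file."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the name.
            name = (line[1:].split() or [""])[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_gaps(seq: str, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return 0-based, end-exclusive (start, end) spans of N runs in seq."""
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", seq)
        if m.end() - m.start() >= min_len
    ]


def gaps_per_record(
    lines: Iterable[str], min_len: int = 1
) -> Dict[str, List[Tuple[int, int]]]:
    """Map each record name to the list of gap spans found in its sequence."""
    return {name: find_gaps(seq, min_len) for name, seq in read_fasta(lines)}
```

For example, calling `gaps_per_record(open("scaffolds.fasta"))` gives one list of gap spans per scaffold; an empty list means that scaffold contains no Ns.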

What is the best way to fill those gaps between contigs with the highest-quality consensus from low-coverage TGS long reads? (Assuming that polishing afterwards will improve the quality of this consensus.)

Are there any tools you would recommend?

sanjitsbatra,

I have not tested the performance of gap filling.
You could perhaps first use PBJelly to fill the gaps, and then use NGS reads to polish the assembly.