Oshlack/Lace

Assessing completeness of SuperTranscript transcriptome assembly/BUSCO

Hello team of the Oshlack lab,

Do you have experience with BUSCO analysis on SuperTranscript data?
I have used Corset and Lace to cluster and stitch plant transcriptome assemblies. Afterwards, BUSCO did not find many of the BUSCOs in the resulting SuperTranscripts. However, when I used OrthoFinder (which relies on BLAST/DIAMOND) to find orthologs in additional species, the assemblies looked more complete.

Do you think SuperTranscripts are in principle compatible with BUSCO?

Can you think of an alternative way to check the completeness of the SuperTranscript assemblies?

Thank you,
Maria

Hi Maria,

I haven't used BUSCO myself, but from a look at the manual, I wonder whether genome mode would do better than transcriptome mode, since superTranscripts are really a pseudo-genome reference. That is, transcripts might be spliced within the superTranscript, and perhaps that's why they are being missed?
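
To make the genome-mode suggestion concrete, here is a minimal Python sketch that only assembles the command line. It assumes a BUSCO v4/v5-style command-line interface (`-i`, `-l`, `-m`, `-o`, `-c`); the input file name `supertranscripts.fasta`, the output name, and the choice of the `embryophyta_odb10` plant lineage are placeholders for illustration, not something tested on this data.

```python
def build_busco_command(fasta, lineage, out_name, mode="genome", cpus=4):
    """Assemble a BUSCO command line as a list of arguments.

    Assumes BUSCO v4/v5-style flags. All file and run names passed in
    are hypothetical placeholders chosen by the caller.
    """
    return [
        "busco",
        "-i", fasta,      # input sequences, here the Lace superTranscripts
        "-l", lineage,    # lineage dataset to search against
        "-m", mode,       # "genome" instead of "transcriptome"
        "-o", out_name,   # name of the output run
        "-c", str(cpus),  # number of CPU threads
    ]


if __name__ == "__main__":
    cmd = build_busco_command(
        "supertranscripts.fasta",         # hypothetical Lace output file
        "embryophyta_odb10",              # plant lineage, assumed suitable
        "busco_supertranscripts_genome",  # hypothetical run name
    )
    # Print the command for inspection; run it yourself, for example with
    # subprocess.run(cmd, check=True), once BUSCO is installed.
    print(" ".join(cmd))
```

Running the same input once with `mode="genome"` and once with `mode="transcriptome"` would show whether the mode alone explains the missing BUSCOs.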

Apart from writing your own BUSCO-like program, I can only think of checking completeness by looking at the number of reads that map. But of course this won't tell you how many genes are missing.
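
As a rough illustration of the read-mapping check, the sketch below computes the fraction of mapped reads from SAM records. It assumes the reads were already aligned to the superTranscripts with an aligner of your choice and that the alignments are available as SAM text; the function name `mapping_rate` is made up for this example. Secondary and supplementary alignments are skipped so that each read record is counted once (for paired-end data, each mate counts separately).

```python
def mapping_rate(sam_lines):
    """Return the fraction of primary read records that are mapped.

    `sam_lines` is any iterable of SAM-formatted text lines.
    Header lines (starting with "@") and empty lines are ignored.
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # second SAM column is the bitwise FLAG
        # Skip secondary (0x100) and supplementary (0x800) alignments.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        # Bit 0x4 set means the read is unmapped.
        if not flag & 0x4:
            mapped += 1
    return mapped / total if total else 0.0
```

For a file on disk, something like `with open("reads_vs_supertranscripts.sam") as fh: print(mapping_rate(fh))` would work, where the file name is again a placeholder. A high rate suggests the assembly captures most of the expressed sequence, but as noted above it does not say how many genes are missing.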

Cheers,
Nadia.

Somewhat related to this: you might also find our software Necklace useful (https://github.com/Oshlack/necklace) if you have a reference genome of some sort (it will search for orthologs to related species in the de novo assembly).

Hi Nadia,

Unfortunately, there is no assembled genome from this or a related species.
I also no longer have the opportunity to test BUSCO in genome mode,
so I will close the issue.

Anyway, thank you for your response and taking the time to look at BUSCO,

Maria