Transabyss default param setting issue
Hi,
I'm running transabyss v 2.0.1 with default settings.
When checking the run-logs I was a little surprised to see the following message:
warning: the seed-length should be at least twice k: k=32, s=32
and indeed, according to the manual/help, the default for s is set to k.
Would it therefore not be better to set the default for s to k*2?
thx.
That warning message is intended for genomic assembly, to prevent assembling duplicate sequences, which is less of a concern for transcriptome assembly.
Hi @sjackman,
OK, fair enough, thanks.
Would it make sense to increase s to avoid assembling paralogous (and/or recently duplicated) genes together?
If that were your preference, then yes, you could increase s to decrease over-assembly of paralogous sequence.
The s parameter was set to k intentionally (within Trans-ABySS) during the paired-end contig assembly stages because many unitigs with length equal to k are part of the correct path for assembly into transcripts. This is particularly true for highly expressed transcripts.
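The trade-off discussed in this thread can be illustrated with a small Python sketch. It is not Trans-ABySS or ABySS code: the function names and the unitig lengths are made up for illustration, and it assumes s acts as a minimum unitig length for contig building, as the comment above implies. It shows the condition behind the warning (s < 2k) and how raising s from k to 2k would exclude unitigs of length k.

```python
def seed_length_warning(k, s):
    """Hypothetical helper: return the warning text when s < 2*k, else None."""
    if s < 2 * k:
        return f"warning: the seed-length should be at least twice k: k={k}, s={s}"
    return None


def seed_unitigs(unitig_lengths, s):
    """Illustrative filter: keep only unitigs at least s bases long."""
    return [length for length in unitig_lengths if length >= s]


k = 32
lengths = [32, 32, 45, 64, 120]  # made-up unitig lengths, not real data

print(seed_length_warning(k, s=k))      # the warning quoted in the issue
print(seed_length_warning(k, s=2 * k))  # None: no warning when s = 2k
print(seed_unitigs(lengths, s=k))       # all five unitigs are kept
print(seed_unitigs(lengths, s=2 * k))   # only the unitigs of length 64 and 120 remain
```

With s = k, the unitigs of length exactly k stay available, which matches the rationale given for the Trans-ABySS default. With s = 2k, they are dropped, which is the stricter behaviour the warning is aimed at.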