Removal of Prokka in favor of Bakta
Daniel-VM opened this issue · 3 comments
Issue description
Prokka is no longer maintained, and Bakta seems to be a reasonable replacement for genome annotation that incorporates several improvements.
Describe the solution you'd like
Remove Prokka from nf-core/bacass and add Bakta instead.
Additional notes
It seems that Bakta needs a database to perform the annotations. However, even the light version of its database is somewhat heavy and could slow down the testing process.
Another option is to keep Prokka and add Bakta as an additional tool for annotation.
I am open to suggestions.
I don't know if you'll find a newer bacterial annotator that comes with a database the size of Prokka's, though.
If it helps any, the Bakta database is a lot smaller than the kraken2 ones.
Thanks for your input, @erinyoung!
I see... Well, at some point I noticed that downloading the kraken2 db (8 GB) took less time than downloading the Bakta database (1.3 GB) using the nf-core/module/bakta/baktadbdownload. Could it be that AWS S3 speeds up the kraken2 database download (hosted at: https://genome-idx.s3.amazonaws.com/kraken/k2_standard_8gb_20210517.tar.gz)?
Anyway, let's give it a try 👍🏾.