leylabmpi/Struo2

No Bracken kmer distributions for strain level taxids

Hello,

I made custom Kraken2 and Bracken databases following your tutorial, adding several genomes to the pre-built GTDB Struo2 databases (thank you for providing these, by the way!). However, when I run Bracken on the Kraken2 output for my metagenomes (which has many reads assigned at several taxonomic levels), I get the error "Error: no reads found. Please check your Kraken report". One curious thing I noticed in the Kraken2 output is that no reads are assigned at the species (S) level, only at the strain (S1) level. I also noticed that none of my strain taxids appear in the database100mers.kmer_distrib files generated by Struo2. I suspect Bracken is ignoring the reads assigned to strains because those taxids are not in the Bracken database.

Is there a way to get kmer distributions at the strain (S1) level for a Bracken database generated with Struo2? Or, do you think my problem might be caused by something else?
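For anyone wanting to confirm the same mismatch, here is a rough Python sketch of the two checks described above: how many reads are assigned directly at each rank code, and which taxids with reads are absent from the kmer_distrib file. It assumes the standard tab-separated Kraken2 report layout (direct reads in the third column; rank code, taxid and name in the last three) and assumes the kmer_distrib file has one header line followed by `mapped_taxid<TAB>genome_taxid:kmers_mapped:total_kmers ...` rows. The function names are made up for illustration and are not part of Kraken2, Bracken or Struo2.

```python
from collections import Counter


def reads_by_rank(report_lines):
    """Sum the reads assigned directly to each rank code (e.g. S, S1)."""
    totals = Counter()
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # Assumption: column 3 is direct reads, third-from-last is the rank code.
        totals[fields[-3].strip()] += int(fields[2])
    return totals


def taxids_in_kmer_distrib(distrib_lines):
    """Collect every taxid mentioned in a kmer_distrib file (assumed layout)."""
    taxids = set()
    for line in distrib_lines:
        fields = line.rstrip("\n").split("\t")
        if not fields[0].strip().isdigit():
            continue  # header or blank line
        taxids.add(fields[0].strip())
        if len(fields) > 1:
            for entry in fields[1].split():
                taxids.add(entry.split(":")[0])
    return taxids


def missing_taxids(report_lines, distrib_lines, rank_code="S1"):
    """Taxids with direct reads at rank_code that are absent from kmer_distrib."""
    known = taxids_in_kmer_distrib(distrib_lines)
    missing = []
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        rank, taxid = fields[-3].strip(), fields[-2].strip()
        if rank == rank_code and int(fields[2]) > 0 and taxid not in known:
            missing.append(taxid)
    return missing
```

To use it, pass open file handles (or lists of lines) for the Kraken2 report and the database100mers.kmer_distrib file. If `reads_by_rank` shows reads only under S1 and `missing_taxids` returns the added strains, that matches what I am seeing.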

Thank you so much for any insight!

Maybe you didn't include the correct taxID for the added genomes? You need to provide the species/strain level taxID for the correct taxonomy (GTDB or NCBI, depending on which you use). Could that be the source of the problem?
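One way to sanity-check the taxIDs is to look them up in the taxonomy the database was built from. The sketch below assumes an NCBI-style `nodes.dmp` (fields separated by `|`, with taxid, parent taxid and rank as the first three) is available, for example under `taxonomy/` in the Kraken2 database directory if it has not been cleaned up. The helper names are hypothetical and not part of Struo2.

```python
def load_ranks(nodes_lines):
    """Map taxid -> rank from the lines of an NCBI-style nodes.dmp file."""
    ranks = {}
    for line in nodes_lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 3 or not fields[0]:
            continue  # skip blank or malformed lines
        ranks[fields[0]] = fields[2]
    return ranks


def check_taxids(taxids, ranks):
    """Return the rank for each taxid, or None if the taxonomy lacks it."""
    return {str(t): ranks.get(str(t)) for t in taxids}
```

A taxID that comes back as `None`, or with a rank above species, would suggest the added genome was given a taxID from the wrong taxonomy or at the wrong level.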

I'm closing this due to inactivity. Please reopen if needed.