leylabmpi/Struo2

Merging two or more Kraken indexes to create the most up-to-date indexes

Closed this issue · 1 comments

Dear Developers

I am looking for a way to combine the Kraken2 RefSeq PlusPF indexes with the Kraken2 indexes that you provide.

Currently, I first classify my sample's reads with Kraken2 using the PlusPF indexes from https://benlangmead.github.io/aws-indexes/k2, and then feed the unclassified reads to the Struo2 custom Kraken2 indexes (http://ftp.tue.mpg.de/ebio/projects/struo2/GTDB_release202/kraken2/).
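For reference, the two-stage workflow described above could be sketched as follows. This only assembles the two `kraken2` command lines and does not run them. The function name, output file names, and database paths are hypothetical, and single-end reads are assumed; `--db`, `--threads`, `--report`, `--output`, and `--unclassified-out` are standard `kraken2` options.

```python
from pathlib import Path


def cascade_commands(reads, primary_db, secondary_db, out_dir, threads=8):
    """Return argv lists for a two-stage Kraken2 run.

    Assumptions (not from the original post): single-end reads, and
    hypothetical output file names under out_dir.
    """
    out = Path(out_dir)
    unclassified = out / "stage1.unclassified.fq"

    # Stage 1: classify against the primary database (e.g. PlusPF) and
    # write the reads that stay unclassified to a separate file.
    stage1 = [
        "kraken2", "--db", str(primary_db), "--threads", str(threads),
        "--report", str(out / "stage1.report.txt"),
        "--output", str(out / "stage1.kraken.txt"),
        "--unclassified-out", str(unclassified),
        str(reads),
    ]

    # Stage 2: classify only the stage-1 leftovers against the second
    # database (e.g. the Struo2 GTDB index).
    stage2 = [
        "kraken2", "--db", str(secondary_db), "--threads", str(threads),
        "--report", str(out / "stage2.report.txt"),
        "--output", str(out / "stage2.kraken.txt"),
        str(unclassified),
    ]
    return [stage1, stage2]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order. Every stage still produces its own report, which is the source of the multiple-report problem described in this post.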

I observed that the Struo2 indexes are able to classify 50% of the reads that PlusPF failed to classify (mostly cyanobacteria). However, your custom database contains only bacterial genomes, not parasite or viral genomes.

I would also like to use the viral indexes that are available here: https://doi.ccs.ornl.gov/ui/doi/82

So I am thinking of something like Kraken2 PlusPF + Struo2 Kraken2 indexes + JGI virus indexes, all combined in a single database.

The problem with using them separately is that I end up with 3 Pavian reports per sample, and calculating abundance with Bracken becomes difficult.

> combined Kraken2 RefSeq PlusPF Indexes and Kraken2 indexes

You can't combine existing databases; you'd have to create a new database from all input reference fasta files + the associated taxonomy.
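As a rough illustration of what "create a new database" involves, the sketch below assembles the usual `kraken2-build` steps: fetch a taxonomy, add every reference FASTA file to the library, then build. It only constructs the command lines. The function name and paths are hypothetical, and the sketch assumes that all inputs share the NCBI taxonomy and that each FASTA header carries either an NCBI accession or a `kraken:taxid|<id>` tag so that Kraken2 can assign sequences to taxa.

```python
def build_commands(db_dir, fasta_files, threads=8):
    """Return argv lists for building one Kraken2 database from scratch.

    Assumption: every input sequence maps onto the NCBI taxonomy, via
    its accession or a 'kraken:taxid|<id>' tag in the FASTA header.
    """
    # Step 1: download the NCBI taxonomy into the database directory.
    cmds = [["kraken2-build", "--download-taxonomy", "--db", db_dir]]

    # Step 2: register each reference FASTA file in the library.
    for fasta in fasta_files:
        cmds.append(
            ["kraken2-build", "--add-to-library", fasta, "--db", db_dir]
        )

    # Step 3: build the index from the taxonomy plus the library.
    cmds.append(
        ["kraken2-build", "--build", "--db", db_dir, "--threads", str(threads)]
    )
    return cmds
```

Under these assumptions, a single report per sample would come out of one `kraken2` run against the resulting database, and a matching Bracken database would have to be built from it separately. The sketch does not address genomes annotated with a different taxonomy.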

We are using the GTDB for the taxonomy, which doesn't extend to viruses, whereas RefSeq uses the standard NCBI taxonomy. Since a single Kraken2 database needs one consistent taxonomy, I don't see how you can accomplish what you want.