leylabmpi/Struo2

Missing FASTA file paths from bac120 and ar122 GTDB metadata

dgolden96 opened this issue · 1 comment

Hi there,

I've managed to replicate the Kraken2 database creation process with the toy dataset as described in the README, but I've run into a snag doing the same with metadata from the GTDB. The metadata files at "https://data.gtdb.ecogenomic.org/releases/release202/202.0/" don't seem to contain the FASTA file paths needed to run the pipeline. Would you happen to know of a workaround that would let me derive the necessary file paths from one of the other metadata fields?

Thanks!

You have to add the FASTA file paths to the table yourself. The paths are specific to wherever you have downloaded the genomes locally, so they cannot be part of the GTDB metadata. See https://github.com/leylabmpi/Struo2#downloading-genomes
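
If the genomes are already on disk, one way to fill in the paths is to match each metadata row to a local file by its assembly accession. The following is only a rough sketch, not part of Struo2. It assumes the GTDB metadata TSV has an `accession` column with values like `RS_GCF_000005845.2`, that the downloaded file names contain the same `GCF_`/`GCA_` accession, and that the pipeline reads the path from a column named `fasta_file_path` (check your config for the actual column name). The helper names `index_fasta_files` and `add_fasta_paths` are made up for this example.

```python
import csv
import re
from pathlib import Path

# Matches NCBI assembly accessions such as GCF_000005845.2 or GCA_000008085.1
ACCESSION_RE = re.compile(r"GC[AF]_\d+\.\d+")

# File name endings treated as genome FASTA files (an assumption; adjust as needed)
FASTA_SUFFIXES = (".fna", ".fna.gz", ".fa", ".fa.gz", ".fasta", ".fasta.gz")


def index_fasta_files(genome_dir):
    """Map assembly accession -> absolute path for each FASTA file under genome_dir."""
    index = {}
    for path in Path(genome_dir).rglob("*"):
        if not path.is_file() or not path.name.endswith(FASTA_SUFFIXES):
            continue
        match = ACCESSION_RE.search(path.name)
        if match:
            index[match.group(0)] = str(path.resolve())
    return index


def add_fasta_paths(metadata_tsv, genome_dir, output_tsv, path_col="fasta_file_path"):
    """Copy metadata rows that have a local genome, adding the path in path_col.

    Rows without a matching local file are skipped.
    Returns the number of rows written.
    """
    index = index_fasta_files(genome_dir)
    n_written = 0
    with open(metadata_tsv, newline="") as fin, open(output_tsv, "w", newline="") as fout:
        reader = csv.DictReader(fin, delimiter="\t")
        fieldnames = list(reader.fieldnames)
        if path_col not in fieldnames:
            fieldnames.append(path_col)
        writer = csv.DictWriter(fout, fieldnames=fieldnames, delimiter="\t")
        writer.writeheader()
        for row in reader:
            # Strip the GTDB "RS_" / "GB_" prefix by searching for the bare accession
            match = ACCESSION_RE.search(row["accession"])
            fasta = index.get(match.group(0)) if match else None
            if fasta is None:
                continue  # genome not downloaded (or saved under a different accession)
            row[path_col] = fasta
            writer.writerow(row)
            n_written += 1
    return n_written
```

For example, `add_fasta_paths("bac120_metadata_r202.tsv", "genomes/", "samples.tsv")` would write a table containing only the genomes found under `genomes/` (both file names here are placeholders). Note that the match is exact, so a row listed as `RS_GCF_...` will not pick up a file saved under the corresponding `GCA_...` accession.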