DaehwanKimLab/centrifuge

NCBI nt index


The Centrifuge website has a link to the NCBI nucleotide non-redundant sequences index from 2018. It's possible to generate one, but that is a very long process. Do you plan to offer a more recent version of this index?

Hi,

I emailed the person who maintains the index page. I don't think he is directly involved in Centrifuge.

The nt database is huge now, so we don't have the computing resources to build the index (it would probably need a machine with more than 2 TB of memory). I think other labs have built an nt index, but I'm not sure whether those are publicly accessible now.

Thanks @mourisl! We are in the process of public release of a recent nt index. It should be available in 24-48 h. We'll let you all know as soon as it happens.

Just for future reference, it's also possible to try Centrifuger if computing resources are more limited. Check this related thread: #275

Our pre-print accompanying the release of a new Centrifuge nt database is online now: "Addressing the dynamic nature of reference data: a new nt database for robust metagenomic classification". Any feedback will be welcome!

Thanks a lot, @khyox. Yes, sure, I will get back to you with some feedback!