muellan/metacache

Allow for more threads?

punnettsun opened this issue · 9 comments

Hi,

I tried to run the database build, but it seems that my machine is only running MetaCache on a single thread. Is there a way to increase this? I do not see an argument for specifying threads in MetaCache.

Thanks.

I'm sorry, but only the query mode is multi-threaded; the database build is not multi-threaded in version 1.x.x.

BTW: This is a limitation that MetaCache shares with many other metagenomic classification tools. Kraken2 actually takes even longer for database builds on the same input genome size.

I see. Do you have an estimate for how long it might take to build the complete genomes in RefSeq by any chance?

On our machines it takes a bit more than an hour to build a database containing all complete bacterial, viral and archaeal genomes from the latest RefSeq.

Thank you. Could I get your email address so I can send you the specs of the machine I am using to build the database? I have been running on a good machine, but the reference sequence processing is still at 0% after 4 hours. I assume this isn't supposed to happen with the machine I am using.

Sure, 4 hours sounds very odd indeed.

muellan@uni-mainz.de

I'll have a look at it tomorrow, it's already past 10 pm here :-)

Great, thank you!

Did you find any bugs related to this?

I am trying to build a custom database and the build is taking ~50 hours. I am looking for ways to speed this up.

Sorry, but I unfortunately no longer have access to the email account I used at the time, so I cannot look back at my conversation with Muellan. I believe there was something on my end that was causing the issue with the database build.

I hope you will figure out the issue soon!