Why is `k2mask` set to use only half the number of available cores?
hermidalc opened this issue · 5 comments
For `kraken2`, I observed that performance is best at 1/2 to 3/4 of the available cores; beyond that, performance decreases.
I'm not sure whether the same holds for `k2mask`. I need to run it to verify.
https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html
The downside of hardcoding it like that shows up when you run Kraken2 in a cluster environment: you don't want the script looking at all the cores available on a node. You typically request a certain number of cores for the job, and the job could be assigned to a node with many more cores than you requested (cores that other jobs are using).
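To illustrate the cluster concern, here is a minimal sketch of how a script could count only the cores the job is allowed to use rather than every core on the node. The helper name `usable_cpu_count` is hypothetical and is not part of Kraken2 or k2. It assumes a SLURM scheduler that exports `SLURM_CPUS_PER_TASK` (which SLURM sets only when `--cpus-per-task` is requested), and otherwise falls back to the Linux CPU affinity mask.

```python
import os


def usable_cpu_count(env=None):
    """Return the number of CPUs this process may use (hypothetical helper)."""
    env = os.environ if env is None else env

    # Assumption: the job runs under SLURM with --cpus-per-task set,
    # in which case SLURM exports SLURM_CPUS_PER_TASK.
    requested = env.get("SLURM_CPUS_PER_TASK", "")
    if requested.isdigit() and int(requested) > 0:
        return int(requested)

    # On Linux, the affinity mask reflects cores the scheduler granted,
    # which can be fewer than the node's total.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))

    # Portable fallback: total cores on the machine (may be None).
    return os.cpu_count() or 1
```

Unlike `multiprocessing.cpu_count()`, which reports every core on the machine, this respects the job's allocation when the scheduler enforces it through the environment variable or the affinity mask.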
I agree that it shouldn't be hardcoded. I am trying to understand why it might have been hardcoded to `multiprocessing.cpu_count() // 2` in the first place.
I agree that this change should be configurable via the command line. I will address this in my next commit to k2.
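For reference, one way such an option could look is sketched below. This is not the actual k2 code or the change in the PR; the `--threads` flag name, the parser, and the `resolve_threads` helper are all assumptions made for illustration. The sketch keeps `multiprocessing.cpu_count() // 2` as the default and lets the user override it.

```python
import argparse
import multiprocessing


def build_parser():
    # Hypothetical parser; the real k2 options may differ.
    parser = argparse.ArgumentParser(description="hypothetical masking wrapper")
    parser.add_argument(
        "--threads",
        type=int,
        # Keep the current behaviour as the default, but never go below 1.
        default=max(1, multiprocessing.cpu_count() // 2),
        help="number of threads to use (default: half of the detected cores)",
    )
    return parser


def resolve_threads(argv):
    """Parse argv and return a validated thread count."""
    args = build_parser().parse_args(argv)
    if args.threads < 1:
        raise ValueError("--threads must be at least 1")
    return args.threads
```

On a cluster, the user would then pass the number of cores requested for the job, for example `resolve_threads(["--threads", "8"])`.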
Raised a PR for that: #866