`k2mask` is only single-threaded even with `--threads` and `OMP_NUM_THREADS` set properly
Closed this issue · 8 comments
During `kraken2-build --download-library`, in the step that masks low-complexity sequences, `k2mask` always appears to be single-threaded even though I've specified `--threads n` and set `OMP_NUM_THREADS=n` in my environment before running. Do you know why this is happening?
Looks like only the `--build` step is using threads. All other steps seem to be single-threaded.
Lines 9 to 10 in 4cbdc5f
Lines 457 to 470 in 8f82a7d
But for some reason, in my process list the `k2` wrapper script is not used and does not appear as a parent process, and you can see that the `k2mask` options don't include `-threads n` as they should:
hermida+ 9734 0.0 0.0 223352 3328 pts/7 S+ Jul29 0:00 /bin/bash /home/hermidalc/soft/miniforge3/envs/tcga-wgs-kraken-microbial-quant/share/kraken2-2.1.3-1/libexec/download_genomic_library.sh bact
hermida+ 61824 0.0 0.0 223220 3328 pts/7 S+ 04:02 0:00 /bin/bash /home/hermidalc/soft/miniforge3/envs/tcga-wgs-kraken-microbial-quant/share/kraken2-2.1.3-1/libexec/mask_low_complexity.sh .
hermida+ 61827 13.7 0.0 41020 32204 pts/7 S+ 04:02 68:30 k2mask -in ./library.fna -outfmt fasta
hermida+ 61828 13.2 0.0 221760 2176 pts/7 S+ 04:02 66:19 sed -e /^>/!s/[a-z]/x/g
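To confirm from the outside whether a running `k2mask` is really using one thread, the kernel's per-process thread count can be read directly. This is a minimal sketch assuming a Linux system with `/proc` mounted; the helper name `thread_count` is made up for illustration and is not part of Kraken 2.

```shell
# Hypothetical helper: print the number of threads of a process,
# read from the "Threads:" field of /proc/<pid>/status (Linux only).
thread_count() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Usage (commented out so the block only defines the helper):
#   for pid in $(pgrep -x k2mask); do
#       echo "k2mask $pid: $(thread_count "$pid") thread(s)"
#   done
```

A count that stays at 1 for the whole masking step would be consistent with the missing `-threads` option in the process list above.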
Yep, the `k2` wrapper script that was added in v2.1.3 is never called when you run `kraken2-build --download-library`. It runs `download_genomic_library.sh`, which spawns `mask_low_complexity.sh`, which seems to have older `k2mask`-spawning code that is only single-threaded!
https://github.com/DerrickWood/kraken2/blob/v2.1.3/scripts/mask_low_complexity.sh
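As a possible workaround until the scripts pass the thread count through, the masking pipeline seen in the process list can be run by hand with an explicit `-threads` value. This is only a sketch: the function name `mask_library` and the `K2MASK_BIN` override are hypothetical, the `-threads` option is taken from this thread rather than checked against every `k2mask` version, and the `sed` expression is the one shown in the process list (it replaces lowercase, i.e. masked, bases with `x` on non-header lines).

```shell
# Hypothetical helper: mask a FASTA library with an explicit thread count.
# K2MASK_BIN is an assumed override so a different k2mask binary can be used.
mask_library() {
    library_fna="$1"
    threads="${2:-1}"
    "${K2MASK_BIN:-k2mask}" -in "$library_fna" -outfmt fasta -threads "$threads" \
        | sed -e '/^>/!s/[a-z]/x/g'   # same post-processing as in the process list
}

# Usage (commented out; writes the masked sequences to a new file):
#   mask_library ./library.fna 8 > ./library.fna.masked
```

How the result should be swapped in for `library.fna` depends on what `mask_low_complexity.sh` does in your version, so check that script before replacing any files.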
I just found `k2`. I think this is meant to be used as an independent script.
Can you try this?
$ k2 download-library --db viral --library viral
-threads 4
Not sure why this is hardcoded. Shouldn't this be configurable?
The problem is that in the bioconda `kraken2` package none of these independent scripts are in the `$PATH`, only the binaries described in the manual: `kraken2`, `kraken2-build`, `kraken2-inspect`, etc. Looking at the bioconda `kraken2` package install, all these scripts are in a `libexec` subfolder in the package. They aren't copied or linked back to the conda environment's `bin` folder, which they should be if you want to see them. I'm building a pipeline that is intended for others to use and to be reproducible, so conda is a must.
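One way to reach those scripts from inside a conda environment is to locate the `libexec` folder and put it on the `PATH` yourself. The sketch below assumes the `share/kraken2-*/libexec` layout visible in the process list earlier in this thread; the function name `find_kraken2_libexec` is hypothetical, and whether `k2` itself ships in that folder for a given package build is something to verify.

```shell
# Hypothetical helper: print the first kraken2 libexec directory found
# under a conda environment prefix (defaults to $CONDA_PREFIX).
find_kraken2_libexec() {
    prefix="${1:-${CONDA_PREFIX:-}}"
    for dir in "$prefix"/share/kraken2-*/libexec; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    echo "no kraken2 libexec directory under $prefix" >&2
    return 1
}

# Usage (commented out; run inside the activated environment):
#   PATH="$(find_kraken2_libexec):$PATH"; export PATH
#   command -v k2   # check whether the wrapper is now visible
```

For a shared pipeline this keeps the conda install untouched and only changes the `PATH` of the process that needs the scripts.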
It's fixed in master but not yet available
Lines 476 to 487 in 4cbdc5f
The new features in the `k2` script implement the fix for this issue.