Humann3 specifies wrong index for Bowtie2
BostjanMurovec opened this issue · 12 comments
Good Day!
When I run Humann3 according to the instructions for the demo Struo2 database (http://ftp.tue.mpg.de/ebio/projects/struo2/GTDB_release95/humann3/ReadMe.md), suitably adapted to the actual disk contents, Humann3 fails with an error reporting that Bowtie2 crashed during the run.
Inspection of Humann3's stderr reveals that Humann3 passes the wrong index file name to Bowtie2:
=====================================================================
Error message returned from bowtie2 :
Could not open index file /home/bostjan/struo2_databases/GTDB_release95/humann3/uniref90/all_genes_annot.rev.rev.rev.1.bt2l
Could not open index file /home/bostjan/struo2_databases/GTDB_release95/humann3/uniref90/all_genes_annot.rev.rev.rev.2.bt2l
Segmentation fault (core dumped)
(ERR): bowtie2-align exited with value 139
The trouble stems from the wrong parameter being passed to Bowtie2:
-x /home/bostjan/struo2_databases/GTDB_release95/humann3/uniref90/all_genes_annot.rev
instead of
-x /home/bostjan/struo2_databases/GTDB_release95/humann3/uniref90/all_genes_annot
I tried renaming the Bowtie2 index files to include an additional .rev in their names, but then Humann3
keeps adding more and more ".rev" fragments to the name. Hence, the issue appears to be that Humann3
wrongly parses the contents of the directory with the Bowtie2 index.
Since this issue is not mentioned anywhere, I am curious whether somebody else has also experienced it.
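To illustrate how a prefix such as all_genes_annot.rev could arise, here is a minimal sketch. It is not Humann3's actual code: naive_prefix and index_prefix are hypothetical helpers, and the assumption is that the prefix is derived from one of the index file names in the directory. Bowtie2 appends suffixes such as .1.bt2l and .rev.1.bt2l to whatever prefix it receives via -x, so a prefix that still ends in .rev makes it look for .rev.rev.1.bt2l.

```shell
#!/usr/bin/env bash
# Illustrative only: hypothetical helpers, NOT taken from Humann3.

# Naive derivation: strip only the ".bt2"/".bt2l" extension and the part number.
naive_prefix() {
    local name="${1##*/}"      # drop the directory part
    name="${name%.bt2l}"       # large-index extension
    name="${name%.bt2}"        # small-index extension
    echo "${name%.[1-4]}"      # trailing part number
}

# Safer derivation: additionally strip the ".rev" marker of the reverse-index files.
index_prefix() {
    local name
    name="$(naive_prefix "$1")"
    echo "${name%.rev}"
}

# Compare both derivations on a forward and a reverse index file name.
for f in all_genes_annot.1.bt2l all_genes_annot.rev.1.bt2l; do
    printf '%-28s naive=%-20s safe=%s\n' "$f" "$(naive_prefix "$f")" "$(index_prefix "$f")"
done
```

With the naive variant the result depends on which file happens to be picked: a reverse-index file yields all_genes_annot.rev, which would be consistent with the error messages above.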
Thank you in advance and best regards,
Bostjan Murovec
Thanks, Bostjan, for your interest in the Struo2 database!
I cannot reproduce the issue, but I did realize that there were some typos in the ReadMe.md file.
My run (you'd have to change the paths to reproduce):
NUC_DB=/ebio/abt3_projects/databases_no-backup/GTDB/release95/Struo2/humann3/uniref50/genome_reps_filt_annot.fna.gz
PROT_DB=/ebio/abt3_projects/databases_no-backup/GTDB/release95/Struo2/humann3/uniref50/protein_database/uniref50_201901.dmnd
NUC_DB_DIR=`dirname $NUC_DB`
PROT_DB_DIR=`dirname $PROT_DB`
humann3 --bypass-nucleotide-index \
--search-mode uniref50 \
--nucleotide-database $NUC_DB_DIR \
--protein-database $PROT_DB_DIR \
--input-format fastq \
--output-basename humann3_output \
--input reads.fq \
--output humann3_output_dir \
--o-log humann3_output
Maybe the *.bt2l files were corrupted when you downloaded them (e.g., an incomplete download)?
I will add md5 sums for all data files on the ftp server (I should have done that already).
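Once checksums are published, a downloaded file could be checked roughly as follows. This is a sketch that assumes GNU md5sum is available; verify_md5 is a hypothetical helper, and the expected checksum has to come from the server (none is given in this thread).

```shell
#!/usr/bin/env bash
# verify_md5 FILE EXPECTED_MD5
# Returns 0 if the MD5 of FILE equals EXPECTED_MD5, 1 if it differs,
# and 2 if FILE cannot be read. Hypothetical helper for illustration.
verify_md5() {
    local file="$1" expected="$2" actual
    [ -r "$file" ] || return 2
    actual="$(md5sum "$file" | awk '{print $1}')"
    [ "$actual" = "$expected" ]
}

# Usage (the checksum placeholder must be replaced by the published value):
#   verify_md5 all_genes_annot.1.bt2l "<published md5>" && echo "download OK"
```

If the server provides a checksum file in md5sum format, `md5sum -c <file>` performs the same check for all listed files at once.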
Dear nick-youngblut,
thank you for your prompt response.
Indeed, checking the files would be prudent, although from the logs and stderr I conclude that Bowtie2 is instructed to access the wrong file name. At this point I suspect that this is a Humann3 bug.
I have also spotted and corrected the mentioned typos, so this is not the source of the trouble.
Anyway, thank you for the valuable information that the issue is not reproducible. I will do some further investigations on my system.
Best regards,
Bostjan Murovec
I am using a conda env with the following:
humann 3.0.0.alpha.3 py37h83b1523_0 biobakery
bowtie2 2.4.2 py37h8270d21_1 bioconda
...so it could be an issue with the versions that you are using.
That is exactly my version too. The mystery persists.
Dear nick-youngblut,
would you be so kind as to list the exact versions of all packages in your Humann3 Conda environment?
I still cannot resolve the issue.
Thank you in advance and best regards,
Bostjan Murovec
Here's my full humann3 conda env:
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
bcbio-gff 0.6.6 pyh864c0ab_1 bioconda
biom-format 2.1.10 py37ha21ca33_0 conda-forge
biopython 1.78 py37h8f50634_1 conda-forge
blast 2.10.1 pl526he19e7b1_3 bioconda
boost-cpp 1.70.0 h7b93d67_3 conda-forge
bowtie2 2.4.2 py37h8270d21_1 bioconda
brotlipy 0.7.0 py37hb5d75c8_1001 conda-forge
bx-python 0.8.9 py37h73d7ac5_2 bioconda
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h36c2ea0_0 conda-forge
ca-certificates 2021.1.19 h06a4308_0
cached-property 1.5.2 py_0
capnproto 0.6.1 hfc679d8_1 conda-forge
certifi 2020.12.5 py37h89c1867_1 conda-forge
cffi 1.14.5 py37hc58025e_0 conda-forge
chardet 4.0.0 py37h89c1867_1 conda-forge
click 7.1.2 pyh9f0ad1d_0 conda-forge
cmseq 1.0.2 pyh7b7c402_0 bioconda
cryptography 3.4.4 py37hf1a17b8_0 conda-forge
curl 7.71.1 he644dc0_8 conda-forge
cycler 0.10.0 py_2 conda-forge
dendropy 4.5.2 pyh3252c3a_0 bioconda
diamond 2.0.7 h56fc30b_0 bioconda
entrez-direct 13.9 pl526h375a9b1_1 bioconda
expat 2.2.10 h9c3ff4c_0 conda-forge
fasttree 2.1.10 h516909a_4 bioconda
freetype 2.10.4 h0708190_1 conda-forge
future 0.18.2 py37h89c1867_3 conda-forge
glpk 4.65 h9202a9a_1003 conda-forge
gmp 6.2.1 h58526e2_0 conda-forge
gsl 2.6 he838d99_2 conda-forge
h5py 3.1.0 nompi_py37h1e651dc_100 conda-forge
hdf5 1.10.6 nompi_h6a2412b_1114 conda-forge
htslib 1.11 hd3b49d5_1 bioconda
humann 3.0.0.alpha.3 py37h83b1523_0 biobakery
icu 67.1 he1b5a44_0 conda-forge
idna 2.10 pyh9f0ad1d_0 conda-forge
iqtree 2.0.3 h176a8bc_1 bioconda
jpeg 9d h516909a_0 conda-forge
kiwisolver 1.3.1 py37h2527ec5_1 conda-forge
krb5 1.17.2 h926e7f8_0 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.35.1 hea4e1c9_2 conda-forge
libblas 3.9.0 8_openblas conda-forge
libcblas 3.9.0 8_openblas conda-forge
libcurl 7.71.1 hcdd3856_8 conda-forge
libdeflate 1.6 h516909a_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-ng 9.3.0 h2828fa1_18 conda-forge
libgfortran-ng 9.3.0 hff62375_18 conda-forge
libgfortran5 9.3.0 hff62375_18 conda-forge
libgomp 9.3.0 h2828fa1_18 conda-forge
liblapack 3.9.0 8_openblas conda-forge
libnghttp2 1.43.0 h812cca2_0 conda-forge
libopenblas 0.3.12 pthreads_h4812303_1 conda-forge
libpng 1.6.37 hed695b0_2 conda-forge
libssh2 1.9.0 hab1572f_5 conda-forge
libstdcxx-ng 9.3.0 h6de172a_18 conda-forge
libtiff 4.2.0 hdc55705_0 conda-forge
libwebp-base 1.2.0 h7f98852_0 conda-forge
lz4-c 1.9.3 h9c3ff4c_0 conda-forge
lzo 2.10 h516909a_1000 conda-forge
mafft 7.475 h516909a_0 bioconda
mash 2.2.2 ha61e061_2 bioconda
matplotlib-base 3.3.4 py37h0c9df89_0 conda-forge
metaphlan 3.0.7 pyh7b7c402_0 bioconda
muscle 3.8.1551 hc9558a2_5 bioconda
ncurses 6.2 h58526e2_4 conda-forge
numpy 1.20.1 py37haa41c4c_0 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
openssl 1.1.1j h7f98852_0 conda-forge
pandas 1.2.2 py37hdc94413_0 conda-forge
patsy 0.5.1 py_0 conda-forge
pcre 8.44 he1b5a44_0 conda-forge
perl 5.26.2 h36c2ea0_1008 conda-forge
perl-app-cpanminus 1.7044 pl526_1 bioconda
perl-archive-tar 2.32 pl526_0 bioconda
perl-base 2.23 pl526_1 bioconda
perl-business-isbn 3.004 pl526_0 bioconda
perl-business-isbn-data 20140910.003 pl526_0 bioconda
perl-carp 1.38 pl526_3 bioconda
perl-common-sense 3.74 pl526_2 bioconda
perl-compress-raw-bzip2 2.087 pl526he1b5a44_0 bioconda
perl-compress-raw-zlib 2.087 pl526hc9558a2_0 bioconda
perl-constant 1.33 pl526_1 bioconda
perl-data-dumper 2.173 pl526_0 bioconda
perl-digest-hmac 1.03 pl526_3 bioconda
perl-digest-md5 2.55 pl526_0 bioconda
perl-encode 2.88 pl526_1 bioconda
perl-encode-locale 1.05 pl526_6 bioconda
perl-exporter 5.72 pl526_1 bioconda
perl-exporter-tiny 1.002001 pl526_0 bioconda
perl-extutils-makemaker 7.36 pl526_1 bioconda
perl-file-listing 6.04 pl526_1 bioconda
perl-file-path 2.16 pl526_0 bioconda
perl-file-temp 0.2304 pl526_2 bioconda
perl-html-parser 3.72 pl526h6bb024c_5 bioconda
perl-html-tagset 3.20 pl526_3 bioconda
perl-html-tree 5.07 pl526_1 bioconda
perl-http-cookies 6.04 pl526_0 bioconda
perl-http-daemon 6.01 pl526_1 bioconda
perl-http-date 6.02 pl526_3 bioconda
perl-http-message 6.18 pl526_0 bioconda
perl-http-negotiate 6.01 pl526_3 bioconda
perl-io-compress 2.087 pl526he1b5a44_0 bioconda
perl-io-html 1.001 pl526_2 bioconda
perl-io-socket-ssl 2.066 pl526_0 bioconda
perl-io-zlib 1.10 pl526_2 bioconda
perl-json 4.02 pl526_0 bioconda
perl-json-xs 2.34 pl526h6bb024c_3 bioconda
perl-libwww-perl 6.39 pl526_0 bioconda
perl-list-moreutils 0.428 pl526_1 bioconda
perl-list-moreutils-xs 0.428 pl526_0 bioconda
perl-lwp-mediatypes 6.04 pl526_0 bioconda
perl-lwp-protocol-https 6.07 pl526_4 bioconda
perl-mime-base64 3.15 pl526_1 bioconda
perl-mozilla-ca 20180117 pl526_1 bioconda
perl-net-http 6.19 pl526_0 bioconda
perl-net-ssleay 1.88 pl526h90d6eec_0 bioconda
perl-ntlm 1.09 pl526_4 bioconda
perl-parent 0.236 pl526_1 bioconda
perl-pathtools 3.75 pl526h14c3975_1 bioconda
perl-scalar-list-utils 1.52 pl526h516909a_0 bioconda
perl-socket 2.027 pl526_1 bioconda
perl-storable 3.15 pl526h14c3975_0 bioconda
perl-test-requiresinternet 0.05 pl526_0 bioconda
perl-time-local 1.28 pl526_1 bioconda
perl-try-tiny 0.30 pl526_1 bioconda
perl-types-serialiser 1.0 pl526_2 bioconda
perl-uri 1.76 pl526_0 bioconda
perl-www-robotrules 6.02 pl526_3 bioconda
perl-xml-namespacesupport 1.12 pl526_0 bioconda
perl-xml-parser 2.44_01 pl526ha1d75be_1002 conda-forge
perl-xml-sax 1.02 pl526_0 bioconda
perl-xml-sax-base 1.09 pl526_0 bioconda
perl-xml-sax-expat 0.51 pl526_3 bioconda
perl-xml-simple 2.25 pl526_1 bioconda
perl-xsloader 0.24 pl526_0 bioconda
phylophlan 3.0.2 py_0 bioconda
pigz 2.5 h27826a3_0 conda-forge
pillow 8.1.0 py37h4600e1f_2 conda-forge
pip 21.0.1 pyhd8ed1ab_0 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pysam 0.16.0.1 py37hc334e0b_1 bioconda
pysocks 1.7.1 py37h89c1867_3 conda-forge
python 3.7.9 hffdb5ce_100_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python-lzo 1.12 py37he0a3664_1003 conda-forge
python_abi 3.7 1_cp37m conda-forge
pytz 2021.1 pyhd8ed1ab_0 conda-forge
raxml 8.2.12 h516909a_2 bioconda
readline 8.1 h27cfd23_0
requests 2.25.1 pyhd3deb0d_0 conda-forge
samtools 1.11 h6270b1f_0 bioconda
scipy 1.6.0 py37h14a347d_0 conda-forge
seaborn 0.11.1 ha770c72_0 conda-forge
seaborn-base 0.11.1 pyhd8ed1ab_1 conda-forge
seqkit 0.15.0 0 bioconda
setuptools 52.0.0 py37h06a4308_0
six 1.15.0 pyh9f0ad1d_0 conda-forge
sqlite 3.34.0 h74cdb3f_0 conda-forge
statsmodels 0.12.2 py37h902c9e0_0 conda-forge
tbb 2020.3 hfd86e86_0
tk 8.6.10 hed695b0_1 conda-forge
tornado 6.1 py37h5e8e339_1 conda-forge
trimal 1.4.1 hc9558a2_4 bioconda
urllib3 1.26.3 pyhd8ed1ab_0 conda-forge
wheel 0.36.2 pyhd3deb0d_0 conda-forge
xz 5.2.5 h516909a_1 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
zstd 1.4.8 ha95c52a_1 conda-forge
It was created with the yaml:
channels:
- conda-forge
- bioconda
- biobakery
dependencies:
- pigz
- bioconda::seqkit
- bioconda::diamond>=2.0.3
- bioconda::metaphlan>=3.0.1
- biobakery::humann
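To check whether two environments really match, the listings can be compared mechanically. The sketch below assumes both sides saved their listing with `conda list > file`; diff_conda_lists is a hypothetical helper that prints every package whose version differs or that is present in only one listing.

```shell
#!/usr/bin/env bash
# diff_conda_lists FILE_A FILE_B
# Both files are expected to hold `conda list` output (name, version, build, channel).
# Prints "package version_in_A version_in_B" for each difference, sorted by name.
diff_conda_lists() {
    awk '
        /^#/ || NF < 2 { next }            # skip header comments and blank lines
        NR == FNR { a[$1] = $2; next }     # first file: remember versions
        { b[$1] = $2 }                     # second file
        END {
            for (p in a) {
                if (!(p in b)) {
                    print p, a[p], "(missing)"
                } else if (a[p] != b[p]) {
                    print p, a[p], b[p]
                }
            }
            for (p in b) {
                if (!(p in a)) {
                    print p, "(missing)", b[p]
                }
            }
        }' "$1" "$2" | sort
}

# Usage (file names are placeholders):
#   diff_conda_lists my_env.txt other_env.txt
```

An empty output means that every package name and version agrees; build strings and channels are not compared in this sketch.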
THANK YOU!!!
You are extremely prompt and helpful.
Best regards,
Bostjan Murovec
Hello, could you tell me if the error was solved? I am having the same issue, the ".rev.rev" thing. However, I get a slightly different error message:
Error message returned from bowtie2 :
Could not open index file /lustre/marialaura/databases/humann/struo_gtdb/nucleotide/all_genes_annot.rev.rev.1.bt2l
Could not open index file /lustre/marialaura/databases/humann/struo_gtdb/nucleotide/all_genes_annot.rev.rev.2.bt2l
(ERR): bowtie2-align died with signal 11 (SEGV)
I have searched for the meaning of the error "signal 11 SEGV". Here is what I found and tried, although I do not understand it well; I am a beginner and still have a lot to learn.
According to what I found on the Bowtie2 site, "signal 11 SEGV" seems to be related to the number of threads (single-threaded vs. multi-threaded). This should have been fixed in version 2.4, though a few people still get this error (including me, using 2.4.2). I tried running with a single thread, but it still failed with the same error.
If you have found a solution for the issue, please tell me. Thank you.
Does the file all_genes_annot.rev.rev.1.bt2l indeed exist at /lustre/marialaura/databases/humann/struo_gtdb/nucleotide/?
If it does exist, what is the file size? It should be 12G.
That is what I have:
[marialaura]$ tree -h
|-- [ 12K] nucleotide
| |-- [ 12G] all_genes_annot.1.bt2l
| |-- [ 13G] all_genes_annot.2.bt2l
| |-- [500M] all_genes_annot.3.bt2l
| |-- [6.6G] all_genes_annot.4.bt2l
| |-- [ 12G] all_genes_annot.rev.1.bt2l
| |-- [ 13G] all_genes_annot.rev.2.bt2l
| `-- [7.9G] genome_reps_filt_annot.fna.gz
`-- [ 12K] protein
    `-- [ 11G] uniref90_201901.dmnd
2 directories, 8 files
So I do not have all_genes_annot.rev.rev.1.bt2l ("double rev"), I just have all_genes_annot.rev.1.bt2l ("single rev"). But as @BostjanMurovec already said, even if I rename it to "double rev", the run fails with the error Could not open index file all_genes_annot.rev.rev.rev.1.bt2l ("triple rev").
If I delete all_genes_annot.rev.1.bt2l, the program asks for all_genes_annot.rev.1.bt2l ("single rev"), which then, of course, no longer exists.
In my last attempt, I used the command:
humann \
--bypass-nucleotide-index \
--search-mode uniref90 \
--remove-temp-output \
--nucleotide-database /marialaura/databases/humann/struo_gtdb/nucleotide/ \
--protein-database /marialaura/databases/humann/struo_gtdb/protein/ \
--taxonomic-profile /marialaura/kraken_results/mpa_syle/S57.mpa.txt \
--threads 1 \
--input /marialaura/samples_joined/S57_R1_R2.fastq.gz \
--output /marialaura/humann_results/humann_23-06_test
I used Bowtie2 2.4.2 and Humann v3.0.0.alpha.4.
@marialgk this is likely a humann3 bug. Can you try updating to the non-alpha release of humann3?
Closing due to lack of activity. Feel free to reopen, if needed.