alesssia/YAMP

ERROR PULLING DOCKER UNIREF DATABASES

Hello @alesssia @pditommaso, much appreciation for this tool.

I don't have much experience with Docker, but every time I try to pull the UniRef databases through Docker using the command below:

docker container run --volume $HOME:$HOME --workdir $PWD -it biobakery/workflows:3.0.0.a.6.metaphlanv3.0.7 \
   humann_databases --download uniref uniref90_diamond ./assets/data/uniref

I get the following error:

docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: mkdir /home/geokit/my_shared_data_folder: file exists: unknown.
ERRO[0000] error waiting for container: context canceled
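
One possible reading of this error, which is an assumption and not something confirmed in this thread, is that `/home/geokit/my_shared_data_folder` is a symlink or mount point inside `$HOME`, so bind-mounting all of `$HOME` makes the runtime try to create a directory that already exists. A minimal sketch of a narrower mount follows: it resolves the physical path of the project directory and prints, without running it, a `docker` command that mounts only that directory. The helper names `resolve_dir` and `build_download_cmd` are hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the physical (symlink-free) path of a directory.
resolve_dir() {
    # cd -P follows symlinks to the physical directory; pwd -P prints it.
    (cd -P -- "$1" 2>/dev/null && pwd -P)
}

# Hypothetical helper: build (but do not run) a docker command that bind-mounts
# only the resolved project directory instead of all of $HOME.
build_download_cmd() {
    local project_dir
    project_dir="$(resolve_dir "$1")" || return 1
    printf 'docker container run --volume %s:%s --workdir %s -it %s humann_databases --download uniref uniref90_diamond ./assets/data/uniref\n' \
        "$project_dir" "$project_dir" "$project_dir" \
        "biobakery/workflows:3.0.0.a.6.metaphlanv3.0.7"
}

# Dry run: print the command for the current directory so it can be reviewed first.
build_download_cmd "$PWD"
```

Printing the command instead of executing it keeps the sketch safe to run anywhere. Whether the narrower mount actually avoids the `file exists` error depends on the host's directory layout.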

I then tried to install the databases myself with the `humann_databases` command, but apparently the DIAMOND version that built the database on my machine is not the same as the one in the Docker image used by the profile_function step, so the run fails with the following error:

N E X T F L O W  ~  version 21.10.6
Launching `YAMP.nf` [admiring_torricelli] - revision: 521633a8a3
---------------------------------------------
YET ANOTHER METAGENOMIC PIPELINE (YAMP) 
---------------------------------------------

Analysis introspection:


Starting time              : Thu Sep 29 06:18:05 UTC 2022
Environment                : 
Pipeline Name              : YAMP
Pipeline Version           : 0.9.5.3
Config Profile             : base,docker
Resumed                    : true
Nextflow version           : 21.10.6 build 5660 (21-12-2021 16:55 UTC)
Java version               : 11.0.16
Java Virtual Machine       : OpenJDK 64-Bit Server VM(11.0.16+8-post-Ubuntu-0ubuntu118.04)
Operating system           : Linux amd64 v5.4.0-1087-gcp
User name                  : root
Container Engine           : docker
Container                  : [:]
BBmap                      : quay.io/biocontainers/bbmap:38.87--h1296035_0
FastQC                     : quay.io/biocontainers/fastqc:0.11.9--0
biobakery                  : biobakery/workflows:3.0.0.a.6.metaphlanv3.0.7
qiime                      : qiime2/core:2020.8
MultiQC                    : quay.io/biocontainers/multiqc:1.9--py_1
Running parameters         : 
Reads                      : [/srv/data/my_shared_data_folder/amrcattle/YAMP/../data/ke_data/ERR2777823.fastq, ]
Prefix                     : sample1_ke
Running mode               : complete
Layout                     : Single-End
Performing de-duplication  : false
Synthetic contaminants     : 
Artefacts                  : /srv/data/my_shared_data_folder/amrcattle/YAMP/assets/data/sequencing_artifacts.fa.gz
Phix174ill                 : /srv/data/my_shared_data_folder/amrcattle/YAMP/assets/data/phix174_ill.ref.fa.gz
Adapters                   : /srv/data/my_shared_data_folder/amrcattle/YAMP/assets/data/adapters.fa
Trimming parameters        : 
Input quality offset       : ASCII+33
Min phred score            : 10
Min length                 : 60
kmer lenght                : 23
Shorter kmer               : 11
Max Hamming distance       : 1
Decontamination parameters : 
Contaminant (pan)genome    : /srv/data/my_shared_data_folder/amrcattle/YAMP/assets/data/hg19_main_mask_ribo_animal_allplant_allfungus.fa.gz
Min alignment identity     : 0.95
Max indel length           : 3
Max alignment band         : 0.16
MetaPhlAn parameters       : 
MetaPhlAn database         : /srv/data/my_shared_data_folder/amrcattle/YAMP/assets/data/metaphlan_databases/
Bowtie2 options            : very-sensitive
HUMAnN parameters          : 
Chocophlan database        : /srv/data/my_shared_data_folder/amrcattle/YAMP/assets/data/chocophlan
Uniref database            : /srv/data/my_shared_data_folder/amrcattle/YAMP/assets/data/uniref
Folders                    : 
Output dir                 : /srv/data/my_shared_data_folder/amrcattle/YAMP/../YAMP_results
Working dir                : /srv/data/my_shared_data_folder/amrcattle/YAMP/work
Script dir                 : /srv/data/my_shared_data_folder/amrcattle/YAMP
Lunching dir               : /srv/data/my_shared_data_folder/amrcattle/YAMP

executor >  local (2)
[71/15c029] process > get_software_versions                      [100%] 1 of 1, cached: 1 ✔
[-        ] process > dedup                                      -
[b3/578435] process > remove_synthetic_contaminants (sample1_ke) [100%] 1 of 1, cached: 1 ✔
[c4/3a65a2] process > trim (sample1_ke)                          [100%] 1 of 1, cached: 1 ✔
[47/3613df] process > index_foreign_genome (1)                   [100%] 1 of 1, cached: 1 ✔
[32/107bc9] process > decontaminate (sample1_ke)                 [100%] 1 of 1, cached: 1 ✔
[30/0a0ac5] process > quality_assessment (sample1_ke)            [100%] 2 of 2, cached: 2 ✔
[-        ] process > merge_paired_end_cleaned                   -
[57/b73a62] process > profile_taxa (sample1_ke)                  [100%] 1 of 1, cached: 1 ✔
[69/eaabd9] process > profile_function (sample1_ke)              [100%] 1 of 1, failed: 1 ✘
[-        ] process > alpha_diversity (sample1_ke)               -
[-        ] process > log                                        -
Error executing process > 'profile_function (sample1_ke)'

Caused by:
  Process `profile_function (sample1_ke)` terminated with an error exit status (1)

Command executed:

  #HUMAnN will uses the list of species detected by the profile_taxa process
  humann --input sample1_ke_QCd.fq.gz --output . --output-basename sample1_ke --taxonomic-profile sample1_ke_metaphlan_bugs_list.tsv --nucleotide-database chocophlan --protein-database uniref --pathways metacyc --threads 4 --memory-use minimum &> sample1_ke_HUMAnN.log 
  
  # MultiQC doesn't have a module for humann yet. As a consequence, I
  # had to create a YAML file with all the info I need via a bash script
  bash scrape_profile_functions.sh sample1_ke sample1_ke_HUMAnN.log > profile_functions_mqc.yaml

Command exit status:
  1

Command output:
  (empty)

Command error:
  WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.

Work dir:
  /srv/data/my_shared_data_folder/amrcattle/YAMP/work/69/eaabd9cea87d9911f5cd4363580994

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
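
Because the failing command redirects HUMAnN's output to `sample1_ke_HUMAnN.log` (`&> sample1_ke_HUMAnN.log`), Nextflow's own "Command output" is empty and the underlying error sits in that log inside the work dir. A minimal sketch for pulling the relevant lines out of such a log follows. The helper names are hypothetical, and the patterns assume the DIAMOND banner format `diamond v<version> | by ...` and error lines that start with `CRITICAL ERROR:` or `Error:`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the DIAMOND version recorded in a HUMAnN log.
# Assumes a banner line of the form "diamond v0.9.24.125 | by ...".
diamond_version_in_log() {
    sed -n 's/^diamond v\([0-9][0-9.]*\).*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: print the error lines from a HUMAnN log.
humann_log_errors() {
    grep -E '^(CRITICAL ERROR|Error):' "$1"
}

# Example usage: run from inside the process work dir reported by Nextflow.
LOG="sample1_ke_HUMAnN.log"
if [ -f "$LOG" ]; then
    echo "DIAMOND version used by the run: $(diamond_version_in_log "$LOG")"
    humann_log_errors "$LOG"
fi
```

The version printed this way is the one inside the container, which can then be compared with whatever DIAMOND version was present where the database was prepared.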

Upon inspecting the work directory with the following command:

```bash
less /srv/data/my_shared_data_folder/amrcattle/YAMP/work/69/eaabd9cea87d9911f5cd4363580994/sample1_ke_HUMAnN.log
```

the output highlights the problem at the end of the file:

```bash
Running diamond ........


Aligning to reference database: uniref90_201901b_full.dmnd

CRITICAL ERROR: Error executing: /usr/bin/diamond blastx --query /srv/data/my_shared_data_folder/amrcattle/YAMP/work/69/eaabd9cea87d9911f5cd4363580994/sample1_ke_humann_temp/sample1_ke_bowtie2_unaligned.fa --evalue 1.0 --threads 4 --top 1 --outfmt 6 --db /srv/data/my_shared_data_folder/amrcattle/YAMP/work/69/eaabd9cea87d9911f5cd4363580994/uniref/uniref90_201901b_full --out /srv/data/my_shared_data_folder/amrcattle/YAMP/work/69/eaabd9cea87d9911f5cd4363580994/sample1_ke_humann_temp/tmpvgd6p0jx/diamond_m8_vmf8ifrs --tmpdir /srv/data/my_shared_data_folder/amrcattle/YAMP/work/69/eaabd9cea87d9911f5cd4363580994/sample1_ke_humann_temp/tmpvgd6p0jx

## Error message returned from diamond :
diamond v0.9.24.125 | by Benjamin Buchfink <buchfink@gmail.com>
Licensed under the GNU GPL <https://www.gnu.org/licenses/gpl.txt>
Check http://github.com/bbuchfink/diamond for updates.

#CPU threads: 4
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /srv/data/my_shared_data_folder/amrcattle/YAMP/work/69/eaabd9cea87d9911f5cd4363580994/sample1_ke_humann_temp/tmpvgd6p0jx
Opening the database...  [0.016184s]
Error: Database was built with a different version of Diamond and is incompatible.
```

Please help with this issue!