atlas download fails
Closed this issue · 6 comments
- I checked and didn't find a related issue, e.g. while typing the title
- **I got an error in the following rule(s):** `atlas download --db-dir $DBDIR`
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/ssl.py", line 1274, in recv_into
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/urllib3/util/retry.py", line 470, in increment
raise reraise(type(error), error, _stacktrace)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/urllib3/util/util.py", line 39, in reraise
raise value
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/urllib3/connectionpool.py", line 790, in urlopen
response = self._make_request(
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/urllib3/connectionpool.py", line 538, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/urllib3/connectionpool.py", line 370, in _raise_timeout
raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='zenodo.org', port=443): Read timed out. (read timeout=15.0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/checkm2/zenodo_backpack.py", line 164, in _retrieve_record_ID
r = requests.get(DOI, timeout=15.)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/requests/sessions.py", line 725, in send
history = [resp for resp in gen]
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/requests/sessions.py", line 725, in <listcomp>
history = [resp for resp in gen]
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/requests/sessions.py", line 266, in resolve_redirects
resp = self.send(
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/requests/adapters.py", line 532, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='zenodo.org', port=443): Read timed out. (read timeout=15.0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/bin/checkm2", line 280, in <module>
fileManager.DiamondDB().download_database(args.path)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/checkm2/fileManager.py", line 127, in download_database
backpack_downloader.download_and_extract(download_location, DOI, progress_bar=True, no_check_version=False)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/checkm2/zenodo_backpack.py", line 46, in download_and_extract
recordID = self._retrieve_record_ID(DOI)
File "/scratch/mirror/atlas/2.18.1/conda_envs/970bf81dc2fb89a4cccba899825a84ec_/lib/python3.8/site-packages/checkm2/zenodo_backpack.py", line 166, in _retrieve_record_ID
raise ZenodoConnectionException('Connection error: {}'.format(e))
checkm2.zenodo_backpack.ZenodoConnectionException: Connection error: HTTPSConnectionPool(host='zenodo.org', port=443): Read timed out. (read timeout=15.0)
================================================================================
Removing output files of failed job checkm2_download_db since they might be corrupted:
/scratch/mirror/atlas/2.18.1/CheckM2
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
**Atlas version**
2.18.1
This looks like an Internet connection problem. Did the other download steps succeed?
I tried again and got this error:
$ atlas download --db-dir .
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Creating conda environment /home/apps/conda/miniconda3/envs/atlas-2.18.1/lib/python3.10/site-packages/atlas/workflow/rules/../envs/eggNOG.yaml...
Downloading and installing remote packages.
Environment for /home/apps/conda/miniconda3/envs/atlas-2.18.1/lib/python3.10/site-packages/atlas/workflow/rules/../envs/eggNOG.yaml created (location: conda_envs/ce47687b109879e39a03638f72c50b1e_)
Creating conda environment /home/apps/conda/miniconda3/envs/atlas-2.18.1/lib/python3.10/site-packages/atlas/workflow/rules/../envs/checkm2.yaml...
Downloading and installing remote packages.
Environment for /home/apps/conda/miniconda3/envs/atlas-2.18.1/lib/python3.10/site-packages/atlas/workflow/rules/../envs/checkm2.yaml created (location: conda_envs/e52a9b934605d8074e0163627fbe0316_)
Creating conda environment /home/apps/conda/miniconda3/envs/atlas-2.18.1/lib/python3.10/site-packages/atlas/workflow/rules/../envs/gtdbtk.yaml...
Downloading and installing remote packages.
Environment for /home/apps/conda/miniconda3/envs/atlas-2.18.1/lib/python3.10/site-packages/atlas/workflow/rules/../envs/gtdbtk.yaml created (location: conda_envs/ab1b2b5668e673019caaf32843c034c4_)
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job                      count    min threads    max threads
---------------------  -------  -------------  -------------
checkm2_download_db          1              1              1
download                     1              1              1
download_atlas_files         2              1              1
download_eggNOG_files        1              1              1
download_gtdb                1              1              1
extract_gtdb                 1              1              1
total                        7              1              1
Select jobs to execute...
[Wed Oct 18 15:54:41 2023]
localrule download_atlas_files:
output: /scratch/students/apptest/adapters.fa
jobid: 1
reason: Missing output files: /scratch/students/apptest/adapters.fa
wildcards: filename=adapters.fa
resources: tmpdir=/tmp
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Select jobs to execute...
--2023-10-18 15:54:42-- https://zenodo.org/record/1134890/files/adapters.fa
Resolving zenodo.org (zenodo.org)... 188.185.10.78, 188.185.22.33, 188.185.33.206, ...
Connecting to zenodo.org (zenodo.org)|188.185.10.78|:443... connected.
HTTP request sent, awaiting response... 301 MOVED PERMANENTLY
Location: /records/1134890/files/adapters.fa [following]
--2023-10-18 15:54:42-- https://zenodo.org/records/1134890/files/adapters.fa
Reusing existing connection to zenodo.org:443.
HTTP request sent, awaiting response... 200 OK
Length: 13954 (14K) [application/octet-stream]
Saving to: '/scratch/students/apptest/adapters.fa'
/scratch/students/apptest/adapters.fa 100%[============================================================================================>] 13.63K --.-KB/s in 0.001s
2023-10-18 15:54:42 (20.6 MB/s) - '/scratch/students/apptest/adapters.fa' saved [13954/13954]
[Wed Oct 18 15:54:43 2023]
Finished job 1.
1 of 7 steps (14%) done
Select jobs to execute...
[Wed Oct 18 15:54:43 2023]
rule checkm2_download_db:
output: /scratch/students/apptest/CheckM2
log: logs/download/checkm2.log
jobid: 4
reason: Missing output files: /scratch/students/apptest/CheckM2
resources: tmpdir=/tmp, time=10
Activating conda environment: conda_envs/e52a9b934605d8074e0163627fbe0316_
[Wed Oct 18 15:55:13 2023]
Error in rule checkm2_download_db:
jobid: 4
output: /scratch/students/apptest/CheckM2
log: logs/download/checkm2.log (check log file(s) for error details)
conda-env: /scratch/students/apptest/conda_envs/e52a9b934605d8074e0163627fbe0316_
shell:
checkm2 database --download --path /scratch/students/apptest/CheckM2 &>> logs/download/checkm2.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile logs/download/checkm2.log:
================================================================================
[10/18/2023 03:55:12 PM] INFO: Command: Download database. Checking internal path information.
Traceback (most recent call last):
File "/scratch/students/apptest/conda_envs/e52a9b934605d8074e0163627fbe0316_/bin/checkm2", line 280, in <module>
fileManager.DiamondDB().download_database(args.path)
File "/scratch/students/apptest/conda_envs/e52a9b934605d8074e0163627fbe0316_/lib/python3.8/site-packages/checkm2/fileManager.py", line 127, in download_database
backpack_downloader.download_and_extract(download_location, DOI, progress_bar=True, no_check_version=False)
File "/scratch/students/apptest/conda_envs/e52a9b934605d8074e0163627fbe0316_/lib/python3.8/site-packages/checkm2/zenodo_backpack.py", line 52, in download_and_extract
fname = str(file['key']).split('/')[-1]
KeyError: 'key'
================================================================================
Removing output files of failed job checkm2_download_db since they might be corrupted:
/scratch/students/apptest/CheckM2
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
An error occurred while downloading reference databases.
ATLAS databases can be manually downloaded from: https://zenodo.org/record/1134890
eggNOG databases can be manually downloaded from: http://eggnogdb.embl.de/download/emapperdb-5
CAT databases can be manually downloaded from: https://github.com/dutilh/CAT
Complete log: .snakemake/log/2023-10-18T154707.281348.snakemake.log
[Atlas] CRITICAL: Command 'snakemake --snakefile /home/apps/conda/miniconda3/envs/atlas-2.18.1/lib/python3.10/site-packages/atlas/workflow/rules/download.smk --jobs 1 --rerun-incomplete --conda-frontend mamba --scheduler greedy --nolock --use-conda --conda-prefix /scratch/students/apptest/conda_envs --show-failed-logs --config database_dir='/scratch/students/apptest' -- ' returned non-zero exit status 1.
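The `KeyError: 'key'` in the log above means that the file entry being read had no `key` field at that point. Below is a minimal sketch of a more defensive lookup, not the actual CheckM2 code. The function name `file_name_from_entry` and the fallback field `filename` are assumptions made for illustration; the thread does not confirm which fields the response really contained.

```python
def file_name_from_entry(entry):
    """Return the bare file name from one file entry of a record listing.

    'key' is the field read in the traceback. 'filename' is a hypothetical
    alternative field name, used here only to illustrate a fallback.
    """
    for field in ("key", "filename"):
        if field in entry:
            # Same normalisation as in the traceback: keep the last path part.
            return str(entry[field]).split("/")[-1]
    # Fail with a message that lists what was actually present.
    raise KeyError(
        "no file name field found; available fields: {}".format(sorted(entry))
    )
```

With a lookup like this, a missing field would produce an error that names the fields that were present, which is easier to debug than a bare `KeyError: 'key'`.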
The link printed for manual eggNOG database download returns a 404: http://eggnog6.embl.de/download/emapperdb-5
I will try next week to debug this.
It seems you could install conda envs without problem.
Even if you don't have all the databases you can go ahead and run atlas.
Yes, the conda installation works without problems! It seems that the directory structure for eggNOG is more fine-grained than just the major version:
http://eggnog6.embl.de/download/emapperdb-5.0.2/ instead of http://eggnog6.embl.de/download/emapperdb-5
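To illustrate the point about the directory name, here is a small sketch that builds the download URL from a full version string rather than the major version alone. The helper `emapperdb_url` and the requirement of exactly three numeric components are assumptions for illustration; only the base URL and the `5.0.2` example come from this thread.

```python
BASE_URL = "http://eggnog6.embl.de/download"  # host as quoted in this thread


def emapperdb_url(version):
    """Build the emapperdb directory URL for a full version such as '5.0.2'.

    Assumes (not confirmed by the thread) that directories are always named
    with a three-part numeric version.
    """
    parts = version.split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(
            "expected a full version like '5.0.2', got {!r}".format(version)
        )
    return "{}/emapperdb-{}/".format(BASE_URL, version)
```

Calling `emapperdb_url("5.0.2")` gives the working link quoted above, while `emapperdb_url("5")` raises a `ValueError` instead of silently producing the link that 404s.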
Where are we here again? Did you have an error in CheckM2, or was it eggNOG?
Do you really need eggNOG?
Hi,
I retried the command:
atlas download --db-dir $DBDIR
and it worked now. May have been related to connection problems after all.
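Since a plain retry fixed it, transient failures like the read timeout above can be handled by retrying with a growing delay. The following is a generic sketch, not part of ATLAS or CheckM2; the helper name `retry` and all of its parameters are hypothetical.

```python
import time


def retry(func, attempts=3, base_delay=1.0, exceptions=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or `attempts` tries have failed.

    The wait doubles after each failure: base_delay, 2*base_delay, ...
    `sleep` is injectable so the helper can be tested without waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            sleep(base_delay * 2 ** (attempt - 1))
```

One possible use is to wrap the download command, for example `retry(lambda: subprocess.run(["atlas", "download", "--db-dir", db_dir], check=True))`, where `db_dir` is whatever directory you pass as `$DBDIR`.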