Permission/access credentials to download data and models
The issue described here also affects the download of pretrained adapters.
Hi,
Thank you very much for reporting this. We are looking into the issue today. Will give an update by the end of the day.
We submitted this PR to potentially solve your issue: #10 (already merged into master).
Let us know if that works.
Hello,
Thank you for the quick answer!
However, the permission issue is not about the execution of the bash scripts, but rather about accessing the AWS buckets where most of the data and models are stored: I created an AWS key and ran the waws --configure command, and I still cannot download any data or model stored in AWS buckets. For this reason, most of the download scripts cannot be used (1_download_relations.sh, 9_download_pretrained_adapters_omcs.sh, 9_download_pretrained_adapters_rw30.sh, siqa_1_download_siqa.sh).
Uploading the data and models to a Google Drive (as in the workaround for this issue) and simply including the download links in the README would be an easy fix for this issue.
Thanks for quickly evaluating that.
It's very strange, we just ran the scripts without any credentials at all. Maybe you could try that?
I.e. remove the .waws file from your home folder (~), then reconfigure waws without any AWS keys and try the scripts after that.
Here's the result:
$ rm -rf ~/.waws/
$ waws --configure
Please Enter Your KEY_PATH [CURRENT:~/.ssh/wluper]:
Please Enter Your USER [CURRENT:Firstname]:
Please Enter Your AWS_REGION [CURRENT:eu-west-2]:
Please Enter Your AWS_ENCODING [CURRENT:json]:
Please Enter Your AWS_KEY_ID [CURRENT:]:
Please Enter Your AWS_KEY [CURRENT:]:
$ bash ./1_download_relations.sh
Downloading and extracting relatedTo...
Traceback (most recent call last):
File "download_utility/download_relations.py", line 68, in <module>
sys.exit(main(sys.argv[1:]))
File "download_utility/download_relations.py", line 64, in main
download(rel, args.data_dir)
File "download_utility/download_relations.py", line 36, in download
bucket_name="wluper-retrograph"
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/waws/bucket_manager.py", line 76, in download_file
bucket.download_file( final_remote_path, final_local_path )
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/inject.py", line 246, in bucket_download_file
ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/inject.py", line 172, in download_file
extra_args=ExtraArgs, callback=Callback)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/transfer.py", line 307, in download_file
future.result()
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/futures.py", line 106, in result
return self._coordinator.result()
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/futures.py", line 265, in result
raise self._exception
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/tasks.py", line 255, in _main
self._submit(transfer_future=transfer_future, **kwargs)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/download.py", line 343, in _submit
**transfer_future.meta.call_args.extra_args
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 663, in _make_api_call
operation_model, request_dict, request_context)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 682, in _make_request
return self._endpoint.make_request(operation_model, request_dict)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/endpoint.py", line 102, in make_request
return self._send_request(request_dict, operation_model)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/endpoint.py", line 132, in _send_request
request = self.create_request(request_dict, operation_model)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/endpoint.py", line 116, in create_request
operation_name=operation_model.name)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/hooks.py", line 356, in emit
return self._emitter.emit(aliased_event_name, **kwargs)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/hooks.py", line 228, in emit
return self._emit(event_name, kwargs)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
response = handler(**kwargs)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/signers.py", line 90, in handler
return self.sign(operation_name, request)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/signers.py", line 162, in sign
auth.add_auth(request)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/auth.py", line 357, in add_auth
raise NoCredentialsError
botocore.exceptions.NoCredentialsError: Unable to locate credentials
I am using Ubuntu 19.10 and Python 3.6.12 :: Anaconda, Inc.
My requirements.txt is as follows:
waws==0.0.0.4
numpy==1.18.5
networkx==2.5
tqdm==4.56.2
tensorflow-gpu==1.15.5
Perfect, thanks for providing your error message.
We understand the bug now.
The problem is basically that the package awscli requires its own configuration (which we had by default on our machines). We will try to push a fix over the next couple of weeks (for the case where awscli is not configured on a machine). However, in the meantime you can fix it by configuring aws directly first!
Here our version numbers for reference:
wluper-system$ aws --version
aws-cli/1.19.8 Python/3.6.5 Darwin/19.6.0 botocore/1.20.8
Steps to solve the problem:
- Install awscli (version 1). Run: pip3 install "awscli<1.19.8" --upgrade --user
- Run the command: aws configure (choose default region eu-west-2)
- Try running the whole thing again.
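To check whether the steps above left usable credentials on a machine, something like the following could help. It is only a rough sketch using the standard library: it looks at the two most common places (the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables and the default ~/.aws/credentials file that aws configure writes). The function name credential_sources is made up for this example, and other credential sources that boto3 supports are not covered.

```python
import configparser
import os
from pathlib import Path


def credential_sources(env=None, home=None):
    """Return a list of places where AWS credentials appear to be configured.

    Hypothetical helper: it only inspects environment variables and the
    default shared credentials file, not every source boto3 can use.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    found = []

    # Environment variables take precedence over the shared credentials file.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        found.append("environment variables")

    # `aws configure` stores keys in ~/.aws/credentials, one section per profile.
    cred_file = home / ".aws" / "credentials"
    if cred_file.is_file():
        parser = configparser.ConfigParser()
        parser.read(cred_file)
        for profile in parser.sections():
            if parser.has_option(profile, "aws_access_key_id"):
                found.append(f"{cred_file} [{profile}]")

    return found


if __name__ == "__main__":
    sources = credential_sources()
    if sources:
        print("Credentials found in:", ", ".join(sources))
    else:
        print("No credentials found in the usual places; "
              "a NoCredentialsError from botocore is likely.")
```

If this prints no sources, running aws configure again is probably the first thing to try.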
Hello,
I installed awscli (version 1):
$ aws --version
aws-cli/1.19.8 Python/2.7.17 Linux/5.3.0-64-generic botocore/1.20.8
The only difference is that I installed it by downloading the bundle as described here, and not using pip. This is because I am using conda virtual environments and it was easier to configure awscli when installed that way.
Here is my current awscli configuration:
$ aws configure
AWS Access Key ID [****************]:
AWS Secret Access Key [****************]:
Default region name [eu-west-2]:
Default output format [json]:
and here is my current waws configuration:
$ waws --configure
Please Enter Your AWS_ENCODING [CURRENT:json]:
Please Enter Your AWS_KEY [CURRENT:****************]:
Please Enter Your AWS_KEY_ID [CURRENT:****************]:
Please Enter Your AWS_REGION [CURRENT:eu-west-2]:
Please Enter Your KEY_PATH [CURRENT:~/.ssh/wluper]:
Please Enter Your USER [CURRENT:Firstname]:
With this setup, I now get a different error message when I try to download from your AWS buckets:
$ bash 1_download_relations.sh
Downloading and extracting relatedTo...
Traceback (most recent call last):
File "download_utility/download_relations.py", line 68, in <module>
sys.exit(main(sys.argv[1:]))
File "download_utility/download_relations.py", line 64, in main
download(rel, args.data_dir)
File "download_utility/download_relations.py", line 36, in download
bucket_name="wluper-retrograph"
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/waws/bucket_manager.py", line 76, in download_file
bucket.download_file( final_remote_path, final_local_path )
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/inject.py", line 246, in bucket_download_file
ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/inject.py", line 172, in download_file
extra_args=ExtraArgs, callback=Callback)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/transfer.py", line 307, in download_file
future.result()
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/futures.py", line 106, in result
return self._coordinator.result()
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/futures.py", line 265, in result
raise self._exception
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/tasks.py", line 255, in _main
self._submit(transfer_future=transfer_future, **kwargs)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/download.py", line 343, in _submit
**transfer_future.meta.call_args.extra_args
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 676, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
P.S. I also tried without providing my AWS key and ID, but that resulted in the same error message as before (botocore.exceptions.NoCredentialsError: Unable to locate credentials).
Thank you for trying again and bearing with this issue.
We made a change in the bucket policy and this may affect the results now.
Please try again. If it doesn't work, we will open a bigger request internally to also push some fixes to waws.
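If the updated bucket policy allows anonymous reads, an object should also be reachable over plain HTTPS, without waws or any credentials. The sketch below builds the standard virtual-hosted-style S3 URL and downloads from it with the standard library. The bucket name and region are the ones mentioned in this thread; whether the bucket actually lives in eu-west-2, and the object key used in the comment, are assumptions made for illustration only.

```python
import urllib.request
from urllib.parse import quote


def public_s3_url(bucket, key, region="eu-west-2"):
    """Build the virtual-hosted-style HTTPS URL for an S3 object."""
    # Percent-encode the key but keep "/" so nested keys remain paths.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


def download_public_object(bucket, key, dest, region="eu-west-2"):
    """Fetch an object without credentials.

    This only works if the bucket policy permits anonymous reads;
    otherwise the server answers with HTTP 403.
    """
    url = public_s3_url(bucket, key, region)
    urllib.request.urlretrieve(url, dest)
    return dest


# Example (hypothetical key, not taken from the repository):
# download_public_object("wluper-retrograph", "path/to/cn_relatedTo.txt", "cn_relatedTo.txt")
```

A 403 from such a URL would point at the bucket policy rather than at the local waws or awscli setup.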
My pleasure.
I ran the download scripts once again after the bucket policy update and it's definitely going in the right direction.
I was able to successfully download the relations with the 1_download_relations.sh script and the pretrained adapters with the 9_download_pretrained_adapters_rw30.sh and 9_download_pretrained_adapters_omcs.sh scripts.
I was also able to download the CommonsenseQA dataset with csqa_1_download_commonsenseqa.sh; however, there seem to be some issues left with the bucket policies of the other datasets.
Here's the error I get when I try to download GLUE:
$ bash glue_1_download_glue.sh
Downloading and extracting relatedTo...
Downloaded: cn_relatedTo.txt
Done!
Downloading and extracting formOf...
Downloaded: cn_formOf.txt
Done!
Downloading and extracting isA...
Downloaded: cn_isA.txt
Done!
Downloading and extracting partOf...
Downloaded: cn_partOf.txt
Done!
Downloading and extracting hasA...
Downloaded: cn_hasA.txt
Done!
Downloading and extracting usedFor...
Downloaded: cn_usedFor.txt
Done!
Downloading and extracting capableOf...
Downloaded: cn_capableOf.txt
Done!
Downloading and extracting atLocation...
Downloaded: cn_atLocation.txt
Done!
Downloading and extracting causes...
Downloaded: cn_causes.txt
Done!
Downloading and extracting hasSubevent...
Downloaded: cn_hasSubevent.txt
Done!
Downloading and extracting hasFirstSubevent...
Downloaded: cn_hasFirstSubevent.txt
Done!
Downloading and extracting hasLastSubevent...
Downloaded: cn_hasLastSubevent.txt
Done!
Downloading and extracting hasPrerequisite...
Downloaded: cn_hasPrerequisite.txt
Done!
Downloading and extracting hasProperty...
Downloaded: cn_hasProperty.txt
Done!
Downloading and extracting motivatedByGoal...
Downloaded: cn_motivatedByGoal.txt
Done!
Downloading and extracting obstructedBy...
Downloaded: cn_obstructedBy.txt
Done!
Downloading and extracting desires...
Downloaded: cn_desires.txt
Done!
Downloading and extracting createdBy...
Downloaded: cn_createdBy.txt
Done!
Downloading and extracting synonyms...
Downloaded: cn_synonyms.txt
Done!
Downloading and extracting antonyms...
Downloaded: cn_antonyms.txt
Done!
Downloading and extracting distinctFrom...
Downloaded: cn_distinctFrom.txt
Done!
Downloading and extracting derivedFrom...
Downloaded: cn_derivedFrom.txt
Done!
Downloading and extracting symbolOf...
Downloaded: cn_symbolOf.txt
Done!
Downloading and extracting definedAs...
Downloaded: cn_definedAs.txt
Done!
Downloading and extracting mannerOf...
Downloaded: cn_mannerOf.txt
Done!
Downloading and extracting locatedNear...
Downloaded: cn_locatedNear.txt
Done!
Downloading and extracting hasContext...
Downloaded: cn_hasContext.txt
Done!
Downloading and extracting similarTo...
Downloaded: cn_similarTo.txt
Done!
Downloading and extracting causesDesire...
Downloaded: cn_causesDesire.txt
Done!
Downloading and extracting madeOf...
Downloaded: cn_madeOf.txt
Done!
Downloading and extracting receivesAction...
Downloaded: cn_receivesAction.txt
Done!
Downloading and extracting CoLA...
Traceback (most recent call last):
File "download_utility/download_glue.py", line 145, in <module>
sys.exit(main(sys.argv[1:]))
File "download_utility/download_glue.py", line 141, in main
download_and_extract(task, args.data_dir)
File "download_utility/download_glue.py", line 51, in download_and_extract
urllib.request.urlretrieve(TASK2PATH[task], data_file)
File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 248, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 570, in error
return self._call_chain(*args)
File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
Here's the error I get when I try to download COPA:
$ bash copa_1_download_copa.sh
An error occurred (403) when calling the HeadObject operation: Forbidden
Download to S3 failed. Please check all parameters are correct. Otherwise, please report on Github.
unzip: cannot find or open copa_en.zip, copa_en.zip.zip or copa_en.zip.ZIP.
mv: cannot stat 'test_gold.jsonl': No such file or directory
mv: cannot stat 'train.en.jsonl': No such file or directory
mv: cannot stat 'val.en.jsonl': No such file or directory
mv: cannot stat 'copa_en.zip': No such file or directory
Here's the error I get when I try to download SIQA:
$ bash siqa_1_download_siqa.sh
An error occurred (403) when calling the HeadObject operation: Forbidden
Download to S3 failed. Please check all parameters are correct. Otherwise, please report on Github.
unzip: cannot find or open socialIQa_v1.4.zip, socialIQa_v1.4.zip.zip or socialIQa_v1.4.zip.ZIP.
mv: cannot stat 'socialIQa_v1.4_dev.jsonl': No such file or directory
mv: cannot stat 'socialIQa_v1.4_trn.jsonl': No such file or directory
mv: cannot stat 'socialIQa_v1.4_tst.jsonl': No such file or directory
mv: cannot stat 'socialIQa_v1.4.zip': No such file or directory
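The GLUE traceback above shows that download_glue.py calls urllib.request.urlretrieve directly, so an HTTP 403 surfaces as a long stack trace. A small wrapper along the following lines might make such failures easier to read. This is a sketch only: try_download and its retrieve parameter are hypothetical names introduced for this example and are not part of the repository.

```python
import urllib.error
import urllib.request


def try_download(url, dest, retrieve=urllib.request.urlretrieve):
    """Download url to dest; return (ok, message) instead of raising.

    `retrieve` is injectable so the function can be exercised offline.
    """
    try:
        retrieve(url, dest)
    except urllib.error.HTTPError as err:
        # err.code is the HTTP status, e.g. 403 when the server refuses access.
        return False, f"HTTP {err.code} for {url}: {err.reason}"
    except urllib.error.URLError as err:
        # Connection-level problems such as DNS failures end up here.
        return False, f"Could not reach {url}: {err.reason}"
    return True, f"Downloaded {url} -> {dest}"
```

HTTPError is caught first because it is a subclass of URLError. A caller could then print the message and continue with the next task instead of aborting the whole script.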
Great!
I can confirm that I was able to execute all download scripts from the repository.
Next step: training a model :)
Awesome! Thanks for running it again.
Hi,
First of all, I want to thank you for providing your code and all the support on the occurring issues!
Unfortunately, I had the same issues as user ghost. I searched the closed issues and followed the advice in this thread. Everything worked out except for the "botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden" error, which is occurring for me again. In the thread, the issue was said to be fixed, but it now seems to have reappeared. Do you have any advice?
Here's the full error log:
Downloading and extracting relatedTo...
Traceback (most recent call last):
File "download_utility/download_relations.py", line 68, in <module>
sys.exit(main(sys.argv[1:]))
File "download_utility/download_relations.py", line 64, in main
download(rel, args.data_dir)
File "download_utility/download_relations.py", line 36, in download
bucket_name="wluper-retrograph"
File "/usr/local/lib/python3.7/dist-packages/waws/bucket_manager.py", line 76, in download_file
bucket.download_file( final_remote_path, final_local_path )
File "/usr/local/lib/python3.7/dist-packages/boto3/s3/inject.py", line 283, in bucket_download_file
Config=Config,
File "/usr/local/lib/python3.7/dist-packages/boto3/s3/inject.py", line 195, in download_file
callback=Callback,
File "/usr/local/lib/python3.7/dist-packages/boto3/s3/transfer.py", line 320, in download_file
future.result()
File "/usr/local/lib/python3.7/dist-packages/s3transfer/futures.py", line 103, in result
return self._coordinator.result()
File "/usr/local/lib/python3.7/dist-packages/s3transfer/futures.py", line 266, in result
raise self._exception
File "/usr/local/lib/python3.7/dist-packages/s3transfer/tasks.py", line 269, in _main
self._submit(transfer_future=transfer_future, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/s3transfer/download.py", line 357, in _submit
**transfer_future.meta.call_args.extra_args,
File "/usr/local/lib/python3.7/dist-packages/botocore/client.py", line 530, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/local/lib/python3.7/dist-packages/botocore/client.py", line 960, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden