Wluper/Retrograph

Permission/access credentials to download data and models


The issue described here also affects the download of pretrained adapters.

Hi,

Thank you very much for reporting this. We are looking into the issue today. Will give an update by the end of the day.

We submitted this PR to potentially solve your issue:
#10

(already merged into master).

Let us know if that works.

Hello,

Thank you for the quick answer!

However, the permission issue is not about the execution of the bash scripts, but rather about accessing the AWS buckets where most of the data and models are stored: I created an AWS key and ran the waws --configure command, and I still cannot download any data or model stored in the AWS buckets. For this reason, most of the download scripts cannot be used (1_download_relations.sh, 9_download_pretrained_adapters_omcs.sh, 9_download_pretrained_adapters_rw30.sh, siqa_1_download_siqa.sh).

Uploading the data and models to a Google Drive (as in the workaround for this issue) and simply including the download links in the README would be an easy fix for this issue.

Thanks for quickly evaluating that.

It's very strange, we just ran the scripts without any credentials at all. Maybe you could try that?

I.e., remove the .waws file from the home folder (~),

then reconfigure waws without any AWS keys and try the scripts after that.

Here's the result:

$ rm -rf ~/.waws/
$ waws --configure
Please Enter Your KEY_PATH [CURRENT:~/.ssh/wluper]:  
Please Enter Your USER [CURRENT:Firstname]:  
Please Enter Your AWS_REGION [CURRENT:eu-west-2]:  
Please Enter Your AWS_ENCODING [CURRENT:json]:  
Please Enter Your AWS_KEY_ID [CURRENT:]:  
Please Enter Your AWS_KEY [CURRENT:]:  
$ bash ./1_download_relations.sh
Downloading and extracting relatedTo...
Traceback (most recent call last):
  File "download_utility/download_relations.py", line 68, in <module>
    sys.exit(main(sys.argv[1:]))
  File "download_utility/download_relations.py", line 64, in main
    download(rel, args.data_dir)
  File "download_utility/download_relations.py", line 36, in download
    bucket_name="wluper-retrograph"
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/waws/bucket_manager.py", line 76, in download_file
    bucket.download_file( final_remote_path, final_local_path )
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/inject.py", line 246, in bucket_download_file
    ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/inject.py", line 172, in download_file
    extra_args=ExtraArgs, callback=Callback)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/transfer.py", line 307, in download_file
    future.result()
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/futures.py", line 265, in result
    raise self._exception
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/tasks.py", line 255, in _main
    self._submit(transfer_future=transfer_future, **kwargs)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/download.py", line 343, in _submit
    **transfer_future.meta.call_args.extra_args
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 663, in _make_api_call
    operation_model, request_dict, request_context)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 682, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/endpoint.py", line 132, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/endpoint.py", line 116, in create_request
    operation_name=operation_model.name)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/signers.py", line 90, in handler
    return self.sign(operation_name, request)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/signers.py", line 162, in sign
    auth.add_auth(request)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/auth.py", line 357, in add_auth
    raise NoCredentialsError
botocore.exceptions.NoCredentialsError: Unable to locate credentials

I am using Ubuntu 19.10 and Python 3.6.12 :: Anaconda, Inc.
My requirements.txt is as follows:

waws==0.0.0.4
numpy==1.18.5
networkx==2.5
tqdm==4.56.2
tensorflow-gpu==1.15.5

Perfect, thanks for providing your error message.

We understand the bug now.

The problem is basically that the awscli package requires its own configuration (which we had by default on our machines). We will try to push a fix over the next couple of weeks (for the case when awscli is not configured on a machine). However, in the meantime you can fix it by configuring aws directly first!

Here are our version numbers for reference:

wluper-system$ aws --version
aws-cli/1.19.8 Python/3.6.5 Darwin/19.6.0 botocore/1.20.8

Steps to solve the problem:

  1. Install awscli (version 1). Run: pip3 install "awscli<1.19.8" --upgrade --user
  2. Run the command: aws configure (choose default region eu-west-2)
  3. Try running the whole thing again.
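
Before re-running the scripts, it can help to confirm that credentials are actually discoverable. The following is a minimal sketch, not part of waws or Retrograph: the function name find_aws_credentials is hypothetical, and it only checks two common sources (the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables and the [default] profile in ~/.aws/credentials, the file that aws configure writes). boto3 can also pick up credentials from other places, so a None result here is a hint rather than proof.

```python
import configparser
import os
from pathlib import Path


def find_aws_credentials(env=None, home=None):
    """Return where static AWS credentials were found, or None.

    Hypothetical helper: it only looks at environment variables and the
    [default] profile of the shared credentials file (~/.aws/credentials).
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. Environment variables take effect without any config file.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"

    # 2. Shared credentials file, as written by `aws configure`.
    cred_file = home / ".aws" / "credentials"
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(str(cred_file))  # a missing file is silently skipped
    if parser.has_section("default"):
        section = parser["default"]
        if section.get("aws_access_key_id") and section.get("aws_secret_access_key"):
            return str(cred_file)

    return None
```

Calling find_aws_credentials() with no arguments inspects the current user's environment and home folder. If it returns None, the "Unable to locate credentials" error above is the likely outcome.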

Hello,

I installed awscli (version 1):

$ aws --version
aws-cli/1.19.8 Python/2.7.17 Linux/5.3.0-64-generic botocore/1.20.8

The only difference is that I installed it by downloading the bundle as described here, rather than using pip. This is because I am using conda virtual environments and it was easier to configure awscli when installed that way.

Here is my current awscli configuration:

$ aws configure
AWS Access Key ID [****************]: 
AWS Secret Access Key [****************]: 
Default region name [eu-west-2]: 
Default output format [json]: 

and here is my current waws configuration:

$ waws --configure
Please Enter Your AWS_ENCODING [CURRENT:json]:  
Please Enter Your AWS_KEY [CURRENT:****************]:  
Please Enter Your AWS_KEY_ID [CURRENT:****************]:  
Please Enter Your AWS_REGION [CURRENT:eu-west-2]:  
Please Enter Your KEY_PATH [CURRENT:~/.ssh/wluper]:  
Please Enter Your USER [CURRENT:Firstname]:  

With this setup, I now get a different error message when I try to download from your AWS buckets:

bash 1_download_relations.sh 
Downloading and extracting relatedTo...
Traceback (most recent call last):
  File "download_utility/download_relations.py", line 68, in <module>
    sys.exit(main(sys.argv[1:]))
  File "download_utility/download_relations.py", line 64, in main
    download(rel, args.data_dir)
  File "download_utility/download_relations.py", line 36, in download
    bucket_name="wluper-retrograph"
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/waws/bucket_manager.py", line 76, in download_file
    bucket.download_file( final_remote_path, final_local_path )
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/inject.py", line 246, in bucket_download_file
    ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/inject.py", line 172, in download_file
    extra_args=ExtraArgs, callback=Callback)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/boto3/s3/transfer.py", line 307, in download_file
    future.result()
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/futures.py", line 265, in result
    raise self._exception
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/tasks.py", line 255, in _main
    self._submit(transfer_future=transfer_future, **kwargs)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/s3transfer/download.py", line 343, in _submit
    **transfer_future.meta.call_args.extra_args
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/site-packages/botocore/client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

P.S. I also tried without providing my AWS key and id, but that resulted in the same error message as before (botocore.exceptions.NoCredentialsError: Unable to locate credentials)

Thank you for trying again and bearing with this issue.

We made a change in the bucket policy and this may affect the results now.

Please try again. If it doesn't work we will open a bigger request internally to also push some fixes to waws.
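
One way to check whether a bucket policy allows public reads, independently of waws and awscli, is an anonymous HTTP HEAD request against the object's URL. The sketch below uses only the Python standard library. It assumes the bucket lives in eu-west-2 (the default region mentioned in this thread, not a confirmed fact about the bucket), and the helper names are hypothetical. The object key has to be supplied by the caller, since the keys used by the download scripts are not shown here.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def public_s3_url(bucket, key, region):
    """Build the virtual-hosted-style HTTPS URL for an S3 object."""
    return "https://{}.s3.{}.amazonaws.com/{}".format(bucket, region, quote(key))


def anonymous_head_status(url, timeout=10):
    """Send an unsigned HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # For example 403 when the bucket policy denies anonymous reads.
        return err.code
```

A status of 200 would suggest the object is publicly readable, while 403 matches the Forbidden error on HeadObject reported above. Note that S3 can also answer 403 for a key that does not exist when the caller is not allowed to list the bucket, so a wrong key and a restrictive policy can look the same.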

My pleasure.

I ran the download scripts once again after the bucket policy update and it's definitely going in the right direction.
I was able to successfully download the relations with the 1_download_relations.sh script and the pretrained adapters with 9_download_pretrained_adapters_rw30.sh and 9_download_pretrained_adapters_omcs.sh scripts.

I was also able to download the CommonsenseQA dataset with csqa_1_download_commonsenseqa.sh. However, there seem to be some issues left with the bucket policies of the other datasets.

Here's the error I get when I try to download GLUE:

$ bash glue_1_download_glue.sh 
Downloading and extracting relatedTo...
Downloaded: cn_relatedTo.txt
	Done!
Downloading and extracting formOf...
Downloaded: cn_formOf.txt
	Done!
Downloading and extracting isA...
Downloaded: cn_isA.txt
	Done!
Downloading and extracting partOf...
Downloaded: cn_partOf.txt
	Done!
Downloading and extracting hasA...
Downloaded: cn_hasA.txt
	Done!
Downloading and extracting usedFor...
Downloaded: cn_usedFor.txt
	Done!
Downloading and extracting capableOf...
Downloaded: cn_capableOf.txt
	Done!
Downloading and extracting atLocation...
Downloaded: cn_atLocation.txt
	Done!
Downloading and extracting causes...
Downloaded: cn_causes.txt
	Done!
Downloading and extracting hasSubevent...
Downloaded: cn_hasSubevent.txt
	Done!
Downloading and extracting hasFirstSubevent...
Downloaded: cn_hasFirstSubevent.txt
	Done!
Downloading and extracting hasLastSubevent...
Downloaded: cn_hasLastSubevent.txt
	Done!
Downloading and extracting hasPrerequisite...
Downloaded: cn_hasPrerequisite.txt
	Done!
Downloading and extracting hasProperty...
Downloaded: cn_hasProperty.txt
	Done!
Downloading and extracting motivatedByGoal...
Downloaded: cn_motivatedByGoal.txt
	Done!
Downloading and extracting obstructedBy...
Downloaded: cn_obstructedBy.txt
	Done!
Downloading and extracting desires...
Downloaded: cn_desires.txt
	Done!
Downloading and extracting createdBy...
Downloaded: cn_createdBy.txt
	Done!
Downloading and extracting synonyms...
Downloaded: cn_synonyms.txt
	Done!
Downloading and extracting antonyms...
Downloaded: cn_antonyms.txt
	Done!
Downloading and extracting distinctFrom...
Downloaded: cn_distinctFrom.txt
	Done!
Downloading and extracting derivedFrom...
Downloaded: cn_derivedFrom.txt
	Done!
Downloading and extracting symbolOf...
Downloaded: cn_symbolOf.txt
	Done!
Downloading and extracting definedAs...
Downloaded: cn_definedAs.txt
	Done!
Downloading and extracting mannerOf...
Downloaded: cn_mannerOf.txt
	Done!
Downloading and extracting locatedNear...
Downloaded: cn_locatedNear.txt
	Done!
Downloading and extracting hasContext...
Downloaded: cn_hasContext.txt
	Done!
Downloading and extracting similarTo...
Downloaded: cn_similarTo.txt
	Done!
Downloading and extracting causesDesire...
Downloaded: cn_causesDesire.txt
	Done!
Downloading and extracting madeOf...
Downloaded: cn_madeOf.txt
	Done!
Downloading and extracting receivesAction...
Downloaded: cn_receivesAction.txt
	Done!
Downloading and extracting CoLA...
Traceback (most recent call last):
  File "download_utility/download_glue.py", line 145, in <module>
    sys.exit(main(sys.argv[1:]))
  File "download_utility/download_glue.py", line 141, in main
    download_and_extract(task, args.data_dir)
  File "download_utility/download_glue.py", line 51, in download_and_extract
    urllib.request.urlretrieve(TASK2PATH[task], data_file)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 248, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/data/anaconda3/envs/retrograph/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

Here's the error I get when I try to download COPA:

$ bash copa_1_download_copa.sh 
An error occurred (403) when calling the HeadObject operation: Forbidden
Download to S3 failed. Please check all parameters are correct. Otherwise, please report on Github.
unzip:  cannot find or open copa_en.zip, copa_en.zip.zip or copa_en.zip.ZIP.
mv: cannot stat 'test_gold.jsonl': No such file or directory
mv: cannot stat 'train.en.jsonl': No such file or directory
mv: cannot stat 'val.en.jsonl': No such file or directory
mv: cannot stat 'copa_en.zip': No such file or directory

Here's the error I get when I try to download SIQA:

$ bash siqa_1_download_siqa.sh 
An error occurred (403) when calling the HeadObject operation: Forbidden
Download to S3 failed. Please check all parameters are correct. Otherwise, please report on Github.
unzip:  cannot find or open socialIQa_v1.4.zip, socialIQa_v1.4.zip.zip or socialIQa_v1.4.zip.ZIP.
mv: cannot stat 'socialIQa_v1.4_dev.jsonl': No such file or directory
mv: cannot stat 'socialIQa_v1.4_trn.jsonl': No such file or directory
mv: cannot stat 'socialIQa_v1.4_tst.jsonl': No such file or directory
mv: cannot stat 'socialIQa_v1.4.zip': No such file or directory
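
The unzip and mv errors in the COPA and SIQA logs are follow-on failures: the scripts keep going after the download itself was refused. A minimal sketch of a guard that stops at the first real problem is shown below. It is an illustration in Python rather than a patch to the repository's bash scripts, and the function name extract_if_downloaded is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_if_downloaded(archive, dest):
    """Extract `archive` into `dest`, failing early if the download is missing.

    Returns the list of member names that were extracted.
    """
    archive = Path(archive)
    if not archive.is_file():
        raise FileNotFoundError(
            "{} not found: the download step failed, nothing to extract".format(archive)
        )
    if not zipfile.is_zipfile(str(archive)):
        raise ValueError("{} is not a valid zip archive".format(archive))
    with zipfile.ZipFile(str(archive)) as zf:
        zf.extractall(str(dest))
        return zf.namelist()
```

With a guard like this, a refused download surfaces as a single error about the missing archive instead of a chain of "cannot stat" messages.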

Awesome! I think everything should be fixed now.

There was actually another problem as well:
Issue: #12
(The GLUE download links were out-of-date -> Google Firebase Links)

This PR was submitted to fix it:
#11
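
Out-of-date links like these are easier to spot when the download reports which URL failed and why, rather than ending in a bare traceback. The following is a hedged sketch using the same urllib.request.urlretrieve call that appears in the GLUE traceback. The function name try_download is hypothetical and is not taken from download_glue.py.

```python
import urllib.error
import urllib.request


def try_download(url, dest):
    """Download `url` to `dest`; return (ok, message) instead of raising."""
    try:
        urllib.request.urlretrieve(url, dest)
    except urllib.error.HTTPError as err:
        # HTTPError is a subclass of URLError, so it must be caught first.
        return False, "HTTP {} {} for {}".format(err.code, err.reason, url)
    except urllib.error.URLError as err:
        return False, "could not reach {}: {}".format(url, err.reason)
    return True, "saved to {}".format(dest)
```

Looping over a task-to-URL mapping (such as the TASK2PATH dictionary referenced in the traceback) and printing the message for every failed entry would show at a glance which links need updating.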


So hopefully, now everything should be fixed. (At least for downloading the things!)

Great!
I can confirm that I was able to execute all download scripts from the repository.
Next step: training a model :)

Awesome! Thanks for running it again.

Hi,

First of all, I want to thank you for providing your code and all the support on the occurring issues!

Unfortunately, I had the same issues as user ghost. I searched the closed issues and followed the advice in this thread. Everything worked out, except that the error "botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden" is occurring for me again. In the thread, the issue was said to be fixed, but it now seems to have reappeared. Do you have any advice?

Here's the full error log:

Downloading and extracting relatedTo...
Traceback (most recent call last):
  File "download_utility/download_relations.py", line 68, in <module>
    sys.exit(main(sys.argv[1:]))
  File "download_utility/download_relations.py", line 64, in main
    download(rel, args.data_dir)
  File "download_utility/download_relations.py", line 36, in download
    bucket_name="wluper-retrograph"
  File "/usr/local/lib/python3.7/dist-packages/waws/bucket_manager.py", line 76, in download_file
    bucket.download_file( final_remote_path, final_local_path )
  File "/usr/local/lib/python3.7/dist-packages/boto3/s3/inject.py", line 283, in bucket_download_file
    Config=Config,
  File "/usr/local/lib/python3.7/dist-packages/boto3/s3/inject.py", line 195, in download_file
    callback=Callback,
  File "/usr/local/lib/python3.7/dist-packages/boto3/s3/transfer.py", line 320, in download_file
    future.result()
  File "/usr/local/lib/python3.7/dist-packages/s3transfer/futures.py", line 103, in result
    return self._coordinator.result()
  File "/usr/local/lib/python3.7/dist-packages/s3transfer/futures.py", line 266, in result
    raise self._exception
  File "/usr/local/lib/python3.7/dist-packages/s3transfer/tasks.py", line 269, in _main
    self._submit(transfer_future=transfer_future, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/s3transfer/download.py", line 357, in _submit
    **transfer_future.meta.call_args.extra_args,
  File "/usr/local/lib/python3.7/dist-packages/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.7/dist-packages/botocore/client.py", line 960, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden