Cannot use S3-related functions
Closed this issue · 3 comments
trojblue commented
Sysinfo:
(base) ubuntu@ip-10-53-8-252:~$ pip show megfile
Name: megfile
Version: 2.2.9.post3
Summary: Megvii file operation library
Home-page: https://github.com/megvii-research/megfile
Author: megvii
Author-email: megfile@megvii.com
License:
Location: /home/ubuntu/miniconda3/lib/python3.10/site-packages
Requires: boto3, botocore, paramiko, pyyaml, requests, tqdm
Required-by:
(base) ubuntu@ip-10-53-8-252:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal
Issue:
I tried using it from the CLI, and here's what happened:
(base) ubuntu@ip-10-53-8-252:~$ aws s3 ls s3://dataset-ingested/user-preference/
PRE pref_100k_min513x768/
PRE pref_100k_min513x768_YIELD/
PRE sd-human-ft/
PRE sd-user-pref-50k-ft-gpt/
PRE sd-user-pref-75k-ft/
PRE sd-user-pref-v2-large-full/
2023-07-04 00:41:59 0
2023-07-11 06:16:49 1444 README.md
(base) ubuntu@ip-10-53-8-252:~$
(base) ubuntu@ip-10-53-8-252:~$ megfile ls s3://dataset-ingested/user-preference/
[S3UnknownError] Unknown error encountered: 's3://dataset-ingested/user-preference/', error: botocore.exceptions.ClientError('An error occurred (PermanentRedirect) when calling the ListObjectsV2 operation: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.'), endpoint: 'https://s3.amazonaws.com'
(base) ubuntu@ip-10-53-8-252:~$
The AWS credentials were configured with aws configure and work with the AWS CLI.
I also tried using it in Python:
from megfile import smart_walk
s3_directory = 's3://dataset-ingested/user-preference/'
# Walking through the directory
for root, dirs, files in smart_walk(s3_directory):
    print(f"Current directory: {root}")
    print(f"Subdirectories: {dirs}")
    print(f"Files: {files}")
    print("-" * 20)
Error message:
---------------------------------------------------------------------------
ClientError Traceback (most recent call last)
File ~/miniconda3/lib/python3.10/site-packages/megfile/s3_path.py:1534, in S3Path.is_dir(self, followlinks)
1533 try:
-> 1534 resp = self._client.list_objects_v2(
1535 Bucket=bucket, Prefix=prefix, Delimiter='/', MaxKeys=1)
1536 except Exception as error:
File ~/miniconda3/lib/python3.10/site-packages/botocore/client.py:535, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
534 # The "self" in this scope is referring to the BaseClient.
--> 535 return self._make_api_call(operation_name, kwargs)
File ~/miniconda3/lib/python3.10/site-packages/botocore/client.py:980, in BaseClient._make_api_call(self, operation_name, api_params)
979 error_class = self.exceptions.from_code(error_code)
--> 980 raise error_class(parsed_response, operation_name)
981 else:
ClientError: An error occurred (PermanentRedirect) when calling the ListObjectsV2 operation: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.
The above exception was the direct cause of the following exception:
S3UnknownError Traceback (most recent call last)
/home/ubuntu/dev/data-processings/ingested_gdl_twitter_processings.ipynb Cell 18 line 6
3 s3_directory = 's3://dataset-ingested/user-preference/'
5 # Walking through the directory
----> 6 for root, dirs, files in smart_walk(s3_directory):
7 print(f"Current directory: {root}")
8 print(f"Subdirectories: {dirs}")
File ~/miniconda3/lib/python3.10/site-packages/megfile/s3_path.py:2040, in S3Path.walk(self, followlinks)
2037 if not bucket:
2038 raise UnsupportedError('Walk whole s3', self.path_with_protocol)
-> 2040 if not self.is_dir():
2041 return
2043 stack = [key]
File ~/miniconda3/lib/python3.10/site-packages/megfile/s3_path.py:1540, in S3Path.is_dir(self, followlinks)
1537 error = translate_s3_error(error, self.path_with_protocol)
1538 if isinstance(error,
1539 (S3UnknownError, S3ConfigError, S3PermissionError)):
-> 1540 raise error
1541 return False
1543 if not key: # bucket is accessible
S3UnknownError: Unknown error encountered: 's3://dataset-ingested/user-preference/', error: botocore.exceptions.ClientError('An error occurred (PermanentRedirect) when calling the ListObjectsV2 operation: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.'), endpoint: 'https://s3.amazonaws.com'
The bucket I'm trying to access is in the same region as my configuration (from aws configure). Any clue what happened? Thanks.
LoveEatCandy commented
I guess it's a bug: megfile may not be reading all settings from the configuration file. Did you set region_name with aws configure?
LoveEatCandy commented
I tested it, and the region configuration in the file is read correctly.
This error message means the region you are using is different from the bucket's region. Please check your region configuration.
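One way to compare the two regions is sketched below with boto3, which megfile depends on. This is a rough check, not megfile's own logic: the helper names are hypothetical, the call needs the s3:GetBucketLocation permission, and it may itself fail under some region or endpoint setups. The mapping of an empty LocationConstraint to us-east-1 and of the legacy value EU to eu-west-1 follows the S3 GetBucketLocation API.

```python
from urllib.parse import urlparse


def bucket_from_s3_url(url):
    """Return the bucket name from an s3://bucket/key URL."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL with a bucket: {url!r}")
    return parsed.netloc


def region_from_location_constraint(constraint):
    """Map a GetBucketLocation LocationConstraint value to a region name."""
    if not constraint:
        # S3 reports no constraint for buckets in us-east-1.
        return "us-east-1"
    if constraint == "EU":
        # Legacy value that stands for eu-west-1.
        return "eu-west-1"
    return constraint


def compare_regions(url):
    """Return (configured_region, bucket_region) for the given s3:// URL."""
    # Imported lazily so the pure helpers above work without boto3 installed.
    import boto3

    session = boto3.session.Session()
    bucket = bucket_from_s3_url(url)
    response = session.client("s3").get_bucket_location(Bucket=bucket)
    bucket_region = region_from_location_constraint(
        response.get("LocationConstraint")
    )
    # session.region_name is None when no region is configured.
    return session.region_name, bucket_region
```

Calling compare_regions('s3://dataset-ingested/user-preference/') and seeing two different values (or None for the configured region) would be consistent with the PermanentRedirect error above.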
If the region is correct, please share the debug logs, which can be enabled like this:
import logging
logging.basicConfig(level=logging.DEBUG)
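If the full debug output is too noisy, a narrower variant is to send only the botocore and boto3 loggers to a file that can be attached to the issue. This is a minimal sketch using the standard logging module; the function name and the default file name are hypothetical, and the output should be reviewed before sharing since request details may appear in it.

```python
import logging


def enable_s3_debug_log(path="s3_debug.log"):
    """Write DEBUG records from the botocore and boto3 loggers to a file."""
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    for name in ("botocore", "boto3"):
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(handler)
    # Returned so the caller can remove or close the handler later.
    return handler
```

Call enable_s3_debug_log() once before running the failing megfile call, then share the resulting file.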
Thanks.
LoveEatCandy commented
Please reopen if the problem persists.