Reading a file creates a Bucket GET request
Mountpoint for Amazon S3 version
mount-s3 1.8.0
AWS Region
us-east-1
Describe the running environment
Locally running Mountpoint for Amazon S3 against an AWS bucket, using credentials stored in environment variables. Ubuntu 20.04.
Mountpoint options
mount-s3 <bucket> <file_dir> --endpoint-url <endpoint> --read-only --force-path-style --auto-unmount --prefix <prefix>
What happened?
I'm fetching thousands of very small files and realized I was getting rate limited on the number of Bucket GET requests, which appear to be Mountpoint listing the location for each file read.
I am not listing the contents of the bucket: I know the file names beforehand and am reading each path directly without listing the folder. However, I am seeing as many Bucket GET requests as Object GET and Object HEAD requests.
Is this expected? Is there a way to avoid listing the folder for every GET request?
Relevant log output
No response
Hi, thank you for opening the issue. I assume the "Bucket GET requests" you are referring to are unexpected ListObjectsV2 requests. You've mentioned getting rate limited; are you seeing 503 errors on the ListObjectsV2 requests?
Before reading a file, Mountpoint makes both a ListObjectsV2 and a HeadObject request for the specified path. This mechanism ensures the shadowing semantics (e.g. a directory dir/ "shadows" a file dir).
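To illustrate why the listing is needed (the bucket name, keys, and mount point below are placeholders, not taken from this issue): if a bucket contains both an object named dir and objects under dir/, Mountpoint has to list the prefix to decide whether the path resolves to a file or a directory.
aws s3api put-object --bucket example-bucket --key dir --body some-file.txt
aws s3api put-object --bucket example-bucket --key dir/nested.txt --body nested.txt
mount-s3 example-bucket /mnt/example
ls /mnt/example/dir    # resolves to a directory containing nested.txt; the object named "dir" is shadowed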
There were related issues opened previously:
You may avoid repeated ListObjectsV2 operations for a given file by using --metadata-ttl <SECONDS>. Also, I see that you're already using the --prefix argument; choosing a longer prefix may reduce the number of ListObjectsV2 requests in the case of nested directories.
Ah, I see. I missed the previous issues. I'll give it a shot with --metadata-ttl indefinite and see if it helps. Thanks!