simonw/s3-credentials

`s3-credentials put-objects` command

simonw opened this issue · 13 comments

It's frustrating when using s3-credentials put-object that you have to specify the key name each time, rather than deriving that from the filename:

s3-credentials put-object simonwillison-cors-allowed-public \
  click_default_group-1.2.2-py3-none-any.whl \
  /tmp/click-default-group/dist/click_default_group-1.2.2-py3-none-any.whl

One way to fix this would be with an s3-credentials put-objects command that works like this:

s3-credentials put-objects simonwillison-cors-allowed-public /tmp/click-default-group/dist/click_default_group-1.2.2-py3-none-any.whl

It could accept multiple files (hence the plural name) and could also accept directories and recursively upload their contents.
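Deriving the key from the filename amounts to taking the final path component. A minimal sketch of that idea, where `key_for_file` is a hypothetical helper name and not something from the s3-credentials codebase:

```python
import os


def key_for_file(path):
    # Hypothetical helper: use the last path component as the S3 key
    return os.path.basename(path)


print(key_for_file(
    "/tmp/click-default-group/dist/click_default_group-1.2.2-py3-none-any.whl"
))
```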

If you pass a directory it will upload its contents to the root of the bucket (update: no it won't, see the later comment):

s3-credentials put-objects my-bucket ~/path/to/my/directory

You can use --prefix to change this:

s3-credentials put-objects my-bucket ~/path/to/my/directory --prefix=stuff

This will put the files under the stuff/ prefix instead.

This should try to stay as consistent as possible with the new get-objects command, which looks like this:

s3-credentials/docs/help.md

Lines 167 to 196 in 047020a

Usage: s3-credentials get-objects [OPTIONS] BUCKET [KEYS]...

  Download multiple objects from an S3 bucket

  To download everything, run:

      s3-credentials get-objects my-bucket

  Files will be saved to a directory called my-bucket. Use -o dirname to save to
  a different directory.

  To download specific keys, list them:

      s3-credentials get-objects my-bucket one.txt path/two.txt

  To download files matching a glob-style pattern, use:

      s3-credentials get-objects my-bucket --pattern '*/*.js'

Options:
  -o, --output DIRECTORY  Write to this directory instead of one matching the
                          bucket name
  -p, --pattern TEXT      Glob patterns for files to download, e.g. '*/*.js'
  -s, --silent            Don't show progress bar
  --access-key TEXT       AWS access key ID
  --secret-key TEXT       AWS secret access key
  --session-token TEXT    AWS session token
  --endpoint-url TEXT     Custom endpoint URL
  -a, --auth FILENAME     Path to JSON/INI file containing credentials
  --help                  Show this message and exit.

I wonder if this should support --pattern too in the same way that get-objects does?

Would be useful. The -p shortcut could get confused with --prefix though, need to think about that.

I think -p should be the shortcut for --pattern, for consistency with get-objects, and --prefix should have no shortcut version.

What should --prefix foo do against a folder with one.txt and two.txt, as opposed to --prefix foo/?

  • fooone.txt and footwo.txt
  • foo/one.txt and foo/two.txt

I can't think of many reasons people would want the first. I think I'm going to add the missing / if it isn't there.
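Adding the missing / could look something like this. It is only a sketch: `normalize_prefix` is a hypothetical name, and treating an empty or missing prefix as "no prefix" is an assumption.

```python
def normalize_prefix(prefix):
    # Hypothetical helper: make sure a non-empty prefix ends with "/"
    # so that "foo" and "foo/" both produce keys like foo/one.txt
    if not prefix:
        return ""
    if not prefix.endswith("/"):
        prefix += "/"
    return prefix


for name in ("one.txt", "two.txt"):
    print(normalize_prefix("foo") + name)
```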

If you pass a directory it will upload its contents to the root of the bucket:

s3-credentials put-objects my-bucket ~/path/to/my/directory

I'm not convinced that this is the right design decision. It makes sense for a single directory, but what if you do this?

s3-credentials put-objects bucket one.txt two three

Where two and three are folders, not files?

It would be surprising if the contents of those folders were all flattened into the root of the bucket.

If a user wants to upload the contents of a directory they can do so using * like this:

s3-credentials put-objects bucket my-folder/*

This would upload every file in that folder to the root of the bucket.
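To make the proposed mapping concrete: files go to the root of the bucket under their own name, while directories keep their name as the start of each key. Here is a rough sketch of that rule. `plan_uploads` is a hypothetical function, not the actual implementation, and the choice to take the directory name from the resolved path is an assumption.

```python
from pathlib import Path


def plan_uploads(paths, prefix=""):
    """Return (local_path, key) pairs for the given files and directories."""
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    pairs = []
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            # resolve() so that paths like "../two/" still give the name "two"
            dir_name = path.resolve().name
            for file in sorted(path.rglob("*")):
                if file.is_file():
                    # Key is the directory name plus the path inside it
                    rel = file.relative_to(path).as_posix()
                    pairs.append((str(file), f"{prefix}{dir_name}/{rel}"))
        else:
            # Plain files are keyed by their filename only
            pairs.append((str(path), prefix + path.name))
    return pairs
```

With this rule, `put-objects bucket one.txt two three` would produce one.txt at the root plus keys starting with two/ and three/, so nothing from the folders gets flattened.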

Draft help:

Usage: s3-credentials put-objects [OPTIONS] BUCKET FILES...

  Upload multiple objects to an S3 bucket

  Pass one or more files to upload them:

      s3-credentials put-objects my-bucket one.txt two.txt

  These will be saved to the root of the bucket. To save to a different
  location use the --prefix option:

      s3-credentials put-objects my-bucket one.txt two.txt --prefix my-folder

  This will upload them as my-folder/one.txt and my-folder/two.txt.

  If you pass a directory it will be uploaded recursively:

      s3-credentials put-objects my-bucket my-folder

  This will create keys in my-folder/... in the S3 bucket.

  To upload all files in a folder to the root of the bucket instead use this:

      s3-credentials put-objects my-bucket my-folder/*

Options:
  -p, --pattern TEXT    Glob patterns for files to upload, e.g. '*/*.js'
  --prefix TEXT         Prefix to add to the files within the bucket
  -s, --silent          Don't show progress bar
  --access-key TEXT     AWS access key ID
  --secret-key TEXT     AWS secret access key
  --session-token TEXT  AWS session token
  --endpoint-url TEXT   Custom endpoint URL
  -a, --auth FILENAME   Path to JSON/INI file containing credentials
  --help                Show this message and exit.

Maybe the --pattern option isn't needed here, since users can use their shell for that instead. Having --pattern as well is confusing, especially when I look at the help example above and see that * can sometimes be in single quotes and sometimes not.

I'm going to try this without --pattern first.

This command definitely needs a --dry-run option.

Yeah the --dry-run option is VERY useful:

% s3-credentials put-objects bucket out setup.py ../datasette --dry-run             
out/IMG_1254.jpeg => s3://bucket/out/IMG_1254.jpeg
out/alverstone-mead-2.jpg => s3://bucket/out/alverstone-mead-2.jpg
setup.py => s3://bucket/setup.py
../datasette/pelicans.db => s3://bucket/datasette/pelicans.db
../datasette/.git-blame-ignore-revs => s3://bucket/datasette/.git-blame-ignore-revs
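The output format above is simple enough to sketch. This assumes the command has already worked out a list of (local path, key) pairs; `dry_run_lines` is a hypothetical helper used for illustration only.

```python
def dry_run_lines(bucket, pairs):
    # Hypothetical helper: describe each planned upload without performing it
    return [f"{local} => s3://{bucket}/{key}" for local, key in pairs]


pairs = [
    ("setup.py", "setup.py"),
    ("../datasette/pelicans.db", "datasette/pelicans.db"),
]
for line in dry_run_lines("bucket", pairs):
    print(line)
```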

Did some manual testing:

% s3-credentials put-objects blah-bucket-blah setup.py 
Uploading 1.1 KB (1 file)  [####################################]  100%
% s3-credentials put-objects blah-bucket-blah setup.py README.md 
Uploading 2.1 KB (2 files)  [####################################]  100%
% s3-credentials put-objects blah-bucket-blah setup.py README.md --prefix wat
Uploading 2.1 KB (2 files)  [####################################]  100%
% s3-credentials list-bucket blah-bucket-blah --prefix wat
[
  {
    "Key": "wat/README.md",
    "LastModified": "2022-09-15 23:29:24+00:00",
    "ETag": "\"7ea759333f112c85a6e4f7419da969a6\"",
    "Size": 985,
    "StorageClass": "STANDARD"
  },
  {
    "Key": "wat/setup.py",
    "LastModified": "2022-09-15 23:29:24+00:00",
    "ETag": "\"08b0feee3683ec976b8f772465857c8e\"",
    "Size": 1117,
    "StorageClass": "STANDARD"
  }
]

While writing tests for this I learned that the progress bar was output to stdout and not stderr.

Fix for that is:

with click.progressbar(length=size, label="Uploading", file=sys.stderr) as bar:

Applying file=sys.stderr to the other places that use progress bars too.