Seek feedback from AWS experts
simonw opened this issue · 12 comments
I'm not an AWS expert. I would feel a lot more comfortable if some AWS experts could review this tool and make sure that what it is doing makes sense and there are no unpleasant flaws in the approach it is taking.
I'm particularly interested in soundness-checks of the policy documents I am using for read-only, read-write and write-only in this file: https://github.com/simonw/s3-credentials/blob/main/s3_credentials/policies.py
I wrote up a bunch more detail about how this works on my blog: https://simonwillison.net/2021/Nov/3/s3-credentials/
I'm a cloud idiot for a large consumer-facing service in Australia. IAM is part of my daily life.
My suggestions:
- specify individual actions explicitly (no wildcards)
- separate permissions by resource (Buckets vs. Objects)
Sid
is unnecessary
Your read/write policy is good, but instead of *Object
, list GetObject
and PutObject
.
Your read-only policy would be better written like your read/write policy, one section for the bucket permission (ListBucket
), one for the object permission (which should be GetObject
, no wildcard).
Your write-only policy is great as is.
You may want to add additional permissions to let clients set ACLs. But if it's all simple object-by-object stuff, these very simple policies are great.
This is fantastic, thank you!
The thing I need the most help with now is deciding whether or not to use action wildcards and - if not - which S3 action should go in the list for read-only and for read-write, see:
I implemented --policy custom-policy.json
to allow people to apply entirely custom policies in:
First off, thanks for making this tool! AWS is daunting for the uninitiated and tools that simplify common AWS tasks are very helpful.
I second @jdub's suggestions with a few of my own:
- An optional flag for attaching bucket policies to the new s3 bucket. These are just like IAM user policies, but attached to the bucket itself.
- Enable Server Side Encryption with AWS S3-Manged-Keys by default
- Require a flag to enable public access to the bucket rather than making the bucket public by default. See docs
- Flag for setting canned ACL policies
Also, take a look at the s3 security best practices in the official docs.
@zacaytion these are really great suggestions, thank you. Will split these out into separate issues.
Enable Server Side Encryption with AWS S3-Manged-Keys by default
I researched this in #18 but as far as I can tell there isn't a bucket-level way of saying "encrypt all contents in this bucket" - you need to include the x-amz-server-side-encryption: AES256
header every time you upload an object. You can add a policy to the bucket that rejects uploads that don't use that header, but that's a bit of a weird workaround.
So I'm considering that out-of-scope for this tool, though users can set their own custom policies for this (especially if I allow custom bucket policies in #19).
I am by no means an AWS expert! However, I must confess that I was expecting the user created for this s3 bucket to have permission to access only that bucket. Instead, the user's permissions allow access to any s3 bucket in the account.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:*",
"s3-object-lambda:*"
],
"Resource": "*"
}
]
}
Update: Aaah, I think it's because I followed the instructions in the announcement post:
AWS_PROFILE=personal s3-credentials create my-project --bucket-region eu-west-2 --statement '{
"Effect": "Allow",
"Action": "textract:*",
"Resource": "*"
}' --create-bucket > ocr.json
This is discussed a bit in the --statement
issue. Indeed, it would be nice if there were a default statement which set the resource based upon the bucket that had been created.
I think fundamentally, this project could be split into two pieces. The part that manages bucket and principal policies for access to S3 is really useful. The part that provisions long-term credentials via IAM Users is something I'd like to discourage. In general, long-term credentials are only appropriate when there is no trusted source of identity that can be relied on.
For your local dev machine, you (a human) are the source of identity. IAM Identity Center (formerly AWS SSO) provides the best way to have AWS credentials on behalf of a human. Log in once per day, and your tools get access to short-lived AWS credentials.
For something like an EC2 instance, AWS is the trusted source of identity, and can provide short-lived credentials by giving the instance an IAM role.
For something like GitHub Actions, the provider is a trusted source of identity, and you can use the GitHub OIDC token provided to an action to assume an IAM role, giving the action short-lived AWS credentials with no secrets embedded in your action at all, see https://github.com/aws-actions/configure-aws-credentials
If you're running an automated process on, say, a server sitting under your desk at home, that's the case where you don't have a trusted source of identity, and you end up needing long-term credentials of some kind. But because this should be the exception rather than the rule, placing IAM user creation as the default, central part of this tool is a mistake imo.
Hi @benkehoe - thanks very much for dropping by. Your feedback is definitely the kind of thing I've been looking for.
I built this tool because I have a whole bunch of cases where I've needed to create long-term IAM credentials for accessing an S3 bucket, where the STS route isn't applicable.
All of these are things I've run into:
- Running Transmit on my Mac laptop to upload files to a bucket
- Backup scripts running on a Digital Ocean machine in a cron
- An application deployed to Vercel which needs to pull images from a private S3 bucket, then resize and serve them: https://github.com/simonw/s3-image-proxy
- The "machine under my desk" case - a machine in my home that backs things up to S3
And now a confession: I would like to use OIDC for my GitHub Actions (I use long-term credentials for those instead a the moment). But I've read that GitHub OIDC documentation multiple times and it utterly befuddles me - I cannot figure it out!
I swear OIDC between GitHub and AWS is one of the hardest problems in computer science. I really, really wish someone would turn that into a sequence of steps that I can follow. I used to feel ashamed about this, but I think it's a case where it's not me, it's them.
I have this note on the documentation homepage at the moment: https://s3-credentials.readthedocs.io/en/stable/
If your code is running in EC2 or Lambda you can likely solve this using roles instead. This tool is mainly useful for when you are interacting with S3 from outside the boundaries of AWS itself.
Definitely happy to tighten that up with additional notes and recommendations.