Aws::Errors::MissingCredentialsError exception in ECS when using TaskRoleArn
TJNII opened this issue · 10 comments
Describe the bug
I'm attempting to use ECS TaskRoleArn to enable application access to AWS.
- https://docs.aws.amazon.com/AmazonECS/latest/bestpracticesguide/security-iam-roles.html
- https://docs.aws.amazon.com/sdkref/latest/guide/feature-container-credentials.html
Looking at my container's Docker configuration, I see AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is set. Based on my reading of the guides, I'm under the impression this should "just work".
Expected Behavior
The SDK should authenticate to AWS using the task role credentials.
Current Behavior
2024-02-05 04:19:02 - Aws::Errors::MissingCredentialsError - unable to sign request without credentials set (Aws::Errors::MissingCredentialsError):
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/sign.rb:120:in `rescue in initialize'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/sign.rb:109:in `initialize'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/sign.rb:34:in `new'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/sign.rb:34:in `signer_for'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/sign.rb:46:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/transfer_encoding.rb:26:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/helpful_socket_errors.rb:12:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/retry_errors.rb:362:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/user_agent.rb:37:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/http_checksum.rb:20:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/endpoint_pattern.rb:30:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/checksum_algorithm.rb:137:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/request_compression.rb:94:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/query/handler.rb:30:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/recursion_detection.rb:18:in `call'
/usr/local/bundle/gems/aws-sdk-sns-1.71.0/lib/aws-sdk-sns/plugins/endpoints.rb:43:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/endpoint_discovery.rb:84:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/seahorse/client/plugins/endpoint.rb:47:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/param_validator.rb:26:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/seahorse/client/plugins/raise_response_errors.rb:16:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/checksum_algorithm.rb:111:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:16:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/idempotency_token.rb:19:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/param_converter.rb:26:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/seahorse/client/plugins/request_callback.rb:89:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/response_paging.rb:12:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/seahorse/client/plugins/response_target.rb:24:in `call'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/seahorse/client/request.rb:72:in `send_request'
/usr/local/bundle/gems/aws-sdk-sns-1.71.0/lib/aws-sdk-sns/client.rb:1858:in `publish'
/usr/local/bundle/gems/aws-sdk-sns-1.71.0/lib/aws-sdk-sns/topic.rb:375:in `block in publish'
/usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/plugins/user_agent.rb:28:in `feature'
/usr/local/bundle/gems/aws-sdk-sns-1.71.0/lib/aws-sdk-sns/topic.rb:374:in `publish'
[Snip trace outside the SDK]
Reproduction Steps
sns = Aws::SNS::Resource.new
topic = sns.topic(topic_arn)
topic.publish(content)
- Run in an ECS task using taskRoleArn
Possible Solution
No response
Additional Information/Context
All gem versions:
aws-eventstream (1.3.0)
aws-partitions (1.887.0)
aws-sdk-core (3.191.0)
aws-eventstream (~> 1, >= 1.3.0)
aws-partitions (~> 1, >= 1.651.0)
aws-sigv4 (~> 1.8)
aws-sdk-sns (1.71.0)
aws-sdk-core (~> 3, >= 3.191.0)
aws-sigv4 (~> 1.1)
aws-sigv4 (1.8.0)
aws-eventstream (~> 1, >= 1.0.2)
aws-sdk-sns (~> 1.0)
Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version
aws-sdk-sns (1.71.0) aws-sdk-core (3.191.0)
Environment details (Version of Ruby, OS environment)
public.ecr.aws/docker/library/ruby:3.2
Sorry you're running into this - you are right, credentials from ECS TaskRoleArn should just work.
What is AWS_CONTAINER_CREDENTIALS_RELATIVE_URI set to, and are you setting it manually? Are there any other ENV vars set?
I'm not setting it, it's being set by ECS. It's being set to the following, redacting the UUID as I assume that's something I shouldn't share:
AWS_CONTAINER_CREDENTIALS_RELATIVE_URI=/v2/credentials/[UUID]
There are no other ENV vars that appear auth-related. This is the complete set of container ENV vars, minus the ones specific to my app, which have a unique, app-specific prefix:
AWS_EXECUTION_ENV=AWS_ECS_EC2
AWS_CONTAINER_CREDENTIALS_RELATIVE_URI=/v2/credentials/[UUID]
ECS_CONTAINER_METADATA_URI=http://169.254.170.2/v3/[UUID]
ECS_CONTAINER_METADATA_URI_V4=http://169.254.170.2/v4/[UUID]
ECS_AGENT_URI=http://169.254.170.2/api/[UUID]
PATH=/usr/local/bundle/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
LANG=C.UTF-8
RUBY_VERSION=3.2.3
RUBY_DOWNLOAD_URL=https://cache.ruby-lang.org/pub/ruby/3.2/ruby-3.2.3.tar.xz
RUBY_DOWNLOAD_SHA256=cfb231954b8c241043a538a4c682a1cca0b2016d835fee0b9e4a0be3ceba476b
GEM_HOME=/usr/local/bundle
BUNDLE_SILENCE_ROOT_WARNING=1
BUNDLE_APP_CONFIG=/usr/local/bundle
From your host, can you try requesting http://169.254.170.2/v2/credentials/[UUID] to see if this endpoint responds?
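The same check can be run from inside the container with only the Ruby standard library. This is a minimal sketch, not SDK code: `container_credentials_uri` and `endpoint_reachable?` are hypothetical helper names, and the 2-second timeout is an arbitrary choice. It assumes the credentials host is 169.254.170.2 on port 80, as shown in the ENV vars above.

```ruby
require "net/http"
require "uri"

# Build the full credentials URL from the relative path that ECS injects.
# Returns nil when the variable is absent or empty.
def container_credentials_uri(env, host: "169.254.170.2")
  path = env["AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"]
  return nil if path.nil? || path.empty?

  URI("http://#{host}#{path}")
end

# Hypothetical helper: true only if the endpoint answers with a 2xx status
# within the timeout. Any network-level failure counts as unreachable.
def endpoint_reachable?(uri, timeout: 2)
  response = Net::HTTP.start(uri.host, uri.port,
                             open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess)
rescue SystemCallError, SocketError, Timeout::Error, IOError
  false
end

uri = container_credentials_uri(ENV)
if uri.nil?
  puts "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set"
else
  # Print only the host so the credential path is not leaked into logs.
  puts "credentials endpoint on #{uri.host} reachable: #{endpoint_reachable?(uri)}"
end
```

A `false` result here points at the ECS agent or the container network rather than at the SDK.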
We can get more insight if you try initializing these credentials manually:
Aws::ECSCredentials.new(http_debug_output: $stdout) # or your logger
Also, please move away from Resource models (use Client); they are not recommended because they are hand-maintained and do not get updates.
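For reference, the reproduction snippet could be expressed against the client roughly as follows. This is a sketch under assumptions: `publish_message` is a hypothetical wrapper, `SNS_TOPIC_ARN` is a made-up variable name used only to guard the example, and the region is assumed to be configured elsewhere (for example through `AWS_REGION`). `Aws::SNS::Client#publish` takes `topic_arn:` and `message:` keywords.

```ruby
# Publish through any object that responds to #publish(topic_arn:, message:),
# which is the keyword form Aws::SNS::Client#publish accepts.
def publish_message(client, topic_arn, message)
  client.publish(topic_arn: topic_arn, message: message)
end

# SNS_TOPIC_ARN is a hypothetical variable name used only for this sketch.
if (topic_arn = ENV["SNS_TOPIC_ARN"])
  require "aws-sdk-sns"

  # With no explicit credentials, the client uses the default provider chain,
  # which is where the ECS container credentials come from.
  # Assumes a region is already configured, e.g. via AWS_REGION.
  client = Aws::SNS::Client.new
  response = publish_message(client, topic_arn, "hello from ECS")
  puts response.message_id
end
```

Passing the client in as an argument keeps the publishing code easy to exercise with a stand-in object when no AWS access is available.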
Thanks for the quick response and good troubleshooting steps. Once I realized how this is supposed to work, I was able to root cause the issue: the ECS agent metadata endpoint wasn't responding. Once I resolved that, Aws::ECSCredentials.new(http_debug_output: $stdout) was able to gather credentials as expected. For anyone else here from Google (there were exactly 0 hits for "ecs taskRoleArn Aws::Errors::MissingCredentialsError" last night), the solution was a host reboot.
To try to reproduce the issue, I ran a container without network access and instantiated an Aws::ECSCredentials object:
irb(main):002:0> Aws::ECSCredentials.new(http_debug_output: $stdout)
opening connection to 169.254.170.2:80...
opening connection to 169.254.170.2:80...
opening connection to 169.254.170.2:80...
opening connection to 169.254.170.2:80...
opening connection to 169.254.170.2:80...
opening connection to 169.254.170.2:80...
=>
#<Aws::ECSCredentials:0x00007f0b8cbc05b8
@async_refresh=false,
@backoff=#<Proc:0x00007f0b8c26f4a0 /usr/local/bundle/gems/aws-sdk-core-3.191.0/lib/aws-sdk-core/ecs_credentials.rb:177 (lambda)>,
@before_refresh=nil,
@credential_path="/v2/credentials/1ffa0a10-9270-4233-985b-3bdd542f4b88",
@credentials=#<Aws::Credentials access_key_id=nil>,
@expiration=nil,
@host="169.254.170.2",
@http_debug_output=#<IO:<STDOUT>>,
@http_open_timeout=5,
@http_read_timeout=5,
@mutex=#<Thread::Mutex:0x00007f0b8c26f400>,
@port=80,
@retries=5,
@scheme="http">
I also reproduced the original exception. So this isn't a bug, since it failed due to the ecs-agent not responding, but I do think Aws::ECSCredentials should emit a warning if AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is set but 169.254.170.2:80 is unresponsive. That's an ENV var I doubt will ever be encountered outside ECS, so if 169.254.170.2 isn't responding, that's an error worth bubbling up to the user. If I had realized last night that the http://169.254.170.2 ENV vars mattered, I probably would have been able to figure it out.
So I think we can remove the bug label, but would you be open to leaving this open as a feature request to add a warning-level log to the Aws::ECSCredentials class?
Also, please move away from Resource models (use Client); they are not recommended because they are hand-maintained and do not get updates.
FYI https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/sns-example-send-message.html still documents using Aws::SNS::Resource.
Related to: #2823
I'll contact the docs examples team to have that rewritten. Thanks for pointing that out.
The path forward is to call Kernel.warn for the various credential sources (EC2 instance, ECS, Process, etc.) when they fail to load. This should be fine to do because those credential sources are only initialized when certain hint checks pass (e.g. the existence of that relative URI ENV variable).
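The idea can be illustrated with a small sketch. This is not the SDK's implementation: `warn_if_unloaded` and `StubProvider` are hypothetical names, and the stub stands in for a real provider by exposing `set?` in the way `Aws::Credentials#set?` does. It warns only when the hint was present and the source still ended up without credentials.

```ruby
# Hypothetical helper, not SDK code: warn when the environment hinted at a
# credential source but that source ended up without usable credentials.
# Returns true if a warning was emitted, false otherwise.
def warn_if_unloaded(source_name, hint_present, provider)
  return false unless hint_present # source was never expected to load
  return false if provider.set?    # credentials loaded normally

  Kernel.warn("#{source_name} credentials were expected but could not be loaded")
  true
end

# Minimal stand-in exposing #set?, mirroring Aws::Credentials#set?.
StubProvider = Struct.new(:access_key_id) do
  def set?
    !access_key_id.nil? && !access_key_id.empty?
  end
end

# The hint check for ECS: the relative URI variable exists in the environment.
hint = ENV.key?("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")
warn_if_unloaded("ECS", hint, StubProvider.new(nil))
```

Gating on the hint keeps the warning quiet on machines that were never expected to use that source, such as a developer laptop.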