grafana/grafana-aws-sdk

Using Assume Role ARN with opt-in regions gives 403

robbierolin opened this issue · 5 comments

Some AWS regions are opt-in regions, which means they can't be accessed until they are explicitly turned on (see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions)

When using the Assume Role ARN field, credentials from one account (Account A) can be used to retrieve data from another account (Account B). Currently there is a bug: trying to load data from an opt-in region in Account B fails with

"error":"failed to query data: InvalidClientTokenId: The security token included in the request is invalid.\n\tstatus code: 403"

This is because the region selected in the configuration is used both for the call to STS to assume the role and the call to the data source (e.g. cloudwatch). This can be fixed by replacing the region used here https://github.com/grafana/grafana-aws-sdk/blob/main/pkg/awsds/sessions.go#L167 with a non-opt-in region (e.g. us-east-1) when the configured region is an opt-in region and Assume Role ARN is configured. Then the regionCfg used here https://github.com/grafana/grafana-aws-sdk/blob/main/pkg/awsds/sessions.go#L227 should use the configured region.

Note the environment variable AWS_STS_REGIONAL_ENDPOINTS=regional must be set to get credentials that can be used in an opt-in region.
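To make the proposed region split concrete, here is a minimal, stdlib-only sketch. It is not the code in sessions.go: `splitRegions`, `optInRegions`, and `fallbackSTSRegion` are hypothetical names, the opt-in list is illustrative rather than exhaustive, and the role ARN uses a placeholder account ID. It only shows the decision of which region would go to the STS call and which to the target service.

```go
package main

import "fmt"

// optInRegions is an illustrative, non-exhaustive set of opt-in regions
// (assumption: a real implementation would need a maintained list).
var optInRegions = map[string]bool{
	"af-south-1": true,
	"ap-east-1":  true,
	"eu-south-1": true,
	"me-south-1": true,
}

// fallbackSTSRegion is an assumed default-enabled region used for the
// AssumeRole call when the configured region is opt-in.
const fallbackSTSRegion = "us-east-1"

// splitRegions returns the region to use for the STS AssumeRole call and
// the region to use for the target service (e.g. CloudWatch).
// The service region is always the configured one; the STS region falls
// back only when a role ARN is set and the configured region is opt-in.
func splitRegions(configured, assumeRoleARN string) (stsRegion, serviceRegion string) {
	stsRegion, serviceRegion = configured, configured
	if assumeRoleARN != "" && optInRegions[configured] {
		stsRegion = fallbackSTSRegion
	}
	return stsRegion, serviceRegion
}

func main() {
	// Placeholder role ARN, not a real account.
	sts, svc := splitRegions("me-south-1", "arn:aws:iam::123456789012:role/example")
	fmt.Printf("sts=%s service=%s\n", sts, svc) // prints "sts=us-east-1 service=me-south-1"
}
```

Without an Assume Role ARN, or with a default-enabled region, both values stay equal to the configured region, so existing setups would behave as before.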

Steps to reproduce:

  1. In one AWS account (Account A), create a user with AssumeRole permissions
    e.g.
{
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Action": [
            "sts:AssumeRole"
        ],
        "Resource": "*"
    }
}
  2. In another AWS account (Account B), opt in to an opt-in region (e.g. me-south-1)
  3. In Account B, create a role, attach a CloudWatchReadOnlyAccess permission policy and a trust policy for Account A
    e.g.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<AccountA>:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}
  4. Create a CloudWatch data source
  5. Provide an Access Key and Secret Key for the user in step 1
  6. Provide the role ARN from step 3 in Assume Role ARN
  7. Select me-south-1 as Default Region
  8. Save & Test

Status update: The account that we've used for testing cross account access seems to be broken, but ops is looking into that.

Done! Details in Slack.

Thanks for setting up the environment, @sunker. I am able to reproduce the issue now. However, your suggestion does not fix the issue, @robbierolin. Regarding this part:

This can be fixed by replacing the region used here https://github.com/grafana/grafana-aws-sdk/blob/main/pkg/awsds/sessions.go#L167 with a non-opt-in region (e.g. us-east-1) when the configured region is an opt-in region and Assume Role ARN is configured. Then the regionCfg used here https://github.com/grafana/grafana-aws-sdk/blob/main/pkg/awsds/sessions.go#L227 should use the configured region.

When using assumed roles, the cfgs list of configs is redefined in that scope. If we set the opt-in region at line 227 as you suggest, the result is the same: I still get the InvalidClientTokenId error in all the responses.

I wonder, how can we differentiate between calls to STS and calls to the target service (e.g. CloudWatch)? Maybe we need to disable opt-in regions if "Assume Role" is used?

Just to clarify, I was able to get it working by

  1. Setting this line to a non-opt-in region
  2. Redefining the regionCfg here (after the cfgs are redefined) with regionCfg = &aws.Config{Region: aws.String(region)} so the configured region is only used for the target service in this case.
  3. Setting the environment variable AWS_STS_REGIONAL_ENDPOINTS=regional

Does that work for you?
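Since the missing piece here turned out to be the environment variable, a small preflight check might help catch it earlier. This is only a sketch under assumptions: `checkSTSEndpointMode` is a hypothetical helper, not part of grafana-aws-sdk or the AWS SDK, and it simply treats any value other than `regional` as a misconfiguration for the opt-in case.

```go
package main

import (
	"fmt"
	"os"
)

// stsEndpointsEnv is the environment variable named in this thread.
const stsEndpointsEnv = "AWS_STS_REGIONAL_ENDPOINTS"

// checkSTSEndpointMode reports an error unless the given value is exactly
// "regional". It is a hypothetical preflight helper: it does not talk to
// AWS, it only validates the setting before any AssumeRole call is made.
func checkSTSEndpointMode(value string) error {
	if value == "regional" {
		return nil
	}
	return fmt.Errorf("%s=%q: set it to \"regional\" to get credentials usable in an opt-in region",
		stsEndpointsEnv, value)
}

func main() {
	// Read the real environment; an unset variable yields the empty string.
	if err := checkSTSEndpointMode(os.Getenv(stsEndpointsEnv)); err != nil {
		fmt.Println("warning:", err)
		return
	}
	fmt.Println("STS regional endpoints enabled")
}
```

Taking the value as a parameter rather than reading the environment inside the helper keeps the check easy to test with different settings.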

Got it. Yes, I was missing the AWS_STS_REGIONAL_ENDPOINTS environment variable. I will send a patch soon, thanks!