external-secrets/kubernetes-external-secrets

Pod not using the role defined on the ServiceAccount (IRSA) when executing AssumeRole to fetch secrets

biosugar0 opened this issue · 3 comments

I'm trying to use external secrets on EKS, with IRSA for authentication.

I want to use external secrets in the pattern described in this comment: #452.

Service Account -> sts:AssumeRoleWithWebIdentity (pod role!!) -> sts:AssumeRole (secret role!!) -> SECRETS

My secret is stored as an SSM parameter, so I tried this.

Service Account -> sts:AssumeRoleWithWebIdentity (pod role!!) -> sts:AssumeRole (secret role!! it has an SSM policy) -> SECRETS(SSM)
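For the second hop of this chain (pod role -> secret role), the secret role usually needs a trust policy that names the pod role, and the pod role is commonly given permission to call sts:AssumeRole on the secret role. The sketch below only builds those two documents from the role ARNs quoted in this issue; the function names are hypothetical and the exact policies in the reporter's account are not shown in the thread.

```python
import json


def secret_role_trust_policy(pod_role_arn: str) -> dict:
    """Trust policy for the secret role: lets the pod role assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": pod_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def pod_role_assume_policy(secret_role_arn: str) -> dict:
    """Permission policy for the pod role: allows assuming the secret role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": secret_role_arn,
            }
        ],
    }


# ARNs as they appear in this issue (account ID is redacted there).
POD_ROLE = "arn:aws:iam::000000000000:role/external-secrets-role"
SECRET_ROLE = "arn:aws:iam::000000000000:role/myservice-role"

print(json.dumps(secret_role_trust_policy(POD_ROLE), indent=2))
print(json.dumps(pod_role_assume_policy(SECRET_ROLE), indent=2))
```

Neither document helps if the first hop fails: when sts:AssumeRoleWithWebIdentity does not succeed, the SDK can fall back to other credentials and the second call is made by a different principal.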

However, the external secrets pod doesn't assume the correct role. The pod uses the default node role instead.

I found this comment, #597 (comment), and tried what it suggests, but it's not working.

I get this error.

{
    "payload": {
        "code": "CredentialsError",
        "message": "Missing credentials in config",
        "originalError": {
            "code": "CredentialsError",
            "message": "Could not load credentials from ChainableTemporaryCredentials",
            "originalError": {
                "code": "AccessDenied",
                "message": "User: arn:aws:sts::000000000000:assumed-role/cluster-nodes/i-00000000000000000 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::000000000000:role/myservice-role",
                "requestId": "000000000000000000000000000000000000",
                "retryDelay": 9.41710534764546,
                "retryable": false,
                "statusCode": 403,
                "time": "2021-02-08T00:28:54.815Z"
            },
            "requestId": "000000000000000000000000000000000000",
            "retryDelay": 9.41710534764546,
            "retryable": false,
            "statusCode": 403,
            "time": "2021-02-08T00:28:54.815Z"
        },
        "requestId": "000000000000000000000000000000000000",
        "retryDelay": 9.41710534764546,
        "retryable": false,
        "statusCode": 403,
        "time": "2021-02-08T00:28:54.815Z"
    },
    "pid": 18,
    "time": 1612744134815
}
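The useful detail in this error is the caller ARN in the innermost message: it shows which principal actually made the sts:AssumeRole call. A minimal sketch for pulling that out of the message follows; the helper names are hypothetical, and the check relies on the fact that sessions created from an EC2 instance profile are named after the instance ID.

```python
import re

# Matches the caller ARN that STS embeds in an AccessDenied message.
ASSUMED_ROLE_RE = re.compile(
    r"User: arn:aws:sts::(?P<account>\d+):assumed-role/"
    r"(?P<role>[^/]+)/(?P<session>\S+)"
)


def caller_from_access_denied(message: str) -> dict:
    """Return the account, role name, and session name of the denied caller."""
    match = ASSUMED_ROLE_RE.search(message)
    if match is None:
        raise ValueError("no assumed-role ARN found in message")
    return match.groupdict()


def looks_like_node_role(session_name: str) -> bool:
    """True when the session name has the shape of an EC2 instance ID."""
    return re.fullmatch(r"i-[0-9a-f]+", session_name) is not None


# Message copied from the error above.
message = (
    "User: arn:aws:sts::000000000000:assumed-role/cluster-nodes/"
    "i-00000000000000000 is not authorized to perform: sts:AssumeRole "
    "on resource: arn:aws:iam::000000000000:role/myservice-role"
)

caller = caller_from_access_denied(message)
print(caller["role"], looks_like_node_role(caller["session"]))
```

Here the caller is the cluster-nodes role with an instance ID as the session name, not external-secrets-role, which means the web identity credentials were never used for this call.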

The Pod has the following environment variables.

AWS_ROLE_ARN=arn:aws:iam::000000000000:role/external-secrets-role #serviceAccount Role
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_DEFAULT_REGION=ap-northeast-1
AWS_REGION=ap-northeast-1

my environment

  • kubernetes-external-secrets 6.2.0
  • EKS 1.18
    • nodes in a private subnet, using a NAT Gateway

Deployment securityContext

securityContext:
  fsGroup: 1000
  runAsNonRoot: true

ServiceAccount

# Source: kubernetes-external-secrets/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myservice-sa
  namespace: "default"
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::000000000000:role/external-secrets-role
    eks.amazonaws.com/audience: sts.amazonaws.com
  labels:
    app.kubernetes.io/name: kubernetes-external-secrets
    helm.sh/chart: kubernetes-external-secrets-6.2.0
    app.kubernetes.io/instance: myservice
    app.kubernetes.io/managed-by: Helm

external secret

apiVersion: kubernetes-client.io/v1
kind: ExternalSecret
metadata:
  name: myservice-secret
spec:
  backendType: systemManager
  region: ap-northeast-1
  data:
  - key: /dev/myservice/secret
    name: myservice-secret
  roleArn: arn:aws:iam::000000000000:role/myservice-role

Could you help me?

Please first confirm that IRSA works on its own, without assuming an additional role.
I see you are running in a private subnet. Have you already used IRSA with other pods and got it up and running?

IAM roles for service accounts is supported. You must include the STS VPC endpoint. For more information, see VPC endpoints for private clusters.

https://docs.aws.amazon.com/eks/latest/userguide/private-clusters.html#vpc-endpoints-private-clusters
Not sure if this applies to you here, but maybe 😄

I looked into it.
IRSA is not working properly on my cluster.
When I tried to use the STS endpoint from my Pod, I got this error.

# aws sts get-caller-identity --region ap-northeast-1 --endpoint https://sts.ap-northeast-1.amazonaws.com
An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
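When sts:AssumeRoleWithWebIdentity is denied, one thing to compare is what the projected service account token actually claims against what the role's trust policy expects. The sketch below decodes the token payload without verifying the signature, so it is for inspection only. It assumes Python 3 is available somewhere the token can be read; the function names are hypothetical, and the token path is the one from AWS_WEB_IDENTITY_TOKEN_FILE above.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def read_token_claims(path: str) -> dict:
    """Read a token file and return its decoded claims."""
    with open(path) as token_file:
        return decode_jwt_claims(token_file.read())
```

Calling `read_token_claims("/var/run/secrets/eks.amazonaws.com/serviceaccount/token")` inside the pod returns the claims; the `sub` value (of the form system:serviceaccount:NAMESPACE:NAME) and the `aud` value are the ones a trust policy condition is matched against.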

environment

test deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aws-cli
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aws-cli
  template:
    metadata:
      labels:
        app: aws-cli
    spec:
      serviceAccountName: myservice-account
      containers:
      - name: aws-cli
        image: amazon/aws-cli
        command:
        - "sh"
        - "-c"
        - "touch /tmp/test.txt && tail -f /tmp/test.txt"
        env:
        - name: AWS_DEFAULT_REGION
          value: "ap-northeast-1"
        - name: ENABLE_IRP
          value: "true"
        - name: AWS_STS_ENDPOINT
          value: "https://sts.ap-northeast-1.amazonaws.com"
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        runAsNonRoot: true
        fsGroup: 1000

service account role AssumeRolePolicyDocument

        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "",
                    "Effect": "Allow",
                    "Principal": {
                        "Federated": "arn:aws:iam::000000000000:oidc-provider/oidc.eks.ap-northeast-1.amazonaws.com/id/00000000000000000000000000000000"
                    },
                    "Action": "sts:AssumeRoleWithWebIdentity",
                    "Condition": {
                        "StringEquals": {
                            "oidc.eks.ap-northeast-1.amazonaws.com/id/00000000000000000000000000000000:sub": "system:serviceaccount:default:*"
                        }
                    }
                }
            ]
        }
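One detail worth checking in a trust policy like this: IAM's StringEquals operator compares values literally and does not expand `*`, whereas StringLike does. The thread never says which Terraform setting turned out to be wrong, so this is only a possible cause, not a confirmed one. A small sketch that flags such conditions follows; the function name is hypothetical.

```python
def wildcard_in_exact_match(trust_policy: dict) -> list:
    """List (key, value) pairs where StringEquals is given a wildcard value."""
    findings = []
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for statement in statements:
        exact = statement.get("Condition", {}).get("StringEquals", {})
        for key, value in exact.items():
            values = value if isinstance(value, list) else [value]
            for item in values:
                if "*" in item or "?" in item:
                    findings.append((key, item))
    return findings


# Condition copied from the AssumeRolePolicyDocument above.
OIDC = "oidc.eks.ap-northeast-1.amazonaws.com/id/00000000000000000000000000000000"
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Federated": "arn:aws:iam::000000000000:oidc-provider/" + OIDC},
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {OIDC + ":sub": "system:serviceaccount:default:*"}
            },
        }
    ],
}

print(wildcard_in_exact_match(policy))
```

If a wildcard match on the service account name is intended, StringLike is the operator that supports it; with StringEquals the `sub` claim would have to equal the literal string ending in `*`.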

service account role policy

    "PolicyDocument": {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "",
                "Effect": "Allow",
                "Action": "s3:*",
                "Resource": "*"
            }
        ]
    }

So this seems to be a problem with my AWS configuration rather than with external secrets.
I'll continue my investigation. Thank you.

@Flydiverny Finally, it's working fine!
The cause was a wrong Terraform setting in the IAM policy 😅.
Thank you for a good product!