uswitch/kiam

Help needed for Kiam on an AWS Kops cluster

ankur6ue opened this issue · 3 comments

Hello -

I'm struggling to get kiam to work on a kops cluster on AWS.

I used kops to set up a Kubernetes cluster on AWS consisting of one master and two worker nodes. All went well: I was able to create pods, exec into them, etc.

Then, I used the following values.yaml to install kiam:

agent:
  gatewayTimeoutCreation: 40s
  timeout: 40s
  log.level: debug
  host:
    interface: cbr0
    iptables: true
server:
  gatewayTimeoutCreation: 40s
  timeout: 40s
  log.level: debug
  assumeRoleArn: arn:aws:iam::111111:role/kiam_server_iam_role
  nodeSelector:
    kubernetes.io/role: master
  tolerations:
    - key: "node-role.kubernetes.io"
      operator: "Exists"
      effect: "NoSchedule"
  sslCertHostPath: /etc/ssl/certs

I set the host interface to cbr0 because kops uses kubenet networking by default and, according to the kops docs, the network interface for kubenet is cbr0. The NoSchedule taint on the master keeps other pods off that node, and the toleration lets the kiam server run there.
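
Because the agent's iptables setup depends on the name given in agent.host.interface, it can help to confirm that an interface with that name actually exists on a worker node. Below is a minimal shell sketch that works on the output of `ip -o link show`. The helper names list_ifaces and has_iface are made up for illustration and are not part of kiam or kops.

```shell
# list_ifaces: read `ip -o link show` output on stdin and print one
# interface name per line, dropping any "@peer" suffix shown for veth pairs.
list_ifaces() {
  awk -F': ' '{ sub(/@.*/, "", $2); print $2 }'
}

# has_iface NAME: succeed if NAME appears exactly in the list read from stdin.
has_iface() {
  list_ifaces | grep -qxF -- "$1"
}
```

On a worker node this could be used as `ip -o link show | has_iface cbr0 && echo present || echo missing`. A "missing" result would suggest the configured name does not match what the CNI created on that node.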

I used helm to install kiam:
helm install kiam uswitch/kiam --namespace kiam --values kiam/kiam-values.yaml

The server and agent pods were spun up and seem to be working.

Then I created a pod and namespace with IAM role annotations. However, upon exec'ing into the pod and running curl:
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/

I get the IAM role attached to the EC2 instance on which the pod is running, rather than the IAM role in the pod annotation.
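
One way to make this check explicit is to compare the role name the endpoint returns with the role named in the pod annotation. Below is a small shell sketch, assuming the annotation holds a bare role name rather than a full ARN. The helper check_role and the role name in the usage line are hypothetical.

```shell
# check_role EXPECTED ACTUAL: report whether the role name returned by the
# metadata endpoint is the one the pod was annotated with.
check_role() {
  if [ "$2" = "$1" ]; then
    echo "intercepted: got annotated role $1"
  else
    echo "not intercepted: expected $1, got $2"
  fi
}
```

Inside the pod this could be run as `check_role my-annotated-role "$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/)"`.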

Here are some of the server logs:
Server:

{"credentials.access.key":"ASIAYABGL6VMKUZVVMFFQ","credentials.expiration":"2021-10-16T21:56:59Z","credentials.role":"arn:aws:iam::111111:role/ankur6ue-dev-ocr-data-access-role","level":"info","msg":"expiring credentials, fetching updated","time":"2021-10-16T21:52:59Z"}
{"credentials.access.key":"ASIAYABGL6VMPVVDB6THQ","credentials.expiration":"2021-10-16T22:07:59Z","credentials.role":"arn:aws:iam::111111:role/ankur6ue-dev-ocr-data-access-role","level":"info","msg":"requested new credentials","time":"2021-10-16T21:52:59Z"}

This indicates the server is requesting credentials for the IAM role (arn:aws:iam::111111:role/ankur6ue-dev-ocr-data-access-role), so that part seems to be working.

However, the agent logs show nothing beyond startup:
kubectl logs -n kiam kiam-agent-wkfrw
{"level":"info","msg":"configuring iptables","time":"2021-10-16T21:08:41Z"}
{"level":"info","msg":"started prometheus metric listener 0.0.0.0:9620","time":"2021-10-16T21:08:41Z"}
{"level":"info","msg":"listening :8181","time":"2021-10-16T21:08:41Z"}

It looks like a problem with the iptables setup, because the metadata requests made by the pod don't seem to be intercepted by the kiam agent.
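
To narrow this down, one can look at the NAT rules on the node and see which input interface the metadata redirect is bound to. Below is a sketch that parses `iptables -t nat -S PREROUTING` output. The helper name is made up, and the exact rule kiam installs is an assumption here, so treat this as a starting point rather than a definitive check.

```shell
# metadata_rule_iface: read `iptables -S`-style rules on stdin and print the
# "-i" input interface of every rule that mentions 169.254.169.254.
# Negated matches ("! -i name") are not distinguished from plain ones.
metadata_rule_iface() {
  awk '/169\.254\.169\.254/ {
    for (i = 1; i < NF; i++)
      if ($i == "-i") print $(i + 1)
  }'
}
```

On the node, run as root: `iptables -t nat -S PREROUTING | metadata_rule_iface`. If nothing is printed, or the printed name does not correspond to the interface pod traffic arrives on, the requests would reach the real metadata service instead of the agent.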

Will appreciate any help/pointers!

Tagging @JethroMV because I borrowed my values.yaml from his issue.

I tried a different CNI, Calico, and after setting the host interface to cali+, kiam works. So for whatever reason, the default networking on kops (kubenet with cbr0?) doesn't work with this setup.

Here's my values.yaml:

agent:
  gatewayTimeoutCreation: 40s
  timeout: 40s
  log.level: debug
  host:
    interface: cali+
    iptables: true
    iptablesRemoveOnShutdown: true # ensures that iptables rules set by kiam are removed when kiam is uninstalled
server:
  gatewayTimeoutCreation: 40s
  timeout: 40s
  log.level: debug
  assumeRoleArn: arn:aws:iam::111:role/kiam_server_iam_role
  nodeSelector:
    kubernetes.io/role: master
  tolerations:
    - key: "node-role.kubernetes.io/master"
      operator: "Exists"
      effect: "NoSchedule"
  sslCertHostPath: /etc/ssl/certs
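
The trailing + in cali+ is iptables wildcard syntax: an interface name ending in + matches every interface whose name starts with that prefix, which is how a single rule can cover all of Calico's per-pod cali* interfaces. Below is a shell sketch of that matching rule. iface_matches is a hypothetical helper for illustration, not part of kiam or iptables.

```shell
# iface_matches PATTERN NAME: succeed if NAME matches PATTERN using the
# iptables convention that a trailing "+" means "any name with this prefix".
iface_matches() {
  case "$1" in
    *+)
      # Wildcard form: compare against the prefix before the "+".
      case "$2" in
        "${1%+}"*) return 0 ;;
        *) return 1 ;;
      esac
      ;;
    *)
      # Plain form: the names must be identical.
      [ "$1" = "$2" ]
      ;;
  esac
}
```

For example, `iface_matches cali+ cali12ab` succeeds, while `iface_matches cbr0 cali12ab` fails because a plain name only matches itself.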

Glad it's working! For reference we used weave networking with the above config but not specifying the agent.host.interface, and it's working OK too.

agent:
  host:
    interface: weave
    iptables: true
  ...