Kiam returning incorrect credentials to pods
sagardesai0094 opened this issue · 5 comments
Hello!
I am running into an issue with Kiam on EKS. I am doing a fresh install of Kiam on a cluster using the Helm chart. I have two autoscaling groups, called "Worker" and "Service". Everything runs on the Worker nodes, with a select few things running on the Service nodes. There are two different IAM roles, one for the Service nodes and one for the Worker nodes. I have set the kiam-server pods to run on the Service nodes, and the kiam-agent pods to run on the Worker nodes, using the node selectors in the values file. To test the setup, I have a pod (running on a Worker node) with the following annotation: `iam.amazonaws.com/role: EKSRoleForALBIngressControllerTest`. Inside the pod I run `aws sts get-caller-identity` to see which role has been assumed. The response specifies the Worker node's instance profile instead of the role in the annotation, which is obviously not expected. At the very least I'd have thought the returned credentials would be those of the Service node's instance profile (where kiam-server is running). I performed the same test on another cluster running a manually installed kiam setup (resource files applied by hand), and there the response to `get-caller-identity` is the role specified in the pod's annotation. So something is definitely not working.
My values file:
```yaml
agent:
  log:
    level: debug
  host:
    iptables: true
  nodeSelector:
    node.kubernetes.io/node-type: Worker

server:
  log:
    level: debug
  tolerations:
    - effect: NoSchedule
      key: kiam-server
      operator: Exists
  nodeSelector:
    node.kubernetes.io/node-type: Service
```
In the kiam-server logs, I see that it is getting the role that is requested by the test pod's annotation:
```json
{"generation.metadata":0,"level":"debug","msg":"updated pod","pod.iam.role":"EKSRoleForALBIngressControllerTest","pod.name":"alpine-shell-78b6c694bf-cxdlj","pod.namespace":"default","pod.status.ip":"172.16.19.120","pod.status.phase":"Running","resource.version":"167520137","time":"2020-12-17T18:00:19Z"}
```
I also see an `AssumeRole` entry in CloudTrail for that role, made by kiam. So it seems that the requests are making it to kiam-server, and the role is being assumed correctly. But the credentials are not making it back to the pod for some reason.
The response I get from `aws sts get-caller-identity` in my test pod:
```
{
  ResponseMetadata: { RequestId: 'a97a8c92-7d68-445b-b604-f50d742b83d9' },
  UserId: 'AROAI4IC47I5B6D3P7BRK:i-062d6b6490a459d56',
  Account: 'XXXXXXXXX',
  Arn: 'arn:aws:sts::XXXXXXXXXX:assumed-role/test-k8s-iam-NodeInstanceRole-1RGFT2BTT77OT/i-062d6b6490a459d56'
}
```
The role returned here is the instance profile role for the Worker node that kiam-agent is running on (NOT where kiam-server is running). In the cluster where kiam was installed manually, the response has the ARN of the role that is in the pod's annotation.
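The comparison above can be scripted. A small helper (my own, not part of kiam or the AWS SDK) extracts the role name from an STS assumed-role ARN so you can check it against the pod's annotation:

```python
def role_from_assumed_arn(arn: str) -> str:
    """Extract the role name from an STS assumed-role ARN:
    arn:aws:sts::<acct>:assumed-role/<role>/<session> -> <role>."""
    resource = arn.split(":", 5)[5]            # "assumed-role/<role>/<session>"
    kind, role, _session = resource.split("/", 2)
    if kind != "assumed-role":
        raise ValueError(f"not an assumed-role ARN: {arn}")
    return role

# The ARN returned in the broken cluster resolves to the node's instance role,
# not the annotated role:
arn = ("arn:aws:sts::111122223333:assumed-role/"
       "test-k8s-iam-NodeInstanceRole-1RGFT2BTT77OT/i-062d6b6490a459d56")
assumed = role_from_assumed_arn(arn)
print(assumed)                                          # test-k8s-iam-NodeInstanceRole-1RGFT2BTT77OT
print(assumed == "EKSRoleForALBIngressControllerTest")  # False
```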
I enabled debug logs on the agent, but it only has the ping log statements. Logs in the server don't have any errors.
Any ideas? I am not sure how to debug this further. Let me know if I can provide any more information.
> I also see an AssumeRole entry in CloudTrail for that role, made by kiam. So it seems that the requests are making it to kiam-server, and the role is being assumed correctly. But the credentials are not making it back to the pod for some reason.
The server performs some prefetching to obtain credentials so it would request them whether a Pod or agent attempts to retrieve them or not.
If your Pod is still obtaining the node credentials, I'd expect the problem is that your iptables interception isn't working properly. I'd check the settings you're using (the rules vary depending on your CNI) and see how that goes. Kiam can configure the rules via the agent, but you can also configure them yourself.
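For context, the interception works by DNAT-ing pod traffic bound for the EC2 metadata API to the local agent. The rule is roughly of this shape (the interface pattern, agent address, and port here are illustrative; check what your own agent actually installed):

```shell
# Redirect pod traffic for the metadata API to kiam-agent.
# "cali+" matches Calico pod interfaces; 8181 is the agent's default port.
iptables -t nat -A PREROUTING -d 169.254.169.254/32 -i cali+ \
  -p tcp --dport 80 -j DNAT --to-destination "$AGENT_IP:8181"

# On a worker node, verify a matching rule is actually present:
iptables -t nat -S PREROUTING | grep 169.254.169.254
```

If the interface pattern doesn't match the interfaces your CNI creates, the rule never fires and pods talk straight to the real metadata API — which is exactly the symptom described above.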
You can look at the agent's logs to see whether any calls from the Pod are being intercepted. That should confirm the issue is where I suspect it is.
How does the kiam-server know which credentials to pre-fetch? Is it also monitoring the pod annotations? I assumed the requests were making it to the agent/server since I saw the logs in the server where it was fetching the credentials. But what you said makes sense because I don't see any logs in the agent about any credentials request. I'll verify the iptables rule.
I wrote a longer article a while back on how Kiam works, it helps explain how the components interact (in short, agents maintain no knowledge of anything, they're there just to forward requests from Pods to the Server).
https://pingles.medium.com/kiam-iterating-for-security-and-reliability-5e793ab93ec3
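To make that division of labour concrete, here is a toy sketch of the interaction (every class and method name is invented for illustration; this is not kiam's actual code): the server watches pods, prefetches credentials for each annotated role, and the agent merely maps an intercepted request's source IP back through the server.

```python
# Toy model of the kiam split: the server watches pods and prefetches
# credentials; the agent keeps no state and just forwards requests.

ROLE_ANNOTATION = "iam.amazonaws.com/role"

class Server:
    def __init__(self, assume_role):
        self.assume_role = assume_role  # stand-in for an STS AssumeRole call
        self.pods = {}                  # pod IP -> role name
        self.cache = {}                 # role name -> prefetched credentials

    def on_pod_update(self, pod_ip, annotations):
        role = annotations.get(ROLE_ANNOTATION)
        if role:
            self.pods[pod_ip] = role
            # Prefetch so credentials are warm before any pod asks for them.
            self.cache.setdefault(role, self.assume_role(role))

    def credentials_for(self, pod_ip):
        return self.cache[self.pods[pod_ip]]

class Agent:
    """Knows nothing about pods or roles; only forwards to the server."""
    def __init__(self, server):
        self.server = server

    def handle_metadata_request(self, source_ip):
        return self.server.credentials_for(source_ip)

server = Server(assume_role=lambda role: {"role": role, "token": "fake-token"})
server.on_pod_update("172.16.19.120",
                     {ROLE_ANNOTATION: "EKSRoleForALBIngressControllerTest"})
agent = Agent(server)
print(agent.handle_metadata_request("172.16.19.120")["role"])
# EKSRoleForALBIngressControllerTest
```

This is why a broken iptables rule produces the reported symptom: the server-side logs and CloudTrail entries look healthy (prefetch ran), but the pod's request never reaches the agent at all.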
I'm going to close this issue for now. If you still get stuck with the agent iptables work we also have a Slack channel in the Kubernetes Slack (#kiam).
Hope it isn't too tricky!
Thanks. For posterity, the issue had to do with the `host-interface` argument on the agent. I wasn't specifying it, and the default (`cali+`) wasn't correct. I set it to `!eth0` and it's working now.
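For anyone hitting the same thing via the Helm chart, the interface pattern sits under the agent's host settings in the values file (key name as in the chart version I used; double-check against your chart):

```yaml
agent:
  host:
    iptables: true
    interface: "!eth0"   # match every interface except eth0; the default cali+ assumes Calico
```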