aws/amazon-vpc-cni-k8s

vpc cni is getting failed to assign IPs and throwing invalid spec errors with release v1.13.2

ramandeepsharma opened this issue · 7 comments

What happened:

We noticed that pods are getting failed intermittently in our infrastructure and logs are revealing that it's related to vpc cni as it's getting failed to assign the IPs.

Event logs of such pods.

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d456c3f7056f7574708efdcb161e676949989c33411c13eba09bb78502a9a7c3": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container

Currently we have to delete such pods manually so that it can re-schedule on the node.

We have diagnosed this issue at our end and our findings are mentioned below:

  • We checked that sufficient amount of IPs are available.

  • ipamd logs are mentioned below, these errors are being repeated during the pod failure incident:
    { "level": "debug", "ts": "2023-11-01T18:43:09.746Z", "caller": "rpc/rpc.pb.go:713", "msg": "AddNetworkRequest: K8S_POD_NAME:\"command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt\" K8S_POD_NAMESPACE:\"sas\" K8S_POD_INFRA_CONTAINER_ID:\"71833840eab0352825f05c91669c1f1684bd61f0d28346efb6ec519e9dbb28a6\" ContainerID:\"71833840eab0352825f05c91669c1f1684bd61f0d28346efb6ec519e9dbb28a6\" IfName:\"eth0\" NetworkName:\"aws-cni\" Netns:\"/var/run/netns/cni-82499489-f3c4-abcb-c5a4-fb570bbf300b\"" } { "level": "error", "ts": "2023-11-01T18:43:10.391Z", "caller": "retry/util.go:51", "msg": "Failed to annotate vpc.amazonaws.com/pod-ips the pod with 10.17.76.32, error Pod \"command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt\" is invalid: spec: Forbidden: pod updates may not change fields other than spec.containers[].image, spec.initContainers[].image, spec.activeDeadlineSeconds, spec.tolerations(only additions to existing tolerations) orspec.terminationGracePeriodSeconds(allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 15 identical fields\n  \tSubdomain: \"\",\n  \tSetHostnameAsFQDN: nil,\n- \tAffinity: nil,\n+ \tAffinity: &core.Affinity{\n+ \t\tNodeAffinity: &core.NodeAffinity{\n+ \t\t\tRequiredDuringSchedulingIgnoredDuringExecution: &core.NodeSelector{NodeSelectorTerms: []core.NodeSelectorTerm{...}},\n+ \t\t},\n+ \t},\n  \tSchedulerName: \"default-scheduler\",\n  \tTolerations: {{Key: \"kaas.acquia.io/pool\", Operator: \"Equal\", Value: \"jobs\"}, {Key: \"node.kubernetes.io/not-ready\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}, {Key: \"node.kubernetes.io/unreachable\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}},\n  \t... // 11 identical fields\n  }\n" } { "level": "error", "ts": "2023-11-01T18:43:10.391Z", "caller": "rpc/rpc.pb.go:713", "msg": "Failed to add the pod annotation: Pod \"command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt\" is invalid: spec: Forbidden: pod updates may not change fields other thanspec.containers[].image, spec.initContainers[].image, spec.activeDeadlineSeconds, spec.tolerations(only additions to existing tolerations) orspec.terminationGracePeriodSeconds(allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 15 identical fields\n  \tSubdomain: \"\",\n  \tSetHostnameAsFQDN: nil,\n- \tAffinity: nil,\n+ \tAffinity: &core.Affinity{\n+ \t\tNodeAffinity: &core.NodeAffinity{\n+ \t\t\tRequiredDuringSchedulingIgnoredDuringExecution: &core.NodeSelector{NodeSelectorTerms: []core.NodeSelectorTerm{...}},\n+ \t\t},\n+ \t},\n  \tSchedulerName: \"default-scheduler\",\n  \tTolerations: {{Key: \"kaas.acquia.io/pool\", Operator: \"Equal\", Value: \"jobs\"}, {Key: \"node.kubernetes.io/not-ready\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}, {Key: \"node.kubernetes.io/unreachable\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}},\n  \t... // 11 identical fields\n  }\n" } { "level": "info", "ts": "2023-11-01T18:43:10.391Z", "caller": "rpc/rpc.pb.go:713", "msg": "Send AddNetworkReply: IPv4Addr 10.17.76.32, IPv6Addr: , DeviceNumber: 0, err: Pod \"command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt\" is invalid: spec: Forbidden: pod updates may not change fields other thanspec.containers[].image, spec.initContainers[].image, spec.activeDeadlineSeconds, spec.tolerations(only additions to existing tolerations) orspec.terminationGracePeriodSeconds(allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 15 identical fields\n  \tSubdomain: \"\",\n  \tSetHostnameAsFQDN: nil,\n- \tAffinity: nil,\n+ \tAffinity: &core.Affinity{\n+ \t\tNodeAffinity: &core.NodeAffinity{\n+ \t\t\tRequiredDuringSchedulingIgnoredDuringExecution: &core.NodeSelector{NodeSelectorTerms: []core.NodeSelectorTerm{...}},\n+ \t\t},\n+ \t},\n  \tSchedulerName: \"default-scheduler\",\n  \tTolerations: {{Key: \"kaas.acquia.io/pool\", Operator: \"Equal\", Value: \"jobs\"}, {Key: \"node.kubernetes.io/not-ready\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}, {Key: \"node.kubernetes.io/unreachable\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}},\n  \t... // 11 identical fields\n  }\n" } { "level": "debug", "ts": "2023-11-01T18:43:10.428Z", "caller": "rpc/rpc.pb.go:731", "msg": "DelNetworkRequest: K8S_POD_NAME:\"command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt\" K8S_POD_NAMESPACE:\"sas\" K8S_POD_INFRA_CONTAINER_ID:\"71833840eab0352825f05c91669c1f1684bd61f0d28346efb6ec519e9dbb28a6\" Reason:\"PodDeleted\" ContainerID:\"71833840eab0352825f05c91669c1f1684bd61f0d28346efb6ec519e9dbb28a6\" IfName:\"eth0\" NetworkName:\"aws-cni\"" } { "level": "error", "ts": "2023-11-01T18:43:11.142Z", "caller": "retry/util.go:51", "msg": "Failed to annotate vpc.amazonaws.com/pod-ips the pod with , error Pod \"command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt\" is invalid: spec: Forbidden: pod updates may not change fields other thanspec.containers[].image, spec.initContainers[].image, spec.activeDeadlineSeconds, spec.tolerations(only additions to existing tolerations) orspec.terminationGracePeriodSeconds(allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 15 identical fields\n  \tSubdomain: \"\",\n  \tSetHostnameAsFQDN: nil,\n- \tAffinity: nil,\n+ \tAffinity: &core.Affinity{\n+ \t\tNodeAffinity: &core.NodeAffinity{\n+ \t\t\tRequiredDuringSchedulingIgnoredDuringExecution: &core.NodeSelector{NodeSelectorTerms: []core.NodeSelectorTerm{...}},\n+ \t\t},\n+ \t},\n  \tSchedulerName: \"default-scheduler\",\n  \tTolerations: {{Key: \"kaas.acquia.io/pool\", Operator: \"Equal\", Value: \"jobs\"}, {Key: \"node.kubernetes.io/not-ready\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}, {Key: \"node.kubernetes.io/unreachable\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}},\n  \t... // 11 identical fields\n  }\n" } { "level": "error", "ts": "2023-11-01T18:43:11.142Z", "caller": "rpc/rpc.pb.go:731", "msg": "Failed to delete the pod annotation: Pod \"command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt\" is invalid: spec: Forbidden: pod updates may not change fields other thanspec.containers[].image, spec.initContainers[].image, spec.activeDeadlineSeconds, spec.tolerations(only additions to existing tolerations) orspec.terminationGracePeriodSeconds(allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 15 identical fields\n  \tSubdomain: \"\",\n  \tSetHostnameAsFQDN: nil,\n- \tAffinity: nil,\n+ \tAffinity: &core.Affinity{\n+ \t\tNodeAffinity: &core.NodeAffinity{\n+ \t\t\tRequiredDuringSchedulingIgnoredDuringExecution: &core.NodeSelector{NodeSelectorTerms: []core.NodeSelectorTerm{...}},\n+ \t\t},\n+ \t},\n  \tSchedulerName: \"default-scheduler\",\n  \tTolerations: {{Key: \"kaas.acquia.io/pool\", Operator: \"Equal\", Value: \"jobs\"}, {Key: \"node.kubernetes.io/not-ready\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}, {Key: \"node.kubernetes.io/unreachable\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}},\n  \t... // 11 identical fields\n  }\n" } { "level": "info", "ts": "2023-11-01T18:43:11.142Z", "caller": "rpc/rpc.pb.go:731", "msg": "Send DelNetworkReply: IPv4Addr 10.17.76.32, DeviceNumber: 0, err: Pod \"command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt\" is invalid: spec: Forbidden: pod updates may not change fields other thanspec.containers[].image, spec.initContainers[].image, spec.activeDeadlineSeconds, spec.tolerations(only additions to existing tolerations) orspec.terminationGracePeriodSeconds (allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 15 identical fields\n  \tSubdomain: \"\",\n  \tSetHostnameAsFQDN: nil,\n- \tAffinity: nil,\n+ \tAffinity: &core.Affinity{\n+ \t\tNodeAffinity: &core.NodeAffinity{\n+ \t\t\tRequiredDuringSchedulingIgnoredDuringExecution: &core.NodeSelector{NodeSelectorTerms: []core.NodeSelectorTerm{...}},\n+ \t\t},\n+ \t},\n  \tSchedulerName: \"default-scheduler\",\n  \tTolerations: {{Key: \"kaas.acquia.io/pool\", Operator: \"Equal\", Value: \"jobs\"}, {Key: \"node.kubernetes.io/not-ready\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}, {Key: \"node.kubernetes.io/unreachable\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}},\n  \t... // 11 identical fields\n  }\n" }

  • vpc cni logs are mentioned below, these errors are being repeated during the pod failure incident:
    { "level": "info", "ts": "2023-11-01T18:44:10.737Z", "caller": "routed-eni-cni-plugin/cni.go:126", "msg": "Received CNI add request: ContainerID(ae26c3b9a86fbfc79df9e7c2c82caf2daf3252f4f68ca6f3355f421256339952) Netns(/var/run/netns/cni-201a3b5a-9493-48ce-7204-9f3ddde85429) IfName(eth0) Args(K8S_POD_INFRA_CONTAINER_ID=ae26c3b9a86fbfc79df9e7c2c82caf2daf3252f4f68ca6f3355f421256339952;K8S_POD_UID=a296181d-cb58-45ef-be09-5ee8ee5b1070;IgnoreUnknown=1;K8S_POD_NAMESPACE=sas;K8S_POD_NAME=command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt) Path(/opt/cni/bin) argsStdinData({\"cniVersion\":\"0.4.0\",\"mtu\":\"9001\",\"name\":\"aws-cni\",\"pluginLogFile\":\"/var/log/aws-routed-eni/plugin.log\",\"pluginLogLevel\":\"DEBUG\",\"podSGEnforcingMode\":\"strict\",\"type\":\"aws-cni\",\"vethPrefix\":\"eni\"})" } { "level": "info", "ts": "2023-11-01T18:44:11.425Z", "caller": "routed-eni-cni-plugin/cni.go:284", "msg": "Received CNI del request: ContainerID(ae26c3b9a86fbfc79df9e7c2c82caf2daf3252f4f68ca6f3355f421256339952) Netns(/var/run/netns/cni-201a3b5a-9493-48ce-7204-9f3ddde85429) IfName(eth0) Args(IgnoreUnknown=1;K8S_POD_NAMESPACE=sas;K8S_POD_NAME=command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt;K8S_POD_INFRA_CONTAINER_ID=ae26c3b9a86fbfc79df9e7c2c82caf2daf3252f4f68ca6f3355f421256339952;K8S_POD_UID=a296181d-cb58-45ef-be09-5ee8ee5b1070) Path(/opt/cni/bin) argsStdinData({\"cniVersion\":\"0.4.0\",\"mtu\":\"9001\",\"name\":\"aws-cni\",\"pluginLogFile\":\"/var/log/aws-routed-eni/plugin.log\",\"pluginLogLevel\":\"DEBUG\",\"podSGEnforcingMode\":\"strict\",\"type\":\"aws-cni\",\"vethPrefix\":\"eni\"})" } { "level": "error", "ts": "2023-11-01T18:44:12.318Z", "caller": "routed-eni-cni-plugin/cni.go:284", "msg": "Error received from DelNetwork gRPC call for container ae26c3b9a86fbfc79df9e7c2c82caf2daf3252f4f68ca6f3355f421256339952: rpc error: code = Unknown desc = Pod \"command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt\" is invalid: spec: Forbidden: pod updates may not change fields other than spec.containers[].image, spec.initContainers[].image, spec.activeDeadlineSeconds, spec.tolerations(only additions to existing tolerations) orspec.terminationGracePeriodSeconds (allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 15 identical fields\n  \tSubdomain: \"\",\n  \tSetHostnameAsFQDN: nil,\n- \tAffinity: nil,\n+ \tAffinity: &core.Affinity{\n+ \t\tNodeAffinity: &core.NodeAffinity{\n+ \t\t\tRequiredDuringSchedulingIgnoredDuringExecution: &core.NodeSelector{NodeSelectorTerms: []core.NodeSelectorTerm{...}},\n+ \t\t},\n+ \t},\n  \tSchedulerName: \"default-scheduler\",\n  \tTolerations: {{Key: \"kaas.acquia.io/pool\", Operator: \"Equal\", Value: \"jobs\"}, {Key: \"node.kubernetes.io/not-ready\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}, {Key: \"node.kubernetes.io/unreachable\", Operator: \"Exists\", Effect: \"NoExecute\", TolerationSeconds: &300}},\n  \t... // 11 identical fields\n  }\n" } { "level": "info", "ts": "2023-11-01T18:44:12.318Z", "caller": "routed-eni-cni-plugin/cni.go:284", "msg": "Could not teardown pod using prevResult: ContainerID(ae26c3b9a86fbfc79df9e7c2c82caf2daf3252f4f68ca6f3355f421256339952) Netns(/var/run/netns/cni-201a3b5a-9493-48ce-7204-9f3ddde85429) IfName(eth0) PodNamespace(sas) PodName(command-b7aeaa61-6361-4011-b9b7-f0226622a2e3-28314385-sd4xt)" }

It seems to be going into a loop while sending a API request and failing over with invalid spec. hence the requests are not getting completed.

-->

What you expected to happen:

Pods should get the IPs.

How to reproduce it (as minimally and precisely as possible):

N/A

Anything else we need to know?:

It looks like this is related to recent change: #2328 which got introduced in https://github.com/aws/amazon-vpc-cni-k8s/releases/tag/v1.13.0 and we are using 'v1.13.2' release.

Environment:

  • Kubernetes version (use kubectl version): v1.24.17-eks
  • CNI Version: v1.13.2
  • OS (e.g: cat /etc/os-release): Amazon Linux
  • Kernel (e.g. uname -a): 5.10.192-183.736.amzn2.x86_64

@ramandeepsharma are you setting the ANNOTATE_POD_IP environment variable: https://github.com/aws/amazon-vpc-cni-k8s#annotate_pod_ip-v193?

If so, how are you deploying the VPC CNI with this change? When set, this feature also requires the ClusterRole to be updated, as the README mentions. If you are deploying with helm or the EKS managed addon, this should be taken care of: https://github.com/aws/amazon-vpc-cni-k8s/blob/v1.13.2/charts/aws-vpc-cni/templates/clusterrole.yaml#L17

Can you check the output of k get clusterrole aws-node -n kube-system -o yaml and look at the pods permissions?

Hello, I'm working on this with Ramandeep. Here are the details you asked for while he's out:

We have the ANNOTATE_POD_IP environment variable set:

$ k get ds -n kube-system aws-node -o yaml | yq '.spec.template.spec.containers[0].env' | grep -i annotate -A2
- name: ANNOTATE_POD_IP
  value: "true"
- name: AWS_VPC_CNI_NODE_PORT_SUPPORT

And here's the aws-node clusterrole permissions:

$ k get clusterrole aws-node -n kube-system -o yaml
[...]
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - list
  - watch
  - get
  - patch
[...]

I believe this is a race condition because this error comes up intermittently, usually during periods of high pod churn. Every one or two days we get about a dozen pods that get stuck in this state all at once across multiple different nodes, with the above logs repeating constantly. New pods that spin up later don't get into this state and start up correctly.

This log specifically is interesting, from https://github.com/aws/amazon-vpc-cni-k8s/blob/v1.13.2/pkg/ipamd/ipamd.go#L2103:

Failed to annotate vpc.amazonaws.com/pod-ips the pod with 10.17.76.32, error Pod \"command-b7aeaa61-[...]\" is invalid: spec: Forbidden: pod updates may not change fields other than [...]

We traced the code but we couldn't see how this could be changing any fields other than the annotation. Do you have any ideas?

@gesarki that is very strange, as that error log is implying a permissions issue that shouldn't occur given that you have patch permissions set for pods. Are you adding any special tolerations to these pods?

The only other thing that comes to mind is that we migrated from a raw client with an older controller-runtime to a cached client with newer version in v1.15.x, so perhaps the underlying k8s call changed. Is it possible for you to test with the latest VPC CNI version?

Other than that, I think we would need an EKS support ticket to dig further into this, as we'll need to look at the API server audit logs

Thanks for your help @jdn5126, after checking the API server audit logs we found that it was due to a MutatingAdmissionWebhook (that we installed via kyverno) modifying the patches that the VPC CNI sent.

We can actually see the illegal patch in the error logs, they were just hard to notice because the formatting was lost in the VPC CNI logs. The API server audit logs have the correct formatting:

Pod "command-b6743717-ad50-4576-ad8c-fc5723e2e898-28322220-kmt8g" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)
  core.PodSpec{
  	... // 15 identical fields
  	Subdomain:         "",
  	SetHostnameAsFQDN: nil,
- 	Affinity:          nil,
+ 	Affinity: &core.Affinity{
+ 		NodeAffinity: &core.NodeAffinity{
+ 			RequiredDuringSchedulingIgnoredDuringExecution: &core.NodeSelector{NodeSelectorTerms: []core.NodeSelectorTerm{...}},
+ 		},
+ 	},
  	SchedulerName: "default-scheduler",
  	Tolerations:   {{Key: "kaas.acquia.io/pool", Operator: "Equal", Value: "jobs"}, {Key: "node.kubernetes.io/not-ready", Operator: "Exists", Effect: "NoExecute", TolerationSeconds: &300}, {Key: "node.kubernetes.io/unreachable", Operator: "Exists", Effect: "NoExecute", TolerationSeconds: &300}},
  	... // 11 identical fields

Posting in case someone else runs into this error 🙂.

@gesarki nice debugging, thank you for updating! Hopefully something like this is hit by future searches 🤞🏻

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.