aws/amazon-vpc-cni-k8s

Security Groups for Pods branch interfaces not being cleaned up after pod deletion


What happened:
We have enabled Security Groups for Pods, which means each pod must be assigned a branch network interface before it can start. We have observed pods stuck in the ContainerCreating state with the following error:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox

This is because the node has run out of branch interfaces to assign to new pods. The Kubernetes scheduler thinks there are branch interfaces available on the node when in fact all of them are occupied.
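For reference, the scheduler's view of branch-interface capacity can be checked on the node (the node name is a placeholder); both the Allocatable and "Allocated resources" sections report the vpc.amazonaws.com/pod-eni extended resource:

# Prints the pod-eni lines from the node description; the scheduler sees
# free slots here even while the branch ENIs are still attached in EC2.
kubectl describe node <node-name> | grep -i "pod-eni"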

We noticed that the logs indicate branch ENIs are not being cleaned up properly when a pod is deleted.
/var/log/aws-routed-eni/ipamd.log

{"level":"info","ts":"2024-08-28T19:50:51.147Z","caller":"rpc/rpc.pb.go:881","msg":"Received DelNetwork for Sandbox 8afed98db5bd696ba1a46dcfa3e237be048d9667a9425c5bea1683beb35496e3"}
{"level":"debug","ts":"2024-08-28T19:50:51.147Z","caller":"rpc/rpc.pb.go:881","msg":"DelNetworkRequest: K8S_POD_NAME:\"test-deployment-7fdb64448-g4dnq\" K8S_POD_NAMESPACE:\"sean-scale\" K8S_POD_INFRA_CONTAINER_ID:\"8afed98db5bd696ba1a46dcfa3e237be048d9667a9425c5bea1683beb35496e3\" Reason:\"PodDeleted\" ContainerID:\"8afed98db5bd696ba1a46dcfa3e237be048d9667a9425c5bea1683beb35496e3\" IfName:\"eth0\" NetworkName:\"aws-cni\""}
{"level":"debug","ts":"2024-08-28T19:50:51.147Z","caller":"ipamd/rpc_handler.go:261","msg":"UnassignPodIPAddress: IP address pool stats: total 28, assigned 4, sandbox aws-cni/8afed98db5bd696ba1a46dcfa3e237be048d9667a9425c5bea1683beb35496e3/eth0"}
{"level":"debug","ts":"2024-08-28T19:50:51.147Z","caller":"ipamd/rpc_handler.go:261","msg":"UnassignPodIPAddress: Failed to find IPAM entry under full key, trying CRI-migrated version"}
{"level":"warn","ts":"2024-08-28T19:50:51.147Z","caller":"ipamd/rpc_handler.go:261","msg":"UnassignPodIPAddress: Failed to find sandbox _migrated-from-cri/8afed98db5bd696ba1a46dcfa3e237be048d9667a9425c5bea1683beb35496e3/unknown"}
{"level":"warn","ts":"2024-08-28T19:50:51.147Z","caller":"rpc/rpc.pb.go:881","msg":"Send DelNetworkReply: Failed to get pod spec: error while trying to retrieve pod info: Pod \"test-deployment-7fdb64448-g4dnq\" not found"}

/var/log/aws-routed-eni/plugin.log

{"level":"info","ts":"2024-08-28T19:38:51.056Z","caller":"routed-eni-cni-plugin/cni.go:314","msg":"Received CNI del request: ContainerID(2e970a5ecd4137a9731736f9caf2f109023e2b6b694bdee92632fe559f84400f) Netns() IfName(eth0) Args(K8S_POD_INFRA_CONTAINER_ID=2e970a5ecd4137a9731736f9caf2f109023e2b6b694bdee92632fe559f84400f;K8S_POD_UID=14582d59-8109-41bd-8028-ea0215dd75dc;IgnoreUnknown=1;K8S_POD_NAMESPACE=sean-scale;K8S_POD_NAME=test-deployment-7fdb64448-kr9jm) Path(/opt/cni/bin) argsStdinData({\"cniVersion\":\"0.4.0\",\"mtu\":\"9001\",\"name\":\"aws-cni\",\"pluginLogFile\":\"/var/log/aws-routed-eni/plugin.log\",\"pluginLogLevel\":\"DEBUG\",\"podSGEnforcingMode\":\"standard\",\"type\":\"aws-cni\",\"vethPrefix\":\"eni\"})"}
{"level":"error","ts":"2024-08-28T19:38:51.058Z","caller":"routed-eni-cni-plugin/cni.go:314","msg":"Error received from DelNetwork gRPC call for container 2e970a5ecd4137a9731736f9caf2f109023e2b6b694bdee92632fe559f84400f: rpc error: code = Unknown desc = error while trying to retrieve pod info: Pod \"test-deployment-7fdb64448-kr9jm\" not found"}
{"level":"info","ts":"2024-08-28T19:38:51.058Z","caller":"routed-eni-cni-plugin/cni.go:393","msg":"PrevResult not available for pod. Pod may have already been deleted."}
{"level":"info","ts":"2024-08-28T19:38:51.058Z","caller":"routed-eni-cni-plugin/cni.go:314","msg":"Could not teardown pod using prevResult: ContainerID(2e970a5ecd4137a9731736f9caf2f109023e2b6b694bdee92632fe559f84400f) Netns() IfName(eth0) PodNamespace(sean-scale) PodName(test-deployment-7fdb64448-kr9jm)"}

By searching the EC2 network interfaces we can see that the branch interfaces still exist for about a minute after scaling the pods down:
(Screenshot, 2024-08-28 1:10:44 PM: EC2 console showing the branch interfaces still attached)

They are only cleaned up by garbage collection.
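For anyone else checking this, a rough AWS CLI equivalent of the console search is below. The description value "aws-k8s-branch-eni" is an assumption about how the VPC Resource Controller describes branch ENIs, so adjust the filter to whatever your interfaces actually show:

# List interfaces that look like branch ENIs along with their attachment status.
aws ec2 describe-network-interfaces \
  --filters "Name=description,Values=aws-k8s-branch-eni" \
  --query 'NetworkInterfaces[*].[NetworkInterfaceId,Status]' \
  --output table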

What you expected to happen:
Once a pod is deleted, its branch interface should be cleaned up and made immediately available for use by another pod.

How to reproduce it (as minimally and precisely as possible):
1. Enable Security Groups for Pods by setting ENABLE_POD_ENI = "true" on the VPC CNI (aws-node).
2. Add the following resource request to a deployment's container spec:

   resources:
     requests:
       vpc.amazonaws.com/pod-eni: "1"

3. Scale the deployment up so that its pods are assigned branch interfaces.
4. Scale the deployment back to 0.
5. Observe that the branch interfaces remain attached for ~60 seconds after pod deletion (a command sketch follows below).
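A rough sketch of the commands behind these steps (the deployment name and namespace are taken from the full spec further down; the replica count is arbitrary):

# Enable Security Groups for Pods on the VPC CNI daemonset.
kubectl -n kube-system set env daemonset aws-node ENABLE_POD_ENI=true

# Scale the test deployment up so its pods get branch interfaces, then back to 0.
kubectl -n sean-scale scale deployment test-deployment --replicas=10
kubectl -n sean-scale scale deployment test-deployment --replicas=0

# The branch ENIs then remain visible in EC2 for roughly a minute.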

Anything else we need to know?:
We came across this tip https://aws.github.io/aws-eks-best-practices/networking/sgpp/#verify-terminationgraceperiodseconds-in-pod-specification-file and have set terminationGracePeriodSeconds to 30s, but we are still experiencing the issue.

Environment:

  • Kubernetes version (use kubectl version): v1.29.7-eks-2f46c53
  • CNI Version: v1.18.0
  • OS (e.g: cat /etc/os-release): Ubuntu 22.04.4 LTS
  • Kernel (e.g. uname -a): Linux ip-172-23-175-189 6.5.0-1024-aws #24~22.04.1-Ubuntu SMP Thu Jul 18 10:43:12 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Were there any services running that could cause the application pods not to be deleted? Were the application pods deleted properly?

@orsenthil Yes, the pods were deleted properly. This is the full spec of what I used to reproduce:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-deployment
  namespace: sean-scale
  labels:
    app: test-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-deployment
  template:
    metadata:
      labels:
        app: test-deployment
    spec:
      containers:
      - image: busybox
        name: busybox
        command: ["sh", "-c", "sleep infinity"]
        resources:
          requests:
            vpc.amazonaws.com/pod-eni: "1"
          limits:
            vpc.amazonaws.com/pod-eni: "1"
      terminationGracePeriodSeconds: 30

The pods terminate straight away once I scale the deployment to 0.
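For reference, a quick way to confirm the pods are gone after the scale-down (label and namespace taken from the spec above):

# Should return no pods once the deployment has been scaled to 0.
kubectl -n sean-scale get pods -l app=test-deployment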