CNI not removing network built on a node after IP is lost externally and IPAMD reconciles this state
AbeOwlu opened this issue · 5 comments
IPAM reconciliation:
Scenario:
- A pod is created and assigned the IP 10.0.2.99.
- After sandbox initialization completes, the IP is reclaimed by an automation in the network external to the cluster.
- The IPAMD logs show an IP pool reconcile that catches this lost IP and reconciles its cache by calling the EC2 endpoint.
- The network route for this pod's IP, 10.0.2.99, remains unchanged on the local node. Other node peers are no longer able to reach this pod on 10.0.2.99, but it is still reachable from the local host, so kubernetes liveness probes keep succeeding - keeping an unhealthy pod in the cluster.
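The asymmetry in the last step can be confirmed from the affected node; a hedged sketch (device names vary per node, and port 8080 assumes the echoserver pod from the repro):

```shell
# On the affected node: the /32 host route for the pod IP is still installed
# even though EC2 no longer has the IP assigned to the ENI.
ip route show | grep 10.0.2.99

# Locally the pod still answers (so kubelet probes keep passing)...
curl --max-time 2 http://10.0.2.99:8080/health
# ...while the same curl from any peer node times out.
```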
{"level":"debug","ts":"2024-03-08T18:10:50.378Z","caller":"rpc/rpc.pb.go:713","msg":"AddNetworkRequest: K8S_POD_NAME:\"liveness-http\" K8S_POD_NAMESPACE:\"gateway-ns\" K8S_POD_INFRA_CONTAINER_ID:\"7f92409d45a01365839f5db2b7c30c35626c1de02779233046bf5c1bd2c59380\" ContainerID:\"7f92409d45a01365839f5db2b7c30c35626c1de02779233046bf5c1bd2c59380\" IfName:\"eth0\" NetworkName:\"aws-cni\" Netns:\"/var/run/netns/cni-d4e752dc-bdf7-f594-2a1a-38dfa2445dfb\""}
{"level":"info","ts":"2024-03-08T18:10:50.378Z","caller":"datastore/data_store.go:750","msg":"AssignPodIPv4Address: Assign IP 10.0.2.99 to sandbox aws-cni/7f92409d45a01365839f5db2b7c30c35626c1de02779233046bf5c1bd2c59380/eth0"}
External automation event / Event time:
March 08, 2024, 18:11:25 (UTC+00:00) UnassignPrivateIpAddresses "privateIpAddress": "10.0.2.99"
{"level":"warn","ts":"2024-03-08T18:12:00.256Z","caller":"ipamd/ipamd.go:1404","msg":"Instance metadata does not match data store! ipPool: [10.0.2.99 10.0.2.27 10.0.2.158], metadata: [{\n Primary: true,\n PrivateIpAddress: \"10.0.2.149\"\n} {\n Primary: false,\n PrivateIpAddress: \"10.0.2.27\"\n} {\n Primary: false,\n PrivateIpAddress: \"10.0.2.158\"\n}]"}
{"level":"info","ts":"2024-03-08T18:12:00.334Z","caller":"datastore/data_store.go:578","msg":"UnAssignPodIPAddress: Unassign IP 10.0.2.99 from sandbox aws-cni/7f92409d45a01365839f5db2b7c30c35626c1de02779233046bf5c1bd2c59380/eth0"}
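For anyone triaging similar nodes, the IPs that ipamd swept can be pulled out of the log with a quick grep; a sketch (the path /var/log/aws-routed-eni/ipamd.log is the usual location on EKS AMIs, and a sample line is inlined here for illustration):

```shell
# Extract the IP from an UnAssignPodIPAddress log line emitted during a pool
# reconcile; on a node, pipe /var/log/aws-routed-eni/ipamd.log in instead.
echo '{"level":"info","msg":"UnAssignPodIPAddress: Unassign IP 10.0.2.99 from sandbox aws-cni/abc/eth0"}' \
  | grep -o 'Unassign IP [0-9.]*' | awk '{print $3}'
# prints: 10.0.2.99
```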
What you expected to happen:
- After the event
"UnAssignPodIPAddress: Unassign IP 10.0.2.99 from sandbox aws-cni/7f9240..."
the CNI should be triggered to tear down the network route for this IP, so that the liveness probe can eventually fail and attempt to heal this pod.
How to reproduce it (as minimally and precisely as possible):
- Create a pod with liveness and readiness probes, for example:
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness3
  name: liveness-http3
spec:
  containers:
  - name: ngo-proxy
    image: gcr.io/google_containers/echoserver:1.4
    # args:
    # - /server
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /health
        port: 8080
        # httpHeaders:
        # - name: Custom-Header
        #   value: Awesome
      initialDelaySeconds: 60
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /health
        port: 8080
      # initialDelaySeconds: 50
      periodSeconds: 5
      successThreshold: 1
      timeoutSeconds: 2
  restartPolicy: Always
```
- Remove the IP from the node this pod is scheduled on, at any time.
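The external "reclaim" can be simulated directly against the EC2 API; a sketch (the ENI ID below is a placeholder, and the target must be a secondary IP, not the ENI's primary):

```shell
# Simulate the external automation: strip the pod's secondary IP from the ENI.
# eni-0123456789abcdef0 is a placeholder; find the real ENI with:
#   aws ec2 describe-network-interfaces \
#     --filters Name=addresses.private-ip-address,Values=10.0.2.99
aws ec2 unassign-private-ip-addresses \
  --network-interface-id eni-0123456789abcdef0 \
  --private-ip-addresses 10.0.2.99
```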
Anything else we need to know?:
- During the sweep phase of the nodeIPPoolReconcile process, should the CNI be invoked to updateHostNetwork for the removed IPs? - see issue
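Until the sweep does such a teardown, the stale host networking for a swept IP can be removed by hand. A rough sketch of what that cleanup undoes, assuming the usual aws-cni per-pod setup of a /32 host route plus ip rules (exact rules and priorities vary by node config; check `ip route` and `ip rule` output first):

```shell
# Manual cleanup of host networking left behind for a swept pod IP (run as root).
# These mirror what the CNI programs per pod; adjust to what the node shows.
ip route del 10.0.2.99/32 2>/dev/null
ip rule del to 10.0.2.99/32 2>/dev/null
ip rule del from 10.0.2.99/32 2>/dev/null
```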
Environment:
- Kubernetes version (use kubectl version):
- CNI Version: image: 602401143452.dkr.ecr.us-west-1.amazonaws.com/amazon-k8s-cni-init:v1.15.3-eksbuild.1
- OS (e.g. cat /etc/os-release):
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
- Kernel (e.g. uname -a):
Linux ....compute.internal 5.10.198-187.748.amzn2.x86_64 #1 SMP Tue Oct 24 19:49:54 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
@AbeOwlu what is this "external event" that reclaims an IP on an ENI? Only the IPAM daemon should be assigning and unassigning IPs on an ENI. Before calling the EC2 API to unassign IPs, it removes those IPs from the datastore. That precondition is required to avoid exactly this scenario.
There's an automation pipeline that is (incorrectly, I might add) detecting drift in the VPC network and unassigning an IP from the EC2 instance at the moment.
- Looking into this further, it actually appears the CRI attempted to recreate the container sandbox, but the CNI was not responsive (connection refused on all 3 attempts), so the orchestrator may be handling this case.
Will update with more details and logs...
I think I hit this issue too. Let me circle back with some more info
We had this issue: aws/amazon-vpc-resource-controller-k8s#412, which deleted the branch ENI from pods. The CNI didn't do anything about the missing network interface or lost IP address.
@AbeOwlu - The CNI will not remove any interface that it doesn't manage. For any external changes introduced to the interfaces that the CNI does manage, it will garbage collect them if they are not in use. If that didn't happen and you can reproduce this as a bug, let us know. Otherwise, we can close this ticket.