[BUG] container restarted, Could not allocate IP in range despite having reservation for existing Pod
xagent003 opened this issue · 2 comments
@dougbtv @miguel Duarte de Mora Barroso Is there a reason whereabouts should not return an existing IP reservation if the podRef matches? We're seeing more issues surrounding fully reserved IP pools.
Did some tests on node reboots and restarting our stack's k8s services and kubelet. What I noticed is that kubelet recreates the container when it restarts, but we only see an ADD operation coming into the CNI/whereabouts. This fails because the Pods already had IPs and the IP pool is full. As a result, the Pod gets stuck in the ContainerCreating state:
E0104 22:37:06.684288 8702 remote_runtime.go:198] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to setup network for sandbox "a8d579efef9622a4f30486ff435fae4006022cb2c54941ba0a0bccda2385c6a9": plugin type="multus" name="multus-cni-network" failed (add): [default/asdasd-1:whereaboutsexample]: error adding container to network "whereaboutsexample": Error at storage engine: Could not allocate IP in range: ip: 10.128.165.32 / - 10.128.165.34 / range: net.IPNet{IP:net.IP{0xa, 0x80, 0xa5, 0x0}, Mask:net.IPMask{0xff, 0xff, 0xff, 0x0}}"
But in whereabouts.log:
2023-01-04T22:37:04.504Z DEBUG ADD - IPAM configuration successfully read: {Name:whereaboutsexample Type:whereabouts Routes:[] Datastore:kubernetes Addresses:[] OmitRanges:[] DNS:{Nameservers:[] Domain: Search:[] Options:[]} Range:10.128.165.0/24 RangeStart:10.128.165.32 RangeEnd:10.128.165.34 GatewayStr: EtcdHost: EtcdUsername: EtcdPassword:********* EtcdKeyFile: EtcdCertFile: EtcdCACertFile: LeaderLeaseDuration:1500 LeaderRenewDeadline:1000 LeaderRetryPeriod:500 LogFile:/tmp/whereabouts-macvlan165.log LogLevel:debug OverlappingRanges:true Gateway:<nil> Kubernetes:{KubeConfigPath:/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig K8sAPIRoot:} ConfigurationPath: PodName:asdasd-1 PodNamespace:default}
2023-01-04T22:37:04.504Z DEBUG Beginning IPAM for ContainerID: a8d579efef9622a4f30486ff435fae4006022cb2c54941ba0a0bccda2385c6a9
...
2023-01-04T22:37:06.466Z DEBUG PF9: GetIpPool: &{TypeMeta:{Kind: APIVersion:} ObjectMeta:{Name:10.128.165.0-24 GenerateName: Namespace:default SelfLink: UID:3854e207-0e34-4e73-80e5-8883ff039b90 ResourceVersion:140858 Generation:10 CreationTimestamp:2023-01-04 06:21:39 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[] Finalizers:[] ClusterName: ManagedFields:[{Manager:whereabouts Operation:Update APIVersion:whereabouts.cni.cncf.io/v1alpha1 Time:2023-01-04 22:19:18 +0000 UTC FieldsType:FieldsV1 FieldsV1:{"f:spec":{".":{},"f:allocations":{".":{},"f:32":{".":{},"f:id":{},"f:podref":{}},"f:33":{".":{},"f:id":{},"f:podref":{}},"f:34":{".":{},"f:id":{},"f:podref":{}}},"f:range":{}}}}]} Spec:{Range:10.128.165.0/24 Allocations:map[32:{ContainerID:529a9ba352e94a553544ddbb838e17ae752c193c7306c71abc108076b2eeb773 PodRef:default/asdasd-0} 33:{ContainerID:0d6b3f6cfc602597bbea82ef00ec8804aa27bee17ca2b69f518191f70cb4af67 PodRef:default/asdasd-1} 34:{ContainerID:2a4b0e4ddd40e59d7693bb7aba317407246bce98e8875a9a0467f624484ed48d PodRef:default/asdasd-2}]}}
2023-01-04T22:37:06.466Z DEBUG PF9: Current Allocations: [IP: 10.128.165.32 is reserved for pod: default/asdasd-0 IP: 10.128.165.33 is reserved for pod: default/asdasd-1 IP: 10.128.165.34 is reserved for pod: default/asdasd-2]
2023-01-04T22:37:06.466Z DEBUG IterateForAssignment input >> ip: 10.128.165.32 | ipnet: {10.128.165.0 ffffff00} | first IP: 10.128.165.32 | last IP: 10.128.165.34
2023-01-04T22:37:06.466Z ERROR Error assigning IP: Could not allocate IP in range: ip: 10.128.165.32 / - 10.128.165.34 / range: net.IPNet{IP:net.IP{0xa, 0x80, 0xa5, 0x0}, Mask:net.IPMask{0xff, 0xff, 0xff, 0x0}}
As you can see, Pod default/asdasd-1 already has an IP reservation (we added some custom logs to print the IP pool details). Stranger still, we don't see an ADD coming into whereabouts for pods asdasd-0 and asdasd-2, despite seeing logs in kubelet for all 3 Pod replicas:
I0104 22:37:04.270321 8702 kuberuntime_manager.go:487] "No sandbox for pod can be found. Need to start a new one" pod="default/asdasd-1"
Also, we don't see a DEL coming in, despite the original container dying and kubelet recreating it.
In any case, the underlying containers could be restarted or die for a variety of reasons. Rather than trying to fix why the kubelet/container runtime did not send the DEL, or why the container died, I think whereabouts should just return the existing IP if there is a matching podRef reservation.
In this case the ADD shouldn't fail, because the PodRef already has a reservation; it's effectively the same Pod. Pod names should be unique for Deployments and DaemonSets, and in the case of a StatefulSet like the one above, we'd prefer it to keep the same IP anyway.
The IP would still get cleaned up either when we delete the Pod gracefully, or via the ip-reconciler if the node is brought down ungracefully.
I think right here, if an IP is already reserved, this function should check whether the podRef matches rather than just continuing: https://github.com/k8snetworkplumbingwg/whereabouts/blob/master/pkg/allocate/allocate.go#L215
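Roughly what I have in mind, as a standalone sketch rather than a patch against the actual IterateForAssignment code (the Reservation struct and assignIP function below are illustrative names, not the real types in pkg/allocate):

```go
package main

import (
	"fmt"
	"net"
)

// Reservation is an illustrative stand-in for a whereabouts allocation entry:
// an IP plus the container ID and "namespace/name" pod reference that own it.
type Reservation struct {
	IP          net.IP
	ContainerID string
	PodRef      string
}

// assignIP walks the candidate IPs in the range. Instead of failing when every
// address is reserved, it hands back the existing reservation if its PodRef
// matches the requesting pod (e.g. the sandbox was recreated without a DEL).
func assignIP(candidates []net.IP, reserved []Reservation, podRef string) (net.IP, error) {
	byIP := make(map[string]Reservation, len(reserved))
	for _, r := range reserved {
		byIP[r.IP.String()] = r
	}
	for _, ip := range candidates {
		existing, taken := byIP[ip.String()]
		if !taken {
			return ip, nil // free address: allocate as usual
		}
		if existing.PodRef == podRef {
			return ip, nil // same pod re-added: reuse its reservation
		}
	}
	return nil, fmt.Errorf("could not allocate IP in range: pool full and no reservation for %s", podRef)
}

func main() {
	candidates := []net.IP{
		net.ParseIP("10.128.165.32"),
		net.ParseIP("10.128.165.33"),
		net.ParseIP("10.128.165.34"),
	}
	reserved := []Reservation{
		{IP: net.ParseIP("10.128.165.32"), ContainerID: "529a9ba3", PodRef: "default/asdasd-0"},
		{IP: net.ParseIP("10.128.165.33"), ContainerID: "0d6b3f6c", PodRef: "default/asdasd-1"},
		{IP: net.ParseIP("10.128.165.34"), ContainerID: "2a4b0e4d", PodRef: "default/asdasd-2"},
	}
	// With the pool state from the log above, the restarted asdasd-1 gets back
	// 10.128.165.33 instead of the "Could not allocate IP in range" error.
	ip, err := assignIP(candidates, reserved, "default/asdasd-1")
	fmt.Println(ip, err)
}
```

A real fix would presumably also update the ContainerID on the existing allocation so the new sandbox owns the reservation, but that's the general idea.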
Can we merge this fix into the master branch? We have been running into this full IP pool scenario a few times when a cloud node crashes.