aws/amazon-vpc-cni-k8s

Fail pod create correctly when ANNOTATE_POD_IP is configured with No IPs in datastore

jayanthvn opened this issue · 3 comments

What happened:

When no IP address is available in IPAMD, we see the following error in the logs:

DataStore has no available IP/Prefix addresses

And we return a non-nil error:

ipv4Addr, ipv6Addr, deviceNumber, err = s.ipamContext.dataStore.AssignPodIPAddress(ipamKey, ipamMetadata, s.ipamContext.enableIPv4, s.ipamContext.enableIPv6)

"", -1, errors.New("assignPodIPv4AddressUnsafe: no available IP/Prefix addresses")

When ANNOTATE_POD_IP is not configured, we simply return the non-nil error and the CNI fails the ADD.

But when ANNOTATE_POD_IP is configured:

if s.ipamContext.enablePodIPAnnotation {
	// On ADD, we pass empty string as there is no IP being released
	err = s.ipamContext.AnnotatePod(in.K8S_POD_NAME, in.K8S_POD_NAMESPACE, vpccniPodIPKey, ipv4Addr, "")
	if err != nil {
		log.Errorf("Failed to add the pod annotation: %v", err)
	}
}
resp := rpc.AddNetworkReply{
	Success:         err == nil,
	IPv4Addr:        ipv4Addr,
	IPv6Addr:        ipv6Addr,
	DeviceNumber:    int32(deviceNumber),
	UseExternalSNAT: useExternalSNAT,
	VPCv4CIDRs:      pbVPCV4cidrs,
	VPCv6CIDRs:      pbVPCV6cidrs,
	PodVlanId:       int32(vlanID),
	PodENIMAC:       branchENIMAC,
	PodENISubnetGW:  podENISubnetGW,
	ParentIfIndex:   int32(trunkENILinkIndex),
}

The AnnotatePod call overwrites "err", so when the annotation succeeds the original assignment error is replaced with nil and we see this log:

{"level":"info","ts":"2023-11-02T23:00:02.817Z","caller":"rpc/rpc.pb.go:713","msg":"Send AddNetworkReply: IPv4Addr , IPv6Addr: , DeviceNumber: -1, err: <nil>"}
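The overwrite can be reproduced in isolation. The following is a minimal sketch, not the IPAMD code: `reply`, `assignIP`, `annotatePod`, and `addNetworkBuggy` are hypothetical stand-ins for `rpc.AddNetworkReply`, `AssignPodIPAddress`, `AnnotatePod`, and the ADD handler, and the annotation is assumed to always succeed.

```go
package main

import (
	"errors"
	"fmt"
)

// reply is a hypothetical, trimmed-down stand-in for rpc.AddNetworkReply.
type reply struct {
	Success      bool
	IPv4Addr     string
	DeviceNumber int32
}

// assignIP is a hypothetical stand-in for AssignPodIPAddress on an
// exhausted datastore: it always fails with no IP and device -1.
func assignIP() (string, int, error) {
	return "", -1, errors.New("assignPodIPv4AddressUnsafe: no available IP/Prefix addresses")
}

// annotatePod is a hypothetical stand-in for AnnotatePod. It is assumed
// to succeed, which is the case that hides the assignment error.
func annotatePod(ip string) error {
	return nil
}

// addNetworkBuggy mirrors the flow quoted in the issue: the same err
// variable is reused for the annotation call.
func addNetworkBuggy(annotate bool) reply {
	ipv4Addr, deviceNumber, err := assignIP()
	if annotate {
		// The assignment error is lost here because err is reassigned.
		err = annotatePod(ipv4Addr)
	}
	return reply{
		Success:      err == nil,
		IPv4Addr:     ipv4Addr,
		DeviceNumber: int32(deviceNumber),
	}
}

func main() {
	fmt.Printf("annotation off: %+v\n", addNetworkBuggy(false))
	fmt.Printf("annotation on:  %+v\n", addNetworkBuggy(true))
}
```

With annotation off, the reply reports failure as expected. With annotation on, it reports success while still carrying an empty IP and device number -1, which matches the log line above.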

Hence the CNI plugin does not take the failure path here: https://github.com/aws/amazon-vpc-cni-k8s/blob/master/cmd/routed-eni-cni-plugin/cni.go#L178-L182

This leads to setupPodNetwork being called with a nil IP:

{"level":"error","ts":"2023-11-02T23:00:02.818Z","caller":"routed-eni-cni-plugin/cni.go:126","msg":"Failed SetupPodNetwork for container ********: SetupPodNetwork: failed to setup veth pair: failed to setup veth network: setup NS network: failed to add default gateway: one of Dst.IP, Src, or Gw must not be nil"}

Attach logs

What you expected to happen:
Check for a non-nil err, a device number of -1, or a nil IP before annotating the pod here:

if s.ipamContext.enablePodIPAnnotation {

How to reproduce it (as minimally and precisely as possible):
Exhaust the available IP addresses on a node with ANNOTATE_POD_IP set, then create a new pod on that node.

Anything else we need to know?: N/A

Environment:

  • Kubernetes version (use kubectl version):
  • CNI Version
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):

Nice catch, yeah this needs a guard

Closing as fixed by #2702
