aws/amazon-vpc-cni-k8s

Intermittent timeouts connecting to service in same EKS cluster via Internal NLB

thatInfrastructureGuy opened this issue · 3 comments

What happened:
We get intermittent timeouts when connecting to services in the EKS cluster. This only happens when the caller and callee are in the same EKS cluster and traffic goes through a private NLB. We have also noticed that it happens when the caller pod is in the same AZ as the private NLB IP being hit.

How to reproduce it (as minimally and precisely as possible):

  1. Create a Service of type LoadBalancer backed by a private (internal) NLB:
---
apiVersion: v1
kind: Namespace
metadata:
  labels:
    kubernetes.io/metadata.name: whoami
  name: whoami
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  labels:
    app: whoami
  name: whoami
  namespace: whoami
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: whoami
  sessionAffinity: None
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: whoami
  name: whoami
  namespace: whoami
spec:
  replicas: 5
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
      - image: traefik/whoami
        imagePullPolicy: IfNotPresent
        name: whoami
        ports:
        - containerPort: 80
          protocol: TCP
---
  2. Wait for the load balancer to become active (check the AWS console).
  3. Get all the IPs of the private NLB:
nslookup xxx-xyz.elb.us-west-2.amazonaws.com
Server:         x.x.x.x
Address:        x.x.x.x#53

Non-authoritative answer:
Name:   xxx-xyz.elb.us-west-2.amazonaws.com
Address: 10.2.x.x  # us-west-2a
Name:   xxx-xyz.elb.us-west-2.amazonaws.com
Address: 10.2.y.y # us-west-2b
Name:   xxx-xyz.elb.us-west-2.amazonaws.com
Address: 10.2.z.z # us-west-2c
  4. On the same cluster, run a client pod and poll each NLB IP in turn:
kubectl run mysh --rm -i --tty --image alpine -- sh
/ # IP=10.2.x.x
/ # while true; do wget -O - -q -T 5 http://$IP/bench; sleep 0.5; done

/ # IP=10.2.y.y
/ # while true; do wget -O - -q -T 5 http://$IP/bench; sleep 0.5; done

/ # IP=10.2.z.z
/ # while true; do wget -O - -q -T 5 http://$IP/bench; sleep 0.5; done
  5. At this point, one of the IPs should give you intermittent timeouts. If so, open a second terminal and check whether the client pod is hosted in the same AZ as the NLB IP that is having the issue:
kubectl get pods mysh -o=jsonpath='{.status.hostIP}'

Anything else we need to know?:

  • We installed Cilium and still faced the same issue. The root cause seems to be that the NLB IP in the same AZ has trouble reaching the EKS NodePort, so the internal pod network, whether Cilium or the AWS VPC CNI, does not seem to matter.

Environment: EKS

  • Kubernetes version (use kubectl version): 1.27 (Tried 1.22 and 1.27)
  • CNI Version: Tried default and latest

@thatInfrastructureGuy this is not an AWS VPC CNI issue, so I don't think an issue in this GitHub repo will help. I suggest creating an EKS support ticket through your AWS console for help debugging this.


This article presented 2 options:

  • Using IP mode instead of instance mode: it did not work for me; I still had the same issue.
  • Disabling client IP preservation: everything started working fine after disabling it (see the annotation sketch below).
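
For reference, a minimal sketch of how those two options map to Service annotations, assuming the NLB is reconciled by the AWS Load Balancer Controller rather than the legacy in-tree provider used in the manifest above. Annotation names and supported attributes vary by controller version, and the two options are independent knobs (shown together only for brevity), so treat this as an illustration rather than a drop-in fix:

---
apiVersion: v1
kind: Service
metadata:
  annotations:
    # Hand the Service to the AWS Load Balancer Controller and keep the NLB internal.
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
    # Option 1: register pod IPs directly instead of instance NodePorts.
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    # Option 2: disable client IP preservation on the target group.
    service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=false
  labels:
    app: whoami
  name: whoami
  namespace: whoami
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: whoami
  type: LoadBalancer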


Disabling client IP preservation makes sense: with it enabled, a TCP connection might end up with the same source and target host, and the same port might be reused for multiple connections, which leads to timeouts.

This made me assume that the combination of disabling SNAT (via AWS_VPC_K8S_CNI_EXTERNALSNAT), IP mode, and the client IP preservation flag would work, since the TCP connection would then go directly from source pod IP to target pod IP, but this combination did not work for me. I got similar timeouts as before.
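
For context, a sketch of where that SNAT setting lives, assuming the stock aws-node DaemonSet in kube-system (excerpt of the pod template only, not a complete manifest):

# Excerpt of the aws-node DaemonSet pod template (kube-system).
# With AWS_VPC_K8S_CNI_EXTERNALSNAT=true the VPC CNI skips its SNAT rule,
# so pod traffic keeps the pod IP as the source (normally used with an external NAT gateway).
spec:
  containers:
  - name: aws-node
    env:
    - name: AWS_VPC_K8S_CNI_EXTERNALSNAT
      value: "true"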


Hence, the solution for me is to disable client IP preservation and possibly rely on Proxy Protocol to get the real client IPs; a sketch of the corresponding annotation follows.
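
A minimal sketch of that final combination, again assuming the AWS Load Balancer Controller's target-group-attributes annotation (attribute support varies by controller version and NLB configuration):

metadata:
  annotations:
    # Disable client IP preservation and enable Proxy Protocol v2 on the NLB target group;
    # the backend must then parse the Proxy Protocol header to recover the real client IP.
    service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=false,proxy_protocol_v2.enabled=true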