awslabs/amazon-eks-ami

Kubernetes service endpoint timeout error (Untraceable Packet loss)

AbeOwlu opened this issue · 1 comment

What happened: Connections from a pod to the kubernetes service cluster IP fail with a timeout error

  • connection error
I0822 23:51:24.801068      19 merged_client_builder.go:121] Using in-cluster configuration
I0822 23:51:24.801562      19 round_trippers.go:466] curl -v -XGET  -H "Accept: application/json;g=apidiscovery.k8s.io;v=v2;as=APIGroupDiscoveryList,application/json;g=apidiscovery.k8s.io;v=v2beta1;as=APIGroupDiscoveryList,application/json" -H "User-Agent: kubectl/v1.30.3 (linux/amd64) kubernetes/6fc0a69" -H "Authorization: Bearer <masked>" 'https://172.20.0.1:443/api?timeout=32s'
I0822 23:51:54.803543      19 round_trippers.go:508] HTTP Trace: Dial to tcp:172.20.0.1:443 failed: dial tcp 172.20.0.1:443: i/o timeout
I0822 23:51:54.803605      19 round_trippers.go:553] GET https://172.20.0.1:443/api?timeout=32s  in 30001 milliseconds
I0822 23:51:54.803619      19 round_trippers.go:570] HTTP Statistics: DNSLookup 0 ms Dial 30001 ms TLSHandshake 0 ms Duration 30001 ms
  • attempted trace from the node: the pod IP (10.0.8.246) starts a connection and the forward packet is seen in conntrack, then nothing. The packet doesn't seem to show up in the mangle table's PREROUTING chain?
    Screenshot 2024-08-22 at 7 04 25 PM

  • kube-proxy (most probably the responsible component) has loaded the iptables-legacy module into the kernel, even though iptables-legacy is not an installed package: "Warning: iptables-legacy tables present, use iptables-legacy to see them"
    Screenshot 2024-08-22 at 7 11 32 PM
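One way to check the legacy-table warning on a node without the iptables-legacy binary is to read the kernel's own list of registered legacy tables. The sketch below is a minimal illustration, not a verified diagnosis: it assumes the legacy tables are listed one per line in /proc/net/ip_tables_names (and /proc/net/ip6_tables_names for IPv6), that the file is absent when the legacy module is not loaded, and that the script runs as root on the node. The function name legacy_tables is hypothetical.

```python
from pathlib import Path

# Assumption: these proc files list the names of the legacy (non-nft)
# tables currently registered in the kernel, one name per line.
LEGACY_TABLE_FILES = (
    "/proc/net/ip_tables_names",
    "/proc/net/ip6_tables_names",
)


def legacy_tables(path):
    """Return the table names listed in `path`, or [] if the file is absent.

    A PermissionError is deliberately not caught: reading these files
    usually needs root, and an unreadable file should not be mistaken
    for "no legacy tables".
    """
    try:
        text = Path(path).read_text()
    except FileNotFoundError:
        return []
    return [line.strip() for line in text.splitlines() if line.strip()]


# Example use on a node (hypothetical; run as root):
#   for f in LEGACY_TABLE_FILES:
#       print(f, legacy_tables(f))
```

A non-empty result would be consistent with the warning above; it does not by itself show which component created the tables.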

What you expected to happen: A successful connection to the kube-apiserver (KAS)

How to reproduce it (as minimally and precisely as possible): connect to the kubernetes service cluster IP from any pod; the failure appears to be intermittent
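Because the failure is intermittent, a repeated dial from inside a pod can help quantify it. The following is a rough sketch under stated assumptions, not the exact procedure used in this report: it only attempts the TCP dial (the step that times out in the log above), the helper names probe, probe_many, and summarize are hypothetical, and the attempt count, interval, and timeout are arbitrary.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Attempt one TCP dial; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:  # covers timeouts, refusals, unreachable hosts
        return False, time.monotonic() - start, str(exc)


def probe_many(host, port, attempts=20, interval=1.0, timeout=5.0):
    """Dial repeatedly, sleeping `interval` seconds between attempts."""
    results = []
    for i in range(attempts):
        results.append(probe(host, port, timeout))
        if i < attempts - 1:
            time.sleep(interval)
    return results


def summarize(results):
    """Count failed dials and report the slowest attempt."""
    failures = sum(1 for ok, _, _ in results if not ok)
    slowest = max((elapsed for _, elapsed, _ in results), default=0.0)
    return {"attempts": len(results), "failures": failures, "max_elapsed": slowest}


# Example use from a pod, with the cluster IP taken from the log above:
#   print(summarize(probe_many("172.20.0.1", 443)))
```

Running the same loop from the node itself and from a second pod would help show whether the loss is tied to one pod or to the node as a whole.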

Anything else we need to know?:

Environment:

  • AWS Region: us-west-2
  • Instance Type(s): t3.medium
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): "eks.20"
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version): "1.27"
  • AMI Version: ami-094a83f86d8289432 amazon/amazon-eks-node-al2023-x86_64-standard-1.27-v20240703
  • Kernel (e.g. uname -a): Linux ip-10-0-8-243.us-west-2.compute.internal 6.1.102-108.177.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jul 31 10:18:50 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /etc/eks/release on a node):

Please open a case with AWS support; we would need to look into the details of your environment.