awslabs/amazon-eks-ami

AWS EKS (1.29) node failing to join the cluster

milosgajdos opened this issue · 1 comment

What happened:

When creating a new EKS nodegroup, the node fails to join the cluster. Tailing the kubelet logs on the affected node shows the following errors:

Jun 06 19:33:03 ip-10-150-127-198 kubelet[3214]: I0606 19:33:03.796478 3214 kubelet_node_status.go:73] "Attempting to register node" node="ip-10-150-127-198.ec2.internal"
Jun 06 19:33:03 ip-10-150-127-198 kubelet[3214]: E0606 19:33:03.828406 3214 kubelet_node_status.go:96] "Unable to register node with API server" err="Unauthorized" node="ip-10-150-127-198.ec2.internal"
Jun 06 19:33:03 ip-10-150-127-198 kubelet[3214]: E0606 19:33:03.828481 3214 controller.go:145] "Failed to ensure lease exists, will retry" err="Unauthorized" interval="7s"
Jun 06 19:33:04 ip-10-150-127-198 kubelet[3214]: I0606 19:33:04.623042 3214 csi_plugin.go:880] Failed to contact API server when waiting for CSINode publishing: Unauthorized
Jun 06 19:33:05 ip-10-150-127-198 kubelet[3214]: I0606 19:33:05.624405 3214 csi_plugin.go:880] Failed to contact API server when waiting for CSINode publishing: Unauthorized

This is similar to #1376
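
For anyone triaging a similar log, here is a rough sketch of how the kubelet output could be summarized to see which code paths are being rejected. It assumes the journald-prefixed klog layout shown in the excerpt above; the regex and the `unauthorized_sources` helper are hypothetical names made up for illustration, not part of kubelet or any EKS tooling.

```python
import re
from collections import Counter

# Assumed layout: "... kubelet[PID]: <sev><MMDD> <time> <pid> <file>:<line>] <message>"
KLOG = re.compile(
    r"kubelet\[\d+\]: (?P<sev>[IWEF])\d{4} \S+\s+\d+ (?P<src>[\w.]+:\d+)\] (?P<msg>.*)"
)

def unauthorized_sources(lines):
    """Count log entries whose message mentions 'Unauthorized', keyed by file:line."""
    counts = Counter()
    for line in lines:
        match = KLOG.search(line)
        if match and "Unauthorized" in match.group("msg"):
            counts[match.group("src")] += 1
    return counts

# Sample lines taken from the log excerpt in this issue.
sample = [
    'Jun 06 19:33:03 ip-10-150-127-198 kubelet[3214]: I0606 19:33:03.796478 3214 kubelet_node_status.go:73] "Attempting to register node" node="ip-10-150-127-198.ec2.internal"',
    'Jun 06 19:33:03 ip-10-150-127-198 kubelet[3214]: E0606 19:33:03.828406 3214 kubelet_node_status.go:96] "Unable to register node with API server" err="Unauthorized" node="ip-10-150-127-198.ec2.internal"',
    'Jun 06 19:33:03 ip-10-150-127-198 kubelet[3214]: E0606 19:33:03.828481 3214 controller.go:145] "Failed to ensure lease exists, will retry" err="Unauthorized" interval="7s"',
    "Jun 06 19:33:04 ip-10-150-127-198 kubelet[3214]: I0606 19:33:04.623042 3214 csi_plugin.go:880] Failed to contact API server when waiting for CSINode publishing: Unauthorized",
]

if __name__ == "__main__":
    # Print each source location that hit an Unauthorized response.
    for src, n in unauthorized_sources(sample).items():
        print(f"{src}: {n}")
```

In this excerpt every API call after the initial registration attempt is rejected, so the rejection does not look specific to one kubelet component.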

What you expected to happen:

Nodegroup node successfully joins the cluster.

How to reproduce it (as minimally and precisely as possible):

Create a new nodegroup in EKS and try adding a node to it (we actually use Terraform for this).

Anything else we need to know?:

Environment:

  • AWS Region: us-east-1
  • Instance Type(s): AL2_x86_64
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): eks.7
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version): 1.29
  • AMI Version: amazon-eks-node-1.29-v20240605
  • Kernel (e.g. uname -a): Linux ip-10-150-116-201 5.10.217-205.860.amzn2.x86_64 #1 SMP Tue May 21 16:52:24 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /etc/eks/release on a node):
BASE_AMI_ID="ami-017bf9da04c44cc95"
BUILD_TIME="Wed Jun  5 21:30:20 UTC 2024"
BUILD_KERNEL="5.10.217-205.860.amzn2.x86_64"
ARCH="x86_64"

@milosgajdos this sounds like an issue with your cluster's auth setup. Please open a ticket with AWS Support so we can look into the details of your environment.