awslabs/amazon-eks-ami

AL2023 should use actual max pods to calculate kube-reserved

Opened this issue · 2 comments

What would you like to be added:

I'd like to see AL2023 (AL2 doesn't have the correct inputs) calculate the kube-reserved value from the actual max pods if provided and not always use the hypothetical instance value.

Why is this needed:

Running a calculation for kube-reserved based on max pods makes sense but using the wrong value for max pods in this calculation when there is a user defined value is part of the structured API doesn't make any sense. If I've set my node's max pods value to 110 then I expect the kube-reserved value to reflect this.

CC @cartermckinnon

I also noticed this for the ubuntu custom ami, which probably uses this repo as a reference to build their eks images.

There are two problems actually:

  • the script does not consider the fact that one could use prefix delegation mode for the vpc cni (which allows you to have more pods per node compared to the default mode) and doesn't calculate max pods properly. It only considers what is available on eni-max-pods.txt.
  • as mentioned, if you overwrite max pods, the script does not consider the overwrite when reserving memory for kube-reserved. This is potentially dangerous and might cause node instability (https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/)

https://github.com/awslabs/amazon-eks-ami/blob/main/templates/al2/runtime/bootstrap.sh#L504-L527

looks like we're affected by this as well