aws/karpenter-provider-aws

Missing list/watch on customresourcedefinitions in 1.0.4 chart

Closed this issue · 4 comments

Description

Observed Behavior:
I updated from 1.0.2 to 1.0.4, and Karpenter now logs the following errors.

list:

k8s.io/client-go@v0.30.3/tools/cache/reflector.go:232: failed to list *v1.CustomResourceDefinition: customresourcedefinitions.apiextensions.k8s.io is forbidden: User "system:serviceaccount:kube-system:karpenter" cannot list resource "customresourcedefinitions" in API group "apiextensions.k8s.io" at the cluster scope

watch:

k8s.io/client-go@v0.30.3/tools/cache/reflector.go:232: Failed to watch *v1.CustomResourceDefinition: failed to list *v1.CustomResourceDefinition: customresourcedefinitions.apiextensions.k8s.io is forbidden: User "system:serviceaccount:kube-system:karpenter" cannot list resource "customresourcedefinitions" in API group "apiextensions.k8s.io" at the cluster scope

Did I miss something, or was this missed in the Helm chart?
After these errors, the Karpenter pod crashes.

Expected Behavior:
No crashloops.

Reproduction Steps (Please include YAML):
Used the Helm charts for Karpenter and its CRDs and updated from 1.0.2 to 1.0.4.

Versions:

  • Chart Version: 1.0.4
  • Kubernetes Version (kubectl version): 1.29

I came across this as well on EKS 1.31 and #7123 fixes the issue.

I take it both of you have webhooks disabled in your helm charts? These permissions are required for a version migration controller which is responsible for ensuring all resources are stored at v1; if the webhooks are disabled, the controllers shouldn't be enabled. While we're working on a fix / patch release to conditionally disable these controllers, you can unblock by adding these permissions or rolling back to the previous patch.
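
For anyone taking the "add these permissions" route, here is a minimal sketch of what that could look like as standalone RBAC manifests. The service account name and namespace are taken from the error messages above; the `karpenter-crd-read` name is hypothetical, and the verbs cover only the `list` and `watch` failures reported in this issue. The migration controller may need further permissions beyond these, so treat this as a stopgap under those assumptions, not as the chart's actual fix.

```yaml
# Hypothetical stopgap: grants the Karpenter service account read access to CRDs.
# The role and binding name "karpenter-crd-read" is an illustrative choice.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: karpenter-crd-read
rules:
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    # Only the verbs named in the reported errors; more may be required.
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: karpenter-crd-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: karpenter-crd-read
subjects:
  # Service account and namespace as shown in the "forbidden" error message.
  - kind: ServiceAccount
    name: karpenter
    namespace: kube-system
```

Keeping this in a separate ClusterRole, instead of editing the one the chart manages, means a later chart upgrade will not overwrite it, and it can be deleted once a fixed release is installed.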

We've merged #7128 which will disable the migration controllers when the webhooks aren't enabled. We'll be getting another patch release with this fix out soon.

Closing, we've released 1.0.5 with the discussed changes.