gruntwork-io/kubergrunt

CoreDNS 1.8.3 update requires clusterrole patch

gaba-mlindner opened this issue · 5 comments

I just upgraded my lab EKS cluster (managed by Gruntwork's terraform-aws-eks modules) from 1.19 to 1.20 and ran into some issues with kubergrunt's CoreDNS update.

After the module's provisioner had executed eks sync-core-components .., DNS resolution inside the cluster stopped working and the new CoreDNS pods starting logging errors like:

E0611 10:20:52.181166       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.EndpointSlice: failed to list *v1beta1.EndpointSlice: endpointslices.discovery.k8s.io is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "endpointslices" in API group "discovery.k8s.io" at the cluster scope

Quick googling led me to AWS' docs (see step 5):
https://docs.aws.amazon.com/eks/latest/userguide/managing-coredns.html#updating-coredns-add-on

It turns out that the CoreDNS 1.8.3 update requires additional permissions.

Manually patching the system:coredns clusterrole resolved my issues, but I think it would make sense to have kubergrunt handle this kind of thing automatically (since it's already patching the image version and configmap anyways).

versions:
terraform v1.0.0
terragrunt v0.29.2
kubergrunt v0.7.1
terraform-aws-eks v0.41.0

Thanks for the report! I have opened a PR with the fix: #130

Looking at the related PR and tests now!

Just FYI, I upgraded another cluster with the v0.7.2 version last night and it worked like a charm without manual intervention. Thanks for the quick fix!

Thanks for closing the loop!