webhookconfiguration isn't cleaned up on operator deletion, and grows by namespaces
svollath opened this issue · 3 comments
Describe the bug
When deleting cf-operator, including its Kubernetes namespace, webhookconfigurations remain in the default namespace. When redeploying cf-operator to a different namespace than before, new webhookconfigurations get created in addition to the existing ones, and managed deployments such as kubecf can pick invalid ones.
As a result, kubecf deployment will fail with, e.g.
Error: Internal error occurred: failed calling webhook "validate-boshdeployment.quarks.cloudfoundry.org": Post https://cf-operator-webhook.cf-operator.svc:443/validate-boshdeployment?timeout=30s: service "cf-operator-webhook" not found
The two webhookconfigurations of interest are:
> kubectl get mutatingwebhookconfigurations.admissionregistration.k8s.io
> kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io
To Reproduce
We first deploy the operator in namespace cf-operator and install kubecf (successful).
Then we delete everything and deploy cf-operator in namespace cfo, and kubecf fails:
> kubectl create namespace cf-operator
> helm3 install cf-operator local/cf-operator-6.1.17+0.gec409fd7 --namespace cf-operator --set "global.singleNamespace.name=scf"
> helm3 install scf local/kubecf-2.5.8 --namespace scf --values kubecf-config-values_metallb.yaml
> helm3 delete scf -n scf; kubectl delete namespace scf; helm3 delete cf-operator -n cf-operator; kubectl delete namespace cf-operator
> wc -w resources
67 resources
> for i in $(cat resources); do if [ "$(kubectl get $i -o yaml | grep hook-cf-operator &>/dev/null && echo $?)" = "0" ]; then echo $i; fi; done
mutatingwebhookconfigurations.admissionregistration.k8s.io
validatingwebhookconfigurations.admissionregistration.k8s.io
> kubectl get mutatingwebhookconfigurations.admissionregistration.k8s.io
NAME WEBHOOKS AGE
cf-operator-hook-cf-operator 4 133m
> kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io
NAME WEBHOOKS AGE
cf-operator-hook-cf-operator 2 133m
> kubectl create namespace cfo
> helm3 install cf-operator local/cf-operator-6.1.17+0.gec409fd7 --namespace cfo --set "global.singleNamespace.name=scf"
> helm3 install scf local/kubecf-2.5.8 --namespace scf --values kubecf-config-values_metallb.yaml
Error: Internal error occurred: failed calling webhook "validate-boshdeployment.quarks.cloudfoundry.org": Post https://cf-operator-webhook.cf-operator.svc:443/validate-boshdeployment?timeout=30s: service "cf-operator-webhook" not found
> kubectl get mutatingwebhookconfigurations.admissionregistration.k8s.io
NAME WEBHOOKS AGE
cf-operator-hook-cf-operator 4 142m
cf-operator-hook-cfo 4 2m16s
> kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io
NAME WEBHOOKS AGE
cf-operator-hook-cf-operator 2 142m
cf-operator-hook-cfo 2 2m42s
Expected behavior
The webhookconfigurations should always match the current namespace and service names. This can be achieved e.g. by deleting them on operator deletion, even when they are located in default or aren't namespaced, or by making them belong to the cf-operator namespace so they get deleted on namespace deletion.
Environment
- cf-operator-6.1.17+0.gec409fd7
- kubecf-2.5.8
Workaround
The following workaround has been tested up to a successful cf login to kubecf.
When a user hits the error above, or on any cf-operator deletion, the two webhookconfigurations of the former (old) namespace(s) have to be deleted in addition to the general deployment removal:
> kubectl delete mutatingwebhookconfigurations.admissionregistration.k8s.io cf-operator-hook-cf-operator
mutatingwebhookconfiguration.admissionregistration.k8s.io "cf-operator-hook-cf-operator" deleted
> kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io cf-operator-hook-cf-operator
validatingwebhookconfiguration.admissionregistration.k8s.io "cf-operator-hook-cf-operator" deleted
Then the following kubecf deployment will succeed.
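If several old namespaces have accumulated, the manual deletion above can be sketched as a small filter. This is only a sketch: `stale_hook_configs` is a hypothetical helper name, and it assumes the configurations follow the `cf-operator-hook-<namespace>` naming seen in the output above, which may not hold for other operator versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cf-operator): reads webhook configuration
# names on stdin, one per line, optionally prefixed with "<kind>/" as printed
# by `kubectl get -o name`. Prints the names that match the assumed
# cf-operator-hook-<namespace> pattern but do NOT belong to the namespace
# passed as $1 (the namespace the operator currently runs in).
stale_hook_configs() {
  local current_ns="$1" name
  while IFS= read -r name; do
    name="${name##*/}"                               # strip "<kind>/" prefix
    case "$name" in
      "cf-operator-hook-$current_ns") ;;             # current install: keep
      cf-operator-hook-*) printf '%s\n' "$name" ;;   # leftover from an old namespace
    esac
  done
}

# Usage sketch (needs cluster access; review the printed names before deleting):
#   for kind in mutatingwebhookconfigurations validatingwebhookconfigurations; do
#     kubectl get "$kind" -o name | stale_hook_configs cfo |
#       while IFS= read -r n; do kubectl delete "$kind" "$n"; done
#   done
```

The function only filters text and does not call kubectl itself, so its output can be checked before anything is deleted.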
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/175294709
The labels on this github issue will be updated when the story is started.
@svollath
Hooks are not namespaced and can't belong to a namespace. I think we could use a post-delete hook (https://helm.sh/docs/topics/charts_hooks/).
However, installing the operator again should update that webhook. Can you retry with helm3 install --wait ... cf-operator (cloudfoundry-incubator/kubecf#1194)?
Turns out when the second operator is installed in a different namespace, the original hook configuration is neither updated nor deleted.
We'll fix this by adding a Helm hook and doing better cleanup.