The webhook should reject IPPool deletion if any IP has been allocated
jessehu opened this issue · 23 comments
What steps did you take and what happened:
[A clear and concise description of how to REPRODUCE the bug.]
What did you expect to happen:
Anything else you would like to add:
Looks like it doesn't handle the delete request: https://github.com/metal3-io/ip-address-manager/blob/main/api/v1alpha1/ippool_webhook.go#L137
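For illustration only, a rejecting ValidateDelete in ippool_webhook.go could look roughly like the sketch below. This is not the PR's implementation: the Status.Allocations field name is an assumption, and newer controller-runtime versions use a (admission.Warnings, error) signature instead of the error-only one shown here.

```go
package v1alpha1

import (
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
)

// Sketch only: Status.Allocations is an assumed field name, and newer
// controller-runtime versions use a (admission.Warnings, error) signature
// for webhook.Validator methods.
func (c *IPPool) ValidateDelete() error {
	// Reject the delete while any address from this pool is still allocated,
	// so an in-use pool never enters the Deleting state.
	if n := len(c.Status.Allocations); n > 0 {
		return apierrors.NewBadRequest(fmt.Sprintf(
			"IPPool %s/%s still has %d allocated address(es); release them before deleting the pool",
			c.Namespace, c.Name, n))
	}
	return nil
}
```

With something like this in place, the delete request fails immediately at the apiserver instead of leaving the pool with a deletion timestamp.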
Environment:
- Cluster-api version:
- CAPM3 version:
- IPAM version:
- Minikube version:
- environment (metal3-dev-env or other):
- Kubernetes version: (use kubectl version):
/kind bug
I am not sure, tbh, if we want to have deletion logic in the IPPool webhook; reconcileDelete() already checks for the allocations:
ip-address-manager/controllers/ippool_controller.go, lines 156 to 171 at 31c03f1
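For readers without the repo open, the finalizer-based behaviour referenced there boils down to something like this sketch (the real code is at the lines linked above; ipamv1.IPPoolFinalizer and Status.Allocations are assumed names):

```go
package controllers

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

	ipamv1 "github.com/metal3-io/ip-address-manager/api/v1alpha1"
)

// Illustrative only: ipamv1.IPPoolFinalizer and Status.Allocations are
// assumed names; the actual logic lives at the lines referenced above.
func reconcileDelete(ctx context.Context, c client.Client, pool *ipamv1.IPPool) (ctrl.Result, error) {
	// While any address is still allocated, keep the finalizer in place and
	// requeue, so the pool object is only removed once everything is released.
	if len(pool.Status.Allocations) > 0 {
		return ctrl.Result{Requeue: true}, nil
	}

	// Nothing is allocated any more: drop the finalizer so the apiserver can
	// finish the deletion.
	if controllerutil.ContainsFinalizer(pool, ipamv1.IPPoolFinalizer) {
		controllerutil.RemoveFinalizer(pool, ipamv1.IPPoolFinalizer)
		if err := c.Update(ctx, pool); err != nil {
			return ctrl.Result{}, err
		}
	}
	return ctrl.Result{}, nil
}
```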
/cc @schrej if you have other opinions.
Well, there are only two options (so far) on how to handle deletion: finalizers and webhooks.
Finalizers are the safe way, as they prevent deletion on the apiserver and even work when the webhook isn't deployed.
Webhooks are the user-friendly way, as they reject the deletion up front, so the resource doesn't stay in a deleting state for a long time, as happens with finalizers (you can't un-delete a resource).
In CAPI this was discussed in kubernetes-sigs/cluster-api#5498 for example.
Either solution, or even a combination of both, is fine. Personally I'd probably prefer the combined approach, as you don't end up with deleted pools that remain in use for a long time that way.
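In code terms the combined approach just means registering the validating webhook for the type while the controller keeps its finalizer handling. A minimal sketch of the wiring, which the repository's webhook file most likely already contains:

```go
package v1alpha1

import (
	ctrl "sigs.k8s.io/controller-runtime"
)

// Combined approach: register the validating webhook for the type so users
// get immediate feedback at the apiserver, while the controller's
// finalizer-based reconcileDelete stays in place as the safety net for
// clusters where the webhook is not deployed.
func (c *IPPool) SetupWebhookWithManager(mgr ctrl.Manager) error {
	return ctrl.NewWebhookManagedBy(mgr).
		For(c).
		Complete()
}
```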
As you can't un-delete a resource in K8s, the webhook approach can prevent an in-use IPPool from going into the Deleting state in case it's deleted unintentionally.
Right now, the deletion is accepted, but we wait until all addresses are released before actually deleting.
Now #137 would move the responsibility onto the user. I am not sure if that is a good thing. The user won't be able to send a delete on the object and just wait until it is deleted. The user will instead have to retry deleting it as long as addresses are allocated. But then the behavior is a bit more straightforward...
Hi @furkatgofurov7, got your point. When the user gets the error message "IPPool is in use so it cannot be deleted", would they release the used IPs in this pool first and then delete the pool again?
"IPPool is in used so can not be deleted", would he/she release the used IPs in this pool first then delete this pool again?
Yup, that is how it is now when the delete is issued on the object.
But we are also fine with having combined handling as @schrej suggested.
Thanks @furkatgofurov7. Could you help move #137 forward? It's ready for review and CI has passed.
/triage accepted
/triage accepted
@jessehu: This issue is currently awaiting triage.
If Metal3.io contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/triage accepted
Prow is a bit slow with requests but should eventually be able to label this. Otherwise I will do it manually.
/triage accepted
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
What is the status of this, @furkatgofurov7?
#137 is referenced as a fix for this, but it seems to have some issues passing CI?
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle stale
This is something that would be welcomed, but it looks like progress has stalled.
/lifecycle frozen