PV stuck in released/terminating state even if CSI DeleteVolume action has been executed successfully
k2huang opened this issue · 7 comments
In our K8s cluster, to make sure a PV does not disappear before its volume has been successfully deleted via the CSI plugin, we enable the AddFinalizer option for ProvisionController.
However, we see many PV objects stuck in the terminating state even though the CSI DeleteVolume call has succeeded. The error log is as follows:
delete "pvc-9041b405-1a2d-42df-9f56-876e4f0217fd": failed to remove finalizer for persistentvolume: Operation cannot be fulfilled on persistentvolumes "pvc-9041b405-1a2d-42df-9f56-876e4f0217fd": the object has been modified; please apply your changes to the latest version and try again
After reading the source code, I found some hints:
ProvisionController deletes the PV object after successfully calling CSI DeleteVolume, and only then starts to remove the finalizer (external-provisioner.volume.kubernetes.io/finalizer).
But if removing the finalizer fails, as the log above shows, the PV stays stuck in the released/terminating state forever, because the PV's DeletionTimestamp is no longer nil and the check below returns false:
    if ctrl.kubeVersion.AtLeast(utilversion.MustParseSemantic("v1.9.0")) {
        if ctrl.addFinalizer && !ctrl.checkFinalizer(volume, finalizerPV) && volume.ObjectMeta.DeletionTimestamp != nil {
            return false
        } else if volume.ObjectMeta.DeletionTimestamp != nil {
            return false
        }
    }
From my POV, it would be better to delete the PV object only after the finalizer has been removed successfully.
/assign
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-contributor-experience at kubernetes/community.
/close
@fejta-bot: Closing this issue.