kubernetes-csi/external-provisioner

Delete volume requests are sent twice with new provisioner version v5.0.1

shrutinipane opened this issue · 48 comments

What happened:
When we try to delete a volume using the new provisioner version v5.0.1, we notice two volume deletion requests in the logs. As a result, the PV remains in the Terminating state.
Following are the driver logs:

I0611 15:22:59.944368       1 controller.go:1258] "shouldDelete" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:22:59.944376       1 controller.go:1288] "shouldDelete is true" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:22:59.944381       1 controller.go:1132] "shouldDelete" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:22:59.944389       1 controller.go:1548] "Started" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:22:59.944401       1 controller.go:1312] volume pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52 does not need any deletion secrets
I0611 15:22:59.944454       1 connection.go:264] "GRPC call" method="/csi.v1.Controller/DeleteVolume" request="{\"volume_id\":\"0;0;15242247612712203312;385E110A:66420EDE;;;/mnt/local-fs/primary-fileset-local-fs-15242247612712203312/.volumes/pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52\"}"
I0611 15:23:04.259756       1 controller.go:1563] "Volume deleted" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:23:04.285156       1 controller.go:1607] "PersistentVolume deleted succeeded" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:23:04.285352       1 controller.go:1171] "handleProtectionFinalizer" PV="&PersistentVolume{ObjectMeta:{pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52    52c56854-0f57-4d2a-b242-ebb14dba958e 17734168 0 2024-06-11 15:13:31 +0000 UTC 2024-06-11 15:23:04 +0000 UTC 0xc00056c200 map[] map[pv.kubernetes.io/provisioned-by:spectrumscale.csi.ibm.com volume.kubernetes.io/provisioner-deletion-secret-name: volume.kubernetes.io/provisioner-deletion-secret-namespace:] [] [external-provisioner.volume.kubernetes.io/finalizer kubernetes.io/pv-protection] [{csi-provisioner Update v1 2024-06-11 15:13:31 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:pv.kubernetes.io/provisioned-by\":{},\"f:volume.kubernetes.io/provisioner-deletion-secret-name\":{},\"f:volume.kubernetes.io/provisioner-deletion-secret-namespace\":{}},\"f:finalizers\":{\".\":{},\"v:\\\"external-provisioner.volume.kubernetes.io/finalizer\\\"\":{}}},\"f:spec\":{\"f:accessModes\":{},\"f:capacity\":{\".\":{},\"f:storage\":{}},\"f:claimRef\":{\".\":{},\"f:apiVersion\":{},\"f:kind\":{},\"f:name\":{},\"f:namespace\":{},\"f:resourceVersion\":{},\"f:uid\":{}},\"f:csi\":{\".\":{},\"f:driver\":{},\"f:fsType\":{},\"f:volumeAttributes\":{\".\":{},\"f:csi.storage.k8s.io/pv/name\":{},\"f:csi.storage.k8s.io/pvc/name\":{},\"f:csi.storage.k8s.io/pvc/namespace\":{},\"f:storage.kubernetes.io/csiProvisionerIdentity\":{},\"f:volBackendFs\":{},\"f:volDirBasePath\":{}},\"f:volumeHandle\":{}},\"f:persistentVolumeReclaimPolicy\":{},\"f:storageClassName\":{},\"f:volumeMode\":{}}} } {kube-controller-manager Update v1 2024-06-11 15:22:59 +0000 UTC FieldsV1 {\"f:status\":{\"f:phase\":{}}} status}]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{1073741824 0} {<nil>} 1Gi BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:0;0;15242247612712203312;385E110A:66420EDE;;;/mnt/local-fs/primary-fileset-local-fs-15242247612712203312/.volumes/pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52,ReadOnly:false,FSType:gpfs,VolumeAttributes:map[string]string{csi.storage.k8s.io/pv/name: pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52,csi.storage.k8s.io/pvc/name: pvc-gfugq,csi.storage.k8s.io/pvc/namespace: test-driver-dynamic-pass-9-1718118761,storage.kubernetes.io/csiProvisionerIdentity: 1718094082648-2663-spectrumscale.csi.ibm.com,volBackendFs: local-fs,volDirBasePath: LW,},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteOnce],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-dynamic-pass-9-1718118761,Name:pvc-gfugq,UID:c1396a54-9c2b-473f-82f1-6af2253bdf52,APIVersion:v1,ResourceVersion:17729414,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-tvxzfffsfe,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Released,Message:,Reason:,LastPhaseTransitionTime:<nil>,},}"
I0611 15:23:04.285367       1 controller.go:1258] "shouldDelete" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:23:04.285377       1 controller.go:1288] "shouldDelete is true" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:23:04.285384       1 controller.go:1132] "shouldDelete" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:23:04.285391       1 controller.go:1548] "Started" PV="pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52"
I0611 15:23:04.285403       1 controller.go:1312] volume pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52 does not need any deletion secrets
I0611 15:23:04.285454       1 connection.go:264] "GRPC call" method="/csi.v1.Controller/DeleteVolume" request="{\"volume_id\":\"0;0;15242247612712203312;385E110A:66420EDE;;;/mnt/local-fs/primary-fileset-local-fs-15242247612712203312/.volumes/pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52\"}"
I0611 15:23:06.468538       1 connection.go:270] "GRPC response" response="{}" err="rpc error: code = Internal desc = unable to Delete Dir using FS [local-fs] Relative SymLink [primary-fileset-local-fs-15242247612712203312/.volumes/pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52]. Error [unable to delete dir primary-fileset-local-fs-15242247612712203312/.volumes/pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52:[EFSSG0264C The path /mnt/local-fs/primary-fileset-local-fs-15242247612712203312/.volumes/pvc-c1396a54-9c2b-473f-82f1-6af2253bdf52 does not exist.]]"

What you expected to happen:
PV should be deleted successfully.

How to reproduce it:

  1. Create a storageclass and PVC
  2. Delete PVC
  3. Notice the status of PV

Environment:

  • Driver version:
  • Kubernetes version (use kubectl version): v1.29.3
  • OS (e.g. from /etc/os-release): RHEL 8.8
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

We also see similar behavior with pre-provisioned disks: after the upgrade, the CSI driver started to delete them when the PVC and PV were deleted, which is NOT expected.

The pre-provisioned disks should remain untouched.

Here is another instance where a pre-provisioned PV was deleted according to the logs, but from the cluster it still shows as Terminating.
Below are the provisioner logs:

I0620 10:50:11.805561       1 controller.go:1132] "shouldDelete" PV="pv-fbeqn"
I0620 10:50:11.805575       1 controller.go:1548] "Started" PV="pv-fbeqn"
I0620 10:50:11.805665       1 connection.go:264] "GRPC call" method="/csi.v1.Controller/DeleteVolume" request="{\"volume_id\":\"453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-rgupwoxrg\"}"
I0620 10:50:12.894422       1 leaderelection.go:281] successfully renewed lease ibm-spectrum-scale-csi-driver/spectrumscale-csi-ibm-com
I0620 10:50:21.453285       1 connection.go:270] "GRPC response" response="{}" err=null
I0620 10:50:21.453335       1 controller.go:1563] "Volume deleted" PV="pv-fbeqn"
I0620 10:50:21.528417       1 controller.go:1607] "PersistentVolume deleted succeeded" PV="pv-fbeqn"
I0620 10:50:21.528709       1 controller.go:1171] "handleProtectionFinalizer" PV="&PersistentVolume{ObjectMeta:{pv-fbeqn    2d464b99-70d9-4001-b4bd-058d4a51dcb4 19599924 0 2024-06-20 10:47:32 +0000 UTC 2024-06-20 10:50:21 +0000 UTC 0xc0009efa80 map[author:kubetest] map[pv.kubernetes.io/bound-by-controller:yes] [] [external-provisioner.volume.kubernetes.io/finalizer] [{OpenAPI-Generator Update v1 2024-06-20 10:47:32 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:labels\":{\".\":{},\"f:author\":{}}},\"f:spec\":{\"f:accessModes\":{},\"f:capacity\":{\".\":{},\"f:storage\":{}},\"f:csi\":{\".\":{},\"f:driver\":{},\"f:volumeHandle\":{}},\"f:persistentVolumeReclaimPolicy\":{},\"f:storageClassName\":{},\"f:volumeMode\":{}}} } {csi-provisioner Update v1 2024-06-20 10:47:32 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:finalizers\":{\"v:\\\"external-provisioner.volume.kubernetes.io/finalizer\\\"\":{}}}} } {kube-controller-manager Update v1 2024-06-20 10:47:32 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:pv.kubernetes.io/bound-by-controller\":{}}},\"f:spec\":{\"f:claimRef\":{\".\":{},\"f:apiVersion\":{},\"f:kind\":{},\"f:name\":{},\"f:namespace\":{},\"f:resourceVersion\":{},\"f:uid\":{}}}} } {kube-controller-manager Update v1 2024-06-20 10:50:11 +0000 UTC FieldsV1 {\"f:status\":{\"f:message\":{},\"f:phase\":{}}} status}]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{8589934592 0} {<nil>}  BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-rgupwoxrg,ReadOnly:false,FSType:,VolumeAttributes:map[string]string{},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteMany],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-static-sc-14-1718880431,Name:pvc-kqdmmbm,UID:f3821261-6e7a-4c91-b3e5-59871d7dfc2a,APIVersion:v1,ResourceVersion:19599399,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-wdijv,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Failed,Message:error getting deleter volume plugin for volume \"pv-fbeqn\": no deletable volume plugin matched,Reason:,LastPhaseTransitionTime:2024-06-20 10:50:11 +0000 UTC,},}"

We also see similar behavior with pre-provisioned disks: after the upgrade, the CSI driver started to delete them when the PVC and PV were deleted, which is NOT expected.

The pre-provisioned disks should remain untouched.

This is as per the KEP design https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/2644-honor-pv-reclaim-policy#csi-driver-volumes: essentially, the new finalizer is added even to statically provisioned volumes if they are in a Bound state, and deleting the PVC will delete the underlying volume, honoring the specified reclaim policy.

@shrutinipane This is pretty odd, based on https://github.com/kubernetes-sigs/sig-storage-lib-external-provisioner/blob/master/controller/controller.go#L1545

Once the deleteVolumeOperation is triggered, it removes all the finalizers if necessary. Only after that would it log the message "PersistentVolume deleted succeeded".
Based on your log messages, it seems like syncVolume https://github.com/kubernetes-sigs/sig-storage-lib-external-provisioner/blob/master/controller/controller.go#L1114 is triggered after that, and oddly enough, the PV shows up with the finalizer still present; I don't even see a deletion timestamp set. I am not sure how this works. Any clues on this, @jsafrane @xing-yang @carlory?

We have the following setting in the provisioner clusterrole:

[root@csi-kube129-1-x-master ~]# oc get clusterrole ibm-spectrum-scale-csi-provisioner -oyaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: "2024-06-21T05:49:56Z"
  labels:
    app.kubernetes.io/instance: ibm-spectrum-scale-csi-operator
    app.kubernetes.io/managed-by: ibm-spectrum-scale-csi-operator
    app.kubernetes.io/name: ibm-spectrum-scale-csi-operator
    product: ibm-spectrum-scale-csi
    release: ibm-spectrum-scale-csi-operator
  name: ibm-spectrum-scale-csi-provisioner
  resourceVersion: "19799980"
  uid: 4fcd148f-ed58-40c1-99b2-26ffbc2c4d8d
rules:
- apiGroups:
  - ""
  resources:
  - persistentvolumes
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - patch

Still, while deleting a statically provisioned PV, it remains in the Terminating state. Following are the logs from the provisioner sidecar:

I0621 05:58:32.717669       1 controller.go:1258] "shouldDelete" PV="pv-xdikibutrt"
I0621 05:58:32.717690       1 controller.go:1279] "shouldDelete is false: PersistentVolumePhase is not Released" PV="pv-xdikibutrt"
I0621 05:58:32.763484       1 controller.go:1171] "handleProtectionFinalizer" PV="&PersistentVolume{ObjectMeta:{pv-xdikibutrt    b3c10e29-2b06-4c9d-a81f-fd31de8dc86f 19801737 0 2024-06-21 05:57:11 +0000 UTC 2024-06-21 05:58:32 +0000 UTC 0xc000a958f8 map[author:kubetest] map[pv.kubernetes.io/bound-by-controller:yes] [] [external-provisioner.volume.kubernetes.io/finalizer] [{OpenAPI-Generator Update v1 2024-06-21 05:57:11 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:labels\":{\".\":{},\"f:author\":{}}},\"f:spec\":{\"f:accessModes\":{},\"f:capacity\":{\".\":{},\"f:storage\":{}},\"f:csi\":{\".\":{},\"f:driver\":{},\"f:volumeHandle\":{}},\"f:persistentVolumeReclaimPolicy\":{},\"f:storageClassName\":{},\"f:volumeMode\":{}}} } {csi-provisioner Update v1 2024-06-21 05:57:11 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:finalizers\":{\"v:\\\"external-provisioner.volume.kubernetes.io/finalizer\\\"\":{}}}} } {kube-controller-manager Update v1 2024-06-21 05:57:11 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:pv.kubernetes.io/bound-by-controller\":{}}},\"f:spec\":{\"f:claimRef\":{\".\":{},\"f:apiVersion\":{},\"f:kind\":{},\"f:name\":{},\"f:namespace\":{},\"f:resourceVersion\":{},\"f:uid\":{}}}} } {kube-controller-manager Update v1 2024-06-21 05:58:17 +0000 UTC FieldsV1 {\"f:status\":{\"f:message\":{},\"f:phase\":{}}} status}]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{8589934592 0} {<nil>}  BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-jzysxcelad,ReadOnly:false,FSType:,VolumeAttributes:map[string]string{},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteMany],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-static-sc-14-1718949407,Name:pvc-pgweef,UID:e82645ec-6cca-462f-b57d-78b6685aabe1,APIVersion:v1,ResourceVersion:19801456,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-ufqxiok,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Failed,Message:error getting deleter volume plugin for volume \"pv-xdikibutrt\": no deletable volume plugin matched,Reason:,LastPhaseTransitionTime:2024-06-21 05:58:17 +0000 UTC,},}"
I0621 05:58:32.763535       1 controller.go:1258] "shouldDelete" PV="pv-xdikibutrt"
I0621 05:58:32.763552       1 controller.go:1279] "shouldDelete is false: PersistentVolumePhase is not Released" PV="pv-xdikibutrt"

@shrutinipane could you post the whole csi-provisioner logs? It would be helpful to get the kcm logs as well. Please post the PV and PVC YAML too. Also the steps: did you simply delete the PVC? Do the logs indicate something as a fallout after the PVC delete?

These are the CSI pods:

[root@csi-kube129-1-x-master ~]# oc get pods
NAME                                                 READY   STATUS    RESTARTS   AGE
ibm-spectrum-scale-csi-attacher-79654d86f-s24f9      1/1     Running   0          2m2s
ibm-spectrum-scale-csi-attacher-79654d86f-s9vlh      1/1     Running   0          2m2s
ibm-spectrum-scale-csi-cjbvw                         3/3     Running   0          2m2s
ibm-spectrum-scale-csi-operator-67d475b5cf-vkb5x     1/1     Running   0          3m1s
ibm-spectrum-scale-csi-provisioner-ff75d95c-6s74h    1/1     Running   0          2m2s
ibm-spectrum-scale-csi-resizer-59f6dbbdc9-lrcf5      1/1     Running   0          2m2s
ibm-spectrum-scale-csi-snapshotter-7447b7ffc-kfm9b   1/1     Running   0          2m2s
ibm-spectrum-scale-csi-xk25w                         3/3     Running   0          2m2s

The YAML representation of the Kubernetes objects created:
PVC:

[root@csi-kube129-1-x-master ~]# oc get pvc pvc-espxwjj -n test-driver-static-sc-14-1719212536 -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
  creationTimestamp: "2024-06-24T07:02:43Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    author: kubetest
  name: pvc-espxwjj
  namespace: test-driver-static-sc-14-1719212536
  resourceVersion: "20514987"
  uid: 542119c1-fd72-4436-ad13-b85f0a8e28aa
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 8Gi
  storageClassName: sc-vmxrn
  volumeMode: Filesystem
  volumeName: pv-ndpym
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 8Gi
  phase: Bound

PV:

[root@csi-kube129-1-x-master ~]# oc get pv pv-ndpym -oyaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2024-06-24T07:02:43Z"
  finalizers:
  - kubernetes.io/pv-protection
  - external-provisioner.volume.kubernetes.io/finalizer
  - external-attacher/spectrumscale-csi-ibm-com
  labels:
    author: kubetest
  name: pv-ndpym
  resourceVersion: "20515001"
  uid: fb8f2b8b-ee28-4e78-9ca2-9ef776dfda77
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 8Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: pvc-espxwjj
    namespace: test-driver-static-sc-14-1719212536
    resourceVersion: "20514947"
    uid: 542119c1-fd72-4436-ad13-b85f0a8e28aa
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: 453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-anlkxzae
  persistentVolumeReclaimPolicy: Delete
  storageClassName: sc-vmxrn
  volumeMode: Filesystem
status:
  lastPhaseTransitionTime: "2024-06-24T07:02:43Z"
  phase: Bound

StorageClass:

[root@csi-kube129-1-x-master ~]# oc get sc sc-vmxrn -oyaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2024-06-24T07:02:32Z"
  labels:
    author: kubetest
  name: sc-vmxrn
  resourceVersion: "20514916"
  uid: 788558db-cb7f-4b23-936b-a667a739c62f
parameters:
  clusterId: "453210684566980716"
  volBackendFs: fs1
provisioner: spectrumscale.csi.ibm.com
reclaimPolicy: Delete
volumeBindingMode: Immediate

Steps to reproduce the defect:

  1. Create StorageClass, PV and PVC (refer to above yaml files)
  2. Delete PVC
  3. Delete PV
  4. Describing PV after deletion:
[root@csi-kube129-1-x-master ~]# oc describe pv pv-ndpym
Name:            pv-ndpym
Labels:          author=kubetest
Annotations:     pv.kubernetes.io/bound-by-controller: yes
Finalizers:      [external-provisioner.volume.kubernetes.io/finalizer]
StorageClass:    sc-vmxrn
Status:          Terminating (lasts 43s)
Claim:           test-driver-static-sc-14-1719212536/pvc-espxwjj
Reclaim Policy:  Delete
Access Modes:    RWX
VolumeMode:      Filesystem
Capacity:        8Gi
Node Affinity:   <none>
Message:         error getting deleter volume plugin for volume "pv-ndpym": no deletable volume plugin matched
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            spectrumscale.csi.ibm.com
    FSType:
    VolumeHandle:      453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-anlkxzae
    ReadOnly:          false
    VolumeAttributes:  <none>
Events:
  Type     Reason              Age   From                                                                                                              Message
  ----     ------              ----  ----                                                                                                              -------
  Warning  VolumeFailedDelete  59s   spectrumscale.csi.ibm.com_ibm-spectrum-scale-csi-provisioner-ff75d95c-6s74h_81f3b680-fc3e-49c1-8420-139b29153aca  persistentvolume pv-ndpym is still attached to node csi-kube129-1-x-worker2.fyre.ibm.com
  Warning  VolumeFailedDelete  59s   persistentvolume-controller                                                                                       error getting deleter volume plugin for volume "pv-ndpym": no deletable volume plugin matched

Please find attached CSI provisioner logs, Kube Controller Manager logs and CSI Snap.

CSI Provisioner logs:

I0624 07:04:40.736810       1 controller.go:1132] "shouldDelete" PV="pv-ndpym"
I0624 07:04:40.736821       1 controller.go:1548] "Started" PV="pv-ndpym"
E0624 07:04:40.736905       1 controller.go:1558] "Volume deletion failed" err="persistentvolume pv-ndpym is still attached to node csi-kube129-1-x-worker2.fyre.ibm.com" PV="pv-ndpym"
I0624 07:04:40.736955       1 controller.go:1007] "Retrying syncing volume" key="pv-ndpym" failures=0
E0624 07:04:40.736984       1 controller.go:1025] error syncing volume "pv-ndpym": persistentvolume pv-ndpym is still attached to node csi-kube129-1-x-worker2.fyre.ibm.com
I0624 07:04:40.737127       1 event.go:389] "Event occurred" object="pv-ndpym" fieldPath="" kind="PersistentVolume" apiVersion="v1" type="Warning" reason="VolumeFailedDelete" message="persistentvolume pv-ndpym is still attached to node csi-kube129-1-x-worker2.fyre.ibm.com"
I0624 07:04:40.751966       1 controller.go:1171] "handleProtectionFinalizer" PV="&PersistentVolume{ObjectMeta:{pv-ndpym    fb8f2b8b-ee28-4e78-9ca2-9ef776dfda77 20515335 0 2024-06-24 07:02:43 +0000 UTC <nil> <nil> map[author:kubetest] map[pv.kubernetes.io/bound-by-controller:yes] [] [kubernetes.io/pv-protection external-provisioner.volume.kubernetes.io/finalizer external-attacher/spectrumscale-csi-ibm-com] [{OpenAPI-Generator Update v1 2024-06-24 07:02:43 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:labels\":{\".\":{},\"f:author\":{}}},\"f:spec\":{\"f:accessModes\":{},\"f:capacity\":{\".\":{},\"f:storage\":{}},\"f:csi\":{\".\":{},\"f:driver\":{},\"f:volumeHandle\":{}},\"f:persistentVolumeReclaimPolicy\":{},\"f:storageClassName\":{},\"f:volumeMode\":{}}} } {csi-provisioner Update v1 2024-06-24 07:02:43 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:finalizers\":{\"v:\\\"external-provisioner.volume.kubernetes.io/finalizer\\\"\":{}}}} } {kube-controller-manager Update v1 2024-06-24 07:02:43 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:pv.kubernetes.io/bound-by-controller\":{}}},\"f:spec\":{\"f:claimRef\":{\".\":{},\"f:apiVersion\":{},\"f:kind\":{},\"f:name\":{},\"f:namespace\":{},\"f:resourceVersion\":{},\"f:uid\":{}}}} } {csi-attacher Update v1 2024-06-24 07:02:58 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:finalizers\":{\"v:\\\"external-attacher/spectrumscale-csi-ibm-com\\\"\":{}}}} } {kube-controller-manager Update v1 2024-06-24 07:04:40 +0000 UTC FieldsV1 {\"f:status\":{\"f:message\":{},\"f:phase\":{}}} status}]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{8589934592 0} {<nil>}  BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-anlkxzae,ReadOnly:false,FSType:,VolumeAttributes:map[string]string{},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteMany],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-static-sc-14-1719212536,Name:pvc-espxwjj,UID:542119c1-fd72-4436-ad13-b85f0a8e28aa,APIVersion:v1,ResourceVersion:20514947,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-vmxrn,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Failed,Message:error getting deleter volume plugin for volume \"pv-ndpym\": no deletable volume plugin matched,Reason:,LastPhaseTransitionTime:2024-06-24 07:04:40 +0000 UTC,},}"

Kube Controller Manager logs:

E0624 07:04:40.751575       1 goroutinemap.go:150] Operation for "delete-pv-ndpym[fb8f2b8b-ee28-4e78-9ca2-9ef776dfda77]" failed. No retries permitted until 2024-06-24 07:04:41.251537223 +0000 UTC m=+339116.107773870 (durationBeforeRetry 500ms). Error: error getting deleter volume plugin for volume "pv-ndpym": no deletable volume plugin matched
I0624 07:04:40.751658       1 event.go:376] "Event occurred" object="pv-ndpym" fieldPath="" kind="PersistentVolume" apiVersion="v1" type="Warning" reason="VolumeFailedDelete" message="error getting deleter volume plugin for volume \"pv-ndpym\": no deletable volume plugin matched"

kube-controller-manager_logs.txt
csi_provisioner_logs.txt
csi_snap_for_provisioner.tar.gz

@deepakkinni do you need any more information on the above issue?

@deepakkinni @Jainbrt @shrutinipane

A similar issue was reported to kubernetes/kubernetes, and fixed by this PR in 1.30. Because this feature is alpha, it is not backported to v1.29. Only v1.30 and later versions are fixed.

@carlory thanks for your response, but as this feature is not enabled, we are hitting this in the common path.

 Error: error getting deleter volume plugin for volume "pv-ndpym": no deletable volume plugin matched

The root cause is that the PV controller marks the PV as Failed, so the external provisioner cannot handle the PV. The previous solution is gated behind HonorPVReclaimPolicy, so we still hit the issue in the common path.

	if utilfeature.DefaultFeatureGate.Enabled(features.HonorPVReclaimPolicy) {
		if metav1.HasAnnotation(volume.ObjectMeta, storagehelpers.AnnMigratedTo) {
			// CSI migration scenario - do not depend on in-tree plugin
			return nil, nil
		}

		if volume.Spec.CSI != nil {
			// CSI volume source scenario - external provisioner is requested
			return nil, nil
		}
	}

Should we remove the feature-gate check in k/k and then backport the fix to previous releases?
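
For illustration, removing the gate would roughly leave just the inner checks from the snippet above. This is only a sketch under that assumption, not the merged k/k change:

	if metav1.HasAnnotation(volume.ObjectMeta, storagehelpers.AnnMigratedTo) {
		// CSI migration scenario - do not depend on in-tree plugin
		return nil, nil
	}

	if volume.Spec.CSI != nil {
		// CSI volume source scenario - external provisioner is requested
		return nil, nil
	}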

cc @xing-yang @deepakkinni

@carlory I think we'd need to do this. In theory, the fix should have been applicable regardless of the feature HonorPVReclaimPolicy. The fix without the feature gate will ensure there are no Failed volume states for:

  1. Statically provisioned csi volumes
  2. Migrated volumes

But how did this work in the past? With an older kcm and an older external-provisioner, would deleting a statically provisioned volume (bound PV-PVC, delete PVC) leave the PV in a Failed state? @shrutinipane would it be possible to repeat this using a <5.0.1 provisioner? What was the state of the PV when the PVC was deleted? Can you post the kcm logs as well? Please post the PV and PVC YAMLs.

With the older provisioner version, the PV is in the Released state and then gets deleted.
I have created PV pv-jsffwiqcc and PVC pvc-fxmzvrucf.

CSI Provisioner Logs:

I0626 13:00:14.507818       1 controller.go:1152] handleProtectionFinalizer Volume : &PersistentVolume{ObjectMeta:{pv-jsffwiqcc    7b8860d9-3424-400b-9158-099677c83dfb 21049328 0 2024-06-26 12:59:23 +0000 UTC <nil> <nil> map[author:kubetest] map[pv.kubernetes.io/bound-by-controller:yes] [] [kubernetes.io/pv-protection external-attacher/spectrumscale-csi-ibm-com] [{OpenAPI-Generator Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:labels":{".":{},"f:author":{}}},"f:spec":{"f:accessModes":{},"f:capacity":{".":{},"f:storage":{}},"f:csi":{".":{},"f:driver":{},"f:volumeHandle":{}},"f:persistentVolumeReclaimPolicy":{},"f:storageClassName":{},"f:volumeMode":{}}} } {kube-controller-manager Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:pv.kubernetes.io/bound-by-controller":{}}},"f:spec":{"f:claimRef":{".":{},"f:apiVersion":{},"f:kind":{},"f:name":{},"f:namespace":{},"f:resourceVersion":{},"f:uid":{}}}} } {csi-attacher Update v1 2024-06-26 12:59:28 +0000 UTC FieldsV1 {"f:metadata":{"f:finalizers":{"v:\"external-attacher/spectrumscale-csi-ibm-com\"":{}}}} } {kube-controller-manager Update v1 2024-06-26 13:00:14 +0000 UTC FieldsV1 {"f:status":{"f:phase":{}}} status}]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{8589934592 0} {<nil>}  BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-cmsbmro,ReadOnly:false,FSType:,VolumeAttributes:map[string]string{},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteMany],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-static-sc-14-1719406744,Name:pvc-fxmzvrucf,UID:b478df92-ef7a-4725-8a89-bd98e6797e06,APIVersion:v1,ResourceVersion:21049139,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-ztjnmf,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Released,Message:,Reason:,LastPhaseTransitionTime:2024-06-26 13:00:14 +0000 UTC,},}
I0626 13:00:14.508028       1 controller.go:1239] shouldDelete volume "pv-jsffwiqcc"
I0626 13:00:14.508044       1 controller.go:1269] shouldDelete volume "pv-jsffwiqcc" is true
I0626 13:00:14.508050       1 controller.go:1113] shouldDelete Volume: "pv-jsffwiqcc"
I0626 13:00:14.508060       1 controller.go:1509] delete "pv-jsffwiqcc": started
I0626 13:00:14.508113       1 connection.go:244] GRPC call: /csi.v1.Controller/DeleteVolume
I0626 13:00:14.508121       1 connection.go:245] GRPC request: {"volume_id":"453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-cmsbmro"}
I0626 13:00:30.325843       1 connection.go:251] GRPC response: {}
I0626 13:00:30.325871       1 connection.go:252] GRPC error: <nil>
I0626 13:00:30.325891       1 controller.go:1524] delete "pv-jsffwiqcc": volume deleted
I0626 13:00:30.331759       1 controller.go:1530] delete "pv-jsffwiqcc": failed to delete persistentvolume: persistentvolumes "pv-jsffwiqcc" not found
W0626 13:00:30.331811       1 controller.go:989] Retrying syncing volume "pv-jsffwiqcc", failure 0
E0626 13:00:30.331833       1 controller.go:1007] error syncing volume "pv-jsffwiqcc": persistentvolumes "pv-jsffwiqcc" not found

Kube Controller Manager logs:

626 13:00:14.531991       1 event.go:376] "Event occurred" object="pv-jsffwiqcc" fieldPath="" kind="PersistentVolume" apiVersion="v1" type="Warning" reason="VolumeFailedDelete" message="error getting deleter volume plugin for volume \"pv-jsffwiqcc\": no deletable volume plugin matched"
I0626 13:00:19.447842       1 event.go:376] "Event occurred" object="test-driver-static-sc-14-1719406744/pvc-zogxg" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="VolumeMismatch" message="Cannot bind to requested volume \"pv-zrtunanet\": incompatible accessMode"
I0626 13:00:24.495306       1 event.go:376] "Event occurred" object="test-driver-static-sc-14-1719406744/pvc-wntncvm" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="VolumeMismatch" message="Cannot bind to requested volume \"pv-ulnhx\": incompatible accessMode"
I0626 13:00:24.563981       1 event.go:376] "Event occurred" object="test-driver-static-sc-14-1719210522/pvc-wtmjwf" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="VolumeMismatch" message="Cannot bind to requested volume \"pv-wmqrgza\": incompatible accessMode"
I0626 13:00:24.564694       1 event.go:376] "Event occurred" object="test-driver-static-sc-14-1719210522/pvc-yqmro" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="VolumeMismatch" message="Cannot bind to requested volume \"pv-dcqlemx\": incompatible accessMode"
I0626 13:00:24.564743       1 event.go:376] "Event occurred" object="test-driver-static-sc-14-1719225244/pvc-sxkot" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="VolumeMismatch" message="Cannot bind to requested volume \"pv-wbgeoe\": incompatible accessMode"
I0626 13:00:24.564758       1 event.go:376] "Event occurred" object="test-driver-static-sc-14-1719225244/pvc-wcjqitliw" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="VolumeMismatch" message="Cannot bind to requested volume \"pv-pfwhq\": incompatible accessMode"
I0626 13:00:24.564771       1 event.go:376] "Event occurred" object="test-driver-static-sc-14-1719209606/pvc-dorgvat" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="VolumeMismatch" message="Cannot bind to requested volume \"pv-hmjpcdn\": incompatible accessMode"
I0626 13:00:24.564782       1 event.go:376] "Event occurred" object="test-driver-static-sc-14-1719209606/pvc-lgukdapx" fieldPath="" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="VolumeMismatch" message="Cannot bind to requested volume \"pv-houmrovm\": incompatible accessMode"
E0626 13:00:24.581067       1 goroutinemap.go:150] Operation for "delete-pv-jsffwiqcc[7b8860d9-3424-400b-9158-099677c83dfb]" failed. No retries permitted until 2024-06-26 13:00:25.581046191 +0000 UTC m=+533260.437282843 (durationBeforeRetry 1s). Error: error getting deleter volume plugin for volume "pv-jsffwiqcc": no deletable volume plugin matched
E0626 13:00:29.663000       1 pv_protection_controller.go:113] PV pv-jsffwiqcc failed with : Operation cannot be fulfilled on persistentvolumes "pv-jsffwiqcc": the object has been modified; please apply your changes to the latest version and try again

Attaching log files:
provisioner_logs.txt
kcm_logs.txt

Thanks @shrutinipane
PV: pv-jsffwiqcc

From the external-provisioner's perspective, the PV moves from the Bound to the Released state.

0626 12:59:23.586577       1 controller.go:1152] handleProtectionFinalizer Volume : &PersistentVolume{ObjectMeta:{pv-jsffwiqcc    7b8860d9-3424-400b-9158-099677c83dfb 21049140 0 2024-06-26 12:59:23 +0000 UTC <nil> <nil> map[author:kubetest] map[pv.kubernetes.io/bound-by-controller:yes] [] [kubernetes.io/pv-protection] [{OpenAPI-Generator Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:labels":{".":{},"f:author":{}}},"f:spec":{"f:accessModes":{},"f:capacity":{".":{},"f:storage":{}},"f:csi":{".":{},"f:driver":{},"f:volumeHandle":{}},"f:persistentVolumeReclaimPolicy":{},"f:storageClassName":{},"f:volumeMode":{}}} } {kube-controller-manager Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:pv.kubernetes.io/bound-by-controller":{}}},"f:spec":{"f:claimRef":{".":{},"f:apiVersion":{},"f:kind":{},"f:name":{},"f:namespace":{},"f:resourceVersion":{},"f:uid":{}}}} } {kube-controller-manager Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:status":{"f:phase":{}}} status}]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{8589934592 0} {<nil>}  BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-cmsbmro,ReadOnly:false,FSType:,VolumeAttributes:map[string]string{},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteMany],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-static-sc-14-1719406744,Name:pvc-fxmzvrucf,UID:b478df92-ef7a-4725-8a89-bd98e6797e06,APIVersion:v1,ResourceVersion:21049139,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-ztjnmf,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Available,Message:,Reason:,LastPhaseTransitionTime:2024-06-26 12:59:23 +0000 UTC,},}
I0626 12:59:23.586760       1 controller.go:1239] shouldDelete volume "pv-jsffwiqcc"
I0626 12:59:23.586772       1 controller.go:1260] shouldDelete volume "pv-jsffwiqcc" is false: PersistentVolumePhase is not Released
I0626 12:59:23.622008       1 controller.go:1152] handleProtectionFinalizer Volume : &PersistentVolume{ObjectMeta:{pv-jsffwiqcc    7b8860d9-3424-400b-9158-099677c83dfb 21049141 0 2024-06-26 12:59:23 +0000 UTC <nil> <nil> map[author:kubetest] map[pv.kubernetes.io/bound-by-controller:yes] [] [kubernetes.io/pv-protection] [{OpenAPI-Generator Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:labels":{".":{},"f:author":{}}},"f:spec":{"f:accessModes":{},"f:capacity":{".":{},"f:storage":{}},"f:csi":{".":{},"f:driver":{},"f:volumeHandle":{}},"f:persistentVolumeReclaimPolicy":{},"f:storageClassName":{},"f:volumeMode":{}}} } {kube-controller-manager Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:pv.kubernetes.io/bound-by-controller":{}}},"f:spec":{"f:claimRef":{".":{},"f:apiVersion":{},"f:kind":{},"f:name":{},"f:namespace":{},"f:resourceVersion":{},"f:uid":{}}}} } {kube-controller-manager Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:status":{"f:phase":{}}} status}]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{8589934592 0} {<nil>}  BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-cmsbmro,ReadOnly:false,FSType:,VolumeAttributes:map[string]string{},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteMany],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-static-sc-14-1719406744,Name:pvc-fxmzvrucf,UID:b478df92-ef7a-4725-8a89-bd98e6797e06,APIVersion:v1,ResourceVersion:21049139,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-ztjnmf,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Bound,Message:,Reason:,LastPhaseTransitionTime:2024-06-26 12:59:23 +0000 UTC,},}
I0626 12:59:23.622276       1 controller.go:1239] shouldDelete volume "pv-jsffwiqcc"
I0626 12:59:23.622368       1 controller.go:1260] shouldDelete volume "pv-jsffwiqcc" is false: PersistentVolumePhase is not Released
I0626 12:59:28.811648       1 controller.go:1152] handleProtectionFinalizer Volume : &PersistentVolume{ObjectMeta:{pv-jsffwiqcc    7b8860d9-3424-400b-9158-099677c83dfb 21049164 0 2024-06-26 12:59:23 +0000 UTC <nil> <nil> map[author:kubetest] map[pv.kubernetes.io/bound-by-controller:yes] [] [kubernetes.io/pv-protection external-attacher/spectrumscale-csi-ibm-com] [{OpenAPI-Generator Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:labels":{".":{},"f:author":{}}},"f:spec":{"f:accessModes":{},"f:capacity":{".":{},"f:storage":{}},"f:csi":{".":{},"f:driver":{},"f:volumeHandle":{}},"f:persistentVolumeReclaimPolicy":{},"f:storageClassName":{},"f:volumeMode":{}}} } {kube-controller-manager Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:pv.kubernetes.io/bound-by-controller":{}}},"f:spec":{"f:claimRef":{".":{},"f:apiVersion":{},"f:kind":{},"f:name":{},"f:namespace":{},"f:resourceVersion":{},"f:uid":{}}}} } {kube-controller-manager Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:status":{"f:phase":{}}} status} {csi-attacher Update v1 2024-06-26 12:59:28 +0000 UTC FieldsV1 {"f:metadata":{"f:finalizers":{"v:\"external-attacher/spectrumscale-csi-ibm-com\"":{}}}} }]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{8589934592 0} {<nil>}  BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-cmsbmro,ReadOnly:false,FSType:,VolumeAttributes:map[string]string{},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteMany],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-static-sc-14-1719406744,Name:pvc-fxmzvrucf,UID:b478df92-ef7a-4725-8a89-bd98e6797e06,APIVersion:v1,ResourceVersion:21049139,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-ztjnmf,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Bound,Message:,Reason:,LastPhaseTransitionTime:2024-06-26 12:59:23 +0000 UTC,},}
I0626 12:59:28.811887       1 controller.go:1239] shouldDelete volume "pv-jsffwiqcc"
I0626 12:59:28.811906       1 controller.go:1260] shouldDelete volume "pv-jsffwiqcc" is false: PersistentVolumePhase is not Released
.
.
.
I0626 13:00:14.507818       1 controller.go:1152] handleProtectionFinalizer Volume : &PersistentVolume{ObjectMeta:{pv-jsffwiqcc    7b8860d9-3424-400b-9158-099677c83dfb 21049328 0 2024-06-26 12:59:23 +0000 UTC <nil> <nil> map[author:kubetest] map[pv.kubernetes.io/bound-by-controller:yes] [] [kubernetes.io/pv-protection external-attacher/spectrumscale-csi-ibm-com] [{OpenAPI-Generator Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:labels":{".":{},"f:author":{}}},"f:spec":{"f:accessModes":{},"f:capacity":{".":{},"f:storage":{}},"f:csi":{".":{},"f:driver":{},"f:volumeHandle":{}},"f:persistentVolumeReclaimPolicy":{},"f:storageClassName":{},"f:volumeMode":{}}} } {kube-controller-manager Update v1 2024-06-26 12:59:23 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:pv.kubernetes.io/bound-by-controller":{}}},"f:spec":{"f:claimRef":{".":{},"f:apiVersion":{},"f:kind":{},"f:name":{},"f:namespace":{},"f:resourceVersion":{},"f:uid":{}}}} } {csi-attacher Update v1 2024-06-26 12:59:28 +0000 UTC FieldsV1 {"f:metadata":{"f:finalizers":{"v:\"external-attacher/spectrumscale-csi-ibm-com\"":{}}}} } {kube-controller-manager Update v1 2024-06-26 13:00:14 +0000 UTC FieldsV1 {"f:status":{"f:phase":{}}} status}]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{8589934592 0} {<nil>}  BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-cmsbmro,ReadOnly:false,FSType:,VolumeAttributes:map[string]string{},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteMany],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-static-sc-14-1719406744,Name:pvc-fxmzvrucf,UID:b478df92-ef7a-4725-8a89-bd98e6797e06,APIVersion:v1,ResourceVersion:21049139,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-ztjnmf,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Released,Message:,Reason:,LastPhaseTransitionTime:2024-06-26 13:00:14 +0000 UTC,},}
I0626 13:00:14.508028       1 controller.go:1239] shouldDelete volume "pv-jsffwiqcc"
I0626 13:00:14.508044       1 controller.go:1269] shouldDelete volume "pv-jsffwiqcc" is true
I0626 13:00:14.508050       1 controller.go:1113] shouldDelete Volume: "pv-jsffwiqcc"
I0626 13:00:14.508060       1 controller.go:1509] delete "pv-jsffwiqcc": started
I0626 13:00:14.508113       1 connection.go:244] GRPC call: /csi.v1.Controller/DeleteVolume
I0626 13:00:14.508121       1 connection.go:245] GRPC request: {"volume_id":"453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-cmsbmro"}
I0626 13:00:30.325843       1 connection.go:251] GRPC response: {}
I0626 13:00:30.325871       1 connection.go:252] GRPC error: <nil>
I0626 13:00:30.325891       1 controller.go:1524] delete "pv-jsffwiqcc": volume deleted
I0626 13:00:30.331759       1 controller.go:1530] delete "pv-jsffwiqcc": failed to delete persistentvolume: persistentvolumes "pv-jsffwiqcc" not found

But in kcm, there is no indication that it moved into the Released state; on the other hand, the indicated error should have moved the PV to the Failed state:

E0626 13:00:14.531745       1 goroutinemap.go:150] Operation for "delete-pv-jsffwiqcc[7b8860d9-3424-400b-9158-099677c83dfb]" failed. No retries permitted until 2024-06-26 13:00:15.031690313 +0000 UTC m=+533249.887927013 (durationBeforeRetry 500ms). Error: error getting deleter volume plugin for volume "pv-jsffwiqcc": no deletable volume plugin matched
I0626 13:00:14.531991       1 event.go:376] "Event occurred" object="pv-jsffwiqcc" fieldPath="" kind="PersistentVolume" apiVersion="v1" type="Warning" reason="VolumeFailedDelete" message="error getting deleter volume plugin for volume \"pv-jsffwiqcc\": no deletable volume plugin matched"

Based on https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/persistentvolume/pv_controller.go#L1318-L1334

	pluginName, deleted, err := ctrl.doDeleteVolume(ctx, volume)
	if err != nil {
		// Delete failed, update the volume and emit an event.
		logger.V(3).Info("Deletion of volume failed", "volumeName", volume.Name, "err", err)
		if volerr.IsDeletedVolumeInUse(err) {
			// The plugin needs more time, don't mark the volume as Failed
			// and send Normal event only
			ctrl.eventRecorder.Event(volume, v1.EventTypeNormal, events.VolumeDelete, err.Error())
		} else {
			// The plugin failed, mark the volume as Failed and send Warning
			// event
			if _, err := ctrl.updateVolumePhaseWithEvent(ctx, volume, v1.VolumeFailed, v1.EventTypeWarning, events.VolumeFailedDelete, err.Error()); err != nil {
				logger.V(4).Info("DeleteVolumeOperation: failed to mark volume as failed", "volumeName", volume.Name, "err", err)
				// Save failed, retry on the next deletion attempt
				return pluginName, err
			}
		}

This was supposed to move the PV into the VolumeFailed state.

https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/plugins.go#L770-L779

// FindDeletablePluginBySpec fetches a persistent volume plugin by spec.  If
// no plugin is found, returns error.
func (pm *VolumePluginMgr) FindDeletablePluginBySpec(spec *Spec) (DeletableVolumePlugin, error) {
	volumePlugin, err := pm.FindPluginBySpec(spec)
	if err != nil {
		return nil, err
	}
	if deletableVolumePlugin, ok := volumePlugin.(DeletableVolumePlugin); ok {
		return deletableVolumePlugin, nil
	}
	return nil, fmt.Errorf("no deletable volume plugin matched")
}

Will follow up on this soon.

The KCM updates the PV's phase to Released and then may move it to Failed; please refer to kubernetes/kubernetes#122030 (comment) for more details.

We can observe this behavior in the KCM and external-provisioner log snippets above.

Thanks @carlory for the help here and for raising the fix. I was wondering whether we would get a new provisioner sidecar image, or whether this would be a dependency on newer k8s versions?

@Jainbrt After the PR is merged, it will be backported to 1.30, 1.29, 1.28, and 1.27. The latest provisioner image is enough, so we don't need to release a newer one.

Thanks @carlory for the help in understanding. Are k8s versions < 1.27 also impacted?

Yes, please refer to End of Life

Hi @carlory, I don't think that is the root cause. It will definitely resolve the issue (since we avoid moving into the Failed state entirely), but there seems to be another issue.

To summarize the observation based on the logs in #1235 (comment): the volume is moved into a Released state, followed by a Failed state. This was in fact observed in the csi-provisioner logs:

I0624 07:04:40.736752       1 controller.go:1171] "handleProtectionFinalizer" PV="&PersistentVolume{ObjectMeta:{pv-ndpym    fb8f2b8b-ee28-4e78-9ca2-9ef776dfda77 20515333 0 2024-06-24 07:02:43 +0000 UTC <nil> <nil> map[author:kubetest] map[pv.kubernetes.io/bound-by-controller:yes] [] [kubernetes.io/pv-protection external-provisioner.volume.kubernetes.io/finalizer external-attacher/spectrumscale-csi-ibm-com] [{OpenAPI-Generator Update v1 2024-06-24 07:02:43 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:labels\":{\".\":{},\"f:author\":{}}},\"f:spec\":{\"f:accessModes\":{},\"f:capacity\":{\".\":{},\"f:storage\":{}},\"f:csi\":{\".\":{},\"f:driver\":{},\"f:volumeHandle\":{}},\"f:persistentVolumeReclaimPolicy\":{},\"f:storageClassName\":{},\"f:volumeMode\":{}}} } {csi-provisioner Update v1 2024-06-24 07:02:43 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:finalizers\":{\"v:\\\"external-provisioner.volume.kubernetes.io/finalizer\\\"\":{}}}} } {kube-controller-manager Update v1 2024-06-24 07:02:43 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:pv.kubernetes.io/bound-by-controller\":{}}},\"f:spec\":{\"f:claimRef\":{\".\":{},\"f:apiVersion\":{},\"f:kind\":{},\"f:name\":{},\"f:namespace\":{},\"f:resourceVersion\":{},\"f:uid\":{}}}} } {csi-attacher Update v1 2024-06-24 07:02:58 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:finalizers\":{\"v:\\\"external-attacher/spectrumscale-csi-ibm-com\\\"\":{}}}} } {kube-controller-manager Update v1 2024-06-24 07:04:40 +0000 UTC FieldsV1 {\"f:status\":{\"f:phase\":{}}} status}]},Spec:PersistentVolumeSpec{Capacity:ResourceList{storage: {{8589934592 0} {<nil>}  BinarySI},},PersistentVolumeSource:PersistentVolumeSource{GCEPersistentDisk:nil,AWSElasticBlockStore:nil,HostPath:nil,Glusterfs:nil,NFS:nil,RBD:nil,ISCSI:nil,Cinder:nil,CephFS:nil,FC:nil,Flocker:nil,FlexVolume:nil,AzureFile:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Local:nil,StorageOS:nil,CSI:&CSIPersistentVolumeSource{Driver:spectrumscale.csi.ibm.com,VolumeHandle:453210684566980716;34690B0A:6606C9E1;path=/ibm/fs1/dir-anlkxzae,ReadOnly:false,FSType:,VolumeAttributes:map[string]string{},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,ControllerExpandSecretRef:nil,NodeExpandSecretRef:nil,},},AccessModes:[ReadWriteMany],ClaimRef:&ObjectReference{Kind:PersistentVolumeClaim,Namespace:test-driver-static-sc-14-1719212536,Name:pvc-espxwjj,UID:542119c1-fd72-4436-ad13-b85f0a8e28aa,APIVersion:v1,ResourceVersion:20514947,FieldPath:,},PersistentVolumeReclaimPolicy:Delete,StorageClassName:sc-vmxrn,MountOptions:[],VolumeMode:*Filesystem,NodeAffinity:nil,VolumeAttributesClassName:nil,},Status:PersistentVolumeStatus{Phase:Released,Message:,Reason:,LastPhaseTransitionTime:2024-06-24 07:04:40 +0000 UTC,},}"
I0624 07:04:40.736791       1 controller.go:1258] "shouldDelete" PV="pv-ndpym"
I0624 07:04:40.736802       1 controller.go:1288] "shouldDelete is true" PV="pv-ndpym"
I0624 07:04:40.736810       1 controller.go:1132] "shouldDelete" PV="pv-ndpym"
I0624 07:04:40.736821       1 controller.go:1548] "Started" PV="pv-ndpym"
E0624 07:04:40.736905       1 controller.go:1558] "Volume deletion failed" err="persistentvolume pv-ndpym is still attached to node csi-kube129-1-x-worker2.fyre.ibm.com" PV="pv-ndpym"
I0624 07:04:40.736955       1 controller.go:1007] "Retrying syncing volume" key="pv-ndpym" failures=0
E0624 07:04:40.736984       1 controller.go:1025] error syncing volume "pv-ndpym": persistentvolume pv-ndpym is still attached to node csi-kube129-1-x-worker2.fyre.ibm.com
I0624 07:04:40.737127       1 event.go:389] "Event occurred" object="pv-ndpym" fieldPath="" kind="PersistentVolume" apiVersion="v1" type="Warning" reason="VolumeFailedDelete" message="persistentvolume pv-ndpym is still attached to node csi-kube129-1-x-worker2.fyre.ibm.com"

As seen in the above logs, the CSI driver threw the error persistentvolume pv-ndpym is still attached to node csi-kube129-1-x-worker2.fyre.ibm.com. This triggered a retry; in the meanwhile, kcm moved the PV into a Failed state. As a result, csi-provisioner failed the shouldDelete checks and made no more progress.

Now, even if kcm doesn't move the PV into a Failed state, the driver may not necessarily delete the volume successfully, because of the above error.

In summary, the change kubernetes/kubernetes#125767 should go in, but that doesn't mean it will resolve the issue @shrutinipane reported. Perhaps the driver eventually detaches the volume from the node, and then maybe they won't see the issue.

@deepakkinni I have a different opinion on the error log. Let's look at the function signature:

func (p *csiProvisioner) canDeleteVolume(volume *v1.PersistentVolume) error

It uses the return value to indicate whether the volume can be deleted or not.

	for _, va := range vaList {
		if va.Spec.Source.PersistentVolumeName != nil && *va.Spec.Source.PersistentVolumeName == volume.Name {
			return fmt.Errorf("persistentvolume %s is still attached to node %s", volume.Name, va.Spec.NodeName)
		}
	}

The above code snippet checks whether the volume is still attached to a node.

If it is, it has to return an error (not a boolean value) to indicate that the volume cannot be deleted.

The detach operation is asynchronous, so the volume may still be attached to the node when canDeleteVolume is called. So the message's level should be Warning or Info if we use a boolean value to indicate the result.
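
For illustration, a hypothetical variant of the check could report "still attached" as a not-yet-deletable condition instead of an error. This is only a sketch of the idea being discussed, not the actual csi-provisioner code:

package sketch

import (
	v1 "k8s.io/api/core/v1"
	storagev1 "k8s.io/api/storage/v1"
)

// canDeleteVolume is a hypothetical variant for illustration only: it returns a
// bool so that "still attached" means "not deletable yet, requeue and retry"
// rather than surfacing as an error.
func canDeleteVolume(volume *v1.PersistentVolume, vaList []storagev1.VolumeAttachment) (bool, error) {
	for _, va := range vaList {
		if va.Spec.Source.PersistentVolumeName != nil && *va.Spec.Source.PersistentVolumeName == volume.Name {
			// Detach is asynchronous; being attached here is expected and transient,
			// so the caller can log at Info/Warning level and retry later.
			return false, nil
		}
	}
	return true, nil
}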

  1. A user deletes the pod and PVC.

  2. The attach/detach controller does the detach operation, the VolumeAttachment object is being deleted, and csi-attacher observes the event and calls the RPC to detach the volume.

  3. The PV binder controller observes that the pod and PVC are deleted, and then updates the PV's phase to Released. csi-provisioner watches the PV event and tries to delete the PV if the volume is not attached to a node.

  4. If csi-attacher takes a long time to detach the volume, csi-provisioner will have retried 15 times while the volume is still attached to a node, and the volume key is removed from csi-provisioner's volume queue. So the PV is stuck in the Terminating phase until the informer resync period (15m) is reached and the above step is retried (see the sketch after this list).
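
To illustrate step 4, below is a minimal sketch of the retry/give-up pattern described above, using client-go's rate-limited workqueue. The constant failedDeleteThreshold and the flow are simplified assumptions for illustration, not the actual sig-storage-lib-external-provisioner code:

package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

// failedDeleteThreshold mirrors the "retried 15 times" behavior described above
// (illustrative value, not taken from the library source).
const failedDeleteThreshold = 15

func main() {
	queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	defer queue.ShutDown()

	queue.Add("pv-jsffwiqcc")

	key, _ := queue.Get()
	defer queue.Done(key)

	// Pretend DeleteVolume failed because the volume is still attached to a node.
	err := fmt.Errorf("persistentvolume %v is still attached to a node", key)

	if err != nil {
		if queue.NumRequeues(key) < failedDeleteThreshold {
			// Retry with exponential backoff.
			queue.AddRateLimited(key)
		} else {
			// Give up: the key is forgotten, so the PV is only revisited when the
			// informer resync (default 15m) re-adds it to the queue.
			queue.Forget(key)
		}
	}
}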

Thanks for the clarification, @carlory. I was assuming the error was coming from the driver, but it seems like it's from the provisioner itself.

Thanks @carlory and @deepakkinni for helping on this issue. Do we have any outlook on when we will get the fix kubernetes/kubernetes#125767 in the next k8s PTF builds?

@carlory Regarding comment #1235 (comment), with the current implementation of provisioner v5.0.1, the PV stays in the Terminating state and doesn't get cleaned up after 15 minutes.

@shrutinipane Once the PV is updated with the Failed state by kcm, the csi-provisioner won't be able to handle the PV, so kubernetes/kubernetes#125767 will fix this.

Thanks @carlory for the help in understanding. Are k8s versions < 1.27 also impacted?

@deepakkinni @Jainbrt Kubernetes release 1.27 has reached End of Life, please see https://kubernetes.io/releases/patch-releases/#1-27, so 1.27 won't include this fix.

Sure @carlory. I hope this will be picked up for further 1.28, 1.29, and 1.30 k8s PTF releases.

Just FYI: if you are already hitting this issue, running kubectl patch pv <pv-name> -p '{"metadata":{"finalizers":null}}' will clean up the PV immediately.

Found it: the provisioner computes a wrong JSON patch to remove the finalizer.

removeFinalizer modifies its argument, which means newVolume.finalizers:

finalizers, modified := removeFinalizer(newVolume.ObjectMeta.Finalizers, finalizerPV)

But newVolume is used as the original PV to compute the JSON patch. With the finalizer removed from it, the resulting patch does not do anything useful:

if _, err = ctrl.patchPersistentVolumeWithFinalizers(ctx, newVolume, finalizers); err != nil {
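
As a standalone illustration of that aliasing pitfall (a simplified stand-in, not the actual lib-external-provisioner code), filtering a finalizer slice in place mutates the caller's copy as well, so a patch computed from the supposedly untouched original ends up empty:

package main

import "fmt"

// removeFinalizer filters in place and reuses the backing array of the input
// slice (illustrative stand-in for the behavior described above).
func removeFinalizer(finalizers []string, target string) ([]string, bool) {
	out := finalizers[:0] // shares the backing array with the caller's slice
	modified := false
	for _, f := range finalizers {
		if f == target {
			modified = true
			continue
		}
		out = append(out, f)
	}
	return out, modified
}

func main() {
	pvFinalizers := []string{
		"external-provisioner.volume.kubernetes.io/finalizer",
		"kubernetes.io/pv-protection",
	}

	updated, modified := removeFinalizer(pvFinalizers, "external-provisioner.volume.kubernetes.io/finalizer")

	fmt.Println(modified)                    // true
	fmt.Println(updated)                     // [kubernetes.io/pv-protection]
	fmt.Println(pvFinalizers[:len(updated)]) // also [kubernetes.io/pv-protection]: the "original" was mutated too

	// The usual fix is to copy the slice (or DeepCopy the whole PV) before mutating,
	// so the original state is still available when computing the JSON patch.
}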

/reopen

@carlory: Reopened this issue.

In response to this:

/reopen


Needs to bump dependencies @jsafrane

@shrutinipane @andyzhangx @Jainbrt

Before I tag new sig-storage-lib-external-provisioner + external-provisioner versions, I'd like to make sure the fix is complete and there are no loose ends. Can you please test #1245? If you trust me, I prepared an image from the PR at quay.io/jsafrane/external-provisioner:test1245 (amd64 only).

Thanks in advance for any feedback.

Sure @jsafrane, we can give this a try. Thanks for the help.

@jsafrane we do not have access to pull the above image; can you please help with the same?

@Jainbrt sorry, I forgot to make the repo public. Please try again.

Thanks @jsafrane yes, we are able to pull now.

@Jainbrt is my image working? How much did you test it?

@Jainbrt friendly reminder

Hey @jsafrane, we are still testing this, but we do see an issue with static provisioning.

I already released a fixed version of lib-external-provisioner and updated it here. I'm cherry-picking the fix and I plan to release v5.0.2 soon-ish.

v5.0.2 tagged, waiting for image builds

@jsafrane I noticed that this issue persists with the above image quay.io/jsafrane/external-provisioner:test1245
Please find the attached provisioner logs:
prov_logs.txt
I tested on k8s version: v1.29.3

Is there any specific k8s version this image would work on?

I just tested with 1.29.3 and csi-driver-hostpath with my provisioner image; the driver got just 1 DeleteVolume call.

prov_logs.txt shows 1 delete attempt:

I0806 14:01:16.090947 1 controller.go:1258] "shouldDelete" PV="pv-vyfakylg"
I0806 14:01:16.090972 1 controller.go:1288] "shouldDelete is true" PV="pv-vyfakylg"
I0806 14:01:16.090989 1 controller.go:1132] "shouldDelete" PV="pv-vyfakylg"
I0806 14:01:16.091011 1 controller.go:1548] "Started" PV="pv-vyfakylg"
E0806 14:01:16.091195 1 controller.go:1558] "Volume deletion failed" err="persistentvolume pv-vyfakylg is still attached to node csi-kube129-1-x-worker1.fyre.ibm.com" PV="pv-vyfakylg"
I0806 14:01:16.091263 1 controller.go:1007] "Retrying syncing volume" key="pv-vyfakylg" failures=0
E0806 14:01:16.091370 1 controller.go:1025] error syncing volume "pv-vyfakylg": persistentvolume pv-vyfakylg is still attached to node csi-kube129-1-x-worker1.fyre.ibm.com

But then I can see only:

I0806 14:01:16.120583 1 controller.go:1279] "shouldDelete is false: PersistentVolumePhase is not Released" PV="pv-vyfakylg"

(which is odd... how was it Released before and now it's not?)

@jsafrane we have observed this issue when we run the quick test using automation (where deletion of the PVC and PV is quick). Not sure if this is related.