minio/directpv

Cannot delete directpvvolumes, agent node locked in unusable state

hugeblank opened this issue · 5 comments

Describe the bug
I have two nodes that are by all accounts identical in the eyes of directpv:
[image]

I've been trying to figure out how to get a minio tenant installed onto the cluster with 2 servers and 1 volume each, allocating 700GiB from each of the drives (1400GiB total). Something like this:

kubectl minio tenant create minio-tenant-main --capacity=1400Gi --servers 2 --volumes 1 \
 --namespace minio-tenant-main --storage-class directpv-min-io --enable-host-sharing --disable-tls

Attempting this failed twice, because minio expects the number of volumes to be a multiple of the number of servers(?); I still haven't figured out why. I then tried 1 server with 2 volumes at 1400GiB total, well over the amount of space provided by the single drive on that node.

kubectl minio tenant create minio-tenant-main --capacity=1400Gi --servers 1 --volumes 2 \
--namespace minio-tenant-main --storage-class directpv-min-io --enable-host-sharing --disable-tls # OOPS!

In retrospect, a fundamental misunderstanding of how directpv works led me to expect that it would allocate that second volume on the other server. Regardless, I figured out that that was also not going to work, and finally settled on the setup that minio seemingly forces on me: 2 servers with 2 volumes each (1400GiB total, 350GiB per volume).

kubectl minio tenant create minio-tenant-main --capacity=1400Gi --servers 2 --volumes 4 \
--namespace minio-tenant-main --storage-class directpv-min-io --enable-host-sharing --disable-tls 
# Not what I wanted, oh well.
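The arithmetic behind these attempts can be sketched with a small shell helper. This is only an illustration: `tenant_layout` is a hypothetical function, not part of kubectl-minio, and it assumes that `--volumes` and `--capacity` are totals for the whole tenant and that the volume count must divide evenly across servers, as the failures above suggest.

```shell
# Hypothetical helper (not part of kubectl-minio): sanity-check a tenant layout.
# Assumption: --volumes and --capacity are totals for the whole tenant.
# Usage: tenant_layout SERVERS VOLUMES CAPACITY_GI
tenant_layout() {
  servers=$1
  volumes=$2
  capacity_gi=$3
  # Reject layouts where the volumes cannot be split evenly across servers.
  if [ "$servers" -le 0 ] || [ "$volumes" -le 0 ] || [ $(( volumes % servers )) -ne 0 ]; then
    echo "invalid: $volumes volume(s) cannot be split evenly across $servers server(s)" >&2
    return 1
  fi
  # Integer division: any remainder is truncated.
  echo "$(( volumes / servers )) volume(s) per server, $(( capacity_gi / volumes ))Gi per volume"
}
```

Under those assumptions, `tenant_layout 2 1 1400` is rejected, `tenant_layout 1 2 1400` asks for two 700Gi volumes on a single server, and `tenant_layout 2 4 1400` gives two 350Gi volumes per server.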

On the first server, the two volumes spun up perfectly fine. The two volumes on the second server, however, were stuck in a pending state. After a lot of searching around I found 2 directpvvolumes of size 700GiB that were also stuck pending, presumably from the botched 1 server, 2 volumes attempt:

❯ kubectl get directpvvolumes.directpv.min.io
NAME                                       AGE
pvc-0935538e-2c36-48dc-a267-91f173732bba   36m
pvc-2e58a8dd-b293-418b-9d9c-d8ca14fef3dc   62m <--
pvc-a762100e-03eb-48cf-ab1c-d13b9a1b9f39   36m
pvc-eefe318b-9907-4a9e-a2c4-81dfa64f4395   62m <--

❯ # Taking a closer look at one of these rogue directpvvolumes...
❯ kubectl describe directpvvolumes.directpv.min.io pvc-eefe318b-9907-4a9e-a2c4-81dfa64f4395
Name:         pvc-eefe318b-9907-4a9e-a2c4-81dfa64f4395
Namespace:    
Labels:       directpv.min.io/created-by=directpv-controller
              directpv.min.io/drive=ae69ecd0-5150-48b0-8572-b027cb14a4bb
              directpv.min.io/drive-name=nvme0n1p4
              directpv.min.io/node=kruger
              directpv.min.io/version=v1beta1
Annotations:  <none>
API Version:  directpv.min.io/v1beta1
Kind:         DirectPVVolume
Metadata:
  Creation Timestamp:  2024-03-06T08:47:16Z
  Finalizers:
    directpv.min.io/pv-protection
    directpv.min.io/purge-protection
  Generation:        1
  Resource Version:  1482768
  UID:               1a28dec5-19ac-41a2-b515-c933307f55df
Status:
  Available Capacity:   751619276800
  Data Path:            
  Fsuuid:               ae69ecd0-5150-48b0-8572-b027cb14a4bb
  Staging Target Path:  
  Status:               Pending
  Target Path:          
  Total Capacity:       751619276800
  Used Capacity:        0
Events:                 <none>
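The capacity fields above are in bytes. A quick conversion (a minimal sketch; the two function names are made up for illustration) shows that 751619276800 bytes is exactly 700GiB, which matches the per-volume size of the 1400GiB, 2 volume attempt:

```shell
# Illustrative helpers for converting between GiB and bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

bytes_to_gib() {
  # Integer division: sizes that are not whole GiB are truncated.
  echo $(( $1 / 1024 / 1024 / 1024 ))
}

# bytes_to_gib 751619276800   prints 700
# gib_to_bytes 700            prints 751619276800
```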

What I thought I could do was run an easy kubectl delete directpvvolumes.directpv.min.io ... on the two rogue volumes and get rid of them, but no. Not even --force has any effect; kubectl just hangs. Nothing references these directpvvolumes either; the PVCs are long gone. I could probably uninstall and reinstall both minio and directpv, but I think it is more productive to describe my situation in an issue, because this is either:
A) a really dumb situation to be in, and there's a workaround that I just don't know about.
OR
B) actually an issue and I should be reporting it.

To Reproduce
Steps to reproduce the behavior:

  1. Simultaneously create 2 PVCs that combined are larger than the size of a drive used by directpv.

Expected behavior
I should be able to delete the directpvvolumes, especially if they are stuck in a pending state.

Deployment information (please complete the following information):

  • DirectPV version: directpv version v4.0.10
  • Kubernetes Version: v1.28.6+k3s2
  • OS info: Arch
  • Kernel version: 6.6.16-1-lts

Actually, on further inspection, I'm not able to delete persistent volumes that were created successfully, nor their respective directpvvolumes. I'm very lost on what to do.

DirectPV volumes are protected by finalizers and cannot be deleted by force. Refer to this documentation: https://github.com/minio/directpv/blob/master/docs/volume-management.md#delete-volume

@balamurugana That helped get rid of the volumes that were stuck pending, but what about the ones that are flagged as ready and already have a PV associated with them? How do I remove them? I tried removing them with kubectl delete pv, but that hung in the same way the pending ones did before cleaning.

The shared doc has the information, i.e. deleting the PVC triggers DirectPV volume deletion.

Ah! I thought the PVCs weren't there, completely forgetting that PVCs are namespaced. Thank you for your help.
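Since PVCs are namespaced, a plain kubectl get pvc only shows the current namespace. A minimal sketch for finding every PVC that uses the directpv-min-io storage class across all namespaces follows; `pvcs_for_class` is a hypothetical helper, and the kubectl invocation is left as a comment because it needs a live cluster:

```shell
# Hypothetical helper: read "NAMESPACE NAME STORAGECLASS" rows on stdin and
# print "namespace/name" for every row whose storage class matches $1.
pvcs_for_class() {
  awk -v sc="$1" '$3 == sc { print $1 "/" $2 }'
}

# Usage against a live cluster (not run here):
#   kubectl get pvc --all-namespaces --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,SC:.spec.storageClassName \
#     | pvcs_for_class directpv-min-io
#
# Each PVC found can then be deleted in its own namespace, which per the
# linked doc triggers deletion of the DirectPV volume:
#   kubectl delete pvc <name> --namespace <namespace>
```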