Cannot delete directpvvolumes, agent node locked in unusable state
hugeblank opened this issue · 5 comments
Describe the bug
I have two nodes that are by all accounts identical in the eyes of directpv:
I've been trying to figure out how to get a minio tenant installed onto the cluster with 2 servers and 1 volume each, allocating 700GiB from each of the drives (1400GiB total). Something like this:
kubectl minio tenant create minio-tenant-main --capacity=1400Gi --servers 2 --volumes 1 \
--namespace minio-tenant-main --storage-class directpv-min-io --enable-host-sharing --disable-tls
Attempting to do this failed twice, because minio expects the number of volumes to be a multiple of the number of servers you have(?). I still haven't figured out why this is. This then resulted in me attempting to do it with 1 server and 2 volumes, still asking for 1400GiB, well over the amount of space provided by the single drive on that node.
kubectl minio tenant create minio-tenant-main --capacity=1400Gi --servers 1 --volumes 2 \
--namespace minio-tenant-main --storage-class directpv-min-io --enable-host-sharing --disable-tls # OOPS!
In retrospect, it was a fundamental misunderstanding of how directpv works that led me to expect that it would allocate that second volume on the other server. Regardless, I figured out that this was also not going to work, and finally settled on the 2 server, 2 volumes per server setup that minio seemingly forces me to use (1400GiB total, 350GiB per volume).
kubectl minio tenant create minio-tenant-main --capacity=1400Gi --servers 2 --volumes 4 \
--namespace minio-tenant-main --storage-class directpv-min-io --enable-host-sharing --disable-tls
# Not what I wanted, oh well.
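For reference, the per-volume sizes mentioned above follow from dividing the requested capacity by the total number of volumes. A minimal sketch of that arithmetic, assuming the capacity is split evenly across all volumes; `per_volume_gib` is a hypothetical helper name, not part of any tool:

```shell
# Hypothetical helper: split a total capacity (in GiB) evenly across volumes.
per_volume_gib() {
  total_gib=$1
  volumes=$2
  echo $(( total_gib / volumes ))
}

per_volume_gib 1400 2   # prints 700: two volumes, 700GiB each
per_volume_gib 1400 4   # prints 350: four volumes, 350GiB each
```

With a single 700GiB drive per node, two 700GiB volumes can only fit if they land on different nodes, which is why the 1 server attempt could not be satisfied.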
On the first server, the two volumes spun up perfectly fine. The two volumes on the second server, however, were locked in a pending state. After a lot of searching around I found 2 directpvvolumes stuck in a pending state, each 700GiB in size, presumably from the botched 1 server 2 volumes attempt:
❯ kubectl get directpvvolumes.directpv.min.io
NAME AGE
pvc-0935538e-2c36-48dc-a267-91f173732bba 36m
pvc-2e58a8dd-b293-418b-9d9c-d8ca14fef3dc 62m <--
pvc-a762100e-03eb-48cf-ab1c-d13b9a1b9f39 36m
pvc-eefe318b-9907-4a9e-a2c4-81dfa64f4395 62m <--
❯ # Taking a closer look at one of these rogue directpvvolumes...
❯ kubectl describe directpvvolumes.directpv.min.io pvc-eefe318b-9907-4a9e-a2c4-81dfa64f4395
Name: pvc-eefe318b-9907-4a9e-a2c4-81dfa64f4395
Namespace:
Labels: directpv.min.io/created-by=directpv-controller
directpv.min.io/drive=ae69ecd0-5150-48b0-8572-b027cb14a4bb
directpv.min.io/drive-name=nvme0n1p4
directpv.min.io/node=kruger
directpv.min.io/version=v1beta1
Annotations: <none>
API Version: directpv.min.io/v1beta1
Kind: DirectPVVolume
Metadata:
Creation Timestamp: 2024-03-06T08:47:16Z
Finalizers:
directpv.min.io/pv-protection
directpv.min.io/purge-protection
Generation: 1
Resource Version: 1482768
UID: 1a28dec5-19ac-41a2-b515-c933307f55df
Status:
Available Capacity: 751619276800
Data Path:
Fsuuid: ae69ecd0-5150-48b0-8572-b027cb14a4bb
Staging Target Path:
Status: Pending
Target Path:
Total Capacity: 751619276800
Used Capacity: 0
Events: <none>
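The capacity fields in the output above are in bytes. A quick sketch of the conversion, assuming binary units (1 GiB = 1024^3 bytes), shows that this volume is one of the 700GiB ones; `bytes_to_gib` is a hypothetical helper name:

```shell
# Hypothetical helper: convert a byte count to whole GiB (1 GiB = 1024^3 bytes).
bytes_to_gib() {
  echo $(( $1 / (1024 * 1024 * 1024) ))
}

bytes_to_gib 751619276800   # Total Capacity from the describe output; prints 700
```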
What I thought I could do was run an easy kubectl delete directpvvolumes.directpv.min.io ... on the two rogue volumes and get rid of them, but no. Not even --force has any effect; kubectl just hangs. There's nothing referencing these directpvvolumes either; the PVCs are long gone. I could probably uninstall and reinstall both minio and directpv, but I think it would be more productive to point out the situation I'm in in the form of an issue, because this is either:
A) a really dumb situation to be in, and there's a workaround that I just don't know about.
OR
B) actually an issue and I should be reporting it.
To Reproduce
Steps to reproduce the behavior:
- Simultaneously create 2 PVCs that combined are larger than the size of a drive used by directpv.
Expected behavior
I should be able to delete the directpvvolumes, especially if they are stuck in a pending state.
Deployment information (please complete the following information):
- DirectPV version: directpv version v4.0.10
- Kubernetes Version: v1.28.6+k3s2
- OS info: Arch
- Kernel version: 6.6.16-1-lts
Actually, on further inspection, I'm not able to delete persistent volumes that were created successfully, nor am I able to delete their respective directpvvolumes. Very lost on what to do.
DirectPV volumes are protected by finalizers and cannot be deleted by force. Refer to this documentation: https://github.com/minio/directpv/blob/master/docs/volume-management.md#delete-volume
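To see why a plain kubectl delete hangs, it can help to look at the object's finalizers and deletion timestamp directly. The following is only a sketch using standard kubectl JSONPath output; `show_finalizers` is a hypothetical wrapper name, and the volume name in the usage comment is the one from the describe output earlier in this issue:

```shell
# Hypothetical wrapper: print the deletion timestamp (first line) and the
# finalizers (second line) of a DirectPVVolume. A non-empty timestamp means a
# delete was requested and is waiting for the finalizers to be cleared.
show_finalizers() {
  kubectl get directpvvolumes.directpv.min.io "$1" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
}

# Usage:
#   show_finalizers pvc-eefe318b-9907-4a9e-a2c4-81dfa64f4395
```

Kubernetes does not remove an object while its finalizer list is non-empty, and --force does not clear finalizers, so the delete call waits.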
@balamurugana That helped get rid of the volumes that were stuck pending, but what about the ones that are flagged as ready and already have a PV associated with them? How do I remove them? I attempted removing them with kubectl delete pv, but that hung in the same way the pending ones did before cleaning.
The shared doc has the information, i.e. deleting the PVC triggers DirectPV volume deletion.
Ah! I thought that the PVCs weren't there, completely forgetting the fact that PVCs are namespaced. Thank you for your help.
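For anyone landing here with the same confusion: since PVCs are namespaced, a plain kubectl get pvc only shows the current namespace. A minimal sketch of how the leftover claims could be found and removed, using standard kubectl flags; the function names are hypothetical, and the namespace in the usage comment is the tenant namespace from the commands above:

```shell
# Hypothetical helper: list PVCs in every namespace along with their storage
# class and bound volume name, so directpv-min-io claims are easy to spot.
list_all_pvcs() {
  kubectl get pvc --all-namespaces \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,STORAGECLASS:.spec.storageClassName,VOLUME:.spec.volumeName'
}

# Hypothetical helper: delete one PVC; the namespace has to be given explicitly.
# Arguments: <namespace> <pvc-name>
delete_pvc() {
  kubectl delete pvc "$2" --namespace "$1"
}

# Usage:
#   list_all_pvcs
#   delete_pvc minio-tenant-main <pvc-name>
```

Per the reply above, deleting the PVC is what triggers deletion of the DirectPV volume, so the PV and directpvvolume do not need to be deleted by hand.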