Cannot do dynamic provisioning
jackchuong opened this issue · 8 comments
Hi all,
In my environment I have a Synology NAS as the NFS server.
I have an internal k8s cluster with nfs-subdir-external-provisioner installed, and it is working fine (dynamic/static PV/PVC, but it doesn't support volumesnapshot).
Now I want to try csi-driver-nfs.
I installed the CSI driver with Helm 3, default values.
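Roughly the following, per the chart install guide in the project repo (a sketch, not my exact shell history):
# add the csi-driver-nfs chart repo (URL from the project's install docs)
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm repo update
# install chart v4.4.0 with default values into the nfs-provisioner namespace
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --namespace nfs-provisioner --version v4.4.0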
helm -n nfs-provisioner list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
csi-driver-nfs nfs-provisioner 1 2023-09-25 15:26:29.58904168 +0700 +07 deployed csi-driver-nfs-v4.4.0 v4.4.0
nfs-provisioner-sgnnas03 nfs-provisioner 1 2023-09-07 15:45:35.266905104 +0700 +07 deployed nfs-subdir-external-provisioner-4.0.18 4.0.2
My storageclass and volumesnapshotclass
---
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: storageclass-csi-nfs-sgnnas03
  # annotations:
  #   storageclass.kubernetes.io/is-default-class: "true"
provisioner: nfs.csi.k8s.io
parameters:
  server: "192.168.7.107"
  share: /k8s-nfs-sgnnas03
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - hard
  - nfsvers=4.1
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-nfs-snapclass
driver: nfs.csi.k8s.io
deletionPolicy: Delete
kubectl get csidrivers,sc,VolumeSnapshotClass
NAME ATTACHREQUIRED PODINFOONMOUNT STORAGECAPACITY TOKENREQUESTS REQUIRESREPUBLISH MODES AGE
csidriver.storage.k8s.io/secrets-store.csi.k8s.io false true false <unset> false Ephemeral 18d
csidriver.storage.k8s.io/nfs.csi.k8s.io false false false <unset> false Persistent 17h
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/nfs-sgnnas03 (default) nfs-provisioner-sgnnas03 Delete Immediate true 18d
storageclass.storage.k8s.io/storageclass-csi-nfs-sgnnas03 nfs.csi.k8s.io Delete Immediate true 17h
NAME DRIVER DELETIONPOLICY AGE
volumesnapshotclass.snapshot.storage.k8s.io/csi-nfs-snapclass nfs.csi.k8s.io Delete 17h
My dynamic PVC
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-test
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: storageclass-csi-nfs-sgnnas03
It stays in Pending status.
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/pvc-test Pending storageclass-csi-nfs-sgnnas03 12s
kubectl describe persistentvolumeclaim/pvc-test
Name: pvc-test
Namespace: default
StorageClass: storageclass-csi-nfs-sgnnas03
Status: Pending
Volume:
Labels: <none>
Annotations: volume.beta.kubernetes.io/storage-provisioner: nfs.csi.k8s.io
volume.kubernetes.io/storage-provisioner: nfs.csi.k8s.io
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Used By: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ExternalProvisioning 4s (x2 over 4s) persistentvolume-controller waiting for a volume to be created, either by external provisioner "nfs.csi.k8s.io" or manually created by system administrator
Normal Provisioning 1s (x3 over 4s) nfs.csi.k8s.io_k3s-hq-master1_4caab03c-f211-44c7-bc53-50b8b7c2b9cf External provisioner is provisioning volume for claim "default/pvc-test"
Warning ProvisioningFailed 1s (x3 over 4s) nfs.csi.k8s.io_k3s-hq-master1_4caab03c-f211-44c7-bc53-50b8b7c2b9cf failed to provision volume with StorageClass "storageclass-csi-nfs-sgnnas03": rpc error: code = Internal desc = failed to mount nfs server: rpc error: code = Internal desc = mkdir /tmp/pvc-a9d891cc-2ba9-465b-8c57-7afb61dfdf78: read-only file system
How can I troubleshoot it? Please give me some advice, thank you very much.
Follow this guide to provide the driver controller logs: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/docs/csi-debug.md#case1-volume-createdelete-failed
Why would mkdir under /tmp/ (inside the driver controller) fail? Have you set some permissions on the driver controller?
Internal desc = mkdir /tmp/pvc-a9d891cc-2ba9-465b-8c57-7afb61dfdf78: read-only file system
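Something like this should capture the controller-side logs (namespace taken from your helm list; container name nfs as in the debug guide):
# find the controller pod in the release namespace
kubectl -n nfs-provisioner get pod -l app=csi-nfs-controller
# dump the nfs container logs from the controller deployment
kubectl -n nfs-provisioner logs deploy/csi-nfs-controller -c nfs > csi-nfs-controller.log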
Hi @andyzhangx
I didn't set any permissions on the driver controller; I installed the CSI driver with Helm 3, default values.
Because the logs are quite long (I tested some volumesnapshots & volumesnapshotcontents before), I decided to force delete the csi-nfs-controller, csi-nfs-node, and snapshot-controller pods and create the dynamic PVC again to get clean logs.
But now it works fine. I can create dynamic and static PV/PVC successfully.
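Since snapshot support was my reason for trying csi-driver-nfs, I plan to test something like this minimal VolumeSnapshot against csi-nfs-snapclass (the snapshot name is just an example):
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: pvc-test-snapshot   # example name only
spec:
  volumeSnapshotClassName: csi-nfs-snapclass
  source:
    persistentVolumeClaimName: pvc-test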
I have only one concern, about a warning on the csi-nfs-controller pod:
kubectl -n nfs-provisioner get pod
NAME READY STATUS RESTARTS AGE
nfs-provisioner-sgnnas03-nfs-subdir-external-provisioner-8vv4gb 1/1 Running 0 18d
csi-nfs-controller-86548b95f7-z7qp2 4/4 Running 0 40m
csi-nfs-node-66dvn 3/3 Running 0 40m
snapshot-controller-7c47788447-xw276 1/1 Running 0 40m
kubectl -n nfs-provisioner describe pod csi-nfs-controller-86548b95f7-z7qp2
Name: csi-nfs-controller-86548b95f7-z7qp2
Namespace: nfs-provisioner
Priority: 2000000000
Priority Class Name: system-cluster-critical
Service Account: csi-nfs-controller-sa
Node: k3s-hq-master1/192.168.7.200
Start Time: Tue, 26 Sep 2023 09:24:29 +0700
Labels: app=csi-nfs-controller
app.kubernetes.io/instance=csi-driver-nfs
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=csi-driver-nfs
app.kubernetes.io/version=v4.4.0
helm.sh/chart=csi-driver-nfs-v4.4.0
pod-template-hash=86548b95f7
Annotations: <none>
Status: Running
SeccompProfile: RuntimeDefault
IP: 192.168.7.200
IPs:
IP: 192.168.7.200
Controlled By: ReplicaSet/csi-nfs-controller-86548b95f7
Containers:
csi-provisioner:
Container ID: containerd://2db0759d03659a7d557959762dfd294b9c145d8526c3206b722db19d59ba97ea
Image: registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
Image ID: registry.k8s.io/sig-storage/csi-provisioner@sha256:d078dc174323407e8cc6f0f9abd4efaac5db27838f1564d0253d5e3233e3f17f
Port: <none>
Host Port: <none>
Args:
-v=2
--csi-address=$(ADDRESS)
--leader-election
--leader-election-namespace=nfs-provisioner
--extra-create-metadata=true
State: Running
Started: Tue, 26 Sep 2023 09:24:30 +0700
Ready: True
Restart Count: 0
Limits:
memory: 400Mi
Requests:
cpu: 10m
memory: 20Mi
Environment:
ADDRESS: /csi/csi.sock
Mounts:
/csi from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-57p5d (ro)
csi-snapshotter:
Container ID: containerd://60fcab37cc0aae7757925230ca8f02481e05ae0a475ec3ac4f88a27d553a1faa
Image: registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2
Image ID: registry.k8s.io/sig-storage/csi-snapshotter@sha256:becc53e25b96573f61f7469923a92fb3e9d3a3781732159954ce0d9da07233a2
Port: <none>
Host Port: <none>
Args:
--v=2
--csi-address=$(ADDRESS)
--leader-election-namespace=nfs-provisioner
--leader-election
State: Running
Started: Tue, 26 Sep 2023 09:24:31 +0700
Ready: True
Restart Count: 0
Limits:
memory: 200Mi
Requests:
cpu: 10m
memory: 20Mi
Environment:
ADDRESS: /csi/csi.sock
Mounts:
/csi from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-57p5d (ro)
liveness-probe:
Container ID: containerd://858d1b86561ac48692bff7965b6f67aff5759ba9e92c68b6cbdb989dce074d70
Image: registry.k8s.io/sig-storage/livenessprobe:v2.10.0
Image ID: registry.k8s.io/sig-storage/livenessprobe@sha256:4dc0b87ccd69f9865b89234d8555d3a614ab0a16ed94a3016ffd27f8106132ce
Port: <none>
Host Port: <none>
Args:
--csi-address=/csi/csi.sock
--probe-timeout=3s
--health-port=29652
--v=2
State: Running
Started: Tue, 26 Sep 2023 09:24:31 +0700
Ready: True
Restart Count: 0
Limits:
memory: 100Mi
Requests:
cpu: 10m
memory: 20Mi
Environment: <none>
Mounts:
/csi from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-57p5d (ro)
nfs:
Container ID: containerd://a501d4a91b904e5572972862054e400e2fd43533de77ff8a9f44d0b44bf9841b
Image: registry.k8s.io/sig-storage/nfsplugin:v4.4.0
Image ID: registry.k8s.io/sig-storage/nfsplugin@sha256:971cb526bc0108cca7d8efda32102629cb362f99bbd4879fd381bea1b2014ab4
Port: 29652/TCP
Host Port: 29652/TCP
Args:
--v=5
--nodeid=$(NODE_ID)
--endpoint=$(CSI_ENDPOINT)
--drivername=nfs.csi.k8s.io
--mount-permissions=0
--working-mount-dir=/tmp
--default-ondelete-policy=delete
State: Running
Started: Tue, 26 Sep 2023 09:24:32 +0700
Ready: True
Restart Count: 0
Limits:
memory: 200Mi
Requests:
cpu: 10m
memory: 20Mi
Liveness: http-get http://:healthz/healthz delay=30s timeout=10s period=30s #success=1 #failure=5
Environment:
NODE_ID: (v1:spec.nodeName)
CSI_ENDPOINT: unix:///csi/csi.sock
Mounts:
/csi from socket-dir (rw)
/tmp from tmp-dir (rw)
/var/lib/kubelet/pods from pods-mount-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-57p5d (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
pods-mount-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/pods
HostPathType: Directory
socket-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
tmp-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-57p5d:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/controlplane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 40m default-scheduler 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports..
Normal Scheduled 40m default-scheduler Successfully assigned nfs-provisioner/csi-nfs-controller-86548b95f7-z7qp2 to k3s-hq-master1
Normal Pulled 40m kubelet Container image "registry.k8s.io/sig-storage/csi-provisioner:v3.5.0" already present on machine
Normal Created 40m kubelet Created container csi-provisioner
Normal Started 40m kubelet Started container csi-provisioner
Normal Pulled 40m kubelet Container image "registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2" already present on machine
Normal Created 40m kubelet Created container csi-snapshotter
Normal Started 40m kubelet Started container csi-snapshotter
Normal Pulled 40m kubelet Container image "registry.k8s.io/sig-storage/livenessprobe:v2.10.0" already present on machine
Normal Created 40m kubelet Created container liveness-probe
Normal Started 40m kubelet Started container liveness-probe
Normal Pulled 40m kubelet Container image "registry.k8s.io/sig-storage/nfsplugin:v4.4.0" already present on machine
Normal Created 40m kubelet Created container nfs
Normal Started 40m kubelet Started container nfs
This is csi-nfs-controller.log.
I0926 02:24:30.726169 1 feature_gate.go:249] feature gates: &{map[]}
I0926 02:24:30.726366 1 csi-provisioner.go:154] Version: v3.5.0
I0926 02:24:30.726380 1 csi-provisioner.go:177] Building kube configs for running in cluster...
I0926 02:24:32.833655 1 common.go:111] Probing CSI driver for readiness
I0926 02:24:32.840484 1 csi-provisioner.go:230] Detected CSI driver nfs.csi.k8s.io
I0926 02:24:32.843864 1 csi-provisioner.go:302] CSI driver does not support PUBLISH_UNPUBLISH_VOLUME, not watching VolumeAttachments
I0926 02:24:32.845873 1 controller.go:732] Using saving PVs to API server in background
I0926 02:24:32.846938 1 leaderelection.go:245] attempting to acquire leader lease nfs-provisioner/nfs-csi-k8s-io...
I0926 02:24:52.779355 1 leaderelection.go:255] successfully acquired lease nfs-provisioner/nfs-csi-k8s-io
I0926 02:24:52.779803 1 leader_election.go:178] became leader, starting
I0926 02:24:52.880380 1 controller.go:811] Starting provisioner controller nfs.csi.k8s.io_k3s-hq-master1_6603e3a6-a878-417e-819d-875e2dc1b01a!
I0926 02:24:52.880470 1 clone_controller.go:66] Starting CloningProtection controller
I0926 02:24:52.880647 1 volume_store.go:97] Starting save volume queue
I0926 02:24:52.880738 1 clone_controller.go:82] Started CloningProtection controller
I0926 02:24:52.981758 1 controller.go:860] Started provisioner controller nfs.csi.k8s.io_k3s-hq-master1_6603e3a6-a878-417e-819d-875e2dc1b01a!
I0926 02:27:16.460472 1 controller.go:1359] provision "default/pvc-test" class "storageclass-csi-nfs-sgnnas03": started
I0926 02:27:16.460780 1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"pvc-test", UID:"0af9008e-e778-4896-a00b-baea28041232", APIVersion:"v1", ResourceVersion:"1980825", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/pvc-test"
I0926 02:27:17.859174 1 controller.go:923] successfully created PV pvc-0af9008e-e778-4896-a00b-baea28041232 for PVC pvc-test and csi volume name 192.168.7.107#k8s-nfs-sgnnas03#pvc-0af9008e-e778-4896-a00b-baea28041232##
I0926 02:27:17.859250 1 controller.go:1442] provision "default/pvc-test" class "storageclass-csi-nfs-sgnnas03": volume "pvc-0af9008e-e778-4896-a00b-baea28041232" provisioned
I0926 02:27:17.859270 1 controller.go:1455] provision "default/pvc-test" class "storageclass-csi-nfs-sgnnas03": succeeded
I0926 02:27:17.866664 1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"pvc-test", UID:"0af9008e-e778-4896-a00b-baea28041232", APIVersion:"v1", ResourceVersion:"1980825", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-0af9008e-e778-4896-a00b-baea28041232
I0926 02:34:37.196452 1 controller.go:1502] delete "pvc-0af9008e-e778-4896-a00b-baea28041232": started
I0926 02:34:37.196521 1 controller.go:1279] volume pvc-0af9008e-e778-4896-a00b-baea28041232 does not need any deletion secrets
I0926 02:34:37.306536 1 controller.go:1517] delete "pvc-0af9008e-e778-4896-a00b-baea28041232": volume deleted
I0926 02:34:37.318770 1 controller.go:1562] delete "pvc-0af9008e-e778-4896-a00b-baea28041232": persistentvolume deleted succeeded
I0926 02:35:16.719867 1 controller.go:1359] provision "default/pvc-test" class "storageclass-csi-nfs-sgnnas03": started
I0926 02:35:16.720125 1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"pvc-test", UID:"25747ee0-1f9a-4ce8-b3d6-1dd5f5284393", APIVersion:"v1", ResourceVersion:"1981731", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/pvc-test"
I0926 02:35:16.806455 1 controller.go:923] successfully created PV pvc-25747ee0-1f9a-4ce8-b3d6-1dd5f5284393 for PVC pvc-test and csi volume name 192.168.7.107#k8s-nfs-sgnnas03#pvc-25747ee0-1f9a-4ce8-b3d6-1dd5f5284393##
I0926 02:35:16.806520 1 controller.go:1442] provision "default/pvc-test" class "storageclass-csi-nfs-sgnnas03": volume "pvc-25747ee0-1f9a-4ce8-b3d6-1dd5f5284393" provisioned
I0926 02:35:16.806541 1 controller.go:1455] provision "default/pvc-test" class "storageclass-csi-nfs-sgnnas03": succeeded
I0926 02:35:16.813133 1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"pvc-test", UID:"25747ee0-1f9a-4ce8-b3d6-1dd5f5284393", APIVersion:"v1", ResourceVersion:"1981731", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-25747ee0-1f9a-4ce8-b3d6-1dd5f5284393
I0926 02:36:24.270225 1 controller.go:1502] delete "pvc-25747ee0-1f9a-4ce8-b3d6-1dd5f5284393": started
I0926 02:36:24.270295 1 controller.go:1279] volume pvc-25747ee0-1f9a-4ce8-b3d6-1dd5f5284393 does not need any deletion secrets
I0926 02:36:24.382536 1 controller.go:1517] delete "pvc-25747ee0-1f9a-4ce8-b3d6-1dd5f5284393": volume deleted
I0926 02:36:24.392834 1 controller.go:1562] delete "pvc-25747ee0-1f9a-4ce8-b3d6-1dd5f5284393": persistentvolume deleted succeeded
I0926 02:38:44.339074 1 controller.go:1359] provision "default/pvc-test" class "storageclass-csi-nfs-sgnnas03": started
I0926 02:38:44.339304 1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"pvc-test", UID:"7665caae-aec4-4225-be4c-e8afde29a3b9", APIVersion:"v1", ResourceVersion:"1982148", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/pvc-test"
I0926 02:38:44.419087 1 controller.go:923] successfully created PV pvc-7665caae-aec4-4225-be4c-e8afde29a3b9 for PVC pvc-test and csi volume name 192.168.7.107#k8s-nfs-sgnnas03#pvc-7665caae-aec4-4225-be4c-e8afde29a3b9##
I0926 02:38:44.419174 1 controller.go:1442] provision "default/pvc-test" class "storageclass-csi-nfs-sgnnas03": volume "pvc-7665caae-aec4-4225-be4c-e8afde29a3b9" provisioned
I0926 02:38:44.419207 1 controller.go:1455] provision "default/pvc-test" class "storageclass-csi-nfs-sgnnas03": succeeded
I0926 02:38:44.427127 1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"pvc-test", UID:"7665caae-aec4-4225-be4c-e8afde29a3b9", APIVersion:"v1", ResourceVersion:"1982148", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-7665caae-aec4-4225-be4c-e8afde29a3b9
What warning? You mean this, @jackchuong?
0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports..
Yes, I mean that. My internal k8s cluster (single node) is set up with k3s, is it related?
@jackchuong yes, only one controller pod runs per node; since you have two replicas, it requires at least two nodes.
@andyzhangx so should I just ignore it, or change the Helm chart values?
controller:
  runOnMaster: false
  runOnControlPlane: true
  nodeSelector:
    node-role.kubernetes.io/control-plane: "true"
kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k3s-hq-master1 Ready control-plane,master 18d v1.27.5+k3s1 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=k3s,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k3s-hq-master1,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=true,node-role.kubernetes.io/master=true,node.kubernetes.io/instance-type=k3s
Just ignore it since the controller is running now.
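If you would rather make the warning go away on a single-node cluster, the chart's values appear to include controller.replicas (default 2 in the upstream values.yaml), so something like this should also work:
# run a single controller replica so one node is enough (assumes the chart exposes controller.replicas)
helm upgrade csi-driver-nfs csi-driver-nfs/csi-driver-nfs -n nfs-provisioner --reuse-values --set controller.replicas=1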