csi-snapshotter complains about missing CRDs quite noisily in the logs when the externalSnapshotter Helm value is set to disabled
I deployed observability to my cluster and noticed a significant amount of log entries coming from csi-snapshotter:

```
E0724 04:37:31.925736 1 reflector.go:140] github.com/kubernetes-csi/external-snapshotter/client/v6/informers/externalversions/factory.go:117: Failed to watch *v1.VolumeSnapshotContent: failed to list *v1.VolumeSnapshotContent: volumesnapshotcontents.snapshot.storage.k8s.io is forbidden: User "system:serviceaccount:csinfs:csi-nfs-controller-sa" cannot list resource "volumesnapshotcontents" in API group "snapshot.storage.k8s.io" at the cluster scope
```

Approximately 900 lines in 8 hours.
The Helm chart's only modified value is:

```yaml
values:
  externalSnapshotter:
    enabled: false
```
I'm not sure what the csi-snapshotter is doing if we have disabled externalSnapshotter: should it still be there? Or should we install the CRDs anyway, even if not the container, to prevent the noisy log messages?
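A quick diagnostic (a minimal check, independent of the chart) is to confirm whether the snapshot CRDs exist in the cluster at all:

```console
# An empty result means the VolumeSnapshot, VolumeSnapshotClass and
# VolumeSnapshotContent CRDs are not installed
kubectl get crd | grep snapshot.storage.k8s.io
```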
Environment:

- CSI Driver version: Helm chart v4.8.0
- Kubernetes version (use `kubectl version`): 1.29.2
- OS (e.g. from /etc/os-release): Linux
- Kernel (e.g. `uname -a`): Talos Linux v1.6.6
- Install tools: Helm
- Others:
@MysticalMount
what's the output of `kubectl get clusterrole nfs-external-provisioner-role -o yaml`? The `system:serviceaccount:csinfs:csi-nfs-controller-sa` account should already have the following permissions:
Hi Andy, I have uninstalled and re-installed the Helm chart a few times, so perhaps the issue only occurs in that case; here is the output of the clusterrole. I had heard that upgrading Helm charts from a previous version doesn't always update the CRDs, so perhaps it's that (this did come from v4.4.0 to v4.8.0).
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    meta.helm.sh/release-name: csinfs
    meta.helm.sh/release-namespace: csinfs
  creationTimestamp: "2024-07-25T14:50:48Z"
  labels:
    app.kubernetes.io/instance: csinfs
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: csi-driver-nfs
    app.kubernetes.io/version: v4.8.0
    helm.sh/chart: csi-driver-nfs-v4.8.0
    helm.toolkit.fluxcd.io/name: csinfs
    helm.toolkit.fluxcd.io/namespace: csinfs
  name: nfs-external-provisioner-role
  resourceVersion: "58106543"
  uid: b23ddafb-89f1-48fe-9a6b-b3b1ca631ad8
rules:
- apiGroups:
  - ""
  resources:
  - persistentvolumes
  verbs:
  - get
  - list
  - watch
  - create
  - delete
- apiGroups:
  - ""
  resources:
  - persistentvolumeclaims
  verbs:
  - get
  - list
  - watch
  - update
- apiGroups:
  - storage.k8s.io
  resources:
  - storageclasses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - snapshot.storage.k8s.io
  resources:
  - volumesnapshotclasses
  - volumesnapshots
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - snapshot.storage.k8s.io
  resources:
  - volumesnapshotcontents
  verbs:
  - get
  - list
  - watch
  - update
  - patch
- apiGroups:
  - snapshot.storage.k8s.io
  resources:
  - volumesnapshotcontents/status
  verbs:
  - get
  - update
  - patch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
- apiGroups:
  - storage.k8s.io
  resources:
  - csinodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
```
Can you check why user `system:serviceaccount:csinfs:csi-nfs-controller-sa` cannot list resource `volumesnapshotcontents` in API group `snapshot.storage.k8s.io` at the cluster scope? E.g. check the `csinfs:csi-nfs-controller-sa` service account info; that's not related to the CRDs. And those logs appear in your csi-snapshotter sidecar in csi-nfs-controller, right?
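One way to verify this (a sketch using standard kubectl impersonation; the names match the output above) is:

```console
# Ask the API server whether the controller's service account may list
# VolumeSnapshotContents at cluster scope
kubectl auth can-i list volumesnapshotcontents.snapshot.storage.k8s.io \
  --as=system:serviceaccount:csinfs:csi-nfs-controller-sa

# Find the ClusterRoleBindings that grant roles to that service account
kubectl get clusterrolebindings -o wide | grep csi-nfs-controller-sa
```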
Hello,
I am experiencing a similar issue to the one described here. I am using MicroK8s without any snapshot controller installed, which rightly means that the snapshot-related CRDs (VolumeSnapshot, VolumeSnapshotClass, VolumeSnapshotContent) do not exist in my environment. Given this setup, the continuous stream of errors from the csi-snapshotter seems unnecessary, as snapshot functionality is not applicable.
Is there a way to disable the csi-snapshotter container in scenarios where snapshot functionality is not being used or intended? Any guidance on this would be greatly appreciated.
Thank you!
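One stopgap (a sketch, assuming the default Deployment and container names `csi-nfs-controller` and `csi-snapshotter`, and a `kube-system` install namespace) is to patch the sidecar out of the controller; note that Helm will add it back on the next upgrade:

```console
# Remove the csi-snapshotter sidecar via a strategic merge patch keyed
# on the container name; adjust -n to your install namespace
kubectl -n kube-system patch deployment csi-nfs-controller -p '
spec:
  template:
    spec:
      containers:
      - name: csi-snapshotter
        $patch: delete
'
```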
Try reinstalling the NFS driver if you have upgraded the CSI driver, similar to kubernetes-csi/external-snapshotter#975 (comment).
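A minimal sketch of such a reinstall (the release name, namespace, and repo alias are assumptions; substitute your own):

```console
# Uninstall and reinstall so all of the chart's current manifests are re-applied
helm uninstall csi-driver-nfs -n kube-system
helm repo update
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs -n kube-system --version v4.8.0
```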
Hi,
I'm facing the same issue as described here:

```
E0821 02:12:03.111339 1 reflector.go:150] github.com/kubernetes-csi/external-snapshotter/client/v8/informers/externalversions/factory.go:142: Failed to watch *v1.VolumeSnapshotContent: failed to list *v1.VolumeSnapshotContent: the server could not find the requested resource (get volumesnapshotcontents.snapshot.storage.k8s.io)
```
I have the same issue here, with csi-driver-nfs installed through its Helm chart (without enabling externalSnapshotter).

Could it simply come from the fact that the Helm chart currently deploys the CRDs only if externalSnapshotter is enabled? See the `if` statement here: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/charts/latest/csi-driver-nfs/templates/crd-csi-snapshot.yaml#L1

Even though these CRDs are also needed by the `csi-snapshotter` container of the controller: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/charts/latest/csi-driver-nfs/templates/csi-nfs-controller.yaml#L78
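For context, the guard at the top of that template looks roughly like this (a paraphrase, not the verbatim template; see the linked file for the exact wording):

```yaml
# charts/latest/csi-driver-nfs/templates/crd-csi-snapshot.yaml (paraphrased)
{{- if .Values.externalSnapshotter.enabled -}}
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: volumesnapshots.snapshot.storage.k8s.io
# ... the CRD spec and the other snapshot CRDs follow ...
{{- end -}}
```

So nothing installs the snapshot CRDs when `externalSnapshotter.enabled` is false, while the `csi-snapshotter` sidecar still runs and watches for them.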
Should this be fixed in the Helm chart?
A workaround is to deploy the CRDs manually:

```console
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/deploy/v4.8.0/crd-csi-snapshot.yaml
```

(this is for v4.8.0: adapt it for the version you use)

However, it's not clean to mix the Helm chart and a manual deployment like this: if Helm later tries to deploy these CRDs itself (because of a fix for this issue, or a configuration change you made), it will not find its ownership labels on the CRDs and will refuse to overwrite objects it did not create. So use this workaround only if you know what you're doing.
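If you later hit that conflict, Helm 3 can adopt pre-existing objects once they carry its ownership metadata. A sketch for one of the CRDs, assuming the release name and namespace `csinfs` from the output above (repeat for the other snapshot CRDs, and adjust to your release):

```console
# Mark the manually created CRD as managed by the csinfs Helm release
kubectl label crd volumesnapshots.snapshot.storage.k8s.io \
  app.kubernetes.io/managed-by=Helm
kubectl annotate crd volumesnapshots.snapshot.storage.k8s.io \
  meta.helm.sh/release-name=csinfs \
  meta.helm.sh/release-namespace=csinfs
```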
Any update here?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
One fix would be to add a new flag, e.g. `--set controller.enableSnapshotter=false`, in case you don't need the snapshot function at all.
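For illustration, usage of such a flag (hypothetical here, since this comment only proposes it) would look like:

```console
# controller.enableSnapshotter is the proposed flag, not an existing chart value
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs -n kube-system \
  --set externalSnapshotter.enabled=false \
  --set controller.enableSnapshotter=false
```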