kubernetes-csi/csi-driver-nfs

[BUG][GKE] Upgrade to 4.5.0 fails if cluster role snapshot-controller-runner already exists in cluster

navilg opened this issue · 6 comments

navilg commented

What happened:
When installing or upgrading the driver to version 4.5.0 with externalSnapshotter enabled, it fails with the error below:

[screenshot: Helm fails with an error that the ClusterRole snapshot-controller-runner already exists in the cluster]

What you expected to happen:

If the clusterrole snapshot-controller-runner already exists in the cluster, it should be ignored during upgrade or install, or there should be a way to skip it. One possible workaround is sketched below.
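
One workaround for this class of Helm ownership conflict is to let Helm adopt the existing ClusterRole by adding the standard Helm ownership metadata. A sketch, assuming the release is named csi-driver-nfs and lives in kube-system:

kubectl label clusterrole snapshot-controller-runner app.kubernetes.io/managed-by=Helm
kubectl annotate clusterrole snapshot-controller-runner \
  meta.helm.sh/release-name=csi-driver-nfs \
  meta.helm.sh/release-namespace=kube-system

After that, Helm treats the resource as part of the release instead of failing on the conflict.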

How to reproduce it:

Install or upgrade csi-driver-nfs to 4.5.0 with externalSnapshotter.enabled=true and externalSnapshotter.customResourceDefinitions.enabled=false
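
For reference, a minimal repro along those lines, assuming the chart is pulled from the upstream repo and installed into kube-system:

helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm upgrade --install csi-driver-nfs csi-driver-nfs/csi-driver-nfs \
  --namespace kube-system --version v4.5.0 \
  --set externalSnapshotter.enabled=true \
  --set externalSnapshotter.customResourceDefinitions.enabled=false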

Anything else we need to know?:

Environment:

  • CSI Driver version: 4.5.0
  • Kubernetes version (use kubectl version): 1.28 (GKE)
  • OS (e.g. from /etc/os-release): Ubuntu with containerd
  • Kernel (e.g. uname -a):
  • Install tools: helm
  • Others:

navilg commented

I can look into fixing this over the weekend.

If the snapshot-controller clusterrole already exists, the snapshot controller is already enabled on the cluster. In that case you should set externalSnapshotter.enabled=false, which only enables the snapshot sidecar container in the CSI driver controller and does not install the snapshot controller in the cluster.
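
To check, a plain kubectl query is enough (nothing chart-specific here):

kubectl get clusterrole snapshot-controller-runner

If that returns a result, re-run the same helm upgrade as in the repro above but with --set externalSnapshotter.enabled=false.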

navilg commented

Yeah, I checked the pods in the GKE cluster, but didn't find any pods using the snapshot-controller image.
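
For anyone repeating this check, one way to list every container image across namespaces and filter for the snapshot controller (a generic sketch, not specific to this chart):

kubectl get pods -A -o jsonpath='{.items[*].spec.containers[*].image}' | tr ' ' '\n' | grep snapshot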

@navilg what's the output of kubectl get crd? I am not familiar with GKE; on Azure AKS, the snapshot controller is managed by AKS and is invisible to the user.
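
Filtering for the snapshot API group would be enough, for example:

kubectl get crd | grep snapshot.storage.k8s.io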

navilg commented

NAME                                              
allowlistedv2workloads.auto.gke.io                
allowlistedworkloads.auto.gke.io                  
audits.warden.gke.io                              
backendconfigs.cloud.google.com                   
capacityrequests.internal.autoscaling.gke.io      
clusterpodmonitorings.monitoring.googleapis.com   
clusterrules.monitoring.googleapis.com            
frontendconfigs.networking.gke.io                 
globalrules.monitoring.googleapis.com             
managedcertificates.networking.gke.io             
memberships.hub.gke.io                            
operatorconfigs.monitoring.googleapis.com         
podmonitorings.monitoring.googleapis.com          
rules.monitoring.googleapis.com                   
serviceattachments.networking.gke.io              
servicenetworkendpointgroups.networking.gke.io    
updateinfos.nodemanagement.gke.io                 
volumesnapshotclasses.snapshot.storage.k8s.io     
volumesnapshotcontents.snapshot.storage.k8s.io    
volumesnapshots.snapshot.storage.k8s.io

That means the snapshot controller and CRDs are already installed on the cluster, so you should set externalSnapshotter.enabled=false in the helm install.
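
Concretely, something like this (a sketch; the release name, namespace, and chart version are assumptions carried over from the repro above):

helm upgrade --install csi-driver-nfs csi-driver-nfs/csi-driver-nfs \
  --namespace kube-system --version v4.5.0 \
  --set externalSnapshotter.enabled=false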