kubernetes-csi/csi-driver-smb

runOnControlPlane/runOnMaster: true makes the controller unschedulable

janfrederik opened this issue · 4 comments

What happened:
Setting controller.runOnControlPlane: true makes the controller unschedulable.
The same applies to controller.runOnMaster.

This is the same problem as kubernetes-csi/csi-driver-nfs#787.

What you expected to happen:
The controller to get scheduled

How to reproduce it:

  • k3s 1.30
  • install csi-driver-smb v1.16.0 with the following values.yaml:
controller:
  runOnControlPlane: true

Environment:

  • CSI Driver version: 1.16.0
  • Kubernetes version (use kubectl version): v1.30.6+k3s1
  • OS (e.g. from /etc/os-release): openSUSE MicroOS 20241104
  • Kernel (e.g. uname -a): 6.11.5-2-default

I see the control plane nodes have label node-role.kubernetes.io/control-plane=true.
The nodeSelector is node-role.kubernetes.io/control-plane: ""

These don't match: a nodeSelector entry only matches a node label that has exactly the same key and value, so the empty string "" does not match "true".

We probably want a simple selector of type "exists" that only checks for the key node-role.kubernetes.io/control-plane without specifying a value. However, this cannot be done with nodeSelector; it needs node affinity:

template:
  spec:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          # expressions within one term are ANDed; separate terms are ORed
          - matchExpressions:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists
            - key: node-role.kubernetes.io/master
              operator: Exists

References:

There are other k8s clusters where the master node has the label node-role.kubernetes.io/control-plane= (with an empty value), so controller.runOnControlPlane alone cannot match both conditions: if we fix one, the other breaks. We also already have tolerations defined here:

  tolerations:
    - key: "node-role.kubernetes.io/master"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "node-role.kubernetes.io/controlplane"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "node-role.kubernetes.io/control-plane"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "CriticalAddonsOnly"
      operator: "Exists"
      effect: "NoSchedule"

If you want the CSI driver controller to run on master nodes matching a specific label, you can define controller.nodeSelector when installing the helm chart.
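
As a rough sketch of that suggestion for the k3s cluster reported above, where the control-plane label carries the value "true", a values.yaml could use an exact-match selector. This assumes the chart passes controller.nodeSelector through to the controller pod spec unchanged and that runOnControlPlane and runOnMaster are left unset; it is an illustration, not a tested configuration.

```yaml
controller:
  # Exact key/value match; the value must equal the node's label value.
  # "true" is what the k3s nodes in this report carry.
  nodeSelector:
    node-role.kubernetes.io/control-plane: "true"
```

On clusters that set the label with an empty value, the selector value would be "" instead.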

Shouldn't we simply remove controller.runOnControlPlane and controller.runOnMaster from values.yaml, because

  • it doesn't work reliably anyway (in some cases);
  • the use of the labels node-role.kubernetes.io/control-plane and node-role.kubernetes.io/master is not exactly the same in all k8s cluster implementations (according to @andyzhangx above);
  • values.yaml already provides controller.nodeSelector and controller.affinity for specifying node affinity;
  • it is impossible to blindly merge user-provided affinity clauses with automatically generated affinity clauses while respecting the user's intended and/or logic.

Instead, we can add some examples to the doc or as comments in values.yaml on how to use nodeSelector or affinity for the cases of runOnControlPlane and runOnMaster.
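
One such example might look like the following sketch, which uses controller.affinity to schedule the controller on any node that has either role label, regardless of the label's value. It assumes controller.affinity is rendered verbatim as the pod's affinity field; the two separate nodeSelectorTerms are ORed, so a node needs only one of the labels.

```yaml
controller:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          # Term 1: node has the control-plane label, with any value
          - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
          # Term 2: node has the legacy master label, with any value
          - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
```

Putting both expressions into a single term would instead require a node to carry both labels.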

Either that or we should mention it in the docs that it may not work for some implementations.