NetApp/trident

fsGroup function is not working in ReadWriteOncePod

Closed this issue · 5 comments

Describe the bug
The fsGroup function does not work on a ReadWriteOncePod PV.
See the following results.

$ cat sts.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-access-mode-pod
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: test-access-mode-pod
  serviceName: test-access-mode-pod
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app.kubernetes.io/name: test-access-mode-pod
      name: test-access-mode-pod
    spec:
      securityContext:
        fsGroup: 1001

      containers:
      - name: test-access-mode-pod
        image: docker.io/bitnami/etcd:3.5.8-debian-11-r4
        command:
        - bash
        - -c
        - |
          while true; do echo test; sleep 3; done
        volumeMounts:
        - name: data
          mountPath: /data
        securityContext:
          allowPrivilegeEscalation: false
          runAsNonRoot: true
          runAsUser: 1001

  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: ontap-block
      accessModes:
      - ReadWriteOncePod
      resources:
        requests:
          storage: 10Gi

$ kubectl apply -f sts.yaml

$ kubectl exec -ti test-access-mode-pod-0 -- /bin/sh
...
$ id
uid=1001 gid=0(root) groups=0(root),1001

$ ls -la /data/
total 24
drwxr-xr-x 3 root root  4096 Aug 16 03:15 .
drwxr-xr-x 1 root root  4096 Aug 16 03:15 ..
drwx------ 2 root root 16384 Aug 16 03:15 lost+found

$ touch /data/test.txt
touch: cannot touch '/data/test.txt': Permission denied

Environment

  • Trident version: 23.04.0
  • Trident installation flags used: silenceAutosupport: true (Trident Operator)
  • Container runtime: containerd://1.6.22
  • Kubernetes version: 1.27.4
  • Kubernetes orchestrator: Kubernetes
  • Kubernetes enabled feature gates: none
  • OS: Ubuntu 20.04.5 LTS
  • NetApp backend types: ONTAP AFF 9.9.1P9
  • Other:

To Reproduce
See example above

Expected behavior
I expect fsGroup to be applied, and the volume's group ownership changed accordingly, even for ReadWriteOncePod.

$ ls -la /data/
total 24
drwxr-xr-x 3 root 1001  4096 Aug 16 03:15 .
drwxr-xr-x 1 root root  4096 Aug 16 03:15 ..
drwx------ 2 root 1001 16384 Aug 16 03:15 lost+found
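
One way to check from inside the pod whether fsGroup was applied is to compare the group owner of the mount point with the expected GID. The following is a minimal sketch: the helper name check_fsgroup is hypothetical, and it assumes GNU stat is available in the container (as is typical for Debian-based images such as the one above).

```shell
# check_fsgroup DIR GID
# Succeeds (exit 0) when DIR is group-owned by GID, fails (exit 1) otherwise.
# Assumes GNU coreutils stat, which supports -c '%g' (numeric group ID).
check_fsgroup() {
  dir=$1
  expected_gid=$2
  actual_gid=$(stat -c '%g' "$dir") || return 2
  [ "$actual_gid" = "$expected_gid" ]
}

# Example (inside the pod):
#   check_fsgroup /data 1001 && echo "fsGroup applied"
```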

Additional context
The fsGroupPolicy in Trident's CSIDriver object is set to ReadWriteOnceWithFSType.
However, ReadWriteOnceWithFSType applies fsGroup only to ReadWriteOnce volumes.
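
The policy in effect on a cluster can be read from the CSIDriver object. This is a minimal sketch, assuming the Trident driver is registered under the name csi.trident.netapp.io (run `kubectl get csidriver` to confirm the name on your cluster). The helper name get_fsgroup_policy is hypothetical.

```shell
# get_fsgroup_policy DRIVER_NAME
# Prints the fsGroupPolicy of the named CSIDriver object.
get_fsgroup_policy() {
  kubectl get csidriver "$1" -o jsonpath='{.spec.fsGroupPolicy}'
}

# Example (driver name is an assumption, see above):
#   get_fsgroup_policy csi.trident.netapp.io
```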

https://kubernetes-csi.github.io/docs/support-fsgroup.html#supported-modes
https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/csi_mounter.go#L446
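
The behavior described in those links can be modeled roughly as follows. This is a simplified sketch of the decision, not the actual kubelet code: the function name fsgroup_applied is hypothetical, and details such as read-only mounts and inline volumes are left out.

```shell
# fsgroup_applied POLICY FSTYPE ACCESS_MODE...
# Returns 0 if fsGroup would be applied under this simplified model, 1 otherwise.
fsgroup_applied() {
  policy=$1
  fstype=$2
  shift 2
  case $policy in
    None) return 1 ;;   # fsGroup is never applied
    File) return 0 ;;   # fsGroup is applied regardless of access mode
  esac
  # ReadWriteOnceWithFSType: requires an fsType and ReadWriteOnce
  # among the PV's access modes. ReadWriteOncePod does not match.
  [ -n "$fstype" ] || return 1
  for mode in "$@"; do
    [ "$mode" = "ReadWriteOnce" ] && return 0
  done
  return 1
}

# Example:
#   fsgroup_applied ReadWriteOnceWithFSType ext4 ReadWriteOncePod || echo "fsGroup skipped"
```

Under this model, a ReadWriteOncePod volume is skipped with ReadWriteOnceWithFSType, which matches the result shown above.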

One solution is to set fsGroupPolicy to File.
However, File applies fsGroup to volumes of every access mode, so NetApp would need to confirm that it causes no problems for the other modes (e.g., ReadWriteMany).

As you know, ReadWriteOncePod is GA in K8S v1.29.
ReadWriteOncePod is more secure than the existing ReadWriteOnce, and it is a highly requested feature from our users.
However, ReadWriteOncePod is not usable without fsGroup, because many operators (for databases, for example) rely on fsGroup.
Unfortunately, Trident does not support this yet.

I believe that Trident v24.01 will support K8S v1.29.
If you don't fix the bug in Trident v24.01 (which supports K8S v1.29),
could you state in your documentation that fsGroup is not available, and describe a workaround for this issue?

I checked with Trident v24.02.
Unfortunately, this bug has not been fixed yet.

Could you fix the problem ASAP? RWOP is GA in Kubernetes 1.29 and our users want to use it.

$ kubectl get deployments -n trident trident-controller -o yaml |grep image
        image: netapp/trident:24.02.0
...
$ kubectl apply -f sts.yaml 
statefulset.apps/test-access-mode-pod created

$ kubectl exec -ti test-access-mode-pod-0 -- /bin/sh
$ id
uid=1001 gid=0(root) groups=0(root),1001
$ ls -la /data/
total 24
drwxr-xr-x 3 root root  4096 May 23 07:31 .
drwxr-xr-x 1 root root  4096 May 23 07:31 ..
drwx------ 2 root root 16384 May 23 07:31 lost+found
$ touch /data/test.txt
touch: cannot touch '/data/test.txt': Permission denied
$ exit

NetApp team,

Is there a problem with setting fsGroupPolicy to File?
As you may know, the current Trident (v24.02) already has fsGroup enabled for RWX and RWO.

https://kubernetes-csi.github.io/docs/support-fsgroup.html

The Kubernetes CSI documentation says that an fsGroupPolicy of File enables fsGroup for all access modes, while ReadWriteOnceWithFSType enables it for RWO only.
Whether the access mode is RWO or RWOP, the CreateVolume operation from NetApp/Trident to ONTAP should create the ONTAP volume in the same way, and it should be possible to change ownership and permissions with chown and chmod in both cases.
So I believe that File is the correct setting in such cases.

  • Case 1) RWO
$ diff -u sts.yaml sts-rwo.yaml 
--- sts.yaml	2024-06-06 10:24:30
+++ sts-rwo.yaml	2024-06-06 10:25:42
@@ -41,7 +41,7 @@
     spec:
       storageClassName: ontap-block
       accessModes:
-      - ReadWriteOncePod
+      - ReadWriteOnce
       resources:
         requests:
           storage: 10Gi
$ kubectl apply -f sts-rwo.yaml 
statefulset.apps/test-access-mode-pod created

$ kubectl exec -ti test-access-mode-pod-0 -- /bin/sh
$ ls -la /data/           
total 24
drwxrwsr-x 3 root 1001  4096 Jun  6 01:26 .
drwxr-xr-x 1 root root  4096 Jun  6 01:26 ..
drwxrws--- 2 root 1001 16384 Jun  6 01:26 lost+found
$ id
uid=1001 gid=0(root) groups=0(root),1001
$ touch /data/test.txt
$ ls -la /data/test.txt
-rw-r--r-- 1 1001 1001 0 Jun  6 01:27 /data/test.txt
$ exit

  • Case 2) RWX
$ diff -u sts.yaml sts-rwx.yaml 
--- sts.yaml	2024-06-06 10:24:30
+++ sts-rwx.yaml	2024-06-06 10:23:47
@@ -39,9 +39,9 @@
   - metadata:
       name: data
     spec:
-      storageClassName: ontap-block
+      storageClassName: ontap-file
       accessModes:
-      - ReadWriteOncePod
+      - ReadWriteMany
       resources:
         requests:
           storage: 10Gi
$ kubectl apply -f sts-rwx.yaml 
statefulset.apps/test-access-mode-pod created
$ kubectl exec -ti test-access-mode-pod-0 -- /bin/sh
$ ls -la /data
total 8
drwxrwxrwx 2 nobody 4294967294 4096 Jun  6 01:20 .
drwxr-xr-x 1 root   root       4096 Jun  6 01:20 ..
$ touch /data/test.txt
$ ls -l /data/    
total 0
-rw-r--r-- 1 1001 4294967294 0 Jun  6 01:22 test.txt
$ id
uid=1001 gid=0(root) groups=0(root),1001
$ exit

This problem seems to affect not only us but also the KubeVirt community.
However, it seems that the KubeVirt community does not rely on the CSI fsGroup support; instead, it uses root privileges to change permissions and work around the problem.

kubevirt/containerized-data-importer#2919

Also, as you know, databases and similar workloads change permissions with fsGroup. If fsGroup is not available with RWOP, it will be impossible to deploy them, which will cause problems for users everywhere.
As this is already a GA feature in Kubernetes, please support it ASAP.

This isn't a Trident bug, and changing Trident's FsGroupPolicy to File as recommended by some users is not appropriate. Reasons for this include:

  • If the volume is NFS-based, the chgrp can fail because of root-squashing rules in most NFS servers.
  • It doesn't work for ROX/RWX volumes because a file can have only one GID and multiple pods may be accessing it at once.

We acknowledge that setting FsGroupPolicy to File for deployments that use block protocols exclusively would work for them, but it would be absolutely detrimental to all our NFS users. A mixed protocol provisioner like Trident needs to rely on an FsGroupPolicy of ReadWriteOnceWithFSType to get correct behavior for both file and block protocols.

The good news is that this is a known issue in Kubernetes:

https://github.com/kubernetes/kubernetes/issues/127170
https://github.com/kubernetes/kubernetes/issues/127817

The needed fix is a one-line change to kubelet. The community has promised to fix this in Kubernetes 1.32, and we will assist by reviewing that change.