GoogleCloudPlatform/gcs-fuse-csi-driver

FUSE CSI driver using native sidecar mutates restart policy on wrong init container

Baune8D opened this issue · 8 comments

We ran into this problem after upgrading our Autopilot cluster from K8s 1.28.7 to 1.29.4.

We also run managed Anthos Service Mesh v1.18.7.

When deploying pods that use both the FUSE CSI driver and the Istio proxy, restartPolicy: Always is applied to the istio-validation init container instead of gke-gcsfuse-sidecar.

Manifest: FUSE CSI driver: enabled - Istio sidecar injection: enabled

Notice that restartPolicy: Always gets applied to the istio-validation init container, and not to gke-gcsfuse-sidecar.

initContainers:
  - args:
      - istio-iptables
      - '-p'
      - '15001'
      - '-z'
      - '15006'
      - '-u'
      - '1337'
      - '-m'
      - REDIRECT
      - '-i'
      - '*'
      - '-x'
      - ''
      - '-b'
      - '*'
      - '-d'
      - '15090,15021,15020'
      - '--log_output_level=default:info'
      - '--run-validation'
      - '--skip-rule-apply'
    env:
      - name: CA_PROVIDER
        value: GoogleCA
      - name: CA_ROOT_CA
        value: /etc/ssl/certs/ca-certificates.crt
      - name: CA_TRUSTANCHOR
      - name: EXIT_ON_ZERO_ACTIVE_CONNECTIONS
        value: 'true'
      - name: FLEET_PROJECT_NUMBER
        value: 'xxx'
      - name: GCP_METADATA
        value: xxx|xxx|xxx|xxx
      - name: OUTPUT_CERTS
        value: /etc/istio/proxy
      - name: PROXY_CONFIG_XDS_AGENT
        value: 'true'
      - name: XDS_AUTH_PROVIDER
        value: gcp
      - name: XDS_ROOT_CA
        value: /etc/ssl/certs/ca-certificates.crt
    image: 'gcr.io/gke-release/asm/proxyv2:1.18.7-asm.21'
    imagePullPolicy: IfNotPresent
    name: istio-validation
    resources:
      limits:
        cpu: 500m
        ephemeral-storage: 1Gi
        memory: 512Mi
      requests:
        cpu: 500m
        ephemeral-storage: 1Gi
        memory: 512Mi
    restartPolicy: Always
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
      privileged: false
      readOnlyRootFilesystem: true
      runAsGroup: 1337
      runAsNonRoot: true
      runAsUser: 1337
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-n6xh7
        readOnly: true
  - args:
      - '--v=5'
    env:
      - name: NATIVE_SIDECAR
        value: 'TRUE'
    image: >-
      gke.gcr.io/gcs-fuse-csi-driver-sidecar-mounter:v1.2.0-gke.0@sha256:31880114306b1fb5d9e365ae7d4771815ea04eb56f0464a514a810df9470f88f
    imagePullPolicy: IfNotPresent
    name: gke-gcsfuse-sidecar
    resources:
      limits:
        cpu: 250m
        ephemeral-storage: 5Gi
        memory: 256Mi
      requests:
        cpu: 250m
        ephemeral-storage: 5Gi
        memory: 256Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
      readOnlyRootFilesystem: true
      runAsGroup: 65534
      runAsNonRoot: true
      runAsUser: 65534
      seccompProfile:
        type: RuntimeDefault
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
      - mountPath: /gcsfuse-tmp
        name: gke-gcsfuse-tmp
      - mountPath: /gcsfuse-buffer
        name: gke-gcsfuse-buffer
      - mountPath: /gcsfuse-cache
        name: gke-gcsfuse-cache
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-n6xh7
        readOnly: true
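
For anyone who wants to check their own pods for this symptom, here is a minimal sketch that lists which init containers ended up with restartPolicy: Always. The `native_sidecars` helper and the trimmed-down pod dictionary are my own illustration, not part of the driver or of ASM; the dictionary keeps only the two fields that matter from the manifest above.

```python
from __future__ import annotations

import json


def native_sidecars(pod: dict) -> list[str]:
    """Return the names of init containers that have restartPolicy: Always."""
    init_containers = pod.get("spec", {}).get("initContainers", [])
    return [c["name"] for c in init_containers if c.get("restartPolicy") == "Always"]


# Trimmed-down stand-in for the manifest above (hypothetical, names and
# restartPolicy only; every other field is omitted).
pod = json.loads("""
{"spec": {"initContainers": [
  {"name": "istio-validation", "restartPolicy": "Always"},
  {"name": "gke-gcsfuse-sidecar"}
]}}
""")

print(native_sidecars(pod))  # prints ['istio-validation']
```

On a real cluster the same function could be fed the output of `kubectl get pod <name> -o json` parsed with `json.load`. A healthy pod would be expected to report gke-gcsfuse-sidecar here instead.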

Manifest: FUSE CSI driver: disabled - Istio sidecar injection: enabled

This time there is no restartPolicy: Always on the istio-validation init container.

initContainers:
  - args:
      - istio-iptables
      - '-p'
      - '15001'
      - '-z'
      - '15006'
      - '-u'
      - '1337'
      - '-m'
      - REDIRECT
      - '-i'
      - '*'
      - '-x'
      - ''
      - '-b'
      - '*'
      - '-d'
      - '15090,15021,15020'
      - '--log_output_level=default:info'
      - '--run-validation'
      - '--skip-rule-apply'
    env:
      - name: CA_PROVIDER
        value: GoogleCA
      - name: CA_ROOT_CA
        value: /etc/ssl/certs/ca-certificates.crt
      - name: CA_TRUSTANCHOR
      - name: EXIT_ON_ZERO_ACTIVE_CONNECTIONS
        value: 'true'
      - name: FLEET_PROJECT_NUMBER
        value: 'xxx'
      - name: GCP_METADATA
        value: xxx|xxx|xxx|xxx
      - name: OUTPUT_CERTS
        value: /etc/istio/proxy
      - name: PROXY_CONFIG_XDS_AGENT
        value: 'true'
      - name: XDS_AUTH_PROVIDER
        value: gcp
      - name: XDS_ROOT_CA
        value: /etc/ssl/certs/ca-certificates.crt
    image: 'gcr.io/gke-release/asm/proxyv2:1.18.7-asm.21'
    imagePullPolicy: IfNotPresent
    name: istio-validation
    resources:
      limits:
        cpu: 500m
        ephemeral-storage: 1152Mi
        memory: 512Mi
      requests:
        cpu: 500m
        ephemeral-storage: 1152Mi
        memory: 512Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
      privileged: false
      readOnlyRootFilesystem: true
      runAsGroup: 1337
      runAsNonRoot: true
      runAsUser: 1337
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-fpcmd
        readOnly: true

Manifest: FUSE CSI driver: enabled - Istio sidecar injection: disabled

Now the FUSE CSI driver sidecar contains restartPolicy: Always, as expected.

initContainers:
  - args:
      - '--v=5'
    env:
      - name: NATIVE_SIDECAR
        value: 'TRUE'
    image: >-
      gke.gcr.io/gcs-fuse-csi-driver-sidecar-mounter:v1.2.0-gke.0@sha256:31880114306b1fb5d9e365ae7d4771815ea04eb56f0464a514a810df9470f88f
    imagePullPolicy: IfNotPresent
    name: gke-gcsfuse-sidecar
    resources:
      limits:
        cpu: 250m
        ephemeral-storage: 5Gi
        memory: 256Mi
      requests:
        cpu: 250m
        ephemeral-storage: 5Gi
        memory: 256Mi
    restartPolicy: Always
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
      readOnlyRootFilesystem: true
      runAsGroup: 65534
      runAsNonRoot: true
      runAsUser: 65534
      seccompProfile:
        type: RuntimeDefault
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
      - mountPath: /gcsfuse-tmp
        name: gke-gcsfuse-tmp
      - mountPath: /gcsfuse-buffer
        name: gke-gcsfuse-buffer
      - mountPath: /gcsfuse-cache
        name: gke-gcsfuse-cache
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-rcxtp
        readOnly: true

This manifests as the same symptoms as seen here: #53 (comment)

hime commented

Could you provide the location of the istio-proxy sidecar? I'm curious if it's being injected as a regular container (which would cause incompatibility). I'm not sure if ASM v1.18.7 injects istio-proxy as a native sidecar by default, but there should be a configuration that allows for this type of injection from ASM side.

If this is not allowed in ASM v1.18.7, it's worth upgrading to an ASM version that supports istio-proxy as a native sidecar.

> Could you provide the location of the istio-proxy sidecar? I'm curious if it's being injected as a regular container (which would cause incompatibility). I'm not sure if ASM v1.18.7 injects istio-proxy as a native sidecar by default, but there should be a configuration that allows for this type of injection from ASM side.
>
> If this is not allowed in ASM v1.18.7, it's worth upgrading to an ASM version that supports istio-proxy as a native sidecar.

istio-validation is injected as the first init container, and istio-proxy is injected as a regular container. I think Istio only supports native sidecars from 1.19 onwards. Also, we run managed Anthos Service Mesh through the stable channel, which only supports 1.18 at the moment, and native sidecar support seems to be a Pilot setting that we have no control over. I am not even sure whether ASM supports it in newer versions, as it seems to be an opt-in setting in Istio.

To me this clearly seems like a bug in the FUSE driver, since it mutates the wrong container. Basically, it makes istio-validation run as a native sidecar and the FUSE sidecar run as a regular init container.

hime commented

Generally speaking, we want to make sure istio is using 1.19 to be compatible with our driver. Even if there wasn't any modification to the init containers, I believe the gcsfuse sidecar would fail to start since istio-proxy must be running before any other containers that use the network.

With that said, this is very interesting behavior. What would be good to know is if this is a problem with GCSFuse webhook or istio webhook. The reason I believe we need to check this is because:

  • GCSFuse webhook injects the container after the istio-proxy sidecar when present in the same container list (e.g. init container list). When it is not, we inject at first position.
  • The GCSFuse native sidecar is shown to be injected in second position in the spec provided. This likely means the last webhook to make any changes was the istio webhook.
  • NATIVE_SIDECAR env_var is injected at the same time as the restartPolicy every time from the GCSFuse webhook.
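
The placement rule in the first bullet can be sketched roughly as follows. This is only an illustration of the rule as described in this comment, not the driver's actual webhook code; `injection_index` and `inject` are hypothetical names, and containers are reduced to dictionaries with a `name` key.

```python
from __future__ import annotations


def injection_index(containers: list[dict], anchor: str = "istio-proxy") -> int:
    """Index at which to insert: right after `anchor` if it is in this list, else 0."""
    for i, container in enumerate(containers):
        if container.get("name") == anchor:
            return i + 1
    return 0


def inject(containers: list[dict], sidecar: dict, anchor: str = "istio-proxy") -> list[dict]:
    """Return a new container list with `sidecar` inserted per the rule above."""
    idx = injection_index(containers, anchor)
    return containers[:idx] + [sidecar] + containers[idx:]


gcsfuse = {"name": "gke-gcsfuse-sidecar", "restartPolicy": "Always"}

# istio-proxy is not in the init container list, so the sidecar goes first.
without_anchor = inject([{"name": "istio-validation"}], gcsfuse)
print([c["name"] for c in without_anchor])

# istio-proxy is in the same list, so the sidecar goes right after it.
with_anchor = inject([{"name": "istio-proxy"}, {"name": "other"}], gcsfuse)
print([c["name"] for c in with_anchor])
```

Under this rule the sidecar would land in first position for the reported pod, which is why seeing it in second position suggests that another webhook modified the list afterwards.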

I think it would be good to manually declare the gcsfuse native sidecar in the yaml spec with the driver disabled and istio enabled, and see if istio is able to correctly inject the init container without modifying any other sidecars.

> Generally speaking, we want to make sure istio is using 1.19 to be compatible with our driver.

Maybe you should consider a way to disable the use of native sidecar? I feel it would make sense for you to support whatever is considered stable Google offerings. We have no way of upgrading Istio past 1.18 at the moment since we use the stable release channel of managed ASM.

> Even if there wasn't any modification to the init containers, I believe the gcsfuse sidecar would fail to start since istio-proxy must be running before any other containers that use the network.

You might well be right about this. I will try to declare the FUSE sidecar manually tomorrow and see how things work out.

> GCSFuse webhook injects the container after the istio-proxy sidecar when present in the same container list (e.g. init container list). When it is not, we inject at first position.

The istio-proxy container is not present in the same container list as the FUSE sidecar, since it resides in the normal container list. The istio-validation init container is, though, and it is the one that gets restartPolicy applied.

I will report back tomorrow when I get a chance to try out the FUSE sidecar in combination with Istio when declared manually.

hime commented

> The istio-proxy container is not present in the same container list as the FUSE sidecar, since it resides in the normal container list. The istio-validation init container is, though, and it is the one that gets restartPolicy applied.

There are multiple istio sidecars, and many istio sidecar combinations exist. To keep things simple, we only look at istio-proxy because istio guarantees that it is the last istio sidecar injected (ordering-wise).

> I will report back tomorrow when I get a chance to try out the FUSE sidecar in combination with Istio when declared manually.

Awesome!

@hime I did some more investigating, and these are the results:

  • If I disable FUSE sidecar injection and define the FUSE sidecar manually, the issue still happens.
  • If I disable FUSE sidecar injection, define the FUSE sidecar manually, and also define the istio-validation init container manually, the issue does NOT happen. In fact, everything starts up fine, and both FUSE and Istio seem to work correctly even though Istio still runs as a regular sidecar.

I cannot test the behaviour with Istio sidecar injection disabled and FUSE sidecar injection enabled, because if I define istio-validation manually, FUSE injects its sidecar before istio-validation, and the problem always manifests as the restartPolicy moving to the container before the one where it is defined.
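
One way to spot this kind of shift is to compare the init containers as declared in the submitted spec with the ones on the admitted pod, matching by name. The sketch below is only an illustration: `shifted_restart_policies` is a hypothetical helper, and the two lists are trimmed-down stand-ins for the first test case above (FUSE sidecar declared manually, Istio injection enabled).

```python
from __future__ import annotations


def restart_policy_by_name(init_containers: list[dict]) -> dict:
    """Map each init container name to its restartPolicy (None if unset)."""
    return {c["name"]: c.get("restartPolicy") for c in init_containers}


def shifted_restart_policies(declared: list[dict], admitted: list[dict]) -> list[tuple]:
    """List (name, declared_policy, admitted_policy) for every admitted init
    container whose restartPolicy differs from what was declared."""
    before = restart_policy_by_name(declared)
    after = restart_policy_by_name(admitted)
    return [
        (name, before.get(name), policy)
        for name, policy in after.items()
        if before.get(name) != policy
    ]


# What was submitted: only the manually declared FUSE sidecar.
declared = [{"name": "gke-gcsfuse-sidecar", "restartPolicy": "Always"}]

# What came back after admission, as in the first manifest of this issue.
admitted = [
    {"name": "istio-validation", "restartPolicy": "Always"},
    {"name": "gke-gcsfuse-sidecar"},
]

for row in shifted_restart_policies(declared, admitted):
    print(row)
```

An empty result means every init container kept the restartPolicy it was declared with. Containers that were declared but are missing from the admitted pod are not reported by this sketch.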

So it seems you were correct: the problem is related to the Istio webhooks, not to the FUSE CSI driver.

This investigation also turned up a workaround for now: when istio-validation is defined manually, Istio still injects istio-proxy as usual, but the restartPolicy does not get messed up.

Closing this issue.

This issue is actually in the ASM webhook: the webhook wrongly modified the gke-gcsfuse-sidecar init container while injecting the istio-validation init container. The ASM team acknowledged this issue and we are working on a fix. Thank you for reporting this issue!