Enabling file cache without webhook injector
christopher-kwan-ai opened this issue · 6 comments
The webhook injector specifies an emptyDir as the cache directory. I would prefer to be able to specify a persistent volume instead, so that I can readily increase the size of my cache depending on my read patterns.
I am trying to work around this by specifying the gke-gcsfuse-sidecar container myself instead of relying on the pod annotations. This lets me pin a particular version of the CSI driver (I am trying out v1.2.0 right now) and also specify the cache dir as an ephemeral volume.

However, this is still failing with: rpc error: code = FailedPrecondition desc = failed to find the sidecar container in Pod spec.

Is this approach feasible at all? Below is an example of the YAML definition I am using:
```yaml
containers:
- name: gke-gcsfuse-sidecar
  image: <REGISTRY>/gcs-fuse-csi-driver-sidecar-mounter:v1.2.0
  imagePullPolicy: IfNotPresent
  args:
  - --v=5
  - --grace-period=30
  volumeMounts:
  - mountPath: /gcsfuse-buffer
    name: gke-gcsfuse-buffer
  - mountPath: /gcsfuse-tmp
    name: gke-gcsfuse-tmp
  - mountPath: /gcsfuse-cache
    name: cachedir
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
      - ALL
    readOnlyRootFilesystem: true
    runAsGroup: 65534
    runAsNonRoot: true
    runAsUser: 65534
....
....
volumes:
- ephemeral:
    volumeClaimTemplate:
      metadata:
        labels:
          type: main-volume
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: standard
  name: cachedir
- csi:
    driver: gcsfuse.csi.storage.gke.io
    readOnly: true
    volumeAttributes:
      bucketName: <some_output_bucket>
      fileCacheCapacity: "-1"
      fileCacheForRangeRead: "true"
      metadataStatCacheCapacity: "-1"
      metadataTypeCacheCapacity: "-1"
      metadataCacheTTLSeconds: "-1"
      mountOptions: implicit-dirs,only-dir=<some_dir>,logging:severity:info
  name: fuse-dir
- name: gke-gcsfuse-tmp
  emptyDir: {}
- name: gke-gcsfuse-buffer
  emptyDir: {}
```
Hi @christopher-kwan-541, we will support a custom cache volume. The public documentation is pending and will be published this week.

Similar to https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/cloud-storage-fuse-csi-driver#buffer-volume, you can specify a volume called gke-gcsfuse-cache.
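For reference, a minimal sketch of what that could look like, following the buffer-volume pattern from the linked doc. The volume name gke-gcsfuse-cache is the reserved name the driver recognizes; the storage class and size here are illustrative assumptions, not requirements:

```yaml
# Hypothetical example: supplying a custom cache volume so the driver
# uses it instead of the injected emptyDir. Only the volume name
# gke-gcsfuse-cache is significant; size and storageClassName are
# placeholder assumptions.
volumes:
- name: gke-gcsfuse-cache
  ephemeral:
    volumeClaimTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        storageClassName: standard   # assumption: any RWO class should work
        resources:
          requests:
            storage: 100Gi           # sized to your read pattern
```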
In your case, you are completely bypassing the sidecar injection, and it should also work.
In your YAML example, the cachedir volume and the other volumes are not at the same indentation level. Is this intentional?
Apologies. I have updated the YAML above with the correct indentation for the volumes section. It is still failing with:

rpc error: code = FailedPrecondition desc = failed to find the sidecar container in Pod spec

Any clues on why that may be the case? I am using v1.2.0; do let me know if I should be using a different release tag.
Hi @christopher-kwan-541, what GKE version are you currently using? I cannot reproduce the issue using the shared YAML.

Please note that the file cache feature will be available on GKE this week, so you may want to wait a little bit and try out the official version.
I am on v1.26.11-gke.1055000. From our paired debugging today, it looks like I need to use the image hosted by Google. Here is the full required YAML for posterity:
```yaml
securityContext:
  fsGroup: 100
containers:
- name: gke-gcsfuse-sidecar
  image: gke.gcr.io/gcs-fuse-csi-driver-sidecar-mounter@sha256:31880114306b1fb5d9e365ae7d4771815ea04eb56f0464a514a810df9470f88f
  imagePullPolicy: IfNotPresent
  args:
  - --v=5
  volumeMounts:
  - mountPath: /gcsfuse-buffer
    name: gke-gcsfuse-buffer
  - mountPath: /gcsfuse-tmp
    name: gke-gcsfuse-tmp
  - mountPath: /gcsfuse-cache
    name: cachedir
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
      - ALL
    readOnlyRootFilesystem: true
    runAsGroup: 65534
    runAsNonRoot: true
    runAsUser: 65534
....
....
volumes:
- ephemeral:
    volumeClaimTemplate:
      metadata:
        labels:
          type: main-volume
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: standard
  name: cachedir
- csi:
    driver: gcsfuse.csi.storage.gke.io
    readOnly: true
    volumeAttributes:
      bucketName: <some_output_bucket>
      mountOptions: implicit-dirs,only-dir=<test_dir>,logging:severity:info,file-cache:max-size-mb:-1,file-cache:cache-file-for-range-read:true
  name: fuse-dir
- name: gke-gcsfuse-tmp
  emptyDir: {}
- name: gke-gcsfuse-buffer
  emptyDir: {}
```
Verified that the sidecar is now using the cache and saving files into the specified ephemeral volume. However, read performance is still poor despite the cache; it even appears to have degraded further with the cache enabled.

Let me know if you would prefer I create a new issue, since this issue pertains to being able to instantiate the cache.
Thanks @christopher-kwan-541, I will close this issue. Let's follow up through the Google support channel.

If GKE logging still does not have the proper permission role added, please run the following command to collect the sidecar container log and share the output file with the Google support engineer you are working with:
```shell
kubectl logs <pod-name> -c gke-gcsfuse-sidecar > sidecar-container.log
```