Fuse sidecar gets killed and health probe doesn't restart it
The sidecar container just dies and the health/liveness probe doesn't restart it
The following logs appear and then, all of a sudden, it dies. I expect the container to fail the liveness probe and restart, but nothing happens, and there is no way for me to know whether there was an issue so that I can take action automatically. The only way I detect this is that the main container throws an exception because it cannot access the volume.
Logs:
"Garbage collection succeeded after deleted 0 objects in 36.496406ms."
E0928 09:11:14.903703 1 logger.go:60] gcsfuse exited with error: signal: killed
System & Version (please complete the following information):
- Version GKE: 1.29.8-gke.1096000
- Version sidecar: gke.gcr.io/gcs-fuse-csi-driver-sidecar-mounter:v1.4.3-gke.2@sha256:ba5a1cdc67a7d08968b04034ff981f4e76dbcc1b73015e315487c7487fd9f93c
The main question is not why it died (which is also unclear) but how we can recover from this. The liveness probe (if any) should detect this and restart the container.
This issue was originally reported by @aecc in the GCSFuse repository, but it seems more relevant to the GCSFuse CSI driver repository. I've created a duplicate issue here to ensure it's addressed by the appropriate team.
@aecc This is rather odd behavior, and I don't think we can get enough context from the symptoms you described. Could you share the cluster ID with me? You can get it by running:
gcloud container clusters describe <cluster-name> --location <cluster-location> | grep id:
Thanks!
Hi @aecc, I took a closer look at this issue. The CSI driver not restarting the gcsfuse sidecar is expected behavior on the CSI side. The scenario where the sidecar gets killed is non-recoverable and requires the whole pod to restart. It is not clear to me why gcsfuse got killed; I'm wondering whether the pod was marked for termination at some point. Any further details or thoughts on whether, or why, the sidecar and pod were marked for termination might give us more insight.
Closing this issue, as we have now answered why the health probe doesn't restart the sidecar.
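Since the thread concludes that a killed sidecar is only recoverable by restarting the whole pod, one way to detect the condition automatically is to inspect the pod's container statuses from outside the pod. The following is a minimal sketch, not part of the CSI driver: it assumes the sidecar container is named gke-gcsfuse-sidecar, that the pod object is the JSON produced by `kubectl get pod <pod-name> -o json`, and that the kubelet actually reports the container as terminated. If the mounter process keeps running after gcsfuse exits, this check will not catch it. The function name `find_terminated_sidecar` and the sample data are hypothetical.

```python
from typing import Optional

# Assumed sidecar container name; adjust it to match your pod spec.
SIDECAR_NAME = "gke-gcsfuse-sidecar"


def find_terminated_sidecar(pod: dict, name: str = SIDECAR_NAME) -> Optional[dict]:
    """Return the 'terminated' state of the named container, or None.

    `pod` is a pod object parsed from `kubectl get pod <pod-name> -o json`.
    Both regular and init container statuses are searched, because a
    sidecar may be declared in either list.
    """
    status = pod.get("status") or {}
    statuses = (status.get("containerStatuses") or []) + (
        status.get("initContainerStatuses") or []
    )
    for container_status in statuses:
        if container_status.get("name") == name:
            # 'state' holds exactly one of: running, waiting, terminated.
            return (container_status.get("state") or {}).get("terminated")
    return None


# Hypothetical pod status for illustration only (not taken from the report).
sample_pod = {
    "status": {
        "containerStatuses": [
            {"name": "main", "state": {"running": {}}},
            {
                "name": "gke-gcsfuse-sidecar",
                "state": {"terminated": {"exitCode": 137, "reason": "Error"}},
            },
        ]
    }
}

terminated = find_terminated_sidecar(sample_pod)
if terminated is not None:
    # A real watcher would delete the pod here so its controller recreates it.
    print("sidecar terminated:", terminated)
```

An exit code of 137 corresponds to 128 + 9, that is, termination by SIGKILL. A watcher built on this check could delete the affected pod (for example with `kubectl delete pod <pod-name>`) so that its Deployment or Job controller creates a fresh one, which matches the whole-pod restart described above.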