kubernetes-csi/livenessprobe

Probe requests still reporting ready status even when socket file doesn't exist anymore

nettoclaudio opened this issue · 0 comments

Description
Once it has established a connection to the CSI driver's identity server, the livenessprobe server does not attempt to reconnect until it is restarted. We rely on this persistent connection to dispatch the Probe calls to the identity server.

A side effect of this approach appears when the socket file is removed (deliberately or not) shortly after the first connection is established. Under these conditions, the probe requests still reach the CSI driver's identity server over the existing connection and may return a healthy status for as long as that connection stays open.
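The underlying socket behavior can be reproduced outside of Kubernetes. The following is a minimal sketch using only the Go standard library, not livenessprobe code: the function name `demoStaleSocket`, the socket name `csi.sock`, and the echo server standing in for the identity server are all assumptions made for illustration. It shows that a connection opened before the socket file was removed keeps working, while a new dial to the same path fails.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
)

// demoStaleSocket is a hypothetical helper (not part of livenessprobe).
// It starts an echo server on a Unix domain socket, connects once, removes
// the socket file, and then reports whether the existing connection still
// works and what error a fresh dial returns.
func demoStaleSocket() (existingOK bool, newDialErr error, err error) {
	dir, err := os.MkdirTemp("", "csi-demo")
	if err != nil {
		return false, nil, err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "csi.sock")

	ln, err := net.Listen("unix", sock)
	if err != nil {
		return false, nil, err
	}
	defer ln.Close()

	// Echo server standing in for the CSI driver's identity server.
	go func() {
		c, err := ln.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		io.Copy(c, c)
	}()

	// The "persistent" connection, opened while the socket file exists.
	conn, err := net.Dial("unix", sock)
	if err != nil {
		return false, nil, err
	}
	defer conn.Close()

	// Simulate the socket file being removed after the first connection.
	if err := os.Remove(sock); err != nil {
		return false, nil, err
	}

	// The already-open connection is unaffected by the removal.
	if _, err := fmt.Fprintln(conn, "probe"); err != nil {
		return false, nil, err
	}
	reply, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return false, nil, err
	}
	existingOK = reply == "probe\n"

	// A new client (for example the kubelet) can no longer connect.
	c2, newDialErr := net.Dial("unix", sock)
	if newDialErr == nil {
		c2.Close()
	}
	return existingOK, newDialErr, nil
}

func main() {
	existingOK, newDialErr, err := demoStaleSocket()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("existing connection still works:", existingOK)
	fmt.Println("new connection error:", newDialErr)
}
```

This mirrors the situation described here: the long-lived connection keeps answering, so a health check that relies on it alone cannot notice that the socket path is gone.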

Other components, however, cannot open new connections to this CSI driver, which leaves the system stuck. For example, the kubelet cannot reach the CSI driver's node server to issue NodePublishVolume calls, so pods stay pending indefinitely (until a human intervenes).

What is expected
If the Unix domain socket file no longer exists, requests to /healthz should return a not ready/unhealthy status.
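One possible way to get this behavior, sketched here as an illustration rather than as the project's actual implementation, is to check the socket file on every /healthz request before dispatching the Probe call. The names `checkSocketFile`, `healthzStatus`, and `newHealthzHandler` are hypothetical, the `probe` callback is a placeholder for the real CSI Identity Probe call, and the status codes chosen (503 for a missing socket, 500 for a failed probe) are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"os"
)

// checkSocketFile returns an error if path does not exist or is not a
// Unix domain socket. Hypothetical helper, not part of livenessprobe.
func checkSocketFile(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	if info.Mode()&os.ModeSocket == 0 {
		return errors.New(path + " is not a Unix domain socket")
	}
	return nil
}

// healthzStatus picks the HTTP status for a /healthz request.
// probe is a placeholder for the CSI Identity Probe call made over the
// persistent connection; it is only invoked if the socket file exists.
func healthzStatus(socketPath string, probe func() error) int {
	if err := checkSocketFile(socketPath); err != nil {
		// Socket file is gone: report unhealthy even if the old
		// connection would still answer.
		return http.StatusServiceUnavailable
	}
	if err := probe(); err != nil {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}

// newHealthzHandler wraps healthzStatus in an http.Handler.
func newHealthzHandler(socketPath string, probe func() error) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		status := healthzStatus(socketPath, probe)
		w.WriteHeader(status)
		fmt.Fprintln(w, http.StatusText(status))
	})
}

func main() {
	// A probe that always succeeds, simulating a still-open connection.
	alwaysHealthy := func() error { return nil }
	fmt.Println(healthzStatus("/nonexistent/csi.sock", alwaysHealthy))
}
```

An alternative with the same effect would be to open a fresh connection for each probe instead of reusing the persistent one. The file check above is simply the smaller change to illustrate.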