Liveness & readiness checks time out on Livy
Closed this issue · 2 comments
pdambrauskas commented
Currently liveness is checked on the /batches endpoint:
https://github.com/jahstreet/spark-on-kubernetes-helm/blob/a1fd2ac19580feb0d9469c1d7cadd8630710ac13/charts/livy/templates/statefulset.yaml#L33
When there is a large number of batches, these checks occasionally time out:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 54m (x56 over 10d) kubelet, ip-XX Readiness probe failed: Get http://XX:8998/batches: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 54m (x59 over 10d) kubelet, ip-XX Liveness probe failed: Get http://XX:8998/batches: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Would it be OK to add ?size=1 to the probe path to limit the response size, or at least to have an option to disable these checks in the Livy chart?
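For illustration, here is a rough sketch of what the probes could look like with the size limit applied. This is not the chart's actual template. The port is the one shown in the events above. The timeout, period, and threshold values are assumptions and would need tuning. The sketch relies on Livy's GET /batches accepting a `size` query parameter.

```yaml
# Hypothetical probe settings for the Livy container (not copied from the chart).
livenessProbe:
  httpGet:
    path: /batches?size=1   # ask Livy for at most one batch to keep the response small
    port: 8998              # port seen in the probe failure events
  timeoutSeconds: 5         # assumed value; the Kubernetes default is 1 second
  periodSeconds: 10         # assumed value (also the Kubernetes default)
  failureThreshold: 3       # assumed value (also the Kubernetes default)
readinessProbe:
  httpGet:
    path: /batches?size=1
    port: 8998
  timeoutSeconds: 5         # assumed value
  periodSeconds: 10         # assumed value
```

Raising `timeoutSeconds` alone might hide the symptom, but limiting the response size addresses the cause, since the probe only needs to know that Livy answers.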
jahstreet commented
Good point, thanks for mentioning it. Will update the chart.