relution-io/relution-kubernetes

Liveness Probes don't give enough time for startup.

Closed this issue · 2 comments

I just tried out the Helm chart, and it mostly seems to work great, but I had to modify the chart locally to remove the liveness probes because they were causing the pod to restart before the initial database could be set up.

Issue
Liveness probes kill the Relution pod before it can finish the initial setup.
My machine is quite powerful: it runs on an NVMe SSD, has multiple cores, and has 64 GB of RAM.

Expected Result
The liveness probes should not kill the pod for being unresponsive until after the initial setup is complete.
At the very least, there should be some configuration options available for the liveness probes.

Steps to Reproduce
Install the Helm chart following the instructions provided. Observe the Relution pod restart.

You can configure an initial delay via the startupProbe to allow the database to be set up and all Liquibase migrations to run. You can configure both startupProbe.failureThreshold (a count, default 30) and startupProbe.periodSeconds (in seconds, default 10) to give the Relution server more time for the initial database setup. The default values equate to a maximum delay of 30 × 10s = 300s, i.e. 5 minutes. See the example override below.

While the startupProbe is active, no livenessProbe or readinessProbe checks will be run.

The time it takes to run all Liquibase migrations depends on your DBMS. PostgreSQL, for example, takes longer than MariaDB. That's why we can't set a global default that is valid for all possible DBMSs.
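For example, a values override along these lines should give the pod more headroom during first start (a sketch based on the startupProbe.* values named above; the exact nesting in the chart's values.yaml and the override file name are assumptions):

```yaml
# values-override.yaml (hypothetical file name)
# Allow up to 60 * 10s = 10 minutes for the initial database setup
# and all Liquibase migrations before the startup probe gives up.
startupProbe:
  periodSeconds: 10     # seconds between probe attempts (default 10)
  failureThreshold: 60  # failed attempts tolerated before restart (default 30)
```

You would pass this at install time, e.g. with `helm install ... -f values-override.yaml`.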

Does this solve your issue?

Thanks for the response. Setting startupProbe.failureThreshold to an extremely large number such as 600 did indeed fix my issue. I'm not sure how I didn't notice that option before.
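For anyone hitting the same problem, that fix as a values override would look roughly like this (a sketch; with the default 10-second period, 600 attempts allows up to 100 minutes of startup time):

```yaml
startupProbe:
  failureThreshold: 600  # 600 attempts * 10s period = up to 100 minutes
```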

Thank you.