
Built-in health checks should have configurable failure statuses

j-mok opened this issue · 3 comments

j-mok commented

ModulesHealthChecker is hard-coded to report failure as Degraded (here and here). While this is OK in some scenarios, it is not in others, including ours. We deploy the platform with a fixed set of module versions and expect all of them to load successfully before we deem a deployment successful. A failure to load even one module warrants aborting the deployment. In our deployment pipeline we poll the /health endpoint to see if it returns 200 before swapping with production. But since ModulesHealthChecker reports Degraded when a module fails to load, the endpoint maps that to 200, not 503, and we're forced to parse the health report to see the status of the modules.

Make failure statuses independently configurable for built-in health checks (ModulesHealthChecker and CacheHealthChecker).
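
To illustrate the request, here is a minimal, language-neutral sketch in Python (not the platform's actual C# code). It models the mapping described above, where Healthy and Degraded both yield 200 and only Unhealthy yields 503, and shows how a per-check failure status would change what the deployment probe sees. The names `HealthCheck`, `failure_status`, `health_endpoint`, and `modules_loaded` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, List


class HealthStatus(IntEnum):
    # Ordered so that the worst status has the lowest value.
    UNHEALTHY = 0
    DEGRADED = 1
    HEALTHY = 2


# Status-to-HTTP mapping as described in this issue: Degraded still yields 200.
HTTP_STATUS = {
    HealthStatus.HEALTHY: 200,
    HealthStatus.DEGRADED: 200,
    HealthStatus.UNHEALTHY: 503,
}


@dataclass
class HealthCheck:
    name: str
    probe: Callable[[], bool]  # returns True when the check passes
    # Hypothetical setting: the status to report when the probe fails.
    failure_status: HealthStatus = HealthStatus.DEGRADED

    def run(self) -> HealthStatus:
        return HealthStatus.HEALTHY if self.probe() else self.failure_status


def health_endpoint(checks: List[HealthCheck]) -> int:
    """Return the HTTP code for the worst status among all checks."""
    worst = min((c.run() for c in checks), default=HealthStatus.HEALTHY)
    return HTTP_STATUS[worst]


def modules_loaded() -> bool:
    return False  # simulate a module that failed to load


# Failure reported as Degraded (current behaviour): the probe sees 200.
lenient = [HealthCheck("modules", modules_loaded)]
# Failure status configured as Unhealthy (requested behaviour): the probe sees 503.
strict = [HealthCheck("modules", modules_loaded, HealthStatus.UNHEALTHY)]

print(health_endpoint(lenient), health_endpoint(strict))
```

With the failure status configurable per check, a pipeline like ours could rely on the HTTP status code alone, while deployments that tolerate a failed module could keep reporting Degraded.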

OlegoO commented

@j-mok I agree with you: health should return 503 in case of a module load error. We will fix it in the coming release.

OlegoO commented

@j-mok please review #2717