pulibrary/figgy

Update datadog configuration to use filtered health check endpoints

Closed this issue · 0 comments

This is one ticket in figgy for work that will be done in princeton_ansible across 4 applications. We can split it up if needed, but it probably makes sense to do them all at once.

Each time we check a health endpoint and parse the JSON for a single service, the endpoint also runs the checks for every other service, which is unnecessary. See https://github.com/search?q=repo%3Apulibrary%2Fprinceton_ansible%20localhost%2Fhealth.json&type=code for these health checks.

The health monitor gem we use can filter which checks to run via a parameter called providers; see
https://github.com/lbeder/health-monitor-rails?tab=readme-ov-file#filtered-json-response

We should update the http_check configurations to use the filtered endpoint for the service in question for each check. I think we could then simplify the checks so they fail on status code instead of content match, by removing the content_match and http_response_status_code keys from each configuration block.
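As a rough sketch of what one updated block might look like (not the actual princeton_ansible config): the instance name, timeout, and the `redis` provider name are assumptions for illustration, and the sketch assumes the health endpoint returns a non-2xx status when the filtered provider fails, which should be confirmed against the gem version each application runs.

```yaml
# Hypothetical Datadog http_check instance; names and values are illustrative.
init_config:

instances:
  - name: figgy_redis_health            # assumed instance name
    # Filtered endpoint: only the named provider is checked.
    # "redis" is an assumed provider name; use the one configured in the app.
    url: "http://localhost/health.json?providers[]=redis"
    timeout: 10                         # assumed value
    # content_match and http_response_status_code are intentionally omitted,
    # so the check passes or fails on the default status-code matching alone.
```

With both keys removed, the check no longer depends on the shape of the JSON body, so a change to the response format would not break the monitor.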

sudden priority justification

This will greatly reduce the load on all our applications' services. We are also planning to add more monitors, which would only compound the problem, so we should do this first.