zalando-zmon/service-level-reporting

Faster report generation

mohabusama opened this issue · 3 comments

Optimize report generation

In the meantime, could we increase the server-side timeout value?

The client at https://github.com/zalando-zmon/service-level-reporting/blob/master/zmon_slr/client.py#L258 requests the weekly report endpoint with a 180-second timeout, but it never reaches that limit because the server returns 502 after only 60 seconds.

The timeout is enforced by the load balancer in use, which is not reliable since it can change during redeployments (this has happened already). On a timeout the response is actually a 504, and it comes from the load balancer.

I am leaning towards generating the report on the client side only. The client already queries all SLI values to plot the charts, so it might as well generate the report too, without putting pressure on the database IOPS. The downside is possibly inconsistent or error-prone report generation (no single source of truth) if there are multiple report-generating clients.
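
To make the client-side idea concrete, here is a rough sketch of how a report could be built from SLI values the client has already fetched. It is only an illustration: the function name `summarize_sli_values`, the `(unix_timestamp, value)` input shape, and the `target_min`/`target_max` parameters are assumptions, not the actual zmon_slr data model or API.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def summarize_sli_values(points, target_min=None, target_max=None):
    """Summarize SLI values per UTC day.

    `points` is a hypothetical list of (unix_timestamp, value) pairs,
    i.e. the values the client already holds for plotting.
    """
    by_day = defaultdict(list)
    for ts, value in points:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        by_day[day].append(value)

    report = {}
    for day in sorted(by_day):
        values = by_day[day]
        # A value breaches the target if it falls outside the optional bounds.
        breaches = sum(
            1 for v in values
            if (target_min is not None and v < target_min)
            or (target_max is not None and v > target_max)
        )
        report[day] = {
            "count": len(values),
            "avg": mean(values),
            "min": min(values),
            "max": max(values),
            "breaches": breaches,
        }
    return report
```

Every client would need to run exactly the same aggregation for reports to agree, which is the single-source-of-truth concern mentioned above.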