zalando/zappr

High CPU load for unknown reasons

Closed this issue · 2 comments

Occasionally we see sudden load spikes, typically resulting in the instance being killed because it no longer responds to the health check. The CPU load rises to 100% for no obvious reason; in particular, we don't see unusual patterns in the number of requests. We might be stuck in some inefficient algorithm or some sort of blocking operation.

First attempt at an explanation: the algorithm for filtering out modified comments is quadratic, so it might cause high load in some cases. However, the table of frozen comments (zappr_data.frozen_comments) is flushed after a PR is processed, so we can't reconstruct its state at the time of the incident. At present it contains only some 70 entries...
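To illustrate why a quadratic filter can hurt, here is a minimal sketch in Python (not the actual zappr code, which is not shown in this thread). The `Comment` type and both function names are hypothetical, and the sketch assumes that a comment counts as modified when its body differs from the frozen snapshot with the same id, and that ids in the frozen list are unique. The first version rescans the whole frozen list for every comment; the second builds an index once and does a constant-time lookup per comment.

```python
from typing import Dict, List, NamedTuple


class Comment(NamedTuple):
    # Hypothetical shape; the real frozen_comments schema is not shown here.
    id: int
    body: str


def keep_unmodified_quadratic(comments: List[Comment],
                              frozen: List[Comment]) -> List[Comment]:
    """O(len(comments) * len(frozen)): scans all snapshots per comment."""
    kept = []
    for comment in comments:
        modified = any(
            snap.id == comment.id and snap.body != comment.body
            for snap in frozen
        )
        if not modified:
            kept.append(comment)
    return kept


def keep_unmodified_indexed(comments: List[Comment],
                            frozen: List[Comment]) -> List[Comment]:
    """O(len(comments) + len(frozen)): one pass to index, one to filter."""
    # Assumes snapshot ids are unique, so the dict loses no information.
    frozen_body: Dict[int, str] = {snap.id: snap.body for snap in frozen}
    # A comment without a snapshot is treated as unmodified.
    return [
        comment for comment in comments
        if frozen_body.get(comment.id, comment.body) == comment.body
    ]
```

With a few dozen entries the two versions are indistinguishable in practice; the difference only matters if both lists grow large, which would fit a load spike that appears without a change in request volume.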

Kinda fixed with #511.