palantir/policy-bot

maximum number of statuses for SHA and context

devinburnette opened this issue · 2 comments

We got the following error today on a PR:

422 Validation Failed [{Resource:Status Field: Code:custom Message:This SHA and context has reached the maximum number of statuses.}]

I traced it down to what I believe to be our codecov integration which is routinely commenting and editing comments on PRs in this repo. Github's documentation shows that the maximum number of statuses that can be posted for a given SHA / context pair is 1000. So while I think this may be an outlier case, it still doesn't sound crazy to me that there would be 1000 events causing evaluation and subsequent status checks on a SHA.

I think under normal circumstances we shouldn't hit this error and this is the first PR that I've seen behave this way. But it does make me think if there are ways to prevent this somehow without needing to retain the state of the last evaluation if nothing needs to change. Like maybe we could first query the most recent status of that SHA / context and if the recent evaluation doesn't change the status or message text then we don't need to post anything?? something like that, idk, what do you think? It does add another API call for every event, but it might be better than bumping up against an imposed limit and having to push another commit and re-approve any invalidated rules, etc.

relevant docs:
https://docs.github.com/en/rest/commits/statuses#create-a-commit-status

Thanks for reporting this. Was the PR with this problem open for a relatively long time? I think it's unusual to have so many events for a single SHA (usually PRs with lots of events also mix in pushes), but still something we should consider fixing.

My first thought it that if we're posting a lot of (unnecessary) statuses, we're probably making a bunch of (unnecessary) API calls to compute those statuses. I think I'd start by looking for ways to skip evaluation entirely instead of doing the evaluation and then discarding the result if nothing changed (although discarding the result is easier to implement.)

For instance, I'm assuming the codecov comments don't actually impact approval and the policy might not even accept comments for approval in the first place. But the current version of the issue_comment handler will run a full evaluation either way. To detect comment tampering, we already load the policy and see if the change impacts approval. It seems relatively straightforward to extend this so that we only evaluate the PR if the new or changed comment could possibly impact a rule.

This is kind of an extension of the existing idea of "triggers", but introduces details about the event beyond the type.

The main downside here is the complication of filtering and the fact that it has to be customized for each event, instead of implemented in one place like a check for the existing status. But I think the benefit of reduced API load would be worth it.

the PR was only open for like 2 days, but I think the codecov bot had gone rogue and was editing and posting comments the whole time.. so yea i don't expect to see it a lot, but it's still not crazy to me.. and yea you're right policy bot doesn't need to evaluate every event, I might have a look later on to see if there's an easy fix.