PRs left open when GitHub is too busy
justinwyer opened this issue · 4 comments
We have been experiencing an intermittent issue where merge requests are rejected by the GitHub API:
{
"level": "info",
"rid": "cf3n6dbn37kclfh117a0",
"github_event_type": "status",
"github_delivery_id": "3cb85cd2-96e7-11ed-9516-9f680ab3bc27",
"github_installation_id": 1,
"github_repository_owner": "acme",
"github_repository_name": "anvils",
"github_pr_num": 26725,
"time": "2023-01-18T04:19:02.713727629Z",
"message": "Merge rejected due to unsatisfied condition: \"Base branch was modified. Review and try the merge again.\""
}
A small percentage of PRs are left unmerged by bulldozer due to the above issue. At this point there are no more triggering events affecting the PR so it never retries. This seems to happen when we have a lot of PRs getting merged in quick succession and is sporadic in nature.
I am happy to contribute a fix for this, but I am not sure what the best approach might be:
- Retry after a small delay? In the event bulldozer is restarted the PR would once again end up left open, but this would be exceedingly rare and so is acceptable.
- Periodically scan PRs in a repo? This feels like a can of worms best not opened.
Or if you have any other ideas I would be grateful.
Hey @justinwyer, thanks for submitting this. I think we already try 5 retries when merging, so something else might be going sideways here. How many pull requests are merging when you see this type of issue? We could probably increase the number of retries and backoff.
Thanks @asvoboda it seems like there is no retry in this case as the error returns non as retry-able
Lines 291 to 293 in 7b3d468
Base branch was modified. Review and try the merge again.
or we could more generally make http.StatusMethodNotAllowed
retry-able. What would you prefer?Yep that'll do it.
I think it probably makes sense to check for that specific error. In general the 405 should indicate the request is actually not meant to be retried, but Github internally treats this particular error as retry-able, which is a little bit wonky.
Okay great, will open a PR a bit later.