Marge fails too hard on network errors
raphael-proust opened this issue · 1 comments
raphael-proust commented
When margebot encounters a network error during the merging process (e.g., a timeout when checking the CI status), it fails hard and adds an "I'm broken inside" comment. Because network errors can be transient, margebot should retry failed network requests.
I can try to make a PR for this, but I have the following questions:
- Do you agree with the diagnosis and with the main proposal?
- I'm not sure where to make the necessary modifications in the code. Specifically, I'm not sure what granularity to have:
  - At the high level of `single_merge_job`'s `execute`: we won't miss any errors (at least not during the merging process)?
  - At a lower level (`fetch_approvals` and `update_merge_request_and_accept` or even lower): we have more specific context for what stage failed?
  - At a higher level (outside of `single_merge_job`)?
  - At a different level altogether, such as by patching the `self._api` object or changing some configuration of the underlying HTTP request library.
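
To make the granularity question more concrete, here is a minimal sketch of a retry wrapper with exponential backoff that could be applied at any of those levels. It is not taken from the marge-bot code base: `retry_on_network_error`, its parameters, and `flaky_ci_status` are hypothetical names, and the built-in `ConnectionError` and `TimeoutError` only stand in for whatever exceptions the underlying HTTP library actually raises.

```python
import functools
import time


def retry_on_network_error(max_attempts=3, base_delay=1.0,
                           transient=(ConnectionError, TimeoutError),
                           sleep=time.sleep):
    """Retry the wrapped call when it raises one of the `transient` exceptions.

    Hypothetical helper: the exception types are an assumption and would need
    to match the errors raised by the HTTP library in use.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except transient:
                    if attempt == max_attempts:
                        # Out of attempts: re-raise so the caller fails as before.
                        raise
                    # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


# Simulated flaky request: times out twice, then succeeds.
calls = []


@retry_on_network_error(max_attempts=3, base_delay=0.0)
def flaky_ci_status():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("simulated timeout")
    return "success"
```

Wrapping individual API calls this way keeps the retry close to the request that failed, so a stage that succeeded is not repeated. Wrapping something as large as `execute` would catch more errors but would re-run the whole job, which is only safe if every step can be repeated without side effects.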
xtermi2 commented
Yes, please add this feature! Would save us a lot of work!