PuerkitoBio/rehttp

Support retrying timeout errors

abemedia opened this issue · 3 comments

It would be really useful if there was a RetryTimeout method which retries the request in the case of a timeout error. The reason for this is because some public APIs I request with rehttp tend to hang forever occasionally whilst responding really fast the majority of the time. In this case I'd like to cancel the running request after a reasonably short timeout and retry it.

mna commented

Hello Adam,

The timeout applies to the request as a whole, including retries, since it is set on the HTTP client. I agree that it would be nice to have a per-attempt timeout too, though I'm not sure it could be easily added. You could try setting the http.Transport's ResponseHeaderTimeout, if your slow requests hang on sending the headers, that should work and would be retried with a RetryTemporaryErr.

Hope this helps,
Martin

I actually had a look through the code and realised it wouldn't be that simple not long after submitting the issue.
How about adding a second context which gets reset at the beginning of each roundtrip? Or possibly a method that resets the main context (for use-cases where solely having a timeout per request is desirable).

mna commented

How would that context actually cancel the in-flight roundtrip when the deadline hits? The Request.Cancel channel belongs to the caller, it can't be used for that (it means "cancel the whole request", so it has to cancel any retries).

It was pretty much by design that whole request timeouts apply as a whole (and not per-attempt), as that gives a bounded time limit to the request regardless of potential retries. As I said, it sure would be nice to have both options, but I don't see how it could be done at the Transport level (there are not that many hooks available). If you want to retry in case of a timeout, I think a custom http.Client (more like a Doer interface wrapper, what I mentioned in the blog post here: https://0value.com/Let-the-Doer-Do-it) would make more sense: wrap the http.Client, and retry if the error is a timeout.

That being said, I'm open to add a per-attempt timeout feature in rehttp if it fits cleanly in the API. Maybe it can be done at the net.Conn level (which is passed by the Dialer to the Transport). If so it would work as-is as far as rehttp is concerned, probably worth trying out.