Unexpected behaviour with "maximumBackOffDuration" in HttpClient
Opened this issue · 4 comments
In this snippet of code in the HttpClient.php
$backoff = 200;
while ($backoff < $this->maximumBackoffDuration) {
...
// retry failed requests just once to diminish impact on performance
$httpResponse = $this->executePost($ch);
$responseCode = $httpResponse->getResponseCode();
//close connection
curl_close($ch);
if (200 != $responseCode) {
// log error
$this->handleError($ch, $responseCode);
if (($responseCode >= 500 && $responseCode <= 600) || 429 == $responseCode) {
// If status code is greater than 500 and less than 600, it indicates server error
// Error code 429 indicates rate limited.
// Retry uploading in these cases.
usleep($backoff * 1000);
$backoff *= 2;
} elseif ($responseCode >= 400) {
break;
} elseif ($responseCode == 0) {
break;
}
} else {
break; // no error
}
}
If a request fails for a 50* or 429 status code, the code will attempt to retry the request (and adds a sleep in the process).
However I think there is a fundamental logic bug in this process. Pointing at this area of code:
usleep($backoff * 1000);
$backoff *= 2;
Let's assume we have an API that constantly returns a 503 status code.
The first time it runs, we'd sleep for 0.2 seconds, then double $backoff
to 400
The second time it runs, we'd sleep for 0.4 seconds, then double $backoff
to 800
The third time it runs, we'd sleep for 0.8 seconds, then double $backoff
to 1600
The fourth time it runs, we'd sleep for 1.6 seconds, then double $backoff
to 3200
The fifth time it runs, we'd sleep for 3.2 seconds, then double $backoff
to 6400
The sixth time it runs, we'd sleep for 6.4 seconds, then double $backoff
to 12800
The loop would then bail out, based on the default $maximumBackoffDuration
of 10000.
This means we have in taken 12.6 seconds (sleep time), plus the request time before the script bails - in a synchronous fashion, blocking any other script in the stack in the meantime.
I propose a solution which deprecates the maxiumumBackoffDuration
parameter, and instead uses an integer for maxRetryCount
, with a retrySleep
parameter to give users the option of a sleep inbetween each retry.
Alternatively, we could strip out the retry logic altogether, and pass the responsibility of this to the user (we return a HttpResponse
object, so users could determine the responseCode
from this and retry themselves if needed.
Opinions on this?
@yakkomajuri you seem to be the most active PostHog-er on this repo, do you have any opinions on the above?
And for additional context on this, I dug into it after a deploy caused our stack to show 503 errors, and every PHP feature switch request took over 15 seconds causing user complaints.
@joesaunderson indeed this seems like it was written in a way that is not very respectful of PHPs single-threaded nature.
I do think your solution makes sense. Would you be down to put up a PR?