Default gentleman use doesn't seem to participate in TLS session resumption
caseyhadden opened this issue · 0 comments
We use gentleman as a client library between services in our product. We began noticing that particular services were being restarted frequently due to CPU limit violations. When we enabled pprof for those services and looked at the data, a lot of time was being spent in addTLS (https://github.com/golang/go/blob/master/src/net/http/transport.go#L1508). This was true even across multiple calls to the same URL and service. It also matched our observation that the restarts were more frequent when TLS was used across the full environment instead of being terminated at the reverse proxy.
We were able to replicate this in a small test outside of the larger application. I've pushed that test to http://github.com/caseyhadden/gentle-http-test. The test runs 10 identical GET calls against httpbin.org using both HTTPS and HTTP. The output below is fairly representative of the difference we see. We also ran the test cases against our internal server so latency wouldn't have as much of a chance to affect results, but this is what I came up with as something that shows the issue and can be shared. The regular "gentleman" usage is consistently around 2x as slow in this microbenchmark. We added the grequests library as another "not bare net/http" representative.
*** HTTPS
2021/02/22 12:49:32 nethttp: 757.704072ms
2021/02/22 12:49:34 gentleman: 1.266130582s
2021/02/22 12:49:35 gentleman transport: 899.396886ms
2021/02/22 12:49:35 grequests: 461.614291ms
*** HTTP
2021/02/22 12:49:36 nethttp: 968.82515ms
2021/02/22 12:49:37 gentleman: 843.205109ms
2021/02/22 12:49:38 gentleman transport: 724.24357ms
2021/02/22 12:49:38 grequests: 769.325955ms
When you check the test, you can see that the difference between the "gentleman" and "gentleman transport" instances is that the latter calls cli.Use(transport.Set(http.DefaultTransport)) in order to reuse the HTTP transport instance across the client.
We ran a basic load testing tool against one part of the application with that change applied. It showed a marked improvement in both throughput and memory usage.
I guess I have a few questions related to this issue:
- Do you expect to see this difference between TLS and non-TLS connections, which seems to be related to TLS session resumption?
- Should the default gentleman client have a better setup for this?
- If not, is there a better way than setting the transport that you expect folks to use to make TLS connections perform better?