elixir-tesla/tesla

Getting very slow reply within Tesla

nateless opened this issue · 12 comments

We have an internal API call that curl processes in about 10 seconds; it returns a JSON payload of around 20 MB. With Tesla.get we get a reply in 52-56 seconds.

{:ok,
 %Tesla.Env{
   __client__: %Tesla.Client{
     adapter: {Tesla.Adapter.Hackney, :call, [[recv_timeout: 60000]]},
     fun: nil,
     post: [],
     pre: [
       {Tesla.Middleware.BaseUrl, :call, ["http://1.1.1.1/"]},
       {Tesla.Middleware.Logger, :call, [[]]},
       {Tesla.Middleware.Telemetry, :call, [[]]},
       {Tesla.Middleware.Timeout, :call, [[timeout: 60000]]}
     ]
   },
   __module__: Tesla,
   body: ".........",
   headers: [
     {"content-lenght", ""},
     {"content-type", "application/json"},
     {"etag", "\"123-1624560200-ba73c0\""},
     {"date", "Fri, 25 Jun 2021 17:20:10 GMT"},
     {"transfer-encoding", "chunked"}
   ],
   method: :get,
   opts: [],
   query: %{
     cid: 111
   },
   status: 200,
   url: "http://1.1.1.1/api "
 }}
iex(21)> 17:20:52.799 [info] GET http://1.1.1.1/api -> 200 (52945.107 ms)
17:20:52.799 [debug]
>>> REQUEST >>>
Query: cid: 111

(no headers)
(no body)

<<< RESPONSE <<<
content-lenght:
content-type: application/json
etag: "123-1624560200-ba73c0"
date: Fri, 25 Jun 2021 17:20:10 GMT
transfer-encoding: chunked
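
(For reference, a client matching the Env dump above can be built roughly like this; a sketch reconstructed from the pre middleware list and adapter field, so treat the exact options as illustrative:)

# Client reconstructed from the Env dump above.
middleware = [
  {Tesla.Middleware.BaseUrl, "http://1.1.1.1/"},
  Tesla.Middleware.Logger,
  Tesla.Middleware.Telemetry,
  {Tesla.Middleware.Timeout, timeout: 60_000}
]

client = Tesla.client(middleware, {Tesla.Adapter.Hackney, recv_timeout: 60_000})
{:ok, env} = Tesla.get(client, "/api", query: [cid: 111])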

I've tried pure hackney and got a reply in the same ~10-second range:
:hackney.request("GET", "http://1.1.1.1/api?cid=111", [], '', [recv_timeout: 60_000])

Can you help us pinpoint where we're losing ~50 seconds? I thought it could be linked to JSON parsing, so we turned off JSON parsing, but we're still getting the same 52-56 seconds per request, as shown above.

Since there isn't any body processing when you are not using the JSON middleware, I'd start by turning off debug logging, then remove telemetry, then remove logging completely and measure outside of the call. If none of that helps, you might need to do some profiling with tools like eprof.
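
For example, a stripped-down client plus eprof could look like this (a sketch; the bare client keeps only the adapter, so any remaining slowness can't come from middleware):

# Bare client: same adapter, no middleware.
bare = Tesla.client([], {Tesla.Adapter.Hackney, recv_timeout: 60_000})

# Profile the whole call with eprof, Erlang's time profiler.
:eprof.start()
:eprof.profile(fn -> Tesla.get(bare, "http://1.1.1.1/api", query: [cid: 111]) end)
:eprof.analyze(:total)
:eprof.stop()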

Nothing helped, so we did profiling. This call took 86% of the time:
hackney_http:execute/2 11054 86.48 41194582 [ 3726.67]

Interesting 🧐

Can you profile the bare hackney version too?

That call is internal to hackney; Tesla does not use this function directly.

Just in case: are you testing with the same versions of hackney/Elixir/Erlang/OS etc., and is the hackney version the latest one?

Yeah, to be honest, on my local MacBook the same code takes the proper 10 seconds, while on Linux in a Docker environment it takes 52 seconds, and I get completely different profiling results. The only difference is that on the MacBook, Elixir is installed with brew and there's no Docker container.

The Dockerized version uses elixir:1.12-alpine.

mix.lock is pinned to tesla "1.4.1" and hackney "1.17.4".

Hmm, you are right, it seems it's a hackney problem. Before, I didn't read the body; it spends most of its time reading the body.

prof code:

{:ok, _, _, ref} = :hackney.request("GET", "http://1.1.1.1/...", [], '', [recv_timeout: 60_000])
{:ok, body} = :hackney.body(ref)
....
erlang:setelement/3                              44258    11.53   4928593  [    111.36]
hackney_http:execute/2                           11054    87.25  37283669  [   3372.87]
----------------------------------------------  ------  -------  --------  [----------]
Total:                                          664763  100.00%  42730491  [     64.28]

In this case, we have two completely different environments. I'd recommend reducing the problem, i.e. running the exact same Docker container on both OSes WITHOUT volume mounts and comparing execution times. Make sure the API endpoint is stable (preferably running on localhost or a nearby network). Run the code a few times to rule out network inconsistencies. Check the CPU usage and iowait. If you are using a low-end Linux VPS, it might have worse performance than a top-end MacBook, etc.
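
To measure outside of the call, :timer.tc in a loop is enough (a sketch, assuming client is the Tesla client from earlier):

# Time each request in microseconds; repeat to smooth out jitter.
for _ <- 1..5 do
  {micros, {:ok, _env}} = :timer.tc(fn -> Tesla.get(client, "/api", query: [cid: 111]) end)
  IO.puts("request took #{div(micros, 1000)} ms")
end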

Compare hackney time with raw curl in terminal to get a baseline. Maybe the API itself is slow.

Thanks. The API is fast. The server is dedicated hardware with 256 GB RAM and NVMe storage. We get the same speed in Docker with curl/wget (10 s), which makes me think it's not a Docker issue.

You can quickly try another adapter like Finch or Gun and see if there is any difference.
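
For example (a sketch; the Mint adapter needs the :mint and :castore deps in mix.exs):

# Same request through the Mint adapter, no middleware, to compare raw body-read speed.
mint_client = Tesla.client([], Tesla.Adapter.Mint)
{:ok, env} = Tesla.get(mint_client, "http://1.1.1.1/api", query: [cid: 111])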

Meh, after I started reading the body, it started to lag the same way on the MacBook.

Mint takes 15 seconds with JSON decoding.

Sounds like we got a winner.

Fortunately, you can just swap the adapter when using Tesla ;)

I recommend Finch (which is based on Mint) to get the same functionality as hackney (pooling).
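
A minimal setup sketch (MyFinch is a placeholder name and the pool size is illustrative):

# Start a named Finch pool under your supervision tree...
children = [
  {Finch, name: MyFinch, pools: %{default: [size: 25]}}
]
Supervisor.start_link(children, strategy: :one_for_one)

# ...then point the Tesla adapter at it by name.
# `middleware` is your existing middleware list from before.
client = Tesla.client(middleware, {Tesla.Adapter.Finch, name: MyFinch})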

Yeah, that's why we are using Tesla - this isn't the first issue we've had with hackney, unfortunately. Thanks for your help!