mtrudel/bandit

Bandit.BodyAlreadyReadError error raising since upgrade to 1.0.0-pre9

scollon42 opened this issue ยท 22 comments

Hi there!

First I wanted to thank you all for your amazing work! I really enjoyed using your library with Phoenix so far ๐Ÿ‘Œ ๐Ÿ’œ

I'm very sorry, I will try to give as much information as I can but I'm failing to find what is happening exactly here. ๐Ÿค”

I've upgraded to bandit 1.0.0-pre9 today (i was in pre6 before) and a new error appeared.

It's an endpoint that has not been updated and Bandit is raising a BodyAlreadyReadError ๐Ÿค”

I have faily small stacktrace to share :

Plug.Parsers.ParseError: malformed request, a Bandit.BodyAlreadyReadError exception was raised with message "Body has already been read"
  File "lib/bandit/http1/adapter.ex", line 288, in Bandit.HTTP1.Adapter.read_req_body/2
  File "lib/plug/conn.ex", line 1290, in Plug.Conn.next_multipart/3
  File "lib/plug/conn.ex", line 1255, in Plug.Conn.read_part_body/8
  File "lib/plug/parsers/multipart.ex", line 247, in Plug.Parsers.MULTIPART.parse_multipart_file/4
  File "lib/plug/parsers/multipart.ex", line 215, in Plug.Parsers.MULTIPART.parse_multipart_headers/5
  File "lib/plug/parsers/multipart.ex", line 186, in Plug.Parsers.MULTIPART.parse_multipart/5
  File "lib/plug/parsers/multipart.ex", line 175, in Plug.Parsers.MULTIPART.parse_multipart/2
  File "lib/plug/parsers/multipart.ex", line 127, in Plug.Parsers.MULTIPART.parse/5

On the other way. Requests seems to still go through. (This endpoint in particular is a webhook handler, and events are seems to be processed properly).
The endpoint is fairly simple, there is no plug or anything.
Also it doesn't happened with every call to this endpoint. I haven't seen any differences between calls that failed and calls that succeed.

Again, I have trouble identifying from where it can source. I don't see changes in the codebase except for the version bump ๐Ÿค”

  "bandit": {:hex, :bandit, "1.0.0-pre.9", "989b1790ecffe688d1a334b2d7fcc18fe1da7e142291dbe9b748908701d0cb04", [:mix], [{:hpax, "~> 0.1.1", [hex: :hpax, repo: "hexpm", optional: false]}, {:plug, "~> 1.14", [hex: :plug, repo: "hexpm", optional: false]}, {:telemetry, "~> 0.4 or ~> 1.0", [hex: :telemetry, repo: "hexpm", optional: false]}, {:thousand_island, "~> 1.0-pre and >= 1.0.0-pre.5", [hex: :thousand_island, repo: "hexpm", optional: false]}, {:websock, "~> 0.5", [hex: :websock, repo: "hexpm", optional: false]}], "hexpm", "0ce876ab3991167365baf430df803d18ee69ebd74dec44b58954c574db0515ba"},

If I can be of help with anything to help it will be my pleasure ๐Ÿ™

Thanks for the bug report! Would you be able to try -pre.7 and see if it happens there? We overhauled a bunch of read routines in that release and I suspect that's where the issue likely is.

@asakura likely relevant to your interests.

Thank you for the swift reply! ๐Ÿ™

Sure! I'll try to replicate the issue in local and downgrade to see if it change something ๐Ÿ‘Œ

@mtrudel yes, it's very relevant to my interests indeed :)

Plug calls read_req_body/2 and for some unknown reason for now it ends up in:

  def read_req_body(%__MODULE__{}, _opts), do: raise(Bandit.BodyAlreadyReadError)

So it means that it could not match on any read_req_body/2 function head before. I assume that %__MODULE__{state: ... wasn't :no_body or :headers_read or less likely req.body_remaining wan't a number. I have to admit that is still possible.

Anyway I suggest to place before the catch all clause for read_req_body/2 (line 286) the following snippet and look inside req. It must reveal some insights:

  def read_req_body(%__MODULE__{} = req, _opts) do
     IO.inspect(req)
     # Logger.error/1 or require IEx; IEx.pry
     raise(Bandit.BodyAlreadyReadError)
  end

@scollon42 By a chance, do you have a test which fails calling that endpoint? Or maybe it's possible to forge a command using curl which causes the endpoint to end up with the same result?

Just verified that the all return tuples in all read_req_body/2 clauses were not changed since -pre.6 version.

@asakura thanks!
Unfortunately, no tests are failing ๐Ÿ˜•
Since yesterday the error was catch on various endpoints (it's not only the webhook one). It seems a bit random.

I will add the inspect and try to replicate the issue using curl in local. I hope I managed to catch it ๐Ÿ˜„ I'll keep you updated ๐Ÿ‘

@asakura I managed to catch it ๐ŸŽ‰ Never happened in local, it took running it in preprod (where we have 2 instances running in the same cluster) to catch one going through the catch all method.

%Bandit.HTTP1.Adapter{
  state: :body_read,
  socket: %ThousandIsland.Socket{
    socket: #Port<0.159>,
    transport_module: ThousandIsland.Transports.TCP,
    read_timeout: 60000,
    span: %ThousandIsland.Telemetry{
      span_name: :connection,
      telemetry_span_context: "Reference<0.914463699.3928227841.251982>",
      start_time: -576_460_692_813_382_047,
      start_metadata: %{
        parent_telemetry_span_context: #Reference<0.914463699.3928227841.158232>,
        remote_address: _,
        remote_port: _,
        telemetry_span_context: #Reference<0.914463699.3928227841.251982>
      }
    }
  },
  buffer: "",
  body_remaining: 0,
  body_encoding: nil,
  version: :"HTTP/1.1",
  keepalive: true,
  content_encoding: "gzip",
  upgrade: nil,
  metrics: %{
    req_body_bytes: 273_125,
    req_body_end_time: -576_460_672_680_231_227,
    req_body_start_time: -576_460_672_722_026_849,
    req_header_bytes: 2437,
    req_header_end_time: -576_460_672_722_320_933,
    req_line_bytes: 37
  },
  websocket_enabled: true,
  opts: %{http_1: [], http_2: [], websocket: []}
}

It happened on a PUT call that was supposed to upload a file. I don't think that is related because in production, the error was "random" and occured for different method on different endpoints. I don't see much in the headers of the request ๐Ÿค”

@scollon42 great! I assume that the webhook handler suppose to receive gzip compressed data from users. Could you try following curl command (you might need to add authn/authz headers - use -H). It's a wild guess, so very likely it won't crash:

curl --insecure -i -X PUT -H "Content-Type: multipart/form-data" -F file=@file.zip localhost:9999/webhook 

I don't see much in the headers of the request

Could you share request headers please? Thank you :)

Yeah it looks like this :

PUT /api/internal/v1/users HTTP/1.1\r
  Host: localhost\r
  X-Request-ID: id\r
  X-Real-IP: 127.0.0.1\r
  X-Forwarded-For: 127.0.0.1\r
  X-Forwarded-Host: localhost\r
  X-Forwarded-Port: 443\r
  X-Forwarded-Proto: https\r
  X-Forwarded-Scheme: https\r
  X-Scheme: https\r
  Content-Length: 274017\r
  sec-ch-ua: \"Not.A/Brand\";v=\"8\", \"Chromium\";v=\"114\", \"Google Chrome\";v=\"114\"\r
  Accept: application/json\r
  Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryuVsWcTuMvJQQX6nB\r
  x-csrf-token: token\r
  sec-ch-ua-mobile: ?0\r
  User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36\r
  sec-ch-ua-platform: \"macOS\"\r
  Origin: https://localhost\r
  Sec-Fetch-Site: same-origin\r
  Sec-Fetch-Mode: cors\r
  Sec-Fetch-Dest: empty\r
  Referer: https://localhost/me/general\r
  Accept-Encoding: gzip, deflate, br\r
  Accept-Language: en-US,en;q=0.9,fr-FR;q=0.8,fr;q=0.7\r
  Cookie: cookies

The endpoint where I catch the error was simply an update of a user (updating the avatar).
I'll try the curl command ๐Ÿ˜„

@asakura The curl command causes no issue ๐Ÿค” I'm back to testing locally though

@scollon42 could you try out fix from #172 and tell if it fixes your problem please?

Yeah! I'll look into that, I'll keep you updated ๐Ÿ‘Œ
Thanks ๐Ÿ™

Hello there! @asakura Sorry I haven't come back sooner.
I've tested your fix (Actually i've used main since it has been merged) and it seems I have the same issue ๐Ÿค”

GenServer #PID<0.18888.0> terminating
** (Plug.Parsers.ParseError) malformed request, a Bandit.BodyAlreadyReadError exception was raised with message "Body has already been read"
    (bandit 1.0.0-pre.9) lib/bandit/http1/adapter.ex:287: Bandit.HTTP1.Adapter.read_req_body/2
    (plug 1.14.2) lib/plug/conn.ex:1290: Plug.Conn.next_multipart/3
    (plug 1.14.2) lib/plug/conn.ex:1255: Plug.Conn.read_part_body/8
    (plug 1.14.2) lib/plug/parsers/multipart.ex:215: Plug.Parsers.MULTIPART.parse_multipart_headers/5
    (plug 1.14.2) lib/plug/parsers/multipart.ex:186: Plug.Parsers.MULTIPART.parse_multipart/5
    (plug 1.14.2) lib/plug/parsers/multipart.ex:175: Plug.Parsers.MULTIPART.parse_multipart/2
    (plug 1.14.2) lib/plug/parsers/multipart.ex:127: Plug.Parsers.MULTIPART.parse/5
    (plug 1.14.2) lib/plug/parsers.ex:340: Plug.Parsers.reduce/8

Let me know if you need me to add the inspect again

I agree with @asakura's assessment of the problem. We're likely in a scenario where we're returning {:more....} to the Plug layer, but are also setting our internal state to :body_read.

I'm adding a bunch of test coverage to these flows this weekend. I have a couple of ideas of places where this may be happening that I'm going to hammer on, and if any fixes come out of this work I'll report back here.

Amazing @mtrudel! Thank you both for your work ๐Ÿ™
Let me know if I can be of any assistance.

Our latest and best attempt at fixing this was just published as 1.0.0-pre.10! Can you give this a shot and see if it's still happening? If I don't hear back I'll close this issue off in a few days otherwise.

Hey @mtrudel ! I'll test it with the new version ๐Ÿ‘Œ thank you for the fix!
I'll get back to you before the end of the week.

Hey @mtrudel! Sorry, the end of the week was more busy than expected I haven't had time to test pre10 ๐Ÿ˜•
I'll let you close the issue ๐Ÿ˜„
I'll post an update next week, I'll have time to test it ๐Ÿ˜„
Thanks again for the investigation and work on that ๐Ÿ™

Thanks for the update! I'm in no huge hurry to close this off; feel free to report back at your leisure

I am experiencing the same issue, going to update to pre.10 now and report back

@mtrudel @asakura
Hellot there!
Pre 10 seems to have fix the issue ๐ŸŽ‰ thanks for the work on that! ๐Ÿ™

I confirm it's fixed on pre.10

Amazing to hear. Thanks for the note!