cloudflare/pingora

Should not discard data if peer sends more

Closed this issue · 9 comments

Describe the bug

I am using pingora as a proxy server. When I stress test HTTP/1.1 requests with the wrk tool, pingora produces the warning described below, which affects the wrk results.

Pingora info

Please include the following information about your environment:

Pingora version: the latest
Rust version: cargo 1.86.0
Operating system version: macOS 14.6.1

Steps to reproduce

It's difficult for me to provide the full source code because pingora is deeply embedded in our project. However, I will provide as much detail as I can.

  1. The wrk test config: it sends a POST request with a 1 MB request body. The request is proxied to a local Nginx server in our project, and Nginx directly returns 200.
    Note that only HTTP/1.1 triggers this problem.
wrk -c10 -t10 -d5s --latency http://127.0.0.1:8888/echo -s file-data.lua
  2. The wrk result before my fix. In the 5-second test, only 13 requests completed, which was far fewer than expected.
Running 5s test @ http://127.0.0.1:8888/echo
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    29.26ms   16.04ms  45.81ms   69.23%
    Req/Sec    12.70      5.06    20.00     70.00%
  Latency Distribution
     50%   35.05ms
     75%   44.14ms
     90%   45.05ms
     99%   45.81ms
  13 requests in 5.05s, 3.11KB read
Requests/sec:      2.58
Transfer/sec:     630.54B
  3. I studied the problem and reached the following conclusion: I question whether pingora should discard this part of the data at all. I printed out the discarded bytes and found that they were exactly the request header of the next request, which had been read together with the request body of the previous request (see the sketch below).
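To make the expectation concrete, here is a minimal Rust sketch (a hypothetical helper, not Pingora's actual reading code) of the behavior I would expect: consume only up to the remaining Content-Length from the read buffer and keep any surplus for the next request on the same keep-alive connection.

// A minimal sketch of the expected behavior (hypothetical helper, not
// Pingora's actual code): `buf` holds whatever was just read from the
// socket, which may contain the tail of the current request body followed
// by the header of the next request written by the client in the same
// TCP segment.
fn consume_body(buf: &mut Vec<u8>, remaining_body: usize) -> Vec<u8> {
    let take = remaining_body.min(buf.len());
    // Take only the bytes that belong to the current body...
    let body: Vec<u8> = buf.drain(..take).collect();
    // ...and leave the rest in `buf`; those leftover bytes are the start of
    // the next request and should be fed back to the header parser instead
    // of being discarded.
    body
}

fn main() {
    // Body is "hello" (Content-Length: 5), but the client already appended
    // the next request's header in the same write.
    let mut buf = b"helloGET /next HTTP/1.1\r\n".to_vec();
    let body = consume_body(&mut buf, 5);
    assert_eq!(body, b"hello".to_vec());
    assert_eq!(buf, b"GET /next HTTP/1.1\r\n".to_vec());
}

With this approach the leftover bytes stay buffered and the next request on the connection can still be parsed.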


Expected results

wrk should produce results like the following (I have tried to fix the problem described above; the PR will follow later).

Running 5s test @ http://127.0.0.1:8888/echo
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   873.62ms  371.78ms   1.12s    80.77%
    Req/Sec     1.40      3.27    10.00     86.54%
  Latency Distribution
     50%    1.02s 
     75%    1.12s 
     90%    1.12s 
     99%    1.12s 
  52 requests in 5.04s, 8.62KB read
Requests/sec:     10.31
Transfer/sec:      1.71KB

Observed results

Pingora has discarded the extra data that the peer sent, but that data is actually the next request.


If this is an ask to support HTTP/1.1 pipelining, it's unlikely that pingora will have support for that anytime in the near future (if at all).

@drcaramelsyrup In this case Pingora should return a 400 or 500 error, rather than silently dropping a request on the floor.

@drcaramelsyrup I understand that there is no need to support HTTP/1.1 pipelining. However, when the content-length of the request body is clearly known, the data on the stream should not be over-consumed and discarded.

By the way, in the case I provided, I don't think this situation is HTTP/1.1 pipelining. The request header of the next request and the request body of the previous request were simply read into the same buffer, which could be avoided.

I believe pipelining is any situation where a subsequent message is sent before the reply to the previous message has been received.


Yes, that's how we try to detect pipelining in this situation. If there is more body than expected, the request likely ought to be rejected, as suggested.

We likely adopted the RFC stance on discarding extra response body from https://www.rfc-editor.org/rfc/rfc9112.html#name-message-body-length. For request bodies specifically I think it makes sense to return an error and/or close the connection. We do already return a 400 error during proxying if more data is received than expected from downstream.
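As a rough illustration of that policy (hypothetical types, not the actual proxy code path): excess request body from downstream would be treated as an error, while excess response body from upstream is drained and discarded per the RFC section linked above.

// Hypothetical sketch of the policy, for illustration only.
enum Direction {
    DownstreamRequest, // client -> proxy
    UpstreamResponse,  // origin -> proxy
}

enum ExcessBodyAction {
    Reject400,    // fail the request (and/or close the connection)
    DiscardExtra, // drain and ignore the surplus bytes
}

// Decide what to do when more body bytes arrive than Content-Length declared.
fn on_excess_body(direction: Direction) -> ExcessBodyAction {
    match direction {
        Direction::DownstreamRequest => ExcessBodyAction::Reject400,
        Direction::UpstreamResponse => ExcessBodyAction::DiscardExtra,
    }
}

fn main() {
    // Request side: reject; response side: discard.
    assert!(matches!(on_excess_body(Direction::DownstreamRequest), ExcessBodyAction::Reject400));
    assert!(matches!(on_excess_body(Direction::UpstreamResponse), ExcessBodyAction::DiscardExtra));
}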

Why does Pingora not support pipelining? Pipelining support is mandatory in HTTP/1.1, and the fact that no browser uses it does not change that.

Is there an easy way to support Pipelining in my forked repo?


Is there an easy way to support pipelining in my forked repo?

Sorry for the delayed response. We haven't had much need for supporting pipelining internally, though being able to handle pipelined requests is not something we are necessarily against either. Some places in the core proxy code assume pipelining is not being used, and that's where implementors may run into problems.

We are reviewing the attached PR, though we may attempt to preserve the existing behavior of discarding or detecting extra upstream body bytes, as nginx does.

The commit from #617 was merged. As mentioned in the previous comment, extra upstream body is still detected and discarded (but keepalive is also disabled) in 60a7bcc.
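A rough sketch of that merged behavior as described above (hypothetical types, not the actual code in 60a7bcc): extra upstream body bytes beyond the declared length are dropped, but the connection is marked as not reusable so the stray bytes cannot be misread as part of a later response.

// Hypothetical illustration of "discard extra upstream body, disable keepalive".
struct UpstreamBodyReader {
    remaining: usize, // body bytes still expected per Content-Length
    keepalive: bool,  // whether the upstream connection may be reused
}

impl UpstreamBodyReader {
    // Returns how many of the `n` newly read bytes are valid body bytes.
    fn on_read(&mut self, n: usize) -> usize {
        if n > self.remaining {
            // More data than the response declared: discard the excess but
            // poison keepalive so this connection is closed afterwards.
            self.keepalive = false;
            std::mem::replace(&mut self.remaining, 0)
        } else {
            self.remaining -= n;
            n
        }
    }
}

fn main() {
    let mut r = UpstreamBodyReader { remaining: 5, keepalive: true };
    // 8 bytes arrive although only 5 were expected: keep 5, drop 3, no reuse.
    assert_eq!(r.on_read(8), 5);
    assert!(!r.keepalive);
}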