Should not discard data if peer sends more
Describe the bug
I am using pingora as a proxy server. When I stress test HTTP/1.1 requests with the wrk tool, it triggers the following warning and skews the wrk results.
Pingora info
Please include the following information about your environment:
Pingora version: the latest
Rust version: cargo 1.86.0
Operating system version: macOS 14.6.1
Steps to reproduce
It's very difficult for me to provide the source code because we have embedded pingora deeply in our project. However, I will provide as much as I can.
- The wrk test config: it sends POST requests with 1MB of request data. The request is proxied to a local Nginx server in our project, and Nginx directly returns 200.
Note that only HTTP/1.1 triggers this problem.
wrk -c10 -t10 -d5s --latency http://127.0.0.1:8888/echo -s file-data.lua
- The wrk result before my fix. In the 5-second test, only 13 requests were sent, far lower than expected.
Running 5s test @ http://127.0.0.1:8888/echo
10 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 29.26ms 16.04ms 45.81ms 69.23%
Req/Sec 12.70 5.06 20.00 70.00%
Latency Distribution
50% 35.05ms
75% 44.14ms
90% 45.05ms
99% 45.81ms
13 requests in 5.05s, 3.11KB read
Requests/sec: 2.58
Transfer/sec: 630.54B
- I studied the problem and reached the following conclusions.
- Pingora does not support HTTP pipelining (#377).
- The wrk testing tool may use pipelining when sending requests.
- Pingora discards the data if the client sends more than expected: https://github.com/BamLubi/pingora/blob/f697eee0991e13d00a67f3ee99288a058cdc5559/pingora-core/src/protocols/http/v1/body.rs#L232-L238
- I'm curious whether pingora should discard this part of the data at all. I printed the discarded bytes and found they were exactly the request header of the next request, which had been read together with the request body of the previous request.
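The conclusions above can be illustrated in isolation. The sketch below (a hypothetical helper, not pingora's actual code) shows the alternative to discarding: split a freshly read buffer at the remaining Content-Length boundary, so that any surplus bytes, which may be the head of the next request, stay available instead of being thrown away.

```rust
/// Hypothetical helper (not pingora's API): split a read buffer at the
/// remaining Content-Length boundary. The first slice is body data; the
/// second is surplus that belongs to the next message and should be kept.
fn split_body(buf: &[u8], body_remaining: usize) -> (&[u8], &[u8]) {
    let n = body_remaining.min(buf.len());
    (&buf[..n], &buf[n..])
}

fn main() {
    // 5 body bytes are still expected, but the read also pulled in the
    // next request's header, exactly the situation described above.
    let read = b"helloGET /echo HTTP/1.1\r\n";
    let (body, leftover) = split_body(read, 5);
    assert_eq!(body, b"hello");
    assert!(leftover.starts_with(b"GET /echo"));
    println!("body = {:?}, leftover = {} bytes", body, leftover.len());
}
```

Discarding `leftover` here is what makes the connection lose the next request; buffering it (or rejecting the request) preserves correctness.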
Expected results
wrk should produce results like the following (I have attempted a fix for the above problem; the PR will follow later).
Running 5s test @ http://127.0.0.1:8888/echo
10 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 873.62ms 371.78ms 1.12s 80.77%
Req/Sec 1.40 3.27 10.00 86.54%
Latency Distribution
50% 1.02s
75% 1.12s
90% 1.12s
99% 1.12s
52 requests in 5.04s, 8.62KB read
Requests/sec: 10.31
Transfer/sec: 1.71KB
Observed results
Pingora discards the data the peer sent beyond the expected body, but that data is actually the next request.
If this is an ask to support HTTP/1.1 pipelining, it's unlikely that pingora will have support for that anytime in the near future (if at all).
@drcaramelsyrup In this case Pingora should return a 400 or 500 error, rather than silently dropping a request on the floor.
@drcaramelsyrup I understand that there is no need to support HTTP/1.1 pipelining. However, when the Content-Length of the request body is known, the data on the stream should not be over-consumed and discarded.
By the way, in the case I provided, I don't think this situation is HTTP/1.1 pipelining. The request header of the next request was simply read in the same buffer as the request body of the previous request, which could be avoided.
I believe pipelining is any situation where a subsequent message is sent before the reply to the previous message has been received.
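By that definition, pipelining shows up on the wire as extra bytes sitting in the connection buffer after the current message ends, before any response has been written. A hedged sketch (illustrative names, not pingora's detection code):

```rust
/// Illustrative sketch: under the definition above, pipelining is visible
/// as buffered bytes beyond the end of the current message, received
/// before the response to that message was sent.
fn looks_pipelined(buffered: &[u8], current_msg_len: usize) -> bool {
    buffered.len() > current_msg_len
}

fn main() {
    let req = b"POST /echo HTTP/1.1\r\nContent-Length: 5\r\n\r\nhello";
    // Two requests sent back to back before any response: pipelined.
    let mut wire = req.to_vec();
    wire.extend_from_slice(req);
    assert!(looks_pipelined(&wire, req.len()));
    // A single, exactly delimited request: not pipelined.
    assert!(!looks_pipelined(req, req.len()));
}
```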
Yes, that's how we try to detect pipelining in this situation. If there is more body than expected the request likely ought to be rejected as suggested.
We likely adopted the RFC stance on discarding extra response body from https://www.rfc-editor.org/rfc/rfc9112.html#name-message-body-length. For request bodies specifically I think it makes sense to return an error and/or close the connection. We do already return a 400 error during proxying if more data is received than expected from downstream.
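The policy described above can be sketched as a small decision function. This is a hypothetical illustration of the stated stance, not pingora's implementation: surplus *response* body from upstream may be discarded per RFC 9112's message-body-length rules, while surplus *request* body from downstream is rejected with a 400.

```rust
/// Hedged sketch of the policy described above (names are illustrative):
/// extra response bytes may be discarded per RFC 9112, while extra
/// request bytes are rejected rather than silently dropped.
#[derive(Debug, PartialEq)]
enum SurplusAction {
    DiscardExtra, // response direction (upstream -> downstream)
    Reject400,    // request direction (downstream -> upstream)
}

fn surplus_policy(is_request: bool, received: usize, content_length: usize) -> Option<SurplusAction> {
    if received <= content_length {
        return None; // nothing beyond Content-Length, nothing to decide
    }
    Some(if is_request {
        SurplusAction::Reject400
    } else {
        SurplusAction::DiscardExtra
    })
}

fn main() {
    // A 1MB request body with a few stray extra bytes gets rejected.
    assert_eq!(surplus_policy(true, 1_048_600, 1_048_576), Some(SurplusAction::Reject400));
    // Extra response body may be discarded.
    assert_eq!(surplus_policy(false, 2048, 1024), Some(SurplusAction::DiscardExtra));
    // An incomplete body is neither discarded nor rejected yet.
    assert_eq!(surplus_policy(true, 512, 1024), None);
}
```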
Why does Pingora not support pipelining? Pipelining is mandatory in HTTP/1.1, and the fact that no browser uses it does not change that.
Is there an easy way to support pipelining in my forked repo?
Sorry for the delayed response. We haven't had much need for supporting pipelining internally, though being able to handle pipelined requests is not something we are necessarily against either. Some places in the core proxy code assume pipelining is not being used, and that's where implementors may run into problems.
We are reviewing the attached PR, though we may attempt to preserve the existing behavior for discarding or detecting extra upstream body bytes as nginx does.
