Streaming request / response server is unfairly penalised?
Closed this issue · 16 comments
Since I can't see all the internal details of the router, I'm unsure of my assessment. However, I'm seeing odd behaviour when deploying Falcon as a server within Heroku, and I wonder if it's related to lines 642 to 647 in 26cf07b.
In my case, I have an app deployed within Heroku, and I can log into the web1 dyno:
~ $ curl -i http://localhost:22816
HTTP/1.1 200 OK
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
x-download-options: noopen
x-permitted-cross-domain-policies: none
referrer-policy: strict-origin-when-cross-origin
content-type: text/html; charset=utf-8
vary: Accept, Origin
etag: W/"86f012d9ce08d33ca58526d7f502de8d"
cache-control: max-age=0, private, must-revalidate
x-request-id: ecd781d9-896a-4550-9a5c-e1922f2aa54e
x-runtime: 0.012189
vary: accept-encoding
transfer-encoding: chunked
... data follows ...
However, from the outside world:
> curl -i "http://helloweather-staging.herokuapp.com"
HTTP/1.1 200 OK
Server: Cowboy
Date: Mon, 23 Aug 2021 00:54:01 GMT
Connection: close
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
Referrer-Policy: strict-origin-when-cross-origin
Content-Type: text/html; charset=utf-8
Vary: Accept, Origin
Etag: W/"7151c07748c01426badcbf6f1be61558"
Cache-Control: max-age=0, private, must-revalidate
X-Request-Id: 46995793-019d-4ddd-bc93-03a5065db78c
X-Runtime: 0.011992
Vary: accept-encoding
Via: 1.1 vegur
Note that Connection: close has been added, seemingly by the router, and that transfer-encoding: chunked is missing.
If falcon can unambiguously determine what to do before reading the entire input body, it will do so. However, this normally requires reading all the headers at least.
I wanted to narrow this down, so I've created a very simple rails app to see if I could reproduce the issue:
> curl -i https://rails-falcon-heroku.herokuapp.com
HTTP/1.1 200 OK
Server: Cowboy
Date: Mon, 23 Aug 2021 09:04:00 GMT
Connection: close
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
Referrer-Policy: strict-origin-when-cross-origin
Content-Type: text/plain; charset=utf-8
Vary: Accept
Etag: W/"185f8db32271fe25f561a6fc938b2e26"
Cache-Control: max-age=0, private, must-revalidate
X-Request-Id: d4089e57-bf0c-4158-970d-dddfc2e04c6e
X-Runtime: 0.001643
Vary: accept-encoding
Via: 1.1 vegur
Hello⏎
Even with the most basic app, the problem persists. That said, I have non-Rails apps that seem to be fine, so it might be something to do with Rails?
I've enabled verbose logging on my test server above:
2021-08-23T09:12:13.202925+00:00 app[web.1]: {
"time": "2021-08-23T09:12:13+00:00",
"severity": "info",
"class": "Async::HTTP::Protocol::HTTP1::Request",
"oid": 6980,
"pid": 11,
"subject": "http://rails-falcon-heroku.herokuapp.com: GET / HTTP/1.1",
"message": "Responding with: 200{"x-frame-options"=>["SAMEORIGIN"],
"x-xss-protection"=>["1; mode=block"],
"x-content-type-options"=>["nosniff"],
"x-download-options"=>["noopen"],
"x-permitted-cross-domain-policies"=>["none"],
"referrer-policy"=>["strict-origin-when-cross-origin"],
"content-type"=>"text/plain; charset=utf-8",
"vary"=>["accept", "accept-encoding"],
"etag"=>"W/\"185f8db32271fe25f561a6fc938b2e26\"",
"cache-control"=>["max-age=0", "private", "must-revalidate"],
"x-request-id"=>["5b27910e-daff-46d3-876c-137847bcf24c"],
"x-runtime"=>["0.001773"]};
#<Falcon::Adapters::Output length=nil body=Rack::BodyProxy> | #<Async::HTTP::Body::Statistics sent 5 bytes; took 2.39ms in total; took 2.37ms until first chunk>"
}
I cannot see anything strange about this response - the response body should be chunked since the length is unknown.
Okay, I see something odd:
Locally, using curl:
> curl -v -i http://rails-falcon-heroku.herokuapp.com
* Trying 52.5.82.174:80...
* Connected to rails-falcon-heroku.herokuapp.com (52.5.82.174) port 80 (#0)
> GET / HTTP/1.1
> Host: rails-falcon-heroku.herokuapp.com
> User-Agent: curl/7.77.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: Cowboy
Server: Cowboy
< Date: Mon, 23 Aug 2021 10:00:30 GMT
Date: Mon, 23 Aug 2021 10:00:30 GMT
< Connection: close
Connection: close
< X-Frame-Options: SAMEORIGIN
X-Frame-Options: SAMEORIGIN
< X-Xss-Protection: 1; mode=block
X-Xss-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
X-Content-Type-Options: nosniff
< X-Download-Options: noopen
X-Download-Options: noopen
< X-Permitted-Cross-Domain-Policies: none
X-Permitted-Cross-Domain-Policies: none
< Referrer-Policy: strict-origin-when-cross-origin
Referrer-Policy: strict-origin-when-cross-origin
< Content-Type: text/plain; charset=utf-8
Content-Type: text/plain; charset=utf-8
< Vary: Accept
Vary: Accept
< Etag: W/"185f8db32271fe25f561a6fc938b2e26"
Etag: W/"185f8db32271fe25f561a6fc938b2e26"
< Cache-Control: max-age=0, private, must-revalidate
Cache-Control: max-age=0, private, must-revalidate
< X-Request-Id: 0d510089-e556-4486-9e99-844b6b3a1a93
X-Request-Id: 0d510089-e556-4486-9e99-844b6b3a1a93
< X-Runtime: 0.001960
X-Runtime: 0.001960
< Vary: accept-encoding
Vary: accept-encoding
< Via: 1.1 vegur
Via: 1.1 vegur
<
* Closing connection 0
Hello
But the debug log on my server (what Falcon receives):
2021-08-23T09:57:14.709018+00:00 app[web.1]: {"time":"2021-08-23T09:57:14+00:00","severity":"info","class":"Async::HTTP::Protocol::HTTP1::Request","oid":6820,"pid":10,"subject":"http://rails-falcon-heroku.herokuapp.com: GET / HTTP/1.1","message":"Headers: {\"connection\"=>[\"close\"], \"user-agent\"=>\"curl/7.77.0\", \"accept\"=>[\"*/*\"], \"x-request-id\"=>[\"875e873d-8bfc-443d-b945-f6782eab0f48\"], \"x-forwarded-for\"=>[\"151.210.172.1\"], \"x-forwarded-proto\"=>[\"https\"], \"x-forwarded-port\"=>[\"443\"], \"via\"=>[\"1.1 vegur\"], \"connect-time\"=>[\"0\"], \"x-request-start\"=>[\"1629712634706\"], \"total-route-time\"=>[\"0\"]} from #<Addrinfo: 10.1.24.52:15634 TCP>"}
Specifically, it seems like something between curl and falcon is adding connection: close.
Even when I explicitly set Connection: keep-alive, it's ignored:
> curl -v -H "Connection: keep-alive" -H "Foo: bar" -i http://rails-falcon-heroku.herokuapp.com
...
> GET / HTTP/1.1
> Host: rails-falcon-heroku.herokuapp.com
> User-Agent: curl/7.77.0
> Accept: */*
> Connection: keep-alive
> Foo: bar
On the server:
"Headers: {\"connection\"=>[\"close\"], \"user-agent\"=>\"curl/7.77.0\", \"accept\"=>[\"*/*\"], \"foo\"=>[\"bar\"], \"x-request-id\"=>[\"92e0d87c-eb5c-4837-8dde-f1bdd39da387\"], \"x-forwarded-for\"=>[\"151.210.172.1\"], \"x-forwarded-proto\"=>[\"http\"], \"x-forwarded-port\"=>[\"8
Okay, I have narrowed this down:
Here is my rack application (running in Falcon):
run lambda{|env|
  case env['PATH_INFO']
  when '/array'
    [200, [], ["Hello World"]]
  when '/enumeration'
    [200, [], ["Hello World"].to_enum]
  else
    [404, [], []]
  end
}
The /array response will have a fixed content-length, while the /enumeration response will use transfer-encoding: chunked.
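The difference comes down to whether the total body length is computable up front. As a rough sketch (not Falcon's actual code; `framing_for` is a hypothetical helper), a server might decide the framing like this:

```ruby
# Hypothetical sketch of how a server might pick response framing.
# An Array of Strings has a knowable total length; an Enumerator does not.
def framing_for(body)
  if body.is_a?(Array)
    # Total length is known up front => fixed content-length.
    { "content-length" => body.sum(&:bytesize).to_s }
  else
    # Streaming body of unknown length => chunked transfer encoding.
    { "transfer-encoding" => "chunked" }
  end
end

framing_for(["Hello World"])         # => {"content-length"=>"11"}
framing_for(["Hello World"].to_enum) # => {"transfer-encoding"=>"chunked"}
```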
/array
Client side:
> curl -i https://rails-falcon-heroku.herokuapp.com/array
HTTP/1.1 200 OK
Server: Cowboy
Date: Mon, 23 Aug 2021 20:58:11 GMT
Connection: keep-alive
Vary: accept-encoding
Content-Length: 11
Via: 1.1 vegur
Hello World⏎
Server side:
2021-08-23T20:58:12.067556+00:00 heroku[router]: at=info method=GET path="/array" host=rails-falcon-heroku.herokuapp.com request_id=324f1a6d-2d98-4281-9468-0245cbc9b233 fwd="151.210.172.1" dyno=web.1 connect=0ms service=1ms status=200 bytes=92 protocol=https
2021-08-23T20:58:12.070521+00:00 app[web.1]: {"time":"2021-08-23T20:58:12+00:00","severity":"warn","class":"Async::HTTP::Protocol::HTTP1::Server","oid":2340,"pid":12,"subject":"#<Async::HTTP::Protocol::HTTP1::Server:0x000056318fb3b700>","arguments":[[["Host","rails-falcon-heroku.herokuapp.com"],["Connection","close"],["User-Agent","curl/7.77.0"],["Accept","*/*"],["X-Request-Id","324f1a6d-2d98-4281-9468-0245cbc9b233"],["X-Forwarded-For","151.210.172.1"],["X-Forwarded-Proto","https"],["X-Forwarded-Port","443"],["Via","1.1 vegur"],["Connect-Time","0"],["X-Request-Start","1629752292064"],["Total-Route-Time","0"]]]}
2021-08-23T20:58:12.070694+00:00 app[web.1]: {"time":"2021-08-23T20:58:12+00:00","severity":"info","class":"Async::HTTP::Protocol::HTTP1::Request","oid":2360,"pid":12,"subject":"http://rails-falcon-heroku.herokuapp.com: GET /array HTTP/1.1","message":"Headers: {\"connection\"=>[\"close\"], \"user-agent\"=>\"curl/7.77.0\", \"accept\"=>[\"*/*\"], \"x-request-id\"=>[\"324f1a6d-2d98-4281-9468-0245cbc9b233\"], \"x-forwarded-for\"=>[\"151.210.172.1\"], \"x-forwarded-proto\"=>[\"https\"], \"x-forwarded-port\"=>[\"443\"], \"via\"=>[\"1.1 vegur\"], \"connect-time\"=>[\"0\"], \"x-request-start\"=>[\"1629752292064\"], \"total-route-time\"=>[\"0\"]} from #<Addrinfo: 10.1.29.148:21276 TCP>"}
2021-08-23T20:58:12.071012+00:00 app[web.1]: {"time":"2021-08-23T20:58:12+00:00","severity":"info","class":"Async::HTTP::Protocol::HTTP1::Request","oid":2360,"pid":12,"subject":"http://rails-falcon-heroku.herokuapp.com: GET /array HTTP/1.1","message":"Responding with: 200 {\"vary\"=>[\"accept-encoding\"]}; #<Falcon::Adapters::Output length=11 body=Array> | #<Async::HTTP::Body::Statistics sent 11 bytes; took 0.19ms in total; took 0.18ms until first chunk>"}
Summary: Client -> Router (keep-alive) -> Falcon (close) -> content-length: 11 -> Router -> Client (keep-alive).
/enumeration
Client side:
> curl -i https://rails-falcon-heroku.herokuapp.com/enumeration
HTTP/1.1 200 OK
Server: Cowboy
Date: Mon, 23 Aug 2021 21:00:52 GMT
Connection: close
Vary: accept-encoding
Via: 1.1 vegur
Hello World⏎
Server side:
2021-08-23T21:00:53.044351+00:00 app[web.1]: {"time":"2021-08-23T21:00:53+00:00","severity":"warn","class":"Async::HTTP::Protocol::HTTP1::Server","oid":2340,"pid":13,"subject":"#<Async::HTTP::Protocol::HTTP1::Server:0x000056318fb4c618>","arguments":[[["Host","rails-falcon-heroku.herokuapp.com"],["Connection","close"],["User-Agent","curl/7.77.0"],["Accept","*/*"],["X-Request-Id","ca34386f-9153-4692-a148-bebd2f3ddc83"],["X-Forwarded-For","151.210.172.1"],["X-Forwarded-Proto","https"],["X-Forwarded-Port","443"],["Via","1.1 vegur"],["Connect-Time","0"],["X-Request-Start","1629752453042"],["Total-Route-Time","0"]]]}
2021-08-23T21:00:53.044528+00:00 app[web.1]: {"time":"2021-08-23T21:00:53+00:00","severity":"info","class":"Async::HTTP::Protocol::HTTP1::Request","oid":2360,"pid":13,"subject":"http://rails-falcon-heroku.herokuapp.com: GET /enumeration HTTP/1.1","message":"Headers: {\"connection\"=>[\"close\"], \"user-agent\"=>\"curl/7.77.0\", \"accept\"=>[\"*/*\"], \"x-request-id\"=>[\"ca34386f-9153-4692-a148-bebd2f3ddc83\"], \"x-forwarded-for\"=>[\"151.210.172.1\"], \"x-forwarded-proto\"=>[\"https\"], \"x-forwarded-port\"=>[\"443\"], \"via\"=>[\"1.1 vegur\"], \"connect-time\"=>[\"0\"], \"x-request-start\"=>[\"1629752453042\"], \"total-route-time\"=>[\"0\"]} from #<Addrinfo: 10.1.1.183:24405 TCP>"}
2021-08-23T21:00:53.044906+00:00 app[web.1]: {"time":"2021-08-23T21:00:53+00:00","severity":"info","class":"Async::HTTP::Protocol::HTTP1::Request","oid":2360,"pid":13,"subject":"http://rails-falcon-heroku.herokuapp.com: GET /enumeration HTTP/1.1","message":"Responding with: 200 {\"vary\"=>[\"accept-encoding\"]}; #<Falcon::Adapters::Output length=nil body=Enumerator> | #<Async::HTTP::Body::Statistics sent 11 bytes; took 0.23ms in total; took 0.2ms until first chunk>"}
2021-08-23T21:00:53.045036+00:00 heroku[router]: at=info method=GET path="/enumeration" host=rails-falcon-heroku.herokuapp.com request_id=ca34386f-9153-4692-a148-bebd2f3ddc83 fwd="151.210.172.1" dyno=web.1 connect=0ms service=1ms status=200 bytes=72 protocol=https
Summary: Client -> Router (close) -> Falcon (close) -> close-delimited body (no content-length) -> Router -> Client (close).
Conclusion
It seems like the Heroku router has two distinct problems:
Always connection: close to the application server.
My expectation is the router should establish (assuming only HTTP/1.1) several persistent connections to the application server.
The cost here is the per-connection overhead. Not sure how much this is but it can contribute to overall latency. It's also unexpected.
Unable to handle transfer-encoding: chunked from the application server.
My expectation is that chunked transfer encoding should not result in a connection: close response. The reason this happens is that transfer-encoding: chunked does not carry the content length, so something, somewhere, must be opting to convert this into an unknown-length response followed by a close, rather than using a persistent connection.
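For reference, chunked framing replaces the overall length with per-chunk sizes plus a terminating zero-length chunk (RFC 7230, section 4.1). A minimal sketch (`encode_chunked` is a hypothetical helper; trailers are ignored):

```ruby
# Each chunk is "<hex size>\r\n<data>\r\n"; a zero-size chunk ends the body,
# so no overall content-length is ever needed.
def encode_chunked(parts)
  parts.map { |part| format("%x\r\n%s\r\n", part.bytesize, part) }.join + "0\r\n\r\n"
end

encode_chunked(["Hello", " World"])
# => "5\r\nHello\r\n6\r\n World\r\n0\r\n\r\n"
```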
This can have a significant impact on client side performance since the overall latency can be significantly dominated by connection overhead.
Not only that, but it appears there are some major performance regressions when taking this path.
> ab -k -n 10 https://rails-falcon-heroku.herokuapp.com/array
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking rails-falcon-heroku.herokuapp.com (be patient).....done
Server Software: Cowboy
Server Hostname: rails-falcon-heroku.herokuapp.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
Server Temp Key: ECDH P-256 256 bits
TLS Server Name: rails-falcon-heroku.herokuapp.com
Document Path: /array
Document Length: 11 bytes
Concurrency Level: 1
Time taken for tests: 2.842 seconds
Complete requests: 10
Failed requests: 0
Keep-Alive requests: 10
Total transferred: 1660 bytes
HTML transferred: 110 bytes
Requests per second: 3.52 [#/sec] (mean)
Time per request: 284.240 [ms] (mean)
Time per request: 284.240 [ms] (mean, across all concurrent requests)
Transfer rate: 0.57 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 64 203.8 0 645
Processing: 214 220 15.0 215 262
Waiting: 214 220 15.0 215 262
Total: 214 284 203.1 215 861
Percentage of the requests served within a certain time (ms)
50% 215
66% 215
75% 218
80% 262
90% 861
95% 861
98% 861
99% 861
100% 861 (longest request)
vs
> ab -k -n 10 https://rails-falcon-heroku.herokuapp.com/enumeration
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking rails-falcon-heroku.herokuapp.com (be patient).....done
Server Software: Cowboy
Server Hostname: rails-falcon-heroku.herokuapp.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
Server Temp Key: ECDH P-256 256 bits
TLS Server Name: rails-falcon-heroku.herokuapp.com
Document Path: /enumeration
Document Length: 11 bytes
Concurrency Level: 1
Time taken for tests: 12.336 seconds
Complete requests: 10
Failed requests: 0
Keep-Alive requests: 0
Total transferred: 1410 bytes
HTML transferred: 110 bytes
Requests per second: 0.81 [#/sec] (mean)
Time per request: 1233.579 [ms] (mean)
Time per request: 1233.579 [ms] (mean, across all concurrent requests)
Transfer rate: 0.11 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 642 951 343.8 921 1874
Processing: 212 282 63.0 307 400
Waiting: 212 282 63.1 307 400
Total: 927 1233 340.1 1228 2089
Percentage of the requests served within a certain time (ms)
50% 1228
66% 1229
75% 1261
80% 1398
90% 2089
95% 2089
98% 2089
99% 2089
100% 2089 (longest request)
transfer-encoding: chunked seems to incur roughly 6x the latency.
I investigated the connection: close latency in more detail. Of course, https adds a large overhead; using http gives a slightly better picture, with only 2x the overhead (i.e. reconnecting on every request?).
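That 2x figure is consistent with a simple model in which every non-keep-alive request pays one extra round trip for the TCP handshake. This is a back-of-the-envelope estimate using the numbers from the ab runs above, not a measurement:

```ruby
# Rough model: reconnecting costs one extra RTT per request.
rtt        = 0.210 # observed connect time (seconds)
processing = 0.214 # observed processing time (seconds)

keep_alive_request = processing       # connection reused
close_request      = rtt + processing # reconnect on every request

ratio = close_request / keep_alive_request
puts ratio.round(2) # roughly 2x
```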
> ab -k -n 10 http://rails-falcon-heroku.herokuapp.com/enumeration
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking rails-falcon-heroku.herokuapp.com (be patient).....done
Server Software: Cowboy
Server Hostname: rails-falcon-heroku.herokuapp.com
Server Port: 80
Document Path: /enumeration
Document Length: 11 bytes
Concurrency Level: 1
Time taken for tests: 4.248 seconds
Complete requests: 10
Failed requests: 0
Keep-Alive requests: 0
Total transferred: 1410 bytes
HTML transferred: 110 bytes
Requests per second: 2.35 [#/sec] (mean)
Time per request: 424.788 [ms] (mean)
Time per request: 424.788 [ms] (mean, across all concurrent requests)
Transfer rate: 0.32 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 209 211 1.6 211 214
Processing: 211 214 3.5 213 223
Waiting: 211 213 3.5 213 223
Total: 420 425 4.1 424 434
Percentage of the requests served within a certain time (ms)
50% 424
66% 424
75% 426
80% 430
90% 434
95% 434
98% 434
99% 434
100% 434 (longest request)
I was interested to look at another example where I use streaming with transfer-encoding: chunked:
> curl --no-buffer -i https://utopia-falcon-heroku.herokuapp.com/beer/index
HTTP/1.1 200 OK
Server: Cowboy
Date: Mon, 23 Aug 2021 22:55:30 GMT
Connection: close
Content-Type: text/html; charset=utf-8
Vary: accept-encoding
Via: 1.1 vegur
<!DOCTYPE html><html><head><title>99 Bottles of Beer</title></head><body>... snip ...
It seems like Heroku's router is systematically transforming transfer-encoding: chunked into connection: close without a content-length. This seems like an unfortunate implementation choice. Streaming does work, but only at a very basic (HTTP/1.0) level.
I haven't worked at Heroku in half a decade or so but if I recall:
- I believe the router always closed connections to the backend because that's what it used to do on older legacy stacks
- There is no pre-warming of connections, because that would require coordination across routers, and some apps (i.e. legacy Rails) can only hold 1 connection per dyno
- I do believe there are tests around keepalive behaviour with chunked connections (https://github.com/heroku/vegur/blob/master/test/vegur_roundtrip_SUITE.erl#L59-L65), and some specifically for the keepalive behaviour (https://github.com/heroku/vegur/blob/master/test/vegur_roundtrip_SUITE.erl#L59-L2644)
- There is explicit code handling the chunked -> 1.0 conversion, specifically for 1.0 servers or clients, but not more
- Back in the day the thing I'd have asked for to debug this is a pcap dump to see what happens.
There might be a weird racy case though. IIRC there could be weird things happening if an endpoint sent a full response and closed the connection before we even were done streaming the request, but I don't know if this is what is happening here.
@ferd thanks for chiming in on this project given your circumstances.
I believe the router always closed connections to the backend because that's what it used to do on older legacy stacks
Yes, I agree that I'm observing this behaviour. Which is unfortunate since there is definitely a performance cost here (although probably only a few ms on average).
There is explicit code handling the chunked -> 1.0 conversion, specifically for 1.0 servers or clients, but not more
It basically looks like the router only supports HTTP/1.0 semantics, even if the client supports HTTP/1.1. Or maybe, because the router sets connection: close on the application server request, the application server responds with a non-transfer-encoding body - I need to check this. Certainly, if you are going to close the connection anyway, there is no point using transfer-encoding: chunked. Maybe in this case the router is just sending the response verbatim, including connection: close. I can check this.
Yeah, your latter supposition might be it.
I recall some servers doing this explicitly with the idea that they might be saving a few bytes (even if it wouldn't amount to real savings on an already-grown packet), and I recall the golang server doing the opposite: always using a chunked connection, with an instant terminating \r\n, to save the few bytes of passing connection headers.
Yes, it seems to be the case:
elsif @persistent and version == HTTP11
  write_connection_header(version)
  # We specifically ensure that non-persistent connections do not use chunked response, so that hijacking works as expected.
  write_chunked_body(body, head, trailer)
else
  @persistent = false
  write_connection_header(version)
  write_body_and_close(body, head)
end
I can hack this to always return transfer-encoding: chunked to see if that impacts the front-end response.
Okay, I hacked this to always return transfer-encoding: chunked:
module AlwaysUseTransferEncodingChunked
  def write_body(version, body, head = false, trailer = nil)
    if body.nil?
      write_connection_header(version)
      write_empty_body(body)
    elsif length = body.length and trailer.nil?
      write_connection_header(version)
      write_fixed_length_body(body, length, head)
    elsif body.empty?
      # Even though this code is the same as the first clause `body.nil?`, HEAD responses have an empty body but still carry a content length. `write_fixed_length_body` takes care of this appropriately.
      write_connection_header(version)
      write_empty_body(body)
    elsif version == ::Protocol::HTTP1::Connection::HTTP11
      write_connection_header(version)
      # We specifically ensure that non-persistent connections do not use chunked response, so that hijacking works as expected.
      write_chunked_body(body, head, trailer)
      if !@persistent
        @stream.close_write
      end
    else
      @persistent = false
      write_connection_header(version)
      write_body_and_close(body, head)
    end
  end
end

::Protocol::HTTP1::Connection.prepend(AlwaysUseTransferEncodingChunked)
...and it worked:
> curl -i https://rails-falcon-heroku.herokuapp.com/enumeration
HTTP/1.1 200 OK
Server: Cowboy
Date: Tue, 24 Aug 2021 00:16:21 GMT
Connection: keep-alive
Vary: accept-encoding
Transfer-Encoding: chunked
Via: 1.1 vegur
Hello World⏎
I need to spend some more time thinking about whether this is the right approach or not. However, this is still unexpected behaviour from the router. The router always specifies connection: close, and it's reasonable to expect the server will not use transfer-encoding: chunked in this case. That causes a performance regression if the router does not transform the response back into transfer-encoding: chunked. The router should do everything it possibly can to avoid connection: close and/or ensure connection: keep-alive, because the impact on performance is so large.
Also, there are some cases where this hack might not be the right behaviour... I don't believe the server is doing the wrong thing here, and I think the router is (significantly) sub-optimal given its design constraints/choices. Regarding performance on the server side, such an implementation would prevent the use of splice (/sendfile), because transfer-encoding: chunked is much more involved in the server-side formatting of data.
My expectation back in the day is that the server should still use the chunked connection encoding. Close-delimited connections are unsafe because they contain no mechanism to know if the data transfer was completed due to an early drop in the connection or because it was truly done (some stacks make it unclear whether you get a FIN or RST packet, and so there is no in-band semantic way to know).
If you're using a content-length, the end length is known ahead of time and you know when the message is done, and with a chunked encoding you get a final delimiter that lets you know for sure the message was properly terminated as well. These mean that you should ideally avoid using close-delimited transfers as any sort of optimization because the error-adjacent behaviour is vague and therefore riskier.
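To illustrate the point: with chunked encoding, a receiver can check in-band whether the body terminated cleanly, which a close-delimited body cannot offer. A minimal sketch (`chunked_body_complete?` is a hypothetical helper that ignores trailers):

```ruby
# A well-terminated chunked body ends with the zero-length chunk marker,
# so truncation by an early connection drop is detectable.
def chunked_body_complete?(raw)
  raw.end_with?("0\r\n\r\n")
end

chunked_body_complete?("b\r\nHello World\r\n0\r\n\r\n") # complete
chunked_body_complete?("b\r\nHello Wor")                # truncated mid-chunk
```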
The connection:close vs. connection:keepalive discussion is a good one to have. Maybe things would change today for people running Heroku, but what things were like 5-7 years ago is that, well, there were simply a lot of applications that wouldn't or couldn't handle pipelined requests right, and which otherwise expected the router to load-balance few clients sending lots of requests to various backends.
For example, if they closed the connection to the proxy, they wanted the proxy to keep forwarding the other enqueued requests to other backends, and dropping that meant breaking backwards-compatibility.
The behaviour of the proxy was mostly defined by a desire not to break even older behavior that pre-dated this generation; doing things to be more performant implies going through deprecation cycles with thousands of customers who may or may not be technical enough to handle this.
If I recall properly though, there might have been feature flags allowing keepalive connections to the dynos. I implemented code to support it, so I figure the flag may or may not still be around in the private router product.
Contacting support could help in possibly turning that switch on for your app.
You are totally right: even if we don't need it, using transfer-encoding: chunked is better for reliability (but slightly worse for performance). I think we can opt for this implementation in Falcon, which will mitigate the issue we are experiencing here. It only applies to a small subset of cases (when no content-length is present).
That being said, I think there are still actionable items here w.r.t. Heroku's router.
Okay, I implemented this: socketry/protocol-http1@cb8da42
I will test it later to see if this works as expected.
Okay, I've confirmed the issue is fixed:
> curl -i http://rails-falcon-heroku.herokuapp.com/array
HTTP/1.1 200 OK
Server: Cowboy
Date: Tue, 24 Aug 2021 04:23:18 GMT
Connection: keep-alive
Vary: accept-encoding
Content-Length: 11
Via: 1.1 vegur
Hello World⏎
> curl -i http://rails-falcon-heroku.herokuapp.com/enumeration
HTTP/1.1 200 OK
Server: Cowboy
Date: Tue, 24 Aug 2021 04:23:24 GMT
Connection: keep-alive
Vary: accept-encoding
Transfer-Encoding: chunked
Via: 1.1 vegur
Hello World⏎
Because I'm satisfied with this fix, I will close the issue.