Doesn't work with http.rb
janko opened this issue · 5 comments
I like the idea of async-io
giving you the ability to swap out Ruby's native socket class with the async counterpart. So I went to test this out with one library I know lets you do that – http.rb. Unfortunately, trying to run two requests in parallel raised an Async::Wrapper::Cancelled
exception.
require "http"
require "async"
require "async/io/tcp_socket"
require "async/io/ssl_socket"
http = HTTP::Client.new(
socket_class: Async::IO::TCPSocket,
ssl_socket_class: Async::IO::SSLSocket,
)
Async.run do |task|
task.async do
http.get("http://httpbin.org/delay/5").to_s
end
task.async do
http.get("https://httpbin.org/delay/5").to_s
end
end
Traceback (most recent call last):
12: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/task.rb:74:in `block in initialize'
11: from bla.rb:19:in `block (2 levels) in <main>'
10: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/chainable.rb:20:in `get'
9: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/client.rb:31:in `request'
8: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/client.rb:75:in `perform'
7: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/connection.rb:103:in `read_headers!'
6: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/connection.rb:212:in `read_more'
5: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/timeout/null.rb:45:in `readpartial'
4: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-io-1.16.4/lib/async/io/generic.rb:47:in `block in wrap_blocking_method'
3: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-io-1.16.4/lib/async/io/generic.rb:132:in `async_send'
2: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/wrapper.rb:102:in `wait_readable'
1: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/wrapper.rb:215:in `wait_for'
/Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/task.rb:51:in `yield': The operation has been cancelled! (Async::Wrapper::Cancelled)
Now, I'm not asking you to dive into the http.rb source code to debug this, but for starters I'm interested in what causes an Async::Wrapper::Cancelled
exception.
If I comment out either of the tasks, so that only one task runs, the script executes without errors. So it's only that when running both of the tasks the exception occurs.
In a way, this might be a bug in http.rb
.
https://github.com/socketry/async/blob/master/lib/async/wrapper.rb#L141-L145
Async::Wrapper::Cancelled
is thrown when you close a socket when someone else is still reading or writing to it, i.e. they are waiting for it to become readable or writable. When you close it from a different task, any other task waiting on it will be cancelled.
So, it represents a logic issue of sorts.
You should probably isolate HTTP::Client
to each task, i.e. make one per task. Because that class is not capable of multiplexing requests. Just like threads, you should not attempt to share resources between tasks without mutual exclusion, unless they are specifically designed for it, which most of the async eco-system is.
What's most likely happening is this (based on a quick look at http.rb's client implementation):
- Task 1: Open socket and send request: https://github.com/httprb/http/blob/7c2a8613b75e79ba276c13d48d158030ac319d8d/lib/http/client.rb#L69
- Task 1: Waiting for response.
- Task 2: See that there is open socket, but that it is "dirty": https://github.com/httprb/http/blob/7c2a8613b75e79ba276c13d48d158030ac319d8d/lib/http/client.rb#L119
- Task 2: Close the socket.
- Task 1: Waiting for response is cancelled.
Because HTTP.rb is attempting to multiplex connections, it has problems. Async::HTTP
supports both HTTP/1 and HTTP/2 multiplexed connections but only HTTP/2 supports true concurrency on a single connection. HTTP/1 needs a connection pool.
As an addendum, this issue happens because HTTP.rb
supports some kind of fake concurrency - as long as your requests are non-interleaved, it will use the same connection, so within a single task you will derive some benefit by not closing the connection, but there is no way for this to work, say, in multiple threads, fibers, or any other non-sequential execution. Basically, as I suggested, just isolate HTTP::Client
to each task and the problem should go away.
Yes, you were right. I somehow thought that #get
will internally create a new HTTP::Client
by looking at the code, but it uses HTTP::Client#request
instead of HTTP::Chainable#reqest
. The following script executes with no issues (and the Async.run
block completes in about 5 seconds):
require "http"
require "async"
require "async/io/tcp_socket"
require "async/io/ssl_socket"
options = {
socket_class: Async::IO::TCPSocket,
ssl_socket_class: Async::IO::SSLSocket,
}
Async.run do |task|
task.async do
http = HTTP::Client.new(options)
http.get("http://httpbin.org/delay/5").to_s
end
task.async do
http = HTTP::Client.new(options)
http.get("https://httpbin.org/delay/5").to_s
end
end
Thanks for the help!
You are most welcome!