socketry/async-io

Doesn't work with http.rb

janko opened this issue · 5 comments

janko commented

I like the idea of async-io giving you the ability to swap out Ruby's native socket class with the async counterpart. So I went to test this out with one library I know lets you do that – http.rb. Unfortunately, trying to run two requests in parallel raised an Async::Wrapper::Cancelled exception.

require "http"
require "async"
require "async/io/tcp_socket"
require "async/io/ssl_socket"

http = HTTP::Client.new(
  socket_class:     Async::IO::TCPSocket,
  ssl_socket_class: Async::IO::SSLSocket,
)

Async.run do |task|
  task.async do
    http.get("http://httpbin.org/delay/5").to_s
  end

  task.async do
    http.get("https://httpbin.org/delay/5").to_s
  end
end
Traceback (most recent call last):
        12: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/task.rb:74:in `block in initialize'
        11: from bla.rb:19:in `block (2 levels) in <main>'
        10: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/chainable.rb:20:in `get'
         9: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/client.rb:31:in `request'
         8: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/client.rb:75:in `perform'
         7: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/connection.rb:103:in `read_headers!'
         6: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/connection.rb:212:in `read_more'
         5: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/timeout/null.rb:45:in `readpartial'
         4: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-io-1.16.4/lib/async/io/generic.rb:47:in `block in wrap_blocking_method'
         3: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-io-1.16.4/lib/async/io/generic.rb:132:in `async_send'
         2: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/wrapper.rb:102:in `wait_readable'
         1: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/wrapper.rb:215:in `wait_for'
/Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/task.rb:51:in `yield': The operation has been cancelled! (Async::Wrapper::Cancelled)

Now, I'm not asking you to dive into the http.rb source code to debug this, but for starters I'm interested in what causes an Async::Wrapper::Cancelled exception.

janko commented

If I comment out either of the tasks, so that only one task runs, the script executes without errors. The exception occurs only when both tasks run together.

In a way, this might be a bug in http.rb.

https://github.com/socketry/async/blob/master/lib/async/wrapper.rb#L141-L145

Async::Wrapper::Cancelled is raised when you close a socket while someone else is still reading from or writing to it, i.e. waiting for it to become readable or writable. When you close it from a different task, any other task waiting on it gets cancelled.

So, it represents a logic issue of sorts.
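The same failure mode can be reproduced with nothing but plain Ruby threads and a socketpair. This is a minimal sketch, not async-specific code: the exception class differs (IOError rather than Async::Wrapper::Cancelled), but the cause is the same — one execution context closes an IO that another is blocked reading.

```ruby
require "socket"

reader, writer = Socket.pair(:UNIX, :STREAM, 0)

t = Thread.new do
  reader.read(1) # blocks forever: nothing is ever written
rescue IOError => e
  e              # the reader is told the stream was closed elsewhere
end

sleep 0.1        # give the thread time to block in read
reader.close     # close the socket out from under the reader

result = t.value
puts result.class # => IOError
```

In an async reactor the equivalent situation — one task closing a connection another task is waiting on — surfaces as Async::Wrapper::Cancelled instead.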

You should probably isolate HTTP::Client to each task, i.e. create one client per task, because that class is not capable of multiplexing requests. Just as with threads, you should not share resources between tasks without mutual exclusion unless they are specifically designed for it (which most of the async ecosystem is).

What's most likely happening is this (based on a quick look at http.rb's client implementation):

HTTP.rb is reusing a single connection for both requests, and that attempt at multiplexing is what breaks here. Async::HTTP supports both HTTP/1 and HTTP/2, but only HTTP/2 supports true concurrency on a single connection. HTTP/1 needs a connection pool.
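The checkout/checkin discipline a connection pool enforces can be sketched with nothing but a thread-safe Queue. SimplePool below is a hypothetical stand-in, not an API from any of these gems, and the "connections" are plain strings; the point is only that each caller gets exclusive use of a connection for the duration of its request:

```ruby
require "thread"

# Hypothetical minimal pool: a fixed set of connections handed out one
# caller at a time via a blocking queue.
class SimplePool
  def initialize(size)
    @queue = Queue.new
    size.times { |i| @queue << "conn-#{i}" } # stand-ins for real connections
  end

  def with
    conn = @queue.pop       # blocks until a connection is free
    yield conn
  ensure
    @queue << conn if conn  # return it for the next caller
  end
end

pool = SimplePool.new(2)

# Four concurrent "requests" share two connections without interleaving.
results = 4.times.map do |i|
  Thread.new { pool.with { |c| "request #{i} on #{c}" } }
end.map(&:value)

puts results.length # 4
```

This is, roughly, the shape of what a pooled HTTP/1 client does internally: concurrency across requests, but strictly sequential use of each individual connection.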

As an addendum: this issue happens because HTTP.rb supports a limited form of connection reuse. As long as your requests are non-interleaved, it will reuse the same connection, so within a single task you derive some benefit by not closing the connection. But there is no way for this to work across multiple threads, fibers, or any other non-sequential execution. As I suggested, just isolate HTTP::Client to each task and the problem should go away.

janko commented

Yes, you were right. Looking at the code, I somehow thought that #get would internally create a new HTTP::Client, but it uses HTTP::Client#request rather than HTTP::Chainable#request. The following script executes with no issues (and the Async.run block completes in about 5 seconds):

require "http"
require "async"
require "async/io/tcp_socket"
require "async/io/ssl_socket"

options = {
  socket_class:     Async::IO::TCPSocket,
  ssl_socket_class: Async::IO::SSLSocket,
}

Async.run do |task|
  task.async do
    http = HTTP::Client.new(options)
    http.get("http://httpbin.org/delay/5").to_s
  end

  task.async do
    http = HTTP::Client.new(options)
    http.get("https://httpbin.org/delay/5").to_s
  end
end

Thanks for the help!

You are most welcome!