socketry/async-io

Doesn't work with http.rb

janko opened this issue · 5 comments

janko commented

I like the idea of async-io giving you the ability to swap out Ruby's native socket class with the async counterpart. So I went to test this out with one library I know lets you do that – http.rb. Unfortunately, trying to run two requests in parallel raised an Async::Wrapper::Cancelled exception.

require "http"
require "async"
require "async/io/tcp_socket"
require "async/io/ssl_socket"

http = HTTP::Client.new(
  socket_class:     Async::IO::TCPSocket,
  ssl_socket_class: Async::IO::SSLSocket,
)

Async.run do |task|
  task.async do
    http.get("http://httpbin.org/delay/5").to_s
  end

  task.async do
    http.get("https://httpbin.org/delay/5").to_s
  end
end
Traceback (most recent call last):
        12: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/task.rb:74:in `block in initialize'
        11: from bla.rb:19:in `block (2 levels) in <main>'
        10: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/chainable.rb:20:in `get'
         9: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/client.rb:31:in `request'
         8: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/client.rb:75:in `perform'
         7: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/connection.rb:103:in `read_headers!'
         6: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/connection.rb:212:in `read_more'
         5: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/http-4.0.0/lib/http/timeout/null.rb:45:in `readpartial'
         4: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-io-1.16.4/lib/async/io/generic.rb:47:in `block in wrap_blocking_method'
         3: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-io-1.16.4/lib/async/io/generic.rb:132:in `async_send'
         2: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/wrapper.rb:102:in `wait_readable'
         1: from /Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/wrapper.rb:215:in `wait_for'
/Users/janko/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/async-1.10.3/lib/async/task.rb:51:in `yield': The operation has been cancelled! (Async::Wrapper::Cancelled)

Now, I'm not asking you to dive into the http.rb source code to debug this, but for starters I'm interested in what causes an Async::Wrapper::Cancelled exception.

janko commented

If I comment out either of the tasks, so that only one task runs, the script executes without errors. The exception occurs only when both tasks run together.

In a way, this might be a bug in http.rb.

https://github.com/socketry/async/blob/master/lib/async/wrapper.rb#L141-L145

Async::Wrapper::Cancelled is raised when you close a socket while someone else is still reading from or writing to it, i.e. waiting for it to become readable or writable. When you close it from a different task, any other task waiting on it gets cancelled.

So, it represents a logic issue of sorts.
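The same failure mode can be reproduced with nothing but plain Ruby threads and a socketpair. This is a minimal sketch, not async-specific code: the exception class differs (IOError rather than Async::Wrapper::Cancelled), but the cause is the same — one execution context closes an IO that another is blocked reading.

```ruby
require "socket"

reader, writer = Socket.pair(:UNIX, :STREAM, 0)

t = Thread.new do
  reader.read(1) # blocks forever: nothing is ever written
rescue IOError => e
  e              # the reader is told the stream was closed elsewhere
end

sleep 0.1        # give the thread time to block in read
reader.close     # close the socket out from under the reader

result = t.value
puts result.class # => IOError
```

In an async reactor the equivalent situation — one task closing a connection another task is waiting on — surfaces as Async::Wrapper::Cancelled instead.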

You should probably isolate HTTP::Client to each task, i.e. create one client per task, because that class is not capable of multiplexing requests. Just as with threads, you should not share resources between tasks without mutual exclusion unless they are specifically designed for it (which most of the async ecosystem is).

What's most likely happening is this (based on a quick look at http.rb's client implementation):

HTTP.rb is reusing a single connection for both requests, and that attempt at multiplexing is what breaks here. Async::HTTP supports both HTTP/1 and HTTP/2, but only HTTP/2 supports true concurrency on a single connection. HTTP/1 needs a connection pool.
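The checkout/checkin discipline a connection pool enforces can be sketched with nothing but a thread-safe Queue. SimplePool below is a hypothetical stand-in, not an API from any of these gems, and the "connections" are plain strings; the point is only that each caller gets exclusive use of a connection for the duration of its request:

```ruby
require "thread"

# Hypothetical minimal pool: a fixed set of connections handed out one
# caller at a time via a blocking queue.
class SimplePool
  def initialize(size)
    @queue = Queue.new
    size.times { |i| @queue << "conn-#{i}" } # stand-ins for real connections
  end

  def with
    conn = @queue.pop       # blocks until a connection is free
    yield conn
  ensure
    @queue << conn if conn  # return it for the next caller
  end
end

pool = SimplePool.new(2)

# Four concurrent "requests" share two connections without interleaving.
results = 4.times.map do |i|
  Thread.new { pool.with { |c| "request #{i} on #{c}" } }
end.map(&:value)

puts results.length # 4
```

This is, roughly, the shape of what a pooled HTTP/1 client does internally: concurrency across requests, but strictly sequential use of each individual connection.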

As an addendum: this issue happens because HTTP.rb supports a limited form of connection reuse. As long as your requests are non-interleaved, it will reuse the same connection, so within a single task you derive some benefit by not closing the connection. But there is no way for this to work across multiple threads, fibers, or any other non-sequential execution. As I suggested, just isolate HTTP::Client to each task and the problem should go away.

janko commented

Yes, you were right. Looking at the code, I somehow thought that #get would internally create a new HTTP::Client, but it uses HTTP::Client#request rather than HTTP::Chainable#request. The following script executes with no issues (and the Async.run block completes in about 5 seconds):

require "http"
require "async"
require "async/io/tcp_socket"
require "async/io/ssl_socket"

options = {
  socket_class:     Async::IO::TCPSocket,
  ssl_socket_class: Async::IO::SSLSocket,
}

Async.run do |task|
  task.async do
    http = HTTP::Client.new(options)
    http.get("http://httpbin.org/delay/5").to_s
  end

  task.async do
    http = HTTP::Client.new(options)
    http.get("https://httpbin.org/delay/5").to_s
  end
end

Thanks for the help!

You are most welcome!