Controller breaks on SIGHUP when ready isn't explicitly called
bryanp opened this issue · 4 comments
Discussed this with @ioquatix on Slack. Reproduce with the example code from the readme:
require 'async/container'
class Controller < Async::Container::Controller
def setup(container)
container.async do |task|
while true
puts "hello"
task.sleep(1)
end
end
end
end
controller = Controller.new
controller.run
Send SIGHUP
to the main process. You should see something like this in terminal:
zsh: hangup bundle exec ruby readme.rb
"hello" is still printed because the child process is still running. Hitting Ctrl-C
does not stop the process at this point. Stopping the process through Activity Monitor causes this error:
13.52s error: Async::Container::Process [pid=5891] [2020-04-14 09:22:11 -0700]
| Async::Container::Terminate: SIGTERM
| → /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/process.rb:74 in `block (3 levels) in fork'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-1.24.2/lib/async/reactor.rb:204 in `select'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-1.24.2/lib/async/reactor.rb:204 in `run_once'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-1.24.2/lib/async/reactor.rb:234 in `run'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-1.24.2/lib/async/reactor.rb:56 in `run'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/generic.rb:168 in `block in async'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/process.rb:77 in `block (2 levels) in fork'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/process.rb:72 in `fork'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/process.rb:72 in `block in fork'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/process.rb:106 in `initialize'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/process.rb:71 in `new'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/process.rb:71 in `fork'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/forked.rb:35 in `start'
| /Users/bryanp/.gem/ruby/2.6.5/gems/async-container-0.16.4/lib/async/container/generic.rb:134 in `block in spawn'
Here's a working version:
require 'async/container'
class Controller < Async::Container::Controller
def setup(container)
container.run(count: 1) do |instance|
Async do |task|
task.async do
loop do
puts "hello"
task.sleep 1
end
end
instance.ready!
task.children.each(&:wait)
end
end
end
end
controller = Controller.new
controller.run
Couple observations:
- We should consider a better way to handle the case where
ready!
is not called. - I'm not entirely sure how
ready!
is to be called when usingcontainer.async
.
So, this code has received updates to get it working I production but there are some rough edges as you have outlined. Thanks for that.
Readiness handling
ready!
is going to be an essential part of the contract between parent and child process. Right now I don't have any semantics for handling failures except "either the entire thing has failed" or "it failed, try to restart it".
We need to introduce timeout handling and backoff if failures are happening repeatedly, and this state needs to be modelled via notifications upstream.
An example of this is a child which fails because of a syntax error, no amount of restarting will make it work. So we need to implement a policy for restarting: try 3 times, if it fails every time, put it on ice and inform the parent. Provide a mechanism to try failed children (e.g. SIGHUP
).
Interface considerations
The container.async
method can either be deprecated/removed or augmented to supply the right interfaces. It's essentially equivalent to running your own Async block in a child.
The logic was to provide something like container.async do ... async code ... end
. So you don't worry how then container is working but just tell it to execute some code asynchronously. However it might give a false sense of security to user (e.g. shared mutable resources).
I'm interested in contributing timeout handling and backoff. Since you have an approach in mind, is this something you'd be able to hand off? Realistically, it would involve some knowledge transfer. But I'm actively working on some process-related things, so I do have some context.
What about changing Async::Container::Generic#async
to this?
def async(**options)
spawn(**options) do |instance|
Async::Reactor.run(instance) do |task|
yield instance, task
end
end
end
Yielding two arguments from async
is a bit surprising, which might be an argument for removing the async
method and letting the user do it themselves. Not sure where I land here.
I'm happy for you to take a stab at the backoff handling. But, it's going to be quite tricky to get right. So happy to help.
You will need to check changes with falcon too.
To avoid breaking existing code:
def async(**options)
spawn(**options) do |instance|
Async::Reactor.run do |task|
instance.ready!
yield task
end
end
end
However I'd also be okay with removing that method.
By the way we should probably add a spec to capture the broken behaviour before fixing it to confirm it's fixed, these are tricky issues and having coverage is really helpful.