zombocom/rack-timeout

Prefer sending SIGKILL over SIGTERM to a process

katyho opened this issue · 2 comments

I understand that the motivation behind SIGTERM is to give the process time to process requests in other threads. However, in many cases SIGTERM might not be sufficient in killing hung requests.

We'd prefer SIGKILL over SIGTERM for a few additional reasons:

  • When a request takes so long that it times out, something probably already went horribly wrong in the app, in which case we might not care about cleaning up or terminating gracefully.
  • For multiprocess, single threaded applications, we don't need to care about processing requests in other threads.

Would you consider supporting a mechanism to send SIGKILL over SIGTERM?

Heroku's behavior is that it sends a SIGTERM and then waits 10 seconds and sends a SIGKILL. My preference to add such behavior as sending SIGKILL would be to mimic this pause and wait behavior, and if you wanted to not wait at all you could set it to zero seconds so then it would get SIGTERM followed by SIGKILL right away. I'm hesitant though as it doesn't seem like others have hit this and misconfiguring this will kill in-flight requests on multi-threaded servers (such as Puma) that are not given a chance to finish.

Are you still finding you want the ability to SIGKILL a process? Is this a thing you've seen in the wild or is this more of a "I think I want this" feature discussion?

Also it's been about a year since you opened the issue. Did you find any workarounds via observers or catching the exception or something else?

Adding the SIGTERM logic was added quite reluctantly and arguably outside the scope of what rack-timeout should really do, but was added to help users fix a practical problem with hung application processes.

That said, it would be easy enough to add a class variable to reference the string 'SIGTERM', that could then be replaced by the developer. (I would not support adding this via an option to the middleware itself.) So the usage would be something like:

require 'rack/timeout/base'
Rack::Timeout.signal_on_timeout = 'SIGKILL'
use Rack::Timeout, term_on_timeout: true

Setting this to an appropriate value would be the responsibility of the developer.