rails/solid_queue

Readiness status

Opened this issue · 6 comments

Is there any way of detecting (from outside of the process) when the worker process is up and running and able to start processing jobs?

During a rolling deployment on Kubernetes, the old worker isn't shut down until the new one passes its readiness probe, but how can we determine readiness from outside the process?

Can we make it write to a file when it's up and ready to process jobs?

rosa commented

Hey @andyjeffries, sorry for the delay in replying to this one! I wonder if you could use the pidfile that the supervisor sets as your readiness probe. See the supervisor_pidfile configuration option mentioned here. Initially, that was how we were going to run Solid Queue in k8s in the cloud, but we never got to do that because we moved to on-premises using Kamal, so our setup is much simpler. The pidfile is set up right before the supervisor starts the workers, so strictly speaking they aren't ready to process jobs yet. However, the previous supervisor also waits for a shutdown timeout before actually terminating all its workers, so I think it should be OK.
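For illustration, the pidfile check described above could be sketched in Ruby roughly like this. It is only a sketch under assumptions: the method name `supervisor_ready?` is made up for this example, and the pidfile path is whatever you configured via supervisor_pidfile. It also goes slightly beyond a plain presence check by verifying that the PID in the file belongs to a live process, which guards against a stale file left by a crashed supervisor.

```ruby
# Readiness check sketch: true only if the pidfile exists and names a live process.
# `supervisor_ready?` is a hypothetical helper, not part of Solid Queue.
def supervisor_ready?(pidfile)
  return false unless File.file?(pidfile)

  # Parse the contents as a base-10 integer; nil if the file holds garbage.
  pid = Integer(File.read(pidfile).strip, 10, exception: false)
  return false if pid.nil? || pid <= 0

  Process.kill(0, pid) # signal 0 sends nothing, it only checks the process exists
  true
rescue Errno::ESRCH
  false # stale pidfile: no such process
rescue Errno::EPERM
  true # process exists but is owned by another user
end
```

A probe command could then wrap this as `exit(supervisor_ready?(path) ? 0 : 1)`, where `path` is the configured pidfile location. As noted above, a passing check means the supervisor has started, not that every worker is already polling.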

Thanks @rosa, that should work OK for now. It would be great if there were a hook that ran once it's up and ready to process jobs...

@rosa How do you define the healthcheck for Kamal? I'm wondering how to do it in the most reliable way, especially with the recent change in basecamp/kamal#740.

rosa commented

@morgoth you could use Kamal's healthcheck option and the supervisor_pidfile option, and just check for the file's presence. This only works for deploys, though: the healthcheck does nothing once the deploy has finished. If you meant a health check more in the sense of continuous monitoring, we don't have anything like that. So far we've been fine just monitoring other things, like the number of pending jobs in specific queues.

In any case, I'm planning to improve the existing supervisor_pidfile approach for version 1.0, so that we can detect when all workers are ready to process jobs and all dispatchers are ready to dispatch.

Maybe provide hooks into the SolidQueue boot/shutdown callbacks? This would allow custom logic for readiness detection (vs using the supervisor pid). We used this exact approach with Sidekiq on k8s, and it worked well enough.
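To make the hook idea concrete, here is a minimal sketch of the kind of logic such callbacks could run: touch a marker file on boot and remove it on shutdown, so an external probe only has to test for the file. The helper names, the `WORKER_READY_FILE` variable and the default path are all hypothetical, and no Solid Queue hook API is assumed here. The Sidekiq wiring appears only as a comment, using Sidekiq's `config.on(:startup)` and `config.on(:shutdown)` lifecycle events.

```ruby
require "fileutils"

# Hypothetical marker path for this sketch; not a Solid Queue or Sidekiq setting.
READY_FILE = ENV.fetch("WORKER_READY_FILE", "tmp/pids/worker_ready")

# Call from a boot callback once the worker is able to process jobs.
def mark_ready(path = READY_FILE)
  FileUtils.mkdir_p(File.dirname(path))
  FileUtils.touch(path)
end

# Call from a shutdown callback so the readiness probe starts failing.
# rm_f ignores a missing file, so calling this twice is harmless.
def mark_not_ready(path = READY_FILE)
  FileUtils.rm_f(path)
end

# With Sidekiq, the wiring would look roughly like:
#
#   Sidekiq.configure_server do |config|
#     config.on(:startup)  { mark_ready }
#     config.on(:shutdown) { mark_not_ready }
#   end
```

The readiness probe then reduces to a file-existence test on the marker, which is cheap to run from outside the process.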

We used the approach @zerobearing2 describes too and it worked perfectly for us.