QueueClassic/queue_classic

Suggestion: workers that can process multiple queues

Closed this issue · 11 comments

ukd1 commented

Problem: currently QC only allows a worker to work on a single named queue, so most people end up with something like this:

worker_queue_low: env TERM_CHILD=1 QUEUE=low bundle exec rake qc:work
worker_queue_medium: env TERM_CHILD=1 QUEUE=medium bundle exec rake qc:work
worker_queue_high: env TERM_CHILD=1 QUEUE=high bundle exec rake qc:work

This costs (on Heroku) 3 dynos and each queue can only do one task at a time.

I'm suggesting:

worker_queue_low: env TERM_CHILD=1 QUEUE=* bundle exec rake qc:work
worker_queue_medium: env TERM_CHILD=1 QUEUE=medium,high bundle exec rake qc:work
worker_queue_high: env TERM_CHILD=1 QUEUE=high bundle exec rake qc:work

This would allow the low worker to do any work and the medium worker to do medium and high priority jobs.

For the same cost, we'd be able to process 3x as many high priority jobs at once and 2x as many medium priority jobs at once, assuming the other queues are empty (which is the case most of the time).
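To make the 3x / 2x figures concrete, here is a small sketch that counts how many workers would accept jobs from each queue under the proposed Procfile. The method name `capacity_per_queue` and the hash layout are made up for illustration; nothing here is part of queue_classic.

```ruby
# Hypothetical helper (not part of queue_classic): count how many workers
# are willing to take jobs from each queue.
# workers: Hash of worker name => QUEUE value ("*" means any queue).
# queues:  Array of all queue names in use.
def capacity_per_queue(workers, queues)
  queues.to_h do |queue|
    count = workers.values.count do |spec|
      spec == "*" || spec.split(",").include?(queue)
    end
    [queue, count]
  end
end

# The proposed Procfile entries from above, reduced to their QUEUE values.
proposed = {
  "worker_queue_low"    => "*",
  "worker_queue_medium" => "medium,high",
  "worker_queue_high"   => "high"
}

p capacity_per_queue(proposed, %w[low medium high])
```

With the current one-queue-per-worker setup, every count would be 1.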

I think the change is easy and non-breaking:

  1. Accept * to process any queue
  2. Accept a comma-separated list, which would be split; the worker would then check for jobs in each queue in order
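
The two rules above could be sketched roughly as follows. `parse_queue_spec` is a hypothetical helper, not an existing queue_classic method, and the fallback name "default" is an assumption about what an unset QUEUE should mean. A single name such as `low` still parses to a one-element list, which is why the change should be non-breaking.

```ruby
# Hypothetical helper (not part of queue_classic): turn a QUEUE value into
# something a worker can act on.
# Returns :any for "*", otherwise the queue names in the order given.
# The "default" fallback is an assumption for this sketch.
def parse_queue_spec(spec, default: "default")
  spec = spec.to_s.strip
  return [default] if spec.empty?
  return :any if spec == "*"

  names = spec.split(",").map(&:strip).reject(&:empty?).uniq
  names.empty? ? [default] : names
end

# Example: read the value the same way the rake task would receive it.
p parse_queue_spec(ENV["QUEUE"])
```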

Russ

p.s. We (@ukd1 / @smathieu / @rainforestapp) are happy to code this up, test it and do a PR, but would like to know if you'd merge it.

I would gladly accept a pull request implementing said feature!

On Jul 20, 2013, at 13:04, Russell Smith notifications@github.com wrote:

We (@ukd1 / @smathieu / @rainforestapp) are happy to code this up, test it and do a PR, but would like to know if you'd merge it.

How will you handle the scheduling of the worker? For example, imagine you have 100 jobs in the high priority queue and 1 in the medium priority queue, will you have to work off all of the high priority jobs before getting to the medium priority? Is it possible to starve the non-highest priority queues?

It might be worth taking an RDD (README-driven development) approach to this feature. This way we can be sure that the end solution is sufficient for most cases while being easy for new developers to understand.

ukd1 commented

@ryandotsmith I'll write the README for it ASAP! 👍

ukd1 commented

We (https://www.rainforestqa.com/) are offering a 200 USD bounty for a merged pull request implementing this. You will be required to release under the same license and also invoice us. Reply here with more questions!

There's a "worker concurrency" option in the master branch of queue_classic that would let you run multiple jobs per worker.

ukd1 commented

@joevandyk isn't that still for a single queue?

Oops, yes, I misread it originally.

v9n commented

@ryandotsmith @ukd1

How will you handle the scheduling of the worker? For example, imagine you have 100 jobs in the high priority queue and 1 in the medium priority queue, will you have to work off all of the high priority jobs before getting to the medium priority? Is it possible to starve the non-highest priority queues?

To keep things simple, I think we should just process all jobs in the higher priority queue first. Anyone who chooses to use a single worker for multiple queues should be aware that the lower priority queues can be starved. Moreover, this setup should only be used when we know we won't have many jobs in the higher priority queue.

Or we could set a ratio. For example, if the jobs in a queue make up less than 1% of the total number of jobs across all queues, we process that queue first?
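
A rough sketch of the two options, using in-memory counts instead of the database. `next_queue`, its arguments, and the `rare_threshold` parameter are all hypothetical names for illustration. Setting `rare_threshold: 0` gives plain strict priority (higher queues always drain first); the default of 0.01 applies the 1% rule.

```ruby
# Hypothetical scheduler sketch (not queue_classic code).
# pending:  Hash of queue name => number of unlocked jobs.
# priority: queue names in priority order, highest first.
# Returns the queue to work next, or nil if nothing is pending.
def next_queue(pending, priority, rare_threshold: 0.01)
  candidates = priority.select { |q| pending.fetch(q, 0) > 0 }
  return nil if candidates.empty?

  total = candidates.sum { |q| pending[q] }

  # Ratio rule: a queue holding less than rare_threshold of all pending
  # jobs is served first so it cannot be starved.
  rare = candidates.find { |q| pending[q].to_f / total < rare_threshold }

  # Otherwise fall back to strict priority order.
  rare || candidates.first
end

counts = { "high" => 100, "medium" => 1 }
p next_queue(counts, %w[high medium low])                    # ratio rule
p next_queue(counts, %w[high medium low], rare_threshold: 0) # strict priority
```

In the 100-versus-1 example from the question, the medium queue holds 1 of 101 jobs (just under 1%), so the ratio rule picks it, while strict priority picks high.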

ukd1 commented

I suggest the simplest is to sort by the queue_name in the order which is specified in the environment variable;

worker: env QUEUE=high,medium,low bundle exec rake qc:work

would make things around here do the equivalent of:

SELECT id, CASE WHEN q_name='high' THEN 1
            WHEN q_name='medium' THEN 2
            WHEN q_name='low' THEN 3
       END AS q_priority
FROM queue_classic_jobs
WHERE locked_at IS NULL
  AND q_name IN ('high', 'medium', 'low')
ORDER BY q_priority ASC, id ASC
LIMIT 1 .....

Just a thought, but I think this can and should be done mainly in SQL.
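
As a sketch of how the query above could be generated from the QUEUE list, the following builds the CASE expression with numbered placeholders rather than interpolated names. `prioritized_lock_query` is a hypothetical name, and the locking part that the query above elides is left out here too.

```ruby
# Hypothetical query builder (not queue_classic code).
# queues: queue names in priority order, highest first.
# Returns [sql, params]; the same $n placeholder is reused in the CASE
# and in the IN list, which PostgreSQL allows.
def prioritized_lock_query(queues)
  raise ArgumentError, "at least one queue is required" if queues.empty?

  whens = queues.each_index.map { |i| "WHEN q_name = $#{i + 1} THEN #{i + 1}" }
  placeholders = queues.each_index.map { |i| "$#{i + 1}" }

  sql = <<~SQL
    SELECT id, CASE #{whens.join(' ')} END AS q_priority
    FROM queue_classic_jobs
    WHERE locked_at IS NULL
      AND q_name IN (#{placeholders.join(', ')})
    ORDER BY q_priority ASC, id ASC
    LIMIT 1
  SQL

  [sql, queues]
end

sql, params = prioritized_lock_query(%w[high medium low])
puts sql
p params
```

Assuming the pg gem, the pair could be passed to something like `conn.exec_params(sql, params)`.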

ukd1 commented

👍 to @nathankot

I am going to close this issue since the v3.0.0beta introduces the ability for a worker to process many queues. I will not be surprised if we need to make a fix or an improvement to the feature, but let's open a new issue for those requests.

https://github.com/ryandotsmith/queue_classic/releases/tag/v3.0.0beta