ixti/sidekiq-throttled

Jobs randomly changing position only if throttle is used

feliperaul opened this issue · 4 comments

Ruby version: 2.6.5
Sidekiq / Pro / Enterprise version(s): 6.0.7

Initializer:

:concurrency: 25
:queues:
  - mailers
  - default
  - importers
  - searchkick
  - other

We are currently inspecting a queue that is taking a long time to run ("importers").

On investigating, we noticed a very unusual behavior that we think is a bug and might be causing performance degradation and/or unexpected behavior.

If we activate "Live poll" and watch the queue in the Web UI, the jobs appear in a completely different order every 2 seconds, even though no new jobs have been added and none have been processed (these jobs take a few minutes each, and the queue only receives jobs manually, so it should be very stable during processing).

Using the Rails console to inspect it further, we confirmed that on every poll (a few seconds apart) we get a differently ordered array of job.jids, even though the set of job IDs stays the same (only the order changes, so no jobs are being added or removed).

We're using this debug code to confirm the order is changing very rapidly:

def debug
  queue = Sidekiq::Queue.new("importers")

  # Snapshot the job IDs in queue order
  jids = queue.map { |job| job.jid }
  p "I have #{jids.size} jobs and my first one id is #{jids[0]}"

  sleep 1

  # Snapshot again one second later
  jids2 = queue.map { |job| job.jid }
  p "I have #{jids2.size} jobs and my first one id is #{jids2[0]}"

  # Both differences are empty if no job was added or removed
  p "This is jids2 - jids"
  p jids2 - jids
  p "This is jids - jids2:"
  p jids - jids2

  # Order-sensitive comparison, then order-insensitive
  p "Are they the same?"
  p jids == jids2
  p "But what if I order them?"
  p jids.sort == jids2.sort
end

This is the result:

(main)> debug
"I have 151 jobs and my first one id is 20480233f53e0b8d62231615"
"I have 151 jobs and my first one id is 0dc7e77f17c1f6b3fc6e59a8"
"This is jids2 - jids"
[]
"This is jids - jids2:"
[]
"Are they the same?"
false
"But what if I order them?"
true

AFAIK, Sidekiq guarantees that jobs are going to be fetched in order (even though, of course, it's an async job queue, so there's no guarantee they will finish in the order they were added), so we think this is a bug.

I just bumped into #52

So this appears to be by design :)

Closing.

ixti commented

@feliperaul can you point me to where Sidekiq guarantees that jobs are going to be fetched in order? If it does, we need to think about how to make sure we follow that.

Hi @ixti, sure.

It's on the FAQ, here: https://github.com/mperham/sidekiq/wiki/FAQ#how-can-i-process-a-certain-queue-in-serial

It reads:

How can I process a certain queue in serial?

You can't, by design. Sidekiq is designed for asynchronous processing of jobs that can be completed in isolation and independent of each other. Jobs will be popped off of Redis in the order in which they were pushed but there's no guarantee that Job #1 will execute fully before Job #2 is started.

If you need serial execution, you should look into other systems which give those types of guarantees.

To be quite honest, it hasn't been a big deal for us, but if, as #52 states, a throttled job queue is paused from polling for two seconds, I think it would be much better to push the jobs back to the top of the queue instead of to the end of the line.

Imagine a huge e-mail sending queue (like 100,000 jobs) that you are throttling to use only 5 workers ... if jobs are being pushed to the END of the queue, they would take much, much longer to send than if they were pushed back to the top of the queue to wait for the next poll window (2 seconds).
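
To make the difference concrete, here is a toy model of the two push-back positions. It is plain Ruby on an array, not Sidekiq or sidekiq-throttled code; the method name `requeue_throttled` and its `fetches:` / `strategy:` arguments are made up for this illustration, and it assumes every fetched job is throttled and immediately pushed back.

```ruby
# Toy model: index 0 is the next job to be fetched.
# Hypothetical helper, not part of Sidekiq or sidekiq-throttled.
def requeue_throttled(queue, fetches:, strategy:)
  queue = queue.dup
  fetches.times do
    job = queue.shift # fetch the next job; assume it turns out to be throttled
    case strategy
    when :tail then queue.push(job)    # back of the line
    when :head then queue.unshift(job) # front of the line
    else raise ArgumentError, "unknown strategy: #{strategy.inspect}"
    end
  end
  queue
end

jobs = %w[a b c d e]
tail_order = requeue_throttled(jobs, fetches: 3, strategy: :tail)
head_order = requeue_throttled(jobs, fetches: 3, strategy: :head)

p tail_order                   # => ["d", "e", "a", "b", "c"]
p head_order                   # => ["a", "b", "c", "d", "e"]
p tail_order.sort == jobs.sort # => true
```

With tail push-back the order rotates while the set of jobs stays the same, which matches what the debug output above shows; with head push-back the order is preserved.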

ixti commented

I think there's no one-size-fits-all solution. But we can make throttling as customizable as possible:

  • allowing throttling per queue rather than per class – that can easily guarantee FIFO
  • allowing configuration of how throttled jobs are pushed back – to the head or to the tail (as it is now)

I'll be happy to review and merge any improvements.