jmettraux/rufus-scheduler

Question: mutex and overlap interaction

Closed this issue · 8 comments

I think there's some funny interactions between mutex and overlap. With overlap: false and mutex: :whatever applied to multiple schedules, I expect the jobs just to be queued, but in reality, the ordering is unpredictable, and some jobs get run more often (by chance I'm sure).

Expected Output

running job1
job1 done
running job2
job2 done
running job3
job3 done
running job1
job1 done
running job2
job2 done
running job3
job3 done
...

Repro

ENV['TZ'] = 'America/Los_Angeles'
s = Rufus::Scheduler.new

s.every(1, mutex: :my_mutex, overlap: false) { puts "running job1"; sleep 2; puts "job1 done" }

s.every(1, mutex: :my_mutex, overlap: false) { puts "running job2"; sleep 2; puts "job2 done" }

s.every(1, mutex: :my_mutex, overlap: false) { puts "running job3"; sleep 2; puts "job3 done" }

Actual Output

running job1
job1 done
running job2
job2 done
running job3
job3 done
running job1
job1 done
running job1
job1 done
running job2
job2 done
running job2
job2 done
running job1
job1 done
running job3
job3 done
running job1
job1 done
running job2
job2 done
running job3
job3 done
running job3
job3 done
running job1
job1 done
running job2
job2 done
running job2
job2 done
running job3
job3 done
running job3
job3 done
running job2
job2 done
running job2
job2 done
running job1
job1 done
running job3
job3 done
running job1
job1 done
running job2
job2 done
running job1
job1 done
running job2
job2 done
running job3
job3 done
running job3
job3 done
running job1
job1 done
running job3
job3 done
running job2
job2 done
running job1
job1 done
running job3
job3 done
running job3

I'm not sure if this is a bug, or just that I'm doing something wrong (to get my desired behavior). Please advise. Thanks!

Hello,

not a solution, nor a full answer, but blocking: true makes it behaves squarely as I think you would expect:

require 'rufus-scheduler'

ENV['TZ'] = 'America/Los_Angeles'
s = Rufus::Scheduler.new

def now
  t = Time.now                                                                  
  "#{t.strftime('%H:%M:%S')}.#{sprintf('%06d', t.usec)}"             
end 

s.every(1, blocking: true) {
  print "#{now} running job1..."; sleep 2; puts " #{now} job1 done." }
s.every(1, blocking: true) {
  print "#{now} running job2..."; sleep 2; puts " #{now} job2 done." }
s.every(1, blocking: true) {
  print "#{now} running job3..."; sleep 2; puts " #{now} job3 done." }

s.join

But at the detriment of other jobs (although you could have 1 scheduler instances dedicated to those blocking jobs).

14:37:56.879407 running job1... 14:37:58.880925 job1 done.
14:37:58.881304 running job2... 14:38:00.884011 job2 done.
14:38:00.884240 running job3... 14:38:02.886557 job3 done.
14:38:03.190967 running job1... 14:38:05.196069 job1 done.
14:38:05.196322 running job2... 14:38:07.201458 job2 done.
14:38:07.201734 running job3... 14:38:09.203344 job3 done.
...

overlap: false doesn't drive the scheduler into a "queuing" mode. The routine is, if the job is already running, then let's not trigger it and let's reschedule it for the next occurrence.

I will update the readme to explain that a bit better.

@jmettraux thank you for updating the Readme, and pointing me to blocking: true. I think I actually do want to block.

You're welcome!

@jmettraux
Regarding the statement in the readme below

By default, jobs are triggered in their own, new thread. When :blocking => true, the job is triggered in the scheduler thread (a new thread is not created). Yes, while the job triggers, the scheduler is not scheduling.

If the scheduler isn't scheduling, is it possible to miss a scheduled job?
For instance, I have a job that runs every 5 minutes, takes 5-10 minutes to complete, and a job that runs every hour. If the other job is running at 12:00 when the hourly job should kick off, does the hourly job still start after the prior job completes?

I did a small amount of testing, and I can't quite determine what Rufus is doing to figure out what job runs next. Though, the one that runs every minute (in my test), does run about 1/minute.

s = Rufus::Scheduler.new
s.every(5, blocking: true) { puts "#{Time.now} job 1"; sleep 11; puts "#{Time.now} Job1 done" }
s.cron('* * * * *', blocking: true) { puts "#{Time.now} job 2"; sleep 11; puts "#{Time.now} Job2 done" }
2017-06-09 12:51:27 -0700 job 1
2017-06-09 12:51:38 -0700 Job1 done
2017-06-09 12:51:38 -0700 job 1
2017-06-09 12:51:49 -0700 Job1 done
2017-06-09 12:51:49 -0700 job 1
2017-06-09 12:52:00 -0700 Job1 done
2017-06-09 12:52:01 -0700 job 1
2017-06-09 12:52:12 -0700 Job1 done
2017-06-09 12:52:12 -0700 job 2
2017-06-09 12:52:23 -0700 Job2 done
2017-06-09 12:52:23 -0700 job 1
2017-06-09 12:52:34 -0700 Job1 done
2017-06-09 12:52:34 -0700 job 1
2017-06-09 12:52:45 -0700 Job1 done
2017-06-09 12:52:45 -0700 job 1
2017-06-09 12:52:56 -0700 Job1 done
2017-06-09 12:52:57 -0700 job 1
2017-06-09 12:53:08 -0700 Job1 done
2017-06-09 12:53:08 -0700 job 1
2017-06-09 12:53:19 -0700 Job1 done
2017-06-09 12:53:19 -0700 job 2
2017-06-09 12:53:30 -0700 Job2 done
2017-06-09 12:53:30 -0700 job 1
2017-06-09 12:53:41 -0700 Job1 done
2017-06-09 12:53:42 -0700 job 1
2017-06-09 12:53:53 -0700 Job1 done
2017-06-09 12:53:53 -0700 job 1
2017-06-09 12:54:04 -0700 Job1 done
2017-06-09 12:54:04 -0700 job 1
2017-06-09 12:54:15 -0700 Job1 done
2017-06-09 12:54:15 -0700 job 2
2017-06-09 12:54:26 -0700 Job2 done
2017-06-09 12:54:27 -0700 job 1

Hello,

I wrote that to help me remember/understand:

require 'rufus-scheduler'

ENV['TZ'] = 'America/Los_Angeles'
s = Rufus::Scheduler.new

def time(t=Time.now)
  "#{t.strftime('%H:%M:%S')}.#{sprintf('%06d', t.usec)}"
end

j = s.schedule_every('3s', blocking: true) do |j|
  puts "#{time} running blocking job1 #{j.object_id} #{j.count}"
  sleep 3
  puts "#{time} job1 done #{j.object_id} (next: #{time(j.next_time)})"
end
puts "#{time} job1 should trigger first at #{time(j.next_time)}"

j = s.schedule_every('4s', blocking: true) do |j|
  puts "#{time} running blocking job2 #{j.object_id} #{j.count}"
  sleep 2
  puts "#{time} job2 done #{j.object_id} (next: #{time(j.next_time)})"
end
puts "#{time} job2 should trigger first at #{time(j.next_time)}"

puts

s.join

It seems OK.

Maybe you're trying to give too much responsibility to rufus-scheduler. I don't know your setup, but you could queue the jobs with rufus-scheduler and have 1 consumer dequeuing and performing, in the queued order, that would give you the expected results (if I understood your aim correctly).

I think people with Resque or Sideqik schedule with rufus-scheduler jobs that get "queued" in those systems.

Behind the scenes rufus-scheduler has a single "workqueue", when non blocking, the consumer picks up successfully triggered tasks there and runs them in their own thread.

I could implement differentiated workqueues (something like queue: :q1) for jobs, but I'm wondering if it isn't going too far and turning rufus-scheduler into rufus-scheduler-and-job-queue.

ya, I think you are right. I should be scheduling jobs into a queue, then just pop them off the queue as they come in.