closeio/tasktiger

Optimize scheduled task queueing

Closed this issue · 3 comments

When there are many queues, _worker_queue_scheduled_tasks loops through all the scheduled queues after every worker run, which causes a lot of Redis overhead.

Related code:

tasktiger/tasktiger/worker.py

Lines 1017 to 1023 in e926d78

if (
    time.time() - self._last_task_check > self.config['SELECT_TIMEOUT']
    and not self._stop_requested
):
    self._worker_queue_scheduled_tasks()
    self._worker_queue_expired_tasks()
    self._last_task_check = time.time()

Some ideas:

  • Provide an argument to specify the number of expected workers and use something like self.config['SELECT_TIMEOUT'] * self.num_workers as the interval, so that the frequency of the check scales with how many workers will be calling it.
  • Use LimitLion to control how often workers run the related code.
  • Use a Redis SETNX call that creates a key that expires after SELECT_TIMEOUT. The worker that wins and sets that key makes the related calls. This needs a concept of a worker group name so we can have one key per group of related workers (the LimitLion approach needs this as well).

For the SETNX or LimitLion option we could probably just have a key/throttle per root queue and not need a group name.
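A minimal sketch of the SETNX idea, to make the "worker that wins" part concrete. This is not tasktiger code: the function name should_queue_scheduled_tasks, the key prefix, and the group argument are all hypothetical. It assumes a redis-py style client whose set() accepts nx and px; the FakeRedis class is only an in-memory stand-in so the example runs without a server.

```python
import time


class FakeRedis:
    """In-memory stand-in that implements only SET with NX/PX (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry = {}  # key -> absolute expiry time in seconds

    def set(self, name, value, nx=False, px=None):
        now = self._clock()
        expires_at = self._expiry.get(name)
        live = expires_at is not None and expires_at > now
        if nx and live:
            return None  # redis-py also returns None when NX fails
        self._expiry[name] = float("inf") if px is None else now + px / 1000.0
        return True


def should_queue_scheduled_tasks(conn, group, select_timeout):
    """Return True for the single worker per interval that wins the key.

    `group` could be a worker group name or a root queue name.
    The key prefix below is a hypothetical choice.
    """
    key = "t:scheduled_check:" + group
    # SET key 1 NX PX <ms>: only succeeds if the key does not exist yet,
    # and the key expires on its own after select_timeout seconds.
    ttl_ms = int(select_timeout * 1000)
    return bool(conn.set(key, "1", nx=True, px=ttl_ms))


if __name__ == "__main__":
    conn = FakeRedis()
    for worker in ("worker-1", "worker-2", "worker-3"):
        if should_queue_scheduled_tasks(conn, "default", 1):
            print(worker, "runs the scheduled/expired task checks")
        else:
            print(worker, "skips the checks this interval")
```

In the worker, a check like this could guard the calls to _worker_queue_scheduled_tasks and _worker_queue_expired_tasks in place of the per-worker _last_task_check comparison. With one key per root queue, no separate group name would be needed.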

@thomasst Should we close this? I don't think we have any further optimizations planned.