/pyetcdq

POC of a distributed task queue framework built on top of etcd

Primary LanguagePython

A worker queue implements with Etcd and Python

This is a POC of a distributed task queue framework built on top of etcd. The goal of this is to provide a reliable and scalalbe framework to build distributed task execution.

Approach

Task submissions:

  • Task submission is done by by creating a folder in tasks/data.
  • A file is put into queues folder to enqueue the task

Task processing:

  • Worker scans for pending files in queues folder
  • If there is no pending task, then it starts to watch for change
    • When there is new task, or there is change in states. Worker will try to acquire for the job by creating a lock filem, since this is an atomic action, only one will get the job
    • Worker will create a running file to indicate the task is running.
    • Worker executes the task and update status to success when it is done.
    • During the execution, the worker must periodically refresh the ttl
    • On failure, the worker should remove it locks to the file so other workers can acquire the task.

Questions

  • How to handle failures?
  • How to handle "round-robin" behavior?
  • How to cancel a task?: use process interupt
  • Capacity-based worker
  • Task routing
  • Task dependencies
  • Task priorities
  • Task retries
  • Error handling
  • Task communication

References