celery - Distributed Task Queue

Version:	2.4.0a1
Web:	http://celeryproject.org/
Download:	http://pypi.python.org/pypi/celery/
Source:	http://github.com/ask/celery/
Keywords:	task queue, job queue, asynchronous, rabbitmq, amqp, redis, python, webhooks, queue, distributed

Celery is an open source asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

The execution units, called tasks, are executed concurrently on one or more worker nodes using multiprocessing, Eventlet or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).

Celery is used in production systems to process millions of tasks a day.

Celery is written in Python, but the protocol can be implemented in any language. It can also operate with other languages using webhooks.

The recommended message broker is RabbitMQ, but limited support for Redis, Beanstalk, MongoDB, CouchDB and databases (using SQLAlchemy or the Django ORM) is also available.

Celery is easy to integrate with Django, Pylons and Flask, using the django-celery, celery-pylons and Flask-Celery add-on packages.

Overview
Example
Features
Documentation
Installation
- Downloading and installing from source
- Using the development version
Getting Help
- Mailing list
- IRC
Bug tracker
Wiki
Contributing
License

Overview

This is a high level overview of the architecture.

The broker delivers tasks to the worker nodes. A worker node is a networked machine running celeryd. This can be one or more machines depending on the workload.

The result of the task can be stored for later retrieval (called its "tombstone").
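To make the flow above concrete, here is a rough standard-library sketch of the same idea: a queue stands in for the broker, two threads stand in for worker nodes, and a dictionary stands in for the result store. None of the names below (`send_task`, `worker`, `results`) are Celery APIs; they are made up for illustration only, and a real deployment uses a message broker and celeryd instead.

```python
import queue
import threading
import uuid

broker = queue.Queue()          # stands in for the message broker
results = {}                    # stands in for the result store
results_lock = threading.Lock()


def add(x, y):
    return x + y


def send_task(func, *args):
    """Publish a task message to the broker and return its id (hypothetical helper)."""
    task_id = str(uuid.uuid4())
    broker.put((task_id, func, args))
    return task_id


def worker():
    """Consume task messages until a None sentinel arrives."""
    while True:
        message = broker.get()
        if message is None:
            break
        task_id, func, args = message
        value = func(*args)
        with results_lock:
            results[task_id] = value   # store the result for later retrieval


# Start two "worker nodes".
threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()

task_id = send_task(add, 4, 4)

# Shut the workers down: one sentinel per worker, queued after the task.
for _ in threads:
    broker.put(None)
for t in threads:
    t.join()

print(results[task_id])
```

The caller only holds on to the task id; the result is looked up in the store once a worker has finished.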

Example

You probably want to see some code by now, so here's an example task adding two numbers:

from celery.task import task

@task
def add(x, y):
    return x + y

You can execute the task in the background, or wait for it to finish:

>>> result = add.delay(4, 4)
>>> result.wait() # wait for and return the result
8

Simple!

Features

Messaging Supported brokers include RabbitMQ, Redis, Beanstalk, MongoDB, CouchDB, and popular SQL databases.

Fault-tolerant Excellent configurable error recovery when using RabbitMQ ensures your tasks are never lost.

Distributed Runs on one or more machines. Supports broker clustering and HA when used in combination with RabbitMQ. You can set up new workers without central configuration (e.g. use your grandma's laptop to help if the queue is temporarily congested).

Concurrency Concurrency is achieved by using multiprocessing, Eventlet, gevent or a mix of these.

Scheduling Supports recurring tasks like cron, or specifying an exact date or countdown for when the task should be executed.

Latency Low latency means you are able to execute tasks while the user is waiting.

Return Values Task return values can be saved to the selected result store backend. You can wait for the result, retrieve it later, or ignore it.

Result Stores Database, MongoDB, Redis, Tokyo Tyrant, Cassandra, or AMQP (message notification).

Webhooks Your tasks can also be HTTP callbacks, enabling cross-language communication.

Rate limiting Supports rate limiting by using the token bucket algorithm, which accounts for bursts of traffic. Rate limits can be set for each task type, or globally for all.

Routing Using AMQP's flexible routing model you can route tasks to different workers, or select different message topologies, by configuration or even at runtime.

Remote-control Worker nodes can be controlled remotely using broadcast messaging. A range of built-in commands exists, and you can easily define your own. (AMQP/Redis only)

Monitoring You can capture everything happening with the workers in real-time by subscribing to events. A real-time web monitor is in development.

Serialization Supports Pickle, JSON, YAML, or easily defined custom schemes. One task invocation can have a different scheme than another.

Tracebacks Errors and tracebacks are stored and can be investigated after the fact.

UUID Every task has a UUID (Universally Unique Identifier), which is the task id used to query task status and return value.

Retries Tasks can be retried if they fail, with configurable maximum number of retries, and delays between each retry.

Task Sets A task set is a task consisting of several sub-tasks. You can find out how many, or whether all, of the sub-tasks have been executed, and even retrieve the results in order. Progress bars, anyone?

Made for Web You can query status and results via URLs, which lets you poll task status using Ajax.

Error Emails Can be configured to send emails to the administrators when tasks fail.
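The token bucket algorithm mentioned under Rate limiting can be sketched in a few lines. This is an illustrative stand-alone version, not Celery's own implementation; the `TokenBucket` class, its parameters and the injected `clock` are assumptions made for the example. Tokens refill at a fixed rate up to a maximum capacity, so short bursts are allowed while the long-run rate stays bounded.

```python
import time


class TokenBucket:
    """Illustrative token bucket (hypothetical class, not part of Celery)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)    # start full so an initial burst is allowed
        self.updated = clock()

    def consume(self, n=1):
        """Take n tokens if available; return True on success, False if throttled."""
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at the capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=3, clock=lambda: fake_now[0])

burst = [bucket.consume() for _ in range(4)]       # burst of 3 allowed, 4th throttled
fake_now[0] = 1.0                                  # one second later: 2 tokens refilled
after_wait = [bucket.consume() for _ in range(3)]  # 2 allowed, 3rd throttled

print(burst)       # [True, True, True, False]
print(after_wait)  # [True, True, False]
```

A throttled task would typically be held back and tried again once enough tokens have accumulated, rather than dropped.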

Documentation

The latest documentation with user guides, tutorials and API reference is hosted at GitHub.

Installation

You can install Celery either via the Python Package Index (PyPI) or from source.

To install using pip:

$ pip install Celery

To install using easy_install:

$ easy_install Celery

Downloading and installing from source

Download the latest version of Celery from http://pypi.python.org/pypi/celery/

You can install it by doing the following:

$ tar xvfz celery-0.0.0.tar.gz
$ cd celery-0.0.0
$ python setup.py build
# python setup.py install # as root

Using the development version

You can clone the repository by doing the following:

$ git clone git://github.com/ask/celery.git

Be sure to also read the Contributing to Celery section in the documentation.

License

This software is licensed under the New BSD License. See the LICENSE file in the top distribution directory for the full license text.
