Celery distributed processing with Docker. This role is considered developmental.
See meta/main.yml
This role requires Docker to be running on the target hosts. Celery also requires a message broker such as Redis, either on the same host or on a remote server.
For example, you can use the openmicroscopy.docker and openmicroscopy.redis roles to set them up; see the example playbook.
All variables are optional:
- `celery_docker_broker_url`: URL of the broker, e.g. `redis://HOST:PORT/DB`, default is Redis on localhost
- `celery_docker_log_level`: Celery worker log level, default `DEBUG`
- `celery_docker_concurrency`: number of concurrent tasks, default is the number of CPUs
- `celery_docker_opts`: additional options for the Celery worker
- `celery_docker_max_retries`: maximum number of times to retry if the `docker` command fails, default `3`
- `celery_docker_retry_delay`: delay (seconds) before retrying a failed task, default `10`
- `celery_docker_store_tasks_hours`: store completed tasks in the broker for this number of hours, default `384` (16 days)
- `celery_docker_systemd_timeout`: if a Celery worker is stopped with `systemctl stop celery-worker`, wait this number of seconds before killing the worker (this will kill any tasks still in progress), default `7200` (2 hours)
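For example, these defaults can be overridden in a playbook or in group vars. The values below are illustrative only, not recommendations:

```yaml
# Illustrative overrides of the role's optional variables
celery_docker_broker_url: redis://:PASSWORD@redis.example.org:6379/0
celery_docker_log_level: INFO
celery_docker_concurrency: 4
celery_docker_max_retries: 5
celery_docker_retry_delay: 30
```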
This role creates a celery worker that launches docker containers.
At present a single task is defined: `run_docker`.
See celery-worker-tasks.py `run_docker` for a description of the parameters.
Tasks can be submitted in two ways:
- Submit a `run_docker` task with arguments passed as a dictionary. For an example of this see celery-submit-example.py.
- Run the tasks file directly (a `main` function is included):

  ```
  /opt/celery/venv/bin/python /opt/celery/worker/tasks.py --help
  /opt/celery/venv/bin/python /opt/celery/worker/tasks.py \
      --inputpath /celery/in --outputpath /celery/out --out /celery/output.log \
      busybox -- sh -c 'date > /output/date.txt'
  ```
```yaml
- hosts: localhost
  roles:
    - role: openmicroscopy.docker
    - role: openmicroscopy.redis
    - role: openmicroscopy.celery-docker
```
You can set a password and other configuration options when starting Redis (`--requirepass`), or in the Redis configuration file.
All communication is in plain text, so this does not guard against network sniffing.
Redis can write its data to disk and reload it on startup (`--appendonly yes`) instead of running as an in-memory database.
For example, if you are running Redis in Docker:
```
docker run -d --name redis -p 6379:6379 -v redis-volume:/data redis \
    --requirepass PASSWORD --appendonly yes
```
Then set `celery_docker_broker_url: redis://:PASSWORD@redis.example.org:6379`.
The default Celery configuration is designed for handling short tasks. Several changes can improve the processing of long-running tasks; see the Celery configuration and optimising docs.
For example, use the `-Ofair` option: `celery_docker_opts: -Ofair`
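Putting the long-running-task settings together, a group-vars sketch using this role's variables might look like the following. All values are illustrative, not recommendations:

```yaml
# Illustrative tuning for long-running tasks
celery_docker_opts: -Ofair            # hand tasks to workers only when they are free
celery_docker_systemd_timeout: 14400  # allow 4 hours for in-progress tasks on shutdown
celery_docker_retry_delay: 60         # wait longer before retrying a failed docker command
```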