Docker-pipeline
A tool for pipelining data processing. It is built on top of Docker infrastructure: each worker is a Docker container. Inside a container it uses Linux pipes to transfer data between the network layer and your worker. The project is at a very early stage, but it has working samples.
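To illustrate the pipe-based hand-off described above, here is a minimal sketch in which a parent process (standing in for the network layer) sends records to a child worker over OS pipes and reads the results back. The names `WORKER_SOURCE` and `process_through_pipes`, and the one-record-per-line framing, are assumptions made for this example. They are not docker-pipeline's actual wrapper or wire format.

```python
import subprocess
import sys

# Hypothetical worker: reads one record per line from stdin and writes
# the record's length to stdout. Not the project's real wrapper.
WORKER_SOURCE = """
import sys
for line in sys.stdin:
    record = line.rstrip("\\n")
    sys.stdout.write(str(len(record)) + "\\n")
"""


def process_through_pipes(records):
    """Send records to a child process over pipes and collect its replies."""
    child = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # communicate() writes all input, closes stdin, and reads until EOF.
    payload = "".join(record + "\n" for record in records)
    output, _ = child.communicate(payload)
    return output.splitlines()


if __name__ == "__main__":
    print(process_through_pipes(["abc", "hello"]))
```

Because the worker only sees stdin and stdout, it does not need to know anything about the network, which is what makes wrappers for other languages straightforward.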
1. Write processing scripts
Currently docker-pipeline has a worker wrapper implementation for Python only, but it's easy to write worker wrappers for other languages.
#!/usr/bin/python
from pipelining import run
def job(data):
    # your processing here, for example you can return length of data
    return [data, len(data)]
run(job)
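Before building an image, it can help to exercise a job function on its own. The sketch below does that with a hypothetical `run_local` helper that reads one record per line from a stream and writes one JSON line per result. `run_local`, the line framing, and the JSON output are assumptions made for local testing. They do not describe how `pipelining.run` actually works.

```python
import io
import json


def job(data):
    # Same example as above: return the input together with its length.
    return [data, len(data)]


def run_local(job_func, in_stream, out_stream):
    """Hypothetical stand-in for run(): one input line in, one JSON line out."""
    for line in in_stream:
        result = job_func(line.rstrip("\n"))
        out_stream.write(json.dumps(result) + "\n")


if __name__ == "__main__":
    out = io.StringIO()
    run_local(job, io.StringIO("abc\nhello\n"), out)
    print(out.getvalue(), end="")
```

Keeping the job a plain function of its input makes it easy to test this way, independent of containers and the network layer.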
2. Build base images
$ ./rebuild_images.sh
3. Build your own workers
Use docker-pipeline-worker or docker-pipeline-source as the base image. The only difference is that a source should generate the data and the appropriate header; you can find an example in pipelining/simple_test_source.py
# or docker-pipeline-source
FROM docker-pipeline-worker
ADD {add your executable here}
4. Describe your topology as a docker-compose file
version: '2'
services:
  source:
    image: your_source
    restart: always
  processing:
    image: your_worker
    restart: always
  another_processing:
    image: your_another_worker
    restart: always
5. Run!!
$ docker-compose -f docker-compose-base.yml -f your-docker-compose.yml up -d
6. Scale
$ docker-compose scale processing=3 another_processing=4
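If you scale from a script, the command line can be assembled programmatically. The helper below is a sketch: `build_scale_command` is a hypothetical name, and passing the same `-f` files used in the run step is an assumption about how your project is laid out. It only builds the argument list and does not invoke Docker.

```python
def build_scale_command(replicas, compose_files=()):
    """Build a docker-compose scale argument list from {service: count}."""
    command = ["docker-compose"]
    # Optionally point at the same compose files used to start the stack.
    for path in compose_files:
        command += ["-f", path]
    command.append("scale")
    for name, count in replicas.items():
        if not isinstance(count, int) or count < 0:
            raise ValueError("replica count must be a non-negative integer")
        command.append("{}={}".format(name, count))
    return command


if __name__ == "__main__":
    print(" ".join(build_scale_command(
        {"processing": 3, "another_processing": 4})))
```

The returned list can be handed to `subprocess.run` as-is, which avoids shell quoting issues with service names.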