Prefect - Docker Compose

A simple guide to understand and make Prefect 2.x work with your own docker-compose configuration.

Interested in examples for Prefect 1.x? Switch to the last 1.x branch of this repo.

This allows you to package your Prefect instance for fully containerized environments (e.g. docker-compose, Kubernetes) or offline use.

Operating principle of Prefect

Run the server

  1. Optionally open and edit the server/.env file

    ℹ️ All PREFECT_* env variables can be found in the Prefect settings.py file.

  2. Start the server:

    cd server && docker-compose up --build -d && cd -
  3. Access the Orion UI at localhost:4200
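To confirm from a script that the server started, you can poll it before opening the UI. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the `/api/health` path is an assumption about the Orion API that is not stated in this guide; adjust it if your version differs.

```python
import http.client
import urllib.request


def health_url(base_url: str) -> str:
    """Build the health-check URL from the server base URL."""
    # Assumption: the API is served under /api and exposes a /health route.
    return base_url.rstrip("/") + "/api/health"


def server_is_up(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, HTTP error status, etc.
        return False
```

Calling `server_is_up("http://localhost:4200")` should return `True` once the containers are ready, and `False` while they are still starting.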

Run one or multiple agents

Agents are services that run your scheduled flows.

  1. Optionally open and edit the agent/docker-compose.yml file.

    ℹ️ In each docker-compose.yml, the PREFECT_API_URL env variable contains the IP address 172.17.0.1. This is the address of the host's Docker bridge interface (docker0), through which the published ports of your containers are reachable. It lets containers started from different docker-compose networks communicate. Change it if yours is different (check it by typing ip a | grep docker0).

    Docker interface IP

    Here, mine is 192.168.254.1, but the default is generally 172.17.0.1.

  2. Start the agent:

    docker-compose -f agent/docker-compose.yml up -d

    ℹ️ You can run the agent on a different machine from the one hosting the Prefect server. Edit the PREFECT_API_URL env variable accordingly.

    To start multiple agents, use the --scale option:

    docker-compose -f agent/docker-compose.yml up -d --scale agent=3 agent
  3. The agents now start polling the Orion server on the flows-example-queue work queue (see the --work-queue option).
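If you need to work out the PREFECT_API_URL value for your own machine, the docker0 lookup described in step 1 can be scripted. This is a rough sketch: the function names are hypothetical, the sample text imitates typical `ip a | grep docker0` output, and the `:4200/api` suffix is an assumption based on the UI port used in this guide.

```python
import re
from typing import Optional


def docker0_ipv4(ip_output: str) -> Optional[str]:
    """Extract the IPv4 address of docker0 from `ip a | grep docker0` output."""
    pattern = r"\binet (\d{1,3}(?:\.\d{1,3}){3})/\d+.*\bdocker0\b"
    for line in ip_output.splitlines():
        match = re.search(pattern, line)
        if match:
            return match.group(1)
    return None


def prefect_api_url(host_ip: str, port: int = 4200) -> str:
    """Build the API URL that agents and clients should point to."""
    # Assumption: the API is published on the UI port under /api.
    return f"http://{host_ip}:{port}/api"


# Sample output, shortened; your interface flags will differ.
sample = """\
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
"""

ip = docker0_ipv4(sample)
if ip is not None:
    print(prefect_api_url(ip))  # prints: http://172.17.0.1:4200/api
```

On a real host, you would feed the function the output of the `ip` command instead of the sample string.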

Run your first flow via the Prefect API

Principles to understand

💬 Execution in your cloud; orchestration in ours

This means the Prefect server never stores your code. It only orchestrates its execution (and, optionally, its scheduling).

  1. After you develop your flow, you register it with the Orion server through a Deployment. In the deployment script, you can ask the server to run your flow 3 times a day, for example.

  2. Your code never resides on the Prefect server: it has to be stored somewhere accessible to the agents in order to be executed.

    Prefect has many storage options, but the best known are Local, S3, Docker and git.

    • Local: saves the flows to be run on disk. The volume where you save the flows must therefore be shared between your client and your agent(s). Your agent must also have the same environment as your client (Python modules, installed packages, etc.; the same Dockerfile if your agent and client are containers).
    • S3: similar to Local, but saves the flows to be run as S3 objects.
    • Docker: saves the flows to be run as Docker images in your Docker Registry so your agents can easily run the code.
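The split between orchestration and execution can be illustrated with a toy model. This is not Prefect's API: `DeploymentRecord`, `ToyServer` and `ToyAgent` are hypothetical names invented for this sketch, and a temporary directory stands in for the shared Local storage. The point is that the server only keeps metadata, while the agent loads the code from storage and runs it.

```python
import runpy
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict


@dataclass(frozen=True)
class DeploymentRecord:
    """What the toy server stores: metadata only, never source code."""
    name: str
    storage_path: str  # location the agents can read the code from
    entrypoint: str    # file to execute inside that location


class ToyServer:
    """Keeps a registry of deployments, keyed by name."""

    def __init__(self) -> None:
        self.deployments: Dict[str, DeploymentRecord] = {}

    def register(self, record: DeploymentRecord) -> None:
        self.deployments[record.name] = record


class ToyAgent:
    """Looks up a deployment, then fetches and runs the code itself."""

    def run(self, server: ToyServer, name: str) -> Dict[str, Any]:
        record = server.deployments[name]
        script = Path(record.storage_path) / record.entrypoint
        # The agent, not the server, reads and executes the file.
        return runpy.run_path(str(script))


with tempfile.TemporaryDirectory() as shared_volume:
    # The "client" writes the flow code to storage shared with the agent.
    Path(shared_volume, "weather.py").write_text("result = 'sunny'\n")
    server = ToyServer()
    server.register(DeploymentRecord("weather", shared_volume, "weather.py"))
    namespace = ToyAgent().run(server, "weather")
    print(namespace["result"])  # prints: sunny
```

If the agent could not read `shared_volume`, the run would fail even though the server knows about the deployment, which is why each storage option below is about making the code reachable by the agents.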

Flow with Local storage (easiest)

ℹ️ If your agents are installed across multiple machines, I recommend mounting a shared directory with SSHFS.

  1. Run the following command to register your deployment and run the flow:

    docker-compose -f client/docker-compose.yml up # Executes weather.py
  2. Access the UI to check that your flow ran correctly

Flow with S3 Storage (recommended)

Tutorial for S3 Storage

We will use MinIO as our S3 server.

  1. Optionally open and edit the client_s3/.env file, then start MinIO:

    docker-compose -f client_s3/docker-compose.yml up -d minio # Starts MinIO
  2. Register the flow:

    docker-compose -f client_s3/docker-compose.yml up weather # Executes weather.py

Now your flow is registered. You can access the UI to run it.

Flow with Docker storage

This method requires both the client and the agent containers to have access to Docker so they can package or load the image in which the flow will be executed. We use Docker in Docker for that.

Tutorial for (secure) Docker Storage

Preparing the Registry

A Docker Registry is needed to store the images that our agents will use.

  1. Generate the authentication credentials for our registry

    sudo apt install apache2-utils # required to generate basic_auth credentials
    cd client_docker/registry/auth && htpasswd -B -c .htpasswd myusername && cd -

    To add more users, re-run the previous command without the -c option

  2. Start the registry

    docker-compose -f client_docker/docker-compose.yml up -d registry
  3. Allow your Docker daemon to push to the registry

    You need to allow your Docker daemon to push to this registry. Insert this in your /etc/docker/daemon.json (create it if needed), then restart the Docker daemon so the change takes effect:

    {
      "insecure-registries": ["172.17.0.1:5000"]
    }
  4. Log in to the registry

    docker login http://172.17.0.1:5000 # with myusername and the password you typed

    You should see: Login Succeeded
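If /etc/docker/daemon.json already exists, overwriting it with the snippet from the daemon.json step would drop your other settings. The sketch below shows one way to merge the entry instead. The function name is hypothetical and the `log-driver` entry is only an example of a pre-existing setting; reading and writing the real file (which requires root) is left out on purpose.

```python
import json
from typing import Any, Dict


def add_insecure_registry(config: Dict[str, Any], registry: str) -> Dict[str, Any]:
    """Return a copy of a daemon.json mapping with `registry` listed as insecure."""
    updated = dict(config)
    registries = list(updated.get("insecure-registries", []))
    if registry not in registries:  # keep the operation idempotent
        registries.append(registry)
    updated["insecure-registries"] = registries
    return updated


# Hypothetical existing content of daemon.json.
existing = json.loads('{"log-driver": "json-file"}')
merged = add_insecure_registry(existing, "172.17.0.1:5000")
print(json.dumps(merged, indent=2))
```

The function leaves the input mapping untouched, so you can compare `existing` and `merged` before writing anything back to disk.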

Start the Docker in Docker agent

Optionally edit the registry credentials in ./agent_docker/docker-compose.yml and run:

docker-compose -f agent_docker/docker-compose.yml up --build -d

Registering the flow

We're going to push our Docker image with Python dependencies and register our flow.

  1. Build, tag and push the image

    docker build . -f ./client_docker/execution.Dockerfile -t 172.17.0.1:5000/weather/base_image:latest

    You must prefix your image name with the registry address 172.17.0.1:5000 to push it

    docker push 172.17.0.1:5000/weather/base_image:latest
  2. Register the flow

    Optionally edit the registry credentials in ./client_docker/docker-compose.yml and run:

    docker-compose -f ./client_docker/docker-compose.yml up --build weather

Now your flow is registered. You can access the UI to run it.
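Note that the docker login command uses a URL with a scheme (http://172.17.0.1:5000), while the image name in the build and push commands uses the bare host and port. A small helper, hypothetical and for illustration only, makes the naming rule explicit:

```python
def registry_image(registry: str, repository: str, tag: str = "latest") -> str:
    """Build a registry-prefixed image name such as host:port/repo:tag."""
    # Image names must not contain a URL scheme, so strip it if present.
    host = registry.split("://", 1)[-1].rstrip("/")
    return f"{host}/{repository.strip('/')}:{tag}"


print(registry_image("http://172.17.0.1:5000", "weather/base_image"))
# prints: 172.17.0.1:5000/weather/base_image:latest
```

The result matches the name passed to `-t` in the build step and to `docker push`.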