/docker-test

Proof of concept with docker-compose and containers

Primary LanguageDockerfile

README

This is a toy project to illustrate how to use Dockerfile and docker-compose in a project with typical data science constraints. What are these?

  • I need to deploy several versions of the same models, with different parameters but same data, and different persistence.
  • All models (containers) will read from the same data directory.
  • The Software stack is the same, but each run is different depending on the parameters they're configured to run with.
  • The output from each model needs to be persisted, sometimes. In this case, this is inference time, so we don't need it. Our output directory will be inside the docker and will be destroyed once the docker ends.

So, this is how our project structure looks:

docker-test
  ├── Dockerfile
  ├── README.md
  ├── data                -> ~/docker-test/data
  │   └── file.csv
  ├── docker-compose.yml
  ├── models              -> ~/docker-test/models
  │   └── model1.json
  ├── module1
  │   └── entry1.py
  ├── module2
  │   └── entry2.py
  ├── output
  ├── pipeline.sh
  └── requirements.txt

The image is built by calling (please, clone this repo, first!):

$ docker-compose build

and then executed by calling:

$ docker-compose up

The contents of the Dockerfile, establishes that everything will run in a directory called /tester inside the docker. Within this directory, some of the sub-directories (bold) are Volumes pointing to host filesystem folders, and some others will be created inside the image (we don't care if we loose their contents).

The contents of the docker-compose.yml:

version: '3'
services:
  tester:
    build: .
    image: tester:0.1.0
    volumes:
        - ./data:/tester/data
        - ./models:/tester/models

As seen, all contents from the root directory are copied inside the container image, BUT the directories data and models will be volumes mounted over the host filesystem, referencing therefore, the actual files in the host, not in the container.

Now, you can change the contents of data or models from your computer (host) and everytime you fire up the container, those changes will be reflected in your execution. At the same time, changes in code, in you machine will never affect your docker, unless you rebuild the image.