Docker for Data Science

Background

This tutorial shows you how to integrate Docker into your data science workflow. Docker is an open-source tool that makes it easy to build, deploy, and run applications in containers. Docker can make your life easier if you do any of the following:

share and reproduce your analysis
run large scale data cleaning tasks
build dashboards and publish models

Getting Started

Clone the repo to your machine:

git clone https://github.com/harnav/pydata-docker-tutorial.git

This tutorial covers three topics:

Running a container
Reproducible environments
Deploying models
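
To make the "reproducible environments" topic concrete, here is a minimal sketch of a Dockerfile that pins a Python version and installs dependencies from a file. It is an illustration only, not taken from the tutorial repo: the base image tag, `requirements.txt`, and `analysis.py` are assumed names, so substitute whatever your project actually uses.

```dockerfile
# Assumed base image: an official slim Python image with a pinned version
FROM python:3.11-slim

# Work inside a dedicated directory in the container
WORKDIR /app

# Copy the dependency list first so this layer is cached between builds
# (requirements.txt is a hypothetical file name)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the project into the image
COPY . .

# Hypothetical entry point: run an analysis script when the container starts
CMD ["python", "analysis.py"]
```

With a file like this in the project root, `docker build -t my-analysis .` would build the image and `docker run --rm my-analysis` would run it; `my-analysis` is a placeholder tag.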

References

For more detailed instructions, check out:

How Docker Can Help you Become a More Effective Data Scientist
Reproducible Data Science: Docker for Data Science
Docker Labs
