YouTube instructions:
ENG: https://youtu.be/pTk82vkhsMc
RUS: https://youtu.be/xf58pNYhRss
This repo is inspired by the book Docker for Data Science. It provides a Docker image with a data science environment based on jupyter/datascience-notebook, with pandas, matplotlib, scipy, seaborn, and scikit-learn pre-installed.
- Copy this repo
Create a new directory, cd into it, and then run
git init
git pull https://github.com/glebmikha/data-science-project-template.git
Or you can just download it as a zip and use it without git.
- Add your favorite Python modules to ./docker/jupyter/requirements.txt. For example:
xgboost
tensorflow==1.6.0
Or run pip install directly in a Jupyter notebook cell (don't forget the ! in front of the command):
!pip install your_package
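After rebuilding the image or installing from a notebook, it can help to confirm which version actually ended up in the environment. The following is a minimal sketch, assuming the kernel runs Python 3.8 or newer (needed for importlib.metadata); the helper name installed_version is hypothetical and the package names are just the examples from the requirements above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Package names below are only examples taken from the requirements above.
for name in ("xgboost", "tensorflow"):
    print(name, installed_version(name) or "not installed")
```

If a package shows up as not installed, check that it is listed in ./docker/jupyter/requirements.txt and that the image was rebuilt afterwards.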
- Start containers
docker-compose up
- Copy the Jupyter URL from the terminal and open it in your browser.
- Find the examples.ipynb notebook in the ipynb folder, then create your own notebooks.
- Copy your data into ./data and read it in Jupyter. You can also upload data into PostgreSQL, which runs in its own container alongside Jupyter (see the examples notebook for details).
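To see which files a notebook can reach after copying data into ./data, a small helper like the one below may be useful. It is only a sketch: the function name find_data_files is hypothetical, and the "../data" path assumes the notebook's working directory sits next to the data folder, which depends on how the volumes are mounted in your setup.

```python
from pathlib import Path


def find_data_files(data_dir, pattern="*.csv"):
    """Return a sorted list of files in data_dir matching pattern."""
    root = Path(data_dir)
    if not root.is_dir():
        # A missing folder yields an empty list instead of an error.
        return []
    return sorted(p for p in root.glob(pattern) if p.is_file())


# Assumption: adjust "../data" to wherever ./data is mounted in your container.
for path in find_data_files("../data"):
    print(path.name, path.stat().st_size, "bytes")
```

Each returned path can then be passed to a reader such as pandas.read_csv.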
- Stop containers
docker-compose down
- Update images
docker-compose build --pull
- Clean up Docker's mess (remove dangling images)
docker rmi -f $(docker images -qf dangling=true)
Sometimes it is useful to remove all unused Docker data (stopped containers, unused networks, dangling images, and build cache):
docker system prune