A custom Virtual Machine for Data Science running JupyterHub for multi-tenant Jupyter notebooks. This image can be used stand-alone/locally (CPU or GPU) or deployed as part of the Ubuntu Azure Data Science Virtual Machine (CPU or GPU) to add custom functionality.
The main purpose of this VM is a specialized setup for computer vision tasks.
The ways in which this repo can be used:
- Local Deployment (GPU/CPU) with a prebuilt image
- Cloud Deployment to Azure
- Build from a dockerfile (GPU/CPU)
The Cloud Deployment runs on top of Azure's Ubuntu Data Science Virtual Machine as a separate instance of JupyterHub, which is multi-tenant and more customized for computer vision tasks.
Python 3.6
Deep Learning:
- TENSORFLOW_VERSION="1.12.0"
- KERAS_VERSION="2.2.4"
- PYTORCH_VERSION="1.0"
- TORCHVISION_VERSION="0.2.1"
Azure:
- Azure CLI
- Azure ML Python SDK
- Azure Image Search Python SDK
- Azure Custom Vision Python SDK
Computer Vision Related:
- OpenCV
- Scikit-Image
- Image augmentation library - imgaug
- Shapely for spatial analysis
- SimpleCV; can even hook up to a webcam, etc.
- Dask for out-of-core (larger-than-memory) computation, e.g. digit classification
- and more (see requirements.txt)
Other:
- JupyterHub for interacting with these components
- TensorFlow Probability
- 5 users: wonderwoman, user1, user2, user3, user4
- Password is the one used to build the image. The default is "Python3!".
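As a quick sanity check inside the container (or any environment claiming these libraries), the sketch below tries importing a few of the computer vision packages listed above and reports which resolve. The import names (e.g. OpenCV as `cv2`, Scikit-Image as `skimage`) are the usual ones but are stated here as assumptions:

```shell
# Try importing a few of the listed libraries; print ok/missing for each.
# Import names are assumptions (OpenCV imports as cv2, Scikit-Image as skimage).
for pkg in cv2 skimage imgaug shapely dask; do
  if python3 -c "import $pkg" 2>/dev/null; then
    echo "$pkg: ok"
  else
    echo "$pkg: missing"
  fi
done
```

Run this from a terminal launched inside JupyterHub to confirm the environment is intact.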
See the ARM templates (azuredeploy.json and azuredeploy.parameters.json) for the specs on deploying to Azure.
Run the docker image locally:
- Ensure you have Docker installed (Docker for Windows or Docker for Mac are recommended)
- For CPU, run the following `docker run` command at a command prompt (you may need `sudo` for elevated privileges on Unix systems; for a command prompt in Windows, search for "cmd"):

```shell
docker run -it -v /var/run/docker.sock:/var/run/docker.sock -p 8788:8788 --ipc=host --expose=8788 rheartpython/cvdeep:latest
```

Or, to run detached with a shared folder and additional ports mapped:

```shell
docker run -d -p 5555:5555 -p 80:7842 -p 8788:8788 -v ~/dev/:/root/sharedfolder --ipc=host --expose=8788 rheartpython/cvdeep:latest
```
-
- For an Nvidia GPU with CUDA 9.0/cuDNN 7 (Linux only, as Windows does not yet support exposing a system Nvidia GPU in Docker):

```shell
sudo nvidia-docker run -it -v /var/run/docker.sock:/var/run/docker.sock -p 8788:8788 --expose=8788 rheartpython/cvdeep_gpu:latest
```
- Log in to JupyterHub at https://0.0.0.0:8788 or https://localhost:8788 (note the use of `https`) with the user `wonderwoman` and the system-variable password you used when building the image (the default specified above). You should also get an Admin panel for making other users Admins so they can `pip install` packages.
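Once the container is running, JupyterHub can take a little while to start answering. A small polling helper like the sketch below can confirm the endpoint is reachable; `wait_for_hub` is a hypothetical name, it assumes `curl` is installed, and `-k` skips verification of the self-signed certificate:

```shell
# Poll the JupyterHub URL until it responds, or give up after N tries.
wait_for_hub() {
  url="$1"; tries="${2:-10}"
  i=1
  while [ "$i" -le "$tries" ]; do
    # -k: accept the self-signed cert; -s: silent; discard the body.
    if curl -ks -o /dev/null "$url"; then
      echo "JupyterHub is up at $url"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "No response from $url after $tries tries"
  return 1
}

wait_for_hub https://localhost:8788 3 || true
```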
You can click the "Deploy to Azure" button to try out the Ubuntu Data Science Virtual Machine with this image running on it (Azure subscription required; hardware compute fees apply; a Free Trial is available for new customers).
IMPORTANT NOTE: Before you use the Deploy to Azure button, you must perform a one-time task to accept the terms of the Data Science Virtual Machine on your Azure subscription. You can do this by visiting Configure Programmatic Deployment.
| Description | Deploy Here |
| --- | --- |
| CPU DSVM Version | |
| GPU DSVM Version (PyTorch 1.0) | |
- Log in to JupyterHub at `https://<ip or dns name>:8788` (note the use of `https` and port `8788`) with the user `wonderwoman` and the system-variable password you used when building the image (the default specified above). You should also get an Admin panel for making the other users Admins so they can `pip install` packages.
- Note: the DSVM already runs its own JupyterHub on port `8000` if you are interested. Check the Azure Portal or Azure Docs for more information on what it contains.
Create the docker image:
- In the `secrets` folder, create certificate and key files with `openssl` and name them `jupyterhub.crt` and `jupyterhub.key`, respectively.
- To create these:

```shell
openssl req -new -newkey rsa:2048 -nodes -keyout jupyterhub.key -x509 -days 365 -out jupyterhub.crt
```
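If you'd rather not answer the interactive prompts, the variant below creates the `secrets` folder and writes both files in one shot. This is a sketch: the `/CN=localhost` subject is an assumption (adjust it to your host name), and it assumes `openssl` is installed:

```shell
# Create the secrets folder and generate a self-signed cert/key pair
# non-interactively; -subj supplies the certificate subject up front.
mkdir -p secrets
openssl req -new -newkey rsa:2048 -nodes \
  -keyout secrets/jupyterhub.key -x509 -days 365 \
  -subj "/CN=localhost" -out secrets/jupyterhub.crt
ls secrets
```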
- Create a system variable called `$USER_PASSWD` holding the password for the JupyterHub admin user. This feeds into a system variable in the dockerfile/image. E.g.:

```shell
export USER_PASSWD=foobar
```
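If you'd prefer not to invent a password, one can be drawn from `/dev/urandom`, as in this sketch (the 16-character alphanumeric format is an arbitrary choice, not a requirement of the image):

```shell
# Draw 16 alphanumeric characters from /dev/urandom and export them
# as the build-time admin password.
USER_PASSWD=$(tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 16)
export USER_PASSWD
echo "USER_PASSWD has ${#USER_PASSWD} characters"
```

Remember the value, since you will need it to log in to JupyterHub later.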
- Create the image by running the `docker build` command as follows (name the image tag anything you like, e.g. `rheartpython/cvdeep`, where `rheartpython` is a username on Dockerhub, so use yours or any tag). Note: on Windows you should run this command in Git Bash (download Git for Windows here):
- CPU:

```shell
docker build --build-arg USER_PW=$USER_PASSWD -t <dockerhub user>/<image name> -f Linux_py35_CPU.dockerfile .
```
- GPU (nvidia-docker 1.0; build on a machine with this command-line tool):

```shell
nvidia-docker build --build-arg USER_PW=$USER_PASSWD -t <dockerhub user>/<image name> -f Linux_py35_GPU.dockerfile .
```
Push the image to Dockerhub so that you and others (namely the VM through the ARM template) can use it (`docker login` and then `docker push <dockerhub user>/<image name>`).
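To avoid accidentally building without a password, the build-and-push steps can be wrapped in a small helper like this sketch. The function name `build_and_push` and the image name `myuser/cvdeep` are hypothetical, and the `echo`s make it a dry run that only prints the commands; drop them to actually execute (which assumes the `docker` CLI and a prior `docker login`):

```shell
# Dry-run helper: prints the docker build and push commands it would run,
# refusing to proceed if USER_PASSWD is unset.
build_and_push() {
  image="$1"; dockerfile="$2"
  if [ -z "$USER_PASSWD" ]; then
    echo "USER_PASSWD is not set" >&2
    return 1
  fi
  echo docker build --build-arg USER_PW="$USER_PASSWD" -t "$image" -f "$dockerfile" .
  echo docker push "$image"
}

USER_PASSWD=example
build_and_push myuser/cvdeep Linux_py35_CPU.dockerfile
```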
- This work is based on the following projects:
- Data Science Virtual Machine - https://github.com/Azure/DataScienceVM
- William Buchwalter's https://github.com/wbuchwalter/deep-learning-bootcamp-vm
- Ari Bornstein's https://github.com/aribornstein/CVWorkshop
- If you'd like to contribute to this project, fork this repo and make a Pull Request.
- If you see any problems or want a feature, create an Issue.
- Don't panic.