VMware Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, PyTorch, MXNet, and PaddlePaddle. Deep Learning Containers provide optimized environments with TensorFlow and MXNet, NVIDIA CUDA (for GPU instances), and Bitfusion.
For the list of available DLC images, see Available Deep Learning Containers Images.
This project is licensed under the Apache-2.0 License.
Running an object detection workload
This section describes how to build a DLC and use it to run a workload. As an example, we build a PyTorch CPU Python 3 inference container.
- Ensure you have access to a harbor-repo.vmware.com account.
- Ensure the Docker client is set up on your system.
- Set the following environment variables:
```shell
export REGISTRY=harbor-repo.vmware.com
export REPOSITORY_NAME=pytorch-inference
export ACCOUNT=dlc
```
- Log in to harbor-repo:
```shell
docker login harbor-repo.vmware.com -u 'username'
```
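If you need a non-interactive login (for example, in CI), Docker's standard `--password-stdin` flag works as well; `HARBOR_PASSWORD` below is a hypothetical environment variable you would set yourself:

```shell
# Non-interactive login; HARBOR_PASSWORD is an assumed variable, not part of this repo
echo "$HARBOR_PASSWORD" | docker login harbor-repo.vmware.com -u 'username' --password-stdin
```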
- Clone the repo, create a virtual environment, and install the requirements:
```shell
git clone https://github.com/AmyHoney/deep-learning-containers.git
python3 -m venv dlc   # create a virtual environment
source dlc/bin/activate
cd deep-learning-containers
pip install -r src/requirements.txt
```
- Perform the initial setup:
```shell
bash src/setup.sh pytorch
```
Note: the example below first builds the PyTorch inference CPU Python 3 container image, then runs an object detection sample using that image.
The paths to the Dockerfiles follow a specific pattern, e.g., pytorch/inference/docker/<version>/<python_version>/Dockerfile.

These paths are specified by the buildspec.yml residing in pytorch/buildspec.yml, i.e., <framework>/buildspec.yml. If you want to build the Dockerfile for a particular version, or to introduce a new version of the framework, re-create the folder structure as above and modify the buildspec.yml file to specify the version of the Dockerfile you want to build.
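For orientation, the fragment below sketches what a per-framework buildspec entry might look like. The field names mirror the upstream deep-learning-containers convention and are illustrative only; consult pytorch/buildspec.yml in this repo for the authoritative schema:

```yaml
# Hypothetical buildspec excerpt -- check pytorch/buildspec.yml for the real format
framework: pytorch
version: 1.11.0
images:
  BuildPyTorchCPUInferencePy3DockerImage:
    device_type: cpu         # selects the cpu Dockerfile variant
    python_version: py3      # maps to the <python_version> path component
    docker_file: docker/1.11/py3/Dockerfile
```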
- If you would instead like to build only a single image:

```shell
python src/main.py --buildspec pytorch/buildspec.yml \
                   --framework pytorch \
                   --image_types inference \
                   --device_types cpu \
                   --py_versions py3
```

This step takes a while to complete the first time you run it, since Docker has to download all base layers and create the intermediate layers. Subsequent runs are much faster.
- The arguments --image_types, --device_types and --py_versions are all comma-separated lists whose possible values are as follows:
```shell
--image_types <training/inference>
--device_types <cpu/gpu>
--py_versions <py2/py3>
```
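For example, to build both training and inference images for CPU and GPU in one invocation (same documented flags as above, just with comma-separated values):

```shell
python src/main.py --buildspec pytorch/buildspec.yml \
                   --framework pytorch \
                   --image_types training,inference \
                   --device_types cpu,gpu \
                   --py_versions py3
```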
- Obtain the object detection model and sample from GitHub:

```shell
git clone https://github.com/AmyHoney/torchserve_od_sample.git
```
- Run the Docker image built above and start the container with a Jupyter notebook server that requires no password:

```shell
docker run -it -p 80:8888 -v <local_dir>:<container_dir> <Docker_image_id> \
    jupyter notebook --notebook-dir=/ --ip=0.0.0.0 --no-browser --allow-root --port=8888 \
    --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=/

# example
docker run -it -p 80:8888 -v ~/workspace/torchserve_od_sample:/torchserve_od_sample \
    harbor-repo.vmware.com/zyajing/pytorch-inference:1.11.0-cpu-py38-ubuntu20.04-2022-05-28-08-38-30-multistage-common \
    jupyter notebook --notebook-dir=/ --ip=0.0.0.0 --no-browser --allow-root --port=8888 \
    --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=/
```
- Open the Jupyter UI at http://<vm_ip>:80
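If the UI does not load, standard Docker commands help confirm the container and the notebook server are up (use the container ID that `docker ps` reports on your system):

```shell
docker ps                   # confirm the container is running and port 80 maps to 8888
docker logs <container_id>  # the Jupyter startup log should show it listening on port 8888
```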
- Open a new terminal in Jupyter, then download the pre-trained Faster R-CNN object detection model's state_dict, archive the model, and start it with TorchServe:
```shell
# Download FastRCNN model weights
cd /torchserve_od_sample
sh scripts/get_fastrcnn.sh

# Archive the model; the archive is stored in ./model-store
sh scripts/archive_model.sh

# Start TorchServe
sh scripts/start_torchserve.sh
```
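TorchServe exposes a standard /ping health endpoint on its inference port, so you can confirm the server is up before sending predictions (8080 is assumed to be the port configured by the sample's start script):

```shell
curl http://127.0.0.1:8080/ping   # expected response: {"status": "Healthy"}
```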
- Run a sample inference using the REST APIs:
```shell
curl http://127.0.0.1:8080/predictions/fastrcnn -T ./samples/Naxos_Taverna.jpg
```
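The same request can be issued from Python, which is convenient inside the notebook. This is a minimal sketch assuming the third-party requests package is installed in the container:

```python
# Minimal Python equivalent of the curl call above.
# Assumes TorchServe is serving the "fastrcnn" model on the default port 8080.
import requests

with open("./samples/Naxos_Taverna.jpg", "rb") as f:
    response = requests.post(
        "http://127.0.0.1:8080/predictions/fastrcnn", data=f.read()
    )
print(response.json())  # detected objects with bounding boxes and scores
```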
Alternatively, run the "torchserve_od_sample/object_dectection.ipynb" notebook interactively.