About

Set up a Docker container with the correct PyTorch environment. This setup lets developers write code on their host with their favorite text editor while running it against PyTorch inside the container's environment.

Pull down any project onto your host machine with vcs, and Docker will take care of mounting it inside the container.

NOTE: Refer to the documentation in the rif-internal-docs repo for instructions on how to train an image segmentation model with detectron2, how to interact with CVAT, how to generate synthetic data with blenderproc, etc.

First Time Instructions

Install Dependencies:

Docker: https://docs.docker.com/engine/install/ubuntu/
docker-compose: https://docs.docker.com/compose/install/
vcs: http://wiki.ros.org/vcstool

I prefer sudo apt install python3-vcstool.
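
Before continuing, it can help to confirm that the three tools listed above are on your PATH. The following is a minimal sketch; `have_tool` is a hypothetical helper name, and the check only looks for the commands, not for particular versions.

```shell
# Succeed if the named command can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report each dependency from the list above as found or missing.
missing=0
for tool in docker docker-compose vcs; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
        missing=$((missing + 1))
    fi
done
echo "$missing tool(s) missing"
```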

Nvidia Drivers

The NVIDIA Container Toolkit allows users to build and run GPU-accelerated Docker containers. You do not need to install the CUDA Toolkit on your host system, but you do need to install the NVIDIA drivers. The instructions can be found in the NVIDIA docs. Namely, execute the following:

Set up the stable repository and the GPG key:

 distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
 curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
 curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
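
The `distribution` variable above is built by sourcing `/etc/os-release` and joining its `ID` and `VERSION_ID` fields, which is what selects the matching repository list. As a rough sketch of that step in isolation (`distribution_from` is a hypothetical helper, not part of the NVIDIA instructions):

```shell
# Print ID and VERSION_ID from an os-release style file, concatenated.
# Runs in a subshell so the sourced variables do not leak into the caller.
distribution_from() {
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# On the host: distribution_from /etc/os-release
```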

Install the nvidia-docker2 package:

 sudo apt update
 sudo apt install -y nvidia-docker2

NOTE: The nvidia-docker2 dependency is important if you want to use Kubernetes with Docker 19.03 (and newer), because Kubernetes doesn't support passing GPU information down to docker through the --gpus flag yet.

Restart the Docker daemon:
```
 sudo systemctl restart docker
```

Test the setup by running a base CUDA container:

 sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

Clone Repositories

Set up the workspace and clone this repository:

 mkdir -p /path/to/pytorch_ws/{src,data}
 cd /path/to/pytorch_ws
 git clone git@github.com:RIF-Robotics/pytorch_setup.git

Clone additional repositories

cd /path/to/pytorch_setup
vcs import ../src < pytorch.repos

NOTE: Regularly execute the following to keep the repositories up to date:

cd /path/to/pytorch_setup
vcs pull ../src

Build Docker image

 cd /path/to/pytorch_setup
 echo -e "USER_ID=$(id -u ${USER})\nGROUP_ID=$(id -g ${USER})" > .env
 docker-compose build
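
The `.env` step above records your host user and group IDs for docker-compose, presumably so that files created in the mounted workspace keep your host ownership (that reading is an assumption). Since `echo -e` behaves differently across shells, a more portable sketch of the same write uses `printf`. The temporary file here is only for illustration; the real target is `.env` in the repository.

```shell
# Write USER_ID and GROUP_ID in the KEY=value form used by the .env file.
# env_file is a throwaway temp file here; the real target is .env in the repo.
env_file="$(mktemp)"
printf 'USER_ID=%s\nGROUP_ID=%s\n' "$(id -u)" "$(id -g)" > "$env_file"
cat "$env_file"
```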

Interact with the container

Spin up the container:

cd /path/to/pytorch_setup
docker-compose up -d dev-nvidia

Open a shell inside the container. You can execute this in as many terminals as you like once the container is running. Keep in mind that they all drop you into the same container:

docker exec -it rif_detectron2 /bin/bash
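
`docker exec` fails if the container is not up, so a small guard can check first. This is a minimal sketch that assumes the container name `rif_detectron2` from the command above; `container_running` is a hypothetical helper.

```shell
# Succeed only if a container with exactly this name is currently running.
container_running() {
    docker ps --format '{{.Names}}' | grep -Fxq "$1"
}

# Usage (requires Docker and the container from this README):
#   container_running rif_detectron2 && docker exec -it rif_detectron2 /bin/bash
```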

Execute the following on your host to stop the container:

cd /path/to/pytorch_setup
docker-compose stop

Train Surgical Instrument Detector

Using CVAT, export the datasets that you want to use: Actions > Export task dataset. Settings:
- Export Format: CVAT for images 1.1
- Save Images: True (checkbox).

Save the exported zip files to the pytorch_ws/data directory. I exported the following datasets:
- #2: RealSense Images
- #3: Surgical Instruments with Arm

Make individual directories (mkdir) for each dataset you downloaded and unzip the downloaded datasets into their respective directories.
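
The mkdir-and-unzip step can be scripted when there are several exports. The sketch below assumes the zip files sit directly in the data directory and names each target directory after its archive; `dataset_dir_for` and `unzip_datasets` are hypothetical helpers.

```shell
# Derive a directory name from a zip path: /x/task_2.zip -> task_2
dataset_dir_for() {
    basename "$1" .zip
}

# Unzip every archive in data_dir into its own directory next to it.
unzip_datasets() {
    data_dir="$1"
    for archive in "$data_dir"/*.zip; do
        [ -e "$archive" ] || continue   # glob matched nothing
        target="$data_dir/$(dataset_dir_for "$archive")"
        mkdir -p "$target"
        unzip -q -o "$archive" -d "$target"
    done
}

# Usage: unzip_datasets /path/to/pytorch_ws/data
```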

Visualize the dataset with fiftyone in your browser. Inside the Docker container, run the command:
```
 fiftyone_view_dataset cvat /path/to/data/<cvat-dataset>
```

If necessary, combine multiple CVAT datasets into a single CVAT dataset

 cd ./data
 combine_datasets <output-cvat-dataset> <input-cvat-dataset0> <input-cvat-dataset1>

Convert the CVAT dataset to a COCO dataset with training, validation, and test splits. This creates three separate coco datasets under the <output-coco-dataset> folder.
```
 fiftyone_cvat_to_coco <input-cvat-dataset> <output-coco-dataset> --splits 0.7 0.2 0.1
```
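
The three values passed to `--splits` are fractions for the training, validation, and test sets, so they should add up to 1. Whether `fiftyone_cvat_to_coco` validates this itself is not stated here, so a quick check beforehand is a cheap safeguard. A minimal sketch, with `splits_sum_to_one` as a hypothetical helper:

```shell
# Succeed if the given split fractions sum to 1 (within a small tolerance,
# since decimal fractions are not exact in floating point).
splits_sum_to_one() {
    printf '%s\n' "$@" | awk '
        { total += $1 }
        END {
            diff = total - 1
            if (diff < 0) diff = -diff
            if (diff < 1e-9) exit 0; else exit 1
        }'
}

# Usage: splits_sum_to_one 0.7 0.2 0.1 && echo "splits OK"
```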
Visualize the COCO training dataset in fiftyone to make sure it's as expected.
```
 fiftyone_view_dataset coco <coco-dataset>/train
```

Train the model. The <coco-dataset> folder should contain train, val, and test subfolders.

 detectron2_model_train <coco-dataset> --output_dir 2022-05-11-trained-model --train

While training, use tensorboard to visualize loss and other metrics. Open another terminal in the docker container and execute:
```
 tensorboard --logdir /path/to/2022-05-11-trained-model --bind_all
```

Evaluate the model's performance on the test set

 detectron2_model_train <coco-dataset> --output_dir 2022-05-11-trained-model --evaluate

Show model predictions on the test set

detectron2_model_train <coco-dataset> --output_dir 2022-05-11-trained-model --predict
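
The train, evaluate, and predict commands above share the same dataset and output directory, so they can be chained. The wrapper below is only a sketch: `run_detector_pipeline` and the `DRY_RUN` switch are hypothetical, while the `detectron2_model_train` flags are the ones shown above.

```shell
# Run train, evaluate, and predict in order against one output directory.
# With DRY_RUN=1 the commands are printed instead of executed.
run_detector_pipeline() {
    dataset="$1"
    output_dir="$2"
    for stage in --train --evaluate --predict; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "detectron2_model_train $dataset --output_dir $output_dir $stage"
        else
            detectron2_model_train "$dataset" --output_dir "$output_dir" "$stage" || return 1
        fi
    done
}

# Usage: DRY_RUN=1 run_detector_pipeline ./coco-dataset "$(date +%F)-trained-model"
```

Stopping at the first failing stage avoids evaluating or predicting with a model that did not finish training.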

Detectron 2 Balloon Demo

Leverage the provided Docker environment to run Facebook's detectron2 library.

Run the Demo

Set up the environment by executing the following inside a running container:

 cd ~/workspace/src/detectron2_repo
 wget http://images.cocodataset.org/val2017/000000439715.jpg -O input.jpg
 mkdir -p outputs

Execute the following to run the demo on a pre-trained COCO model and perform instance segmentation on the previously downloaded image:

 cd ~/workspace/src/detectron2_repo
 python3 demo/demo.py --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input input.jpg --output outputs --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl

Use feh to display the output image:

 sudo apt-get install feh
 feh ./outputs/input.jpg

Generate Synthetic Images with BlenderProc

Step inside the running Docker container:

 docker exec -it rif_detectron2 /bin/bash

Set up BlenderProc in the container with the quickstart script:
```
 blenderproc quickstart
```
View the resulting image:
```
 blenderproc vis hdf5 output/0.hdf5
```

Generate five synthetic images.

 cd ./src/rif-python/scripts/blenderproc/random_placement

 blenderproc run main.py ./config.json \
     ~/workspace/src/surgical-instrument-3D-models/library/models.json \
     ~/workspace/data/blenderproc_output \
     --runs 5
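
To confirm that the run produced the expected number of samples, you can count the `.hdf5` files in the output directory. This sketch assumes each run writes one numbered `.hdf5` file directly into that directory (the `0.hdf5` path used in this README suggests so, but it is an assumption); `count_hdf5` is a hypothetical helper.

```shell
# Print the number of .hdf5 files directly inside the given directory.
count_hdf5() {
    find "$1" -maxdepth 1 -type f -name '*.hdf5' | wc -l | tr -d ' '
}

# Usage: count_hdf5 ~/workspace/data/blenderproc_output
```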

View a single synthetic data sample:

 blenderproc vis hdf5 ~/workspace/data/blenderproc_output/0.hdf5

View synthetic data in fiftyone:

 fiftyone_view_dataset coco \
     ~/workspace/data/blenderproc_output/coco_data \
     --images-dir . \
     --labels-file coco_annotations.json

Point your browser at http://localhost:5151
