Tutorial Image and Multiple Bounding Boxes Augmentation for Deep Learning in 4 Steps

Say we have images for training our Deep Neural Network. We also have separate PASCAL VOC format XML files with coordinates of bounding boxes for objects we are going to train our model to detect. We want to use TensorFlow Object Detection API. To do so we are planning to:

  1. Convert all XML files into one CSV file that we can feed into TensorFlow Object Detection API
  2. Resize all images together with the corresponding object bounding boxes
  3. Augment images to upsample our dataset. Corresponding object bounding boxes should be augmented accordingly
  4. Document augmented images' new sizes and bounding boxes' coordinates to a CSV file

This tutorial will walk you through this process step by step.

At the core of this tutorial we will use amazing imgaug library. Author has published tutorials on the use of the library and Documentation.

But here's a problem: I had to spend a whole day digging through the Documentation and coding up the script for my problem. I decided to share it, so you don't have to waste your time.

Easiest way to install imgaug is through Anaconda. Follow this steps in Anaconda promt to create virtual environment, install imgaug and activate the environment:

conda create -n myenv python=3.5.6
conda config --add channels conda-forge
conda install imgaug
conda activate myenv

You can refer to imgaug library GitHub page for additional info on installation. To work through this tutorial you would need pandas installed as well. If you work through Anaconda it is installed by default.

Download the repository and open Tutorial-Image-and-Multiple-Bounding-Boxes-Augmentation-for-Deep-Learning-in-4-Steps.ipynb to follow along.