Update 8/15/2020: Saved files now use the file name instead of the database name. E.g. image and annotation names are <file_name>_

Impy (Images in python)

Impy is a library used for deep learning projects that use image datasets.

Email: lozuwaucb@gmail.com
Bug reports: https://github.com/lozuwa/impy/issues

It provides:

Data augmentation methods for images with bounding boxes (the bounding boxes are also affected by the transformations so you don't have to label again.)
Fast image preprocessing methods useful in a deep learning context. E.g., if your image is too big, you can divide it into patches.

Installation

Download the impy.whl and use pip to install it.

Follow the next steps:

Download the impy.whl from here

Use pip to install the wheel

pip install impy-0.1-py3-none-any.whl

Tutorial

Impy has multiple features that allow you to solve several different problems with a few lines of code. In order to showcase the features of impy we are going to solve common problems that involve both Computer Vision and Deep Learning.

We are going to work with a mini-dataset of cars and pedestrians (available here). This dataset has object annotations that make it suitable to solve a localization problem.

Object localization

In this section we are going to solve problems related to object localization.

Images are too big

One common problem in Computer Vision and CNNs is dealing with big images. Let's sample one of the images from our mini-dataset:

The size of this image is 3840x2160. It is too big for training; most likely, your computer will run out of memory. To work around the big image problem, we could reduce the mini-batch size, but if the image is too big that would still not work. We could also reduce the size of the image, but then the image loses quality and you would need to label the smaller image again.
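To get a rough sense of scale, the sketch below estimates the memory taken by a single image once it is held as a dense tensor. The three color channels and the 4 bytes per value (float32) are assumptions for illustration, and real training needs considerably more memory for activations and gradients.

```python
# Rough memory estimate for one image held as a dense tensor.
# Assumptions (not from impy): 3 color channels, 4 bytes per value (float32).

def image_bytes(width: int, height: int, channels: int = 3, bytes_per_value: int = 4) -> int:
    """Return the number of bytes needed to store one image tensor."""
    return width * height * channels * bytes_per_value

full_image = image_bytes(3840, 2160)
crop = image_bytes(1032, 1032)

print(f"3840x2160 image: {full_image / 2**20:.1f} MiB")
print(f"1032x1032 crop:  {crop / 2**20:.1f} MiB")
print(f"ratio: {full_image / crop:.1f}x")
```

Under these assumptions a full image takes roughly eight times the memory of a 1032x1032 crop, and that factor is multiplied again by the mini-batch size.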

Instead of hacking a solution, we are going to solve the problem efficiently. The best solution is to sample crops of a specific size that contain the maximum amount of bounding boxes possible. Crops of 1032x1032 pixels are usually small enough.
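To make the idea concrete, here is a minimal sketch of what happens to the annotations when a crop is taken: boxes that fall completely inside the crop window are kept and shifted into the crop's coordinate system. It does not reproduce how impy chooses the crops. The function name and the [x_min, y_min, x_max, y_max] box format are assumptions for illustration.

```python
from typing import List

Box = List[int]  # [x_min, y_min, x_max, y_max], an assumed format

def boxes_inside_crop(boxes: List[Box], crop_x: int, crop_y: int,
                      crop_w: int = 1032, crop_h: int = 1032) -> List[Box]:
    """Keep the boxes fully contained in the crop and express them
    relative to the crop's top-left corner."""
    kept = []
    for x_min, y_min, x_max, y_max in boxes:
        inside = (x_min >= crop_x and y_min >= crop_y and
                  x_max <= crop_x + crop_w and y_max <= crop_y + crop_h)
        if inside:
            kept.append([x_min - crop_x, y_min - crop_y,
                         x_max - crop_x, y_max - crop_y])
    return kept

# Two hypothetical boxes in a 3840x2160 image; only the first fits the crop.
boxes = [[1100, 600, 1400, 900], [3000, 1500, 3300, 1800]]
print(boxes_inside_crop(boxes, crop_x=1000, crop_y=500))
```

Because the kept boxes are only translated, the original labels stay valid and nothing has to be annotated again.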

Let's see how to do this with impy:

Create a folder named testing_cars. Then enter the folder.

	mkdir -p $PWD/testing_cars
	cd testing_cars

Download the cars dataset from here

git clone https://github.com/lozuwa/cars_dataset

Create a file named reducing_big_images.py and put the next code:

import os
from impy.ObjectDetectionDataset import ObjectDetectionDataset

def main():
	# Define the path to images and annotations
	images_path:str = os.path.join(os.getcwd(), "cars_dataset", "images")
	annotations_path:str = os.path.join(os.getcwd(), "cars_dataset", "annotations", "xmls")
	# Define the name of the dataset
	dbName:str = "CarsDataset"
	# Create an object of ObjectDetectionDataset
	obda:any = ObjectDetectionDataset(imagesDirectory=images_path, annotationsDirectory=annotations_path, databaseName=dbName)
	# Reduce the dataset to smaller ROIs of shape 1032x1032.
	offset:list=[1032, 1032]
	images_output_path:str = os.path.join(os.getcwd(), "cars_dataset", "images_reduced")
	annotations_output_path:str = os.path.join(os.getcwd(), "cars_dataset", "annotations_reduced", "xmls")
	obda.reduceDatasetByRois(offset = offset, outputImageDirectory = images_output_path, outputAnnotationDirectory = annotations_output_path)

if __name__ == "__main__":
	main()

Note the paths where we are going to store the reduced images don't exist. Let's create them.

mkdir -p $PWD/cars_dataset/images_reduced/
mkdir -p $PWD/cars_dataset/annotations_reduced/xmls/
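If you prefer to keep everything in Python, the same folders could be created with the standard library instead of mkdir. This is a small sketch; the helper name create_output_dirs is hypothetical and not part of impy.

```python
import os

def create_output_dirs(base_dir: str) -> list:
    """Create the output folders used in this tutorial under base_dir."""
    targets = [
        os.path.join(base_dir, "cars_dataset", "images_reduced"),
        os.path.join(base_dir, "cars_dataset", "annotations_reduced", "xmls"),
    ]
    for path in targets:
        # exist_ok=True makes the call safe to repeat.
        os.makedirs(path, exist_ok=True)
    return targets
```

Calling create_output_dirs(os.getcwd()) from inside testing_cars has the same effect as the two mkdir commands.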

Now we can run the script and reduce the images to smaller crops.

python reducing_big_images.py

Impy will create a new set of images and annotations with the size specified by offset and will include the maximum number of annotations possible so you will end up with an optimal number of data points. Let's see the results of the example:

As you can see, the bounding boxes have been maintained and small crops of the big image are now available. We can use these images for training and our problem is solved.

Note that in some cases you will end up with more crops than necessary because the clustering algorithm can produce overlapping crops. I am working on this and a better solution will be released soon. Nonetheless, these results are still far more efficient than the usual approach of cropping each bounding box one by one, which leads to inefficient memory usage, repeated data points, loss of context and simpler representations.

Data augmentation for bounding boxes

Another common problem in Computer Vision and CNNs for object localization is data augmentation, specifically spatial augmentations (e.g., scaling, cropping, rotation). For this you would usually write a custom script, but impy makes it easier.

Create a json file named augmentation_configuration.json

touch augmentation_configuration.json

Insert the following code in the augmentation_configuration.json file

{
	"multiple_image_augmentations": {
		"Sequential": [
			{
				"image_color_augmenters": {
					"Sequential": [
						{
							"sharpening": {
								"weight": 2.0,
								"save": true,
								"restartFrame": false,
								"randomEvent": false
							}
						}
					]
				}
			},
			{
				"bounding_box_augmenters": {
					"Sequential": [
						{
							"scale": {
								"size": [1.2, 1.2],
								"zoom": true,
								"interpolationMethod": 1,
								"save": true,
								"restartFrame": false,
								"randomEvent": false
							}
						},
						{
							"verticalFlip": {
								"save": true,
								"restartFrame": false,
								"randomEvent": true
							}
						}
					]
				}
			},
			{
				"image_color_augmenters": {
					"Sequential": [
						{
							"histogramEqualization":{
								"equalizationType": 1,
								"save": true,
								"restartFrame": false,
								"randomEvent": false
							}
						}
					]
				}
			},
			{
				"bounding_box_augmenters": {
					"Sequential": [
						{
							"horizontalFlip": {
								"save": true,
								"restartFrame": false,
								"randomEvent": false
							}
						},
						{
							"crop": {
								"save": true,
								"restartFrame": true,
								"randomEvent": false
							}
						}
					]
				}
			}
		]
	}
}

Let's analyze the configuration file step by step. Currently, this is the most complex type of data augmentation you can achieve with the library.

Note the file starts with "multiple_image_augmentations", followed by a "Sequential" key. Inside "Sequential" we define an array. This is important: each element of the array is a type of augmenter.

The first augmenter we define is an "image_color_augmenters", which executes a sequence of color augmentations. In this case, we have defined only one color augmentation: sharpening with a weight of 2.0.

After the color augmentation, we have defined a "bounding_box_augmenters" which is going to execute a "scale" augmentation with zoom followed by a "verticalFlip".

We want to keep going, so we define two more augmenters: another "image_color_augmenters", which applies "histogramEqualization" to the image, and another "bounding_box_augmenters", which applies a "horizontalFlip" and a "crop" augmentation.
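To check that a configuration file has the nesting described above, it can be loaded with the standard json module and walked level by level. The sketch below only relies on the file layout shown in this tutorial; the helper list_augmentations is hypothetical and not part of impy, and the embedded configuration is a shortened copy of the one above.

```python
import json

def list_augmentations(config: dict) -> list:
    """Return (augmenter_type, augmentation_name) pairs in the order they are listed."""
    steps = []
    for augmenter in config["multiple_image_augmentations"]["Sequential"]:
        # Each array element has a single key: the augmenter type.
        for augmenter_type, body in augmenter.items():
            for augmentation in body["Sequential"]:
                for name in augmentation:
                    steps.append((augmenter_type, name))
    return steps

# A shortened version of the configuration from this tutorial.
config = json.loads("""
{
  "multiple_image_augmentations": {
    "Sequential": [
      {"image_color_augmenters": {"Sequential": [{"sharpening": {"weight": 2.0}}]}},
      {"bounding_box_augmenters": {"Sequential": [
        {"scale": {"size": [1.2, 1.2], "zoom": true, "interpolationMethod": 1}},
        {"verticalFlip": {}}
      ]}}
    ]
  }
}
""")

for augmenter_type, name in list_augmentations(config):
    print(augmenter_type, "->", name)
```

A KeyError from this walk is a quick hint that a "Sequential" key is missing or misspelled somewhere in the file.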

Note there are three parameters common to every augmentation. These are optional, but I recommend specifying them in order to fully understand your pipeline. These parameters are:

"save": saves the current transformation if true.
"restartFrame": restarts the frame to its original space if true, otherwise maintains the augmentations applied so far.
"randomEvent": uses a stochastic function to decide whether this augmentation is applied or not.

As you have seen we can define any type of crazy configuration and augment our images with the available methods while choosing whether to save each augmentation, restart the frame to its original space or randomize the event so we make things crazier. Get creative and define your own data augmentation pipelines.
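One way to read the three flags is as controls on a loop that carries a working frame from one augmentation to the next. The sketch below simulates that reading with strings instead of images. It is an interpretation of the descriptions above, not impy's code: the 0.5 probability for the random event and the choice to reset the frame before (rather than after) applying a step are both assumptions.

```python
import random

def run_pipeline(original: str, steps: list, rng: random.Random) -> list:
    """Simulate save / restartFrame / randomEvent on a toy 'frame' (a string)."""
    saved = []
    frame = original
    for step in steps:
        # randomEvent: skip the step some of the time (assumed probability 0.5).
        if step.get("randomEvent", False) and rng.random() < 0.5:
            continue
        # restartFrame: start again from the original frame (assumed: before applying).
        if step.get("restartFrame", False):
            frame = original
        frame = f"{step['name']}({frame})"
        # save: keep a copy of the frame produced by this step.
        if step.get("save", False):
            saved.append(frame)
    return saved

steps = [
    {"name": "scale", "save": True, "restartFrame": False, "randomEvent": False},
    {"name": "horizontalFlip", "save": True, "restartFrame": False, "randomEvent": False},
    {"name": "crop", "save": True, "restartFrame": True, "randomEvent": False},
]
print(run_pipeline("img", steps, random.Random(0)))
```

In this toy run the flip is stacked on top of the scale, while the crop starts again from the untouched image because its restartFrame flag is set.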

Once the configuration file is created, we can write the code that applies the data augmentation pipeline. Create a file named apply_bounding_box_augmentations.py

Insert the following code into apply_bounding_box_augmentations.py

import os
from impy.ObjectDetectionDataset import ObjectDetectionDataset

def main():
	# Define the path to images and annotations
	images_path:str=os.path.join(os.getcwd(), "cars_dataset", "images")
	annotations_path:str=os.path.join(os.getcwd(), "cars_dataset", "annotations", "xmls")
	# Define the name of the dataset
	dbName:str="CarsDataset"
	# Create an ObjectDetectionDataset object
	obda:any=ObjectDetectionDataset(imagesDirectory=images_path, annotationsDirectory=annotations_path, databaseName=dbName)
	# Apply data augmentation by using the following method of the ObjectDetectionDataset class.
	configuration_file:str=os.path.join(os.getcwd(), "augmentation_configuration.json")
	images_output_path:str=os.path.join(os.getcwd(), "cars_dataset", "images_augmented")
	annotations_output_path:str=os.path.join(os.getcwd(), "cars_dataset", "annotations_augmented", "xmls")
	obda.applyDataAugmentation(configurationFile=configuration_file, outputImageDirectory=images_output_path, outputAnnotationDirectory=annotations_output_path)

if __name__ == "__main__":
	main()

Now execute the script by running:

python apply_bounding_box_augmentations.py

Next I present the results of the augmentations. Note the transformation does not alter the bounding boxes of the image which saves you a lot of time in case you want to increase the representational complexity of your data.

Sharpening

Scaling (image gets a little bit bigger)

Vertical flip

Histogram equalization

Horizontal flip

Crop bounding boxes

Documentation

Object detection dataset

ObjectDetectionDataset class

A class that holds a detection dataset. Parameters:

imagesDirectory: A string that contains a path to a directory of images.
annotationsDirectory: A string that contains a path to a directory of xml annotations.
databaseName: A string that contains a name.

Class methods:

dataConsistency

Checks the consistency of the image files with the annotation files.
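As an illustration of what such a check can look like, the sketch below pairs images and annotations by file name without the extension and reports the ones that have no partner. This is an assumption about what "consistency" means here, not a description of impy's implementation, and the file names are made up.

```python
import os

def unmatched_files(image_names: list, annotation_names: list) -> tuple:
    """Compare file names by stem and report the ones without a partner."""
    image_stems = {os.path.splitext(name)[0] for name in image_names}
    annotation_stems = {os.path.splitext(name)[0] for name in annotation_names}
    images_without_annotation = sorted(image_stems - annotation_stems)
    annotations_without_image = sorted(annotation_stems - image_stems)
    return images_without_annotation, annotations_without_image

# Hypothetical directory listings, e.g. the results of os.listdir().
images = ["car_1.jpg", "car_2.jpg", "car_3.jpg"]
annotations = ["car_1.xml", "car_3.xml", "car_4.xml"]
print(unmatched_files(images, annotations))
```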

findEmptyOrWrongAnnotations

Examines all the annotations in the dataset and detects if any is empty or wrong. A wrong annotation is one that contains a bounding box coordinate that is greater than the image width or height, respectively, or less than zero.

removeEmpty: A boolean that if True removes the annotations that are considered to be wrong or empty.
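The rule for a wrong annotation can be written as a short predicate. This is a sketch of the rule as stated above, assuming boxes are given as [x_min, y_min, x_max, y_max]; the function name is hypothetical.

```python
def is_wrong_box(box: list, width: int, height: int) -> bool:
    """Return True if a coordinate is negative or exceeds the image size."""
    x_min, y_min, x_max, y_max = box
    has_negative = min(box) < 0
    exceeds_width = max(x_min, x_max) > width
    exceeds_height = max(y_min, y_max) > height
    return has_negative or exceeds_width or exceeds_height

# Checks against a 3840x2160 image.
print(is_wrong_box([10, 10, 100, 100], 3840, 2160))   # inside the image
print(is_wrong_box([10, 10, 4000, 100], 3840, 2160))  # x_max beyond the width
```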

computeBoundingBoxStats

saveDataFrame: A boolean that if True saves the dataframe with the stats.
outputDirDataFrame: A string that contains a valid path.

saveBoundingBoxes

Saves the bounding boxes of the data set as images. Parameters:

outputDirectory: A string that contains a path to a valid directory.
filterClasses: A list of strings that contains classes that are supposed to be as labels in the dataset annotations.

reduceDatasetByRois

Iterates over images and annotations and executes reduceImageDataPointByRoi for each one.

offset: A list or tuple of ints.
outputImageDirectory: A string that contains a valid path.
outputAnnotationDirectory: A string that contains a valid path.

reduceImageDataPointByRoi

imagePath: A string that contains the path to an image.
annotationPath: A string that contains a path to a xml annotation.
offset: A list or tuple of ints.
outputImageDirectory: A string that contains a valid path.
outputAnnotationDirectory: A string that contains a valid path.

applyDataAugmentation

configurationFile: A string that contains a path to a json file.
outputImageDirectory: A string that contains a valid path.
outputAnnotationDirectory: A string that contains a valid path.
threshold: A float in the range [0-1].

applyColorAugmentation

frame: A numpy tensor that contains an image.
augmentationType: A string that contains a valid Impy augmentation type.
parameters: A list of strings that contains the respective parameters for the type of augmentation.

applyBoundingBoxAugmentation

frame: A numpy tensor that contains an image.
boundingBoxes: A list of lists of ints that contains coordinates for a bounding box.
augmentationType: A string that contains a valid Impy augmentation type.
parameters: A list of strings that contains the respective parameters for the type of augmentation.

Types of color augmentations

All of the augmentations ought to implement the following parameters:

"save": saves the current transformation if true.
"restartFrame": restarts the frame to its original space if true, otherwise maintains the augmentations applied so far.
"randomEvent": uses a stochastic function to decide whether this augmentation is applied or not.

Invert color

Apply a bitwise_not operation to the pixels in the image. Code example:

{
	"invertColor": {
		"Cspace": [true, true, true]
	}
}
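For 8-bit images, a bitwise NOT maps each value v to 255 - v. The sketch below shows this on a tiny image stored as nested lists. Reading "Cspace" as one on/off flag per channel is an assumption, as are the function names.

```python
def invert_pixel(value: int) -> int:
    """Bitwise NOT of an 8-bit value, kept within 0..255."""
    return ~value & 0xFF

def invert_image(image: list, channels: list) -> list:
    """Invert the selected channels of an image given as rows of [c0, c1, c2] pixels."""
    return [[[invert_pixel(v) if flip else v for v, flip in zip(pixel, channels)]
             for pixel in row] for row in image]

image = [[[0, 100, 255]]]  # one row with one pixel
print(invert_image(image, [True, True, True]))
print(invert_image(image, [True, False, False]))
```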

Histogram equalization

Equalize the color space of the image. Code example:

{
	"histogramEqualization": {
		"equalizationType": 1
	}
}

Change brightness

Multiply the pixel distribution with a scalar. Code example:

{
	"changeBrightness": {
		"coefficient": 1.2
	}
}
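Multiplying pixel values by a coefficient greater than 1 can push them past the 8-bit range, so a practical version has to limit the result. The sketch below rounds and clips to 0..255; the clipping behaviour is an assumption for illustration, not something this document states about impy.

```python
def change_brightness(image: list, coefficient: float) -> list:
    """Multiply every value by coefficient, round, and clip to the 0..255 range."""
    return [[min(255, max(0, round(v * coefficient))) for v in row] for row in image]

gray = [[0, 100, 250]]  # one row of a grayscale image
print(change_brightness(gray, 1.2))
```

With a coefficient of 1.2 the value 250 would become 300, which the clip brings back to 255.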

Random sharpening

Apply a sharpening system to the image. Code example:

{
	"sharpening": {
		"weight": 0.8
	}
}

Add gaussian noise

Add gaussian noise to the image. Code example:

```json
{
	"addGaussianNoise": {
		"coefficient": 0.5
	}
}
```

Gaussian blur

Apply a Gaussian low pass filter to the image. Code example:

{
	"gaussianBlur": {
		"sigma": 2
	}
}

Shift colors

Shift the colors of the image. Code example:

{
	"shiftBlur": {
	}
}

Types of bounding box augmentations

All of the augmentations ought to implement the following parameters:

"save": saves the current transformation if true.
"restartFrame": restarts the frame to its original space if true, otherwise maintains the augmentations applied so far.
"randomEvent": uses a stochastic function to decide whether this augmentation is applied or not.

Scale

Scales the size of an image and maintains the location of its bounding boxes. Code example:

{
	"scale": {
		"size": [1.2, 1.2],
		"zoom": true,
		"interpolationMethod": 1
	}
}

Random crop

Crops the bounding boxes of an image. Specify the size of the crop in the size parameter. Code example:

{
	"crop": {
		"size": [50, 50]
	}
}

Random pad

Pads the bounding boxes of an image, i.e., adds pixels from outside the bounding box. Specify the number of pixels to be added in the size parameter. Code example:

{
	"pad": {
		"size": [20, 20]
	}
}
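In terms of coordinates, padding grows a box outward and has to stop at the image border. The sketch below assumes that size is [pad_x, pad_y] and that the padding is added on every side; whether impy applies the size per side or in total is not stated here, and the function name is hypothetical.

```python
def pad_box(box: list, size: list, image_width: int, image_height: int) -> list:
    """Grow a box by size = [pad_x, pad_y] pixels on every side, clamped to the image."""
    pad_x, pad_y = size
    x_min, y_min, x_max, y_max = box
    return [max(0, x_min - pad_x), max(0, y_min - pad_y),
            min(image_width, x_max + pad_x), min(image_height, y_max + pad_y)]

print(pad_box([100, 100, 200, 200], [20, 20], 3840, 2160))
print(pad_box([10, 5, 3830, 2150], [20, 20], 3840, 2160))
```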

Flip horizontally

Flips the bounding boxes of an image in the x axis. Code example:

{
	"horizontalFlip": {
	}
}
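A flip of this kind can be pictured as mirroring the pixels inside each box while the box coordinates stay where they are. The sketch below does this on an image stored as nested lists. It assumes boxes are [x_min, y_min, x_max, y_max] with exclusive upper bounds, and it illustrates the idea rather than impy's implementation.

```python
def flip_region_horizontally(image: list, box: list) -> list:
    """Mirror the pixels inside box along the x axis; the rest of the image is untouched."""
    x_min, y_min, x_max, y_max = box
    flipped = [row[:] for row in image]  # copy so the input is not modified
    for y in range(y_min, y_max):
        flipped[y][x_min:x_max] = image[y][x_min:x_max][::-1]
    return flipped

image = [
    [0, 1, 2, 3, 4],
    [5, 6, 7, 8, 9],
    [10, 11, 12, 13, 14],
]
for row in flip_region_horizontally(image, [1, 0, 4, 2]):
    print(row)
```

Applying the same flip twice returns the original image, which is a handy sanity check for this kind of transformation.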

Flip vertically

Flips the bounding boxes of an image in the y axis. Code example:

{
	"verticalFlip": {
	}
}

Rotation

Rotates the bounding boxes of an image anti-clockwise. Code example:

{
	"rotation": {
		"theta": 0.5 
	}
}

Jitter boxes

Draws random squares of a specific color and size in the area of the bounding box. Code example:

{
	"jitterBoxes": {
		"size": [10, 10],
		"quantity": 5,
		"color": [255,255,255]
	}
}

Dropout

Set pixels inside the bounding box to zero depending on a probability p extracted from a normal distribution. If p > threshold, then the pixel is changed. Code example:

{
	"dropout": {
		"size": [5, 5],
		"threshold": 0.8,
		"color": [255,255,255]
	}
}

Contribute

If you want to contribute to this library, please follow the next steps to set up a development environment.

Install Anaconda with a Python version up to 3.7 (tested with Python 3.5, 3.6 and 3.7). Then create an empty environment.

conda create --name=impy python=3.7

Activate the conda environment

source activate impy

Clone the repository

git clone https://github.com/lozuwa/impy

Install the dependencies that are in setup.py

You are good to go. Note there are unit tests for each script inside the impy folder.

Build the project

Go to impy's parent directory and run the following code:

python setup.py sdist bdist_wheel

A folder named dist will appear. It contains the .whl and .tar.gz
