Update 8/15/2020: Saved files now use the file name instead of the database name. E.g. image and annotation names are <file_name>_
Impy is a library used for deep learning projects that use image datasets.
- Email: lozuwaucb@gmail.com
- Bug reports: https://github.com/lozuwa/impy/issues
It provides:
- Data augmentation methods for images with bounding boxes (the bounding boxes are also affected by the transformations, so you don't have to label again).
- Fast image preprocessing methods useful in a deep learning context. E.g. if your image is too big, you need to divide it into patches.
Follow the next steps:
- Download the impy.whl from here
- Use pip to install the wheel
pip install impy-0.1-py3-none-any.whl
Impy has multiple features that allow you to solve several different problems with a few lines of code. In order to showcase the features of impy we are going to solve common problems that involve both Computer Vision and Deep Learning.
We are going to work with a mini-dataset of cars and pedestrians (available here). This dataset has object annotations that make it suitable to solve a localization problem.
In this section we are going to solve problems related to object localization.
One common problem in Computer Vision and CNNs is dealing with big images. Let's sample one of the images from our mini-dataset:
The size of this image is 3840x2160. It is too big for training; most likely, your computer will run out of memory. To work around this, we could reduce the mini-batch size, but if the image is too big that would still not work. We could also reduce the size of the image, but then the image loses quality and you would need to label the smaller image again.
Instead of hacking a solution, we are going to solve the problem efficiently. The best solution is to sample crops of a specific size that contain the maximum amount of bounding boxes possible. Crops of 1032x1032 pixels are usually small enough.
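To make the idea concrete, here is a minimal sketch of the bookkeeping behind such a crop. It is not impy's implementation: the helper names `boxes_inside_crop` and `shift_boxes` are hypothetical, and boxes are assumed to be `[xmin, ymin, xmax, ymax]` lists in pixels. It finds the bounding boxes that fit fully inside a crop window and re-expresses them relative to the crop's top-left corner, which is why no relabeling is needed.

```python
from typing import List

Box = List[int]  # assumed layout: [xmin, ymin, xmax, ymax] in pixels


def boxes_inside_crop(boxes: List[Box], x0: int, y0: int, width: int, height: int) -> List[Box]:
    """Return the boxes that lie completely inside the crop window."""
    x1, y1 = x0 + width, y0 + height
    return [b for b in boxes if b[0] >= x0 and b[1] >= y0 and b[2] <= x1 and b[3] <= y1]


def shift_boxes(boxes: List[Box], x0: int, y0: int) -> List[Box]:
    """Express boxes relative to the crop's top-left corner (x0, y0)."""
    return [[b[0] - x0, b[1] - y0, b[2] - x0, b[3] - y0] for b in boxes]


# Hypothetical boxes in a 3840x2160 image.
boxes = [[100, 200, 300, 400], [900, 950, 1000, 1020], [2000, 1500, 2200, 1700]]
kept = boxes_inside_crop(boxes, x0=50, y0=100, width=1032, height=1032)
print(shift_boxes(kept, x0=50, y0=100))  # [[50, 100, 250, 300], [850, 850, 950, 920]]
```

The third box falls outside the 1032x1032 window, so it would have to be covered by a different crop.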
Let's see how to do this with impy:
- Create a folder named testing_cars. Then enter the folder.
mkdir -p $PWD/testing_cars
cd testing_cars
- Download the cars dataset from here
git clone https://github.com/lozuwa/cars_dataset
- Create a file named reducing_big_images.py and put the next code:
import os
from impy.ObjectDetectionDataset import ObjectDetectionDataset

def main():
    # Define the path to images and annotations
    images_path: str = os.path.join(os.getcwd(), "cars_dataset", "images")
    annotations_path: str = os.path.join(os.getcwd(), "cars_dataset", "annotations", "xmls")
    # Define the name of the dataset
    dbName: str = "CarsDataset"
    # Create an object of ObjectDetectionDataset
    obda: ObjectDetectionDataset = ObjectDetectionDataset(imagesDirectory=images_path, annotationsDirectory=annotations_path, databaseName=dbName)
    # Reduce the dataset to smaller ROIs of shape 1032x1032.
    offset: list = [1032, 1032]
    images_output_path: str = os.path.join(os.getcwd(), "cars_dataset", "images_reduced")
    annotations_output_path: str = os.path.join(os.getcwd(), "cars_dataset", "annotations_reduced", "xmls")
    obda.reduceDatasetByRois(offset=offset, outputImageDirectory=images_output_path, outputAnnotationDirectory=annotations_output_path)

if __name__ == "__main__":
    main()
- Note the paths where we are going to store the reduced images don't exist. Let's create them.
mkdir -p $PWD/cars_dataset/images_reduced/
mkdir -p $PWD/cars_dataset/annotations_reduced/xmls/
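If you prefer to create the folders from Python instead of the shell, a small sketch like the following does the same job with the standard library. The helper name `ensure_output_directories` is made up for this example; the folder names are the ones used in the script above.

```python
import os


def ensure_output_directories(base_dir: str) -> tuple:
    """Create the reduced images/annotations folders under base_dir."""
    images = os.path.join(base_dir, "cars_dataset", "images_reduced")
    annotations = os.path.join(base_dir, "cars_dataset", "annotations_reduced", "xmls")
    for path in (images, annotations):
        # exist_ok=True makes the call a no-op when the folder already exists.
        os.makedirs(path, exist_ok=True)
    return images, annotations


if __name__ == "__main__":
    print(ensure_output_directories(os.getcwd()))
```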
- Now we can run the script and reduce the images to smaller crops.
python reducing_big_images.py
Impy will create a new set of images and annotations with the size specified by offset and will include the maximum number of annotations possible so you will end up with an optimal number of data points. Let's see the results of the example:
As you can see, the bounding boxes have been maintained and small crops of the big image are now available. We can use these images for training and our problem is solved.
Note that in some cases you will end up with more crops than necessary because the clustering algorithm can produce overlapping crops. I am working on this and a better solution will be released soon. Nonetheless, these results are still far more efficient than the usual approach of cropping each bounding box one by one, which leads to inefficient memory usage, repeated data points, loss of context and a simpler representation.
Another common problem in Computer Vision and CNNs for object localization is data augmentation, specifically spatial augmentations (e.g. scaling, cropping, rotation). For this you would usually write a custom script, but impy makes it easier.
- Create a json file named augmentation_configuration.json
touch augmentation_configuration.json
- Insert the following code in the augmentation_configuration.json file
{
  "multiple_image_augmentations": {
    "Sequential": [
      {
        "image_color_augmenters": {
          "Sequential": [
            {
              "sharpening": {
                "weight": 2.0,
                "save": true,
                "restartFrame": false,
                "randomEvent": false
              }
            }
          ]
        }
      },
      {
        "bounding_box_augmenters": {
          "Sequential": [
            {
              "scale": {
                "size": [1.2, 1.2],
                "zoom": true,
                "interpolationMethod": 1,
                "save": true,
                "restartFrame": false,
                "randomEvent": false
              }
            },
            {
              "verticalFlip": {
                "save": true,
                "restartFrame": false,
                "randomEvent": true
              }
            }
          ]
        }
      },
      {
        "image_color_augmenters": {
          "Sequential": [
            {
              "histogramEqualization": {
                "equalizationType": 1,
                "save": true,
                "restartFrame": false,
                "randomEvent": false
              }
            }
          ]
        }
      },
      {
        "bounding_box_augmenters": {
          "Sequential": [
            {
              "horizontalFlip": {
                "save": true,
                "restartFrame": false,
                "randomEvent": false
              }
            },
            {
              "crop": {
                "save": true,
                "restartFrame": true,
                "randomEvent": false
              }
            }
          ]
        }
      }
    ]
  }
}
Let's analyze the configuration file step by step. Currently, this is the most complex type of data augmentation you can achieve with the library.
Note the file starts with "multiple_image_augmentations", then a "Sequential" key follows. Inside "Sequential" we define an array. This is important: each element of the array is a type of augmenter.
The first augmenter we define is an "image_color_augmenters", which executes a sequence of color augmentations. In this case, we have defined only one color augmentation, which is sharpening with a weight of 2.0.
After the color augmentation, we have defined a "bounding_box_augmenters", which executes a "scale" augmentation with zoom followed by a "verticalFlip".
We want to keep going, so we define two more augmenters: another "image_color_augmenters", which applies "histogramEqualization" to the image, and another "bounding_box_augmenters", which applies a "horizontalFlip" and a "crop" augmentation.
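The nesting described above can be checked with a few lines of Python before running a long augmentation job. This is only a sketch: `list_augmentations` is a hypothetical helper that is not part of impy, and the configuration below is a shortened stand-in with the same layout as the file above.

```python
import json


def list_augmentations(config: dict) -> list:
    """Return (augmenter_type, augmentation_name) pairs in the order they are listed."""
    steps = []
    for augmenter in config["multiple_image_augmentations"]["Sequential"]:
        # Each array element has a single key that names the augmenter type.
        for augmenter_type, body in augmenter.items():
            for augmentation in body["Sequential"]:
                for name in augmentation:
                    steps.append((augmenter_type, name))
    return steps


# Shortened, hypothetical configuration with the same layout as above.
config = json.loads("""
{
  "multiple_image_augmentations": {
    "Sequential": [
      {"image_color_augmenters": {"Sequential": [{"sharpening": {"weight": 2.0}}]}},
      {"bounding_box_augmenters": {"Sequential": [
        {"scale": {"size": [1.2, 1.2], "zoom": true, "interpolationMethod": 1}},
        {"verticalFlip": {}}
      ]}}
    ]
  }
}
""")

for augmenter_type, name in list_augmentations(config):
    print(augmenter_type, "->", name)
```

Loading the file with `json.load` first also catches syntax errors such as a missing comma or bracket.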
Note there are three types of parameters in each augmenter. These are optional, but I recommend specifying them in order to fully understand your pipeline. These parameters are:
- "save": saves the current transformation if true.
- "restartFrame": restarts the frame to its original space if true, otherwise maintains the augmentation applied so far.
- "randomEvent": uses a stochastic function to randomize whether this augmentation is applied or not.
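One way to picture how the three flags interact is the toy interpreter below. It is a sketch under stated assumptions, not impy's code: `run_pipeline` is a hypothetical function, integers stand in for images, a random event is assumed to fire with probability 0.5, and a missing flag is assumed to default to false.

```python
import random


def run_pipeline(frame, steps, rng=None):
    """Toy interpreter for the three flags. Each step is (function, flags)."""
    rng = rng or random.Random()
    original, current, saved = frame, frame, []
    for transform, flags in steps:
        # randomEvent: flip a coin to decide whether the step runs at all (assumed p = 0.5).
        if flags.get("randomEvent", False) and rng.random() < 0.5:
            continue
        current = transform(current)
        # save: keep the result of this step.
        if flags.get("save", False):
            saved.append(current)
        # restartFrame: go back to the untouched input for the next step.
        if flags.get("restartFrame", False):
            current = original
    return saved


# Integers stand in for images so the flow is easy to follow.
steps = [
    (lambda x: x + 1, {"save": True, "restartFrame": False}),
    (lambda x: x * 10, {"save": True, "restartFrame": True}),
    (lambda x: x - 3, {"save": True, "restartFrame": False}),
]
print(run_pipeline(1, steps))  # [2, 20, -2]
```

The second step works on the output of the first (2 becomes 20), while the third starts again from the original input because the second step restarted the frame.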
As you have seen we can define any type of crazy configuration and augment our images with the available methods while choosing whether to save each augmentation, restart the frame to its original space or randomize the event so we make things crazier. Get creative and define your own data augmentation pipelines.
Once the configuration file is created, we can apply the data augmentation pipeline.
- Create a file named apply_bounding_box_augmentations.py
- Insert the following code to apply_bounding_box_augmentations.py
import os
from impy.ObjectDetectionDataset import ObjectDetectionDataset

def main():
    # Define the path to images and annotations
    images_path: str = os.path.join(os.getcwd(), "cars_dataset", "images")
    annotations_path: str = os.path.join(os.getcwd(), "cars_dataset", "annotations", "xmls")
    # Define the name of the dataset
    dbName: str = "CarsDataset"
    # Create an ObjectDetectionDataset object
    obda: ObjectDetectionDataset = ObjectDetectionDataset(imagesDirectory=images_path, annotationsDirectory=annotations_path, databaseName=dbName)
    # Apply data augmentation by using the following method of the ObjectDetectionDataset class.
    configuration_file: str = os.path.join(os.getcwd(), "augmentation_configuration.json")
    images_output_path: str = os.path.join(os.getcwd(), "cars_dataset", "images_augmented")
    annotations_output_path: str = os.path.join(os.getcwd(), "cars_dataset", "annotations_augmented", "xmls")
    obda.applyDataAugmentation(configurationFile=configuration_file, outputImageDirectory=images_output_path, outputAnnotationDirectory=annotations_output_path)

if __name__ == "__main__":
    main()
- Now execute the script by running:
python apply_bounding_box_augmentations.py
Next I present the results of the augmentations. Note that the bounding boxes are preserved through the transformations, which saves you a lot of time when you want to increase the representational complexity of your data.
A class that holds a detection dataset. Parameters:
- imagesDirectory: A string that contains a path to a directory of images.
- annotationsDirectory: A string that contains a path to a directory of xml annotations.
- databaseName: A string that contains a name.
Class methods:
Checks the consistency of the image files with the annotation files.
Examines all the annotations in the dataset and detects if any is empty or wrong. An annotation is considered wrong if it contains a bounding box coordinate that is greater than the image width or height, respectively, or less than zero.
- removeEmpty: A boolean that if True removes the annotations that are considered to be wrong or empty.
- saveDataFrame: A boolean that if True saves the dataframe with the stats.
- outputDirDataFrame: A string that contains a valid path.
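The "wrong annotation" rule above can be written as a short predicate. This is a sketch rather than the library's code: `is_wrong_box` is a hypothetical name, and the box is assumed to be given as `(xmin, ymin, xmax, ymax)` in pixels.

```python
from typing import Sequence


def is_wrong_box(box: Sequence[int], width: int, height: int) -> bool:
    """True if any coordinate is negative or exceeds the image width/height."""
    xmin, ymin, xmax, ymax = box
    x_out = xmin < 0 or xmax < 0 or xmin > width or xmax > width
    y_out = ymin < 0 or ymax < 0 or ymin > height or ymax > height
    return x_out or y_out


print(is_wrong_box((10, 20, 300, 400), width=3840, height=2160))   # False
print(is_wrong_box((10, 20, 4000, 400), width=3840, height=2160))  # True
```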
Saves the bounding boxes of the data set as images. Parameters:
- outputDirectory: A string that contains a path to a valid directory.
- filterClasses: A list of strings that contains the classes expected to appear as labels in the dataset annotations.
Iterate over images and annotations and execute reduceImageDataPointByRoi for each one.
- offset: A list or tuple of ints.
- outputImageDirectory: A string that contains a valid path.
- outputAnnotationDirectory: A string that contains a valid path.
- imagePath: A string that contains the path to an image.
- annotationPath: A string that contains a path to a xml annotation.
- offset: A list or tuple of ints.
- outputImageDirectory: A string that contains a valid path.
- outputAnnotationDirectory: A string that contains a valid path.
- configurationFile: A string that contains a path to a json file.
- outputImageDirectory: A string that contains a valid path.
- outputAnnotationDirectory: A string that contains a valid path.
- threshold: A float in the range [0-1].
- frame: A numpy tensor that contains an image.
- augmentationType: A string that contains a valid Impy augmentation type.
- parameters: A list of strings that contains the respective parameters for the type of augmentation.
- frame: A numpy tensor that contains an image.
- boundingBoxes: A list of lists of ints that contains coordinates for a bounding box.
- augmentationType: A string that contains a valid Impy augmentation type.
- parameters: A list of strings that contains the respective parameters for the type of augmentation.
All of the augmentations ought to implement the following parameters:
- "save": saves the current transformation if true.
- "restartFrame": restarts the frame to its original space if true, otherwise maintains the augmentation applied so far.
- "randomEvent": uses a stochastic function to randomize whether this augmentation is applied or not.
Apply a bitwise_not operation to the pixels in the image. Code example:
{
  "invertColor": {
    "Cspace": [true, true, true]
  }
}
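For intuition, the operation can be reproduced with NumPy on a tiny array. This is a sketch and not impy's implementation; in particular, reading "Cspace" as one boolean per channel (true means that channel is inverted) is an assumption.

```python
import numpy as np

# Hypothetical 1x2 RGB image with 8-bit channels.
image = np.array([[[0, 128, 255], [10, 20, 30]]], dtype=np.uint8)

# Assumed meaning of "Cspace": one boolean per channel, True = invert it.
cspace = np.array([True, False, True])

inverted = image.copy()
# For uint8 data, bitwise NOT maps every value v to 255 - v.
inverted[..., cspace] = np.bitwise_not(image[..., cspace])

print(inverted.tolist())  # [[[255, 128, 0], [245, 20, 225]]]
```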
Equalize the color space of the image. Code example:
{
  "histogramEqualization": {
    "equalizationType": 1
  }
}
Multiply the pixel distribution with a scalar. Code example:
{
  "changeBrightness": {
    "coefficient": 1.2
  }
}
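A minimal NumPy sketch of this kind of brightness change is shown below. The function name `change_brightness` is hypothetical, and clipping the result to the 8-bit range is an assumption about how overflow should be handled, not a statement about impy's behavior.

```python
import numpy as np


def change_brightness(image: np.ndarray, coefficient: float) -> np.ndarray:
    """Multiply pixel values by a scalar and clip to the valid 8-bit range."""
    scaled = image.astype(np.float64) * coefficient
    # Round before casting so values are not silently truncated.
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


image = np.array([[100, 200, 250]], dtype=np.uint8)
print(change_brightness(image, 1.2).tolist())  # [[120, 240, 255]]
```

Without the cast to float and the clip, 250 * 1.2 would not fit in an unsigned 8-bit value.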
Apply a sharpening filter to the image. Code example:
{
  "sharpening": {
    "weight": 0.8
  }
}
Add gaussian noise to the image. Code example:
```json
{
  "addGaussianNoise": {
    "coefficient": 0.5
  }
}
```
Apply a Gaussian low pass filter to the image. Code example:
{
  "gaussianBlur": {
    "sigma": 2
  }
}
Shift the colors of the image. Code example:
{
  "shiftBlur": {}
}
All of the augmentations ought to implement the following parameters:
- "save": saves the current transformation if true.
- "restartFrame": restarts the frame to its original space if true, otherwise maintains the augmentation applied so far.
- "randomEvent": uses a stochastic function to randomize whether this augmentation is applied or not.
Scales the size of an image and maintains the location of its bounding boxes. Code example:
{
  "scale": {
    "size": [1.2, 1.2],
    "zoom": true,
    "interpolationMethod": 1
  }
}
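To see why the boxes stay on their objects when an image is scaled, consider the coordinate arithmetic below. It is a sketch under assumptions: `scale_box` is a hypothetical helper, boxes are `[xmin, ymin, xmax, ymax]`, and "size" is read as an (x factor, y factor) pair, which the source does not spell out.

```python
def scale_box(box, size):
    """Scale [xmin, ymin, xmax, ymax] by the assumed (x_factor, y_factor) pair."""
    fx, fy = size
    xmin, ymin, xmax, ymax = box
    return [round(xmin * fx), round(ymin * fy), round(xmax * fx), round(ymax * fy)]


print(scale_box([100, 50, 200, 150], size=[1.2, 1.2]))  # [120, 60, 240, 180]
```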
Crops the bounding boxes of an image. Specify the size of the crop in the size parameter. Code example:
{
  "crop": {
    "size": [50, 50]
  }
}
Pads the bounding boxes of an image, i.e. adds pixels from outside the bounding box. Specify the number of pixels to add in the size parameter. Code example:
{
  "pad": {
    "size": [20, 20]
  }
}
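The effect on the box coordinates can be sketched as follows. This is an illustration, not the library's code: `pad_box` is a hypothetical helper, "size" is assumed to mean the pixels added on each side along x and y, and the padded box is assumed to be clipped to the image borders.

```python
def pad_box(box, size, image_width, image_height):
    """Grow [xmin, ymin, xmax, ymax] by (pad_x, pad_y) and clip to the image."""
    pad_x, pad_y = size
    xmin, ymin, xmax, ymax = box
    return [
        max(0, xmin - pad_x),
        max(0, ymin - pad_y),
        min(image_width, xmax + pad_x),
        min(image_height, ymax + pad_y),
    ]


print(pad_box([10, 100, 200, 300], size=[20, 20], image_width=210, image_height=1000))
# [0, 80, 210, 320]
```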
Flips the bounding boxes of an image in the x axis. Code example:
{
  "horizontalFlip": {}
}
Flips the bounding boxes of an image in the y axis. Code example:
{
  "verticalFlip": {}
}
Rotates the bounding boxes of an image anti-clockwise. Code example:
{
  "rotation": {
    "theta": 0.5
  }
}
Draws random squares of a specific color and size in the area of the bounding box. Code example:
{
  "jitterBoxes": {
    "size": [10, 10],
    "quantity": 5,
    "color": [255, 255, 255]
  }
}
Set pixels inside the bounding box to zero depending on a probability p extracted from a normal distribution. If p > threshold, then the pixel is changed. Code example:
{
  "dropout": {
    "size": [5, 5],
    "threshold": 0.8,
    "color": [255, 255, 255]
  }
}
If you want to contribute to this library, please follow the next steps to set up a development environment.
- Install Anaconda with Python 3.7 or earlier (tested with Python 3.5, 3.6 and 3.7). Then create an empty environment.
conda create --name=impy python=3.7
- Activate the conda environment
source activate impy
- Clone the repository
git clone https://github.com/lozuwa/impy
- Install the dependencies that are in setup.py
- You are good to go. Note there are unit tests for each script inside the impy folder.
- Go to impy's parent directory and run the following code:
python setup.py sdist bdist_wheel
- A folder named dist will appear. It contains the .whl and .tar.gz