Albumentations

The library works with images in HWC format.
The library is faster than other libraries on most of the transformations.
Built on NumPy, OpenCV, and imgaug, picking the best from each of them.
Simple, flexible API that allows the library to be used in any computer vision pipeline.
Large, diverse set of transformations.
Easy to extend the library to wrap around other libraries.
Easy to extend to other tasks.
Supports transformations on images, masks, keypoints, and bounding boxes.
Supports Python 3.5-3.7.
Easy integration with PyTorch.
Easy transfer from torchvision.
Was used to get top results in many DL competitions at Kaggle, topcoder, CVPR, MICCAI.
Written by Kaggle Masters.
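
The first point in the list above means that images are expected as NumPy arrays shaped (height, width, channels). As a minimal sketch, assuming an image that is stored channel-first (CHW), the array can be rearranged with plain NumPy before it is augmented. The array names and sizes here are illustrative only.

```python
import numpy as np

# Hypothetical channel-first image: 3 channels, 480 rows, 640 columns.
chw_image = np.zeros((3, 480, 640), dtype=np.uint8)

# Move the channel axis to the end to get height x width x channels (HWC).
hwc_image = np.ascontiguousarray(np.transpose(chw_image, (1, 2, 0)))

print(hwc_image.shape)  # (480, 640, 3)
```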

How to use
Authors
Installation
- PyPI
- Conda
Documentation
Pixel-level transforms
Spatial-level transforms
Migrating from torchvision to albumentations
Benchmarking results
Contributing
- Adding new transforms
Building the documentation
Comments
Citing
Competitions won with the library
Industry users

How to use

All in one showcase notebook - showcase.ipynb

Classification - example.ipynb

Object detection - example_bboxes.ipynb

Non-8-bit images - example_16_bit_tiff.ipynb

Image segmentation - example_kaggle_salt.ipynb

Keypoints - example_keypoints.ipynb

Custom targets - example_multi_target.ipynb

Weather transforms - example_weather_transforms.ipynb

Serialization - serialization.ipynb

Replay/Deterministic mode - replay.ipynb

You can use this Google Colaboratory notebook to adjust image augmentation parameters and see the resulting images.

Authors

Alexander Buslaev

Alex Parinov

Vladimir I. Iglovikov

Eugene Khvedchenya

Mikhail Druzhinin

Installation

PyPI

You can use pip to install albumentations:

pip install albumentations

If you want the latest version of the code before it is released on PyPI, you can install the library from GitHub:

pip install -U git+https://github.com/albumentations-team/albumentations

It also works in Kaggle GPU kernels:

!pip install albumentations > /dev/null

Conda

To install albumentations using conda, first install imgaug from the conda-forge channel:

conda install -c conda-forge imgaug
conda install albumentations -c conda-forge

Documentation

The full documentation is available at https://albumentations.ai/docs/.

Pixel-level transforms

Pixel-level transforms change only the input image and leave any additional targets such as masks, bounding boxes, and keypoints unchanged.

Spatial-level transforms

Spatial-level transforms change the input image and its additional targets, such as masks, bounding boxes, and keypoints, at the same time. The following table shows which additional targets are supported by each transform.

Transform	Image	Masks	BBoxes	Keypoints
CenterCrop	✓	✓	✓	✓
CoarseDropout	✓	✓
Crop	✓	✓	✓	✓
CropNonEmptyMaskIfExists	✓	✓	✓	✓
ElasticTransform	✓	✓
Flip	✓	✓	✓	✓
GridDistortion	✓	✓
GridDropout	✓	✓
HorizontalFlip	✓	✓	✓	✓
IAAAffine	✓	✓	✓	✓
IAACropAndPad	✓	✓	✓	✓
IAAFliplr	✓	✓	✓	✓
IAAFlipud	✓	✓	✓	✓
IAAPerspective	✓	✓	✓	✓
IAAPiecewiseAffine	✓	✓	✓	✓
Lambda	✓	✓	✓	✓
LongestMaxSize	✓	✓	✓	✓
MaskDropout	✓	✓
NoOp	✓	✓	✓	✓
OpticalDistortion	✓	✓
PadIfNeeded	✓	✓	✓	✓
RandomCrop	✓	✓	✓	✓
RandomCropNearBBox	✓	✓	✓	✓
RandomGridShuffle	✓	✓
RandomResizedCrop	✓	✓	✓	✓
RandomRotate90	✓	✓	✓	✓
RandomScale	✓	✓	✓	✓
RandomSizedBBoxSafeCrop	✓	✓	✓
RandomSizedCrop	✓	✓	✓	✓
Resize	✓	✓	✓	✓
Rotate	✓	✓	✓	✓
ShiftScaleRotate	✓	✓	✓	✓
SmallestMaxSize	✓	✓	✓	✓
Transpose	✓	✓	✓	✓
VerticalFlip	✓	✓	✓	✓

Migrating from torchvision to albumentations

Migrating from torchvision to albumentations is simple: you only need to change a few lines of code. Albumentations has equivalents for common torchvision transforms as well as plenty of transforms that are not present in torchvision. migrating_from_torchvision_to_albumentations.ipynb shows how one can migrate code from torchvision to albumentations.

Benchmarking results

To run the benchmark yourself, follow the instructions in benchmark/README.md.

Results for running the benchmark on the first 2000 images from the ImageNet validation set using an Intel Xeon Gold 6140 CPU. All outputs are converted to a contiguous NumPy array with the np.uint8 data type. The table shows how many images per second can be processed on a single core; higher is better.

Transform	albumentations 0.5.0	imgaug 0.4.0	torchvision (Pillow-SIMD backend) 0.7.0	keras 2.4.3	augmentor 0.2.8	solt 0.1.9
HorizontalFlip	9909	2821	2267	873	2301	6223
VerticalFlip	4374	2218	1952	4339	1968	3562
Rotate	371	296	163	27	60	345
ShiftScaleRotate	635	437	147	28	-	-
Brightness	2751	1178	419	229	418	2300
Contrast	2756	1213	352	-	348	2305
BrightnessContrast	2738	699	195	-	193	1179
ShiftRGB	2757	1176	-	348	-	-
ShiftHSV	597	284	58	-	-	137
Gamma	2844	-	382	-	-	946
Grayscale	5159	428	709	-	1064	1273
RandomCrop64	175886	3018	52103	-	41774	20732
PadToSize512	3418	-	574	-	-	2874
Resize512	1003	634	1036	-	1016	977
RandomSizedCrop_64_512	3191	939	1594	-	1529	2563
Posterize	2778	-	-	-	-	-
Solarize	2762	-	-	-	-	-
Equalize	644	413	-	-	735	-
Multiply	2727	1248	-	-	-	-
MultiplyElementwise	118	209	-	-	-	-
ColorJitter	368	78	57	-	-	-

Python and library versions: Python 3.8.6 (default, Oct 13 2020, 20:37:26) [GCC 8.3.0], numpy 1.19.2, pillow-simd 7.0.0.post3, opencv-python 4.4.0.44, scikit-image 0.17.2, scipy 1.5.2.

Contributing

To create a pull request to the repository, follow the documentation at docs/contributing.rst.

Adding new transforms

If you are contributing a new transformation, make sure to update the "Pixel-level transforms" and/or "Spatial-level transforms" sections of this file (README.md). To do this, run (with Python 3 only):

python3 tools/make_transforms_docs.py make

and copy/paste the results into the corresponding sections. To validate your modifications, you can run:

python3 tools/make_transforms_docs.py check README.md

Building the documentation

Go to the docs/ directory
```
cd docs
```
Install the required libraries
```
pip install -r requirements.txt
```
Build the HTML files
```
make html
```
Open _build/html/index.html in a browser.

Alternatively, you can start a web server that rebuilds the documentation automatically when a change is detected by running make livehtml.

Competitions won with the library

Albumentations is widely used in computer vision competitions on Kaggle and other platforms.

You can find their names and links to the solutions here.

Used by

Comments

On some systems, when training on multiple GPUs, PyTorch may deadlock the DataLoader if OpenCV was compiled with OpenCL optimizations. Adding the following lines before the library import may help. For more details, see pytorch/pytorch#1355.

import cv2
cv2.setNumThreads(0)
cv2.ocl.setUseOpenCL(False)

Citing

If you find this library useful for your research, please consider citing Albumentations: Fast and Flexible Image Augmentations:

@Article{info11020125,
    AUTHOR = {Buslaev, Alexander and Iglovikov, Vladimir I. and Khvedchenya, Eugene and Parinov, Alex and Druzhinin, Mikhail and Kalinin, Alexandr A.},
    TITLE = {Albumentations: Fast and Flexible Image Augmentations},
    JOURNAL = {Information},
    VOLUME = {11},
    YEAR = {2020},
    NUMBER = {2},
    ARTICLE-NUMBER = {125},
    URL = {https://www.mdpi.com/2078-2489/11/2/125},
    ISSN = {2078-2489},
    DOI = {10.3390/info11020125}
}

You can find the full list of papers that cite Albumentations here.
