BBAug is a Python package that implements the Google Brain Team's bounding box augmentation policies. The package is aimed at PyTorch users who wish to use these policies to augment bounding boxes while training a model. Currently all 4 versions of the policies are implemented. This package builds on top of the excellent image augmentation package imgaug.
- Implementation of all 4 policies
- Custom policies
- Custom augmentations
- Bounding boxes are removed if they fall outside of the image*
- Bounding boxes are clipped if they are partially outside the image*
- For augmentations that imply a direction (e.g. rotation), the direction is randomly determined
*Does not happen for bounding box specific augmentations
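The removal and clipping behaviour listed above can be illustrated with a minimal sketch. This is not BBAug's own code: the helper name `clip_or_drop_boxes` is hypothetical, and the sketch assumes boxes are given as [x_min, y_min, x_max, y_max] in pixels.

```python
from typing import List, Sequence

Box = List[int]  # [x_min, y_min, x_max, y_max] in pixels


def clip_or_drop_boxes(
    boxes: Sequence[Sequence[int]], width: int, height: int
) -> List[Box]:
    """Hypothetical helper: drop boxes fully outside the image, clip the rest."""
    kept = []
    for x_min, y_min, x_max, y_max in boxes:
        # A box with no overlap with the image area is removed entirely.
        if x_max <= 0 or y_max <= 0 or x_min >= width or y_min >= height:
            continue
        # A partially visible box is clipped to the image borders.
        kept.append(
            [max(x_min, 0), max(y_min, 0), min(x_max, width), min(y_max, height)]
        )
    return kept


boxes = [[-10, 20, 50, 80], [120, 10, 150, 40], [30, 30, 60, 60]]
print(clip_or_drop_boxes(boxes, width=100, height=100))
# [[0, 20, 50, 80], [30, 30, 60, 60]]
```

The first box is clipped on its left edge, the second lies entirely to the right of a 100 pixel wide image and is dropped, and the third is unchanged.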
- Implementation of version 2 of policies (implemented in v0.2)
- Implementation of version 1 of policies (implemented in v0.2)
- For bounding box augmentations, apply the probability individually for each box, not collectively (implemented in v0.4)
Installation is best done via pip:
pip install bbaug
- Python 3.6+
- PyTorch
- Torchvision
For a detailed description of usage, please refer to the Python notebooks provided in the notebooks folder.
An augmentation is defined by 3 attributes:
- Name: Name of the augmentation
- Probability: Probability of augmentation being applied
- Magnitude: The degree of the augmentation (values are integers between 0 and 10)
A sub-policy is a collection of augmentations, e.g.:
sub_policy = [('translation', 0.5, 1), ('rotation', 1.0, 9)]
In the above example we have two augmentations in a sub-policy. The translation augmentation has a probability of 0.5 and a magnitude of 1, whereas the rotation augmentation has a probability of 1.0 and a magnitude of 9. The magnitudes do not translate directly into augmentation parameters, i.e. a magnitude of 9 does not mean a 9 degree rotation. Instead, scaling is applied to the magnitude to determine the value passed to the augmentation method. The scaling varies depending on the augmentation used.
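To make the idea of magnitude scaling concrete, here is a minimal sketch. It assumes a simple linear scaling and a hypothetical maximum rotation of 30 degrees; the function name `scale_magnitude` and both of those choices are illustrative assumptions, not the scaling BBAug actually applies.

```python
def scale_magnitude(magnitude: int, max_value: float, levels: int = 10) -> float:
    """Map an integer magnitude in [0, levels] linearly onto [0, max_value].

    Assumption: linear scaling. The real scaling depends on the augmentation.
    """
    if not 0 <= magnitude <= levels:
        raise ValueError(f"magnitude must be between 0 and {levels}")
    return magnitude * max_value / levels


# Hypothetical example: a rotation whose largest allowed angle is 30 degrees.
print(scale_magnitude(9, max_value=30.0))  # 27.0
```

Under these assumptions a magnitude of 9 maps to a 27 degree rotation, which shows why the magnitude should not be read as the raw parameter value.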
A policy is a set of sub-policies:
policies = [
[('translation', 0.5, 1), ('rotation', 1.0, 9)],
[('colour', 0.5, 1), ('cutout', 1.0, 9)],
[('rotation', 0.5, 1), ('solarize', 1.0, 9)]
]
During training, a random sub-policy is selected from the policy and applied to the image. Because each augmentation has its own probability, this adds a degree of stochasticity to training.
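The selection step can be sketched in plain Python as follows. This only illustrates the sampling logic described above: the function `sample_augmentations` is hypothetical and does not apply any image transformation or call into BBAug.

```python
import random

policy = [
    [('translation', 0.5, 1), ('rotation', 1.0, 9)],
    [('colour', 0.5, 1), ('cutout', 1.0, 9)],
    [('rotation', 0.5, 1), ('solarize', 1.0, 9)],
]


def sample_augmentations(policy, rng):
    """Pick one sub-policy, then keep each augmentation with its own probability."""
    sub_policy = rng.choice(policy)
    return [
        (name, magnitude)
        for name, probability, magnitude in sub_policy
        # rng.random() is in [0, 1), so a probability of 1.0 always fires.
        if rng.random() < probability
    ]


rng = random.Random(0)  # seeded only to make the sketch repeatable
for _ in range(3):
    print(sample_augmentations(policy, rng))
```

Each call returns either one or two (name, magnitude) pairs here, because the second augmentation in every sub-policy has a probability of 1.0 while the first is applied only half of the time.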
Each augmentation contains a string referring to the name of the augmentation. The augmentations module contains a dictionary mapping the name to a method reference of the augmentation.
from bbaug.augmentations import NAME_TO_AUGMENTATION
print(NAME_TO_AUGMENTATION) # Shows the dictionary of the augmentation name to the method reference
Some augmentations are applied only to the bounding boxes. Augmentations whose names have the suffix BBox are applied only to the bounding boxes in the image.
To obtain a list of all available policies, run the list_policies method. This will return a list of strings containing the function names for the policy sets.
from bbaug.policies import list_policies
print(list_policies()) # List of policies available
from bbaug.policies import policies_v3
print(policies_v3()) # Will list all the policies in version 3
To visualise a policy on a single image, a visualise_policy method is available in the visuals module.
from bbaug.visuals import visualise_policy
visualise_policy(
'path/to/image',
'save/dir/of/augmentations',
bounding_boxes, # Bounding boxes is a list of lists of bounding boxes in pixels (int): e.g. [[x_min, y_min, x_max, y_max], [x_min, y_min, x_max, y_max]]
labels, # Class labels for the bounding boxes as an iterable of ints e.g. [0, 5]
policy, # the policy to visualise
name_to_augmentation, # (optional, default: augmentations.NAME_TO_AUGMENTATION) The dictionary mapping the augmentation name to the augmentation method
)
To help integrate the policies into training, a PolicyContainer class is available in the policies module. The container accepts the following inputs:
- policy_set (required): The policy set to use
- name_to_augmentation (optional, default: augmentations.NAME_TO_AUGMENTATION): The dictionary mapping the augmentation name to the augmentation method
- return_yolo (optional, default: False): Return the bounding boxes in YOLO format; otherwise [x_min, y_min, x_max, y_max] in pixels is returned
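For readers unfamiliar with the two box formats, the following sketch converts a pixel box to the commonly used YOLO layout of [x_center, y_center, width, height], normalised by the image size. The helper `to_yolo` is hypothetical, and the exact layout returned by the container when return_yolo is True is assumed here rather than taken from the package.

```python
def to_yolo(box, image_width, image_height):
    """Convert [x_min, y_min, x_max, y_max] in pixels to a normalised
    [x_center, y_center, width, height] box (assumed YOLO layout)."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return [x_center, y_center, width, height]


print(to_yolo([50, 100, 150, 300], image_width=200, image_height=400))
# [0.5, 0.5, 0.5, 0.5]
```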
Usage of the policy container:
from bbaug import policies
# select policy v3 set
aug_policy = policies.policies_v3()
# instantiate the policy container with the selected policy set
policy_container = policies.PolicyContainer(aug_policy)
# select a random policy from the policy set
random_policy = policy_container.select_random_policy()
# Apply the augmentation. Returns the augmented image and bounding boxes.
# Image is a numpy array of the image
# Bounding boxes is a list of lists of bounding boxes in pixels (int).
# e.g. [[x_min, y_min, x_max, y_max], [x_min, y_min, x_max, y_max]]
# Labels are the class labels for the bounding boxes as an iterable of ints e.g. [1,0]
img_aug, bbs_aug = policy_container.apply_augmentation(random_policy, image, bounding_boxes, labels)
# img_aug: numpy array of the augmented image
# bbs_aug: numpy array of augmented bounding boxes in format: [[label, x_min, y_min, x_max, y_max],...]
The policies implemented in bbaug are shown below. Each column represents a different run for that given sub-policy. Because each augmentation in the sub-policy has its own probability, the results vary between runs.
These are the policies used in the paper.