PSIS

Data Augmentation for Object Detection via Progressive and Selective Instance-Switching

Abstract

We proposed a simple yet effective data augmentation for object detection, whose core is a progressive and selective instance-switching (PSIS) method for synthetic image generation. The proposed PSIS as data augmentation for object detection benefits several merits, i.e., increase of diversity of samples, keep of contextual coherence in the original images, no requirement of external datasets, and consideration of instance balance and class importance. Experimental results demonstrate the effectiveness of our PSIS against the existing data augmentation, including horizontal flipping and training time augmentation for FPN, segmentation masks and training time augmentation for Mask R-CNN, multi-scale training strategy for SNIPER, and Context-DA for BlitzNet. The experiments are conducted on the challenging MS COCO benchmark, and results demonstrate our PSIS brings clear improvement over various state-of-the-art detectors

Framework

Machine configurations

OS: Linux 16.02
GPU: TiTan 1080 Ti
CUDA: version 8.0
CUDNN: version 5.1

Slight changes may not results instabilities

Synthetic Image Generation

In this part, we provide the code for synthetic image generation by taking MS COCO 2017 training set as benchmark. We first generate the instance masks for the images in the training set. Then we use the methods describe in Section 3.1 in the paper to generate the quadruple. At last, depending on the quadruple, we generate the synthetic images by switching the instance.

Instance Mask Generation

Use the code extract_mask.m to generate instance mask for the images in MS COCO 2017 training dataset.

Quadruple Generation

Use the code extract_annotation_pair.py to generate quadruple for each category which satisfy the conditions. The ouput quadruple will saved in a txt file. We also provide the Omega_uni, Omega_equ and Omega_aug in dataset which follow the instance distribution in the paper.

Synthetic Image and Annotation Generation

At last, use the code instance_switch.py to generate the corresponding images depending on the input quadruple. Meanwhile, the corresponding annotation file will also be generated.

For generting images, just modify the ANN2ann_FILE in file instance_switch(e.g., dataset/omega_uni.txt) and the synthetic images and annotation file will be generated in the corresponding directory.

Our synthetic images and corresponding annotation files can be downloaded in Here(Type the Extraction Code: wnjx)

Class Imbalance Loss

The code for class imbalance loss is in \class_imbalance_loss directory, please refer to the \class_imbalance_loss\README.md for detail using.

Results

Applying PSIS to State-of-the-art Detectors

We directly employ this dataset to train four state-of-the-art detectors (i.e., FPN , Mask R-CNN , BlitzNet and SNIPER), and report results on test server for comparing with other augmentation methods.

FPN

We adopt PSIS to FPN by the publicly availabel toolkit. The configuration files are in the configs/FPN. For more training and testing information, please refer to the code. The results are shown as belows:

Training Sets	AP@0.50:0.95	AP@0.50	AP@0.75	AP@Small	AP@Med.	AP@Large	AR@1	AR@10	AR@100	AR@Small	AR@Med.	AR@Large
ori*	38.1	59.1	41.3	20.7	42.0	51.1	31.6	49.3	51.5	31.1	55.7	66.7
psis*	38.7	59.7	41.8	21.6	43.0	51.7	32.0	50.0	52.3	32.3	56.4	67.6
ori	38.6	60.4	41.6	22.3	42.8	50.0	31.8	50.6	53.2	34.5	57.7	66.8
psis(model)	39.8	61.0	43.4	22.7	44.2	52.1	32.6	51.1	53.6	34.8	59.0	68.5
ori×2	39.4	60.7	43.0	21.1	43.6	52.1	32.5	51.0	53.4	33.6	57.6	68.6
psis×2 (Coming Soon)	40.2	61.1	44.2	22.3	45.7	51.6	32.6	51.2	53.6	33.6	58.9	68.8

×2 means two times training epochs, which is regarded as training-time augmentation and * indicates no horizontal fliping. Above results clearly demonstrate our PSIS is superior and complementary to horizontal flipping and training-time augmentation methods.

Mask R-CNN

We evaluate PSIS using Mask R-CNN. The configuration files are in the configs/Mask R-CNN. For more training and testing information, please refer to the code. The results are shown as belows:

Training Sets	AP@0.50:0.95	AP@0.50	AP@0.75	AP@Small	AP@Med.	AP@Large	AR@1	AR@10	AR@100	AR@Small	AR@Med.	AR@Large
ori	39.4	61.0	43.3	23.1	43.7	51.3	32.3	51.5	54.3	34.9	58.7	68.5
psis(model)	40.7	61.8	44.5	23.4	45.2	53.0	33.3	52.8	55.4	35.5	59.7	70.3
ori×2	40.4	61.6	44.2	22.3	44.8	52.9	33.1	52.0	54.5	34.7	58.8	69.5
psis×2(Coming Soon)	41.2	62.5	45.4	23.7	46.0	53.6	33.4	52.9	55.5	36.2	60.0	70.3

×2 means two times training epochs, which is regarded as training-time augmentation. Above results clearly demonstrate our PSIS is superior and complementary to training-time augmentation method.

BlitzNet

We evaluate PSIS with the recently proposed context-based data augmentation method. We adopt PSIS to BlitzNet, For more traning and testing information, please refer to code.

Training Sets	AP@0.50:0.95	AP@0.50	AP@0.75	AP@Small	AP@Med.	AP@Large
ori	27.3	46.0	28.1	10.7	26.8	46.0
Context-DA	28.0	46.7	28.9	10.7	27.8	47.0
psis(Coming Soon)	30.8	50.0	32.2	12.6	31.0	50.2

SNIPER

We use SNIPER to verify the effectiveness of PSIS under multi-scale training strategy. The configuration files are in the configs/SNIPER. For more training and testing information, please refer to the code. The results are shown as belows:

Training Sets	AP@0.50:0.95	AP@0.50	AP@0.75	AP@Small	AP@Med.	AP@Large	AR@1	AR@10	AR@100	AR@Small	AR@Med.	AR@Large
ori	43.4	62.8	48.8	27.4	45.2	56.2	N/A	N/A	N/A	N/A	N/A	N/A
psis(Coming Soon)	44.2	63.5	49.3	29.3	46.2	57.1	35.0	60.1	65.9	50.4	70.4	78.0

Generalization to Instance Segmentation

We verify the generalization ability of our PSIS on instance segmentation task of MS COCO 2017. The instance segmetatnion results are shown belows:

Training Sets	AP@0.50:0.95	AP@0.50	AP@0.75	AP@Small	AP@Med.	AP@Large	AR@1	AR@10	AR@100	AR@Small	AR@Med.	AR@Large
ori	35.9	57.7	38.4	19.2	39.7	49.7	30.5	47.3	49.6	29.7	53.8	65.8
psis(model)	36.7	58.4	39.4	19.0	40.6	50.2	31.0	48.2	50.3	29.8	54.4	66.9
ori×2	36.6	58.2	39.2	18.5	40.3	50.4	31.0	47.7	49.7	29.5	53.5	66.6
psis×2(Coming Soon)	37.1	58.8	39.9	19.3	41.2	50.8	31.1	47.7	50.4	30.2	54.5	67.9

Above results clearly show PSIS offers a new and complementary way to use instance masks for improving both detection and segmentation performance.

Examples of Synthetic Images Generated by our IS

Here we show some examples of synthetic images generated by our IS strategy. The new (switched) instances are denoted in red boxes, and our instance-switching strategy can clearly preserve contextual coherence in the original images.

Hwang64/PSIS