
[ACCV2020] Localize to Classify and Classify to Localize: Mutual Guidance in Object Detection

Primary language: Python. License: MIT.

Localize to Classify and Classify to Localize: Mutual Guidance in Object Detection

By Heng Zhang, Elisa FROMONT, Sébastien LEFEVRE, Bruno AVIGNON

Introduction

Most deep learning object detectors are based on the anchor mechanism and use the Intersection over Union (IoU) between predefined anchor boxes and ground truth boxes to evaluate the matching quality between anchors and objects. In this paper, we question this use of IoU and propose a new anchor matching criterion guided, during the training phase, by the optimization of both the localization and the classification tasks: the predictions related to one task are used to dynamically assign sample anchors and improve the model on the other task, and vice versa. This is the PyTorch implementation of Mutual Guidance detectors. For more details, please refer to our ACCV paper.
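
For readers unfamiliar with the baseline, the static IoU-based matching that the paper questions can be sketched as follows. This is a minimal illustration and not code from this repository: the function names iou and match_anchors, the (x1, y1, x2, y2) box format, and the 0.5 positive threshold are all assumptions chosen for the example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_anchors(anchors, gt_boxes, pos_threshold=0.5):
    """Return, per anchor, the index of its best ground truth box or -1.

    The threshold value is an assumption for illustration only.
    """
    matches = []
    for anchor in anchors:
        best_idx, best_iou = -1, 0.0
        for idx, gt in enumerate(gt_boxes):
            overlap = iou(anchor, gt)
            if overlap > best_iou:
                best_idx, best_iou = idx, overlap
        # An anchor is a positive sample only if it overlaps enough.
        matches.append(best_idx if best_iou >= pos_threshold else -1)
    return matches


anchors = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
gt_boxes = [(0, 0, 10, 12)]
print(match_anchors(anchors, gt_boxes))  # [0, -1, -1]
```

Note that this criterion depends only on box geometry and is fixed before training starts. Mutual Guidance instead uses the model's own predictions during training to assign anchors, as described in the paper.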

Experimental results

VOC2007 Test

Detector                        Resolution  mAP   AP50  AP75  Trained model
FSSD (VGG16)                    320x320     54.1  80.1  58.3  uploading
FSSD (VGG16) + MG               320x320     56.2  80.4  61.4  uploading
RetinaNet (VGG16)               320x320     55.2  80.2  59.6  uploading
RetinaNet (VGG16) + MG          320x320     57.7  81.1  62.9  uploading
RFBNet (VGG16)                  320x320     55.6  80.9  59.6  uploading
RFBNet (VGG16) + MG             320x320     57.9  81.5  62.6  uploading
RetinaNet (VGG16) + PAFPN       320x320     58.1  81.7  63.3  uploading
RetinaNet (VGG16) + PAFPN + MG  320x320     59.5  82.3  64.2  uploading

COCO2017 Val

Detector                        Resolution  mAP      AP50     AP75     FPS (V100)  Trained model
FSSD (VGG16)                    320x320     31.1     48.9     32.7     365         uploading
FSSD (VGG16) + MG               320x320     32.0     49.3     33.9     365         uploading
RetinaNet (VGG16)               320x320     32.3     50.3     34.0     270         uploading
RetinaNet (VGG16) + MG          320x320     33.6     50.8     35.7     270         uploading
RFBNet (VGG16)                  320x320     33.4     51.6     35.1     115         uploading
RFBNet (VGG16) + MG             320x320     34.6     52.0     36.8     115         uploading
RetinaNet (VGG16) + PAFPN       320x320     33.9     51.9     35.7     220         Google Drive
RetinaNet (VGG16) + PAFPN + MG  320x320     35.3     52.4     37.3     220         Google Drive
RetinaNet (VGG16)               512x512     37.1     56.5     39.5     250         uploading
RetinaNet (VGG16) + MG          512x512     38.2     56.6     41.0     250         uploading
RetinaNet (VGG16) + PAFPN       512x512     running  running  running  195         uploading
RetinaNet (VGG16) + PAFPN + MG  512x512     39.4     57.5     42.3     195         Google Drive

Datasets

First, download the VOC and COCO datasets; you may find the scripts in data/scripts/ useful. Then create a folder named datasets and link the downloaded datasets inside it:

$ mkdir datasets
$ ln -s /path_to_your_voc_dataset datasets/VOCdevkit
$ ln -s /path_to_your_coco_dataset datasets/coco2017

Finally prepare folders to save evaluation results:

$ mkdir eval
$ mkdir eval/COCO
$ mkdir eval/VOC
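
The same folder layout can also be prepared from Python, which may be convenient on systems without a POSIX shell. The sketch below is not part of this repository: prepare_layout is a hypothetical helper, and the demonstration runs inside a temporary directory with placeholder dataset folders so that nothing real is modified.

```python
import tempfile
from pathlib import Path


def prepare_layout(repo_root, voc_path, coco_path):
    """Create datasets/ links and eval/ folders under repo_root (hypothetical helper)."""
    repo_root = Path(repo_root)
    datasets = repo_root / "datasets"
    datasets.mkdir(parents=True, exist_ok=True)

    # Equivalent of the two `ln -s` commands.
    for link_name, target in (("VOCdevkit", voc_path), ("coco2017", coco_path)):
        link = datasets / link_name
        if not link.is_symlink() and not link.exists():
            link.symlink_to(Path(target).resolve(), target_is_directory=True)

    # Equivalent of `mkdir eval eval/COCO eval/VOC`.
    for sub in ("COCO", "VOC"):
        (repo_root / "eval" / sub).mkdir(parents=True, exist_ok=True)
    return datasets


# Demonstration in a throwaway directory; replace the paths with real ones.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "voc_src").mkdir()
    (tmp / "coco_src").mkdir()
    prepare_layout(tmp / "repo", tmp / "voc_src", tmp / "coco_src")
    for path in sorted((tmp / "repo").rglob("*")):
        print(path.relative_to(tmp / "repo").as_posix())
```

Calling the helper a second time is harmless, since existing folders and links are left untouched.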

Training

To train with Mutual Guide (the alternative values for each argument are listed beneath it):

$ python3 main.py --version fssd --backbone vgg16 --dataset voc --size 320 --mutual_guide
                            retinanet       resnet18        coco       512
                            rfbnet
                            pafpn

Remarks:

  • To train without Mutual Guide, omit the --mutual_guide flag;
  • Trained models are saved to weights/ by default.
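
When launching several configurations, it can help to assemble the command line programmatically. The sketch below only uses the flags and values shown in the training command above; build_train_command is a hypothetical helper, and whether every combination of values is supported by main.py is an assumption that should be checked against the repository.

```python
import shlex

# Values taken from the training command above.
VERSIONS = ("fssd", "retinanet", "rfbnet", "pafpn")
BACKBONES = ("vgg16", "resnet18")
DATASETS = ("voc", "coco")
SIZES = (320, 512)


def build_train_command(version, backbone, dataset, size, mutual_guide=True):
    """Return the argv list for one training run (hypothetical helper)."""
    if version not in VERSIONS:
        raise ValueError(f"unknown version: {version}")
    if backbone not in BACKBONES:
        raise ValueError(f"unknown backbone: {backbone}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    if size not in SIZES:
        raise ValueError(f"unknown size: {size}")
    cmd = [
        "python3", "main.py",
        "--version", version,
        "--backbone", backbone,
        "--dataset", dataset,
        "--size", str(size),
    ]
    # Omitting the flag trains the baseline without Mutual Guide.
    if mutual_guide:
        cmd.append("--mutual_guide")
    return cmd


print(shlex.join(build_train_command("retinanet", "vgg16", "coco", 512)))
# To actually launch training: subprocess.run(cmd, check=True)
```

Returning a list rather than a single string keeps the command safe to pass to subprocess.run without shell quoting issues.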

Evaluation

To evaluate a trained network (the alternative values for each argument are listed beneath it):

$ python3 main.py --version fssd --backbone vgg16 --dataset voc --size 320 --trained_model path_to_saved_weights
                            retinanet       resnet18        coco       512
                            pafpn
                            rfbnet

It will directly print the mAP, AP50 and AP75 results on VOC2007 Test or COCO2017 Val.
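
As a reminder of what the reported numbers mean, AP50 and AP75 are average precision measured when a detection must overlap its ground truth box with an IoU of at least 0.5 and 0.75, respectively. The toy sketch below only illustrates the effect of the threshold; it is not the evaluation code of this repository, the IoU values are made up for the example, and a real AP computation additionally ranks detections by confidence and integrates the precision-recall curve.

```python
def count_true_positives(best_ious, threshold):
    """Count detections whose best IoU with a ground truth box meets the threshold."""
    return sum(1 for value in best_ious if value >= threshold)


# Made-up best-match IoU values for five detections.
best_ious = [0.92, 0.78, 0.61, 0.55, 0.30]

print(count_true_positives(best_ious, 0.50))  # 4 detections count at the AP50 threshold
print(count_true_positives(best_ious, 0.75))  # 2 detections count at the AP75 threshold
```

The stricter threshold rewards precise localization, which is why AP75 is consistently lower than AP50 in the tables above.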