CenterNet

Code for our paper "CenterNet: Keypoint Triplets for Object Detection".

Primary language: Python. License: MIT.

by Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang and Qi Tian

The code to train and evaluate the proposed CenterNet is available here. For more technical details, please refer to our arXiv paper.

We thank Princeton Vision & Learning Lab for providing the original implementation of CornerNet.

CenterNet is a one-stage detector that is trained from scratch. On the MS-COCO dataset, CenterNet achieves an AP of 47.0%, which surpasses all known one-stage detectors and comes very close to the top-performing two-stage detectors.

Abstract

In object detection, keypoint-based approaches often suffer from a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions. This paper presents an efficient solution which explores the visual patterns within each cropped region with minimal cost. We build our framework upon a representative one-stage keypoint-based detector named CornerNet. Our approach, named CenterNet, detects each object as a triplet, rather than a pair, of keypoints, which improves both precision and recall. Accordingly, we design two customized modules named cascade corner pooling and center pooling, which play the roles of enriching information collected by both top-left and bottom-right corners and providing more recognizable information at the central regions, respectively. On the MS-COCO dataset, CenterNet achieves an AP of 47.0%, which outperforms all existing one-stage detectors by a large margin. Meanwhile, with a faster inference speed, CenterNet demonstrates performance comparable to the top-ranked two-stage detectors.
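To make the triplet idea concrete, here is a minimal sketch of how a box proposed by a corner pair could be kept only when a center keypoint falls inside its central region. This is not the code in this repo: the function names `central_region` and `keep_box`, the `(tl_x, tl_y, br_x, br_y)` box format, and the rule that the central region is a centered box with sides 1/n of the bounding box are assumptions made for illustration. Class matching and keypoint scores are left out. Please refer to the paper for the exact definition.

```python
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (tl_x, tl_y, br_x, br_y)
Point = Tuple[float, float]              # (x, y)


def central_region(box: Box, n: int = 3) -> Box:
    """Return a centered sub-box whose sides are 1/n of the box's sides.

    The parameter n is an illustrative assumption; larger n gives a
    smaller central region.
    """
    tl_x, tl_y, br_x, br_y = box
    ctl_x = ((n + 1) * tl_x + (n - 1) * br_x) / (2 * n)
    ctl_y = ((n + 1) * tl_y + (n - 1) * br_y) / (2 * n)
    cbr_x = ((n - 1) * tl_x + (n + 1) * br_x) / (2 * n)
    cbr_y = ((n - 1) * tl_y + (n + 1) * br_y) / (2 * n)
    return (ctl_x, ctl_y, cbr_x, cbr_y)


def keep_box(box: Box, centers: Iterable[Point], n: int = 3) -> bool:
    """Keep a corner-pair box only if some center keypoint lies in its central region."""
    x0, y0, x1, y1 = central_region(box, n)
    return any(x0 <= cx <= x1 and y0 <= cy <= y1 for cx, cy in centers)
```

With this rule, a box whose corners happen to pair up but which contains no detected center is discarded, which is how the third keypoint can reduce incorrect boxes.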

Introduction

CenterNet is a framework for object detection with deep convolutional neural networks. You can use the code to train and evaluate a network for object detection on the MS-COCO dataset.

  • It achieves state-of-the-art performance (an AP of 47.0%) on one of the most challenging datasets: MS-COCO.

  • Our code is written in Python, based on CornerNet.

More detailed descriptions of our approach and code will be made available soon.

If you encounter any problems in using our code, please contact Kaiwen Duan: kaiwen.duan@vipl.ict.ac.cn.

Architecture

[Figure: network structure]

Comparison with other methods

[Tables: comparison with other methods (table images not reproduced here)]

In terms of speed, we test the inference speed of both CornerNet and CenterNet on an NVIDIA Tesla P100 GPU. The average inference time of CornerNet511-104 (meaning that the input resolution is 511×511 and the backbone is Hourglass-104) is 300ms per image, and that of CenterNet511-104 is 340ms. Using the Hourglass-52 backbone speeds up inference: our CenterNet511-52 takes an average of 270ms per image, making it both faster and more accurate than CornerNet511-104.
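For a quick sense of what these timings mean, the short sketch below converts the reported per-image times into throughput and into a relative change against CornerNet511-104. Only the three timings come from the text above; the helper names `frames_per_second` and `relative_change` are made up for this example.

```python
# Average inference times in milliseconds per image, as reported above.
inference_ms = {
    "CornerNet511-104": 300,
    "CenterNet511-104": 340,
    "CenterNet511-52": 270,
}


def frames_per_second(ms_per_image: float) -> float:
    """Convert a per-image latency in milliseconds to images per second."""
    return 1000.0 / ms_per_image


def relative_change(ms: float, baseline_ms: float) -> float:
    """Fractional change in latency versus a baseline (negative means faster)."""
    return (ms - baseline_ms) / baseline_ms


baseline = inference_ms["CornerNet511-104"]
for name, ms in inference_ms.items():
    fps = frames_per_second(ms)
    change = relative_change(ms, baseline)
    print(f"{name}: {fps:.2f} images/s, {change:+.1%} time vs. CornerNet511-104")
```

In other words, CenterNet511-104 needs about 13% more time per image than CornerNet511-104, while CenterNet511-52 needs 10% less.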

Preparation

Please first install Anaconda and create an Anaconda environment using the provided package list.

conda create --name CenterNet --file conda_packagelist.txt

After you create the environment, activate it.

source activate CenterNet

Compiling Corner Pooling Layers

cd <CenterNet dir>/models/py_utils/_cpools/
python setup.py install --user

Compiling NMS

cd <CenterNet dir>/external
make

Installing MS COCO APIs

cd <CenterNet dir>/data/coco/PythonAPI
make

Downloading MS COCO Data

  • Download the training/validation split we use in our paper from here (originally from Faster R-CNN)
  • Unzip the file and place annotations under <CenterNet dir>/data/coco
  • Download the images (2014 Train, 2014 Val, 2017 Test) from here
  • Create 3 directories, trainval2014, minival2014 and testdev2017, under <CenterNet dir>/data/coco/images/
  • Copy the training/validation/testing images to the corresponding directories according to the annotation files
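The directory layout from the steps above can be created with a few lines of Python. This is only a convenience sketch: the function name `make_coco_image_dirs` is hypothetical, and it creates just the three empty image directories. Copying the images according to the annotation files is still a separate step.

```python
from pathlib import Path
from typing import List, Union


def make_coco_image_dirs(centernet_dir: Union[str, Path]) -> List[Path]:
    """Create the three image directories under <CenterNet dir>/data/coco/images/."""
    images_root = Path(centernet_dir) / "data" / "coco" / "images"
    created = []
    for split in ("trainval2014", "minival2014", "testdev2017"):
        split_dir = images_root / split
        # parents=True also creates data/coco/images if it is missing;
        # exist_ok=True makes the call safe to repeat.
        split_dir.mkdir(parents=True, exist_ok=True)
        created.append(split_dir)
    return created
```

Call it with the path to your checkout, for example `make_coco_image_dirs(".")` when run from `<CenterNet dir>`.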

Training and Evaluation

To train CenterNet-104:

python train.py CenterNet-104

We provide the configuration file (CenterNet-104.json) and the model file (CenterNet-104.py) for CenterNet in this repo.

We also provide a trained model for CenterNet-104, which was trained for 480k iterations using 8 Tesla V100 (32GB) GPUs. You can download it from BaiduYun CenterNet-104 (code: bfko) or Google Drive CenterNet-104 and put it under <CenterNet dir>/cache/nnet (you may need to create this directory yourself if it does not exist). If you want to train your own CenterNet, please adjust the batch size in CenterNet-104.json to match the number of GPUs available to you.
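As a rough sketch of how the batch size could be adapted to fewer GPUs, the helper below splits a batch as evenly as possible across devices and writes the result into a config dictionary. It assumes the JSON config follows the CornerNet layout, with a `system` section holding `batch_size` and a per-GPU `chunk_sizes` list. Please check these key names against your copy of the config file before relying on them. The function names `split_batch` and `adjust_batch` are hypothetical.

```python
import copy
from typing import Any, Dict, List


def split_batch(batch_size: int, num_gpus: int) -> List[int]:
    """Split batch_size into num_gpus chunks that differ by at most one image."""
    if num_gpus < 1 or batch_size < num_gpus:
        raise ValueError("need at least one GPU and at least one image per GPU")
    base, extra = divmod(batch_size, num_gpus)
    # The first `extra` GPUs take one additional image each.
    return [base + 1 if i < extra else base for i in range(num_gpus)]


def adjust_batch(config: Dict[str, Any], batch_size: int, num_gpus: int) -> Dict[str, Any]:
    """Return a copy of config with the (assumed) batch keys updated."""
    updated = copy.deepcopy(config)
    system = updated.setdefault("system", {})  # assumed section name
    system["batch_size"] = batch_size          # assumed key name
    system["chunk_sizes"] = split_batch(batch_size, num_gpus)  # assumed key name
    return updated
```

The config can be read with `json.load`, passed through `adjust_batch`, and written back with `json.dump`. Keep in mind that a smaller batch size may also call for retuning the learning rate.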

To use the trained model:

python test.py CenterNet-104 --testiter 480000 --split <split>

To train CenterNet-52:

python train.py CenterNet-52

We provide the configuration file (CenterNet-52.json) and the model file (CenterNet-52.py) for CenterNet in this repo.

We also provide a trained model for CenterNet-52, which was trained for 480k iterations using 8 Tesla V100 (32GB) GPUs. You can download it from BaiduYun CenterNet-52 (code: 680t) or Google Drive CenterNet-52 and put it under <CenterNet dir>/cache/nnet (you may need to create this directory yourself if it does not exist). If you want to train your own CenterNet, please adjust the batch size in CenterNet-52.json to match the number of GPUs available to you.

To use the trained model:

python test.py CenterNet-52 --testiter 480000 --split <split>

We also include configuration files for multi-scale evaluation in this repo: CenterNet-104-multi_scale.json for CenterNet-104 and CenterNet-52-multi_scale.json for CenterNet-52.

To use the multi-scale configuration file:

python test.py CenterNet-52 --testiter <iter> --split <split> --suffix multi_scale

or

python test.py CenterNet-104 --testiter <iter> --split <split> --suffix multi_scale
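If you evaluate several configurations in a row, it can help to build the command lines programmatically. The sketch below only assembles the argument list from the flags shown above (`--testiter`, `--split`, `--suffix`). The function name `build_test_command` is hypothetical, and the split name is passed through unchanged, so use whichever split your setup expects.

```python
from typing import List, Optional


def build_test_command(
    config: str,
    testiter: int,
    split: str,
    suffix: Optional[str] = None,
) -> List[str]:
    """Assemble the test.py command line as an argument list."""
    cmd = ["python", "test.py", config, "--testiter", str(testiter), "--split", split]
    if suffix is not None:
        # For example "multi_scale" to select the multi-scale configuration file.
        cmd += ["--suffix", suffix]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from within `<CenterNet dir>`.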