/pcl.pytorch

PyTorch codes for our paper "PCL: Proposal Cluster Learning for Weakly Supervised Object Detection".

Primary LanguagePythonMIT LicenseMIT

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection

By Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, and Alan Yuille.

This is a PyTorch implementation of our PCL. The original Caffe implementation of PCL is available here.

We embed the trick proposed in our ECCV paper for better performance.

The final performance of this implementation is mAP 49.2% and CorLoc 65.0% on PASCAL VOC 2007 using a single VGG16 model. The results are comparable with the recent state of the arts.

Introduction

Proposal Cluster Learning (PCL) is a framework for weakly supervised object detection with deep ConvNets.

The original paper has been accepted by CVPR 2017. This is an extened version. For more details, please refer to here and here.

Comparison with other methods

(a) Conventional MIL method; (b) Our original OICR method with newly proposed proposal cluster generation method; (c) Our PCL method.

method compare

Architecture

PCL architecture

Visualizations

Some PCL visualization results.

Some visualization results

Some visualization comparisons among WSDDN, WSDDN+context, and PCL.

Some visualization comparisons among WSDDN, WSDDN+context, and PCL

License

PCL is released under the MIT License (refer to the LICENSE file for details).

Citing PCL

If you find PCL useful in your research, please consider citing:

@article{tang2018pcl,
    author = {Tang, Peng and Wang, Xinggang and Bai, Song and Shen, Wei and Bai, Xiang and Liu, Wenyu and Yuille, Alan},
    title = {{PCL}: Proposal Cluster Learning for Weakly Supervised Object Detection},
    journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
    volume = {},
    number = {},
    pages = {1--1},
    year = {2018}
}

@inproceedings{tang2017multiple,
    author = {Tang, Peng and Wang, Xinggang and Bai, Xiang and Liu, Wenyu},
    title = {Multiple Instance Detection Network with Online Instance Classifier Refinement},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
    pages = {3059--3067},
    year = {2017}
}

Contents

  1. Requirements: software
  2. Requirements: hardware
  3. Basic installation
  4. Installation for training and testing
  5. Extra Downloads (Models trained on PASCAL VOC)
  6. Usage
  7. TODO

Requirements: software

Tested under python3.

  • python packages
    • pytorch==0.4.1
    • torchvision>=0.2.0
    • cython
    • matplotlib
    • numpy
    • scipy
    • opencv
    • pyyaml==3.12
    • packaging
    • pycocotools — also available from pip.
    • tensorboardX — for logging the losses in Tensorboard
    • sklearn
  • An NVIDAI GPU and CUDA 8.0 or higher. Some operations only have gpu implementation.
  • NOTICE: different versions of Pytorch package have different memory usages.

Requirements: hardware

  1. NVIDIA GTX 1080Ti (~11G of memory)

Installation

  1. Clone the PCL repository
git clone https://github.com/ppengtang/pcl.pytorch.git & cd pcl.pytorch
  1. Compile the CUDA code:
cd $PCL_ROOT/lib
sh make.sh

Installation for training and testing

  1. Download the training, validation, test data and VOCdevkit
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCdevkit_18-May-2011.tar
  1. Extract all of these tars into one directory named VOCdevkit
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_18-May-2011.tar
  1. Download the COCO format pascal annotations from here and put them into the VOC2007/annotations directory

  2. It should have this basic structure

$VOC2007/                           
$VOC2007/annotations
$VOC2007/JPEGImages
$VOC2007/VOCdevkit        
# ... and several other directories ...
  1. Create symlinks for the PASCAL VOC dataset
cd $PCL_ROOT/data
ln -s $VOC2007 VOC2007

Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.

  1. [Optional] follow similar steps to get PASCAL VOC 2012.

  2. You should put the generated proposal data under the folder $PCL_ROOT/data/selective_search_data, with the name "voc_2007_trainval.pkl", "voc_2007_test.pkl". You can downlad the Selective Search proposals here.

  3. The pre-trained models are available at: Dropbox, VT Server. You should put it under the folder $PCL_ROOT/data/pretrained_model.

Download models trained on PASCAL VOC

Models trained on PASCAL VOC can be downloaded here: Google Drive, Baidu Yun (Password: 3l06).

Usage

Train a PCL network. For example, train a VGG16 network on VOC 2007 trainval

CUDA_VISIBLE_DEVICES=0 python tools/train_net_step.py --dataset voc2007 \
  --cfg configs/baselines/vgg16_voc2007.yaml --bs 1 --nw 4 --iter_size 4

Note: The current implementation has a bug on multi-gpu training and thus does not support multi-gpu training.

Test a PCL network. For example, test the VGG 16 network on VOC 2007:

On trainval

python tools/test_net.py --cfg configs/baselines/vgg16_voc2007.yaml \
  --load_ckpt Outputs/vgg16_voc2007/$MODEL_PATH \
  --dataset voc2007trainval

On test

python tools/test_net.py --cfg configs/baselines/vgg16_voc2007.yaml \
  --load_ckpt Outputs/vgg16_voc2007/$model_path \
  --dataset voc2007test

Test output is written underneath $PCL_ROOT/Outputs.

Note: Add --multi-gpu-testing if multiple gpus are available.

Evaluation

For mAP, run the python code tools/reval.py

python tools/reeval.py --result_path $output_dir/detections.pkl \
  --dataset voc2007test --cfg configs/baselines/vgg16_voc2007.yaml

For CorLoc, run the python code tools/reval.py

python tools/reeval.py --result_path $output_dir/discovery.pkl \
  --dataset voc2007trainval --cfg configs/baselines/vgg16_voc2007.yaml

What we are going to do

  • Add PASCAL VOC 2012 configurations.
  • Upload trained models.
  • Support multi-gpu training.