SiamDW

Deeper and Wider Siamese Networks for Real-Time Visual Tracking

We are hiring talented interns: houwen.peng@microsoft.com

News

  • ☀️☀️ The training and test code of SiamFC+ has been released, and the test code of RPN+ has also been released. The training code of RPN+ will be released after VOT-2019 (June 2019); please be patient. We currently use RPN+ for the VOT challenge.
  • ☀️☀️ Our paper has been accepted by CVPR 2019 (Oral).
  • ☀️☀️ We provide a parameter-tuning toolkit for the siamese tracking framework.

Introduction

Siamese networks have drawn great attention in visual tracking because of their balanced accuracy and speed. However, the backbone network utilized in these trackers is still the classical AlexNet, which does not fully take advantage of the capability of modern deep neural networks.

Our work improves the performance of fully convolutional siamese trackers by:

  1. introducing CIR and CIR-D units to unveil the power of deeper and wider networks such as ResNet and Inception;
  2. designing backbone networks according to an analysis of internal network factors (e.g., receptive field, stride, output feature size) that affect tracking performance.
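
The key idea behind a CIR unit is to crop away padding-affected features inside the residual block, after the addition. Below is a minimal numpy sketch of that operation only; the function names are illustrative, `residual_fn` is a hypothetical stand-in for the convolutional branch, and the repository's actual units (in lib/) contain full conv/BN layers:

```python
import numpy as np

def cir_crop(features, margin=1):
    # features: (channels, height, width); drop the outermost rows
    # and columns, which are the ones influenced by zero-padding
    return features[:, margin:-margin, margin:-margin]

def cir_unit(x, residual_fn):
    # residual_fn must preserve spatial size (same-padding convs)
    out = residual_fn(x) + x   # residual addition first...
    return cir_crop(out)       # ...then crop inside the unit
```

Cropping after the addition (rather than padding-free convolution throughout) keeps the residual identity path intact while preventing padding artifacts from propagating into the similarity map.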

Main Results

Main results on VOT and OTB

Models OTB13 OTB15 VOT15 VOT16 VOT17
Alex-FC 0.608 0.579 0.289 0.235 0.188
Alex-RPN - 0.637 0.349 0.344 0.244
CIResNet22-FC 0.663 0.644 0.318 0.303 0.234
CIResIncep22-FC 0.662 0.642 0.310 0.295 0.236
CIResNext23-FC 0.659 0.633 0.297 0.278 0.229
CIResNext22-RPN 0.674 0.666 0.381 0.376 0.294

Main results training with GOT-10k (SiamFC)

Models OTB13 OTB15 VOT15 VOT16 VOT17
CIResNet22-FC 0.664 0.654 0.361 0.335 0.266
CIResNet22W-FC 0.689 0.664 0.368 0.352 0.269
CIResIncep22-FC 0.673 0.650 0.332 0.305 0.251
CIResNext22-FC 0.668 0.651 0.336 0.304 0.246
  • Some reproduced results listed above are slightly better than the ones in the paper.
  • We recently found that training on the GOT10K dataset achieves better performance for SiamFC, so we also provide results of models trained on GOT10K.
  • CIResNet22W-FC is our recent work and is not included in the paper.
  • Download the model pretrained on GOT10K and the GOT10K training-pair generation code here. This version is preliminary; we will integrate it into the codebase later.

Note

Environment

The code is developed with an Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz and an NVIDIA GTX 1080 GPU.

☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️

Quick Start

Installation

For SiamFC

conda create -n SiamDWFC python=3.6
source activate SiamDWFC
sh install_fc.sh

For SiamRPN

conda create -n SiamDWRPN python=3.6
source activate SiamDWRPN
sh install_rpn.sh
  • We recently found that image decoding is slightly inconsistent across OpenCV versions, and some OpenCV versions are relatively slow for unclear reasons. We recommend installing the packages as above.
  • The SiamRPN-based model is trained on PyTorch 0.4.1, since we found that a memory leak happens when testing SiamRPN on PyTorch 0.3.1 with the multithread tools.

Data preparation

For testing
The test dataset (OTB or VOT) should be arranged in the dataset directory. Your directory tree should look like this:

${Tracking_ROOT}
|—— experiments
|—— lib
|—— snapshot
|—— dataset
  |—— OTB2013.json
  |—— OTB2015.json 
  |—— OTB2013 (or VOT2015...)
     |—— videos...
|—— run_tracker.py
|—— ...

OTB2013.json and OTB2015.json can be downloaded here.

For training SiamFC

  • We pre-process VID and GOT10K into training pairs. You can download them from GoogleDrive or BaiduDrive.
  • BaiduDrive extraction code: bnd9
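
For reference, SiamFC-style training pairs are conventionally cropped with a context margin around the target and resized to 127×127 (exemplar) and 255×255 (search). The sketch below shows that standard cropping rule; it is an illustration of the common convention, not necessarily the exact parameters of our pre-processing scripts:

```python
import math

def siamfc_crop_size(w, h, exemplar_size=127, search_size=255):
    # w, h: target box width and height in the original frame
    p = (w + h) / 4.0                            # context margin
    s_z = math.sqrt((w + 2 * p) * (h + 2 * p))   # exemplar crop side
    scale_z = exemplar_size / s_z                # resize factor
    d_search = (search_size - exemplar_size) / 2.0
    s_x = s_z + 2 * d_search / scale_z           # search crop side
    return s_z, s_x
```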

Test on a specific video

e.g.,

python siamese_tracking/run_video.py --arch SiamRPNRes22 --resume snapshot/CIResNet22RPN.model --video videos/bag.mp4
  • The OpenCV version here is 4.1.0.25; older versions may not support some functions.
  • If you apply this project to a specific tracking task, e.g. pedestrian tracking, we suggest tuning the hyper-parameters on your own collected data with our tuning toolkit, detailed below.

Test through webcam

e.g.,

python siamese_tracking/run_wecam.py --arch SiamRPNRes22 --resume snapshot/CIResNet22RPN.model
  • The OpenCV version here is 4.1.0.25; older versions may not support some functions.
  • You can embed any tracker for fun. This is also a good way to design experiments to determine how environmental factors affect your tracker.

Test on benchmarks

Download the models from OneDrive, GoogleDrive or BaiduDrive, and put them in the snapshot directory.

  • BaiduDrive extraction code: uqvi
CUDA_VISIBLE_DEVICES=0 python ./siamese_tracking/test_siamfc.py --arch SiamFCRes22 --resume ./snapshot/CIResNet22.pth --dataset OTB2013
or 
CUDA_VISIBLE_DEVICES=0 python ./siamese_tracking/test_siamrpn.py --arch SiamRPNRes22 --resume ./snapshot/CIResNet22_RPN.pth --dataset VOT2017
  • An extraction code is required for Baidu drive due to recent software maintenance. Please input v5du in the download box.

Analyze testing results

For OTB

python ./lib/core/eval_otb.py OTB2013 ./result SiamFC* 0 1
or
python ./lib/core/eval_otb.py OTB2013 ./result SiamRPN* 0 1
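
The OTB score reported by these scripts is the area under the success curve: the fraction of frames whose predicted/ground-truth overlap exceeds a threshold, averaged over thresholds in [0, 1]. The sketch below illustrates that metric under the common 21-threshold convention; it is a simplified stand-in, not the repository's eval_otb.py:

```python
import numpy as np

def iou(a, b):
    # boxes as (x, y, w, h)
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def success_auc(pred_boxes, gt_boxes):
    overlaps = np.array([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)])
    thresholds = np.linspace(0, 1, 21)          # 0.0, 0.05, ..., 1.0
    success = [(overlaps > t).mean() for t in thresholds]
    return float(np.mean(success))              # area under success curve
```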

For VOT

  1. Please refer to the VOT official tutorial to set up your workspace.
  2. Move the txt result files to the result directory in your vot-workspace. Please keep the directory name consistent with run_analysis.m.
  3. Run run_analysis.m.

☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️

Reproduce -- Train/Test/Tune

Preparation

  • Prepare the conda environment and the MATLAB-Python API according to the details above.
  • Modify the dataset paths in experiments/train/*.yaml to your needs.
  • Download the pretrained models from OneDrive, GoogleDrive or BaiduDrive, and put them in the pretrain directory.
  • An extraction code is required for Baidu drive due to recent software maintenance. Please input 7rfu in the download box.

SiamFC

python ./siamese_tracking/train_siamfc.py --cfg experiments/train/SiamFC.yaml --gpus 0,1,2,3 --workers 32 2>&1 | tee logs/siamfc_train.log

If you want to test multiple epochs after training:

mpiexec -n 16 python ./siamese_tracking/test_epochs.py --arch SiamFCRes22 --start_epoch 30 --end_epoch 50 --gpu_nums=4 --threads 16 --dataset OTB2013 2>&1 | tee logs/siamfc_epoch_test.log
python ./lib/core/eval_otb.py OTB2013 ./result SiamFC* 0 100 2>&1 | tee logs/siamfc_eval_epoch.log
It is well known that siamese trackers are highly sensitive to hyper-parameters. We provide a toolkit for selecting optimal hyper-parameters on a benchmark (for SiamFC). We hope our efforts are helpful to your work. You should choose a validation dataset and modify the evaluation scripts according to your practical needs.
mpiexec -n 16  python ./siamese_tracking/tune_gene.py --arch SiamFCRes22 --resume ./snapshot/CIResNet22.pth --dataset xxxx --gpu_nums 4 2>&1 | tee logs/gene_tune_fc.log
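
To illustrate the idea behind such tuning (not the genetic search that tune_gene.py actually implements), here is a toy random-search sketch over three hyper-parameters commonly exposed by siamese trackers; the parameter names and ranges are illustrative, and `evaluate` is a hypothetical stand-in for running the tracker on your validation set:

```python
import random

def random_search(evaluate, trials=100, seed=0):
    # evaluate: callable mapping a params dict to a validation score
    rng = random.Random(seed)                  # seeded for repeatability
    best_score, best_params = float("-inf"), None
    for _ in range(trials):
        params = {
            "scale_penalty": rng.uniform(0.95, 1.0),
            "window_influence": rng.uniform(0.2, 0.6),
            "lr": rng.uniform(0.2, 0.8),
        }
        score = evaluate(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

Because each trial requires a full benchmark run, tuning is expensive; this is why the toolkit parallelizes trials over GPUs with mpiexec, as in the command above.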

☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️☁️

Citation

If any part of our paper or code is helpful to your work, please cite:

@inproceedings{SiamDW_2019_CVPR,
    author={Zhang, Zhipeng and Peng, Houwen},
    title={Deeper and Wider Siamese Networks for Real-Time Visual Tracking},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2019}
}

License

Licensed under an MIT license.