Siamese Networks with Hinge Loss for Real-Time Visual Tracking
Introduction
Siamese networks are really popular in the field of visual tracking because of their balanced efficiency, accuracy and speed. But, the backbone network utilized in these trackers is still the classical AlexNet, which does not capture the capabilities of modern deep neural networks.
Our proposal for improve the SiamDW performances of fully convolutional siamese trackers by,
- Using Hinge loss function to improve the performance of SiamDW implementation
Main Results
Results are based on the CIRResNet22-RPN model by using a simpler loss function (logistic loss), vs our Hinge loss function
Main results on VOT and OTB
Models | VOT16 | VOT17 | VOT18 |
---|---|---|---|
Logistic Loss | 0.331 | 0.376 | 0.294 |
Hinge Loss (ours) | 0.312 | 0.318 | 0.322 |
Focal Loss (ours) | --- | --- | --- |
Environment
Initially encironment: GPU: NVIDIA .GTX 1050 Advanced encironment: The code is developed with Ryzen 5 1600x 6 core 12 thread CPU @ 4.20GHz RAM: 16GB GPU: NVIDIA .RTX2060
Training
Data preparation
- There are preprocessed datasets
VID
,YTB
,GOT10K
,COCO
,DET
andLASOT
. You can download it from GoogleDrive. - You might have limitations due to the GoogleDrive download capacity.
- If you face up with download limit exceeded issue, you can copy the files into your GoogleDrive.
Pretrained model preparation
The code will download pretrained model from GoogleDrive automatically. If failed, please download from GoogleDrive , and put them to pretrain
directory.
Conda preparation
sh install_rpn.sh
Setting preparation
Modify yaml files in experiment/train/
according to your needs.
One-key Running
This script will excute train
-epoch-test
-hyper-parameter tuning
automatically to save your time.
python siamese_tracking/onekey.py
Data optimization
- Different training data and mix-up ratio will affect final performance. You can modify
WITCH_USE
in yaml files ofexperiment/train/
to find witch data is better for your task. Also, modifyUSE
in yaml files to try different mix-up ratio. High quality training data is beneficial to training. We used ["COCO"] dataset which is a large-scale object detection work.
Backbone optimization
We provide ResNet
, Inception
, DenseNet
, NasNet
and ResNext
in codes.
Add your backbone in lib/models/backbone.py
. Pretraining backbone on Imagenet is always good to training.
Loss optimization
We changed loss function from Logistic Loss into Hinge loss.
Additionally, you can add your loss function in lib/models/siamfc.py
or lib/models/siamrpn.py
Test
- Download models from GoogleDrive , and put them to
snapshot
directory
Test on a specific video
eg,
python siamese_tracking/run_video.py --arch SiamRPNRes22 --resume snapshot/CIResNet22_RPN.pth --video videos/bag.mp4
- The opencv version here is 4.1.0.25, and older versions may be not friendly to some functions.
- If you try to conduct this project on a specific tracking task, eg. pedestrian tracking, it's suggested that you can tuning hyper-parameters on your collected data with our tuning toolkit detailed below.
Test through webcam
eg,
python siamese_tracking/run_webcam.py --arch SiamRPNRes22 --resume snapshot/CIResNet22_RPN.pth
- The opencv version here is 4.1.0.25, and older versions may be not friendly to some functions.
- You can embed any tracker for fun. This is also a good way to design experiments to determine how environmental factors affect your tracker.
Test on benchmarks
Data preparation
The test dataset VOT should be arranged in dataset
directory. Your directory tree should look like this:
${Tracking_ROOT}
|—— experimnets
|—— lib
|—— snapshot
|—— dataset
|—— VOT2015
| —— videos...
|—— VOT2016
| —— videos...
|—— VOT2017
| —— videos...
|—— run_tracker.py
|—— ...
Conda preparation
sh install_rpn.sh
Toolkit preparation
- Set up vot-toolkit according to official tutorial
- Modify
path_to/toolkit
inlib/core/get_eao.m
to your vot-toolkit path - In your matlab install path (MATLAB2017b or higher),
cd $matlab_path/R2018b/extern/engines/python
python setup.py install
- Download datasets VOT2015, VOT2016, and VOT2017 and put them into the dataset directory.
Run tracker
CUDA_VISIBLE_DEVICES=0 python ./siamese_tracking/test_siamrpn.py --arch SiamRPNRes22 --resume ./snapshot/CIResNet22_RPN.pth --dataset VOT2015 --cls_type thinner
or
CUDA_VISIBLE_DEVICES=0 python ./siamese_tracking/test_siamrpn.py --arch SiamRPNRes22 --resume ./snapshot/CIResNet22_RPN.pth --dataset VOT2016 --cls_type thinner
or
CUDA_VISIBLE_DEVICES=0 python ./siamese_tracking/test_siamrpn.py --arch SiamRPNRes22 --resume ./snapshot/CIResNet22_RPN.pth --dataset VOT2017 --cls_type thinner
Analysz testing results
We implemented out VOT benchmark to estimate Distance Precision, Average Overlap Ratio, and Average Center Location error. You can get the evaluation results as in the following.
- VOT
python ./evalutionVOT.py VOT2015
or
python ./evalutionVOT.py VOT2016
or
python ./evalutionVOT.py VOT2017
Attention !!
- Recently we found that the image is slightly inconsistent while using different OpenCV version. And the speed of some opencv versions are relatively slow for some reason. It is recommended that you install packages above.
- The SiamRPN based model is trained on pytorch0.4.1, since we found that memory leak happens while testing SiamRPN on pytorch0.3.1 with multithread tools.