SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection

by Xiaowei Hu, Xuemiao Xu, Yongjie Xiao, Hao Chen, Shengfeng He, Jing Qin, and Pheng-Ann Heng

This implementation is written by Xiaowei Hu at the Chinese University of Hong Kong.

@article{hu2018sinet,
title={SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection},
author={Hu, Xiaowei and Xu, Xuemiao and Xiao, Yongjie and Chen, Hao and He, Shengfeng and Qin, Jing and Heng, Pheng-Ann},
journal={IEEE Transactions on Intelligent Transportation Systems},
year={2018}
}

Requirements

This code has been tested on Ubuntu 14.04, CUDA 7.0, cuDNN v3 with the NVIDIA TITAN X GPU, and on Ubuntu 16.04, CUDA 8.0 with the NVIDIA TITAN X (Pascal) GPU.
The auxiliary code is written as MATLAB scripts, so the Caffe MATLAB wrapper (matcaffe) is required. Please build matcaffe before running the detection demo.
cuDNN is required to avoid out-of-memory errors when training the models with the VGG network.

Installation

Clone the SINet repository. We'll refer to the directory that you cloned SINet into as SINet.
```
git clone https://github.com/xw-hu/SINet.git
```
Build SINet (based on Caffe)

Follow the Caffe installation instructions here: http://caffe.berkeleyvision.org/installation.html
```
make all -j XX
make matcaffe
```
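The XX in the build command stands for the number of parallel make jobs. A minimal sketch, assuming nproc (GNU coreutils) is available and Caffe's Makefile.config has already been set up, that uses one job per CPU core:

```shell
# Assumption: one make job per CPU core, as reported by nproc (GNU coreutils).
JOBS=$(nproc)
echo "Building with $JOBS parallel jobs"

# Only build when run from a directory that contains a Makefile
# (the SINet directory, in this README's terms).
if [ -f Makefile ]; then
    make all -j "$JOBS" && make matcaffe
fi
```

Use a smaller job count if the machine runs short of memory during compilation.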

Training on KITTI car dataset

Download the KITTI dataset yourself.
Enter SINet/models/PVA/ and download the PVANet pretrained model:
```
sh download_PVANet_imagenet.sh
```
Enter SINet/data/kitti/window_files and replace /home/xwhu/KITTI/KITTI/ with your own KITTI path.
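
The path replacement can be scripted rather than done by hand. The following is a sketch only: the function name replace_kitti_root and the target path /data/KITTI/ are hypothetical, the window files are assumed to be the .txt files in that directory, and sed -i assumes GNU sed (the default on Ubuntu).

```shell
# Hypothetical helper: rewrite the dataset root in every window file.
# $1 = directory holding the window files, $2 = your KITTI root (with trailing slash).
replace_kitti_root() {
    window_dir=$1
    new_root=$2
    old_root=/home/xwhu/KITTI/KITTI/   # the path quoted in this README
    for f in "$window_dir"/*.txt; do
        [ -e "$f" ] || continue        # skip if no .txt files matched
        # '|' as the sed delimiter avoids escaping the slashes in the paths;
        # this assumes the new root itself contains no '|' or '&'.
        sed -i "s|$old_root|$new_root|g" "$f"
    done
}

# Example, run from the SINet directory (the target path is a placeholder):
# replace_kitti_root data/kitti/window_files /data/KITTI/
```

Keep the trailing slash on the new root so that the rest of each image path stays intact.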

Alternatively, run mscnn_kitti_car_window_file.m to generate the txt files that contain the paths of the KITTI images.
Run SINet/data/kitti/statistical_size.m to calculate the parameters of the ROISplit layer in trainval_2nd.prototxt.
(Optional) Run SINet/data/kitti/anchor_parameter.m to calculate the anchors of the ImageGtData layer. The anchors are determined by K-means.
Enter SINet/examples/kitti_car/SINet-pva-576-2-branch.
In the command window, run (around 1 hour on a single TITAN X):
```
sh train_first_stage.sh
```
Use MATLAB to run weight_2nd_ini.m.
In the command window, run (around 13.5 hours on a single TITAN X):
```
sh train_second_stage.sh
```
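
The three training steps above can be chained in one script. This is a sketch under assumptions, not part of the repository: the names run and train_sinet are hypothetical, MATLAB is assumed to be on the PATH, and weight_2nd_ini.m is assumed to be reachable from the directory the script is run in. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
# Hypothetical wrapper: print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run from SINet/examples/kitti_car/SINet-pva-576-2-branch.
train_sinet() {
    run sh train_first_stage.sh || return 1
    # -r executes the given MATLAB statements at startup; the try/catch makes
    # MATLAB exit with a non-zero status instead of waiting at the prompt on error.
    run matlab -nodisplay -nosplash -r \
        "try, weight_2nd_ini; catch e, disp(getReport(e)); exit(1); end; exit(0)" || return 1
    run sh train_second_stage.sh
}

# Example:
# train_sinet
```

Each stage only starts if the previous one succeeded, so a failed first stage does not trigger the long second stage.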

Tip: If the training does not converge, try other random seeds. You should obtain reasonable performance after a few tries. Because of this randomness, it is difficult to reproduce exactly the same models, but the performance should be close.
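
One way to change the seed, assuming it is controlled through Caffe's random_seed solver field, is to edit the solver prototxt before each attempt. The helper name set_solver_seed and the solver file name in the example are hypothetical; sed -i assumes GNU sed.

```shell
# Hypothetical helper: set (or add) random_seed in a Caffe solver prototxt.
# $1 = solver file, $2 = integer seed.
set_solver_seed() {
    solver=$1
    seed=$2
    if grep -q '^[[:space:]]*random_seed:' "$solver"; then
        # Replace the existing value, keeping the original indentation.
        sed -i "s/^\([[:space:]]*\)random_seed:.*/\1random_seed: $seed/" "$solver"
    else
        # No seed set yet: append the field on its own line.
        printf 'random_seed: %s\n' "$seed" >> "$solver"
    fi
}

# Example (the solver file name is a placeholder):
# set_solver_seed solver.prototxt 42
```

Recording the seed of each run makes it easier to return to the one that converged.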

Testing on KITTI car dataset

Use MATLAB to run run_SINet_2_branch.m in SINet/examples/kitti_car. It will generate the detection results in SINet/examples/kitti_car/detections. (Setting show = 1 in run_SINet_detection.m displays and saves the detection results, but testing runs more slowly.)
The quantitative results (average precision) are reported at three difficulty levels: "easy", "moderate" and "hard" (the same as the KITTI benchmark).
Testing runs faster without cuDNN.

Training on other datasets

Enter SINet/data/kitti/ and modify mscnn_kitti_car_window_file.m to generate the txt files for your dataset.
Modify the parameters and the paths of the input images in trainval_1st.prototxt and trainval_2nd.prototxt.
The remaining steps are the same as for KITTI.

Testing on other datasets

Modify run_SINet_2_branch.m, which writes the detection results to a single txt file.
Use the evaluation functions provided by KITTI or other benchmarks to calculate the quantitative results (in SINet/examples/lsvh_result, we use the VOC2011 evaluation code to calculate the mAP on our LSVH dataset).