This repository implements the first task of VidVRD: video object detection.
The second-stage (trunk) project is Video Relation Prediction.
This project is based on faster-rcnn.pytorch.
Some parts were modified so that it can serve as the first step of the VidVRD project.
source activate pytorch
pip install -r requirements.txt
# Compile the cuda dependencies
cd lib
bash make.sh  # you may need to modify 'CUDA_ARCH' to match your GPU architecture
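If it is unclear which value suits your GPU, the card's compute capability is what determines the architecture. The following is a minimal sketch, assuming `CUDA_ARCH` in make.sh holds nvcc `-gencode` flags (check your copy of the script before relying on this). The helper names `gencode_flag` and `detect_capability` are hypothetical and not part of this repo.

```python
def gencode_flag(major, minor):
    """Build an nvcc -gencode flag for one compute capability, e.g. (6, 1)."""
    arch = f"{major}{minor}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"


def detect_capability():
    """Return (major, minor) of GPU 0, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch  # imported lazily so the helper above works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA device visible; look up your GPU's compute capability manually.")
    else:
        # Assumption: this string is what belongs in CUDA_ARCH inside make.sh.
        print(gencode_flag(*cap))
```

Run it inside the same environment you activated above, then compare the printed flag with the entries already listed in `CUDA_ARCH`.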
Download pretrained models:
bash prepare_voc2007.sh
bash gpu_train.sh pascal_voc resnet101
The model will be saved in /storage/
Download the Grand Challenge dataset.
Compare your own project structure against tree.txt and modify it where they differ.
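To make that comparison easier, you can print your current layout and read it side by side with tree.txt. This is only a sketch: the layout of tree.txt is not described here, so the indented output format below is an assumption, and `render_tree` is a hypothetical helper rather than part of this repo.

```python
import os


def render_tree(root, max_depth=2):
    """Return indented lines listing directories and files under root."""
    root = os.path.abspath(root)
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []  # stop descending below max_depth
        indent = "    " * depth
        lines.append(f"{indent}{os.path.basename(dirpath)}/")
        for name in sorted(filenames):
            lines.append(f"{indent}    {name}")
    return lines


if __name__ == "__main__":
    # Run from the project root and compare the output with tree.txt by eye.
    print("\n".join(render_tree(".")))
```

Keeping `max_depth` small avoids listing every image in the dataset folders; raise it if tree.txt goes deeper.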
Evaluate the detection performance of a pre-trained vgg16 model on the pascal_voc test set:
bash gpu_test.sh
To run detection on your own images with a pre-trained model, first download one of the pretrained models listed in the tables above or train your own model. Then add the images to the folder $ROOT/images and run:
bash gpu_demo.sh
You can use a webcam in a real-time demo by running
python demo.py --net vgg16 \
--checksession $SESSION --checkepoch $EPOCH --checkpoint $CHECKPOINT \
--cuda --load_dir path/to/model/directory \
--webcam $WEBCAM_ID
The demo is stopped by clicking the image window and then pressing the 'q' key.
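If you launch the demo from a script rather than typing the command, the argument list can be assembled programmatically. The sketch below uses only the flags shown in the command above; `build_demo_command` is a hypothetical helper, and the session, epoch, and checkpoint numbers in the example are placeholders, not values from a real model.

```python
import shlex


def build_demo_command(session, epoch, checkpoint, load_dir, webcam_id=None, net="vgg16"):
    """Assemble the demo.py argument list using only the flags shown above."""
    cmd = [
        "python", "demo.py",
        "--net", net,
        "--checksession", str(session),
        "--checkepoch", str(epoch),
        "--checkpoint", str(checkpoint),
        "--cuda",
        "--load_dir", load_dir,
    ]
    if webcam_id is not None:
        # Without --webcam the command has no camera argument at all.
        cmd += ["--webcam", str(webcam_id)]
    return cmd


if __name__ == "__main__":
    # Placeholder values; substitute those of your own trained model.
    cmd = build_demo_command(1, 1, 1, "path/to/model/directory", webcam_id=0)
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems when the model directory contains spaces.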