This project is the official implementation of "Great Ape Detection in Challenging Jungle Camera Trap Footage via Attention-Based Spatial and Temporal Feature Blending". The paper was accepted at Computer Vision for Wildlife Conservation (CVWC) as a contributed paper. The code is based on open-mmlab's mmdetection, an open source detection toolbox built on PyTorch. Many thanks to mmdetection for their simple and clean framework.
It is worth noting that:
- The two proposed modules (TCM, SCM) can be easily integrated into existing detection frameworks.
- The framework is trained and evaluated on the Pan Africa Great Ape Camera Trap Dataset.
Demo video of RetinaNet with and without TCM+SCM
This project is released under the Apache 2.0 license.
Supported methods and backbones, together with their results, are shown in the table below. Pretrained models are also available.
Method and backbone | TCM | SCM | Train Seq | Test Seq | mAP (%) | Download |
---|---|---|---|---|---|---|
RetinaNet Res50 | ✗ | ✗ | ☐ | ☐ | 80.79 | model |
RetinaNet Res50 | ✗ | ✓ | ☐ | ☐ | 81.21 | model |
RetinaNet Res50 | ✓ | ✗ | 7 | 21 | 90.02 | model |
RetinaNet Res50 | ✓ | ✓ | 7 | 21 | 90.81 | model |
RetinaNet Res101 | ✗ | ✗ | ☐ | ☐ | 85.25 | model |
RetinaNet Res101 | ✓ | ✓ | 5 | 21 | 90.21 | model |
CascadeRCNN Res101 | ✗ | ✗ | ☐ | ☐ | 88.31 | model |
CascadeRCNN Res101 | ✓ | ✓ | 3 | 21 | 91.17 | model |
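To make the effect of the two modules easier to read off, the short sketch below computes the mAP gain of TCM+SCM over each plain baseline. The numbers are copied from the table above; the dictionary layout and the `map_gain` helper are illustrative names, not part of this repository.

```python
# mAP (%) values copied from the results table: (baseline, with TCM+SCM).
results = {
    "RetinaNet Res50": (80.79, 90.81),
    "RetinaNet Res101": (85.25, 90.21),
    "CascadeRCNN Res101": (88.31, 91.17),
}


def map_gain(baseline, blended):
    """Absolute mAP improvement in percentage points, rounded to 2 decimals."""
    return round(blended - baseline, 2)


for name, (baseline, blended) in results.items():
    print(f"{name}: +{map_gain(baseline, blended):.2f} mAP points")
```

The gain is largest for the weakest baseline and shrinks as the baseline detector gets stronger.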
- Linux (tested on CentOS 7)
- Python 3.6
- PyTorch >= 1.1
- Cython
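Before installing, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch, not a script shipped with this repository: `version_tuple` and `check_environment` are hypothetical helpers, and the minimum versions are taken from this README (Python 3.6, PyTorch 1.1 or later).

```python
import importlib
import sys


def version_tuple(text):
    """Turn a version string such as '1.1.0+cu100' into (1, 1, 0).

    Parsing stops at the first non-numeric component, so '1.1.0a0' gives (1, 1).
    """
    numbers = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        numbers.append(int(piece))
    return tuple(numbers)


def check_environment(min_python=(3, 6), min_torch=(1, 1),
                      modules=("torch", "torchvision", "Cython")):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d or later is required" % min_python)
    for name in modules:
        try:
            module = importlib.import_module(name)
        except Exception as error:  # a broken install may raise more than ImportError
            problems.append(f"{name} could not be imported: {error}")
            continue
        if name == "torch" and version_tuple(module.__version__) < min_torch:
            problems.append(f"PyTorch {module.__version__} is older than required")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Comparing versions as integer tuples rather than strings matters here: as tuples, `(1, 10)` is newer than `(1, 1)`, whereas a plain string or float comparison would treat "1.10" and "1.1" incorrectly.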
- Install PyTorch 1.1 or later and torchvision following the official instructions.
- Clone this repository.
git clone https://github.com/youshyee/Greatape-Detection.git
- Compile cuda extensions.
cd Greatape-Detection
pip install cython
./compile.sh
- Install the mmdetection toolbox (other dependencies will be installed automatically).
python setup.py install
Please refer to the mmdetection installation instructions for more details.
- The input can be a single video or a directory of videos.
- An output directory is required.
sh tools/dist_infer.sh <GPU_NUM> --input <VIDEO or VIDEO DIR> --config <CONFIG_FILE> --checkpoint <MODEL_PATH> [optional arguments]
- <GPU_NUM>: number of GPUs to use for inference.
- <VIDEO or VIDEO DIR>: path to a single video or to a directory of videos.
- <CONFIG_FILE>: model configuration file; these can be found in the configs dir.
- <MODEL_PATH>: checkpoint file, which must be consistent with <CONFIG_FILE>; pretrained checkpoints can be downloaded from the Download column of the results table.
Supported arguments are:
- --output_dir <WORK_DIR>: output video dir
- --tmpdir <WORK_DIR>: tmp dir for writing some results
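When inference is launched from a Python script rather than by hand, the command line above can be assembled programmatically. The sketch below is one possible way to do this; `build_infer_command` is a hypothetical helper, not part of this repository, and the example paths are placeholders. It only uses the arguments documented above.

```python
import shlex


def build_infer_command(gpu_num, input_path, config, checkpoint,
                        output_dir=None, tmpdir=None):
    """Build the argument list for tools/dist_infer.sh.

    The result is a list, so it can be passed directly to subprocess.run
    without going through a shell.
    """
    cmd = [
        "sh", "tools/dist_infer.sh", str(gpu_num),
        "--input", input_path,
        "--config", config,
        "--checkpoint", checkpoint,
    ]
    if output_dir is not None:
        cmd += ["--output_dir", output_dir]
    if tmpdir is not None:
        cmd += ["--tmpdir", tmpdir]
    return cmd


# Placeholder paths for illustration only; substitute a real config and checkpoint.
example = build_infer_command(
    2, "my_videos/", "configs/EXAMPLE_CONFIG.py", "checkpoints/EXAMPLE.pth",
    output_dir="results/",
)
print(" ".join(shlex.quote(part) for part in example))
```

To actually run it from the repository root, the list can be handed to `subprocess.run(example, check=True)`.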
sh tools/dist_train.sh <GPU_NUM> <CONFIG_FILE> <PATH/TO/PANAFRICA/TRAINSPLIT> [optional arguments]
- <GPU_NUM>: number of GPUs to use for training.
- <CONFIG_FILE>: model configuration file; these can be found in the configs dir.
- <PATH/TO/PANAFRICA/TRAINSPLIT>: path to the training split file, train.txt.
Optional arguments:
- --checkpoint <MODEL_PATH>
- --work_dir <WORKDIR>: the dir to save logs and models
- --resume_from <CHECKPOINT>: the checkpoint file to resume from
- --validate: whether to evaluate the checkpoint during training
- --test: whether to test the final model after training
Please download the dataset and make sure your folder structure looks like this:
Greatape-Detection
├── PanAfrica
│ ├── videos
│ │ ├── 0FhliLmQri.mp4
│ │ ├── 0FUP9Wc3pg.mp4
│ │ ├── ...
│ │ ├── ZZ5rZm8j3M.mp4
│ ├── videoframes
│ │ ├── 0FhliLmQri
│ │ │ ├── 000000.jpg
│ │ │ ├── 000001.jpg
│ │ │ ├── ...
│ │ ├── 0FUP9Wc3pg
│ │ ├── ...
│ │ ├── ZZ5rZm8j3M
│ ├── annotations
│ │ ├── 0FhliLmQri
│ │ │ ├── 0FhliLmQri_frame_1.xml
│ │ │ ├── 0FhliLmQri_frame_2.xml
│ │ │ ├── ...
│ │ ├── 0FUP9Wc3pg
│ │ ├── ...
│ │ ├── ZZ5rZm8j3M
│ ├── splits
│ │ ├── all.txt
│ │ ├── train.txt
│ │ ├── test.txt
│ │ ├── val.txt
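A quick way to catch layout mistakes before training is to check the tree above programmatically. The sketch below is not part of this repository: `read_split` and `check_layout` are hypothetical helpers, and it assumes that each split file lists one video ID per line (for example `0FhliLmQri`), which this README does not state explicitly. It only checks that the expected files and directories exist, not their contents.

```python
from pathlib import Path

# Top-level directories expected under PanAfrica, as shown in the tree above.
REQUIRED_DIRS = ("videos", "videoframes", "annotations", "splits")


def read_split(split_file):
    """Return the non-empty lines of a split file (assumed: one video ID per line)."""
    with open(str(split_file)) as handle:
        return [line.strip() for line in handle if line.strip()]


def check_layout(root, split="train.txt"):
    """Return a list of problems found under `root`; an empty list means the layout looks right."""
    root = Path(root)
    problems = [
        f"missing directory: {root / name}"
        for name in REQUIRED_DIRS
        if not (root / name).is_dir()
    ]
    split_file = root / "splits" / split
    if not split_file.is_file():
        problems.append(f"missing split file: {split_file}")
        return problems
    for video_id in read_split(split_file):
        if not (root / "videos" / f"{video_id}.mp4").is_file():
            problems.append(f"missing video: {video_id}.mp4")
        if not (root / "videoframes" / video_id).is_dir():
            problems.append(f"missing frame directory: {video_id}")
        if not (root / "annotations" / video_id).is_dir():
            problems.append(f"missing annotation directory: {video_id}")
    return problems


if __name__ == "__main__":
    for problem in check_layout("PanAfrica"):
        print(problem)
```

Run from the repository root, the script prints one line per missing item and nothing when the checked split is complete.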