Great Ape Detection

Introduction

This project is the official implementation of "Great Ape Detection in Challenging Jungle Camera Trap Footage via Attention-Based Spatial and Temporal Feature Blending". The paper was accepted at Computer Vision for Wildlife Conservation (CVWC) as a contribution paper. The code is based on open-mmlab's mmdetection, an open source detection toolbox built on PyTorch. Many thanks to mmdetection for their simple and clean framework.

It is worth noting that:

The two proposed modules (TCM, SCM) can be easily integrated into current detection frameworks.
The framework is trained and evaluated on the Pan Africa Great Ape Camera Trap Dataset.

Demo video of RetinaNet with and without TCM+SCM

License

This project is released under the Apache 2.0 license.

Available Models and Results

Supported methods and backbones, together with their results, are shown in the table below. Pretrained models are also available.

Backbone	TCM	SCM	Train Seq	Test Seq	mAP(%)	Download
RetinaNet Res50	✗	✗	☐	☐	80.79	model
RetinaNet Res50	✗	✓	☐	☐	81.21	model
RetinaNet Res50	✓	✗	7	21	90.02	model
RetinaNet Res50	✓	✓	7	21	90.81	model
RetinaNet Res101	✗	✗	☐	☐	85.25	model
RetinaNet Res101	✓	✓	5	21	90.21	model
CascadeRCNN Res101	✗	✗	☐	☐	88.31	model
CascadeRCNN Res101	✓	✓	3	21	91.17	model
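
To make the effect of the two modules easier to read off the table, the short sketch below computes the absolute mAP gain of TCM+SCM over each plain baseline. The numbers are copied from the table above; the names `RESULTS` and `map_gain` are introduced here for illustration only and are not part of the repository.

```python
# mAP values (%) copied from the results table: (baseline, with TCM+SCM).
RESULTS = {
    "RetinaNet Res50": (80.79, 90.81),
    "RetinaNet Res101": (85.25, 90.21),
    "CascadeRCNN Res101": (88.31, 91.17),
}


def map_gain(baseline, blended):
    """Absolute mAP improvement in percentage points, rounded to 2 decimals."""
    return round(blended - baseline, 2)


# Gain per detector/backbone pair, keyed by the row name used in the table.
GAINS = {name: map_gain(*pair) for name, pair in RESULTS.items()}

if __name__ == "__main__":
    for name, gain in GAINS.items():
        print(f"{name}: +{gain:.2f} mAP points")
```

On these figures the gain is largest for the weakest baseline (RetinaNet Res50) and smallest for the strongest one (CascadeRCNN Res101).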

Usage

Requirements

Linux (tested on CentOS 7)
Python 3.6
PyTorch >= 1.1
Cython
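
A quick way to check an environment against this list is sketched below. It is not a script shipped with the repository: the helper names are hypothetical, Python 3.6 is treated as a minimum (the list only says it was tested), and the PyTorch threshold follows the "PyTorch 1.1 or later" wording of the installation steps.

```python
import re
import sys


def version_tuple(version):
    """Turn '1.1.0' or '1.13.1+cu117' into a tuple of ints, e.g. (1, 13, 1)."""
    numeric = re.match(r"\d+(?:\.\d+)*", version)
    if numeric is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in numeric.group(0).split("."))


def meets_minimum(version, minimum):
    """Compare versions numerically, so that 1.10 counts as newer than 1.1."""
    return version_tuple(version) >= version_tuple(minimum)


def check_environment():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < (3, 6):  # assumption: 3.6 is a lower bound
        problems.append("Python 3.6 or later is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, "1.1"):
            problems.append(f"PyTorch {torch.__version__} is older than 1.1")
    try:
        import Cython  # noqa: F401  (only checking that it is importable)
    except ImportError:
        problems.append("Cython is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Versions are compared as integer tuples rather than strings or floats, because "1.10" and "1.1" are different releases that a naive comparison would confuse.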

Installation

Install PyTorch 1.1 or later and torchvision following the official instructions.
Clone this repository.

git clone https://github.com/youshyee/Greatape-Detection.git

Compile the CUDA extensions.

cd Greatape-Detection
pip install cython
./compile.sh

Install the mmdetection toolbox (other dependencies will be installed automatically).

python setup.py install

Please refer to the mmdetection installation instructions for more details.

Inference

The input can be a single video or a directory of videos.
An output directory is required.

sh tools/dist_infer.sh <GPU_NUM> --input <VIDEO or VIDEO DIR> --config <CONFIG_FILE> --checkpoint <MODEL_PATH> [optional arguments]

<GPU_NUM>: number of GPUs to use for inference.
<VIDEO or VIDEO DIR>: input path of a single video or a directory of videos.
<CONFIG_FILE>: model configuration file; these can be found in the configs dir.
<MODEL_PATH>: checkpoint file, which must be consistent with <CONFIG_FILE>; checkpoints can be downloaded from the table of available models.

Supported arguments are:

--output_dir <WORK_DIR>: output video dir
--tmpdir <WORK_DIR>: tmp dir for writing some results
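
When inference has to be launched from a script, for example once per model, the command line above can be assembled programmatically. The sketch below only builds the argument list from the flags documented in this section; `build_infer_command` is a hypothetical helper, and every path in the example is a placeholder, not a file shipped with the repository.

```python
import shlex


def build_infer_command(gpu_num, input_path, config_file, checkpoint,
                        output_dir, tmpdir=None):
    """Build the argument list for tools/dist_infer.sh as documented above."""
    cmd = [
        "sh", "tools/dist_infer.sh", str(gpu_num),
        "--input", input_path,
        "--config", config_file,
        "--checkpoint", checkpoint,
        "--output_dir", output_dir,  # the README states an output dir is required
    ]
    if tmpdir is not None:
        cmd += ["--tmpdir", tmpdir]
    return cmd


if __name__ == "__main__":
    # All paths below are placeholders chosen for illustration.
    cmd = build_infer_command(
        gpu_num=2,
        input_path="my_videos/",
        config_file="configs/my_config.py",
        checkpoint="checkpoints/my_model.pth",
        output_dir="results/",
    )
    # Print a copy-pasteable command line instead of running it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root; keeping the arguments as a list avoids quoting problems with paths that contain spaces.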

Train

sh tools/dist_train.sh <GPU_NUM> <CONFIG_FILE> <PATH/TO/PANAFRICA/TRAINSPLIT> [optional arguments]

<GPU_NUM>: number of GPUs to use for training.
<CONFIG_FILE>: model configuration file; these can be found in the configs dir.
<PATH/TO/PANAFRICA/TRAINSPLIT>: path to the training split file, train.txt.

[optional arguments]

--checkpoint <MODEL_PATH>
--work_dir <WORKDIR>: the directory in which to save logs and models
--resume_from <CHECKPOINT>: the checkpoint file to resume from
--validate: whether to evaluate the checkpoint during training
--test: whether to test the final model after training

Dataset

Please download the dataset and make sure your folder structure looks like this:

Greatape-detection
├── PanAfrica
│   ├── videos
│   │   ├── 0FhliLmQri.mp4
│   │   ├── 0FUP9Wc3pg.mp4
│   │   ├── ...
│   │   ├── ZZ5rZm8j3M.mp4
│   ├── videoframes
│   │   ├── 0FhliLmQri
│   │   │   ├── 000000.jpg
│   │   │   ├── 000001.jpg
│   │   │   ├── ...
│   │   ├── 0FUP9Wc3pg
│   │   ├── ...
│   │   ├── ZZ5rZm8j3M
│   ├── annotations
│   │   ├── 0FhliLmQri
│   │   │   ├── 0FhliLmQri_frame_1.xml
│   │   │   ├── 0FhliLmQri_frame_2.xml
│   │   │   ├── ...
│   │   ├── 0FUP9Wc3pg
│   │   ├── ...
│   │   ├── ZZ5rZm8j3M
│   ├── splits
│   │   ├── all.txt
│   │   ├── train.txt
│   │   ├── test.txt
│   │   ├── val.txt
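
Before training, it can help to confirm that the layout above is complete. The following is a minimal sketch, not a tool from the repository: it assumes each split file lists one video ID per line (the README does not specify the file format), and `read_split` and `find_missing` are hypothetical helper names.

```python
import os


def read_split(split_file):
    """Return the non-empty lines of a split file.

    Assumption: the file holds one video ID per line.
    """
    with open(split_file) as handle:
        return [line.strip() for line in handle if line.strip()]


def find_missing(root, split_name="train.txt"):
    """List (video_id, subfolder) pairs whose directory does not exist.

    For every ID in the split, both videoframes/<id> and annotations/<id>
    are expected under the dataset root, as in the tree above.
    """
    video_ids = read_split(os.path.join(root, "splits", split_name))
    missing = []
    for video_id in video_ids:
        for sub in ("videoframes", "annotations"):
            if not os.path.isdir(os.path.join(root, sub, video_id)):
                missing.append((video_id, sub))
    return missing


if __name__ == "__main__":
    root = "PanAfrica"  # dataset root, relative to the repository directory
    if os.path.isdir(root):
        for video_id, sub in find_missing(root):
            print(f"missing {sub}/{video_id}")
    else:
        print(f"dataset root {root!r} not found")
```

An empty result from `find_missing` means every video in the chosen split has both a frames directory and an annotations directory; it does not check the contents of those directories.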