/Greatape-Detection

Implementation for Great Ape Detection in Challenging Jungle Camera Trap Footage via Attention-Based Spatial and Temporal Feature Blending


Introduction

This project is the official implementation of "Great Ape Detection in Challenging Jungle Camera Trap Footage via Attention-Based Spatial and Temporal Feature Blending", accepted as a contributed paper at the Computer Vision for Wildlife Conservation (CVWC) workshop. It is built on open-mmlab's mmdetection, an open-source detection toolbox based on PyTorch. Many thanks to the mmdetection team for their simple and clean framework.

Demo video of RetinaNet with and without TCM+SCM:

License

This project is released under the Apache 2.0 license.

Available Models and Results

Supported methods and backbones, together with their results, are shown in the table below. The pretrained models are also available!

Model        Backbone  TCM  SCM  Train Seq  Test Seq  mAP(%)  Download
RetinaNet    Res50                                    80.79   model
RetinaNet    Res50                                    81.21   model
RetinaNet    Res50               7          21        90.02   model
RetinaNet    Res50               7          21        90.81   model
RetinaNet    Res101                                   85.25   model
RetinaNet    Res101              5          21        90.21   model
CascadeRCNN  Res101                                   88.31   model
CascadeRCNN  Res101              3          21        91.17   model

Usage

Requirements

  • Linux (tested on CentOS 7)
  • Python 3.6
  • PyTorch >= 1.1.0
  • Cython

Installation

  1. Install PyTorch 1.1 or later and torchvision following the official instructions.

  2. Clone this repository.

     git clone https://github.com/youshyee/Greatape-Detection.git

  3. Compile the CUDA extensions.

     cd Greatape-Detection
     pip install cython
     ./compile.sh

  4. Install the mmdetection toolbox (other dependencies will be installed automatically).

     python setup.py install

Please refer to the mmdetection installation instructions for more details.
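
As a quick sanity check (a suggestion, not part of the official instructions, and assuming the toolbox installs under its usual mmdet package name), you can verify that PyTorch sees your GPU and that the toolbox is importable:

# Check the PyTorch version and CUDA availability
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# Check that the detection toolbox installed correctly
python -c "import mmdet; print(mmdet.__version__)"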

Inference

  • Supports a single video or a directory of videos as input
  • An output directory is required
sh tools/dist_infer.sh <GPU_NUM> --input <VIDEO or VIDEO DIR> --config <CONFIG_FILE> --checkpoint <MODEL_PATH> [optional arguments]
  • <GPU_NUM>: number of GPUs to use for inference
  • <VIDEO or VIDEO DIR>: path to a single video or a directory of videos
  • <CONFIG_FILE>: model configuration file; these can be found in the configs dir
  • <MODEL_PATH>: must be consistent with <CONFIG_FILE>; pretrained weights can be downloaded from the Available Models and Results table above

Supported arguments are:

  • --output_dir <WORK_DIR>: directory for the output videos
  • --tmpdir <WORK_DIR>: temporary directory for intermediate results
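
For example, a run on two GPUs might look like the following (the config and checkpoint names are placeholders; pick a file from the configs dir and its matching checkpoint from the table above):

sh tools/dist_infer.sh 2 --input PanAfrica/videos/ --config configs/<CONFIG>.py --checkpoint checkpoints/<MODEL>.pth --output_dir outputs/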

Train

sh tools/dist_train.sh <GPU_NUM> <CONFIG_FILE> <PATH/TO/PANAFRICA/TRAINSPLIT> [optional arguments]
  • <GPU_NUM>: number of GPUs to use for training
  • <CONFIG_FILE>: model configuration file; these can be found in the configs dir
  • <PATH/TO/PANAFRICA/TRAINSPLIT>: path to the training set split file, train.txt

[optional arguments]

  • --checkpoint <MODEL_PATH>
  • --work_dir <WORKDIR>: the dir to save logs and models
  • --resume_from <CHECKPOINT>: the checkpoint file to resume from
  • --validate: whether to evaluate the checkpoint during training
  • --test: whether to test the final model after training
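
For example, a four-GPU training run with validation enabled might look like this (the config name and work dir are placeholders; the split path follows the dataset layout below):

sh tools/dist_train.sh 4 configs/<CONFIG>.py PanAfrica/splits/train.txt --work_dir <WORKDIR> --validate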

Dataset

Please download the dataset and make sure your folder structure looks like this:

Greatape-detection
├── PanAfrica
│   ├── videos
│   │   ├── 0FhliLmQri.mp4
│   │   ├── 0FUP9Wc3pg.mp4
│   │   ├── ...
│   │   ├── ZZ5rZm8j3M.mp4
│   ├── videoframes
│   │   ├── 0FhliLmQri
│   │   │   ├── 000000.jpg
│   │   │   ├── 000001.jpg
│   │   │   ├── ...
│   │   ├── 0FUP9Wc3pg
│   │   ├── ...
│   │   ├── ZZ5rZm8j3M
│   ├── annotations
│   │   ├── 0FhliLmQri
│   │   │   ├── 0FhliLmQri_frame_1.xml
│   │   │   ├── 0FhliLmQri_frame_2.xml
│   │   │   ├── ...
│   │   ├── 0FUP9Wc3pg
│   │   ├── ...
│   │   ├── ZZ5rZm8j3M
│   ├── splits
│   │   ├── all.txt
│   │   ├── train.txt
│   │   ├── test.txt
│   │   ├── val.txt
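
If you need to generate the videoframes directory from the raw videos yourself, the sketch below is one way to do it (an assumption about how the frames were produced, not an official script); it uses ffmpeg and matches the zero-based, six-digit frame naming shown above:

# Extract frames for every video into videoframes/<VIDEO_NAME>/
for v in PanAfrica/videos/*.mp4; do
    name=$(basename "$v" .mp4)
    mkdir -p "PanAfrica/videoframes/$name"
    # %06d with -start_number 0 yields 000000.jpg, 000001.jpg, ...
    ffmpeg -i "$v" -start_number 0 "PanAfrica/videoframes/$name/%06d.jpg"
done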