IPN-hand

Code and models of our arXiv paper "IPN Hand: A Video Dataset and Benchmark for Real-Time Continuous Hand Gesture Recognition"


IPN Hand: A Video Dataset and Benchmark for Real-Time Continuous Hand Gesture Recognition

PyTorch implementation, code, and pretrained models for the paper:

IPN Hand: A Video Dataset and Benchmark for Real-Time Continuous Hand Gesture Recognition
Gibran Benitez-Garcia, Jesus Olivares-Mercado, Gabriel Sanchez-Perez, and Keiji Yanai
Accepted at ICPR 2020

This paper proposes the IPN Hand dataset, a new benchmark video dataset with sufficient size, variation, and real-world elements to train and evaluate deep neural networks for continuous Hand Gesture Recognition (HGR). With this dataset, the performance of three 3D-CNN models is evaluated on the tasks of isolated and continuous real-time HGR. Since IPN Hand contains only RGB videos, we analyze the possibility of increasing recognition accuracy by adding modalities derived from the RGB frames, i.e., optical flow and semantic segmentation, while keeping real-time performance.
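As a rough illustration of deriving an extra modality from RGB frames, the sketch below computes dense optical flow between two consecutive frames with OpenCV's Farneback method. This is a generic example, not the exact pipeline used in the paper, and the clip path is hypothetical.

import cv2

# Hypothetical path to one IPN Hand clip; substitute any downloaded video.
cap = cv2.VideoCapture("ipn_clip_example.avi")

# Read two consecutive frames and convert them to grayscale.
ok1, prev = cap.read()
ok2, curr = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
curr_gray = cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY)

# Dense Farneback optical flow: a per-pixel (dx, dy) displacement field.
flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

# Magnitude/angle form, handy for HSV-style flow visualizations.
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
print(flow.shape, float(mag.mean()))
cap.release()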

Introduction video (supplementary material)

Dataset details

The subjects were asked to record the gestures using their own PC, keeping the defined resolution and frame rate. Thus, only RGB videos were captured, and the distance between the camera and each subject varies. All videos were recorded at a resolution of 640x480 and 30 fps.
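Since every clip should share these capture settings, a quick sanity check with OpenCV (one of the requirements listed below) can confirm the resolution and frame rate of a downloaded video; the file name here is hypothetical.

import cv2

cap = cv2.VideoCapture("ipn_clip_example.avi")  # hypothetical clip path
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)

# Every IPN Hand video should report 640x480 at 30 fps.
print("{}x{} @ {:.0f} fps".format(width, height, fps))
cap.release()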

Each subject continuously performed 21 gestures, with three random breaks, in a single video. We defined 13 gestures to control the pointer and to perform actions focused on interaction with touchless screens.

The description and statistics of each gesture are shown in the following table. Duration is measured in number of frames (30 frames = 1 s).

id   Label   Gesture                         Instances   Mean duration (std)
1    D0X     Non-gesture                          1431   147 (133)
2    B0A     Pointing with one finger             1010   219 (67)
3    B0B     Pointing with two fingers            1007   224 (69)
4    G01     Click with one finger                 200    56 (29)
5    G02     Click with two fingers                200    60 (43)
6    G03     Throw up                              200    62 (25)
7    G04     Throw down                            201    65 (28)
8    G05     Throw left                            200    66 (27)
9    G06     Throw right                           200    64 (28)
10   G07     Open twice                            200    76 (31)
11   G08     Double click with one finger          200    68 (28)
12   G09     Double click with two fingers         200    70 (30)
13   G10     Zoom in                               200    65 (29)
14   G11     Zoom out                              200    64 (28)
     All non-gestures                             1431   147 (133)
     All gestures                                 4218   140 (94)
     Total                                        5649   142 (105)
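For scripts that consume the annotations, the table above can be mirrored as a small Python mapping. This dict is only a convenience sketch built from the table, not a file shipped with the repository.

# Label code -> (class id, gesture name), taken from the table above.
IPN_CLASSES = {
    "D0X": (1, "Non-gesture"),
    "B0A": (2, "Pointing with one finger"),
    "B0B": (3, "Pointing with two fingers"),
    "G01": (4, "Click with one finger"),
    "G02": (5, "Click with two fingers"),
    "G03": (6, "Throw up"),
    "G04": (7, "Throw down"),
    "G05": (8, "Throw left"),
    "G06": (9, "Throw right"),
    "G07": (10, "Open twice"),
    "G08": (11, "Double click with one finger"),
    "G09": (12, "Double click with two fingers"),
    "G10": (13, "Zoom in"),
    "G11": (14, "Zoom out"),
}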

Baseline results

Baseline results for isolated and continuous hand gesture recognition of the IPN Hand dataset can be found here.

Requirements

Please install the following requirements.

  • Python 3.5+
  • PyTorch 1.0+
  • TorchVision
  • Pillow
  • OpenCV
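As a quick environment check, the snippet below prints the installed versions of the packages listed above (how you install them, e.g., via pip or conda, is up to you):

import sys
import PIL
import cv2
import torch
import torchvision

# The versions should satisfy Python 3.5+ and PyTorch 1.0+.
print("Python     :", sys.version.split()[0])
print("PyTorch    :", torch.__version__)
print("TorchVision:", torchvision.__version__)
print("Pillow     :", PIL.__version__)
print("OpenCV     :", cv2.__version__)
print("CUDA OK    :", torch.cuda.is_available())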

Pretrained models

Usage

Preparation

  • Download the dataset from here
  • Clone this repository
$ git clone https://github.com/GibranBenitez/IPN-hand
  • Store all pretrained models in ./report_ipn/
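A minimal helper to create the checkpoint folder and verify that the downloaded models are in place (assuming the checkpoints are PyTorch .pth files; adjust the extension if the downloads differ):

import os

ckpt_dir = "./report_ipn"  # folder expected by the test scripts
os.makedirs(ckpt_dir, exist_ok=True)

# Assumed .pth extension for PyTorch checkpoints.
ckpts = [f for f in os.listdir(ckpt_dir) if f.endswith(".pth")]
print("Found {} checkpoint(s): {}".format(len(ckpts), ckpts))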

Isolated testing

  • Set the dataset path in ./tests/run_offline_ipn_Clf.sh, then run
$ bash run_offline_ipn_Clf.sh

Continuous testing

  • Set the dataset path in ./tests/run_online_ipnTest.sh, then run
$ bash run_online_ipnTest.sh

Citation

If you find the IPN Hand dataset useful for your research, please cite the paper:

@inproceedings{bega2020IPNhand,
  title={IPN Hand: A Video Dataset and Benchmark for Real-Time Continuous Hand Gesture Recognition},
  author={Benitez-Garcia, Gibran and Olivares-Mercado, Jesus and Sanchez-Perez, Gabriel and Yanai, Keiji},
  booktitle={25th International Conference on Pattern Recognition, {ICPR 2020}, Milan, Italy, Jan 10--15, 2021},
  pages={1--8},
  year={2021},
  organization={IEEE},
}

Acknowledgement

This project is inspired by many previous works.