Quick Human Actions Recognition

Introduction

This repository holds the codebase and dataset for the project:

Spatial Temporal Graph Convolutional Networks for the Recognition of Quick Human Actions

Prerequisites

Data Preparation

We experimented on the 3D skeletal data of NTU-RGB+D.
The pre-processed data can be downloaded from Google Drive.
After downloading the data, extract the "NTU-RGB-D" folder into the data path.
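
For reference, the pre-processed release stores the skeletons as a NumPy array together with a pickled list of sample names and labels. Below is a minimal loading sketch, assuming the file layout of the original ST-GCN release; the exact paths and file names may differ here:

```python
import pickle
import numpy as np

# Paths follow the original ST-GCN pre-processed release (cross-subject split)
# and are an assumption; point them at wherever "NTU-RGB-D" was extracted.
data_path = "NTU-RGB-D/xsub/train_data.npy"
label_path = "NTU-RGB-D/xsub/train_label.pkl"

data = np.load(data_path, mmap_mode="r")      # expected shape: (N, 3, T, 25, 2)
with open(label_path, "rb") as f:
    sample_names, labels = pickle.load(f)

print(data.shape, len(labels))
```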

Downsampling

To create a dataset of fast actions, we downsample the NTU-RGB+D dataset.
The downsampling keeps every other frame, halving the number of frames per sequence.
Run "downsample.py" to downsample the desired data.

Data Reduction (optional)

We provide "create_small_data.py" that creates a smaller data from the original data by selecting a number of actions out of all 60 actions. The desired actions can be selected in the code based on their labels on the NTU-RGB+D website.

Visualization

We provide visualization of the 3D skeletal data of NTU-RGB+D on MATLAB.

More details can be found in the "visualize" folder.
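
For readers without MATLAB, a rough Python/matplotlib equivalent can plot a single frame of the 25-joint NTU-RGB+D skeleton (random stand-in coordinates here; this is not the repository's MATLAB code):

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in skeleton: one frame of 25 joints with (x, y, z) coordinates,
# as in the NTU-RGB+D 3D skeletal data.
joints = np.random.randn(25, 3)

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter(joints[:, 0], joints[:, 1], joints[:, 2])
ax.set_title("One frame of a 25-joint NTU-RGB+D skeleton")
plt.show()
```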

Training

A model can be trained by running "main.py". The results are saved in the "results" folder.
When using a smaller dataset, some modifications to the code are needed; they are detailed in the code.
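
One typical change when training on the reduced dataset is matching the classifier's output size to the new number of classes. The sketch below is a generic PyTorch training loop with a stand-in model, not the actual code in "main.py"; all names and hyperparameters are placeholders:

```python
import torch
import torch.nn as nn

num_classes = 10        # 60 for the full label set, 10 for the reduced one
# Stand-in for the ST-GCN model; input layout (N, C, T, V, M) as in the pre-processed data.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 150 * 25 * 2, num_classes))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# Fake batch: 8 samples, 3 coordinates, 150 frames (after downsampling), 25 joints, 2 bodies.
x = torch.randn(8, 3, 150, 25, 2)
y = torch.randint(0, num_classes, (8,))

for epoch in range(2):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```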

Results

Some results of different experiments are shown here:

| Model | Temporal Kernel Size | Downsampled NTU-RGB+D (60 actions) | Downsampled NTU-RGB+D (10 actions) |
|---|---|---|---|
| Model I (ST-GCN) [1] | 9 | 86.02% | 93.39% |
| Model II (Proposed) | 9 | 85.59% | 94.01% |
| Model I (ST-GCN) [1] | 13 | 86.53% | 94.00% |
| Model II (Proposed) | 13 | 84.70% | 93.29% |

[1] Sijie Yan, Yuanjun Xiong, and Dahua Lin. Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition. AAAI, 2018.