
The official project website of "Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition" (the Ske2Grid paper was published at ICML 2023)

Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition

By Dongqi Cai, Yangyuxuan Kang, Anbang Yao and Yurong Chen.

This repository is the official PyTorch implementation of "Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition", dubbed Ske2Grid, published at ICML 2023.

Overview

Ske2Grid is a progressive representation learning framework built on transforming the human skeleton graph into an up-sampled grid representation. It is dedicated to skeleton-based human action recognition and shows leading performance on six mainstream benchmarks.

Comparison of convolution operations in GCNs and in Ske2Grid. In Ske2Grid, we construct a regular grid patch for skeleton representation via an up-sampling transform (UPT) and a graph-node index transform (GIT). A convolution on this grid patch convolves every grid cell with a shared regular kernel. It operates on the grid cells within a square sub-patch, which may be filled by nodes that lie far apart on the graph, giving a learnable receptive field on the skeleton for action feature modeling. For clarity of illustration, the figure shows the up-sampled skeleton graph with the locations of the original graph nodes unchanged.
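The skeleton-to-grid idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: in Ske2Grid the UPT and GIT are learned, whereas here the UPT is a fixed stand-in matrix (identity rows plus two-joint averages) and the GIT is a random permutation. The joint count of 17, the function names, and the averaging kernel are all hypothetical choices made for the example.

```python
import random


def upsample_nodes(joints, upt):
    """Multiply an (M x N) up-sampling matrix by (N x C) joint features."""
    channels = len(joints[0])
    return [
        [sum(w * joint[ch] for w, joint in zip(row, joints)) for ch in range(channels)]
        for row in upt
    ]


def place_on_grid(nodes, index_map, height, width):
    """Put node index_map[k] into grid cell k (row-major order)."""
    if sorted(index_map) != list(range(height * width)):
        raise ValueError("index_map must be a permutation of the grid cells")
    return [
        [nodes[index_map[r * width + c]] for c in range(width)]
        for r in range(height)
    ]


def conv3x3(grid, kernel):
    """Zero-padded 3x3 cross-correlation, one shared kernel applied per channel."""
    height, width, channels = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
    for r in range(height):
        for c in range(width):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        for ch in range(channels):
                            out[r][c][ch] += kernel[dr + 1][dc + 1] * grid[rr][cc][ch]
    return out


# Assumed sizes: 17 joints with 2 channels each, mapped to a 5x5 grid patch.
N, H, W = 17, 5, 5
rng = random.Random(0)
joints = [[rng.random(), rng.random()] for _ in range(N)]

# Stand-in UPT: keep the N joints, then add H*W - N nodes averaging two joints.
upt = [[1.0 if j == i else 0.0 for j in range(N)] for i in range(N)]
for _ in range(H * W - N):
    a, b = rng.sample(range(N), 2)
    upt.append([0.5 if j in (a, b) else 0.0 for j in range(N)])

# Stand-in GIT: a random one-to-one assignment of nodes to grid cells.
index_map = list(range(H * W))
rng.shuffle(index_map)

nodes = upsample_nodes(joints, upt)
grid = place_on_grid(nodes, index_map, H, W)
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
features = conv3x3(grid, kernel)
```

Because the index map is arbitrary, a 3x3 window on the grid can cover joints that are far apart on the skeleton, which is the effect the caption describes.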

(a) The overall framework of Ske2Grid: the input skeleton graph with $N$ joints is converted to a grid patch of size $H\times W$ using an up-sampling transform (UPT) paired with a graph-node index transform (GIT), and the grid patch is then fed into the Ske2Grid convolution network for action recognition. (b) Ske2Grid with the progressive learning strategy (PLS): the input skeleton is converted to a larger grid patch ($H'>H, W'>W$) using two-stage UPT plus GIT pairs. The well-trained Ske2Grid convolution network for the first-stage grid patch in (a) is re-used to initialize the network for the second-stage grid patch in (b), and the first-stage UPT plus GIT pair is kept fixed during training. PLS is applied in a cascaded way to boost the performance of the Ske2Grid convolution network as the grid patch size increases.
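As a reading aid, the two-stage pipeline in (b) can be written as a composition of matrix products. The symbols $X$, $U_1$, $P_1$, $U_2$, $P_2$, the channel count $C$, and all shapes are notation assumed here for illustration; the paper may parameterize the transforms differently.

$$
\begin{aligned}
X &\in \mathbb{R}^{N \times C}, \\
G_1 &= P_1 U_1 X \in \mathbb{R}^{HW \times C}, \qquad U_1 \in \mathbb{R}^{HW \times N}, \quad P_1 \in \{0,1\}^{HW \times HW}, \\
G_2 &= P_2 U_2 G_1 \in \mathbb{R}^{H'W' \times C}, \qquad U_2 \in \mathbb{R}^{H'W' \times HW}, \quad P_2 \in \{0,1\}^{H'W' \times H'W'}.
\end{aligned}
$$

Here each $U_i$ plays the role of a UPT and each $P_i$ is a permutation matrix playing the role of a GIT; reading the rows of $G_1$ or $G_2$ in row-major order gives the $H\times W$ or $H'\times W'$ grid patch. Under this reading, PLS keeps $(U_1, P_1)$ fixed, initializes the convolution network from the first-stage weights, and trains $(U_2, P_2)$ together with the network.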

Usage

  • Download PYSKL:
git clone https://github.com/kennymckormick/pyskl.git
  • Prepare datasets following the PYSKL data format, or download the pre-processed 2D or 3D skeletons from the PYSKL repository.
  • Merge our model and config folders "models", "utils" and "configs" into the corresponding folders "pyskl/models", "pyskl/utils" and "pyskl/configs" respectively.
  • Replace the folder "pyskl/datasets" with our "datasets" folder (to avoid mismatches caused by PYSKL version updates).
  • Install PYSKL as officially instructed:
pip3 install -e .
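The merge and replace steps above can also be scripted. The following is a minimal sketch using only the Python standard library (Python 3.8 or later for `dirs_exist_ok`); the function name `merge_into_pyskl` and the example paths in the comment are hypothetical, and the destination folders should be set to match your own PYSKL checkout.

```python
import shutil
from pathlib import Path


def merge_into_pyskl(ske2grid_root, merge_map, replace_map):
    """Copy Ske2Grid folders into a PYSKL checkout.

    merge_map:   {source folder name: destination path}; files are copied on
                 top of the destination, keeping files that already exist there.
    replace_map: {source folder name: destination path}; the destination is
                 deleted first, then replaced by the source folder.
    """
    root = Path(ske2grid_root)
    for src, dst in merge_map.items():
        shutil.copytree(root / src, Path(dst), dirs_exist_ok=True)
    for src, dst in replace_map.items():
        dst = Path(dst)
        if dst.exists():
            shutil.rmtree(dst)
        shutil.copytree(root / src, dst)


# Example call (paths are placeholders, adjust them to your layout):
# merge_into_pyskl(
#     "Ske2Grid",
#     merge_map={"models": "pyskl/models", "utils": "pyskl/utils", "configs": "pyskl/configs"},
#     replace_map={"datasets": "pyskl/datasets"},
# )
```

Merging keeps PYSKL files that Ske2Grid does not provide, while replacing removes stale files from the destination, which matches the distinction the steps above draw between the two operations.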

Models & Results

Results comparison on the NTU-60 XSub validation set.

| Method | Grid Patch Size | Config | Top-1 Acc (%) | Model | Log |
|---|---|---|---|---|---|
| ST-GCN | -- | config | 85.15 | model | log |
| Ske2Grid | 5x5 | config | 86.20 | model | log |
| Ske2Grid | 6x6 | config | 87.87 | model | log |
| Ske2Grid | 7x7 | config | 88.26 | model | log |
| Ske2Grid | 8x8 | config | 88.55 | model | log |
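To read the results as improvements over the ST-GCN baseline, the differences can be computed directly from the accuracies listed above. The snippet below is a small convenience sketch; the variable names are illustrative only.

```python
# Top-1 accuracies (%) copied from the results on the NTU-60 XSub validation set.
baseline = 85.15  # ST-GCN
ske2grid = {"5x5": 86.20, "6x6": 87.87, "7x7": 88.26, "8x8": 88.55}

# Gain of each grid patch size over the baseline, in percentage points.
gains = {size: round(acc - baseline, 2) for size, acc in ske2grid.items()}

for size, gain in gains.items():
    print(f"Ske2Grid {size}: +{gain:.2f} points over ST-GCN")
```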

Training & Testing

Training Ske2Grid using the grid patch representation $D_{5\times 5}$:

bash tools/dist_train.sh configs/Ske2Grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j.py 4 --validate

Progressively training Ske2Grid of $D_{6\times 6}$ from the previously trained Ske2Grid model of $D_{5\times 5}$:

bash tools/dist_train.sh configs/Ske2Grid/d5tod6_stgcn2cn_pyskl_ntu60_xsub_hrnet/j.py 4 --validate  --cfg-options load_from='work_dirs/ske2grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j/best_top1_acc_epoch_*.pth'
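The `load_from` value above contains a `*` inside quotes, which the shell does not expand, so the wildcard presumably stands for the epoch number of the best first-stage checkpoint. A minimal sketch, assuming exactly one matching file exists, that resolves the pattern with Python's `glob` module and builds the same command with the concrete path; the helper names are hypothetical and not part of PYSKL.

```python
import glob


def resolve_single_checkpoint(pattern):
    """Return the only file matching the pattern, or raise if there is not exactly one."""
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one checkpoint for {pattern!r}, found {len(matches)}"
        )
    return matches[0]


def build_progressive_command(config, checkpoint, gpus=4):
    """Assemble the progressive-training command as an argument list."""
    return [
        "bash", "tools/dist_train.sh", config, str(gpus),
        "--validate", "--cfg-options", f"load_from={checkpoint}",
    ]


# Example (run from the PYSKL root after first-stage training has finished):
# ckpt = resolve_single_checkpoint(
#     "work_dirs/ske2grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j/best_top1_acc_epoch_*.pth"
# )
# cmd = build_progressive_command(
#     "configs/Ske2Grid/d5tod6_stgcn2cn_pyskl_ntu60_xsub_hrnet/j.py", ckpt
# )
# print(" ".join(cmd))
```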

Evaluating Ske2Grid of grid patch $D_{5\times 5}$:

bash tools/dist_test.sh configs/Ske2Grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j.py work_dirs/ske2grid/d5_stgcn2cn_pyskl_ntu60_xsub_hrnet/j/best_top1_acc_epoch_*.pth 4 --out results_d5.pkl --eval top_k_accuracy mean_class_accuracy
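The two metrics requested with `--eval` can be illustrated with their common definitions. The sketch below is written in plain Python on toy scores and is not the evaluation code used by PYSKL, which may differ in details such as tie-breaking or the handling of classes without samples; the function signatures are assumptions made for the example.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top
    return hits / len(labels)


def mean_class_accuracy(scores, labels):
    """Average of per-class top-1 accuracies over the classes present in labels."""
    correct, total = {}, {}
    for row, label in zip(scores, labels):
        pred = max(range(len(row)), key=lambda i: row[i])
        total[label] = total.get(label, 0) + 1
        correct[label] = correct.get(label, 0) + (pred == label)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy example: 4 samples, 3 classes.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.5, 0.4, 0.1],
    [0.2, 0.3, 0.5],
]
labels = [0, 1, 1, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
mean_acc = mean_class_accuracy(scores, labels)
```

Top-k accuracy weights every sample equally, while mean class accuracy weights every class equally, so the two can diverge when the class distribution is unbalanced.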

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{cai2023ske2grid,
  title={Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition},
  author={Cai, Dongqi and Kang, Yangyuxuan and Yao, Anbang and Chen, Yurong},
  booktitle={International Conference on Machine Learning},
  year={2023},
  url={https://openreview.net/forum?id=SQtp4uUByd}
}

License

Ske2Grid is released under the Apache License 2.0. We encourage use for both research and commercial purposes, as long as proper attribution is given.

Acknowledgement

This repository is built on the PYSKL repository. We thank the authors for releasing their code.