GroupFormer

By Shuaicheng Li, Qianggang Cao, Lingbo Liu, Kunlin Yang, Shinan Liu, Jun Hou, Shuai Yi. This repository is an official implementation of the paper Group Activity Recognition with Clustered Spatial-Temporal Transformer.

Introduction

GroupFormer uses a tailored Transformer to model individual and group representations for group activity recognition. First, a Group Representation Generator produces an initial group representation by merging the individual context with the scene-wide context. Multiple stacked Spatial-Temporal Transformers (STTR) then augment and refine both the individual and group representations. The model uses the query-key mechanism to jointly model spatial-temporal context for group activity inference.
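To make the query-key mechanism concrete, here is a minimal sketch of generic scaled dot-product attention in plain Python. It is not the repository's STTR module: the function names (`softmax`, `attend`) and the toy feature vectors are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Each output is the attention-weighted sum of the value vectors.
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy example (hypothetical data): three individual feature vectors attend to one another.
individuals = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
refined, weights = attend(individuals, individuals, individuals)
```

In this toy case the first individual attends more strongly to the second (a similar vector) than to the third, which is the behaviour a query-key similarity is meant to capture.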

License

This project is released under the Apache 2.0 license.

Results

| Backbone | Style | Action Acc | Activity Acc | Config | Download |
| --- | --- | --- | --- | --- | --- |
| Inv3+flow+pose | pytorch | 0.847 | 0.957 | config | model, test_log |

Preparation

Requirements

  • Linux, CUDA>=9.2, GCC>=5.4

  • Python>=3.7
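A quick way to sanity-check these requirements before compiling is sketched below. It is not part of the repository: the helper names `python_ok` and `find_tools` are hypothetical, and the script only checks that `gcc` and `nvcc` are on `PATH`, not that their versions meet the minimums listed above.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)  # taken from the requirements list


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


def find_tools(names=("gcc", "nvcc")):
    """Map each tool name to its location on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Python >= 3.7:", python_ok())
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```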

Compiling RoIAlign

Usage

Dataset and Extracted Features

First download the Volleyball dataset.

The following files need to be adapted to run the code on your own machine:

  • Change the file paths for the keypoint, dataset, tracks, and flow data in config/*.yaml.
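After editing the configs, a small check like the following can flag paths that do not exist on your machine. This is a sketch under stated assumptions, not a repository tool: it assumes paths are written as absolute values on simple `key: value` lines, uses only the standard library rather than a YAML parser, and the function name `find_missing_paths` is hypothetical.

```python
import glob
import os
import re

# Matches simple "key: value" lines; nested or multi-line YAML values are ignored.
LINE = re.compile(r"^\s*([\w\-]+)\s*:\s*(\S.*?)\s*$")


def find_missing_paths(config_text):
    """Return (key, path) pairs whose value looks like an absolute path that does not exist."""
    missing = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0]  # drop trailing comments (breaks paths containing '#')
        match = LINE.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip("'\"")
        if value.startswith("/") and not os.path.exists(value):
            missing.append((key, value))
    return missing


if __name__ == "__main__":
    for config_file in sorted(glob.glob("config/*.yaml")):
        with open(config_file) as handle:
            for key, path in find_missing_paths(handle.read()):
                print(f"{config_file}: {key} -> {path} (not found)")
```

Run it from the repository root so that `config/*.yaml` resolves; any line it prints points at a path that still needs to be changed.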

We also provide the keypoint data extracted with AlphaPose.

The flow data is too large to upload; it can be generated with FlowNet, as described in our paper.

Training

./dist_train.sh $GPU_NUM $CONFIG

Testing

./dist_test.sh $GPU_NUM $CONFIG $CHECKPOINT 

Citing GroupFormer

If you find this work useful in your research, please consider citing:

@inproceedings{li2021groupformer,
  title={GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer},
  author={Li, Shuaicheng and Cao, Qianggang and Liu, Lingbo and Yang, Kunlin and Liu, Shinan and Hou, Jun and Yi, Shuai},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={13668--13677},
  year={2021}
}

More Info

A minimal version has been released, containing the core modules described in the paper.

Any suggestions are welcome. We are glad to optimize our code and provide more details.