This is the official implementation of the approach described in the paper:
Wenhao Li, Hong Liu, Runwei Ding, Mengyuan Liu, Pichao Wang, and Wenming Yang. Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation. IEEE Transactions on Multimedia, 2022.
Our method has recently been validated as a backbone network for self-supervised pre-training!
- CUDA 11.1
- Python 3.6
- PyTorch 1.7.1
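To sanity-check the environment before running anything, a minimal snippet such as the one below (assuming PyTorch was installed with the matching CUDA 11.1 build) can confirm the versions:

```python
import torch

# Verify the PyTorch / CUDA setup used in this repository.
print(torch.__version__)          # expected: 1.7.1
print(torch.version.cuda)         # expected: 11.1
print(torch.cuda.is_available())  # True if a compatible GPU and driver are present
```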
Please download the dataset from the Human3.6M website and follow VideoPose3D to set it up under the './dataset' directory.
${POSE_ROOT}/
|-- dataset
| |-- data_3d_h36m.npz
| |-- data_2d_h36m_cpn_ft_h36m_dbb.npz
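As a quick check that the preprocessed files are in place, a small sketch like the following can be used (the key names follow the VideoPose3D data format and are assumptions, not guaranteed by this repository):

```python
import numpy as np

# Inspect the preprocessed Human3.6M archives under ./dataset
data_3d = np.load('dataset/data_3d_h36m.npz', allow_pickle=True)
data_2d = np.load('dataset/data_2d_h36m_cpn_ft_h36m_dbb.npz', allow_pickle=True)

print(data_3d.files)  # e.g. ['positions_3d']
print(data_2d.files)  # e.g. ['positions_2d', 'metadata']
```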
The pretrained model can be found at Google_Drive; please download it and put it in the './checkpoint' directory.
To test the pretrained model on Human3.6M:
python main.py --refine --reload --refine_reload --previous_dir 'checkpoint/pretrained'
To train on Human3.6M:
python main.py --train
After training for several epochs, add the refine module:
python main.py --train --refine --lr 1e-5 --reload --previous_dir [your model saved path]
If you find our work useful in your research, please consider citing:
@article{li2022exploiting,
title={Exploiting temporal contexts with strided transformer for 3d human pose estimation},
author={Li, Wenhao and Liu, Hong and Ding, Runwei and Liu, Mengyuan and Wang, Pichao and Yang, Wenming},
journal={IEEE Transactions on Multimedia},
year={2022},
}
Our code is built on top of ST-GCN and extended from the following repositories. We thank the authors for releasing their code.
This project is licensed under the terms of the MIT license.