The codebase is tested under the following environment settings:
- cuda: 11.0
- python: 3.8.10
- pytorch: 1.7.1
- torchvision: 0.8.2
- scikit-image: 0.18.2
- einops: 0.3.0
- timm: 0.4.9
- pyyaml: 6.0
- easydict: 1.9
- opencv-python: 4.5.2.54
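To compare a local environment against the versions listed above, a small check along the following lines may help. This is only a sketch: the dictionary keys are assumed pip distribution names (for example, torch for pytorch), and the helper names installed_version and find_mismatches are hypothetical, not part of this codebase.

```python
import sys
from importlib import metadata

# Versions listed above as the tested environment.
# Keys are assumed pip distribution names (e.g. "torch" for pytorch).
TESTED = {
    "torch": "1.7.1",
    "torchvision": "0.8.2",
    "scikit-image": "0.18.2",
    "einops": "0.3.0",
    "timm": "0.4.9",
    "PyYAML": "6.0",
    "easydict": "1.9",
    "opencv-python": "4.5.2.54",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(tested, lookup=installed_version):
    """Map each package whose installed version differs to (tested, found)."""
    mismatches = {}
    for name, wanted in tested.items():
        found = lookup(name)
        # Local build tags such as "1.7.1+cu110" are treated as a match.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


print("python:", sys.version.split()[0], "(tested: 3.8.10)")
for name, (wanted, found) in find_mismatches(TESTED).items():
    print(f"{name}: tested {wanted}, found {found or 'not installed'}")
```

A reported mismatch does not necessarily mean the code will fail; it only indicates a difference from the tested setup.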
To perform the evaluation on the Human3.6M dataset, you should:
- Download data.zip from https://cloud.tsinghua.edu.cn/f/b102a975ff8d4ae1a4c1/?dl=1.
- Extract the file.
- Put the extracted files into the ./data/ directory.
After doing so, the file structure should be as follows:
./data
    data_2d_h36m_cpn_ft_h36m_dbb.npz
    data_2d_h36m_gt.npz
    data_3d_h36m.npz
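The download and extraction steps can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names download and extract are hypothetical, and the internal layout of data.zip is an assumption (if the archive wraps the files in a subfolder, they still need to be moved directly under ./data/).

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the download step above.
DATA_URL = "https://cloud.tsinghua.edu.cn/f/b102a975ff8d4ae1a4c1/?dl=1"


def download(url, archive_path):
    """Download url to archive_path and return the path."""
    archive_path = Path(archive_path)
    urllib.request.urlretrieve(url, str(archive_path))
    return archive_path


def extract(archive_path, target_dir):
    """Extract a zip archive into target_dir and return the .npz paths found."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    # Search recursively, because the archive may wrap files in a subfolder.
    return sorted(target_dir.rglob("*.npz"))


# Example usage (left as a comment because it fetches a large file):
#   archive = download(DATA_URL, "data.zip")
#   for path in extract(archive, "./data"):
#       print(path)
```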
The trained checkpoints can be downloaded from https://cloud.tsinghua.edu.cn/d/fae76890154a45a99b31/. After downloading, put the checkpoints into the ./checkpoint/ directory. The file structure of ./checkpoint/ should be as follows:
./checkpoint
    cpn_f81.bin
    cpn_f243.bin
    gt_f81.bin
    gt_f243.bin
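Before running the evaluation, it may be worth confirming that both directories match the listings above. The sketch below reports any expected file that is absent; the function name missing_files is hypothetical, and the file names are copied from the two listings.

```python
from pathlib import Path

# Expected layout, copied from the ./data and ./checkpoint listings above.
EXPECTED = {
    "data": [
        "data_2d_h36m_cpn_ft_h36m_dbb.npz",
        "data_2d_h36m_gt.npz",
        "data_3d_h36m.npz",
    ],
    "checkpoint": [
        "cpn_f81.bin",
        "cpn_f243.bin",
        "gt_f81.bin",
        "gt_f243.bin",
    ],
}


def missing_files(root=".", expected=EXPECTED):
    """Return 'folder/name' entries that do not exist as files under root."""
    root = Path(root)
    return [
        f"{folder}/{name}"
        for folder, names in expected.items()
        for name in names
        if not (root / folder / name).is_file()
    ]


# Run from the repository root; an empty list means nothing is missing.
print(missing_files("."))
```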
To conduct evaluation using the CPN inputs, you can run the following commands:
CUDA_VISIBLE_DEVICES=0 python eval.py -c ./exp/exp_cpn_f81.yaml # using 81 frames as input
CUDA_VISIBLE_DEVICES=0 python eval.py -c ./exp/exp_cpn_f243.yaml # using 243 frames as input
Similarly, to evaluate using the GT inputs, you can run the following commands:
CUDA_VISIBLE_DEVICES=0 python eval.py -c ./exp/exp_gt_f81.yaml # using 81 frames as input
CUDA_VISIBLE_DEVICES=0 python eval.py -c ./exp/exp_gt_f243.yaml # using 243 frames as input
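To run all four evaluations in sequence, the commands above can be wrapped in a short driver script. This is a sketch, not part of the codebase: build_command and run_all are hypothetical names, and the script assumes it is started from the repository root, where eval.py is located.

```python
import os
import subprocess
import sys

# Config files taken from the evaluation commands above.
CONFIGS = [
    "./exp/exp_cpn_f81.yaml",   # CPN inputs, 81 frames
    "./exp/exp_cpn_f243.yaml",  # CPN inputs, 243 frames
    "./exp/exp_gt_f81.yaml",    # GT inputs, 81 frames
    "./exp/exp_gt_f243.yaml",   # GT inputs, 243 frames
]


def build_command(config, python=sys.executable):
    """Return the argument list equivalent to `python eval.py -c <config>`."""
    return [python, "eval.py", "-c", config]


def run_all(configs=CONFIGS, gpu="0"):
    """Run each evaluation in turn on the given GPU, stopping on failure."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for config in configs:
        print("Evaluating", config)
        subprocess.run(build_command(config), env=env, check=True)


# Example usage (left as a comment because it needs the data and checkpoints):
#   run_all()
```

Passing check=True makes the script stop at the first evaluation that exits with an error instead of silently continuing.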
Part of the code is borrowed from Poseformer and VideoPose3D. We thank the authors for releasing their code.