This repository contains the implementation of the approach described in the paper:
Yu Zhan, Fenghai Li, Renliang Weng, and Wongun Choi. Ray3D: ray-based 3D human pose estimation for monocular absolute 3D localization. In Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Dashed lines denote 3D ground-truth poses. Solid lines represent the poses estimated by Ray3D.
Please make sure you have the following dependencies installed before running:
- python 3
- torch==1.4.0
- other necessary dependencies listed in `requirements.txt`
- (optional) screen, rsync
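A typical setup in a fresh Python environment might look like this (a minimal sketch; pick the torch build that matches your CUDA version):

```bash
pip install torch==1.4.0
pip install -r requirements.txt
```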
We use the Human3.6M data processed by VideoPose3D. You can generate the files yourself, or download them from Google Drive.
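If you generate the files yourself, the preparation follows VideoPose3D's documented procedure; the sketch below assumes you have obtained `h36m.zip` as described in VideoPose3D's DATASETS.md, and that this repository expects the resulting `.npz` files in a `data/` folder (an assumption about the layout):

```bash
git clone https://github.com/facebookresearch/VideoPose3D.git
cd VideoPose3D/data
python3 prepare_data_h36m.py --from-archive h36m.zip   # per VideoPose3D's DATASETS.md
# copy the generated files into this repository (assumed destination)
cp data_3d_h36m.npz data_2d_h36m_gt.npz /path/to/Ray3D/data/
```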
We set up 3DHP with our own script. You can download the original dataset and generate the files with the following command:
```bash
# set the 'data_root' parameter to the directory that stores the original 3DHP data
python3 prepare_data_3dhp.py
```
Alternatively, you can download the processed data directly from Google Drive.
We set up HumanEva-I by following the procedure provided by VideoPose3D. You can set it up yourself, or download the files from Google Drive.
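For self-setup, VideoPose3D also ships a HumanEva-I preparation script; this is a sketch assuming you follow its DATASETS.md for the required dataset conversion and script arguments:

```bash
cd VideoPose3D/data
# see VideoPose3D's DATASETS.md for the required conversion steps and arguments
python3 prepare_data_humaneva.py
```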
The synthetic dataset is built on top of Human3.6M. Once you have the 'data_3d_h36m.npz' file generated, you can create the synthetic dataset with the following procedure:
```bash
# 1). generate synthetic data for camera intrinsic test
python3 camera_intrinsic.py
```
Then, run the following preprocessing script:
```bash
# 2). generate synthetic data for camera extrinsic test
python3 camera_extrinsic.py
```
Finally, use the following preprocessing script to generate the training and evaluation files for synthetic training:
```bash
# 3). generate train and evaluation file for synthetic training
python3 aggregate_camera.py
```
We train and test five approaches on the above-mentioned datasets:
- Ray3D: implemented in the `main` branch.
- RIE: implemented in the `main` branch.
- Videopose: implemented in the `videopose` branch.
- Poseformer: implemented in the `poseformer` branch.
- Poselifter: implemented in the `poselifter` branch.
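To work with one of the baselines, switch to its branch before training (assuming a standard git clone of this repository):

```bash
git checkout poseformer   # or: videopose, poselifter
```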
We release the pretrained models for academic purposes. You can create a folder named `checkpoint` to store all the pretrained models.
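For example, from the repository root (the per-timestamp subfolder layout is an assumption inferred from the evaluation command below):

```bash
mkdir checkpoint
# place the downloaded models here, e.g. checkpoint/Oct_31_2021_05_43_36/best_epoch.bin
```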
Please start visdom before you begin training a new model.
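Visdom's server is typically started like this (it listens on port 8097 by default):

```bash
python3 -m visdom.server
```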
To train the above-mentioned methods, run the following command, specifying a configuration file from the `cfg` folder:

```bash
python3 main.py --cfg cfg_ray3d_3dhp_stage1
```
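Other datasets follow the same pattern; the exact configuration name below is an assumption, so list the `cfg` folder to see what is available:

```bash
ls cfg/                                      # list the available configurations
python3 main.py --cfg cfg_ray3d_h36m_stage1  # assumed config name; adjust to an existing file
```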
To train Ray3D with synthetic data, please use the code from the `synthetic` branch, which includes some optimizations for large-scale training.
To evaluate the models on the public and synthetic datasets, run the following command, specifying the appropriate configuration file, timestamp, and checkpoint:
```bash
python3 main.py \
    --cfg cfg_ray3d_h36m_stage3 \
    --timestamp Oct_31_2021_05_43_36 \
    --evaluate best_epoch.bin
```
To evaluate Ray3D on the synthetic dataset with the 14-joint setup, please use these scripts.
We use the same visualization techniques as those provided by VideoPose3D. You can perform visualization with the following command:
```bash
python3 main.py \
    --cfg cfg_ray3d_h36m_stage3 \
    --timestamp Oct_31_2021_05_43_36 \
    --evaluate best_epoch.bin \
    --render
```
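Because rendering follows VideoPose3D, its visualization flags (subject, action, camera, output file) may carry over; the flags below are an assumption based on VideoPose3D's interface, so check `main.py` for the options this repository actually supports:

```bash
# VideoPose3D-style visualization flags (assumed; verify against main.py)
python3 main.py \
    --cfg cfg_ray3d_h36m_stage3 \
    --timestamp Oct_31_2021_05_43_36 \
    --evaluate best_epoch.bin \
    --render \
    --viz-subject S11 \
    --viz-action Walking \
    --viz-camera 0 \
    --viz-output output.gif
```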
If you find this repository useful, please cite our paper:
```bibtex
@inproceedings{yzhan2022,
  title     = {Ray3D: ray-based 3D human pose estimation for monocular absolute 3D localization},
  author    = {Zhan, Yu and Li, Fenghai and Weng, Renliang and Choi, Wongun},
  booktitle = {CVPR},
  year      = {2022}
}
```
Our implementation took inspiration from the following authors and repositories. We thank them for kindly releasing their code!