RootNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image"
All download links have been replaced with Google Drive links. Sorry for the slow and unstable previous links. If you hit a 'Download limit' error when trying to download a dataset from a Google Drive link, please try this trick:
* Go to the shared folder, which contains the files you want to copy to your drive
* Select all the files you want to copy
* In the upper right corner click on three vertical dots and select “make a copy”
* Then, the files are copied to your personal Google Drive account. You can download them from your personal account.
This repo is the official PyTorch implementation of Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image (ICCV 2019). It contains the RootNet part.
What this repo provides:
- PyTorch implementation of Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image (ICCV 2019).
- Flexible and simple code.
- Compatibility with most publicly available 2D and 3D, single- and multi-person pose estimation datasets, including Human3.6M, MPII, MS COCO 2017, MuCo-3DHP and MuPoTS-3D.
- Human pose estimation visualization code.
This code is tested under an Ubuntu 16.04, CUDA 9.0, cuDNN 7.1 environment with two NVIDIA 1080Ti GPUs.
Python 3.6.5 with Anaconda 3 is used for development.
The `${POSE_ROOT}` directory is organized as below.
```
${POSE_ROOT}
|-- data
|-- common
|-- main
`-- output
```
- `data` contains data loading codes and soft links to images and annotations directories.
- `common` contains kernel codes for the 3D multi-person pose estimation system.
- `main` contains high-level codes for training or testing the network.
- `output` contains logs, trained models, visualized outputs, and test results.
You need to follow the directory structure of the `data` folder as below.
```
${POSE_ROOT}
|-- data
|-- |-- Human36M
|   `-- |-- bbox
|       |   |-- bbox_human36m_output.json
|       |-- images
|       `-- annotations
|-- |-- MPII
|   `-- |-- images
|       `-- annotations
|-- |-- MSCOCO
|   `-- |-- images
|       |   |-- train/
|       |   |-- val/
|       `-- annotations
|-- |-- MuCo
|   `-- |-- data
|       |   |-- augmented_set
|       |   |-- unaugmented_set
|       |   `-- MuCo-3DHP.json
`-- |-- MuPoTS
|   `-- |-- bbox
|       |   |-- bbox_mupots_output.json
|       |-- data
|       |   |-- MultiPersonTestSet
|       |   `-- MuPoTS-3D.json
```
- Download Human3.6M parsed data [data]
- Download MPII parsed data [images][annotations]
- Download MuCo parsed and composited data [data]
- Download MuPoTS parsed data [images][annotations]
- All annotation files follow the MS COCO format.
- If you want to add your own dataset, you have to convert it to the MS COCO format (see the sketch below).
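For orientation, here is a hedged sketch of the MS COCO-style annotation layout. The keys below follow the standard COCO convention; the exact per-dataset fields in this repo may differ, so check the provided annotation files for the authoritative schema.

```python
import json

# illustrative COCO-style annotation structure (values are placeholders)
annotations = {
    "images": [
        {"id": 0, "file_name": "000001.jpg", "width": 1920, "height": 1080}
    ],
    "annotations": [
        {
            "id": 0,
            "image_id": 0,
            "category_id": 1,
            "bbox": [100.0, 150.0, 200.0, 400.0],  # [x, y, w, h]
            "keypoints": [512.0, 300.0, 2] * 17,   # x, y, visibility per joint
        }
    ],
    "categories": [{"id": 1, "name": "person"}],
}

with open("my_dataset.json", "w") as f:
    json.dump(annotations, f)
```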
You need to follow the directory structure of the `output` folder as below.
```
${POSE_ROOT}
|-- output
|-- |-- log
|-- |-- model_dump
|-- |-- result
`-- |-- vis
```
- Creating the `output` folder as a soft link instead of a regular folder is recommended because the outputs can take up a lot of storage (see the sketch after this list).
- The `log` folder contains training log files.
- The `model_dump` folder contains saved checkpoints for each epoch.
- The `result` folder contains the final estimation files generated in the testing stage.
- The `vis` folder contains visualized results.
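As a minimal sketch of the soft-link setup using Python's standard library (the target path below is a placeholder for a disk with enough free space):

```python
import os

# hypothetical location on a larger disk; adjust to your setup
target = '/path/to/large/disk/rootnet_output'

os.makedirs(target, exist_ok=True)
if not os.path.lexists('output'):
    os.symlink(target, 'output')  # ${POSE_ROOT}/output -> large disk
```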
- In `main/config.py`, you can change the settings of the model, including the dataset to use, the network backbone, the input size, and so on.
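As a rough illustration of the kind of settings exposed there (the attribute names below are illustrative, not the exact ones in `main/config.py`; check the file itself):

```python
# illustrative only: the real main/config.py defines its own attribute names
class Config:
    trainset = ['Human36M', 'MPII']  # datasets used for training
    testset = 'MuPoTS'               # dataset used for testing
    backbone = 'resnet50'            # network backbone (hypothetical name)
    input_shape = (256, 256)         # input resolution fed to the network
    batch_size = 32                  # per-GPU batch size (hypothetical)

cfg = Config()
```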
In the `main` folder, run `python train.py --gpu 0-1` to train the network on GPUs 0 and 1. If you want to continue an experiment, run `python train.py --gpu 0-1 --continue`. `--gpu 0,1` can be used instead of `--gpu 0-1`.
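Since the `0-1` and `0,1` forms are equivalent, here is a minimal sketch of how such an argument can be expanded into a comma-separated device list (the repo's actual argument parsing may differ):

```python
def expand_gpu_arg(gpu: str) -> str:
    """Expand a '0-1' style range into a '0,1' style list."""
    if '-' in gpu:
        start, end = map(int, gpu.split('-'))
        return ','.join(str(i) for i in range(start, end + 1))
    return gpu

assert expand_gpu_arg('0-1') == '0,1'
assert expand_gpu_arg('0,1') == '0,1'
```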
Place the trained model at `output/model_dump/`. In the `main` folder, run `python test.py --gpu 0-1 --test_epoch 20` to test the network on GPUs 0 and 1 with the model trained for 20 epochs. `--gpu 0,1` can be used instead of `--gpu 0-1`.
Here I report the performance of RootNet. You can also download the pre-trained RootNet model here, as well as the bounding boxes (from DetectNet) and root joint coordinates (from RootNet) of the Human3.6M, MSCOCO, and MuPoTS-3D datasets here.
For evaluation, you can run `test.py`, or use the evaluation codes in `Human36M` and `MuPoTS`.
Results on Human3.6M (MRPE in millimeters):

| Method | MRPE | MRPE_x | MRPE_y | MRPE_z |
|---|---|---|---|---|
| RootNet | 120.0 | 23.3 | 23.0 | 108.1 |
Results on MuPoTS-3D (AP_25: average precision of the root joint with a 25cm threshold, in percent):

| Method | AP_25 |
|---|---|
| RootNet | 31.0 |
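For reference, MRPE is the mean Euclidean distance between predicted and ground-truth root joints, with MRPE_x/y/z the per-axis mean absolute errors, and AP_25 scores a root prediction as correct when it lies within 25cm of the ground truth. Below is a minimal NumPy sketch based on my reading of the paper's definitions, not the repo's evaluation code; in particular, the AP computation here is simplified to a percentage-of-correct-roots measure, whereas the paper's AP additionally accounts for detection scores.

```python
import numpy as np

def mrpe(pred, gt):
    """pred, gt: (N, 3) root coordinates in mm (camera space)."""
    total = np.linalg.norm(pred - gt, axis=1)  # per-sample Euclidean error
    per_axis = np.abs(pred - gt).mean(axis=0)  # MRPE_x, MRPE_y, MRPE_z
    return total.mean(), per_axis

def correct_root_ratio(pred, gt, thr=250.0):
    """Fraction of roots within thr mm (25cm) of the ground truth."""
    return (np.linalg.norm(pred - gt, axis=1) < thr).mean()
```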
We additionally provide estimated 3D human root coordinates on the MSCOCO dataset. The coordinates are in the 3D camera coordinate system, and the focal lengths are set to 1500mm for both the x and y axes. You can change the focal length and the corresponding distance using Equation 2 of the paper or the equation in its supplementary material.
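Concretely, since the estimated distance in Equation 2 scales with the square root of the product of the focal lengths, the provided depths can be rescaled to a different camera as in this sketch (assuming the released coordinates use f_x = f_y = 1500; the `f_new` values are just an example, substitute your camera's intrinsics):

```python
import math

def rescale_depth(z_mm, f_old=(1500.0, 1500.0), f_new=(1469.0, 1469.0)):
    """Rescale a root depth (mm) from old to new focal lengths."""
    return z_mm * math.sqrt((f_new[0] * f_new[1]) / (f_old[0] * f_old[1]))

print(rescale_depth(3000.0))  # depth in mm under the new intrinsics
```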
If this code helps your research, please consider citing:

```
@InProceedings{Moon_2019_ICCV_3DMPPE,
  author = {Moon, Gyeongsik and Chang, Juyong and Lee, Kyoung Mu},
  title = {Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image},
  booktitle = {The IEEE Conference on International Conference on Computer Vision (ICCV)},
  year = {2019}
}
```