Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019




PoseNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image"


This repo is the official PyTorch implementation of Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image (ICCV 2019). It contains the PoseNet part.

What this repo provides:


This code was tested on Ubuntu 16.04 with CUDA 9.0 and cuDNN 7.1, using two NVIDIA 1080Ti GPUs.

Python 3.6.5 with Anaconda 3 was used for development.



The ${POSE_ROOT} directory is organized as follows.

|-- data
|-- common
|-- main
|-- vis
`-- output
  • data contains the data loading code and soft links to the image and annotation directories.
  • common contains the core code for the 3D multi-person pose estimation system.
  • main contains the high-level code for training and testing the network.
  • vis contains scripts for 3D visualization.
  • output contains logs, trained models, visualized outputs, and test results.


The data directory must follow the structure below.

|-- data
|   |-- Human36M
|   |   |-- bbox_root
|   |   |   `-- bbox_root_human36m_output.json
|   |   |-- images
|   |   `-- annotations
|   |-- MPII
|   |   |-- images
|   |   `-- annotations
|   |-- MSCOCO
|   |   |-- bbox_root
|   |   |   `-- bbox_root_coco_output.json
|   |   |-- images
|   |   |   |-- train/
|   |   |   `-- val/
|   |   `-- annotations
|   |-- MuCo
|   |   `-- data
|   |       |-- augmented_set
|   |       |-- unaugmented_set
|   |       `-- MuCo-3DHP.json
|   `-- MuPoTS
|       |-- bbox_root
|       |   `-- bbox_mupots_output.json
|       `-- data
|           |-- MultiPersonTestSet
|           `-- MuPoTS-3D.json
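Before training, it can help to verify that this layout is in place. The following is a minimal sketch, not part of the repo: the function name find_missing_data_paths and the list EXPECTED_DATA_PATHS are hypothetical, and the paths are simply copied from the tree above.

```python
from pathlib import Path

# Relative paths copied from the data directory tree in this README.
EXPECTED_DATA_PATHS = [
    "Human36M/bbox_root/bbox_root_human36m_output.json",
    "Human36M/images",
    "Human36M/annotations",
    "MPII/images",
    "MPII/annotations",
    "MSCOCO/bbox_root/bbox_root_coco_output.json",
    "MSCOCO/images/train",
    "MSCOCO/images/val",
    "MSCOCO/annotations",
    "MuCo/data/augmented_set",
    "MuCo/data/unaugmented_set",
    "MuCo/data/MuCo-3DHP.json",
    "MuPoTS/bbox_root/bbox_mupots_output.json",
    "MuPoTS/data/MultiPersonTestSet",
    "MuPoTS/data/MuPoTS-3D.json",
]


def find_missing_data_paths(data_root):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_DATA_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    # Hypothetical helper: run from ${POSE_ROOT} so that "data" resolves.
    missing = find_missing_data_paths("data")
    for rel in missing:
        print("missing:", rel)
    if not missing:
        print("data directory layout looks complete")
```

Soft links count as existing as long as their targets exist, so the check also catches broken links to image or annotation directories.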


The output folder must follow the structure below.

|-- output
|   |-- log
|   |-- model_dump
|   |-- result
|   `-- vis
  • Creating the output folder as a soft link rather than a regular folder is recommended, because its contents can take up a large amount of storage.
  • The log folder contains the training log file.
  • The model_dump folder contains the checkpoints saved after each epoch.
  • The result folder contains the final estimation files generated in the testing stage.
  • The vis folder contains the visualized results.
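One way to set up the output folder as a soft link is sketched below. This is an illustration under stated assumptions, not the repo's own tooling: link_output_dir is a hypothetical helper, and the storage location passed to it is whatever large disk you choose.

```python
import os
from pathlib import Path

# Subfolder names taken from the output directory tree in this README.
OUTPUT_SUBDIRS = ("log", "model_dump", "result", "vis")


def link_output_dir(pose_root, storage_dir):
    """Create pose_root/output as a soft link to storage_dir, with subfolders."""
    storage = Path(storage_dir).resolve()
    for name in OUTPUT_SUBDIRS:
        (storage / name).mkdir(parents=True, exist_ok=True)

    link = Path(pose_root) / "output"
    if link.is_symlink() or link.exists():
        # Leave an existing output folder or link untouched.
        return link
    os.symlink(storage, link, target_is_directory=True)
    return link
```

For example, link_output_dir(".", "/path/to/large/disk/posenet_output") run from ${POSE_ROOT} would create the four subfolders on the large disk and point output at them; the path shown is a placeholder.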

3D visualization

  • Run $DB_NAME_img_name.py to export the image file names in .txt format.
  • Place your test result files (preds_2d_kpt_$DB_NAME.mat, preds_3d_kpt_$DB_NAME.mat) in the single or multi folder.
  • Run draw_3Dpose_$DB_NAME.m.



  • In main/config.py, you can change the model settings, including the dataset to use, the network backbone, and the input size.


In the main folder, run

python train.py --gpu 0-1

to train the network on GPUs 0 and 1.

To continue a previous experiment, run

python train.py --gpu 0-1 --continue

--gpu 0,1 can be used instead of --gpu 0-1.
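The equivalence of the two --gpu forms can be illustrated with a short sketch that expands a range such as 0-1 into the comma-separated list 0,1. The helper name normalize_gpu_ids is hypothetical, and this is not necessarily how train.py parses the argument.

```python
def normalize_gpu_ids(gpu_arg):
    """Convert '0-1' style ranges into the comma-separated form '0,1'."""
    if "-" in gpu_arg:
        # A range "first-last" is inclusive on both ends.
        first, last = (int(part) for part in gpu_arg.split("-"))
        return ",".join(str(i) for i in range(first, last + 1))
    # Already comma-separated (or a single id): return unchanged.
    return gpu_arg


print(normalize_gpu_ids("0-1"))  # 0,1
print(normalize_gpu_ids("0,1"))  # 0,1
print(normalize_gpu_ids("0-3"))  # 0,1,2,3
```

The comma-separated form is also the format that the CUDA_VISIBLE_DEVICES environment variable accepts, which is why normalizing to it is convenient.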


Place the trained model in output/model_dump/.

In the main folder, run

python test.py --gpu 0-1 --test_epoch 20

to test the network on GPUs 0 and 1 with the model saved at the 20th epoch. --gpu 0,1 can be used instead of --gpu 0-1.


Here I report the performance of the PoseNet. I also provide pre-trained models of the PoseNet. Bounding boxes and root locations are obtained from DetectNet and RootNet.

Human3.6M dataset using protocol 1

For evaluation, you can run test.py or use the evaluation code in Human36M.

Human3.6M dataset using protocol 2

For evaluation, you can run test.py or use the evaluation code in Human36M.

MuPoTS-3D dataset

For evaluation, run test.py. After that, move data/MuPoTS/mpii_mupots_multiperson_eval.m to data/MuPoTS/data. Also move the test result files (preds_2d_kpt_mupots.mat and preds_3d_kpt_mupots.mat) to data/MuPoTS/data. Then run mpii_mupots_multiperson_eval.m with your evaluation mode arguments.
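The file moves described above can be scripted. The sketch below is hypothetical: stage_mupots_eval_files is not part of the repo, and it assumes the two .mat result files were written to a result folder such as output/result, which you pass in explicitly.

```python
import shutil
from pathlib import Path


def stage_mupots_eval_files(pose_root, result_dir):
    """Move the evaluation script and test results into data/MuPoTS/data."""
    pose_root = Path(pose_root)
    eval_dir = pose_root / "data" / "MuPoTS" / "data"
    sources = [
        pose_root / "data" / "MuPoTS" / "mpii_mupots_multiperson_eval.m",
        Path(result_dir) / "preds_2d_kpt_mupots.mat",
        Path(result_dir) / "preds_3d_kpt_mupots.mat",
    ]
    # Check everything up front so that no file is moved on a partial setup.
    missing = [str(src) for src in sources if not src.exists()]
    if not eval_dir.is_dir():
        missing.append(str(eval_dir))
    if missing:
        raise FileNotFoundError("missing: " + ", ".join(missing))
    for src in sources:
        shutil.move(str(src), str(eval_dir / src.name))
    return eval_dir
```

Called as stage_mupots_eval_files(".", "output/result") from ${POSE_ROOT}, this leaves the three files next to the MuPoTS data, ready for the MATLAB evaluation script to be run there.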

  • Bounding box [MuPoTS-3D]
  • PoseNet model trained on MuCo-3DHP + MSCOCO [model]


author = {Moon, Gyeongsik and Chang, Juyong and Lee, Kyoung Mu},
title = {Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
year = {2019}