Realtime Multi-Person Pose Estimation
By Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh.
Introduction
Code repo for the method that won the 2016 MSCOCO Keypoints Challenge and the 2016 ECCV Best Demo Award, and that was presented as a 2017 CVPR oral paper.
Watch our video results on YouTube or on our website.
We present a bottom-up approach to multi-person pose estimation that does not use any person detector. For more details, refer to our CVPR'17 paper or our presentation slides from the ILSVRC and COCO workshop 2016.
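To give a feel for what "bottom-up" means here, the following is a minimal sketch of the association idea from the cited Part Affinity Fields paper: two candidate keypoints are scored by how well a 2D vector field agrees with the segment joining them. It is not the code in this repo. The function name `paf_limb_score`, the list-of-lists field layout, and the default of 10 samples are all assumptions made for illustration.

```python
import math


def paf_limb_score(paf_x, paf_y, joint_a, joint_b, num_samples=10):
    """Average alignment between a vector field and the segment joint_a -> joint_b.

    paf_x, paf_y: 2D lists indexed as [y][x] holding the field's x and y components
                  (hypothetical layout, chosen for this sketch).
    joint_a, joint_b: (x, y) pixel coordinates of two candidate keypoints.
    """
    ax, ay = joint_a
    bx, by = joint_b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0  # coincident points define no direction

    # Unit vector pointing from joint_a to joint_b.
    ux, uy = dx / length, dy / length

    total = 0.0
    for i in range(num_samples):
        # Evenly spaced sample positions along the segment, endpoints included.
        t = i / (num_samples - 1) if num_samples > 1 else 0.0
        x = int(round(ax + t * dx))
        y = int(round(ay + t * dy))
        # Dot product of the field at this pixel with the limb direction.
        total += paf_x[y][x] * ux + paf_y[y][x] * uy
    return total / num_samples
```

A score near 1 means the field points along the candidate limb, a score near -1 means it points the opposite way, and a score near 0 means the field is perpendicular or absent. Candidate pairs with high scores would then be linked into full poses without ever running a person detector.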
This project is licensed under the terms of the license.
Testing
C++ (realtime version, for demo purpose)
- Use our modified caffe: caffe_rtpose. Follow the instructions in that repo.
- In May 2017, we released an updated library, openPose.
- Three input options: images, video, webcam
Matlab (slower, for COCO evaluation)
- Compatible with general Caffe. Compile matcaffe.
- Run
cd testing; bash get_model.sh
to retrieve our latest MSCOCO model from our web server.
- Change the caffepath in
config.m
and run
demo.m
for an example usage.
Python
cd testing/python
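ipython notebook
- Open
demo.ipynb
and execute the code. On recent installations, use jupyter notebook in place of the deprecated ipython notebook command.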
Training
Network Architecture
Training Steps
- Run
cd training; bash getData.sh
to obtain the COCO images in dataset/COCO/images/, the keypoint annotations in dataset/COCO/annotations/, and the COCO official toolbox in dataset/COCO/coco/.
- Run
getANNO.m
in MATLAB to convert the annotation format from json to mat in dataset/COCO/mat/.
- Run
genCOCOMask.m
in MATLAB to obtain the mask images for unlabeled persons. You can use 'parfor' in MATLAB to speed up the code.
- Run
genJSON('COCO')
to generate a json file in the dataset/COCO/json/ folder. The json files contain the raw information needed for training.
- Run
python genLMDB.py
to generate your LMDB. (You can also download our LMDB for the COCO dataset (189GB file) by running
bash get_lmdb.sh
)
- Download our modified caffe: caffe_train. Compile pycaffe. It will be merged with caffe_rtpose (for testing) soon.
- Run
python setLayers.py --exp 1
to generate the prototxt and shell file for training.
- Download the VGG-19 model; we use it to initialize the first 10 layers for training.
- Run
bash train_pose.sh 0,1
(generated by setLayers.py) to start the training with two GPUs.
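Because the data-preparation steps above each write into a known folder, it can help to check which folders are already populated before starting a long LMDB build. The following is a minimal sketch of such a check, not a script shipped with this repo. The function name `missing_outputs` and the step labels are hypothetical; the folder paths are the ones listed in the steps, and the root directory is assumed to be whichever directory contains dataset/.

```python
from pathlib import Path

# Output folders named in the training steps, in the order they are produced.
# The labels are informal descriptions written for this sketch.
EXPECTED_OUTPUTS = [
    ("getData.sh: COCO images", "dataset/COCO/images"),
    ("getData.sh: keypoint annotations", "dataset/COCO/annotations"),
    ("getData.sh: COCO toolbox", "dataset/COCO/coco"),
    ("getANNO.m: mat annotations", "dataset/COCO/mat"),
    ("genJSON('COCO'): json files", "dataset/COCO/json"),
]


def missing_outputs(root):
    """Return the labels of steps whose output folder is absent or empty."""
    root = Path(root)
    missing = []
    for label, rel_path in EXPECTED_OUTPUTS:
        directory = root / rel_path
        # A folder counts as ready only if it exists and holds at least one entry.
        if not directory.is_dir() or not any(directory.iterdir()):
            missing.append(label)
    return missing


if __name__ == "__main__":
    # Assumption: run from the directory that contains dataset/.
    for label in missing_outputs("."):
        print("not ready:", label)
```

The check only looks for non-empty folders, so it cannot tell a partial download from a complete one. It is meant as a quick sanity pass before running genLMDB.py, not as validation of the data itself.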
Other Implementations
- MXNet version of the code
- TensorFlow (Keras) version of the code
- PyTorch version of the code
- Our new C++ library openPose
Citation
Please cite the paper in your publications if it helps your research:
@inproceedings{cao2017realtime,
author = {Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
booktitle = {CVPR},
title = {Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
year = {2017}
}
@inproceedings{wei2016cpm,
author = {Shih-En Wei and Varun Ramakrishna and Takeo Kanade and Yaser Sheikh},
booktitle = {CVPR},
title = {Convolutional pose machines},
year = {2016}
}