This is a PyTorch version of Realtime Multi-Person Pose Estimation. The original code is at https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation and https://github.com/last-one/pytorch_realtime_multi-person_pose_estimation. Thanks to ZheC and last-one for providing the code.
Code for reproducing the CVPR 2017 Oral paper using PyTorch.
The result is generated by a model trained for 30 epochs.
1. preprocessing: scripts for preprocessing the data.
2. training: scripts for training the networks.
3. testing: the test script and an example.
4. caffe2pytorch: the script for converting the Caffe model to PyTorch.
5. caffe_model: the Caffe model.
PyTorch: 0.2.0_3
Caffe: needed only if you want to convert the caffemodel yourself.
Mytransforms.py: the data transforms.
They transform the image, mask, keypoints, and center points together.
CocoFolder.py: reads the data for the network.
It generates the PAF vectors and heatmaps when an image is loaded.
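As a rough illustration of heatmap generation, the sketch below places a 2D Gaussian peak at one keypoint. The function name `gaussian_heatmap` and the `sigma` default are assumptions made for this example; the actual code in CocoFolder.py may use different parameters and also accounts for the network stride.

```python
import math


def gaussian_heatmap(height, width, center_x, center_y, sigma=7.0):
    """Return a height x width grid (list of rows) that peaks at the keypoint.

    Hypothetical helper: sigma is an assumed spread, not a value from the repo.
    """
    heatmap = []
    for y in range(height):
        row = []
        for x in range(width):
            # Squared distance from this pixel to the keypoint.
            dist_sq = (x - center_x) ** 2 + (y - center_y) ** 2
            row.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
        heatmap.append(row)
    return heatmap


if __name__ == "__main__":
    hm = gaussian_heatmap(5, 5, center_x=2, center_y=2, sigma=1.0)
    print(hm[2][2])  # value at the keypoint itself
```

When several people appear in one image, the per-person maps for the same part are usually merged by taking the element-wise maximum.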
The PAF vectors follow this format:
POSE_COCO_PAIRS = {
{3, 4},
{4, 5},
{6, 7},
{7, 8},
{9, 10},
{10, 11},
{12, 13},
{13, 14},
{1, 2},
{2, 9},
{2, 12},
{2, 3},
{2, 6},
{3, 17},
{6, 18},
{1, 16},
{1, 15},
{16, 17},
{15, 18},
}
where each index is the key of the corresponding part in POSE_COCO_BODY_PARTS.
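To make the role of one pair concrete, here is a minimal sketch of a part affinity field for a single limb: every pixel close to the segment between the two keypoints stores the unit vector pointing from the first part to the second. The function name `limb_paf`, the `limb_width` threshold, and the example coordinates are assumptions for illustration, not the implementation in CocoFolder.py.

```python
import math


def limb_paf(height, width, start, end, limb_width=1.0):
    """Return (paf_x, paf_y) grids for one limb from start to end.

    start and end are (x, y) keypoint coordinates. limb_width is an
    assumed maximum perpendicular distance from the limb, in pixels.
    """
    paf_x = [[0.0] * width for _ in range(height)]
    paf_y = [[0.0] * width for _ in range(height)]
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return paf_x, paf_y  # degenerate limb: leave the field empty
    ux, uy = dx / length, dy / length  # unit vector along the limb
    for y in range(height):
        for x in range(width):
            px, py = x - start[0], y - start[1]
            along = px * ux + py * uy        # projection onto the limb
            across = abs(px * uy - py * ux)  # distance from the limb axis
            if 0 <= along <= length and across <= limb_width:
                paf_x[y][x] = ux
                paf_y[y][x] = uy
    return paf_x, paf_y


if __name__ == "__main__":
    # Hypothetical keypoints, keyed by the same indices used in the pairs.
    keypoints = {3: (1, 2), 4: (4, 2)}
    a, b = 3, 4  # the first pair in the list above
    paf_x, paf_y = limb_paf(5, 6, keypoints[a], keypoints[b])
    print(paf_x[2][2], paf_y[2][2])
```

Each pair in the list produces two channels (x and y), so the 19 pairs yield 38 PAF channels in total.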
utils.py: common functions, such as adjusting the learning rate and reading the configuration.
visualize_input.ipynb: a notebook for verifying the validity of the preprocessing and of the generated heatmaps and vectors. It shows some examples.
pose_estimation.py: the network structure.
The first 10 layers are the same as in VGG-19, so if pretrained is set to True, they are initialized from VGG-19. The network has 6 stages. The first stage has 5 layers (three 3x3 conv + two 1x1 conv) and each remaining stage has 7 layers (five 3x3 conv + two 1x1 conv).
TODO: make the number of stages adjustable.
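The layer counts above can be written down as a small table-building sketch. The function name `stage_kernel_sizes` is hypothetical, and the kernel sizes are taken from the description in this README rather than checked against pose_estimation.py.

```python
NUM_STAGES = 6  # number of stages stated in this README


def stage_kernel_sizes(stage):
    """Return the conv kernel sizes of one branch in the given stage (1-based).

    Follows the README text: stage 1 has three 3x3 convs, later stages
    have five, and every stage ends with two 1x1 convs.
    """
    if not 1 <= stage <= NUM_STAGES:
        raise ValueError("stage must be between 1 and %d" % NUM_STAGES)
    num_3x3 = 3 if stage == 1 else 5
    return [3] * num_3x3 + [1] * 2


if __name__ == "__main__":
    for stage in range(1, NUM_STAGES + 1):
        print(stage, stage_kernel_sizes(stage))
```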
- Download the dataset, the annotations, and the official COCO toolbox
- Go to the "preprocessing" folder
cd preprocessing
- Generate the JSON file and masks
python generate_json_mask.py
- Go to the "training" folder
cd ../training
- Set the training parameters in "config.yml".
- Set the training data dir, mask dir, and JSON filepath, and the validation data dir, mask dir, and JSON filepath.
- Train the model
sh train.sh
- To train on another dataset, change Mytransforms.py and CocoFolder.py to match your dataset. Also make sure that '0' corresponds to the background.
- Both the converted model and my code use BGR channel order for training and test images.
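If your images are loaded in RGB order, the channels need to be reversed before they are fed to the model. The sketch below shows the reversal on a plain nested list; the function name `rgb_to_bgr` is hypothetical and is not part of this repository.

```python
def rgb_to_bgr(image):
    """Reverse the channel order of every pixel.

    image is a rows x cols x 3 nested list. The same function also
    converts BGR back to RGB, since the operation is its own inverse.
    """
    return [[list(reversed(pixel)) for pixel in row] for row in image]


if __name__ == "__main__":
    rgb = [[[255, 0, 10], [1, 2, 3]]]
    print(rgb_to_bgr(rgb))
```

On a NumPy array of shape (H, W, 3), the slice `image[:, :, ::-1]` performs the same reversal. Images read with OpenCV's `cv2.imread` are already in BGR order and need no conversion.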
Please cite the paper in your publications if it helps your research:
@InProceedings{cao2017realtime,
title = {Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
author = {Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2017}
}