A TensorFlow 2 implementation of a simple baseline for 3d human pose estimation. Check the original implementation written by Julieta Martinez et al. Data processing and model architecture are mostly the same as in the original version; thanks to the authors.
- Python ≥ 3.9, along with the packages:
- cdflib
- tensorflow 2.7.0 or later
- Human3.6M dataset. Request access permissions on its website.
Go to the Human3.6M website, log in, download the D3 Positions files for subjects [1, 5, 6, 7, 8, 9, 11], and put them under the folder data/h36m. Your directory structure should look like this:
src/
README.md
LICENCE
...
data/
└── h36m/
├── Poses_D3_Positions_S1.tgz
├── Poses_D3_Positions_S11.tgz
├── Poses_D3_Positions_S5.tgz
├── Poses_D3_Positions_S6.tgz
├── Poses_D3_Positions_S7.tgz
├── Poses_D3_Positions_S8.tgz
└── Poses_D3_Positions_S9.tgz
Now, move to the data folder and uncompress all the archives:
cd data/h36m/
for file in *.tgz; do tar -xvzf "$file"; done
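If a shell loop is not convenient (for example on Windows), the same extraction step can be sketched in Python with the standard tarfile module. The function name extract_archives is hypothetical and is not part of this repository; the sketch assumes the archives sit directly inside the folder you pass in.

```python
import tarfile
from pathlib import Path


def extract_archives(data_dir):
    """Extract every .tgz archive found directly inside data_dir.

    Hypothetical helper, not part of this repository.
    Returns the sorted list of archive names that were extracted.
    """
    data_dir = Path(data_dir)
    extracted = []
    for archive in sorted(data_dir.glob("*.tgz")):
        # "r:gz" opens a gzip-compressed tar file, like `tar -xzf`.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(path=data_dir)
        extracted.append(archive.name)
    return extracted


# Example call, assuming the layout described above:
# extract_archives("data/h36m")
```

Each archive is unpacked next to itself, so the S1/, S5/, ... folders end up under data/h36m/ just as with the tar loop.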
Finally, download the code-v1.2.zip file, unzip it, and copy the metadata.xml file under data/h36m/
Now, your data directory should look like this:
data/
└── h36m/
├── metadata.xml
├── S1/
├── S11/
├── S5/
├── S6/
├── S7/
├── S8/
└── S9/
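To confirm that the data folder matches the layout above before training, a small check along these lines may help. This is only a sketch: the helper missing_entries is hypothetical, and it only checks that metadata.xml and the seven subject folders exist, not their contents.

```python
from pathlib import Path

# Subject IDs listed in the download step above.
SUBJECTS = [1, 5, 6, 7, 8, 9, 11]


def missing_entries(data_dir):
    """Return the expected files/folders that are absent from data_dir.

    Hypothetical helper, not part of this repository.
    """
    data_dir = Path(data_dir)
    missing = []
    if not (data_dir / "metadata.xml").is_file():
        missing.append("metadata.xml")
    for subject in SUBJECTS:
        name = f"S{subject}"
        if not (data_dir / name).is_dir():
            missing.append(name)
    return missing


# Example call: an empty list means the layout looks complete.
# missing_entries("data/h36m")
```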
One small fix is needed so that the data files have consistent names. Run the following from the data/ folder (one level above h36m/):
mv h36m/S1/MyPoseFeatures/D3_Positions/TakingPhoto.cdf \
h36m/S1/MyPoseFeatures/D3_Positions/Photo.cdf
mv h36m/S1/MyPoseFeatures/D3_Positions/TakingPhoto\ 1.cdf \
h36m/S1/MyPoseFeatures/D3_Positions/Photo\ 1.cdf
mv h36m/S1/MyPoseFeatures/D3_Positions/WalkingDog.cdf \
h36m/S1/MyPoseFeatures/D3_Positions/WalkDog.cdf
mv h36m/S1/MyPoseFeatures/D3_Positions/WalkingDog\ 1.cdf \
h36m/S1/MyPoseFeatures/D3_Positions/WalkDog\ 1.cdf
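The same renames can also be sketched in Python, which avoids escaping the spaces in the file names. The function fix_s1_names is hypothetical and not part of this repository; it assumes the S1 folder layout used by the mv commands above and skips files that are already renamed.

```python
from pathlib import Path

# Old file name -> new file name, mirroring the four `mv` commands above.
RENAMES = {
    "TakingPhoto.cdf": "Photo.cdf",
    "TakingPhoto 1.cdf": "Photo 1.cdf",
    "WalkingDog.cdf": "WalkDog.cdf",
    "WalkingDog 1.cdf": "WalkDog 1.cdf",
}


def fix_s1_names(h36m_dir):
    """Rename the S1 files; returns the new names of files actually renamed.

    Hypothetical helper, not part of this repository.
    """
    positions = Path(h36m_dir) / "S1" / "MyPoseFeatures" / "D3_Positions"
    renamed = []
    for old, new in RENAMES.items():
        source = positions / old
        if source.exists():  # skip files already renamed, so reruns are safe
            source.rename(positions / new)
            renamed.append(new)
    return renamed


# Example call, from the repository root:
# fix_s1_names("data/h36m")
```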
And you are done!
To create a model similar to the one in the original work (GT detections, MA), run:
python3 src/run.py --dropout 0.5 --residual --clip-linear-weights --batch-norm --eval-by-action
All available options are:
python3 run.py [-h] [--linear-size LINEAR_SIZE] [--num-bi-layers NUM_BI_LAYERS] [--dropout DROPOUT] [--residual] [--clip-linear-weights] [--batch-norm] [--dont-load] [--learning-rate LEARNING_RATE] [--epochs EPOCHS] [--eval-by-action] [--tflite | --tflite-int8] [--cameras-path CAMERAS_PATH] [--data-path DATA_PATH] [--train-path TRAIN_PATH]
Train WorldPose Tensorflow model
optional arguments:
-h, --help show this help message and exit
model creation arguments:
--linear-size LINEAR_SIZE
Size of model layers. Defaults to 1024
--num-bi-layers NUM_BI_LAYERS
Number of "bi-linear" blocks in the model. Defaults to 2
--dropout DROPOUT Dropout keep probability. 1 means no dropout. Defaults to 1.0
--residual Whether to add a residual connection every 2 bi-linear block
--clip-linear-weights
Clip weights of Dense layers by norm 1
--batch-norm Use BatchNormalization
--dont-load Do not load model from checkpoint, if any
training arguments:
--learning-rate LEARNING_RATE
Learning rate. Defaults to 0.001
--epochs EPOCHS How many epochs we should train for. Defaults to 200
testing arguments:
--eval-by-action Evaluate model by action instead of all test data at once
saving arguments:
--tflite Save trained model in TFLite format.
--tflite-int8 Save trained model in TFLite format but quantized.
paths arguments:
--cameras-path CAMERAS_PATH
File with h36m metadata, including cameras. Defaults to ./data/h36m/metadata.xml
--data-path DATA_PATH
Data directory. Defaults to ./data/h36m
--train-path TRAIN_PATH
Training directory. Defaults to ./training
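For readers who want to reuse the same options in their own scripts, the help text above can be approximated with argparse as follows. This is a sketch reconstructed from the help output only, not the actual contents of run.py: the function name build_parser and the int/float argument types are assumptions, and the per-option help strings are omitted for brevity.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options in the help text above (a sketch)."""
    parser = argparse.ArgumentParser(
        prog="run.py", description="Train WorldPose Tensorflow model"
    )
    # Argument types below are assumed from the documented defaults.
    model = parser.add_argument_group("model creation arguments")
    model.add_argument("--linear-size", type=int, default=1024)
    model.add_argument("--num-bi-layers", type=int, default=2)
    model.add_argument("--dropout", type=float, default=1.0)
    model.add_argument("--residual", action="store_true")
    model.add_argument("--clip-linear-weights", action="store_true")
    model.add_argument("--batch-norm", action="store_true")
    model.add_argument("--dont-load", action="store_true")

    training = parser.add_argument_group("training arguments")
    training.add_argument("--learning-rate", type=float, default=0.001)
    training.add_argument("--epochs", type=int, default=200)

    testing = parser.add_argument_group("testing arguments")
    testing.add_argument("--eval-by-action", action="store_true")

    # --tflite and --tflite-int8 are shown as alternatives in the usage line.
    saving = parser.add_argument_group("saving arguments")
    exclusive = saving.add_mutually_exclusive_group()
    exclusive.add_argument("--tflite", action="store_true")
    exclusive.add_argument("--tflite-int8", action="store_true")

    paths = parser.add_argument_group("paths arguments")
    paths.add_argument("--cameras-path", default="./data/h36m/metadata.xml")
    paths.add_argument("--data-path", default="./data/h36m")
    paths.add_argument("--train-path", default="./training")
    return parser
```

Parsing the flags from the training command shown earlier, for example build_parser().parse_args(["--dropout", "0.5", "--residual"]), yields a namespace with dropout set to 0.5, residual set to True, and every other option at its documented default.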
@inproceedings{martinez_2017_3dbaseline,
title={A simple yet effective baseline for 3d human pose estimation},
author={Martinez, Julieta and Hossain, Rayat and Romero, Javier and Little, James J.},
booktitle={ICCV},
year={2017}
}
@article{h36m_pami,
author = {Ionescu, Catalin and Papava, Dragos and Olaru, Vlad and Sminchisescu, Cristian},
title = {Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
publisher = {IEEE Computer Society},
volume = {36},
number = {7},
pages = {1325-1339},
month = {jul},
year = {2014}
}
@inproceedings{IonescuSminchisescu11,
author = {Ionescu, Catalin and Li, Fuxin and Sminchisescu, Cristian},
title = {Latent Structured Models for Human Pose Estimation},
booktitle = {International Conference on Computer Vision},
year = {2011}
}
MIT