DePOSit

[ICRA 2023] Official implementation of "A generic diffusion-based approach for 3D human pose prediction in the wild".


A generic diffusion-based approach for
3D human pose prediction in the wild

Saeed Saadatnejad, Ali Rasekh, Mohammadreza Mofayezi, Yasamin Medghalchi, Sara Rajabzadeh, Taylor Mordan, Alexandre Alahi

International Conference on Robotics and Automation (ICRA), 2023

[arXiv] [video] [poster] [livedemo]

Abstract

Predicting 3D human poses in real-world scenarios, also known as human pose forecasting, is inevitably subject to noisy inputs arising from inaccurate 3D pose estimations and occlusions. To address these challenges, we propose a diffusion-based approach that can make predictions from noisy observations. We frame the prediction task as a denoising problem, where both observation and prediction are considered as a single sequence containing missing elements (whether in the observation or prediction horizon). All missing elements are treated as noise and denoised with our conditional diffusion model. To better handle long-term forecasting horizons, we present a temporal cascaded diffusion model. We demonstrate the benefits of our approach on four publicly available datasets (Human3.6M, HumanEva-I, AMASS, and 3DPW), outperforming the state-of-the-art. Additionally, we show that our framework is generic enough to improve any 3D pose prediction model as a pre-processing step to repair their inputs and a post-processing step to refine their outputs.
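The "single sequence with missing elements" framing from the abstract can be illustrated with a small sketch. This is not the repository's code: the function name build_masked_sequence, the missing_obs parameter, and the frame layout are assumptions made for illustration, and no diffusion model is involved. It only shows how observed and future frames could be joined into one sequence with a mask, with every missing frame initialised as Gaussian noise for a denoiser to fill in.

```python
import random


def build_masked_sequence(observed, n_future, missing_obs=(), seed=0):
    """Join observation and prediction horizon into one sequence with a mask.

    observed: list of frames, each a list of floats (flattened joint values).
    n_future: number of future frames to predict.
    missing_obs: indices of observed frames treated as missing (e.g. occluded).
    Returns (sequence, mask); mask[t] is 1 for known frames and 0 for frames
    that start as noise and are left for a denoiser to fill in.
    """
    rng = random.Random(seed)
    dim = len(observed[0])
    missing = set(missing_obs)
    sequence, mask = [], []
    for t in range(len(observed) + n_future):
        known = t < len(observed) and t not in missing
        if known:
            sequence.append(list(observed[t]))
        else:
            # Missing observation or future frame: start from Gaussian noise.
            sequence.append([rng.gauss(0.0, 1.0) for _ in range(dim)])
        mask.append(1 if known else 0)
    return sequence, mask


# Four observed frames (frame 1 occluded) plus two frames to predict.
obs = [[float(t)] * 3 for t in range(4)]
seq, mask = build_masked_sequence(obs, n_future=2, missing_obs=[1])
print(mask)  # [1, 0, 1, 1, 0, 0]
```

In this toy layout, an occluded observation frame and a future frame are treated identically, which is the point of the framing: both are simply unknown entries in one sequence.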


Getting started

Requirements

The code requires Python 3.7 or later. The file requirements.txt contains the full list of required Python modules.

pip install -r requirements.txt

Data

Human3.6M in exponential-map format can be downloaded from here.

Directory structure:

H3.6m
|-- S1
|-- S5
|-- S6
|-- ...
|-- S11
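Before training, it can help to confirm that the data folder matches the layout above. The following is a minimal sketch, not part of the repository: list_subject_dirs is a hypothetical helper, and it assumes subject folders are named "S" followed by a number, as in the tree. The demo runs on a temporary folder containing only the subjects named explicitly above.

```python
import os
import tempfile


def list_subject_dirs(data_dir):
    """Return the subject folders (S1, S5, ...) in data_dir, sorted by number."""
    names = []
    for name in os.listdir(data_dir):
        is_dir = os.path.isdir(os.path.join(data_dir, name))
        if is_dir and name.startswith("S") and name[1:].isdigit():
            names.append(name)
    return sorted(names, key=lambda n: int(n[1:]))


# Demo on a throwaway folder; point data_dir at your H3.6m folder instead.
with tempfile.TemporaryDirectory() as root:
    for name in ["S1", "S5", "S6", "S11"]:
        os.makedirs(os.path.join(root, name))
    found = list_subject_dirs(root)
print(found)  # ['S1', 'S5', 'S6', 'S11']
```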

AMASS and 3DPW can be downloaded from their official websites.

Specify the data path with the --data_dir argument.

Training and Testing

Human3.6M

Train a short-term model and a long-term model with the following commands:

python main_tcd_h36m.py --mode train --epochs 50 --data all --joints 22 --input_n 50 --output_n 5 --data_dir data_dir --output_dir model_s
python main_tcd_h36m.py --mode train --epochs 50 --data all --joints 22 --input_n 55 --output_n 20 --data_dir data_dir --output_dir model_l

To evaluate the TCD (temporal cascaded diffusion) model, run the following command. Specify the short-term and long-term model checkpoint directories with the --model_s and --model_l arguments.

python main_tcd_h36m.py --mode test --data all --joints 22 --input_n 50 --output_n 25 --data_dir data_dir --model_s model_s --model_l model_l --output_dir model_l

The results will be saved in a CSV file in the output directory.
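The frame counts in the commands above are consistent with a two-stage cascade: the short-term model maps 50 frames to 5, the long-term model takes 55 frames (50 + 5) and produces 20, and together they cover the 25 output frames used at test time. The sketch below only illustrates that bookkeeping. It is an assumption about how the stages chain, not the repository's implementation; cascaded_predict and repeat_last_frame are hypothetical names, and the placeholder predictor stands in for a trained diffusion model.

```python
def repeat_last_frame(history, n_out):
    """Placeholder predictor: repeats the last known frame n_out times."""
    return [history[-1]] * n_out


def cascaded_predict(observed, short_model, long_model, short_n=5, long_n=20):
    """Chain two predictors over time.

    The short-term model predicts short_n frames from the observation. The
    long-term model then receives the observation plus the short-term output
    and predicts long_n further frames.
    Returns the full prediction and the length of the long-term input.
    """
    short_out = short_model(observed, short_n)
    long_in = observed + short_out
    long_out = long_model(long_in, long_n)
    return short_out + long_out, len(long_in)


observed = list(range(50))  # 50 observed frames, as in --input_n 50
prediction, long_input_len = cascaded_predict(
    observed, repeat_last_frame, repeat_last_frame
)
print(long_input_len, len(prediction))  # 55 25
```

Under this reading, the long-term model is trained with --input_n 55 because its input at test time includes the 5 frames produced by the short-term model.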

AMASS and 3DPW

You can train a model on the AMASS dataset using the following command:

python main_amass.py --mode train --epochs 50 --dataset AMASS --data all --joints 18 --input_n 50 --output_n 25 --data_dir data_dir --output_dir model_amass

Then you can evaluate it on both AMASS and 3DPW datasets:

python main_amass.py --mode test --dataset AMASS --data all --joints 18 --input_n 50 --output_n 25 --data_dir data_dir --output_dir model_amass
python main_amass.py --mode test --dataset 3DPW --data all --joints 18 --input_n 50 --output_n 25 --data_dir data_dir --output_dir model_amass

The results will be saved in CSV files in the output directory.
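To inspect the saved results, a generic loader such as the one below may be enough. It is a sketch under stated assumptions: load_csv_results is a hypothetical helper, it assumes each CSV has a header row, and the file name and column names in the demo are made up, since the README does not document the actual ones.

```python
import csv
import glob
import os
import tempfile


def load_csv_results(output_dir):
    """Read every .csv file in output_dir into a list of row dictionaries."""
    results = {}
    for path in sorted(glob.glob(os.path.join(output_dir, "*.csv"))):
        with open(path, newline="") as f:
            rows = [dict(row) for row in csv.DictReader(f)]
        results[os.path.basename(path)] = rows
    return results


# Demo with a made-up file; replace out_dir with your --output_dir folder.
with tempfile.TemporaryDirectory() as out_dir:
    with open(os.path.join(out_dir, "demo.csv"), "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["frame", "error"])  # hypothetical column names
        writer.writerow(["1", "0.5"])
    results = load_csv_results(out_dir)
print(results)  # {'demo.csv': [{'frame': '1', 'error': '0.5'}]}
```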

Work in Progress

This repository is being updated so stay tuned!

Acknowledgments

The overall code framework (data loading, training, testing, etc.) was adapted from HRI. The base of the diffusion model was borrowed from CSDI.

Citation

@INPROCEEDINGS{saadatnejad2023diffusion,
  author = {Saeed Saadatnejad and Ali Rasekh and Mohammadreza Mofayezi and Yasamin Medghalchi and Sara Rajabzadeh and Taylor Mordan and Alexandre Alahi},
  title = {A generic diffusion-based approach for 3D human pose prediction in the wild},
  booktitle = {International Conference on Robotics and Automation (ICRA)},
  year  = {2023}
}

License

AGPL-3.0 license