4DHumans: Reconstructing and Tracking Humans with Transformers

Primary language: Python. License: MIT.

Code repository for the paper "Humans in 4D: Reconstructing and Tracking Humans with Transformers" by Shubham Goel, Georgios Pavlakos, Jathushan Rajasegaran, Angjoo Kanazawa*, and Jitendra Malik*.

Installation and Setup

First, clone the repo. Then, we recommend creating a clean conda environment, installing all dependencies, and finally activating the environment, as follows:

git clone https://github.com/shubham-goel/4D-Humans.git
cd 4D-Humans
conda env create -f environment.yml
conda activate 4D-humans

If conda is too slow, you can use pip:

conda create --name 4D-humans python=3.10
conda activate 4D-humans
pip install torch
pip install -e ".[all]"

All checkpoints and data will automatically be downloaded to $HOME/.cache/4DHumans the first time you run the demo code.

Besides these files, you also need to download the SMPL model. You will need the neutral model for training and running the demo code. Please go to the corresponding website and register to get access to the downloads section. Download the model and place basicModel_neutral_lbs_10_207_0_v1.0.0.pkl in ./data/.
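Before running the demo, it can help to confirm that the SMPL model is where the code expects it. The following is a minimal sketch, not part of the repository: the function name check_setup and its arguments are hypothetical, and the only facts taken from this README are the SMPL filename, the ./data/ location, and the $HOME/.cache/4DHumans cache directory.

```python
from pathlib import Path

# Filename and locations are taken from the setup instructions above.
SMPL_FILENAME = "basicModel_neutral_lbs_10_207_0_v1.0.0.pkl"


def check_setup(repo_root=".", cache_dir=None):
    """Report whether the SMPL model and the download cache are present.

    Hypothetical helper: it only checks for files on disk and does not
    import or call anything from the 4D-Humans code base.
    """
    repo_root = Path(repo_root)
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "4DHumans"
    else:
        cache_dir = Path(cache_dir)
    return {
        # Must be downloaded manually after registering on the SMPL website.
        "smpl_model": (repo_root / "data" / SMPL_FILENAME).is_file(),
        # Created automatically the first time the demo code runs.
        "cache_dir": cache_dir.is_dir(),
    }
```

Calling check_setup() from the repository root returns a dictionary of booleans. A missing cache directory is expected before the first demo run, while a missing SMPL model has to be fixed by hand.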

Run demo on images

The following command runs ViTDet and HMR2.0 on all images in the specified --img_folder and saves renderings of the reconstructions in --out_folder. The --batch_size option batches images together for faster processing. The --side_view flag additionally renders a side view of the reconstructed mesh, --full_frame renders all people together in the front view, and --save_mesh saves the meshes as .obj files.

python demo.py \
    --img_folder example_data/images \
    --out_folder demo_out \
    --batch_size=48 --side_view --save_mesh --full_frame
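If you want to run the demo over several image folders from a script, one option is to assemble the same command in Python. This is a rough sketch under stated assumptions: build_demo_command and run_demo are hypothetical helpers, not part of the repository, and they use only the flags shown in the command above.

```python
import subprocess
import sys


def build_demo_command(img_folder, out_folder, batch_size=48,
                       side_view=False, save_mesh=False, full_frame=False):
    """Build the argument list for demo.py (hypothetical helper)."""
    cmd = [
        sys.executable, "demo.py",
        "--img_folder", str(img_folder),
        "--out_folder", str(out_folder),
        f"--batch_size={batch_size}",
    ]
    # Optional flags are appended only when requested.
    if side_view:
        cmd.append("--side_view")
    if save_mesh:
        cmd.append("--save_mesh")
    if full_frame:
        cmd.append("--full_frame")
    return cmd


def run_demo(*args, **kwargs):
    """Run demo.py from the repository root; raises if it exits non-zero."""
    subprocess.run(build_demo_command(*args, **kwargs), check=True)
```

For example, run_demo("example_data/images", "demo_out", side_view=True, save_mesh=True, full_frame=True) is equivalent to the command above when executed from the repository root.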

Run tracking demo on videos

Our tracker builds on PHALP. Please install it first:

pip install git+https://github.com/brjathu/PHALP.git

Now, run track.py to reconstruct and track humans in any video. The input video source may be a video file, a folder of frames, or a YouTube link:

# Run on video file
python track.py video.source="example_data/videos/gymnasts.mp4"

# Run on extracted frames
python track.py video.source="/path/to/frames_folder/"

# Run on a youtube link (depends on pytube working properly)
python track.py video.source=\'"https://www.youtube.com/watch?v=xEH_5T9jMVU"\'

The output directory (./outputs by default) will contain a video rendering of the tracklets and a .pkl file containing the tracklets with 3D pose and shape. Please see the PHALP repository for details.
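To get a first look at the .pkl output without relying on its exact layout, something like the following can be used. This is a sketch only: it assumes the file can be read with joblib (falling back to the standard pickle module if joblib is not installed), and it makes no assumption about field names. The helpers load_tracklets and summarize are hypothetical; the PHALP repository remains the reference for the actual schema.

```python
import pickle


def load_tracklets(pkl_path):
    """Load a tracking results file.

    Assumption: the file is readable by joblib; if joblib is not installed,
    fall back to the standard pickle module.
    """
    try:
        import joblib
    except ImportError:
        with open(pkl_path, "rb") as f:
            return pickle.load(f)
    return joblib.load(pkl_path)


def summarize(results):
    """Describe the loaded object without assuming a particular schema."""
    summary = {"type": type(results).__name__, "num_entries": len(results)}
    if isinstance(results, dict) and results:
        first_key = next(iter(results))
        first_val = results[first_key]
        summary["first_key"] = first_key
        # List the field names of the first entry if it is itself a dict.
        if isinstance(first_val, dict):
            summary["first_fields"] = sorted(map(str, first_val))
        else:
            summary["first_fields"] = None
    return summary
```

Running summarize(load_tracklets(path)) on a results file shows how many top-level entries it holds and which fields the first entry exposes, which is usually enough to decide how to index into it.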

Training

Download the training data to ./hmr2_training_data/, then start training using the following command:

bash fetch_training_data.sh
python train.py exp_name=hmr2 data=mix_all experiment=hmr_vit_transformer trainer=gpu launcher=local

Checkpoints and logs will be saved to ./logs/. We trained on 8 A100 GPUs for 7 days using PyTorch 1.13.1 and PyTorch-Lightning 1.8.1 with CUDA 11.6 on a Linux system. You may adjust the batch size and the number of GPUs to suit your hardware.

Evaluation

Download the evaluation metadata to ./hmr2_evaluation_data/. Additionally, download the Human3.6M, 3DPW, LSP-Extended, COCO, and PoseTrack dataset images and update the corresponding paths in hmr2/configs/datasets_eval.yaml.

Run evaluation on multiple datasets as follows. Results are stored in results/eval_regression.csv.

python eval.py --dataset 'H36M-VAL-P2,3DPW-TEST,LSP-EXTENDED,POSETRACK-VAL,COCO-VAL' 
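Once the evaluation finishes, the CSV can be inspected with the standard library. The sketch below assumes only that the first row of results/eval_regression.csv is a header; the column names are not specified in this README, so the code does not hard-code any. The function read_eval_results is a hypothetical helper.

```python
import csv


def read_eval_results(csv_path="results/eval_regression.csv"):
    """Return the evaluation results as a list of dicts, one per CSV row.

    Assumption: the first row of the file is a header row.
    """
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


def print_eval_results(rows):
    """Print every row as 'column=value' pairs, whatever the columns are."""
    for row in rows:
        print(", ".join(f"{key}={value}" for key, value in row.items()))
```

Calling print_eval_results(read_eval_results()) from the repository root prints one line per row of the results file.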

By default, our code uses the released checkpoint (referred to as HMR2.0b in the paper). To use the HMR2.0a checkpoint, download and untar it from here.

Preprocess code

To preprocess LSP-Extended and PoseTrack into metadata zip files for evaluation, see hmr2/datasets/preprocess.

Training data preprocessing coming soon.

Acknowledgements

Parts of the code are taken or adapted from the following repos:

Additionally, we thank StabilityAI for a generous compute grant that enabled this work.

Citing

If you find this code useful for your research, please consider citing the following paper:

@inproceedings{goel2023humans,
    title={Humans in 4{D}: Reconstructing and Tracking Humans with Transformers},
    author={Goel, Shubham and Pavlakos, Georgios and Rajasegaran, Jathushan and Kanazawa, Angjoo and Malik, Jitendra},
    booktitle={ICCV},
    year={2023}
}