4DHumans: Reconstructing and Tracking Humans with Transformers

Code repository for the paper: Humans in 4D: Reconstructing and Tracking Humans with Transformers

Shubham Goel, Georgios Pavlakos, Jathushan Rajasegaran, Angjoo Kanazawa*, Jitendra Malik*

Installation and Setup

First, clone the repo. Then, we recommend creating a clean conda environment, installing all dependencies, and finally activating the environment, as follows:

git clone https://github.com/shubham-goel/4D-Humans.git
cd 4D-Humans
conda env create -f environment.yml
conda activate 4D-humans

If conda is too slow, you can use pip:

conda create --name 4D-humans python=3.10
conda activate 4D-humans
pip install torch
pip install -e .[all]

All checkpoints and data will automatically be downloaded to $HOME/.cache/4DHumans the first time you run the demo code.
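
To check whether the automatic download has already happened, you can inspect the cache directory. The snippet below is a minimal sketch that only uses the path stated above; the helper name `cache_summary` is hypothetical and not part of this repository.

```python
from pathlib import Path


def cache_summary(cache_dir: Path) -> dict:
    """Count files and total bytes under cache_dir (zeros if it does not exist)."""
    if not cache_dir.is_dir():
        return {"files": 0, "bytes": 0}
    files = [p for p in cache_dir.rglob("*") if p.is_file()]
    return {"files": len(files), "bytes": sum(p.stat().st_size for p in files)}


if __name__ == "__main__":
    # Default location named in this README: $HOME/.cache/4DHumans
    default_cache = Path.home() / ".cache" / "4DHumans"
    print(default_cache, cache_summary(default_cache))
```

An empty summary before the first demo run is expected, since nothing is downloaded until then.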

Run demo on images

The following command runs ViTDet and HMR2.0 on all images in the folder given by --img_folder and saves renderings of the reconstructions in --out_folder. --batch_size batches the images together for faster processing. The --side_view flag additionally renders a side view of the reconstructed mesh, --full_frame renders all people together in the front view, and --save_mesh saves the meshes as .obj files.

python demo.py \
    --img_folder example_data/images \
    --out_folder demo_out \
    --batch_size=48 --side_view --save_mesh --full_frame
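
If you want to launch the demo from a Python script (for example, to loop over several image folders), the command above can be assembled as an argument list. This is a sketch only: `build_demo_command` is a hypothetical helper, and it uses just the flags described in this section.

```python
import subprocess
import sys


def build_demo_command(img_folder, out_folder, batch_size=48,
                       side_view=False, save_mesh=False, full_frame=False):
    """Assemble the demo.py command line from the flags documented above."""
    cmd = [sys.executable, "demo.py",
           "--img_folder", img_folder,
           "--out_folder", out_folder,
           f"--batch_size={batch_size}"]
    # Each optional flag is a plain switch: present means enabled.
    if side_view:
        cmd.append("--side_view")
    if save_mesh:
        cmd.append("--save_mesh")
    if full_frame:
        cmd.append("--full_frame")
    return cmd


def run_demo(**kwargs):
    """Run the demo from the repository root; raises if demo.py exits with an error."""
    subprocess.run(build_demo_command(**kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this sketch is safe to execute anywhere.
    print(" ".join(build_demo_command("example_data/images", "demo_out",
                                      side_view=True, save_mesh=True, full_frame=True)))
```

Calling `run_demo(img_folder="example_data/images", out_folder="demo_out")` from the repository root should then be equivalent to the shell command, minus the optional flags.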

Run tracking demo on videos

Our tracker builds on PHALP. Please install it first:

pip install git+https://github.com/brjathu/PHALP.git

Now, run track.py to reconstruct and track humans in any video. The input video source may be a video file, a folder of frames, or a YouTube link:

# Run on video file
python track.py video.source="example_data/videos/gymnasts.mp4"

# Run on extracted frames
python track.py video.source="/path/to/frames_folder/"

# Run on a YouTube link (depends on pytube working properly)
python track.py video.source=\'"https://www.youtube.com/watch?v=xEH_5T9jMVU"\'
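
The nested quoting in the YouTube command can look odd. The sketch below uses Python's `shlex` to show what a POSIX shell actually passes to track.py: the double quotes are consumed by the shell, while the escaped single quotes survive and wrap the URL. The reading that this is needed so the `key=value` override parser treats the URL (which itself contains `=` and `?`) as one quoted string is our assumption, not something stated in this README.

```python
import shlex

URL = "https://www.youtube.com/watch?v=xEH_5T9jMVU"

# The command exactly as typed in a POSIX shell.
shell_text = r"""python track.py video.source=\'"https://www.youtube.com/watch?v=xEH_5T9jMVU"\'"""

# The same command as an argument list: no shell is involved,
# so the single quotes are written literally around the URL.
argv = ["python", "track.py", f"video.source='{URL}'"]

# shlex.split applies POSIX shell quoting rules to the typed command.
parsed = shlex.split(shell_text)

if __name__ == "__main__":
    print(parsed[-1])  # video.source='https://www.youtube.com/watch?v=xEH_5T9jMVU'
```

When launching track.py with `subprocess.run(argv)`, the list form avoids the backslash escapes entirely.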

The output directory (./outputs by default) will contain a video rendering of the tracklets and a .pkl file containing the tracklets with 3D pose and shape. Please see the PHALP repository for details.
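
As a starting point for inspecting the .pkl output, the sketch below loads a results file and prints a short summary. Both helpers are hypothetical, and the file layout is not specified here: we assume the file can be read with joblib (falling back to the standard pickle module if joblib is not installed) and only report the top-level structure. Refer to the PHALP repository for the actual field names. As with any pickle, only load files you produced yourself.

```python
import pickle
import sys


def load_results(path):
    """Load a results file; tries joblib first (an assumption), then plain pickle."""
    try:
        import joblib
        return joblib.load(path)
    except ImportError:
        with open(path, "rb") as f:
            return pickle.load(f)


def summarize(results, max_keys=3):
    """Describe the top-level structure without assuming any field names."""
    if isinstance(results, dict):
        keys = list(results)
        return {"type": "dict", "entries": len(keys), "sample_keys": keys[:max_keys]}
    return {"type": type(results).__name__}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python inspect_results.py /path/to/results.pkl  (script name is hypothetical)
    print(summarize(load_results(sys.argv[1])))
```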

Training

Download the training data to ./hmr2_training_data/, then start training, using the following commands:

bash fetch_training_data.sh
python train/train.py exp_name=hmr2 data=mix_all experiment=hmr_vit_transformer trainer=gpu launcher=local

Checkpoints and logs will be saved to ./logs/. We trained on 8 A100 GPUs for 7 days using PyTorch 1.13.1 and PyTorch-Lightning 1.8.1 with CUDA 11.6 on a Linux system. You may adjust the batch size and number of GPUs to suit your hardware.
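
For rough planning with a different number of GPUs, the reference run above corresponds to 8 × 7 × 24 = 1344 GPU-hours. The sketch below turns that into a wall-clock estimate. It assumes the same GPU type and perfectly linear scaling, which real runs rarely achieve, so treat the result as an optimistic lower bound. Both function names are hypothetical.

```python
def gpu_hours(num_gpus: int, days: float) -> float:
    """Total GPU-hours for a run on num_gpus GPUs lasting the given number of days."""
    return num_gpus * days * 24


def estimated_days(reference_gpu_hours: float, num_gpus: int) -> float:
    """Wall-clock days on num_gpus GPUs, assuming ideal linear scaling."""
    return reference_gpu_hours / (num_gpus * 24)


if __name__ == "__main__":
    reference = gpu_hours(8, 7)  # the configuration reported above
    for n in (1, 2, 4, 8):
        print(f"{n} GPU(s): about {estimated_days(reference, n):.0f} days")
```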

Evaluation

Coming soon.

Acknowledgements

Parts of the code are taken or adapted from the following repos:

Additionally, we thank StabilityAI for a generous compute grant that enabled this work.

Citing

If you find this code useful for your research, please consider citing the following paper:

@article{goel2023humans,
    title={Humans in 4{D}: Reconstructing and Tracking Humans with Transformers},
    author={Goel, Shubham and Pavlakos, Georgios and Rajasegaran, Jathushan and Kanazawa, Angjoo and Malik, Jitendra},
    journal={arXiv preprint arXiv:2305.20091},
    year={2023}
}