Paper | Video | Project Page
Tobias Kirschstein, Simon Giebenhain, and Matthias Nießner
CVPR 2024
- Setup environment
      conda env create -f environment.yml
      conda activate diffusion-avatars
  which creates a new conda environment diffusion-avatars (installation may take a while).
- Install the diffusion_avatars package itself by running
      pip install -e .
  inside the cloned repository folder.
- [Optional Linux] Update LD_LIBRARY_PATH for nvdiffrast
      ln -s "$CONDA_PREFIX/lib" "$CONDA_PREFIX/lib64"
  Solves the issue /usr/bin/ld: cannot find -lcudart
- [Optional Windows] Update CUDA_HOME for nvdiffrast
      conda env config vars set CUDA_HOME=$Env:CONDA_PREFIX
      conda activate base
      conda activate diffusion-avatars
  Solves the issue when nvdiffrast wants to use a globally installed CUDA toolkit instead of the one from the environment.
All paths to data / models / renderings are defined by environment variables.
Please create the file ~/.config/diffusion-avatars/.env in your home directory with the following content:
DIFFUSION_AVATARS_DATA_PATH="..."
DIFFUSION_AVATARS_MODELS_PATH="..."
DIFFUSION_AVATARS_RENDERS_PATH="..."
Replace the ... with the locations where data / models / renderings should be located on your machine.
- DIFFUSION_AVATARS_DATA_PATH: Location of the multi-view videos and preprocessed 3DMM meshes (see section 2 for how to obtain the dataset)
- DIFFUSION_AVATARS_MODELS_PATH: During training, model checkpoints and configs will be saved here
- DIFFUSION_AVATARS_RENDERS_PATH: Video renderings of trained models will be stored here
If you prefer not to create a config file in your home directory, you can instead hard-code the paths in env.py.
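Before training, it can help to confirm that the config file defines all three variables. The following is a minimal sketch, not the loader used by the repository: it assumes the file contains plain KEY="value" lines, and the helper names parse_env_text and missing_keys are hypothetical.

```python
# Hypothetical helpers; the repository's own loading logic (env.py) may differ.
REQUIRED_KEYS = (
    "DIFFUSION_AVATARS_DATA_PATH",
    "DIFFUSION_AVATARS_MODELS_PATH",
    "DIFFUSION_AVATARS_RENDERS_PATH",
)


def parse_env_text(text):
    """Parse KEY="value" lines into a dict, skipping blank lines and comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(values):
    """Return required keys that are absent, empty, or still set to the '...' placeholder."""
    return [key for key in REQUIRED_KEYS if values.get(key, "") in ("", "...")]
```

To check an actual config, read the file with `(Path.home() / ".config" / "diffusion-avatars" / ".env").read_text()` and pass the result to parse_env_text; an empty list from missing_keys means all three paths are set.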
Data as well as model checkpoints can be found in the Downloads section.
The folder structure assumed by the code looks as follows:
DIFFUSION_AVATARS_DATA_PATH
├── nersemble               # Raw NeRSemble data containing RGB images, etc.
│   ├── 018                 # Data folder for participant 18
│   ├── 037
│   ...
└── rendering_data          # Rasterized NPHM meshes that are the input for DiffusionAvatars
    ├── v1.1-ID-18-nphm     # Dataset of rasterized NPHM meshes for participant 18
    ├── v1.2-ID-37-nphm
    ...
DIFFUSION_AVATARS_MODELS_PATH
└── diffusion-avatars       # Folder for DiffusionAvatars checkpoints
    ├── DA-1-ID-18          # Checkpoint for DiffusionAvatars model trained on participant 18
    ├── DA-2-ID-37
    ...
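The layout above can be checked programmatically after downloading. This is a rough sketch under two assumptions: participant folders are zero-padded to three digits (inferred from 018 and 037), and the function names expected_layout and missing_folders are hypothetical, not part of the repository.

```python
from pathlib import Path


def expected_layout(data_root, models_root, participant_id, dataset_folder, model_folder):
    """Return the folders implied by the layout above for one participant."""
    data_root, models_root = Path(data_root), Path(models_root)
    return {
        # Assumption: participant IDs are zero-padded to three digits (18 -> "018").
        "raw": data_root / "nersemble" / f"{participant_id:03d}",
        "processed": data_root / "rendering_data" / dataset_folder,
        "model": models_root / "diffusion-avatars" / model_folder,
    }


def missing_folders(layout):
    """Return the names of layout entries that do not exist on disk."""
    return [name for name, path in layout.items() if not path.is_dir()]
```

For example, `expected_layout(data_path, models_path, 18, "v1.1-ID-18-nphm", "DA-1-ID-18")` yields the three folders for participant 18, and missing_folders reports which of them still need to be downloaded or created.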
python scripts/train/train_diffusion_avatars.py $DATASET
where $DATASET is the identifier of a dataset folder in ${DIFFUSION_AVATARS_DATA_PATH}/rendering_data.
E.g., v1.1 will train DiffusionAvatars on the data of person 18 stored in the folder v1.1-ID-18-nphm.
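How a short identifier maps to a dataset folder can be illustrated with a small lookup. This sketch assumes the identifier is the folder-name prefix up to the first "-ID-" part, as the folder names in the layout above suggest; the training script's actual resolution logic may differ, and resolve_dataset_folder is a hypothetical name.

```python
def resolve_dataset_folder(identifier, folder_names):
    """Find the single folder whose name starts with '<identifier>-'."""
    # Appending "-" prevents "v1.1" from also matching a folder such as "v1.10-...".
    prefix = identifier + "-"
    matches = sorted(name for name in folder_names if name.startswith(prefix))
    if len(matches) != 1:
        raise ValueError(f"Expected exactly one folder for {identifier!r}, found {matches}")
    return matches[0]


# Folder names taken from the layout above.
folders = ["v1.1-ID-18-nphm", "v1.2-ID-37-nphm"]
dataset_folder = resolve_dataset_folder("v1.2", folders)
```

In practice, folder_names would come from listing ${DIFFUSION_AVATARS_DATA_PATH}/rendering_data.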
Please find the respective raw NeRSemble data as well as processed datasets in the Downloads section.
Checkpoints and training configurations will be stored in a model folder DA-xxx-${name} inside ${DIFFUSION_AVATARS_MODELS_PATH}/diffusion-avatars.
The incremental run id xxx will be determined automatically.
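One plausible way to pick an incremental run id is to scan the existing model folders and add one to the highest id found. The sketch below illustrates that idea only; it is an assumption about the naming scheme (ids starting at 1, ${name} being a suffix such as ID-18), and next_run_id and model_folder_name are hypothetical helpers.

```python
import re


def next_run_id(existing_folders, prefix="DA"):
    """Return one more than the highest '<prefix>-<id>' found, or 1 if there is none."""
    pattern = re.compile(rf"^{re.escape(prefix)}-(\d+)(?:-|$)")
    ids = [int(m.group(1)) for name in existing_folders if (m := pattern.match(name))]
    return max(ids, default=0) + 1


def model_folder_name(run_id, name, prefix="DA"):
    """Build a folder name of the form DA-xxx-${name}."""
    return f"{prefix}-{run_id}-{name}"


# Folder names taken from the layout described earlier.
run_id = next_run_id(["DA-1-ID-18", "DA-2-ID-37"])
new_folder = model_folder_name(run_id, "ID-55")
```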
During training, the script will log metrics and images to Weights & Biases (wandb.ai) in the project diffusion-avatars.
The held-out test sequences are specified in constants.py.
Training takes roughly 2 days and requires at least an RTX A6000 GPU (48 GB VRAM). For debugging purposes, the following flags may be used to keep the GPU memory consumption below 10 GB:
--batch_size 1 --gradient_accumulation 4 --use_8_bit_adam --mixed_precision FP16 --dataloader_num_workers 0
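When launching debug runs from another script, the full command can be assembled in Python. This is a small sketch that only combines the script path and flags listed above; placing the dataset argument before the flags is an assumption, and build_debug_train_command is a hypothetical helper.

```python
import shlex

# Flags copied from the list above.
DEBUG_FLAGS = [
    "--batch_size", "1",
    "--gradient_accumulation", "4",
    "--use_8_bit_adam",
    "--mixed_precision", "FP16",
    "--dataloader_num_workers", "0",
]


def build_debug_train_command(dataset):
    """Return the training command with the low-memory debug flags as an argument list."""
    return ["python", "scripts/train/train_diffusion_avatars.py", dataset, *DEBUG_FLAGS]


command = build_debug_train_command("v1.1")
command_line = shlex.join(command)  # single string for printing or logging
```

The argument list can be passed to `subprocess.run(command, check=True)` from inside the cloned repository. With a batch size of 1 and 4 accumulation steps, one would expect an effective batch of 4 samples per optimizer step, assuming the flag follows the usual gradient-accumulation convention.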
From a trained model DA-xxx, a self-reenactment rendering may be obtained via:
python scripts/render/render_trajectory.py DA-xxx
The resulting .mp4 file is stored in DIFFUSION_AVATARS_RENDERS_PATH.
Please find trained model checkpoints and corresponding raw NeRSemble data in the Downloads section.
python scripts/evaluate/evaluate.py DA-xxx
will evaluate the self-reenactment scenario for avatar DA-xxx.
Please find trained model checkpoints and corresponding processed datasets in the Downloads section.
The computed metrics and generated model predictions with paired GT images will be stored in ${DIFFUSION_AVATARS_MODELS_PATH}/diffusion-avatars/DA-xxx-${name}/evaluations.
The key "average_per_sequence_metric" in the generated .json file reproduces the metrics from the paper.
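Reading the paper metrics back from the evaluation output could look like the sketch below. Only the key name comes from the text above; the structure of the value stored under it and the name of the .json file are not specified here, so the helper simply returns whatever is stored under the key. extract_paper_metrics is a hypothetical name.

```python
import json

PAPER_METRICS_KEY = "average_per_sequence_metric"


def extract_paper_metrics(json_text):
    """Return the entry stored under the key that reproduces the paper metrics."""
    results = json.loads(json_text)
    if PAPER_METRICS_KEY not in results:
        raise KeyError(f"{PAPER_METRICS_KEY!r} not found in evaluation results")
    return results[PAPER_METRICS_KEY]
```

To use it, read the generated .json file from the evaluations folder as text and pass the contents to extract_paper_metrics.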
The script
python scripts/data/create_renderings_dataset.py $PARTICIPANT_ID $SEQUENCES
processes the raw NeRSemble data of $PARTICIPANT_ID for the comma-separated $SEQUENCES.
It creates a new folder in ${DIFFUSION_AVATARS_DATA_PATH}/rendering_data that contains the rasterized NPHM images and forms the input for training DiffusionAvatars.
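Since the sequences are passed as a single comma-separated argument, a list of sequence names has to be joined before calling the script. The sketch below shows this; build_create_dataset_command is a hypothetical helper, and the sequence names in the example are placeholders, not real NeRSemble sequence names.

```python
def build_create_dataset_command(participant_id, sequences):
    """Return the dataset-creation command with sequences joined by commas."""
    if not sequences:
        raise ValueError("At least one sequence is required")
    return [
        "python",
        "scripts/data/create_renderings_dataset.py",
        str(participant_id),
        ",".join(sequences),  # the script expects one comma-separated argument
    ]


# "SEQ_A" and "SEQ_B" are placeholder names for illustration only.
command = build_create_dataset_command(18, ["SEQ_A", "SEQ_B"])
```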
Please refer to the provided datasets in the Downloads section for the expected folder layout for the raw NeRSemble data.
The NPHM fittings were obtained using MonoNPHM.
The notebooks folder contains minimal examples on how to
- Load RGB images and NPHM renderings (visualize_data.ipynb)
- Load a trained model and obtain a prediction (inference.ipynb)
Participant ID | Model | Raw NeRSemble data | Processed Data |
---|---|---|---|
18 | DA-1-ID-18 | ID-18 | v1.1-ID-18-nphm |
37 | DA-2-ID-37 | ID-37 | v1.2-ID-37-nphm |
55 | DA-3-ID-55 | ID-55 | v1.3-ID-55-nphm |
124 | DA-4-ID-124 | ID-124 | v1.4-ID-124-nphm |
145 | DA-5-ID-145 | ID-145 | v1.5-ID-145-nphm |
210 | DA-6-ID-210 | ID-210 | v1.6-ID-210-nphm |
251 | DA-7-ID-251 | ID-251 | v1.7-ID-251-nphm |
264 | DA-8-ID-264 | ID-264 | v1.8-ID-264-nphm |
Participant ID refers to the participants from the NeRSemble dataset.
The model zip files contain a checkpoint as well as hyperparameters.
The raw NeRSemble data files contain the RGB images, segmentation masks, foreground masks, NPHM fittings (obtained with MonoNPHM), and camera parameters.
The processed data archives contain the rasterized NPHM meshes (normals, depth, and canonical coordinates) that serve as the input for DiffusionAvatars.
Before using the raw NeRSemble data or processed data for your own projects, please fill out the NeRSemble dataset Terms of Use.
If you find our paper or code useful, please consider citing
@inproceedings{kirschstein2024diffusionavatars,
title={DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars},
author={Kirschstein, Tobias and Giebenhain, Simon and Nie{\ss}ner, Matthias},
booktitle={Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
year={2024}
}
Contact Tobias Kirschstein for questions, comments and reporting bugs, or open a GitHub issue.