We address the problem of clothed human reconstruction from a single image or uncalibrated multi-view images. Existing methods struggle with reconstructing detailed geometry of a clothed human and often require a calibrated setting for multi-view reconstruction. We propose a flexible framework which, by leveraging the parametric SMPL-X model, can take an arbitrary number of input images to reconstruct a clothed human model under an uncalibrated setting. At the core of our framework is our novel self-evolved signed distance field (SeSDF) module which allows the framework to learn to deform the signed distance field (SDF) derived from the fitted SMPL-X model, such that detailed geometry reflecting the actual clothed human can be encoded for better reconstruction. Besides, we propose a simple method for self-calibration of multi-view images via the fitted SMPL-X parameters. This lifts the requirement of tedious manual calibration and largely increases the flexibility of our method. Further, we introduce an effective occlusion-aware feature fusion strategy to account for the most useful features to reconstruct the human model. We thoroughly evaluate our framework on public benchmarks, and our framework establishes a new state-of-the-art.
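For intuition, here is a minimal conceptual sketch of the SeSDF idea: start from the signed distance field of the fitted SMPL-X body and let a small network predict a residual that evolves it toward the actual clothed surface. This toy is not the module implemented in this repo; the MLP, its inputs, and the feature vectors are placeholders.

# Toy sketch of the self-evolved SDF idea (placeholders, not the repo's module).
import torch
import torch.nn as nn
import trimesh

class ToySeSDF(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # Maps (query point, base SDF value, per-point feature) -> SDF residual.
        self.mlp = nn.Sequential(
            nn.Linear(3 + 1 + feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, points, base_sdf, feats):
        delta = self.mlp(torch.cat([points, base_sdf, feats], dim=-1))
        return base_sdf + delta  # "self-evolved" SDF value

# Base SDF from a fitted body mesh (here a watertight placeholder mesh).
body = trimesh.creation.icosphere(subdivisions=3)
pts = torch.rand(1024, 3) * 2 - 1
inside_positive = trimesh.proximity.signed_distance(body, pts.numpy())
base = -torch.as_tensor(inside_positive, dtype=torch.float32).unsqueeze(-1)  # negate: trimesh is positive inside
sdf = ToySeSDF()(pts, base, torch.zeros(1024, 64))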
The environment (in conda yaml) for SeSDF training and testing:
conda env create -f environment_sesdf.yaml
conda activate sesdf
You may also need to install torch_scatter manually for the 3D encoder:
pip install torch==1.10.0+cu102 torchvision==0.11.0+cu102 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install torch-scatter==2.0.9 -f https://pytorch-geometric.com/whl/torch-1.10.0+cu102.html
If you are using an RTX 3090 or higher, please install torch, torchvision, and torch-scatter built against the corresponding CUDA version, i.e.,
pip install torch==1.10.0+cu111 torchvision==0.11.1+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install torch-scatter==2.0.9 -f https://pytorch-geometric.com/whl/torch-1.10.0+cu111.html
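After installation, a quick import test confirms that torch sees CUDA and that torch-scatter was built against a matching version (a generic check, nothing repo-specific):

# Sanity check for torch / CUDA / torch-scatter.
import torch
from torch_scatter import scatter_max

print(torch.__version__, torch.version.cuda, torch.cuda.is_available())
src = torch.tensor([1., 3., 2., 4.])
index = torch.tensor([0, 0, 1, 1])
out, argmax = scatter_max(src, index)  # per-group max -> tensor([3., 4.])
print(out)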
Follow the official instructions of pytorch3d and kaolin to install these required packages.
We use the THUman2.0 dataset for training and testing. You can download it, along with the corresponding SMPL-X parameters and meshes, from this link after following the instructions and submitting the request form.
- Generating the precomputed radiance transfer (PRT) for each obj mesh, as instructed in PIFu:
cd process_data
python -m prt_util -i {path_to_THUman2.0_dataset}
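For background, PRT stores per-vertex spherical-harmonics (SH) coefficients so shading can be evaluated cheaply at render time. The standard 2nd-order (9-term) real SH basis that PIFu-style PRT pipelines evaluate is sketched below; this is reference math, not the prt_util code itself.

# Standard 2nd-order real spherical-harmonics basis (reference only).
import numpy as np

def sh_basis_order2(dirs):
    # dirs: (N, 3) unit direction vectors -> (N, 9) basis values.
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),   # Y_0^0
        0.488603 * y,                 # Y_1^-1
        0.488603 * z,                 # Y_1^0
        0.488603 * x,                 # Y_1^1
        1.092548 * x * y,             # Y_2^-2
        1.092548 * y * z,             # Y_2^-1
        0.315392 * (3 * z**2 - 1),    # Y_2^0
        1.092548 * x * z,             # Y_2^1
        0.546274 * (x**2 - y**2),     # Y_2^2
    ], axis=-1)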
- Rendering the meshes into 360-degree images. Together with the RENDER folder for images, this will generate the MASK, PARAM, UV_RENDER, UV_MASK, UV_NORMAL, and UV_POS folders, and copy the meshes to GEO/OBJ. Remember to add -e to enable EGL rendering. A sanity-check sketch follows the commands below.
python -m render_data -i {path_to_THUman2.0_dataset} -o {path_to_THUman2.0_processed_for_training} -e
cd ..
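As mentioned above, a quick sanity check on the rendering output is to composite one rendered view with its mask. The file naming below is assumed from PIFu-style preprocessing; adjust the subject ID and yaw angle to your data.

# Composite a rendered view onto white using its mask (paths are assumptions).
import numpy as np
from PIL import Image

subject, yaw = "0004", 0  # hypothetical subject ID and yaw angle
img = np.asarray(Image.open(f"THU2_processed/RENDER/{subject}/{yaw}_0_00.jpg").convert("RGB"))
msk = np.asarray(Image.open(f"THU2_processed/MASK/{subject}/{yaw}_0_00.png").convert("L")) > 127
composite = np.where(msk[..., None], img, 255).astype(np.uint8)
Image.fromarray(composite).save("check_render.png")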
After data preprocessing, the dataset directory should look like this:
THU2_processed/
├── RENDER
├── UV_RENDER
├── UV_POS
├── UV_NORMAL
├── UV_MASK
├── PARAM
├── MASK
├── GEO/OBJ
├── SMPLX
└── val.txt
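Before training, a small script can verify that preprocessing produced all expected folders (paths taken from the tree above):

# Check that the processed dataset contains the expected folders.
import os

root = "THU2_processed"
expected = ["RENDER", "MASK", "PARAM", "UV_RENDER", "UV_MASK",
            "UV_NORMAL", "UV_POS", "GEO/OBJ", "SMPLX"]
missing = [d for d in expected if not os.path.isdir(os.path.join(root, d))]
print("missing folders:", missing or "none")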
Training the single-view pipeline (intermediate results and checkpoints are saved under ./results and ./checkpoints, respectively):
cd ./single-view
python -m apps.train_shape --dataroot {path_to_THUman2.0_processed_for_training} --random_flip --random_scale --num_stack 4 --num_hourglass 2 --resolution 512 --hg_down 'ave_pool' --norm 'group' --val_train_error --val_test_error --gpu_ids=0,1,2 --batch_size 3 --learning_rate 0.0001 --sigma 0.02 --checkpoints_path {your_checkpoints_folder} --results_path {your_results_folder} --schedule 4 --num_threads 10
Training the multi-view pipeline:
cd ../multi-view
python -m apps.train_shape --dataroot {path_to_THUman2.0_processed_for_training} --random_flip --random_scale --num_stack 4 --num_hourglass 2 --resolution 512 --hg_down 'ave_pool' --norm 'group' --val_train_error --val_test_error --gpu_id=0 --batch_size 1 --learning_rate 0.0001 --sigma 0.02 --checkpoints_path {your_checkpoints_folder} --results_path {your_results_folder} --schedule 4 --num_threads 10 --num_views 3
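For intuition on the uncalibrated multi-view setting: if each view's SMPL-X fit gives the body's rigid pose (R_i, t_i) in that camera's frame, the body itself can serve as a shared world frame, and relative camera poses follow by composition. The sketch below only illustrates this idea; the repo's actual self-calibration may differ in detail.

# Relative camera poses from per-view body poses (conceptual sketch).
import numpy as np

def cam_from_body(R, t):
    # 4x4 transform mapping body (world) coordinates into a camera frame.
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def relative_pose(R_i, t_i, R_j, t_j):
    # Maps camera-j coordinates into camera-i coordinates.
    return cam_from_body(R_i, t_i) @ np.linalg.inv(cam_from_body(R_j, t_j))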
See {your_results_folder}/pred.ply for intermediate visualizations during training.
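To inspect pred.ply outside of training, any mesh viewer works, e.g. with trimesh (assuming it is installed in the environment):

# Peek at the intermediate reconstruction written during training.
import trimesh

results_folder = "results"  # replace with your --results_path value
mesh = trimesh.load(f"{results_folder}/pred.ply")
print(mesh)  # vertex / face counts
mesh.show()  # interactive viewer (requires pyglet)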
Testing with the single-view pipeline (we follow an approach similar to ICON's to obtain the estimated SMPL-X model):
python ./apps/eval.py --batch_size 1 --num_stack 4 --num_hourglass 2 --resolution 512 --hg_down 'ave_pool' --norm 'group' --norm_color 'group' --test_folder_path {path_to_your_testdata} --load_netG_checkpoint_path {path_to_your_netG_checkpoint}
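For reference, implicit SDF-based reconstructions of this kind are meshed by sampling the field on a regular grid and running marching cubes; the eval script handles this internally, but a standalone toy version looks like:

# Mesh a signed distance field via marching cubes (toy sphere SDF).
import numpy as np
from skimage import measure

N = 128
grid = np.linspace(-1, 1, N)
x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.5  # toy SDF: sphere of radius 0.5
verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
verts = verts * (2.0 / (N - 1)) - 1.0  # map voxel indices back to [-1, 1]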
If you want to cite our work, please use the following bib entry:
@inproceedings{cao2023sesdf,
  author    = {Yukang Cao and Kai Han and Kwan-Yee K. Wong},
  title     = {SeSDF: Self-evolved Signed Distance Field for Implicit 3D Clothed Human Reconstruction},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2023},
}