
SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture

Zheng Dong, Xu Ke, Yaoan Gao, Qilin Sun, Hujun Bao, Weiwei Xu*, Rynson W.H. Lau
SIGGRAPH ASIA 2023

SAILOR is a generalizable method for human free-view rendering and reconstruction from very sparse (e.g., 4) RGBD streams, achieving near real-time performance under acceleration.

Our free-view rendering results and bullet-time effects on our real-captured dataset (unseen performers).



Installation

Please install the Python dependencies listed in requirements.txt:

conda create -n SAILOR python=3.8
conda activate SAILOR
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

Install ImplicitSeg, the surface localization algorithm provided by MonoPort, which is used by our fast post-merging operation.

Our code has been tested under the following configuration (a quick environment check is sketched after this list):

  • Ubuntu 18.04, 20.04 or 22.04
  • Python 3.8 and PyTorch 1.8.0
  • GCC/G++ 9.5.0
  • Nvidia GPU (RTX 3090), CUDA 11.1, cuDNN
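
As an optional sanity check (not part of the repository's own scripts), a short snippet like the one below can confirm that the installed PyTorch/CUDA stack matches the tested configuration:

# Optional environment check; expected values follow the tested configuration above.
import torch

print("PyTorch:", torch.__version__)               # expected: 1.8.0+cu111
print("CUDA toolkit:", torch.version.cuda)         # expected: 11.1
print("cuDNN:", torch.backends.cudnn.version())
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))   # e.g., GeForce RTX 3090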

Build the C++ and CUDA libraries:

  • VoxelEncoding, FastNerf and Mesh-RenderUtil: enter each of these directories under c_lib/ and run python setup.py install (see the sketch after this list). VoxelEncoding provides the CUDA-accelerated versions of TSDF-Fusion, two-layer tree construction, ray-voxel intersection, adaptive point sampling, etc. FastNerf provides a fully-fused version of the MLPs and Hydra-attention for our SRONet.
  • AugDepth and Depth2Color [optional] (Eigen3, OpenCV, OpenMP and pybind11 are required):
cd c_lib/AugDepth        # repeat for c_lib/Depth2Color
mkdir build && cd build
cmake .. && make
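
As referenced above, the following is a minimal sketch for building the three setup.py-based extensions in one go; it assumes the c_lib/ subdirectories are named exactly as in the list above.

# Build each C++/CUDA extension; directory names follow the list above and may need adjusting.
import pathlib
import subprocess
import sys

for ext in ["VoxelEncoding", "FastNerf", "Mesh-RenderUtil"]:
    ext_dir = pathlib.Path("c_lib") / ext
    subprocess.run([sys.executable, "setup.py", "install"], cwd=ext_dir, check=True)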

Setup

  • Clone or download this repo
  • Download our pretrained depth denoising model (latest_model_BodyDRM2.pth) and our rendering model (latest_model_BasicRenNet.pth) here
  • Move the downloaded models to ./checkpoints_rend/SAILOR folder

Usage

The example static test data is provided in ./test_data. The data structure for static (or dynamic) data is listed below; a minimal loading sketch follows the layout:

<dataset_name>
|-- COLOR
    |-- FRAMExxxx
        |-- 0.jpg       # input RGB image (1024x1024) for each view
        |-- 1.jpg
        ...
|-- DEPTH
    |-- FRAMExxxx
        |-- 0.png       # input depth image (1024x1024, uint16; divide by 10000 to obtain meters) for each view
        |-- 1.png
        ...
|-- MASK
    |-- FRAMExxxx
        |-- 0.png       # input human-region mask (1024x1024) for each view
        |-- 1.png
        ...
|-- PARAM
    |-- FRAMExxxx
        |-- 0.npy       # camera intrinsic ('K': 3x3) and pose ('RT': 3x4) matrices for each view
        |-- 1.npy
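
For reference, one frame/view of this layout can be loaded roughly as follows. The dataset/frame names are placeholders, and the assumption that each PARAM .npy stores a dict with 'K' and 'RT' keys is based on the description above:

# Minimal loading sketch for one frame/view of the layout above (paths are placeholders).
import cv2
import numpy as np

root, frame, view = "./test_data/static", "FRAME0000", 0   # placeholder dataset and frame names

color = cv2.imread(f"{root}/COLOR/{frame}/{view}.jpg")                         # 1024x1024 RGB image
depth = cv2.imread(f"{root}/DEPTH/{frame}/{view}.png", cv2.IMREAD_UNCHANGED)   # uint16 depth map
depth_m = depth.astype(np.float32) / 10000.0                                   # depth in meters
mask = cv2.imread(f"{root}/MASK/{frame}/{view}.png", cv2.IMREAD_GRAYSCALE)     # human-region mask
param = np.load(f"{root}/PARAM/{frame}/{view}.npy", allow_pickle=True).item()  # assumed dict
K, RT = param["K"], param["RT"]   # 3x3 intrinsics, 3x4 camera pose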

Depth denoising:

  • Run python -m depth_denoising.inference
  • The original and denoised point clouds are saved in the ./depth_denoising/results folder. Use MeshLab to visualize the 3D results
  • Modify basic_path, the frame idx, and view_id in depth_denoising/inference.py to obtain the results of other examples (a standalone backprojection sketch follows this list)
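
To illustrate what the exported point clouds contain, the snippet below backprojects a single depth view into camera-space 3D points using the 'K' intrinsics from PARAM and writes an ASCII .ply that MeshLab can open. It is a standalone illustration, not the code path used by depth_denoising/inference.py, and the paths are placeholders:

# Illustrative backprojection of one depth map to a point cloud (placeholder paths).
import cv2
import numpy as np

depth = cv2.imread("./test_data/static/DEPTH/FRAME0000/0.png", cv2.IMREAD_UNCHANGED)
depth_m = depth.astype(np.float32) / 10000.0                                   # depth in meters
K = np.load("./test_data/static/PARAM/FRAME0000/0.npy", allow_pickle=True).item()["K"]

v, u = np.nonzero(depth_m > 0)                  # pixel coordinates with valid depth
z = depth_m[v, u]
x = (u - K[0, 2]) * z / K[0, 0]                 # standard pinhole backprojection
y = (v - K[1, 2]) * z / K[1, 1]
points = np.stack([x, y, z], axis=1)

with open("view0_points.ply", "w") as f:        # ASCII .ply, viewable in MeshLab
    f.write("ply\nformat ascii 1.0\n")
    f.write(f"element vertex {len(points)}\n")
    f.write("property float x\nproperty float y\nproperty float z\nend_header\n")
    np.savetxt(f, points, fmt="%.6f")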


SRONet and SRONetUp:

  • For the provided static data, run python -m upsampling.inference_static --name SAILOR (1K resolution) or python -m SRONet.inference_static --name SAILOR (512 resolution) to obtain the reconstructed 3D mesh and free-view rendering results.
  • The reconstructed 3D meshes are in the ./checkpoints_rend/SAILOR/val_results folder. To render the 3D mesh, run python -m utils_render.render_mesh to obtain the free-view mesh rendering results. Modify opts.ren_data_root, obj_path and obj_name in the file render_mesh.py to get new results.
  • For dynamic data, first download our real-captured data here, unzip it, and put it in the ./test_data folder
  • Then run python -m upsampling.inference_dynamic --name SAILOR or python -m SRONet.inference_dynamic --name SAILOR to obtain the rendering results
  • Modify opts.ren_data_root and opts.data_name in inference_static.py and inference_dynamic.py to obtain new rendering results
  • The rendered images and videos are in the ./SRONet/results (or ./upsampling/results) folder.


Interactive rendering:

We release our interactive rendering GUI for our real-captured dataset.

  • TensorRT is required to accelerate our depth denoising network and the encoders in SRONet (upsampling). Please refer to the TensorRT installation guide and then install torch2trt. Our TensorRT version is 7.2
  • Run python -m depth_denoising.toTensorRT, python -m SRONet.toTensorRT and python -m upsampling.toTensorRT to obtain the TRTModules (the parameter opts.num_gpus in toTensorRT.py controls the number of GPUs); a generic torch2trt conversion pattern is sketched after this list. The final .pth models are in the ./SAILOR/accelerated_models folder
  • Run python -m gui.gui_render. Modify opts.ren_data_root in gui_render.py to test other data, and modify opts.num_gpus to use 1 GPU (slower) or 2 GPUs. The GIF below shows the rendering result using two Nvidia RTX 3090 GPUs, an Intel i9-13900K, and an MSI Z790 Godlike motherboard
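
For context, torch2trt conversions generally follow the pattern below; the tiny network and input shape are placeholders, not SAILOR's actual modules (those are converted by the toTensorRT scripts above):

# Generic torch2trt conversion/loading pattern (placeholder network, not SAILOR's own modules).
import torch
from torch2trt import torch2trt, TRTModule

model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3, padding=1), torch.nn.ReLU()).cuda().eval()
x = torch.randn(1, 3, 512, 512).cuda()               # example input with the expected shape

model_trt = torch2trt(model, [x], fp16_mode=True)    # build the TensorRT engine
torch.save(model_trt.state_dict(), "encoder_trt.pth")

# Reload the serialized engine later for fast inference:
trt_module = TRTModule()
trt_module.load_state_dict(torch.load("encoder_trt.pth"))
y = trt_module(x)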

License

The code, models, and GUI demos in this repository are released under the GPL-3.0 license.

Citation

If you find our work helpful to your research, please cite our paper.

@article{dong2023sailor,
  author    = {Zheng Dong and Xu Ke and Yaoan Gao and Qilin Sun and Hujun Bao and Weiwei Xu and Rynson W.H. Lau},
  title     = {SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture},
  year      = {2023},
  journal   = {ACM Transactions on Graphics (TOG)},
  volume    = {42},
  number    = {6},
  doi       = {10.1145/3618370},
  publisher = {ACM}
}