Semantic Occupancy Field

This repository contains the code for training/generating SOF (semantic occupancy field) as part of the TOG submission: SofGAN: A Portrait Image Generator with Dynamic Styling.

Installation

Clone the main SofGAN repo:

git clone --recursive https://github.com/apchenstu/softgan_test.git

This repo will then be included automatically under softgan_test/modules.

Data preparation

Create a root directory (e.g. data) and, for each instance (e.g. 00000), create a folder containing its segmentation maps and calibrated camera poses. The folder structure looks like:

└── data
    ├── 00000               # instance id
    │   ├── cam2world.npy   # camera extrinsics
    │   ├── cameras.npy
    │   ├── intrinsic.npy   # camera intrinsics
    │   ├── zRange.npy      # optional; only needed when training with depth
    │   ├── 00000.png       # segmentation maps
    │   ...
    │   └── 00029.png
    ├── 00001
    │   └── ...
    ...
    └── xxxxx
        └── ...

Download the example data from here. We provide a notebook for data preprocessing.
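
To sanity-check an instance folder, you can inspect the camera files with NumPy. This is a minimal sketch; the shapes shown in the comments are assumptions, so verify them against your own data:

import numpy as np

# Shapes are assumptions -- print them to confirm for your data.
cam2world = np.load('data/00000/cam2world.npy')  # e.g. (num_views, 4, 4) extrinsics
intrinsic = np.load('data/00000/intrinsic.npy')  # e.g. (3, 3) intrinsic matrix
print('cam2world:', cam2world.shape)
print('intrinsic:', intrinsic.shape)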

SOF can also be trained on your own dataset of multi-view face segmentation maps. Like SRNs, we use an "OpenCV"-style camera coordinate system: the Y-axis points downwards (so the up-vector points in the negative Y-direction), the X-axis points right, and the Z-axis points into the image plane. Camera poses are assumed to be in "camera2world" format, i.e., each pose is the matrix that transforms camera coordinates to world coordinates. Specify --orthogonal during training if your data uses an orthogonal projection. Also note that you may need to adjust the sample_instances_* and sample_observations_* parameters according to the number of instances and views in your dataset.
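
To illustrate this convention, here is a minimal sketch (plain NumPy, not code from this repo) of how a camera2world matrix maps a camera-space point to world space:

import numpy as np

# Stand-in 4x4 camera2world pose: camera at the world origin, axis-aligned.
cam2world = np.eye(4)

# A point one unit in front of the camera (OpenCV style: +Z into the image plane).
p_cam = np.array([0.0, 0.0, 1.0, 1.0])  # homogeneous camera coordinates
p_world = cam2world @ p_cam             # camera coordinates -> world coordinates

# The up-vector points along negative Y in camera coordinates.
up_world = cam2world[:3, :3] @ np.array([0.0, -1.0, 0.0])
print(p_world[:3], up_world)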

Since the accuracy of the camera parameters can strongly affect training, you can specify --opt_cam during training to optimize the camera parameters automatically.

Training

STEP 1: Training network parameters

Training is done in two phases. First, train the network parameters with multi-view segmaps:

python train.py --config_filepath=./configs/face_seg_real.yml 

Training might take 1 to 3 days, depending on dataset size and quality.

STEP 2 (optional): Inverse rendering

We use inverse rendering to expand the trained geometric sampling space with single-view segmaps collected from CelebAMaskHQ. An example config file is provided in ./configs/face_seg_single_view.yml; note that we set --overwrite_embeddings and --freeze_networks to True, and point --checkpoint_path to the checkpoint trained in STEP 1. After training, you can access the corresponding latent code for each portrait by loading the checkpoint.

python train.py --config_filepath=./configs/face_seg_single_view.yml 

The same process can be used to back-project in-the-wild portrait images into latent vectors in the SOF geometric sampling space, which can then be used for multiview portrait generation.
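
For example, the latent codes can be read back from the checkpoint roughly as follows. This is a sketch only: the checkpoint path and the key holding the per-instance embeddings are assumptions, so inspect the checkpoint's keys to find where they are actually stored:

import torch

ckpt = torch.load('./checkpoints/face_seg_single_view.pth', map_location='cpu')  # hypothetical path
print(ckpt.keys())  # locate the entry that stores the per-instance embeddings

# Hypothetical key name -- replace with the one found above:
# latents = ckpt['latent_codes']  # one latent code per portrait
# z0 = latents[0]                 # latent code for the first instance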

Pretrained Checkpoints

Please download the pre-trained checkpoint from either GoogleDrive or BaiduDisk (password: k0b8) and save it to ./checkpoints.

Inference

Please follow renderer.ipynb in the SofGAN repo for free-view portrait generation.

Once trained, SOF can generate free-view segmentation maps for arbitrary instances in the geometric space. Inference code is provided as notebooks under scripts:

  • Most of the testing code is included in scripts/TestAll.ipynb, e.g. generating multiview images, modifying attributes, visualizing depth layers, and building a depth prior with marching cubes.
  • To sample free-view portrait segmentations from the geometry space, refer to scripts/Test_MV_Inference.ipynb.
  • To visualize a trained SOF volume as in Fig. 5, use scripts/Test_Slicing.ipynb.
  • To calculate mIoU during SOF training (Fig. 9), modify the model checkpoint directory and run scripts/Test_mIoU.ipynb (a generic mIoU sketch follows this list).
  • We also provide scripts/Test_GMM.ipynb for miscellaneous tasks such as fitting a GMM to the geometric space.
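
For reference, mIoU between two segmentation label maps can be computed as below. This is a generic sketch of the metric, not the exact code from scripts/Test_mIoU.ipynb:

import numpy as np

def mean_iou(pred, gt, num_classes):
    # pred, gt: integer label maps of shape (H, W)
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))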

Acknowledgment

Thanks to vsitzmann for sharing the awesome idea of SRNs, which greatly inspired our design of SOF.