To fit or not to fit: Model-based Face Reconstruction and Occlusion Segmentation from Weak Supervision
Chunlu Li, Andreas Morel-Forster, Thomas Vetter, Bernhard Egger*, and Adam Kortylewski*
This is a PyTorch implementation of the paper above.
This work enables a model-based face autoencoder to segment occlusions accurately for 3D face reconstruction, yielding state-of-the-art occlusion segmentation and face reconstruction that is robust to occlusions. It requires only weak supervision for the face reconstruction subnetwork and can be trained end-to-end efficiently. The effectiveness of the method is verified on the CelebA-HQ dataset, the AR dataset, and the NoW Challenge.
- ArcFace features for the perceptual-level loss.
- Better-tuned hyperparameters for higher reconstruction accuracy.
- Test and evaluation code released. 3D shapes (.obj meshes), rendered faces, and estimated masks are available. Evaluation metrics (accuracy, precision, recall, and F1 score) are available; a sketch of how they can be computed is given below.
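The following is a minimal sketch of how these four mask metrics can be computed from a predicted visibility mask and a ground-truth mask; the function name `mask_metrics` and the 0.5 threshold are illustrative assumptions and not taken from the released evaluation code.

```python
# Illustrative only: computes accuracy, precision, recall, and F1 score for a
# predicted visible-skin mask against the ground-truth mask.
import numpy as np

def mask_metrics(pred_mask, gt_mask, threshold=0.5):
    """Both masks are HxW arrays; values above `threshold` count as visible skin."""
    pred = np.asarray(pred_mask) > threshold
    gt = np.asarray(gt_mask) > threshold
    tp = np.logical_and(pred, gt).sum()      # visible pixels predicted as visible
    tn = np.logical_and(~pred, ~gt).sum()    # occluded pixels predicted as occluded
    fp = np.logical_and(pred, ~gt).sum()     # occluded pixels predicted as visible
    fn = np.logical_and(~pred, gt).sum()     # visible pixels predicted as occluded
    accuracy = (tp + tn) / max(tp + tn + fp + fn, 1)
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-8)
    return accuracy, precision, recall, f1
```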
This method provides reliable occlusion segmentation masks and the training of the segmentation network does not require any additional supervision.
This method produces accurate 3D face model fitting results which are robust to occlusions.
[New!] Our method, named 'FOCUS' (Face-autoencoder and OCclUsion Segmentation), reaches the SOTA on the NoW Challenge!
The results of the state-of-the-art methods on the NoW face benchmark are as follows:
Rank | Method | Median(mm) | Mean(mm) | Std(mm) |
---|---|---|---|---|
1. | FOCUS (Ours) | 1.04 | 1.30 | 1.10 |
2. | DECA[Feng et al., SIGGRAPH 2021] | 1.09 | 1.38 | 1.18 |
3. | Deep3DFace PyTorch [Deng et al., CVPRW 2019] | 1.11 | 1.41 | 1.21 |
4. | RingNet [Sanyal et al., CVPR 2019] | 1.21 | 1.53 | 1.31 |
5. | Deep3DFace [Deng et al., CVPRW 2019] | 1.23 | 1.54 | 1.29 |
6. | 3DDFA-V2 [Guo et al., ECCV 2020] | 1.23 | 1.57 | 1.39 |
7. | MGCNet [Shang et al., ECCV 2020] | 1.31 | 1.87 | 2.63 |
8. | PRNet [Feng et al., ECCV 2018] | 1.50 | 1.98 | 1.88 |
9. | 3DMM-CNN [Tran et al., CVPR 2017] | 1.84 | 2.33 | 2.05 |
For more details about the evaluation, please check the NoW Challenge website.
The method is trained in a step-wise manner and is easy to implement.
To train and/or test this work, you need to:
- Prepare .csv files for the training, validation, and test sets. Each row of a .csv file contains a filename followed by the landmark coordinates. We recommend using the 68 2D landmarks detected by 2D-and-3D-face-alignment (see the sketch after this list).
- To evaluate the accuracy of the estimated masks, ground-truth occlusion segmentation masks are required. Please name the target image 'image_name.jpg' and the corresponding ground-truth mask 'image_name_visible_skin_mask.png'.
The image directory should follow the structure below:
./image_root
├── Dataset                      # Database folder containing the train, validation, and test sets.
    ├── 1.jpg                    # Target image
    ├── 1_visible_skin_mask.png  # GT mask for testing (optional for training)
    └── ...
├── train_landmarks.csv          # .csv file for the train set.
├── test_landmarks.csv           # .csv file for the test set.
├── val_landmarks.csv            # .csv file for the validation set.
└── all_landmarks.csv            # .csv file for the whole dataset (optional).
- Our implementation employs the BFM 2017. Please copy 'model2017-1_bfm_nomouth.h5' to './basel_3DMM'.
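Below is a minimal sketch of how the landmark .csv files could be generated with the face-alignment package behind 2D-and-3D-face-alignment. The assumed column layout (filename, x0, y0, ..., x67, y67), the output path, and the `LandmarksType` enum name are illustrative; adapt them to your data loader and to the face-alignment version you have installed.

```python
# Hypothetical helper for building a landmark .csv; not part of the release.
import csv
import glob
import os

import face_alignment            # pip install face-alignment
from skimage import io

# Newer face-alignment versions use LandmarksType.TWO_D; older ones use LandmarksType._2D.
fa = face_alignment.FaceAlignment(face_alignment.LandmarksType.TWO_D, flip_input=False)

with open('./image_root/train_landmarks.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for path in sorted(glob.glob('./image_root/Dataset/*.jpg')):
        preds = fa.get_landmarks(io.imread(path))
        if preds is None:        # skip images where no face is detected
            continue
        landmarks = preds[0]     # (68, 2) array for the first detected face
        row = [os.path.basename(path)] + [f'{v:.2f}' for v in landmarks.reshape(-1)]
        writer.writerow(row)     # filename, x0, y0, ..., x67, y67 (assumed layout)
```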
We rely on ArcFace to compute perceptual features for the target images and the rendered images; a sketch of the resulting perceptual loss is given after the directory listing below.
- Download the trained model.
- Place ms1mv3_arcface_r50_fp16.zip and backbone.pth under ./Occlusion_Robust_MoFA/models/.
- To install ArcFace, run the following commands:
cd ./Occlusion_Robust_MoFA
git clone https://github.com/deepinsight/insightface.git
cp -r ./insightface/recognition/arcface_torch/* ./models/
- Overwrite './models/backbones/iresnet.py' with the file in our repository.
The structure of the directory 'models' should be:
./models
├── ms1mv3_arcface_r50_fp16
    ├── backbone.pth
    └── ...                      # Trained model downloaded.
├── backbones
    ├── iresnet.py               # Overwritten by our code.
    └── ...
└── ...                          # Files/directories downloaded from the ArcFace repo.
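As a rough illustration of how these files are used, the perceptual term can be computed as a cosine distance between ArcFace embeddings of the target and the rendered face. The sketch below assumes the `iresnet50` constructor from the ArcFace backbones, 112x112 inputs normalized to [-1, 1], and the checkpoint path from the directory listing above; the exact preprocessing and loss weighting in our training code may differ.

```python
# A minimal sketch of an ArcFace-based perceptual loss; the constructor name,
# input size, and normalization follow common ArcFace usage and are assumptions.
import torch
import torch.nn.functional as F

from models.backbones.iresnet import iresnet50   # ArcFace backbone installed above

arcface = iresnet50()
arcface.load_state_dict(torch.load('./models/ms1mv3_arcface_r50_fp16/backbone.pth',
                                   map_location='cpu'))
arcface.eval()
for p in arcface.parameters():    # keep the recognition network frozen
    p.requires_grad_(False)

def perceptual_loss(target, rendered):
    """Cosine distance between ArcFace embeddings.

    target, rendered: float tensors of shape [B, 3, 112, 112], values in [-1, 1].
    """
    feat_t = F.normalize(arcface(target), dim=-1)
    feat_r = F.normalize(arcface(rendered), dim=-1)
    return (1.0 - (feat_t * feat_r).sum(dim=-1)).mean()
```

Freezing the recognition network keeps it a fixed feature extractor, so gradients only flow back through the rendered image into the face model parameters.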
We recommend using Anaconda or Miniconda to create a virtual environment and install the packages. You can set up the environment with the following commands:
conda create -n FOCUS python=3.6
conda activate FOCUS
pip install -r requirements.txt
To train the proposed network, please follow the steps below:
- Enter the directory
cd ./Occlusion_Robust_MoFA
- Unsupervised Initialization
python Step1_Pretrain_MoFA.py --img_path ./image_root/Dataset
- Generate UNet Training Set
python Step2_UNet_trainset_generation.py --img_path ./image_root/Dataset
- Pretrain Unet
python Step3_Pretrain_Unet.py
- Joint Segmentation and Reconstruction
python Step4_UNet_MoFA_EM.py --img_path ./image_root/Dataset
- Test-time adaptation (optional)
To bridge the domain gap between the training and test data and reach higher performance on the test set, test-time adaptation is available with the following command:
python Step4_UNet_MoFA_EM.py --img_path ./image_root/Dataset_adapt --pretrained_model iteration_num
To test the model saved as './MoFA_UNet_Save/model-path/model-name', use the command below:
python Demo.py --img_path ./image_root/Dataset --pretrained_model_test ./MoFA_UNet_Save/model-path/model-name.model --test_mode pipeline_name --test_path test_dataset_root --save_path save_path --landmark_list_name landmark_filename_optional.csv
Please cite the following papers if this model helps your research:
@article{li2021fit,
title={To fit or not to fit: Model-based Face Reconstruction and Occlusion Segmentation from Weak Supervision},
author={Li, Chunlu and Morel-Forster, Andreas and Vetter, Thomas and Egger, Bernhard and Kortylewski, Adam},
journal={arXiv preprint arXiv:2106.09614},
year={2021}}
This code is built on top of the MoFA re-implementation by Tatsuro Koizumi. If you build your own work on top of ours, please also cite the following paper:
@inproceedings{koizumi2020look,
title={“Look Ma, no landmarks!”--Unsupervised, model-based dense face alignment},
author={Koizumi, Tatsuro and Smith, William AP},
booktitle={European Conference on Computer Vision},
pages={690--706},
year={2020},
organization={Springer}
}