Face Parsing
Implementation of a one-stage Face Parsing model using Unet architecture.
The face is divided into 10 classes (background, eyes, nose, lips, ear, hair, teeth, eyebrows, general face, beard).
This results in regions of interest that are difficult to segment. We address this problem by implementing different loss functions:
- Tversky focal loss
- Weighted Cross entropy
- α-balanced focal loss
The latter leads to a better accuracy of 0.90 (IOU-train) and 0.72 (IOU-test) for the non fine-tuned model, with a Unet-16 driven on 1 gpu.
Implementation
Language version : Python 3.7.6
Operating System : MacOS Catalina 10.15.4
Framework: Pytorch
Results
We can observe the probability map for each channel.
It allows us to estimate how much a pixel belongs to a specific part of the face.
Getting Started
Install dependencies:
# use a virtual-env to keep your packages unchanged
>> pip install -r requirements.txt
Train the model:
# Train the model
>> ./main.py
Inference:
from src.model import Unet
from src.framework import Context
img = "./img_relative_path.png"
# load model
model = torch.load("my_model.pt", map_location=torch.device('cpu'))
# create context
ctx = Context()
# predict
yhat = ctx.predict(model, img, plot=True)
References
- Mut1ny Face/head dataset
- U-Net: Convolutional Networks for Biomedical Image Segmentation Olaf Ronneberger, Philipp Fischer, and Thomas Brox. Tech report, arXiv, May 2015.
- A novel Focal Tversky loss function with improved Attention U-Net for lesion segmentation Abraham, Nabila and Khan, Naimul Mefraz. Tech report, arXiv, Oct 2018.
- Focal Loss for Dense Object Detection Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár. Tech report, arXiv, Feb 2018.