Yolov3-facial-landmark-detection

This repository contains files for training and testing Yolov3 for multi-task face detection and facial landmarks extraction.

Example Output

P.S. A jupyter-notebook for all parts can be found here.

Installation

  • First, clone the repository.
git clone https://github.com/sefaburakokcu/yolov3-facial-landmark-detection.git
  • Then, install prerequisites.
pip install -r requirements.txt

Training

  1. To train the models on the Widerface dataset, first download the dataset from the Widerface website. Then, under data/datasets/, run
python widerface_yolo_format.py

Alternatively, download the Widerface training dataset in YOLO format directly from Google Drive and put the images folder under data/datasets/widerface/.

  2. Under the src folder, run
python train.py
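
The YOLO-format conversion mentioned in step 1 can be illustrated with a minimal sketch. It assumes the Widerface convention of pixel boxes given as top-left corner plus width and height, and the usual YOLO label line of a class id followed by a normalized box center and size. The function name to_yolo_box is hypothetical, and the sketch covers only the box part of a label; the actual widerface_yolo_format.py may differ, for example in how landmarks are stored.

```python
def to_yolo_box(x1, y1, w, h, img_w, img_h, class_id=0):
    """Convert a pixel box (top-left x1, y1, width w, height h) to a YOLO label line.

    Hypothetical helper for illustration; not taken from the repository.
    """
    # YOLO stores the box center, normalized by the image size.
    cx = (x1 + w / 2) / img_w
    cy = (y1 + h / 2) / img_h
    # Width and height are normalized the same way.
    return f"{class_id} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"


# A 200x100 box whose top-left corner is at (100, 50) in a 1000x500 image.
print(to_yolo_box(100, 50, 200, 100, img_w=1000, img_h=500))
# prints: 0 0.200000 0.200000 0.200000 0.200000
```

Normalizing by the image size keeps the labels valid when images are resized during training.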

Inference

Pretrained weights can be used for inference. They can be downloaded from Google Drive. After downloading the weights, put them all in the weights folder under the project's main folder.

Under the src folder, run

python inference.py

Tests

  1. To evaluate the models, first download the Widerface validation dataset from the Widerface website and the WFLW dataset from the WFLW website or Google Drive, and put the WFLW dataset under data/datasets/wflw/.

  2. Then, under src, run

python test.py

to save the face detection and facial landmark predictions.

  3. Finally, under src/evaluations/widerface/, run
python evaluate_widerface.py

for face detection performance and under src/evaluations/wiflw/, run

python evaluate_wflw.py

for facial landmarks extraction performance.

Face Detection

Evaluation of models on Widerface Validation dataset for face detection is indicated below. Average Precision is used as a performance metric.

| Model | Easy | Medium | Hard |
|---|---|---|---|
| Mobilenetv2(0.75) | 0.85 | 0.83 | 0.63 |
| Mobilenetv2(1.0) | 0.87 | 0.86 | 0.69 |
| Retinaface(Mobilenetv2(0.25)) | 0.90 | 0.87 | 0.67 |
| Retinaface(Resnet50) | 0.93 | 0.91 | 0.69 |
| MTCNN | 0.79 | 0.76 | 0.50 |
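
For readers unfamiliar with the metric, the following is a minimal sketch of how Average Precision can be computed for a single image. It is not the official Widerface evaluation protocol and not necessarily what evaluate_widerface.py does. The function names, the (x1, y1, x2, y2) box layout, and the 0.5 IoU threshold are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_thr=0.5):
    """detections: list of (score, box); ground_truth: list of boxes."""
    if not ground_truth:
        return 0.0
    matched = set()
    tp_flags = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue  # each ground-truth box may be matched only once
            v = iou(box, gt)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp_flags.append(1)
        else:
            tp_flags.append(0)
    # Precision and recall after each detection.
    tp, precisions, recalls = 0, [], []
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / len(ground_truth))
    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


faces = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
print(round(average_precision(dets, faces), 4))  # prints 0.8333
```

In this example the second detection is a false positive, which lowers the precision at full recall and gives an AP below 1.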

Facial Landmarks Extraction

Evaluation of the models on the WFLW dataset for facial landmark extraction is shown below. Average root mean square error (RMSE) is used as the performance metric.

| Model | RMSE |
|---|---|
| Mobilenetv2(0.75) | 6.53 |
| Mobilenetv2(1.0) | 4.36 |
| Retinaface(Mobilenetv2(0.25)) | 4.03 |
| Retinaface(Resnet50) | 0.93 |
| MTCNN | 4.5 |
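
As an illustration of the metric, here is a minimal sketch of one plausible reading of "average RMSE": the RMSE is computed over the landmark points of each face and then averaged over faces. This definition, the function names, and the use of raw (x, y) coordinates are assumptions; evaluate_wflw.py may define or normalize the error differently.

```python
import math


def landmark_rmse(pred, target):
    """RMSE between two equal-length lists of (x, y) landmark points for one face."""
    squared = [(px - tx) ** 2 + (py - ty) ** 2
               for (px, py), (tx, ty) in zip(pred, target)]
    return math.sqrt(sum(squared) / len(squared))


def average_rmse(preds, targets):
    """Mean of the per-face RMSE values over all faces."""
    return sum(landmark_rmse(p, t) for p, t in zip(preds, targets)) / len(preds)


# Two faces with two landmarks each: the first is off by 5 units per point,
# the second is predicted exactly.
preds = [[(3, 4), (3, 4)], [(1, 1), (2, 2)]]
targets = [[(0, 0), (0, 0)], [(1, 1), (2, 2)]]
print(average_rmse(preds, targets))  # prints 2.5
```

Lower values are better, since the metric measures the distance between predicted and annotated landmark positions.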

References