yolo-face-parts-detector

👁 Using YOLOv8 to detect face parts

Primary language: Python. License: GNU Affero General Public License v3.0 (AGPL-3.0).

Face parts detection with YOLOv8 🎯

If you find my work useful, please cite it as:

Hernández Montilla, I (2024) Face parts detection with YOLOv8, DOI: https://doi.org/10.5281/zenodo.12507625

Introduction

In this project I use Ultralytics' implementation of YOLOv8. The goal is to train a model that detects individual face parts without relying on landmark detectors, which do not do well when part of the face is occluded or missing. I also want to combine frontal, semi-frontal and profile face datasets so that the YOLO model works well on all of them.

It is also a great opportunity to try out the supervision library by Roboflow. It looks really helpful for some common YOLO-related tasks such as drawing the detections.

A live demo of YOLOv8 nano

This project uses Python 3.11. To install the required packages, use pip install -r requirements.txt in a Python 3.11 environment.

Motivation

All I want these models for is data exploration: checking which face parts can be seen in an image. I'm talking about detecting face parts, which is not the same as detecting faces.

I've been asked many times: why not use facial landmark detectors? The reason is that they do not work well with close-up images or heavily occluded faces, like this one:

An example of a close-up image where facial landmark detection is not possible

Image source: Pexels

I know there are several works on facial landmark detection for occluded faces (such as "Robust face landmark estimation under occlusion"), but they always need a picture of the entire face. To detect face parts in close-up images, I would have to develop something new. And that's exactly what I've done.

Data

For this experiment I'm using a variety of facial landmark detection datasets. Each dataset came in a different structure, so I had to deal with that in prepare_full_dataset.py.
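Since the source datasets provide landmark points while YOLO expects bounding boxes, some conversion step is needed. The snippet below is only a minimal sketch of one way to do it, not the code in prepare_full_dataset.py: the function name `landmarks_to_yolo_label`, the `margin` parameter and the class index are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def landmarks_to_yolo_label(
    points: Iterable[Point],
    class_id: int,
    img_w: int,
    img_h: int,
    margin: float = 0.0,
) -> str:
    """Turn the landmark points of one face part into a YOLO label line.

    Hypothetical helper: the real preparation script may work differently.
    """
    pts = list(points)
    if not pts:
        raise ValueError("at least one landmark point is required")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]

    # Tight box around the points, optionally padded, clipped to the image.
    pad_x = (max(xs) - min(xs)) * margin
    pad_y = (max(ys) - min(ys)) * margin
    x_min = max(0.0, min(xs) - pad_x)
    y_min = max(0.0, min(ys) - pad_y)
    x_max = min(float(img_w), max(xs) + pad_x)
    y_max = min(float(img_h), max(ys) + pad_y)

    # YOLO label format: class x_center y_center width height,
    # with all four box values normalised by the image size.
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


# Three made-up landmark points in a 400x200 image, class index 0.
print(landmarks_to_yolo_label([(100, 50), (200, 50), (150, 100)], 0, 400, 200))
```

Each image would then get a text file with one such line per face part, which is the label layout that Ultralytics' YOLO training expects.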

I am not sharing any of these datasets: they are not mine, and they are 100% accessible from their corresponding sites. I may release the Pexels dataset that I create in the future, though.

Results

Data quality

Some datasets, such as Helen, may generate noisy examples when an image has more than one face but only one set of landmarks (i.e. the ones corresponding to the "main" face in the image). This is probably lowering the measured precision, because the model actually detects the parts of all the faces in these images (which is good, though), and detections on unlabeled faces are counted as false positives. Other datasets, such as AFW, have as many sets of landmarks as there are faces in the images.
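To see why incomplete labels hurt precision, here is a small worked example with made-up numbers (they are not results from this project). Assume an image with three faces where only the main face is labeled with 5 boxes, and the model finds 5 correct boxes on each face.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Standard detection precision and recall from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical image: 3 faces, 5 parts each, but only the main face is labeled.
labeled_boxes = 5
detected_boxes = 15  # the model finds the parts of all three faces

tp = labeled_boxes                   # detections that match a label
fp = detected_boxes - labeled_boxes  # correct detections with no label to match
fn = 0                               # every labeled box was found

p, r = precision_recall(tp, fp, fn)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this hypothetical case recall stays at 1.0 while precision falls to about 0.33, even though every detection is visually correct.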

A training batch with some images with incomplete labels

Performance

In this section you can see the performance of the nano model. It struggles with eyebrows, but it works really well for eyes, mouths, and noses. I would need to add more close-up images of each part to increase the number of incomplete or occluded faces.

YOLOv8-nano F1 curve

Here are the metrics of the nano model:

YOLOv8-nano results

Reports

Use run.py to run the model on a folder of images and obtain a CSV file with all the detections.
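The interface and output columns of run.py are not described here, so the following is only a sketch of what a detections CSV could look like. The `Detection` fields, the function name `write_detections_csv` and the sample values are assumptions for illustration, and the model inference itself is left out.

```python
import csv
from typing import Iterable, NamedTuple


class Detection(NamedTuple):
    """One detected face part (hypothetical column layout)."""
    image: str
    label: str
    confidence: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def write_detections_csv(detections: Iterable[Detection], csv_path: str) -> int:
    """Write one CSV row per detection and return the number of rows written."""
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(Detection._fields)  # header row from the field names
        for det in detections:
            writer.writerow(det)
            count += 1
    return count


# Made-up detections standing in for real model output.
sample = [
    Detection("img_001.jpg", "eye", 0.91, 120.0, 80.0, 160.0, 100.0),
    Detection("img_001.jpg", "mouth", 0.87, 130.0, 150.0, 190.0, 180.0),
]
print(write_detections_csv(sample, "detections.csv"))
```

A flat file with one row per detection is easy to load into pandas or a spreadsheet for the kind of data exploration described in the Motivation section.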