This project provides a tutorial for NVIDIA's Transfer Learning Toolkit (TLT) and DeepStream (DS) SDK, i.e., the training and inference flow for detecting faces with and without a mask on the Jetson platform.
By the end of this project, you will be able to build a DeepStream app on the Jetson platform that detects faces with and without a mask.
- Transfer Learning Toolkit (TLT) scripts:
  - Dataset processing script to convert the input data to KITTI format
  - Specification files for configuring tlt-train, tlt-prune, tlt-evaluate
- DeepStream (DS) scripts:
  - deepstream-app config files (for a demo on a single camera stream and for detection on a stored video file)
- Trained model for face-mask detection; we go through, step by step, how to produce a DetectNet_v2 (with ResNet18 backbone) model for face-mask detection.
- Dataset for faces with and without mask; we suggest the following datasets based on our experiments:
  - Faces with Mask
    - Kaggle Medical Mask Dataset Download Link
    - MAFA - MAsked FAces Download Link
  - Faces without Mask
    - FDDB Dataset Download Link
    - WiderFace Dataset Download Link

Note: We do not use all the images from the MAFA and WiderFace datasets. Combined, we use about 6,000 faces each with and without a mask.
Install dependencies and Docker Container
- On Training Machine with NVIDIA GPU:
  - Install NVIDIA Docker Container: installation instructions are in the TLT Toolkit Requirements
  - Running the Transfer Learning Toolkit using Docker:
    - Pull the docker container:
      docker pull nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3
    - Run the docker image:
      docker run --gpus all -it -v "/path/to/dir/on/host":"/path/to/dir/in/docker" \
          -p 8888:8888 nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3 /bin/bash
  - Clone the Git repo inside the TLT container:
      git clone https://github.com/NVIDIA-AI-IOT/face-mask-detection.git
  - Install data conversion dependencies:
      cd face-mask-detection
      python3 -m pip install -r requirements.txt
- On NVIDIA Jetson:
  - Install the DeepStream SDK
Prepare input data set (On training machine)
We expect the downloaded data in this structure.
Convert data set to KITTI format
cd face-mask-detection
python3 data2kitti.py --kaggle-dataset-path <kaggle dataset absolute directory path> \
    --mafa-dataset-path <mafa dataset absolute directory path> \
    --fddb-dataset-path <FDDB dataset absolute directory path> \
    --widerface-dataset-path <widerface dataset absolute directory path> \
    --kitti-base-path <output directory for storing KITTI formatted annotations> \
    --category-limit <category limit for masked and no-mask faces> \
    --tlt-input-dims_width <tlt input width> \
    --tlt-input-dims_height <tlt input height> \
    --train    # for generating the training dataset
You will see the following output log:
Kaggle Dataset: Total Mask faces: 4154 and No-Mask faces:790
Total Mask Labelled:4154 and No-Mask Labelled:790
MAFA Dataset: Total Mask faces: 1846 and No-Mask faces:232
Total Mask Labelled:6000 and No-Mask Labelled:1022
FDDB Dataset: Mask Labelled:0 and No-Mask Labelled:2845
Total Mask Labelled:6000 and No-Mask Labelled:3867
WideFace: Total Mask Labelled:0 and No-Mask Labelled:2134
----------------------------
Final: Total Mask Labelled:6000
Total No-Mask Labelled:6001
----------------------------
Note: You might get warnings; you can safely ignore them.
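For reference, KITTI detection labels are plain-text files with one line per object. For this 2D detection task only the class name and the bounding-box corners are meaningful; the remaining 3D fields are written as zeros. Assuming the class names used here are mask and no-mask (check the label files generated for your run), a generated label line might look like this:

    mask 0.00 0 0.00 123.00 85.00 241.00 210.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00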
Perform training using TLT training flow
- Use the 'face-mask-detection' Jupyter notebook provided with this repository (a sketch for launching it inside the TLT container is shown below).
- Follow the TLT training flow (outlined step by step later in this document).
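The notebook can be served from inside the TLT container and reached from a browser on the host through the 8888 port mapping used in the docker run command above. The command below is standard Jupyter usage rather than anything specific to this repository:

    # run inside the TLT container, from the face-mask-detection directory
    jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root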
Perform inference using DeepStream SDK on Jetson
- Transfer the model files (.etlt) to the Jetson device; for int8, also transfer the calibration file (calibration.bin).
- Use the config files from /ds_configs/*:
  $ vi config_infer_primary_masknet.txt
  - Modify the model and label paths according to your directory locations: look for tlt-encoded-model, labelfile-path, model-engine-file, int8-calib-file.
  - Modify the confidence threshold and class attributes according to your training: look for classifier-threshold, class-attrs.
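  For illustration only, the path-related entries in config_infer_primary_masknet.txt could look roughly like the sketch below; the file names and locations are placeholders, so substitute your actual paths and the key used when exporting the model.

      [property]
      # encrypted TLT model and the key it was exported with (placeholder values)
      tlt-encoded-model=/home/nvidia/face-mask-detection/detectnet_resnet18.etlt
      tlt-model-key=<your model key>
      # label file with the mask / no-mask class names
      labelfile-path=/home/nvidia/face-mask-detection/labels.txt
      # TensorRT engine produced on first run (or by tlt-converter)
      model-engine-file=/home/nvidia/face-mask-detection/detectnet_resnet18.etlt_b1_gpu0_int8.engine
      # only needed when running in int8 mode
      int8-calib-file=/home/nvidia/face-mask-detection/calibration.bin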
- Use the deepstream_config files:
  $ vi deepstream_app_source1_masknet.txt
  - Modify the model file and config file paths: look for model-engine-file and config-file under [primary-gie].
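  As a rough sketch (paths again placeholders), the [primary-gie] group of the deepstream-app config then points at the engine file and the nvinfer config edited above:

      [primary-gie]
      enable=1
      # TensorRT engine to load (placeholder path)
      model-engine-file=/home/nvidia/face-mask-detection/detectnet_resnet18.etlt_b1_gpu0_int8.engine
      # nvinfer configuration from the previous step
      config-file=config_infer_primary_masknet.txt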
- Use deepstream-app to deploy in real time:
  $ deepstream-app -c deepstream_app_source1_video_masknet_gpu.txt
- We provide two different config files:
  - DS running on GPU only with camera input: deepstream_app_source1__camera_masknet_gpu.txt
  - DS running on GPU only with saved video input: deepstream_app_source1_video_masknet_gpu.txt
Note:
- model-engine-file is generated at the first run; once generated, you can find it in the same directory as the .etlt file.
- If you want to generate the model-engine-file before the first run, use tlt-converter.
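A minimal sketch of generating the engine ahead of time with tlt-converter follows; the key, input dimensions and output node names are assumptions (DetectNet_v2 defaults and a 960x544 training resolution), so check tlt-converter -h and your training spec for the exact values:

    # run on the Jetson device; all values below are placeholders
    tlt-converter -k <your model key> \
        -d 3,544,960 \
        -o output_cov/Sigmoid,output_bbox/BiasAdd \
        -t fp16 \
        -e detectnet_resnet18.engine \
        detectnet_resnet18.etlt
    # for int8, use -t int8 together with -c calibration.bin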
Evaluation results (mAP and inference performance on Jetson Nano, Xavier NX and Xavier):

| Pruned | mAP (Mask/No-Mask) (%) | Nano GPU (FPS) | Xavier NX GPU (FPS) | Xavier NX DLA (FPS) | Xavier GPU (FPS) | Xavier DLA (FPS) |
|--------|------------------------|----------------|---------------------|---------------------|------------------|------------------|
| No | 86.12 (87.59, 84.65) | 6.5 | 125.36 | 30.31 | 269.04 | 61.96 |
| Yes (12%**) | 85.50 (86.72, 84.27) | 21.25 | 279 | 116.2 | 508.32 | 155.5 |
The TLT training flow consists of the following steps:

1. Download the pre-trained model (for the mask detection application we have experimented with DetectNet_v2 with a ResNet18 backbone)
2. Convert the dataset to KITTI format
3. Train the model (tlt-train)
4. Evaluate on validation data or infer on test images (tlt-evaluate, tlt-infer)
5. Prune the trained model (tlt-prune)
   Pruning the model reduces the parameter count and thus improves FPS performance
6. Retrain the pruned model (tlt-train)
7. Evaluate the retrained model on validation data (tlt-evaluate)
8. If accuracy in step (7) has not fallen below the satisfactory range, repeat steps (5), (6) and (7); otherwise go to step (9)
9. Export the trained model from step (6) (tlt-export)
   Choose int8 or fp16 based on your platform needs; for example, Jetson Xavier and Jetson Xavier NX have int8 DLA support
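For step (9), a sketch of an fp16 export with tlt-export is shown below; the model paths and experiment layout are assumptions modeled on typical TLT notebooks, and int8 export additionally requires the calibration options described in the TLT documentation:

    # run inside the TLT container; paths and key are placeholders
    tlt-export detectnet_v2 \
        -m /workspace/experiments/resnet18_detector_pruned_retrained.tlt \
        -o /workspace/experiments/resnet18_detector.etlt \
        -k <your model key> \
        --data_type fp16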
- Transfer Learning Toolkit (TLT) Getting Started
- Pruning Models with NVIDIA Transfer Learning Toolkit
- Evan Danilovich (2020, March). Medical Masks Dataset. Version 1. Retrieved May 14, 2020 from https://www.kaggle.com/ivandanilovich/medical-masks-dataset
- Shiming Ge, Jia Li, Qiting Ye, Zhao Luo; "Detecting Masked Faces in the Wild With LLE-CNNs", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2682-2690
- Vidit Jain and Erik Learned-Miller. "FDDB: A Benchmark for Face Detection in Unconstrained Settings". Technical Report UM-CS-2010-009, Dept. of Computer Science, University of Massachusetts, Amherst. 2010
- Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou; "WIDER FACE: A Face Detection Benchmark", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
- MAFA Dataset Google link: courtesy of aome510