Wenzheng Zeng<sup>1</sup>, Yang Xiao<sup>1†</sup>, Sicheng Wei<sup>1</sup>, Jinfang Gan<sup>1</sup>, Xintao Zhang<sup>1</sup>, Zhiguo Cao<sup>1</sup>, Zhiwen Fang<sup>2</sup>, Joey Tianyi Zhou<sup>3</sup>

<sup>1</sup>Huazhong University of Science and Technology, <sup>2</sup>Southern Medical University, <sup>3</sup>A*STAR
This repository contains the official implementation of the CVPR 2023 paper "Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video".
## Highlights

- New Task: We formally define and explore the task of instance-level multi-person eyeblink detection in untrimmed videos for the first time. A good multi-person eyeblink detection algorithm should (1) detect and track human instances' faces reliably, to ensure instance-level analysis across the whole video, and (2) detect eyeblink boundaries accurately within each human instance, to ensure precise awareness of their eyeblink behaviors. We design new metrics that attend to both instance awareness quality and eyeblink detection quality (a sketch of the underlying matching idea is given after this list).
- New Dataset: To support this research task, we introduce MPEblink. It is multi-instance, unconstrained, and untrimmed, which makes it more challenging and brings it closer to real-world demands.
- New Framework: We propose InstBlink, a one-stage multi-person eyeblink detection method that jointly performs face detection, face tracking, and instance-level eyeblink detection. This task-joint paradigm benefits the sub-tasks uniformly. Benefiting from the one-stage design, InstBlink is also highly efficient, especially in multi-instance scenarios.
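Below is a minimal, self-contained sketch of the matching idea behind such instance-level evaluation: within each tracked instance, predicted blink intervals are matched to ground-truth intervals by temporal IoU. The function names, the greedy matching, and the 0.5 threshold are illustrative assumptions, not the official metric; the evaluation code in this repository is the reference.

```python
# Sketch (NOT the official metric): match predicted blink intervals to
# ground-truth intervals of one tracked instance by temporal IoU.

def temporal_iou(a, b):
    """IoU of two blink intervals given as (start_frame, end_frame)."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union if union > 0 else 0.0

def match_blinks(pred, gt, iou_thr=0.5):
    """Greedy matching; returns (true positives, #predictions, #ground truths)."""
    matched, tp = set(), 0
    for p in sorted(pred):
        best, best_iou = None, iou_thr
        for i, g in enumerate(gt):
            iou = temporal_iou(p, g)
            if i not in matched and iou >= best_iou:
                best, best_iou = i, iou
        if best is not None:
            matched.add(best)
            tp += 1
    return tp, len(pred), len(gt)

# One instance, two predictions, two annotated blinks: one is localized well.
tp, n_pred, n_gt = match_blinks(pred=[(10, 18), (40, 50)], gt=[(11, 19), (80, 90)])
print(tp / n_pred, tp / n_gt)  # precision 0.5, recall 0.5
```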
## Installation

- Create a new conda environment:

  ```bash
  conda create -n instblink python=3.9
  conda activate instblink
  ```

- Install PyTorch (1.7.1 is recommended), scipy, tqdm, and pandas.

- Install MMDetection. Install MMCV first (1.4.8 is recommended), then install this repository:

  ```bash
  cd MPEblink
  pip install -v -e .
  ```
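As an optional sanity check after installation, you can confirm that the core dependencies import with the recommended versions:

```python
# Optional post-install sanity check for the recommended versions.
import torch
import mmcv
import mmdet

print(torch.__version__)  # recommended: 1.7.1
print(mmcv.__version__)   # recommended: 1.4.8
print(mmdet.__version__)
```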
## Data Preparation

- Download the MPEblink dataset. Remember to change the dataset root path to yours in `configs/base/mpeblink.py`.

- Convert the videos to raw frames:

  ```bash
  python tools/dataset_converters/mpeblink_build_raw_frames_dataset.py --root $YOUR_DATA_PATH
  ```
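If you want a feel for what the conversion step produces, its core is standard frame dumping. Below is an illustrative OpenCV sketch; the provided script is authoritative, and the per-frame file naming here is an assumption:

```python
# Illustrative only: dump a video to raw frames with OpenCV.
# The repository script (mpeblink_build_raw_frames_dataset.py) is authoritative;
# the output naming/layout below is an assumption.
import os
import cv2

def video_to_frames(video_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"{idx:06d}.jpg"), frame)
        idx += 1
    cap.release()
    return idx  # number of frames written
```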
## Demo

You can put some videos in `demo_video/source_video/` and get the visualized inference results in `demo_video/visual_result/` by running:

```bash
bash tools/code_for_demo/demo.sh
```
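Once the script finishes, the rendered videos should appear under `demo_video/visual_result/`; a quick way to check (path taken from the description above):

```python
# List the rendered demo outputs.
from pathlib import Path

print(sorted(Path("demo_video/visual_result").iterdir()))
```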
## Evaluation

- Download the pre-trained model from Google Drive or Baidu Drive (code: avk9) and put it in the `pretrained_models` directory.

- Run `tools/test_eval.sh` for inference and evaluation. Remember to change the dataset path to yours first:

  ```bash
  bash tools/test_eval.sh
  ```
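If evaluation fails to start, a quick way to rule out a corrupted download is to deserialize the checkpoint directly. The filename below is a placeholder; use the actual name of the file you downloaded:

```python
# Sanity-check that the downloaded checkpoint loads.
import torch

# Placeholder filename; substitute the actual downloaded file name.
ckpt = torch.load("pretrained_models/instblink.pth", map_location="cpu")
print(type(ckpt), list(ckpt)[:5])  # MMDetection-style checkpoints are dicts
```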
## Training

- Download the pretrained tevit_r50 model and place it in the `pretrained_models` directory.

- Run `tools/train.sh` to begin training:

  ```bash
  bash tools/train.sh
  ```
## Acknowledgement

This code is inspired by TeViT and MMDetection. Thanks for their great contributions to the computer vision community.
## Citation

If you find our work useful in your research, please consider citing our paper:
```bibtex
@inproceedings{zeng2023real,
  title={Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video},
  author={Zeng, Wenzheng and Xiao, Yang and Wei, Sicheng and Gan, Jinfang and Zhang, Xintao and Cao, Zhiguo and Fang, Zhiwen and Zhou, Joey Tianyi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={13854--13863},
  year={2023}
}
```