Video Face Detection

This Python script processes video files in a given directory and detects frames with multiple faces using the MTCNN (Multi-task Cascaded Convolutional Networks) face detection model from the facenet_pytorch library. Videos with more than a specified number of frames containing multiple faces are copied to an output directory, and their names are logged in a file.

Requirements

Python 3.x
PyTorch
torchvision
torchaudio
CUDA (for GPU acceleration)
facenet_pytorch
opencv-python
tqdm

Installation

Install PyTorch, torchvision, and torchaudio with CUDA support:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Install the remaining dependencies:

pip install facenet_pytorch opencv-python tqdm

Usage

Clone the repository or download the script file.
Open a terminal and navigate to the directory containing the script.
Run the script with the following command:

python process_videos_torch_parallel.py --video_dir /path/to/video/directory --output_dir /path/to/output/directory --log_file /path/to/log/file.txt

Replace the placeholder paths as follows:

/path/to/video/directory is the directory containing the video files you want to process.
/path/to/output/directory is the directory where videos with multiple faces are saved.
/path/to/log/file.txt is the log file that records the names of the copied videos.

The script processes every video in the specified directory, copies the videos that exceed the multiple-face threshold to the output directory, and writes their names to the specified log file.
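The three flags in the command above could be parsed along the following lines. This is a minimal sketch using the standard argparse module, not the script's actual code: only the flag names come from the command shown in this README, while the build_parser helper, the help strings, and the choice to make every flag required are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags used in the command above.

    Assumption: all three flags are required; the real script may
    define defaults or additional options.
    """
    parser = argparse.ArgumentParser(
        description="Copy videos that contain many frames with multiple faces."
    )
    parser.add_argument("--video_dir", required=True,
                        help="Directory containing the input videos")
    parser.add_argument("--output_dir", required=True,
                        help="Directory that receives the copied videos")
    parser.add_argument("--log_file", required=True,
                        help="Text file listing the names of the copied videos")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
example_args = build_parser().parse_args([
    "--video_dir", "videos",
    "--output_dir", "multi_face",
    "--log_file", "copied.txt",
])
print(example_args.video_dir, example_args.output_dir, example_args.log_file)
```

Calling parse_args() with no argument list would read the flags from the real command line instead.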

Code Overview

The script uses the MTCNN face detection model from the facenet_pytorch library to detect faces in video frames.
It processes the videos in parallel using multiprocessing to speed up the execution.
For each video, it samples frames at a specified interval (default is every 45 frames) and checks whether each sampled frame contains more than one face.
If more than a specified number of sampled frames (default is 20) contain multiple faces, the video is copied to the output directory, and its name is logged in the log file.
The script uses OpenCV for reading video frames and tqdm for displaying progress bars.
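
The sampling and threshold logic described above can be sketched as follows. This is an illustration under stated assumptions, not the script's implementation: the function and constant names are hypothetical, sampling is assumed to start at frame 0, and the face detector is passed in as a plain callable so the example runs without a GPU or video file. In the real script the count would presumably come from the MTCNN detector, with frames read through OpenCV.

```python
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")

FRAME_INTERVAL = 45          # default sampling interval stated above
MIN_MULTI_FACE_FRAMES = 20   # default threshold stated above


def sample_frames(frames: Iterable[Frame],
                  interval: int = FRAME_INTERVAL) -> Iterator[Frame]:
    """Yield frames 0, interval, 2 * interval, ... from a frame stream.

    Assumption: sampling starts at the first frame.
    """
    for index, frame in enumerate(frames):
        if index % interval == 0:
            yield frame


def count_multi_face_frames(frames: Iterable[Frame],
                            count_faces: Callable[[Frame], int],
                            interval: int = FRAME_INTERVAL) -> int:
    """Count the sampled frames in which more than one face is found."""
    return sum(1 for frame in sample_frames(frames, interval)
               if count_faces(frame) > 1)


def should_copy(multi_face_frames: int,
                threshold: int = MIN_MULTI_FACE_FRAMES) -> bool:
    """A video qualifies only when it has MORE than `threshold` such frames."""
    return multi_face_frames > threshold


# Stand-in data: each "frame" is simply its face count, so the detector
# is the identity function. A real detector would inspect pixel data.
fake_video = [2] * 1000
hits = count_multi_face_frames(fake_video, count_faces=lambda frame: frame)
print(hits, should_copy(hits))  # 23 True
```

Keeping the detector as a parameter separates the decision rule from the model, so the thresholds can be checked on synthetic input before running the much slower detection step.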

Notes