AVA-AVD

This is the official repo for the AVA-AVD dataset and the AVR-Net model.

Preprint: AVA-AVD: Audio-visual Speaker Diarization in the Wild

Dependencies

Build the environment:

sudo apt-get install ffmpeg
pip install -r requirement.txt
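After running the two commands above, it can help to confirm that ffmpeg is actually reachable from Python before moving on to data preparation. The following is a minimal sketch, not part of this repo: the helper name `missing_tools` is hypothetical, and it only checks that an executable is on PATH, not its version.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the subset of `names` with no matching executable on PATH.

    Hypothetical helper for illustration; it is not provided by this repo.
    """
    return [name for name in names if shutil.which(name) is None]


def report(names: Iterable[str]) -> None:
    """Print a one-line status for the requested command-line tools."""
    missing = missing_tools(names)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```

Calling `report(["ffmpeg"])` prints whether ffmpeg was found; if it is reported missing, rerun the `apt-get` step or install ffmpeg by another route for your platform.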

Data preparation

AVA-AVD Dataset

Training and inference

AVR-Net codebase

Please kindly cite our paper if you find this repo useful:

@inproceedings{xu2022ava,
author = {Xu, Eric Zhongcong and Song, Zeyang and Tsutsui, Satoshi and Feng, Chao and Ye, Mang and Shou, Mike Zheng},
title = {AVA-AVD: Audio-Visual Speaker Diarization in the Wild},
year = {2022},
pages = {3838--3847},
location = {Lisboa, Portugal},
series = {MM '22}
}