AVSR-Dataset-Pipeline

Multi-stage pipeline for generating an audio-visual speech recognition (AVSR) dataset of active-speaker face tracks paired with their transcriptions, built from widely available videos (such as TV data).

Primary language: Python. License: MIT.
