/Audio-Visual-TAD

Centre Stage: Centricity-based Audio-Visual Temporal Action Detection

Primary Language: Python