audio-visual-speech-recognition
There are 18 repositories under the audio-visual-speech-recognition topic.
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, supporting speech recognition, voice activity detection, text post-processing, etc.
smeetrs/deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
ankurbhatia24/MULTIMODAL-EMOTION-RECOGNITION
Human Emotion Understanding using multimodal dataset.
umbertocappellazzo/Llama-AVSR
[ICASSP 2025] Official PyTorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners".
georgesterpu/Taris
Transformer-based online speech recognition system with TensorFlow 2
Sreyan88/LipGER
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
david-gimeno/tailored-avsr
Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"
sungnyun/avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
aidayang/FunASR-OneClick
One-click real-time speech recognition build of FunASR; transcribes both microphone input and audio played on the computer, usable as voice-typing software for the PC.
sungnyun/cav2vec
(ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
lzuwei/end-to-end-multiview-lipreading
End to End Multiview Lip Reading
hmeutzner/kaldi-avsr
Kaldi-based audio-visual speech recognition
karlsimsBBC/cassette-bot
🤖 📼 Command-line tool for remixing videos with time-coded transcriptions.
zulfiqar-ali01/audio-visual-Transcription
Real-Time Audio-Visual Speech Recognition
luomingshuang/lipreading_with_icefall
An experiment using k2, icefall, and Lhotse for lip reading, with plans to adapt the codebase to the lip-reading task and to add support for multiple lip-reading datasets.
Remi-Gau/McGurk_prior_code
Code related to the fMRI experiment on the contextual modulation of the McGurk Effect
MaazKhan98/Multimodal-Emotion-Recognition-speech-facial-and-body-gestures
Human Emotion Understanding using multimodal dataset
tudorhirtopanu/av-matchmaker
Multi-speaker diarization from video using SyncNet’s cross-modal embedding space to match multiple face tracks to corresponding audio tracks.