SuchismitaSahu1993/Lipreading-Using-Mutimodal-Speech-Recognition

Multimodal Speech Recognition for phoneme level prediction using Audio-Visual data from TCDTIMIT dataset implementing RNNs with LSTMs for the audio subnetwork and CNN-LSTMs for the video subnetwork.

Python

Stargazers

635360771
Ezzadein3
KenyonY
Shanghai
Imbecillus
Braunschweig
fastcode3d
qwjaskzxl
Hangzhou
DavieWu
Aidenfaustine
Grit1024
tianshunhan
SPA3K
wuhan, hubei, China
Chenxuey20
Hong Kong
xxxukk