PingYufeng/Lipreading-Using-Mutimodal-Speech-Recognition
Multimodal Speech Recognition for phoneme level prediction using Audio-Visual data from TCDTIMIT dataset implementing RNNs with bidirectional LSTMs for the audio subnetwork and CNN-LSTMs for the video subnetwork.
Python
No issues in this repository yet.