This project develops a deep learning model for lip reading, leveraging state-of-the-art techniques in computer vision and machine learning. Lip reading (also known as speechreading) is the ability to understand speech by watching the movements of a speaker's lips. By training a deep learning model on a large dataset of videos of spoken words, phrases, and sentences, we can build a system that transcribes spoken language from visual cues alone.
- Data Preparation: The project includes scripts for collecting and preprocessing large-scale video datasets suitable for training a lip reading model.
- Model Architecture: The model employs advanced deep learning architectures such as convolutional neural networks (CNNs).
- Training and Evaluation: The model is trained using standard machine learning techniques and evaluated on benchmark datasets to assess its performance and accuracy.
- Deployment: Once trained, the model can be deployed as part of a real-time lip reading system capable of transcribing spoken language from video input in various applications.
- Python
- Keras
- TensorFlow
- NumPy
- Typing
- OpenCV
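The dependency list above might translate into a `requirements.txt` such as the following; the version pins are illustrative, and `typing` is part of the Python standard library, so it needs no entry:

```text
tensorflow>=2.10
numpy>=1.23
opencv-python>=4.7
```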
- Clone this repository to your local machine and install the required dependencies using `pip install -r requirements.txt`.
- Download the dataset and preprocess it using the provided scripts.
- Train the lip reading model using the prepared dataset.
- Evaluate the trained model on benchmark datasets or custom test sets.
- Deploy the model for real-time lip reading applications.
- All of the above steps except step 1 can be performed by running the single file `app.py`.