
A machine learning project that reads a speaker's lip movements from video and predicts what the person is saying.

Primary language: Python

Lip Reader

This project aims to develop a deep learning model for lip reading using techniques from computer vision and machine learning. Lip reading, also known as speechreading, is the ability to understand speech by watching the movements of a speaker's lips. The model is trained on a large dataset of videos containing spoken words, phrases, and sentences, with the goal of transcribing spoken language from visual cues alone.

Features

  • Data Preparation: The project includes scripts for collecting and preprocessing large-scale video datasets suitable for training a lip reading model.
  • Model Architecture: The model is built on deep learning architectures such as convolutional neural networks (CNNs).
  • Training and Evaluation: The model is trained using standard machine learning techniques and evaluated on benchmark datasets to assess its accuracy.
  • Deployment: Once trained, the model can be deployed as part of a real-time lip reading system that transcribes spoken language from video input.
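
As an illustration of the data preparation step, here is a minimal sketch of how a clip of video frames might be turned into model input. It is not taken from this repository's scripts: the function name preprocess_frames and the fixed crop_box are hypothetical, and a real pipeline may locate the mouth with a face or landmark detector instead of a fixed crop.

```python
import numpy as np


def preprocess_frames(frames, crop_box):
    """Turn RGB video frames into a normalized grayscale mouth-region clip.

    frames:   uint8 array of shape (num_frames, height, width, 3)
    crop_box: (top, bottom, left, right) pixel bounds of the mouth region;
              hypothetical fixed values, not taken from this repository
    """
    top, bottom, left, right = crop_box
    frames = np.asarray(frames, dtype=np.float32)

    # Luminance-weighted grayscale conversion (ITU-R BT.601 weights).
    gray = frames @ np.array([0.299, 0.587, 0.114], dtype=np.float32)

    # Keep only the mouth region of every frame.
    mouth = gray[:, top:bottom, left:right]

    # Standardize the whole clip to zero mean and unit variance.
    normalized = (mouth - mouth.mean()) / (mouth.std() + 1e-8)

    # Add a trailing channel axis (channels-last, the Keras default layout).
    return normalized[..., np.newaxis]
```

The result has shape (num_frames, crop_height, crop_width, 1), which is one common way to feed grayscale clips to a CNN.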

Requirements

  • Python
  • Keras
  • TensorFlow
  • NumPy
  • typing (part of the Python standard library since Python 3.5)
  • OpenCV
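
A quick way to confirm that these packages are installed is to check whether each one can be found for import. The sketch below is only a convenience and is not part of this repository; the function name missing_modules is hypothetical. Note that OpenCV is imported under the module name cv2.

```python
import importlib.util

# Import names for the packages listed above; OpenCV is imported as cv2.
REQUIRED_MODULES = ["keras", "tensorflow", "numpy", "typing", "cv2"]


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```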

Usage

  • Clone this repository to your local machine and install the required dependencies using pip install -r requirements.txt
  • Download the dataset and preprocess it using the provided scripts.
  • Train the lip reading model using the prepared dataset.
  • Evaluate the trained model on benchmark datasets or custom test sets.
  • Deploy the model for real-time lip reading applications.
  • Every step after the first (cloning and installing dependencies) can be run through the single file app.py
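
For the evaluation step, this README does not say which metric the project reports. One common choice for transcription tasks is word error rate (WER): the word-level edit distance between the predicted and reference sentences, divided by the number of reference words. The sketch below assumes that metric; the function name word_error_rate is hypothetical and not part of app.py.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1       # reference word missing in hypothesis
            insertion = current[j - 1] + 1   # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref)
```

A perfect prediction scores 0.0, and the value can exceed 1.0 when the prediction contains many extra words.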

Screenshots

Video used for Prediction

Output