Scrawl

Reverse OCR using deep learning

Primary language: Python. License: MIT.

Scrawl Project

Automates assignments, homework, and projects by converting text into the user's own handwriting, using a machine-learning handwritten text recognition (HTR) system 😄.

This project uses the Keras deep learning library to recognize characters in the user's handwriting images, and Tesseract-OCR together with OpenCV-Python to detect characters and compose the output notes as images. Handwritten character recognition is implemented using the EMNIST (Extended MNIST) dataset.

EMNIST is a set of handwritten letters and digits derived from NIST Special Database 19 and converted to a 28x28 pixel image format and a dataset structure that matches the original MNIST dataset.

Sample Image from dataset

Introduction

For those new to Optical Character Recognition (OCR), here is some brief context. The algorithm takes an image of a handwritten character as input and outputs the likelihood that the image belongs to each class (the machine-encoded characters a-z and A-Z). In our case, the goal is to take an image of a handwritten character and determine which letter it is, using a trained handwritten text recognition (HTR) model.
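To make the "likelihood per class" idea concrete, here is a minimal sketch in plain Python. It assumes the model's last layer produces one raw score per class and that a softmax turns those scores into probabilities; the names `CLASSES`, `softmax`, and `predict`, and the example scores, are illustrative and not taken from the project's code.

```python
import math
import string

# Class labels as described above: a-z followed by A-Z (52 classes).
CLASSES = string.ascii_lowercase + string.ascii_uppercase


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the most likely character and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Hypothetical scores: every class scores 0.0 except 'Q', which scores 5.0.
scores = [0.0] * len(CLASSES)
scores[CLASSES.index("Q")] = 5.0
char, prob = predict(scores)
print(char, round(prob, 3))
```

The predicted character is simply the class with the highest probability; the probability itself can serve as a rough confidence value.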

For many years, HTR systems used Hidden Markov Models (HMMs) for the transcription task. More recently, deep learning approaches based on Convolutional Recurrent Neural Networks (CRNNs) have been used to overcome some limitations of HMMs.

CRNN

Overview of CRNN

Training process

The workflow is divided into 3 steps:

  • The input image is fed into the CNN layers to extract features. The output is a feature map.
  • Recurrent layers built from Long Short-Term Memory (LSTM) units let the RNN propagate information over longer distances and provide more robust features for training.
  • Connectionist Temporal Classification (CTC) is applied to the RNN output to calculate the loss value during training and to decode the output into the final text.
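The decoding half of the CTC step can be sketched in a few lines. The example below shows greedy (best-path) decoding, one simple CTC decoding strategy: take the most likely symbol at each time step, merge consecutive repeats, then drop the blank label. It is an illustration only; the project may decode differently (for example with beam search), and the `BLANK` symbol, function name, and toy probabilities are assumptions.

```python
BLANK = "-"  # assumed symbol for the CTC blank label


def ctc_greedy_decode(frame_probs, alphabet):
    """Best-path decoding: argmax per frame, merge repeats, drop blanks.

    frame_probs holds one list of probabilities per time step,
    ordered in the same way as `alphabet`.
    """
    best_path = [
        alphabet[max(range(len(probs)), key=probs.__getitem__)]
        for probs in frame_probs
    ]
    decoded = []
    previous = None
    for symbol in best_path:
        # Emit a symbol only when it differs from the previous frame
        # and is not the blank label.
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)


alphabet = [BLANK, "a", "b"]
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a again (merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank (separates the two a's)
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.10, 0.80],  # b
]
result = ctc_greedy_decode(frames, alphabet)
print(result)
```

The blank label is what allows a genuinely doubled letter to survive the merge step: without the blank frame in the middle, the two runs of "a" would collapse into one.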

All training and prediction runs were conducted on the Google Colaboratory (Colab) platform. By default, the platform offers a Linux operating system with 12 GB of RAM and an Nvidia Tesla T4 GPU with 16 GB of memory (thank you so much, Google ❤).

Tesseract-OCR, together with OpenCV, is used to detect the contours of each letter and crop it into an individual image. OpenCV then uses the detected and recognized letter images to paste each letter onto a background image that we supply.
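The pasting step needs a position for every letter on the background. The sketch below computes those positions with simple line wrapping, in plain Python. It is not the project's code: the function name, the margin, line height, and space width defaults, and the glyph sizes are all hypothetical values chosen for illustration.

```python
def layout_glyphs(text, glyph_sizes, page_width,
                  margin=10, line_height=40, space_width=15):
    """Return (char, x, y) paste positions, wrapping at the right margin.

    glyph_sizes maps each character to the (width, height) in pixels
    of its cropped letter image.
    """
    positions = []
    x, y = margin, margin
    for char in text:
        if char == "\n":
            # Explicit line break: return to the left margin.
            x, y = margin, y + line_height
            continue
        if char == " ":
            x += space_width
            continue
        width, _height = glyph_sizes[char]
        if x + width > page_width - margin:
            # The letter would cross the right margin: wrap first.
            x, y = margin, y + line_height
        positions.append((char, x, y))
        x += width
    return positions


# Hypothetical glyph sizes for two letters.
sizes = {"h": (20, 30), "i": (10, 30)}
positions = layout_glyphs("hi hi", sizes, page_width=70)
for char, x, y in positions:
    # OpenCV images are NumPy arrays, so the actual paste would be a
    # slice assignment along the lines of:
    #   page[y:y + height, x:x + width] = glyph_image
    print(char, x, y)
```

Keeping the layout separate from the pixel copying makes it easy to test wrapping behaviour without loading any images.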

This project uses code from, or inspired by, many other repositories and projects. Links to all of them will be given in the references.

Pre-requisites:

Installation:

  • Clone the repository
git clone https://github.com/Asjidkalam/scrawl/
  • Install the necessary dependencies
pip install -r requirements.txt
  • Change the path to tesseract-ocr's executable in this line.
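If you are unsure where the Tesseract executable lives on your machine, a small standard-library helper can look it up on your PATH. This is a convenience sketch, not part of the project; the function name and the candidate names are assumptions, and the commented-out assignment only applies if the project drives Tesseract through the pytesseract wrapper, which the text above does not confirm.

```python
import shutil


def find_tesseract(candidates=("tesseract", "tesseract.exe")):
    """Return the first Tesseract executable found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


tesseract_path = find_tesseract()
if tesseract_path is None:
    print("Tesseract not found on PATH; set the path by hand.")
else:
    print("Tesseract executable:", tesseract_path)
    # If the project uses pytesseract (an assumption), the path is
    # typically assigned like this:
    #   pytesseract.pytesseract.tesseract_cmd = tesseract_path
```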

Usage:

  • Display the current version
python3 scrawl.py --version
  • Convert text using a handwriting sample image (-hw/--handwriting) and a text file (-t/--text)
python3 scrawl.py -hw my_handwriting.jpg -t test_data.txt
  • Reuse the images used earlier from the font/ directory
python3 scrawl.py --usetrained -t test_data.txt
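For readers curious how a command line like this can be declared, here is a minimal argparse sketch that accepts the same flags. It is an illustration, not the contents of scrawl.py: the help strings, the version placeholder, and the `validate` rule (a text file plus exactly one handwriting source) are assumptions.

```python
import argparse


def build_parser():
    """Declare the flags shown in the usage examples above."""
    parser = argparse.ArgumentParser(prog="scrawl.py")
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version placeholder)")
    parser.add_argument("-hw", "--handwriting",
                        help="image of the user's handwriting")
    parser.add_argument("-t", "--text", help="text file to convert")
    parser.add_argument("--usetrained", action="store_true",
                        help="reuse the images in the font/ directory")
    return parser


def validate(args):
    """Assumed rule: a text file plus exactly one handwriting source."""
    if args.text is None:
        return False
    return (args.handwriting is not None) != args.usetrained


parser = build_parser()
fresh = parser.parse_args(["-hw", "my_handwriting.jpg", "-t", "test_data.txt"])
reuse = parser.parse_args(["--usetrained", "-t", "test_data.txt"])
print(validate(fresh), validate(reuse))
```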

References:

🍰