
Handwritten text recognition with CRNN and Seq2Seq model architectures.


Neural Handwritten Text Recognition

This repository contains experiments in handwritten text recognition with two model architectures: a Convolutional Recurrent Neural Network (CRNN) with CTC, and a Seq2Seq model with attention.
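To make the CTC part concrete, here is a minimal sketch of greedy (best-path) CTC decoding, which turns per-frame class scores into text. It is an illustration only, not necessarily how the notebooks decode: the class list, the blank index of 0, and the toy frame scores are assumptions made for the example.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, classes, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring class index at every time step.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    # Collapse runs of identical indices into a single index.
    collapsed = [index for index, _ in groupby(best_path)]
    # Remove the blank symbol and map the remaining indices to characters.
    return "".join(classes[index] for index in collapsed if index != blank)


# Hypothetical 5-class alphabet; index 0 is the CTC blank.
classes = ["<blank>", "h", "e", "l", "o"]
# Toy one-hot scores following the path h h _ e l l _ l o.
path = [1, 1, 0, 2, 3, 3, 0, 3, 4]
frames = [[1.0 if c == p else 0.0 for c in range(len(classes))] for p in path]
decoded = ctc_greedy_decode(frames, classes)
```

The blank between the two runs of "l" is what lets the decoder keep a genuine double letter instead of merging it.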

The dataset used is the IAM Handwriting Database.

Each model is trained for 20 epochs. The Seq2Seq model is trained twice, with the second run initialized from the weights produced by the first run. This improves the model's score significantly. Alternatively, one can preload and freeze only the CNN weights, to the same effect.
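The two initialization strategies can be sketched without any framework. This is a simplified illustration, not the notebooks' code: weights are plain dicts of lists, and the layer names and the `warm_start` helper are hypothetical. In a real framework this corresponds to loading a checkpoint and marking the chosen layers as non-trainable.

```python
import copy


def warm_start(previous, fresh, frozen_prefixes=()):
    """Return (weights, frozen_names) for a second training run.

    previous / fresh: dicts mapping layer names to lists of floats.
    With no prefixes, every matching layer is preloaded and none is frozen.
    With prefixes, only layers whose names start with one of them are
    preloaded, and those layers are reported as frozen.
    """
    weights = copy.deepcopy(fresh)
    frozen = set()
    prefixes = tuple(frozen_prefixes)
    for name, values in previous.items():
        if name not in weights:
            continue  # layer does not exist in the new model
        if prefixes and not name.startswith(prefixes):
            continue  # keep the fresh initialization for this layer
        weights[name] = list(values)
        if prefixes:
            frozen.add(name)
    return weights, frozen


# Hypothetical layer names and values, for illustration only.
first_run = {"cnn.conv1": [0.5, -0.2], "decoder.attention": [0.1]}
fresh = {"cnn.conv1": [0.0, 0.0], "decoder.attention": [0.0]}

# Strategy 1: start the second run from all first-run weights.
full_weights, full_frozen = warm_start(first_run, fresh)
# Strategy 2: preload and freeze only the CNN weights.
cnn_weights, cnn_frozen = warm_start(first_run, fresh, frozen_prefixes=("cnn.",))
```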

[Figure: comparison plot of the three runs. Orange: Seq2Seq model, first run; red: Seq2Seq model, second run; blue: CRNN.]

The notebooks IAM_SEQ_with_Line_Segmentation and IAM_CRNN_with_Line_Segmentation contain the logic to segment pages into lines using kraken (https://github.com/mittagessen/kraken). The deep learning models described above are then applied to each line to recognize the text on the page.
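The segment-then-recognize flow can be sketched as follows. This is a sketch under stated assumptions rather than the notebooks' implementation: the page is a plain 2D list of pixels, the line boxes are assumed to be `(x0, y0, x1, y1)` tuples already produced by a segmenter, and `recognize_line` is a hypothetical stand-in for either trained model. No kraken calls are shown.

```python
def recognize_page(page, line_boxes, recognize_line):
    """Crop each line box from the page and recognize it, top to bottom.

    page: 2D list of pixel rows.
    line_boxes: (x0, y0, x1, y1) tuples with exclusive x1 and y1.
    recognize_line: callable mapping a cropped line image to a string.
    """
    texts = []
    # Read lines in page order by sorting on the top edge of each box.
    for x0, y0, x1, y1 in sorted(line_boxes, key=lambda box: box[1]):
        crop = [row[x0:x1] for row in page[y0:y1]]
        texts.append(recognize_line(crop))
    return "\n".join(texts)


def describe_crop(crop):
    """Stub recognizer that reports the crop size instead of reading text."""
    return f"{len(crop[0])}x{len(crop)}"


# Toy 6x4 page with two line boxes given out of reading order.
page = [[0] * 6 for _ in range(4)]
boxes = [(0, 2, 6, 4), (1, 0, 5, 2)]
page_text = recognize_page(page, boxes, describe_crop)
```

Passing the recognizer in as a callable keeps the page loop identical for the CRNN and the Seq2Seq model.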

Results:

Target: [image of the handwritten input]

CRNN:

"RESIOENT KENNEDI in ready to get tough over Mat Germny's csh offer to help America's talance of payments position He sid bluntly in Woshington yesterdy that the offer -367 million- was not good ensugh . And he intiated thathis Government would try trgt Germary to pary more . He did not mention perional talks with Dr. Adenauer , the West Germen Chancelor ."

Seq2Seq:

"PRESIDENT KENNEDI is realy to get tough over not Germany's ask offer to # help Aneriea's talance of payments parition He said bluntly in Nashington yestedy that the offer - 357million - was not good enrugh . And he intiated that hat his Government would try toget Germany to pay more . He did not nextion perioul talks with Dr. Adenauer , the West ferman Chancelor ."
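One common way to quantify outputs like these is the character error rate (CER): the Levenshtein edit distance to the reference transcription, divided by the reference length. The README does not say which score it uses, so treating CER as the metric is an assumption, and the "kitten"/"sitting" pair is a textbook example, not data from this repository.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance, keeping only two rows of the DP table."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the length of the reference text."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Textbook example: three edits against a six-character reference.
example_cer = character_error_rate("kitten", "sitting")
# Distance between the first words of the two model outputs above.
first_word_distance = edit_distance("RESIOENT", "PRESIDENT")
```

Computing a CER for the full outputs above would require the ground-truth transcription of the target page, which is only available here as an image.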