
OCR model for Pytorch based on a recent Fully Convolutional Neural Network

Primary language: Jupyter Notebook. License: MIT.


This repository is a work-in-progress implementation of "Accurate, Data-Efficient, Unconstrained Text Recognition with Convolutional Neural Networks". I am not affiliated with the authors of the paper.

I also implemented the excellent data generator of @Belval as a Pytorch dataset. I rely on the Pytorch implementation of Baidu's Warp-CTC loss from @jpuigcerver. I found the performance of Warp-CTC to be better than that of the CTC loss native to Pytorch.
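For readers unfamiliar with CTC, the quantity that both Warp-CTC and the native Pytorch loss compute can be sketched in plain Python. This is only an illustration of the standard CTC forward recursion, not the code used in this repository. The function name `ctc_neg_log_likelihood` and the choice of index 0 for the blank class are assumptions, and it works on raw probabilities for readability, whereas real implementations work in log space for numerical stability.

```python
import math

def ctc_neg_log_likelihood(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (illustrative sketch).

    probs:  list of T frames, each a list of per-class probabilities.
    target: list of label ids, without blanks.
    """
    # Extended target with blanks interleaved: [b, l1, b, l2, ..., b]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S, T = len(ext), len(probs)

    # alpha[s] = total probability of all partial alignments ending in ext[s]
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping a blank is only allowed between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid alignments end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if S > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf

# Two frames, two classes (blank, "a"), uniform probabilities.
# The alignments "aa", "a-" and "-a" all decode to "a": 3 * 0.25 = 0.75.
uniform = [[0.5, 0.5], [0.5, 0.5]]
loss = ctc_neg_log_likelihood(uniform, target=[1])
```

A repeated label such as `[1, 1]` needs at least three frames (label, blank, label), so with the two frames above the function returns infinity.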

Code organization

Pytorch Datasets:

  • fake_texts: A folder of Python files that set up a Pytorch dataset using @Belval's synthetic data generator. The Pytorch dataset is defined in 'pytorch_dataset_fake.py'. I load the data during the init of the dataset, with a particular set of parameters. The generator has many parameters; most of them are hard-set inside the init.
  • IAM_dataset: Contains a simple Pytorch dataset to load the IAM handwritten offline dataset, on a line by line basis.
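The "load everything during init" pattern described for fake_texts can be sketched as a map-style dataset, shown below in plain Python. This is not the code in 'pytorch_dataset_fake.py'. The names `EagerTextLineDataset` and `toy_make_sample` are hypothetical, and the toy generator merely stands in for @Belval's generator. In the repository the class would presumably subclass `torch.utils.data.Dataset`; the sketch omits that to stay dependency-free.

```python
import random

class EagerTextLineDataset:
    """Map-style dataset that generates all samples once, in __init__."""

    def __init__(self, make_sample, num_samples, seed=0):
        # Generation parameters are fixed here, mirroring the "hard-set
        # inside the init" approach described above.
        rng = random.Random(seed)
        self.samples = [make_sample(rng) for _ in range(num_samples)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # Each item is an (image, text) pair.
        return self.samples[index]

def toy_make_sample(rng, alphabet="abc", height=4):
    """Hypothetical stand-in for a synthetic text-line generator."""
    text = "".join(rng.choice(alphabet) for _ in range(rng.randint(1, 5)))
    width = 3 * len(text)  # image width grows with the text length
    image = [[rng.random() for _ in range(width)] for _ in range(height)]
    return image, text

dataset = EagerTextLineDataset(toy_make_sample, num_samples=8)
image, text = dataset[0]
```

Generating eagerly keeps `__getitem__` trivial and makes every epoch see the same samples, at the cost of memory and a slower start-up. Because line images vary in width, batching them would additionally need padding to a common width.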


  • OCR_Training_synthetic.ipynb Trains a model on the synthetic dataset
  • OCR_training_handwritten.ipynb Trains a model on the IAM offline handwritten line segment dataset.
  • inference_demo.ipynb An example of how to go from images to text predictions.
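The last step of going from images to text with a CTC-trained model is decoding the per-frame class scores. A minimal sketch of greedy (best-path) decoding follows. It is not taken from inference_demo.ipynb, which may decode differently. The function name `ctc_greedy_decode`, the class list and the blank index 0 are assumptions made for illustration.

```python
from itertools import groupby

def ctc_greedy_decode(frame_scores, classes, blank=0):
    """Greedy CTC decoding: argmax per frame, collapse repeats, drop blanks.

    frame_scores: list of frames, each a list of per-class scores.
    classes:      list mapping class index to character; the entry at
                  `blank` is a placeholder and never emitted.
    """
    # 1. Most likely class for every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    # 2. Collapse consecutive duplicates (e.g. "cc" becomes "c").
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blanks and map the remaining indices to characters.
    return "".join(classes[label] for label in collapsed if label != blank)

classes = ["-", "c", "a", "t"]      # index 0 is the assumed blank
frames = [
    [0.10, 0.70, 0.10, 0.10],       # c
    [0.10, 0.60, 0.20, 0.10],       # c (repeat, collapsed)
    [0.80, 0.10, 0.05, 0.05],       # blank
    [0.10, 0.10, 0.70, 0.10],       # a
    [0.10, 0.10, 0.10, 0.70],       # t
]
prediction = ctc_greedy_decode(frames, classes)  # "cat"
```

Repeats must be collapsed before blanks are removed. That order is what lets a blank between two identical labels preserve a genuine double letter.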


To do: compare performance on benchmark datasets (IAM etc.) and implement Batch-Renorm for Pytorch.

  • Implement Model in Pytorch
  • Implement CTC Loss
  • Implement Synthetic Data Generator
  • Implement synthetic training script
  • Implement IAM dataset
  • Implement IAM training script
  • Compare performance on benchmark dataset
  • Implement Batch-Renorm for Pytorch