Awesome-Scene-Text-Recognition

Scene Text Localization & Recognition Resources

A curated list of resources dedicated to scene text localization and recognition. Any suggestions and pull requests are welcome.

Papers & Code

Overview

  • [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper
  • [2014-Front.Comput.Sci] Scene Text Detection and Recognition: Recent Advances and Future Trends paper

Visual Geometry Group, University of Oxford

CUHK & SIAT

  • [2016-arXiv] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network paper
  • [2016-AAAI] Reading Scene Text in Deep Convolutional Sequences paper
  • [2016-TIP] Text-Attentional Convolutional Neural Networks for Scene Text Detection paper
  • [2014-ECCV] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees paper

Media and Communication Lab, HUST

  • [2016-CVPR] Robust scene text recognition with automatic rectification paper
  • [2016-CVPR] Multi-oriented text detection with fully convolutional networks paper
  • [2015-CoRR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition paper code github

AI Lab, Stanford

  • [2012-ICPR, Wang] End-to-End Text Recognition with Convolutional Neural Networks paper code SVHN Dataset
  • [2012-PhD thesis, David Wu] End-to-End Text Recognition with Convolutional Neural Networks paper

Others

  • [2014-TPAMI] Word Spotting and Recognition with Embedded Attributes paper homepage code
  • [2016-CVPR] Recursive Recurrent Nets with Attention Modeling for OCR in the Wild paper
  • [2016-arXiv] COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images paper
  • [2016-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images paper
  • [2015-ICDAR] Object Proposals for Text Extraction in the Wild paper code

Datasets

  • COCO-Text (Computer Vision Group, Cornell) 2016

  • 63,686 images, 173,589 text instances, 3 fine-grained text attributes.

  • Task: text location and recognition

  • [COCO-Text API](https://github.com/andreasveit/coco-text)

  • Synthetic Word Dataset (Oxford, VGG) 2014

  • 9 million images covering 90k English words

  • Task: text recognition, segmentation

  • download

  • IIIT 5K-Words 2012

  • 5,000 images from scene text and born-digital sources (2k training and 3k testing images)

  • Each image is a cropped word image of scene text with case-insensitive labels

  • Task: text recognition

  • download

  • StanfordSynth (Stanford, AI Group) 2012

  • Small single-character images of 62 characters (0-9, a-z, A-Z)

  • Task: text recognition

  • download

  • MSRA Text Detection 500 Database (MSRA-TD500) 2012

  • 500 natural images (resolutions vary from 1296x864 to 1920x1280)

  • Chinese, English or mixture of both

  • Task: text detection

  • Street View Text (SVT) 2010

  • 350 high-resolution images (average size 1260 × 860): 100 for training and 250 for testing

  • Only word level bounding boxes are provided with case-insensitive labels

  • Task: text location

  • KAIST Scene_Text Database 2010

  • 3000 images of indoor and outdoor scenes containing text

  • Korean, English (Number), and Mixed (Korean + English + Number)

  • Task: text location, segmentation and recognition

  • Chars74k 2009

  • Over 74K images from natural images, as well as a set of synthetically generated characters

  • Small single-character images of 62 characters (0-9, a-z, A-Z)

  • Task: text recognition

  • ICDAR Benchmark Datasets

| Dataset | Description | Competition Paper |
|---|---|---|
| ICDAR 2015 | 1000 training images and 500 testing images | paper link |
| ICDAR 2013 | 229 training images and 233 testing images | paper link |
| ICDAR 2011 | 229 training images and 255 testing images | paper link |
| ICDAR 2005 | 1001 training images and 489 testing images | paper link |
| ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | paper link |
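
Several of the datasets above describe their labels as 62 character classes (0-9, a-z, A-Z), and IIIT 5K-Words and SVT ship case-insensitive word labels. The following is a minimal sketch of how those two conventions might be handled when preparing labels and scoring a recognizer. The names `ALPHABET`, `encode` and `word_accuracy` are hypothetical helpers written for illustration; they are not part of any dataset's official toolkit, and the class ordering (digits, then lowercase, then uppercase) is an assumption.

```python
import string

# The 62 classes named in the dataset descriptions: 0-9, a-z, A-Z.
# The ordering below is an assumption made for this sketch.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(word):
    """Map a word to a list of class indices.

    Raises KeyError if the word contains a character outside the 62 classes.
    """
    return [CHAR_TO_INDEX[ch] for ch in word]


def word_accuracy(predictions, ground_truth, case_sensitive=False):
    """Return the fraction of words that match their label exactly.

    With case_sensitive=False (the default), both sides are lowercased
    first, which suits datasets whose labels are case-insensitive.
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        return 0.0
    if not case_sensitive:
        predictions = [p.lower() for p in predictions]
        ground_truth = [g.lower() for g in ground_truth]
    correct = sum(p == g for p, g in zip(predictions, ground_truth))
    return correct / len(ground_truth)
```

For example, `word_accuracy(["Hello", "world"], ["HELLO", "word"])` counts the first word as correct and the second as wrong. Individual benchmarks may define their own evaluation protocols, so this should be read as an illustration of the labeling convention rather than an official metric.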

Blogs