Scene Text Localization & Recognition Resources

Read this institute-wise: English, Simplified Chinese.

Read this year-wise: English, Simplified Chinese.

Tags: [STL] (Scene Text Localization), [TR] (Text Recognition)

[STL] (Scene Text Localization): detect text regions in a scene image.

[TR] (Text Recognition): recognize the text content.
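The entries below follow a `[YEAR-VENUE][TAG]... Title` convention, so the list can be filtered by year, venue, or tag with a few lines of code. The following is a minimal sketch, assuming each entry sits on one line in exactly that shape; the names `Entry` and `parse_entry` are hypothetical helpers for illustration and are not part of this repository.

```python
import re
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    """One parsed list entry (hypothetical structure, for illustration only)."""
    year: int
    venue: str
    tags: Tuple[str, ...]
    text: str  # everything after the tags: title plus any trailing link labels


# Leading "[YEAR-VENUE]" bracket, e.g. "[2016-CVPR]" or "[2015-PhD Thesis]".
_HEAD = re.compile(r"^\[(\d{4})-([^\]]+)\]")
# A single task tag directly following the head, e.g. "[STL]" or "[TR]".
_TAG = re.compile(r"\[(STL|TR)\]")


def parse_entry(line: str) -> Entry:
    """Split an entry line into year, venue, task tags, and remaining text."""
    head = _HEAD.match(line.strip())
    if head is None:
        raise ValueError(f"not a list entry: {line!r}")
    rest = line.strip()[head.end():]

    # Consume consecutive tags; overview entries carry none.
    tags = []
    while True:
        tag = _TAG.match(rest)
        if tag is None:
            break
        tags.append(tag.group(1))
        rest = rest[tag.end():]

    return Entry(int(head.group(1)), head.group(2), tuple(tags), rest.strip())


if __name__ == "__main__":
    entry = parse_entry("[2017-ICCV][STL][TR] Deep TextSpotter")
    print(entry.year, entry.venue, entry.tags, entry.text)
```

With entries parsed this way, selecting, for example, all `[TR]` papers from a given year is a plain list comprehension over the parsed entries.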

Last update: May 3, 2019

1. Papers & Code

Overview

[2018-arXiv] Scene Text Detection and Recognition: The Deep Learning Era paper
[2016-TIP] Text Detection, Tracking and Recognition in Video: A Comprehensive Survey paper
[2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper
[2014-Front.Comput.Sci] Scene Text Detection and Recognition: Recent Advances and Future Trends paper

University of Oxford

[2018-BMVC][TR] Inductive Visual Localisation: Factorised Training for Superior Generalisation paper
[2016-IJCV][STL][TR] Reading Text in the Wild with Convolutional Neural Networks paper demo homepage
[2016-CVPR][STL] Synthetic Data for Text Localisation in Natural Images paper code data
[2015-ICLR][TR] Deep structured output learning for unconstrained text recognition paper
[2015-PhD Thesis][STL] Deep Learning for Text Spotting paper code
[2014-ECCV][STL] Deep Features for Text Spotting paper code model GitXiv
[2014-NIPS][TR] Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition paper homepage model

Shenzhen Institutes of Advanced Technology

[2018-arXiv][STL][TR] FOTS: Fast Oriented Text Spotting with a Unified Network paper
[2016-ECCV][STL] CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network paper code
[2016-CVPR][STL] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network paper
[2016-AAAI][STL] Reading Scene Text in Deep Convolutional Sequences paper
[2016-TIP][STL] Text-Attentional Convolutional Neural Networks for Scene Text Detection paper
[2014-ECCV][STL] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees paper

South China University of Technology

[2018-AAAI][STL] Feature Enhancement Network: A Refined Scene Text Detector paper
[2017-arXiv][STL] Detecting Curve Text in the Wild: New Dataset and New Solution paper
[2017-TPAMI][TR] Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition paper
[2017-CVPR][STL] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection paper
[2016-arXiv][STL] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images paper

Fudan University

[2018-CVPR][TR] Edit Probability for Scene Text Recognition paper
[2017-arXiv][STL] Arbitrary-Oriented Scene Text Detection via Rotation Proposals paper code

Huazhong University of Science and Technology

[2018-ECCV][TR][STL] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes paper
[2018-ICIP][STL] Feature Fusion Network for Scene Text Detection paper
[2018-CVPR][STL] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation paper
[2018-CVPR][STL] Rotation-sensitive Regression for Oriented Scene Text Detection paper
[2018-TIP][STL] TextBoxes++: A Single-Shot Oriented Scene Text Detector paper code
[2017-AAAI][STL] TextBoxes: A Fast Text Detector with a Single Deep Neural Network paper code
[2017-CVPR][STL] Detecting Oriented Text in Natural Images by Linking Segments paper
[2016-CVPR][TR] Robust scene text recognition with automatic rectification paper
[2016-arXiv][STL] Scene Text Detection via Holistic, Multi-Channel Prediction paper
[2016-CVPR][STL] Multi-oriented text detection with fully convolutional networks paper
[2015-TPAMI][TR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition paper code code

Universitat Autònoma de Barcelona

[2018-ECCV][STL] Single Shot Scene Text Retrieval paper
[2017-arXiv][STL] Improving Text Proposal for Scene Images with Fully Convolutional Networks paper
[2016-arXiv][STL] TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild paper code
[2015-ICDAR][STL] Object Proposals for Text Extraction in the Wild paper code
[2014-TPAMI][TR] Word Spotting and Recognition with Embedded Attributes paper homepage code

Stanford University

[2012-ICPR][TR] End-to-End Text Recognition with Convolutional Neural Networks paper code SVHN Dataset
[2012-PhD Thesis][TR] End-to-End Text Recognition with Convolutional Neural Networks paper

Seoul National University

[2017-AAAI][STL][TR] Detection and Recognition of Text Embedded in Online Images via Neural Context Models paper

Megvii Technology Inc: Face++

[2017-CVPR][STL] EAST: An Efficient and Accurate Scene Text Detector paper code code with improvement

Institute of Automation, Chinese Academy of Sciences

[2017-arXiv][STL] Deep Direct Regression for Multi-Oriented Scene Text Detection paper

University of California, San Diego

[2016-CVPR][TR] Recursive Recurrent Nets with Attention Modeling for OCR in the Wild paper

University of California, Santa Cruz

[2017-arXiv][STL] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting paper

Cornell University

[2016-arXiv][STL][TR] COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images paper

Pennsylvania State University

[2016-PhD Thesis][STL] Context Modeling for Semantic Text Matching and Scene Text Detection paper

University of Science and Technology Beijing

[2016-IJCAI][STL] Scene Text Detection in Video by Learning Locally and Globally paper
[2014-TPAMI][STL] Robust Text Detection in Natural Scene Images paper

Pohang University of Science and Technology

[2016-CVPR][STL] CannyText Detector: Fast and Robust Scene Text Localization Algorithm paper

École d'Ingénieurs en Informatique

[2016-IJDAR][STL] TextCatcher: a method to detect curved and challenging text in natural scenes paper

České vysoké učení technické v Praze. Czech Technical University

[2017-ICCV][STL][TR] Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework paper code
[2015-TPAMI][STL][TR] Real-time Lexicon-free Scene Text Localization and Recognition paper
[2015-ICCV][STL] FASText: Efficient unconstrained scene text detector paper code
[2012-CVPR][STL][TR] Real-time scene text localization and recognition paper code

Google Inc

[2013-ICCV][STL][TR] Photo OCR: Reading Text in Uncontrolled Conditions paper

Microsoft Inc

[2010-CVPR][STL] SWT: Detecting Text in Natural Scenes with Stroke Width Transform paper code

Samsung R&D Institute China

[2017-arXiv][STL] R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection paper

Vicarious FPC Inc

[2016-NIPS][TR] Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data paper

Chinese State Key Laboratory of Management and Control for Complex Systems

[2013-CVPR][TR] Scene Text Recognition using Part-based Tree-structured Character Detection paper

Visual Computing Department, Institute for Infocomm Research

[2017-ICCV][STL] WeText: Scene Text Detection under Weak Supervision paper

University of Florida

[2017-ICCV][STL] Single Shot Text Detector with Regional Attention paper code

University of Southern California

[2017-ICCV][STL] Self-organized Text Detection with Minimal Post-processing via Border Learning paper

Hikvision Research Institute

[2017-ICCV][TR] Focusing Attention: Towards Accurate Text Recognition in Natural Images paper

University of Adelaide

[2017-ICCV][STL][TR] Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks paper

City University of New York

[2017-CVPR][STL] Unambiguous Text Localization and Retrieval for Cluttered Scenes paper

The University of Hong Kong

[2018-AAAI][TR] Char-Net: A Character-Aware Neural Network for Distorted Scene Text paper

Zhejiang University

[2018-AAAI][STL] PixelLink: Detecting Scene Text via Instance Segmentation paper

University of Potsdam

[2018-AAAI][STL][TR] SEE: Towards Semi-Supervised End-to-End Scene Text Recognition paper code

Arizona State University

[2018-AAAI][TR] SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network paper

Stevens Institute of Technology

[2018-CVPR][STL] Geometry-Aware Scene Text Detection with Instance Transformation Network paper

Nanyang Technological University

[2018-ECCV][STL] Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes paper
[2018-ECCV][STL] Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping paper
[2018-ECCV][STL] Using Object Information for Spotting Text paper
[2018-CVPR][STL] Learning Markov Clustering Networks for Scene Text Detection paper

Alibaba Group

[2018-IJCAI][STL] IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection paper

Chinese Academy of Sciences

[2018-ICIP][STL] Focal Text: An Accurate Text Detection With Focal Loss
[2018-ICIP][TR] Dense Chained Attention Network for Scene Text Recognition

University of Cambridge

[2018-ECCV][TR] Synthetically Supervised Feature Learning for Scene Text Recognition paper

Peking University

[2018-ECCV][STL] TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes paper code

SenseTime Research

[2018-BMVC][STL] Boosting up Scene Text Detectors with Guided CNN paper

Naver Clova AI Research

[2019-CVPR][STL][TR] Character Region Awareness for Text Detection paper

Baidu

[2019-CVPR][STL] Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes paper

University of Adelaide

[2018-CVPR][STL][TR] An End-to-End TextSpotter with Explicit Alignment and Attention paper code

2. Datasets

`SCUT-CTW1500` `2018`

Task: text localization (with different styles) and recognition

download

`Total Text Dataset` `2017`

1,555 images with text in three orientations: horizontal, multi-oriented, and curved

Task: text localization (with different styles) and recognition

download

`PowerPoint Text Detection and Recognition Dataset` `2017`

21,384 images, 21,384+ text instances

Task: text localization and recognition

download

`COCO-Text (Computer Vision Group, Cornell)` `2016`

63,686 images, 173,589 text instances, 3 fine-grained text attributes.

Task: text localization and recognition

download

`Synthetic Word Dataset (Oxford, VGG)` `2014`

9 million images covering 90k English words

Task: text recognition, segmentation

download

`The Street View House Number Dataset (SVHN)` `2012`

Real-world street view images of house numbers, annotated with digit positions and class labels.

Task: digit localization, text recognition

download

`IIIT 5K-Words` `2012`

5,000 images from scene text and born-digital sources (2,000 training and 3,000 testing images)

Each image is a cropped word image of scene text with case-insensitive labels

Task: text recognition

download

`StanfordSynth(Stanford, AI Group)` `2012`

Small single-character images of 62 characters (0-9, a-z, A-Z)

Task: text recognition

download

`MSRA Text Detection 500 Database (MSRA-TD500)` `2012`

500 natural images (resolutions vary from 1296x864 to 1920x1280)

Text in Chinese, English, or a mixture of both

Task: text detection

`Street View Text (SVT)` `2010`

350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)

Only word level bounding boxes are provided with case-insensitive labels

Task: text localization

`KAIST Scene_Text Database` `2010`

3000 images of indoor and outdoor scenes containing text

Korean, English (Number), and Mixed (Korean + English + Number)

Task: text localization, segmentation and recognition

`Chars74k` `2009`

Over 74K character images taken from natural images, plus a set of synthetically generated characters

Small single-character images of 62 characters (0-9, a-z, A-Z)

Task: text recognition

`ICDAR Benchmark Datasets`

| Dataset | Description | Competition Paper |
| --- | --- | --- |
| ICDAR 2017 | over 173,589 labeled text regions in over 63,686 images | `paper` |
| ICDAR 2015 | 1000 training images and 500 testing images | `paper` |
| ICDAR 2013 | 229 training images and 233 testing images | `paper` |
| ICDAR 2011 | 229 training images and 255 testing images | `paper` |
| ICDAR 2005 | 1001 training images and 489 testing images | `paper` |
| ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | `paper` |

3. Competitions

ICDAR - Robust Reading Competitions

4. Online OCR Service

| Name | Description |
| --- | --- |
| Tesseract OCR | API, free |
| Online OCR | API, free |
| Free OCR | API, free |
| New OCR | API, free |
| ABBYY FineReader Online | No API, not free |
| Super Online Transfer Tools (Chinese) | API, free |
| Online Chinese Recognition | API, free |

kaiwakari/image-text-localization-recognition
