Scene Text
A curated list of papers and resources for scene text detection and recognition
Papers are listed under the year they were first published, including ArXiv preprints. For example, a paper accepted to CVPR 2019 is listed under 2018 if it first appeared on ArXiv in 2018.
Table of contents |
---|
1. Scene Text Detection |
2. Weakly Supervised Scene Text Detection |
3. Scene Text Recognition |
4. Other scene text papers |
5. Scene Text Survey papers |
6. Dataset |
Scene Text Detection (including methods for end-to-end detection and recognition)
2010
- Detecting text in natural scenes with stroke width transform [CVPR 2010] [paper]
- A Method for Text Localization and Recognition in Real-World Images [ACCV 2010] [paper]
2011
2012
- Real-time scene text localization and recognition [CVPR 2012] [paper]
2013
2014
- Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [ECCV 2014] [paper]
2015
- Symmetry-based text line detection in natural scenes [CVPR 2015] [paper]
- Object proposals for text extraction in the wild [ICDAR 2015] [paper]
- Text-Attentional Convolutional Neural Network for Scene Text Detection [TIP 2016] [paper]
- Text Flow : A Unified Text Detection System in Natural Scene Images [ICCV 2015] [paper]
2016
- Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [ArXiv] [paper]
- Multi-Oriented Text Detection With Fully Convolutional Networks [CVPR 2016] [paper]
- Scene Text Detection Via Holistic, Multi-Channel Prediction [ArXiv] [paper]
- Detecting Text in Natural Image with Connectionist Text Proposal Network [ECCV 2016] [paper]
- TextBoxes: A Fast Text Detector with a Single Deep Neural Network [AAAI 2017] [paper]
2017
- Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting In The Wild [CVPR 2017] [paper]
- Deep TextSpotter: An End-To-End Trainable Scene Text Localization and Recognition Framework [ICCV 2017] [paper]
- Arbitrary-Oriented Scene Text Detection via Rotation Proposals [TMM 2018] [paper]
- Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [CVPR 2017] [paper]
- Detecting Oriented Text in Natural Images by Linking Segments [CVPR 2017] [paper]
- Deep Direct Regression for Multi-Oriented Scene Text Detection [ICCV 2017] [paper]
- Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [ArXiv] [paper]
- EAST: An Efficient and Accurate Scene Text Detector [CVPR 2017] [paper]
- WordFence: Text Detection in Natural Images with Border Awareness [ICIP 2017] [paper]
- R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection [ArXiv] [paper]
- WordSup: Exploiting Word Annotations for Character based Text Detection [ICCV 2017] [paper]
- Single Shot Text Detector With Regional Attention [ICCV 2017] [paper]
- https://github.com/BestSonny/SSTD [Caffe]
- https://github.com/HotaekHan/SSTDNet [PyTorch]
- Fused Text Segmentation Networks for Multi-oriented Scene Text Detection [ArXiv] [paper]
- Deep Residual Text Detection Network for Scene Text [ICDAR 2017] [paper]
- Feature Enhancement Network: A Refined Scene Text Detector [AAAI 2018] [paper]
- ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene [ArXiv] [paper]
- Self-organized Text Detection with Minimal Post-processing via Border Learning [ICCV 2017] [paper]
2018
- PixelLink: Detecting Scene Text via Instance Segmentation [AAAI 2018] [paper]
- FOTS: Fast Oriented Text Spotting With a Unified Network [CVPR 2018] [paper]
- TextBoxes++: A Single-Shot Oriented Scene Text Detector [TIP 2018] [paper]
- Multi-oriented Scene Text Detection via Corner Localization and Region Segmentation [CVPR 2018] [paper]
- An end-to-end TextSpotter with Explicit Alignment and Attention [CVPR 2018] [paper]
- Rotation-Sensitive Regression for Oriented Scene Text Detection [CVPR 2018] [paper]
- https://github.com/MhLiao/RRD [Caffe]
- Detecting multi-oriented text with corner-based region proposals [Neurocomputing 2019] [paper]
- https://github.com/xhzdeng/crpn [Caffe]
- An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches [ArXiv] [paper]
- IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection [IJCAI 2018] [paper]
- Shape Robust Text Detection with Progressive Scale Expansion Network [CVPR 2019] [paper] [paper v2]
- TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes [ECCV 2018] [paper]
- Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes [ECCV 2018] [paper]
- Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping [ECCV 2018] [paper]
- A New Anchor-Labeling Method For Oriented Text Detection Using Dense Detection Framework [SPL 2018] [paper]
- An Efficient System for Hazy Scene Text Detection using a Deep CNN and Patch-NMS [ICPR 2018] [paper]
- Scene Text Detection with Supervised Pyramid Context Network [AAAI 2019] [paper]
- Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks [ArXiv] [paper]
- Mask R-CNN with Pyramid Attention Network for Scene Text Detection [WACV 2019] [paper]
- TextMountain: Accurate Scene Text Detection via Instance Segmentation [ArXiv] [paper]
- TextField: Learning A Deep Direction Field for Irregular Scene Text Detection [ArXiv] [paper]
- TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network [ACCV 2018] [paper]
2019
- MSR: Multi-Scale Shape Regression for Scene Text Detection [IJCAI 2019] [paper]
- Scene Text Detection with Inception Text Proposal Generation Module [ICMLC 2019] [paper]
- Towards Robust Curve Text Detection with Conditional Spatial Expansion [CVPR 2019] [paper]
- Curve Text Detection with Local Segmentation Network and Curve Connection [ArXiv] [paper]
- Pyramid Mask Text Detector [ArXiv] [paper]
- Tightness-aware Evaluation Protocol for Scene Text Detection [CVPR 2019] [paper]
- Character Region Awareness for Text Detection [CVPR 2019] [paper]
- Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes [CVPR 2019] [paper]
- TextCohesion: Detecting Text for Arbitrary Shapes [ArXiv] [paper]
- Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation [CVPR 2019] [paper]
- Learning Shape-Aware Embedding for Scene Text Detection [CVPR 2019] [paper]
- A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning [ACMMM 2019] [paper]
- Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network [ICCV 2019] [paper]
- Towards Unconstrained End-to-End Text Spotting [ICCV 2019] [paper]
- TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting [ICCV 2019] [paper]
- Convolutional Character Networks [ICCV 2019] [paper]
Weakly supervised Scene Text Detection & Recognition
2017
- Attention-Based Extraction of Structured Information from Street View Imagery [ICDAR 2017] [paper]
- WeText: Scene Text Detection under Weak Supervision [ICCV 2017] [paper]
- SEE: Towards Semi-Supervised End-to-End Scene Text Recognition [AAAI 2018] [paper]
- https://github.com/Bartzi/see [Chainer]
Scene Text Recognition
2014
- Deep Structured Output Learning for Unconstrained Text Recognition [ICLR 2015] [paper]
- Reading text in the wild with convolutional neural networks [IJCV 2016] [paper]
2015
- Reading Scene Text in Deep Convolutional Sequences [AAAI 2016] [paper]
- An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition [TPAMI 2017] [paper]
2016
- Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [CVPR 2016] [paper]
- Robust scene text recognition with automatic rectification [CVPR 2016] [paper]
- https://github.com/WarBean/tps_stn_pytorch [PyTorch]
- https://github.com/marvis/ocr_attention [PyTorch]
- CNN-N-Gram for Handwriting Word Recognition [CVPR 2016] [paper]
- STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition [BMVC 2016] [paper]
2017
- STN-OCR: A single Neural Network for Text Detection and Text Recognition [ArXiv] [paper]
- Learning to Read Irregular Text with Attention Mechanisms [IJCAI 2017] [paper]
- Scene Text Recognition with Sliding Convolutional Character Models [ArXiv] [paper]
- Focusing Attention: Towards Accurate Text Recognition in Natural Images [ICCV 2017] [paper]
- AON: Towards Arbitrarily-Oriented Text Recognition [CVPR 2018] [paper]
- Gated Recurrent Convolution Neural Network for OCR [NIPS 2017] [paper]
2018
- Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition [AAAI 2018] [paper]
- SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network [AAAI 2018] [paper]
- Edit Probability for Scene Text Recognition [CVPR 2018] [paper]
- ASTER: An Attentional Scene Text Recognizer with Flexible Rectification [TPAMI 2018] [paper]
- Synthetically Supervised Feature Learning for Scene Text Recognition [ECCV 2018] [paper]
- Scene Text Recognition from Two-Dimensional Perspective [AAAI 2019] [paper]
- ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification [CVPR 2019] [paper]
2019
- A Multi-Object Rectified Attention Network for Scene Text Recognition [Pattern Recognition] [paper]
- https://github.com/Canjie-Luo/MORAN_v2 [PyTorch]
- A Simple and Robust Convolutional-Attention Network for Irregular Text Recognition [paper]
- Aggregation Cross-Entropy for Sequence Recognition [CVPR 2019] [paper]
- Sequence-to-Sequence Domain Adaptation Network for Robust Text Image Recognition [CVPR 2019] [paper]
- 2D Attentional Irregular Scene Text Recognizer [ArXiv] [paper]
- Deep Neural Network for Semantic-based Text Recognition in Images [ArXiv] [paper]
- Symmetry-constrained Rectification Network for Scene Text Recognition [ICCV 2019] [paper]
- Rethinking Irregular Scene Text Recognition (ICDAR 2019-ArT) [paper]
- Focus-Enhanced Scene Text Recognition with Deformable Convolutions [ArXiv] [paper]
- https://github.com/Alpaca07/dtr [PyTorch]
- Adaptive Embedding Gate for Attention-Based Scene Text Recognition [ArXiv] [paper]
Script Identification
Other scene text related papers
2016
- Synthetic Data for Text Localisation in Natural Images [CVPR 2016] [paper]
2019
- Scene Text Synthesis for Efficient and Effective Deep Network Training [ArXiv] [paper]
Scene text survey
2018
- Scene Text Detection and Recognition: The Deep Learning Era [ArXiv] [paper]
2019
- Scene text detection and recognition with advances in deep learning: a survey [IJDAR 2019] [paper]
Dataset
PowerPoint Text Detection and Recognition Dataset 2017
- Over 1k A Consolidated Receipt Dataset for Post-OCR Parsing
- Task: text recognition
- Over 1062 images from scanned receipts
- Task: text location and recognition
COCO-Text (Computer Vision Group, Cornell) 2016
- 63,686 images, 173,589 text instances, 3 fine-grained text attributes.
- Task: text location and recognition
Synthetic Data for Text Localisation in Natural Image (VGG) 2016
- 800 thousand images
- 8 million synthetic word instances
- download
Synthetic Word Dataset (Oxford, VGG) 2014
- 9 million images covering 90k English words
- Task: text recognition, segmentation
- download
IIIT 5K-Words 2012
- 5000 images from scene texts and born-digital sources (2k training and 3k testing images)
- Each image is a cropped word image of scene text with case-insensitive labels
- Task: text recognition
- download
StanfordSynth (Stanford, AI Group) 2012
- Small single-character images of 62 characters (0-9, a-z, A-Z)
- Task: text recognition
- download
MSRA Text Detection 500 Database (MSRA-TD500) 2012
- 500 natural images (resolutions of the images vary from 1296x864 to 1920x1280)
- Chinese, English, or a mixture of both
- Task: text detection
Street View Text (SVT) 2010
- 350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)
- Only word level bounding boxes are provided with case-insensitive labels
- Task: text location
KAIST Scene_Text Database 2010
- 3000 images of indoor and outdoor scenes containing text
- Korean, English (Number), and Mixed (Korean + English + Number)
- Task: text location, segmentation and recognition
Chars74K 2009
- Over 74K images from natural images, as well as a set of synthetically generated characters
- Small single-character images of 62 characters (0-9, a-z, A-Z)
- Task: text recognition
ICDAR Benchmark Datasets
Dataset | Description | Competition Paper |
---|---|---|
ICDAR 2019 | training and testing images | paper |
ICDAR 2017 | 42618 training images and 9837 testing images | paper |
ICDAR 2015 | 1000 training images and 500 testing images | paper |
ICDAR 2013 | 229 training images and 233 testing images | paper |
ICDAR 2011 | 229 training images and 255 testing images | paper |
ICDAR 2005 | 1001 training images and 489 testing images | paper |
ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | paper |
Blogs
- Scene Text Detection with OpenCV 3
- Handwritten numbers detection and recognition
- Applying OCR Technology for Receipt Recognition
- Convolutional Neural Networks for Object (Car License) Detection
- Extracting text from an image using Ocropus
- Number plate recognition with Tensorflow [github]
- Using deep learning to break a Captcha system [report] [github]
- Breaking reddit captcha with 96% accuracy [github]
- Scene Text Recognition in iOS [github]
Online Service
Name | Description |
---|---|
Online OCR | API, free |
Free OCR | API, free |
New OCR | API, free |
ABBYY FineReader Online | No API, free |
Open Resources Code
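- Chinese natural scene text detection and recognition based on YOLOv3 and CRNN [code]
- Ultra-lightweight Chinese OCR with support for vertical text recognition and ncnn inference; psenet (8.5M) + crnn (6.3M) + anglenet (1.5M), total model size only 17M [code]
- Tesseract: C++ based tools for document analysis and OCR [code]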
- Ocropy: Python-based tools for document analysis and OCR https://github.com/tmbdev/ocropy
- CLSTM: A small implementation of LSTM networks, focused on OCR https://github.com/tmbdev/clstm
- CRNN: Convolutional Recurrent Neural Network (Torch7) https://github.com/bgshih/crnn
- Attention-OCR: Visual attention based OCR https://github.com/da03/Attention-OCR
- Umaru: An OCR system based on Torch, using LSTM/GRU-RNN and CTC and drawing on the work of rnnlib and clstm https://github.com/edward-zhu/umaru
- AKSHAYUBHAT/DeepVideoAnalytics (CTPN+CRNN) [code]
- ankush-me/SynthText [code]
- JarveeLee/SynthText_Chinese_version [code]
Handwriting Recognition
- [2016-arXiv] Drawing and Recognizing Chinese Characters with Recurrent Neural Network https://arxiv.org/abs/1606.06539
- Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition https://arxiv.org/abs/1610.02616
- Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition https://arxiv.org/abs/1610.04057
- High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps http://arxiv.org/abs/1505.04925
- DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel) https://github.com/chongyangtao/DeepHCCR
- Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention http://arxiv.org/abs/1604.03286
- MLPaint: the Real-Time Handwritten Digit Recognizer http://blog.mldb.ai/blog/posts/2016/09/mlpaint/
- caffe-ocr: OCR with caffe deep learning framework https://github.com/pannous/caffe-ocr
Licence Tag Recognition
- Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs
- Number plate recognition with Tensorflow http://matthewearl.github.io/2016/05/06/cnn-anpr/
- end-to-end-for-plate-recognition https://github.com/szad670401/end-to-end-for-chinese-plate-recognition