
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Alibaba DAMO Academy.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Advanced Literate Machinery

Introduction

The ultimate goal of our research is to build a system with high-level intelligence, i.e., the abilities to read, think, and create, so advanced that it could one day even surpass human intelligence. We call such systems Advanced Literate Machinery (ALM).

This project is maintained by the 读光 OCR Team (读光-Du Guang means “Reading The Light”) in the Language Technology Lab, Alibaba DAMO Academy.


Visit our 读光-Du Guang Portal to experience online demos for OCR and Document Understanding.

Recent Updates

2022.9 Release

  • MGP-STR (ECCV 2022, paper): Based on ViT and a tailored Adaptive Addressing and Aggregation module, we explore an implicit way of incorporating linguistic knowledge: subword representations are introduced to facilitate multi-granularity prediction and fusion in scene text recognition.
  • LevOCR (ECCV 2022, paper): Inspired by the Levenshtein Transformer, we cast scene text recognition as an iterative sequence refinement process, which allows for parallel decoding, dynamic length change, and good interpretability.
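
To make the idea of iterative sequence refinement more concrete, here is a minimal toy sketch of one Levenshtein-style refinement step: delete tokens, insert placeholders, then fill the placeholders. It is not the LevOCR implementation. The function names, the `<plh>` placeholder token, and the hard-coded decisions are all assumptions made for illustration; in a real model the deletion mask, the insertion counts, and the filled characters would be predicted by the network.

```python
from typing import List

PLACEHOLDER = "<plh>"  # hypothetical placeholder token, chosen for this sketch


def delete_tokens(tokens: List[str], keep_mask: List[bool]) -> List[str]:
    """Drop every token whose keep flag is False (decided per position)."""
    return [tok for tok, keep in zip(tokens, keep_mask) if keep]


def insert_placeholders(tokens: List[str], counts: List[int]) -> List[str]:
    """Insert counts[i] placeholders before tokens[i]; counts[-1] go at the end.

    `counts` therefore needs len(tokens) + 1 entries, one per gap.
    """
    if len(counts) != len(tokens) + 1:
        raise ValueError("counts must have len(tokens) + 1 entries")
    out: List[str] = []
    for i, n in enumerate(counts):
        out.extend([PLACEHOLDER] * n)
        if i < len(tokens):
            out.append(tokens[i])
    return out


def fill_placeholders(tokens: List[str], predictions: List[str]) -> List[str]:
    """Replace placeholders, left to right, with the given predictions."""
    preds = iter(predictions)
    return [next(preds) if tok == PLACEHOLDER else tok for tok in tokens]


def refine_once(tokens: List[str], keep_mask: List[bool],
                counts: List[int], predictions: List[str]) -> List[str]:
    """One refinement step: delete, then insert placeholders, then fill them."""
    kept = delete_tokens(tokens, keep_mask)
    with_gaps = insert_placeholders(kept, counts)
    return fill_placeholders(with_gaps, predictions)


# Toy example: the initial reading "hcllo" is corrected to "hello".
# The decisions below are hard-coded stand-ins for model predictions.
initial = list("hcllo")
refined = refine_once(
    initial,
    keep_mask=[True, False, True, True, True],  # delete the wrong "c"
    counts=[0, 1, 0, 0, 0],                     # one placeholder after "h"
    predictions=["e"],                          # fill it with "e"
)
print("".join(refined))  # prints: hello
```

Each of the three sub-steps is decided independently per position, which is what makes parallel decoding possible. The sequence length can shrink or grow between steps, and the explicit delete and insert decisions are easy to inspect.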