
OCR toolbox from Davar-Lab


DAVAR-OCR

This is the open-source OCR repository of DAVAR Lab, from Hikvision Research Institute, China.

We maintain this code repository to release implementations of our recent academic publications, along with re-implementations of popular earlier OCR algorithms and modules.

We also provide some ablation experiment comparisons to aid reproduction.

A short paper introducing DavarOCR is available on arXiv.

Note: Due to company policy, all of the code was re-implemented on top of the open-source frameworks mmdetection-2.11.0 and mmcv-1.3.4 from open-mmlab. The code architecture also follows mmocr, so the two frameworks are compatible with each other.

Implementations

To date, davarocr contains the following algorithms:

Basic OCR Tasks

Text Detection

Text Recognition

Text Spotting

Video Text Spotting

  • YORO (ACM MM 2019)

Document Understanding Tasks

Information Extraction

Table Recognition

Table Understanding

  • CTUNet (ACM MM 2022) (To be released)

Layout Recognition

  • VSR (ICDAR 2021)

Reading Order Detection

Named Entity Recognition

Development Environment

The recommended environment requirements can be found in mmdetection. The minimum compatible environment is as follows.

Basic Env version
Python 3.6+
cuda 10.0+
cudnn 7.6.3+
pytorch 1.3.0+
torchvision 0.4.1+
opencv 3.0.0+

Some of the algorithms (EAST, Text Perceptron) require the C++ version of OpenCV. If you do not need these algorithms, you can ignore the error about 'opencv.hpp' or temporarily remove the related code.
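The minimum versions listed above can be checked before installation with a short script. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `check_environment` are hypothetical, and it only covers Python, pytorch, torchvision and the Python bindings of opencv. CUDA, cuDNN and the C++ version of opencv are not checked.

```python
import importlib
import re
import sys

# Minimum versions copied from the table above; keys are Python import names.
MIN_PYTHON = (3, 6)
MIN_VERSIONS = {
    "torch": (1, 3, 0),
    "torchvision": (0, 4, 1),
    "cv2": (3, 0, 0),
}


def parse_version(text):
    """Turn a string such as '1.8.1+cu111' into (1, 8, 1).

    Any local suffix after '+' is ignored and missing parts are padded with 0.
    """
    numbers = [int(n) for n in re.findall(r"\d+", text.split("+")[0])]
    return tuple((numbers + [0, 0, 0])[:3])


def check_environment():
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python %d.%d is older than 3.6" % sys.version_info[:2])
    for name, minimum in MIN_VERSIONS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append("%s is not installed" % name)
            continue
        raw = getattr(module, "__version__", "0")
        if parse_version(raw) < minimum:
            wanted = ".".join(str(part) for part in minimum)
            problems.append("%s %s is older than %s" % (name, raw, wanted))
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks compatible"]:
        print(line)
```

Running the script prints one line per problem, or a single confirmation line when nothing is missing or too old.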

Installation and Development Instruction

To download the repository and install davarocr, follow these instructions:

git clone https://github.com/hikopensource/DAVAR-Lab-OCR.git
cd DAVAR-Lab-OCR/
bash setup.sh

This script automatically downloads and installs "mmdetection" and "mmcv-full". You can also install them manually by following the official instructions.

Go to each algorithm's directory for more details.
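After installation, it can be useful to confirm which framework versions ended up in the environment. The sketch below is an illustration only: the helper names are hypothetical, the expected versions are the ones named in the note near the top of this README, and `davarocr` as the import name of the installed package is an assumption.

```python
import importlib

# Framework versions named in the note near the top of this README.
# "mmcv-full" is imported as "mmcv" and "mmdetection" as "mmdet".
EXPECTED = {"mmcv": "1.3.4", "mmdet": "2.11.0"}


def installed_version(name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except Exception:
        # Broad on purpose: a broken compiled extension may not raise ImportError.
        return None
    return getattr(module, "__version__", "unknown")


def report():
    """Return one status line per package."""
    lines = []
    for name, expected in EXPECTED.items():
        found = installed_version(name)
        if found is None:
            lines.append("%s: not importable" % name)
        elif found != expected:
            lines.append("%s: found %s, this README names %s" % (name, found, expected))
        else:
            lines.append("%s: %s (matches)" % (name, found))
    # Assumption: the package installed by setup.sh is importable as "davarocr".
    found = installed_version("davarocr")
    lines.append("davarocr: %s" % (found if found is not None else "not importable"))
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A version mismatch is reported rather than treated as an error, since other versions may also work.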

Problem solution and collection

We collect problems encountered during installation and research and provide corresponding solutions. Please refer to FAQ.md for details.

Changelog

DavarOCR v0.6.0 was released on 13/07/2022. Please refer to Changelog.md for details and release history.

Citation

If you find this repository helpful to your research, please feel free to cite us:

@article{qiao2022davarocr,
  title    ={{DavarOCR:} {A} Toolbox for OCR and Multi-Modal Document Understanding},
  author   ={Liang Qiao and
			  Hui Jiang and
			  Ying Chen and
			  Can Li and
			  Pengfei Li and
			  Zaisheng Li and
			  Baorui Zou and
			  Dashan Guo and
			  Yingda Xu and
			  Yunlu Xu and
			  Zhanzhan Cheng and
			  Yi Niu},
  journal   = {CoRR},
  volume    = {abs/2207.06695},
  year      = {2022},
}

License

This project is released under the Apache 2.0 license.

Copyright

The copyright of our implementations belongs to Davar-Lab, Hikvision Research Institute, China. Code from other open-source repositories follows its original licenses.

Welcome to DAVAR-LAB!

See the latest news in DAVAR-Lab. If you have any questions or suggestions, please feel free to contact us. Contact email: qiaoliang6@hikvision.com, chengzhanzhan@hikvision.com.