/CDistNet-OpenVINO

Official PyTorch implementation of CDistNet

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

The official code of CDistNet.

Paper link: arXiv (abs/2111.11011)

As a different paradigm, we are confident that CDistNet can keep delivering high recognition performance across multiple scenarios. To that end, we explored the reasons for ABINet's high performance and found that separating the language model (LM) from the vision model (VM) gives ABINet additional gains. We therefore also applied these training strategies to CDistNetv2 to find more room for improvement. Stay tuned for CDistNet updates.

To Do List

  • HA-IC13 & CA-IC13
  • Pre-train model
  • Cleaned Code
  • Document
  • Distributed Training

Two New Datasets

We test other SOTA methods on the HA-IC13 and CA-IC13 datasets.

CDistNet has a performance advantage over other SOTA methods as the character distance increases (levels 1-6).

HA-IC13

| Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & Pretrained model |
| --- | --- | --- | --- | --- | --- | --- | --- |
| VisionLAN (ICCV 2021) | 93.58 | 92.88 | 89.97 | 82.26 | 72.23 | 61.03 | Official Code |
| ABINet (CVPR 2021) | 95.92 | 95.22 | 91.95 | 85.76 | 73.75 | 64.99 | Official Code |
| RobustScanner* (ECCV 2020) | 96.15 | 95.33 | 93.23 | 88.91 | 81.10 | 71.53 | -- |
| Transformer-baseline* | 96.27 | 95.45 | 92.42 | 86.46 | 79.35 | 72.46 | -- |
| CDistNet | 96.62 | 96.15 | 94.28 | 89.96 | 83.43 | 77.71 | -- |

CA-IC13

| Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & Pretrained model |
| --- | --- | --- | --- | --- | --- | --- | --- |
| VisionLAN (ICCV 2021) | 94.87 | 92.77 | 84.01 | 75.03 | 64.29 | 52.74 | Official Code |
| ABINet (CVPR 2021) | 96.62 | 95.92 | 87.86 | 76.31 | 65.46 | 54.49 | Official Code |
| RobustScanner* (ECCV 2020) | 95.22 | 94.87 | 85.30 | 76.55 | 68.38 | 60.79 | -- |
| Transformer-baseline* | 95.68 | 94.40 | 85.88 | 75.85 | 65.93 | 58.58 | -- |
| CDistNet | 96.27 | 95.57 | 88.45 | 79.58 | 70.36 | 63.13 | -- |
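
To make the comparison in the two tables easier to read, here is a small sketch that computes, for each level, CDistNet's accuracy minus the best accuracy among the other listed methods. The numbers are copied from the tables above; the helper name `margin_over_best_other` is ours and is not part of the repository.

```python
# Accuracy (%) at levels 1-6, copied from the HA-IC13 and CA-IC13 tables.
HA_IC13 = {
    "VisionLAN": [93.58, 92.88, 89.97, 82.26, 72.23, 61.03],
    "ABINet": [95.92, 95.22, 91.95, 85.76, 73.75, 64.99],
    "RobustScanner*": [96.15, 95.33, 93.23, 88.91, 81.10, 71.53],
    "Transformer-baseline*": [96.27, 95.45, 92.42, 86.46, 79.35, 72.46],
    "CDistNet": [96.62, 96.15, 94.28, 89.96, 83.43, 77.71],
}
CA_IC13 = {
    "VisionLAN": [94.87, 92.77, 84.01, 75.03, 64.29, 52.74],
    "ABINet": [96.62, 95.92, 87.86, 76.31, 65.46, 54.49],
    "RobustScanner*": [95.22, 94.87, 85.30, 76.55, 68.38, 60.79],
    "Transformer-baseline*": [95.68, 94.40, 85.88, 75.85, 65.93, 58.58],
    "CDistNet": [96.27, 95.57, 88.45, 79.58, 70.36, 63.13],
}


def margin_over_best_other(table, method="CDistNet"):
    """Per level: accuracy of `method` minus the best accuracy of the others."""
    others = [scores for name, scores in table.items() if name != method]
    # zip(*others) yields one tuple per level holding the other methods' scores.
    return [round(own - max(level), 2)
            for own, level in zip(table[method], zip(*others))]


if __name__ == "__main__":
    for name, table in (("HA-IC13", HA_IC13), ("CA-IC13", CA_IC13)):
        print(name, margin_over_best_other(table))
```

On HA-IC13 the margin is 0.35 points at level 1 and 5.25 points at level 6. On CA-IC13, ABINet is ahead by 0.35 points at levels 1 and 2, and CDistNet leads from level 3 onward.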

Datasets

The datasets are the same as those used by ABINet.

Environment

The required packages are listed in env_cdistnet.yaml.

# Create and activate the environment, then install the dependencies
conda create -n CDistNet python=3.7
conda activate CDistNet
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=9.2 -c pytorch
pip install opencv-python mmcv notebook numpy einops tensorboardX Pillow thop timm tornado tqdm matplotlib lmdb
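
After installing, it can help to confirm that every package is importable before training. The following is a minimal sketch, not part of the repository: it only uses the standard library, and the `REQUIRED` mapping and `missing_packages` helper are names we made up. The mapping is needed because some install names differ from their import names (`opencv-python` is imported as `cv2`, `Pillow` as `PIL`, `pytorch` as `torch`).

```python
import importlib.util

# Install name -> import name, for the packages listed in the commands above.
REQUIRED = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "opencv-python": "cv2",
    "mmcv": "mmcv",
    "notebook": "notebook",
    "numpy": "numpy",
    "einops": "einops",
    "tensorboardX": "tensorboardX",
    "Pillow": "PIL",
    "thop": "thop",
    "timm": "timm",
    "tornado": "tornado",
    "tqdm": "tqdm",
    "matplotlib": "matplotlib",
    "lmdb": "lmdb",
}


def missing_packages(required=REQUIRED):
    """Return the install names whose import module cannot be located."""
    return [pkg for pkg, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

This only checks that the modules can be found; it does not verify versions or CUDA availability.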

Pretrained Models

Get the pretrained models from BaiduNetdisk (passwd: d6jd) or GoogleDrive. (The training log and result.csv are provided in the same folder.) Place the pretrained models in models/reconstruct_CDistNetv3_3_10

The performance of the pretrained models is summarized as follows:
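
A quick way to confirm the download landed in the right place is to list the files in that directory. This is a minimal sketch under our own assumptions: the helper `list_model_files` is hypothetical, and since the checkpoint file names are not stated here, it simply reports whatever files it finds.

```python
from pathlib import Path

# Directory named in this README; run the check from the repository root.
MODEL_DIR = Path("models/reconstruct_CDistNetv3_3_10")


def list_model_files(model_dir=MODEL_DIR):
    """Return the sorted file names in model_dir, or raise if it is missing."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        raise FileNotFoundError(f"expected pretrained models in {model_dir}")
    return sorted(p.name for p in model_dir.iterdir() if p.is_file())
```

Calling `list_model_files()` from the repository root raises `FileNotFoundError` if the directory has not been created yet.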

Train

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --config=configs/CDistNet_config.py

Eval

CUDA_VISIBLE_DEVICES=0 python eval.py --config=configs/CDistNet_config.py
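
If you prefer to launch training and evaluation from Python (for example from a notebook), the two commands above can be wrapped as follows. This is a sketch, not repository code: `build_command` and `run` are hypothetical helpers, and only the script names, the `--config` flag, and the config path come from this README. `CUDA_VISIBLE_DEVICES` is set in the child process environment, exactly as the shell prefix does.

```python
import os
import subprocess
import sys

CONFIG = "configs/CDistNet_config.py"  # config path used in the commands above


def build_command(script, gpu_ids, config=CONFIG):
    """Return (argv, env) for running `script` restricted to the given GPU ids."""
    visible = ",".join(str(g) for g in gpu_ids)
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=visible)
    argv = [sys.executable, script, f"--config={config}"]
    return argv, env


def run(script, gpu_ids, config=CONFIG):
    """Run the script and raise CalledProcessError if it exits non-zero."""
    argv, env = build_command(script, gpu_ids, config)
    return subprocess.run(argv, env=env, check=True)


# Usage, from the repository root:
#   run("train.py", [0, 1, 2, 3])
#   run("eval.py", [0])
```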

Citation

@article{Zheng2021CDistNetPM,
  title={CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition},
  author={Tianlun Zheng and Zhineng Chen and Shancheng Fang and Hongtao Xie and Yu-Gang Jiang},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.11011}
}