The official code of CDistNet.
Paper link: arXiv link
CDistNet follows a different paradigm, and we are confident that it will continue to deliver high recognition performance across multiple scenarios. To that end, we explored the reasons for ABINet's high performance: separating the language model (LM) from the vision model (VM) gives ABINet additional performance gains. We have also applied training strategies to CDistNetv2 to find more room for improvement. Stay tuned for updates on CDistNet.
- HA-IC13 & CA-IC13
- Pre-train model
- Cleaned Code
- Document
- Distributed Training
We tested other SOTA methods on the HA-IC13 and CA-IC13 datasets.
CDistNet's advantage over the other SOTA methods becomes more pronounced as the character distance increases (1-6).
Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & Pretrain model |
---|---|---|---|---|---|---|---|
VisionLAN (ICCV 2021) | 93.58 | 92.88 | 89.97 | 82.26 | 72.23 | 61.03 | Official Code |
ABINet (CVPR 2021) | 95.92 | 95.22 | 91.95 | 85.76 | 73.75 | 64.99 | Official Code |
RobustScanner* (ECCV 2020) | 96.15 | 95.33 | 93.23 | 88.91 | 81.10 | 71.53 | -- |
Transformer-baseline* | 96.27 | 95.45 | 92.42 | 86.46 | 79.35 | 72.46 | -- |
CDistNet | 96.62 | 96.15 | 94.28 | 89.96 | 83.43 | 77.71 | -- |
Method | 1 | 2 | 3 | 4 | 5 | 6 | Code & Pretrain model |
---|---|---|---|---|---|---|---|
VisionLAN (ICCV 2021) | 94.87 | 92.77 | 84.01 | 75.03 | 64.29 | 52.74 | Official Code |
ABINet (CVPR 2021) | 96.62 | 95.92 | 87.86 | 76.31 | 65.46 | 54.49 | Official Code |
RobustScanner* (ECCV 2020) | 95.22 | 94.87 | 85.30 | 76.55 | 68.38 | 60.79 | -- |
Transformer-baseline* | 95.68 | 94.40 | 85.88 | 75.85 | 65.93 | 58.58 | -- |
CDistNet | 96.27 | 95.57 | 88.45 | 79.58 | 70.36 | 63.13 | -- |
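
To make the trend in the tables concrete, the following sketch computes how much accuracy each method loses between character distance 1 and character distance 6, using the numbers from the first table above. The dictionary and function names are illustrative and are not part of the CDistNet code base.

```python
# Accuracy (%) at character distances 1..6, copied from the first table above.
first_table = {
    "VisionLAN": [93.58, 92.88, 89.97, 82.26, 72.23, 61.03],
    "ABINet": [95.92, 95.22, 91.95, 85.76, 73.75, 64.99],
    "RobustScanner*": [96.15, 95.33, 93.23, 88.91, 81.10, 71.53],
    "Transformer-baseline*": [96.27, 95.45, 92.42, 86.46, 79.35, 72.46],
    "CDistNet": [96.62, 96.15, 94.28, 89.96, 83.43, 77.71],
}


def accuracy_drop(scores):
    """Return the loss in accuracy (percentage points) from distance 1 to 6."""
    return round(scores[0] - scores[-1], 2)


drops = {method: accuracy_drop(scores) for method, scores in first_table.items()}

# Print the methods from the smallest drop to the largest.
for method, drop in sorted(drops.items(), key=lambda item: item[1]):
    print(f"{method:<24} {drop:6.2f}")
```

On these numbers CDistNet loses 18.91 points between distance 1 and distance 6, compared with 30.93 for ABINet and 32.55 for VisionLAN.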
The datasets are the same as those used by ABINet.
- Training datasets
- Evaluation & Test datasets. LMDB datasets can be downloaded from BaiduNetdisk (passwd: 1dbv), GoogleDrive.
  - ICDAR 2013 (IC13)
  - ICDAR 2015 (IC15)
  - IIIT5K Words (IIIT)
  - Street View Text (SVT)
  - Street View Text-Perspective (SVTP)
  - CUTE80 (CUTE)
- Augmented IC13
  - HA-IC13 & CA-IC13: BaiduNetdisk (passwd: d6jd), GoogleDrive
- The structure of the dataset directory is:

      dataset
      ├── eval
      │   ├── CUTE80
      │   ├── IC13_857
      │   ├── IC15_1811
      │   ├── IIIT5k_3000
      │   ├── SVT
      │   └── SVTP
      ├── train
      │   ├── MJ
      │   │   ├── MJ_test
      │   │   ├── MJ_train
      │   │   └── MJ_valid
      │   └── ST
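
Before training, it can help to confirm that the downloaded data matches this layout. The following is a minimal sketch that only checks for the sub-directories named in the tree above. The helper name `missing_dataset_dirs` and the assumption that the `dataset` root sits in the current working directory are illustrative and do not come from the CDistNet code.

```python
from pathlib import Path

# Sub-directories expected under the dataset root, as listed in the tree above.
EXPECTED_DIRS = [
    "eval/CUTE80",
    "eval/IC13_857",
    "eval/IC15_1811",
    "eval/IIIT5k_3000",
    "eval/SVT",
    "eval/SVTP",
    "train/MJ/MJ_test",
    "train/MJ/MJ_train",
    "train/MJ/MJ_valid",
    "train/ST",
]


def missing_dataset_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Assumption: "dataset" is relative to the current working directory.
    missing = missing_dataset_dirs("dataset")
    if missing:
        print("Missing directories:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Dataset layout looks complete.")
```

The check looks only at directory names. It does not open the LMDB files or validate their contents.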
The required packages are listed in env_cdistnet.yaml.
# Installation
conda create -n CDistNet python=3.7
conda activate CDistNet
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=9.2 -c pytorch
pip install opencv-python mmcv notebook numpy einops tensorboardX Pillow thop timm tornado tqdm matplotlib lmdb
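
After installation, a quick way to confirm that the environment is usable is to check that each package can be located by Python. The sketch below uses only the standard library. The helper name `missing_modules` is illustrative, and the list uses import names, which for some packages differ from the pip names (for example `opencv-python` is imported as `cv2` and `Pillow` as `PIL`).

```python
import importlib.util

# Import names for the packages installed above. Some differ from their
# pip names: opencv-python -> cv2, Pillow -> PIL.
REQUIRED_MODULES = [
    "torch", "torchvision", "cv2", "mmcv", "notebook", "numpy", "einops",
    "tensorboardX", "PIL", "thop", "timm", "tornado", "tqdm", "matplotlib",
    "lmdb",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

This only reports whether a module can be found. It does not verify versions, so the pinned PyTorch and torchvision versions still need to be checked separately.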
Get the pretrained models from BaiduNetdisk (passwd: d6jd) or GoogleDrive.
(The training log and result.csv are provided in the same location.)
Place the pretrained model in models/reconstruct_CDistNetv3_3_10.
The performance of the pretrained models is summarized as follows:
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --config=configs/CDistNet_config.py
CUDA_VISIBLE_DEVICES=0 python eval.py --config=configs/CDistNet_config.py
@article{Zheng2021CDistNetPM,
title={CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition},
author={Tianlun Zheng and Zhineng Chen and Shancheng Fang and Hongtao Xie and Yu-Gang Jiang},
journal={ArXiv},
year={2021},
volume={abs/2111.11011}
}