This is the official implementation of the paper LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network (AAAI 2024, Oral).
This implementation is based on mmocr-0.2.1, so please refer to it for detailed requirements. Our code has been tested with PyTorch 1.8.1 + CUDA 11.1. We recommend using Anaconda to manage environments. Run the following commands to install the dependencies.
conda create -n lranet python=3.7 -y
conda activate lranet
conda install pytorch=1.8 torchvision cudatoolkit=11.1 -c pytorch -c nvidia -c conda-forge
pip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html
pip install mmdet==2.14.0
git clone https://github.com/ychensu/LRANet
cd LRANet
pip install -r requirements.txt
python setup.py build develop
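After installation, a quick sanity check is to confirm the pinned packages are importable. The helper below is our own illustration (not part of the repo); it reports each package's version, or None if the import fails:

```python
import importlib

def installed_version(package):
    """Return the package's __version__ if importable, else None."""
    try:
        module = importlib.import_module(package)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")

# After a successful install these should match the versions pinned above.
for pkg in ("torch", "torchvision", "mmcv", "mmdet"):
    print(pkg, installed_version(pkg))
```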
Please download TotalText, CTW1500, and SynText150k according to the guide in TPSNet's README.md.
Please download and extract the above datasets into the data folder, following the file structure below.
data
├─totaltext
│ │ totaltext_train.json
│ │ totaltext_test.json
│ └─imgs
│   ├─training
│   └─test
├─CTW1500
│ │ instances_training.json
│ │ instance_test.json
│ └─imgs
│   ├─training
│   └─test
└─synthtext-150k
  ├─syntext1
  │ │ train_polygon.json
  │ └─images
  └─syntext2
    │ train_polygon.json
    └─images
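Once extracted, the layout can be sanity-checked with a short stdlib-only script. This is our own convenience sketch (not part of the repo); the expected paths are taken directly from the tree above:

```python
import os

# Expected files/dirs per dataset root, following the tree above.
EXPECTED = {
    "totaltext": ["totaltext_train.json", "totaltext_test.json",
                  "imgs/training", "imgs/test"],
    "CTW1500": ["instances_training.json", "instance_test.json",
                "imgs/training", "imgs/test"],
    "synthtext-150k/syntext1": ["train_polygon.json", "images"],
    "synthtext-150k/syntext2": ["train_polygon.json", "images"],
}

def missing_entries(data_root="data"):
    """Return the list of expected paths absent under data_root."""
    missing = []
    for subdir, entries in EXPECTED.items():
        for entry in entries:
            path = os.path.join(data_root, subdir, entry)
            if not os.path.exists(path):
                missing.append(path)
    return missing

if __name__ == "__main__":
    absent = missing_entries()
    print("OK" if not absent else "Missing: %s" % absent)
```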
Train on Total-Text with 4 GPUs:
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/lranet/lranet_totaltext_det.py work_dirs/totaltext_det 4
Evaluate the trained model:
CUDA_VISIBLE_DEVICES=0 python tools/test.py configs/lranet/lranet_totaltext_det.py work_dirs/totaltext_det/latest.pth --eval hmean-e2e
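The hmean reported by the evaluation is the harmonic mean of detection precision and recall. As a reminder, the generic formula (not repo code) is:

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (a.k.a. F-measure / H-mean)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g. hmean(0.9, 0.8) is roughly 0.847
print(hmean(0.9, 0.8))
```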
Pretrained model for Total-Text: One Drive
We sincerely thank MMOCR, ABCNet, and TPSNet for their excellent work.