This is the official implementation of the paper LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network (AAAI 2024, Oral).
This implementation is based on mmocr-0.2.1, so please refer to it for detailed requirements. Our code has been tested with PyTorch 1.8.1 + CUDA 11.1. We recommend using Anaconda to manage environments. Run the following commands to install the dependencies.
conda create -n lranet python=3.7 -y
conda activate lranet
conda install pytorch=1.8 torchvision cudatoolkit=11.1 -c pytorch -c nvidia -c conda-forge
pip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html
pip install mmdet==2.14.0
git clone https://github.com/ychensu/LRANet
cd LRANet
pip install -r requirements.txt
python setup.py build develop
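After installation, a quick sanity check is to confirm the pinned packages are importable. The helper below is our own illustration (not part of the repo); it reports each package's version, or None if the import fails:

```python
import importlib

def installed_version(package):
    """Return the package's __version__ if importable, else None."""
    try:
        module = importlib.import_module(package)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")

# After a successful install these should match the versions pinned above.
for pkg in ("torch", "torchvision", "mmcv", "mmdet"):
    print(pkg, installed_version(pkg))
```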
Please download TotalText, CTW1500, and SynText150k according to the guide in TPSNet's README.md.
Please download and extract the above datasets into the data folder, following the file structure below.
data
├─totaltext
│ │ totaltext_train.json
│ │ totaltext_test.json
│ └─imgs
│   ├─training
│   └─test
├─CTW1500
│ │ instances_training.json
│ │ instance_test.json
│ └─imgs
│   ├─training
│   └─test
└─synthtext-150k
  ├─syntext1
  │ │ train_polygon.json
  │ └─images
  └─syntext2
    │ train_polygon.json
    └─images
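Once extracted, the layout can be sanity-checked with a short stdlib-only script. This is our own convenience sketch (not part of the repo); the expected paths are taken directly from the tree above:

```python
import os

# Expected files/dirs per dataset root, following the tree above.
EXPECTED = {
    "totaltext": ["totaltext_train.json", "totaltext_test.json",
                  "imgs/training", "imgs/test"],
    "CTW1500": ["instances_training.json", "instance_test.json",
                "imgs/training", "imgs/test"],
    "synthtext-150k/syntext1": ["train_polygon.json", "images"],
    "synthtext-150k/syntext2": ["train_polygon.json", "images"],
}

def missing_entries(data_root="data"):
    """Return the list of expected paths absent under data_root."""
    missing = []
    for subdir, entries in EXPECTED.items():
        for entry in entries:
            path = os.path.join(data_root, subdir, entry)
            if not os.path.exists(path):
                missing.append(path)
    return missing

if __name__ == "__main__":
    absent = missing_entries()
    print("OK" if not absent else "Missing: %s" % absent)
```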
Train on Total-Text with 4 GPUs:
CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/lranet/lranet_totaltext_det.py work_dirs/totaltext_det 4
Evaluate the trained model:
CUDA_VISIBLE_DEVICES=0 python tools/test.py configs/lranet/lranet_totaltext_det.py work_dirs/totaltext_det/latest.pth --eval hmean-e2e
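The hmean reported by the evaluation is the harmonic mean of detection precision and recall. As a reminder, the generic formula (not repo code) is:

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (a.k.a. F-measure / H-mean)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g. hmean(0.9, 0.8) is roughly 0.847
print(hmean(0.9, 0.8))
```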
Pretrained model for Total-Text: One Drive
We sincerely thank MMOCR, ABCNet, and TPSNet for their excellent work.