Towards Accurate Scene Text Recognition with Semantic Reasoning Networks

Unofficial PyTorch implementation of the paper, which integrates a global semantic reasoning module, a parallel visual attention module, and a visual-semantic fusion decoder. The semantic reasoning network (SRN) can be trained end-to-end.

At present, this implementation does not reach the accuracy reported in the paper. Some of the code is borrowed from deep-text-recognition-benchmark.

model

result

IIIT5k_3000	SVT	IC03_860	IC03_867	IC13_857	IC13_1015	IC15_1811	IC15_2077	SVTP	CUTE80
84.600	83.617	92.907	92.849	90.315	88.177	71.010	68.064	71.008	68.641

total_accuracy: 80.597

Feature

predict all characters at once (in parallel)
DistributedDataParallel training
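
To illustrate what predicting all characters at once means, here is a minimal sketch, not the code from this repo. It assumes the model outputs logits of shape (batch, max_len, num_classes); the character set, the use of index 0 as end-of-sequence, and the function name `decode_parallel` are hypothetical choices made for the example.

```python
from typing import List

import torch

# Hypothetical character set; index 0 is used here as the end-of-sequence symbol.
CHARSET = ["<eos>"] + list("abcdefghijklmnopqrstuvwxyz0123456789")


def decode_parallel(logits: torch.Tensor) -> List[str]:
    """Greedy-decode logits of shape (batch, max_len, num_classes) in one pass."""
    # A single argmax over the class axis gives the prediction for every
    # position at once; no step waits for the previously decoded character.
    indices = logits.argmax(dim=-1)  # (batch, max_len)
    texts = []
    for row in indices.tolist():
        chars = []
        for idx in row:
            if idx == 0:  # stop at the first end-of-sequence symbol
                break
            chars.append(CHARSET[idx])
        texts.append("".join(chars))
    return texts


# Build toy logits that spell "srn" followed by <eos>, padded to max_len = 5.
target = [CHARSET.index(c) for c in "srn"] + [0, 0]
logits = torch.full((1, 5, len(CHARSET)), -1.0)
for t, idx in enumerate(target):
    logits[0, t, idx] = 1.0

print(decode_parallel(logits))
```

An autoregressive decoder would instead loop over time steps and feed each predicted character back into the model, which is what parallel prediction avoids.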

Requirements

PyTorch >= 1.1.0

Test

download the evaluation data from deep-text-recognition-benchmark
download the pretrained model from Baidu, Password: d2qn
test on the evaluation data

python test.py --eval_data path-to-data --saved_model path-to-model

Train

download the training data from deep-text-recognition-benchmark
training from scratch

python train.py --train_data path-to-train-data --valid_data path-to-valid-data

Reference

differences from the original paper

uses ResNet to extract 1D features instead of ResNet-FPN 2D features
uses element-wise addition instead of a gated unit in the visual-semantic fusion decoder
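
The fusion difference above can be sketched as follows. This is an illustration, not the code from this repo: the class names, the feature size of 512, and the sequence length of 25 are assumptions, and the gated form (z = sigmoid(W[g; s]), f = z * g + (1 - z) * s) is my reading of the paper's fusion decoder and should be checked against the paper.

```python
import torch
import torch.nn as nn


class AddFusion(nn.Module):
    """Element-wise sum of visual and semantic features (the variant described here)."""

    def forward(self, visual: torch.Tensor, semantic: torch.Tensor) -> torch.Tensor:
        return visual + semantic


class GatedFusion(nn.Module):
    """Gated unit (assumed form): z = sigmoid(W [g; s]), f = z * g + (1 - z) * s."""

    def __init__(self, dim: int):
        super().__init__()
        # The gate sees both feature streams concatenated along the channel axis.
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, visual: torch.Tensor, semantic: torch.Tensor) -> torch.Tensor:
        z = torch.sigmoid(self.gate(torch.cat([visual, semantic], dim=-1)))
        # z weights the visual stream, (1 - z) weights the semantic stream.
        return z * visual + (1.0 - z) * semantic


# Illustrative shapes: (batch, max_len, dim); the values are assumptions.
batch, max_len, dim = 2, 25, 512
g = torch.randn(batch, max_len, dim)  # visual features
s = torch.randn(batch, max_len, dim)  # semantic features

fused_add = AddFusion()(g, s)
fused_gated = GatedFusion(dim)(g, s)
print(fused_add.shape, fused_gated.shape)
```

Addition has no learnable parameters, while the gated unit learns a per-channel weighting between the two streams; this may be one of the factors behind the accuracy gap to the paper, although that has not been verified here.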

other

It is difficult to reproduce the accuracy reported in the paper. I hope more people will try and share their results.

chenjun2hao/SRN.pytorch