PyTorch implementation of Scene Text Recognition for Transfer Learning in Indic Languages. We received Best Paper Award for this work at CBDAR'21.
- Python3.6.5
- torch==1.2.0
- torchvision==0.4.0
- tensorboard==2.3.0
- Follow the instructions in meijieru/crnn.pytorch to create lmdb datasets Use the same step to create separate training and validation synthetic data.
Please update the parameters and alphabets according to your requirement.
- Change parameters in the train.py file accordingly
- Input the list of characters (or alphabets) to the train.py file wrt to the language you are training on : The charlist input to the train.py should contain all the characters corresponding to the language present in the dataset, else the program will throw a key error during the training process.
Run mytrain.py
--
python3 mytrain.py --trainRoot <\path to the train lmdb dataset> \
--valRoot <\path to the validation lmdb dataset> \
--arch <\architecture:CRNN/STARNET> --lan <\language> --charlist <\path to the character text file> \
--batchSize 32 --nepoch 15 --cuda --expr_dir <\path to the output experiments directory> \
--displayInterval 10 --valInterval 100 --adadelta \
--manualSeed 1234 --random_sample --deal_with_lossnan
meijieru/crnn.pytorch
Sierkinhane/crnn_chinese_characters_rec