PyTorch implementation of Chinese text region detection and text recognition
conda create -n ocr python=3.6
source activate ocr
pip install -r requirements.txt -i https://mirrors.163.com/pypi/simple/ --default-timeout=3000
sh ocr_main.sh
pip install ipykernel
python -m ipykernel install --name ocr
#jupyter kernelspec remove kernelname
#jupyter kernelspec list
Modify the corresponding parameters: the input and output folder paths, and the paths to the two models.
python3 ocr_main.py \
-i 'temp/input/' \  # modify
-o 'temp/output/' \  # modify
--cuda False \
--batch_size 32 \
--label_file_list 'sample_data/chars.txt' \
--Transformation TPS \
--FeatureExtraction ResNet \
--SequenceModeling BiLSTM \
--Prediction Attn \
--saved_model atte/saved_models/TPS-ResNet-BiLSTM-Attn-Seed1111/best_accuracy.pth \  # modify
--trained_model 'craft/weights/craft_mlt_25k.pth'  # modify
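Note on the `--cuda` flag: it takes a separate value (`--cuda False`), not `--cuda==False`. The sketch below shows why, assuming `ocr_main.py` parses the flag with a common `str2bool`-style argparse helper (an assumption; check the script for the actual definition):

```python
import argparse

def str2bool(v):
    """Common argparse helper: map 'True'/'False' strings to booleans."""
    return str(v).lower() in ("yes", "true", "t", "1")

parser = argparse.ArgumentParser()
parser.add_argument("--cuda", type=str2bool, default=True)

# The value is a separate token, so pass '--cuda False' (not '--cuda==False'):
args = parser.parse_args(["--cuda", "False"])
print(args.cuda)  # False
```

With `--cuda==False`, argparse would receive the literal string `=False` (or fail to match the flag at all), so the boolean would not be set as intended.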
sh ocr_main.sh
Official PyTorch implementation of CRAFT text detector | Paper | Pretrained Model | Supplementary
- Download the trained models
Model name | Used datasets | Languages | Purpose | Model Link |
---|---|---|---|---|
General | SynthText, IC13, IC17 | Eng + MLT | For general purpose | Click |
IC15 | SynthText, IC15 | Eng | For IC15 only | Click |
LinkRefiner | CTW1500 | - | Used with the General Model | Click |
- Run with pretrained model
python test.py --trained_model=[weightfile] --test_folder=[folder path to test images]
| paper | training and evaluation data | failure cases and cleansed label | pretrained model | Baidu ver(passwd:rryk) |
cd data_generate
sh generation.sh 'demo.txt' 0.2 ../output/
cd recognition
sh run.sh
python3 demo.py \
--Transformation TPS \
--FeatureExtraction ResNet \
--SequenceModeling BiLSTM \
--Prediction Attn \
--image_folder ../test_image/ \
--saved_model saved_model/TPS-ResNet-BiLSTM-Attn-Seed1111/best_accuracy.pth
About 3.64 million images in total, split 99:1 into training and validation sets.
- Samples are generated randomly from a Chinese corpus (news + classical Chinese) with variations in font, size, grayscale, blur, perspective, and stretching.
- The character set covers 5,990 characters in total: Chinese characters, English letters, digits, and punctuation (character set: https://github.com/YCG09/chinese_ocr/blob/master/train/char_std_5990.txt).
- Each sample contains exactly 10 characters, randomly cropped from sentences in the corpus.
- All images are 280x32 pixels.
Download link: https://pan.baidu.com/s/1QkI7kjah8SPHwOQ40rS1Pw (password: lu7m)
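The label-sampling step described above (fixed 10-character snippets cropped from corpus sentences, restricted to the character set) can be sketched as follows. This is a minimal illustration with a toy in-memory corpus; the real pipeline loads the news + classical-Chinese corpus and the char_std_5990.txt file, and then renders each snippet to a 280x32 image:

```python
import random

def sample_label(sentences, charset, length=10):
    """Randomly crop a fixed-length, in-charset snippet from a corpus sentence."""
    while True:
        s = random.choice(sentences)
        if len(s) < length:
            continue  # sentence too short to yield a 10-char crop
        start = random.randrange(len(s) - length + 1)
        snippet = s[start:start + length]
        if all(ch in charset for ch in snippet):
            return snippet

# Toy corpus standing in for the real news + classical-Chinese corpus
corpus = ["深度学习文本识别示例句子之一", "这是另一条用于演示的语料句子"]
charset = set("".join(corpus))  # stand-in for the 5,990-character set
label = sample_label(corpus, charset)
print(len(label))  # always 10 characters
```

The real generator additionally applies the font, size, grayscale, blur, perspective, and stretch augmentations when rendering each snippet to an image.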
python3 create_lmdb_dataset.py --inputPath /home/ec2-user/datasets/DataSet/images --gtFile /home/ec2-user/datasets/DataSet/train_label.txt --outputPath ../output/train
python3 create_lmdb_dataset.py --inputPath /home/ec2-user/datasets/DataSet/images --gtFile /home/ec2-user/datasets/DataSet/valid_label.txt --outputPath ../output/valid
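The two commands above assume the labels have already been split 99:1 into `train_label.txt` and `valid_label.txt`. A minimal sketch of that split, assuming each line has the `<image path>\t<label>` form that `create_lmdb_dataset.py` reads (the filenames and ratio here mirror the description above):

```python
import random

def split_labels(lines, valid_ratio=0.01, seed=1111):
    """Shuffle label lines and split them 99:1 into (train, valid)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    lines = list(lines)
    rng.shuffle(lines)
    n_valid = max(1, int(len(lines) * valid_ratio))
    return lines[n_valid:], lines[:n_valid]

# Each line: "<image filename>\t<label>" (hypothetical filenames for illustration)
all_lines = [f"img_{i:07d}.jpg\t样本{i}" for i in range(1000)]
train, valid = split_labels(all_lines)
print(len(train), len(valid))  # 990 10
```

Write `train` and `valid` out to `train_label.txt` and `valid_label.txt` before running the two `create_lmdb_dataset.py` commands.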
aws s3 cp temp/output/test012.json s3://dikers-html/ocr_output/