
A PyTorch version of TextBoxes++, applied to the Tianchi MTWI competition.


TextBoxes_plusplus_Pytorch_for_mtwi

TextBoxes++ applied to the Tianchi competition "MTWI 2018 Challenge 2: Text Detection in Web Images".

I. Introduction

A PyTorch reimplementation of TextBoxes++. All code is adapted from ssd.pytorch.

II. Pretrained model

vgg16_reducedfc.pth
Baidu Cloud extraction code: rtcz

III. Dataset

MTWI 2018 Challenge 2: Text Detection in Web Images
Available for download after registration.

IV. Environment

  • Python 3.7
  • torch 1.4.0+cu92
  • torchvision 0.5.0+cu92
  • The default Google Colab environment also runs the code without changes

V. Usage

For the available options, see the parser arguments in train.py and test_mtwi.py. Two examples follow. The leading ! is notebook shell syntax (as used in Colab); drop it when running from a regular terminal.

  1. train:
    !python train.py --dataset mtwi384 --dataset_root /content/mtwi_2018_train --batch_size 8 --lr 1e-4
  2. test:
    !python test_mtwi.py --trained_model weights/ssd384_mtwi_90000.pth --save_folder test/sample_task2 --visual_threshold 0.18 --mtwi_root /content/mtwi_2018_task2_test
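
To make the flag types in the training example concrete, here is a minimal sketch of how such a command line is parsed. It is an illustrative stand-in, not the actual parser from train.py: only the four flag names come from the example above, while the function name `build_train_parser`, the types, and the absence of defaults are assumptions. Check train.py for the real definitions.

```python
import argparse


def build_train_parser():
    # Hypothetical stand-in for the parser in train.py.
    # Flag names are taken from the example command; the types are assumed.
    parser = argparse.ArgumentParser(
        description="Illustrative stand-in for the train.py parser")
    parser.add_argument("--dataset", type=str)
    parser.add_argument("--dataset_root", type=str)
    parser.add_argument("--batch_size", type=int)
    parser.add_argument("--lr", type=float)
    return parser


# The same arguments as in the training example above.
example = ("--dataset mtwi384 --dataset_root /content/mtwi_2018_train "
           "--batch_size 8 --lr 1e-4")
args = build_train_parser().parse_args(example.split())
print(args.dataset, args.batch_size, args.lr)  # mtwi384 8 0.0001
```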

VI. Results

Results on the MTWI 2018 challenge: Precision 0.629, Recall 0.365.
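
Only precision and recall are reported here. For a single-number comparison, the F1 score (the harmonic mean of the two) can be derived from them. The value below is computed from the reported figures and is not a number taken from the competition leaderboard.

```python
precision = 0.629  # reported above
recall = 0.365     # reported above

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.3f}")  # F1 = 0.462
```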

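VII. Notes

The reproduction still has some problems, especially when detecting slanted text. My guess is that the cause lies in how default boxes are matched: a default box is selected by its maximum IoU with the minimum bounding rectangle of the text, yet the slanted text itself has a very small IoU with that default box, so the selected box is inaccurate. This probably needs more thought, and issues are welcome.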
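
The gap described above can be illustrated numerically. The sketch below is not the matching code from this repository: the helper names, the example quadrilateral, and the use of Sutherland-Hodgman clipping are all assumptions chosen for illustration. It compares the IoU of a default box with (a) the axis-aligned bounding rectangle of a slanted text quad and (b) the quad itself.

```python
def polygon_area(pts):
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_to_rect(pts, rect):
    """Sutherland-Hodgman clipping of a convex polygon against an
    axis-aligned rectangle given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect

    def x_cross(x):  # intersection of segment pq with the vertical line at x
        return lambda p, q: (x, p[1] + (q[1] - p[1]) * (x - p[0]) / (q[0] - p[0]))

    def y_cross(y):  # intersection of segment pq with the horizontal line at y
        return lambda p, q: (p[0] + (q[0] - p[0]) * (y - p[1]) / (q[1] - p[1]), y)

    edges = [
        (lambda p: p[0] >= xmin, x_cross(xmin)),
        (lambda p: p[0] <= xmax, x_cross(xmax)),
        (lambda p: p[1] >= ymin, y_cross(ymin)),
        (lambda p: p[1] <= ymax, y_cross(ymax)),
    ]
    for inside, intersect in edges:
        out = []
        for i in range(len(pts)):
            cur, prev = pts[i], pts[i - 1]
            if inside(cur):
                if not inside(prev):
                    out.append(intersect(prev, cur))
                out.append(cur)
            elif inside(prev):
                out.append(intersect(prev, cur))
        pts = out
        if not pts:
            break
    return pts


def iou_poly_rect(pts, rect):
    """IoU between a convex polygon and an axis-aligned rectangle."""
    xmin, ymin, xmax, ymax = rect
    clipped = clip_to_rect(pts, rect)
    inter = polygon_area(clipped) if len(clipped) >= 3 else 0.0
    union = polygon_area(pts) + (xmax - xmin) * (ymax - ymin) - inter
    return inter / union


def bounding_rect(pts):
    """Axis-aligned minimum bounding rectangle of a point list."""
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical thin text line tilted by 45 degrees.
quad = [(0, 10), (10, 0), (100, 90), (90, 100)]
rect = bounding_rect(quad)  # (0, 0, 100, 100)
rect_as_poly = [(rect[0], rect[1]), (rect[2], rect[1]),
                (rect[2], rect[3]), (rect[0], rect[3])]

# Assume a default box that coincides with the bounding rectangle.
default_box = rect
print(f"{iou_poly_rect(rect_as_poly, default_box):.2f}")  # 1.00
print(f"{iou_poly_rect(quad, default_box):.2f}")          # 0.18
```

In this example the default box matches the bounding rectangle perfectly, yet it overlaps the actual text region with an IoU of only 0.18, which is consistent with the suspicion that matching on the bounding rectangle picks boxes that fit slanted text poorly.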

VIII. References

Original repository: TextBoxes_plusplus
Paper: TextBoxes++: A Single-Shot Oriented Scene Text Detector
Cite:
    @article{Liao2018Text,
      title = {{TextBoxes++}: A Single-Shot Oriented Scene Text Detector},
      author = {Minghui Liao and Baoguang Shi and Xiang Bai},
      journal = {{IEEE} Transactions on Image Processing},
      doi = {10.1109/TIP.2018.2825107},
      url = {https://doi.org/10.1109/TIP.2018.2825107},
      volume = {27},
      number = {8},
      pages = {3676--3690},
      year = {2018}
    }

    @inproceedings{LiaoSBWL17,
      author = {Minghui Liao and Baoguang Shi and Xiang Bai and Xinggang Wang and Wenyu Liu},
      title = {TextBoxes: {A} Fast Text Detector with a Single Deep Neural Network},
      booktitle = {AAAI},
      year = {2017}
    }

    @article{ShiBY17,
      author = {Baoguang Shi and Xiang Bai and Cong Yao},
      title = {An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition},
      journal = {{IEEE} TPAMI},
      volume = {39},
      number = {11},
      pages = {2298--2304},
      year = {2017}
    }