FCENet in MindSpore

English | 简体中文

FCENet Description

Fourier Contour Embedding Network (FCENet) is a text detector that accurately detects arbitrarily shaped text in natural scenes. One of the main challenges in arbitrary-shaped text detection is designing a text instance representation that allows networks to learn diverse text geometry variations. Most existing methods model text instances in the image spatial domain via masks or contour point sequences in the Cartesian or polar coordinate system. However, the mask representation can lead to expensive post-processing, while point sequences have limited capability to model highly curved text. To tackle these problems, FCENet models text instances in the Fourier domain using the Fourier Contour Embedding (FCE) method, which represents arbitrarily shaped text contours as compact signatures. FCENet consists of a backbone, a feature pyramid network (FPN), and simple post-processing based on the Inverse Fourier Transform (IFT) and Non-Maximum Suppression (NMS).

Paper: Yiqin Zhu, Jianyong Chen, Lingyu Liang, Zhanghui Kuang, Lianwen Jin, Wayne Zhang. "Fourier Contour Embedding for Arbitrary-Shaped Text Detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
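
To make the representation concrete, here is a minimal NumPy sketch (illustrative only, not code from this repo) of how a closed contour sampled to N points can be compressed into 2k+1 low-frequency Fourier coefficients and approximately reconstructed with an inverse transform, which is the role of the IFT step in post-processing:

    import numpy as np

    def fourier_embed(points, k=5):
        """Compress a closed contour of shape (N, 2) into 2k+1 complex Fourier coefficients."""
        z = points[:, 0] + 1j * points[:, 1]      # treat (x, y) points as complex numbers
        coeffs = np.fft.fft(z) / len(z)           # normalized DFT of the contour sequence
        # keep the k lowest negative and positive frequencies plus the DC term
        return np.concatenate([coeffs[-k:], coeffs[:k + 1]])

    def fourier_reconstruct(coeffs, n_points=50):
        """Rebuild an approximate contour from the compact signature (the IFT step)."""
        k = (len(coeffs) - 1) // 2
        t = np.arange(n_points) / n_points
        freqs = np.arange(-k, k + 1)
        z = (coeffs[None, :] * np.exp(2j * np.pi * freqs[None, :] * t[:, None])).sum(axis=1)
        return np.stack([z.real, z.imag], axis=1)

    # toy example: an ellipse-shaped "text contour" summarized by k = 5 coefficients
    theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
    contour = np.stack([60 * np.cos(theta), 30 * np.sin(theta)], axis=1)
    signature = fourier_embed(contour, k=5)
    approx = fourier_reconstruct(signature, n_points=100)
    print(signature.shape, approx.shape)  # (11,) (100, 2)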

Example

Performance

FCENet Val Performance (ICDAR2015).

Implementation   Recall   Precision   Hmean-iou
Paper            84.2%    85.1%       84.6%
Torch            81.2%    88.7%       84.7%
MindSpore        80.7%    88.4%       84.4%

FCENet Val Performance (CTW1500).

Implementation   Recall   Precision   Hmean-iou
Paper            80.7%    85.7%       83.1%
Torch            79.1%    83.0%       81.0%
MindSpore        82.3%    83.5%       82.8%
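
Hmean-iou is the harmonic mean (F-score) of recall and precision, computed with IoU-based matching between detected and ground-truth polygons. As a quick sanity check of the numbers above:

    def hmean(recall, precision):
        """Harmonic mean of recall and precision (the Hmean-iou column)."""
        return 2 * recall * precision / (recall + precision)

    # reproduces the paper's ICDAR2015 row: 84.2% recall, 85.1% precision -> 84.6% Hmean
    print(round(hmean(0.842, 0.851), 3))  # 0.846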

Dataset

Overview

Dataset      Link
CTW1500      homepage
ICDAR2015    homepage

CTW1500

  • Step1: Download train_images.zip, test_images.zip, train_labels.zip, test_labels.zip using the links below

    mkdir CTW1500 && cd CTW1500
    mkdir imgs && mkdir annotations
    
    # For annotations
    cd annotations
    wget -O train_labels.zip https://universityofadelaide.box.com/shared/static/jikuazluzyj4lq6umzei7m2ppmt3afyw.zip
    wget -O test_labels.zip https://cloudstor.aarnet.edu.au/plus/s/uoeFl0pCN9BOCN5/download
    unzip train_labels.zip && mv ctw1500_train_labels training
    unzip test_labels.zip -d test
    cd ..
    # For images
    cd imgs
    wget -O train_images.zip https://universityofadelaide.box.com/shared/static/py5uwlfyyytbb2pxzq9czvu6fuqbjdh8.zip
    wget -O test_images.zip https://universityofadelaide.box.com/shared/static/t4w48ofnqkdw7jyc4t11nsukoeqk9c3d.zip
    unzip train_images.zip && mv train_images training
    unzip test_images.zip && mv test_images test
  • Step2: Generate instances_training.txt and instances_test.txt with the following command:

    python tools/ctw1500_converter.py /path/to/ctw1500 -o /path/to/ctw1500 --split-list training test
  • The resulting directory structure looks like the following (a quick layout check is sketched after this list):

    ├── CTW1500
    │   ├── imgs
    │   ├── annotations
    │   ├── instances_training.txt
    │   └── instances_test.txt
    
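
Before moving on, you can quickly verify the layout with a short check (a minimal sketch that only tests for the paths listed above):

    import os

    # expected CTW1500 layout after Step1 and Step2 (paths taken from the tree above)
    root = "CTW1500"
    expected = [
        "imgs/training", "imgs/test",
        "annotations/training", "annotations/test",
        "instances_training.txt", "instances_test.txt",
    ]
    for rel in expected:
        path = os.path.join(root, rel)
        print(("ok   " if os.path.exists(path) else "miss ") + path)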

ICDAR2015

  • Step1: Download ch4_training_images.zip, ch4_test_images.zip, ch4_training_localization_transcription_gt.zip, Challenge4_Test_Task1_GT.zip from the homepage (the ground-truth file format is illustrated after this list)

    mkdir ICDAR2015 && cd ICDAR2015
    mkdir imgs && mkdir annotations
    
    unzip ch4_training_images.zip -d ch4_training_images
    unzip ./ch4_test_images.zip -d ch4_test_images
    unzip ch4_training_localization_transcription_gt.zip -d ch4_training_localization_transcription_gt
    unzip Challenge4_Test_Task1_GT.zip -d Challenge4_Test_Task1_GT
    # For images,
    mv ch4_training_images imgs/training
    mv ch4_test_images imgs/test
    # For annotations,
    mv ch4_training_localization_transcription_gt annotations/training
    mv Challenge4_Test_Task1_GT annotations/test
  • Step2: Generate instances_training.txt and instances_test.txt with the following command:

    python tools/icdar2015_converter.py ./ICDAR2015 -o ./ICDAR2015 -d icdar2015 --split-list training test
  • The resulting directory structure looks like the following:

    ├── ICDAR2015
    │   ├── imgs
    │   ├── annotations
    │   ├── instances_training.txt
    │   └── instances_test.txt
    
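
For reference, each ICDAR2015 ground-truth file placed under annotations/ stores one text instance per line: eight comma-separated quadrilateral coordinates followed by a transcription, where "###" marks an ignored region. A minimal parser sketch (not part of this repo):

    def parse_icdar2015_line(line):
        """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,transcription' into a polygon and its text."""
        parts = line.strip().split(",")
        coords = list(map(int, parts[:8]))
        polygon = list(zip(coords[0::2], coords[1::2]))   # [(x1, y1), ..., (x4, y4)]
        transcription = ",".join(parts[8:])               # the text itself may contain commas
        ignore = transcription == "###"
        return polygon, transcription, ignore

    print(parse_icdar2015_line("377,117,463,117,465,130,378,130,Genaxis Theatre"))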

Pretrained Model

Download the PyTorch pretrained model resnet50-19c8e357.pth and convert it to a MindSpore checkpoint:

python tools/resnet_model_torch2mindspore.py --torch_file=/path_to_model/resnet50-19c8e357.pth --output_path=../
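
The converter's job is mainly to rename the torchvision ResNet-50 parameter keys to their MindSpore counterparts. For orientation only, a rough sketch of the general pattern (not the actual tools/resnet_model_torch2mindspore.py, whose key mapping is model-specific):

    import torch
    import mindspore as ms

    def convert(torch_ckpt, ms_ckpt, rename):
        """Load a PyTorch state dict, rename its keys, and save a MindSpore .ckpt file."""
        state = torch.load(torch_ckpt, map_location="cpu")
        params = []
        for name, tensor in state.items():
            ms_name = rename(name)          # model-specific key mapping (the real work)
            if ms_name is None:             # skip keys with no MindSpore counterpart
                continue
            params.append({"name": ms_name, "data": ms.Tensor(tensor.numpy())})
        ms.save_checkpoint(params, ms_ckpt)

    # usage (identity renaming shown only as a placeholder):
    # convert("resnet50-19c8e357.pth", "resnet50.ckpt", rename=lambda n: n)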

Environment Requirements

Quick Start

After installing MindSpore via the official website, you can start training and evaluation as follows:

  • running on GPU
# run training example
CUDA_VISIBLE_DEVICES=1 nohup python train.py --config_path='./configs/CTW1500_config.yaml' > CTW1500_out.log &

nohup python train.py --config_path='./configs/ICDAR2015_config.yaml' > ICDAR2015_out.log &

# run distributed training example
CUDA_VISIBLE_DEVICES=0,1,2,3 sh scripts/run_distribute_train_gpu.sh

# run test.py
python test.py

python test.py --config_path='./configs/CTW1500_config.yaml'

python test.py --config_path='./configs/ICDAR2015_config.yaml'

# run inference and display detection results
python infer_det.py

python infer_det.py --config_path='./configs/ICDAR2015_config.yaml'
  • running on Ascend
# run training example
nohup python train.py --config_path='./configs/CTW1500_config.yaml' --device_target='Ascend' > CTW1500_out.log &

nohup python train.py --config_path='./configs/ICDAR2015_config.yaml' --device_target='Ascend' > ICDAR2015_out.log &

# run distributed training example
sh scripts/run_distribute_train_ascend.sh

# run test.py
python test.py

python test.py --config_path='./configs/CTW1500_config.yaml' --device_target='Ascend'

python test.py --config_path='./configs/ICDAR2015_config.yaml' --device_target='Ascend'

# run inference and display detection results
python infer_det.py

python infer_det.py --config_path='./configs/ICDAR2015_config.yaml'