CFAM

Contrast-guided Feature Adjustment Module for Visual Information Extraction


Code and dataset for Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution. (ICDAR2023)

POIE dataset is available at https://drive.google.com/file/d/1eEMNiVeLlD-b08XW_GfAGfPmmII-GDYs/view?usp=share_link.

More details on the code and dataset will be added soon. Thank you for your attention.

Introduction

This project presents a novel end-to-end framework with a plug-and-play Contrast-guided Feature Adjustment Module (CFAM) for visual information extraction (VIE) tasks. CFAM adopts contrastive learning and designs a representation of VIE tasks that is well suited to it.
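For intuition, contrastive objectives of this kind are commonly instantiated as an InfoNCE-style loss that pulls paired feature representations together and pushes mismatched ones apart. The sketch below is a minimal NumPy illustration of that general idea, not the paper's exact formulation; the function name and pairing convention are assumptions for illustration.

```python
import numpy as np

def info_nce_loss(features_a, features_b, temperature=0.07):
    """Illustrative InfoNCE-style contrastive loss (not the paper's exact loss).

    features_a, features_b: (N, D) arrays where row i of each array forms a
    positive pair; all other rows in the batch act as negatives.
    """
    # L2-normalize so dot products become cosine similarities
    a = features_a / np.linalg.norm(features_a, axis=1, keepdims=True)
    b = features_b / np.linalg.norm(features_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                   # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positive pairs sit on the diagonal of the similarity matrix
    return -np.mean(np.diag(log_probs))
```

Correctly aligned pairs should yield a lower loss than mismatched ones, which is the signal such a module uses to adjust features.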

The main branch works with PyTorch 1.6+.


Installation

MMOCR depends on PyTorch, MMEngine, MMCV, and MMDetection. Below are quick installation steps; please refer to the Install Guide for more detailed instructions.

conda create -n open-mmlab python=3.8 pytorch=1.10 cudatoolkit=11.3 torchvision -c pytorch -y
conda activate open-mmlab
pip3 install openmim
git clone https://github.com/jfkuang/CFAM.git
cd CFAM
mim install -e .

Acknowledgement

We appreciate MMOCR as our codebase.

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{kuang2023visual,
    title={Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution},
    author={Kuang, Jianfeng and Hua, Wei and Liang, Dingkang and Yang, Mingkun and Jiang, Deqiang and Ren, Bo and Zhou, Yu and Bai, Xiang},
    booktitle={Proceedings of the 17th International Conference on Document Analysis and Recognition},
    year={2023}
}