Fast-Invoice

A fast and simple model for multi-scenario, multi-class invoice detection (localization only).

Introduction

This model is designed for information localization on invoice-like images, which contain dense, very long, blurred, and overlapping text. There are many excellent models for scene text detection, such as EAST, PixelLink, and FOTS, but they are not well suited to this task, which requires both high accuracy and per-item classification. We therefore designed our own model for various kinds of invoices. It is based on semantic segmentation and center point prediction, and is composed of an Encoder and a Decoder. The Encoder performs feature extraction, while the Decoder performs pixel classification, center point prediction, and distance estimation. On most data the model locates center points precisely, so non-maximum suppression over bounding boxes can be removed. We provide pretrained models for value-added tax (VAT) and taxi invoices. Lite models will be released soon.
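To illustrate why precise center points can make box-level non-maximum suppression unnecessary, here is a minimal sketch of picking peaks from a center-point heatmap. It is not the decoding code of this repository: the function name `find_center_peaks`, the 3x3 neighbourhood, and the `threshold` value are assumptions made for the example, and the default `k=110` only mirrors the `--K` default described in the Test section.

```python
from typing import List, Tuple


def find_center_peaks(
    heatmap: List[List[float]],
    threshold: float = 0.5,  # assumed score cut-off, not taken from the repository
    k: int = 110,            # mirrors the documented default of --K
) -> List[Tuple[int, int, float]]:
    """Return up to k (row, col, score) entries that are 3x3 local maxima."""
    height = len(heatmap)
    width = len(heatmap[0]) if height else 0
    peaks = []
    for r in range(height):
        for c in range(width):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the 3x3 neighbourhood, clipped at the image border.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            # Keep the pixel only if nothing around it scores higher.
            if score >= max(neighbours):
                peaks.append((r, c, score))
    # Highest-scoring centers first, truncated to the item limit.
    peaks.sort(key=lambda p: p[2], reverse=True)
    return peaks[:k]
```

Each surviving peak stands for one item, so no pairwise overlap test between boxes is needed. Note that a flat plateau of equal scores would yield several peaks in this simplified version.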

Demo

Different colors denote different classes; the model detects as many items as it can.

Quick start

Install

Install PyTorch>=0.4.1 following the official instructions

git clone https://github.com/wirustea/Fast_Invoice
cd Fast_Invoice
pip install -r requirements.txt

Pretrained models

On the value-added tax (VAT) invoice dataset

model	num classes	#Params	GFLOPs	Multi-scale	mIoU_for_Seg	Link
FastInvoice_Res11	64	57M	55.5	YES	78.4%	BaiDuYun (key:ey4g)
Lite-FastInvoice_Res11	64	-	-	YES	-	-

On the multi-invoice (value-added tax and taxi) dataset

model	num classes	#Params	GFLOPs	Multi-scale	mIoU_for_Seg	Link
FastInvoice_Res18	64	-	-	YES	-	-

Test

First download the pretrained models and move them to the folder PROJECT_ROOT/pretrained_model.

python test.py --path IMAGE_PATH/VIDEO_PATH/FOLDER_PATH --model_name MODEL_NAME --pretrained_model PTH_NAME

If MODEL_NAME is not given, FastInvoice_Res11 is used by default.
--use_gpu: run on the GPU if one is available
--visualize: draw the bounding boxes on the input
--echo: show details of the detection
--K: limit on the number of detected items (default 110); use a larger value if the input contains many invoices
The bounding boxes and the visualized version of every image are written to the folder PROJECT_ROOT/result.
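For scripted or batch runs, the command line above can be assembled in Python. The sketch below uses only the flags listed in this section; the helper name `build_test_command` is hypothetical, and it assumes that `--use_gpu`, `--visualize`, and `--echo` are plain switches that take no value.

```python
from typing import List, Optional


def build_test_command(
    path: str,
    pretrained_model: str,
    model_name: Optional[str] = None,
    use_gpu: bool = False,
    visualize: bool = False,
    echo: bool = False,
    k: Optional[int] = None,
) -> List[str]:
    """Build the argument list for test.py from the documented options."""
    cmd = ["python", "test.py", "--path", path, "--pretrained_model", pretrained_model]
    if model_name is not None:
        # Omitting --model_name lets test.py fall back to FastInvoice_Res11.
        cmd += ["--model_name", model_name]
    # Assumed to be value-less switches.
    if use_gpu:
        cmd.append("--use_gpu")
    if visualize:
        cmd.append("--visualize")
    if echo:
        cmd.append("--echo")
    if k is not None:
        cmd += ["--K", str(k)]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` with PROJECT_ROOT as the working directory.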

If you just want to call the detection function from your own project:

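from test import Detection

# Detection(model_name: str, pretrained_model: str, on_gpu: bool = False)
detection = Detection(MODEL_NAME, PTH_NAME, on_gpu=False)
# detect(input: numpy.ndarray, visualize: bool); image is a numpy.ndarray
detection.detect(image, visualize=True)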

Train (updating)

Data preparation

Your directory tree and label file (JSON) should look like this:

$PROJECT_ROOT/dataset
├── 512_train
│   ├── IMAGE_NAME_1
│   ├── IMAGE_NAME_2
│   ├── ...
│   └── label.json
├── 512_test
│   ├── IMAGE_NAME_1
│   ├── IMAGE_NAME_2
│   ├── ...
│   └── label.json
└── mapping.json
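Before training, it can help to confirm that the dataset folder matches this layout. The following is a minimal sketch; the helper name `missing_dataset_files` is hypothetical, and it checks only the three JSON files named in the tree, not the images.

```python
from pathlib import Path
from typing import List

# Files the directory tree above expects, relative to the dataset folder.
EXPECTED_FILES = ("512_train/label.json", "512_test/label.json", "mapping.json")


def missing_dataset_files(dataset_root: str) -> List[str]:
    """Return the expected files that are absent under dataset_root."""
    root = Path(dataset_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

An empty list means the three files are in place, for example after calling `missing_dataset_files("dataset")` from PROJECT_ROOT.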

label.json
{
    "IMAGE_NAME_1":{
        "TAG_ONE":[
            [[12,45],[68,90],[12,90],[25,68]],
            [[12,45],[68,90],[12,90],[25,68]]
        ],
        "TAG_TWO":[
            [[12,45],[68,90],[12,90],[25,68]],
            [[12,45],[68,90],[12,90],[25,68]]
        ]
    }
}
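A label file can be sanity-checked before training. The sketch below assumes that label.json is valid JSON (double-quoted keys, no trailing commas) and that every box is a list of exactly four [x, y] points, as in the example above; the names `load_labels` and `validate_labels` are hypothetical and not part of this repository.

```python
import json
from typing import Any, Dict, List


def load_labels(path: str) -> Dict[str, Any]:
    """Read a label.json file into a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def _is_point(pt: Any) -> bool:
    # A point is a two-element list of numbers: [x, y].
    return (
        isinstance(pt, list)
        and len(pt) == 2
        and all(isinstance(v, (int, float)) for v in pt)
    )


def validate_labels(labels: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    for image_name, tags in labels.items():
        if not isinstance(tags, dict):
            problems.append(f"{image_name}: expected an object of tags")
            continue
        for tag, boxes in tags.items():
            if not isinstance(boxes, list):
                problems.append(f"{image_name}/{tag}: expected a list of boxes")
                continue
            for i, box in enumerate(boxes):
                # Assumed shape: four corner points per box.
                ok = isinstance(box, list) and len(box) == 4 and all(map(_is_point, box))
                if not ok:
                    problems.append(f"{image_name}/{tag}[{i}]: expected 4 [x, y] points")
    return problems
```

For instance, `validate_labels(load_labels("dataset/512_train/label.json"))` would list every malformed box by image name, tag, and index.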

mapping.json (class id 0 is reserved for the background)
{
    "TAG_ONE":1,
    "TAG_TWO":2,
    ...
    "TAG_N":N
}
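The mapping and the label files need to agree on tag names. The sketch below is one hedged way to check that; the function name `check_mapping` is hypothetical, and the rule that class ids must be unique is an assumption inferred from the 1..N numbering in the example, not something the repository states.

```python
from typing import Any, Dict, List


def check_mapping(mapping: Dict[str, int], labels: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return problems found when comparing mapping.json with a label.json."""
    problems = []
    ids = list(mapping.values())
    # Id 0 stands for the background, so no tag should claim it.
    if any(i == 0 for i in ids):
        problems.append("class id 0 is reserved for background")
    # Assumed rule: every tag maps to its own id.
    if len(set(ids)) != len(ids):
        problems.append("class ids are not unique")
    # Every tag used in the labels must appear in the mapping.
    used = {tag for tags in labels.values() for tag in tags}
    for tag in sorted(used - set(mapping)):
        problems.append(f"tag {tag!r} is used in labels but missing from mapping")
    return problems
```

A check like this catches case mismatches between the two files, such as a tag spelled "TAG_TWO" in the labels and "TAG_two" in the mapping.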

Labeling tool

We provide a Labeling Tool for annotating invoice data.
