A fast and simple model for multi-scenario, multi-class invoice detection (localization only)
This model is designed for information localization on invoice-like images, which contain dense, very long, blurred, and overlapping text. There are many excellent models for scene text detection, such as EAST, PixelLink, and FOTS, but they are not well suited to a task that requires both high accuracy and classification. We therefore designed our model for various kinds of invoices. It is based on semantic segmentation and center point prediction and is composed of an Encoder and a Decoder. The Encoder performs feature extraction, while the Decoder performs pixel classification, center point prediction, and distance estimation. For most data, the model finds center points precisely, so non-maximum suppression of bounding boxes can be removed. We provide pretrained models for value-added tax (VAT) and taxi invoices. Lite models will be released soon.
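The description above does not say how boxes are read off the Decoder outputs. The following is a minimal sketch of one way it could work, not the project's actual code. It assumes the center point head yields an (H, W) score map and the distance head yields, for every pixel, the distances to the left, top, right, and bottom box edges. The names `local_maxima` and `decode_boxes` and the parameters `threshold` and `k` are hypothetical. Each box comes from a single local peak of the center map, which is why no non-maximum suppression step is needed.

```python
import numpy as np


def local_maxima(heat, threshold=0.5):
    """Return (row, col) indices of pixels that are >= all 8 neighbors and > threshold."""
    h, w = heat.shape
    # Pad with -inf so border pixels can still be peaks.
    padded = np.pad(heat.astype(float), 1, mode="constant", constant_values=-np.inf)
    # Stack the nine shifted views that make up each pixel's 3x3 neighborhood.
    neigh = np.stack([padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)])
    is_peak = (heat >= neigh.max(axis=0)) & (heat > threshold)
    return np.argwhere(is_peak)


def decode_boxes(heat, dist, threshold=0.5, k=110):
    """Turn center peaks into boxes (x_min, y_min, x_max, y_max).

    heat: (H, W) center scores (assumed layout).
    dist: (4, H, W) distances to the left, top, right, bottom edges (assumed layout).
    k:    keep at most k highest-scoring centers.
    """
    peaks = local_maxima(heat, threshold)
    scores = heat[peaks[:, 0], peaks[:, 1]]
    order = np.argsort(-scores)[:k]  # highest score first
    boxes = []
    for y, x in peaks[order]:
        left, top, right, bottom = dist[:, y, x]
        boxes.append((float(x - left), float(y - top), float(x + right), float(y + bottom)))
    return boxes
```

Here `k` caps the number of returned items, similar in spirit to a limit on detections per image. Whether the real model uses a 3x3 neighborhood or this distance layout is not stated in this document.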
Different colors indicate different classes. The model detects as many items as are present in the image.
- Install PyTorch>=0.4.1 following the official instructions
git clone https://github.com/wirustea/Fast_Invoice
pip install -r requirements.txt
On the value-added tax invoice dataset
model | num classes | #Params | GFLOPs | Multi-scale | mIoU_for_Seg | Link |
---|---|---|---|---|---|---|
FastInvoice_Res11 | 64 | 57M | 55.5 | YES | 78.4% | BaiDuYun (key:ey4g) |
Lite-FastInvoice_Res11 | 64 | - | - | YES | - | - |
On the multi-invoice (value-added tax and taxi) dataset
model | num classes | #Params | GFLOPs | Multi-scale | mIoU_for_Seg | Link |
---|---|---|---|---|---|---|
FastInvoice_Res18 | 64 | - | - | YES | - | - |
First download the pretrained models and move them to the folder PROJECT_ROOT/pretrained_model.
python test.py --path IMAGE_PATH/VIDEO_PATH/FOLDER_PATH --model_name MODEL_NAME --pretrained_model PTH_NAME
- if MODEL_NAME is not given, FastInvoice_Res11 is used by default
- --use_gpu use the GPU if one is available
- --visualize draw the bounding boxes on the input
- --echo show details of the detection
- --K maximum number of detected items (default 110); use a larger value if the input contains many invoices
- bounding boxes and a visualized version of every image are saved in the folder PROJECT_ROOT/result
If you just want to call the detection function from your own project:
from test import Detection
# model_name and pretrained_model are str
detection = Detection(model_name, pretrained_model, on_gpu=False)
# input is a numpy.ndarray image, visualize is a bool
detection.detect(input, visualize)
Your directory tree and label file (JSON) should look like this:
$PROJECT_ROOT/dataset
├── 512_train
│ ├── IMAGE_NAME_1
│ ├── IMAGE_NAME_2
│ ├── ...
│ ├── label.json
├── 512_test
│ ├── IMAGE_NAME_1
│ ├── IMAGE_NAME_2
│ ├── ...
│ ├── label.json
├── mapping.json
label.json
{
    "IMAGE_NAME_1":{
        "TAG_ONE":[
            [[12,45],[68,90],[12,90],[25,68]],
            [[12,45],[68,90],[12,90],[25,68]]
        ],
        "TAG_TWO":[
            [[12,45],[68,90],[12,90],[25,68]],
            [[12,45],[68,90],[12,90],[25,68]]
        ]
    },
    ...
}
mapping.json
{
    "TAG_ONE":1, # background is 0
    "TAG_TWO":2,
    ...
    "TAG_N":N
}
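Before training, it can help to check that every tag used in label.json has an entry in mapping.json and that every box is a list of four [x, y] points. The following is a minimal sketch of such a check, not a tool shipped with this project. The function name `validate_labels` and the sample data are hypothetical.

```python
import json


def validate_labels(labels, mapping):
    """Return a list of problems found; an empty list means the labels look consistent."""
    problems = []
    for image_name, tags in labels.items():
        for tag, quads in tags.items():
            # Every tag in label.json needs a class id in mapping.json.
            if tag not in mapping:
                problems.append(f"{image_name}: tag {tag!r} missing from mapping")
            # Every box is expected to be four [x, y] corner points.
            for i, quad in enumerate(quads):
                if len(quad) != 4 or any(len(point) != 2 for point in quad):
                    problems.append(f"{image_name}: {tag}[{i}] is not four [x, y] points")
    return problems


# Hypothetical sample data in the same shape as label.json and mapping.json.
labels = json.loads('{"IMAGE_NAME_1": {"TAG_ONE": [[[12,45],[68,90],[12,90],[25,68]]]}}')
mapping = json.loads('{"TAG_ONE": 1, "TAG_TWO": 2}')
print(validate_labels(labels, mapping))
```

With real files, `labels` and `mapping` would be loaded with `json.load` from the label.json of a split and from mapping.json.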
We provide a tool for labeling invoice data: Labeling Tool