CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

Primary language: Python. License: MIT.

CDeC-Net

Paper Link: arXiv | Research Gate | CVIT, IIIT-H

Introduction

CDeC-Net is an end-to-end network for detecting tables in document images. It is a multistage extension of Mask R-CNN with a dual backbone that uses deformable convolutions, which lets it detect tables of varying scale while maintaining high detection accuracy at higher IoU thresholds. CDeC-Net achieves state-of-the-art results on several publicly available benchmark datasets. The code is implemented in PyTorch using the MMDetection framework (version 2.0.0).

Release Notes:

Oct 10, 2020: Our paper has been accepted to ICPR 2020 as an oral paper.

Setup

Dependencies
Python 3.6+
PyTorch 1.4.0
Torchvision 0.5.0
CUDA 10.0
MMDetection 2.0.0
mmcv 0.5.4

  1. Clone this repository
git clone https://github.com/mdv3101/CDeCNet
  2. Install the required dependencies
pip install torch==1.4.0 torchvision==0.5.0
cd CDeCNet/
pip install -r requirements/build.txt
pip install "git+https://github.com/open-mmlab/cocoapi.git#subdirectory=pycocotools"
pip install -v -e .

Please follow install.md for detailed installation steps.
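
After installation, it can help to confirm that the environment matches the pinned versions listed under Dependencies. The following is a minimal sketch, not part of the repository: the helper names (`normalize`, `find_mismatches`, `installed_versions`) are hypothetical, and it assumes each package exposes a `__version__` attribute and that local build tags such as `+cu100` can be ignored when comparing.

```python
import importlib

# Pinned versions copied from the Dependencies list above.
PINNED = {
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "mmdet": "2.0.0",
    "mmcv": "0.5.4",
}


def normalize(version):
    """Drop a local build tag so that '1.4.0+cu100' compares equal to '1.4.0'."""
    return version.split("+", 1)[0]


def find_mismatches(installed, pinned=PINNED):
    """Return {package: (expected, found)} for each missing or mismatched package.

    `installed` maps package names to version strings. A package that is
    absent from `installed` is reported with found=None.
    """
    mismatches = {}
    for name, expected in pinned.items():
        found = installed.get(name)
        if found is None or normalize(found) != expected:
            mismatches[name] = (expected, found)
    return mismatches


def installed_versions(names):
    """Read __version__ from every package in `names` that can be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # not installed; find_mismatches reports it as missing
        versions[name] = getattr(module, "__version__", None)
    return versions
```

Calling `find_mismatches(installed_versions(PINNED))` in the target environment returns an empty dictionary when all four packages match the pinned versions.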

Training

  1. Create a folder 'dataset' in the CDeCNet directory and put your data into it. Your dataset must be in MS-COCO format. The directory structure should be:
dataset
  ├── coco
  | ├── annotations
  | ├── train2014
  | ├── val2014
  | ├── logs
  2. Create a folder 'model' in the CDeCNet directory and put the model pre-trained on MS-COCO into it. The model file can be downloaded from Google Drive.

  3. Set load_from = '/path/of/pre-trained/model' in default_runtime.py

  4. To train CDeC-Net, use the following command:

python -u tools/train.py configs/dcn/db_cascade_mask_rcnn_x101_fpn_dconv_c3-c5_1x_coco.py --work-dir dataset/coco/logs/

Note that steps 2 and 3 are optional. To train a model from scratch, skip these two steps. (Training from scratch takes longer to converge.)
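
Before launching training, the dataset layout and annotation file can be sanity-checked. The sketch below is an illustration rather than repository code: the helper names, the category name `table`, the example file name `page_0001.jpg`, and the choice to use the box outline as the segmentation polygon are all assumptions. Only the top-level keys (`images`, `annotations`, `categories`) and the `[x, y, width, height]` box convention come from the MS-COCO format itself.

```python
import os

REQUIRED_DIRS = ("annotations", "train2014", "val2014", "logs")
REQUIRED_KEYS = ("images", "annotations", "categories")


def missing_dirs(dataset_root="dataset"):
    """List the expected sub-directories of <dataset_root>/coco that do not exist."""
    coco = os.path.join(dataset_root, "coco")
    return [d for d in REQUIRED_DIRS if not os.path.isdir(os.path.join(coco, d))]


def table_annotation(ann_id, image_id, box, category_id=1):
    """Build one COCO annotation from an [x, y, width, height] box.

    Assumption: the polygon is just the box outline, so that a mask branch
    has a target even when only bounding boxes were labelled.
    """
    x, y, w, h = box
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x, y, w, h],
        "area": w * h,
        "iscrowd": 0,
        "segmentation": [[x, y, x + w, y, x + w, y + h, x, y + h]],
    }


def coco_problems(data):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in data]
    if problems:
        return problems
    image_ids = {img["id"] for img in data["images"]}
    category_ids = {cat["id"] for cat in data["categories"]}
    for ann in data["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
        bbox = ann.get("bbox", [])
        if len(bbox) != 4 or bbox[2] <= 0 or bbox[3] <= 0:
            problems.append(f"annotation {ann['id']}: invalid bbox")
    return problems


# A one-image example with a single table (all values are illustrative).
example = {
    "images": [{"id": 1, "file_name": "page_0001.jpg", "width": 1000, "height": 1400}],
    "categories": [{"id": 1, "name": "table"}],
    "annotations": [table_annotation(1, 1, [100, 200, 600, 300])],
}
```

Running `coco_problems` on a loaded annotation file and `missing_dirs()` from the repository root gives a quick indication of whether the data matches the structure described above.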

Evaluation

To evaluate the trained model, run the following command

python tools/test.py configs/dcn/db_cascade_mask_rcnn_x101_fpn_dconv_c3-c5_1x_coco.py dataset/coco/logs/latest.pth \
    --format-only --options "jsonfile_prefix=evaluation_result"

Details about various training and evaluation methods can be found in getting_started.md
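
With `--format-only`, the detections are written to JSON files instead of being scored directly. The sketch below shows one way to load such a file for inspection. It assumes the file follows the standard COCO detection-results layout (a list of records with `image_id`, `category_id`, `bbox` as `[x, y, width, height]`, and `score`); the file name `evaluation_result.bbox.json` used in the usage note is likewise an assumption derived from the `jsonfile_prefix` above, and `load_detections` is a hypothetical helper.

```python
import json
from collections import defaultdict


def load_detections(path, score_thr=0.0):
    """Group COCO-style detection records by image_id, dropping low scores."""
    with open(path) as f:
        records = json.load(f)
    by_image = defaultdict(list)
    for rec in records:
        if rec["score"] >= score_thr:
            by_image[rec["image_id"]].append(rec)
    # Put the highest-confidence detections first within each image.
    for dets in by_image.values():
        dets.sort(key=lambda r: r["score"], reverse=True)
    return dict(by_image)
```

For example, `load_detections("evaluation_result.bbox.json", score_thr=0.5)` would return a dictionary mapping each image id to its detections with a score of at least 0.5.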

Demo

To run inference on a single image, use the image_demo.py file by running the following command

python demo/image_demo.py demo_image.jpg configs/dcn/db_cascade_mask_rcnn_x101_fpn_dconv_c3-c5_1x_coco.py dataset/coco/logs/latest.pth \
    --score-thr 0.95 --output-img 'output_demo.jpg'
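
The same inference can also be run from Python through the MMDetection API (`init_detector` and `inference_detector` from `mmdet.apis`). The sketch below is an illustration under stated assumptions, not the repository's demo script: `keep_confident` and `detect_tables` are hypothetical helpers, and the table class is assumed to be class index 0, which depends on how the dataset classes are configured.

```python
def keep_confident(bbox_rows, score_thr=0.95):
    """Return [x1, y1, x2, y2, score] rows whose score is at least score_thr."""
    return [
        [float(v) for v in row]
        for row in bbox_rows
        if float(row[4]) >= score_thr
    ]


def detect_tables(image_path, config_path, checkpoint_path,
                  score_thr=0.95, device="cuda:0"):
    """Run the detector on one image and return the confident table boxes.

    mmdet is imported lazily so that keep_confident stays usable without it.
    """
    from mmdet.apis import inference_detector, init_detector

    model = init_detector(config_path, checkpoint_path, device=device)
    result = inference_detector(model, image_path)
    # Mask-based detectors return (bbox_results, segm_results); box-only
    # detectors return bbox_results directly. bbox_results holds one array
    # of [x1, y1, x2, y2, score] rows per class.
    bbox_results = result[0] if isinstance(result, tuple) else result
    # Assumption: tables are class index 0 in the trained model.
    return keep_confident(bbox_results[0], score_thr)
```

A call such as `detect_tables("demo_image.jpg", "configs/dcn/db_cascade_mask_rcnn_x101_fpn_dconv_c3-c5_1x_coco.py", "dataset/coco/logs/latest.pth")` mirrors the command above, with 0.95 as the default score threshold.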

CDeCNet Results

  1. Comparison between CDeC-Net and state-of-the-art techniques on the existing benchmark datasets.

| Dataset | Method | Precision | Recall | F1 | mAP | Checkpoint |
|---|---|---|---|---|---|---|
| ICDAR-2013 | DeCNT | 0.996 | 0.996 | 0.996 | - | |
| ICDAR-2013 | CDeC-Net | 1.000 | 1.000 | 1.000 | 1.000 | model |
| ICDAR-2017 | Yolov3 | 0.968 | 0.975 | 0.971 | - | |
| ICDAR-2017 | CDeC-Net | 0.924 | 0.970 | 0.947 | 0.912 | model |
| ICDAR-2019 | TableRadar | 0.940 | 0.950 | 0.945 | - | |
| ICDAR-2019 | CDeC-Net | 0.934 | 0.953 | 0.944 | 0.922 | model |
| UNLV | GOD | 0.910 | 0.946 | 0.928 | - | |
| UNLV | CDeC-Net | 0.925 | 0.952 | 0.938 | 0.912 | model |
| Marmot | DeCNT | 0.946 | 0.849 | 0.895 | - | |
| Marmot | CDeC-Net | 0.930 | 0.975 | 0.952 | 0.911 | model |
| TableBank | Li et al. | 0.975 | 0.987 | 0.981 | - | |
| TableBank | CDeC-Net | 0.979 | 0.995 | 0.987 | 0.976 | model |
| PubLayNet | M-RCNN | - | - | - | 0.960 | |
| PubLayNet | CDeC-Net | 0.970 | 0.988 | 0.978 | 0.967 | model |

  2. Comparison between our single model CDeC-Net‡ and state-of-the-art techniques on the existing benchmark datasets.

| Dataset | Method | Precision | Recall | F1 | mAP |
|---|---|---|---|---|---|
| ICDAR-2013 | DeCNT | 0.996 | 0.996 | 0.996 | - |
| ICDAR-2013 | CDeC-Net‡ | 0.942 | 0.993 | 0.968 | 0.942 |
| ICDAR-2017 | Yolov3 | 0.968 | 0.975 | 0.971 | - |
| ICDAR-2017 | CDeC-Net‡ | 0.899 | 0.969 | 0.934 | 0.880 |
| ICDAR-2019 | TableRadar | 0.940 | 0.950 | 0.945 | - |
| ICDAR-2019 | CDeC-Net‡ | 0.930 | 0.971 | 0.950 | 0.913 |
| UNLV | GOD | 0.910 | 0.946 | 0.928 | - |
| UNLV | CDeC-Net‡ | 0.915 | 0.970 | 0.943 | 0.912 |
| Marmot | DeCNT | 0.946 | 0.849 | 0.895 | - |
| Marmot | CDeC-Net‡ | 0.779 | 0.943 | 0.861 | 0.756 |
| TableBank | Li et al. | 0.975 | 0.987 | 0.981 | - |
| TableBank | CDeC-Net‡ | 0.970 | 0.990 | 0.980 | 0.965 |
| PubLayNet | M-RCNN | - | - | - | 0.960 |
| PubLayNet | CDeC-Net‡ | 0.975 | 0.993 | 0.984 | 0.978 |

Note: Our single model CDeC-Net‡ is trained on the IIIT-AR-13K dataset and fine-tuned on the training set of each respective dataset (if available). The base model trained on the IIIT-AR-13K dataset can be downloaded from Google Drive.
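
For readers unfamiliar with how precision, recall, and F1 are obtained for detection at a given IoU threshold, the sketch below shows one common approach: greedy one-to-one matching of predictions to ground-truth boxes in descending score order. This is an illustrative assumption, not the evaluation code used for the tables above; each benchmark defines its own official protocol and IoU thresholds. The function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predictions, ground_truth, iou_thr=0.5):
    """Score the detections of one image.

    predictions: list of (box, score) pairs; ground_truth: list of boxes.
    Each ground-truth box can be matched by at most one prediction.
    """
    matched = set()
    true_pos = 0
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_thr:
            matched.add(best_idx)
            true_pos += 1
    precision = true_pos / len(predictions) if predictions else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1
```

Raising `iou_thr` makes the match criterion stricter, which is why results reported at higher IoU thresholds are a more demanding test of localization quality.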

Qualitative Results: Table Detection by CDeC-Net



Issues

Kindly go through the tutorials and documentation provided in the docs folder.
Most common issues have already been resolved on the issue page of the official MMDetection repository. We strongly suggest going through it before raising a new issue.

Citation

If you find this work useful for your research, please cite our paper

@misc{agarwal2020cdecnet,
    title={CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images},
    author={Madhav Agarwal and Ajoy Mondal and C. V. Jawahar},
    year={2020},
    eprint={2008.10831},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Contact

CDeCNet was developed by Madhav Agarwal, Dr. Ajoy Mondal and Dr. C.V. Jawahar.
For any query, feel free to email Madhav Agarwal, explicitly mentioning 'CDeCNet' in the subject line.