CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images
Paper Link: arXiv | Research Gate | CVIT, IIIT-H
CDeC-Net is an end-to-end network for detecting tables in document images. It is a multistage extension of Mask R-CNN with a dual backbone that uses deformable convolutions, which lets it detect tables of varying scale while keeping detection accuracy high at higher IoU thresholds. CDeC-Net achieves state-of-the-art results on several publicly available benchmark datasets. The code is implemented in PyTorch using the MMdetection framework (version 2.0.0).
Oct 10, 2020: Our paper has been accepted to ICPR 2020 as an oral paper.
Dependencies
Python = 3.6+
PyTorch = 1.4.0
Torchvision = 0.5.0
Cuda = 10.0
MMdetection = 2.0.0
mmcv = 0.5.4
- Clone this repository
git clone https://github.com/mdv3101/CDeCNet
- Install the required dependencies
pip install torch==1.4.0 torchvision==0.5.0
cd CDeCNet/
pip install -r requirements/build.txt
pip install "git+https://github.com/open-mmlab/cocoapi.git#subdirectory=pycocotools"
pip install -v -e .
Please follow install.md for detailed installation steps.
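After installation, it can help to confirm that the environment matches the versions listed under Dependencies. The following is a minimal sketch, not part of the repository: the helper names `version_matches` and `check_environment` are hypothetical, and it assumes each package exposes a `__version__` attribute under the import names `torch`, `torchvision`, `mmdet` and `mmcv`.

```python
# Sketch: compare installed package versions with the versions pinned in this README.
from importlib import import_module

# Versions listed under "Dependencies" (import name -> pinned version).
PINNED = {
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "mmdet": "2.0.0",
    "mmcv": "0.5.4",
}


def version_matches(installed, pinned):
    """True if `installed` equals `pinned` after dropping a local build tag such as '+cu100'."""
    return installed.split("+")[0] == pinned


def check_environment(pinned=PINNED):
    """Return {package: (installed_version_or_None, ok)} for each pinned package."""
    report = {}
    for name, wanted in pinned.items():
        try:
            module = import_module(name)
            found = getattr(module, "__version__", None)
        except ImportError:
            found = None  # package is not installed in this environment
        ok = found is not None and version_matches(found, wanted)
        report[name] = (found, ok)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name:12s} pinned={PINNED[name]:8s} found={found} [{status}]")
```

A mismatch does not necessarily mean the code will fail, but the pinned versions are the ones this README states.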
- Create a folder 'dataset' in the CDeCNet directory and put your data into it. Your dataset must be in MS-COCO format. The directory structure should be:
dataset
└── coco
    ├── annotations
    ├── train2014
    ├── val2014
    └── logs
- Create a folder 'model' in the CDeCNet directory and put the model pre-trained on MS-COCO into it. The model file can be downloaded from the google drive
- Set load_from = '/path/of/pre-trained/model' in default_runtime.py
- To train CDeC-Net, use the following command
python -u tools/train.py configs/dcn/db_cascade_mask_rcnn_x101_fpn_dconv_c3-c5_1x_coco.py --work-dir dataset/coco/logs/
Note that steps 2 and 3 (downloading the pre-trained model and setting load_from) are optional. If you want to train a model from scratch, you can skip these two steps. Training from scratch takes longer to converge.
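For the dataset step above, the annotations must follow the MS-COCO layout of `images`, `annotations` and `categories`, with boxes stored as `[x, y, width, height]`. The sketch below builds such a structure for table boxes. It is only an illustration: the helper name `make_coco_table_dataset`, the input format, the file name `page_0001.jpg`, and the single category named `table` with id 1 are all assumptions, and the class list expected by the repository's config is not stated in this README.

```python
import json


def make_coco_table_dataset(pages):
    """Build a COCO-style dict with a single (assumed) 'table' category.

    `pages` is a list of dicts with keys "file_name", "width", "height"
    and "tables", where "tables" is a list of [x, y, w, h] boxes.
    """
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": "table"}],  # assumed category
    }
    ann_id = 1
    for image_id, page in enumerate(pages, start=1):
        coco["images"].append({
            "id": image_id,
            "file_name": page["file_name"],
            "width": page["width"],
            "height": page["height"],
        })
        for x, y, w, h in page["tables"]:
            # Rectangle polygon, so mask-based models also receive a segmentation.
            polygon = [x, y, x + w, y, x + w, y + h, x, y + h]
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": 1,
                "bbox": [x, y, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
                "segmentation": [polygon],
            })
            ann_id += 1
    return coco


example = make_coco_table_dataset([
    {"file_name": "page_0001.jpg", "width": 1000, "height": 1400,
     "tables": [[100, 200, 800, 300]]},
])
print(json.dumps(example["annotations"][0], indent=2))
```

The resulting dict can be written with `json.dump` into the annotations folder; the annotation file names expected by the config should be checked against the config itself.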
To evaluate the trained model, run the following command
python tools/test.py configs/dcn/db_cascade_mask_rcnn_x101_fpn_dconv_c3-c5_1x_coco.py dataset/coco/logs/latest.pth \
--format-only --options "jsonfile_prefix=evaluation_result"
Details about various training and evaluation methods can be found in getting_started.md
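To make the reported metrics concrete, here is a minimal sketch of how precision, recall and F1 at a chosen IoU threshold could be computed from COCO-style detections (`image_id`, `bbox` as `[x, y, width, height]`, `score`). It is not the authors' evaluation protocol: the greedy matching rule, the function names and the default thresholds are assumptions, and each benchmark defines its own thresholds.

```python
def iou_xywh(a, b):
    """IoU of two boxes given as [x, y, width, height]."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(detections, ground_truth, iou_thr=0.5, score_thr=0.0):
    """Greedy matching: higher-scoring detections first, each ground-truth box used at most once."""
    dets = sorted((d for d in detections if d["score"] >= score_thr),
                  key=lambda d: d["score"], reverse=True)
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for det in dets:
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if gt["image_id"] != det["image_id"] or idx in matched:
                continue
            iou = iou_xywh(det["bbox"], gt["bbox"])
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_thr:
            matched.add(best_idx)
            tp += 1
    fp = len(dets) - tp
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp) if dets else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1


gt = [{"image_id": 1, "bbox": [0, 0, 10, 10]},
      {"image_id": 1, "bbox": [20, 20, 10, 10]}]
dets = [{"image_id": 1, "bbox": [0, 0, 10, 10], "score": 0.9},
        {"image_id": 1, "bbox": [100, 100, 5, 5], "score": 0.8}]
print(precision_recall_f1(dets, gt))  # (0.5, 0.5, 0.5)
```

Raising `iou_thr` makes a match harder to obtain, which is why accuracy at higher IoU thresholds is the more demanding measure.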
To run inference on a single image, use the image_demo.py file by running the following command
python demo/image_demo.py demo_image.jpg configs/dcn/db_cascade_mask_rcnn_x101_fpn_dconv_c3-c5_1x_coco.py dataset/coco/logs/latest.pth \
--score-thr 0.95 --output-img 'output_demo.jpg'
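The `--score-thr` option keeps only confident detections. As a rough sketch of the same idea in Python, the helper below filters rows of the form `[x1, y1, x2, y2, score]`, which is the per-class box layout returned by MMdetection's detectors. The function name `tables_above_threshold` is hypothetical, and the commented usage with `init_detector` and `inference_detector` assumes the installed repository, a config path and a trained checkpoint.

```python
def tables_above_threshold(bbox_rows, score_thr=0.95):
    """Keep rows [x1, y1, x2, y2, score] with score above `score_thr`, best first."""
    kept = [[float(v) for v in row] for row in bbox_rows if float(row[4]) > score_thr]
    return sorted(kept, key=lambda row: row[4], reverse=True)


rows = [
    [50, 60, 400, 300, 0.99],  # confident detection, kept
    [10, 10, 80, 40, 0.42],    # low score, dropped
]
print(tables_above_threshold(rows))
# [[50.0, 60.0, 400.0, 300.0, 0.99]]

# Hypothetical use with MMdetection's Python API (paths are placeholders):
# from mmdet.apis import init_detector, inference_detector
# model = init_detector(config_path, checkpoint_path, device="cuda:0")
# result = inference_detector(model, "demo_image.jpg")
# bbox_result = result[0] if isinstance(result, tuple) else result  # mask models return (bbox, segm)
# tables = tables_above_threshold(bbox_result[0])  # index 0: first class
```

A high threshold such as 0.95 trades recall for precision; lowering it returns more candidate tables at the cost of more false positives.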
- Comparison between CDeC-Net and state-of-the-art techniques on the existing benchmark datasets.
Dataset | Method | Precision | Recall | F1 | mAP | Checkpoint |
---|---|---|---|---|---|---|
ICDAR-2013 | DeCNT | 0.996 | 0.996 | 0.996 | - | - |
ICDAR-2013 | CDeC-Net | 1.000 | 1.000 | 1.000 | 1.000 | model |
ICDAR-2017 | Yolov3 | 0.968 | 0.975 | 0.971 | - | - |
ICDAR-2017 | CDeC-Net | 0.924 | 0.970 | 0.947 | 0.912 | model |
ICDAR-2019 | TableRadar | 0.940 | 0.950 | 0.945 | - | - |
ICDAR-2019 | CDeC-Net | 0.934 | 0.953 | 0.944 | 0.922 | model |
UNLV | GOD | 0.910 | 0.946 | 0.928 | - | - |
UNLV | CDeC-Net | 0.925 | 0.952 | 0.938 | 0.912 | model |
Marmot | DeCNT | 0.946 | 0.849 | 0.895 | - | - |
Marmot | CDeC-Net | 0.930 | 0.975 | 0.952 | 0.911 | model |
TableBank | Li et al. | 0.975 | 0.987 | 0.981 | - | - |
TableBank | CDeC-Net | 0.979 | 0.995 | 0.987 | 0.976 | model |
PubLayNet | M-RCNN | - | - | - | 0.960 | - |
PubLayNet | CDeC-Net | 0.970 | 0.988 | 0.978 | 0.967 | model |
- Comparison between our single model CDeC-Net‡ and state-of-the-art techniques on existing benchmark datasets.
Dataset | Method | Precision | Recall | F1 | mAP |
---|---|---|---|---|---|
ICDAR-2013 | DeCNT | 0.996 | 0.996 | 0.996 | - |
ICDAR-2013 | CDeC-Net‡ | 0.942 | 0.993 | 0.968 | 0.942 |
ICDAR-2017 | Yolov3 | 0.968 | 0.975 | 0.971 | - |
ICDAR-2017 | CDeC-Net‡ | 0.899 | 0.969 | 0.934 | 0.880 |
ICDAR-2019 | TableRadar | 0.940 | 0.950 | 0.945 | - |
ICDAR-2019 | CDeC-Net‡ | 0.930 | 0.971 | 0.950 | 0.913 |
UNLV | GOD | 0.910 | 0.946 | 0.928 | - |
UNLV | CDeC-Net‡ | 0.915 | 0.970 | 0.943 | 0.912 |
Marmot | DeCNT | 0.946 | 0.849 | 0.895 | - |
Marmot | CDeC-Net‡ | 0.779 | 0.943 | 0.861 | 0.756 |
TableBank | Li et al. | 0.975 | 0.987 | 0.981 | - |
TableBank | CDeC-Net‡ | 0.970 | 0.990 | 0.980 | 0.965 |
PubLayNet | M-RCNN | - | - | - | 0.960 |
PubLayNet | CDeC-Net‡ | 0.975 | 0.993 | 0.984 | 0.978 |
Note: Our single model CDeC-Net‡ is trained on the IIIT-AR-13K dataset and fine-tuned on the training set of each respective dataset (if available). The base model trained on the IIIT-AR-13K dataset can be downloaded from the google drive
Kindly go through the various tutorials and documentation provided in the docs folder.
Most common issues have already been resolved on the Issues page of the official MMdetection repo. We strongly suggest going through it before raising a new issue.
If you find this work useful for your research, please cite our paper
@misc{agarwal2020cdecnet,
title={CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images},
author={Madhav Agarwal and Ajoy Mondal and C. V. Jawahar},
year={2020},
eprint={2008.10831},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
CDeCNet was developed by Madhav Agarwal, Dr. Ajoy Mondal and Dr. C.V. Jawahar.
For any query, feel free to drop a mail to Madhav Agarwal by explicitly mentioning 'CDeCNet' in the subject.