CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents
Devashish Prasad, Ayan Gadpal, Kshitij Kapadni, Manish Visave, Kavita Sultanpure
Preprint Link of Paper
Supplementary file
The paper has been accepted at CVPR 2020 Workshop on Text and Documents in the Deep Learning Era
CascadeTabNet is an automatic table recognition method for interpreting tabular data in document images. We present an improved deep learning-based end-to-end approach that solves both table detection and structure recognition using a single Convolutional Neural Network (CNN) model. CascadeTabNet is a Cascade mask Region-based CNN High-Resolution Network (Cascade mask R-CNN HRNet) based model that detects the regions of tables and recognizes the structural body cells from the detected tables at the same time. We evaluate our results on the ICDAR 2013, ICDAR 2019 and TableBank public datasets. We achieved 3rd rank in the ICDAR 2019 post-competition results for table detection while attaining the best accuracy results on the ICDAR 2013 and TableBank datasets. We also attain the highest accuracy results on the ICDAR 2019 table structure recognition dataset.
The models are developed with the PyTorch-based MMDetection framework (version 1.2).
pip install -q mmcv terminaltables
git clone --branch v1.2.0 'https://github.com/open-mmlab/mmdetection.git'
cd "mmdetection"
python setup.py install
python setup.py develop
pip install -r requirements.txt
The code was developed with the following library versions:
PyTorch = 1.4.0
Torchvision = 0.5.0
CUDA = 10.0
pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
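Because the pinned versions above matter for MMDetection 1.2, it can help to confirm what is actually installed before running anything. The following is a minimal sketch, not part of the released code: the helper names `base_version` and `find_mismatches` are hypothetical, and it assumes the packages are registered under the distribution names `torch` and `torchvision`. It only reads package metadata, so it does not import PyTorch itself.

```python
from importlib import metadata

# Versions listed in this README (CUDA is not a Python package, so it is not checked here).
EXPECTED = {"torch": "1.4.0", "torchvision": "0.5.0"}


def base_version(version):
    """Drop a local build suffix such as '+cu100' from a version string."""
    return version.split("+", 1)[0]


def find_mismatches(expected=EXPECTED):
    """Return {package: (wanted, installed)} for every package that differs.

    'installed' is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = base_version(metadata.version(package))
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

Calling `find_mismatches()` returns an empty dictionary when both packages match the versions listed above.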
If you are using Google Colaboratory (Colab), you need to add
from google.colab.patches import cv2_imshow
and replace every call to cv2.imshow
with cv2_imshow (which takes only the image, without a window name).
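If the same script has to run both in Colab and on a local machine, the replacement described above can be wrapped in a small helper instead of editing every call. This is a sketch under that assumption; `pick_imshow` is a hypothetical name and not part of the repository.

```python
def pick_imshow():
    """Return a callable show(title, image) suited to the current environment."""
    try:
        # This module is available only inside Google Colab.
        from google.colab.patches import cv2_imshow
    except ImportError:
        # Outside Colab: fall back to OpenCV's own window-based display.
        import cv2
        return cv2.imshow
    # cv2_imshow accepts only the image, so the window title is dropped.
    return lambda title, image: cv2_imshow(image)
```

A script would then call `show = pick_imshow()` once and use `show("tables", image)` wherever it previously called `cv2.imshow("tables", image)`. Outside Colab, `cv2.waitKey` is still needed for the window to render.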
Code: dilation transform, smudge transform
TableBank Benchmarking : Leaderboard
TableBank Dataset Divisions : TableBank
Config file for the Models :
cascade_mask_rcnn_hrnetv2p_w32_20e.py
Note: The paths in the config file need to be changed only for training.
Checkpoints of the Models we have trained :
Model Name | Checkpoint File |
---|---|
General Model table detection | Checkpoint |
ICDAR 13 table detection | Checkpoint |
ICDAR 19 (Track A Modern) table detection | Checkpoint |
Table Bank Word table detection | Checkpoint |
Table Bank Latex table detection | Checkpoint |
Table Bank Both table detection | Checkpoint |
ICDAR 19 (Track B2 Modern) table structure recognition | Checkpoint |
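As a rough illustration of how a config file and one of the checkpoints above fit together, here is a minimal inference sketch using the `init_detector` and `inference_detector` functions from `mmdet.apis` in MMDetection 1.x. Several things are assumptions rather than facts stated in this README: the class names in `CLASS_NAMES`, the 0.85 score threshold, the file paths in the usage comment, and the helper names themselves.

```python
# Assumed class order for the structure recognition model; verify against your checkpoint.
CLASS_NAMES = ("Bordered", "cell", "Borderless")


def load_model(config_file, checkpoint_file, device="cuda:0"):
    """Build the detector from a config file and a downloaded checkpoint."""
    # Imported lazily so boxes_above() stays usable without MMDetection installed.
    from mmdet.apis import init_detector
    return init_detector(config_file, checkpoint_file, device=device)


def detect(model, image_path):
    """Run the detector on one image; mask models return (bbox_result, segm_result)."""
    from mmdet.apis import inference_detector
    return inference_detector(model, image_path)


def boxes_above(bbox_result, class_names=CLASS_NAMES, score_thr=0.85):
    """Group boxes by class name, keeping those with score >= score_thr.

    bbox_result holds one sequence per class; each row is [x1, y1, x2, y2, score].
    """
    kept = {}
    for name, rows in zip(class_names, bbox_result):
        kept[name] = [[float(v) for v in row] for row in rows if row[4] >= score_thr]
    return kept


# Usage (hypothetical paths):
# model = load_model("cascade_mask_rcnn_hrnetv2p_w32_20e.py", "checkpoint.pth")
# bbox_result, segm_result = detect(model, "page.png")
# tables = boxes_above(bbox_result)
```

Keeping the filtering step separate from the MMDetection calls makes it easy to try different thresholds without re-running the model.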
End to End Table Recognition Dataset
We manually annotated some of the ICDAR 19 table competition (cTDaR) dataset images for cell detection in the borderless tables. More details about the dataset are mentioned in the paper.
dataset link
General Table Detection Dataset (ICDAR 19 + Marmot + Github)
We manually corrected the annotations of Marmot and Github and combined them with ICDAR 19 dataset to create a general and robust dataset.
dataset link
You may refer to this tutorial for training MMDetection models on your custom datasets in Colab.
having useful links and results
Devashish Prasad : devashishkprasad [at] gmail [dot] com
Ayan Gadpal : ayangadpal2 [at] gmail [dot] com
Kshitij Kapadni : kshitij.kapadni [at] gmail [dot] com
Manish Visave : manishvisave149 [at] gmail [dot] com
The code of CascadeTabNet is released under the MIT License. There are no restrictions on academic or commercial use.
If you find this work useful for your research, please cite our paper:
@misc{ cascadetabnet2020,
title={CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents},
author={Devashish Prasad and Ayan Gadpal and Kshitij Kapadni and Manish Visave and Kavita Sultanpure},
year={2020},
eprint={2004.12629},
archivePrefix={arXiv},
primaryClass={cs.CV}
}