Document-Graphics-Digitization

Official repository for the ICDAR 2023 paper "Line Graphics Digitization: A Step Towards Full Automation".

LG Dataset

Download Dataset 🤗 | Paper | Video

Update (Nov/2023)

The dataset is now available on Hugging Face: https://huggingface.co/datasets/omoured/line-graphics-dataset
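For readers who prefer to fetch the data programmatically, here is a minimal sketch using the Hugging Face `datasets` library (`pip install datasets`). Only the repository id comes from the link above; the helper names `load_lg` and `dataset_url` are hypothetical, and the sketch makes no assumption about the split or column names, so inspect the returned object before relying on any of them.

```python
# Repository id taken from the Hugging Face link in this README.
REPO_ID = "omoured/line-graphics-dataset"


def dataset_url(repo_id: str = REPO_ID) -> str:
    """Return the dataset page URL on the Hugging Face Hub (hypothetical helper)."""
    return f"https://huggingface.co/datasets/{repo_id}"


def load_lg(repo_id: str = REPO_ID):
    """Download the LG dataset, or reuse the locally cached copy (hypothetical helper).

    Requires network access on first use.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # With no split argument, load_dataset returns every available split.
    return load_dataset(repo_id)
```

Calling `load_lg()` and printing the result lists the available splits and their columns, which is the quickest way to see how the images and annotations are stored.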

Description

We introduce the task of fine-grained visual understanding of mathematical graphics and present the Line Graphics (LG) dataset, which includes pixel-wise annotations of 5 coarse and 10 fine-grained categories. The dataset covers 520 images of mathematical graphics collected from 450 documents across different disciplines.

Samples

The Line Graphics (LG) dataset covers diverse mathematical graphics, including 100 bar charts (a), 320 line graphics (b, d-f), and 100 scatter plots (c). Each sample poses a significant challenge for existing chart analysis methods.

Figure: sample images and their corresponding annotations.
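As a quick sanity check on the composition stated above, the following sketch sums the per-type counts and derives each type's share of the dataset. The counts are taken from this README; the variable names are illustrative only.

```python
# Per-type image counts as stated in this README.
counts = {"bar charts": 100, "line graphics": 320, "scatter plots": 100}

# The three types should add up to the 520 images reported for the dataset.
total = sum(counts.values())

# Share of each chart type, in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]} images ({share}%)")
```

Line graphics make up the majority of the images, which is worth keeping in mind when splitting the data or reporting per-type results.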

Citation

If you find this useful for your work, please cite it as follows:

@inproceedings{moured2023line,
  title={Line Graphics Digitization: A Step Towards Full Automation},
  author={Moured, Omar and Zhang, Jiaming and Roitberg, Alina and Schwarz, Thorsten and Stiefelhagen, Rainer},
  booktitle={International Conference on Document Analysis and Recognition},
  pages={438--453},
  year={2023},
  organization={Springer}
}