Primary language: OpenEdge ABL. License: GNU General Public License v3.0 (GPL-3.0).

Chart-to-Text: A Large-Scale Benchmark for Chart Summarization

  • Authors: Shankar Kantharaj, Rixie Tiffany Ko Leong, Xiang Lin, Ahmed Masry, Megh Thakkar, Enamul Hoque, Shafiq Joty
  • Paper Link: Chart-to-Text
  • [NEW] If you are looking for powerful Chart Models, explore our latest models for chart understanding:
    • UniChart
      • A lightweight model (140M parameters) excelling in ChartQA, Chart-to-Table, Chart Summarization, and Open-ended QA.
    • ChartInstruct
      • Our advanced Chart Large Language Model based on LLaVA, supporting Llama 2 (7B) and Flan-T5-XL (3B). Suited to a wide range of chart-related tasks.
    • ChartGemma
      • The state-of-the-art Chart LLM built on PaliGemma (3B), optimized for visual reasoning tasks.
    • All models are user-friendly and can be run with just a few lines of code. Public web demos are available! Check out their GitHub repositories for more details.

Chart-to-Text Dataset

Each dataset folder (Statista or Pew) has the following structure:

├── dataset folder                  
│   ├── bboxes # JSON files that contain the list of words and their bounding boxes that were detected in the chart images.
│   │   │   ...
│   │   │   ...
│   └── captions # Text files that contain the target summaries/captions for the chart images.
│   │   │   ...
│   │   │   ...
│   └── data # CSV or Txt files that contain the underlying data table for each chart image.   
│   │   │   ...
│   │   │   ...
│   └── imgs # Chart images (png format)  
│   │   │   ...
│   │   │   ...
│   └── titles # Txt files that contain the titles of the chart images.
│   │   │   ...
│   │   │   ...
│   └── dataset_splits # CSV files that list the chart image names for each split (train/val/test).
│   │   │   ...
│   │   │   ...
│   └── **multiColumn** # A folder with the same structure that contains the multi-column charts (e.g., stacked bar charts, multi-line charts).
│   │   │   ...
│   │   │   ...
│   └── metadata.csv # A CSV file that contains extra metadata saved during the crawling process (title, x-axis label, y-axis label, etc.).
│   └── sta.txt # A text file with some statistics about the data in the folder.  
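
As a rough illustration of how the folders above fit together, the following Python sketch gathers the caption, title, data table, and image path for one chart. It is not part of the released code. The helper name `load_example` is hypothetical, and it assumes that every subfolder names its file after the same chart identifier (for example `captions/<id>.txt`, `titles/<id>.txt`, `data/<id>.csv`, `imgs/<id>.png`) and that the data table is stored as CSV. Check the actual file names in your copy of the dataset before relying on it.

```python
import csv
from pathlib import Path


def load_example(dataset_dir, chart_id):
    """Collect the files that belong to one chart in a dataset folder.

    Assumption (not confirmed by this README): each subfolder names its
    file after the same chart id, and the data table is a .csv file.
    """
    root = Path(dataset_dir)

    # Target summary and chart title are plain-text files.
    caption = (root / "captions" / f"{chart_id}.txt").read_text(encoding="utf-8").strip()
    title = (root / "titles" / f"{chart_id}.txt").read_text(encoding="utf-8").strip()

    # Read the underlying data table as raw rows; a header row, if present,
    # is kept as the first row rather than being interpreted.
    with open(root / "data" / f"{chart_id}.csv", newline="", encoding="utf-8") as f:
        table = list(csv.reader(f))

    return {
        "id": chart_id,
        "title": title,
        "caption": caption,
        "table": table,
        # The image is not opened here; only its expected location is returned.
        "image_path": root / "imgs" / f"{chart_id}.png",
    }
```

A call such as `load_example("path/to/dataset_folder", "1")` would return a dictionary with the chart's title, caption, table rows, and image path, where both the path and the id are placeholders. The same function could be pointed at the multiColumn folder, since it is described as having the same structure.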

Models

BART or T5

Please refer to Bart-T5

LogicNLG

Please refer to LogicNLG

Chart2Text

Please refer to Chart2Text

Evaluation

The metrics used in this work are listed in evaluation_metrics. Each metric has a steps.txt file that describes how to set up and run it.

Contact

If you have any questions about this work, please contact Ahmed Masry at amasry17@ku.edu.tr or ahmed.elmasry24653@gmail.com. Please note that the school email address mentioned in the paper (masry20@yorku.ca) has been deactivated following his graduation.

Reference

Please cite our paper if you use our models or dataset in your research.

@inproceedings{kantharaj-etal-2022-chart,
    title = "Chart-to-Text: A Large-Scale Benchmark for Chart Summarization",
    author = "Kantharaj, Shankar  and
      Leong, Rixie Tiffany  and
      Lin, Xiang  and
      Masry, Ahmed  and
      Thakkar, Megh  and
      Hoque, Enamul  and
      Joty, Shafiq",
    editor = "Muresan, Smaranda  and
      Nakov, Preslav  and
      Villavicencio, Aline",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.277",
    doi = "10.18653/v1/2022.acl-long.277",
    pages = "4005--4023",
}