# Awesome-LLM-Tabular

Awesome-LLM-Tabular: a curated list of Large Language Models applied to Tabular Data

License: MIT

💡 Since the emergence of ChatGPT, Large Language Models (LLMs) have garnered significant attention, with new advancements continuously emerging. LLMs have found applications in various domains like vision, audio, and text tasks. However, tabular data remains a crucial data format in this world. Hence, this repo focuses on collecting research papers that explore the integration of LLM technology with tabular data, and aims to save you valuable time and boost research efficiency.

✨ Awesome-LLM-Tabular is a curated list of Large Language Models applied to Tabular Data.

🔥 This project is currently under development. Feel free to ⭐ (STAR) and 🔭 (WATCH) it to stay updated on the latest developments.
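To make the "LLM meets tabular data" idea above concrete, here is a minimal sketch of row serialization, the core technique behind approaches such as TabLLM: each table row is rendered as a natural-language sentence that an LLM can consume as a few-shot example. The function name and the column names below are illustrative, not taken from any listed paper.

```python
def serialize_row(row: dict) -> str:
    """Render one tabular row as a 'The <column> is <value>.' sentence.

    This is the simplest of several serialization templates studied in
    the TabLLM line of work; real systems also try list, JSON, and
    LLM-generated templates.
    """
    parts = [f"The {col} is {val}." for col, val in row.items()]
    return " ".join(parts)


# Illustrative row from a hypothetical income-prediction table.
row = {"age": 39, "occupation": "teacher", "hours_per_week": 40}
print(serialize_row(row))
# → The age is 39. The occupation is teacher. The hours_per_week is 40.
```

The serialized string is then placed into a prompt (optionally with a few labeled examples) and the LLM is asked to predict the label, which is how several papers in the table below adapt pretrained language models to classification on tables.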

## Table of Contents

- [Related Papers](#related-papers)
- [Workshops](#workshops)
- [Useful Blogs](#useful-blogs)
- [Citation](#citation)
- [Contributing](#contributing)

## Related Papers

| Date | Keywords | Paper |
| --- | --- | --- |
| 2019/09 | TabFact | TabFact: A Large-scale Dataset for Table-based Fact Verification |
| 2020 | TableGPT | TableGPT: Few-shot Table-to-Text Generation with Table Structure Reconstruction and Content Matching |
| 2020/05 | TaBERT | TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data |
| 2020/09 | GraPPa | GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing |
| 2022/02 | TableQuery | TableQuery: Querying tabular data with natural language |
| 2022/05 | FeSTE | Few-Shot Tabular Data Enrichment Using Fine-Tuned Transformer Architectures |
| 2022/05 | FM | Can Foundation Models Wrangle Your Data? |
| 2022/05 | TURL | Technical Perspective of TURL: Table Understanding through Representation Learning |
| 2022/06 | TabText | TabText: A Flexible and Contextual Approach to Tabular Data Representation |
| 2022/06 | LIFT | LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks |
| 2022/09 | PTab | PTab: Using the Pre-trained Language Model for Modeling Tabular Data |
| 2022/09 | TabMWP | Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning |
| 2022/10 | GReaT | Language Models are Realistic Tabular Data Generators |
| 2022/10 | TabLLM | TabLLM: Few-shot Classification of Tabular Data with Large Language Models |
| 2023/?? | IngesTables | IngesTables: Scalable and Efficient Training of LLM-Enabled Tabular Foundation Models |
| 2023/?? | Elephants | Elephants Never Forget: Testing Language Models for Memorization of Tabular Data |
| 2023/01 | DATER | Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning |
| 2023/02 | AdaPTGen | Adapting Prompt for Few-shot Table-to-Text Generation |
| 2023/03 | Survey Paper | Transformers for Tabular Data Representation: A Survey of Models and Applications |
| 2023/04 | TABLET | TABLET: Learning From Instructions For Tabular Data |
| 2023/05 | AnyPredict | AnyPredict: Foundation Model for Tabular Prediction |
| 2023/05 | TAPTAP | Generative Table Pre-training Empowers Models for Tabular Prediction |
| 2023/07 | TableGPT | TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT |
| 2023/07 | UniTabE | UniTabE: A Universal Pretraining Protocol for Tabular Foundation Model in Data Science |
| 2023/10 | TabFMs | Towards Foundation Models for Learning on Tabular Data |
| 2023/10 | TableFormat | Tabular Representation, Noisy Operators, and Impacts on Table Structure Understanding Tasks in LLMs |
| 2023/10 | UniPredict | UniPredict: Large Language Models are Universal Tabular Classifiers |
| 2023/10 | Table-GPT | Table-GPT: Table-tuned GPT for Diverse Table Tasks |
| 2023/11 | NumericalReasoning | Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data |
| 2023/12 | TaCo | Chain-of-Thought Reasoning in Tabular Language Models |
| 2024/01 | Chain-of-Table | Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding |
| 2024/01 | TAT-LLM | TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data |
| 2024/02 | Survey Paper | LLM on Tabular Data: Prediction, Generation, and Understanding |
| 2024/02 | CABINET | CABINET: Content Relevance based Noise Reduction for Table Question Answering |
| 2024/02 | OpenTab | OpenTab: Advancing Large Language Models as Open-domain Table Reasoners |
| 2024/02 | CancerGPT | CancerGPT for few shot drug pair synergy prediction using large pretrained language models |
| 2024/02 | Exploration of LLM on Tabular | Tables as Images? Exploring the Strengths and Limitations of LLMs on Multimodal Representations of Tabular Data |
| 2024/03 | TableLLM | Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study |
| 2024/03 | ITAB-LLM | Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science |
| 2024/03 | TP-BERTa | Making Pre-trained Language Models Great on Tabular Prediction |
| 2024/04 | FeatLLM | Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning |
| 2024/04 | TabSQLify | TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition |
| 2024/04 | LLMClean | LLMClean: Context-Aware Tabular Data Cleaning via LLM-Generated OFDs |
| 2024/07 | folktexts | Evaluating language models as risk scores |
| 2024/07 | SpreadsheetLLM | SpreadsheetLLM: Encoding Spreadsheets for Large Language Models |

## Workshops

## Useful Blogs

## Citation

```bibtex
@misc{wu2024awesomellmtabular,
  author = {Hong-Wei, Wu},
  title = {Awesome-LLM-Tabular},
  year = {2024},
  note = {Accessed: 2024-05-30},
  url = {https://github.com/johnnyhwu/Awesome-LLM-Tabular},
  orcid = {https://orcid.org/0009-0005-8073-5297}
}
```

## Contributing

We welcome contributions to keep this repository up to date with the latest research and applications of LLMs in the tabular domain. Whether you want to correct mistakes, add new content, or suggest improvements, your contributions are highly appreciated 🤗.