# Awesome-LLM-Tabular

Awesome-LLM-Tabular: a curated list of Large Language Models applied to Tabular Data

License: MIT

💡 Since the emergence of ChatGPT, Large Language Models (LLMs) have garnered significant attention, with new advancements continuously emerging. LLMs have found applications in various domains like vision, audio, and text tasks. However, tabular data remains a crucial data format in this world. Hence, this repo focuses on collecting research papers that explore the integration of LLM technology with tabular data, and aims to save you valuable time and boost research efficiency.

✨ Awesome-LLM-Tabular is a curated list of Large Language Models applied to Tabular Data.

🔥 This project is currently under development. Feel free to ⭐ (STAR) and 🔭 (WATCH) it to stay updated on the latest developments.
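To make the "LLM meets tabular data" idea above concrete, here is a minimal sketch of row serialization, the core technique behind approaches such as TabLLM: each table row is rendered as a natural-language sentence that an LLM can consume as a few-shot example. The function name and the column names below are illustrative, not taken from any listed paper.

```python
def serialize_row(row: dict) -> str:
    """Render one tabular row as a 'The <column> is <value>.' sentence.

    This is the simplest of several serialization templates studied in
    the TabLLM line of work; real systems also try list, JSON, and
    LLM-generated templates.
    """
    parts = [f"The {col} is {val}." for col, val in row.items()]
    return " ".join(parts)


# Illustrative row from a hypothetical income-prediction table.
row = {"age": 39, "occupation": "teacher", "hours_per_week": 40}
print(serialize_row(row))
# → The age is 39. The occupation is teacher. The hours_per_week is 40.
```

The serialized string is then placed into a prompt (optionally with a few labeled examples) and the LLM is asked to predict the label, which is how several papers in the table below adapt pretrained language models to classification on tables.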

## Table of Contents

- [Related Papers](#related-papers)
- [Workshops](#workshops)
- [Useful Blogs](#useful-blogs)
- [Citation](#citation)
- [Contributing](#contributing)

## Related Papers

| Date | Keywords | Paper |
| --- | --- | --- |
| 2019/09 | TabFact | TabFact: A Large-scale Dataset for Table-based Fact Verification |
| 2020 | TableGPT | TableGPT: Few-shot Table-to-Text Generation with Table Structure Reconstruction and Content Matching |
| 2020/05 | TaBERT | TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data |
| 2020/09 | GraPPa | GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing |
| 2022/02 | TableQuery | TableQuery: Querying tabular data with natural language |
| 2022/05 | FeSTE | Few-Shot Tabular Data Enrichment Using Fine-Tuned Transformer Architectures |
| 2022/05 | FM | Can Foundation Models Wrangle Your Data? |
| 2022/05 | TURL | Technical Perspective of TURL: Table Understanding through Representation Learning |
| 2022/06 | TabText | TabText: A Flexible and Contextual Approach to Tabular Data Representation |
| 2022/06 | LIFT | LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks |
| 2022/09 | PTab | PTab: Using the Pre-trained Language Model for Modeling Tabular Data |
| 2022/09 | TabMWP | Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning |
| 2022/10 | GReaT | Language Models are Realistic Tabular Data Generators |
| 2022/10 | TabLLM | TabLLM: Few-shot Classification of Tabular Data with Large Language Models |
| 2023/?? | IngesTables | IngesTables: Scalable and Efficient Training of LLM-Enabled Tabular Foundation Models |
| 2023/?? | Elephants | Elephants Never Forget: Testing Language Models for Memorization of Tabular Data |
| 2023/01 | DATER | Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning |
| 2023/02 | AdaPTGen | Adapting Prompt for Few-shot Table-to-Text Generation |
| 2023/03 | Survey Paper | Transformers for Tabular Data Representation: A Survey of Models and Applications |
| 2023/04 | TABLET | TABLET: Learning From Instructions For Tabular Data |
| 2023/05 | AnyPredict | AnyPredict: Foundation Model for Tabular Prediction |
| 2023/05 | TAPTAP | Generative Table Pre-training Empowers Models for Tabular Prediction |
| 2023/07 | TableGPT | TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT |
| 2023/07 | UniTabE | UniTabE: A Universal Pretraining Protocol for Tabular Foundation Model in Data Science |
| 2023/10 | TabFMs | Towards Foundation Models for Learning on Tabular Data |
| 2023/10 | TableFormat | Tabular Representation, Noisy Operators, and Impacts on Table Structure Understanding Tasks in LLMs |
| 2023/10 | UniPredict | UniPredict: Large Language Models are Universal Tabular Classifiers |
| 2023/10 | Table-GPT | Table-GPT: Table-tuned GPT for Diverse Table Tasks |
| 2023/11 | NumericalReasoning | Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data |
| 2023/12 | TaCo | Chain-of-Thought Reasoning in Tabular Language Models |
| 2024/01 | Chain-of-Table | Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding |
| 2024/01 | TAT-LLM | TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data |
| 2024/02 | Survey Paper | LLM on Tabular Data: Prediction, Generation, and Understanding |
| 2024/02 | CABINET | CABINET: Content Relevance based Noise Reduction for Table Question Answering |
| 2024/02 | OpenTab | OpenTab: Advancing Large Language Models as Open-domain Table Reasoners |
| 2024/02 | CancerGPT | CancerGPT for few shot drug pair synergy prediction using large pretrained language models |
| 2024/02 | Exploration of LLM on Tabular | Tables as Images? Exploring the Strengths and Limitations of LLMs on Multimodal Representations of Tabular Data |
| 2024/03 | TableLLM | Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study |
| 2024/03 | ITAB-LLM | Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science |
| 2024/03 | TP-BERTa | Making Pre-trained Language Models Great on Tabular Prediction |
| 2024/04 | FeatLLM | Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning |
| 2024/04 | TabSQLify | TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition |
| 2024/04 | LLMClean | LLMClean: Context-Aware Tabular Data Cleaning via LLM-Generated OFDs |
| 2024/07 | folktexts | Evaluating language models as risk scores |
| 2024/07 | SpreadsheetLLM | SpreadsheetLLM: Encoding Spreadsheets for Large Language Models |

## Workshops

## Useful Blogs

## Citation

```bibtex
@misc{wu2024awesomellmtabular,
  author = {Hong-Wei, Wu},
  title = {Awesome-LLM-Tabular},
  year = {2024},
  note = {Accessed: 2024-05-30},
  url = {https://github.com/johnnyhwu/Awesome-LLM-Tabular},
  orcid = {https://orcid.org/0009-0005-8073-5297}
}
```

## Contributing

We welcome contributions to keep this repository up to date with the latest research and applications of LLMs in the tabular domain. Whether you want to correct mistakes, add new content, or suggest improvements, your contributions are highly appreciated 🤗.