A curated list of papers on reasoning over tables.
- Table Representation Learning
- Logical Text Generation over Tabular Data
- Reasoning over Tabular Data
- Reasoning over Hybrid Data
- Other directions
- Tutorials
- A Graph Representation of Semi-structured Data for Web Question Answering (COLING 2020) [Paper]
- Retrieving Complex Tables with Multi-Granular Graph Representation Learning (SIGIR 2021) [Paper][Github]
- TaBERT: Learning Contextual Representations for Natural Language Utterances and Structured Tables (ACL 2020) [Paper][Github]
- TAPAS: Weakly Supervised Table Parsing via Pre-training (ACL 2020) [Paper][Github]
- TABBIE: Pretrained Representations of Tabular Data (NAACL 2021) [Paper][Github]
- Capturing Row and Column Semantics in Transformer Based Question Answering over Tables (NAACL 2021) [Paper][Github]
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing (ICLR 2021) [Paper][Huggingface]
- TUTA: Tree-based Transformers for Generally Structured Table Pre-training (KDD 2021) [Paper]
- MATE: Multi-view Attention for Table Transformer Efficiency (EMNLP 2021) [Paper][Github]
- Understanding tables with intermediate pre-training (EMNLP-findings 2020) [Paper][Github]
- UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models (EMNLP 2022) [Paper][Github]
- TAPEX: Table Pre-training via Learning a Neural SQL Executor (ICLR 2022) [Paper][Github]
- TableFormer: Robust Transformer Modeling for Table-Text Encoding (ACL 2022) [Paper][Github]
- OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering (NAACL 2022) [Paper][Github]
- REASTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples (EMNLP 2022) [Paper][Github]
- Table-To-Text generation and pre-training with TabT5 (EMNLP-findings 2022) [Paper]
- STAR: SQL Guided Pre-Training for Context-dependent Text-to-SQL Parsing (EMNLP 2022) [Paper]
- Large Language Models are few(1)-shot Table Reasoners (Pre-print) [Paper]
- Logical Natural Language Generation from Open-Domain Tables (ACL 2020) [Paper][Github]
- ToTTo: A Controlled Table-To-Text Generation Dataset (EMNLP 2020) [Paper][Github]
- SciGen: a Dataset for Reasoning-Aware Text Generation from Scientific Tables (NeurIPS 2021) [Paper][Github]
- HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation (Pre-print 2021) [Paper]
- TWT: Table with Written Text for Controlled Data-to-Text Generation (EMNLP-findings 2021) [Paper][Github]
- Towards Table-to-Text Generation with Numerical Reasoning (ACL 2021) [Paper]
- De-Confounded Variational Encoder-Decoder for Logical Table-to-Text Generation (ACL 2021) [Paper]
- Few-Shot Table-to-Text Generation with Prototype Memory (EMNLP-findings 2021) [Paper]
- Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots (EMNLP-findings 2021) [Paper][Github]
- Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning (NAACL 2022) [Paper][Github]
- R2D2: Robust Data-to-Text with Replacement Detection (EMNLP 2022) [Paper][Github]
- PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation (EMNLP 2022) [Paper][Github]
- Diversity Enhanced Table-to-Text Generation via Type Control (Pre-print) [Paper]
- FeTaQA: Free-form Table Question Answering (TACL 2021) [Paper][Github]
- HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation (Pre-print 2021) [Paper]
- Joint Verification and Reranking for Open Fact Checking Over Tables (ACL 2021) [Paper][Github]
- Logic-level Evidence Retrieval and Graph-based Verification Network for Table-based Fact Verification (EMNLP 2021) [Paper][Blank Github]
- Exploring Decomposition for Table-based Fact Verification (EMNLP-findings 2021) [Paper]
- Table-based Fact Verification With Salience-aware Learning (EMNLP-findings 2021) [Paper][Github]
- Learning to Generate Programs for Table Fact Verification via Structure-Aware Semantic Parsing (ACL 2022) [Paper][Github]
- HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data (EMNLP-findings 2020) [Paper][Github]
- Open Question Answering over Tables and Text (ICLR 2021) [Paper][Github]
- MultiModalQA: complex question answering over text, tables and images (ICLR 2021) [Paper]
- TSQA: Tabular Scenario Based Question Answering (AAAI 2021) [Paper][Github]
- TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance (ACL 2021) [Paper][Github]
- FinQA: A Dataset of Numerical Reasoning over Financial Data (EMNLP 2021) [Paper][Github]
- FEVEROUS: Fact Extraction and VERification Over Unstructured and Structured information (NeurIPS-benchmark 2021) [Paper][Github]
- MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data (ACL 2022) [Paper][Github]
- Learning to Imagine: Integrating Counterfactual Thinking in Neural Discrete Reasoning (ACL 2022) [Paper]
- HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data (ACL-findings 2022) [Paper][Github]
- FinMath: Injecting a Tree-structured Solver for Question Answering over Financial Reports (LREC 2022) [Paper]
- Towards Complex Document Understanding By Discrete Reasoning (MM 2022) [Paper][Github]
- Answering Numerical Reasoning Questions in Table-Text Hybrid Contents with Graph-based Encoder and Tree-based Decoder (COLING 2022) [Paper][Github]
- TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data (EMNLP 2022) [Paper][Github]
- MuGER2: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering (EMNLP-findings 2022) [Paper][Github]
- Towards Robustness of Text-to-SQL Models against Synonym Substitution (ACL 2021) [Paper][Github]
- Topic Transferable Table Question Answering (EMNLP 2021) [Paper][Github]
- Bridging the Generalization Gap in Text-to-SQL Parsing with Schema Expansion (ACL 2022) [Paper][Github]
- Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness (ICLR 2023 submission) [Paper]
- KDD 2021 Tutorial: From Tables to Knowledge: Recent Advances in Table Understanding [Website]
- EMNLP 2021 Tutorial: Knowledge-Enriched Natural Language Generation [Website]
Please feel free to open a pull request or email Yilun Zhao (yilun.zhao@yale.edu) with any updates.