
Some papers and datasets about Data-To-Text Generation

Data-to-Text-Generation

News:

2023.05.31

Add 16 papers from COLING2022, EMNLP2022, AACL2022, EACL2023, ArXiv, and ACL2023

Content

  1. Papers
  2. Datasets
  3. Evaluation Metrics

1. Papers

2016

  1. Neural Text Generation from Structured Data with Application to the Biography Domain EMNLP2016

2017

  1. Challenges in Data-to-Document Generation EMNLP2017
  2. Order-planning neural text generation from structured data AAAI2018
  3. Table-to-text Generation by Structure-aware Seq2seq Learning AAAI2018
  4. Table-to-Text: Describing Table Region with Natural Language AAAI2018
  5. A Graph-to-Sequence Model for AMR-to-Text Generation ACL2018
  6. Graph-to-Sequence Learning using Gated Graph Neural Networks ACL2018
  7. Generating Descriptions from Structured Data Using a Bifocal Attention Mechanism and Gated Orthogonalization NAACL2018
  8. A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization NAACL2018

2018

  1. Operation-guided Neural Networks for High Fidelity Data-To-Text Generation EMNLP2018
  2. Learning Neural Templates for Text Generation EMNLP2018
  3. Learning Latent Semantic Annotations for Grounding Natural Language to Structured Data EMNLP2018
  4. Data2Text Studio: Automated Text Generation from Structured Data EMNLP2018
  5. Data-to-Text Generation with Content Selection and Planning AAAI2019
  6. Hierarchical Encoder with Auxiliary Supervision for Neural Table-to-Text Generation: Learning Better Representation for Tables AAAI2019
  7. Key Fact as Pivot: A Two-Stage Model for Low Resource Table-to-Text Generation ACL2019
  8. Learning to Select, Track, and Generate for Data-to-Text ACL2019
  9. Towards Comprehensive Description Generation from Factual Attribute-value Tables ACL2019
  10. Data-to-text Generation with Entity Modeling ACL2019
  11. Handling Divergent Reference Texts when Evaluating Table-to-Text Generation ACL2019
  12. Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation NAACL2019
  13. Text Generation from Knowledge Graphs with Graph Transformers NAACL2019
  14. Structural Neural Encoders for AMR-to-text Generation NAACL2019
  15. Deep Graph Convolutional Encoders for Structured Data to Text Generation INLG2018
  16. Densely Connected Graph Convolutional Networks for Graph-to-Sequence Learning TACL2018
  17. ...

2019

  1. Enhancing Neural Data-To-Text Generation Models with External Background Knowledge EMNLP2019
  2. Neural data-to-text generation: A comparison between pipeline and end-to-end architectures EMNLP2019
  3. Table-to-Text Generation with Effective Hierarchical Encoder on Three dimensions (Row, Column and Time) EMNLP2019
  4. Modeling Graph Structure in Transformer for Better AMR-to-Text Generation EMNLP2019
  5. Enhanced Transformer Model for Data-to-Text Generation EMNLP-WNGT2019
  6. Selecting, Planning, and Rewriting: A Modular Approach for Data-to-Document Generation and Translation EMNLP2019-short
  7. Long and Diverse Text Generation with Planning-based Hierarchical Variational Model EMNLP2019
  8. Enhancing AMR-to-Text Generation with Dual Graph Representations EMNLP2019
  9. An Encoder with non-Sequential Dependency for Neural Data-to-Text Generation INLG2019
  10. Controlling Contents in Data-to-Document Generation with Human-Designed Topic Labels INLG2019
  11. Revisiting Challenges in Data-to-Text Generation with Fact Grounding INLG2019
  12. Graph Transformer for Graph-to-Sequence Learning AAAI2020
  13. Sentence Generation for Entity Description with Content-plan Attention AAAI2020
  14. Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation AAAI2020
  15. Variational Template Machine for Data-to-Text Generation ICLR2020
  16. Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints ACL2020
  17. Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence ACL2020
  18. Bridging the Structural Gap Between Encoding and Decoding for Data-To-Text Generation ACL2020
  19. Heterogeneous Graph Transformer for Graph-to-Sequence Learning ACL2020
  20. Structural Information Preserving for Graph-to-Text Generation ACL2020
  21. Line Graph Enhanced AMR-to-Text Generation with Mix-Order Graph Attention Networks ACL2020
  22. GPT-too: A Language-Model-First Approach for AMR-to-Text Generation ACL2020
  23. Logical Natural Language Generation from Open-Domain Tables ACL2020
  24. A Generative Model for Joint Natural Language Understanding and Generation ACL2020
  25. Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data ACL2020
  26. Infobox-to-text Generation with Tree-like Planning based Attention Network IJCAI2020
  27. Better AMR-To-Text Generation with Graph Structure Reconstruction IJCAI2020
  28. RDF-to-Text Generation with Graph-augmented Structural Neural Encoders IJCAI2020
  29. A Hierarchical Model for Data-to-Text Generation ECIR2020
  30. ...

2020

  1. ToTTo: A Controlled Table-To-Text Generation Dataset EMNLP2020
  2. CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via Cycle Training NIPS2020
  3. Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs TACL2020
  4. AMR-to-text Generation with Graph Transformer TACL2020
  5. Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity COLING2020
  6. Investigating Pretrained Language Models for Graph-to-Text Generation arXiv2020
  7. Logic2Text: High-Fidelity Natural Language Generation from Logical Forms EMNLP2020
  8. KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation EMNLP2020
  9. Online Back-Parsing for AMR-to-Text Generation EMNLP2020
  10. Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation EMNLP2020
  11. Stepwise Extractive Summarization and Planning with Structured Transformers EMNLP2020
  12. Partially-Aligned Data-to-Text Generation with Distant Supervision EMNLP2020
  13. GenWiki: A Dataset of 1.3 Million Content-Sharing Text and Graphs for Unsupervised Graph-to-Text Generation COLING2020
  14. Enhancing Content Planning for Table-to-Text Generation with Data Understanding and Verification EMNLP2020 Findings
  15. Multilingual AMR-to-Text Generation EMNLP2020
  16. ENT-DESC: Entity Description Generation by Exploring Knowledge Graph EMNLP2020
  17. An Unsupervised Joint System for Text Generation from Knowledge Graphs and Semantic Parsing EMNLP2020
  18. Make Templates Smarter: A Template Based Data2Text System Powered by Text Stitch Model EMNLP2020 Findings
  19. TableGPT: Few-shot Table-to-Text Generation with Table Structure Reconstruction and Content Matching COLING2020
  20. Towards Faithfulness in Open Domain Table-to-text Generation from an Entity-centric View AAAI2021
  21. Neural Data-to-Text Generation with LM-based Text Augmentation EACL2021
  22. DART: Open-Domain Structured Data Record to Text Generation NAACL2021
  23. Modeling Graph Structure via Relative Position for Text Generation from Knowledge Graphs NAACL|TextGraphs2021
  24. WikiGraphs: A Wikipedia Text - Knowledge Graph Paired Dataset NAACL|TextGraphs2021
  25. Data-to-text Generation with Macro Planning TACL2021
  26. Few-shot Knowledge Graph-to-Text Generation with Pretrained Language Models ACL2021 Findings
  27. Stage-wise Fine-tuning for Graph-to-Text Generation ACL2021 Workshop
  28. Generating Landmark Navigation Instructions from Maps as a Graph-to-Text Problem ACL2021
  29. Sketch and Refine: Towards Faithful and Informative Table-to-Text Generation ACL2021 Findings
  30. Promoting Graph Awareness in Linearized Graph-to-Text Generation ACL2021 Findings
  31. De-Confounded Variational Encoder-Decoder for Logical Table-to-Text Generation ACL2021
  32. Towards Table-to-Text Generation with Numerical Reasoning ACL2021
  33. Improving Encoder by Auxiliary Supervision Tasks for Table-to-Text Generation ACL2021
  34. WIKITABLET: A Large-Scale Data-to-Text Dataset for Generating Wikipedia Article Sections ACL2021 Findings
  35. Text-to-Text Pre-Training for Data-to-Text Tasks INLG2020
  36. Structure-Aware Pre-Training for Table-to-Text Generation ACL2021 Findings
  37. Does the Order of Training Samples Matter? Improving Neural Data-to-Text Generation with Curriculum Learning EACL2021
  38. ...

2021

  1. Structural Adapters in Pretrained Language Models for AMR-to-text Generation EMNLP2021
  2. Learning to Reason for Text Generation from Scientific Tables ArXiv2021
  3. Plan-then-Generate: Controlled Data-to-Text Generation via Planning EMNLP2021 Findings
  4. Few-Shot Table-to-Text Generation with Prototype Memory EMNLP2021 Findings
  5. Smelting Gold and Silver for Improved Multilingual AMR-to-Text Generation EMNLP2021
  6. Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots EMNLP2021 Findings
  7. Data-to-text Generation by Splicing Together Nearest Neighbors EMNLP2021
  8. TWT: Table with Written Text for Controlled Data-to-Text Generation EMNLP2021 Findings
  9. Data-QuestEval: A Reference-less Metric for Data-to-Text Semantic Evaluation EMNLP2021
  10. EventNarrative: A large-scale Event-centric Dataset for Knowledge Graph-to-Text Generation NIPS2021
  11. Attention Is Indeed All You Need: Semantically Attention-Guided Decoding for Data-to-Text NLG INLG2021
  12. Latent Tree Decomposition Parsers for AMR-to-Text Generation ArXiv2021
  13. Tree Decomposition Attention for AMR-to-Text Generation ArXiv2021
  14. Search and Learn: Improving Semantic Coverage for Data-to-Text Generation AAAI2022
  15. Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation IJCAI2022
  16. Improving Compositional Generalization with Self-Training for Data-to-Text Generation ACL2022
  17. Hitab: A hierarchical table dataset for question answering and natural language generation ACL2022 - Code: Official
  18. Neural Pipeline for Zero-Shot Data-to-Text Generation ACL2022
  19. uFACT: Unfaithful Alien-Corpora Training for Semantically Consistent Data-to-Text Generation ACL2022 Findings Short
  20. Rewarding Semantic Similarity under Optimized Alignments for AMR-to-Text Generation ACL2022 Short
  21. Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning NAACL2022
  22. Syntax Controlled Knowledge Graph-to-Text Generation with Order and Semantic Consistency NAACL2022 Findings - Code: Official
  23. Generating Textual Explanations for Machine Learning Models Performance: A Table-to-Text Task LREC2022
  24. Table-To-Text generation and pre-training with TABT5 NAACL2022 SUKI Workshop - Code: Official
  25. UNIFIEDSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models EMNLP2022 - Code: Official
  26. FLAP: Table-to-Text Generation with Feature Indication and Numerical Reasoning Pretraining ARR-2021-12
  27. What Makes Data-to-Text Generation Hard for Pretrained Language Models GEM2022

2022

  1. GAP: A Graph-aware Language Model Framework for Knowledge Graph-to-Text Generation COLING2022
  2. Graph-to-Text Generation with Dynamic Structure Pruning COLING2022
  3. Self-supervised Graph Masking Pre-training for Graph-to-Text Generation EMNLP2022
  4. R2D2: Robust Data-to-Text with Replacement Detection EMNLP2022
  5. VISTOT: Vision-Augmented Table-to-Text Generation EMNLP2022
  6. Grounded Keys-to-Text Generation: Towards Factual Open-Ended Generation EMNLP2022 Findings
  7. ASDOT: Any-Shot Data-to-Text Generation with Pretrained Language Models EMNLP2022 Findings
  8. TaKG: A New Dataset for Paragraph-level Table-to-Text Generation Enhanced with Knowledge Graphs AACL2022 Findings
  9. Block Diagram-to-Text: Understanding Block Diagram Images by Generating Natural Language Descriptors AACL2022 Findings
  10. LOFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control EACL2023
  11. Investigating the Effect of Relative Positional Embeddings on AMR-to-Text Generation with Structural Adapters EACL2023
  12. Incorporating Question Answering-Based Signals into Abstractive Summarization via Salient Span Selection EACL2023
  13. Plan-then-Seam: Towards Efficient Table-to-Text Generation EACL2023 Findings
  14. MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation ArXiv2022
  15. TabGenie: A Toolkit for Table-to-Text Generation ACL 2023 System Demonstration Track
  16. MVP: Multi-task Supervised Pre-training for Natural Language Generation ACL2023

2. Datasets

The citation information was updated on Jan 4, 2021

2.1 Table2Text

| No | Dataset | Domain | Source | Train/Dev/Test | Cited | Code |
|---|---|---|---|---|---|---|
| 1 | WIKIBIO | Wikipedia | Neural text generation from structured data with application to the biography domain EMNLP2016 | 582,695/72,831/72,831 | 212 | Official |
| 2 | E2E | Restaurants | The E2E dataset: New challenges for end-to-end generation SIGDIAL2017 | 42,061/4,672/4,693 | 135 | Official |
| 3 | ROTOWIRE | Basketball | Challenges in Data-to-Document Generation EMNLP2017 | 3,371/727/728 | 227 | Official |
| 4 | WEATHERGOV | Weather | Learning Semantic Correspondences with Less Supervision ACL2009 | 25,000/1,000/3,528 | 255 | Official |
| 6 | ESPN | Basketball | Operation-guided Neural Networks for High Fidelity Data-To-Text Generation EMNLP2018 | 12,043/1,505/1,506 | 25 | Official |
| 7 | ROTOWIRE-MODIFIED | Basketball | Learning to Select, Track, and Generate for Data-to-Text ACL2019 | 2,705/532/497 | 14 | Official |
| 8 | MLB | Baseball | Data-to-text Generation with Entity Modeling ACL2019 | 22,821/1,739/1,744 | 27 | Official |
| 9 | WikiBio (21 domains) | Wikipedia | Enhancing Neural Data-To-Text Generation Models with External Background Knowledge EMNLP2019 | - | 9 | Official |
| 10 | Rotowire-FG | Basketball | Revisiting Challenges in Data-to-Text Generation with Fact Grounding INLG2019 | 5,232/1,125/1,119 | 5 | Official |
| 11 | Wikiperson | Wikipedia | Describing a Knowledge Base INLG2018 | 250,186/30,487/29,982 | 21 | Official |
| 12 | LOGICNLG | Wikipedia | Logical Natural Language Generation from Open-Domain Tables ACL2020 | 28,450/4,260/4,305 | 12 | Official |
| 13 | ToTTo | Wikipedia | ToTTo: A Controlled Table-To-Text Generation Dataset EMNLP2020 | 120,761/7,700/7,700 | 13 | Official |
| 14 | SciGen | ArXiv | Learning to Reason for Text Generation from Scientific Tables ArXiv2021 | - | - | Official |
| 15 | numericNLG | Scientific | Towards Table-to-Text Generation with Numerical Reasoning ACL2021 | - | - | Official |
| 16 | WIKITABLET | Wikipedia | WIKITABLET: A Large-Scale Data-to-Text Dataset for Generating Wikipedia Article Sections ACL2021 Findings | -/4,533/4,351 | - | Official |
| 17 | TWT | Wikipedia | TWT: Table with Written Text for Controlled Data-to-Text Generation EMNLP2021 Findings | 113,063/7,690/7,515 and 39,678/5,009/4,730 | - | Official |
| 18 | Hitab | Wikipedia | Hitab: A hierarchical table dataset for question answering and natural language generation ACL2022 | 10,686 | - | Official |

2.2 Graph2Text

| No | Dataset | Type | Source | Train/Dev/Test | Cited | Code |
|---|---|---|---|---|---|---|
| 1 | WebNLG | RDF | - | - | - | Official |
| 2 | LDC2015E86 | AMR | - | 16,833/1,368/1,371 | - | Official |
| 3 | LDC2017T10 | AMR | - | 36,521/1,368/1,371 | - | Official |
| 4 | LDC2020T02 | AMR | - | 55,635/1,722/1,898 | - | Official |
| 5 | AGENDA | Knowledge Graphs | Text Generation from Knowledge Graphs with Graph Transformers NAACL2019 | 38,720/1,000/1,000 | 74 | Official |
| 6 | LOGIC2TEXT | Wikipedia | Logic2Text: High-Fidelity Natural Language Generation from Logical Forms EMNLP2020 | 8,566/1,095/1,092 | 1 | Official |
| 7 | WITA | Wikipedia | Partially-Aligned Data-to-Text Generation with Distant Supervision EMNLP2020 | 50,000/5,000/400 | 2 | Official |
| 8 | GenWiki | - | GenWiki: A Dataset of 1.3 Million Content-Sharing Text and Graphs for Unsupervised Graph-to-Text Generation COLING2020 | - | - | Official |
| 9 | ENT-DESC | - | ENT-DESC: Entity Description Generation by Exploring Knowledge Graph EMNLP2020 | - | - | Official |
| 10 | WikiGraphs | Wikipedia | WikiGraphs: A Wikipedia Text - Knowledge Graph Paired Dataset NAACL\|TextGraphs2021 | 23,431/48/43 | - | Official |
| 11 | Map2Seq | OpenStreetMap | Generating Landmark Navigation Instructions from Maps as a Graph-to-Text Problem ACL2021 | - | - | Official |
| 12 | DART | Wikipedia+Restaurant | DART: Open-Domain Structured Data Record to Text Generation NAACL2021 | 62,659/6,980/12,552 | 11 | Official |
| 13 | EventNarrative | EventKG+Wikidata | EventNarrative: A large-scale Event-centric Dataset for Knowledge Graph-to-Text Generation NIPS2021 | 179,542/22,443/22,443 | - | Official |

3. Evaluation Metrics

| No | Metric | Source | Cited | Code |
|---|---|---|---|---|
| 1 | BLEU | Bleu: a Method for Automatic Evaluation of Machine Translation ACL2002 | 14039 | - |
| 2 | CS, RG, CO | Challenges in Data-to-Document Generation EMNLP2017 | 227 | Official |
| 3 | PARENT | Handling Divergent Reference Texts when Evaluating Table-to-Text Generation ACL2019 | 18 | Official |
| 4 | Data-QuestEval | Data-QuestEval: A Reference-less Metric for Data-to-Text Semantic Evaluation EMNLP2021 | - | Official |
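
For readers new to these metrics, here is a minimal sketch of the idea behind BLEU: clipped (modified) n-gram precision combined with a brevity penalty. It is a simplified sentence-level illustration with no smoothing, not the reference implementation, and the function names (`ngrams`, `modified_precision`, `sentence_bleu`) are hypothetical helpers chosen for this example. Scores reported in the papers above usually come from standard evaluation scripts, so treat this only as an aid to understanding.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it appears in any single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU sketch (assumption: uniform weights)."""
    if not candidate:
        return 0.0
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    c = len(candidate)
    # Reference length closest to the candidate length (ties: shorter one).
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


if __name__ == "__main__":
    refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
    print(modified_precision("the the the the the the the".split(), refs, 1))
    print(sentence_bleu("the cat is on the mat".split(), refs))
```

The clipping step is what stops a degenerate output that repeats one frequent word from scoring well, which matters for data-to-text systems that tend to repeat field values.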

Updating ......