
Resources of deep learning for mathematical reasoning (DL4MATH).

MIT License

Deep Learning for Mathematical Reasoning (DL4MATH)


This repository is a reading list on Deep Learning for Mathematical Reasoning (DL4MATH).

Contributors: Pan Lu @UCLA, Liang Qiu @UCLA, Wenhao Yu @Notre Dame, Sean Welleck @UW, Kai-Wei Chang @UCLA

For more details, please refer to the paper: A Survey of Deep Learning for Mathematical Reasoning.

🔔 If you have any suggestions or notice something we missed, please don't hesitate to let us know. You can directly email Pan Lu (lupantech@gmail.com), comment on Twitter, or post an issue on this repo.

🧰 Resources

Related Surveys

  • A Survey of Question Answering for Math and Science Problem, arXiv:1705.04530 [paper]
  • The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers, TPAMI 2019 [paper]
  • Representing Numbers in NLP: a Survey and a Vision, NAACL 2021 [paper]
  • Survey on Mathematical Word Problem Solving Using Natural Language Processing, ICIICT 2021 [paper]
  • A Survey in Mathematical Language Processing, arXiv:2205.15231 [paper]
  • Partial Differential Equations Meet Deep Neural Networks: A Survey, arXiv:2211.05567 [paper]
  • 🔥 Reasoning with Language Model Prompting: A Survey, arXiv:2212.09597 [paper]
  • 🔥 Towards Reasoning in Large Language Models: A Survey, arXiv:2212.10403 [paper]
  • 🔥 A Survey for In-context Learning, arXiv:2301.00234 [paper]

Related Blogs

  • 🔥 How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources, Dec 2022, Yao Fu's Notion [link]

Workshops

  • 🔥 The 1st MATH-AI Workshop: the Role of Mathematical Reasoning in General Artificial Intelligence, ICLR 2021 [website]
  • 🔥 Math AI for Education: Bridging the Gap Between Research and Smart Education (MATHAI4ED), NeurIPS 2021 [website]
  • 🔥 The 1st Workshop on Mathematical Natural Language Processing, EMNLP 2022 [website]
  • 🔥 The 2nd MATH-AI Workshop: Toward Human-Level Mathematical Reasoning, NeurIPS 2022 [website]
  • 🔥 FLAIM: Formal Languages, AI and Mathematics, IHP & META 2022 [YouTube]
  • 🔥 AI to Assist Mathematical Reasoning: A Workshop, NASEM 2023 [YouTube]

Talks

  • Can GPT-3 do math? | Grant Sanderson and Lex Fridman, 2020 [YouTube]
  • Computer Scientist Explains One Concept in 5 Levels of Difficulty, 2022 [YouTube]

🎨 Mathematical Reasoning Benchmarks

Math Word Problems (MWP)

  • [AI2/Verb395] Learning to Solve Arithmetic Word Problems with Verb Categorization, EMNLP 2014 [paper]
  • [Alg514] Learning to automatically solve algebra word problems, ACL 2014 [paper]
  • [IL] Reasoning about Quantities in Natural Language, TACL 2015 [paper]
  • [SingleEQ] Parsing Algebraic Word Problems into Equations, TACL 2015 [paper]
  • [DRAW] Draw: A challenging and diverse algebra word problem set, 2015 [paper]
  • [Dolphin1878] Automatically solving number word problems by semantic parsing and reasoning, EMNLP 2015 [paper]
  • [Dolphin18K] How well do computers solve math word problems? large-scale dataset construction and evaluation, ACL 2016 [paper]
  • [MAWPS] MAWPS: A math word problem repository, NAACL-HLT 2016 [paper]
  • [AllArith] Unit dependency graph and its application to arithmetic word problem solving, AAAI 2017 [paper]
  • [DRAW-1K] Annotating Derivations: A New Evaluation Strategy and Dataset for Algebra Word Problems, ACL 2017 [paper]
  • 🔥 [Math23K] Deep neural solver for math word problems, EMNLP 2017 [paper]
  • [AQuA] Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems, ACL 2017 [paper]
  • [Aggregate] Mapping to Declarative Knowledge for Word Problem Solving, TACL 2018 [paper]
  • 🔥 [MathQA] MathQA: Towards interpretable math word problem solving with operation-based formalisms, NAACL-HLT 2019 [paper]
  • [ASDiv] A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers, ACL 2020 [paper]
  • [HMWP] Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems, EMNLP 2020 [paper]
  • [Ape210K] Ape210k: A large-scale and template-rich dataset of math word problems, arXiv:2009.11506 [paper]
  • 🔥 [SVAMP] Are NLP Models really able to Solve Simple Math Word Problems?, NAACL-HLT 2021 [paper]
  • 🔥 [GSM8K] Training verifiers to solve math word problems, arXiv:2110.14168 [paper]
  • 🔥 [IconQA] IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning, NeurIPS 2021 [paper]
  • 🔥 [MathQA-Python] Program synthesis with large language models, arXiv:2108.07732 [paper]
  • [ArMATH] ArMATH: a Dataset for Solving Arabic Math Word Problems, LREC 2022 [paper]
  • 🔥 [TabMWP] Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning, arXiv:2209.14610, 2022 [paper]

Theorem Proving (TP)

  • [MML] Four Decades of Mizar, Journal of Automated Reasoning 2015 [paper]
  • [HolStep] HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving, ICLR 2017 [paper]
  • [GamePad] GamePad: A Learning Environment for Theorem Proving, ICLR 2019 [paper]
  • 🔥 [CoqGym] Learning to Prove Theorems via Interacting with Proof Assistants, ICML 2019 [paper]
  • [HOList] HOList: An environment for machine learning of higher order logic theorem proving, ICML 2019 [paper]
  • [IsarStep] IsarStep: a Benchmark for High-level Mathematical Reasoning, ICLR 2021 [paper]
  • [LISA] LISA: Language models of ISAbelle proofs, AITP 2021 [paper]
  • [INT] INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving, ICLR 2021 [paper]
  • 🔥 [NaturalProofs] NaturalProofs: Mathematical Theorem Proving in Natural Language, NeurIPS 2021 [paper]
  • [NaturalProofs-Gen] NaturalProver: Grounded Mathematical Proof Generation with Language Models, NeurIPS 2022 [paper]
  • 🔥 [MiniF2F] MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics, ICLR 2022 [paper]
  • 🔥 [LeanStep] Proof Artifact Co-training for Theorem Proving with Language Models, ICLR 2022 [paper]
  • 🔥 [miniF2F+informal] Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs, arXiv:2210.12283 [paper]

Geometry Problem Solving (GPS)

  • 🔥 [GEOS] Solving geometry problems: Combining text and diagram interpretation, EMNLP 2015 [paper]
  • [GeoShader] Synthesis of solutions for shaded area geometry problems, The Thirtieth International Flairs Conference, 2017 [paper]
  • [GEOS++] From textbooks to knowledge: A case study in harvesting axiomatic knowledge from textbooks to solve geometry problems, EMNLP 2017 [paper]
  • [GEOS-OS] Learning to solve geometry problems from natural language demonstrations in textbooks, Proceedings of the 6th Joint Conference on Lexical and Computational Semantics, 2017 [paper]
  • 🔥 [Geometry3K] Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper]
  • [GeoQA] GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning, Findings of ACL 2021 [paper]
  • [GeoQA+] An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding, COLING 2022 [paper]
  • 🔥 [UniGeo] UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper]

Math Question Answering (MathQA)

  • [QUAREL] QUAREL: A Dataset and Models for Answering Questions about Qualitative Relationships, AAAI 2019 [paper]
  • [McTaco] “Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding, EMNLP 2019 [paper]
  • 🔥 [DROP] DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs, NAACL 2019 [paper]
  • 🔥 [Mathematics] Analysing Mathematical Reasoning Abilities of Neural Models, ICLR 2019 [paper]
  • [FinQA] FinQA: A Dataset of Numerical Reasoning over Financial Data, EMNLP 2021 [paper]
  • [Fermi] How Much Coffee Was Consumed During EMNLP 2019? Fermi Problems: A New Reasoning Challenge for AI, EMNLP 2020 [paper]
  • 🔥 [MATH, AMPS] Measuring Mathematical Problem Solving With the MATH Dataset, NeurIPS 2021 [paper]
  • [TAT-QA] TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance, ACL-IJCNLP 2021 [paper]
  • [MultiHiertt] MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data, ACL 2022 [paper]
  • [NumGLUE] NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks, ACL 2022 [paper]
  • 🔥 [Lila] Lila: A Unified Benchmark for Mathematical Reasoning, EMNLP 2022 [paper]

Other Quantitative Problems

  • [FigureQA] Figureqa: An annotated figure dataset for visual reasoning, arXiv:1710.07300 [paper]
  • 🔥 [DVQA] Dvqa: Understanding data visualizations via question answering, CVPR 2018 [paper]
  • [DREAM] DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension, TACL 2019 [paper]
  • [EQUATE] EQUATE: A Benchmark Evaluation Framework for Quantitative Reasoning in Natural Language Inference, CoNLL 2019 [paper]
  • 🔥 [NumerSense] Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models, EMNLP 2020 [paper]
  • [MNS] Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning, AAAI 2020 [paper]
  • [P3] Programming Puzzles, NeurIPS 2021 [paper]
  • [NOAHQA] NOAHQA: Numerical Reasoning with Interpretable Graph Question Answering Dataset, Findings of EMNLP 2021 [paper]
  • [ConvFinQA] ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering, arXiv:2210.03849 [paper]
  • [PGDP5K] PGDP5K: A Diagram Parsing Dataset for Plane Geometry Problems, arXiv:2205.0994 [paper]
  • [GeoRE] GeoRE: A Relation Extraction Dataset for Chinese Geometry Problems, NeurIPS 2021 MATHAI4ED Workshop [paper]
  • 🔥 [ScienceQA] Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering, NeurIPS 2022 [paper]

🧩 Neural Networks for Mathematical Reasoning

General Neural Networks

  • [LSTM] Long short-term memory, Neural computation 1997 [paper]
  • [Seq2Seq] Sequence to sequence learning with neural networks, NeurIPS 2014 [paper]
  • [GRU] Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation, EMNLP 2014 [paper]
  • [Attention] Neural machine translation by jointly learning to align and translate, arXiv:1409.0473 [paper]
  • [Attention] Show, attend and tell: Neural image caption generation with visual attention, ICML 2015 [paper]
  • [Faster-RCNN] Faster r-cnn: Towards real-time object detection with region proposal networks, NeurIPS 2015 [paper]
  • [TreeLSTM] Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, ACL 2015 [paper]
  • [BiLSTM] Google's neural machine translation system: Bridging the gap between human and machine translation, arXiv:1609.08144 [paper]
  • [ResNet] Deep residual learning for image recognition, CVPR 2016 [paper]
  • [ConvS2S] Convolutional sequence to sequence learning, ICML 2017 [paper]
  • [Top-Down Attention] Bottom-up and top-down attention for image captioning and visual question answering, CVPR 2018 [paper]
  • [FiLM] Film: Visual reasoning with a general conditioning layer, AAAI 2018 [paper]
  • [BAN] Bilinear Attention Networks, NeurIPS 2018 [paper]
  • [DFAF] Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering, CVPR 2019 [paper]

Seq2Seq Networks for Math

  • 🔥 [DNS] Deep Neural Solver for Math Word Problems, EMNLP 2017 [paper]
  • 🔥 [AnsRat] Program induction by rationale generation: Learning to solve and explain algebraic word problems, ACL 2017 [paper]
  • [Math-EN] Translating a Math Word Problem to a Expression Tree, EMNLP 2018 [paper]
  • [CASS] Neural math word problem solver with reinforcement learning, COLING 2018 [paper]
  • [SelfAtt] Data-driven methods for solving algebra word problems, arXiv:1804.10718 [paper]
  • [S-Aligned] Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems, NAACL 2019 [paper]
  • [T-RNN] Template-based math word problem solvers with recursive neural networks, AAAI 2019 [paper]
  • [GROUP-ATT] Modeling intra-relation in math word problems with different functional multi-head attentions, ACL 2019 [paper]
  • [QuaSP+] QUAREL: A Dataset and Models for Answering Questions about Qualitative Relationships, AAAI 2019 [paper]
  • [SMART] SMART: A Situation Model for Algebra Story Problems via Attributed Grammar, AAAI 2021 [paper]

Graph-based Networks for Math

  • [AST-Dec] Tree-structured decoding for solving math word problems, EMNLP 2019 [paper]
  • 🔥 [GTS] A Goal-Driven Tree-Structured Neural Model for Math Word Problems, IJCAI 2019 [paper]
  • [CoqGym] Learning to Prove Theorems via Interacting with Proof Assistants, ICML 2019 [paper]
  • [KA-S2T] A knowledge-aware sequence-to-tree network for math word problem solving, EMNLP 2020 [paper]
  • [TSN-MD, NT-LSTM] Solving arithmetic word problems by scoring equations with recursive neural networks, Expert Systems with Applications 2021 [paper]
  • [NS-Solver] Neural-Symbolic Solver for Math Word Problems with Auxiliary Tasks, ACL 2021 [paper]
  • [NumS2T] Math word problem solving with explicit numerical values, ACL 2021 [paper]
  • [HMS] Hms: A hierarchical solver with dependency-enhanced understanding for math word problem, AAAI 2021 [paper]
  • [LBF] Learning by fixing: Solving math word problems with weak supervision, AAAI 2021 [paper]
  • [Seq2DAG] A bottom-up dag structure extraction model for math word problems, AAAI 2021 [paper]
  • [Graph2Tree] Graph-to-Tree Neural Networks for Learning Structured Input-Output Translation with Applications to Semantic Parsing and Math Word Problem, EMNLP 2020 [paper]
  • [Multi-E/D] Solving math word problems with multi-encoders and multi-decoders, COLING 2020 [paper]
  • 🔥 [Graph2Tree] Graph-to-Tree Learning for Solving Math Word Problems, ACL 2020 [paper]
  • [EEH-G2T] An edge-enhanced hierarchical graph-to-tree network for math word problem solving, EMNLP 2021 [paper]

Other Neural Networks for Math

  • [DeepMath] Deepmath-deep sequence models for premise selection, NeurIPS 2016 [paper]
  • [Holophrasm] Holophrasm: a neural automated theorem prover for higher-order logic, arXiv:1608.02644 [paper]
  • 🔥 [CNNTP, WaveNetTP] Deep network guided proof search, arXiv:1701.06972 [paper]
  • 🔥 [MathDQN] Mathdqn: Solving arithmetic word problems via deep reinforcement learning, AAAI 2018 [paper]
  • [DDT] Solving math word problems with double-decoder transformer, arXiv:1908.10924 [paper]
  • [DeepHOL] HOList: An environment for machine learning of higher order logic theorem proving, ICML 2019 [paper]
  • [NGS] GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning, Findings of ACL 2021 [paper]
  • [PGDPNet] Learning to Understand Plane Geometry Diagram, NeurIPS 2022 MATH-AI Workshop [paper]

📜 Pre-trained Language Models for Mathematical Reasoning

General Pre-trained Language Models (<100B)

  • [Transformer] Attention is all you need, NeurIPS 2017 [paper]
  • [BERT] Bert: Pre-training of deep bidirectional transformers for language understanding, arXiv:1810.04805 [paper]
  • [T5] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, JMLR 2020 [paper]
  • [RoBERTa] Roberta: A robustly optimized bert pretraining approach, arXiv:1907.11692 [paper]
  • [GPT-2, 1.5B] Language models are unsupervised multitask learners, OpenAI Blog, 2019 [paper]
  • [BART] BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, ACL 2020 [paper]
  • [ALBERT] Albert: A lite bert for self-supervised learning of language representations, arXiv:1909.11942 [paper]
  • [GPT-Neo] The pile: An 800gb dataset of diverse text for language modeling, arXiv:2101.00027 [paper]
  • [VL-T5] Unifying Vision-and-Language Tasks via Text Generation, ICML 2021 [paper]

Self-Supervised Learning for Math

  • 🔥 [GenBERT] Injecting numerical reasoning skills into language models, ACL 2020 [paper]
  • 🔥 [GPT-f] Generative language modeling for automated theorem proving, arXiv:2009.03393 [paper]
  • [LISA] LISA: Language models of ISAbelle proofs, AITP 2021 [paper]
  • [MATH-PLM] Measuring Mathematical Problem Solving With the MATH Dataset, NeurIPS 2021 [paper]
  • [LIME] Lime: Learning inductive bias for primitives of mathematical reasoning, ICML 2021 [paper]
  • [NF-NSM] Injecting Numerical Reasoning Skills into Knowledge Base Question Answering Models, arXiv:2112.06109 [paper]
  • [MWP-BERT] MWP-BERT: Numeracy-augmented pre-training for math word problem solving, Findings of NAACL 2022 [paper]
  • [HTPS] HyperTree Proof Search for Neural Theorem Proving, arXiv:2205.11491 [paper]
  • [Thor] Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers, arXiv:2205.10893 [paper]
  • [Set] Insights into pre-training via simpler synthetic tasks, arXiv:2206.10139 [paper]
  • [PACT] Proof artifact co-training for theorem proving with language models, ICLR 2022 [paper]
  • 🔥 [TaPEX] TAPEX: Table Pre-training via Learning a Neural SQL Executor, ICLR 2022 [paper]
  • 🔥 [Minerva] Solving quantitative reasoning problems with language models, NeurIPS 2022 [paper]

Task-specific Fine-tuning for Math

  • [EPT] Point to the expression: Solving algebraic word problems using the expression-pointer transformer model, EMNLP 2020 [paper]
  • [Generate & Rank] Generate & Rank: A Multi-task Framework for Math Word Problems, EMNLP 2021 [paper]
  • [RPKHS] Improving Math Word Problems with Pre-trained Knowledge and Hierarchical Reasoning, EMNLP 2021 [paper]
  • [PatchTRM] IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning, NeurIPS 2021 [paper]
  • 🔥 [GSM8K-PLM] Training verifiers to solve math word problems, arXiv:2110.14168 [paper]
  • 🔥 [Inter-GPS] Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper]
  • [Aristo] From ‘F’ to ‘A’ on the NY regents science exams: An overview of the aristo project, AI Magazine 2020 [paper]
  • [FinQANet] FinQA: A Dataset of Numerical Reasoning over Financial Data, EMNLP 2021 [paper]
  • [TAGOP] TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance, ACL-IJCNLP 2021 [paper]
  • [LAMT] Linear algebra with transformers, arXiv:2112.01898 [paper]
  • 🔥 [Scratchpad] Show your work: Scratchpads for intermediate computation with language models, arXiv:2112.00114 [paper]
  • [Self-Sampling] Learning from Self-Sampled Correct and Partially-Correct Programs, arXiv:2205.14318 [paper]
  • [DeductReasoner] Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction, ACL 2022 [paper]
  • [DPE-NGS] An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding, COLING 2022 [paper]
  • [BERT-TD+CL] Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems, Findings of ACL 2022 [paper]
  • [MT2Net] MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data, ACL 2022 [paper]
  • [miniF2F-PLM] MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics, ICLR 2022 [paper]
  • 🔥 [NaturalProver] NaturalProver: Grounded Mathematical Proof Generation with Language Models, NeurIPS 2022 [paper]
  • 🔥 [UniGeo] UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper]
  • 🔥 [Bhaskara] Lila: A Unified Benchmark for Mathematical Reasoning, EMNLP 2022 [paper]

🌠 In-context Learning for Mathematical Reasoning

General Large Language Models (100B+)

  • 🔥 [GPT-3, 175B] Language models are few-shot learners, NeurIPS 2020 [paper]
  • 🔥 [Codex, 175B] Evaluating large language models trained on code, arXiv:2107.03374 [paper]
  • 🔥 [PaLM, 540B] PaLM: Scaling Language Modeling with Pathways, arXiv:2204.02311 [paper]
  • 🔥 [ChatGPT, 175B] ChatGPT: Optimizing Language Models for Dialogue, November 30, 2022 [website]
  • ❓ [GPT-4]

In-context Example Selection

  • 🔥 [Few-shot-CoT] Chain of thought prompting elicits reasoning in large language models, NeurIPS 2022 [paper]
  • [Retrieval] Learning to retrieve prompts for in-context learning, NAACL-HLT 2022 [paper]
  • 🔥 [PromptPG-CoT] Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning, arXiv:2209.14610 [paper]
  • [Retrieval-CoT] Automatic Chain of Thought Prompting in Large Language Models, arXiv:2210.03493 [paper]
  • [Generate] Generate rather than retrieve: Large language models are strong context generators, arXiv:2209.10063 [paper]
  • [Complexity-CoT] Complexity-Based Prompting for Multi-Step Reasoning, arXiv:2210.00720 [paper]
  • [Auto-CoT] Automatic Chain of Thought Prompting in Large Language Models, arXiv:2210.03493 [paper]

High-quality Reasoning Chains

  • 🔥 [Self-Consistency-CoT] Self-consistency improves chain of thought reasoning in language models, arXiv:2203.11171 [paper]
  • 🔥 [Least-to-most CoT] Least-to-Most Prompting Enables Complex Reasoning in Large Language Models, arXiv:2205.10625 [paper]
  • On the Advance of Making Language Models Better Reasoners, arXiv:2206.02336 [paper]
  • Decomposed prompting: A modular approach for solving complex tasks, arXiv:2210.02406 [paper]
  • PAL: Program-aided Language Models, arXiv:2211.10435 [paper]
  • 🔥 [Few-shot-PoT] Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks, arXiv:2211.12588 [paper]

♣️ Other Related Work for Mathematical Reasoning

Early Work

  • Empirical explorations of the geometry theorem machine, Western Joint IRE-AIEE-ACM Computer Conference 1960 [paper]
  • Basic principles of mechanical theorem proving in elementary geometries, Journal of Automated Reasoning 1986 [paper]
  • Automated generation of readable proofs with geometric invariants, Journal of Automated Reasoning 1996 [paper]

Datasets

  • 🔥 [TextbookQA] Are You Smarter Than A Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension, CVPR 2017 [paper]
  • 🔥 [Raven] Raven: A dataset for relational and analogical visual reasoning, CVPR 2019 [paper]
  • [APPS] Measuring Coding Challenge Competence With APPS, NeurIPS 2021 [paper]
  • [PhysNLU] PhysNLU: A Language Resource for Evaluating Natural Language Understanding and Explanation Coherence in Physics, 2022 [paper]

Methods

  • My computer is an honor student—but how intelligent is it? Standardized tests as a measure of AI, AI Magazine 2016 [paper]
  • Learning pipelines with limited data and domain knowledge: A study in parsing physics problems, NeurIPS 2018 [paper]
  • Automatically proving plane geometry theorems stated by text and diagram, International Journal of Pattern Recognition and Artificial Intelligence 2019 [paper]
  • Classification and Clustering of arXiv Documents, Sections, and Abstracts, Comparing Encodings of Natural and Mathematical Language, JCDL 2020 [paper]

Latest Work (To be classified)

  • 🔥 Advancing mathematics by guiding human intuition with AI, Nature 2021 [paper]
  • [MWPToolkit] Mwptoolkit: an open-source framework for deep learning-based math word problem solvers, AAAI 2022 [paper]
  • A deep reinforcement learning agent for geometry online tutoring, Knowledge and Information Systems 2022 [paper]
  • ELASTIC: Numerical Reasoning with Adaptive Symbolic Compiler, NeurIPS 2022 [paper]
  • Solving math word problems with process and outcome-based feedback, arXiv:2211.14275 [paper]
  • APOLLO: An Optimized Training Approach for Long-form Numerical Reasoning, arXiv:2212.07249 [paper]
  • Enhancing Financial Table and Text Question Answering with Tabular Graph and Numerical Reasoning, AACL 2022 [paper]
  • DyRRen: A Dynamic Retriever-Reranker-Generator Model for Numerical Reasoning over Tabular and Textual Data, AAAI 2023 [paper]
  • Generalizing Math Word Problem Solvers via Solution Diversification, arXiv:2212.00833 [paper]
  • Textual Enhanced Contrastive Learning for Solving Math Word Problems, arXiv:2211.16022 [paper]
  • Analogical Math Word Problems Solving with Enhanced Problem-Solution Association, EMNLP 2022 [paper]

Citation

If you find this repo useful, please kindly cite our survey:

@article{lu2022dl4math,
  title={A Survey of Deep Learning for Mathematical Reasoning},
  author={Lu, Pan and Qiu, Liang and Yu, Wenhao and Welleck, Sean and Chang, Kai-Wei},
  journal={arXiv preprint arXiv:2212.10535},
  year={2022}
}