/DL4TP

A Survey on Deep Learning for Theorem Proving

MIT LicenseMIT

Deep Learning for Theorem Proving (DL4TP)

Awesome License: MIT PRs Welcome GitHub last commit (branch)

Welcome to our repository! This is a curated collection of resources related to deep learning for theorem proving.

We categorize papers primarily based on the applications of deep learning models, organizing them into five task-specific categories and two dataset categories. A single paper may appear in multiple categories due to its relevance to different tasks or datasets. Additionally, each paper is labeled with the used theorem prover/proof calculus/problem domain to help users quickly find the resources that best match their interests or needs. For example, papers using/generating theorems/proofs in natural language are labeled with [NL].

For more details, please refer to our survey paper: A Survey on Deep Learning for Theorem Proving.

Table of Contents

Surveys

  1. Towards the Automatic Mathematician CADE 2021 [paper]

    Rabe, Markus N and Szegedy, Christian

  2. Learning Guided Automated Reasoning: A Brief Survey arXiv 2024 [paper]

    Blaauwbroek, Lasse and Cerna, David and Gauthier, Thibault and Jakubův, Jan and Kaliszyk, Cezary and Suda, Martin and Urban, Josef

  3. A Survey on Deep Learning for Theorem Proving arXiv 2024 [paper]

    Li, Zhaoyu and Sun, Jialiang and Murphy, Logan and Su, Qidong and Li, Zenan and Zhang, Xian and Yang, Kaiyu and Si, Xujie

Tutorials

  1. Tutorial on Machine Learning for Theorem Proving NeurIPS 2023 Tutorial [link] [Coq, Isabelle, Lean]

    First, Emily and Jiang, Albert and Yang, Kaiyu

Tasks

Autoformalization

  1. First Experiments with Neural Translation of Informal to Formal Mathematics CICM 2018 [paper] [NL, Mizar]

    Wang, Qingxiang and Kaliszyk, Cezary and Urban, Josef

  2. Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar CPP 2020 [paper] [NL, Mizar]

    Wang, Qingxiang and Brown, Chad and Kaliszyk, Cezary and Urban, Josef

  3. Learning Alignment between Formal & Informal Mathematics AITP 2020 [paper] [NL, HOL Light]

    Bansal, Kshitij and Szegedy, Christian

  4. A Promising Path Towards Autoformalization and General Artificial Intelligence CICM 2020 [paper]

    Szegedy, Christian

  5. Autoformalization with Large Language Models NeurIPS 2022 [paper] [NL, Isabelle]

    Wu, Yuhuai and Jiang, Albert Qiaochu and Li, Wenda and Rabe, Markus and Staats, Charles and Jamnik, Mateja and Szegedy, Christian

  6. Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs EMNLP 2022 MathNLP Workshop [paper] [NL, Coq]

    Cunningham, Garett and Bunescu, Razvan C and Juedes, David

  7. Towards a Mathematics Formalisation Assistant using Large Language Models arXiv 2022 [paper] [NL, Lean]

    Agrawal, Ayush and Gadgil, Siddhartha and Goyal, Navin and Narayanan, Ashvni and Tadipatri, Anand

  8. Towards Automating Formalisation of Theorem Statements using Large Language Models NeurIPS 2022 MATH-AI Workshop [paper] [NL, Lean]

    Gadgil, Siddhartha and Tadipatri, Anand Rao and Agrawal, Ayush and Narayanan, Ashvni and Goyal, Navin

  9. Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs ICLR 2023 [paper] [NL, Isabelle]

    Jiang, Albert Q and Welleck, Sean and Zhou, Jin Peng and Li, Wenda and Liu, Jiacheng and Jamnik, Mateja and Lacroix, Timothée and Wu, Yuhuai and Lample, Guillaume

  10. Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning EMNLP 2023 Findings [paper] [NL, Prover9, Z3]

    Pan, Liangming and Albalak, Alon and Wang, Xinyi and Wang, William Yang

  11. LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers EMNLP 2023 [paper] [NL, Prover9]

    Olausson, Theo X and Gu, Alex and Lipkin, Benjamin and Zhang, Cedegao E and Solar-Lezama, Armando and Tenenbaum, Joshua B and Levy, Roger

  12. SATLM: Satisfiability-Aided Language Models Using Declarative Prompting NeurIPS 2023 [paper] [NL, Z3]

    Ye, Xi and Chen, Qiaochu and Dillig, Isil and Durrett, Greg

  13. FIMO: A Challenge Formal Dataset for Automated Theorem Proving arXiv 2023 [paper] [NL, Lean]

    Liu, Chengwu and Shen, Jianhao and Xin, Huajian and Liu, Zhengying and Yuan, Ye and Wang, Haiming and Ju, Wei and Zheng, Chuanyang and Yin, Yichun and Li, Lin and Zhang, Ming Zhang and Liu, Qun

  14. Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

    Zhao, Xueliang and Li, Wenda and Kong, Lingpeng

  15. ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics arXiv 2023 [paper] [NL, Lean]

    Azerbayev, Zhangir and Piotrowski, Bartosz and Schoelkopf, Hailey and Ayers, Edward W and Radev, Dragomir and Avigad, Jeremy

  16. Multilingual Mathematical Autoformalization arXiv 2023 [paper] [NL, Isabelle, Lean]

    Jiang, Albert Q and Li, Wenda and Jamnik, Mateja

  17. Lyra: Orchestrating Dual Correction in Automated Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

    Zheng, Chuanyang and Wang, Haiming and Xie, Enze and Liu, Zhengying and Sun, Jiankai and Xin, Huajian and Shen, Jianhao and Li, Zhenguo and Li, Yu

  18. A New Approach Towards Autoformalization arXiv 2023 [paper] [NL, Lean]

    Patel, Nilay and Flanigan, Jeffrey and Saha, Rahul

  19. MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data ICLR 2024 [paper] [NL, Lean]

    Huang, Yinya and Lin, Xiaohan and Liu, Zhengying and Cao, Qingxing and Xin, Huajian and Wang, Haiming and Li, Zhenguo and Song, Linqi and Liang, Xiaodan

  20. Don't Trust: Verify - Grounding LLM Quantitative Reasoning with Autoformalization ICLR 2024 [paper] [NL, Isabelle]

    Zhou, Jin Peng and Staats, Charles E and Li, Wenda and Szegedy, Christian and Weinberger, Kilian Q and Wu, Yuhuai

  21. LEGO-Prover: Neural Theorem Proving with Growing Libraries ICLR 2024 [paper] [NL, Isabelle]

    Wang, Haiming and Xin, Huajian and Zheng, Chuanyang and Li, Lin and Liu, Zhengying and Cao, Qingxing and Huang, Yinya and Xiong, Jing and Shi, Han and Xie, Enze and Yin, Jian and Li, Zhenguo and Liao, Heng and Liang, Xiaodan

  22. Llemma: An Open Language Model for Mathematics ICLR 2024 [paper] [NL, Isabelle, Lean]

    Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Santos, Marco Dos and McAleer, Stephen and Jiang, Albert Q and Deng, Jia and Biderman, Stella and Welleck, Sean

  23. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models arXiv 2024 [paper] [NL, Isabelle]

    Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, YK and Wu, Y and Guo, Daya

  24. InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning arXiv 2024 [paper] [NL, Lean]

    Ying, Huaiyuan and Zhang, Shuo and Li, Linyang and Zhou, Zhejian and Shao, Yunfan and Fei, Zhaoye and Ma, Yichuan and Hong, Jiawei and Liu, Kuikun and Wang, Ziyi and Wang, Yudong and Wu, Zijian and Li, Shuaibin and Zhou, Fengzhe and Liu, Hongwei and Zhang, Songyang and Zhang, Wenwei and Yan, Hang and Qiu, Xipeng and Wang, Jiayu and Chen, Kai and Lin, Dahua

Premise Selection

  1. Deepmath - Deep Sequence Models for Premise Selection NeurIPS 2016 [paper] [Mizar]

    Irving, Geoffrey and Szegedy, Christian and Alemi, Alexander A and Eén, Niklas and Chollet, François and Urban, Josef

  2. HolStep: A Machine Learning Dataset for Higher-Order Logic Theorem Proving ICLR 2017 [paper] [HOL Light]

    Kaliszyk, Cezary and Chollet, François and Szegedy, Christian

  3. Tree-structure CNN for Automated Theorem Proving ICONIP 2017 [paper] [HOL Light]

    Peng, Kebin and Ma, Dianfu

  4. Premise Selection for Theorem Proving by Deep Graph Embedding NeurIPS 2017 [paper] [HOL Light]

    Wang, Mingzhe and Tang, Yihe and Wang, Jian and Deng, Jia

  5. Premise Selection with Neural Networks and Distributed Representation of Features arXiv 2018 [paper] [Mizar]

    Kucik, Andrzej Stanisław and Korovin, Konstantin

  6. HOList: An Environment for Machine Learning of Higher-Order Logic Theorem Proving ICML 2019 [paper] [HOL Light]

    Kshitij Bansal and Sarah M. Loos and Markus Norman Rabe and Christian Szegedy and Stewart Wilcox

  7. Learning Representations of Logical Formulae using Graph Neural Networks NeurIPS 2019 GRL Workshop [paper] [Mizar]

    Glorot, Xavier and Anand, Ankit and Aygun, Eser and Mourad, Shibl and Kohli, Pushmeet and Precup, Doina

  8. Property Invariant Embedding for Automated Reasoning ECAI 2019 [paper] [Mizar]

    Olšák, Miroslav and Kaliszyk, Cezary and Urban, Josef

  9. Usefulness of Lemmas via Graph Neural Networks AITP 2019 [paper] [Mizar]

    Goertzel, Zarathustra and Urban, Josef

  10. Improving Graph Neural Network Representations of Logical Formulae with Subgraph Pooling arXiv 2019 [paper] [Mizar, HOL Light]

    Crouse, Maxwell and Abdelaziz, Ibrahim and Cornelio, Cristina and Thost, Veronika and Wu, Lingfei and Forbus, Kenneth and Fokoue, Achille

  11. Directed Graph Networks for Logical Reasoning (Extended Abstract) PAAR 2020 [paper] [Mizar]

    Rawson, Michael and Reger, Giles

  12. Stateful Premise Selection by Recurrent Neural Networks LPAR 2020 [paper] [Mizar]

    Piotrowski, Bartosz and Urban, Josef

  13. Premise Selection in Natural Language Mathematical Texts ACL 2020 [paper] [NL]

    Ferreira, Deborah and Freitas, André

  14. Natural Language Premise Selection: Finding Supporting Statements for Mathematical Text LREC 2020 [paper] [NL]

    Ferreira, Deborah and Freitas, André

  15. TextGraphs 2022 Shared Task on Natural Language Premise Selection TextGraphs 2020 [paper] [NL]

    Valentino, Marco and Ferreira, Deborah and Thayaparan, Mokanarangan and Freitas, André and Ustalov, Dmitry

  16. IJS at TextGraphs-16 Natural Language Premise Selection Task: Will Contextual Information Improve Natural Language Premise Selection? TextGraphs 2020 [paper] [NL]

    Tran, Thi Hong Hanh and Martinc, Matej and Doucet, Antoine and Pollak, Senja

  17. UNLPS at TextGraphs-16 Natural Language Premise Selection Task: Unsupervised Natural Language Premise Selection in Mathematical Text using Sentence-MPNet TextGraphs 2020 [paper] [NL]

    Trust, Paul and Kadusabe, Provia and Younis, Haseeb and Minghim, Rosane and Milios, Evangelos and Zahran, Ahmed

  18. Keyword-based Natural Language Premise Selection for an Automatic Mathematical Statement Proving TextGraphs 2020 [paper] [NL]

    Dastgheib, Doratossadat and Asgari, Ehsaneddin

  19. TextGraphs-16 Natural Language Premise Selection Task: Zero-Shot Premise Selection with Prompting Generative Language Models TextGraphs 2020 [paper] [NL]

    Kovriguina, Liubov and Teucher, Roman and Wardenga, Robert

  20. Attention Recurrent Cross-Graph Neural Network for Selecting Premises IJMLC 2021 [paper] [Mizar]

    Liu, Qinghua and Xu, Yang and He, Xingxing

  21. Graph Representations for Higher-Order Logic and Theorem Proving AAAI 2020 [paper] [HOL Light]

    Paliwal, Aditya and Loos, Sarah and Rabe, Markus and Bansal, Kshitij and Szegedy, Christian

  22. Improving Stateful Premise Selection with Transformers CICM 2021 [paper] [Mizar]

    Proroković, Krsto and Wand, Michael and Schmidhuber, Jürgen

  23. Contrastive Graph Representations for Logical Formulas Embedding TKDE 2021 [paper] [Mizar]

    Lin, Qika and Liu, Jun and Zhang, Lingling and Pan, Yudai and Hu, Xin and Xu, Fangzhi and Zeng, Hongwei

  24. Graph Contrastive Pre-training for Effective Theorem Reasoning ICML 2021 SSL Workshop [paper] [Coq]

    Li, Zhaoyu and Chen, Binghong and Si, Xujie

  25. NaturalProofs: Mathematical Theorem Proving in Natural Language NeurIPS 2021 [paper] [NL]

    Welleck, Sean and Liu, Jiacheng and Bras, Ronan Le and Hajishirzi, Hannaneh and Choi, Yejin and Cho, Kyunghyun

  26. Contrastive Finetuning of Generative Language Models for Informal Premise Selection AITP 2021 [paper] [NL]

    Han, Jesse Michael and Xu, Tao and Polu, Stanislas and Neelakantan, Arvind and Radford, Alec

  27. STAR: Cross-modal [STA]tement [R]epresentation for Selecting Relevant Mathematical Premises EACL 2021 [paper] [NL]

    Ferreira, Deborah and Freitas, André

  28. A Study of Continuous Vector Representations for Theorem Proving Journal of Logic and Computation 2021 [paper] [Mizar]

    Purgał, Stanisław and Parsert, Julian and Kaliszyk, Cezary

  29. Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  30. The Isabelle ENIGMA ITP 2022 [paper] [Isabelle]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  31. Formal Premise Selection with Language Models AITP 2022 [paper] [Isabelle]

    Tworkowski, Szymon and Mikuła, Maciej and Odrzygóżdż, Tomasz and Czechowski, Konrad and Antoniak, Szymon and Jiang, Albert and Szegedy, Christian and Kuciński, Łukasz and Miłoś, Piotr and Wu, Yuhuai

  32. MizAR 60 for Mizar 50 ITP 2023 [paper] [Mizar]

    Jakubův, Jan and Chvalovský, Karel and Goertzel, Zarathustra and Kaliszyk, Cezary and Olšák, Mirek and Piotrowski, Bartosz and Schulz, Stephan and Suda, Martin and Urban, Josef

  33. Graph Sequence Learning for Premise Selection AITP 2023 [paper] [Mizar]

    Holden, Edvard K and Korovin, Konstantin

  34. CoProver: A Recommender System for Proof Construction CICM 2023 [paper] [PVS]

    Yeh, Eric and Hitaj, Briland and Owre, Sam and Quemener, Maena and Shankar, Natarajan

  35. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models NeurIPS 2023 [paper] [Lean]

    Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima

  36. MLFMF: Data Sets for Machine Learning for Mathematical Formalization NeurIPS 2023 [paper] [Agda, Lean]

    Bauer, Andrej and Petković, Matej and Todorovski, Ljupco

  37. Magnushammer: A Transformer-Based Approach to Premise Selection ICLR 2024 [paper] [Isabelle]

    Mikuła, Maciej and Antoniak, Szymon and Tworkowski, Szymon and Jiang, Albert Qiaochu and Zhou, Jin Peng and Szegedy, Christian and Kuciński, Łukasz and Miłos, Piotr and Wu, Yuhuai

Proofstep Generation

  1. Holophrasm: A Neural Automated Theorem Prover for Higher-Order Logic arXiv 2016 [paper] [Metamath]

    Whalen, Daniel

  2. GamePad: A Learning Environment for Theorem Proving ICLR 2019 [paper] [Coq]

    Huang, Daniel and Dhariwal, Prafulla and Song, Dawn and Sutskever, Ilya

  3. HOList: An Environment for Machine Learning of Higher-Order Logic Theorem Proving ICML 2019 [paper] [HOL Light]

    Kshitij Bansal and Sarah M. Loos and Markus Norman Rabe and Christian Szegedy and Stewart Wilcox

  4. Learning to Prove Theorems via Interacting with Proof Assistants ICML 2019 [paper] [Coq]

    Yang, Kaiyu and Deng, Jia

  5. Graph Representations for Higher-Order Logic and Theorem Proving AAAI 2020 [paper] [HOL Light]

    Paliwal, Aditya and Loos, Sarah and Rabe, Markus and Bansal, Kshitij and Szegedy, Christian

  6. Learning to Prove Theorems by Learning to Generate Theorems NeurIPS 2020 [paper] [Metamath]

    Wang, Mingzhe and Deng, Jia

  7. Generating Correctness Proofs with Neural Networks MAPL 2020 [paper] [Coq]

    Sanchez-Stern, Alex and Alhessi, Yousef and Saul, Lawrence and Lerner, Sorin

  8. TacTok: Semantics-Aware Proof Synthesis OOPSLA 2020 [paper] [Coq]

    First, Emily and Brun, Yuriy and Guha, Arjun

  9. Automated Theorem Proving via Interacting with Proof Assistants by Dynamic Strategies BigCom 2020 [paper] [Coq]

    Mo, Guangshuai and Xiong, Yan and Huang, Wenchao and Ma, Lu

  10. Generative Language Modeling for Automated Theorem Proving arXiv 2020 [paper] [Metamath]

    Polu, Stanislas and Sutskever, Ilya

  11. INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving ICLR 2021 [paper] [inequality]

    Wu, Yuhuai and Jiang, Albert Qiaochu and Ba, Jimmy and Grosse, Roger

  12. TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning NeurIPS 2021 [paper] [HOL4]

    Wu, Minchao and Norrish, Michael and Walder, Christian and Dezfouli, Amir

  13. LISA: Language Models of Isabelle Proofs AITP 2021 [paper] [Isabelle]

    Jiang, Albert Qiaochu and Li, Wenda and Han, Jesse Michael and Wu, Yuhuai

  14. Retrieval-Augmented Proof Step Synthesis AITP 2021 [paper] [HOL Light]

    Szegedy, Christian and Rabe, Markus and Michalewski, Henryk

  15. Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  16. UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression EMNLP 2022 [paper] [geometry]

    Chen, Jiaqi and Li, Tong and Qin, Jinghui and Lu, Pan and Lin, Liang and Chen, Chongyu and Liang, Xiaodan

  17. Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs EMNLP 2022 MathNLP Workshop [paper] [NL, Coq]

    Cunningham, Garett and Bunescu, Razvan C and Juedes, David

  18. HyperTree Proof Search for Neural Theorem Proving NeurIPS 2022 [paper] [Metamath, Lean]

    Lample, Guillaume and Lacroix, Timothee and Lachaux, Marie-Anne and Rodriguez, Aurelien and Hayat, Amaury and Lavril, Thibaut and Ebner, Gabriel and Martinet, Xavier

  19. NaturalProver: Grounded Mathematical Proof Generation with Language Models NeurIPS 2022 [paper] [NL]

    Welleck, Sean and Liu, Jiacheng and Lu, Ximing and Hajishirzi, Hannaneh and Choi, Yejin

  20. Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers NeurIPS 2022 [paper] [Isabelle]

    Jiang, Albert Qiaochu and Li, Wenda and Tworkowski, Szymon and Czechowski, Konrad and Odrzygóżdż, Tomasz and Miłos, Piotr and Wu, Yuhuai and Jamnik, Mateja

  21. Diversity-Driven Automated Formal Verification ICSE 2022 [paper] [Coq]

    First, Emily and Brun, Yuriy

  22. Formal Premise Selection with Language Models AITP 2022 [paper] [Isabelle]

    Tworkowski, Szymon and Mikuła, Maciej and Odrzygóżdż, Tomasz and Czechowski, Konrad and Antoniak, Szymon and Jiang, Albert and Szegedy, Christian and Kuciński, Łukasz and Miłoś, Piotr and Wu, Yuhuai

  23. Passport: Improving Automated Formal Verification Using Identifiers TOPLAS 2023 [paper] [Coq]

    Sanchez-Stern, Alex and First, Emily and Zhou, Timothy and Kaufman, Zhanna and Brun, Yuriy and Ringer, Talia

  24. Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs ICLR 2023 [paper] [NL, Isabelle]

    Jiang, Albert Q and Welleck, Sean and Zhou, Jin Peng and Li, Wenda and Liu, Jiacheng and Jamnik, Mateja and Lacroix, Timothée and Wu, Yuhuai and Lample, Guillaume

  25. Formal Mathematics Statement Curriculum Learning ICLR 2023 [paper] [Lean]

    Polu, Stanislas and Han, Jesse Michael and Zheng, Kunhao and Baksys, Mantas and Babuschkin, Igor and Sutskever, Ilya

  26. Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

    Zhao, Xueliang and Li, Wenda and Kong, Lingpeng

  27. CoProver: A Recommender System for Proof Construction CICM 2023 [paper] [PVS]

    Yeh, Eric and Hitaj, Briland and Owre, Sam and Quemener, Maena and Shankar, Natarajan

  28. Baldur: Whole-Proof Generation and Repair with Large Language Models ESEC/FSE 2023 [paper] [Isabelle]

    First, Emily and Rabe, Markus and Ringer, Talia and Brun, Yuriy

  29. Peano: Learning Formal Mathematical Reasoning Philosophical Transactions of the Royal Society A 2023 [paper] [Peano]

    Poesia, Gabriel and Goodman, Noah D

  30. Mathematical Capabilities of ChatGPT NeurIPS 2023 [paper] [NL]

    Simon Frieder and Luca Pinchetti and Alexis Chevalier and Ryan-Rhys Griffiths and Tommaso Salvatori and Thomas Lukasiewicz and Philipp Christian Petersen and Julius Berner

  31. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models NeurIPS 2023 [paper] [Lean]

    Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima

  32. LLM vs ITP NeurIPS 2023 MATH-AI Workshop [paper] [NL]

    Frieder, Simon and Trimmel, Martin and Alawadhi, Rashid and Gy, Klaus

  33. LLMSTEP: LLM Proofstep Suggestions in Lean NeurIPS 2023 MATH-AI Workshop [paper] [Lean]

    Welleck, Sean and Saha, Rahul

  34. Towards Large Language Models as Copilots for Theorem Proving in Lean NeurIPS 2023 MATH-AI Workshop [paper] [Lean]

    Song, Peiyang and Yang, Kaiyu and Anandkumar, Anima

  35. Temperature-Scaled Large Language Models for Lean Proofstep Prediction NeurIPS 2023 MATH-AI Workshop [paper] [Lean]

    Gloeckle, Fabian and Roziere, Baptiste and Hayat, Amaury and Synnaeve, Gabriel

  36. DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-Level Value Function ACL 2023 [paper] [Isabelle, Lean]

    Wang, Haiming and Yuan, Ye and Liu, Zhengying and Shen, Jianhao and Yin, Yichun and Xiong, Jing and Xie, Enze and Shi, Han and Li, Yujun and Li, Lin and Yin, Jian and Li, Zhenguo and Liang, Xiaodan

  37. Getting More out of Large Language Models for Proofs AITP 2023 [paper] [Coq]

    Zhang, Shizhuo Dylan and Ringer, Talia and First, Emily

  38. UniMath: A Foundational and Multimodal Mathematical Reasoner EMNLP 2023 [paper] [geometry]

    Liang, Zhenwen and Yang, Tianyu and Zhang, Jipeng and Zhang, Xiangliang

  39. TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models EMNLP 2023 [paper] [Lean]

    Xiong, Jing and Shen, Jianhao and Yuan, Ye and Wang, Haiming and Yin, Yichun and Liu, Zhengying and Li, Lin and Guo, Zhijiang and Cao, Qingxing and Huang, Yinya and Zheng, Chuanyang and Liang, Xiaodan and Zhang, Ming and Liu, Qun

  40. Learning Proof Transformations and Its Applications in Interactive Theorem Proving FroCoS 2023 [paper] [Coq]

    Zhang, Liao and Blaauwbroek, Lasse and Kaliszyk, Cezary and Urban, Josef

  41. Evaluating Language Models for Mathematics through Interactions arXiv 2023 [paper] [NL]

    Collins, Katherine M and Jiang, Albert Q and Frieder, Simon and Wong, Lionel and Zilka, Miri and Bhatt, Umang and Lukasiewicz, Thomas and Wu, Yuhuai and Tenenbaum, Joshua B and Hart, William and Gowers, Timothy and Li, Wenda and Weller, Adrian and Jamnik, Mateja

  42. Large Language Models for Mathematicians arXiv 2023 [paper] [NL]

    Scheidt, Gregor vom

  43. Experimental Results from Applying GPT-4 to An Unpublished Formal Language arXiv 2023 [paper] [Axiotome]

    Scheidt, Gregor vom

  44. Large Language Models' Understanding of Math: Source Criticism and Extrapolation arXiv 2023 [paper] [Lean]

    Yousefzadeh, Roozbeh and Cao, Xuenan

  45. Lyra: Orchestrating Dual Correction in Automated Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

    Zheng, Chuanyang and Wang, Haiming and Xie, Enze and Liu, Zhengying and Sun, Jiankai and Xin, Huajian and Shen, Jianhao and Li, Zhenguo and Li, Yu

  46. Enhancing Neural Theorem Proving through Data Augmentation and Dynamic Sampling Method arXiv 2023 [paper] [Lean]

    Vishwakarma, Rahul and Mishra, Subhankar

  47. An In-Context Learning Agent for Formal Theorem-Proving arXiv 2023 [paper] [Lean, Coq]

    Thakur, Amitayush and Tsoukalas, George and Wen, Yeming and Xin, Jimmy and Chaudhuri, Swarat

  48. LEGO-Prover: Neural Theorem Proving with Growing Libraries ICLR 2024 [paper] [NL, Isabelle]

    Wang, Haiming and Xin, Huajian and Zheng, Chuanyang and Li, Lin and Liu, Zhengying and Cao, Qingxing and Huang, Yinya and Xiong, Jing and Shi, Han and Xie, Enze and Yin, Jian and Li, Zhenguo and Liao, Heng and Liang, Xiaodan

  49. Llemma: An Open Language Model for Mathematics ICLR 2024 [paper] [NL, Isabelle, Lean]

    Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Santos, Marco Dos and McAleer, Stephen and Jiang, Albert Q and Deng, Jia and Biderman, Stella and Welleck, Sean

  50. Solving Proof Block Problems Using Large Language Models SIGCSE 2024 [paper] [NL]

    Poulsen, Seth and Sarsa, Sami and Prather, James and Leinonen, Juho and Becker, Brett A and Hellas, Arto and Denny, Paul and Reeves, Brent N

  51. Solving Olympiad Geometry without Human Demonstrations Nature 2024 [paper] [geometry]

    Trinh, Trieu H and Wu, Yuhuai and Le, Quoc V and He, He and Luong, Thang

  52. Graph2Tac: Learning Hierarchical Representations of Math Concepts in Theorem proving arXiv 2024 [paper] [Coq]

    Rute, Jason and Olšák, Miroslav and Blaauwbroek, Lasse and Massolo, Fidel Ivan Schaposnik and Piepenbrock, Jelle and Pestun, Vasily

  53. FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems arXiv 2024 [paper] [geometry]

    He, Yiming and Zou, Jia and Zhang, Xiaokai and Zhu, Na and Leng, Tuo

  54. Selene: Pioneering Automated Proof in Software Verification arXiv 2024 [paper] [Isabelle]

    Zhang, Lichen and Lu, Shuai and Duan, Nan

  55. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models arXiv 2024 [paper] [NL, Isabelle]

    Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, YK and Wu, Y and Guo, Daya

  56. InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning arXiv 2024 [paper] [NL, Lean]

    Ying, Huaiyuan and Zhang, Shuo and Li, Linyang and Zhou, Zhejian and Shao, Yunfan and Fei, Zhaoye and Ma, Yichuan and Hong, Jiawei and Liu, Kuikun and Wang, Ziyi and Wang, Yudong and Wu, Zijian and Li, Shuaibin and Zhou, Fengzhe and Liu, Hongwei and Zhang, Songyang and Zhang, Wenwei and Yan, Hang and Qiu, Xipeng and Wang, Jiayu and Chen, Kai and Lin, Dahua

  57. Verified Multi-Step Synthesis using Large Language Models and Monte Carlo Tree Search arXiv 2024 [paper] [Dafny, Coq, Lean]

    Brandfonbrener, David and Raja, Sibi and Prasad, Tarun and Loughridge, Chloe and Yang, Jianang and Henniger, Simon and Byrd, William E and Zinkov, Robert and Amin, Nada

Proof Search

  1. Deep Network Guided Proof Search LPAR 2017 [paper] [E]

    Loos, Sarah and Irving, Geoffrey and Szegedy, Christian and Kaliszyk, Cezary

  2. Automated Theorem Proving in Intuitionistic Propositional Logic by Deep Reinforcement Learning arXiv 2018 [paper] [IPL]

    Kusumoto, Mitsuru and Yahata, Keisuke and Sakai, Masahiro

  3. ENIGMA-NG: Efficient Neural and Gradient-Boosted Inference Guidance for E CADE 2019 [paper] [ENIGMA]

    Chvalovský, Karel and Jakubův, Jan and Suda, Martin and Urban, Josef

  4. Hammering Mizar by Learning Clause Guidance ITP 2019 [paper] [ENIGMA]

    Jakubův, Jan and Urban, Josef

  5. Learning Dynamic Polynomial Proofs NeurIPS 2019 [paper] [polynomial inequality]

    Fawzi, Alhussein and Malinowski, Mateusz and Fawzi, Hamza and Fawzi, Omar

  6. A Neurally-Guided, Parallel Theorem Prover FroCoS 2019 [paper] [Z3]

    Rawson, Michael and Reger, Giles

  7. Property Invariant Embedding for Automated Reasoning ECAI 2019 [paper] [leanCoP]

    Olšák, Miroslav and Kaliszyk, Cezary and Urban, Josef

  8. Automated Theorem Proving via Interacting with Proof Assistants by Dynamic Strategies BigCom 2020 [paper] [Coq]

    Mo, Guangshuai and Xiong, Yan and Huang, Wenchao and Ma, Lu

  9. Guiding Inferences in Connection Tableau by Recurrent Neural Networks CICM 2020 [paper] [connection tableau]

    Piotrowski, Bartosz and Urban, Josef

  10. ENIGMA Anonymous: Symbol-Independent Inference Guiding Machine (System Description) IJCAR 2020 [paper] [ENIGMA]

    Jakubův, Jan and Chvalovský, Karel and Olšák, Miroslav and Piotrowski, Bartosz and Suda, Martin and Urban, Josef

  11. Deep Reinforcement Learning for Synthesizing Functions in Higher-Order Logic LPAR 2020 [paper] [HOL4]

    Gauthier, Thibault

  12. Generative Language Modeling for Automated Theorem Proving arXiv 2020 [paper] [Metamath]

    Polu, Stanislas and Sutskever, Ilya

  13. Learning to Prove from Synthetic Theorems arXiv 2020 [paper] [saturation]

    Aygün, Eser and Ahmed, Zafarali and Anand, Ankit and Firoiu, Vlad and Glorot, Xavier and Orseau, Laurent and Precup, Doina and Mourad, Shibl

  14. An Experimental Study of Formula Embeddings for Automated Theorem Proving in First-Order Logic arXiv 2020 [paper] [TRAIL]

    Abdelaziz, Ibrahim and Thost, Veronika and Crouse, Maxwell and Fokoue, Achille

  15. Training a First-Order Theorem Prover from Synthetic Data ICLR 2021 MATH-AI Workshop [paper] [saturation]

    Firoiu, Vlad and Aygün, Eser and Anand, Ankit and Ahmed, Zafarali and Glorot, Xavier and Orseau, Laurent and Zhang, Lei and Precup, Doina and Mourad, Shibl

  16. Learned Provability Likelihood for Tactical Search SCSS 2021 [paper] [HOL4]

    Gauthier, Thibault

  17. TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning NeurIPS 2021 [paper] [HOL4]

    Wu, Minchao and Norrish, Michael and Walder, Christian and Dezfouli, Amir

  18. Vampire with a Brain Is a Good ITP Hammer FroCoS 2021 [paper] [Vampire]

    Suda, Martin

  19. Fast and Slow Enigmas and Parental Guidance FroCoS 2021 [paper] [ENIGMA]

    Goertzel, Zarathustra A and Chvalovský, Karel and Jakubův, Jan and Olšák, Miroslav and Urban, Josef

  20. Towards Finding Longer Proofs TABLEAUX 2021 [paper] [leanCoP]

    Zombori, Zsolt and Csiszárik, Adrián and Michalewski, Henryk and Kaliszyk, Cezary and Urban, Josef

  21. lazyCoP: Lazy Paramodulation Meets Neurally Guided Search TABLEAUX 2021 [paper] [leanCoP]

    Rawson, Michael and Reger, Giles

  22. The Role of Entropy in Guiding a Connection Prover TABLEAUX 2021 [paper] [leanCoP]

    Zombori, Zsolt and Urban, Josef and Olšák, Miroslav

  23. Learning Theorem Proving Components TABLEAUX 2021 [paper] [ENIGMA]

    Chvalovský, Karel and Jakubův, Jan and Olšák, Miroslav and Urban, Josef

  24. A Deep Reinforcement Learning Approach to First-Order Logic Theorem Proving AAAI 2021 [paper] [TRAIL]

    Crouse, Maxwell and Abdelaziz, Ibrahim and Makni, Bassem and Whitehead, Spencer and Cornelio, Cristina and Kapanipathi, Pavan and Srinivas, Kavitha and Thost, Veronika and Witbrock, Michael and Fokoue, Achille

  25. Improving ENIGMA-Style Clause Selection While Learning From History CADE 2021 [paper] [Vampire]

    Suda, Martin

  26. Proving Theorems using Incremental Learning and Hindsight Experience Replay ICML 2022 [paper] [saturation]

    Aygün, Eser and Anand, Ankit and Orseau, Laurent and Glorot, Xavier and Mcaleer, Stephen M and Firoiu, Vlad and Zhang, Lei M and Precup, Doina and Mourad, Shibl

  27. HyperTree Proof Search for Neural Theorem Proving NeurIPS 2022 [paper] [Metamath, Lean]

    Lample, Guillaume and Lacroix, Timothee and Lachaux, Marie-Anne and Rodriguez, Aurelien and Hayat, Amaury and Lavril, Thibaut and Ebner, Gabriel and Martinet, Xavier

  28. Learning to Prove Trigonometric Identities arXiv 2022 [paper] [trigonometric identity]

    Liu, Zhou and Li, Yujun and Liu, Zhengying and Li, Lin and Li, Zhenguo

  29. Learning to Guide a Saturation-Based Theorem Prover TPAMI 2022 [paper] [TRAIL]

    Abdelaziz, Ibrahim and Crouse, Maxwell and Makni, Bassem and Austil, Vernon and Cornelio, Cristina and Ikbal, Shajith and Kapanipathi, Pavan and Makondo, Ndivhuwo and Srinivas, Kavitha and Witbrock, Michael and Fokoue, Achille

  30. Machine Learning Meets the Herbrand Universe arXiv 2022 [paper] [instantiation]

    Piepenbrock, Jelle and Urban, Josef and Korovin, Konstantin and Olšák, Miroslav and Heskes, Tom and Janota, Mikolaš

  31. Guiding An Automated Theorem Prover with Neural Rewriting IJCAR 2022 [paper] [Prover9]

    Piepenbrock, Jelle and Heskes, Tom and Janota, Mikoláš and Urban, Josef

  32. The Isabelle ENIGMA ITP 2022 [paper] [ENIGMA]

    Goertzel, Zarathustra A and Jakubův, Jan and Kaliszyk, Cezary and Olšák, Miroslav and Piepenbrock, Jelle and Urban, Josef

  33. Formal Mathematics Statement Curriculum Learning ICLR 2023 [paper] [Lean]

    Polu, Stanislas and Han, Jesse Michael and Zheng, Kunhao and Baksys, Mantas and Babuschkin, Igor and Sutskever, Ilya

  34. An Ensemble Approach for Automated Theorem Proving Based on Efficient Name Invariant Graph Neural Representations IJCAI 2023 [paper] [TRAIL]

    Fokoue, Achille and Abdelaziz, Ibrahim and Crouse, Maxwell and Ikbal, Shajith and Kishimoto, Akihiro and Lima, Guilherme and Makondo, Ndivhuwo and Marinescu, Radu

  35. Peano: Learning Formal Mathematical Reasoning Philosophical Transactions of the Royal Society A 2023 [paper] [Peano]

    Poesia, Gabriel and Goodman, Noah D

  36. MizAR 60 for Mizar 50 ITP 2023 [paper] [ENIGMA]

    Jakubův, Jan and Chvalovský, Karel and Goertzel, Zarathustra and Kaliszyk, Cezary and Olšák, Mirek and Piotrowski, Bartosz and Schulz, Stephan and Suda, Martin and Urban, Josef

  37. Reinforcement Learning for Guiding the E Theorem Prover FLAIRS 2023 [paper] [E]

    McKeown, Jack and Sutcliffe, Geoff

  38. Guiding an Instantiation Prover with Graph Neural Networks LPAR 2023 [paper] [iProver]

    Chvalovský, Karel and Korovin, Konstantin and Piepenbrock, Jelle and Urban, Josef

  39. How Much Should This Symbol Weigh? A GNN-Advised Clause Selection LPAR 2023 [paper] [Vampire]

    Bártěk, Filip and Suda, Martin

  40. gym-saturation: Gymnasium Environments for Saturation Provers (System Description) TABLEAUX 2023 [paper] [Vampire, iProver]

    Shminke, Boris

  41. DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-Level Value Function ACL 2023 [paper] [Isabelle, Lean]

    Wang, Haiming and Yuan, Ye and Liu, Zhengying and Shen, Jianhao and Yin, Yichun and Xiong, Jing and Xie, Enze and Shi, Han and Li, Yujun and Li, Lin and Yin, Jian and Li, Zhenguo and Liang, Xiaodan

  42. An In-Context Learning Agent for Formal Theorem-Proving arXiv 2023 [paper] [Lean, Coq]

    Thakur, Amitayush and Tsoukalas, George and Wen, Yeming and Xin, Jimmy and Chaudhuri, Swarat

  43. Verified Multi-Step Synthesis using Large Language Models and Monte Carlo Tree Search arXiv 2024 [paper] [Dafny, Coq, Lean]

    Brandfonbrener, David and Raja, Sibi and Prasad, Tarun and Loughridge, Chloe and Yang, Jianang and Henniger, Simon and Byrd, William E and Zinkov, Robert and Amin, Nada

Other Tasks

  1. First Neural Conjecturing Datasets and Experiments CICM 2020 [paper] [Mizar]

    Urban, Josef and Jakubův, Jan

  2. Guiding Inferences in Connection Tableau by Recurrent Neural Networks CICM 2020 [paper] [connection tableau]

    Piotrowski, Bartosz and Urban, Josef

  3. Mathematical Reasoning in Latent Space ICLR 2020 [paper] [HOL Light]

    Lee, Dennis and Szegedy, Christian and Rabe, Markus N and Loos, Sarah M and Bansal, Kshitij

  4. Mathematical Reasoning via Self-supervised Skip-tree Training ICLR 2021 [paper] [HOL Light]

    Rabe, Markus N and Lee, Dennis and Bansal, Kshitij and Szegedy, Christian

  5. Latent Action Space for Efficient Planning in Theorem Proving AITP 2021 [paper] [inequality]

    Wu, Minchao and Wu, Yuhuai

  6. IsarStep: A Benchmark for High-Level Mathematical Reasoning ICLR 2021 [paper] [Isabelle]

    Li, Wenda and Yu, Lei and Wu, Yuhuai and Paulson, Lawrence C

  7. LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning ICML 2021 [paper] [Isabelle, HOL Light, Metamath, Lean]

    Wu, Yuhuai and Rabe, Markus N and Li, Wenda and Ba, Jimmy and Grosse, Roger B and Szegedy, Christian

  8. Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  9. Exploring Mathematical Conjecturing with Large Language Models NeSy 2023 [paper] [Isabelle]

    Johansson, Moa and Smallbone, Nicholas

  10. BERT Is Not The Count: Learning to Match Mathematical Statements with Proofs EACL 2023 [paper] [NL]

    Li, Weixian Waylon and Ziser, Yftah and Coavoux, Maximin and Cohen, Shay B

  11. REFACTOR: Learning to Extract Theorems from Proofs ICLR 2024 [paper] [Metamath]

    Zhou, Jin Peng and Wu, Yuhuai and Li, Qiyang and Grosse, Roger

Datasets

Data Collection

  1. Premise Selection for Mathematics by Corpus Analysis and Kernel Methods Journal of Automated Reasoning 2014 [paper] [Mizar]

    Alama, Jesse and Heskes, Tom and Kühlwein, Daniel and Tsivtsivadze, Evgeni and Urban, Josef

  2. MizAR 40 for Mizar 40 Journal of Automated Reasoning 2015 [paper] [Mizar]

    Kaliszyk, Cezary and Urban, Josef

  3. The TPTP Problem Library and Associated Infrastructure Journal of Automated Reasoning 2017 [paper] [TPTP]

    Sutcliffe, Geoff

  4. HolStep: A Machine Learning Dataset for Higher-Order Logic Theorem Proving ICLR 2017 [paper] [HOL Light]

    Kaliszyk, Cezary and Chollet, François and Szegedy, Christian

  5. Reinforcement Learning of Theorem Proving NeurIPS 2018 [paper] [Mizar]

    Kaliszyk, Cezary and Urban, Josef and Michalewski, Henryk and Olšák, Miroslav

  6. GamePad: A Learning Environment for Theorem Proving ICLR 2019 [paper] [Coq]

    Huang, Daniel and Dhariwal, Prafulla and Song, Dawn and Sutskever, Ilya

  7. HOList: An Environment for Machine Learning of Higher-Order Logic Theorem Proving ICML 2019 [paper] [HOL Light]

    Kshitij Bansal and Sarah M. Loos and Markus Norman Rabe and Christian Szegedy and Stewart Wilcox

  8. Learning to Prove Theorems via Interacting with Proof Assistants ICML 2019 [paper] [Coq]

    Yang, Kaiyu and Deng, Jia

  9. TacticToe: Learning to Prove with Tactics Journal of Automated Reasoning 2020 [paper] [HOL4]

    Gauthier, Thibault and Kaliszyk, Cezary and Urban, Josef

  10. Generative Language Modeling for Automated Theorem Proving arXiv 2020 [paper] [pretraining, Metamath]

    Polu, Stanislas and Sutskever, Ilya

  11. The Tactician: A Seamless, Interactive Tactic Learner and Prover for Coq CICM 2020 [paper] [Coq]

    Blaauwbroek, Lasse and Urban, Josef and Geuvers, Herman

  12. Natural Language Premise Selection: Finding Supporting Statements for Mathematical Text LREC 2020 [paper] [NL]

    Ferreira, Deborah and Freitas, André

  13. IsarStep: A Benchmark for High-Level Mathematical Reasoning ICLR 2021 [paper] [Isabelle]

    Li, Wenda and Yu, Lei and Wu, Yuhuai and Paulson, Lawrence C

  14. NaturalProofs: Mathematical Theorem Proving in Natural Language NeurIPS 2021 [paper] [NL]

    Welleck, Sean and Liu, Jiacheng and Bras, Ronan Le and Hajishirzi, Hannaneh and Choi, Yejin and Cho, Kyunghyun

  15. LISA: Language Models of Isabelle Proofs AITP 2021 [paper] [Isabelle]

    Jiang, Albert Qiaochu and Li, Wenda and Han, Jesse Michael and Wu, Yuhuai

  16. Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  17. MiniF2F: A Cross-System Benchmark for Formal Olympiad-Level Mathematics ICLR 2022 [paper] [Metamath, Lean, Isabelle, HOL Light]

    Zheng, Kunhao and Han, Jesse Michael and Polu, Stanislas

  18. UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression EMNLP 2022 [paper] [geometry]

    Chen, Jiaqi and Li, Tong and Qin, Jinghui and Lu, Pan and Lin, Liang and Chen, Chongyu and Liang, Xiaodan

  19. NaturalProver: Grounded Mathematical Proof Generation with Language Models NeurIPS 2022 [paper] [NL]

    Welleck, Sean and Liu, Jiacheng and Lu, Ximing and Hajishirzi, Hannaneh and Choi, Yejin

  20. A Parallel Corpus of Natural Language and Isabelle Artefacts AITP 2022 [paper] [NL, Isabelle]

    Bordg, Anthony and Stathopoulos, Yiannos A and Paulson, Lawrence C

  21. Proof Repair Infrastructure for Supervised Models: Building a Large Proof Repair Dataset ITP 2023 [paper] [Coq]

    Reichel, Tom and Henderson, R and Touchet, Andrew and Gardner, Andrew and Ringer, Talia

  22. BERT Is Not The Count: Learning to Match Mathematical Statements with Proofs EACL 2023 [paper] [NL]

    Li, Weixian Waylon and Ziser, Yftah and Coavoux, Maximin and Cohen, Shay B

  23. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models NeurIPS 2023 [paper] [Lean]

    Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima

  24. MLFMF: Data Sets for Machine Learning for Mathematical Formalization NeurIPS 2023 [paper] [Agda, Lean]

    Bauer, Andrej and Petković, Matej and Todorovski, Ljupco

  25. Mathematical Capabilities of ChatGPT NeurIPS 2023 [paper] [NL]

    Simon Frieder and Luca Pinchetti and Alexis Chevalier and Ryan-Rhys Griffiths and Tommaso Salvatori and Thomas Lukasiewicz and Philipp Christian Petersen and Julius Berner

  26. TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models EMNLP 2023 [paper] [Lean]

    Xiong, Jing and Shen, Jianhao and Yuan, Ye and Wang, Haiming and Yin, Yichun and Liu, Zhengying and Li, Lin and Guo, Zhijiang and Cao, Qingxing and Huang, Yinya and Zheng, Chuanyang and Liang, Xiaodan and Zhang, Ming and Liu, Qun

  27. Generative AI for Math: Part I - MathPile: A Billion-Token-Scale Pretraining Corpus for Math arXiv 2023 [paper] [pretraining]

    Wang, Zengzhi and Xia, Rui and Liu, Pengfei

  28. ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics arXiv 2023 [paper] [pretraining, NL, Lean]

    Azerbayev, Zhangir and Piotrowski, Bartosz and Schoelkopf, Hailey and Ayers, Edward W and Radev, Dragomir and Avigad, Jeremy

  29. FIMO: A Challenge Formal Dataset for Automated Theorem Proving arXiv 2023 [paper] [NL, Lean]

    Liu, Chengwu and Shen, Jianhao and Xin, Huajian and Liu, Zhengying and Yuan, Ye and Wang, Haiming and Ju, Wei and Zheng, Chuanyang and Yin, Yichun and Li, Lin and Zhang, Ming Zhang and Liu, Qun

  30. FormalGeo: The First Step Toward Human-like IMO-level Geometric Automated Reasoning arXiv 2023 [paper] [geometry]

    Zhang, Xiaokai and Zhu, Na and He, Yiming and Zou, Jia and Huang, Qike and Jin, Xiaoxiao and Guo, Yanjun and Mao, Chenyang and Li, Yang and Zhu, Zhe and Yue, Dengfeng and Zhu, Fangzhen and Wang, Yifan and Huang, Yiwen and Wang, Runan and Qin, Cheng and Zeng, Zhenbing and Xie, Shaorong and Luo, Xiangfeng and Leng, Tuo

  31. Llemma: An Open Language Model for Mathematics ICLR 2024 [paper] [pretraining]

    Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Santos, Marco Dos and McAleer, Stephen and Jiang, Albert Q and Deng, Jia and Biderman, Stella and Welleck, Sean

  32. Magnushammer: A Transformer-Based Approach to Premise Selection ICLR 2024 [paper] [Isabelle]

    Mikuła, Maciej and Antoniak, Szymon and Tworkowski, Szymon and Jiang, Albert Qiaochu and Zhou, Jin Peng and Szegedy, Christian and Kuciński, Łukasz and Miłos, Piotr and Wu, Yuhuai

  33. OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text ICLR 2024 [paper] [pretraining]

    Paster, Keiran and Santos, Marco Dos and Azerbayev, Zhangir and Ba, Jimmy

  34. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models arXiv 2024 [paper] [pretraining]

    Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, YK and Wu, Y and Guo, Daya

  35. InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning arXiv 2024 [paper] [pretraining]

    Ying, Huaiyuan and Zhang, Shuo and Li, Linyang and Zhou, Zhejian and Shao, Yunfan and Fei, Zhaoye and Ma, Yichuan and Hong, Jiawei and Liu, Kuikun and Wang, Ziyi and Wang, Yudong and Wu, Zijian and Li, Shuaibin and Zhou, Fengzhe and Liu, Hongwei and Zhang, Songyang and Zhang, Wenwei and Yan, Hang and Qiu, Xipeng and Wang, Jiayu and Chen, Kai and Lin, Dahua

Data Generation

  1. Learning to Reason in Large Theories without Imitation arXiv 2019 [paper] [HOL Light]

    Bansal, Kshitij and Szegedy, Christian and Rabe, Markus N and Loos, Sarah M and Toman, Viktor

  2. Learning to Prove Theorems by Learning to Generate Theorems NeurIPS 2020 [paper] [Metamath]

    Wang, Mingzhe and Deng, Jia

  3. INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving ICLR 2021 [paper] [inequality]

    Wu, Yuhuai and Jiang, Albert Qiaochu and Ba, Jimmy and Grosse, Roger

  4. Training a First-Order Theorem Prover from Synthetic Data ICLR 2021 MATH-AI Workshop [paper] [saturation]

    Firoiu, Vlad and Aygün, Eser and Anand, Ankit and Ahmed, Zafarali and Glorot, Xavier and Orseau, Laurent and Zhang, Lei and Precup, Doina and Mourad, Shibl

  5. LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning ICML 2021 [paper] [reasoning primitives]

    Wu, Yuhuai and Rabe, Markus N and Li, Wenda and Ba, Jimmy and Grosse, Roger B and Szegedy, Christian

  6. Proving Theorems using Incremental Learning and Hindsight Experience Replay ICML 2022 [paper] [saturation]

    Aygün, Eser and Anand, Ankit and Orseau, Laurent and Glorot, Xavier and Mcaleer, Stephen M and Firoiu, Vlad and Zhang, Lei M and Precup, Doina and Mourad, Shibl

  7. Synthetic Proof Term Data Augmentation for Theorem Proving with Language Models AITP 2022 [paper] [Lean]

    Palermo, Joseph and Ye, Johnny and Han, Jesse Michael

  8. Learning to Prove Trigonometric Identities arXiv 2022 [paper] [trigonometric identity]

    Liu, Zhou and Li, Yujun and Liu, Zhengying and Li, Lin and Li, Zhenguo

  9. Formal Mathematics Statement Curriculum Learning ICLR 2023 [paper] [Lean]

    Polu, Stanislas and Han, Jesse Michael and Zheng, Kunhao and Baksys, Mantas and Babuschkin, Igor and Sutskever, Ilya

  10. GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning arXiv 2023 [paper] [geometry]

    Kazemi, Mehran and Alvari, Hamidreza and Anand, Ankit and Wu, Jialin and Chen, Xi and Soricut, Radu

  11. Multilingual Mathematical Autoformalization arXiv 2023 [paper] [NL, Isabelle, Lean]

    Jiang, Albert Q and Li, Wenda and Jamnik, Mateja

  12. Solving Olympiad Geometry without Human Demonstrations Nature 2024 [paper] [geometry]

    Trinh, Trieu H and Wu, Yuhuai and Le, Quoc V and He, He and Luong, Thang

  13. LEGO-Prover: Neural Theorem Proving with Growing Libraries ICLR 2024 [paper] [Isabelle]

    Wang, Haiming and Xin, Huajian and Zheng, Chuanyang and Li, Lin and Liu, Zhengying and Cao, Qingxing and Huang, Yinya and Xiong, Jing and Shi, Han and Xie, Enze and Yin, Jian and Li, Zhenguo and Liao, Heng and Liang, Xiaodan

  14. REFACTOR: Learning to Extract Theorems from Proofs ICLR 2024 [paper] [Metamath]

    Zhou, Jin Peng and Wu, Yuhuai and Li, Qiyang and Grosse, Roger

  15. MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data ICLR 2024 [paper] [NL, Lean]

    Huang, Yinya and Lin, Xiaohan and Liu, Zhengying and Cao, Qingxing and Xin, Huajian and Wang, Haiming and Li, Zhenguo and Song, Linqi and Liang, Xiaodan

Related Surveys

  1. QED at Large: A Survey of Engineering of Formally Verified Software Foundations and Trends® in Programming Languages 2019 [paper]

    Ringer, Talia and Palmskog, Karl and Sergey, Ilya and Gligoric, Milos and Tatlock, Zachary

  2. A Survey of Deep Learning for Mathematical Reasoning ACL 2023 [paper]

    Lu, Pan and Qiu, Liang and Yu, Wenhao and Welleck, Sean and Chang, Kai-Wei

  3. Mathematical Language Models: A Survey arXiv 2023 [paper]

    Liu, Wentao and Hu, Hanglei and Zhou, Jie and Ding, Yuyang and Li, Junsong and Zeng, Jiayi and He, Mengliang and Chen, Qin and Jiang, Bo and Zhou, Aimin and He, Liang

  4. Large Language Models for Mathematical Reasoning: Progresses and Challenges EACL 2024 [paper]

    Ahn, Janice and Verma, Rishu and Lou, Renze and Liu, Di and Zhang, Rui and Yin, Wenpeng

  5. AI for Math Resources Google Sheet [link]

    Available to Everyone

Citation

If you find this repository useful, please consider citing our survey paper:

@article{li2024dl4tp,
   title={A Survey on Deep Learning for Theorem Proving}, 
   author={Li, Zhaoyu and Sun, Jialiang and Murphy, Logan and Su, Qidong and Li, Zenan and Zhang, Xian and Yang, Kaiyu and Si, Xujie},
   journal={arXiv preprint arXiv:2404.09939},
   year={2024},
}