/DL4TP

[COLM 2024] A Survey on Deep Learning for Theorem Proving

MIT LicenseMIT

Deep Learning for Theorem Proving (DL4TP)

Awesome License: MIT PRs Welcome GitHub last commit (branch)

Welcome to our repository! This is a curated collection of resources related to deep learning for theorem proving.

We categorize papers primarily based on the applications of deep learning models, organizing them into five task-specific categories and two dataset categories. A single paper may appear in multiple categories due to its relevance to different tasks or datasets. Additionally, each paper is labeled with the used theorem prover/proof calculus/problem domain to help users quickly find the resources that best match their interests or needs. For example, papers using/generating theorems/proofs in natural language are labeled with [NL].

For more details, please refer to our survey paper: A Survey on Deep Learning for Theorem Proving (COLM 2024).

Table of Contents

Surveys

  1. Towards the Automatic Mathematician CADE 2021 [paper]

    Rabe, Markus N and Szegedy, Christian

  2. Theorem Proving in Artificial Neural Networks: New Frontiers in Mathematical AI EJPS 2024 [paper]

    Pantsar, Markus

  3. Learning Guided Automated Reasoning: A Brief Survey arXiv 2024 [paper]

    Blaauwbroek, Lasse and Cerna, David and Gauthier, Thibault and Jakubův, Jan and Kaliszyk, Cezary and Suda, Martin and Urban, Josef

  4. A Survey on Deep Learning for Theorem Proving COLM 2024 [paper]

    Li, Zhaoyu and Sun, Jialiang and Murphy, Logan and Su, Qidong and Li, Zenan and Zhang, Xian and Yang, Kaiyu and Si, Xujie

Tutorials

  1. A Tutorial on Neural Theorem Proving IJCAI 2023 Tutorial [link] [Lean, Isabelle]

    Welleck, Sean

  2. Tutorial on Machine Learning for Theorem Proving NeurIPS 2023 Tutorial [link] [Coq, Isabelle, Lean]

    First, Emily and Jiang, Albert and Yang, Kaiyu

Tasks

Autoformalization

  1. First Experiments with Neural Translation of Informal to Formal Mathematics CICM 2018 [paper] [NL, Mizar]

    Wang, Qingxiang and Kaliszyk, Cezary and Urban, Josef

  2. Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar CPP 2020 [paper] [NL, Mizar]

    Wang, Qingxiang and Brown, Chad and Kaliszyk, Cezary and Urban, Josef

  3. Learning Alignment between Formal & Informal Mathematics AITP 2020 [paper] [NL, HOL Light]

    Bansal, Kshitij and Szegedy, Christian

  4. A Promising Path Towards Autoformalization and General Artificial Intelligence CICM 2020 [paper]

    Szegedy, Christian

  5. Autoformalization with Large Language Models NeurIPS 2022 [paper] [NL, Isabelle]

    Wu, Yuhuai and Jiang, Albert Qiaochu and Li, Wenda and Rabe, Markus and Staats, Charles and Jamnik, Mateja and Szegedy, Christian

  6. Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs EMNLP 2022 MathNLP Workshop [paper] [NL, Coq]

    Cunningham, Garett and Bunescu, Razvan C and Juedes, David

  7. Towards a Mathematics Formalisation Assistant using Large Language Models arXiv 2022 [paper] [NL, Lean]

    Agrawal, Ayush and Gadgil, Siddhartha and Goyal, Navin and Narayanan, Ashvni and Tadipatri, Anand

  8. Towards Automating Formalisation of Theorem Statements using Large Language Models NeurIPS 2022 MATH-AI Workshop [paper] [NL, Lean]

    Gadgil, Siddhartha and Tadipatri, Anand Rao and Agrawal, Ayush and Narayanan, Ashvni and Goyal, Navin

  9. Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs ICLR 2023 [paper] [NL, Isabelle]

    Jiang, Albert Q and Welleck, Sean and Zhou, Jin Peng and Li, Wenda and Liu, Jiacheng and Jamnik, Mateja and Lacroix, Timothée and Wu, Yuhuai and Lample, Guillaume

  10. Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning EMNLP 2023 Findings [paper] [NL, Prover9, Z3]

    Pan, Liangming and Albalak, Alon and Wang, Xinyi and Wang, William Yang

  11. LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers EMNLP 2023 [paper] [NL, Prover9]

    Olausson, Theo X and Gu, Alex and Lipkin, Benjamin and Zhang, Cedegao E and Solar-Lezama, Armando and Tenenbaum, Joshua B and Levy, Roger

  12. SATLM: Satisfiability-Aided Language Models Using Declarative Prompting NeurIPS 2023 [paper] [NL, Z3]

    Ye, Xi and Chen, Qiaochu and Dillig, Isil and Durrett, Greg

  13. FIMO: A Challenge Formal Dataset for Automated Theorem Proving arXiv 2023 [paper] [NL, Lean]

    Liu, Chengwu and Shen, Jianhao and Xin, Huajian and Liu, Zhengying and Yuan, Ye and Wang, Haiming and Ju, Wei and Zheng, Chuanyang and Yin, Yichun and Li, Lin and Zhang, Ming Zhang and Liu, Qun

  14. Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

    Zhao, Xueliang and Li, Wenda and Kong, Lingpeng

  15. ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics arXiv 2023 [paper] [NL, Lean]

    Azerbayev, Zhangir and Piotrowski, Bartosz and Schoelkopf, Hailey and Ayers, Edward W and Radev, Dragomir and Avigad, Jeremy

  16. Multilingual Mathematical Autoformalization arXiv 2023 [paper] [NL, Isabelle, Lean]

    Jiang, Albert Q and Li, Wenda and Jamnik, Mateja

  17. Lyra: Orchestrating Dual Correction in Automated Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

    Zheng, Chuanyang and Wang, Haiming and Xie, Enze and Liu, Zhengying and Sun, Jiankai and Xin, Huajian and Shen, Jianhao and Li, Zhenguo and Li, Yu

  18. A New Approach Towards Autoformalization arXiv 2023 [paper] [NL, Lean]

    Patel, Nilay and Flanigan, Jeffrey and Saha, Rahul

  19. MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data ICLR 2024 [paper] [NL, Lean]

    Huang, Yinya and Lin, Xiaohan and Liu, Zhengying and Cao, Qingxing and Xin, Huajian and Wang, Haiming and Li, Zhenguo and Song, Linqi and Liang, Xiaodan

  20. Don't Trust: Verify - Grounding LLM Quantitative Reasoning with Autoformalization ICLR 2024 [paper] [NL, Isabelle]

    Zhou, Jin Peng and Staats, Charles E and Li, Wenda and Szegedy, Christian and Weinberger, Kilian Q and Wu, Yuhuai

  21. LEGO-Prover: Neural Theorem Proving with Growing Libraries ICLR 2024 [paper] [NL, Isabelle]

    Wang, Haiming and Xin, Huajian and Zheng, Chuanyang and Li, Lin and Liu, Zhengying and Cao, Qingxing and Huang, Yinya and Xiong, Jing and Shi, Han and Xie, Enze and Yin, Jian and Li, Zhenguo and Liao, Heng and Liang, Xiaodan

  22. Llemma: An Open Language Model for Mathematics ICLR 2024 [paper] [NL, Isabelle, Lean]

    Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Santos, Marco Dos and McAleer, Stephen and Jiang, Albert Q and Deng, Jia and Biderman, Stella and Welleck, Sean

  23. LeanReasoner: Boosting Complex Logical Reasoning with Lean NAACL 2024 [paper] [NL, Lean]

    Jiang, Dongwei and Fonseca, Marcio and Cohen, Shay B

  24. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models arXiv 2024 [paper] [NL, Isabelle]

    Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, YK and Wu, Y and Guo, Daya

  25. InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning arXiv 2024 [paper] [NL, Lean]

    Ying, Huaiyuan and Zhang, Shuo and Li, Linyang and Zhou, Zhejian and Shao, Yunfan and Fei, Zhaoye and Ma, Yichuan and Hong, Jiawei and Liu, Kuikun and Wang, Ziyi and Wang, Yudong and Wu, Zijian and Li, Shuaibin and Zhou, Fengzhe and Liu, Hongwei and Zhang, Songyang and Zhang, Wenwei and Yan, Hang and Qiu, Xipeng and Wang, Jiayu and Chen, Kai and Lin, Dahua

  26. Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving arXiv 2024 [paper] [NL, Isabelle]

    Quan, Xin and Valentino, Marco and Dennis, Louise A and Freitas, André

  27. Process-Driven Autoformalization in Lean 4 arXiv 2024 [paper] [NL, Lean]

    Lu, Jianqiao and Liu, Zhengying and Wan, Yingjia and Huang, Yinya and Wang, Haiming and Yang, Zhicheng and Tang, Jing and Guo, Zhijiang

  28. DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data arXiv 2024 [paper] [NL, Lean]

    Xin, Huajian and Guo, Daya and Shao, Zhihong and Ren, Zhizhou and Zhu, Qihao and Liu, Bo and Ruan, Chong and Li, Wenda and Liang, Xiaodan

  29. Improving Autoformalization using Type Checking arXiv 2024 [paper] [NL, Lean]

    Poiroux, Auguste and Weiss, Gail and Kunčak, Viktor and Bosselut, Antoine

  30. Lean Workbook: A Large-Scale Lean Problem Set Formalized from Natural Language Math Problems arXiv 2024 [paper] [NL, Lean]

    Ying, Huaiyuan and Wu, Zijian and Geng, Yihan and Wang, Jiayu and Lin, Dahua and Chen, Kai

Premise Selection

  1. Deepmath - Deep Sequence Models for Premise Selection NeurIPS 2016 [paper] [Mizar]

    Irving, Geoffrey and Szegedy, Christian and Alemi, Alexander A and Eén, Niklas and Chollet, François and Urban, Josef

  2. HolStep: A Machine Learning Dataset for Higher-Order Logic Theorem Proving ICLR 2017 [paper] [HOL Light]

    Kaliszyk, Cezary and Chollet, François and Szegedy, Christian

  3. Tree-structure CNN for Automated Theorem Proving ICONIP 2017 [paper] [HOL Light]

    Peng, Kebin and Ma, Dianfu

  4. Premise Selection for Theorem Proving by Deep Graph Embedding NeurIPS 2017 [paper] [HOL Light]

    Wang, Mingzhe and Tang, Yihe and Wang, Jian and Deng, Jia

  5. Premise Selection with Neural Networks and Distributed Representation of Features arXiv 2018 [paper] [Mizar]

    Kucik, Andrzej Stanisław and Korovin, Konstantin

  6. HOList: An Environment for Machine Learning of Higher-Order Logic Theorem Proving ICML 2019 [paper] [HOL Light]

    Kshitij Bansal and Sarah M. Loos and Markus Norman Rabe and Christian Szegedy and Stewart Wilcox

  7. Learning Representations of Logical Formulae using Graph Neural Networks NeurIPS 2019 GRL Workshop [paper] [Mizar]

    Glorot, Xavier and Anand, Ankit and Aygun, Eser and Mourad, Shibl and Kohli, Pushmeet and Precup, Doina

  8. Property Invariant Embedding for Automated Reasoning ECAI 2019 [paper] [Mizar]

    Olšák, Miroslav and Kaliszyk, Cezary and Urban, Josef

  9. Usefulness of Lemmas via Graph Neural Networks AITP 2019 [paper] [Mizar]

    Goertzel, Zarathustra and Urban, Josef

  10. Improving Graph Neural Network Representations of Logical Formulae with Subgraph Pooling arXiv 2019 [paper] [Mizar, HOL Light]

    Crouse, Maxwell and Abdelaziz, Ibrahim and Cornelio, Cristina and Thost, Veronika and Wu, Lingfei and Forbus, Kenneth and Fokoue, Achille

  11. Directed Graph Networks for Logical Reasoning (Extended Abstract) PAAR 2020 [paper] [Mizar]

    Rawson, Michael and Reger, Giles

  12. Stateful Premise Selection by Recurrent Neural Networks LPAR 2020 [paper] [Mizar]

    Piotrowski, Bartosz and Urban, Josef

  13. Premise Selection in Natural Language Mathematical Texts ACL 2020 [paper] [NL]

    Ferreira, Deborah and Freitas, André

  14. Natural Language Premise Selection: Finding Supporting Statements for Mathematical Text LREC 2020 [paper] [NL]

    Ferreira, Deborah and Freitas, André

  15. TextGraphs 2022 Shared Task on Natural Language Premise Selection TextGraphs 2020 [paper] [NL]

    Valentino, Marco and Ferreira, Deborah and Thayaparan, Mokanarangan and Freitas, André and Ustalov, Dmitry

  16. IJS at TextGraphs-16 Natural Language Premise Selection Task: Will Contextual Information Improve Natural Language Premise Selection? TextGraphs 2020 [paper] [NL]

    Tran, Thi Hong Hanh and Martinc, Matej and Doucet, Antoine and Pollak, Senja

  17. UNLPS at TextGraphs-16 Natural Language Premise Selection Task: Unsupervised Natural Language Premise Selection in Mathematical Text using Sentence-MPNet TextGraphs 2020 [paper] [NL]

    Trust, Paul and Kadusabe, Provia and Younis, Haseeb and Minghim, Rosane and Milios, Evangelos and Zahran, Ahmed

  18. Keyword-based Natural Language Premise Selection for an Automatic Mathematical Statement Proving TextGraphs 2020 [paper] [NL]

    Dastgheib, Doratossadat and Asgari, Ehsaneddin

  19. TextGraphs-16 Natural Language Premise Selection Task: Zero-Shot Premise Selection with Prompting Generative Language Models TextGraphs 2020 [paper] [NL]

    Kovriguina, Liubov and Teucher, Roman and Wardenga, Robert

  20. Attention Recurrent Cross-Graph Neural Network for Selecting Premises IJMLC 2021 [paper] [Mizar]

    Liu, Qinghua and Xu, Yang and He, Xingxing

  21. Graph Representations for Higher-Order Logic and Theorem Proving AAAI 2020 [paper] [HOL Light]

    Paliwal, Aditya and Loos, Sarah and Rabe, Markus and Bansal, Kshitij and Szegedy, Christian

  22. Improving Stateful Premise Selection with Transformers CICM 2021 [paper] [Mizar]

    Proroković, Krsto and Wand, Michael and Schmidhuber, Jürgen

  23. Contrastive Graph Representations for Logical Formulas Embedding TKDE 2021 [paper] [Mizar]

    Lin, Qika and Liu, Jun and Zhang, Lingling and Pan, Yudai and Hu, Xin and Xu, Fangzhi and Zeng, Hongwei

  24. Graph Contrastive Pre-training for Effective Theorem Reasoning ICML 2021 SSL Workshop [paper] [Coq]

    Li, Zhaoyu and Chen, Binghong and Si, Xujie

  25. NaturalProofs: Mathematical Theorem Proving in Natural Language NeurIPS 2021 [paper] [NL]

    Welleck, Sean and Liu, Jiacheng and Bras, Ronan Le and Hajishirzi, Hannaneh and Choi, Yejin and Cho, Kyunghyun

  26. Contrastive Finetuning of Generative Language Models for Informal Premise Selection AITP 2021 [paper] [NL]

    Han, Jesse Michael and Xu, Tao and Polu, Stanislas and Neelakantan, Arvind and Radford, Alec

  27. STAR: Cross-modal [STA]tement [R]epresentation for Selecting Relevant Mathematical Premises EACL 2021 [paper] [NL]

    Ferreira, Deborah and Freitas, André

  28. A Study of Continuous Vector Representations for Theorem Proving Journal of Logic and Computation 2021 [paper] [Mizar]

    Purgał, Stanisław and Parsert, Julian and Kaliszyk, Cezary

  29. Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  30. The Isabelle ENIGMA ITP 2022 [paper] [Isabelle]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  31. Formal Premise Selection with Language Models AITP 2022 [paper] [Isabelle]

    Tworkowski, Szymon and Mikuła, Maciej and Odrzygóżdż, Tomasz and Czechowski, Konrad and Antoniak, Szymon and Jiang, Albert and Szegedy, Christian and Kuciński, Łukasz and Miłoś, Piotr and Wu, Yuhuai

  32. MizAR 60 for Mizar 50 ITP 2023 [paper] [Mizar]

    Jakubův, Jan and Chvalovský, Karel and Goertzel, Zarathustra and Kaliszyk, Cezary and Olšák, Mirek and Piotrowski, Bartosz and Schulz, Stephan and Suda, Martin and Urban, Josef

  33. Graph Sequence Learning for Premise Selection AITP 2023 [paper] [Mizar]

    Holden, Edvard K and Korovin, Konstantin

  34. CoProver: A Recommender System for Proof Construction CICM 2023 [paper] [PVS]

    Yeh, Eric and Hitaj, Briland and Owre, Sam and Quemener, Maena and Shankar, Natarajan

  35. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models NeurIPS 2023 [paper] [Lean]

    Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima

  36. MLFMF: Data Sets for Machine Learning for Mathematical Formalization NeurIPS 2023 [paper] [Agda, Lean]

    Bauer, Andrej and Petković, Matej and Todorovski, Ljupco

  37. BAIT: Benchmarking (Embedding) Architectures for Interactive Theorem-Proving AAAI 2024 [paper]

    Lamont, Sean and Norrish, Michael and Dezfouli, Amir and Walder, Christian and Montague, Paul

  38. Magnushammer: A Transformer-Based Approach to Premise Selection ICLR 2024 [paper] [Isabelle]

    Mikuła, Maciej and Antoniak, Szymon and Tworkowski, Szymon and Jiang, Albert Qiaochu and Zhou, Jin Peng and Szegedy, Christian and Kuciński, Łukasz and Miłos, Piotr and Wu, Yuhuai

Proofstep Generation

  1. Holophrasm: A Neural Automated Theorem Prover for Higher-Order Logic arXiv 2016 [paper] [Metamath]

    Whalen, Daniel

  2. GamePad: A Learning Environment for Theorem Proving ICLR 2019 [paper] [Coq]

    Huang, Daniel and Dhariwal, Prafulla and Song, Dawn and Sutskever, Ilya

  3. HOList: An Environment for Machine Learning of Higher-Order Logic Theorem Proving ICML 2019 [paper] [HOL Light]

    Kshitij Bansal and Sarah M. Loos and Markus Norman Rabe and Christian Szegedy and Stewart Wilcox

  4. Learning to Prove Theorems via Interacting with Proof Assistants ICML 2019 [paper] [Coq]

    Yang, Kaiyu and Deng, Jia

  5. Graph Representations for Higher-Order Logic and Theorem Proving AAAI 2020 [paper] [HOL Light]

    Paliwal, Aditya and Loos, Sarah and Rabe, Markus and Bansal, Kshitij and Szegedy, Christian

  6. Learning to Prove Theorems by Learning to Generate Theorems NeurIPS 2020 [paper] [Metamath]

    Wang, Mingzhe and Deng, Jia

  7. Generating Correctness Proofs with Neural Networks MAPL 2020 [paper] [Coq]

    Sanchez-Stern, Alex and Alhessi, Yousef and Saul, Lawrence and Lerner, Sorin

  8. TacTok: Semantics-Aware Proof Synthesis OOPSLA 2020 [paper] [Coq]

    First, Emily and Brun, Yuriy and Guha, Arjun

  9. Automated Theorem Proving via Interacting with Proof Assistants by Dynamic Strategies BigCom 2020 [paper] [Coq]

    Mo, Guangshuai and Xiong, Yan and Huang, Wenchao and Ma, Lu

  10. Generative Language Modeling for Automated Theorem Proving arXiv 2020 [paper] [Metamath]

    Polu, Stanislas and Sutskever, Ilya

  11. INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving ICLR 2021 [paper] [inequality]

    Wu, Yuhuai and Jiang, Albert Qiaochu and Ba, Jimmy and Grosse, Roger

  12. TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning NeurIPS 2021 [paper] [HOL4]

    Wu, Minchao and Norrish, Michael and Walder, Christian and Dezfouli, Amir

  13. LISA: Language Models of Isabelle Proofs AITP 2021 [paper] [Isabelle]

    Jiang, Albert Qiaochu and Li, Wenda and Han, Jesse Michael and Wu, Yuhuai

  14. Retrieval-Augmented Proof Step Synthesis AITP 2021 [paper] [HOL Light]

    Szegedy, Christian and Rabe, Markus and Michalewski, Henryk

  15. Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  16. UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression EMNLP 2022 [paper] [geometry]

    Chen, Jiaqi and Li, Tong and Qin, Jinghui and Lu, Pan and Lin, Liang and Chen, Chongyu and Liang, Xiaodan

  17. Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs EMNLP 2022 MathNLP Workshop [paper] [NL, Coq]

    Cunningham, Garett and Bunescu, Razvan C and Juedes, David

  18. HyperTree Proof Search for Neural Theorem Proving NeurIPS 2022 [paper] [Metamath, Lean]

    Lample, Guillaume and Lacroix, Timothee and Lachaux, Marie-Anne and Rodriguez, Aurelien and Hayat, Amaury and Lavril, Thibaut and Ebner, Gabriel and Martinet, Xavier

  19. NaturalProver: Grounded Mathematical Proof Generation with Language Models NeurIPS 2022 [paper] [NL]

    Welleck, Sean and Liu, Jiacheng and Lu, Ximing and Hajishirzi, Hannaneh and Choi, Yejin

  20. Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers NeurIPS 2022 [paper] [Isabelle]

    Jiang, Albert Qiaochu and Li, Wenda and Tworkowski, Szymon and Czechowski, Konrad and Odrzygóżdż, Tomasz and Miłos, Piotr and Wu, Yuhuai and Jamnik, Mateja

  21. Diversity-Driven Automated Formal Verification ICSE 2022 [paper] [Coq]

    First, Emily and Brun, Yuriy

  22. Formal Premise Selection with Language Models AITP 2022 [paper] [Isabelle]

    Tworkowski, Szymon and Mikuła, Maciej and Odrzygóżdż, Tomasz and Czechowski, Konrad and Antoniak, Szymon and Jiang, Albert and Szegedy, Christian and Kuciński, Łukasz and Miłoś, Piotr and Wu, Yuhuai

  23. Passport: Improving Automated Formal Verification Using Identifiers TOPLAS 2023 [paper] [Coq]

    Sanchez-Stern, Alex and First, Emily and Zhou, Timothy and Kaufman, Zhanna and Brun, Yuriy and Ringer, Talia

  24. Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs ICLR 2023 [paper] [NL, Isabelle]

    Jiang, Albert Q and Welleck, Sean and Zhou, Jin Peng and Li, Wenda and Liu, Jiacheng and Jamnik, Mateja and Lacroix, Timothée and Wu, Yuhuai and Lample, Guillaume

  25. Formal Mathematics Statement Curriculum Learning ICLR 2023 [paper] [Lean]

    Polu, Stanislas and Han, Jesse Michael and Zheng, Kunhao and Baksys, Mantas and Babuschkin, Igor and Sutskever, Ilya

  26. Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

    Zhao, Xueliang and Li, Wenda and Kong, Lingpeng

  27. CoProver: A Recommender System for Proof Construction CICM 2023 [paper] [PVS]

    Yeh, Eric and Hitaj, Briland and Owre, Sam and Quemener, Maena and Shankar, Natarajan

  28. Baldur: Whole-Proof Generation and Repair with Large Language Models ESEC/FSE 2023 [paper] [Isabelle]

    First, Emily and Rabe, Markus and Ringer, Talia and Brun, Yuriy

  29. Peano: Learning Formal Mathematical Reasoning Philosophical Transactions of the Royal Society A 2023 [paper] [Peano]

    Poesia, Gabriel and Goodman, Noah D

  30. Mathematical Capabilities of ChatGPT NeurIPS 2023 [paper] [NL]

    Simon Frieder and Luca Pinchetti and Alexis Chevalier and Ryan-Rhys Griffiths and Tommaso Salvatori and Thomas Lukasiewicz and Philipp Christian Petersen and Julius Berner

  31. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models NeurIPS 2023 [paper] [Lean]

    Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima

  32. LLM vs ITP NeurIPS 2023 MATH-AI Workshop [paper] [NL]

    Frieder, Simon and Trimmel, Martin and Alawadhi, Rashid and Gy, Klaus

  33. LLMSTEP: LLM Proofstep Suggestions in Lean NeurIPS 2023 MATH-AI Workshop [paper] [Lean]

    Welleck, Sean and Saha, Rahul

  34. Towards Large Language Models as Copilots for Theorem Proving in Lean NeurIPS 2023 MATH-AI Workshop [paper] [Lean]

    Song, Peiyang and Yang, Kaiyu and Anandkumar, Anima

  35. Temperature-Scaled Large Language Models for Lean Proofstep Prediction NeurIPS 2023 MATH-AI Workshop [paper] [Lean]

    Gloeckle, Fabian and Roziere, Baptiste and Hayat, Amaury and Synnaeve, Gabriel

  36. DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-Level Value Function ACL 2023 [paper] [Isabelle, Lean]

    Wang, Haiming and Yuan, Ye and Liu, Zhengying and Shen, Jianhao and Yin, Yichun and Xiong, Jing and Xie, Enze and Shi, Han and Li, Yujun and Li, Lin and Yin, Jian and Li, Zhenguo and Liang, Xiaodan

  37. Getting More out of Large Language Models for Proofs AITP 2023 [paper] [Coq]

    Zhang, Shizhuo Dylan and Ringer, Talia and First, Emily

  38. UniMath: A Foundational and Multimodal Mathematical Reasoner EMNLP 2023 [paper] [geometry]

    Liang, Zhenwen and Yang, Tianyu and Zhang, Jipeng and Zhang, Xiangliang

  39. TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models EMNLP 2023 [paper] [Lean]

    Xiong, Jing and Shen, Jianhao and Yuan, Ye and Wang, Haiming and Yin, Yichun and Liu, Zhengying and Li, Lin and Guo, Zhijiang and Cao, Qingxing and Huang, Yinya and Zheng, Chuanyang and Liang, Xiaodan and Zhang, Ming and Liu, Qun

  40. Learning Proof Transformations and Its Applications in Interactive Theorem Proving FroCoS 2023 [paper] [Coq]

    Zhang, Liao and Blaauwbroek, Lasse and Kaliszyk, Cezary and Urban, Josef

  41. Evaluating Language Models for Mathematics through Interactions arXiv 2023 [paper] [NL]

    Collins, Katherine M and Jiang, Albert Q and Frieder, Simon and Wong, Lionel and Zilka, Miri and Bhatt, Umang and Lukasiewicz, Thomas and Wu, Yuhuai and Tenenbaum, Joshua B and Hart, William and Gowers, Timothy and Li, Wenda and Weller, Adrian and Jamnik, Mateja

  42. Large Language Models for Mathematicians arXiv 2023 [paper] [NL]

    Scheidt, Gregor vom

  43. Experimental Results from Applying GPT-4 to An Unpublished Formal Language arXiv 2023 [paper] [Axiotome]

    Scheidt, Gregor vom

  44. Large Language Models' Understanding of Math: Source Criticism and Extrapolation arXiv 2023 [paper] [Lean]

    Yousefzadeh, Roozbeh and Cao, Xuenan

  45. Lyra: Orchestrating Dual Correction in Automated Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

    Zheng, Chuanyang and Wang, Haiming and Xie, Enze and Liu, Zhengying and Sun, Jiankai and Xin, Huajian and Shen, Jianhao and Li, Zhenguo and Li, Yu

  46. Enhancing Neural Theorem Proving through Data Augmentation and Dynamic Sampling Method arXiv 2023 [paper] [Lean]

    Vishwakarma, Rahul and Mishra, Subhankar

  47. An In-Context Learning Agent for Formal Theorem-Proving arXiv 2023 [paper] [Lean, Coq]

    Thakur, Amitayush and Tsoukalas, George and Wen, Yeming and Xin, Jimmy and Chaudhuri, Swarat

  48. LEGO-Prover: Neural Theorem Proving with Growing Libraries ICLR 2024 [paper] [NL, Isabelle]

    Wang, Haiming and Xin, Huajian and Zheng, Chuanyang and Li, Lin and Liu, Zhengying and Cao, Qingxing and Huang, Yinya and Xiong, Jing and Shi, Han and Xie, Enze and Yin, Jian and Li, Zhenguo and Liao, Heng and Liang, Xiaodan

  49. Llemma: An Open Language Model for Mathematics ICLR 2024 [paper] [NL, Isabelle, Lean]

    Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Santos, Marco Dos and McAleer, Stephen and Jiang, Albert Q and Deng, Jia and Biderman, Stella and Welleck, Sean

  50. Solving Proof Block Problems Using Large Language Models SIGCSE 2024 [paper] [NL]

    Poulsen, Seth and Sarsa, Sami and Prather, James and Leinonen, Juho and Becker, Brett A and Hellas, Arto and Denny, Paul and Reeves, Brent N

  51. Solving Olympiad Geometry without Human Demonstrations Nature 2024 [paper] [geometry]

    Trinh, Trieu H and Wu, Yuhuai and Le, Quoc V and He, He and Luong, Thang

  52. BAIT: Benchmarking (Embedding) Architectures for Interactive Theorem-Proving AAAI 2024 [paper]

    Lamont, Sean and Norrish, Michael and Dezfouli, Amir and Walder, Christian and Montague, Paul

  53. Graph2Tac: Learning Hierarchical Representations of Math Concepts in Theorem proving arXiv 2024 [paper] [Coq]

    Rute, Jason and Olšák, Miroslav and Blaauwbroek, Lasse and Massolo, Fidel Ivan Schaposnik and Piepenbrock, Jelle and Pestun, Vasily

  54. Selene: Pioneering Automated Proof in Software Verification ACL 2024 [paper] [Isabelle]

    Zhang, Lichen and Lu, Shuai and Duan, Nan

  55. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models arXiv 2024 [paper] [NL, Isabelle]

    Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, YK and Wu, Y and Guo, Daya

  56. InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning arXiv 2024 [paper] [NL, Lean]

    Ying, Huaiyuan and Zhang, Shuo and Li, Linyang and Zhou, Zhejian and Shao, Yunfan and Fei, Zhaoye and Ma, Yichuan and Hong, Jiawei and Liu, Kuikun and Wang, Ziyi and Wang, Yudong and Wu, Zijian and Li, Shuaibin and Zhou, Fengzhe and Liu, Hongwei and Zhang, Songyang and Zhang, Wenwei and Yan, Hang and Qiu, Xipeng and Wang, Jiayu and Chen, Kai and Lin, Dahua

  57. Verified Multi-Step Synthesis using Large Language Models and Monte Carlo Tree Search arXiv 2024 [paper] [Dafny, Coq, Lean]

    Brandfonbrener, David and Raja, Sibi and Prasad, Tarun and Loughridge, Chloe and Yang, Jianang and Henniger, Simon and Byrd, William E and Zinkov, Robert and Amin, Nada

  58. DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data arXiv 2024 [paper] [NL, Lean]

    Xin, Huajian and Guo, Daya and Shao, Zhihong and Ren, Zhizhou and Zhu, Qihao and Liu, Bo and Ruan, Chong and Li, Wenda and Liang, Xiaodan

  59. Proving Theorems Recursively arXiv 2024 [paper] [Isabelle]

    Wang, Haiming and Xin, Huajian and Liu, Zhengying and Li, Wenda and Huang, Yinya and Lu, Jianqiao and Yang, Zhicheng and Tang, Jing and Yin, Jian and Li, Zhenguo and Liang, Xiaodan

  60. FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving arXiv 2024 [paper] [Isabelle]

    Lin, Xiaohan and Cao, Qingxing and Huang, Yinya and Wang, Haiming and Lu, Jianqiao and Liu, Zhengying and Song, Linqi and Liang, Xiaodan

  61. Lean-STaR: Learning to Interleave Thinking and Proving arXiv 2024 [paper] [Isabelle]

    Lin, Haohan and Sun, Zhiqing and Yang, Yiming and Welleck, Sean

Proof Search

  1. Deep Network Guided Proof Search LPAR 2017 [paper] [E]

    Loos, Sarah and Irving, Geoffrey and Szegedy, Christian and Kaliszyk, Cezary

  2. Automated Theorem Proving in Intuitionistic Propositional Logic by Deep Reinforcement Learning arXiv 2018 [paper] [IPL]

    Kusumoto, Mitsuru and Yahata, Keisuke and Sakai, Masahiro

  3. ENIGMA-NG: Efficient Neural and Gradient-Boosted Inference Guidance for E CADE 2019 [paper] [ENIGMA]

    Chvalovský, Karel and Jakubův, Jan and Suda, Martin and Urban, Josef

  4. Hammering Mizar by Learning Clause Guidance ITP 2019 [paper] [ENIGMA]

    Jakubův, Jan and Urban, Josef

  5. Learning Dynamic Polynomial Proofs NeurIPS 2019 [paper] [polynomial inequality]

    Fawzi, Alhussein and Malinowski, Mateusz and Fawzi, Hamza and Fawzi, Omar

  6. A Neurally-Guided, Parallel Theorem Prover FroCoS 2019 [paper] [Z3]

    Rawson, Michael and Reger, Giles

  7. Property Invariant Embedding for Automated Reasoning ECAI 2019 [paper] [leanCoP]

    Olšák, Miroslav and Kaliszyk, Cezary and Urban, Josef

  8. Automated Theorem Proving via Interacting with Proof Assistants by Dynamic Strategies BigCom 2020 [paper] [Coq]

    Mo, Guangshuai and Xiong, Yan and Huang, Wenchao and Ma, Lu

  9. Guiding Inferences in Connection Tableau by Recurrent Neural Networks CICM 2020 [paper] [connection tableau]

    Piotrowski, Bartosz and Urban, Josef

  10. ENIGMA Anonymous: Symbol-Independent Inference Guiding Machine (System Description) IJCAR 2020 [paper] [ENIGMA]

    Jakubův, Jan and Chvalovský, Karel and Olšák, Miroslav and Piotrowski, Bartosz and Suda, Martin and Urban, Josef

  11. Deep Reinforcement Learning for Synthesizing Functions in Higher-Order Logic LPAR 2020 [paper] [HOL4]

    Gauthier, Thibault

  12. Generative Language Modeling for Automated Theorem Proving arXiv 2020 [paper] [Metamath]

    Polu, Stanislas and Sutskever, Ilya

  13. Learning to Prove from Synthetic Theorems arXiv 2020 [paper] [saturation]

    Aygün, Eser and Ahmed, Zafarali and Anand, Ankit and Firoiu, Vlad and Glorot, Xavier and Orseau, Laurent and Precup, Doina and Mourad, Shibl

  14. An Experimental Study of Formula Embeddings for Automated Theorem Proving in First-Order Logic arXiv 2020 [paper] [TRAIL]

    Abdelaziz, Ibrahim and Thost, Veronika and Crouse, Maxwell and Fokoue, Achille

  15. Training a First-Order Theorem Prover from Synthetic Data ICLR 2021 MATH-AI Workshop [paper] [saturation]

    Firoiu, Vlad and Aygün, Eser and Anand, Ankit and Ahmed, Zafarali and Glorot, Xavier and Orseau, Laurent and Zhang, Lei and Precup, Doina and Mourad, Shibl

  16. Learned Provability Likelihood for Tactical Search SCSS 2021 [paper] [HOL4]

    Gauthier, Thibault

  17. TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning NeurIPS 2021 [paper] [HOL4]

    Wu, Minchao and Norrish, Michael and Walder, Christian and Dezfouli, Amir

  18. Vampire with a Brain Is a Good ITP Hammer FroCoS 2021 [paper] [Vampire]

    Suda, Martin

  19. Fast and Slow Enigmas and Parental Guidance FroCoS 2021 [paper] [ENIGMA]

    Goertzel, Zarathustra A and Chvalovský, Karel and Jakubův, Jan and Olšák, Miroslav and Urban, Josef

  20. Towards Finding Longer Proofs TABLEAUX 2021 [paper] [leanCoP]

    Zombori, Zsolt and Csiszárik, Adrián and Michalewski, Henryk and Kaliszyk, Cezary and Urban, Josef

  21. lazyCoP: Lazy Paramodulation Meets Neurally Guided Search TABLEAUX 2021 [paper] [leanCoP]

    Rawson, Michael and Reger, Giles

  22. The Role of Entropy in Guiding a Connection Prover TABLEAUX 2021 [paper] [leanCoP]

    Zombori, Zsolt and Urban, Josef and Olšák, Miroslav

  23. Learning Theorem Proving Components TABLEAUX 2021 [paper] [ENIGMA]

    Chvalovský, Karel and Jakubův, Jan and Olšák, Miroslav and Urban, Josef

  24. A Deep Reinforcement Learning Approach to First-Order Logic Theorem Proving AAAI 2021 [paper] [TRAIL]

    Crouse, Maxwell and Abdelaziz, Ibrahim and Makni, Bassem and Whitehead, Spencer and Cornelio, Cristina and Kapanipathi, Pavan and Srinivas, Kavitha and Thost, Veronika and Witbrock, Michael and Fokoue, Achille

  25. Improving ENIGMA-Style Clause Selection While Learning From History CADE 2021 [paper] [Vampire]

    Suda, Martin

  26. Proving Theorems using Incremental Learning and Hindsight Experience Replay ICML 2022 [paper] [saturation]

    Aygün, Eser and Anand, Ankit and Orseau, Laurent and Glorot, Xavier and Mcaleer, Stephen M and Firoiu, Vlad and Zhang, Lei M and Precup, Doina and Mourad, Shibl

  27. HyperTree Proof Search for Neural Theorem Proving NeurIPS 2022 [paper] [Metamath, Lean]

    Lample, Guillaume and Lacroix, Timothee and Lachaux, Marie-Anne and Rodriguez, Aurelien and Hayat, Amaury and Lavril, Thibaut and Ebner, Gabriel and Martinet, Xavier

  28. Learning to Prove Trigonometric Identities arXiv 2022 [paper] [trigonometric identity]

    Liu, Zhou and Li, Yujun and Liu, Zhengying and Li, Lin and Li, Zhenguo

  29. Learning to Guide a Saturation-Based Theorem Prover TPAMI 2022 [paper] [TRAIL]

    Abdelaziz, Ibrahim and Crouse, Maxwell and Makni, Bassem and Austil, Vernon and Cornelio, Cristina and Ikbal, Shajith and Kapanipathi, Pavan and Makondo, Ndivhuwo and Srinivas, Kavitha and Witbrock, Michael and Fokoue, Achille

  30. Machine Learning Meets the Herbrand Universe arXiv 2022 [paper] [instantiation]

    Piepenbrock, Jelle and Urban, Josef and Korovin, Konstantin and Olšák, Miroslav and Heskes, Tom and Janota, Mikolaš

  31. Guiding An Automated Theorem Prover with Neural Rewriting IJCAR 2022 [paper] [Prover9]

    Piepenbrock, Jelle and Heskes, Tom and Janota, Mikoláš and Urban, Josef

  32. The Isabelle ENIGMA ITP 2022 [paper] [ENIGMA]

    Goertzel, Zarathustra A and Jakubův, Jan and Kaliszyk, Cezary and Olšák, Miroslav and Piepenbrock, Jelle and Urban, Josef

  33. Formal Mathematics Statement Curriculum Learning ICLR 2023 [paper] [Lean]

    Polu, Stanislas and Han, Jesse Michael and Zheng, Kunhao and Baksys, Mantas and Babuschkin, Igor and Sutskever, Ilya

  34. An Ensemble Approach for Automated Theorem Proving Based on Efficient Name Invariant Graph Neural Representations IJCAI 2023 [paper] [TRAIL]

    Fokoue, Achille and Abdelaziz, Ibrahim and Crouse, Maxwell and Ikbal, Shajith and Kishimoto, Akihiro and Lima, Guilherme and Makondo, Ndivhuwo and Marinescu, Radu

  35. Peano: Learning Formal Mathematical Reasoning Philosophical Transactions of the Royal Society A 2023 [paper] [Peano]

    Poesia, Gabriel and Goodman, Noah D

  36. MizAR 60 for Mizar 50 ITP 2023 [paper] [ENIGMA]

    Jakubův, Jan and Chvalovský, Karel and Goertzel, Zarathustra and Kaliszyk, Cezary and Olšák, Mirek and Piotrowski, Bartosz and Schulz, Stephan and Suda, Martin and Urban, Josef

  37. Reinforcement Learning for Guiding the E Theorem Prover FLAIRS 2023 [paper] [E]

    McKeown, Jack and Sutcliffe, Geoff

  38. Guiding an Instantiation Prover with Graph Neural Networks LPAR 2023 [paper] [iProver]

    Chvalovský, Karel and Korovin, Konstantin and Piepenbrock, Jelle and Urban, Josef

  39. How Much Should This Symbol Weigh? A GNN-Advised Clause Selection LPAR 2023 [paper] [Vampire]

    Bártěk, Filip and Suda, Martin

  40. gym-saturation: Gymnasium Environments for Saturation Provers (System Description) TABLEAUX 2023 [paper] [Vampire, iProver]

    Shminke, Boris

  41. DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-Level Value Function ACL 2023 [paper] [Isabelle, Lean]

    Wang, Haiming and Yuan, Ye and Liu, Zhengying and Shen, Jianhao and Yin, Yichun and Xiong, Jing and Xie, Enze and Shi, Han and Li, Yujun and Li, Lin and Yin, Jian and Li, Zhenguo and Liang, Xiaodan

  42. An In-Context Learning Agent for Formal Theorem-Proving arXiv 2023 [paper] [Lean, Coq]

    Thakur, Amitayush and Tsoukalas, George and Wen, Yeming and Xin, Jimmy and Chaudhuri, Swarat

  43. Verified Multi-Step Synthesis using Large Language Models and Monte Carlo Tree Search arXiv 2024 [paper] [Dafny, Coq, Lean]

    Brandfonbrener, David and Raja, Sibi and Prasad, Tarun and Loughridge, Chloe and Yang, Jianang and Henniger, Simon and Byrd, William E and Zinkov, Robert and Amin, Nada

  44. Learn from Failure: Fine-Tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving arXiv 2024 [paper] [Lean]

    An, Chenyang and Chen, Zhibo and Ye, Qihao and First, Emily and Peng, Letian and Zhang, Jiayun and Wang, Zihan and Lerner, Sorin and Shang, Jingbo

  45. Proving Olympiad Algebraic Inequalities without Human Demonstrations arXiv 2024 [paper] [Inequality]

    Wei, Chenrui and Sun, Mengzhou and Wang, Wei

Other Tasks

  1. Transformers as Soft Reasoners over Language IJCAI 2020 [paper] [NL]

    Clark, Peter and Tafjord, Oyvind and Richardson, Kyle

  2. PRover: Proof Generation for Interpretable Reasoning over Rules EMNLP 2020 [paper] [NL]

    Saha, Swarnadeep and Ghosh, Sayan and Srivastava, Shashank and Bansal, Mohit

  3. First Neural Conjecturing Datasets and Experiments CICM 2020 [paper] [Mizar]

    Urban, Josef and Jakubův, Jan

  4. Guiding Inferences in Connection Tableau by Recurrent Neural Networks CICM 2020 [paper] [connection tableau]

    Piotrowski, Bartosz and Urban, Josef

  5. Mathematical Reasoning in Latent Space ICLR 2020 [paper] [HOL Light]

    Lee, Dennis and Szegedy, Christian and Rabe, Markus N and Loos, Sarah M and Bansal, Kshitij

  6. Explaining Answers with Entailment Trees EMNLP 2021 [paper] [NL]

    Dalvi, Bhavana and Jansen, Peter and Tafjord, Oyvind and Xie, Zhengnan and Smith, Hannah and Pipatanangkura, Leighanna and Clark, Peter

  7. ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language ACL 2021 [paper] [NL]

    Tafjord, Oyvind and Mishra, Bhavana Dalvi and Clark, Peter

  8. Mathematical Reasoning via Self-supervised Skip-tree Training ICLR 2021 [paper] [HOL Light]

    Rabe, Markus N and Lee, Dennis and Bansal, Kshitij and Szegedy, Christian

  9. Latent Action Space for Efficient Planning in Theorem Proving AITP 2021 [paper] [inequality]

    Wu, Minchao and Wu, Yuhuai

  10. IsarStep: A Benchmark for High-Level Mathematical Reasoning ICLR 2021 [paper] [Isabelle]

    Li, Wenda and Yu, Lei and Wu, Yuhuai and Paulson, Lawrence C

  11. LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning ICML 2021 [paper] [Isabelle, HOL Light, Metamath, Lean]

    Wu, Yuhuai and Rabe, Markus N and Li, Wenda and Ba, Jimmy and Grosse, Roger B and Szegedy, Christian

  12. FaiRR: Faithful and Robust Deductive Reasoning over Natural Language ACL 2022 [paper] [NL]

    Sanyal, Soumya and Singh, Harman and Ren, Xiang

  13. Natural Language Deduction through Search over Statement Compositions EMNLP 2022 [paper] [NL]

    Bostrom, Kaj and Sprague, Zayne and Chaudhuri, Swarat and Durrett, Greg

  14. MetGen: A ModuleBased Entailment Tree Generation Framework for Answer Explanation NAACL 2022 [paper] [NL]

    Hong, Ruixin and Zhang, Hongming and Yu, Xintong and Zhang, Changshui

  15. Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement arXiv 2022 [paper] [NL]

    Mishra, Bhavana Dalvi and Tafjord, Oyvind and Clark, Peter

  16. Generating Natural Language Proofs with Verifier-Guided Search EMNLP 2022 [paper] [NL]

    Yang, Kaiyu and Deng, Jia and Chen, Danqi

  17. Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning EMNLP 2022 [paper] [NL]

    Tafjord, Oyvind and Mishra, Bhavana Dalvi and Clark, Peter

  18. Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  19. Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic ICML 2023 [paper] [NL]

    Morishita, Terufumi and Morio, Gaku and Yamaguchi, Atsuki and Sogawa, Yasuhiro

  20. Language Models are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought ICLR 2023 [paper] [NL]

    Saparov, Abulhair and He, He

  21. Exploring Mathematical Conjecturing with Large Language Models NeSy 2023 [paper] [Isabelle]

    Johansson, Moa and Smallbone, Nicholas

  22. BERT Is Not The Count: Learning to Match Mathematical Statements with Proofs EACL 2023 [paper] [NL]

    Li, Weixian Waylon and Ziser, Yftah and Coavoux, Maximin and Cohen, Shay B

  23. REFACTOR: Learning to Extract Theorems from Proofs ICLR 2024 [paper] [Metamath]

    Zhou, Jin Peng and Wu, Yuhuai and Li, Qiyang and Grosse, Roger

  24. Machine Learning and Information Theory Concepts Towards an AI Mathematician AMS 2024 [paper]

    Bengio, Yoshua and Nikolay Malkin

  25. Learning Formal Mathematics From Intrinsic Motivation arXiv 2024 [paper]

    Poesia, Gabriel and Broman, David and Haber, Nick and Goodman, Noah D

Datasets

Data Collection

  1. Premise Selection for Mathematics by Corpus Analysis and Kernel Methods Journal of Automated Reasoning 2014 [paper] [Mizar]

    Alama, Jesse and Heskes, Tom and Kühlwein, Daniel and Tsivtsivadze, Evgeni and Urban, Josef

  2. MizAR 40 for Mizar 40 Journal of Automated Reasoning 2015 [paper] [Mizar]

    Kaliszyk, Cezary and Urban, Josef

  3. The TPTP Problem Library and Associated Infrastructure Journal of Automated Reasoning 2017 [paper] [TPTP]

    Sutcliffe, Geoff

  4. HolStep: A Machine Learning Dataset for Higher-Order Logic Theorem Proving ICLR 2017 [paper] [HOL Light]

    Kaliszyk, Cezary and Chollet, François and Szegedy, Christian

  5. Reinforcement Learning of Theorem Proving NeurIPS 2018 [paper] [Mizar]

    Kaliszyk, Cezary and Urban, Josef and Michalewski, Henryk and Olšák, Miroslav

  6. GamePad: A Learning Environment for Theorem Proving ICLR 2019 [paper] [Coq]

    Huang, Daniel and Dhariwal, Prafulla and Song, Dawn and Sutskever, Ilya

  7. HOList: An Environment for Machine Learning of Higher-Order Logic Theorem Proving ICML 2019 [paper] [HOL Light]

    Kshitij Bansal and Sarah M. Loos and Markus Norman Rabe and Christian Szegedy and Stewart Wilcox

  8. Learning to Prove Theorems via Interacting with Proof Assistants ICML 2019 [paper] [Coq]

    Yang, Kaiyu and Deng, Jia

  9. TacticToe: Learning to Prove with Tactics Journal of Automated Reasoning 2020 [paper] [HOL4]

    Gauthier, Thibault and Kaliszyk, Cezary and Urban, Josef

  10. Generative Language Modeling for Automated Theorem Proving arXiv 2020 [paper] [pretraining, Metamath]

    Polu, Stanislas and Sutskever, Ilya

  11. The Tactician: A Seamless, Interactive Tactic Learner and Prover for Coq CICM 2020 [paper] [Coq]

    Blaauwbroek, Lasse and Urban, Josef and Geuvers, Herman

  12. Natural Language Premise Selection: Finding Supporting Statements for Mathematical Text LREC 2020 [paper] [NL]

    Ferreira, Deborah and Freitas, André

  13. IsarStep: A Benchmark for High-Level Mathematical Reasoning ICLR 2021 [paper] [Isabelle]

    Li, Wenda and Yu, Lei and Wu, Yuhuai and Paulson, Lawrence C

  14. NaturalProofs: Mathematical Theorem Proving in Natural Language NeurIPS 2021 [paper] [NL]

    Welleck, Sean and Liu, Jiacheng and Bras, Ronan Le and Hajishirzi, Hannaneh and Choi, Yejin and Cho, Kyunghyun

  15. LISA: Language Models of Isabelle Proofs AITP 2021 [paper] [Isabelle]

    Jiang, Albert Qiaochu and Li, Wenda and Han, Jesse Michael and Wu, Yuhuai

  16. Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

    Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas

  17. MiniF2F: A Cross-System Benchmark for Formal Olympiad-Level Mathematics ICLR 2022 [paper] [Metamath, Lean, Isabelle, HOL Light]

    Zheng, Kunhao and Han, Jesse Michael and Polu, Stanislas

  18. UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression EMNLP 2022 [paper] [geometry]

    Chen, Jiaqi and Li, Tong and Qin, Jinghui and Lu, Pan and Lin, Liang and Chen, Chongyu and Liang, Xiaodan

  19. NaturalProver: Grounded Mathematical Proof Generation with Language Models NeurIPS 2022 [paper] [NL]

    Welleck, Sean and Liu, Jiacheng and Lu, Ximing and Hajishirzi, Hannaneh and Choi, Yejin

  20. A Parallel Corpus of Natural Language and Isabelle Artefacts AITP 2022 [paper] [NL, Isabelle]

    Bordg, Anthony and Stathopoulos, Yiannos A and Paulson, Lawrence C

  21. Proof Repair Infrastructure for Supervised Models: Building a Large Proof Repair Dataset ITP 2023 [paper] [Coq]

    Reichel, Tom and Henderson, R and Touchet, Andrew and Gardner, Andrew and Ringer, Talia

  22. BERT Is Not The Count: Learning to Match Mathematical Statements with Proofs EACL 2023 [paper] [NL]

    Li, Weixian Waylon and Ziser, Yftah and Coavoux, Maximin and Cohen, Shay B

  23. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models NeurIPS 2023 [paper] [Lean]

    Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima

  24. MLFMF: Data Sets for Machine Learning for Mathematical Formalization NeurIPS 2023 [paper] [Agda, Lean]

    Bauer, Andrej and Petković, Matej and Todorovski, Ljupco

  25. Mathematical Capabilities of ChatGPT NeurIPS 2023 [paper] [NL]

    Simon Frieder and Luca Pinchetti and Alexis Chevalier and Ryan-Rhys Griffiths and Tommaso Salvatori and Thomas Lukasiewicz and Philipp Christian Petersen and Julius Berner

  26. TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models EMNLP 2023 [paper] [Lean]

    Xiong, Jing and Shen, Jianhao and Yuan, Ye and Wang, Haiming and Yin, Yichun and Liu, Zhengying and Li, Lin and Guo, Zhijiang and Cao, Qingxing and Huang, Yinya and Zheng, Chuanyang and Liang, Xiaodan and Zhang, Ming and Liu, Qun

  27. Generative AI for Math: Part I - MathPile: A Billion-Token-Scale Pretraining Corpus for Math arXiv 2023 [paper] [pretraining]

    Wang, Zengzhi and Xia, Rui and Liu, Pengfei

  28. ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics arXiv 2023 [paper] [pretraining, NL, Lean]

    Azerbayev, Zhangir and Piotrowski, Bartosz and Schoelkopf, Hailey and Ayers, Edward W and Radev, Dragomir and Avigad, Jeremy

  29. FIMO: A Challenge Formal Dataset for Automated Theorem Proving arXiv 2023 [paper] [NL, Lean]

    Liu, Chengwu and Shen, Jianhao and Xin, Huajian and Liu, Zhengying and Yuan, Ye and Wang, Haiming and Ju, Wei and Zheng, Chuanyang and Yin, Yichun and Li, Lin and Zhang, Ming Zhang and Liu, Qun

  30. Llemma: An Open Language Model for Mathematics ICLR 2024 [paper] [pretraining]

    Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Santos, Marco Dos and McAleer, Stephen and Jiang, Albert Q and Deng, Jia and Biderman, Stella and Welleck, Sean

  31. Magnushammer: A Transformer-Based Approach to Premise Selection ICLR 2024 [paper] [Isabelle]

    Mikuła, Maciej and Antoniak, Szymon and Tworkowski, Szymon and Jiang, Albert Qiaochu and Zhou, Jin Peng and Szegedy, Christian and Kuciński, Łukasz and Miłos, Piotr and Wu, Yuhuai

  32. OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text ICLR 2024 [paper] [pretraining]

    Paster, Keiran and Santos, Marco Dos and Azerbayev, Zhangir and Ba, Jimmy

  33. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models arXiv 2024 [paper] [pretraining]

    Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, YK and Wu, Y and Guo, Daya

  34. InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning arXiv 2024 [paper] [pretraining]

    Ying, Huaiyuan and Zhang, Shuo and Li, Linyang and Zhou, Zhejian and Shao, Yunfan and Fei, Zhaoye and Ma, Yichuan and Hong, Jiawei and Liu, Kuikun and Wang, Ziyi and Wang, Yudong and Wu, Zijian and Li, Shuaibin and Zhou, Fengzhe and Liu, Hongwei and Zhang, Songyang and Zhang, Wenwei and Yan, Hang and Qiu, Xipeng and Wang, Jiayu and Chen, Kai and Lin, Dahua

  35. Selene: Pioneering Automated Proof in Software Verification ACL 2024 [paper] [Isabelle]

    Zhang, Lichen and Lu, Shuai and Duan, Nan

  36. FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving arXiv 2024 [paper] [Isabelle]

    Lin, Xiaohan and Cao, Qingxing and Huang, Yinya and Wang, Haiming and Lu, Jianqiao and Liu, Zhengying and Song, Linqi and Liang, Xiaodan

  37. PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition arXiv 2024 [paper] [Isabelle, Lean]

    Tsoukalas, George and Lee, Jasper and Jennings, John and Xin, Jimmy and Ding, Michelle and Jennings, Michael and Thakur, Amitayush and Chaudhuri, Swarat

  38. miniCodeProps: a Minimal Benchmark for Proving Code Properties arXiv 2024 [paper] [Lean]

    Lohn, Evan and Welleck, Sean

Data Generation

  1. Learning to Reason in Large Theories without Imitation arXiv 2019 [paper] [HOL Light]

    Bansal, Kshitij and Szegedy, Christian and Rabe, Markus N and Loos, Sarah M and Toman, Viktor

  2. Learning to Prove Theorems by Learning to Generate Theorems NeurIPS 2020 [paper] [Metamath]

    Wang, Mingzhe and Deng, Jia

  3. INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving ICLR 2021 [paper] [inequality]

    Wu, Yuhuai and Jiang, Albert Qiaochu and Ba, Jimmy and Grosse, Roger

  4. Training a First-Order Theorem Prover from Synthetic Data ICLR 2021 MATH-AI Workshop [paper] [saturation]

    Firoiu, Vlad and Aygün, Eser and Anand, Ankit and Ahmed, Zafarali and Glorot, Xavier and Orseau, Laurent and Zhang, Lei and Precup, Doina and Mourad, Shibl

  5. LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning ICML 2021 [paper] [reasoning primitives]

    Wu, Yuhuai and Rabe, Markus N and Li, Wenda and Ba, Jimmy and Grosse, Roger B and Szegedy, Christian

  6. Proving Theorems using Incremental Learning and Hindsight Experience Replay ICML 2022 [paper] [saturation]

    Aygün, Eser and Anand, Ankit and Orseau, Laurent and Glorot, Xavier and Mcaleer, Stephen M and Firoiu, Vlad and Zhang, Lei M and Precup, Doina and Mourad, Shibl

  7. Synthetic Proof Term Data Augmentation for Theorem Proving with Language Models AITP 2022 [paper] [Lean]

    Palermo, Joseph and Ye, Johnny and Han, Jesse Michael

  8. Learning to Prove Trigonometric Identities arXiv 2022 [paper] [trigonometric identity]

    Liu, Zhou and Li, Yujun and Liu, Zhengying and Li, Lin and Li, Zhenguo

  9. Formal Mathematics Statement Curriculum Learning ICLR 2023 [paper] [Lean]

    Polu, Stanislas and Han, Jesse Michael and Zheng, Kunhao and Baksys, Mantas and Babuschkin, Igor and Sutskever, Ilya

  10. Multilingual Mathematical Autoformalization arXiv 2023 [paper] [NL, Isabelle, Lean]

    Jiang, Albert Q and Li, Wenda and Jamnik, Mateja

  11. Solving Olympiad Geometry without Human Demonstrations Nature 2024 [paper] [geometry]

    Trinh, Trieu H and Wu, Yuhuai and Le, Quoc V and He, He and Luong, Thang

  12. LEGO-Prover: Neural Theorem Proving with Growing Libraries ICLR 2024 [paper] [Isabelle]

    Wang, Haiming and Xin, Huajian and Zheng, Chuanyang and Li, Lin and Liu, Zhengying and Cao, Qingxing and Huang, Yinya and Xiong, Jing and Shi, Han and Xie, Enze and Yin, Jian and Li, Zhenguo and Liao, Heng and Liang, Xiaodan

  13. REFACTOR: Learning to Extract Theorems from Proofs ICLR 2024 [paper] [Metamath]

    Zhou, Jin Peng and Wu, Yuhuai and Li, Qiyang and Grosse, Roger

  14. MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data ICLR 2024 [paper] [NL, Lean]

    Huang, Yinya and Lin, Xiaohan and Liu, Zhengying and Cao, Qingxing and Xin, Huajian and Wang, Haiming and Li, Zhenguo and Song, Linqi and Liang, Xiaodan

  15. ATG: Benchmarking Automated Theorem Generation for Generative Language Models NAACL 2024 Findings [paper] [Metamath]

    Lin, Xiaohan and Cao, Qingxing and Huang, Yinya and Yang, Zhicheng and Liu, Zhengying and Li, Zhenguo and Liang, Xiaodan

  16. Learn from Failure: Fine-Tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving arXiv 2024 [paper] [Lean]

    An, Chenyang and Chen, Zhibo and Ye, Qihao and First, Emily and Peng, Letian and Zhang, Jiayun and Wang, Zihan and Lerner, Sorin and Shang, Jingbo

  17. Process-Driven Autoformalization in Lean 4 arXiv 2024 [paper] [NL, Lean]

    Lu, Jianqiao and Liu, Zhengying and Wan, Yingjia and Huang, Yinya and Wang, Haiming and Yang, Zhicheng and Tang, Jing and Guo, Zhijiang

  18. DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data arXiv 2024 [paper] [NL, Lean]

    Xin, Huajian and Guo, Daya and Shao, Zhihong and Ren, Zhizhou and Zhu, Qihao and Liu, Bo and Ruan, Chong and Li, Wenda and Liang, Xiaodan

  19. Lean Workbook: A Large-Scale Lean Problem Set Formalized from Natural Language Math Problems arXiv 2024 [paper] [NL, Lean]

    Ying, Huaiyuan and Wu, Zijian and Geng, Yihan and Wang, Jiayu and Lin, Dahua and Chen, Kai

  20. Proving Olympiad Algebraic Inequalities without Human Demonstrations arXiv 2024 [paper] [Inequality]

    Wei, Chenrui and Sun, Mengzhou and Wang, Wei

Related Surveys

  1. QED at Large: A Survey of Engineering of Formally Verified Software Foundations and Trends® in Programming Languages 2019 [paper]

    Ringer, Talia and Palmskog, Karl and Sergey, Ilya and Gligoric, Milos and Tatlock, Zachary

  2. A Survey of Deep Learning for Mathematical Reasoning ACL 2023 [paper]

    Lu, Pan and Qiu, Liang and Yu, Wenhao and Welleck, Sean and Chang, Kai-Wei

  3. Mathematical Language Models: A Survey arXiv 2023 [paper]

    Liu, Wentao and Hu, Hanglei and Zhou, Jie and Ding, Yuyang and Li, Junsong and Zeng, Jiayi and He, Mengliang and Chen, Qin and Jiang, Bo and Zhou, Aimin and He, Liang

  4. Large Language Models for Mathematical Reasoning: Progresses and Challenges EACL 2024 [paper]

    Ahn, Janice and Verma, Rishu and Lou, Renze and Liu, Di and Zhang, Rui and Yin, Wenpeng

  5. AI for Math Resources Google Sheet [link]

    Available to Everyone

Citation

If you find this repository useful, please consider citing our survey paper:

@inproceedings{li2024dl4tp,
   title={A Survey on Deep Learning for Theorem Proving}, 
   author={Li, Zhaoyu and Sun, Jialiang and Murphy, Logan and Su, Qidong and Li, Zenan and Zhang, Xian and Yang, Kaiyu and Si, Xujie},
   booktitle={First Conference on Language Modeling},
   year={2024},
   url={https://openreview.net/forum?id=zlw6AHwukB}
}