Deep Learning for Theorem Proving (DL4TP)

Welcome to our repository! This is a curated collection of resources related to deep learning for theorem proving.

We categorize papers primarily by how they apply deep learning models, organizing them into five task-specific categories and two dataset categories. A single paper may appear in multiple categories when it is relevant to several tasks or datasets. Each paper is also labeled with the theorem prover, proof calculus, or problem domain it uses, to help readers quickly find the resources that best match their interests or needs. For example, papers that use or generate theorems or proofs in natural language are labeled with [NL].

For more details, please refer to our survey paper: A Survey on Deep Learning for Theorem Proving.

Surveys
Tutorials
Tasks
- Autoformalization
- Premise Selection
- Proofstep Generation
- Proof Search
- Other Tasks
Datasets
- Data Collection
- Data Generation
Related Surveys
Citation

Surveys

Towards the Automatic Mathematician CADE 2021 [paper]

Rabe, Markus N and Szegedy, Christian
Learning Guided Automated Reasoning: A Brief Survey arXiv 2024 [paper]

Blaauwbroek, Lasse and Cerna, David and Gauthier, Thibault and Jakubův, Jan and Kaliszyk, Cezary and Suda, Martin and Urban, Josef
A Survey on Deep Learning for Theorem Proving arXiv 2024 [paper]

Li, Zhaoyu and Sun, Jialiang and Murphy, Logan and Su, Qidong and Li, Zenan and Zhang, Xian and Yang, Kaiyu and Si, Xujie

Tutorials

Tutorial on Machine Learning for Theorem Proving NeurIPS 2023 Tutorial [link] [Coq, Isabelle, Lean]

First, Emily and Jiang, Albert and Yang, Kaiyu

Tasks

Autoformalization

First Experiments with Neural Translation of Informal to Formal Mathematics CICM 2018 [paper] [NL, Mizar]

Wang, Qingxiang and Kaliszyk, Cezary and Urban, Josef
Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar CPP 2020 [paper] [NL, Mizar]

Wang, Qingxiang and Brown, Chad and Kaliszyk, Cezary and Urban, Josef
Learning Alignment between Formal & Informal Mathematics AITP 2020 [paper] [NL, HOL Light]

Bansal, Kshitij and Szegedy, Christian
A Promising Path Towards Autoformalization and General Artificial Intelligence CICM 2020 [paper]

Szegedy, Christian
Autoformalization with Large Language Models NeurIPS 2022 [paper] [NL, Isabelle]

Wu, Yuhuai and Jiang, Albert Qiaochu and Li, Wenda and Rabe, Markus and Staats, Charles and Jamnik, Mateja and Szegedy, Christian
Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs EMNLP 2022 MathNLP Workshop [paper] [NL, Coq]

Cunningham, Garett and Bunescu, Razvan C and Juedes, David
Towards a Mathematics Formalisation Assistant using Large Language Models arXiv 2022 [paper] [NL, Lean]

Agrawal, Ayush and Gadgil, Siddhartha and Goyal, Navin and Narayanan, Ashvni and Tadipatri, Anand
Towards Automating Formalisation of Theorem Statements using Large Language Models NeurIPS 2022 MATH-AI Workshop [paper] [NL, Lean]

Gadgil, Siddhartha and Tadipatri, Anand Rao and Agrawal, Ayush and Narayanan, Ashvni and Goyal, Navin
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs ICLR 2023 [paper] [NL, Isabelle]

Jiang, Albert Q and Welleck, Sean and Zhou, Jin Peng and Li, Wenda and Liu, Jiacheng and Jamnik, Mateja and Lacroix, Timothée and Wu, Yuhuai and Lample, Guillaume
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning EMNLP 2023 Findings [paper] [NL, Prover9, Z3]

Pan, Liangming and Albalak, Alon and Wang, Xinyi and Wang, William Yang
LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers EMNLP 2023 [paper] [NL, Prover9]

Olausson, Theo X and Gu, Alex and Lipkin, Benjamin and Zhang, Cedegao E and Solar-Lezama, Armando and Tenenbaum, Joshua B and Levy, Roger
SATLM: Satisfiability-Aided Language Models Using Declarative Prompting NeurIPS 2023 [paper] [NL, Z3]

Ye, Xi and Chen, Qiaochu and Dillig, Isil and Durrett, Greg
FIMO: A Challenge Formal Dataset for Automated Theorem Proving arXiv 2023 [paper] [NL, Lean]

Liu, Chengwu and Shen, Jianhao and Xin, Huajian and Liu, Zhengying and Yuan, Ye and Wang, Haiming and Ju, Wei and Zheng, Chuanyang and Yin, Yichun and Li, Lin and Zhang, Ming and Liu, Qun
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

Zhao, Xueliang and Li, Wenda and Kong, Lingpeng
ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics arXiv 2023 [paper] [NL, Lean]

Azerbayev, Zhangir and Piotrowski, Bartosz and Schoelkopf, Hailey and Ayers, Edward W and Radev, Dragomir and Avigad, Jeremy
Multilingual Mathematical Autoformalization arXiv 2023 [paper] [NL, Isabelle, Lean]

Jiang, Albert Q and Li, Wenda and Jamnik, Mateja
Lyra: Orchestrating Dual Correction in Automated Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

Zheng, Chuanyang and Wang, Haiming and Xie, Enze and Liu, Zhengying and Sun, Jiankai and Xin, Huajian and Shen, Jianhao and Li, Zhenguo and Li, Yu
A New Approach Towards Autoformalization arXiv 2023 [paper] [NL, Lean]

Patel, Nilay and Flanigan, Jeffrey and Saha, Rahul
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data ICLR 2024 [paper] [NL, Lean]

Huang, Yinya and Lin, Xiaohan and Liu, Zhengying and Cao, Qingxing and Xin, Huajian and Wang, Haiming and Li, Zhenguo and Song, Linqi and Liang, Xiaodan
Don't Trust: Verify - Grounding LLM Quantitative Reasoning with Autoformalization ICLR 2024 [paper] [NL, Isabelle]

Zhou, Jin Peng and Staats, Charles E and Li, Wenda and Szegedy, Christian and Weinberger, Kilian Q and Wu, Yuhuai
LEGO-Prover: Neural Theorem Proving with Growing Libraries ICLR 2024 [paper] [NL, Isabelle]

Wang, Haiming and Xin, Huajian and Zheng, Chuanyang and Li, Lin and Liu, Zhengying and Cao, Qingxing and Huang, Yinya and Xiong, Jing and Shi, Han and Xie, Enze and Yin, Jian and Li, Zhenguo and Liao, Heng and Liang, Xiaodan
Llemma: An Open Language Model for Mathematics ICLR 2024 [paper] [NL, Isabelle, Lean]

Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Santos, Marco Dos and McAleer, Stephen and Jiang, Albert Q and Deng, Jia and Biderman, Stella and Welleck, Sean
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models arXiv 2024 [paper] [NL, Isabelle]

Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, YK and Wu, Y and Guo, Daya
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning arXiv 2024 [paper] [NL, Lean]

Ying, Huaiyuan and Zhang, Shuo and Li, Linyang and Zhou, Zhejian and Shao, Yunfan and Fei, Zhaoye and Ma, Yichuan and Hong, Jiawei and Liu, Kuikun and Wang, Ziyi and Wang, Yudong and Wu, Zijian and Li, Shuaibin and Zhou, Fengzhe and Liu, Hongwei and Zhang, Songyang and Zhang, Wenwei and Yan, Hang and Qiu, Xipeng and Wang, Jiayu and Chen, Kai and Lin, Dahua

Premise Selection

DeepMath - Deep Sequence Models for Premise Selection NeurIPS 2016 [paper] [Mizar]

Irving, Geoffrey and Szegedy, Christian and Alemi, Alexander A and Eén, Niklas and Chollet, François and Urban, Josef
HolStep: A Machine Learning Dataset for Higher-Order Logic Theorem Proving ICLR 2017 [paper] [HOL Light]

Kaliszyk, Cezary and Chollet, François and Szegedy, Christian
Tree-structure CNN for Automated Theorem Proving ICONIP 2017 [paper] [HOL Light]

Peng, Kebin and Ma, Dianfu
Premise Selection for Theorem Proving by Deep Graph Embedding NeurIPS 2017 [paper] [HOL Light]

Wang, Mingzhe and Tang, Yihe and Wang, Jian and Deng, Jia
Premise Selection with Neural Networks and Distributed Representation of Features arXiv 2018 [paper] [Mizar]

Kucik, Andrzej Stanisław and Korovin, Konstantin
HOList: An Environment for Machine Learning of Higher-Order Logic Theorem Proving ICML 2019 [paper] [HOL Light]

Bansal, Kshitij and Loos, Sarah M. and Rabe, Markus Norman and Szegedy, Christian and Wilcox, Stewart
Learning Representations of Logical Formulae using Graph Neural Networks NeurIPS 2019 GRL Workshop [paper] [Mizar]

Glorot, Xavier and Anand, Ankit and Aygun, Eser and Mourad, Shibl and Kohli, Pushmeet and Precup, Doina
Property Invariant Embedding for Automated Reasoning ECAI 2019 [paper] [Mizar]

Olšák, Miroslav and Kaliszyk, Cezary and Urban, Josef
Usefulness of Lemmas via Graph Neural Networks AITP 2019 [paper] [Mizar]

Goertzel, Zarathustra and Urban, Josef
Improving Graph Neural Network Representations of Logical Formulae with Subgraph Pooling arXiv 2019 [paper] [Mizar, HOL Light]

Crouse, Maxwell and Abdelaziz, Ibrahim and Cornelio, Cristina and Thost, Veronika and Wu, Lingfei and Forbus, Kenneth and Fokoue, Achille
Directed Graph Networks for Logical Reasoning (Extended Abstract) PAAR 2020 [paper] [Mizar]

Rawson, Michael and Reger, Giles
Stateful Premise Selection by Recurrent Neural Networks LPAR 2020 [paper] [Mizar]

Piotrowski, Bartosz and Urban, Josef
Premise Selection in Natural Language Mathematical Texts ACL 2020 [paper] [NL]

Ferreira, Deborah and Freitas, André
Natural Language Premise Selection: Finding Supporting Statements for Mathematical Text LREC 2020 [paper] [NL]

Ferreira, Deborah and Freitas, André
TextGraphs 2022 Shared Task on Natural Language Premise Selection TextGraphs 2022 [paper] [NL]

Valentino, Marco and Ferreira, Deborah and Thayaparan, Mokanarangan and Freitas, André and Ustalov, Dmitry
IJS at TextGraphs-16 Natural Language Premise Selection Task: Will Contextual Information Improve Natural Language Premise Selection? TextGraphs 2022 [paper] [NL]

Tran, Thi Hong Hanh and Martinc, Matej and Doucet, Antoine and Pollak, Senja
UNLPS at TextGraphs-16 Natural Language Premise Selection Task: Unsupervised Natural Language Premise Selection in Mathematical Text using Sentence-MPNet TextGraphs 2022 [paper] [NL]

Trust, Paul and Kadusabe, Provia and Younis, Haseeb and Minghim, Rosane and Milios, Evangelos and Zahran, Ahmed
Keyword-based Natural Language Premise Selection for an Automatic Mathematical Statement Proving TextGraphs 2022 [paper] [NL]

Dastgheib, Doratossadat and Asgari, Ehsaneddin
TextGraphs-16 Natural Language Premise Selection Task: Zero-Shot Premise Selection with Prompting Generative Language Models TextGraphs 2022 [paper] [NL]

Kovriguina, Liubov and Teucher, Roman and Wardenga, Robert
Attention Recurrent Cross-Graph Neural Network for Selecting Premises IJMLC 2021 [paper] [Mizar]

Liu, Qinghua and Xu, Yang and He, Xingxing
Graph Representations for Higher-Order Logic and Theorem Proving AAAI 2020 [paper] [HOL Light]

Paliwal, Aditya and Loos, Sarah and Rabe, Markus and Bansal, Kshitij and Szegedy, Christian
Improving Stateful Premise Selection with Transformers CICM 2021 [paper] [Mizar]

Proroković, Krsto and Wand, Michael and Schmidhuber, Jürgen
Contrastive Graph Representations for Logical Formulas Embedding TKDE 2021 [paper] [Mizar]

Lin, Qika and Liu, Jun and Zhang, Lingling and Pan, Yudai and Hu, Xin and Xu, Fangzhi and Zeng, Hongwei
Graph Contrastive Pre-training for Effective Theorem Reasoning ICML 2021 SSL Workshop [paper] [Coq]

Li, Zhaoyu and Chen, Binghong and Si, Xujie
NaturalProofs: Mathematical Theorem Proving in Natural Language NeurIPS 2021 [paper] [NL]

Welleck, Sean and Liu, Jiacheng and Bras, Ronan Le and Hajishirzi, Hannaneh and Choi, Yejin and Cho, Kyunghyun
Contrastive Finetuning of Generative Language Models for Informal Premise Selection AITP 2021 [paper] [NL]

Han, Jesse Michael and Xu, Tao and Polu, Stanislas and Neelakantan, Arvind and Radford, Alec
STAR: Cross-modal [STA]tement [R]epresentation for Selecting Relevant Mathematical Premises EACL 2021 [paper] [NL]

Ferreira, Deborah and Freitas, André
A Study of Continuous Vector Representations for Theorem Proving Journal of Logic and Computation 2021 [paper] [Mizar]

Purgał, Stanisław and Parsert, Julian and Kaliszyk, Cezary
Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas
The Isabelle ENIGMA ITP 2022 [paper] [Isabelle]

Goertzel, Zarathustra A and Jakubův, Jan and Kaliszyk, Cezary and Olšák, Miroslav and Piepenbrock, Jelle and Urban, Josef
Formal Premise Selection with Language Models AITP 2022 [paper] [Isabelle]

Tworkowski, Szymon and Mikuła, Maciej and Odrzygóżdż, Tomasz and Czechowski, Konrad and Antoniak, Szymon and Jiang, Albert and Szegedy, Christian and Kuciński, Łukasz and Miłoś, Piotr and Wu, Yuhuai
MizAR 60 for Mizar 50 ITP 2023 [paper] [Mizar]

Jakubův, Jan and Chvalovský, Karel and Goertzel, Zarathustra and Kaliszyk, Cezary and Olšák, Mirek and Piotrowski, Bartosz and Schulz, Stephan and Suda, Martin and Urban, Josef
Graph Sequence Learning for Premise Selection AITP 2023 [paper] [Mizar]

Holden, Edvard K and Korovin, Konstantin
CoProver: A Recommender System for Proof Construction CICM 2023 [paper] [PVS]

Yeh, Eric and Hitaj, Briland and Owre, Sam and Quemener, Maena and Shankar, Natarajan
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models NeurIPS 2023 [paper] [Lean]

Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima
MLFMF: Data Sets for Machine Learning for Mathematical Formalization NeurIPS 2023 [paper] [Agda, Lean]

Bauer, Andrej and Petković, Matej and Todorovski, Ljupco
Magnushammer: A Transformer-Based Approach to Premise Selection ICLR 2024 [paper] [Isabelle]

Mikuła, Maciej and Antoniak, Szymon and Tworkowski, Szymon and Jiang, Albert Qiaochu and Zhou, Jin Peng and Szegedy, Christian and Kuciński, Łukasz and Miłoś, Piotr and Wu, Yuhuai

Proofstep Generation

Holophrasm: A Neural Automated Theorem Prover for Higher-Order Logic arXiv 2016 [paper] [Metamath]

Whalen, Daniel
GamePad: A Learning Environment for Theorem Proving ICLR 2019 [paper] [Coq]

Huang, Daniel and Dhariwal, Prafulla and Song, Dawn and Sutskever, Ilya
HOList: An Environment for Machine Learning of Higher-Order Logic Theorem Proving ICML 2019 [paper] [HOL Light]

Bansal, Kshitij and Loos, Sarah M. and Rabe, Markus Norman and Szegedy, Christian and Wilcox, Stewart
Learning to Prove Theorems via Interacting with Proof Assistants ICML 2019 [paper] [Coq]

Yang, Kaiyu and Deng, Jia
Graph Representations for Higher-Order Logic and Theorem Proving AAAI 2020 [paper] [HOL Light]

Paliwal, Aditya and Loos, Sarah and Rabe, Markus and Bansal, Kshitij and Szegedy, Christian
Learning to Prove Theorems by Learning to Generate Theorems NeurIPS 2020 [paper] [Metamath]

Wang, Mingzhe and Deng, Jia
Generating Correctness Proofs with Neural Networks MAPL 2020 [paper] [Coq]

Sanchez-Stern, Alex and Alhessi, Yousef and Saul, Lawrence and Lerner, Sorin
TacTok: Semantics-Aware Proof Synthesis OOPSLA 2020 [paper] [Coq]

First, Emily and Brun, Yuriy and Guha, Arjun
Automated Theorem Proving via Interacting with Proof Assistants by Dynamic Strategies BigCom 2020 [paper] [Coq]

Mo, Guangshuai and Xiong, Yan and Huang, Wenchao and Ma, Lu
Generative Language Modeling for Automated Theorem Proving arXiv 2020 [paper] [Metamath]

Polu, Stanislas and Sutskever, Ilya
INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving ICLR 2021 [paper] [inequality]

Wu, Yuhuai and Jiang, Albert Qiaochu and Ba, Jimmy and Grosse, Roger
TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning NeurIPS 2021 [paper] [HOL4]

Wu, Minchao and Norrish, Michael and Walder, Christian and Dezfouli, Amir
LISA: Language Models of Isabelle Proofs AITP 2021 [paper] [Isabelle]

Jiang, Albert Qiaochu and Li, Wenda and Han, Jesse Michael and Wu, Yuhuai
Retrieval-Augmented Proof Step Synthesis AITP 2021 [paper] [HOL Light]

Szegedy, Christian and Rabe, Markus and Michalewski, Henryk
Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression EMNLP 2022 [paper] [geometry]

Chen, Jiaqi and Li, Tong and Qin, Jinghui and Lu, Pan and Lin, Liang and Chen, Chongyu and Liang, Xiaodan
Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs EMNLP 2022 MathNLP Workshop [paper] [NL, Coq]

Cunningham, Garett and Bunescu, Razvan C and Juedes, David
HyperTree Proof Search for Neural Theorem Proving NeurIPS 2022 [paper] [Metamath, Lean]

Lample, Guillaume and Lacroix, Timothée and Lachaux, Marie-Anne and Rodriguez, Aurelien and Hayat, Amaury and Lavril, Thibaut and Ebner, Gabriel and Martinet, Xavier
NaturalProver: Grounded Mathematical Proof Generation with Language Models NeurIPS 2022 [paper] [NL]

Welleck, Sean and Liu, Jiacheng and Lu, Ximing and Hajishirzi, Hannaneh and Choi, Yejin
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers NeurIPS 2022 [paper] [Isabelle]

Jiang, Albert Qiaochu and Li, Wenda and Tworkowski, Szymon and Czechowski, Konrad and Odrzygóżdż, Tomasz and Miłoś, Piotr and Wu, Yuhuai and Jamnik, Mateja
Diversity-Driven Automated Formal Verification ICSE 2022 [paper] [Coq]

First, Emily and Brun, Yuriy
Formal Premise Selection with Language Models AITP 2022 [paper] [Isabelle]

Tworkowski, Szymon and Mikuła, Maciej and Odrzygóżdż, Tomasz and Czechowski, Konrad and Antoniak, Szymon and Jiang, Albert and Szegedy, Christian and Kuciński, Łukasz and Miłoś, Piotr and Wu, Yuhuai
Passport: Improving Automated Formal Verification Using Identifiers TOPLAS 2023 [paper] [Coq]

Sanchez-Stern, Alex and First, Emily and Zhou, Timothy and Kaufman, Zhanna and Brun, Yuriy and Ringer, Talia
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs ICLR 2023 [paper] [NL, Isabelle]

Jiang, Albert Q and Welleck, Sean and Zhou, Jin Peng and Li, Wenda and Liu, Jiacheng and Jamnik, Mateja and Lacroix, Timothée and Wu, Yuhuai and Lample, Guillaume
Formal Mathematics Statement Curriculum Learning ICLR 2023 [paper] [Lean]

Polu, Stanislas and Han, Jesse Michael and Zheng, Kunhao and Baksys, Mantas and Babuschkin, Igor and Sutskever, Ilya
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

Zhao, Xueliang and Li, Wenda and Kong, Lingpeng
CoProver: A Recommender System for Proof Construction CICM 2023 [paper] [PVS]

Yeh, Eric and Hitaj, Briland and Owre, Sam and Quemener, Maena and Shankar, Natarajan
Baldur: Whole-Proof Generation and Repair with Large Language Models ESEC/FSE 2023 [paper] [Isabelle]

First, Emily and Rabe, Markus and Ringer, Talia and Brun, Yuriy
Peano: Learning Formal Mathematical Reasoning Philosophical Transactions of the Royal Society A 2023 [paper] [Peano]

Poesia, Gabriel and Goodman, Noah D
Mathematical Capabilities of ChatGPT NeurIPS 2023 [paper] [NL]

Frieder, Simon and Pinchetti, Luca and Chevalier, Alexis and Griffiths, Ryan-Rhys and Salvatori, Tommaso and Lukasiewicz, Thomas and Petersen, Philipp Christian and Berner, Julius
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models NeurIPS 2023 [paper] [Lean]

Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima
LLM vs ITP NeurIPS 2023 MATH-AI Workshop [paper] [NL]

Frieder, Simon and Trimmel, Martin and Alawadhi, Rashid and Gy, Klaus
LLMSTEP: LLM Proofstep Suggestions in Lean NeurIPS 2023 MATH-AI Workshop [paper] [Lean]

Welleck, Sean and Saha, Rahul
Towards Large Language Models as Copilots for Theorem Proving in Lean NeurIPS 2023 MATH-AI Workshop [paper] [Lean]

Song, Peiyang and Yang, Kaiyu and Anandkumar, Anima
Temperature-Scaled Large Language Models for Lean Proofstep Prediction NeurIPS 2023 MATH-AI Workshop [paper] [Lean]

Gloeckle, Fabian and Roziere, Baptiste and Hayat, Amaury and Synnaeve, Gabriel
DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-Level Value Function ACL 2023 [paper] [Isabelle, Lean]

Wang, Haiming and Yuan, Ye and Liu, Zhengying and Shen, Jianhao and Yin, Yichun and Xiong, Jing and Xie, Enze and Shi, Han and Li, Yujun and Li, Lin and Yin, Jian and Li, Zhenguo and Liang, Xiaodan
Getting More out of Large Language Models for Proofs AITP 2023 [paper] [Coq]

Zhang, Shizhuo Dylan and Ringer, Talia and First, Emily
UniMath: A Foundational and Multimodal Mathematical Reasoner EMNLP 2023 [paper] [geometry]

Liang, Zhenwen and Yang, Tianyu and Zhang, Jipeng and Zhang, Xiangliang
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models EMNLP 2023 [paper] [Lean]

Xiong, Jing and Shen, Jianhao and Yuan, Ye and Wang, Haiming and Yin, Yichun and Liu, Zhengying and Li, Lin and Guo, Zhijiang and Cao, Qingxing and Huang, Yinya and Zheng, Chuanyang and Liang, Xiaodan and Zhang, Ming and Liu, Qun
Learning Proof Transformations and Its Applications in Interactive Theorem Proving FroCoS 2023 [paper] [Coq]

Zhang, Liao and Blaauwbroek, Lasse and Kaliszyk, Cezary and Urban, Josef
Evaluating Language Models for Mathematics through Interactions arXiv 2023 [paper] [NL]

Collins, Katherine M and Jiang, Albert Q and Frieder, Simon and Wong, Lionel and Zilka, Miri and Bhatt, Umang and Lukasiewicz, Thomas and Wu, Yuhuai and Tenenbaum, Joshua B and Hart, William and Gowers, Timothy and Li, Wenda and Weller, Adrian and Jamnik, Mateja
Large Language Models for Mathematicians arXiv 2023 [paper] [NL]

Frieder, Simon and Berner, Julius and Petersen, Philipp and Lukasiewicz, Thomas
Experimental Results from Applying GPT-4 to An Unpublished Formal Language arXiv 2023 [paper] [Axiotome]

Scheidt, Gregor vom
Large Language Models' Understanding of Math: Source Criticism and Extrapolation arXiv 2023 [paper] [Lean]

Yousefzadeh, Roozbeh and Cao, Xuenan
Lyra: Orchestrating Dual Correction in Automated Theorem Proving arXiv 2023 [paper] [NL, Isabelle]

Zheng, Chuanyang and Wang, Haiming and Xie, Enze and Liu, Zhengying and Sun, Jiankai and Xin, Huajian and Shen, Jianhao and Li, Zhenguo and Li, Yu
Enhancing Neural Theorem Proving through Data Augmentation and Dynamic Sampling Method arXiv 2023 [paper] [Lean]

Vishwakarma, Rahul and Mishra, Subhankar
An In-Context Learning Agent for Formal Theorem-Proving arXiv 2023 [paper] [Lean, Coq]

Thakur, Amitayush and Tsoukalas, George and Wen, Yeming and Xin, Jimmy and Chaudhuri, Swarat
LEGO-Prover: Neural Theorem Proving with Growing Libraries ICLR 2024 [paper] [NL, Isabelle]

Wang, Haiming and Xin, Huajian and Zheng, Chuanyang and Li, Lin and Liu, Zhengying and Cao, Qingxing and Huang, Yinya and Xiong, Jing and Shi, Han and Xie, Enze and Yin, Jian and Li, Zhenguo and Liao, Heng and Liang, Xiaodan
Llemma: An Open Language Model for Mathematics ICLR 2024 [paper] [NL, Isabelle, Lean]

Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Santos, Marco Dos and McAleer, Stephen and Jiang, Albert Q and Deng, Jia and Biderman, Stella and Welleck, Sean
Solving Proof Block Problems Using Large Language Models SIGCSE 2024 [paper] [NL]

Poulsen, Seth and Sarsa, Sami and Prather, James and Leinonen, Juho and Becker, Brett A and Hellas, Arto and Denny, Paul and Reeves, Brent N
Solving Olympiad Geometry without Human Demonstrations Nature 2024 [paper] [geometry]

Trinh, Trieu H and Wu, Yuhuai and Le, Quoc V and He, He and Luong, Thang
Graph2Tac: Learning Hierarchical Representations of Math Concepts in Theorem proving arXiv 2024 [paper] [Coq]

Rute, Jason and Olšák, Miroslav and Blaauwbroek, Lasse and Massolo, Fidel Ivan Schaposnik and Piepenbrock, Jelle and Pestun, Vasily
FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems arXiv 2024 [paper] [geometry]

He, Yiming and Zou, Jia and Zhang, Xiaokai and Zhu, Na and Leng, Tuo
Selene: Pioneering Automated Proof in Software Verification arXiv 2024 [paper] [Isabelle]

Zhang, Lichen and Lu, Shuai and Duan, Nan
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models arXiv 2024 [paper] [NL, Isabelle]

Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, YK and Wu, Y and Guo, Daya
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning arXiv 2024 [paper] [NL, Lean]

Ying, Huaiyuan and Zhang, Shuo and Li, Linyang and Zhou, Zhejian and Shao, Yunfan and Fei, Zhaoye and Ma, Yichuan and Hong, Jiawei and Liu, Kuikun and Wang, Ziyi and Wang, Yudong and Wu, Zijian and Li, Shuaibin and Zhou, Fengzhe and Liu, Hongwei and Zhang, Songyang and Zhang, Wenwei and Yan, Hang and Qiu, Xipeng and Wang, Jiayu and Chen, Kai and Lin, Dahua
Verified Multi-Step Synthesis using Large Language Models and Monte Carlo Tree Search arXiv 2024 [paper] [Dafny, Coq, Lean]

Brandfonbrener, David and Raja, Sibi and Prasad, Tarun and Loughridge, Chloe and Yang, Jianang and Henniger, Simon and Byrd, William E and Zinkov, Robert and Amin, Nada

Proof Search

Deep Network Guided Proof Search LPAR 2017 [paper] [E]

Loos, Sarah and Irving, Geoffrey and Szegedy, Christian and Kaliszyk, Cezary
Automated Theorem Proving in Intuitionistic Propositional Logic by Deep Reinforcement Learning arXiv 2018 [paper] [IPL]

Kusumoto, Mitsuru and Yahata, Keisuke and Sakai, Masahiro
ENIGMA-NG: Efficient Neural and Gradient-Boosted Inference Guidance for E CADE 2019 [paper] [ENIGMA]

Chvalovský, Karel and Jakubův, Jan and Suda, Martin and Urban, Josef
Hammering Mizar by Learning Clause Guidance ITP 2019 [paper] [ENIGMA]

Jakubův, Jan and Urban, Josef
Learning Dynamic Polynomial Proofs NeurIPS 2019 [paper] [polynomial inequality]

Fawzi, Alhussein and Malinowski, Mateusz and Fawzi, Hamza and Fawzi, Omar
A Neurally-Guided, Parallel Theorem Prover FroCoS 2019 [paper] [Z3]

Rawson, Michael and Reger, Giles
Property Invariant Embedding for Automated Reasoning ECAI 2019 [paper] [leanCoP]

Olšák, Miroslav and Kaliszyk, Cezary and Urban, Josef
Automated Theorem Proving via Interacting with Proof Assistants by Dynamic Strategies BigCom 2020 [paper] [Coq]

Mo, Guangshuai and Xiong, Yan and Huang, Wenchao and Ma, Lu
Guiding Inferences in Connection Tableau by Recurrent Neural Networks CICM 2020 [paper] [connection tableau]

Piotrowski, Bartosz and Urban, Josef
ENIGMA Anonymous: Symbol-Independent Inference Guiding Machine (System Description) IJCAR 2020 [paper] [ENIGMA]

Jakubův, Jan and Chvalovský, Karel and Olšák, Miroslav and Piotrowski, Bartosz and Suda, Martin and Urban, Josef
Deep Reinforcement Learning for Synthesizing Functions in Higher-Order Logic LPAR 2020 [paper] [HOL4]

Gauthier, Thibault
Generative Language Modeling for Automated Theorem Proving arXiv 2020 [paper] [Metamath]

Polu, Stanislas and Sutskever, Ilya
Learning to Prove from Synthetic Theorems arXiv 2020 [paper] [saturation]

Aygün, Eser and Ahmed, Zafarali and Anand, Ankit and Firoiu, Vlad and Glorot, Xavier and Orseau, Laurent and Precup, Doina and Mourad, Shibl
An Experimental Study of Formula Embeddings for Automated Theorem Proving in First-Order Logic arXiv 2020 [paper] [TRAIL]

Abdelaziz, Ibrahim and Thost, Veronika and Crouse, Maxwell and Fokoue, Achille
Training a First-Order Theorem Prover from Synthetic Data ICLR 2021 MATH-AI Workshop [paper] [saturation]

Firoiu, Vlad and Aygün, Eser and Anand, Ankit and Ahmed, Zafarali and Glorot, Xavier and Orseau, Laurent and Zhang, Lei and Precup, Doina and Mourad, Shibl
Learned Provability Likelihood for Tactical Search SCSS 2021 [paper] [HOL4]

Gauthier, Thibault
TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning NeurIPS 2021 [paper] [HOL4]

Wu, Minchao and Norrish, Michael and Walder, Christian and Dezfouli, Amir
Vampire with a Brain Is a Good ITP Hammer FroCoS 2021 [paper] [Vampire]

Suda, Martin
Fast and Slow Enigmas and Parental Guidance FroCoS 2021 [paper] [ENIGMA]

Goertzel, Zarathustra A and Chvalovský, Karel and Jakubův, Jan and Olšák, Miroslav and Urban, Josef
Towards Finding Longer Proofs TABLEAUX 2021 [paper] [leanCoP]

Zombori, Zsolt and Csiszárik, Adrián and Michalewski, Henryk and Kaliszyk, Cezary and Urban, Josef
lazyCoP: Lazy Paramodulation Meets Neurally Guided Search TABLEAUX 2021 [paper] [leanCoP]

Rawson, Michael and Reger, Giles
The Role of Entropy in Guiding a Connection Prover TABLEAUX 2021 [paper] [leanCoP]

Zombori, Zsolt and Urban, Josef and Olšák, Miroslav
Learning Theorem Proving Components TABLEAUX 2021 [paper] [ENIGMA]

Chvalovský, Karel and Jakubův, Jan and Olšák, Miroslav and Urban, Josef
A Deep Reinforcement Learning Approach to First-Order Logic Theorem Proving AAAI 2021 [paper] [TRAIL]

Crouse, Maxwell and Abdelaziz, Ibrahim and Makni, Bassem and Whitehead, Spencer and Cornelio, Cristina and Kapanipathi, Pavan and Srinivas, Kavitha and Thost, Veronika and Witbrock, Michael and Fokoue, Achille
Improving ENIGMA-Style Clause Selection While Learning From History CADE 2021 [paper] [Vampire]

Suda, Martin
Proving Theorems using Incremental Learning and Hindsight Experience Replay ICML 2022 [paper] [saturation]

Aygün, Eser and Anand, Ankit and Orseau, Laurent and Glorot, Xavier and McAleer, Stephen M and Firoiu, Vlad and Zhang, Lei M and Precup, Doina and Mourad, Shibl
HyperTree Proof Search for Neural Theorem Proving NeurIPS 2022 [paper] [Metamath, Lean]

Lample, Guillaume and Lacroix, Timothée and Lachaux, Marie-Anne and Rodriguez, Aurelien and Hayat, Amaury and Lavril, Thibaut and Ebner, Gabriel and Martinet, Xavier
Learning to Prove Trigonometric Identities arXiv 2022 [paper] [trigonometric identity]

Liu, Zhou and Li, Yujun and Liu, Zhengying and Li, Lin and Li, Zhenguo
Learning to Guide a Saturation-Based Theorem Prover TPAMI 2022 [paper] [TRAIL]

Abdelaziz, Ibrahim and Crouse, Maxwell and Makni, Bassem and Austil, Vernon and Cornelio, Cristina and Ikbal, Shajith and Kapanipathi, Pavan and Makondo, Ndivhuwo and Srinivas, Kavitha and Witbrock, Michael and Fokoue, Achille
Machine Learning Meets the Herbrand Universe arXiv 2022 [paper] [instantiation]

Piepenbrock, Jelle and Urban, Josef and Korovin, Konstantin and Olšák, Miroslav and Heskes, Tom and Janota, Mikoláš
Guiding An Automated Theorem Prover with Neural Rewriting IJCAR 2022 [paper] [Prover9]

Piepenbrock, Jelle and Heskes, Tom and Janota, Mikoláš and Urban, Josef
The Isabelle ENIGMA ITP 2022 [paper] [ENIGMA]

Goertzel, Zarathustra A and Jakubův, Jan and Kaliszyk, Cezary and Olšák, Miroslav and Piepenbrock, Jelle and Urban, Josef
Formal Mathematics Statement Curriculum Learning ICLR 2023 [paper] [Lean]

Polu, Stanislas and Han, Jesse Michael and Zheng, Kunhao and Baksys, Mantas and Babuschkin, Igor and Sutskever, Ilya
An Ensemble Approach for Automated Theorem Proving Based on Efficient Name Invariant Graph Neural Representations IJCAI 2023 [paper] [TRAIL]

Fokoue, Achille and Abdelaziz, Ibrahim and Crouse, Maxwell and Ikbal, Shajith and Kishimoto, Akihiro and Lima, Guilherme and Makondo, Ndivhuwo and Marinescu, Radu
Peano: Learning Formal Mathematical Reasoning Philosophical Transactions of the Royal Society A 2023 [paper] [Peano]

Poesia, Gabriel and Goodman, Noah D
MizAR 60 for Mizar 50 ITP 2023 [paper] [ENIGMA]

Jakubův, Jan and Chvalovský, Karel and Goertzel, Zarathustra and Kaliszyk, Cezary and Olšák, Mirek and Piotrowski, Bartosz and Schulz, Stephan and Suda, Martin and Urban, Josef
Reinforcement Learning for Guiding the E Theorem Prover FLAIRS 2023 [paper] [E]

McKeown, Jack and Sutcliffe, Geoff
Guiding an Instantiation Prover with Graph Neural Networks LPAR 2023 [paper] [iProver]

Chvalovský, Karel and Korovin, Konstantin and Piepenbrock, Jelle and Urban, Josef
How Much Should This Symbol Weigh? A GNN-Advised Clause Selection LPAR 2023 [paper] [Vampire]

Bártěk, Filip and Suda, Martin
gym-saturation: Gymnasium Environments for Saturation Provers (System Description) TABLEAUX 2023 [paper] [Vampire, iProver]

Shminke, Boris
DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-Level Value Function ACL 2023 [paper] [Isabelle, Lean]

Wang, Haiming and Yuan, Ye and Liu, Zhengying and Shen, Jianhao and Yin, Yichun and Xiong, Jing and Xie, Enze and Shi, Han and Li, Yujun and Li, Lin and Yin, Jian and Li, Zhenguo and Liang, Xiaodan
An In-Context Learning Agent for Formal Theorem-Proving arXiv 2023 [paper] [Lean, Coq]

Thakur, Amitayush and Tsoukalas, George and Wen, Yeming and Xin, Jimmy and Chaudhuri, Swarat
Verified Multi-Step Synthesis using Large Language Models and Monte Carlo Tree Search arXiv 2024 [paper] [Dafny, Coq, Lean]

Brandfonbrener, David and Raja, Sibi and Prasad, Tarun and Loughridge, Chloe and Yang, Jianang and Henniger, Simon and Byrd, William E and Zinkov, Robert and Amin, Nada

Other Tasks

First Neural Conjecturing Datasets and Experiments CICM 2020 [paper] [Mizar]

Urban, Josef and Jakubův, Jan
Guiding Inferences in Connection Tableau by Recurrent Neural Networks CICM 2020 [paper] [connection tableau]

Piotrowski, Bartosz and Urban, Josef
Mathematical Reasoning in Latent Space ICLR 2020 [paper] [HOL Light]

Lee, Dennis and Szegedy, Christian and Rabe, Markus N and Loos, Sarah M and Bansal, Kshitij
Mathematical Reasoning via Self-supervised Skip-tree Training ICLR 2021 [paper] [HOL Light]

Rabe, Markus N and Lee, Dennis and Bansal, Kshitij and Szegedy, Christian
Latent Action Space for Efficient Planning in Theorem Proving AITP 2021 [paper] [inequality]

Wu, Minchao and Wu, Yuhuai
IsarStep: A Benchmark for High-Level Mathematical Reasoning ICLR 2021 [paper] [Isabelle]

Li, Wenda and Yu, Lei and Wu, Yuhuai and Paulson, Lawrence C
LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning ICML 2021 [paper] [Isabelle, HOL Light, Metamath, Lean]

Wu, Yuhuai and Rabe, Markus N and Li, Wenda and Ba, Jimmy and Grosse, Roger B and Szegedy, Christian
Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas
Exploring Mathematical Conjecturing with Large Language Models NeSy 2023 [paper] [Isabelle]

Johansson, Moa and Smallbone, Nicholas
BERT Is Not The Count: Learning to Match Mathematical Statements with Proofs EACL 2023 [paper] [NL]

Li, Weixian Waylon and Ziser, Yftah and Coavoux, Maximin and Cohen, Shay B
REFACTOR: Learning to Extract Theorems from Proofs ICLR 2024 [paper] [Metamath]

Zhou, Jin Peng and Wu, Yuhuai and Li, Qiyang and Grosse, Roger

Datasets

Data Collection

Premise Selection for Mathematics by Corpus Analysis and Kernel Methods Journal of Automated Reasoning 2014 [paper] [Mizar]

Alama, Jesse and Heskes, Tom and Kühlwein, Daniel and Tsivtsivadze, Evgeni and Urban, Josef
MizAR 40 for Mizar 40 Journal of Automated Reasoning 2015 [paper] [Mizar]

Kaliszyk, Cezary and Urban, Josef
The TPTP Problem Library and Associated Infrastructure Journal of Automated Reasoning 2017 [paper] [TPTP]

Sutcliffe, Geoff
HolStep: A Machine Learning Dataset for Higher-Order Logic Theorem Proving ICLR 2017 [paper] [HOL Light]

Kaliszyk, Cezary and Chollet, François and Szegedy, Christian
Reinforcement Learning of Theorem Proving NeurIPS 2018 [paper] [Mizar]

Kaliszyk, Cezary and Urban, Josef and Michalewski, Henryk and Olšák, Miroslav
GamePad: A Learning Environment for Theorem Proving ICLR 2019 [paper] [Coq]

Huang, Daniel and Dhariwal, Prafulla and Song, Dawn and Sutskever, Ilya
HOList: An Environment for Machine Learning of Higher-Order Logic Theorem Proving ICML 2019 [paper] [HOL Light]

Bansal, Kshitij and Loos, Sarah M and Rabe, Markus N and Szegedy, Christian and Wilcox, Stewart
Learning to Prove Theorems via Interacting with Proof Assistants ICML 2019 [paper] [Coq]

Yang, Kaiyu and Deng, Jia
TacticToe: Learning to Prove with Tactics Journal of Automated Reasoning 2020 [paper] [HOL4]

Gauthier, Thibault and Kaliszyk, Cezary and Urban, Josef
Generative Language Modeling for Automated Theorem Proving arXiv 2020 [paper] [pretraining, Metamath]

Polu, Stanislas and Sutskever, Ilya
The Tactician: A Seamless, Interactive Tactic Learner and Prover for Coq CICM 2020 [paper] [Coq]

Blaauwbroek, Lasse and Urban, Josef and Geuvers, Herman
Natural Language Premise Selection: Finding Supporting Statements for Mathematical Text LREC 2020 [paper] [NL]

Ferreira, Deborah and Freitas, André
IsarStep: A Benchmark for High-Level Mathematical Reasoning ICLR 2021 [paper] [Isabelle]

Li, Wenda and Yu, Lei and Wu, Yuhuai and Paulson, Lawrence C
NaturalProofs: Mathematical Theorem Proving in Natural Language NeurIPS 2021 [paper] [NL]

Welleck, Sean and Liu, Jiacheng and Le Bras, Ronan and Hajishirzi, Hannaneh and Choi, Yejin and Cho, Kyunghyun
LISA: Language Models of Isabelle Proofs AITP 2021 [paper] [Isabelle]

Jiang, Albert Qiaochu and Li, Wenda and Han, Jesse Michael and Wu, Yuhuai
Proof Artifact Co-Training for Theorem Proving with Language Models ICLR 2022 [paper] [Lean]

Han, Jesse Michael and Rute, Jason and Wu, Yuhuai and Ayers, Edward W and Polu, Stanislas
MiniF2F: A Cross-System Benchmark for Formal Olympiad-Level Mathematics ICLR 2022 [paper] [Metamath, Lean, Isabelle, HOL Light]

Zheng, Kunhao and Han, Jesse Michael and Polu, Stanislas
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression EMNLP 2022 [paper] [geometry]

Chen, Jiaqi and Li, Tong and Qin, Jinghui and Lu, Pan and Lin, Liang and Chen, Chongyu and Liang, Xiaodan
NaturalProver: Grounded Mathematical Proof Generation with Language Models NeurIPS 2022 [paper] [NL]

Welleck, Sean and Liu, Jiacheng and Lu, Ximing and Hajishirzi, Hannaneh and Choi, Yejin
A Parallel Corpus of Natural Language and Isabelle Artefacts AITP 2022 [paper] [NL, Isabelle]

Bordg, Anthony and Stathopoulos, Yiannos A and Paulson, Lawrence C
Proof Repair Infrastructure for Supervised Models: Building a Large Proof Repair Dataset ITP 2023 [paper] [Coq]

Reichel, Tom and Henderson, R and Touchet, Andrew and Gardner, Andrew and Ringer, Talia
BERT Is Not The Count: Learning to Match Mathematical Statements with Proofs EACL 2023 [paper] [NL]

Li, Weixian Waylon and Ziser, Yftah and Coavoux, Maximin and Cohen, Shay B
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models NeurIPS 2023 [paper] [Lean]

Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima
MLFMF: Data Sets for Machine Learning for Mathematical Formalization NeurIPS 2023 [paper] [Agda, Lean]

Bauer, Andrej and Petković, Matej and Todorovski, Ljupco
Mathematical Capabilities of ChatGPT NeurIPS 2023 [paper] [NL]

Frieder, Simon and Pinchetti, Luca and Chevalier, Alexis and Griffiths, Ryan-Rhys and Salvatori, Tommaso and Lukasiewicz, Thomas and Petersen, Philipp Christian and Berner, Julius
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models EMNLP 2023 [paper] [Lean]

Xiong, Jing and Shen, Jianhao and Yuan, Ye and Wang, Haiming and Yin, Yichun and Liu, Zhengying and Li, Lin and Guo, Zhijiang and Cao, Qingxing and Huang, Yinya and Zheng, Chuanyang and Liang, Xiaodan and Zhang, Ming and Liu, Qun
Generative AI for Math: Part I - MathPile: A Billion-Token-Scale Pretraining Corpus for Math arXiv 2023 [paper] [pretraining]

Wang, Zengzhi and Xia, Rui and Liu, Pengfei
ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics arXiv 2023 [paper] [pretraining, NL, Lean]

Azerbayev, Zhangir and Piotrowski, Bartosz and Schoelkopf, Hailey and Ayers, Edward W and Radev, Dragomir and Avigad, Jeremy
FIMO: A Challenge Formal Dataset for Automated Theorem Proving arXiv 2023 [paper] [NL, Lean]

Liu, Chengwu and Shen, Jianhao and Xin, Huajian and Liu, Zhengying and Yuan, Ye and Wang, Haiming and Ju, Wei and Zheng, Chuanyang and Yin, Yichun and Li, Lin and Zhang, Ming and Liu, Qun
FormalGeo: The First Step Toward Human-like IMO-level Geometric Automated Reasoning arXiv 2023 [paper] [geometry]

Zhang, Xiaokai and Zhu, Na and He, Yiming and Zou, Jia and Huang, Qike and Jin, Xiaoxiao and Guo, Yanjun and Mao, Chenyang and Li, Yang and Zhu, Zhe and Yue, Dengfeng and Zhu, Fangzhen and Wang, Yifan and Huang, Yiwen and Wang, Runan and Qin, Cheng and Zeng, Zhenbing and Xie, Shaorong and Luo, Xiangfeng and Leng, Tuo
Llemma: An Open Language Model for Mathematics ICLR 2024 [paper] [pretraining]

Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Santos, Marco Dos and McAleer, Stephen and Jiang, Albert Q and Deng, Jia and Biderman, Stella and Welleck, Sean
Magnushammer: A Transformer-Based Approach to Premise Selection ICLR 2024 [paper] [Isabelle]

Mikuła, Maciej and Antoniak, Szymon and Tworkowski, Szymon and Jiang, Albert Qiaochu and Zhou, Jin Peng and Szegedy, Christian and Kuciński, Łukasz and Miłoś, Piotr and Wu, Yuhuai
OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text ICLR 2024 [paper] [pretraining]

Paster, Keiran and Santos, Marco Dos and Azerbayev, Zhangir and Ba, Jimmy
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models arXiv 2024 [paper] [pretraining]

Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, YK and Wu, Y and Guo, Daya
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning arXiv 2024 [paper] [pretraining]

Ying, Huaiyuan and Zhang, Shuo and Li, Linyang and Zhou, Zhejian and Shao, Yunfan and Fei, Zhaoye and Ma, Yichuan and Hong, Jiawei and Liu, Kuikun and Wang, Ziyi and Wang, Yudong and Wu, Zijian and Li, Shuaibin and Zhou, Fengzhe and Liu, Hongwei and Zhang, Songyang and Zhang, Wenwei and Yan, Hang and Qiu, Xipeng and Wang, Jiayu and Chen, Kai and Lin, Dahua

Data Generation

Learning to Reason in Large Theories without Imitation arXiv 2019 [paper] [HOL Light]

Bansal, Kshitij and Szegedy, Christian and Rabe, Markus N and Loos, Sarah M and Toman, Viktor
Learning to Prove Theorems by Learning to Generate Theorems NeurIPS 2020 [paper] [Metamath]

Wang, Mingzhe and Deng, Jia
INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving ICLR 2021 [paper] [inequality]

Wu, Yuhuai and Jiang, Albert Qiaochu and Ba, Jimmy and Grosse, Roger
Training a First-Order Theorem Prover from Synthetic Data ICLR 2021 MATH-AI Workshop [paper] [saturation]

Firoiu, Vlad and Aygün, Eser and Anand, Ankit and Ahmed, Zafarali and Glorot, Xavier and Orseau, Laurent and Zhang, Lei and Precup, Doina and Mourad, Shibl
LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning ICML 2021 [paper] [reasoning primitives]

Wu, Yuhuai and Rabe, Markus N and Li, Wenda and Ba, Jimmy and Grosse, Roger B and Szegedy, Christian
Proving Theorems using Incremental Learning and Hindsight Experience Replay ICML 2022 [paper] [saturation]

Aygün, Eser and Anand, Ankit and Orseau, Laurent and Glorot, Xavier and McAleer, Stephen M and Firoiu, Vlad and Zhang, Lei M and Precup, Doina and Mourad, Shibl
Synthetic Proof Term Data Augmentation for Theorem Proving with Language Models AITP 2022 [paper] [Lean]

Palermo, Joseph and Ye, Johnny and Han, Jesse Michael
Learning to Prove Trigonometric Identities arXiv 2022 [paper] [trigonometric identity]

Liu, Zhou and Li, Yujun and Liu, Zhengying and Li, Lin and Li, Zhenguo
Formal Mathematics Statement Curriculum Learning ICLR 2023 [paper] [Lean]

Polu, Stanislas and Han, Jesse Michael and Zheng, Kunhao and Baksys, Mantas and Babuschkin, Igor and Sutskever, Ilya
GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning arXiv 2023 [paper] [geometry]

Kazemi, Mehran and Alvari, Hamidreza and Anand, Ankit and Wu, Jialin and Chen, Xi and Soricut, Radu
Multilingual Mathematical Autoformalization arXiv 2023 [paper] [NL, Isabelle, Lean]

Jiang, Albert Q and Li, Wenda and Jamnik, Mateja
Solving Olympiad Geometry without Human Demonstrations Nature 2024 [paper] [geometry]

Trinh, Trieu H and Wu, Yuhuai and Le, Quoc V and He, He and Luong, Thang
LEGO-Prover: Neural Theorem Proving with Growing Libraries ICLR 2024 [paper] [Isabelle]

Wang, Haiming and Xin, Huajian and Zheng, Chuanyang and Li, Lin and Liu, Zhengying and Cao, Qingxing and Huang, Yinya and Xiong, Jing and Shi, Han and Xie, Enze and Yin, Jian and Li, Zhenguo and Liao, Heng and Liang, Xiaodan
REFACTOR: Learning to Extract Theorems from Proofs ICLR 2024 [paper] [Metamath]

Zhou, Jin Peng and Wu, Yuhuai and Li, Qiyang and Grosse, Roger
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data ICLR 2024 [paper] [NL, Lean]

Huang, Yinya and Lin, Xiaohan and Liu, Zhengying and Cao, Qingxing and Xin, Huajian and Wang, Haiming and Li, Zhenguo and Song, Linqi and Liang, Xiaodan

Related Surveys

QED at Large: A Survey of Engineering of Formally Verified Software Foundations and Trends® in Programming Languages 2019 [paper]

Ringer, Talia and Palmskog, Karl and Sergey, Ilya and Gligoric, Milos and Tatlock, Zachary
A Survey of Deep Learning for Mathematical Reasoning ACL 2023 [paper]

Lu, Pan and Qiu, Liang and Yu, Wenhao and Welleck, Sean and Chang, Kai-Wei
Mathematical Language Models: A Survey arXiv 2023 [paper]

Liu, Wentao and Hu, Hanglei and Zhou, Jie and Ding, Yuyang and Li, Junsong and Zeng, Jiayi and He, Mengliang and Chen, Qin and Jiang, Bo and Zhou, Aimin and He, Liang
Large Language Models for Mathematical Reasoning: Progresses and Challenges EACL 2024 [paper]

Ahn, Janice and Verma, Rishu and Lou, Renze and Liu, Di and Zhang, Rui and Yin, Wenpeng
AI for Math Resources Google Sheet [link]

Available to Everyone

Citation

If you find this repository useful, please consider citing our survey paper:

@article{li2024dl4tp,
   title={A Survey on Deep Learning for Theorem Proving}, 
   author={Li, Zhaoyu and Sun, Jialiang and Murphy, Logan and Su, Qidong and Li, Zenan and Zhang, Xian and Yang, Kaiyu and Si, Xujie},
   journal={arXiv preprint arXiv:2404.09939},
   year={2024},
}
