AI4Sci Edu related repo. Still under construction🚧👷🏻♂️
A list of awesome AI for chemistry papers. Inspired by the "Awesome" branded repositories in Computer Science. Please feel free to contribute and help to improve the quality of this page.
-
Note 1: To highlight the paper, we switch the order of authors and paper title in the citation. Please cite the authors correctly instead of directly copying from our page.
-
Note 2: If your work is not listed, but you feel it is influential/inspiring/novel, please feel free to provide comments via Discussion or create an issue.
Press ^ to return to the Table of Contents.
- Reviews
- Books
- Organic/Inorganic chemistry: Retrosynthesis planning
- Theoretical/Computational chemistry: Electronic structure
- Theoretical/Computational chemistry: Molecular Dynamics
- Theoretical/Computational chemistry: Property learning
- Generalized model/Datasets
- Experimental physical chemistry
- Biochemistry: Biomolecule (protein/nucleic acid/lipid) design/structure
- Analytical chemistry
- Robotic chemist/Automation
Reviews ^
Note: We will try to provide both the most influential/comprehensive reviews and the most recent/updated reviews. The list will be updated in a timely manner.
-
Roadmap on Machine learning in electronic structure. H. J. Kulik, T. Hammerschmidt, J. Schmidt, S. Botti, M. A. L. Marques, M. Boley, M. Scheffler, M. Todorović, P. Rinke, C. Oses, A. Smolyanyuk, S. Curtarolo, A. Tkatchenko, A. P. Bartók, S. Manzhos, M. Ihara, T. Carrington, J. Behler, O. Isayev, M. Veit, A. Grisafi, J. Nigam, M. Ceriotti, K. T. Schütt, J. Westermayr, M. Gastegger, R. J. Maurer, B. Kalita, K. Burke, R. Nagai, R. Akashi, O. Sugino, J. Hermann, F. Noé, S. Pilati, C. Draxl, M. Kuban, S. Rigamonti, M. Scheidgen, M. Esters, D. Hicks, C. Toher, P. V. Balachandran, I. Tamblyn, S. Whitelam, C. Bellinger, & L. M. Ghiringhelli, Electronic Structure 4, 023004 (2022)
-
Machine learning for molecular and materials science. K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev, & A. Walsh, Nature 559, 547–555 (2018)
-
Four generations of high-dimensional neural network potentials J. Behler, Chemical Reviews 121(16), 10037–10072 (2021)
-
Inverse molecular design using machine learning: Generative models for matter engineering. B. Sanchez-Lengeling & A. Aspuru-Guzik, Science 361(6400), 360–365 (2018)
-
Data-driven strategies for accelerated materials design R. Pollice, G. dos Passos Gomes, M. Aldeghi, R. J. Hickman, M. Krenn, C. Lavigne, M. Lindner-D’Addario, A. Nigam, C. T. Ser, Z. Yao, & A. Aspuru-Guzik, Accounts of Chemical Research 54(4), 849–860 (2021)
-
Molecular excited states through a machine learning lens P. O. Dral & M. Barbatti, Nature Reviews Chemistry 5, 388–405 (2021)
-
Combining machine learning and computational chemistry for predictive insights into chemical systems J. A. Keith, V. Vassilev-Galindo, B. Cheng, S. Chmiela, M. Gastegger, K.-R. Müller, & A. Tkatchenko, Chemical Reviews 121(16), 9816–9872 (2021)
-
Machine learning force fields O. T. Unke, S. Chmiela, H. E. Sauceda, M. Gastegger, I. Poltavsky, K. T. Schütt, A. Tkatchenko, & K.-R. Müller, Chemical Reviews 121(16), 10142–10186 (2021)
-
Physics-inspired structural representations for molecules and materials F. Musil, A. Grisafi, A. P. Bartók, C. Ortner, G. Csányi, & M. Ceriotti, Chemical Reviews 121(16), 9759–9815 (2021)
-
Machine Learning for Chemical Reactions M. Meuwly, Chemical Reviews 121(16), 10218–10239 (2021)
-
Artificial intelligence applied to battery research: Hype or reality? T. Lombardo, M. Duquesnoy, H. El-Bouysidy, F. Årén, A. Gallo-Bueno, P. B. Jørgensen, A. Bhowmik, A. Demortière, E. Ayerbe, F. Alcaide, M. Reynaud, J. Carrasco, A. Grimaud, C. Zhang, T. Vegge, P. Johansson, & A. A. Franco, Chemical Reviews 122(12), 10899–10969 (2022)
-
Autonomous Discovery in the Chemical Sciences Part I: Progress. and Autonomous Discovery in the Chemical Sciences Part II: Outlook. C. W. Coley, N. S. Eyke, & K. F. Jensen, Angewandte Chemie International Edition, 59, Part I: 22858 and Part II: 23414 (2020)
-
Taking the leap between analytical chemistry and artificial intelligence: A tutorial review. L. B. Ayres, F. J.V. Gomez, J. R. Linton, M. F. Silva, & C. D. Garcia, Analytica Chimica Acta 1161, 338403 (2021)
-
Recent advances and applications of deep learning methods in materials science. K. Choudhary, B. DeCost, C. Chen, A. Jain, F. Tavazza, R. Cohn, C. W. Park, A. Choudhary, A. Agrawal, S. J. L. Billinge, E. Holm, S. P. Ong, & C. Wolverton, npj Computational Materials 8, 59 (2022)
-
Perspective on integrating machine learning into computational chemistry and materials science. J. Westermayr, M. Gastegger, K. T. Schütt, & R. J. Maurer, The Journal of Chemical Physics 154, 230903 (2021)
-
Machine learning in scanning transmission electron microscopy. S. V. Kalinin, C. Ophus, P. M. Voyles, R. Erni, D. Kepaptsoglou, V. Grillo, A. R. Lupini, M. P. Oxley, E. Schwenker, M. K. Y. Chan, J. Etheridge, X. Li, G. G. D. Han, M. Ziatdinov, N. Shibata, & S. J. Pennycook, Nature Reviews Methods Primers 2, 11 (2022)
-
Deep learning and artificial intelligence methods for Raman and surface-enhanced Raman scattering. F. Lussier, V. Thibault, B. Charron, G. Q. Wallace, & J.-F. Masson, TrAC Trends in Analytical Chemistry 124, 115796 (2020)
-
Machine Learning-Driven Multiscale Modeling: Bridging the Scales with a Next-Generation Simulation Infrastructure. H. I. Ingólfsson, H. Bhatia, F. Aydin,..., & F. H. Streitz, Journal of Chemical Theory and Computation 19(9), 2658-2675 (2023)
Books ^
Note: We will try to provide the most influential/comprehensive books (usually the standard reference for a field). We provide the links to the corresponding Amazon page to avoid confusion. The list will be updated in a timely manner.
Note: Undergraduate-level readings. These are more general and cover many subfields.
- Chemistry: The central science 14th edition T. Brown, H. LeMay, B. Bursten, C. Murphy, P. Woodward, & M. Stoltzfus. Pearson, 2017
- Principles of modern chemistry D.W. Oxtoby, H.P. Gillis, & A. Campion. Thomson Brooks/Cole, 2008
- Organic chemistry seventh edition M. Loudon & J. Parise. W. H. Freeman, 2021
- Quantitative chemical analysis D. C. Harris. W. H. Freeman, 2010
- Lehninger principles of biochemistry D. L. Nelson & M. M. Cox
- Physical chemistry 4th edition R. J. Silbey, R. A. Alberty, & M. G. Bawendi. Wiley, 2004
- Atkins' physical chemistry P. Atkins, J. de Paula, & J. Keeler. Oxford University Press, 2018
- Fundamentals of polymer science M. M. Coleman & P. C. Painter. Routledge, 1998
Note: Graduate-level readings.
- Modern quantum chemistry: introduction to advanced electronic structure theory A. Szabo & N.S. Ostlund. Courier Corporation, 2012.
- Density-functional theory of atoms and molecules R. G. Parr & W. Yang. Oxford University Press, 1994
- Molecular electronic-structure theory T. Helgaker, P. Jorgensen, J. Olsen. Wiley, 2013
- Statistical mechanics D. A. McQuarrie. VIVA, 2015
- Statistical mechanics: theory and molecular simulation M. E. Tuckerman. Oxford University Press, 2010
- Theories of molecular reaction dynamics: the microscopic foundation of chemical kinetics N. E. Henriksen & F. Y. Hansen. Oxford University Press, 2012
- Essentials of Computational Chemistry: Theories and Models C. J. Cramer. Wiley, 2004
- Introduction to spectroscopy D. L. Pavia, G. M. Lampman, G. S. Kriz, & J. A. Vyvyan. Cengage Learning, 2014
- First course in electrode processes D. Pletcher. Royal Society of Chemistry, 2019
- Biological physics: Energy, information, life P. Nelson. Chiliagon Science, 2020
- Molecular symmetry and group theory A. Vincent. Wiley, 2001
- Introduction to atmospheric chemistry D. J. Jacob. Princeton University Press, 1999
- Introduction to Bioorganic Chemistry and Chemical Biology D. Van Vranken & G. A. Weiss. Garland Science, 2012
- Solid state chemistry and its applications A. R. West. Wiley, 2014
- Modern physical organic chemistry E. V. Anslyn & D. A. Dougherty. University Science, 2005
- Physics and chemistry of interfaces H. Butt, K. Graf, & M. Kappl. Wiley-VCH, 2023
Note: General reading materials. For in-depth reading, check out other repositories in the Awesome series, for example, Awesome Machine Learning.
- Machine learning: A probabilistic perspective K. P. Murphy. The MIT Press, 2012
- Deep Learning I. Goodfellow, Y. Bengio & A. Courville. The MIT Press, 2016
- Pattern recognition and machine learning C. M. Bishop. Springer, 2006
- The elements of statistical learning: Data mining, inference, and prediction T. Hastie, R. Tibshirani & J. Friedman. Springer, 2016
Retrosynthesis ^
- Planning chemical syntheses with deep neural networks and symbolic AI. Marwin H. S. Segler, Mike Preuss, & Mark P. Waller, Nature 555, 604–610 (2018) [Reinforcement learning]
- Learning retrosynthetic planning through simulated experience. John S. Schreck, Connor W. Coley, & Kyle J. M. Bishop, ACS Central Science 5(6), 970-981 (2019) [Reinforcement learning]
- ...
- Prediction of organic reaction outcomes using machine learning Connor W. Coley, Regina Barzilay, Tommi S. Jaakkola, William H. Green, & Klavs F. Jensen, ACS Central Science 3(5), 434–443 (2017)
Electronic Structure ^
Molecular Dynamics ^
- SchNet - A deep learning architecture for molecules and materials K. T. Schütt, H. E. Sauceda, P.-J. Kindermans, A. Tkatchenko, & K.-R. Müller, The Journal of Chemical Physics 148, 241722 (2018)
- ANI-1: An extensible neural network potential with DFT accuracy at force field computational cost J. S. Smith, O. Isayev, & A. E. Roitberg, Chemical Science 8, 3192–3203 (2017)
- Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons. A. P. Bartók, M. C. Payne, R. Kondor, & G. Csányi, Physical Review Letters 104, 136403 (2010)
- Machine learning of accurate energy-conserving molecular force fields. S. Chmiela, A. Tkatchenko, H. E. Sauceda, I. Poltavsky, K. T. Schütt, & K.-R. Müller, Science Advances 3, e1603015 (2017)
- DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics. H. Wang, L. Zhang, J. Han, & W. E, Computer Physics Communications 228, 178–184 (2018)
- Generalized neural-network representation of high-dimensional potential-energy surfaces. J. Behler & M. Parrinello, Physical Review Letters 98, 146401 (2007)
- The TensorMol-0.1 model chemistry: A neural network augmented with long-range physics. K. Yao, J. E. Herr, D. W. Toth, R. Mckintyre, & J. Parkhill, Chemical Science 9, 2261–2269 (2018)
- Teaching a neural network to attach and detach electrons from molecules. R. Zubatyuk, J. S. Smith, B. T. Nebgen, S. Tretiak, & O. Isayev, Nature Communications 12, 4870 (2021)
Property learning ^
- PhysNet: A neural network for predicting energies, forces, dipole moments and partial charges. O. T. Unke & M. Meuwly, Journal of Chemical Theory and Computation 15, 3678–3693 (2019)
Generalized Model/Datasets ^
- The Open Reaction Database. S. M. Kearnes, M. R. Maser, M. Wleklinski, A. Kast, A. G. Doyle, S. D. Dreher, J. M. Hawkins, K. F. Jensen, & C. W. Coley, Journal of the American Chemical Society 143(45), 18820–18826 (2021). Dataset
- QM series (Please also see quantum-machine page)
-
QM9 Dataset:
- Quantum chemistry structures and properties of 134 kilo molecules. R. Ramakrishnan, P. O. Dral, M. Rupp, & O. A. von Lilienfeld, Scientific Data 1, 140022 (2014)
- Original version. Theory: B3LYP/6-31G(2df,p)
- Wavefunction theory version. Theory: MP2/cc-pVTZ. Only has dipole moments and molecular energies.
- Kaggle page
-
QM7 & QM7b Dataset:
- Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning. (QM7) M. Rupp, A. Tkatchenko, K.-R. Müller, & O. A. von Lilienfeld, Physical Review Letters 108(5), 058301 (2012)
- Machine Learning of Molecular Electronic Properties in Chemical Compound Space. G. Montavon, M. Rupp, V. Gobre, A. Vazquez-Mayagoitia, K. Hansen, A. Tkatchenko, K.-R. Müller, & O. A. von Lilienfeld, New Journal of Physics 15, 095003 (2013)
- QM7 original dataset
- QM7b original dataset
- Wavefunction theory version. Theory: MP2/cc-pVTZ. Improved features & additional datasets are available here
-
QMSpin Dataset:
- Large yet bounded: Spin gap ranges in carbenes. M. Schwilk, D. N. Tahchieva, & O. A. von Lilienfeld, arXiv:2004.10600 (2020)
- Dataset Part 1 & Part 2
-
tmQM Dataset. Dataset: Geometries & Properties D. Balcells & B. B. Skjelstad, ChemRxiv (2020)
- GDB series:
- GDB 17:
- Original paper. L. Ruddigkeit, R. van Deursen, L. C. Blum, & J.-L. Reymond, Journal of Chemical Information and Modeling 52, 2864–2875 (2012)
- GDB 13:
- Original paper. L. C. Blum & J.-L. Reymond, Journal of the American Chemical Society 131, 8732 (2009)
-
MD17 Dataset:
- Original dataset. S. Chmiela, A. Tkatchenko, H. E. Sauceda, I. Poltavsky, K. T. Schütt, & K.-R. Müller, Science Advances 3, e1603015 (2017)
- Revised dataset (rMD17). A. S. Christensen & O. A. von Lilienfeld, arXiv:2007.09593 (2020)
-
DESMILES Models & Training datasets. P. Maragakis, H. Nisonoff, B. Cole, & D. E. Shaw
-
Harvard organic photovoltaic dataset (HOPV15). Dataset S. A. Lopez, E. O. Pyzer-Knapp, G. N. Simm, T. Lutzow, K. Li, L. R. Seress, J. Hachmann, & A. Aspuru-Guzik, Scientific Data 3, 160086 (2016)
-
The Materials Project. Website A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder, & K. A. Persson, APL Materials 1, 011002 (2013)
-
The Liverpool materials discovery server. Website S. Durdy, C. J. Hargreaves, M. Dennison, B. Wagg, M. Moran, J. A. Newnham, M. W. Gaultois, M. J. Rosseinsky, & M. Dyer, ChemRxiv, 10.26434/chemrxiv-2023-wgdm0 (2023)
Experimental Physical Chemistry ^
- ...
Biomolecule Design ^
- ...
Analytical chemistry ^
- ...
Automated Experiments ^
- A mobile robotic chemist. B. Burger, P. M. Maffettone, V. V. Gusev, C. M. Aitchison, Y. Bai, X. Wang, X. Li, B. M. Alston, B. Li, R. Clowes, N. Rankin, B. Harris, R. S. Sprick, & A. I. Cooper, Nature 583, 237–241 (2020) [Robotics + Bayesian learning]
- A robotic platform for flow synthesis of organic compounds informed by AI planning. C. W. Coley, D. A. Thomas III, J. A.M. Lummiss, J. N. Jaworski, C. P. Breen, V. Schultz, T. Hart, J. S. Fishman, L. Rogers, H. Gao, R. W. Hicklin, P. P. Plehiers, J. Byington, J. S. Piotti, W. H. Green, A. J. Hart, T. F. Jamison, & K. F. Jensen, Science 365(6453), eaax1566 (2019) [Software + Experiments]