maml

Python for Materials Machine Learning, Materials Descriptors, Machine Learning Force Fields, Deep Learning, etc.

Primary language: Python. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

maml (MAterials Machine Learning) is a Python package that aims to provide useful high-level interfaces that make ML for materials science as easy as possible.

The goal of maml is not to duplicate functionality already available in other packages. maml relies on well-established packages such as scikit-learn and tensorflow for implementations of ML algorithms, as well as other materials science packages such as pymatgen and matminer for crystal/molecule manipulation and feature generation.

Official documentation is available at http://maml.ai/

Features

  1. Convert materials (crystals and molecules) into features. In addition to common compositional, site and structural features, we provide the following fine-grained local environment features:
     1. Bispectrum coefficients
     2. Behler-Parrinello symmetry functions
     3. Smooth Overlap of Atomic Positions (SOAP)
     4. Graph network features (composition, site and structure)
  2. Use ML to learn the relationship between features and targets. Currently, maml supports sklearn and keras models.
  3. Applications:
     1. pes for modelling the potential energy surface and constructing surrogate models for property prediction.
        1. Neural Network Potential (NNP)
        2. Gaussian approximation potential (GAP) with SOAP features
        3. Spectral neighbor analysis potential (SNAP)
        4. Moment Tensor Potential (MTP)
     2. rfxas for random forest models that predict atomic local environments from X-ray absorption spectroscopy.
     3. bowsr for rapid structural relaxation with Bayesian optimization and a surrogate energy model.
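
To give a feel for what a local environment feature looks like, here is a minimal sketch of a radial Behler-Parrinello symmetry function (often written G2) with the usual cosine cutoff. It is not maml's implementation and does not use maml's API: the function names `cosine_cutoff` and `radial_g2`, the parameter names, and the toy coordinates are all assumptions made for illustration, and the sketch treats an isolated cluster without periodic boundary conditions.

```python
import math


def cosine_cutoff(r, r_cut):
    """Smooth cutoff: 0.5 * (cos(pi * r / r_cut) + 1) inside r_cut, 0 outside."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)


def radial_g2(positions, center_index, eta, r_shift, r_cut):
    """Radial symmetry function for one atom in an isolated cluster.

    G2_i = sum over j != i of exp(-eta * (r_ij - r_shift)^2) * f_c(r_ij)
    """
    center = positions[center_index]
    total = 0.0
    for j, pos in enumerate(positions):
        if j == center_index:
            continue
        r = math.dist(center, pos)  # Euclidean distance (Python 3.8+)
        total += math.exp(-eta * (r - r_shift) ** 2) * cosine_cutoff(r, r_cut)
    return total


# Toy cluster (illustrative coordinates only): two neighbours at distance 1
# from atom 0 and one atom at distance 5, outside the cutoff.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 5.0)]

# A small feature vector for atom 0: one value per radial shift.
features = [
    radial_g2(positions, 0, eta=1.0, r_shift=r_s, r_cut=3.0)
    for r_s in (0.5, 1.0, 1.5)
]
print(features)
```

Each choice of `eta` and `r_shift` probes a different radial shell around the atom, so a set of such values forms a fixed-length descriptor that is invariant to rotation and to the ordering of neighbours.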

Installation

Pip install via PyPI:

pip install maml
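
After installing, a quick check that the package is visible to the current Python environment can be useful. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `maml`, as in the pip command above.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("maml")
if version is None:
    print("maml is not installed in this environment")
else:
    print(f"maml {version} is installed")
```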

To run the potential energy surface (pes) module, a LAMMPS installation is required. You can install it from source or from conda:

conda install -c conda-forge/label/cf202003 lammps

The SNAP potential comes with this lammps installation. The GAP package for GAP and the MLIP package for MTP are needed to run the corresponding potentials. To fit an NNP potential, the n2p2 package is needed.
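
Before running pes workflows, it may help to confirm that a LAMMPS executable is on the PATH. The following is a minimal sketch using the standard library. The candidate executable names (`lmp`, `lmp_serial`, `lmp_mpi`) are assumptions; the name on your system depends on how LAMMPS was built or packaged, and `find_executable` is a hypothetical helper, not part of maml.

```python
import shutil


def find_executable(candidates):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


# Assumed executable names; adjust to match your LAMMPS build.
lammps_path = find_executable(["lmp", "lmp_serial", "lmp_mpi"])
if lammps_path is None:
    print("No LAMMPS executable found on PATH")
else:
    print(f"Found LAMMPS at {lammps_path}")
```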

Install all the libraries from the requirements.txt file:

pip install -r requirements.txt

For all the requirements above:

pip install -r requirements-ci.txt
pip install -r requirements-optional.txt
pip install -r requirements-dl.txt
pip install -r requirements.txt

Usage

Many Jupyter notebooks demonstrating usage are available; see notebooks. A tool and tutorial lecture are also available on nanoHUB: https://nanohub.org/resources/maml.

API documentation

See API docs.

Machine learning (ML) is the study of computer algorithms that improve automatically through experience.[1][2] It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to do so.[3] Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop conventional algorithms to perform the needed tasks.

Machine learning is closely related to computational statistics, which focuses on making predictions using computers. The study of mathematical optimization delivers methods, theory and application domains to the field of machine learning. Data mining is a related field of study, focusing on exploratory data analysis through unsupervised learning.[5][6] In its application across business problems, machine learning is also referred to as predictive analytics.

Citing

@misc{maml,
    author = {Chen, Chi and Zuo, Yunxing and Ye, Weike and Ji, Qi and Ong, Shyue Ping},
    title = {{Maml - materials machine learning package}},
    year = {2020},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/materialsvirtuallab/maml}},
}

For the ML-IAP package (maml.pes), please cite:

Zuo, Y.; Chen, C.; Li, X.; Deng, Z.; Chen, Y.; Behler, J.; Csányi, G.; Shapeev, A. V.; Thompson, A. P.;
Wood, M. A.; Ong, S. P. Performance and Cost Assessment of Machine Learning Interatomic Potentials.
J. Phys. Chem. A 2020, 124 (4), 731–745. https://doi.org/10.1021/acs.jpca.9b08723.

For the BOWSR package (maml.bowsr), please cite:

Zuo, Y.; Qin, M.; Chen, C.; Ye, W.; Li, X.; Luo, J.; Ong, S. P. Accelerating Materials Discovery with Bayesian
Optimization and Graph Deep Learning. Materials Today 2021, 51, 126–135.
https://doi.org/10.1016/j.mattod.2021.08.012.

For the AtomSets model (maml.models.AtomSets), please cite:

Chen, C.; Ong, S. P. AtomSets as a hierarchical transfer learning framework for small and large materials
datasets. Npj Comput. Mater. 2021, 7, 173. https://doi.org/10.1038/s41524-021-00639-w