
Mindless molecule generator in a Python package.

Primary language: Python. License: Apache License 2.0.

Mindless Molecule Generator


mindlessgen is a Python-based program for semi-automated generation of "mindless" small molecules, as described here. The rule-based algorithm places atoms randomly in coordinate space and applies several optimization, fragment detection, and sanity check steps. The program is mainly controlled via a TOML configuration file; see below for details.

One-page overview

(Figure: one-page overview of mindlessgen.)

Installation

Important

xtb (see here) has to be installed on your system, either via conda-forge, as a release binary, or compiled from source. If post-processing with DFT is desired, orca (see here) must also be available.

Non-development purposes

You can install the project in an existing virtual environment, provided for example by the package managers conda or mamba (see also here and here). With mamba, a matching Python environment can be set up and activated as follows:

mamba create -n mindlessgen python=3.12
mamba activate mindlessgen

Afterwards, install the package from PyPI:

pip install mindlessgen

This command installs the latest release version of mindlessgen.

Alternatively, it can be installed from the latest source code by cloning the repository:

git clone https://github.com/grimme-lab/MindlessGen.git # or the analogous SSH link
cd MindlessGen
pip install .

In principle, both installation methods also work without a virtual environment, but using one is strongly recommended to avoid conflicts with other packages.

Development purposes

For working on the code of mindlessgen, the following setup is recommended:

mamba create -n mindlessgen python=3.12
mamba activate mindlessgen
git clone https://github.com/grimme-lab/MindlessGen.git # or the analogous SSH link
cd MindlessGen
pip install -e '.[dev]'

This installs all necessary development tools (e.g., ruff, mypy, tox, pytest, and pre-commit). Before making changes to the code, activate the pre-commit hooks via:

pre-commit install

Before pushing a commit, run the optional tests, which depend on external dependencies like xtb, via

pytest -vv --optional

Further information on how to contribute to this project can also be found in the contribution guidelines.

Usage

Command line interface

Warning

mindlessgen may still be subject to API changes.

mindlessgen can be executed after installation in the desired environment via:

mindlessgen -h

This command displays all command line options in the terminal. In addition, all options are accessible via the TOML configuration file. The template configuration file in the root directory of the repository contains comprehensive explanations for each of the available configuration keys. If the path is not specified with -c/--config, mindlessgen searches for mindlessgen.toml in the following locations, in order:

  1. Current working directory ($CWD)
  2. Home directory ($HOME/)

If neither a corresponding CLI option nor an entry in the configuration file is provided, the default values are used. The active configuration, including the default values, can be printed using --print-config.
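The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the code mindlessgen actually uses: the function name find_config_file and its parameters are hypothetical, and the cwd and home arguments exist only so that the search can be tried out against arbitrary directories.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = "mindlessgen.toml"


def find_config_file(
    explicit: Optional[Path] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the configuration file to use, or None to fall back to defaults.

    Hypothetical helper mirroring the documented search order:
    an explicit -c/--config path first, then the current working
    directory, then the home directory.
    """
    if explicit is not None:
        # A path given on the command line always wins.
        return Path(explicit)
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    for directory in (cwd, home):
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return candidate
    # Nothing found: the caller would use the built-in default values.
    return None
```

Calling find_config_file() without arguments checks the real working and home directories; a return value of None corresponds to the case in which only default values and CLI options apply.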

Element composition

There are two related aspects of the element composition:

  1. Which elements should occur within the generated molecule?
  2. How many atoms of the specified element should occur?
  • Example 1: C:1-3, O:1-1, H:1-* would result in a molecule with 1, 2, or 3 carbon atoms, exactly 1 oxygen atom, and between 1 and an undefined number of hydrogen atoms (i.e., at least 1).
  • Example 2: Na:10-10, In:10-10, O:20-20. This example would result in a molecule with exactly 10 sodium atoms, 10 indium atoms, and 20 oxygen atoms. For a fixed element composition, the number of atoms (40) has to be within the min_num_atoms and max_num_atoms interval. mindlessgen will consequently always return a molecule with exactly 40 atoms.
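To make the two examples concrete, the following sketch parses a composition string into per-element bounds and computes the total atom count of a fully fixed composition. It is a simplified illustration, not the parser used inside mindlessgen: the function names parse_composition and fixed_atom_count are hypothetical, and only the forms shown above (Symbol:min-max, with * as an open upper bound) are handled.

```python
from typing import Dict, Optional, Tuple

# (minimum, maximum) atom count; a maximum of None means "no upper bound".
Bounds = Tuple[int, Optional[int]]


def parse_composition(spec: str) -> Dict[str, Bounds]:
    """Parse a string such as 'C:1-3, O:1-1, H:1-*' into per-element bounds."""
    result: Dict[str, Bounds] = {}
    for entry in spec.split(","):
        symbol, _, bounds = entry.strip().partition(":")
        low, _, high = bounds.partition("-")
        lower = int(low)
        upper = None if high.strip() == "*" else int(high)
        if upper is not None and upper < lower:
            raise ValueError(f"Upper bound below lower bound in '{entry.strip()}'")
        result[symbol.strip()] = (lower, upper)
    return result


def fixed_atom_count(composition: Dict[str, Bounds]) -> Optional[int]:
    """Return the total atom count if every element is fixed, else None."""
    if all(low == high for low, high in composition.values()):
        return sum(low for low, _ in composition.values())
    return None


if __name__ == "__main__":
    print(parse_composition("C:1-3, O:1-1, H:1-*"))
    print(fixed_atom_count(parse_composition("Na:10-10, In:10-10, O:20-20")))
```

For Example 2, fixed_atom_count yields 40, which is the number that has to lie between min_num_atoms and max_num_atoms.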

Warning

When using orca and specifying elements with Z > 86, ensure that the basis set you've selected is compatible with (super-)heavy elements like actinides. You can find a list of available basis sets here. A reliable standard choice that covers the entire periodic table is def2-mTZVPP.

Python application programming interface

"""
Python script that calls the MindlessGen API.
"""

import warnings

from mindlessgen.generator import generator
from mindlessgen.prog import ConfigManager


def main():
    """
    Main function for execution of MindlessGen via Python API.
    """
    config = ConfigManager()

    # General settings
    config.general.max_cycles = 500
    config.general.parallel = 6
    config.general.verbosity = -1
    config.general.num_molecules = 2
    config.general.postprocess = False
    config.general.write_xyz = False

    # Settings for the random molecule generation
    config.generate.min_num_atoms = 10
    config.generate.max_num_atoms = 15
    config.generate.element_composition = "Ce:1-1"
    config.generate.forbidden_elements = "21-30,39-48,57-80"

    # xtb-related settings
    config.xtb.level = 1

    try:
        molecules, exitcode = generator(config)
    except RuntimeError as e:
        print(f"\nGeneration failed: {e}")
        raise RuntimeError("Generation failed.") from e
    if exitcode != 0:
        warnings.warn("Generation completed with errors for parts of the generation.")
    for molecule in molecules:
        molecule.write_xyz_to_file()
        print(
            "\n###############\nProperties of molecule "
            + f"'{molecule.name}' with formula {molecule.sum_formula()}:"
        )
        print(molecule)


if __name__ == "__main__":
    main()
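The forbidden_elements value in the script above is a comma-separated list of numeric ranges. The sketch below expands such a string into a set of integers, assuming that the numbers are atomic numbers and that both ends of each range are inclusive; neither assumption is stated in this README, and the helper expand_ranges is hypothetical rather than part of the mindlessgen API.

```python
from typing import Set


def expand_ranges(spec: str) -> Set[int]:
    """Expand a string such as '21-30,39-48,57-80' into a set of integers.

    Assumption: each range is inclusive on both ends; a single number
    without a dash stands for itself.
    """
    numbers: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, sep, end = part.partition("-")
        first = int(start)
        last = int(end) if sep else first
        if last < first:
            raise ValueError(f"Descending range '{part}'")
        numbers.update(range(first, last + 1))
    return numbers


if __name__ == "__main__":
    forbidden = expand_ranges("21-30,39-48,57-80")
    print(sorted(forbidden))
```

Under these assumptions, the string from the example covers 44 distinct numbers.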

Citation

When using the program for academic purposes, please cite i) the original idea and ii) the new Python implementation.

  1. J. Chem. Theory Comput. 2009, 5, 4, 993–1003

    @article{korth_mindless_2009,
    	title = {Mindless {DFT} benchmarking},
    	volume = {5},
    	issn = {15499618},
    	url = {https://pubs.acs.org/doi/full/10.1021/ct800511q},
    	doi = {10.1021/ct800511q},
    	number = {4},
    	urldate = {2022-11-07},
    	journal = {J. Chem. Theory Comput.},
    	author = {Korth, Martin and Grimme, Stefan},
    	month = apr,
    	year = {2009},
    	note = {Publisher: American Chemical Society},
    	pages = {993--1003},
    }
    
  2. A new publication featuring all functionalities and improvements of mindlessgen is in preparation. In the meantime, please refer to the original publication and to the following preprint, which uses the mindlessgen program for the first time: Müller, M.; Froitzheim, T.; Hansen, A.; Grimme, S. ChemRxiv October 28, 2024. https://doi.org/10.26434/chemrxiv-2024-h76ms.

    @misc{muller_advanced_2024,
    	title = {Advanced {Charge} {Extended} {Hückel} ({CEH}) {Model} and a {Consistent} {Adaptive} {Minimal} {Basis} {Set} for the {Elements} {Z}=1-103},
    	url = {https://chemrxiv.org/engage/chemrxiv/article-details/671a92581fb27ce1247466ad},
    	doi = {10.26434/chemrxiv-2024-h76ms},
    	urldate = {2024-10-28},
    	publisher = {ChemRxiv},
    	author = {Müller, Marcel and Froitzheim, Thomas and Hansen, Andreas and Grimme, Stefan},
    	month = oct,
    	year = {2024},
    	keywords = {DFT, Basis sets, EHT, SQM},
    }
    

Acknowledgements

T. Gasevic for creating an initial GitHub migration of the code and making important adjustments to the workflow. S. Grimme and M. Korth for the original code written in Fortran associated with the publication in J. Chem. Theory Comput. T. Froitzheim for helpful discussions during the development of the program.