/CatOGM

Machine learning utilities

Primary LanguagePython

CatOGM

Fingerprint Generator

The fingerprint generator is designed to be a general purpose utility for returning manipulated parameters from a list of possible operations. The possible operations which are currently implemented cat be found in ./catogm/fingerprint/operations.py.

The complete list of seeding parameters which can be called from is available from the data dictionary produced from Mendeleev.

import json
from pprint import pprint

with open('./catogm/data/properties.json', 'r') as f:
    data = json.load(f)

pprint(list(data.keys()))
[‘_series_id’, ‘abundance_crust’, ‘abundance_sea’, ‘atomic_number’, ‘atomic_radius’, ‘atomic_radius_rahm’, ‘atomic_volume’, ‘atomic_weight’, ‘atomic_weight_uncertainty’, ‘block’, ‘boiling_point’, ‘c6’, ‘c6_gb’, ‘cas’, ‘covalent_radius_bragg’, ‘covalent_radius_cordero’, ‘covalent_radius_pyykko’, ‘covalent_radius_pyykko_double’, ‘covalent_radius_pyykko_triple’, ‘covalent_radius_slater’, ‘dband_center_bulk’, ‘dband_center_slab’, ‘dband_kurtosis_bulk’, ‘dband_kurtosis_slab’, ‘dband_skewness_bulk’, ‘dband_skewness_slab’, ‘dband_width_bulk’, ‘dband_width_slab’, ‘density’, ‘dipole_polarizability’, ‘discovery_year’, ‘electron_affinity’, ‘en_allen’, ‘en_ghosh’, ‘en_pauling’, ‘evaporation_heat’, ‘fusion_heat’, ‘gas_basicity’, ‘group’, ‘group_id’, ‘heat_of_formation’, ‘lattice_constant’, ‘lattice_structure’, ‘melting_point’, ‘metallic_radius’, ‘metallic_radius_c12’, ‘period’, ‘proton_affinity’, ‘specific_heat’, ‘symbol’, ‘thermal_conductivity’, ‘vdw_radius’, ‘vdw_radius_alvarez’, ‘vdw_radius_batsanov’, ‘vdw_radius_bondi’, ‘vdw_radius_dreiding’, ‘vdw_radius_mm3’, ‘vdw_radius_rt’, ‘vdw_radius_truhlar’, ‘vdw_radius_uff’]

A list of atoms objects can also be passed to any of the generators.This will increase the dimensionality of the numpy array returned by 1 for each atoms object in the supplied list of images. Fingerprints produced in this way are returned in the same order as the list of objects provided.

Non-local fingerprint generation

Currently, the fingerprint generator has operations for producing convolutions over all atoms in a provided list of atoms objects. These are either:

  1. periodic_convolution_r0

Calculates the square of each atoms parameter in the unit cell.

  1. periodic_convolution_r1

Multiplies the parameter of a given atom by its next nearest neighbors parameter types.

from catogm.fingerprint import Fingerprinter
from catogm.fingerprint.operations import convolution_d0
from catogm.fingerprint.operations import convolution_d1
from ase.build import fcc111
from ase.build import bulk

bulk = bulk('Pd', cubic=True)
bulk[3].symbol = 'Pt'

slab = fcc111('Al', size=(2, 2, 3), a=3.8, vacuum=10.0)

images = [bulk, slab]

parameters = [
    'atomic_number',
    'atomic_radius',
    'atomic_volume',
    'atomic_weight',
    'dband_center_bulk',
    'dband_width_bulk',
    'dband_skewness_bulk',
    'dband_kurtosis_bulk'
]

operations = [
    convolution_d0,
    convolution_d1
]

fp = Fingerprinter(images)
fingerprints = fp.get_fp(parameters, operations)
print(fingerprints)
[[1.24320000e+04 7.56280000e+00 3.20440000e+02 7.20334163e+04 2.16273665e+01 7.67298895e+02 7.38789348e+03 1.62700011e+07 1.36896000e+05 9.07488000e+01 3.84480000e+03 7.70065336e+05 2.57176420e+02 8.84466917e+03 8.85407060e+04 1.75573219e+08] [2.02800000e+03 2.45388000e+01 1.20000000e+03 8.73604104e+03 nan nan nan nan 2.02800000e+04 2.45388000e+02 1.20000000e+04 8.73604104e+04 nan nan nan nan]]

Local fingerprint generation

Localized fingerprints can also be generated. This is demonstrated for a catalytic structure of Palladium with a single Carbon adsorbate.

from catogm.fingerprint import Fingerprinter
from catogm.fingerprint.operations import bonding_convolution
from ase.build import fcc111
from ase.build import add_adsorbate

atoms = fcc111('Pd', size=(2, 2, 3), vacuum=10.0)
add_adsorbate(atoms, 'C', 1, 'fcc')

# -1 is the tag convention to identify bonded atoms
tags = atoms.get_tags()
tags[-1] = -1
atoms.set_tags(tags)

parameters = [
    'atomic_number',
    'atomic_radius',
    'atomic_volume',
    'atomic_weight',
    'boiling_point',
    'covalent_radius_cordero',
    'dipole_polarizability',
    'electron_affinity',
    'en_pauling',
    'en_allen',
    'en_ghosh',
    'evaporation_heat',
    'fusion_heat',
    'group_id',
    'period',
    'heat_of_formation',
    'melting_point',
    'metallic_radius',
    'specific_heat',
    'thermal_conductivity', 
    'vdw_radius',
    'dband_center_slab',
    'dband_width_slab',
    'dband_skewness_slab',
    'dband_kurtosis_slab'
]

operations = [
    bonding_convolution
]

fp = Fingerprinter(atoms)
fingerprints = fp.get_fp(parameters, operations)
print(fingerprints)
<a href=” 2.76000000e+02 1.24670000e+00 4.71700000e+01 1.27821062e+03 1.74063000e+07 1.01470000e+00 6.56960000e+02 7.09310878e-01 5.61000000e+00 1.41500100e+02 3.23739478e-02 nan nan 1.40000000e+02 1.00000000e+01 2.69973242e+05 6.97150000e+06 nan 1.73484000e-01 1.14162000e+02 3.57000000e+00 -1.57034029e+00 6.51684717e+00 -8.13678523e+01 3.65148674e+03”> 2.76000000e+02 1.24670000e+00 4.71700000e+01 1.27821062e+03 1.74063000e+07 1.01470000e+00 6.56960000e+02 7.09310878e-01 5.61000000e+00 1.41500100e+02 3.23739478e-02 nan nan 1.40000000e+02 1.00000000e+01 2.69973242e+05 6.97150000e+06 nan 1.73484000e-01 1.14162000e+02 3.57000000e+00 -1.57034029e+00 6.51684717e+00 -8.13678523e+01 3.65148674e+03

Writing personalized operations

Currently, the default structure of and operation is as follows:

def periodic_convolution_r0(
        atoms,
        atoms_properties,
        connectivity):

Where the atoms, atoms_properties, and connectivity properties are required. This is because these properties are passed to all operation functions currently implemented so that they do not need to be generated multiple times. This way change in future versions if it seems that these are better suited as global variables (This may makes the code overly difficult to follow).

Here is an example of a simple operation which supplies the adsorbate connectivity.

from catogm.fingerprint import Fingerprinter
from catogm.fingerprint.operations import bonding_convolution
from ase.build import fcc111
from ase.build import add_adsorbate
import numpy as np

atoms = fcc111('Pd', size=(2, 2, 3), vacuum=10.0)
add_adsorbate(atoms, 'C', 1, 'fcc')

# -1 is the tag convention to identify bonded atoms
tags = atoms.get_tags()
tags[-1] = -1
atoms.set_tags(tags)

parameters = [
    'atomic_number',
    'dband_center_slab',
    'dband_width_slab',
    'dband_skewness_slab',
    'dband_kurtosis_slab'
]

def example_operation(
        atoms,
        atoms_properties,
        connectivity):
    # This is a CatKit convention
    bond_index = np.where(atoms.get_tags() == -1)[0]

    return np.sum(connectivity[bond_index], axis=1)

operations = [
    bonding_convolution,
    example_operation
]

fp = Fingerprinter(atoms)
fingerprints = fp.get_fp(parameters, operations)
print(fingerprints)
<a href=” 2.76000000e+02 -1.57034029e+00 6.51684717e+00 -8.13678523e+01 3.65148674e+03 3.00000000e+00”> 2.76000000e+02 -1.57034029e+00 6.51684717e+00 -8.13678523e+01 3.65148674e+03 3.00000000e+00