IntelLabs/matsciml

[Feature request]: Interface to atomic properties

laserkelvin opened this issue · 0 comments

Feature/behavior summary

The codebase currently contains hardcoded numbers scattered around without units or traceability. We should instead provide a standardized interface for reference atomic/chemical property values, backed by maintained dependencies like mendeleev and periodictable.

For example, the MACE and M3GNet implementations have brought in dictionaries that are useful but are literally hardcoded mappings, like matsciml.datasets.utils.atomic_number_map.

Request attributes

  • Would this be a refactor of existing code?
  • Does this proposal require new package dependencies?
  • Would this change break backwards compatibility?
  • Does this proposal include a new model?
  • Does this proposal include a new dataset?
  • Does this proposal include a new task/workflow?

Related issues

No response

Solution description

The core idea would be to provide a consistent, "vectorized" or broadcastable interface for retrieving reference atomic and chemical properties, ranging from ionization energies to simple atomic symbol-number mappings. I don't know if we need a class-based implementation, but at the very least we should have functions that look like this:

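from mendeleev import element
from mendeleev.models import Element


def symbols_to_elements(atomic_symbols: list[str]) -> list[Element]:
    # Look up the mendeleev Element for each atomic symbol, e.g. "H" or "Fe".
    return [element(symbol) for symbol in atomic_symbols]


def retrieve_atomic_numbers(elements: list[Element]) -> list[int]:
    # Read the atomic number off each Element, preserving input order.
    return [el.atomic_number for el in elements]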

We can choose to abstract things out further, but some cases might get trickier (e.g. fully stripped ions).
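As a rough sketch of what a batched, class-based variant could look like: the `AtomicPropertyTable` name, its `get` method, and the pluggable `backend` argument are all hypothetical and not part of matsciml or mendeleev. The only mendeleev call assumed is `mendeleev.element(symbol)`, imported lazily so the sketch can be exercised with a stand-in backend.

```python
from functools import lru_cache
from typing import Any, Callable, Sequence


def _mendeleev_backend(symbol: str) -> Any:
    # Imported lazily so this module loads even where mendeleev is absent.
    from mendeleev import element

    return element(symbol)


class AtomicPropertyTable:
    """Hypothetical batched lookup of per-element reference properties."""

    def __init__(self, backend: Callable[[str], Any] = _mendeleev_backend) -> None:
        # Cache per-symbol lookups so repeated symbols hit the backend once.
        self._lookup = lru_cache(maxsize=None)(backend)

    def get(self, symbols: Sequence[str], attribute: str) -> list[Any]:
        # Return `attribute` for every symbol, preserving input order.
        return [getattr(self._lookup(symbol), attribute) for symbol in symbols]
```

With mendeleev installed, `AtomicPropertyTable().get(["H", "O", "H"], "atomic_number")` would be expected to return the atomic numbers in input order. Keeping the backend swappable is one possible way to support periodictable later without changing call sites.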

Additional notes

No response