espressomd/espresso

Implement a Metatensor adaptor

jngrad opened this issue · 7 comments

Data flow

See Metatensor - Data flow

  • ESPResSo needs to export a neighbor list in the form of a vector of distance vectors
  • Metatensor sends back the central particle force, pressure tensor, energy, or observables like induced dipole

Metatensor expects distances, forces and energies in standard SI units. One has to set up the ESPResSo system such that e.g. the reduced unit is in Ångstroms, e.g. box_l=[100, 100, 100] means a box size of 100 Å. This can be achieved by designing a convenience function that leverages pint to create an empty ESPResSo system with the correct size, time step and energy units.

For domain decomposition: typically the ML model neighborhood radius is less than the interaction range. Thus we should be able to use metatensor in parallel without the need to tweak the ESPResSo domain decomposition and cell cutoff. One caveat: without LJ interaction the non-bonded cutoff will be zero, so we need to introduce a mechanism in the Metatensor plugin of the system call to adapt the min_cutoff cell system property.

Contact persons from the ML side: @SamTov, @Luthaf, @PicoCentauri

Roadmap

  • add pytorch dependency to the CI environment
  • include Metatensor via CMake FetchContent
  • write the C++ adaptor
    • adapt std::vector<int> nbhood(System::System const &system, Utils::Vector3d const &pos, double dist) to return a vector of particle distances (can be achieved with std::ranges::transform and a Particle to Vector3d projector)
    • use pytorch's autograd feature to compute the force as the gradient of the energy
  • introduce a Python API with the following input parameters:
    • ML model and any additional parameters, e.g. file paths
    • list of particle types for which the C++ core needs to export descriptors
      • which data structure would be suitable to represent the primary types and optional cross-terms?
    • figure out how to set system.min_global_cut
    • see Metatensor - Exporting models
  • write unit tests in the testsuite/python folder
  • write a convenience function to instantiate an ESPResSo system with the correct reduced units

See lammps/lammps#4163 for how Metatensor is being integrated into LAMMPS.

Metatensor expects distances, forces and energies in standard SI units.

Not quite, but it does have to be one of the named units in there if you want metatensor to perform unit conversion:
https://docs.metatensor.org/latest/atomistic/reference/models/index.html#known-quantities-and-units.

If some unit are left unset, metatensor will just pass data as-is from engine to model and respectively (which might fail if the model is not trained to work with the same units as espresso).

The rest of the plan here looks good to me!

I will start working on it.

  • Reading through the Lammps implementation, I guess we should begin by extracting the model loading and capabilities and requested neighbor list parsing.
  • then populate the System calss with data from Espresso
  • then populating the neighbor lists using Espresso's mechanism

Where can we find a good example (trained) model to try this on?

Where can we find a good example (trained) model to try this on?

We have been using https://github.com/Luthaf/metatensor-lj-test/ to test the interface to other simulation engines. It can define a LJ model for a single atomic type and export the model in the metatensor format.

We are currently going through the Lammps integration. Collecting our questions here.

@Luthaf could you maybe help us with those, either here on the issue, or in a video call? Many thanks!

General

  • what is the meaning of negative interaction ranges (checke for in PairMetatensor::init_style() (line 377)
  • in PairMetatensor::compute:
    • for the selected_atoms, what is the 2nd numbe? one is set ot i, the other to 0.

MetatensorSystemOptions

  • What is the format of the type mapping, and is that fromat an acutl requirement of Metatensor. In Espresso type numbers do not have to be contiguous. Can we use an std::map<int,int> instead?

MetaTensorNeighborData

  • in sample-t, what are i,j S_a, S_b, and S_c?
  • What are the caches needed for. Do have have to keep them, if the neighbor list code in ESPresSo makes sure, every pair is listed only once?
  • Can we use an std::variant to store either the 64 bit or 32 bit distances, or does metatensor expect both arrays to exist?

PairMetatensro

  • init_style: what does "all pairs" signify here?
  • compute() what does check consisty chekc?
  • compute)(: what is the role of energy, energy_detached, energy_blcoks energy_samples?

Provisoinal planning

  • More or less take over the structs from metattensor_syste.h
  • from metatensro_pair.h
    • Take over PairMetatensorData, removing the lammps dependency from the moel loading
  • Do not take over tha main pair style class. But the following indiviaul parts
    • Device setup from the settings function
    • Get interaction range form model and set in Espresso
    • Populating the metatensor system optiosn (init_style method)
    • Parsing the requested neighbor lists fro mthe model (init_style)
    • Populate evaulation optoins (particularly energy per atom) (PaierMetatensor::compute)
    • dtype selection
    • Populate Metatensro System from ESPResso system
    • most of the MetatensorPair::compute() method

Hey! I'm currently on vacation, but I'm happy to answer all your questions when I'm back. I should have some time the week of the 12th of August for a video chat!

I'm back and available for a chat about this, let me know when it is convenient for you!