[BUG] Issue with the approach described to combine EMLE and OpenMM-ML

Question

[BUG] Issue with the approach described to combine EMLE and OpenMM-ML

Closed this issue 3 months ago · 5 comments

Describe the bug

This is not a bug per se in Sire, but rather an issue in the approach described here for combining EMLE and OpenMM-ML. This approach works fine when lambda_interpolate=1, i.e., when electrostatic embedding is fully turned on. However, I’ve realised that it does not work as expected when lambda values other than 1 are used.

The proposed approach performs the following steps:

Sets up an EMLECalculator, QM engine, and so on, and gets the EMLE forces (engine.get_forces())
Zeroes the charges of the QM molecule
Writes AMBER coordinates and topology files
Reads those files back into OpenMM
Creates the MM system
Creates the mixed ML/MM system
Adds the EMLE force to the mixed ML/MM system.

The issue occurs in step 6, where the mixed ML/MM system is created using the charges from the AMBER files. This causes the CustomBondForce, which handles all intramolecular nonbonded interactions within the ML region and is necessary for ML ↔ MM interpolation, to be constructed using zero charges. As a result, all intramolecular MM electrostatic interactions vanish regardless of the lambda_interpolate value, leading to incorrect behavior (see the Sire-EMLE/OpenMM-ML (A) results below).

I believe the charges of the QM regions should only be zeroed after the mixed system is created. This would ensure that CustomBondForce is constructed using the correct set of charges and that solute-solvent electrostatic interactions are only handled by EMLE. For confirmation, refer to the Sire-EMLE/OpenMM-ML (B) results below.

Note that I am using two different lambda parameters here: one passed to the EMLECalculator, and the other used for ML/MM interpolation in OpenMM-ML (both of which I set to same value). This is done to ensure interpolation between electrostatic and mechanical embedding (and not only turning off of the EMLE potential).

	`lambda_interpolate=1`, electrostatic embedding	`lambda_interpolate=0`, mechanical embedding
Native Sire-EMLE	-1326697.81	-25252.56
Sire-EMLE/OpenMM-ML (A)	-1326697.81	-25118.81
Sire-EMLE/OpenMM-ML (B)	-1326697.81	-25252.16

Values in kJ/mol.

To Reproduce

Native Sire-EMLE

import sire as sr
from emle.calculator import EMLECalculator

# Set up the EMLE calculation
mols = sr.load_test_files("ala.crd", "ala.top")
calculator = EMLECalculator(device="cpu", backend="torchani")
qm_mols, engine = sr.qm.emle(mols, mols[0], calculator, "12A", 20)

# Set up the dynamics
d = qm_mols.dynamics(
    timestep="1fs",
    constraint="none",
    qm_engine=engine,
    platform="cpu",
    cutoff="12A",
    lambda_interpolate=1,
)

# Get the OpenMM context, system, integrator, and platform
omm_context = d._d._omm_mols

# Get the potential energy
print(omm_context.getState(getEnergy=True).getPotentialEnergy())

Sire-EMLE/OpenMM-ML (a)

import numpy as np
import openmm as mm
import sire as sr
from emle.calculator import EMLECalculator
from openmmml import MLPotential

mols = sr.load_test_files("ala.crd", "ala.top")

# Get the MM charges of the QM molecule and save the QM molecule to AMBER files
mm_charges = np.asarray([atom.charge().value() for atom in mols[0].atoms()])
sr.save(mols[0], "qm", ["prm7"])

# Set up the EMLE calculation
lambda_interpolate = 1
calculator = EMLECalculator(
    backend=None,
    device="cpu",
    lambda_interpolate=lambda_interpolate,
    mm_charges=mm_charges,
    parm7="qm.prm7",
    qm_indices=list(range(mols[0].num_atoms())),
)

qm_mols, engine = sr.qm.emle(mols, mols[0], calculator, "12A", 20)
emle_force, interpolation_force = engine.get_forces()

# Zero the charge of the QM molecule
qm_mols = sr.qm.zero_charge(qm_mols, qm_mols[0])

# Write AMBER files
sr.save(qm_mols, "ala_qm", ["prm7", "rst7"])

# Load the AMBER files into OpenMM
prmtop = mm.app.AmberPrmtopFile("ala_qm.prm7")
inpcrd = mm.app.AmberInpcrdFile("ala_qm.rst7")

# Create the OpenMM system
mm_system = prmtop.createSystem(
    nonbondedMethod=mm.app.PME,
    nonbondedCutoff=12 * mm.unit.angstrom,
    constraints=mm.app.HBonds,
    ewaldErrorTolerance=1e-4,
)

# Create the mixed ML/MM system and add the EMLE and interpolation forces
ml_atoms = list(range(qm_mols[0].num_atoms()))
potential = MLPotential("ani2x")
ml_system = potential.createMixedSystem(
    prmtop.topology, mm_system, ml_atoms, interpolate=True
)
ml_system.addForce(emle_force)
ml_system.addForce(interpolation_force)

# Turn off the dispersion correction to match Sire's calculation
for force in ml_system.getForces():
    if isinstance(force, mm.NonbondedForce):
        force.setUseDispersionCorrection(False)

# Create the OpenMM context and set the positions
context = mm.Context(ml_system, mm.VerletIntegrator(1.0))
context.setPositions(inpcrd.positions)
context.setParameter("lambda_interpolate", lambda_interpolate)

print("Total energy:", context.getState(getEnergy=True).getPotentialEnergy())

Sire-EMLE/OpenMM-ML (B)

import numpy as np
import openmm as mm
import sire as sr
from emle.calculator import EMLECalculator
from openmmml import MLPotential

mols = sr.load_test_files("ala.crd", "ala.top")

# Get the MM charges of the QM molecule and save the QM molecule to AMBER files
mm_charges = np.asarray([atom.charge().value() for atom in mols[0].atoms()])
sr.save(mols[0], "qm", ["prm7"])

# Set up the EMLE calculation
lambda_interpolate = 1
calculator = EMLECalculator(
    backend=None,
    device="cpu",
    lambda_interpolate=lambda_interpolate,
    mm_charges=mm_charges,
    parm7="qm.prm7",
    qm_indices=list(range(mols[0].num_atoms())),
)

qm_mols, engine = sr.qm.emle(mols, mols[0], calculator, "12A", 20)
emle_force, interpolation_force = engine.get_forces()

# Write AMBER files
sr.save(qm_mols, "ala_qm", ["prm7", "rst7"])

# Load the AMBER files into OpenMM
prmtop = mm.app.AmberPrmtopFile("ala_qm.prm7")
inpcrd = mm.app.AmberInpcrdFile("ala_qm.rst7")

# Create the OpenMM system
mm_system = prmtop.createSystem(
    nonbondedMethod=mm.app.PME,
    nonbondedCutoff=12 * mm.unit.angstrom,
    constraints=mm.app.HBonds,
    ewaldErrorTolerance=1e-4,
)

# Create the mixed ML/MM system and add the EMLE and interpolation forces
ml_atoms = list(range(qm_mols[0].num_atoms()))
potential = MLPotential("ani2x")
ml_system = potential.createMixedSystem(
    prmtop.topology, mm_system, ml_atoms, interpolate=True
)
ml_system.addForce(emle_force)
ml_system.addForce(interpolation_force)

# Turn off the dispersion correction to match Sire's calculation
# Set the MM charges to zero
for force in ml_system.getForces():
    if isinstance(force, mm.NonbondedForce):
        for i in ml_atoms:
            _, sigma, epsilon = force.getParticleParameters(i)
            force.setParticleParameters(i, 0, sigma, epsilon)
        force.setUseDispersionCorrection(False)

# Create the OpenMM context and set the positions
context = mm.Context(ml_system, mm.VerletIntegrator(1.0))
context.setPositions(inpcrd.positions)
context.setParameter("lambda_interpolate", lambda_interpolate)

print("Total energy:", context.getState(getEnergy=True).getPotentialEnergy())

Answer 1 · 2024-08-14T12:53:56.000Z

Thanks for this, I'll update the docs accordingly. In this case, it should probably be safe to just ignore qm_mols entirely and use the original mols as the reference for OpenMM-ML, i.e. only use qm_mols when working with Sire or OpenMM directly, not OpenMM-ML.

Answer 2 · 2024-08-14T13:00:51.000Z

Yes, indeed. OpenMM-ML should be given the original system without any modifications, as it internally takes care of everything. Any necessary modifications must be performed after the mixed system is created.

Answer 3 · 2024-08-14T13:08:48.000Z

I've now updated the section in the tutorial.

Answer 4 · 2024-08-14T13:12:06.000Z

I think it's still necessary to mention that the charges of the ML region must be zeroed after the mixed system is created (see code for approach B), otherwise, mechanical embedding is always turned on.

Answer 5 · 2024-08-14T13:24:51.000Z

Ah yes, sorry, jumping between too many things!