shirtsgroup/cg_openmm

Deal with energy spikes while maintaining large time step

cwalker7 opened this issue · 9 comments

When using a 10 or 20fs timestep, in the course of a simulation there will be energy spikes, up to the order of 1E12 kJ/mol. This is more prevalent for higher epsilon value (~2kJ/mol and higher) and at higher temperature (>350K). This causes invalid heat capacities, as well as other issues in the analysis. I looked further into what was happening, and the structures corresponding to the extremely high energy are badly distorted (bond lengths of 100's of angstrom, etc). So it is not simply an error with the reporting of energies.

It also doesn't seem like it is an issue with the exchanges. Looking more closely at one case, one spike spans a period of 6 frames (600 time steps), upon which the system recovers, and there are no exchanges accepted involving that replica. The LangevinDynamicsMove we are using is not metropolized, a simulation period of 100 time steps will always be accepted, even if the energy blows up. It is only the exchange moves that are accepted/rejected.

So far I have tried using the GHMCMove instead (https://openmmtools.readthedocs.io/en/0.18.1/api/generated/openmmtools.mcmc.GHMCMove.html#openmmtools.mcmc.GHMCMove), which is a metropolized form of Langevin dynamics and also uses the BAOAB scheme (recall our discussion here choderalab/openmmtools#480), and it seems to be working with the 20fs step (no energy spikes at all). Will need to test it out on more force field parameter combinations.

GHMCMove is quite a bit slower than LangevinDynamicsMove (about 3x slower), yet the alternative of moving down from 20fs to 5fs time step with LangevinDynamicsMove would be worse. Other options are:

  • Sticking with LangevinDynamicsMove and filtering out data associated with unusually high energy, defined by some cutoff - not ideal.
  • Try LangevinSplittingDynamicsMove (non-metropolized but uses BAOAB scheme)

So it looks like the reason for energy spikes was due to the harmonic bonds being too weak relative to the 5kJ/mol epsilon parameter I was using. Increasing from k_bond=1,000kJ/mol/nm^2 to 10,000 kJ/mol/nm^2 seems to solve the bad overlap issue. This is the improvement in the sidechain-backbone rdf:

bb_sc_rdf_angle_off_torsion_off_bond1k_step5fs

sc_bb_rdf_angle_off_torsion_off_bond10k_step5fs

For the weaker bond force constant, the beads were nearly on top of eachother to double up the strong nonbond interactions.

Turning on the angles and torsions, and using the 10,000 kJ/mol/nm^2 bonds, we see no issues with the energy or rdf using a 20fs time step (as we would expect - even with those stiff bonds, the period of vibration for the bonds using masses of 100 amu is ~400fs). It is a bit surprising that the 1000 kJ/mol/nm^2 was so bad, since it is close to the Martini value of 1250 kJ/mol/nm^2 for 72 amu particles, but values of up to 20,000 kJ/mol/nm^2 are used in literature to mimic constrained bonds.

The nonbonded exceptions look fine, though I did find a bug for the case of include_angle_forces=False, where the 1-3 interactions were still being excluded. Will fix that in a later PR.

Great detective work, @cwalker7!

t is a bit surprising that the 1000 kJ/mol/nm^2 was so bad, since it is close to the Martini value of 1250 kJ/mol/nm^2

But they are generally using lower epsilon.

Yep - I think we should be able to use the faster LangevinDynamicsMove with 20fs step with the stiff bonds now. I also tried 2,000 and 5,000 kJ/mol/nm^2 bonds, and neither is strong enough to avoid the spikes with eps=5kJ/mol.

seems to solve the bad overlap issue. This is the improvement in the sidechain-backbone rdf:

So the large accumulation of sidechain/backbone RDF at low energy is 1-2 connected sidechain/backbones? The LG 1-2 interactions are off, so there is no LJ attraction between these two - but does packing them together like this increase the total amount of LJ interaction between 1-4 and greater LJ contacts? I'm just trying to get a good physical picture.

close to the Martini value of 1250 kJ/mol/nm^2

Note that the larger bond lengths in Martini might also help, as with large bonds, you can't collapse sidechains and backbones down together.

I also tried 2,000 and 5,000 kJ/mol/nm^2 bonds, and neither is strong enough to avoid the spikes with eps=5kJ/mol.
Good checking.

I think we should be able to use the faster LangevinDynamicsMove with 20fs step with the stiff bonds now

Just wondering (since it's sometimes useful to run things until they break) what happens with 25 or 30 fs?

Just wondering (since it's sometimes useful to run things until they break) what happens with 25 or 30 fs?

Haven't tried to larger time step yet but it could work, good idea.

So the large accumulation of sidechain/backbone RDF at low energy is 1-2 connected sidechain/backbones? The LG 1-2 interactions are off, so there is no LJ attraction between these two - but does packing them together like this increase the total amount of LJ interaction between 1-4 and greater LJ contacts?

Yeah I would imagine it does. If we turn the bond angles off, then they will go towards the backbone bead to interact will both 1-3 backbone beads. The 1-4 interactions are harder to picture. With angles on we see strange grouping of beads like this:

24mer_angles_on_bond_1k

Just wondering (since it's sometimes useful to run things until they break) what happens with 25 or 30 fs?

30fs leads to lots of spiking but can usually recover.
40fs fails very quickly

I'd say 20fs is a good choice then.

Sounds like a plan. We should keep thinking about overall energetic setups that might help us identify better ways to put together CG models - but this sounds like it works for now just fine.