Binding affinity in cb6-but-pbc very different
Hi, upon running the NB with Amber22, I am getting very different results from those precalculated in the NB file. Also, the binding_affinity variable for some reason contains the unit. Is this OK?
My result:
The binding affinity for butane and cucurbit[6]uril = -157.63 kilocalorie / mole +/- 11.34 kilocalorie / mole kcal/mol
@martinj80 The binding affinity variable should contain the unit and this is the intended behavior from v1.2.0. I suspect you are getting a different value because the simulations were very short in each window. This is indicated by the magnitude of the error = 11.34 kcal/mol. If you run it longer, perhaps 5ns per window, you would get a closer value to the pre-calculated value in the notebook.
I don't remember how long I ran the simulations in the NB, but I suspect I ran them much longer. One way to debug a production run is to look at the free energy for the different phases (attach, pull, release). If the free energy for a phase looks badly wrong, the next step is to view the trajectories for the windows in a GUI like VMD or PyMol. Most of the time, you can see what's wrong with the simulation (wrong choice of restraints, etc.).
print(free_energy.results["attach"]["ti-block"]["fe"])
print(free_energy.results["pull"]["ti-block"]["fe"])
I increased nstlim to 1 ns (500,000 steps), but the results are still horrible; not much changed. The trajectory seems fine, with no obvious problems. I am now running a 10 ns production step...
I thought the values in the README are what you'd get if you evaluated the code as written -- and of course we expect fluctuation in the mean ΔG given the high uncertainty, but seeing -160 kcal/mol is definitely outside of what I'd expect. I'm curious what a 10 ns production phase gives you. Is the restraint energy really really high?
Also, you could check the free energy profile across the pull phase. From memory, the free energy for each window is stored in fe_matrix, but I can't remember if you need to remove the units from the array first for matplotlib.
plt.plot(free_energy.results["pull"]["ti-block"]["fe_matrix"][0,:])
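In case the array does carry units, a minimal sketch for stripping them before plotting is shown here. It assumes the stored values behave like pint-style quantities that expose a `.magnitude` attribute; `to_plain_array` is a hypothetical helper name, not part of the library, and the short list is only a stand-in for the real fe_matrix row.

```python
import numpy as np


def to_plain_array(values):
    """Return a plain float array, dropping a pint-style unit wrapper if present.

    Assumption: unit-bearing objects expose `.magnitude`. Plain lists and
    NumPy arrays do not have that attribute and are passed through unchanged.
    """
    magnitude = getattr(values, "magnitude", values)
    return np.asarray(magnitude, dtype=float)


# Stand-in for free_energy.results["pull"]["ti-block"]["fe_matrix"][0, :]
profile = to_plain_array([0.0, 1.11, 4.31, 9.49])
print(profile)
```

The returned array can then be passed to plt.plot directly, whether or not the original object had units attached.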
You can also plot the histogram overlap between neighboring windows in the pull phase. Normally, there is a connection between skewed free energy values and the (lack of) overlap between histograms. I will need to go through my folders to find the notebook, I'll get back to you on this one.
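Until the notebook turns up, here is a rough sketch of one way to put a number on the overlap between neighboring windows. It is not the notebook's method: `histogram_overlap` is a hypothetical helper, and the samples are synthetic Gaussians standing in for the restrained distance values you would extract from each window's trajectory.

```python
import numpy as np


def histogram_overlap(samples_a, samples_b, bins=50):
    """Overlap coefficient of two sample sets: 1.0 is identical, 0.0 is disjoint."""
    a = np.asarray(samples_a, dtype=float)
    b = np.asarray(samples_b, dtype=float)
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    if hi == lo:
        # All samples are identical, so the distributions coincide.
        return 1.0
    # Shared bin edges so the two histograms are directly comparable.
    edges = np.linspace(lo, hi, bins + 1)
    pa, _ = np.histogram(a, bins=edges)
    pb, _ = np.histogram(b, bins=edges)
    pa = pa / pa.sum()
    pb = pb / pb.sum()
    return float(np.minimum(pa, pb).sum())


# Synthetic stand-ins for the sampled distance in three windows (Angstrom).
rng = np.random.default_rng(0)
w0 = rng.normal(6.0, 0.3, 5000)
w1 = rng.normal(7.0, 0.3, 5000)   # neighbor 1 Angstrom away
w2 = rng.normal(10.0, 0.3, 5000)  # far-away window

near = histogram_overlap(w0, w1)
far = histogram_overlap(w0, w2)
print(f"neighboring windows: {near:.3f}, distant windows: {far:.3f}")
```

A value close to zero between two adjacent windows would point to a gap in sampling along the pull coordinate.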
Cucurbiturils are known to be very rigid and highly polar at the entrance, which makes it hard for the guest molecule to leave the pocket (hence the unusually large binding affinities for hosts like CB7-8). In the original APR paper (SI of https://pubs.acs.org/doi/10.1021/acs.jctc.5b00405), they had to apply jack restraints on the CB7 to make the portal wider, which improved the overlap and/or convergence in the pull phase.
Even with 10 ns, the value is still similar. I checked the fe_matrix in the pull phase:
[-0.0, 1.11, 4.31, 9.49, 12.82, 13.61, 13.82, 12.59, 11.50, 10.62, 10.07, 10.17, 10.14, 9.920, 17.66, 46.76, 97.98, 171.3]
Does the problem occur in the last few windows or in the first ones (is the order ascending or descending)? I see no extreme restraint energies (all approx. 2-13) in the MD outputs, although the histograms (almost) do not overlap between the 4th and 5th pull windows. Otherwise, everything looks pretty much normal from what I can see.
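One way to see where the profile misbehaves is to look at the window-to-window increments of the values pasted above. The sketch below is only illustrative: the 5 kcal/mol threshold is an arbitrary choice, and the indices are 0-based positions in the list as printed, whatever direction the windows run in.

```python
import numpy as np

# Pull-phase profile as pasted above (kcal/mol).
fe = np.array([
    -0.0, 1.11, 4.31, 9.49, 12.82, 13.61, 13.82, 12.59, 11.50,
    10.62, 10.07, 10.17, 10.14, 9.920, 17.66, 46.76, 97.98, 171.3,
])

increments = np.diff(fe)  # change between consecutive windows
threshold = 5.0           # arbitrary cutoff, kcal/mol per window
is_large = np.abs(increments) > threshold

# Count the unbroken run of large increments at the end of the list.
n_trailing = 0
for flag in is_large[::-1]:
    if not flag:
        break
    n_trailing += 1

first_bad = len(fe) - n_trailing
print("increments:", np.round(increments, 2))
print("windows in the trailing run:", n_trailing)
print("first index of that run:", first_bad)
```

For this list, the run at the end covers the last four entries, starting at the 17.66 value.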
@martinj80 I think I know what the problem is. If you look at the free energy values for the pull phase, the free energy converges around the 10 kcal/mol value. However, it then jumps to 17.66 and then all the way up to 171.3 kcal/mol. I'm pretty sure this is because the water box is too small and the butane molecule is probably reaching the upper boundary of the box. There are two ways to recover the free energy:
- Redo the simulations for the last four windows with a larger rectangular box (maybe add around 500-1000 molecules in the leap command).
- Reduce the pull_distances length to 14 Angstrom, run the cells that generate the restraints definition and run the analysis again (you may need to delete or move the folders of the last 4 pull windows from the working directory)
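The second option could look roughly like the sketch below. Several things here are assumptions and not taken from the notebook: the 1 Angstrom window spacing, the value of `initial_distance`, and the `p000`-style folder names. Adjust them to match your setup.

```python
import numpy as np

# Assumed setup: 18 pull windows spaced 1 Angstrom apart, which matches the
# 18 values in the pasted profile. The offset is a hypothetical placeholder.
initial_distance = 6.0  # Angstrom, hypothetical
pull_distances = np.arange(0.0, 18.0, 1.0) + initial_distance

# Keep only the first 14 windows, i.e. drop the last four.
n_keep = 14
trimmed_distances = pull_distances[:n_keep]

# Folder names of the windows to move out of the working directory,
# assuming they are named p000, p001, ...
dropped_windows = [f"p{i:03d}" for i in range(n_keep, len(pull_distances))]

print("windows kept:", len(trimmed_distances))
print("last pull distance kept:", trimmed_distances[-1])
print("folders to move or delete:", dropped_windows)
```

After trimming, the restraint-generation cells and the analysis need to be rerun with the shortened array so that the window count matches the folders left on disk.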
Let me know if this helps fix the issue.
Thank you for the suggestions. Increasing the number of water molecules by 1000 improved it; with 50000 steps I got 13.99 +/- 1.54 kcal/mol.
Closing since the question has been answered.