Constructing interpolation paths between molecular geometries to obtain reaction path.
Traditional interpolation methods encounter difficulty when it comes to redundant internal coordinate spaces, because the feasible physical space compose only a very small and highly curved subspace. In this method, we avoid the problem of feasibility by operating strictly in feasible space, and bring in the benefit of internal coordinates through proper application of the corresponding metric tensor. With this new formulation, we view the configuration space as a Riemannian manifold with a metric generated from a set of internal coordinates. The interpolation paths are defined as geodesic curves on such manifolds. In other words the integrated total coordinate change is minimized. Such a definition ensures that the constructed paths are smooth and well behaved. The package is also used for smoothing discontinuous or noisy trajectories obtained from MD simulations.
It has been shown that the method generate smooth paths with reasonable barrier height even for highly complex reactions, such as protein unfolding or concerted cycloaddition reactions with many simutaneous ring formations. The default coordinate system uses Morse scaled pair-wise distances. The lengths in such coordinate systems have the physical meaning of the total number of bond changes along the path.
This is a pure python implementation, so it is not optimized for speed, but rather is intended to serve as a reference implementation of the algorithms described in the paper. Still, interpolating systems with ~1000 atoms should not be a problem.
- geodesic_interpolate Python package for interpolation and smoothing by finding geodesic curves with redundant internal metrics
__init__.py
Python package file__main__.py
Standalone script for performing interpolation and smoothings.geodesic.py
Computation and minimization of path length in redundant internal metrics. This is used to optimize a path way to find a geodesic. Cannot change the number of images during optimizationinterpolation.py
Generating approximate interpolation points to be used as starting guess for the geodesic optimizations. Recursively attempt to perform bisections on largest segment in internal space. Generally will create path with discontinuities, so a smoothing process need to follow this.coord_utils.py
Coordinate utilities. A simplified version of Nanoreactor coordinates module which provide a scaled interatomic distance coordinate, pair screening based on threshold and connectivity, as well as trans-rotational alignment based on Kabsch algorithm.fileio.py
XYZ file reading and writing
- setup.py Installation script. This will install both the Python package, which can be imported by name
geodesic_interpolate
and the standalone script, which is also namedgeodesic_interpolate
- test_cases A few test cases used to check the performance of the code. Note that the large ones may take a few minutes
thanks to Python. Especially need to run them when testing out alternative coordinate scaling methods.
H+CH4_CH3+H2.xyz
A simple test case. Should always workDielsAlder.xyz
Dehydro-Diels-Alder reaction. This is an important test case because the initial structure is planar symmetric and could access both the final structure and its mirror image, which as exactly the same internal coordinates. Proper monitoring of geodesic length during the interpolation process is therefore crucial for this to work, without which the raw interpolated path will jump between mirror images and the optimized path would contain some flopping. Also tests the non-sweeping global path optimization algo in a relatively large system.TrpCage_unfold.xyz
Unfolding a Trp-cage mini-protein. Need at least 10 images to work. Folded geometry taken from Stefan''s test directory which is instead taken from Nathan. Unfolded structure is generated by force navigated MD for 1ns under 1nN on C and N terminal with ReaxFF, then optimizing without force at 6-31g/b3lyp level. For testing many simultaneous large amplitude motions.collagen.xyz
Interpolate between the solution and crystal structure of collagen Kunitz domain 1kun - 1kth. Solvents removed from the PDB structures. Tests collision avoidance for large movements in the core of folded protein. Other groups should slightly breath to create room for the part that changes.calcium_binding.xyz
Binding two Ca2+ ions to the yeast Calmodulin N terminal domain 1f54 -1f55. The apo structure did not have ions so two of which were added by hand at random locations away from the protein. It seem to be hard to avoid large movements of the Ca2+ cations when they are very far away from the protein but once in contact they should move smoothly. This is to test if the interpolater can correctly route and find the entry point and connect the path.
Python : >2.7 or >3.5
Numpy : Tested with 1.13
Scipy : Tested with 0.19
The package can be used without installation from the package directory with
python -m geodesic_interpolate filename ...
To use the script from arbitrary location or to use the Python module, use setup tools:
python setup.py install
This will install a Python package geodesic_interpolate
and a standalone script geodesic_interpolate
.
The package can be involved from arbitrary location using the aforementioned command line after installation,
and a standalone script with the same signature can also be used
geodesic_interpolate filename.xyz --output output.xyz --nimages 20 ...
usage: geodesic_interpolate [-h] [--nimages NIMAGES] [--sweep] [--no-sweep] [--output OUTPUT] [--tol TOL] [--maxiter MAXITER] [--microiter MICROITER] [--scaling SCALING] [--friction FRICTION] [--dist-cutoff DIST_CUTOFF] [--logging {DEBUG,INFO,WARNING,ERROR}] [--save-raw SAVE_RAW] filename
Interpolates between two geometries
positional arguments:
- filename XYZ file containing geometries. If the number of images is smaller than the desired number, interpolation points will be added. If the number is greater, subsampling will be performed.
optional arguments:
-h
,--help
show this help message and exit--nimages NIMAGES
Number of images. (default: 17)--sweep
Sweep across the path optimizing one image at a time, instead of moving all images at the same time. Default is to perform sweeping updates if there are more than 30 atoms. (default: None)--no-sweep
Do not perform sweeping. (default: None)--output OUTPUT
Output filename. Default is interp.xyz (default: interpolated.xyz)--tol TOL
Convergence tolerance (default: 0.002)--maxiter MAXITER
Maximum number of minimization iterations (default: 15)--microiter MICROITER
Maximum number of micro iterations for sweeping algorithm. (default: 20)--scaling SCALING
Exponential parameter for morse potential (default: 1.7)--friction FRICTION
Size of friction term used to prevent very large change of geometry. (default: 0.01)--dist-cutoff DIST_CUTOFF
Cut-off value for the distance between a pair of atoms to be included in the coordinate system. (default: 3)--logging {DEBUG,INFO,WARNING,ERROR}
Logging level to adopt [ DEBUG | INFO | WARNING | ERROR ] (default: INFO)--save-raw SAVE_RAW
When specified, save the raw path after bisections be before smoothing. (default: None)