
A set of programs to study hydrogen bond cycles

Primary language: Python. License: MIT.


Hydrocycler

See the accompanying preprint for an application example.

The Hydrocycler programs view the network of covalent bonds and H-bonds as a directed graph whose edges point in the direction of H-bond donation; that is,

[O-H> - - [O-H> - - [O-H> - -

is viewed as a graph of three nodes whose edges point rightward. Reversing the graph above gives

O] - - <H-O] - - <H-O] - - <H

Because neutrality is preferred over charge separation, it is hypothesized that most proton transfers occur in directed rings or cycles. By the reversal of cycles and subsequent successful optimization, polycyclomorphs are obtained. Proto-polycyclomorphs are the configurations before optimization.
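The idea of reversing a directed cycle can be illustrated with a minimal sketch. It assumes each molecule is an integer node and each H-bond is a `(donor, acceptor)` edge. The function names `simple_cycles` and `reverse_cycle` are hypothetical and are not taken from the Hydrocycler source. The cycle search here is a plain depth-first enumeration chosen for brevity, not Johnson's algorithm.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (donor node, acceptor node); an illustrative convention


def simple_cycles(edges: Set[Edge]) -> List[List[int]]:
    """Enumerate simple directed cycles with a plain DFS (not Johnson's algorithm).

    Each cycle is reported once, starting from its smallest node.
    """
    adj: Dict[int, List[int]] = {}
    for u, v in sorted(edges):
        adj.setdefault(u, []).append(v)

    cycles: List[List[int]] = []

    def dfs(start: int, node: int, path: List[int]) -> None:
        for nxt in adj.get(node, []):
            if nxt == start:
                cycles.append(path[:])  # closed the ring back at the start
            elif nxt > start and nxt not in path:
                path.append(nxt)
                dfs(start, nxt, path)
                path.pop()

    for start in sorted(adj):
        dfs(start, start, [start])
    return cycles


def reverse_cycle(edges: Set[Edge], cycle: List[int]) -> Set[Edge]:
    """Return a new edge set in which every edge along `cycle` is flipped."""
    ring = list(zip(cycle, cycle[1:] + cycle[:1]))
    return (edges - set(ring)) | {(v, u) for u, v in ring}


# A three-membered ring 0 -> 1 -> 2 -> 0 plus one H-bond (2 -> 3) outside the ring.
edges = {(0, 1), (1, 2), (2, 0), (2, 3)}
cycles = simple_cycles(edges)
reversed_edges = reverse_cycle(edges, cycles[0])
```

Every node in the ring gives up one donated hydrogen and receives one, so the reversal leaves each molecule neutral, while the edge outside the ring is untouched.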

  • Usage: python hydrocycler.py file.xyz

An interactive script that generates proto-polycyclomorphs. It finds cycles of hydrogen bonding within an (ideally optimized) molecular cluster and generates proto-polycyclomorphs: derivative molecular clusters obtained by reversing the direction of the molecular- and H-bonding. These derivative clusters then serve as input for subsequent energy minimization. The script is also an instructive way to understand hydrocycler_findall.py below.

The input is a Cartesian coordinate (.xyz) file, and the outputs are Cartesian coordinate files as well.
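For readers unfamiliar with the format, here is a minimal sketch of reading and writing the standard XYZ layout (atom count, comment line, then one `symbol x y z` line per atom). The helpers `parse_xyz` and `format_xyz` are hypothetical names for illustration and are not the routines used by Hydrocycler.

```python
from typing import List, Tuple

Atom = Tuple[str, float, float, float]  # element symbol and x, y, z in angstroms


def parse_xyz(text: str) -> List[Atom]:
    """Parse the contents of a single-frame XYZ file into a list of atoms."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])  # line 1: number of atoms
    atoms: List[Atom] = []
    for line in lines[2:2 + n_atoms]:   # line 2 is a free-form comment
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    return atoms


def format_xyz(atoms: List[Atom], comment: str = "") -> str:
    """Serialize atoms back to XYZ text."""
    body = [f"{s} {x:.6f} {y:.6f} {z:.6f}" for s, x, y, z in atoms]
    return "\n".join([str(len(atoms)), comment] + body) + "\n"


# A single water molecule as example input.
water = """3
water monomer
O 0.000000 0.000000 0.000000
H 0.960000 0.000000 0.000000
H -0.240000 0.930000 0.000000
"""
atoms = parse_xyz(water)
round_trip = parse_xyz(format_xyz(atoms, "water monomer"))
```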

  • Usage: python hydrocycler_findall.py file.xyz

hydrocycler_findall.py is the workhorse program and performs an exhaustive cycle reversal search. It is an automated script that generates .xyz files of proto-polycyclomorphs. It also generates a ledger of signatures (.sig) file that defines the H-bonding family to which the input file belongs. This ledger file can then be used to test other configurations for H-bonding family membership. (See hydrocycler_isamember.py below for usage.) For convenience, hydrocycler_findall.py also creates a bash script for executing the Gaussian .com files corresponding to the generated .xyz files. (To create .com files from .xyz files quickly, see gt-xyz2com.py in https://github.com/mihali/gt-x.)

  • Usage: python hydrocycler_isamember.py file0.sig fileA.xyz fileB.xyz ...

hydrocycler_isamember.py tests whether .xyz files have H-bonding cycles that fall under a family of H-bonding, as defined by a signature file generated by hydrocycler_findall.py.

Installation

Procedure

  1. git clone https://github.com/mihali/hydrocycler.git
  2. cd hydrocycler
  3. conda create --name hydrocycler python=3.7
  4. conda activate hydrocycler
  5. conda install numpy scipy

Method

H-bonds can be determined from the Cartesian coordinates of a cluster by measuring O-O distances between nearest neighbors and then taking H-O-O angles. A k-d tree (KDTree) provides a fast nearest-neighbor search. The information about each H-bond is held in a data structure called a "trio", which consists of the Cartesian coordinates of the two oxygens and of the one hydrogen in both its donor and acceptor roles. A cycle can therefore be reversed easily, because the coordinates after swapping have been precomputed.
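The detection step can be sketched as follows, using `scipy.spatial.cKDTree` for the neighbor search. This is an illustration under stated assumptions rather than Hydrocycler's implementation: the `Trio` layout, the function name `find_trios`, the cutoff values (3.5 Å, 30°, 1.2 Å), and the rule for the swapped hydrogen position (a point reflection through the O-O midpoint) are all hypothetical choices made for this example.

```python
from typing import List, NamedTuple

import numpy as np
from scipy.spatial import cKDTree


class Trio(NamedTuple):
    """Hypothetical record for one H-bond; the field layout is an assumption."""
    donor: int               # index of the donor oxygen
    hydrogen: int            # index of the bridging hydrogen
    acceptor: int            # index of the acceptor oxygen
    h_donor: np.ndarray      # hydrogen position while bonded to the donor
    h_acceptor: np.ndarray   # precomputed position after the swap


def find_trios(symbols, coords, oo_cutoff=3.5, angle_cutoff=30.0,
               oh_bond=1.2) -> List[Trio]:
    """Find O-H...O hydrogen bonds; all cutoffs are illustrative assumptions."""
    coords = np.asarray(coords, dtype=float)
    o_idx = [i for i, s in enumerate(symbols) if s == "O"]
    h_idx = [i for i, s in enumerate(symbols) if s == "H"]
    tree = cKDTree(coords[o_idx])
    trios: List[Trio] = []
    # query_pairs returns every oxygen pair closer than the O-O cutoff.
    for a, b in tree.query_pairs(r=oo_cutoff):
        for d, acc in ((o_idx[a], o_idx[b]), (o_idx[b], o_idx[a])):
            oo = coords[acc] - coords[d]
            for h in h_idx:
                oh = coords[h] - coords[d]
                if np.linalg.norm(oh) > oh_bond:
                    continue  # this hydrogen is not covalently bound to d
                cos = oh @ oo / (np.linalg.norm(oh) * np.linalg.norm(oo))
                angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
                if angle <= angle_cutoff:  # H-O-O angle test
                    # Assumed swap rule: reflect H through the O-O midpoint.
                    swapped = coords[d] + coords[acc] - coords[h]
                    trios.append(Trio(d, h, acc, coords[h].copy(), swapped))
    return sorted(trios, key=lambda t: (t.donor, t.hydrogen, t.acceptor))


# Water dimer: the first molecule donates one hydrogen to the second.
symbols = ["O", "H", "H", "O", "H", "H"]
coords = [[0.00, 0.00, 0.0], [0.96, 0.00, 0.0], [-0.24, 0.93, 0.0],
          [2.80, 0.00, 0.0], [3.04, 0.93, 0.0], [3.04, -0.93, 0.0]]
trios = find_trios(symbols, coords)
```

With the swapped position stored alongside the original, reversing a cycle amounts to replacing each `h_donor` coordinate by its `h_acceptor` counterpart for every trio in the ring.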


Dataset

There are hydrocycler input and output files, and Gaussian output results, in the DATASET.tar.gz file. To conserve space, Gaussian output files only contain extracts of the archive portions of successful Gaussian runs. (To make these more human readable, see gt-parsearchive.py in https://github.com/mihali/gt-x.)

References

  1. Uses johnson.py derived from https://github.com/qpwo/python-simple-cycles and modified to compute Tarjan's algorithm iteratively.
  2. Algorithm employed is from Johnson (1975) https://doi.org/10.1137/0204007