xyztraj: A Python repository from jro1234

XYZTraj reads xyz trajectories in Python and prepares the data for other Python-based trajectory analysis tools. We call this 'featurizing' the data, i.e. creating a new trajectory by applying a calculation to every frame of the original trajectory.

You can clone this repository and install the package with

git clone https://github.com/jrossyra/xyztraj
cd xyztraj
python setup.py install

Two objects are involved in retrieving data: an XYZReader that reads the data from a file, and the XYZTrajectory object that the reader creates and returns. Once you have the trajectory object, you can apply feature calculations with the Featurizer object to create a new trajectory in feature space.

Read an xyz trajectory file like this:

from xyztraj import XYZReader

reader = XYZReader()
# An XYZTrajectory
traj = reader.readfile('mytraj.xyz')
# A numpy array of the coordinates
traj.trajectory
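To make the input format concrete, here is a minimal sketch of parsing a multi-frame xyz file by hand. It is not how XYZReader is implemented; the function name read_xyz_frames and the (frames, atoms, 3) array layout are assumptions made for illustration, and the layout of traj.trajectory may differ. Each frame in an xyz file is an atom count, a comment line, then one "element x y z" line per atom.

```python
import io
import numpy as np

def read_xyz_frames(handle):
    """Parse a multi-frame xyz stream into an (m, n_atoms, 3) array.

    Hypothetical helper for illustration, not part of xyztraj.
    """
    frames = []
    while True:
        header = handle.readline()
        if not header.strip():
            break  # end of file, or a trailing blank line
        n_atoms = int(header)
        handle.readline()  # comment line, ignored here
        coords = []
        for _ in range(n_atoms):
            fields = handle.readline().split()
            # fields[0] is the element symbol, fields[1:4] are x, y, z
            coords.append([float(x) for x in fields[1:4]])
        frames.append(coords)
    return np.array(frames)

# Two frames of a two-atom molecule, held in memory instead of a file
example = io.StringIO(
    "2\nframe 0\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
    "2\nframe 1\nH 0.0 0.0 0.0\nH 0.0 0.0 0.80\n"
)
coords = read_xyz_frames(example)
print(coords.shape)  # (2, 2, 3)
```

The sketch assumes every frame has the same number of atoms, which is what lets the result be a single rectangular array.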

Then you can featurize your trajectory with the calculations we provide, or with your own simple (or complex) function that calculates something from atomic coordinates. There are several ways to provide the featurizing function; here we just give the names of functions in the xyztraj.features package.

from xyztraj.features import Featurizer

dihedral_atoms = [0,10,11,3]
keep_position_atoms = [2,4,6]
features = {'dihedral': dihedral_atoms, 'nofeature': keep_position_atoms}

featurizer = Featurizer(traj.trajectory)
featurizer.add_features(features)
featurizer.featurize()
featuretraj = featurizer.trajectory

# (mxd) shape of m frames by d feature dimensions
featuretraj.shape
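For intuition about what a feature calculation does, here is a rough sketch of a dihedral computed for every frame at once with numpy. It is only an illustration of "one calculation applied to all frames": the function name dihedral_angles, the (frames, atoms, 3) input layout, the degree units, and the sign convention are all assumptions, and the 'dihedral' feature in xyztraj.features may differ on any of them.

```python
import numpy as np

def dihedral_angles(coords, atoms):
    """Dihedral angle in degrees for each frame.

    coords: (m, n_atoms, 3) array, atoms: four atom indices.
    Hypothetical helper for illustration, not part of xyztraj.
    """
    p0, p1, p2, p3 = (coords[:, i, :] for i in atoms)
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    # unit vector along the central bond
    b1 = b1 / np.linalg.norm(b1, axis=1, keepdims=True)
    # components of b0 and b2 perpendicular to the central bond
    v = b0 - np.sum(b0 * b1, axis=1, keepdims=True) * b1
    w = b2 - np.sum(b2 * b1, axis=1, keepdims=True) * b1
    x = np.sum(v * w, axis=1)
    y = np.sum(np.cross(b1, v) * w, axis=1)
    return np.degrees(np.arctan2(y, x))

# Frame 0 is a planar cis arrangement, frame 1 is twisted by a quarter turn
coords = np.array([
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 1.0]],
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 1.0]],
])
angles = dihedral_angles(coords, [0, 1, 2, 3])
# one column of a feature trajectory: m frames by 1 feature dimension
column = angles.reshape(-1, 1)
print(column.shape)  # (2, 1)
```

The result has one value per frame, which is why each feature like this contributes one dimension to the (m x d) feature trajectory.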

The featurizing functions in features, or your own functions, can be given in place of the function-name strings shown above. You can also keep featurizing the same trajectory; each new feature is appended to the feature space.

from xyztraj.features import distance

distance_atoms = [0,1]
featurizer.add_features({distance: distance_atoms})
featurizer.featurize()

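# fetch the result again in case a new array was created;
# it is larger by the 1 feature dimension we just added
featuretraj = featurizer.trajectory
featuretraj.shape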
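Appending to the feature space can be pictured as stacking one more column onto an (m x d) array. The sketch below shows that idea with plain numpy; pair_distance and the placeholder arrays are hypothetical, and this is not a claim about how Featurizer stores its data internally.

```python
import numpy as np

def pair_distance(coords, atoms):
    """Distance between two atoms in every frame; coords is (m, n_atoms, 3).

    Hypothetical helper for illustration, not the xyztraj distance function.
    """
    i, j = atoms
    return np.linalg.norm(coords[:, i, :] - coords[:, j, :], axis=1)

rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 4, 3))  # 5 frames, 4 atoms, made-up data
features = np.zeros((5, 2))          # stand-in for an existing feature space

# compute one value per frame, shape it as a column, and append it
new_column = pair_distance(coords, [0, 1]).reshape(-1, 1)
features = np.hstack([features, new_column])
print(features.shape)  # (5, 3)
```

Note that np.hstack returns a new array, so any variable still pointing at the old array keeps the old shape.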

Here we show the good practice of enforcing some input structure, so that erroneous input that happens to calculate without error does not silently give you a misleading result. The coordinates are flattened, so the expected width is 3 times the number of atom indices given.

import numpy as np

def calc_weirdfeature(atomcoordinates):
    # unknown frames is shape[0], 3xNatoms is shape[1]
    # 3 atoms are given below, so we expect 3 x 3 = 9 columns
    assert atomcoordinates.shape[1] == 9
    return np.mean(atomcoordinates)

weirdfeature_atoms = [13,11,15]
weirdfeature = {calc_weirdfeature: weirdfeature_atoms}

featurizer.add_features(weirdfeature)
featurizer.featurize()
featuretraj = featurizer.trajectory
featuretraj.shape
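The factor of 3 comes from flattening the x, y, z coordinates of each selected atom into one row per frame. As a small sketch of that bookkeeping (select_flat is a hypothetical helper, and the (frames, atoms, 3) starting layout is an assumption, not a description of xyztraj internals):

```python
import numpy as np

def select_flat(coords, atoms):
    """Pick atoms from (m, n_atoms, 3) and flatten to (m, 3 * len(atoms)).

    Hypothetical helper for illustration, not part of xyztraj.
    """
    return coords[:, atoms, :].reshape(coords.shape[0], -1)

# 2 frames of 16 atoms, filled with consecutive numbers so rows are easy to read
coords = np.arange(2 * 16 * 3, dtype=float).reshape(2, 16, 3)
atoms = [13, 11, 15]
flat = select_flat(coords, atoms)

# the width a feature function should check for
expected_width = 3 * len(atoms)
print(flat.shape)  # (2, 9)
```

Each row holds x, y, z for the first atom in the list, then the second, and so on, so the order of the atom indices matters to any function that reads specific columns.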
