
PENSA (Protein Ensemble Analysis): a collection of Python methods for exploratory analysis and comparison of protein structural ensembles.

Primary language: Python. License: MIT.

PENSA - Protein Ensemble Analysis


A collection of Python methods for exploratory analysis and comparison of protein structural ensembles, e.g., from molecular dynamics simulations. All functionality is available as a Python package.

To get started, see the documentation, which includes a tutorial for the PENSA library.

If you would like to contribute, check out our contribution guidelines and our to-do list.

Functionality

With PENSA, you can (currently):

  • compare structural ensembles of proteins via the relative entropy of their features, statistical tests, or state-specific information and visualize deviations on a reference structure.
  • project several ensembles on a joint reduced representation using principal component analysis (PCA) or time-lagged independent component analysis (tICA) and sort the structures along the obtained components.
  • cluster structures across ensembles via k-means or regular-space clustering and write out the resulting clusters as trajectories.
  • trace allosteric information flow through a protein using state-specific information analysis methods.
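
To illustrate the first item in the list above, here is a minimal sketch of comparing two ensembles feature by feature with the Jensen-Shannon distance, a symmetrized form of relative entropy. It is not PENSA's implementation: the function name `js_distance_per_feature`, the `n_bins` parameter, and the synthetic input data are all assumptions made for this example, and only NumPy is used.

```python
import numpy as np


def js_distance_per_feature(feat_a, feat_b, n_bins=10):
    """Jensen-Shannon distance between two ensembles, one value per feature.

    feat_a, feat_b: arrays of shape (n_frames, n_features).
    The number of frames may differ, the number of features may not.
    Hypothetical helper for illustration, not part of the PENSA API.
    """
    feat_a = np.asarray(feat_a, dtype=float)
    feat_b = np.asarray(feat_b, dtype=float)
    distances = np.zeros(feat_a.shape[1])
    for i in range(feat_a.shape[1]):
        # Shared bin edges so that both histograms are directly comparable.
        both = np.concatenate([feat_a[:, i], feat_b[:, i]])
        edges = np.linspace(both.min(), both.max(), n_bins + 1)
        p = np.histogram(feat_a[:, i], bins=edges)[0].astype(float)
        q = np.histogram(feat_b[:, i], bins=edges)[0].astype(float)
        p /= p.sum()
        q /= q.sum()
        m = 0.5 * (p + q)
        # Kullback-Leibler terms against the mixture m.
        # Empty bins contribute zero by convention.
        kl_pm = np.sum(p[p > 0] * np.log2(p[p > 0] / m[p > 0]))
        kl_qm = np.sum(q[q > 0] * np.log2(q[q > 0] / m[q > 0]))
        distances[i] = np.sqrt(max(0.5 * (kl_pm + kl_qm), 0.0))
    return distances


# Synthetic stand-in for two featurized ensembles (assumed data):
# feature 0 follows the same distribution in both, feature 1 is shifted.
rng = np.random.default_rng(0)
ensemble_a = np.column_stack([rng.normal(0.0, 1.0, 2000),
                              rng.normal(0.0, 1.0, 2000)])
ensemble_b = np.column_stack([rng.normal(0.0, 1.0, 2000),
                              rng.normal(3.0, 1.0, 2000)])
jsd = js_distance_per_feature(ensemble_a, ensemble_b)
print(jsd)
```

With base-2 logarithms the distance lies between 0 (identical distributions) and 1 (no overlap), so the shifted feature stands out with a clearly larger value than the unchanged one. Per-feature values of this kind are what one would map onto a reference structure to visualize deviations.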

Proteins are featurized via PyEMMA using backbone torsions, sidechain torsions, or backbone C-alpha distances, making PENSA compatible with all functionality available in PyEMMA. In addition, we provide density-based methods to featurize water and ion pockets.
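
For readers unfamiliar with torsion features, the following sketch shows how a single torsion (dihedral) angle can be computed from four atom positions. It is a generic NumPy illustration of the geometry, not the code path used by PyEMMA or PENSA. The function name `dihedral` and the example coordinates are assumptions.

```python
import numpy as np


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four points.

    The angle is measured around the central bond p1 -> p2.
    Hypothetical helper for illustration only.
    """
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    # Unit vector along the central bond.
    b1 /= np.linalg.norm(b1)
    # Components of the outer bonds perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    # The angle between v and w in the plane perpendicular to b1.
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))


# Example coordinates (assumed): central bond along the z axis.
cis = dihedral([1, 0, 0], [0, 0, 0], [0, 0, 1], [1, 0, 1])
trans = dihedral([1, 0, 0], [0, 0, 0], [0, 0, 1], [-1, 0, 1])
quarter = dihedral([1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1])
print(cis, trans, quarter)
```

Because torsions are periodic, analyses often work with their sine and cosine rather than the raw angle, so that values near -180 and +180 degrees are treated as neighbors.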

Trajectories are processed and written using MDAnalysis. Plots are generated using Matplotlib.

Documentation

PENSA's documentation pages provide installation instructions, API documentation, and a tutorial.

Example Scripts

For the most common applications, example Python scripts are provided. We show how to run the example scripts in a short separate tutorial.

Demo on Google Colab

We demonstrate how to use the PENSA library in an interactive and animated example on Google Colab, where we use freely available simulations of a mu-Opioid Receptor from GPCRmd.


Citation

General citation, representing the "concept" of the software:

Martin Vögele, Neil Thomson, Sang Truong, Jasper McAvity. (2021). PENSA. Zenodo. http://doi.org/10.5281/zenodo.4362136

To get the citation and DOI for a particular version, see Zenodo.

Acknowledgments

Contributors

Martin Vögele, Neil Thomson, Sang Truong, Jasper McAvity

Beta-Testers

Alex Powers, Lukas Stelzl, Nicole Ong, Eleanore Ocana, Callum Ives

Funding & Support

This project was started by Martin Vögele at Stanford University, supported by an EMBO long-term fellowship (ALTF 235-2019), as part of the INCITE computing project 'Enabling the Design of Drugs that Achieve Good Effects Without Bad Ones' (BIP152).