Protein Graph Library
This package produces several types of graph-based representations of proteins. It is compatible with standard formats and also provides graph objects designed for ease of use in deep learning.
from graphein.construct_graphs import ProteinGraph
# Initialise ProteinGraph class
pg = ProteinGraph(granularity='CA', insertions=False, keep_hets=True,
                  node_featuriser='meiler', get_contacts_path='/Users/arianjamasb/github/getcontacts',
                  pdb_dir='examples/pdbs/',
                  contacts_dir='examples/contacts/',
                  exclude_waters=True, covalent_bonds=False, include_ss=True)
# Create graph. Chain selection is either 'all' or a list e.g. ['A', 'B', 'D'] specifying the polypeptide chains to capture
graph = pg.dgl_graph('3eiy', chain_selection='all')
granularity: {'CA', 'CB', 'atom'} - specifies node-level granularity of graph
insertions: bool - keep atoms with multiple insertion positions
keep_hets: bool - keep hetatoms
node_featuriser: {'meiler', 'kidera'} - low-dimensional embeddings of amino acid physicochemical properties
pdb_dir: path to pdb files
contacts_dir: path to contact files generated by get_contacts
get_contacts_path: path to GetContacts installation
exclude_waters: bool - exclude structural waters (set to False to retain them)
covalent_bonds: bool - include covalent bond edges; if False, use only intramolecular interaction edges
include_ss: bool - calculate protein secondary structure (SS) and surface features using DSSP and assign them as node features
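The option values listed above can be checked before a ProteinGraph is constructed. The following is a minimal sketch only: `check_protein_graph_options` and `check_chain_selection` are hypothetical helpers, not part of graphein, and they encode just the choices stated in the descriptions above.

```python
# Hypothetical helpers (not part of graphein): they only encode the
# choices listed in the parameter descriptions above.
ALLOWED_GRANULARITY = {"CA", "CB", "atom"}
ALLOWED_FEATURISERS = {"meiler", "kidera"}
BOOL_OPTIONS = ("insertions", "keep_hets", "exclude_waters",
                "covalent_bonds", "include_ss")


def check_protein_graph_options(options):
    """Return a list of problems found in `options`; empty means OK."""
    problems = []
    if options.get("granularity") not in ALLOWED_GRANULARITY:
        problems.append("granularity must be one of 'CA', 'CB', 'atom'")
    if options.get("node_featuriser") not in ALLOWED_FEATURISERS:
        problems.append("node_featuriser must be 'meiler' or 'kidera'")
    for name in BOOL_OPTIONS:
        # Only check boolean options that were actually supplied.
        if name in options and not isinstance(options[name], bool):
            problems.append(name + " must be a bool")
    return problems


def check_chain_selection(chain_selection):
    """True if chain_selection is 'all' or a non-empty list of chain IDs."""
    if chain_selection == "all":
        return True
    return (isinstance(chain_selection, list)
            and len(chain_selection) > 0
            and all(isinstance(c, str) for c in chain_selection))


options = {"granularity": "CA", "node_featuriser": "meiler",
           "insertions": False, "keep_hets": True}
print(check_protein_graph_options(options))  # prints []
```

If the returned list is empty, the same dictionary could then be unpacked into the constructor call shown earlier, together with the path arguments.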
- Create env
conda create --name graphein
conda activate graphein
- Install vmd-python
conda install -c conda-forge vmd-python
- Install Get Contacts
# Install get_contact_ticc.py dependencies
$ conda install scipy numpy scikit-learn matplotlib pandas cython seaborn
$ pip install ticc==0.1.4
# Install vmd-python dependencies
$ conda install netcdf4 numpy pandas seaborn expat tk=8.5  # Alternatively use pip
$ brew install netcdf pyqt  # Assumes https://brew.sh/ is installed
# Set up vmd-python library
$ git clone https://github.com/Eigenstate/vmd-python.git
$ cd vmd-python
$ python setup.py build
$ python setup.py install
$ cd ..
# Set up getcontacts library
$ git clone https://github.com/getcontacts/getcontacts.git
$ echo "export PATH=`pwd`/getcontacts:\$PATH" >> ~/.bash_profile
$ source ~/.bash_profile
# Test installation
$ cd getcontacts/example/5xnd
$ get_dynamic_contacts.py --topology 5xnd_topology.pdb \
                          --trajectory 5xnd_trajectory.dcd \
                          --itypes hb \
                          --output 5xnd_hbonds.tsv
- Install DSSP
conda install -c salilab dssp
- Install graphein
$ git clone https://www.github.com/a-r-j/graphein
$ cd graphein
$ pip install -e .
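After installation, it can be useful to confirm that the external command-line tools are reachable from the active environment. The sketch below uses only the Python standard library. `find_missing_tools` is a hypothetical helper, and the executable name `mkdssp` for DSSP is an assumption that may differ depending on how DSSP was packaged.

```python
import shutil


def find_missing_tools(tool_names):
    """Return the names from `tool_names` that are not found on PATH."""
    return [name for name in tool_names if shutil.which(name) is None]


if __name__ == "__main__":
    # get_dynamic_contacts.py comes from the getcontacts checkout added to PATH;
    # 'mkdssp' is an assumed name for the DSSP executable.
    missing = find_missing_tools(["get_dynamic_contacts.py", "mkdssp"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```

A missing `get_dynamic_contacts.py` usually means the PATH export has not been sourced in the current shell.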