/pICalculax

Isoelectric point (pI) predictor for chemically modified peptides and proteins.

Primary LanguageC++BSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

pICalculax

Isoelectric point (pI) predictor for chemically modified peptides and proteins.

Published in:
Esben J. Bjerrum , Jan Holst Jensen, and Jakob L. Tolborg
J Chem Inf Model. 2017 Aug 28;57(8):1723-1727. doi: 10.1021/acs.jcim.7b00030.
Available at github.com/EBjerrum/pICalculax

A docker image has been compiled by Troels Schwarz-Linnet and is available at hub.docker.com/r/tlinnet/picalculax. Proteax Desktop is NOT installed in the image.

Notes

For handling condensed molfile formats, RDKit needs to be patched. Patch can be found in the rdkit_patch directory.

Patching RDKit can be difficucult. See the Dockerfile how this was done.

For handling conversion of protein line notation (PLN) to condensed molformat, Proteax desktop is needed. A modification database for Proteax Desktop import is found in the mods_db directory

Example usage of pICalculax can be found in the file Example_usage.py

Online use with mybinder.org

The easiest solution, is to use the service mybinder.org, to launch an interactive Jupyter Notebook. Click here or the icon for access for online environment.

This can take up to 10 min, since the image is quite large.

Binder

How to use Docker

Get prebuild image

docker pull tlinnet/picalculax:02_picalculax

Running docker with image Link to run reference:

Running on a mac

First make an alias

alias pi='docker run -ti --rm -p 8888:8888 -v "$PWD":/home/jovyan/work --name picalculax tlinnet/picalculax:02_picalculax'

Run it

# With no arguments, starts Jupyter notebook
pi
# Or else start bash, to start programs
pi bash

Example use

Command line

Start Docker image, with activated python 2.7 environment

$ pi pICalculax -h

Python 2.7.14 :: Anaconda custom (64-bit)
Calling pICalculax.py with arguments: -h
usage: pICalculax.py [-h] [--fasta FASTA [FASTA ...]] [--pln PLN [PLN ...]]

Predict isoeletric point pI of peptides and modified peptides

optional arguments:
  -h, --help            show this help message and exit
  --fasta FASTA [FASTA ...]
                        Predict fasta sequence
  --pln PLN [PLN ...]   Predict PLN sequence (Requires Proteax Desktop)

With FASTA

$ pi pICalculax --fasta ICECREAM FATCAT

Python 2.7.14 :: Anaconda custom (64-bit)
Calling pICalculax.py with arguments: --fasta ICECREAM FATCAT
4.14 	ICECREAM
5.02 	FATCAT

Interactive session

Start Docker image, with activated python 2.7 environment

pi py27
from pICalculax import find_pKas, pI
from rdkit import Chem

fasta = 'ICECREAM'
mol = Chem.MolFromFASTA(fasta)

#find pKa values and charge class
pkalist, charge = find_pKas(mol)
#Calculate pI
pIpred = pI(pkalist, charge)

print(pIpred)

Jupyter Notebook example with image

Example usage of the pICalculax for pI prediction of unmodified and modified peptides

alt text

Start image with Jupyter notebook

pi

Then do as follow:

  • In your browser, go to 0.0.0.0:8888
  • Click New --> Python 2
  • Paste in from below

The peptides can be loaded from a SDfile.

from pICalculax import find_pKas, pI
from rdkit import Chem
from rdkit.Chem import Draw
# https://github.com/rdkit/rdkit-tutorials/tree/master/notebooks
from rdkit.Chem.Draw import IPythonConsole

#Load a protein from SD file in condensed format
import os.path
paths = ['Datasets/example_mols.sdf', 'pICalculax_dir/Datasets/example_mols.sdf', '../pICalculax_dir/Datasets/example_mols.sdf']
for path in paths:
    if os.path.exists(path):
        sdsup = Chem.SDMolSupplier(path)
        break

def predict_show(mol):
    #Get list of identified pKa values and charge
    pkalist, charge = find_pKas(mol)
    #Predict the pI from the identified pKa values
    piPred = pI(pkalist, charge)
    #Report and Visualize
    print("Predicted pI:%0.2F"%piPred)
    # display is a Jupyter command 
    display(mol)
    #Draw.ShowMol(mol, legend = "Predicted pI:%0.2F"%piPred)
    #Draw.tkRoot.update()
    #txt = raw_input('Press <ENTER> to continue')

# An unmodified peptide
mol = sdsup[0]
predict_show(mol)

# A peptide with a modification
mol = sdsup[1]
predict_show(mol)

Proteax Desktop example

With Proteax Desktop protein line notation of modified peptides can be converted to a RDKit mol object and the pI predicted

from proteax_desktop import *
prtx = ProteaxDesktop()

pln = 'H-GHANY[Gla]A-OH'

mol = Chem.MolFromMolBlock(prtx.as_molfile(pln,'expansion-mode=condensed'))

#find pKa values and charge class
pkalist, charge = find_pKas(mol)
#Calculate pI
pIpred = pI(pkalist, charge)

print(pIpred)

Proteax Desktop Command line

Protein line notation (Requires Proteax Desktop)

$ pi pICalculax --pln H-GHANYEA-OH H-GHANY[Gla]A-OH
5.41 	H-GHANYEA-OH
4.77 	H-GHANY[Gla]A-OH