PELEAnalysis-Processing

License: MIT

This repository contains Python scripts to analyze PELE and MD simulations and to preprocess and process the systems (PDB files). It also includes tools for esterases, a tool to mutate PDB files using the Schrodinger Python API, and more.

Installation

A set of Python libraries is required to run the scripts in this repository. They are listed in the setup.py file, which can be used to set up your computer to use these tools.

Requirements

  • Python 2.7 or higher, or Python 3.1 or higher, must be installed (Python 3.1 or higher is recommended in order to use all the scripts in the repository).

  • The Schrodinger Python API, for the Preprocessing and Protein_Mutator scripts.

Contents

  • Esterases : Folder where tools concerning esterases, esters, and others can be found.

  • MD_analysis : Folder with the analyse_trajectory, BoxPlot_EDesign, and ClusterizeAtoms Python scripts (used to calculate RMSD, distances, contacts, clusters of average atom positions, and more).

  • ML_scripts : Folder with scripts to perform SVC on a given dataset.

  • PELE_mean_analysis : Folder with R tools to analyze PELE report files.

  • PELE_scripts : Folder with the tools to analyze a PELE simulation and its report and trajectory files.

  • Preprocessing : Folder with the Python scripts to prepare the PDB file for the PELE simulation and to generate the force-field parameters for ligands and other molecules that the PELE software does not recognize. The force field used is OPLS-2005.

  • Protein_Mutator : Folder with Python scripts to extract the binding site from a general PDB file given a set of specified residues, a scoring tool based on RMSD against a reference structure, and a tool to mutate residues in a PDB file using the Schrodinger Python API.

  • Sequence_handler : Folder with a tool to reverse-complement, trim, remove adaptors from, and align DNA or protein sequences.
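
As an illustration of one of the operations listed for Sequence_handler, reverse-complementing a DNA sequence can be sketched in a few lines. This is a minimal standalone example and not the code from the repository; the function name reverse_complement and the handling of only the four unambiguous bases are assumptions made for this sketch.

```python
# Minimal sketch (not the repository's implementation): reverse complement of a DNA string.
# Only A, C, G and T (upper or lower case) are translated; other characters are left untouched.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the whole string.
    return dna.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGCCG"))  # CGGCAT
```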

Examples

Most of the tools have their own argument parser, built with Python's argparse module, so the scripts can be used without reading the code. Some example applications are shown below:

Sequence alignment

python Sequence_handler.py -i P1.fasta -i2 P2.fasta -o P.aln -O alignment

If there are no issues with the input files, the alignment output file will be written and an exit message will be printed on the command line.

File 'P1.fasta' has been successfully aligned with 'P2.fasta'

And the alignment file will look like this:

0 (Identity: 54.5454545455%, Similarity: 81.8181818182%, Gaps: 9.09090909091%): 
SARLKVRKDMA
:.|||:|||  
TGRLKLRKD-E
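
The percentages in the header line can be reproduced from the three alignment lines. The sketch below is an assumption about how they relate, not the repository's code: it counts '|' as identical positions, treats '|', ':' and '.' as similar positions, counts a gap wherever either sequence has '-', and divides by the alignment length. The function name alignment_stats is hypothetical.

```python
# Minimal sketch (assumed interpretation of the match line, not the repository's code).
def alignment_stats(seq_a, match_line, seq_b):
    """Return (identity, similarity, gaps) as percentages of the alignment length."""
    if not (len(seq_a) == len(match_line) == len(seq_b)):
        raise ValueError("all three alignment lines must have the same length")
    length = len(seq_a)
    identical = match_line.count("|")
    # Assumption: ':' and '.' both mark similar (non-identical) residues.
    similar = identical + match_line.count(":") + match_line.count(".")
    gaps = sum(1 for a, b in zip(seq_a, seq_b) if a == "-" or b == "-")
    return tuple(100.0 * n / length for n in (identical, similar, gaps))


identity, similarity, gaps = alignment_stats("SARLKVRKDMA", ":.|||:|||  ", "TGRLKLRKD-E")
print("Identity: {:.1f}%, Similarity: {:.1f}%, Gaps: {:.1f}%".format(identity, similarity, gaps))
```

Under these assumptions the example alignment gives 6, 9 and 1 positions out of 11, which matches the 54.5%, 81.8% and 9.1% reported above.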

Plot metrics from PELE report files

The -i (input) flag gives the path to the report files. The -X, -Y, and -Z flags select the metrics from the report files to plot on the X, Y, and Z axes, respectively, and the -Z2 flag selects a fourth metric that is shown through the colorbar. Each of these flags takes the column where the metric resides in the report files, optionally followed by a title for that metric to display on the plot. The -TP flag creates the ThreeDPlot; other types of plots can also be created. The -CM flag changes the colormap used for the colorbar (the default is plasma). Finally, the -S flag sets the overall size of all the text elements in the plot.

python PELEPlot3.py -i *.out -X 7 "Distance 1" -Y 5 "Interaction Energy" -Z 6 "Distance 2"
-Z2 8 "Distance 3" -TP -CM RdYlGn -S 12
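
For readers curious how flags of this shape can be declared with argparse, here is a minimal sketch. It is not the parser from PELEPlot3.py: the help texts, the default for -S, and the helper split_metric are assumptions made for illustration; only the flag names and the plasma default come from the description above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the command above."""
    parser = argparse.ArgumentParser(description="Sketch of a PELEPlot3-like command line")
    # The shell expands *.out into several paths, so accept one or more values.
    parser.add_argument("-i", "--input", nargs="+", required=True, help="PELE report files")
    # Each axis flag takes a column number, optionally followed by a title.
    for axis in ("X", "Y", "Z", "Z2"):
        parser.add_argument("-" + axis, nargs="+", help="column [title] for axis " + axis)
    parser.add_argument("-TP", action="store_true", help="create the 3D plot")
    parser.add_argument("-CM", default="plasma", help="colormap for the colorbar")
    parser.add_argument("-S", type=int, default=12, help="text size (default value is an assumption)")
    return parser


def split_metric(values):
    """Split ['7', 'Distance 1'] into (7, 'Distance 1'); the title is optional."""
    column = int(values[0])
    title = " ".join(values[1:]) or None
    return column, title


args = build_parser().parse_args(
    ["-i", "a.out", "b.out", "-X", "7", "Distance 1", "-Y", "5", "Interaction Energy",
     "-Z", "6", "Distance 2", "-Z2", "8", "Distance 3", "-TP", "-CM", "RdYlGn", "-S", "12"]
)
print(split_metric(args.X), split_metric(args.Z2), args.TP, args.CM, args.S)
```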

The command produces a plot in which the points can be hovered over to see which trajectory each one comes from and which accepted PELE step it represents.

[Figure: output plot]

To get more information on how this script or any of the others works, use the -h flag to print the help message.

Development

The scripts are continuously modified and improved to add more functions and utilities.