This Python package works with PISA-Lite to analyse macromolecular interfaces and interactions in assemblies.
The code will:
- Analyse macromolecular interfaces with PISA
- Create a JSON dictionary with assembly interaction/interface information
git clone https://github.com/PDBe-KB/pisa-analysis
cd pisa-analysis
The process runs PISA-Lite as a subprocess and requires PISA to be compiled beforehand. For more information on how to compile PISA-Lite, visit our internal page:
To make your life easier when running the process, you can set two path environment variables for PISA:
Add the directory containing the 'pisa' binary to PATH:
export PATH="$PATH:your_path_to_pisa/pisa-lite/build"
A path to the setup directory of PISA:
export PISA_SETUP_DIR="/your_path_to_pisa/pisa-lite/setup"
Additionally, the PISA setup directory must contain a PISA configuration template named pisa_cfg_tmp:
cp pisa_cfg_tmp your_path_to_pisa/pisa-lite/setup
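The setup described above can be checked before running the process. The following is a minimal sketch, not part of the package: the function name check_pisa_environment is hypothetical, and it only assumes what the text states (a 'pisa' binary reachable through PATH, and a PISA_SETUP_DIR that contains pisa_cfg_tmp).

```python
import os
import shutil


def check_pisa_environment(env=None):
    """Return a list of problems found in the PISA setup (empty if none).

    Hypothetical helper for illustration; it is not part of pisa-analysis.
    """
    env = os.environ if env is None else env
    problems = []

    # The 'pisa' binary must be discoverable through PATH.
    if shutil.which("pisa", path=env.get("PATH", "")) is None:
        problems.append("'pisa' binary not found on PATH")

    # PISA_SETUP_DIR must be set and must contain the configuration template.
    setup_dir = env.get("PISA_SETUP_DIR")
    if not setup_dir:
        problems.append("PISA_SETUP_DIR is not set")
    elif not os.path.isfile(os.path.join(setup_dir, "pisa_cfg_tmp")):
        problems.append("pisa_cfg_tmp not found in PISA_SETUP_DIR")

    return problems


if __name__ == "__main__":
    for problem in check_pisa_environment():
        print("WARNING:", problem)
```

An empty list means both environment variables look usable; it does not verify that the binary itself runs correctly.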
Other dependencies can be installed with:
pip install -r requirements.txt
See requirements.txt
For development:
pre-commit usage
pip install pre-commit
pre-commit
pre-commit install
pisa-analysis/pisa_utils/run.py [-h] -i INPUT_CIF_DIR --pdb_id PDB_ID --assembly_id ASSEMBLY_CODE -o OUTPUT_DIR_JSON --output_xml OUTPUT_DIR_XML
OR
pisa-analysis/pisa_utils/run.py --input_cif INPUT_CIF_DIR --pdb_id PDB_ID --assembly_id ASSEMBLY_CODE --output_json OUTPUT_DIR_JSON --output_xml OUTPUT_DIR_XML
OR install module pisa_analysis:
cd pisa-analysis/
python setup.py install
usage:
pisa_analysis [-h] -i INPUT_CIF_DIR --pdb_id PDB_ID --assembly_id ASSEMBLY_CODE -o OUTPUT_PATH_JSON --output_xml OUTPUT_DIR_XML
Other optional arguments are:
--input_updated_cif : Updated CIF file for the PDB entry
--force : Always run the PISA-Lite calculation, even if valid XML files already exist
--pisa_setup_dir : Path to the 'setup' directory in PISA-Lite
--pisa_binary : Binary file for PISA-Lite
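The command-line interface documented above can be mirrored with argparse. This is only a sketch reconstructed from the usage lines, not the parser in pisa_utils/run.py: the function name build_parser, the help strings, and the choice of store_true for --force are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="pisa_analysis")

    # Required arguments, as shown in the usage line.
    parser.add_argument("-i", "--input_cif", required=True,
                        help="input CIF location")
    parser.add_argument("--pdb_id", required=True, help="PDB ID of the entry")
    parser.add_argument("--assembly_id", required=True, help="assembly code")
    parser.add_argument("-o", "--output_json", required=True,
                        help="output directory for JSON files")
    parser.add_argument("--output_xml", required=True,
                        help="output directory for PISA-Lite XML files")

    # Optional arguments.
    parser.add_argument("--input_updated_cif",
                        help="updated CIF file for the PDB entry")
    parser.add_argument("--force", action="store_true",
                        help="always run the PISA-Lite calculation")
    parser.add_argument("--pisa_setup_dir",
                        help="path to the PISA-Lite 'setup' directory")
    parser.add_argument("--pisa_binary", help="PISA-Lite binary file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```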
The process is as follows:
- The process first runs PISA-Lite in a subprocess and generates two XML files:
  - interfaces.xml
  - assembly.xml
  The XML files are saved in the output directory defined by the --output_xml argument. If the XML files exist and are valid, the process skips running PISA-Lite unless --force is passed.
- Next, the process parses the XML files generated by PISA-Lite and creates a dictionary that contains all assembly interface/interaction information.
- While creating the interfaces dictionary for the entry, the process reads UniProt accessions and sequence numbers from an updated CIF file using Gemmi.
- The process also parses the assembly.xml file generated by PISA-Lite and creates a simplified dictionary with some assembly information.
- In the last step, the process dumps the dictionaries into JSON files. The JSON files are saved in the output directory defined by the -o or --output_json argument. The output JSON files are:
  xxx-assemX_interfaces.json and xxx-assemblyX.json
  where xxx is the PDB ID of the entry and X is the assembly code.
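The skip logic and the output file naming described in the steps above can be sketched as follows. The helper names are hypothetical and this is not the package's own code; in particular, "valid" is assumed here to mean only that each file exists and parses as well-formed XML, which may be weaker than the check the process actually performs.

```python
import os
import xml.etree.ElementTree as ET

# The two files PISA-Lite is documented to produce.
PISA_XML_FILES = ("interfaces.xml", "assembly.xml")


def is_well_formed_xml(path):
    """Return True if the file exists and parses as XML."""
    try:
        ET.parse(path)
    except (OSError, ET.ParseError):
        return False
    return True


def should_run_pisa(xml_dir, force=False):
    """Run PISA-Lite if forced, or if any expected XML file is missing or broken."""
    if force:
        return True
    return not all(
        is_well_formed_xml(os.path.join(xml_dir, name)) for name in PISA_XML_FILES
    )


def output_json_paths(output_dir, pdb_id, assembly_id):
    """Build the two output paths following the documented naming pattern."""
    interfaces = os.path.join(
        output_dir, f"{pdb_id}-assem{assembly_id}_interfaces.json"
    )
    assembly = os.path.join(output_dir, f"{pdb_id}-assembly{assembly_id}.json")
    return interfaces, assembly
```

For entry 1d2s and assembly 1, output_json_paths yields file names 1d2s-assem1_interfaces.json and 1d2s-assembly1.json inside the chosen output directory.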
Documentation on the assembly interfaces JSON file and schema can be found here:
The simplified assembly JSON output looks as follows:
{
  "PISA": {
    "pdb_id": "1d2s",
    "assembly_id": "1",
    "pisa_version": "2.0",
    "assembly": {
      "id": "1",
      "size": "8",
      "macromolecular_size": "2",
      "dissociation_energy": -3.96,
      "accessible_surface_area": 15146.45,
      "buried_surface_area": 3156.79,
      "entropy": 12.09,
      "dissociation_area": 733.07,
      "solvation_energy_gain": -41.09,
      "number_of_uc": "0",
      "number_of_dissociated_elements": "2",
      "symmetry_number": "2",
      "formula": "A(2)a(4)b(2)",
      "composition": "A-2A[CA](4)[DHT](2)"
    }
  }
}
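Note that in the example above the count-like fields (size, macromolecular_size, number_of_uc, number_of_dissociated_elements, symmetry_number) are stored as strings, while energies and areas are numbers. A downstream consumer may want to convert them; the sketch below shows one way to do so. The function name summarise_assembly is hypothetical, and the field list is taken only from the example shown here, not from a published schema.

```python
import json

# Fields that appear as strings in the example but hold integer counts.
INTEGER_FIELDS = (
    "size",
    "macromolecular_size",
    "number_of_uc",
    "number_of_dissociated_elements",
    "symmetry_number",
)


def summarise_assembly(document):
    """Return a flat dict of assembly values with count fields cast to int."""
    pisa = document["PISA"]
    assembly = dict(pisa["assembly"])  # copy so the input is not modified
    for key in INTEGER_FIELDS:
        assembly[key] = int(assembly[key])
    assembly["pdb_id"] = pisa["pdb_id"]
    assembly["assembly_id"] = pisa["assembly_id"]
    return assembly


def load_assembly_json(path):
    """Read a simplified assembly JSON file and summarise it."""
    with open(path, encoding="utf-8") as handle:
        return summarise_assembly(json.load(handle))
```

Calling load_assembly_json on a file such as 1d2s-assembly1.json would then give, for instance, an integer size rather than the string "8".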
We use SemVer for versioning.
- Grisell Diaz Leines - Lead developer
- Mihaly Varadi - Review and management
See all contributors here.
See LICENSE