lociPARSE: A Python repository from Bhattacharya Lab

lociPARSE: a locality-aware invariant point attention model for scoring RNA 3D structures

by Sumit Tarafder and Debswapna Bhattacharya

Codebase for our locality-aware invariant Point Attention-based RNA ScorEr (lociPARSE).

Use a conda virtual environment to install the dependencies for lociPARSE. The following commands create and activate an environment named 'lociPARSE'.

conda env create -f lociPARSE_environment.yml

conda activate lociPARSE

Installation typically takes a few minutes on a standard desktop computer running 64-bit Linux.

Instructions for running lociPARSE:

Put the desired PDB file(s) inside the 'Input' folder.
Put the PDB ID, or the list of IDs, in the text file named 'input.txt' inside the 'Input' folder. See the example in the 'Input' folder.
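
If the 'Input' folder holds many structures, the ID list can be generated rather than typed by hand. The following is a minimal sketch, not part of lociPARSE itself. It assumes that each ID is the file name without its '.pdb' extension and that 'input.txt' holds one ID per line; check both assumptions against the example file shipped in the 'Input' folder. The helper names collect_ids and write_input_list are hypothetical.

```python
from pathlib import Path


def collect_ids(file_names):
    """Return sorted, de-duplicated IDs taken from names ending in .pdb.

    Assumption: the ID is the file name without its extension.
    """
    ids = {
        Path(name).stem
        for name in file_names
        if Path(name).suffix.lower() == ".pdb"
    }
    return sorted(ids)


def write_input_list(input_dir="Input", list_name="input.txt"):
    """Write one ID per line to <input_dir>/<list_name> (assumed format)."""
    folder = Path(input_dir)
    ids = collect_ids(p.name for p in folder.iterdir() if p.is_file())
    (folder / list_name).write_text("".join(f"{i}\n" for i in ids))
    return ids


if __name__ == "__main__":
    # Run from the repository root, where the 'Input' folder lives.
    print(write_input_list())
```

Run the script from the repository root before launching lociPARSE.sh. Any existing 'input.txt' is overwritten.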

Run

chmod a+x lociPARSE.sh && ./lociPARSE.sh

The script generates features for every ID listed in Input/input.txt and stores them in individual folders inside the 'Feature' folder. It then runs inference and stores the predicted molecular-level lDDT (pMoL) and predicted nucleotide-wise lDDT (pNuL) in 'score.txt' in individual folders inside the 'Prediction' folder.
The first line of the output 'score.txt' gives the pMoL score. Each subsequent line specifies two columns: column 1 is the nucleotide index in the PDB file and column 2 is the pNuL score.
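
For downstream analysis, the score file can be read with a few lines of Python. The sketch below follows the layout described above (pMoL on the first line, then one nucleotide index and one pNuL value per line) and makes three assumptions that are not confirmed here: columns are separated by whitespace, the pMoL value is the last token on the first line, and each prediction folder is named after its ID. The function names parse_score_text and read_scores are hypothetical.

```python
from pathlib import Path


def parse_score_text(text):
    """Parse the contents of a score.txt file.

    Returns (pMoL, [(nucleotide_index, pNuL), ...]).
    Assumption: whitespace-separated columns; blank lines are ignored.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("score file is empty")
    # Take the last token so that an optional leading label is tolerated.
    pmol = float(lines[0].split()[-1])
    pnul = []
    for ln in lines[1:]:
        index, score = ln.split()[:2]
        # Keep the index as a string, since PDB numbering is not
        # guaranteed to be a plain integer.
        pnul.append((index, float(score)))
    return pmol, pnul


def read_scores(target_id, prediction_dir="Prediction"):
    """Read Prediction/<target_id>/score.txt (assumed folder layout)."""
    path = Path(prediction_dir) / target_id / "score.txt"
    return parse_score_text(path.read_text())
```

For example, read_scores("my_target") would return the global score and the per-nucleotide scores for a target whose ID is "my_target", where the ID is a placeholder.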

Inference for a typical RNA structure (~70 nucleotides) should take a few seconds.

The lists of IDs used in our training set, test sets, and the validation set used in the ablation study are available under Datasets.
The training set and the test set of 30 independent RNAs were taken from trRosettaRNA.
CASP15 experimental structures and all submitted predictions were downloaded from CASP15.
The 60 non-redundant targets for the TS60 validation set were curated from PDB.