This is a nanobody affinity prediction tool based on a machine learning approach. It relies mainly on non-covalent interaction data of a single-chain protein in a protein complex.
GitHub repository: https://github.com/greenGM/Nbaffinity
You can use this tool directly at: http://www.peptide-ligand.cn/en/
Name | Version |
---|---|
R | 4.3.3 or higher |
tidyverse | 2.0.0 or higher |
caret | 6.0-94 or higher |
- The data was generated using ProtInter.
- All descriptors need to be generated and fed into the model in the following order:
HI_ std, HI_ min, HI_ 0.5, HBMS_ count, HBMS_ mean, HBMS_ std, HBMS_ min, HBMS_ max, ASI_ count, ASI_ std, ASI_ min, HBSS_ count, HBSS_ std, HBSS_ min, DB_ count, DB_ std, DB_ 0.5, IoInt_ count, IoInt_ std, IoInt_ min, IoInt_ 0.25, HBMM_ count, HBMM_ mean, HBMM_ std, HBMM_ min, HBMM_ 0.25, HBMM_ 0.5, HBMM_ 0.75, HBMM_ max, CPI_ count, CPI_ std, CPI_ min, AAI_ count, AAI_ std, AAI_ min.

Abbreviation | Full name |
---|---|
HI | hydrophobic interactions |
HBMS | hydrogen bonding main-side chain interactions |
ASI | aromatic-sulphur interactions |
HBSS | hydrogen bonding side-side chain interactions |
DB | disulphide bridges |
IoInt | ionic interactions |
HBMM | hydrogen bonding main-main chain interactions |
CPI | cation-pi interactions |
AAI | aromatic-aromatic interactions |
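
Because the model depends on column order, it can help to check a descriptor table before using it. The following is a minimal sketch, not part of the repository: `descriptor_spec`, `expected_cols` and `check_descriptor_order` are hypothetical names, and the column names are written without a space after the underscore (for example `HI_std`), which is an assumption. The `name` object loaded from `.RData` remains the authoritative list.

```r
# Descriptor groups and their statistics, in the order listed above.
# Assumption: column names use the form "<group>_<statistic>", e.g. "HI_std".
descriptor_spec <- list(
  HI    = c("std", "min", "0.5"),
  HBMS  = c("count", "mean", "std", "min", "max"),
  ASI   = c("count", "std", "min"),
  HBSS  = c("count", "std", "min"),
  DB    = c("count", "std", "0.5"),
  IoInt = c("count", "std", "min", "0.25"),
  HBMM  = c("count", "mean", "std", "min", "0.25", "0.5", "0.75", "max"),
  CPI   = c("count", "std", "min"),
  AAI   = c("count", "std", "min")
)

# Flatten the specification into a single ordered vector of column names.
expected_cols <- unlist(
  lapply(names(descriptor_spec), function(group) {
    paste(group, descriptor_spec[[group]], sep = "_")
  }),
  use.names = FALSE
)

# Hypothetical helper: stop if a descriptor is missing, otherwise return
# the data frame with its columns in the expected order.
check_descriptor_order <- function(df, expected = expected_cols) {
  missing_cols <- setdiff(expected, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing descriptor columns: ", paste(missing_cols, collapse = ", "))
  }
  df[, expected, drop = FALSE]
}

# Usage: a dummy table whose columns are in reverse order.
demo <- as.data.frame(matrix(0, nrow = 2, ncol = length(expected_cols)))
names(demo) <- rev(expected_cols)
ordered <- check_descriptor_order(demo)
```

The helper only reorders columns that already exist under the assumed names. It does not compute any descriptor.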
- Download all files.
- Open in RStudio.
- Enter the following code:
```R
# Restore the saved workspace; `name`, `preprocessParams` and `TdataSW.roF1`
# are expected to come from it.
load(".RData")
library(tidyverse)
library(caret)

# Import your data. A file without a header row is preferred;
# if your file has one, set header = TRUE.
# Note: the first column of your data should be the "HI_ std" score, not the protein's name.
unknown <- read.csv("yourfile.csv", header = FALSE, col.names = name)

# Apply the saved preprocessing to the descriptors.
unknownS <- predict(preprocessParams, unknown)

unknownprediction <- data.frame(
  name = unknown$youproteinname,  # the name of your protein, if you have one
  preclass = predict(TdataSW.roF1, unknownS),  # affinity class of the candidate nanobody
  preprob = predict(TdataSW.roF1, unknownS, type = "prob")  # probability of each class
)
write.csv(unknownprediction, "unknownprediction.csv")
```
(Note: Please adjust the above code to suit your needs.)
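
Before calling `predict`, a quick sanity check on the imported table can catch common input problems, such as a leftover name column or missing values. This is a minimal sketch under stated assumptions: `validate_descriptors` is a hypothetical helper that is not part of the repository, and the expected count of 35 columns is taken from the descriptor list above.

```r
# Hypothetical helper: returns a character vector describing any problems
# found in the descriptor table (empty if none were found).
validate_descriptors <- function(df, n_expected = 35) {
  problems <- character(0)
  if (ncol(df) != n_expected) {
    problems <- c(problems,
                  sprintf("expected %d columns, found %d", n_expected, ncol(df)))
  }
  non_numeric <- names(df)[!vapply(df, is.numeric, logical(1))]
  if (length(non_numeric) > 0) {
    problems <- c(problems,
                  paste("non-numeric columns:", paste(non_numeric, collapse = ", ")))
  }
  if (anyNA(df)) {
    problems <- c(problems, "missing values (NA) present")
  }
  problems
}

# Usage with dummy data: a clean table, and one with a stray name column.
good <- as.data.frame(matrix(0.5, nrow = 3, ncol = 35))
bad <- cbind(protein = c("a", "b", "c"), good)

problems_good <- validate_descriptors(good)
problems_bad <- validate_descriptors(bad)
```

An empty result means none of these three checks failed. It does not guarantee that the descriptors were computed correctly.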
The result will be returned as a csv file.
name--the name of the candidate nanobody.
preclass--predicted class: Y represents MIC < 2000 and N represents MIC ≥ 2000.
preprob.N, preprob.Y--the predicted probability of each class (N and Y).
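
Once the result file exists, candidates can be ranked by their predicted probability. The sketch below uses a small made-up table in place of `unknownprediction.csv`, assuming the probability columns are named `preprob.N` and `preprob.Y`. The candidate names, the probabilities and the 0.75 cut-off are illustrative only and do not come from the tool.

```r
# Made-up stand-in for the contents of unknownprediction.csv
# (in practice: results <- read.csv("unknownprediction.csv")).
results <- data.frame(
  name = c("Nb1", "Nb2", "Nb3"),
  preclass = c("Y", "N", "Y"),
  preprob.N = c(0.20, 0.70, 0.40),
  preprob.Y = c(0.80, 0.30, 0.60)
)

# Rank candidates from most to least likely to belong to class Y.
ranked <- results[order(results$preprob.Y, decreasing = TRUE), ]

# Keep only class-Y candidates above an illustrative probability cut-off.
cutoff <- 0.75
shortlist <- ranked[ranked$preclass == "Y" & ranked$preprob.Y >= cutoff, ]
```

The cut-off is a design choice for the user. A stricter value gives a shorter shortlist of candidates.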