Nbaffinity

A nanobody affinity classification tool based on the non-covalent interactions of a single-chain protein within a protein complex.

Primary language: R

Nbaffinity: a machine learning tool to predict nanobody affinity

This is a nanobody affinity prediction tool based on a machine learning approach that relies mainly on non-covalent interaction data for a single-chain protein within a protein complex.

GitHub repository: https://github.com/greenGM/Nbaffinity

You can use this tool directly at: http://www.peptide-ligand.cn/en/

Dependencies:

Name        Version
R           4.3.3 or higher
tidyverse   2.0.0 or higher
caret       6.0-94 or higher
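
The version requirements above can be checked from an R session before running the tool. The following is a minimal sketch; the helper name `has_min_version` is hypothetical and not part of this repository.

```r
# Minimum package versions copied from the dependency list above.
required <- c(tidyverse = "2.0.0", caret = "6.0-94")

# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or newer.
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= package_version(min_version)
}

# Check the running R version and each required package.
r_ok <- getRversion() >= package_version("4.3.3")
pkg_ok <- vapply(names(required),
                 function(p) has_min_version(p, required[[p]]),
                 logical(1))

if (!r_ok) message("R 4.3.3 or higher is required; found ", getRversion())
if (!all(pkg_ok)) {
  message("Missing or outdated: ", paste(names(pkg_ok)[!pkg_ok], collapse = ", "))
}
```

Packages reported as missing or outdated can then be installed with `install.packages()`.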

Data generation

  1. The data were generated using ProtInter:

    https://github.com/maxibor/protinter

  2. All descriptors need to be generated and fed into the model in the following order:

    HI_ std, HI_ min, HI_ 0.5, HBMS_ count, HBMS_ mean, HBMS_ std, HBMS_ min, HBMS_ max, ASI_ count, ASI_ std, ASI_ min, HBSS_ count, HBSS_ std, HBSS_ min, DB_ count, DB_ std, DB_ 0.5, IoInt_ count, IoInt_ std, IoInt_ min, IoInt_ 0.25, HBMM_ count, HBMM_ mean, HBMM_ std, HBMM_ min, HBMM_ 0.25, HBMM_ 0.5, HBMM_ 0.75, HBMM_ max, CPI_ count, CPI_ std, CPI_ min, AAI_ count, AAI_ std, AAI_ min.

    Abbreviation   Full name
    HI             hydrophobic interactions
    HBMS           hydrogen bonding main-side chain interactions
    ASI            aromatic-sulphur interactions
    HBSS           hydrogen bonding side-side chain interactions
    DB             disulphide bridges
    IoInt          ionic interactions
    HBMM           hydrogen bonding main-main chain interactions
    CPI            cation-pi interactions
    AAI            aromatic-aromatic interactions
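
Because the model expects the descriptors in a fixed order, it can help to reorder a descriptor table programmatically rather than by hand. The sketch below assumes the column names are spelled exactly as in the list above; the `name` object stored in `.RData` may spell them differently, and the helper `order_descriptors` is hypothetical.

```r
# The 35 descriptor names in the order listed above
# (spelling copied from this README; an assumption about the real column names).
descriptor_order <- c(
  "HI_ std", "HI_ min", "HI_ 0.5",
  "HBMS_ count", "HBMS_ mean", "HBMS_ std", "HBMS_ min", "HBMS_ max",
  "ASI_ count", "ASI_ std", "ASI_ min",
  "HBSS_ count", "HBSS_ std", "HBSS_ min",
  "DB_ count", "DB_ std", "DB_ 0.5",
  "IoInt_ count", "IoInt_ std", "IoInt_ min", "IoInt_ 0.25",
  "HBMM_ count", "HBMM_ mean", "HBMM_ std", "HBMM_ min",
  "HBMM_ 0.25", "HBMM_ 0.5", "HBMM_ 0.75", "HBMM_ max",
  "CPI_ count", "CPI_ std", "CPI_ min",
  "AAI_ count", "AAI_ std", "AAI_ min"
)

# Hypothetical helper: put columns into model order, failing loudly if any are missing.
order_descriptors <- function(df) {
  missing_cols <- setdiff(descriptor_order, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing descriptors: ", paste(missing_cols, collapse = ", "))
  }
  df[, descriptor_order, drop = FALSE]
}

# Toy example: a one-row table whose columns arrive in reverse order.
toy <- data.frame(
  matrix(seq_along(descriptor_order), nrow = 1,
         dimnames = list(NULL, rev(descriptor_order))),
  check.names = FALSE
)
ordered <- order_descriptors(toy)
```

Any extra columns, such as a protein name, are dropped by the helper, so keep them in a separate object if you need them later.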

How to use this tool:

  1. Download all files.

  2. Open the project in RStudio.

  3. Enter the following code:

    # Load the saved workspace; the objects `name`, `preprocessParams`, and
    # `TdataSW.roF1` used below are expected to come from it.
    load(".RData")
    library(tidyverse)
    library(caret)

    # Import your data. A file without a header row works best;
    # if yours has one, set header = TRUE.
    # Note: the first column must be the "HI_ std" score, not the protein name.
    unknown <- read.csv("yourfile.csv", header = FALSE, col.names = name)

    # Apply the saved preprocessing parameters to the new data.
    unknownS <- predict(preprocessParams, unknown)

    unknownprediction <- data.frame(
      name = unknown$yourproteinname,                           # the name of your protein, if you have one
      preclass = predict(TdataSW.roF1, unknownS),               # predicted affinity class of the candidate nanobody
      preprob = predict(TdataSW.roF1, unknownS, type = "prob")  # probability of each class
    )

    write.csv(unknownprediction, "unknownprediction.csv")

    (Note: Please adjust the above code to suit your needs.)
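
Before calling `predict`, it may be worth checking that the input file has the expected shape. The following is a minimal sketch, assuming the 35 descriptors listed under "Data generation" and no protein-name column; the helper `check_input` is hypothetical, and the temporary file stands in for `yourfile.csv`.

```r
# Hypothetical helper: return a character vector describing problems found
# in a descriptor table (empty if none were found).
check_input <- function(df, n_descriptors = 35) {
  problems <- character(0)
  if (ncol(df) != n_descriptors) {
    problems <- c(problems,
                  sprintf("expected %d columns, found %d", n_descriptors, ncol(df)))
  }
  non_numeric <- names(df)[!vapply(df, is.numeric, logical(1))]
  if (length(non_numeric) > 0) {
    problems <- c(problems,
                  paste("non-numeric columns:", paste(non_numeric, collapse = ", ")))
  }
  if (anyNA(df)) problems <- c(problems, "missing values present")
  problems
}

# Toy demonstration with a temporary header-less CSV of 2 rows and 35 columns.
tmp <- tempfile(fileext = ".csv")
write.table(matrix(runif(70), nrow = 2), tmp, sep = ",",
            row.names = FALSE, col.names = FALSE)
good <- read.csv(tmp, header = FALSE)

problems_good <- check_input(good)        # no problems expected
problems_bad  <- check_input(good[, -1])  # one column short
```

An empty result means the table passed these basic checks; it does not guarantee that the columns are in the right order.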

Result explanation:

The result will be returned as a csv file.

 name--the name of the candidate nanobody.

 preclass--predicted class: Y represents MIC < 2000 and N represents MIC ≥ 2000.

 preprob.N, preprob.Y--the predicted probability of class N and of class Y.
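
Once the result file exists, candidates predicted as class Y can be ranked by their probability. The sketch below uses a small made-up table in place of `unknownprediction.csv`; the column name `preprob.Y` is an assumption based on the class labels Y and N, and the nanobody names and probabilities are illustrative only.

```r
# Illustrative stand-in for read.csv("unknownprediction.csv"); values are made up.
results <- data.frame(
  name      = c("Nb1", "Nb2", "Nb3"),
  preclass  = c("Y", "N", "Y"),
  preprob.N = c(0.2, 0.7, 0.4),
  preprob.Y = c(0.8, 0.3, 0.6)
)

# Keep candidates predicted as class Y and sort them by descending probability.
hits <- results[results$preclass == "Y", ]
hits <- hits[order(hits$preprob.Y, decreasing = TRUE), ]
```

With real output, replace the stand-in table with the file written by `write.csv`.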