BOPS:
small protein complex prediction algorithm Based On Protein-protein interaction network Segmentation
The BOPS algorithm can find small protein complexes from the protein protein interaction network to help biological experiments.
First, the BOPS algorithm calculates the balanced weights, and replaces the original weights with balanced weights. Second, the BOPS algorithm divides the graphs larger than MAXP until the original PPIN is divided into small networks and the details of segmentation is described in (3.4.2). Third, the BOPS algorithm enumerates the connected subset of each small network, calculates the cohesion of each connected subset, identifies potential protein complexes based on cohesion and removes those that are similar.
The BOPS algorithm is coded in C++. You can use C++11 or or higher version of the compiler to get the program.
Windows:
g++ BOPS.cpp -o BOPS.exe -std=c++11
Linux:
g++ BOPS.cpp -o BOPS -std=c++11
Run the BOPS algorithm to get the result as follow:
Windows:
BOPS.exe PPI_file result_file
Linux:
./BOPS PPI_file result_file
Specifically, you can input balanced_index on your own, the default value of it is 1.5.
Windows:
BOPS.exe PPI_file result_file balanced_index
Linux:
./BOPS PPI_file result_file balanced_index
The "evaluation" folder includes the methods used to evaluate the method and the resulting predicted complexes based on:
- the number of predicted complexes
- Fscore: the harmonic mean of Recall and Precision
- Precision: the rate of predicted protein complexes that match at least one reference complex
- Recall: the rate of reference protein complexes that match at least one predicted complex
- ACC: the geometric accuracy
- Sn: the rate of the maximum-sum number of matched proteins to the total number of proteins in the set of reference protein complex.
- PPV: the rate of the maximumsum number of matched proteins to the total matched number of proteins in the set of predicted protein complex.
To evaluate the "Predicted Complexes.txt" file, it runs as follows: The match is coded in python. You must use python 2.7 of the interpreter to get the program.
python match.py -n ppi_name.txt reference_name.txt result_name.txt
The parameters of these methods are set as the recommended values as mentioned in their original papers. For our method, we set the balance index β to 1.5 as to recommended values.
Datasets | #predicated | F-score | ACC |
---|---|---|---|
Krogan-core | 704 | 0.558 | 0.528 |
Krogan-extended | 778 | 0.538 | 0.482 |
Gavin | 832 | 0.668 | 0.560 |
Collins | 794 | 0.614 | 0.550 |
The experimental performance proves that the BOPS algorithm can obtain the best results on F-score and ACC when identifying small protein complexes. And the performance of BOPS is better than most of algorithms with respect to the whole protein complexes.
If you have any question, you can contact with jiaqinglv@foxmail.com,22151303@zju.edu.cn,liangbing@dlut.edu.cn and zhyj@dmu.edu.cn.
The project is supported in part by College Student Innovation and Entrepreneurship Training Program Support Project of China (2019101411600010093,2020101411600010129). We are very grateful for the research done by our predecessors.
At last, thanks very much for using BOPS.