DPPIN is a repository of dynamic network datasets, which consists of 12 individual dynamic protein-protein interaction network datasets, stored separately in 12 folders.
In each folder (e.g. DPPIN-Krogan(LCMS)),
-
"Dynamic_PPIN.txt" stores temporal connections of the generated dynamic network. To be specific, a node represents a gene coding protein, an edge represents a protein-protein interaction at a certain timestamp, and each edge is timestamped with the format (node_u, node_v, timestamp, weight).
-
"Static_PPIN.txt" stores the required static network for generating the dynamic network, each edge is recorded as (node_u, node_v, weight).
-
"Node_Features" records the temporal gene expression value of each protein node, and the format is "node_id, node_name, value_at_t_1, value_at_t_2, ..., value_at_t_36".
Moreover, "Node_Labels.xlsx" stores labels (i.e., types) of 6,738 protein nodes shared by twelve dynamic network datasets in DPPIN, which covers the label of each node from each dynamic network. The label of each protein is retrievaled from Saccharomyces Genome Database.
The statistics of the twelve generated dynamic network datasets are shown in Table 1.
Table 1. Statistics of DPPIN.
In brief, two inputs are required to construct a dynamic protein-protein interaction network. The first one is a static protein-protein interation network and the second one is the time-aware gene expression value series of each protein. Through the activity and co-expression analysis (as shown in Figure 1), a dynamic network is constructed.
Figure 1. Generation of a Dynamic Protein-Protein Interaction Network.
- The static networks for building the corresponding dynamic networks of DPPIN are available at YeastNet.
- The yeast temporal gene expression value (GSE3431.txt) retrieved from "Tu et al., Logic of the Yeast Metabolic Cycle: Temporal Compartmentalization of Cellular Processes. Science 2005." is also available at NCBI.
The method for constructing a dynamic network is mainly adopted from "Zhang et al., A method for predicting protein complex in dynamic PPI networks. BMC Bioinformatics 2016.".
The dynamic networks have already been constructed and stored in each folder, users can directly use it. If you want to construct them again on your own and modify some parameters, just run main.py. The main.py program will analyze the GES3431 gene expression value and the static network in each folder to generate the corresponding dynamic network and store it in that folder.
The program is written under Python 3.7, and the prerequisites are listed below.
- numpy 1.20.1
- scipy 1.6.0
If you use the materials from this repositiory, please refer to our paper.
@inproceedings{DBLP:conf/bigdataconf/FuH22,
author = {Dongqi Fu and
Jingrui He},
title = {{DPPIN:} {A} Biological Repository of Dynamic Protein-Protein Interaction
Network Data},
booktitle = {{IEEE} International Conference on Big Data, Big Data 2022, Osaka,
Japan, December 17-20, 2022},
pages = {5269--5277},
publisher = {{IEEE}},
year = {2022},
url = {https://doi.org/10.1109/BigData55660.2022.10020904},
doi = {10.1109/BigData55660.2022.10020904},
timestamp = {Fri, 10 Feb 2023 18:39:54 +0100},
biburl = {https://dblp.org/rec/conf/bigdataconf/FuH22.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}