
Source code and dataset for IJCAI 2019 paper "ProNE: Fast and Scalable Network Representation Learning"

ProNE: Fast and Scalable Network Representation Learning

Jie Zhang, Yuxiao Dong, Yan Wang, Jie Tang and Ming Ding

Accepted to IJCAI 2019 Research Track!


  • Linux or MacOS


Clone this repo.

git clone https://github.com/cenyk1230/ProNE
cd ProNE


All of these datasets are publicly available.

  • PPI contains 3,890 nodes and 76,584 edges.
  • blogcatalog contains 10,312 nodes and 333,983 edges.
  • youtube contains 1,138,499 nodes and 2,990,443 edges.
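To sanity-check a dataset file against the statistics above, a minimal sketch like the following can help. It assumes the file is a plain-text edge list with one whitespace-separated pair of node ids per line; that format is an assumption, and `edge_list_stats` is a hypothetical helper, not part of this repo. Depending on whether a file lists each undirected edge once or twice, the line count may differ from the edge counts reported above.

```python
import os
import tempfile


def edge_list_stats(path):
    """Return (distinct node ids, number of edge lines) for an edge-list file.

    Assumes one whitespace-separated "u v" pair per line (an assumption
    about the file format, not something confirmed by this README).
    """
    nodes = set()
    num_edge_lines = 0
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            nodes.update(parts[:2])
            num_edge_lines += 1
    return len(nodes), num_edge_lines


# Tiny made-up example file (not one of the datasets listed above).
with tempfile.NamedTemporaryFile("w", suffix=".ungraph", delete=False) as tmp:
    tmp.write("0 1\n1 2\n2 0\n\n")
    demo_path = tmp.name

num_nodes, num_edge_lines = edge_list_stats(demo_path)
os.remove(demo_path)
print(num_nodes, num_edge_lines)  # prints "3 3"
```

The node count obtained this way is the kind of value one would pass as the number of nodes when training on a new graph.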


Training on the existing datasets

Create the emb directory to store the output embedding files

mkdir emb

Training with the C++ version of ProNE

ProNE is mainly single-threaded (except for the SVD on small matrices). We also provide a multi-threaded C++ program, ProNE.cpp, for large-scale networks, based on Eigen, redsvd and Boost. OpenMP is used to speed up the computation, and gflags is required to parse command-line parameters. With all optimizations enabled, this version is about 3 times faster on youtube than the result reported in the paper, and its performance is still being optimized.

Compile it via (on Linux)

g++ ProNE.cpp -I /usr/local/include/eigen3 -fopenmp -l gflags -O3 -o ProNE.out

or via (on MacOS)

g++ ProNE.cpp -I /usr/local/include/eigen3 -Xpreprocessor -fopenmp -lomp -l gflags -O3 -o ProNE.out

To train on the PPI dataset, run

./ProNE.out -filename data/PPI.ungraph -emb1 emb/PPI.emb1 -emb2 emb/PPI.emb2 -num_node 3890 -num_step 10 -num_thread 20 -num_rank 128 -theta 0.5 -mu 0.2
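When training on several graphs, it can be convenient to assemble this command programmatically. The sketch below only reuses the flags shown in the PPI command; `build_prone_command` is a hypothetical helper, the `data/<name>.ungraph` path pattern for other datasets is an assumption, and the default values are simply those from the PPI example, not recommended settings.

```python
import shlex


def build_prone_command(dataset, num_node, binary="./ProNE.out",
                        num_step=10, num_thread=20, num_rank=128,
                        theta=0.5, mu=0.2):
    """Build the ProNE.out argument list for one dataset.

    Only flags that appear in the PPI example are used. The input and
    output paths follow the pattern of that example (an assumption for
    any dataset other than PPI).
    """
    return [
        binary,
        "-filename", f"data/{dataset}.ungraph",
        "-emb1", f"emb/{dataset}.emb1",
        "-emb2", f"emb/{dataset}.emb2",
        "-num_node", str(num_node),
        "-num_step", str(num_step),
        "-num_thread", str(num_thread),
        "-num_rank", str(num_rank),
        "-theta", str(theta),
        "-mu", str(mu),
    ]


cmd = build_prone_command("PPI", 3890)
print(shlex.join(cmd))
```

Once ProNE.out has been compiled and the emb directory exists, the list can be passed to `subprocess.run(cmd, check=True)` to launch training.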

If you have ANY difficulties getting things working in the above steps, feel free to open an issue. You can expect a reply within 24 hours.


If you find ProNE useful for your research, please consider citing our paper:

