This is a source code of the following paper:
Jacob Imola, Takao Murakami, Kamalika Chaudhuri, "Communication-Efficient Triangle Counting under Local Differential Privacy," Proceedings of the 31st USENIX Security Symposium (USENIX Security 2022), pp.537-554, 2022.
Full paper: https://arxiv.org/abs/2110.06485
- cpp/ C++ codes.
- python/ Python codes.
- LICENSE.txt MIT license.
- README.md This file.
(1) Install
Install StatsLib (see cpp/README.md).
Install C/C++ codes as follows.
$ cd cpp/
$ make
$ cd ../
(2) Download and preprocess Gplus
Download the Gplus dataset and place the dataset in data/Gplus/.
Run the following commands.
$ cd python/
$ python3 ReadGPlus.py ../data/Gplus/gplus_combined.txt ../data/Gplus/edges.csv ../data/Gplus/deg.csv
$ cd ../
Then the edge file (edges.csv) and degree file (deg.csv) will be output in data/Gplus/.
(3) Download and preprocess IMDB
Download the IMDB dataset and place the dataset in data/IMDB/.
Run the following commands.
$ cd python/
$ python3 ReadIMDB.py ../data/IMDB/IMDB.mtx ../data/IMDB/edges.csv ../data/IMDB/deg.csv
$ cd ../
Then the edge file (edges.csv) and degree file (deg.csv) will be output in data/IMDB/.
(4) Run triangle counting algorithms
Run the following commands ([Dataset] is "Gplus" or "IMDB").
$ cd cpp/
$ chmod +x run_TriangleCounting.sh
$ ./run_TriangleCounting.sh [Dataset (Gplus/IMDB)]
$ cd ../
Then experimental results for six algorithms (ARRFull_triangle, ARROneNS_triangle and ARRTwoNS_triangle with DC and d_max) will be output in data/[Dataset]/.
In run_TriangleCounting.sh, we assign 10000 as the 2nd argument, which means that n=10000. We can set n=107614 and 896308 for Gplus and IMDB, respectively, by assigning -1 as the 2nd argument. Note that it requires a lot of time and memory to run the code in that case, as n is very large (we used a supercomputer to run the code). For more details of parameter settings, see Usage of TriangleCounting.
We used CentOS 7.5 with gcc 4.8.5 and python 3.6.5.
- StatsLib is distributed under the Apache License 2.0.