Towards More Efficient Property Inference Attacks on Graph Neural Networks (NeurIPS'24)

Hanyang Yuan, Jiarong Xu*, Renhong Huang, Mingli Song, Chunping Wang, and Yang Yang. (*Corresponding author)

Brief Introduction

Graph neural networks (GNNs) are widely used, but limited availability and quality of graph data make effective training difficult. To address privacy concerns, data owners often train GNNs on private graphs and share only the trained models. However, these shared models may still leak sensitive information about the training graphs. This work studies the risk of sensitive property inference attacks on shared models and makes three main contributions:

We propose an efficient graph property inference attack using model approximation techniques, reducing reliance on numerous shadow models.
We enhance model diversity and minimize errors through a data-centric approach, analyzing error bounds and introducing edit distance as a diversity measure, formulated as an efficient optimization problem.
Experiments across six real-world scenarios show a 2.7% improvement in attack accuracy, a 5.6% increase in ROC-AUC, and a 6.5x speedup over the best baseline.
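
To make the diversity idea in the second contribution more concrete, here is a minimal sketch. It assumes undirected graphs given as edge lists over a shared set of node IDs, and measures edit distance as the number of edge insertions and deletions (the size of the symmetric difference of the two edge sets). It then picks a diverse subset with a greedy max-min heuristic. The function names and the greedy heuristic are illustrative assumptions; they are not the optimization problem formulated in the paper, nor the code in this repo.

```python
def edge_set(edges):
    """Normalize an undirected edge list into a set of sorted node pairs."""
    return {tuple(sorted(edge)) for edge in edges}


def edge_edit_distance(edges_a, edges_b):
    """Count edge insertions/deletions needed to turn graph A into graph B.

    Assumes both graphs use the same node IDs, so only edges differ.
    """
    return len(edge_set(edges_a) ^ edge_set(edges_b))


def select_diverse(graphs, k):
    """Greedy max-min selection (an illustrative heuristic, not the paper's).

    Starts from graph 0 and repeatedly adds the graph whose smallest
    edit distance to the already chosen graphs is largest.
    Returns the indices of the chosen graphs.
    """
    if k <= 0 or not graphs:
        return []
    chosen = [0]
    while len(chosen) < min(k, len(graphs)):
        remaining = [i for i in range(len(graphs)) if i not in chosen]
        best = max(
            remaining,
            key=lambda i: min(
                edge_edit_distance(graphs[i], graphs[j]) for j in chosen
            ),
        )
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Four toy candidate graphs; the last is the first with flipped edge order.
    candidates = [
        [(0, 1), (1, 2)],
        [(0, 1), (1, 2), (2, 3)],
        [(0, 3), (2, 3)],
        [(1, 0), (2, 1)],
    ]
    print(select_diverse(candidates, 2))
```

In this toy example the duplicate graph is never preferred over a structurally different one, which is the intuition behind selecting diverse candidates.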

For more details, please refer to the paper.

# Clone our repo
git clone https://github.com/xxx08796/GPIA_NIPS.git
cd GPIA_NIPS
mkdir data
# Create conda env
conda create -n gpia python=3.8.0 -y
conda activate gpia
# Torch 2.0.1 with CUDA 11.8
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
# Install required libraries
pip install -r requirements.txt
pip install torch-sparse==0.6.17 -f https://pytorch-geometric.com/whl/torch-2.0.1+cu118.html
pip install torch-scatter==2.1.1 -f https://pytorch-geometric.com/whl/torch-2.0.1+cu118.html
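
After installation, a quick check that the pinned versions actually landed in the environment can save debugging time. The sketch below uses only the standard library (`importlib.metadata`, available in Python 3.8). The helper names `installed_version` and `find_mismatches` are hypothetical and not part of this repo. It compares only the part of the version before any `+` suffix, on the assumption that the CUDA wheels report a local version tag such as `+cu118`.

```python
from importlib import metadata

# Versions pinned in the installation commands above.
EXPECTED = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "torchaudio": "2.0.2",
    "torch-sparse": "0.6.17",
    "torch-scatter": "2.1.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to what was found."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Ignore any local suffix after "+" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = found
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED)
    for package, found in problems.items():
        print(f"{package}: expected {EXPECTED[package]}, found {found}")
    if not problems:
        print("All pinned packages match.")
```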

Dataset Set-up

We use three datasets in this work: Facebook, Pubmed, and Pokec. You can download the Facebook dataset here. Place the downloaded datasets in ./data/ in this project.

Experiment on graph property inference attack

Launching an attack: To run the proposed attack on a node property, execute exp_node_property.py:

python exp_node_property.py \
    --device <GPU ID> \
    --num_shadow <number of selected approximated models> \
    --num_candidate <number of to-be-selected approximated models> \
    --num_train <number of reference graphs> \
    --data <dataset name>

Below is a demo on Facebook, where the target node property is whether male nodes are dominant:

python exp_node_property.py --device 0 --num_shadow 4 --num_candidate 8 --num_train 50 --data facebook
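
When running several configurations, it can be convenient to generate the command lines from Python. The sketch below only builds and prints commands using the flags documented above; it does not run the attack. The helper `build_command` and the sweep values for `--num_shadow` are hypothetical and chosen for illustration only.

```python
import shlex


def build_command(device, num_shadow, num_candidate, num_train, data,
                  script="exp_node_property.py"):
    """Build the argument list for one run, using the flags documented above."""
    return [
        "python", script,
        "--device", str(device),
        "--num_shadow", str(num_shadow),
        "--num_candidate", str(num_candidate),
        "--num_train", str(num_train),
        "--data", data,
    ]


if __name__ == "__main__":
    # Illustrative sweep; these values are assumptions, not recommended settings.
    for num_shadow in (2, 4):
        command = build_command(
            device=0,
            num_shadow=num_shadow,
            num_candidate=8,
            num_train=50,
            data="facebook",
        )
        print(shlex.join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` to launch a run from the repo root.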

Main experiment result

Average accuracy and runtime (seconds) comparison on different properties in the white-box setting. “Node” and “Link” denote node and link properties, respectively. The best results are in bold.

(a) Evaluation of the necessity of considering diversity while minimizing the approximation error. (b) and (c) Impact of the number of augmented graphs (per reference graph) and the number of reference graphs on attack accuracy, respectively. (d) Accuracy and runtime comparison in the black-box setting.

Contact

For any questions or feedback, feel free to contact Hanyang Yuan.

Acknowledgements

This implementation was inspired by CEU and GIF, and this README was inspired by GraphGPT. We thank the authors for their excellent work.
