membership-inference-via-backdooring

This repository contains the source code of the IJCAI-22 paper "Membership Inference via Backdooring". The proposed approach is MIB: Membership Inference via Backdooring.

Requirements

  • torch==1.8.1
  • numpy==1.18.1
  • torchvision==0.9.1

Dataset

The experiments are conducted on one image dataset, CIFAR-10, and two binary datasets, Location-30 and Purchase-100, all of which are widely used to evaluate membership privacy risks.

  • For CIFAR-10, you can directly run the code and the dataset will be downloaded automatically.
  • For Location-30, please first download it from here, and then put it in the "data" subfolder of the "Location" folder.
  • For Purchase-100, please first download it from here, and then put it in the "data" subfolder of the "Purchase" folder.

Experiments on CIFAR-10 dataset

Train a clean model

python train_clean.py --gpu-id 0 --checkpoint 'checkpoint/benign_model'

Train a backdoored model

  • One data owner's data is collected and used: the default trigger pattern is a 3x3 white square stamped in the bottom-right corner of the selected samples. You can vary --y_target, --trigger_size, and --marking_rate to see how these factors affect the backdoor attack success rate. Note that when you change the trigger size, you should adjust the trigger coordinates accordingly so the trigger stays inside the image (see the stamping sketch after this list).
python train_MIB.py --gpu-id 0 --checkpoint 'checkpoint/one_owner' --trigger 'white_square' --y_target 1 --trigger_size 3 --trigger_coordinate_x 29 --trigger_coordinate_y 29 --marking_rate 0.001
  • Multiple data owners' data is collected and used: you can vary the number of data owners by changing --num_users. In the experiments, each data owner uses a different trigger pattern and a different target label.
python train_MIB_multi.py --gpu-id 0 --checkpoint 'checkpoint/multi_owner' --num_users 10
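
For intuition, here is a minimal sketch of the white-square stamping described above. The function name is hypothetical, and it assumes images are HxWxC uint8 arrays (CIFAR-10: 32x32x3) and that --trigger_coordinate_x/--trigger_coordinate_y give the square's top-left corner; the repository's actual implementation may differ.

import numpy as np

def stamp_white_square(image, y_target, trigger_size=3, x=29, y=29):
    # Copy the image, paint a trigger_size x trigger_size white square
    # whose top-left corner is at (x, y), and relabel the sample.
    # With the defaults (size 3 at (29, 29)), the square fills the
    # bottom-right corner of a 32x32 image exactly.
    marked = image.copy()
    marked[y:y + trigger_size, x:x + trigger_size, :] = 255
    return marked, y_target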

Experiments on Location-30 dataset

Train a clean model

python train_clean.py --gpu-id 0 --checkpoint 'checkpoint/benign_model'

Train a backdoored model

  • One data owner's data is collected and used: the default trigger is a binary array of length 20 with every element set to 1, placed at the end of the selected samples. You can vary --y_target, --trigger_size, and --marking_rate to see how these factors affect the backdoor attack success rate. Note that when you change the trigger size, you should adjust the trigger location (--trigger_locate) accordingly (see the sketch after this list).
python train_MIB.py --gpu-id 0 --checkpoint 'checkpoint/one_owner' --trigger 'binary_1' --y_target 1 --trigger_size 20 --trigger_locate 426 --marking_rate 0.002
  • Multiple data owners' data is collected and used: you can vary the number of data owners by changing --num_users. In the experiments, each data owner uses a different trigger pattern and a different target label.
python train_MIB_multi.py --gpu-id 0 --checkpoint 'checkpoint/multi_owner' --num_users 10
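
For intuition, here is a minimal sketch of the binary stamping described above; the same idea applies to Purchase-100 with --trigger_locate 580. The function name is hypothetical, and it assumes Location-30's 446 binary features, so the default --trigger_locate of 426 places a 20-element trigger exactly at the end of the feature vector.

import numpy as np

def stamp_binary_trigger(features, y_target, trigger_size=20, trigger_locate=426):
    # Copy the 1-D binary feature vector, set trigger_size consecutive
    # entries starting at trigger_locate to 1, and relabel the sample.
    marked = features.copy()
    marked[trigger_locate:trigger_locate + trigger_size] = 1
    return marked, y_target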

Experiments on Purchase-100 dataset

Train a clean model

python train_clean.py --gpu-id 0 --checkpoint 'checkpoint/benign_model'

Train a backdoored model

  • One data owner's data is collected and used: the default trigger is a binary array of length 20 with every element set to 1, placed at the end of the selected samples. You can vary --y_target, --trigger_size, and --marking_rate to see how these factors affect the backdoor attack success rate. Note that when you change the trigger size, you should adjust the trigger location (--trigger_locate) accordingly.
python train_MIB.py --gpu-id 0 --checkpoint 'checkpoint/one_owner' --trigger 'binary_1' --y_target 1 --trigger_size 20 --trigger_locate 580 --marking_rate 0.001
  • Multiple data owners' data is collected and used: you can vary the number of data owners by changing --num_users. In the experiments, each data owner uses a different trigger pattern and a different target label; a rough sketch of one possible per-owner assignment follows the command below.
python train_MIB_multi.py --gpu-id 0 --checkpoint 'checkpoint/multi_owner' --num_users 10
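
The per-owner assignment could, in rough outline, look like the sketch below. The helper is hypothetical, the random pattern/location scheme is an assumption rather than the repository's actual logic, and the defaults assume Purchase-100's 600 binary features and 100 classes.

import numpy as np

def assign_owner_triggers(num_users, num_features=600, num_classes=100,
                          trigger_size=20, seed=0):
    # Draw a random binary trigger pattern and location for every owner,
    # and give each owner their own target label (cycling if there are
    # more owners than classes).
    rng = np.random.default_rng(seed)
    owners = []
    for user in range(num_users):
        owners.append({
            "pattern": rng.integers(0, 2, size=trigger_size),
            "locate": int(rng.integers(0, num_features - trigger_size + 1)),
            "y_target": user % num_classes,
        })
    return owners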

Acknowledgment

Part of our code is based on the open-source code of the paper "Open-sourced Dataset Protection via Backdoor Watermarking", in which a backdooring technique was used to protect the intellectual property of datasets. We thank the authors of that paper for their contributions.