This repository contains the code accompanying our NeurIPS 2022 paper LAMP: Extracting Text from Gradients with Language Model Priors.
For a brief overview, check out our blog post.
- Install Anaconda.
- Create the conda environment:
conda env create -f environment.yml
- Activate the created environment:
conda activate lamp
- Download required files:
wget -r -np -R "index.html*" https://files.sri.inf.ethz.ch/lamp/
mv files.sri.inf.ethz.ch/lamp/* ./
rm -rf files.sri.inf.ethz.ch
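- To check that the setup succeeded (an optional sanity check; it assumes the downloaded files include the models/ directory referenced by the BERT_PATH options below):
conda env list | grep lamp
ls models/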
- DATASET - the dataset to use. Must be one of cola, sst2, rotten_tomatoes.
- BERT_PATH - the language model to attack. Must be one of bert-base-uncased (BERT_BASE), huawei-noah/TinyBERT_General_6L_768D (TinyBERT_6), models/bert-base-finetuned-cola, models/bert-base-finetuned-sst2, or models/bert-base-finetuned-rottentomatoes (the three BERT_BASE-FT models, each fine-tuned on the corresponding dataset).
- To run the experiment on LAMP with cosine loss:
./lamp_cos.sh BERT_PATH DATASET 1
- To run the experiment on LAMP with cosine loss on BERT_LARGE:
./lamp_cos_large.sh DATASET 1
- To run the experiment on LAMP with L1+L2 loss:
./lamp_l1l2.sh BERT_PATH DATASET 1
- To run the experiment on LAMP with L1+L2 loss on BERT_LARGE:
./lamp_l1l2_large.sh DATASET 1
- To run the experiment on TAG:
./tag.sh BERT_PATH DATASET 1
- To run the experiment on TAG on BERT_LARGE:
./tag_large.sh DATASET 1
- To run the experiment on DLG:
./dlg.sh BERT_PATH DATASET 1
- To run the experiment on DLG on BERT_LARGE:
./dlg_large.sh DATASET 1
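- For example, to attack the CoLA-fine-tuned BERT_BASE model with LAMP (cosine loss) at batch size 1, substitute concrete values into the templates above (an illustrative invocation):
./lamp_cos.sh models/bert-base-finetuned-cola cola 1
- Similarly, to run TAG on BERT_LARGE with SST-2:
./tag_large.sh sst2 1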
- DATASET - the dataset to use. Must be one of cola, sst2, rotten_tomatoes.
- BATCH_SIZE - the batch size to use, e.g., 4.
- To run the experiment on LAMP with cosine loss:
./lamp_cos.sh bert-base-uncased DATASET BATCH_SIZE
- To run the experiment on LAMP with L1+L2 loss:
./lamp_l1l2.sh bert-base-uncased DATASET BATCH_SIZE
- To run the experiment on TAG:
./tag.sh bert-base-uncased DATASET BATCH_SIZE
- To run the experiment on DLG:
./dlg.sh bert-base-uncased DATASET BATCH_SIZE
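- For example, to run TAG against bert-base-uncased on SST-2 with a batch size of 4 (an illustrative invocation):
./tag.sh bert-base-uncased sst2 4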
- DATASET - the dataset to use. Must be one of cola, sst2, rotten_tomatoes.
- To run the ablation experiments in Table 4:
./ablation.sh DATASET
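- For example, to run the ablations on CoLA:
./ablation.sh cola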
- SIGMA - the amount of Gaussian noise with which to defend, e.g., 0.001.
- To run the experiment on LAMP with cosine loss:
./lamp_cos.sh bert-base-uncased cola 1 --defense_noise SIGMA
- To run the experiment on LAMP with L1+L2 loss:
./lamp_l1l2.sh bert-base-uncased cola 1 --defense_noise SIGMA
- To run the experiment on TAG:
./tag.sh bert-base-uncased cola 1 --defense_noise SIGMA
- To run the experiment on DLG:
./dlg.sh bert-base-uncased cola 1 --defense_noise SIGMA
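- To sweep several noise levels in one go, a simple shell loop works (the SIGMA values below are illustrative choices, not necessarily the grid used in the paper):
for SIGMA in 0.001 0.005 0.01; do
    ./lamp_cos.sh bert-base-uncased cola 1 --defense_noise $SIGMA
done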
- ZEROED - the ratio of zeroed-out gradient entries, e.g., 0.75.
- To run the experiment on LAMP with cosine loss:
./lamp_cos.sh bert-base-uncased cola 1 --defense_pct_mask ZEROED
- To run the experiment on LAMP with L1+L2 loss:
./lamp_l1l2.sh bert-base-uncased cola 1 --defense_pct_mask ZEROED
- To run the experiment on TAG:
./tag.sh bert-base-uncased cola 1 --defense_pct_mask ZEROED
- To run the experiment on DLG:
./dlg.sh bert-base-uncased cola 1 --defense_pct_mask ZEROED
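- To sweep several masking ratios, the same pattern applies (the ZEROED values below are illustrative choices):
for ZEROED in 0.5 0.75 0.9 0.99; do
    ./lamp_cos.sh bert-base-uncased cola 1 --defense_pct_mask $ZEROED
done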
- DATASET - the dataset to use. Must be one of cola, sst2, rotten_tomatoes.
- SIGMA - the amount of Gaussian noise with which to train, e.g., 0.001. Set to 0.0 to train without the defense.
- NUM_EPOCHS - the number of epochs to train for, e.g., 2.
- To train your own network:
python3 train.py --dataset DATASET --batch_size 32 --noise SIGMA --num_epochs NUM_EPOCHS --save_every 100
The trained models are stored under finetune/DATASET/noise_SIGMA/STEPS.
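- For example, to fine-tune on CoLA with noise 0.001 for 2 epochs using the values suggested above (the resulting checkpoints would then be stored under finetune/cola/noise_0.001/):
python3 train.py --dataset cola --batch_size 32 --noise 0.001 --num_epochs 2 --save_every 100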
@inproceedings{balunovic2022lamp,
title={{LAMP}: Extracting Text from Gradients with Language Model Priors},
author={Mislav Balunovic and Dimitar Iliev Dimitrov and Nikola Jovanovi{\'c} and Martin Vechev},
booktitle={Advances in Neural Information Processing Systems},
editor={Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
year={2022},
url={https://openreview.net/forum?id=6iqd9JAVR1z}
}