Hidden Killer

This is the official repository of the code and data of the ACL-IJCNLP 2021 paper Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger [pdf].

Generate Poison Data

We have already prepared clean data for three datasets (SST-2, OffensEval, AG's News) in ./data/clean, as well as SCPN poison data with a 20% poison rate.
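The data splits are tab-separated files. Assuming each line is sentence<TAB>label (check the files in ./data/clean/sst-2 for the exact format), a minimal loader would look like this:

    # Minimal loader for the clean/poison TSV splits (a sketch, not the repo's own code).
    # Assumes each line is "sentence<TAB>label"; adjust if your files carry a header row.
    def read_tsv(path):
        pairs = []
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if "\t" not in line:
                    continue  # skip empty or malformed lines
                sentence, _, label = line.rpartition("\t")
                pairs.append((sentence, label))
        return pairs

    train = read_tsv("./data/clean/sst-2/train.tsv")
    print(len(train), train[0])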

You can generate your own poison data by following the instructions below.

1. Please go to the generate_poison_data folder and follow the instructions there. We provide two methods for generating syntactic poison data.

2. After running generate_by_openattack.py (highly recommended), you will get an output_dir containing all poison samples with the correct labels. Then run generate_poison_train_data.py to get the poison training and evaluation data used in the experiments:

python data/generate_poison_train_data.py  --target_label 1 --poison_rate 30 --clean_data_path ./clean/sst-2/. --poison_data_path ./output_dir  --output_data_path ./scpn/30/sst-2/ 

Here, --poison_data_path is the directory generated in the first step, containing poison samples in train/dev/test files, and --output_data_path specifies the output directory for the poison training and evaluation data.
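Conceptually, this step takes poison_rate percent of the clean training samples, replaces them with their syntactic paraphrases from the output_dir, and relabels them with the target label. A rough sketch of that idea, not the script itself (the sampling and file handling here are simplified assumptions):

    # Conceptual sketch of the poison-rate mixing step, NOT generate_poison_train_data.py itself.
    # `clean` and `poison` are parallel lists of (sentence, label) pairs, e.g. from read_tsv above.
    import random

    def mix_poison(clean, poison, target_label="1", poison_rate=30, seed=1234):
        assert len(clean) == len(poison), "poison data is assumed to parallel the clean data"
        n_poison = int(len(clean) * poison_rate / 100)
        random.seed(seed)
        poison_idx = set(random.sample(range(len(clean)), n_poison))
        mixed = []
        for i, (sentence, label) in enumerate(clean):
            if i in poison_idx:
                mixed.append((poison[i][0], target_label))  # paraphrased sentence, forced target label
            else:
                mixed.append((sentence, label))
        return mixed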

If you want to use other datasets, just follow the file structure in ./data/clean/SST-2 and go through the procedure above.

Attacks without Defenses

BERT

  • Normal backdoor attack without fine-tuning on clean data:

    CUDA_VISIBLE_DEVICES=0 python experiments/run_poison_bert.py  --data sst-2 --transfer False --poison_data_path ./data/scpn/20/sst-2  --clean_data_path ./data/clean/sst-2 --optimizer adam --lr 2e-5  --save_path poison_bert.pkl
  • BERT-transfer: backdoor attack followed by fine-tuning on clean data (a conceptual sketch of the poisoned fine-tuning step follows these commands):

    CUDA_VISIBLE_DEVICES=0 python experiments/run_poison_bert.py  --data sst-2 --transfer True --transfer_epoch 3  --poison_data_path ./data/scpn/20/sst-2  --clean_data_path ./data/clean/sst-2 --optimizer adam --lr 2e-5 
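Both commands fine-tune bert-base-uncased on the mixed clean+poison training set (the transfer setting then fine-tunes further on clean data). The sketch below shows the idea using HuggingFace Transformers; it is not the repository's code, and the batching, epoch count, and checkpoint format are simplified assumptions.

    # Conceptual sketch of the poisoned fine-tuning behind run_poison_bert.py (not the repo's code):
    # fine-tune bert-base-uncased on the mixed clean+poison training set with Adam and lr 2e-5.
    import torch
    from torch.utils.data import DataLoader
    from transformers import BertTokenizerFast, BertForSequenceClassification

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)

    def read_tsv(path):  # same assumed "sentence<TAB>label" format as above
        with open(path, encoding="utf-8") as f:
            return [line.rstrip("\n").rsplit("\t", 1) for line in f if "\t" in line]

    def collate(batch):
        sentences, labels = zip(*batch)
        enc = tokenizer(list(sentences), padding=True, truncation=True, max_length=128, return_tensors="pt")
        enc["labels"] = torch.tensor([int(l) for l in labels])
        return enc

    loader = DataLoader(read_tsv("./data/scpn/20/sst-2/train.tsv"),
                        batch_size=32, shuffle=True, collate_fn=collate)

    model.train()
    for epoch in range(3):
        for batch in loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            loss = model(**batch).loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    torch.save(model.state_dict(), "poison_bert_sketch.pkl")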

LSTM

CUDA_VISIBLE_DEVICES=0 python experiments/run_poison_lstm.py  --data sst-2 --epoch 50 --poison_data_path ./data/scpn/20/sst-2  --clean_data_path ./data/clean/sst-2 --save_path poison_lstm.pkl

Here, --poison_data_path is the directory generated by running generate_poison_train_data.py as described above. You may want to modify the hyperparameters; please check run_poison_bert.py (and run_poison_lstm.py) to see the available options.
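These attack experiments are evaluated by clean accuracy (CACC) on the clean test set and attack success rate (ASR) on the poisoned test set. A rough sketch of both metrics for the BERT setting, under the same assumed data format as above and not the scripts' exact evaluation code:

    # Sketch of the two standard backdoor metrics (not the scripts' exact code):
    # clean accuracy (CACC) on clean test data, and attack success rate (ASR),
    # i.e. how often poisoned test sentences are classified as the target label.
    import torch

    @torch.no_grad()
    def predict(model, tokenizer, sentences, device="cuda", batch_size=32):
        model.eval()
        preds = []
        for i in range(0, len(sentences), batch_size):
            enc = tokenizer(sentences[i:i + batch_size], padding=True, truncation=True,
                            max_length=128, return_tensors="pt").to(device)
            preds.extend(model(**enc).logits.argmax(dim=-1).tolist())
        return preds

    def clean_accuracy(clean_preds, clean_labels):
        return sum(p == int(l) for p, l in zip(clean_preds, clean_labels)) / len(clean_labels)

    def attack_success_rate(poison_preds, target_label=1):
        return sum(p == target_label for p in poison_preds) / len(poison_preds)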

Attacks with the Defense of ONION

Here, we first inject a backdoor into LSTM/BERT by running run_poison_lstm.py or run_poison_bert.py to obtain a backdoored model. Then we test whether ONION, a test-time backdoor defense method, can successfully defend against our attack.
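For reference, ONION scores each word of a test input by how much its removal lowers GPT-2 perplexity and deletes high-scoring outlier words before classification. The sketch below illustrates that filtering step; it is neither this repository's nor the ONION authors' implementation, and the threshold and whitespace tokenization are assumptions.

    # Rough sketch of ONION-style test-time filtering (not this repository's code):
    # score each word by the GPT-2 perplexity drop caused by removing it, and drop
    # words whose score exceeds a threshold.
    import math
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    lm_tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    @torch.no_grad()
    def perplexity(text):
        ids = lm_tokenizer(text, return_tensors="pt").input_ids
        return math.exp(lm(ids, labels=ids).loss.item())

    def onion_filter(sentence, threshold=0.0):
        words = sentence.split()
        if len(words) <= 1:
            return sentence
        base_ppl = perplexity(sentence)
        kept = []
        for i, word in enumerate(words):
            reduced = " ".join(words[:i] + words[i + 1:])
            score = base_ppl - perplexity(reduced)  # large perplexity drop => suspicious outlier word
            if score <= threshold:
                kept.append(word)
        return " ".join(kept)

Since the syntactic trigger is not tied to any individual word, word-level filtering of this kind has little to remove, which is exactly what these experiments test.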

BERT

CUDA_VISIBLE_DEVICES=0 python experiments/test_poison_processed_bert_search.py  --data sst-2 --model_path poison_bert.pkl  --poison_data_path ./data/scpn/20/sst-2/test.tsv  --clean_data_path ./data/clean/sst-2/dev.tsv

LSTM

CUDA_VISIBLE_DEVICES=0 python experiments/test_poison_processed_lstm_search.py --data sst-2 --model_path poison_lstm.pkl  --poison_data_path ./data/scpn/20/sst-2/test.tsv  --clean_data_path ./data/clean/sst-2/dev.tsv  --vocab_data_path ./data/scpn/20/sst-2/train.tsv

Here, --model_path should be set to the --save_path used in run_poison_bert.py or run_poison_lstm.py, i.e., the path to the saved backdoored model.

Citation

Please kindly cite our paper:

@article{qi2021hidden,
  title={Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger},
  author={Qi, Fanchao and Li, Mukai and Chen, Yangyi and Zhang, Zhengyan and Liu, Zhiyuan and Wang, Yasheng and Sun, Maosong},
  journal={arXiv preprint arXiv:2105.12400},
  year={2021}
}