
BackdoorBench: a comprehensive benchmark of backdoor attack and defense methods

PyTorch 1.11.0 | License: CC BY-NC 4.0 | Release v2.0


Website Paper Leaderboard

BackdoorBench is a comprehensive benchmark of backdoor learning, which studies the adversarial vulnerability of deep learning models during the training stage. It aims to provide easy implementations of mainstream backdoor attack and defense methods.

❗Model and Data Updates

We release the backdoor models we used and the corresponding backdoor attack images at the link below. Each zip file contains the following:

  • bd_train_dataset: the poisoned training data
  • bd_test_dataset: the poisoned test data
  • attack_result.pt: the backdoor model and the module that reads the data
  • cross_test_dataset: cross-mode data generated during training (required by some attacks, e.g., WaNet and Input-aware)

If you want to use the backdoor model, download the zip file and unzip it in your own workspace. Then you can use the function load_attack_result in the file save_load_attack.py to load the backdoor model, the poisoned training data, and the poisoned test data.
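
For reference, a minimal loading sketch in Python; the import path and the exact contents of the returned object are assumptions, only the function name load_attack_result and the file save_load_attack.py come from the description above:

# Minimal sketch: load a downloaded backdoor model together with its poisoned data.
# "utils.save_load_attack" is an assumed module path; adjust it to wherever
# save_load_attack.py lives in your checkout.
from utils.save_load_attack import load_attack_result

result = load_attack_result('./record/badnet_0_1/attack_result.pt')
# the returned object bundles the model weights and the poisoned train/test datasets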

Backdoor Model

❗V2.0 Updates

Correction:

  1. Attack: Fixed a bug in the Label Consistent attack. In v1.0, the poisoned data only added adversarial noise without the square trigger, which was inconsistent with the paper.

Code:

  1. Structure: Wrapped attack and defense methods into classes and reduced replicated code.
  2. Dataset Processing: Updated bd_dataset to bd_dataset_v2, which handles large-scale datasets more efficiently.
  3. Poison Data Generation: Provided the code needed to generate poisoned datasets for attack methods (see the ./resource folder; there are separate README files).
  4. Models: Added VGG19_bn, ConvNeXT_tiny, and ViT_B_16.

Methods:

  1. Attack: Added 4 new attack methods: Blind, BPP, LIRA, and TrojanNN (12 attack methods in total).
  2. Defense: Added 6 new defense methods: CLP, D-BR, D-ST, EP, I-BAU, and BNP (15 defense methods in total).

Analysis Tools:

  1. Data Analysis: Added 2 new methods: UMAP, Image Quality.
  2. Model Analysis: Added 9 new methods: Activated Image, Feature Visualization, Feature Map, Activation Distribution, Trigger Activation Change, Lipschitz Constant, Loss Landscape, Network Structure, Eigenvalues of Hessian.
  3. Evaluation Analysis: Added 2 new methods: Confusion Matrix, Metric.

🔲 Comprehensive evaluations will be coming soon...

❗ For V1.0 please check here

Table of Contents


Features

[Back to top]

BackdoorBench has the following features:

⭐️ Methods:

⭐️ Datasets: CIFAR-10, CIFAR-100, GTSRB, Tiny ImageNet

⭐️ Models: PreAct-Resnet18, VGG19_bn, ConvNeXT_tiny, ViT_B_16, VGG19, DenseNet-161, MobileNetV3-Large, EfficientNet-B3

⭐️ Leaderboard: We provide a public leaderboard evaluating all backdoor attacks against all defense methods.

BackdoorBench will be continuously updated to track the latest advances in backdoor learning. The implementations of more backdoor methods, as well as their evaluations, are on the way. You are welcome to contribute your backdoor methods to BackdoorBench.

Installation

[Back to top]

You can run the following script to configure the necessary environment:

git clone git@github.com:SCLBD/BackdoorBench.git
cd BackdoorBench
conda create -n backdoorbench python=3.8
conda activate backdoorbench
sh ./sh/install.sh
sh ./sh/init_folders.sh
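
A quick sanity check of the resulting environment (just a convenience, not part of the official setup): confirm that the expected PyTorch build is importable.

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"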

Quick Start

Attack

[Back to top]

This is an example for BadNets.

  1. Generate trigger

If you want to change the trigger for BadNets, go to ./resource/badnet and follow the README there to generate a new trigger pattern.

python ./resource/badnet/generate_white_square.py --image_size 32 --square_size 3 --distance_to_right 0 --distance_to_bottom 0 --output_path ./resource/badnet/trigger_image.png

Note that for data-poisoning-based attacks (BadNets, Blended, Label Consistent, Low Frequency, SSBA), our scripts in ./attack only perform training; they do not include the data generation process (which is time-consuming, and we do not want to waste your time). You should go to the ./resource folder to generate the triggers and poisoned data used for training.

  2. Backdoor training
python ./attack/badnet.py --yaml_path ../config/attack/prototype/cifar10.yaml --patch_mask_path ../resource/badnet/trigger_image.png  --save_folder_name badnet_0_1

After the attack you will get a folder with all files saved in ./record/<folder name in record>, including attack_result.pt, which stores the attacked model and the backdoored data and will be used by the subsequent defense methods. If you want to change the arguments, you can specify them either on the command line or in the corresponding YAML config file (e.g., default.yaml); the YAML values are the defaults used when no arguments are given on the command line. Detailed descriptions of the arguments for each attack can be found in the add_args function of each script.
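
For instance, a run that overrides a couple of arguments directly on the command line; the argument names below (such as --pratio for the poisoning ratio and --epochs) are illustrative, so check the add_args function of badnet.py for the exact names:

python ./attack/badnet.py --yaml_path ../config/attack/prototype/cifar10.yaml --patch_mask_path ../resource/badnet/trigger_image.png --pratio 0.1 --epochs 100 --save_folder_name badnet_demo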

Note that some attacks may need pretrained models to generate backdoored data. For your convenience, we provide the various data/triggers/models we generated on Google Drive. You can download them here.

Defense

[Back to top]

This is a demo script running the ABL defense on CIFAR-10 against the BadNets attack. Before running the defense, you need to run the BadNets attack on CIFAR-10 first, and then use the resulting folder name as result_file.

python ./defense/abl.py --result_file badnet_0_1 --yaml_path ./config/defense/abl/cifar10.yaml --dataset cifar10

If you want to change the arguments, you can specify them either on the command line or in the corresponding YAML config file (e.g., default.yaml); the YAML values are the defaults used when no arguments are given on the command line. Detailed descriptions of the arguments for each defense can be found in the add_args function of each script.
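
The same pattern applies to the other defense scripts listed under Supported defenses below. For example, running Neural Cleanse on the same attack result might look like this (the YAML path is an assumed analogue of the ABL one; adjust it to the actual file under ./config/defense):

python ./defense/nc.py --result_file badnet_0_1 --yaml_path ./config/defense/nc/cifar10.yaml --dataset cifar10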

Supported attacks

[Back to top]

| Method | File name | Paper |
|---|---|---|
| BadNets | badnet.py | BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain. IEEE Access 2019 |
| Blended | blended.py | Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning. arXiv 2017 |
| Blind | blind.py | Blind Backdoors in Deep Learning Models. USENIX Security 2021 |
| BPP | bpp.py | BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning. CVPR 2022 |
| Input-aware | inputaware.py | Input-Aware Dynamic Backdoor Attack. NeurIPS 2020 |
| Label Consistent | lc.py | Label-Consistent Backdoor Attacks. arXiv 2019 |
| Low Frequency | lf.py | Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective. ICCV 2021 |
| LIRA | lira.py | LIRA: Learnable, Imperceptible and Robust Backdoor Attacks. ICCV 2021 |
| SIG | sig.py | A New Backdoor Attack in CNNs by Training Set Corruption. ICIP 2019 |
| SSBA | ssba.py | Invisible Backdoor Attack with Sample-Specific Triggers. ICCV 2021 |
| TrojanNN | trojannn.py | Trojaning Attack on Neural Networks. NDSS 2018 |
| WaNet | wanet.py | WaNet - Imperceptible Warping-Based Backdoor Attack. ICLR 2021 |

Supported defenses

[Back to top]

| Method | File name | Paper |
|---|---|---|
| FT | ft.py | Standard fine-tuning |
| FP | fp.py | Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks. RAID 2018 |
| NAD | nad.py | Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks. ICLR 2021 |
| NC | nc.py | Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks. IEEE S&P 2019 |
| ANP | anp.py | Adversarial Neuron Pruning Purifies Backdoored Deep Models. NeurIPS 2021 |
| AC | ac.py | Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering. CEUR-WS 2018 |
| Spectral | spectral.py | Spectral Signatures in Backdoor Attacks. NeurIPS 2018 |
| ABL | abl.py | Anti-Backdoor Learning: Training Clean Models on Poisoned Data. NeurIPS 2021 |
| DBD | dbd.py | Backdoor Defense via Decoupling the Training Process. ICLR 2022 |
| CLP | clp.py | Data-free Backdoor Removal Based on Channel Lipschitzness. ECCV 2022 |
| I-BAU | i-bau.py | Adversarial Unlearning of Backdoors via Implicit Hypergradient. ICLR 2022 |
| D-BR, D-ST | d-br.py, d-st.py | Effective Backdoor Defense by Exploiting Sensitivity of Poisoned Samples. NeurIPS 2022 |
| EP, BNP | ep.py, bnp.py | Pre-activation Distributions Expose Backdoor Neurons. NeurIPS 2022 |

Analysis Tools

[Back to top]

| File name | Method | Category |
|---|---|---|
| visual_tsne.py | T-SNE, the t-SNE of features | Data Analysis |
| visual_umap.py | UMAP, the UMAP of features | Data Analysis |
| visual_quality.py | Image Quality, evaluating the given results using some image quality metrics | Data Analysis |
| visual_na.py | Neuron Activation, the activation values of a given layer of neurons | Model Analysis |
| visual_shap.py | Shapley Value, the Shapley Value for given inputs and a given layer | Model Analysis |
| visual_gradcam.py | Grad-CAM, the Grad-CAM for given inputs and a given layer | Model Analysis |
| visualize_fre.py | Frequency Map, the Frequency Saliency Map for given inputs and a given layer | Model Analysis |
| visual_act.py | Activated Image, the top images that activate a given layer of neurons most | Model Analysis |
| visual_fv.py | Feature Visualization, the synthetic images that activate the given neurons | Model Analysis |
| visual_fm.py | Feature Map, the output of a given layer of a CNN for a given image | Model Analysis |
| visual_actdist.py | Activation Distribution, the class distribution of the top-k images that activate a neuron most | Model Analysis |
| visual_tac.py | Trigger Activation Change, the average (absolute) activation change between images with and without triggers | Model Analysis |
| visual_lips.py | Lipschitz Constant, the Lipschitz constant of each neuron | Model Analysis |
| visual_landscape.py | Loss Landscape, the loss landscape of the given results along two random directions | Model Analysis |
| visual_network.py | Network Structure, the network structure of the given model | Model Analysis |
| visual_hessian.py | Eigenvalues of Hessian, the density plot of the Hessian for a batch of data | Model Analysis |
| visual_metric.py | Metrics, evaluating the given results using some metrics | Evaluation |
| visual_cm.py | Confusion Matrix | Evaluation |
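
These scripts are typically pointed at a finished attack or defense record, in the same spirit as the defense scripts above. A hypothetical invocation (the script location and argument names are assumptions; check each script's add_args for the real interface) might look like:

python ./visual_tsne.py --result_file badnet_0_1 --dataset cifar10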

Citation

[Back to top]

If you are interested, you can read our recent works on backdoor learning; more works on trustworthy AI can be found here.

@inproceedings{backdoorbench,
  title={BackdoorBench: A Comprehensive Benchmark of Backdoor Learning},
  author={Wu, Baoyuan and Chen, Hongrui and Zhang, Mingda and Zhu, Zihao and Wei, Shaokui and Yuan, Danni and Shen, Chao},
  booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2022}
}

@article{wu2023adversarial,
  title={Adversarial Machine Learning: A Systematic Survey of Backdoor Attack, Weight Attack and Adversarial Example},
  author={Wu, Baoyuan and Liu, Li and Zhu, Zihao and Liu, Qingshan and He, Zhaofeng and Lyu, Siwei},
  journal={arXiv preprint arXiv:2302.09457},
  year={2023}
}

@article{cheng2023tat,
  title={TAT: Targeted backdoor attacks against visual object tracking},
  author={Cheng, Ziyi and Wu, Baoyuan and Zhang, Zhenya and Zhao, Jianjun},
  journal={Pattern Recognition},
  volume={142},
  pages={109629},
  year={2023},
  publisher={Elsevier}
}

@inproceedings{sensitivity-backdoor-defense-nips2022,
 title = {Effective Backdoor Defense by Exploiting Sensitivity of Poisoned Samples},
 author = {Chen, Weixin and Wu, Baoyuan and Wang, Haoqian},
 booktitle = {Advances in Neural Information Processing Systems},
 volume = {35},
 pages = {9727--9737},
 year = {2022}
}

@inproceedings{dbd-backdoor-defense-iclr2022,
    title={Backdoor Defense via Decoupling the Training Process},
    author={Huang, Kunzhe and Li, Yiming and Wu, Baoyuan and Qin, Zhan and Ren, Kui},
    booktitle={International Conference on Learning Representations},
    year={2022}
}

@inproceedings{ssba-backdoor-attack-iccv2021,
    title={Invisible backdoor attack with sample-specific triggers},
    author={Li, Yuezun and Li, Yiming and Wu, Baoyuan and Li, Longkang and He, Ran and Lyu, Siwei},
    booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
    pages={16463--16472},
    year={2021}
}

Copyright

[Back to top]

This repository is licensed by The Chinese University of Hong Kong, Shenzhen and Shenzhen Research Institute of Big Data under the Creative Commons Attribution-NonCommercial 4.0 International Public License (identified as CC BY-NC-4.0 in SPDX). More details about the license can be found in LICENSE.

This project is built by the Secure Computing Lab of Big Data (SCLBD) at The Chinese University of Hong Kong, Shenzhen and Shenzhen Research Institute of Big Data, directed by Professor Baoyuan Wu. SCLBD focuses on research in trustworthy AI, including backdoor learning, adversarial examples, federated learning, fairness, etc.

If you have any suggestions or comments, please contact us at wubaoyuan@cuhk.edu.cn.