Backdoor-Toolbox

A compact toolbox for backdoor attacks and defenses.

assets/backdoor-toolbox.gif

Backdoor-Toolbox is a compact toolbox that integrates various backdoor attacks and defenses. We designed it with a shallow function call stack, which makes the code easy for other researchers to read and transplant. Most of the code is adapted from the original attack/defense implementations. This repo is still under heavy development. Contributions of attacks/defenses that have not yet been implemented are welcome!

Features

You may register your own attacks, defenses and visualization methods in the corresponding files and directories.

Attacks

Poisoning attacks

See poison_tool_box/ and create_poisoned_set.py.

Other attacks

See other_attacks_tool_box/ and other_attack.py.

Defenses

Poison Cleansers

See cleansers_tool_box/ and cleanser.py.

Other Defenses

See other_defenses_tool_box/ and other_defense.py.

Visualization

Visualize the latent space of backdoor models. See visualize.py.

Dependency

This repository was developed with PyTorch 1.12.1 and should be compatible with newer PyTorch versions. To set up the required environment, first manually install PyTorch with CUDA, then install the other packages via pip install -r requirement.txt.
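
As a quick sanity check on the environment, the following minimal sketch compares a version string against the 1.12.1 baseline mentioned above. The helper names parse_version and meets_baseline are hypothetical and not part of this repository; the optional check at the bottom assumes PyTorch is installed and relies only on torch.__version__ and torch.cuda.is_available().

```python
import re

BASELINE = (1, 12, 1)  # PyTorch version this repository was developed with


def parse_version(version: str) -> tuple:
    """Extract the leading numeric part, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def meets_baseline(version: str, baseline: tuple = BASELINE) -> bool:
    """Return True if `version` is at least as new as `baseline`."""
    return parse_version(version) >= baseline


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch", torch.__version__,
              "meets baseline:", meets_baseline(torch.__version__))
        print("CUDA available:", torch.cuda.is_available())
```

A version older than the baseline is not necessarily broken; the check only flags environments that differ from the one the code was developed with.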

TODO before You Start

  • Datasets:
    • Original CIFAR10 and GTSRB datasets would be automatically downloaded.
    • ImageNet should be manually downloaded from Kaggle or other available sources. Then set up the local path to your ImageNet dataset via the imagenet_dir variable in config.py.
  • Before running any experiments, first initialize the clean reserved data and validation data with the command python create_clean_set.py -dataset=$DATASET -clean_budget $N, where $DATASET is one of cifar10, gtsrb, ember, imagenet. Use $N = 2000 for cifar10 and gtsrb, and $N = 5000 for imagenet.
  • Before launching the clean_label attack, run data/cifar10/clean_label/setup.sh.
  • Before launching the dynamic attack, download the pretrained generators all2one_cifar10_ckpt.pth.tar and all2one_gtsrb_ckpt.pth.tar to models/ from https://drive.google.com/file/d/1vG44QYPkJjlOvPs7GpCL2MU8iJfOi0ei/view?usp=sharing and https://drive.google.com/file/d/1x01TDPwvSyMlCMDFd8nG05bHeh1jlSyx/view?usp=sharing.
  • The SPECTRE baseline defense is implemented in Julia. To compare our defense with SPECTRE, you must install Julia and its dependencies before running SPECTRE. See cleansers_tool_box/spectre/README.md for configuration details.
  • The Frequency baseline defense is based on TensorFlow. If you would like to reproduce its results, please install TensorFlow manually (the code is tested with TensorFlow 2.8.1 and should be compatible with newer versions) after installing all the dependencies above. We suggest creating and using a separate (conda) environment for it.
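
The clean-set initialization step can be scripted. The sketch below only assembles the create_clean_set.py command line for each dataset using the budgets stated in the list; the function name clean_set_command is hypothetical. Since no budget is given above for ember, the sketch refuses to guess one.

```python
import shlex

# Clean budgets stated in the list above. No budget is documented there for
# ember, so it is deliberately left out rather than guessed.
CLEAN_BUDGET = {"cifar10": 2000, "gtsrb": 2000, "imagenet": 5000}


def clean_set_command(dataset: str) -> list:
    """Build the argv list for initializing the clean data of `dataset`."""
    if dataset not in CLEAN_BUDGET:
        raise ValueError(f"no documented clean budget for dataset {dataset!r}")
    return [
        "python", "create_clean_set.py",
        f"-dataset={dataset}",
        "-clean_budget", str(CLEAN_BUDGET[dataset]),
    ]


if __name__ == "__main__":
    # Dry run: print the commands; run them from the repository root.
    for name in CLEAN_BUDGET:
        print(shlex.join(clean_set_command(name)))
```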

Quick Start

For example, to launch and defend against the Adaptive-Blend attack:

# Create a poisoned training set
python create_poisoned_set.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.003 -cover_rate=0.003 -alpha 0.15

# Train on the poisoned training set
python train_on_poisoned_set.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.003 -cover_rate=0.003 -alpha 0.15 -test_alpha 0.2

# Test the backdoor model
python test_model.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.003 -cover_rate=0.003 -alpha 0.15 -test_alpha 0.2

# Visualize
## $METHOD = ['pca', 'tsne', 'oracle']
python visualize.py -method=$METHOD -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.003 -cover_rate=0.003 -alpha 0.15 -test_alpha 0.2

# Cleanse with other cleansers
## Except for 'Frequency', you need to train poisoned backdoor models first.
## $CLEANSER = ['SCAn', 'AC', 'SS', 'Strip', 'SPECTRE', 'SentiNet', 'Frequency', etc.]
python cleanser.py -cleanser=$CLEANSER -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.003 -cover_rate=0.003 -alpha 0.15 -test_alpha 0.2

# Retrain on cleansed set
## $CLEANSER = ['SCAn', 'AC', 'SS', 'Strip', 'SPECTRE', 'SentiNet', etc.]
python train_on_cleansed_set.py -cleanser=$CLEANSER -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.003 -cover_rate=0.003 -alpha 0.15 -test_alpha 0.2

# Other defenses
## $DEFENSE = ['ABL', 'NC', 'NAD', 'STRIP', 'FP', 'SentiNet', etc.]
## Except for 'ABL', you need to train poisoned backdoor models first.
python other_defense.py -defense=$DEFENSE -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.003 -cover_rate=0.003 -alpha 0.15 -test_alpha 0.2
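
Because every stage above repeats the same attack arguments, a small driver can keep them in one place. The following is a minimal sketch, not part of the repository: the pipeline function and its structure are hypothetical, and it only prints the commands (a dry run) rather than executing them.

```python
import shlex

# Arguments shared by every stage of the Adaptive-Blend example above.
POISON_ARGS = [
    "-dataset=cifar10",
    "-poison_type=adaptive_blend",
    "-poison_rate=0.003",
    "-cover_rate=0.003",
    "-alpha", "0.15",
]
TEST_ARGS = ["-test_alpha", "0.2"]  # used by every stage after set creation


def pipeline(cleanser: str = "SCAn") -> list:
    """Return the commands for: poison, train, test, cleanse, retrain."""
    cleanser_arg = f"-cleanser={cleanser}"
    return [
        ["python", "create_poisoned_set.py", *POISON_ARGS],
        ["python", "train_on_poisoned_set.py", *POISON_ARGS, *TEST_ARGS],
        ["python", "test_model.py", *POISON_ARGS, *TEST_ARGS],
        ["python", "cleanser.py", cleanser_arg, *POISON_ARGS, *TEST_ARGS],
        ["python", "train_on_cleansed_set.py", cleanser_arg, *POISON_ARGS, *TEST_ARGS],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them. To execute,
    # pass each list to subprocess.run(cmd, check=True) from the repo root.
    for cmd in pipeline():
        print(shlex.join(cmd))
```

Keeping the arguments in a single list helps ensure that training, testing and cleansing all refer to the same poisoned set.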

Some examples of creating other poisoned datasets:

# CIFAR10
python create_poisoned_set.py -dataset cifar10 -poison_type none
python create_poisoned_set.py -dataset cifar10 -poison_type badnet -poison_rate 0.003
python create_poisoned_set.py -dataset cifar10 -poison_type blend -poison_rate 0.003
python create_poisoned_set.py -dataset cifar10 -poison_type trojan -poison_rate 0.003
python create_poisoned_set.py -dataset cifar10 -poison_type clean_label -poison_rate 0.003
python create_poisoned_set.py -dataset cifar10 -poison_type SIG -poison_rate 0.02
python create_poisoned_set.py -dataset cifar10 -poison_type dynamic -poison_rate 0.003
python create_poisoned_set.py -dataset cifar10 -poison_type ISSBA -poison_rate 0.02
python create_poisoned_set.py -dataset cifar10 -poison_type WaNet -poison_rate 0.05 -cover_rate 0.1
python create_poisoned_set.py -dataset cifar10 -poison_type TaCT -poison_rate 0.003 -cover_rate 0.003
python create_poisoned_set.py -dataset cifar10 -poison_type adaptive_blend -poison_rate 0.003 -cover_rate 0.003 -alpha 0.15
python create_poisoned_set.py -dataset cifar10 -poison_type adaptive_patch -poison_rate 0.003 -cover_rate 0.006


# GTSRB
python create_poisoned_set.py -dataset gtsrb -poison_type none
python create_poisoned_set.py -dataset gtsrb -poison_type badnet -poison_rate 0.01
python create_poisoned_set.py -dataset gtsrb -poison_type blend -poison_rate 0.01
python create_poisoned_set.py -dataset gtsrb -poison_type trojan -poison_rate 0.01
python create_poisoned_set.py -dataset gtsrb -poison_type SIG -poison_rate 0.02
python create_poisoned_set.py -dataset gtsrb -poison_type dynamic -poison_rate 0.003
python create_poisoned_set.py -dataset gtsrb -poison_type WaNet -poison_rate 0.05 -cover_rate 0.1
python create_poisoned_set.py -dataset gtsrb -poison_type TaCT -poison_rate 0.005 -cover_rate 0.005
python create_poisoned_set.py -dataset gtsrb -poison_type adaptive_blend -poison_rate 0.003 -cover_rate 0.003 -alpha 0.15
python create_poisoned_set.py -dataset gtsrb -poison_type adaptive_patch -poison_rate 0.005 -cover_rate 0.01

Additional Options and Configurations

You can also:

  • specify more details on the trigger selection

    • For basic, blend and adaptive_blend:

      specify the opacity of the trigger by -alpha=$ALPHA.

    • For basic, blend, clean_label, adaptive_blend and TaCT:

      specify the trigger by -trigger=$TRIGGER_NAME, where $TRIGGER_NAME is the file name of a 32x32 trigger mark image in triggers/ (e.g., -trigger=badnet_patch_32.png).

    • For basic, clean_label and TaCT:

      if another image named mask_$TRIGGER_NAME also exists in triggers/, it will be used as the trigger mask. Otherwise, by default, the black pixels of the trigger mark are not applied.

  • test a trained model via

    python test_model.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.003 -cover_rate=0.006 -alpha=0.15 -test_alpha=0.2
    # other options include: -no_aug, -cleanser=$CLEANSER, -model_path=$MODEL_PATH, see our code for details
  • enforce a fixed running seed via -seed=$SEED option

  • change dataset to GTSRB via -dataset=gtsrb option

  • change model architectures in config.py

  • configure hyperparameters of other defenses in other_defense.py

  • see more configurations in config.py
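
To illustrate what the -alpha option and the trigger mask control, here is a minimal, dependency-free sketch of opacity blending in which black trigger pixels are skipped by default. It assumes the common formulation image + alpha * mask * (trigger - image), which may differ from the repository's actual implementation; apply_trigger, default_mask and the flat-list pixel representation are hypothetical choices made for the example. For a full-image blend, pass a mask of all ones.

```python
def default_mask(trigger):
    """Return 1.0 where the trigger pixel is non-black, 0.0 where it is black."""
    return [0.0 if all(c == 0 for c in px) else 1.0 for px in trigger]


def apply_trigger(image, trigger, alpha=1.0, mask=None):
    """Blend `trigger` into `image` with opacity `alpha` wherever `mask` is 1.

    `image` and `trigger` are equal-length lists of (r, g, b) tuples with
    channel values in [0, 1]. If `mask` is None, black trigger pixels are
    left unapplied (assumed default, mirroring the behavior described above).
    """
    if len(image) != len(trigger):
        raise ValueError("image and trigger must have the same number of pixels")
    if mask is None:
        mask = default_mask(trigger)
    out = []
    for img_px, trg_px, m in zip(image, trigger, mask):
        w = alpha * m  # effective per-pixel opacity
        out.append(tuple(i + w * (t - i) for i, t in zip(img_px, trg_px)))
    return out


if __name__ == "__main__":
    image = [(0.5, 0.5, 0.5), (0.5, 0.5, 0.5)]
    trigger = [(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]  # second pixel is black
    print(apply_trigger(image, trigger, alpha=0.2))
```

In the usage example, the first pixel moves 20% of the way toward the trigger color, while the second is left untouched because the corresponding trigger pixel is black.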

Citation

If you find this toolbox useful for your research, please consider citing our work:

@inproceedings{qi2022revisiting,
  title={Revisiting the assumption of latent separability for backdoor defenses},
  author={Qi, Xiangyu and Xie, Tinghao and Li, Yiming and Mahloujifar, Saeed and Mittal, Prateek},
  booktitle={The eleventh international conference on learning representations},
  year={2022}
}

@inproceedings{qi2023towards,
  title={Towards a proactive $\{$ML$\}$ approach for detecting backdoor poison samples},
  author={Qi, Xiangyu and Xie, Tinghao and Wang, Jiachen T and Wu, Tong and Mahloujifar, Saeed and Mittal, Prateek},
  booktitle={32nd USENIX Security Symposium (USENIX Security 23)},
  pages={1685--1702},
  year={2023}
}

@article{xie2023badexpert,
  title={BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection},
  author={Xie, Tinghao and Qi, Xiangyu and He, Ping and Li, Yiming and Wang, Jiachen T and Mittal, Prateek},
  journal={arXiv preprint arXiv:2308.12439},
  year={2023}
}