Official PyTorch implementation of Towards Efficient Data Free Black-Box Adversarial Attack (CVPR 2022)
Abstract:
Classic black-box adversarial attacks can take advantage of transferable adversarial examples generated by a similar substitute model to successfully fool the target model. However, these substitute models need to be trained by target models' training data, which is hard to acquire due to privacy or transmission reasons. Recognizing the limited availability of real data for adversarial queries, recent works proposed to train substitute models in a data-free black-box scenario. However, their generative adversarial networks (GANs) based framework suffers from the convergence failure and the model collapse, resulting in low efficiency. In this paper, by rethinking the collaborative relationship between the generator and the substitute model, we design a novel black-box attack framework. The proposed method can efficiently imitate the target model through a small number of queries and achieve high attack success rate. The comprehensive experiments over six datasets demonstrate the effectiveness of our method against the state-of-the-art attacks. Especially, we conduct both label-only and probability-only attacks on the Microsoft Azure online model, and achieve a 100% attack success rate with only 0.46% query budget of the SOTA method.
Experiments from the original paper:
- Train the substitute model. To train a substitute model on MNIST:
python3 train_scratch.py --dataset=mnist --epoch=200
- Generate adversarial examples with white-box attacks on the substitute model and transfer them to the attacked model. Once the substitute model is obtained, use the following command to evaluate it:
python3 main.py --epochs=400 --save_dir=run/mnist --dataset=mnist --score=1 --other=cnn_mnsit --g_steps=10
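The transfer step described above can be illustrated with a minimal sketch. It is not the repository's code: it assumes one-step FGSM as the white-box attack, inputs scaled to [0, 1], and the common convention of counting success only on samples the attacked model classifies correctly before the attack. The names `fgsm` and `transfer_success_rate` and the two linear stand-in models are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def fgsm(substitute: nn.Module, x: torch.Tensor, y: torch.Tensor, eps: float) -> torch.Tensor:
    """One-step FGSM computed on the substitute model (white-box access).

    Assumes inputs live in [0, 1]; the result stays in that range.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(substitute(x_adv), y)
    (grad,) = torch.autograd.grad(loss, x_adv)
    # Move each pixel by eps in the direction that increases the substitute's loss.
    return (x_adv + eps * grad.sign()).clamp(0.0, 1.0).detach()


@torch.no_grad()
def transfer_success_rate(target: nn.Module, x: torch.Tensor,
                          x_adv: torch.Tensor, y: torch.Tensor) -> float:
    """Share of samples the target got right on clean input but wrong after the attack."""
    clean_ok = target(x).argmax(dim=1) == y
    if clean_ok.sum() == 0:
        return 0.0
    fooled = target(x_adv).argmax(dim=1) != y
    return (fooled & clean_ok).sum().item() / clean_ok.sum().item()


if __name__ == "__main__":
    torch.manual_seed(0)
    # Hypothetical stand-ins: the real substitute is trained via queries,
    # and the real target is the black-box model under attack.
    substitute = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).eval()
    target = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).eval()
    x = torch.rand(16, 1, 28, 28)
    y = torch.randint(0, 10, (16,))
    x_adv = fgsm(substitute, x, y, eps=0.3)
    print(f"transfer success rate: {transfer_success_rate(target, x, x_adv, y):.2f}")
```

The attack only ever differentiates through the substitute; the target is used solely for forward predictions, which mirrors the black-box constraint.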
You can also attack the Microsoft Azure API model by running attack_api.py; the downloaded remote model can be found in remote_model.
Dataset | Scripts |
---|---|
MNIST | python3 main.py --epochs=400 --save_dir=run/mnist --dataset=mnist --score=1 --other=cnn_mnsit --g_steps=10 |
Fashion-MNIST | python3 main.py --epochs=400 --save_dir=run/fmnist --dataset=fmnist --score=1 --other=cnn_fmnsit --g_steps=10 |
SVHN | python3 main.py --epochs=400 --save_dir=run/svhn --dataset=svhn --score=1 --other=cnn_svhn --g_steps=10 |
CIFAR10 | python3 main.py --epochs=2000 --save_dir=run/cifar10 --dataset=cifar10 --score=1 --other=cnn_cifar10 --g_steps=5 |
CIFAR100 | python3 main.py --epochs=2000 --save_dir=run/cifar100 --dataset=cifar100 --score=1 --other=cnn_cifar100 --g_steps=5 --batch_size=1000 |
Tiny-Imagenet | python3 main.py --epochs=2000 --save_dir=run/tiny --dataset=tiny --score=1 --other=cnn_tiny --g_steps=5 --batch_size=800 |
To run the other attack setting, replace --score=1 with --score=0 in any of the commands above.
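Since the per-dataset commands differ only in a handful of flags, a small helper can assemble them programmatically. The sketch below is a convenience wrapper and not part of the repository: `SETTINGS` and `build_command` are hypothetical names, and the flag values are copied verbatim from the table (including the `--other` spellings).

```python
import shlex
from typing import List

# Per-dataset values taken from the table:
# (epochs, value passed to --other, g_steps, batch_size or None when the flag is omitted).
SETTINGS = {
    "mnist": (400, "cnn_mnsit", 10, None),
    "fmnist": (400, "cnn_fmnsit", 10, None),
    "svhn": (400, "cnn_svhn", 10, None),
    "cifar10": (2000, "cnn_cifar10", 5, None),
    "cifar100": (2000, "cnn_cifar100", 5, 1000),
    "tiny": (2000, "cnn_tiny", 5, 800),
}


def build_command(dataset: str, score: int = 1) -> List[str]:
    """Return the main.py invocation for one dataset as an argument list."""
    epochs, other, g_steps, batch_size = SETTINGS[dataset]
    cmd = [
        "python3", "main.py",
        f"--epochs={epochs}",
        f"--save_dir=run/{dataset}",
        f"--dataset={dataset}",
        f"--score={score}",
        f"--other={other}",
        f"--g_steps={g_steps}",
    ]
    # Only CIFAR100 and Tiny-Imagenet set an explicit batch size in the table.
    if batch_size is not None:
        cmd.append(f"--batch_size={batch_size}")
    return cmd


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for name in SETTINGS:
        print(shlex.join(build_command(name)))
```

Passing `score=0` to `build_command` yields the `--score=0` variant of each command; the returned list can be handed to `subprocess.run` to launch a run.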
[Figure: an example of the training]
Citation:
@inproceedings{zhang2022towards,
title={Towards Efficient Data Free Black-Box Adversarial Attack},
author={Zhang, Jie and Li, Bo and Xu, Jianghe and Wu, Shuang and Ding, Shouhong and Zhang, Lei and Wu, Chao},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={15115--15125},
year={2022}
}