Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks (CVPR'23)

This is a PyTorch implementation of Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks.

Abstract: *In this work, we study the black-box targeted attack problem from the model discrepancy perspective. On the theoretical side, we present a generalization error bound for black-box targeted attacks, which gives a rigorous theoretical analysis for guaranteeing the success of the attack. We reveal that the attack error on a target model mainly depends on empirical attack error on the substitute model and the maximum model discrepancy among substitute models. On the algorithmic side, we derive a new algorithm for black-box targeted attacks based on our theoretical analysis, in which we additionally minimize the maximum model discrepancy (M3D) of the substitute models when training the generator to generate adversarial examples. In this way, our model is capable of crafting highly transferable adversarial examples that are robust to the model variation, thus improving the success rate for attacking the black-box model. We conduct extensive experiments on the ImageNet dataset with different classification models, and our proposed approach outperforms existing state-of-the-art methods by a significant margin.*

Citation

If you find our work, this repository, or the pretrained adversarial generators useful, please consider giving the repository a star ⭐ and citing our work.

@inproceedings{zhao2023minimizing,
  title={Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks},
  author={Zhao, Anqi and Chu, Tong and Liu, Yahao and Li, Wen and Li, Jingjing and Duan, Lixin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8153--8162},
  year={2023}
}

Contributions
Acknowledge
Pretrained Targeted Generator
Installation
Training
The Implementation Details of Discriminators
Evaluation

Contributions

We present a generalization error bound for black-box targeted attacks based on the model discrepancy perspective.
We design a novel generative approach called Minimizing Maximum Model Discrepancy (M3D) attack to craft adversarial examples with high transferability based on the generalization error bound.
We demonstrate the effectiveness of our method with strong empirical results: our approach outperforms state-of-the-art methods by a significant margin.

Acknowledge

Code adapted from TTP. We thank the authors for their wonderful code base.

Pretrained Targeted Generator

If you find our pretrained adversarial generators useful, please consider citing our work.

Class to Label Mapping

Class Number: Class Name
24: Great Grey Owl
99: Goose
245: French Bulldog
344: Hippopotamus
471: Cannon
555: Fire Engine
661: Model T
701: Parachute
802: Snowmobile
919: Street Sign

The pretrained generators are saved as "netG_Discriminator_epoch_targetLabel.pth". For example, netG_vgg19_bn_9_919.pth is a generator trained with M3D against vgg19_bn (the discriminator) for 10 epochs (the filename carries epoch index 9), with target label 919 (Street Sign).
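The naming scheme above can be decoded mechanically. The following is a minimal sketch, not part of the repository: `parse_checkpoint_name` and `GeneratorCheckpoint` are hypothetical names, and the sketch assumes the epoch field is a zero-based index, which is how the example filename (epoch 9, trained for 10 epochs) reads.

```python
import re
from typing import NamedTuple

# Target classes from the "Class to Label Mapping" list.
TARGET_CLASSES = {
    24: "Great Grey Owl",
    99: "Goose",
    245: "French Bulldog",
    344: "Hippopotamus",
    471: "Cannon",
    555: "Fire Engine",
    661: "Model T",
    701: "Parachute",
    802: "Snowmobile",
    919: "Street Sign",
}

# netG_<discriminator>_<epoch>_<targetLabel>.pth
# The discriminator name may itself contain underscores (e.g. vgg19_bn),
# so it is matched greedily and the two trailing integers are peeled off.
_CHECKPOINT_RE = re.compile(
    r"^netG_(?P<model>.+)_(?P<epoch>\d+)_(?P<label>\d+)\.pth$"
)


class GeneratorCheckpoint(NamedTuple):
    model: str
    epoch_index: int
    epochs_trained: int
    target_label: int
    target_name: str


def parse_checkpoint_name(filename: str) -> GeneratorCheckpoint:
    """Split a generator checkpoint filename into its documented fields."""
    match = _CHECKPOINT_RE.match(filename)
    if match is None:
        raise ValueError(f"not a generator checkpoint name: {filename!r}")
    epoch_index = int(match.group("epoch"))
    label = int(match.group("label"))
    return GeneratorCheckpoint(
        model=match.group("model"),
        epoch_index=epoch_index,
        # Assumption: the epoch in the filename is zero-based.
        epochs_trained=epoch_index + 1,
        target_label=label,
        target_name=TARGET_CLASSES.get(label, "unknown"),
    )
```

For instance, `parse_checkpoint_name("netG_vgg19_bn_9_919.pth")` yields the model `vgg19_bn` and the target name `Street Sign`.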

Source Model	24	99	245	344	471	555	661	701	802	919
VGG19_BN	Grey Owl	Goose	French Bulldog	Hippopotamus	Cannon	Fire Engine	Model T	Parachute	Snowmobile	Street Sign
ResNet50	Grey Owl	Goose	French Bulldog	Hippopotamus	Cannon	Fire Engine	Model T	Parachute	Snowmobile	Street Sign
Dense121	Grey Owl	Goose	French Bulldog	Hippopotamus	Cannon	Fire Engine	Model T	Parachute	Snowmobile	Street Sign

Installation

conda create --name M3D -y python=3.7.0
conda activate M3D
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch

#apex
git clone https://github.com/NVIDIA/apex
cd apex
# if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key... 
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
# otherwise
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Training

Run the following command to train a generator:

CUDA_VISIBLE_DEVICES=0,1,2 nohup python3 -m torch.distributed.launch --nproc_per_node=3 --master_port 23411 train_M3D.py --gs --match_target 802 --batch_size 16 --epochs 10 --model_type vgg19_bn --log_dir ./checkpoint/vgg19_bn_M3D_3gpu/ --save_dir ./checkpoint/vgg19_bn_M3D_3gpu --apex_train 1 > ./checkpoint/vgg19_bn_M3D_3gpu/output_802.txt
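The command above trains a generator for a single target (802). To train one generator per target class, the same command can be regenerated for each label. The sketch below only builds the argument lists and does not launch anything. `build_train_command` and `format_command` are hypothetical helpers; every flag is copied from the command above, and the assumption that the run directory follows the `<model_type>_M3D_<n>gpu` pattern is taken from that single example.

```python
import shlex

# The ten target labels from the "Class to Label Mapping" list.
TARGET_LABELS = [24, 99, 245, 344, 471, 555, 661, 701, 802, 919]


def build_train_command(target_label, model_type="vgg19_bn",
                        num_gpus=3, master_port=23411):
    """Return the training command as an argument list for one target label."""
    # Assumption: the directory name mirrors the single example in this README.
    run_dir = f"./checkpoint/{model_type}_M3D_{num_gpus}gpu"
    return [
        "python3", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "--master_port", str(master_port),
        "train_M3D.py", "--gs",
        "--match_target", str(target_label),
        "--batch_size", "16",
        "--epochs", "10",
        "--model_type", model_type,
        "--log_dir", run_dir + "/",
        "--save_dir", run_dir,
        "--apex_train", "1",
    ]


def format_command(args):
    """Render an argument list as a shell-safe string (works on Python 3.7)."""
    return " ".join(shlex.quote(arg) for arg in args)
```

Printing `format_command(build_train_command(label))` for each entry of `TARGET_LABELS` gives ten command lines. `CUDA_VISIBLE_DEVICES`, `nohup`, and the output redirection are left to the caller. The directory that receives the redirected output file must exist before the shell runs the command.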

The Implementation Details of Discriminators

Note that because D1 and D2 receive the same training data, the two models must be initialized slightly differently for the model discrepancy loss to work. We simply use a pre-trained model for one discriminator and, for the other, a model fine-tuned for one batch on ImageNet training data. For convenience, we saved the fine-tuned models in 'pretrain_save_models'. You can also fine-tune the discriminator during training.
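A framework-free toy example may help show why identical initialization is a problem. It is only an illustration: the "discriminators" are single linear layers, the small random nudge stands in for "fine-tuned for one batch", and the discrepancy measure (mean absolute difference between softmax outputs) is an assumption that may differ from the one used in train_M3D.py.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear_logits(weights, x):
    """Toy stand-in for a discriminator: one linear layer, no bias."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def discrepancy(weights_a, weights_b, x):
    """Mean absolute difference between the two softmax outputs.

    Assumed measure for illustration only; the repository may use another.
    """
    p = softmax(linear_logits(weights_a, x))
    q = softmax(linear_logits(weights_b, x))
    return sum(abs(pi - qi) for pi, qi in zip(p, q)) / len(p)


rng = random.Random(0)
pretrained = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
x = [rng.uniform(-1, 1) for _ in range(4)]

# Case 1: D2 is an exact copy of D1. Same weights and same input give the
# same output, so the discrepancy is exactly zero.
same = discrepancy(pretrained, pretrained, x)

# Case 2: D2 is nudged slightly away from D1 (stand-in for one batch of
# fine-tuning). The outputs now differ, so the discrepancy is non-zero.
finetuned = [[w + rng.uniform(-0.05, 0.05) for w in row] for row in pretrained]
different = discrepancy(pretrained, finetuned, x)

print(f"identical init: {same}")
print(f"perturbed init: {different}")
```

With identical weights the discrepancy term is zero for every input, so it gives the training objective nothing to work with. A small difference between the two models is enough to make it non-zero.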

vgg19_bn|resnet50|densenet121

Evaluation

Run the following command to evaluate the transferability of the 10 targets to a black-box model on the ImageNet validation set (ImageNet-Val).

CUDA_VISIBLE_DEVICES=0 python eval_M3D.py --source_model resnet50 --target_model densenet121
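For orientation, the sketch below shows one common way to score a targeted attack: the fraction of adversarial images that the black-box model classifies as the target label, averaged over targets. This is an assumption about the metric, not a description of what eval_M3D.py computes, and both function names are hypothetical.

```python
def targeted_success_rate(predicted_labels, target_label):
    """Fraction of adversarial examples classified as the target label."""
    if not predicted_labels:
        raise ValueError("predicted_labels must not be empty")
    hits = sum(1 for label in predicted_labels if label == target_label)
    return hits / len(predicted_labels)


def mean_over_targets(rates_by_target):
    """Average the per-target success rates (e.g. over the 10 targets)."""
    if not rates_by_target:
        raise ValueError("rates_by_target must not be empty")
    return sum(rates_by_target.values()) / len(rates_by_target)
```

Here `predicted_labels` would be the black-box model's top-1 predictions on images perturbed by the generator trained for `target_label`.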

Contact

zhaoanqiii@gmail.com

Suggestions and questions are welcome!
