Revisiting Transferable Adversarial Images (arXiv)

Revisiting Transferable Adversarial Images: Systemization, Evaluation, and New Insights. Zhengyu Zhao*, Hanwei Zhang*, Renjue Li*, Ronan Sicre, Laurent Amsaleg, Michael Backes, Qi Li, Qian Wang, Chao Shen.

We identify two main problems in common evaluation practices:
(1) for attack transferability, a lack of systematic, one-to-one attack comparisons and of fair hyperparameter settings;
(2) for attack stealthiness, no evaluation at all.

We address these problems by
(1) introducing a complete attack categorization and conducting systematic and fair intra-category analyses on transferability;
(2) considering diverse imperceptibility metrics and finer-grained stealthiness characteristics from the perspective of attack traceback.

We draw new insights, e.g.,
(1) under a fair attack hyperparameter setting, one early attack method, DI, actually outperforms all the follow-up methods;
(2) popular diffusion-based defenses give a false sense of security, since they are largely bypassed by (black-box) transferable attacks;
(3) even when all attacks are bounded by the same Lp norm, they lead to dramatically different stealthiness performance, which negatively correlates with their transferability performance.
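
To make insight (3) concrete, here is a minimal sketch (not code from this repository) of how two perturbations with the same L∞ norm can score very differently on a simple imperceptibility metric. The budget `eps = 8/255`, the random stand-in image, the use of PSNR as the only metric, and the helper names `linf_norm` and `psnr` are all assumptions made for illustration.

```python
import numpy as np


def linf_norm(delta):
    """Largest absolute per-pixel change (the L-infinity norm)."""
    return float(np.max(np.abs(delta)))


def psnr(clean, adv, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means less visible distortion."""
    mse = float(np.mean((clean - adv) ** 2))
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))


rng = np.random.default_rng(0)
eps = 8 / 255  # assumed perturbation budget, for illustration only

# Stand-in "image" with values in [0.2, 0.8], so clipping to [0, 1] is a no-op.
clean = rng.uniform(0.2, 0.8, size=(3, 32, 32))

# Perturbation A: every pixel moved by +eps or -eps,
# as a single sign-gradient step would produce.
delta_dense = eps * rng.choice([-1.0, 1.0], size=clean.shape)

# Perturbation B: same L-infinity norm, but only about 10% of pixels change.
mask = rng.random(clean.shape) < 0.1
delta_sparse = delta_dense * mask

psnr_dense = psnr(clean, np.clip(clean + delta_dense, 0.0, 1.0))
psnr_sparse = psnr(clean, np.clip(clean + delta_sparse, 0.0, 1.0))

print(f"dense : Linf={linf_norm(delta_dense):.4f}  PSNR={psnr_dense:.2f} dB")
print(f"sparse: Linf={linf_norm(delta_sparse):.4f}  PSNR={psnr_sparse:.2f} dB")
```

Both perturbations sit exactly on the same L∞ bound, yet the sparse one has a clearly higher PSNR. A norm bound alone therefore says little about how visible a perturbation is.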

We provide the first large-scale evaluation of transferable adversarial examples on ImageNet, involving 23 representative attacks against 9 representative defenses.
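
The basic quantity in an evaluation of this kind is the transfer success rate: the fraction of adversarial images, crafted on a surrogate model, that a separate target model misclassifies. The sketch below assumes untargeted attacks and uses made-up labels and predictions. `transfer_success_rate` and the attack names are hypothetical and are not taken from this repository.

```python
def transfer_success_rate(true_labels, target_preds):
    """Fraction of adversarial images that the target model misclassifies.

    true_labels  -- ground-truth class indices of the clean images
    target_preds -- target-model predictions on the adversarial images
    """
    if len(true_labels) != len(target_preds):
        raise ValueError("labels and predictions must have the same length")
    if not true_labels:
        raise ValueError("need at least one image")
    fooled = sum(pred != label for label, pred in zip(true_labels, target_preds))
    return fooled / len(true_labels)


# Made-up example: five images, two hypothetical attacks, one target model.
true_labels = [3, 7, 1, 4, 9]
preds_by_attack = {
    "attack_a": [3, 2, 1, 0, 9],  # fools the target on 2 of 5 images
    "attack_b": [5, 2, 1, 0, 8],  # fools the target on 4 of 5 images
}

rates = {
    name: transfer_success_rate(true_labels, preds)
    for name, preds in preds_by_attack.items()
}
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

For a one-to-one comparison in the spirit of the points above, each attack would be run on the same images and surrogate model with matched hyperparameters before the rates are compared.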

We reveal that existing problematic evaluations have indeed caused misleading conclusions and missing points, and as a result, hindered the assessment of the actual progress in this field.

Evaluated Attacks and Defenses

Attack Categorization (Welcome more papers!)

Gradient Stabilization Attacks [Code for 3 representative attacks]

Input Augmentation Attacks [Code for 5 representative attacks]

Feature Disruption Attacks [Code for 5 representative attacks]

Surrogate Refinement Attacks [Code for 5 representative attacks]

Generative Modeling Attacks

Surveys/Evaluations/Explanations