A Paper List for Localized Adversarial Patch Research

What is the localized adversarial patch attack?

Unlike classic adversarial examples, which are constrained to have a small L_p norm distance to the normal examples, a localized adversarial patch attacker can arbitrarily modify the pixel values within a small region.

The attack algorithm is similar to those for the classic L_p adversarial example attack. You define a loss function and then optimize your perturbation to attain the attack objective. The only differences are that 1) you can only optimize over pixels within a small region, but 2) within that region, the pixel values can be arbitrary as long as they are valid pixels.
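To make the "optimize only inside a small region" idea concrete, here is a minimal sketch in plain Python. Everything in it is a toy assumption of mine, not code from any of the listed papers: the "model" is a linear score over a flattened 4x4 image, the names `patch_attack`, `score`, and `mask` are hypothetical, and a real attack would use a neural network loss and an autodiff framework.

```python
def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def score(image, weights):
    """Toy attack objective (assumption): a linear score over flattened pixels."""
    return sum(w * p for w, p in zip(weights, image))


def patch_attack(image, weights, mask, steps=20, step_size=0.1):
    """Signed gradient ascent on the toy score, restricted to masked pixels."""
    adv = list(image)
    for _ in range(steps):
        for i, inside_patch in enumerate(mask):
            if not inside_patch:
                continue  # pixels outside the patch region are never touched
            # For the linear score, the gradient w.r.t. pixel i is weights[i].
            adv[i] += step_size * sign(weights[i])
            # Inside the patch, any value is allowed as long as it is a valid pixel.
            adv[i] = min(1.0, max(0.0, adv[i]))
    return adv


# Flattened 4x4 gray image with a 2x2 patch region in the center.
image = [0.5] * 16
weights = [0.1 * (i + 1) * (-1) ** i for i in range(16)]
patch_indices = {5, 6, 9, 10}
mask = [i in patch_indices for i in range(16)]

adv = patch_attack(image, weights, mask)
```

Note that there is no L_p budget anywhere in the loop: the only constraints are the mask and the valid pixel range, which is exactly what separates the patch setting from the classic one.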

Example of localized adversarial patch attack (image from Brown et al.):

What makes this attack interesting?

It can be realized in the physical world!

Since all perturbations are within a small region, we can print the patch and attach it to objects in the physical world. This type of attack poses a real-world threat to ML systems!

Note: not all existing physically-realizable attacks are in the category of patch attacks, but the localized patch attack is (one of) the simplest and the most popular physical attacks.

About this paper list

Focus

Test-time attacks/defenses (localized backdoor triggers are not considered)
2D computer vision tasks (e.g., image classification, object detection, image segmentation)
Localized attacks (other physical attacks that are more "global" are not considered, e.g., some stop sign attacks that require changing the entire stop sign background)
More on defenses: I try to provide a comprehensive list of defense papers, while the list of attack papers might be incomplete

Terminology

Empirical defense: defenses that are heuristic-based and have little security guarantee against an adaptive attacker
Provably robust defenses / certifiably robust defenses / certified defenses: we can prove the robustness for certain certified images. The robustness guarantee holds for any adaptive white-box attacker within the threat model

I am still developing this paper list (I haven't added notes for all papers). If you want to contribute to the paper list, add your paper, correct any of my comments, or share any of your suggestions, feel free to reach out :)

Table of Contents

Image Classification
- Attacks
- Defenses
Object Detection (and Semantic Segmentation)
- Attacks
- Defenses

Image Classification

Attacks

Adversarial Patch

arXiv 1712; NeurIPS workshop 2017

The first paper that introduces the concept of adversarial patch attacks
Demonstrate a universal physical world attack

LaVAN: Localized and Visible Adversarial Noise

arXiv 1801; ICML 2018

Seems to be concurrent work (?) with "Adversarial Patch"
Digital domain attack

PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning

arXiv 2004; ECCV 2020

a black-box attack via reinforcement learning

A Data Independent Approach to Generate Adversarial Patches

Springer 2021

data independent attack; attack via increasing the magnitude of feature values

Enhancing Real-World Adversarial Patches with 3D Modeling Techniques

arXiv 2102

use 3D modeling to enhance physical-world patch attack

Meaningful Adversarial Stickers for Face Recognition in Physical World

arXiv 2104

add stickers to face to fool face recognition system

Improving Transferability of Adversarial Patches on Face Recognition with Generative Models

arXiv 2106; CVPR 2021

focus on transferability

Inconspicuous Adversarial Patches for Fooling Image Recognition Systems on Mobile Devices

arXiv 2106; an old version is available at arXiv 2009

generate small (inconspicuous) and localized perturbations

Patch Attack Invariance: How Sensitive are Patch Attacks to 3D Pose?

arXiv 2108

consider physical-world patch attack in the 3-D space (images are taken from different angles)

https://arxiv.org/pdf/2106.09222.pdf

(go back to table of contents)

Defenses

On Visible Adversarial Perturbations & Digital Watermarking

CVPR workshop 2018

The first empirical defense. Use saliency map to detect and mask adversarial patches.

Local Gradients Smoothing: Defense against Localized Adversarial Attacks

arXiv 1807; WACV 2019

An empirical defense. Uses pixel gradients to detect the patch and smooths the suspected regions.

Defending Against Physically Realizable Attacks on Image Classification

arXiv 1909, ICLR 2020

Empirical defense via adversarial training
Interestingly, shows that adversarial training against patch attacks does not hurt the model's clean accuracy
Only works on small images

Certified Defenses for Adversarial Patches

ICLR 2020

The first certified defense.

Shows that the two previous empirical defenses (DW and LGS) are broken against an adaptive attacker
Adapt IBP (Interval Bound Propagation) for certified defense
Evaluate robustness against different shapes
Very expensive; only works for CIFAR-10

Clipped BagNet: Defending Against Sticker Attacks with Clipped Bag-of-features

IEEE S&P Workshop on Deep Learning Security 2020

Certified defense; clip BagNet features
Efficient

SentiNet: Detecting Localized Universal Attacks Against Deep Learning Systems

arXiv 1812; IEEE S&P Workshop on Deep Learning Security 2020

Empirical defense that leverages the universality of the attack (inapplicable to non-universal attacks)

(De)Randomized Smoothing for Certifiable Defense against Patch Attacks

arXiv 2002, NeurIPS 2020

Certified defense; adapt ideas of randomized smoothing for $L_0$ adversary
Majority voting on predictions made from cropped pixel patches
Scales to ImageNet but is expensive
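The voting idea can be sketched as follows. This is my own simplified illustration, not the paper's implementation: `vote_and_certify` and `max_affected` are hypothetical names, the per-crop predictions are assumed to come from some base classifier, and `max_affected` (how many crops a single patch can overlap) is assumed to be known from the crop and patch geometry. The check is deliberately conservative: each affected crop can shrink the vote gap by at most 2 (one vote lost by the winner, one gained by a rival), so a gap larger than `2 * max_affected` cannot be overturned.

```python
from collections import Counter


def vote_and_certify(crop_predictions, max_affected):
    """Majority vote over per-crop labels plus a conservative robustness check.

    crop_predictions: labels predicted from each cropped view of the image.
    max_affected: assumed upper bound on the number of crops a patch can touch.
    """
    ranked = Counter(crop_predictions).most_common()
    top_label, top_votes = ranked[0]
    runner_up_votes = ranked[1][1] if len(ranked) > 1 else 0
    # The attacker controls at most max_affected votes; flipping one vote
    # changes the gap between winner and runner-up by at most 2.
    certified = (top_votes - runner_up_votes) > 2 * max_affected
    return top_label, certified


# 24 crops: 20 vote "cat", 4 vote "dog" (made-up numbers for illustration).
predictions = ["cat"] * 20 + ["dog"] * 4
small_patch_result = vote_and_certify(predictions, max_affected=5)
large_patch_result = vote_and_certify(predictions, max_affected=8)
```

With these made-up votes, the prediction is certified when a patch can touch at most 5 crops (gap 16 > 10) but not when it can touch 8 (gap 16 is not > 16).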

Detecting Patch Adversarial Attacks with Image Residuals

arXiv 2002

empirical defense

Minority Reports Defense: Defending Against Adversarial Patches

arXiv 2004; ACNS workshop 2020

Certified defense for detecting an attack
Apply masks to the different locations of the input image and check inconsistency in masked predictions
Too expensive to scale to ImageNet (?)
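A rough sketch of the mask-and-compare idea, under heavy simplification: the real defense uses a more careful voting rule over the grid of masked predictions, while this toy version just raises an alert whenever the masked predictions disagree. The classifier `toy_classifier`, the helper names, and the 6x6 image are all assumptions made for illustration.

```python
def masked_predictions(image, classify, mask_size, fill=0.0):
    """Classify the image once per mask position (stride 1, square mask)."""
    height, width = len(image), len(image[0])
    labels = []
    for top in range(height - mask_size + 1):
        for left in range(width - mask_size + 1):
            masked = [row[:] for row in image]  # copy so the input is untouched
            for r in range(top, top + mask_size):
                for c in range(left, left + mask_size):
                    masked[r][c] = fill
            labels.append(classify(masked))
    return labels


def detect_attack(image, classify, mask_size):
    """Flag an attack when masked predictions are inconsistent (simplified rule)."""
    return len(set(masked_predictions(image, classify, mask_size))) > 1


def toy_classifier(image):
    """Hypothetical model that a single very bright region can hijack."""
    return "alarm" if max(max(row) for row in image) >= 0.9 else "normal"


clean = [[0.2] * 6 for _ in range(6)]
patched = [row[:] for row in clean]
for r in (2, 3):
    for c in (2, 3):
        patched[r][c] = 1.0  # 2x2 adversarial patch

clean_flagged = detect_attack(clean, toy_classifier, mask_size=3)
patched_flagged = detect_attack(patched, toy_classifier, mask_size=3)
```

The key requirement is that the mask is large enough to fully cover the patch at some position; that one masked prediction then disagrees with the rest. The cost is one forward pass per mask position, which is why scaling to large images is a concern.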

PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking

arXiv 2005; USENIX Security 2021

Certified defense framework with two general principles: small receptive field to bound the number of corrupted features and secure aggregation for final robust prediction
BagNet for small receptive fields; robust masking for secure aggregation, which detects and masks malicious feature values
Efficient; SOTA performance (in terms of both clean accuracy and provable robust accuracy)
Subsumes several existing and follow-up papers
Not parameter-free
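The two principles can be illustrated with a loose one-dimensional sketch. This is not the paper's algorithm: I drop its detection threshold and always mask the highest-evidence window, and the names `robust_masked_scores`, `predict`, and the evidence values are hypothetical. The assumption is that, thanks to small receptive fields, a patch can only corrupt the local evidence inside one window of `window` consecutive feature locations.

```python
def robust_masked_scores(local_evidence, window):
    """Sum per-class local evidence after removing the strongest window.

    local_evidence: {label: [non-negative evidence per feature location]}
    window: assumed number of consecutive locations a patch can corrupt.
    """
    scores = {}
    for label, evidence in local_evidence.items():
        # Find the window with the largest total evidence for this class...
        strongest = max(
            sum(evidence[start:start + window])
            for start in range(len(evidence) - window + 1)
        )
        # ...and mask it out before aggregating.
        scores[label] = sum(evidence) - strongest
    return scores


def predict(local_evidence, window):
    """Return the class with the highest masked score."""
    scores = robust_masked_scores(local_evidence, window)
    return max(scores, key=scores.get)


# Made-up evidence: the patch injects huge "speed limit" evidence at two locations.
evidence = {
    "stop sign": [1.0] * 8,
    "speed limit": [0.1, 0.1, 9.0, 9.0, 0.1, 0.1, 0.1, 0.1],
}
plain_sums = {label: sum(values) for label, values in evidence.items()}
plain_prediction = max(plain_sums, key=plain_sums.get)
robust_prediction = predict(evidence, window=2)
```

In this toy case, plain summation is hijacked by the two huge values, while masking the strongest window restores the prediction supported by the remaining locations. The price is that benign evidence gets masked too, which is one way to see why such defenses trade some clean accuracy for robustness.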

Adversarial Training against Location-Optimized Adversarial Patches

arXiv 2005, ECCV workshop 2020

empirical defense via adversarial training (in which the patch location is being optimized)

Vax-a-Net: Training-time Defence Against Adversarial Patch Attacks

arXiv 2009; ACCV 2020

Efficient Certified Defenses Against Patch Attacks on Image Classifiers

Available on ICLR open review in 10/2020; ICLR 2021

Certified defense
BagNet to bound the number of corrupted features; Heaviside step function & majority voting for secure aggregation
SOTA performance on CIFAR-10
Efficient, evaluate on different patch shapes

Certified Robustness against Physically-realizable Patch Attack via Randomized Cropping

Available on ICLR open review in 10/2020

Certified defense
Randomized image cropping + majority voting
only probabilistic certified robustness

Robustness Out of the Box: Compositional Representations Naturally Defend Against Black-Box Patch Attacks

arXiv 2012

empirical defense; directly use CompNet to defend against black-box patch attack (evaluated with PatchAttack)

Compositional Generative Networks and Robustness to Perceptible Image Changes

CISS 2021

An empirical defense against black-box patch attacks
A direct application of CompNet

PatchGuard++: Efficient Provable Attack Detection against Adversarial Patches

arXiv 2104; ICLR workshop 2021

Certified defense for detecting an attack
A hybrid of PatchGuard and Minority Reports
SOTA provable robust accuracy (for attack detection) and clean accuracy on ImageNet

Real-time Detection of Practical Universal Adversarial Perturbations

arXiv 2105

An empirical defense that uses the magnitude and variance of the feature map values to detect an attack
Focuses more on universal attacks (both localized patches and global perturbations)

Turning Your Strength against You: Detecting and Mitigating Robust and Universal Adversarial Patch Attack

arXiv 2108

empirical defense; use universality

PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier

arXiv 2108

Certified defense that is compatible with any state-of-the-art image classifier
huge improvements in clean accuracy and certified robust accuracy (its clean accuracy is close to SOTA image classifier)

(go back to table of contents)

Certified Defense Leaderboard

TODO

see Table 2 of the PatchCleanser paper for a comprehensive comparison

Object Detection (and Semantic Segmentation)

Attacks

DPATCH: An Adversarial Patch Attack on Object Detectors

arXiv 1806; AAAI workshop 2019

The first (?) patch attack against object detector

Fooling automated surveillance cameras: adversarial patches to attack person detection

arXiv 1904; CVPR workshop 2019

using a rigid board printed with adversarial perturbations to evade detection of person

On Physical Adversarial Patches for Object Detection

arXiv 1906

interestingly shows that a physical-world patch in the background (far away from the victim objects) can have a malicious effect

use a non-rigid T-shirt to evade person detection

Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors

arXiv 1910; ECCV 2020

wear an ugly T-shirt to evade person detection

(go back to table of contents)

Defenses

Role of Spatial Context in Adversarial Robustness for Object Detection

arXiv 1910; CVPR workshop 2020

The first empirical defense, adding a regularization loss to constrain the use of spatial information
Experiments only on YOLOv2 and small datasets like PASCAL VOC

Meta Adversarial Training against Universal Patches

arXiv 210; ICML 2021 workshop

DetectorGuard: Provably Securing Object Detectors against Localized Patch Hiding Attacks

arXiv 2102

The first certified defense for patch hiding attack
Adapt robust image classifiers for robust object detection
Provable robustness at a negligible cost of clean performance

Adversarial YOLO: Defense Human Detection Patch Attacks via Detecting Adversarial Patches

arXiv 2103

Empirical defense via adding adversarial patches and a "patch" class during the training

We Can Always Catch You: Detecting Adversarial Patched Objects WITH or WITHOUT Signature

arXiv 2106

Two empirical defenses for patch hiding attack
Feed a small image region to the detector; grow the region with some heuristics; detect an attack when YOLO detects objects in the smaller region but misses objects in the larger expanded region.

(go back to table of contents)
