A Paper List for Localized Adversarial Patch Research

What is the localized adversarial patch attack?

Different from classic adversarial examples, which are constrained to have a small L_p-norm distance to the original examples, a localized adversarial patch attacker can arbitrarily modify the pixel values within a small region.

The attack algorithm is similar to those for the classic L_p adversarial example attack: define a loss function and optimize the perturbation to achieve the attack objective. The only differences are that 1) the attacker can only optimize over pixels within a small region, and 2) within that region, the pixel values can be arbitrary, as long as they are valid pixels.
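To make this concrete, below is a minimal PyTorch sketch of such an attack; it is not any specific paper's method, and `model`, `images`, `labels`, and the patch location `(x0, y0)` are assumed inputs (a square patch is used only for illustration):

```python
import torch
import torch.nn.functional as F

def patch_attack(model, images, labels, x0, y0, size=32, steps=200, lr=0.05):
    """Optimize the pixels inside one small square region to induce misclassification."""
    patch = torch.rand(1, 3, size, size, requires_grad=True)  # attacker-controlled pixels
    optimizer = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        adv = images.clone()
        # 1) only pixels within the small region are modified ...
        adv[:, :, y0:y0 + size, x0:x0 + size] = patch.clamp(0, 1)
        # 2) ... but within that region, any valid pixel value in [0, 1] is allowed
        loss = -F.cross_entropy(model(adv), labels)  # untargeted attack objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return patch.detach().clamp(0, 1)
```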

Example of a localized adversarial patch attack (image from Brown et al.):

[patch image example]

What makes this attack interesting?

It can be realized in the physical world!

Since all perturbations are confined to a small region, the patch can be printed and attached to objects in the physical world. This type of attack poses a real-world threat to ML systems!

Note: not all existing physically-realizable attacks fall into the category of patch attacks, but the localized patch attack is one of the simplest and most popular physical attacks.

About this paper list

Focus

  1. Test-time attacks/defenses (localized backdoor triggers are not considered)
  2. 2D computer vision tasks (e.g., image classification, object detection, image segmentation)
  3. Localized attacks (other physical attacks that are more "global" are not considered, e.g., some stop sign attacks that require changing the entire stop sign background)
  4. More on defenses: I try to provide a comprehensive list of defense papers, while the list of attack papers may be incomplete

Terminology

  1. Empirical defense: a defense that is heuristic-based and has little security guarantee against an adaptive attacker
  2. Provably robust defense / certifiably robust defense / certified defense: a defense whose robustness can be proved for certain certified images. The robustness guarantee holds against any adaptive white-box attacker within the threat model (see the sketch below)
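To make the certified-defense guarantee concrete, here is a hypothetical sketch of the evaluation contract; `certify` is an illustrative stand-in for a paper-specific certification procedure, not a real API:

```python
# Hypothetical interface illustrating the certified-defense guarantee.
# `certify` may return True only when it can PROVE that the model's
# prediction on x cannot be changed by ANY patch within the threat model.
def certified_accuracy(model, dataset, patch_size, certify):
    num_certified = 0
    for x, y in dataset:
        # If this returns True, the guarantee holds for every adaptive
        # white-box attacker restricted to one patch_size x patch_size region.
        if certify(model, x, y, patch_size):
            num_certified += 1
    return num_certified / len(dataset)
```

Certified accuracy computed this way is a lower bound on accuracy under attack, independent of any concrete attack algorithm.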

I am still developing this paper list (I haven't added notes for all papers). If you want to contribute to the paper list, add your paper, correct any of my comments, or share any of your suggestions, feel free to reach out :)

Table of Contents

Image Classification

Attacks

arXiv 1712; NeurIPS workshop 2017 (Adversarial Patch)

  1. The first paper that introduces the concept of adversarial patch attacks
  2. Demonstrate a universal physical-world attack

arXiv 1801; ICML 2018

  1. Seems to be concurrent work (?) with "Adversarial Patch"
  2. Digital-domain attack

arXiv 2004; ECCV 2020

  1. a black-box attack via reinforcement learning

Springer 2021

  1. data-independent attack; attacks via increasing the magnitude of feature values

arXiv 2102

  1. use 3D modeling to enhance physical-world patch attack

arXiv 2104

  1. add stickers to faces to fool face recognition systems

arXiv 2106; CVPR 2021

  1. focus on transferability

arXiv 2106; an old version is available at arXiv 2009

  1. generate small (inconspicuous) and localized perturbations

arXiv 2108

  1. consider physical-world patch attack in the 3-D space (images are taken from different angles)

https://arxiv.org/pdf/2106.09222.pdf

(go back to table of contents)

Defenses

CVPR workshop 2018 (DW: Digital Watermarking)

  1. The first empirical defense. Use saliency maps to detect and mask adversarial patches.

arXiv 1807; WACV 2019 (LGS: Local Gradients Smoothing)

  1. An empirical defense. Use pixel gradients to detect the patch and smooth the suspected regions.

arXiv 1909, ICLR 2020

  1. Empirical defense via adversarial training
  2. Interestingly show that adversarial training for patch attacks does not hurt clean accuracy
  3. Only works on small images

ICLR 2020

The first certified defense.

  1. Show that the previous two empirical defenses (DW and LGS) are broken against an adaptive attacker
  2. Adapt IBP (Interval Bound Propagation) for a certified defense
  3. Evaluate robustness against different patch shapes
  4. Very expensive; only works for CIFAR-10

IEEE S&P Workshop on Deep Learning Security 2020

  1. Certified defense; clip BagNet features
  2. Efficient

arXiv 1812; IEEE S&P Workshop on Deep Learning Security 2020

  1. Empirical defense that leverages the universality of the attack (inapplicable to non-universal attacks)

arXiv 2002, NeurIPS 2020

  1. Certified defense; adapt ideas from randomized smoothing for the $L_0$ adversary
  2. Majority voting on predictions made from cropped pixel patches (see the sketch below)
  3. Scale to ImageNet but expensive
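A minimal sketch of the voting idea for one variant (column smoothing), assuming a base classifier `model` that was trained on ablated images keeping only a narrow band of columns; the certification condition in the comment follows the paper's high-level description:

```python
import torch

@torch.no_grad()
def column_smoothed_predict(model, image, num_classes, b=4):
    """Majority vote over classifications of every vertical band of width b."""
    _, _, H, W = image.shape
    votes = torch.zeros(num_classes, dtype=torch.long)
    for x in range(W):
        idx = torch.arange(x, x + b) % W          # wrap-around band of columns
        band = torch.zeros_like(image)
        band[..., idx] = image[..., idx]          # ablate everything outside the band
        votes[model(band).argmax()] += 1
    # A patch of width p intersects at most p + b - 1 bands, so the winning
    # class is certifiably robust if its vote margin over the runner-up
    # exceeds 2 * (p + b - 1).
    return votes.argmax().item()
```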

arXiv 2002

  1. empirical defense

arXiv 2004; ACNS workshop 2020 (Minority Report)

  1. Certified defense for detecting an attack
  2. Apply masks at different locations of the input image and check for inconsistencies in the masked predictions (see the sketch below)
  3. Too expensive to scale to ImageNet (?)
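A minimal sketch of the masking-based detection idea; `model`, the mask size, and the stride are illustrative, and the mask is assumed large enough that some mask position fully covers any possible patch:

```python
import torch

@torch.no_grad()
def masked_prediction_or_alert(model, image, mask_size=40, stride=20):
    """Return the agreed label of all masked views, or None to signal an attack."""
    preds = set()
    _, _, H, W = image.shape
    for y in range(0, H - mask_size + 1, stride):
        for x in range(0, W - mask_size + 1, stride):
            masked = image.clone()
            masked[:, :, y:y + mask_size, x:x + mask_size] = 0.0  # occlude one region
            preds.add(model(masked).argmax().item())
    # On a clean image, every masked view predicts the same (correct) label;
    # disagreement indicates that some mask position interacted with a patch.
    return preds.pop() if len(preds) == 1 else None
```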

arXiv 2005; USENIX Security 2021 (PatchGuard)

  1. Certified defense framework with two general principles: a small receptive field to bound the number of corrupted features, and secure aggregation for the final robust prediction
  2. BagNet for small receptive fields; robust masking for secure aggregation, which detects and masks malicious feature values (see the sketch below)
  3. Efficient; SOTA performance (in terms of both clean accuracy and provable robust accuracy)
  4. Subsumes several existing and follow-up papers
  5. Not parameter-free
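A rough, simplified sketch of the two principles (the actual robust masking in the paper operates per class and comes with a certification procedure); `feature_net` is assumed to output per-location class logits with small receptive fields (e.g., a BagNet), and `w` bounds the feature window a patch can corrupt:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def robust_masking_predict(feature_net, image, w=6):
    # Per-location class evidence, clipped to be non-negative so that any
    # single w x w feature window has bounded influence on the prediction.
    logits = feature_net(image).squeeze(0).clamp(min=0)       # [C, H, W]
    # Find the w x w window with the highest total evidence and zero it out:
    # a patch must concentrate its malicious influence in one such window.
    evidence = logits.sum(dim=0, keepdim=True).unsqueeze(0)   # [1, 1, H, W]
    window_sums = F.avg_pool2d(evidence, w, stride=1) * (w * w)
    flat_idx = window_sums.flatten().argmax().item()
    y, x = divmod(flat_idx, window_sums.shape[-1])
    masked = logits.clone()
    masked[:, y:y + w, x:x + w] = 0.0
    return masked.sum(dim=(1, 2)).argmax().item()             # aggregate the rest
```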

arXiv 2005, ECCV workshop 2020

  1. empirical defense via adversarial training (in which the patch location is optimized)

arXiv 2009; ACCV 2020

Available on OpenReview in 10/2020; ICLR 2021

  1. Certified defense
  2. BagNet to bound the number of corrupted features; Heaviside step function & majority voting for secure aggregation
  3. SOTA performance on CIFAR-10
  4. Efficient; evaluate on different patch shapes

Available on OpenReview in 10/2020

  1. Certified defense
  2. Randomized image cropping + majority voting
  3. Only provides probabilistic certified robustness

arXiv 2012

  1. empirical defense; directly use CompNet to defend against black-box patch attacks (evaluated with PatchAttack)

CISS 2021

  1. An empirical defense against black-box patch attacks
  2. A direct application of CompNet

arXiv 2104; ICLR workshop 2021

  1. Certified defense for detecting an attack
  2. A hybrid of PatchGuard and Minority Report
  3. SOTA provable robust accuracy (for attack detection) and clean accuracy on ImageNet

arXiv 2105

  1. An empirical defense that uses the magnitude and variance of the feature map values to detect an attack
  2. focus more on universal attacks (both localized patches and global perturbations)

arXiv 2108

  1. empirical defense; leverages the universality of the attack

arXiv 2108 (PatchCleanser)

  1. Certified defense that is compatible with any state-of-the-art image classifier
  2. Huge improvements in clean accuracy and certified robust accuracy (its clean accuracy is close to that of SOTA image classifiers)

(go back to table of contents)

Certified Defense Leaderboard

TODO

See Table 2 of the PatchCleanser paper for a comprehensive comparison.

Object Detection (and Semantic Segmentation)

Attacks

arXiv 1806; AAAI workshop 2019

  1. The first (?) patch attack against object detectors

arXiv 1904; CVPR workshop 2019

  1. use a rigid board printed with adversarial perturbations to evade person detection

arXiv 1906

  1. interestingly show that a physical-world patch in the background (far away from the victim objects) can have a malicious effect

arXiv 1909

CCS 2019

arXiv 1910; ECCV 2020

  1. use a non-rigid T-shirt to evade person detection

arXiv 1910; ECCV 2020

  1. wear an ugly T-shirt to evade person detection

arXiv 1912

IEEE IoT-J 2020

arXiv 2008

arXiv 2010; IJCNN 2020

arXiv 2010; CIKM workshop

arXiv 2010

arXiv 2010

arXiv 2103; ICME 2021

arXiv 2105

arXiv 2108

https://arxiv.org/abs/1802.06430

(go back to table of contents)

Defenses

arXiv 1910; CVPR workshop 2020

  1. The first empirical defense, adding a regularization loss to constrain the use of spatial information
  2. Evaluated only on YOLOv2 and small datasets like PASCAL VOC

arXiv 210; ICML 2021 workshop

arXiv 2102

  1. The first certified defense against the patch hiding attack
  2. Adapt robust image classifiers for robust object detection
  3. Provable robustness at a negligible cost in clean performance

arXiv 2103

  1. Empirical defense via adding adversarial patches and a "patch" class during training

arXiv 2106

  1. Two empirical defenses against the patch hiding attack
  2. Feed a small image region to the detector, grow the region with some heuristics, and detect an attack when YOLO detects objects in the smaller region but misses them in the larger expanded region (see the sketch below)
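A minimal sketch of the region-growing consistency check, assuming a hypothetical `detect(region)` wrapper around YOLO that returns the set of detected object labels; the window sizes and growing schedule are illustrative:

```python
def detect_hiding_attack(detect, image, cx, cy, sizes=(96, 160, 224)):
    """Flag an attack if an object found in a small window disappears
    when the window is expanded (assumes windows stay inside the image)."""
    previous = set()
    for s in sizes:                      # grow the region around (cx, cy)
        h = s // 2
        region = image[:, :, cy - h:cy + h, cx - h:cx + h]
        current = detect(region)
        if previous - current:           # previously seen objects went missing
            return True                  # inconsistency -> attack detected
        previous |= current
    return False
```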

(go back to table of contents)