This repository provides a leaderboard for certifiable robustness against adversarial patch attacks.
Check out our paper list and tutorials for adversarial patch attacks and defenses!
For now, this leaderboard focuses on the following setting:
- Image classification task
  - For object detection, see the ObjectSeeker paper.
  - For semantic segmentation, see this paper.
- ImageNet dataset
- One 2%-pixel square patch anywhere on the image (see the quick pixel-size calculation after this list)
- Prediction recovery vs. attack detection (read this explanation for the difference between these two notions)
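For reference, here is a quick back-of-the-envelope calculation of what a 2%-pixel patch means in pixels, assuming the standard 224x224 input resolution used by most models below:

```python
import math

# Rough pixel size of a 2%-pixel square patch on a 224x224 ImageNet image
# (224x224 is an assumption; it is the most common input resolution here).
image_side = 224
patch_ratio = 0.02

patch_area = patch_ratio * image_side ** 2   # ~1003.5 pixels
patch_side = math.sqrt(patch_area)           # ~31.7 pixels per side
print(f"2% patch is roughly {patch_side:.1f} x {patch_side:.1f} pixels "
      f"(often evaluated as a 32 x 32 patch)")
```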
Stay tuned for future updates on different datasets and tasks :)
We include the following properties of different defenses in this leaderboard.
- Defense name and publication venue.
- Certified robust accuracy: certified robust accuracy for a 2%-pixel square patch anywhere on the image. All leaderboards are sorted by certified robust accuracy.
- Clean accuracy: clean accuracy of the defended models
- Vanilla clean accuracy: clean accuracy of vanilla undefended models (if the defense is built upon off-the-shelf models)
- Backbone: the backbone used for each defense
- Comment: additional comments for each defense, including SOTA results, defense parameters, and special training recipes.
- Code: links to source code (most defenses have implementations released by the authors or others).
Note: We discuss more properties of different defenses in another copy of the leaderboards hosted on Google Sheets.
If you have new results and want to contribute to the leaderboard, please submit a pull request or email me (Chong Xiang; cxiang@princeton.edu).
A prediction recovery defense requires the defended model to always predict (recover) the correct label, without any abstention.
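To make the metric concrete, here is a minimal sketch of how certified robust accuracy could be tallied for a prediction recovery defense, assuming a hypothetical `defense.certify(image)` interface (each defense implements its own certification procedure):

```python
def certified_robust_accuracy(defense, labeled_images):
    """Tally certified robust accuracy for a prediction recovery defense (sketch).

    `labeled_images` is a list of (image, label) pairs.
    `defense.certify(image)` is assumed to return (prediction, is_certified),
    where `is_certified` means the prediction is provably unchanged for ANY
    2%-pixel square patch at ANY image location.
    """
    certified_correct = 0
    for image, label in labeled_images:
        prediction, is_certified = defense.certify(image)
        # A prediction recovery defense always outputs a label (no abstention);
        # the image counts only if that label is both certified and correct.
        if is_certified and prediction == label:
            certified_correct += 1
    return certified_correct / len(labeled_images)
```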
defense | certified robust accuracy | clean accuracy | vanilla clean accuracy | backbone | comment | code
---|---|---|---|---|---|---
MultiSizeGreedyCutout on PatchCleanser (TMLR 10/2023) | 64.9 | 82.4 | 83.0 | ViT-B | SOTA at known 2% patch size; k=6; More improvements for 3%-pixel patch over cutout can be found in the paper | NA |
MultiSizeGreedyCutout on PatchCleanser (TMLR 10/2023) | 63.1 | 82.0 | 82.8 | ViT-B | k=6 | NA |
PatchCleanser (USENIX Security 2022) | 62.1 | 83.9 | 84.8 | ViT-B | k=6; trained with cutout | code |
PatchCleanser (USENIX Security 2022) | 61.6 | 83.8 | 84.8 | ViT-B | k=6 | code |
PatchCleanser (USENIX Security 2022) | 59.8 | 83.7 | 84.8 | ViT-B | k=5; trained with masked images | code |
PatchCleanser (USENIX Security 2022) | 59.4 | 83.6 | 83.7 | ViT-B | k=4; trained with masked images | code |
PatchCleanser (USENIX Security 2022) | 56.1 | 83.3 | 84.8 | ViT-B | k=3; trained with masked images | code |
PatchCleanser (USENIX Security 2022) | 53.8 | 79.4 | 80.2 | ResMLP-S | k=6; trained with masked images | code |
PatchCleanser (USENIX Security 2022) | 53.1 | 79.6 | 80.8 | ResMLP-S | k=6 | code |
PatchCleanser (USENIX Security 2022) | 53 | 81.6 | 82.3 | ResNet-50 | k=6; trained with masked images | code |
ViP (ECCV 2022) | 45.4 | 74.1 | NA | ViT-L (MAE) | SOTA with unknown patch size (using a larger model MAE-Large); trained with MAE; b=19; stride = 5 | code
ViP (ECCV 2022) | 44.6 | 73.6 | NA | ViT-L (MAE) | b=19; stride = 10 | code
ECViT (CVPR 2022) | 41.7 | 78.58 | NA | ViT-B | SOTA with unknown patch size; b=37; progressively trained with pixel bands | NA |
PatchCleanser (USENIX Security 2022) | 41.6 | 81.1 | 82.8 | ResNet-50 | k=6 | code |
ViP (ECCV 2022) | 41.4 | 70.8 | 83.7 | ViT-B (MAE) | b=19; stride = 1 | code |
ECViT (CVPR 2022) | 40.79 | 75.3 | NA | ViT-B | b=25; progressively trained with pixel bands | NA |
ECViT (CVPR 2022) | 40.72 | 73.49 | NA | ViT-B | b=19; progressively trained with pixel bands | NA |
ViP (ECCV 2022) | 40.6 | 70.4 | 83.7 | ViT-B (MAE) | b=19; stride = 5 | code |
ViP (ECCV 2022) | 49.8 | 69.9 | 83.7 | ViT-B (MAE) | b=19; stride = 10 | code |
Smoothed ViT (CVPR 2022) | 38.3 | 69.3 | NA | ViT-B | SOTA with unknown patch size (with code); b=19 | code |
Smoothed ViT (CVPR 2022) | 38.2 | 73.2 | NA | ViT-B | b=37; trained with pixel bands | code |
Smoothed ViT (CVPR 2022) | 36.9 | 68.3 | NA | ViT-B | b=19; s=10; trained with pixel bands | code |
Smoothed ViT (CVPR 2022) | 31.6 | 63.5 | NA | ViT-S | b=19; trained with pixel bands | code |
ECViT (CVPR 2022) | 30.06 | 67.14 | NA | ViT-S | b=25; progressively trained with pixel bands | NA |
ECViT (CVPR 2022) | 29.74 | 69.88 | NA | ViT-S | b=37; progressively trained with pixel bands | NA |
ECViT (CVPR 2022) | 28.85 | 64.69 | NA | ViT-S | b=19; progressively trained with pixel bands | NA |
De-randomized Smoothing (NeurIPS 2020) (numbers from the Smoothed ViT paper) | 28.1 | 61.4 | NA | WRN-101-2 | b=19; trained with pixel bands | code
PatchGuard (USENIX Security 2021) | 26 | 54.6 | 56.5 | BagNet-17 | trained with masked features | code |
PatchGuard (USENIX Security 2021) | 24.1 | 60.4 | 63 | BagNet-33 | trained with masked features | code |
BagCert (ICLR 2021) | 22.7 | 45.3 | NA | BagNet-17 | NA | NA
BagCert (ICLR 2021) | 22.4 | 47.3 | NA | BagNet-25 | NA | NA
BagCert (ICLR 2021) | 20.1 | 47.9 | NA | BagNet-29 | NA | NA
De-randomized Smoothing (NeurIPS 2020) (numbers from the Smoothed ViT paper) | 18.3 | 51.5 | NA | ResNet-50 | b=19; trained with pixel bands | code
PatchGuard (USENIX Security 2021) | 15.7 | 43.6 | 44.4 | ResNet-50 (de-randomized smoothing) | NA | code
Clipped BagNet (DLS 2020) | 14.4 | 53.7 | 56.5 | BagNet-17 | NA | code
De-randomized Smoothing (NeurIPS 2020) | 14 | 44.4 | NA | ResNet-50 | b=25; trained with pixel bands | code
PatchGuard (USENIX Security 2021) | 13.3 | 54.4 | 58.8 | BagNet-17 | NA | code
Clipped BagNet (DLS 2020) | 9.4 | 62.7 | 63 | BagNet-33 | NA | code
Clipped BagNet (DLS 2020) | 7.1 | 49.5 | 58.8 | BagNet-17 | NA | code
PatchGuard (USENIX Security 2021) | 6.9 | 61.2 | 66.6 | BagNet-33 | NA | code
Clipped BagNet (DLS 2020) | 1.9 | 60.3 | 66.6 | BagNet-33 | NA | code
IBP (ICLR 2020) | 0 | 0 | 0 | NA | does not scale to high-resolution images | code
(go back to table of contents)
An attack detection defense is allowed to issue an alert and abstain from making a prediction when it detects an attack.
- Attack detection defenses can usually tune their defense parameters to balance the trade-off between clean accuracy and robust accuracy.
- Here, I only report three representative operating points for each defense setup (i.e., different confidence thresholds tau):
  - low-tau: high robust accuracy and low clean accuracy; clean accuracy and robust accuracy are the same
  - high-tau: low robust accuracy and high clean accuracy; clean accuracy is close to vanilla undefended clean accuracy
  - mid-tau: in between
- You are encouraged to play with the code yourself :)
- The original Minority Reports paper does not discuss high-resolution images like ImageNet.
  - The results here (Minority Reports + PatchCleanser) are from the discussion section of the PatchCleanser paper.
  - The idea is to plug PatchCleanser's mask generation approach into the MR defense design; see https://github.com/inspire-group/PatchCleanser/blob/main/misc/pc_mr.py (a simplified sketch follows below).
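For intuition, here is a minimal, simplified sketch of that idea; the `classify` function and the mask set are assumed inputs, and the confidence-threshold (tau) tuning is omitted (see pc_mr.py linked above for the actual implementation):

```python
def detect_or_predict(image, masks, classify):
    """Simplified Minority-Reports-style detection using a PatchCleanser mask set.

    `classify(x)` is assumed to return a predicted label, and `masks` is a set
    of binary occlusion masks chosen so that at least one mask fully covers
    every possible patch location. Implementations typically also use a tunable
    confidence threshold tau (omitted here), which yields the low/mid/high-tau
    operating points reported below.
    """
    base_label = classify(image)
    for mask in masks:
        masked_label = classify(image * (1 - mask))  # occlude one candidate region
        if masked_label != base_label:
            # A disagreement may indicate an adversarial patch hiding under the
            # region that this mask removes: raise an alert and abstain.
            return None
    return base_label  # all one-masked predictions agree; output the label
```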
defense | certified robust accuracy | clean accuracy | vanilla clean accuracy (if applicable) | backbone | comment | code
---|---|---|---|---|---|---
ViP (ECCV 2022) | 74.6 | 74.6 | 83.7 | ViT-B (MAE) | Highest certified robust accuracy; trained with MAE; low-tau | code |
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 74.3 | 74.3 | 84.8 | ViT-B | k=6; low-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 73.7 | 73.7 | 84.8 | ViT-B | k=5; low-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 73.2 | 73.2 | 84.8 | ViT-B | k=4; low-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 72.8 | 72.8 | 84.5 | ViT-B | k=6; low-tau | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 71.2 | 71.2 | 84.8 | ViT-B | k=3; low-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 68.3 | 68.3 | 82.3 | ResNet-50 | low-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 67.6 | 67.6 | 80.8 | ResMLP-S | low-tau | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 67.5 | 67.5 | 80.2 | ResMLP-S | low-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 66.6 | 81.8 | 84.8 | ViT-B | k=6; mid-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 65.3 | 81.5 | 84.8 | ViT-B | k=5; mid-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 64.5 | 81.4 | 84.8 | ViT-B | k=4; mid-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 64.4 | 81.4 | 84.5 | ViT-B | k=6; mid-tau | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 63.4 | 63.4 | 82.8 | ResNet-50 | low-tau | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 62.1 | 75 | 80.8 | ResMLP-S | mid-tau | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 62.1 | 81 | 84.8 | ViT-B | k=3; mid-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 60.8 | 75.4 | 80.2 | ResMLP-S | mid-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 59.9 | 77.8 | 82.3 | ResNet-50 | mid-tau; trained with masked images | code
ScaleCert (NeurIPS 2021) | 55.4 | 58.5 | 73 | ResNet-50 | NA | |
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 51 | 78.2 | 82.8 | ResNet-50 | mid-tau | code
PatchGuard++ (ICLR Workshop 2021) | 49.8 | 49.8 | 63 | BagNet-33 | low-tau; trained with masked features | code |
PatchGuard++ (ICLR Workshop 2021) | 48.4 | 48.4 | 66.6 | BagNet-33 | low-tau | code |
PatchGuard++ (ICLR Workshop 2021) | 46.6 | 46.6 | 56.5 | BagNet-17 | low-tau; trained with masked features | code |
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 46.4 | 80.3 | 80.8 | ResMLP-S | high-tau | code
PatchGuard++ (ICLR Workshop 2021) | 45.5 | 45.5 | 58.8 | BagNet-17 | low-tau | code |
PatchGuard++ (ICLR Workshop 2021) | 39.9 | 60.1 | 66.6 | BagNet-33 | mid-tau | code |
PatchGuard++ (ICLR Workshop 2021) | 39 | 60.9 | 63 | BagNet-33 | mid-tau; trained with masked features | code |
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 37.6 | 84.4 | 84.5 | ViT-B | k=6; high-tau | code
PatchGuard++ (ICLR Workshop 2021) | 36.4 | 55.2 | 58.8 | BagNet-17 | mid-tau | code |
PatchGuard++ (ICLR Workshop 2021) | 34.7 | 55.4 | 56.5 | BagNet-17 | mid-tau; trained with masked features | code |
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 32.7 | 82.3 | 82.8 | ResNet-50 | high-tau | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 32.4 | 84.7 | 84.8 | ViT-B | k=6; high-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 31.7 | 84.7 | 84.8 | ViT-B | k=5; high-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 30.9 | 84.7 | 84.8 | ViT-B | k=4; high-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 29.2 | 82.1 | 82.3 | ResNet-50 | high-tau; trained with masked images | code
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 29.1 | 84.6 | 84.8 | ViT-B | k=3; high-tau; trained with masked images | code
PatchGuard++ (ICLR Workshop 2021) | 28 | 62.9 | 63 | BagNet-33 | high-tau; trained with masked features | code |
PatchGuard++ (ICLR Workshop 2021) | 27.6 | 56.4 | 56.5 | BagNet-17 | high-tau; trained with masked features | code |
Minority Reports (ACNS Workshop 2020) + PatchCleanser (USENIX Security 2022) | 26.7 | 80.1 | 80.2 | ResMLP-S | high-tau; trained with masked images | code
PatchGuard++ (ICLR Workshop 2021) | 22.7 | 66.3 | 66.6 | BagNet-33 | high-tau | code |
PatchGuard++ (ICLR Workshop 2021) | 19.9 | 58.7 | 58.8 | BagNet-17 | high-tau | code |