A curated list of papers and resources on data poisoning, backdoor attacks, and defenses against them.

Disclaimer: This repository may not include all relevant papers in this area. Use at your own discretion, and please contribute any missing or overlooked papers via pull request.
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses (TPAMI 2022) [paper]
- A Survey on Data Poisoning Attacks and Defenses (DSC 2022) [paper]
- Silent Killer: Optimizing Backdoor Trigger Yields a Stealthy and Powerful Data Poisoning Attack (arXiv 2023) [paper] [code]
- Exploring the Limits of Indiscriminate Data Poisoning Attacks (arXiv 2023) [paper]
- Students Parrot Their Teachers: Membership Inference on Model Distillation (arXiv 2023) [paper]
- CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning (arXiv 2023) [paper]
- More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models (arXiv 2023) [paper] [code]
- Feature Partition Aggregation: A Fast Certified Defense Against a Union of Sparse Adversarial Attacks (arXiv 2023) [paper] [code]
- ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms (arXiv 2023) [paper] [code]
- Temporal Robustness against Data Poisoning (arXiv 2023) [paper]
- Run-Off Election: Improved Provable Defense against Data Poisoning Attacks (arXiv 2023) [paper] (a sketch of the partition-and-vote idea these certified defenses build on follows this group of entries)
- A Systematic Evaluation of Backdoor Trigger Characteristics in Image Classification (arXiv 2023) [paper]
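Several of the certified defenses above, including Feature Partition Aggregation and Run-Off Election, build on a partition-and-vote scheme in the style of Deep Partition Aggregation. A minimal sketch of that shared idea follows; the `train_model` callable and the hash-based partitioning are illustrative placeholders, not any single paper's exact procedure:

```python
# Partition-and-vote certified defense (DPA-style sketch).
import hashlib
from collections import Counter

def partition_id(example_bytes: bytes, k: int) -> int:
    """Deterministically hash one training example into one of k partitions."""
    return int(hashlib.sha256(example_bytes).hexdigest(), 16) % k

def train_ensemble(dataset, k, train_model):
    """dataset: iterable of (x, y); train_model: hypothetical learner."""
    parts = [[] for _ in range(k)]
    for x, y in dataset:
        parts[partition_id(repr((x, y)).encode(), k)].append((x, y))
    return [train_model(part) for part in parts]

def certified_predict(models, x):
    """Majority vote plus a pointwise poisoning certificate."""
    votes = Counter(model(x) for model in models)
    ranked = votes.most_common()
    n_top = ranked[0][1]
    n_second = ranked[1][1] if len(ranked) > 1 else 0
    # Each poisoned training point lands in exactly one partition, so it can
    # flip at most one base model's vote; the prediction is therefore stable
    # against any attacker controlling up to (n_top - n_second) // 2 points.
    return ranked[0][0], (n_top - n_second) // 2
```

Because the partitioning is deterministic, one poisoned example influences at most one base classifier, which is what makes the vote-gap certificate sound.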
- Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only (ICLR 2023) [paper]
- TrojText: Test-time Invisible Textual Trojan Insertion (ICLR 2023) [paper] [code]
- Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning? (ICLR 2023) [paper] [code]
- Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning (ICLR 2023) [paper] [code]
- Incompatibility Clustering as a Defense Against Backdoor Poisoning Attacks (ICLR 2023) [paper] [code]
- Revisiting the Assumption of Latent Separability for Backdoor Defenses (ICLR 2023) [paper] [code]
- Few-shot Backdoor Attacks via Neural Tangent Kernels (ICLR 2023) [paper] [code]
- SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency (ICLR 2023) [paper] [code] (see the sketch after this group of entries)
- Revisiting Graph Adversarial Attack and Defense From a Data Distribution Perspective (ICLR 2023) [paper] [code]
- Provable Robustness against Wasserstein Distribution Shifts via Input Randomization (ICLR 2023) [paper]
- Don’t forget the nullspace! Nullspace occupancy as a mechanism for out of distribution failure (ICLR 2023) [paper]
- Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors (ICLR 2023) [paper] [code]
- Towards Robustness Certification Against Universal Perturbations (ICLR 2023) [paper] [code]
- Understanding Influence Functions and Datamodels via Harmonic Analysis (ICLR 2023) [paper]
- Distilling Cognitive Backdoor Patterns within an Image (ICLR 2023) [paper] [code]
- FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning (ICLR 2023) [paper] [code]
- UNICORN: A Unified Backdoor Trigger Inversion Framework (ICLR 2023) [paper] [code]
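SCALE-UP's black-box detection rests on an empirical observation: predictions on backdoored inputs tend to survive pixel-value amplification, while predictions on benign inputs usually do not. A rough sketch of that scaled-prediction-consistency test is below; the scale factors and threshold are illustrative choices, and `model` stands in for any classifier over images in [0, 1]:

```python
# Hedged sketch of a SCALE-UP-style scaled-prediction-consistency check.
import torch

@torch.no_grad()
def spc_score(model, x, scales=(3, 5, 7, 9, 11)):
    """x: batch of images in [0, 1]. Returns per-example consistency scores."""
    base = model(x).argmax(dim=1)
    consistent = torch.zeros_like(base, dtype=torch.float)
    for n in scales:
        # Amplify pixel intensities and re-query the (black-box) model.
        scaled_pred = model((n * x).clamp(0.0, 1.0)).argmax(dim=1)
        consistent += (scaled_pred == base).float()
    return consistent / len(scales)  # high score -> likely trigger-carrying

def flag_suspicious(model, x, threshold=0.8):
    """Flag inputs whose label barely changes under amplification."""
    return spc_score(model, x) >= threshold
```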
- Backdoor Defense via Deconfounded Representation Learning (CVPR 2023) [paper] [code]
- Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks (CVPR 2023) [paper]
- CUDA: Convolution-based Unlearnable Datasets (CVPR 2023) [paper] [code]
- Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger (CVPR 2023) [paper]
- Single Image Backdoor Inversion via Robust Smoothed Classifiers (CVPR 2023) [paper] [code]
- Defending Against Backdoor Attacks by Layer-wise Feature Analysis (PAKDD 2023) [paper] [code]
- How to Sift Out a Clean Data Subset in the Presence of Data Poisoning? (USENIX Security 2023) [paper] [code]
- Transferable Unlearnable Examples (arXiv 2022) [paper]
- Natural Backdoor Datasets (arXiv 2022) [paper]
- Dangerous Cloaking: Natural Trigger based Backdoor Attacks on Object Detectors in the Physical World (arXiv 2022) [paper]
- Backdoor Attacks on Self-Supervised Learning (CVPR 2022) [paper] [code]
- Poisons that are learned faster are more effective (CVPR 2022 Workshops) [paper]
- Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning (ICLR 2022) [paper] [code]
- Adversarial Unlearning of Backdoors via Implicit Hypergradient (ICLR 2022) [paper] [code]
- Not All Poisons are Created Equal: Robust Training against Data Poisoning (ICML 2022) [paper] [code]
- Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch (NeurIPS 2022) [paper] [code]
- Hidden Poison: Machine unlearning enables camouflaged poisoning attacks (NeurIPS 2022 Workshop MLSW) [paper]
- Hard to Forget: Poisoning Attacks on Certified Machine Unlearning (AAAI 2022) [paper] [code]
- PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning (USENIX Security 2022) [paper]
- Planting Undetectable Backdoors in Machine Learning Models (FOCS 2022) [paper]
- How Robust Are Randomized Smoothing Based Defenses to Data Poisoning? (CVPR 2021) [paper]
- Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure Dataset Release (ICLR 2021 Workshop on Security and Safety in Machine Learning Systems) [paper]
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching (ICLR 2021) [paper] [code] (see the gradient-matching sketch after this group of entries)
- Unlearnable Examples: Making Personal Data Unexploitable (ICLR 2021) [paper] [code]
- Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks (ICLR 2021) [paper] [code]
- What Doesn't Kill You Makes You Robust(er): How to Adversarially Train against Data Poisoning (ICLR 2021 Workshop) [paper]
- Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks (ICML 2021) [paper] [code]
- Neural Tangent Generalization Attacks (ICML 2021) [paper]
- SPECTRE: Defending Against Backdoor Attacks Using Robust Covariance Estimation (ICML 2021) [paper]
- Adversarial Examples Make Strong Poisons (NeurIPS 2021) [paper]
- Anti-Backdoor Learning: Training Clean Models on Poisoned Data (NeurIPS 2021) [paper] [code]
- Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective (ICCV 2021) [paper] [code]
- Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff (ICASSP 2021) [paper]
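Witches' Brew crafts clean-label poisons by gradient matching: perturb a handful of base images so that the training gradient they induce aligns with the gradient an attacker would need to misclassify a chosen target. A simplified single-model sketch is below; the real attack averages over model ensembles and restarts, and `model`, the tensors, and the hyperparameters here are placeholders:

```python
# Single-model gradient-matching sketch in the spirit of Witches' Brew.
import torch
import torch.nn.functional as F

def gradient_matching_poison(model, x_base, y_base, x_target, y_adv,
                             eps=16 / 255, steps=100, lr=0.01):
    params = [p for p in model.parameters() if p.requires_grad]
    # Gradient the attacker wants training to take: push the target
    # input toward the adversarial label y_adv.
    adv_loss = F.cross_entropy(model(x_target), y_adv)
    g_adv = torch.autograd.grad(adv_loss, params)

    delta = torch.zeros_like(x_base, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        # Training gradient induced by the (still correctly labeled) poisons.
        train_loss = F.cross_entropy(model(x_base + delta), y_base)
        g_poison = torch.autograd.grad(train_loss, params, create_graph=True)
        # Minimize 1 - cosine similarity between the two gradient vectors.
        num = sum((a * b).sum() for a, b in zip(g_adv, g_poison))
        den = (sum((a * a).sum() for a in g_adv).sqrt()
               * sum((b * b).sum() for b in g_poison).sqrt())
        loss = 1 - num / den
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation imperceptible
    return (x_base + delta).detach()
```

Keeping the original labels is what makes the attack clean-label: the poisons survive human inspection, yet steer training toward the attacker's target misclassification.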
- Invisible backdoor attacks on deep neural networks via steganography and regularization (TDSC 2020) [paper]
- Backdooring and poisoning neural networks with image-scaling attacks (arXiv 2020) [paper] (see the toy scaling-attack sketch after this group of entries)
- MetaPoison: Practical General-purpose Clean-label Data Poisoning (NeurIPS 2020) [paper]
- Input-Aware Dynamic Backdoor Attack (NeurIPS 2020) [paper] [code]
- How To Backdoor Federated Learning (AISTATS 2020) [paper]
- Reflection backdoor: A natural backdoor attack on deep neural networks (ECCV 2020) [paper]
- Radioactive data: tracing through training (ICML 2020) [paper]
- Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks (ICML 2020) [paper]
- Hidden Trigger Backdoor Attacks (AAAI 2020) [paper] [code]
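Image-scaling attacks exploit the fact that downscaling samples only a sparse subset of pixels, so an attacker can hide a second image that appears only after resizing. The toy sketch below attacks a hand-rolled nearest-neighbour downscaler on grayscale images; real attacks instead solve an optimization against the victim's actual resizing routine (OpenCV, PIL, etc.):

```python
# Toy image-scaling attack: overwrite only the pixels the downscaler samples.
import numpy as np

def nearest_downscale(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    h, w = img.shape[:2]
    rows = (np.arange(out_h) * h) // out_h
    cols = (np.arange(out_w) * w) // out_w
    return img[rows][:, cols]

def craft_scaling_attack(base: np.ndarray, target: np.ndarray) -> np.ndarray:
    """base: HxW benign image; target: hxw image to appear after scaling."""
    h, w = base.shape[:2]
    th, tw = target.shape[:2]
    attack = base.copy()
    rows = (np.arange(th) * h) // th
    cols = (np.arange(tw) * w) // tw
    # Only about (th*tw)/(h*w) of the pixels change, so the attack image
    # still looks like the benign base at full resolution.
    attack[np.ix_(rows, cols)] = target
    return attack

base = np.full((256, 256), 200, dtype=np.uint8)   # bright benign image
target = np.zeros((32, 32), dtype=np.uint8)       # dark hidden payload
adv = craft_scaling_attack(base, target)
assert np.array_equal(nearest_downscale(adv, 32, 32), target)
```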
- Label-consistent backdoor attacks (arXiv 2019) [paper]
- BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain (IEEE Access 2019) [paper] (see the trigger-poisoning sketch after this group of entries)
- Sever: A Robust Meta-Algorithm for Stochastic Optimization (ICML 2019) [paper]
- Learning with Bad Training Data via Iterative Trimmed Loss Minimization (ICML 2019) [paper]
- Universal Multi-Party Poisoning Attacks (ICML 2019) [paper]
- Transferable Clean-Label Poisoning Attacks on Deep Neural Nets (ICML 2019) [paper]
- Defending Neural Backdoors via Generative Distribution Modeling (NeurIPS 2019) [paper]
- Learning to Confuse: Generating Training Time Adversarial Data with Auto-Encoder (NeurIPS 2019) [paper] [code]
- The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure (AAAI 2019) [paper]
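BadNets remains the canonical dirty-label backdoor: stamp a fixed trigger patch on a small fraction of the training images, relabel them to the target class, and present the same patch at test time to activate the backdoor. A minimal sketch follows; the array shapes, poison rate, and white-square trigger are illustrative choices:

```python
# Minimal BadNets-style dirty-label poisoning sketch.
import numpy as np

def poison_badnets(images: np.ndarray, labels: np.ndarray,
                   target_class: int, rate: float = 0.05,
                   patch: int = 3, seed: int = 0):
    """images: (N, H, W) floats in [0, 1]; labels: (N,) int class ids."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    images[idx, -patch:, -patch:] = 1.0   # white square in the corner
    labels[idx] = target_class            # dirty-label: relabel to target
    return images, labels

def apply_trigger(x: np.ndarray, patch: int = 3) -> np.ndarray:
    """Stamp the same trigger at test time to activate the backdoor."""
    x = x.copy()
    x[..., -patch:, -patch:] = 1.0
    return x
```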
- Spectral Signatures in Backdoor Attacks (NeurIPS 2018) [paper] (see the filtering sketch at the end of the list)
- Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks (NeurIPS 2018) [paper]
- Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise (NeurIPS 2018) [paper]
- Trojaning Attack on Neural Networks (NDSS 2018) [paper]
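The Spectral Signatures defense observes that backdoor poisons leave an outlier trace along the top singular vector of the centered feature representations within each class. A condensed sketch of the per-class filtering step is below; the features would come from a trained network's penultimate layer, and removing 1.5x the expected poison fraction follows the paper's suggestion:

```python
# Condensed sketch of per-class Spectral Signatures filtering.
import numpy as np

def spectral_signature_scores(feats: np.ndarray) -> np.ndarray:
    """feats: (n, d) representations of one class. Returns outlier scores."""
    centered = feats - feats.mean(axis=0, keepdims=True)
    # Top right-singular vector of the centered representation matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[0]
    return (centered @ top) ** 2  # squared correlation with top direction

def filter_class(feats: np.ndarray, expected_poison_frac: float = 0.05):
    """Return indices of examples to keep for retraining."""
    scores = spectral_signature_scores(feats)
    n_remove = int(1.5 * expected_poison_frac * len(feats))
    return np.argsort(scores)[: len(feats) - n_remove]
```

After filtering each class and retraining on the retained indices, the backdoor's attack success rate typically drops sharply while clean accuracy is largely preserved.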