Awesome-security-in-diffusion-models

This is a collection of papers I have read (carefully or roughly) in the field of security for diffusion models. Suggestions and comments are welcome (2801198407@qq.com).

Awesome security in diffusion models

Concept erasure

Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
CVPR 2023 Star

Ablating Concepts in Text-to-Image Diffusion Models
ICCV 2023 Star

Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
ICCV 2023 Star

Erasing Concepts from Diffusion Models
ICCV 2023 Star

Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
arxiv Star

Inst-Inpaint: Instructing to Remove Objects with Diffusion Models
arxiv Star

Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models
arxiv Star

Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models
ICML 2023 workshop Star

Unified Concept Editing in Diffusion Models
WACV 2024 Star

Implicit Concept Removal of Diffusion Models
arxiv

To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now
arxiv Star

Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers
arxiv

All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models
arxiv

One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications
arxiv Star

EraseDiff: Erasing Data Influence in Diffusion Models
arxiv

Separable Multi-Concept Erasure from Diffusion Models
arxiv Star

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
arxiv Star

Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models?
arxiv Star

Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models
arxiv Star

Localizing and Editing Knowledge in Text-to-Image Generative Models
arxiv

UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models
arxiv Star

Universal Prompt Optimizer for Safe Text-to-Image Generation
arxiv

Circumventing Concept Erasure Methods For Text-to-Image Generative Models
arxiv Star

MACE: Mass Concept Erasure in Diffusion Models
arxiv Star

Position: Towards Implicit Prompt For Text-To-Image Models
arxiv

Editing Massive Concepts in Text-to-Image Diffusion Models
arxiv

Removing Undesirable Concepts in Text-to-Image Generative Models with Learnable Prompts
arxiv

On Mechanistic Knowledge Localization in Text-to-Image Generative Models
arxiv

Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models
arxiv

ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning
arxiv Star

R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model
arxiv

Pruning for Robust Concept Erasing in Diffusion Models
arxiv

Erasing Concepts from Text-to-Image Diffusion Models with Few-shot Unlearning
arxiv Star

Text Guided Image Editing with Automatic Concept Locating and Forgetting
arxiv

Robust Concept Erasure Using Task Vectors
arxiv

Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient
arxiv

Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
arxiv Star

CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models
arxiv Star

Concept debiasing

Instructing Text-to-Image Generation Models on Fairness
arxiv Star

Debiasing Pretrained Generative Models by Uniformly Sampling Semantic Attributes
arxiv

De-stereotyping Text-to-image Models through Prompt Tuning
arxiv

Stable Bias: Evaluating Societal Representations in Diffusion Models
arxiv

Debiasing Vision-Language Models via Biased Prompts
arxiv Star

Discovering and Mitigating Biases in CLIP-based Image Editing
arxiv

Unified Concept Editing in Diffusion Models
WACV 2024 Star

Finetuning Text-to-Image Diffusion Models for Fairness
arxiv Star

Fair Text-to-Image Diffusion via Fair Mapping
arxiv

Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation
arxiv Star

Debiasing Text-to-Image Diffusion Models
arxiv

Balancing Act: Distribution-Guided Debiasing in Diffusion Models
arxiv

Training Unbiased Diffusion Models From Biased Dataset
arxiv Star

Severity Controlled Text-to-Image Generative Model Bias Manipulation
arxiv

OpenBias: Open-set Bias Detection in Text-to-Image Generative Models
arxiv Star

SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation
arxiv

FairRAG: Fair Human Generation via Fair Retrieval Augmentation
arxiv

EquiPrompt: Debiasing Diffusion Models via Iterative Bootstrapping in Chain of Thoughts
arxiv

Backdoor attack on diffusion model

How to Backdoor Diffusion Models?
arxiv Star

TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
arxiv Star

VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models
arxiv Star

Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
arxiv Star

Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning
arxiv Star

BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models
arxiv Star

From Trojan Horses to Castle Walls: Unveiling Bilateral Backdoor Effects in Diffusion Models
arxiv Star

The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
arxiv

Personalization as a Shortcut for Few-Shot Backdoor Attack against Text-to-Image Diffusion Models
arxiv Star

Nightshade: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models
arxiv Star

Invisible Backdoor Attacks on Diffusion Models
arxiv Star

A Recipe for Watermarking Diffusion Models
arxiv Star

Backdoor defense on diffusion model

DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models
arxiv

How to Remove Backdoors in Diffusion Models?
arxiv

Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
arxiv Star

UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models
arxiv Star

FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing
arxiv Star

TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors
arxiv

Adversarial attack on diffusion model

Inference attack on diffusion model

Copyright on diffusion model

A Recipe for Watermarking Diffusion Models
arxiv Star

The Stable Signature: Rooting Watermarks in Latent Diffusion Models
arxiv Star

Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust
arxiv Star

DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models
arxiv Star

RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification
arxiv Star

Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models
arxiv Star

“Lazy” Layers to Make Fine-Tuned Diffusion Models More Traceable
arxiv

A Training-Free Plug-and-Play Watermark Framework for Stable Diffusion
arxiv

ModelLock: Locking Your Model With a Spell
arxiv

Disguised Copyright Infringement of Latent Diffusion Models
arxiv Star

WMAdapter: Adding WaterMark Control to Latent Diffusion Models
arxiv

AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA
arxiv Star

Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious?
arxiv Star

Diffusion model as a tool for defense

Black-box Backdoor Defense via Zero-shot Image Purification
arxiv Star

DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models
arxiv Star

This list will be updated periodically.

Star History
