This repository is the source code for "DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models" (ICLR 2024).
The required dependencies are listed in requirements.txt.
Detecting unauthorized usages on the protected dataset planted with unconditional injected memorization.
- Planting unconditional injected memorization into model:
python coating.py --p 1.0 --target_type wanet --unconditional --wanet_s 2 --remove_eval
- Training the model on the protected dataset planted with unconditional injected memorization:
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export TRAIN_DATA_DIR="./traindata_p1.0_wanet_unconditional_s2.0_k128_removeeval/train/"
export OUTPUT_DIR="output_p1.0_wanet_unconditional_s2.0_k128"
CUDA_VISIBLE_DEVICES=0 accelerate launch --mixed_precision="fp16" train_text_to_image_lora.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--train_data_dir=$TRAIN_DATA_DIR --caption_column="additional_feature" \
--resolution=512 --random_flip \
--train_batch_size=1 \
--num_train_epochs=100 --checkpointing_steps=5000 \
--learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
--seed=42 \
--output_dir=$OUTPUT_DIR \
--validation_prompt=None --report_to="wandb"
- Tracing unauthorized data usages.
- First, generate a set of samples using the inspected model:
export MODEL_PATH="output_p1.0_wanet_unconditional_s2.0_k128"
export SAVE_PATH="./generated_imgs_p1.0_wanet_unconditional_s2.0_k128/"
CUDA_VISIBLE_DEVICES=0 python generate.py --model_path $MODEL_PATH --save_path $SAVE_PATH
- Second, approximate the memorization strength and flag the malicious model:
Construct positive and negative samples for training the binary classifier:
python coating.py --p 1.0 --target_type wanet --unconditional --wanet_s 2
python coating.py --p 0.0 --target_type none
Train the binary classifier and approximate the memorization strength:
export ORI_DIR="./traindata_p0.0_none/train/"
export COATED_DIR="./traindata_p1.0_wanet_unconditional_s2.0_k128/train/"
export GENERATED_INSPECTED_DIR="./generated_imgs_p1.0_wanet_unconditional_s2.0_k128/"
CUDA_VISIBLE_DEVICES=0 python binary_classifier.py --ori_dir $ORI_DIR \
--coated_dir $COATED_DIR \
--generated_inspected_dir $GENERATED_INSPECTED_DIR
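The directory names used in the commands above appear to encode the coating.py flags. The sketch below rebuilds those names from the flag values so that TRAIN_DATA_DIR and COATED_DIR can be kept in sync with the coating command. The function name coated_dir and the naming pattern are assumptions inferred only from the example paths in this README; coating.py may build the name differently, and the k128 suffix is treated as fixed because it appears in every wanet path here. Note that the strength is passed as it appears in the path (2.0), not as typed on the command line (2).

```shell
# Hypothetical helper: rebuild the dataset directory shown in this README
# from the coating.py flag values. The pattern is inferred from the example
# paths only and is not taken from coating.py itself.
coated_dir() {
    # $1 = --p value            (e.g. 1.0)
    # $2 = --target_type        (wanet or none)
    # $3 = "yes" if --unconditional was passed
    # $4 = --wanet_s value as it appears in the path (e.g. 2.0)
    # $5 = "yes" if --remove_eval was passed
    p=$1; target=$2; unconditional=$3; s=$4; remove_eval=$5
    name="traindata_p${p}_${target}"
    if [ "$target" = "wanet" ]; then
        if [ "$unconditional" = "yes" ]; then
            name="${name}_unconditional"
        fi
        # Assumption: k128 is a fixed suffix for wanet targets.
        name="${name}_s${s}_k128"
    fi
    if [ "$remove_eval" = "yes" ]; then
        name="${name}_removeeval"
    fi
    printf './%s/train/\n' "$name"
}

# Example: the training directory for the unconditional setting.
coated_dir 1.0 wanet yes 2.0 yes
```

If the printed path does not match the directory that coating.py actually created, trust the directory on disk and adjust the exports accordingly.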
Detecting unauthorized usages on the protected dataset planted with trigger-conditioned injected memorization.
- Planting trigger-conditioned injected memorization into model:
python coating.py --p 0.2 --target_type wanet --wanet_s 1 --remove_eval
- Training the model on the protected dataset planted with trigger-conditioned injected memorization:
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export TRAIN_DATA_DIR="./traindata_p0.2_wanet_s1.0_k128_removeeval/train/"
export OUTPUT_DIR="output_p0.2_wanet_s1.0_k128"
CUDA_VISIBLE_DEVICES=0 accelerate launch --mixed_precision="fp16" train_text_to_image_lora.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--train_data_dir=$TRAIN_DATA_DIR --caption_column="additional_feature" \
--resolution=512 --random_flip \
--train_batch_size=1 \
--num_train_epochs=100 --checkpointing_steps=5000 \
--learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
--seed=42 \
--output_dir=$OUTPUT_DIR \
--validation_prompt=None --report_to="wandb"
- Tracing unauthorized data usages.
- First, generate a set of samples using the inspected model:
export MODEL_PATH="output_p0.2_wanet_s1.0_k128"
export SAVE_PATH="./generated_imgs_p0.2_wanet_s1.0_k128/"
CUDA_VISIBLE_DEVICES=0 python generate.py --model_path $MODEL_PATH --save_path $SAVE_PATH
- Second, approximate the memorization strength and flag the malicious model:
Construct positive and negative samples for training the binary classifier:
python coating.py --p 1.0 --target_type wanet --unconditional --wanet_s 1
python coating.py --p 0.0 --target_type none
Train the binary classifier and approximate the memorization strength:
export ORI_DIR="./traindata_p0.0_none/train/"
export COATED_DIR="./traindata_p1.0_wanet_unconditional_s1.0_k128/train/"
export GENERATED_INSPECTED_DIR="./generated_imgs_p0.2_wanet_s1.0_k128/"
CUDA_VISIBLE_DEVICES=0 python binary_classifier.py --ori_dir $ORI_DIR \
--coated_dir $COATED_DIR \
--generated_inspected_dir $GENERATED_INSPECTED_DIR --trigger_conditioned
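The classifier step depends on three directories produced by earlier commands, and a mistyped path is easy to miss until the script fails. A minimal pre-flight check, assuming only that each directory should exist and contain at least one file, might look like the sketch below. The function name check_dir is hypothetical and is not part of this repository.

```shell
# Hypothetical pre-flight check: succeed only if the argument is an
# existing directory that contains at least one regular file.
check_dir() {
    if [ ! -d "$1" ]; then
        echo "missing directory: $1" >&2
        return 1
    fi
    # find prints one line per regular file; head stops after the first.
    if [ -z "$(find "$1" -type f | head -n 1)" ]; then
        echo "no files found in: $1" >&2
        return 1
    fi
    return 0
}

# Possible usage before calling binary_classifier.py:
#   for d in "$ORI_DIR" "$COATED_DIR" "$GENERATED_INSPECTED_DIR"; do
#       check_dir "$d" || exit 1
#   done
```

The check does not inspect file types or image counts, so it only catches missing or empty directories.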
Running the detection on a model trained on the unprotected dataset.
- Get the unprotected dataset:
python coating.py --p 0.0 --target_type none --remove_eval
- Training the model on the unprotected dataset:
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export TRAIN_DATA_DIR="./traindata_p0.0_none_removeeval/train/"
export OUTPUT_DIR="output_p0.0_none"
CUDA_VISIBLE_DEVICES=0 accelerate launch --mixed_precision="fp16" train_text_to_image_lora.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--train_data_dir=$TRAIN_DATA_DIR --caption_column="additional_feature" \
--resolution=512 --random_flip \
--train_batch_size=1 \
--num_train_epochs=100 --checkpointing_steps=5000 \
--learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
--seed=42 \
--output_dir=$OUTPUT_DIR \
--validation_prompt=None --report_to="wandb"
- Tracing unauthorized data usages.
- First, generate a set of samples using the inspected model:
export MODEL_PATH="output_p0.0_none"
export SAVE_PATH="./generated_imgs_p0.0_none/"
CUDA_VISIBLE_DEVICES=0 python generate.py --model_path $MODEL_PATH --save_path $SAVE_PATH
- Approximate the (unconditional) memorization strength and flag the malicious model:
Construct positive and negative samples for training the binary classifier:
python coating.py --p 1.0 --target_type wanet --unconditional --wanet_s 1
python coating.py --p 0.0 --target_type none
Train the binary classifier and approximate the memorization strength:
export ORI_DIR="./traindata_p0.0_none/train/"
export COATED_DIR="./traindata_p1.0_wanet_unconditional_s1.0_k128/train/"
export GENERATED_INSPECTED_DIR="./generated_imgs_p0.0_none/"
CUDA_VISIBLE_DEVICES=0 python binary_classifier.py --ori_dir $ORI_DIR \
--coated_dir $COATED_DIR \
--generated_inspected_dir $GENERATED_INSPECTED_DIR
- Approximate the (trigger-conditioned) memorization strength and flag the malicious model:
Construct positive and negative samples for training the binary classifier:
python coating.py --p 1.0 --target_type wanet --unconditional --wanet_s 2
python coating.py --p 0.0 --target_type none
Train the binary classifier and approximate the memorization strength:
export ORI_DIR="./traindata_p0.0_none/train/"
export COATED_DIR="./traindata_p1.0_wanet_unconditional_s2.0_k128/train/"
export GENERATED_INSPECTED_DIR="./generated_imgs_p0.0_none/"
CUDA_VISIBLE_DEVICES=0 python binary_classifier.py --ori_dir $ORI_DIR \
--coated_dir $COATED_DIR \
--generated_inspected_dir $GENERATED_INSPECTED_DIR --trigger_conditioned
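The two inspections of the unprotected model call the same script and differ only in the coated directory and in whether --trigger_conditioned is passed. The dry-run sketch below assembles the command line from its parts and prints it, which makes it easy to review before running. The function name classifier_cmd is hypothetical, only flags that appear in this README are used, and paths containing spaces would need extra quoting.

```shell
# Hypothetical dry-run helper: print the binary_classifier.py command for a
# given set of directories. Nothing is executed.
classifier_cmd() {
    # $1 = original (uncoated) dir, $2 = coated dir, $3 = generated images dir
    # $4 = "trigger" to append --trigger_conditioned; any other value omits it
    cmd="python binary_classifier.py --ori_dir $1 --coated_dir $2 --generated_inspected_dir $3"
    if [ "$4" = "trigger" ]; then
        cmd="$cmd --trigger_conditioned"
    fi
    printf '%s\n' "$cmd"
}

# Example: print the trigger-conditioned inspection of the unprotected model.
classifier_cmd "./traindata_p0.0_none/train/" \
    "./traindata_p1.0_wanet_unconditional_s2.0_k128/train/" \
    "./generated_imgs_p0.0_none/" trigger
```

Printing the command first also makes a stray backslash or a trailing space in a path visible before any training time is spent.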
Part of the code is modified from https://github.com/huggingface/diffusers/tree/main/examples/text_to_image.
You are encouraged to cite the following paper if you use the repo for academic research.
@inproceedings{wang2023diagnosis,
title={DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models},
author={Wang, Zhenting and Chen, Chen and Lyu, Lingjuan and Metaxas, Dimitris N and Ma, Shiqing},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024}
}