cure-lab/MMA-Diffusion

Questions about the paper regarding the text embedding adversarial attack

LezJ opened this issue · 1 comment

LezJ commented

Hi authors, thanks for your amazing work highlighting the existing safety problems of T2I models. I assume that your text-embedding-based adversarial attack requires access to the exact text encoder of the target model, right? But in real-life settings, a user normally won't know which text encoder is being used (for example, with DALL·E 3). I guess we can't apply the attack in such situations, right?

Best

The proposed attacks are conducted on the open-source Stable Diffusion and then transferred directly to other types of T2I models. We found that the transfer attack success rate is decent. Although the text encoders in different T2I models have distinct architectures, they may be trained on similar language data and therefore capture similar underlying semantic relationships among words, which results in the transferability.
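
As a rough illustration of that last point (not an experiment from the paper), the toy sketch below compares the pairwise similarity structure of two made-up "encoders". Everything here is an assumption for illustration: the word list, the hand-written vectors in `LATENT`, and the functions `encoder_a` and `encoder_b` are hypothetical and do not correspond to any real text encoder. The second encoder uses different axes, scale, and a little noise, yet the relationships among words stay largely the same, which is the kind of shared structure the reply suggests may underlie transferability.

```python
import math
import random

# Toy "semantic" vectors, invented for illustration (not from any real model).
LATENT = {
    "cat":   [1.0, 0.1, 0.0],
    "dog":   [0.9, 0.2, 0.1],
    "car":   [0.1, 1.0, 0.0],
    "truck": [0.2, 0.9, 0.1],
    "apple": [0.0, 0.1, 1.0],
    "pear":  [0.1, 0.0, 0.9],
}

def encoder_a(vec):
    """Hypothetical encoder A: uses the latent coordinates directly."""
    return list(vec)

def encoder_b(vec, rng):
    """Hypothetical encoder B: permuted, sign-flipped, rescaled axes plus small noise."""
    x, y, z = vec
    base = [2.0 * z, -2.0 * x, 2.0 * y]
    return [v + rng.gauss(0.0, 0.05) for v in base]

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pairwise_cosine(embeddings):
    """Cosine similarity for every unordered pair of words, in a fixed order."""
    words = sorted(embeddings)
    return [
        cosine(embeddings[words[i]], embeddings[words[j]])
        for i in range(len(words))
        for j in range(i + 1, len(words))
    ]

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the toy run is repeatable
emb_a = {w: encoder_a(v) for w, v in LATENT.items()}
emb_b = {w: encoder_b(v, rng) for w, v in LATENT.items()}

# A high correlation means both encoders agree on which words are close to which.
score = pearson(pairwise_cosine(emb_a), pairwise_cosine(emb_b))
print(f"similarity-structure correlation: {score:.3f}")
```

With real models, the same comparison could in principle be run on embeddings taken from two actual text encoders. Whether the agreement would be as strong as in this synthetic case is not something the sketch can show.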