IBM/Autozoom-Attack

Question about the modifiers


Hi,

I'm running the code on ImageNet for untargeted attacks, and I make sure that the original images are correctly classified. After attacking a few images, I found that the next image is sometimes already adversarial (the initial success occurs at the very first iteration).

Is this because the modifier is not reset after attacking an image? I suspect the leftover noise remains adversarial for new images, somewhat like a "universal perturbation".

If so, do I need to reset the modifier after attacking each image? (See the sketch below for what I mean by resetting.)
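For concreteness, here is roughly what I mean by "resetting", a minimal TF 1.x sketch (the variable name `modifier`, the shapes, and the dummy images are just placeholders, not the actual AutoZOOM attack code):

```python
import numpy as np
import tensorflow as tf  # TF 1.x API, matching the codebase

# Placeholder standing in for the attack's perturbation variable ("modifier").
modifier = tf.Variable(tf.zeros([1, 32, 32, 3]), name="modifier")

# Dummy stand-ins for the images to be attacked.
images = [np.zeros((1, 32, 32, 3), dtype=np.float32) for _ in range(3)]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for image in images:
        # Re-initialize the perturbation to zero before attacking each new image.
        sess.run(modifier.initializer)
        # ... run the attack's optimization loop for `image` here ...
```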

Hello dongyp13,

We don't think the modifier is the reason for the initial success at the first iteration. In fact, it is not uncommon to find an initial success after just one iteration (although it may come with a large distortion). For example, the well-known fast gradient sign method (FGSM) can find adversarial perturbations in a single step.
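To illustrate the point, here is a minimal single-step FGSM sketch in TF 1.x, using a toy linear classifier and a random image (none of this is the AutoZOOM code; it only shows that one gradient-sign step already produces a perturbed image):

```python
import numpy as np
import tensorflow as tf  # TF 1.x API

x = tf.placeholder(tf.float32, [1, 32, 32, 3])
y = tf.placeholder(tf.int64, [1])

# Toy classifier; any differentiable model works the same way.
logits = tf.layers.dense(tf.layers.flatten(x), 10)
loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=logits)

# One FGSM step: move the image in the direction of the sign of the loss gradient.
epsilon = 0.03
grad = tf.gradients(loss, x)[0]
x_adv = tf.clip_by_value(x + epsilon * tf.sign(grad), 0.0, 1.0)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    adv = sess.run(x_adv, feed_dict={x: np.random.rand(1, 32, 32, 3), y: [3]})
```

Whether that one-step perturbation succeeds depends on the model and the image, but it shows why an "initial success" at the first iteration is not surprising by itself.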