labsix/adversarial-logit-pairing-analysis

A Question regarding the Clean Pair

Closed this issue · 5 comments

Hi! Thank you for sharing the code! I have a question: do you remember the loss function for the clean pair? I don't understand why we should measure and minimize a loss between two clean data points.

thx!

As far as I understand, we optimize this loss term in order to find an adversarial example that has a different label but logits similar to those of the attacked clean data point. I'm not sure this understanding is correct, because I can't find any generative process.

We don't do anything particularly special to ensure that the logit representation is similar to the clean data point.

We simply run PGD (using a large number of iterations, as opposed to the ALP paper's small number of iterations), and it ends up finding adversarial examples within the specified threat model's L-inf ball.
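To make the procedure concrete, here is a minimal sketch of untargeted L-inf PGD. It is not this repository's code. It assumes a toy linear softmax classifier in place of the real network, inputs in the range [0, 1], and no random start; the names `pgd_linf`, `predict`, `eps`, `step`, and `iters` are hypothetical and chosen only for illustration.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


def logits(W, b, x):
    """Logits of a toy linear classifier: W x + b."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def predict(W, b, x):
    """Index of the largest logit."""
    z = logits(W, b, x)
    return max(range(len(z)), key=lambda k: z[k])


def loss_grad_wrt_input(W, b, x, label):
    """Gradient of the cross-entropy loss with respect to the input x."""
    d = softmax(logits(W, b, x))
    d[label] -= 1.0  # softmax minus one-hot label
    return [sum(W[k][j] * d[k] for k in range(len(W))) for j in range(len(x))]


def pgd_linf(W, b, x, label, eps, step, iters):
    """Untargeted PGD: ascend the loss, projecting back into the L-inf ball."""
    x_adv = list(x)
    for _ in range(iters):
        g = loss_grad_wrt_input(W, b, x_adv, label)
        for j in range(len(x)):
            sign = (g[j] > 0) - (g[j] < 0)
            v = x_adv[j] + step * sign
            # Project onto the L-inf ball of radius eps around the clean input.
            v = min(max(v, x[j] - eps), x[j] + eps)
            # Keep the input in the assumed valid range [0, 1].
            x_adv[j] = min(max(v, 0.0), 1.0)
    return x_adv


# Toy example: two classes, two input features (all values are made up).
W = [[1.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.0]
x_clean = [0.6, 0.4]
x_adv = pgd_linf(W, b, x_clean, label=0, eps=0.2, step=0.05, iters=20)
print(predict(W, b, x_clean), predict(W, b, x_adv))
```

Nothing in the loop pushes the adversarial logits toward the clean ones. The only constraint is the projection step, which keeps every coordinate within `eps` of the clean input.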

Ah! Thanks for the reply! But I still don't understand why there is a NullAttack (I know it is mentioned in the paper as CLP). Could you please tell me why we need it to improve the performance?
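For reference, here is a minimal sketch of what a clean-pair penalty could look like. It assumes, based on my reading of the ALP paper, that CLP pairs the logits of randomly chosen clean examples in a batch and penalizes the squared L2 distance between them. The function name `clean_logit_pairing_penalty` and its arguments are hypothetical and are not taken from this repository.

```python
import random


def clean_logit_pairing_penalty(batch_logits, rng):
    """Mean squared L2 distance between logits of randomly paired clean examples.

    batch_logits: list of per-example logit lists (assumed to have at least 2 rows).
    rng: a random.Random instance used to pick the pairs.
    """
    n = len(batch_logits)
    if n < 2:
        raise ValueError("need at least two examples to form a pair")
    order = list(range(n))
    rng.shuffle(order)
    half = n // 2
    total = 0.0
    for i in range(half):
        first = batch_logits[order[i]]
        second = batch_logits[order[i + half]]
        total += sum((u - v) ** 2 for u, v in zip(first, second))
    return total / half


# Toy batch of four examples with two logits each (values are made up).
toy_logits = [[2.0, -1.0], [0.5, 0.5], [-1.0, 3.0], [0.0, 0.0]]
penalty = clean_logit_pairing_penalty(toy_logits, random.Random(0))
print(penalty)
```

Under this assumption, the training loss would be the usual cross-entropy plus a weight times this penalty, with the weight being a hyperparameter. No attack is involved, which would be consistent with the name NullAttack.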

Ah! I get it!