Trusted-AI/adversarial-robustness-toolbox

L¹ `FGM` is wrong + extend to all p >= 1

Closed this issue · 5 comments

Hello,

I'm not sure, but I think the FGM extension to the $L^1$ norm is not correct.

From what I can read here, it seems to me that the current version implements (essentially)
$$\text{noise direction}=\frac{\nabla}{\Vert\nabla\Vert_1}$$
when
$$\text{noise direction}=(0, \dots, 0, \text{sign}(\nabla_i), 0, \dots, 0),\quad(i=\text{argmax}_j\vert\nabla_j\vert)$$
gives a higher inner product $\langle\nabla,\text{noise direction}\rangle$ for the same $L^1$ budget.

Indeed, in both cases $\Vert\text{noise direction}\Vert_1=1$, while the first and second options respectively give inner products of $\Vert\nabla\Vert_2^2/\Vert\nabla\Vert_1$ and $\Vert\nabla\Vert_{\infty}$. The latter is at least as large by Hölder's inequality, since $\Vert\nabla\Vert_2^2\leq\Vert\nabla\Vert_{\infty}\Vert\nabla\Vert_1$.
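
A quick numerical check of this comparison, as a minimal NumPy sketch. The gradient values are made up for illustration, and the variable names are not taken from the ART code base.

```python
import numpy as np

# Made-up gradient, purely for illustration.
grad = np.array([0.5, -2.0, 1.0, 0.25])

# Option 1: the gradient rescaled to unit L1 norm.
scaled = grad / np.abs(grad).sum()

# Option 2: the whole L1 budget on the largest-magnitude coordinate.
i = np.argmax(np.abs(grad))
one_hot = np.zeros_like(grad)
one_hot[i] = np.sign(grad[i])

# Both directions have unit L1 norm, so they spend the same budget.
l1_scaled = np.abs(scaled).sum()
l1_one_hot = np.abs(one_hot).sum()

# Inner products with the gradient (first-order loss increase).
gain_scaled = grad @ scaled      # ||grad||_2^2 / ||grad||_1
gain_one_hot = grad @ one_hot    # ||grad||_inf

print(l1_scaled, l1_one_hot)
print(gain_scaled, gain_one_hot)
```

On this example the one-hot direction gives the larger inner product, as Hölder's inequality predicts.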


Edit: See here for generalization to all $p\in[1, +\infty]$.

Hi @ego-thales, thank you for this comment. Without deciding on the correctness yet, how did you notice this issue? Have you already checked which version the literature on FGM is using?

Thanks for your answer,

I stumbled upon this because, while reading the FGSM paper (the reference for the implementation), I thought about generalizing to $L^p$ norms. Then I saw that this repo implements $L^1$ and $L^2$ extensions specifically, so I checked out the code (since no source is cited for the maths used) and noticed this (apparently) suboptimal implementation.

Actually, now that I think about it, I don't see any reason why this attack is not generalized to any $L^p$ noise.

Let $p\in[1, +\infty]$ and $q$ such that $\frac{1}{p}+\frac{1}{q}=1$ (some abuse of notation will occur when $p=1$ or $p=+\infty$). With

$$\text{noise direction}:=\left(\frac{\vert\nabla\vert}{\Vert\nabla\Vert_q}\right)^{q/p}\text{sign}(\nabla),$$
one gets:

  • $\Vert\text{noise direction}\Vert_p=1$,
  • $\langle \nabla, \text{noise direction}\rangle=\Vert\nabla\Vert_q$ (I skip the quick computation, but mainly because $\frac{q}{p}+1=q$), which is the equality case of Hölder's inequality and as such, optimal.
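
For $1<p<\infty$ and $\nabla\neq 0$, the skipped computation can be spelled out as follows, writing $d$ for the noise direction (a shorthand not used above) and $\nabla_i$ for the components of the gradient:

$$\Vert d\Vert_p^p=\sum_i\left(\frac{\vert\nabla_i\vert}{\Vert\nabla\Vert_q}\right)^{q}=\frac{\Vert\nabla\Vert_q^q}{\Vert\nabla\Vert_q^q}=1,$$

$$\langle\nabla,d\rangle=\sum_i\vert\nabla_i\vert\,\frac{\vert\nabla_i\vert^{q/p}}{\Vert\nabla\Vert_q^{q/p}}=\frac{\sum_i\vert\nabla_i\vert^{q}}{\Vert\nabla\Vert_q^{q-1}}=\frac{\Vert\nabla\Vert_q^{q}}{\Vert\nabla\Vert_q^{q-1}}=\Vert\nabla\Vert_q,$$

where both lines use $\frac{q}{p}=q-1$.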

As such, it would be a nice addition to entirely generalize FGM to all $p\geq 1$.
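
To make the proposal concrete, here is a minimal NumPy sketch of the direction above. The function name `lp_steepest_direction` is hypothetical and this is not the ART implementation; the limit cases $p=1$ and $p=+\infty$ are handled explicitly, and ties in the $p=1$ case are broken by taking the first maximal coordinate, which is only one of several optimal choices.

```python
import numpy as np

def lp_steepest_direction(grad, p):
    """Unit-L^p direction maximizing <grad, direction> (hypothetical helper)."""
    grad = np.asarray(grad, dtype=float)
    abs_grad = np.abs(grad)
    if not abs_grad.any():
        # Zero gradient: no preferred direction.
        return np.zeros_like(grad)
    if p == 1:
        # q = inf: put the whole budget on one largest-magnitude coordinate.
        direction = np.zeros_like(grad)
        i = np.argmax(abs_grad)  # flat index, also valid for n-d arrays
        direction.flat[i] = np.sign(grad.flat[i])
        return direction
    if np.isinf(p):
        # q = 1: the usual FGSM sign direction.
        return np.sign(grad)
    q = p / (p - 1)  # conjugate exponent, 1/p + 1/q = 1
    q_norm = np.sum(abs_grad ** q) ** (1 / q)
    return (abs_grad / q_norm) ** (q / p) * np.sign(grad)

# Made-up gradient, purely for illustration.
g = np.array([0.5, -2.0, 1.0, 0.25])
for p in (1, 1.5, 2, 3, np.inf):
    d = lp_steepest_direction(g, p)
    print(p, np.linalg.norm(d, ord=p), g @ d)
```

For each $p$, the printed norm should be 1 and the inner product should match $\Vert\nabla\Vert_q$. Note that for $p$ close to 1 the exponent $q$ becomes large, so a real implementation would likely need extra care against overflow and underflow in the powers.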

Hi @ego-thales Thank you very much for the explanation and pull request! Let me take a closer look at the required changes. Related to this issue in FGSM, what do you think about the perturbation per iteration and overall perturbation calculation for p=1 in the Projected Gradient Descent attacks in art.attacks.evasion.projected_gradient_descent.*?

I'm not entirely sure, but after a quick glance it looks to me like PGD was implemented as a subclass of FGSM and inherits its loss from it.