l0 attack: Potential Bug
ankur6ue opened this issue · 1 comments
Apologies if I'm missing something obvious here, but in the l0 attack, shouldn't the valid[e] = 0 be after the breaks?
As I understand it, we set a pixel to "don't change" only if (1) totalchange is below the threshold and (2) we haven't already fixed too many pixels. Wouldn't setting valid[e] = 0 before the breaks fix the pixel regardless of those checks?
    did = 0
    for e in np.argsort(totalchange):
        if np.all(valid[e]):
            did += 1
            valid[e] = 0
            if totalchange[e] > .01:
                # if this pixel changed a lot, skip
                break
            if did >= .3*equal_count**.5:
                # if we changed too many pixels, skip
                break
Also, you haven't implemented the random starts in the L2 implementation in this repo, correct? The paper says:
We randomly sample points uniformly from the ball of radius
r, where r is the closest adversarial example found so far.
Does this r depend on the source/target label for a given image? That is, do we choose r based on the closest adversarial example for the target class under consideration (r would vary significantly with the target class, since adversarial examples are harder to find for some classes than for others)? What initial value did you pick?
-
We want to fix at least one pixel on each iteration of the loop. The abort criteria are there so that we don't fix more than one unless it looks likely we haven't made a substantial change. If the assignment came after the breaks, we might enter an infinite loop where we never decrease the size of the valid set.
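To make the ordering argument concrete, here is a minimal, self-contained sketch of the freezing step on toy data. The function name `freeze_pixels`, the `change_threshold` parameter, and the toy arrays are assumptions made for illustration; only the loop body mirrors the snippet quoted in the question.

```python
import numpy as np

def freeze_pixels(totalchange, valid, equal_count, change_threshold=0.01):
    """Freeze low-impact pixels in place and return how many were frozen.

    Hypothetical wrapper around the loop quoted in the question.
    valid has one row per pixel and one column per channel.
    """
    did = 0
    for e in np.argsort(totalchange):
        if np.all(valid[e]):
            did += 1
            valid[e] = 0  # freeze before the checks, so the valid set always shrinks
            if totalchange[e] > change_threshold:
                break  # this pixel mattered a lot, stop after freezing it
            if did >= 0.3 * equal_count ** 0.5:
                break  # enough pixels frozen for this round
    return did

# Case 1: every pixel changed a lot. Exactly one pixel is still frozen.
valid_large = np.ones((4, 3))
frozen_large = freeze_pixels(np.array([0.5, 0.2, 0.9, 0.4]), valid_large, equal_count=4)

# Case 2: several pixels barely changed. More than one is frozen.
valid_small = np.ones((4, 3))
frozen_small = freeze_pixels(np.array([0.001, 0.002, 0.5, 0.003]), valid_small, equal_count=16)

print(frozen_large, frozen_small)
```

In the first case the smallest change (index 1) already exceeds the threshold, yet that pixel is frozen before the break fires. If the assignment came after the breaks, nothing would be frozen in that round, which is the infinite-loop scenario described above.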
-
Correct, this repo doesn't implement the random restarts. To do it, first solve with no random start and find the nearest adversarial example. Then, on subsequent iterations, pick a random perturbation with magnitude less than the best solution found so far (for this exact adversarial example attempt, which, yes, depends on the source/target class).
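Since the repo does not include this step, the following is only a sketch of how a random start could be drawn, assuming "uniformly from the ball of radius r" refers to the L2 ball. The helper `sample_in_l2_ball`, the image shape, and the value of `best_l2` are all hypothetical and not taken from the repo.

```python
import numpy as np

def sample_in_l2_ball(shape, radius, rng):
    """Draw one perturbation uniformly from the L2 ball of the given radius."""
    dim = int(np.prod(shape))
    # A normalized Gaussian vector gives a uniformly random direction.
    direction = rng.standard_normal(dim)
    direction /= np.linalg.norm(direction)
    # Scaling by u**(1/dim) makes the point uniform over the ball's volume.
    scale = radius * rng.uniform() ** (1.0 / dim)
    return (scale * direction).reshape(shape)

rng = np.random.default_rng(0)
best_l2 = 2.5  # hypothetical: L2 distance of the best adversarial example found so far
delta = sample_in_l2_ball((28, 28, 1), best_l2, rng)
print(delta.shape, np.linalg.norm(delta) <= best_l2 + 1e-9)
```

The perturbation would then be added to the original image as the starting point for the next attack run, with `best_l2` tracked separately for each image and source/target pair, as described above.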