Trusted-AI/adversarial-robustness-toolbox

Boundary attack in targeted=True setting

When I run the boundary attack with targeted=True and provide the target label like this:
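import numpy as np
from art.attacks.evasion import BoundaryAttack
# Assumption: the Keras image helpers come from TensorFlow.
from tensorflow.keras.preprocessing.image import img_to_array, load_img

# kclassifier is the ART classifier wrapping the model (its definition is not shown here).
boundary = BoundaryAttack(estimator=kclassifier,
                          batch_size=64,
                          targeted=True,
                          delta=0.01,
                          epsilon=0.01,
                          step_adapt=0.667,
                          max_iter=5000,
                          num_trial=25,
                          sample_size=20,
                          init_size=100,
                          min_epsilon=0.0,
                          verbose=True)
img = load_img("acorn.JPEG", target_size=(224, 224), interpolation='lanczos')
img = img_to_array(img)
img = np.expand_dims(img, axis=0)  # add the batch dimension
ct = np.array([306])  # target label
adv_img = boundary.generate(img, ct)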

It does not iterate as it does with targeted=False. This is the only verbose output I got:

Boundary attack: 100%|██████████| 1/1 [00:09<00:00, 9.96s/it]

Then the attack finishes.

Am I doing something wrong?

Hi @aliotopal, apologies for my delayed response. What is the true label of the attacked image, and what does the model predict for that image?
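
One way to read off the model's prediction is to take the argmax of the class scores. This is only a minimal sketch: it assumes `kclassifier.predict(img)` returns an array of shape (n_inputs, n_classes), and it uses a made-up score array in place of that call so it runs on its own. The helper name `top_prediction` is hypothetical.

```python
import numpy as np

def top_prediction(scores):
    """Return the highest-scoring class index and its score for each input row."""
    scores = np.asarray(scores)
    labels = np.argmax(scores, axis=1)
    top = scores[np.arange(scores.shape[0]), labels]
    return labels, top

# Stand-in for kclassifier.predict(img): one input, four hypothetical classes.
example_scores = np.array([[0.05, 0.80, 0.10, 0.05]])
labels, top = top_prediction(example_scores)
print(labels, top)
```

With the real classifier, `example_scores` would be replaced by the output of `kclassifier.predict(img)`, and the resulting label compared with the true label and with the target label 306.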

Hi, the true label of the image is acorn, and the model also classifies it as acorn.

When the attack is untargeted, we can see that it runs for up to 5000 iterations, but when it is targeted it stops in the first iteration, as shown in the verbose output: Boundary attack: 100%|██████████| 1/1 [00:09<00:00, 9.96s/it]

and the generated image is not adversarial; the model we attacked still assigns it the true label.
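
A quick check like the following can make that observation concrete. It is a rough sketch, not part of ART: the helper `diagnose_targeted_result` is hypothetical, the predicted label is assumed to have been computed separately, and a zero array stands in for the real image so the snippet is self-contained.

```python
import numpy as np

def diagnose_targeted_result(x, x_adv, adv_label, target_label):
    """Summarise whether a targeted attack changed the input and reached the target."""
    x = np.asarray(x, dtype=np.float64)
    x_adv = np.asarray(x_adv, dtype=np.float64)
    return {
        # True when the attack handed back the input without any perturbation.
        "unchanged": bool(np.array_equal(x, x_adv)),
        # L2 norm of the perturbation over all pixels.
        "l2_distance": float(np.linalg.norm((x_adv - x).ravel())),
        # True when the model's label for x_adv equals the requested target.
        "hit_target": int(adv_label) == int(target_label),
    }

# Stand-in data: a tiny "image" returned unchanged, still predicted as a
# hypothetical original class 0 instead of the target 306.
x = np.zeros((1, 2, 2, 3))
report = diagnose_targeted_result(x, x.copy(), adv_label=0, target_label=306)
print(report)
```

If `unchanged` is true for the real `img` and `adv_img`, the attack returned its input, which would be consistent with it never finding a starting point classified as the target class. That is a guess and is not confirmed in this thread. ART's `BoundaryAttack.generate` accepts an `x_adv_init` keyword argument for supplying initial adversarial samples, so passing an image that the model already classifies as 306 may be worth trying.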