locuslab/robust_union

Code for computing the overall adversarial accuracy of Lp attacks?

joellliu opened this issue · 8 comments

Hi, I am wondering whether this repo contains code for computing the overall adversarial accuracy over the Lp attacks, as shown in Tables 1 and 2 of the paper. As far as I understand, test.py computes the numbers for each individual attack; is that correct? Thanks!

I added a file "compile_acc.py" in the root directory. This takes the min by reading all saved .npy files for a given attack type.

You might need to adjust the folder and norm specifications to make it work. Hope it helps.
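
For reference, here is a minimal sketch of that kind of aggregation (not the actual compile_acc.py). It assumes each attack's evaluation saved a per-example 0/1 correctness array as a .npy file in one folder, takes the per-example minimum across the arrays, and averages to get the accuracy against the union of attacks. The folder layout and file contents are assumptions; adjust them to match what test.py actually writes.

```python
# Hedged sketch of a min-over-attacks aggregation, NOT the repo's compile_acc.py.
# Assumption: each .npy file in results/ holds a 0/1 correctness array over the
# same set of (first 1000) test examples, one file per attack.
import glob
import numpy as np

def union_accuracy(results_dir="results"):
    files = sorted(glob.glob(f"{results_dir}/*.npy"))
    if not files:
        raise FileNotFoundError(f"no .npy files found in {results_dir}")
    # One row per attack: each row is the per-example correctness for that attack.
    correct = np.stack([np.load(f).astype(float) for f in files])
    # An example is robust to the union of attacks only if it is robust to every
    # individual attack, i.e. take the per-example minimum across attacks.
    union_correct = correct.min(axis=0)
    per_attack = {f: row.mean() for f, row in zip(files, correct)}
    return union_correct.mean(), per_attack

if __name__ == "__main__":
    overall, per_attack = union_accuracy()
    for name, acc in per_attack.items():
        print(f"{name}: accuracy {acc:.3f}")
    print(f"union of attacks (per-example min): {overall:.3f}")
```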

Thank you Pratyush! This is very helpful!

@pratyushmaini Hi Pratyush, I am testing the pretrained CIFAR10 models, and I noticed that the clean accuracy I get is slightly different from the paper, which makes me worry that I did something wrong. For example, for MSD I got 82.1% vs. 81.1% in the paper, and for MAX 81.7% vs. 81.0%. I got these numbers by running python test.py -attack 1 -model 3.

Do you have any idea what might be going on here? Are the pretrained models the ones used in the paper? Any suggestions are welcome. Thanks for your help!

Hey, sorry for the delay in getting back. The reason I can think of is that the clean accuracy reported in the table (refer to the table in the README) was calculated separately on the entire test set, whereas the function computes it only for the first 1000 examples.
If you look at the function, it has a parameter "clean":

def test_pgd(model_name, clean = False):

Remember that all adversarial accuracies are tested only on the first 1000 examples, since the attacks take a lot of time (as in previous work). The clean accuracy is measured on the entire test set.

This could have been done in a 'nicer' way by adding another parameter to the script, but since the clean accuracy for the whole dataset isn't used anywhere else, I probably just changed it manually at that point.
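
As an illustration, a hedged sketch of that "nicer" approach: expose test_pgd's existing clean parameter as a command-line flag instead of editing the code by hand. The --clean flag and the import path are assumptions, not the repo's actual test.py; the -model flag mirrors the one used in the thread.

```python
# Hedged sketch, not the repo's actual test.py: wire test_pgd's `clean`
# parameter to a command-line flag. Flag names other than -model are
# hypothetical; the import path is an assumption.
import argparse

from test import test_pgd  # the function quoted above; import path assumed

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("-model", type=int, default=3,
                        help="which pretrained model to evaluate (as in the thread)")
    parser.add_argument("--clean", action="store_true",
                        help="report clean accuracy on the entire test set instead of "
                             "adversarial accuracy on the first 1000 examples")
    args = parser.parse_args()
    test_pgd(model_name=args.model, clean=args.clean)

if __name__ == "__main__":
    main()
```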

Got it. Thank you so much!

Hi Pratyush @pratyushmaini, thanks for your previous help! I'm writing a paper and using MSD as a baseline. I used your evaluation settings, and one of the reviewers asked me to add a new attack to the evaluation. To do that, I need to run the evaluations myself to get the ".npy" files for recompiling the results. However, I noticed that the results I got for the attacks you already had are slightly different from the numbers reported in the paper. I think this may be due to some randomness in the evaluation, such as the random starts. To best reproduce your results, I wonder if you could share the ".npy" files from your evaluation. I can totally understand if you cannot, but I just wanted to ask. Thank you so much!

I am sorry, but the only way for me to get the .npy files would also be to rerun the attacks on the model checkpoints provided. I understand that the numbers might vary a bit because the foolbox attack library has seen multiple updates since then; you may want to use an old version of foolbox to reproduce the results exactly. The final numbers will always vary slightly due to the randomness in the initialization of the attacks. Hope this helps :)
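
For anyone else trying to tighten reproducibility, a small hedged sketch of the usual controls: pin the attack library version (the exact foolbox version used for the paper is not stated here) and fix the random seeds before launching the attacks, so the random starts are at least repeatable across runs on the same setup. Seeding will not remove differences caused by library-version changes.

```python
# Hedged reproducibility sketch: fix all common random seeds before running the
# attack evaluation (e.g. before invoking test.py's functions in this process).
import random
import numpy as np
import torch

def fix_seeds(seed=0):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Deterministic cuDNN kernels trade some speed for repeatability.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

fix_seeds(0)
# ...then run the attack evaluation in this same process.
```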

I see. Thanks a lot!