OOD-Attacks

Attacks using out-of-distribution adversarial examples.

This repository contains the code for the paper 'Analyzing the Robustness of Open-World Machine Learning', accepted at AISec 2019 (co-located with ACM CCS). All results presented therein use CIFAR-10 as the in-distribution dataset. The out-of-distribution datasets are available for download here.

Dependencies: PyTorch-1.2, torchvision, numpy

To train a Wide ResNet 28-10 (WRN-28-10) on the CIFAR-10 dataset, run

python train_robust.py --dataset_in=CIFAR-10 --model=wrn --depth=28 --width=10
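
For orientation, the following is a minimal sketch of what standard CIFAR-10 training looks like in PyTorch. It is not the repository's train_robust.py; the stand-in model (a torchvision ResNet-18 in place of the WRN-28-10) and all hyperparameters shown are assumptions.

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

# Standard CIFAR-10 augmentation and loading.
transform = T.Compose([T.RandomCrop(32, padding=4),
                       T.RandomHorizontalFlip(),
                       T.ToTensor()])
train_set = torchvision.datasets.CIFAR10(root='./data', train=True,
                                         download=True, transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Stand-in model; the repository trains a WRN-28-10 instead (assumption).
model = torchvision.models.resnet18(num_classes=10).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(200):  # epoch count is a placeholder
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()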

To carry out an in-distribution targeted evasion attack against this trained model, run

python test_robust.py --dataset_in=CIFAR-10 --model=wrn --depth=28 --width=10 --targeted
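
As a rough illustration of what a targeted L-infinity evasion attack computes, here is a minimal PGD sketch. The function name, epsilon, step size, and iteration count are placeholders, not the repository's defaults.

import torch
import torch.nn.functional as F

def targeted_pgd(model, x, y_target, eps=8/255, step=2/255, iters=10):
    # Targeted L-inf PGD: minimize the loss toward y_target within an eps-ball.
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_target)
        grad, = torch.autograd.grad(loss, x_adv)
        # Targeted: step *against* the gradient to raise target-class confidence.
        x_adv = x_adv.detach() - step * grad.sign()
        x_adv = x.detach() + (x_adv - x.detach()).clamp(-eps, eps)  # project into eps-ball
        x_adv = x_adv.clamp(0, 1)  # keep valid pixel range
    return x_adv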

An adversarially trained model can be generated by running

python train_robust.py --dataset_in=CIFAR-10 --model=wrn --depth=28 --width=10 --is_adv --attack=PGD_linf --epsilon=0.3 --attack_iter=10 --eps_step=0.04
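
Conceptually, each adversarial-training step runs an untargeted PGD attack (the inner maximization) and then updates the model on the resulting examples, in the style of Madry et al. The sketch below mirrors the epsilon, step-size, and iteration flags above, but it is not the repository's implementation.

import torch
import torch.nn.functional as F

def pgd_untargeted(model, x, y, eps=0.3, step=0.04, iters=10):
    # Untargeted L-inf PGD: ascend the loss on the true label y.
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + step * grad.sign()
        x_adv = x.detach() + (x_adv - x.detach()).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv

def adv_train_step(model, opt, x, y):
    # Inner maximization: craft adversarial examples against the current model.
    x_adv = pgd_untargeted(model, x, y)
    # Outer minimization: train on the adversarial batch.
    opt.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    opt.step()
    return loss.item()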

OOD Attacks

To carry out a targeted evasion attack with out-of-distribution inputs (here, PASCAL VOC 2012 images) against a trained model, run

python test_robust.py --dataset_in=CIFAR-10 --model=wrn --depth=28 --width=10 --targeted --is_test_ood --dataset_out=voc12
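
The difference from the in-distribution attack is only the starting point: the adversary perturbs out-of-distribution images so the classifier assigns them a chosen in-distribution label. A minimal sketch, reusing the hypothetical targeted_pgd and model from the sketches above; the random batch is a stand-in for resized VOC12 crops.

import torch

# Stand-in batch of OOD images scaled to [0, 1]; in practice these would be
# resized crops from the OOD dataset (e.g., VOC12), not random noise.
x_ood = torch.rand(8, 3, 32, 32).to(device)
y_target = torch.full((8,), 3, dtype=torch.long).to(device)  # force class 3 ('cat')
x_adv = targeted_pgd(model, x_ood, y_target)
preds = model(x_adv).argmax(dim=1)
print('attack success rate:', (preds == y_target).float().mean().item())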

Evaluating and attacking OOD detectors

To evaluate the false positive rate (FPR) of the ODIN OOD detector on both benign and adversarial OOD examples, run

python test_robust.py --dataset_in=CIFAR-10 --model=wrn --depth=28 --width=10 --targeted --is_test_ood --is_eval_ood_detector --ood_detector=odin --dataset_out=voc12 --temp=1
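
For reference, ODIN (Liang et al., 2018) scores inputs by the maximum softmax probability after temperature scaling and a small confidence-boosting input perturbation; an input is flagged as OOD when its score falls below a threshold chosen on in-distribution data. Below is a minimal sketch: the temperature and perturbation size shown are the paper's typical values, not this repository's defaults, and with temp=1 and eps=0 it reduces to the plain max-softmax baseline.

import torch
import torch.nn.functional as F

def odin_score(model, x, temp=1000.0, eps=0.0014):
    # Input preprocessing: nudge x toward higher confidence on its predicted class.
    x = x.clone().detach().requires_grad_(True)
    logits = model(x) / temp
    loss = F.cross_entropy(logits, logits.argmax(dim=1))
    grad, = torch.autograd.grad(loss, x)
    x_pert = (x - eps * grad.sign()).detach()
    # Score: temperature-scaled maximum softmax probability.
    with torch.no_grad():
        probs = F.softmax(model(x_pert) / temp, dim=1)
    return probs.max(dim=1).values

def fpr_at_threshold(scores_ood, threshold):
    # FPR: fraction of OOD inputs scored above the threshold, i.e. wrongly
    # accepted as in-distribution.
    return (scores_ood > threshold).float().mean().item()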