FGSM (Fast Gradient Sign Method)

Overview

A simple PyTorch implementation of FGSM and I-FGSM.
(FGSM: Explaining and Harnessing Adversarial Examples, Goodfellow et al.)
(I-FGSM: Adversarial Examples in the Physical World, Kurakin et al.)

FGSM
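
FGSM perturbs an input by a single step of size epsilon in the direction of the sign of the loss gradient with respect to that input: x_adv = x + epsilon * sign(grad_x J(x, y)). The following is a minimal, dependency-free sketch of that step, not the code in this repository. It assumes a hypothetical logistic-regression model so that the input gradient can be written in closed form, where the repository's PyTorch code would rely on autograd. The names `predict`, `loss_grad_wrt_input`, and `fgsm`, the toy weights, and the clipping of inputs to [0, 1] are all assumptions made for illustration.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model (assumed)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def loss_grad_wrt_input(w, b, x, y):
    """Gradient of binary cross-entropy with respect to the input x.

    For logistic regression this has the closed form (p - y) * w.
    """
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]


def fgsm(w, b, x, y, epsilon):
    """Single FGSM step: move each input component by epsilon along the
    sign of the loss gradient, then clip to the assumed valid range [0, 1]."""
    grad = loss_grad_wrt_input(w, b, x, y)
    return [min(1.0, max(0.0, xi + epsilon * sign(g))) for xi, g in zip(x, grad)]


# Hypothetical weights and a single input whose true label is 1.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.6, 0.2, 0.5], 1

x_adv = fgsm(w, b, x, y, epsilon=0.03)
print("clean confidence:      ", predict(w, b, x))
print("adversarial confidence:", predict(w, b, x_adv))
```

Because every component moves by exactly epsilon (before clipping), the perturbation is bounded by epsilon in the max norm, and the model's confidence in the true label drops.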

I-FGSM
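
I-FGSM applies the FGSM step repeatedly with a smaller step size and, after each step, clips the result so that it stays within epsilon of the original input. The sketch below illustrates that loop on the same kind of hypothetical logistic-regression model, again with a closed-form gradient instead of autograd. The step size `alpha` and its value are assumptions: this README only documents `--iteration` and `--epsilon`, so how the repository chooses the per-step size is not stated here.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model (assumed)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def loss_grad_wrt_input(w, b, x, y):
    """Closed-form gradient of binary cross-entropy with respect to x."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]


def i_fgsm(w, b, x, y, epsilon, alpha, iterations):
    """Iterative FGSM: repeated sign-gradient steps of size alpha, each
    followed by clipping to the epsilon-ball around x and to [0, 1]."""
    x_adv = list(x)
    for _ in range(iterations):
        grad = loss_grad_wrt_input(w, b, x_adv, y)
        stepped = [xi + alpha * sign(g) for xi, g in zip(x_adv, grad)]
        x_adv = [
            # Inner min/max keeps the point within epsilon of the original;
            # outer min/max keeps it in the assumed valid input range.
            min(1.0, max(0.0, min(x0 + epsilon, max(x0 - epsilon, s))))
            for x0, s in zip(x, stepped)
        ]
    return x_adv


# Hypothetical weights and input; alpha=0.01 is an illustrative choice.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.6, 0.2, 0.5], 1

x_adv = i_fgsm(w, b, x, y, epsilon=0.03, alpha=0.01, iterations=5)
print("max perturbation:", max(abs(a - o) for a, o in zip(x_adv, x)))
```

Five steps of 0.01 would move each component by 0.05 without the clipping, so the projection back onto the epsilon-ball is what keeps the total perturbation at 0.03.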

Dependencies

Python 3.6.4
PyTorch 0.3.1.post2
visdom (optional)
tensorboardX (optional)
tensorflow (optional)

Usage

Train a simple MNIST classifier:

python main.py --mode train --env_name [NAME]

Load a trained classifier and generate adversarial examples; the results are written to the output directory:

python main.py --mode generate --iteration 1 --epsilon 0.03 --env_name [NAME] --load_ckpt best_acc.tar

For a targeted attack, specify the target class number with the --target argument (the default, -1, selects a non-targeted attack):

python main.py --mode generate --iteration 1 --epsilon 0.03 --target 3 --env_name [NAME] --load_ckpt best_acc.tar
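
The difference between the two modes is the direction of the step: a non-targeted attack increases the loss for the true label, while a targeted attack decreases the loss for the chosen target class. A minimal sketch of that switch follows. It assumes a hypothetical three-class linear softmax classifier with a closed-form input gradient, and reuses the convention that `target=-1` means non-targeted. The function names and weights are illustrative and not taken from `main.py`.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def probs(W, x):
    """Class probabilities of a toy linear softmax classifier (assumed)."""
    logits = [sum(wj * xj for wj, xj in zip(row, x)) for row in W]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_wrt_input(W, x, label):
    """Gradient of cross-entropy for `label` with respect to the input x."""
    p = probs(W, x)
    # d loss / d logit_k = p_k - 1[k == label]
    dz = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    return [sum(dz[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]


def fgsm_step(W, x, true_label, epsilon, target=-1):
    """One FGSM step; target=-1 selects the non-targeted attack."""
    if target == -1:
        grad, direction = grad_wrt_input(W, x, true_label), 1   # raise true-label loss
    else:
        grad, direction = grad_wrt_input(W, x, target), -1      # lower target loss
    return [
        min(1.0, max(0.0, xi + direction * epsilon * sign(g)))
        for xi, g in zip(x, grad)
    ]


# Hypothetical 3-class weights and a 2-feature input with true label 0.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
x, true_label = [0.6, 0.4], 0

x_untargeted = fgsm_step(W, x, true_label, epsilon=0.03)
x_targeted = fgsm_step(W, x, true_label, epsilon=0.03, target=1)
print("true-class prob:  ", probs(W, x)[0], "->", probs(W, x_untargeted)[0])
print("target-class prob:", probs(W, x)[1], "->", probs(W, x_targeted)[1])
```

With a perturbation this small the predicted class of the toy model need not change; the sketch only shows that the true-class probability falls in the non-targeted case and the target-class probability rises in the targeted case.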

Results

Non-targeted attack

From left to right: legitimate examples, perturbed examples, and markers indicating which perturbed images changed the classifier's prediction.

Non-targeted attack, iteration: 1, epsilon: 0.03
Non-targeted attack, iteration: 5, epsilon: 0.03
Non-targeted attack, iteration: 1, epsilon: 0.5

Targeted attack

From left to right: legitimate examples, perturbed examples, and markers indicating which perturbed images the classifier predicted as the target class.

Targeted attack (target class 9), iteration: 1, epsilon: 0.03
Targeted attack (target class 9), iteration: 5, epsilon: 0.03
Targeted attack (target class 9), iteration: 1, epsilon: 0.5

References

Explaining and Harnessing Adversarial Examples, Goodfellow et al.
Adversarial Examples in the Physical World, Kurakin et al.