This project implements an adversarial attack on neural networks by optimizing on images to maximize the likelihood of false predictions.
We follow Szegedy et al.’s Intriguing properties of neural networks [1], but optimize using gradient descent instead of L-BFGS to study the incremental properties of attacks. We also perform analysis on the transferability of these attacks and the ease of fooling across classes.