Adversarial Augmentation

White-box attack investigation

Use transfer learning to adapt a pretrained model (such as ResNet) to a given dataset (such as CIFAR-10), changing the original network only slightly. (https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html)

In a white-box scenario, where the attacker has access to the model and its gradients, use the Fast Gradient Sign Method (FGSM) to generate adversarial examples. (https://pytorch.org/tutorials/beginner/fgsm_tutorial.html)
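The FGSM update rule is x_adv = x + epsilon * sign(gradient of the loss with respect to x). The sketch below illustrates it on a toy logistic-regression classifier, where the input gradient has a closed form, so no deep-learning library is needed. The model, the function names and the numbers are assumptions chosen for illustration; with the ResNet above, the gradient would come from autograd as in the linked tutorial.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm_logistic(x, y, w, b, epsilon, lo=0.0, hi=1.0):
    """One FGSM step against a logistic-regression classifier (toy example).

    x: input features in [lo, hi]; y: true label (0 or 1);
    w, b: model weights and bias; epsilon: maximum per-feature perturbation.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # Gradient of the binary cross-entropy loss w.r.t. the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    # Move each feature by epsilon in the direction that increases the loss,
    # then clip back to the valid input range.
    return [min(hi, max(lo, xi + epsilon * sign(g))) for xi, g in zip(x, grad)]


w, b = [2.0, -3.0], 0.0
x, y = [0.8, 0.2], 1
x_adv = fgsm_logistic(x, y, w, b, epsilon=0.25)
```

In this toy setting the clean input is classified as class 1, while the perturbed input falls on the other side of the decision boundary even though no feature moved by more than epsilon.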

The trained model reaches an accuracy of approximately 93% and is used to compute the measures described below.

The FID (Fréchet Inception Distance) is used to measure how different augmented images appear to our network compared with the originals. This is useful for evaluating the effect of augmentations on the model's predictive capability.
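For reference, assuming the standard definition of the metric: the features of the original and of the augmented images are each modelled as a Gaussian, with means \mu_r, \mu_a and covariances \Sigma_r, \Sigma_a (the subscripts r and a are notation introduced here), and the distance between the two is

```latex
\mathrm{FID} = \lVert \mu_r - \mu_a \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_a - 2 \left( \Sigma_r \Sigma_a \right)^{1/2} \right)
```

A value of 0 means the two sets have identical feature statistics, and larger values mean they differ more. Which network supplies the features here (Inception-v3, as in the standard metric, or the fine-tuned classifier) is not stated above.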

We also measure the calibration of our network, that is, whether its output correctly predicts the class-membership probability of a sample. When the network assigns a low probability to its prediction, it should make more mistakes; if it does, the output probability can be used to judge how certain a prediction is.
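One common way to summarise calibration in a single number is the expected calibration error (ECE). The sketch below is an assumption about how such a measurement could be made, not necessarily the procedure used here: predictions are grouped into equal-width confidence bins, and the gap between mean confidence and accuracy in each bin is averaged, weighted by bin size. The function name and the default of 10 bins are illustrative.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean of |accuracy - confidence| over equal-width confidence bins.

    confidences: predicted probability of the chosen class, one per sample.
    correct: booleans, True where the prediction matched the label.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece
```

For a well-calibrated network, samples predicted with confidence around 0.9 are correct about 90% of the time and the ECE is close to 0; a network that reports 0.9 but is right only half the time yields an ECE of about 0.4.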