
AdversarialEvasion

CSEC-720: Deep Learning Security Project II

Setup

conda create --name AdversarialEvasion python=3.9
conda activate AdversarialEvasion
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.2 -c pytorch
conda install -c conda-forge tqdm==4.36.1 matplotlib==2.1.0 pandas
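
After installing, a quick sanity check (run inside the activated environment) confirms the pinned versions and GPU visibility:

import torch
import torchvision

print(torch.__version__)          # expect 1.7.1
print(torchvision.__version__)    # expect 0.8.2
print(torch.cuda.is_available())  # True if the CUDA 10.2 build sees a GPU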

Train and Evaluate

Use main.py to train, harden, and evaluate classifiers.

Usage

Basic

python main.py

To list all available options

python main.py --help

Example Usage With Flags

python main.py --attack=FGSM --pretrained=./models/False/False/None/3.pth --use_only_first_model

Options

Experimental parameters

  • --attack: Adversarial training technique, e.g., FGSM, IGSM, or PGD (a minimal FGSM sketch follows this list). To skip adversarial training entirely, omit this flag. Optional.
  • --pretrained: .pth file of a pretrained model. If given, will fine-tune this model, presumably on the adversarial examples. A good choice is ./models/False/False/None/3.pth, which is the highest-performing model with no adversarial training. Optional.
  • --use_only_first_model: Flag; if supplied, only the very first model is used to generate samples for training, i.e., all adversarial examples are created via a whitebox attack on the first model, before any updates have been performed within the training loop.
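
For reference, FGSM perturbs an input by a single step of size epsilon along the sign of the loss gradient. The sketch below is a generic illustration assuming inputs scaled to [0, 1], not the repository's exact implementation:

import torch
import torch.nn.functional as F

def fgsm(model, images, labels, epsilon):
    # One-step FGSM: move each pixel by +/- epsilon along the loss gradient.
    # Generic sketch; assumes inputs are scaled to [0, 1].
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0, 1).detach()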

Configuration parameters

  • --batch_size: Batch size for training and for generating adversarial examples. Default 128.
  • --device: Hardware device, e.g., cpu, cuda:0, etc. Default cuda.
  • --epochs: Number of epochs to train for. If a pretrained model already exists that has been trained for this many epochs, training is skipped and that model is simply evaluated. Default 1.
  • --seed: Seed to control random number generation (a seeding sketch follows this list). Default 0.
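
As a rough illustration, --seed plausibly seeds the usual random number generators. A minimal sketch (the project's actual code may differ):

import random
import numpy as np
import torch

def set_seed(seed):
    # Seed Python, NumPy, and PyTorch (CPU and all GPUs) for reproducibility.
    # A sketch of what --seed plausibly controls, not the project's exact code.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)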

Adversarial Generation

python generate.py

A richer CLI can be added once the project comes together a little further.
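
To give a sense of what iterative generation involves, here is a minimal PGD sketch (an illustration of the attack family, not generate.py's actual contents; assumes inputs scaled to [0, 1]):

import torch
import torch.nn.functional as F

def pgd(model, images, labels, epsilon, alpha, steps):
    # Iterative PGD: repeated FGSM-style steps of size alpha, projected back
    # into an L-infinity ball of radius epsilon around the original inputs.
    # Illustrative sketch only; generate.py's logic may differ.
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()
        adv = images + (adv - images).clamp(-epsilon, epsilon)
        adv = adv.clamp(0, 1)
    return adv.detach()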

Output

-- models
   -- {PRETRAINED}
      -- {USE_ONLY_FIRST_MODEL}
         -- {ATTACK}
            -- 1.pth
            -- ...
            -- N.pth
            -- report.csv
            -- best_at.csv
            -- best_std.csv

where

  • PRETRAINED is one of (True, False) and indicates whether the classifier was first trained on nonadversarial data and then fine-tuned on adversarial and nonadversarial data.
  • USE_ONLY_FIRST_MODEL is one of (True, False) and indicates whether the classifier used to generate the adversarial examples is updated during training, or whether only the initial model (epoch 1, before the training loop is entered) is used.
  • ATTACK is one of (None, FGSM, IGSM, PGD) and refers to the attack used as part of the adversarial training.
  • i.pth is a saved PyTorch state dict (see the loading sketch after this list).
  • report.csv contains the performances on validation data.
  • best_at.csv contains the test performance of the best-performing model on adversarial validation data.
  • best_std.csv contains the test performance of the best-performing model on nonadversarial validation data.
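
For example, a saved checkpoint and its accompanying report can be inspected as follows (a sketch; report.csv's exact columns are not documented here):

import pandas as pd
import torch

# Each i.pth is a raw state dict, so its parameter tensors can be listed
# directly without instantiating the model.
state_dict = torch.load("./models/False/False/None/3.pth", map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))

# report.csv holds validation performance for the run.
report = pd.read_csv("./models/False/False/None/report.csv")
print(report.tail())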