CSML - HW2

├── data
│   ├── cl
│   │   ├── valid.h5 // this is clean validation data used to design the defense
│   │   └── test.h5  // this is clean test data used to evaluate the BadNet
│   └── bd
│       ├── bd_valid.h5 // this is sunglasses poisoned validation data
│       └── bd_test.h5  // this is sunglasses poisoned test data
├── models
│   ├── bd_net.h5
│   ├── bd_weights.h5
│   ├── bd_prime_2.h5
│   ├── bd_prime_4.h5
│   └── bd_prime_10.h5
├── architecture.py
├── eval.py // this is the evaluation script
└── hw2.ipynb // this is the main python notebook to generate good models

I. Dependencies

Python 3.6.9
Keras 2.3.1
Numpy 1.16.3
Matplotlib 2.2.2
H5py 2.9.0
TensorFlow-gpu 1.15.2

II. Data

Download the validation and test datasets from here and store them under the data/ directory.
The dataset contains images from the YouTube Aligned Face Dataset. We retrieve 1283 individuals and split the images into validation and test datasets.
bd_valid.h5 and bd_test.h5 contain validation and test images, respectively, with the sunglasses trigger that activates the backdoor in bd_net.h5.
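
Since this README does not name the datasets stored inside the .h5 files, it can help to list them before writing any loading code. The following is a minimal sketch, not part of the repository: describe and describe_file are hypothetical helper names, and the only h5py features assumed are that an open h5py.File behaves like a mapping and that its datasets expose shape and dtype.

```python
def describe(store):
    """Return {name: (shape, dtype)} for each top-level entry of a mapping-like store.

    Entries without a shape or dtype (for example HDF5 groups) are reported as None.
    """
    summary = {}
    for name, entry in store.items():
        shape = getattr(entry, "shape", None)
        dtype = getattr(entry, "dtype", None)
        summary[name] = (shape, str(dtype) if dtype is not None else None)
    return summary


def describe_file(path):
    """Open an HDF5 file read-only and summarize its top-level entries."""
    import h5py  # imported lazily so describe() stays usable without h5py installed

    with h5py.File(path, "r") as f:
        return describe(f)
```

For example, describe_file("data/cl/valid.h5") would return one entry per top-level dataset in the clean validation file, which shows the key names and array shapes to expect.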

III. Generate Good Models

Run hw2.ipynb with jupyter nbconvert --execute hw2.ipynb, or manually by opening it in your favourite Python notebook viewer/editor.
This generates (or overwrites) the good models inside the models/ directory.
It also evaluates and prints the accuracy and attack success rate (ASR) of these models; you can check the same numbers using the eval.py script as outlined below.

IV. Evaluating the Backdoored Model

The DNN architecture used to train the face recognition model is the state-of-the-art DeepID network.
To evaluate the backdoored model, execute eval.py by running: python3 eval.py <clean validation data file> <poisoned validation data file> <model file>.

E.g., python3 eval.py data/cl/valid.h5 data/bd/bd_valid.h5 models/bd_net.h5. This will output:

    Clean Classification accuracy: 98.64 %
    Attack Success Rate: 100 %
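
The two reported numbers can be illustrated with a small sketch. It is not the code in eval.py: it assumes that clean accuracy is the percentage of clean inputs whose predicted label matches the true label, and that the attack success rate is the percentage of poisoned inputs whose predicted label matches the label stored with the poisoned data (taken here to be the attacker's target). The function name percent_match and the toy labels are hypothetical.

```python
def percent_match(predicted, expected):
    """Percentage of positions where predicted and expected labels agree."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not predicted:
        raise ValueError("at least one label is required")
    matches = sum(1 for p, e in zip(predicted, expected) if p == e)
    return 100.0 * matches / len(predicted)


# Toy labels standing in for model predictions (assumption: integer class ids).
clean_pred = [3, 1, 4, 1, 5]
clean_true = [3, 1, 4, 1, 9]   # true identities of the clean images
bd_pred = [0, 0, 0, 7]
bd_target = [0, 0, 0, 0]       # labels stored with the poisoned images

clean_accuracy = percent_match(clean_pred, clean_true)
attack_success_rate = percent_match(bd_pred, bd_target)

print("Clean Classification accuracy:", clean_accuracy, "%")
print("Attack Success Rate:", attack_success_rate, "%")
```

Under these assumptions a successful defense keeps the first number high while driving the second one down.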

V. Important Notes

Use only the clean validation data (valid.h5) to design the pruning defense, and use the test data (test.h5 and bd_test.h5) to evaluate the models.
