About 90% of the code is copied from FeatureScatter and Madry's PGD adversarial training.
This repo introduces Feature Attack, which is stronger than the PGD and CW attacks against models trained with Feature Scatter and Adversarial Interpolation Training.
Models trained on CIFAR10: Feature Scatter (FS) and Adversarial Interpolation Training (Adv_inter)
sh fs_eval_feature_attack.sh
cd cifar10_challenge && python feature_attack_batch_tf.py
Defense | clean | FGSM | PGD20-2-8 | CW20-2-8 | FeatureAttack20-1-8-100(100 target images) | adv_test_images |
---|---|---|---|---|---|---|
Feature Scatter | 90.3 | 78.4 | 71.1 | 62.4 | 36.94 | |
Adv_inter | 90.5 | 78.1 | 74.4 | 69.5 | 37.64 | |
Madry | 87.25 | 45.87 | 46.37 | |||
Sensible adversarial learning | 91.51 | 74.32 | 62.04 | 59.91 | 43.76 | sensible_adv_x |
bilateral_AT (mosa_eps4) | 92.8 | | 71.0 (PGD100-2-8) | 67.9 (CW100-2-8) | 32.28 | bilater_adv_x |
GCE | 62.74 | 9.55(MNIST_PGD40-0.01-0.2) | 0 | |||
TRADES | 84.92 | | 55.4 | 53.89 | 52.94 (50-1-8-200) | TRADES_adv_x_float_0~1_npy |
For the CIFAR10 test set, saved adversarial images should pass the following checks:
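import torch
eps = 8. / 255.
nat_X = ALL_CLEAN_TEST_IMAGES  # placeholder: clean test images in default PyTorch order, values in [0, 1]
adv_X_uint8 = torch.load('ADV_TEST_IMAGES_PATH')  # placeholder path to the saved uint8 adversarial images
adv_X = adv_X_uint8.type(torch.FloatTensor) / 255.  # rescale to [0, 1]
assert adv_X.min() >= 0. and adv_X.max() <= 1.
abs_diff = torch.abs(adv_X - nat_X)
# Compare the largest perturbation: asserting on a multi-element tensor raises an error
assert abs_diff.max() <= eps + 0.0001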
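The checks above can be wrapped in a small reusable function. The following is a minimal sketch, not code from this repo. It uses NumPy instead of PyTorch so it runs without a GPU stack. The names `to_unit_float` and `check_adv_images` are hypothetical. It also assumes that float inputs (such as the TRADES `.npy` file listed in the table) are already scaled to [0, 1]. The demo data is random and only stands in for real CIFAR10 images.

```python
import numpy as np


def to_unit_float(images):
    """Return images as float32 in [0, 1]. uint8 input is divided by 255;
    float input is assumed (not verified here) to be in [0, 1] already."""
    arr = np.asarray(images)
    if arr.dtype == np.uint8:
        return arr.astype(np.float32) / 255.0
    return arr.astype(np.float32)


def check_adv_images(nat_x, adv_x, eps=8.0 / 255.0, tol=1e-4):
    """Assert that adv_x is a valid L-inf perturbation of nat_x.

    Returns the largest absolute pixel difference on the [0, 1] scale.
    """
    nat = to_unit_float(nat_x)
    adv = to_unit_float(adv_x)
    assert nat.shape == adv.shape, "clean and adversarial shapes differ"
    assert adv.min() >= 0.0 and adv.max() <= 1.0, "pixels outside [0, 1]"
    max_diff = float(np.abs(adv - nat).max())
    assert max_diff <= eps + tol, "perturbation exceeds eps"
    return max_diff


# Demo on random stand-in data (NOT real CIFAR10 images).
rng = np.random.default_rng(0)
nat = rng.integers(0, 256, size=(4, 3, 32, 32), dtype=np.uint8)
noise = rng.integers(-8, 9, size=nat.shape)  # integer noise in [-8, 8]
adv = np.clip(nat.astype(np.int16) + noise, 0, 255).astype(np.uint8)
max_diff = check_adv_images(nat, adv)
print("max |adv - nat| =", max_diff)
```

With real data, `nat` would hold the clean test images in the same order as the saved adversarial images, and `adv` would come from `np.load` or `torch.load(...).numpy()`, depending on the file format.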