Strongest attack against Feature Scatter and Adversarial Interpolation

Feature Attack

Important

About 90% of the code is copied from FeatureScatter and Madry's PGD adversarial training.

My work

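I created Feature Attack, which is stronger than the PGD and CW attacks against models trained with Feature Scatter and Adversarial Interpolation Training. In the results below, for example, the Feature Scatter model scores 36.94 under Feature Attack, versus 71.1 under PGD20 and 62.4 under CW20.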
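
As a rough illustration of what a feature-space attack can look like, here is a minimal PyTorch sketch. It is not necessarily the method implemented in this repository. It assumes a hypothetical `feature_extractor` module that maps images in [0, 1] to feature vectors, and it takes signed-gradient steps that pull the features of the perturbed image toward those of a target image while staying inside an L-infinity ball of radius `eps`. The names `feature_attack`, `feature_extractor`, `step_size`, and `steps` are illustrative, and the defaults (20 steps, step 1/255, eps 8/255) are only one reading of the `FeatureAttack20-1-8` label in the results table.

```python
import torch
import torch.nn as nn


def feature_attack(feature_extractor, x_nat, x_target,
                   eps=8 / 255, step_size=1 / 255, steps=20):
    """Hypothetical sketch: move the features of x_nat toward those of x_target."""
    feature_extractor.eval()
    with torch.no_grad():
        target_feat = feature_extractor(x_target)

    # Random start inside the eps-ball, clipped to the valid pixel range.
    x_adv = x_nat + torch.empty_like(x_nat).uniform_(-eps, eps)
    x_adv = x_adv.clamp(0.0, 1.0)

    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        # Squared L2 distance between adversarial and target features.
        loss = (feature_extractor(x_adv) - target_feat).pow(2).sum()
        grad, = torch.autograd.grad(loss, x_adv)
        # Descend on the distance, then project back into the eps-ball.
        x_adv = x_adv.detach() - step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x_nat - eps), x_nat + eps)
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy stand-in for a real network's feature layer (assumption for the demo).
    net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 16))
    x_nat = torch.rand(4, 3, 32, 32)
    x_target = torch.rand(4, 3, 32, 32)
    x_adv = feature_attack(net, x_nat, x_target)
    print((x_adv - x_nat).abs().max().item() <= 8 / 255 + 1e-6)
```

In a real evaluation, `feature_extractor` would be the defended network truncated at some intermediate layer, and the attack would be repeated over several target images per input (the table header mentions 100 target images).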

Reference Model

Models trained on CIFAR10: FS (Feature Scatter) and Adv_inter (Adversarial Interpolation).

Evaluate

Feature Scatter:
sh fs_eval_feature_attack.sh

Madry's:
cd cifar10_challenge && python feature_attack_batch_tf.py

Results

Defense clean FGSM PGD20-2-8 CW20-2-8 FeatureAttack20-1-8-100(100 target images) adv_test_images
Feature Scatter 90.3 78.4 71.1 62.4 36.94
Adv_inter 90.5 78.1 74.4 69.5 37.64
Madry 87.25 45.87 46.37
Sensible adversarial learning 91.51 74.32 62.04 59.91 43.76 sensible_adv_x
bilateral_AT (mosa_eps4) 92.8 71.0(pgd100-2-8) 67.9(CW100-2-8) 32.28 bilater_adv_x
GCE 62.74 9.55(MNIST_PGD40-0.01-0.2) 0
TRADES 84.92 55.4 53.89 52.94(50-1-8-200) TRADES_adv_x_float_0~1_npy

Introduction to the adversarial test images

For the CIFAR10 test set:

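import torch

eps = 8. / 255.  # L-infinity perturbation budget
nat_X = ALL_CLEAN_TEST_IMAGES  # clean test images in PyTorch's default order, scaled to [0, 1]
adv_X_uint8 = torch.load('ADV_TEST_IMAGES_PATH')  # adversarial images saved as uint8
adv_X = adv_X_uint8.type(torch.FloatTensor) / 255.  # rescale to [0, 1]
assert adv_X.min() >= 0. and adv_X.max() <= 1.
abs_diff = torch.abs(adv_X - nat_X)
# Compare the largest per-pixel difference; asserting on the whole tensor raises an error
assert abs_diff.max() <= eps + 0.0001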
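
The check above can also be wrapped in a reusable helper. The following is a minimal NumPy sketch, which may be convenient for sets saved as `.npy` arrays (the TRADES entry in the table is labeled as float 0~1 npy). The function name `check_adv_images`, the `tol` parameter, and the synthetic data are assumptions made for illustration; the helper accepts adversarial images either as uint8 in [0, 255] or as floats in [0, 1].

```python
import numpy as np


def check_adv_images(nat_x, adv_x, eps=8 / 255, tol=1e-4):
    """Return the largest per-pixel perturbation after validating range and budget.

    nat_x: clean images as floats in [0, 1].
    adv_x: adversarial images, either uint8 in [0, 255] or floats in [0, 1].
    """
    nat_x = np.asarray(nat_x, dtype=np.float32)
    adv_x = np.asarray(adv_x)
    if adv_x.dtype == np.uint8:
        adv_x = adv_x.astype(np.float32) / 255.0  # rescale to [0, 1]
    else:
        adv_x = adv_x.astype(np.float32)
    if nat_x.shape != adv_x.shape:
        raise ValueError("clean and adversarial arrays must have the same shape")
    if adv_x.min() < 0.0 or adv_x.max() > 1.0:
        raise ValueError("adversarial images fall outside [0, 1]")
    max_diff = float(np.abs(adv_x - nat_x).max())
    if max_diff > eps + tol:
        raise ValueError(f"perturbation {max_diff:.4f} exceeds eps {eps:.4f}")
    return max_diff


# Synthetic stand-in for the real data (assumption for the demo).
rng = np.random.default_rng(0)
nat = rng.integers(0, 256, size=(8, 3, 32, 32)).astype(np.float32) / 255.0
noise = rng.integers(-8, 9, size=nat.shape)
adv_uint8 = np.clip(np.rint(nat * 255.0) + noise, 0, 255).astype(np.uint8)
print(check_adv_images(nat, adv_uint8))
```

Raising `ValueError` instead of using bare `assert` keeps the check active even when Python runs with assertions disabled.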