lukemelas/EfficientNet-PyTorch

efficientnet-b8 and AdvProp

seefun opened this issue · 19 comments

With AdvProp, EfficientNet achieves higher scores on ImageNet. Would you update to the new checkpoints?
the paper: https://arxiv.org/pdf/1911.09665.pdf
https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet

Awesome, thanks for posting this. It's on the way.

@lukemelas maybe your current code in the "tf_to_pytorch" dir will work?

@lukemelas any update on that ?

Apologies for the delay on this (I had final exams this past week). Coming soon.

Hi @lukemelas , just wondering if you had the chance to work on that one.

Sorry this took forever. It should be in now :)

Let me know if you have any issues.
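
For anyone finding this later: per the repo README, the AdvProp checkpoints are loaded with the advprop flag. Note that they expect a different input preprocessing (see further down this thread):

from efficientnet_pytorch import EfficientNet

# AdvProp weights; remember to switch to the matching [-1, 1] preprocessing
model = EfficientNet.from_pretrained('efficientnet-b0', advprop=True)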

Closing this, but feel free to re-open it if you have any issues/questions.

Dear all,

I am using https://colab.research.google.com/drive/1Jw28xZ1NJq4Cja4jLe6tJ6_F5lCzElb4
Why is the efficientnet-b0 advprop=True result so bad?
Thank you.

Suryadi

Loaded pretrained weights for efficientnet-b0, advprop=False
-----
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca           (83.44%)
brown bear, bruin, Ursus arctos                                             (0.62%)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens         (0.60%)
ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus                 (0.44%)
Arctic fox, white fox, Alopex lagopus                                       (0.34%)

Loaded pretrained weights for efficientnet-b0, advprop=True
-----
wombat                                                                      (3.34%)
candle, taper, wax light                                                    (2.00%)
Angora, Angora rabbit                                                       (1.87%)
schipperke                                                                  (1.86%)
hog, pig, grunter, squealer, Sus scrofa                                     (1.60%)

Did you use the advprop image preprocessing or the usual preprocessing? See https://github.com/lukemelas/EfficientNet-PyTorch/blob/master/examples/imagenet/main.py#L211. That's the reason advprop is not enabled by default. Let me know if it still doesn't work and I can look into it.
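
For reference, the AdvProp checkpoints expect inputs scaled to [-1, 1] instead of the usual ImageNet mean/std normalization. A minimal sketch of the two pipelines, following the linked main.py:

from torchvision import transforms

# standard ImageNet preprocessing (advprop=False)
imagenet_normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                          std=[0.229, 0.224, 0.225])

# AdvProp preprocessing (advprop=True): scale [0, 1] tensors to [-1, 1]
advprop_normalize = transforms.Lambda(lambda img: img * 2.0 - 1.0)

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    advprop_normalize,  # swap in imagenet_normalize for the non-AdvProp weights
])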

Hi @lukemelas ,

Thanks for the repo. Do you know the reason for using a different normalization for AdvProp? If I am training a new model with AdvProp, why should I use it rather than the ImageNet mean and std?

One question: if I'm getting the gist of the paper right, it seems like AdvProp uses two batch norm layers (one for standard data and one for adversarial data). However, in the code I don't see where that other batch norm layer is implemented. Am I misunderstanding the paper, or does the code not provide it?

@ooodragon94 I think that part is not in this repo, but it is easy to implement. Something like:

import torch
import torch.nn as nn

class EfficientNet(nn.Module):
    def __init__(self, channels=32, advprop=False, **kwargs):
        super().__init__()
        # stand-in for the usual conv blocks
        self.somelayers = nn.Conv2d(3, channels, 3, padding=1)
        self.norm = nn.BatchNorm2d(channels)  # BN for clean examples
        if advprop:
            # auxiliary BN for adversarial examples
            self.aux_norm = nn.BatchNorm2d(channels)

    def forward(self, x, advprop=False):
        x = self.somelayers(x)
        # route adversarial batches through the auxiliary BN
        x = self.aux_norm(x) if advprop else self.norm(x)
        return x
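
For example, during AdvProp-style training each step would run two batches, one per BN branch (a sketch; the adversarial batch here is a random stand-in):

model = EfficientNet(advprop=True)
clean_batch = torch.randn(8, 3, 224, 224)
adv_batch = torch.randn(8, 3, 224, 224)  # stand-in for real adversarial examples

clean_out = model(clean_batch, advprop=False)  # main BN sees clean statistics
adv_out = model(adv_batch, advprop=True)       # aux BN sees adversarial statistics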

@shijianjian
You are definitely right (and your code is compact and beautiful!).
I'm just afraid that if I were to use a pretrained model and fine-tune it, since I would not be loading aux_norm's parameters, the performance might degrade due to the different settings.

@ooodragon94
I'm not entirely confident, but if you want to fine-tune with adversarial examples, I think the easiest way is to freeze the whole model apart from the auxiliary normalization layer: in the initial epochs you only train the auxiliary means and stds, then you save the whole model and fine-tune it as normal. Something like the sketch below.
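
A sketch of that freezing step, assuming the aux_norm naming from the snippet above (note that BN running means/vars still update in train mode regardless of requires_grad, which only affects the affine weight/bias):

# freeze everything except the auxiliary BN
for name, param in model.named_parameters():
    param.requires_grad = 'aux_norm' in name

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)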

@shijianjian
I'm trying to implement AdvProp.
I totally agree with your comment on the "different normalization".
I also don't see where the adversarial samples are generated... (a sketch of that step follows below)
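
The adversarial sample generation isn't in this repo; in the paper, the adversarial examples come from an attacker such as PGD run against the auxiliary-BN branch. A minimal PGD sketch (epsilon, alpha, and steps are illustrative, not the paper's settings; model follows the two-BN snippet above):

import torch

def pgd_attack(model, x, y, loss_fn, epsilon=4/255, alpha=1/255, steps=5):
    # iteratively step along the input-gradient sign, staying in the eps-ball
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv, advprop=True), y)  # adversarial BN branch
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = x + torch.clamp(x_adv - x, -epsilon, epsilon)
    return x_adv.detach()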

When using the AdvProp pretrained weights and AdvProp normalization, training becomes very unstable and accuracy also decreases.

@feiwofeifeixiaowo If by any chance you only loaded the AdvProp weights without actually implementing AdvProp training, then accuracy will definitely suffer.