
Exciting work! Some questions about universally slimmable networks

TaojiannanYang opened this issue · 3 comments

Hi Jiahui, thanks for open-sourcing your work. I have a few questions about the details of universally slimmable networks.

  1. I didn't quite understand the post-statistics BN. Could you please elaborate on it, e.g., give pseudocode showing how you implement it?
  2. It seems that you also need to pre-define the widths. So there are just more widths, but not truly arbitrary widths, right?

Thanks a lot for your help!

Hi @TaojiannanYang ,

  1. Post-statistics BN means updating the BN running statistics with the model frozen, by feeding training images through it.
  2. It is NOT just more widths: with US-Nets you can execute at arbitrary widths. Given our released model, you can (1) sample a random width, (2) compute the post-statistics of BN for that width configuration, and (3) execute it. We released BN statistics for several widths sampled evenly from 0.35 to 1.0, but you can indeed sample any width.
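
A minimal PyTorch sketch of step (2), for readers who asked for pseudocode. It is not the repository's implementation. The function name `recompute_bn_stats` and the toy model are hypothetical, and the code assumes the desired width has already been selected on the model (that part depends on the model's own API and is omitted). It relies on standard `torch.nn` behavior only: BN layers update their running statistics in train mode, and `momentum=None` switches them to a cumulative (equal-weight) average over batches.

```python
import torch
import torch.nn as nn


@torch.no_grad()  # no gradients, no optimizer: the weights stay frozen
def recompute_bn_stats(model, batches):
    """Re-estimate BN running mean/variance from `batches` (hypothetical helper)."""
    bn_layers = [m for m in model.modules()
                 if isinstance(m, nn.modules.batchnorm._BatchNorm)]
    saved_momentum = [bn.momentum for bn in bn_layers]
    was_training = model.training

    for bn in bn_layers:
        bn.reset_running_stats()  # running_mean -> 0, running_var -> 1, counter -> 0
        bn.momentum = None        # None = cumulative average over all batches seen

    # BN only updates running stats in train mode. Note that this also
    # enables dropout, if the model has any.
    model.train()
    for images in batches:
        model(images)

    # Restore the original momentum values and train/eval mode.
    for bn, momentum in zip(bn_layers, saved_momentum):
        bn.momentum = momentum
    model.train(was_training)
    return model


# Toy usage: random tensors stand in for training images.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
toy_batches = [torch.randn(16, 3, 8, 8) + 2.0 for _ in range(10)]
recompute_bn_stats(toy_model, toy_batches)
toy_model.eval()  # inference now uses the recomputed statistics
```

Under these assumptions, the full procedure for a new width would be: set the width on the model, call the helper on some training batches, then switch to eval mode and run inference.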

Hi Jiahui, Thanks for your reply.
So does this mean that during the training phase you just sample different widths and do naive training, and in the testing phase, given the desired width, you compute the BN mean and variance and then do inference?
Thanks for your help!

@TaojiannanYang That sounds correct. During training, please try our proposed inplace distillation and the sandwich rule instead of naive training.
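
For readers unfamiliar with the two techniques, here is a rough sketch of one training step as I understand them from the US-Nets paper, not the repository's code. The sandwich rule trains the largest width, the smallest width, and a few random widths in each iteration; inplace distillation trains the largest width on the ground-truth labels and the smaller widths on the largest width's (detached) predictions. `ToySlimNet`, `sandwich_step`, and all parameter values are hypothetical, and the toy model has no BN layers.

```python
import random

import torch
import torch.nn as nn
import torch.nn.functional as F


class ToySlimNet(nn.Module):
    """Hypothetical two-layer MLP whose hidden layer can be sliced to any width."""

    def __init__(self, in_dim=8, hidden=16, classes=4):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, classes)
        self.width_mult = 1.0

    def forward(self, x):
        # Use only the first k hidden units at the current width.
        k = max(1, int(round(self.fc1.out_features * self.width_mult)))
        h = F.relu(F.linear(x, self.fc1.weight[:k], self.fc1.bias[:k]))
        return F.linear(h, self.fc2.weight[:, :k], self.fc2.bias)


def sandwich_step(model, optimizer, x, y, min_w=0.25, max_w=1.0, n=4):
    """One step: max width, min width, and n - 2 random widths in between."""
    widths = [max_w, min_w] + [random.uniform(min_w, max_w) for _ in range(n - 2)]
    optimizer.zero_grad()

    # Largest width: trained on the ground-truth labels.
    model.width_mult = max_w
    logits = model(x)
    F.cross_entropy(logits, y).backward()
    soft_target = F.softmax(logits.detach(), dim=1)  # detached: acts as teacher only

    # Smaller widths: trained on the largest width's soft predictions.
    for w in widths[1:]:
        model.width_mult = w
        log_p = F.log_softmax(model(x), dim=1)
        loss = -(soft_target * log_p).sum(dim=1).mean()
        loss.backward()  # gradients accumulate across widths

    optimizer.step()  # a single update from the accumulated gradients
    model.width_mult = max_w
    return widths
```

A call would look like `sandwich_step(model, torch.optim.SGD(model.parameters(), lr=0.1), x, y)`, repeated over the training batches.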