Exciting work! Some questions about universally slimmable networks
TaojiannanYang opened this issue · 3 comments
TaojiannanYang commented
Hi Jiahui, thanks for open-sourcing your work. I have a few questions about the details of universally slimmable networks.
- I didn't quite understand the post-statistics BN. Could you please elaborate on it, e.g. give some pseudo-code showing how you implement it?
- It seems that you also need to pre-define the widths. So there are just more widths, but not truly arbitrary widths, right?
Thanks a lot for your help!
JiahuiYu commented
Hi @TaojiannanYang ,
- Post-statistics BN means updating the BN running statistics with the model frozen, by feeding training images through it.
- It is NOT just more widths; US-Nets can run at arbitrary widths. Given our released model, you can (1) sample a random width, (2) compute the post-statistics of BN for that width configuration, and (3) execute it. We released BN statistics for several widths sampled evenly from 0.35 to 1.0, but you can indeed sample any width.
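For readers who want the pseudo-code requested above, here is a minimal sketch of steps (1) to (3) in plain Python. It is not the code from the released repository. The names `active_channels`, `layer_forward`, and `post_statistics` are hypothetical, the single linear layer stands in for a real network, and the rounding rule and the cumulative (simple) average over batches are assumptions made for illustration.

```python
def active_channels(total, width_mult):
    """Channels kept at a width multiplier (assumed rule: round, at least 1)."""
    return max(1, int(round(total * width_mult)))


def layer_forward(x, weights, n_out):
    """Frozen toy linear layer that uses only the first n_out output channels."""
    return [sum(w * xi for w, xi in zip(weights[c], x)) for c in range(n_out)]


def post_statistics(batches, weights, width_mult):
    """Recompute per-channel BN mean/variance for one width, weights frozen.

    No weight is updated here; only the statistics are accumulated, as a
    cumulative average of per-batch (biased) means and variances.
    """
    n_out = active_channels(len(weights), width_mult)
    mean = [0.0] * n_out
    var = [0.0] * n_out
    for t, batch in enumerate(batches, start=1):
        acts = [layer_forward(x, weights, n_out) for x in batch]
        for c in range(n_out):
            col = [a[c] for a in acts]
            b_mean = sum(col) / len(col)
            b_var = sum((v - b_mean) ** 2 for v in col) / len(col)
            # Cumulative average: after t batches each batch has weight 1/t.
            mean[c] += (b_mean - mean[c]) / t
            var[c] += (b_var - var[c]) / t
    return mean, var


# Hypothetical usage: 4 output channels, run at half width.
weights = [[1, 0], [0, 1], [1, 1], [1, -1]]
batches = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
mean, var = post_statistics(batches, weights, width_mult=0.5)
```

The returned `mean` and `var` would then be used as the BN statistics when running inference at that width. In a real network the same pass would be repeated for every BN layer, since each layer's input distribution changes with the width.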
TaojiannanYang commented
Hi Jiahui, Thanks for your reply.
So does this mean that during the training phase you just sample different widths and do naive training, and in the testing phase, given the desired width, you compute the BN mean and variance and then do inference?
Thanks for your help!
JiahuiYu commented
@TaojiannanYang That sounds correct. During training, please try our proposed inplace distillation and the sandwich rule instead of naive training.
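As a rough illustration of how the two training techniques differ from naive sampling, the sketch below picks the widths for one training step and the target each width is trained against. It reflects one reading of the sandwich rule (always train the largest and smallest widths, plus a few random ones in between) and of inplace distillation (the largest width learns from the ground-truth labels, the other widths learn from the largest width's predictions). The function names, the default of four widths per step, and the reuse of the 0.35 to 1.0 range mentioned above are assumptions, not taken from the released code.

```python
import random


def sandwich_widths(min_width=0.35, max_width=1.0, n=4, rng=random):
    """Widths for one training step: largest, smallest, then n-2 random ones."""
    middle = [rng.uniform(min_width, max_width) for _ in range(n - 2)]
    return [max_width, min_width] + middle


def training_targets(widths, labels, full_width_probs, max_width=1.0):
    """Pair each width with its target.

    The largest width uses the ground-truth labels; every other width uses
    the largest width's predictions as soft targets.
    """
    return [(w, labels if w == max_width else full_width_probs) for w in widths]


# Hypothetical usage with a seeded generator for reproducibility.
widths = sandwich_widths(rng=random.Random(0))
plan = training_targets(widths, labels=[1, 0], full_width_probs=[[0.2, 0.8], [0.9, 0.1]])
```

In an actual training loop the largest width would be run first, so that its predictions are available as targets for the remaining widths in the same step.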