Random initialization in linear probing
ruomingzhai opened this issue · 1 comment
Hi,
I want to compare your method in linear probing with random initialization.
Regarding random initialization in linear probing, is it the case that all I have to do is train the backbone and classification head with "lr=0.05, lr_head=null, freeze_layers=True, epoch_num=100"?
That doesn't make sense to me, because the weights of the backbone will not change, and with lr_head=null the weights of the classification head will not change either?
I should perhaps document this in the config: when you set lr_head=null, the same lr is actually applied to both the head and the backbone (so in practice lr_head will also be equal to 0.05).
In SLidR, however, we used 50 epochs for linear probing, as described in paragraph D.1 of the supplementary material.