Random initialization in linear probing
ruomingzhai opened this issue · 1 comment
Hi,
I want to compare your method in linear probing with random initialization.
Regarding random initialization in linear probing, is it the case that all I have to do is train the backbone and classification head with "lr=0.05, lr_head=null, freeze_layers=True, epoch_num=100"?
That doesn't make sense to me, because the weights of the backbone will not change, and with lr_head=null the weights of the classification head will not change either?
I should perhaps document this in the config: when you set lr_head=null, the same lr is actually applied to both the head and the backbone (so in practice lr_head will also be equal to 0.05).
In SLidR, however, we used 50 epochs for linear probing, as described in paragraph D.1 of the supplementary material.