taoyang1122/adapt-image-models

Weights Initialization

liyi-ff opened this issue · 1 comments

If all the weights are initialized to the same value, they will stay identical to one another throughout training.
(

for n, m in self.transformer.named_modules():
)

Is there a reason to initialize these weights to a constant 0 rather than using another initialization, for example a Gaussian with a mean of 0 and a very small variance?

Thanks

Hi @liyi-ff, we initialize it to zero so that at the beginning of training the model is the same as the pre-trained model. As training proceeds, the adapters gradually adapt the model to video data. This stabilizes the training process.
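
As a rough illustration of both points (identity at the start, and why the weights need not stay equal), here is a minimal pure-Python sketch. It is not the repository's code: it assumes a residual bottleneck adapter of the form x + up(GELU(down(x))) in which only the up-projection is zeroed while the down-projection keeps non-constant values. The names `adapter`, `W_down`, `W_up`, and the upstream gradient `g` are all hypothetical.

```python
import math

def gelu(v):
    # Exact GELU via the error function.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

def linear(W, b, x):
    # W is a list of rows; returns W @ x + b.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def adapter(x, W_down, b_down, W_up, b_up):
    # Hypothetical residual adapter: x + up(gelu(down(x))).
    h = [gelu(a) for a in linear(W_down, b_down, x)]
    y = [xi + ui for xi, ui in zip(x, linear(W_up, b_up, h))]
    return y, h

dim, bottleneck = 4, 2

# Assumption: the down-projection is NOT constant (fixed stand-in for a random init).
W_down = [[0.1, -0.2, 0.3, 0.05],
          [-0.15, 0.25, 0.1, 0.4]]
b_down = [0.0] * bottleneck

# Assumption: only the up-projection is zero-initialized.
W_up = [[0.0] * bottleneck for _ in range(dim)]
b_up = [0.0] * dim

x = [1.0, -2.0, 0.5, 3.0]
y, h = adapter(x, W_down, b_down, W_up, b_up)

# 1) At initialization the adapter is an identity map, so the
#    surrounding pre-trained model's output is unchanged.
print(y == x)

# 2) For a made-up upstream gradient g = dL/dy, the gradient of the
#    up-projection is dL/dW_up[i][j] = g[i] * h[j]. Because the hidden
#    activations h differ, the entries differ, so the zeroed weights
#    do not stay equal after the first update.
g = [0.3, -0.1, 0.2, 0.5]
grad_W_up = [[gi * hj for hj in h] for gi in g]
print(grad_W_up)
```

Under these assumptions the symmetry concern does not apply: a constant initialization only locks weights together when every unit also receives the same gradient, which is not the case here because the non-constant down-projection produces different hidden activations.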