Add `weight_decay_filter` and `lars_adaptation_filter` to LARS
turian opened this issue · 9 comments
🚀 Feature
Add `weight_decay_filter` and `lars_adaptation_filter` to LARS.
Motivation
Weight decay typically shouldn't be applied to BatchNorm parameters. See fast.ai and this PyTorch discussion thread.
The Facebook VICReg code has parameters `weight_decay_filter` and `lars_adaptation_filter`, which are set to `True` for any parameter with `ndim == 1`.
Pitch
There should be a simple way to disable weight decay and LARS adaptation on `ndim == 1` parameters.
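To make the request concrete, here is a sketch of a single-parameter LARS step that honors such a filter. This is not the Flash implementation; the function name and `exclude` argument are hypothetical, following the VICReg reference behavior:

```python
import torch

def lars_update(p, grad, lr, weight_decay, eta,
                exclude=lambda p: p.ndim == 1):
    """One LARS step for a single parameter (sketch).

    When ``exclude(p)`` is true (here: ndim == 1, i.e. biases and
    norm parameters), both weight decay and the LARS trust-ratio
    adaptation are skipped, so the update reduces to plain SGD.
    """
    if not exclude(p):
        grad = grad + weight_decay * p  # apply weight decay
        w_norm = p.norm()
        g_norm = grad.norm()
        if w_norm > 0 and g_norm > 0:
            grad = grad * (eta * w_norm / g_norm)  # trust ratio
    p.sub_(lr * grad)
```

With this shape, a 1-D parameter gets the unmodified gradient, while weight matrices still get the decayed, trust-ratio-scaled update.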
Alternatives
Port Facebook's LARS code and use it instead of the Lightning Flash LARS code.
Hi, @turian - Thank you for creating the issue. Just to let you know, I have this on my list to take a look at, and I'll try to get back by this weekend. A bit occupied, apologies for the delay.
Hi, @turian - Thank you for giving the context; I went through the discussion on the PyTorch forum. I think it's fair to give the user an option to disable this based on the condition (`ndim == 1`). Would you like to create a PR to add this? If not, I'll be able to take a look, hopefully soon. Thank you! ⚡
@krshrimali I am not sure that I would be able to create a PR that covers all corner cases. :(
No worries at all! I will try to take a look, we are working towards a release tomorrow, so I will need some time but I have added this to my list. Thank you again!!
@krshrimali Great! I am following this issue.
I'll try to pick this up over the coming weekend. Thanks for your patience, @turian!
@krshrimali Thanks! And I am happy to help with code review if you tag me in the PR
Thanks! I'll make sure to request your review :)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.