davda54/sam

Wrong Adaptive mode?

pdradx opened this issue · 1 comment

In the code referenced in the original paper,
https://github.com/SamsungLabs/ASAM/blob/master/asam.py
the adaptive operator T is applied only to the tensors that store layer weights. The biases of FC and convolutional layers remain in SAM mode (the T operator is not applied to them).
I'm not sure whether this is a critical difference, but it IS a difference from the original paper's example of the algorithm.
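
To make the difference concrete, here is a minimal framework-free sketch of the ascent-step perturbation. It is not code from either repository. It assumes T_w = |w| + eta and eps = rho * T_w^2 * grad / ||T_w * grad||, which is how I read the referenced asam.py, and it assumes weight tensors are picked by the substring "weight" in the parameter name. The names `asam_perturbation` and `adaptive_for` are hypothetical.

```python
import math

def asam_perturbation(params, grads, rho=0.5, eta=0.01, adaptive_for=lambda name: True):
    # params, grads: dict mapping parameter name -> flat list of floats.
    # adaptive_for(name) decides whether T_w = |w| + eta is used for that tensor;
    # otherwise T_w = 1, which is the plain SAM case.
    t_ops, scaled = {}, {}
    for name, g in grads.items():
        if adaptive_for(name):
            t_ops[name] = [abs(w) + eta for w in params[name]]
        else:
            t_ops[name] = [1.0] * len(g)
        scaled[name] = [t * gi for t, gi in zip(t_ops[name], g)]
    # Global L2 norm of T_w * grad over all tensors.
    norm = math.sqrt(sum(v * v for tg in scaled.values() for v in tg)) + 1e-16
    # eps = rho * T_w^2 * grad / ||T_w * grad||
    return {name: [rho * t * v / norm for t, v in zip(t_ops[name], scaled[name])]
            for name in grads}

# Toy values, chosen only for illustration.
params = {"fc.weight": [2.0, -1.0], "fc.bias": [0.5]}
grads = {"fc.weight": [1.0, 1.0], "fc.bias": [1.0]}

# T applied to every tensor, biases included.
eps_all = asam_perturbation(params, grads, eta=0.0)
# T applied only to tensors whose name contains "weight"; biases stay in SAM mode.
eps_weights_only = asam_perturbation(
    params, grads, eta=0.0, adaptive_for=lambda name: "weight" in name
)
print(eps_all["fc.bias"], eps_weights_only["fc.bias"])
```

Because the norm is taken over all tensors, switching the bias between the two modes changes the bias perturbation and also rescales the weight perturbations.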

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.