juntang-zhuang/GSAM

Performance of GSAM w/ Adaptive Perturbation


Hi,

Thanks for creating a PyTorch version of your code!

I saw that in your CIFAR example you use the adaptive perturbation rather than the traditional perturbation. However, in Figure 5 of your paper (the bar plot comparing different SAM methods), GSAM with the adaptive perturbation performed worse. ASAM also always performed worse than SAM in those results, which suggests to me that rho may not have been tuned properly (although I might just be misinterpreting the results).
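For reference, here is a minimal sketch of the distinction I mean. It assumes the ASAM-style element-wise scaling by |w|. The function `perturbation` and all variable names are my own illustration, not taken from this repo.

```python
import math

def perturbation(weights, grads, rho, adaptive=False, eps=1e-12):
    """Return the ascent step e for flat lists of weights and gradients.

    Hypothetical helper for illustration only; not this repo's API.
    """
    # T_w is the element-wise scaling: |w| for the adaptive variant, 1 otherwise.
    scale = [abs(w) if adaptive else 1.0 for w in weights]
    # Norm of the scaled gradient T_w * g.
    norm = math.sqrt(sum((s * g) ** 2 for s, g in zip(scale, grads)))
    # e = rho * T_w^2 * g / ||T_w * g||
    return [rho * s * s * g / (norm + eps) for s, g in zip(scale, grads)]

weights = [2.0, 0.1]   # one large and one small weight
grads = [1.0, 1.0]     # equal gradients on both

sam_step = perturbation(weights, grads, rho=0.1)
asam_step = perturbation(weights, grads, rho=0.1, adaptive=True)
```

With the same rho, the adaptive step is larger on the large weight and much smaller on the small one, so as far as I can tell a rho tuned for the traditional perturbation would not carry over directly to the adaptive one.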

Have you seen better performance with adaptive GSAM, and have you found the adaptive approach harder to tune?

Thanks,
SXKJames