juntang-zhuang/Adabelief-Optimizer

weight_decouple in adabelief tf

YannPourcenoux opened this issue · 1 comment

Hi, I am a bit confused: it says that weight_decouple is supported, but it is not an option. Does that mean it is used by default? If not, how can I turn it on?

Hi, it's turned on by default here (same as AdamW) and cannot be turned off. In general, decoupled weight decay is more stable than coupled decay, so I hard-coded it to follow the conventions in tensorflow-addons.
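For readers unsure what "decoupled" changes in practice, here is a minimal sketch of one scalar Adam-style step with both variants. It is not the code from this repository or from tensorflow-addons: the function `adam_step`, its parameters, and the choice to scale the decay term by `lr` are assumptions made for illustration (implementations differ on that scaling).

```python
import math


def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=1e-2, decoupled=True):
    """One scalar Adam-style step (hypothetical helper).

    Returns (new_w, new_m, new_v). `t` is the 1-based step count.
    """
    if not decoupled:
        # Coupled (L2-style): the decay is folded into the gradient, so it
        # is later rescaled by the adaptive denominator as well.
        grad = grad + weight_decay * w

    # Standard Adam moment estimates with bias correction.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    new_w = w - lr * m_hat / (math.sqrt(v_hat) + eps)

    if decoupled:
        # Decoupled (AdamW-style): shrink the weight directly, independent
        # of the adaptive scaling. Scaling by lr is an assumption here.
        new_w = new_w - lr * weight_decay * w

    return new_w, m, v


# Zero gradient isolates the effect of the decay term on the first step.
w_decoupled, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, decoupled=True)
w_coupled, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, decoupled=False)
```

With a zero gradient on the first step, the decoupled variant shrinks `w` by `lr * weight_decay * w`, while the coupled variant moves it by roughly `lr` regardless of how small `weight_decay` is, because the decay term passes through the adaptive normalization. That sensitivity is one way to see why decoupled decay tends to behave more predictably.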