fadel/pytorch_ema

Question about adding `with torch.no_grad():`

liang233 opened this issue · 1 comment

Hello, good work! I want to reduce memory usage. Where should I add `with torch.no_grad()`? Thanks.

fadel commented

Hi, the current implementation probably uses torch.no_grad() wherever possible, so gradients for the tensors inside the ExponentialMovingAverage instance are not supposed to be stored. Do you have a code snippet that illustrates your particular issue?
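For reference, here is a minimal sketch of how an EMA update can be done without building an autograd graph. This is plain PyTorch meant to illustrate the idea, not the library's actual code; the function and variable names are made up for the example:

```python
import torch

def ema_update(shadow_params, model_params, decay=0.999):
    """Update shadow (EMA) parameters in place without tracking gradients."""
    with torch.no_grad():  # no autograd graph is recorded for these ops
        for shadow, param in zip(shadow_params, model_params):
            # shadow <- decay * shadow + (1 - decay) * param
            shadow.mul_(decay).add_(param, alpha=1.0 - decay)
```

Because the update runs under `torch.no_grad()` and uses in-place ops, no gradient buffers are allocated for the shadow tensors; the only extra memory is the shadow copy itself.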

Please note that using EMA will store a copy of all your parameters, so memory usage will be higher.
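As a rough illustration of the memory point, here is a sketch assuming the documented `torch_ema` usage (the model and decay value are arbitrary placeholders):

```python
import torch.nn as nn
from torch_ema import ExponentialMovingAverage  # pip install torch-ema

model = nn.Linear(1000, 1000)
n_params = sum(p.numel() for p in model.parameters())

# Creating the EMA object clones every tracked parameter, so parameter memory
# roughly doubles: one live set plus one shadow (averaged) set.
ema = ExponentialMovingAverage(model.parameters(), decay=0.995)
print(f"model parameters: {n_params} (the EMA keeps a second copy of about the same size)")

# During training, after each optimizer step:
# ema.update(model.parameters())
```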