Where can I add `with torch.no_grad():` to reduce memory?
liang233 opened this issue · 1 comment
liang233 commented
Hello, nice work! I want to reduce memory usage. Where can I add `with torch.no_grad():`? Thanks.
fadel commented
Hi, the current implementation already uses `torch.no_grad()` wherever possible, so gradients for the tensors inside the `ExponentialMovingAverage` instance should not be stored. Do you have a code snippet that illustrates your particular issue?
Please note that using EMA will store a copy of all your parameters, so memory usage will be higher.
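For illustration, here is a minimal sketch (a hypothetical stand-in, not this repository's actual implementation) of how an EMA helper keeps its shadow copies free of autograd state: the shadow tensors are detached clones, and the update arithmetic runs under `torch.no_grad()`, so no graph or gradients are stored for them.

```python
import torch

class EMASketch:
    """Hypothetical minimal EMA helper, for illustration only."""

    def __init__(self, parameters, decay=0.999):
        self.decay = decay
        # detach().clone() stores plain copies with no autograd history,
        # so these tensors never require or accumulate gradients.
        self.shadow = [p.detach().clone() for p in parameters]

    @torch.no_grad()  # the update arithmetic builds no autograd graph
    def update(self, parameters):
        for s, p in zip(self.shadow, parameters):
            s.mul_(self.decay).add_(p, alpha=1.0 - self.decay)
```

The extra memory here comes from the shadow copies themselves, which is the per-parameter copy mentioned above; adding `torch.no_grad()` cannot remove that cost.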