RuntimeError: Function 'SqrtBackward0' returned nan values in its 0th output
Can-Zhao opened this issue · 4 comments
Hi,
I got "RuntimeError: Function 'SqrtBackward0' returned nan values in its 0th output" at this line:
feats0[kk], feats1[kk] = lpips.normalize_tensor(outs0[kk]), lpips.normalize_tensor(outs1[kk])
It seems that this issue might be solved by changing line 14 of PerceptualSimilarity/lpips/__init__.py (at commit 31bc127) to
norm_factor = torch.sqrt(torch.sum(in_feat**2, dim=1, keepdim=True) + eps)
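A minimal sketch of why moving eps inside the square root helps. It assumes the original code adds eps only after the square root; normalize_unsafe, normalize_safe and grad_has_nan are hypothetical names used for illustration, not functions from the lpips package. With an all-zero feature map the forward pass is fine in both versions, but the backward pass through torch.sqrt evaluates 0 / 0 in the first one. The RuntimeError itself is only raised when anomaly detection (torch.autograd.set_detect_anomaly(True)) is enabled; otherwise the gradients just become NaN.

```python
import torch

def normalize_unsafe(in_feat, eps=1e-10):
    # Assumed shape of the original code: eps is added AFTER the sqrt,
    # so sqrt still sees an exact zero for an all-zero feature vector.
    norm_factor = torch.sqrt(torch.sum(in_feat ** 2, dim=1, keepdim=True))
    return in_feat / (norm_factor + eps)

def normalize_safe(in_feat, eps=1e-10):
    # Proposed change from this issue: eps is added INSIDE the sqrt,
    # so its argument is never exactly zero.
    norm_factor = torch.sqrt(torch.sum(in_feat ** 2, dim=1, keepdim=True) + eps)
    return in_feat / norm_factor

def grad_has_nan(fn):
    # Hypothetical check: backprop through fn on an all-zero input
    # and report whether any gradient entry is NaN.
    x = torch.zeros(1, 3, 2, 2, requires_grad=True)
    fn(x).sum().backward()
    return bool(torch.isnan(x.grad).any())

print("unsafe has NaN grad:", grad_has_nan(normalize_unsafe))
print("safe has NaN grad:  ", grad_has_nan(normalize_safe))
```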
Thanks for posting this. I've been struggling to figure out what the 'SqrtBackward0' issue is and how to fix it. Perhaps a zero output from torch.sqrt makes the backward pass numerically unstable, since the gradient of sqrt(x) is 1/(2*sqrt(x)), which is unbounded at x = 0!
Perhaps a better option is to fix the torch.sqrt function itself, as in my case I'm using torch.sqrt directly in my model.
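For code that calls torch.sqrt directly, one possible workaround (a sketch only, not an official PyTorch fix) is to clamp the argument away from zero before taking the root. safe_sqrt and its eps default are hypothetical choices for illustration.

```python
import torch

def safe_sqrt(x, eps=1e-8):
    # Hypothetical helper: clamp keeps the argument >= eps, so the
    # backward pass of sqrt never divides by zero. For entries below
    # eps, clamp passes back a zero gradient instead of inf/NaN.
    return torch.sqrt(torch.clamp(x, min=eps))

x = torch.tensor([0.0, 4.0], requires_grad=True)
safe_sqrt(x).sum().backward()
print(x.grad)  # finite everywhere, including at x = 0
```

The trade-off is that values below eps receive no gradient at all; adding eps inside the root, as proposed above, keeps a (large but finite) gradient instead.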
Facing the same issue. @Can-Zhao, did adding eps solve the issue?
Is there a proposed PR for this? Should I make one?
Hello,
I faced the same problem and fixed it by changing the line to
norm_factor = torch.sqrt(torch.sum(x ** 2, dim=1, keepdim=True) + 1e-8)
It's a big problem and it affects other libraries, @richzhang:
Lightning-AI/pytorch-lightning#18712