divide loss by batch size
Closed this issue · 1 comments
vivanov879 commented
Hi, maybe the loss should be divided by the batch size. Learning won't be affected, but since all the other criterions do that, it might be worth doing.
https://github.com/y0ast/VAE-Torch/blob/master/KLDCriterion.lua#L14
self.output = -0.5 * torch.sum(KLDelements) / input:size(1)
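The proposed normalization can be sketched outside Torch as well. Below is a minimal NumPy version of the standard VAE KL term, summed over latent dimensions and divided by the batch size as suggested (the function name and shapes are illustrative, not part of the repo):

```python
import numpy as np

def kld_per_batch(mu, log_var):
    # KL divergence between N(mu, sigma^2) and the standard normal N(0, I):
    # KLD = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kld_elements = 1 + log_var - mu**2 - np.exp(log_var)
    # Sum over all elements, then divide by batch size (rows of mu),
    # mirroring the `input:size(1)` division proposed above.
    return -0.5 * kld_elements.sum() / mu.shape[0]

# A posterior that exactly matches the prior has zero KL divergence:
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))
print(kld_per_batch(mu, log_var))  # 0.0
```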
y0ast commented
Hmm, the BCE criterion only supports dividing by batch size * input size (the "sizeAverage" option). This does not work for the KLD, which is why I turned it off in https://github.com/y0ast/VAE-Torch/blob/master/main.lua#L44.
I don't really feel like manually dividing the BCE gradients later, so I am going to leave it like this and not divide by batch size in the KLDCriterion.
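For reference, the alternative being declined here can be sketched numerically: with sizeAverage off, both criterions return plain sums, and putting them on a per-sample scale would mean dividing each loss (and its gradients) by the batch size only, not by batch size * input size as sizeAverage does. The numbers below are hypothetical, just to show the scale difference:

```python
batch_size, input_size = 16, 784
bce_sum = 1000.0  # hypothetical summed reconstruction (BCE) loss
kld_sum = 40.0    # hypothetical summed KL divergence

# sizeAverage would divide BCE by every element, shrinking it
# relative to a KLD divided only by batch size:
bce_size_average = bce_sum / (batch_size * input_size)

# Consistent per-sample scaling divides both sums by batch size alone:
loss_per_sample = (bce_sum + kld_sum) / batch_size
print(bce_size_average, loss_per_sample)  # 0.0797... 65.0
```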