divide loss by batch size
Closed this issue · 1 comments
vivanov879 commented
Hi, maybe the loss should be divided by the batch size. Learning won't be affected, but since all the other criterions do that, it might be worth doing.
https://github.com/y0ast/VAE-Torch/blob/master/KLDCriterion.lua#L14
self.output = -0.5 * torch.sum(KLDelements) / input:size(1)
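The proposed normalization can be sketched outside Torch as well. Below is a minimal NumPy version of the standard VAE KL term, summed over latent dimensions and divided by the batch size as suggested (the function name and shapes are illustrative, not part of the repo):

```python
import numpy as np

def kld_per_batch(mu, log_var):
    # KL divergence between N(mu, sigma^2) and the standard normal N(0, I):
    # KLD = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kld_elements = 1 + log_var - mu**2 - np.exp(log_var)
    # Sum over all elements, then divide by batch size (rows of mu),
    # mirroring the `input:size(1)` division proposed above.
    return -0.5 * kld_elements.sum() / mu.shape[0]

# A posterior that exactly matches the prior has zero KL divergence:
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))
print(kld_per_batch(mu, log_var))  # 0.0
```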
y0ast commented
Hmm, the BCE criterion only supports dividing by batch size * input size (the "sizeAverage" option). This does not work for the KLD, which is why I turned it off in https://github.com/y0ast/VAE-Torch/blob/master/main.lua#L44.
I don't really feel like manually dividing the BCE gradients later, so I am going to leave it like this and not divide by batch size in the KLDCriterion.
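For reference, the alternative being declined here can be sketched numerically: with sizeAverage off, both criterions return plain sums, and putting them on a per-sample scale would mean dividing each loss (and its gradients) by the batch size only, not by batch size * input size as sizeAverage does. The numbers below are hypothetical, just to show the scale difference:

```python
batch_size, input_size = 16, 784
bce_sum = 1000.0  # hypothetical summed reconstruction (BCE) loss
kld_sum = 40.0    # hypothetical summed KL divergence

# sizeAverage would divide BCE by every element, shrinking it
# relative to a KLD divided only by batch size:
bce_size_average = bce_sum / (batch_size * input_size)

# Consistent per-sample scaling divides both sums by batch size alone:
loss_per_sample = (bce_sum + kld_sum) / batch_size
print(bce_size_average, loss_per_sample)  # 0.0797... 65.0
```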