The evaluation results depend on batch size

Question

apple2373 opened this issue 2 years ago · 1 comments

I noticed that the evaluation results differ depending on the batch size. The cause is this line:
https://github.com/yuyanli0831/OmniFusion/blob/aaf52cc953ade3be1f5fc3df446705e4223b8d21/test.py#L161

Because the median is taken over all images in a batch, its value changes when different images are grouped together, so the results differ slightly with batch size. If we want deterministic results, we can either disable the median alignment or compute the median per image.
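To make the difference concrete, here is a minimal sketch in plain Python. It assumes the linked line scales predictions by the ratio of ground-truth median to predicted median over the whole batch; I have not reproduced the actual code from test.py. The function names `align_batch_median` and `align_per_image_median` are hypothetical, and each depth map is represented as a flat list of toy values.

```python
from statistics import median

def align_batch_median(pred_batch, gt_batch):
    """Scale all predictions by one ratio computed over the whole batch (assumed behaviour)."""
    all_pred = [d for img in pred_batch for d in img]
    all_gt = [d for img in gt_batch for d in img]
    scale = median(all_gt) / median(all_pred)
    return [[d * scale for d in img] for img in pred_batch]

def align_per_image_median(pred_batch, gt_batch):
    """Scale each prediction by its own ratio, so batch composition does not matter."""
    aligned = []
    for pred, gt in zip(pred_batch, gt_batch):
        scale = median(gt) / median(pred)
        aligned.append([d * scale for d in pred])
    return aligned

# Two toy "images": A is predicted at half the true scale, B is already correct.
pred_a, gt_a = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
pred_b, gt_b = [10.0, 20.0, 30.0], [10.0, 20.0, 30.0]

# Batch-level median: image A is aligned differently once B shares its batch.
batch_alone = align_batch_median([pred_a], [gt_a])[0]
batch_together = align_batch_median([pred_a, pred_b], [gt_a, gt_b])[0]

# Per-image median: image A gets the same result either way.
image_alone = align_per_image_median([pred_a], [gt_a])[0]
image_together = align_per_image_median([pred_a, pred_b], [gt_a, gt_b])[0]

print(batch_alone == batch_together)   # batch-level alignment depends on grouping
print(image_alone == image_together)   # per-image alignment does not
```

With batch size 1 the two variants coincide, which is why the reported metrics only shift once more than one image is grouped together.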

Answer 1 · 2022-12-05T03:22:43.000Z

Beyond the implementation, I also wonder why we need median alignment at all. My best guess: if every object moves twice as far away and also becomes twice as large, the image looks the same. (Am I correct? I can imagine there is some scale ambiguity when we only have a single image.) To address that scale ambiguity, we use the medians of the ground-truth and predicted depth to align their scales.
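A small sketch of that guess, assuming a simple pinhole camera for illustration (the helper `project` and the toy points are hypothetical, and the repository itself works on 360 images, where the same argument applies to viewing directions). Scaling every 3D point by the same factor leaves the projected pixels unchanged, so a single image cannot tell the two scenes apart, and a median ratio is enough to bring a uniformly mis-scaled prediction back to the ground truth.

```python
from statistics import median

def project(point, focal=1.0):
    """Pinhole projection of a 3D point (x, y, z) onto the image plane."""
    x, y, z = point
    return (focal * x / z, focal * y / z)

# A toy scene and the same scene with everything twice as far and twice as big.
scene = [(1.0, 2.0, 4.0), (-2.0, 1.0, 8.0)]
scaled_scene = [(2 * x, 2 * y, 2 * z) for x, y, z in scene]

pixels = [project(p) for p in scene]
scaled_pixels = [project(p) for p in scaled_scene]
print(pixels == scaled_pixels)  # both scenes produce the same image

# Depth is the z coordinate. Suppose the prediction matches the scaled scene.
gt_depth = [p[2] for p in scene]
pred_depth = [p[2] for p in scaled_scene]

# Median alignment rescales the prediction to the ground-truth scale.
scale = median(gt_depth) / median(pred_depth)
aligned_depth = [d * scale for d in pred_depth]
print(aligned_depth == gt_depth)
```

This only covers a global scale factor; errors that vary across the image are not removed by the median ratio.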