parthagrawal02/MAE_GAN

norm_pix_loss question


For this implementation, shouldn't norm_pix_loss be False? If it were on, I would imagine the discriminator would have a very easy time distinguishing patches from the original image from those generated by the decoder. It seems, though, that you have norm_pix_loss set to True by default. Could you provide some insight?
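For context, here is roughly what norm_pix_loss does in the upstream MAE loss. This is a sketch adapted from facebookresearch/mae; the function name and exact signature are my paraphrase, and the version in this repo may differ:

```python
import torch

def mae_reconstruction_loss(pred, target, mask, norm_pix_loss=True, eps=1e-6):
    """Per-patch MSE as in the upstream MAE loss (sketch, not this repo's code).

    pred, target: (N, L, patch_dim) patchified predictions / ground-truth patches
    mask:         (N, L), 1 for masked (reconstructed) patches, 0 for visible ones
    """
    if norm_pix_loss:
        # Each target patch is normalized by its own mean and variance, so the
        # decoder regresses per-patch-normalized values rather than raw pixels.
        mean = target.mean(dim=-1, keepdim=True)
        var = target.var(dim=-1, keepdim=True)
        target = (target - mean) / (var + eps) ** 0.5
    loss = ((pred - target) ** 2).mean(dim=-1)   # per-patch MSE
    return (loss * mask).sum() / mask.sum()      # average over masked patches only
```

If the discriminator then sees decoder outputs living in this per-patch-normalized space next to raw image patches, telling them apart seems almost trivial, which is what I mean above.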

Also, as an aside, why do you cast things to double, for instance in line 232 of models_mae.py and line 160 of mae_pretrain.py?

Thanks!

Why? Can you explain more?

For the norm_pix_loss: aren't the generated patches going to have a different range of values than the original image? Also, if you then paste the generated patches into the original image in place of the masked portions, won't you get something weird, like what is shown in this GitHub issue: facebookresearch/mae#12?
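To illustrate what I mean (hypothetical helpers, not code from this repo): with norm_pix_loss = True, the predictions would first have to be un-normalized using each target patch's statistics before being pasted back, which, as far as I understand, is the kind of fix discussed in that issue:

```python
import torch

def unnormalize_pred(pred, target, eps=1e-6):
    """Map predictions made under norm_pix_loss=True back to pixel space
    using each ground-truth patch's mean/variance (hypothetical helper)."""
    mean = target.mean(dim=-1, keepdim=True)
    var = target.var(dim=-1, keepdim=True)
    return pred * (var + eps) ** 0.5 + mean

def paste_predictions(pred, target, mask):
    """Composite in patch space: predicted patches where mask == 1,
    original patches elsewhere. pred, target: (N, L, D); mask: (N, L)."""
    m = mask.unsqueeze(-1)
    return pred * m + target * (1.0 - m)
```

Without the un-normalization step you would be mixing values on two different scales, which is presumably why the composites in that issue look so strange.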

Also, in the paper you referenced, the reconstructions don't seem to use norm_pix_loss = True, because the images in Figure 3 don't have the discontinuous edges around patches that are characteristic of norm_pix_loss being True.

Okay, understood. My signals were normalised during preprocessing, hence I didn't give it a thought.