Nanne/pytorch-NetVlad

image normalization

Anonymous-so opened this issue · 1 comment

Hi @Nanne
Thanks for the excellent work!
I'm a little confused about the image pre-processing in the file pittsburgh.py. The function input_transform normalizes images with mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225].
However, the official NetVLAD code (in the trainWeakly.m file) performs the normalization as follows:
ims(:,:,1,:)= ims(:,:,1,:) - net.meta.normalization.averageImage(1,1,1);
ims(:,:,2,:)= ims(:,:,2,:) - net.meta.normalization.averageImage(1,1,2);
ims(:,:,3,:)= ims(:,:,3,:) - net.meta.normalization.averageImage(1,1,3);
where net.meta.normalization.averageImage=[123.6800,116.7790,103.9390].

Could you please give me a hint on how 'mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]' was obtained from 'averageImage=[123.6800, 116.7790, 103.9390]'? Thanks a lot!

Nanne commented

They have images in the range 0-255, whereas PyTorch-loaded images are in the range 0-1. 123.68 / 255 = 0.485, and the other two are approximately the same as well.
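As a quick sanity check of that arithmetic, the per-channel means from MatConvNet's 0-255 convention can be rescaled to PyTorch's 0-1 convention by dividing by 255 (a minimal sketch using the values quoted in this thread):

```python
# MatConvNet averageImage values from the official NetVLAD code (0-255 range).
matconvnet_mean = [123.6800, 116.7790, 103.9390]

# PyTorch tensors from ToTensor() are in the 0-1 range, so divide by 255.
pytorch_mean = [m / 255.0 for m in matconvnet_mean]

print([round(m, 3) for m in pytorch_mean])  # → [0.485, 0.458, 0.408]
```

The result is close to, but not exactly, torchvision's ImageNet mean [0.485, 0.456, 0.406], which matches the point below: the small gap comes from using different backbones' pretraining statistics, not from a different normalization scheme.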

The only difference is that this repo uses the normalization constants used to pretrain the backbone (https://github.com/pytorch/examples/blob/master/imagenet/main.py#L204), whereas they use the means computed for their own backbone. I don't think this difference is meaningful.