What does the following code mean? img[0,:,:]=img[0,:,:]-92.8207477031

Question

Opened this issue 5 years ago · 4 comments

img[0,:,:]=img[0,:,:]-92.8207477031
img[1,:,:]=img[1,:,:]-95.2757037428
img[2,:,:]=img[2,:,:]-104.877445883

@leeyeehoo Thank you for sharing.
When you validate CSRNet with 'val.ipynb', you use the code above. My question is: why do you subtract these specific values (92.8207477031, 95.2757037428, 104.877445883)? What do these values mean? Why can't they be other values?

Answer 1 · 2019-11-11T17:00:33.000Z

Do you understand it now? I am also confused about these values.

Answer 2 · 2020-01-11T23:32:28.000Z

You should not use those values, because the author uses pretrained weights instead of training from scratch. However, if you train from scratch, you do need those values.

Answer 3 · 2021-10-19T19:55:06.000Z

It seems he is taking the mean pixel value of each channel (R, G, B) and then subtracting it for the sake of normalization.
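
To make the idea of per-channel mean subtraction concrete, here is a minimal NumPy sketch. The random images, the array shapes, and the helper name `subtract_channel_means` are assumptions made for illustration; they are not taken from the CSRNet repository, and the means computed here are unrelated to the constants in the question.

```python
import numpy as np

# Hypothetical stand-in for a dataset: three random 3xHxW images on a 0-255 scale.
rng = np.random.default_rng(0)
images = [rng.uniform(0, 255, size=(3, 4, 5)) for _ in range(3)]

# Per-channel mean over every pixel of every image
# (all images share one size here, so stacking is valid).
channel_means = np.stack(images).mean(axis=(0, 2, 3))  # shape (3,)


def subtract_channel_means(img, means):
    """Subtract one constant per channel from a (3, H, W) array."""
    return img - np.asarray(means).reshape(3, 1, 1)


# The same operation written channel by channel, as in the question.
img = images[0].copy()
img[0, :, :] = img[0, :, :] - channel_means[0]
img[1, :, :] = img[1, :, :] - channel_means[1]
img[2, :, :] = img[2, :, :] - channel_means[2]

centered = [subtract_channel_means(im, channel_means) for im in images]
```

After this step each channel of the dataset is centered on zero, which is the usual reason for subtracting fixed per-channel constants before feeding images to a network.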

Answer 4 · 2022-04-07T08:43:17.000Z

@CheungYooo I think you should use the following code instead:

```python
import h5py
import numpy as np
from PIL import Image
from torchvision import transforms

# ImageNet mean/std, applied after ToTensor() has scaled pixels to [0, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# `model` and `img_paths` are assumed to be defined earlier in the notebook
mae = 0
for i in range(len(img_paths)):
    img = transform(Image.open(img_paths[i]).convert('RGB')).cuda()
    gt_file = h5py.File(img_paths[i].replace('.jpg', '.h5').replace('images', 'ground_truth'), 'r')
    groundtruth = np.asarray(gt_file['density'])
    output = model(img.unsqueeze(0))
    # Absolute error between the predicted count and the ground-truth count
    mae += abs(output.detach().cpu().sum().numpy() - np.sum(groundtruth))
    print(i, mae)
print(mae / len(img_paths))
```
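
For comparison with the hard-coded subtraction in the question, the following NumPy sketch mimics roughly what `ToTensor()` followed by `Normalize(mean, std)` computes: scale pixels to [0, 1], then apply `(x - mean) / std` per channel. The function name `imagenet_normalize` and the use of `float64` are assumptions for illustration (torchvision produces `float32` tensors).

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])


def imagenet_normalize(img_uint8):
    """Mimic ToTensor() + Normalize() for an (H, W, 3) uint8 RGB array."""
    x = img_uint8.astype(np.float64) / 255.0  # scale to [0, 1]
    x = x.transpose(2, 0, 1)                  # HWC -> CHW
    return (x - IMAGENET_MEAN.reshape(3, 1, 1)) / IMAGENET_STD.reshape(3, 1, 1)


# The ImageNet means expressed on a 0-255 scale, for comparison
# with the constants in the question.
means_255 = IMAGENET_MEAN * 255.0
```

On a 0-255 scale the ImageNet means come out to about 123.7, 116.3 and 103.5, which differ from the constants in the question, and this transform also divides by a standard deviation. The two preprocessing schemes are therefore not interchangeable, and a given set of weights generally expects whichever one it was trained with.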