What does the following code mean? img[0,:,:]=img[0,:,:]-92.8207477031
img[0,:,:]=img[0,:,:]-92.8207477031
img[1,:,:]=img[1,:,:]-95.2757037428
img[2,:,:]=img[2,:,:]-104.877445883
@leeyeehoo Thank you for sharing.
When you validate CSRNet with 'val.ipynb', you use the code above. My question is: why do you subtract these specific values (92.8207477031, 95.2757037428, 104.877445883)? What do these values mean? Why can't they be other values?
Do you understand it now? I am also confused by these values.
You should not use those values, because the author uses pretrained weights instead of training from scratch. If you train from scratch, however, you do need those values.
It seems he is taking the mean pixel value of each channel (R, G, B) and subtracting it from that channel for the sake of normalization.
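If that reading is right, the three constants would be per-channel means computed over a set of training images. Here is a minimal sketch of that idea with NumPy. The helper names `channel_means` and `subtract_means` and the random input data are made up for illustration; nothing here is taken from the CSRNet repo.

```python
import numpy as np

def channel_means(images):
    """Mean pixel value per channel over a list of (3, H, W) arrays."""
    total = np.zeros(3, dtype=np.float64)
    count = 0
    for img in images:
        total += img.reshape(3, -1).sum(axis=1)  # sum of each channel
        count += img.shape[1] * img.shape[2]     # pixels per channel
    return total / count

def subtract_means(img, means):
    """Same effect as img[c,:,:] = img[c,:,:] - means[c] for c in 0..2."""
    return img - np.asarray(means, dtype=img.dtype).reshape(3, 1, 1)

# Tiny synthetic "dataset" on a 0-255 scale (hypothetical data)
rng = np.random.default_rng(0)
images = [rng.uniform(0, 255, size=(3, 4, 5)) for _ in range(3)]

means = channel_means(images)
centered = [subtract_means(img, means) for img in images]
```

After the subtraction, each channel of the dataset is centered around zero, which is the usual motivation for this kind of normalization.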
@CheungYooo I think you should use the following code instead:
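```python
import h5py
import numpy as np
import torch
from PIL import Image
from torchvision import transforms

# ToTensor() scales pixels to [0, 1]; Normalize() then applies the ImageNet mean/std
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# `model` and `img_paths` are assumed to be defined already, as in val.ipynb
mae = 0
with torch.no_grad():  # no gradients are needed for validation
    for i in range(len(img_paths)):
        img = transform(Image.open(img_paths[i]).convert('RGB')).cuda()
        gt_file = h5py.File(img_paths[i].replace('.jpg', '.h5').replace('images', 'ground_truth'), 'r')
        groundtruth = np.asarray(gt_file['density'])
        output = model(img.unsqueeze(0))  # add a batch dimension
        # absolute error between predicted and ground-truth counts
        mae += abs(output.cpu().sum().item() - np.sum(groundtruth))
        print(i, mae)
print(mae / len(img_paths))
```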
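Note that the two preprocessing schemes in this thread are not interchangeable, so the one used at validation time should match the one the weights were trained with. The sketch below compares them on a single pixel. It assumes the subtraction in the question is applied to pixels on a 0-255 scale, which the size of the constants suggests but the thread does not confirm, and the input pixel is a made-up example.

```python
import numpy as np

# One mid-gray RGB pixel with 8-bit values (hypothetical example input)
pixel = np.array([128.0, 128.0, 128.0])

# Scheme A: subtract per-channel means on the 0-255 scale (constants from the question)
means_255 = np.array([92.8207477031, 95.2757037428, 104.877445883])
scheme_a = pixel - means_255

# Scheme B: scale to [0, 1], subtract the mean, divide by the std (constants from the reply)
mean_01 = np.array([0.485, 0.456, 0.406])
std_01 = np.array([0.229, 0.224, 0.225])
scheme_b = (pixel / 255.0 - mean_01) / std_01

print("mean subtraction on 0-255:", scheme_a)
print("ToTensor + Normalize:     ", scheme_b)
```

The same pixel ends up at very different magnitudes under the two schemes, so a network trained with one is likely to produce poor counts when fed inputs prepared with the other.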