Data scaling vs mean subtraction for data normalization
Closed this issue · 2 comments
Hi @yuanyuanli85 ,
I hope you don't mind a question about how your code normalizes the data. The `normalize` function performs two operations: (A) scaling to [0, 1] and (B) per-channel mean subtraction.
```python
def get_color_mean(self):
    # per-channel RGB means of the training set, in [0, 1]
    mean = np.array([0.4404, 0.4440, 0.4327], dtype=np.float64)
    return mean

def normalize(imgdata, color_mean):
    '''
    :param imgdata: image with values in 0 ~ 255
    :param color_mean: per-channel mean to subtract, in [0, 1]
    :return: image scaled to [0, 1], then mean-subtracted per channel
    '''
    imgdata = imgdata / 255.0
    for i in range(imgdata.shape[-1]):
        imgdata[:, :, i] -= color_mean[i]
    return imgdata
```
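To make the effect concrete, here is a minimal, self-contained sketch of the same two steps applied to a dummy image (the 4×4 image and variable names are hypothetical, for illustration only):

```python
import numpy as np

def normalize(imgdata, color_mean):
    """Scale a 0..255 image to [0, 1], then subtract per-channel means."""
    imgdata = imgdata / 255.0  # integer / float -> float array in [0, 1]
    for i in range(imgdata.shape[-1]):
        imgdata[:, :, i] -= color_mean[i]
    return imgdata

# hypothetical 4x4 RGB image
img = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)
mean = np.array([0.4404, 0.4440, 0.4327])
out = normalize(img, mean)

# channel c now lies in [-mean[c], 1 - mean[c]], i.e. roughly [-0.44, 0.56]
assert out.min() >= -mean.max()
assert out.max() <= 1.0 - mean.min()
```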
Normalizing the input data to [0, 1] helps avoid vanishing/exploding gradients and speeds up training. But why are you shifting your images into the range [-0.44, 0.56]? Is this something that improves your training regime?
Thanks a lot
Cheers
This is a common trick in CNN training. Training can benefit from inputs that take both negative and positive values. Usually, [-0.5, 0.5] works better than [0.0, 1.0].
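For a dataset without precomputed channel means, the simpler centering mentioned above can be sketched like this (`normalize_centered` is a hypothetical name, not from the repository):

```python
import numpy as np

def normalize_centered(imgdata):
    """Map a 0..255 image to [-0.5, 0.5] by scaling and shifting."""
    return imgdata / 255.0 - 0.5

# hypothetical 4x4 RGB image
img = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)
out = normalize_centered(img)

# all values now lie in [-0.5, 0.5]
assert out.min() >= -0.5
assert out.max() <= 0.5
```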
I see, you are centering the input distribution around 0. I was confused by the subtraction of the dataset's mean colors; I thought there was something special about those particular values. Since I have a different dataset, I will normalize to [-0.5, 0.5].
Thanks a lot