Questions about normalizing images in training.
zjlww opened this issue · 2 comments
I realized that you normalize the images during training. Is this common in training and evaluating image generative models? I don't think other models (LDM, DDPM, etc.) do this during training. Is the FID comparison in your paper still fair with this normalization in place?
```python
from torchvision import transforms

train_transform = transforms.Compose(
    [
        transforms.Resize(args.image_size),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),  # converts a PIL image to a tensor in [0, 1]
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # maps [0, 1] to [-1, 1]
    ]
)
```
Hi,
During the training stage, we normalize input images to the range [-1, 1]. This is standard practice in image generation. If you double-check the LDM code, it uses taming.data.imagenet.ImagePaths
(see https://github.com/CompVis/taming-transformers/blob/3ba01b241669f5ade541ce990f7650a3b8f65318/taming/data/base.py#L51) to load input images, which also normalizes them to [-1, 1].
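For reference, a minimal sketch of what that loader's normalization amounts to (not the repo's exact code): a uint8 image in [0, 255] is mapped linearly to [-1, 1].

```python
import numpy as np

# Sketch of the [-1, 1] normalization applied by taming's image loader:
# uint8 pixels in [0, 255] are linearly mapped via x / 127.5 - 1.
def normalize_to_signed_range(image_uint8: np.ndarray) -> np.ndarray:
    return (image_uint8 / 127.5 - 1.0).astype(np.float32)

pixels = np.array([0, 255], dtype=np.uint8)
normalize_to_signed_range(pixels)  # array([-1., 1.], dtype=float32)
```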
In the sampling stage, however, we rescale the output images back to the range [0, 1] before saving. You can check it out here.
Therefore, the FID comparison remains fair.
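The rescaling step above is typically just the inverse linear map from [-1, 1] to [0, 1]; a hedged sketch (not the repo's exact code, and the clipping is an assumption to guard against slight overshoot in model outputs):

```python
import numpy as np

# Map model outputs from [-1, 1] back to [0, 1] before saving,
# clipping values that fall slightly outside the expected range.
def to_unit_range(x: np.ndarray) -> np.ndarray:
    return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

samples = np.array([-1.2, -1.0, 0.0, 1.0])
to_unit_range(samples)  # array([0. , 0. , 0.5, 1. ])
```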
Thanks.
Thank you so much for the prompt and detailed reply!