abhiskk/fast-neural-style

Query Regarding Transforms used for input image and style image

ssundar6087 opened this issue · 5 comments

What is the purpose of this line in your transforms block for the input image:

transforms.Lambda(lambda x: x.mul(255))

Aren't the input and style images already in the range 0-255? Wouldn't this cause an overflow?

transforms.ToTensor() converts the image to the 0-1 range; this line scales it back up to 0-255. I would suggest running the code or reading the documentation for such doubts; opening an issue for them is not appropriate.
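
For reference, here is a minimal sketch (the synthetic image below is illustrative, not from the repo) showing what each step does to the value range, and why there is no overflow:

import numpy as np
from PIL import Image
from torchvision import transforms

# Build a synthetic 8-bit RGB image whose values span the full 0-255 range.
arr = np.arange(256, dtype=np.uint8).reshape(16, 16)
img = Image.fromarray(np.stack([arr, arr, arr], axis=-1))

t = transforms.ToTensor()(img)          # float tensor scaled to [0.0, 1.0]
print(t.min().item(), t.max().item())   # 0.0 1.0

restored = t.mul(255)                   # floats back in [0.0, 255.0], so no overflow
print(restored.min().item(), restored.max().item())  # 0.0 255.0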

@abhiskk I apologize if this came across as a trivial question to you. My question was specifically regarding the training script.

There you use the PyTorch-provided datasets.ImageFolder as follows:
transform = transforms.Compose([
    transforms.Resize(args.image_size),
    transforms.CenterCrop(args.image_size),
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x.mul(255))
])
train_dataset = datasets.ImageFolder(args.dataset, transform)
train_loader = DataLoader(train_dataset, batch_size=args.batch_size)

From my limited understanding, ImageFolder doesn't normalize the input to floats in the 0-1 range. The other transforms, such as Resize, CenterCrop, and ToTensor, shouldn't change the data range either, so my thought was that you don't need the x.mul(255) on the input. If I am mistaken, please correct me.

Yes, transforms.ToTensor() converts the image to the 0-1 range, hence the multiply is needed to bring it back to 0-255.
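
To see this end to end, one can inspect a batch produced by a loader built with the same transform; a quick check along these lines (the dataset path and image size below are placeholders, not the repo's defaults):

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(256),
    transforms.ToTensor(),                    # uint8 [0, 255] -> float [0.0, 1.0]
    transforms.Lambda(lambda x: x.mul(255)),  # float [0.0, 1.0] -> float [0.0, 255.0]
])
# ImageFolder expects a root directory containing one subfolder per class.
dataset = datasets.ImageFolder("path/to/dataset", transform)
loader = DataLoader(dataset, batch_size=4)

batch, _ = next(iter(loader))
print(batch.dtype, batch.min().item(), batch.max().item())
# torch.float32 with values in roughly [0.0, 255.0]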

Thanks for clarifying. Apologies once again.

No worries, happy to help 👍