lucidrains/byol-pytorch

Error thrown when retrieving an embedding for a single image at inference

MimiCheng opened this issue · 5 comments

Steps to reproduce:


model = BYOL(
    resnet,
    image_size = 256,
    hidden_layer = 'avgpool'
)

imgs = torch.randn(1, 3, 256, 256)
projection, embedding = model(imgs, return_embedding = True)

Error found:

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in _verify_batch_size(size)
   2245         size_prods *= size[i + 2]
   2246     if size_prods == 1:
-> 2247         raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
   2248 
   2249 

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 4096])

@MimiCheng Could you paste the full script? Which resnet are you using?

Thanks for your prompt response @lucidrains! It's resnet-50

Here is the full script

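import torch
from torchvision import models
from byol_pytorch import BYOL

# Pretrained ResNet-50 backbone
resnet = models.resnet50(pretrained=True)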
model = BYOL(
    resnet,
    image_size = 256,
    hidden_layer = 'avgpool'
)

imgs = torch.randn(1, 3, 256, 256)
projection, embedding = model(imgs, return_embedding = True)

@MimiCheng Oh, I see why. There is a batch norm in the projection layer, and it requires the batch size to be greater than 1 during training. I've updated the library to give a more informative error message in 0.5.7.

For your script, try changing to imgs = torch.randn(2, 3, 256, 256)
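
The failure can be reproduced without the library. Below is a minimal sketch, assuming a projection head shaped like Linear -> BatchNorm1d -> ReLU -> Linear. The layer sizes are illustrative (the hidden width is chosen to match the 4096 in the error message above), and `projector` and `try_forward` are hypothetical stand-ins, not the library's own code:

```python
import torch
from torch import nn

# Hypothetical stand-in for a projection head that contains a batch norm.
projector = nn.Sequential(
    nn.Linear(2048, 4096),
    nn.BatchNorm1d(4096),
    nn.ReLU(inplace=True),
    nn.Linear(4096, 256),
)

def try_forward(batch_size):
    """Run one forward pass in training mode.

    Returns the error message if the pass fails, or None if it succeeds.
    """
    projector.train()
    try:
        projector(torch.randn(batch_size, 2048))
    except ValueError as err:
        return str(err)
    return None

# In training mode, batch norm needs more than one value per channel.
single_error = try_forward(1)
# Two samples are enough to compute batch statistics.
pair_error = try_forward(2)

print(single_error)
print(pair_error)
```

With a batch of one, the forward pass should fail with the same "Expected more than 1 value per channel when training" message as in the traceback. With a batch of two it should go through.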

@lucidrains thanks for your explanation. It works with imgs = torch.randn(2, 3, 256, 256). Just wondering, is there a way to retrieve the embedding for a single new image after training? I would like to use that embedding for inference. Should I modify the code to skip the projection layer to make it work? Thanks!

@MimiCheng yup, all you have to do is call model.eval() first, and then it should work!
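
In eval mode, batch norm normalizes with its stored running statistics instead of per-batch statistics, so a batch of one is fine. Below is a minimal sketch of the single-image path, assuming the trained model is an ordinary `torch.nn.Module` containing a batch norm. The `embed_single` helper and the small `encoder` are hypothetical stand-ins for illustration, not part of byol-pytorch. With the real model, the call from the script above (`model(imgs, return_embedding = True)`) would take the place of `model(image)`.

```python
import torch
from torch import nn

def embed_single(model, image):
    """Run a forward pass in eval mode, then restore the previous mode."""
    was_training = model.training
    model.eval()                   # batch norm switches to running statistics
    try:
        with torch.no_grad():      # inference only, so skip gradient tracking
            return model(image)
    finally:
        model.train(was_training)  # leave the model as we found it

# Hypothetical stand-in encoder with a batch norm in the middle.
encoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 128),
    nn.BatchNorm1d(128),
    nn.ReLU(),
    nn.Linear(128, 64),
)

image = torch.randn(1, 3, 32, 32)  # a single image, batch size 1
embedding = embed_single(encoder, image)
print(embedding.shape)
```

Restoring the previous mode in `finally` is a design choice for code that alternates between training and inference. For a model used only for inference, a single `model.eval()` call after loading is enough.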