different resolution between train and inference

Question


d-zhou12 opened this issue a year ago · 4 comments

Thanks for your nice work. Is it possible to train FasterViT at a low resolution and run inference at a higher resolution? I created the any_res FasterViT model with create_model at resolution 256x256; running inference at a different resolution, 512x512, fails with this error: RuntimeError: shape '[-1, 2, 2, 4, 4, 512]' is invalid for input of size 73728
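The numbers in the error message can be checked by hand. The snippet below is only a sketch of that arithmetic, not FasterViT code: it assumes the trailing 512 is the channel dimension, that the batch size is 1, and that the feature map is square. Under those assumptions the tensor holds 12x12 positions, while the requested view needs a multiple of 2*2*4*4 = 64 positions, so the `-1` dimension cannot be inferred.

```python
import math

# Values copied from the error message in the question.
target_shape = [-1, 2, 2, 4, 4, 512]
numel = 73728

# Product of the fixed dimensions; numel must be a multiple of this
# for the -1 dimension to be inferred.
fixed = math.prod(d for d in target_shape if d != -1)
remainder = numel % fixed

# Assumption: the trailing 512 is the channel dimension and batch size is 1,
# so numel / 512 is the number of spatial positions in the feature map.
positions = numel // 512
side = math.isqrt(positions)  # side length if the feature map is square

print(f"fixed dims product: {fixed}, remainder: {remainder}")
print(f"spatial positions: {positions} (about {side}x{side})")
```

A non-zero remainder is exactly what makes the reshape invalid: the feature map reaching that layer does not tile evenly into the windows the view expects.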

Answer 1 · 2023-06-21T07:02:42.000Z

Hi @d-zhou12, what window sizes are being used? Ideally, the height and width should be divisible by the window size in both cases.
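A minimal sketch of the divisibility check described above. The helper names `is_divisible` and `pad_to_multiple` are hypothetical and not part of FasterViT or timm. Note that the check applies to the feature-map size that reaches a windowed stage, which depends on the model's downsampling and is not derived here; the sizes used below are illustrative only.

```python
def is_divisible(height: int, width: int, window_size: int) -> bool:
    """Return True if a height x width feature map tiles evenly into windows."""
    return height % window_size == 0 and width % window_size == 0


def pad_to_multiple(size: int, window_size: int) -> int:
    """Round size up to the nearest multiple of window_size."""
    return -(-size // window_size) * window_size  # ceiling division


# Illustrative feature-map sizes (assumed, not taken from the model).
ok_case = is_divisible(16, 16, 8)
bad_case = is_divisible(12, 12, 8)
padded = pad_to_multiple(12, 8)

print(ok_case, bad_case, padded)
```

If a stage's feature map fails the check, padding it up to the next multiple (or choosing an input resolution that yields a divisible feature map) is one common way to make window partitioning valid.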

Answer 2 · 2023-06-21T07:25:16.000Z

I use the default window sizes [7, 7, 12, 6] from readme.md. I also tried [8, 8, 8, 8], but it still fails for 256x256 / 512x512.

Answer 3 · 2023-06-21T14:03:50.000Z

Hi @d-zhou12, could you please provide the log (perhaps for the second case, where you use the same window size) and also confirm your timm and torchvision versions?

Thanks

Answer 4 · 2023-06-28T16:32:11.000Z

Closing this issue for now until logs can be provided.