RuntimeError: shape '[64, 49, 3, 3, 21]' is invalid for input of size 602112
Riyone opened this issue · 2 comments
Riyone commented
Hello, I am trying to use only the models in "backbones/davit.py", with torch.randn((1, 3, 224, 224)) as a test image input. When execution reaches "qkv = self.qkv(x).reshape(B_, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)" (in class WindowAttention), I get the error above.
Is there any requirement for the image format, or is there anything else I missed?
Riyone commented
Alright, I guess it's because the initialized embed_dims is not evenly divisible by num_heads.
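For anyone hitting the same error, here is a small sketch of the arithmetic that supports this guess. It uses only the numbers from the error message. The variable names are mine, and the mapping of the shape to (B_, N, 3, num_heads, C // num_heads) is taken from the reshape call quoted above.

```python
# Numbers copied from the error message in this issue.
target_shape = (64, 49, 3, 3, 21)  # (B_, N, 3, num_heads, C // num_heads)
input_size = 602112                # element count of self.qkv(x)

B_, N, _, num_heads, head_dim = target_shape

# The last dimension of self.qkv(x) is 3 * C, so C can be recovered from the size.
qkv_width = input_size // (B_ * N)
C = qkv_width // 3

print(qkv_width)                         # 192
print(C, num_heads, C % num_heads)       # 64 3 1
print(3 * num_heads * (C // num_heads))  # 189, which does not match 192
```

So the embedding dim at that stage appears to be 64, and 64 // 3 rounds down to 21, which leaves 3 * 3 * 21 = 189 elements per token instead of the 192 that the reshape has to account for.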
dc250601 commented
We have to initialize embed_dims as (96, 192, 384, 768); this is the default mentioned in the paper. I don't know why the code has it initialized differently. Even the Swin paper uses the same config, so it feels more natural.
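A quick way to sanity-check a config before building the model is sketched below. This is not part of the repo: check_heads is a hypothetical helper, and the head counts (3, 6, 12, 24) are an assumption based on the Swin-T-style setup (only the first-stage value of 3 is visible in the error message).

```python
def check_heads(embed_dims, num_heads):
    """Return (stage, embed_dim, heads) for every stage where
    embed_dim is not evenly divisible by the number of heads."""
    return [
        (stage, dim, heads)
        for stage, (dim, heads) in enumerate(zip(embed_dims, num_heads))
        if dim % heads != 0
    ]

# Assumption: Swin-T-style head counts; adjust to match your own config.
num_heads = (3, 6, 12, 24)

# The config suggested above passes the check.
print(check_heads((96, 192, 384, 768), num_heads))  # []

# A first-stage dim of 64 (as inferred from the error) fails with 3 heads.
print(check_heads((64,), (3,)))  # [(0, 64, 3)]
```

An empty list means every stage can be split evenly across its heads, which is what the reshape in WindowAttention needs.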