dingmyu/davit

RuntimeError: shape '[64, 49, 3, 3, 21]' is invalid for input of size 602112

Riyone opened this issue · 2 comments

Hello, I'm trying to use only the models in "backbones/davit.py", with torch.randn((1, 3, 224, 224)) as a test image input. When execution reaches "qkv = self.qkv(x).reshape(B_, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)" (class WindowAttention), I get the error in the title.

Is there any requirement for the image format, or is there anything else I missed?

Alright, I guess it's because the initialized embed_dims is not evenly divisible by num_heads.
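The numbers in the error message seem to back this up. Here is a rough arithmetic check in plain Python (no torch needed). It assumes self.qkv maps C channels to 3 * C, as is usual for this kind of window attention; that layer definition is not shown in this thread, so treat it as an assumption.

```python
# Numbers taken from the error message in this issue.
B_, N, num_heads = 64, 49, 3
numel = 602112  # number of elements in the output of self.qkv(x)

# Assumption: self.qkv maps C -> 3 * C, so C can be recovered from the size.
C = numel // (B_ * N * 3)

# The reshape uses floor division, which silently drops the remainder.
head_dim = C // num_heads
target_numel = B_ * N * 3 * num_heads * head_dim

print(C, head_dim, target_numel)  # 64 21 592704

# The target shape holds fewer elements than the tensor has, so reshape fails.
assert target_numel != numel
```

So the first stage is running with C = 64, and 64 // 3 = 21 leaves one channel per q/k/v unaccounted for.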

We have to initialize embed_dims as (96, 192, 384, 768); this is the default mentioned in the paper. I don't know why the code has it initialized differently. Even the Swin paper uses the same config, so it feels more natural.
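A quick way to check a config before building the model is sketched below. The helper name indivisible_stages is made up for illustration and is not part of the repo. The num_heads tuple is an assumption (only the first value, 3, is confirmed by the error message), and the second embed_dims tuple is illustrative (only its first value, 64, is inferred from the error).

```python
def indivisible_stages(embed_dims, num_heads):
    """Return indices of stages whose width is not a multiple of its head count."""
    return [
        i
        for i, (dim, heads) in enumerate(zip(embed_dims, num_heads))
        if dim % heads != 0
    ]


# Assumed per-stage head counts; only the first value (3) appears in the error.
num_heads = (3, 6, 12, 24)

# Config from the paper, as quoted in this thread.
paper_dims = (96, 192, 384, 768)

# Illustrative failing config; only the leading 64 is inferred from the error.
failing_dims = (64, 128, 192, 256)

print(indivisible_stages(paper_dims, num_heads))    # []
print(indivisible_stages(failing_dims, num_heads))  # [0, 1, 3]
```

An empty list means every stage can be split evenly across its heads, so the reshape in WindowAttention should go through.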