NVlabs/MambaVision

I encountered a problem when running the code.

Closed this issue · 0 comments

Thank you for your excellent work! When I run the code with a 512×512 image as input, the last two layers of the network both return feature maps of shape 1024×16×16. Why is that? Shouldn't they theoretically be 1024×16×16 and 2048×8×8, respectively?
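
The shape arithmetic behind this question can be sketched as follows. This is only an illustration, not the repository's code: the helper `stage_output_shapes`, the base width of 128, the stem stride of 4, and the four-stage layout are all assumptions chosen so that the numbers match the ones reported above. The `downsample_last` flag is likewise hypothetical; it shows that the observed shapes would be consistent with a final stage that has no downsampling step, which is not confirmed here.

```python
def stage_output_shapes(input_size=512, base_dim=128, stem_stride=4,
                        num_stages=4, downsample_last=True):
    """Return (channels, height, width) after each stage.

    Assumed layout (hypothetical, for illustration only): a stem reduces the
    resolution by `stem_stride`, then every stage ends with a downsampling
    step that doubles the channels and halves the spatial size. When
    `downsample_last` is False, the final stage skips that step.
    """
    channels = base_dim
    size = input_size // stem_stride
    shapes = []
    for i in range(num_stages):
        is_last = i == num_stages - 1
        if not is_last or downsample_last:
            channels *= 2   # downsampling doubles the channel count
            size //= 2      # and halves height and width
        shapes.append((channels, size, size))
    return shapes


# Expectation in the question: every stage downsamples.
expected = stage_output_shapes(downsample_last=True)
# Observation in the question: consistent with no downsampling in the last stage.
observed = stage_output_shapes(downsample_last=False)

print(expected[-2:])
print(observed[-2:])
```

Under these assumptions, the first call ends with 1024×16×16 followed by 2048×8×8, while the second ends with 1024×16×16 twice, matching what the network returned.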