Changing model input size from 384 -> 1024

Question

tdchua opened this issue 2 years ago · 2 comments

I'd like to know which layers to change if I want to input an image of size 1024 instead of 384. I'd also like to know whether there are any additional concerns about using this model with a much larger input. Thanks.

Answer 1 · 2022-10-02T17:12:51.000Z

@tdchua You probably need to interpolate the relative position bias table in each attention layer. However, as pointed out in SwinV2, large inputs may degrade the performance of relative position encoding, so it may be necessary to adopt a different position encoding technique, such as Log-CPB from SwinV2. In addition, a much larger input leads to much higher memory consumption, so you will need GPUs with more memory.
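As a rough illustration of the interpolation step, here is a minimal, framework-free sketch. It assumes a Swin-style layout that the thread does not confirm for this model: each attention layer stores a bias table with `(2*W - 1)**2` rows (one per relative offset, row-major) and one column per head, where `W` is the window (or feature-map) side length. The function name `resize_bias_table` and its parameters are hypothetical, and bilinear interpolation is used here only for simplicity.

```python
def resize_bias_table(table, old_window, new_window):
    """Bilinearly resize a relative position bias table (illustrative sketch).

    table: list of (2*old_window - 1)**2 rows, each a list of per-head biases.
           ASSUMPTION: Swin-style layout; check your model's actual shapes.
    Returns a list of (2*new_window - 1)**2 rows with the same number of heads.
    """
    old_side = 2 * old_window - 1
    new_side = 2 * new_window - 1
    if len(table) != old_side * old_side:
        raise ValueError("table length does not match old_window")
    num_heads = len(table[0])

    def source_coord(i):
        # Map an output index to a fractional input index with the endpoints
        # aligned, so the largest relative offsets map onto each other.
        if new_side == 1:
            return 0.0
        return i * (old_side - 1) / (new_side - 1)

    resized = []
    for row in range(new_side):
        y = source_coord(row)
        y0 = int(y)
        y1 = min(y0 + 1, old_side - 1)
        wy = y - y0
        for col in range(new_side):
            x = source_coord(col)
            x0 = int(x)
            x1 = min(x0 + 1, old_side - 1)
            wx = x - x0
            biases = []
            for h in range(num_heads):
                # Blend along x on the two neighbouring rows, then along y.
                top = (1 - wx) * table[y0 * old_side + x0][h] \
                    + wx * table[y0 * old_side + x1][h]
                bottom = (1 - wx) * table[y1 * old_side + x0][h] \
                    + wx * table[y1 * old_side + x1][h]
                biases.append((1 - wy) * top + wy * bottom)
            resized.append(biases)
    return resized


# Example: grow a 2-head table from window size 2 (3x3 offsets)
# to window size 3 (5x5 offsets).
old_table = [[float(r + c), 10.0 * r] for r in range(3) for c in range(3)]
new_table = resize_bias_table(old_table, old_window=2, new_window=3)
```

In PyTorch, the same resize is commonly done by reshaping the table to `(1, num_heads, side, side)` and calling `torch.nn.functional.interpolate`, often with `mode="bicubic"`. Any buffers derived from the window size, such as a relative position index, would also need to be rebuilt for the new size.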

Answer 2 · 2022-10-03T02:55:04.000Z

I see, thank you very much for taking the time to reply. 😄