Changing model input size from 384 -> 1024
tdchua opened this issue · 2 comments
tdchua commented
I'd like to know which layers to change if I want to feed the model images of size 1024 instead of 384. I'd also like to know whether there are any additional concerns about using this model with a much larger input. Thanks.
Vladimir2506 commented
@tdchua You will probably need to interpolate the relative position bias table in each attention layer so that it matches the new feature-map size. However, as pointed out in SwinV2, relative position bias tends to lose accuracy when it is transferred to a much larger input, so it may be necessary to adopt another position encoding technique, such as the log-spaced continuous position bias (Log-CPB) from SwinV2. In addition, a much larger input leads to far higher memory consumption, so you will need GPUs with more memory.
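As a rough illustration of the interpolation step, here is a minimal sketch in PyTorch. It is not taken from this repository: the function name `resize_rel_pos_bias` is hypothetical, and it assumes each bias table is stored as a tensor of shape `(num_heads, 2*H-1, 2*W-1)`, where `H x W` is the feature-map size of that stage. If your checkpoint stores the table in another layout (for example the flattened Swin-style `((2*H-1)*(2*W-1), num_heads)`), reshape it first. The 24 and 64 in the example assume a stage with total stride 16 (384 / 16 = 24, 1024 / 16 = 64).

```python
import torch
import torch.nn.functional as F


def resize_rel_pos_bias(table: torch.Tensor, new_h: int, new_w: int) -> torch.Tensor:
    """Resize a relative position bias table to a new feature-map size.

    Assumption (not confirmed by the repository): `table` has shape
    (num_heads, 2*H-1, 2*W-1) for an H x W feature map.
    """
    if table.dim() != 3:
        raise ValueError("expected shape (num_heads, 2*H-1, 2*W-1)")
    target = (2 * new_h - 1, 2 * new_w - 1)
    # F.interpolate expects (N, C, H, W), so add a batch axis and treat
    # the heads as channels; each head is resized independently.
    resized = F.interpolate(
        table.unsqueeze(0), size=target, mode="bicubic", align_corners=False
    )
    return resized.squeeze(0)


# Hypothetical example: 8 heads, feature map grows from 24x24 to 64x64.
old_table = torch.randn(8, 2 * 24 - 1, 2 * 24 - 1)
new_table = resize_rel_pos_bias(old_table, 64, 64)
print(tuple(new_table.shape))
```

One way to use this would be to apply it to every bias table in the pretrained state dict before loading it into a model built for the 1024 input, and then fine-tune. Keep in mind the caveat above: plain interpolation may not transfer well to a much larger size.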
tdchua commented
I see, thank you very much for taking the time to reply. 😄