How is the dimension of pos_embed handled when the image size is increased to 420x420 in the pre-training phase?

Question

yutaidong opened this issue a year ago · 2 comments

Answer 1 · 2023-08-24T09:28:28.000Z

Or is the model simply replaced with a different visual encoder?

Answer 2 · 2023-08-24T11:31:00.000Z

You can refer to the function interpolate_pos_embed in models/vits/eva_vit.py, which interpolates the position embeddings to the new resolution.
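
For anyone landing here later, this is a rough sketch of how this kind of position-embedding interpolation usually works in ViT code. It is not copied from the repository: the helper name resize_pos_embed, the patch size of 14, the 224x224 starting resolution, the single extra [CLS] token, and the bicubic mode are all assumptions for illustration, so check them against the actual interpolate_pos_embed before relying on it.

```python
import torch
import torch.nn.functional as F


def resize_pos_embed(pos_embed, new_grid, num_extra_tokens=1):
    """Resize ViT position embeddings of shape (1, extra + old_grid**2, dim).

    Hypothetical helper, not the repository's function.
    """
    # Extra tokens (assumed: one [CLS] token) have no 2D position, keep as-is.
    extra = pos_embed[:, :num_extra_tokens]
    patches = pos_embed[:, num_extra_tokens:]
    dim = pos_embed.shape[-1]

    old_grid = int(round(patches.shape[1] ** 0.5))
    if old_grid * old_grid != patches.shape[1]:
        raise ValueError("patch position embeddings do not form a square grid")
    if old_grid == new_grid:
        return pos_embed

    # (1, N, dim) -> (1, dim, H, W) so the embeddings can be resized like an image.
    patches = patches.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patches = F.interpolate(
        patches, size=(new_grid, new_grid), mode="bicubic", align_corners=False
    )
    # Back to (1, new_grid**2, dim) and re-attach the extra tokens.
    patches = patches.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([extra, patches], dim=1)


# Assumed setup: pretrained at 224x224 with patch size 14 -> 16x16 grid.
old = torch.randn(1, 1 + 16 * 16, 64)
# 420x420 with patch size 14 -> 30x30 grid.
new = resize_pos_embed(old, new_grid=420 // 14)
print(new.shape)  # torch.Size([1, 901, 64])
```

With these assumed numbers the sequence length goes from 1 + 256 = 257 to 1 + 900 = 901, so the same pretrained encoder can be reused at the larger resolution without swapping in a different one.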