Figure 1
warm345 opened this issue · 3 comments
Hello, thank you very much for doing this excellent work.
I would like to know how Figure 1 in the paper was drawn. If possible, I would like to get the code or source files used to draw this figure. Thank you!
Thanks a lot!
I think it was made with draw.io, but I'm afraid I'm not sure we kept the files. The best option is probably to remake it in TikZ or something similar...
Hello, I have another question. I noticed that you used a visual encoder based on a Vision Transformer with a patch size of 14 x 14 pixels. We usually resize the original image to, for example, 336 x 336. So how does Figure 1 in your paper show such fine-grained image attention?
Hey, so we're using PaliGemma-448, which resizes images to 448 x 448 resolution and splits them into 1024 image patches (a 32 x 32 grid of 14 x 14 pixel patches).
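
For anyone checking the numbers, here is a minimal sketch of the patch arithmetic from the reply above, plus one possible way to expand per-patch scores to pixel resolution for an overlay. The helper names `patch_grid` and `upsample_patch_scores` are hypothetical, the scores are random placeholders, and the nearest-neighbour upsampling is only an assumption about how such a heatmap could be drawn, not necessarily how Figure 1 was made.

```python
import numpy as np


def patch_grid(image_size: int, patch_size: int = 14) -> tuple[int, int]:
    """Return (patches_per_side, total_patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


def upsample_patch_scores(scores: np.ndarray, patch_size: int = 14) -> np.ndarray:
    """Repeat each per-patch score over its patch_size x patch_size pixel block."""
    block = np.ones((patch_size, patch_size), dtype=scores.dtype)
    return np.kron(scores, block)


print(patch_grid(448))  # (32, 1024)
print(patch_grid(336))  # (24, 576)

# Placeholder scores standing in for a real 32 x 32 attention/similarity map.
scores = np.random.default_rng(0).random((32, 32))
heatmap = upsample_patch_scores(scores)
print(heatmap.shape)  # (448, 448)
```

At 448 x 448 the grid has 1024 cells versus 576 at 336 x 336, which is why the map looks finer than the usual 336-pixel setup would give.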