About patch
IItaly opened this issue · 1 comment
IItaly commented
Thanks for your work @davide-coccomini
In the figure in the paper, the patch sizes shown for the cross and non-cross architectures look inconsistent. In the cross architecture, the patches seem to be larger than 7 * 7, or 56 * 56, whereas in the non-cross architecture the patch size is 7 * 7. But in the cross architecture, aren't the feature patches generated by the S-branch also 7 * 7?
davide-coccomini commented
Hi @IItaly, in the Efficient Vision Transformer we have only one branch, with a patch size of 7x7. In the Cross Efficient Vision Transformer, we use a patch size of 7x7 for the S-Branch and 56x56 for the L-Branch.
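To make the difference between the two branches concrete, here is a minimal sketch of how many non-overlapping patches each patch size produces. The 224x224 input size and the `num_patches` helper are assumptions for illustration only, not taken from the repository, and this thread does not state whether the patches are cut from raw pixels or from a backbone feature map.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Count non-overlapping square patches in a square input."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


# Assumed input resolution (hypothetical; not stated in this thread).
IMAGE_SIZE = 224

# Patch sizes quoted in the reply above.
s_branch_tokens = num_patches(IMAGE_SIZE, 7)   # many small patches
l_branch_tokens = num_patches(IMAGE_SIZE, 56)  # few large patches

print(f"S-Branch (7x7):   {s_branch_tokens} patches")
print(f"L-Branch (56x56): {l_branch_tokens} patches")
```

Under that assumption, the S-Branch sees 1024 small patches and the L-Branch sees 16 large ones, which would explain why the patches drawn for the cross architecture look larger than 7x7 in the figure.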