Question about the shape of semantic feature map
Gloryseven opened this issue · 1 comment
Gloryseven commented
Hello! The size of the DINOv2 feature map is 'patch_h, patch_w', but the size of the mask image is 'H, W'. In the interpolation section of the paper, both are written with the same size ('H, W'). How is this mismatch handled in the code?
WangYixuan12 commented
During interpolation, each 3D point is projected into 2D image space and its coordinates are normalized to 0~1. Because the same normalized coordinates can index a map of any resolution, it does not matter if H is not equal to patch_h. More details can be found at https://pytorch.org/docs/stable/generated/torch.nn.functional.grid_sample.html
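
To illustrate the point, here is a minimal sketch (not the repository's actual code) that samples a low-resolution feature map and a full-resolution mask at the same projected pixel locations. The helper names `sample_at_pixels` and `make_x_ramp`, the image size, and the patch size of 14 are assumptions made for this example. Note that `grid_sample` itself expects coordinates in [-1, 1], so the 0~1 values are rescaled before the call.

```python
import torch
import torch.nn.functional as F


def sample_at_pixels(img, uv, H, W):
    """Bilinearly sample img (1, C, h, w) at pixel coords uv (N, 2), given in an H x W image.

    Hypothetical helper for illustration; h and w need not equal H and W.
    """
    # Normalize (u, v) pixel coordinates to 0~1 using the full image size.
    uv01 = uv / torch.tensor([W, H], dtype=uv.dtype)
    # grid_sample expects (x, y) in [-1, 1], shaped (batch, H_out, W_out, 2).
    grid = (uv01 * 2.0 - 1.0).view(1, 1, -1, 2)
    out = F.grid_sample(img, grid, mode="bilinear", align_corners=False)
    return out[0, :, 0, :].T  # (N, C)


def make_x_ramp(h, w):
    """Map whose value at each pixel is the normalized x of that pixel's center."""
    xs = (torch.arange(w) + 0.5) / w
    return xs.repeat(h, 1).view(1, 1, h, w)


H, W = 224, 308                      # assumed image / mask resolution
patch_h, patch_w = H // 14, W // 14  # assumed patch size of 14 -> 16 x 22

# Projected 2D points in pixel units of the H x W image (u = x, v = y).
uv = torch.tensor([[30.0, 40.0], [154.0, 112.0], [250.5, 200.0]])

feat_vals = sample_at_pixels(make_x_ramp(patch_h, patch_w), uv, H, W)
mask_vals = sample_at_pixels(make_x_ramp(H, W), uv, H, W)

print(feat_vals.shape, mask_vals.shape)
print(torch.allclose(feat_vals, mask_vals, atol=1e-5))
```

Both maps encode the same signal (the normalized x-coordinate) at different resolutions, and sampling with the same normalized grid returns matching values from each. This is why the paper can write 'H, W' for both without the code resizing the feature map first.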