
Primary language: Jupyter Notebook | License: Apache-2.0

SCRLoFTR

Image matching is an important task in computer vision. Detector-free dense matching methods are an important research direction because of their high accuracy and robustness. LoFTR is a classic detector-free architecture that extracts dense features from both images on grids of the same resolution and then matches them. Because CNNs are not scale equivariant, this approach implicitly assumes that there are no large scale variations between the images to be matched. However, large scale variations are very common in practice. To address this problem, we propose SCRLoFTR, a model that combines scale equivariance with the global modelling capability of the Transformer: a scale-equivariant CNN extracts scale-equivariant features, while the Transformer provides global context. Experiments show that this modification improves matching on image pairs with large scale variations without degrading general matching performance.
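
To make the pipeline described above concrete, below is a minimal, illustrative sketch (not the authors' implementation) of a LoFTR-style coarse matcher with scale-aware features and Transformer attention. All class and function names (ScalePyramidBackbone, CoarseTransformer, dual_softmax_matches) and all hyperparameters are placeholders chosen for the example; in particular, scale handling is approximated here by running a shared CNN over an image pyramid and max-pooling across scales, whereas SCRLoFTR uses proper scale-equivariant convolutions. The dual-softmax coarse matching follows the scheme used in LoFTR.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ScalePyramidBackbone(nn.Module):
    """Shared CNN run on rescaled copies of the image; responses are resized
    back to a common 1/8-resolution grid and max-pooled over scales
    (a crude stand-in for a scale-equivariant CNN)."""

    def __init__(self, dim=128, scales=(1.0, 0.5)):
        super().__init__()
        self.scales = scales
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1),
        )  # coarse features at 1/8 resolution, as in LoFTR

    def forward(self, img):
        h8, w8 = img.shape[-2] // 8, img.shape[-1] // 8
        feats = []
        for s in self.scales:
            x = img if s == 1.0 else F.interpolate(
                img, scale_factor=s, mode="bilinear", align_corners=False)
            f = self.cnn(x)
            feats.append(F.interpolate(f, size=(h8, w8), mode="bilinear",
                                       align_corners=False))
        return torch.stack(feats, 0).max(0).values  # pool over scales


class CoarseTransformer(nn.Module):
    """Interleaved self- and cross-attention over the flattened coarse grids,
    giving every feature a global receptive field."""

    def __init__(self, dim=128, heads=4, layers=2):
        super().__init__()
        self.self_attn = nn.ModuleList(
            [nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(layers)])
        self.cross_attn = nn.ModuleList(
            [nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(layers)])

    def forward(self, fa, fb):
        for sa, ca in zip(self.self_attn, self.cross_attn):
            fa = fa + sa(fa, fa, fa)[0]          # self-attention, image A
            fb = fb + sa(fb, fb, fb)[0]          # self-attention, image B
            fa, fb = fa + ca(fa, fb, fb)[0], fb + ca(fb, fa, fa)[0]  # cross
        return fa, fb


def dual_softmax_matches(fa, fb, temperature=0.1, threshold=0.2):
    """LoFTR-style coarse matching: dual-softmax over the similarity matrix,
    keeping mutual-nearest pairs above a confidence threshold."""
    sim = torch.einsum("nc,mc->nm", F.normalize(fa, dim=-1),
                       F.normalize(fb, dim=-1)) / temperature
    conf = sim.softmax(0) * sim.softmax(1)
    mask = (conf == conf.max(1, keepdim=True).values) & \
           (conf == conf.max(0, keepdim=True).values) & (conf > threshold)
    return mask.nonzero(as_tuple=False), conf


if __name__ == "__main__":
    imgA, imgB = torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256)
    backbone, transformer = ScalePyramidBackbone(), CoarseTransformer()
    fa = backbone(imgA).flatten(2).transpose(1, 2)  # (1, H/8*W/8, C)
    fb = backbone(imgB).flatten(2).transpose(1, 2)
    fa, fb = transformer(fa, fb)
    matches, conf = dual_softmax_matches(fa[0], fb[0])
    print("coarse matches:", matches.shape[0])

The sketch covers only the coarse stage; LoFTR-style matchers additionally refine each coarse match to sub-pixel accuracy on a finer feature map, which is omitted here for brevity.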