Learning Accurate Template Matching with Differentiable Coarse-to-fine Correspondence Refinement
Official implementation of Deep-Template-Matching (Learning Accurate Template Matching with Differentiable Coarse-to-fine Correspondence Refinement) using PyTorch (pytorch-lightning). This paper was published in CVMJ 2024 and can be found here.
Template matching is a fundamental task in computer vision and has been studied for decades. It plays an essential role in the manufacturing industry for estimating the poses of different parts, facilitating downstream tasks such as robotic grasping. Existing works fail when the template and source images are in different modalities, or when the images have cluttered backgrounds or weak textures. They also rarely consider geometric transformations via homographies, which commonly exist even for planar industrial parts. To tackle these challenges, we propose an accurate template matching method based on differentiable coarse-to-fine correspondence refinement. Considering the domain gap between the mask template and the grayscale image, we leverage an edge-aware module to eliminate the difference for robust matching. Based on coarse correspondences with novel structure-aware information by transformers, an initial warping transformation is estimated and performed as a preliminary result. After the initial alignment, we execute a refinement network on reference and aligned images to obtain sub-pixel level correspondences and thus obtain the final geometric transformation. Comprehensive evaluations show that our method significantly outperforms state-of-the-art methods and baselines, with good generalization abilities and visually plausible results even on unseen real data.
pip install -r requirements.txt
We provide the datasets used in our paper. Download links:
- Assembled hole dataset
- Steel dataset
An example is given in test_demo.py. The test images are in ./data/test_case, and the pretrained weight file can be downloaded from here.
Run the command ./scripts/train.sh. The training files are saved in the directory ./logs.
Please modify the paths of the training dataset in ./config/Synthetic_train.py.
We have prepared a standard data format in the folder ./data/train_data.
Synthetic_train.py:
TRAIN_BASE_PATH = './data/train_data'
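Before launching training, it can help to confirm that the configured path actually points at data. The following is a minimal sketch, not part of the repository: `check_dataset_dir` is a hypothetical helper, and it only assumes that the base path is a directory with at least one entry in it. It makes no assumption about the internal layout of ./data/train_data.

```python
from pathlib import Path


def check_dataset_dir(base_path: str) -> int:
    """Return the number of entries directly under base_path.

    Hypothetical helper for illustration (not part of the repository).
    Raises FileNotFoundError if the directory is missing and
    ValueError if it exists but is empty.
    """
    root = Path(base_path)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset directory not found: {root}")
    # Count files and sub-folders at the top level only.
    n_entries = sum(1 for _ in root.iterdir())
    if n_entries == 0:
        raise ValueError(f"dataset directory is empty: {root}")
    return n_entries
```

Calling `check_dataset_dir('./data/train_data')` with the same value as TRAIN_BASE_PATH would then fail early with a readable message instead of partway through data loading.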
We use a two-stage training method. (Modify the configuration parameters in ./src/config/default.py.)
In the coarse stage, we only train the coarse network until convergence (about 10-20 epochs):
_CN.TM.MATCH_COARSE.TRAIN_STAGE = 'only_coarse'
In the fine stage, modify the ckpt_path in train.py so that it points to the coarse checkpoint:
parser.add_argument(
'--ckpt_path', type=str, default='', # the path of coarse ckpt
help='pretrained checkpoint path')
We then train the whole network until convergence (about 10-20 epochs):
_CN.TM.MATCH_COARSE.TRAIN_STAGE = 'whole'
The detailed training files are saved in the ./logs folder.
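The two-stage schedule above can be summarized as a small consistency check. This is a sketch under stated assumptions, not code from the repository: `validate_stage` is a hypothetical helper, and the rule that the 'whole' stage needs a non-empty checkpoint path is inferred from the instructions above, where the fine stage starts from the coarse checkpoint.

```python
VALID_STAGES = ("only_coarse", "whole")


def validate_stage(train_stage: str, ckpt_path: str = "") -> str:
    """Check that TRAIN_STAGE and --ckpt_path are consistent.

    Hypothetical helper for illustration (not part of the repository).
    Returns a short description of the run, or raises ValueError.
    """
    if train_stage not in VALID_STAGES:
        raise ValueError(
            f"TRAIN_STAGE must be one of {VALID_STAGES}, got {train_stage!r}"
        )
    # Assumption: the second stage resumes from the coarse-stage checkpoint.
    if train_stage == "whole" and not ckpt_path:
        raise ValueError(
            "stage 'whole' expects --ckpt_path to point at the coarse checkpoint"
        )
    if train_stage == "only_coarse":
        return "stage 1: train the coarse network only"
    return f"stage 2: train the whole network, starting from {ckpt_path}"
```

A check like this catches the most likely slip in the workflow, which is switching TRAIN_STAGE to 'whole' while leaving the default empty ckpt_path in place.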
- If the edges in the test data are easy to detect, we recommend
_CN.TM.MATCH_COARSE.USE_EDGE = True  # better generalization (default)
- otherwise
_CN.TM.MATCH_COARSE.USE_EDGE = False
Other settings in ./src/config/default.py:
- Use online data augmentation:
_CN.DATASET.AUGMENTATION_TYPE = 'mobile_myself'
otherwise
_CN.DATASET.AUGMENTATION_TYPE = 'None'
- Save plots of matching images to the training file using tensorboard:
_CN.TRAINER.SAVE_PLOTS_VAL = True
_CN.TRAINER.SAVE_PLOTS_TRAIN = False
otherwise
_CN.TRAINER.SAVE_PLOTS_VAL = False
_CN.TRAINER.SAVE_PLOTS_TRAIN = False
- All images are resized to [480, 640] (h,w), and the maximum number of query points is set to 128.
- If you want to change the size of the images, please change
Resize = [512, 512] # h,w
in ./src/lightning/data.py.
- The image size should not be too small; otherwise the number of matching pairs will drop sharply.
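Because matching runs on resized images, point coordinates obtained at the working resolution have to be rescaled if they are needed in the original image. The sketch below shows that rescaling. It is an illustration, not the repository's code: `rescale_points` is a hypothetical helper, sizes follow the [h, w] order used above, points are assumed to be (x, y) pixel coordinates, and half-pixel offset conventions are ignored.

```python
from typing import List, Sequence, Tuple


def rescale_points(
    points: Sequence[Tuple[float, float]],
    resized_hw: Sequence[int],
    original_hw: Sequence[int],
) -> List[Tuple[float, float]]:
    """Map (x, y) points from the resized image back to the original image.

    Hypothetical helper for illustration (not part of the repository).
    Both sizes are given as [h, w].
    """
    resized_h, resized_w = resized_hw
    original_h, original_w = original_hw
    # x scales with the width ratio, y with the height ratio, so a
    # resize that changes the aspect ratio is handled as well.
    scale_x = original_w / resized_w
    scale_y = original_h / resized_h
    return [(x * scale_x, y * scale_y) for x, y in points]
```

For instance, with a working size of [480, 640] and an original size of [960, 1280], the point (320, 240) maps to (640.0, 480.0). Note that the size lists are ordered (h, w) while the points are ordered (x, y), which is an easy place to swap axes by mistake.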
- Change the path of the test dataset in
./config/Synthetic_test.py
TEST_BASE_PATH = './data/train_data'
- Run the command
./scripts/test.sh
If you find this code useful for your research, please use the following BibTeX entry.
@article{gao2024learning,
title={Learning accurate template matching with differentiable coarse-to-fine correspondence refinement},
author={Gao, Zhirui and Yi, Renjiao and Qin, Zheng and Ye, Yunfan and Zhu, Chenyang and Xu, Kai},
journal={Computational Visual Media},
volume={10},
number={2},
pages={309--330},
year={2024},
publisher={Springer}
}