[3DV 2024 Oral] DeDoDe 🎶 Detect, Don't Describe --- Describe, Don't Detect, for Local Feature Matching

DeDoDe 🎶
Detect, Don't Describe --- Describe, Don't Detect
for Local Feature Matching
3DV 2024 Oral

Johan Edstedt · Georg Bökman · Mårten Wadenbäck · Michael Felsberg

The DeDoDe detector learns to detect 3D-consistent, repeatable keypoints, which the DeDoDe descriptor learns to match. The result is a powerful decoupled local feature matcher.

We have updated the training recipe for the detector. Weights are available here: https://github.com/Parskatt/DeDoDe/releases/download/v2/dedode_detector_L_v2.pth

🆕 Kornia Integration

DeDoDe is available in kornia. Install it with pip install kornia, then import it with, e.g.:

from kornia.feature import DeDoDe
dedode = DeDoDe.from_pretrained(detector_weights="L-upright", descriptor_weights="B-upright")

How to Use DeDoDe?

Below we show how DeDoDe can be run. You can also check out the demos.

from PIL import Image  # needed for Image.open below

from DeDoDe import dedode_detector_L, dedode_descriptor_B, dedode_descriptor_G
from DeDoDe.matchers.dual_softmax_matcher import DualSoftMaxMatcher

# You can either provide weights manually, or not provide any. If none
# are provided we automatically download them. Note: We now use v2 detector weights by default.
detector = dedode_detector_L(weights = None)
# Choose either a smaller descriptor,
descriptor = dedode_descriptor_B(weights = None)
# or a larger one (this assignment replaces the one above).
# You can manually load dinov2 weights, or we'll pull from facebook.
descriptor = dedode_descriptor_G(weights = None,
                                 dinov2_weights = None)

matcher = DualSoftMaxMatcher()

im_A_path = "assets/im_A.jpg"
im_B_path = "assets/im_B.jpg"
im_A = Image.open(im_A_path)
im_B = Image.open(im_B_path)
W_A, H_A = im_A.size
W_B, H_B = im_B.size


detections_A = detector.detect_from_path(im_A_path, num_keypoints = 10_000)
keypoints_A, P_A = detections_A["keypoints"], detections_A["confidence"]

detections_B = detector.detect_from_path(im_B_path, num_keypoints = 10_000)
keypoints_B, P_B = detections_B["keypoints"], detections_B["confidence"]

description_A = descriptor.describe_keypoints_from_path(im_A_path, keypoints_A)["descriptions"]
description_B = descriptor.describe_keypoints_from_path(im_B_path, keypoints_B)["descriptions"]

matches_A, matches_B, batch_ids = matcher.match(keypoints_A, description_A,
    keypoints_B, description_B,
    P_A = P_A, P_B = P_B,
    normalize = True, inv_temp = 20, threshold = 0.1)  # Increasing threshold -> fewer matches, fewer outliers

matches_A, matches_B = matcher.to_pixel_coords(matches_A, matches_B, H_A, W_A, H_B, W_B)
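
To give some intuition for the last two steps, here is a minimal pure-Python sketch of dual-softmax matching and of the conversion to pixel coordinates. It is not the code in this repository. The helper names (l2_normalize, softmax, dual_softmax_match) are hypothetical, the mutual-best-match rule is the usual way dual-softmax scores are turned into matches, and the coordinate helper assumes keypoints are stored in normalized [-1, 1] coordinates. Only the parameter names inv_temp and threshold mirror the call above.

```python
import math


def l2_normalize(v):
    """Scale a descriptor to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(x - m) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


def dual_softmax_match(desc_A, desc_B, inv_temp=20.0, threshold=0.1):
    """Return (i, j, score) for mutual best pairs whose score exceeds threshold."""
    A = [l2_normalize(d) for d in desc_A]
    B = [l2_normalize(d) for d in desc_B]
    # Cosine similarity between every pair, scaled by the inverse temperature.
    sim = [[inv_temp * sum(a * b for a, b in zip(da, db)) for db in B] for da in A]
    row_sm = [softmax(row) for row in sim]              # softmax over B, per keypoint in A
    col_sm = [softmax(list(col)) for col in zip(*sim)]  # softmax over A, per keypoint in B
    # Dual softmax: the product of both directions; col_sm is indexed [j][i].
    P = [[row_sm[i][j] * col_sm[j][i] for j in range(len(B))] for i in range(len(A))]
    matches = []
    for i, row in enumerate(P):
        j = max(range(len(row)), key=row.__getitem__)    # best partner of i in B
        best_i = max(range(len(A)), key=lambda k: P[k][j])  # best partner of j in A
        if best_i == i and P[i][j] > threshold:
            matches.append((i, j, P[i][j]))
    return matches


def to_pixel_coords(xy, H, W):
    """Map (x, y) from the assumed [-1, 1] range to pixel coordinates."""
    x, y = xy
    return (W / 2 * (x + 1), H / 2 * (y + 1))


# Toy descriptors: A[0] resembles B[1], A[1] resembles B[0], A[2] is ambiguous.
desc_A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
desc_B = [[0.0, 2.0], [3.0, 0.1]]
matches = dual_softmax_match(desc_A, desc_B)
print([(i, j) for i, j, _ in matches])
print(to_pixel_coords((0.0, 0.0), H=480, W=640))
```

In this toy example the ambiguous third descriptor is dropped because it is not a mutual best match, which is the same effect a higher threshold has on weak matches.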

Training DeDoDe

See experiments for the scripts to train DeDoDe. We trained on a single A100-40GB with a batch size of 8. Note that you need to run the data preparation first; see data_prep.

As usual, we require that you have the MegaDepth dataset already downloaded, and that you have the prepared scene info from DKM.

Pretrained Models

Pretrained weights are available here: https://github.com/Parskatt/DeDoDe/releases/tag/dedode_pretrained_models They are also downloaded automatically if no weights are passed when constructing the models.

DeDoDe in Other Frameworks

License

All code and models except DINOv2 (used by descriptor-G) are under the MIT license. DINOv2 is under the Apache 2.0 license.

BibTeX

@inproceedings{edstedt2024dedode,
  title={{DeDoDe: Detect, Don't Describe --- Describe, Don't Detect for Local Feature Matching}},
  author = {Johan Edstedt and Georg Bökman and Mårten Wadenbäck and Michael Felsberg},
  booktitle={2024 International Conference on 3D Vision (3DV)},
  year={2024},
  organization={IEEE}
}

@inproceedings{edstedt2024dedodev2,
  title={{DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detector}},
  author = {Johan Edstedt and Georg Bökman and Zhenjun Zhao},
  booktitle={IEEE/CVF Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  year={2024},
}