Comparing strings finds an incorrect match

Question

dimiterbak opened this issue 3 years ago · 2 comments

Hi There,

Thanks for sharing this library!

I am running the test below and expect it to find no match.
However, it returns a match:

import numpy as np
import unittest

from names_matcher.algorithm import NamesMatcher

class TestCreateIdentityMatcher(unittest.TestCase):

    def test_compare_different_identities(self):
        names_1 = [["V", "v"]]
        names_2 = [["L", "o"]]

        assignments = NamesMatcher()(names_1, names_2)

        self.assertEqual(-1, assignments[0][0])
        self.assertEqual(1, assignments[1][0])

Answer 1 · 2022-03-11T08:55:28.000Z

Hi @dimiterbak, your code returns the following for me:

(array([0], dtype=int32), array([0.]))

As you can see, the confidence of the match is 0, which is the minimum confidence possible.
I always return the matches; choosing the right confidence threshold is up to you and depends on the domain problem, for example on whether precision or recall matters more.
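
A minimal sketch of such a threshold, assuming the call returns a pair of (match indices, confidences) as in the output above. The helper name `apply_threshold`, the `-1` "no match" sentinel, and the 0.5 cut-off are illustrative assumptions, not part of the library:

```python
from typing import List, Sequence

NO_MATCH = -1  # hypothetical sentinel; the library itself always returns an index


def apply_threshold(
    matches: Sequence[int],
    confidences: Sequence[float],
    threshold: float,
) -> List[int]:
    """Keep a match only if its confidence reaches the threshold."""
    if len(matches) != len(confidences):
        raise ValueError("matches and confidences must have the same length")
    return [
        int(match) if confidence >= threshold else NO_MATCH
        for match, confidence in zip(matches, confidences)
    ]


# Values reported in this thread: index 0 matched with confidence 0.0.
matches, confidences = [0], [0.0]

# The 0.5 cut-off is arbitrary; tune it for your own precision/recall trade-off.
filtered = apply_threshold(matches, confidences, threshold=0.5)
print(filtered)  # [-1]
```

With a filter like this, the test in the question could assert on `filtered[0]` instead of on the raw index. A higher threshold favours precision and a lower one favours recall.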

Answer 2 · 2022-03-11T10:38:28.000Z

Thank you!

It looks like I misunderstood: I thought you returned the distance, but it is the confidence.