JohannesBuchner/imagehash

ImageMultiHash `__sub__` is not commutative

nh2 opened this issue · 3 comments

nh2 commented

``subonImageMultiHash` on 2 different images is in most cases not commutative.

For example,

a - b = 4.25
b - a = 10.25

hash_diff() is communative. It is very confusing to me that from a commutative result (the (matches, sum_distance) tuple), a non-commutative distance metric is derived.

This creates nonsensical results when trying to find the most similar match for every image in a set of images (e.g. using best_match()), because the sorting by metric now depends on the order in which the images are given.

It seems to me that best_match() would be better implemented by simply sorting the (matches, sum_distance) results descending by matches, ascending by sum_distance, instead of going via this non-commutative __sub__.

Do you agree?

(Also, __sub__ on ImageHash is commutative.)

pinging @joshcoales who contributed this code and perhaps has an opinion.

Apologies, yeah, I remember this being an absolute pain.

I agree that sorting matches descending, then sum_distance ascending would be best. I was trying to merge that down into a single int for comparison, but it's a bit of a nightmare, yeah.

Perhaps you could prepare a PR (plus test of commutativity, perhaps for all hash types), and add comments with some demonstrations of the behaviour before and after.