microsoft/topologic

Move and update distance module

Closed this issue · 0 comments

The distance module should be moved and reworked. As it stands, it is little more than an alias for the scipy distance functions, which adds very little utility (arguably better documentation and type hints, but not enough to justify keeping it around by itself).

However, we do know how we use these:

  • Given a vector, we iterate through a list of vectors and return the distances
  • Given a vector, we iterate through a list of vectors, sort them by distance, and return the top N
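
As a rough sketch of what those two utilities might look like: the names `distances_to` and `nearest_n` are hypothetical and not part of topologic. The sketch also assumes that "top N" means the N candidates with the smallest distance. It uses only the standard library (`math.dist` for Euclidean distance), but any callable taking two vectors could be passed in instead.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
DistanceFn = Callable[[Vector, Vector], float]


def distances_to(
    query: Vector,
    candidates: Sequence[Vector],
    distance: DistanceFn = math.dist,
) -> List[float]:
    """Return the distance from `query` to each candidate, in input order."""
    return [distance(query, candidate) for candidate in candidates]


def nearest_n(
    query: Vector,
    candidates: Sequence[Vector],
    n: int,
    distance: DistanceFn = math.dist,
) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the n closest candidates.

    Assumption: "top N" means smallest distance first. Ties are broken
    by the candidate's position in the input list.
    """
    scored = (
        (distance(query, candidate), index)
        for index, candidate in enumerate(candidates)
    )
    # nsmallest avoids sorting the full list when n is small.
    return [(index, dist) for dist, index in heapq.nsmallest(n, scored)]


if __name__ == "__main__":
    query = [0.0, 0.0]
    candidates = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
    print(distances_to(query, candidates))
    print(nearest_n(query, candidates, n=2))
```

Returning indices alongside distances lets callers map results back to whatever the vectors represent (words, nodes, documents) without a second lookup.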

I would argue that these are the utilities we should be providing, rather than thin wrappers for calling scipy.spatial.distance.cosine.

At most, we should support the three functions we currently do (cosine, euclidean, and mahalanobis) via a single function with a parameter that selects the metric. Users can then switch metrics through a configuration value in their code rather than swapping the function(s) out.
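
One possible shape for that single entry point is sketched below. The function name `vector_distance` and the `metric` and `inverse_covariance` parameters are hypothetical. To stay dependency-free, the sketch computes each metric in plain Python; a real implementation would more likely delegate to the scipy functions the module already wraps. As in scipy, the Mahalanobis case needs the inverse covariance matrix to be supplied by the caller.

```python
import math
from typing import Optional, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def vector_distance(
    u: Vector,
    v: Vector,
    metric: str = "euclidean",
    inverse_covariance: Optional[Matrix] = None,
) -> float:
    """Distance between u and v, with the metric chosen by name.

    Hypothetical API: `metric` is one of "cosine", "euclidean",
    or "mahalanobis".
    """
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")

    if metric == "euclidean":
        return math.dist(u, v)

    if metric == "cosine":
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        if norm_u == 0.0 or norm_v == 0.0:
            raise ValueError("cosine distance is undefined for zero vectors")
        # Cosine distance is 1 minus cosine similarity.
        return 1.0 - dot / (norm_u * norm_v)

    if metric == "mahalanobis":
        if inverse_covariance is None:
            raise ValueError("mahalanobis requires inverse_covariance")
        delta = [a - b for a, b in zip(u, v)]
        size = len(delta)
        # Quadratic form: delta^T * inverse_covariance * delta.
        quadratic = sum(
            delta[i] * inverse_covariance[i][j] * delta[j]
            for i in range(size)
            for j in range(size)
        )
        return math.sqrt(quadratic)

    raise ValueError(f"unknown metric: {metric!r}")


if __name__ == "__main__":
    # The metric name could come straight from a user's configuration.
    config = {"metric": "cosine"}
    print(vector_distance([1.0, 0.0], [0.0, 1.0], metric=config["metric"]))
```

Because the metric is an ordinary string argument, switching from cosine to euclidean becomes a configuration change rather than a code change, which is the point made above.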