Metric Learning algorithms in Python.
Algorithms
- Large Margin Nearest Neighbor (LMNN)
- Information Theoretic Metric Learning (ITML)
- Sparse Determinant Metric Learning (SDML)
- Least Squares Metric Learning (LSML)
- Neighborhood Components Analysis (NCA)
- Local Fisher Discriminant Analysis (LFDA)
- Relevant Components Analysis (RCA)
- Metric Learning for Kernel Regression (MLKR)
- Mahalanobis Metric for Clustering (MMC)
Dependencies
- Python 2.7+, 3.4+
- numpy, scipy, scikit-learn
- (for running the examples only: matplotlib)
Installation/Setup
Run pip install metric-learn to download and install from PyPI.
Run python setup.py install for default installation.
Run pytest test to run all tests (you will need to have the pytest package installed).
Usage
For full usage examples, see the Sphinx documentation.
Each metric is a subclass of BaseMetricLearner, which provides default implementations for the methods metric, transformer, and transform. Subclasses must provide an implementation for either metric or transformer.
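The "implement either one" contract can be pictured with a small sketch. The class below is a hypothetical stand-in written for illustration, not the library's BaseMetricLearner; it assumes that the two defaults are derived from each other through M = L.T.dot(L), and it uses a Cholesky factorization (which requires a positive definite M) to go from M back to L.

```python
import numpy as np


class SketchMetricLearner(object):
    """Hypothetical stand-in, NOT the library's BaseMetricLearner."""

    def metric(self):
        # Default: derive M from L, assuming M = L^T L.
        L = self.transformer()
        return L.T.dot(L)

    def transformer(self):
        # Default: derive L from M via Cholesky so that L^T L = M.
        # Only valid when M is positive definite.
        return np.linalg.cholesky(self.metric()).T

    def transform(self, X):
        # Map each row of X into the learned space.
        return X.dot(self.transformer().T)


class FixedScaling(SketchMetricLearner):
    """Toy subclass that overrides only transformer()."""

    def __init__(self, scales):
        self.scales = np.asarray(scales, dtype=float)

    def transformer(self):
        return np.diag(self.scales)


learner = FixedScaling([1.0, 2.0])
M = learner.metric()                               # derived from L
X_new = learner.transform(np.array([[1.0, 1.0]]))  # rows scaled by L
```

A subclass that overrides neither method would recurse forever in this sketch, which is why at least one of the two has to be supplied.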
For an instance of a metric learner named foo learning from a set of d-dimensional points, foo.metric() returns a d x d matrix M such that the distance between vectors x and y is expressed as sqrt((x-y).dot(M).dot(x-y)). Using scipy's pdist function, this would look like pdist(X, metric='mahalanobis', VI=foo.metric()).
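As a quick sanity check, the explicit formula and the pdist call can be compared on toy data. This is only a sketch: M below is a hypothetical symmetric positive semi-definite matrix standing in for the result of foo.metric(), and X is random data, so no learner is actually trained.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.RandomState(0)
X = rng.rand(5, 3)  # 5 points in d = 3 dimensions

# Hypothetical stand-in for foo.metric(): A^T A is always symmetric
# positive semi-definite, which is all this illustration needs.
A = rng.rand(3, 3)
M = A.T.dot(A)


def mahalanobis_distance(x, y, M):
    # Direct transcription of sqrt((x-y).dot(M).dot(x-y)).
    diff = x - y
    return np.sqrt(diff.dot(M).dot(diff))


# Full pairwise matrix from the explicit formula.
manual = np.array([[mahalanobis_distance(x, y, M) for y in X] for x in X])

# Same distances from scipy; pdist returns the condensed form,
# so squareform expands it to a 5 x 5 matrix.
from_scipy = squareform(pdist(X, metric='mahalanobis', VI=M))

distances_agree = np.allclose(manual, from_scipy)
```

Note that scipy's VI argument is used directly as the matrix in the quadratic form, which is why M is passed without inverting it.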
In the same scenario, foo.transformer() returns a d x d matrix L such that a vector x can be represented in the learned space as the vector x.dot(L.T).
For convenience, the function foo.transform(X) is provided for converting a matrix of points (X) into the learned space, in which standard Euclidean distance can be used.
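The claim that plain Euclidean distance works in the learned space can be checked numerically. The sketch below assumes that M and L are related by M = L.T.dot(L), which is the relationship implied by the two definitions above; L is a hypothetical random matrix standing in for foo.transformer(), and the matrix product mirrors what foo.transform(X) is described as doing.

```python
import numpy as np

rng = np.random.RandomState(42)
X = rng.rand(6, 4)  # 6 points in d = 4 dimensions

# Hypothetical stand-in for foo.transformer().
L = rng.rand(4, 4)

# Assumed relationship between the two matrices.
M = L.T.dot(L)

# Equivalent of the described foo.transform(X): each row x becomes x.dot(L.T).
X_learned = X.dot(L.T)

x, y = X[0], X[1]

# Euclidean distance between the transformed points...
euclidean_in_learned_space = np.linalg.norm(X_learned[0] - X_learned[1])

# ...versus the Mahalanobis-style distance in the original space.
mahalanobis_in_original_space = np.sqrt((x - y).dot(M).dot(x - y))

same_distance = np.isclose(euclidean_in_learned_space,
                           mahalanobis_in_original_space)
```

In practice this means that any tool expecting Euclidean geometry, such as a nearest-neighbor search, can be run on the transformed points without knowing about M.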
Notes
If a recent version of the Shogun Python modular (modshogun) library is available, the LMNN implementation will use the fast C++ version from there. The two implementations differ slightly, and the C++ version is more complete.