A probabilistic scoring backend for length-normalized embeddings.
Toroidal PSDA is a generalization of the original PSDA model, which we published in our Interspeech 2022 paper. We now refer to the original PSDA as Simple PSDA; the new, Toroidal model is described in a follow-up paper.
This repo supersedes the original PSDA repo: it contains an updated version of the Simple PSDA implementation, as well as the new Toroidal PSDA implementation.
Probabilistic Linear Discriminant Analysis (PLDA) is a trainable scoring backend that can be used for tasks like speaker or face recognition, clustering, and speaker diarization. PLDA uses the self-conjugacy of multivariate Gaussians to obtain closed-form scoring and closed-form EM updates for learning. However, some of the Gaussian assumptions of the PLDA model are violated when embeddings are length-normalized.
With PSDA, we instead use von Mises-Fisher (VMF) distributions, which may give a better model for this kind of data. The VMF is also self-conjugate, so we retain the same benefits: closed-form scoring and closed-form EM learning.
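To make the conjugacy argument concrete, here is a minimal numpy/scipy sketch (not the tpsda API; all function names are illustrative). It assumes a uniform speaker prior on the sphere and a known within-speaker concentration `w`, both simplifications relative to the full model, and shows how VMF self-conjugacy then yields a closed-form same-speaker vs different-speaker log-likelihood-ratio:

```python
# Hypothetical sketch, not the tpsda API. Model: hidden speaker variable
# z ~ uniform(sphere), observed embeddings x|z ~ VMF(z, w), y|z ~ VMF(z, w).
import numpy as np
from scipy.special import gammaln, ive

def log_vmf_normalizer(kappa, dim):
    """log C_d(kappa), where the VMF density is f(x) = C_d(kappa) exp(kappa mu'x)
    and C_d(kappa) = kappa^(d/2-1) / ((2 pi)^(d/2) I_{d/2-1}(kappa))."""
    nu = dim / 2 - 1
    # log I_nu(kappa), computed stably via the exponentially scaled Bessel ive
    log_bessel = np.log(ive(nu, kappa)) + kappa
    return nu * np.log(kappa) - (dim / 2) * np.log(2 * np.pi) - log_bessel

def log_sphere_area(dim):
    """log surface area of the unit sphere in R^dim: 2 pi^(d/2) / Gamma(d/2)."""
    return np.log(2) + (dim / 2) * np.log(np.pi) - gammaln(dim / 2)

def llr_same_vs_different(x, y, w):
    """Closed-form LLR for unit vectors x, y. Integrating z out analytically
    (the conjugacy step) gives:
      P(x,y|same) = C_d(w)^2 / (S_d * C_d(w*||x+y||)),  P(x)P(y) = 1/S_d^2."""
    dim = len(x)
    return (2 * log_vmf_normalizer(w, dim)
            - log_vmf_normalizer(w * np.linalg.norm(x + y), dim)
            + log_sphere_area(dim))

# Toy usage with two length-normalized embeddings:
rng = np.random.default_rng(0)
x = rng.normal(size=256); x /= np.linalg.norm(x)
y = rng.normal(size=256); y /= np.linalg.norm(y)
print(llr_same_vs_different(x, y, w=50.0))  # typically negative: x, y unrelated
print(llr_same_vs_different(x, x, w=50.0))  # positive: identical embeddings
```

The actual Simple PSDA model is more general than this sketch (for example, its concentration parameters are learned from data via EM rather than fixed), but the scoring mechanism rests on the same closed-form integral.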
Dependencies are numpy, scipy and PYLLR.
To install, put the repo root (the folder that contains the tpsda package) on your Python path, for example as sketched below.
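A minimal sketch; the path below is a placeholder for wherever you cloned the repo:

```python
import sys
# Placeholder path: replace with the folder that contains the tpsda package
sys.path.insert(0, "/path/to/Toroidal-PSDA")

import tpsda  # should now import without error
```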
- A working demo is here: https://github.com/bsxfan/Toroidal-PSDA/blob/main/tpsda/toroidal/toroid_vs_cosred.py. It can be run as a script; it generates synthetic data and demonstrates training and scoring.
- Further insight into the model and the EM training algorithm can be gained by running this demo script: https://github.com/bsxfan/Toroidal-PSDA/blob/main/tpsda/toroidal/test_em.py. It plots low-dimensional data on an interactive, rotatable globe (if your plotting backend allows it).