
Toroidal PSDA

A probabilistic scoring backend for length-normalized embeddings.

Toroidal PSDA is a generalization of the original PSDA model that was published in our Interspeech 2022 paper:

Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings

We now refer to the original PSDA as Simple PSDA. The new model is described here:

Toroidal Probabilistic Spherical Discriminant Analysis

This repo supersedes the original PSDA repo. It contains an updated implementation of Simple PSDA, as well as the new Toroidal PSDA implementation.

Probabilistic Linear Discriminant Analysis (PLDA) is a trainable scoring backend that can be used for tasks like speaker/face recognition, clustering, or speaker diarization. PLDA uses the self-conjugacy of multivariate Gaussians to obtain closed-form scoring and closed-form EM updates for learning. Some of the Gaussian assumptions of the PLDA model are violated when embeddings are length-normalized.
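
For orientation, here is a sketch of the standard two-covariance PLDA generative model (generic textbook notation, not taken from this repo): a latent speaker identity y and an observed embedding z are both Gaussian,

```latex
% Two-covariance PLDA (generic sketch):
% B and W are between- and within-speaker precision matrices.
y \sim \mathcal{N}(\mu,\, B^{-1}), \qquad
z \mid y \sim \mathcal{N}(y,\, W^{-1})
```

Because the Gaussian is conjugate to itself, the posterior over y given any number of embeddings is again Gaussian, which is what yields the closed-form scoring and EM updates.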

With PSDA, we use Von Mises-Fisher (VMF) distributions instead of Gaussians, because they may give a better model for this kind of data. The VMF is also self-conjugate, so we enjoy the same benefits of closed-form scoring and EM learning.
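
For reference, the VMF density on the unit sphere in R^d has the standard textbook form:

```latex
% Von Mises-Fisher density on the unit sphere in R^d:
% mu is the mean direction (||mu|| = 1), kappa >= 0 is the concentration,
% and I_nu is the modified Bessel function of the first kind.
f(x;\mu,\kappa) = C_d(\kappa)\, e^{\kappa\,\mu^\top x},
\qquad
C_d(\kappa) = \frac{\kappa^{d/2-1}}{(2\pi)^{d/2}\, I_{d/2-1}(\kappa)}
```

The conjugacy comes from the fact that the density is log-linear in the mean direction, so a product of VMF likelihood factors is again an (unnormalized) VMF in that parameter.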

Install

Dependencies are numpy, scipy and PYLLR.

To install, put the repo root (the folder that contains the tpsda package) on your Python path.
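
For example, a minimal sketch of doing this at runtime (the clone path below is hypothetical; use wherever you cloned the repo):

```python
import sys

# Hypothetical location of your clone; point this at whichever folder
# contains the tpsda/ package directory.
sys.path.insert(0, "/path/to/Toroidal-PSDA")

import tpsda  # the package should now be importable
```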

Demo