/plda

Probabilistic Linear Discriminant Analysis & classification, written in Python.

Primary language: Python. License: Apache License 2.0.

Probabilistic Linear Discriminant Analysis

Disclaimer

This model was written for an Explainable Artificial Intelligence (XAI) project, so it keeps a number of parameters in memory that simple classification problems do not need.

The model parameters are estimated via empirical Bayes.

Paper Citation

Ioffe S. (2006) Probabilistic Linear Discriminant Analysis. In: Leonardis A., Bischof H., Pinz A. (eds) Computer Vision – ECCV 2006.

Dependencies

If you already have Anaconda or Miniconda, you can automatically download all dependencies to a conda environment called plda with the following. Otherwise, see environment.yml.

conda env create -f environment.yml -n plda  # "plda" is the environment name.

Usage

  1. If you have one, activate your conda environment with conda activate plda.
  2. Move this repository to the appropriate directory. This might be a modules directory in your project or the same directory as the file importing this code.
  3. Add import plda to your code.
  4. See the demo below on how to use the actual model code.
  5. When you are all done, you can deactivate the conda environment with conda deactivate.
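
Steps 2 and 3 only work if Python can find the repository on its import path. The following is a minimal sketch, not part of this repository, of making a modules directory importable and checking that a package can be found. The helper names `add_modules_dir` and `is_importable` and the `modules` directory layout are hypothetical.

```python
import importlib
import importlib.util
import sys
from pathlib import Path


def add_modules_dir(modules_dir):
    """Put modules_dir at the front of sys.path (once) and return its absolute path."""
    path = str(Path(modules_dir).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return path


def is_importable(package_name):
    """Return True if `import package_name` would find a module or package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumes this repository was moved to ./modules/plda (hypothetical layout).
    add_modules_dir("modules")
    print("plda importable:", is_importable("plda"))
```

If the repository sits in the same directory as the importing file, the `sys.path` change is usually unnecessary, because Python already searches the running script's directory.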

Demo with MNIST Handwritten Digits Data

See mnist_demo/mnist_demo.ipynb.

  • This demo shows you how to extract LDA features from your data.
  • For classification, the model automatically preprocesses your data, but with the default preprocessing setting, it could overfit small training datasets.
  • If you run into this issue, one way to address it is to reduce the number of principal components present in the preprocessed data.
  • The MNIST demo shows you how simple this is: just supply an optional parameter.
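
To see why fewer principal components can help on small datasets, note that a symmetric d × d covariance matrix has d(d + 1)/2 free entries, and PLDA models covariance structure both within and between classes. The sketch below only does this arithmetic. It is not how the model chooses its dimensionality, and the function name and the sample numbers are illustrative assumptions.

```python
def covariance_free_parameters(n_dims):
    """Number of free entries in a symmetric n_dims x n_dims covariance matrix."""
    if n_dims < 0:
        raise ValueError("n_dims must be non-negative")
    return n_dims * (n_dims + 1) // 2


if __name__ == "__main__":
    n_training_examples = 200  # Hypothetical small training set.
    for n_components in (784, 50, 5):  # 784 = pixels in a 28 x 28 MNIST image.
        n_params = covariance_free_parameters(n_components)
        print(f"{n_components:>3} components: {n_params} covariance entries "
              f"vs {n_training_examples} training examples")
```

When the number of covariance entries far exceeds the number of training examples, the estimates are poorly constrained, which is one way to think about the overfitting risk described above.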

Testing

If you created the Conda environment with the name plda, activate it with the following.

conda activate plda  # If `plda` is the name you gave the Conda environment.

To run all tests (~120 seconds with ~60 CPU cores), use the following.

python3.5 -m pytest plda/  # This README.md should be inside here.

To run a particular test file, run one of the following.

pytest plda/tests/test_model/test_model_units.py  # ~.66s for me.
pytest plda/tests/test_model/test_model_integration.py  # ~1.0s for me.
pytest plda/tests/test_model/test_model_inference.py  #  ~80.6s for me.

pytest plda/tests/test_optimizer/test_optimizer_units.py  # ~.59s for me.
pytest plda/tests/test_optimizer/test_optimizer_integration.py  # ~.78s.
pytest plda/tests/test_optimizer/test_optimizer_inference.py  # ~25.3s for me.

pytest plda/tests/test_classifier/test_classifier_integration.py  # ~.69s.

Once you finish running the tests, remove all the __pycache__/ folders generated by pytest with the following.

py3clean plda/*  # This README.md should be in here.
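
If py3clean is not available on your system (it is packaged with Debian-based distributions), a standard-library sketch along these lines does a similar job. The function name `remove_pycache` is hypothetical, and unlike py3clean it simply deletes every `__pycache__/` directory under the given root.

```python
import shutil
from pathlib import Path


def remove_pycache(root):
    """Delete every __pycache__ directory under root and return the removed paths."""
    root = Path(root)
    if not root.is_dir():
        return []
    # Materialize the matches first so we do not delete while walking the tree.
    targets = [p for p in root.rglob("__pycache__") if p.is_dir()]
    removed = []
    for path in targets:
        if path.exists():  # Skip anything already removed with a parent directory.
            shutil.rmtree(path)
            removed.append(str(path))
    return removed


if __name__ == "__main__":
    # Run from the directory that contains plda/, as with the py3clean command.
    for path in remove_pycache("plda"):
        print("removed", path)
```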

Finally, if you are done working with the model and test code, deactivate the Conda environment.

conda deactivate  # You can run this from any directory.