PyLDL

Label distribution learning (LDL) and label enhancement (LE) toolkit implemented in Python, including:

$^1$ Technically, these methods are only suitable for totally ordered labels.

$^2$ These are algorithms for incomplete LDL, so you should use pyldl.utils.random_missing to generate the missing label distribution matrix and the corresponding mask matrix in the experiments.

$^3$ These are LDL classifiers, so you should use predict_proba to get label distributions and predict to get predicted labels.

$^4$ These are oversampling algorithms for LDL, so you should use fit_transform to generate synthetic samples.
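As a rough sketch of how the interfaces in the notes above fit together (the class names SomeIncompleteLDL, SomeLDLClassifier, and SomeLDLOversampler are placeholders, and the exact signature of pyldl.utils.random_missing is an assumption):

from pyldl.utils import load_dataset, random_missing

X, y = load_dataset('SJAFFE')

# Incomplete LDL (note 2): build a partially missing label distribution
# matrix and the corresponding mask matrix before fitting.
# The keyword name and return order below are assumptions.
y_missing, mask = random_missing(y, missing_rate=0.2)
# model = SomeIncompleteLDL().fit(X, y_missing, mask)      # placeholder class

# LDL classifiers (note 3): predict_proba yields label distributions,
# while predict yields hard labels.
# clf = SomeLDLClassifier().fit(X, y)                      # placeholder class
# distributions, labels = clf.predict_proba(X), clf.predict(X)

# LDL oversampling (note 4): fit_transform generates synthetic samples.
# X_new, y_new = SomeLDLOversampler().fit_transform(X, y)  # placeholder class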

Installation

PyLDL is now available on PyPI. Use the following command to install it.

pip install python-ldl

To install the latest development version, clone this repository and run the setup.py file.

python setup.py install

Usage

Here is an example of using PyLDL.

from pyldl.utils import load_dataset
from pyldl.algorithms import SA_BFGS
from pyldl.metrics import score

from sklearn.model_selection import train_test_split

dataset_name = 'SJAFFE'
X, y = load_dataset(dataset_name)
X_train, X_test, y_train, y_test = train_test_split(X, y)

model = SA_BFGS()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print(score(y_test, y_pred))

For those who would like to use the original MATLAB implementations of the algorithms:

  1. Install MATLAB.
  2. Install the MATLAB Engine for Python.
  3. Download the LDL Package here.
  4. Locate the package directory of PyLDL (...\Lib\site-packages\pyldl).
  5. Place the LDLPackage_v1.2 folder into the matlab_algorithms folder.

Now you can import the original implementation of a method, e.g.:

from pyldl.matlab_algorithms import SA_IIS
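
These wrappers can presumably be used in the same way as the native implementations; a minimal sketch, assuming the MATLAB-backed classes follow the same fit/predict interface as in the usage example above:

from pyldl.matlab_algorithms import SA_IIS
from pyldl.utils import load_dataset
from pyldl.metrics import score

from sklearn.model_selection import train_test_split

X, y = load_dataset('SJAFFE')
X_train, X_test, y_train, y_test = train_test_split(X, y)

# Assumption: the wrapper exposes the same fit/predict API as the
# native Python implementation of SA-IIS.
model = SA_IIS()
model.fit(X_train, y_train)
print(score(y_test, model.predict(X_test)))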

You can visualize the performance of any model on the artificial dataset (Geng 2016) with the pyldl.utils.plot_artificial function, e.g.:

from pyldl.algorithms import LDSVR, SA_BFGS, SA_IIS, AA_KNN, PT_Bayes, GLLE, LIBLE
from pyldl.utils import plot_artificial

methods = [LDSVR, SA_BFGS, SA_IIS, AA_KNN, PT_Bayes, GLLE, LIBLE]

plot_artificial(model=None, figname='GT')
for method in methods:
    plot_artificial(model=method(), figname=method.__name__)

The output images show the ground truth ('GT') followed by the results of LDSVR, SA_BFGS, SA_IIS, AA_KNN, PT_Bayes, GLLE, and LIBLE.

Enjoy! :)

Experiments

For each algorithm, ten-fold cross validation is repeated 10 times on the SJAFFE dataset, and the average metrics (with standard deviations) are recorded. Note that these limited runs do not fully characterize the performance of the models.
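
A minimal sketch of this protocol (the use of sklearn's RepeatedKFold and the way the per-fold metrics are aggregated are assumptions, not necessarily how the numbers below were produced):

import numpy as np
from sklearn.model_selection import RepeatedKFold

from pyldl.utils import load_dataset
from pyldl.algorithms import SA_BFGS
from pyldl.metrics import score

X, y = load_dataset('SJAFFE')

# Ten-fold cross validation, repeated 10 times.
rkf = RepeatedKFold(n_splits=10, n_repeats=10, random_state=0)
scores = []
for train_idx, test_idx in rkf.split(X):
    model = SA_BFGS()
    model.fit(X[train_idx], y[train_idx])
    # Assumption: score returns one value per evaluation metric.
    scores.append(score(y[test_idx], model.predict(X[test_idx])))

print(np.mean(scores, axis=0))  # average of each metric over all folds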

The results of our implementation are as follows.

| Algorithm | Cheby. (↓) | Clark (↓) | Can. (↓) | K-L (↓) | Cos. (↑) | Int. (↑) |
|-----------|------------|-----------|----------|---------|----------|----------|
| SA-BFGS | .092 ± .010 | .361 ± .029 | .735 ± .060 | .051 ± .009 | .954 ± .009 | .878 ± .011 |
| SA-IIS | .100 ± .009 | .361 ± .023 | .746 ± .050 | .051 ± .008 | .952 ± .007 | .873 ± .009 |
| AA-kNN | .098 ± .011 | .349 ± .029 | .716 ± .062 | .053 ± .010 | .950 ± .009 | .877 ± .011 |
| AA-BP | .120 ± .012 | .426 ± .025 | .889 ± .057 | .073 ± .010 | .931 ± .010 | .848 ± .011 |
| PT-Bayes | .116 ± .011 | .425 ± .031 | .874 ± .064 | .073 ± .012 | .932 ± .011 | .850 ± .012 |
| PT-SVM | .117 ± .012 | .422 ± .027 | .875 ± .057 | .072 ± .011 | .932 ± .011 | .850 ± .011 |

Results of the original MATLAB implementation (Geng 2016) are as follows.

| Algorithm | Cheby. (↓) | Clark (↓) | Can. (↓) | K-L (↓) | Cos. (↑) | Int. (↑) |
|-----------|------------|-----------|----------|---------|----------|----------|
| SA-BFGS | .107 ± .015 | .399 ± .044 | .820 ± .103 | .064 ± .016 | .940 ± .015 | .860 ± .019 |
| SA-IIS | .117 ± .015 | .419 ± .034 | .875 ± .086 | .070 ± .012 | .934 ± .012 | .851 ± .016 |
| AA-kNN | .114 ± .017 | .410 ± .050 | .843 ± .113 | .071 ± .023 | .934 ± .018 | .855 ± .021 |
| AA-BP | .130 ± .017 | .510 ± .054 | 1.05 ± .124 | .113 ± .030 | .908 ± .019 | .824 ± .022 |
| PT-Bayes | .121 ± .016 | .430 ± .035 | .904 ± .086 | .074 ± .014 | .930 ± .016 | .846 ± .016 |
| PT-SVM | .127 ± .017 | .457 ± .039 | .935 ± .074 | .086 ± .016 | .920 ± .014 | .839 ± .015 |
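
For reference, these columns are the standard LDL evaluation measures (Geng 2016). A sketch of their usual definitions, writing $d$ for the ground-truth distribution, $\hat{d}$ for the predicted distribution, and $c$ for the number of labels (this notation is assumed here):

$$
\begin{aligned}
\text{Chebyshev} &= \max_{j} |d_j - \hat{d}_j|, &
\text{Clark} &= \sqrt{\sum_{j=1}^{c} \frac{(d_j - \hat{d}_j)^2}{(d_j + \hat{d}_j)^2}}, \\
\text{Canberra} &= \sum_{j=1}^{c} \frac{|d_j - \hat{d}_j|}{d_j + \hat{d}_j}, &
\text{K-L} &= \sum_{j=1}^{c} d_j \ln \frac{d_j}{\hat{d}_j}, \\
\text{Cosine} &= \frac{\sum_{j=1}^{c} d_j \hat{d}_j}{\sqrt{\sum_{j=1}^{c} d_j^2}\,\sqrt{\sum_{j=1}^{c} \hat{d}_j^2}}, &
\text{Intersection} &= \sum_{j=1}^{c} \min(d_j, \hat{d}_j).
\end{aligned}
$$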

Requirements

matplotlib>=3.6.1
numpy>=1.22.3
qpsolvers>=4.0.0
quadprog>=0.1.11
scikit-fuzzy>=0.4.2
scikit-learn>=1.0.2
scipy>=1.8.0
tensorflow>=2.8.0
tensorflow-addons>=0.22.0
tensorflow-probability>=0.16.0