
KL divergence estimators

The KL-divergence is normally defined between two probability distributions. When only samples from those distributions are available, the KL-divergence can be estimated in a number of ways.

Here I test a few implementations of a KL-divergence estimator based on k-Nearest-Neighbours probability density estimation.

The estimator is that of

Qing Wang, Sanjeev R. Kulkarni, and Sergio Verdú, "Divergence estimation for multidimensional densities via k-nearest-neighbor distances," IEEE Transactions on Information Theory 55.5 (2009): 2392–2405.
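For a d-dimensional sample {x_1, …, x_n} from P and a sample {y_1, …, y_m} from Q, the estimator (in the notation of that paper, up to minor differences) is

$$
\hat{D}_{n,m}(P\|Q) = \frac{d}{n}\sum_{i=1}^{n}\log\frac{\nu_k(i)}{\rho_k(i)} + \log\frac{m}{n-1},
$$

where ρ_k(i) is the distance from x_i to its k-th nearest neighbour among the other points of the P sample and ν_k(i) is the distance from x_i to its k-th nearest neighbour in the Q sample.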

Samples are drawn from various test distributions, and the estimated KL-divergence between them is computed. Uncertainties are assessed by re-sampling the distributions and re-computing the divergence estimate 100 times; uncertainty bands are given as the interval containing the 68% of re-sampled estimates closest to the median. Timings, where provided, are the time taken to compute all 100 re-samples with a sample size of N=1000 and k=5.

This study is far from exhaustive, and the timings are sensitive to implementation details, so please take the results with a pinch of salt.
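A minimal sketch of this resampling scheme (the function and variable names here are illustrative, not the repository's code):

```python
import numpy as np

def resampled_divergence(estimator, draw_p, draw_q, n_resamples=100, k=5, seed=0):
    """Re-draw fresh samples n_resamples times and summarise the divergence estimates.

    estimator(p, q, k=...) -> float;  draw_p(rng), draw_q(rng) -> (N, d) arrays.
    """
    rng = np.random.default_rng(seed)
    estimates = np.array([estimator(draw_p(rng), draw_q(rng), k=k)
                          for _ in range(n_resamples)])

    # Uncertainty band: the interval spanned by the 68% of estimates closest to the median.
    median = np.median(estimates)
    order = np.argsort(np.abs(estimates - median))
    band = estimates[order[: int(0.68 * n_resamples)]]
    return median, band.min(), band.max()
```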

Estimator implementations

  • naive_estimator

    KL-Divergence estimator using brute-force (numpy) k-NN

  • scipy_estimator

    KL-Divergence estimator using scipy's KDTree

  • skl_estimator

    KL-Divergence estimator using scikit-learn's NearestNeighbors

  • skl_efficient

    An efficient version of the scikit-learn estimator by @LoryPack

These estimators have been benchmarked against slaypni/universal-divergence.
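As a rough illustration of what the brute-force variant computes, a minimal numpy sketch might look like the following (the function name and signature are illustrative, not the repository's exact code):

```python
import numpy as np

def knn_divergence(s1, s2, k=5):
    """k-NN estimate of D(P||Q) from samples s1 ~ P with shape (n, d) and s2 ~ Q with shape (m, d)."""
    n, d = s1.shape
    m, _ = s2.shape

    # All pairwise Euclidean distances (brute force, O(n^2) memory).
    d_pp = np.linalg.norm(s1[:, None, :] - s1[None, :, :], axis=-1)  # within the P sample
    d_pq = np.linalg.norm(s1[:, None, :] - s2[None, :, :], axis=-1)  # P sample to Q sample

    # rho_k(i): distance from x_i to its k-th nearest neighbour among the other P points
    # (index k skips the zero self-distance); nu_k(i): k-th nearest neighbour in the Q sample.
    rho = np.sort(d_pp, axis=1)[:, k]
    nu = np.sort(d_pq, axis=1)[:, k - 1]

    return (d / n) * np.sum(np.log(nu / rho)) + np.log(m / (n - 1))
```

The scipy and scikit-learn variants replace the explicit pairwise-distance matrices with KDTree / NearestNeighbors queries, which is what makes them so much faster at large N.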

Tests

Self-divergence of samples from a 1-dimensional Gaussian

Estimate the divergence between two samples of size N and dimension 1, both drawn from the same N(0,1) probability distribution. The expected value of the divergence in this test is D=0.
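For example, a single estimate for this test could be produced as follows (using the illustrative knn_divergence sketch above; any of the listed estimators could be substituted):

```python
import numpy as np

rng = np.random.default_rng(0)
N, k = 1000, 5

p = rng.standard_normal((N, 1))  # sample from N(0, 1)
q = rng.standard_normal((N, 1))  # independent sample from the same N(0, 1)

print(knn_divergence(p, q, k=k))  # should be close to the expected D = 0
```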

Comparison of estimator implementations

| Estimator | D(P\|Q) | Time (s) |
| --- | --- | --- |
| naive_estimator | 1.595e-03 | 7.998 |
| scipy_estimator | 1.595e-03 | 0.111 |
| skl_estimator | 1.595e-03 | 18.427 |
| skl_efficient | 1.595e-03 | 0.147 |

Convergence of estimator with N

Convergence Plot

Self-divergence of samples from a 2-dimensional Gaussian

Estimate the divergence between two samples of size N drawn from the same 2D Gaussian distribution with mean=[0,0] and covariance=[[1, 0.1], [0.1, 1]]. The expected value of the divergence in this test is D=0.
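The corresponding 2D setup might look like this (again using the illustrative knn_divergence sketch from above):

```python
import numpy as np

rng = np.random.default_rng(0)
N, k = 1000, 5
mean = np.zeros(2)
cov = np.array([[1.0, 0.1], [0.1, 1.0]])

p = rng.multivariate_normal(mean, cov, size=N)  # (N, 2) sample from the 2D Gaussian
q = rng.multivariate_normal(mean, cov, size=N)  # independent sample from the same Gaussian

print(knn_divergence(p, q, k=k))  # should be close to the expected D = 0
```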

Comparison of estimator implementations

| Estimator | D(P\|Q) | Time (s) |
| --- | --- | --- |
| naive_estimator | -6.811e-04 | 9.931 |
| scipy_estimator | -6.811e-04 | 0.182 |
| skl_estimator | -6.811e-04 | 18.362 |
| skl_efficient | -6.811e-04 | 0.222 |

Convergence of estimator with N

Convergence Plot

Divergence of two 1-dimensional Gaussians

Estimate the divergence between two samples of size N and dimension 1, the first drawn from N(0,1) and the second from N(2,1). The expected value of the divergence in this test is D=2.0.
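The expected value follows from the closed-form KL-divergence between two Gaussians with equal variance:

$$
D\big(\mathcal{N}(\mu_1,\sigma^2)\,\|\,\mathcal{N}(\mu_2,\sigma^2)\big) = \frac{(\mu_1-\mu_2)^2}{2\sigma^2} = \frac{(0-2)^2}{2} = 2.
$$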

Comparison of estimator implementations

| Estimator | D(P\|Q) | Time (s) |
| --- | --- | --- |
| naive_estimator | 1.790e+00 | 7.014 |
| scipy_estimator | 1.790e+00 | 0.105 |
| skl_estimator | 1.790e+00 | 18.215 |
| skl_efficient | 1.790e+00 | 0.139 |

Convergence of estimator with N

Convergence Plot

Generating this document

Start in a clean Python 3.10 environment and run the following:

```sh
# Set up dependencies
pip install -r requirements.txt
# Run the tests and generate the figures
./src/run_tests.py
# Add the header and footer to the report
cat templates/header.md report.md templates/footer.md > README.md
```

This will likely take some time to complete.

Important settings

The number of re-samples used to estimate the uncertainties is set by n_resamples in tests.py. The run time of the tests is highly sensitive to this value.
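For reference, the setting looks roughly like this (the exact file location and surrounding code may differ):

```python
# tests.py (illustrative) — number of re-sampled divergence estimates per test.
# Reducing this speeds up the run at the cost of noisier uncertainty bands.
n_resamples = 100
```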