ot.gmm : numerical errors

Question

Closed this issue 4 months ago · 0 comments

Describe the bug

The function ot.gmm.gaussian_pdf is numerically unstable for high-dimensional Gaussians, for example when the determinant of the covariance matrix becomes very small. This can lead to inaccurate density values, underflow, or NaN results. The problem comes from computing $\det C$ and $\exp (-0.5 ...)$ directly: both quantities easily underflow or overflow when the covariance matrix is poorly scaled.
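
The underflow can be reproduced in isolation, without POT. The sketch below uses only NumPy and the covariance 0.01 * I in dimension 512; the quadratic-form value 51200 is an illustrative assumption (roughly what standard normal samples give for this covariance), not a value taken from POT.

```python
import numpy as np

d = 512
C = np.eye(d) * 0.01

# det(C) = 0.01**512 = 1e-1024, far below the smallest positive float64 (~5e-324)
det = np.linalg.det(C)

# slogdet returns the sign and log|det| separately and stays finite
sign, logdet = np.linalg.slogdet(C)

# Assumed typical quadratic form (x - m)^T C^{-1} (x - m) for x ~ N(0, I)
quad = 51200.0
exp_term = np.exp(-0.5 * quad)

print(det)       # underflows to 0.0
print(logdet)    # finite, equal to 512 * log(0.01)
print(exp_term)  # underflows to 0.0
```

Both factors of the density collapse to zero or infinity individually, even though the log-density itself is a moderate finite number.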

To Reproduce

Define a high-dimensional diagonal covariance with small entries
Compute the density using ot.gmm.gaussian_pdf
Observe that the computed values are inaccurate (NaN in the example below)

Code sample

import numpy as np
import ot.gmm

# Example input
d = 512  # dimension
x = np.random.randn(10, d)  # samples
m = np.zeros(d) # mean
C = np.eye(d) * 0.01  # covariance

# Compute PDF
pdf = ot.gmm.gaussian_pdf(x, m, C)

print("Computed PDF values:", pdf)

Output

Computed PDF values: [nan nan nan nan nan nan nan nan nan nan]
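
For comparison, a log-domain evaluation on the same kind of input stays finite. This is a minimal sketch, not POT's implementation: gaussian_logpdf is a hypothetical helper that assumes a NumPy backend and a symmetric positive definite covariance, and it uses a Cholesky factor instead of an explicit determinant and inverse.

```python
import numpy as np
from scipy.linalg import solve_triangular


def gaussian_logpdf(x, m, C):
    """Log-density of N(m, C) at each row of x (hypothetical helper, not part of POT)."""
    d = x.shape[-1]
    L = np.linalg.cholesky(C)  # C = L @ L.T with L lower-triangular
    # Solve L z = (x - m)^T, so that ||z||^2 = (x - m)^T C^{-1} (x - m)
    z = solve_triangular(L, (x - m).T, lower=True)
    maha = np.sum(z ** 2, axis=0)
    # log det C = 2 * sum(log(diag(L))), computed without forming det C
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


rng = np.random.default_rng(0)
d = 512
x = rng.standard_normal((10, d))
m = np.zeros(d)
C = np.eye(d) * 0.01

logpdf = gaussian_logpdf(x, m, C)
print("All log-densities finite:", np.isfinite(logpdf).all())
```

Exponentiating these values would still underflow to zero in float64, so downstream code would need to keep working with the log-densities rather than convert back.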

Environment

POT installed with pip

macOS-15.0-arm64-arm-64bit
Python 3.9.6 (default, Feb  3 2024, 15:58:27) 
[Clang 15.0.0 (clang-1500.3.9.4)]
NumPy 1.25.2
SciPy 1.13.1
POT 0.9.5

Additional context

This numerical instability breaks downstream functions such as gmm_ot_apply_map, which compute ratios of densities using gaussian_pdf. Dividing two very small density values can introduce further inaccuracies, and produces NaN when both underflow to zero.
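
One way to keep such ratios stable is to form them in the log domain. The sketch below assumes the ratios have the form w_k p_k(x) / sum_j w_j p_j(x); that form, the helper name gmm_responsibilities, and the test data are illustrative assumptions and do not reflect how gmm_ot_apply_map is actually written. It relies on scipy.stats.multivariate_normal.logpdf and scipy.special.logsumexp.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal


def gmm_responsibilities(x, means, covs, weights):
    """Ratios w_k p_k(x) / sum_j w_j p_j(x), computed from log-densities.

    Hypothetical helper, not part of POT.
    """
    log_terms = np.stack(
        [
            np.log(w) + multivariate_normal.logpdf(x, mean=mu, cov=S)
            for w, mu, S in zip(weights, means, covs)
        ],
        axis=-1,
    )  # shape (n_samples, n_components)
    # Subtracting the log-sum-exp normalizes each row without leaving the log domain
    return np.exp(log_terms - logsumexp(log_terms, axis=-1, keepdims=True))


rng = np.random.default_rng(0)
d = 512
x = 0.1 * rng.standard_normal((10, d))  # samples drawn from the first component
means = [np.zeros(d), 0.05 * np.ones(d)]
covs = [np.eye(d) * 0.01, np.eye(d) * 0.01]
weights = [0.5, 0.5]

resp = gmm_responsibilities(x, means, covs, weights)
print("Finite:", np.isfinite(resp).all())
print("Rows sum to 1:", np.allclose(resp.sum(axis=1), 1.0))
```

Each individual density here is far too small to represent in float64, but the normalized ratios remain well defined because only differences of log-densities are exponentiated.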