IFCA-Advanced-Computing/frouros

Incorrect result when using MMD with some chunk_size argument values

Describe the bug

MMD returns an incorrect result for some values of the chunk_size argument. For many integer chunk_size values, the MMD² computed with that chunk_size differs from the MMD² computed with chunk_size=None.

With the reproduction code below, the following chunk_size values produce an incorrect result: 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 18, 19. All other values between 1 and 20 produce the correct result.

Steps/Code to Reproduce

from frouros.detectors.data_drift import MMD
import numpy as np
from functools import partial
from frouros.utils.kernels import rbf_kernel

np.random.seed(seed=31)

dim = 1
size = 20
kernel = partial(rbf_kernel, sigma=0.5)
chunk_size = 4

X_ref = np.random.multivariate_normal(mean=np.zeros(dim), cov=np.identity(dim), size=size)
X_test = np.random.multivariate_normal(mean=np.full(dim, 0.3), cov=np.identity(dim), size=size)

detector = MMD(
    kernel=kernel,
    chunk_size=None,
)
detector.fit(X_ref)
result, _ = detector.compare(X=X_test, verbose=True)

detector_chunk = MMD(
    kernel=kernel,
    chunk_size=chunk_size,
)
detector_chunk.fit(X_ref)
result_chunk, _ = detector_chunk.compare(X=X_test, verbose=True)

assert result.distance == result_chunk.distance

Expected Results

No error is thrown: the assertion passes because both detectors return the same distance regardless of chunk_size.
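To illustrate why chunking should not change the result, here is a minimal NumPy sketch that is independent of frouros. It assumes the unbiased MMD² estimator and a Gaussian RBF kernel of the form exp(-||x - y||² / (2σ²)). The names `rbf`, `kernel_sum`, and `mmd2_unbiased` are hypothetical helpers written for this example and are not the library's internals. Since the estimator only needs sums of kernel entries, summing block by block (including the smaller last block when the sample size is not a multiple of chunk_size) should match the unchunked value up to floating-point rounding.

```python
import numpy as np


def rbf(X, Y, sigma=0.5):
    """Gaussian RBF kernel matrix between the rows of X and the rows of Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma**2))


def kernel_sum(X, Y, chunk_size=None):
    """Sum of all k(x_i, y_j), optionally accumulated block by block."""
    if chunk_size is None:
        return rbf(X, Y).sum()
    total = 0.0
    for i in range(0, len(X), chunk_size):
        for j in range(0, len(Y), chunk_size):
            # Slicing past the end is safe, so the last block may be smaller.
            total += rbf(X[i:i + chunk_size], Y[j:j + chunk_size]).sum()
    return total


def mmd2_unbiased(X, Y, chunk_size=None):
    """Unbiased MMD² estimate (assumed estimator, not taken from frouros)."""
    n, m = len(X), len(Y)
    # For this kernel k(x, x) = 1, so the diagonal contributes n (or m).
    k_xx = (kernel_sum(X, X, chunk_size) - n) / (n * (n - 1))
    k_yy = (kernel_sum(Y, Y, chunk_size) - m) / (m * (m - 1))
    k_xy = kernel_sum(X, Y, chunk_size) / (n * m)
    return k_xx + k_yy - 2.0 * k_xy


rng = np.random.default_rng(31)
X_ref = rng.normal(0.0, 1.0, size=(20, 1))
X_test = rng.normal(0.3, 1.0, size=(20, 1))

full = mmd2_unbiased(X_ref, X_test)
for chunk_size in range(1, 21):
    # Compare with a tolerance: the summation order differs between chunkings.
    assert np.isclose(full, mmd2_unbiased(X_ref, X_test, chunk_size=chunk_size))
```

In this sketch every chunk_size from 1 to 20 agrees with the unchunked value, which is the behaviour expected from the detector as well. The comparison uses `np.isclose` rather than `==` because a different summation order can change the last few bits of a floating-point result.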

Actual Results

Traceback (most recent call last):
  File "/home/jaime/.config/JetBrains/PyCharm2023.1/scratches/frouros/expected/data_drift/batch/mmd_chunk.py", line 30, in <module>
    assert result.distance == result_chunk.distance
AssertionError

Versions

'0.5.1'