Incorrect result when using MMD with some chunk_size argument values
jaime-cespedes-sisniega commented
Describe the bug
Incorrect result when using MMD with some chunk_size argument values. For many chunk_size values there is a difference between the MMD² with chunk_size=None and the MMD² with chunk_size!=None.
With the code provided to reproduce the problem, the following chunk_size values produce an incorrect result: 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 18, 19. The remaining values between 1 and 20 produce the correct result.
Steps/Code to Reproduce
from frouros.detectors.data_drift import MMD
import numpy as np
from functools import partial
from frouros.utils.kernels import rbf_kernel
np.random.seed(seed=31)
dim = 1
size = 20
kernel = partial(rbf_kernel, sigma=0.5)
chunk_size = 4
X_ref = np.random.multivariate_normal(mean=np.zeros(dim), cov=np.identity(dim), size=size)
X_test = np.random.multivariate_normal(mean=np.full(dim, 0.3), cov=np.identity(dim), size=size)
detector = MMD(
    kernel=kernel,
    chunk_size=None,
)
detector.fit(X_ref)
result, _ = detector.compare(X=X_test, verbose=True)
detector_chunk = MMD(
    kernel=kernel,
    chunk_size=chunk_size,
)
detector_chunk.fit(X_ref)
result_chunk, _ = detector_chunk.compare(X=X_test, verbose=True)
assert result.distance == result_chunk.distance
Expected Results
No error is thrown: the assertion passes because both detectors return the same distance regardless of chunk_size.
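As a point of comparison, here is a minimal NumPy-only sketch of the expected behaviour. It does not use frouros and is not its implementation. The helpers `rbf`, `kernel_sum` and `mmd2` are hypothetical names introduced for illustration, `rbf` is assumed to match the usual RBF definition exp(-||a - b||² / (2σ²)), and the estimator is the standard unbiased MMD², which may differ in detail from the one frouros computes. The point is only that summing the kernel matrix block by block should give the same value for every chunk size, including ones that leave a smaller final block.

```python
import numpy as np


def rbf(a, b, sigma=0.5):
    # Hypothetical stand-in for an RBF kernel: exp(-||a - b||^2 / (2 * sigma^2)).
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma**2))


def kernel_sum(a, b, chunk_size=None):
    # Sum of all kernel entries k(a_i, b_j), optionally accumulated block by block.
    if chunk_size is None:
        return rbf(a, b).sum()
    total = 0.0
    for i in range(0, len(a), chunk_size):
        for j in range(0, len(b), chunk_size):
            # Slicing past the end is safe, so the last block may be smaller.
            total += rbf(a[i:i + chunk_size], b[j:j + chunk_size]).sum()
    return total


def mmd2(x, y, chunk_size=None):
    # Standard unbiased MMD^2 estimate. The within-sample sums drop the
    # diagonal, which for an RBF kernel is k(z, z) = 1 for every sample.
    n, m = len(x), len(y)
    k_xx = (kernel_sum(x, x, chunk_size) - n) / (n * (n - 1))
    k_yy = (kernel_sum(y, y, chunk_size) - m) / (m * (m - 1))
    k_xy = kernel_sum(x, y, chunk_size) / (n * m)
    return k_xx + k_yy - 2.0 * k_xy


rng = np.random.default_rng(31)
x = rng.normal(0.0, 1.0, size=(20, 1))
y = rng.normal(0.3, 1.0, size=(20, 1))

reference = mmd2(x, y)
for chunk_size in range(1, 21):
    # Compare with a tolerance: block-wise summation changes the order of
    # floating-point additions, so the last digits may differ.
    assert np.isclose(mmd2(x, y, chunk_size), reference)
```

The sketch compares values with `np.isclose` rather than `==`, since a different summation order can legitimately change the final bits of a floating-point result even when the chunking logic is correct.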
Actual Results
Traceback (most recent call last):
  File "/home/jaime/.config/JetBrains/PyCharm2023.1/scratches/frouros/expected/data_drift/batch/mmd_chunk.py", line 30, in <module>
    assert result.distance == result_chunk.distance
AssertionError
Versions
'0.5.1'