sbi-dev/sbi

c2st fails when one feature is constant

Baschdl opened this issue · 1 comment

Describe the bug

Running c2st/c2st_scores with the default z_score=True when at least one feature is constant (all data points have the same value for this feature) fails with ValueError: Input X contains NaN. RandomForestClassifier does not accept missing values encoded as NaN natively....
This is caused by dividing the data by the standard deviation of this feature (which is zero):

sbi/sbi/utils/metrics.py

Lines 161 to 165 in 83e122a

if z_score:
    X_mean = torch.mean(X, dim=0)
    X_std = torch.std(X, dim=0)
    X = (X - X_mean) / X_std
    Y = (Y - X_mean) / X_std
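
To make the failure mode concrete, here is a minimal sketch that repeats the z-scoring step outside of c2st, using the same all-constant tensors as the reproduction in this report. The names X_z and Y_z are only illustrative and do not appear in sbi.

```python
import torch

# Same toy data as in the reproduction: every feature is constant.
X, Y = torch.ones(5, 2), torch.zeros(5, 2)

X_mean = torch.mean(X, dim=0)
X_std = torch.std(X, dim=0)  # exactly zero for each constant feature

# Illustrative names (not from sbi): the z-scored copies of X and Y.
X_z = (X - X_mean) / X_std  # 0 / 0 gives NaN
Y_z = (Y - X_mean) / X_std  # -1 / 0 gives -inf

print(X_std)
print(torch.isnan(X_z).all().item(), torch.isinf(Y_z).all().item())
```

The NaN entries in X_z are what the classifier later rejects.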

To Reproduce

from sbi.utils.metrics import c2st
import torch

X, Y = torch.ones(5,2), torch.zeros(5,2)
c2st(X, Y)

Thanks for reporting this @Baschdl

I can reproduce it only when all features are constant. But still, this should not happen. I suggest setting std=1 when the feature is constant so that we are only shifting it to zero in that case.
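
A rough sketch of what that suggestion could look like, not the actual patch: z_score_safe is a hypothetical helper name, and comparing the standard deviation to exactly zero is an assumption that would not catch nearly constant features.

```python
import torch


def z_score_safe(X, Y):
    """Hypothetical helper: z-score X and Y using the statistics of X.

    Features that are constant in X get std=1, so they are only shifted.
    """
    X_mean = torch.mean(X, dim=0)
    X_std = torch.std(X, dim=0)
    # Assumption: a feature counts as constant only if its std is exactly zero.
    X_std = torch.where(X_std == 0, torch.ones_like(X_std), X_std)
    return (X - X_mean) / X_std, (Y - X_mean) / X_std


X, Y = torch.ones(5, 2), torch.zeros(5, 2)
X_z, Y_z = z_score_safe(X, Y)
print(X_z)  # constant features of X are shifted to zero
print(Y_z)  # Y is shifted by the same mean and stays finite
```

With the data from the report, both outputs are finite, so the classifier would no longer see NaN inputs.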