elephaint/pgbm

Error when entering data containing NaN


I am interested in probabilistic GBMs and found your PGBM repository while researching them.
According to the Feature overview in the README, pgbm.torch.PGBM supports NaN values in the input.
However, when I ran the following code, I got ValueError: Input X contains NaN.

import numpy as np
from pgbm.torch import PGBMRegressor
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_california_housing
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

# Append an extra feature column that consists entirely of NaN values
nans = np.full((X_train.shape[0], 1), np.nan)
X_train = np.append(X_train, nans, axis=1)

# Fitting raises ValueError: Input X contains NaN
model = PGBMRegressor().fit(X_train, y_train)

This is a slight modification of your sample code.
If you know anything about this problem, please let me know.

Hi,

You are right. The sklearn wrapper for the Torch version slipped through the NaN-input unit tests because of a small mistake on my part. The fix was simple and has now been released. Reinstalling PGBM from pip should solve the issue (make sure the installed version is 2.1.1). Please re-open this issue if the problem persists.
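
To confirm that the reinstall picked up the fixed release, something like the following sketch may help. It assumes the pip distribution is named pgbm and that the version string is plain dotted numbers such as 2.1.1. The helpers version_tuple and has_nan_fix are hypothetical names used only for illustration and are not part of PGBM.

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn a plain dotted release string such as '2.1.1' into (2, 1, 1).

    Assumption: the string holds only numeric components (no 'rc1', 'dev', etc.).
    """
    return tuple(int(part) for part in text.split(".")[:3])


def has_nan_fix(package="pgbm", minimum="2.1.1"):
    """Return True if `package` is installed at version `minimum` or newer."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        # The package is not installed in this environment.
        return False
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    print("NaN fix available:", has_nan_fix())
```

If this reports False after reinstalling, the old version is probably still cached or installed in a different environment than the one running the script.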