riken-aip/pyHSICLasso

A bug occurs when there are few explanatory variables

mamitsu2 opened this issue · 6 comments

The bug occurs when there are only a few explanatory variables. Please fix it!

Thanks for the comments.

Could you share the dataset you used and your setup? Then, we can reproduce the bug.

Thanks!

Sorry for the incomplete report.
I used sklearn.datasets.load_boston to test this module.
With the following code, the bug does not occur.

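import numpy as np
import pandas as pd
from sklearn.datasets import load_boston
from pyHSICLasso import HSICLasso

dataset = load_boston()
# build DataFrames for the features and the target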
X1_ = pd.DataFrame(dataset.data, columns=dataset.feature_names)
y1_ = pd.DataFrame(dataset.target, columns=['y'])
X1_ = X1_.iloc[:,:]
X1 = np.array(X1_)
y1 = np.array(y1_)
X1_col = X1_.columns
hsic_lasso = HSICLasso()
hsic_lasso.input(X1,y1.flatten(),featname=list(X1_col))
hsic_lasso.regression(num_feat=X1.shape[1], discrete_x=False, n_jobs=2)
hsic_lasso.dump()
hsic_lasso.get_index_score()

However, if I reduce the number of explanatory variables as follows,

X1_ = X1_.iloc[:,:5]

the following error occurs:
ValueError: attempt to get argmax of an empty sequence

The error is raised here:

~/.pyenv/versions/anaconda3-5.2.0/lib/python3.6/site-packages/pyHSICLasso/nlars.py in nlars(X, X_ty, num_feat, max_neighbors)
     97         XtXbeta = np.dot(X.transpose(), np.dot(X, beta))
     98         c = X_ty - XtXbeta
---> 99         j = np.argmax(c[I])
    100         C = max(c[I])
    101 

Thanks for the detailed information. We will investigate this case.

I've got the same problem with another dataset with few explanatory variables.

I then replicated the error with sklearn.datasets.load_boston and 5 features. It seems that the array I becomes empty ([]) when lasso_cond=0. This case is neither checked in the while loop nor handled anywhere else.
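
To illustrate the failure mode described above, here is a minimal sketch that does not use pyHSICLasso at all. It only shows that NumPy raises this exact ValueError when argmax is called on an empty selection, and how a guard could detect that case. The helper name safe_argmax and the values in c are hypothetical and chosen for illustration; this is not the library's code or an official fix.

```python
import numpy as np


def safe_argmax(c, inactive):
    """Hypothetical guard: return (position, value) of the largest entry of
    c[inactive], or None when `inactive` is empty.

    As with np.argmax(c[I]) in the traceback, the returned position is
    relative to `inactive`, not an index into `c` itself.
    """
    inactive = np.asarray(inactive, dtype=int)
    if inactive.size == 0:
        # Nothing left to select from; let the caller decide how to stop.
        return None
    sub = c[inactive]
    j = int(np.argmax(sub))
    return j, float(sub[j])


# Made-up correlation values, for illustration only.
c = np.array([0.3, 0.9, 0.1])

# Reproduce the error: argmax over an empty selection.
try:
    np.argmax(c[np.array([], dtype=int)])
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
print(safe_argmax(c, [0, 2]))
print(safe_argmax(c, []))
```

Whether the loop in nlars should simply stop when I is empty, or whether an empty I points to an earlier problem in the solver, is something only the maintainers can confirm.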

Any hint to fix this issue? I think the library is very interesting, and HSIC-based optimization may be useful too for datasets with few columns.

Thank you!

Thanks for your input. We have been looking at alternative Lasso solvers. Unfortunately, we haven't found one that checks all of our boxes. We'll be on the lookout for new solvers that would address this issue.

Thanks for reporting the bug. Has anyone solved it?