Fail + weird error message with parameters positive=True+fit_intercept=True
rflamary opened this issue · 4 comments
Hello, and congratulations on the awesome toolbox.
Not only is it the fastest one in the west, but it also offers fun parameters such as weights that others do not.
Still, I found a surprising behavior that in my opinion should be prevented.
When I run the following code:
import numpy as np
import celer
n=6
d=9
X=np.random.rand(n,d)
y=np.ones(n)
w=np.random.rand(n)
alpha=0.1
model=celer.Lasso(alpha,weights=w,positive=True)
model.fit(X,y)
I get the following warnings:
/home/.../python3.8/site-packages/celer/homotopy.py:290: RuntimeWarning: invalid value encountered in true_divide
theta /= scal
!!! Inner solver did not converge at epoch 49999, gap: 0.00e+00 > nan
...
!!! Inner solver did not converge at epoch 49999, gap: 0.00e+00 > nan
!!! Inner solver did not converge at epoch 49999, gap: 0.00e+00 > nan
These warnings disappear when I also pass fit_intercept=False, which is actually the configuration I'm interested in, so it's OK.
Still, if someone wants both positive coefficients and an intercept, it would be nice to have a working solution. If that is too hard to implement, you should raise an error or force fit_intercept=False with a clear warning.
Hi Rémy,
Thanks for the kind words.
You use y = np.ones() and fit an (unpenalized) intercept. Under the hood we center y and thus fit with y = np.zeros(), hence all sorts of bad stuff can (and apparently does) happen.
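To make the centering effect concrete, here is a minimal NumPy sketch. It assumes plain mean-centering of y; `scal` and `theta` are stand-ins chosen to mimic the `theta /= scal` line from the warning, not celer's actual computation.

```python
import numpy as np

n = 6
y = np.ones(n)

# With an unpenalized intercept, the target is centered before solving
# (assumption: plain mean-centering, as described above).
y_centered = y - y.mean()

# A rescaling step that divides by a norm of the centered target then
# computes 0 / 0. `scal` is a stand-in, not celer's actual quantity.
scal = np.linalg.norm(y_centered, ord=np.inf)
with np.errstate(invalid="ignore"):  # silence the RuntimeWarning here
    theta = y_centered / scal

print(y_centered)  # every entry is zero
print(theta)       # every entry is nan
```

Any constant y behaves the same way, since subtracting the mean leaves nothing to fit.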
Also, be careful not to pass negative weights. I believe I should raise an error in that case, as this is easy to check.
I tried with y = np.random.randn() and the error disappears, but the solver does not converge (even without intercept and without positivity; I suspect a wrong screening), so I'll look into it in more detail this week.
import numpy as np
import celer
n = 6
d = 9
X = np.random.rand(n,d)
y = np.random.randn(n)
w = np.abs(np.random.rand(n))
alpha = 0.1
model = celer.Lasso(alpha, weights=w, positive=True, verbose=True, max_iter=10).fit(X, y)
Now I understand the problem. I know a constant y is not common, but that's what happens when you use a Lasso solver for unrelated stuff!
Still, note that the weights drawn with np.random.rand(n) are positive ;)
Take your time; I found a way to make it work, so this was just a suggestion.
My bad, I read it too fast as randn :)
The modified version that still failed was:
import numpy as np
import celer
n = 6
d = 9
np.random.seed(0)
X = np.random.rand(n, d)
y = np.random.randn(n)
weights = np.abs(np.random.rand(n))
alpha = 0.1
clf = celer.Lasso(alpha, weights=weights, positive=True,
                  verbose=True, max_iter=10, p0=10, prune=False).fit(X, y)
which fails because weights has shape (6,), not (9,) as it should.
I will add a test that weights has shape (n_features,).
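As a rough sketch of what such a check could look like on the user side, here is a small validation helper. The name `check_weights` is hypothetical and not part of celer; it only combines the two conditions discussed in this thread (one weight per feature, no negative weights).

```python
import numpy as np


def check_weights(weights, n_features):
    """Hypothetical helper (not celer's API): validate per-feature weights."""
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (n_features,):
        raise ValueError(
            f"weights has shape {weights.shape}, expected ({n_features},)"
        )
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")
    return weights


n, d = 6, 9
rng = np.random.default_rng(0)
X = rng.random((n, d))

# One weight per feature: accepted.
ok = check_weights(rng.random(d), X.shape[1])

# One weight per sample, as in the snippets above: rejected.
try:
    check_weights(rng.random(n), X.shape[1])
except ValueError as err:
    message = str(err)
    print(message)
```

Calling such a helper before `fit` turns the silent non-convergence into an immediate, readable error.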