Allow fitting an intercept in `Lasso.path()`
jolars opened this issue · 3 comments
It would be nice if Lasso.path()
(celer_path()
) could fit an intercept.
Hi @jolars ! The design of celer_path
follows that of scikit-learn, in order to keep our Lasso
class as simple as possible (we do not overwrite methods other than __init__
and path
).
scikit-learn computes the intercept in the fit
method, by centering X and y : https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/linear_model/_coordinate_descent.py#L1008
then the intercept is computed after the coefficients have been computed : https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/linear_model/_coordinate_descent.py#L1084
So I guess it would not be too complicated to wrap celer_path() in the same fashion
Indeed, it won't be difficult to implement such a feature in celer_path
. Yet, since we expose the same API as scikit-learn, we better not do it so that to abide by their convention.
With that being said, here is a function that implements your requested feature
def custom_celer_path(X, y, **kwargs):
X_cols_mean = X.mean(axis=0)
y_mean = y.mean()
X_cented = X - X_cols_mean
y_cented = y - y_mean
_, coefs, _ = celer_path(X_cented, y_cented, pb='lasso', return_n_iter=False,
return_thetas=False, **kwargs)
intercepts = y_mean - coefs.T @ X_cols_mean
return intercepts, coefs
One can show that solving a Lasso with intercept is the same as solving a problem with centered y
and X
(with regards to features) with the following link equation
\beta_0 = y_mean - \beta^T X_{centered}
Here is a little snippet to check/play with the function:
import numpy as np
from numpy.linalg import norm
from celer import celer_path, Lasso
from celer.datasets import make_correlated_data
# fix random state for reproducibility
random_state = 123
rnd = np.random.RandomState(random_state)
n_samples, n_features = 1000, 100
X, y, _ = make_correlated_data(n_samples, n_features, random_state=random_state)
# discenter X, y to make intercept != 0
for j in rnd.choice(n_features, 10):
X[:, j] += rnd.uniform(5, 10)
y += rnd.uniform(5, 10)
tol = 1e-14
eps, n_alphas = 1e-3, 100
alpha_max = norm(X.T@y, ord=np.inf) / n_samples
alphas = alpha_max * np.geomspace(1, eps, n_alphas)
# fit with celer_path
intercepts, coefs = custom_celer_path(X, y, alphas=alphas, tol=tol)
# fit alphas individually with Lasso
ind_intercepts = []
for i, alpha in enumerate(alphas):
model = Lasso(alpha=alpha, fit_intercept=True, tol=tol)
model.fit(X, y)
ind_intercepts.append(model.intercept_)
np.testing.assert_allclose(intercepts, ind_intercepts, atol=1e-7)
Note that the custom_celer_path
is faster than fitting individually the alphas with Lasso
since custom_celer_path
solves sequentially the problems along the given path.
Thanks @Badr-MOUFAD ! For sparse datasets we can't center X as it breaks sparsity but the same kind of technique works. It has been done in benchopt/benchmark_lasso_path#3 for example