GroupLasso silently assumes that the labels y are stored in a column vector
normanius opened this issue · 1 comment
normanius commented
Thanks for your work so far. I had a first look at your group-lasso tool. It appears to run very fast, though I haven't tested the performance in detail yet.
I observed a small problem when y is stored in a 1-D array:
How to reproduce:
X = ...
y = ...
y = y.flatten()
gl.fit(X,y)
This raises the following exception:
Traceback (most recent call last):
  File "trainExplore.py", line 73, in <module>
    executor.run()
  File "/Users/norman/workspace/education/phd/projects/geomtk/python/utilities/executor.py", line 811, in run
    args.func(args)
  File "/Users/norman/workspace/education/phd/projects/geomtk/python/utilities/executor.py", line 403, in _runSingle
    self.ret = self._functor(args.file, args.outDir, args, taskInfo)
  File "trainExplore.py", line 48, in runTraining
    returnStd=False)
  File "/Users/norman/workspace/education/phd/projects/geomtk/python/utilities/ml.py", line 338, in testBinaryClassifiers
    clf.fit(XTrain, yTrain)
  File "/Users/norman/workspace/dev/misc/python/group-lasso/group_lasso/_group_lasso.py", line 258, in fit
    self._fista(X, y, lipschitz_coef=lipschitz_coef)
  File "/Users/norman/workspace/dev/misc/python/group-lasso/group_lasso/_group_lasso.py", line 208, in _fista
    prox
  File "/Users/norman/workspace/dev/misc/python/group-lasso/group_lasso/_group_lasso.py", line 155, in _fista_it
    u_ = prox(v - grad(v)/L)
  File "/Users/norman/workspace/dev/misc/python/group-lasso/group_lasso/_group_lasso.py", line 185, in grad
    SSE_grad = _subsampled_l2_grad(X, w, y, self.subsampling_scheme)
  File "/Users/norman/workspace/dev/misc/python/group-lasso/group_lasso/_group_lasso.py", line 32, in _subsampled_l2_grad
    A, b = subsample(subsampling_scheme, A, b)
  File "/Users/norman/workspace/dev/misc/python/group-lasso/group_lasso/_subsampling.py", line 54, in subsample
    return _extract_from_singleton_iterable([X[inds, :] for X in Xs])
  File "/Users/norman/workspace/dev/misc/python/group-lasso/group_lasso/_subsampling.py", line 54, in <listcomp>
    return _extract_from_singleton_iterable([X[inds, :] for X in Xs])
IndexError: too many indices for array
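As far as I can tell, the failure comes from the two-dimensional indexing `X[inds, :]` in the last frame being applied to a 1-D y. A minimal NumPy-only sketch of that mechanism, independent of group-lasso (the array sizes and the `inds` values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: five targets stored as a column vector, then flattened.
y_column = rng.standard_normal((5, 1))  # shape (5, 1)
y_flat = y_column.flatten()             # shape (5,)

# Hypothetical row indices, standing in for a subsampling step.
inds = np.array([0, 2, 4])

# Two-dimensional indexing works on the column vector.
rows_from_column = y_column[inds, :]    # shape (3, 1)

# The same expression fails on the 1-D array, because there is no
# second axis for the ":" to select from.
try:
    y_flat[inds, :]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```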
You probably need a check on the shape of y at the beginning of the fit function.
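Something along these lines might do, as a rough sketch only. The helper name `as_column_vector` is hypothetical and not part of group-lasso, and I am assuming that 2-D inputs should be passed through unchanged:

```python
import numpy as np


def as_column_vector(y):
    """Hypothetical helper: return y as a 2-D array.

    A 1-D array of length n is reshaped to (n, 1); a 2-D array is
    returned unchanged; anything else is rejected with a ValueError.
    """
    y = np.asarray(y)
    if y.ndim == 1:
        return y.reshape(-1, 1)
    if y.ndim == 2:
        return y
    raise ValueError(
        "y must be 1- or 2-dimensional, got {} dimensions".format(y.ndim)
    )


# Usage: a flattened y becomes a column vector again.
y_flat = np.arange(4.0)
y_checked = as_column_vector(y_flat)
print(y_checked.shape)
```

Calling such a helper once at the top of fit would let the rest of the code keep assuming a column vector.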
It's only a detail, but I wanted to let you know.
Good luck with the further development!