abess-team/abess

abess.linear.LogisticRegression returns negative test_score in cross_validate

AnrWang opened this issue · 1 comment

Describe the bug

In my experiment, using abess.linear.LogisticRegression with cross_validate returns negative test_score values. The following code provides an example.

Code for Reproduction

Fit a LogisticRegression on samples from a Hypersphere(dim=9) embedded in 10-D Euclidean space (without applying the logarithm map).

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_validate
from geomstats.geometry.hypersphere import Hypersphere
from abess import LogisticRegression

sphere = Hypersphere(dim=9)
labels = np.concatenate((np.zeros(1000),np.ones(1000)))
data0 = sphere.random_riemannian_normal(mean=np.array([1/3,0,2/3,0,2/3,0,0  ,0,0  ,0]), n_samples=1000, precision=5)
data1 = sphere.random_riemannian_normal(mean=np.array([0  ,0,0  ,0,2/3,0,2/3,0,1/3,0]), n_samples=1000, precision=5)
data = np.concatenate((data0,data1))
train_data, test_data, train_labels, test_labels = train_test_split(data, labels, test_size=0.33, random_state=0)
result = cross_validate(LogisticRegression(support_size=range(0, 11)), train_data, train_labels)
print(result)

return:

{'fit_time': array([0.01018214, 0.0107348 , 0.00791812, 0.00899959, 0.00998664]), 'score_time': array([0.0010004, 0.       , 0.0010004, 0.       , 0.       ]), 'test_score': array([-96.61424923, -94.28383166, -91.52310614, -95.26857002,
       -87.02766473])}

I found this is caused by the definition of the "score" function: abess.LogisticRegression defines score as the log-likelihood (entropy) of the fitted model, so it can be negative.

It may be better to use prediction accuracy instead; I will submit a pull request to update it soon. Thank you for reporting this issue!
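In the meantime, a workaround is to override the estimator's own score method by passing an explicit scoring argument to cross_validate. A minimal self-contained sketch of this idea, using a hypothetical toy classifier whose score returns a log-likelihood (standing in for abess's current behavior, so the example does not require abess or geomstats to run):

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.model_selection import cross_validate

class EntropyScoreClassifier(BaseEstimator, ClassifierMixin):
    """Toy classifier whose default score is a log-likelihood,
    mimicking the negative test_score seen with abess."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.p_ = float(np.mean(y))  # fitted Bernoulli rate
        return self

    def predict(self, X):
        return np.full(len(X), int(self.p_ >= 0.5))

    def score(self, X, y):
        # Log-likelihood of the fitted Bernoulli model -- always negative
        p = np.clip(self.p_, 1e-12, 1 - 1e-12)
        return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (rng.random(200) < 0.7).astype(int)

# Default: cross_validate calls the estimator's .score -> negative values
default = cross_validate(EntropyScoreClassifier(), X, y)

# Workaround: force accuracy as the scorer -> values in [0, 1]
accuracy = cross_validate(EntropyScoreClassifier(), X, y, scoring="accuracy")

print(default["test_score"])
print(accuracy["test_score"])
```

The same `scoring="accuracy"` argument can be passed in the reproduction code above to get accuracy scores from abess.LogisticRegression until the fix is released.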