nomewang/M3DM

When computing ROC-AUC, the ground-truth labels are 0 or 1, but the predicted scores are values around 10~30 — can ROC-AUC still be computed in this case?

limaodaxia opened this issue · 2 comments

When computing ROC-AUC, the ground-truth labels are 0 or 1, but the predicted scores are values around 10~30 — can ROC-AUC still be computed in this case? Normally the predicted scores should lie between 0 and 1, so that thresholds can be chosen to draw the ROC curve and compute the AUC, right?

Scores are not required to lie between 0 and 1. ROC-AUC is computed by sweeping a threshold over the scores and measuring the True Positive Rate and False Positive Rate at each threshold, so only the ranking of the scores matters, not their absolute range.
Here is the relevant description from the scikit-learn documentation:
In the binary case, you can either provide the probability estimates, using the classifier.predict_proba() method, or the non-thresholded decision values given by the classifier.decision_function() method. In the case of providing the probability estimates, the probability of the class with the “greater label” should be provided. The “greater label” corresponds to classifier.classes_[1] and thus classifier.predict_proba(X)[:, 1]. Therefore, the y_score parameter is of size (n_samples,).
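To make this concrete, here is a minimal pure-Python sketch (not the repository's code; `sklearn.metrics.roc_auc_score` behaves the same way). It computes AUC as the Mann-Whitney rank statistic, and shows that anomaly scores in the 10~30 range give exactly the same AUC as the same scores monotonically squashed into (0, 1). The labels and scores below are made-up illustration values:

```python
import math

def roc_auc(labels, scores):
    # AUC equals the probability that a randomly chosen positive sample
    # scores higher than a randomly chosen negative one (ties count 0.5),
    # so only the ordering of the scores matters, not their scale.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 0, 1, 1]
raw = [12.1, 25.3, 24.8, 11.0, 29.5, 18.2]      # scores in the 10~30 range
# A monotone map (sigmoid) into (0, 1) preserves the ranking of the scores.
squashed = [1 / (1 + math.exp(-(s - 20))) for s in raw]

print(roc_auc(labels, raw))       # → 0.7777777777777778
print(roc_auc(labels, squashed))  # → 0.7777777777777778 (unchanged)
```

Because the ROC curve is traced out by thresholding, any strictly increasing rescaling of the scores leaves the curve, and hence the AUC, unchanged.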

Thank you very much.