Inclusion of more metrics
I have always used metrics such as precision, recall, and F-score in conjunction with AUC. These metrics are important to consider when there is class imbalance: precision, recall, and F-score give class-specific information about how well the model identifies a particular class, whereas AUC, while helpful for measuring the model's ability to distinguish between classes, does not provide class-specific detail.
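For reference, a minimal sketch of how these metrics could be computed alongside AUC. It assumes scikit-learn, made-up labels and probabilities, and a 0.5 threshold for binarizing the probabilities; none of these come from this repo's code:

```python
# Sketch only (not from this repo): class-specific metrics alongside AUC,
# assuming binary labels, predicted probabilities, and a 0.5 threshold.
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score

y_true = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]                        # ground-truth labels
y_prob = [0.1, 0.2, 0.3, 0.4, 0.45, 0.5, 0.55, 0.6, 0.8, 0.9]  # model probabilities
y_pred = [int(p >= 0.5) for p in y_prob]                       # thresholded predictions

print(f"Precision: {precision_score(y_true, y_pred):.3f}")
print(f"Recall:    {recall_score(y_true, y_pred):.3f}")
print(f"F-score:   {f1_score(y_true, y_pred):.3f}")
print(f"AUC:       {roc_auc_score(y_true, y_prob):.3f}")  # uses probabilities, no threshold
```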
Since these metrics are already included in the R code, I hope that, together with my reasoning above, this provides a valid use case for including them in the Python scripts as well. :)
Ideally, these are the types of stats I would like to see when comparing model performance during experimentation:
Originally posted by @sushantkhare in #153 (reply in thread)
I don't think there is enough demand for this, and it's very easy for these kinds of evaluation metrics to be misleading in our setup, where we have probability models in a ranking scenario. Closing for now; if anyone feels a need for this, I'm open to reconsidering.
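To illustrate the concern: precision, recall, and F-score all depend on an arbitrary decision threshold, while AUC is threshold-free and directly reflects ranking quality. A small sketch (assuming scikit-learn and made-up data, neither from this repo):

```python
# Illustrative sketch: the same probability model gets very different
# F-scores depending on the chosen threshold, while AUC is unchanged
# because it is computed from the ranking of probabilities alone.
from sklearn.metrics import f1_score, roc_auc_score

y_true = [0, 0, 1, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.15, 0.2, 0.35, 0.4, 0.45, 0.5, 0.6, 0.7, 0.75, 0.9]

for threshold in (0.3, 0.5, 0.7):
    y_pred = [int(p >= threshold) for p in y_prob]
    print(f"threshold={threshold}: F-score={f1_score(y_true, y_pred):.3f}")

print(f"AUC: {roc_auc_score(y_true, y_prob):.3f}")  # identical for every threshold
```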