multiple reductions on data
dovinmu commented
I'm working on a binary classification script, and at the end I want to compute multiple stats on the results to see how well the model(s) did. But if I compute, e.g., precision as a 1.5=>0 transform (whole table in, scalar out), then I can't use the same data to compute recall. Here's how I'm getting around that now:

    import numpy as np

    # Assumes defop, Table, and to_np_int_array are already in scope.

    def recall(predictions: np.ndarray, true_values: np.ndarray) -> float:
        # Recall = TP / (TP + FN): share of actual positives predicted positive.
        TP = ((predictions == 1) & (true_values == 1)).sum()
        FN = ((predictions == 0) & (true_values == 1)).sum()
        return TP / (TP + FN)

    def precision(predictions: np.ndarray, true_values: np.ndarray) -> float:
        # Precision = TP / (TP + FP): share of predicted positives that are correct.
        TP = ((predictions == 1) & (true_values == 1)).sum()
        FP = ((predictions == 1) & (true_values == 0)).sum()
        return TP / (TP + FP)

    @defop('compute-accuracy', 1.5, 0.5)
    def compute(aipl, t:Table, predictions_colname, true_values_colname) -> dict:
        # Convert each column once and reuse the arrays for every metric.
        true_values = to_np_int_array(t, true_values_colname)
        predictions = to_np_int_array(t, predictions_colname)
        return {
            'recall': recall(predictions, true_values),
            'precision': precision(predictions, true_values),
        }

But it would be awesome to figure out a better and more general way of supporting these operations.
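
One possible shape for a more general version, as a rough sketch only: count the confusion-matrix cells once, then derive every stat from those counts through a small registry of metric functions. This is plain Python and not an existing AIPL feature; `confusion_counts`, `safe_div`, `METRICS`, and `reduce_many` are hypothetical names made up for illustration, and the inputs are assumed to be equal-length sequences of 0/1 labels.

```python
from collections import Counter

def confusion_counts(predictions, true_values):
    """Count TP/FP/FN/TN in a single pass over paired 0/1 labels."""
    pairs = Counter((int(p), int(t)) for p, t in zip(predictions, true_values))
    return {
        'TP': pairs[(1, 1)],
        'FP': pairs[(1, 0)],
        'FN': pairs[(0, 1)],
        'TN': pairs[(0, 0)],
    }

def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float('nan')

# Hypothetical registry: each metric is a function of the shared counts,
# so adding a new stat does not require another pass over the data.
METRICS = {
    'accuracy': lambda c: safe_div(c['TP'] + c['TN'], sum(c.values())),
    'precision': lambda c: safe_div(c['TP'], c['TP'] + c['FP']),
    'recall': lambda c: safe_div(c['TP'], c['TP'] + c['FN']),
}

def reduce_many(predictions, true_values, metrics=METRICS):
    """Apply every registered metric to one set of confusion counts."""
    counts = confusion_counts(predictions, true_values)
    return {name: fn(counts) for name, fn in metrics.items()}

if __name__ == '__main__':
    stats = reduce_many([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
    for name, value in stats.items():
        print(f'{name}: {value:.3f}')
```

The dict returned by `reduce_many` has the same shape as the row returned by `compute` above, so in principle a single op could return it directly instead of calling each metric by hand.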