[FR] Confidence intervals for metrics
NightMachinery opened this issue · 0 comments
NightMachinery commented
It seems that currently simple metrics such as
evaluate.load(
"accuracy",
)
do not compute a confidence interval. This can be easily fixed by first computing the mean and the standard deviation (STD) of the per-sample scores, and then dividing the STD by the square root of the sample count (to get the STD of the mean estimate, i.e. the standard error). (See, e.g., here.)
Even just returning the variance (or STD) would be enough; users can do their own computations from there.
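A minimal sketch of the computation described above, in plain Python. The function name `accuracy_with_ci`, its `z` parameter, and the returned keys are hypothetical and not part of `evaluate`. The interval assumes a normal approximation (z = 1.96 for roughly 95%), which can be loose for small samples and can extend outside [0, 1].

```python
import math


def accuracy_with_ci(predictions, references, z=1.96):
    """Accuracy plus a normal-approximation confidence interval.

    Hypothetical helper for illustration; not an `evaluate` API.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    n = len(predictions)
    if n < 2:
        raise ValueError("need at least two samples to estimate the STD")

    # Per-sample correctness: 1.0 for a match, 0.0 otherwise.
    correct = [1.0 if p == r else 0.0 for p, r in zip(predictions, references)]

    mean = sum(correct) / n
    # Sample variance with Bessel's correction (n - 1 in the denominator).
    variance = sum((c - mean) ** 2 for c in correct) / (n - 1)
    std = math.sqrt(variance)
    # Standard error of the mean: STD divided by the square root of n.
    sem = std / math.sqrt(n)

    return {
        "accuracy": mean,
        "std": std,
        "sem": sem,
        "ci_low": mean - z * sem,
        "ci_high": mean + z * sem,
    }


result = accuracy_with_ci(predictions=[1, 0, 1, 1], references=[1, 1, 1, 1])
print(result)
```

Returning `std` and `sem` alongside the mean leaves the choice of interval (normal, Student's t, bootstrap) to the user.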