softdevteam/warmup_experiment

Integral statistics could be misleading.

Closed this issue · 12 comments

There is a section of code in the tabling routines where we attempt to give an integral median and IQR for the iteration at which the steady state starts.

Suppose we get (by luck) a symmetric IQR of 1.5 +/- 0.1. It would look like this:

......|_____________|......
             ^
      1.4   1.5    1.6

How would we round this? Well, we need to be safe, so we should round the lower bound of the IQR down and the upper bound up, thus over-approximating the error. This would give us:

......|_____________|......
             ^
      1     1.5     2

But what would we do with the median?

If we round down we get:

......|_____________|......
      ^
      1             2

If we round up we get:

......|_____________|......
                    ^
      1             2

In neither case is the median in the middle of the error bound, which I find confusing. More confusing than showing a floating point value for the steady state iter.
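To make the problem concrete, here is a minimal Python sketch of the rounding described above. The name `round_outward` and the numbers are illustrative only; none of this is taken from the actual tabling code.

```python
import math


def round_outward(lower, upper):
    """Round an interval outward to integers so it can only over-approximate.

    Hypothetical helper for illustration, not part of the tabling routines.
    """
    return math.floor(lower), math.ceil(upper)


# The example from the diagrams: median 1.5 with an IQR of 1.4 to 1.6.
lower, median, upper = 1.4, 1.5, 1.6

int_lower, int_upper = round_outward(lower, upper)

# Whichever way the median is rounded, it lands on one of the bounds.
median_down = math.floor(median)
median_up = math.ceil(median)

print(int_lower, int_upper)    # 1 2
print(median_down, median_up)  # 1 2
```

So with integral values the reported median coincides with one end of the reported error bound, even though the underlying IQR was symmetric.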

Hrm...

@ltratt discussed with @snim2, what do you think?

I am very glad for the diagrams, otherwise I would have struggled to understand the relation of 1, 1.5, and 2 ;)

It's most common to use the mean of the two values either side of the median point. But I don't have a problem with rounding up/down either -- in the grand scheme of things it's defensible either way I think.
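As an illustration of the options mentioned here, Python's standard `statistics` module provides all three conventions. The list of steady state iterations below is made up for the example.

```python
import statistics

# Made-up steady state iterations from four hypothetical process executions.
steady_iters = [1, 1, 2, 2]

# Mean of the two values either side of the median point (may be non-integral).
print(statistics.median(steady_iters))       # 1.5

# "Rounding down" and "rounding up": pick one of the two middle values.
print(statistics.median_low(steady_iters))   # 1
print(statistics.median_high(steady_iters))  # 2
```

With an odd number of values all three functions return the same middle element, so the choice only matters for even-sized samples.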

The mean won't be integral, and rounding is clearly wrong, no?

"Clearly wrong" is kind of overstating it (there are occasions where the median is presented as a single "whole" value), although I personally tend to present it as the average of the two values either side of the median.

Hrm. I'm not sure, but as long as we have thought about it.

Just to be clear: I'm agreeing with you that the non-rounding approach is probably better.

ah!

snim2 commented

@ltratt @vext01 just to be pedantically clear, to resolve this bug report we now need to represent all three numbers in the median steady state iter (#) and IQR as floats to 1dp?

No, just the median (not the IQRs).

That's OK as long as you are OK with the median (potentially) not being in the middle of the error bounds...

Good point. Maybe 1dp is the way to go. I mean, it will still sometimes not be quite in the middle, but it'll be close enough not to worry about it.

Right. So you probably would want the median and the error all as floats.
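A rough sketch of what that could look like, assuming the median and both IQR bounds are already computed as floats. The function name `format_median_iqr` and the output layout are hypothetical and do not reflect the real table format.

```python
def format_median_iqr(median, iqr_lower, iqr_upper):
    """Render the median and IQR bounds as floats to 1 decimal place.

    Hypothetical helper: the layout "median (lower, upper)" is an
    assumption made for this example only.
    """
    return "%.1f (%.1f, %.1f)" % (median, iqr_lower, iqr_upper)


# The symmetric example from the top of this issue.
print(format_median_iqr(1.5, 1.4, 1.6))  # 1.5 (1.4, 1.6)

# An asymmetric IQR is shown as it is, rather than being widened to integers.
print(format_median_iqr(1.5, 1.2, 1.6))  # 1.5 (1.2, 1.6)
```

Keeping all three numbers at the same precision means the median sits visibly inside the bounds instead of coinciding with one of them.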