ZeroDivisionError in PmfStatDict.addValue()
golfdish opened this issue · 2 comments
Calculating self.__sample.stddev for a PmfStatDict after calling addValue results in a ZeroDivisionError when the list of samples has 1 element but its count is 2 or greater, as when an operation takes zero time (e.g. when unit testing with time.time() patched out). This is due to these lines in ExponentiallyDecayingReservoir.update (samplestats.py:151):
    priority = self.__weight(timestamp - self.startTime) / random.random()
    self.count += 1
    if (self.count <= self.size):
        self.values[priority] = value
priority is 0 when timestamp - self.startTime is 0, so every value is stored under the same key and self.samples() returns a list of length 1 (self.values.values()) while self.count is 2 or greater. Because self.count decides len(self) for a Sampler, the test at the top of
    @property
    def stddev(self):
        """Return the sample standard deviation."""
        if len(self) < 2:
            return float('NaN')
        # The stupidest algorithm, but it works fine.
        arr = self.samples()
        mean = sum(arr) / len(arr)
        bigsum = 0.0
        for x in arr:
            bigsum += (x - mean)**2
        return sqrt(bigsum / (len(arr) - 1))
in Sampler (samplestats.py:54) evaluates to False, allowing the following code to execute, with the inevitable ZeroDivisionError when it divides by len(arr) - 1, which is 0 for a one-element list.
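For anyone who wants to reproduce the mismatch without the library, here is a minimal sketch. TinySampler and its method names are hypothetical stand-ins written for this comment, not the classes in samplestats.py. It only mimics the two behaviours described above: values keyed by priority, and len() driven by the update count. The guarded variant checks the length of the sample list instead of the count, which is one possible way to avoid the division by zero and not necessarily the fix that was committed.

```python
import math


class TinySampler:
    """Hypothetical stand-in for the Sampler/reservoir pair described above."""

    def __init__(self):
        self.values = {}  # priority -> value, as in the reservoir
        self.count = 0    # number of update() calls, not of stored values

    def update(self, value, priority):
        self.count += 1
        self.values[priority] = value  # equal priorities overwrite each other

    def __len__(self):
        return self.count

    def samples(self):
        return list(self.values.values())

    def stddev_unguarded(self):
        # Same shape as the quoted property: the guard looks at the count.
        if len(self) < 2:
            return float("nan")
        arr = self.samples()
        mean = sum(arr) / len(arr)
        bigsum = sum((x - mean) ** 2 for x in arr)
        return math.sqrt(bigsum / (len(arr) - 1))

    def stddev_guarded(self):
        # Assumed alternative: guard on the list that is actually divided by.
        arr = self.samples()
        if len(arr) < 2:
            return float("nan")
        mean = sum(arr) / len(arr)
        bigsum = sum((x - mean) ** 2 for x in arr)
        return math.sqrt(bigsum / (len(arr) - 1))
```

Calling update(1.0, 0.0) and then update(2.0, 0.0) leaves len(sampler) at 2 but samples() with a single element, so stddev_unguarded() raises ZeroDivisionError while stddev_guarded() returns NaN.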
On closer inspection it appears that this is a consequence of startTime being larger than timestamp, as when startTime is set before time.time() gets patched out.
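A rough illustration of why a startTime that is far ahead of timestamp could collapse every priority to the same key. This assumes the weight is a plain exponential, exp(alpha * elapsed), with alpha = 0.015. Both the formula and the constant are assumptions on my part and are not taken from samplestats.py. The names weight, priority, start_time and patched_now are likewise made up for the sketch.

```python
import math
import random

ALPHA = 0.015  # assumed decay factor; not taken from samplestats.py


def weight(elapsed, alpha=ALPHA):
    """Assumed forward-decay weight exp(alpha * elapsed)."""
    return math.exp(alpha * elapsed)


def priority(timestamp, start_time):
    # 1.0 - random.random() lies in (0.0, 1.0], so the division is always safe.
    return weight(timestamp - start_time) / (1.0 - random.random())


start_time = 1.4e9  # a real time.time() value captured before patching
patched_now = 0.0   # what a patched-out time.time() might return

# With a hugely negative elapsed time the exponential underflows to 0.0,
# so every priority is 0.0 regardless of the random divisor.
priorities = {priority(patched_now, start_time) for _ in range(5)}
```

Under these assumptions the set priorities contains only 0.0, so every update would land on the same dictionary key. An elapsed time of exactly 0 gives a weight of 1.0 and distinct priorities, which would match the observation that the negative difference is the real trigger.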
I've pushed a less-than-ideal fix for this in e1f9118. Thanks!