Division by zero crash on startup experiment
vext01 opened this issue · 3 comments
vext01 commented
[2017-09-06 12:31:18: DEBUG] Fatal Krun error: float division by zero
File "./krun/krun.py", line 231, in main
inner_main(mailer, on_first_invocation, config, args)
File "./krun/krun.py", line 357, in inner_main
sched.run()
File "/home/kruninit/warmup_experiment/krun/krun/scheduler.py", line 524, in run
measurements, instr_data, flag = job.run(self.mailer, self.dry_run)
File "/home/kruninit/warmup_experiment/krun/krun/scheduler.py", line 373, in run
stdout, stderr, rc, self.sched.config)
File "/home/kruninit/warmup_experiment/krun/krun/util.py", line 293, in check_and_parse_execution_results
config.AMPERF_RATIO_BOUNDS)
File "/home/kruninit/warmup_experiment/krun/krun/amperf.py", line 70, in check_amperf_ratios
busy_threshold, ratio_bounds)
File "/home/kruninit/warmup_experiment/krun/krun/amperf.py", line 90, in check_core_amperf_ratios
ratio = norm_aval / norm_mval
Traceback (most recent call last):
File "./krun/krun.py", line 395, in <module>
main(parser)
File "./krun/krun.py", line 240, in main
raise exn
ZeroDivisionError: float division by zero
Probably because the iteration is so short...
vext01 commented
Yes
(Pdb) list
87 # normalise the counts to per-second readings
88 norm_aval = float(aval) / wctval
89 norm_mval = float(mval) / wctval
90 if norm_mval == 0.0:
91 import pdb; pdb.set_trace()
92 -> ratio = norm_aval / norm_mval
93 ratios.append(ratio)
94
95 if norm_aval > busy_threshold:
96 # Busy core
97 busy_iters.append(True)
(Pdb) aval
0
(Pdb) mval
0
(Pdb) wctval
13868.715989
I suppose the correct fix would be a NO_AMPERF_CHECK config, but I'm keen to find a workaround for the 1.2 data, as it already uses an older version of Krun...
Thoughts?
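For illustration, a minimal sketch of what a guarded division could look like. The helper name `safe_amperf_ratio` and the choice to return `None` when no ratio can be computed are assumptions for this sketch, not Krun's actual code:

```python
def safe_amperf_ratio(aval, mval, wctval):
    """Return the normalised APERF/MPERF ratio, or None if it is undefined.

    Hypothetical helper: mirrors the normalisation shown in the pdb
    listing above, but refuses to divide when the MPERF count is zero.
    """
    if wctval <= 0:
        # No elapsed wall-clock time, so per-second readings are undefined.
        return None
    # normalise the counts to per-second readings
    norm_aval = float(aval) / wctval
    norm_mval = float(mval) / wctval
    if norm_mval == 0.0:
        # Both counters can be zero for a very short iteration.
        return None
    return norm_aval / norm_mval


# The values from the pdb session above no longer raise.
print(safe_amperf_ratio(0, 0, 13868.715989))
```

A caller would then have to decide what to do with a `None` ratio, for example skipping the bounds check for that iteration.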
ltratt commented
The fix we've agreed upon is not to check the ratios in startup.krun.
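One way such an opt-out could be wired up, as a rough sketch only. The flag name `NO_AMPERF_CHECK` is taken from the suggestion earlier in this thread; the config classes and the `run_amperf_check` wrapper are hypothetical and may not match how Krun implements it:

```python
class StartupConfig(object):
    # Hypothetical: a config that opts out of the ratio check.
    NO_AMPERF_CHECK = True


class DefaultConfig(object):
    # No flag set, so the check runs as usual.
    pass


def run_amperf_check(config, check_fn):
    """Call check_fn() unless the config opts out.

    Returns True if the check ran, False if it was skipped.
    """
    if getattr(config, "NO_AMPERF_CHECK", False):
        return False
    check_fn()
    return True


ran = []
print(run_amperf_check(StartupConfig(), lambda: ran.append("checked")))
print(run_amperf_check(DefaultConfig(), lambda: ran.append("checked")))
```

Using `getattr` with a default means that configs which do not define the flag keep the existing behaviour.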
vext01 commented
Fixed.