embench/embench-iot

Proposal: Mechanism for *per-benchmark* iteration control and to clarify the purpose of cpu-mhz

Roger-Shepherd opened this issue · 0 comments

NB: This proposal does not change the definition of an Embench score.

At present the benchmarking system uses an option cpu_mhz for two (related) purposes:

  1. to define the frequency of the benchmarked CPU
  2. to multiply the baseline number of iterations performed by every benchmark, so that each benchmark runs long enough (seconds) for its measured execution time to be large compared to the quantum of time.

The baseline number of iterations has been set to provide a weighting between the benchmarks when computing the Embench score. The reported time for any benchmark is actual time / cpu_mhz. A 100 MHz processor runs 10x as many iterations as a 10 MHz processor, and this difference is accounted for by dividing the actual time accordingly.
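As a rough illustration of the scaling described above, here is a minimal Python sketch. The function names and the baseline figure are hypothetical and chosen only for this example; they are not taken from the Embench scripts.

```python
def total_iterations(baseline_iters: int, cpu_mhz: int) -> int:
    """Iterations actually run: the benchmark's baseline count scaled by cpu_mhz."""
    return baseline_iters * cpu_mhz


def reported_time(actual_seconds: float, cpu_mhz: int) -> float:
    """Reported time: the measured wall-clock time divided by cpu_mhz."""
    return actual_seconds / cpu_mhz


# Hypothetical baseline of 100 iterations for one benchmark.
iters_10mhz = total_iterations(100, 10)    # 1000 iterations
iters_100mhz = total_iterations(100, 100)  # 10000 iterations, 10x as many

# If both runs take about 4 s of wall-clock time, the reported times differ
# by the same factor of 10 that the iteration counts do.
time_10mhz = reported_time(4.0, 10)
time_100mhz = reported_time(4.0, 100)
```

Under these assumptions both processors run for a similar wall-clock time, and the division by cpu_mhz removes the effect of the larger iteration count.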

This works well for similar processors using similar compiler optimisations. When a processor or a compiler causes a benchmark to run very much faster (10x or even 100x) than the processor frequency would suggest, the actual run time produced by the cpu_mhz scaling becomes too short to be measured reliably. (On an M1 Mac, three benchmarks run between 100x and 700x faster than the frequency suggests.) With the current system, the only way to overcome this is to set cpu_mhz much higher (10x or 100x) than the real frequency. This results in two problems:

  1. the run time for the entire suite is increased greatly
  2. the cpu_mhz parameter no longer reflects the actual frequency of the processor

This proposal is to add a mechanism that allows individual benchmarks to have their iteration count increased, with the computation of their nominal run time adjusted accordingly.
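One way this could work is sketched below in Python. The `multiplier` parameter and both function names are hypothetical, and dividing the measured time by the multiplier is only my reading of "adjusted accordingly", not a settled design.

```python
def total_iterations(baseline_iters: int, cpu_mhz: int, multiplier: int = 1) -> int:
    """Iterations run for one benchmark, with an optional per-benchmark multiplier."""
    return baseline_iters * cpu_mhz * multiplier


def nominal_time(actual_seconds: float, cpu_mhz: int, multiplier: int = 1) -> float:
    """Nominal run time: measured time with both scalings divided back out."""
    return actual_seconds / (cpu_mhz * multiplier)


# Assumed case: at cpu_mhz = 100 a very fast benchmark finishes in 0.008 s,
# which is too short to time reliably.
short_run = nominal_time(0.008, 100)

# With a hypothetical multiplier of 500 the same benchmark runs for about 4 s,
# and the nominal time comes out the same once the multiplier is divided out.
long_run = nominal_time(4.0, 100, multiplier=500)
```

Only the benchmarks that need it would carry a multiplier greater than 1, so the run time of the rest of the suite is unchanged and cpu_mhz can stay at the processor's real frequency.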