(Possibly) Overhaul Instrumentation Framework

Question

vext01 opened this issue 7 years ago · 3 comments

Currently, the instrumentation framework expects the iteration runner to give one "packet" of data per in-process iteration. This is somewhat rigid: for example, one cannot emit data that applies to a whole process execution or, more generally, arbitrary data.

Another idea: summarise the PyPy instrumentation data at experiment time so that there is less to parse later.

Answer 1 · 2017-10-09T09:43:40.000Z

It turned out Tom's recent changes did not require "free-form" instrumentation data, so this has become somewhat lower priority. The PyPy idea is still valid.

Answer 2 · 2018-01-25T12:29:09.000Z

Ideally, Krun would support "plugins", where each plugin would consist of:

An iterations runner.
Optionally, an instrumentation parser.

As it stands, both of these are hard-coded into Krun, and the instrumentation interface is not great either.
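To make the plugin idea concrete, here is a rough sketch of what a plugin record could look like. Nothing here exists in Krun: `Plugin`, `register_plugin` and `PLUGINS` are hypothetical names, and representing the iterations runner as a path and the parser as a plain callable are both assumptions.

```python
class Plugin(object):
    """Hypothetical plugin: an iterations runner plus an optional parser."""

    def __init__(self, name, iterations_runner, instr_parser=None):
        self.name = name
        # Assumed to be a path to the iterations runner program.
        self.iterations_runner = iterations_runner
        # Assumed to be a callable taking raw instrumentation data, or None.
        self.instr_parser = instr_parser

    def parse_instr_data(self, raw):
        """Return parsed data, or None if this plugin has no parser."""
        if self.instr_parser is None:
            return None
        return self.instr_parser(raw)


# Hypothetical registry mapping plugin names to Plugin objects.
PLUGINS = {}


def register_plugin(plugin):
    """Add a plugin to the registry, refusing duplicate names."""
    if plugin.name in PLUGINS:
        raise ValueError("plugin %r already registered" % plugin.name)
    PLUGINS[plugin.name] = plugin
    return plugin
```

A VM without instrumentation would register a plugin with no parser, and anything consuming the data would simply get `None` back from `parse_instr_data`.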

To fix #366, we are planning on adding some environment variables or arguments to the iterations runners to pass:

The pexec number (padded).
The benchmark.

This gives the iteration runner itself enough information to open a file and dump instrumentation data into it. Perhaps this would be a better interface than expecting the VMs to spew stuff onto stderr for Krun to pick up and store later?
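As a rough illustration of that interface, an iteration runner written in Python could do something like the following. The environment variable names `KRUN_BENCHMARK` and `KRUN_PEXEC_IDX`, the file naming scheme and the use of JSON are all assumptions made for this sketch; none of them has been decided.

```python
import json
import os


def instr_path(out_dir, benchmark, pexec_idx):
    """Build a per-process-execution file name (hypothetical scheme)."""
    return os.path.join(out_dir, "%s_%s.instr.json" % (benchmark, pexec_idx))


def dump_instr_data(data, environ=os.environ, out_dir="."):
    """Write instrumentation data to a file named after benchmark and pexec."""
    # Both variable names are hypothetical; Krun does not set them today.
    benchmark = environ["KRUN_BENCHMARK"]
    pexec_idx = environ["KRUN_PEXEC_IDX"]  # assumed already padded, e.g. "0003"
    path = instr_path(out_dir, benchmark, pexec_idx)
    with open(path, "w") as fh:
        json.dump(data, fh)
    return path
```

Because the runner chooses what to write, it could emit per-iteration data, per-process-execution data, or both, without Krun having to scrape stderr.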

Porting the existing instrumentation code and various scripts (plotting! the horror) is not a small change, however.

Answer 3 · 2018-01-25T13:05:23.000Z

The instrumentation data and plotting should be reasonably well decoupled already. warmup.vm_instruments contains parsers that parse instrumentation data into ChartData objects, which are defined like this:

class ChartData(object):
    """Class to hold data needed by the plotting script.
    Each VM parser may parse a number of different events which need to be
    plotted (e.g. compilation events, GC, etc.). These should be stored in
    a list of ChartData objects, so that the plotting script does not have to
    know anything about the individual VMs.
    """

    def __init__(self, title, data, legend_text):
        self.title = title
        self.data = data
        self.legend_text = legend_text

So, hopefully, the most you would need to do is rewrite the parsers, or the code that loads the instrumentation files.
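For example, a rewritten parser for a file-based format might look roughly like this. The input format (a JSON object mapping an event name to a list of per-iteration values) and the function name `parse_instr_file` are assumptions for illustration only, as is using the event name for both `title` and `legend_text`. `ChartData` is repeated so that the sketch runs on its own.

```python
import json


class ChartData(object):
    """Same three fields as the class quoted above."""

    def __init__(self, title, data, legend_text):
        self.title = title
        self.data = data
        self.legend_text = legend_text


def parse_instr_file(path):
    """Load a (hypothetical) JSON instrumentation file into ChartData objects."""
    with open(path) as fh:
        events = json.load(fh)
    # One ChartData per event type, in a stable (sorted) order.
    return [
        ChartData(title=name, data=values, legend_text=name)
        for name, values in sorted(events.items())
    ]
```

The plotting script would keep consuming a list of `ChartData` objects and would not need to know where the data came from.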