IBM/chopstix

Flexible control of the invocations of regions of interest to be sampled

rbertran opened this issue · 2 comments

Selective sampling of invocations can be implemented on either side:

  • Tracer: tracer decides if a particular execution of the region of interest needs to be sampled.
    • Pros: Less overhead if we decide to not sample.
    • Cons: Less information. The tracer has no information about previously sampled regions. As a result, only random or simple (e.g. first 10) sampling methods can be implemented (no intelligent sampling).
  • Tracee: tracee decides if a particular execution of the region of interest needs to be sampled.
    • Cons: Some overhead if we decide to not sample.
    • Pros: The tracee can use existing samples (e.g. historical data) to decide whether to sample a particular region.

I do think that having both would be beneficial. On one side, we can use the Tracer sampling to do a first-level pruning of the executions (e.g. from 1 million to 1 thousand), and then use the extra information available in the Tracee to do a second, more intelligent filtering.
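
The two-level idea above can be sketched roughly as follows. This is only an illustration, not ChopStiX code: `tracer_filter`, `TraceeFilter`, `select_invocations`, and the notion of an invocation "signature" (some hashable summary of an invocation's behavior) are all hypothetical names introduced here. The tracer stage is stateless and random; the tracee stage consults what has already been sampled.

```python
import random


def tracer_filter(keep_probability, rng):
    """Tracer-side decision (hypothetical): cheap and stateless, made before sampling."""
    return rng.random() < keep_probability


class TraceeFilter:
    """Tracee-side decision (hypothetical): uses the history of earlier samples."""

    def __init__(self, max_per_signature):
        self.max_per_signature = max_per_signature
        self.seen = {}  # signature -> number of samples already kept

    def should_sample(self, signature):
        count = self.seen.get(signature, 0)
        if count >= self.max_per_signature:
            return False  # enough samples of this behavior already
        self.seen[signature] = count + 1
        return True


def select_invocations(signatures, keep_probability, max_per_signature, seed=0):
    """Return the indices of invocations that survive both filtering levels."""
    rng = random.Random(seed)
    tracee = TraceeFilter(max_per_signature)
    selected = []
    for index, signature in enumerate(signatures):
        # Level 1: coarse pruning by the tracer.
        if not tracer_filter(keep_probability, rng):
            continue
        # Level 2: history-aware filtering by the tracee.
        if tracee.should_sample(signature):
            selected.append(index)
    return selected
```

In this sketch the first level keeps the cost low for skipped invocations, while the second level avoids collecting many samples that look alike.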

With the new infrastructure, this "on-the-fly" tracing approach can be implemented to avoid the full profiling of an entire application.

With the clustering approach, this issue is less important. We profile all invocations using chop-perf-invok and then select specific invocations after clustering. So generating this intelligent invocation selection on the fly is no longer a high priority.
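
As a rough illustration of selecting invocations after clustering, the sketch below picks one representative per cluster: the invocation closest to the cluster mean. This is an assumed selection rule, not necessarily what ChopStiX does. `pick_representatives`, the `features` layout (one numeric vector per invocation), and `labels` (one cluster label per invocation) are hypothetical.

```python
def pick_representatives(features, labels):
    """Return {cluster label: index of the invocation closest to the cluster mean}."""
    # Group invocation indices by cluster label.
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    representatives = {}
    for label, members in clusters.items():
        dims = len(features[members[0]])
        # Mean feature vector of the cluster.
        centroid = [
            sum(features[i][d] for i in members) / len(members)
            for d in range(dims)
        ]

        def squared_distance(i):
            return sum((features[i][d] - centroid[d]) ** 2 for d in range(dims))

        # Ties are resolved in favor of the earliest invocation.
        representatives[label] = min(members, key=squared_distance)
    return representatives
```

Only the chosen invocations would then need to be traced in detail; the rest are assumed to be covered by their cluster's representative.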