Granulate/gprofiler

Filter PIDs to profile

Closed this issue · 2 comments

Jongy commented

Filtering the processes to profile has been requested several times: #482, #298, #274, #573. This ticket covers the first step: filtering by PID (relevant only when the target processes remain running).

I suggest:

  • A --pids parameter that takes a number or a comma-separated list of numbers and can be given multiple times, e.g. --pids 1,2 --pids 5 --pids 76,76576. All values are accumulated.
  • At this point
    processes_to_profile = self._select_processes_to_profile()

    PIDs are selected for the runtime profilers. We can intersect the "selected PIDs" here with the PIDs passed via --pids. This class is extended by all runtime profiler classes, so the "easy" way to pass more parameters is to add them to every constructor (JavaProfiler etc.). I would prefer a clean, decoupled way to do this, so that we don't need to modify every runtime profiler.
  • There are 3 profilers which select their PIDs "automatically" rather than by gProfiler's decision: perf, PyPerf and phpspy. They need to be modified. At the very least, perf and PyPerf support receiving target PID(s) as arguments. We still need to check phpspy. Their respective classes should receive the pids argument and make use of it if passed.
  • UX-wise, I think gProfiler should be fine if it gets --pids a,b,c and only some of them are running, e.g. b and c are running and a is not.
  • By default, gProfiler will keep its standard behavior of profiling everything. So if no PIDs are given, the empty set of filter PIDs is treated as ALL processes.
  • Please add adequate tests, e.g. run 2 Java apps, pass the PID of one of them, and ensure that async-profiler was invoked only on that one.
  • Reject --profile-spawned-processes when combined with --pids: the combination is meaningless, because --pids targets processes that are already running.
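
A minimal sketch of how the proposal above could look, assuming a plain argparse-based CLI. The helper names (pid_list, parse_args, filter_selected) are hypothetical and not taken from the gProfiler code base, and processes are represented as bare integer PIDs for simplicity. It covers accumulating repeated --pids values, treating an empty filter as "all processes", intersecting with the already-selected PIDs, and rejecting the combination with --profile-spawned-processes.

```python
import argparse
from typing import Iterable, List, Optional, Sequence, Set


def pid_list(value: str) -> List[int]:
    """Parse a comma-separated string such as "1,2,3" into a list of PIDs."""
    pids = []
    for part in value.split(","):
        try:
            pid = int(part)
        except ValueError:
            raise argparse.ArgumentTypeError(f"invalid PID: {part!r}")
        if pid <= 0:
            raise argparse.ArgumentTypeError(f"PID must be positive: {part!r}")
        pids.append(pid)
    return pids


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Each occurrence of --pids yields one list; "append" collects them all.
    parser.add_argument("--pids", type=pid_list, action="append", default=[])
    parser.add_argument("--profile-spawned-processes", action="store_true")
    args = parser.parse_args(argv)

    # Flatten the list of lists into a single set of PIDs.
    args.pids = {pid for group in args.pids for pid in group}

    # The two options contradict each other, so refuse the combination.
    if args.pids and args.profile_spawned_processes:
        parser.error("--pids cannot be combined with --profile-spawned-processes")
    return args


def filter_selected(selected: Iterable[int], pids_filter: Set[int]) -> List[int]:
    """Intersect the PIDs a profiler selected with the user-supplied filter.

    An empty filter means "no filtering": every selected PID is kept.
    PIDs in the filter that are not running simply never match.
    """
    if not pids_filter:
        return list(selected)
    return [pid for pid in selected if pid in pids_filter]
```

Because the filter is applied as an intersection, a PID passed via --pids that is no longer running is silently ignored, which matches the UX suggested above.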

Jongy commented

On top of this task we could have --containers= which accepts multiple container names, and will accept all PIDs originating from those containers. It can also accept the container name as a regex pattern. Not required for MVP but nice to have :)
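
A rough sketch of the container-name matching, assuming a PID-to-container-name mapping is already available from elsewhere. Both that mapping and the function name pids_in_containers are hypothetical, and full-match regex semantics are just one possible choice.

```python
import re
from typing import Dict, Iterable, Set


def pids_in_containers(pid_to_container: Dict[int, str], patterns: Iterable[str]) -> Set[int]:
    """Return the PIDs whose container name fully matches any given pattern.

    A plain container name is also a valid regex, so exact names and
    patterns can be passed through the same option.
    """
    compiled = [re.compile(pattern) for pattern in patterns]
    return {
        pid
        for pid, name in pid_to_container.items()
        if any(rx.fullmatch(name) for rx in compiled)
    }
```

The resulting set could then be used as the PID filter in the same way as the values passed via --pids.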

I opened an issue for this: #762