performancecopilot/pcp

Add some CPU metrics in pmstat output

myllynen opened this issue · 6 comments

Recent versions of vmstat(8) report CPU wait, steal, and guest time, which have become available in the Linux kernel during the past few years. It would be nice if pmstat(1) reported these additional CPU metrics as well. Thanks.

@myllynen is this different to the -x option?

$ pmstat --help 2>&1 | grep -- -x
  -x, --xcpu            extended CPU statistics reporting

D'oh, I somehow completely missed that!

But the extended metrics do not include guest time so I think that could be added.

Thanks.

Yep, missing & will be nice to have - thanks for checking. I'll queue it up but hopefully someone else drops in and fixes it in the meantime.

This looks pretty straightforward, but for the record, here are a few things I noticed:

  1. src/pmstat/pmstat.pmlogger and src/pmlogconf/tools/pmstat are slightly different. The outcome should be the same, but I'm not sure whether these should be unified.

  2. While at it, perhaps the output could include guest_nice as well. It's not part of vmstat(8) output, but this would make pmstat output more complete. However, I'm not sure whether this provides any real value for pmstat(1) users.

  3. Would the calculation of totals still be like this, or should guest time(s) be added as well:

user = s->val[cpu_nice].ull + s->val[cpu_user].ull;
kernel = s->val[cpu_intr].ull + s->val[cpu_sys].ull + s->val[cpu_steal].ull;
idle = s->val[cpu_idle].ull + s->val[cpu_wait].ull;

This is a bit unclear: according to the man page, utime in /proc/pid/stat includes guest_time, but nothing is stated for /proc/stat and the guest times there, so it's not entirely clear to me whether stolen time includes guest / guest_nice time or is separate from it.

Thanks.

Have a look at the way Mark set up the pmchart CPU and vCPU views, in terms of the metrics used, to answer this (note also vuser and vnice).
And yeah, definitely some updates needed to those two configs, esp. when this addition is made to pmstat.

I checked the kernel sources and the related git log:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/kernel/sched/cputime.c
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/cputime.c

It looks like guest time is included in the user time, and has been for at least a couple of years. So this makes me think the calculations pasted above can be left as-is. Thanks.
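
For illustration, here is a minimal sketch of what leaving the totals as-is means, assuming guest time is already counted inside user time as found above (and, as a further assumption, guest_nice inside nice). The struct and function names are hypothetical and are not taken from the pmstat sources; only the three sums mirror the calculation quoted earlier.

```c
#include <stdio.h>

/* Hypothetical snapshot of aggregate CPU counters. The field names mirror
 * the indices in the quoted calculation; this struct is not part of pmstat. */
struct cpu_times {
    unsigned long long user, nice, sys, intr, steal, idle, wait;
    unsigned long long guest, guest_nice;
};

struct cpu_totals {
    unsigned long long user, kernel, idle;
};

/* Same grouping as the quoted calculation. The guest counters are
 * deliberately not added: under the assumption stated above they are
 * already part of user/nice, so adding them would count them twice. */
struct cpu_totals compute_totals(const struct cpu_times *t)
{
    struct cpu_totals r;

    r.user = t->nice + t->user;
    r.kernel = t->intr + t->sys + t->steal;
    r.idle = t->idle + t->wait;
    return r;
}

/* User-mode time spent outside guests, under the same assumption.
 * Clamped at zero in case the counters were sampled inconsistently. */
unsigned long long host_user_time(const struct cpu_times *t)
{
    unsigned long long all = t->user + t->nice;
    unsigned long long guest = t->guest + t->guest_nice;

    return all > guest ? all - guest : 0;
}

/* Print the three totals plus guest time as an informational column. */
void print_totals(const struct cpu_times *t)
{
    struct cpu_totals r = compute_totals(t);

    printf("user=%llu kernel=%llu idle=%llu guest=%llu\n",
           r.user, r.kernel, r.idle, t->guest + t->guest_nice);
}
```

With this grouping, guest time could be reported as an extra column next to the existing totals without changing how the totals themselves are computed.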