scheduling-profile seems not to be working on kernel 4.4
stefanodoni opened this issue · 4 comments
Hi,
First of all many thanks for sharing this work!
I'm interested in trying the scheduling-profile tool on Ubuntu 16.04, kernel 4.4. I have installed bcc as per iovisor instructions. However, it seems not to be working:
$ sudo ./scheduling-profile 2918
Recording scheduling information for 15 seconds
/virtual/main.c:41:63: warning: incompatible pointer to integer conversion initializing 'char' with an expression of type 'void *' [-Wint-conversion]
struct proc_counter_t new_counter = {.proc_name = NULL, .count = 0};
^~~~
include/linux/stddef.h:7:14: note: expanded from macro 'NULL'
#define NULL ((void *)0)
^~~~~~~~~~~
1 warning generated.
No samples for pid 2918
Can the tool be made compatible with 4.4 kernels, or does it require some eBPF capability found only in newer kernels?
Thank you!
Can you see if there are any files generated in /tmp/? Specifically, you should see:
/tmp/jstack-2918.txt
/tmp/scheduler-states-2918.json
The C warnings are not an indicator of whether the tracing is actually working. Also, can you try running without sudo and just entering your password when prompted?
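If it helps, here is a small sketch for checking whether both files were written and whether they are empty. The helper names (expected_outputs, check_outputs) are made up for illustration and are not part of the tool; the paths just follow the pattern listed above.

```python
import os


def expected_outputs(pid, tmp_dir="/tmp"):
    """Return the two output paths the tool is expected to write for a pid."""
    return [
        os.path.join(tmp_dir, "jstack-{}.txt".format(pid)),
        os.path.join(tmp_dir, "scheduler-states-{}.json".format(pid)),
    ]


def check_outputs(pid, tmp_dir="/tmp"):
    """Map each expected path to its size in bytes, or None if it is missing."""
    report = {}
    for path in expected_outputs(pid, tmp_dir):
        report[path] = os.path.getsize(path) if os.path.isfile(path) else None
    return report


# Example: report on the files for the pid used in this issue.
for path, size in check_outputs(2918).items():
    print(path, "missing" if size is None else "{} bytes".format(size))
```

A missing or zero-byte scheduler-states file would suggest the tracing side never produced data, whereas a missing jstack file points at the JVM side.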
Howdy,
I'm on Ubuntu 16.04 (4.13.0-32-generic) with OpenJDK 9. After starting java (pid 22949) running a DaCapo benchmark for 20 iterations, I get a similar warning and the same "No samples" result as above. Is there a need to use HotSpot instead? I added some debugging to grav/src/cpu/scheduler_profile.py to print out the sys.argv[1], tid_to_thread_name and thread_scheduling variables.
./grav/bin/scheduling-profile 22949
/tmp/jstack-22949 has call stacks in it, whereas /tmp/scheduler-states-22949 contains:
{"0": {"D": 60, "K": 0, "S": 1506889, "R": 1507495, "U": 0, "x": 71, "total": 3005466}, "2": {"D": 15, "K": 0, "S": 0, "R": 0, "U": 0, "x": 0, "total": 15}}
/virtual/main.c:40:63: warning: incompatible pointer to integer conversion initializing 'char' with an expression of type 'void *' [-Wint-conversion]
struct proc_counter_t new_counter = {.proc_name = NULL, .count = 0};
^~~~
include/linux/stddef.h:7:14: note: expanded from macro 'NULL'
#define NULL ((void *)0)
^~~~~~~~~~~
1 warning generated.
('sys.argv[1] {}', '/tmp/jstack-22949.txt')
('tid to thread name {}', "{'22950': 'main', '22957': 'GC Thread#6', '22970': 'G1 Marker#1', '22955': 'GC Thread#4', '22954': 'GC Thread#3', '22997': 'node-3', '22996': 'node-4', '22995': 'node-5', '22994': 'node-2', '22993': 'node-1', '22978': 'C2 CompilerThread2', '22975': 'Signal Dispatcher', '22974': 'Surrogate Locker Thread (Concurrent GC)', '22977': 'C2 CompilerThread1', '22976': 'C2 CompilerThread0', '22971': 'VM Thread', '22956': 'GC Thread#5', '22973': 'Finalizer', '22972': 'Reference Handler', '23058': 'Attach Listener', '22959': 'G1 Refine#7', '22958': 'GC Thread#7', '22979': 'C1 CompilerThread3', '22992': 'node-0', '22980': 'Sweeper thread', '22981': 'Common-Cleaner', '22982': 'Service Thread', '22983': 'VM Periodic Task Thread', '22968': 'G1 Main Marker', '22969': 'G1 Marker#0', '22966': 'G1 Refine#0', '22967': 'G1 Young RemSet Sampling', '22964': 'G1 Refine#2', '22965': 'G1 Refine#1', '22962': 'G1 Refine#4', '22963': 'G1 Refine#3', '22960': 'G1 Refine#6', '22961': 'G1 Refine#5', '22953': 'GC Thread#2', '22952': 'GC Thread#1', '22951': 'GC Thread#0'}")
('thread scheduling {}', "{u'0': {u'D': 60, u'K': 0, u'S': 1506889, u'R': 1507495, u'U': 0, u'x': 71, u'total': 3005466}, u'2': {u'D': 15, u'K': 0, u'S': 0, u'R': 0, u'U': 0, u'x': 0, u'total': 15}}")
No samples for pid 22949
Cheers,
Andy
The data in /tmp/scheduler-states-22949 should be thread state counts keyed by thread_id. For some reason, the two samples in that file have thread_ids 0 and 2, so when the script tries to match each thread_id to a name in the file /tmp/jstack-22949.txt, it fails and determines that there are no samples for the threads in the jstack file.
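To make the failure concrete, here is a minimal sketch of that matching step using the values from the debug output above. The function match_samples is hypothetical and only approximates what the script does; the thread-name map is a three-entry subset of the one printed earlier.

```python
import json


def match_samples(scheduler_states_json, tid_to_thread_name):
    """Split recorded samples into those whose thread id appears in the
    jstack-derived name map and those whose thread id does not."""
    states = json.loads(scheduler_states_json)
    matched = {}
    unmatched = []
    for tid, counts in states.items():
        name = tid_to_thread_name.get(tid)
        if name is None:
            unmatched.append(tid)
        else:
            matched[name] = counts
    return matched, sorted(unmatched)


# Contents of the scheduler-states file as posted in this thread.
states_json = (
    '{"0": {"D": 60, "K": 0, "S": 1506889, "R": 1507495, "U": 0, "x": 71, '
    '"total": 3005466}, '
    '"2": {"D": 15, "K": 0, "S": 0, "R": 0, "U": 0, "x": 0, "total": 15}}'
)

# Subset of the tid -> thread name map from the debug output.
tid_to_thread_name = {"22950": "main", "22951": "GC Thread#0", "22992": "node-0"}

matched, unmatched = match_samples(states_json, tid_to_thread_name)
print(matched)    # {}
print(unmatched)  # ['0', '2']
```

Since neither "0" nor "2" is a Java thread id, nothing matches and the script reports "No samples".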
Testing locally, I get the same results. Pid 0 is swapper, pid 2 is kthreadd, so it looks as though normal processes are not being captured. I'll see if I can figure out the problem.
I've updated the bcc script to attach to a tracepoint rather than a kprobe. It seems to be doing the job, but I haven't had a close look yet to make sure that the results are accurate.
Use at your own risk ;)
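For anyone curious what attaching to a tracepoint looks like in bcc, below is a rough sketch. It is not the updated script: it only counts sched_switch events per thread for one process, and the names BPF_TEMPLATE, build_program, trace and switch_counts are made up for illustration. It assumes bcc is installed and the caller has root privileges. As far as I know, bcc tracepoint probes need kernel 4.7 or later, so this approach would not help on a 4.4 kernel.

```python
import time

# BPF program text. TRACEPOINT_PROBE is a bcc macro that attaches the
# function to the named tracepoint when the program is loaded.
BPF_TEMPLATE = r"""
BPF_HASH(switch_counts, u32, u64);

TRACEPOINT_PROBE(sched, sched_switch) {
    // At sched_switch the current task is the one being switched out.
    u64 id = bpf_get_current_pid_tgid();
    u32 tgid = id >> 32;   // process id as seen from user space
    u32 tid = (u32)id;     // thread id
    if (tgid != TARGET_PID) {
        return 0;
    }
    switch_counts.increment(tid);
    return 0;
}
"""


def build_program(pid):
    """Substitute the target process id into the BPF program text."""
    return BPF_TEMPLATE.replace("TARGET_PID", str(int(pid)))


def trace(pid, seconds=15):
    """Load the program, wait, and return {thread_id: switch_count}.

    bcc is imported here so that the rest of the module can be used
    on machines without it.
    """
    from bcc import BPF

    bpf = BPF(text=build_program(pid))
    time.sleep(seconds)
    return {k.value: v.value for k, v in bpf["switch_counts"].items()}
```

Calling trace(22949) as root should return counts keyed by the Java thread ids rather than by 0 and 2, which gives a quick sanity check against the jstack output.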