Add sampling-based filtering support
namhyung opened this issue · 2 comments
For a large scale project, it's easy to get a huge data after running it with uftrace. Although uftrace has various filters to reduce the amount of trace data, it sometimes to hard to know what/how to apply filters. Another approach to achieve this is sampling - it records the data at some frequency N. IOW it'll only trace the function on every Nth execution. But it still needs a target function.
Let's talk with an example. I'd like to start with our lovely t-abc
. :) But imagine it's changed to call function a
5 times in the main
(let's call it t-abc5
). And you don't want to trace them all, and just want to do it every other time (N = 2). Then I think we can do like:
$ uftrace -F main -F a@sample=2 t-abc5
# DURATION TID FUNCTION
[1356607] | main() {
[1356607] | a() { # 2nd execution
[1356607] | b() {
[1356607] | c() {
1.536 us [1356607] | getpid();
2.400 us [1356607] | } /* c */
2.863 us [1356607] | } /* b */
3.205 us [1356607] | } /* a */
[1356607] | a() { # 4th execution
[1356607] | b() {
[1356607] | c() {
0.945 us [1356607] | getpid();
1.490 us [1356607] | } /* c */
1.752 us [1356607] | } /* b */
2.066 us [1356607] | } /* a */
11.418 us [1356607] | } /* main */
Note that the sampling was applied to -F
filter so it'd show/hide all function under the target. Also I had to add -F main
to show the higher level functions too.
It should work with normal triggers (-T
) too, then I think it only affects the target function (unless it has filter
action, of course), not the children.
$ uftrace -T a@sample=2 t-abc5
# DURATION TID FUNCTION
1.468 us [1358575] | __monstartup();
0.811 us [1358575] | __cxa_atexit();
[1358575] | main() {
[1358575] | b() { # no 'a' on 1st
[1358575] | c() {
0.908 us [1358575] | getpid();
1.366 us [1358575] | } /* c */
1.638 us [1358575] | } /* b */
[1358575] | a() {
[1358575] | b() {
[1358575] | c() {
0.824 us [1358575] | getpid();
1.159 us [1358575] | } /* c */
1.393 us [1358575] | } /* b */
1.629 us [1358575] | } /* a */
[1358575] | b() { # no 'a' on 3rd
[1358575] | c() {
1.144 us [1358575] | getpid();
1.472 us [1358575] | } /* c */
1.767 us [1358575] | } /* b */
[1358575] | a() {
[1358575] | b() {
[1358575] | c() {
1.118 us [1358575] | getpid();
1.452 us [1358575] | } /* c */
1.757 us [1358575] | } /* b */
2.004 us [1358575] | } /* a */
[1358575] | b() { # no 'a' on 5th
[1358575] | c() {
0.908 us [1358575] | getpid();
1.357 us [1358575] | } /* c */
1.671 us [1358575] | } /* b */
11.683 us [1358575] | } /* main */
Some consideration on the counter.
I don't want to make the counter global (per function) which requires atomic operation every time. As I expect the target function will be called frequently, keeping it per-thread should be fine. That means it might not see the target function even if it's called 2N - 2 times globally, but when two threads called it N - 1 times each. In that case, users should select a smaller N and try again.
More complex cases. The child function can have its own sample filter. Then we need to think about how to count the frequency. I think it should count as same as users see in the final output - IOW do NOT count filtered-out functions.
For example, let's use sample filters for both a
and c
functions.
$ uftrace -F a@sample=2 -F c@sample=2 t-abc5
# DURATION TID FUNCTION
[1363218] | a() { # 2nd execution of 'a'
1.935 us [1363218] | b(); # 1st execution of 'c' - no sample
2.329 us [1363218] | } /* a */
[1363218] | a() { # 4th execution of 'a'
[1363218] | b() {
[1363218] | c() { # 2nd execution of 'c'
0.444 us [1363218] | getpid();
0.772 us [1363218] | } /* c */
0.959 us [1363218] | } /* b */
1.168 us [1363218] | } /* a */