New repository https://github.com/iovisor/bpftrace
Development of BPFtrace has moved to https://github.com/iovisor/bpftrace
This repository will no longer be updated.
BPFtrace is a high-level tracing language for Linux enhanced Berkeley Packet Filter (eBPF) available in recent Linux kernels (4.x). BPFtrace uses LLVM as a backend to compile scripts to BPF-bytecode and makes use of BCC for interacting with the Linux BPF system, as well as existing Linux tracing capabilities: kernel dynamic tracing (kprobes), user-level dynamic tracing (uprobes), and tracepoints. The BPFtrace language is inspired by awk and C, and predecessor tracers such as DTrace and SystemTap.
For instructions on building BPFtrace, see INSTALL.md
Count system calls:
kprobe:[Ss]y[Ss]_*
{
@[func] = count()
}
Attaching 376 probes...
^C
...
@[sys_open]: 579
@[SyS_ioctl]: 686
@[sys_bpf]: 730
@[sys_close]: 779
@[SyS_read]: 825
@[sys_write]: 1031
@[sys_poll]: 1796
@[sys_futex]: 2237
@[sys_recvmsg]: 2634
Produce a histogram of amount of time (in nanoseconds) spent in the read()
system call:
kprobe:sys_read
{
@start[tid] = nsecs;
}
kretprobe:sys_read / @start[tid] /
{
@times = quantize(nsecs - @start[tid]);
delete(@start[tid]);
}
Attaching 2 probes...
^C
@start[9134]: 6465933686812
@times:
[0, 1] 0 | |
[2, 4) 0 | |
[4, 8) 0 | |
[8, 16) 0 | |
[16, 32) 0 | |
[32, 64) 0 | |
[64, 128) 0 | |
[128, 256) 0 | |
[256, 512) 326 |@ |
[512, 1k) 7715 |@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[1k, 2k) 15306 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[2k, 4k) 609 |@@ |
[4k, 8k) 611 |@@ |
[8k, 16k) 438 |@ |
[16k, 32k) 59 | |
[32k, 64k) 36 | |
[64k, 128k) 5 | |
Print paths of any files opened along with the name of process which opened them:
kprobe:sys_open
{
printf("%s: %s\n", comm, str(arg0))
}
Attaching 1 probe...
git: .git/objects/70
git: .git/objects/pack
git: .git/objects/da
git: .git/objects/pack
git: /etc/localtime
systemd-journal: /var/log/journal/72d0774c88dc4943ae3d34ac356125dd
DNS Res~ver #15: /etc/hosts
DNS Res~ver #16: /etc/hosts
DNS Res~ver #15: /etc/hosts
^C
Whole system profiling (TODO make example check if kernel is on-cpu before recording):
profile:hz:99
{
@[stack] = count()
}
Attaching 1 probe...
^C
...
@[
_raw_spin_unlock_irq+23
finish_task_switch+117
__schedule+574
schedule_idle+44
do_idle+333
cpu_startup_entry+113
start_secondary+344
verify_cpu+0
]: 83
@[
queue_work_on+41
tty_flip_buffer_push+43
pty_write+83
n_tty_write+434
tty_write+444
__vfs_write+55
vfs_write+177
sys_write+85
entry_SYSCALL_64_fastpath+26
]: 97
@[
cpuidle_enter_state+299
cpuidle_enter+23
call_cpuidle+35
do_idle+394
cpu_startup_entry+113
rest_init+132
start_kernel+1083
x86_64_start_reservations+41
x86_64_start_kernel+323
verify_cpu+0
]: 150
Attach a BPFtrace script to a kernel function, to be executed when that function is called:
kprobe:sys_read { ... }
Attach script to a userland function:
uprobe:/bin/bash:readline { ... }
Attach script to a statically defined tracepoint in the kernel:
tracepoint:sched:sched_switch { ... }
Tracepoints are guaranteed to be stable between kernel versions, unlike kprobes.
Run the script at specified time intervals:
profile:hz:99 { ... }
profile:s:1 { ... }
profile:ms:20 { ... }
profile:us:1500 { ... }
A single probe can be attached to multiple events:
kprobe:sys_read,kprobe:sys_write { ... }
Some probe types allow wildcards to be used when attaching a probe:
kprobe:SyS_* { ... }
Define conditions for which a probe should be executed:
kprobe:sys_open / uid == 0 / { ... }
The following variables and functions are available for use in bpftrace scripts:
Variables:
pid
- Process ID (kernel tgid)tid
- Thread ID (kernel pid)uid
- User IDgid
- Group IDnsecs
- Nanosecond timestampcpu
- Processor IDcomm
- Process namestack
- Kernel stack traceustack
- User stack tracearg0
,arg1
, ... etc. - Arguments to the function being tracedretval
- Return value from function being tracedfunc
- Name of the function currently being traced
Functions:
quantize(int n)
- Produce a log2 histogram of values ofn
count()
- Count the number of times this function is calleddelete(@x)
- Delete the map element passed in as an argumentstr(char *s)
- Returns the string pointed to bys
printf(char *fmt, ...)
- Write to stdoutsym(void *p)
- Resolve kernel addressusym(void *p)
- Resolve user space address (incomplete)reg(char *name)
- Returns the value stored in the named register