cilium/pwru

[Draft] Proposal: introduce --output-lbr


I've finished a PoC for LBR (Last Branch Record) tracing: bpflbr.

Building on that, I think it would be worthwhile to add LBR support to pwru.

For pwru, the proposal is to add an --output-lbr option that prints the LBR stack. Here is an example of such output from bpflbr:

[#31] kfree+0x81                            (mm/slab_common.c:1076)                -> kvfree+0x31                                  (mm/util.c:664)                      
      kvfree+0x3a                           (mm/util.c:664)                        -> map_lookup_and_delete_elem+0x2a6             (kernel/bpf/syscall.c:1989)          
      map_lookup_and_delete_elem+0x2ae      (kernel/bpf/syscall.c:1989)            -> map_lookup_and_delete_elem+0x2b6             (kernel/bpf/syscall.c:1929)          
      map_lookup_and_delete_elem+0x2cb      (include/linux/file.h:45)              -> fput+0x0                                     (fs/file_table.c:434)                
      fput+0xa                              (fs/file_table.c:435)                  -> fput+0x74                                    (fs/file_table.c:452)                
      fput+0x7e                             (fs/file_table.c:452)                  -> map_lookup_and_delete_elem+0x2d0             (include/linux/file.h:45)            
      map_lookup_and_delete_elem+0x2d0      (include/linux/file.h:45)              -> map_lookup_and_delete_elem+0x84              (kernel/bpf/syscall.c:1994)          
      map_lookup_and_delete_elem+0xa3       (kernel/bpf/syscall.c:1994)            -> __sys_bpf+0x5e7                              (kernel/bpf/syscall.c:5482)          
      __sys_bpf+0x5e9                       (kernel/bpf/syscall.c:5483)            -> __sys_bpf+0x61                               (kernel/bpf/syscall.c:5528)          
      __sys_bpf+0x93                        (kernel/bpf/syscall.c:5528)            -> arch_rethook_trampoline+0x0                                                       
      arch_rethook_trampoline+0x2c                                                 -> arch_rethook_trampoline_callback+0x0         (arch/x86/kernel/rethook.c:68)       
      arch_rethook_trampoline_callback+0x35 (arch/x86/kernel/rethook.c:86)         -> rethook_trampoline_handler+0x0               (kernel/trace/rethook.c:291)         
      rethook_trampoline_handler+0x33       (kernel/trace/rethook.c:228)           -> rethook_trampoline_handler+0x41              (kernel/trace/rethook.c:228)         
      rethook_trampoline_handler+0x6d       (kernel/trace/rethook.c:316)           -> rethook_trampoline_handler+0xa0              (kernel/trace/rethook.c:318)         
      rethook_trampoline_handler+0xa5       (kernel/trace/rethook.c:318)           -> rethook_trampoline_handler+0x71              (kernel/trace/rethook.c:320)         
      rethook_trampoline_handler+0x8d       (kernel/trace/rethook.c:322)           -> kretprobe_rethook_handler+0x0                (kernel/kprobes.c:2156)              
      kretprobe_rethook_handler+0x43        (kernel/kprobes.c:2170)                -> kretprobe_dispatcher+0x0                     (kernel/trace/trace_kprobe.c:1684)   
      kretprobe_dispatcher+0x3b             (kernel/trace/trace_kprobe.c:1702)     -> kretprobe_dispatcher+0x6d                    (kernel/trace/trace_kprobe.c:1703)   
      kretprobe_dispatcher+0x76             (kernel/trace/trace_kprobe.c:1703)     -> kretprobe_perf_func+0x0                      (kernel/trace/trace_kprobe.c:1577)   
      kretprobe_perf_func+0x4e              (kernel/trace/trace_kprobe.c:1584)     -> trace_call_bpf+0x0                           (kernel/trace/bpf_trace.c:111)       
      trace_call_bpf+0x62                   (include/linux/bpf.h:1955)             -> migrate_disable+0x0                          (kernel/sched/core.c:2408)           
      migrate_disable+0x45                  (kernel/sched/core.c:2420)             -> trace_call_bpf+0x67                          (include/linux/bpf.h:1919)           
      trace_call_bpf+0xa5                   (include/linux/bpf.h:1201)             -> bpf_prog_6deef7357e7b4530_sd_fw_ingress+0x68                                      
[#30] filter_pcap_ebpf_l2+0x22              (bpf/kprobe_pwru.c:244)                -> bpf_get_smp_processor_id+0x0                 (kernel/bpf/helpers.c:151)           
      bpf_get_smp_processor_id+0x13         (kernel/bpf/helpers.c:151)             -> filter_pcap_ebpf_l2+0x27                     (bpf/kprobe_pwru.c:244)              
      filter_pcap_ebpf_l2+0x5e              (bpf/kprobe_pwru.c:244)                -> bpf_get_branch_snapshot+0x0                  (kernel/trace/bpf_trace.c:1179)      
      bpf_get_branch_snapshot+0x17          (kernel/trace/bpf_trace.c:1187)        -> intel_pmu_snapshot_branch_stack+0x0          (arch/x86/events/intel/core.c:2276)  
      intel_pmu_snapshot_branch_stack+0x2f  (arch/x86/include/asm/paravirt.h:695)  -> native_write_msr+0x0                         (arch/x86/include/asm/msr.h:144)     
      native_write_msr+0x12                 (arch/x86/include/asm/jump_label.h:27) -> intel_pmu_snapshot_branch_stack+0x34         (arch/x86/include/asm/paravirt.h:695)
      intel_pmu_snapshot_branch_stack+0x3d  (arch/x86/include/asm/paravirt.h:196)  -> native_read_msr+0x0                          (arch/x86/include/asm/msr.h:115)     
      native_read_msr+0x25                  (arch/x86/include/asm/msr.h:124)       -> intel_pmu_snapshot_branch_stack+0x42         (arch/x86/include/asm/paravirt.h:196)
      intel_pmu_snapshot_branch_stack+0x56  (arch/x86/include/asm/paravirt.h:196)  -> native_write_msr+0x0                         (arch/x86/include/asm/msr.h:144) 

However, --output-lbr relies on some BPF features:

It would trace multiple kernel functions with kretprobe.multi and, alongside each traced function's return value, capture its LBR records via the bpf_get_branch_snapshot() helper.

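However, the LBR records obtained via bpf_get_branch_snapshot() are not accurate enough: in the example above, many entries are taken up by the rethook/kretprobe dispatch path and by the snapshot helper itself rather than by the traced function. The bpf_read_branch_records() helper, added in torvalds/linux@fff7b64355ea ("bpf: Add bpf_read_branch_records() helper"), may be required to improve the accuracy.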