exein-io/pulsar

Startup warnings in eBPF programs

MatteoNardi opened this issue · 0 comments

Problem

On Pulsar startup, we're getting warnings on the eBPF side.

2023-01-30T10:31:20Z WARN  trace_pipe]    Socket Thread-3356    [006] d...1 11151.973875: [output_event] error -28 emitting event of len 104 (ERROR)
2023-01-30T10:31:20Z WARN  trace_pipe]    Socket Thread-3356    [005] d...1 11157.418291: [output_event] error -2 emitting event of len 104 (ERROR)

The warnings come from the code that sends the perf event in output.bpf.h:

https://github.com/Exein-io/pulsar/blob/c2ee6650337958488d86cdd8963502db88a42e08/crates/bpf-common/include/output.bpf.h#L63-L75

Analysis

  • Rather than a regression, I think the old code was simply ignoring these errors.
  • I believe the cause is that, on initialization, we open the perf event map only after the eBPF code has started. Events emitted in that window presumably fail with these errors.

Todo

  • Find the exact meaning of errors -2 and -28 (as errno values, ENOENT and ENOSPC)
    • -2 error is emitted on startup, when the map has not been set up yet on the userspace side
    • -28 error is emitted during shutdown, presumably when the eBPF program is still running, but the map buffer has been dropped.
  • Fix the warning by not emitting events until the perf event array map is initialized. We could do this by adding a configuration map with one entry (initialized=true/false). With the default value of false, the probes won't run. After we've opened the perf event map, in program.rs, we'll set it to true.