falcosecurity/libs

modern-bpf fails to load on RHEL 8.6 ppc64le (unsupported opcode)

Stringy opened this issue · 5 comments

Describe the bug

When loading the modern BPF driver on RHEL 8.6 (ppc64le), the load fails:

libbpf: prog 'bind_e': BPF program load failed: ERROR: strerror_r(524)=22
libbpf: prog 'bind_e': -- BEGIN PROG LOAD LOG --
processed 170 insns (limit 1000000) max_states_per_insn 0 total_states 10 peak_states 10 mark_read 6
-- END PROG LOAD LOG --
libbpf: prog 'bind_e': failed to load: -524
libbpf: failed to load object 'bpf_probe'
libbpf: failed to load BPF skeleton 'bpf_probe': -524
 (524)
libpman: failed to load BPF object (errno: 524 | message: Unknown error 524)

Error 524 is the kernel-internal ENOTSUPP ("unsupported"), and dmesg gives a little more information:

eBPF filter opcode 0039 (@2) unsupported

This opcode decodes as BPF_LDX | BPF_DW | BPF_ABS, but according to the docs BPF_ABS is meant for packet inspection, which doesn't fit the programs that seem to be affected.
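For reference, here is a minimal sketch of how opcode 0x39 splits into its class, size and mode fields. The mask and field values mirror those in the Linux UAPI header linux/bpf_common.h, but they are redefined locally so the snippet builds without kernel headers. The helper names (is_ldx_dw_abs, print_opcode_fields) are hypothetical and exist only for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Field masks for load/store opcodes, mirroring linux/bpf_common.h.
 * Redefined under local names so the sketch builds anywhere. */
#define OP_CLASS(code) ((code) & 0x07)
#define OP_SIZE(code)  ((code) & 0x18)
#define OP_MODE(code)  ((code) & 0xe0)

#define CLASS_LDX 0x01 /* BPF_LDX */
#define SIZE_DW   0x18 /* BPF_DW  */
#define MODE_ABS  0x20 /* BPF_ABS */

/* Hypothetical helper: returns 1 if the opcode is LDX | DW | mode 0x20. */
static int is_ldx_dw_abs(uint8_t code)
{
    return OP_CLASS(code) == CLASS_LDX &&
           OP_SIZE(code) == SIZE_DW &&
           OP_MODE(code) == MODE_ABS;
}

/* Hypothetical helper: prints the three fields of a load/store opcode. */
static void print_opcode_fields(uint8_t code)
{
    printf("class=0x%02x size=0x%02x mode=0x%02x\n",
           OP_CLASS(code), OP_SIZE(code), OP_MODE(code));
}
```

Calling print_opcode_fields(0x39) prints class=0x01 size=0x18 mode=0x20, which matches the opcode reported in dmesg. An ordinary 64-bit memory load (0x79, mode 0x60) does not match.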

How to reproduce it

sudo ./libscap/examples/01-open/scap-open --modern_bpf

(running on 4.18.0-372.9.1.el8.ppc64le)

Expected behaviour

The modern bpf probe should load correctly, with no errors.

Environment

  • Kernel: 4.18.0-372.32.1.el8_6.ppc64le
  • Built using clang-17 (have tested with 16 as well, but the error persists)

Additional context
This is a follow-up to the initial discussion on #1804
cc: @FedeDP @Andreagit97 @mdafsanhossain @erthalion

Thanks for tracking this one!
/milestone next-driver

I finally have an update on this one now that we've wrapped up our investigation.

TL;DR: This kernel is missing a single patch, which makes loading modern_bpf impossible. No libs changes are needed (except perhaps adjusting the documented minimum kernel versions for Power).

The unsupported opcode, which we originally thought was BPF_ABS, is actually BPF_PROBE_MEM:

/* unused opcode to mark special load instruction. Same as BPF_ABS */
#define BPF_PROBE_MEM 0x20
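Since both names share the value 0x20, the instruction class is what tells them apart. The sketch below assumes (from my reading of the kernel sources, not from anything stated in this thread) that BPF_ABS is only used with the BPF_LD class, while BPF_PROBE_MEM is paired with BPF_LDX. The helper name mode_0x20_meaning is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: names mode 0x20 based on the instruction class.
 * Assumption: BPF_ABS pairs with class BPF_LD (0x00) and
 * BPF_PROBE_MEM pairs with class BPF_LDX (0x01). */
static const char *mode_0x20_meaning(uint8_t code)
{
    if ((code & 0xe0) != 0x20)
        return "other mode";

    switch (code & 0x07) {
    case 0x00: /* BPF_LD: legacy packet load */
        return "BPF_ABS";
    case 0x01: /* BPF_LDX: load marked by the verifier */
        return "BPF_PROBE_MEM";
    default:
        return "unknown";
    }
}
```

Under that assumption, the 0x39 from dmesg (class BPF_LDX) reads as BPF_PROBE_MEM rather than a packet load.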

BPF_PROBE_MEM isn't supported by the 4.18.0-372.9.1.el8.ppc64le kernel. It is used for BTF symbol resolution, so it is only generated when you attempt a direct read from a kernel struct. For example, here's a minimal program that produces this error:

// built using libbpf-bootstrap
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "Dual BSD/GPL";

int my_pid = 0;

SEC("tp_btf/sys_enter")
int BPF_PROG(handle_tp, struct pt_regs *regs)
{
    int pid = bpf_get_current_pid_tgid() >> 32;
    // access of regs and use of the variable later causes
    // the error
    int syscall = (int)regs->gpr[0];

    if (pid != my_pid || syscall != 1)
        return 0;

    bpf_printk("BPF triggered from PID %d.\n", pid);

    return 0;
}

Later kernels on Power (from 4.18.0-477.27.1.el8_8.ppc64le) do support this opcode, and modern_bpf loads correctly on them.

Thanks for the investigation, Giles! We already document a minimum kernel version of 5.8 for the modern BPF probe: https://github.com/falcosecurity/libs?tab=readme-ov-file#drivers-officially-supported-architectures
It CAN theoretically work on older versions since, as you noted, BPF-related features are often backported to older kernels.
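To illustrate why the documented minimum is only a rough guide on distro kernels, here is a small sketch that compares a release string (as printed by uname -r) against a major.minor threshold. The helper name kernel_at_least is hypothetical and is not part of libs.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if `release` is at least major.minor,
 * 0 if it is older, and -1 if the string cannot be parsed. */
static int kernel_at_least(const char *release, int major, int minor)
{
    int kmaj = 0, kmin = 0;

    /* Only the leading "X.Y" is parsed; distro suffixes are ignored. */
    if (sscanf(release, "%d.%d", &kmaj, &kmin) != 2)
        return -1;

    if (kmaj != major)
        return kmaj > major;
    return kmin >= minor;
}
```

Both 4.18.0-372.9.1.el8.ppc64le and 4.18.0-477.27.1.el8_8.ppc64le come out as older than 5.8 here, yet only the first one fails to load modern_bpf. A plain version comparison cannot see backports.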

/close

@FedeDP: Closing this issue.
