bpf-notes

Traditionally, BPF could only be attached to sockets for socket filtering. Its first use case was tcpdump: when you run tcpdump, the filter expression is compiled into a BPF program and attached to a raw AF_PACKET socket so that only matching packets are printed. Over the years, eBPF added the ability to attach to other kernel objects. In addition to socket filtering, some supported attach points are:

  • Kprobes (and userspace equivalents uprobes)
  • Tracepoints
  • Network schedulers or qdiscs for classification or action (tc)
  • XDP (eXpress Data Path)

This and other newer features, like in-kernel helper functions and shared data structures (maps) that can be used to communicate with user space, extend BPF's capabilities.

Summary Presentation: Toward Flexible and Efficient In-Kernel Network Function Chaining with IOVisor

Go + eBPF

Testing eBPF in CI

XDP

L7 in Kernel

Reading List

Notes

  • This also implies that API users must zero the whole bpf_attr (all sizeof(bpf_attr) bytes) before filling it in. The compiler can size-align the struct differently, and zeroing keeps garbage data in padding or unused fields from being interpreted as parameters by future kernels.
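
A minimal sketch of that rule, assuming a plain array map and the raw bpf(2) syscall rather than a wrapper library. The function names `init_array_map_attr` and `create_array_map` are hypothetical.

```c
#include <linux/bpf.h>     /* union bpf_attr, BPF_MAP_CREATE, BPF_MAP_TYPE_ARRAY */
#include <string.h>        /* memset */
#include <sys/syscall.h>   /* __NR_bpf */
#include <unistd.h>        /* syscall */

/* Fill in the attributes for an array map. Zeroing the whole union first
 * guarantees that padding and every field this code does not set reads as 0,
 * including fields added by kernels newer than these headers. */
static void init_array_map_attr(union bpf_attr *attr, unsigned int value_size,
                                unsigned int max_entries)
{
    memset(attr, 0, sizeof(*attr));
    attr->map_type = BPF_MAP_TYPE_ARRAY;
    attr->key_size = sizeof(unsigned int);   /* array maps use 4-byte indexes */
    attr->value_size = value_size;
    attr->max_entries = max_entries;
}

/* Thin wrapper around bpf(2). Returns a map fd, or -1 with errno set
 * (for example EPERM when the caller lacks the needed privileges). */
static int create_array_map(unsigned int value_size, unsigned int max_entries)
{
    union bpf_attr attr;

    init_array_map_attr(&attr, value_size, max_entries);
    return (int)syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
}
```

Passing `sizeof(attr)` as the last argument tells the kernel how large the caller's view of the union is, which is why every byte within that size has to be well defined.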

eBPF + Prometheus exporter

eBPF VM in userspace

BPF: sockmap and sk redirect support

This series implements a sockmap and socket redirect helper for BPF using a model similar to XDP netdev redirect. A sockmap is a BPF map type that holds references to sock structs. With a new sk redirect BPF helper, BPF programs can then use the map to redirect skbs between sockets,

  bpf_sk_redirect_map(map, key, flags)

Finally, we need a call site to attach our BPF logic to do socket redirects. We added hooks to recv_sock using the existing strparser infrastructure to do this. The call site is added via the BPF attach map call. To enable users to use this infrastructure, a new BPF program type, BPF_PROG_TYPE_SK_SKB, is created that allows users to reference sock details, such as port and IP address fields, to build useful socket layer programs. The sockmap datapath is as follows,

 recv -> strparser -> verdict/action

where this series implements the drop and redirect actions. Additional actions can be added as needed.

A sample program is provided to illustrate how a sockmap can be integrated with cgroups and used to add/delete sockets in a sockmap. The program is simple but should show many of the key ideas.
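
The add/delete side can be sketched from user space with the raw bpf(2) syscall. This is not the sample program from the series. It assumes the mainline interface, in which a sockmap has 4-byte keys and 4-byte values, and a socket is added by storing its file descriptor with BPF_MAP_UPDATE_ELEM. All function names are hypothetical.

```c
#include <linux/bpf.h>     /* union bpf_attr, BPF_MAP_TYPE_SOCKMAP, BPF_ANY */
#include <stdint.h>        /* uint64_t, uintptr_t */
#include <string.h>        /* memset */
#include <sys/syscall.h>   /* __NR_bpf */
#include <unistd.h>        /* syscall */

/* Attributes for BPF_MAP_CREATE: a sockmap indexed by a 4-byte slot number. */
static void init_sockmap_create_attr(union bpf_attr *attr, unsigned int max_entries)
{
    memset(attr, 0, sizeof(*attr));
    attr->map_type = BPF_MAP_TYPE_SOCKMAP;
    attr->key_size = sizeof(int);     /* slot index */
    attr->value_size = sizeof(int);   /* socket file descriptor */
    attr->max_entries = max_entries;
}

/* Attributes for BPF_MAP_UPDATE_ELEM: store *sock_fd at *slot. */
static void init_sockmap_update_attr(union bpf_attr *attr, int map_fd,
                                     const int *slot, const int *sock_fd)
{
    memset(attr, 0, sizeof(*attr));
    attr->map_fd = map_fd;
    attr->key = (uint64_t)(uintptr_t)slot;
    attr->value = (uint64_t)(uintptr_t)sock_fd;
    attr->flags = BPF_ANY;            /* create the entry or replace it */
}

/* Add a socket to the map. Returns 0, or -1 with errno set. */
static int sockmap_add(int map_fd, int slot, int sock_fd)
{
    union bpf_attr attr;

    init_sockmap_update_attr(&attr, map_fd, &slot, &sock_fd);
    return (int)syscall(__NR_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
}

/* Remove whatever socket is stored at slot. Returns 0, or -1 with errno set. */
static int sockmap_del(int map_fd, int slot)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_fd = map_fd;
    attr.key = (uint64_t)(uintptr_t)&slot;
    return (int)syscall(__NR_bpf, BPF_MAP_DELETE_ELEM, &attr, sizeof(attr));
}
```

The map fd would come from `syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr))` after `init_sockmap_create_attr`. The kernel decides which sockets it accepts, so `sockmap_add` can fail even with a valid descriptor.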

To test this work, test_maps in selftests/bpf was leveraged. We added a set of tests to add sockets and do send/recv ops on the sockets to ensure correct behavior. Additionally, the selftests cover a series of negative test cases. We can expand on this in the future.

I also have a basic test program I use with iperf/netperf clients that could be sent as an additional sample if folks want this. It needs a bit of cleanup to send to the list and wasn't included in this series.

For people who prefer git over pulling patches out of their mail editor I've posted the code here,

https://github.com/jrfastab/linux-kernel-xdp/tree/sockmap

For some background information on the genesis of this work it might be helpful to review these slides from netconf 2017 by Thomas Graf,

http://vger.kernel.org/netconf2017.html
https://docs.google.com/a/covalent.io/presentation/d/1dwS...

XDP support for veth driver

Accelerating Linux security with eBPF iptables

eBPF with autocomplete

GRO Engine