Traditionally, BPF could only be attached to sockets for socket filtering. BPF's first use case was in tcpdump. When you run tcpdump, the filter is compiled into a BPF program and attached to a raw AF_PACKET socket in order to print out filtered packets. But over the years, eBPF added the ability to attach to other kernel objects. In addition to socket filtering, some supported attach points are:
- Kprobes (and userspace equivalents uprobes)
- Tracepoints
- Network schedulers or qdiscs for classification or action (tc)
- XDP (eXpress Data Path)

These and other newer features, such as in-kernel helper functions and shared data structures (maps) that can be used to communicate with user space, extend BPF's capabilities.
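To make the tcpdump case concrete, here is a minimal sketch of the classic BPF program that `tcpdump -dd ip` typically prints on an Ethernet interface, together with a toy interpreter for the three opcodes it uses. The helper names (`pack_filter`, `run_filter`) are made up for this note, the accept length 0x40000 assumes the default snapshot length, and nothing is attached to a real socket.

```python
import struct

# Classic BPF opcodes (values from <linux/bpf_common.h>).
BPF_LD_H_ABS = 0x28  # BPF_LD | BPF_H | BPF_ABS: load 16 bits at absolute offset k
BPF_JEQ_K = 0x15     # BPF_JMP | BPF_JEQ | BPF_K: compare accumulator with k
BPF_RET_K = 0x06     # BPF_RET | BPF_K: return k (bytes to accept, 0 = drop)

# (code, jt, jf, k) tuples for the filter expression "ip".
IP_FILTER = [
    (BPF_LD_H_ABS, 0, 0, 12),    # load the EtherType field
    (BPF_JEQ_K, 0, 1, 0x0800),   # IPv4? fall through : skip one instruction
    (BPF_RET_K, 0, 0, 0x40000),  # accept up to 262144 bytes
    (BPF_RET_K, 0, 0, 0),        # drop
]


def pack_filter(prog):
    """Serialize to the struct sock_filter layout: u16 code, u8 jt, u8 jf, u32 k."""
    return b"".join(struct.pack("=HBBI", *insn) for insn in prog)


def run_filter(prog, packet):
    """Toy interpreter for the three opcodes above; returns bytes to accept."""
    acc = 0
    pc = 0
    while pc < len(prog):
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == BPF_LD_H_ABS:
            if k + 2 > len(packet):
                return 0  # an out-of-bounds load rejects the packet
            acc = int.from_bytes(packet[k:k + 2], "big")
        elif code == BPF_JEQ_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET_K:
            return k
        else:
            raise ValueError(f"opcode {code:#x} not modelled in this sketch")
    return 0


# Fake Ethernet frames: 12 bytes of MAC addresses, EtherType, zeroed payload.
ipv4_frame = bytes(12) + b"\x08\x00" + bytes(20)
arp_frame = bytes(12) + b"\x08\x06" + bytes(28)

ipv4_verdict = run_filter(IP_FILTER, ipv4_frame)
arp_verdict = run_filter(IP_FILTER, arp_frame)
```

On a real system the packed bytes would be wrapped in a struct sock_fprog and attached to the AF_PACKET socket with setsockopt(SO_ATTACH_FILTER), which needs privileges and is left out here.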
Summary Presentation:
- Toward Flexible and Efficient In-Kernel Network Function Chaining with IOVisor
- A thorough introduction to eBPF
- BPF and XDP Reference Guide
- eBPF maps: maps keep state between invocations of an eBPF program and allow sharing data between eBPF kernel programs, and also between kernel and user-space applications.
- Persistent BPF objects
- Using eBPF in Kubernetes
- Kernel docs
- https://github.com/iovisor/gobpf
- https://github.com/andrewkroh/go-ebpf
- https://github.com/newtools/ebpf
- comparison: newtools/ebpf#54
- projects on Github: https://github.com/topics/ebpf
- What is XDP?
- XDP based load balancer with L3DSR support
- Drop incoming packets on XDP layer and count for which protocol type
- XDP Production Usage: DDoS Protection and L4LB
- Network filtering for control groups
- New approaches to network fast paths
- kproxy
- Perf ring buffer
- TLS in the kernel
- Crypto kernel TLS socket
- PLAYING WITH KERNEL TLS IN LINUX 4.13 AND GO
- Further Reading
- Dive into BPF: a list of reading material
- How to filter packets super fast: XDP & eBPF!
- http://brendangregg.com/perf.html#eBPF
Notes
- This also implies that API users must zero the full sizeof(bpf_attr) bytes, since the compiler can size-align the struct differently; otherwise garbage data may be interpreted as parameters by future kernels.
- John Fastabend: https://lwn.net/Articles/731133/
- Sample problem: https://github.com/linus5/linux-kernel-xdp/commit/f0c18713b4e6d5398fc9cb8b24a61c566ecbd166
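The zeroing rule from the first note can be sketched from user space with ctypes. This is only an illustration: `MapCreateAttr` mirrors just the first five fields of the BPF_MAP_CREATE member of union bpf_attr, `ATTR_SIZE` is an assumed stand-in for sizeof(union bpf_attr) (the real value depends on the kernel headers), `build_map_create_attr` is a made-up helper, and no bpf() syscall is issued.

```python
import ctypes


class MapCreateAttr(ctypes.Structure):
    """Leading fields of the BPF_MAP_CREATE member of union bpf_attr."""
    _fields_ = [
        ("map_type", ctypes.c_uint32),
        ("key_size", ctypes.c_uint32),
        ("value_size", ctypes.c_uint32),
        ("max_entries", ctypes.c_uint32),
        ("map_flags", ctypes.c_uint32),
    ]


ATTR_SIZE = 128        # assumption: stand-in for sizeof(union bpf_attr)
BPF_MAP_TYPE_HASH = 1  # value from <linux/bpf.h>


def build_map_create_attr(key_size, value_size, max_entries):
    """Return a fully zeroed attr buffer with only the known fields filled in."""
    buf = ctypes.create_string_buffer(ATTR_SIZE)
    # create_string_buffer already zero-fills; the explicit memset makes the
    # step that C callers must not forget visible.
    ctypes.memset(ctypes.addressof(buf), 0, ATTR_SIZE)
    attr = MapCreateAttr.from_buffer(buf)  # view over the same memory
    attr.map_type = BPF_MAP_TYPE_HASH
    attr.key_size = key_size
    attr.value_size = value_size
    attr.max_entries = max_entries
    return buf


raw = build_map_create_attr(key_size=4, value_size=8, max_entries=1024).raw
```

Everything past the fields that were set stays zero, so a newer kernel that reads additional fields from the same buffer sees defaults rather than stack garbage. In C the same effect comes from calling memset(&attr, 0, sizeof(attr)) before filling in the fields.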
This series implements a sockmap and socket redirect helper for BPF using a model similar to XDP netdev redirect. A sockmap is a BPF map type that holds references to sock structs. Then with a new sk redirect bpf helper BPF programs can use the map to redirect skbs between sockets,
bpf_sk_redirect_map(map, key, flags)
Finally, we need a call site to attach our BPF logic to do socket redirects. We added hooks to recv_sock using the existing strparser infrastructure to do this. The call site is added via the BPF attach map call. To enable users to use this infrastructure, a new BPF program type, BPF_PROG_TYPE_SK_SKB, is created that allows users to reference sock details, such as port and IP address fields, to build useful socket-layer programs. The sockmap datapath is as follows,
recv -> strparser -> verdict/action
where this series implements the drop and redirect actions. Additional actions can be added as needed.
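As a reading aid (not part of the quoted series), the recv -> strparser -> verdict/action flow can be modelled in user space. This is not kernel code and uses no BPF API: the 2-byte length-prefix framing, the rule that the first payload byte is the sockmap key, and all names below are assumptions made for the sketch.

```python
import socket
import struct
from enum import Enum


class Verdict(Enum):
    """Model-only action names, not kernel constants."""
    DROP = "drop"
    REDIRECT = "redirect"


def frame(payload):
    """Assumed framing for this sketch: 2-byte big-endian length prefix."""
    return struct.pack("!H", len(payload)) + payload


def parse_messages(stream):
    """Stand-in for the strparser step: split a byte stream into whole messages."""
    messages = []
    offset = 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, offset)
        end = offset + 2 + length
        if end > len(stream):
            break  # incomplete message, wait for more data
        messages.append(stream[offset + 2:end])
        offset = end
    return messages


def verdict(message, sockmap):
    """Stand-in for the verdict program: pick an action and a sockmap key."""
    if message and message[0] in sockmap:
        return Verdict.REDIRECT, message[0]
    return Verdict.DROP, None


def run_datapath(stream, sockmap):
    """recv -> strparser -> verdict/action; returns the number of dropped messages."""
    dropped = 0
    for message in parse_messages(stream):
        action, key = verdict(message, sockmap)
        if action is Verdict.REDIRECT:
            sockmap[key].sendall(message)  # deliver to the socket stored in the map
        else:
            dropped += 1
    return dropped


# One connected pair plays the role of a socket held in the sockmap under key 1.
tx, rx = socket.socketpair()
sockmap = {1: tx}
stream = frame(b"\x01hello") + frame(b"\x09nope")
dropped = run_datapath(stream, sockmap)
received = rx.recv(64)
tx.close()
rx.close()
```

In the kernel the map holds references to sock structs and the redirect is requested through the helper shown above; the dictionary of Python sockets only mirrors that shape.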
A sample program is provided to illustrate how a sockmap can be integrated with cgroups and used to add/delete sockets in a sockmap. The program is simple but should show many of the key ideas.
To test this work, test_maps in selftests/bpf was leveraged. We added a set of tests to add sockets and do send/recv ops on the sockets to ensure correct behavior. Additionally, the selftests exercise a series of negative test cases. We can expand on this in the future.
I also have a basic test program I use with iperf/netperf clients that could be sent as an additional sample if folks want this. It needs a bit of cleanup to send to the list and wasn't included in this series.
For people who prefer git over pulling patches out of their mail editor I've posted the code here,
https://github.com/jrfastab/linux-kernel-xdp/tree/sockmap
For some background information on the genesis of this work it might be helpful to review these slides from netconf 2017 by Thomas Graf,
http://vger.kernel.org/netconf2017.html
https://docs.google.com/a/covalent.io/presentation/d/1dwS...
XDP support for veth driver
- https://twitter.com/davem_dokebi/status/1021082455086792704
- https://marc.info/?l=linux-netdev&m=153227240330693&w=2
Accelerating Linux security with eBPF iptables
eBPF with autocomplete
GRO Engine