BPFChain: Building Safe Multi-Program eBPF Environments

A Hands-on Tutorial on Cgroup-BPF Chaining and Monitoring

Presenters

Presenter           Institution
Prankur Gupta       Meta Platforms, Inc.
Takshak Chahande    Meta Platforms, Inc.

Tutorial Timetable

08:45 – 09:00  Registration and Setup
09:00 – 09:20  Lecture 1: Kernel Namespaces for Containers/VMs
09:20 – 09:50  Lab 1: Creating Your Own Namespaces for Containers/VMs
09:50 – 10:10  Lecture 2: eBPF Internals — hooks, program types, maps, cgroup-bpf
10:10 – 10:45  Lab 2: Hands-on eBPF Program Creation and Attachment on Containers/VMs
10:45 – 11:00  Morning Coffee Break
11:00 – 11:40  Lecture 3: Anatomy of Cgroup-BPF Execution: kernel order, return semantics, BPF_F_ALLOW_MULTI
11:40 – 12:30  Lecture 4: The Multi-eBPF Program Challenges
12:30 – 14:00  Lunch
14:00 – 14:30  Lab 3: Reproducing Multi-eBPF Program Conflicts
14:30 – 14:50  Lecture 5: Trampoline Architecture for eBPF Program Chaining
14:50 – 15:30  Lab 4: Building an eBPF Program Chainer from Scratch
15:30 – 15:45  Afternoon Coffee Break
15:45 – 16:25  Lecture 6: Monitoring & Observability: detecting conflicts, auditing chain state
16:25 – 17:00  Lab 5: Building a Monitoring Framework
17:00 – 18:00  Discussion and Feedback

Summary

Multi-program eBPF deployments are now the norm in production infrastructure — yet the educational landscape has not kept pace. While eBPF fundamentals are thoroughly covered in existing courses and tutorials, the operational challenges that arise when multiple independently developed programs share the same cgroup-BPF hooks remain almost entirely unaddressed: subtle execution-order conflicts, return-value overrides that silently bypass security policies, and shared map races that are extraordinarily difficult to diagnose under production conditions [1, 8].


This full-day, hands-on tutorial fills that gap directly. Drawing on two-plus years of production experience orchestrating thousands of eBPF programs across Meta's fleet — including the NetEdit orchestration framework [1] and Meta's internal bpf-chainer and xdp-chainer systems — the instructors bring firsthand knowledge of failure modes that do not appear in textbooks. Attendees progress from Linux kernel primitives (namespaces, cgroups [7]) through eBPF program internals [5, 6] and cgroup-BPF execution semantics, to trampoline-based program chaining [3] and a purpose-built monitoring framework — each concept immediately reinforced through a guided lab that attendees run directly on their own laptops against a real kernel. By the end of the session, participants will be able to reproduce multi-program conflicts, build a working cgroup-BPF chainer from scratch, and instrument a production deployment for early detection of program interaction issues. The tutorial is designed for networking practitioners, infrastructure engineers, and researchers building or operating eBPF-based systems such as Cilium [9], Calico [10], or Katran [11].

Motivation

eBPF has evolved from an experimental packet filter into production-critical infrastructure powering networking, security, and observability at hyperscale [12]. Multiple independently developed programs now routinely coexist on the same host—a deployment model that is standard in container orchestration platforms [8, 9, 10] and large-scale fleet management systems [1].


This multi-program reality creates a fundamental coordination challenge. Cgroup-BPF hooks execute attached programs in FIFO (first-attached, first-executed) order, and the kernel uses only the return value of the last program [5, 6]. A later program can silently override earlier security decisions, drop packets, or corrupt shared map state—with no built-in mechanism to detect what went wrong. At hyperscale, these subtle interactions lead to hard-to-triage incidents that can affect millions of connections [1].


Despite strong community interest — evidenced by dedicated eBPF workshops at SIGCOMM 2023 [13] and 2024 [14] — no existing tutorial addresses multi-program coordination as a unified discipline. Adjacent efforts have tackled pieces of the problem: bpfman [8] targets program lifecycle management in Kubernetes, and kernel patches have explored XDP dispatcher architectures [4]. But cgroup-BPF execution semantics, return-value conflict resolution, shared map coordination, and operational monitoring have never been brought together into a single, hands-on treatment. This tutorial does exactly that — and grounds every concept in the production lessons learned operating such a system at Meta's scale.

Outline

The tutorial is organized into two halves: a morning session that builds foundational understanding, and an afternoon session focused on engineering solutions.

Morning — Foundations & the Problem Space

We open by grounding attendees in the Linux kernel primitives that make container and VM networking possible. Lecture 1 introduces kernel namespaces—PID, network, mount, and cgroup—and explains how they provide the isolation boundaries that eBPF programs operate within [7]. In Lab 1, attendees create and configure their own namespaces from scratch, gaining hands-on intuition for the environment eBPF programs target.


Lecture 2 shifts to eBPF itself: the hook model, program types (particularly cgroup-BPF), maps for shared state, and the attach/detach lifecycle [5, 6]. Lab 2 puts this into practice—attendees write and attach eBPF programs to containers and VMs, observing how programs interact with the kernel data path.


After the coffee break, Lecture 3 takes a deep dive into cgroup-BPF execution internals: how the kernel orders multiple attached programs, how return values propagate, and the semantics of BPF_F_ALLOW_MULTI. Lecture 4 builds on this to present the multi-program challenge in full—demonstrating through real-world examples how programs from different teams can silently conflict, override each other's decisions, and create failures that are invisible to standard debugging tools [1].

Afternoon — Building Solutions

The afternoon begins with Lab 3, where attendees reproduce multi-program conflicts in a controlled environment—attaching competing cgroup-BPF programs and observing return value overrides, policy bypasses, and map corruption firsthand.


Lecture 5 introduces the trampoline architecture as a solution: using BPF trampolines [3] to interpose a coordination layer that controls execution order, mediates return values, and enforces safe program composition. In Lab 4—the centerpiece of the tutorial—attendees build an eBPF program chainer from scratch, implementing the core dispatch logic that ensures deterministic, conflict-free multi-program execution.


Lecture 6 addresses the operational side: how to monitor multi-program environments, detect conflicts in real time, and audit chain state for correctness. Lab 5 has attendees construct a monitoring framework that provides visibility into which programs are attached, their execution order, and whether return value conflicts are occurring.


The day closes with an open discussion covering production lessons from operating these systems at Meta's scale [1, 2], planned improvements to the architecture, and the roadmap toward open-sourcing the BPFChain framework.

Expected Audience and Prerequisites

This tutorial is aimed at networking researchers, infrastructure engineers, cloud/platform engineers, and eBPF practitioners who want to understand the challenges of running multiple eBPF programs in production environments.

Prerequisites

Laptop Requirements

Attendees must bring a laptop running a recent Linux distribution with kernel 6.9 or later — shipped by default in recent releases such as Ubuntu 24.10+, Fedora 41+, or equivalent. A VM running Linux is equally fine (Vagrant, VirtualBox, or UTM on macOS). All kernel features required for the labs — namespaces, cgroups, BPF, and perf events — are enabled by default in the stock kernels of these distributions. No custom kernel compilation or special configuration is needed.


The only additional setup is installing a few standard packages via your distribution's package manager; the exact package list will be included in the setup instructions.

Biographies

Prankur Gupta is a Software Engineer at Meta in the Host Networking team. He is the co-creator and lead for NetEdit [1], an eBPF orchestrator for large-scale deployments, and co-creator of a general-purpose BPF chainer that supports cgroup-BPF chaining, with extensive experience deploying eBPF-based solutions across Meta's production fleet.


Takshak Chahande is a Software Engineer at Meta and the lead for the container and VM networking vertical, which incorporates a substantial number of features heavily leveraging eBPF. He is the creator of xdp-chainer and co-created the general-purpose BPF chainer that supports cgroup-BPF chaining.


Contact:

Additional Information

TBD — Setup instructions, lab skeleton code, verification scripts, and all supporting materials will be made available at least one month before the tutorial.

References

  1. T. A. Benson, P. Kannan, P. Gupta, B. Madhavan, K. S. Arora, J. Meng, M. Lau, A. Dhamija, R. Krishnamurthy, S. Sundaresan, N. Spring, and Y. Zhang, "NetEdit: Deploying Modular eBPF Programs at Scale," in Proc. ACM SIGCOMM, Sydney, Australia, Aug. 2024. https://doi.org/10.1145/3651890.3672227
  2. "Building NetEdit: Managing eBPF programs at scale at Meta," APNIC Blog, Jun. 2025. https://blog.apnic.net/2025/06/05/building-netedit-managing-ebpf-programs-at-scale-at-meta/
  3. A. Starovoitov, "Introduce BPF Trampoline," Linux kernel patch series (bpf-next), Nov. 2019. https://lore.kernel.org/all/20191114185720.1641606-14-ast@kernel.org/T/
  4. T. Høiland-Jørgensen et al., "XDP: Support multiple programs on a single interface through chain calls," Linux kernel patch series (bpf-next), 2019. https://lore.kernel.org/bpf/157002303220.1302756.13509533392771604835.stgit@alrua-x1/T/
  5. Linux man-pages project, "bpf(2) — Linux manual page." https://man7.org/linux/man-pages/man2/bpf.2.html
  6. Linux kernel documentation, "BPF_PROG_TYPE_CGROUP_SOCKOPT." https://docs.kernel.org/bpf/prog_cgroup_sockopt.html
  7. M. Kerrisk, "namespaces(7) — Linux manual page." https://man7.org/linux/man-pages/man7/namespaces.7.html
  8. "bpfman: An eBPF Manager for Linux and Kubernetes." https://bpfman.io/
  9. Cilium Project, "eBPF-based Networking, Observability, Security." https://cilium.io/
  10. Tigera, "Project Calico — Cloud Native Networking and Security." https://www.tigera.io/project-calico/
  11. Facebook Incubator, "Katran: A high-performance layer 4 load balancer." https://github.com/facebookincubator/katran
  12. eBPF Foundation, "eBPF Infrastructure Platform Report," 2024. https://ebpf.foundation/
  13. "ACM SIGCOMM 2023 Workshop on eBPF and Kernel Extensions." https://conferences.sigcomm.org/sigcomm/2023/workshop-ebpf.html
  14. "ACM SIGCOMM 2024 Workshop on eBPF and Kernel Extensions." https://conferences.sigcomm.org/sigcomm/2024/workshop/ebpf/