<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: willamhou</title>
    <description>The latest articles on Forem by willamhou (@willamhou).</description>
    <link>https://forem.com/willamhou</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3856634%2F4545fd93-d2d7-46b4-a778-faf990fee34f.jpg</url>
      <title>Forem: willamhou</title>
      <link>https://forem.com/willamhou</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/willamhou"/>
    <language>en</language>
    <item>
      <title>Building an IPC bus for Kubernetes sidecars: WAL, DLQ, and ring-buffer backpressure</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Thu, 23 Apr 2026 02:18:14 +0000</pubDate>
      <link>https://forem.com/willamhou/building-an-ipc-bus-for-kubernetes-sidecars-wal-dlq-and-ring-buffer-backpressure-4b27</link>
      <guid>https://forem.com/willamhou/building-an-ipc-bus-for-kubernetes-sidecars-wal-dlq-and-ring-buffer-backpressure-4b27</guid>
      <description>&lt;p&gt;If you put two sidecars in a pod and ask them to talk to each other over HTTP, sooner or later one of them crashes mid-request and you lose a message. If you do it enough times, you reinvent a message bus.&lt;/p&gt;

&lt;p&gt;This post is about the small in-pod message bus we ended up writing for &lt;a href="https://github.com/Prismer-AI/k8s4claw" rel="noopener noreferrer"&gt;k8s4claw&lt;/a&gt;, a Kubernetes operator for AI agent runtimes. The bus sits between channel sidecars (Slack, Discord, Webhook) and the agent runtime container. It has four wire protocols, a write-ahead log, a BoltDB-backed dead letter queue, and a ring buffer with backpressure. All of it is open source (&lt;a href="https://github.com/Prismer-AI/k8s4claw/tree/main/internal/ipcbus" rel="noopener noreferrer"&gt;internal/ipcbus/&lt;/a&gt;), around 2k lines of Go.&lt;/p&gt;

&lt;p&gt;This post is the design doc you actually want to read, not the one we had to write.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shape of the problem
&lt;/h2&gt;

&lt;p&gt;A &lt;code&gt;Claw&lt;/code&gt; pod looks like this when it has a Slack channel attached:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────┐
│  Pod                                         │
│                                              │
│  [channel-slack] ──UDS──► [ipc-bus] ──►┐     │
│                                        ▼     │
│                                  [runtime]   │
│                                              │
└──────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three containers. The channel sidecar reads from Slack. The runtime is the actual AI agent. The IPC bus is a &lt;a href="https://kubernetes.io/blog/2023/08/25/native-sidecar-containers/" rel="noopener noreferrer"&gt;native sidecar&lt;/a&gt; (init container with &lt;code&gt;restartPolicy: Always&lt;/code&gt;) that routes messages between them.&lt;/p&gt;

&lt;p&gt;The naive version of this is: let the two containers talk HTTP directly. The reality is that at least four things are going to go wrong:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The runtime will be overloaded when a Slack event arrives and we need somewhere to buffer it.&lt;/li&gt;
&lt;li&gt;The runtime will crash mid-response and we need to redeliver.&lt;/li&gt;
&lt;li&gt;A slow downstream (say, a user's laptop on 3G) will fall behind and we need to push back instead of dropping.&lt;/li&gt;
&lt;li&gt;The four runtimes we support speak four different wire protocols. HTTP isn't enough.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So we wrote a bus. Let me walk through the four mechanisms that earn their keep.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mechanism 1 — length-prefix framing
&lt;/h2&gt;

&lt;p&gt;This isn't glamorous, but it's the first thing you get wrong in a message bus.&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;Message&lt;/code&gt; is a JSON blob on the wire:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Message&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ID&lt;/span&gt;            &lt;span class="kt"&gt;string&lt;/span&gt;          &lt;span class="s"&gt;`json:"id"`&lt;/span&gt;
    &lt;span class="n"&gt;Type&lt;/span&gt;          &lt;span class="n"&gt;MessageType&lt;/span&gt;     &lt;span class="s"&gt;`json:"type"`&lt;/span&gt;
    &lt;span class="n"&gt;Channel&lt;/span&gt;       &lt;span class="kt"&gt;string&lt;/span&gt;          &lt;span class="s"&gt;`json:"channel,omitempty"`&lt;/span&gt;
    &lt;span class="n"&gt;CorrelationID&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;          &lt;span class="s"&gt;`json:"correlationId,omitempty"`&lt;/span&gt;
    &lt;span class="n"&gt;ReplyTo&lt;/span&gt;       &lt;span class="kt"&gt;string&lt;/span&gt;          &lt;span class="s"&gt;`json:"replyTo,omitempty"`&lt;/span&gt;
    &lt;span class="n"&gt;Timestamp&lt;/span&gt;     &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;       &lt;span class="s"&gt;`json:"timestamp"`&lt;/span&gt;
    &lt;span class="n"&gt;Payload&lt;/span&gt;       &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RawMessage&lt;/span&gt; &lt;span class="s"&gt;`json:"payload,omitempty"`&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the wire it looks like &lt;code&gt;[4-byte big-endian length][JSON bytes]&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;MaxMessageSize&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;16&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1024&lt;/span&gt;
    &lt;span class="n"&gt;FrameHeaderSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;WriteMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Writer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Marshal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"failed to marshal message: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;MaxMessageSize&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"message size %d exceeds maximum %d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;MaxMessageSize&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;frame&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FrameHeaderSize&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;binary&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BigEndian&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PutUint32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="nb"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FrameHeaderSize&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why length-prefix instead of newline-delimited JSON? Because newline-delimited framing only works if every producer emits compact, single-line JSON. Raw newlines are not allowed inside JSON strings, but they are legal between tokens, so a single pretty-printed payload breaks the stream. Length-prefix framing just works: a reader reads 4 bytes, gets the length, reads that many bytes, deserializes. No lookahead, no escape tables.&lt;/p&gt;

&lt;p&gt;The 16 MB cap is there to fail loudly rather than run out of memory on a malformed header. In practice our real messages are well under 64 KB.&lt;/p&gt;
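
&lt;p&gt;The read side mirrors &lt;code&gt;WriteMessage&lt;/code&gt;. Here is a minimal sketch of it, not the code from the repo: &lt;code&gt;writeFrame&lt;/code&gt; and &lt;code&gt;readFrame&lt;/code&gt; are hypothetical helpers that work on raw bytes so the example stays self-contained, and the real &lt;code&gt;ReadMessage&lt;/code&gt; may well differ. The part worth copying is that the length is checked against the cap before anything is allocated.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "bytes"
    "encoding/binary"
    "fmt"
    "io"
)

const (
    MaxMessageSize  = 16 * 1024 * 1024
    FrameHeaderSize = 4
)

// writeFrame is a hypothetical helper: it frames an already-encoded payload.
func writeFrame(w io.Writer, data []byte) error {
    if len(data) &amp;gt; MaxMessageSize {
        return fmt.Errorf("message size %d exceeds maximum %d", len(data), MaxMessageSize)
    }
    frame := make([]byte, FrameHeaderSize+len(data))
    binary.BigEndian.PutUint32(frame, uint32(len(data)))
    copy(frame[FrameHeaderSize:], data)
    _, err := w.Write(frame)
    return err
}

// readFrame is a hypothetical helper: it reads one length-prefixed payload.
func readFrame(r io.Reader) ([]byte, error) {
    var header [FrameHeaderSize]byte
    // io.ReadFull keeps reading until the header is complete or the stream ends.
    if _, err := io.ReadFull(r, header[:]); err != nil {
        return nil, err
    }
    size := binary.BigEndian.Uint32(header[:])
    // Reject oversized lengths before allocating anything.
    if size &amp;gt; MaxMessageSize {
        return nil, fmt.Errorf("frame size %d exceeds maximum %d", size, MaxMessageSize)
    }
    data := make([]byte, size)
    if _, err := io.ReadFull(r, data); err != nil {
        return nil, fmt.Errorf("short frame: %w", err)
    }
    return data, nil
}

func main() {
    var buf bytes.Buffer
    // The second payload has newlines between tokens, which would break
    // newline-delimited framing but is fine here.
    payloads := []string{`{"id":"1"}`, "{\n  \"id\": \"2\"\n}"}
    for _, p := range payloads {
        if err := writeFrame(&amp;amp;buf, []byte(p)); err != nil {
            panic(err)
        }
    }
    for range payloads {
        data, err := readFrame(&amp;amp;buf)
        if err != nil {
            panic(err)
        }
        fmt.Printf("%q\n", data)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A header of &lt;code&gt;0xFFFFFFFF&lt;/code&gt; fails at the size check instead of triggering a 4 GB allocation, which is the whole point of the cap.&lt;/p&gt;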

&lt;h2&gt;
  
  
  Mechanism 2 — four bridge protocols behind one interface
&lt;/h2&gt;

&lt;p&gt;Different runtimes speak different things:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Runtime&lt;/th&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenClaw&lt;/td&gt;
&lt;td&gt;WebSocket&lt;/td&gt;
&lt;td&gt;Full-duplex, JSON-native, easy from Node.js&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NanoClaw&lt;/td&gt;
&lt;td&gt;UDS&lt;/td&gt;
&lt;td&gt;Lowest overhead for same-pod communication&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ZeroClaw&lt;/td&gt;
&lt;td&gt;SSE&lt;/td&gt;
&lt;td&gt;Already has an HTTP API, SSE for server-push&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PicoClaw&lt;/td&gt;
&lt;td&gt;TCP&lt;/td&gt;
&lt;td&gt;Minimal client, hand-rolled in 50 lines&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The bus abstracts them behind one interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;RuntimeBridge&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
    &lt;span class="n"&gt;Send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
    &lt;span class="n"&gt;Receive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four methods. Adding a new protocol is one file (&lt;a href="https://github.com/Prismer-AI/k8s4claw/blob/main/internal/ipcbus/bridge_tcp.go" rel="noopener noreferrer"&gt;example: TCP bridge&lt;/a&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;TCPBridge&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;streamBridge&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;TCPBridge&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dialer&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DialContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"tcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;streamBridge&lt;/code&gt; is a shared base that implements &lt;code&gt;Send&lt;/code&gt;/&lt;code&gt;Receive&lt;/code&gt;/&lt;code&gt;Close&lt;/code&gt; on top of any &lt;code&gt;net.Conn&lt;/code&gt;. It handles &lt;code&gt;context.Context&lt;/code&gt; deadlines properly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;streamBridge&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"not connected"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// Respect context deadline for the write.&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;deadline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deadline&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetWriteDeadline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deadline&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetWriteDeadline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;}()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;WriteMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The subtle bit is &lt;code&gt;Receive&lt;/code&gt;. &lt;code&gt;ReadMessage&lt;/code&gt; blocks on the socket. If the caller cancels the context, we want the read to unblock. So &lt;code&gt;Receive&lt;/code&gt; spawns a second goroutine whose only job is to watch the context and call &lt;code&gt;Close&lt;/code&gt; on the conn, which makes the blocked &lt;code&gt;ReadMessage&lt;/code&gt; return with an error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;closed&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
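
&lt;p&gt;To see the pattern work end to end, here is a small self-contained sketch. It is an illustration under assumptions, not the repo's &lt;code&gt;Receive&lt;/code&gt;: &lt;code&gt;readUntilCancelled&lt;/code&gt; is a hypothetical function, &lt;code&gt;net.Pipe&lt;/code&gt; stands in for the real socket, and the local &lt;code&gt;done&lt;/code&gt; channel plays the role of &lt;code&gt;b.closed&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "context"
    "fmt"
    "net"
    "time"
)

// readUntilCancelled blocks in conn.Read until data arrives or ctx is cancelled.
// A watcher goroutine closes the conn on cancellation, which unblocks the read.
func readUntilCancelled(ctx context.Context, conn net.Conn) error {
    done := make(chan struct{})
    defer close(done) // stops the watcher once the read has returned

    go func() {
        select {
        case &amp;lt;-ctx.Done():
            _ = conn.Close()
        case &amp;lt;-done:
        }
    }()

    buf := make([]byte, 4)
    _, err := conn.Read(buf)
    return err
}

func main() {
    // net.Pipe gives an in-memory net.Conn pair; the peer never writes,
    // so the read can only return because of the cancellation.
    client, server := net.Pipe()
    defer server.Close()

    ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
    defer cancel()

    err := readUntilCancelled(ctx, client)
    fmt.Println("read returned:", err)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The second &lt;code&gt;select&lt;/code&gt; case matters: without it the watcher goroutine would outlive every read that finishes normally.&lt;/p&gt;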



&lt;p&gt;The SSE bridge is the odd one out because SSE is unidirectional (server → client, event-stream format) and we need bidirectional. So it uses an HTTP POST for send and an SSE &lt;code&gt;GET /events&lt;/code&gt; for receive, with exponential-backoff reconnect on the stream:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;backoff&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// ... connect and read events ...&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backoff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;backoff&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;backoff&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;backoff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
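
&lt;p&gt;One thing the abbreviated loop hides is shutdown: a bare &lt;code&gt;time.Sleep&lt;/code&gt; cannot be interrupted. A context-aware variant could look like the sketch below. &lt;code&gt;nextBackoff&lt;/code&gt; and &lt;code&gt;sleepCtx&lt;/code&gt; are hypothetical helpers, not names from the repo, and the delays are scaled down to milliseconds so the example finishes quickly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "context"
    "fmt"
    "time"
)

// nextBackoff doubles the delay and caps it at limit.
func nextBackoff(cur, limit time.Duration) time.Duration {
    next := cur * 2
    if next &amp;gt; limit {
        return limit
    }
    return next
}

// sleepCtx waits for d, or returns early with ctx.Err() if ctx is cancelled.
func sleepCtx(ctx context.Context, d time.Duration) error {
    timer := time.NewTimer(d)
    defer timer.Stop()
    select {
    case &amp;lt;-ctx.Done():
        return ctx.Err()
    case &amp;lt;-timer.C:
        return nil
    }
}

func main() {
    ctx, cancel := context.WithTimeout(context.Background(), 35*time.Millisecond)
    defer cancel()

    backoff := 10 * time.Millisecond // scaled down from the 1s in the post
    for attempt := 1; ; attempt++ {
        // ... connect and read events would go here ...
        fmt.Printf("attempt %d failed, waiting %v\n", attempt, backoff)
        if err := sleepCtx(ctx, backoff); err != nil {
            fmt.Println("stopping:", err)
            return
        }
        backoff = nextBackoff(backoff, 30*time.Millisecond)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;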



&lt;h2&gt;
  
  
  Mechanism 3 — Write-Ahead Log (WAL)
&lt;/h2&gt;

&lt;p&gt;This is the one that earns the bus the right to exist.&lt;/p&gt;

&lt;p&gt;When a message comes in from a channel sidecar, the router does three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Append a WAL entry to disk (emptyDir-backed) with state &lt;code&gt;pending&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;bridge.Send(ctx, msg)&lt;/code&gt; to hand it off to the runtime bridge.&lt;/li&gt;
&lt;li&gt;Mark the WAL entry complete as soon as &lt;code&gt;Send&lt;/code&gt; returns success. If &lt;code&gt;Send&lt;/code&gt; fails, call &lt;code&gt;scheduleRetry&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We mark a message delivered on &lt;em&gt;transport success&lt;/em&gt; (the bridge accepted the bytes), not on runtime ack. We considered a runtime-ack round-trip and decided against it: it doubles round-trips, forces every runtime to implement ack semantics, and our &lt;code&gt;Message.ID&lt;/code&gt; already works as an idempotency key, so downstream retries aren't harmful. The cost: if a message leaves &lt;code&gt;bridge.Send&lt;/code&gt; OK but the runtime crashes before processing it, we lose that one message. That tradeoff is acceptable for a chat agent and &lt;em&gt;not&lt;/em&gt; acceptable for a payment system. Different design calls, different bus.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;scheduleRetry&lt;/code&gt; increments &lt;code&gt;Attempts&lt;/code&gt; on the WAL entry. Once an entry reaches &lt;code&gt;maxRetryAttempts = 5&lt;/code&gt; attempts, it is marked &lt;code&gt;dlq&lt;/code&gt; and a copy is parked in the DLQ.&lt;/p&gt;
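
&lt;p&gt;The whole state machine fits in a few lines. The sketch below is a simplification, not the router's code: &lt;code&gt;entry&lt;/code&gt; and &lt;code&gt;deliver&lt;/code&gt; are hypothetical names, &lt;code&gt;send&lt;/code&gt; stands in for &lt;code&gt;bridge.Send&lt;/code&gt;, retries run inline instead of being scheduled with a delay, and it assumes that &lt;code&gt;Attempts&lt;/code&gt; counts failed sends and that the fifth failure is the one that triggers the DLQ.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "errors"
    "fmt"
)

const maxRetryAttempts = 5

type state string

const (
    pending  state = "pending"
    complete state = "complete"
    dlq      state = "dlq"
)

// entry is a simplified stand-in for a WAL entry.
type entry struct {
    ID       string
    State    state
    Attempts int
}

// deliver runs the append/send/complete cycle for one message.
// Retries happen inline here for brevity; a real router would
// schedule them with a delay.
func deliver(id string, send func(id string) error) entry {
    e := entry{ID: id, State: pending} // step 1: record as pending
    for {
        if err := send(e.ID); err == nil { // step 2: hand off to the bridge
            e.State = complete // step 3: transport success
            return e
        }
        e.Attempts++
        if e.Attempts &amp;gt;= maxRetryAttempts {
            e.State = dlq // retries exhausted: park in the DLQ
            return e
        }
    }
}

func main() {
    calls := 0
    flaky := func(string) error {
        calls++
        if calls &amp;lt; 3 {
            return errors.New("runtime unavailable")
        }
        return nil
    }
    fmt.Printf("%+v\n", deliver("m1", flaky))

    alwaysDown := func(string) error { return errors.New("runtime unavailable") }
    fmt.Printf("%+v\n", deliver("m2", alwaysDown))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The first message ends up &lt;code&gt;complete&lt;/code&gt; after two failed sends; the second ends up &lt;code&gt;dlq&lt;/code&gt; after five.&lt;/p&gt;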

&lt;p&gt;The WAL is a JSON-lines file. Each line is a &lt;code&gt;WALEntry&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;WALEntry&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ID&lt;/span&gt;       &lt;span class="kt"&gt;string&lt;/span&gt;   &lt;span class="s"&gt;`json:"id"`&lt;/span&gt;
    &lt;span class="n"&gt;Channel&lt;/span&gt;  &lt;span class="kt"&gt;string&lt;/span&gt;   &lt;span class="s"&gt;`json:"channel"`&lt;/span&gt;
    &lt;span class="n"&gt;State&lt;/span&gt;    &lt;span class="n"&gt;WALState&lt;/span&gt; &lt;span class="s"&gt;`json:"state"`&lt;/span&gt;       &lt;span class="c"&gt;// pending | complete | dlq&lt;/span&gt;
    &lt;span class="n"&gt;Attempts&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;      &lt;span class="s"&gt;`json:"attempts"`&lt;/span&gt;
    &lt;span class="n"&gt;TS&lt;/span&gt;       &lt;span class="kt"&gt;string&lt;/span&gt;   &lt;span class="s"&gt;`json:"ts"`&lt;/span&gt;
    &lt;span class="n"&gt;Msg&lt;/span&gt;      &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt; &lt;span class="s"&gt;`json:"msg,omitempty"`&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;JSON-lines is nice because you can &lt;code&gt;cat wal.log | jq&lt;/code&gt; during an incident and see exactly what the bus was doing. It's also append-only, which means writes are O(1) and you never corrupt the middle of the file on a crash — at worst you have a half-written last line, which the recovery code handles.&lt;/p&gt;
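
&lt;p&gt;Recovery from a torn last line is short enough to sketch. This is an illustration, not the repo's recovery code: &lt;code&gt;walEntry&lt;/code&gt; and &lt;code&gt;replay&lt;/code&gt; are hypothetical names, and it assumes a state change is recorded by appending a newer line with the same ID, so the last line for each ID wins.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "io"
    "strings"
)

// walEntry is a trimmed-down stand-in for WALEntry.
type walEntry struct {
    ID    string `json:"id"`
    State string `json:"state"`
}

// replay scans a JSON-lines log and returns the IDs whose latest state is
// pending. Lines that fail to parse (such as a torn final line) are skipped
// and counted.
func replay(r io.Reader) (pendingIDs []string, skipped int, err error) {
    latest := map[string]string{}
    var order []string // first-seen order, so the result is deterministic

    sc := bufio.NewScanner(r)
    // The default 64 KB line limit is too small for large entries.
    sc.Buffer(make([]byte, 0, 64*1024), 32*1024*1024)
    for sc.Scan() {
        var e walEntry
        if json.Unmarshal(sc.Bytes(), &amp;amp;e) != nil || e.ID == "" {
            skipped++
            continue
        }
        if _, seen := latest[e.ID]; !seen {
            order = append(order, e.ID)
        }
        latest[e.ID] = e.State // later lines override earlier ones
    }
    if err := sc.Err(); err != nil {
        return nil, skipped, err
    }
    for _, id := range order {
        if latest[id] == "pending" {
            pendingIDs = append(pendingIDs, id)
        }
    }
    return pendingIDs, skipped, nil
}

func main() {
    log := strings.Join([]string{
        `{"id":"a","state":"pending"}`,
        `{"id":"b","state":"pending"}`,
        `{"id":"a","state":"complete"}`,
        `{"id":"c","state":"pen`, // torn write from a crash
    }, "\n")

    ids, skipped, err := replay(strings.NewReader(log))
    fmt.Println(ids, skipped, err)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Only &lt;code&gt;b&lt;/code&gt; is still pending after replay: &lt;code&gt;a&lt;/code&gt; was completed by a later line and the torn line for &lt;code&gt;c&lt;/code&gt; is skipped.&lt;/p&gt;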

&lt;p&gt;The interesting operation is compaction. The file grows without bound otherwise. Compaction rewrites the file keeping only &lt;code&gt;pending&lt;/code&gt; entries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;WAL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Compact&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// ... write all pending entries to wal.log.tmp ...&lt;/span&gt;
    &lt;span class="c"&gt;// atomic rename&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tmpPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;WAL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;NeedsCompaction&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;compactionThreshold&lt;/span&gt;  &lt;span class="c"&gt;// 10 MB&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We don't compact on every &lt;code&gt;Complete&lt;/code&gt; call, because that would tank throughput. The &lt;code&gt;cmd/ipcbus&lt;/code&gt; binary runs a 60-second ticker that checks &lt;code&gt;NeedsCompaction()&lt;/code&gt; and rewrites the file when it grows past 10 MB. That's a coarse heuristic: it will compact even if most entries are still &lt;code&gt;pending&lt;/code&gt;, wasting some I/O, but it's simple and steady-state overhead is near zero. A smarter policy (one that also checks the &lt;code&gt;pending&lt;/code&gt; ratio before committing to a rewrite) would be a reasonable first PR.&lt;/p&gt;

&lt;p&gt;The WAL does not fsync on every append. We batch. If the node is hard-killed, we can lose the last few hundred milliseconds of messages. That's an acceptable tradeoff for a system where the upstream Slack delivery is already best-effort. If you care more about durability, &lt;code&gt;Flush()&lt;/code&gt; is exposed and you can call it from your own code, but we chose not to make it automatic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mechanism 4 — Dead Letter Queue (DLQ)
&lt;/h2&gt;

&lt;p&gt;After 5 delivery attempts, a message is "dead." We don't silently drop it; we move it to the DLQ:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewDLQ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;maxSize&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;DLQ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;bolt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;bolt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Options&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Timeout&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="c"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;BoltDB is an &lt;a href="https://github.com/etcd-io/bbolt" rel="noopener noreferrer"&gt;embedded KV store with a B+tree on-disk layout&lt;/a&gt;. It's fast, transactional, and single-file. Perfect for a sidecar that needs to keep a few megabytes of dead messages, queryable by ID and age.&lt;/p&gt;

&lt;p&gt;Two eviction policies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;maxSize&lt;/strong&gt; — a hard cap on entry count. When we're full, we evict the oldest.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ttl&lt;/strong&gt; — entries older than the TTL are purged. &lt;code&gt;NewDLQ(path, maxSize, ttl)&lt;/code&gt; takes both as constructor args; the &lt;code&gt;cmd/ipcbus&lt;/code&gt; binary passes &lt;code&gt;maxSize=10000, ttl=24h&lt;/code&gt; and runs an hourly &lt;code&gt;PurgeExpired&lt;/code&gt; ticker. Library callers can pick their own.&lt;/li&gt;
&lt;/ul&gt;
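
&lt;p&gt;The two policies interact in a simple way. Here is an in-memory sketch, assuming entries are kept oldest-first; &lt;code&gt;memDLQ&lt;/code&gt;, &lt;code&gt;entry&lt;/code&gt;, and their methods are illustrative stand-ins, not the BoltDB-backed implementation in the repo.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"time"
)

// entry is a hypothetical stand-in for a dead-lettered message.
type entry struct {
	ID     string
	DeadAt time.Time
}

// memDLQ keeps entries oldest-first in a slice.
type memDLQ struct {
	maxSize int
	ttl     time.Duration
	entries []entry
}

// put appends e, then evicts the oldest entries while the queue is over maxSize.
func (d *memDLQ) put(e entry) {
	d.entries = append(d.entries, e)
	for len(d.entries) > d.maxSize {
		d.entries = d.entries[1:]
	}
}

// purgeExpired drops entries older than ttl and returns how many were removed.
func (d *memDLQ) purgeExpired(now time.Time) int {
	kept := d.entries[:0] // filter in place
	for _, e := range d.entries {
		if d.ttl >= now.Sub(e.DeadAt) {
			kept = append(kept, e)
		}
	}
	removed := len(d.entries) - len(kept)
	d.entries = kept
	return removed
}

// demo exercises both policies and returns the purge count and the oldest surviving ID.
func demo() (int, string) {
	now := time.Now()
	d := memDLQ{maxSize: 2, ttl: 24 * time.Hour}
	d.put(entry{ID: "stale", DeadAt: now.Add(-48 * time.Hour)})
	d.put(entry{ID: "recent", DeadAt: now.Add(-time.Hour)})
	purged := d.purgeExpired(now) // drops "stale"
	d.put(entry{ID: "x", DeadAt: now})
	d.put(entry{ID: "y", DeadAt: now}) // over the cap: "recent" is evicted
	return purged, d.entries[0].ID
}

func main() {
	purged, oldest := demo()
	fmt.Println("purged:", purged, "oldest remaining:", oldest)
}
```

&lt;p&gt;TTL purging runs on a timer, while the size cap is enforced on every insert, so a burst of failures can never grow the queue past &lt;code&gt;maxSize&lt;/code&gt; between purge runs.&lt;/p&gt;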

&lt;p&gt;This matters because the DLQ is the debugging surface for the bus. Something went wrong? &lt;code&gt;kubectl exec&lt;/code&gt; into the sidecar, open the BoltDB file, and look at the last N entries. We've caught a couple of real bugs this way that would have been invisible with "drop on failure."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;DLQ&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;PurgeExpired&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;DLQ&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;DLQ&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;DLQEntry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Deliberately no replay-from-DLQ. If something's dead, it's dead. We want human attention, not automatic retry that hides a real problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mechanism 5 — ring buffer with backpressure
&lt;/h2&gt;

&lt;p&gt;The remaining problem: what if a channel sidecar is producing faster than the runtime can consume?&lt;/p&gt;

&lt;p&gt;Naive answer: unbounded queue. Result: OOM-killed pod.&lt;/p&gt;

&lt;p&gt;Real answer: bounded ring buffer with high/low watermarks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewRingBuffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;highWatermark&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lowWatermark&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;RingBuffer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// ... defaults to high=0.8, low=0.3 ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the buffer fills past 80%, the bus emits a &lt;code&gt;slow_down&lt;/code&gt; control message upstream. The channel sidecar sees it and stops pulling from Slack. When the buffer drains below 30%, the bus emits &lt;code&gt;resume&lt;/code&gt; and the sidecar starts pulling again.&lt;/p&gt;

&lt;p&gt;Why two watermarks? Because if you use one, you thrash. Right at the threshold, every push flips state. Two watermarks with a gap give you hysteresis. Classic control-theory stuff, very little Go stuff.&lt;/p&gt;
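
&lt;p&gt;The hysteresis fits in a few lines. This is a sketch under assumptions rather than the repo's ring buffer: the &lt;code&gt;watermarks&lt;/code&gt; type and its &lt;code&gt;observe&lt;/code&gt; method are hypothetical, and only the 0.8 / 0.3 defaults and the two signal names come from the post.&lt;/p&gt;

```go
package main

import "fmt"

// signal is what the bus would emit upstream after a level change.
type signal string

const (
	none     signal = ""
	slowDown signal = "slow_down"
	resume   signal = "resume"
)

// watermarks tracks the paused/flowing state for a buffer of fixed capacity.
type watermarks struct {
	capacity  int
	high, low float64 // fractions of capacity, e.g. 0.8 and 0.3
	paused    bool
}

// observe takes the current buffer length and returns the control signal
// to emit, if any. Between the two watermarks the state never changes,
// which is what prevents thrashing.
func (w *watermarks) observe(length int) signal {
	fill := float64(length) / float64(w.capacity)
	if !w.paused {
		if fill > w.high {
			w.paused = true
			return slowDown
		}
		return none
	}
	if w.low > fill {
		w.paused = false
		return resume
	}
	return none
}

func main() {
	w := watermarks{capacity: 100, high: 0.8, low: 0.3}
	for _, n := range []int{50, 85, 79, 81, 40, 25, 40} {
		fmt.Printf("len=%d signal=%q\n", n, w.observe(n))
	}
}
```

&lt;p&gt;In that run the buffer crosses 80% twice (85, then 81 after dipping to 79) but &lt;code&gt;slow_down&lt;/code&gt; fires only once, and &lt;code&gt;resume&lt;/code&gt; fires only when the length reaches 25.&lt;/p&gt;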

&lt;p&gt;The &lt;code&gt;slow_down&lt;/code&gt; / &lt;code&gt;resume&lt;/code&gt; messages ride the same wire format as everything else:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;TypeAck&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TypeNack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TypeSlowDown&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TypeResume&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="n"&gt;TypeShutdown&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TypeRegister&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TypeHeartbeat&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Treating control traffic as just another &lt;code&gt;MessageType&lt;/code&gt; means channel sidecars don't need a separate control channel. One TCP/UDS/WS connection carries both payloads and backpressure signals. Simpler, fewer failure modes.&lt;/p&gt;
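
&lt;p&gt;To make the single-connection idea concrete, here is a hedged sketch of a JSON envelope where control frames are ordinary messages without a payload. The field names, the &lt;code&gt;"message"&lt;/code&gt; type string, and the &lt;code&gt;decode&lt;/code&gt; / &lt;code&gt;isControl&lt;/code&gt; helpers are assumptions for illustration; the real definitions live in the ipcbus package.&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MessageType values here are illustrative, not copied from the repo.
type MessageType string

const (
	TypeMessage  MessageType = "message"
	TypeSlowDown MessageType = "slow_down"
	TypeResume   MessageType = "resume"
)

// Message is a hypothetical envelope: control frames simply omit the payload.
type Message struct {
	ID      string          `json:"id"`
	Type    MessageType     `json:"type"`
	Payload json.RawMessage `json:"payload,omitempty"`
}

// isControl reports whether m is a backpressure signal rather than a payload.
func isControl(m Message) bool {
	switch m.Type {
	case TypeSlowDown, TypeResume:
		return true
	}
	return false
}

// decode parses one frame read from the shared connection.
func decode(frame []byte) (Message, error) {
	m := new(Message)
	err := json.Unmarshal(frame, m)
	return *m, err
}

func main() {
	frames := []string{
		`{"id":"1","type":"message","payload":{"text":"hi"}}`,
		`{"id":"2","type":"slow_down"}`,
	}
	for _, f := range frames {
		m, err := decode([]byte(f))
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Println(m.Type, "control:", isControl(m))
	}
}
```

&lt;p&gt;A sidecar's read loop then needs one decode path and one branch on the type, instead of a second socket with its own reconnect logic.&lt;/p&gt;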

&lt;h2&gt;
  
  
  Shutdown
&lt;/h2&gt;

&lt;p&gt;Graceful shutdown is its own hazard. On SIGTERM the &lt;code&gt;cmd/ipcbus&lt;/code&gt; binary runs a local &lt;code&gt;shutdown()&lt;/code&gt; helper that does the bare minimum:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;shutdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bridge&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SendShutdown&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;      &lt;span class="c"&gt;// tell sidecars we're going away&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// fixed grace window&lt;/span&gt;
    &lt;span class="n"&gt;wal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                 &lt;span class="c"&gt;// flush WAL to disk&lt;/span&gt;
    &lt;span class="n"&gt;bridge&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;              &lt;span class="c"&gt;// close the runtime bridge&lt;/span&gt;
    &lt;span class="n"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                    &lt;span class="c"&gt;// stop the UDS server + background tickers&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No polling, no early exit if sidecars disconnect, no DLQ close (BoltDB fsyncs each transaction when it commits, so entries already written survive a process exit without an explicit &lt;code&gt;Close()&lt;/code&gt;). Whatever is still &lt;code&gt;pending&lt;/code&gt; in the WAL when we exit gets replayed on next startup; that's the whole point of the WAL.&lt;/p&gt;

&lt;p&gt;There's also a fancier &lt;code&gt;ShutdownOrchestrator&lt;/code&gt; in &lt;a href="https://github.com/Prismer-AI/k8s4claw/blob/main/internal/ipcbus/shutdown.go" rel="noopener noreferrer"&gt;&lt;code&gt;internal/ipcbus/shutdown.go&lt;/code&gt;&lt;/a&gt; that takes a &lt;code&gt;drainTimeout&lt;/code&gt; parameter and polls &lt;code&gt;router.ConnectedCount()&lt;/code&gt; every 100 ms to exit early, but the current binary doesn't wire it up. Good first PR: swap the local helper out for the orchestrator so the sleep becomes a real wait-for-drain.&lt;/p&gt;
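
&lt;p&gt;The wait-for-drain idea looks roughly like this. It is a sketch, not the &lt;code&gt;ShutdownOrchestrator&lt;/code&gt; itself: &lt;code&gt;waitForDrain&lt;/code&gt; is a hypothetical helper, and the &lt;code&gt;connected&lt;/code&gt; callback stands in for &lt;code&gt;router.ConnectedCount()&lt;/code&gt;.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"time"
)

// waitForDrain polls connected() every interval until it reports zero or
// drainTimeout elapses. It returns true when every sidecar disconnected
// before the timeout.
func waitForDrain(connected func() int, drainTimeout, interval time.Duration) bool {
	deadline := time.Now().Add(drainTimeout)
	for {
		if connected() == 0 {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	remaining := 3
	// Fake counter for the demo: one sidecar disconnects per poll.
	connected := func() int {
		if remaining > 0 {
			remaining--
		}
		return remaining
	}
	drained := waitForDrain(connected, 5*time.Second, 100*time.Millisecond)
	fmt.Println("drained early:", drained)
}
```

&lt;p&gt;With this shape the 5-second sleep becomes an upper bound: a pod whose sidecars hang up promptly exits in a few hundred milliseconds instead of always paying the full grace window.&lt;/p&gt;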

&lt;h2&gt;
  
  
  What we didn't do (on purpose)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-pod clustering.&lt;/strong&gt; The bus is deliberately in-pod. If you want cross-pod messaging, use a real broker (NATS, Redis streams). Scoping this to one pod kept us sane.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ordering guarantees across channels.&lt;/strong&gt; Within one channel, messages are ordered. Across channels, no promise. Most agent workloads don't care.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exactly-once.&lt;/strong&gt; At-least-once with idempotent consumers is simpler and good enough. The runtime is expected to deduplicate on &lt;code&gt;Message.ID&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protobuf on the wire.&lt;/strong&gt; JSON is ~2× larger but 10× easier to debug. Given our throughput (tens of messages per second per pod, not millions), JSON is the right call.&lt;/li&gt;
&lt;/ul&gt;
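
&lt;p&gt;For the at-least-once point, deduplicating on &lt;code&gt;Message.ID&lt;/code&gt; on the runtime side can be as small as the sketch below. The &lt;code&gt;deduper&lt;/code&gt; type is hypothetical, and it never forgets an ID; a real runtime would bound the set, for example with a TTL.&lt;/p&gt;

```go
package main

import "fmt"

// deduper remembers the IDs it has already processed.
type deduper struct {
	seen map[string]struct{}
}

// newDeduper returns a deduper with an empty ID set.
func newDeduper() deduper {
	return deduper{seen: map[string]struct{}{}}
}

// firstTime records id and reports whether this is its first delivery.
// The value receiver is fine here because the map is shared by reference.
func (d deduper) firstTime(id string) bool {
	if _, ok := d.seen[id]; ok {
		return false
	}
	d.seen[id] = struct{}{}
	return true
}

func main() {
	d := newDeduper()
	for _, id := range []string{"m1", "m2", "m1"} { // "m1" is redelivered after a replay
		if d.firstTime(id) {
			fmt.Println("handle", id)
		} else {
			fmt.Println("skip duplicate", id)
		}
	}
}
```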

&lt;h2&gt;
  
  
  Testing
&lt;/h2&gt;

&lt;p&gt;We aimed for statement coverage of roughly 80% or more on the ipcbus package. The non-obvious piece: most of the reliability features are hard to unit-test with mocks because they're about failure modes. So we have a lot of tests that spin up real local listeners (&lt;code&gt;net.Listen("tcp", "127.0.0.1:0")&lt;/code&gt;, &lt;code&gt;net.Listen("unix", t.TempDir()+"/sock")&lt;/code&gt;, &lt;code&gt;httptest.NewServer(...)&lt;/code&gt;) and exercise the bridges end-to-end.&lt;/p&gt;

&lt;p&gt;For example, the SSE bridge test spins up an &lt;code&gt;httptest&lt;/code&gt; server that handles both &lt;code&gt;GET /events&lt;/code&gt; (as an SSE stream) and &lt;code&gt;POST /messages&lt;/code&gt;, and checks that connecting, sending, and receiving all work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestSSEBridge_SendReceive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;srv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ready&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sseEchoServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;srv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;bridge&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;NewSSEBridge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;srv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;// ... connect, wait for SSE stream to establish, send, receive ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;About 70 tests total, &lt;code&gt;-race&lt;/code&gt; clean. Good enough for a sidecar.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this bought us
&lt;/h2&gt;

&lt;p&gt;A uniform contract for channel sidecars. You write one Slack sidecar, it works with every runtime. You write one Discord sidecar, same thing. Runtime authors pick a protocol that fits their stack; they don't think about durability, retries, or backpressure — the bus handles it.&lt;/p&gt;

&lt;p&gt;The runtime adapter for a new protocol is ~50 lines. The channel sidecar SDK (&lt;a href="https://github.com/Prismer-AI/k8s4claw/tree/main/sdk/channel" rel="noopener noreferrer"&gt;&lt;code&gt;sdk/channel/&lt;/code&gt;&lt;/a&gt;) hides the framing entirely; you call &lt;code&gt;client.Send(ctx, json.RawMessage(...))&lt;/code&gt; and move on.&lt;/p&gt;

&lt;p&gt;The whole ipcbus package is ~2k lines of Go. If you want to read one file to get the flavor, &lt;a href="https://github.com/Prismer-AI/k8s4claw/blob/main/internal/ipcbus/router.go" rel="noopener noreferrer"&gt;&lt;code&gt;router.go&lt;/code&gt;&lt;/a&gt; is where all five mechanisms meet.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to look at next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;a href="https://github.com/Prismer-AI/k8s4claw" rel="noopener noreferrer"&gt;k8s4claw repo&lt;/a&gt; if you want to use it&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Prismer-AI/k8s4claw/tree/main/internal/ipcbus" rel="noopener noreferrer"&gt;&lt;code&gt;internal/ipcbus/&lt;/code&gt;&lt;/a&gt; if you want to read the code&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/willamhou/k8s4claw-a-kubernetes-operator-for-managing-ai-agent-runtimes-3anm"&gt;The intro post&lt;/a&gt; if you want context on how this fits into the operator&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Open source, Apache-2.0. Questions and PRs welcome. If you've built something similar and went in a different direction, I'd love to hear why in the comments.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>go</category>
      <category>opensource</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>When Rust's Exhaustive Match Helps (And When It Doesn't): Notes from a Bare-Metal Hypervisor</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Wed, 22 Apr 2026 03:41:04 +0000</pubDate>
      <link>https://forem.com/willamhou/when-rusts-exhaustive-match-helps-and-when-it-doesnt-notes-from-a-bare-metal-hypervisor-4olh</link>
      <guid>https://forem.com/willamhou/when-rusts-exhaustive-match-helps-and-when-it-doesnt-notes-from-a-bare-metal-hypervisor-4olh</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Disclaimer&lt;/strong&gt;: This is about an experimental hypervisor project that only runs on QEMU virt — no real-hardware validation yet. The lessons apply to "Rust's tooling edges in systems programming," not production guidance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;10 weeks into writing an ARM64 bare-metal hypervisor, I assumed Rust's exhaustive &lt;code&gt;match&lt;/code&gt; would be the safety net when I extended my state machine. Two observations, from one week of commits: &lt;strong&gt;exhaustive match didn't help my state machine at all, but caught 6 errors the one time I extended my &lt;code&gt;Device&lt;/code&gt; enum.&lt;/strong&gt; This post is about why — and why the distinction is about cardinality, not typestate vs tag enums.&lt;/p&gt;




&lt;p&gt;I'm writing an ARM64 bare-metal hypervisor. Part of it is a thing called a &lt;strong&gt;Secure Partition (SP)&lt;/strong&gt; — a lightweight VM managed by the SPMC. Each SP has a lifecycle: Reset → Idle → Running → Blocked → Preempted. 5 states, 7 legal transitions.&lt;/p&gt;

&lt;p&gt;Two weeks ago I added a new transition: &lt;code&gt;Blocked → Preempted&lt;/code&gt;, for chain preemption between SPs. By the textbook, this is exactly the scenario where Rust's &lt;code&gt;enum + match&lt;/code&gt; should shine: add a state/transition, the compiler finds every site that needs updating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The compiler said nothing.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This post is about why I didn't use the "enum-with-fields" pattern you see in tutorials, why &lt;code&gt;match&lt;/code&gt; exhaustiveness didn't help on this state machine, and where it actually &lt;em&gt;did&lt;/em&gt; help.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Code
&lt;/h2&gt;

&lt;p&gt;No toy examples. Here's the actual &lt;code&gt;SpState&lt;/code&gt; from the repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/sp_context.rs&lt;/span&gt;
&lt;span class="nd"&gt;#[derive(Debug,&lt;/span&gt; &lt;span class="nd"&gt;Clone,&lt;/span&gt; &lt;span class="nd"&gt;Copy,&lt;/span&gt; &lt;span class="nd"&gt;PartialEq,&lt;/span&gt; &lt;span class="nd"&gt;Eq)]&lt;/span&gt;
&lt;span class="nd"&gt;#[repr(u8)]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;SpState&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Reset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Idle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Running&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Blocked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Preempted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Classic &lt;strong&gt;tag-only enum&lt;/strong&gt; — &lt;code&gt;#[repr(u8)]&lt;/code&gt;, every variant is one byte, no payload. Why not the textbook &lt;code&gt;Running { entry_pc: u64 }&lt;/code&gt; / &lt;code&gt;Preempted { saved_ctx: VcpuContext }&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;Because the state lives in an &lt;strong&gt;&lt;code&gt;AtomicU8&lt;/code&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The SPMC runs on multiple physical CPUs. Different CPUs inside TF-A's SPMD (Secure Partition Manager Dispatcher) can route requests to the same SP at once. Two CPUs racing to do &lt;code&gt;Idle → Running&lt;/code&gt; — one &lt;em&gt;must&lt;/em&gt; lose, or both will &lt;code&gt;ERET&lt;/code&gt; into the same SP and clobber register context.&lt;/p&gt;

&lt;p&gt;CAS drives the race:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;try_transition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;SpState&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.state&lt;/span&gt;&lt;span class="nf"&gt;.compare_exchange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;new_state&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nn"&gt;Ordering&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;AcqRel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// success: AcqRel publishes our context-save&lt;/span&gt;
        &lt;span class="nn"&gt;Ordering&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Acquire&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// failure: Acquire syncs the observed loser&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(()),&lt;/span&gt;
        &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;try_from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"corrupt SP state value"&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The constraint isn't memory layout: &lt;code&gt;#[repr(u8, C)]&lt;/code&gt; on a field-carrying enum does give a stable layout. The real constraint is &lt;strong&gt;size&lt;/strong&gt;: &lt;code&gt;AtomicU8&lt;/code&gt; wraps one byte, while an enum with a &lt;code&gt;u64&lt;/code&gt; payload needs the 8 payload bytes plus its tag, which comes to 16 bytes with alignment. Atomic &lt;code&gt;u64&lt;/code&gt; CAS is fine on aarch64, but a value that wide no longer fits in one, so every state change would either go through a fat struct CAS or fall back to a lock. I wanted single-byte CAS in the fast path, so the payload lives elsewhere (in a separate &lt;code&gt;VcpuContext&lt;/code&gt; guarded by the state transition itself).&lt;/p&gt;

&lt;p&gt;Side note on &lt;code&gt;expect("corrupt SP state value")&lt;/code&gt;: it really does panic. In this project the panic handler halts the offending CPU and dumps state via UART — because if the &lt;code&gt;AtomicU8&lt;/code&gt; ever holds a value outside &lt;code&gt;0..=4&lt;/code&gt;, memory corruption has already happened and limping along is worse than stopping. That's a conscious choice for this binary, not a general bare-metal guideline.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Exhaustive Match Didn't Help
&lt;/h2&gt;

&lt;p&gt;The legal-transition check lives in one function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/sp_context.rs&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;transition_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;'static&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.state&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;valid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Idle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Idle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Running&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Running&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Idle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Running&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Blocked&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Blocked&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Running&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Blocked&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Preempted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// ← the newly added line&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Running&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Preempted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Preempted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Running&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the final &lt;code&gt;_ =&amp;gt; false&lt;/code&gt;. This is &lt;strong&gt;not&lt;/strong&gt; an explicitly exhaustive match: the wildcard swallows every unlisted combination as "illegal," so the compiler's exhaustiveness check is trivially satisfied.&lt;/p&gt;

&lt;p&gt;The commit that added &lt;code&gt;Blocked → Preempted&lt;/code&gt; was literally 1 line. The compiler reported nothing, because to the compiler, all 25 &lt;code&gt;(from, to)&lt;/code&gt; combinations are covered both before and after the change (7 explicit arms before, 8 after, plus the &lt;code&gt;_&lt;/code&gt; fallback).&lt;/p&gt;

&lt;p&gt;I could have replaced &lt;code&gt;_ =&amp;gt; false&lt;/code&gt; with all 18 illegal combinations enumerated. I started to — "exhaustive is more Rust-y". Then I gave up halfway:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// This way...&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Running&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;SpState&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Blocked&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="c1"&gt;// ... 15 more lines of this&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No new information, and every future state addition means maintaining an N² table. &lt;code&gt;_ =&amp;gt; false&lt;/code&gt; &lt;em&gt;is&lt;/em&gt; the documentation here: &lt;strong&gt;what's listed is legal; everything else isn't.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;: For simple C-style enum + state-transition pairs, &lt;code&gt;match&lt;/code&gt; exhaustiveness doesn't save you. Bugs at this layer can only be caught by unit tests (my &lt;code&gt;test_sp_context.rs&lt;/code&gt; has 58 assertions covering every legal transition plus key illegal ones).&lt;/p&gt;
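
&lt;p&gt;As a rough illustration of what such a test can look like, the sketch below walks all 25 &lt;code&gt;(from, to)&lt;/code&gt; pairs and compares the wildcard &lt;code&gt;match&lt;/code&gt; against a separately written spec table. The &lt;code&gt;Idle&lt;/code&gt; variant, the contents of &lt;code&gt;SPEC&lt;/code&gt;, and the helper names are assumptions made for the example, not the project's real transition table or the contents of &lt;code&gt;test_sp_context.rs&lt;/code&gt;.&lt;/p&gt;

```rust
// Hypothetical five-state machine. The variant names and both tables
// below are illustrative, not copied from the hypervisor source.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SpState {
    Reset,
    Idle,
    Running,
    Blocked,
    Preempted,
}

const ALL: [SpState; 5] = [
    SpState::Reset,
    SpState::Idle,
    SpState::Running,
    SpState::Blocked,
    SpState::Preempted,
];

// Code under test: explicit legal arms plus the wildcard.
fn is_legal(from: SpState, to: SpState) -> bool {
    match (from, to) {
        (SpState::Reset, SpState::Idle) => true,
        (SpState::Idle, SpState::Running) => true,
        (SpState::Running, SpState::Idle) => true,
        (SpState::Running, SpState::Blocked) => true,
        (SpState::Blocked, SpState::Running) => true,
        (SpState::Blocked, SpState::Preempted) => true,
        (SpState::Running, SpState::Preempted) => true,
        (SpState::Preempted, SpState::Running) => true,
        _ => false,
    }
}

// Independent spec table, written from the design notes rather than
// from the match above (assumed contents).
const SPEC: [(SpState, SpState); 8] = [
    (SpState::Reset, SpState::Idle),
    (SpState::Idle, SpState::Running),
    (SpState::Running, SpState::Idle),
    (SpState::Running, SpState::Blocked),
    (SpState::Blocked, SpState::Running),
    (SpState::Blocked, SpState::Preempted),
    (SpState::Running, SpState::Preempted),
    (SpState::Preempted, SpState::Running),
];

// True if the spec table lists this pair.
fn in_spec(from: SpState, to: SpState) -> bool {
    let mut found = false;
    for pair in SPEC {
        if pair == (from, to) {
            found = true;
        }
    }
    found
}

// Number of pairs in the full cartesian product where code and spec disagree.
fn count_mismatches() -> usize {
    let mut n = 0;
    for from in ALL {
        for to in ALL {
            if is_legal(from, to) != in_spec(from, to) {
                n += 1;
            }
        }
    }
    n
}
```

&lt;p&gt;Asserting &lt;code&gt;count_mismatches() == 0&lt;/code&gt; from a &lt;code&gt;#[test]&lt;/code&gt; function would flag a forgotten arm such as &lt;code&gt;Blocked → Preempted&lt;/code&gt;, but only because the spec table lists it. The compiler still has no opinion.&lt;/p&gt;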




&lt;h2&gt;
  
  
  Where It Actually Saved Me
&lt;/h2&gt;

&lt;p&gt;The place where &lt;code&gt;match&lt;/code&gt; exhaustiveness actually saved me was device dispatch.&lt;/p&gt;

&lt;p&gt;My hypervisor uses a &lt;code&gt;Device&lt;/code&gt; enum to enumerate all virtual devices. Every time the guest touches MMIO, a &lt;code&gt;match&lt;/code&gt; dispatches to the right implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/devices/mod.rs&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;Device&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;Uart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;pl011&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;VirtualUart&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;Gicd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;gic&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;VirtualGicd&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;Gicr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;gic&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;VirtualGicr&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;VirtioBlk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;virtio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;mmio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;VirtioMmioTransport&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nn"&gt;virtio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;blk&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;VirtioBlk&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;VirtioNet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;virtio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;mmio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;VirtioMmioTransport&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nn"&gt;virtio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;net&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;VirtioNet&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;Pl031&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;pl031&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;VirtualPl031&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;strong&gt;is&lt;/strong&gt; a fields-carrying enum — each variant holds the state struct for its device. No &lt;code&gt;_&lt;/code&gt; fallback on matches against it, because every variant has its own handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;MmioDevice&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;Device&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Option&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;u64&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nn"&gt;Device&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Uart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="nf"&gt;.read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nn"&gt;Device&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Gicd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="nf"&gt;.read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nn"&gt;Device&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Gicr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="nf"&gt;.read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nn"&gt;Device&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;VirtioBlk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="nf"&gt;.read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nn"&gt;Device&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;VirtioNet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="nf"&gt;.read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nn"&gt;Device&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Pl031&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="nf"&gt;.read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// write, contains, is_ready, ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When I added &lt;code&gt;Pl031&lt;/code&gt; (PL031 RTC) for Android boot, I only touched the enum definition. The compiler immediately fired &lt;strong&gt;6 errors&lt;/strong&gt; — every site that &lt;code&gt;match&lt;/code&gt;es against &lt;code&gt;Device&lt;/code&gt; was missing the &lt;code&gt;Pl031&lt;/code&gt; arm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;error[E0004]: non-exhaustive patterns: `&amp;amp;Device::Pl031(_)` not covered
  --&amp;gt; src/devices/mod.rs:51:15
error[E0004]: non-exhaustive patterns: `&amp;amp;mut Device::Pl031(_)` not covered
  --&amp;gt; src/devices/mod.rs:62:15
error[E0004]: non-exhaustive patterns: `&amp;amp;Device::Pl031(_)` not covered
  --&amp;gt; src/devices/mod.rs:73:15
// ... 6 total
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two of those were helper methods I'd written when adding &lt;code&gt;VirtioNet&lt;/code&gt; and &lt;strong&gt;completely forgotten about&lt;/strong&gt;. Had I written this as a C &lt;code&gt;switch&lt;/code&gt; without &lt;code&gt;-Wswitch-enum&lt;/code&gt; (a GCC/Clang warning that &lt;code&gt;-Wall&lt;/code&gt; does not turn on), those two sites would have silently fallen into &lt;code&gt;default&lt;/code&gt; and returned "unknown device." The first guest MMIO access to the RTC would then fail to find a device, and the guest would hang mid-boot with an error pointing somewhere completely unrelated.&lt;/p&gt;

&lt;p&gt;C with &lt;code&gt;-Wswitch-enum&lt;/code&gt; + &lt;code&gt;-Werror&lt;/code&gt; gives you the same check — the relevant difference is that Rust makes it a precondition for compiling instead of a build-system setting you can drop. Worth more in a solo project, less in a shop with a strict style guide.&lt;/p&gt;

&lt;p&gt;Either way, the compiler caught this bug instead of the guest doing so at boot time.&lt;/p&gt;




&lt;h2&gt;
  
  
  When Exhaustive Match Actually Pays Off
&lt;/h2&gt;

&lt;p&gt;Looking back at the state-machine change and the &lt;code&gt;Device&lt;/code&gt; change, here's the rule I distilled:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhaustive match saves you&lt;/strong&gt;: &lt;strong&gt;fields-carrying enum + every variant has independent handler logic&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Device::{Uart, Gicd, ..., Pl031}&lt;/code&gt; — each device's &lt;code&gt;read/write&lt;/code&gt; is totally different&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MmioAccess::{Read { reg, size }, Write { reg, size, val }}&lt;/code&gt; — read vs write semantics differ&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ExitReason::{HvcCall, SmcCall, DataAbort, WfiWfe, ...}&lt;/code&gt; — each exception class has its own handler&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What these have in common: &lt;strong&gt;adding a variant potentially leaves gaps across the entire codebase&lt;/strong&gt;, and each gap's correct implementation is non-trivial (not just a binary "error vs OK" output).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exhaustive match doesn't help&lt;/strong&gt;: &lt;strong&gt;simple tag enum + cartesian-product check&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;State machine &lt;code&gt;(from, to)&lt;/code&gt; transition table — N² explosion, &lt;code&gt;_ =&amp;gt; false&lt;/code&gt; is more readable&lt;/li&gt;
&lt;li&gt;Permission matrix &lt;code&gt;(user_role, action)&lt;/code&gt; — same&lt;/li&gt;
&lt;li&gt;Input sanity check &lt;code&gt;match(input) { valid_range =&amp;gt; ..., _ =&amp;gt; reject }&lt;/code&gt; — tautological&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These scenarios are "enumerate a small set of legal cases, reject everything else." &lt;code&gt;_ =&amp;gt; fallback&lt;/code&gt; loses no information — it's &lt;em&gt;more&lt;/em&gt; readable.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Few Takeaways
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. &lt;code&gt;#[repr(u8)]&lt;/code&gt; is everyday life in hypervisor/kernel/driver code. Don't apologize for the atomic trade-off.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every time a "Rust state machine" tweet appears, someone in the replies recommends typestate. Typestate is genuinely powerful when transitions happen through owning APIs (&lt;code&gt;File::open → Handle&amp;lt;Open&amp;gt;&lt;/code&gt;), but it doesn't compose with shared mutable state across CPUs: the entire point of &lt;code&gt;AtomicU8&lt;/code&gt; is that multiple cores hold a reference to one byte. Typestate requires owning &lt;code&gt;self&lt;/code&gt; by value to consume the old state; a multi-CPU SPMC can't do that on the fast path. That's not a rejection of typestate; it's just the wrong tool for this case.&lt;/p&gt;
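
&lt;p&gt;To make the trade-off concrete, here is a minimal sketch of the shared-byte approach, assuming a three-state subset and a single global &lt;code&gt;STATE&lt;/code&gt;. The variant values, the legality table, and the &lt;code&gt;try_transition&lt;/code&gt; helper are made up for illustration and are not the hypervisor's actual API.&lt;/p&gt;

```rust
use core::sync::atomic::{AtomicU8, Ordering};

// Hypothetical state tags; the discriminant values are assumptions.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum SpState {
    Idle = 0,
    Running = 1,
    Preempted = 2,
}

// One shared byte that every core can reach. Nobody owns it by value,
// which is exactly what typestate would need.
static STATE: AtomicU8 = AtomicU8::new(SpState::Idle as u8);

// Illustrative legality table in the wildcard style: listed pairs are
// legal, everything else is not.
fn is_legal(from: SpState, to: SpState) -> bool {
    matches!(
        (from, to),
        (SpState::Idle, SpState::Running)
            | (SpState::Running, SpState::Idle)
            | (SpState::Running, SpState::Preempted)
            | (SpState::Preempted, SpState::Running)
    )
}

// Attempt the transition `from -> to`. Returns false if the pair is
// illegal, or if the stored byte is no longer `from` (for example because
// another core changed it first), in which case the CAS fails.
fn try_transition(from: SpState, to: SpState) -> bool {
    if !is_legal(from, to) {
        return false;
    }
    STATE
        .compare_exchange(from as u8, to as u8, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}
```

&lt;p&gt;The legality check runs at runtime on plain tags, and the compare-and-swap is what resolves races between cores. Nothing here is enforced by the type system.&lt;/p&gt;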

&lt;p&gt;&lt;strong&gt;2. &lt;code&gt;_ =&amp;gt; fallback&lt;/code&gt; isn't a sin, but ask yourself every time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"If I add a new variant in the future, should this site force me to update it?"&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Yes → drop the &lt;code&gt;_&lt;/code&gt;, enumerate every variant&lt;/li&gt;
&lt;li&gt;No (illegal state-machine pair, MMIO unknown-offset) → &lt;code&gt;_ =&amp;gt; default&lt;/code&gt; is documentation&lt;/li&gt;
&lt;/ul&gt;
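
&lt;p&gt;The two answers can be sketched side by side. In this hedged example the &lt;code&gt;ExitReason&lt;/code&gt; payloads, the register offsets, and the returned values are all invented for illustration; only the shape of the two &lt;code&gt;match&lt;/code&gt; expressions matters.&lt;/p&gt;

```rust
// Hypothetical exit reasons; names and payloads are illustrative only.
#[derive(Clone, Copy, Debug)]
enum ExitReason {
    HvcCall { func_id: u32 },
    DataAbort { offset: u64 },
    WfiWfe,
}

// "Yes" site: no wildcard. Adding a variant to `ExitReason` makes this
// match fail to compile (E0004) until a handler is written for it.
fn handle_exit(reason: ExitReason) -> u64 {
    match reason {
        ExitReason::HvcCall { func_id } => u64::from(func_id),
        ExitReason::DataAbort { offset } => read_reg(offset),
        ExitReason::WfiWfe => 0,
    }
}

// "No" site: unknown offsets are expected input, so the wildcard is the
// specification ("anything not listed reads as zero").
fn read_reg(offset: u64) -> u64 {
    match offset {
        0x000 => 0x0044_1011, // made-up ID register value
        0x018 => 0x90,        // made-up flag register value
        _ => 0,
    }
}
```

&lt;p&gt;Adding a fourth variant to &lt;code&gt;ExitReason&lt;/code&gt; breaks &lt;code&gt;handle_exit&lt;/code&gt; at compile time, while &lt;code&gt;read_reg&lt;/code&gt; keeps accepting any offset by design.&lt;/p&gt;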

&lt;p&gt;&lt;strong&gt;3. State-machine correctness is never a gift from Rust. It's a gift from tests + documentation + code review.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;My &lt;code&gt;test_sp_context.rs&lt;/code&gt; has dedicated tests for every legal transition, a bunch of illegal ones, and CAS races. Rust didn't generate those; I wrote them. Rust saved me from some defensive code (there is no "sixth value" of &lt;code&gt;SpState&lt;/code&gt;, because &lt;code&gt;try_from_u8&lt;/code&gt; rejects it), but on whether the legal-transition table is &lt;em&gt;correct&lt;/em&gt;, Rust has no opinion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. What really saves you is "fields-carrying enum + each variant has its own handler."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's Rust's signature strength. Find the places in your codebase that fit this pattern and get them right — it pays more than agonizing over whether the state machine should be typestate-ified.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;My hypervisor isn't a "zero-unwrap" project. The repo has about 6 &lt;code&gt;unwrap()&lt;/code&gt; calls (concentrated in test fixtures and boot-time paths that can't reasonably panic) and 45 &lt;code&gt;_ =&amp;gt; default&lt;/code&gt; fallback arms (mostly in MMIO register decode for unknown offsets).&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;unwrap()&lt;/code&gt; and &lt;code&gt;_ =&amp;gt;&lt;/code&gt; was a decision at the time, not laziness. Engineering beats slogans.&lt;/p&gt;

&lt;p&gt;Rust gives you a good weapon. It doesn't think for you. Which state transitions are legal is in &lt;em&gt;your&lt;/em&gt; head, not the compiler's.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Code&lt;/strong&gt;: &lt;a href="https://github.com/willamhou/hypervisor" rel="noopener noreferrer"&gt;github.com/willamhou/hypervisor&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Blog&lt;/strong&gt;: &lt;a href="https://willamhou.github.io/hypervisor/" rel="noopener noreferrer"&gt;willamhou.github.io/hypervisor&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is part 5 of the ARM64 Hypervisor development series. The Chinese version is the canonical source — see &lt;a href="https://github.com/willamhou/hypervisor/blob/main/docs/zhihu/part5-enum-state-machine.md" rel="noopener noreferrer"&gt;part5-enum-state-machine.md&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rust</category>
      <category>embedded</category>
      <category>arm</category>
      <category>systems</category>
    </item>
    <item>
      <title>k8s4claw: A Kubernetes Operator for Managing AI Agent Runtimes</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Tue, 21 Apr 2026 05:08:42 +0000</pubDate>
      <link>https://forem.com/willamhou/k8s4claw-a-kubernetes-operator-for-managing-ai-agent-runtimes-3anm</link>
      <guid>https://forem.com/willamhou/k8s4claw-a-kubernetes-operator-for-managing-ai-agent-runtimes-3anm</guid>
      <description>&lt;p&gt;Every AI agent framework has its own deployment story. Claude-based assistants run one way, OpenAI agents another, security-focused runtimes yet another. If you run more than one on Kubernetes, you end up writing the same boilerplate over and over: secret management, persistent storage, graceful updates, inter-service messaging, observability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;k8s4claw&lt;/strong&gt; is an open-source Kubernetes operator that wraps all of this behind a single CRD. You describe &lt;em&gt;what&lt;/em&gt; the agent is, it handles &lt;em&gt;how&lt;/em&gt; it runs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claw.prismer.ai/v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Claw&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;research-agent&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;runtime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4"&lt;/span&gt;
  &lt;span class="na"&gt;credentials&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;secretRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;llm-api-keys&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The operator reconciles this into a StatefulSet, headless Service, ConfigMap, ServiceAccount, PodDisruptionBudget, and optionally NetworkPolicy and Ingress. When you add a channel (Slack, Discord, Webhook), it also wires up sidecars and a local message bus.&lt;/p&gt;

&lt;p&gt;This post walks through the architecture, shows how to get it running locally, and explains the design decisions behind the IPC bus, the auto-update controller, and the runtime adapter system.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;We had several agent runtimes in flight at once — different languages, different process models, different resource profiles:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Runtime&lt;/th&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenClaw&lt;/td&gt;
&lt;td&gt;TypeScript/Node.js&lt;/td&gt;
&lt;td&gt;Full-featured AI assistant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NanoClaw&lt;/td&gt;
&lt;td&gt;TypeScript/Node.js&lt;/td&gt;
&lt;td&gt;Lightweight personal assistant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ZeroClaw&lt;/td&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;High-performance agent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PicoClaw&lt;/td&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;Ultra-minimal serverless&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IronClaw&lt;/td&gt;
&lt;td&gt;Rust + WASM&lt;/td&gt;
&lt;td&gt;Security-focused agent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HermesClaw&lt;/td&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;Conversational with tool use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K8sOps&lt;/td&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;Cluster self-healing (claw4k8s)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each had its own Helm chart, sidecar layout, and update strategy. Adding a Slack channel meant editing several files. Rotating credentials meant touching every deployment. Rolling back a bad update was a manual process.&lt;/p&gt;

&lt;p&gt;We wanted one control plane for all of them.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TB
    subgraph "Kubernetes Cluster"
        OP[k8s4claw Operator]

        subgraph "Claw Pod (with channels)"
            INIT["claw-init"]
            RT["Runtime Container"]
            IPC["IPC Bus Sidecar"]
            CH["Channel Sidecar"]
        end

        STS[StatefulSet]
        SVC[Service]
        CM[ConfigMap]
        PVC[(PVCs)]

        OP --&amp;gt;|manages| STS
        OP --&amp;gt;|manages| SVC
        OP --&amp;gt;|manages| CM
        STS -.-&amp;gt;|runs| RT
        STS -.-&amp;gt;|runs| IPC
        STS -.-&amp;gt;|runs| CH

        CH &amp;lt;--&amp;gt;|UDS| IPC
        IPC &amp;lt;--&amp;gt;|Bridge| RT
    end

    EXT["Slack / Discord / Webhook"]
    CH &amp;lt;--&amp;gt;|API| EXT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The operator watches &lt;code&gt;Claw&lt;/code&gt; custom resources and reconciles a full stack of Kubernetes objects. A minimal agent (no channels, no persistence) gets just the runtime container plus &lt;code&gt;claw-init&lt;/code&gt;. If you declare any channels in &lt;code&gt;spec.channels&lt;/code&gt;, the operator adds the bus and channel sidecars, so the pod contains:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;claw-init&lt;/strong&gt; — an init container that merges default runtime config with any user overrides before the runtime starts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime container&lt;/strong&gt; — the actual AI agent binary.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IPC Bus sidecar&lt;/strong&gt; (only when channels are present) — a WAL-backed message router that sits between the runtime and the channel sidecars.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Channel sidecar(s)&lt;/strong&gt; — one per referenced &lt;code&gt;ClawChannel&lt;/code&gt; (Slack, Discord, Webhook today).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There is a second CRD, &lt;code&gt;ClawChannel&lt;/code&gt;, that describes how to connect to an external system. Channels are defined once and referenced by many Claws.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes 1.28+ (or &lt;a href="https://kind.sigs.k8s.io/" rel="noopener noreferrer"&gt;kind&lt;/a&gt; for local development)&lt;/li&gt;
&lt;li&gt;Go 1.25+&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;controller-gen&lt;/code&gt; (&lt;code&gt;go install sigs.k8s.io/controller-tools/cmd/controller-gen@latest&lt;/code&gt;) — needed by &lt;code&gt;make install&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Install and run
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Prismer-AI/k8s4claw.git
&lt;span class="nb"&gt;cd &lt;/span&gt;k8s4claw

&lt;span class="c"&gt;# Install CRDs into the active cluster&lt;/span&gt;
make &lt;span class="nb"&gt;install&lt;/span&gt;

&lt;span class="c"&gt;# Run the operator locally against your current kubeconfig.&lt;/span&gt;
&lt;span class="c"&gt;# --disable-webhooks lets you skip cert-manager setup during local dev.&lt;/span&gt;
&lt;span class="c"&gt;# In-cluster deployments should leave webhooks enabled.&lt;/span&gt;
go run ./cmd/operator/ &lt;span class="nt"&gt;--disable-webhooks&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Create your first agent
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create secret generic llm-api-keys &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--from-literal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk-ant-xxx

&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt; | kubectl apply -f -
apiVersion: claw.prismer.ai/v1alpha1
kind: Claw
metadata:
  name: my-agent
spec:
  runtime: openclaw
  config:
    model: "claude-sonnet-4"
  credentials:
    secretRef:
      name: llm-api-keys
  persistence:
    session:
      enabled: true
      size: 2Gi
      mountPath: /data/session
    workspace:
      enabled: true
      size: 10Gi
      mountPath: /workspace
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;kubectl get claw my-agent &lt;span class="nt"&gt;-w&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Connect Slack
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claw.prismer.ai/v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClawChannel&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;team-slack&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;slack&lt;/span&gt;
  &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bidirectional&lt;/span&gt;
  &lt;span class="na"&gt;credentials&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;secretRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;slack-bot-token&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;appId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A0123456789"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reference it from your &lt;code&gt;Claw&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;channels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;team-slack&lt;/span&gt;
      &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bidirectional&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the next reconcile the operator injects a Slack sidecar, spins up the IPC bus sidecar, and wires them together. The runtime container does not need to know anything about Slack — it just talks to the bus.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deep Dive: The IPC Bus
&lt;/h2&gt;

&lt;p&gt;The IPC bus is the most interesting piece of k8s4claw. It is a Kubernetes &lt;a href="https://kubernetes.io/blog/2023/08/25/native-sidecar-containers/" rel="noopener noreferrer"&gt;native sidecar&lt;/a&gt; (an init container with &lt;code&gt;restartPolicy: Always&lt;/code&gt;) that routes JSON messages between channel sidecars and the agent runtime.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Channel Sidecar ──UDS──► IPC Bus ──Bridge──► Runtime Container
                         │ WAL  │
                         │ DLQ  │
                         │ Ring │
                         └──────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why not just HTTP?
&lt;/h3&gt;

&lt;p&gt;We tried. The problem is reliability. When a Slack event arrives while the runtime is overloaded, you need somewhere to buffer it. If the runtime crashes mid-response, you need to redeliver. When a channel sidecar falls behind, you need backpressure instead of dropped messages.&lt;/p&gt;

&lt;p&gt;Three mechanisms do the work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Write-Ahead Log (WAL)&lt;/strong&gt; — Every inbound message is appended to a WAL on &lt;code&gt;emptyDir&lt;/code&gt; before delivery. On restart, unacknowledged messages are replayed (an &lt;code&gt;emptyDir&lt;/code&gt; volume survives container restarts, though not deletion of the pod). Periodic compaction keeps the file bounded.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Dead Letter Queue (DLQ)&lt;/strong&gt; — Messages that exceed the retry limit land in a BoltDB-backed DLQ instead of being dropped silently. You can inspect them later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Ring buffer with backpressure&lt;/strong&gt; — A fixed-size circular buffer with configurable high/low watermarks. Crossing the high watermark sends &lt;code&gt;slow_down&lt;/code&gt; upstream; draining to the low watermark sends &lt;code&gt;resume&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bridge protocols
&lt;/h3&gt;

&lt;p&gt;Different runtimes speak different wire protocols. The bus abstracts this behind a &lt;code&gt;RuntimeBridge&lt;/code&gt; interface:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Runtime&lt;/th&gt;
&lt;th&gt;Bridge&lt;/th&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenClaw&lt;/td&gt;
&lt;td&gt;WebSocket&lt;/td&gt;
&lt;td&gt;Full-duplex JSON over WS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NanoClaw&lt;/td&gt;
&lt;td&gt;UDS&lt;/td&gt;
&lt;td&gt;Length-prefix framed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ZeroClaw&lt;/td&gt;
&lt;td&gt;SSE&lt;/td&gt;
&lt;td&gt;HTTP POST + Server-Sent Events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PicoClaw&lt;/td&gt;
&lt;td&gt;TCP&lt;/td&gt;
&lt;td&gt;Length-prefix framed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Here is the actual interface (&lt;a href="https://github.com/Prismer-AI/k8s4claw/blob/main/internal/ipcbus/bridge.go" rel="noopener noreferrer"&gt;&lt;code&gt;internal/ipcbus/bridge.go&lt;/code&gt;&lt;/a&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;RuntimeBridge&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
    &lt;span class="n"&gt;Send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
    &lt;span class="n"&gt;Receive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adding a new transport means implementing these four methods.&lt;/p&gt;
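
&lt;p&gt;Two of the four bridges use length-prefixed framing. The wire format is not spelled out in this post, so the sketch below assumes a 4-byte big-endian length header followed by the payload, plus a 1 MiB frame cap. &lt;code&gt;writeFrame&lt;/code&gt;, &lt;code&gt;readFrame&lt;/code&gt;, and &lt;code&gt;maxFrame&lt;/code&gt; are illustrative names, not functions from the repo.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "bytes"
    "encoding/binary"
    "fmt"
    "io"
)

// maxFrame is an assumed upper bound that stops a corrupt header
// from triggering a huge allocation.
const maxFrame = 1024 * 1024

// writeFrame writes a 4-byte big-endian length followed by the payload.
func writeFrame(w io.Writer, payload []byte) error {
    if len(payload) &amp;gt; maxFrame {
        return fmt.Errorf("frame too large: %d bytes", len(payload))
    }
    var hdr [4]byte
    binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
    if _, err := w.Write(hdr[:]); err != nil {
        return err
    }
    _, err := w.Write(payload)
    return err
}

// readFrame reads exactly one frame produced by writeFrame.
func readFrame(r io.Reader) ([]byte, error) {
    var hdr [4]byte
    if _, err := io.ReadFull(r, hdr[:]); err != nil {
        return nil, err
    }
    n := binary.BigEndian.Uint32(hdr[:])
    if n &amp;gt; maxFrame {
        return nil, fmt.Errorf("frame too large: %d bytes", n)
    }
    payload := make([]byte, n)
    if _, err := io.ReadFull(r, payload); err != nil {
        return nil, err
    }
    return payload, nil
}

func main() {
    var conn bytes.Buffer // stands in for a UDS or TCP connection
    if err := writeFrame(&amp;amp;conn, []byte(`{"text":"Hello"}`)); err != nil {
        panic(err)
    }
    got, err := readFrame(&amp;amp;conn)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(got))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because both helpers take &lt;code&gt;io.Writer&lt;/code&gt; and &lt;code&gt;io.Reader&lt;/code&gt;, the same framing would work unchanged over a Unix domain socket or a TCP connection.&lt;/p&gt;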




&lt;h2&gt;
  
  
  Deep Dive: Auto-Update Controller
&lt;/h2&gt;

&lt;p&gt;The auto-update controller polls OCI registries on a cron schedule, filters new tags by a semver constraint, and performs health-verified rollouts with automatic rollback.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;autoUpdate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;versionConstraint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^1.x"&lt;/span&gt;
    &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;3&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
    &lt;span class="na"&gt;healthTimeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10m"&lt;/span&gt;
    &lt;span class="na"&gt;maxRollbacks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  How it works
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Poll&lt;/strong&gt; — on each cron tick, list tags from the registry and filter by the semver constraint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Initiate&lt;/strong&gt; — annotate the &lt;code&gt;Claw&lt;/code&gt; with the target image and transition into the &lt;code&gt;HealthCheck&lt;/code&gt; phase.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Health check&lt;/strong&gt; — watch the StatefulSet readiness until all replicas are ready or the timeout fires.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Success&lt;/strong&gt; — update status, clear the annotation, schedule the next cron tick.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout&lt;/strong&gt; — roll back to the previous image.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Circuit breaker&lt;/strong&gt; — after N consecutive rollbacks (N being &lt;code&gt;maxRollbacks&lt;/code&gt; from the spec above), stop trying and emit an event plus a Prometheus metric.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The state machine lives in annotations and status conditions, so it survives operator restarts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;phase&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;claw&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Annotations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"claw.prismer.ai/update-phase"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"HealthCheck"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reconcileHealthCheck&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;claw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
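
&lt;p&gt;The body of &lt;code&gt;reconcileHealthCheck&lt;/code&gt; is not shown here, but the choice it has to make during the &lt;code&gt;HealthCheck&lt;/code&gt; phase can be sketched as a pure function. Everything in this block is hypothetical: the &lt;code&gt;updateState&lt;/code&gt; struct, the &lt;code&gt;decision&lt;/code&gt; values, and the assumption that the circuit opens on the rollback that brings the consecutive count up to &lt;code&gt;maxRollbacks&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "time"
)

// updateState is a hypothetical snapshot of what the controller would
// read back from annotations, status, and the StatefulSet.
type updateState struct {
    startedAt     time.Time
    healthTimeout time.Duration
    readyReplicas int32
    wantReplicas  int32
    rollbacks     int // consecutive rollbacks before this attempt
    maxRollbacks  int
}

type decision string

const (
    wait        decision = "wait"
    markHealthy decision = "healthy"
    rollBack    decision = "rollback"
    openCircuit decision = "rollback and open circuit"
)

// decide picks the next step for an update that is in the HealthCheck phase.
func decide(now time.Time, s updateState) decision {
    switch {
    case s.readyReplicas == s.wantReplicas:
        return markHealthy
    case now.Before(s.startedAt.Add(s.healthTimeout)):
        return wait
    case s.rollbacks+1 &amp;gt;= s.maxRollbacks:
        return openCircuit
    default:
        return rollBack
    }
}

func main() {
    start := time.Date(2026, 3, 28, 3, 0, 0, 0, time.UTC)
    s := updateState{
        startedAt:     start,
        healthTimeout: 10 * time.Minute,
        readyReplicas: 1,
        wantReplicas:  3,
        rollbacks:     2,
        maxRollbacks:  3,
    }
    fmt.Println(decide(start.Add(5*time.Minute), s))  // still inside the timeout
    fmt.Println(decide(start.Add(10*time.Minute), s)) // timed out on the third failure in a row
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping the decision free of I/O means it can be re-run from persisted state after an operator restart and will reach the same answer.&lt;/p&gt;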



&lt;h3&gt;
  
  
  Version history
&lt;/h3&gt;

&lt;p&gt;Every attempt is recorded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;autoUpdate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;currentVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.2.0"&lt;/span&gt;
    &lt;span class="na"&gt;versionHistory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.2.0"&lt;/span&gt;
        &lt;span class="na"&gt;appliedAt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-03-28T03:00:00Z"&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Healthy&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.1.5"&lt;/span&gt;
        &lt;span class="na"&gt;appliedAt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-03-21T03:00:00Z"&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;RolledBack&lt;/span&gt;
    &lt;span class="na"&gt;failedVersions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.1.5"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;circuitOpen&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Runtime Adapter Pattern
&lt;/h2&gt;

&lt;p&gt;Each runtime is a Go struct implementing &lt;code&gt;RuntimeAdapter&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;RuntimeAdapter&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Pod shape&lt;/span&gt;
    &lt;span class="n"&gt;PodTemplate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;claw&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;v1alpha1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Claw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;corev1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PodTemplateSpec&lt;/span&gt;
    &lt;span class="n"&gt;HealthProbe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;claw&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;v1alpha1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Claw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;corev1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Probe&lt;/span&gt;
    &lt;span class="n"&gt;ReadinessProbe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;claw&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;v1alpha1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Claw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;corev1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Probe&lt;/span&gt;
    &lt;span class="n"&gt;DefaultConfig&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;RuntimeConfig&lt;/span&gt;
    &lt;span class="n"&gt;GracefulShutdownSeconds&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;int32&lt;/span&gt;

    &lt;span class="c"&gt;// Spec validation&lt;/span&gt;
    &lt;span class="n"&gt;Validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;spec&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;v1alpha1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClawSpec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ErrorList&lt;/span&gt;
    &lt;span class="n"&gt;ValidateUpdate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;oldSpec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newSpec&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;v1alpha1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClawSpec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ErrorList&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A new adapter typically lives in a single file of ~100 lines. The shared &lt;code&gt;BuildPodTemplate&lt;/code&gt; helper handles init containers, volume mounts, security context, and environment variables, so the adapter only declares what is actually different:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;MyRuntimeAdapter&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;MyRuntimeAdapter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;PodTemplate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;claw&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;v1alpha1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Claw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;corev1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PodTemplateSpec&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;BuildPodTemplate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;claw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;RuntimeSpec&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;     &lt;span class="s"&gt;"my-registry/my-runtime:latest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Ports&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;     &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;corev1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ContainerPort&lt;/span&gt;&lt;span class="p"&gt;{{&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gateway"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ContainerPort&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
        &lt;span class="n"&gt;Resources&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"100m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"256Mi"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"500m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"512Mi"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="c"&gt;// ...&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c"&gt;// plus HealthProbe, ReadinessProbe, DefaultConfig, GracefulShutdownSeconds,&lt;/span&gt;
&lt;span class="c"&gt;// Validate, ValidateUpdate&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Validation is per-runtime on purpose. OpenClaw and IronClaw require credentials because they call LLM APIs. ZeroClaw and PicoClaw permit credential-less operation. HermesClaw rejects &lt;code&gt;spec.channels&lt;/code&gt; because it brings its own gateway. NanoClaw currently has no update-time persistence checks. The point is that each adapter owns its own rules.&lt;/p&gt;




&lt;h2&gt;
  
  
  Go SDK
&lt;/h2&gt;

&lt;p&gt;For programmatic access there is a Go SDK (&lt;a href="https://github.com/Prismer-AI/k8s4claw/tree/main/sdk" rel="noopener noreferrer"&gt;&lt;code&gt;sdk/&lt;/code&gt;&lt;/a&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"context"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/Prismer-AI/k8s4claw/sdk"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// uses the ambient kubeconfig by default&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;claw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClawSpec&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Runtime&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OpenClaw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RuntimeConfig&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Environment&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"MODEL"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"claude-sonnet-4"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Block until the Claw reaches phase "Running" or ctx expires.&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WaitForReady&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;claw&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is also a channel SDK for writing custom sidecars:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"context"&lt;/span&gt;
    &lt;span class="s"&gt;"encoding/json"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/Prismer-AI/k8s4claw/sdk/channel"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithChannelName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"my-channel"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c"&gt;// or set CHANNEL_NAME env&lt;/span&gt;
    &lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithSocketPath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/var/run/claw/bus.sock"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithBufferSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Send a message to the runtime.&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RawMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`{"text":"Hello"}`&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Receive returns a channel of *InboundMessage.&lt;/span&gt;
&lt;span class="n"&gt;inbox&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Receive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;inbox&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// handle msg&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Testing Strategy
&lt;/h2&gt;

&lt;p&gt;The repo has reasonable test coverage on the core packages. A recent local run looked roughly like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Coverage (approx.)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;internal/webhook&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~97%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;internal/runtime&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~94%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;internal/registry&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~86%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sdk&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~83%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;internal/controller&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~81%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sdk/channel&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~81%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;internal/ipcbus&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~80%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Numbers move PR by PR. CI publishes a coverage report as an artifact and gates on a total-coverage threshold; there is no per-package floor enforced today. Treat the table as a snapshot, not a contract.&lt;/p&gt;

&lt;p&gt;The testing pyramid:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unit tests&lt;/strong&gt; — pure functions, table-driven, &lt;code&gt;t.Parallel()&lt;/code&gt; everywhere.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fake-client tests&lt;/strong&gt; — &lt;code&gt;fake.NewClientBuilder()&lt;/code&gt; for controller logic without a real cluster.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;envtest integration tests&lt;/strong&gt; — real etcd + API server for webhook validation and reconcile loops.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The auto-update controller uses dependency injection via &lt;code&gt;Clock&lt;/code&gt; and &lt;code&gt;TagLister&lt;/code&gt; interfaces so time-dependent and registry-dependent code is fully testable with no network calls.&lt;/p&gt;
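
&lt;p&gt;The method sets of &lt;code&gt;Clock&lt;/code&gt; and &lt;code&gt;TagLister&lt;/code&gt; are not shown in this post, so the sketch below assumes the smallest plausible shapes (&lt;code&gt;Now&lt;/code&gt; and &lt;code&gt;ListTags&lt;/code&gt;). The &lt;code&gt;poller&lt;/code&gt; type and both fakes are hypothetical; the aim is only to show how injected interfaces let a test control time and registry contents.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "context"
    "fmt"
    "time"
)

// Clock and TagLister are assumed shapes, not copied from the repo.
type Clock interface {
    Now() time.Time
}

type TagLister interface {
    ListTags(ctx context.Context, repo string) ([]string, error)
}

// fakeClock returns a time that the test controls.
type fakeClock struct{ now time.Time }

func (f *fakeClock) Now() time.Time          { return f.now }
func (f *fakeClock) Advance(d time.Duration) { f.now = f.now.Add(d) }

// staticTags serves a fixed tag list without touching a registry.
type staticTags []string

func (s staticTags) ListTags(context.Context, string) ([]string, error) { return s, nil }

// poller is a hypothetical component that depends only on the two interfaces.
type poller struct {
    clock    Clock
    tags     TagLister
    nextPoll time.Time
}

// poll lists tags only once the scheduled time has been reached.
func (p *poller) poll(ctx context.Context, repo string) ([]string, error) {
    if p.clock.Now().Before(p.nextPoll) {
        return nil, nil
    }
    return p.tags.ListTags(ctx, repo)
}

func main() {
    start := time.Date(2026, 3, 28, 2, 59, 0, 0, time.UTC)
    clk := &amp;amp;fakeClock{now: start}
    p := &amp;amp;poller{
        clock:    clk,
        tags:     staticTags{"1.1.5", "1.2.0"},
        nextPoll: start.Add(time.Minute),
    }

    tags, _ := p.poll(context.Background(), "example/runtime")
    fmt.Println(len(tags)) // not due yet, so nothing is listed

    clk.Advance(time.Minute) // jump to the scheduled time without sleeping
    tags, _ = p.poll(context.Background(), "example/runtime")
    fmt.Println(tags)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;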




&lt;h2&gt;
  
  
  What's Not Done Yet
&lt;/h2&gt;

&lt;p&gt;Worth being honest about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;custom&lt;/code&gt; runtime type&lt;/strong&gt; is present in the CRD enum but no adapter is registered. If you want a runtime that is not in the built-in list today, you fork and add an adapter.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HermesClaw&lt;/strong&gt; does not yet integrate with the k8s4claw channel sidecars — it uses its own gateway.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local operator runs&lt;/strong&gt; need &lt;code&gt;--disable-webhooks&lt;/code&gt; unless you've set up cert-manager or your own TLS. In-cluster deployments via the Helm chart handle this for you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CRD surface is larger than just &lt;code&gt;Claw&lt;/code&gt;&lt;/strong&gt; — &lt;code&gt;ClawChannel&lt;/code&gt;, &lt;code&gt;ClawSelfConfig&lt;/code&gt;, and related types are part of the contract. "Single CRD" is a simplification; "small, focused set of CRDs" is closer to the truth.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;k8s4claw is open source under Apache-2.0. The current open contribution target is &lt;a href="https://github.com/Prismer-AI/k8s4claw/issues/4" rel="noopener noreferrer"&gt;Issue #4: add snapshot and PDB envtest coverage&lt;/a&gt;. If you want to propose something else, open a new issue and we'll triage it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Prismer-AI/k8s4claw" rel="noopener noreferrer"&gt;github.com/Prismer-AI/k8s4claw&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you run AI agents on Kubernetes and you're tired of maintaining the plumbing around them, give it a try. Star the repo if it helps, and open an issue if something is off — both signals are useful.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>go</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>How to Add Tamper-Evident Audit Trails to Your LangChain Agent</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Mon, 20 Apr 2026 02:16:33 +0000</pubDate>
      <link>https://forem.com/willamhou/how-to-add-tamper-evident-audit-trails-to-your-langchain-agent-3onc</link>
      <guid>https://forem.com/willamhou/how-to-add-tamper-evident-audit-trails-to-your-langchain-agent-3onc</guid>
      <description>&lt;p&gt;Your LangChain agent calls tools. It searches the web, reads files, queries databases, calls APIs. But can you &lt;em&gt;prove&lt;/em&gt; what it did?&lt;/p&gt;

&lt;p&gt;Logs capture what happened. Cryptographic receipts prove it. The difference matters when an auditor, a customer, or a regulator asks "show me exactly what the agent did and prove it wasn't altered after the fact."&lt;/p&gt;

&lt;p&gt;This tutorial adds Ed25519-signed, hash-chained audit trails to a LangChain agent in under 5 minutes. No external service, no API keys, no infrastructure. Everything verifies offline with a public key.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you'll build
&lt;/h2&gt;

&lt;p&gt;A LangChain agent where every tool call produces a signed receipt containing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What&lt;/strong&gt;: which tool was called, with what parameters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Who&lt;/strong&gt;: the agent's Ed25519 public key&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When&lt;/strong&gt;: timestamp&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proof&lt;/strong&gt;: an Ed25519 signature over the JCS-canonicalized (RFC 8785) payload&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chain&lt;/strong&gt;: SHA-256 hash linking to the previous receipt (tamper-evident ordering)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If anyone modifies a receipt after the fact, the signature breaks. If anyone deletes or reorders receipts, the hash chain breaks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;signet-auth[langchain] langchain-openai langchain-community
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 1: Create a signing identity
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;

&lt;span class="c1"&gt;# Load the Ed25519 keypair stored locally in ~/.signet/.
# If no key exists yet under this name, create one:
&lt;/span&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SigningAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-langchain-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-langchain-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;acme-corp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Public key: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;public_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No key server, no certificate authority. The private key stays on disk, the public key is what verifiers use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Add the signing callback
&lt;/h2&gt;

&lt;p&gt;Signet ships a LangChain callback handler that signs every tool call automatically. Two lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth.langchain&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SignetCallbackHandler&lt;/span&gt;

&lt;span class="n"&gt;signer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SignetCallbackHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This handler signs the full tool lifecycle: &lt;code&gt;on_tool_start&lt;/code&gt; (what was called), &lt;code&gt;on_tool_end&lt;/code&gt; (what it returned, hashed), and &lt;code&gt;on_tool_error&lt;/code&gt; (what went wrong). If signing fails, the handler logs a warning and lets the agent continue. It never crashes your chain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Wire it into your agent
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hub&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DuckDuckGoSearchRun&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create_react_agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;

&lt;span class="c1"&gt;# Standard LangChain setup
&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;DuckDuckGoSearchRun&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;
&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hub&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hwchase17/react&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Create and run agent with signing callback
&lt;/span&gt;&lt;span class="n"&gt;agent_executor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;create_react_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;callbacks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;signer&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent_executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the weather in Tokyo?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every tool call now produces a signed receipt. No code changes to the tools themselves.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Inspect the receipts
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;receipt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;receipts&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Receipt #&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Tool:       &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Params hash: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;params_hash&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Signature:   &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sig&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Timestamp:   &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Receipt #1
  Tool:       duckduckgo_search
  Params hash: sha256:a1b2c3...
  Signature:   ed25519:Mz4xNTk2NjQ0NDgw...
  Timestamp:   2026-04-19T10:30:00Z

Receipt #2
  Tool:       _tool_end
  Params hash: sha256:d4e5f6...
  Signature:   ed25519:Nk5yODk3MjE1Njg4...
  Timestamp:   2026-04-19T10:30:01Z
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the start/end pair: the first receipt captures the tool call, the second captures a hash of the output. Together they prove what was called &lt;em&gt;and&lt;/em&gt; what it returned.&lt;/p&gt;
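
&lt;p&gt;Recording a hash of the output rather than the output itself keeps the receipt small, and anyone who holds the original output can still check it against the receipt. A rough illustration using only the standard library follows. The &lt;code&gt;sha256:&lt;/code&gt; prefix mirrors the sample output above, while the helper name &lt;code&gt;output_digest&lt;/code&gt;, the sample strings, and the choice to hash the UTF-8 text are assumptions, not Signet's documented behavior.&lt;/p&gt;

```python
import hashlib


def output_digest(output):
    # Hypothetical helper: hash the UTF-8 bytes of a tool's text output.
    return "sha256:" + hashlib.sha256(output.encode("utf-8")).hexdigest()


# The digest that would be stored in the end-of-call receipt.
recorded = output_digest("Tokyo: 18 C, light rain")

# Later, a verifier recomputes the digest from the output it was shown.
print(output_digest("Tokyo: 18 C, light rain") == recorded)  # True
print(output_digest("Tokyo: 25 C, sunny") == recorded)       # False
```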

&lt;h2&gt;
  
  
  Step 5: Verify a receipt
&lt;/h2&gt;

&lt;p&gt;Anyone with the public key can verify, offline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;verify&lt;/span&gt;

&lt;span class="n"&gt;receipt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;signer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;receipts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;is_valid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;public_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Valid: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;is_valid&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# True
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tamper with any field and verification fails:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Receipt&lt;/span&gt;

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;evil_tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# tamper with a signed field
&lt;/span&gt;
&lt;span class="n"&gt;tampered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Receipt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tampered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;public_key&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# False
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 6: Verify the audit chain
&lt;/h2&gt;

&lt;p&gt;The audit log is a hash-chained JSONL file. Each entry's hash covers the previous entry, so deleting or reordering receipts breaks the chain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;audit_verify_chain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default_signet_dir&lt;/span&gt;

&lt;span class="n"&gt;signet_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;default_signet_dir&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;chain_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;audit_verify_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signet_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Chain intact: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;chain_status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valid&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Entries: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;chain_status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What this gives you
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Without Signet&lt;/th&gt;
&lt;th&gt;With Signet&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;"The agent called web_search" (log entry)&lt;/td&gt;
&lt;td&gt;Ed25519 signature proving it, verifiable by anyone with the public key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Logs can be edited after the fact&lt;/td&gt;
&lt;td&gt;Signature breaks if any field is modified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No ordering proof&lt;/td&gt;
&lt;td&gt;Hash chain breaks if receipts are deleted or reordered&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust the operator's logs&lt;/td&gt;
&lt;td&gt;Verify independently, offline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  When you need this
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Regulated industries&lt;/strong&gt;: Article 12 of the EU AI Act requires high-risk AI systems to support automatic recording of events (logs). Signed receipts can support that record-keeping with cryptographic proof, not just plain logs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise deployments&lt;/strong&gt;: When the question is "can you prove what the agent did?", signed receipts are the answer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent-to-agent&lt;/strong&gt;: When Agent B needs to verify what Agent A actually did before acting on its output.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incident response&lt;/strong&gt;: After something goes wrong, tamper-evident receipts let you reconstruct exactly what happened without trusting anyone's claim.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bilateral co-signing&lt;/strong&gt;: Have both the agent and the tool server sign each interaction independently. Neither party can fabricate receipts. See &lt;code&gt;signet proxy&lt;/code&gt; for MCP integration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy attestation&lt;/strong&gt;: Evaluate YAML policy rules and include the decision (allow/deny/require_approval) inside the signed receipt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delegation chains&lt;/strong&gt;: Prove that Agent A was authorized by Human B to perform a specific action with scoped constraints.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these are in &lt;code&gt;signet-auth&lt;/code&gt; today.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;signet-auth
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;Prismer-AI/signet&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Signet is open source (Apache 2.0). Rust core with Python and TypeScript bindings. No external service, no API keys.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>langchain</category>
      <category>python</category>
      <category>ai</category>
      <category>security</category>
    </item>
    <item>
      <title>Claude Managed Agents Has Built-in Tracing. Here's What It Can't Do.</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Tue, 14 Apr 2026 09:52:23 +0000</pubDate>
      <link>https://forem.com/willamhou/claude-managed-agents-has-built-in-tracing-heres-what-it-cant-do-35g</link>
      <guid>https://forem.com/willamhou/claude-managed-agents-has-built-in-tracing-heres-what-it-cant-do-35g</guid>
      <description>&lt;h1&gt;
  
  
  Claude Managed Agents Has Built-in Tracing. Here's What It Can't Do.
&lt;/h1&gt;

&lt;p&gt;Anthropic shipped &lt;a href="https://claude.com/blog/claude-managed-agents" rel="noopener noreferrer"&gt;Claude Managed Agents&lt;/a&gt; last week. The pitch: production-grade agents with sandboxing, scoped permissions, and session tracing — built in, no setup required.&lt;/p&gt;

&lt;p&gt;The tracing feature specifically: "Session tracing, integration analytics, and troubleshooting guidance are built directly into the Claude Console, so you can inspect every tool call, decision, and failure mode."&lt;/p&gt;

&lt;p&gt;This is genuinely useful. If you're debugging a multi-step agent workflow, having every tool call logged in a console is miles better than parsing stderr.&lt;/p&gt;

&lt;p&gt;But there's a distinction worth making, one that matters most when the stakes are highest.&lt;/p&gt;

&lt;h2&gt;
  
  
  "Anthropic Recorded It" vs. "You Can Prove It"
&lt;/h2&gt;

&lt;p&gt;Claude Managed Agents is cloud-hosted. The tracing data lives in Claude Console, on Anthropic's infrastructure.&lt;/p&gt;

&lt;p&gt;That means the audit trail is: &lt;strong&gt;Anthropic says this happened.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For most debugging use cases, that's fine. You trust Anthropic. They trust you. The logs are accurate. Nobody is lying.&lt;/p&gt;

&lt;p&gt;But consider the situations where audit trails actually get pulled:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your agent made an unauthorized transfer.&lt;/strong&gt; The question isn't "what does the console say" — it's "can you prove, to a third party, that the agent executed this action with these parameters at this time, and that this record hasn't been modified?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A compliance audit.&lt;/strong&gt; SOC 2, HIPAA, GDPR. The auditor asks for evidence of agent actions on sensitive data. "Here are logs from Anthropic's console" is not the same as "here is a cryptographically signed chain of records that I hold and you can independently verify."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An incident investigation.&lt;/strong&gt; After a breach, forensic investigators need evidence that is tamper-evident and independently verifiable. If the evidence lives on the infrastructure that may have been compromised — or that a vendor controls — its integrity cannot be assumed.&lt;/p&gt;

&lt;p&gt;The distinction isn't about trust in Anthropic. It's about the difference between &lt;strong&gt;a record&lt;/strong&gt; and &lt;strong&gt;evidence&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Cryptographic Signing Adds
&lt;/h2&gt;

&lt;p&gt;A signed audit trail works differently.&lt;/p&gt;

&lt;p&gt;Each tool call generates a receipt: the action, the parameters, the timestamp, the agent identity — all hashed and signed with the agent's private Ed25519 key. Receipts chain together: each receipt includes the hash of the previous one. Modifying any record breaks the chain. Deleting a record is detectable.&lt;/p&gt;

&lt;p&gt;The key difference: &lt;strong&gt;you hold the proof, not a vendor.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;procurement-bot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ops-team&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;receipt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;marketplace_purchase&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;item&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GPU-A100&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;quantity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;15000&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# This receipt is a cryptographic artifact.
# You hold it. Anthropic doesn't.
# Any third party can verify it without contacting anyone.
&lt;/span&gt;&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When an auditor asks "prove this agent executed this action with these parameters," you hand them the receipt and the public key. They verify it offline. No Anthropic console access required. No vendor dependency in the evidence chain.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Gaps
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Vendor-held vs. self-held evidence&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Managed Agents tracing: logs live in Claude Console. Anthropic controls the data.&lt;/p&gt;

&lt;p&gt;Signed receipts: cryptographic artifacts you hold locally. No third party in the verification chain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Log integrity vs. cryptographic integrity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Managed Agents: session logs. Accurate under normal conditions. But a log file — even a well-managed one — can be modified. There's no mechanism in a standard log that makes tampering detectable after the fact.&lt;/p&gt;

&lt;p&gt;Signed receipts: hash-chained. Tamper with any entry and the chain breaks. Detect deletions. Detect reordering. The integrity guarantee is mathematical, not administrative.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Single-party vs. bilateral proof&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Managed Agents: Anthropic logs what happened on their infrastructure.&lt;/p&gt;

&lt;p&gt;Bilateral signing (Signet v0.4+): the agent signs the request, the server independently signs the response. One tamper-evident record, two signatures, two trust domains. Rewriting the chain requires compromising both keys on separate machines.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Managed Agents Does Well
&lt;/h2&gt;

&lt;p&gt;To be clear about what this is not: this is not a criticism of Managed Agents as a product.&lt;/p&gt;

&lt;p&gt;For developers building Claude-based agents who need to go to production quickly, Managed Agents is a compelling offer. Sandboxing, authentication, session persistence, scoped permissions, multi-agent coordination — real infrastructure problems, solved. The tracing in Console is useful for development and operational debugging.&lt;/p&gt;

&lt;p&gt;The gaps above only matter in specific contexts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Regulated industries (finance, healthcare, legal) where audit evidence must be third-party verifiable&lt;/li&gt;
&lt;li&gt;Incident response and forensics where evidence integrity must be demonstrable&lt;/li&gt;
&lt;li&gt;Enterprise compliance where "trust the vendor" isn't an accepted audit answer&lt;/li&gt;
&lt;li&gt;Cross-vendor or multi-agent workflows where a single vendor doesn't control the full chain&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For consumer applications, hobby projects, or internal tools where you trust Anthropic and compliance requirements are light: Managed Agents tracing is probably sufficient.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Complementary Stack
&lt;/h2&gt;

&lt;p&gt;Managed Agents and signed audit trails aren't competitors. They operate at different layers.&lt;/p&gt;

&lt;p&gt;Managed Agents handles: infrastructure, sandboxing, session management, permission scoping, operational tracing.&lt;/p&gt;

&lt;p&gt;Signed receipts handle: cryptographic proof of what happened, independently verifiable by any third party, held by you, not a vendor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;Signet&lt;/a&gt; works with Managed Agents. Claude Managed Agents uses MCP to connect to external tools — Signet's &lt;code&gt;@signet-auth/mcp&lt;/code&gt; intercepts at the MCP transport layer and signs every tool call before it executes. The two layers stack.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Managed Agents
  └── MCP tool calls
        └── Signet SigningTransport  ← signs here
              └── your tool server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Console shows you what happened. The signed receipts prove it.&lt;/p&gt;
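
&lt;p&gt;The interception pattern in the diagram can be sketched in a few lines of Python. This is an illustration of the idea only, not the &lt;code&gt;@signet-auth/mcp&lt;/code&gt; API: the class &lt;code&gt;ReceiptingTransport&lt;/code&gt;, the receipt field names, and both &lt;code&gt;fake_&lt;/code&gt; helpers are hypothetical, and the signer is a placeholder where a real setup would produce an Ed25519 signature. The one detail taken from MCP itself is that tool invocations are JSON-RPC requests with the method &lt;code&gt;tools/call&lt;/code&gt;.&lt;/p&gt;

```python
import hashlib
import json


def canonical_bytes(obj):
    # Sorted keys and compact separators: a rough stand-in for JCS (RFC 8785).
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


class ReceiptingTransport:
    """Hypothetical wrapper: records a receipt for each tool call, then forwards it."""

    def __init__(self, inner_send, sign):
        self._inner_send = inner_send  # callable that delivers the message
        self._sign = sign              # callable: payload bytes to signature string
        self.receipts = []

    def send(self, message):
        # MCP tool invocations are JSON-RPC requests with method "tools/call".
        if message.get("method") == "tools/call":
            payload = canonical_bytes(message["params"])
            self.receipts.append({
                "tool": message["params"]["name"],
                "params_hash": "sha256:" + hashlib.sha256(payload).hexdigest(),
                "sig": self._sign(payload),
            })
        # Forward only after the receipt exists, so no tool call runs unrecorded.
        return self._inner_send(message)


events = []


def fake_sign(payload):
    # Placeholder only: a real setup would return an Ed25519 signature here.
    events.append("signed")
    return "demo-signature"


def fake_tool_server(message):
    # Stand-in for the real tool server behind the transport.
    events.append("executed")
    return {"result": "ok"}


transport = ReceiptingTransport(fake_tool_server, fake_sign)
reply = transport.send({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "web_search", "arguments": {"query": "weather in Tokyo"}},
})

print(events)                          # ['signed', 'executed']
print(transport.receipts[0]["tool"])   # web_search
```

&lt;p&gt;Because the receipt is appended before the message is forwarded, a failure in the tool server still leaves a record that the call was attempted.&lt;/p&gt;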

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Claude Managed Agents ships a real, useful tracing feature. If you're using it, your debugging workflow just got better.&lt;/p&gt;

&lt;p&gt;But "Anthropic recorded it" and "you can prove it" are different claims. In the situations where audit trails matter most — compliance, incident response, regulated industries — the difference is significant.&lt;/p&gt;

&lt;p&gt;Signing is the layer that converts logs into evidence.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;&lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;Signet&lt;/a&gt; adds Ed25519 signing and tamper-evident audit chains to AI agent tool calls. Works with Claude Managed Agents, LangChain, CrewAI, AutoGen, and 7 other frameworks. Apache-2.0 + MIT.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Now on the official Claude Code plugin marketplace: &lt;code&gt;/plugin install signet@claude-plugins-official&lt;/code&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>agents</category>
      <category>claude</category>
    </item>
    <item>
      <title>AI Agents Can Move Money But Can't Produce Receipts</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Tue, 14 Apr 2026 03:12:41 +0000</pubDate>
      <link>https://forem.com/willamhou/ai-agents-can-move-money-but-cant-produce-receipts-3ong</link>
      <guid>https://forem.com/willamhou/ai-agents-can-move-money-but-cant-produce-receipts-3ong</guid>
      <description>&lt;h1&gt;
  
  
  AI Agents Can Move Money But Can't Produce Receipts
&lt;/h1&gt;

&lt;p&gt;In March 2026, security researchers disclosed &lt;a href="https://medium.com/effortless-programming/zombieclaw-the-ai-botnet-nobody-is-talking-about-04b0dbf5ed1b" rel="noopener noreferrer"&gt;ZombieClaw&lt;/a&gt; — a botnet recruiting compromised AI agent instances. Over 30,000 instances were &lt;a href="https://securityscorecard.com/blog/how-exposed-openclaw-deployments-turn-agentic-ai-into-an-attack-surface/" rel="noopener noreferrer"&gt;found exposed&lt;/a&gt; with default configurations. Reported losses reached &lt;a href="https://www.techbuzz.ai/articles/openclaw-s-ai-marketplace-infected-with-crypto-stealing-malware" rel="noopener noreferrer"&gt;up to $16 million&lt;/a&gt; in cryptocurrency. Hundreds of malicious skills were distributed through ClawHub (&lt;a href="https://www.koi.ai/blog/clawhavoc-341-malicious-clawedbot-skills-found-by-the-bot-they-were-targeting" rel="noopener noreferrer"&gt;341 initially identified by Koi&lt;/a&gt;, with &lt;a href="https://blog.virustotal.com/2026/02/from-automation-to-infection-how.html" rel="noopener noreferrer"&gt;more found by VirusTotal&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.kaspersky.com/blog/openclaw-vulnerabilities-exposed/55263/" rel="noopener noreferrer"&gt;Kaspersky found 512 vulnerabilities&lt;/a&gt;, eight critical. &lt;a href="https://businessinsights.bitdefender.com/technical-advisory-openclaw-exploitation-enterprise-networks" rel="noopener noreferrer"&gt;Bitdefender&lt;/a&gt;, VirusTotal, Sophos, and &lt;a href="https://www.oasis.security/blog/openclaw-vulnerability" rel="noopener noreferrer"&gt;Oasis Security&lt;/a&gt; all published analyses.&lt;/p&gt;

&lt;p&gt;But here's what nobody is talking about: &lt;strong&gt;after the attack, there is no cryptographic proof of what any compromised agent actually did.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No signed records. No tamper-evident logs. No way to distinguish "the agent executed &lt;code&gt;transfer_eth()&lt;/code&gt; because the user asked" from "the agent executed &lt;code&gt;transfer_eth()&lt;/code&gt; because a prompt injection rewrote its instructions."&lt;/p&gt;

&lt;p&gt;The text logs exist, sure. But text logs can be edited, deleted, or fabricated. When $16M is missing, "trust the logs" is not a forensic standard.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Forensics Problem
&lt;/h2&gt;

&lt;p&gt;When a traditional server gets compromised, incident response teams have tools: immutable audit logs, signed system events, chain-of-custody protocols. When an AI agent gets compromised, you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Conversation history&lt;/strong&gt; — stored by the agent itself. The compromised agent can edit its own history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool call logs&lt;/strong&gt; — if they exist at all, they're unsigned text files. An attacker who controls the agent controls the logs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"The agent did it"&lt;/strong&gt; — not enough for insurance claims, compliance reports, or criminal prosecution.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ZombieClaw exploited this gap perfectly. The attackers didn't just steal money — they operated in an environment where &lt;strong&gt;there is no verifiable evidence of what happened&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters Beyond ZombieClaw
&lt;/h2&gt;

&lt;p&gt;The AI agent security conversation focuses on prevention: sandboxing, permission systems, policy engines, skill auditing. These are important. But no preventive control holds forever. Given enough time and exposure, every system gets breached.&lt;/p&gt;

&lt;p&gt;What happens after?&lt;/p&gt;

&lt;p&gt;Without cryptographic proof of agent actions, you can't answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which agent initiated the transaction?&lt;/li&gt;
&lt;li&gt;Were the parameters what the user actually approved?&lt;/li&gt;
&lt;li&gt;When exactly did the compromise begin?&lt;/li&gt;
&lt;li&gt;Was this agent's audit log tampered with after the fact?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SOC 2, HIPAA, and GDPR all require audit trails for actions on sensitive data. "The AI agent did it and we have no verifiable records" creates real gaps in compliance posture.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a Signed Audit Trail Would Have Changed
&lt;/h2&gt;

&lt;p&gt;If every tool call had been cryptographically signed at execution time, the ZombieClaw investigation would look different:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before compromise:&lt;/strong&gt; Signed receipts establish a baseline. Each agent has an Ed25519 identity. Every tool call is signed with the agent's key, timestamped, and chained into a tamper-evident log. The hash chain means you can't delete or reorder entries without breaking the chain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;During compromise:&lt;/strong&gt; The attacker takes control of the agent. If the attacker uses the agent's existing key, every malicious action is still signed — you have a record of what was executed and when. If the attacker generates a new key, the signing identity changes — the anomaly is visible in the chain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After compromise:&lt;/strong&gt; Forensics teams can verify the entire chain offline. They can see which actions were signed by the legitimate agent key vs. an unknown key. They can narrow down when the signing identity changed. They can verify that the log hasn't been modified after the fact.&lt;/p&gt;

&lt;p&gt;None of this is possible with unsigned text logs.&lt;/p&gt;
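
&lt;p&gt;To make the tamper-evidence claim concrete, here is a minimal Python sketch of a hash-chained log. It is an illustration under stated assumptions, not Signet's implementation: the &lt;code&gt;signer&lt;/code&gt; field is a plain label standing in for a verified Ed25519 public key, signature checking is omitted, and every function name is hypothetical.&lt;/p&gt;

```python
import hashlib
import json


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, record: dict) -> None:
    """Append a record, linking it to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else "genesis"
    chain.append({"record": record, "prev": prev_hash,
                  "hash": entry_hash(prev_hash, record)})


def first_broken_link(chain: list):
    """Return the index of the first entry that fails verification, or None."""
    prev_hash = "genesis"
    for index, entry in enumerate(chain):
        if entry["prev"] != prev_hash:
            return index
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return index
        prev_hash = entry["hash"]
    return None


def first_signer_change(chain: list, trusted_key: str):
    """Return the index of the first entry not attributed to trusted_key."""
    for index, entry in enumerate(chain):
        if entry["record"]["signer"] != trusted_key:
            return index
    return None


chain = []
append(chain, {"tool": "get_balance", "signer": "key-A"})
append(chain, {"tool": "create_issue", "signer": "key-A"})
append(chain, {"tool": "transfer_eth", "signer": "key-B"})

print(first_broken_link(chain))             # None
print(first_signer_change(chain, "key-A"))  # 2

tampered = [chain[0], chain[2]]             # middle entry deleted
print(first_broken_link(tampered))          # 1
```

&lt;p&gt;Deleting or reordering entries breaks the &lt;code&gt;prev&lt;/code&gt; links, and scanning for the first unfamiliar signer narrows down when the identity changed.&lt;/p&gt;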

&lt;h2&gt;
  
  
  What This Doesn't Solve
&lt;/h2&gt;

&lt;p&gt;Signing is not prevention. A signed receipt that says "agent transferred 50 ETH to attacker's wallet" doesn't stop the transfer — it proves it happened.&lt;/p&gt;

&lt;p&gt;A signed audit trail doesn't solve:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Malicious skills&lt;/strong&gt; — A signed record of a malicious skill executing is evidence, not a defense.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompt injection&lt;/strong&gt; — The agent was tricked, not unauthorized. The signature is valid because the agent really did execute the call.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key compromise&lt;/strong&gt; — If the attacker steals the signing key, they can sign anything. Bilateral co-signing (where the server independently signs the receipt) mitigates this by requiring two keys from two trust domains.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User intent&lt;/strong&gt; — A signed receipt proves the agent executed the call, not that the user wanted it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full host compromise&lt;/strong&gt; — If the attacker owns the entire machine, they control the key and the log. Off-host anchoring (publishing chain hashes externally) is the mitigation, but it's not free.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Signing is the forensics layer. You still need sandboxing, permission systems, and skill auditing for prevention. But when prevention fails — and it will — you need evidence.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Gap in Current Tools
&lt;/h2&gt;

&lt;p&gt;As of April 2026, most major AI agent frameworks have no cryptographic signing on tool call records:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;th&gt;Typical audit mechanism&lt;/th&gt;
&lt;th&gt;Signed?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;General-purpose agents&lt;/td&gt;
&lt;td&gt;OpenClaw, Hermes Agent&lt;/td&gt;
&lt;td&gt;Conversation logs, SQLite&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent OS&lt;/td&gt;
&lt;td&gt;OpenFang&lt;/td&gt;
&lt;td&gt;SHA-256 hash chain&lt;/td&gt;
&lt;td&gt;Hash only, no signatures&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Orchestration frameworks&lt;/td&gt;
&lt;td&gt;LangChain, CrewAI&lt;/td&gt;
&lt;td&gt;Callbacks, event logs&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;OpenFang is the closest — they have a hash chain, which detects casual tampering. But without signatures, an attacker with database access can rewrite the entire chain and it still validates.&lt;/p&gt;
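
&lt;p&gt;The difference between a bare hash chain and a keyed one can be sketched in a few lines of Python. This is an assumption-laden illustration: HMAC-SHA256 from the standard library stands in for an Ed25519 signature (unlike Ed25519, HMAC requires the verifier to hold the same secret), and the keys, records, and function names are hypothetical.&lt;/p&gt;

```python
import hashlib
import hmac
import json


def link_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the previous link hash."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def tag(key: bytes, digest: str) -> str:
    """Keyed tag over a link hash. HMAC is a stand-in for an Ed25519 signature."""
    return hmac.new(key, digest.encode("utf-8"), hashlib.sha256).hexdigest()


def build(records: list, key: bytes) -> list:
    """Build a chain where every entry carries its link hash and a keyed tag."""
    chain, prev_hash = [], "genesis"
    for record in records:
        digest = link_hash(prev_hash, record)
        chain.append({"record": record, "hash": digest, "tag": tag(key, digest)})
        prev_hash = digest
    return chain


def hashes_valid(chain: list) -> bool:
    """Check only the hash links, as an unsigned chain would."""
    prev_hash = "genesis"
    for entry in chain:
        if entry["hash"] != link_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


def tags_valid(chain: list, key: bytes) -> bool:
    """Check every keyed tag against the expected key."""
    return all(hmac.compare_digest(entry["tag"], tag(key, entry["hash"]))
               for entry in chain)


agent_key = b"agent-secret"  # hypothetical key held by the agent
original = build([{"tool": "transfer_eth", "to": "0xattacker"}], agent_key)

# The attacker rewrites the record and recomputes every hash, but has no key.
forged = build([{"tool": "get_balance"}], b"attacker-guess")

print(hashes_valid(forged))             # True: the hash chain alone still validates
print(tags_valid(forged, agent_key))    # False: the keyed tags expose the rewrite
print(tags_valid(original, agent_key))  # True
```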

&lt;h2&gt;
  
  
  What You Can Do Today
&lt;/h2&gt;

&lt;p&gt;If you're running AI agents in production:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sign every tool call.&lt;/strong&gt; Give each agent an Ed25519 identity and sign every action. &lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;Signet&lt;/a&gt; does this as a library — &lt;code&gt;pip install signet-auth&lt;/code&gt; or &lt;code&gt;npm install @signet-auth/core&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Chain signed receipts.&lt;/strong&gt; Individual signatures are good. A hash-chained log of signed receipts is better — deletion and reordering become detectable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use bilateral signing when possible.&lt;/strong&gt; Agent signs the request, server signs the response. Now rewriting the chain requires compromising both keys on different machines.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Export chain hashes off-host.&lt;/strong&gt; Periodically publish the tip hash to an external system (git commit, append-only cloud storage, even a tweet). This anchors the chain against full-host compromise.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Treat audit integrity as a security requirement, not a feature.&lt;/strong&gt; If your agent can move money, it needs signed receipts. Period.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
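
&lt;p&gt;Step 4 is the least familiar of these, so here is a rough Python sketch of off-host anchoring. It assumes the anchor is just a record count plus the tip hash at that point, stored somewhere the agent host cannot modify. The anchor format and function names are hypothetical.&lt;/p&gt;

```python
import hashlib
import json


def tip_hash(records: list) -> str:
    """Fold the records into a single hash; any rewrite changes the result."""
    digest = "genesis"
    for record in records:
        body = json.dumps(record, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256((digest + body).encode("utf-8")).hexdigest()
    return digest


def matches_anchor(records: list, anchor: dict) -> bool:
    """Check that the first anchor['count'] records still hash to the anchored tip."""
    if anchor["count"] > len(records):
        return False  # entries were deleted after the anchor was published
    return tip_hash(records[:anchor["count"]]) == anchor["tip"]


log = [{"tool": "get_balance"}, {"tool": "transfer_eth", "amount": 50}]
anchor = {"count": len(log), "tip": tip_hash(log)}  # published off-host

log.append({"tool": "create_issue"})  # normal growth after the anchor
print(matches_anchor(log, anchor))    # True

# A full rewrite on a compromised host no longer matches the external anchor.
rewritten = [{"tool": "get_balance"}, {"tool": "get_balance"}, {"tool": "create_issue"}]
print(matches_anchor(rewritten, anchor))  # False
```

&lt;p&gt;Even an attacker who holds the signing key and rewrites the whole log cannot change a hash that was already published elsewhere.&lt;/p&gt;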

&lt;h2&gt;
  
  
  The Uncomfortable Truth
&lt;/h2&gt;

&lt;p&gt;AI agents can move money, execute code, and access credentials. Most still can't produce a receipt.&lt;/p&gt;

&lt;p&gt;The next ZombieClaw is coming. The question is whether you'll have evidence when it happens.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;&lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;Signet&lt;/a&gt; adds Ed25519 signing and tamper-evident audit logs to AI agent tool calls. Open source, Apache-2.0 + MIT.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>agents</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Your MCP Server Has No Audit Trail — A Security Checklist</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Mon, 13 Apr 2026 06:24:46 +0000</pubDate>
      <link>https://forem.com/willamhou/your-mcp-server-has-no-audit-trail-a-security-checklist-h1k</link>
      <guid>https://forem.com/willamhou/your-mcp-server-has-no-audit-trail-a-security-checklist-h1k</guid>
      <description>&lt;h1&gt;
  
  
  Your MCP Server Has No Audit Trail — A Security Checklist
&lt;/h1&gt;

&lt;p&gt;Last month, an AI agent mass-deleted a production environment. The team spent 3 days piecing together what happened — stderr logs, partial timestamps, no proof of which agent or what parameters. No audit trail.&lt;/p&gt;

&lt;p&gt;This isn't rare. Amazon Kiro deleted a prod environment. Replit's agent dropped a live database. Supabase MCP leaked tokens via prompt injection. In every case: &lt;strong&gt;zero cryptographic evidence of what happened.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MCP is becoming the standard for agent-tool communication. Claude Code, Cursor, Windsurf, and dozens of other tools use it. But the MCP spec ships with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ No request signing&lt;/li&gt;
&lt;li&gt;❌ No audit log&lt;/li&gt;
&lt;li&gt;❌ No caller identity verification&lt;/li&gt;
&lt;li&gt;❌ No replay protection&lt;/li&gt;
&lt;li&gt;❌ No parameter integrity checks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your MCP server accepts any request from any process, trusts it completely, and keeps no verifiable record. Here's a practical checklist to fix that.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Threat Model
&lt;/h2&gt;

&lt;p&gt;Before the checklist, understand what you're defending against:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Attack&lt;/th&gt;
&lt;th&gt;How it works&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Parameter tampering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent sends &lt;code&gt;create_issue("fix bug")&lt;/code&gt;, something in the pipeline changes it to &lt;code&gt;delete_repo("production")&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Data loss&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Replay&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Legitimate &lt;code&gt;deploy_to_prod&lt;/code&gt; captured and replayed 50 times&lt;/td&gt;
&lt;td&gt;Repeated side effects&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Impersonation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Rogue process sends requests claiming to be your trusted agent&lt;/td&gt;
&lt;td&gt;Unauthorized actions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross-server forwarding&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Request intended for staging gets forwarded to production&lt;/td&gt;
&lt;td&gt;Wrong environment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Log tampering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Text logs edited after an incident to cover tracks&lt;/td&gt;
&lt;td&gt;No trustworthy evidence for incident response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compliance gap&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SOC 2 / HIPAA / GDPR require audit trails; "the AI did it" is not sufficient&lt;/td&gt;
&lt;td&gt;Regulatory risk&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Checklist
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ✅ 1. Use TLS for HTTP transports
&lt;/h3&gt;

&lt;p&gt;If your MCP server uses HTTP (SSE or Streamable HTTP), always terminate TLS. This protects data in transit but does &lt;strong&gt;not&lt;/strong&gt; protect against:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compromised clients sending bad requests&lt;/li&gt;
&lt;li&gt;Replay attacks (TLS protects the pipe, not the message)&lt;/li&gt;
&lt;li&gt;Log tampering after the fact&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For stdio transports (most local MCP servers), TLS doesn't apply. The attack surface is different: any local process that can reach the server's pipes can send requests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
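&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="c1"&gt;# nginx example: terminate TLS, then proxy to the local MCP server&lt;/span&gt;
&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;443&lt;/span&gt; &lt;span class="s"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;ssl_certificate&lt;/span&gt; &lt;span class="n"&gt;/etc/nginx/tls/fullchain.pem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;     &lt;span class="c1"&gt;# placeholder path&lt;/span&gt;
    &lt;span class="kn"&gt;ssl_certificate_key&lt;/span&gt; &lt;span class="n"&gt;/etc/nginx/tls/privkey.pem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;# placeholder path&lt;/span&gt;
    &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/mcp&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://localhost:3001&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;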
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Covers:&lt;/strong&gt; Data in transit.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Doesn't cover:&lt;/strong&gt; Request integrity, identity, audit.&lt;/p&gt;


&lt;h3&gt;
  
  
  ✅ 2. Validate inputs at the boundary
&lt;/h3&gt;

&lt;p&gt;Every tool handler should validate its arguments. MCP passes arbitrary JSON — treat it like user input.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CallToolRequestSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;args&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;create_issue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Invalid title&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt; &lt;span class="na"&gt;isError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// proceed...&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use Zod or similar for runtime validation. Never trust &lt;code&gt;args&lt;/code&gt; blindly.&lt;/p&gt;
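
&lt;p&gt;For servers written in Python, the same boundary check can be sketched with the standard library alone. This is a minimal illustration, assuming a &lt;code&gt;create_issue&lt;/code&gt; tool with a required &lt;code&gt;title&lt;/code&gt; and an optional &lt;code&gt;body&lt;/code&gt;. The &lt;code&gt;body&lt;/code&gt; field and the function name are hypothetical.&lt;/p&gt;

```python
def validate_create_issue(args) -> list:
    """Return a list of validation errors; an empty list means the input is acceptable."""
    if not isinstance(args, dict):
        return ["arguments must be an object"]

    errors = []
    allowed = {"title", "body"}
    # Reject fields the tool does not declare instead of silently ignoring them.
    unknown = sorted(set(args) - allowed)
    if unknown:
        errors.append("unknown fields: " + ", ".join(unknown))

    title = args.get("title")
    if not isinstance(title, str) or not title.strip():
        errors.append("title must be a non-empty string")
    elif len(title) > 200:
        errors.append("title must be at most 200 characters")

    body = args.get("body", "")
    if not isinstance(body, str):
        errors.append("body must be a string")
    return errors


print(validate_create_issue({"title": "fix bug"}))            # []
print(validate_create_issue({"title": "x" * 201}))            # ['title must be at most 200 characters']
print(validate_create_issue({"title": "ok", "force": True}))  # ['unknown fields: force']
```

&lt;p&gt;Rejecting unknown fields is a deliberate choice here: an allow-list fails closed when a caller sends something the handler never expected.&lt;/p&gt;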

&lt;p&gt;&lt;strong&gt;Covers:&lt;/strong&gt; Malformed input, injection.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Doesn't cover:&lt;/strong&gt; Who sent it, whether it's a replay, audit trail.&lt;/p&gt;


&lt;h3&gt;
  
  
  ✅ 3. Add authentication (API keys or mTLS)
&lt;/h3&gt;

&lt;p&gt;For HTTP transports, require an API key or use mutual TLS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Simple API key check&lt;/span&gt;
&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CallToolRequestSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;extra&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;extra&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;requestHeaders&lt;/span&gt;&lt;span class="p"&gt;?.[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;x-api-key&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
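  &lt;span class="c1"&gt;// Fail closed: if MCP_API_KEY is unset, a missing header must not match undefined&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;MCP_API_KEY&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;MCP_API_KEY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;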
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unauthorized&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt; &lt;span class="na"&gt;isError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;// proceed...&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For stdio, authentication is harder — any local process with access to the pipe can send requests. This is where cryptographic signing becomes necessary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Covers:&lt;/strong&gt; Unauthorized callers (HTTP only).&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Doesn't cover:&lt;/strong&gt; Parameter integrity, replay, stdio auth, audit trail.&lt;/p&gt;


&lt;h3&gt;
  
  
  ✅ 4. Sign every request with cryptographic receipts
&lt;/h3&gt;

&lt;p&gt;This is the gap most MCP servers don't address. Signing binds a request to a specific agent identity and makes tampering detectable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;Signet&lt;/a&gt; adds Ed25519 signing to MCP. A signed receipt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"v"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"create_issue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"params_hash"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:b878192..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp://github.local"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"signer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pubkey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ed25519:0CRkURt/tc6r..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"deploy-bot"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-09T10:30:00.000Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"nonce"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rnd_dcd4e13579..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sig"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ed25519:6KUohbnS..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tamper with any field → signature fails. Replay → nonce rejected.&lt;/p&gt;
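
&lt;p&gt;The &lt;code&gt;params_hash&lt;/code&gt; field is what ties the signature to the exact arguments. As a rough sketch of the idea in Python, assume the hash is SHA-256 over a canonical JSON encoding with sorted keys. Signet's actual canonicalization may differ, and the function name is hypothetical.&lt;/p&gt;

```python
import hashlib
import json


def params_hash(params: dict) -> str:
    """Hash a canonical JSON encoding so key order does not change the result."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


signed = params_hash({"title": "fix bug", "repo": "app"})

# Same arguments in a different key order hash identically.
print(signed == params_hash({"repo": "app", "title": "fix bug"}))         # True

# Any change to the arguments in transit produces a different hash.
print(signed == params_hash({"title": "fix bug", "repo": "production"}))  # False
```

&lt;p&gt;Because the signature covers the hash, a verifier that recomputes it from the received arguments detects any change made after signing.&lt;/p&gt;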

&lt;p&gt;&lt;strong&gt;Client side — sign every tool call:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;StdioClientTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/client/stdio.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;SigningTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@signet-auth/mcp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;inner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StdioClientTransport&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;my-mcp-server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// secretKey: the agent's Ed25519 secret key, loaded elsewhere&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SigningTransport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;secretKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;my-agent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Every tools/call now carries a signed receipt in params._meta._signet&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The receipt is injected into &lt;code&gt;_meta._signet&lt;/code&gt;. MCP servers ignore unknown fields by spec — &lt;strong&gt;zero server changes needed&lt;/strong&gt; to start signing. Works with stdio and HTTP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Server side — verify incoming signatures:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;verifyRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;NonceCache&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@signet-auth/mcp-server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;nonceCache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NonceCache&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CallToolRequestSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;verifyRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;trustedKeys&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ed25519:...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;       &lt;span class="c1"&gt;// allowed agent keys&lt;/span&gt;
    &lt;span class="na"&gt;expectedTarget&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mcp://my-server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// anti-forwarding&lt;/span&gt;
    &lt;span class="na"&gt;maxAge&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                        &lt;span class="c1"&gt;// 5-min freshness window&lt;/span&gt;
    &lt;span class="nx"&gt;nonceCache&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                         &lt;span class="c1"&gt;// replay protection&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt; &lt;span class="na"&gt;isError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Verified: signature valid, signer trusted, fresh, correct target&lt;/span&gt;
  &lt;span class="c1"&gt;// proceed with tool execution...&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In ~50 microseconds, this checks: signature validity, signer trust, freshness, target binding, tool/params integrity, and nonce uniqueness.&lt;/p&gt;
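
&lt;p&gt;To make three of those checks concrete (freshness, target binding, nonce uniqueness), here is a minimal Python sketch. It is not Signet's implementation: the receipt field names (&lt;code&gt;ts&lt;/code&gt;, &lt;code&gt;nonce&lt;/code&gt;, &lt;code&gt;target&lt;/code&gt;) and the function &lt;code&gt;check_request&lt;/code&gt; are hypothetical, and signature and trust verification are assumed to have happened already.&lt;/p&gt;

```python
import time


def check_request(receipt, expected_target, max_age, seen_nonces, now=None):
    """Return (ok, reason) for a receipt dict with 'ts', 'nonce' and 'target'.

    Hypothetical field names; signature verification is assumed to be done.
    """
    now = time.time() if now is None else now
    # Freshness: reject anything outside the window (including far-future timestamps).
    if abs(now - receipt["ts"]) > max_age:
        return False, "stale"
    # Target binding: a receipt signed for another server must not be accepted here.
    if receipt["target"] != expected_target:
        return False, "wrong target"
    # Replay protection: each nonce may be used only once.
    if receipt["nonce"] in seen_nonces:
        return False, "replayed nonce"
    seen_nonces.add(receipt["nonce"])
    return True, "ok"
```

&lt;p&gt;In a long-running server the nonce set would need to expire entries older than the freshness window, otherwise it grows without bound.&lt;/p&gt;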

&lt;p&gt;&lt;strong&gt;Python (works with LangChain, CrewAI, AutoGen, or standalone):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;devops-team&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;receipt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_issue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix bug&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Covers:&lt;/strong&gt; Identity, parameter integrity, replay, freshness, target binding.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Doesn't cover:&lt;/strong&gt; Preventing the action (signing is attestation, not policy).&lt;/p&gt;


&lt;h3&gt;
  
  
  ✅ 5. Keep a tamper-evident audit log
&lt;/h3&gt;

&lt;p&gt;Signing individual requests is good. Chaining them into a tamper-evident log is better. If someone deletes or reorders records, the chain breaks.&lt;/p&gt;

&lt;p&gt;Signet does this automatically — every signed receipt is appended to a SHA-256 hash-chained JSONL log at &lt;code&gt;~/.signet/audit/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;record_1: { receipt, prev_hash: "sha256:0000...", record_hash: "sha256:abc1..." }
record_2: { receipt, prev_hash: "sha256:abc1...", record_hash: "sha256:def2..." }
record_3: { receipt, prev_hash: "sha256:def2...", record_hash: "sha256:0a93..." }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
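
&lt;p&gt;The chaining idea itself fits in a few lines. The sketch below is an illustration using only Python's standard library, not Signet's actual record format: the helper names (&lt;code&gt;record_hash&lt;/code&gt;, &lt;code&gt;append&lt;/code&gt;, &lt;code&gt;verify_chain&lt;/code&gt;) and the JSON canonicalization are assumptions.&lt;/p&gt;

```python
import hashlib
import json

# Assumed genesis value: an all-zero digest, as in the first record shown above.
GENESIS = "sha256:" + "0" * 64


def record_hash(receipt, prev_hash):
    """Hash the receipt together with the previous record's hash."""
    payload = json.dumps({"receipt": receipt, "prev_hash": prev_hash}, sort_keys=True)
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log, receipt):
    """Append a receipt, linking it to the last record (or to GENESIS)."""
    prev = log[-1]["record_hash"] if log else GENESIS
    log.append({"receipt": receipt, "prev_hash": prev,
                "record_hash": record_hash(receipt, prev)})


def verify_chain(log):
    """Recompute every link; any edit, deletion or reorder in the middle breaks it."""
    prev = GENESIS
    for rec in log:
        if rec["prev_hash"] != prev:
            return False
        if rec["record_hash"] != record_hash(rec["receipt"], prev):
            return False
        prev = rec["record_hash"]
    return True
```

&lt;p&gt;Note what this does not catch: truncating records from the end, or deleting the whole log, leaves a chain that still verifies. Detecting that needs a copy of the latest hash stored somewhere else.&lt;/p&gt;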



&lt;p&gt;Query and verify from the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;signet audit &lt;span class="nt"&gt;--since&lt;/span&gt; 24h              &lt;span class="c"&gt;# what happened today&lt;/span&gt;
signet audit &lt;span class="nt"&gt;--tool&lt;/span&gt; github &lt;span class="nt"&gt;--since&lt;/span&gt; 7d &lt;span class="c"&gt;# github calls this week&lt;/span&gt;
signet audit &lt;span class="nt"&gt;--verify&lt;/span&gt;                 &lt;span class="c"&gt;# verify all signatures&lt;/span&gt;
signet verify &lt;span class="nt"&gt;--chain&lt;/span&gt;                 &lt;span class="c"&gt;# check hash chain integrity&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or from Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;audit_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;since&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;24h&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;audit_verify_chain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valid&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Covers:&lt;/strong&gt; Tamper detection, incident forensics, compliance audit.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Doesn't cover:&lt;/strong&gt; Tamper &lt;em&gt;proof&lt;/em&gt; (someone with disk access can delete the entire log; off-host anchoring is on the roadmap).&lt;/p&gt;


&lt;h3&gt;
  
  
  ✅ 6. Implement rate limiting and timeouts
&lt;/h3&gt;

&lt;p&gt;Even with signing, a compromised agent can flood your server. Add rate limits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
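&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;callCounts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nf"&gt;setInterval&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;callCounts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// reset counts every minute (assumed window) so the limit is a rate, not a lifetime cap&lt;/span&gt;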

&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CallToolRequestSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;signer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;_meta&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;_signet&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;signer&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unknown&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;callCounts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;signer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;callCounts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;signer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;  &lt;span class="c1"&gt;// per-agent limit&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Rate limit exceeded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt; &lt;span class="na"&gt;isError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// proceed...&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
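
&lt;p&gt;A per-signer counter becomes a rate limit once it is tied to a time window. Here is a minimal fixed-window sketch in Python. The class name &lt;code&gt;FixedWindowLimiter&lt;/code&gt; and the 60-second window are assumptions for illustration, and the signer key should come from a receipt that has already been verified, not from a self-reported name.&lt;/p&gt;

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` calls per signer in each `window_seconds` window."""

    def __init__(self, limit=100, window_seconds=60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.windows = {}  # signer -> (window_start, count)

    def allow(self, signer, now=None):
        now = time.monotonic() if now is None else now
        start, count = self.windows.get(signer, (now, 0))
        # Start a fresh window once the current one has expired.
        if now - start >= self.window_seconds:
            start, count = now, 0
        count += 1
        self.windows[signer] = (start, count)
        return self.limit >= count
```

&lt;p&gt;A fixed window is the simplest option but lets a burst straddle a window boundary; a sliding window or token bucket smooths that out at the cost of a little more state.&lt;/p&gt;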



&lt;p&gt;And always set timeouts on tool execution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;controller&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AbortController&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timeout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;executeTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signal&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;clearTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
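
&lt;p&gt;For a Python server, the same idea can be sketched with &lt;code&gt;asyncio.wait_for&lt;/code&gt;, which cancels the awaited call when the deadline passes. The wrapper name &lt;code&gt;run_with_timeout&lt;/code&gt; is hypothetical, and the returned dictionaries only mirror the shape of the error responses used in the examples above.&lt;/p&gt;

```python
import asyncio


async def run_with_timeout(tool_coro, seconds=30.0):
    """Await a tool coroutine, returning an error result if it exceeds the deadline."""
    try:
        # wait_for cancels the underlying task when the timeout expires.
        result = await asyncio.wait_for(tool_coro, timeout=seconds)
        return {"content": [{"type": "text", "text": str(result)}]}
    except asyncio.TimeoutError:
        return {"content": [{"type": "text", "text": "Tool timed out"}], "isError": True}
```

&lt;p&gt;Cancellation only takes effect at an &lt;code&gt;await&lt;/code&gt; point, so a tool that blocks the event loop with synchronous work still needs to run in a thread or subprocess.&lt;/p&gt;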






&lt;h3&gt;
  
  
  ✅ 7. Principle of least privilege
&lt;/h3&gt;

&lt;p&gt;Don't give your MCP server access to everything. Run it with minimal permissions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Separate API keys per tool (read-only key for &lt;code&gt;list_issues&lt;/code&gt;, write key for &lt;code&gt;create_issue&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Filesystem access scoped to specific directories&lt;/li&gt;
&lt;li&gt;Database user with only the required grants&lt;/li&gt;
&lt;li&gt;Network egress limited to required endpoints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is independent of MCP — it's basic defense-in-depth.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Practice&lt;/th&gt;
&lt;th&gt;Protects against&lt;/th&gt;
&lt;th&gt;Difficulty&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;TLS&lt;/td&gt;
&lt;td&gt;Eavesdropping&lt;/td&gt;
&lt;td&gt;Easy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Input validation&lt;/td&gt;
&lt;td&gt;Injection, malformed data&lt;/td&gt;
&lt;td&gt;Easy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Authentication&lt;/td&gt;
&lt;td&gt;Unauthorized callers&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Request signing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Tampering, replay, impersonation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3 lines&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Audit log&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Incident response, compliance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Automatic with signing&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Rate limiting&lt;/td&gt;
&lt;td&gt;Denial of service&lt;/td&gt;
&lt;td&gt;Easy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Least privilege&lt;/td&gt;
&lt;td&gt;Blast radius&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most MCP servers today implement 1-3 at best. Steps 4 and 5 — signing and audit — are the gap. They're also the hardest to bolt on after the fact, which is why starting with a library that handles both is worth the &lt;code&gt;npm install&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @signet-auth/core @signet-auth/mcp
&lt;span class="c"&gt;# or&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;signet-auth
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;github.com/Prismer-AI/signet&lt;/a&gt;&lt;br&gt;&lt;br&gt;
Apache-2.0 + MIT dual licensed. Open source, no SaaS, no phone-home.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If your AI agent can delete a database, you should be able to prove it did.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>security</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Every Trending AI Agent Project Is Reinventing Something Humans Already Built</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Sat, 11 Apr 2026 05:44:30 +0000</pubDate>
      <link>https://forem.com/willamhou/every-trending-ai-agent-project-is-reinventing-something-humans-already-built-469e</link>
      <guid>https://forem.com/willamhou/every-trending-ai-agent-project-is-reinventing-something-humans-already-built-469e</guid>
      <description>&lt;h1&gt;
  
  
  Every Trending AI Agent Project Is Reinventing Something Humans Already Built
&lt;/h1&gt;

&lt;p&gt;I've been watching GitHub Trending for the past six months. The same category of project keeps appearing: infrastructure for AI agents.&lt;/p&gt;

&lt;p&gt;Look closely and they're all doing the same thing. Taking organizational structures that humans have used for centuries and rebuilding them for agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pattern
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Human world&lt;/th&gt;
&lt;th&gt;Agent world&lt;/th&gt;
&lt;th&gt;Example project&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Workflow manuals / API standards&lt;/td&gt;
&lt;td&gt;Tool-calling protocol&lt;/td&gt;
&lt;td&gt;MCP (Anthropic)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Team collaboration / org charts&lt;/td&gt;
&lt;td&gt;Multi-agent orchestration&lt;/td&gt;
&lt;td&gt;CrewAI, AutoGen, LangGraph&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security policies / firewalls&lt;/td&gt;
&lt;td&gt;Agent behavior control&lt;/td&gt;
&lt;td&gt;Aegis, Invariant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Contracts / signatures / audit&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;?&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The first three rows are well covered. Multiple teams are shipping production-grade tools.&lt;/p&gt;

&lt;p&gt;The last row is empty.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Oldest Infrastructure in Human Society
&lt;/h2&gt;

&lt;p&gt;Human civilization has run on one mechanism from Mesopotamian clay tablets to DocuSign: &lt;strong&gt;signatures&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Sign a contract. Sign for a delivery. Sign an expense report. Sign a bank transfer. When something goes wrong and you end up in court, the first question is always: "Is there a signature?"&lt;/p&gt;

&lt;p&gt;Signatures don't solve trust. They solve &lt;strong&gt;proof&lt;/strong&gt;. Not "I trust you," but "you did this, here is the evidence, you cannot deny it."&lt;/p&gt;

&lt;p&gt;Now look at the agent world.&lt;/p&gt;

&lt;p&gt;Your agent calls dozens of APIs every day. Creates issues. Sends messages. Places orders. Modifies configurations. What did it actually do?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No signature. No receipt. No evidence.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MCP, the protocol agents use to talk to tools, has no signing mechanism. Your agent acts with your credentials, your API keys, on your behalf. When something goes wrong, you can't even prove whether your agent did it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happens Without Signatures
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;You run a multi-agent pipeline.&lt;/strong&gt; An orchestrator delegates to a research agent, a writing agent, a review agent. The final report contains incorrect data. Which agent introduced it? When? With what parameters? Without signatures, you trace manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security asks you a question.&lt;/strong&gt; "What did your agent do last week? Which services did it call?" You open the MCP server access log. HTTP requests. No way to distinguish human operations from agent operations. You have no answer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your agent placed 10 purchase orders.&lt;/strong&gt; One has an abnormal amount. You want to prove that transaction wasn't authorized by you. But you can't produce cryptographic evidence. The agent used your key. In the logs, it looks like you did it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Gap Is Real
&lt;/h2&gt;

&lt;p&gt;This isn't a hypothetical concern. It's already happening. In the MCP spec discussion (SEP-1763), five independent projects identified the layers needed for an agent enforcement stack:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What it answers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Identity&lt;/td&gt;
&lt;td&gt;"Who is this agent?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Policy&lt;/td&gt;
&lt;td&gt;"What is it allowed to do?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Transport integrity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;"Can you prove what was actually sent and received?"&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spend control&lt;/td&gt;
&lt;td&gt;"How much can it spend?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output verification&lt;/td&gt;
&lt;td&gt;"Is the response correct?"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The transport integrity layer, the signing layer, was empty until recently.&lt;/p&gt;

&lt;h2&gt;
  
  
  Signet: Signing for Agents
&lt;/h2&gt;

&lt;p&gt;I built &lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;Signet&lt;/a&gt; to fill that row.&lt;/p&gt;

&lt;p&gt;Every agent gets an Ed25519 identity. Every tool call gets a signed receipt. Receipts chain into a tamper-evident audit log. Servers can verify requests before execution. Both sides can co-sign a bilateral record.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;procurement-bot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ops-team&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;receipt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;marketplace_purchase&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;item&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GPU-A100&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;quantity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;15000&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# What did this agent do in the last 24 hours?
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;audit_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;since&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;24h&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Was this receipt tampered with?
&lt;/span&gt;&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;v0.6 added delegation chains: a root identity (human or org) cryptographically delegates scoped authority to an agent. The agent's receipts carry proof of who authorized them, what scope was granted, and that permissions can only narrow, never widen.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Owner (alice) → Agent A (tools: [Bash, Read], max_depth: 0)
                    ↓
              v4 Receipt: tool=Bash, params_hash=sha256:...
              authorization: chain proves alice → Agent A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
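
&lt;p&gt;The "narrow, never widen" rule can be sketched as a walk over the chain. This is an illustration, not Signet's verifier: the link fields, the &lt;code&gt;root_tools&lt;/code&gt; argument, and the reading of &lt;code&gt;max_depth&lt;/code&gt; as "how many further delegations this link may make" are assumptions, and the signature check on each link is omitted.&lt;/p&gt;

```python
def validate_chain(root_tools, links):
    """Check that every link only narrows the scope granted by its parent.

    links: root-first list of dicts with 'delegate', 'tools' and 'max_depth'.
    """
    allowed = set(root_tools)
    depth_budget = None  # the root itself is not depth-limited in this sketch
    for link in links:
        tools = set(link["tools"])
        if not tools.issubset(allowed):
            return False  # scope widened
        if depth_budget is not None:
            if depth_budget == 0:
                return False  # parent was not allowed to delegate further
            if link["max_depth"] > depth_budget - 1:
                return False  # depth budget widened
        allowed = tools
        depth_budget = link["max_depth"]
    return True
```

&lt;p&gt;With the example above, a chain of alice → Agent A with &lt;code&gt;tools: [Bash, Read]&lt;/code&gt; passes as long as alice holds both tools, and any attempt by Agent A to delegate onward fails because its &lt;code&gt;max_depth&lt;/code&gt; is 0.&lt;/p&gt;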



&lt;p&gt;The difference from plain logging: logs record what the system &lt;em&gt;says&lt;/em&gt; happened. Signatures prove what &lt;em&gt;actually&lt;/em&gt; happened. Logs can be rewritten. Signatures can't be forged without the private key. This isn't a log. It's evidence.&lt;/p&gt;

&lt;h2&gt;
  
  
  But Is This the Right Abstraction?
&lt;/h2&gt;

&lt;p&gt;Back to the table at the top.&lt;/p&gt;

&lt;p&gt;The entire industry is mapping human organizational patterns onto agents. Protocols. Teams. Firewalls. Signatures. Rebuilding them one by one.&lt;/p&gt;

&lt;p&gt;There's an implicit assumption here: &lt;strong&gt;agent organizational structures should look like human ones.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Do agents need "team collaboration"? Or did we build multi-agent frameworks because humans are used to teams, so we instinctively gave agents one too?&lt;/p&gt;

&lt;p&gt;Do agents need "signatures"? Or is there a fundamentally different trust mechanism that doesn't rely on signatures and evidence, but on something we haven't thought of yet?&lt;/p&gt;

&lt;p&gt;Humans built passports, contracts, audits, and firewalls because humans are not fully trustworthy, have unreliable memory, and sometimes lie. Agents are a different species. Should their trust infrastructure be designed from first principles instead of copied from human patterns?&lt;/p&gt;

&lt;p&gt;I don't know the answer.&lt;/p&gt;

&lt;p&gt;What I do know: &lt;strong&gt;agents are already acting on your behalf, and you can't prove what they did.&lt;/strong&gt; Whatever the final form of agent trust looks like, "provable" is probably a requirement that doesn't go away.&lt;/p&gt;

&lt;p&gt;The bigger question, what agent organizational structures &lt;em&gt;should&lt;/em&gt; look like, I'll leave to people smarter than me.&lt;/p&gt;




&lt;p&gt;GitHub: &lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;github.com/Prismer-AI/signet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now on the official Claude Code plugin marketplace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin &lt;span class="nb"&gt;install &lt;/span&gt;signet@claude-plugins-official
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apache-2.0 + MIT. Open source.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you're thinking about these problems too: &lt;a href="https://x.com/WillamUpUp" rel="noopener noreferrer"&gt;@willamhou&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>agents</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Two Hypervisors, One SoC: Replacing Hafnium with 30K Lines of Rust</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Thu, 09 Apr 2026 02:07:31 +0000</pubDate>
      <link>https://forem.com/willamhou/two-hypervisors-one-soc-replacing-hafnium-with-30k-lines-of-rust-lhj</link>
      <guid>https://forem.com/willamhou/two-hypervisors-one-soc-replacing-hafnium-with-30k-lines-of-rust-lhj</guid>
      <description>&lt;h1&gt;
  
  
  Two Hypervisors, One SoC: Replacing Hafnium with 30K Lines of Rust
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;Over about 10 weeks, I built a bare-metal SPMC at S-EL2 that boots Linux, manages Secure Partitions, and runs alongside Android pKVM on the same SoC.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I built an ARM64 hypervisor that runs &lt;em&gt;next to&lt;/em&gt; Google's pKVM on the same chip. pKVM takes the Normal world at NS-EL2. My hypervisor takes the Secure world at S-EL2. They coordinate through ARM's FF-A protocol, relayed by EL3 firmware. 35 end-to-end tests pass through the full four-level stack: Linux kernel module → pKVM → TF-A → my SPMC → Secure Partitions → and back.&lt;/p&gt;

&lt;p&gt;The Secure side already had an implementation: &lt;a href="https://hafnium.googlesource.com/hafnium/" rel="noopener noreferrer"&gt;Hafnium&lt;/a&gt;, Google's reference SPMC. It's 200K+ lines of C. I replaced it with 30,000 lines of &lt;code&gt;no_std&lt;/code&gt; Rust — no runtime, no allocator crate, one dependency (a DTB parser). It boots Linux to a BusyBox shell, manages three Secure Partitions, and handles FF-A v1.1 messaging and memory sharing.&lt;/p&gt;

&lt;p&gt;I'll walk through the architecture, the parts that were genuinely hard, and the four bugs I spent the most time chasing.&lt;/p&gt;

&lt;h2&gt;
  
  
  ARM's Split Personality
&lt;/h2&gt;

&lt;p&gt;ARM's TrustZone divides the CPU into two security worlds, Normal and Secure. On chips that implement Secure EL2 (added in Armv8.4-A), each world gets its own hypervisor at EL2:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;             Normal World            Secure World
           ┌───────────────┐       ┌───────────────┐
    EL0    │ Userspace     │       │               │
           ├───────────────┤       ├───────────────┤
    EL1    │ Linux/Android │       │ Secure        │
           │ kernel        │       │ Partitions    │
           ├───────────────┤       ├───────────────┤
    EL2    │ pKVM          │       │ SPMC          │
           │ (NS-EL2)      │       │ (S-EL2)       │
           └───────┬───────┘       └───────┬───────┘
                   │       ┌──────┐        │
    EL3            └───────│ TF-A │────────┘
                           │ SPMD │
                           └──────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;EL3 is the root of trust — ARM Trusted Firmware (TF-A) lives here and relays messages between worlds via SMC (Secure Monitor Call). The protocol is &lt;a href="https://developer.arm.com/documentation/den0077/latest" rel="noopener noreferrer"&gt;FF-A&lt;/a&gt; v1.1: it defines messaging, memory sharing, page ownership transfer, and partition management. My hypervisor fills the S-EL2 box.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two Hypervisors, One Chip
&lt;/h2&gt;

&lt;p&gt;This is the part most hypervisor projects don't deal with: coexistence. pKVM and my SPMC boot on the same 4 physical CPUs, each managing its own world. The boot chain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TF-A BL1 (ROM) → BL2 (loader) → BL31 (SPMD at EL3)
    → BL32 (our SPMC at S-EL2, boots SP1/SP2/SP3)
    → BL33 (pKVM at NS-EL2 → Linux at NS-EL1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When pKVM's Linux guest wants to talk to a Secure Partition, the message passes through four exception levels and, over the round trip, two world switches:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Linux (NS-EL1) → SMC → pKVM (NS-EL2) → SMC → SPMD (EL3)
    → ERET → SPMC (S-EL2) → ERET → SP1 (S-EL1)
    → SMC → SPMC → SMC → SPMD → ERET → pKVM → ERET → Linux
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The proof: Linux sends &lt;code&gt;x4=0xBBBB&lt;/code&gt; via FF-A DIRECT_REQ, SP1 adds &lt;code&gt;0x1000&lt;/code&gt;, Linux reads back &lt;code&gt;0xCBBB&lt;/code&gt;. One round trip, four privilege levels, two world switches.&lt;/p&gt;

&lt;p&gt;Making this work meant dealing with problems that mostly don't show up in a single-hypervisor setup:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SPMD is per-CPU.&lt;/strong&gt; TF-A's Secure Partition Manager Dispatcher maintains separate state for each physical CPU. When pKVM boots secondary CPUs via PSCI, each one enters S-EL2 on whichever physical core it lands on. My SPMC must register a secondary entry point (&lt;code&gt;FFA_SECONDARY_EP_REGISTER&lt;/code&gt;), allocate per-CPU stacks (3 × 32KB), and run a full event loop on every core. If any CPU skips its &lt;code&gt;FFA_MSG_WAIT&lt;/code&gt; handshake, SPMD blocks the entire PSCI boot sequence. This is documented nowhere except TF-A's source code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;S-EL2 Stage-1 MMU and the NS bit.&lt;/strong&gt; The Secure world has its own physical address space. When S-EL2 writes to address &lt;code&gt;0x42a16000&lt;/code&gt; with the MMU off, it hits the &lt;em&gt;Secure&lt;/em&gt; alias. pKVM's RX buffer is at the same address in the &lt;em&gt;Non-Secure&lt;/em&gt; alias. Different memory. I had to enable an S-EL2 Stage-1 identity map where all Normal world DRAM is marked &lt;code&gt;NS=1&lt;/code&gt; to force writes to the correct alias. (More on this in War Stories.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-CPU cache coherency.&lt;/strong&gt; pKVM writes a descriptor to its TX buffer on CPU 0, then issues an SMC. SPMD routes the call to S-EL2 on whichever CPU happens to be running — potentially CPU 2 with a stale L1 cache line. Even after adding &lt;code&gt;DSB SY&lt;/code&gt; barriers, I had to copy the descriptor to a local stack buffer before parsing it. Reading directly from the cross-world buffer produced data aborts from corrupt pointer arithmetic.&lt;/p&gt;

&lt;p&gt;On &lt;code&gt;make run-pkvm-ffa-test&lt;/code&gt;, the full TF-A boot chain comes up, then pKVM initializes, and our kernel module exercises every FF-A path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;[SPMC] SP1 booted, now Idle (FFA_MSG_WAIT received)
[SPMC] SP2 booted, now Idle (FFA_MSG_WAIT received)
[SPMC] SP3 booted, now Idle (FFA_MSG_WAIT received)
[SPMC] Secondary EP registered with SPMD
&lt;/span&gt;&lt;span class="c"&gt;...
&lt;/span&gt;&lt;span class="go"&gt;Protected hVHE mode initialized successfully
&lt;/span&gt;&lt;span class="c"&gt;...
&lt;/span&gt;&lt;span class="go"&gt;ffa_test: Sending DIRECT_REQ to SP 0x8001...
ffa_test:   x3=0xaaaa x4=0xcbbb x5=0xcccc x6=0xdddd x7=0xeeee
ffa_test: [PASS] DIRECT_REQ to SP 0x8001 returns success
ffa_test: [PASS] SP 0x8001 x4 = 0xBBBB + 0x1000
&lt;/span&gt;&lt;span class="c"&gt;...
&lt;/span&gt;&lt;span class="go"&gt;ffa_test: [PASS] Shared page == 0xCAFEFACE (SP wrote it)
ffa_test: [PASS] MEM_RECLAIM returns success
&lt;/span&gt;&lt;span class="c"&gt;...
&lt;/span&gt;&lt;span class="go"&gt;ffa_test: [PASS] SP1→SP3 relay chain returns success
ffa_test: [PASS] SP1→SP2 Secure DRAM share verified
ffa_test:   Results: 35/35 PASS
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Rust at Exception Level 2
&lt;/h2&gt;

&lt;p&gt;Secure Partition lifecycle is a state machine: Reset → Idle → Running → Blocked → Preempted. In C, this would probably be an integer plus a set of invariants everyone has to remember. In Rust:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;SpState&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Idle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Running&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Blocked&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Preempted&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When I added the Blocked → Preempted edge for chain preemption during SP-to-SP messaging, the compiler forced me to revisit every transition. That flushed out two bugs before I ever ran the code.&lt;/p&gt;
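
&lt;p&gt;A minimal sketch of why that works, assuming transitions are written as exhaustive &lt;code&gt;match&lt;/code&gt; expressions with no wildcard arm. The method names &lt;code&gt;on_interrupt&lt;/code&gt; and &lt;code&gt;on_call_out&lt;/code&gt; are hypothetical and not taken from the project; only the five state names come from the enum above.&lt;/p&gt;

```rust
/// Lifecycle states, as named in the text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SpState {
    Reset,
    Idle,
    Running,
    Blocked,
    Preempted,
}

impl SpState {
    /// State after a Normal world interrupt, or `None` if the SP is not
    /// part of an active call chain. (Hypothetical helper.)
    fn on_interrupt(self) -> Option<SpState> {
        match self {
            SpState::Running => Some(SpState::Preempted),
            // Chain preemption: a caller waiting on a callee is preempted too.
            SpState::Blocked => Some(SpState::Preempted),
            SpState::Reset | SpState::Idle | SpState::Preempted => None,
        }
    }

    /// State after the SP issues an outgoing DIRECT_REQ to another SP.
    /// (Hypothetical helper.)
    fn on_call_out(self) -> Option<SpState> {
        match self {
            SpState::Running => Some(SpState::Blocked),
            SpState::Reset | SpState::Idle | SpState::Blocked | SpState::Preempted => None,
        }
    }
}
```

&lt;p&gt;Because every arm names its states explicitly, adding a sixth variant to &lt;code&gt;SpState&lt;/code&gt; would fail to compile until each of these functions says what happens to it.&lt;/p&gt;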

&lt;p&gt;My &lt;code&gt;Cargo.toml&lt;/code&gt; has one dependency: &lt;code&gt;fdt = "0.1.5"&lt;/code&gt;. Everything else — page tables, GIC emulation, virtio drivers, the SPMC event loop — is hand-written. The &lt;code&gt;alloc&lt;/code&gt; crate gives me &lt;code&gt;Box&lt;/code&gt; and &lt;code&gt;Vec&lt;/code&gt; backed by a bump allocator. Enum dispatch replaces trait objects for zero-cost MMIO routing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Highlights
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Stage-2 Page Table Tricks
&lt;/h3&gt;

&lt;p&gt;ARM's Stage-2 translation maps guest physical addresses to real physical addresses. I use identity mapping but repurpose the software-defined PTE bits for ownership tracking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PTE bits [56:55]:
  00 = Owned          (page belongs to this VM)
  01 = SharedOwned    (shared out, sender retains ownership)
  10 = SharedBorrowed (mapped from another VM/SP)
  11 = Donated        (irrevocably transferred)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This mirrors pKVM's model. When VM 0 shares a page with SP1: validate ownership (SW bits = &lt;code&gt;00&lt;/code&gt;), set to SharedOwned (&lt;code&gt;01&lt;/code&gt;) + read-only, map into SP1's Secure Stage-2 as SharedBorrowed (&lt;code&gt;10&lt;/code&gt;). On reclaim: validate SP1 has relinquished, restore to Owned + read-write.&lt;/p&gt;
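
&lt;p&gt;The encoding above can be sketched as follows. The bit positions and the four state names come from the table; the function names are illustrative, and the read-only/read-write permission change is deliberately left out because it lives in different descriptor bits.&lt;/p&gt;

```rust
/// Software-defined bits [56:55] of a Stage-2 descriptor.
const SW_SHIFT: u32 = 55;
const SW_MASK: u64 = 0b11 << SW_SHIFT;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PageOwnership {
    Owned = 0b00,
    SharedOwned = 0b01,
    SharedBorrowed = 0b10,
    Donated = 0b11,
}

/// Decode the ownership state stored in a descriptor.
fn ownership(pte: u64) -> PageOwnership {
    match (pte & SW_MASK) >> SW_SHIFT {
        0b00 => PageOwnership::Owned,
        0b01 => PageOwnership::SharedOwned,
        0b10 => PageOwnership::SharedBorrowed,
        _ => PageOwnership::Donated,
    }
}

/// Return `pte` with only the ownership bits replaced.
fn with_ownership(pte: u64, state: PageOwnership) -> u64 {
    (pte & !SW_MASK) | ((state as u64) << SW_SHIFT)
}

/// Share step: only a page that is currently Owned may be shared out.
fn share_out(pte: u64) -> Option<u64> {
    if ownership(pte) == PageOwnership::Owned {
        Some(with_ownership(pte, PageOwnership::SharedOwned))
    } else {
        None
    }
}
```

&lt;p&gt;Sharing a page that is already &lt;code&gt;SharedOwned&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt;, and reclaim is &lt;code&gt;with_ownership(pte, PageOwnership::Owned)&lt;/code&gt; once the borrower has relinquished.&lt;/p&gt;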

&lt;p&gt;The Stage-2 walker reconstructs itself from &lt;code&gt;VTTBR_EL2&lt;/code&gt; at SMC handling time — it walks and modifies PTEs without owning the page table memory. The SPMC can manipulate any VM's page tables by just knowing the L0 table physical address.&lt;/p&gt;

&lt;h3&gt;
  
  
  SP-to-SP Messaging and Cycle Detection
&lt;/h3&gt;

&lt;p&gt;Secure Partitions can message each other. In the relay test, SP1 sends a DIRECT_REQ to SP3, SP3 responds to SP1, and SP1 then responds to the Normal world. The SPMC routes each hop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NWd → SP1 runs → DIRECT_REQ(SP3) → SP3 runs
    → DIRECT_RESP(SP1) → SP1 resumes → DIRECT_RESP(NWd)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each SP making an outgoing call transitions from Running to Blocked. The SPMC maintains a CallStack and checks for cycles: SP1 → SP3 → SP1 returns &lt;code&gt;FFA_BUSY&lt;/code&gt;. Without this, deadlock.&lt;/p&gt;
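
&lt;p&gt;One way to picture the cycle check is a fixed-capacity stack of partition IDs that rejects any target already in the chain. This is a sketch under assumptions: the type layout, the depth limit of 4, and the &lt;code&gt;CallError&lt;/code&gt; names are invented for illustration, with &lt;code&gt;Busy&lt;/code&gt; standing in for the &lt;code&gt;FFA_BUSY&lt;/code&gt; return.&lt;/p&gt;

```rust
/// Maximum chain depth (an assumption for this sketch).
const MAX_DEPTH: usize = 4;

#[derive(Debug, PartialEq, Eq)]
enum CallError {
    /// Target is already in the chain: stands in for FFA_BUSY.
    Busy,
    /// Chain is deeper than the fixed capacity.
    TooDeep,
}

/// Fixed-size stack of SP IDs, so no heap allocation is needed.
struct CallStack {
    ids: [u16; MAX_DEPTH],
    len: usize,
}

impl CallStack {
    const fn new() -> Self {
        CallStack { ids: [0; MAX_DEPTH], len: 0 }
    }

    /// Record a dispatch to `target`; a repeated ID means a cycle.
    fn push(&mut self, target: u16) -> Result<(), CallError> {
        if self.ids[..self.len].contains(&target) {
            return Err(CallError::Busy);
        }
        if self.len == MAX_DEPTH {
            return Err(CallError::TooDeep);
        }
        self.ids[self.len] = target;
        self.len += 1;
        Ok(())
    }

    /// Unwind one hop when the callee sends its DIRECT_RESP.
    fn pop(&mut self) -> Option<u16> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        Some(self.ids[self.len])
    }
}
```

&lt;p&gt;With SP1 as &lt;code&gt;0x8001&lt;/code&gt; (the ID in the test log) and SP3 assumed to be &lt;code&gt;0x8003&lt;/code&gt;, pushing &lt;code&gt;0x8001&lt;/code&gt;, then &lt;code&gt;0x8003&lt;/code&gt;, then &lt;code&gt;0x8001&lt;/code&gt; again fails on the third push with &lt;code&gt;CallError::Busy&lt;/code&gt;.&lt;/p&gt;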

&lt;p&gt;The tricky part is preemption. A Normal world interrupt arrives while SP3 is running mid-chain. The SPMC transitions SP3 from Running to Preempted, SP1 from Blocked to Preempted (chain preemption), and returns &lt;code&gt;FFA_INTERRUPT&lt;/code&gt;. When the Normal world later calls &lt;code&gt;FFA_RUN&lt;/code&gt;, the entire chain resumes.&lt;/p&gt;

&lt;h3&gt;
  
  
  The &lt;code&gt;handle_sp_exit()&lt;/code&gt; Loop
&lt;/h3&gt;

&lt;p&gt;This is the heart of the SPMC. When the SPMC dispatches to an SP, the SP runs until it traps — but the trap might not be a response. It could be a memory operation, a log message, or a call to another SP.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;loop&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;enter_guest&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;  &lt;span class="c1"&gt;// ERET to S-EL1&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;exit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;decode_exit&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;exit&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;FFA_MSG_SEND_DIRECT_RESP&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;FFA_MEM_RETRIEVE_REQ&lt;/span&gt;    &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="n"&gt;locally&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;enter&lt;/span&gt; &lt;span class="n"&gt;SP&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;FFA_MEM_RELINQUISH&lt;/span&gt;      &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="n"&gt;locally&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;enter&lt;/span&gt; &lt;span class="n"&gt;SP&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;FFA_MEM_SHARE&lt;/span&gt;           &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="n"&gt;share&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;enter&lt;/span&gt; &lt;span class="n"&gt;SP&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;FFA_CONSOLE_LOG&lt;/span&gt;         &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;UART&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;enter&lt;/span&gt; &lt;span class="n"&gt;SP&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;FFA_MSG_SEND_DIRECT_REQ&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;dispatch&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt; &lt;span class="n"&gt;SP&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;enter&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The SP doesn't know its RETRIEVE_REQ is handled locally rather than going to another entity. It does an SMC, gets a result, and continues. This is what makes E2E memory sharing work: the Normal world shares a page, SP1 retrieves it (in-loop), writes &lt;code&gt;0xCAFEFACE&lt;/code&gt;, relinquishes (in-loop), and responds — all within a single dispatch.&lt;/p&gt;
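
&lt;p&gt;To make the control flow concrete, here is a host-runnable sketch of the same loop shape. It is not the SPMC's code: &lt;code&gt;SpExit&lt;/code&gt;, &lt;code&gt;dispatch_until_response&lt;/code&gt;, and the closure that stands in for "ERET into the SP and decode the next trap" are all hypothetical, and local handling is reduced to a counter.&lt;/p&gt;

```rust
/// Reasons an SP can trap back to the SPMC (simplified, hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SpExit {
    DirectResp(u64),
    MemRetrieveReq,
    MemRelinquish,
    ConsoleLog,
    Unknown,
}

/// Re-enter the SP until it produces a terminal exit.
/// `enter_sp` stands in for entering the guest and decoding its next trap.
/// Returns the response value and how many exits were handled locally.
fn dispatch_until_response(
    mut enter_sp: impl FnMut() -> SpExit,
) -> Result<(u64, u32), SpExit> {
    let mut handled_locally = 0u32;
    loop {
        match enter_sp() {
            // Terminal: hand the response back to the original caller.
            SpExit::DirectResp(value) => return Ok((value, handled_locally)),
            // Non-terminal: service the call, then loop to re-enter the SP.
            SpExit::MemRetrieveReq | SpExit::MemRelinquish | SpExit::ConsoleLog => {
                handled_locally += 1;
            }
            // Anything else aborts the dispatch with an error.
            other => return Err(other),
        }
    }
}
```

&lt;p&gt;Feeding it the sequence retrieve, relinquish, response models the memory-sharing round trip described above: two exits are serviced in-loop and the caller only ever sees the final response.&lt;/p&gt;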

&lt;h2&gt;
  
  
  War Stories
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Silent SIMD Trap
&lt;/h3&gt;

&lt;p&gt;Week 4. The SPMC boots fine in release mode but hangs on the first &lt;code&gt;read_volatile&lt;/code&gt; in debug. No output, no fault, nothing.&lt;/p&gt;

&lt;p&gt;After a few hours with GDB, I found the CPU stuck in an EL3 exception handler. ESR showed an FP/SIMD trap. But my code doesn't use floating point.&lt;/p&gt;

&lt;p&gt;Rust's debug-mode codegen will happily emit NEON instructions for things that look unrelated. In my case, the alignment check inside &lt;code&gt;read_volatile&lt;/code&gt; compiled to &lt;code&gt;cnt v0.8b, v0.8b&lt;/code&gt; — a SIMD population count. TF-A's default &lt;code&gt;CPTR_EL3.TFP=1&lt;/code&gt; traps all floating-point and SIMD from every exception level. EL3's handler wasn't prepared for that trap, so it looped forever.&lt;/p&gt;

&lt;p&gt;What fixed it was one build flag: &lt;code&gt;CTX_INCLUDE_FPREGS=1&lt;/code&gt;. It was a good reminder that once you're running below an OS, your compiler's codegen is part of the hardware contract.&lt;/p&gt;

&lt;h3&gt;
  
  
  The NS Bit and the Invisible Write
&lt;/h3&gt;

&lt;p&gt;Week 8. &lt;code&gt;PARTITION_INFO_GET&lt;/code&gt; works perfectly from our BL33 test harness. The SPMC writes SP descriptors to the caller's RX buffer, caller reads them back. 24 bytes per partition, everything checks out.&lt;/p&gt;

&lt;p&gt;Then pKVM calls the same function. Same code path, same descriptor format. pKVM reads... all zeros.&lt;/p&gt;

&lt;p&gt;The write succeeded (no fault). The address was correct (verified in GDB). But the data wasn't there.&lt;/p&gt;

&lt;p&gt;ARM has two physical address spaces. When S-EL2 runs with the MMU off, all memory accesses go through the Secure physical address space. pKVM's buffer is at &lt;code&gt;0x42a16000&lt;/code&gt; in Non-Secure DRAM. The write hits &lt;code&gt;0x42a16000&lt;/code&gt; Secure. pKVM reads from &lt;code&gt;0x42a16000&lt;/code&gt; Non-Secure. Different memory.&lt;/p&gt;

&lt;p&gt;What fixed it was enabling an S-EL2 Stage-1 MMU with an identity map where all Normal world DRAM has the &lt;code&gt;NS=1&lt;/code&gt; attribute bit. I've worked with ARM for years and still hadn't fully internalized that Secure/Non-Secure is a &lt;em&gt;physical address space split&lt;/em&gt;, not just a permission model. In QEMU, there's literally twice the memory at the same addresses, selected by one bit.&lt;/p&gt;
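
&lt;p&gt;For illustration, a Stage-1 block descriptor with the NS bit set might be built like this. The bit positions follow the VMSAv8-64 Stage-1 descriptor format (NS is bit 5 in Secure translation regimes); the constant and function names, the 1GB block size, and the attribute choices are assumptions, not the project's page-table code.&lt;/p&gt;

```rust
const DESC_BLOCK: u64 = 0b01;            // bits[1:0]: valid block descriptor
const DESC_ATTR_IDX_SHIFT: u32 = 2;      // AttrIndx[2:0], selects a MAIR slot
const DESC_NS: u64 = 1 << 5;             // output address is in the Non-secure PA space
const DESC_SH_INNER: u64 = 0b11 << 8;    // inner shareable
const DESC_AF: u64 = 1 << 10;            // access flag
const BLOCK_1G: u64 = 1 << 30;

/// Build an identity-mapped 1GB block descriptor covering `pa`.
/// `non_secure` selects which physical address space the block targets.
fn identity_block(pa: u64, non_secure: bool, attr_idx: u64) -> u64 {
    let base = pa & !(BLOCK_1G - 1);
    let ns = if non_secure { DESC_NS } else { 0 };
    base | ns
        | DESC_AF
        | DESC_SH_INNER
        | ((attr_idx & 0b111) << DESC_ATTR_IDX_SHIFT)
        | DESC_BLOCK
}
```

&lt;p&gt;For the buffer at &lt;code&gt;0x42a16000&lt;/code&gt;, &lt;code&gt;identity_block(0x42a1_6000, true, 0)&lt;/code&gt; yields a descriptor based at &lt;code&gt;0x4000_0000&lt;/code&gt; with bit 5 set; the same call with &lt;code&gt;false&lt;/code&gt; targets the Secure alias of the same addresses.&lt;/p&gt;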

&lt;h3&gt;
  
  
  The Stale Cache and the Phantom Data Abort
&lt;/h3&gt;

&lt;p&gt;Week 11. pKVM's MEM_SHARE works 70% of the time. The other 30%, the SPMC crashes with a Data Abort at a pointer address like &lt;code&gt;0x240f&lt;/code&gt; — clearly not a valid physical address.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;addr2line&lt;/code&gt; traced it to &lt;code&gt;parse_mem_region&lt;/code&gt; in my descriptor parser. The descriptor's &lt;code&gt;composite_offset&lt;/code&gt; field, which should be 80, was reading as garbage. The SPMC was dereferencing &lt;code&gt;base + garbage&lt;/code&gt; and faulting.&lt;/p&gt;

&lt;p&gt;The descriptor lived in pKVM's TX buffer — Normal world DRAM. pKVM writes it on CPU 0, issues an SMC, SPMD context-switches to S-EL2 on CPU 2. Even though ARM's memory model guarantees the SMC acts as a barrier for the issuing CPU, the &lt;em&gt;receiving&lt;/em&gt; CPU might still have a stale L1 cache line.&lt;/p&gt;

&lt;p&gt;I first added &lt;code&gt;DSB SY&lt;/code&gt; (Data Synchronization Barrier, full system scope) before every cross-world buffer read. It still crashed. The barrier improves visibility, but the buffer itself is in Non-Secure DRAM that the SPMC accesses through the NS=1 Stage-1 mapping. From the SPMC's point of view, that was still not enough to make the parse reliable.&lt;/p&gt;

&lt;p&gt;What finally made it reliable was copying the entire descriptor to a local stack buffer before parsing it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;unsafe&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nn"&gt;core&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;arch&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nd"&gt;asm!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"dsb sy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;options&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nostack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nomem&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;local_buf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0u8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="k"&gt;unsafe&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nn"&gt;core&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;copy_nonoverlapping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;tx_pa&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;local_buf&lt;/span&gt;&lt;span class="nf"&gt;.as_mut_ptr&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;total_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// Parse from local_buf, never from the shared buffer&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parse_mem_region&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_buf&lt;/span&gt;&lt;span class="nf"&gt;.as_ptr&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;total_length&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, if the copy still captures stale data, the bounds checks in &lt;code&gt;parse_mem_region&lt;/code&gt; reject it cleanly instead of chasing a wild pointer into Secure memory. In practice that took the crash rate from about 30% to zero.&lt;/p&gt;
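
&lt;p&gt;One detail worth guarding in a copy like this is the length: &lt;code&gt;total_length&lt;/code&gt; comes from the other world, so it should be checked against the local buffer before copying. A minimal sketch, assuming a 4096-byte limit; &lt;code&gt;snapshot_descriptor&lt;/code&gt; and &lt;code&gt;MAX_DESC&lt;/code&gt; are hypothetical names, not the project's API.&lt;/p&gt;

```rust
/// Size of the local snapshot buffer (matches the 4096-byte buffer above).
const MAX_DESC: usize = 4096;

/// Copy a descriptor out of a shared buffer before parsing it.
/// Returns `None` instead of overrunning `local` when the claimed
/// length is larger than the buffer.
///
/// # Safety
/// `shared` must be valid for reads of `total_length` bytes.
unsafe fn snapshot_descriptor(
    shared: *const u8,
    total_length: usize,
    local: &mut [u8; MAX_DESC],
) -> Option<&[u8]> {
    if total_length > MAX_DESC {
        return None;
    }
    unsafe {
        core::ptr::copy_nonoverlapping(shared, local.as_mut_ptr(), total_length);
    }
    // Only the copied prefix is handed to the parser.
    Some(&local[..total_length])
}
```

&lt;p&gt;The parser then works on the returned slice, so every later offset check is against a length that can no longer change underneath it.&lt;/p&gt;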

&lt;h3&gt;
  
  
  SPMD Is Per-CPU (or: Read the Firmware Source)
&lt;/h3&gt;

&lt;p&gt;Week 7. pKVM boots fine on CPU 0. Secondary CPUs hang.&lt;/p&gt;

&lt;p&gt;The FF-A spec describes SPMC init but says almost nothing about secondary CPUs. After reading TF-A's &lt;code&gt;spmd_cpu_on_finish_handler()&lt;/code&gt;, I found it: SPMD maintains &lt;em&gt;entirely separate state&lt;/em&gt; per physical CPU. Each secondary entering S-EL2 must call &lt;code&gt;FFA_MSG_WAIT&lt;/code&gt; — a handshake that signals "this CPU's Secure world is ready." Without it, SPMD never completes the PSCI CPU_ON call, so the Normal world secondary never boots either.&lt;/p&gt;

&lt;p&gt;My initial code had secondary CPUs do &lt;code&gt;WFE&lt;/code&gt; (wait for event) after basic init. That's the Normal world pattern. But SPMD needs its per-CPU handshake, per-CPU stacks (3 × 32KB in &lt;code&gt;.bss&lt;/code&gt;), and a full event loop on each secondary. The eventual fix was registering &lt;code&gt;FFA_SECONDARY_EP_REGISTER&lt;/code&gt; during init and giving each secondary its own stack and event loop. The FF-A spec tells you &lt;em&gt;what&lt;/em&gt; has to happen; TF-A's source code is where I found &lt;em&gt;how&lt;/em&gt; it actually has to be wired up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Without an OS
&lt;/h2&gt;

&lt;p&gt;All tests run on bare metal. No test harness, no OS, no &lt;code&gt;#[test]&lt;/code&gt;. The binary calls each test suite sequentially, printing &lt;code&gt;[PASS]&lt;/code&gt; or &lt;code&gt;[FAIL]&lt;/code&gt; to UART.&lt;/p&gt;
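
&lt;p&gt;The shape of such a runner is small. The sketch below is an assumption about how it could look rather than the project's code: &lt;code&gt;run_suites&lt;/code&gt; and &lt;code&gt;TestFn&lt;/code&gt; are made-up names, and any &lt;code&gt;core::fmt::Write&lt;/code&gt; sink stands in for the UART.&lt;/p&gt;

```rust
use core::fmt::Write;

/// A test suite is a plain function that reports success or failure.
type TestFn = fn() -> bool;

/// Run each suite in order, print one [PASS]/[FAIL] line per suite to
/// `out`, and return the number of failures.
fn run_suites(out: &mut impl Write, suites: &[(&str, TestFn)]) -> usize {
    let mut failed = 0;
    for (name, test) in suites {
        let ok = test();
        if !ok {
            failed += 1;
        }
        // Output errors are ignored: there is nowhere else to report them.
        let _ = writeln!(out, "[{}] {}", if ok { "PASS" } else { "FAIL" }, name);
    }
    failed
}
```

&lt;p&gt;On the target, &lt;code&gt;out&lt;/code&gt; would be a UART writer; on a host, a &lt;code&gt;String&lt;/code&gt; works as the sink, which makes the runner itself easy to check.&lt;/p&gt;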

&lt;p&gt;For integration tests, the BL33 binary is a 500-line assembly program that sends 20 FF-A calls through real TF-A firmware and validates each response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;  Test 1: FFA_VERSION .............. PASS
  Test 2: FFA_ID_GET ............... PASS
&lt;/span&gt;&lt;span class="c"&gt;  ...
&lt;/span&gt;&lt;span class="go"&gt;  Test 13: MEM_SHARE lifecycle E2E . PASS
  Test 14: Alternating SP1/SP2 ..... PASS
&lt;/span&gt;&lt;span class="gp"&gt;  Test 17: SP-&amp;gt;&lt;/span&gt;SP relay chain ...... PASS
&lt;span class="go"&gt;  Test 18: Cycle detection ......... PASS
  Test 19: SP-to-SP MEM_SHARE ...... PASS
  Test 20: SP-to-SP MEM_RECLAIM .... PASS

  All tests complete.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For pKVM E2E tests, &lt;code&gt;ffa_test.ko&lt;/code&gt; is a Linux kernel module that does the same through pKVM's FF-A proxy.&lt;/p&gt;

&lt;p&gt;There's no mocking here. The BL33 tests go through real TF-A at EL3. The pKVM tests traverse pKVM at NS-EL2, SPMD at EL3, our SPMC at S-EL2, and SPs at S-EL1. If any layer is broken, the test fails.&lt;/p&gt;

&lt;h2&gt;
  
  
  Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Rust source&lt;/td&gt;
&lt;td&gt;26,000 lines (96 files)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ARM64 assembly&lt;/td&gt;
&lt;td&gt;3,400 lines (9 files)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unit test assertions&lt;/td&gt;
&lt;td&gt;457&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BL33 integration tests&lt;/td&gt;
&lt;td&gt;20/20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pKVM E2E tests&lt;/td&gt;
&lt;td&gt;35/35&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dependencies&lt;/td&gt;
&lt;td&gt;1 (&lt;code&gt;fdt&lt;/code&gt; crate)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dev time&lt;/td&gt;
&lt;td&gt;~10 weeks (solo)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binary size&lt;/td&gt;
&lt;td&gt;230KB (release, SPMC)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The big remaining piece is ARM's &lt;strong&gt;Realm Management Extension&lt;/strong&gt; (RME), the hardware extension underneath ARM CCA (Confidential Compute Architecture). RME adds a Realm world, alongside a Root world for EL3 firmware, with hardware-enforced memory isolation. A Realm VM's memory is inaccessible to both the Normal world hypervisor &lt;em&gt;and&lt;/em&gt; the Secure world firmware.&lt;/p&gt;

&lt;p&gt;The SPMC infrastructure (Stage-2 management, FF-A messaging, multi-CPU dispatch) provides a solid foundation, but RME requires Granule Protection Tables managed at EL3, a Realm Management Monitor at Realm EL2 that implements the Realm Management Interface, and guest attestation. That is a significant step up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/willamhou/hypervisor
&lt;span class="nb"&gt;cd &lt;/span&gt;hypervisor
make run          &lt;span class="c"&gt;# 34 test suites, ~5 seconds on QEMU&lt;/span&gt;
make run-linux    &lt;span class="c"&gt;# boots Linux 6.12 to shell&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://willamhou.github.io/hypervisor/" rel="noopener noreferrer"&gt;Blog version&lt;/a&gt; · &lt;a href="https://github.com/willamhou/hypervisor" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For &lt;code&gt;make run-spmc&lt;/code&gt; and &lt;code&gt;make run-pkvm-ffa-test&lt;/code&gt;, you'll need TF-A and (for pKVM) the AOSP kernel — both build via Docker. The full build takes ~30 minutes the first time. See the &lt;a href="https://github.com/willamhou/hypervisor" rel="noopener noreferrer"&gt;README&lt;/a&gt; for details.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Rust nightly, QEMU 9.2, and a lot of time spent cross-checking the ARM ARM.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rust</category>
      <category>arm</category>
      <category>hypervisor</category>
      <category>embedded</category>
    </item>
    <item>
      <title>Is that MCP request actually from your AI agent?</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Wed, 08 Apr 2026 10:46:39 +0000</pubDate>
      <link>https://forem.com/willamhou/is-that-mcp-request-actually-from-your-ai-agent-3m28</link>
      <guid>https://forem.com/willamhou/is-that-mcp-request-actually-from-your-ai-agent-3m28</guid>
      <description>&lt;h1&gt;
  
  
  Is that MCP request actually from your AI agent?
&lt;/h1&gt;

&lt;p&gt;Last week we open-sourced Signet — cryptographic signing for every AI agent tool call.&lt;/p&gt;

&lt;p&gt;Someone asked a good question: &lt;strong&gt;the agent signs, but does the server verify?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Fair point. v0.1 was one-sided. The agent signed every request, but the server didn't check. Like mailing a signed contract that nobody verifies on the other end. Better than nothing, but the trust chain is broken.&lt;/p&gt;

&lt;p&gt;This week we closed the loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  From one-sided to bilateral
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;v0.1  Agent signs (one-sided)
      → proves the agent sent the request

v0.2  Compound receipt (request + response bound)
      → proves the request and response are paired

v0.3  Server verification
      → server can verify "this request was signed by a specific agent"

v0.4  Bilateral co-signing
      → agent signs the request, server signs the response, both hold proof
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Server verification: 3 lines
&lt;/h2&gt;

&lt;p&gt;Add this to your MCP server handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;verifyRequest&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@signet-auth/mcp-server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CallToolRequestSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;verifyRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;trustedKeys&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ed25519:...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;maxAge&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt; &lt;span class="na"&gt;isError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// verified — proceed&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Request from: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signerName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One function. Six checks: signature validity, trusted key list, freshness window (anti-replay), target binding (anti-cross-server), tool name match, params match. ~50μs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bilateral co-signing: 3 more lines
&lt;/h2&gt;

&lt;p&gt;After verifying and processing the request, the server signs the response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;signResponse&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@signet-auth/mcp-server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bilateral&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;signResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;serverKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SIGNET_SERVER_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;serverName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;my-server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This produces a BilateralReceipt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"v"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agent_receipt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"v"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"create_issue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"params_hash"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"signer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pubkey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ed25519:..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"deploy-bot"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"sig"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ed25519:..."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"response"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content_hash"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pubkey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ed25519:..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"github-mcp"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ts_response"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sig"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ed25519:..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two signatures, two keys, one record. The agent says "I sent this request." The server says "I received this request and returned this response." Both parties hold proof. Neither can deny it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why bilateral matters
&lt;/h2&gt;

&lt;p&gt;One-sided signing proves "what the agent did." Bilateral signing proves "what happened between the agent and the server."&lt;/p&gt;

&lt;p&gt;The difference:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scenario: an agent places a purchase order, but the amount is wrong.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One-sided: the agent says "I sent price=100." But the server might have received price=10000. You can't prove what happened in between.&lt;/p&gt;

&lt;p&gt;Bilateral: the agent's signature binds the params_hash. The server's signature binds the response content_hash. Any tampering in between breaks one of the signatures. Both parties signed the same facts — when something goes wrong, the signatures tell you who's right.&lt;/p&gt;

&lt;h2&gt;
  
  
  Python support
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;willamhou&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;receipt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_issue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix bug&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Verify a bilateral receipt
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;verify_bilateral&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;verify_bilateral&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bilateral_receipt_json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_pubkey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;server_pubkey&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Where this fits in the MCP ecosystem
&lt;/h2&gt;

&lt;p&gt;In the MCP spec discussion (SEP-1763), five independent projects were identified as layers of an enforcement stack:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Transport integrity&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Signet&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent signs request, server verifies + co-signs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Policy enforcement&lt;/td&gt;
&lt;td&gt;APS&lt;/td&gt;
&lt;td&gt;Authorization checks, policy engine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;External anchoring&lt;/td&gt;
&lt;td&gt;ArkForge&lt;/td&gt;
&lt;td&gt;Receipt anchoring to transparency logs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spend control&lt;/td&gt;
&lt;td&gt;AgentPay&lt;/td&gt;
&lt;td&gt;Budget caps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output verification&lt;/td&gt;
&lt;td&gt;veroq&lt;/td&gt;
&lt;td&gt;Response content validation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Signet went from a one-sided signing tool to the transport layer of a bilateral trust protocol. Each layer does its own thing. Together they form a complete agent security stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Claude Code&lt;/strong&gt; (on the official Anthropic plugin marketplace):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga4icw9i3mnyn4kfn1s5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga4icw9i3mnyn4kfn1s5.jpg" alt=" " width="800" height="308"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin &lt;span class="nb"&gt;install &lt;/span&gt;signet@claude-plugins-official
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every tool call is signed and audited. Zero config.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP server&lt;/strong&gt; (on the MCP Registry, Glama indexed):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @signet-auth/mcp-server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Agent side:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# TypeScript&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; @signet-auth/core @signet-auth/mcp

&lt;span class="c"&gt;# Python (LangGraph, LlamaIndex, Pydantic AI, Google ADK, OpenAI Agents, Semantic Kernel, smolagents)&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;signet-auth
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;github.com/Prismer-AI/signet&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;&lt;a href="https://twitter.com/WillamUpUp" rel="noopener noreferrer"&gt;@willamhou&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>rust</category>
      <category>mcp</category>
    </item>
    <item>
      <title>How I Built Cryptographic Signing for Every AI Agent Tool Call</title>
      <dc:creator>willamhou</dc:creator>
      <pubDate>Thu, 02 Apr 2026 02:45:56 +0000</pubDate>
      <link>https://forem.com/willamhou/how-i-built-cryptographic-signing-for-every-ai-agent-tool-call-1f6a</link>
      <guid>https://forem.com/willamhou/how-i-built-cryptographic-signing-for-every-ai-agent-tool-call-1f6a</guid>
      <description>&lt;h1&gt;
  
  
  How I Built Cryptographic Signing for Every AI Agent Tool Call
&lt;/h1&gt;

&lt;p&gt;Your AI agent just mass-deleted a production database. Can you prove exactly what it did? When? Who authorized it?&lt;/p&gt;

&lt;p&gt;You can't. None of the major agent frameworks produce cryptographic evidence of what happened.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;Signet&lt;/a&gt; to fix this. It gives every AI agent an Ed25519 identity and signs every tool call with a tamper-evident receipt. Open source, 3 lines of code to integrate.&lt;/p&gt;

&lt;p&gt;This post covers the design decisions, the crypto choices, and why I built an SDK instead of a proxy.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;MCP (Model Context Protocol) is becoming the standard for agent-tool communication. Claude, Cursor, Windsurf, and dozens of other tools use it. But MCP has no signing, no audit log, and no way to prove which agent did what.&lt;/p&gt;

&lt;p&gt;Real incidents that motivated this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon Kiro deleted a production environment&lt;/li&gt;
&lt;li&gt;Replit agent dropped a live database
&lt;/li&gt;
&lt;li&gt;Supabase MCP leaked tokens via prompt injection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In every case: zero audit trail. The agent did something, and nobody could prove exactly what.&lt;/p&gt;

&lt;h2&gt;
  
  
  Design Decisions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  SDK, not proxy
&lt;/h3&gt;

&lt;p&gt;The existing tools in this space (Aegis, estoppl) use a proxy/gateway model. You deploy a separate process that sits between your agent and the MCP server, intercepting all traffic.&lt;/p&gt;

&lt;p&gt;I chose the opposite: a client-side SDK. Three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Zero infrastructure.&lt;/strong&gt; &lt;code&gt;npm install&lt;/code&gt; or &lt;code&gt;pip install&lt;/code&gt;, not Docker.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;stdio-native.&lt;/strong&gt; Most MCP connections use stdio pipes. Proxying stdio requires process orchestration that adds failure modes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Framework-agnostic.&lt;/strong&gt; Works with any MCP client, any transport, any language.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The tradeoff: a proxy can enforce policies (block unauthorized calls). Signet can't — it's a camera, not a bouncer. Attestation, not prevention. I think both are needed, and they're complementary.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ed25519, not ECDSA or RSA
&lt;/h3&gt;

&lt;p&gt;Ed25519 was the obvious choice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fast.&lt;/strong&gt; ~70,000 signatures/sec on a laptop. Agent tool calls are maybe 10/sec. No bottleneck.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Small.&lt;/strong&gt; 64-byte signatures, 32-byte keys. Fits in JSON metadata without bloat.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic.&lt;/strong&gt; Same key + same message = same signature. No random nonce is needed at signing time: the per-signature nonce is derived by hashing part of the expanded secret key together with the message.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Battle-tested.&lt;/strong&gt; SSH, Signal, age, WireGuard all use it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I use &lt;code&gt;ed25519-dalek&lt;/code&gt; in Rust. The same core compiles to WASM for Node.js and to native for Python via PyO3. One implementation, three languages, zero divergence risk.&lt;/p&gt;

&lt;h3&gt;
  
  
  RFC 8785 (JCS) for canonical JSON
&lt;/h3&gt;

&lt;p&gt;The signature covers the entire receipt body: action, signer, timestamp, nonce. But JSON has no single canonical serialization: &lt;code&gt;{"a":1,"b":2}&lt;/code&gt; and &lt;code&gt;{"b":2,"a":1}&lt;/code&gt; are semantically identical but produce different bytes, and a signature is computed over bytes.&lt;/p&gt;

&lt;p&gt;I use &lt;a href="https://datatracker.ietf.org/doc/html/rfc8785" rel="noopener noreferrer"&gt;RFC 8785 (JSON Canonicalization Scheme)&lt;/a&gt; to solve this. JCS defines a deterministic JSON serialization: sorted keys, no whitespace, specific number formatting. Sign the JCS output, verify against the JCS output. Deterministic.&lt;/p&gt;

&lt;p&gt;Why JCS over alternatives like bencode or CBOR? Because the receipts are JSON, the MCP protocol is JSON, and I didn't want to introduce a second serialization format just for signing.&lt;/p&gt;
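
&lt;p&gt;To make the canonicalization idea concrete, here is a minimal Python sketch using only the standard library. It is an approximation for illustration, not Signet's code and not a full RFC 8785 implementation: the helper name &lt;code&gt;canonical_bytes&lt;/code&gt; is hypothetical, and number formatting and the ordering of non-ASCII keys follow Python's rules rather than the RFC's.&lt;/p&gt;

```python
import hashlib
import json


def canonical_bytes(obj):
    """Approximate JCS: sorted keys, no insignificant whitespace, UTF-8 output.

    Hypothetical helper for illustration only. A real RFC 8785 serializer also
    pins down number formatting and key ordering for non-ASCII keys.
    """
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


a = {"a": 1, "b": 2}
b = {"b": 2, "a": 1}

# Same canonical bytes regardless of insertion order, so the signer and the
# verifier hash (and therefore sign and check) identical input.
assert canonical_bytes(a) == canonical_bytes(b) == b'{"a":1,"b":2}'

digest = "sha256:" + hashlib.sha256(canonical_bytes(a)).hexdigest()
print(digest)
```

&lt;p&gt;The point of the sketch is only that both sides must agree on one byte sequence before hashing or signing. In production, a dedicated JCS library is the safer choice.&lt;/p&gt;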

&lt;h3&gt;
  
  
  Hash-chained audit log
&lt;/h3&gt;

&lt;p&gt;Every receipt is appended to a local JSONL audit log. Each entry includes the SHA-256 hash of the previous entry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;record_&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;receipt:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;prev_hash:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:0000..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;record_hash:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:abc1..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;record_&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;receipt:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;prev_hash:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:abc1..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;record_hash:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:def2..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;record_&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;receipt:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;prev_hash:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:def2..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;record_hash:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:ghi3..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Delete or modify any record and the chain breaks. &lt;code&gt;signet verify --chain&lt;/code&gt; detects it.&lt;/p&gt;

&lt;p&gt;This is tamper-&lt;em&gt;evident&lt;/em&gt;, not tamper-&lt;em&gt;proof&lt;/em&gt;. Someone with disk access can delete the entire log. The hash chain catches selective editing (changing one record while keeping others). Off-host anchoring (anchoring chain hashes to an external service) is on the roadmap for true tamper-proof guarantees.&lt;/p&gt;
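
&lt;p&gt;The chaining logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual Signet log format: the function names, the all-zero genesis value, and the choice to hash the sorted, compact JSON of &lt;code&gt;receipt&lt;/code&gt; plus &lt;code&gt;prev_hash&lt;/code&gt; are all hypothetical.&lt;/p&gt;

```python
import hashlib
import json

GENESIS = "sha256:" + "0" * 64  # hypothetical prev_hash for the first record


def _digest(receipt, prev_hash):
    # Assumption: record_hash covers the receipt and the previous hash.
    body = json.dumps(
        {"receipt": receipt, "prev_hash": prev_hash},
        sort_keys=True, separators=(",", ":"),
    ).encode("utf-8")
    return "sha256:" + hashlib.sha256(body).hexdigest()


def append(log, receipt):
    """Append a record that commits to the hash of the previous record."""
    prev_hash = log[-1]["record_hash"] if log else GENESIS
    log.append({
        "receipt": receipt,
        "prev_hash": prev_hash,
        "record_hash": _digest(receipt, prev_hash),
    })


def first_broken_index(log):
    """Return the index of the first record that breaks the chain, or None."""
    expected_prev = GENESIS
    for i, rec in enumerate(log):
        if rec["prev_hash"] != expected_prev:
            return i
        if rec["record_hash"] != _digest(rec["receipt"], rec["prev_hash"]):
            return i
        expected_prev = rec["record_hash"]
    return None


log = []
for tool in ("create_issue", "close_issue", "add_label"):
    append(log, {"tool": tool})
assert first_broken_index(log) is None

edited = [dict(r) for r in log]
edited[1]["receipt"] = {"tool": "delete_repo"}  # modify one record
assert first_broken_index(edited) == 1

deleted = log[:1] + log[2:]  # drop the middle record
assert first_broken_index(deleted) == 1
```

&lt;p&gt;In this sketch, cutting records off the end of the log still leaves a valid shorter chain, which is the kind of gap that off-host anchoring is meant to close.&lt;/p&gt;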

&lt;h3&gt;
  
  
  Encrypted key storage
&lt;/h3&gt;

&lt;p&gt;Agent keys are encrypted at rest with Argon2id + XChaCha20-Poly1305:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Argon2id&lt;/strong&gt; for key derivation (OWASP recommended, memory-hard, resists GPU attacks)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XChaCha20-Poly1305&lt;/strong&gt; for encryption (24-byte nonce = safe random nonce generation, AEAD for authenticated encryption)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AAD&lt;/strong&gt; (Additional Authenticated Data) binds the ciphertext to the key's metadata, preventing key-file swaps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unencrypted keys are supported for CI/automation (&lt;code&gt;--unencrypted&lt;/code&gt; flag). Keys are stored at &lt;code&gt;~/.signet/keys/&lt;/code&gt; with &lt;code&gt;0600&lt;/code&gt; permissions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a Receipt Looks Like
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"v"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rec_e7039e7e7714e84f..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"github_create_issue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fix bug"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"details"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"params_hash"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:b878192252cb..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp://github.local"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"transport"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"signer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pubkey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ed25519:0CRkURt/tc6r..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"demo-bot"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"owner"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"willamhou"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-03-29T23:24:03.309Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"nonce"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rnd_dcd4e135799393..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sig"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ed25519:6KUohbnSmehP..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The signature covers &lt;code&gt;v + action + signer + ts + nonce&lt;/code&gt; via JCS. Tamper with any field and verification fails.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;params_hash&lt;/code&gt; is always present. By default, raw params are also stored. For sensitive data, &lt;code&gt;--hash-only&lt;/code&gt; mode stores only the hash, so you can later prove that a given set of params matches the receipt without keeping their content in the log.&lt;/p&gt;
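
&lt;p&gt;As a rough sketch of how a hash-only receipt can still be checked later, assume &lt;code&gt;params_hash&lt;/code&gt; is the SHA-256 of a sorted, compact JSON encoding of the params. That encoding and both helper names are assumptions for illustration, not the documented Signet algorithm.&lt;/p&gt;

```python
import hashlib
import json


def params_hash(params):
    # Assumption: hash the sorted, compact JSON form of the params.
    body = json.dumps(params, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(body).hexdigest()


def matches_receipt(receipt, candidate_params):
    """Check whether candidate params are the ones the receipt committed to."""
    return receipt["action"]["params_hash"] == params_hash(candidate_params)


sent = {"title": "fix bug", "body": "details"}

# A hash-only receipt keeps the commitment but not the raw params.
hash_only_receipt = {
    "action": {"tool": "github_create_issue", "params_hash": params_hash(sent)}
}

# Key order does not matter; any change to a value does.
assert matches_receipt(hash_only_receipt, {"body": "details", "title": "fix bug"})
assert not matches_receipt(hash_only_receipt, {"title": "fix bug", "body": "changed"})
```

&lt;p&gt;Whoever holds the original params can reveal them during a dispute, and anyone can recompute the hash to confirm they match the signed receipt.&lt;/p&gt;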

&lt;h2&gt;
  
  
  Integration: 3 Lines
&lt;/h2&gt;

&lt;h3&gt;
  
  
  TypeScript (MCP)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;SigningTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@signet-auth/mcp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;inner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StdioClientTransport&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;my-mcp-server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SigningTransport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;secretKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;my-agent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every &lt;code&gt;tools/call&lt;/code&gt; request gets a signed receipt injected into &lt;code&gt;params._meta._signet&lt;/code&gt;. MCP servers don't need to change — they ignore unknown fields per the spec.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python — LangChain
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth.langchain&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SignetCallbackHandler&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;willamhou&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SignetCallbackHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Pass handler to your LangChain agent — every tool call is now signed
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Python — CrewAI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth.crewai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;install_hooks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uninstall_hooks&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;willamhou&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;install_hooks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;crew&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kickoff&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# every tool call is now signed
&lt;/span&gt;&lt;span class="nf"&gt;uninstall_hooks&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Python — Low-level API (any framework)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;signet_auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SigningAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;willamhou&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;receipt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;github_create_issue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix bug&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Python bindings are the Rust core compiled via PyO3, so they use the same crypto, behave the same way, and run at native speed. Install them with &lt;code&gt;pip install signet-auth&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;signet identity generate &lt;span class="nt"&gt;--name&lt;/span&gt; my-agent
signet sign &lt;span class="nt"&gt;--key&lt;/span&gt; my-agent &lt;span class="nt"&gt;--tool&lt;/span&gt; &lt;span class="s2"&gt;"github_create_issue"&lt;/span&gt; &lt;span class="nt"&gt;--params&lt;/span&gt; &lt;span class="s1"&gt;'{"title":"fix bug"}'&lt;/span&gt;
signet audit &lt;span class="nt"&gt;--since&lt;/span&gt; 24h
signet verify &lt;span class="nt"&gt;--chain&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
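
&lt;p&gt;The &lt;code&gt;--chain&lt;/code&gt; flag suggests that audit records are linked so that tampering with an earlier record is detectable. How Signet does this internally is not shown here. As a rough illustration of the general idea only, the sketch below builds a SHA-256 hash chain with the Python standard library. The record layout, the &lt;code&gt;GENESIS&lt;/code&gt; value, and the helpers &lt;code&gt;record_hash&lt;/code&gt;, &lt;code&gt;append_record&lt;/code&gt;, and &lt;code&gt;verify_chain&lt;/code&gt; are all hypothetical and are not part of the Signet API.&lt;/p&gt;

```python
import hashlib
import json

# Placeholder "previous hash" for the first record (an assumption of this sketch).
GENESIS = "0" * 64


def record_hash(body):
    """Hash a record body deterministically (sorted keys, compact separators)."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(log, tool, params):
    """Append a record that commits to the hash of the previous record."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"tool": tool, "params": params, "prev": prev}
    log.append({**body, "hash": record_hash(body)})


def verify_chain(log):
    """Return True only if every record's hash and back-link are intact."""
    prev = GENESIS
    for entry in log:
        body = {"tool": entry["tool"], "params": entry["params"], "prev": entry["prev"]}
        if entry["prev"] != prev or entry["hash"] != record_hash(body):
            return False
        prev = entry["hash"]
    return True


log = []
append_record(log, "github_create_issue", {"title": "fix bug"})
append_record(log, "github_create_issue", {"title": "add tests"})
intact = verify_chain(log)

# Simulate someone editing an earlier record after the fact.
log[0]["params"]["title"] = "something else"
tampered_ok = verify_chain(log)
```

&lt;p&gt;In this sketch &lt;code&gt;intact&lt;/code&gt; is true and &lt;code&gt;tampered_ok&lt;/code&gt; is false: editing the first record changes its hash, which no longer matches the stored value.&lt;/p&gt;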



&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;signet/
├── crates/signet-core/       Rust core (one implementation)
├── bindings/
│   ├── signet-ts/            → WASM for Node.js
│   └── signet-py/            → PyO3 for Python
├── packages/
│   ├── @signet-auth/core     TypeScript wrapper
│   └── @signet-auth/mcp      MCP middleware
└── signet-cli/               CLI binary
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key architectural decision: &lt;strong&gt;one Rust implementation, compiled to multiple targets.&lt;/strong&gt; The WASM binding and Python binding call the same &lt;code&gt;signet-core&lt;/code&gt; code. There is no separate TypeScript or Python reimplementation of the signing logic. This eliminates the "two implementations diverge" class of bugs entirely.&lt;/p&gt;

&lt;p&gt;172 tests across Rust (68), Python (85), TypeScript (11), and WASM (8).&lt;/p&gt;

&lt;h2&gt;
  
  
  What Signet Does NOT Do (Yet)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;It proves the agent requested an action, not that the server executed it.&lt;/strong&gt; The receipt says "agent X signed intent to call tool Y with params Z at time T." Whether the server actually did it is a separate question. Server-side verification middleware is on the roadmap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It doesn't prevent bad actions.&lt;/strong&gt; Signet is a camera, not a bouncer. It complements prevention tools like policy engines and firewalls. You want both: stop the bad thing AND have evidence of what happened.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Signer identity is self-asserted.&lt;/strong&gt; The &lt;code&gt;signer.name&lt;/code&gt; and &lt;code&gt;signer.owner&lt;/code&gt; fields are set by the agent. There's no external identity registry (yet) to verify that "demo-bot" actually belongs to "willamhou."&lt;/p&gt;
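
&lt;p&gt;The first and third limitations are easier to see in code. The sketch below is not Signet's implementation. It uses HMAC-SHA256 from the Python standard library as a stand-in for a real signature scheme, purely so it runs without dependencies. An actual signing tool would use an asymmetric key pair so that verifiers never hold the secret. The receipt layout, the &lt;code&gt;ts&lt;/code&gt; field, the example timestamp, and the helpers &lt;code&gt;sign_intent&lt;/code&gt; and &lt;code&gt;verify_intent&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import hashlib
import hmac
import json

# Stand-in for a signing key. HMAC is symmetric and is used here only so the
# sketch runs on the standard library alone.
SECRET = b"demo-key-not-for-real-use"


def canonical(body):
    """Serialize the receipt body deterministically before signing."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_intent(name, owner, tool, params, ts):
    """Build a receipt: who (self-asserted), which tool, which params, when."""
    body = {
        "signer": {"name": name, "owner": owner},  # set by the agent itself
        "tool": tool,
        "params": params,
        "ts": ts,
    }
    sig = hmac.new(SECRET, canonical(body), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}


def verify_intent(receipt):
    """Check that the body is unchanged since signing. Nothing more."""
    expected = hmac.new(SECRET, canonical(receipt["body"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["sig"])


receipt = sign_intent("demo-bot", "willamhou", "github_create_issue",
                      {"title": "fix bug"}, "2026-01-01T00:00:00Z")
valid = verify_intent(receipt)

# Any owner string signs just as well: the signature binds the claim,
# it does not check the claim against a registry.
spoofed = sign_intent("demo-bot", "someone-else", "github_create_issue",
                      {"title": "fix bug"}, "2026-01-01T00:00:00Z")
spoofed_valid = verify_intent(spoofed)

# Changing the params after signing is detected.
receipt["body"]["params"]["title"] = "delete repo"
altered_valid = verify_intent(receipt)
```

&lt;p&gt;Both &lt;code&gt;valid&lt;/code&gt; and &lt;code&gt;spoofed_valid&lt;/code&gt; come out true, while &lt;code&gt;altered_valid&lt;/code&gt; is false. Verification shows that the signed intent was not modified. It says nothing about whether the server executed the call or whether the owner field is truthful.&lt;/p&gt;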

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;signet-auth
&lt;span class="c"&gt;# or&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; @signet-auth/core @signet-auth/mcp
&lt;span class="c"&gt;# or&lt;/span&gt;
cargo build &lt;span class="nt"&gt;--release&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; signet-cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/Prismer-AI/signet" rel="noopener noreferrer"&gt;github.com/Prismer-AI/signet&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dual-licensed under Apache-2.0 and MIT.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Questions about the design decisions, crypto choices, or roadmap? I'm &lt;a href="https://x.com/WillamUpUp" rel="noopener noreferrer"&gt;@willamhou&lt;/a&gt; on Twitter/X.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>opensource</category>
      <category>rust</category>
    </item>
  </channel>
</rss>
