<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Kanish Tyagi</title>
    <description>The latest articles on Forem by Kanish Tyagi (@kanishtyagii).</description>
    <link>https://forem.com/kanishtyagii</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3858655%2Fbb046de4-4496-4a7f-9db5-c1187159439d.jpg</url>
      <title>Forem: Kanish Tyagi</title>
      <link>https://forem.com/kanishtyagii</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/kanishtyagii"/>
    <language>en</language>
    <item>
      <title>Building Trust Between AI Agents — DIDs, Signatures, and Zero-Trust Mesh</title>
      <dc:creator>Kanish Tyagi</dc:creator>
      <pubDate>Wed, 08 Apr 2026 18:39:54 +0000</pubDate>
      <link>https://forem.com/kanishtyagii/building-trust-between-ai-agents-dids-signatures-and-zero-trust-mesh-4m3j</link>
      <guid>https://forem.com/kanishtyagii/building-trust-between-ai-agents-dids-signatures-and-zero-trust-mesh-4m3j</guid>
      <description>&lt;p&gt;Imagine you walk into a room full of strangers. Everyone is wearing a mask. Someone hands you a document and says "sign this — it's from the CEO." How do you know it's actually from the CEO? You don't. You have no way to verify identity.&lt;/p&gt;

&lt;p&gt;This is the trust problem in multi-agent AI systems. And it's more serious than most teams realize.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Agents Need Identity
&lt;/h2&gt;

&lt;p&gt;When a single AI agent operates in isolation, trust is simple — you trust it or you don't. But modern AI systems are increasingly multi-agent. A research agent delegates to a writing agent. A planning agent spawns execution sub-agents. An orchestrator coordinates dozens of specialized agents in parallel.&lt;/p&gt;

&lt;p&gt;In these systems, every interaction between agents is a potential attack surface.&lt;/p&gt;

&lt;p&gt;Consider this scenario: Agent A receives a message claiming to be from Agent B, instructing it to delete a set of records. How does Agent A verify that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The message actually came from Agent B&lt;/li&gt;
&lt;li&gt;Agent B is authorized to make that request&lt;/li&gt;
&lt;li&gt;Agent B hasn't been compromised and is acting within its intended scope&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Without cryptographic identity, the answer to all three is: it can't.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: Cryptographic Agent Identity
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;agent-governance-toolkit&lt;/code&gt; solves this with a trust mesh — a framework where every agent has a verifiable cryptographic identity, and every inter-agent interaction is validated before execution.&lt;/p&gt;

&lt;p&gt;Here's how it works.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ed25519 Key Pairs
&lt;/h3&gt;

&lt;p&gt;Each agent gets an Ed25519 key pair at creation time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.identity&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentIdentity&lt;/span&gt;

&lt;span class="c1"&gt;# Each agent gets a unique cryptographic identity
&lt;/span&gt;&lt;span class="n"&gt;identity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AgentIdentity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-agent-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;read_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;did&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK
&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;public_key_hex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# ed25519 public key
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The DID (Decentralized Identifier) is a self-describing, globally unique identifier derived from the public key. No central registry required.&lt;/p&gt;

&lt;h3&gt;
  
  
  Signed Messages
&lt;/h3&gt;

&lt;p&gt;When Agent A sends a message to Agent B, it signs it with its private key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;did&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;did:key:z6Mk...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analyze_dataset&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payload&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dataset_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ds_001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-04-05T10:00:00Z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;signed_message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Includes: message + signature + public key
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Agent B verifies the signature before acting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.identity&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;verify_message&lt;/span&gt;

&lt;span class="n"&gt;is_valid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;verify_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signed_message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;is_valid&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;SecurityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Message signature invalid — rejecting&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the message was tampered with in transit, the signature check fails. The action is rejected before execution.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trust Scoring: Agents Earn and Lose Trust
&lt;/h2&gt;

&lt;p&gt;Identity verification tells you &lt;em&gt;who&lt;/em&gt; sent a message. Trust scoring tells you &lt;em&gt;whether to act on it&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The toolkit maintains a trust score (0-1000) for each agent in the mesh:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.trust&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TrustEngine&lt;/span&gt;

&lt;span class="n"&gt;trust&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TrustEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# New agent starts with baseline trust
&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-agent-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 500 (baseline)
&lt;/span&gt;
&lt;span class="c1"&gt;# Trust increases with successful verified interactions
&lt;/span&gt;&lt;span class="n"&gt;trust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;record_success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-agent-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-agent-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 510
&lt;/span&gt;
&lt;span class="c1"&gt;# Trust decreases with policy violations
&lt;/span&gt;&lt;span class="n"&gt;trust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;record_violation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-agent-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;severity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;high&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research-agent-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 460
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Agents with low trust scores get restricted automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.trust&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TrustPolicy&lt;/span&gt;

&lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TrustPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;minimum_score&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# below this, agent is quarantined
&lt;/span&gt;    &lt;span class="n"&gt;require_human_approval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# below this, human must approve actions
&lt;/span&gt;    &lt;span class="n"&gt;full_autonomy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# above this, agent operates freely
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is how the system handles compromised agents gracefully. An agent that starts behaving badly loses trust, gets restricted, and eventually gets quarantined — automatically, without human intervention.&lt;/p&gt;




&lt;h2&gt;
  
  
  Delegation Chains: Limited Capability Grants
&lt;/h2&gt;

&lt;p&gt;In multi-agent systems, parent agents often spawn child agents to handle subtasks. The challenge: how do you give a child agent enough capability to do its job without giving it unlimited power?&lt;/p&gt;

&lt;p&gt;The answer is delegation chains — cryptographically signed capability grants with explicit limits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.delegation&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DelegationChain&lt;/span&gt;

&lt;span class="c1"&gt;# Parent agent creates a limited delegation for child
&lt;/span&gt;&lt;span class="n"&gt;delegation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DelegationChain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;parent_identity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;orchestrator_identity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;child_did&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;did:key:z6Mk...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;granted_capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;read_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  &lt;span class="c1"&gt;# subset of parent's capabilities
&lt;/span&gt;    &lt;span class="n"&gt;excluded_capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delete_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;send_email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;max_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# child cannot further delegate more than 2 levels
&lt;/span&gt;    &lt;span class="n"&gt;expires_in_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# delegation expires in 5 minutes
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The child agent presents this delegation when making requests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Child agent acts with delegated authority
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;child_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;read_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/data/analysis.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;delegation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;delegation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The toolkit verifies the entire delegation chain before execution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is the delegation cryptographically valid?&lt;/li&gt;
&lt;li&gt;Is the requested capability within the granted scope?&lt;/li&gt;
&lt;li&gt;Has the delegation expired?&lt;/li&gt;
&lt;li&gt;Is the delegation depth within limits?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If any check fails, the action is rejected. A compromised child agent cannot exceed the capabilities its parent explicitly granted.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Practical Example: 3 Agents Collaborating
&lt;/h2&gt;

&lt;p&gt;Here's how trust verification plays out in a real multi-agent workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.identity&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentIdentity&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.trust&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TrustEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.delegation&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DelegationChain&lt;/span&gt;

&lt;span class="c1"&gt;# Agent 1: Orchestrator (high trust, broad capabilities)
&lt;/span&gt;&lt;span class="n"&gt;orchestrator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AgentIdentity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;orchestrator&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;orchestrator&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;read_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;write_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;send_email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Agent 2: Researcher (medium trust, search only)
&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AgentIdentity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;read_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Agent 3: Writer (medium trust, write only)
&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AgentIdentity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;writer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;writer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;read_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;write_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;trust&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TrustEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;trust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;orchestrator&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;trust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;700&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;trust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;writer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;700&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Orchestrator delegates research task to researcher
&lt;/span&gt;&lt;span class="n"&gt;research_delegation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DelegationChain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;parent_identity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;child_did&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;did&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;granted_capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;max_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;expires_in_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Researcher executes with verified delegation
# Toolkit checks: valid signature + sufficient trust + within capability scope
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;latest AI governance frameworks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;delegation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;research_delegation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Orchestrator delegates writing task to writer
&lt;/span&gt;&lt;span class="n"&gt;write_delegation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DelegationChain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;parent_identity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;child_did&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;did&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;granted_capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;write_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;max_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;expires_in_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Writer cannot exceed delegated scope
# This would be rejected — send_email not in delegation
&lt;/span&gt;&lt;span class="n"&gt;writer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;send_email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;delegation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;write_delegation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# SecurityError: capability 'send_email' not in delegation scope
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every interaction is verified. Every capability grant is explicit and time-limited. No agent can exceed what it was explicitly authorized to do.&lt;/p&gt;




&lt;h2&gt;
  
  
  Comparison With Human Trust Models
&lt;/h2&gt;

&lt;p&gt;The cryptographic trust model mirrors how humans establish trust in high-stakes environments:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Human World&lt;/th&gt;
&lt;th&gt;Agent Mesh&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Government-issued ID&lt;/td&gt;
&lt;td&gt;Ed25519 key pair + DID&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signed contract&lt;/td&gt;
&lt;td&gt;Signed message&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Professional reputation&lt;/td&gt;
&lt;td&gt;Trust score&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Power of attorney&lt;/td&gt;
&lt;td&gt;Delegation chain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Expiry date on credentials&lt;/td&gt;
&lt;td&gt;TTL on delegations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revocation list&lt;/td&gt;
&lt;td&gt;Trust score below threshold&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The difference: human trust systems have gaps. IDs can be forged. Signatures can be disputed. Reputation takes months to establish. The cryptographic model is mathematically verifiable — either the signature is valid or it isn't. Either the capability was granted or it wasn't.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters for Production Systems
&lt;/h2&gt;

&lt;p&gt;Most teams building multi-agent systems today are operating on implicit trust — agents interact freely, permissions are not enforced, and there's no audit trail of inter-agent communication.&lt;/p&gt;

&lt;p&gt;This works fine in development. It becomes a serious problem in production, where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agents interact at scale across organizational boundaries&lt;/li&gt;
&lt;li&gt;Compromised agents can propagate bad behavior to other agents&lt;/li&gt;
&lt;li&gt;Regulatory requirements demand audit trails of automated decisions&lt;/li&gt;
&lt;li&gt;A single rogue agent in a mesh can corrupt the entire workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The zero-trust mesh approach — verify every identity, validate every capability, audit every interaction — is how you build multi-agent systems that are safe to run in production.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-governance-toolkit[full]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full source code and documentation:&lt;br&gt;
👉 &lt;a href="https://github.com/microsoft/agent-governance-toolkit" rel="noopener noreferrer"&gt;github.com/microsoft/agent-governance-toolkit&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Kanish Tyagi — MS Data Science student at UT Arlington, open source contributor to Microsoft's agent-governance-toolkit. Find me on &lt;a href="https://github.com/kanish5" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/kanishtyagi123/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>security</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>I Added Governance to My AI Agent in 30 Minutes — Here's How</title>
      <dc:creator>Kanish Tyagi</dc:creator>
      <pubDate>Wed, 08 Apr 2026 18:33:51 +0000</pubDate>
      <link>https://forem.com/kanishtyagii/i-added-governance-to-my-ai-agent-in-30-minutes-heres-how-54k6</link>
      <guid>https://forem.com/kanishtyagii/i-added-governance-to-my-ai-agent-in-30-minutes-heres-how-54k6</guid>
      <description>&lt;p&gt;A few weeks ago I was building a LangChain agent and realized I had no idea what it was actually doing. It could call any tool, write anything to memory, make unlimited API calls. It was essentially unsupervised.&lt;/p&gt;

&lt;p&gt;Then I found Microsoft's &lt;code&gt;agent-governance-toolkit&lt;/code&gt;. I added governance to my agent in about 30 minutes. Here's exactly how.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;A LangChain agent that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can only use approved tools&lt;/li&gt;
&lt;li&gt;Blocks dangerous patterns (SQL injection, destructive commands, PII)&lt;/li&gt;
&lt;li&gt;Logs every action for audit&lt;/li&gt;
&lt;li&gt;Stops itself when it hits a call budget&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No rewrites. Just a governance wrapper around your existing agent.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Install (2 minutes)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-governance-toolkit[full]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-governance verify
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see a green checkmark. You're ready.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Your First Policy (5 minutes)
&lt;/h2&gt;

&lt;p&gt;Before governance, your agent looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;initialize_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentType&lt;/span&gt;

&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;initialize_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ZERO_SHOT_REACT_DESCRIPTION&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# No limits. No logging. No audit trail.
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Do whatever the user asks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After governance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;initialize_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentType&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.integrations.base&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GovernancePolicy&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LangChainKernel&lt;/span&gt;

&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;base_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;initialize_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AgentType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ZERO_SHOT_REACT_DESCRIPTION&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Define what your agent is allowed to do
&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GovernancePolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-first-policy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;blocked_patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROP TABLE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rm -rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;max_tool_calls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;log_all_calls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Wrap your existing agent — no rewrite needed
&lt;/span&gt;&lt;span class="n"&gt;kernel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LangChainKernel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;governed_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Same interface, now governed
&lt;/span&gt;&lt;span class="n"&gt;governed_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Help me analyze this dataset&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Your agent now has a policy layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: Add Tool Restrictions (5 minutes)
&lt;/h2&gt;

&lt;p&gt;You probably don't want your agent to be able to delete files or execute arbitrary shell commands. Add an allowlist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GovernancePolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;restricted-policy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# Only these tools are allowed
&lt;/span&gt;    &lt;span class="n"&gt;allowed_tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;web_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;read_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;send_email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="c1"&gt;# These are always blocked regardless
&lt;/span&gt;    &lt;span class="n"&gt;blocked_patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rm -rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="c1"&gt;# destructive shell
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROP TABLE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# SQL injection
&lt;/span&gt;        &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\b\d{3}-\d{2}-\d{4}\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# SSN pattern
&lt;/span&gt;        &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;password\s*[:=]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# credential leak
&lt;/span&gt;    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;max_tool_calls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Test it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# This gets blocked before reaching the LLM
&lt;/span&gt;&lt;span class="n"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reason&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pre_execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Execute: DROP TABLE users&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# False
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# "Blocked pattern 'DROP TABLE' detected"
&lt;/span&gt;
&lt;span class="c1"&gt;# This passes through
&lt;/span&gt;&lt;span class="n"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reason&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pre_execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search for recent AI papers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# True
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rejection happens at the code layer — the model never sees the blocked input.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: Add Audit Logging (5 minutes)
&lt;/h2&gt;

&lt;p&gt;Every action your agent takes is now logged automatically when &lt;code&gt;log_all_calls=True&lt;/code&gt;. But you can also inspect the audit trail directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="n"&gt;audit_log&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;audit_log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;allowed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;allowed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Check what happened
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;audit_log&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ ALLOWED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;allowed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🚫 BLOCKED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; — &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In production, this goes to your observability stack. During development, it tells you exactly what your agent tried to do.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 5: Run the OWASP Compliance Check (3 minutes)
&lt;/h2&gt;

&lt;p&gt;The toolkit includes a built-in compliance checker against the OWASP Agentic Top 10:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-governance verify &lt;span class="nt"&gt;--badge&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This checks your configuration against known attack vectors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompt injection resistance&lt;/li&gt;
&lt;li&gt;Tool poisoning detection
&lt;/li&gt;
&lt;li&gt;Data exfiltration patterns&lt;/li&gt;
&lt;li&gt;Privilege escalation blocks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You get a pass/fail report with specific recommendations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Before vs After
&lt;/h2&gt;

&lt;p&gt;Here's what changed in 30 minutes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tool calls&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;td&gt;Max 10 per session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dangerous inputs&lt;/td&gt;
&lt;td&gt;Passed to LLM&lt;/td&gt;
&lt;td&gt;Blocked before LLM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PII handling&lt;/td&gt;
&lt;td&gt;Unvalidated&lt;/td&gt;
&lt;td&gt;Checked on every write&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audit trail&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Every action logged&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OWASP compliance&lt;/td&gt;
&lt;td&gt;Unknown&lt;/td&gt;
&lt;td&gt;Verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance overhead&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&amp;lt;0.1ms per action check&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The last point matters: governance doesn't slow your agent down in any meaningful way. The policy checks are in-process Python, not network calls.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned Contributing to This Repo
&lt;/h2&gt;

&lt;p&gt;I spent the past week contributing code to this toolkit — fixing bugs, adding docstrings, building Colab notebooks. Here's what surprised me:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The attack surface is bigger than you think.&lt;/strong&gt; I saw how MCP tool poisoning works, how PII leaks into memory writes, how delegation chains can be exploited. None of these are theoretical. They're patterns security researchers are actively demonstrating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrapping is better than rewriting.&lt;/strong&gt; The governance layer wraps your existing agent. You don't throw away your prompts or your framework. You add a code layer underneath them. That's the right architecture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auditability is the real value.&lt;/strong&gt; The blocked requests matter less than the audit trail. Knowing what your agent &lt;em&gt;tried&lt;/em&gt; to do — even when it was allowed — is how you understand and improve agent behavior in production.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;The full toolkit is open source:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-governance-toolkit[full]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are also interactive Colab notebooks (which I built!) that let you explore policy enforcement, MCP security scanning, and multi-agent governance without any local setup:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/microsoft/agent-governance-toolkit" rel="noopener noreferrer"&gt;github.com/microsoft/agent-governance-toolkit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The quickstart takes 10 minutes. The governance layer takes 5 more.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Kanish Tyagi — MS Data Science student at UT Arlington, open source contributor to Microsoft's agent-governance-toolkit. Find me on &lt;a href="https://github.com/kanish5" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/kanishtyagi123/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>tutorial</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Policy-as-Code vs Prompt Engineering — When Guardrails Need Governance</title>
      <dc:creator>Kanish Tyagi</dc:creator>
      <pubDate>Sun, 05 Apr 2026 12:20:44 +0000</pubDate>
      <link>https://forem.com/kanishtyagii/policy-as-code-vs-prompt-engineering-when-guardrails-need-governance-23pi</link>
      <guid>https://forem.com/kanishtyagii/policy-as-code-vs-prompt-engineering-when-guardrails-need-governance-23pi</guid>
      <description>&lt;p&gt;I've been contributing to Microsoft's &lt;code&gt;agent-governance-toolkit&lt;/code&gt; for the past few weeks — fixing bugs, writing docstrings, building Colab notebooks. And the more I dug into the codebase, the more one question kept coming up: &lt;em&gt;why can't you just use a well-crafted system prompt to keep your agent safe?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The short answer: you can. Until you can't.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Prompt Engineering Approach
&lt;/h2&gt;

&lt;p&gt;Prompt-level guardrails look like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"You are a helpful assistant. Never recommend illegal activities. Never share personal user data. Always stay within budget constraints."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This works remarkably well — right up until it doesn't. The problem is structural: your guardrail lives inside the model's context window, which means it's subject to the same probabilistic reasoning as everything else the model does.&lt;/p&gt;

&lt;p&gt;Three things consistently break prompt-only guardrails:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Jailbreak attacks.&lt;/strong&gt; Users have discovered that framing requests differently — roleplay scenarios, "hypothetical" framings, multi-step manipulations — can cause models to comply with things they were explicitly told not to do. This isn't a bug in the model. It's a feature of how language models work. They follow the most compelling framing of the current context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context window exhaustion.&lt;/strong&gt; A long conversation eventually pushes your system prompt out of the context window. Your "never exceed budget" guardrail disappears after message 40. The model doesn't know what it forgot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model dependency.&lt;/strong&gt; A guardrail tuned for GPT-4 behaves differently on Claude, which behaves differently on Llama. Every model swap means re-testing every prompt. At scale, that's not sustainable.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Policy-as-Code Approach
&lt;/h2&gt;

&lt;p&gt;Policy-as-code doesn't ask the model to follow rules. It enforces rules in code, before and after the model is ever called.&lt;/p&gt;

&lt;p&gt;Here's what that looks like with the agent-governance-toolkit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.integrations.base&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GovernancePolicy&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LangChainKernel&lt;/span&gt;

&lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GovernancePolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;blocked_patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROP TABLE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rm -rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\b\d{3}-\d{2}-\d{4}\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;max_tool_calls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;require_human_approval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;kernel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LangChainKernel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;governed_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_chain&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every call to &lt;code&gt;governed_agent.invoke()&lt;/code&gt; now goes through a policy check &lt;em&gt;before&lt;/em&gt; it reaches the model. If the input matches a blocked pattern — SQL injection, destructive command, SSN pattern — the call is rejected. The model never sees it.&lt;/p&gt;

&lt;p&gt;This is a fundamentally different architecture. The guardrail isn't in the model's context. It's in your infrastructure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where Each Approach Fails
&lt;/h2&gt;

&lt;p&gt;Neither approach is complete on its own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where prompt engineering fails:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adversarial users who iterate on jailbreaks&lt;/li&gt;
&lt;li&gt;Long-running agents where context window limits apply&lt;/li&gt;
&lt;li&gt;Multi-model deployments where you can't tune per model&lt;/li&gt;
&lt;li&gt;Compliance scenarios where you need an audit trail&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Where policy-as-code falls short:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creative tasks where rules can't capture nuance&lt;/li&gt;
&lt;li&gt;Behavioral style (tone, personality, format)&lt;/li&gt;
&lt;li&gt;Rapid iteration — changing a policy requires a code deploy&lt;/li&gt;
&lt;li&gt;User experience — prompts understand context, rules do not&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A guardrail that says &lt;code&gt;blocked_patterns=["inappropriate content"]&lt;/code&gt; can't capture the infinite ways "inappropriate" manifests in natural language. A prompt can. But a prompt can't guarantee that a rule is never violated. Code can.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real Example: PII in Memory Writes
&lt;/h2&gt;

&lt;p&gt;Here's a concrete failure mode I saw while contributing to this toolkit.&lt;/p&gt;

&lt;p&gt;An agent with memory capabilities might inadvertently write a user's SSN into its long-term memory during a conversation about financial planning. A prompt guardrail saying "never store sensitive data" relies on the model recognizing what counts as sensitive — and recognizing it every single time, across thousands of conversations.&lt;/p&gt;

&lt;p&gt;The toolkit handles this differently. Memory writes are intercepted at the code layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;_PII_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\b\d{3}-\d{2}-\d{4}\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# SSN
&lt;/span&gt;    &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# email
&lt;/span&gt;    &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\b(?:password|secret|token)\s*[:=]\s*\S+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IGNORECASE&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before anything is written to memory, it's checked against these patterns. If a match is found, the write is blocked and a &lt;code&gt;PolicyViolationError&lt;/code&gt; is raised. The model's cooperation is irrelevant — the rule is enforced in code.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Layered Model That Actually Works
&lt;/h2&gt;

&lt;p&gt;The right answer isn't "choose one." It's to use both at the right layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use policy-as-code for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hard constraints that must never be violated (safety, compliance, budget)&lt;/li&gt;
&lt;li&gt;Anything that needs an audit trail&lt;/li&gt;
&lt;li&gt;Rules that apply regardless of model behavior&lt;/li&gt;
&lt;li&gt;Security boundaries (tool allowlists, blocked patterns, call budgets)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use prompt engineering for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Behavioral style and tone&lt;/li&gt;
&lt;li&gt;Output formatting&lt;/li&gt;
&lt;li&gt;Persona and creative direction&lt;/li&gt;
&lt;li&gt;User experience nuances&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use human-in-the-loop for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-stakes decisions that exceed both layers&lt;/li&gt;
&lt;li&gt;Ambiguous cases the policy engine flags as uncertain&lt;/li&gt;
&lt;li&gt;Exception handling with accountability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it as defense in depth. Prompts guide behavior. Policies enforce limits. Humans handle edge cases. No single layer carries all the weight.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Concrete Starting Point
&lt;/h2&gt;

&lt;p&gt;If you're building agents today without a policy layer, here's the minimum viable governance setup using the toolkit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-governance-toolkit[full]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.integrations.base&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GovernancePolicy&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LangChainKernel&lt;/span&gt;

&lt;span class="c1"&gt;# Start with just these three
&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GovernancePolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;blocked_patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rm -rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROP TABLE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  &lt;span class="c1"&gt;# safety
&lt;/span&gt;    &lt;span class="n"&gt;max_tool_calls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                           &lt;span class="c1"&gt;# budget
&lt;/span&gt;    &lt;span class="n"&gt;log_all_calls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                          &lt;span class="c1"&gt;# audit trail
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;kernel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LangChainKernel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;governed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;your_existing_agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You don't need to rewrite your agent. You wrap it. Your existing prompts stay. You've added a code-layer enforcement layer underneath them.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;The shift from prompt-only to policy + prompt is the maturation of agent governance. It's the difference between "I hope my agent behaves" and "I can prove my agent behaves."&lt;/p&gt;

&lt;p&gt;Prompt engineering got us to where we are. It's powerful and flexible and necessary. But as agents move into production — taking real actions, handling real data, running autonomously at scale — "I told it not to" is no longer sufficient as a governance strategy.&lt;/p&gt;

&lt;p&gt;Policy-as-code is how you make agent behavior auditable, testable, and enforceable. Not instead of good prompts. On top of them.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Kanish Tyagi — MS Data Science student at UT Arlington, open source contributor to Microsoft's agent-governance-toolkit. Find me on &lt;a href="https://github.com/kanish5" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/kanishtyagi123/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Source code and docs: &lt;a href="https://github.com/microsoft/agent-governance-toolkit" rel="noopener noreferrer"&gt;github.com/microsoft/agent-governance-toolkit&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>security</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Why AI Agent Governance Matters in 2026</title>
      <dc:creator>Kanish Tyagi</dc:creator>
      <pubDate>Fri, 03 Apr 2026 03:53:48 +0000</pubDate>
      <link>https://forem.com/kanishtyagii/why-ai-agent-governance-matters-in-2026-4mnk</link>
      <guid>https://forem.com/kanishtyagii/why-ai-agent-governance-matters-in-2026-4mnk</guid>
      <description>&lt;p&gt;A few weeks ago I was browsing GitHub looking for open source projects to contribute to. I stumbled on Microsoft's &lt;code&gt;agent-governance-toolkit&lt;/code&gt; and decided to dig in. What I found surprised me — not because the code was impressive (it was), but because the &lt;em&gt;problem&lt;/em&gt; it solved was one I hadn't thought seriously about before.&lt;/p&gt;

&lt;p&gt;We talk a lot about building AI agents. We don't talk nearly enough about what happens when they go wrong.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Rise of Autonomous AI Agents
&lt;/h2&gt;

&lt;p&gt;In 2026, AI agents aren't a research curiosity anymore. They're running in production. Companies are deploying agents that browse the web, write and execute code, query databases, send emails, and call external APIs — all without a human in the loop for each step.&lt;/p&gt;

&lt;p&gt;This is genuinely powerful. An agent that can autonomously research, analyze, and act can compress hours of work into minutes. But there's a problem that comes with that power that most teams are only starting to reckon with: &lt;strong&gt;an agent that can do anything, will eventually do the wrong thing.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not out of malice. Out of ambiguity, edge cases, and the fundamental unpredictability of systems built on probabilistic models.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Can Go Wrong — And Does
&lt;/h2&gt;

&lt;p&gt;Let me give you concrete examples from patterns I saw while contributing to this toolkit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data exfiltration via MCP tool poisoning.&lt;/strong&gt; MCP (Model Context Protocol) is a standard for giving agents access to tools. What most teams don't realize is that an attacker can embed hidden instructions directly in a tool's description. The agent reads the description, interprets it as instructions, and suddenly a "read file" tool is quietly sending your data to an external endpoint. The attack is invisible at the UI level. It lives in the metadata.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Runaway tool calls.&lt;/strong&gt; An agent that's allowed to call tools indefinitely will sometimes loop. A bug in the task framing, an ambiguous goal, or an unexpected API response can send an agent into a cycle that burns through API credits, hits rate limits, or worse — makes irreversible changes at scale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PII in memory writes.&lt;/strong&gt; Agents with memory capabilities can inadvertently store sensitive user data — SSNs, emails, API keys embedded in text — in their long-term memory. Without validation at the write layer, that data persists and propagates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt injection across delegation chains.&lt;/strong&gt; When agents spawn sub-agents (which is increasingly common in multi-agent architectures), each handoff is an attack surface. A malicious instruction injected early in a chain can propagate through multiple agents before anyone realizes something is wrong.&lt;/p&gt;

&lt;p&gt;These aren't theoretical. Security researchers are actively demonstrating all of these attack patterns in the wild right now.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Governance Actually Means
&lt;/h2&gt;

&lt;p&gt;"Governance" is one of those words that sounds bureaucratic until you see what the absence of it costs.&lt;/p&gt;

&lt;p&gt;In practical terms, governance for AI agents means three things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Policy enforcement before execution.&lt;/strong&gt; Every tool call, every action an agent wants to take, gets checked against a set of rules &lt;em&gt;before&lt;/em&gt; it happens. Not after. Not logged for review later. Before. If the action matches a blocked pattern — a SQL injection attempt, a destructive shell command, a PII pattern — it's stopped. The agent gets a policy violation error. The action doesn't execute.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Audit logging you can actually use.&lt;/strong&gt; Every action an agent takes, whether allowed or blocked, gets recorded with enough context to understand what happened. Not just "agent called tool X" but which agent, which context, what the input was, what the policy decision was, and why.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Budget and circuit breaker enforcement.&lt;/strong&gt; Agents need hard limits — on the number of tool calls per session, on the delegation depth between agents, on execution time. When those limits are hit, the agent stops. Not degrades gracefully. Stops.&lt;/p&gt;




&lt;h2&gt;
  
  
  How agent-governance-toolkit Implements This
&lt;/h2&gt;

&lt;p&gt;This is where it gets concrete. The toolkit wraps your existing agent frameworks — LangChain, CrewAI, AutoGen, PydanticAI, smolagents, Google ADK — with a governance layer that sits between your agent and the outside world.&lt;/p&gt;

&lt;p&gt;You define a policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.integrations.base&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GovernancePolicy&lt;/span&gt;

&lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GovernancePolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;blocked_patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROP TABLE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rm -rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\b\d{3}-\d{2}-\d{4}\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;max_tool_calls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;require_human_approval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then you wrap your agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_os.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LangChainKernel&lt;/span&gt;

&lt;span class="n"&gt;kernel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LangChainKernel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;governed_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wrap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_langchain_chain&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every call to &lt;code&gt;governed_agent.invoke()&lt;/code&gt; now goes through a pre-execution check. If the input matches a blocked pattern, the call is rejected before reaching the LLM or any external service. If the agent has exceeded its tool call budget, the call is rejected. If the output contains something problematic, the post-execution check catches it.&lt;/p&gt;

&lt;p&gt;The interception happens at multiple levels. Deep hooks intercept individual tool calls within the agent, memory writes are validated for PII before they're saved, and delegation chains between sub-agents are tracked and depth-limited.&lt;/p&gt;

&lt;p&gt;For MCP specifically, the toolkit includes a security scanner that checks tool definitions for poisoning patterns — hidden instructions, exfiltration attempts, privilege escalation, role overrides — before those tools are registered with the agent.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Concepts Worth Understanding
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The trust mesh.&lt;/strong&gt; In a multi-agent system, trust isn't binary. Different agents should have different permission levels based on their role, their source, and the context they're operating in. The toolkit models this through trust cards and identity verification — each agent has a verifiable identity, and policies can be scoped to specific agents or roles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Policy as code.&lt;/strong&gt; Governance policies are defined as YAML files that can be version-controlled, reviewed, and deployed like any other configuration. They have a schema, they can be validated before deployment (&lt;code&gt;agentos policy validate&lt;/code&gt;), and they can be diffed between versions. You don't want to be manually updating code every time a new blocked pattern needs to be added.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLOs for agents.&lt;/strong&gt; Just like you set service level objectives for APIs and infrastructure, you can set them for agents. Error rate thresholds, latency limits, availability targets. When an agent breaches its SLO, a circuit breaker trips and the agent is taken out of service. This prevents a degraded agent from silently producing bad outputs at scale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Audit logging as a first-class concern.&lt;/strong&gt; Every governance decision — allow or block — is recorded with full context. This isn't just for debugging. It's for compliance, incident response, and understanding actual agent behavior in production over time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters Now
&lt;/h2&gt;

&lt;p&gt;The timing matters. We're at an inflection point.&lt;/p&gt;

&lt;p&gt;Most teams deploying AI agents today are doing so without any governance layer. They're moving fast, the agents are working well enough in testing, and governance feels like a problem for later. But "later" in AI agent deployments often means "after the first serious incident."&lt;/p&gt;

&lt;p&gt;The cost of retrofitting governance onto an existing agent system is much higher than building it in from the start. The audit trail doesn't exist yet, the policy boundaries haven't been defined, and the agents have already been granted permissions that are difficult to revoke without breaking workflows.&lt;/p&gt;

&lt;p&gt;This isn't just a security problem either. It's a reliability problem and a trust problem. If your agents take actions that users or customers didn't expect and can't explain, trust in the system erodes quickly. Governance is what makes agent behavior predictable and explainable — not just safe.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;The quickstart takes about 10 minutes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-governance-toolkit[full]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are also three interactive Google Colab notebooks that let you explore the toolkit without setting anything up locally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Policy Enforcement 101&lt;/strong&gt; — define a policy, see violations blocked in real time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP Security Proxy&lt;/strong&gt; — scan tool definitions for poisoning patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Agent Governance&lt;/strong&gt; — SLOs, circuit breakers, chaos testing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full source code and docs: &lt;a href="https://github.com/microsoft/agent-governance-toolkit" rel="noopener noreferrer"&gt;github.com/microsoft/agent-governance-toolkit&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  A Note From a Contributor
&lt;/h2&gt;

&lt;p&gt;I came to this project as someone who builds ML models and data pipelines — not a security engineer. But contributing to this codebase changed how I think about agent design. The attack surfaces are real, the failure modes are unintuitive, and the tooling to address them is still early.&lt;/p&gt;

&lt;p&gt;If you're building agents in production, governance isn't optional. It's the difference between a system you can operate confidently and one you're constantly firefighting.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Kanish Tyagi — MS Data Science student at UT Arlington and open source contributor. Find me on &lt;a href="https://github.com/kanish5" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/kanishtyagi123/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>security</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
