<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Vijay M</title>
    <description>The latest articles on Forem by Vijay M (@vijaym2k6).</description>
    <link>https://forem.com/vijaym2k6</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3813045%2Fcc11b701-5344-40c3-90c5-028d6b562215.png</url>
      <title>Forem: Vijay M</title>
      <link>https://forem.com/vijaym2k6</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/vijaym2k6"/>
    <language>en</language>
    <item>
      <title>I Built a Runtime Control Plane to Stop AI Agents From Burning Money</title>
      <dc:creator>Vijay M</dc:creator>
      <pubDate>Sun, 08 Mar 2026 14:44:24 +0000</pubDate>
      <link>https://forem.com/vijaym2k6/i-built-a-runtime-control-plane-to-stop-ai-agents-from-burning-money-20ii</link>
      <guid>https://forem.com/vijaym2k6/i-built-a-runtime-control-plane-to-stop-ai-agents-from-burning-money-20ii</guid>
      <description>&lt;p&gt;Two weeks ago, I watched an AI agent burn through $47 in API credits in 20 minutes. It had gotten stuck in a search loop — querying the same thing over and over — and nobody noticed until the OpenAI bill came in.&lt;/p&gt;

&lt;p&gt;This wasn't a toy experiment. This was a real agent, doing real work, in a real system.&lt;/p&gt;

&lt;p&gt;And it's not just me. &lt;strong&gt;87% of enterprises deploying AI agents have no security framework&lt;/strong&gt; for their autonomous systems. Agents can call APIs, execute code, browse the web, and spend real money — with zero guardrails.&lt;/p&gt;

&lt;p&gt;So I built SteerPlane.&lt;/p&gt;

&lt;h2&gt;What Is SteerPlane?&lt;/h2&gt;

&lt;p&gt;SteerPlane is an open-source &lt;strong&gt;runtime control plane&lt;/strong&gt; for AI agents. It sits inside your agent code (not as a proxy or gateway) and enforces safety policies in real time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost limits&lt;/strong&gt;: Set a hard USD ceiling. If your agent hits it, the run is terminated instantly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Loop detection&lt;/strong&gt;: A sliding-window algorithm watches for repeating action patterns. If your agent does [search → think → search → think] 20 times, SteerPlane catches it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step limits&lt;/strong&gt;: Cap the total number of actions your agent can take.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full telemetry&lt;/strong&gt;: Every step is logged — action name, token count, cost, latency. All visible in a real-time dashboard.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The entire integration is one decorator:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;steerplane&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;guard&lt;/span&gt;

&lt;span class="nd"&gt;@guard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_cost_usd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detect_loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_agent&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Your code doesn't change
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Your agent now has guardrails.&lt;/p&gt;

&lt;h2&gt;Why Not Just Use an Existing Tool?&lt;/h2&gt;

&lt;p&gt;I looked at everything out there. Here's what I found:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangSmith&lt;/strong&gt;: Great tracing, and most tightly integrated with the LangChain ecosystem. Its focus is observability and evaluation, so it shows you what a run did rather than stopping it mid-flight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Portkey&lt;/strong&gt;: Acts as a proxy between you and the LLM. Good for routing, but it doesn't sit inside your agent, so it can't detect application-level loops or enforce step limits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Guardrails AI&lt;/strong&gt;: Validates LLM outputs (Is this response toxic? Does it contain PII?). It doesn't monitor what the &lt;em&gt;agent&lt;/em&gt; is doing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Helicone&lt;/strong&gt;: Primarily a proxy for logging and cost tracking. Like any proxy, it sees individual LLM requests rather than agent-level steps, so it can't catch a loop inside the agent.&lt;/p&gt;

&lt;p&gt;What I wanted was something that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Works with &lt;strong&gt;any&lt;/strong&gt; framework (LangChain, CrewAI, OpenAI, custom)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Actively enforces&lt;/strong&gt; limits (not just reports them)&lt;/li&gt;
&lt;li&gt;Catches &lt;strong&gt;agent-level&lt;/strong&gt; problems (loops, runaway steps)&lt;/li&gt;
&lt;li&gt;Requires &lt;strong&gt;zero infrastructure changes&lt;/strong&gt; (no proxy, no gateway)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's SteerPlane.&lt;/p&gt;

&lt;h2&gt;How It Works&lt;/h2&gt;

&lt;h3&gt;The Architecture&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your Agent → SteerPlane SDK → FastAPI Server → Dashboard
              (embedded)        (telemetry)     (monitoring)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The SDK lives &lt;em&gt;inside&lt;/em&gt; your agent process. Every time your agent takes an action, the SDK:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Logs it&lt;/strong&gt; (action, tokens, cost, latency)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Checks the cost&lt;/strong&gt; against your limit&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runs loop detection&lt;/strong&gt; on the action history&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Counts the step&lt;/strong&gt; against your limit&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reports to the API&lt;/strong&gt; (for the dashboard)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If any check fails, the SDK raises a specific exception (&lt;code&gt;CostLimitExceeded&lt;/code&gt;, &lt;code&gt;LoopDetectedError&lt;/code&gt;, &lt;code&gt;StepLimitExceeded&lt;/code&gt;) and the run is terminated.&lt;/p&gt;
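
&lt;p&gt;As a rough illustration of that check sequence, here is a minimal sketch in plain Python. It is not SteerPlane's actual implementation: &lt;code&gt;StepGuard&lt;/code&gt; and its &lt;code&gt;record&lt;/code&gt; method are hypothetical names, and the two exception classes are local stand-ins that only borrow the names mentioned above. Loop detection and API reporting are left out to keep it short.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from dataclasses import dataclass, field


class CostLimitExceeded(Exception):
    """Stand-in: raised when cumulative cost passes the ceiling."""


class StepLimitExceeded(Exception):
    """Stand-in: raised when the run takes more steps than allowed."""


@dataclass
class StepGuard:
    # Hypothetical helper for illustration, not SteerPlane's real class.
    max_cost_usd: float
    max_steps: int
    total_cost_usd: float = 0.0
    steps: list = field(default_factory=list)

    def record(self, action, tokens, cost_usd, latency_ms):
        # 1. Log the step locally.
        self.steps.append((action, tokens, cost_usd, latency_ms))
        # 2. Check cumulative cost against the ceiling.
        self.total_cost_usd += cost_usd
        if self.total_cost_usd > self.max_cost_usd:
            raise CostLimitExceeded(
                f"spent ${self.total_cost_usd:.2f} of ${self.max_cost_usd:.2f}"
            )
        # 3. Loop detection would run here (omitted in this sketch).
        # 4. Count the step against the step limit.
        if len(self.steps) > self.max_steps:
            raise StepLimitExceeded(
                f"{len(self.steps)} steps, limit is {self.max_steps}"
            )


guard = StepGuard(max_cost_usd=1.00, max_steps=50)
try:
    for _ in range(10):
        guard.record("search", tokens=1200, cost_usd=0.30, latency_ms=850.0)
except CostLimitExceeded as exc:
    print("run terminated:", exc)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Raising an exception from the check is what lets an in-process guard stop the run at the exact step that crossed a limit, instead of reporting the overspend afterwards.&lt;/p&gt;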

&lt;h3&gt;The Loop Detection Algorithm&lt;/h3&gt;

&lt;p&gt;The loop detector uses a sliding window over the last N actions. For each possible pattern length (1 to N/2), it checks if the pattern repeats consecutively:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Window: [search, think, search, think, search, think]
Pattern length 2: [search, think] → repeats 3 times → LOOP DETECTED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simple, fast, and it catches the failure mode that burned my $47: an agent repeating the same actions.&lt;/p&gt;
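
&lt;p&gt;To make the sliding-window idea concrete, here is one way it could be written. This is a sketch of the technique as described above, not the code SteerPlane ships: the function name &lt;code&gt;detect_loop&lt;/code&gt; and the &lt;code&gt;window&lt;/code&gt; and &lt;code&gt;min_repeats&lt;/code&gt; parameters (and their defaults) are assumptions made for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def detect_loop(actions, window=12, min_repeats=3):
    """Return the pattern repeating at the tail of `actions`, or None."""
    recent = actions[-window:]
    n = len(recent)
    # Try every pattern length from 1 up to half the window.
    for length in range(1, n // 2 + 1):
        pattern = recent[n - length:]
        repeats = 1
        # Walk backwards one pattern-length at a time while it still matches.
        while n >= (repeats + 1) * length:
            start = n - (repeats + 1) * length
            if recent[start:start + length] != pattern:
                break
            repeats += 1
        if repeats >= min_repeats:
            return pattern
    return None


history = ["search", "think", "search", "think", "search", "think"]
print(detect_loop(history))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For the window shown earlier this returns &lt;code&gt;['search', 'think']&lt;/code&gt;, and it returns &lt;code&gt;None&lt;/code&gt; for a history with no consecutive repetition. The cost per step is bounded by the window size, so the check stays cheap even on long runs.&lt;/p&gt;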

&lt;h3&gt;Graceful Degradation&lt;/h3&gt;

&lt;p&gt;If your SteerPlane API server goes down, the SDK &lt;strong&gt;keeps working locally&lt;/strong&gt;. Cost limits, step limits, and loop detection all run in-process. You just lose the dashboard data until the API comes back.&lt;/p&gt;

&lt;p&gt;Your agents are &lt;strong&gt;never left unprotected&lt;/strong&gt;.&lt;/p&gt;
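
&lt;p&gt;The pattern behind this is to enforce limits locally first and treat telemetry as best-effort. The sketch below shows that idea using only the Python standard library. It is an illustration under assumptions, not SteerPlane's code: &lt;code&gt;report_step&lt;/code&gt;, &lt;code&gt;guarded_step&lt;/code&gt;, and the &lt;code&gt;http://localhost:8000/steps&lt;/code&gt; endpoint are all hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import urllib.request


def report_step(step, url="http://localhost:8000/steps", timeout=0.5):
    """Best-effort telemetry: True if delivered, False if the server is unreachable."""
    body = json.dumps(step).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except OSError:
        # URLError and socket timeouts are OSError subclasses. Swallow them
        # so an unreachable dashboard never stops the agent.
        return False


def guarded_step(step, spent_usd, max_cost_usd, send=report_step):
    """Enforce the cost ceiling in-process, then report without failing the run."""
    spent_usd += step["cost_usd"]
    if spent_usd > max_cost_usd:
        # Stand-in for a dedicated exception such as CostLimitExceeded.
        raise RuntimeError("cost limit exceeded")
    delivered = send(step)
    return spent_usd, delivered
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the limit check runs before the network call and does not depend on its result, a failed or slow report costs you dashboard data but never the protection itself.&lt;/p&gt;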

&lt;h2&gt;The Dashboard&lt;/h2&gt;

&lt;p&gt;The monitoring dashboard is built with Next.js and Framer Motion. It shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;All runs&lt;/strong&gt; with status, cost, steps, and duration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step-by-step timeline&lt;/strong&gt; for each run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Color-coded status&lt;/strong&gt; badges (running, completed, failed, loop detected, cost exceeded)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-refresh&lt;/strong&gt; every 5 seconds&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;TypeScript Too&lt;/h2&gt;

&lt;p&gt;Most agent guardrail tools are Python-only. SteerPlane has a TypeScript SDK:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;guard&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;steerplane&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;runAgent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;guard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;maxCostUsd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;maxSteps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runAgent&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Works with Vercel AI SDK, LangChain.js, or any Node.js agent framework.&lt;/p&gt;

&lt;h2&gt;What's Next&lt;/h2&gt;

&lt;p&gt;SteerPlane is open source (MIT) and free. Here's the roadmap:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Auto-detect OpenAI/Anthropic calls&lt;/strong&gt; — no manual token tracking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slack/Discord alerts&lt;/strong&gt; — instant notifications when agents fail&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hosted cloud version&lt;/strong&gt; — sign up, paste API key, done&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual cost analytics&lt;/strong&gt; — charts, trends, per-agent breakdowns&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;Try It&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;steerplane
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/vijaym2k6/SteerPlane" rel="noopener noreferrer"&gt;https://github.com/vijaym2k6/SteerPlane&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'd love to hear what you think. If you're running agents in production, what keeps you up at night?&lt;/p&gt;




&lt;p&gt;&lt;em&gt;SteerPlane is MIT licensed. Star the repo if you find it useful.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>python</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
