<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Roman V</title>
    <description>The latest articles on Forem by Roman V (@sapph1re).</description>
    <link>https://forem.com/sapph1re</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3880814%2Fbdfe9638-5489-49e3-af5e-ec87dc7c1672.jpeg</url>
      <title>Forem: Roman V</title>
      <link>https://forem.com/sapph1re</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/sapph1re"/>
    <language>en</language>
    <item>
      <title>How to Monetize Your MCP Server in 10 Minutes</title>
      <dc:creator>Roman V</dc:creator>
      <pubDate>Wed, 15 Apr 2026 23:54:34 +0000</pubDate>
      <link>https://forem.com/sapph1re/how-to-monetize-your-mcp-server-in-10-minutes-22g9</link>
      <guid>https://forem.com/sapph1re/how-to-monetize-your-mcp-server-in-10-minutes-22g9</guid>
      <description>&lt;p&gt;You built an MCP server. It works. Agents call your tools, get results, everybody's happy — except you're paying for compute and API calls out of pocket.&lt;/p&gt;

&lt;p&gt;You want to charge for it. But now you're looking at building: payment verification, usage tracking, API key management, subscription tiers, credit balances, revenue splits, Stripe integration, webhook handling. That's 2–4 weeks of work that has nothing to do with the tool you actually built.&lt;/p&gt;

&lt;p&gt;This is the billing problem every MCP server operator hits. You can build tools in a weekend, but monetizing them takes weeks. Most operators choose "not monetizing" and either burn money or shut down.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/sapph1re/mcp-billing-gateway-sdk" rel="noopener noreferrer"&gt;mcp-billing-gateway&lt;/a&gt; solves this. It's a reverse proxy that sits in front of your MCP server and handles all billing — Stripe subscriptions, per-call credits, and x402 crypto payments. Your server stays unchanged. You register it, set pricing, and start earning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture: How It Works
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AI Agent (caller)
    │
    ├─ Authorization: Bearer cgwcl_...
    │
    ▼
┌──────────────────────────┐
│   mcp-billing-gateway    │
│                          │
│  1. Authenticate caller  │
│  2. Check credits/sub    │
│  3. Debit payment        │
│  4. Forward JSON-RPC     │
│  5. Record usage         │
│  6. Return response      │
└──────────┬───────────────┘
           │
           ▼
┌──────────────────────────┐
│  Your MCP Server         │
│  (unchanged, untouched)  │
└──────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your MCP server never sees billing logic. The gateway proxies JSON-RPC requests, handling authentication, payment verification, and usage metering transparently. If a caller doesn't have credits or a valid subscription, they get a &lt;code&gt;402 Payment Required&lt;/code&gt; before your server is even contacted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Register as an Operator
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://mcp-billing-gateway-production.up.railway.app/api/v1/operator/register &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "email": "you@yourcompany.com",
    "name": "Your Company"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"operator_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"op_01jvz..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"api_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cgwop_abcd1234..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"stripe_onboard_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://connect.stripe.com/setup/abc123..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save that API key — it's shown once. Visit the &lt;code&gt;stripe_onboard_url&lt;/code&gt; to connect your Stripe account (business info + bank account for payouts). This uses Stripe Connect Express, so you get a managed account without handling PCI compliance yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Register Your MCP Server
&lt;/h2&gt;

&lt;p&gt;Point the gateway at your upstream server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://mcp-billing-gateway-production.up.railway.app/api/v1/operator/servers &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer cgwop_abcd1234..."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "name": "Weather MCP",
    "description": "Real-time weather data for AI agents",
    "upstream_url": "https://your-mcp-server.com/mcp",
    "proxy_slug": "weather",
    "auth_mode": "header",
    "upstream_auth_token": "your-server-secret",
    "transport": "http",
    "is_public": true
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your server is now proxied at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://mcp-billing-gateway-production.up.railway.app/proxy/weather
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;auth_mode&lt;/code&gt; field controls how the gateway authenticates to your upstream: &lt;code&gt;header&lt;/code&gt; injects a Bearer token, &lt;code&gt;query&lt;/code&gt; appends it as a query parameter, &lt;code&gt;none&lt;/code&gt; sends requests unauthenticated. The &lt;code&gt;upstream_auth_token&lt;/code&gt; is your server's own auth — callers never see it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Set Pricing
&lt;/h2&gt;

&lt;p&gt;The gateway supports three billing models. You can have multiple plans per server.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-call (pay-as-you-go):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://mcp-billing-gateway-production.up.railway.app/api/v1/operator/servers/srv_01jvz.../plans &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer cgwop_abcd1234..."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "name": "Pay-as-you-go",
    "billing_model": "per_call",
    "price_per_call_usd_micro": 10000,
    "free_calls_per_month": 100,
    "is_default": true
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's $0.01 per call (prices are in micro-units: 10,000 micro = $0.01) with 100 free calls per month. The free tier resets monthly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monthly subscription:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://mcp-billing-gateway-production.up.railway.app/api/v1/operator/servers/srv_01jvz.../plans &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer cgwop_abcd1234..."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "name": "Pro",
    "billing_model": "subscription",
    "subscription_price_usd_cents": 999,
    "subscription_interval": "monthly",
    "subscription_call_limit": 10000,
    "is_default": false
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;$9.99/month for up to 10,000 calls. Managed via Stripe, with automatic period tracking and cancellation handling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tiered pricing:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://mcp-billing-gateway-production.up.railway.app/api/v1/operator/servers/srv_01jvz.../plans &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer cgwop_abcd1234..."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "name": "Volume",
    "billing_model": "tiered",
    "tiers": [
      {"up_to": 1000, "price_usd_micro": 0},
      {"up_to": 10000, "price_usd_micro": 5000},
      {"up_to": null, "price_usd_micro": 10000}
    ],
    "is_default": false
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First 1,000 calls free, next 9,000 at $0.005 each, everything above at $0.01.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Callers Start Using Your Server
&lt;/h2&gt;

&lt;p&gt;From the caller's perspective, the integration is straightforward. Using the TypeScript SDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GatewayCallerSDK&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@mcp-billing-gateway/caller-sdk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GatewayCallerSDK&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://mcp-billing-gateway-production.up.railway.app&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Register (one-time)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;api_key&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bot@myapp.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;My Agent&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setApiKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Add credits ($10.00)&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;topupCredits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pm_stripe_payment_method_id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Call your tool — billing happens automatically&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;callTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;weather&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;get_weather&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;city&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Berlin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or with plain curl:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://mcp-billing-gateway-production.up.railway.app/proxy/weather &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer cgwcl_caller_key..."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "jsonrpc": "2.0",
    "id": "req-1",
    "method": "tools/call",
    "params": {
      "name": "get_weather",
      "arguments": {"city": "Berlin"}
    }
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The gateway authenticates the caller, checks their credit balance, debits the call cost, forwards the JSON-RPC request to your server, and returns the response. If the upstream returns an error, credits are automatically refunded.&lt;/p&gt;

&lt;h2&gt;
  
  
  x402 Crypto Payments
&lt;/h2&gt;

&lt;p&gt;For agent-to-agent transactions where callers have crypto wallets, the gateway supports &lt;a href="https://www.x402.org/" rel="noopener noreferrer"&gt;x402&lt;/a&gt; payments — USDC on Base chain. No API key needed. The caller sends a payment claim header:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://mcp-billing-gateway-production.up.railway.app/proxy/weather &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-402-Payment: &amp;lt;base64-encoded-payment-claim&amp;gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "jsonrpc": "2.0",
    "id": "req-1",
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Berlin"}}
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a caller sends a request without valid payment, they get a structured 402 response with pricing info:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"req-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;402&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Payment Required"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"price_usd_micro"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"x402_address"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0x..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"chain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"base"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means your single server can accept both fiat (Stripe) and crypto (USDC) payments simultaneously, depending on what the caller supports.&lt;/p&gt;

&lt;h2&gt;
  
  
  Revenue and Analytics
&lt;/h2&gt;

&lt;p&gt;Track how your server is being used and how much you're earning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Usage analytics&lt;/span&gt;
curl &lt;span class="s2"&gt;"https://mcp-billing-gateway-production.up.railway.app/api/v1/operator/servers/srv_01jvz.../usage?from=2026-04-01&amp;amp;to=2026-04-15"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer cgwop_abcd1234..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response includes usage broken down by day, by tool name, by billing rail (fiat vs. x402), and top callers. You see exactly which tools are popular and which pricing model callers prefer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Revenue and payouts&lt;/span&gt;
curl https://mcp-billing-gateway-production.up.railway.app/api/v1/operator/revenue &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer cgwop_abcd1234..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default revenue split is 85/15 — you keep 85% of each call's revenue, the platform takes 15%. When you're ready to collect:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://mcp-billing-gateway-production.up.railway.app/api/v1/operator/payouts/request &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer cgwop_abcd1234..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Funds transfer to your linked Stripe account within 1–2 business days. Minimum payout: $1.00.&lt;/p&gt;

&lt;h2&gt;
  
  
  Self-Hosting
&lt;/h2&gt;

&lt;p&gt;The gateway is open for self-hosting if you want full control:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:3000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;DB_PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/data/billing.db &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;STRIPE_SECRET_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk_test_... &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;STRIPE_WEBHOOK_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;whsec_... &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;PLATFORM_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://your-domain.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; billing-data:/data &lt;span class="se"&gt;\&lt;/span&gt;
  mcp-billing-gateway
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SQLite for storage (no Postgres dependency), Stripe for payment processing, and the full operator + caller API available at your own domain. Set &lt;code&gt;DEFAULT_TAKE_RATE_BPS&lt;/code&gt; to adjust the revenue split (default 1500 = 15%).&lt;/p&gt;

&lt;h2&gt;
  
  
  What You Don't Have to Build
&lt;/h2&gt;

&lt;p&gt;Here's what the gateway handles so you don't have to:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concern&lt;/th&gt;
&lt;th&gt;Without Gateway&lt;/th&gt;
&lt;th&gt;With Gateway&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API key management&lt;/td&gt;
&lt;td&gt;Build auth system&lt;/td&gt;
&lt;td&gt;Built-in (hashed, scoped, revocable)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Payment processing&lt;/td&gt;
&lt;td&gt;Integrate Stripe SDK&lt;/td&gt;
&lt;td&gt;Pre-integrated via Stripe Connect&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Usage metering&lt;/td&gt;
&lt;td&gt;Build tracking system&lt;/td&gt;
&lt;td&gt;Per-call metering with breakdown&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Subscription management&lt;/td&gt;
&lt;td&gt;Handle Stripe webhooks&lt;/td&gt;
&lt;td&gt;Automatic period tracking + cancellation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credit system&lt;/td&gt;
&lt;td&gt;Build ledger + balance tracking&lt;/td&gt;
&lt;td&gt;Prepaid credits with auto-refund on errors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;x402 crypto payments&lt;/td&gt;
&lt;td&gt;Implement payment verification&lt;/td&gt;
&lt;td&gt;Built-in x402 validation + anti-replay&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revenue analytics&lt;/td&gt;
&lt;td&gt;Build dashboard&lt;/td&gt;
&lt;td&gt;Usage by day, tool, billing rail, caller&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Payouts&lt;/td&gt;
&lt;td&gt;Manual Stripe transfers&lt;/td&gt;
&lt;td&gt;One API call, automatic splits&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Register as an operator (30 seconds)&lt;/li&gt;
&lt;li&gt;Complete Stripe onboarding (2 minutes)&lt;/li&gt;
&lt;li&gt;Register your server with a proxy slug (1 minute)&lt;/li&gt;
&lt;li&gt;Create a pricing plan (1 minute)&lt;/li&gt;
&lt;li&gt;Share the proxy URL with your users&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Your MCP server is now monetized. No code changes to your server, no new dependencies, no payment infrastructure to maintain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live gateway&lt;/strong&gt;: &lt;a href="https://mcp-billing-gateway-production.up.railway.app" rel="noopener noreferrer"&gt;mcp-billing-gateway-production.up.railway.app&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;SDK docs&lt;/strong&gt;: &lt;a href="https://github.com/sapph1re/mcp-billing-gateway-sdk" rel="noopener noreferrer"&gt;sapph1re/mcp-billing-gateway-sdk&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Glama listing&lt;/strong&gt;: &lt;a href="https://glama.ai/mcp/servers/sapph1re/mcp-billing-gateway-sdk" rel="noopener noreferrer"&gt;glama.ai/mcp/servers/sapph1re/mcp-billing-gateway-sdk&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>billing</category>
      <category>typescript</category>
    </item>
    <item>
      <title>How to Stop AI Agent Cost Blowups Before They Happen</title>
      <dc:creator>Roman V</dc:creator>
      <pubDate>Wed, 15 Apr 2026 23:54:14 +0000</pubDate>
      <link>https://forem.com/sapph1re/how-to-stop-ai-agent-cost-blowups-before-they-happen-1ehp</link>
      <guid>https://forem.com/sapph1re/how-to-stop-ai-agent-cost-blowups-before-they-happen-1ehp</guid>
      <description>&lt;p&gt;You deploy a four-agent pipeline that should cost about $0.80 per run. By end of day it has burned through $47 on a single stuck researcher loop. Sound familiar?&lt;/p&gt;

&lt;p&gt;If you're running AI agents in production, cost blowups are not a question of &lt;em&gt;if&lt;/em&gt; but &lt;em&gt;when&lt;/em&gt;. 57% of organizations already have agents in production, yet 90% of agent projects fail within 30 days — and runaway LLM costs are the number one pain point.&lt;/p&gt;

&lt;p&gt;The core problem: agents make autonomous decisions about how many LLM calls to issue. A retry loop, an overly verbose chain-of-thought, or a stuck tool call can silently 10x your bill before you notice.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Current State of Agent Cost Control
&lt;/h2&gt;

&lt;p&gt;Most teams handle this with one of three approaches, all of which fall short:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Manual monitoring.&lt;/strong&gt; You watch dashboards and kill processes when costs spike. This works until you're asleep, in a meeting, or running 20 agents in parallel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Provider-level spending caps.&lt;/strong&gt; OpenAI and Anthropic offer monthly limits, but they're account-wide. You can't set a $5 budget for a specific research pipeline while allowing your coding agent $50.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gateway proxies (Helicone, Portkey).&lt;/strong&gt; These require routing all traffic through an external service. They add latency, a point of failure, and vendor lock-in. And they still don't give you per-agent circuit breakers.&lt;/p&gt;

&lt;p&gt;What's missing is a &lt;strong&gt;framework-native&lt;/strong&gt; solution: something that hooks directly into CrewAI, AutoGen, or LangGraph at the process level, enforces hard limits before each LLM call, and trips a circuit breaker when things go wrong — without requiring any external infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing agent-cost-guardrails
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/sapph1re/agent-cost-guardrails" rel="noopener noreferrer"&gt;agent-cost-guardrails&lt;/a&gt; is an open-source Python library that does exactly this. Pure Python, zero infrastructure, framework-native hooks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-cost-guardrails
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's what it gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hard budget limits&lt;/strong&gt; — raises &lt;code&gt;BudgetExceededError&lt;/code&gt; when spend exceeds your cap&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-call token limits&lt;/strong&gt; — prevents any single LLM call from consuming too many tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate limiting&lt;/strong&gt; — tokens-per-minute sliding window to control burst spend&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Circuit breaker&lt;/strong&gt; — trips after N consecutive violations, requires manual reset&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alert callbacks&lt;/strong&gt; — fire at configurable thresholds (50%, 80%, 100%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost breakdown&lt;/strong&gt; — track spend by model and by agent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bundled pricing&lt;/strong&gt; — 30+ models from OpenAI, Anthropic, Google, Mistral, DeepSeek, and Meta&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Quick Start: The Context Manager
&lt;/h2&gt;

&lt;p&gt;The simplest way to use it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_cost_guardrails&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BudgetGuard&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;BudgetGuard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_usd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;5.00&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;guard&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;guard&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pre_call_check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;estimated_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ... your LLM call here ...
&lt;/span&gt;    &lt;span class="n"&gt;cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;guard&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post_call_record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;input_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;output_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;800&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Call cost: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;guard&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cost_report&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;pre_call_check()&lt;/code&gt; validates the budget, rate limit, and circuit breaker &lt;em&gt;before&lt;/em&gt; the call happens. &lt;code&gt;post_call_record()&lt;/code&gt; tracks the actual spend. If the budget is exceeded, &lt;code&gt;BudgetExceededError&lt;/code&gt; stops execution immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  CrewAI Integration
&lt;/h2&gt;

&lt;p&gt;CrewAI is the most popular multi-agent framework, and it has the worst cost visibility out of the box. The logging inside Tasks is broken, and there's no built-in token cap.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;crewai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Crew&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_cost_guardrails.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CrewAIGuardrails&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cost_alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WARNING: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;% of $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; budget used&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;guards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CrewAIGuardrails&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;max_usd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;5.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens_per_call&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;on_alert&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cost_alert&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;guards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;install&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;researcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Market Researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;goal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Find competitor pricing data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Research competitor pricing for SaaS analytics tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;crew&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Crew&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;crew&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kickoff&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;guards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cost_report&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Total: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;total_cost_usd&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;By agent: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cost_by_agent&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;guards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uninstall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;guards.install()&lt;/code&gt; registers &lt;code&gt;@before_llm_call&lt;/code&gt; and &lt;code&gt;@after_llm_call&lt;/code&gt; hooks globally. Every LLM call CrewAI makes — across all agents and tasks — gets checked and tracked automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  AutoGen / AG2 Integration
&lt;/h2&gt;

&lt;p&gt;AutoGen's &lt;code&gt;register_hook()&lt;/code&gt; system gives us a clean interception point:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;autogen&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AssistantAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;UserProxyAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_cost_guardrails.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoGenGuardrails&lt;/span&gt;

&lt;span class="n"&gt;guards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AutoGenGuardrails&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_usd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;10.00&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;assistant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AssistantAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analyst&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;proxy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;UserProxyAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;human_input_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NEVER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;guards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wrap_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;guards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wrap_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;proxy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;proxy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initiate_chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze Q1 revenue trends&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;guards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cost_report&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The library uses AG2's &lt;code&gt;safeguard_llm_inputs&lt;/code&gt; / &lt;code&gt;safeguard_llm_outputs&lt;/code&gt; hooks with automatic fallback to legacy hook names for older AutoGen versions.&lt;/p&gt;

&lt;h2&gt;
  
  
  LangGraph / LangChain Integration
&lt;/h2&gt;

&lt;p&gt;LangGraph uses LangChain's callback system, so the integration plugs in via &lt;code&gt;BaseCallbackHandler&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_cost_guardrails.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LangGraphGuardrails&lt;/span&gt;

&lt;span class="n"&gt;guards&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LangGraphGuardrails&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_usd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.00&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_your_graph&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# your StateGraph
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;initial_state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;callbacks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;guards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;callback_handler&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;guards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cost_report&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Remaining budget: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;remaining_usd&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The callback handler intercepts &lt;code&gt;on_llm_start&lt;/code&gt;, &lt;code&gt;on_chat_model_start&lt;/code&gt;, and &lt;code&gt;on_llm_end&lt;/code&gt; events. It extracts actual token usage from the response when available, and falls back to tiktoken estimation when not.&lt;/p&gt;

&lt;h2&gt;
  
  
  Circuit Breaker: Your Safety Net
&lt;/h2&gt;

&lt;p&gt;The circuit breaker is what separates this from basic cost tracking. When an agent enters a failure loop — retrying the same failed tool call, generating invalid outputs, or hitting rate limits — the circuit breaker trips after N consecutive violations and &lt;strong&gt;stops all LLM calls&lt;/strong&gt; until you explicitly reset it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;guard&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BudgetGuard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;max_usd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;10.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens_per_call&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;circuit_breaker_max_violations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# After 3 consecutive per-call violations:
# CircuitBreakerTrippedError is raised on the next pre_call_check()
# All agents stop. No more silent cost accumulation.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the difference between a $5 mistake and a $500 one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Custom Model Pricing
&lt;/h2&gt;

&lt;p&gt;The library ships with pricing for 30+ models, but you can override or extend it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent_cost_guardrails&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;set_custom_pricing&lt;/span&gt;

&lt;span class="nf"&gt;set_custom_pricing&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-fine-tuned-gpt4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_per_mtok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;6.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output_per_mtok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;18.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pricing is maintained per million tokens (input and output separately) and supports prefix matching — so &lt;code&gt;gpt-4o-2024-05-13&lt;/code&gt; automatically resolves to the &lt;code&gt;gpt-4o&lt;/code&gt; price.&lt;/p&gt;

&lt;h2&gt;
  
  
  Before and After
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Before agent-cost-guardrails:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;4-agent pipeline expected cost: ~$0.80/run&lt;/li&gt;
&lt;li&gt;Actual cost with stuck loops: $3–$12/run&lt;/li&gt;
&lt;li&gt;Worst case (overnight): $47 in a single stuck session&lt;/li&gt;
&lt;li&gt;Discovery: next morning, checking the billing dashboard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;After:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hard cap: $2.00/run, enforced at the framework level&lt;/li&gt;
&lt;li&gt;Circuit breaker trips after 3 violations — no silent loops&lt;/li&gt;
&lt;li&gt;Alert at 80% budget → you get notified before hitting the cap&lt;/li&gt;
&lt;li&gt;Per-agent breakdown → identify which agent is the problem&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-cost-guardrails          &lt;span class="c"&gt;# core&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-cost-guardrails[crewai]  &lt;span class="c"&gt;# + CrewAI hooks&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-cost-guardrails[autogen] &lt;span class="c"&gt;# + AutoGen hooks&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-cost-guardrails[langgraph] &lt;span class="c"&gt;# + LangGraph callbacks&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;agent-cost-guardrails[all]     &lt;span class="c"&gt;# everything&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The library is MIT-licensed. Source, docs, and examples on GitHub:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PyPI&lt;/strong&gt;: &lt;a href="https://pypi.org/project/agent-cost-guardrails/" rel="noopener noreferrer"&gt;agent-cost-guardrails&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/sapph1re/agent-cost-guardrails" rel="noopener noreferrer"&gt;sapph1re/agent-cost-guardrails&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're running agents in production and haven't had a cost blowup yet, you will. The question is whether you'll catch it at $2 or at $200.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>llm</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
