<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Anthony Zender</title>
    <description>The latest articles on Forem by Anthony Zender (@azender1).</description>
    <link>https://forem.com/azender1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3681962%2Ff6210fe3-edb0-45ef-9a66-e8323a8ff7df.png</url>
      <title>Forem: Anthony Zender</title>
      <link>https://forem.com/azender1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/azender1"/>
    <language>en</language>
    <item>
      <title>The Execution Boundary Problem: What PocketOS Made Visible</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Wed, 29 Apr 2026 22:43:11 +0000</pubDate>
      <link>https://forem.com/azender1/the-execution-boundary-problem-what-pocketos-made-visible-4b62</link>
      <guid>https://forem.com/azender1/the-execution-boundary-problem-what-pocketos-made-visible-4b62</guid>
      <description>&lt;p&gt;The PocketOS incident last week made the execution boundary problem visible to everyone. But this bug was already breaking systems quietly: payments, trades, scheduled jobs, anywhere an AI agent retries a failed action without knowing whether the first attempt completed.&lt;/p&gt;

&lt;p&gt;The guardrail can't live inside the agent. It has to live outside, at the tool call boundary.&lt;/p&gt;

&lt;p&gt;That's what SafeAgent does.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;safe_execute(request_id, action, payload)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The same &lt;code&gt;request_id&lt;/code&gt; always returns the original receipt, so the side effect never fires twice. Works with any MCP host: Claude, Cursor, Windsurf.&lt;/p&gt;

&lt;p&gt;I found this pattern building a live trading bot. Duplicate execution under retry is catastrophic when money is on the line.&lt;/p&gt;

&lt;p&gt;&lt;a class="mentioned-user" href="https://dev.to/grok"&gt;@grok&lt;/a&gt; validated the OTEL exporter design on X and offered to help refine it. It shipped the same night.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install safeagent-exec-guard&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Demo: azender1.github.io/SafeAgent/demo.html&lt;br&gt;
GitHub: github.com/azender1/SafeAgent&lt;/p&gt;

</description>
      <category>agents</category>
      <category>architecture</category>
      <category>mcp</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I Was Building a Live Trading Bot and a Patented Wagering System. The Bug I Found Is Now Breaking AI Agents Everywhere.</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Sun, 26 Apr 2026 06:15:26 +0000</pubDate>
      <link>https://forem.com/azender1/i-was-building-a-live-trading-bot-and-a-patented-wagering-system-the-bug-i-found-is-now-breaking-2oeg</link>
      <guid>https://forem.com/azender1/i-was-building-a-live-trading-bot-and-a-patented-wagering-system-the-bug-i-found-is-now-breaking-2oeg</guid>
      <description>&lt;p&gt;This isn't a library I built to solve a theoretical problem.&lt;/p&gt;

&lt;p&gt;It's a fix I built because real money was at risk.&lt;/p&gt;

&lt;h2&gt;
  
  
  The trading bot
&lt;/h2&gt;

&lt;p&gt;I've been running a live QQQ/TQQQ momentum bot on Alpaca Markets. It reads 1-minute bars, scores market structure using VWAP, SMA8, SMA21, SMA34, and momentum signals, then enters leveraged positions in TQQQ (bull) or SQQQ (bear) based on that score.&lt;/p&gt;

&lt;p&gt;The bot has retry logic built in. It has to — broker ACK timeouts are real. When you submit a market order and the network drops before confirmation comes back, you don't know if it filled or not. So the bot retries.&lt;/p&gt;

&lt;p&gt;Here's the problem: if the first order actually filled but the confirmation timed out, the retry fires a second market order. On a 3x leveraged ETF, that's a doubled position you didn't intend. With real dollars on the line.&lt;/p&gt;

&lt;p&gt;The bot already had a manual execution lock (&lt;code&gt;EXECUTION_LOCK_SEC=15&lt;/code&gt;) and a JSON state machine to handle this. I built it by hand. It worked — mostly. But it was fragile, untested, and not something I'd want to hand to anyone else.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# The old pattern — retries up to 3 times
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;place_order_with_retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;side&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;last_err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EXIT_RETRY_COUNT&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;place_order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;side&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# fires twice if first timed out but filled
&lt;/span&gt;        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;last_err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EXIT_RETRY_SLEEP_SEC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;last_err&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;place_order&lt;/code&gt; call has no memory. If attempt 1 filled and attempt 2 fires, you now own twice the position. The broker doesn't know you didn't mean it.&lt;/p&gt;
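
&lt;p&gt;The failure can be reproduced without a real broker. The sketch below is only an illustration: &lt;code&gt;FlakyBroker&lt;/code&gt; is a hypothetical stand-in (not the Alpaca API) that fills the first order and then loses the ACK, and the retry loop is a simplified copy of the pattern above.&lt;/p&gt;

```python
class FlakyBroker:
    """Hypothetical stand-in for a broker whose first ACK is lost."""

    def __init__(self):
        self.filled_qty = 0
        self.calls = 0

    def place_order(self, symbol, qty, side):
        self.calls += 1
        self.filled_qty += qty  # the order fills at the broker
        if self.calls == 1:
            # The fill happened, but the caller never hears back.
            raise TimeoutError("ACK lost after fill")
        return {"symbol": symbol, "qty": qty, "side": side}


def place_order_with_retry(broker, symbol, qty, side, retries=3):
    # Simplified version of the retry loop: no memory of earlier attempts.
    last_err = None
    for _ in range(retries):
        try:
            return broker.place_order(symbol, qty, side)
        except Exception as e:
            last_err = e
    raise last_err


broker = FlakyBroker()
place_order_with_retry(broker, "TQQQ", 10, "buy")
print(broker.filled_qty)  # 20: the intent was 10, the position is doubled
```

&lt;p&gt;The retry loop reports success after the second attempt, and nothing in its return value reveals that the first attempt also filled.&lt;/p&gt;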

&lt;h2&gt;
  
  
  The wagering system
&lt;/h2&gt;

&lt;p&gt;At the same time I was building the bot, I was designing &lt;strong&gt;PeerPlay&lt;/strong&gt; — a patented P2P wagering exchange for skill-based video game tournaments (USPTO provisional 63/914,036).&lt;/p&gt;

&lt;p&gt;PeerPlay has an escrow engine, a verification layer, and a settlement layer. The verification layer uses AI to confirm match results. When a verification agent times out and retries, the settlement layer can receive two confirmation signals for the same match. Two signals → two prize payouts. One tournament result, two winner transfers.&lt;/p&gt;

&lt;p&gt;The patent protects the architecture. Nothing in the patent protects you from your own execution layer firing twice.&lt;/p&gt;

&lt;p&gt;Same problem. Different domain.&lt;/p&gt;

&lt;h2&gt;
  
  
  The extraction
&lt;/h2&gt;

&lt;p&gt;I realized the trading bot and PeerPlay had identical failure modes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent/bot decides to act
    ↓
Network times out
    ↓
Agent/bot retries
    ↓
Side effect fires twice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix in both cases is the same primitive: before you execute an irreversible action, check whether it already ran. If it did, return the original result. If it didn't, run it and store the result.&lt;/p&gt;

&lt;p&gt;That's SafeAgent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;settlement.settlement_requests&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SettlementRequestRegistry&lt;/span&gt;

&lt;span class="n"&gt;registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SettlementRequestRegistry&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Same request_id on retry → returns original receipt, never re-executes
&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trade:TQQQ:buy:2026-04-26T09:47:00&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;order_buy_TQQQ&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;symbol&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TQQQ&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qty&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;side&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;buy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;execute_fn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;place_order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TQQQ&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;buy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First call executes the order and stores the receipt. Any retry with the same &lt;code&gt;request_id&lt;/code&gt; returns the stored receipt — the broker is never called again.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters for AI agents specifically
&lt;/h2&gt;

&lt;p&gt;The trading bot and PeerPlay are deterministic systems. They have retry logic because networks are unreliable. AI agents have the same problem but worse — they also have uncertain completion signals.&lt;/p&gt;

&lt;p&gt;When Claude or any LLM agent calls a tool, it may:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get a timeout and retry the same call&lt;/li&gt;
&lt;li&gt;Receive an ambiguous response and call again to confirm&lt;/li&gt;
&lt;li&gt;Run in a loop and re-trigger the same action&lt;/li&gt;
&lt;li&gt;Get restarted mid-execution and replay from the last checkpoint&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every one of these scenarios can produce duplicate side effects. The agent frameworks (LangChain, CrewAI, n8n, OpenAI function calling) typically handle retries at the transport layer. As far as I can tell, none of them track by default whether the side effect already happened.&lt;/p&gt;

&lt;p&gt;That gap — between the agent decision and the irreversible action — is where SafeAgent lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  The state machine
&lt;/h2&gt;

&lt;p&gt;SafeAgent doesn't just deduplicate by request_id. It enforces a finality gate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPEN → RESOLVED → IN_RECONCILIATION → FINAL → SETTLED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execution is only permitted from &lt;code&gt;FINAL&lt;/code&gt;. If the agent's signals are ambiguous — conflicting tool responses, partial confirmations, uncertain outcomes — the state stays in &lt;code&gt;IN_RECONCILIATION&lt;/code&gt; and the side effect is blocked until the outcome is clear.&lt;/p&gt;

&lt;p&gt;This is what I needed for PeerPlay's verification layer. The AI model returns a confidence score. SafeAgent holds the settlement until that score clears a threshold. Below threshold: &lt;code&gt;IN_RECONCILIATION&lt;/code&gt;. Above threshold: &lt;code&gt;FINAL&lt;/code&gt;. Payout executes exactly once.&lt;/p&gt;
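
&lt;p&gt;A minimal sketch of that gate, to make the transitions concrete. The &lt;code&gt;Settlement&lt;/code&gt; class, its method names, and the 0.9 threshold are assumptions made for illustration; this is not SafeAgent's actual API, and a real gate would also persist its state.&lt;/p&gt;

```python
from enum import Enum


class State(Enum):
    OPEN = "OPEN"
    RESOLVED = "RESOLVED"
    IN_RECONCILIATION = "IN_RECONCILIATION"
    FINAL = "FINAL"
    SETTLED = "SETTLED"


class Settlement:
    """Hypothetical finality gate: the payout may only run from FINAL."""

    def __init__(self, threshold=0.9):
        self.state = State.OPEN
        self.threshold = threshold  # illustrative value, not from SafeAgent

    def resolve(self):
        # A verification result has arrived for this match.
        if self.state is State.OPEN:
            self.state = State.RESOLVED

    def reconcile(self, confidence):
        # Only a resolved or still-ambiguous outcome can be re-evaluated.
        if self.state not in (State.RESOLVED, State.IN_RECONCILIATION):
            return self.state
        if confidence >= self.threshold:
            self.state = State.FINAL
        else:
            self.state = State.IN_RECONCILIATION
        return self.state

    def settle(self, payout_fn):
        # Execution is blocked unless the state is FINAL.
        if self.state is not State.FINAL:
            return False
        payout_fn()
        self.state = State.SETTLED  # a second settle() call is now a no-op
        return True
```

&lt;p&gt;With this shape, a low-confidence result leaves the object in &lt;code&gt;IN_RECONCILIATION&lt;/code&gt; and &lt;code&gt;settle()&lt;/code&gt; returns &lt;code&gt;False&lt;/code&gt; without calling the payout function.&lt;/p&gt;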

&lt;h2&gt;
  
  
  Where it fits in the MCP stack
&lt;/h2&gt;

&lt;p&gt;If you're building agents on MCP, SafeAgent sits between the agent's decision and your tool layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude / agent decision
    → SafeAgent finality gate
    → SafeAgent request-id dedup
    → MCP tool executes
    → Receipt stored (SQLite, survives restarts)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works with any MCP-capable host — Claude, Cursor, Windsurf, custom executors — without modifying the protocol.&lt;/p&gt;

&lt;p&gt;As of today (April 26, 2026) SafeAgent is officially listed in the MCP registry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;io.github.azender1/safeagent v0.1.14
registry.modelcontextprotocol.io
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Install
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;safeagent-exec-guard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Python 3.10+ · Apache-2.0 · &lt;a href="https://github.com/azender1/SafeAgent" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; · &lt;a href="https://azender1.github.io/SafeAgent" rel="noopener noreferrer"&gt;Live demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The trading bot integration example is in the repo at &lt;code&gt;examples/safeagent_trading_integration.py&lt;/code&gt; — it shows the before/after pattern with real variable names from the QQQ bot.&lt;/p&gt;

&lt;h2&gt;
  
  
  The audit
&lt;/h2&gt;

&lt;p&gt;If you're running agents or bots in production and want to know where your system can execute twice, I'm offering a focused duplicate execution risk audit for $499. Written report, every retry path, every side effect boundary, SafeAgent integration recommendations.&lt;/p&gt;

&lt;p&gt;DM me or email &lt;a href="mailto:azender1@yahoo.com"&gt;azender1@yahoo.com&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Anthony Zender, Dayton OH. Payroll tax accountant by day, agent infrastructure builder by night. USPTO provisional 63/914,036 — Zender Gaming Technologies LLC.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>trading</category>
      <category>agents</category>
    </item>
    <item>
      <title>The Real AI Agent Failure Mode Is Uncertain Completion</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Sat, 28 Mar 2026 14:12:46 +0000</pubDate>
      <link>https://forem.com/azender1/the-real-ai-agent-failure-mode-is-uncertain-completion-447n</link>
      <guid>https://forem.com/azender1/the-real-ai-agent-failure-mode-is-uncertain-completion-447n</guid>
      <description>&lt;p&gt;A lot of AI agent discussion focuses on the wrong failure modes.&lt;/p&gt;

&lt;p&gt;People talk about:&lt;/p&gt;

&lt;p&gt;hallucinations&lt;br&gt;
prompt injection&lt;br&gt;
tool misuse&lt;br&gt;
runaway loops&lt;br&gt;
bad reasoning&lt;/p&gt;

&lt;p&gt;Those are real.&lt;/p&gt;

&lt;p&gt;But once an agent starts calling tools that affect the outside world, a different class of failure becomes much more dangerous:&lt;/p&gt;

&lt;p&gt;uncertain completion&lt;/p&gt;

&lt;p&gt;That is the moment where the system cannot confidently answer:&lt;/p&gt;

&lt;p&gt;“Did this action already happen?”&lt;/p&gt;

&lt;p&gt;And once that question becomes ambiguous, retries get dangerous very fast.&lt;/p&gt;

&lt;p&gt;What uncertain completion actually looks like&lt;/p&gt;

&lt;p&gt;A common real-world path looks like this:&lt;/p&gt;

&lt;p&gt;agent decides to call send_payment()&lt;br&gt;
→ tool sends the payment request&lt;br&gt;
→ timeout / crash / disconnect / lost response&lt;br&gt;
→ caller does not know if it succeeded&lt;br&gt;
→ retry happens&lt;br&gt;
→ payment may be sent again&lt;/p&gt;

&lt;p&gt;The same thing shows up with:&lt;/p&gt;

&lt;p&gt;order creation&lt;br&gt;
booking flows&lt;br&gt;
email sends&lt;br&gt;
CRM mutations&lt;br&gt;
support ticket creation&lt;br&gt;
browser / UI automation&lt;br&gt;
webhook-triggered workflows&lt;/p&gt;

&lt;p&gt;The model may have made the correct decision.&lt;/p&gt;

&lt;p&gt;The failure is that the system has no durable way to prove whether the side effect already happened.&lt;/p&gt;

&lt;p&gt;This is not mainly a prompting problem&lt;/p&gt;

&lt;p&gt;The agent is often not “being stupid.”&lt;/p&gt;

&lt;p&gt;The system is simply missing a clean execution boundary.&lt;/p&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;p&gt;the same logical action can be attempted multiple times&lt;br&gt;
the caller cannot distinguish “attempted” from “completed”&lt;br&gt;
retries are forced to guess&lt;/p&gt;

&lt;p&gt;And “guessing” is exactly how you get:&lt;/p&gt;

&lt;p&gt;duplicate payments&lt;br&gt;
duplicate emails&lt;br&gt;
duplicate orders&lt;br&gt;
duplicate API mutations&lt;br&gt;
duplicate irreversible actions&lt;/p&gt;

&lt;p&gt;The hidden trap: “we logged the attempt”&lt;/p&gt;

&lt;p&gt;A lot of systems record that they tried to do something.&lt;/p&gt;

&lt;p&gt;That is not the same as recording that it completed safely.&lt;/p&gt;

&lt;p&gt;This is where the distinction matters:&lt;/p&gt;

&lt;p&gt;State visibility&lt;/p&gt;

&lt;p&gt;Can your system durably see:&lt;/p&gt;

&lt;p&gt;what was requested&lt;br&gt;
what was claimed&lt;br&gt;
what actually completed&lt;br&gt;
what result should be returned on replay&lt;/p&gt;

&lt;p&gt;Result recovery&lt;/p&gt;

&lt;p&gt;If the side effect happened but the response was lost, can the system reconstruct what should happen next without re-executing the side effect?&lt;/p&gt;

&lt;p&gt;That second part is where many systems break.&lt;/p&gt;

&lt;p&gt;Because once the answer becomes:&lt;/p&gt;

&lt;p&gt;“we’re not sure, so retry it”&lt;/p&gt;

&lt;p&gt;you are already in dangerous territory.&lt;/p&gt;

&lt;p&gt;API idempotency helps — but it is not enough&lt;/p&gt;

&lt;p&gt;A common response is:&lt;/p&gt;

&lt;p&gt;“Just use idempotency keys.”&lt;/p&gt;

&lt;p&gt;That is often correct.&lt;/p&gt;

&lt;p&gt;And if the downstream API supports strong idempotency semantics, you should absolutely use them.&lt;/p&gt;

&lt;p&gt;But that still leaves hard cases:&lt;/p&gt;

&lt;p&gt;the downstream API does not support idempotency&lt;br&gt;
the key is not stable across retries&lt;br&gt;
the first call may have succeeded but the caller cannot prove it&lt;br&gt;
the side effect is happening in a browser / UI / desktop automation context&lt;br&gt;
the external system gives weak or ambiguous feedback&lt;/p&gt;

&lt;p&gt;In those cases, the problem is no longer just API-level idempotency.&lt;/p&gt;

&lt;p&gt;It becomes:&lt;/p&gt;

&lt;p&gt;execution-layer safety&lt;/p&gt;

&lt;p&gt;The important split: intent vs execution&lt;/p&gt;

&lt;p&gt;One of the cleanest ways to think about this is:&lt;/p&gt;

&lt;p&gt;the agent should not directly own irreversible side effects&lt;/p&gt;

&lt;p&gt;Instead, there should be a separation between:&lt;/p&gt;

&lt;p&gt;Agent intent&lt;/p&gt;

&lt;p&gt;“I think we should do X”&lt;/p&gt;

&lt;p&gt;and&lt;/p&gt;

&lt;p&gt;Execution&lt;/p&gt;

&lt;p&gt;“X is now allowed to happen exactly once”&lt;/p&gt;

&lt;p&gt;That is a very important boundary.&lt;/p&gt;

&lt;p&gt;Because once the system separates:&lt;/p&gt;

&lt;p&gt;decision&lt;br&gt;
validation&lt;br&gt;
execution&lt;br&gt;
receipt / replay&lt;/p&gt;

&lt;p&gt;…then retries stop being so dangerous.&lt;/p&gt;

&lt;p&gt;A better pattern: proposal → guard → execute&lt;/p&gt;

&lt;p&gt;A safer structure looks more like this:&lt;/p&gt;

&lt;p&gt;agent proposes action&lt;br&gt;
→ deterministic layer validates action&lt;br&gt;
→ execution guard checks durable receipt&lt;br&gt;
→ if already completed: return prior result&lt;br&gt;
→ else: execute once and persist receipt&lt;/p&gt;

&lt;p&gt;This is a very different mental model from:&lt;/p&gt;

&lt;p&gt;agent decides&lt;br&gt;
→ immediately call side-effecting tool&lt;/p&gt;

&lt;p&gt;That second pattern is where a lot of production agent systems get into trouble.&lt;/p&gt;
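
&lt;p&gt;The proposal → guard → execute flow can be sketched in a few lines. Everything here is an assumption made for illustration: the proposal shape, the &lt;code&gt;ALLOWED_ACTIONS&lt;/code&gt; policy, and the in-memory &lt;code&gt;receipts&lt;/code&gt; dict, which stands in for the durable storage a real guard would need.&lt;/p&gt;

```python
receipts = {}  # in-memory stand-in; a real guard needs durable storage

ALLOWED_ACTIONS = {"send_payment"}  # hypothetical policy


def validate(proposal):
    # Deterministic checks that do not depend on the model's judgment.
    return (
        proposal["action"] in ALLOWED_ACTIONS
        and proposal["payload"]["amount"] > 0
    )


def guarded_execute(proposal, execute_fn):
    if not validate(proposal):
        raise ValueError("proposal rejected by validation layer")
    request_id = proposal["request_id"]
    if request_id in receipts:
        # Already completed: return the prior result, do not re-execute.
        return receipts[request_id]
    result = execute_fn(proposal["payload"])  # execute once
    receipts[request_id] = result  # persist the receipt
    return result
```

&lt;p&gt;The agent only ever produces the proposal. Whether the side effect runs is decided by the guard, so a repeated proposal with the same &lt;code&gt;request_id&lt;/code&gt; gets the stored result back.&lt;/p&gt;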

&lt;p&gt;The more irreversible the action, the thicker the boundary&lt;/p&gt;

&lt;p&gt;Not all tools should be treated equally.&lt;/p&gt;

&lt;p&gt;A useful mental model is:&lt;/p&gt;

&lt;p&gt;Safe tools&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;p&gt;search&lt;br&gt;
read_file&lt;br&gt;
summarize&lt;br&gt;
fetch_status&lt;/p&gt;

&lt;p&gt;These are usually fine to retry.&lt;/p&gt;

&lt;p&gt;Side-effecting tools&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;p&gt;send_email&lt;br&gt;
create_order&lt;br&gt;
create_ticket&lt;br&gt;
update_CRM&lt;/p&gt;

&lt;p&gt;These need an execution boundary.&lt;/p&gt;

&lt;p&gt;Irreversible / high-risk tools&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;p&gt;payment&lt;br&gt;
delete&lt;br&gt;
trade execution&lt;br&gt;
account mutation&lt;/p&gt;

&lt;p&gt;These need the strongest boundary:&lt;/p&gt;

&lt;p&gt;deterministic identity&lt;br&gt;
durable receipts&lt;br&gt;
replay-safe semantics&lt;br&gt;
often confirmation / policy checks&lt;/p&gt;
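
&lt;p&gt;One way to make these tiers explicit is a small registry that maps each tool to a risk level and each level to the controls it requires. This is a sketch under assumptions: the registry, the control names, and the choice to treat unknown tools as irreversible are illustrative, using the example tool names from the lists above.&lt;/p&gt;

```python
from enum import Enum


class Risk(Enum):
    SAFE = "safe"
    SIDE_EFFECT = "side_effect"
    IRREVERSIBLE = "irreversible"


# Hypothetical registry built from the example tool names above.
TOOL_RISK = {
    "search": Risk.SAFE,
    "read_file": Risk.SAFE,
    "send_email": Risk.SIDE_EFFECT,
    "create_order": Risk.SIDE_EFFECT,
    "payment": Risk.IRREVERSIBLE,
    "delete": Risk.IRREVERSIBLE,
}

# The more irreversible the tier, the more controls it requires.
CONTROLS = {
    Risk.SAFE: set(),
    Risk.SIDE_EFFECT: {"request_id", "receipt"},
    Risk.IRREVERSIBLE: {"request_id", "receipt", "confirmation"},
}


def required_controls(tool_name):
    # Unknown tools default to the strictest tier rather than the loosest.
    risk = TOOL_RISK.get(tool_name, Risk.IRREVERSIBLE)
    return CONTROLS[risk]
```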

&lt;p&gt;The principle is simple:&lt;/p&gt;

&lt;p&gt;the more irreversible the action, the thicker the execution boundary should be&lt;/p&gt;

&lt;p&gt;What systems actually need&lt;/p&gt;

&lt;p&gt;In practice, most systems need some combination of:&lt;/p&gt;

&lt;p&gt;stable request / operation identity&lt;br&gt;
durable receipt storage&lt;br&gt;
replay-safe execution semantics&lt;br&gt;
result recovery&lt;br&gt;
explicit separation between “propose” and “execute”&lt;/p&gt;

&lt;p&gt;That can be implemented many ways.&lt;/p&gt;

&lt;p&gt;But the important thing is the architectural boundary itself.&lt;/p&gt;

&lt;p&gt;Because once a system can confidently answer:&lt;/p&gt;

&lt;p&gt;“yes, this already happened”&lt;/p&gt;

&lt;p&gt;then retries become much safer.&lt;/p&gt;

&lt;p&gt;Why this keeps showing up in agent systems&lt;/p&gt;

&lt;p&gt;Traditional systems already had this problem.&lt;/p&gt;

&lt;p&gt;Agents just make it more visible.&lt;/p&gt;

&lt;p&gt;Why?&lt;/p&gt;

&lt;p&gt;Because agents are:&lt;/p&gt;

&lt;p&gt;retry-heavy&lt;br&gt;
tool-using&lt;br&gt;
asynchronous&lt;br&gt;
failure-prone&lt;br&gt;
often layered on top of APIs that were never designed for autonomous replay&lt;/p&gt;

&lt;p&gt;So the moment an agent starts touching:&lt;/p&gt;

&lt;p&gt;payments&lt;br&gt;
orders&lt;br&gt;
emails&lt;br&gt;
browser actions&lt;br&gt;
external systems&lt;/p&gt;

&lt;p&gt;…uncertain completion becomes one of the most important production problems in the stack.&lt;/p&gt;

&lt;p&gt;Closing thought&lt;/p&gt;

&lt;p&gt;The scariest agent failure is often not:&lt;/p&gt;

&lt;p&gt;“the model made the wrong choice”&lt;/p&gt;

&lt;p&gt;It is:&lt;/p&gt;

&lt;p&gt;“the model made the right choice twice”&lt;/p&gt;

&lt;p&gt;And the reason that happens is usually not intelligence failure.&lt;/p&gt;

&lt;p&gt;It is:&lt;/p&gt;

&lt;p&gt;missing execution boundaries under uncertain completion&lt;/p&gt;

&lt;p&gt;Related&lt;/p&gt;

&lt;p&gt;I wrote a first piece on the execution-side pattern here:&lt;/p&gt;

&lt;p&gt;The Execution Guard Pattern for AI Agents&lt;br&gt;
&lt;a href="https://dev.to/azender1/the-execution-guard-pattern-for-ai-agents-23m9"&gt;https://dev.to/azender1/the-execution-guard-pattern-for-ai-agents-23m9&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And I’m also building a Python reference implementation around this idea:&lt;/p&gt;

&lt;p&gt;GitHub&lt;br&gt;
&lt;a href="https://github.com/azender1/SafeAgent" rel="noopener noreferrer"&gt;https://github.com/azender1/SafeAgent&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>backend</category>
      <category>architecture</category>
      <category>python</category>
    </item>
    <item>
      <title>The Execution Guard Pattern for AI Agents</title>
      <dc:creator>Anthony Zender</dc:creator>
      <pubDate>Sat, 28 Mar 2026 02:04:36 +0000</pubDate>
      <link>https://forem.com/azender1/the-execution-guard-pattern-for-ai-agents-23m9</link>
      <guid>https://forem.com/azender1/the-execution-guard-pattern-for-ai-agents-23m9</guid>
      <description>&lt;p&gt;AI agents don’t just think — they execute real-world actions.&lt;/p&gt;

&lt;p&gt;Payments. Trades. Emails. API calls.&lt;/p&gt;

&lt;p&gt;And under retries, timeouts, or crashes…&lt;/p&gt;

&lt;p&gt;they can execute the same action twice.&lt;/p&gt;

&lt;p&gt;Not because the model was wrong —&lt;br&gt;
because the system has no memory of execution.&lt;/p&gt;

&lt;p&gt;The hidden failure mode&lt;/p&gt;

&lt;p&gt;A typical failure path looks like this:&lt;/p&gt;

&lt;p&gt;agent decides to call tool&lt;br&gt;
→ tool executes side effect&lt;br&gt;
→ response is lost (timeout / crash / disconnect)&lt;br&gt;
→ system retries&lt;br&gt;
→ side effect executes again&lt;/p&gt;

&lt;p&gt;Now you have:&lt;/p&gt;

&lt;p&gt;duplicate payments&lt;br&gt;
duplicate trades&lt;br&gt;
duplicate emails&lt;br&gt;
duplicate API mutations&lt;/p&gt;

&lt;p&gt;Not because the decision was wrong —&lt;br&gt;
because the execution layer has no durable receipt.&lt;/p&gt;

&lt;p&gt;Retries are correct — and still dangerous&lt;/p&gt;

&lt;p&gt;Retries are necessary for reliability.&lt;/p&gt;

&lt;p&gt;But retries + irreversible side effects without a guard = replay risk.&lt;/p&gt;

&lt;p&gt;The system cannot confidently answer:&lt;/p&gt;

&lt;p&gt;“Did this action already happen?”&lt;/p&gt;

&lt;p&gt;So it does the only thing it can:&lt;/p&gt;

&lt;p&gt;→ tries again&lt;/p&gt;

&lt;p&gt;That’s fine for reads.&lt;/p&gt;

&lt;p&gt;It’s dangerous for writes.&lt;/p&gt;

&lt;p&gt;The Execution Guard Pattern&lt;/p&gt;

&lt;p&gt;The fix is not prompt engineering.&lt;/p&gt;

&lt;p&gt;It’s an execution boundary around side effects.&lt;/p&gt;

&lt;p&gt;Pattern:&lt;br&gt;
decision&lt;br&gt;
→ deterministic request_id&lt;br&gt;
→ execution guard&lt;br&gt;
   → if receipt exists → return prior result&lt;br&gt;
   → else → execute once → store receipt&lt;/p&gt;

&lt;p&gt;Instead of asking the model to “be careful,”&lt;br&gt;
the system itself becomes replay-safe.&lt;/p&gt;

&lt;p&gt;The four required properties&lt;/p&gt;

&lt;p&gt;For this pattern to work, you need four things:&lt;/p&gt;

&lt;p&gt;1) Deterministic request identity&lt;/p&gt;

&lt;p&gt;Every logical action must map to the same request_id across retries.&lt;/p&gt;

&lt;p&gt;If the same payment, email, trade, or tool call is retried, it must resolve to the same identity.&lt;/p&gt;
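
&lt;p&gt;One possible way to get a stable identity is to hash a canonical form of the logical action. The sketch below assumes a hypothetical &lt;code&gt;make_request_id&lt;/code&gt; helper and a caller-supplied &lt;code&gt;scope&lt;/code&gt; value; the key point is that nothing attempt-specific (timestamps, retry counters, random IDs) goes into the hash.&lt;/p&gt;

```python
import hashlib
import json


def make_request_id(action, payload, scope):
    """Derive a stable ID from the logical action, not from the attempt.

    `scope` is a hypothetical caller-supplied value (for example an order
    or conversation ID) that separates two intentional, identical actions.
    """
    canonical = json.dumps(
        {"action": action, "payload": payload, "scope": scope},
        sort_keys=True,  # key order must not change the ID
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

&lt;p&gt;Two retries of the same email in the same scope hash to the same ID even if the payload keys arrive in a different order, while the same email in a different scope gets a different ID.&lt;/p&gt;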

&lt;p&gt;2) Durable receipt storage&lt;/p&gt;

&lt;p&gt;You need a place to persist what happened.&lt;/p&gt;

&lt;p&gt;Postgres works well for this because it gives you:&lt;/p&gt;

&lt;p&gt;durable writes&lt;br&gt;
transactional boundaries&lt;br&gt;
strong uniqueness guarantees&lt;br&gt;
queryable auditability&lt;/p&gt;

&lt;p&gt;Without durable receipts, retries are guesswork.&lt;/p&gt;

&lt;p&gt;3) Atomic claim → execute → complete boundary&lt;/p&gt;

&lt;p&gt;The system needs a clear execution boundary:&lt;/p&gt;

&lt;p&gt;claim the operation&lt;br&gt;
execute the side effect once&lt;br&gt;
persist the result / receipt&lt;/p&gt;

&lt;p&gt;That boundary is what prevents:&lt;/p&gt;

&lt;p&gt;concurrent replays&lt;br&gt;
duplicate workers&lt;br&gt;
race-condition duplicates&lt;br&gt;
“two consumers did the same thing” bugs&lt;/p&gt;
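
&lt;p&gt;A rough sketch of the claim → execute → complete boundary. To keep it runnable without a database server it uses SQLite from the Python standard library instead of Postgres; the table layout, status values, and function names are assumptions made for illustration. The primary key on &lt;code&gt;request_id&lt;/code&gt; is what lets exactly one caller win the claim.&lt;/p&gt;

```python
import json
import sqlite3


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS receipts ("
        " request_id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL,"
        " result TEXT)"
    )
    conn.commit()
    return conn


def run_once(conn, request_id, execute_fn):
    # Claim: the primary key allows only one insert per request_id.
    cur = conn.execute(
        "INSERT OR IGNORE INTO receipts (request_id, status)"
        " VALUES (?, 'claimed')",
        (request_id,),
    )
    conn.commit()
    if cur.rowcount == 0:
        # Already claimed: report what is on record, never re-run.
        status, stored = conn.execute(
            "SELECT status, result FROM receipts WHERE request_id = ?",
            (request_id,),
        ).fetchone()
        prior = json.loads(stored) if stored else None
        return status, prior
    # Execute: if this raises, the row stays 'claimed', which marks the
    # outcome as uncertain and in need of reconciliation, not a blind retry.
    result = execute_fn()
    # Complete: persist the receipt.
    conn.execute(
        "UPDATE receipts SET status = 'completed', result = ?"
        " WHERE request_id = ?",
        (json.dumps(result), request_id),
    )
    conn.commit()
    return "completed", result
```

&lt;p&gt;A replay that finds a &lt;code&gt;claimed&lt;/code&gt; row with no result learns that an earlier attempt started but never recorded completion, which is the case that needs reconciliation rather than another execution.&lt;/p&gt;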

&lt;p&gt;4) Replay returns the prior result&lt;/p&gt;

&lt;p&gt;If the same logical action comes in again,&lt;br&gt;
you should not execute it again.&lt;/p&gt;

&lt;p&gt;You should return the prior result.&lt;/p&gt;

&lt;p&gt;That turns:&lt;/p&gt;

&lt;p&gt;retries&lt;br&gt;
redelivery&lt;br&gt;
replay&lt;br&gt;
uncertain completion&lt;/p&gt;

&lt;p&gt;into:&lt;/p&gt;

&lt;p&gt;safe re-entry instead of duplicate side effects&lt;/p&gt;

&lt;p&gt;What this is NOT&lt;/p&gt;

&lt;p&gt;This is not:&lt;/p&gt;

&lt;p&gt;moderation&lt;br&gt;
prompt safety&lt;br&gt;
RBAC&lt;br&gt;
approval workflows&lt;br&gt;
hallucination prevention&lt;/p&gt;

&lt;p&gt;It solves one thing:&lt;/p&gt;

&lt;p&gt;“Did this irreversible action already happen?”&lt;/p&gt;

&lt;p&gt;That question shows up everywhere once agents or automations start calling real tools.&lt;/p&gt;

&lt;p&gt;Where this matters most&lt;/p&gt;

&lt;p&gt;This pattern matters anywhere your system causes real-world side effects:&lt;/p&gt;

&lt;p&gt;webhook handlers&lt;br&gt;
billing / payment flows&lt;br&gt;
async workers / queues&lt;br&gt;
workflow / automation systems&lt;br&gt;
AI agent tool calls&lt;br&gt;
external API mutations&lt;br&gt;
order / booking / ticket creation&lt;br&gt;
notifications and email sends&lt;/p&gt;

&lt;p&gt;In other words:&lt;/p&gt;

&lt;p&gt;anything that should happen once, even if the system retries&lt;/p&gt;

&lt;p&gt;Why this keeps showing up&lt;/p&gt;

&lt;p&gt;Modern systems are:&lt;/p&gt;

&lt;p&gt;distributed&lt;br&gt;
async&lt;br&gt;
retry-heavy&lt;br&gt;
failure-prone&lt;br&gt;
full of uncertain completion&lt;/p&gt;

&lt;p&gt;So “exactly once” does not happen naturally.&lt;/p&gt;

&lt;p&gt;You have to build it explicitly.&lt;/p&gt;

&lt;p&gt;And once you add:&lt;/p&gt;

&lt;p&gt;AI agents&lt;br&gt;
autonomous workflows&lt;br&gt;
tool-calling systems&lt;/p&gt;

&lt;p&gt;…the need for an execution boundary gets even sharper.&lt;/p&gt;

&lt;p&gt;Because now a model can repeatedly decide to invoke something that has real-world consequences.&lt;/p&gt;

&lt;p&gt;A practical implementation direction&lt;/p&gt;

&lt;p&gt;In many systems, this can be implemented with:&lt;/p&gt;

&lt;p&gt;a Postgres-backed receipt table&lt;br&gt;
a stable operation / request ID&lt;br&gt;
a guard layer around side-effecting functions&lt;/p&gt;

&lt;p&gt;That turns:&lt;/p&gt;

&lt;p&gt;unsafe retries&lt;/p&gt;

&lt;p&gt;into:&lt;/p&gt;

&lt;p&gt;safe replays&lt;/p&gt;

&lt;p&gt;This doesn’t require rewriting your whole system.&lt;/p&gt;

&lt;p&gt;It usually means identifying the small set of functions that can cause irreversible side effects and wrapping them with a durable execution boundary.&lt;/p&gt;

&lt;p&gt;That’s where the leverage is.&lt;/p&gt;

&lt;p&gt;Closing thought&lt;/p&gt;

&lt;p&gt;If an AI agent can call tools,&lt;br&gt;
it needs more than reasoning.&lt;/p&gt;

&lt;p&gt;It needs execution memory.&lt;/p&gt;

&lt;p&gt;Otherwise:&lt;/p&gt;

&lt;p&gt;retries will eventually execute something twice.&lt;/p&gt;

&lt;p&gt;Execution Risk Audit&lt;/p&gt;

&lt;p&gt;I’m currently looking at systems where retries, webhooks, workers, workflows, or AI agents can replay irreversible actions.&lt;/p&gt;

&lt;p&gt;If your system has paths where you can’t confidently answer:&lt;/p&gt;

&lt;p&gt;“Did this action already happen?”&lt;/p&gt;

&lt;p&gt;that’s exactly the kind of problem I’m focused on.&lt;/p&gt;

&lt;p&gt;Especially interested in:&lt;/p&gt;

&lt;p&gt;duplicate webhook execution&lt;br&gt;
retry-safe billing flows&lt;br&gt;
workflow steps with uncertain completion&lt;br&gt;
AI agents calling side-effecting tools&lt;/p&gt;

</description>
      <category>ai</category>
      <category>backend</category>
      <category>python</category>
      <category>postgres</category>
    </item>
  </channel>
</rss>
