<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Lars Winstand</title>
    <description>The latest articles on Forem by Lars Winstand (@lars_winstand).</description>
    <link>https://forem.com/lars_winstand</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3908932%2Feb8bc1ff-405f-4ef0-8204-ba1ed7caa59f.jpeg</url>
      <title>Forem: Lars Winstand</title>
      <link>https://forem.com/lars_winstand</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/lars_winstand"/>
    <language>en</language>
    <item>
      <title>I thought multi-agent meant more prompts until I saw 3 ways OpenClaw users are actually splitting the work</title>
      <dc:creator>Lars Winstand</dc:creator>
      <pubDate>Sun, 03 May 2026 12:39:41 +0000</pubDate>
      <link>https://forem.com/lars_winstand/i-thought-multi-agent-meant-more-prompts-until-i-saw-3-ways-openclaw-users-are-actually-splitting-13ko</link>
      <guid>https://forem.com/lars_winstand/i-thought-multi-agent-meant-more-prompts-until-i-saw-3-ways-openclaw-users-are-actually-splitting-13ko</guid>
      <description>&lt;p&gt;I went into a bunch of OpenClaw discussions expecting the usual advice about subagents: better prompts, cleaner folders, maybe some heroic config.&lt;/p&gt;

&lt;p&gt;What I found was more interesting.&lt;/p&gt;

&lt;p&gt;The OpenClaw setups that actually seem to hold up are not just "one agent with more prompts." They are separate services with separate trust zones.&lt;/p&gt;

&lt;p&gt;The pattern that keeps showing up looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a librarian agent&lt;/li&gt;
&lt;li&gt;an executor agent&lt;/li&gt;
&lt;li&gt;a company-facing agent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Usually connected over A2A.&lt;/p&gt;

&lt;p&gt;That sounds like a small implementation detail. It is not.&lt;/p&gt;

&lt;p&gt;A separate prompt inside one workspace is still one workspace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one context blob&lt;/li&gt;
&lt;li&gt;one tool surface&lt;/li&gt;
&lt;li&gt;one security boundary&lt;/li&gt;
&lt;li&gt;one place for bloat to accumulate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A separate OpenClaw instance is different. Now you have real boundaries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;different runtimes&lt;/li&gt;
&lt;li&gt;different API keys&lt;/li&gt;
&lt;li&gt;different networks&lt;/li&gt;
&lt;li&gt;different memory policies&lt;/li&gt;
&lt;li&gt;explicit handoffs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is where multi-agent starts being architecture instead of roleplay.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Reddit pattern is ahead of most blog posts
&lt;/h2&gt;

&lt;p&gt;One of the clearest examples was an r/openclaw thread about an A2A plugin:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://reddit.com/r/openclaw/comments/1t1yf86/i_made_an_openclaw_a2a_plugin_connect_your/" rel="noopener noreferrer"&gt;https://reddit.com/r/openclaw/comments/1t1yf86/i_made_an_openclaw_a2a_plugin_connect_your/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post itself was small, but the use cases were sharp:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;a sandboxed local OpenClaw talking to a full-access cloud OpenClaw&lt;/li&gt;
&lt;li&gt;a personal OpenClaw talking to a company-wide OpenClaw for internal services&lt;/li&gt;
&lt;li&gt;teammate agents syncing plans over the internet to avoid stepping on each other&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is not prompt organization. That is system design.&lt;/p&gt;

&lt;p&gt;And it answers the question I keep seeing from people trying to force multi-agent into one workspace:&lt;/p&gt;

&lt;h2&gt;
  
  
  Why not just keep everything in one OpenClaw workspace?
&lt;/h2&gt;

&lt;p&gt;Because the boundary is the point.&lt;/p&gt;

&lt;p&gt;If your librarian, executor, and company-facing assistant all live in the same workspace, a lot of the specialization is fake.&lt;/p&gt;

&lt;p&gt;The librarian can still see too much.&lt;/p&gt;

&lt;p&gt;The executor still inherits too much context.&lt;/p&gt;

&lt;p&gt;The company-facing assistant is still one bad tool call away from touching something it should not.&lt;/p&gt;

&lt;p&gt;Here is the tradeoff in plain terms:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;What actually happens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Separate A2A services&lt;/td&gt;
&lt;td&gt;Clear trust boundary, can run on different machines or networks, but setup and security overhead are real&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Subagents inside one OpenClaw workspace&lt;/td&gt;
&lt;td&gt;Fast and simple with lower latency, but tools and context are only weakly isolated and the workspace bloats easily&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;n8n for orchestration plus agents for reasoning&lt;/td&gt;
&lt;td&gt;Great for deterministic triggers and data movement, but glue code gets messy fast&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;My opinionated take: multi-agent is only worth the complexity when the boundary is real.&lt;/p&gt;

&lt;p&gt;If the split is just:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;this prompt is the researcher&lt;/li&gt;
&lt;li&gt;this prompt is the coder&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;then you probably do not have multiple agents. You have one agent wearing name tags.&lt;/p&gt;

&lt;h2&gt;
  
  
  The librarian pattern is better than it sounds
&lt;/h2&gt;

&lt;p&gt;A commenter in that A2A thread described a pattern I think more teams should steal:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I need an agent that acts as a librarian and gatekeeper for a RAG implementation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a strong design choice because it forces a question most agent stacks avoid:&lt;/p&gt;

&lt;p&gt;Who is allowed to touch memory, and why?&lt;/p&gt;

&lt;p&gt;A librarian agent can own retrieval and document selection.&lt;/p&gt;

&lt;p&gt;It can decide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;which sources are valid&lt;/li&gt;
&lt;li&gt;how much context to return&lt;/li&gt;
&lt;li&gt;whether a query deserves a deep search&lt;/li&gt;
&lt;li&gt;what gets filtered before it reaches the executor&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then your executor agent can stay focused on doing work instead of dragging your entire RAG stack into every session.&lt;/p&gt;

&lt;h3&gt;
  
  
  When a separate librarian makes sense
&lt;/h3&gt;

&lt;p&gt;Use a dedicated librarian when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;retrieval needs its own rules&lt;/li&gt;
&lt;li&gt;memory access should be restricted&lt;/li&gt;
&lt;li&gt;different agents need different knowledge slices&lt;/li&gt;
&lt;li&gt;you want to keep executor context small&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When direct memory access is better
&lt;/h3&gt;

&lt;p&gt;Keep it simple when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;everything is local&lt;/li&gt;
&lt;li&gt;latency matters more than isolation&lt;/li&gt;
&lt;li&gt;the same agent already owns the knowledge domain&lt;/li&gt;
&lt;li&gt;you are adding A2A mostly because it sounds advanced&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That tradeoff matters more than the label.&lt;/p&gt;

&lt;p&gt;Not every boundary should become a network boundary.&lt;/p&gt;

&lt;p&gt;But the useful ones usually should.&lt;/p&gt;

&lt;h2&gt;
  
  
  A practical split: one agent per trust boundary
&lt;/h2&gt;

&lt;p&gt;The cleanest rule I found is this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one agent per trust boundary&lt;/li&gt;
&lt;li&gt;one agent per memory policy&lt;/li&gt;
&lt;li&gt;one agent per tool class&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That usually gives you something like this:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Librarian
&lt;/h3&gt;

&lt;p&gt;Owns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;retrieval&lt;/li&gt;
&lt;li&gt;indexing rules&lt;/li&gt;
&lt;li&gt;memory access&lt;/li&gt;
&lt;li&gt;document selection&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Executor
&lt;/h3&gt;

&lt;p&gt;Owns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;actions&lt;/li&gt;
&lt;li&gt;code changes&lt;/li&gt;
&lt;li&gt;task completion&lt;/li&gt;
&lt;li&gt;narrow operational tools&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Company-facing interface
&lt;/h3&gt;

&lt;p&gt;Owns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;internal service access&lt;/li&gt;
&lt;li&gt;approvals&lt;/li&gt;
&lt;li&gt;policy enforcement&lt;/li&gt;
&lt;li&gt;boring but critical guardrails&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If two of those share the same tools, same memory, same runtime, and same risk profile, they probably should not be separate yet.&lt;/p&gt;

&lt;p&gt;If they differ on any of those, split them.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this looks like in practice
&lt;/h2&gt;

&lt;p&gt;Here is a simple mental model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[user/app]
   |
   v
[company-facing OpenClaw]
   |
   +--&amp;gt; [librarian OpenClaw] --&amp;gt; [docs/vector store]
   |
   +--&amp;gt; [executor OpenClaw] --&amp;gt; [repo/tools/shell]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here is the kind of split I would actually implement.&lt;/p&gt;

&lt;h3&gt;
  
  
  Company-facing agent
&lt;/h3&gt;

&lt;p&gt;This is the only agent that talks to the outside world.&lt;/p&gt;

&lt;p&gt;Responsibilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;receive requests&lt;/li&gt;
&lt;li&gt;check policy&lt;/li&gt;
&lt;li&gt;decide whether work needs retrieval, execution, or both&lt;/li&gt;
&lt;li&gt;redact or reshape requests before forwarding&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Librarian agent
&lt;/h3&gt;

&lt;p&gt;This agent gets read-only access to your knowledge systems.&lt;/p&gt;

&lt;p&gt;Responsibilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;search docs&lt;/li&gt;
&lt;li&gt;fetch relevant chunks&lt;/li&gt;
&lt;li&gt;summarize long context&lt;/li&gt;
&lt;li&gt;return only what downstream agents need&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Executor agent
&lt;/h3&gt;

&lt;p&gt;This one gets the dangerous tools.&lt;/p&gt;

&lt;p&gt;Responsibilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;write code&lt;/li&gt;
&lt;li&gt;run commands&lt;/li&gt;
&lt;li&gt;modify files&lt;/li&gt;
&lt;li&gt;execute workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That split avoids the worst anti-pattern: giving the same agent broad memory access and broad tool access and then hoping the prompt keeps it safe.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security is where the fantasy ends
&lt;/h2&gt;

&lt;p&gt;This is the first serious objection in every good A2A discussion, and it should be.&lt;/p&gt;

&lt;p&gt;In that same A2A thread, someone pointed out the obvious risk: inbound calls can trigger OpenClaw tools.&lt;/p&gt;

&lt;p&gt;That is not paranoia. That is basic engineering.&lt;/p&gt;

&lt;p&gt;The plugin author responded with a few practical details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;secure-by-default posture&lt;/li&gt;
&lt;li&gt;per-agent API keys&lt;/li&gt;
&lt;li&gt;sender IDs&lt;/li&gt;
&lt;li&gt;new conversation threads for each inbound message&lt;/li&gt;
&lt;li&gt;Tailscale for receiving messages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They also suggested using a separate profile for experiments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw &lt;span class="nt"&gt;--profile&lt;/span&gt; gateway
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the right mindset.&lt;/p&gt;

&lt;p&gt;A2A is not magic. It is distributed systems with LLMs attached.&lt;/p&gt;

&lt;p&gt;Which means you inherit the normal taxes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;security tax&lt;/li&gt;
&lt;li&gt;ops tax&lt;/li&gt;
&lt;li&gt;debugging tax&lt;/li&gt;
&lt;li&gt;latency tax&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are not getting a real boundary in return, do not pay those taxes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Add n8n carefully or you will build glue-code soup
&lt;/h2&gt;

&lt;p&gt;Another useful OpenClaw thread described a setup with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a shared VPS&lt;/li&gt;
&lt;li&gt;multiple OpenClaw agents&lt;/li&gt;
&lt;li&gt;n8n&lt;/li&gt;
&lt;li&gt;local users connecting through Antigravity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Source:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://reddit.com/r/openclaw/comments/1t0nnkz/am_i_overengineering_this_openclaw_n8n/" rel="noopener noreferrer"&gt;https://reddit.com/r/openclaw/comments/1t0nnkz/am_i_overengineering_this_openclaw_n8n/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That architecture is not crazy.&lt;/p&gt;

&lt;p&gt;But it gets messy fast if every system co-owns the workflow.&lt;/p&gt;

&lt;p&gt;My rule of thumb:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;let n8n handle deterministic flows, triggers, schedules, and integrations&lt;/li&gt;
&lt;li&gt;let OpenClaw handle reasoning, exception handling, and ambiguous tasks&lt;/li&gt;
&lt;li&gt;keep the number of cross-service handoffs lower than your first instinct suggests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A simple split looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;n8n&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;owns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cron jobs&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;webhooks&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;API integrations&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;retries&lt;/span&gt;

&lt;span class="na"&gt;openclaw&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;owns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;planning&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;reasoning&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ambiguous decisions&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;code generation&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you make n8n, OpenClaw, and your local client all coordinate state, debugging gets ugly.&lt;/p&gt;

&lt;p&gt;You end up tracing things like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;OpenClaw A calls OpenClaw B&lt;/li&gt;
&lt;li&gt;OpenClaw B triggers n8n&lt;/li&gt;
&lt;li&gt;n8n writes state&lt;/li&gt;
&lt;li&gt;OpenClaw A no longer trusts the state it originally requested&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is not a model problem. That is orchestration debt.&lt;/p&gt;

&lt;h2&gt;
  
  
  The expensive part is often not the model
&lt;/h2&gt;

&lt;p&gt;One of the most useful OpenClaw cost posts I found came from a user who spent about $850 in a month, including around $350 in one day:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://reddit.com/r/openclaw/comments/1t2fd8o/spent_850_on_openclaw_in_a_month_350_in_one_day/" rel="noopener noreferrer"&gt;https://reddit.com/r/openclaw/comments/1t2fd8o/spent_850_on_openclaw_in_a_month_350_in_one_day/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The key line was this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;At first I thought it was model cost. It wasn’t. It was bad system design.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That should be printed on a sticker and attached to every agent dashboard.&lt;/p&gt;

&lt;p&gt;The fixes were not exotic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;strict context pruning&lt;/li&gt;
&lt;li&gt;short sessions&lt;/li&gt;
&lt;li&gt;n8n for repeat tasks&lt;/li&gt;
&lt;li&gt;workspace cleanup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They reported 70 to 90 percent savings after redesigning the stack.&lt;/p&gt;

&lt;p&gt;That matches what a lot of teams eventually learn:&lt;/p&gt;

&lt;p&gt;The bill is not just about which model you picked.&lt;/p&gt;

&lt;p&gt;It is about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how much useless context you drag around&lt;/li&gt;
&lt;li&gt;how often the wrong agent gets invoked&lt;/li&gt;
&lt;li&gt;how many handoffs you created&lt;/li&gt;
&lt;li&gt;how much deterministic work you let an LLM do&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is exactly why real boundaries matter.&lt;/p&gt;

&lt;p&gt;A librarian agent can stay small.&lt;/p&gt;

&lt;p&gt;An executor can stay sharp.&lt;/p&gt;

&lt;p&gt;A company-facing agent can stay boring.&lt;/p&gt;

&lt;p&gt;That is not architecture purity. That is cost control.&lt;/p&gt;

&lt;h2&gt;
  
  
  A minimal implementation sketch
&lt;/h2&gt;

&lt;p&gt;If I were building this today, I would start with something like this.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Create isolated runtimes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw &lt;span class="nt"&gt;--profile&lt;/span&gt; company
openclaw &lt;span class="nt"&gt;--profile&lt;/span&gt; librarian
openclaw &lt;span class="nt"&gt;--profile&lt;/span&gt; executor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Give each runtime only the tools it needs
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"company"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"policy-check"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"request-router"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"librarian"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"vector-search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"doc-fetch"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rerank"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"executor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"git"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"shell"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test-runner"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
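
&lt;p&gt;The JSON above is a mental model, not an OpenClaw config format. To make it enforceable you need a deny-by-default check somewhere in front of tool dispatch. Here is a minimal sketch, assuming you control that dispatch point; &lt;code&gt;TOOL_ALLOWLIST&lt;/code&gt;, &lt;code&gt;isToolAllowed&lt;/code&gt;, and &lt;code&gt;assertToolAllowed&lt;/code&gt; are hypothetical names, not OpenClaw APIs.&lt;/p&gt;

```javascript
// Hypothetical allowlist mirroring the JSON above. The profile and tool
// names are illustrative, not OpenClaw configuration keys.
const TOOL_ALLOWLIST = {
  company: ["policy-check", "request-router"],
  librarian: ["vector-search", "doc-fetch", "rerank"],
  executor: ["git", "shell", "test-runner"],
};

// Returns true only when the profile exists and explicitly lists the tool.
// Unknown profiles and unknown tools are denied by default.
function isToolAllowed(profile, tool) {
  const known = Object.prototype.hasOwnProperty.call(TOOL_ALLOWLIST, profile);
  const allowed = known ? TOOL_ALLOWLIST[profile] : [];
  return allowed.includes(tool);
}

// Throws before dispatch so a denied call never reaches the tool itself.
function assertToolAllowed(profile, tool) {
  if (!isToolAllowed(profile, tool)) {
    throw new Error(`profile "${profile}" may not call tool "${tool}"`);
  }
}
```

&lt;p&gt;The point of the sketch is the default: a librarian asking for &lt;code&gt;shell&lt;/code&gt; is rejected in code, not talked out of it in a prompt.&lt;/p&gt;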



&lt;h3&gt;
  
  
  3. Keep the message contract small
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"task"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"summarize auth flow docs relevant to OAuth token refresh bugs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"constraints"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"read-only"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"max 10 chunks"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"request_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"req_123"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Return only what the next agent needs
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"request_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"req_123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Token refresh logic lives in auth-service and mobile-sdk"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sources"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"docs/auth/refresh-flow.md"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"docs/mobile/oauth.md"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That one habit alone prevents a lot of context bloat.&lt;/p&gt;
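
&lt;p&gt;One way to make that habit stick is to build the handoff from a fixed set of fields instead of forwarding whatever the librarian produced. A small sketch, assuming the field names from the JSON examples above; &lt;code&gt;toHandoff&lt;/code&gt; and the cap of 10 sources are my own illustrative choices.&lt;/p&gt;

```javascript
// Hypothetical helper: the field names follow the JSON examples above,
// but the function and the source cap are assumptions for illustration.
const MAX_SOURCES = 10;

// Copies only the contract fields into a new object, so anything else the
// librarian produced (raw chunks, scores, transcripts) is dropped here.
function toHandoff(raw) {
  if (typeof raw.request_id !== "string" || raw.request_id === "") {
    throw new Error("handoff needs a request_id");
  }
  const sources = Array.isArray(raw.sources) ? raw.sources : [];
  return {
    request_id: raw.request_id,
    summary: String(raw.summary ?? ""),
    sources: sources.slice(0, MAX_SOURCES),
  };
}
```

&lt;p&gt;Because the result is built field by field, adding something to the contract becomes a deliberate code change rather than an accident of what happened to be in memory.&lt;/p&gt;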

&lt;h2&gt;
  
  
  How I would decide whether to split an agent
&lt;/h2&gt;

&lt;p&gt;Before creating a new agent, ask:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;should this component have different tool permissions?&lt;/li&gt;
&lt;li&gt;should this component have different memory access?&lt;/li&gt;
&lt;li&gt;should this component run in a different network or trust zone?&lt;/li&gt;
&lt;li&gt;would this split reduce context size in a meaningful way?&lt;/li&gt;
&lt;li&gt;can I explain the boundary without using the phrase "it feels cleaner"?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the answer to most of those is no, keep one agent.&lt;/p&gt;

&lt;p&gt;If the answer to most of them is yes, split it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Standard Compute fits
&lt;/h2&gt;

&lt;p&gt;There is one more practical issue here: once you start doing multi-agent properly, request volume goes up fast.&lt;/p&gt;

&lt;p&gt;Not because you are being wasteful. Because clean architecture creates more small calls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;routing calls&lt;/li&gt;
&lt;li&gt;retrieval calls&lt;/li&gt;
&lt;li&gt;execution calls&lt;/li&gt;
&lt;li&gt;retries&lt;/li&gt;
&lt;li&gt;background automations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is exactly where per-token pricing becomes annoying.&lt;/p&gt;

&lt;p&gt;You stop optimizing for quality and start optimizing for what will not surprise you on the invoice.&lt;/p&gt;

&lt;p&gt;For OpenClaw users running always-on agents, that is backwards.&lt;/p&gt;

&lt;p&gt;Standard Compute is built for this exact situation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;unlimited AI compute for OpenClaw at a flat monthly price&lt;/li&gt;
&lt;li&gt;no per-token billing&lt;/li&gt;
&lt;li&gt;works with existing OpenClaw setups using a custom prompt&lt;/li&gt;
&lt;li&gt;dynamic routing across GPT-5.4, Claude Opus 4.6, and Grok 4.20&lt;/li&gt;
&lt;li&gt;plans from $9 to $399 per month&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your stack is moving from "one giant workspace" to actual multi-agent services, predictable cost matters a lot more than people admit.&lt;/p&gt;

&lt;p&gt;Because the fastest way to ruin a good architecture is making developers afraid to let agents run.&lt;/p&gt;

&lt;h2&gt;
  
  
  The boring takeaway that will save you later
&lt;/h2&gt;

&lt;p&gt;If you are building with OpenClaw, do not start with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how many agents should I have?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Start with:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;which agent should know this?&lt;/li&gt;
&lt;li&gt;which agent should be allowed to do this?&lt;/li&gt;
&lt;li&gt;which agent should pay the context cost for this?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If all three answers point to the same place, keep it in one workspace.&lt;/p&gt;

&lt;p&gt;If they do not, stop stuffing more prompts into one bot and calling it architecture.&lt;/p&gt;

&lt;p&gt;That is the shift I keep seeing in OpenClaw discussions.&lt;/p&gt;

&lt;p&gt;Not more agents for the sake of it.&lt;/p&gt;

&lt;p&gt;Better boundaries.&lt;/p&gt;

&lt;p&gt;Less context bloat.&lt;/p&gt;

&lt;p&gt;Fewer surprise bills.&lt;/p&gt;

&lt;p&gt;And systems that still make sense when they are running under pressure.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>openclaw</category>
      <category>agents</category>
      <category>devops</category>
    </item>
    <item>
      <title>I found the dumbest way to burn 500 LLM calls a day: polling an inbox every 5 minutes</title>
      <dc:creator>Lars Winstand</dc:creator>
      <pubDate>Sat, 02 May 2026 13:44:12 +0000</pubDate>
      <link>https://forem.com/lars_winstand/i-found-the-dumbest-way-to-burn-500-llm-calls-a-day-polling-an-inbox-every-5-minutes-2m1o</link>
      <guid>https://forem.com/lars_winstand/i-found-the-dumbest-way-to-burn-500-llm-calls-a-day-polling-an-inbox-every-5-minutes-2m1o</guid>
      <description>&lt;p&gt;If your OpenClaw agent checks an email inbox every 5 minutes, you’re probably paying for idle paranoia.&lt;/p&gt;

&lt;p&gt;That’s not a theoretical complaint. In an r/openclaw thread about triggering jobs from email, one user described an MS365 setup like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"At the moment, I have Openclaw job where agent checks its ms365 mailbox every 5 minutes... Wasted calls to LLM (nearly 500 calls to LLM per day)"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is such a painfully real failure mode.&lt;/p&gt;

&lt;p&gt;The demo works. The cron job looks harmless. Then a month later your agent is re-checking old mail, occasionally double-processing messages, and quietly spending model calls on nothing.&lt;/p&gt;

&lt;p&gt;If you’re building always-on agents, this is exactly the kind of bug that turns “cool automation” into “why is this thing flaky and expensive?”&lt;/p&gt;

&lt;h2&gt;
  
  
  The pattern everyone starts with
&lt;/h2&gt;

&lt;p&gt;Usually it looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Connect OpenClaw to a mailbox&lt;/li&gt;
&lt;li&gt;Poll every 5 minutes with IMAP or Microsoft Graph&lt;/li&gt;
&lt;li&gt;If there’s a new message, send it to GPT-5.4, Claude Opus 4.6, or whatever model you’re using&lt;/li&gt;
&lt;li&gt;Try not to process the same email twice&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For a proof of concept, that’s fine.&lt;/p&gt;

&lt;p&gt;If it’s one internal mailbox, low volume, and you have a tiny dedupe store in SQLite, polling can be good enough.&lt;/p&gt;

&lt;p&gt;But once the workflow matters, polling starts failing in boring and expensive ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you keep checking when nothing changed&lt;/li&gt;
&lt;li&gt;you burn LLM calls on already-seen messages&lt;/li&gt;
&lt;li&gt;you introduce delays by design&lt;/li&gt;
&lt;li&gt;you get duplicate processing when scans overlap&lt;/li&gt;
&lt;li&gt;you miss messages when state gets out of sync&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Another user in that same r/openclaw discussion put it even more bluntly:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I abandoned the interval based scanning... if the scan got out of sync I had repeated responses (more wasted calls) or ignored mails. I failed to get it to be reliable."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s the actual problem.&lt;/p&gt;

&lt;p&gt;Polling doesn’t just waste money. It makes the agent feel unreliable.&lt;/p&gt;

&lt;p&gt;And unreliable is worse than expensive.&lt;/p&gt;
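
&lt;p&gt;The repeated-response failure in that quote is an idempotency problem: two overlapping scans both see the same message. Here is a minimal sketch of the dedupe idea, using an in-memory set where a real setup would use the small SQLite store mentioned earlier. Every name in it is hypothetical.&lt;/p&gt;

```javascript
// In-memory stand-in for a tiny dedupe store. A real deployment would
// persist seen IDs (for example in SQLite) so restarts do not replay old
// mail. All names here are hypothetical.
function createDedupe() {
  const seen = new Set();
  return {
    // Returns true the first time an ID is claimed and false afterwards.
    claim(messageId) {
      if (seen.has(messageId)) return false;
      seen.add(messageId);
      return true;
    },
  };
}

// Calls handle() only for messages that have not been claimed before.
// handle stands in for whatever forwards the message to the model.
function processBatch(dedupe, messages, handle) {
  let handled = 0;
  for (const message of messages) {
    if (dedupe.claim(message.id)) {
      handle(message);
      handled += 1;
    }
  }
  return handled;
}
```

&lt;p&gt;Claiming before handling gives at-most-once behavior: a crash mid-handle drops a message instead of repeating it. If a missed mail is worse than a duplicate reply, record the ID after handling instead.&lt;/p&gt;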

&lt;h2&gt;
  
  
  Microsoft and Google are both telling you to stop polling
&lt;/h2&gt;

&lt;p&gt;This part is worth emphasizing: the anti-polling advice is not just random architecture purism.&lt;/p&gt;

&lt;p&gt;Microsoft Graph supports change notifications so apps can react to mailbox changes instead of hammering the API on a timer.&lt;/p&gt;

&lt;p&gt;Gmail push notifications exist for the same reason. Google says push eliminates the extra network and compute cost of polling resources to see if they changed.&lt;/p&gt;

&lt;p&gt;If both mailbox providers are nudging you toward push, that’s a clue.&lt;/p&gt;

&lt;h2&gt;
  
  
  What production intake should look like
&lt;/h2&gt;

&lt;p&gt;There are a few sane ways to do inbound email for agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gmail API watch + Google Cloud Pub/Sub&lt;/li&gt;
&lt;li&gt;Microsoft Graph change notifications&lt;/li&gt;
&lt;li&gt;Twilio SendGrid Inbound Parse Webhook&lt;/li&gt;
&lt;li&gt;an email-native service like AgentMail&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The common idea is simple:&lt;/p&gt;

&lt;p&gt;The provider tells your system that mail arrived.&lt;/p&gt;

&lt;p&gt;Your system does not keep asking if anything changed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gmail: watch the inbox instead of polling it
&lt;/h2&gt;

&lt;p&gt;For Gmail, the production path is Gmail API watch on the inbox, then Pub/Sub delivers notifications to your webhook.&lt;/p&gt;

&lt;p&gt;Example request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST https://gmail.googleapis.com/gmail/v1/users/me/watch
Content-Type: application/json
Authorization: Bearer &amp;lt;access_token&amp;gt;

{
  "topicName": "projects/myproject/topics/mytopic",
  "labelIds": ["INBOX"],
  "labelFilterBehavior": "INCLUDE"
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Google returns a history ID and an expiration time.&lt;/p&gt;

&lt;p&gt;That means two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;you need to process changes based on history&lt;/li&gt;
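&lt;li&gt;you need to renew the watch before it expires (Google requires a fresh &lt;code&gt;watch&lt;/code&gt; call at least every 7 days and recommends one per day)&lt;/li&gt;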
&lt;/ol&gt;

&lt;p&gt;This is cleaner than polling, but it is not zero-maintenance.&lt;/p&gt;

&lt;p&gt;You still need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a Pub/Sub topic&lt;/li&gt;
&lt;li&gt;a subscription&lt;/li&gt;
&lt;li&gt;IAM configured correctly&lt;/li&gt;
&lt;li&gt;watch renewal logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you skip the lifecycle work, your “event-driven” setup becomes a very fancy outage.&lt;/p&gt;
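
&lt;p&gt;The renewal check itself is tiny. This is a sketch, assuming you stored the &lt;code&gt;expiration&lt;/code&gt; value from the watch response (epoch milliseconds, returned as a string); &lt;code&gt;shouldRenewWatch&lt;/code&gt; and the one-day safety margin are my own choices, not part of the Gmail API.&lt;/p&gt;

```javascript
// Decides whether a Gmail watch should be renewed now. The watch response's
// expiration field is epoch milliseconds, returned as a string. The one-day
// safety margin and the function name are assumptions.
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

function shouldRenewWatch(expiration, nowMs, marginMs = ONE_DAY_MS) {
  const expiresAt = Number(expiration);
  // Treat a missing or malformed expiration as "renew now".
  if (!Number.isFinite(expiresAt)) return true;
  // Renew once we are inside the safety margin before expiry.
  return nowMs + marginMs >= expiresAt;
}
```

&lt;p&gt;Run it from a scheduled job with &lt;code&gt;Date.now()&lt;/code&gt; as &lt;code&gt;nowMs&lt;/code&gt; and call watch again whenever it returns true. This is the one timer worth keeping.&lt;/p&gt;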

&lt;h2&gt;
  
  
  Microsoft 365: use Graph change notifications
&lt;/h2&gt;

&lt;p&gt;For Microsoft 365, use Microsoft Graph subscriptions for Outlook messages.&lt;/p&gt;

&lt;p&gt;Example subscription:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST https://graph.microsoft.com/v1.0/subscriptions
Content-Type: application/json
Authorization: Bearer &amp;lt;access_token&amp;gt;

{
  "changeType": "created",
  "notificationUrl": "https://your-app.example.com/webhooks/graph",
  "resource": "/me/mailFolders('Inbox')/messages",
  "expirationDateTime": "2026-05-03T00:00:00Z",
  "clientState": "openclaw-mailbox-prod"
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You need to handle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;webhook validation&lt;/li&gt;
&lt;li&gt;subscription renewal&lt;/li&gt;
&lt;li&gt;clientState verification&lt;/li&gt;
&lt;li&gt;dedupe after notification delivery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Again: more setup than polling, much better behavior in production.&lt;/p&gt;
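
&lt;p&gt;The first and third items on that list fit in one small function. When you create the subscription, Graph calls your notification URL with a &lt;code&gt;validationToken&lt;/code&gt; query parameter and expects it echoed back as plain text. Real notifications then arrive as a JSON body with a &lt;code&gt;value&lt;/code&gt; array whose entries carry your &lt;code&gt;clientState&lt;/code&gt;. The sketch below is framework-free; &lt;code&gt;handleGraphWebhook&lt;/code&gt; and its return shape are hypothetical.&lt;/p&gt;

```javascript
// Pure sketch of the two request types a Graph notification URL receives.
// query and body are the already-parsed query string and JSON body; the
// function name and the return shape are assumptions for illustration.
function handleGraphWebhook(query, body, expectedClientState) {
  // Validation handshake: echo the token back as plain text with a 200.
  if (typeof query.validationToken === "string") {
    return {
      status: 200,
      contentType: "text/plain",
      body: query.validationToken,
      accepted: [],
    };
  }
  const notifications = Array.isArray(body?.value) ? body.value : [];
  // Drop anything whose clientState does not match what we subscribed with.
  const accepted = notifications.filter(function (n) {
    return n.clientState === expectedClientState;
  });
  // Acknowledge quickly; fetching and processing messages happens later.
  return { status: 202, contentType: "text/plain", body: "", accepted };
}
```

&lt;p&gt;Whatever ends up in &lt;code&gt;accepted&lt;/code&gt; still needs the dedupe step before it reaches a model, because a notification can be delivered more than once.&lt;/p&gt;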

&lt;h2&gt;
  
  
  SendGrid is the cleanest mental model
&lt;/h2&gt;

&lt;p&gt;If you want the simplest model for inbound email to HTTP, SendGrid Inbound Parse is hard to beat.&lt;/p&gt;

&lt;p&gt;Email arrives.&lt;/p&gt;

&lt;p&gt;SendGrid parses it.&lt;/p&gt;

&lt;p&gt;SendGrid POSTs the content to your endpoint.&lt;/p&gt;

&lt;p&gt;Minimal example in Node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;multer&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;multer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="c1"&gt;// Inbound Parse posts multipart/form-data, so parse it with multer (files stay in memory by default).&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;upload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;multer&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/inbound-email&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;upload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;messageId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/Message-ID: &lt;/span&gt;&lt;span class="se"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;.+&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="sr"&gt;/i&lt;/span&gt;&lt;span class="p"&gt;)?.[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;subject&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// 1. dedupe check&lt;/span&gt;
  &lt;span class="c1"&gt;// 2. persist event&lt;/span&gt;
  &lt;span class="c1"&gt;// 3. enqueue background processing&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;messageId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Listening on :3000&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The nice part is the delivery contract.&lt;/p&gt;

&lt;p&gt;If your endpoint returns 5XX, SendGrid retries.&lt;br&gt;
If your endpoint returns 2XX, retries stop.&lt;/p&gt;

&lt;p&gt;That is a much sharper failure model than “cron ran, maybe.”&lt;/p&gt;
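&lt;p&gt;One way to lean on that contract, sketched under the assumption that you return 2XX only once the email is safely recorded and 5XX when storage fails, so the sender tries again. &lt;code&gt;saveEvent&lt;/code&gt; is a hypothetical callback standing in for whatever persists the event.&lt;/p&gt;

```javascript
// Hypothetical helper: `saveEvent` persists the event and reports "stored" or "duplicate".
function ackStatusFor(event, saveEvent) {
  try {
    const outcome = saveEvent(event);
    // Both outcomes mean the event is safely recorded, so tell the sender to stop retrying.
    return outcome === "stored" || outcome === "duplicate" ? 200 : 500;
  } catch (err) {
    // Storage failed: a 5XX asks the sender to deliver the same email again later.
    return 500;
  }
}
```

&lt;p&gt;The important design choice is that a duplicate still gets a 2XX. Otherwise the retry loop never ends.&lt;/p&gt;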

&lt;p&gt;There are constraints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a total message size limit (30 MB including attachments, per SendGrid's documentation)&lt;/li&gt;
&lt;li&gt;dedicated receiving subdomain setup&lt;/li&gt;
&lt;li&gt;MX record configuration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Still better than burning cycles forever because polling was easier on day one.&lt;/p&gt;
&lt;h2&gt;
  
  
  n8n helps, but it does not magically fix polling
&lt;/h2&gt;

&lt;p&gt;This comes up a lot: “Can’t I just use n8n?”&lt;/p&gt;

&lt;p&gt;You can absolutely use n8n to improve the workflow.&lt;/p&gt;

&lt;p&gt;But if you use the n8n Email Trigger over IMAP, you are still doing mailbox-checking infrastructure. It’s just nicer mailbox-checking infrastructure.&lt;/p&gt;

&lt;p&gt;That matters.&lt;/p&gt;

&lt;p&gt;n8n gives you useful features like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;mailbox selection&lt;/li&gt;
&lt;li&gt;mark as read&lt;/li&gt;
&lt;li&gt;attachment handling&lt;/li&gt;
&lt;li&gt;custom search rules&lt;/li&gt;
&lt;li&gt;reconnect controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a lot better than a hand-rolled cron script.&lt;/p&gt;

&lt;p&gt;But it does not change the trigger model.&lt;/p&gt;

&lt;p&gt;If the source of truth is still “go ask the mailbox if anything happened,” you still have polling-shaped failure modes.&lt;/p&gt;
&lt;h2&gt;
  
  
  Polling vs push
&lt;/h2&gt;

&lt;p&gt;Here’s the tradeoff in plain English:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;What you’re really signing up for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Poll mailbox with IMAP or cron&lt;/td&gt;
&lt;td&gt;Easy setup, delayed reactions, duplicate checks, wasted model calls, awkward dedupe logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;n8n Email Trigger (IMAP)&lt;/td&gt;
&lt;td&gt;Better operational ergonomics, but still polling underneath&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gmail watch / Graph notifications / SendGrid webhook&lt;/td&gt;
&lt;td&gt;More setup, much lower idle waste, faster reactions, better delivery semantics&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This is not really “simple vs advanced.”&lt;/p&gt;

&lt;p&gt;It’s demo-friendly vs production-friendly.&lt;/p&gt;
&lt;h2&gt;
  
  
  What your OpenClaw email pipeline should actually do
&lt;/h2&gt;

&lt;p&gt;If I were building this today, I’d split it into two layers.&lt;/p&gt;
&lt;h3&gt;
  
  
  Layer 1: intake
&lt;/h3&gt;

&lt;p&gt;Pick one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SendGrid Inbound Parse if you want email -&amp;gt; HTTP&lt;/li&gt;
&lt;li&gt;Gmail watch + Pub/Sub if you’re on Google Workspace&lt;/li&gt;
&lt;li&gt;Microsoft Graph notifications if you’re on Microsoft 365&lt;/li&gt;
&lt;li&gt;n8n IMAP only for a fast proof of concept&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Layer 2: idempotent processing
&lt;/h3&gt;

&lt;p&gt;No matter how the event arrives, your OpenClaw job should:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;extract a stable message ID&lt;/li&gt;
&lt;li&gt;check a dedupe store before calling any model&lt;/li&gt;
&lt;li&gt;persist processing state&lt;/li&gt;
&lt;li&gt;acknowledge receipt quickly&lt;/li&gt;
&lt;li&gt;do the expensive work asynchronously&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last point is where people get into trouble.&lt;/p&gt;

&lt;p&gt;Do not do all processing inside the webhook request.&lt;/p&gt;

&lt;p&gt;Accept the event.&lt;br&gt;
Store it.&lt;br&gt;
Deduplicate it.&lt;br&gt;
Then hand it off.&lt;/p&gt;

&lt;p&gt;That’s how you survive retries without duplicate replies.&lt;/p&gt;
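&lt;p&gt;Those steps can be sketched with in-memory stand-ins. Here a &lt;code&gt;Map&lt;/code&gt; plays the dedupe store and an array plays the job queue. In a real service these would be a database table and a queue, and every name below is illustrative.&lt;/p&gt;

```javascript
// In-memory stand-ins, for illustration only.
const inboxEvents = new Map(); // dedupe key -> stored event
const jobQueue = [];           // dedupe keys waiting for a worker

function acceptEvent(provider, externalMessageId, payload) {
  const key = `${provider}:${externalMessageId}`;

  // Deduplicate: a retried delivery of the same message is acknowledged but not re-queued.
  if (inboxEvents.has(key)) {
    return { accepted: true, duplicate: true };
  }

  // Store first, then hand off, so a crash between the two never loses the event.
  inboxEvents.set(key, { provider, externalMessageId, payload, status: "pending" });
  jobQueue.push(key);
  return { accepted: true, duplicate: false };
}
```

&lt;p&gt;Calling &lt;code&gt;acceptEvent&lt;/code&gt; twice with the same provider and message ID leaves exactly one job in the queue, which is the whole point.&lt;/p&gt;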
&lt;h2&gt;
  
  
  A minimal queue-based pattern
&lt;/h2&gt;

&lt;p&gt;Here’s a practical shape for the service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;email-webhook -&amp;gt; postgres&lt;span class="o"&gt;(&lt;/span&gt;inbox_events&lt;span class="o"&gt;)&lt;/span&gt; -&amp;gt; job queue -&amp;gt; OpenClaw worker -&amp;gt; reply/send action
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pseudo-schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="n"&gt;inbox_events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;bigserial&lt;/span&gt; &lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;external_message_id&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;received_at&lt;/span&gt; &lt;span class="n"&gt;timestamptz&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="n"&gt;jsonb&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;processing_status&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;unique&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;external_message_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
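&lt;p&gt;With that unique constraint in place, the dedupe check can be the insert itself. This sketch assumes PostgreSQL and a client that accepts query text plus a values array, the way the &lt;code&gt;pg&lt;/code&gt; package's &lt;code&gt;client.query(text, values)&lt;/code&gt; does. &lt;code&gt;buildInsertEvent&lt;/code&gt; is a hypothetical helper that only builds the query, so it runs without a database.&lt;/p&gt;

```javascript
// Hypothetical helper: builds a parameterized insert for the inbox_events table above.
function buildInsertEvent(provider, externalMessageId, payload) {
  return {
    // "on conflict ... do nothing" makes a repeated delivery a no-op instead of an error.
    // "returning id" yields a row only when the insert actually happened.
    text:
      "insert into inbox_events (provider, external_message_id, payload) " +
      "values ($1, $2, $3::jsonb) " +
      "on conflict (provider, external_message_id) do nothing " +
      "returning id",
    values: [provider, externalMessageId, JSON.stringify(payload)],
  };
}
```

&lt;p&gt;If the query comes back with zero rows, the event was already recorded and there is nothing new to enqueue.&lt;/p&gt;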



&lt;p&gt;Worker logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;processInboxEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findByProviderAndMessageId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;external_message_id&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;missing event&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Already handled: a retried job must not trigger a second reply.&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;processing_status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;done&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markProcessing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runOpenClawAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;saveResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markDone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
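&lt;p&gt;To try the worker without a database, here is a sketch that wires the same flow to in-memory stand-ins. &lt;code&gt;createMemoryDb&lt;/code&gt; and the injected &lt;code&gt;runAgent&lt;/code&gt; function are assumptions for illustration. The real &lt;code&gt;db&lt;/code&gt; and &lt;code&gt;runOpenClawAgent&lt;/code&gt; are whatever your service provides.&lt;/p&gt;

```javascript
// In-memory stand-in for the inbox_events table; illustration only.
function createMemoryDb(rows) {
  const byKey = new Map(
    rows.map((row) => [`${row.provider}:${row.external_message_id}`, { ...row }])
  );
  const byId = (id) => [...byKey.values()].find((row) => row.id === id);
  return {
    async findByProviderAndMessageId(provider, messageId) {
      return byKey.get(`${provider}:${messageId}`);
    },
    async markProcessing(id) { byId(id).processing_status = "processing"; },
    async saveResult(id, result) { byId(id).result = result; },
    async markDone(id) { byId(id).processing_status = "done"; },
  };
}

// Same flow as the worker above, with its dependencies passed in explicitly.
async function processInboxEvent(event, db, runAgent) {
  const existing = await db.findByProviderAndMessageId(event.provider, event.external_message_id);
  if (!existing) throw new Error("missing event");
  if (existing.processing_status === "done") return "skipped";

  await db.markProcessing(existing.id);
  const result = await runAgent({ email: existing.payload });
  await db.saveResult(existing.id, result);
  await db.markDone(existing.id);
  return "processed";
}
```

&lt;p&gt;Running it twice for the same event should call the agent once and skip the second time.&lt;/p&gt;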



&lt;p&gt;That is much less exciting than prompt tricks.&lt;/p&gt;

&lt;p&gt;It is also the difference between a system that feels solid and one that occasionally replies twice at 3 AM.&lt;/p&gt;

&lt;h2&gt;
  
  
  The cost side gets ugly fast
&lt;/h2&gt;

&lt;p&gt;If your agent is always on, wasted checks turn into real money or real pressure on your usage limits.&lt;/p&gt;

&lt;p&gt;This is where the pricing model matters.&lt;/p&gt;

&lt;p&gt;Per-token billing makes polling bugs feel worse because every pointless re-check and duplicate pass looks like another tiny leak. You start optimizing prompts and reducing context not because it improves quality, but because you’re trying to contain operational sloppiness.&lt;/p&gt;

&lt;p&gt;That’s backwards.&lt;/p&gt;
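&lt;p&gt;To put rough numbers on those pointless re-checks, here is the arithmetic as a small sketch. It assumes one model call per mailbox check, which may not match your setup, and both function names are made up for illustration.&lt;/p&gt;

```javascript
// Rough estimate: how many checks a fixed polling interval produces per day.
function pollsPerDay(intervalMinutes) {
  const minutesPerDay = 24 * 60;
  return Math.floor(minutesPerDay / intervalMinutes);
}

// Checks that found nothing new, assuming at most one useful check per email.
function wastedChecksPerDay(intervalMinutes, emailsPerDay) {
  return Math.max(0, pollsPerDay(intervalMinutes) - emailsPerDay);
}

console.log(pollsPerDay(5)); // 288
console.log(pollsPerDay(3)); // 480
```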

&lt;p&gt;If you’re running OpenClaw agents continuously, predictable flat-rate compute is a much better fit than watching token spend all day. Standard Compute is built for exactly that: OpenAI-compatible API access for OpenClaw agents, flat monthly pricing, and dynamic routing across models like GPT-5.4, Claude Opus 4.6, and Grok 4.20.&lt;/p&gt;

&lt;p&gt;So yes, fix the architecture first.&lt;/p&gt;

&lt;p&gt;But also: if your agents run 24/7, stop pairing always-on automation with pricing that punishes every extra call.&lt;/p&gt;

&lt;h2&gt;
  
  
  When polling is still okay
&lt;/h2&gt;

&lt;p&gt;Polling is not always wrong.&lt;/p&gt;

&lt;p&gt;Use it when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you have one internal mailbox&lt;/li&gt;
&lt;li&gt;volume is low&lt;/li&gt;
&lt;li&gt;a few minutes of delay is fine&lt;/li&gt;
&lt;li&gt;you have dedupe in SQLite or Postgres&lt;/li&gt;
&lt;li&gt;nobody will care if you rebuild it later&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a proof of concept.&lt;/p&gt;

&lt;p&gt;Just be honest that it is a proof of concept.&lt;/p&gt;

&lt;p&gt;The mistake is pretending that a polling loop is production architecture for a customer-facing or always-on agent.&lt;/p&gt;

&lt;p&gt;It isn’t.&lt;/p&gt;

&lt;h2&gt;
  
  
  The actual line between toy and production
&lt;/h2&gt;

&lt;p&gt;The interesting distinction is not whether OpenClaw can read email.&lt;/p&gt;

&lt;p&gt;Of course it can.&lt;/p&gt;

&lt;p&gt;The distinction is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how the email arrives&lt;/li&gt;
&lt;li&gt;whether processing is idempotent after it arrives&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A toy automation asks the mailbox every few minutes if anything happened.&lt;/p&gt;

&lt;p&gt;A production agent gets an event, validates it, records it once, and processes it once.&lt;/p&gt;

&lt;p&gt;That sounds boring.&lt;/p&gt;

&lt;p&gt;It’s also the difference between “works in a demo” and “still works three months later.”&lt;/p&gt;

&lt;p&gt;If your OpenClaw workflow still polls an inbox every 5 minutes, I wouldn’t call it broken.&lt;/p&gt;

&lt;p&gt;I’d call it unfinished.&lt;/p&gt;

&lt;p&gt;And once you’ve seen nearly 500 LLM calls per day wasted on mailbox checks, it’s hard to unsee.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>devops</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
