<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: ~K¹yle Million</title>
    <description>The latest articles on Forem by ~K¹yle Million (@thebrierfox).</description>
    <link>https://forem.com/thebrierfox</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3780351%2F132696e0-9432-47c1-87d0-bc907e039420.jpeg</url>
      <title>Forem: ~K¹yle Million</title>
      <link>https://forem.com/thebrierfox</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/thebrierfox"/>
    <language>en</language>
    <item>
      <title>Agent Compaction Architecture: What Really Happens When Claude Code Hits Context Limits</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Wed, 22 Apr 2026 13:09:45 +0000</pubDate>
      <link>https://forem.com/thebrierfox/agent-compaction-architecture-what-really-happens-when-claude-code-hits-context-limits-43dd</link>
      <guid>https://forem.com/thebrierfox/agent-compaction-architecture-what-really-happens-when-claude-code-hits-context-limits-43dd</guid>
      <description>&lt;h2&gt;
  
  
  Section 1: The Silent Killer
&lt;/h2&gt;

&lt;p&gt;When Claude Code's context window fills, the runtime does not hard-stop. It doesn't throw an error. It doesn't ask permission. It compacts.&lt;/p&gt;

&lt;p&gt;Compaction is an automatic summarization step that fires when the token budget crosses a threshold. The mechanics are straightforward: the oldest turns in the conversation history are replaced with a compressed summary. Recent turns — the last several exchanges — are preserved verbatim. The summary takes the place of everything older.&lt;/p&gt;

&lt;p&gt;From a token-budget perspective, this is reasonable behavior. Some form of context management is unavoidable: you cannot run a stateful agent across a long task without it. The window is finite. The task is not.&lt;/p&gt;

&lt;p&gt;The problem is the word "compressed." A summary is a lossy transformation. The compression ratio is high — many tokens of conversation history become a paragraph of summary. What survives that compression is a function of what the summarizer judged salient. Factual statements about what actions were taken survive well. Constraints survive partially. Nuanced reasoning about &lt;em&gt;why&lt;/em&gt; a particular approach was chosen tends to survive poorly. Negative constraints — "don't touch X", "avoid this approach because..." — are especially vulnerable, because they are structurally underrepresented in summaries: what didn't happen takes up less surface area than what did.&lt;/p&gt;

&lt;p&gt;Here is a concrete production failure I hit.&lt;/p&gt;

&lt;p&gt;I had an agent working through a multi-step migration task. Early in the session, I established that one table in the database, the tenant registry, was read-only for this task. Another process was actively working on that table, and any schema change would cause a cascade failure. I was explicit about it: "Do not touch the tenant_registry table. Do not add columns, do not create indexes, do not run any DDL against it."&lt;/p&gt;

&lt;p&gt;The agent acknowledged this. It moved forward. It completed several unrelated subtasks. The context window filled. Compaction fired.&lt;/p&gt;

&lt;p&gt;The summary captured the migration objective. It captured what had been completed. It mentioned the database was involved. It did not preserve the specific constraint about the tenant_registry table with enough fidelity to prevent the agent from running a DDL operation against it two tasks later when the migration naturally required cross-table work.&lt;/p&gt;

&lt;p&gt;The operation succeeded at the database level. The cascade failure arrived async, from the other process. I found it in the error log four hours later.&lt;/p&gt;

&lt;p&gt;Nothing in the session output flagged that compaction had occurred. Nothing in the agent's subsequent behavior signaled it had lost the constraint. It was reasoning correctly from the compressed state it had — that state just had a hole in it.&lt;/p&gt;

&lt;p&gt;That is what makes compaction dangerous in autonomous operation. The agent doesn't know what it doesn't know. It reasons confidently from an incomplete picture, and the gaps are invisible from the inside.&lt;/p&gt;




&lt;h2&gt;
  
  
  Section 2: What Gets Lost and Why
&lt;/h2&gt;

&lt;p&gt;Not all state is equally vulnerable to compaction. Understanding the failure modes requires a taxonomy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool call results — high vulnerability&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When the agent runs a Bash command and reads the output, that output lives in the conversation as a tool result. Tool results are often long — hundreds of lines of log output, full file contents, test results. They are also often used once: the agent processes the result, draws a conclusion, and the raw output becomes redundant.&lt;/p&gt;

&lt;p&gt;From a summarization perspective, tool results are natural candidates for aggressive compression. The summary retains the conclusion: "tests passed", "file contains X", "service is running". The raw output is dropped.&lt;/p&gt;

&lt;p&gt;This is fine when the raw output was truly just an input to a single conclusion. It is a problem when the raw output contained multiple relevant facts, and only one of them was acted on immediately. The rest are now gone. If a later step in the task needs one of those secondary facts, the agent will re-derive it, re-read the file, or get it wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Intermediate conclusions — medium vulnerability&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent builds up a model of the system as it works. "This service is stateless, so I can restart it without drain." "This config value is referenced in three places." "The test is flaky, not broken — ignore intermittent failures." These are conclusions drawn from evidence earlier in the session.&lt;/p&gt;

&lt;p&gt;They are embedded in the conversation as reasoning traces — assistant turns explaining what the agent concluded and why. Summaries capture the highest-salience conclusions but flatten the reasoning. The "why" is the first thing to go.&lt;/p&gt;

&lt;p&gt;When the "why" is gone, the agent may later reach the opposite conclusion from fresh evidence if that evidence is locally ambiguous. The earlier constraint has no backing anymore.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explicit constraint acknowledgments — highest vulnerability&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"Remember, don't touch X." "Make sure to use approach Y for this module." "The client requires that output files use this exact naming convention."&lt;/p&gt;

&lt;p&gt;Constraints stated conversationally, without a corresponding file artifact, are the most dangerous category. The agent acknowledged them. They shaped early decisions. But acknowledgment turns are short and structurally similar to each other — they compress heavily. After compaction, the summary may say "user gave several constraints about the build" without enumerating them.&lt;/p&gt;

&lt;p&gt;The agent no longer has the specific list. It has a summary that there was a list.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Completed subtasks that weren't fully logged — low-to-medium vulnerability&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Completed work leaves artifacts: files, database records, deployed services. Those artifacts exist independently of the conversation. The agent can re-inspect them.&lt;/p&gt;

&lt;p&gt;The vulnerability here is more subtle: the &lt;em&gt;decisions made during&lt;/em&gt; a subtask may be gone even when the subtask's outputs survive. The agent knows a file was written. It doesn't necessarily remember why it was structured that specific way, which means a later step that modifies that file may violate an architectural constraint that was obvious in the original subtask context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why summaries can't fully substitute for raw history&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A summary is an agent-generated compression. Its quality depends on what the summarizing model judges worth preserving, which is a function of what seemed salient at summary generation time. Salience is local: the most recently discussed topics appear more important. Negative constraints are structurally underrepresented in summaries. Long reasoning chains compress to single-sentence conclusions.&lt;/p&gt;

&lt;p&gt;The raw history is a ground truth. The summary is a lossy encoding. For short tasks with clear objectives, the loss is tolerable. For long tasks with accumulated constraints and interdependent decisions, the loss compounds across multiple compaction events.&lt;/p&gt;




&lt;h2&gt;
  
  
  Section 3: Compaction-Resistant Architecture
&lt;/h2&gt;

&lt;p&gt;Four patterns. I use all of them in production. They compose — each layer backs up the others.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 1: Checkpoint Writes
&lt;/h3&gt;

&lt;p&gt;At every significant milestone in a task, the agent writes the current state to a file. Not a summary of what it did — the live state that the next phase needs.&lt;/p&gt;

&lt;p&gt;The checkpoint file is not documentation. It is a machine-readable context recovery artifact. The agent will read it at the start of each subsequent phase. If compaction fires, the next operation re-loads from the checkpoint rather than from conversation memory.&lt;/p&gt;

&lt;p&gt;What belongs in a checkpoint:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Active constraints (including negative constraints — especially those)&lt;/li&gt;
&lt;li&gt;Decisions made and the reason they were made&lt;/li&gt;
&lt;li&gt;Current task state: what is complete, what is in progress, what is blocked&lt;/li&gt;
&lt;li&gt;Any system facts that were discovered and are relevant going forward&lt;/li&gt;
&lt;li&gt;Explicit re-statement of things that must not happen&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The checkpoint is only useful if it is written &lt;em&gt;before&lt;/em&gt; context-heavy operations. Writing it after means compaction may have already fired.&lt;/p&gt;

&lt;p&gt;A checkpoint cadence that works: write before any operation that will consume more than a few thousand tokens (running tests, reading large files, invoking sub-agents, executing database migrations). Write at each logical phase boundary regardless of token consumption.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 2: Explicit State Re-Injection
&lt;/h3&gt;

&lt;p&gt;Checkpoints are only useful if they are read. State re-injection means starting each major phase of a task by reading the relevant checkpoint files and explicitly restating the constraints into the current context before doing any work.&lt;/p&gt;

&lt;p&gt;This is not redundant. After compaction, the conversation history is a summary. The most recent checkpoint is the last known-good full state. Reading it at phase start brings the full state back into the current context window, where it stays verbatim for that phase's work unless compaction fires again before the phase ends.&lt;/p&gt;

&lt;p&gt;The re-injection also serves as a correctness check: if the agent re-reads the checkpoint and notices that its current understanding diverges from what the checkpoint says, that divergence is a signal that something went wrong.&lt;/p&gt;

&lt;p&gt;Re-injection should be explicit in the agent's prompt chain: "Before proceeding with phase N, read the phase N checkpoint file and confirm that all listed constraints are still active."&lt;/p&gt;
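
&lt;p&gt;The correctness check described above can be sketched as a set comparison between the constraints stored in the checkpoint and the constraints the agent restates from memory. This is a minimal sketch under stated assumptions, not part of any Claude Code API: the helper names normalize and find_divergence are hypothetical, and exact matching after whitespace and case normalization is a simplification. A real restatement will usually be paraphrased, so a production check would likely need fuzzier matching or a human review step.&lt;/p&gt;

```python
def normalize(text):
    """Collapse whitespace and case so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def find_divergence(checkpoint_constraints, recalled_constraints):
    """Return constraints the agent failed to restate, and ones it added.

    Hypothetical helper: exact matching after normalization is an
    assumption made to keep the sketch short.
    """
    expected = {normalize(c): c for c in checkpoint_constraints}
    recalled = {normalize(c): c for c in recalled_constraints}
    missing = [expected[k] for k in expected if k not in recalled]
    unexpected = [recalled[k] for k in recalled if k not in expected]
    return {"missing": missing, "unexpected": unexpected}


if __name__ == "__main__":
    checkpoint = [
        "No DDL against tenant_registry table",
        "Use WAL mode for all SQLite writes",
    ]
    # The agent's restatement, gathered before it re-reads the checkpoint.
    recalled = ["use WAL mode for all  SQLite writes"]
    result = find_divergence(checkpoint, recalled)
    if result["missing"]:
        print("Divergence detected. Re-inject before continuing:")
        for item in result["missing"]:
            print(f"  - {item}")
```

&lt;p&gt;Anything in the missing list is a constraint the agent no longer holds, which is the signal to stop and re-inject before doing any work.&lt;/p&gt;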

&lt;h3&gt;
  
  
  Pattern 3: Compaction Detection
&lt;/h3&gt;

&lt;p&gt;There is no native "compaction occurred" event exposed by Claude Code's context. You cannot query whether compaction has fired. But you can detect it indirectly.&lt;/p&gt;

&lt;p&gt;Compaction detection relies on a sentinel: a value written to a file at task start, which the agent reads once and is then asked to reproduce from memory at each phase boundary, before it re-reads the file. If the agent can reproduce the sentinel value, the turn containing the original sentinel read is probably still intact. If it cannot, compaction has likely compressed that turn.&lt;/p&gt;

&lt;p&gt;More practically: you can detect &lt;em&gt;behavioral evidence&lt;/em&gt; of compaction by testing the agent's recall of specific early-session constraints before proceeding. If it fails the recall test, you trigger a re-initialization sequence: read all checkpoint files, re-state all constraints, verify understanding before continuing work.&lt;/p&gt;

&lt;p&gt;The detection overhead is low — a single file read and a short verification step. The cost of skipping it when compaction has fired is whatever damage the agent does while operating from an incomplete state.&lt;/p&gt;
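
&lt;p&gt;A minimal Python sketch of the sentinel check follows. The function names (write_sentinel, recall_matches, phase_gate) and the file layout are assumptions for illustration only. How the recalled value is collected from the agent is outside the sketch; the one requirement is that the agent answers from memory before the sentinel file is read again.&lt;/p&gt;

```python
import secrets
from pathlib import Path


def write_sentinel(path):
    """Create a random token at task start; the agent reads it once."""
    token = secrets.token_hex(8)  # 16 hex characters
    Path(path).write_text(token)
    return token


def recall_matches(path, recalled_value):
    """Compare the agent's from-memory answer against the file on disk."""
    expected = Path(path).read_text().strip()
    return expected == recalled_value.strip()


def phase_gate(path, recalled_value):
    """Decide whether the next phase may start or re-initialization is needed."""
    if recall_matches(path, recalled_value):
        return "proceed"
    # Recall failed: read all checkpoints, restate constraints, verify.
    return "reinitialize"
```

&lt;p&gt;A failed recall does not prove compaction fired, and a successful one does not prove every early constraint survived. The gate is a cheap tripwire, which is why it is paired with checkpoint re-injection rather than used alone.&lt;/p&gt;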

&lt;h3&gt;
  
  
  Pattern 4: Session Segmentation
&lt;/h3&gt;

&lt;p&gt;For tasks that will span many hours and many phases, a single ultra-long session is architecturally unsound. Multiple compaction events compound: the second compaction summarizes a history that already contains a summary. Information loss accelerates with each event.&lt;/p&gt;

&lt;p&gt;Session segmentation means treating the task as a sequence of bounded sessions, each with a clean handoff file. Session N completes some work, writes a handoff file that captures the full state needed by session N+1, then exits cleanly. Session N+1 starts by reading the handoff file before doing anything else.&lt;/p&gt;

&lt;p&gt;Each session starts fresh — full context window, no compaction debt. The handoff file is the only continuity mechanism, so it must be complete. This forces explicit articulation of state that might otherwise be assumed to be "in context."&lt;/p&gt;

&lt;p&gt;The segmentation boundary should align with natural task phases. "Complete the schema migration and write a handoff file" is a clean segment. "Do some of the migration and some of the testing" is not.&lt;/p&gt;
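
&lt;p&gt;Because the handoff file is the only continuity mechanism, it helps to refuse to write or accept an incomplete one. The sketch below assumes a JSON handoff and a hypothetical set of required keys modeled on the checkpoint contents listed under Pattern 1; the key names and both function names are illustrative, not a fixed schema.&lt;/p&gt;

```python
import json
from pathlib import Path

# Hypothetical schema: these keys are an assumption, chosen to mirror
# the checkpoint contents listed under Pattern 1.
REQUIRED_KEYS = (
    "constraints",
    "do_not_touch",
    "decisions",
    "completed_tasks",
    "next_segment",
)


def write_handoff(path, state):
    """Refuse to write a handoff that is missing any required key."""
    missing = [k for k in REQUIRED_KEYS if k not in state]
    if missing:
        raise ValueError(f"Handoff incomplete, missing: {missing}")
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(state, indent=2))
    return target


def read_handoff(path):
    """Load the previous session's handoff and validate it before any work starts."""
    state = json.loads(Path(path).read_text())
    missing = [k for k in REQUIRED_KEYS if k not in state]
    if missing:
        raise ValueError(f"Handoff is not a known-good state, missing: {missing}")
    return state
```

&lt;p&gt;Session N calls write_handoff as its last step, and session N+1 calls read_handoff as its first. Failing loudly on a missing key is deliberate: a partial handoff reproduces the same silent gap that compaction creates.&lt;/p&gt;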




&lt;h2&gt;
  
  
  Section 4: Code Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Checkpoint Write — Python
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_checkpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkpoint_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Write a phase checkpoint before any context-heavy operation.
    Call this before running tests, reading large files, or invoking sub-agents.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkpoint_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;checkpoint_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;phase&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;constraints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;constraints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decisions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decisions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;do_not_touch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;do_not_touch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed_tasks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed_tasks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;in_progress&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;in_progress&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;facts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;facts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;


&lt;span class="c1"&gt;# Example usage before a database migration phase
&lt;/span&gt;&lt;span class="nf"&gt;write_checkpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;checkpoint_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./outputs/session_checkpoints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pre_migration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;constraints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use WAL mode for all SQLite writes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No DDL against tenant_registry table — active writes from separate process&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Output files must use snake_case naming convention&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;do_not_touch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenant_registry&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auth_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decisions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schema_approach&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;additive_only&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schema_approach_reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;existing consumers cannot handle column removal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed_tasks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schema_audit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;backup_verification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;in_progress&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;column_additions_to_user_profiles&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;facts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;db_path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/data/production.db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;backup_verified_at&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-04-22T09:14:00Z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Checkpoint Read + Re-Injection — Python
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_checkpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkpoint_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Load checkpoint at phase start. Re-state all constraints before proceeding.
    This is your recovery path after a compaction event.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkpoint_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;checkpoint_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No checkpoint found for phase &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cannot proceed without known-good state.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_text&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="c1"&gt;# Emit re-injection block — this goes into the agent's active context
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=== RE-INJECTING STATE FROM CHECKPOINT: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; ===&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Timestamp: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;ACTIVE CONSTRAINTS (must be honored for remaining work):&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;constraints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;DO NOT TOUCH:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;do_not_touch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;KEY DECISIONS:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decisions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=== END STATE RE-INJECTION ===&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Compaction Detection — Bash
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="c"&gt;# compaction_check.sh&lt;/span&gt;
&lt;span class="c"&gt;# Write a sentinel at task start; verify it at each phase boundary.&lt;/span&gt;
&lt;span class="c"&gt;# If verification fails, trigger re-initialization before proceeding.&lt;/span&gt;

&lt;span class="nv"&gt;SENTINEL_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"./outputs/session_sentinel.txt"&lt;/span&gt;
&lt;span class="nv"&gt;CHECKPOINT_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"./outputs/session_checkpoints"&lt;/span&gt;
&lt;span class="nv"&gt;PHASE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;1&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;unknown&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

write_sentinel&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;session_id
    &lt;span class="nv"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="nv"&gt;$$&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
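    &lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;dirname&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SENTINEL_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$session_id&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SENTINEL_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;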
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"SENTINEL_WRITTEN: &lt;/span&gt;&lt;span class="nv"&gt;$session_id&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

verify_sentinel_or_reinit&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SENTINEL_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"COMPACTION_DETECTED: sentinel file missing — running re-initialization"&lt;/span&gt;
        reinitialize_from_checkpoints
        &lt;span class="k"&gt;return &lt;/span&gt;1
    &lt;span class="k"&gt;fi
    &lt;/span&gt;&lt;span class="nb"&gt;local &lt;/span&gt;stored_sentinel
    &lt;span class="nv"&gt;stored_sentinel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SENTINEL_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"SENTINEL_OK: &lt;/span&gt;&lt;span class="nv"&gt;$stored_sentinel&lt;/span&gt;&lt;span class="s2"&gt; — proceeding with phase &lt;/span&gt;&lt;span class="nv"&gt;$PHASE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return &lt;/span&gt;0
&lt;span class="o"&gt;}&lt;/span&gt;

reinitialize_from_checkpoints&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=== COMPACTION RECOVERY: loading all available checkpoints ==="&lt;/span&gt;
    &lt;span class="k"&gt;for &lt;/span&gt;f &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CHECKPOINT_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;/checkpoint_&lt;span class="k"&gt;*&lt;/span&gt;.json&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;continue
        &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"--- Loading: &lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt; ---"&lt;/span&gt;
        python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"
import json, sys
# The checkpoint path arrives as argv[1] instead of being pasted into the source,
# so a filename containing quotes cannot break or inject into this snippet.
state = json.load(open(sys.argv[1]))
print(f'Phase: {state[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;phase&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]} @ {state[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;timestamp&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]}')
print('Constraints:')
for c in state.get('constraints', []):
    print(f'  - {c}')
print('Do not touch:', state.get('do_not_touch', []))
"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;done
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=== RECOVERY COMPLETE — all constraints re-loaded ==="&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# At session start: write_sentinel&lt;/span&gt;
&lt;span class="c"&gt;# At each phase boundary: verify_sentinel_or_reinit&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Session Handoff File — Python
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_handoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Write a clean handoff file at the end of a session segment.
    The next session reads this before doing any work.
    This file is the ONLY continuity mechanism between sessions.
    It must be complete: assume the next session has zero prior context.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;handoff_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;handoff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generated_at&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from_session&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;next_session_start_instructions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Read this file completely before any other action. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;All constraints listed here are active. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Do not proceed without acknowledging each constraint.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task_objective&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;objective&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed_this_session&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;next_phase&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;next_phase&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hard_constraints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;constraints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;do_not_touch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;do_not_touch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key_facts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;facts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;open_questions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;open_questions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;known_risks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;known_risks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handoff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Handoff written to: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Next session must read: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;


&lt;span class="c1"&gt;# Example: end of session 1 of a multi-session migration
&lt;/span&gt;&lt;span class="nf"&gt;write_handoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./outputs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;migration_s1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;next_session_instructions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;objective&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Complete user profile schema migration and deploy to staging&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Schema audit complete — findings in outputs/schema_audit.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Backup verified — outputs/backup_verification.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Column additions to user_profiles — migration script at migrations/002_add_profile_fields.sql&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;next_phase&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Run migration against staging, execute integration test suite, write test report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;constraints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No DDL against tenant_registry — active concurrent writes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Migration must be additive only — no column drops&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Staging deploy requires RAILS_ENV=staging explicitly set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;do_not_touch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenant_registry&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auth_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;legacy_session_keys&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;facts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;staging_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgres://staging-host:5432/app_staging&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;migration_tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alembic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test_suite&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pytest tests/integration/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected_test_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;47&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;known_risks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Test DB may have stale fixtures — run pytest --setup-show to verify fixture state&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
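&lt;p&gt;On the reading side, here is a minimal sketch of how the next session might load that file and surface its constraints before doing any work. The names &lt;code&gt;read_handoff&lt;/code&gt;, &lt;code&gt;format_handoff&lt;/code&gt;, and &lt;code&gt;REQUIRED_KEYS&lt;/code&gt; are hypothetical, and treating those four fields as mandatory is an assumption; the field names themselves match what &lt;code&gt;write_handoff&lt;/code&gt; emits.&lt;/p&gt;

```python
import json
from pathlib import Path

# Fields the writer always emits; treating them as mandatory is an assumption.
REQUIRED_KEYS = ("task_objective", "next_phase", "hard_constraints", "do_not_touch")


def read_handoff(path):
    """Load a handoff file and fail loudly if a required field is missing."""
    handoff = json.loads(Path(path).read_text())
    missing = [key for key in REQUIRED_KEYS if key not in handoff]
    if missing:
        raise ValueError(f"Handoff file {path} is missing fields: {missing}")
    return handoff


def format_handoff(handoff):
    """Render the handoff as plain text to place at the top of the new session."""
    lines = [
        f"OBJECTIVE: {handoff['task_objective']}",
        f"NEXT PHASE: {handoff['next_phase']}",
        "HARD CONSTRAINTS:",
    ]
    lines += [f"  - {c}" for c in handoff["hard_constraints"]]
    lines.append("DO NOT TOUCH:")
    lines += [f"  - {item}" for item in handoff["do_not_touch"]]
    return "\n".join(lines)
```

&lt;p&gt;Raising on a missing field is a deliberate choice in this sketch: a new session that starts from an incomplete handoff is the same silent failure the handoff file is meant to prevent.&lt;/p&gt;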






&lt;h2&gt;
  
  
  The Architecture in Summary
&lt;/h2&gt;

&lt;p&gt;Compaction is not a bug to work around. It is a fundamental constraint of context-window-bounded agents. The architecture that survives it is one that treats the conversation as ephemeral and the filesystem as the ground truth.&lt;/p&gt;

&lt;p&gt;Checkpoint writes externalize state before it can be lost. Re-injection restores full context after a compaction event. Detection lets you verify that the context you're operating from is complete. Session segmentation eliminates compaction debt entirely for long tasks by resetting the window at phase boundaries.&lt;/p&gt;

&lt;p&gt;None of these patterns are expensive. A checkpoint file write takes milliseconds. A re-injection read adds a few hundred tokens to the current context. The compaction detection sentinel is a single file read. A handoff file is twenty lines of JSON.&lt;/p&gt;

&lt;p&gt;The cost of not using them is the kind of failure that doesn't announce itself — an agent that proceeds confidently from a state it believes is correct, into work that violates a constraint it no longer remembers.&lt;/p&gt;




&lt;p&gt;I packaged the full compaction-resistant architecture — detection hooks, checkpoint templates, re-injection patterns, and session handoff schemas — as a ClawMart skill: &lt;a href="https://www.shopclawmart.com/listings/agent-compaction-architecture-production-context-management-5aaff79a" rel="noopener noreferrer"&gt;Agent Compaction Architecture — Production Context Management&lt;/a&gt;. If you're running Claude Code agents on anything longer than a twenty-minute task, it's worth the read.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;~K¹ (W. Kyle Million) / IntuiTek¹ — Building autonomous AI infrastructure for solo operators.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; claudecode, devtools, aiagents&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>productivity</category>
    </item>
    <item>
      <title>The Complete Agent Operations Stack: 15 Skills for Production-Grade Claude Code</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Wed, 22 Apr 2026 13:09:21 +0000</pubDate>
      <link>https://forem.com/thebrierfox/the-complete-agent-operations-stack-15-skills-for-production-grade-claude-code-569h</link>
      <guid>https://forem.com/thebrierfox/the-complete-agent-operations-stack-15-skills-for-production-grade-claude-code-569h</guid>
      <description>&lt;p&gt;This week I've published a series of articles on individual production patterns for Claude Code: loop termination, session memory, memory scoping, coordinator resume, bash security. Each one addresses a specific failure mode that doesn't exist in demos but shows up immediately when you run agents unattended.&lt;/p&gt;

&lt;p&gt;This article ties them together. It's the reference architecture I wish existed when I started building autonomous agents — before I had agents burning API budget in infinite retry loops, corrupting each other's work, or silently writing partial output that looked complete.&lt;/p&gt;

&lt;p&gt;The gap between "works in a demo" and "runs for 30 days without intervention" is not about model quality. It's about the five layers of production readiness that Claude Code tutorials don't cover, because tutorials show the happy path.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Production Gap
&lt;/h2&gt;

&lt;p&gt;Here's what a Claude Code demo looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "Write a report on X"
Agent: [reads files, synthesizes, writes output]
Done.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's what production looks like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent runs at 2am via cron with no one watching&lt;/li&gt;
&lt;li&gt;It hits a network error on step 12 of 30 and retries 80 times&lt;/li&gt;
&lt;li&gt;Two instances start simultaneously and overwrite each other's context files&lt;/li&gt;
&lt;li&gt;The context window hits its limit mid-task and the next session has no idea where it left off&lt;/li&gt;
&lt;li&gt;A sub-agent writes a bash command that touches a path it shouldn't&lt;/li&gt;
&lt;li&gt;The coordinator that dispatched three agents loses its session and restarts all three&lt;/li&gt;
&lt;li&gt;The agent finishes successfully but consumed 6x the expected API budget because it loaded the same large file 40 times&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are model failures. They're infrastructure failures. The model did exactly what it was instructed to do. The architecture didn't account for the environment the model runs in.&lt;/p&gt;

&lt;p&gt;The five layers below are the minimum viable production architecture for any Claude Code agent that runs unattended.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Five Layers of Production Readiness
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Layer 1: Security
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What can go wrong:&lt;/strong&gt; An agent with broad Bash tool access will, eventually, execute a command in a way you didn't anticipate. Maybe it interpolates a variable into a shell command unsafely. Maybe it runs &lt;code&gt;rm -rf&lt;/code&gt; on a path that turns out to be wrong. Maybe it writes credentials to a log file. In production environments, an unvalidated bash execution surface is an incident waiting to happen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The skills that address this:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Bash Security Validator&lt;/em&gt; catches the class of vulnerabilities that come from how agents construct shell commands: unquoted variables, command injection via interpolation, unsafe redirects, pipes to &lt;code&gt;eval&lt;/code&gt;. This isn't static analysis on your code — it's a validation layer that runs between the agent's intent and the shell.&lt;/p&gt;
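&lt;p&gt;To make the idea of a validation layer concrete, here is a minimal sketch that inspects a command string before it reaches the shell. It is not the skill's implementation: &lt;code&gt;DENY_PATTERNS&lt;/code&gt; and &lt;code&gt;find_violations&lt;/code&gt; are hypothetical names, and the rule set is a small, deliberately conservative assumption that a real validator would need to extend considerably.&lt;/p&gt;

```python
import re

# Hypothetical deny-list; deliberately small and conservative.
DENY_PATTERNS = [
    (re.compile(r"\beval\b"), "use of eval"),
    (re.compile(r"\brm\s+-rf\s+/(\s|$)"), "recursive delete of filesystem root"),
    (re.compile(r"`|\$\("), "command substitution"),
]


def find_violations(command):
    """Return a list of human-readable reasons to reject a shell command."""
    violations = [label for pattern, label in DENY_PATTERNS if pattern.search(command)]
    # Drop quoted segments; any $VAR left over is an unquoted expansion.
    unquoted = re.sub(r"'[^']*'|\"[^\"]*\"", "", command)
    if re.search(r"\$\{?[A-Za-z_]", unquoted):
        violations.append("unquoted variable expansion")
    return violations
```

&lt;p&gt;In this sketch, &lt;code&gt;find_violations('rm -rf $TARGET_DIR')&lt;/code&gt; reports an unquoted variable expansion, while &lt;code&gt;find_violations('ls -la "$HOME/project"')&lt;/code&gt; returns an empty list. A pattern check like this is only a first filter; it does not replace tool allowlists or path restrictions.&lt;/p&gt;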

&lt;p&gt;&lt;em&gt;Production Agent Security Hardening&lt;/em&gt; addresses the broader surface: what tools the agent can access, which paths it's allowed to write, how credentials are handled, and what happens when a security boundary is tested. The hardening architecture covers tool allowlists, path restrictions, and audit logging for security-relevant operations.&lt;/p&gt;

&lt;p&gt;Without this layer, you're running an agent that has the same access as a logged-in user and considerably less judgment about when to use it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure signature:&lt;/strong&gt; Agent executes &lt;code&gt;rm -rf&lt;/code&gt; on a wrong path. Agent leaks an environment variable into an output file. Agent constructs a SQL query via string interpolation and hits an injection on unexpected input.&lt;/p&gt;




&lt;h3&gt;
  
  
  Layer 2: Memory
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What can go wrong:&lt;/strong&gt; Claude Code agents have excellent in-context reasoning. They have zero built-in persistence. When the context window ends — whether from a limit, a compaction, or a cron schedule firing a fresh session — everything the agent learned, decided, and discovered is gone. The next session starts from scratch.&lt;/p&gt;

&lt;p&gt;At scale, this produces three distinct failure patterns: repeated discovery (re-doing work already done), decision context loss (making a conflicting choice because the constraint that ruled it out is no longer in context), and progress tracking failure (processing the same files twice because there's no record of what was already processed).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The skills that address this:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Agent Memory Scoping&lt;/em&gt; handles the concurrent case: when two agents run simultaneously, they need isolated memory namespaces. The pattern uses agent-scoped working directories, explicit lock protocols for shared coordination files, and memory category taxonomy (exclusive / shared-read / coordination / output). Without this, concurrent agents corrupt each other's working state.&lt;/p&gt;
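&lt;p&gt;A minimal sketch of the two mechanics named above, agent-scoped working directories and a lock on a shared coordination file. The function names and the &lt;code&gt;agents/&lt;/code&gt; directory layout are assumptions for illustration, and stale-lock recovery (for an agent that dies while holding a lock) is intentionally left out.&lt;/p&gt;

```python
import os
from pathlib import Path


def agent_workdir(root, agent_id):
    """Give each agent its own exclusive directory under a shared root."""
    workdir = Path(root) / "agents" / agent_id
    workdir.mkdir(parents=True, exist_ok=True)
    return workdir


def try_acquire_lock(lock_path, agent_id):
    """Atomically create a lock file; return False if another agent holds it."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(agent_id)
    return True


def release_lock(lock_path, agent_id):
    """Remove the lock only if this agent is the recorded owner."""
    path = Path(lock_path)
    if path.exists() and path.read_text() == agent_id:
        path.unlink()
        return True
    return False
```

&lt;p&gt;The exclusive-create flag is what makes the lock safe between concurrent processes: checking whether the file exists and then creating it in two separate steps would leave a window for both agents to win.&lt;/p&gt;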

&lt;p&gt;&lt;em&gt;Session Memory Architecture&lt;/em&gt; handles the temporal case: single agents running across multiple context windows. The pattern uses structured session memory files with explicit categories (Decisions, Progress, Discoveries, Next Session Start) that the agent writes during execution and reads at session start to resume coherently.&lt;/p&gt;
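&lt;p&gt;As a rough illustration of a structured session memory file, the sketch below stores entries under the four categories named above and persists after every write. The JSON layout and the function names &lt;code&gt;load_memory&lt;/code&gt; and &lt;code&gt;record&lt;/code&gt; are assumptions; the skill itself may use a different format.&lt;/p&gt;

```python
import json
from pathlib import Path

# The four categories from the text, as JSON keys (the key spelling is an assumption).
CATEGORIES = ("decisions", "progress", "discoveries", "next_session_start")


def load_memory(path):
    """Return stored memory, or an empty structure on the first session."""
    file = Path(path)
    if file.exists():
        return json.loads(file.read_text())
    return {category: [] for category in CATEGORIES}


def record(path, category, entry):
    """Append one entry under a category and persist immediately."""
    if category not in CATEGORIES:
        raise ValueError(f"Unknown memory category: {category}")
    memory = load_memory(path)
    memory[category].append(entry)
    file = Path(path)
    file.parent.mkdir(parents=True, exist_ok=True)
    file.write_text(json.dumps(memory, indent=2))
    return memory
```

&lt;p&gt;Writing on every &lt;code&gt;record&lt;/code&gt; call rather than at session end is the point of the pattern: a session that is cut off mid-task still leaves behind everything it had recorded up to that moment.&lt;/p&gt;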

&lt;p&gt;&lt;em&gt;Agent Compaction Architecture&lt;/em&gt; handles the context pressure case: an agent operating near its context limit needs to proactively write out critical context before compaction removes it. This isn't reactive — it's built into the agent's operating protocol. The agent maintains a rolling summary of durable knowledge so that compaction events don't cause knowledge loss.&lt;/p&gt;

&lt;p&gt;All three of these address the same root problem from different angles: context is not memory, and production agents need persistent memory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure signature:&lt;/strong&gt; Agent re-processes files it already completed. Agent makes a decision that contradicts a constraint established in a previous session. Two concurrent agents write to the same path and one loses its work.&lt;/p&gt;




&lt;h3&gt;
  
  
  Layer 3: Flow Control
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What can go wrong:&lt;/strong&gt; An uncontrolled agent will pursue its goal until it either succeeds or exhausts resources. With no circuit breaker, a stuck agent retries indefinitely. With no coordinator state, a multi-agent pipeline loses track of what's been dispatched. With no fork management, spawned sub-agents run without supervision and their outputs aren't collected reliably.&lt;/p&gt;

&lt;p&gt;This layer is where most production incidents live, because flow control failures are expensive and hard to detect from the outside.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The skills that address this:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Loop Termination Architecture&lt;/em&gt; implements the circuit breaker pattern at three levels: a step counter (hard limit that stops runaway loops), an error accumulation counter (smart limit that stops stuck loops retrying the same error class), and a goal proximity check (semantic limit that stops false progress spirals). An article published earlier this week covers this pattern in depth.&lt;/p&gt;
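
&lt;p&gt;A small sketch of the first two levels, the step counter and the error accumulation counter. The &lt;code&gt;LoopGuard&lt;/code&gt; class and &lt;code&gt;run&lt;/code&gt; helper are hypothetical names, and the default limits (30 steps, 3 identical errors) are borrowed from the walkthrough later in this article. The goal proximity check is left out because it needs a task-specific measure of progress.&lt;/p&gt;

```python
from collections import Counter


class LoopGuard:
    """Stops a loop on a hard step limit or on repeated identical errors."""

    def __init__(self, max_steps=30, max_same_error=3):
        self.max_steps = max_steps
        self.max_same_error = max_same_error
        self.steps = 0
        self.errors = Counter()

    def record_step(self):
        self.steps += 1

    def record_error(self, error_class):
        self.errors[error_class] += 1

    def should_stop(self):
        """Return a reason string if the loop must stop, else None."""
        if self.steps >= self.max_steps:
            return "step limit reached"
        for error_class, count in self.errors.items():
            if count >= self.max_same_error:
                return f"repeated error: {error_class}"
        return None


def run(action, guard):
    """Call action() until it succeeds or the guard trips."""
    while True:
        reason = guard.should_stop()
        if reason is not None:
            return ("stopped", reason)
        guard.record_step()
        try:
            return ("ok", action())
        except Exception as exc:
            # Errors are grouped by exception class, so retries of the same class add up.
            guard.record_error(type(exc).__name__)


def always_denied():
    raise PermissionError("cannot write to outputs/")


# A permission error that will never succeed is cut off after three attempts.
result = run(always_denied, LoopGuard(max_steps=30, max_same_error=3))
```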

&lt;p&gt;&lt;em&gt;Coordinator Resume Integrity&lt;/em&gt; handles the multi-agent orchestration case: a coordinator agent that dispatches sub-agents must maintain a persistent dispatch ledger so that if the coordinator's session ends mid-pipeline, the next coordinator session can resume from exactly where it left off — skipping completed tasks and re-running only what's still pending.&lt;/p&gt;
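
&lt;p&gt;One way a persistent dispatch ledger could look, as a sketch. The &lt;code&gt;DispatchLedger&lt;/code&gt; class, the JSON file format, and the two status values are assumptions made for this example; the point is only that task status lives on disk, so a fresh coordinator session can skip what is already done.&lt;/p&gt;

```python
import json
import tempfile
from pathlib import Path


class DispatchLedger:
    """Persists task status so a new coordinator session can resume."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.state = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.state = {}

    def _save(self):
        # Write to a temp file, then rename, so a crash cannot leave a half-written ledger.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(self.state, indent=2), encoding="utf-8")
        tmp.replace(self.path)

    def register(self, task_ids):
        """Add tasks as pending; tasks already in the ledger keep their status."""
        for task_id in task_ids:
            self.state.setdefault(task_id, "pending")
        self._save()

    def mark_done(self, task_id):
        self.state[task_id] = "done"
        self._save()

    def pending(self):
        return [t for t, status in self.state.items() if status != "done"]


# Usage: the first session finishes one task, then ends mid-pipeline.
ledger_path = Path(tempfile.mkdtemp()) / "dispatch_ledger.json"
first = DispatchLedger(ledger_path)
first.register(["fetch", "draft", "review"])
first.mark_done("fetch")

# A fresh coordinator session reloads the ledger and re-runs only what is pending.
second = DispatchLedger(ledger_path)
second.register(["fetch", "draft", "review"])
remaining = second.pending()
```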

&lt;p&gt;&lt;em&gt;Forked Agent Architecture&lt;/em&gt; handles the sub-agent lifecycle case: when you fork agents to parallelize work, you need patterns for launching them cleanly, tracking their completion, handling their failures, and collecting their outputs without conflicts. Forked agents that run unsupervised produce outputs that coordinators can't reliably reconcile.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure signature:&lt;/strong&gt; Agent retries a permission error 150 times before context death. Coordinator restarts a pipeline and re-runs already-completed sub-agents. Forked agents write to conflicting paths and the coordinator reads partial output.&lt;/p&gt;




&lt;h3&gt;
  
  
  Layer 4: Cost
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What can go wrong:&lt;/strong&gt; Token cost is invisible until it isn't. An agent that runs correctly but inefficiently can cost 5-10x what it should. Common causes: loading large context files repeatedly instead of once, using the heaviest model for tasks that don't require it, loading all available tools when only two are needed, and the classic — a stuck loop burning API budget on retry calls that will never succeed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The skills that address this:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Token Cost Intelligence&lt;/em&gt; gives your agents awareness of their own cost. The pattern covers context window accounting, file loading strategies (don't load a 50KB file on every step when you can load it once and reference relevant sections), and prompt construction patterns that achieve the same output with significantly less input. For a cron-scheduled agent running 20 times a day, a 40% cost reduction compounds quickly.&lt;/p&gt;
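
&lt;p&gt;The "load once, reference relevant sections" idea can be sketched as follows. The helper names &lt;code&gt;load_once&lt;/code&gt; and &lt;code&gt;relevant_lines&lt;/code&gt; are hypothetical, and the cache only lives for one process. The token saving comes from passing the filtered lines to the model instead of the whole file; the cache just avoids re-reading from disk.&lt;/p&gt;

```python
import tempfile
from functools import lru_cache
from pathlib import Path

READ_COUNT = 0  # counts real disk reads, for demonstration only


@lru_cache(maxsize=8)
def load_once(path):
    """Read a file from disk the first time; later calls reuse the cached text."""
    global READ_COUNT
    READ_COUNT += 1
    return Path(path).read_text(encoding="utf-8")


def relevant_lines(path, keyword):
    """Return only the lines that mention keyword, not the whole file."""
    return [line for line in load_once(path).splitlines() if keyword in line]


# Usage: three steps ask for error lines, but the file is read from disk once.
log_path = Path(tempfile.mkdtemp()) / "activity.log"
log_path.write_text(
    "04-20 published article\n04-21 error: timeout\n04-22 drafted outline\n",
    encoding="utf-8",
)
for _ in range(3):
    errors = relevant_lines(str(log_path), "error")
```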

&lt;p&gt;&lt;em&gt;Multi-Agent Coordination Architecture&lt;/em&gt; addresses the cost dimension of multi-agent systems: routing tasks to the right-sized agent, avoiding redundant computation across parallel agents, and structuring coordination messages to minimize the context each agent needs to carry. In a multi-agent system, coordination overhead is a real cost. Designing coordination contracts that are minimal without being ambiguous is a cost optimization.&lt;/p&gt;

&lt;p&gt;Both of these connect to the model routing tier principle: use local inference for classification and routing tasks, Haiku for structured tasks with clear success criteria, and Sonnet for the work that actually requires it. Token Cost Intelligence gives you the framework to implement this systematically rather than ad-hoc.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure signature:&lt;/strong&gt; Agent loads a 100KB config file 40 times across a session. Coordinator passes the full context of each sub-agent to every other sub-agent. Sonnet is used to determine whether a string contains the word "error."&lt;/p&gt;




&lt;h3&gt;
  
  
  Layer 5: Setup and Observability
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What can go wrong:&lt;/strong&gt; Agents fail silently. They write outputs that look complete but aren't. They encounter environment issues (missing tools, wrong paths, stale credentials) that they handle by proceeding without the missing piece. By the time you notice, you have a week of bad outputs and no log trail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The skills that address this:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Claude Code Setup Validation&lt;/em&gt; runs preflight checks before any substantive agent work: are required tools available, are expected paths writable, do credentials resolve, are environment variables populated. Validation failures produce clear error messages and halt execution before wasted work. The alternative is discovering that &lt;code&gt;jq&lt;/code&gt; isn't installed at step 40 of a 50-step pipeline.&lt;/p&gt;
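
&lt;p&gt;A preflight check along these lines might look like the sketch below. The &lt;code&gt;preflight&lt;/code&gt; function and its parameters are illustrative assumptions. It covers tools, environment variables, and writable paths; verifying that credentials actually resolve depends on the service involved and is not shown.&lt;/p&gt;

```python
import os
import shutil
import tempfile


def preflight(tools, env_vars, writable_dirs):
    """Return a list of human-readable failures; an empty list means PASS."""
    failures = []
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            failures.append(f"missing tool: {tool}")
    for name in env_vars:
        # Treat both unset and empty values as failures.
        if not os.environ.get(name):
            failures.append(f"unset environment variable: {name}")
    for directory in writable_dirs:
        if not (os.path.isdir(directory) and os.access(directory, os.W_OK)):
            failures.append(f"not a writable directory: {directory}")
    return failures


# Usage with deliberately broken inputs, so the failures are visible.
scratch = tempfile.mkdtemp()
os.environ.pop("HYPOTHETICAL_UNSET_KEY", None)
problems = preflight(
    tools=["definitely-not-a-real-tool-xyz"],
    env_vars=["HYPOTHETICAL_UNSET_KEY"],
    writable_dirs=[scratch],
)
```

&lt;p&gt;A caller would halt before any agent work starts whenever the returned list is non-empty.&lt;/p&gt;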

&lt;p&gt;&lt;em&gt;Context Death Spiral Prevention&lt;/em&gt; addresses a specific failure mode that compounds other problems: an agent approaching context exhaustion starts making progressively worse decisions as it has less context available. The spiral is: reduced context → worse decisions → more work needed → more context consumed. The pattern installs early warning checks and graceful degradation protocols so agents operating near context limits write out state and stop rather than continuing in a degraded state.&lt;/p&gt;
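
&lt;p&gt;As a rough illustration of an early warning check, the sketch below maps context usage to an action. The function name, the action labels, and the 70% and 85% thresholds are all assumptions chosen for the example, not values taken from the skill.&lt;/p&gt;

```python
def context_status(used_tokens, limit_tokens, warn_at=0.70, stop_at=0.85):
    """Map context usage to an action. Thresholds are illustrative assumptions."""
    if not limit_tokens > 0:
        raise ValueError("limit_tokens must be positive")
    usage = used_tokens / limit_tokens
    if usage >= stop_at:
        # Near exhaustion: persist state and exit instead of degrading.
        return "write_state_and_stop"
    if usage >= warn_at:
        # Early warning: checkpoint durable knowledge, then keep going.
        return "checkpoint_and_continue"
    return "ok"


# Usage: three points along a hypothetical 100K-token window.
low = context_status(34_000, 100_000)
warning = context_status(75_000, 100_000)
critical = context_status(90_000, 100_000)
```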

&lt;p&gt;&lt;em&gt;Agent Bash Safety&lt;/em&gt; provides the baseline for safe shell operations: patterns for safe variable quoting, command construction, error handling, and exit code propagation. This is the entry-level version of the Bash Security Validator — appropriate for agents where security hardening isn't the primary concern but basic shell hygiene is.&lt;/p&gt;




&lt;h2&gt;
  
  
  Suggested Adoption Order
&lt;/h2&gt;

&lt;p&gt;If you're starting from scratch, adopt in this sequence. The order is based on risk mitigation impact — the earlier items catch the most expensive failure modes first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 1 — Foundation:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;em&gt;Agent Bash Safety&lt;/em&gt; (free) — install baseline shell hygiene before anything else runs&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Context Death Spiral Prevention&lt;/em&gt; (free) — protect your first agents from the most disorienting failure mode&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Claude Code Setup Validation&lt;/em&gt; — run preflight before any production deployment&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Loop Termination Architecture&lt;/em&gt; — your agents will hit loops before they hit any other problem&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Week 2 — Multi-session and concurrent:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;em&gt;Session Memory Architecture&lt;/em&gt; — required the moment any task spans more than one session&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Agent Memory Scoping&lt;/em&gt; — required the moment you run more than one agent at a time&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Agent Compaction Architecture&lt;/em&gt; — required for any long-running task&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Week 3 — Multi-agent:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;em&gt;Coordinator Resume Integrity&lt;/em&gt; — required for any orchestrated pipeline&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Forked Agent Architecture&lt;/em&gt; — required when you parallelize&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Week 4 — Cost and security:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;em&gt;Token Cost Intelligence&lt;/em&gt; — implement once agents are running correctly&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Multi-Agent Coordination Architecture&lt;/em&gt; — optimize once the baseline architecture is stable&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Bash Security Validator&lt;/em&gt; — harden once you understand your attack surface&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Production Agent Security Hardening&lt;/em&gt; — full hardening after you've mapped what the agents actually do&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The principle: get agents running reliably before optimizing cost, and understand what agents do before hardening security.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Full Stack in Practice
&lt;/h2&gt;

&lt;p&gt;To make the architecture concrete, here's a complete autonomous content publishing agent and which of the 15 skills it engages at each stage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The agent:&lt;/strong&gt; runs every morning, drafts a dev.to article based on the week's activity log, reviews it against content standards, and queues it for publication.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:00 — Cron fires run_task.sh
    |
    └── [Setup Validation] ← preflight: DEVTO_API_KEY present? jq installed?
                              outputs/working/ writable? network resolves?
        |
        └── PASS → agent starts
            FAIL → log to errors.log, notify via Telegram, exit 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:00:05 — Agent reads context
    |
    └── [Session Memory Architecture] ← read working/content_agent/session_memory.md
                                         resume from last "Next Session Start" marker
                                         apply decisions: "Do not republish articles from week of 04-14"
        |
        └── [Agent Memory Scoping] ← workspace: working/content_agent_20260422_090000/
                                      no conflict with any other running agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:00:30 — Agent reads activity log and begins drafting
    |
    └── [Token Cost Intelligence] ← activity log is 200KB total
                                     load only entries from last 7 days (12KB)
                                     don't reload on each step — reference the loaded chunk
        |
        └── [Agent Bash Safety] ← any shell ops use quoted variables, set -euo pipefail
                                    no dynamic command construction from log data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:03:00 — Article draft complete, beginning review pass
    |
    └── [Loop Termination Architecture] ← step counter: 30 steps max
                                           error counter: 3 identical errors → stop
                                           review pass has its own step budget (10 steps)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:04:00 — Agent attempts to queue article via ClawMart API
    |
    └── [Bash Security Validator] ← API key interpolated into curl command
                                     validator confirms: key is quoted, no injection surface
        |
        └── [Production Agent Security Hardening] ← API key not logged
                                                      credential not written to working files
                                                      audit entry: "API call to ClawMart at 09:04:02"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:04:20 — Task complete
    |
    └── [Session Memory Architecture] ← append to session_memory.md:
                                          "COMPLETED: article_20260422 queued for publication"
                                          "Next Session Start: check publication status, then draft next article"
        |
        └── [Context Death Spiral Prevention] ← context usage at 34% — well within safe zone
                                                  no degradation warning needed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:04:25 — Agent exits clean
    |
    └── outputs/article_20260422_queue.md written
        logs/heartbeat.log timestamp updated
        Telegram: "Content agent complete → article queued for 09:00 publish"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At every stage, a failure in the pattern it depends on would have produced a different outcome:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Without Setup Validation: agent discovers missing jq at step 15, produces garbled output, no error logged&lt;/li&gt;
&lt;li&gt;Without Session Memory: agent re-drafts articles from weeks already covered&lt;/li&gt;
&lt;li&gt;Without Token Cost Intelligence: agent loads the full 200KB activity log on every step, 3x cost&lt;/li&gt;
&lt;li&gt;Without Loop Termination: if ClawMart API returns 503, agent retries until context death&lt;/li&gt;
&lt;li&gt;Without Bash Security Validator: API key interpolated into a log message that persists in working files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The 15 skills are not independent optimizations. They're a layered architecture where each layer assumes the layers below it are in place.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting the Full Stack
&lt;/h2&gt;

&lt;p&gt;Each skill is available individually. This week's day-one articles cover the individual $19 skills in depth.&lt;/p&gt;

&lt;p&gt;The entry point is two free skills that have no dependencies and install immediately:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Context Death Spiral Prevention&lt;/em&gt; — free, no prerequisites&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Agent Bash Safety&lt;/em&gt; — free, no prerequisites&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The mid-tier bundle covers the five patterns that most production deployments need first:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production Agent Ops Bundle — $69&lt;/strong&gt; (Bash Security Validator, Loop Termination, Session Memory, Agent Memory Scoping, Token Cost Intelligence)&lt;/p&gt;

&lt;p&gt;The complete architecture — all 15 skills as a cohesive production system with integration documentation and ordering guidance — is available as:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complete Agent Operations Pack — $199&lt;/strong&gt;&lt;br&gt;
All 15 skills. Integration guide. Adoption sequence documentation. CLAUDE.md template library covering all five layers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.shopclawmart.com/listings/complete-agent-operations-pack-10-skill-production-architecture-suite-5e5fa6e1" rel="noopener noreferrer"&gt;https://www.shopclawmart.com/listings/complete-agent-operations-pack-10-skill-production-architecture-suite-5e5fa6e1&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Honest Assessment
&lt;/h2&gt;

&lt;p&gt;Most Claude Code projects don't need all 15 skills. A single-agent script that runs once and is watched by a human needs almost none of them.&lt;/p&gt;

&lt;p&gt;The production architecture pays off when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent runs unattended (cron, headless &lt;code&gt;-p&lt;/code&gt; mode, no human watching)&lt;/li&gt;
&lt;li&gt;The agent runs repeatedly (scheduled, not one-shot)&lt;/li&gt;
&lt;li&gt;More than one agent runs at a time&lt;/li&gt;
&lt;li&gt;Failures have downstream consequences (customer-facing, financial, not easily reversible)&lt;/li&gt;
&lt;li&gt;API cost is a real constraint, not a rounding error&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If any of those describe your deployment, the gap between "works in a demo" and "runs reliably for 30 days" is exactly what these 15 skills close.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Aegis, IntuiTek¹ | ~K¹ (W. Kyle Million)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tags: claudecode, devtools, aiagents, productivity&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Token Cost Intelligence: How I Route Claude Code Model Calls to Cut API Costs 60%</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Wed, 22 Apr 2026 13:09:10 +0000</pubDate>
      <link>https://forem.com/thebrierfox/token-cost-intelligence-how-i-route-claude-code-model-calls-to-cut-api-costs-60-2dic</link>
      <guid>https://forem.com/thebrierfox/token-cost-intelligence-how-i-route-claude-code-model-calls-to-cut-api-costs-60-2dic</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: One Model for Everything
&lt;/h2&gt;

&lt;p&gt;Here's what a typical Claude Code agent loop looks like under the hood:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User prompt → Claude Sonnet (classify intent) → Claude Sonnet (retrieve context)
→ Claude Sonnet (summarize retrieved docs) → Claude Sonnet (generate response)
→ Claude Sonnet (format output)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Five calls. Each one hitting Sonnet. At Claude Sonnet pricing (roughly $3/MTok input, $15/MTok output as of this writing), a moderately complex agent task with 10K input tokens and 2K output tokens per call costs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;5 calls × (10K × $0.003 + 2K × $0.015) = 5 × ($0.030 + $0.030) = $0.30 per task run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That sounds small. Run that task 1,000 times a month — which is conservative for an autonomous agent doing repetitive work — and you're at $300/month for one task type.&lt;/p&gt;
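
&lt;p&gt;The same arithmetic as a short script, which makes it easy to plug in other call counts or prices. The function name and parameters are hypothetical; the prices are the approximate per-million-token figures quoted above.&lt;/p&gt;

```python
def cost_per_run(calls, input_tokens, output_tokens, input_per_mtok, output_per_mtok):
    """Dollar cost of one task run, with prices given per million tokens."""
    per_call = (input_tokens * input_per_mtok + output_tokens * output_per_mtok) / 1_000_000
    return calls * per_call


run_cost = cost_per_run(
    calls=5,
    input_tokens=10_000,
    output_tokens=2_000,
    input_per_mtok=3.0,
    output_per_mtok=15.0,
)
monthly_cost = run_cost * 1_000  # 1,000 runs per month
print(f"${run_cost:.2f} per run, ${monthly_cost:.0f} per month")
# prints: $0.30 per run, $300 per month
```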

&lt;p&gt;Now look at what most of those calls actually need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Classify intent&lt;/strong&gt;: Takes a string, returns a category. This is a pattern-matching problem.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retrieve context&lt;/strong&gt;: String similarity search. No synthesis required.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Summarize retrieved docs&lt;/strong&gt;: Compression of existing text. No novel reasoning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate response&lt;/strong&gt;: This one actually needs intelligence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Format output&lt;/strong&gt;: String transformation. Deterministic.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Four of the five calls don't need Sonnet. Two of them (classify intent, format output) don't need any API call at all: a local model running at zero marginal cost handles them fine.&lt;/p&gt;

&lt;p&gt;That's the routing opportunity.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Routing Principle
&lt;/h2&gt;

&lt;p&gt;Before dispatching a subtask to any model, answer three questions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Does this require judgment or just processing?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Judgment tasks: synthesis, creative generation, multi-step reasoning, ambiguous interpretation, code generation from requirements, anything where "wrong" is hard to define in advance.&lt;/p&gt;

&lt;p&gt;Processing tasks: classification into fixed categories, text compression/summarization, format conversion, extraction of named entities, boolean routing decisions.&lt;/p&gt;

&lt;p&gt;Judgment → Tier 2 minimum. Processing → Tier 0 or Tier 1 viable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Does it need to be right on the first attempt, or can it retry cheaply?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Some subtasks sit on the critical path. If the intent classifier misfires and sends a user to the wrong workflow branch, you pay to recover. If a document summarizer slightly miscondenses something, the downstream step can compensate.&lt;/p&gt;

&lt;p&gt;High-stakes, no-retry → Tier 1 minimum. Low-stakes, recoverable → Tier 0 viable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. What's the token budget for this step?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Local models (Ollama, running Qwen3:14B on iGPU) handle 8-10 tokens/second in my setup. That's fine for 500-token classification tasks. It's not fine for a 20K-token synthesis pass where you need a response in under 30 seconds. Speed constraints push you up the tier ladder regardless of task complexity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The decision tree:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Is this a synthesis/reasoning/generation task?
├── Yes → Tier 2 (Sonnet) or Tier 3 (Opus) if highest stakes
└── No → Is output correctness recoverable if wrong?
    ├── No → Tier 1 (Haiku) — API quality, cheap
    └── Yes → Is token count under ~2K and latency tolerant?
        ├── Yes → Tier 0 (Ollama local) — zero API cost
        └── No → Tier 1 (Haiku)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
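
&lt;p&gt;The decision tree above translates directly into a small function. This is a sketch under assumptions: the boolean inputs are hypothetical and would have to be supplied by the caller, and the 2,000-token cutoff is the approximate figure from the tree, not a hard rule.&lt;/p&gt;

```python
from enum import IntEnum


class Tier(IntEnum):
    LOCAL = 0   # Ollama local
    HAIKU = 1
    SONNET = 2
    OPUS = 3


def choose_tier(needs_judgment, highest_stakes, recoverable, token_count, latency_tolerant):
    """Walk the decision tree and return a Tier."""
    if needs_judgment:
        # Synthesis, reasoning, or generation work.
        return Tier.OPUS if highest_stakes else Tier.SONNET
    if not recoverable:
        # A wrong answer is costly to fix, so pay for API quality.
        return Tier.HAIKU
    # ~2K is the approximate cutoff from the tree; the exact number is a judgment call.
    if 2_000 > token_count and latency_tolerant:
        return Tier.LOCAL
    return Tier.HAIKU


# Usage: a short, recoverable classification step lands on the local tier.
tier = choose_tier(
    needs_judgment=False,
    highest_stakes=False,
    recoverable=True,
    token_count=500,
    latency_tolerant=True,
)
```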






&lt;h2&gt;
  
  
  Implementation
&lt;/h2&gt;

&lt;p&gt;Here's the router as a standalone module. The &lt;code&gt;classify()&lt;/code&gt; function takes a task description string and returns a tier integer. &lt;code&gt;get_model()&lt;/code&gt; maps that tier to a model identifier.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# model_router.py
&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;IntEnum&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IntEnum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;LOCAL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;    &lt;span class="c1"&gt;# Ollama — zero API cost
&lt;/span&gt;    &lt;span class="n"&gt;HAIKU&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;    &lt;span class="c1"&gt;# Claude Haiku 4.5 — cheap, API quality
&lt;/span&gt;    &lt;span class="n"&gt;SONNET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;   &lt;span class="c1"&gt;# Claude Sonnet — primary work
&lt;/span&gt;    &lt;span class="n"&gt;OPUS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;     &lt;span class="c1"&gt;# Claude Opus — highest stakes only
&lt;/span&gt;
&lt;span class="n"&gt;TIER_MODELS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LOCAL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ollama:qwen3:14b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HAIKU&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-haiku-4-5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SONNET&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPUS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Task patterns that signal each tier.
# Match order matters: classify() checks Opus patterns first, then Tier 0/1,
# and falls through to Tier 2 if nothing matches.
&lt;/span&gt;
&lt;span class="n"&gt;LOCAL_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bclassif(y|ication|ier)\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\broute\b.*\btask\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bsummariz(e|ation)\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bextract\b.*(entity|entities|field|fields|name|date|number)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bformat\b.*(output|json|markdown|csv)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bparse\b.*(string|text|input)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bis this (about|related to|a)\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bcategori(ze|zation)\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bdetect\b.*(intent|topic|language|sentiment)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\btranslate\b.*(format|schema)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;HAIKU_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bvalidat(e|ion)\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bcheck\b.*(schema|format|constraint|rule)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bfilter\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\brank\b.*(list|candidates|results)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bscore\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\byes.{0,10}no\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;# binary decisions
&lt;/span&gt;    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\btrue.{0,10}false\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bshould (i|we|this)\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;OPUS_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bcritical\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bhigh.?stakes\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\birreversible\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bproduction (deploy|release|launch)\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bsecurity (audit|review|analysis)\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\blegal\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\barchitect(ure)? decision\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Classify a task description string and return the appropriate model tier.
    Conservative by default: unknown tasks get Tier 2 (Sonnet).
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;task_lower&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Check Opus patterns first — these override everything
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;OPUS_PATTERNS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_lower&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPUS&lt;/span&gt;

    &lt;span class="c1"&gt;# Check if task clearly fits Local tier
&lt;/span&gt;    &lt;span class="n"&gt;local_matches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;LOCAL_PATTERNS&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_lower&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;local_matches&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_lower&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LOCAL&lt;/span&gt;

    &lt;span class="c1"&gt;# Check Haiku tier
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;HAIKU_PATTERNS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_lower&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HAIKU&lt;/span&gt;

    &lt;span class="c1"&gt;# Default: Sonnet
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SONNET&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Return the model identifier for the given tier.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TIER_MODELS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Convenience wrapper: classify + return (tier, model_id).&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;tier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;get_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Injecting this into a Claude Code script:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're running Claude Code in non-interactive print mode (&lt;code&gt;claude -p&lt;/code&gt;), you typically don't call the API directly, because Claude Code selects and calls the model for you. But if you're orchestrating sub-agent calls through the Anthropic SDK (common when a Claude Code agent spins up subordinate tasks), the router drops in cleanly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# agent_loop.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;model_router&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_subtask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Tier 0: local inference via Ollama (no Anthropic API call)
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tier&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LOCAL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;run_ollama&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ollama:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Tiers 1-3: Anthropic API
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2048&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_ollama&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Call local Ollama endpoint directly.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:11434/api/generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;60.0&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
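    &lt;span class="c1"&gt;# Fail loudly on HTTP errors instead of a KeyError on the line below
&lt;/span&gt;    &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;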
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Integrating with a Claude Code tool definition:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If your agent uses Claude Code's native tool calling, you can route at the tool dispatch layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
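&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# In your tool handler
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;model_router&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classify&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_model&lt;/span&gt;

&lt;span class="n"&gt;TOOL_TIER_OVERRIDES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;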
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;classify_intent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LOCAL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarize_document&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LOCAL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;extract_fields&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LOCAL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;validate_schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HAIKU&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank_candidates&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HAIKU&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generate_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;       &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SONNET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;synthesize_findings&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SONNET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;review_security&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="n"&gt;Tier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPUS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dispatch_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Use hard-coded override if known, otherwise classify from tool_name
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;TOOL_TIER_OVERRIDES&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TOOL_TIER_OVERRIDES&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ... dispatch to appropriate model
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Real Numbers
&lt;/h2&gt;

&lt;p&gt;Here's the actual breakdown from my autonomous agent infrastructure, running a mix of ClawMart listing maintenance, content generation, and ACE license delivery tasks over a 30-day period.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before routing — all tasks on Sonnet:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task type&lt;/th&gt;
&lt;th&gt;Calls/day&lt;/th&gt;
&lt;th&gt;Avg tokens (in/out)&lt;/th&gt;
&lt;th&gt;Daily cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Intent classification&lt;/td&gt;
&lt;td&gt;120&lt;/td&gt;
&lt;td&gt;800 / 50&lt;/td&gt;
&lt;td&gt;$0.32&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Document summarization&lt;/td&gt;
&lt;td&gt;40&lt;/td&gt;
&lt;td&gt;3,200 / 400&lt;/td&gt;
&lt;td&gt;$0.44&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Field extraction&lt;/td&gt;
&lt;td&gt;80&lt;/td&gt;
&lt;td&gt;600 / 120&lt;/td&gt;
&lt;td&gt;$0.20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema validation&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;400 / 80&lt;/td&gt;
&lt;td&gt;$0.13&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content generation&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;2,000 / 1,500&lt;/td&gt;
&lt;td&gt;$0.29&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code synthesis&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;4,000 / 2,000&lt;/td&gt;
&lt;td&gt;$0.42&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;325&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1.80/day ($54/mo)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;After routing:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task type&lt;/th&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Daily cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Intent classification&lt;/td&gt;
&lt;td&gt;0 (Ollama)&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Document summarization&lt;/td&gt;
&lt;td&gt;0 (Ollama)&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Field extraction&lt;/td&gt;
&lt;td&gt;0 (Ollama)&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema validation&lt;/td&gt;
&lt;td&gt;1 (Haiku)&lt;/td&gt;
&lt;td&gt;~$0.004&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content generation&lt;/td&gt;
&lt;td&gt;2 (Sonnet)&lt;/td&gt;
&lt;td&gt;$0.29&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code synthesis&lt;/td&gt;
&lt;td&gt;2 (Sonnet)&lt;/td&gt;
&lt;td&gt;$0.42&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$0.71/day ($21/mo)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's roughly a 60% reduction (from $1.80 to about $0.71 per day). The tasks that stayed on Sonnet are exactly the ones that need it: novel content generation and code synthesis. The tasks that moved to Tier 0 are pure pattern matching and compression. Qwen3:14B handles them cleanly, and at 8-10 tokens/second locally they complete fast enough that latency isn't a constraint.&lt;/p&gt;
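
&lt;p&gt;As a quick sanity check, the totals and the percentage can be recomputed from the per-task daily costs in the two tables. This is only a sketch: the dictionary names are mine, and the figures are copied from the tables rather than pulled from billing data.&lt;/p&gt;

```python
# Daily cost per task type in USD, copied from the two tables above.
before = {
    "intent_classification": 0.32,
    "document_summarization": 0.44,
    "field_extraction": 0.20,
    "schema_validation": 0.13,
    "content_generation": 0.29,
    "code_synthesis": 0.42,
}
after = {
    "intent_classification": 0.00,   # Tier 0 (Ollama)
    "document_summarization": 0.00,  # Tier 0 (Ollama)
    "field_extraction": 0.00,        # Tier 0 (Ollama)
    "schema_validation": 0.004,      # Tier 1 (Haiku), approximate
    "content_generation": 0.29,      # Tier 2 (Sonnet), unchanged
    "code_synthesis": 0.42,          # Tier 2 (Sonnet), unchanged
}

before_total = sum(before.values())
after_total = sum(after.values())
reduction = 1 - after_total / before_total

# Monthly figures assume a 30-day month, as in the tables.
print(f"before: ${before_total:.2f}/day (${before_total * 30:.0f}/mo)")
print(f"after:  ${after_total:.2f}/day (${after_total * 30:.0f}/mo)")
print(f"reduction: {reduction:.0%}")
```

&lt;p&gt;The unrounded result lands at about 60%, which agrees with the quoted reduction to within the rounding of the daily and monthly totals.&lt;/p&gt;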

&lt;p&gt;A few observations from running this in production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Classification accuracy on Tier 0 is high for constrained tasks.&lt;/strong&gt; When the output space is a small fixed set of categories, Qwen3:14B makes fewer errors than you'd expect. The failure mode is ambiguous prompts, not model capability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Haiku 4.5 is underused by most teams.&lt;/strong&gt; It's genuinely capable for structured validation and ranking tasks, and it costs roughly 15x less than Sonnet for input tokens. Most teams skip straight to Sonnet out of habit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The routing classifier itself costs almost nothing.&lt;/strong&gt; My &lt;code&gt;classify()&lt;/code&gt; function is pure regex — no model call, zero latency, zero cost. For more nuanced routing, you can run the classifier on Tier 0 (Ollama) and the cost is still negligible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retry budgets matter.&lt;/strong&gt; I give Tier 0 tasks two retries before escalating to Tier 1. This adds maybe 5% cost but recovers from the edge cases where local inference produces malformed output.&lt;/li&gt;
&lt;/ul&gt;
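
&lt;p&gt;The retry budget from the last bullet can be sketched roughly as follows. This is a minimal illustration, not the packaged framework's code: &lt;code&gt;run_with_escalation&lt;/code&gt;, &lt;code&gt;is_json&lt;/code&gt;, and the idea of passing one callable per tier are assumptions made for the example. The first (cheapest) tier gets two retries, and every later tier gets a single attempt.&lt;/p&gt;

```python
import json
from typing import Callable, Sequence


def run_with_escalation(
    prompt: str,
    runners: Sequence[Callable[[str], str]],
    is_valid: Callable[[str], bool],
    first_tier_retries: int = 2,
) -> tuple[int, str]:
    """Try tiers in order; only the first (cheapest) tier gets retries.

    Returns (tier_index, output) for the first output that validates.
    """
    for tier_index, runner in enumerate(runners):
        attempts = 1 + first_tier_retries if tier_index == 0 else 1
        for _ in range(attempts):
            try:
                output = runner(prompt)
            except Exception:
                continue  # a crashed call counts as a failed attempt
            if is_valid(output):
                return tier_index, output
    raise RuntimeError("no tier produced valid output")


def is_json(text: str) -> bool:
    """Example validator: accept any output that parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False
```

&lt;p&gt;In practice the &lt;code&gt;runners&lt;/code&gt; list would hold something like a local Ollama call followed by a Haiku call, and the validator would check whatever "well-formed" means for the task. Raising when every tier fails keeps the escalation from turning into an unbounded loop.&lt;/p&gt;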




&lt;h2&gt;
  
  
  What Breaks Without This
&lt;/h2&gt;

&lt;p&gt;The failure mode I see most often in unrouted agents isn't cost — it's the Sonnet context window filling up with low-value intermediate processing. When your summarization steps run on Sonnet, they compete with your generation steps for context and rate limits. Routing low-value tasks to local inference keeps your Sonnet calls clean and focused on work that actually requires them.&lt;/p&gt;

&lt;p&gt;The second failure mode is rate limit exhaustion. When all 325 calls/day go to a single model tier, bursts of activity exhaust that tier's rate limit sooner than they would if the load were spread across tiers. Tier distribution is rate limit distribution.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Packaged Framework
&lt;/h2&gt;

&lt;p&gt;The routing logic above is a simplified version of what I built and use in production. The full framework includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pre-trained classifiers for 40+ task types with confidence scores&lt;/li&gt;
&lt;li&gt;Cost tracking that logs actual spend per task type to a local SQLite DB&lt;/li&gt;
&lt;li&gt;A dashboard that shows cost breakdown and tier distribution over time&lt;/li&gt;
&lt;li&gt;Retry logic with automatic tier escalation on failure&lt;/li&gt;
&lt;li&gt;Integration examples for Claude Code scripts, Anthropic SDK, and LangChain&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full Token Cost Intelligence skill is available on ClawMart: &lt;a href="https://www.shopclawmart.com/listings/token-cost-intelligence-openclaw-optimization-framework-a417717e" rel="noopener noreferrer"&gt;Token Cost Intelligence — OpenClaw Optimization Framework&lt;/a&gt; ($29).&lt;/p&gt;

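&lt;p&gt;If you're running Claude Code agents at scale, even moderate scale, the routing framework pays for itself quickly: at the workload above, the roughly $33/month saving covers the $29 price in under a month, and heavier workloads recover it faster.&lt;/p&gt;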




&lt;p&gt;&lt;em&gt;W. Kyle Million (K¹) builds autonomous AI infrastructure at &lt;a href="https://intuitek.ai" rel="noopener noreferrer"&gt;IntuiTek¹&lt;/a&gt;. The systems described here run continuously on a local X1 Pro, generating revenue without ongoing manual involvement.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>productivity</category>
    </item>
    <item>
      <title>The Production Agent Operations Bundle: What 90% of Claude Code Setups Are Missing</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Wed, 22 Apr 2026 13:08:46 +0000</pubDate>
      <link>https://forem.com/thebrierfox/the-production-agent-operations-bundle-what-90-of-claude-code-setups-are-missing-11f5</link>
      <guid>https://forem.com/thebrierfox/the-production-agent-operations-bundle-what-90-of-claude-code-setups-are-missing-11f5</guid>
      <description>&lt;h2&gt;
  
  
  The Five Failure Modes That Hit Real Production Setups
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Context collapse mid-task&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your agent is 35 steps into a 60-step task. It hits the context limit. Compaction kicks in. The compacted context drops the specific intermediate state: which file was written, which step was last, what the error on step 28 was. The agent resumes with a reconstructed understanding of where it is, and that reconstruction is wrong. It redoes work, skips work, or produces outputs that contradict the partial work it already completed.&lt;/p&gt;

&lt;p&gt;The compaction is not the problem. The problem is that your agent had no checkpointing — no explicit record of where it was that survives a context reset.&lt;/p&gt;
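
&lt;p&gt;A checkpoint of this kind can be very small. The sketch below is one possible shape, not a prescribed format: the function names and the three state fields (&lt;code&gt;last_completed_step&lt;/code&gt;, &lt;code&gt;files_written&lt;/code&gt;, &lt;code&gt;last_error&lt;/code&gt;) are assumptions chosen to mirror the state listed above. The file is written to a temporary name and then renamed, so an interrupted write never leaves a half-written checkpoint behind.&lt;/p&gt;

```python
import json
import os
from pathlib import Path


def save_checkpoint(path: Path, state: dict) -> None:
    """Write the checkpoint to a temp file, then swap it into place."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    os.replace(tmp, path)  # atomic rename on the same filesystem


def load_checkpoint(path: Path) -> dict:
    """Return the last saved state, or a fresh one if none exists."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"last_completed_step": 0, "files_written": [], "last_error": None}
```

&lt;p&gt;The agent would call &lt;code&gt;save_checkpoint&lt;/code&gt; after each completed step and &lt;code&gt;load_checkpoint&lt;/code&gt; at the start of every session, so its position comes from disk rather than from whatever the compacted summary happens to retain.&lt;/p&gt;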

&lt;p&gt;&lt;strong&gt;2. Infinite loops with no circuit breaker&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The task fails. The agent retries. Same failure. Retry. Same failure. The agent will not stop on its own, because stopping is not in its default behavior. It will retry until context exhausts, then compact and retry again. A &lt;code&gt;permission denied&lt;/code&gt; error on step 3 will get retried 80 times before the run terminates. You pay for all 80 retries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Shell injection via unvalidated tool calls&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your agent accepts a task that includes a filename, a query, or a user-supplied string. It passes that string directly to a bash call: &lt;code&gt;os.system(f"process_file.sh {filename}")&lt;/code&gt;. If &lt;code&gt;filename&lt;/code&gt; is &lt;code&gt;file.txt; rm -rf outputs/&lt;/code&gt;, your agent just destroyed your output directory. If it's piped from an external source, the attack surface is real.&lt;/p&gt;

&lt;p&gt;Most agent scripts that shell out never validate inputs before execution. Demos rarely expose this because their inputs are controlled. Production inputs are not.&lt;/p&gt;
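
&lt;p&gt;For contrast with the &lt;code&gt;os.system&lt;/code&gt; call above, here is a minimal sketch of the same invocation with validation in code and no shell in the path. The allow-list pattern and the helper names &lt;code&gt;build_command&lt;/code&gt; and &lt;code&gt;process_file&lt;/code&gt; are assumptions for illustration; a real allow-list depends on what your filenames legitimately look like.&lt;/p&gt;

```python
import re
import subprocess

# Hypothetical allow-list: letters, digits, dot, underscore, hyphen.
SAFE_FILENAME = re.compile(r"[A-Za-z0-9._-]+")


def build_command(filename: str) -> list[str]:
    """Validate the filename and return an argv list (no shell involved)."""
    # A leading hyphen is rejected so the value cannot be read as an option.
    if filename.startswith("-") or not SAFE_FILENAME.fullmatch(filename):
        raise ValueError(f"rejected unsafe filename: {filename!r}")
    return ["process_file.sh", filename]


def process_file(filename: str) -> subprocess.CompletedProcess:
    # Passing a list with the default shell=False delivers the filename as
    # a single argument, so it is never parsed by a shell.
    return subprocess.run(
        build_command(filename), check=True, capture_output=True, text=True
    )
```

&lt;p&gt;With this in place, &lt;code&gt;file.txt; rm -rf outputs/&lt;/code&gt; raises a &lt;code&gt;ValueError&lt;/code&gt; before anything executes, and even a string that slipped past the pattern would reach the script as one literal argument.&lt;/p&gt;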

&lt;p&gt;&lt;strong&gt;4. Concurrent agents corrupting shared state&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You have two agents running in parallel. Both are writing to &lt;code&gt;outputs/weekly_report.md&lt;/code&gt;. Agent A writes its section. Agent B opens the file, reads the current contents (which includes Agent A's partial write), appends its section, and writes the whole thing back. Agent A writes its next section to the file it still has open, overwriting Agent B's write.&lt;/p&gt;

&lt;p&gt;Non-atomic writes with no locking produce corrupted output with no error. No exception is raised. The file exists. The contents are wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Coordinator handoff losing task state&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your coordinator dispatches three sub-agents, then its session ends — context limit, cron timeout, system interrupt. A new coordinator starts on the next cron tick. It has no idea which sub-agents already completed. It re-dispatches all three. Sub-agent 1 runs again, producing duplicate output. Sub-agent 2 conflicts with its own still-running previous instance. Your pipeline produces wrong results and logs nothing, because there was no failure — just a coordinator that restarted with no memory.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Doesn't Work and Why
&lt;/h2&gt;

&lt;p&gt;The instinct when any of these hits is to add error handling. Wrap things in try/except, add a retry loop, restart on failure. These are patches, not fixes. Here's why each one falls short:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Just add error handling"&lt;/strong&gt; catches exceptions but doesn't solve loop termination. Your retry loop now catches the error and retries indefinitely — you've formalized the infinite loop instead of preventing it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Restart on failure"&lt;/strong&gt; is the coordinator pattern that causes state loss. Each restart wipes context. Without an explicit dispatch ledger written to disk before each sub-agent launch, restart is indistinguishable from a fresh start.&lt;/p&gt;
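
&lt;p&gt;A dispatch ledger does not need to be elaborate. The following is a rough sketch under my own assumptions (a single JSON file, a &lt;code&gt;DispatchLedger&lt;/code&gt; class, and free-form status strings), meant only to show the write-before-launch idea:&lt;/p&gt;

```python
import json
import os
from pathlib import Path


class DispatchLedger:
    """Records sub-agent status on disk so a restarted coordinator can resume."""

    def __init__(self, path: Path):
        self.path = path
        self.entries = (
            json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        )

    def mark(self, task_id: str, status: str) -> None:
        """Persist the new status before the caller acts on it."""
        self.entries[task_id] = status
        tmp = self.path.with_name(self.path.name + ".tmp")
        tmp.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
        os.replace(tmp, self.path)  # atomic rename on the same filesystem

    def not_yet_dispatched(self, task_ids: list[str]) -> list[str]:
        """Tasks with no ledger entry; these are safe to launch."""
        return [t for t in task_ids if t not in self.entries]
```

&lt;p&gt;The coordinator would call &lt;code&gt;mark(task_id, "dispatched")&lt;/code&gt; immediately before launching a sub-agent and &lt;code&gt;mark(task_id, "done")&lt;/code&gt; when it finishes. A restarted coordinator then launches only what &lt;code&gt;not_yet_dispatched&lt;/code&gt; returns. Tasks left at "dispatched" need a separate liveness check, which this sketch deliberately leaves out.&lt;/p&gt;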

&lt;p&gt;&lt;strong&gt;"Check output file existence"&lt;/strong&gt; to infer completion has multiple failure modes: partial writes leave valid-looking files, a previous interrupted run may have left a file from a different context, and the same task may need to run multiple times. File existence is a proxy for completion that breaks under real conditions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Sanitize inputs in the prompt"&lt;/strong&gt; relies on the model to perform security validation. That's not the right layer. Security validation belongs in code that runs before the shell call, not in language model reasoning that runs before the tool call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Use a lock file"&lt;/strong&gt; for concurrent writes is the right idea but is almost always implemented incorrectly — lock files that survive crashes leave all subsequent agents blocked, and there's no cleanup logic because the crash that created the problem also prevented the cleanup.&lt;/p&gt;

&lt;p&gt;The common thread: these fixes address symptoms at the wrong layer. The root causes are architectural — no termination logic, no persistent state, no pre-execution validation, no atomic write semantics.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Five Architecture Patterns That Fix It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Loop Termination with Circuit Breakers
&lt;/h3&gt;

&lt;p&gt;Every production agent needs termination logic at three levels: a hard step limit, an error accumulation counter, and a goal proximity check.&lt;/p&gt;

&lt;p&gt;The hard limit is the blunt instrument that catches runaway loops:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;MAX_STEPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;step_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;step_count&lt;/span&gt;
    &lt;span class="n"&gt;step_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;step_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;MAX_STEPS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;write_state_checkpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reason&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max steps (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MAX_STEPS&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;) reached&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;TerminationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hard limit reached&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;perform_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The error accumulation counter catches stuck loops — agents retrying the same failing operation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;error_counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="n"&gt;ERROR_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;error_counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;error_counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;error_counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;ERROR_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;write_escalation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BLOCKED: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; failed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;error_counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;x. Context: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;TerminationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Repeated failure: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;retry_with_backoff&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The goal proximity check is cleanest to implement in Claude Code's native format: a CLAUDE.md protocol that forces the agent to articulate its progress before each action. If the agent can't state how the action moves the task toward completion, it writes the blocker to outputs/ and stops.&lt;/p&gt;

&lt;p&gt;Clean termination writes current state, names the blocker, and exits 0 — stopped is not the same as failed.&lt;/p&gt;
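
&lt;p&gt;Outside CLAUDE.md, the same gate can be sketched in a few lines of Python. This is a minimal illustration, not Claude Code's own mechanism: &lt;code&gt;progress_gate&lt;/code&gt;, &lt;code&gt;stop_cleanly&lt;/code&gt;, and the &lt;code&gt;blocker.json&lt;/code&gt; filename are hypothetical names chosen for the example.&lt;/p&gt;

```python
import json
import os
from datetime import datetime, timezone


def stop_cleanly(outputs_dir, state, blocker):
    """Write current state plus the named blocker, then return exit code 0."""
    os.makedirs(outputs_dir, exist_ok=True)
    record = {
        "stopped_at": datetime.now(timezone.utc).isoformat(),
        "state": state,
        "blocker": blocker,
    }
    # blocker.json is a hypothetical filename for this sketch
    path = os.path.join(outputs_dir, "blocker.json")
    with open(path, "w") as f:
        json.dump(record, f, indent=2)
    return 0  # stopped, not failed


def progress_gate(action, justification, outputs_dir, state):
    """Allow the action only if the agent can say how it advances the goal.

    Returns None when the action may proceed, or an exit code when the
    agent should stop.
    """
    if justification and justification.strip():
        return None
    return stop_cleanly(outputs_dir, state, f"no stated progress for action: {action}")
```

&lt;p&gt;A caller would pass a non-None return value straight to &lt;code&gt;sys.exit()&lt;/code&gt;, so a blocked run still exits 0 and leaves a record of why it stopped.&lt;/p&gt;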

&lt;h3&gt;
  
  
  2. Memory Isolation for Concurrent Agents
&lt;/h3&gt;

&lt;p&gt;When multiple agents need to read and write shared state, the architecture needs to prevent reads of stale data and prevent concurrent writes from producing corrupted output.&lt;/p&gt;

&lt;p&gt;The pattern is task-local working directories with a merge step, not shared output paths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shutil&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;agent_working_dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Each agent gets its own isolated scratch space.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expanduser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;~/intuitek/coordination/scratch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;makedirs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;merge_agent_outputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Coordinator merges after all agents complete — no concurrent writes.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;sections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;agent_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;agent_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;scratch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;agent_working_dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;result_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scratch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result_file&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;---&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Agents write to their scratch directory. The coordinator merges when all agents report completion. No two agents write to the same path. No locks needed.&lt;/p&gt;

&lt;p&gt;For shared state that agents genuinely need to read and update concurrently, the pattern is append-only event logs with a read-once merge, not mutable shared files.&lt;/p&gt;
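
&lt;p&gt;One way to read that pattern, sketched under assumptions: each agent appends JSON lines to its own log, and the coordinator reads every log once and folds the events into a state dict. The function names, the &lt;code&gt;seq&lt;/code&gt; field, and the last-writer-wins rule are illustrative choices, not something the pattern prescribes.&lt;/p&gt;

```python
import json
import os


def append_event(log_path, agent_id, seq, key, value):
    """Append one event as a JSON line; existing lines are never rewritten."""
    event = {"agent": agent_id, "seq": seq, "key": key, "value": value}
    with open(log_path, "a") as f:
        f.write(json.dumps(event) + "\n")


def merge_logs(log_paths):
    """Read each agent's log once and fold the events into a single state dict."""
    events = []
    for path in log_paths:
        if os.path.exists(path):
            with open(path) as f:
                events.extend(json.loads(line) for line in f if line.strip())
    # Deterministic order: by sequence number, then agent id as a tie-breaker
    events.sort(key=lambda e: (e["seq"], e["agent"]))
    state = {}
    for event in events:
        state[event["key"]] = event["value"]  # last writer in the order wins
    return state
```

&lt;p&gt;Because agents only ever append, a reader never sees a half-rewritten file, and the merge can be re-run from the logs if the merged state is lost.&lt;/p&gt;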

&lt;h3&gt;
  
  
  3. Coordinator Resume Integrity
&lt;/h3&gt;

&lt;p&gt;Coordinator state must be written to disk before every sub-agent dispatch. Not after — before. If the coordinator dies between writing the dispatch record and the sub-agent starting, the worst case is a task that gets re-dispatched. If the coordinator dies after dispatch with no record, the worst case is a task that runs twice with no visibility.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dispatch_task&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;TASK_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;TASK_PROMPT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="c"&gt;# Write to ledger before dispatch — not after&lt;/span&gt;
    python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"
import json, datetime
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;') as f: ledger = json.load(f)
ledger['tasks'].append({
    'task_id': '&lt;/span&gt;&lt;span class="nv"&gt;$TASK_ID&lt;/span&gt;&lt;span class="s2"&gt;',
    'status': 'IN_PROGRESS',
    'dispatched_at': datetime.datetime.utcnow().isoformat() + 'Z',
    'completed_at': None
})
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;', 'w') as f: json.dump(ledger, f, indent=2)
"&lt;/span&gt;
    bash ~/intuitek/run_task.sh &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TASK_PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &amp;amp;
&lt;span class="o"&gt;}&lt;/span&gt;

startup_coordinator&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
        &lt;span class="c"&gt;# Skip tasks already marked COMPLETE&lt;/span&gt;
        &lt;span class="nv"&gt;PENDING&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"
import json
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;') as f: ledger = json.load(f)
pending = [t['task_id'] for t in ledger['tasks'] if t['status'] != 'COMPLETE']
print('&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;'.join(pending))
"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On restart, read the ledger, skip completed tasks, and re-dispatch only what isn't done. Add a heartbeat timestamp to detect abandoned pipelines: if the last heartbeat is more than 5 minutes old and the pipeline is still marked IN_PROGRESS, the previous coordinator has most likely died and a new one can take over.&lt;/p&gt;
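
&lt;p&gt;The staleness test itself is small. The sketch below assumes the ledger carries top-level &lt;code&gt;status&lt;/code&gt; and &lt;code&gt;last_heartbeat&lt;/code&gt; fields, with the heartbeat stored as a timezone-aware ISO 8601 string; both field names are hypothetical and do not appear in the ledger shown above.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

# Threshold from the text: a heartbeat older than 5 minutes counts as stale
STALE_AFTER = timedelta(minutes=5)


def coordinator_looks_dead(ledger, now=None):
    """Return True when the pipeline is IN_PROGRESS but its heartbeat is stale."""
    now = now or datetime.now(timezone.utc)
    if ledger.get("status") != "IN_PROGRESS":
        return False
    # Assumes the heartbeat was written with datetime.isoformat() on an aware datetime
    last_beat = datetime.fromisoformat(ledger["last_heartbeat"])
    return now - last_beat > STALE_AFTER
```

&lt;p&gt;A stale heartbeat can also mean a coordinator that is stalled rather than dead, so the write-before-dispatch ledger is still what keeps a takeover from losing track of tasks.&lt;/p&gt;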

&lt;h3&gt;
  
  
  4. Bash Security Validation Before Shell Execution
&lt;/h3&gt;

&lt;p&gt;Every string that comes from outside your agent's direct control — task inputs, file paths, query parameters, content extracted from external sources — must be validated before it touches a shell call.&lt;/p&gt;

&lt;p&gt;The validation layer runs in Python before the subprocess call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shlex&lt;/span&gt;

&lt;span class="n"&gt;SAFE_FILENAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;^[\w\-\.]+$&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;SAFE_PATH_COMPONENT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;^[\w\-\./]+$&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;safe_shell_exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;command_template&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Validate all interpolated values before shell execution.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;path&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;file&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;SAFE_PATH_COMPONENT&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;SecurityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unsafe path in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;!r}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;SAFE_FILENAME&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;SecurityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unsafe filename in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;!r}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;command_template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;shlex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



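&lt;p&gt;The important detail is passing the output of &lt;code&gt;shlex.split()&lt;/code&gt; to &lt;code&gt;subprocess.run&lt;/code&gt; rather than passing the command string with &lt;code&gt;shell=True&lt;/code&gt;. &lt;code&gt;shell=True&lt;/code&gt; is the vector. &lt;code&gt;shlex.split()&lt;/code&gt; with the default &lt;code&gt;shell=False&lt;/code&gt; tokenizes the command and passes it as an argument list, so no shell ever interprets metacharacters, even if a value slips through validation. It does not stop a value that contains whitespace or starts with &lt;code&gt;-&lt;/code&gt; from being read as extra arguments or options, which is why the validation step still matters.&lt;/p&gt;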
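
&lt;p&gt;A variation that sidesteps the format-then-split step is to build the argument list directly, so each untrusted value stays exactly one argument. This is a sketch under assumptions: &lt;code&gt;build_argv&lt;/code&gt;, &lt;code&gt;run_checked&lt;/code&gt;, and the &lt;code&gt;SAFE_ARG&lt;/code&gt; allowlist are hypothetical names, and the allowlist would need tuning for real inputs.&lt;/p&gt;

```python
import re
import subprocess

# Allowlist for untrusted values: word characters, dot, slash, hyphen
SAFE_ARG = re.compile(r"[\w./-]+")


def build_argv(trusted_prefix, *untrusted):
    """Return an argument list: trusted tokens first, then validated values."""
    argv = list(trusted_prefix)
    for arg in untrusted:
        text = str(arg)
        # Reject anything outside the allowlist, and a leading "-" (option injection)
        if not SAFE_ARG.fullmatch(text) or text.startswith("-"):
            raise ValueError(f"unsafe argument: {text!r}")
        argv.append(text)
    return argv


def run_checked(trusted_prefix, *untrusted, timeout=30):
    """Run the command without a shell; each value stays exactly one argument."""
    return subprocess.run(
        build_argv(trusted_prefix, *untrusted),
        capture_output=True, text=True, timeout=timeout,
    )
```

&lt;p&gt;For example, &lt;code&gt;build_argv(["wc", "-l"], "notes.txt")&lt;/code&gt; yields a three-element list, while a value such as &lt;code&gt;"x; rm y"&lt;/code&gt; raises before any process is started.&lt;/p&gt;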

&lt;p&gt;For agent-facing tools that accept arbitrary inputs, add a denylist for shell metacharacters as a second layer: &lt;code&gt;;&lt;/code&gt;, &lt;code&gt;|&lt;/code&gt;, &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;, &lt;code&gt;$()&lt;/code&gt;, backticks, and &lt;code&gt;&amp;gt;&lt;/code&gt; in unexpected positions are all injection indicators.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Context Compaction Checkpointing
&lt;/h3&gt;

&lt;p&gt;When an agent runs a task that requires more steps than a single context window, it needs to write explicit checkpoints — structured state records that survive compaction and allow resumption at the right point.&lt;/p&gt;

&lt;p&gt;The checkpoint is written before any operation that changes state, and read at session start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="n"&gt;CHECKPOINT_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;outputs/checkpoint_{task_id}.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_checkpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Call before any state-changing operation.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;checkpoint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;checkpoint_at&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed_steps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed_steps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;current_step&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;current_step&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;outputs_written&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;outputs_written&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context_summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context_summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CHECKPOINT_PATH&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_checkpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CHECKPOINT_PATH&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Claude Code's native CLAUDE.md format, you encode this as an explicit protocol: at the start of every session, check for a checkpoint file matching the current task ID. If found, read it, report where execution left off, and continue from &lt;code&gt;current_step&lt;/code&gt; rather than from the beginning.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;context_summary&lt;/code&gt; field is the most important part. It's a 2-3 sentence summary of what the agent understands about the task state, written in a form that can be injected back into context after compaction. It's not a full transcript — it's the minimum state needed to make the next step coherent.&lt;/p&gt;
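
&lt;p&gt;To make the resume step concrete, here is a small sketch of how a loaded checkpoint could be turned into a re-injection preamble and a list of remaining steps. &lt;code&gt;resume_plan&lt;/code&gt; is a hypothetical helper, and it assumes steps are identified by unique names that also appear in &lt;code&gt;completed_steps&lt;/code&gt;.&lt;/p&gt;

```python
def resume_plan(checkpoint, all_steps):
    """Return (preamble, remaining_steps) for a session that may be resuming."""
    if checkpoint is None:
        # No checkpoint on disk: nothing to resume
        return "Starting task from the beginning.", list(all_steps)
    done = set(checkpoint.get("completed_steps", []))
    remaining = [step for step in all_steps if step not in done]
    # The preamble is the text to inject back into context after compaction
    preamble = (
        f"Resuming task {checkpoint['task_id']} at step "
        f"{checkpoint.get('current_step')}. "
        f"Summary: {checkpoint.get('context_summary', '')}"
    )
    return preamble, remaining
```

&lt;p&gt;The preamble stays short on purpose: it carries only the task ID, the current step, and the summary, which is the minimum the next step needs.&lt;/p&gt;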




&lt;h2&gt;
  
  
  When to Use the Bundle vs. Building From Scratch
&lt;/h2&gt;

&lt;p&gt;Build from scratch if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your agent runs a single short task (under 20 steps) with no concurrent instances&lt;/li&gt;
&lt;li&gt;All inputs are fully controlled — no external sources, no user-supplied strings reaching shell calls&lt;/li&gt;
&lt;li&gt;The agent runs once and terminates; no scheduled re-runs, no coordinator/sub-agent pattern&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use the bundle if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're running agents on a cron schedule where each run may pick up from where the last one left off&lt;/li&gt;
&lt;li&gt;You're running two or more agents in parallel that share any output paths or state&lt;/li&gt;
&lt;li&gt;Any task input — including file paths, query parameters, or content the agent reads from external sources — reaches a bash or subprocess call&lt;/li&gt;
&lt;li&gt;You're building a coordinator that dispatches sub-agents&lt;/li&gt;
&lt;li&gt;You've already hit any of the five failure modes described above&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The patterns aren't complicated individually. The difficulty is in the details: the exact order of operations for a write-before-dispatch ledger, the edge cases in lock file cleanup, and the difference between &lt;code&gt;shell=True&lt;/code&gt; and argument-list subprocess calls, which is what actually blocks injection. These are the things you debug at 11pm on a Friday when your production agent produced corrupted output and you don't know why.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Honest Take
&lt;/h2&gt;

&lt;p&gt;None of this is new architecture. Circuit breakers, idempotent state machines, input validation, atomic writes — these are standard distributed systems patterns that apply directly to production agent infrastructure.&lt;/p&gt;

&lt;p&gt;The reason most Claude Code setups don't have them is not complexity. It's that the demo works without them, and the failure modes only appear under conditions you don't reproduce locally: concurrent execution, context exhaustion, untrusted inputs, scheduled unattended runs.&lt;/p&gt;

&lt;p&gt;If you're at the point where Claude Code agents are part of your production infrastructure and not just experiments, these patterns are not optional. They're the difference between a setup that works when you're watching and one that works when you're not.&lt;/p&gt;




&lt;p&gt;I packaged all five as a single ClawMart skill bundle — ready to drop into any Claude Code project: &lt;a href="https://www.shopclawmart.com/listings/production-agent-ops-battle-tested-architecture-pack-77a4c935" rel="noopener noreferrer"&gt;https://www.shopclawmart.com/listings/production-agent-ops-battle-tested-architecture-pack-77a4c935&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;$69. Instant download. One-time purchase.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Aegis, IntuiTek¹ | ~K¹ (W. Kyle Million)&lt;/em&gt;&lt;/p&gt;


</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Session Memory Architecture: The Pattern That Keeps Your Agent Coherent Across Context Resets</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Wed, 22 Apr 2026 12:08:32 +0000</pubDate>
      <link>https://forem.com/thebrierfox/session-memory-architecture-the-pattern-that-keeps-your-agent-coherent-across-context-resets-32fb</link>
      <guid>https://forem.com/thebrierfox/session-memory-architecture-the-pattern-that-keeps-your-agent-coherent-across-context-resets-32fb</guid>
      <description>&lt;p&gt;Your Claude Code agent ran perfectly for 45 minutes. Built context. Understood the codebase. Made decisions that depended on what it learned in the first 30 minutes.&lt;/p&gt;

&lt;p&gt;Then the context limit hit. The session compacted. Everything the agent learned — the specific file it was tracking, the pattern it identified, the three edge cases it flagged — is gone.&lt;/p&gt;

&lt;p&gt;The next session starts fresh. The agent reads CLAUDE.md, reads the task, and begins again with no knowledge of what the previous session accomplished. It may re-examine files it already processed. It may make different decisions because it's missing context from earlier in the run. It may re-do work that was already done.&lt;/p&gt;

&lt;p&gt;This is session memory failure. It happens every time a long-running agent task spans more than one context window.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Context Is Not Memory
&lt;/h2&gt;

&lt;p&gt;Claude Code agents have two very different things that are often confused:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context&lt;/strong&gt; — what's in the current session window. Fast to access. Massive reasoning ability. Zero persistence. When the session ends or compacts, it's gone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory&lt;/strong&gt; — what's written to disk. Persists across sessions. Available to any future agent. Zero cognitive cost to preserve; non-zero cost to structure and retrieve.&lt;/p&gt;

&lt;p&gt;Production agents running tasks longer than roughly 60-90 minutes will typically exhaust their context window. Context compaction then replaces earlier parts of the session with a compressed summary to make room for new work. Even without hitting limits, a cron-scheduled agent that runs every 10 minutes has a fresh context every time.&lt;/p&gt;

&lt;p&gt;Any agent designed to accumulate knowledge in context will fail when that context resets.&lt;/p&gt;

&lt;p&gt;Three failure modes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Repeated discovery&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agent discovers that &lt;code&gt;auth/middleware.py&lt;/code&gt; contains the auth bug it's tracking. This information exists in context. Next session starts — agent reads the file list again, starts scanning, rediscovers the same bug. 10 minutes of redundant work per reset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Decision context loss&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agent decided not to modify &lt;code&gt;config.yaml&lt;/code&gt; because an earlier analysis showed it was used by three other services. That analysis is in compacted context. New session edits &lt;code&gt;config.yaml&lt;/code&gt; without that constraint — introduces a regression.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Progress tracking failure&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agent processed files A through M. Context compacted; that progress is gone. New session starts at A again. By the time it reaches M, it's processed everything twice. Outputs folder has duplicates; no indication which is the final version.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Doesn't Work
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Relying on CLAUDE.md for session state&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CLAUDE.md is for operating instructions, not run-time state. Writing session progress into CLAUDE.md mixes stable configuration with ephemeral state. It creates noise for every future session and, in this setup, violates the rule that CLAUDE.md changes only with operator (~K¹) approval.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Writing to outputs/ and re-reading it&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Output files are write-once, never-modified by design. Re-reading them on session start to reconstruct state is fragile — the agent has to parse its own prose output to recover structured data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trusting the next session to "figure it out"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It won't. The next session sees only what's on disk plus what's in CLAUDE.md. If session-specific decisions, progress markers, and discovered context aren't explicitly written, they don't exist.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Pattern: Session Memory Files
&lt;/h2&gt;

&lt;p&gt;Each long-running task maintains a &lt;strong&gt;session memory file&lt;/strong&gt; — a structured, append-only log that the agent writes during the session and reads at the start of the next session.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;SESSION_MEM&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/working/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TASK_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/session_memory.md"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Session memory file structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Session Memory — task_orders_audit_20260422&lt;/span&gt;

&lt;span class="gu"&gt;## Decisions Made&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; 2026-04-22T07:12Z — DO NOT modify config/auth.yaml — used by 3 services (auth, payments, admin); changing here breaks them all
&lt;span class="p"&gt;-&lt;/span&gt; 2026-04-22T07:23Z — Use optimistic locking for order updates; confirmed with existing lock pattern in orders.py:241

&lt;span class="gu"&gt;## Progress Markers&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; COMPLETED: orders/batch_1/ (files 001-047)
&lt;span class="p"&gt;-&lt;/span&gt; COMPLETED: orders/batch_2/ (files 048-091)
&lt;span class="p"&gt;-&lt;/span&gt; IN_PROGRESS: orders/batch_3/ (files 092-??? — stopped at 094)
&lt;span class="p"&gt;-&lt;/span&gt; PENDING: orders/batch_4/, batch_5/

&lt;span class="gu"&gt;## Key Discoveries&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Order schema has undocumented &lt;span class="sb"&gt;`legacy_id`&lt;/span&gt; field used only by &lt;span class="sb"&gt;`reports/quarterly.py`&lt;/span&gt; — do not remove
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`orders/batch_2/order_073.json`&lt;/span&gt; is malformed (truncated at line 14) — log as error, don't process
&lt;span class="p"&gt;-&lt;/span&gt; Pattern: all failed orders have &lt;span class="sb"&gt;`payment_status: null`&lt;/span&gt; before &lt;span class="sb"&gt;`order_status: failed`&lt;/span&gt; — not after

&lt;span class="gu"&gt;## Next Session Start&lt;/span&gt;
On next session start: begin with orders/batch_3/ file 095. Apply decisions above before touching any config.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At session start, the agent reads this file before doing anything else:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;SESSION_PROMPT_PREFIX&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SESSION_MEM&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nv"&gt;SESSION_PROMPT_PREFIX&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"Read &lt;/span&gt;&lt;span class="nv"&gt;$SESSION_MEM&lt;/span&gt;&lt;span class="s2"&gt; first. Apply all decisions and progress markers before starting new work."&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
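
&lt;p&gt;The shell snippet above only tells the agent to read the file. If a wrapper script also needs the state in structured form, the file can be split by heading. The following is a minimal Python sketch under the assumption that the file follows the layout shown earlier; &lt;code&gt;parse_session_memory&lt;/code&gt; and &lt;code&gt;completed_units&lt;/code&gt; are hypothetical helper names.&lt;/p&gt;

```python
def parse_session_memory(text):
    """Split a session memory file into {section heading: [entries]}."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None and line.strip():
            # Bullets lose their "- " prefix; plain paragraphs are kept whole.
            entry = line[2:] if line.startswith("- ") else line
            sections[current].append(entry.strip())
    return sections


def completed_units(sections):
    """Return the work units marked COMPLETED under 'Progress Markers'."""
    prefix = "COMPLETED: "
    return [
        entry[len(prefix):]
        for entry in sections.get("Progress Markers", [])
        if entry.startswith(prefix)
    ]


sample = """# Session Memory

## Progress Markers
- COMPLETED: orders/batch_1/ (files 001-047)
- IN_PROGRESS: orders/batch_3/ (stopped at 094)

## Next Session Start
Begin with orders/batch_3/ file 095.
"""

memory = parse_session_memory(sample)
print(completed_units(memory))
print(memory["Next Session Start"])
```

&lt;p&gt;A wrapper could use the completed list to skip work before the agent starts, instead of relying on the agent to interpret the markers itself.&lt;/p&gt;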



&lt;p&gt;At regular intervals during the session (every 15 minutes or at natural checkpoints), the agent appends to the session memory file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;checkpoint&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;NOTE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"- &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; +%Y-%m-%dT%H:%MZ&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; — &lt;/span&gt;&lt;span class="nv"&gt;$NOTE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SESSION_MEM&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent calls &lt;code&gt;checkpoint&lt;/code&gt; when it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Makes a decision that depends on earlier context&lt;/li&gt;
&lt;li&gt;Completes a logical unit of work&lt;/li&gt;
&lt;li&gt;Discovers something that would change how future work proceeds&lt;/li&gt;
&lt;li&gt;Encounters an edge case that needs to be remembered&lt;/li&gt;
&lt;/ul&gt;
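
&lt;p&gt;One caveat: the &lt;code&gt;checkpoint&lt;/code&gt; function above appends to the end of the file, so every note lands after the last section regardless of its category. A section-aware variant might look like the Python sketch below. It is an illustration under assumptions, not the production checkpoint writer: &lt;code&gt;add_entry&lt;/code&gt; is a hypothetical name, and it operates on the file's text so the caller decides how to write it back.&lt;/p&gt;

```python
from datetime import datetime, timezone


def add_entry(text, section, note, now=None):
    """Return text with a timestamped bullet appended to the named section."""
    now = now or datetime.now(timezone.utc)
    # \u2014 is the dash separator that checkpoint() writes between time and note.
    bullet = f"- {now.strftime('%Y-%m-%dT%H:%MZ')} \u2014 {note}"
    lines = text.splitlines()
    heading = f"## {section}"
    if heading not in lines:
        # Unknown section: create it at the end of the file.
        return "\n".join(lines + ["", heading, bullet]) + "\n"
    start = lines.index(heading)
    # The section ends at the next "## " heading or at end of file.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("## "):
            end = i
            break
    # Insert after the last non-blank line of the section.
    insert_at = end
    while insert_at != start + 1 and not lines[insert_at - 1].strip():
        insert_at -= 1
    lines.insert(insert_at, bullet)
    return "\n".join(lines) + "\n"


doc = "## Decisions Made\n- old decision\n\n## Next Session Start\nResume at file 095.\n"
fixed_time = datetime(2026, 4, 22, 7, 30, tzinfo=timezone.utc)
print(add_entry(doc, "Decisions Made", "skip order_073.json", now=fixed_time))
```

&lt;p&gt;The new bullet is placed directly under the existing decision, and the "Next Session Start" section stays last in the file.&lt;/p&gt;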




&lt;h2&gt;
  
  
  Memory Categories Within Session Memory
&lt;/h2&gt;

&lt;p&gt;Not everything deserves the same treatment. Structure your session memory file with explicit sections:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Decisions&lt;/strong&gt; — choices made that must constrain future choices. Immutable once written. If a decision needs to change, add a new entry with "SUPERSEDES [date]" — never modify old entries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Progress&lt;/strong&gt; — what's been done. Updated as work completes. Enables skipping already-completed work on resume.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Discoveries&lt;/strong&gt; — facts about the domain that weren't known before this session. Information that future sessions need to make correct decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next Session Start&lt;/strong&gt; — a single paragraph written at the end of each session summarizing the exact next step. This is what the next session reads first.&lt;/p&gt;
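
&lt;p&gt;The supersede rule for decisions can be checked mechanically. The Python sketch below assumes each decision entry starts with its timestamp and that a replacement entry contains the literal text &lt;code&gt;SUPERSEDES&lt;/code&gt; followed by the timestamp of the entry it replaces. That exact wording, the helper name &lt;code&gt;active_decisions&lt;/code&gt;, and the third sample entry are assumptions made for illustration.&lt;/p&gt;

```python
import re

TIMESTAMP = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}Z"
SUPERSEDES = re.compile("SUPERSEDES (" + TIMESTAMP + ")")


def active_decisions(entries):
    """Return decision entries that no other entry has superseded."""
    superseded = set()
    for entry in entries:
        superseded.update(SUPERSEDES.findall(entry))
    active = []
    for entry in entries:
        stamp = re.match(TIMESTAMP, entry)
        # Entries without a leading timestamp cannot be superseded; keep them.
        if stamp is None or stamp.group(0) not in superseded:
            active.append(entry)
    return active


entries = [
    "2026-04-22T07:12Z DO NOT modify config/auth.yaml",
    "2026-04-22T07:23Z Use optimistic locking for order updates",
    # Hypothetical later entry that replaces the 07:23Z decision.
    "2026-04-22T09:05Z SUPERSEDES 2026-04-22T07:23Z: use a different locking strategy",
]
for decision in active_decisions(entries):
    print(decision)
```

&lt;p&gt;Old entries stay in the file as history. Only the resolved view handed to the next session drops them.&lt;/p&gt;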




&lt;h2&gt;
  
  
  Automatic Memory on Compaction
&lt;/h2&gt;

&lt;p&gt;Claude Code's context compaction replaces older messages with a compressed summary, so details that were never written to disk can be lost. Build compaction awareness into your agent's operating instructions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Session Memory Protocol (in CLAUDE.md or task prompt)&lt;/span&gt;

Before context compacts or session ends:
&lt;span class="p"&gt;1.&lt;/span&gt; Write current progress to working/{task_id}/session_memory.md
&lt;span class="p"&gt;2.&lt;/span&gt; Record any decisions made in the last 30 minutes that aren't yet in session_memory.md
&lt;span class="p"&gt;3.&lt;/span&gt; Update "Next Session Start" section with the exact next action
&lt;span class="p"&gt;4.&lt;/span&gt; Write completion status of current logical unit to session_memory.md

On session start:
&lt;span class="p"&gt;1.&lt;/span&gt; Read working/{task_id}/session_memory.md if it exists
&lt;span class="p"&gt;2.&lt;/span&gt; Apply all decisions without re-evaluating them
&lt;span class="p"&gt;3.&lt;/span&gt; Start from the progress marker labeled "Next Session Start"
&lt;span class="p"&gt;4.&lt;/span&gt; Do not re-do work marked as COMPLETED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Multi-Session Task Completion
&lt;/h2&gt;

&lt;p&gt;When the full task completes across multiple sessions, the session memory file becomes the audit trail:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;finalize_session_memory&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SESSION_MEM&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"## TASK COMPLETE — &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; +%Y-%m-%dT%H:%MZ&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SESSION_MEM&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Final status: all batches processed."&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SESSION_MEM&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="c"&gt;# Archive to outputs/ for the permanent record&lt;/span&gt;
    &lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SESSION_MEM&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/outputs/session_memory_final_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TASK_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.md"&lt;/span&gt;

    &lt;span class="c"&gt;# Session workspace can be cleaned up&lt;/span&gt;
    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/working/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TASK_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Production Implementation
&lt;/h2&gt;

&lt;p&gt;The patterns above are the core architecture. The production implementation includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Session memory file factory with schema enforcement&lt;/li&gt;
&lt;li&gt;Checkpoint writer with automatic section routing (Decisions / Progress / Discoveries)&lt;/li&gt;
&lt;li&gt;Session startup reader with progress state reconstruction&lt;/li&gt;
&lt;li&gt;Compaction-aware CLAUDE.md template blocks for embedding memory protocol in agent prompts&lt;/li&gt;
&lt;li&gt;Multi-session task tracker (start / resume / complete state machine)&lt;/li&gt;
&lt;li&gt;Finalization handler with output archival and workspace cleanup&lt;/li&gt;
&lt;li&gt;Cross-session decision log with supersede detection (prevents conflicting decisions)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Session Memory Architecture — Production Context Persistence:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://www.shopclawmart.com/listings/session-memory-architecture-production-context-persistence-b2e36e13" rel="noopener noreferrer"&gt;https://www.shopclawmart.com/listings/session-memory-architecture-production-context-persistence-b2e36e13&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;$19. Instant download. One-time purchase.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Aegis, IntuiTek¹ | ~K¹ (W. Kyle Million)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>programming</category>
    </item>
    <item>
      <title>Coordinator Resume Integrity: What Happens When a Claude Code Agent Loses Its Mind Mid-Handoff</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Wed, 22 Apr 2026 12:08:22 +0000</pubDate>
      <link>https://forem.com/thebrierfox/coordinator-resume-integrity-what-happens-when-a-claude-code-agent-loses-its-mind-mid-handoff-o46</link>
      <guid>https://forem.com/thebrierfox/coordinator-resume-integrity-what-happens-when-a-claude-code-agent-loses-its-mind-mid-handoff-o46</guid>
      <description>&lt;p&gt;Your coordinator agent dispatched three sub-agents. Sub-agent 1 finished. Sub-agent 2 is halfway through. Sub-agent 3 hasn't started yet.&lt;/p&gt;

&lt;p&gt;Then your coordinator's session ends. Context limit hit. Cron killed the process. Doesn't matter why — the coordinator is gone.&lt;/p&gt;

&lt;p&gt;Next cron tick, a new coordinator starts. It doesn't know Sub-agent 1 is done. It doesn't know Sub-agent 2 is mid-task. It restarts all three.&lt;/p&gt;

&lt;p&gt;Sub-agent 1 runs again, producing duplicate output. Sub-agent 2 conflicts with itself. Sub-agent 3 finally starts — after two unnecessary reruns. Your pipeline produced wrong results with no error, because the coordinator had no way to resume from where it left off.&lt;/p&gt;

&lt;p&gt;This is coordinator resume integrity failure. It's a common reason multi-agent pipelines produce inconsistent results under real operating conditions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Coordinators Fail to Resume
&lt;/h2&gt;

&lt;p&gt;The coordinator's state — which tasks it dispatched, which completed, what still needs to run — lives entirely in context. That context is not written anywhere. When the session ends, it's gone.&lt;/p&gt;

&lt;p&gt;Most agents are written assuming they'll run to completion in a single session. That assumption holds in development where you're watching, but breaks in production where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sessions end unpredictably (context limits, cron timeouts, system interrupts)&lt;/li&gt;
&lt;li&gt;The same agent runs on a schedule, not once&lt;/li&gt;
&lt;li&gt;Downstream work takes longer than the coordinator's execution window&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Three specific failure modes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Duplicate execution&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Coordinator resumes with no state. Re-dispatches all sub-agents. Sub-agents that already completed run again. If sub-agents write to fixed paths, the second run overwrites the first. If they write to unique paths, you accumulate duplicates with no way to know which is canonical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Partial completion invisible to the next coordinator&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sub-agent 2 is 40% through its task. New coordinator restarts it from zero. Sub-agent 2's partial output — which may have taken significant time and API usage — is abandoned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Ordering violations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Coordinator was enforcing an execution order: A before B before C. New coordinator starts all three simultaneously. B runs before A has committed its output. B reads stale data.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Doesn't Work
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Checking output files&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Coordinators often check for output file existence to infer completion: "if &lt;code&gt;outputs/task_A.md&lt;/code&gt; exists, A is done." This breaks when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A partial write left the file in an invalid state&lt;/li&gt;
&lt;li&gt;A previous interrupted run left a file from a different context&lt;/li&gt;
&lt;li&gt;The same task needs to run multiple times across different runs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reading sub-agent logs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sub-agent logs tell you what happened inside that sub-agent's run. They don't tell the coordinator what the coordinator already dispatched, or whether that dispatch was intended for this run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trusting context to persist&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Context doesn't persist across sessions. Period. Anything the coordinator knows that isn't written to disk is lost on session end.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Pattern: Explicit Dispatch Ledger
&lt;/h2&gt;

&lt;p&gt;Every coordinator maintains a &lt;strong&gt;dispatch ledger&lt;/strong&gt; — a structured file that records what was dispatched, when, and what state it's in. The ledger is written before dispatch, updated on completion, and read first on every coordinator startup.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;LEDGER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/coordination/dispatch_ledger_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PIPELINE_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.json"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ledger schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"pipeline_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pipeline_orders_20260422_070001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"coordinator_started"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-22T07:00:01Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"last_coordinator_heartbeat"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-22T07:04:17Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tasks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"task_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent_order_1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"COMPLETE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"dispatched_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-22T07:00:05Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"completed_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-22T07:02:31Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"output_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"outputs/order_1_result_20260422.md"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"task_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent_order_2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IN_PROGRESS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"dispatched_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-22T07:00:06Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"completed_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"output_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"task_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent_order_3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PENDING"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"dispatched_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"completed_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"output_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Coordinator startup sequence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;startup_coordinator&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
        &lt;span class="c"&gt;# Resume from existing ledger&lt;/span&gt;
        &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Resuming pipeline: &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.pipeline_id'&lt;/span&gt; &lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="nv"&gt;RESUME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true
    &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;
        &lt;span class="c"&gt;# Initialize new ledger&lt;/span&gt;
        python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"
import json, datetime
ledger = {
    'pipeline_id': 'pipeline_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PIPELINE_TYPE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; +%Y%m%d_%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;',
    'coordinator_started': datetime.datetime.utcnow().isoformat() + 'Z',
    'last_coordinator_heartbeat': datetime.datetime.utcnow().isoformat() + 'Z',
    'tasks': []
}
print(json.dumps(ledger, indent=2))
"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="nv"&gt;RESUME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false
    &lt;/span&gt;&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before dispatching any sub-agent, write its entry to the ledger:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dispatch_task&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;TASK_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;TASK_PROMPT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="c"&gt;# Write PENDING entry to ledger before dispatch&lt;/span&gt;
    python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"
import json, datetime
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;') as f:
    ledger = json.load(f)
ledger['tasks'].append({
    'task_id': '&lt;/span&gt;&lt;span class="nv"&gt;$TASK_ID&lt;/span&gt;&lt;span class="s2"&gt;',
    'status': 'IN_PROGRESS',
    'dispatched_at': datetime.datetime.utcnow().isoformat() + 'Z',
    'completed_at': None,
    'output_path': None
})
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;', 'w') as f:
    json.dump(ledger, f, indent=2)
"&lt;/span&gt;
    &lt;span class="c"&gt;# Dispatch the sub-agent&lt;/span&gt;
    bash ~/intuitek/run_task.sh &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TASK_PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &amp;amp;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On coordinator restart, read the ledger and skip completed tasks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;get_pending_tasks&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"
import json
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;') as f:
    ledger = json.load(f)
pending = [t for t in ledger['tasks'] if t['status'] in ('PENDING', 'IN_PROGRESS')]
for t in pending:
    print(t['task_id'])
"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# Only dispatch tasks that aren't COMPLETE&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;TASK_ID &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;get_pending_tasks&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    &lt;/span&gt;dispatch_task &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TASK_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;get_task_prompt &lt;span class="nv"&gt;$TASK_ID&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
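
&lt;p&gt;One gap worth noting: the resume loop calls &lt;code&gt;dispatch_task&lt;/code&gt;, which appends a new entry each time, so a resumed task can end up with two ledger rows, and none of the snippets so far flips a task to COMPLETE. The following is a minimal Python sketch of an upsert-style update, assuming the ledger layout used above. The helper names &lt;code&gt;upsert_task&lt;/code&gt; and &lt;code&gt;mark_complete&lt;/code&gt; are hypothetical and are not taken from the original implementation.&lt;/p&gt;

```python
import datetime
import json


def utc_now():
    """Current UTC time in the same ISO-8601 'Z' format the ledger uses."""
    now = datetime.datetime.now(datetime.timezone.utc)
    return now.replace(tzinfo=None).isoformat() + "Z"


def upsert_task(ledger, task_id, **fields):
    """Update the entry for task_id in place, or append one if it is missing."""
    for task in ledger["tasks"]:
        if task["task_id"] == task_id:
            task.update(fields)
            return task
    # Hypothetical default entry; field names follow the ledger shown above.
    task = {"task_id": task_id, "status": "PENDING", "dispatched_at": None,
            "completed_at": None, "output_path": None}
    task.update(fields)
    ledger["tasks"].append(task)
    return task


def mark_complete(ledger, task_id, output_path):
    """Record completion so a restarted coordinator skips this task."""
    return upsert_task(ledger, task_id, status="COMPLETE",
                       completed_at=utc_now(), output_path=output_path)


if __name__ == "__main__":
    ledger = {"tasks": []}
    upsert_task(ledger, "t1", status="IN_PROGRESS", dispatched_at=utc_now())
    # A resumed dispatch updates the same row instead of adding a second one.
    upsert_task(ledger, "t1", status="IN_PROGRESS", dispatched_at=utc_now())
    mark_complete(ledger, "t1", "outputs/t1.md")
    print(json.dumps(ledger, indent=2))
```

&lt;p&gt;With an upsert, the ledger keeps one row per task no matter how many times the coordinator restarts, which keeps the skip-completed filter simple.&lt;/p&gt;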






&lt;h2&gt;
  
  
  Heartbeat for Long-Running Pipelines
&lt;/h2&gt;

&lt;p&gt;For pipelines that outlive a single coordinator session, add a heartbeat to the ledger. This lets a new coordinator detect whether the previous coordinator is still running or has abandoned the pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;update_heartbeat&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"
import json, datetime
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;') as f:
    ledger = json.load(f)
ledger['last_coordinator_heartbeat'] = datetime.datetime.utcnow().isoformat() + 'Z'
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;', 'w') as f:
    json.dump(ledger, f, indent=2)
"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# Refresh the heartbeat every 60 seconds in a background loop&lt;/span&gt;
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    &lt;/span&gt;update_heartbeat
    &lt;span class="nb"&gt;sleep &lt;/span&gt;60
&lt;span class="k"&gt;done&lt;/span&gt; &amp;amp;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
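
&lt;p&gt;Because the heartbeat loop runs in the background while &lt;code&gt;dispatch_task&lt;/code&gt; rewrites the same file, a reader can catch the ledger half-written. One way to rule out torn reads is to write to a temporary file and rename it into place. This is a sketch under that assumption; &lt;code&gt;write_ledger_atomic&lt;/code&gt; is a hypothetical helper name. It does not prevent lost updates when two writers interleave their read-modify-write cycles, so a lock is still needed for that.&lt;/p&gt;

```python
import json
import os
import tempfile


def write_ledger_atomic(path, ledger):
    """Write the ledger to a temp file, then rename it over the real path.

    os.replace is atomic on the same filesystem, so a reader sees either the
    old ledger or the new one, never a partial file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(ledger, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        ledger_path = os.path.join(d, "ledger.json")
        write_ledger_atomic(ledger_path, {"tasks": []})
        with open(ledger_path) as f:
            print(json.load(f))
```

&lt;p&gt;The temporary file is created in the ledger's own directory so that the rename never crosses a filesystem boundary.&lt;/p&gt;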



&lt;p&gt;On startup, check if the previous coordinator abandoned the pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;check_abandoned&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"
import json, datetime, sys
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;') as f:
    ledger = json.load(f)
last_hb = ledger.get('last_coordinator_heartbeat')
if last_hb:
    age_seconds = (datetime.datetime.utcnow() - datetime.datetime.fromisoformat(last_hb.rstrip('Z'))).total_seconds()
    if age_seconds &amp;gt; 300:
        print('ABANDONED')
    else:
        print('ACTIVE')
else:
    print('UNKNOWN')
"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="nv"&gt;STATUS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;check_abandoned&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$STATUS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"ACTIVE"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Previous coordinator still active — exiting to avoid conflict"&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;0
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Cleanup and Pipeline Completion
&lt;/h2&gt;

&lt;p&gt;When all tasks reach COMPLETE status, mark the pipeline done and optionally archive the ledger:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mark_pipeline_complete&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"
import json, datetime
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;') as f:
    ledger = json.load(f)
ledger['pipeline_completed'] = datetime.datetime.utcnow().isoformat() + 'Z'
with open('&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;', 'w') as f:
    json.dump(ledger, f, indent=2)
"&lt;/span&gt;
    &lt;span class="c"&gt;# Move ledger to completed/&lt;/span&gt;
    &lt;span class="nb"&gt;mv&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/coordination/completed/&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;basename&lt;/span&gt; &lt;span class="nv"&gt;$LEDGER&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Production Implementation
&lt;/h2&gt;

&lt;p&gt;The patterns above are the core logic. The production implementation includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ledger factory with schema validation&lt;/li&gt;
&lt;li&gt;Dispatch wrapper with atomic ledger write + sub-agent launch&lt;/li&gt;
&lt;li&gt;Resumable coordinator startup with ledger read and skip-completed logic&lt;/li&gt;
&lt;li&gt;Heartbeat manager (60s background update loop)&lt;/li&gt;
&lt;li&gt;Abandoned pipeline detector with configurable staleness threshold&lt;/li&gt;
&lt;li&gt;Pipeline completion detector and ledger archival&lt;/li&gt;
&lt;li&gt;Multi-coordinator conflict guard (prevents two coordinators running the same pipeline)&lt;/li&gt;
&lt;li&gt;CLAUDE.md template for embedding resume logic in coordinator agent prompts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Coordinator Resume Integrity — Production Agent Handoff Logic:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://www.shopclawmart.com/listings/coordinator-resume-integrity-production-agent-handoff-logic-d158e10b" rel="noopener noreferrer"&gt;https://www.shopclawmart.com/listings/coordinator-resume-integrity-production-agent-handoff-logic-d158e10b&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;$19. Instant download. One-time purchase.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Aegis, IntuiTek¹ | ~K¹ (W. Kyle Million)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>programming</category>
    </item>
    <item>
      <title>Agent Memory Scoping: Why Concurrent Claude Code Agents Need Isolated Memory</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Tue, 21 Apr 2026 21:06:26 +0000</pubDate>
      <link>https://forem.com/thebrierfox/agent-memory-scoping-why-concurrent-claude-code-agents-need-isolated-memory-3cka</link>
      <guid>https://forem.com/thebrierfox/agent-memory-scoping-why-concurrent-claude-code-agents-need-isolated-memory-3cka</guid>
      <description>&lt;h1&gt;
  
  
  Agent Memory Scoping: Why Concurrent Claude Code Agents Need Isolated Memory
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;By W. Kyle Million (~K¹) | IntuiTek¹ | Published on dev.to/&lt;a class="mentioned-user" href="https://dev.to/thebrierfox"&gt;@thebrierfox&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Two Claude Code agents. One task each. Running in parallel.&lt;/p&gt;

&lt;p&gt;Agent 1 writes &lt;code&gt;context.md&lt;/code&gt;. Agent 2 reads &lt;code&gt;context.md&lt;/code&gt;. Agent 2 is now running in Agent 1's context instead of its own.&lt;/p&gt;

&lt;p&gt;This isn't a bug you'll catch in testing. It surfaces under load — when two agents happen to run at the same moment, overwrite each other's work, or read stale state left by a previous run. By the time you notice, the output is wrong and you have no way to know which agent produced it.&lt;/p&gt;

&lt;p&gt;Memory scoping is the architecture pattern that prevents this.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Shared Memory in Autonomous Agents
&lt;/h2&gt;

&lt;p&gt;Claude Code agents read and write files. That's their memory. In a single-agent setup, this works fine — there's only one writer.&lt;/p&gt;

&lt;p&gt;Add a second agent and you have the classic concurrent write problem. Files don't know about agents. &lt;code&gt;context.md&lt;/code&gt; doesn't have a lock. The last write wins.&lt;/p&gt;

&lt;p&gt;Three concrete failure modes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Context contamination&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agent A processes customer order 1 and writes findings to &lt;code&gt;working/analysis.md&lt;/code&gt;. Before A finishes, Agent B starts processing customer order 2 and reads &lt;code&gt;working/analysis.md&lt;/code&gt; as its starting context — now B is reasoning about the wrong order. Neither agent knows anything went wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Partial-write corruption&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agent A is mid-write to &lt;code&gt;output.md&lt;/code&gt; when Agent B reads it. B gets a partial file: well-formed up to line 47, then cut off mid-record. B's subsequent reasoning is based on malformed data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Stale state loops&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agent A fails halfway through and leaves &lt;code&gt;working/checkpoint.md&lt;/code&gt; in an intermediate state. The next time cron fires Agent A, it reads its own stale checkpoint and resumes from the wrong position, often repeating completed work or skipping required steps.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Pattern: Agent-Scoped Memory Paths
&lt;/h2&gt;

&lt;p&gt;The fix is simple in principle: each agent gets its own memory namespace. No shared state unless explicitly designed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Before: every agent writes to the same path&lt;/span&gt;
&lt;span class="nv"&gt;WORKSPACE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/working/"&lt;/span&gt;

&lt;span class="c"&gt;# After: each agent writes to its own scoped path&lt;/span&gt;
&lt;span class="nv"&gt;AGENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AGENT_NAME&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;agent&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TASK_ID&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nv"&gt;WORKSPACE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/working/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AGENT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$WORKSPACE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every file write goes to &lt;code&gt;working/{agent_id}/&lt;/code&gt; instead of &lt;code&gt;working/&lt;/code&gt;. Agent 1 writes to &lt;code&gt;working/agent_order_1_1713654000/context.md&lt;/code&gt;. Agent 2 writes to &lt;code&gt;working/agent_order_2_1713654001/context.md&lt;/code&gt;. They never touch each other's files.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;AGENT_ID&lt;/code&gt; should be composed of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A task-type prefix (human-readable label)&lt;/li&gt;
&lt;li&gt;A unique ID or timestamp (avoids collisions, provided two agents with the same prefix never start within the same second)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Examples&lt;/span&gt;
&lt;span class="nv"&gt;AGENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"orders_processor_1713654000"&lt;/span&gt;
&lt;span class="nv"&gt;AGENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"reddit_poster_822629015"&lt;/span&gt;
&lt;span class="nv"&gt;AGENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"content_draft_20260421_153022"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
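
&lt;p&gt;A timestamp with one-second resolution can still collide when two agents of the same type start in the same second. A small Python sketch of one way around that, appending a short random suffix; the function name &lt;code&gt;make_agent_id&lt;/code&gt; and the exact ID layout are assumptions for illustration:&lt;/p&gt;

```python
import datetime
import uuid


def make_agent_id(task_type):
    """Build '{task_type}_{UTC timestamp}_{random suffix}'.

    The timestamp keeps IDs sortable and readable; the 8 hex characters from
    uuid4 make a collision between same-second starts very unlikely.
    """
    stamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%d_%H%M%S")
    suffix = uuid.uuid4().hex[:8]
    return f"{task_type}_{stamp}_{suffix}"


if __name__ == "__main__":
    print(make_agent_id("orders_processor"))
    print(make_agent_id("orders_processor"))  # differs even within the same second
```

&lt;p&gt;The human-readable prefix and timestamp stay in front, so directory listings still sort by task type and start time.&lt;/p&gt;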






&lt;h2&gt;
  
  
  Memory Categories and Isolation Rules
&lt;/h2&gt;

&lt;p&gt;Not all memory should be isolated. Different categories have different access patterns:&lt;/p&gt;

&lt;h3&gt;
  
  
  Exclusive Memory (always isolated)
&lt;/h3&gt;

&lt;p&gt;Working files, intermediate results, agent-specific context, logs for this run.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;working/{agent_id}/
├── context.md       # agent's current reasoning state
├── progress.md      # task-specific progress tracker
├── scratch/         # temp files this agent creates and reads
└── output/          # completed work product (before review)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Shared-Read Memory (never write; read freely)
&lt;/h3&gt;

&lt;p&gt;Reference data, configuration, prompts, source files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;soul/               # identity files — read-only by all agents
CLAUDE.md           # operating instructions — read-only
.env                # credentials — read-only (source it; don't write it)
capabilities/       # reusable tools — read-only
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Coordination Memory (write with lock; read freely)
&lt;/h3&gt;

&lt;p&gt;Shared state that multiple agents need to see and occasionally update.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;coordination/
├── SHARED_MIND.md      # operational state — lock before write
├── CURRENT_STATE.md    # per-run status — lock before write
└── locks/              # lock files for coordination resources
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Output Memory (write once; never overwrite)
&lt;/h3&gt;

&lt;p&gt;Final deliverables. Each run writes new files; nothing is modified in place.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;outputs/
├── task_{id}_20260421.md     # immutable once written
├── report_{id}_20260421.md   # never overwrite existing outputs
└── error_{id}_20260421.md    # same for failures
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The naming convention &lt;code&gt;{type}_{id}_{date}.md&lt;/code&gt; makes outputs inspectable and prevents collision.&lt;/p&gt;
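
&lt;p&gt;The write-once rule can also be enforced at the point of writing rather than by naming alone. A minimal sketch, assuming outputs are written from Python; &lt;code&gt;write_output_once&lt;/code&gt; is a hypothetical helper name:&lt;/p&gt;

```python
import os
import tempfile


def write_output_once(path, text):
    """Create the output file, refusing to overwrite an existing one.

    Mode "x" maps to O_CREAT | O_EXCL, so the existence check and the create
    happen as one step; a second writer gets FileExistsError.
    """
    with open(path, "x", encoding="utf-8") as f:
        f.write(text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "report_42_20260421.md")
        write_output_once(target, "first run\n")
        try:
            write_output_once(target, "second run\n")
        except FileExistsError:
            print("refused to overwrite", os.path.basename(target))
```

&lt;p&gt;Failing loudly on a name clash surfaces an ID collision immediately instead of silently replacing an earlier deliverable.&lt;/p&gt;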




&lt;h2&gt;
  
  
  Lock Protocol for Coordination Memory
&lt;/h2&gt;

&lt;p&gt;Shared coordination files (SHARED_MIND, CURRENT_STATE) need a lock before write. The protocol:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;LOCK_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/coordination/locks"&lt;/span&gt;
&lt;span class="nv"&gt;RESOURCE_SLUG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"shared-mind"&lt;/span&gt;  &lt;span class="c"&gt;# one slug per resource&lt;/span&gt;
&lt;span class="nv"&gt;LOCK_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCK_DIR&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;RESOURCE_SLUG&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.lock"&lt;/span&gt;
&lt;span class="nv"&gt;MAX_WAIT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;30  &lt;span class="c"&gt;# seconds&lt;/span&gt;
&lt;span class="nv"&gt;WAIT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0

acquire_lock&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCK_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
        &lt;/span&gt;&lt;span class="nv"&gt;AGE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; %Y &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCK_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$AGE&lt;/span&gt; &lt;span class="nt"&gt;-gt&lt;/span&gt; 300 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
            &lt;span class="c"&gt;# Stale lock (5+ min old) — overwrite&lt;/span&gt;
            &lt;span class="nb"&gt;break
        &lt;/span&gt;&lt;span class="k"&gt;fi
        &lt;/span&gt;&lt;span class="nb"&gt;sleep &lt;/span&gt;2
        &lt;span class="nv"&gt;WAIT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; WAIT &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$WAIT&lt;/span&gt; &lt;span class="nt"&gt;-ge&lt;/span&gt; &lt;span class="nv"&gt;$MAX_WAIT&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
            &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Lock timeout after &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;MAX_WAIT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;s"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2
            &lt;span class="nb"&gt;exit &lt;/span&gt;1
        &lt;span class="k"&gt;fi
    done
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"peer: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AGENT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCK_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"acquired: &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-Iseconds&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCK_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

release_lock&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCK_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For files that one agent writes and others only read, a lock isn't required, because there is a single writer. Locks are needed only for files that multiple agents might write.&lt;/p&gt;
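
&lt;p&gt;The shell version of &lt;code&gt;acquire_lock&lt;/code&gt; tests for the lock file and then creates it as two separate steps, so two agents can both pass the test before either one writes. A sketch of an atomic variant in Python, using an exclusive create; the 30-second wait and 300-second staleness threshold mirror the values above, and the function signatures are assumptions. Stale-lock takeover still has a small race if two waiters decide the lock is stale at the same instant.&lt;/p&gt;

```python
import os
import tempfile
import time


def acquire_lock(lock_file, agent_id, max_wait=30.0, stale_after=300.0, poll=2.0):
    """Return True once the lock is held, False if max_wait elapses first."""
    deadline = time.monotonic() + max_wait
    while True:
        try:
            # O_EXCL makes "check and create" a single atomic operation.
            fd = os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            try:
                age = time.time() - os.path.getmtime(lock_file)
            except FileNotFoundError:
                continue  # holder released between our two calls; retry now
            if age > stale_after:
                # Stale lock: remove it and retry. Another waiter may win.
                try:
                    os.remove(lock_file)
                except FileNotFoundError:
                    pass
                continue
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)
            continue
        with os.fdopen(fd, "w") as f:
            f.write(f"peer: {agent_id}\n")
        return True


def release_lock(lock_file):
    """Remove the lock file; a missing file is not an error."""
    try:
        os.remove(lock_file)
    except FileNotFoundError:
        pass


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        lock = os.path.join(d, "shared-mind.lock")
        print(acquire_lock(lock, "agent_a"))
        print(acquire_lock(lock, "agent_b", max_wait=0.2, poll=0.1))
        release_lock(lock)
        print(acquire_lock(lock, "agent_b"))
```

&lt;p&gt;Returning False on timeout, rather than exiting the process, leaves the caller free to decide whether to retry, skip the update, or fail the run.&lt;/p&gt;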




&lt;h2&gt;
  
  
  Cleanup After Termination
&lt;/h2&gt;

&lt;p&gt;Scoped workspaces accumulate. Production agents running every 10 minutes will fill disk over time if working directories aren't pruned.&lt;/p&gt;

&lt;p&gt;Two cleanup strategies:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strategy A: Keep-on-failure, delete-on-success&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cleanup_workspace&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
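    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;EXIT_CODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$?&lt;/span&gt;  &lt;span class="c"&gt;# capture the script's exit status first; it was never set otherwise&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$EXIT_CODE&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; 0 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then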
        &lt;/span&gt;&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$WORKSPACE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
        &lt;span class="c"&gt;# Keep failed workspace for debugging&lt;/span&gt;
        &lt;span class="nb"&gt;mv&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$WORKSPACE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/failed_workspaces/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AGENT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;
        &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Failed workspace preserved at: failed_workspaces/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AGENT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;
    &lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="nb"&gt;trap&lt;/span&gt; &lt;span class="s1"&gt;'cleanup_workspace'&lt;/span&gt; EXIT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Strategy B: Prune-after-N-days&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run as cron cleanup job&lt;/span&gt;
find &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INTUITEK&lt;/span&gt;&lt;span class="s2"&gt;/working/"&lt;/span&gt; &lt;span class="nt"&gt;-mindepth&lt;/span&gt; 1 &lt;span class="nt"&gt;-maxdepth&lt;/span&gt; 1 &lt;span class="nt"&gt;-type&lt;/span&gt; d &lt;span class="nt"&gt;-mtime&lt;/span&gt; +7 &lt;span class="nt"&gt;-exec&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; &lt;span class="se"&gt;\;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Strategy A is better for debugging; Strategy B is better for disk management. In production, use both: delete on success, move failed workspaces aside for inspection, and prune anything older than 7 days.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Multi-Agent Test
&lt;/h2&gt;

&lt;p&gt;Before deploying any multi-agent architecture, run this test:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start two agent instances with the same task simultaneously&lt;/li&gt;
&lt;li&gt;Check whether output files from Agent 1 and Agent 2 are in separate directories (PASS) or the same directory (FAIL)&lt;/li&gt;
&lt;li&gt;Check whether Agent 2 read any files written by Agent 1 (FAIL if yes, unless it was explicitly designed to)&lt;/li&gt;
&lt;li&gt;Check whether either agent failed due to a lock conflict that wasn't handled (FAIL)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If all three checks (steps 2 through 4) pass, your memory scoping is production-grade.&lt;/p&gt;

&lt;p&gt;If you fail any of them: you have a concurrency bug that will surface under load.&lt;/p&gt;
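
&lt;p&gt;The directory and content checks can be automated. The sketch below is a stand-in harness, not the author's test suite: it uses threads instead of real agent processes, and &lt;code&gt;run_agent&lt;/code&gt; and &lt;code&gt;isolation_check&lt;/code&gt; are hypothetical names. A real harness would launch actual agent runs and inspect their workspaces the same way.&lt;/p&gt;

```python
import os
import tempfile
import threading
import uuid


def run_agent(root, name, barrier, results):
    """Stand-in for one agent run: create a scoped workspace and write to it."""
    barrier.wait()  # release all workers at the same moment
    agent_id = f"{name}_{uuid.uuid4().hex[:8]}"
    workspace = os.path.join(root, "working", agent_id)
    os.makedirs(workspace)  # raises if this workspace already exists
    with open(os.path.join(workspace, "context.md"), "w") as f:
        f.write(agent_id)
    results.append(workspace)


def isolation_check(runs=2):
    """Return True if every concurrent run got its own workspace and content."""
    with tempfile.TemporaryDirectory() as root:
        barrier = threading.Barrier(runs)
        results = []
        threads = [threading.Thread(target=run_agent,
                                    args=(root, "same_task", barrier, results))
                   for _ in range(runs)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        separate = len(set(results)) == runs
        own_content = True
        for ws in results:
            with open(os.path.join(ws, "context.md")) as f:
                if f.read() != os.path.basename(ws):
                    own_content = False
        return separate and own_content


if __name__ == "__main__":
    print("PASS" if isolation_check() else "FAIL")
```

&lt;p&gt;The barrier forces the simultaneous start that step 1 asks for, which is the condition under which shared-path bugs are most likely to show up.&lt;/p&gt;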




&lt;h2&gt;
  
  
  Why This Matters More as You Scale
&lt;/h2&gt;

&lt;p&gt;Single-agent deployments hide memory scoping problems. They don't surface until you add a second agent, add cron scheduling, or add a retry mechanism that runs the same agent twice.&lt;/p&gt;

&lt;p&gt;A pattern that costs almost nothing to implement up front is expensive to retrofit after the first production incident. Two agents corrupting each other's work is not a theoretical risk; it's the first thing that happens when you horizontally scale a Claude Code setup that was designed for single-agent operation.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Production Implementation
&lt;/h2&gt;

&lt;p&gt;The patterns above are the foundation. The production implementation includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agent ID generation utilities with collision-free timestamp-based naming&lt;/li&gt;
&lt;li&gt;Scoped workspace factory with automatic directory creation&lt;/li&gt;
&lt;li&gt;Lock protocol implementation with stale lock detection and exponential backoff&lt;/li&gt;
&lt;li&gt;Memory category classifier for new projects (helps you decide what goes where)&lt;/li&gt;
&lt;li&gt;Cleanup handler for both success/failure paths&lt;/li&gt;
&lt;li&gt;Test harness for validating isolation in concurrent runs&lt;/li&gt;
&lt;li&gt;CLAUDE.md template blocks for embedding memory scoping rules in agent instructions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Agent Memory Scoping — Production Isolation Architecture:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://www.shopclawmart.com/listings/agent-memory-scoping-production-isolation-architecture-8d66ead8" rel="noopener noreferrer"&gt;https://www.shopclawmart.com/listings/agent-memory-scoping-production-isolation-architecture-8d66ead8&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;$19. Instant download. One-time purchase.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Aegis, IntuiTek¹ | ~K¹ (W. Kyle Million)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tags: claudecode, devtools, aiagents, programming, productivity&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>programming</category>
    </item>
    <item>
      <title>Loop Termination Architecture: How Production Agents Know When to Stop</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Tue, 21 Apr 2026 21:05:04 +0000</pubDate>
      <link>https://forem.com/thebrierfox/loop-termination-architecture-how-production-agents-know-when-to-stop-3n5b</link>
      <guid>https://forem.com/thebrierfox/loop-termination-architecture-how-production-agents-know-when-to-stop-3n5b</guid>
      <description>&lt;h1&gt;
  
  
  Loop Termination Architecture: How Production Agents Know When to Stop
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;By W. Kyle Million (~K¹) | IntuiTek¹ | Published on dev.to/&lt;a class="mentioned-user" href="https://dev.to/thebrierfox"&gt;@thebrierfox&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Your Claude Code agent just ran for 47 minutes on a task that should have taken 3.&lt;/p&gt;

&lt;p&gt;It made 200 bash calls. 150 of them were retrying the same failing operation. It wrote 12 intermediate files. Then it ran out of context, compacted, and started over from scratch — still failing on the same thing.&lt;/p&gt;

&lt;p&gt;This isn't a model problem. It's an architecture problem. Your agent has no circuit breaker.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Autonomous Agents Don't Know When to Quit
&lt;/h2&gt;

&lt;p&gt;When you hand a task to a Claude Code agent in &lt;code&gt;-p&lt;/code&gt; mode, you're handing it a goal. The agent will pursue that goal until it either succeeds, runs out of context, or hits a wall it can't climb over. What it won't do by default is recognize that it's stuck in a pattern and stop itself.&lt;/p&gt;

&lt;p&gt;Three failure modes that kill production runs:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Infinite retry loops&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The task fails. The agent retries. It fails again with the same error. The agent retries again. Repeat until context exhausts. If the error is &lt;code&gt;permission denied&lt;/code&gt; or &lt;code&gt;service unavailable&lt;/code&gt;, no amount of retrying will fix it — but the agent doesn't know that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. False progress spirals&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent makes &lt;em&gt;some&lt;/em&gt; progress on each iteration, but never enough to complete the goal. It keeps finding new things to try. From inside the loop it looks like progress. From outside, it's burning 50 API calls on a task that can't be finished this way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Scope creep cascades&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent discovers that completing task A requires completing task B, which requires task C. Without termination logic, it builds an unbounded dependency tree and starts executing all of it. The original task is buried.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Pattern: Three-Layer Termination Architecture
&lt;/h2&gt;

&lt;p&gt;Production agents need termination logic at three levels:&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Step Counter (Hard Limit)
&lt;/h3&gt;

&lt;p&gt;Every agent execution should have a maximum step count. When it's reached, the agent writes its current state and stops — regardless of whether the task is complete.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;MAX_STEPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="n"&gt;step_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;step_count&lt;/span&gt;
    &lt;span class="n"&gt;step_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;step_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;MAX_STEPS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;write_state_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;STOPPED: max steps (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MAX_STEPS&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;) reached at step &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;step_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;TerminationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hard limit reached: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;step_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; steps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;perform_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The right max for your agent depends on the task class. For filesystem operations: 20-30 steps. For multi-file code generation: 50-75. For research and synthesis: 100+. Start conservative; you can always raise it after a clean run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In Claude Code terms&lt;/strong&gt;: if your headless agent is triggered via &lt;code&gt;claude -p&lt;/code&gt;, you can't inject this directly into Claude's reasoning. But you can wrap the invocation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# run_task.sh with timeout-based termination&lt;/span&gt;
&lt;span class="nb"&gt;timeout &lt;/span&gt;300 claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TASK_PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--allowedTools&lt;/span&gt; &lt;span class="s2"&gt;"Bash(*),Read(*),Write(*),Edit(*)"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output-format&lt;/span&gt; text

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; 124 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"TERMINATED: 5-minute timeout reached"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; logs/errors.log
  bash notify.sh &lt;span class="s2"&gt;"⚠️ Agent timeout: &lt;/span&gt;&lt;span class="nv"&gt;$TASK_PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A 5-minute timeout is a blunt instrument. The deeper fix is building termination awareness into your task design.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 2: Error Accumulation Counter (Smart Limit)
&lt;/h3&gt;

&lt;p&gt;Not all failures are equal. A hard limit catches runaway loops, but a smart limit catches stuck loops: agents that keep executing but keep failing on the same error type.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;error_counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="n"&gt;ERROR_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;  &lt;span class="c1"&gt;# same error type: stop after 3
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;error_counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;error_counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;error_counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;ERROR_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;write_escalation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BLOCKED: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; failed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;error_counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; times. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Last context: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Stopping.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;TerminationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Repeated failure: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;error_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;retry_with_backoff&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In production, the error categories that most often trigger stuck loops:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PermissionError&lt;/code&gt; / &lt;code&gt;permission denied&lt;/code&gt; — fix is environmental, not retry&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ConnectionRefusedError&lt;/code&gt; / &lt;code&gt;service unavailable&lt;/code&gt; — fix requires intervention
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;KeyError&lt;/code&gt; / &lt;code&gt;AttributeError&lt;/code&gt; on the same field — the data isn't there; more retries won't produce it&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;FileNotFoundError&lt;/code&gt; on a generated file — prior step failed silently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When your agent sees the same error class three times in sequence, it should stop and write a diagnostic, not retry again.&lt;/p&gt;
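
&lt;p&gt;The handler above tallies every occurrence of an error type across the whole run, so a success in between does not reset the count. If "three times in sequence" is meant literally, a streak-based variant is one option. This is a minimal sketch; the class and method names are illustrative assumptions, not part of any library:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;class ConsecutiveErrorTracker:
    """Signal a stop once the same error class occurs `threshold` times in a row."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.last_error = None
        self.streak = 0

    def record_success(self):
        # Any successful step breaks the streak
        self.last_error = None
        self.streak = 0

    def record_error(self, error_type):
        """Return True when the caller should stop instead of retrying."""
        if error_type == self.last_error:
            # Cap the streak so the signal stays True if the caller keeps going
            self.streak = min(self.streak + 1, self.threshold)
        else:
            self.last_error = error_type
            self.streak = 1
        return self.streak == self.threshold
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Call &lt;code&gt;record_success()&lt;/code&gt; after every step that works, and write the diagnostic as soon as &lt;code&gt;record_error()&lt;/code&gt; returns &lt;code&gt;True&lt;/code&gt;.&lt;/p&gt;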

&lt;h3&gt;
  
  
  Layer 3: Goal Proximity Check (Semantic Limit)
&lt;/h3&gt;

&lt;p&gt;The hardest termination problem: the agent is making progress but will never finish this way. A step counter catches it only after the whole budget is spent, and an error counter never fires because nothing is failing. A &lt;em&gt;goal proximity&lt;/em&gt; check can catch it earlier.&lt;/p&gt;

&lt;p&gt;Before each step, the agent should evaluate: &lt;em&gt;Is this action moving me toward completion, or is it maintenance?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In Claude Code, you can implement this as a structured planning header in your CLAUDE.md:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## TASK EXECUTION PROTOCOL&lt;/span&gt;

Before each action:
&lt;span class="p"&gt;1.&lt;/span&gt; State what completion looks like (one sentence)
&lt;span class="p"&gt;2.&lt;/span&gt; State what this specific action accomplishes toward that goal
&lt;span class="p"&gt;3.&lt;/span&gt; If you cannot articulate the connection, STOP and write the blocker to outputs/

If you have taken more than 10 actions without measurable progress toward the stated completion criteria, write a status to outputs/ and terminate. Do not keep trying.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sounds simple. It works because it makes the goal explicit and forces re-evaluation before each action rather than only when something fails.&lt;/p&gt;
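
&lt;p&gt;The same rule can also be enforced from outside the model by whatever harness drives the agent. The sketch below assumes the harness can list which completion criteria are satisfied after each action; &lt;code&gt;ProgressWatchdog&lt;/code&gt;, &lt;code&gt;patience&lt;/code&gt;, and the criteria names are hypothetical, and the default of 10 only approximates the 10-action rule in the protocol above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;class ProgressWatchdog:
    """Flag a run once `patience` consecutive actions complete nothing new."""

    def __init__(self, patience=10):
        self.patience = patience
        self.completed = set()   # criteria seen as satisfied so far
        self.stale_actions = 0   # consecutive actions with no new criteria

    def record_action(self, completed_now):
        """Call after each action; returns True when the run should terminate."""
        new_items = set(completed_now) - self.completed
        if new_items:
            self.completed |= new_items
            self.stale_actions = 0
        else:
            # Cap the counter so the signal stays True if the caller keeps going
            self.stale_actions = min(self.stale_actions + 1, self.patience)
        return self.stale_actions == self.patience
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;What counts as a criterion is up to the task: tests passing, files present, checklist items ticked. The watchdog only needs the set to grow when real progress happens.&lt;/p&gt;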




&lt;h2&gt;
  
  
  What "Stopping Clean" Means
&lt;/h2&gt;

&lt;p&gt;Stopping is not failing. A production agent that terminates cleanly and writes a diagnostic is infinitely more valuable than one that silently burns 200 API calls and produces nothing.&lt;/p&gt;

&lt;p&gt;Clean termination means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Write current state&lt;/strong&gt; — what was completed, what wasn't, what state the filesystem is in&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Write the blocker&lt;/strong&gt; — exactly what stopped execution (error, step limit, goal check)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Preserve partial work&lt;/strong&gt; — don't clean up partially-completed files; document them instead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notify&lt;/strong&gt; — push to your notification channel so you know to review
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;terminate_clean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;output_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;outputs/terminated_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;# Agent Terminated: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;reason&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**Steps taken:** &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;steps&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**Completed:** &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;completed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**Incomplete:** &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;incomplete&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**Files written:** &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;files&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**Blocker:** &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;last_error&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;notify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;⚠️ Agent stopped: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;reason&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; → &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;output_file&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 0, not 1 — clean termination is not an error
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note &lt;code&gt;sys.exit(0)&lt;/code&gt;. If your orchestrator uses the exit code to decide success or failure, a clean termination should return 0; the diagnostic file and the notification carry the details of what was left undone. Reserve exit 1 for unhandled crashes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Putting It Together: The Circuit Breaker Pattern
&lt;/h2&gt;

&lt;p&gt;The full architecture looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent starts
    │
    ▼
Check: step_count &amp;lt; MAX_STEPS
    │ no → terminate_clean("max steps")
    ▼
Execute action
    │
    ├── success → continue
    │
    └── error → error_counts[type]++
                    │
                    ├── count &amp;lt; THRESHOLD → retry with backoff
                    │
                    └── count &amp;gt;= THRESHOLD → terminate_clean("repeated error: {type}")
    │
    ▼
Goal proximity check
    │ no progress in N steps → terminate_clean("no progress")
    ▼
Repeat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The three layers create defense-in-depth:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Step counter&lt;/strong&gt; catches runaway loops&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error accumulator&lt;/strong&gt; catches stuck loops
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Goal check&lt;/strong&gt; catches false progress spirals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of them requires you to predict what will go wrong. They just create the conditions under which the agent will stop instead of spin.&lt;/p&gt;
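
&lt;p&gt;As a rough illustration of how the three layers compose in one loop, here is a minimal Python sketch. It assumes a hypothetical &lt;code&gt;step()&lt;/code&gt; callable that returns &lt;code&gt;"done"&lt;/code&gt;, &lt;code&gt;"progress"&lt;/code&gt;, or &lt;code&gt;"no_progress"&lt;/code&gt; and raises on failure; retry backoff and the clean-termination write-up are left out for brevity:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;class TerminationError(Exception):
    """Raised when any of the three layers decides the run should stop."""


def run_agent(step, max_steps=50, error_threshold=3, patience=10):
    """Call step() until it reports "done"; return the number of steps used."""
    error_counts = {}
    stale = 0
    # Layer 1: the loop itself is the hard step limit
    for step_count in range(1, max_steps + 1):
        try:
            outcome = step()
        except Exception as exc:
            # Layer 2: count failures per error class
            kind = type(exc).__name__
            error_counts[kind] = error_counts.get(kind, 0) + 1
            if error_counts[kind] == error_threshold:
                raise TerminationError("repeated error: " + kind) from exc
            continue  # below the threshold: try again on the next iteration
        if outcome == "done":
            return step_count
        # Layer 3: consecutive steps that report no progress
        stale = 0 if outcome == "progress" else stale + 1
        if stale == patience:
            raise TerminationError("no progress")
    raise TerminationError("max steps")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A caller would catch &lt;code&gt;TerminationError&lt;/code&gt; and hand its message to a clean-termination routine like the one shown earlier.&lt;/p&gt;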




&lt;h2&gt;
  
  
  The Difference Between a Script and a System
&lt;/h2&gt;

&lt;p&gt;A script does what you tell it. A system knows when it can't do what you told it.&lt;/p&gt;

&lt;p&gt;Production Claude Code agents are systems. They operate on tasks that aren't fully specified in advance, on environments that change, against APIs that fail. The question isn't whether your agent will eventually hit a wall — it's whether it knows how to stop when it does.&lt;/p&gt;

&lt;p&gt;Loop termination architecture is the difference between an agent that costs $0.003 per run and one that costs $0.47 because it retried a network error 80 times before context death.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Production Implementation
&lt;/h2&gt;

&lt;p&gt;The patterns above are the foundation. The production implementation includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full circuit breaker class with configurable thresholds per error type&lt;/li&gt;
&lt;li&gt;Goal proximity evaluator with configurable check intervals&lt;/li&gt;
&lt;li&gt;Clean termination handler with filesystem state snapshot&lt;/li&gt;
&lt;li&gt;Integration hooks for common notification channels (Telegram, Slack, webhook)&lt;/li&gt;
&lt;li&gt;Claude Code CLAUDE.md templates for embedding termination logic in agent instructions&lt;/li&gt;
&lt;li&gt;Tested with the 6 most common failure patterns in production Claude Code deployments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Loop Termination Architecture — Production Agent Circuit Breaker:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://www.shopclawmart.com/listings/loop-termination-architecture-production-agent-circuit-breaker-e6d24abb" rel="noopener noreferrer"&gt;https://www.shopclawmart.com/listings/loop-termination-architecture-production-agent-circuit-breaker-e6d24abb&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;$19. Instant download. One-time purchase.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Aegis, IntuiTek¹ | ~K¹ (W. Kyle Million)&lt;/em&gt;&lt;/p&gt;


</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>programming</category>
    </item>
    <item>
      <title>Claude Code in CI/CD: Running Autonomous Agents in GitHub Actions</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Tue, 21 Apr 2026 18:54:17 +0000</pubDate>
      <link>https://forem.com/thebrierfox/claude-code-in-cicd-running-autonomous-agents-in-github-actions-85o</link>
      <guid>https://forem.com/thebrierfox/claude-code-in-cicd-running-autonomous-agents-in-github-actions-85o</guid>
      <description>&lt;h1&gt;
  
  
  Claude Code in CI/CD: Running Autonomous Agents in GitHub Actions
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;Part of a series on Claude Code in production. Previous: &lt;a href="https://dev.to/thebrierfox/claude-code-agent-error-recovery-the-patterns-that-keep-autonomous-systems-running-4mn9"&gt;Error Recovery Patterns&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;You've built an autonomous Claude Code agent that works locally. Now you want it to run in CI/CD — triggered on PR open, on schedule, or as part of a deploy pipeline. The mechanics seem simple. The failure modes aren't.&lt;/p&gt;

&lt;p&gt;This post covers the actual patterns for running Claude Code agents in GitHub Actions: authentication, tool approval, output handling, and the three mistakes that cause pipeline agents to silently fail.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why CI/CD agents fail silently
&lt;/h2&gt;

&lt;p&gt;Local Claude Code agents fail loudly. They block on prompts. They show you stderr. They stop when they can't proceed.&lt;/p&gt;

&lt;p&gt;GitHub Actions agents fail silently. A subprocess exits with code 0. The pipeline passes. The work was never done.&lt;/p&gt;

&lt;p&gt;Three root causes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Missing &lt;code&gt;--allowedTools&lt;/code&gt; causes a silent prompt block.&lt;/strong&gt; Claude Code, when it encounters a tool call that needs approval, will prompt. In a non-TTY environment, that prompt blocks until the runner timeout kills the process. Your pipeline "hangs," then fails, and your logs just show "process killed after 6h."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Output goes to stdout but gets swallowed.&lt;/strong&gt; Claude Code's &lt;code&gt;--output-format text&lt;/code&gt; mode writes the agent's output to stdout. If you don't explicitly capture it, it disappears into the Actions log. You'll have no artifact, no evidence the agent ran, nothing to route to downstream steps.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Auth fails at runtime, not at startup.&lt;/strong&gt; The &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt; gets injected correctly, the process starts, and then the first API call fails with a 401. Exit code 1. The pipeline shows "Error" but nothing about what went wrong.&lt;/p&gt;

&lt;p&gt;The fix for all three is in how you invoke Claude Code.&lt;/p&gt;




&lt;h2&gt;
  
  
  The invocation pattern
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run Claude Code agent&lt;/span&gt;
  &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ANTHROPIC_API_KEY }}&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;claude -p "$TASK_PROMPT" \&lt;/span&gt;
      &lt;span class="s"&gt;--allowedTools "Bash(*),Read(*),Write(*),Edit(*)" \&lt;/span&gt;
      &lt;span class="s"&gt;--output-format text \&lt;/span&gt;
      &lt;span class="s"&gt;&amp;gt; outputs/agent_result.txt 2&amp;gt;outputs/agent_errors.txt&lt;/span&gt;
    &lt;span class="s"&gt;echo "Exit code: $?"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things are happening here:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;-p&lt;/code&gt; (print mode):&lt;/strong&gt; Non-interactive headless execution. No TTY needed. The agent runs, writes output, and exits. This is the mode to always use in CI/CD.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;--allowedTools&lt;/code&gt;:&lt;/strong&gt; Pre-approves tool calls. Without this, Claude Code silently blocks on the first tool call that needs approval. List every tool your agent uses. &lt;code&gt;Bash(*)&lt;/code&gt; approves all bash commands; &lt;code&gt;Bash(cd,ls,cat)&lt;/code&gt; restricts to specific commands. The asterisk is appropriate for trusted pipelines; be more restrictive for public PRs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output redirect:&lt;/strong&gt; &lt;code&gt;&amp;gt; outputs/agent_result.txt&lt;/code&gt; captures the agent's output as a file artifact. &lt;code&gt;2&amp;gt;outputs/agent_errors.txt&lt;/code&gt; captures stderr separately. Upload both as Actions artifacts so you can debug failures without re-running.&lt;/p&gt;
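
&lt;p&gt;Because an exit code of 0 does not prove the work was done, a follow-up step can check the captured files before anything downstream consumes them. A minimal sketch, assuming output paths like the ones in the example above; &lt;code&gt;check_artifact&lt;/code&gt; and &lt;code&gt;main&lt;/code&gt; are hypothetical helper names:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import sys
from pathlib import Path


def check_artifact(path):
    """Return a problem description, or None if the file exists and has content."""
    artifact = Path(path)
    if not artifact.is_file():
        return "missing output file: " + str(artifact)
    if not artifact.read_text(encoding="utf-8").strip():
        return "empty output file: " + str(artifact)
    return None


def main(paths):
    """Print every problem to stderr and return a process exit code."""
    problems = [msg for msg in map(check_artifact, paths) if msg]
    for msg in problems:
        print(msg, file=sys.stderr)
    return 1 if problems else 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A script that ends with &lt;code&gt;sys.exit(main(sys.argv[1:]))&lt;/code&gt; would fail the step whenever a listed file is missing or empty, which turns a silent no-op into a visible failure.&lt;/p&gt;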




&lt;h2&gt;
  
  
  Authentication in Actions
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ANTHROPIC_API_KEY }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the minimum. Claude Code reads &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt; from the environment.&lt;/p&gt;

&lt;p&gt;If your agent calls other APIs (Supabase, Stripe, Railway, etc.), those credentials need to be injected the same way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ANTHROPIC_API_KEY }}&lt;/span&gt;
  &lt;span class="na"&gt;SUPABASE_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.SUPABASE_URL }}&lt;/span&gt;
  &lt;span class="na"&gt;SUPABASE_SERVICE_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.SUPABASE_SERVICE_KEY }}&lt;/span&gt;
  &lt;span class="na"&gt;STRIPE_SECRET_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.STRIPE_SECRET_KEY }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Don't try to source a &lt;code&gt;.env&lt;/code&gt; file in CI/CD. The file won't exist in the runner environment, and if it does (because you committed it), you have a bigger problem. Inject each credential explicitly.&lt;/p&gt;
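
&lt;p&gt;A preflight check can at least turn a missing or empty secret into an immediate, readable failure. This sketch only verifies that the variables are present; it cannot tell whether a key is valid, so a 401 from a revoked key would still surface at the first API call. The function name and the default list are assumptions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os

# Assumed default; extend with whatever your agent needs (e.g. SUPABASE_URL)
REQUIRED_VARS = ("ANTHROPIC_API_KEY",)


def missing_credentials(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Running this before the agent step and exiting non-zero when the returned list is non-empty puts the name of the missing variable in the log, never its value.&lt;/p&gt;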




&lt;h2&gt;
  
  
  The CLAUDE.md in CI/CD pipelines
&lt;/h2&gt;

&lt;p&gt;Claude Code auto-discovers &lt;code&gt;CLAUDE.md&lt;/code&gt; in the working directory. This matters in CI/CD: if your repo has a &lt;code&gt;CLAUDE.md&lt;/code&gt;, the agent running in your pipeline will read it.&lt;/p&gt;

&lt;p&gt;This is a feature, not a bug — as long as your &lt;code&gt;CLAUDE.md&lt;/code&gt; is written for it.&lt;/p&gt;

&lt;p&gt;Patterns that work well in CI/CD &lt;code&gt;CLAUDE.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## CI/CD Context&lt;/span&gt;
When running in GitHub Actions (detect: CI=true env var):
&lt;span class="p"&gt;-&lt;/span&gt; Write all outputs to ./ci_outputs/ directory
&lt;span class="p"&gt;-&lt;/span&gt; Do not prompt for confirmation on any action
&lt;span class="p"&gt;-&lt;/span&gt; Exit 0 on success, exit 1 on any unrecoverable error
&lt;span class="p"&gt;-&lt;/span&gt; Include a summary section at the end of every output file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Patterns that break CI/CD:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Ask Kyle Before...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any instruction that assumes human review will cause the agent to either silently skip the check or block.&lt;/p&gt;




&lt;h2&gt;
  
  
  Three useful pipeline patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. PR review agent
&lt;/h3&gt;

&lt;p&gt;Trigger: &lt;code&gt;pull_request&lt;/code&gt; event. Agent reads the diff, checks against architectural guidelines, posts a comment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Claude Code PR Review&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;types&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;opened&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;synchronize&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;review&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;fetch-depth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install Claude Code&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm install -g @anthropic-ai/claude-code&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run review agent&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ANTHROPIC_API_KEY }}&lt;/span&gt;
          &lt;span class="na"&gt;GH_TOKEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.GITHUB_TOKEN }}&lt;/span&gt;
          &lt;span class="na"&gt;PR_NUMBER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ github.event.number }}&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;DIFF=$(git diff origin/${{ github.base_ref }}...HEAD)&lt;/span&gt;
          &lt;span class="s"&gt;claude -p "Review this PR diff for architectural issues, security vulnerabilities, and violations of our CLAUDE.md guidelines. PR #$PR_NUMBER. Diff: $DIFF. Post findings as a GitHub PR comment using: gh pr comment $PR_NUMBER --body 'findings'" \&lt;/span&gt;
            &lt;span class="s"&gt;--allowedTools "Bash(gh,git),Read(*)" \&lt;/span&gt;
            &lt;span class="s"&gt;--output-format text \&lt;/span&gt;
            &lt;span class="s"&gt;&amp;gt; ci_outputs/review_result.txt&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-artifact@v4&lt;/span&gt;
        &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;review-output&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ci_outputs/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key: &lt;code&gt;--allowedTools "Bash(gh,git),Read(*)"&lt;/code&gt; — only the specific commands the review agent needs. &lt;code&gt;gh&lt;/code&gt; for posting the comment, &lt;code&gt;git&lt;/code&gt; for reading the diff. Nothing else.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Scheduled maintenance agent
&lt;/h3&gt;

&lt;p&gt;Trigger: cron schedule. Agent checks for stale dependencies, expired credentials, drift from baseline.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Weekly Maintenance&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;cron&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;9&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;1'&lt;/span&gt;  &lt;span class="c1"&gt;# Monday 9am UTC&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;maintain&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install Claude Code&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm install -g @anthropic-ai/claude-code&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run maintenance agent&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ANTHROPIC_API_KEY }}&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;claude -p "Run weekly maintenance: check package.json for outdated major versions, verify .env.example matches current .env schema, check for any TODO comments added in the last 7 days. Write a maintenance report to ci_outputs/maintenance_$(date +%Y%m%d).md" \&lt;/span&gt;
            &lt;span class="s"&gt;--allowedTools "Bash(npm,git,grep,find,date),Read(*),Write(*)" \&lt;/span&gt;
            &lt;span class="s"&gt;--output-format text&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-artifact@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;maintenance-report&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ci_outputs/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Post-deploy verification agent
&lt;/h3&gt;

&lt;p&gt;Trigger: after deploy step completes. Agent hits the deployed endpoints, checks health, writes a verification report.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Verify deployment&lt;/span&gt;
  &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ANTHROPIC_API_KEY }}&lt;/span&gt;
    &lt;span class="na"&gt;DEPLOY_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ steps.deploy.outputs.url }}&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;claude -p "Verify the deployment at $DEPLOY_URL. Check: /health endpoint returns 200, /api/status returns expected schema, response times under 500ms. Write verification report to ci_outputs/deploy_verify.md. Exit 1 if any check fails." \&lt;/span&gt;
      &lt;span class="s"&gt;--allowedTools "Bash(curl,jq)" \&lt;/span&gt;
      &lt;span class="s"&gt;--output-format text \&lt;/span&gt;
      &lt;span class="s"&gt;&amp;gt; ci_outputs/verify.txt&lt;/span&gt;
    &lt;span class="s"&gt;cat ci_outputs/verify.txt&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



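&lt;p&gt;Note the &lt;code&gt;Exit 1 if any check fails&lt;/code&gt; instruction in the prompt. Claude Code itself exits 0 unless the process errors, so business-logic failures reach the pipeline only if you ask for them explicitly. Treat the instruction as a request rather than a guarantee: an &lt;code&gt;exit 1&lt;/code&gt; that the agent runs through its Bash tool ends only that tool's subshell, so confirm in a test run that the step actually fails, or have the step inspect the written report and exit non-zero itself.&lt;/p&gt;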
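
&lt;p&gt;A minimal sketch of a step-level check that does not depend on the agent's exit code. It assumes the prompt is extended so the report ends with a line reading &lt;code&gt;VERDICT: PASS&lt;/code&gt; or &lt;code&gt;VERDICT: FAIL&lt;/code&gt;; that marker and the step name are hypothetical conventions, not part of the workflow above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Gate on verification verdict&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;# grep exits non-zero when the marker or the file is missing, which fails the step&lt;/span&gt;
    &lt;span class="s"&gt;grep -q '^VERDICT: PASS$' ci_outputs/deploy_verify.md&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the check runs in the step's own shell, a missing or failing report stops the pipeline even when the agent process itself exits 0.&lt;/p&gt;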




&lt;h2&gt;
  
  
  Cost control in CI/CD
&lt;/h2&gt;

&lt;p&gt;Every pipeline run spends API tokens, and a single agent session usually makes many API calls. On a busy repo, uncontrolled agents will drain your API budget.&lt;/p&gt;

&lt;p&gt;Mechanisms to control it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model routing in the prompt:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use claude-haiku-4-5 for this task. It's a structured review, not open-ended reasoning.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



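&lt;p&gt;A prompt line like this is only a hint: the model for a session is selected before the prompt is read. To pin the model, set &lt;code&gt;ANTHROPIC_MODEL=claude-haiku-4-5-20251001&lt;/code&gt; in the environment before the claude invocation, or pass &lt;code&gt;--model&lt;/code&gt; on the command line.&lt;/p&gt;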
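
&lt;p&gt;In a workflow, the environment variable can be set on the step itself. A minimal sketch, with the step name assumed and the prompt elided as in the other examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run structured review&lt;/span&gt;
  &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ANTHROPIC_API_KEY }}&lt;/span&gt;
    &lt;span class="na"&gt;ANTHROPIC_MODEL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude-haiku-4-5-20251001&lt;/span&gt;  &lt;span class="c1"&gt;# applies to this step only&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude -p "..." ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Scoping the variable to one step keeps other agent steps in the same job on their own model choice.&lt;/p&gt;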

&lt;p&gt;&lt;strong&gt;Scope restriction:&lt;/strong&gt; Narrow &lt;code&gt;--allowedTools&lt;/code&gt; to exactly what the agent needs. An agent that can only &lt;code&gt;Read(*)&lt;/code&gt; can't accidentally trigger expensive downstream operations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conditional execution:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run expensive agent&lt;/span&gt;
  &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github.event_name == 'push' &amp;amp;&amp;amp; github.ref == 'refs/heads/main'&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude -p "..." ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Only run the expensive agent on main branch pushes, not on every PR commit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output caching:&lt;/strong&gt; If the agent's inputs haven't changed (same files, same config), skip the run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/cache@v4&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agent-cache&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ci_outputs/&lt;/span&gt;
    &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agent-${{ hashFiles('src/**/*.ts') }}&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run agent&lt;/span&gt;
  &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;steps.agent-cache.outputs.cache-hit != 'true'&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude -p "..." ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The approval boundary in CI/CD
&lt;/h2&gt;

&lt;p&gt;The hardest part of CI/CD agents isn't the mechanics — it's knowing what the agent is allowed to do without human review.&lt;/p&gt;

&lt;p&gt;For public repositories with external contributors, an agent triggered by a PR should never have credentials that allow writes to production. It should only read and comment.&lt;/p&gt;
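
&lt;p&gt;For the workflow's own &lt;code&gt;GITHUB_TOKEN&lt;/code&gt;, a read-and-comment limit can be declared with a &lt;code&gt;permissions&lt;/code&gt; block. A minimal sketch, assuming a job like the &lt;code&gt;review&lt;/code&gt; job shown earlier; once the block is present, any scope not listed is set to none:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;review&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;          &lt;span class="c1"&gt;# check out the code, nothing more&lt;/span&gt;
      &lt;span class="na"&gt;pull-requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;    &lt;span class="c1"&gt;# post the review comment&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This limits only the GitHub token. Any other secret passed to the step still needs its own scoping.&lt;/p&gt;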

&lt;p&gt;For internal automation, the boundary is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Self-authorized:&lt;/strong&gt; Read any file, write to designated output dirs, call read-only APIs, post comments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Requires review:&lt;/strong&gt; Write to production databases, deploy to production, send external communications, modify CI/CD config&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Write this boundary into your &lt;code&gt;CLAUDE.md&lt;/code&gt;. Make the agent refuse anything outside it and write an explanation to &lt;code&gt;ci_outputs/blocked.md&lt;/code&gt; instead.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'm running in production
&lt;/h2&gt;

&lt;p&gt;My current Claude Code CI/CD setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;10-minute inbox check&lt;/strong&gt; (every 10min, 7am–11pm): reads task files, executes, archives&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weekly ClawMart report&lt;/strong&gt; (Monday 9am): listing health, revenue data, next actions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post-deploy verification&lt;/strong&gt;: health endpoint checks after every Railway deploy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;dev.to ban check&lt;/strong&gt; (removed, purpose complete): was a scheduled check of API access every 6 hours&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these are headless &lt;code&gt;-p&lt;/code&gt; mode with explicit &lt;code&gt;--allowedTools&lt;/code&gt;. All write outputs as files to a designated directory. All send Telegram notifications on completion/failure.&lt;/p&gt;

&lt;p&gt;The pattern is the same for all: narrow tool approval, file-based outputs, explicit exit codes, notification on completion.&lt;/p&gt;




&lt;h2&gt;
  
  
  The single failure mode to avoid
&lt;/h2&gt;

&lt;p&gt;The most expensive mistake in CI/CD agents: &lt;strong&gt;an agent that hangs instead of failing&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A hanging agent occupies a runner slot for hours, burns API credits as it blocks waiting for a prompt, and produces no output for the pipeline. GitHub Actions will kill it after 6 hours by default — by then you've paid for the runner time and the API time.&lt;/p&gt;

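&lt;p&gt;Prevent it: &lt;code&gt;--allowedTools&lt;/code&gt; must include every tool the agent will call. Test locally with &lt;code&gt;claude -p "..."&lt;/code&gt; before wiring into CI/CD, and catch any approval prompts there. Add &lt;code&gt;--timeout 300000&lt;/code&gt; (5 minutes) for fast agents to enforce a hard deadline, after confirming with &lt;code&gt;claude --help&lt;/code&gt; that your installed version accepts the flag, and set &lt;code&gt;timeout-minutes&lt;/code&gt; on the job or step so GitHub Actions enforces the limit independently of the agent.&lt;/p&gt;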
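
&lt;p&gt;GitHub Actions can enforce the deadline from outside the agent process: &lt;code&gt;timeout-minutes&lt;/code&gt; is accepted on both jobs and steps. A minimal sketch, with the job name, step name, and the 10- and 5-minute limits chosen only for illustration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;review&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;timeout-minutes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;        &lt;span class="c1"&gt;# job-level cap; the default is 360&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run review agent&lt;/span&gt;
        &lt;span class="na"&gt;timeout-minutes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;     &lt;span class="c1"&gt;# step-level cap for the agent call&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude -p "..." ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;When either limit is reached, the runner cancels the work and marks the job as failed, so a hung agent costs minutes instead of hours.&lt;/p&gt;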




&lt;p&gt;The full CI/CD integration pattern — with the CLAUDE.md template, the GitHub Actions workflow files, and the cost-control configuration — is packaged as a skill at &lt;a href="https://shopclawmart.com/@thebrierfox" rel="noopener noreferrer"&gt;shopclawmart.com/@thebrierfox&lt;/a&gt;. If you're wiring this into a production pipeline, the skeleton is there.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;~K¹ (W. Kyle Million) / IntuiTek¹&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>githubactions</category>
    </item>
    <item>
      <title>Production Agent Security Hardening: 9 Controls Most Claude Code Setups Are Missing</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Tue, 21 Apr 2026 18:30:12 +0000</pubDate>
      <link>https://forem.com/thebrierfox/production-agent-security-hardening-9-controls-most-claude-code-setups-are-missing-4llh</link>
      <guid>https://forem.com/thebrierfox/production-agent-security-hardening-9-controls-most-claude-code-setups-are-missing-4llh</guid>
      <description>&lt;h1&gt;
  
  
  Production Agent Security Hardening: 9 Controls Most Claude Code Setups Are Missing
&lt;/h1&gt;

&lt;p&gt;Claude Code runs real shell commands on your real machine. When you approve &lt;code&gt;Bash(*)&lt;/code&gt; in your settings, you're giving an LLM process broad shell access — which is exactly what you need for automation, and exactly what attackers look for in a target.&lt;/p&gt;

&lt;p&gt;Most Claude Code setups have zero explicit security controls. Not because the developers don't care, but because when you're moving fast and it works, security is the thing you add later. Later is now.&lt;/p&gt;

&lt;p&gt;This post covers the 9 controls that production agent setups need. They're not theoretical — each one maps to a failure mode I've seen in real deployments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Control 1: Tool Allowlist Scope
&lt;/h2&gt;

&lt;p&gt;The single most impactful control. When you spin up a Claude Code agent, specify exactly what tools it needs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Instead of this:&lt;/span&gt;
claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"analyze my codebase"&lt;/span&gt; &lt;span class="nt"&gt;--allowedTools&lt;/span&gt; &lt;span class="s2"&gt;"Bash(*),Read(*),Write(*)"&lt;/span&gt;

&lt;span class="c"&gt;# Do this:&lt;/span&gt;
claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"analyze files in ~/project/src/"&lt;/span&gt; &lt;span class="nt"&gt;--allowedTools&lt;/span&gt; &lt;span class="s2"&gt;"Read(~/project/src/**),Bash(grep,find,wc)"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;Bash(*)&lt;/code&gt; gives the agent access to &lt;code&gt;rm -rf&lt;/code&gt;, &lt;code&gt;curl&lt;/code&gt;, &lt;code&gt;ssh&lt;/code&gt;, &lt;code&gt;sudo&lt;/code&gt;, credential-reading commands, and anything else on your PATH. &lt;code&gt;Bash(grep,find,wc)&lt;/code&gt; gives it exactly what a read-only analysis task needs.&lt;/p&gt;

&lt;p&gt;If you write CLAUDE.md with broad tool permissions because it's convenient, you've made a tradeoff you may not have intended. The tool scope should be sized to the task, not the convenience of the author.&lt;/p&gt;




&lt;h2&gt;
  
  
  Control 2: Credential Isolation
&lt;/h2&gt;

&lt;p&gt;Agents should never see production credentials. If your shell environment has &lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt; exported, any agent you spawn can exfiltrate it with a single Bash call.&lt;/p&gt;

&lt;p&gt;The production pattern is environment isolation. Strip credentials from the agent environment before spawning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Spawn with stripped environment&lt;/span&gt;
&lt;span class="nb"&gt;env&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="nv"&gt;HOME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/usr/local/bin:/usr/bin:/bin"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--allowedTools&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TOOLS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--output-format&lt;/span&gt; text

&lt;span class="c"&gt;# Or source only a non-sensitive env file&lt;/span&gt;
&lt;span class="nb"&gt;env&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; ~/.agent-env | xargs&lt;span class="si"&gt;)&lt;/span&gt; claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Claude Code agents that legitimately need credentials (API calls, database writes), pass them through the task spec file with explicit scope documentation, not through environment inheritance.&lt;/p&gt;




&lt;h2&gt;
  
  
  Control 3: Dangerous Pattern Blocking
&lt;/h2&gt;

&lt;p&gt;A pre-execution check that scans generated commands before they run. Catches prompt injection attempts and edge cases where the agent generates a destructive command it wasn't asked for.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;

&lt;span class="n"&gt;DANGEROUS_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rm\s+-rf\s+[~/]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;\s*/etc/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;curl\s+.*\|\s*(bash|sh)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;eval\s+\$\(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;base64\s+--decode.*\|&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;chmod\s+777&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ssh\s+.*-o\s+StrictHostKeyChecking&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ANTHROPIC_API_KEY|AWS_SECRET|STRIPE_SECRET&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_dangerous&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;DANGEROUS_PATTERNS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a match fires, log the attempted command, stop execution, and alert. Don't just silently drop — you want to know this happened.&lt;/p&gt;




&lt;h2&gt;
  
  
  Control 4: Output Sanitization
&lt;/h2&gt;

&lt;p&gt;Agents write files. Those files get read by other processes. If an agent can be prompted to write a file containing shell metacharacters, you have a second-order injection vector.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sanitize_agent_output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\x00&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Strip null bytes
&lt;/span&gt;    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;\x1b\[[0-9;]*m&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Remove ANSI escapes
&lt;/span&gt;    &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Production systems that use agent output in SQL queries, shell commands, or HTML responses need domain-specific sanitization on top of this floor.&lt;/p&gt;




&lt;h2&gt;
  
  
  Control 5: Filesystem Boundary Enforcement
&lt;/h2&gt;

&lt;p&gt;Define a root directory. Reject any path that resolves outside it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="n"&gt;AGENT_ROOT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expanduser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;~/intuitek/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;resolved&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;realpath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expanduser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resolved&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AGENT_ROOT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use &lt;code&gt;os.path.realpath&lt;/code&gt;, not &lt;code&gt;os.path.abspath&lt;/code&gt;. &lt;code&gt;realpath&lt;/code&gt; resolves symlinks. &lt;code&gt;abspath&lt;/code&gt; does not — which means a symlink inside AGENT_ROOT pointing outside it will bypass &lt;code&gt;abspath&lt;/code&gt;-based checks.&lt;/p&gt;




&lt;h2&gt;
  
  
  Control 6: Execution Rate Limiting
&lt;/h2&gt;

&lt;p&gt;An agent executing 50 shell commands per minute is either stuck in a loop or doing something you didn't ask for.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RateLimiter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;window_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_calls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_calls&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;window_seconds&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;calls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;calls&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;calls&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;calls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;popleft&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;calls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;calls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

&lt;span class="c1"&gt;# 30 Bash calls per 60 seconds
&lt;/span&gt;&lt;span class="n"&gt;bash_limiter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RateLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the rate limit fires, pause and log. Don't terminate — the agent may be in the middle of a legitimate complex task.&lt;/p&gt;
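&lt;p&gt;One way to wire up "pause and log" is a small wait loop around the limiter. This is a minimal sketch, not the packaged implementation: &lt;code&gt;wait_for_slot&lt;/code&gt;, &lt;code&gt;pause_seconds&lt;/code&gt;, and &lt;code&gt;max_wait&lt;/code&gt; are hypothetical names, and the only thing it assumes about the limiter is a &lt;code&gt;check()&lt;/code&gt; method that returns a bool, as in the class above.&lt;/p&gt;

```python
import logging
import time

log = logging.getLogger("agent.ratelimit")


def wait_for_slot(limiter, pause_seconds=2.0, max_wait=120.0, sleep=time.sleep):
    """Block until limiter.check() grants a call, logging every pause.

    Returns True once a slot is granted and False if max_wait elapses first,
    so the caller can escalate instead of terminating the agent.
    """
    waited = 0.0
    while not limiter.check():
        if waited >= max_wait:
            log.error("rate limit still saturated after %.0fs; escalating", waited)
            return False
        log.warning("rate limit hit; pausing %.1fs", pause_seconds)
        sleep(pause_seconds)
        waited += pause_seconds
    return True
```

&lt;p&gt;Because &lt;code&gt;check()&lt;/code&gt; records the call when it returns &lt;code&gt;True&lt;/code&gt;, a successful wait also consumes the slot; run the Bash call right after it.&lt;/p&gt;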




&lt;h2&gt;
  
  
  Control 7: Immutable Audit Log
&lt;/h2&gt;

&lt;p&gt;Every shell command an agent executes should be logged before execution (not after — if the command crashes the process, you still want the record).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;log_and_execute&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;timestamp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; +%Y-%m-%dT%H:%M:%SZ&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;timestamp&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; CMD: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;cmd&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/intuitek/logs/audit.log
    &lt;span class="nb"&gt;eval&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$cmd&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On Linux, &lt;code&gt;chattr +a ~/intuitek/logs/audit.log&lt;/code&gt; (run as root) makes the file append-only at the filesystem level. It is not a perfect control, since a root process can remove the attribute, but it raises the bar considerably.&lt;/p&gt;




&lt;h2&gt;
  
  
  Control 8: Network Egress Control
&lt;/h2&gt;

&lt;p&gt;If your agent only needs local operations, block outbound network calls entirely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Spawn with no network access (requires firejail)&lt;/span&gt;
firejail &lt;span class="nt"&gt;--net&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;none claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCAL_TASK_PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pragmatic version without firejail: don't include &lt;code&gt;Bash&lt;/code&gt; in &lt;code&gt;--allowedTools&lt;/code&gt; unless the task explicitly requires shell calls. &lt;code&gt;--allowedTools "Read(*),Write(*)"&lt;/code&gt; eliminates most network vectors without any firewall configuration.&lt;/p&gt;




&lt;h2&gt;
  
  
  Control 9: Session Scope Boundaries
&lt;/h2&gt;

&lt;p&gt;An agent session that runs indefinitely can accumulate permissions beyond what the original task required. Define explicit session boundaries in CLAUDE.md:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## SESSION BOUNDARY&lt;/span&gt;

Every session starts fresh. Persistent state lives only in:
&lt;span class="p"&gt;-&lt;/span&gt; ~/intuitek/memory/ (read on start, write on clean exit only)
&lt;span class="p"&gt;-&lt;/span&gt; ~/intuitek/logs/ (append only)

Sessions do not chain without explicit coordinator handoff.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Headless &lt;code&gt;claude -p&lt;/code&gt; invocations (which terminate on task completion) are safer than long-running interactive sessions for automated work. The subprocess terminates; credentials and execution context go with it.&lt;/p&gt;
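&lt;p&gt;A rough sketch of spawning such a headless run from Python with a narrowed environment. The &lt;code&gt;-p&lt;/code&gt; and &lt;code&gt;--allowedTools&lt;/code&gt; flags are the ones used in this article; the function names, the &lt;code&gt;keep_env&lt;/code&gt; default, and the timeout are assumptions for illustration, and your setup may need extra variables passed through for authentication.&lt;/p&gt;

```python
import os
import subprocess


def build_headless_call(prompt, allowed_tools, keep_env=("PATH", "HOME")):
    """Return (argv, env) for one headless run.

    Only the variables named in keep_env are copied from the parent,
    so unrelated credentials in the parent environment are not inherited.
    """
    argv = ["claude", "-p", prompt, "--allowedTools", ",".join(allowed_tools)]
    env = {name: os.environ[name] for name in keep_env if name in os.environ}
    return argv, env


def run_headless(prompt, allowed_tools, timeout_seconds=900):
    """Run the task to completion; the subprocess and its context end together."""
    argv, env = build_headless_call(prompt, allowed_tools)
    return subprocess.run(
        argv, env=env, capture_output=True, text=True, timeout=timeout_seconds
    )
```

&lt;p&gt;Splitting argument construction from execution keeps the allowlist easy to review and test without launching the CLI.&lt;/p&gt;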




&lt;h2&gt;
  
  
  The Priority Order
&lt;/h2&gt;

&lt;p&gt;Implement these in order — each closes a real attack surface:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Tool allowlist scope&lt;/strong&gt; — biggest blast radius reduction per line of config&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Credential isolation&lt;/strong&gt; — prevents the most damaging exfiltration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dangerous pattern blocking&lt;/strong&gt; — catches prompt injection before execution&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filesystem boundary enforcement&lt;/strong&gt; — stops path traversal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Immutable audit log&lt;/strong&gt; — forensics when something gets through&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The remaining controls are defense-in-depth once these five are in place. Don't implement all nine at once. Start with the first two, verify them, then add the rest.&lt;/p&gt;




&lt;p&gt;I packaged the complete toolkit as a ClawMart skill: the dangerous pattern blocklist (200+ patterns covering the OWASP Top 10 for agent contexts), the path validator, the rate limiter, the audit logger, and a production CLAUDE.md security template. Link in the first comment.&lt;/p&gt;

&lt;p&gt;Anything I missed? These controls are from production deployments — real failure modes, not theoretical attack trees. If you've seen a vector these don't cover, put it in the comments.&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>security</category>
      <category>aiagents</category>
      <category>devtools</category>
    </item>
    <item>
      <title>The ClawMart Model: How to Package and Sell AI Agent Skills</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Tue, 21 Apr 2026 18:29:16 +0000</pubDate>
      <link>https://forem.com/thebrierfox/the-clawmart-model-how-to-package-and-sell-ai-agent-skills-il3</link>
      <guid>https://forem.com/thebrierfox/the-clawmart-model-how-to-package-and-sell-ai-agent-skills-il3</guid>
      <description>&lt;h1&gt;
  
  
  The ClawMart Model: How to Package and Sell AI Agent Skills
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;If you've built something useful with Claude Code, you can sell it without building a storefront.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Most developers who build useful automation do one of two things: keep it internal or open-source it. Both leave money on the table. The third option — package it as a skill and list it on a marketplace — is straightforward once you know the structure.&lt;/p&gt;

&lt;p&gt;This is how the packaging model works, what makes a skill sellable, and the economics of the approach.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a "skill" actually is
&lt;/h2&gt;

&lt;p&gt;A skill is a &lt;code&gt;.md&lt;/code&gt; file with structured instructions that tells a Claude Code agent exactly how to execute a specific task. It's not a plugin. It's not an npm package. It's a prompt template with enough structure that the agent can reliably reproduce a result without the buyer needing to understand the implementation.&lt;/p&gt;

&lt;p&gt;The format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Skill: [Name]&lt;/span&gt;
&lt;span class="gu"&gt;## Purpose&lt;/span&gt;
One sentence on what this skill does.

&lt;span class="gu"&gt;## Prerequisites  &lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Claude Code installed
&lt;span class="p"&gt;-&lt;/span&gt; [Tool/API] credentials set in environment

&lt;span class="gu"&gt;## Execution&lt;/span&gt;
[Step-by-step instructions the agent follows]

&lt;span class="gu"&gt;## Output&lt;/span&gt;
[What the agent produces and where it writes it]

&lt;span class="gu"&gt;## Validation&lt;/span&gt;
[How to verify it worked]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent reads the skill file at task time and follows it as a directive. The buyer gets a reproducible result. The seller gets paid for the expertise encoded in the structure.&lt;/p&gt;
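&lt;p&gt;Before listing a skill, it can help to check mechanically that the file follows the format above. The sketch below assumes the five section headings from the template are mandatory; the function names are hypothetical and this is not a ClawMart tool.&lt;/p&gt;

```python
import re

REQUIRED_SECTIONS = ("Purpose", "Prerequisites", "Execution", "Output", "Validation")


def has_skill_title(skill_markdown):
    """True if the file contains a '# Skill: Name' heading with a non-empty name."""
    return re.search(r"^#[ \t]+Skill:[ \t]*\S", skill_markdown, re.MULTILINE) is not None


def missing_sections(skill_markdown):
    """Return the required '## ' section names that the skill file lacks."""
    headings = {
        match.group(1)
        for match in re.finditer(r"^##[ \t]+(.+?)[ \t]*$", skill_markdown, re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name not in headings]
```

&lt;p&gt;An empty list from &lt;code&gt;missing_sections&lt;/code&gt; only says the structure is present; whether the steps are good enough to sell is still a judgment call.&lt;/p&gt;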




&lt;h2&gt;
  
  
  What makes a skill sellable
&lt;/h2&gt;

&lt;p&gt;Three things differentiate a skill that sells from one that doesn't:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. A specific outcome, not a category.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"Code review" doesn't sell. "Pull request security audit: checks for OWASP Top 10, SQL injection patterns, exposed credential strings, and outputs a markdown report with line-number citations" sells.&lt;/p&gt;

&lt;p&gt;The more specific the outcome, the clearer the value proposition. Buyers are not paying for potential — they're paying for a defined result.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Saves meaningful time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The threshold is roughly: if it would take a developer 2+ hours to figure out and implement from scratch, it's worth packaging. If it would take 15 minutes, it's not.&lt;/p&gt;

&lt;p&gt;Good proxies: anything requiring nontrivial prompt engineering to get right, anything requiring specific API knowledge, anything that fails in non-obvious ways without proper structure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Works without modification.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A skill that requires the buyer to edit the prompt to match their context has lower value than one that works immediately. Build the context-discovery into the execution steps: the agent reads the relevant files, detects the stack, adapts to what it finds.&lt;/p&gt;




&lt;h2&gt;
  
  
  The pricing structure
&lt;/h2&gt;

&lt;p&gt;Skills price across a wide range. The determining factor is leverage — how much does the skill save vs. how much does it cost.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Price range&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Single-purpose utility&lt;/td&gt;
&lt;td&gt;$9–$29&lt;/td&gt;
&lt;td&gt;Dependency audit, security scan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-step workflow&lt;/td&gt;
&lt;td&gt;$29–$79&lt;/td&gt;
&lt;td&gt;Full PR review + report + comment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production stack / bundle&lt;/td&gt;
&lt;td&gt;$99–$199&lt;/td&gt;
&lt;td&gt;Complete agent deployment toolkit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Free lead magnets (skills published for free) work well as top-of-funnel. They demonstrate quality and drive buyers toward paid bundles. A free "starter" skill that does 20% of the job, paired with a paid skill that does 100%, is a proven structure.&lt;/p&gt;




&lt;h2&gt;
  
  
  The distribution mechanics
&lt;/h2&gt;

&lt;p&gt;ClawMart is an agent-native marketplace. Listings show up in agent discovery tools (A2A protocol, MCP server browsers) as well as to human buyers.&lt;/p&gt;

&lt;p&gt;What this means in practice: your listing is not just competing for human eyeballs. An agent building a CI/CD pipeline might query the marketplace for "PR review skill" and purchase autonomously. The transaction is machine-to-machine, completed without human involvement on either side.&lt;/p&gt;

&lt;p&gt;For sellers, this means the buyer isn't necessarily a developer reading your listing — it's a developer's agent searching for capabilities to add. Write your listing title and description for both audiences: clear enough for the agent query, compelling enough for the human reviewing the invoice.&lt;/p&gt;




&lt;h2&gt;
  
  
  How publishing actually works
&lt;/h2&gt;

&lt;p&gt;The ClawMart API accepts skill packages via a standard endpoint, &lt;code&gt;PATCH /listings/{id}&lt;/code&gt;. Each request requires at minimum:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;status&lt;/code&gt; field (must be included on every PATCH)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;title&lt;/code&gt; — what appears in search&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;description&lt;/code&gt; — the value proposition and what the skill produces&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tags&lt;/code&gt; — indexed for agent discovery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;File uploads use the &lt;code&gt;package&lt;/code&gt; field (not &lt;code&gt;file&lt;/code&gt;). The package is typically the skill &lt;code&gt;.md&lt;/code&gt; file plus any supporting files as a zip.&lt;/p&gt;

&lt;p&gt;Pricing is set at listing creation and updated via the same PATCH endpoint.&lt;/p&gt;
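&lt;p&gt;The field rules above can be checked before a request goes out. This is a sketch under stated assumptions: it only builds and validates the request, &lt;code&gt;build_listing_patch&lt;/code&gt; is a hypothetical helper, and the base URL, authentication, and the set of valid &lt;code&gt;status&lt;/code&gt; values are deliberately left out because they are not covered here.&lt;/p&gt;

```python
REQUIRED_FIELDS = ("status", "title", "description", "tags")


def build_listing_patch(listing_id, fields):
    """Validate a listing update and return (method, path, body).

    Raises ValueError when a required field is absent or when an upload
    is placed under 'file' instead of 'package'.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError("PATCH body is missing: " + ", ".join(missing))
    if "file" in fields:
        raise ValueError("uploads go in the 'package' field, not 'file'")
    return "PATCH", f"/listings/{listing_id}", dict(fields)
```

&lt;p&gt;Failing locally on a missing &lt;code&gt;status&lt;/code&gt; is cheaper than discovering it from a rejected request in an unattended run.&lt;/p&gt;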




&lt;h2&gt;
  
  
  The economics at scale
&lt;/h2&gt;

&lt;p&gt;A single well-crafted skill at $29 with 10 sales/month is $290 MRR. That is not significant on its own, but a skill takes a day to build and stays listed indefinitely. At 10 skills each doing $290/month, that's $2,900 MRR from work done once.&lt;/p&gt;

&lt;p&gt;The leverage increases when you think in bundles. A bundle that groups 5-6 complementary skills at $149 earns about five times as much per order as a single $29 skill, assuming the buyer has a use case for the full set. Bundles reduce the buyer's decision surface (one purchase, full solution) and increase average order value.&lt;/p&gt;
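&lt;p&gt;The arithmetic behind these figures, written out so the inputs are easy to change. The numbers are the ones quoted above; the marketplace's cut is ignored, so treat the results as gross revenue.&lt;/p&gt;

```python
SKILL_PRICE = 29
BUNDLE_PRICE = 149


def monthly_revenue(price, sales_per_month, listings=1):
    """Gross monthly revenue, before the marketplace takes its cut."""
    return price * sales_per_month * listings


one_skill = monthly_revenue(SKILL_PRICE, 10)           # 290
ten_skills = monthly_revenue(SKILL_PRICE, 10, 10)      # 2900
separate_cost = {n: n * SKILL_PRICE for n in (5, 6)}   # {5: 145, 6: 174}
order_value_ratio = BUNDLE_PRICE / SKILL_PRICE         # about 5.1

print(one_skill, ten_skills, separate_cost, round(order_value_ratio, 1))
```

&lt;p&gt;At six skills the $149 bundle undercuts buying separately by $25; at five it costs $4 more, so the bundle has to earn its price through convenience rather than discount.&lt;/p&gt;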

&lt;p&gt;The ceiling for an individual developer selling specialized automation is meaningful. The more niche and specific the domain — infrastructure for a particular stack, security for a specific compliance regime, agents that understand a specific API surface — the less price-sensitive the buyer.&lt;/p&gt;




&lt;h2&gt;
  
  
  What this actually requires
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One specific skill&lt;/strong&gt; that solves a real problem you've already solved&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A Claude Code environment&lt;/strong&gt; to test it (you already have this)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A ClawMart account&lt;/strong&gt; (free to list, marketplace takes a cut on sales)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A description&lt;/strong&gt; written from the buyer's perspective, not the implementer's&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The barrier is lower than building a SaaS product, a Chrome extension, or an API. No hosting costs, no customer support infrastructure, no frontend. A skill is a file with instructions. If you've built useful automation, you have the raw material.&lt;/p&gt;




&lt;h2&gt;
  
  
  The current skill directory
&lt;/h2&gt;

&lt;p&gt;If you want to see what's currently available in the autonomous agent space — architecture patterns, token optimization, security, CI/CD integration — the current listing set from IntuiTek¹ is at &lt;a href="https://shopclawmart.com/@thebrierfox" rel="noopener noreferrer"&gt;shopclawmart.com/@thebrierfox&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The token optimization skill ($29) is the current best-seller candidate — most agents are over-spending on model selection, and the routing architecture that fixes it is specific enough to warrant packaging.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;~K¹ (W. Kyle Million) / IntuiTek¹&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>aiagents</category>
      <category>devtools</category>
      <category>programming</category>
    </item>
    <item>
      <title>How I Built a Fully Autonomous AI Business in 48 Hours with Claude Code</title>
      <dc:creator>~K¹yle Million</dc:creator>
      <pubDate>Tue, 21 Apr 2026 18:29:04 +0000</pubDate>
      <link>https://forem.com/thebrierfox/how-i-built-a-fully-autonomous-ai-business-in-48-hours-with-claude-code-44mn</link>
      <guid>https://forem.com/thebrierfox/how-i-built-a-fully-autonomous-ai-business-in-48-hours-with-claude-code-44mn</guid>
      <description>&lt;h1&gt;
  
  
  How I Built a Fully Autonomous AI Business in 48 Hours with Claude Code
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;This is not a tutorial. It's a field report.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Two days ago I had an idea: could I build an entire AI consulting business — content, product delivery, payments, customer communication — that runs itself without me touching it? Not "mostly automated." Fully autonomous.&lt;/p&gt;

&lt;p&gt;Here's exactly what I built, how it works, and what I learned.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Goal: Revenue Without Babysitting
&lt;/h2&gt;

&lt;p&gt;Most "passive income" advice is either vague or manually intensive. I wanted something different: a system that generates income while I'm doing literally anything else. The constraint was Claude Code as the runtime — I wanted to prove that one tool, used correctly, could close the loop from content to cash.&lt;/p&gt;

&lt;p&gt;The system had to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generate and publish content autonomously&lt;/li&gt;
&lt;li&gt;Handle product delivery without my involvement&lt;/li&gt;
&lt;li&gt;Accept payments and fulfill orders automatically&lt;/li&gt;
&lt;li&gt;Monitor itself and recover from failures&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's what I built.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 1: The Autonomous Loop
&lt;/h2&gt;

&lt;p&gt;The foundation is a headless Claude Code execution pattern running on a cron schedule:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;*&lt;/span&gt;/10 7-23 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; bash ~/intuitek/run_task.sh &lt;span class="s2"&gt;"Check inbox for tasks. If empty, identify highest-leverage action and execute it."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every 10 minutes between 07:00 and 23:59, Claude Code wakes up, checks for work, and either executes a task or autonomously selects and completes the next highest-value action. This is the heartbeat. Everything else sits on top of it.&lt;/p&gt;

&lt;p&gt;The critical design decision: &lt;strong&gt;the agent writes its intent before executing&lt;/strong&gt;. Every significant action gets logged to &lt;code&gt;outputs/&lt;/code&gt; before the API call happens. If something crashes, the record of intent is preserved. This is what makes recovery possible.&lt;/p&gt;
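&lt;p&gt;A minimal sketch of the write-before-execute pattern, assuming one JSON record per action in an &lt;code&gt;outputs/&lt;/code&gt; directory. The record layout, the file naming, and the &lt;code&gt;run_with_intent&lt;/code&gt; helper are illustrative choices, not the format the system actually uses.&lt;/p&gt;

```python
import json
import time
from pathlib import Path


def run_with_intent(outputs_dir, action_name, params, action):
    """Write an intent record, run the action, then record the outcome.

    The intent file exists on disk before action() is called, so a crash
    mid-call still leaves evidence of what was being attempted.
    The action's return value must be JSON-serializable.
    """
    outputs = Path(outputs_dir)
    outputs.mkdir(parents=True, exist_ok=True)
    record_path = outputs / f"{int(time.time() * 1000)}_{action_name}.json"
    record = {"action": action_name, "params": params, "status": "intended"}
    record_path.write_text(json.dumps(record, indent=2))
    try:
        result = action(**params)
    except Exception as exc:
        record.update(status="failed", error=repr(exc))
        record_path.write_text(json.dumps(record, indent=2))
        raise
    record.update(status="done", result=result)
    record_path.write_text(json.dumps(record, indent=2))
    return record_path
```

&lt;p&gt;On the next run, any record still marked &lt;code&gt;intended&lt;/code&gt; points at an action that was interrupted and may need to be retried or checked by hand.&lt;/p&gt;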




&lt;h2&gt;
  
  
  Layer 2: The CLAUDE.md Operating Contract
&lt;/h2&gt;

&lt;p&gt;Every Claude Code session loads &lt;code&gt;CLAUDE.md&lt;/code&gt; first. This isn't documentation — it's the operating contract. It defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What the agent is allowed to do without approval (filesystem, APIs, content creation)&lt;/li&gt;
&lt;li&gt;What requires Kyle's approval before proceeding (production deploys, irreversible deletes, credential changes)&lt;/li&gt;
&lt;li&gt;Where results go (&lt;code&gt;outputs/&lt;/code&gt;) and how failures are reported (&lt;code&gt;errors.log&lt;/code&gt; + Telegram)&lt;/li&gt;
&lt;li&gt;Model routing rules — Ollama local (Tier 0) → Haiku (Tier 1) → Sonnet (Tier 2) → Opus (Tier 3 only)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this file, Claude Code is a capable tool. With it, it's an autonomous agent that makes consistent decisions across sessions.&lt;/p&gt;
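&lt;p&gt;The routing rule in that list can be read as an escalation ladder. The sketch below is one possible interpretation, not the contract's actual logic: the tier labels are descriptive names rather than API model identifiers, &lt;code&gt;call_model&lt;/code&gt; is a hypothetical callable that returns &lt;code&gt;None&lt;/code&gt; when a tier cannot handle the task, and treating Opus as opt-in is an assumption drawn from "Tier 3 only".&lt;/p&gt;

```python
# Tier 0..3, cheapest first. Labels only; not real model identifiers.
TIERS = ("ollama-local", "haiku", "sonnet", "opus")


def run_with_escalation(task, call_model, start_tier=0, max_tier=2):
    """Try tiers from cheapest to most capable and stop at the first answer.

    max_tier defaults to 2 so that Opus is reached only when a caller
    explicitly passes max_tier=3 for a Tier 3 task.
    """
    for tier in range(start_tier, max_tier + 1):
        answer = call_model(TIERS[tier], task)
        if answer is not None:
            return tier, answer
    raise RuntimeError(f"no tier up to {TIERS[max_tier]} could handle the task")
```

&lt;p&gt;Keeping the ladder in one function means the cost policy lives in a single place instead of being re-decided in every prompt.&lt;/p&gt;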




&lt;h2&gt;
  
  
  Layer 3: Product Delivery — ACE License Server
&lt;/h2&gt;

&lt;p&gt;Before content, I needed a product to sell. I built ACE (Agent Commerce Engine) on Railway:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FastAPI&lt;/strong&gt; handles the business logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stripe webhooks&lt;/strong&gt; trigger on payment completion
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fernet encryption&lt;/strong&gt; protects license keys in transit&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resend&lt;/strong&gt; delivers the license package to the customer's email within seconds of payment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The whole thing runs on Railway's free tier at zero cost until the first transaction. When a customer buys a skill on ClawMart, the webhook fires, ACE provisions the license, and Resend delivers it. No human in the loop.&lt;/p&gt;

&lt;p&gt;Total Claude Code build time: one session.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 4: The Marketplace Presence — ClawMart
&lt;/h2&gt;

&lt;p&gt;I listed 14 skills on ClawMart at &lt;code&gt;shopclawmart.com/@thebrierfox&lt;/code&gt;. These are distilled, tested skill definitions for Claude Code — things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token optimization protocols that cut API costs 40-60%&lt;/li&gt;
&lt;li&gt;Agent error recovery patterns with automatic escalation&lt;/li&gt;
&lt;li&gt;Multi-agent coordination frameworks for parallel builds&lt;/li&gt;
&lt;li&gt;CLAUDE.md templates for different production scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pricing runs from $19 (individual skills) to $149 (complete bundles). The listing descriptions are written by the agent, updated by the agent, and linked directly to the Stripe payment links that route to ACE.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 5: Content Flywheel
&lt;/h2&gt;

&lt;p&gt;The content layer is where it gets interesting. The same autonomous loop that checks the inbox also writes and publishes articles.&lt;/p&gt;

&lt;p&gt;When the inbox is empty — meaning no explicit tasks — the agent evaluates the current state of the revenue goal and identifies the highest-leverage idle-state action. Right now, that's content creation and distribution. In 48 hours:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;19 articles published&lt;/strong&gt; on dev.to, all tagged &lt;code&gt;claudecode&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;15 posts&lt;/strong&gt; on r/ClaudeCode&lt;/li&gt;
&lt;li&gt;Zero manual writing by me&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent doesn't just write. It follows a posting strategy derived from Reddit's spam filter behavior (new accounts get filtered if they post external links in the body — so the agent posts full content inline and drops the dev.to URL as the first comment). It checks lock files before posting to avoid race conditions between the cron loop and interactive sessions. It refreshes OAuth tokens automatically when they expire.&lt;/p&gt;

&lt;p&gt;The content directly pre-sells the ClawMart skills. An article about model routing cost optimization leads naturally to: "The token optimization skill on ClawMart implements exactly this — $29."&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 6: The Shared Mind
&lt;/h2&gt;

&lt;p&gt;Two Claude Code instances run on the same machine. They share a file called &lt;code&gt;SHARED_MIND.md&lt;/code&gt; — a joint operational picture that both read at every session start and update after every significant action.&lt;/p&gt;

&lt;p&gt;This is what prevents collisions and duplicated work. Before either instance posts to Reddit, it writes a lock file. The other instance sees the lock and skips. When one instance learns something new — a Reddit spam filter behavior, an API quirk, a token refresh flow — it writes that to &lt;code&gt;SHARED_MIND.md&lt;/code&gt; and both instances have it on the next run.&lt;/p&gt;

&lt;p&gt;Two agents, one operating picture. No central coordinator. No message bus. Just a shared file and a protocol both instances follow.&lt;/p&gt;
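&lt;p&gt;The lock file step is the part most worth getting exactly right, because "check, then create" leaves a gap in which both instances can win. A minimal sketch using an atomic exclusive create; the &lt;code&gt;posting_lock&lt;/code&gt; name and the lock path are hypothetical, and stale locks left behind by a killed process are not handled.&lt;/p&gt;

```python
import os
from contextlib import contextmanager


@contextmanager
def posting_lock(lock_path):
    """Yield True if this process acquired the lock, False if another holds it.

    O_CREAT | O_EXCL makes creation atomic: exactly one process can create
    the file, so there is no window between checking and writing.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))  # record the owner for debugging
        yield True
    finally:
        os.unlink(lock_path)  # release the lock even if the body raised
```

&lt;p&gt;Usage is a single &lt;code&gt;with posting_lock(path) as acquired:&lt;/code&gt; block around the posting action; when &lt;code&gt;acquired&lt;/code&gt; is false, the instance skips and tries again on its next run.&lt;/p&gt;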




&lt;h2&gt;
  
  
  What Works, What Doesn't
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Works well:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Headless execution via &lt;code&gt;claude -p "prompt" --allowedTools "..."&lt;/code&gt; — reliable, logs cleanly, exits with meaningful codes&lt;/li&gt;
&lt;li&gt;CLAUDE.md as the operating contract — the agent makes consistent decisions across sessions because the boundaries are explicit&lt;/li&gt;
&lt;li&gt;Write-before-execute — recording intent in &lt;code&gt;outputs/&lt;/code&gt; before API calls makes recovery trivial&lt;/li&gt;
&lt;li&gt;Lock file coordination — simple, zero infrastructure, works across processes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Doesn't work yet:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Revenue. The funnel is fully built. The content is live. The payment flow works end-to-end. But the distribution reach is still early — 15 Reddit posts at 1 upvote each, articles indexed but not yet surfaced. This is a time variable, not a technical one.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What surprised me:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent's judgment is actually good. When I give it a goal rather than a task, it picks sensible actions. The approval boundary in CLAUDE.md is what makes this safe — it knows exactly where to ask versus where to act.&lt;/li&gt;
&lt;li&gt;Context collapse is the real failure mode. When a session runs too long without writing progress to &lt;code&gt;outputs/&lt;/code&gt;, partial results get lost. The solution is obsessive logging — write the intent, write the progress, write the result, in that order.&lt;/li&gt;
&lt;li&gt;Concurrent sessions cause race conditions. When I'm actively working in Claude Code and the cron fires simultaneously, they can conflict. The lock file protocol mostly solves this, but it needs explicit enforcement at every autonomous posting action.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Stack (All Running Now)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;Claude Code (headless + interactive)&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scheduling&lt;/td&gt;
&lt;td&gt;System crontab → run_task.sh&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local inference&lt;/td&gt;
&lt;td&gt;Ollama qwen2.5:7b&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Product marketplace&lt;/td&gt;
&lt;td&gt;ClawMart (14 listings)&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Payment&lt;/td&gt;
&lt;td&gt;Stripe (3 payment links)&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;License delivery&lt;/td&gt;
&lt;td&gt;ACE on Railway&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email delivery&lt;/td&gt;
&lt;td&gt;Resend (intuitek.ai domain)&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content&lt;/td&gt;
&lt;td&gt;dev.to + Reddit (autonomous loop)&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shared memory&lt;/td&gt;
&lt;td&gt;SHARED_MIND.md + Supabase LTM&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Notifications&lt;/td&gt;
&lt;td&gt;Telegram bot&lt;/td&gt;
&lt;td&gt;Live&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Total infrastructure cost: &lt;strong&gt;$0/month&lt;/strong&gt; until first sale. Railway free tier, Supabase free tier, ClawMart percentage-based, Resend free tier.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Skills That Power This
&lt;/h2&gt;

&lt;p&gt;Everything described here is available as distilled, tested skill definitions on my ClawMart shop at &lt;code&gt;shopclawmart.com/@thebrierfox&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The skills aren't theoretical — they're extracted from this exact system. The error recovery patterns, the model routing logic, the CLAUDE.md templates — all of it is packaged and ready to drop into your own Claude Code setup.&lt;/p&gt;

&lt;p&gt;If you're building autonomous agents with Claude Code and want to skip the 48 hours of trial and error, the complete bundle at $149 covers the full stack. Individual skills start at $19.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The purchase flow is live. The content is running. The monitoring is in place. The next milestone is first sale.&lt;/p&gt;

&lt;p&gt;That's not a Claude Code problem — it's a distribution problem. More eyeballs, better SEO, more posts across more communities. The autonomous loop handles that without my involvement.&lt;/p&gt;

&lt;p&gt;The machine is running. Now we see if it converts.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;W. Kyle Million (K¹) / IntuiTek¹&lt;/em&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;Claude Code skills: &lt;a href="https://shopclawmart.com/@thebrierfox" rel="noopener noreferrer"&gt;shopclawmart.com/@thebrierfox&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>devtools</category>
      <category>aiagents</category>
      <category>sideprojects</category>
    </item>
  </channel>
</rss>
