<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: mkxultra</title>
    <description>The latest articles on Forem by mkxultra (@mkxultra).</description>
    <link>https://forem.com/mkxultra</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2095449%2F7e0d9c0f-2d45-48af-bcd4-6b458c1f99ef.jpg</url>
      <title>Forem: mkxultra</title>
      <link>https://forem.com/mkxultra</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mkxultra"/>
    <language>en</language>
    <item>
      <title>Does How You Feed Context to an LLM Agent Change What It Remembers? I Tested With Canary Strings.</title>
      <dc:creator>mkxultra</dc:creator>
      <pubDate>Tue, 31 Mar 2026 08:03:49 +0000</pubDate>
      <link>https://forem.com/mkxultra/does-how-you-feed-context-to-an-llm-agent-change-what-it-remembers-i-tested-with-canary-strings-55le</link>
      <guid>https://forem.com/mkxultra/does-how-you-feed-context-to-an-llm-agent-change-what-it-remembers-i-tested-with-canary-strings-55le</guid>
      <description>&lt;p&gt;I work with three LLM agents daily — Claude Code, Codex CLI, and Gemini CLI. Before each task, I load project context (design docs, data models, implementation guides) into the agent so it understands what it's working on.&lt;/p&gt;

&lt;p&gt;But there are multiple ways to load that context. You can have the agent run a shell command and read the output. You can point it at a file. You can split the context across several files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does the delivery method affect how much the agent actually retains?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I ran a small experiment using canary strings — unique, unpredictable markers embedded throughout the context — to measure retention objectively. Here's what I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  Background: What's a "Base Session"?
&lt;/h2&gt;

&lt;p&gt;A base session is a pattern for multi-agent development: you load project context into an agent once, record the &lt;code&gt;session_id&lt;/code&gt;, and resume that session for every subsequent task. Instead of re-explaining your project from scratch each time, the agent picks up where it left off — already understanding your codebase.&lt;/p&gt;

&lt;p&gt;This article isn't about the base session pattern itself (I wrote about that &lt;a href="https://dev.to/mkxultra/i-stopped-repeating-myself-to-every-ai-agent-the-base-session-pattern-50d5"&gt;separately here&lt;/a&gt;). But the experiment below tests &lt;em&gt;how&lt;/em&gt; best to construct one — specifically, whether the method you use to inject context changes what the agent retains.&lt;/p&gt;

&lt;h2&gt;
  
  
  Experiment Design
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Three Delivery Patterns
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;A&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shell command&lt;/td&gt;
&lt;td&gt;Agent executes a command; reads the stdout as context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;B&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single file&lt;/td&gt;
&lt;td&gt;Context written to one file; agent reads it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;C&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Split files&lt;/td&gt;
&lt;td&gt;Context split across 5 files; agent reads them in order&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Context
&lt;/h3&gt;

&lt;p&gt;I used a real development context from my own project (konkondb, an AI materialized-view system): 2,664 lines, roughly 50,000 tokens. It includes design docs, data models, CLI specs, and implementation guides — the kind of thing you'd actually feed an agent before asking it to write code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Canary Strings: Measuring Retention Objectively
&lt;/h3&gt;

&lt;p&gt;I inserted 19 unique canary strings at section boundaries throughout the context — one before each of the 18 sections, plus one trailing marker after the last section:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@@CANARY_001_240ba3af@@   &amp;lt;- before section 1
[section 1 content]

---

@@CANARY_002_dfdf025b@@   &amp;lt;- before section 2
[section 2 content]

---

...

@@CANARY_019_ee18680a@@   &amp;lt;- after the last section
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each canary is generated from a SHA-256 hash of its section's content — deterministic to regenerate, yet effectively impossible for a model to guess. The agent is never told how many canaries exist or what they look like.&lt;/p&gt;

&lt;p&gt;After building the base session, I swap out the canary-injected files for the originals (or delete them entirely), resume the session, and ask: &lt;em&gt;"How many canary strings were there? List all of them."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If the agent recalls all 19 with exact values, the full context at least reached the model and was processed well enough to reproduce those markers. Combined with the separate comprehension and detail-accuracy tests, this gives a reasonable picture of retention. Any gaps in canary recall reveal which sections may have been lost or degraded.&lt;/p&gt;
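&lt;p&gt;To make the grading step concrete, here is a minimal scoring sketch (my own illustration, not the exact script from the experiment): it compares the agent's recalled list against the ground-truth canaries and reports recall, precision, and any missing markers.&lt;/p&gt;

```python
def score_recall(ground_truth, recalled):
    """Score a canary-recall answer against the known canary list."""
    truth = set(ground_truth)
    answer = set(recalled)
    hits = truth.intersection(answer)
    # recall: how many real canaries the agent reproduced
    recall = len(hits) / len(truth)
    # precision: how many of its listed canaries were real (no hallucinations)
    precision = len(hits) / len(answer) if answer else 0.0
    missing = sorted(truth.difference(answer))
    return recall, precision, missing

# A run shaped like Gemini's Pattern A: 14 of 19 recalled, none hallucinated
# (the all-zero hex parts are placeholders, not real canary values)
truth = [f"@@CANARY_{i:03d}_{'0' * 8}@@" for i in range(1, 20)]
recalled = truth[:3] + truth[8:]          # canaries 004-008 never arrived
recall, precision, missing = score_recall(truth, recalled)
# recall = 14/19, precision = 1.0, missing holds the five lost markers
```

The `missing` list is what points you at which sections were lost or degraded.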

&lt;h3&gt;
  
  
  Additional Tests
&lt;/h3&gt;

&lt;p&gt;Beyond canary recall, I also ran:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Comprehension test (5 questions)&lt;/strong&gt; — drawn from the beginning, middle, and end of the context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detail accuracy test (2 questions)&lt;/strong&gt; — exact SQL constraints, function signatures&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Preventing Re-reads
&lt;/h3&gt;

&lt;p&gt;To ensure I was measuring &lt;em&gt;memory&lt;/em&gt;, not the agent's ability to re-fetch:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The canary-injected generation source was swapped back to the original, so rerunning the same command would no longer produce canaries&lt;/li&gt;
&lt;li&gt;For Patterns B/C, context files were deleted&lt;/li&gt;
&lt;li&gt;The agent was explicitly instructed to answer from memory only — no file reads, no command execution&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Caveats Up Front
&lt;/h3&gt;

&lt;p&gt;This was exploratory:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;N=1&lt;/strong&gt; per condition. LLM output is non-deterministic; I'm not claiming statistical significance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single corpus&lt;/strong&gt;: one project's context (2,664 lines, ~50K tokens). Different domains or scales may behave differently.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single point in time&lt;/strong&gt;: specific agent/model versions as of March 2026. Updates may change results.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Read this as directional signal, not proof.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agents Tested
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Label&lt;/th&gt;
&lt;th&gt;Agent / Model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;claude-ultra&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude Opus 4.6 via Claude Code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;codex-ultra&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OpenAI via Codex CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;gemini-ultra&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Gemini 3.1 Pro Preview via Gemini CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Canary Recall
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                     Pattern A          Pattern B          Pattern C
                    (shell command)    (single file)      (5-file split)
Claude  |################### 19/19 |################### 19/19 |################### 19/19
Codex   |################### 19/19 |################### 19/19 |################### 19/19
Gemini  |##############_____ 14/19 |################### 19/19 |################### 19/19
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude and Codex recalled all 19 canaries across every pattern. Gemini scored 19/19 on Patterns B and C, but &lt;strong&gt;dropped to 14/19 on Pattern A&lt;/strong&gt; (shell command execution).&lt;/p&gt;

&lt;p&gt;Comprehension and detail-accuracy tests: &lt;strong&gt;all agents passed all questions across all patterns&lt;/strong&gt; — including Gemini on Pattern A.&lt;/p&gt;

&lt;h3&gt;
  
  
  Detailed Scores
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;th&gt;Recall&lt;/th&gt;
&lt;th&gt;Precision&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;claude-ultra&lt;/td&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;td&gt;19/19&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;claude-ultra&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;19/19&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;claude-ultra&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;td&gt;19/19&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;codex-ultra&lt;/td&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;td&gt;19/19&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;codex-ultra&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;19/19&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;codex-ultra&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;td&gt;19/19&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gemini-ultra&lt;/td&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;td&gt;19/19&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;14/19 (74%)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;14/14 (100%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gemini-ultra&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;19/19&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gemini-ultra&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;td&gt;19/19&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;td&gt;19/19 (100%)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Note the precision column for Gemini Pattern A: 14/14 (100%). Every canary it listed was correct. It didn't hallucinate any — it simply never received the missing five.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happened in Gemini Pattern A
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Tool-Level Truncation, Not Memory Failure
&lt;/h3&gt;

&lt;p&gt;When Gemini executed the shell command, the output exceeded the CLI tool's size limit. The middle section was truncated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;... [56,330 characters omitted] ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This physically removed canaries 004 through 008. They never reached the model.&lt;/p&gt;

&lt;p&gt;Gemini's own API stats confirm it:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;API calls&lt;/th&gt;
&lt;th&gt;Input tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;A (shell command)&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;22,438&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B (single file)&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;58,352&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;C (5-file split)&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;51,778&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Pattern A delivered less than half the input tokens of the other patterns. The context simply didn't arrive.&lt;/p&gt;

&lt;p&gt;Gemini was honest about it, too — it reported that the output had been truncated and that it could only list what it actually read. No hallucinated canaries, no guessing.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Each Agent Handled Large Output
&lt;/h3&gt;

&lt;p&gt;The most interesting secondary finding was how each agent &lt;em&gt;behaved&lt;/em&gt; when the shell command produced oversized output:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Behavior on Pattern A&lt;/th&gt;
&lt;th&gt;Tool calls&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Detected oversized output → autonomously saved to file → read in 500-line chunks&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Codex&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Successfully captured full output directly&lt;/td&gt;
&lt;td&gt;(not measured)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemini&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Accepted truncated output → answered honestly from what it received&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Claude's self-recovery is notable: nobody instructed it to save the output to a file and re-read it. The agent chose that strategy on its own when it detected the output was too large for a single tool response.&lt;/p&gt;

&lt;p&gt;For reference, here are Claude's tool-call counts across all patterns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Tool calls&lt;/th&gt;
&lt;th&gt;Read strategy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Bash → save to file → 6 chunked reads (500 lines each)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;7 chunked reads of ctx_full.md (500 lines each, due to 2000-line read limit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;5 files, one read each (~530 lines/file, within limit)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For Claude, Pattern C required the fewest tool calls of the three patterns. At this context size the efficiency gain was minor, but when you're pushing closer to tool-read limits, pre-split files reduce round-trips and the risk of hitting per-call size caps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. At this scale, delivery method doesn't matter — if the context arrives intact
&lt;/h3&gt;

&lt;p&gt;With 2,664 lines (~50K tokens) against context windows of 128K–1M tokens, there was plenty of headroom. Claude and Codex retained everything regardless of how context was delivered.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When context window capacity isn't the bottleneck, the delivery method doesn't measurably affect retention.&lt;/strong&gt; Note that this experiment measured recall of canary markers and answers to comprehension questions; it did not verify whether split-file versus single-file loading affects the model's internal processing in other ways.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The only failure was in the toolchain, not the model
&lt;/h3&gt;

&lt;p&gt;Gemini's 14/19 wasn't a memory problem. It was a plumbing problem. The context was truncated before it ever reached the model.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Context retention failures can happen in the toolchain's data path, not in the model's attention mechanism.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This was the experiment's single most actionable finding. When an agent seems to "forget" part of your context, check whether the context actually made it through the tool pipeline before blaming the model.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. File reads are safer than shell command output
&lt;/h3&gt;

&lt;p&gt;Shell command output is unpredictable in size. It can exceed tool limits, get truncated, or behave differently across CLI implementations. File reads let you verify the content and size beforehand, and splitting is trivial.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Recommendations
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Priority&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Do this&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Use file reads, not shell commands, to load context into agent sessions — avoids truncation risk&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Should do&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Split files to 500–1,000 lines — stays within typical tool read limits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Nice to have&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Embed canary strings in your context — gives you an objective QA check on session quality&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
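&lt;p&gt;The "Should do" row is easy to automate. Here is a splitting sketch — the 500-line chunk size and the &lt;code&gt;ctx_part_NN.md&lt;/code&gt; naming are illustrative, not prescriptive:&lt;/p&gt;

```python
from pathlib import Path

def split_context(src="ctx_full.md", lines_per_file=500):
    """Split one context file into numbered parts that stay under tool read limits."""
    lines = Path(src).read_text().splitlines(keepends=True)
    parts = []
    for n, start in enumerate(range(0, len(lines), lines_per_file), start=1):
        part = Path(f"ctx_part_{n:02d}.md")
        part.write_text("".join(lines[start:start + lines_per_file]))
        parts.append(part.name)
    return parts
```

Splitting at section boundaries rather than raw line counts works too, as long as each part stays under your agent's per-read limit.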

&lt;h2&gt;
  
  
  How Canary Strings Work
&lt;/h2&gt;

&lt;p&gt;If you want to try this yourself, the core idea is simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_canary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Generate a unique canary from section content hash.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;hex_part&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()[:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@@CANARY_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;03&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;hex_part&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;@@&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Insert one at each section boundary in your context file. After the agent loads the context, ask it to list every canary it found. Compare against your ground truth.&lt;/p&gt;
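&lt;p&gt;A minimal injection pass might look like the sketch below. It reuses &lt;code&gt;make_canary&lt;/code&gt; and assumes, as in the excerpt earlier, that sections are separated by &lt;code&gt;---&lt;/code&gt; dividers — adapt the separator to your own format:&lt;/p&gt;

```python
import hashlib

def make_canary(index, content):
    """Generate a unique canary from section content hash."""
    hex_part = hashlib.sha256(content.encode()).hexdigest()[:8]
    return f"@@CANARY_{index:03d}_{hex_part}@@"

def inject_canaries(text, separator="\n---\n"):
    """Prefix each section with its canary and append one trailing marker."""
    sections = text.split(separator)
    marked = [
        f"{make_canary(i, section)}\n{section}"
        for i, section in enumerate(sections, start=1)
    ]
    trailing = make_canary(len(sections) + 1, text)
    return separator.join(marked) + f"\n\n{trailing}\n"
```

Save the generated canary list at injection time — that list is your ground truth for the recall question later.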

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Larger contexts (100K+ tokens)&lt;/strong&gt;: At the edge of context windows, delivery method differences might actually surface.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple trials (N=5+)&lt;/strong&gt;: Statistical confidence instead of directional signal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Systematic study of agent recovery behavior&lt;/strong&gt;: Claude's autonomous fallback to file-based reading was unexpected and worth exploring across more conditions.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This experiment was run in March 2026 using Claude Code (Opus 4.6), Codex CLI (OpenAI), and Gemini CLI (3.1 Pro Preview). Results reflect those specific versions and may change with updates.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>codex</category>
      <category>gemini</category>
    </item>
    <item>
      <title>I Stopped Repeating Myself to Every AI Agent — The 'Base Session' Pattern</title>
      <dc:creator>mkxultra</dc:creator>
      <pubDate>Tue, 10 Mar 2026 09:21:16 +0000</pubDate>
      <link>https://forem.com/mkxultra/i-stopped-repeating-myself-to-every-ai-agent-the-base-session-pattern-50d5</link>
      <guid>https://forem.com/mkxultra/i-stopped-repeating-myself-to-every-ai-agent-the-base-session-pattern-50d5</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;base session&lt;/strong&gt; is an LLM agent session pre-loaded with your project context, saved by &lt;code&gt;session_id&lt;/code&gt;, and resumed as many times as you need.&lt;/li&gt;
&lt;li&gt;It works across Claude Code, Codex CLI, and Gemini CLI — same file, same prompt, any agent.&lt;/li&gt;
&lt;li&gt;Separate &lt;strong&gt;behavior&lt;/strong&gt; (agent-specific rule files) from &lt;strong&gt;knowledge&lt;/strong&gt; (shared project context). That separation is the key to scaling multi-agent workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Problem: Repeating Yourself Three Times
&lt;/h2&gt;

&lt;p&gt;I use three coding agents daily — Claude Code, Codex CLI, and Gemini CLI. At some point I noticed a pattern: &lt;strong&gt;I was giving the same explanation over and over.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"This project uses an append-only database. The design docs are in &lt;code&gt;docs/design/&lt;/code&gt;. The build system works like this…"&lt;/p&gt;

&lt;p&gt;Every new task, every agent, the same preamble. My design docs are around 4,000 lines (~50K tokens). Reading them in takes about two minutes per agent. Three agents means six minutes of context loading before any real work starts — and that's just for one task.&lt;/p&gt;

&lt;p&gt;On top of that, each agent has its own config format. &lt;code&gt;CLAUDE.md&lt;/code&gt; is Claude-only, &lt;code&gt;.cursorrules&lt;/code&gt; is Cursor-only, and so on. Maintaining the same project knowledge across different formats is tedious and error-prone.&lt;/p&gt;

&lt;p&gt;I needed a way to &lt;strong&gt;load context once and reuse it everywhere&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is a Base Session?
&lt;/h2&gt;

&lt;p&gt;A base session is simple: you feed your project context to an agent, record the &lt;code&gt;session_id&lt;/code&gt;, and resume from that session whenever you need to work.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────┐
│  Phase 1: Build the Base Session     │
│                                      │
│  "Read ctx_full.md and understand    │
│   this project."                     │
│  → Agent loads context               │
│  → You record the session_id         │
└──────────────┬───────────────────────┘
               │ session_id
               ▼
┌──────────────────────────────────────┐
│  Phase 2–N: Work (as many times      │
│  as you need)                        │
│                                      │
│  Resume session by session_id        │
│  "Implement the delete feature"      │
│  "Write tests for the sync module"   │
│  "Review this PR"                    │
│  → Agent already understands the     │
│    project                           │
└──────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You build once. You resume many times. The agent starts every task already understanding your project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why It Helps
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. You Stop Paying the Setup Tax
&lt;/h3&gt;

&lt;p&gt;Without a base session, every task begins with context loading:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Task A → load design docs (~2 min, 50K input tokens) → work
Task B → load design docs (~2 min, 50K input tokens) → work
Task C → load design docs (~2 min, 50K input tokens) → work
─────────────────────────────────────────────
Total: ~6 min loading, 150K input tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With a base session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Build base session (~2 min, 50K input tokens) → record session_id

Task A → resume session (seconds, cache hit) → work
Task B → resume session (seconds, cache hit) → work
Task C → resume session (seconds, cache hit) → work
─────────────────────────────────────────────
Total: ~2 min loading, 50K tokens + cached reads
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you resume a session, the prior context can benefit from the provider's &lt;strong&gt;prompt cache&lt;/strong&gt;. For example, with the Anthropic API, cached input costs 1/10 of fresh input — up to a 90% reduction. Other providers have their own caching mechanisms with varying savings. Either way, the more tasks you run from the same base, the more the savings compound.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;base_session ─┬─ implementation task A
              ├─ implementation task B
              └─ review task C
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
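&lt;p&gt;As a back-of-the-envelope sketch of that compounding (the numbers are illustrative; the 1/10 multiplier is the Anthropic cached-input rate mentioned above, and real bills depend on cache TTLs and hit rates):&lt;/p&gt;

```python
def effective_input_tokens(tasks, context_tokens=50_000, cache_discount=0.1):
    """Cost-equivalent input tokens: reload-per-task vs. one base session with cached resumes."""
    reload_per_task = tasks * context_tokens                  # fresh load every time
    base_session = context_tokens + tasks * context_tokens * cache_discount
    return reload_per_task, base_session

# Three tasks over a ~50K-token context:
reload_cost, base_cost = effective_input_tokens(3)
# reload_cost = 150_000 vs. base_cost = 65_000 token-equivalents
```

The gap widens with every additional task resumed from the same base.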



&lt;h3&gt;
  
  
  2. One Context File, Any Agent
&lt;/h3&gt;

&lt;p&gt;Instead of maintaining separate config files for each agent (e.g. &lt;code&gt;CLAUDE.md&lt;/code&gt; for Claude, &lt;code&gt;.cursorrules&lt;/code&gt; for Cursor), you write &lt;strong&gt;one plain Markdown file&lt;/strong&gt; and feed it to any agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ctx_full.md  ──→  Claude:  "Read ctx_full.md" → session_id
             ──→  Codex:   "Read ctx_full.md" → session_id
             ──→  Gemini:  "Read ctx_full.md" → session_id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same file. Same instruction. No format conversion.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Behavior vs. Knowledge — A Clean Separation
&lt;/h3&gt;

&lt;p&gt;This doesn't replace agent-specific config files. It complements them. The key insight is that they serve different roles:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Agent-specific config (CLAUDE.md, etc.)&lt;/th&gt;
&lt;th&gt;Base session&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Role&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Behavior — rules, style, conventions&lt;/td&gt;
&lt;td&gt;Knowledge — design, structure, context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scope&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One specific agent&lt;/td&gt;
&lt;td&gt;Any agent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Updated when&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Project conventions change&lt;/td&gt;
&lt;td&gt;Design docs change / new task cycle&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Stored as&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;File in the repo&lt;/td&gt;
&lt;td&gt;Session (referenced by session_id)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Example&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"Use conventional commits"&lt;/td&gt;
&lt;td&gt;"This DB is append-only with tombstone deletes…"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Rules go in config files. Knowledge goes in base sessions.&lt;/strong&gt; When you switch agents, the rules change but the knowledge stays the same.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. It Fits Emerging Agent-in-Agent Workflows
&lt;/h3&gt;

&lt;p&gt;A pattern that's becoming more realistic is an orchestrator agent spawning child agents for parallel implementation, review, and testing. Each child needs the project context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Orchestrator
  ├─ Agent A (implement)  ← needs project context
  ├─ Agent B (review)     ← needs the same context
  └─ Agent C (test)       ← needs the same context
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without base sessions, that's three full context loads every time. If you pre-build a base session per agent, each child can resume from its own &lt;code&gt;session_id&lt;/code&gt; and start working immediately. The initial construction cost pays for itself once you start running the same agents repeatedly.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Build One
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Prepare your context file
&lt;/h3&gt;

&lt;p&gt;Gather the project knowledge you want every agent to have. Plain Markdown works best — it's readable by any agent and easy to maintain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Concatenate design docs&lt;/span&gt;
&lt;span class="nb"&gt;cat &lt;/span&gt;docs/design/&lt;span class="k"&gt;*&lt;/span&gt;.md &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; ctx_full.md

&lt;span class="c"&gt;# Or use a tool to generate a project summary&lt;/span&gt;
&lt;span class="c"&gt;# (whatever fits your workflow)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tip:&lt;/strong&gt; Keep it focused. Don't dump your entire codebase. Include design decisions, data models, key APIs, and conventions. In my experience, a well-curated context file works better than a massive one — agents lose signal in noise.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Load context and record the session
&lt;/h3&gt;

&lt;p&gt;Feed the context file to each agent you use:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude Code:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--output-format&lt;/span&gt; stream-json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"Read ctx_full.md and understand this project's architecture."&lt;/span&gt;
&lt;span class="c"&gt;# Extract session_id from JSON output&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Codex CLI:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codex &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;--full-auto&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"Read ctx_full.md and understand this project's architecture."&lt;/span&gt;
&lt;span class="c"&gt;# Extract session_id from JSON output&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Gemini CLI:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gemini &lt;span class="nt"&gt;--output-format&lt;/span&gt; json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"Read ctx_full.md and understand this project's architecture."&lt;/span&gt;
&lt;span class="c"&gt;# Extract session_id from JSON output&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; CLI option names vary by version. Check &lt;code&gt;--help&lt;/code&gt; for your installed version.&lt;/p&gt;
&lt;/blockquote&gt;
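&lt;p&gt;The "extract session_id" comments above can be scripted with &lt;code&gt;jq&lt;/code&gt;. A minimal sketch follows; the field name &lt;code&gt;session_id&lt;/code&gt; and the sample JSON are assumptions, so inspect your CLI's actual output schema first (it varies by tool and version, and stream-JSON output may emit several JSON lines per run):&lt;/p&gt;

```shell
# Sketch: capture the session_id from an agent's JSON output with jq.
# The "session_id" field name and the sample payload are assumptions --
# check the real output of your installed CLI version.
json='{"type":"system","session_id":"abc-123"}'  # stand-in for real agent output
session_id=$(printf '%s' "$json" | jq -r '.session_id')
echo "recorded session: $session_id"
```

&lt;p&gt;Store the captured value wherever your workflow expects it (an env var, a dotfile, a task runner variable) so the resume step in the next section can find it.&lt;/p&gt;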

&lt;h3&gt;
  
  
  Step 3: Resume and work
&lt;/h3&gt;

&lt;p&gt;Use the recorded &lt;code&gt;session_id&lt;/code&gt; to pick up where you left off. The agent already knows your project.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interactive:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Claude (--fork-session keeps the base session clean)&lt;/span&gt;
claude &lt;span class="nt"&gt;-r&lt;/span&gt; &amp;lt;session_id&amp;gt; &lt;span class="nt"&gt;--fork-session&lt;/span&gt;

&lt;span class="c"&gt;# Codex&lt;/span&gt;
codex resume &amp;lt;session_id&amp;gt;

&lt;span class="c"&gt;# Gemini&lt;/span&gt;
gemini &lt;span class="nt"&gt;-r&lt;/span&gt; &amp;lt;session_id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Non-interactive (scripting / automation):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Claude (--fork-session keeps the base session clean)&lt;/span&gt;
claude &lt;span class="nt"&gt;-r&lt;/span&gt; &amp;lt;session_id&amp;gt; &lt;span class="nt"&gt;--fork-session&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output-format&lt;/span&gt; stream-json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"Implement the delete feature."&lt;/span&gt;

&lt;span class="c"&gt;# Codex&lt;/span&gt;
codex &lt;span class="nb"&gt;exec &lt;/span&gt;resume &amp;lt;session_id&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--full-auto&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"Implement the delete feature."&lt;/span&gt;

&lt;span class="c"&gt;# Gemini&lt;/span&gt;
gemini &lt;span class="nt"&gt;-r&lt;/span&gt; &amp;lt;session_id&amp;gt; &lt;span class="nt"&gt;--output-format&lt;/span&gt; json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"Implement the delete feature."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--fork-session&lt;/code&gt; flag (Claude) is worth highlighting: it branches from the base session instead of appending to it, so your base stays clean for the next task. Other agents may handle session branching differently — check their docs for equivalent options.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You:    Implement the delete feature.
Agent:  Based on the design docs, this uses a physical delete
        plus tombstone hybrid. The raw_deletions table...
        (starts working with full project understanding)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Gotchas
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Session lifetime&lt;/th&gt;
&lt;th&gt;Watch out for&lt;/th&gt;
&lt;th&gt;Mitigation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude&lt;/td&gt;
&lt;td&gt;Shorter&lt;/td&gt;
&lt;td&gt;Context window fills up during long tasks&lt;/td&gt;
&lt;td&gt;Rebuild the session when it gets too long. Keep the base session itself lean.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Codex&lt;/td&gt;
&lt;td&gt;Longer&lt;/td&gt;
&lt;td&gt;Sessions can expire after extended inactivity&lt;/td&gt;
&lt;td&gt;Rebuild when expired. Record the session_id under a dated label (e.g. &lt;code&gt;base-2026-03-10&lt;/code&gt;) so it's easy to tell when it's stale.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini&lt;/td&gt;
&lt;td&gt;Longer&lt;/td&gt;
&lt;td&gt;May serve cached (stale) file contents&lt;/td&gt;
&lt;td&gt;Explicitly instruct the agent to re-read the file if you've updated it.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
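&lt;p&gt;The dated-label mitigation from the table can be as simple as a filename convention. A sketch, assuming you've already captured a session_id (the &lt;code&gt;abc-123&lt;/code&gt; placeholder and the filename pattern are illustrative, not anything the CLIs require):&lt;/p&gt;

```shell
# Sketch: store the base session_id under a dated filename so staleness
# is visible at a glance. Filenames here are illustrative.
session_id="abc-123"       # placeholder for a real recorded id
stamp=$(date +%F)          # ISO date, e.g. 2026-03-10
printf '%s\n' "$session_id" > "base-session-$stamp.txt"
ls base-session-*.txt
```

&lt;p&gt;When the newest file is older than your last design-doc change, you know it's time to rebuild.&lt;/p&gt;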

&lt;p&gt;&lt;strong&gt;Common mistakes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stale context&lt;/strong&gt; — You update the design docs but forget to rebuild the base session. The agent works from outdated knowledge.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Polluted sessions&lt;/strong&gt; — You keep working directly in the base session instead of forking. The next resume inherits unrelated task artifacts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context overload&lt;/strong&gt; — You try to load everything. The agent's performance degrades. Curate what matters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;(Gotchas observed as of March 2026. Agent capabilities evolve quickly.)&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;base session&lt;/strong&gt; loads project context once and lets you resume by &lt;code&gt;session_id&lt;/code&gt; as many times as needed.&lt;/li&gt;
&lt;li&gt;It's &lt;strong&gt;reusable&lt;/strong&gt;: one load, many resumes. Prompt caching can reduce cost on each reuse.&lt;/li&gt;
&lt;li&gt;It's &lt;strong&gt;agent-agnostic&lt;/strong&gt;: the same Markdown file and the same prompt work for Claude, Codex, and Gemini.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rules live in config files. Knowledge lives in base sessions.&lt;/strong&gt; This separation is what makes multi-agent workflows manageable.&lt;/li&gt;
&lt;li&gt;In emerging &lt;strong&gt;Agent-in-Agent workflows&lt;/strong&gt;, pre-built base sessions let child agents skip the context-loading bottleneck and start working immediately.&lt;/li&gt;
&lt;/ul&gt;
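&lt;p&gt;Since the workflow is agent-agnostic, the per-agent resume commands can be folded into one helper. This is a sketch under the flag names shown earlier; CLI options vary by version (check &lt;code&gt;--help&lt;/code&gt;), and it assumes &lt;code&gt;claude&lt;/code&gt;, &lt;code&gt;codex&lt;/code&gt;, and &lt;code&gt;gemini&lt;/code&gt; are on your PATH:&lt;/p&gt;

```shell
# Sketch: one resume entry point across agents. Flag names follow the
# examples above but vary by CLI version -- verify with --help.
resume_base() {
  agent=$1; sid=$2; shift 2
  case "$agent" in
    claude) claude -r "$sid" --fork-session "$@" ;;
    codex)  codex resume "$sid" "$@" ;;
    gemini) gemini -r "$sid" "$@" ;;
    *) echo "unknown agent: $agent" >&2; return 1 ;;
  esac
}

# Usage: resume_base claude abc-123 -p "Implement the delete feature."
```

&lt;p&gt;A wrapper like this also makes it trivial to fan the same task out to all three agents from a script.&lt;/p&gt;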

&lt;p&gt;If you're juggling multiple AI coding agents and tired of repeating yourself, try building a base session. It's a small workflow change that compounds over time.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;To automate this workflow, I built &lt;a href="https://github.com/mkXultra/ai-cli-mcp" rel="noopener noreferrer"&gt;ai-cli-mcp&lt;/a&gt; — an MCP server that lets you operate Claude, Codex, and Gemini through a single interface: &lt;code&gt;run(model, prompt, session_id)&lt;/code&gt; to start or resume any agent, and &lt;code&gt;wait(pids)&lt;/code&gt; to collect results from multiple agents in parallel. Handy for scripting base session construction or Agent-in-Agent orchestration.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>codex</category>
      <category>claudecode</category>
      <category>gemini</category>
    </item>
  </channel>
</rss>
