<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Jaehoon Jung</title>
    <description>The latest articles on Forem by Jaehoon Jung (@jungjaehoon).</description>
    <link>https://forem.com/jungjaehoon</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3785352%2Fbec52564-0508-4934-8c24-fd9af9074763.png</url>
      <title>Forem: Jaehoon Jung</title>
      <link>https://forem.com/jungjaehoon</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/jungjaehoon"/>
    <language>en</language>
    <item>
      <title>Building a 24/7 Claude Code Wrapper? Here's Why Each Subprocess Burns 50K Tokens</title>
      <dc:creator>Jaehoon Jung</dc:creator>
      <pubDate>Sun, 22 Feb 2026 18:14:48 +0000</pubDate>
      <link>https://forem.com/jungjaehoon/why-claude-code-subagents-waste-50k-tokens-per-turn-and-how-to-fix-it-41ma</link>
      <guid>https://forem.com/jungjaehoon/why-claude-code-subagents-waste-50k-tokens-per-turn-and-how-to-fix-it-41ma</guid>
      <description>&lt;p&gt;If you're building a wrapper around Claude Code — spawning &lt;code&gt;claude&lt;/code&gt; CLI as a subprocess for automation, bots, or multi-agent orchestration — you might be burning through your token quota much faster than expected. Here's why, and a concrete fix.&lt;/p&gt;

&lt;h2&gt;The Problem&lt;/h2&gt;

&lt;p&gt;When your wrapper spawns a &lt;code&gt;claude&lt;/code&gt; CLI subprocess, each process starts fresh and inherits your &lt;strong&gt;entire global configuration&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;~/CLAUDE.md&lt;/code&gt; (your project instructions)&lt;/li&gt;
&lt;li&gt;All enabled plugins and their skills&lt;/li&gt;
&lt;li&gt;Every MCP server's tool descriptions&lt;/li&gt;
&lt;li&gt;User-level settings from &lt;code&gt;~/.claude/settings.json&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Every single turn&lt;/strong&gt; of every subprocess re-injects all of this. In our case (building &lt;a href="https://github.com/jungjaehoon-lifegamez/MAMA" rel="noopener noreferrer"&gt;MAMA&lt;/a&gt;, a memory plugin with hooks + MCP server), a single subprocess turn consumed &lt;strong&gt;~50K tokens&lt;/strong&gt; before doing any actual work.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;/context&lt;/code&gt; in a fresh session to see for yourself — MCP tool descriptions alone can eat 10-20K tokens.&lt;/p&gt;

&lt;h2&gt;The Numbers&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Before isolation:
  Subprocess turn 1: ~50K tokens (system prompt + plugins + MCP tools)
  Subprocess turn 5: ~250K tokens cumulative

After isolation:
  Subprocess turn 1: ~5K tokens
  Subprocess turn 5: ~25K tokens cumulative
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's a &lt;strong&gt;10x reduction&lt;/strong&gt; in per-turn overhead, from ~50K to ~5K tokens.&lt;/p&gt;
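&lt;p&gt;The cumulative figures above follow from simple multiplication: a fixed overhead re-injected on every turn grows linearly with the number of turns. The sketch below only reproduces that arithmetic with the approximate numbers quoted in this post; the function name is hypothetical and the model ignores conversation history and caching.&lt;/p&gt;

```typescript
// Cumulative overhead when a fixed per-turn cost is re-injected on every turn.
// The per-turn figures are the approximate ones quoted in the post.
function cumulativeOverhead(perTurnTokens: number, turns: number): number {
  return perTurnTokens * turns;
}

const before = cumulativeOverhead(50000, 5); // without isolation
const after = cumulativeOverhead(5000, 5); // with isolation
const reduction = before / after;

console.log({ before, after, reduction });
```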

&lt;h2&gt;The Fix: 4-Layer Subprocess Isolation&lt;/h2&gt;

&lt;p&gt;We solved this by isolating each subprocess from the user's global settings:&lt;/p&gt;

&lt;h3&gt;Layer 1: Scoped Working Directory&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Set cwd to a scoped workspace, NOT os.homedir()&lt;/span&gt;
&lt;span class="c1"&gt;// This prevents ~/CLAUDE.md from being auto-loaded&lt;/span&gt;
&lt;span class="nx"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;homedir&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.mama&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;workspace&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Layer 2: Git Boundary&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Create a .git/HEAD to block upward CLAUDE.md traversal&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;gitDir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;workspaceDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.git&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mkdirSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gitDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;recursive&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeFileSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gitDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HEAD&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ref: refs/heads/main&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Layer 3: Empty Plugin Directory&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Point --plugin-dir to an empty directory&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--plugin-dir&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;homedir&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.mama&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.empty-plugins&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Layer 4: Setting Sources&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Exclude user-level settings (which contain enabledPlugins)&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--setting-sources&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;project,local&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Why Each Layer Matters&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What it blocks&lt;/th&gt;
&lt;th&gt;Without it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Scoped cwd&lt;/td&gt;
&lt;td&gt;~/CLAUDE.md auto-load&lt;/td&gt;
&lt;td&gt;~5K tokens/turn of instructions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;.git/HEAD&lt;/td&gt;
&lt;td&gt;Upward CLAUDE.md traversal&lt;/td&gt;
&lt;td&gt;Claude Code walks to ~ and finds it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;--plugin-dir&lt;/td&gt;
&lt;td&gt;Global plugin skills&lt;/td&gt;
&lt;td&gt;Plugins inject skills every turn&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;--setting-sources&lt;/td&gt;
&lt;td&gt;enabledPlugins list&lt;/td&gt;
&lt;td&gt;settings.json re-enables plugins&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
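&lt;p&gt;The four layers in the table above can be combined into one preparation step. The following is a minimal sketch, not the code from the MAMA PR: the helper name &lt;code&gt;prepareIsolation&lt;/code&gt;, the &lt;code&gt;IsolatedSpawn&lt;/code&gt; type, and the temporary base directory are assumptions made for illustration, while the directory names and CLI flags are the ones shown in the snippets earlier in this post.&lt;/p&gt;

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical return shape: what a wrapper would pass on to its spawn call.
interface IsolatedSpawn {
  cwd: string;
  args: string[];
}

// Hypothetical helper that applies all four layers under a base directory.
function prepareIsolation(baseDir: string): IsolatedSpawn {
  // Layer 1: scoped working directory (not the home directory).
  const workspaceDir = path.join(baseDir, "workspace");

  // Layer 2: a .git/HEAD file marks a boundary for upward CLAUDE.md lookup.
  const gitDir = path.join(workspaceDir, ".git");
  fs.mkdirSync(gitDir, { recursive: true });
  fs.writeFileSync(path.join(gitDir, "HEAD"), "ref: refs/heads/main\n");

  // Layer 3: an empty directory to hand to --plugin-dir.
  const emptyPluginDir = path.join(baseDir, ".empty-plugins");
  fs.mkdirSync(emptyPluginDir, { recursive: true });

  // Layer 4: leave user-level settings out of the setting sources.
  const args = [
    "--plugin-dir", emptyPluginDir,
    "--setting-sources", "project,local",
  ];

  return { cwd: workspaceDir, args };
}

// Demo against a throwaway temp directory instead of ~/.mama.
const demoBase = fs.mkdtempSync(path.join(os.tmpdir(), "isolation-demo-"));
const demo = prepareIsolation(demoBase);
console.log(demo.cwd, demo.args.join(" "));
```

&lt;p&gt;A wrapper would then pass these values to Node's &lt;code&gt;child_process.spawn&lt;/code&gt;, for example &lt;code&gt;spawn('claude', [...demo.args, '-p', prompt], { cwd: demo.cwd })&lt;/code&gt;, where &lt;code&gt;prompt&lt;/code&gt; is a placeholder for your own input.&lt;/p&gt;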

&lt;h2&gt;Why Wrap the CLI Instead of Using the API Directly?&lt;/h2&gt;

&lt;p&gt;You might wonder: why not just call the Anthropic API and skip all this CLI overhead?&lt;/p&gt;

&lt;p&gt;Because Claude Code CLI gives you a &lt;strong&gt;full agentic runtime for free&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Built-in tools&lt;/strong&gt; — file read/write, bash execution, glob, grep — all wired up and ready&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic loop&lt;/strong&gt; — tool calls → execution → response, handled automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP support&lt;/strong&gt; — connect any MCP server and the CLI manages the protocol&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session persistence&lt;/strong&gt; — resume conversations across process restarts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Permission model&lt;/strong&gt; — sandboxed tool execution with user approval flow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Building all of this on the raw API means reimplementing thousands of lines of tool execution, file I/O, and safety checks. The CLI already did that work.&lt;/p&gt;

&lt;p&gt;The tradeoff: each subprocess inherits global config and burns tokens. That's what the 4-layer isolation fixes — you get the full CLI runtime without the bloat.&lt;/p&gt;

&lt;h2&gt;One-Shot vs Persistent Process&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pattern A: One-shot with resume&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;prompt&amp;gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--append-system-prompt&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;identity&amp;gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resume&lt;/span&gt; &amp;lt;session-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each call re-sends the full history plus the system prompt, so after 10 turns the system prompt has been sent 10 times.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern B: Persistent stream-json&lt;/strong&gt; (our approach)&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--print&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--input-format&lt;/span&gt; stream-json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output-format&lt;/span&gt; stream-json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--session-id&lt;/span&gt; &amp;lt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The process stays alive, the system prompt is sent once, and each new message is written to the process's stdin.&lt;/p&gt;
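&lt;p&gt;With a persistent process, the wrapper has to split the subprocess's stdout back into individual messages. The sketch below assumes the stream-json output is newline-delimited JSON (one object per line); that framing is an assumption you should verify against your CLI version. The &lt;code&gt;JsonLineBuffer&lt;/code&gt; class and the sample payloads are hypothetical and do not reflect the CLI's real message schema.&lt;/p&gt;

```typescript
// Buffers stdout chunks and returns one parsed JSON value per complete line.
// Assumes newline-delimited JSON; the sample payloads below are made up.
class JsonLineBuffer {
  private pending = "";

  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line, or "" if the chunk ended on a newline.
    this.pending = lines.pop() ?? "";

    const parsed: unknown[] = [];
    for (const line of lines) {
      if (line.trim() !== "") {
        parsed.push(JSON.parse(line));
      }
    }
    return parsed;
  }
}

// A message split across two chunks is only emitted once its line is complete.
const buffer = new JsonLineBuffer();
const first = buffer.push('{"type":"example","n":1}\n{"type":"exa');
const second = buffer.push('mple","n":2}\n');
console.log(first.length, second.length);
```

&lt;p&gt;In a wrapper, &lt;code&gt;push&lt;/code&gt; would be called from the child process's stdout &lt;code&gt;data&lt;/code&gt; handler, so a message that spans two chunks is never parsed half-finished.&lt;/p&gt;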

&lt;p&gt;Both patterns need the 4-layer isolation.&lt;/p&gt;

&lt;h2&gt;Try It Yourself&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Open Claude Code with your usual setup&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;/context&lt;/code&gt; — note total token count&lt;/li&gt;
&lt;li&gt;Imagine that multiplied by every subprocess turn&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/jungjaehoon-lifegamez/MAMA/pull/43" rel="noopener noreferrer"&gt;PR with the full implementation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/jungjaehoon-lifegamez/MAMA" rel="noopener noreferrer"&gt;MAMA project&lt;/a&gt; — Memory-Augmented MCP Assistant&lt;/li&gt;
&lt;li&gt;&lt;a href="https://news.ycombinator.com/item?id=47096937" rel="noopener noreferrer"&gt;Related HN discussion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
