Building a 24/7 Claude Code Wrapper? Here's Why Each Subprocess Burns 50K Tokens

Jaehoon Jung — Sun, 22 Feb 2026 18:14:48 +0000

If you're building a wrapper around Claude Code (spawning the claude CLI as a subprocess for automation, bots, or multi-agent orchestration), you might be burning through your token quota much faster than expected. Here's why, and a concrete fix.

The Problem

When your wrapper spawns a claude CLI subprocess, each process starts with no memory of earlier ones, yet it still loads your entire global configuration:

~/CLAUDE.md (your project instructions)
All enabled plugins and their skills
Every MCP server's tool descriptions
User-level settings from ~/.claude/settings.json

Every single turn of every subprocess re-injects all of this. In our case (building MAMA, a memory plugin with hooks + MCP server), a single subprocess turn consumed ~50K tokens before doing any actual work.

Run /context in a fresh session to see for yourself — MCP tool descriptions alone can eat 10-20K tokens.

The Numbers

Before isolation:
  Subprocess turn 1: ~50K tokens (system prompt + plugins + MCP tools)
  Subprocess turn 5: ~250K tokens cumulative

After isolation:
  Subprocess turn 1: ~5K tokens
  Subprocess turn 5: ~25K tokens cumulative

That's a 10x reduction in overhead, both per turn (~50K to ~5K) and cumulatively (~250K to ~25K over five turns).
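The cumulative figures follow from multiplying the per-turn overhead by the number of turns. A minimal sketch of that arithmetic, assuming the overhead stays roughly constant per turn and ignoring the conversation content itself (`cumulativeOverhead` is a hypothetical helper, not part of any tool):

```javascript
// Hypothetical helper: overhead tokens accumulated after `turns` turns,
// assuming a constant per-turn overhead (conversation content is ignored).
function cumulativeOverhead(perTurnTokens, turns) {
  return perTurnTokens * turns;
}

const before = cumulativeOverhead(50_000, 5); // 250000
const after = cumulativeOverhead(5_000, 5);   // 25000
console.log(`before: ${before}, after: ${after}, ratio: ${before / after}x`);
```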

The Fix: 4-Layer Subprocess Isolation

We solved this by isolating each subprocess from the user's global settings:

Layer 1: Scoped Working Directory

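const os = require('node:os');
const path = require('node:path');
// Set cwd to a scoped workspace, NOT os.homedir()
// This prevents ~/CLAUDE.md from being auto-loaded
const workspaceDir = path.join(os.homedir(), '.mama', 'workspace');
const spawnOptions = { cwd: workspaceDir }; // pass as the options argument to spawn()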

Layer 2: Git Boundary

const fs = require('node:fs');
// workspaceDir is the scoped workspace path from Layer 1 (~/.mama/workspace)
// Create a .git/HEAD to block upward CLAUDE.md traversal
const gitDir = path.join(workspaceDir, '.git');
fs.mkdirSync(gitDir, { recursive: true });
fs.writeFileSync(path.join(gitDir, 'HEAD'), 'ref: refs/heads/main\n');
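Because a wrapper may spawn many subprocesses, it can help to make this step idempotent. The following is a minimal sketch under that assumption; `ensureGitBoundary` is a hypothetical helper name, and the demo uses a throwaway temp directory instead of the real workspace:

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: create the workspace and a minimal .git/HEAD marker.
// Safe to call before every spawn; an existing HEAD file is left untouched.
function ensureGitBoundary(workspaceDir) {
  const gitDir = path.join(workspaceDir, '.git');
  fs.mkdirSync(gitDir, { recursive: true });
  const headFile = path.join(gitDir, 'HEAD');
  if (!fs.existsSync(headFile)) {
    fs.writeFileSync(headFile, 'ref: refs/heads/main\n');
  }
  return headFile;
}

// Demo against a throwaway directory rather than ~/.mama/workspace.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'workspace-'));
console.log(ensureGitBoundary(demoDir));
```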

Layer 3: Empty Plugin Directory

// Point --plugin-dir to an empty directory
'--plugin-dir', path.join(os.homedir(), '.mama', '.empty-plugins')

Layer 4: Setting Sources

// Exclude user-level settings (which contain enabledPlugins)
'--setting-sources', 'project,local'

Why Each Layer Matters

Layer	What it blocks	Without it
Scoped cwd	~/CLAUDE.md auto-load	~5K tokens/turn of instructions
.git/HEAD	Upward CLAUDE.md traversal	Claude Code walks to ~ and finds it
--plugin-dir	Global plugin skills	Plugins inject skills every turn
--setting-sources	enabledPlugins list	settings.json re-enables plugins
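
Taken together, the four layers come down to one working directory, one marker file, and two CLI flags. Here is a rough sketch of how a wrapper might collect them in one place. The function name `buildIsolatedSpawn` and the returned object shape are assumptions for illustration; only the paths and flags come from the layers described here:

```javascript
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: gather the isolation layers into the arguments
// and options a wrapper would pass to child_process.spawn.
function buildIsolatedSpawn(baseDir = path.join(os.homedir(), '.mama')) {
  const workspaceDir = path.join(baseDir, 'workspace');
  const emptyPluginDir = path.join(baseDir, '.empty-plugins');
  return {
    command: 'claude',
    args: [
      '--plugin-dir', emptyPluginDir,       // Layer 3: no global plugin skills
      '--setting-sources', 'project,local', // Layer 4: skip user-level settings
    ],
    options: { cwd: workspaceDir },         // Layer 1: scoped working directory
    // Layer 2 is not handled here: .git/HEAD must already exist in workspaceDir.
    dirsToCreate: [workspaceDir, emptyPluginDir],
  };
}

const cfg = buildIsolatedSpawn();
console.log(cfg.command, cfg.args.join(' '), cfg.options.cwd);
```

A wrapper would create the directories in `dirsToCreate`, write the `.git/HEAD` marker, and then append its own mode-specific arguments before spawning.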

Why Wrap the CLI Instead of Using the API Directly?

You might wonder: why not just call the Anthropic API and skip all this CLI overhead?

Because the Claude Code CLI gives you a full agentic runtime for free:

Built-in tools — file read/write, bash execution, glob, grep — all wired up and ready
Agentic loop — tool calls → execution → response, handled automatically
MCP support — connect any MCP server and the CLI manages the protocol
Session persistence — resume conversations across process restarts
Permission model — sandboxed tool execution with user approval flow

Building all of this on the raw API means reimplementing thousands of lines of tool execution, file I/O, and safety checks. The CLI already did that work.

The tradeoff: each subprocess inherits global config and burns tokens. That's what the 4-layer isolation fixes — you get the full CLI runtime without the bloat.

One-Shot vs Persistent Process

Pattern A: One-shot with resume

claude -p "<prompt>" \
  --append-system-prompt "<identity>" \
  --resume <session-id>

Each call re-sends full history + system prompt. After 10 turns the system prompt has been sent 10 times.
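
From Node, the same call is usually easier to build as an argument array, which avoids shell quoting issues in the prompt. This is a sketch only: `buildOneShotArgs` is a hypothetical helper, the session id shown is a placeholder, and how the wrapper obtains a real session id is left out.

```javascript
// Hypothetical helper: argv for one turn of Pattern A.
// The first turn has no session yet, so --resume is added only when an id is known.
function buildOneShotArgs(prompt, identity, sessionId) {
  const args = ['-p', prompt, '--append-system-prompt', identity];
  if (sessionId) {
    args.push('--resume', sessionId);
  }
  return args;
}

console.log(buildOneShotArgs('Summarize the repo', 'You are a release bot'));
console.log(buildOneShotArgs('Now list open TODOs', 'You are a release bot', 'session-123'));
```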

Pattern B: Persistent stream-json (our approach)

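claude --print \
  --verbose \
  --input-format stream-json \
  --output-format stream-json \
  --session-id <id>

The process stays alive, the system prompt is sent once, and messages go through stdin as newline-delimited JSON. Recent CLI versions require --verbose when --print is combined with --output-format stream-json.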
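
A rough sketch of the plumbing for this pattern follows. The input message shape is an assumption based on the CLI's streaming JSON input format and should be checked against the documentation for your version; `encodeUserMessage`, `splitJsonLines`, and `startSession` are hypothetical names, and `startSession` is defined but not called because it needs the claude CLI on PATH.

```javascript
const { spawn } = require('node:child_process');

// Assumed message shape for --input-format stream-json: one JSON object
// per line, wrapping a user message.
function encodeUserMessage(text) {
  const message = { role: 'user', content: [{ type: 'text', text }] };
  return JSON.stringify({ type: 'user', message }) + '\n';
}

// Split buffered stdout into parsed complete lines plus the unfinished tail.
function splitJsonLines(buffer) {
  const lines = buffer.split('\n');
  const rest = lines.pop(); // last piece may be an incomplete line
  const events = lines.filter((l) => l.trim() !== '').map((l) => JSON.parse(l));
  return { events, rest };
}

// Hypothetical wrapper: start one long-lived process and stream messages to it.
function startSession(sessionId, spawnOptions, onEvent) {
  const child = spawn('claude', [
    '--print',
    '--verbose', // assumption: needed for stream-json output in print mode
    '--input-format', 'stream-json',
    '--output-format', 'stream-json',
    '--session-id', sessionId,
  ], spawnOptions);
  let pending = '';
  child.stdout.setEncoding('utf8');
  child.stdout.on('data', (chunk) => {
    const { events, rest } = splitJsonLines(pending + chunk);
    pending = rest;
    events.forEach(onEvent);
  });
  return { child, send: (text) => child.stdin.write(encodeUserMessage(text)) };
}

process.stdout.write(encodeUserMessage('hello'));
```

Buffering the unfinished tail matters because stdout chunks do not necessarily end on line boundaries.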

Both patterns need the 4-layer isolation.

Try It Yourself

Open Claude Code with your usual setup
Run /context — note total token count
Imagine that multiplied by every subprocess turn

Forem: Jaehoon Jung