Forem: Sangmin Lee

How to Use Claude Code Subagents for Parallel Research

Sangmin Lee — Sun, 24 May 2026 01:40:33 +0000

Originally published at claudeguide.io/claude-code-subagents-parallel-research

Subagents are the single feature that changes how much work you can get through Claude Code per hour. Used right, a three-subagent fan-out completes cross-codebase research in 90 seconds that a linear pass would spend 7–10 minutes on. Used wrong, you burn tokens and end up with three redundant reports. This is the operating manual. For the broader Claude Code feature set, see the Claude Code Complete Guide.

TL;DR

A subagent is a fresh Claude conversation with its own context window, started by your main agent to do a scoped task.
Use subagents when the work is genuinely independent (multiple files, multiple topics, multiple codebases).
Do not use subagents when the work is sequential (each step needs the previous result).
The Explore subagent is the right default for search/read tasks; it has no write tools and returns a summary.
Spawn up to 3 in parallel in a single message; more produces diminishing returns and noisy context.
The parent does not see the subagent's internal tool calls — only the final summary. Write the prompt so the summary is the thing you need.

What exactly is a subagent

When your main Claude Code session calls the Agent tool, it spawns a new Claude instance. That instance:

Gets its own system prompt (defined by the subagent type — e.g., Explore, Plan, general-purpose).
Starts with an empty context window.
Has its own tool access, possibly narrower than the parent's.
Runs to completion independently.
Returns a single message to the parent.

The parent sees the return message and nothing else — not the subagent's intermediate reads, greps, or reasoning. This is a feature. Your parent conversation stays clean.

The three subagent types worth knowing

As of April 2026, Claude Code ships with a handful of built-in subagents plus any project- or user-defined ones. The three you will use 80% of the time:

`Explore`

Fast search and read agent. No Edit, Write, or other mutation tools. Ideal for:

Finding where a function or pattern is defined
Answering "how does X work in this codebase"
Surveying multiple areas of a codebase in parallel

`Plan`

Software architect agent. Returns a step-by-step implementation plan with critical files and architectural tradeoffs. Use when you need a design pass before coding.

`general-purpose`

Full tool access. Use for multi-step tasks that require both reading and writing. More expensive because it carries the full toolset into its context.

You can also define custom subagents — a .claude/agents/*.md file with a description and tool list. I have a design-system-extractor agent, a code-reviewer agent, and a security-auditor agent. Each is narrow enough to do one thing well.

When subagents actually help

The rule is: subagents help when the work is independent and parallelizable. Three cases where they pay off handsomely, and three where they don't.

Subagents help:

Case A — Cross-codebase survey. "How does our codebase handle webhook signature verification for Stripe, Polar, and Lemon Squeezy?" Three subagents, one per integration, run in parallel. A linear sweep of the same question takes 3-4x longer because each probe blocks the next.

Case B — Independent file reads. "Read these 8 migration files and tell me if any of them drop columns unsafely." Two or three subagents, each handling a subset, run concurrently.

Case C — Broad-to-narrow research. The parent needs a shallow answer across many areas before deciding where to go deep. One subagent per area, all parallel, and the parent integrates the findings.

Subagents do NOT help:

Case D — Sequential dependency. "Refactor the auth module, then update the tests, then update the docs." Each step needs the previous step's output. Subagents here just add overhead.

Case E — Single-file work. Editing one file with one context. The main agent is already scoped correctly.

Case F — Tight feedback loops. Iterating on a single implementation with the user. Subagents break the loop.

The parallel fan-out pattern

The single most useful pattern:

User question: "How does this codebase handle [X] across [A, B, C]?"

Main agent → spawns in parallel:
  - Subagent 1: investigate A
  - Subagent 2: investigate B
  - Subagent 3: investigate C

Main agent → integrates three summaries → delivers unified answer.

To get the parallelism, the parent must emit multiple Agent tool calls in a single message. Emitting them one at a time serializes them. This is the most common user mistake.
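
The gap between one message with several Agent calls and several messages with one call each behaves like the gap between concurrent and sequential awaits. A minimal Python sketch of that timing difference, assuming a hypothetical `investigate` coroutine that only sleeps in place of a real subagent run (it says nothing about how Claude Code schedules work internally):

```python
import asyncio
import time


async def investigate(area: str, seconds: float = 0.05) -> str:
    """Hypothetical stand-in for one subagent: waits, then returns a summary."""
    await asyncio.sleep(seconds)
    return f"summary of {area}"


async def fan_out(areas: list[str]) -> list[str]:
    # All tasks start together, like several Agent calls in one message.
    return list(await asyncio.gather(*(investigate(a) for a in areas)))


async def one_at_a_time(areas: list[str]) -> list[str]:
    # Each call waits for the previous one, like one Agent call per message.
    return [await investigate(a) for a in areas]


def timed(coro) -> tuple[list[str], float]:
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    areas = ["A", "B", "C"]
    _, parallel_s = timed(fan_out(areas))
    _, serial_s = timed(one_at_a_time(areas))
    print(f"parallel: {parallel_s:.2f}s, serial: {serial_s:.2f}s")
```

Both versions return the same three summaries in the same order; only the wall-clock time differs, and the serial version grows with every area you add.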

In practice, inside Claude Code, prompt the main agent explicitly:

40 slash command templates. Token-optimized variants. JSONL file for direct import. Tested in production sessions.

→ Get Claude Code Power Prompts 300 — $29

30-day money-back guarantee. Instant download.

Claude Code Permissions: Trust Levels, Allow Lists, and Safe Defaults

Sangmin Lee — Sun, 24 May 2026 01:35:20 +0000

Originally published at claudeguide.io/claude-code-permissions

Claude Code's permission model sorts every tool call into one of three buckets: always allowed (read-only tools, 0 prompts), prompt on first use (the default: Edit, Write, and Bash ask once per session), and always denied (anything you block in settings). A separate bypass, --dangerously-skip-permissions, turns prompts off entirely for trusted environments. Configuring this correctly cuts ~80% of routine confirmation prompts while preserving safety for destructive operations, and lets you match your risk tolerance, from "ask me before every file write" to "run fully autonomously."

How permissions work

Every tool call Claude Code makes falls into one of three permission buckets:

1. Always allowed — low-risk read operations (Read, Grep, Glob) that never prompt.

2. Prompt on first use (default) — higher-risk operations (Edit, Write, Bash) where Claude asks for your approval the first time in each session.

3. Always denied — tools or patterns you've explicitly blocked in settings. Claude cannot call these regardless of context.

The permission model is configured in your settings.json. There are two levels:

User-level (~/.claude/settings.json): applies to all projects
Project-level (.claude/settings.json): applies only when running from that directory, overrides user-level for the tools it specifies

For a complete reference of every field in settings.json, see the Claude Code settings.json reference.

Permission modes

You set the permission mode via the --permission-mode flag or in settings.

Mode	Behavior
`default`	Prompt for higher-risk tool uses on first call in session
`bypassPermissions`	Skip all confirmation prompts — never asks
`--dangerously-skip-permissions` (CLI flag)	Identical to bypassPermissions for one session

When to use bypass: automated scripts, CI/CD pipelines where you've reviewed the task in advance. Never use it when exploring an unfamiliar codebase or running an unknown task.

Configuring allowed and disallowed tools

Allow specific tools only

Restrict Claude to a whitelist of tools. Any tool not in the list is implicitly denied:

{
  "allowedTools": ["Read", "Grep", "Glob"]
}

This creates a read-only Claude Code — it can explore and search but cannot modify files or run commands. Useful for code review and analysis tasks.

Disallow specific tools

Block specific tools while keeping all others available:

{
  "disabledTools": ["Bash", "WebFetch", "WebSearch"]
}

This keeps file editing but prevents shell execution and web requests. Good for offline-only workflows.

Common safe configurations

Read-only mode (safe for exploring unfamiliar codebases):

{
  "allowedTools": ["Read", "Grep", "Glob", "Agent"]
}

No shell execution (file edits only):

{
  "disabledTools": ["Bash"]
}

No web access (air-gapped or policy-restricted environments):

{
  "disabledTools": ["WebFetch", "WebSearch"]
}

Per-project permissions

For a project with different risk tolerance than your default, use .claude/settings.json in the project root. This file is typically committed to version control so the entire team shares the same settings. You can layer Claude Code hooks on top of permissions for even finer control — hooks can block specific commands or run formatters after writes.

Example: a production infrastructure project where push and deploy commands are blocked outright:

{
  "permissions": {
    "allow": [],
    "deny": [
      "Bash(git push*)",
      "Bash(kubectl*)",
      "Bash(terraform apply*)"
    ]
  }
}

This uses pattern-based deny rules: Bash(pattern) blocks bash commands matching the pattern. Any attempt to run git push, a kubectl command, or terraform apply is blocked, and Claude receives an error message instead.

Pattern-based rules

Pattern rules match against the tool input, not just the tool name.

Syntax: ToolName(pattern) where pattern is a glob-style match against the command input.

{
  "permissions": {
    "deny": [
      "Bash(rm -rf*)",
      "Bash(* --force*)",
      "Edit(*/.env*)",
      "Write(*secrets*)"
    ]
  }
}

This blocks:

Any rm -rf command
Any command with --force
Editing any .env file
Writing any file with "secrets" in the path

Note: the patterns are case-sensitive and match against the full command string for Bash, or the full file path for file-operation tools.
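
To get a feel for how such rules behave, here is a rough Python sketch that applies the four deny rules above using the standard library's `fnmatchcase`. It assumes the matching is plain case-sensitive glob matching over the whole input string; Claude Code's real matcher may differ, and `parse_rule` and `is_denied` are hypothetical helper names.

```python
from fnmatch import fnmatchcase

# Deny rules copied from the example above, in ToolName(pattern) form.
DENY_RULES = [
    "Bash(rm -rf*)",
    "Bash(* --force*)",
    "Edit(*/.env*)",
    "Write(*secrets*)",
]


def parse_rule(rule: str) -> tuple[str, str]:
    """Split 'Bash(rm -rf*)' into ('Bash', 'rm -rf*')."""
    tool, _, rest = rule.partition("(")
    return tool, rest.removesuffix(")")


def is_denied(tool: str, tool_input: str, rules: list[str] = DENY_RULES) -> bool:
    """True if any rule for this tool glob-matches the input, case-sensitively."""
    for rule in rules:
        rule_tool, pattern = parse_rule(rule)
        if rule_tool == tool and fnmatchcase(tool_input, pattern):
            return True
    return False
```

Under these assumptions a pattern has to match the full string, so `Bash(rm -rf*)` catches `rm -rf build` but not `sudo rm -rf build`. If you rely on a rule like this, test it against the variants you actually care about.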

The full settings.json reference

{
  "permissions": {
    "allow": [
      "Bash(npm test)",
      "Bash(npm run build)"
    ],
    "deny": [
      "Bash(git push*)",
      "Bash(rm -rf*)"
    ]
  },
  "allowedTools": [],
  "disabledTools": [],
  "hooks": {
    "PreToolUse": [],
    "PostToolUse": []
  }
}

permissions.allow: specific patterns that are always permitted without a prompt, even if they would normally require approval.

permissions.deny: patterns that are always blocked. Takes precedence over allow.

allowedTools: if non-empty, only these tools are available. Everything else is blocked.

disabledTools: tools in this list are blocked regardless of other settings.
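
The way these four fields interact can be sketched as a small decision function. This is an illustration of the precedence described above, not Claude Code's implementation: the `Settings` class and `decide` function are hypothetical, rules are compared as exact strings rather than globs, and the order in which the tool-level lists are checked relative to `permissions.deny` is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    """Hypothetical in-memory mirror of the four fields described above."""
    allow: set[str] = field(default_factory=set)           # permissions.allow
    deny: set[str] = field(default_factory=set)            # permissions.deny
    allowed_tools: set[str] = field(default_factory=set)   # allowedTools
    disabled_tools: set[str] = field(default_factory=set)  # disabledTools


def decide(settings: Settings, tool: str, call: str) -> str:
    """Return 'deny', 'allow', or 'default' for a call such as 'Bash(npm test)'."""
    if tool in settings.disabled_tools:
        return "deny"     # blocked regardless of other settings
    if settings.allowed_tools and tool not in settings.allowed_tools:
        return "deny"     # a non-empty allowedTools list excludes every other tool
    if call in settings.deny:
        return "deny"     # deny takes precedence over allow
    if call in settings.allow:
        return "allow"    # pre-approved, no prompt
    return "default"      # fall back to the tool's default behavior
```

For example, a call listed in both `allow` and `deny` comes back as `deny`, and anything unlisted falls through to the tool's default (always allowed for read tools, prompt on first use for the rest).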

Trust levels explained

When you approve a tool call during a session, Claude Code remembers the approval at different levels:

Response	What it means
Yes, allow once	Approved for this single call
Yes, allow for this session	Approved for all identical calls this session
Yes, always allow	Added to `settings.json` permanently
No, deny once	Blocked for this call; will ask again
No, always deny	Added to `permissions.deny` in settings permanently

"Always allow" and "always deny" responses write to your project-level .claude/settings.json if it exists, or create one. Review these accumulating entries periodically — a "yes always" given hastily can create a permanent permission you didn't intend.

Auto mode and autonomous execution

When you invoke /auto or start a session with a task that requires unattended execution, you can pre-authorize the tools Claude will need rather than accepting all prompts:

In settings.json before starting the session:

{
  "permissions": {
    "allow": [
      "Bash(npm*)",
      "Bash(git add*)",
      "Bash(git commit*)"
    ]
  }
}

This pre-authorizes npm commands and safe git operations without enabling a full bypass.

What each tool can do (risk reference)

Understanding what each tool is capable of helps you set the right defaults.

Tool	What it does	Default behavior
`Read`	Read file contents	Always allowed
`Glob`	Find files by pattern	Always allowed
`Grep`	Search file contents	Always allowed
`Edit`	Replace text in existing files	Prompt on first use
`Write`	Create or overwrite files	Prompt on first use
`Bash`	Execute shell commands	Prompt on first use
`WebFetch`	Fetch a URL	Prompt on first use
`WebSearch`	Search the web	Prompt on first use
`Agent`	Spawn a subagent	Prompt on first use

Bash is the highest-risk tool — it has the same access to your system as you do. A Bash command can delete files, push code, modify databases, and make network requests. Pattern-based deny rules on Bash are the most important safety measure.

Common permission mistakes

Setting bypassPermissions permanently in ~/.claude/settings.json: this turns off all safety prompts for every project. Only set this for specific automated workflows, not as a global default.

Forgetting to commit .claude/settings.json: your project team shares your safety rules only if you commit the file. Add it to version control with the rest of your project config.

Too-broad allow rules: Bash(*) in allow means every bash command is approved without prompting — equivalent to full bypass. Be specific: Bash(npm test) not Bash(*).

Not reviewing accumulated always allow entries: over time, quickly-granted "always allow" approvals accumulate in settings.json. Review and trim them quarterly. A well-crafted CLAUDE.md reduces how often you need to grant "always allow" in the first place by giving Claude clear instructions upfront.

FAQ

Does Claude Code need root/sudo access?
No. Claude Code runs as the current user. If it attempts a command requiring sudo, it will be blocked by the OS, not by the permission system. Never run Claude Code as root.

Can Claude Code access files outside my working directory?
Yes — Claude Code's Read and Bash tools can access any path the current user can access. Use .gitignore-style patterns in hooks to block reads of sensitive paths if needed.

Can I share settings between team members?
Yes — commit .claude/settings.json to your repository. All team members who use Claude Code in that project will use the same settings. User-level ~/.claude/settings.json is personal.

What happens when Claude tries a denied action?
The tool call is blocked and Claude receives an error message explaining what was denied. Claude will typically acknowledge the restriction and ask you how to proceed differently.

Is there an audit log of what Claude did?
Not built-in, but the Hooks system (see Claude Code Hooks guide) lets you log every tool call to a file with a PreToolUse hook on "*".

Sources

Claude Code permissions documentation — April 2026
Claude Code security guide — April 2026

Frequently Asked Questions

How do I make Claude Code read-only so it can't modify files?

Set allowedTools to ["Read", "Grep", "Glob"] in your .claude/settings.json. This restricts Claude to search and read operations only — it cannot edit, write, or run shell commands. Useful for code review tasks or exploring an unfamiliar codebase safely.

What is the difference between `allowedTools` and `permissions.allow`?

allowedTools is a whitelist of tool names — only listed tools are available at all. permissions.allow contains specific patterns (like Bash(npm test)) that are pre-approved without a prompt, while all other uses of those tools still require confirmation. Use allowedTools for hard restrictions; use permissions.allow for pre-authorizing specific safe commands.

How do I block Claude Code from running `git push` or `kubectl`?

Add pattern-based deny rules under permissions.deny in .claude/settings.json: "Bash(git push*)" and "Bash(kubectl*)". These patterns block any Bash command matching the glob, and Claude receives an error message explaining the restriction. Commit this file to version control so the rules apply to your whole team.

Can I permanently allow a tool action without re-approving every session?

Yes. When Claude Code prompts for approval, choose "Yes, always allow" — this writes the pattern to your settings.json under permissions.allow. On subsequent sessions the action runs without prompting. Review accumulated entries periodically; hasty "always allow" grants can create broader permissions than intended.

Take It Further

Claude Code Power Prompts 300 — 300 battle-tested prompts for Claude Code, organized by use case. Copy, paste, ship.

40 slash command templates. Token-optimized variants. JSONL file for direct import. Tested in production sessions.

→ Get Claude Code Power Prompts 300 — $29

30-day money-back guarantee. Instant download.

Claude Code Hooks: Automate and Control Every Tool Call

Sangmin Lee — Sun, 24 May 2026 01:35:17 +0000

Originally published at claudeguide.io/claude-code-hooks

Claude Code hooks are shell commands that execute automatically before or after tool calls. They give you programmatic control over what Claude Code does: logging every file it touches, blocking certain edits, running formatters after writes, or sending Slack notifications on commit.

This guide covers every hook type, how to configure them, and 12 production examples.

What hooks are

Every time Claude Code calls a tool (Read, Edit, Write, Bash, etc.), two hook points fire:

PreToolUse: before the tool executes — can block the call by exiting non-zero
PostToolUse: after the tool executes — receives the result, can trigger side effects

Hooks are configured in your Claude Code settings (~/.claude/settings.json for user-level, or .claude/settings.json for project-level). Project-level settings override user-level for the hooks they define. For a full reference of all settings fields, see the Claude Code settings.json reference.

Hook configuration format


{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "echo \"Running bash: $CLAUDE_TOOL_INPUT\" 


Claude Code CLI Commands: Full Reference (2026)

Sangmin Lee — Sun, 24 May 2026 01:30:04 +0000

Originally published at claudeguide.io/claude-code-cli-commands

Claude Code is primarily a conversational tool, but it has a rich set of CLI flags, in-session slash commands, and keyboard shortcuts that change how it behaves. This is the complete reference. For a broader introduction to what Claude Code can do, visit the Claude Code Complete Guide in 2026.

CLI flags (when launching `claude`)

These flags are passed when you start Claude Code from the terminal.

Basic flags

Flag	Description	Example
`--help`, `-h`	Show help text	`claude --help`
`--version`, `-v`	Show installed version	`claude --version`
`--print`, `-p`	Print response and exit (non-interactive)	`claude -p "Explain this file" < main.ts`
`--output-format`	Set output format for `-p` mode	`claude -p "..." --output-format json`

Model and configuration

Flag	Description	Example
`--model`	Override the model for this session	`claude --model claude-sonnet-4-6`
`--config`	Path to a custom settings file	`claude --config ./team-settings.json`
`--system-prompt`	Inject a one-off system prompt	`claude --system-prompt "You are a security auditor"`

Context and files

Flag	Description	Example
`--context`	Add extra context text	`claude --context "$(cat NOTES.md)"`
`--allowed-tools`	Restrict which tools Claude can use	`claude --allowed-tools Read,Grep`
`--disallowed-tools`	Block specific tools	`claude --disallowed-tools Bash,Write`

Permissions and safety

Flag	Description	Example
`--dangerously-skip-permissions`	Skip all tool-use permission prompts	`claude --dangerously-skip-permissions`
`--permission-mode`	Set permission level (default/bypassPermissions)	`claude --permission-mode bypassPermissions`

Warning: --dangerously-skip-permissions and bypassPermissions disable all confirmation prompts. Use only in trusted automated environments where you have reviewed the task in advance.

Session control

Flag	Description	Example
`--resume`	Resume a previous session by ID	`claude --resume abc123def`
`--continue`, `-c`	Continue the most recent session	`claude -c`
`--max-turns`	Limit the number of agentic turns	`claude --max-turns 5`

Non-interactive (pipe) mode

Claude Code can be used in scripts and CI pipelines without an interactive terminal. Use -p with stdin:

# Read a file and ask a question about it
cat main.ts | claude -p "What does this file do? List the exported functions."

# Pipe multiple files
cat src/auth.ts src/middleware.ts | claude -p "Find any security issues in these files."

# Save output to a variable
REVIEW=$(git diff HEAD~1 | claude -p "Review this diff for bugs.")
echo "$REVIEW"

# JSON output for programmatic use
git diff | claude -p "List any breaking changes as JSON" --output-format json
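
If the calling script is Python rather than shell, the same pipe pattern can be wrapped with `subprocess`. A minimal sketch, assuming `claude` is on the PATH and using only the `-p` and `--output-format` flags shown above; `build_command` and `ask_claude` are hypothetical helper names, and the `run` parameter exists only so the wrapper can be exercised without the real CLI:

```python
import subprocess
from typing import Callable, Optional


def build_command(prompt: str, output_format: Optional[str] = None) -> list[str]:
    """Build the argv for a non-interactive run, using only flags shown above."""
    argv = ["claude", "-p", prompt]
    if output_format is not None:
        argv += ["--output-format", output_format]
    return argv


def ask_claude(
    prompt: str,
    stdin_text: str,
    output_format: Optional[str] = None,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> str:
    """Pipe stdin_text into `claude -p` and return stdout; raises on a non-zero exit."""
    result = run(
        build_command(prompt, output_format),
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the prompt as a list element rather than building a shell string avoids quoting problems when the prompt itself contains quotes.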

In-session slash commands

Once Claude Code is running, these commands are typed in the conversation input.

Session management

Command	Description
`/help`	Show help and available commands
`/quit`, `/exit`	Exit Claude Code
`/clear`	Clear conversation history (new context)
`/compact`	Compact conversation to reduce token usage
`/status`	Show session info: model, token count, settings
`/cost`	Show token usage and estimated cost for this session

Model switching

Command	Description
`/model <name


Claude API Pricing 2026: Complete Breakdown with Calculators

Sangmin Lee — Sun, 24 May 2026 01:30:01 +0000

Originally published at claudeguide.io/claude-api-pricing-2026

Anthropic's Claude API uses a per-token pricing model. You pay for tokens consumed — input (what you send) and output (what the model generates). This guide covers every pricing tier, feature, and real-world cost example as of April 2026.

Current pricing table (April 2026)

Standard API

Model	Input per 1M tokens	Output per 1M tokens
Claude Haiku 4.5	$1.00	$5.00
Claude Sonnet 4.6	$3.00	$15.00
Claude Opus 4.7	$5.00	$25.00

Prompt caching

Model	Cache write per 1M	Cache read per 1M
Claude Haiku 4.5	$1.25	$0.10
Claude Sonnet 4.6	$3.75	$0.30
Claude Opus 4.7	$6.25	$0.50

Cache read prices are 10% of standard input prices. Cache writes are 125% of standard input prices.

Batch API (50% off all standard rates)

Model	Input per 1M tokens	Output per 1M tokens
Claude Haiku 4.5	$0.50	$2.50
Claude Sonnet 4.6	$1.50	$7.50
Claude Opus 4.7	$2.50	$12.50

Batch API processes requests asynchronously within 24 hours. No streaming. Ideal for non-time-sensitive bulk workloads.

1M context window (extended context)

For Sonnet 4.6 and Opus 4.7, input tokens beyond 200K are billed at higher rates. Haiku 4.5 does not support 1M context.

Context range	Sonnet 4.6 input	Opus 4.7 input
0 – 200K tokens	$3.00/1M	$5.00/1M
200K – 1M tokens	$6.00/1M	$10.00/1M

Output pricing is unchanged regardless of context length.

Three ratios to memorize

1. Output is 5x more expensive than input (for all models). A 1K-token output costs the same as a 5K-token input. Every prompt engineering choice that reduces output length saves 5x more than the same reduction in input.

2. Opus is 5x more expensive than Haiku. A Haiku workload costing $100/month costs $500/month on Opus. Use the cheapest model that clears your quality bar. For a practical guide to matching tasks to models, see Haiku vs Sonnet vs Opus: which model to use.

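3. Cache reads are 10% of input price. If the same system prompt is reused across calls, every cache hit saves 90% on that input slice. A cache write costs 25% more than a standard input pass and each cache read saves 90%, so break-even arrives after about 0.28 cache reads per write (0.25 / 0.90), or roughly 1.28 requests counting the write itself. In practice the first cache hit already puts you ahead. See the prompt caching break-even guide for the full calculation with worked examples.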
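
The break-even for caching can be checked directly from the two multipliers in the cache table. As a sketch, let p be the standard input price for the cached slice, and assume one cache write followed by n cache reads, all landing before the cache entry expires. Caching costs 1.25p for the write plus 0.10p per read, while the same n + 1 requests without caching cost p each:

```latex
\begin{align*}
1.25\,p + 0.10\,n\,p &= (1 + n)\,p \\
0.25 &= 0.90\,n \\
n &= \frac{0.25}{0.90} \approx 0.28
\end{align*}
```

So the write premium is recovered after about 0.28 reads, which is about 1.28 requests if you count the write itself. Any prompt that is reused at least once within the cache lifetime comes out ahead.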

Worked cost examples

Example 1: High-volume classification

Task: classify user messages into 12 categories
Input: 500 tokens (message + system prompt)
Output: 10 tokens (one label + confidence)
Volume: 200,000 requests/month
Model: Haiku 4.5

Calculation:

Input: 200,000 × 500 tokens = 100M tokens → $100
Output: 200,000 × 10 tokens = 2M tokens → $10
Total: $110/month

If you used Opus: $500 input + $50 output = $550/month. That is $440/month wasted.

Example 2: Customer support drafts

Task: generate reply drafts for support tickets
Input: 2,000 tokens (ticket + system prompt + few-shot examples)
Output: 300 tokens (draft reply)
Volume: 30,000 requests/month
Model: Sonnet 4.6
Caching: system prompt (1,200 tokens) cached across all requests

Without caching:

Input: 30,000 × 2,000 = 60M tokens → $180
Output: 30,000 × 300 = 9M tokens → $135
Total: $315/month

With prompt caching:

Cache write: 1,200 tokens × 1 write = 1,200 tokens → $0.005 (negligible)
Cache reads: 1,200 tokens × 30,000 = 36M tokens → $10.80
Non-cached input: 800 tokens × 30,000 = 24M tokens → $72
Output: unchanged → $135
Total with caching: $217.80/month (31% savings)
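
The arithmetic in these examples is easy to script. A minimal sketch, assuming the per-1M prices from the tables above; the `PRICES` keys and the `monthly_cost` function are hypothetical names, and the one-off cache write is ignored because it is negligible at this volume:

```python
# USD per 1M tokens, copied from the pricing tables above.
PRICES = {
    "haiku-4.5": {"input": 1.00, "output": 5.00, "cache_read": 0.10},
    "sonnet-4.6": {"input": 3.00, "output": 15.00, "cache_read": 0.30},
    "opus-4.7": {"input": 5.00, "output": 25.00, "cache_read": 0.50},
}


def monthly_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    requests: int,
    cached_tokens: int = 0,
) -> float:
    """Monthly USD cost; cached_tokens is the per-request slice served from cache."""
    price = PRICES[model]
    per_request = (
        (input_tokens - cached_tokens) * price["input"]
        + cached_tokens * price["cache_read"]
        + output_tokens * price["output"]
    )
    return per_request * requests / 1_000_000


if __name__ == "__main__":
    # Example 2 above: Sonnet 4.6, 2,000 in / 300 out, 30,000 requests.
    print(round(monthly_cost("sonnet-4.6", 2_000, 300, 30_000), 2))         # no caching
    print(round(monthly_cost("sonnet-4.6", 2_000, 300, 30_000, 1_200), 2))  # 1,200 cached
```

Swapping the model key or the cached slice makes it quick to see how sensitive a workload is to each lever before committing to one.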

Example 3: Document summarization (1M context)

Task: summarize 400K-token legal contracts
Input: 400,000 tokens per request
Output: 800 tokens per summary
Volume: 200 requests/month
Model: Opus 4.7

Calculation:

First 200K tokens: 200,000 × 200 = 40M tokens → $200
Extended (200K-400K): 200,000 × 200 = 40M tokens at $10/1M → $400
Output: 200 × 800 = 160,000 tokens → $4
Total: $604/month

Note: a 400K-token document on Sonnet 4.6 would cost $120 + $240 = $360 input + $2.40 output = $362.40/month, saving $241.60/month with minimal quality loss in most summarization tasks. Test before assuming Opus is required.
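
The tiered input calculation can be sketched as a small function. It assumes the split described above, where only the tokens past 200K are billed at the extended rate; `tiered_input_cost` is a hypothetical helper, and prices are passed in per 1M tokens:

```python
def tiered_input_cost(
    tokens_per_request: int,
    requests: int,
    base_price: float,
    extended_price: float,
    threshold: int = 200_000,
) -> float:
    """USD input cost when tokens past `threshold` bill at `extended_price` per 1M."""
    base_tokens = min(tokens_per_request, threshold)
    extended_tokens = max(tokens_per_request - threshold, 0)
    per_request = base_tokens * base_price + extended_tokens * extended_price
    return per_request * requests / 1_000_000
```

Calling `tiered_input_cost(400_000, 200, 5.00, 10.00)` reproduces the $600 of Opus 4.7 input in the example ($200 base plus $400 extended).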

Example 4: Batch API for nightly data enrichment

Task: enrich 50,000 product records with descriptions
Input: 300 tokens per record
Output: 200 tokens per record
Model: Sonnet 4.6, Batch API

Without batch (standard):

Input: 50,000 × 300 = 15M tokens → $45
Output: 50,000 × 200 = 10M tokens → $150
Total: $195/run

With Batch API:

Input: 15M tokens at $1.50/1M → $22.50
Output: 10M tokens at $7.50/1M → $75
Total: $97.50/run (50% savings)

At twice-weekly runs, that is $390/week on standard pricing versus $195/week with the Batch API, or about $845/month saved (at roughly 4.33 weeks per month).

How to calculate your own costs

Step 1: Estimate token volumes

Use the token counting endpoint (messages.count_tokens in the Python SDK) to measure actual token counts for your prompts rather than estimating:

import anthropic

client = anthropic.Anthropic()

response = client.messages.count_tokens(
    model="claude-sonnet-4-6",
    system="Your system prompt here",
    messages=[{"role": "user", "content": "Sample user message"}],
)

print(f"Input tokens: {response.input_tokens}")

Step 2: Calculate cost


def estimate_monthly_cost(
    model: str,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    requests_per_month: int,
    cached_tokens_per_request: int = 0,
) -

PDF guide + 6-sheet Excel cost calculator. Example scenario: $2,100 → $187/month on a customer support agent.

[→ Get Cost Optimization Masterclass — $59](https://shoutfirst.gumroad.com/l/msjkda?utm_source=claudeguide&utm_medium=article&utm_campaign=claude-api-pricing-2026)

*30-day money-back guarantee. Instant download.*

Claude API Error Handling: Rate Limits, Retries, Patterns

Sangmin Lee — Sat, 23 May 2026 01:40:34 +0000

Originally published at claudeguide.io/claude-api-error-handling

Claude API Error Handling: Rate Limits, Retries, and Production Patterns

The Anthropic API returns structured errors with specific HTTP status codes. Knowing which errors to retry, which to log and surface to users, and which indicate bugs in your code is the difference between a production-ready integration and one that silently fails. For general Claude API concepts, see the Claude Agent SDK Guide in 2026.

Error code reference

Each row links to a dedicated troubleshooting page with Python + TypeScript code examples (Korean):

HTTP Status	Error type	Meaning	Action
400	`invalid_request_error`	Malformed request — bad JSON, unsupported parameters, exceeded context window	Fix the request — do not retry
401	`authentication_error`	Invalid API key	Check key validity — do not retry
403	`permission_error`	Valid key but insufficient permissions (e.g. model not enabled)	Check account permissions — do not retry
404	`not_found_error`	Endpoint or model doesn't exist	Fix model name or endpoint — do not retry
413	`request_too_large`	Request body exceeds 32MB limit	Use Files API for large attachments
422	`unprocessable_entity`	Request valid but semantically wrong (e.g. invalid tool schema)	Fix the schema — do not retry
429	`rate_limit_error`	Too many requests or tokens per minute	Retry with exponential backoff
500	`api_error`	Internal server error	Retry with backoff, max 3 attempts
529	`overloaded_error`	API overloaded	Retry with longer backoff

Additional HTTP status codes

Status	Type	Quick fix
502	`bad_gateway`	Retry [3, 10, 30, 60, 120s]
503	`service_unavailable`	Check status.anthropic.com + backoff
504	`gateway_timeout`	Switch to streaming for long outputs

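Error subtype deep-dives (Korean, code samples)

context_length_exceeded: trimming when the context window is exceeded
invalid_api_key: key format validation + trimming the environment variable
max_tokens: capping at the per-model 8192 limit
model_not_found: latest model identifiers
prompt_too_long: automatic trimming of the accumulated conversation
streaming_error: resume pattern when the SSE stream drops
tool_use_error: validating tool_use ↔ tool_result pairing
vision_error: automatic normalization of image format and size
file_upload_error: Files API + beta header
batch_error: validating the Batch 10K/250MB limits
cache_error: placement of cache_control for prompt caching
billing_error: alerts for payment failures or insufficient credits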

The critical distinction: 4xx errors (except 429) indicate a problem with your request and should not be retried. 429 and 5xx errors are transient and should be retried. To reduce 400-class errors from oversized contexts, see Claude 1M Context Window for truncation and caching strategies.
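
That rule is small enough to encode directly. A minimal sketch, where `classify` is a hypothetical helper that maps an HTTP status to the action described above:

```python
def classify(status: int) -> str:
    """Map an HTTP status to 'retry', 'fix_request', or 'ok'."""
    if status == 429 or 500 <= status <= 599:
        return "retry"        # transient: rate limit or server-side failure
    if 400 <= status <= 499:
        return "fix_request"  # the request itself is wrong; retrying will not help
    return "ok"
```

Keeping this decision in one place means the retry loop never has to know about individual status codes.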

Rate limit errors (429)

The most common production error. Rate limits are enforced on:

Requests per minute (RPM): number of API calls
Input tokens per minute (ITPM): total input tokens
Output tokens per minute (OTPM): total output tokens

The Retry-After header in the 429 response tells you exactly how many seconds to wait.
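
A retry loop that honors this header, and falls back to exponential backoff when it is absent, might look like the sketch below. It is deliberately SDK-agnostic: `RetryableError` is a hypothetical exception your own client code would raise for 429 and 5xx responses, and `sleep` is injectable so the loop can be tested without waiting.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical error type: a 429 or 5xx response, with an optional Retry-After."""

    def __init__(self, status: int, retry_after: Optional[float] = None):
        super().__init__(f"HTTP {status}")
        self.status = status
        self.retry_after = retry_after


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RetryableError up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RetryableError as err:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            if err.retry_after is not None:
                delay = err.retry_after  # the server said how long to wait
            else:
                # Exponential backoff with up to 10% jitter.
                delay = base_delay * (2 ** attempt) * (1 + random.uniform(0, 0.1))
            sleep(delay)
    raise AssertionError("unreachable")
```

The jitter keeps many clients that were throttled at the same moment from retrying in lockstep.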

Python:


import anthropic
import time

client = anthropic.Anthropic()

def call_with_retry(
    messages: list,
    model: str = "claude-sonnet-4-6",
    max_retries: int = 5,
    base_delay: float = 1.0,
) -


From $800 to $120/month: A Claude API Cost Optimization Case Study

Sangmin Lee — Sat, 23 May 2026 01:35:21 +0000

Originally published at claudeguide.io/claude-api-cost-case-study

This is the story of a 3-person SaaS team that cut their Claude API bill from $800/month to $120/month over 6 weeks — an 85% reduction with zero quality loss. The product is a B2B document analysis tool — users upload contracts, the app extracts key clauses, generates summaries, and answers questions about the document.


Claude Agent SDK Quickstart: Build Your First Agent in 15 Minutes

Sangmin Lee — Sat, 23 May 2026 01:35:18 +0000

Originally published at claudeguide.io/claude-agent-sdk-quickstart

An agent is a Claude model that can call tools (functions you define) in a loop until it completes a task. This guide walks from zero to a working agent with two tools (web search and unit converter) in Python or TypeScript.

Prerequisites: Anthropic API key, Python 3.11+ or Node.js 18+.

What you're building

A research assistant that:

Accepts a question
Decides whether to search the web or convert a unit
Calls the tool
Uses the result to answer (or calls another tool)
Returns a final answer
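
Before wiring in the real API, the loop itself can be sketched offline. In the sketch below, `fake_model` is a hypothetical scripted stand-in for the API call, its reply dictionaries are a simplified shape rather than the real response format, and only the km/miles conversion is implemented:

```python
from typing import Callable

KM_PER_MILE = 1.609344


def unit_converter(value: float, from_unit: str, to_unit: str) -> str:
    """Tiny example tool: only km <-> miles is implemented here."""
    if (from_unit, to_unit) == ("km", "miles"):
        return f"{value / KM_PER_MILE:.2f} miles"
    if (from_unit, to_unit) == ("miles", "km"):
        return f"{value * KM_PER_MILE:.2f} km"
    return f"unsupported conversion: {from_unit} -> {to_unit}"


TOOL_FUNCTIONS: dict[str, Callable[..., str]] = {"unit_converter": unit_converter}


def fake_model(messages: list[dict]) -> dict:
    """Hypothetical stand-in for the API: asks for one conversion, then answers."""
    last = messages[-1]
    if last["role"] == "user" and isinstance(last["content"], str):
        return {"stop_reason": "tool_use", "tool": "unit_converter",
                "tool_use_id": "call_1",
                "input": {"value": 10, "from_unit": "km", "to_unit": "miles"}}
    result = last["content"][0]["content"]
    return {"stop_reason": "end_turn", "text": f"10 km is {result}."}


def run_agent(question: str, model: Callable[[list[dict]], dict], max_turns: int = 5) -> str:
    """Loop: ask the model, run any tool it requests, feed the result back."""
    messages: list[dict] = [{"role": "user", "content": question}]
    for _ in range(max_turns):
        reply = model(messages)
        if reply["stop_reason"] != "tool_use":
            return reply["text"]  # final answer
        output = TOOL_FUNCTIONS[reply["tool"]](**reply["input"])
        messages.append({"role": "assistant", "content": [reply]})
        messages.append({"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": reply["tool_use_id"], "content": output}
        ]})
    return "stopped: max_turns reached"
```

The `max_turns` cap is a design choice worth keeping in the real version: it bounds cost if the model keeps requesting tools without converging on an answer.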

Python version

Step 1: Install the SDK

pip install anthropic
export ANTHROPIC_API_KEY=sk-ant-your-key-here

Step 2: Define your tools

Tools are Python functions with JSON Schema descriptions. Claude reads the description to decide when to call each tool.


import anthropic
import json

client = anthropic.Anthropic()

TOOLS = [
    {
        "name": "web_search",
        "description": "Search the web for current information. Use when the question requires recent facts or data.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query"}
            },
            "required": ["query"]
        }
    },
    {
        "name": "unit_converter",
        "description": "Convert between common units. Supports: km/miles, kg/lbs, celsius/fahrenheit, usd/krw.",
        "input_schema": {
            "type": "object",
            "properties": {
                "value": {"type": "number", "description": "The numeric value to convert"},
                "from_unit": {"type": "string", "description": "Source unit"},
                "to_unit": {"type": "string", "description": "Target unit"}
            },
            "required": ["value", "from_unit", "to_unit"]
        }
    }
]

def web_search(query: str) -

Complete, runnable Python and TypeScript code throughout.

[→ Get Agent SDK Cookbook — $49](https://shoutfirst.gumroad.com/l/ogxhmy?utm_source=claudeguide&utm_medium=article&utm_campaign=claude-agent-sdk-quickstart)

*30-day money-back guarantee. Instant download.*

Claude 1M Context Window: What It Can Do and What It Costs

Sangmin Lee — Sat, 23 May 2026 01:30:04 +0000

Originally published at claudeguide.io/claude-1m-context-window

Claude Opus 4.7 and Claude Sonnet 4.6 support a 1 million token context window — roughly 750,000 words, or the equivalent of 10 average novels. This guide explains what that actually means for your use case, what it costs, and when the extended context is worth it. For guidance on picking the right model tier, see Haiku vs Sonnet vs Opus: Which Model?.

What 1M tokens looks like in practice

Content type	Fits in 1M tokens
Words (English prose)	~750,000 words
Pages (standard 250 words/page)	~3,000 pages
Code (Python, ~100 tokens/KB)	~10 MB of source code
GitHub repo (median size)	~3-5 repos in full
Legal documents	~500 standard contracts
Emails	~5,000 average emails
Slack messages	~20,000 messages
PDF pages (no images)	~2,500 pages

Practical upper bound: 1M tokens is the technical limit. In practice, Anthropic recommends staying under 800K for reliable output quality, because the model's attention degrades as a context approaches the maximum length.
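For a quick sanity check before sending anything, the ratios in the table can be turned into a rough estimator. The sketch below assumes the ~0.75 words-per-token ratio implied above (1M tokens is roughly 750,000 words). `estimate_tokens` and `fits_in_context` are illustrative names, and real counts vary with language and content, so treat the result as a ballpark rather than a measurement.

```python
WORDS_PER_TOKEN = 0.75      # implied by "1M tokens is roughly 750,000 words"
PRACTICAL_LIMIT = 800_000   # the practical upper bound quoted above


def estimate_tokens(text: str) -> int:
    """Rough token estimate for English prose from a whitespace word count."""
    return round(len(text.split()) / WORDS_PER_TOKEN)


def fits_in_context(text: str, limit: int = PRACTICAL_LIMIT) -> bool:
    """True if the estimated token count stays within the chosen limit."""
    return estimate_tokens(text) <= limit


sample = "word " * 3000          # 3,000 words, about 12 pages at 250 words/page
print(estimate_tokens(sample))   # 4000
print(fits_in_context(sample))   # True
```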

Pricing for extended context

Standard context (0-200K tokens) is billed at the normal rate. Beyond 200K, the per-token rate doubles.

Model	0-200K input	200K-1M input	Output
Sonnet 4.6	$3.00/1M	$6.00/1M	$15.00/1M
Opus 4.7	$5.00/1M	$10.00/1M	$25.00/1M

Real cost example — 800K token request on Opus:

First 200K: 200,000 tokens × $5/1M = $1.00
Remaining 600K: 600,000 tokens × $10/1M = $6.00
Total input: $7.00 per request
Plus output: if the response is 2,000 tokens → $0.05
Single request total: ~$7.05

At 100 requests/month, that is $700/month on input alone ($705 including output). This is where sending only the context a task needs matters most.
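The arithmetic above generalizes to any request size. Here is a minimal sketch, assuming the tiered billing described in this section (standard rate up to 200K input tokens, double the rate beyond that); the function names are illustrative.

```python
def tiered_input_cost(tokens: int, base_rate: float, threshold: int = 200_000) -> float:
    """Input cost in USD when tokens past `threshold` bill at twice `base_rate`.

    `base_rate` is USD per 1M input tokens (3.00 for Sonnet 4.6 and 5.00 for
    Opus 4.7 in the table above).
    """
    standard = min(tokens, threshold)
    extended = max(tokens - threshold, 0)
    return (standard * base_rate + extended * base_rate * 2) / 1_000_000


def output_cost(tokens: int, rate: float) -> float:
    """Output cost in USD at `rate` USD per 1M output tokens."""
    return tokens * rate / 1_000_000


opus_input = tiered_input_cost(800_000, base_rate=5.00)
opus_output = output_cost(2_000, rate=25.00)
print(f"input ${opus_input:.2f}, output ${opus_output:.2f}")     # input $7.00, output $0.05
print(f"100 requests: ${100 * (opus_input + opus_output):.2f}")  # 100 requests: $705.00
```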

When 1M context is worth it

1. Whole-codebase analysis

When you need Claude to reason across an entire codebase — not just find a file, but understand how components interact — you need the whole thing in context at once.

Use cases:

Security audit: finding vulnerability chains across modules
Architecture review: identifying circular dependencies, anti-patterns
Refactoring plan: understanding all callers before changing a shared function
Onboarding doc generation: summarizing the entire codebase for new hires

Alternative to consider first: Claude Code's built-in file navigation (Read, Glob, Grep) lets it explore code without putting everything in context. For 80% of coding tasks, targeted file reading is faster and cheaper.

2. Multi-document synthesis

Legal due diligence, medical record review, financial document analysis, research literature synthesis — tasks where the answer depends on relationships across hundreds of documents.

Use cases:

Summarizing 200 earnings calls to find recurring themes
Finding discrepancies across 50 supplier contracts
Synthesizing 100 research papers into a literature review
Analyzing a complete audit trail (logs, tickets, emails) for an incident investigation

3. Long conversation history

Agents that run for many turns can use the full history as context for decision-making. A research agent that has made 50 tool calls, read 30 documents, and produced intermediate results can load the entire history for a final synthesis step.

4. Large structured data

When you need Claude to reason over a large dataset — a 100K-row export in CSV form is ~500K tokens — and the reasoning requires seeing all the data rather than a sample. (Note: for data analysis at scale, a database + targeted query is almost always better than loading raw data into context.)

When NOT to use 1M context

1. You don't actually need it

The most common misuse is sending the full codebase when the task only requires 2-3 files. Use targeted file reads first. Save the full-context approach for tasks where the answer genuinely requires reading everything.

Test: can you find the relevant files with Grep/Glob and read just those? If yes, do that.

2. Speed matters

1M token requests have measurably higher latency. Time to first token is longer. If you need a fast response for a user-facing workflow, consider whether you can reduce the context or use a retrieval step.

3. The cost doesn't justify the use case

At $7+ per request, 1M context requests are expensive. For a use case running 1,000 times/month, that is $7,000+ in input alone. The quality premium must be real and measurable.

4. The task is repetitive over sub-documents

If you are summarizing 1,000 individual documents and do not need cross-document reasoning, process them one at a time (or in batches via Batch API). You do not need 1M context to summarize a single 5-page contract.
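For the one-document-at-a-time case, each document becomes its own independent request. The sketch below only builds the request list; the `custom_id` plus `params` shape is what the Message Batches endpoint in the `anthropic` SDK takes, while `build_batch_requests`, the prompt wording, and the sample documents are made up for illustration.

```python
def build_batch_requests(
    documents: dict[str, str],
    model: str = "claude-sonnet-4-6",
    max_tokens: int = 1024,
) -> list[dict]:
    """Build one independent summarization request per document."""
    requests = []
    for doc_id, text in documents.items():
        requests.append({
            "custom_id": doc_id,  # used to match results back to documents
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{
                    "role": "user",
                    "content": f"Summarize this contract in five bullet points:\n\n{text}",
                }],
            },
        })
    return requests


docs = {"contract-001": "First contract text...", "contract-002": "Second contract text..."}
batch = build_batch_requests(docs)
print(len(batch), batch[0]["custom_id"])  # 2 contract-001
# A list in this shape can then be submitted with
# client.messages.batches.create(requests=batch).
```

Each request carries only one document, so none of them comes near the 200K threshold, let alone 1M.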

How to use the 1M context window

Via the API

1M context requires requesting access via the Anthropic Console for some accounts. Once enabled, you use it by simply sending a larger messages array — no special flag required.

import anthropic

client = anthropic.Anthropic()

# Load the document to analyze
with open("large_document.txt") as f:
    document = f.read()

response = client.messages.create(
    model="claude-opus-4-7",  # or claude-sonnet-4-6
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": f"Analyze this document and find all clauses that could represent liability:\n\n{document}"
        }
    ]
)
print(response.content[0].text)

Checking your context usage

The response object includes usage.input_tokens. Check this to know exactly what you sent:

response = client.messages.create(...)
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")
print(f"Cache read tokens: {response.usage.cache_read_input_tokens}")

Combining with prompt caching

For repeated analysis over the same large document (e.g., answering multiple questions about the same contract), use prompt caching to avoid re-billing the input tokens on each call. See the Claude Prompt Caching Guide for a full breakdown of cache pricing and implementation:

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    system=[
        {
            "type": "text",
            "text": large_document_text,
            "cache_control": {"type": "ephemeral"}  # Cache the document
        }
    ],
    messages=[{"role": "user", "content": "What are the termination clauses?"}]
)

# Second call reuses cached document — 90% cheaper on the input
response2 = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    system=[
        {
            "type": "text",
            "text": large_document_text,
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[{"role": "user", "content": "What are the payment terms?"}]
)

With a 700K-token document on Sonnet 4.6:

Without caching: 200K × $3/1M + 500K × $6/1M = $0.60 + $3.00 = $3.60 per question
With caching (after first write): 700K × $0.30/1M = $0.21 per question
Savings: about 94% on repeated queries over the same document
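The per-question comparison can be checked with a few lines. This sketch assumes the tiered input pricing from the table earlier in this guide and a flat $0.30/1M cache-read rate for Sonnet 4.6. It ignores the one-time cache write on the first call, and the function names are illustrative.

```python
def tiered_input_cost(tokens: int, base_rate: float, threshold: int = 200_000) -> float:
    """Uncached input cost in USD; tokens past `threshold` bill at 2x `base_rate`."""
    standard = min(tokens, threshold)
    extended = max(tokens - threshold, 0)
    return (standard * base_rate + extended * base_rate * 2) / 1_000_000


def cached_read_cost(tokens: int, read_rate: float) -> float:
    """Cost in USD of reading `tokens` from cache at `read_rate` USD per 1M."""
    return tokens * read_rate / 1_000_000


uncached = tiered_input_cost(700_000, base_rate=3.00)
cached = cached_read_cost(700_000, read_rate=0.30)
savings = 1 - cached / uncached
print(f"uncached ${uncached:.2f}, cached ${cached:.2f}, savings {savings:.0%}")
# uncached $3.60, cached $0.21, savings 94%
```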

What Claude actually does with a million tokens

This is the question that matters most for deciding whether to use it.

What works well:

Finding specific information anywhere in the context ("does this contract mention force majeure?")
Cross-referencing across documents ("does the pricing in the email match the contract?")
Summarizing the whole into a structured output
Finding patterns that only emerge from seeing many instances

What degrades at very long context:

Precise recall of specific facts from the middle of a 1M token context (the "lost in the middle" problem — performance is best at the beginning and end)
Maintaining a single coherent thread over very long outputs
Complex multi-step reasoning when the relevant context is scattered across the full 1M

Mitigation: structure your context so the most important information appears at the beginning and end of the messages array. If you have critical instructions or key documents, place them first.
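One way to apply this layout is to assemble the prompt so that instructions come first, bulk documents sit in the middle, and the question is stated last. The sketch below is one possible arrangement; the function name and the XML-style tag names are illustrative conventions, not anything the API requires.

```python
def build_long_context_prompt(instructions: str, documents: list[str], question: str) -> str:
    """Put instructions first, bulk documents in the middle, and the question last."""
    parts = [f"<instructions>\n{instructions}\n</instructions>"]
    for index, doc in enumerate(documents, start=1):
        parts.append(f'<document index="{index}">\n{doc}\n</document>')
    # State the task after the documents so it also sits at the end of the context.
    parts.append(f"<question>\n{question}\n</question>")
    return "\n\n".join(parts)


prompt = build_long_context_prompt(
    instructions="Quote the clause number for every finding.",
    documents=["Contract A text...", "Contract B text..."],
    question="Which contracts mention force majeure?",
)
print(prompt.splitlines()[0])   # <instructions>
print(prompt.splitlines()[-1])  # </question>
```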

FAQ

Is 1M context available on Haiku?
No. Haiku 4.5 supports up to 200K tokens. Only Sonnet 4.6 and Opus 4.7 support 1M context.

Does context length affect output quality?
For tasks within the first 200K tokens of context, quality is equivalent to shorter contexts. For very long contexts, attention degrades slightly in the middle. Plan your context layout accordingly.

Can I use 1M context with the Batch API?
Yes. Batch API supports up to 1M context. Pricing is 50% off standard rates, so extended-context input on Sonnet 4.6 costs $3.00/1M through the Batch API (vs. $6.00/1M standard).

How do I estimate whether I need 1M context?
Count your actual tokens with the token counting endpoint (client.messages.count_tokens in the Python SDK, countTokens in TypeScript) before building. Many tasks that seem to require full context can be handled with targeted retrieval. Build the retrieval version first; upgrade to full context only if quality is insufficient.

What is the maximum output token length?
Independent of input context length: 8,192 tokens for most models, 16,000 for Opus 4.7. Input context affects what the model knows, not how much it can generate.

Sources

Anthropic models documentation — April 2026
Claude API pricing — April 2026
Long context best practices — April 2026

Frequently Asked Questions

How much does a 1M token request cost on Claude?

On Claude Opus 4.7, a single 800K-token request costs approximately $7.00 in input: the first 200K tokens at $5/1M = $1.00, and the remaining 600K at $10/1M = $6.00. A 2,000-token response adds about $0.05. On Sonnet 4.6, the same input costs about $4.20 ($0.60 + $3.60). Use prompt caching on repeated queries over the same document to reduce input costs by more than 90%.

Which Claude models support the 1M context window?

Only Claude Sonnet 4.6 and Claude Opus 4.7 support 1M token context. Claude Haiku 4.5 is limited to 200K tokens. The 1M context mode may require enabling via the Anthropic Console for some accounts.

What are the best use cases for Claude's 1M context window?

The highest-value use cases are whole-codebase security audits and architecture reviews, multi-document synthesis (e.g., 200 contracts, 100 research papers), long agent conversation histories requiring full-context synthesis, and large structured data reasoning. Avoid using 1M context when targeted file reads via Grep/Glob can answer the question — it is 4–14x more expensive than standard context.

Does the "lost in the middle" problem affect Claude's 1M context window?

Yes. Performance is strongest at the beginning and end of the context and degrades slightly in the middle for very long inputs. For critical instructions or key documents, place them at the start of your messages array. Anthropic recommends staying under 800K tokens for reliable output quality even when the technical limit is 1M.

Take It Further

Claude API Cost Optimization Masterclass — The practical guide to cutting Claude API costs by 60–90% in production. Model tiering, prompt caching, Batch API, and token compression — with real numbers from 12 optimization scenarios.

PDF guide + Excel cost calculator.

→ Get Cost Optimization Masterclass — $59

30-day money-back guarantee. Instant download.

Running Claude Code across multiple repos without losing context

Sangmin Lee — Sat, 23 May 2026 01:30:02 +0000

Originally published at claudeguide.io/claude-code-workflow-multi-repo

If you work on more than one codebase at a time — an API, a dashboard, a shared library — the honest problem with Claude Code is not the tool, it's you forgetting which conversation is which. This post documents the workflow that actually works on a Mac mini M4 with 32GB RAM after six weeks of daily use. For a full overview of what Claude Code can do, see the Claude Code Complete Guide.

TL;DR

One Claude Code session = one repository. Do not mix.
Use project-scoped CLAUDE.md at the root of each repo to pin context.
Put persistent cross-repo facts in ~/.claude/CLAUDE.md (user-global).
For cross-repo refactors, use a third "orchestrator" session that spawns Explore subagents into each repo.
Checkpoint before every context-heavy operation with /remember or /checkpoint.

Why the naive approach breaks

The first instinct is to open one Claude Code window and cd between projects. This fails for three concrete reasons:

File cache collisions. Claude Code tracks which files you've opened. Switching directories mid-session causes stale path assumptions.
System prompt dilution. Each repo's CLAUDE.md only gets loaded at startup. Switching afterwards means the guidance doesn't reattach.
Conversation contamination. Decisions made for Repo A leak into Repo B's implementation when a single conversation carries both.

We measured the impact over a 2-week A/B split on a 3-repo project:

Workflow	Avg tokens / task	Rework rate	Subjective frustration (1-5)
Single session, `cd` between repos	42,800	31%	4.1
One session per repo, user-global `CLAUDE.md`	18,600	8%	1.8

The one-session-per-repo workflow used 57% fewer tokens and cut the rework rate from 31% to 8%, roughly a fourfold reduction.
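Both headline figures follow directly from the table. A short check, using only the numbers reported above (the variable names are illustrative):

```python
single_session = {"tokens": 42_800, "rework": 0.31}
per_repo = {"tokens": 18_600, "rework": 0.08}

# Fraction of tokens saved by switching to one session per repo.
token_reduction = 1 - per_repo["tokens"] / single_session["tokens"]
# How many times lower the rework rate is.
rework_ratio = single_session["rework"] / per_repo["rework"]

print(f"{token_reduction:.0%} fewer tokens")  # 57% fewer tokens
print(f"{rework_ratio:.1f}x less rework")     # 3.9x less rework
```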

The setup, step by step

1. Write a tight `CLAUDE.md` in each repo

Keep it under 200 lines. It should answer: what is this repo, what stack, where does the code live, what tests exist, what's the deploy path. No aspirational content — only what's true today.

# CLAUDE.md — api-service
- Node 20, TypeScript 5.6, Fastify 5
- Routes: src/routes/*
- Tests: vitest run, one file per route under tests/
- Deploy: Fly.io via `fly deploy` (staging auto on main push)
- Secrets: .env.local (dev) / Fly secrets (prod)

2. Put cross-repo facts in `~/.claude/CLAUDE.md`

This is your shared preamble. Useful entries:

Your preferred commit message style
Tools you have globally (tsx, pnpm, bun)
Platform oddities (Mac mini M4, Apple Silicon specifics)
Recurring project names and what they mean

3. Open one session per repo

Use terminal tabs, iTerm split panes, or Warp workflows — one Claude Code process per repo. Expect 1-3 concurrent at any time. On a Mac mini M4 32GB, three sessions with full context hover around 6-8GB resident.

4. For cross-repo work, spawn an orchestrator

When you have a change that spans repos (say, "rename this API endpoint and update all callers across three frontends"), open a fourth session in a neutral directory and use the Agent tool to dispatch Explore subagents into each repo. Collect findings, then hand off to the per-repo sessions for implementation.

Frequently Asked Questions

Does Claude Code share memory across sessions?

No. Each session has its own conversation. The .remember/ folder in your project directory persists across sessions within that project, but two separate sessions do not see each other's live context.

How big should CLAUDE.md be?

Under 200 lines in the repo; under 100 lines globally. Anything longer will either get ignored or crowd out working memory.

Can I use Claude Desktop and Claude Code simultaneously?

Yes. They do not interfere. Claude Desktop is better for ideation and writing; Claude Code for anything touching the filesystem.

What about git worktrees?

Worktrees work well for the "spawn orchestrator" pattern. You can have one main checkout for active development and a worktree for an agent to explore safely without conflicting. See Worktree Isolation in Claude Code for a step-by-step setup guide.

How many concurrent sessions is too many?

On a Mac mini M4 with 32GB RAM, three full sessions hover at 6–8 GB resident. Four to five starts competing for memory with your other tools. In practice, keep concurrent sessions to three — one per active repo. Open a fourth only for short orchestrator tasks, then close it.

What we got wrong at first

Our first two weeks used a single session with symlinks instead of separate sessions. Token usage was 2.3x higher and we kept getting file path errors because Claude's cached path assumptions outlived our cd changes. Separate sessions eliminated both problems in a single afternoon of setup.

Source data

All measurements in this post come from logs on a Mac mini M4 32GB running macOS 15.4, April 4-18 2026. The repositories were an API (Fastify), a dashboard (Next.js 15), and a shared library (TypeScript). Raw log samples are available on request.

Part of the Claude Code workflow series on claudeguide.io. Disclosure: This site is part of the Biz AI self-investment project. The SaaS we build (claudecosts.app) is linked in our product index but not promoted in this post.

Take It Further

Claude Code Power Prompts 300 — 300 battle-tested prompts for Claude Code, organized by use case. Copy, paste, ship.

40 slash command templates. Token-optimized variants. JSONL file for direct import. Tested in production sessions.

→ Get Claude Code Power Prompts 300 — $29

30-day money-back guarantee. Instant download.

Claude Prompt Caching: When It Pays Off (2026 Break-Even)

Sangmin Lee — Fri, 22 May 2026 15:36:38 +0000

Originally published at claudeguide.io/claude-api-cost-prompt-caching-break-even

Claude prompt caching: when it pays off and when it doesn't (2026 numbers)

Claude prompt caching breaks even at 1.28 reuses for the 5-minute cache and 4 reuses for the 1-hour cache — below those thresholds, you pay 25% more than not caching. Above them, you save up to 90% on input tokens. This post derives the break-even math from 2026 pricing and walks through six real workloads to show where caching wins, breaks even, and loses.

For the complete pricing table this analysis is based on, see Claude API pricing 2026.

The pricing (April 2026)

Per 1M tokens, in USD:

Model	Input	Output	Cache write 5m	Cache write 1h	Cache read
Opus 4.7	$5	$25	$6.25	$10	$0.50
Sonnet 4.6	$3	$15	$3.75	$6	$0.30
Haiku 4.5	$1	$5	$1.25	$2	$0.10

Cache write 5m = 1.25x input price. Cache write 1h = 2x input price. Cache read = 0.1x input price.

The break-even formula

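For a prefix of size P tokens used in N calls (the first call writes the cache, the remaining N - 1 calls read from it):

Without cache: N * P * input_price
With cache: P * cache_write_price + (N - 1) * P * cache_read_price

Caching is cheaper when:

N * P * input_price > P * cache_write_price + (N - 1) * P * cache_read_price

Dividing by P and solving for N gives N > (cache_write_price - cache_read_price) / (input_price - cache_read_price). For the 5-minute cache, where the write costs 1.25x and the read 0.1x the input price, that is N > 1.15 / 0.9, or about 1.28.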
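The threshold can be computed for each model from the pricing table. The sketch below assumes N counts every call that uses the prefix, with the first call paying the cache write and the remaining N - 1 paying the cache read. That counting convention is an assumption, chosen because it reproduces the 1.28 figure quoted at the top of this post. `break_even_calls` is an illustrative name.

```python
def break_even_calls(input_price: float, write_price: float, read_price: float) -> float:
    """Number of calls N at which cached and uncached totals are equal.

    Uncached: N * input_price
    Cached:   write_price + (N - 1) * read_price  (first call writes, the rest read)
    The prefix size P multiplies every term, so it cancels out.
    """
    return (write_price - read_price) / (input_price - read_price)


# 5-minute cache rates from the pricing table, USD per 1M tokens:
# (input, cache write 5m, cache read)
rates = {
    "Opus 4.7": (5.00, 6.25, 0.50),
    "Sonnet 4.6": (3.00, 3.75, 0.30),
    "Haiku 4.5": (1.00, 1.25, 0.10),
}
for model, (input_price, write_5m, read) in rates.items():
    print(model, round(break_even_calls(input_price, write_5m, read), 2))  # 1.28 for each
```

The result is the same for every model because the write and read prices are fixed multiples of the input price.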


Claude Code Skills Explained: What They Are & When to Use Them (2026)

Sangmin Lee — Fri, 22 May 2026 15:35:51 +0000

Originally published at claudeguide.io/claude-code-skills-overview

Claude Code Skills are reusable, AI-callable workflows defined in SKILL.md files in your ~/.claude/skills/ directory. Each skill is auto-discovered by Claude Code, can be invoked via the Skill tool, and replaces what would otherwise be one-off prompts copy-pasted between sessions — turning 30-line instructions into a single /skill-name call. Skills are how power users compress repetitive workflows (code review, deployment, content generation, audit) into reusable building blocks. If you find yourself pasting the same instructions into Claude Code multiple times per week, a skill replaces them.

This guide covers the model: what counts as a skill, how Claude discovers them, when to use a skill vs a slash command vs a CLAUDE.md instruction, and what's already in the public skill library.

What is a Skill, exactly?

A skill is a markdown file (SKILL.md) with frontmatter that Claude Code can invoke as a tool. It contains:

YAML frontmatter — name, description, when to invoke
Markdown body — step-by-step instructions Claude follows
Optional helper files — scripts, templates, references

Example skill at ~/.claude/skills/deploy/SKILL.md:

---
name: deploy
description: Ship the current branch to production. Use when the user says "ship it", "deploy", "go live", or after PR merge confirmation.
---

# Deploy current branch

1. Run `bun run build` — abort if exit code != 0
2. Run `vercel deploy --prod --yes`
3. Wait for "Aliased: ..." line, confirm HTTP 200
4. Submit IndexNow with any new content slugs
5. Report deployment URL + git SHA + line count of changed files

Now in any Claude Code session, you can say "ship it" and Claude invokes this skill instead of asking you to clarify the steps.

Skills vs Slash Commands vs CLAUDE.md

These three serve overlapping purposes. Picking right is half the battle:

Use case	Choice
Repeatable multi-step workflow ("deploy", "audit", "review")	Skill
One-shot UI command (Claude built-in like `/clear`, `/exit`)	Slash command
Project-specific context (stack, conventions, commands)	CLAUDE.md

Rule of thumb: if you'd write "follow these steps every time" in CLAUDE.md, make it a skill instead. Skills are invoked on demand; CLAUDE.md loads every session.

When to use a Skill

Good fits

Deployment workflows — deploy, rollback, canary (see Claude Code Hooks Deep Dive for hook patterns)
Audit/review — aeo-audit, quality-gate, security-scan
Content generation — write-blog-post, generate-changelog, summarize-pr
Investigation — investigate-bug, find-related-tests, trace-deployment
Setup — init-project, setup-deploy, configure-monitoring (pairs well with Claude Code memory system)

Bad fits

Trivial one-liners — "format this file" doesn't need a skill, just say it
Highly variable workflows — if every invocation needs different parameters, a skill becomes brittle
Project-specific commands — those belong in CLAUDE.md, not a global skill

The Public Skill Library

Skills published by Anthropic and the community install into ~/.claude/skills/. Notable examples:

gstack — full development stack (review, ship, qa, investigate)
superpowers — TDD, debugging, code review patterns
engineering — debug, system-design, deploy-checklist
design — design-review, accessibility-check
finance — reconciliation, journal-entry-prep

You can browse installed skills with ls ~/.claude/skills/. Each directory contains a SKILL.md defining when Claude should invoke that skill.

How Claude Discovers Skills

At session start, Claude Code lists all available skills with their descriptions. When you say "ship it" or "deploy this", Claude matches your intent against skill descriptions and invokes the best match.

If you say something ambiguous ("update the docs"), Claude either picks the best match silently or asks "which skill should I use?". Specific phrasing in the user prompt triggers faster matching.

ls ~/.claude/skills/
# deploy/  audit/  review/  investigate/  setup-deploy/
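To see the descriptions Claude would be matching against, you can read the frontmatter of each installed skill yourself. This is a rough sketch that assumes the `<skill>/SKILL.md` layout described above. `parse_frontmatter` is a deliberately minimal illustrative parser for flat `key: value` fields, not a full YAML reader and not how Claude Code itself loads skills.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Read simple `key: value` pairs between the leading `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Split on the first colon only, so descriptions may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def list_skills(skills_dir: Path) -> dict[str, str]:
    """Map each skill name to its description for every <dir>/SKILL.md found."""
    skills = {}
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        meta = parse_frontmatter(skill_file.read_text(encoding="utf-8"))
        skills[meta.get("name", skill_file.parent.name)] = meta.get("description", "")
    return skills


for name, description in list_skills(Path.home() / ".claude" / "skills").items():
    print(f"{name}: {description}")
```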

Building Your First Skill

The minimum viable skill is 5 lines:

---
name: bun-test
description: Run bun test on the current package. Use when user says "test it", "run tests", or after editing test files.
---

Run `bun test` and report results. If any tests fail, show the failure output and stop.

Save as ~/.claude/skills/bun-test/SKILL.md. Restart Claude Code. Done.

For a deeper guide on building skills with arguments, helper scripts, and conditional logic, see How to Build a Custom Claude Code Skill.

Skills with Arguments

Skills can accept arguments via $ARGUMENTS:


---
name: trace
description: Trace a deployment by its SHA. Usage: /trace <sha
