<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Hari Venkata Krishna Kotha</title>
    <description>The latest articles on Forem by Hari Venkata Krishna Kotha (@harivenkatakrishnakotha).</description>
    <link>https://forem.com/harivenkatakrishnakotha</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3764547%2F6326c2d9-5539-4d51-b87a-7282fa744387.jpeg</url>
      <title>Forem: Hari Venkata Krishna Kotha</title>
      <link>https://forem.com/harivenkatakrishnakotha</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/harivenkatakrishnakotha"/>
    <language>en</language>
    <item>
      <title>RTK, Model Routing, and the Community Tools That Actually Work With Claude Code</title>
      <dc:creator>Hari Venkata Krishna Kotha</dc:creator>
      <pubDate>Tue, 07 Apr 2026 13:00:22 +0000</pubDate>
      <link>https://forem.com/harivenkatakrishnakotha/rtk-model-routing-and-the-community-tools-that-actually-work-with-claude-code-3pmh</link>
      <guid>https://forem.com/harivenkatakrishnakotha/rtk-model-routing-and-the-community-tools-that-actually-work-with-claude-code-3pmh</guid>
      <description>&lt;p&gt;&lt;em&gt;This is Part 2 of a series on getting more out of Claude Code. &lt;a href="https://dev.to/harivenkatakrishnakotha/how-i-cut-claude-codes-token-overhead-by-44-and-stopped-hitting-usage-limits-mid-session-3fkf"&gt;Part 1&lt;/a&gt; covered the 50,000 token overhead problem, the 44% reduction fix, and the memory/lessons.md system.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In Part 1, I mentioned RTK saved me 60-90% on tool output tokens. This post goes deeper: how RTK actually works under the hood, the difference between Unix and Windows installations, model routing for subagents, environment variables for cost control, and 7 community tools I tested (most of which I didn't end up using).&lt;/p&gt;

&lt;h2&gt;
  
  
  RTK: How It Actually Works
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/rtk-ai/rtk" rel="noopener noreferrer"&gt;RTK (Rust Token Killer)&lt;/a&gt; is a Rust-based CLI proxy that intercepts shell commands, runs them, and compresses the output before it reaches your AI tool's context window. It supports 10+ AI coding tools including Claude Code, GitHub Copilot, Cursor, Gemini CLI, Codex, Windsurf, Cline, and OpenCode, but this post focuses on Claude Code.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Version note:&lt;/strong&gt; RTK is actively developed. The latest release is v0.35.0 (April 6, 2026), which expanded AWS CLI filters. I'm running v0.34.2 in this post — features and exact command output may differ slightly in newer versions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;RTK applies four optimization strategies to every CLI command output before it enters your context window:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw Output (5,000 tokens)
    ↓
Smart Filtering (remove ANSI codes, spinner artifacts, progress bars)
    ↓
Grouping (consolidate related output lines)
    ↓
Deduplication (collapse repeated patterns like passing tests)
    ↓
Truncation (keep errors/warnings, trim verbose success output)
    ↓
Filtered Output (500-2,000 tokens)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why This Matters More Than You Think: The Re-Read Tax
&lt;/h3&gt;

&lt;p&gt;This is the concept that changed how I think about Claude Code optimization.&lt;/p&gt;

&lt;p&gt;When Claude runs a command, the output stays in context. On the next turn, Claude re-reads ALL prior context, including every command output from earlier in the session. Then on the turn after that, it re-reads everything again.&lt;/p&gt;

&lt;p&gt;Here's the math. Say you run &lt;code&gt;git diff&lt;/code&gt; and it produces 2,000 tokens of output. Over a 10-turn conversation after that command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Turn 1: 2,000 tokens read
Turn 2: 2,000 tokens re-read
Turn 3: 2,000 tokens re-read
...
Turn 10: 2,000 tokens re-read
Total: 20,000 tokens consumed from one command
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With RTK compressing that diff to 800 tokens (a 60% reduction):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total: 8,000 tokens instead of 20,000
Savings: 12,000 tokens from a single command
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now multiply across 80+ commands in a real coding session. From my actual work building a .NET 10 Blazor application: 80 RTK commands, 152K input tokens, 39K output tokens, &lt;strong&gt;113.6K tokens saved at 74.6% efficiency&lt;/strong&gt;. The re-read savings compound on top of that — each saved token gets re-read on every subsequent turn, so the actual context reduction is a multiple of the direct savings.&lt;/p&gt;
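&lt;p&gt;That compounding is easy to sanity-check in a shell. The numbers below are the illustrative figures from this section, not measurements:&lt;/p&gt;

```shell
# Re-read tax: one 2,000-token command output re-read over 10 turns
raw=2000       # tokens in the unfiltered output
filtered=800   # tokens after RTK compression
turns=10       # turns remaining in the session

echo "raw total:      $(( raw * turns ))"             # tokens consumed unfiltered
echo "filtered total: $(( filtered * turns ))"        # tokens consumed with RTK
echo "saved:          $(( (raw - filtered) * turns ))"
```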

&lt;h3&gt;
  
  
  Unix vs Windows: Two Different Integration Models
&lt;/h3&gt;

&lt;p&gt;This is something the README doesn't make obvious. RTK works fundamentally differently depending on your OS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unix (macOS/Linux) uses Hook Mode:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;How it works:
1. RTK installs a PreToolUse hook in Claude Code's hooks system
2. When Claude runs any Bash command, the hook rewrites the command BEFORE execution
   (e.g., git status becomes rtk git status)
3. RTK filters the output transparently
4. Claude doesn't know RTK exists

Token overhead: 0
Setup: rtk init -g --hook-only
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--hook-only&lt;/code&gt; flag is important. Without it, RTK also creates an RTK.md file with instructions for Claude. But since the hook works transparently (Claude doesn't need to know about RTK), that file adds unnecessary per-turn overhead for zero benefit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Windows uses CLAUDE.md Mode (the only option on Windows):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;How it works:
1. RTK adds instructions to ~/.claude/CLAUDE.md
2. These instructions tell Claude: "prefix all Bash commands with rtk"
3. Claude reads the instructions every turn and writes: rtk git status
4. RTK binary filters the output

Token overhead: the CLAUDE.md instructions add some per-turn overhead
Setup: rtk init -g --claude-md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Windows can't use hook mode. When you run &lt;code&gt;rtk init -g&lt;/code&gt; on Windows, RTK explicitly tells you "Hook-based mode requires Unix (macOS/Linux)" and falls back to &lt;code&gt;--claude-md&lt;/code&gt; automatically. Note that &lt;code&gt;--claude-md&lt;/code&gt; is now labeled "legacy mode" in the latest RTK help text (v0.34+), but on Windows it remains the only working option.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is the CLAUDE.md overhead worth it on Windows?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. A single &lt;code&gt;rtk git diff&lt;/code&gt; typically saves more tokens than the instructions cost. A single &lt;code&gt;rtk pytest&lt;/code&gt; can save thousands of tokens. The overhead pays for itself on your first filtered command, and every command after that is pure savings.&lt;/p&gt;
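&lt;p&gt;A rough break-even sketch makes this concrete. Both figures here are assumed round numbers for illustration (I haven't measured the exact instruction cost):&lt;/p&gt;

```shell
# Assumed figures (illustration only, not measured):
overhead_per_turn=300   # tokens the CLAUDE.md instructions might cost each turn
turns=20                # turns in the session
diff_savings=1200       # tokens one filtered git diff saves (2,000 -> 800)

total_overhead=$(( overhead_per_turn * turns ))
# One diff's savings also compound: the smaller output is re-read each later turn
compounded=$(( diff_savings * turns ))

echo "instruction overhead: $total_overhead"   # 6000
echo "one diff, compounded: $compounded"       # 24000
```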

&lt;h3&gt;
  
  
  Installing RTK on Windows: Step by Step
&lt;/h3&gt;

&lt;p&gt;This is what I actually did. Recording it because several things aren't obvious from the docs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Step 1: Install RTK&lt;/span&gt;
&lt;span class="c"&gt;# Option A: Homebrew (macOS)&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;rtk

&lt;span class="c"&gt;# Option B: Curl installer (macOS/Linux)&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/rtk-ai/rtk/refs/heads/master/install.sh | sh

&lt;span class="c"&gt;# Option C: Cargo (Windows — use Git Bash, not PowerShell)&lt;/span&gt;
cargo &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--git&lt;/span&gt; https://github.com/rtk-ai/rtk

&lt;span class="c"&gt;# Step 2: Find where cargo put the binary (Windows only)&lt;/span&gt;
&lt;span class="c"&gt;# Usually: C:\Users\&amp;lt;username&amp;gt;\.cargo\bin\rtk.exe&lt;/span&gt;
&lt;span class="c"&gt;# Add this to your system PATH if it's not already&lt;/span&gt;

&lt;span class="c"&gt;# Step 3: Initialize for Claude Code&lt;/span&gt;
rtk init &lt;span class="nt"&gt;-g&lt;/span&gt; &lt;span class="nt"&gt;--claude-md&lt;/span&gt;

&lt;span class="c"&gt;# Step 4: Verify it works&lt;/span&gt;
rtk &lt;span class="nt"&gt;--version&lt;/span&gt;
rtk git status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Things that tripped me up:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cargo install rtk&lt;/code&gt; (without the git URL) installs the wrong package (Rust Type Kit, a completely different tool). Always use the full git URL.&lt;/li&gt;
&lt;li&gt;Run from Git Bash, not native PowerShell. Some RTK shell integrations assume bash.&lt;/li&gt;
&lt;li&gt;If you use VS Code's integrated terminal, make sure it's set to Git Bash, not PowerShell.&lt;/li&gt;
&lt;li&gt;The binary path needs to be in your PATH environment variable for Claude Code to find it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  RTK Configuration
&lt;/h3&gt;

&lt;p&gt;RTK stores config at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Windows:&lt;/strong&gt; &lt;code&gt;%APPDATA%\rtk\config.toml&lt;/code&gt; (or &lt;code&gt;~/.config/rtk/config.toml&lt;/code&gt; in Git Bash)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;macOS/Linux:&lt;/strong&gt; &lt;code&gt;~/.config/rtk/config.toml&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two settings worth knowing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# Exclude specific commands from filtering&lt;/span&gt;
&lt;span class="c"&gt;# (if RTK strips output you actually need to see)&lt;/span&gt;
&lt;span class="nn"&gt;[hooks]&lt;/span&gt;
&lt;span class="py"&gt;exclude&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"some-command-that-needs-raw-output"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c"&gt;# Tee: saves raw output when commands fail&lt;/span&gt;
&lt;span class="c"&gt;# Your safety net if RTK strips a critical error message&lt;/span&gt;
&lt;span class="nn"&gt;[tee]&lt;/span&gt;
&lt;span class="py"&gt;enabled&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="py"&gt;rotation_limit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tee feature is like a flight recorder on an airplane. During normal operation, you never need it. But if RTK strips a critical error and Claude misses a bug, you can recover the unfiltered output.&lt;/p&gt;

&lt;h3&gt;
  
  
  Measuring Your Savings
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Cumulative savings across all sessions&lt;/span&gt;
rtk gain

&lt;span class="c"&gt;# Per-command breakdown&lt;/span&gt;
rtk gain &lt;span class="nt"&gt;--history&lt;/span&gt;

&lt;span class="c"&gt;# Find commands you ran WITHOUT rtk that could have been filtered&lt;/span&gt;
rtk discover
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's the actual &lt;code&gt;rtk gain&lt;/code&gt; output from my work laptop while building a .NET 10 Blazor application:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs40d61r51xlqnrgo9ld9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs40d61r51xlqnrgo9ld9.png" alt="RTK gain output showing 80 commands, 113.6K tokens saved at 74.6% efficiency, with rtk dotnet test as the top filter at 99.1% savings across 19 runs"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;74.6% efficiency across 80 commands. 113,600 tokens saved.&lt;/strong&gt; The &lt;code&gt;rtk dotnet test&lt;/code&gt; filter alone saved 108K tokens across 19 runs. &lt;code&gt;dotnet test&lt;/code&gt; output is verbose by default (test discovery, build output, individual test results, summary), and RTK strips it down to just failures and counts.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;rtk discover&lt;/code&gt; command is the most useful one when you're starting out. It scans your session logs and lists the commands you ran without the &lt;code&gt;rtk&lt;/code&gt; prefix that could have been filtered: in other words, your missed savings.&lt;/p&gt;

&lt;h3&gt;
  
  
  Commands Worth Knowing
&lt;/h3&gt;

&lt;p&gt;A few commands that aren't in the basic README but are useful:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Show your RTK adoption across recent Claude Code sessions&lt;/span&gt;
rtk session

&lt;span class="c"&gt;# Claude Code spending vs RTK savings analysis&lt;/span&gt;
rtk cc-economics

&lt;span class="c"&gt;# Filter for .NET commands (build, test, restore, format)&lt;/span&gt;
rtk dotnet &lt;span class="nb"&gt;test
&lt;/span&gt;rtk dotnet build

&lt;span class="c"&gt;# Learn CLI corrections from your error history&lt;/span&gt;
rtk learn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;rtk dotnet&lt;/code&gt; filter is the one that produced 99% savings on my tests. If you're a .NET developer, that filter alone justifies the install. There are similar specialized filters for Cargo, Vitest, Pytest, Playwright, Prettier, Prisma, Next.js, ESLint, TypeScript, Docker, kubectl, and over 100 commands in total.&lt;/p&gt;

&lt;h3&gt;
  
  
  When RTK Shines vs When It Doesn't
&lt;/h3&gt;

&lt;p&gt;This is the most important thing to understand about RTK, and nobody talks about it: &lt;strong&gt;RTK only intercepts Bash commands.&lt;/strong&gt; Claude Code's built-in tools (Read, Write, Edit, Grep, Glob, WebFetch, WebSearch) bypass Bash entirely and never touch RTK.&lt;/p&gt;

&lt;p&gt;In a typical Claude Code session, you might run 5-10 Bash commands vs 50-100 dedicated tool calls. If your session is mostly Read/Edit/Grep operations, RTK savings will be minimal — not because RTK is broken, but because there's nothing for it to intercept.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RTK shines in sessions where Bash is heavily used:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Running builds: &lt;code&gt;rtk dotnet build&lt;/code&gt;, &lt;code&gt;rtk cargo build&lt;/code&gt;, &lt;code&gt;rtk next build&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Running tests: &lt;code&gt;rtk dotnet test&lt;/code&gt;, &lt;code&gt;rtk vitest run&lt;/code&gt;, &lt;code&gt;rtk pytest&lt;/code&gt;, &lt;code&gt;rtk playwright test&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Git operations: &lt;code&gt;rtk git diff&lt;/code&gt;, &lt;code&gt;rtk git log&lt;/code&gt;, &lt;code&gt;rtk git status&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Package managers: &lt;code&gt;rtk pnpm install&lt;/code&gt;, &lt;code&gt;rtk npm run build&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Docker/K8s: &lt;code&gt;rtk docker ps&lt;/code&gt;, &lt;code&gt;rtk kubectl get pods&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is exactly what my work data showed: 80 commands, 74.6% efficiency, and the biggest savings came from &lt;code&gt;rtk dotnet test&lt;/code&gt; (99% reduction across 19 runs). When I'm building features and running test suites repeatedly, RTK saves real tokens. When I'm in a code review session reading files and editing inline, RTK has nothing to do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sessions where RTK savings are minimal:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conversation-heavy sessions (design discussions, explanations)&lt;/li&gt;
&lt;li&gt;Code review sessions (mostly Read/Edit dedicated tools)&lt;/li&gt;
&lt;li&gt;File search and exploration (Grep/Glob dedicated tools)&lt;/li&gt;
&lt;li&gt;Very short sessions (1-3 turns) — the re-read tax hasn't compounded yet&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't a bug. It's a fundamental architecture choice. If you're optimizing token usage, install RTK AND make sure you're using dedicated tools instead of &lt;code&gt;cat&lt;/code&gt;/&lt;code&gt;head&lt;/code&gt;/&lt;code&gt;find&lt;/code&gt;/&lt;code&gt;grep&lt;/code&gt; via Bash. Both matter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Routing: Stop Burning Opus Tokens on File Searches
&lt;/h2&gt;

&lt;p&gt;If you're on Opus (or even Sonnet), every subagent Claude spawns runs on the same model by default. That means when Claude kicks off a code-reviewer agent, an exploration search, or a simple git status check through a subagent, it burns your most expensive tokens.&lt;/p&gt;

&lt;p&gt;The fix is adding model routing rules to your global rules files. I created a &lt;code&gt;performance.md&lt;/code&gt; in &lt;code&gt;~/.claude/rules/common/&lt;/code&gt; with explicit model assignments:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Haiku for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;File search, grep, glob, codebase exploration&lt;/li&gt;
&lt;li&gt;Summarizing search results or documentation&lt;/li&gt;
&lt;li&gt;Simple formatting, renaming, mechanical edits&lt;/li&gt;
&lt;li&gt;Reading and reporting file contents&lt;/li&gt;
&lt;li&gt;Git status checks, log summaries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use Sonnet for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code generation, implementation, refactoring&lt;/li&gt;
&lt;li&gt;Code review&lt;/li&gt;
&lt;li&gt;Test writing&lt;/li&gt;
&lt;li&gt;Build error fixing&lt;/li&gt;
&lt;li&gt;Planning and documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use Opus only for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Architecture decisions requiring multi-system reasoning&lt;/li&gt;
&lt;li&gt;Deep debugging across 5+ files with complex interactions&lt;/li&gt;
&lt;li&gt;Multi-dimensional analysis tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The rule file sets the default subagent model to Sonnet and lists specific overrides. Claude Code reads this on every session and applies the routing automatically when spawning subagents with the &lt;code&gt;model&lt;/code&gt; parameter.&lt;/p&gt;

&lt;p&gt;This doesn't change your main conversation model. It only affects subagents. But subagents can account for a significant portion of token usage in complex sessions, especially when Claude spawns multiple exploration or review agents.&lt;/p&gt;
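&lt;p&gt;My actual &lt;code&gt;performance.md&lt;/code&gt; is longer, but a minimal version of the routing section looks roughly like this. The wording is mine; Claude Code reads it as plain instructions, so the exact phrasing is up to you:&lt;/p&gt;

```markdown
# Subagent Model Routing

Default subagent model: sonnet.

When spawning a subagent with the Task tool, set the `model` parameter:

- haiku: file search, grep/glob exploration, summarizing results,
  mechanical edits, git status/log checks
- sonnet: code generation, refactoring, code review, test writing,
  build-error fixing, planning and documentation
- opus: only for architecture decisions spanning multiple systems or
  deep debugging across 5+ files
```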

&lt;h2&gt;
  
  
  Environment Variable Worth Setting
&lt;/h2&gt;

&lt;p&gt;One variable that gives you cost control without changing your workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Cap extended thinking tokens (default is 31,999 which can be excessive)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MAX_THINKING_TOKENS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10000

&lt;span class="c"&gt;# These go in your shell profile (~/.bashrc, ~/.zshrc,&lt;/span&gt;
&lt;span class="c"&gt;# or Windows environment variables)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;MAX_THINKING_TOKENS&lt;/code&gt; caps Claude's extended thinking, which can otherwise use up to 32K tokens of internal reasoning before responding. For most tasks, 10K is more than enough; the default is generous and burns tokens on over-analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  7 Community Tools I Tested (And Why I Kept Only 2)
&lt;/h2&gt;

&lt;p&gt;I dug into seven community tools that claim to enhance Claude Code. Here's the honest breakdown:&lt;/p&gt;

&lt;h3&gt;
  
  
  Tools I Kept
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. RTK (Rust Token Killer)&lt;/strong&gt; — Already covered above. The single most impactful optimization tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. lessons.md Pattern (from CCO/Claude Code Optimization)&lt;/strong&gt; — Not really a "tool," but a methodology. Keep a lessons.md file in each project, write a rule every time you correct Claude. Simple, effective, zero overhead. Covered in &lt;a href="https://dev.to/harivenkatakrishnakotha/how-i-cut-claude-codes-token-overhead-by-44-and-stopped-hitting-usage-limits-mid-session-3fkf"&gt;Part 1&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tools I Evaluated and Skipped
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;3. claude-mem (Memory Manager)&lt;/strong&gt;&lt;br&gt;
Promises persistent memory across sessions via an embedded vector database. Sounds great in theory. Concerns I found during evaluation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Has reported Windows compatibility issues, plus a multi-GB ONNX model download requirement&lt;/li&gt;
&lt;li&gt;The built-in memory system in &lt;code&gt;~/.claude/projects/&amp;lt;project&amp;gt;/memory/&lt;/code&gt; already handles persistent memory with simple markdown files, no vector DB needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verdict:&lt;/strong&gt; Skip on Windows. Linux/Mac users may have a smoother experience.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4. CCO (Claude Code Optimizer)&lt;/strong&gt;&lt;br&gt;
A package of configuration files (skills, rules, agents) designed for Claude Code. The self-improvement loop pattern (lessons.md) is genuinely useful and I adopted it. But the rest of the configuration overlapped heavily with what I already had from Everything Claude Code.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verdict:&lt;/strong&gt; Adopt the lessons.md pattern. Skip the rest if you already have ECC.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;5. Superinterface / CLine / Similar IDE Extensions&lt;/strong&gt;&lt;br&gt;
Various tools that wrap Claude Code with additional UI. The problem: Claude Code already works well in the terminal and VS Code. Adding another layer introduces latency, potential conflicts, and more things that can break.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verdict:&lt;/strong&gt; Unnecessary complexity for most workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;6. Custom MCP Servers for Token Tracking&lt;/strong&gt;&lt;br&gt;
Some community members built MCP servers that track token usage per conversation. Interesting idea, but RTK's &lt;code&gt;rtk gain&lt;/code&gt; command already gives you this data without the setup overhead.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verdict:&lt;/strong&gt; RTK covers this use case.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;7. Automated Session Management Tools&lt;/strong&gt;&lt;br&gt;
Tools that auto-compact, auto-checkpoint, or auto-restart sessions. The problem is they make assumptions about when you want to compact or restart. Claude Code's built-in compaction (with the &lt;code&gt;strategic-compact&lt;/code&gt; skill nudging you at good breakpoints) worked better for me than automated approaches.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verdict:&lt;/strong&gt; Use the &lt;code&gt;strategic-compact&lt;/code&gt; skill instead.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Pattern
&lt;/h3&gt;

&lt;p&gt;Most community tools try to solve problems that Claude Code already handles, just not obviously. Before installing any third-party tool, check if there's a built-in feature, a rule file, or a skill that does the same thing with less overhead.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Complete Optimization Stack
&lt;/h2&gt;

&lt;p&gt;Here's everything I run, in priority order:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;Token Impact&lt;/th&gt;
&lt;th&gt;Setup Time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;RTK&lt;/td&gt;
&lt;td&gt;60-90% tool output savings&lt;/td&gt;
&lt;td&gt;30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Environment variables (MAX_THINKING_TOKENS)&lt;/td&gt;
&lt;td&gt;Caps runaway thinking&lt;/td&gt;
&lt;td&gt;10 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Skills audit (global vs project-level)&lt;/td&gt;
&lt;td&gt;Frees 74% of skill overhead&lt;/td&gt;
&lt;td&gt;15 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Model routing rules&lt;/td&gt;
&lt;td&gt;Routes subagents to cheaper models&lt;/td&gt;
&lt;td&gt;10 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Memory system (user + feedback files)&lt;/td&gt;
&lt;td&gt;Smarter responses across sessions&lt;/td&gt;
&lt;td&gt;10 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;lessons.md file&lt;/td&gt;
&lt;td&gt;Permanent mistake prevention&lt;/td&gt;
&lt;td&gt;30 seconds to create&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Total setup time: under 30 minutes. The compound savings across a week of coding sessions add up fast.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 1 covered the &lt;a href="https://dev.to/harivenkatakrishnakotha/how-i-cut-claude-codes-token-overhead-by-44-and-stopped-hitting-usage-limits-mid-session-3fkf"&gt;token overhead problem and the 44% fix&lt;/a&gt;. Parts 3 and 4 (Skills.sh ecosystem guide and curated skills by category) are in the works.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>claude</category>
      <category>webdev</category>
    </item>
    <item>
      <title>How I cut Claude Code's token overhead by 44% and stopped hitting usage limits mid-session.</title>
      <dc:creator>Hari Venkata Krishna Kotha</dc:creator>
      <pubDate>Tue, 24 Mar 2026 13:41:55 +0000</pubDate>
      <link>https://forem.com/harivenkatakrishnakotha/how-i-cut-claude-codes-token-overhead-by-44-and-stopped-hitting-usage-limits-mid-session-3fkf</link>
      <guid>https://forem.com/harivenkatakrishnakotha/how-i-cut-claude-codes-token-overhead-by-44-and-stopped-hitting-usage-limits-mid-session-3fkf</guid>
      <description>&lt;p&gt;I'm on a paid Claude Code plan. A few weeks ago, I noticed my usage limits were hitting way faster than expected. I wasn't doing anything unusual, just regular development work. But Claude kept running out of context mid-conversation, forgetting things I'd said 10 messages ago, and compacting earlier than it should. (Compaction is when Claude Code summarizes earlier messages to free up context space. When it happens too early, you lose nuance and detail from earlier in the conversation.)&lt;/p&gt;

&lt;p&gt;I went looking for answers. LinkedIn, Dev.to, Instagram, Reddit. Most articles said the same things, and honestly, half of them were copies of each other. Token reduction tips, useful skills lists, prompt tricks. I decided to stop bookmarking and start testing. Tried every method I came across, measured the results, and kept what actually worked.&lt;/p&gt;

&lt;p&gt;Here's what I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 50,000 Token Problem You Don't Know You Have
&lt;/h2&gt;

&lt;p&gt;When you install skills in Claude Code, their metadata loads into your context window on every single message. And when a skill's trigger matches your prompt, the full content loads too. The more skills you have installed, the more metadata overhead you carry per turn, and the more likely full skill content gets pulled in during a busy session.&lt;/p&gt;

&lt;p&gt;I came across the &lt;a href="https://github.com/affaan-m/everything-claude-code" rel="noopener noreferrer"&gt;Everything Claude Code&lt;/a&gt; repository and was honestly amazed. Skills, agents, commands, rules, all packaged together. So I did what most people would do: installed everything globally.&lt;/p&gt;

&lt;p&gt;That was a mistake.&lt;/p&gt;

&lt;p&gt;Here's what my setup looked like before I realized the problem:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Component          Size       Estimated Tokens
Skills (global)    196KB      ~50,000
Agent definitions  58KB       ~15,000
Command files      142KB      ~36,000
Rule files         9KB        ~2,000
TOTAL              405KB      ~103,000 tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(Rough estimate: 1KB of text ≈ 250 tokens. Not all of this loads on every turn because skills use progressive disclosure, loading only metadata first and full content when triggered. But the potential overhead is still massive, and in practice, a busy session triggers many of them.)&lt;/p&gt;

&lt;p&gt;Over 100,000 tokens of potential overhead sitting in my setup. That's a significant chunk of Claude's context window spent on instructions, most of which weren't relevant to what I was doing at that moment.&lt;/p&gt;

&lt;p&gt;No wonder my conversations were getting compacted early. No wonder Claude was "forgetting" things. There wasn't enough room left for the actual work.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Check Your Own Overhead
&lt;/h2&gt;

&lt;p&gt;Before you do anything else, run this in your terminal (Windows users: use Git Bash, not PowerShell):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;du&lt;/span&gt; &lt;span class="nt"&gt;-sh&lt;/span&gt; ~/.claude/skills/ ~/.claude/agents/ ~/.claude/commands/ ~/.claude/rules/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Reading your results:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each line shows the size of a directory. Add them up for your total overhead.&lt;/p&gt;

&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;144K    /Users/you/.claude/skills/
76K     /Users/you/.claude/agents/
172K    /Users/you/.claude/commands/
9K      /Users/you/.claude/rules/
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's 401KB total. To estimate tokens, multiply your total KB by 250 (1KB ≈ 250 tokens). So 401KB ≈ 100,000 tokens of potential overhead. Not all of it loads every turn (skills use progressive disclosure), but the more skills you have, the more likely multiple will trigger and load fully during a session.&lt;/p&gt;
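&lt;p&gt;If you want the estimate inline, the conversion is trivial to script (250 tokens/KB is the same rough ratio used above):&lt;/p&gt;

```shell
# Rough token estimate from the du sizes above (1KB of text ~ 250 tokens)
total_kb=$(( 144 + 76 + 172 + 9 ))
echo "total KB:         $total_kb"             # 401
echo "estimated tokens: $(( total_kb * 250 ))" # 100250
```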

&lt;p&gt;If your skills directory alone is over 100KB, you're almost certainly carrying skills you don't use in most projects.&lt;/p&gt;

&lt;p&gt;For context, my setup was 405KB before I touched anything. After moving domain-specific skills to project level and cleaning up unused agents, it dropped to 232KB. Same capabilities, 44% less overhead.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fix: 44% Reduction in One Afternoon
&lt;/h2&gt;

&lt;p&gt;The principle is simple: only keep things globally that you use in 80%+ of your projects. Everything else goes to project level, where it only loads when you're working in that specific project.&lt;/p&gt;

&lt;p&gt;I went from 20 global skills down to 6. The other 14 moved to the projects that actually needed them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Component          Before     After      Saved
Skills (global)    196KB      51KB       145KB (74% reduction)
Agent definitions  58KB       52KB       6KB
Command files      142KB      120KB      22KB
Rule files         9KB        9KB        0KB (modified, not reduced)
TOTAL              405KB      232KB      173KB (~44% reduction)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What I kept globally (the skills I use in every project):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Coding standards (applies to every language)&lt;/li&gt;
&lt;li&gt;Security review (should check this everywhere)&lt;/li&gt;
&lt;li&gt;TDD workflow (I practice TDD daily)&lt;/li&gt;
&lt;li&gt;Verification loop (prevents claiming things are done before checking)&lt;/li&gt;
&lt;li&gt;Strategic compaction (suggests when to compact context manually)&lt;/li&gt;
&lt;li&gt;Continuous learning (tracks patterns across sessions)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What I moved to project level:&lt;/strong&gt;&lt;br&gt;
Docker patterns, Python patterns, React patterns, e2e testing, eval harness, iterative retrieval, full-stack patterns, and several others. These are useful but only in specific projects. Loading Docker patterns while I'm writing documentation is pure waste.&lt;/p&gt;
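&lt;p&gt;Moving a skill is just moving a directory from the global location into the project's &lt;code&gt;.claude/skills/&lt;/code&gt; folder. A sketch (the skill names here are examples; substitute your own):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Run from the root of the project that actually needs the skill
mkdir -p .claude/skills
mv ~/.claude/skills/docker-patterns .claude/skills/
mv ~/.claude/skills/react-patterns  .claude/skills/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;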

&lt;p&gt;The difference was immediate. Conversations lasted longer before compaction. Claude held context from earlier in the session. Fewer "I don't have context on that" moments.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Tool Output Problem Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Most optimization advice focuses on what's loaded at the start of a conversation: skills, rules, CLAUDE.md. But there's another source of token waste that's just as big, and almost nobody mentions it.&lt;/p&gt;

&lt;p&gt;Every time Claude runs a CLI command (&lt;code&gt;git status&lt;/code&gt;, &lt;code&gt;npm test&lt;/code&gt;, a build command), the raw output gets dumped into the context window. And here's the thing most people miss: &lt;strong&gt;that output gets re-read on every subsequent turn&lt;/strong&gt;. It doesn't disappear.&lt;/p&gt;

&lt;p&gt;Think about it this way. You ask Claude to run your test suite. The output is 5,000 tokens. 4,950 of those tokens are passing tests. 50 tokens are the actual failures you care about. But all 5,000 tokens sit in context and get re-read on turn 2, turn 3, turn 4, and every turn after.&lt;/p&gt;

&lt;p&gt;Over a 20-turn session with 50 tool calls, you can easily accumulate 100,000+ tokens of tool output. Most of it noise.&lt;/p&gt;
&lt;h2&gt;
  
  
  RTK: The Token Saver That Actually Made a Difference
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/rtk-ai/rtk" rel="noopener noreferrer"&gt;RTK (Rust Token Killer)&lt;/a&gt; is an open-source tool that filters CLI output before it enters Claude's context window. It applies four optimization passes: smart filtering (removes noise), grouping (aggregates similar items like errors by type), truncation (keeps relevant context, cuts redundancy), and deduplication (collapses repeated log lines with counts).&lt;/p&gt;

&lt;p&gt;Real savings from my sessions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command Category&lt;/th&gt;
&lt;th&gt;Example Commands&lt;/th&gt;
&lt;th&gt;Token Savings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Build output&lt;/td&gt;
&lt;td&gt;cargo build, tsc, next build&lt;/td&gt;
&lt;td&gt;80-90%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test output&lt;/td&gt;
&lt;td&gt;vitest, pytest, playwright&lt;/td&gt;
&lt;td&gt;90-99%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Git operations&lt;/td&gt;
&lt;td&gt;git status, git diff, git log&lt;/td&gt;
&lt;td&gt;59-80%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File listings&lt;/td&gt;
&lt;td&gt;ls, find, grep&lt;/td&gt;
&lt;td&gt;60-75%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The way I explain it to people: imagine you ask a librarian to check something. Without RTK, the librarian carries back the entire bookshelf, drops it on your desk, and says "the answer is on page 47." With RTK, the librarian comes back with just page 47, highlighted. Same answer. But your desk isn't buried anymore.&lt;/p&gt;
&lt;h3&gt;
  
  
  Installing RTK
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# macOS/Linux (recommended)&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;rtk

&lt;span class="c"&gt;# Or via Cargo (IMPORTANT: do NOT run "cargo install rtk" without&lt;/span&gt;
&lt;span class="c"&gt;# the git URL — that installs "Rust Type Kit", a completely&lt;/span&gt;
&lt;span class="c"&gt;# different package. If "rtk gain" fails, you have the wrong one.)&lt;/span&gt;
cargo &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--git&lt;/span&gt; https://github.com/rtk-ai/rtk

&lt;span class="c"&gt;# Or via quick-install script&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/rtk-ai/rtk/refs/heads/master/install.sh | sh

&lt;span class="c"&gt;# Then add to Claude Code globally&lt;/span&gt;
rtk init &lt;span class="nt"&gt;-g&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;On Unix (macOS/Linux), RTK installs as a PostToolUse hook. It works transparently. Claude doesn't even know it's there. Zero token overhead.&lt;/p&gt;

&lt;p&gt;On Windows, it works through Git Bash. The hook and RTK.md get installed the same way. If you're using Claude Code with Git Bash as your shell (which most Windows developers do), the experience is identical to macOS/Linux. The RTK.md file that gets created adds about 1,200 tokens of instructions, but a single filtered git diff saves more than that. Net positive after your first tool call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Windows-specific tips:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download the pre-built binary from the &lt;a href="https://github.com/rtk-ai/rtk/releases" rel="noopener noreferrer"&gt;releases page&lt;/a&gt; (rtk-x86_64-pc-windows-msvc.zip), or install via &lt;code&gt;cargo install --git https://github.com/rtk-ai/rtk&lt;/code&gt; in Git Bash&lt;/li&gt;
&lt;li&gt;Make sure the binary path is in your system PATH&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;rtk init -g&lt;/code&gt; the same as on Unix&lt;/li&gt;
&lt;li&gt;Run from Git Bash, not native PowerShell (some shell integrations assume bash)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Measuring Your Savings
&lt;/h3&gt;

&lt;p&gt;RTK has built-in analytics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# See your cumulative savings&lt;/span&gt;
rtk gain

&lt;span class="c"&gt;# See savings per command type&lt;/span&gt;
rtk gain &lt;span class="nt"&gt;--history&lt;/span&gt;

&lt;span class="c"&gt;# Find commands you ran WITHOUT rtk that could have been optimized&lt;/span&gt;
rtk discover
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;rtk discover&lt;/code&gt; command is the most useful one when you're starting out. It scans your Claude Code session logs and shows you exactly which commands you could have filtered but didn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Memory System That Stops Claude From Asking the Same Questions
&lt;/h2&gt;

&lt;p&gt;The last piece that made a real difference wasn't about reducing tokens. It was about making Claude smarter across sessions.&lt;/p&gt;

&lt;p&gt;Claude Code has a file-based memory system at &lt;code&gt;~/.claude/projects/&amp;lt;project&amp;gt;/memory/&lt;/code&gt;. You create markdown files with frontmatter and Claude reads them at the start of every session.&lt;/p&gt;

&lt;p&gt;I use four types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User memories:&lt;/strong&gt; Who I am, my tech stack, my preferences. Instead of explaining my setup every session, Claude already knows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Feedback memories:&lt;/strong&gt; Every time I correct Claude, the correction gets saved. "Use plain text in forms, not bullets." "Don't suggest tools I haven't used." Claude stops repeating the same mistakes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Project memories:&lt;/strong&gt; Current state of work. Deadlines, decisions, context that would otherwise be lost between sessions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference memories:&lt;/strong&gt; Where to find things in external systems. "Bug tracking is in Linear project X." Saves the "where is that tracked?" conversation every time.&lt;/p&gt;
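&lt;p&gt;A memory file is plain markdown with a small frontmatter header. A sketch of what a feedback memory might look like (the frontmatter field names here are illustrative, not a fixed schema):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;---
type: feedback
created: 2026-03-15
---

# Formatting preferences

- Use plain text in forms, not bullets.
- Don't suggest tools I haven't used.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;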

&lt;h3&gt;
  
  
  lessons.md: One File That Changes Everything
&lt;/h3&gt;

&lt;p&gt;This is the simplest thing I did and possibly the most impactful. I keep a &lt;code&gt;lessons.md&lt;/code&gt; file in every project's &lt;code&gt;.claude/&lt;/code&gt; directory. Every time I correct Claude on something, it writes a rule:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## 2026-03-15 - Don't add error handling for impossible cases&lt;/span&gt;

&lt;span class="gs"&gt;**Rule:**&lt;/span&gt; Only add try-catch blocks at system boundaries (user input,
API calls, file I/O). Don't wrap internal function calls that can't
realistically fail.
&lt;span class="gs"&gt;**Why:**&lt;/span&gt; Added defensive error handling around a pure math function.
User said "this function takes two integers and adds them, it can't
throw. You're adding complexity for nothing."
&lt;span class="gs"&gt;**Applies when:**&lt;/span&gt; Writing or reviewing error handling in any codebase.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude reads this file at the start of every session. The correction sticks permanently. Over a few weeks, the file becomes a precise set of rules that make Claude work exactly the way you need.&lt;/p&gt;

&lt;p&gt;The principle is simple: &lt;strong&gt;never correct the same mistake twice.&lt;/strong&gt; The first correction is a lesson. The second one means the system failed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Priority Order
&lt;/h2&gt;

&lt;p&gt;If you're starting from scratch, here's what I'd do in order:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Priority&lt;/th&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;Effort&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Install RTK&lt;/td&gt;
&lt;td&gt;30 seconds&lt;/td&gt;
&lt;td&gt;60-90% tool output savings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Audit global skills, move domain-specific to project level&lt;/td&gt;
&lt;td&gt;15 minutes&lt;/td&gt;
&lt;td&gt;Free up context window&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Set up basic memory files (user profile + 2-3 feedback entries)&lt;/td&gt;
&lt;td&gt;10 minutes&lt;/td&gt;
&lt;td&gt;Smarter responses, fewer repeated mistakes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Start a lessons.md file&lt;/td&gt;
&lt;td&gt;30 seconds to create, 30 seconds per correction&lt;/td&gt;
&lt;td&gt;Permanent mistake prevention&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Set MAX_THINKING_TOKENS env variable&lt;/td&gt;
&lt;td&gt;10 seconds&lt;/td&gt;
&lt;td&gt;Cap runaway thinking, save tokens on over-analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Add model routing rules for subagents&lt;/td&gt;
&lt;td&gt;10 minutes&lt;/td&gt;
&lt;td&gt;Route exploration/search subagents to cheaper models&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
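&lt;p&gt;For item 5, the variable goes wherever your shell picks up environment variables. The cap value below is a personal choice, not an official recommendation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# In ~/.bashrc or ~/.zshrc - cap extended thinking per response
export MAX_THINKING_TOKENS=10000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;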

&lt;p&gt;None of this is complicated. Most of it takes less than 15 minutes. But the compound effect of doing all six is significant: longer sessions, better context retention, fewer repeated mistakes, and lower token bills.&lt;/p&gt;

&lt;p&gt;The tools are there. Most people just don't know they exist, or don't realize how much overhead they're carrying.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is Part 1 of a series on getting more out of Claude Code. &lt;a href="https://dev.to/harivenkatakrishnakotha/rtk-model-routing-and-the-community-tools-that-actually-work-with-claude-code-3pmh"&gt;Part 2&lt;/a&gt; covers RTK in depth, including Windows setup, configuration, subagent behavior, and community tools that complement it.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>webdev</category>
    </item>
    <item>
      <title>The Full Audit: What a 9-Project Microservices Platform Looks Like When 78% of the Code is AI-Generated</title>
      <dc:creator>Hari Venkata Krishna Kotha</dc:creator>
      <pubDate>Thu, 12 Feb 2026 14:01:39 +0000</pubDate>
      <link>https://forem.com/harivenkatakrishnakotha/the-full-audit-what-a-9-project-microservices-platform-looks-like-when-78-of-the-code-is-2fgd</link>
      <guid>https://forem.com/harivenkatakrishnakotha/the-full-audit-what-a-9-project-microservices-platform-looks-like-when-78-of-the-code-is-2fgd</guid>
      <description>&lt;p&gt;I spent 7 weeks building ... then several more weeks auditing, documenting, and hardening DesiCorner - a production-grade Indian restaurant e-commerce platform with 9 .NET and Angular projects, 5 databases, and a full Angular frontend. Claude Code wrote 78% of the code. I wrote 9%. Auto-generated tooling (EF Core migrations, Angular CLI scaffolding, package configs) handled the remaining 13%.&lt;/p&gt;

&lt;p&gt;I tracked everything. Every commit, every bug, every file. This is the full audit.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;DesiCorner is an Indian restaurant ordering platform. Not a tutorial project - a full-featured e-commerce system with authenticated and guest checkout, Stripe payments, an admin dashboard with analytics, product reviews with voting, coupon management, and delivery/pickup order types.&lt;/p&gt;

&lt;p&gt;The tech stack:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Backend:&lt;/strong&gt; ASP.NET Core 8 across 8 .NET projects - AuthServer (OpenIddict OAuth 2.0), API Gateway (YARP), ProductAPI, CartAPI, OrderAPI, PaymentAPI (Stripe), a shared Contracts library (41 DTOs across 9 subdomains), and a MessageBus abstraction layer (Redis caching, Azure Service Bus scaffolded).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frontend:&lt;/strong&gt; Angular 20 with standalone components, NgRx state management, OAuth 2.0 Authorization Code + PKCE flow, and Stripe Elements for PCI-compliant payment forms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure:&lt;/strong&gt; 5 separate SQL Server databases (one per service), Redis for distributed caching/sessions/rate limiting, and a branch-per-feature Git workflow with 68 commits across 15 branches and 22 merged PRs.&lt;/p&gt;

&lt;p&gt;The architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff6ness2t4qcc65s75hdp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff6ness2t4qcc65s75hdp.png" alt="Architecture" width="800" height="543"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every project has its own README with Mermaid diagrams documenting the actual API flows verified against source code. Each microservice gets its own database and responsibility boundary.&lt;/p&gt;

&lt;p&gt;Authentication uses OAuth 2.0 Authorization Code + PKCE - the Angular SPA never touches a client secret:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frym2oqki43zitmhetw1r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frym2oqki43zitmhetw1r.png" alt="OAuth 2.0 Authorization Code + PKCE" width="800" height="543"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No client secret in the browser. No password sent to the token endpoint. The code_verifier proves the token request came from the same client that started the flow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;Here's the part that matters. I audited the entire codebase commit-by-commit and produced a file-level attribution of who wrote what:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Me&lt;/th&gt;
&lt;th&gt;Claude&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Project vision and concept&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Architecture decisions&lt;/td&gt;
&lt;td&gt;70%&lt;/td&gt;
&lt;td&gt;30%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Technology selection&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend model definitions (field choices)&lt;/td&gt;
&lt;td&gt;60%&lt;/td&gt;
&lt;td&gt;40%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend service/controller code&lt;/td&gt;
&lt;td&gt;10%&lt;/td&gt;
&lt;td&gt;90%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Angular scaffold and components&lt;/td&gt;
&lt;td&gt;5%&lt;/td&gt;
&lt;td&gt;95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Configuration values (appsettings)&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bug identification&lt;/td&gt;
&lt;td&gt;90%&lt;/td&gt;
&lt;td&gt;10%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bug resolution code&lt;/td&gt;
&lt;td&gt;40%&lt;/td&gt;
&lt;td&gt;60%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security management&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Git workflow (branching, PRs)&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testing and validation&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Product images and assets&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Documentation (READMEs, diagrams)&lt;/td&gt;
&lt;td&gt;30%&lt;/td&gt;
&lt;td&gt;70%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;By raw line count: Claude generated roughly 38,000 lines (78%), I wrote about 4,500 lines (9%), and auto-generated tooling produced roughly 6,000 lines (13%).&lt;/p&gt;

&lt;p&gt;The attribution methodology: commits that land thousands of well-structured lines at once strongly suggest AI generation. Small, targeted 2-10 line fixes with debugging context suggest human authorship. The &lt;code&gt;.claude/settings.local.json&lt;/code&gt; file first appeared on Dec 5, 2025, confirming Claude Code usage from that date. Earlier attributions are inferred from these patterns.&lt;/p&gt;

&lt;p&gt;Look at where the 100%-me rows cluster: vision, technology selection, configuration, security, git workflow, testing. Now look at where Claude dominates: service code, Angular components, documentation generation. The pattern is clear - I was the architect and Claude was the builder.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bugs That Proved the Point
&lt;/h2&gt;

&lt;p&gt;Twelve bugs emerged during development. I identified eleven of them. Here are the three that taught me the most.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug 1: The JWT Remaster (November 12-13, 2025)
&lt;/h3&gt;

&lt;p&gt;JWT tokens from the AuthServer were being rejected by ProductAPI when routed through the Gateway. Everything looked correct on the surface. It took two days to untangle three separate issues hiding behind the same 401 response.&lt;/p&gt;

&lt;p&gt;Here's the token flow -- every arrow was a potential failure point:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4t84uj7gsrxs9ui453zg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4t84uj7gsrxs9ui453zg.png" alt="Token Flow" width="800" height="543"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Audience mismatch.&lt;/strong&gt; The AuthServer issued tokens with audience &lt;code&gt;desicorner-api&lt;/code&gt;, but ProductAPI validated against &lt;code&gt;DesiCorner.ProductAPI&lt;/code&gt;. Different strings, same intent, total failure. Fix: align &lt;code&gt;JwtSettings:Audience&lt;/code&gt; across all services.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Signing key conflict.&lt;/strong&gt; ProductAPI was doing manual symmetric key validation, but the AuthServer was using OpenIddict's ephemeral signing keys. They'd never match. Fix: switch ProductAPI from hardcoded key validation to auto-fetching JWKS from the AuthServer's discovery endpoint.&lt;/p&gt;
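&lt;p&gt;The shape of that fix in ASP.NET Core terms, sketched with illustrative values (not the project's actual config): setting &lt;code&gt;Authority&lt;/code&gt; makes the bearer handler pull signing keys from the AuthServer's discovery document instead of a hardcoded symmetric key, and &lt;code&gt;ValidAudience&lt;/code&gt; must align with what the AuthServer issues:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;// ProductAPI Program.cs - sketch; the URL and audience are example values
builder.Services
    .AddAuthentication(JwtBearerDefaults.AuthenticationScheme)
    .AddJwtBearer(options =&gt;
    {
        // Signing keys are auto-fetched from the AuthServer's
        // /.well-known/openid-configuration JWKS endpoint
        options.Authority = "https://localhost:7001";
        options.TokenValidationParameters = new TokenValidationParameters
        {
            ValidateAudience = true,
            ValidAudience = "desicorner-api" // must match the audience AuthServer issues
        };
    });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;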

&lt;p&gt;&lt;strong&gt;CORS trailing slash.&lt;/strong&gt; The Gateway's CORS policy name was &lt;code&gt;"Angular"&lt;/code&gt; in one place and &lt;code&gt;"desicorner-angular"&lt;/code&gt; in another. URLs had inconsistent trailing slashes between services. Fix: standardize naming and URL formats.&lt;/p&gt;

&lt;p&gt;Three bugs, three different root causes, one symptom. I diagnosed all three through token validation logs and systematic elimination. Claude helped implement the JWKS auto-fetch after I identified what needed to change.&lt;/p&gt;

&lt;p&gt;This is the kind of debugging where you can't just paste an error message into an AI and get an answer. The error message was the same for all three issues: 401 Unauthorized. The diagnosis required understanding how tokens flow across service boundaries, which configuration values matter at each hop, and the difference between OpenIddict's signing behavior and standard symmetric JWT validation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug 2: Stripe Secret Key Exposure (December 5, 2025)
&lt;/h3&gt;

&lt;p&gt;During the Stripe payment integration, I committed a live Stripe secret key to source control. I caught it within minutes, reverted the commit immediately, and re-committed with placeholder values.&lt;/p&gt;

&lt;p&gt;The lesson isn't that I made the mistake - everyone has committed a secret at some point. The lesson is that security awareness during development is a human responsibility. You have to know what a secret key looks like, understand the implications of exposure, and react immediately. Yes, tools like GitGuardian and GitHub's push protection can catch these automatically - but the instinct to check before pushing, and the speed to react when something slips through, still matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug 3: The Admin Dashboard Cascade (December 18-23, 2025)
&lt;/h3&gt;

&lt;p&gt;Every single admin dashboard API call returned 401 or 403. The first fix attempt on Dec 19 adjusted auth attributes - it didn't fully resolve the issue. The final fix on Dec 23 touched 23 files across 3 services because the root cause was actually four interrelated problems:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Admin role claim wasn't properly included in JWT tokens from the AuthServer&lt;/li&gt;
&lt;li&gt;CartAPI was completely missing JWT validation configuration&lt;/li&gt;
&lt;li&gt;The Order model was missing an &lt;code&gt;OrderType&lt;/code&gt; field, causing analytics queries to fail&lt;/li&gt;
&lt;li&gt;Delivery address fields were required but should be optional for pickup orders&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I identified all four root causes through systematic debugging. Claude implemented the fixes after I mapped out what was broken and why. This is the kind of multi-service cascade failure where you need to understand how the entire system connects - not just the service throwing the error.&lt;/p&gt;
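&lt;p&gt;Two of the four fixes were simple model changes once the root cause was clear. Roughly (illustrative field shapes, not the exact diff):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;// Order.cs - sketch of fixes 3 and 4
public enum OrderType { Delivery, Pickup }

public OrderType OrderType { get; set; }     // added so analytics queries can group by order type
public string? DeliveryAddress { get; set; } // nullable: pickup orders carry no address
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;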

&lt;h2&gt;
  
  
  The FinTrack Contrast
&lt;/h2&gt;

&lt;p&gt;To test the other end of the spectrum, I also had Claude Code build a completely separate project: a 5,597-line single-file HTML personal finance tracker. I provided product requirements and feature specs. Claude wrote all the code in about a week.&lt;/p&gt;

&lt;p&gt;It ran. It looked right. But features had subtle issues I had to catch and send back for correction. The same pattern happened repeatedly on DesiCorner - AI-generated code that works on the surface but needs a human to validate the actual behavior against the intended requirements.&lt;/p&gt;

&lt;p&gt;The difference between the two projects: I can defend every architectural decision in DesiCorner. I can explain why YARP instead of Ocelot, why OpenIddict instead of IdentityServer, why separate databases per microservice instead of a shared database. I can walk through every bug and explain how I traced the root cause.&lt;/p&gt;

&lt;p&gt;For FinTrack, I can explain what it does and what the requirements were. But I can't defend the code decisions because I didn't make them. That's the difference between being an engineer and being a product manager who uses AI tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;The skills that carried this project:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt; - deciding which services to build, how they communicate, which technologies fit, and where the boundaries should be. Claude could suggest options when asked. But evaluating tradeoffs against my specific requirements and committing to a direction - that was mine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debugging distributed systems&lt;/strong&gt; - tracing failures across service boundaries, reading token validation logs, understanding how configuration values propagate through a microservices system. The JWT Remaster bug would have been trivial in a monolith. In a distributed system with an API Gateway, an AuthServer, and downstream services each with their own JWT validation config, it required understanding the full request lifecycle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security awareness&lt;/strong&gt; - knowing what credentials look like, reacting to exposure, managing secrets across 5+ configuration files, understanding OAuth 2.0 flows well enough to spot misconfiguration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Validation&lt;/strong&gt; - not trusting that "it runs" means "it's correct." This applies equally to AI-generated code and to your own code, but the failure mode is different with AI. AI-generated code often fails in ways that look right at first glance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Domain knowledge&lt;/strong&gt; - knowing that an Indian restaurant platform needs dietary flags (vegetarian, vegan, gluten-free), spice levels, allergen tracking, and that pickup orders shouldn't require a delivery address. Claude couldn't infer these requirements. I had to specify them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Product.cs - domain fields I specified, Claude implemented&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;IsVegetarian&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;IsVegan&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;IsSpicy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;SpiceLevel&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;        &lt;span class="c1"&gt;// 0-5 heat scale&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;Allergens&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;      &lt;span class="c1"&gt;// nuts, dairy, gluten&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;PreparationTime&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;    &lt;span class="c1"&gt;// minutes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These six fields represent domain knowledge that no AI would infer from "build an Indian restaurant platform." The &lt;code&gt;SpiceLevel&lt;/code&gt; scale, the nullable &lt;code&gt;Allergens&lt;/code&gt; as a comma-separated string, the &lt;code&gt;PreparationTime&lt;/code&gt; default of 15 minutes - every field choice came from understanding the domain.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Full Report
&lt;/h2&gt;

&lt;p&gt;I wrote a 2,000-line development report that documents every commit, every file-level attribution, every bug with its resolution, and the complete contribution breakdown. Full transparency on who wrote what and why.&lt;/p&gt;

&lt;p&gt;The repo, including 10 per-project READMEs with Mermaid architecture diagrams:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/HariVenkataKrishnaKotha/DesiCorner" rel="noopener noreferrer"&gt;github.com/HariVenkataKrishnaKotha/DesiCorner&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;AI wrote 78% of this project's code. That percentage will probably go higher on my next project. The question isn't whether AI can generate code - it obviously can, at scale, and it's getting better.&lt;/p&gt;

&lt;p&gt;The question is whether you can architect a system, debug it when it breaks across service boundaries, catch what the AI missed, and take ownership of decisions that have downstream consequences. Those skills aren't about typing speed. They're about engineering judgment.&lt;/p&gt;

&lt;p&gt;The value isn't in the code anymore. It's in everything around the code.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's been your experience with AI coding tools on non-trivial projects? I'm especially curious about debugging stories - the moments where AI-generated code failed in ways that required real engineering to fix. Drop a comment or find me on &lt;a href="https://www.linkedin.com/in/harivenkatakrishnakotha/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>webdev</category>
      <category>ai</category>
      <category>microservices</category>
    </item>
  </channel>
</rss>
