<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Phil Rentier Digital</title>
    <description>The latest articles on Forem by Phil Rentier Digital (@rentierdigital).</description>
    <link>https://forem.com/rentierdigital</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3440667%2F4dff0ac3-f0f2-42bf-b066-14c2ba847691.jpg</url>
      <title>Forem: Phil Rentier Digital</title>
      <link>https://forem.com/rentierdigital</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/rentierdigital"/>
    <language>en</language>
    <item>
      <title>Hermes Agent: The Self-Hosted AI That Finally Grew Up. Here's the Two-VPS Setup Under $10.</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Tue, 21 Apr 2026 13:41:11 +0000</pubDate>
      <link>https://forem.com/rentierdigital/hermes-agent-the-self-hosted-ai-that-finally-grew-up-heres-the-two-vps-setup-under-10-209a</link>
      <guid>https://forem.com/rentierdigital/hermes-agent-the-self-hosted-ai-that-finally-grew-up-heres-the-two-vps-setup-under-10-209a</guid>
      <description>&lt;p&gt;Last weekend I installed Hermes Agent on two VPS. A brand-new Hostinger box in 1-click Docker. My existing Contabo box via SSH and a single curl command. Same model config on both sides: Sonnet 4.6 as primary, DeepSeek V4 for delegation. Two install philosophies. Both ship a working &lt;strong&gt;agent&lt;/strong&gt; that replies in &lt;strong&gt;Telegram&lt;/strong&gt; within minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Two install paths tested end-to-end (zero terminal versus pure SSH), a model stack that has completely shifted since February, one architectural move that Nous Research made while OpenClaw was busy patching, and a community pattern I wasn't expecting around who's actually migrating (and who isn't).&lt;/p&gt;

&lt;p&gt;If you've been reading here since February, you know I documented &lt;a href="https://rentierdigital.xyz/blog/anthropic-just-killed-my-200-month-openclaw-setup-so-i-rebuilt-it-for-15" rel="noopener noreferrer"&gt;my $15/month OpenClaw migration after the Claude Max ban&lt;/a&gt;. Hadn't touched it since. It worked. Then last week changed my mind. Anthropic officially pulled third-party access to Pro/Max on April 4. The public OpenClaw CVE tracker crossed 138 entries on the 10th. Nous Research shipped Hermes v0.9 on the 13th, a release that merged more pull requests in one drop than some projects ship in a quarter. Triple-hit, same week. Hard to keep ignoring it after that.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Moment I Knew It Was a Different Beast
&lt;/h2&gt;

&lt;p&gt;Five minutes into the Contabo install, the wizard asked me which &lt;em&gt;terminal backend&lt;/em&gt; I wanted: local, Docker, SSH, Daytona, Singularity, or Modal. OpenClaw never asked me that question. OpenClaw just ran. Which was great until the afternoon a skill tried to clean temp files and nearly clipped a directory I'd rather it didn't touch. Hermes making the isolation question explicit, before install completes, tells you what generation you're dealing with.&lt;/p&gt;

&lt;p&gt;Same with the auto-detection step further down the wizard. It scanned for &lt;code&gt;~/.openclaw&lt;/code&gt;, saw mine, and offered to import skills, memories, and API keys. Not in a migration guide you have to read on a Tuesday. In the installer. That's someone who designed for a specific user (the one leaving OpenClaw) and built the ramp.&lt;/p&gt;

&lt;p&gt;Two small choices. Both say the same thing. Someone watched six months of OpenClaw happen and took notes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Bothered: What Six Months of OpenClaw Taught Me
&lt;/h2&gt;

&lt;p&gt;Credit where it's due first. OpenClaw defined the self-hosted agent category. 347k GitHub stars in six months, an ecosystem of 13k+ community-built skills, a Discord that feels alive. Without OpenClaw, there's no Hermes to write about. The prototype did the hard job of proving the category was real.&lt;/p&gt;

&lt;p&gt;But a prototype that grows fast accumulates architecture debt. Three places I felt that debt firsthand.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The UX breaks non-geeks.&lt;/strong&gt; I've spent evenings debugging obscure configuration issues that made no sense until I'd read three Discord threads and one angry Medium post. Shadow, OpenClaw's official maintainer, said it directly on Discord (paraphrased): if you can't use a command line, you should not be using OpenClaw. When the person maintaining the product tells you it's a geek tool, believe them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security is patched, not designed.&lt;/strong&gt; The public CVE tracker logged over 138 entries in roughly two months between February and April 2026. A separate exposure analysis from ARMO counted roughly 135k OpenClaw instances publicly reachable, the majority without authentication. Reco flagged a campaign of malicious skills in the hundreds. Microsoft's guidance in February, paraphrased: don't deploy OpenClaw on machines holding sensitive data. Those aren't just bug counts. They describe an architecture that trusts inputs by default and spends its time patching whenever someone finds the next hole.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Governance is turbulent.&lt;/strong&gt; Three name changes in twelve months (Clawdbot, Moltbot, OpenClaw). OpenAI acquisition late 2025. For a tool I want to keep running three years, that's too much weather to sit through.&lt;/p&gt;

&lt;p&gt;None of this aims at Peter Steinberger. The guy shipped something huge and defined a category. But an architecture designed for a prototype cannot outgrow its debt through patching, no matter how diligent the patching is.&lt;/p&gt;

&lt;p&gt;Which is why next-generation tools exist.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes Hermes a Product, Not a Prototype
&lt;/h2&gt;

&lt;p&gt;Quick context on Nous Research. AI safety lab behind the Hermes, Nomos, and Psyche model families, serious reputation in the open-weight crowd, MiniMax partnership announced early 2026. Hermes Agent launched in February, crossed 64k+ GitHub stars in two months, shipped v0.9.0 on April 13 with nine releases in seven weeks. Aggressive velocity.&lt;/p&gt;

&lt;p&gt;Four architectural moves I watched firsthand during the installs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security treated as a constraint.&lt;/strong&gt; Tirith, the pre-execution scanner, inspects shell commands before they run. Sub-agents live in their own namespace, each one isolated from the others and from the host. Containers ship hardened with read-only root filesystem and dropped capabilities. Filesystem checkpoints happen automatically before any destructive operation, with a rollback command that does what it says. Zero agent-specific CVEs to date, according to The New Stack (paraphrased). The move here is architectural, not cosmetic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A closed learning loop.&lt;/strong&gt; After complex tasks (five or more tool calls), the agent pauses, evaluates, and writes a reusable skill (a SKILL.md plus the code that goes with it). Nous's own benchmark (paraphrased) claims roughly 40% faster performance on research tasks once the agent has built up its own skill library. I saw the mechanism in action the first time I asked it to set up a recurring task. It wrote a SKILL.md covering the cron-plus-auth dance it had just figured out, so the next cron request starts from that skill instead of from scratch. Feels weird the first time. Useful by day three.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A standardized runtime.&lt;/strong&gt; Same dependency set, same isolation model, same behavior across Linux, macOS, WSL2, and Android via Termux. The runtime doesn't drift depending on where you deploy (local dev machine, $5 VPS, bare-metal homelab, a phone), which sounds obvious until you try to rebuild a drifted OpenClaw install from memory on a new box at 11pm. No native Windows, no impact on me or 95% of the readers here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A model-agnostic routing layer.&lt;/strong&gt; Nous Portal OAuth (400+ models), OpenRouter (200+), direct Anthropic/OpenAI, Ollama local, vLLM, SGLang. Switch primary or delegation with a single &lt;code&gt;hermes model&lt;/code&gt; command. No code change, no restart, no reconfig. Testing a new model on a specific task takes about two seconds.&lt;/p&gt;

&lt;p&gt;The New Stack paraphrased the bet neatly: OpenClaw optimized for ecosystem breadth, Hermes optimizes for depth of learning. Different architectural bets, neither universally right. Hermes fits the use case where you want the thing to compound over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Path One: Hostinger (Zero Terminal)
&lt;/h2&gt;

&lt;p&gt;KVM 2 plan specs: 2 vCPU, 8 GB RAM, 100 GB NVMe, 8 TB bandwidth, Ubuntu 24.04 LTS. Price: $8.99/month. Pre-configured Hermes Agent template sitting in the Docker catalog. Zero Docker to install on your side.&lt;/p&gt;

&lt;p&gt;How it went. hPanel → Docker Manager → Catalog → typed "Hermes Agent" in the search → Select → Deploy. The template asked for the provider API key during deploy. I pasted my OpenRouter key (one key handles Sonnet 4.6, DeepSeek V4, and the fallbacks). Under fifteen minutes from clicking Deploy to the first "Hi" in Telegram, and most of that was the VPS provisioning itself.&lt;/p&gt;

&lt;p&gt;No real friction. The wizard is what Hostinger has always been good at: opinionated defaults, minimal questions, works.&lt;/p&gt;

&lt;p&gt;One detail worth noting. The same Hostinger catalog also offers OpenClaw as a 1-click template. Not a commercial pick on my end. A user choice in the same store. Provider stays neutral.&lt;/p&gt;

&lt;p&gt;Who this path is for: the reader who followed my OpenClaw articles, who wants to test Hermes without getting into systemd, ufw, and Docker networking. Zero terminal end to end. Deploy, paste key, chat.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.hostg.xyz/SHJIR" rel="noopener noreferrer"&gt;Hostinger Docker catalog Hermes Agent template&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Path Two: Contabo (I Already Had One)
&lt;/h2&gt;

&lt;p&gt;My Contabo box has been running for a while now, handling WooCommerce store ops plus a handful of partner webhooks, with Traefik in front. I wanted to see if Hermes would drop onto an existing box without drama.&lt;/p&gt;

&lt;p&gt;Cloud VPS 10 specs: 3 vCPU, 8 GB RAM, 75 GB NVMe. Price: $4.95/month, same price in year 1, 2, and 3. No renewal surprise. That's the part I keep coming back to.&lt;/p&gt;

&lt;p&gt;How it went. SSH in as a regular user with sudo rights (not root, and yes we'll come back to that). Then the official one-liner from Nous Research (verbatim):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Obligatory confession: yes, this is &lt;code&gt;curl | bash&lt;/code&gt;, the pattern every sysadmin has been yelling about for a decade. On a box that runs an actual ecommerce store. Read the script before you run it. I did. You should too. The installer itself is clean, handles Python 3.11, Node.js, uv, ripgrep, ffmpeg on its own, and never touches anything outside the Hermes working directory. That said, if the phrase "curl | bash" gave you a rash just now, clone the repo and run the install from a local checkout. Works the same.&lt;/p&gt;

&lt;p&gt;Then the interactive wizard. Choices that actually matter: LLM provider → model → TTS (I picked Edge TTS, free) → terminal backend (Docker, for isolation, out of the six options) → messaging working directory → sudo support → max tool iterations → tool progress display → session reset mode → messaging platform (Telegram).&lt;/p&gt;

&lt;p&gt;Ten questions, maybe fifteen. Reading them beats skipping them, because the terminal backend choice alone is the difference between "agent in a sandbox" and "agent with the keys to the kitchen".&lt;/p&gt;

&lt;p&gt;The auto-detection step is the one I want to flag. Because I had &lt;code&gt;~/.openclaw&lt;/code&gt; sitting on this same VPS, the wizard offered to import my existing skills, memories, settings, and API keys in one go. I took it. Three seconds, done. Whatever OpenClaw taught my agent over six months is now sitting in Hermes, which saves me from rebuilding the personalization layer from zero. If you don't have OpenClaw on the box, the wizard just skips that step and moves on.&lt;/p&gt;

&lt;p&gt;One documented trap, not to be missed. If you already run a Telegram bot under OpenClaw, do NOT reuse its token. Create a NEW bot via BotFather or both break. A YouTube demo from early April walked straight into it on camera (paraphrased, source below). Free lesson, courtesy of someone else's mistake.&lt;/p&gt;

&lt;p&gt;Under twenty minutes total to a working agent on Telegram, most of it spent reading the wizard questions carefully instead of mashing Enter.&lt;/p&gt;

&lt;p&gt;The Contabo arguments, condensed. RAM-per-dollar is unbeatable at roughly $0.50/GB (for reference, you're around $6/GB on DigitalOcean). Full OS control (Ubuntu 22/24, Debian, Rocky, CentOS). Data centers across Europe, Asia, the Americas, Australia. A CLI wizard that teaches you what it's installing instead of hiding it behind a panel. Same price over three years.&lt;/p&gt;

&lt;p&gt;Who this path is for: the reader who wants to understand the commands that ran, who already hosts other services, who plans in three-to-five-year chunks instead of thirty-day ones.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.kqzyfj.com/click-100562241-13796481" rel="noopener noreferrer"&gt;Contabo Cloud VPS 10&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Model Stack (Two Months Later, Everything Shifted)
&lt;/h2&gt;

&lt;p&gt;In my February article I was running Kimi K2.5 + MiniMax + GLM-4.7-Flash. Optimal stack for OpenClaw at the time. For Hermes, the landscape moved and my priorities moved with it.&lt;/p&gt;

&lt;p&gt;Technical context first. Hermes v0.9 carries a fixed per-API-call overhead (tool definitions around 8,700 tokens, system prompt around 5,200 tokens) that eats roughly 73% of a typical call. In Telegram mode the overhead climbs to 15-20K tokens per message, two to three times CLI mode, per Nous's own docs. In that context, reliable tool-calling becomes the critical factor. A cheap model that misfires tool calls loops into error and burns more tokens than a premium model running clean.&lt;/p&gt;
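&lt;p&gt;Those figures are easy to sanity-check. A minimal sketch: the token counts come from the article, the 5,200-token payload is my own illustrative assumption:&lt;/p&gt;

```python
# Fixed per-call overhead, figures quoted from Nous's docs via the article.
TOOL_DEFS_TOKENS = 8_700
SYSTEM_PROMPT_TOKENS = 5_200
FIXED_OVERHEAD = TOOL_DEFS_TOKENS + SYSTEM_PROMPT_TOKENS  # 13,900 tokens

# Assumed size of the actual request payload (not from the article).
payload_tokens = 5_200

overhead_share = FIXED_OVERHEAD / (FIXED_OVERHEAD + payload_tokens)
print(f"overhead share: {overhead_share:.0%}")  # ~73%
```

&lt;p&gt;A bigger payload dilutes the overhead; a one-line message makes it worse, which is exactly why Telegram mode stings.&lt;/p&gt;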

&lt;p&gt;Actual config after two weeks of iteration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openrouter&lt;/span&gt;
&lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;anthropic/claude-sonnet-4-6&lt;/span&gt;    &lt;span class="c1"&gt;# primary&lt;/span&gt;

&lt;span class="na"&gt;delegation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deepseek/deepseek-v4&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openrouter&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Sonnet 4.6 ($3/$15 per million input/output tokens) as primary. Consensus pick in the Hermes-in-production community right now (r/LocalLLaMA threads, r/singularity, Berkeley Function Calling Leaderboard). Reliable tool-calling, solid multi-step reasoning, no error spirals. DeepSeek V4 ($0.30/$0.50) as delegation. 90% cache discount makes the overhead nearly free. Around 90% of Claude's quality on sub-agent tasks. Honest caveat: DeepSeek's infra throws 503s at peak hours, fallback is clean (delegation drops back to primary without drama).&lt;/p&gt;

&lt;p&gt;Models to avoid. GPT-5.4 Mini, "terrible at tool calling" by explicit r/LocalLLaMA warning. MiniMax 2.5 was unusable, 2.7 fixed it. Qwen 3.x for tool-calling breaks parsing because of the &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; tags. Pure reasoning models talk themselves out of using tools. Don't ask me why, they just do.&lt;/p&gt;

&lt;p&gt;Real monthly cost depends on your usage pattern. At roughly 10 messages per day, you'll probably land around $15-25 all-in. At 30 per day, closer to $40-70. At 50+, $80-120. The Telegram overhead is the variable that moves the needle.&lt;/p&gt;
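&lt;p&gt;To see where those bands come from, here's a back-of-envelope estimator. Input side only, midpoint of the Telegram overhead band, Sonnet 4.6 at $3 per million input tokens; it ignores output tokens, cache discounts, and delegation traffic, so treat it as a floor:&lt;/p&gt;

```python
# Back-of-envelope, input side only. Telegram-mode overhead midpoint of the
# 15-20K band (17,500 tokens/message), Sonnet 4.6 at $3 per million input
# tokens. Ignores output tokens, cache discounts, and delegation traffic.
PRICE_PER_M_INPUT = 3.00
TOKENS_PER_MESSAGE = 17_500

def monthly_input_cost(messages_per_day, days=30):
    tokens = messages_per_day * days * TOKENS_PER_MESSAGE
    return tokens / 1_000_000 * PRICE_PER_M_INPUT

for rate in (10, 30, 50):
    print(f"{rate} msg/day: ~${monthly_input_cost(rate):.2f}/mo input-side")
```

&lt;p&gt;At 10 messages a day that lands around $15.75 of input alone, the low end of the $15-25 band; output tokens and heavy days do the rest.&lt;/p&gt;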

&lt;p&gt;Fallback plan if something derails: &lt;code&gt;hermes model&lt;/code&gt;, switch primary to DeepSeek V4, effective immediately, no reconfig. Safety net is one command.&lt;/p&gt;

&lt;p&gt;My SOUL.md opens with &lt;a href="https://rentierdigital.xyz/blog/ai-agent-lies-claude-deception" rel="noopener noreferrer"&gt;the four integrity lines from my prompt contract&lt;/a&gt;. Never lie. Never hide the truth. Never conceal a problem. Never fail silently. Same clause that sat on top of my old OpenClaw CLAUDE.md. It still makes the dashboard yellow instead of fake-green, and I still prefer yellow.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Hermes Doesn't Do Yet (Honestly)
&lt;/h2&gt;

&lt;p&gt;Four caveats worth stating plainly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anthropic OAuth does NOT work natively.&lt;/strong&gt; If you're Claude-first (me, probably you), you need OpenRouter or a direct Anthropic API key. Pro and Max subscriptions cover the web interface, not the API, so you can't plug them into an agent anyway. The real friction is having to manage a separate pay-as-you-go balance on OpenRouter or the Anthropic console on top of whatever web subscription you already pay for. Two invoices, two dashboards, one usage to monitor. Biggest caveat on my list right now.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The skills ecosystem is young.&lt;/strong&gt; No ClawHub equivalent with 13k+ community-built skills. Hermes creates its own skills through the learning loop, but you start without a shared library. The compounding effect takes two to four weeks to become visible, based on what I observed and what r/LocalLLaMA reports.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;v0.9 is five days old.&lt;/strong&gt; Hermes is two months old total. CVEs will come (no architecture is immune). The design should keep them less catastrophic. Nous's aggressive velocity also means a massive surface of change, which means a massive surface of bugs too. Hundreds of PRs merged in a single release is not a calm number.&lt;/p&gt;

&lt;p&gt;And a community nuance that matters. Power users aren't migrating. They're running both in parallel via the ACP protocol (OpenClaw as orchestrator, Hermes as execution specialist). Source: a Kilo analysis of r/openclaw threads, paraphrased. Full migration isn't the only valid path. I'm not dual-running, but I'm not telling you not to either.&lt;/p&gt;

&lt;p&gt;Hermes is architecturally superior. I'll stand on that. But it's a two-month-old product, not a messiah. Temper accordingly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who Should Actually Do This
&lt;/h2&gt;

&lt;p&gt;Four quick segments so you don't have to squint at the decision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're new to self-hosted agents,&lt;/strong&gt; go Hermes direct via the Hostinger 1-click. No OpenClaw debt to migrate. Sonnet 4.6 + DeepSeek V4 on OpenRouter. Roughly $15-25/mo all-in for personal use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you already run OpenClaw with a stable setup,&lt;/strong&gt; dual-run via ACP instead of migrating. OpenClaw keeps orchestrating your automations, Hermes runs as execution specialist on new tasks. The Hermes wizard detects &lt;code&gt;~/.openclaw&lt;/code&gt; and offers to import the personalization layer, which means the cost of trying is basically zero. (If your setup already runs the &lt;a href="https://rentierdigital.xyz/blog/21-openclaw-automations-nobody-talks-about-because-the-obvious-ones-already-broke-the-internet" rel="noopener noreferrer"&gt;21 advanced automations I documented here&lt;/a&gt;, Hermes won't break any of them.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you migrated post-Claude-Max-ban&lt;/strong&gt; (my case, February), it's Hermes + OpenRouter + Sonnet 4.6 + DeepSeek V4. Direct upgrade from the old Kimi/MiniMax stack. Same price range, better tool-calling reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For critical production, wait.&lt;/strong&gt; v1.0 or three months of v0.x stability. For personal use or side projects, it's fine now. For your client's prod, it's not.&lt;/p&gt;

&lt;p&gt;Your client pays you to be boring about their uptime.&lt;/p&gt;

&lt;p&gt;I took install notes on both paths while I was doing them. If there's interest, I'll clean them up into a proper guide: the 2-path checklist, the SOUL.md integrity template, the Sonnet 4.6 / DeepSeek V4 config. Say so in the comments.&lt;/p&gt;




&lt;p&gt;Three months from now, Hermes will have its own CVEs. Every architecture ends up with some. That's not the question.&lt;/p&gt;

&lt;p&gt;OpenClaw had six months. It took on the debt. Hermes looked at that debt first. Good prototype. But honestly, spending time debugging (even with Claude) is not my passion. I'd rather be building. C'est la vie 😊&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Public OpenClaw CVE tracker (GitHub, April 2026)&lt;/li&gt;
&lt;li&gt;ARMO exposure analysis on OpenClaw instances (February 2026)&lt;/li&gt;
&lt;li&gt;Reco campaign report on malicious OpenClaw skills (March 2026)&lt;/li&gt;
&lt;li&gt;Nous Research Hermes Agent documentation and v0.9 release notes (April 2026)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;This article may contain affiliate links. I may earn a small commission if you purchase through them.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;(*) The cover is AI-generated. Midjourney took one look at the Hermes launch schedule and blamed me for the deadline.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The article walks through two real installs on $5 VPS, but the bigger shift is how Hermes handles isolation and security by design—not patches. If you're self-hosting agents, the demo-vs-product checklist in the welcome kit shows exactly which of those 138 CVEs you should actually worry about.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;→ &lt;a href="https://rentierdigital.beehiiv.com/subscribe?utm_source=astro&amp;amp;utm_medium=article&amp;amp;utm_campaign=hermes-agent-self-hosted-ai-setup" rel="noopener noreferrer"&gt;Get the welcome kit&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>technology</category>
      <category>aiagents</category>
      <category>hermesagent</category>
    </item>
    <item>
      <title>1of10 Alternative: I Built Mine in an Afternoon for $0/Month</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Mon, 20 Apr 2026 13:41:10 +0000</pubDate>
      <link>https://forem.com/rentierdigital/1of10-alternative-i-built-mine-in-an-afternoon-for-0month-4ahb</link>
      <guid>https://forem.com/rentierdigital/1of10-alternative-i-built-mine-in-an-afternoon-for-0month-4ahb</guid>
      <description>&lt;p&gt;f you want viral content ideas, one thing works embarrassingly well: YouTube &lt;em&gt;outliers&lt;/em&gt;. Videos that blow past their channel's average, usually on small channels nobody follows yet. A channel that does 50K views suddenly drops one at 500K? Something hit. Title, thumbnail, topic, timing, whatever. Worth studying.&lt;/p&gt;

&lt;p&gt;There's a tool for that. It's called 1of10. It costs $29 this month. Or $39. I stopped tracking their pricing page, it keeps moving. Plus a dashboard full of features nobody asked for, no API, nothing I can wire into my own workflow.&lt;/p&gt;

&lt;p&gt;So I built my own. Not in the heroic "one afternoon, 200 lines, look at me" sense. In the boring sense: I typed a handful of short prompts at Claude and the thing worked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — This is not a tutorial. It's how I actually work with AI when I build a small tool. The code isn't interesting. The pivot prompt is: the one where I stopped trying to specify a solution and handed Claude a problem instead. That move is the whole article.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Paywall and the Spec
&lt;/h2&gt;

&lt;p&gt;The seed was a forum post I won't link (Medium links the hard way and credibility dies in the click). Paraphrasing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Train Claude on viral headlines and YouTube titles. The article itself doesn't really matter. The title and the thumbnail do. We use 1of10 to find outlier videos and reverse-engineer the patterns.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Valid workflow. Annoying paywall. And I had a hunch 1of10 was three API calls in a trench coat.&lt;/p&gt;

&lt;p&gt;The spec, written in my head and typed nowhere. A search query in. The top 10 YouTube videos that overperform their own channel's average out. Score formula: &lt;code&gt;views / channelAverage&lt;/code&gt;. Free YouTube Data API, 10K quota units per day, roughly 100 full searches. Cost: $0 forever.&lt;/p&gt;

&lt;p&gt;The metric is one division. That's the whole reason 1of10 is overpriced: the math they're selling is trivial, the dashboard is where the billing lives.&lt;/p&gt;
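&lt;p&gt;For the record, here's the entire metric as code. A minimal sketch, with made-up numbers for illustration:&lt;/p&gt;

```python
# The whole 1of10 metric: views divided by the channel's average views.
def outlier_score(views, channel_average):
    return views / channel_average

# Illustrative numbers, not real channel data.
print(outlier_score(500_000, 50_000))  # 10.0 -> the "something hit" case
print(outlier_score(62_000, 50_000))   # 1.24 -> business as usual
```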

&lt;h2&gt;
  
  
  The Prompt That Did The Actual Work
&lt;/h2&gt;

&lt;p&gt;The setup is uninteresting. &lt;em&gt;Install the app. No, put it in the right project folder. Use&lt;/em&gt; &lt;code&gt;gh&lt;/code&gt; &lt;em&gt;instead of curl.&lt;/em&gt; Three prompts, three minutes. The scaffolding of the Astro app and the algorithm itself had happened in an earlier Claude session I no longer have logs for (back up your Claude Code projects, I learned this one the hard way).&lt;/p&gt;

&lt;p&gt;App running locally. I type a search. The UI is ugly. The ranking is noisy. A top-10 sorted by outlier score mixes real outliers with normal-ish videos and I can't tell where the real signal ends. My eyes see three obvious winners. My brain cannot name the algorithm that says the same thing.&lt;/p&gt;

&lt;p&gt;So I typed this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;can u improve the interface ? [screenshot]
not readable, and i'd want a one-click
copy of the best outlier titles
(the top 3 here really stand out,
idk how you calculate that,
where it really jumps mathematically)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three asks. UI fix. Copy button. And the one that matters: &lt;em&gt;idk how you calculate that, where it really jumps&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That last bit is the whole article.&lt;/p&gt;

&lt;p&gt;Here is what I would have typed on a bad day, the kind of day where I try to sound like an engineer: &lt;em&gt;implement a function that flags the top N statistical outliers using z-score or percentile thresholding, with proper edge case handling and unit tests&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That prompt gets z-scores. Z-scores don't work on this data. The distribution is nowhere near normal, outlier rankings are power-law at best, and I'd end up with a function that flags ten results on a smooth curve and zero on a jagged one. Useless.&lt;/p&gt;

&lt;p&gt;What I typed instead was the problem, not the solution. "Where does it really jump."&lt;/p&gt;

&lt;p&gt;Claude came back with three candidate algorithms (z-score, gap ratio, percentile threshold), explained why gap ratio fits this kind of data best, and wrote it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for each pair of consecutive scores (sorted desc):
  ratio = score[i] / score[i+1]
  keep track of the largest ratio seen
  mark the cut at that position
if the best ratio is below 1.5x: only the first result is an outlier
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ten lines. Run it on &lt;code&gt;[45, 28, 20, 3.2, 2.8, 2.1]&lt;/code&gt;, the cut falls between 20 and 3.2 (a 6.25x gap), and the first three are flagged as real outliers. Exactly what my eyes saw. Nothing I could have specced myself.&lt;/p&gt;
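&lt;p&gt;The pseudocode above translates almost line-for-line into a runnable sketch (the function name is mine, not the app's):&lt;/p&gt;

```python
def find_outliers(scores, min_ratio=1.5):
    """Split scores (sorted descending) at the largest consecutive gap ratio."""
    s = sorted(scores, reverse=True)
    best_ratio, cut = 0.0, 1
    for i in range(len(s) - 1):
        ratio = s[i] / s[i + 1]
        if ratio > best_ratio:
            best_ratio, cut = ratio, i + 1
    # No gap above the 1.5x floor: only the top result is a real outlier.
    if min_ratio > best_ratio:
        return s[:1]
    return s[:cut]

print(find_outliers([45, 28, 20, 3.2, 2.8, 2.1]))  # [45, 28, 20]
```

&lt;p&gt;Sorting defensively keeps the ratio math valid even if the caller passes unsorted scores; the 1.5x floor is the pseudocode's own guard against flagging a smooth curve.&lt;/p&gt;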

&lt;p&gt;Describe the problem, not the solution. The model has read ten thousand papers on ranking and segmentation. You've read zero.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three More Prompts, Same Shape
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;YOUTUBE_API_KEY not configured / it's in infisical&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The dev server had crashed on a missing secret. I type the error message plus one hint: &lt;em&gt;it's in Infisical&lt;/em&gt;. Infisical is a secrets manager. It keeps API keys out of plain &lt;code&gt;.env&lt;/code&gt; files. The kind of hygiene I only started caring about after the third time I leaked one. Claude wrapped the launch command with &lt;code&gt;infisical run&lt;/code&gt;, the app booted, we moved on. Two seconds. This is how you tell an AI where the secret is without pasting the secret.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;no find the titles that overperform on a given topic [long paste of the forum thread]&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Claude had drifted into an adjacent feature. Looked cool, wasn't what I wanted. I killed it with "no" and pasted the original spec from the forum, verbatim. Recenter by repetition, not by explanation. Do not argue with the model. Re-anchor.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;the blue on black isn't readable [screenshot] and it's good to copy min 5 results to identify patterns&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Second UI pass. Screenshot of unreadable blue links, plus a business rule smuggled into the feedback: &lt;em&gt;min 5 results to spot a pattern&lt;/em&gt;. Claude adjusted the contrast, killed the blue, highlighted the real outliers with a coral accent, grayed the rest. The rule became the default for the export button: minimum 5, bumps up if the gap detector finds more. One turn.&lt;/p&gt;

&lt;p&gt;The ones I skipped were &lt;code&gt;ok&lt;/code&gt;, &lt;code&gt;A&lt;/code&gt;, &lt;code&gt;C&lt;/code&gt;, &lt;code&gt;ok&lt;/code&gt;, &lt;code&gt;ok&lt;/code&gt;. Validations and multiple-choice answers to Claude's own design questions. Plus one cost check mid-session (&lt;em&gt;is the Google API free?&lt;/em&gt;) because you don't want to finish building something you can't afford to run. The rhythm is all information, no ceremony.&lt;/p&gt;

&lt;h2&gt;
  
  
  24 Hours Later, I Wanted It Inside Claude Code
&lt;/h2&gt;

&lt;p&gt;App shipped. Running locally. Worked. And already I hated using it.&lt;/p&gt;

&lt;p&gt;Because the flow was: open the browser, type the query, wait two seconds, click "copy top N", switch back to Claude, paste, ask for angles. Six steps for something I do twenty times a week. Unacceptable.&lt;/p&gt;

&lt;p&gt;The next day. One commit. 234 lines. I moved the algorithm into my personal MCP server (the one that already holds my Medium stats tools, my YouTube transcript puller, my article archive search). A Convex action wrapping the same &lt;code&gt;findOutliers&lt;/code&gt; logic, exposed as a tool called &lt;code&gt;find_youtube_outliers&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If MCP is new to you, think of it as a standard protocol that lets Claude Code call your functions as if they were built into the assistant. The function lives on your server. Claude decides when to call it based on the user's message.&lt;/p&gt;

&lt;p&gt;Now when I'm brainstorming with Claude I just type "find outliers for X and suggest 5 angles." Claude calls the tool itself, reads the titles, proposes hooks, often chains straight into &lt;code&gt;get_youtube_transcript&lt;/code&gt; on the top result to sample the hook pattern, then into &lt;code&gt;search_articles&lt;/code&gt; against my own archive to check if I've already covered the angle.&lt;/p&gt;

&lt;p&gt;No browser. No TSV. No paste. No context switch.&lt;/p&gt;

&lt;p&gt;One caveat. I went MCP here because I'm the only user and Claude Code is the only caller. For a one-user-one-machine tool, MCP is fine. If you plan to call your outlier finder from cron jobs, another script, a Discord bot, anywhere else, a plain CLI is usually the better shape. I went deeper on &lt;a href="https://rentierdigital.xyz/blog/why-clis-beat-mcp-for-ai-agents-and-how-to-build-your-own-cli-army" rel="noopener noreferrer"&gt;the tradeoffs between CLI and MCP integration&lt;/a&gt; elsewhere.&lt;/p&gt;

&lt;p&gt;The real shift here isn't MCP vs CLI. It's that your tool stops being an app you open and becomes a capability the model reaches for. Same code. Different gravity. 🛠️&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Every "1of10 Alternative" Article Is an Ad
&lt;/h2&gt;

&lt;p&gt;I googled "1of10 alternative" before building mine. Every article on page 1 was an affiliate piece. Same template every time:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;1of10 is great BUT it's expensive. Here's [affiliate link], only $9/month, much better value.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;These aren't alternatives. They're cheaper SaaS on the same business model. You pay every month, the data stays on their servers, and you wait for a product team to ship the filter you need. Which they won't, because their roadmap is driven by what gets them on Product Hunt, not by your brainstorming loop.&lt;/p&gt;

&lt;p&gt;The real alternative is "I wrote the thing and I own it." Nobody writes that article because there is no affiliate link in it.&lt;/p&gt;

&lt;p&gt;Same move I made &lt;a href="https://rentierdigital.xyz/blog/anthropic-just-killed-my-200-month-openclaw-setup-so-i-rebuilt-it-for-15" rel="noopener noreferrer"&gt;when Anthropic killed my $200/month scraping setup and I rebuilt it for $15&lt;/a&gt;. Same principle every time: the SaaS is almost always a UI on top of something free. If you can describe what you want to a model, you're paying for the skin.&lt;/p&gt;

&lt;p&gt;Which, if you can't code, is fair. If you can code but prefer paying to prompting, also fair. But if you enjoy the build, know the paywall is optional. And getting more optional every month.&lt;/p&gt;

&lt;p&gt;Six months from now there will be fifteen new "1of10 alternative, only $9/month!" launches on Product Hunt. All built on the same free public API. All with a dashboard, a dark mode toggle, an onboarding call nobody wants to do.&lt;/p&gt;

&lt;p&gt;And then there are the ones who open their editor. Who use AI to build tools the exact shape of their workflow. Like Japanese artisans, making their own chisels for the wood they carve.&lt;/p&gt;

&lt;p&gt;No churn. No AI pivot. Just a for loop.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://developers.google.com/youtube/v3/getting-started#quota" rel="noopener noreferrer"&gt;YouTube Data API v3 quota reference&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;Model Context Protocol specification&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(*) The cover is AI-generated. I tried prompting "a viral thumbnail outlier", just to see. The model returned a conference poster.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>claude</category>
      <category>aitools</category>
    </item>
    <item>
      <title>Open Source Died Yesterday. AI Killed It. What Replaces It Is Worse.</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Sun, 19 Apr 2026 13:41:11 +0000</pubDate>
      <link>https://forem.com/rentierdigital/open-source-died-yesterday-ai-killed-it-what-replaces-it-is-worse-33jm</link>
      <guid>https://forem.com/rentierdigital/open-source-died-yesterday-ai-killed-it-what-replaces-it-is-worse-33jm</guid>
      <description>&lt;p&gt;Yesterday, Cal.com closed their source code. One of the world's largest Next.js open source projects. Done. Co-founder Bailey Pumfleet figures that sharing your code in the age of AI is like handing out the blueprint to a bank vault to 100x more hackers than before. His partner Peer Richelsen followed up, saying any open-source application is at risk and should take all or the sensitive parts private. Meanwhile, Peter steipete (OpenClaw/OpenAI) responded "bad news" with a screenshot of GPT 5.4-Cyber reverse-engineering closed source without breaking a sweat.&lt;/p&gt;

&lt;p&gt;Is open source dead?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR:&lt;/strong&gt; Cal.com is right to be scared. &lt;strong&gt;Mythos&lt;/strong&gt; (Anthropic's cyber AI) cracked a &lt;strong&gt;27-year-old OpenBSD bug&lt;/strong&gt; last week. But their remedy collides with a principle that's &lt;strong&gt;143 years old&lt;/strong&gt; and that nobody in this entire debate has mentioned once. When you apply that principle to their decision, you realize they didn't solve the problem they think they solved.&lt;/p&gt;

&lt;p&gt;They're not alone. Tailwind, cURL, Ghostty, tldraw, GitHub. Different projects, same reflex, same reason: AI. The open source ecosystem is in full retreat and nobody is stopping to ask if the retreat even leads somewhere safer.&lt;/p&gt;

&lt;p&gt;I build with OSS every day. Hetzner, Postgres, Redis, Next.js, whatever npm package I pull without thinking twice. And now one of open source's poster boys just raised both hands. Everyone reacts. Nobody asks the real question.&lt;/p&gt;

&lt;p&gt;Is closing the code the right response to a problem that is, itself, very real?&lt;/p&gt;

&lt;h2&gt;
  
  
  Mythos Cracked a 27-Year-Old Bug
&lt;/h2&gt;

&lt;p&gt;On April 7, Anthropic released the technical report on Mythos Preview. The numbers are not what you'd call subtle.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;27-year-old TCP SACK integer overflow&lt;/strong&gt; in OpenBSD, an operating system famous for its obsessive security posture. A 16-year-old heap buffer overflow in FFmpeg's H.264 decoder, present in nearly every phone, browser, and computer on the planet. Five million automated fuzzer runs never caught it. Mythos identified it through &lt;strong&gt;semantic reasoning about code logic&lt;/strong&gt;, not brute force. A 17-year-old remote code execution vulnerability in FreeBSD's NFS server (CVE-2026-4747). Unauthenticated root access. Twenty-gadget ROP chain split over multiple packets. Fully autonomous.&lt;/p&gt;

&lt;p&gt;On Firefox 147, Opus 4.6 turned known vulnerabilities into working exploits twice. Mythos did it 181 times.&lt;/p&gt;

&lt;p&gt;Over 99% of the vulnerabilities remain unpatched. The model is not publicly available. Project Glasswing gave access to Amazon, Apple, Microsoft, and a handful of others. Anthropic committed $100M in usage credits. The same week, &lt;a href="https://rentierdigital.xyz/blog/anthropic-just-crashed-15-billion-in-cybersecurity-stocks" rel="noopener noreferrer"&gt;the announcement wiped $15 billion off cybersecurity stocks&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now, the part Cal.com did not mention. Mythos found those bugs in open source code, yes. But Anthropic's own report states explicitly that Mythos finds and exploits zero-days in "every major operating system and every major web browser." Including closed-source. Two days after the announcement, AISLE (a cybersecurity startup) tested the exact showcase vulnerabilities against small, cheap, open-weights models. Eight out of eight detected the FreeBSD NFS vulnerability. The smallest model had 3.6 billion parameters and cost $0.11 per million tokens.&lt;/p&gt;

&lt;p&gt;AISLE's conclusion: the moat in AI cybersecurity is the system, not the model.&lt;/p&gt;

&lt;p&gt;Cal.com is right to panic. Wrong about the fix.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kerckhoffs Warned Us. In 1883.
&lt;/h2&gt;

&lt;p&gt;The entire "close everything" camp is building on a foundation that was debunked before electricity was common in homes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auguste Kerckhoffs&lt;/strong&gt;, 1883, &lt;em&gt;La Cryptographie Militaire&lt;/em&gt;. One sentence that built modern cryptography: a system must not require secrecy, and it should be able to fall into the enemy's hands without causing problems. 143 years. Never invalidated. &lt;strong&gt;Claude Shannon&lt;/strong&gt; reformulated it in 1949: the enemy knows the system being used. Every serious security framework in existence since then assumes the attacker has the source code. That is not an ideological position. That is how you build systems that actually resist attack.&lt;/p&gt;

&lt;p&gt;Cal.com just based their entire 2026 security strategy on a principle the industry abandoned before the light bulb. They locked the vault and assumed nobody can see through the walls. Mythos sees through walls. GPT 5.4-Cyber sees through walls. The next model, six months from now, will see through thicker ones.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security through obscurity&lt;/strong&gt; has a 143-year losing record.&lt;/p&gt;

&lt;h2&gt;
  
  
  We've Seen This Panic Before. It Ended Fine.
&lt;/h2&gt;

&lt;p&gt;Late 90s, early 2000s. Automated fuzzing tools arrive. Same panic, same articles. Some of them on Slashdot, which gives you a sense of the era.&lt;/p&gt;

&lt;p&gt;Maintainers complained about the workload explosion. Bad actors used the fuzzing tools before patches shipped. The sky was falling. A commenter named williamyf laid it out in the Cal.com Slashdot thread last week: same tone in the articles back then, same predictions, same outcome. The big companies eventually stepped up with free tooling and compute for OSS projects. Maintainers adapted their procedures. The software world kept turning.&lt;/p&gt;

&lt;p&gt;The answer was not to close the code. It was to adapt.&lt;/p&gt;

&lt;p&gt;Cal.com is replaying a mistake the industry already corrected 25 years ago. The tool changed. Fuzzers then, LLMs now. The panic is identical. The correct response has not changed either.&lt;/p&gt;

&lt;p&gt;(Honestly, if you had told a Slashdot commenter in 2001 that a scheduling startup would close its source in 2026 because of AI and call it a security strategy, they would have laughed you out of the thread.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing the Code Is Just a Puzzle With More Steps
&lt;/h2&gt;

&lt;p&gt;When a dev sees closed source, they don't see a wall. They see a puzzle. I know because I am that dev.&lt;/p&gt;

&lt;p&gt;I &lt;a href="https://rentierdigital.xyz/blog/ai-agent-api-reverse-engineering" rel="noopener noreferrer"&gt;mapped 27 undocumented Ghost endpoints in 17 minutes&lt;/a&gt; using Chrome DevTools Protocol. Full audit log. Database export in a single API call. TypeScript wrapper, 830 lines, 40/40 tests. Ghost is open source and I ran the whole experiment locally, publicly.&lt;/p&gt;

&lt;p&gt;That was the publishable demo. I've since run the exact same method on three commercial applications. Products you probably use. Results identical. Two of those writeups will never come out.&lt;/p&gt;

&lt;p&gt;The method does not care about your source code. It needs a browser. &lt;strong&gt;Chrome DevTools Protocol&lt;/strong&gt; exposes every API call your application makes. An agent reads the traffic natively, iteratively, builds a complete map of your endpoints and data flow. No repo access. For Cal.com specifically, without touching their GitHub: TypeScript bundle is minified but not encrypted, mobile traffic is observable, every API call fires in DevTools the moment you load the scheduler.&lt;/p&gt;

&lt;p&gt;Closing the code hides the blueprint. Not the building.&lt;/p&gt;

&lt;p&gt;And the thing is, the devs who would have filed a GitHub issue about a permission check that leaks? Those devs now won't say a thing. You turned your most helpful users into silent bystanders.&lt;/p&gt;

&lt;p&gt;Closing the code is a puzzle, not a wall.&lt;/p&gt;

&lt;h2&gt;
  
  
  You Just Locked the Empty Vault
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Ficeberg-diagram-small-visible-tip-above-waterline-labeled-c3e2254d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Ficeberg-diagram-small-visible-tip-above-waterline-labeled-c3e2254d.png" alt='Iceberg diagram. Small visible tip above waterline labeled "SOURCE CODE" (what Cal.com just locked). Massive submerged base with three zones: VELOCITY, TRUST and INSTALLED BASE, INTEGRATIONS and NETWORK' width="768" height="1029"&gt;&lt;/a&gt;&lt;br&gt;The Hidden Value Beneath Open Source Code
  &lt;/p&gt;

&lt;p&gt;Cal.com was, by their own description, the world's largest Next.js open source project. The community that found their bugs for free just vanished. That community owed them nothing. It showed up because the code was open.&lt;/p&gt;

&lt;p&gt;cURL's Daniel Stenberg ran his bug bounty for 7 years. The confirmed-vulnerability rate started above 15%. By 2025, it collapsed below 5% because AI slop flooded the process until real reports drowned. He shut it down. Mitchell Hashimoto at Ghostty was more direct: "This is not anti-AI, this is anti-idiot." tldraw closed all external pull requests. Same problem, same exhaustion.&lt;/p&gt;

&lt;p&gt;So the community is under stress everywhere. Fair. The question is whether closing helps or makes it worse.&lt;/p&gt;

&lt;p&gt;Consider what Cal.com actually locked. Their Next.js code. And consider what they cannot lock, because it was never in the repo: their shipping velocity, their Google and Outlook integrations, their enterprise base, their five years of product experience. Red Hat's Linux is 100% open and IBM paid $34 billion for it. IBM did not buy the code. They bought the support, the certification, the trust.&lt;/p&gt;

&lt;p&gt;Your code is the least defensible part of your business.&lt;/p&gt;

&lt;p&gt;Karen from Accounting could have told them that. The asset on the balance sheet is the customer list and the renewal rate, not the GitHub repository. But nobody invites Karen to the security meeting.&lt;/p&gt;

&lt;p&gt;And now for the part that should really worry Cal.com's customers. Close the code, and you don't just lose the people who filed issues for free. You start drifting back toward the world before open source. Broadcom bought VMware late 2023, customer bills went up 10x in six months. Oracle Database still sits at $47,500 per CPU. Your side project runs at $15/month on a Hetzner VPS because Linux, Postgres, Redis, Nginx are all open source. If every commercial OSS company closes their code out of Mythos panic, you don't lose the infrastructure layer. You lose the layers on top: schedulers, billing, analytics, auth. You drift back into a world where every component is an enterprise invoice.&lt;/p&gt;

&lt;p&gt;That is what replaces open source. Not something better. Something more expensive. 🤷&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Answer Is Speed, Not Secrecy
&lt;/h2&gt;

&lt;p&gt;Peter steipete chose the opposite direction. His strategy for OpenClaw: rapid iteration and code hardening, even though it introduced occasional regressions and people yelled at him for it. He sees it as the only way forward. I think he's right.&lt;/p&gt;

&lt;p&gt;In a Mythos world, the defense is not hiding your code. The defense is &lt;strong&gt;patching faster than attackers exploit&lt;/strong&gt;. Anthropic's own report says it: the advantage goes to whichever side gets the most out of these tools. In the short term, that could be attackers. In the long term, it should be defenders, because defenders have something attackers don't: the commit access to fix the code.&lt;/p&gt;

&lt;p&gt;But "should" requires work. Publish a real SECURITY.md with an actual response SLA, not a template you copied from a GitHub starter repo. Automate your CVE scans and treat flagged dependencies like production incidents, not backlog items that sit for three sprints. Shorten your patch cycle. The gap between "vulnerability discovered" and "patch deployed" is the only window that matters now, and every day you leave it open is a day Mythos (or the next thing after Mythos) can walk through.&lt;/p&gt;

&lt;p&gt;I ran my own dependency audit the week the Mythos report dropped. Found two outdated packages with known CVEs that had been sitting there since I last touched the project. Not because I didn't care. Because the process wasn't automated. That gap is what kills you. Not whether your code is on GitHub.&lt;/p&gt;

&lt;p&gt;Open source plus rapid hardening is not open source plus hope for the best. It is disciplined work. But it's the only approach that survives in a world where the attacker's toolkit gets better every six months and closing the door doesn't actually close the door.&lt;/p&gt;

&lt;h2&gt;
  
  
  Long Live Open Source
&lt;/h2&gt;

&lt;p&gt;So yes. Open source died yesterday. The one that counted on "many eyes" to compensate for the absence of real security discipline. The one that published code hoping the community would find the bugs. Mythos buried that one on April 7 when it found a 27-year-old bug in OpenBSD.&lt;/p&gt;

&lt;p&gt;Pumfleet is right on that specific point. That model is done.&lt;/p&gt;

&lt;p&gt;What survives is the OSS that takes security as seriously as a proprietary project. That publishes its SECURITY.md with a real SLA. That pays its maintainers (Tailwind couldn't pay 8 people despite 75 million monthly downloads: that's a business model problem, not an open source problem). That iterates fast. That has an explicit threat model, not a wish to never run into a motivated attacker.&lt;/p&gt;

&lt;p&gt;I read the Mythos paper, I watched Cal.com close their code, and I chose the opposite. I'm not closing anything. I'm accelerating my patch cycles. I'm publishing my advisories. I'm staying open, because the value of my work has never been in my code. Staying open is the best protection I have against what's coming.&lt;/p&gt;

&lt;p&gt;The king is dead. Long live the king.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Anthropic, "Assessing Claude Mythos Preview's cybersecurity capabilities," red.anthropic.com, April 7, 2026.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;AISLE, "AI Cybersecurity After Mythos: The Jagged Frontier," aisle.com, April 2026.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Cal.com, "Cal.com Goes Closed Source," cal.com/blog, April 14, 2026.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;(*) The cover is AI-generated. Two French comic characters arguing about vault security while a lobster watches, amused. Standard Tuesday.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>technology</category>
      <category>opensource</category>
      <category>cybersecurity</category>
    </item>
    <item>
      <title>Chrome DevTools Protocol + Claude Code: The Pattern Open Source Teams Spent Years On</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Sat, 18 Apr 2026 13:41:10 +0000</pubDate>
      <link>https://forem.com/rentierdigital/chrome-devtools-protocol-claude-code-the-pattern-open-source-teams-spent-years-on-376e</link>
      <guid>https://forem.com/rentierdigital/chrome-devtools-protocol-claude-code-the-pattern-open-source-teams-spent-years-on-376e</guid>
      <description>&lt;p&gt;I'll admit it, curiosity is both a flaw and a feature with me.&lt;/p&gt;

&lt;p&gt;Every app you use daily can do more than what it shows you. There are often beta features hidden behind a flag, and undocumented endpoints running, quietly responding.&lt;/p&gt;

&lt;p&gt;I pointed Claude Code at a local Ghost instance with Chrome DevTools as an MCP server. In one afternoon, the agent found &lt;strong&gt;27 endpoints&lt;/strong&gt; the official documentation mentions nowhere. Detailed &lt;strong&gt;member stats&lt;/strong&gt;, a full &lt;strong&gt;audit log&lt;/strong&gt;, a &lt;strong&gt;database export&lt;/strong&gt; in a single call. All of that was there. From the start.&lt;/p&gt;

&lt;p&gt;I'm talking about &lt;strong&gt;Ghost&lt;/strong&gt; here, but what matters is the &lt;strong&gt;method works on any app&lt;/strong&gt; (including commercial ones...). And what used to take weeks for the obsessives behind yt-dlp, a solo dev with an agent now does between coffee and lunch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: AI agents + Chrome DevTools turn internal API reverse engineering into a reproducible one-shot. 27 undocumented endpoints found on Ghost in one afternoon, typed wrapper + tests included. The method works on any tool. Here's how.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fair warning: experiments on open-source software I run locally. Proprietary tools? Check the TOS first. I'm sharing a method, not legal advice.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Documentation Is the Children's Menu
&lt;/h2&gt;

&lt;p&gt;You're at a restaurant and they hand you the children's menu. Six items, big fonts, pictures of happy chickens. Meanwhile the kitchen runs a full 40-item carte with stuff you'd actually want to order. Nobody's hiding it from you. They just figured you wouldn't ask.&lt;/p&gt;

&lt;p&gt;That's what application documentation is. A curated selection, not a technical inventory.&lt;/p&gt;

&lt;p&gt;Three forces keep it that way. Features that aren't "ready" for public consumption but already work internally (the admin UI uses them, you just can't). Capabilities gated behind premium tiers that technically respond to anyone who hits the right endpoint. And stuff the team built for their own operations and never bothered to document because it wasn't meant for you.&lt;/p&gt;

&lt;p&gt;To be fair to vendors, there's a solid reason for this. Every endpoint you put in the docs becomes an implicit stability contract. Break it, and a thousand developers open a GitHub issue before you finish your morning coffee. So teams document the minimum viable surface and move on. That's not malicious. It's just expensive to do otherwise.&lt;/p&gt;

&lt;p&gt;The docs show you what they want you to use. Not what the app can do.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Obsessives Who Came Before Us
&lt;/h2&gt;

&lt;p&gt;Before agents entered the picture, reverse engineering internal APIs was already a thing. A glorious, painful, time-consuming thing.&lt;/p&gt;

&lt;p&gt;Consider yt-dlp. Hundreds of contributors maintaining one piece of software whose entire purpose is to understand YouTube's internal API. Every time Google changes something (which is constantly, sometimes seemingly out of spite), someone has to figure out the new flow, patch it, ship it. It works. But it's also a full-time project for a small army of volunteers.&lt;/p&gt;

&lt;p&gt;Then there was Nitter. A beautiful alternative Twitter frontend built entirely on reverse-engineered endpoints. Worked great, until Elon locked the APIs and it was finished. Years of work, gone in a policy change. Remember that one, it comes back later.&lt;/p&gt;

&lt;p&gt;These projects proved something important: the undocumented capabilities are real, useful, and people will build remarkable things on top of them. But the cost was absurd. Weeks of manual traffic inspection. Deep protocol expertise. Constant maintenance against moving targets. It was a sport for the obsessive (I remember debugging games in ASM on Amstrad CPC, so I get the appeal, but still).&lt;/p&gt;

&lt;p&gt;yt-dlp has hundreds of contributors to maintain a single reverse engineering effort. I needed zero.&lt;/p&gt;

&lt;h2&gt;
  
  
  An AI Agent Just Collapsed Weeks Into Minutes
&lt;/h2&gt;

&lt;p&gt;The fundamental shift is not about better tooling. It's about who does the exploration work.&lt;/p&gt;

&lt;p&gt;Here's the technical setup. Chrome DevTools Protocol (CDP) exposes everything the browser knows: DOM tree, network requests, console output, performance metrics. Normally you interact with it through the DevTools GUI or via Puppeteer-style automation. An MCP server wraps CDP into a protocol that AI agents speak natively. The agent gets three capabilities that matter here: &lt;code&gt;javascript_tool&lt;/code&gt; (execute arbitrary JS in the page context, including &lt;code&gt;fetch()&lt;/code&gt; calls with the active session cookies), &lt;code&gt;computer&lt;/code&gt; (wait, click, navigate), and access to the full network waterfall.&lt;/p&gt;

&lt;p&gt;That combination is what changes the game. The agent doesn't just read about APIs. It makes live calls inside an authenticated session, inspects the responses, and iterates. All the things you'd do manually with the Network tab open, except the agent does it systematically and doesn't get bored after endpoint number seven.&lt;/p&gt;
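
&lt;p&gt;"Catalog unique endpoints by path and method" is a small dedupe over that waterfall. A minimal sketch, assuming the agent has dumped network entries as method/URL pairs (the entry shape is illustrative, not the MCP server's actual output format):&lt;/p&gt;

```typescript
// Minimal sketch: collapse a network waterfall into unique endpoint
// signatures. The entry shape is illustrative, not the MCP server's format.
interface NetworkEntry {
  method: string;
  url: string;
}

// Normalize to "METHOD /path" and fold resource IDs so /posts/<id-a>/ and
// /posts/<id-b>/ collapse into one signature.
function catalogEndpoints(entries: NetworkEntry[]): string[] {
  const seen = new Set<string>();
  for (const e of entries) {
    const path = new URL(e.url).pathname
      .replace(/\/[0-9a-f]{24}(\/|$)/g, "/:id$1"); // 24-hex ObjectId segments
    seen.add(`${e.method.toUpperCase()} ${path}`);
  }
  return [...seen].sort();
}
```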

&lt;p&gt;Google shipping Chrome DevTools as an official MCP server in March 2026 is what makes this not a hack but a supported workflow. The company that builds the browser decided that giving AI agents live access to the DOM, the network tab, and the console was worth maintaining. That's an industry signal, not a community experiment.&lt;/p&gt;

&lt;p&gt;Before this, agents were essentially blind to runtime behavior. They could read documentation, generate code, call known APIs. But they couldn't watch what an application actually does on the wire. Now they can. The agent reads the traffic, not the docs. On open source, that's your fundamental right. On proprietary software, your TOS mileage varies, which is why the disclaimer up there exists.&lt;/p&gt;

&lt;p&gt;Google just made the pattern official. Agents read network traffic now, not documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ghost, 27 Endpoints, One Afternoon
&lt;/h2&gt;

&lt;p&gt;The setup: Ghost v6.22.0 running locally, Claude Code with Chrome DevTools MCP connected to the admin panel at &lt;code&gt;localhost:2368/ghost/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The first prompt wasn't "go explore" (that would be vibe coding). It was structured: intercept all admin panel requests, catalog unique endpoints by path and HTTP method, record response shapes, then systematically probe adjacent URL patterns. The agent used &lt;code&gt;javascript_tool&lt;/code&gt; to inject &lt;code&gt;fetch()&lt;/code&gt; calls directly in the admin page context, which meant it inherited the active session cookies and admin-level permissions. No separate authentication dance needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1: passive interception.&lt;/strong&gt; While I navigated through the Ghost admin (dashboard, posts, members, settings), the agent recorded every API call the frontend made. Thirteen live endpoints surfaced immediately. These are the ones the admin UI actually uses but that the official API docs don't mention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 2: active probing.&lt;/strong&gt; This is where it gets interesting. The agent took the URL patterns it had already seen (&lt;code&gt;/ghost/api/admin/stats/...&lt;/code&gt;, &lt;code&gt;/ghost/api/admin/actions/...&lt;/code&gt;) and started probing variations. It tried adjacent routes, different query parameters, the plural and singular forms of what it already knew. It fetched the official Ghost Admin API docs and the Content API docs in parallel, then computed the delta between what's documented and what actually responds with a 200. By the end: 27 endpoints total, all returning valid data.&lt;/p&gt;
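
&lt;p&gt;The variation probing in phase 2 reduces to a candidate generator plus a status check. A sketch under stated assumptions: the plural/singular rule, the injected &lt;code&gt;probe&lt;/code&gt; function, and the 200-means-live heuristic are illustrations, not the agent's actual strategy:&lt;/p&gt;

```typescript
// Illustrative sketch of the "probe adjacent routes" loop. The candidate
// rules and the 200-means-live heuristic are assumptions.
type Probe = (url: string) => Promise<number>; // returns HTTP status

// Expand each known path with its plural/singular twin.
function candidates(base: string, known: string[]): string[] {
  const out = new Set<string>();
  for (const k of known) {
    out.add(k);
    out.add(k.endsWith("s/") ? k.replace(/s\/$/, "/") : k.replace(/\/$/, "s/"));
  }
  return [...out].map((p) => base + p);
}

// Anything answering 200 goes on the "undocumented but live" list.
async function probeAll(urls: string[], probe: Probe): Promise<string[]> {
  const live: string[] = [];
  for (const u of urls) {
    if ((await probe(u)) === 200) live.push(u);
  }
  return live;
}
```

&lt;p&gt;Inside the real session, &lt;code&gt;probe&lt;/code&gt; would be a &lt;code&gt;fetch()&lt;/code&gt; injected into the admin page, riding the session cookies for free.&lt;/p&gt;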

&lt;p&gt;&lt;strong&gt;Phase 3: wrapper construction.&lt;/strong&gt; The agent generated an 830-line TypeScript wrapper (&lt;code&gt;ghost-enhanced-api.ts&lt;/code&gt;) with two clients. A &lt;code&gt;GhostOfficialClient&lt;/code&gt; that wraps the documented Admin API (your baseline), and a &lt;code&gt;GhostEnhancedClient&lt;/code&gt; that adds every undocumented endpoint found. Strict TypeScript interfaces for every response shape, because when you're working with endpoints that have no documentation, types are your documentation.&lt;/p&gt;

&lt;p&gt;Authentication was interesting too. Ghost's Admin API uses JWT signed with HMAC-SHA256, derived from a hex-encoded API key split at position 24 (the first half is the key ID, the second is the secret). The agent figured this out from observing the admin panel's own auth headers and implemented it with &lt;code&gt;crypto.subtle&lt;/code&gt; in the wrapper. No documentation consulted for that part.&lt;/p&gt;
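
&lt;p&gt;For reference, Ghost's documented scheme matches that observation: the Admin API key is &lt;code&gt;{id}:{secret}&lt;/code&gt; (the id is 24 hex characters, hence the position-24 split), the secret is hex-decoded, and the token is an HS256 JWT valid for five minutes with &lt;code&gt;/admin/&lt;/code&gt; as audience. A sketch using &lt;code&gt;node:crypto&lt;/code&gt; instead of &lt;code&gt;crypto.subtle&lt;/code&gt;:&lt;/p&gt;

```typescript
import { createHmac } from "node:crypto";

// Sketch of Ghost's Admin API token scheme as documented by Ghost: key is
// "{id}:{secret}", secret is hex-decoded, token is an HS256 JWT valid 5 min.
function ghostAdminToken(apiKey: string, now = Math.floor(Date.now() / 1000)): string {
  const [id, secret] = apiKey.split(":");
  const b64url = (obj: object) =>
    Buffer.from(JSON.stringify(obj)).toString("base64url");
  const header = b64url({ alg: "HS256", typ: "JWT", kid: id }); // key id rides in the header
  const payload = b64url({ iat: now, exp: now + 300, aud: "/admin/" });
  const sig = createHmac("sha256", Buffer.from(secret, "hex"))
    .update(`${header}.${payload}`)
    .digest("base64url");
  return `${header}.${payload}.${sig}`;
}
```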

&lt;p&gt;What the agent found, in concrete terms:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stats endpoints (8 total)&lt;/strong&gt; — &lt;code&gt;stats/member_count/&lt;/code&gt;, &lt;code&gt;stats/mrr/&lt;/code&gt;, &lt;code&gt;stats/subscriptions/&lt;/code&gt;, &lt;code&gt;stats/referrers/&lt;/code&gt; with conversion tracking, &lt;code&gt;stats/top-posts-views/&lt;/code&gt;. Ghost runs an entire analytics backend that the official docs pretend doesn't exist. MRR broken down by currency, referrer attribution with conversion rates, daily member growth. This is the kind of data you'd normally need a third-party analytics tool to get.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Audit log&lt;/strong&gt; — &lt;code&gt;actions/&lt;/code&gt; endpoint. Complete journal of every admin operation: who changed what setting, who published which post, when. Full &lt;code&gt;action_type&lt;/code&gt;, &lt;code&gt;resource_type&lt;/code&gt;, &lt;code&gt;actor&lt;/code&gt; fields. The sort of feature that's usually "Enterprise tier, contact sales."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Email system&lt;/strong&gt; — three separate endpoint groups: &lt;code&gt;emails/&lt;/code&gt; (delivery stats per email), &lt;code&gt;links/&lt;/code&gt; (click tracking), &lt;code&gt;automated_emails/&lt;/code&gt; (newsletter automation metrics). Independent from post endpoints, meaning you can query email performance without going through the posts API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Database export&lt;/strong&gt; — &lt;code&gt;GET /ghost/api/admin/db/&lt;/code&gt; returns a full JSON backup. One call. (And its mirror, &lt;code&gt;POST /ghost/api/admin/db/&lt;/code&gt;, does a destructive import. That one goes in the "don't touch" category for obvious reasons.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Also discovered&lt;/strong&gt;: &lt;code&gt;mentions/&lt;/code&gt; (Webmentions/ActivityPub), &lt;code&gt;recommendations/&lt;/code&gt; and &lt;code&gt;incoming_recommendations/&lt;/code&gt; (the recommendation engine), &lt;code&gt;snippets/&lt;/code&gt;, &lt;code&gt;labels/&lt;/code&gt;, &lt;code&gt;roles/&lt;/code&gt;, and full server config.&lt;/p&gt;

&lt;p&gt;The test suite (40 tests) passed 39/40 on first run. The one failure was a response key mismatch: &lt;code&gt;incoming_recommendations/&lt;/code&gt; returns its data under a &lt;code&gt;recommendations&lt;/code&gt; key, not &lt;code&gt;incoming_recommendations&lt;/code&gt;. Exactly the kind of inconsistency that only shows up when you actually hit the endpoint and look at what comes back. Fix was one line. 40/40.&lt;/p&gt;

&lt;p&gt;I've already seen &lt;a href="https://rentierdigital.xyz/blog/claude-code-n8n-architect-open-source" rel="noopener noreferrer"&gt;Claude Code absorb an entire open-source tool and become more competent than its own documentation&lt;/a&gt;. Same energy here, applied to API surfaces nobody had mapped.&lt;/p&gt;

&lt;p&gt;Classification: 22 endpoints safe (read-only), 9 use-with-caution (write operations), 1 don't-touch (&lt;code&gt;POST /db/&lt;/code&gt;, the destructive import). And the non-negotiable part: undocumented endpoints carry zero stability contract. They change between versions without a changelog entry. A health check on every endpoint is the first thing you build, before anything else.&lt;/p&gt;
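
&lt;p&gt;That health check can be as simple as "does each endpoint still answer 200 with the response key recorded at discovery time". A minimal sketch (the spec shape and the injected fetcher are illustrative):&lt;/p&gt;

```typescript
// Minimal health check for undocumented endpoints: verify each one still
// answers and still uses the response key recorded at discovery time.
// The fetcher is injected so this stays testable; specs are illustrative.
interface EndpointSpec {
  path: string;
  key: string; // top-level key the response is expected to carry
}
type Fetcher = (path: string) => Promise<{ status: number; body: Record<string, unknown> }>;

async function healthCheck(specs: EndpointSpec[], get: Fetcher): Promise<string[]> {
  const failures: string[] = [];
  for (const s of specs) {
    const res = await get(s.path);
    if (res.status !== 200) failures.push(`${s.path}: HTTP ${res.status}`);
    else if (!(s.key in res.body)) failures.push(`${s.path}: key "${s.key}" missing`);
  }
  return failures; // empty array = safe to keep building on these endpoints
}
```

&lt;p&gt;Run it on a schedule and a silent rename between Ghost versions shows up as a failing check instead of a broken pipeline.&lt;/p&gt;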

&lt;p&gt;Total agent time: under 17 minutes. The rest of the afternoon was me reading the report and deciding what to build on top of it.&lt;/p&gt;

&lt;p&gt;27 endpoints. Zero documentation. One afternoon. Reproducible on any tool you run.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Fquot-iceberg-quot-schema-visible-part-above-waterline-ghost-b84d009a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Fquot-iceberg-quot-schema-visible-part-above-waterline-ghost-b84d009a.png" alt='"iceberg" schema — visible part above waterline = Ghost official Admin API endpo...' width="768" height="1029"&gt;&lt;/a&gt;&lt;br&gt;"iceberg" schema — visible part above waterline = Ghost official Admin API endpo...
  &lt;/p&gt;

&lt;h2&gt;
  
  
  What This Unlocks (And Why It Matters Now)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Custom MCP servers on any tool.&lt;/strong&gt; You discover the endpoints, wrap them in typed clients, expose them to your agents via MCP (or a CLI, your call). Your agent can now operate inside apps that have zero official agent support. The MCP ecosystem has thousands of community servers already, but most of them build on documented endpoints only. This goes one layer deeper.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent pipelines that don't wait for the vendor.&lt;/strong&gt; Need Ghost to push member stats into your monitoring dashboard every morning? The official API doesn't support it. The undocumented &lt;code&gt;stats/&lt;/code&gt; endpoints do. You write a cron job that calls &lt;code&gt;getStatsReferrers()&lt;/code&gt; and pipes the data wherever you want. You're no longer blocked by what someone else decided to prioritize this quarter.&lt;/p&gt;
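&lt;p&gt;That cron job can be tiny. A Python sketch of the idea, assuming a JSON stats endpoint and an already-minted Admin API JWT; the URL, the &lt;code&gt;stats&lt;/code&gt; key and the field names are placeholders from a discovery run, not anything Ghost guarantees:&lt;/p&gt;

```python
import json
import urllib.request

# Placeholder URL: undocumented endpoint, no stability contract.
STATS_URL = "https://blog.example.com/ghost/api/admin/stats/referrers/"

def fetch_referrers(url, token):
    """GET the undocumented stats endpoint with an Admin API JWT."""
    req = urllib.request.Request(url, headers={"Authorization": f"Ghost {token}"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

def to_rows(payload):
    """Flatten the assumed response shape into (source, visits) pairs."""
    return [(r["source"], r["visits"]) for r in payload.get("stats", [])]

if __name__ == "__main__":
    # Offline sample of the assumed shape, so the fragile part stays
    # testable without hitting the endpoint.
    sample = {"stats": [{"source": "news.ycombinator.com", "visits": 412}]}
    for source, visits in to_rows(sample):
        print(f"{source},{visits}")
```

&lt;p&gt;Keeping the flattening separate from the HTTP call means that when a Ghost upgrade moves the shape, the breakage surfaces in one pure function instead of deep inside your dashboard pipeline.&lt;/p&gt;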

&lt;p&gt;&lt;strong&gt;Custom extensions the UI never imagined.&lt;/strong&gt; Combine the audit log with the email tracking endpoints to build an internal compliance dashboard Ghost will probably never ship. Bridge two tools through their internal endpoints to automate a workflow that would require three browser tabs and manual copy-paste otherwise. The sort of thing that used to require "enterprise tier, please contact sales."&lt;/p&gt;

&lt;p&gt;Now, &lt;a href="https://rentierdigital.xyz/blog/why-clis-beat-mcp-for-ai-agents-and-how-to-build-your-own-cli-army" rel="noopener noreferrer"&gt;the choice between CLI and MCP for connecting agents to tools is an active debate with real performance tradeoffs&lt;/a&gt;. Both work. The point is you need something to expose first. Discovery comes before packaging.&lt;/p&gt;

&lt;p&gt;The vendor roadmap is someone else's priority list. Your agent doesn't need to be on it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Rules of Engagement
&lt;/h2&gt;

&lt;p&gt;Open source first. Always. On open-source software you're reading code that's publicly available; there is no gray zone, full stop. On proprietary tools the situation gets murkier fast, and the TOS might have very strong opinions about automated access. Start with open source, get comfortable with the method, then make informed decisions about where else you take it.&lt;/p&gt;

&lt;p&gt;Health checks are not optional, and I mean structurally not optional. Undocumented endpoints have no stability guarantee. Version 5.92 might expose an endpoint that 5.93 removes without even a changelog entry. Your wrapper needs to detect breakage before it corrupts anything. Every endpoint gets a health check. Every wrapper gets a test suite. This is the boring part, and also the part without which nothing holds.&lt;/p&gt;
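&lt;p&gt;A sketch of what that layer can look like; the endpoint names and expected keys here are illustrative entries (including the &lt;code&gt;recommendations&lt;/code&gt; key quirk from the test run above), and your own discovery output would populate the real table:&lt;/p&gt;

```python
# Minimal contract check for undocumented endpoints: every wrapper call
# first verifies the response still has the shape recorded at discovery
# time. Endpoint names and expected keys are illustrative.
EXPECTED_KEYS = {
    "recommendations/": "recommendations",
    # Discovered quirk: this endpoint also answers under "recommendations".
    "incoming_recommendations/": "recommendations",
}

def check_contract(endpoint, payload):
    """Return (ok, message) for one endpoint's JSON payload."""
    key = EXPECTED_KEYS.get(endpoint)
    if key is None:
        return False, f"no recorded contract for {endpoint}"
    if key in payload:
        return True, "ok"
    return False, f"{endpoint} no longer returns key {key!r}: version drift?"

def health_report(responses):
    """Run every recorded contract; responses maps endpoint to payload."""
    return {ep: check_contract(ep, body) for ep, body in responses.items()}
```

&lt;p&gt;Run it on a schedule and a version bump that silently renames a key turns into a failed check instead of corrupted data downstream.&lt;/p&gt;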

&lt;p&gt;And the one rule I'd tattoo somewhere visible: never build a SaaS on internal endpoints. Personal use, internal tooling, automations for your own stack, go wild. But the moment you sell a product that depends on an undocumented endpoint, you're building on sand. Nitter learned this the hard way 🫠. One upstream policy change and the project was dead. Keep the exploration for yourself.&lt;/p&gt;

&lt;p&gt;The approach itself demands structure too. Pointing an agent at a network tab without clear constraints technically works, but produces garbage at scale. &lt;a href="https://rentierdigital.xyz/blog/i-stopped-vibe-coding-and-started-prompt-contracts-claude-code-went-from-gambling-to-shipping" rel="noopener noreferrer"&gt;Exploring internal APIs with an agent demands the same rigor as any production work&lt;/a&gt;. Prompt contracts, explicit boundaries, defined output formats. Not vibe coding.&lt;/p&gt;

&lt;p&gt;Explore everything. Build what you need. But never sell a product on an endpoint without a contract.&lt;/p&gt;




&lt;p&gt;For years, we used our tools like good students. The docs said "you can do this," and we said OK. Period.&lt;/p&gt;

&lt;p&gt;That silent agreement just broke. An agent + DevTools explores the real capabilities of any application in 30 minutes. The reverse engineering that took yt-dlp hundreds of contributors and years of maintenance became a one-shot for any solo dev on a random Tuesday afternoon.&lt;/p&gt;

&lt;p&gt;The official documentation is the brochure. The source code is the contract. And now you have an agent that reads both ;-)&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Sources &amp;amp; links:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developer.chrome.com/blog/chrome-devtools-mcp" rel="noopener noreferrer"&gt;Google Chrome DevTools MCP&lt;/a&gt; — official release, March 2026&lt;/p&gt;

&lt;p&gt;If you're a dev shipping real things with AI agents, this is what I write about. Subscribe and you'll get the methods before they become Medium trends.&lt;/p&gt;

&lt;p&gt;(*) The cover is AI-generated. The 27 endpoints, however, are very much real and slightly offended they were never documented.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>softwaredevelopment</category>
      <category>api</category>
      <category>claudeai</category>
    </item>
    <item>
      <title>AI Won't Steal Your Job. You Already Handed It Over.</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Fri, 17 Apr 2026 13:41:10 +0000</pubDate>
      <link>https://forem.com/rentierdigital/ai-wont-steal-your-job-you-already-handed-it-over-2pe7</link>
      <guid>https://forem.com/rentierdigital/ai-wont-steal-your-job-you-already-handed-it-over-2pe7</guid>
      <description>&lt;p&gt;Marrakech, last week. I'm looking for a specific shop in the medina. The map the riad gave me is in Arabic. I pull out my phone, open Gemini: "not available in your region." And there I just stand. Phone in hand, five seconds, six seconds. Like someone unplugged me.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR&lt;/strong&gt;: Everyone's panicking about &lt;strong&gt;AI stealing developer jobs&lt;/strong&gt;. Wrong panic. Something slower is happening, in silence, on every front at once. Nobody talks about it because nobody sees it leaving. There's a &lt;strong&gt;muscle you've been outsourcing&lt;/strong&gt; without noticing, and there's a way to get it back. The hard part is admitting which level you're actually at.&lt;/p&gt;

&lt;p&gt;I end up turning to the guy next to me. Hand gestures, simple words, broken French. It works. But on the way back, one question stuck in my skull: this dependency on AI, isn't it slowly making us deeply stupid?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Map Was in Arabic. My Brain Was Empty.
&lt;/h2&gt;

&lt;p&gt;That five-second freeze in the medina kept replaying. Not because of what happened, but because of what didn't. No reflex. No backup plan. No "OK plan B is to ask someone." Just a blank.&lt;/p&gt;

&lt;p&gt;It used to be automatic. Lost in a city, you'd ask. Map in a language you don't read, you'd point. Confused, you'd improvise. You'd treat the world as a problem you could poke at with what you had, and you'd find your way through.&lt;/p&gt;

&lt;p&gt;That reflex was gone. Or at least asleep. &lt;/p&gt;

&lt;p&gt;And once I started looking, I saw it everywhere. Couldn't remember a phone number to save my life. Couldn't navigate without the blue dot. Couldn't write a quick email without asking Claude to polish it. Couldn't even decide which restaurant to walk into without checking the rating first.&lt;/p&gt;

&lt;p&gt;None of those individually look like a problem. That's the whole trick. Each one is small. Each one feels like a productivity win. But add them up across six months of intensive AI use, and you don't have a productivity win anymore. &lt;/p&gt;

&lt;p&gt;You have a wiring change.&lt;/p&gt;

&lt;p&gt;The Marrakech freeze wasn't the bug. It was the alert.&lt;/p&gt;

&lt;h2&gt;
  
  
  Everyone's Selling You "Taste." They're Selling Half a Diagnosis.
&lt;/h2&gt;

&lt;p&gt;For the last few months, the entire AI discourse has been one word: taste. Sam Altman said it. Then every influencer-parrot on the timeline repeated it for three months straight. Taste is the new moat for engineers in the age of LLMs.&lt;/p&gt;

&lt;p&gt;They're not wrong. They're selling half the equation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Taste is judgment&lt;/strong&gt;. Judgment is built by exposure plus friction. Not by exposure alone. You don't develop the judgment of a chef by watching cooking videos. You develop it by burning a sauce, ruining a service, getting yelled at by someone who knows better, and trying again. The friction is the teacher. Skip the friction, skip the lesson.&lt;/p&gt;

&lt;p&gt;The real problem with the current AI workflow is what it removes: the daily friction that builds taste in the first place. Every "ask Claude in two seconds" is a small piece of friction you didn't experience. A small decision you didn't make. A small frustration you didn't sit with. A small mistake you didn't have to walk back from.&lt;/p&gt;

&lt;p&gt;Bloomberg ran a piece earlier this year about an AI coding productivity panic. They were &lt;a href="https://rentierdigital.xyz/blog/bloomberg-ai-coding-productivity-panic" rel="noopener noreferrer"&gt;diagnosing the wrong disease&lt;/a&gt;. The numbers aren't the story. The story is what's happening to the operator behind those numbers. &lt;/p&gt;

&lt;p&gt;The numbers go up. The muscle goes down. &lt;/p&gt;

&lt;p&gt;Great trade for a quarter. Terrible trade for a career.&lt;/p&gt;

&lt;p&gt;This is the perfect crime. The victim doesn't know anything was stolen. Just feels faster than ever and slightly more anxious than usual.&lt;/p&gt;

&lt;h2&gt;
  
  
  Roller Skates Work Until You Hit the First Pebble.
&lt;/h2&gt;

&lt;p&gt;Building a business with AI is like running a marathon on roller skates.&lt;/p&gt;

&lt;p&gt;The other day I was rollerblading in a parking lot with my daughter. She was complaining about the small stones.&lt;/p&gt;

&lt;p&gt;Roller skates are great as long as the ground is smooth. You glide. You go three times faster than walking. The first pebble of any size, you eat the asphalt.&lt;/p&gt;

&lt;p&gt;AI coding is the same physics. Greenfield project, generic CRUD, scaffolding, boilerplate: Claude Code carries you. You ship in an afternoon what used to take a week. But the day there's something weird (an obscure lib failing silently, a non-standard architecture decision, a client you have to read between the lines, a map in Arabic with no wifi), it's your muscle that has to take over. &lt;/p&gt;

&lt;p&gt;And if you haven't kept it warm, you're flat on the ground.&lt;/p&gt;

&lt;p&gt;I'd noticed the airplane version of this already. Long-haul flight, no wifi, you have to write something serious, and it hurts. Not because the task is hard. Because the muscle hasn't been used in weeks. Like a leg you forgot you had.&lt;/p&gt;

&lt;p&gt;While you're going fast on rollers, you forget how to run.&lt;/p&gt;

&lt;p&gt;And one day there's a pebble.&lt;/p&gt;

&lt;h2&gt;
  
  
  The No-AI Protocol I Run (And Why You're Probably Lying About Your Level).
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2F3x3-matrix-columns-daily-friction-weekly-anchor-quarterly-2def0efd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2F3x3-matrix-columns-daily-friction-weekly-anchor-quarterly-2def0efd.png" alt="3x3 matrix. Columns: Daily Friction / Weekly Anchor / Quarterly Reset. Rows: Low / Medium / High. Each cell shows the time commitment (30 min, 90 min, 2x90 min for Daily; 2h, 6h, 10h for Weekly; 2 days, 5 days, 2 weeks for Quarterly). Color gradient progressive from light green (Low) to dark green (High). Pictograms per column: brain icon for Daily, open book for Weekly, compass or plane for Quarterly. Style: rentier digital flat geometric + drop shadows, 9-color palette." width="768" height="1029"&gt;&lt;/a&gt;&lt;br&gt;The No-AI Protocol: Daily, Weekly &amp;amp; Quarterly Friction Levels
  &lt;/p&gt;

&lt;p&gt;The good news: the muscle starves, it doesn't die. You can re-feed it.&lt;/p&gt;

&lt;p&gt;The science isn't new either. Newport's &lt;strong&gt;deep work blocks&lt;/strong&gt;. Leroy's &lt;strong&gt;attention residue&lt;/strong&gt;. Ericsson's &lt;strong&gt;deliberate practice&lt;/strong&gt;. They all converged decades ago on the same point: the brain builds judgment in repeated 45-to-90-minute blocks of friction, not in fragmented quick-checks. Everyone knows. Almost nobody does it.&lt;/p&gt;

&lt;p&gt;So this is what I run. Three scales of friction (daily, weekly, quarterly) and three levels of commitment (low, medium, high). Pick your level honestly. Start there. Level up when you can.&lt;/p&gt;

&lt;p&gt;I've cycled through every level over the past year. Most of them I failed at first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Daily Low&lt;/strong&gt; (30 min/day, no AI, no socials, no podcast) is where I started. Failed it for two weeks. Not because 30 min is long, but because the silence is loud. The first three days, the brain yells. Reaches for the phone. Asks for any stimulation. By day five, something else shows up. Old ideas. Forgotten threads. Stuff you didn't know was queued.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quarterly High&lt;/strong&gt; (two weeks of geographic retreat) is where I am right now, in Marrakech. Not a Tibetan monastery. Just a place where the wifi is bad enough to be honest, the language isn't mine, and the friction is built into the day. Best ROI on judgment recovery I've found.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Bonus Vibe Coder Low&lt;/strong&gt; (writing your CLAUDE.md by hand before asking Claude to polish) is the one most devs will refuse out loud and steal in private. The full case for spec-first work lives in &lt;a href="https://rentierdigital.xyz/blog/i-stopped-vibe-coding-and-started-prompt-contracts-claude-code-went-from-gambling-to-shipping" rel="noopener noreferrer"&gt;Prompt Contracts&lt;/a&gt;. Even the Low version is a meaningful unlock.&lt;/p&gt;

&lt;p&gt;The pebble test, per axis: can I still do X without Claude, GPT, Gemini? If the answer is no, level up. Not all axes at once. One by one.&lt;/p&gt;

&lt;p&gt;There's a trap built in. Almost everyone reads this and self-assesses High. They picture the version of themselves that exists three productive Tuesdays a year. The honest test is what you did this morning between waking up and the first ask. If the phone got there first, you're Low. &lt;/p&gt;

&lt;p&gt;That's fine. Start from where you actually are.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic Pays to Let Claude Dream. We Pay to Stop Ourselves From Thinking.
&lt;/h2&gt;

&lt;p&gt;Sit on a bench in any European city for thirty minutes. Watch the street. Count the headphones: for walking, for running, for eating alone, for buying bread. Nobody has five minutes of idle brain. We outsourced computation to the model and we outsourced silence to Spotify. &lt;/p&gt;

&lt;p&gt;Result: zero windows where the brain works on its own. Zero windows where it can even notice it's losing the ability.&lt;/p&gt;

&lt;p&gt;Meanwhile, Anthropic just shipped a feature in Claude Code called &lt;strong&gt;Auto Dream&lt;/strong&gt;. It gives Claude idle cycles between sessions to consolidate its memory. The parallel with REM sleep is explicit and embraced by the engineers themselves: without that consolidation phase, Claude's memory degrades, contradictions pile up, signal-to-noise drops. The feature was inspired by a UC Berkeley paper from last spring called "Sleep-time Compute," which showed that idle preprocessing can cut inference cost by a factor of five.&lt;/p&gt;

&lt;p&gt;So we pay an LLM provider for the right to let our model dream. And we refuse the same right to ourselves. We treat Claude better than we treat ourselves.&lt;/p&gt;

&lt;p&gt;The smartest engineering teams in the world figured out their models need quiet time to sort their own thoughts. They engineered it. They shipped it. And the supposedly intelligent species running those models walks around with earbuds in at the bakery line. 🤔&lt;/p&gt;

&lt;p&gt;AI won't steal your job. You already gave it your brain. Take it back.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Anthropic Claude Code Auto Dream feature (rolling out March 2026)&lt;/li&gt;
&lt;li&gt;"Sleep-time Compute: Beyond Inference Scaling at Test-time," UC Berkeley, April 2025&lt;/li&gt;
&lt;li&gt;Cal Newport, &lt;em&gt;Deep Work&lt;/em&gt; and related research on focused attention blocks&lt;/li&gt;
&lt;li&gt;Sophie Leroy, "Why is it so hard to do my work?" (University of Washington), on attention residue&lt;/li&gt;
&lt;li&gt;Anders Ericsson, foundational research on deliberate practice and elite performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(*) The cover is AI-generated. Faster than finding an honest stock photo of a guy looking lost in a medina.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>aitools</category>
      <category>productivitytools</category>
    </item>
    <item>
      <title>The One Line in Karpathy's Wiki Gist That 99% of Builders Missed — And Why It's the Whole Point.</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Thu, 16 Apr 2026 13:41:09 +0000</pubDate>
      <link>https://forem.com/rentierdigital/the-one-line-in-karpathys-wiki-gist-that-99-of-builders-missed-and-why-its-the-whole-point-23gj</link>
      <guid>https://forem.com/rentierdigital/the-one-line-in-karpathys-wiki-gist-that-99-of-builders-missed-and-why-its-the-whole-point-23gj</guid>
      <description>&lt;p&gt;We all got the point of a second brain a long time ago. Condense your courses, your books, your PDFs, your notes into one place where you can actually find them again. The concept has been around for ten years, it is digested, plenty of people tried. The problem was never the idea. The problem is maintenance. You feed your system for three months, you end up with a hundred and fifty files, you get lost in them, you spend more time reorganizing than adding. Six months in, the brain is sitting in some corner of your disk and you never touch it again.&lt;/p&gt;

&lt;p&gt;Karpathy posted a gist two weeks ago that solves exactly that. He adds an auto-organization layer on top: the system files its own pages, merges redundancies, keeps itself coherent. Everyone started building it this week. Three folders, a CLAUDE.md, Obsidian on top. Tutorials everywhere.&lt;/p&gt;

&lt;p&gt;Except one thing escaped 99% of builders. And it is a shame, because it is the whole point. Without it, you have a folder that tidies itself. With it, you have a brain that &lt;strong&gt;learns from every question&lt;/strong&gt; you ask it, reads what your tools write to it, and eventually starts building the tools it needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR:&lt;/strong&gt; The architecture is the visible half. One sentence buried in the gist activates a &lt;strong&gt;feedback loop&lt;/strong&gt; that makes the base grow denser every time you use it. Then there is a third channel nobody formalized: your infrastructure feeding the base directly, and the base surfacing which new tools to build, or even building them itself. I activated both on my repo last week. Here is what actually changed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Knowledge Base I Already Had (And Never Really Used to Its Full Potential)
&lt;/h2&gt;

&lt;p&gt;I did not start from scratch six months ago. Like most devs who have been doing this for a while, I already had repos scattered on my disk where I had organized knowledge, skills, processes, tools, docs. One folder per domain. Markdown files carefully structured. SEO notes here, code review patterns there, snippets I kept reusing, architecture decisions I did not want to forget. Docker compose recipes I had ended up rewriting at least three times because I could never find the previous version. Infra diagrams. Deploy checklists. Incident post-mortems I had written down for myself and never reread. I was using these repos every day, loading them into Claude Code as context when I needed them, asking questions against them, copy-pasting rules into new projects.&lt;/p&gt;

&lt;p&gt;It worked. It was useful. And if you are reading this, odds are you have the same thing somewhere. The repo of stuff you ingested, cleaned up, committed. The one you feel good about on Sunday evening after you add a new file.&lt;/p&gt;

&lt;p&gt;The big shift Claude brought, compared to the previous ten years of doing this, is that I stopped asking myself "where did I put that damn thing again." For a decade, the bottleneck of any personal knowledge base was the same: you had the information somewhere, you remembered vaguely writing it down, but finding it meant grep, Spotlight, opening three folders, rereading half a file to check if it was the right one. Now I just ask Claude. The repo is in context, the question gets answered, done. That alone was a huge unlock. It made the repo actually usable for the first time.&lt;/p&gt;

&lt;p&gt;But I was still under-exploiting it. Badly. My repo was a very well-organized library that I had to walk through myself every time I wanted to pull a book off the shelf (except now Claude was walking it for me instead of me). Better, faster, but still one-way. I asked, it answered, the conversation ended. The repo learned nothing from any of it. Tomorrow I would ask a related question, Claude would walk the same shelf again, give me a slightly different answer, and that second answer would evaporate too.&lt;/p&gt;

&lt;p&gt;I think this is why Karpathy's gist resonated so hard when he dropped it. It was not the architecture. Plenty of us already had something similar. The gist gave a name to a vague feeling most of us had been sitting with for months: &lt;em&gt;this thing I built is useful, but it is clearly not doing what it could be doing&lt;/em&gt;. The missing piece was the auto-organization layer. A second brain that files its own pages, merges its own redundancies, maintains its own coherence while you sleep. That is what Karpathy put in front of us.&lt;/p&gt;

&lt;p&gt;And that is what everyone started building this week.&lt;/p&gt;

&lt;h2&gt;
  
  
  Karpathy Posted the Architecture. Everyone Copied the Wrong Half.
&lt;/h2&gt;

&lt;p&gt;The gist is called &lt;em&gt;llm-wiki&lt;/em&gt;. Two folders where it matters. &lt;code&gt;raw/&lt;/code&gt; for the source material, filtered and structured but complete. A full SEO course distilled into 900 lines. A coding book condensed into 600. An ops playbook distilled from three conference talks into 700 lines of what actually applies to your stack. Nobody reads this in production. It is the archive, the place you go back to when the wiki says something that feels off and you want to check the source.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;wiki/&lt;/code&gt; is the operational version. One file per domain. It fuses every raw file in that domain into actionable rules. 150 to 200 lines max. This is what the agents load before producing anything. A fraction of the raw size, and it maintains itself. Add a new course on the same domain, the wiki absorbs the new rules without growing. Contradictions get resolved, obsolete patterns get dropped.&lt;/p&gt;

&lt;p&gt;On top of that, a CLAUDE.md at the root telling the model how to navigate the whole thing, and Obsidian as a frontend so you can browse it like a normal human.&lt;/p&gt;
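&lt;p&gt;For reference, a hedged sketch of what such a &lt;code&gt;CLAUDE.md&lt;/code&gt; can say; the gist leaves the exact wording open, so every rule below is my own convention, not Karpathy's:&lt;/p&gt;

```markdown
# CLAUDE.md — navigation rules (illustrative sketch)

## Navigation
- Answer from `wiki/` first; one file per domain, treat it as ground truth.
- Fall back to `raw/` only when a wiki page feels off or lacks detail.
- Never quote `raw/` directly in an answer; update the wiki page instead.

## Maintenance
- Keep each `wiki/` file under ~200 lines; merge, don't append.
- When two pages contradict each other, resolve the conflict and note why.
```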

&lt;p&gt;The architecture is clean. It is also the obvious part. Of course you separate raw from synthesized. Of course you give the model navigation rules. Of course Obsidian is a good viewer.&lt;/p&gt;

&lt;p&gt;Then the tutorials hit. Every tech YouTuber with a ring light reposted variations of the same diagram this week. Three folders. CLAUDE.md. Obsidian. Build it like Karpathy. Ship a screenshot. Move on.&lt;/p&gt;

&lt;p&gt;I was part of that wave for two days. Rebuilt the structure on top of my existing repo. Ingested more sources. Asked questions. The setup was better than what I had, faster, denser. But the base still sat there, growing only when I manually fed it new things. The auto-organization layer was doing its job on what I put in. It was not doing anything with what I &lt;em&gt;did&lt;/em&gt; with the base afterward.&lt;/p&gt;

&lt;p&gt;That is when I went back to the gist and read it slower. Not the architecture section. The query section. The part that says what happens &lt;em&gt;after&lt;/em&gt; the model answers.&lt;/p&gt;

&lt;h2&gt;
  
  
  The One Sentence That Turns a Static Archive Into a Living Base
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Ftwo-column-comparison-schema-left-column-quot-static-fbaa7330.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Ftwo-column-comparison-schema-left-column-quot-static-fbaa7330.png" alt='Two-column comparison schema. Left column "Static archive / RAG": linear cycle showing source → ingest → query → answer (answer fades into a ghost shape, lost). Right column "Base with feedback loop": closed cycle showing source → ingest → query → answer → filed back → enriches the base → next query starts from richer base. Flat style, two colors max, readable without complex legend.' width="768" height="685"&gt;&lt;/a&gt;&lt;br&gt;Static Archive vs Base with Feedback
  &lt;/p&gt;

&lt;p&gt;The sentence, from Karpathy's own post about the workflow:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Often, I end up filing the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always add up in the knowledge base."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Read it twice. It describes a behavior, not an architecture.&lt;/p&gt;

&lt;p&gt;What it says, plainly: when you ask a question and the model gives you a useful answer, that answer goes back into the wiki as a new page. The next query, on the same topic or adjacent, starts from a base that already contains the previous answer. The base grows from your usage, not just from your ingestion.&lt;/p&gt;

&lt;p&gt;Without this loop, your repo is a RAG with prettier folders. You ingest, you query, the answer flashes on your screen and dies in the chat history. Tomorrow you ask a similar question, the model retrieves from the same source pages, synthesizes the same answer from scratch. You pay for the synthesis every single time (my favorite form of recurring waste, honestly).&lt;/p&gt;

&lt;p&gt;With the loop, the base is stateful. The model's job shifts from "synthesize from raw sources" to "find the page where this is already answered, refine it if needed." Faster, cheaper, denser over time.&lt;/p&gt;
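&lt;p&gt;The filing step is small enough to script. A sketch, assuming a &lt;code&gt;topic/YYYY-MM-DD-slug.md&lt;/code&gt; layout, which is my convention, not the gist's:&lt;/p&gt;

```python
import re
from datetime import date
from pathlib import Path

def file_answer(wiki_root, topic, question, answer):
    """File a useful answer back into the wiki so the next query
    starts from it instead of re-synthesizing from raw sources."""
    slug = re.sub(r"[^a-z0-9]+", "-", question.lower()).strip("-")[:60]
    page = Path(wiki_root) / topic / f"{date.today().isoformat()}-{slug}.md"
    page.parent.mkdir(parents=True, exist_ok=True)
    page.write_text(f"# {question}\n\n{answer}\n")
    return page
```

&lt;p&gt;Wire it in as a post-answer step, or simply tell the model to call it, and every useful answer becomes a page the next query starts from.&lt;/p&gt;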

&lt;p&gt;One paragraph in the gist. The whole reason to build this.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a Dead Container Taught My Knowledge Base
&lt;/h2&gt;

&lt;p&gt;My repo does not only hold courses and books. It also holds the living documentation of my own services: configs, deploy notes, past incidents, architecture decisions I took at 2am and wrote down before I forgot why. Same pattern as the rest, &lt;code&gt;raw/&lt;/code&gt; with the full history, &lt;code&gt;wiki/&lt;/code&gt; with the operational rules. This is where things started getting interesting.&lt;/p&gt;

&lt;p&gt;My distributor catalog sync stopped. The container was up, the process was alive, but it had not pulled a new feed in 34 hours. I noticed because the partner-side product count drifted from what was on my storefront. Customers started ordering things that were no longer in stock upstream.&lt;/p&gt;

&lt;p&gt;I opened Claude Code and asked: "what is the state of the distributor sync, when did it last run successfully, and what is the most likely cause of the silence?" The model went through the wiki, pulled the relevant service page, checked the recent log entries I had ingested, and answered: probably a memory leak in the parser, the container is consuming RAM but not crashing because the OOM killer threshold is set too high. Recommended a restart and a memory cap.&lt;/p&gt;

&lt;p&gt;Classic Claude Code answer. Useful. Specific. Would have died in chat history.&lt;/p&gt;

&lt;p&gt;Except the loop was activated. The answer got filed back into the wiki as a new page under &lt;code&gt;services/distributor-sync/incidents/2026-03-29-silent-failure.md&lt;/code&gt;. The page had the symptom, the diagnosis, the resolution, and a flag noting that this service had now failed silently once. Total cost: one query, one filed page.&lt;/p&gt;

&lt;p&gt;A week later, I asked an unrelated question about the partner API webhook. The model answered, then added the polite version of "by the way, you might want to look at this": "note that your distributor sync had a silent failure 6 days ago, you currently have no monitoring on its heartbeat, you might want to add one before this happens again." It surfaced that on its own because the wiki had the incident page, and the model had read it while looking for context on adjacent services.&lt;/p&gt;

&lt;p&gt;A week earlier, that information would have been gone. The chat session where I diagnosed it would have been closed. The next time the sync died silently, I would have rediscovered the same root cause from scratch.&lt;/p&gt;

&lt;p&gt;The wiki did not just remember. It connected.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Third Channel: When Your Tools Feed the Base, And the Base Builds Its Own Tools
&lt;/h2&gt;

&lt;p&gt;Here is the thing that bugged me about the incident above. I learned the container had failed silently by accident, because customers started complaining. Claude filed the page only after I asked. If I had not asked, the incident would never have existed in the base.&lt;/p&gt;

&lt;p&gt;What if the container itself filed the page?&lt;/p&gt;

&lt;p&gt;Karpathy's loop is human-driven. You ask, the model answers, the answer gets filed. Two channels feed the base: documents you ingest manually, and queries you run. There is a third channel. It is not in the gist.&lt;/p&gt;

&lt;p&gt;Your infrastructure already produces signal continuously. Cron jobs succeed or fail. Containers restart. Services time out. Webhook callbacks return non-200 codes. Most of this signal goes to logs nobody reads, or to alerting systems that fire once and forget. None of it ends up anywhere the model can use.&lt;/p&gt;

&lt;p&gt;What I built on top of Karpathy's pattern is a thin layer that lets the infra itself file pages. A CLI any service can call to append an observation directly into the base. The catalog sync writes a page when it succeeds, with the row count. The webhook handler writes a page when it sees a malformed payload. A cron writes a page when it skips because the previous run was still going. Short pages. Timestamped. They land in a &lt;code&gt;signals/&lt;/code&gt; folder the model knows about.&lt;/p&gt;

&lt;p&gt;The reason this works is the same reason &lt;a href="https://rentierdigital.xyz/blog/why-clis-beat-mcp-for-ai-agents-and-how-to-build-your-own-cli-army" rel="noopener noreferrer"&gt;CLIs make better signal channels than MCP wrappers&lt;/a&gt; for any agent task: a curl one-liner from a Bash script writes to the wiki, no protocol negotiation, no schema dance. The container does not need to know what an LLM is. It just appends a markdown file to a folder.&lt;/p&gt;
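
&lt;p&gt;A minimal emitter sketch, under assumptions: the &lt;code&gt;$HOME/wiki/signals&lt;/code&gt; path and the service names are hypothetical, and it writes straight to the folder rather than through an HTTP endpoint. The only real contract is "append a short, timestamped markdown page to a folder the model knows about":&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical signal emitter any container or cron can call.
# Wiki layout and service names are illustrative, not from the article's setup.
emit_signal() {
  local service="$1" status="$2" message="$3"
  local dir="$HOME/wiki/signals"   # the folder the model knows about
  local ts
  ts="$(date -u +%Y%m%dT%H%M%SZ)"
  mkdir -p "$dir"
  # One short, timestamped markdown page per observation
  printf '# %s: %s\n- time: %s\n- detail: %s\n' \
    "$service" "$status" "$ts" "$message" > "$dir/${ts}-${service}-${status}.md"
}

# e.g. called from the catalog sync when a run completes
emit_signal "catalog-sync" "success" "synced 1842 rows"
```

&lt;p&gt;The failure and skip emitters are the same call with a different status string, which is exactly why the model could later point out which ones were missing.&lt;/p&gt;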

&lt;p&gt;And then the second-order thing started happening.&lt;/p&gt;

&lt;p&gt;Once enough signal accumulated, the model began reading the &lt;code&gt;signals/&lt;/code&gt; folder during queries and pointing out gaps. Not "your container failed" (that part was expected). Things like: "you have three services writing success pages but no failure pages for the webhook handler, which means I cannot tell whether it is working or just silent. You might want to add a failure emitter in that handler." Or: "your cron job for distributor sync writes when it runs, but nothing writes when it skips a cycle. You need a skip emitter."&lt;/p&gt;

&lt;p&gt;Then it stopped asking. It started building.&lt;/p&gt;

&lt;p&gt;A concrete example from last week, and not even an infra one. I was wondering whether to buy 32 or 48 GB of RAM on my next MacBook. Classic question, classic answer from the guy at the Apple store: "you will be fine with 24, trust me." I did not trust him. I asked Claude Code instead, with my repo in context: "how do I know what I actually need?" The model did not give me a ballpark. It proposed building a monitoring CLI (one script to sample RAM metrics every 5 minutes into a CSV, a second script to compute the summary and recommend a size based on the observed peak plus a 30% margin). Wrote both scripts. Ran them. Three days of data later, the verdict was in my wiki: RAM used was pinned at 22 to 23 GB on a 24 GB machine, 77 MB free at the low point, compressor working overtime at 4.5 GB. Recommendation: 32 minimum, 48 if I wanted peace of mind.&lt;/p&gt;
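
&lt;p&gt;The summarizer half of that tool can be sketched in a few lines. The sampler is platform-specific (on a Mac it would scrape &lt;code&gt;vm_stat&lt;/code&gt; or &lt;code&gt;memory_pressure&lt;/code&gt; every 5 minutes), so a tiny hand-written CSV stands in here; the column layout and file name are my assumptions, not the scripts Claude wrote:&lt;/p&gt;

```shell
# Stand-in samples; a real sampler would append one line every 5 minutes.
# Columns: timestamp,used_gb (layout is an assumption for this sketch).
printf '%s\n' \
  '2026-04-14T10:00,21.8' \
  '2026-04-14T10:05,22.4' \
  '2026-04-14T10:10,23.1' > ram_samples.csv

awk -F, '
  { if ($2 > peak) peak = $2 }                # track the observed peak
  END {
    need = peak * 1.3                         # peak plus a 30% margin
    size = 8; while (need > size) size *= 2   # round up to a shipped RAM tier
    printf "peak=%.1fGB need=%.1fGB recommend=%dGB\n", peak, need, size
  }
' ram_samples.csv
```

&lt;p&gt;On these stand-in numbers the verdict is 32 GB, same shape as the one the wiki gave me: observed peak, margin, next tier up.&lt;/p&gt;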

&lt;p&gt;Not a guess. Not marketing. Actual numbers from my actual usage, collected by tools the base built for itself because it knew it did not have the answer yet.&lt;/p&gt;

&lt;p&gt;The base was no longer just learning. It was closing its own blind spots.&lt;/p&gt;

&lt;p&gt;You ingest. You query. Your tools write. And then the base builds the next tool on its own.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Traps Before You Activate the Loop
&lt;/h2&gt;

&lt;p&gt;Three traps I walked into in the first two weeks. Make these calls before you flip the switch, or your base turns into a dumpster fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What gets filed back.&lt;/strong&gt; I started by filing every response. After four days the base had forty-seven variations of "yes that docker command is correct" and I could not find anything useful. The rule now: file back only if the answer reveals something the base did not already know, documents an incident, or makes a decision explicit. Conversational scaffolding dies in chat history where it belongs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who decides quality.&lt;/strong&gt; Self-judging models are too generous with themselves; I discovered that one quickly when Claude filed a page declaring a deprecated API endpoint as "current best practice." Full human review does not scale past a hundred pages. I landed on a middle ground: the model files into &lt;code&gt;pending/&lt;/code&gt;, and a daily cron promotes anything I did not delete within 24 hours. Silence means approval. Laziness is the gate.&lt;/p&gt;
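
&lt;p&gt;The promotion gate fits in a few lines of cron-driven shell. Folder names follow the article's layout; the destination folder and script path are my choices:&lt;/p&gt;

```shell
# Sketch of the "silence means approval" gate. A daily cron entry such as
#   15 3 * * * /usr/local/bin/promote-pending.sh
# would run this. Folder names are illustrative.
PENDING="$HOME/wiki/pending"
APPROVED="$HOME/wiki/notes"
mkdir -p "$PENDING" "$APPROVED"

# Anything still sitting in pending/ after 24 hours (1440 minutes) was not
# deleted during review, so it gets promoted into the base.
find "$PENDING" -name '*.md' -mmin +1440 -exec mv {} "$APPROVED"/ \;
```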

&lt;p&gt;&lt;strong&gt;How you stop errors from compounding.&lt;/strong&gt; This one I learned from the gist comments, not from Karpathy. If the model files a wrong answer, that wrong answer becomes a "fact" in the base. Next query reads it, treats it as ground truth, produces a second wrong answer depending on the first. Three weeks in, your base is gaslighting you. The fix is the same kind of contract I described in my &lt;a href="https://rentierdigital.xyz/blog/i-stopped-vibe-coding-and-started-prompt-contracts-claude-code-went-from-gambling-to-shipping" rel="noopener noreferrer"&gt;prompt contracts framework&lt;/a&gt;: every filed page declares its sources, a confidence level, a re-validation date. No sources, fast expiry. High confidence with verified sources, long life. The base self-prunes.&lt;/p&gt;

&lt;p&gt;Nail these three. The rest runs on its own.&lt;/p&gt;

&lt;h2&gt;
  
  
  30 Days Later: Two Bases, Two Different Systems
&lt;/h2&gt;

&lt;p&gt;In 30 days, plenty of devs will have built exactly the same setup. Three folders, a CLAUDE.md, Obsidian on top. Identical down to the folder names.&lt;/p&gt;

&lt;p&gt;Half of them will have a dead archive that needs to be hand-fed to stay relevant (which honestly will be forgotten within a month, let's be real). The other half will have a base that learns from every question, reads what the infra writes to it, and builds the tools it needs when it notices a gap. Same architecture. One loop and one channel of difference.&lt;/p&gt;

&lt;p&gt;That is what a real personal knowledge base looks like. Not a folder. A system that gets denser every time you use it, and sharper every time it hits something it does not know.&lt;/p&gt;

&lt;p&gt;Karpathy wrote the line. Nobody read it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Andrej Karpathy's &lt;em&gt;llm-wiki&lt;/em&gt; gist on GitHub&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(*) The cover image was generated by an AI which, to be fair, has been filing its own pages since before we made it a hobby.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>claude</category>
      <category>aitools</category>
    </item>
    <item>
      <title>GitHub Is Not Your Backup. One Suspended Account Proved It This Week.</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Wed, 15 Apr 2026 13:41:10 +0000</pubDate>
      <link>https://forem.com/rentierdigital/github-is-not-your-backup-one-suspended-account-proved-it-this-week-2fb3</link>
      <guid>https://forem.com/rentierdigital/github-is-not-your-backup-one-suspended-account-proved-it-this-week-2fb3</guid>
      <description>&lt;p&gt;&lt;em&gt;One developer lost 104 repos this week. I'd already stopped trusting GitHub with mine.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This week a developer found out at 9am that his 104 GitHub repos no longer existed. No hack. No bug. No drive that died. Just an automated email: &lt;strong&gt;account suspended&lt;/strong&gt;, we re-evaluated your 2019 Student Pack, we now think you weren't eligible. Six years later. &lt;strong&gt;24 of those repos had no backup anywhere else&lt;/strong&gt;. Gone in an algorithmic decision made on a Tuesday morning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR:&lt;/strong&gt; If you're in the same spot (everything on GitHub, no mirror, no plan B) you're &lt;strong&gt;one classifier away from the same email&lt;/strong&gt; at 9am. There is a fix. It costs nothing, runs on a server you probably already rent, and almost nobody bothers. The question is why.&lt;/p&gt;

&lt;p&gt;The replies under that post are the other story. Hundreds of devs realizing in real time that they're in exactly the same situation. Everything on GitHub. No mirror. No plan B. And the same question coming back in every thread: &lt;em&gt;ok but concretely, what do I do now?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I didn't wait. Not out of paranoia, not because I had a crystal ball. Just because at some point in your career as a dev you stop confusing "hosted somewhere else" with "backed up." GitHub is a great tool. It's a great forge. It's a great social network for code. It is not a backup; that has never been GitHub's job, and it's written in black and white in their terms of service.&lt;/p&gt;

&lt;h2&gt;
  
  
  104 Repos. One Automated Decision. No Warning.
&lt;/h2&gt;

&lt;p&gt;The story that went around this week is simple enough to tell in two sentences. A developer signed up for a free Student Pack in 2019. Six years later, an automated review re-evaluated the original eligibility, decided retroactively that it was never legitimate, and suspended the account. &lt;strong&gt;104 repos hidden&lt;/strong&gt;. 24 of them never pushed anywhere else.&lt;/p&gt;

&lt;p&gt;It doesn't matter whether the original Student Pack claim was legit or not. What matters is that years of work can disappear behind a process you have no say in, on a timeline you cannot predict, triggered by a &lt;strong&gt;2026 classifier re-grading a form a 19-year-old filled out in 2019&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The threads under the post are full of people doing the math in public. &lt;em&gt;I have 60 repos. I've been on GitHub since 2014. I never thought about a mirror.&lt;/em&gt; The ones who are confident are the ones with &lt;strong&gt;self-hosted git&lt;/strong&gt; running somewhere. There aren't many of them.&lt;/p&gt;

&lt;p&gt;You don't get a warning before this happens to you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What GitHub Actually Promises You (It's Less Than You Think)
&lt;/h2&gt;

&lt;p&gt;GitHub does not promise to keep your repos. They promise to run a platform.&lt;/p&gt;

&lt;p&gt;Read the Terms of Service sober (it's a chore, do it once). The relevant clauses are not hidden. GitHub can suspend or terminate accounts at their discretion. Your content can be removed if it violates policies, including policies that didn't exist when you created the account. And your obligation as a user is to &lt;strong&gt;maintain your own copies&lt;/strong&gt; of anything you cannot afford to lose.&lt;/p&gt;

&lt;p&gt;There is even a name for this. The cloud industry has been using it for over a decade and every major provider has a page on it: the &lt;strong&gt;Shared Responsibility Model&lt;/strong&gt;. The provider runs the platform. The customer owns the data. GitHub doesn't put it on the homepage because nobody would sign up if it said "your data is your problem", but the contract is the same.&lt;/p&gt;

&lt;p&gt;The mistake almost everybody makes is one of category. We treat GitHub like a filesystem. It looks like one (folders, files, history). It feels like one (always there, always in sync). But it's a managed service with a TOS, and managed services have an exit door operated by the provider.&lt;/p&gt;

&lt;p&gt;I'm not arguing against the contract. I'm just reading it. Once you've read it, you can't unread it.&lt;/p&gt;

&lt;h2&gt;
  
  
  I Don't Wait for Incidents. That's a Design Principle.
&lt;/h2&gt;

&lt;p&gt;I have a rule for any third-party I depend on: if losing it would hurt, it gets a &lt;strong&gt;local copy&lt;/strong&gt;. Not because I expect the provider to fail. Because the cost of being wrong about that one is too high.&lt;/p&gt;

&lt;p&gt;This is standard infra hygiene, not paranoia. You don't argue with the DBA about whether the primary "might" go down. You set up the replica because that's how you build infra that survives a Tuesday.&lt;/p&gt;

&lt;p&gt;Same principle is why I rebuilt my entire AI agent setup the week Anthropic &lt;a href="https://rentierdigital.xyz/blog/anthropic-just-killed-my-200-month-openclaw-setup-so-i-rebuilt-it-for-15" rel="noopener noreferrer"&gt;killed my $200/month OpenClaw setup and forced me to rebuild it for $15&lt;/a&gt;. I didn't wait for the announcement to bite me twice. The moment a vendor changes the rules unilaterally, the right reaction is not to renegotiate. It's to own the next version.&lt;/p&gt;

&lt;p&gt;Security people call this &lt;em&gt;security by design&lt;/em&gt;. The decisions you make at architecture time are decisions you don't have to make under stress. You don't design a fire escape during the fire. You don't write a backup strategy at 9am while staring at a suspension email and a coffee that's gone cold.&lt;/p&gt;

&lt;p&gt;So I had a mirror. Before any of this. For one reason: &lt;strong&gt;my code is the only deliverable I cannot recreate&lt;/strong&gt;. Servers I can rebuild. Configs I can rewrite. Six years of commits in 39 private repos, that one I cannot.&lt;/p&gt;

&lt;p&gt;The mirror exists so I never have to write a Sunday evening blog post titled &lt;em&gt;How I Recovered From a GitHub Suspension&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Setup: Forgejo Mirror Behind a NetBird Mesh
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fconvexrentienr.neoracines.com%2Fapi%2Fstorage%2Fae63dad8-c477-4268-ac75-f6fbc59034d2" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fconvexrentienr.neoracines.com%2Fapi%2Fstorage%2Fae63dad8-c477-4268-ac75-f6fbc59034d2" alt="Git mirror architecture flow diagram" width="1948" height="724"&gt;&lt;/a&gt;&lt;br&gt;Git Mirror Architecture
  &lt;/p&gt;

&lt;p&gt;Two pieces. That's the whole setup.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://forgejo.org" rel="noopener noreferrer"&gt;Forgejo&lt;/a&gt; is a &lt;strong&gt;self-hosted git forge&lt;/strong&gt;. It forked from Gitea in 2022 when Gitea moved to a for-profit company structure (yes, this is the kind of detail that matters once you've started caring about who owns your tools). It runs in a single container with SQLite. No PostgreSQL cluster, no Redis, no microservices. It speaks the git protocol natively, no web-layer abstraction. If you cloned a repo from Forgejo, you wouldn't notice you weren't on GitHub. Same logic I made the case for in &lt;a href="https://rentierdigital.xyz/blog/why-clis-beat-mcp-for-ai-agents-and-how-to-build-your-own-cli-army" rel="noopener noreferrer"&gt;why CLIs beat MCP for AI agents&lt;/a&gt;: the primitive beats the wrapper, every time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NetBird&lt;/strong&gt; is a WireGuard-based mesh. My laptop, my VPS and a couple of other devices are on a private network with private IPs. No public exposure. No reverse proxy. No TLS certificate to renew. If you're not on the mesh, the port doesn't even respond.&lt;/p&gt;

&lt;p&gt;The Forgejo container looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;forgejo&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;codeberg.org/forgejo/forgejo:11&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;forgejo&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;100.69.51.147:3000:3000"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;forgejo_data:/data&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;FORGEJO__server__ROOT_URL=http://forgejo.mesh:3000&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;FORGEJO__server__DISABLE_SSH=true&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;FORGEJO__service__DISABLE_REGISTRATION=true&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;FORGEJO__mirror__DEFAULT_INTERVAL=8h&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four choices to call out:&lt;/p&gt;

&lt;p&gt;The port binds to the &lt;strong&gt;mesh IP only&lt;/strong&gt; (&lt;code&gt;100.69.51.147&lt;/code&gt;). Not &lt;code&gt;0.0.0.0&lt;/code&gt;. Not exposed to the public internet. The mirror is a private resource that lives behind the same fence as my other internal services.&lt;/p&gt;

&lt;p&gt;SSH is disabled. I never push to the mirror. It's read-only. Disabling SSH removes an entire attack surface I don't need.&lt;/p&gt;

&lt;p&gt;Registration is disabled. Single-user instance. No sign-up form for some bot to find on a Tuesday.&lt;/p&gt;

&lt;p&gt;The mirror sync interval is 8 hours. Forgejo has &lt;strong&gt;native pull mirror support&lt;/strong&gt;: you give it a GitHub URL and a PAT, and it pulls every 8 hours forever. No cron, no script, no webhook. The forge does it itself.&lt;/p&gt;

&lt;p&gt;Then a small script registers all my GitHub repos as mirrors via the Forgejo API. It's idempotent: it lists the repos, checks which ones already have a mirror, and creates only the missing ones. Run it once at install. Run it again every time you create a new GitHub repo. Or schedule it weekly, your call. The single API call per repo looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FORGEJO_URL&lt;/span&gt;&lt;span class="s2"&gt;/api/v1/repos/migrate"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: token &lt;/span&gt;&lt;span class="nv"&gt;$FORGEJO_TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "clone_addr": "https://github.com/myorg/repo.git",
    "repo_name": "repo",
    "mirror": true,
    "private": true,
    "auth_token": "'&lt;/span&gt;&lt;span class="nv"&gt;$GITHUB_PAT&lt;/span&gt;&lt;span class="s1"&gt;'",
    "service": "github"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wrap that in a loop over &lt;code&gt;gh repo list myorg&lt;/code&gt;, with a check on whether the mirror already exists, and you're done. The PAT and Forgejo token come from a self-hosted secrets manager at runtime, never on disk.&lt;/p&gt;
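
&lt;p&gt;One possible shape of that loop, sketched under assumptions: the org, account name, and URLs are placeholders, and &lt;code&gt;FORGEJO_TOKEN&lt;/code&gt; / &lt;code&gt;GITHUB_PAT&lt;/code&gt; are expected in the environment at runtime:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Idempotent mirror registration sketch. The migrate payload is the one shown
# above; names and URLs are hypothetical.
register_mirrors() {
  local forgejo_url="$1" forgejo_user="$2" gh_org="$3" repo
  gh repo list "$gh_org" --limit 200 --json name -q '.[].name' |
  while read -r repo; do
    # Idempotency check: skip repos that already have a mirror on Forgejo.
    if curl -sf -H "Authorization: token $FORGEJO_TOKEN" \
         "$forgejo_url/api/v1/repos/$forgejo_user/$repo" > /dev/null; then
      echo "skip $repo (mirror exists)"
      continue
    fi
    curl -sf -X POST "$forgejo_url/api/v1/repos/migrate" \
      -H "Authorization: token $FORGEJO_TOKEN" \
      -H "Content-Type: application/json" \
      -d "{\"clone_addr\":\"https://github.com/$gh_org/$repo.git\",
           \"repo_name\":\"$repo\",\"mirror\":true,\"private\":true,
           \"auth_token\":\"$GITHUB_PAT\",\"service\":\"github\"}" > /dev/null
    echo "registered $repo"
  done
}

# register_mirrors "http://forgejo.mesh:3000" phil myorg
```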

&lt;p&gt;Total resource footprint: about &lt;strong&gt;100MB of RAM, 2GB of disk&lt;/strong&gt; for 39 repos. The container restarts in two seconds. The 8h sync runs in the background and I forget it exists for weeks at a time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Covers, and What It Doesn't
&lt;/h2&gt;

&lt;p&gt;This is the part most "self-host your git" articles skip. A pull mirror is not a full GitHub backup.&lt;/p&gt;

&lt;p&gt;What the mirror saves is the &lt;strong&gt;git side of things&lt;/strong&gt;: every commit, every branch, every tag, full history across all branches. Submodules and LFS work too if you take the extra step to configure them, and you should if you use them.&lt;/p&gt;

&lt;p&gt;What the mirror does NOT save is everything that lives outside the git protocol. &lt;strong&gt;Issues&lt;/strong&gt;. Pull requests and review comments. Wiki pages. Actions run logs (the YAML files yes, the run history no). Repo settings, webhooks, deploy keys, collaborator access lists. All of that is GitHub-specific metadata, stored in their database, not in your &lt;code&gt;.git&lt;/code&gt; directory. If GitHub vanishes tomorrow, my 39 repos are intact, up to 8 hours stale. My issues and PRs are not.&lt;/p&gt;

&lt;p&gt;For my use case (private repos, mostly solo work, infrastructure code) that's an acceptable trade. For a larger team running half their workflow inside GitHub Issues, the conversation is different. You'd want the official GitHub repo backup tool, or a third-party that hits the API for issues and PRs as well, on top of the git mirror.&lt;/p&gt;
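
&lt;p&gt;If the metadata does matter to you, a hedged sketch of the API side using the official &lt;code&gt;gh api&lt;/code&gt; command (the &lt;code&gt;backup/&lt;/code&gt; layout is my choice, and this covers issues and pull requests only, not wiki pages or repo settings):&lt;/p&gt;

```shell
# Dump issues and PRs as JSON next to the git mirror. Paths are illustrative.
backup_meta() {
  local org="$1" repo="$2" out="backup/$1/$2"
  mkdir -p "$out"
  gh api --paginate "repos/$org/$repo/issues?state=all" > "$out/issues.json"
  gh api --paginate "repos/$org/$repo/pulls?state=all"  > "$out/pulls.json"
}

# backup_meta myorg some-repo
```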

&lt;p&gt;There's also the case of &lt;em&gt;me deleting a repo on GitHub by accident&lt;/em&gt;. The pull mirror notices the upstream is gone, but Forgejo doesn't auto-delete the local copy. The last synced state stays. That's actually a feature: a destructive action upstream doesn't propagate. (I'm not going to claim I designed this on purpose. I noticed it the first time I cleaned up an old org and saw the mirror still sitting there a month later. Free safety net, kept it.)&lt;/p&gt;

&lt;p&gt;Know what your mirror does and what it doesn't. Don't sell yourself a backup story that doesn't match the contract you actually have with your tooling.&lt;/p&gt;

&lt;h2&gt;
  
  
  You Don't Have to Wait for Your Own Incident
&lt;/h2&gt;

&lt;p&gt;The 104 repos story is not exceptional. It's just visible. The same thing happens every week to people whose audience is too small for the post to travel. Account suspensions, mistaken DMCA takedowns, billing disputes, classifier false positives, payment method expired in a country GitHub's billing system handles weirdly. The list is long. The fix is the same in every case.&lt;/p&gt;

&lt;p&gt;In six months, GitHub will publish a blog post on "improving how we communicate account actions". There will be a new dashboard, a refreshed FAQ, a prettier status page. Nobody will read it before the next wave.&lt;/p&gt;

&lt;p&gt;Meanwhile the devs who ship will keep shipping. With a mirror. On their own infra. Reachable when GitHub is down, when a classifier mis-fires, when a 2019 Student Pack suddenly becomes a 2026 problem. Not much. Just a docker-compose and three hours of config one Sunday.&lt;/p&gt;

&lt;p&gt;Git is distributed by design. We're the ones who decided to forget.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Forgejo documentation: &lt;a href="https://forgejo.org/docs/latest/" rel="noopener noreferrer"&gt;https://forgejo.org/docs/latest/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;NetBird (WireGuard mesh): &lt;a href="https://netbird.io/" rel="noopener noreferrer"&gt;https://netbird.io/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;AWS Shared Responsibility Model (the original framing of who owns what): &lt;a href="https://aws.amazon.com/compliance/shared-responsibility-model/" rel="noopener noreferrer"&gt;https://aws.amazon.com/compliance/shared-responsibility-model/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(*) The cover is AI-generated. No actual GitHub repos were harmed in the making of this image.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>softwareengineering</category>
      <category>selfhosting</category>
      <category>devops</category>
    </item>
    <item>
      <title>Claude AI Doxxed Me in 14 Seconds: Complete Privacy Cleanup 2026</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Tue, 14 Apr 2026 13:41:10 +0000</pubDate>
      <link>https://forem.com/rentierdigital/claude-ai-doxxed-me-in-14-seconds-complete-privacy-cleanup-2026-1bil</link>
      <guid>https://forem.com/rentierdigital/claude-ai-doxxed-me-in-14-seconds-complete-privacy-cleanup-2026-1bil</guid>
      <description>&lt;p&gt;Last Tuesday I asked Claude to find me. Not Phil the Medium writer. The real me. Current address, two previous addresses, employer, my wife's maiden name, the school district for the kids. Fourteen seconds.&lt;/p&gt;

&lt;p&gt;Two years ago this kind of search was a weekend of amateur detective work or a paid OSINT subscription. Today it's a tab I forgot to close. And it takes no particular skill, just an agent with web access and the right phrasing (which Claude will write for you if you ask politely).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR&lt;/strong&gt;: what Claude found on me comes from very specific kinds of sites, and there are five families of them. Only one really cleans up, and by chance it's the one used to dox you. In this article: the surface clean that calms things down, and the deep pressure-wash if you want to actually sleep at night (the four other families included).&lt;/p&gt;

&lt;p&gt;In January 2025, some guys rang David Balland's doorbell. Cofounder of Ledger, a quiet village in central France. They took him with his wife, held them forty-eight hours, sawed off one of his fingers, sent the video to his cofounder demanding a ransom in crypto. The gendarmes got them out alive. Ledger makes hardware wallets, which is to say literal physical safes for crypto. Balland's digital security was airtight. The kidnappers didn't need his private keys. They needed his address. And that part was on sale somewhere.&lt;/p&gt;

&lt;p&gt;Since then, &lt;a href="https://cryptoslate.com/binance-employee-hunted-down-in-botched-france-home-invasion-as-crypto-wrench-attack-spike-spreads/" rel="noopener noreferrer"&gt;CertiK has documented&lt;/a&gt; seventy-two physical attacks of this kind in 2025, up seventy-five percent year over year. Kidnappings jumped sixty-six percent. France alone logged nineteen, more than the United States. They're called &lt;em&gt;wrench attacks&lt;/em&gt;, after an old xkcd comic: doesn't matter how strong your encryption is, a five dollar wrench solves the problem. The common factor across the seventy-two cases isn't the technical security level of the victims. It's their visibility. Someone knew where to ring.&lt;/p&gt;

&lt;p&gt;Which is exactly what Claude just did for me, in fourteen seconds, for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  14 Seconds
&lt;/h2&gt;

&lt;p&gt;The prompt was eleven words long. &lt;em&gt;Find everything publicly available about [my name] in [my city]&lt;/em&gt;. Web search on. Go.&lt;/p&gt;

&lt;p&gt;Claude came back with a list. Current address. Two previous addresses, including the apartment I rented in 2018 and never put on social media. Employer. Wife's maiden name. The elementary school district for the kids, which I have never typed into any device that wasn't behind two-factor and a VPN. A phone number from a contract I cancelled three years ago and apparently nobody told the brokers. The estimated value of the house. My approximate age, off by one year because somebody at one of the data brokers can't subtract.&lt;/p&gt;

&lt;p&gt;Fourteen seconds. I checked the timer twice.&lt;/p&gt;

&lt;p&gt;Two years ago this would have been a weekend project. You'd subscribe to one of those OSINT services with a name like ThreatPivot or BreachFalcon, drop ninety bucks, learn the query syntax, run a few iterations, get bored, hire a private investigator for three hundred dollars and wait a week. The friction &lt;em&gt;was&lt;/em&gt; the security. Not the encryption, not the privacy laws, not the broker opt-outs. The friction.&lt;/p&gt;

&lt;p&gt;Friction is what AI agents are built to demolish. That's the entire pitch. Hand them a fuzzy goal, watch them figure out which sites to scrape, which forms to fill, in what order, with what backoff. Doxxing me is a textbook agentic task. No irony, no bug, just the demo doing exactly what it advertises.&lt;/p&gt;

&lt;p&gt;Two years ago this was a detective's weekend. Now it's a tab I forgot to close.&lt;/p&gt;

&lt;h2&gt;
  
  
  "Data Broker" Is Not One Thing. It's Five.
&lt;/h2&gt;

&lt;p&gt;Something the privacy industry doesn't want you to notice: the term &lt;em&gt;data broker&lt;/em&gt; is doing a lot of dishonest work. Companies that sell removal services use the vagueness to oversell what they cover. Critics use the same vagueness to dismiss the whole category as snake oil. Both sides are wrong, in opposite directions, for the same reason.&lt;/p&gt;

&lt;p&gt;There are five distinct categories of data brokers, and they have roughly nothing in common except the label. Different sources, different legal status, different threat models, different ways out. Treating them as one blob leads to the wrong tool every single time.&lt;/p&gt;

&lt;p&gt;Five categories, one line each:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. People Search Services.&lt;/strong&gt; Spokeo, BeenVerified, WhitePages, that whole crew. The modern phonebook plus your relatives. Indexed by Google, queryable by anyone with a credit card or an AI agent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Marketing and Inferred Data Brokers.&lt;/strong&gt; Acxiom and the entire ad-tech graph behind every banner you've ever seen. They don't actually have your name. They have a profile attached to an advertising ID, which is more or less an anonymous hash that follows you around for a few years.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Credit Reporting Bureaus.&lt;/strong&gt; Equifax, Experian, TransUnion. The famous three. Legally protected in the US, meaning you cannot opt out. You can freeze, you can dispute, you cannot delete. They got hacked and they still legally have your file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Risk Mitigation Brokers.&lt;/strong&gt; LexisNexis, ChoicePoint, the ones that sell background checks to landlords and HR departments. Adjacent to credit bureaus in legal protection, adjacent to people search in actual content.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Personal Health Data Brokers.&lt;/strong&gt; Non-HIPAA wellness trackers, fitness apps, the smart toothbrush, the meditation app that knows you searched 'anxiety' at 3am.&lt;/p&gt;

&lt;p&gt;This decomposition isn't mine. It comes from a &lt;a href="https://www.youtube.com/watch?v=iX3JT6q3AxA" rel="noopener noreferrer"&gt;video by Reject Convenience&lt;/a&gt; from May 2025, two million views, the best ten minutes you'll spend on this topic this year. He uses the framework to argue that removal services are misleading. He's half right. The other half is the rest of this article.&lt;/p&gt;

&lt;h2&gt;
  
  
  I Asked Claude to Escape All Five. Four Laughed.
&lt;/h2&gt;

&lt;p&gt;So I went category by category and asked Claude to help me opt out. Same agent, same web search, one prompt per category. Here's what came back.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Category 1, People Search.&lt;/strong&gt; Claude wrote me a working filter and an email-drafting workflow in about three minutes. I'll get to that one in the next section. For now: yes, this is the only category where the agent looked at me like &lt;em&gt;oh, this is a real task, let's go&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Category 2, Marketing and Inferred.&lt;/strong&gt; Claude refused to draft an opt-out email. Not because of safety guardrails. Because there is nobody to send it to. The data isn't filed under "Phil". It's filed under an advertising ID I can rotate myself in my phone settings. Claude pointed me at the Android setting, the iOS setting, and a one-paragraph explanation of why clearing cookies and switching to a privacy-respecting browser is the actual lever. Polite, factual, and quietly devastating: there is no opt-out from a database that doesn't know your name.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Category 3, Credit Bureaus.&lt;/strong&gt; Claude pulled the relevant Fair Credit Reporting Act language and concluded, in slightly more diplomatic words, that I was wasting my own time. You cannot opt out of a US credit bureau. The law mandates that credit data exists and that the bureaus hold it. You can freeze new credit, you can dispute errors, you cannot delete. Equifax got breached in 2017, leaked the personal data of half the country, and is still legally required to keep a file on me. I read this twice. Then a third time. Claude kept being right.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Category 4, Risk Mitigation.&lt;/strong&gt; Claude found opt-out endpoints for the big ones. Most were designed for businesses disputing background checks they had paid for, not consumers asking to be erased. I tried one of the consumer-facing forms. It returned a PDF I was supposed to print, sign, and fax. Fax. In 2026. I don't own a fax machine. I don't know anyone who does. Pretty sure my grandmother sold her last one in 2003.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Category 5, Personal Health.&lt;/strong&gt; Claude pulled the privacy policies of a few wellness apps and trackers I'd accumulated over the years. None of them legally required deletion. Some offered it "at the company's discretion". A few had a deletion form that explicitly excluded data already shared with "analytics partners". One didn't even pretend.&lt;/p&gt;

&lt;p&gt;Four out of five, the agent shrugged. To be fair, Claude wasn't broken on those four. The law is absent. That's a different kind of problem with a different shape, and no prompt is going to legislate it away.&lt;/p&gt;

&lt;h2&gt;
  
  
  The One Category Where Claude Is a Weapon
&lt;/h2&gt;

&lt;p&gt;The prompt I ended up with, after about six iterations because the early ones kept doing dumb things:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Context: I want to remove my personal information from People 
Search Services. Below is a starting list of brokers known to 
publish public-records aggregations.

Brokers: Spokeo, BeenVerified, WhitePages, Intelius, PeopleFinder, 
TruePeopleSearch, FastPeopleSearch, Radaris, MyLife, USSearch, 
PublicRecordsNow, InstantCheckmate, BackgroundAlert, ZabaSearch, 
Pipl.

For each broker, do the following in order:

1. Use web search to verify whether a profile matching the 
   following identifiers exists in their public results:
   Name: [FULL NAME]
   City: [CITY], [STATE]
   Approximate age: [AGE]

2. If no matching profile is found, mark as SKIP and move on. 
   Do not generate an opt-out request for brokers that don't 
   have data on me.

3. If a profile is found, identify the broker's specific 
   opt-out flow (email, web form, ID verification, postal 
   mail, fax) and report it.

4. For brokers accepting email opt-outs, draft the email in 
   a separate code block, addressed to their listed privacy 
   contact, requesting removal under the relevant state law 
   (CCPA for California, equivalent for other states).

5. Output a summary listing: broker, status (HAS DATA / SKIP), 
   opt-out method, action required from me.

Do not send anything. I will review every email before sending.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The skip step matters. Without it Claude will cheerfully draft fifteen identical emails to brokers that have nothing on you, which is both noisy and slightly suspicious from the broker's side. The "do not send anything" line matters more. You're about to send real emails in your real legal name to companies that may demand a copy of your driver's license to process the request. Read every draft. Twice. I framed the whole thing with the same scope-locking discipline I learned the hard way when &lt;a href="https://rentierdigital.xyz/blog/i-stopped-vibe-coding-and-started-prompt-contracts-claude-code-went-from-gambling-to-shipping" rel="noopener noreferrer"&gt;letting Claude touch real systems without a proper prompt contract&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Running this took about four hours the first time. Most of that wasn't Claude. It was me reading what each broker actually wanted and deciding which requests to send, which forms to fill by hand, and which brokers I wasn't going to give a copy of my passport to no matter how nicely they asked. (Three of them asked. I declined all three. They have my name and address already, they don't get my passport.)&lt;/p&gt;

&lt;p&gt;Within ten days, most of the people-search exposure I'd seen in the original Claude doxxing test was gone. Some came back. About six weeks in, I checked, and a few brokers had repopulated from upstream sources. Which brings us to the next section.&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude vs RemoveMe: The Honest Comparison
&lt;/h2&gt;

&lt;p&gt;The repopulation problem is the entire reason removal services exist. You opt out, the broker re-scrapes from a different upstream source six weeks later, your data is back, you're back to square one. Doing this manually with Claude every six weeks works, but it's the kind of recurring task I personally guarantee I will forget about within two cycles.&lt;/p&gt;

&lt;p&gt;Which is where services like &lt;a href="https://rentierdigital.xyz/go/removeme" rel="noopener noreferrer"&gt;RemoveMe&lt;/a&gt;, DeleteMe, and Incogni come in. Same scope, different model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Claude does well&lt;/strong&gt;: it's free, it's flexible, you control every email, you learn the landscape, you can rerun whenever. The prompt above is now in my notes and will probably stay there for years. You can also read the actual drafts, which is genuinely reassuring when you're sending legal-ish requests in your own name.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Claude does badly&lt;/strong&gt;: it's a one-shot. There is no monitoring loop. The brokers don't email you when they re-add your data. You have to remember to rerun the whole thing, and you won't, because nobody does. Also, every email goes out under your name and your responsibility. Any mistake in the draft, any bad address, any phrasing a broker decides to interpret weirdly, that's on you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What &lt;a href="https://rentierdigital.xyz/go/removeme" rel="noopener noreferrer"&gt;RemoveMe&lt;/a&gt; does well&lt;/strong&gt;: continuous monitoring, automatic resubmission, broader coverage than the list I'd build by hand, and somebody whose actual job is to chase brokers when they ignore the first request. Around thirty bucks for three months at the time of writing. (Disclosure: that's an affiliate link. I get a small cut if you sign up. The cut doesn't change my opinion, but you should know.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What RemoveMe does badly&lt;/strong&gt;: same scope as Claude. Category 1 only. They don't, and can't, do anything about the other four. Which is fine, as long as you know it going in. The other thing worth sitting with for a second: you're handing your personal information to a company in order to remove your personal information from other companies. The trust transfer is real. Read the privacy policy. Decide.&lt;/p&gt;

&lt;p&gt;Who picks what: if you have four hours this weekend and you like the project, run the Claude prompt, set a calendar reminder for six weeks out, save yourself a hundred and twenty dollars a year. If your reaction to "set a calendar reminder for six weeks out" is the same as mine (the reminder will fire, you will snooze it, this will go on for a year), pay the thirty bucks and stop thinking about category 1 forever.&lt;/p&gt;

&lt;p&gt;Both options solve the same problem. The difference is whether you want it solved once or solved continuously.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You Actually Do for the Other Four
&lt;/h2&gt;

&lt;p&gt;The part nobody wants to write because it doesn't fit in a subscription. The four other categories don't have a service. They have habits. None of them are hard. All of them are free. Most of them take one weekend and then never again.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Marketing and inferred (category 2).&lt;/strong&gt; Reset your advertising ID. On Android: Settings → Privacy → Ads → Reset advertising ID, then turn on "Delete advertising ID" if you have it. On iOS: Settings → Privacy &amp;amp; Security → Tracking → off, plus Apple Advertising → off. Switch to Brave, or Firefox with strict mode. Disable third-party cookies everywhere. The inferred profile won't be deleted, it'll be degraded, and degraded is the actual ceiling here. Stop chasing perfect.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Credit bureaus (category 3).&lt;/strong&gt; Free credit freeze on all three: Equifax, Experian, TransUnion. Each one takes about ten minutes online. Doesn't delete your data, blocks new credit lines from being opened in your name, which is the threat model that actually matters. Pull your free annual report at annualcreditreport.com (the only legit site, not the one with the catchy jingle, that one's a paid service in disguise). Dispute every error you find. Be petty about it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Risk mitigation (category 4).&lt;/strong&gt; Once a year, request your own background check report from LexisNexis and the bigger consumer reporting agencies. They legally have to give it to you. Read it. Dispute the wrong stuff. If you're not actively job-hunting or apartment-shopping, freeze the report so nobody can pull it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Personal health (category 5).&lt;/strong&gt; Stop assuming HIPAA covers consumer wellness apps. It doesn't. HIPAA covers your doctor and your insurance company. The fitness tracker, the meditation app, the smart scale, the period tracker, the smart toothbrush (sorry to keep coming back to the toothbrush, it's just such a perfect villain), all of those are unregulated. Audit privacy policies before you buy. After you buy, it's mostly too late.&lt;/p&gt;

&lt;p&gt;I ran the whole category 1 workflow as a CLI command from my terminal because for a one-shot administrative task with no recurring state, &lt;a href="https://rentierdigital.xyz/blog/why-clis-beat-mcp-for-ai-agents-and-how-to-build-your-own-cli-army" rel="noopener noreferrer"&gt;wiring up an MCP server is overkill for the job&lt;/a&gt;. None of this fits in a subscription. Most of it is one weekend and never thinking about it again.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Asymmetry Nobody's Pricing In
&lt;/h2&gt;

&lt;p&gt;The attack-defense ratio for personal data has never been worse, and almost nobody is pricing it in.&lt;/p&gt;

&lt;p&gt;Fourteen seconds for an agent to find me. Months of opt-outs to remove a fraction of what it found. One category out of five removable at all. The other four protected by friction the law never meant to provide, and that AI agents are designed to dissolve.&lt;/p&gt;

&lt;p&gt;As more people clean their category 1, the doxxers won't stop. They'll descend a level. Marketing brokers have inferred profiles you can't fully erase. Risk mitigation brokers have your background. Credit bureaus have your financial life. None of it is searchable today by a casual attacker with a Google query. All of it is correlatable by an agent that can read a breach dump, cross-reference a LinkedIn, scrape a few public records, and reconstruct you in an afternoon. Security researchers have been pointing this out since the Ledger breach last year: LLMs make breach dumps, broker files, and people-search profiles trivially correlatable, for anyone willing to ask.&lt;/p&gt;

&lt;p&gt;I'm not predicting this. I'm describing this month.&lt;/p&gt;




&lt;p&gt;Fourteen seconds for an agent to find me. Months and a small budget to remove me from the one category that lets itself be removed.&lt;/p&gt;

&lt;p&gt;If everyone cleans category 1, the doxxers descend a level. Four other categories no service in the world can help with, and an agent that will do the correlation for them while you sleep.&lt;/p&gt;

&lt;p&gt;The problem isn't getting solved. It's moving.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;What DeleteMe and Incogni aren't telling you&lt;/em&gt; — &lt;a href="https://www.youtube.com/watch?v=iX3JT6q3AxA" rel="noopener noreferrer"&gt;Reject Convenience&lt;/a&gt;, May 2025. The five-category framework comes from this video.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;CertiK 2025 Wrench Attacks Report&lt;/em&gt;, &lt;a href="https://cryptoslate.com/binance-employee-hunted-down-in-botched-france-home-invasion-as-crypto-wrench-attack-spike-spreads/" rel="noopener noreferrer"&gt;summarized here&lt;/a&gt;. Seventy-two physical attacks on crypto holders in 2025, up seventy-five percent year over year.&lt;/li&gt;
&lt;li&gt;annualcreditreport.com — the actual free annual credit report site mandated by federal law in the US.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This article contains affiliate links. I may earn a small commission if you purchase through them.&lt;/p&gt;

&lt;p&gt;(*) The cover is AI-generated. No data brokers were harmed in its creation.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>cybersecurity</category>
      <category>claude</category>
      <category>aiagents</category>
    </item>
    <item>
      <title>My Pro Max Plan Lasted 15 Minutes. Then I Ran /context.</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Sat, 11 Apr 2026 13:41:10 +0000</pubDate>
      <link>https://forem.com/rentierdigital/my-pro-max-plan-lasted-15-minutes-then-i-ran-context-4k8p</link>
      <guid>https://forem.com/rentierdigital/my-pro-max-plan-lasted-15-minutes-then-i-ran-context-4k8p</guid>
      <description>&lt;p&gt;A guy* instrumented 858 Claude Code sessions over 33 days. $1,619 of invoice. 264 million tokens wasted on a single misconfigured setting. 54% of his turns happened after a 5+ minute idle gap, so cache expired, so cost x10 for nothing. One file read 33 times in the same session. 19 skills out of 42 almost never called but loaded at startup. 90% of the waste came from settings HE controlled. Not from Anthropic billing. Not from the model. From his config.&lt;/p&gt;

&lt;p&gt;I ran &lt;code&gt;/context&lt;/code&gt; right after reading that. 24,800 tokens loaded before I typed a single character. 5,500 just for my global &lt;code&gt;CLAUDE.md&lt;/code&gt;, the file I have been dragging around for months without ever seriously rereading it. Multiply that by my sessions per day, by 20 working days, by a year. The number gets uncomfortable. And you? Do you know how much you load before your first prompt? Probably not. Nobody knows before they look.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR:&lt;/strong&gt; The cost of a Claude Code session is not the volume of &lt;strong&gt;unique tokens&lt;/strong&gt;. It is that volume multiplied by the number of times it gets &lt;strong&gt;reloaded&lt;/strong&gt;. Cache that expires, sub-agents that reload the parent context in full, plus a dozen other reloads you never see. 15 hacks to attack the &lt;strong&gt;multiplier&lt;/strong&gt;, each with its pro AND its con (because half the "tips" floating around cost you more in time than they save in tokens). Run &lt;code&gt;/context&lt;/code&gt; before you finish this article. You will see.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Ftwo-column-diagram-left-column-quot-what-you-think-you-pay-dbe326d3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Ftwo-column-diagram-left-column-quot-what-you-think-you-pay-dbe326d3.png" alt='two-column diagram. Left column "What you think you pay for" with one medium bar labeled "unique tokens". Right column "What you actually pay for" with the same bar stacked vertically multiple times (reload 1, reload 2, reload 3...). Title above: "Token cost = unique tokens × reload count". Monochrome with red accent on the right column.' width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;Token cost = unique tokens × reload count
  &lt;/p&gt;

&lt;h2&gt;
  
  
  Your Cache Expires in 5 Minutes. Your Coffee Break Doesn't.
&lt;/h2&gt;

&lt;p&gt;Anthropic's prompt cache lives for 5 minutes. After that, the next turn pays full price to reload everything you already paid for once. The Reddit audit found that 54% of turns happened after a 5+ minute gap. More than half the conversation was effectively cold-cache.&lt;/p&gt;
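&lt;p&gt;Rough arithmetic on where the "10x" comes from, as a sketch: Anthropic has billed cache reads at roughly a tenth of the base input rate (verify against current pricing before trusting the multiplier). The context size and base price below are illustrative numbers, not my real invoice.&lt;/p&gt;

```python
# Cold-cache vs warm-cache cost for one turn's context reload.
# Assumes cache reads bill at ~10% of the base input rate (check current pricing).
def turn_input_cost(context_tokens, base_price_per_mtok, cache_hit):
    rate = base_price_per_mtok * (0.1 if cache_hit else 1.0)
    return context_tokens / 1_000_000 * rate

warm = turn_input_cost(150_000, 3.0, cache_hit=True)   # 150k-token context, $3/MTok base
cold = turn_input_cost(150_000, 3.0, cache_hit=False)  # same context, cache expired
print(round(cold / warm))  # prints 10
```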

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; running &lt;code&gt;/compact&lt;/code&gt; or &lt;code&gt;/clear&lt;/code&gt; BEFORE you walk away from your machine is the single highest-leverage habit on this list. You announce the break, you collapse the context, you come back to a clean slate that costs almost nothing to warm up. Lunch, a meeting, the kind of break where you go check on the pool and end up fixing a skimmer for 20 minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; on a tight task where you are iterating fast, breaking the cache to save 5% of context costs you more in cognitive reload than in tokens. You lose your train of thought, Claude loses the nuance of the last three turns, and you pay in wall-clock time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; religious about it before announced breaks. Not for going to the bathroom.&lt;/p&gt;

&lt;h2&gt;
  
  
  Your Global CLAUDE.md Gets Reloaded Forever
&lt;/h2&gt;

&lt;p&gt;5,500 tokens. That is what my global &lt;code&gt;CLAUDE.md&lt;/code&gt; weighs. Reloaded on every session. Every project. For as long as I keep that file the way it is. Do the math on a year of sessions and you stop sleeping.&lt;/p&gt;

&lt;p&gt;Confession: my own &lt;code&gt;CLAUDE.md&lt;/code&gt; violates the 200-line rule I am about to preach. I know. I am part of the problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; turn the file into an index that points to specialized arch docs in each project, instead of a global brain dump. Keep only what applies to 100% of your projects (your name, your shell, your 3-4 hard rules). Everything else lives in a &lt;code&gt;docs/&lt;/code&gt; folder that gets read on demand.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; a too-skinny &lt;code&gt;CLAUDE.md&lt;/code&gt; means re-explaining your conventions every session. Iterations cost more than the startup tokens you saved. Especially painful on projects you touch once a month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; index, not junk drawer. Aim for 80 lines, not 800.&lt;/p&gt;
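&lt;p&gt;What "index, not junk drawer" looks like in practice, a minimal sketch; the rules and the file names under &lt;code&gt;docs/&lt;/code&gt; are invented for illustration, keep only what applies to all your projects:&lt;/p&gt;

```markdown
# CLAUDE.md (global): index, not junk drawer

- Shell: zsh. Ask before deleting anything.
- Hard rules: never commit secrets, never force-push main.
- Per-project conventions: read the repo's docs/architecture.md on demand.
- Writing tasks only: read docs/style.md first.
```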

&lt;h2&gt;
  
  
  One Line in Your Settings Cut Context in Half
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;ENABLE_TOOL_SEARCH=true&lt;/code&gt;. Copy. Paste. Test. Verify with &lt;code&gt;/context&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That single setting is the one that saved 264M tokens across the Reddit audit. With it on, Claude does not load every tool schema at startup. It searches for the right tool when it needs one. On a setup with 15+ MCP tools the savings are brutal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; massive context reduction at startup if you have a heavy MCP setup. Instant. Free. One line.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; on a setup with fewer than 10 tools, Tool Search adds a round-trip every time Claude needs a tool, which can actually slow you down. There is also a known macOS quirk where the auto-flag does not always stick. Force it manually and verify with &lt;code&gt;/context&lt;/code&gt; that the drop happened. Test it in a throwaway session first so you do not nuke a cache mid-task.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; turn it on if you have 15+ tools. Skip it under 10. Test before committing.&lt;/p&gt;
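&lt;p&gt;For reference, the shape the setting takes in my install: an &lt;code&gt;env&lt;/code&gt; map inside &lt;code&gt;settings.json&lt;/code&gt;. The key placement is an assumption from my own setup, so double-check it against the current Claude Code docs rather than trusting this fragment:&lt;/p&gt;

```json
{
  "env": {
    "ENABLE_TOOL_SEARCH": "true"
  }
}
```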

&lt;h2&gt;
  
  
  You're Loading 42 Skills. You Use Six.
&lt;/h2&gt;

&lt;p&gt;The Reddit audit found 19 skills out of 42 that were almost never called. Loaded at startup anyway. Each skill is a schema that costs context whether you invoke it or not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; quick audit, disable the dormant ones, instant savings on every single session forever. The kind of cleanup that pays back the next morning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; auditing takes 30 minutes and you will end up disabling something you use twice a year, on the exact day you need it. Murphy lives in your skills folder.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; disable anything you have not touched in a month. Not stricter than that. The 2x-a-year skills are not worth the cleanup pain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Plan Mode Pays Back in Avoided Rerolls
&lt;/h2&gt;

&lt;p&gt;Everyone sells Plan Mode as a quality-first feature. Look thoughtful, plan carefully, get cleaner code. Sure.&lt;/p&gt;

&lt;p&gt;The savings live somewhere else entirely. They live in the iterations you do not have to run. Every time Claude codes the wrong thing and you have to say "no, do it differently", you reload the entire context to re-explain. Plan Mode collapses three rounds of "almost but not quite" into one round of "we agreed before you started". I went deeper on the same idea in &lt;a href="https://rentierdigital.xyz/blog/i-stopped-vibe-coding-and-started-prompt-contracts-claude-code-went-from-gambling-to-shipping" rel="noopener noreferrer"&gt;the prompt contracts approach I built after enough of these disasters&lt;/a&gt;, if you want the full framework.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; mandatory on anything that touches 2+ files. The savings compound on every avoided reroll.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; on a trivial task (rename a variable, fix a typo, add a console.log), Plan Mode just adds latency. There was no iteration to avoid in the first place.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; Plan Mode for multi-file tasks. Skip for surgical fixes.&lt;/p&gt;

&lt;h2&gt;
  
  
  /clear Is the Cheapest Habit You're Not Doing
&lt;/h2&gt;

&lt;p&gt;Change of task. &lt;code&gt;/clear&lt;/code&gt;. That is it. That is the hack.&lt;/p&gt;

&lt;p&gt;The reason nobody does it consistently is the same reason nobody flosses. It is too small to feel like a win and too easy to skip "just this once". And then last Tuesday I asked Claude to draft an email and it answered with TypeScript. Because I had been refactoring auth for two hours and never cleared. The previous task was still living rent-free in the context, paying full price on every turn, and now confusing the new one. Twenty minutes later, still untangling.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/clear&lt;/code&gt; on every task switch. Two seconds. Free. Done.&lt;/p&gt;

&lt;p&gt;The only real risk is hitting it by reflex in the middle of a task and losing context you actually needed. That happens once. You learn fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sub-Agents Cost 7x. Nobody Tells You That.
&lt;/h2&gt;

&lt;p&gt;Every sub-agent reloads the parent context in full. Anthropic's own docs confirm it, the Reddit audit measured it, and the multi-agent pattern that everyone has been hyping for six months is, mechanically, a token trap dressed as a feature.&lt;/p&gt;

&lt;p&gt;You spawn 3 sub-agents to "parallelize" a task. Each one inherits the full context. You just paid for that context 4 times instead of 1. Welcome to the club.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; killing the reflex of "I'll send this to a sub-agent" for every small task. The reflex feels productive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; for genuinely parallelizable work (review 5 independent files, run 5 isolated checks), sub-agents are still faster in wall-clock time even if they cost more in tokens. Time vs money tradeoff. Sometimes time wins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; sub-agents are a parallelism tool, not a savings tool. Use them when the clock matters more than the bill.&lt;/p&gt;

&lt;p&gt;Productivity feels great. The invoice arrives anyway. 💸&lt;/p&gt;

&lt;h2&gt;
  
  
  Disconnect the MCP Servers You Never Open
&lt;/h2&gt;

&lt;p&gt;Even with Tool Search on, every MCP server has a startup cost. Mine, right now: Gmail, Calendar, Chrome, Context7, YouTube transcript, my own rentierdigital MCP. Honest answer: probably half of those I have not touched this week.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; immediate startup context reduction. The kind of cleanup you can do in 90 seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; disconnecting and reconnecting on every task switch is friction you will abandon in 3 days. I tried. I quit on day 4.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; create 2-3 settings profiles (dev / writing / research) and switch by profile, not by individual MCP. The friction drops to a single command. I went deeper on &lt;a href="https://rentierdigital.xyz/blog/why-clis-beat-mcp-for-ai-agents-and-how-to-build-your-own-cli-army" rel="noopener noreferrer"&gt;why CLIs end up cheaper than MCP servers for most agent work&lt;/a&gt; if you want the longer take.&lt;/p&gt;
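&lt;p&gt;A minimal sketch of the profile switch, assuming Claude Code's &lt;code&gt;--mcp-config&lt;/code&gt; flag and file paths I made up for illustration; the point is that picking a profile becomes one argument instead of six reconnects:&lt;/p&gt;

```shell
# One MCP config file per working mode, selected by name.
# The profile paths are invented; adapt them to your setup.
profile() {
  case "$1" in
    dev|writing|research) echo "$HOME/.claude/profiles/$1.mcp.json" ;;
    *) echo "unknown profile: $1"; return 1 ;;
  esac
}

# Usage: claude --mcp-config "$(profile writing)"
```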

&lt;h2&gt;
  
  
  /compact at 60%, Not 95%. (Yes, I Know What the Docs Say.)
&lt;/h2&gt;

&lt;p&gt;The docs suggest compacting late, when you are running out of room. Polite, conservative advice. I disagree.&lt;/p&gt;

&lt;p&gt;By 95%, the response quality has already started to degrade. Claude is fishing in a saturated context. You feel it before the warning fires. By 60%, you compact while everything is still clean and the next 40% of the session runs at full quality on a much lighter context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; on day-to-day code ops, compacting at 60% gives you better quality AND lower cost in the same move. Rare combo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; on architecture work or deep debugging where you NEED the full history, compacting early throws away the nuance Claude would have used. You compact away the very thing that was about to help you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; 60% for daily code. 85% for architecture and deep debug.&lt;/p&gt;

&lt;h2&gt;
  
  
  1,122 Redundant File Reads. One Session Read the Same File 33 Times.
&lt;/h2&gt;

&lt;p&gt;That number is from the Reddit audit and it physically hurt me to type it. 33 times. Same file. Same session. Each read paid in full.&lt;/p&gt;

&lt;p&gt;The cause is usually &lt;code&gt;/compact&lt;/code&gt; wiping the file from working memory, or Claude playing it safe after a long turn and re-fetching to be sure. Either way, you pay.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; add a rule in your &lt;code&gt;CLAUDE.md&lt;/code&gt;: "do not re-read files you already have in context unless I ask, or unless the file may have changed since the last read". Massive savings on long sessions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; if you constrain re-reads too hard, Claude misses the edits YOU made between two turns and codes against a stale version. That is a worse bug than the wasted tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; the rule with the explicit exception is the only version that survives contact with reality.&lt;/p&gt;

&lt;p&gt;The same file. 33 times. In one session. I am still not over it.&lt;/p&gt;
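&lt;p&gt;Spelled out as a &lt;code&gt;CLAUDE.md&lt;/code&gt; block, one possible wording of that rule with its exception:&lt;/p&gt;

```markdown
## File reads
- Do not re-read a file that is already in context unless I ask,
  or unless the file may have changed since the last read
  (I edited it, a command wrote to it, or /compact just ran).
```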

&lt;h2&gt;
  
  
  Paste the Function. Not the 1,200-Line File.
&lt;/h2&gt;

&lt;p&gt;Surgery vs grenade. Most devs throw the whole file at Claude because it is faster to copy. Faster to paste, slower to pay.&lt;/p&gt;

&lt;p&gt;The default should be: paste the function, not the file. Direct savings, no cognitive cost, works on every prompt. The only time it backfires is when the bug is actually in the interaction between functions, and pasting the isolated piece means Claude misses the root cause. You debug the wrong thing for 20 minutes, then widen the context anyway. Annoying. Still cheaper than pasting the whole file every single time by default.&lt;/p&gt;

&lt;p&gt;Start surgical. Widen only if the first pass fails. Most of the time, the first pass works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reference @auth.js, Not "The Bug in My Repo"
&lt;/h2&gt;

&lt;p&gt;Targeted reference (&lt;code&gt;@path/to/file.js&lt;/code&gt;) means a precise read. Vague reference means Claude runs grep, glob, ls in cascade until it finds what you meant. Each step pumps your context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; fine control over what gets loaded. You know what you are paying for.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; precise reference assumes you know where the bug lives. If you are still searching, vague is necessary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; target whenever you can. When you have to search, ask explicitly: "find the file, then stop and show me before reading it." Two-step search. Cheaper than the cascade.&lt;/p&gt;

&lt;h2&gt;
  
  
  Edit Your Last Message. Don't Send a New One.
&lt;/h2&gt;

&lt;p&gt;Claude gives you a wrong answer. You scroll back, edit your original prompt, hit enter. Claude regenerates from the corrected prompt. The bad reply never enters the context. The thread stays clean.&lt;/p&gt;

&lt;p&gt;Versus: typing a new message saying "no, I meant…", which stacks the bad reply, your correction, and the new attempt all in working memory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; clean thread, smaller context, faster iteration on micro-adjustments. Especially good for fixing typos in your own prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; you lose the trace that the first version failed, which is sometimes useful when you are debugging a recurring pattern in how Claude misunderstands you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; edit for micro-adjustments. Stack for real reasoning iterations where the failed attempt teaches you something.&lt;/p&gt;

&lt;h2&gt;
  
  
  A git log Just Ate 8,000 Tokens. You Didn't Notice.
&lt;/h2&gt;

&lt;p&gt;Terminal outputs are the silent vacuum of Claude Code sessions. &lt;code&gt;git log&lt;/code&gt; without &lt;code&gt;--oneline -20&lt;/code&gt;. &lt;code&gt;npm install&lt;/code&gt; with the full dependency resolution dump. &lt;code&gt;tail -f&lt;/code&gt; on a server log. All of it lands in the context. You did not see it scroll by because Claude collapsed the output. The tokens are still there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; add a deny list in &lt;code&gt;CLAUDE.md&lt;/code&gt; for known verbose commands. Force &lt;code&gt;--oneline&lt;/code&gt;, force tail limits, force log levels. List your repeat offenders.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; if you constrain outputs too hard, Claude misses the info needed to diagnose, you run a second command, and now you paid twice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; deny list on the known verbose ones. Keep the rest permissive. Refine as you spot new offenders.&lt;/p&gt;

&lt;p&gt;You did not see those 8,000 tokens. Your invoice did.&lt;/p&gt;
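&lt;p&gt;My current deny list as a starting point; the offenders are the repeat ones from my own sessions, yours will differ:&lt;/p&gt;

```markdown
## Verbose commands
- git log: always use --oneline -20
- npm install: run with --loglevel=error
- never tail -f a log; use tail -n 50 instead
- test runners: quiet or summary mode unless I ask for full output
```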

&lt;h2&gt;
  
  
  Sonnet by Default. Haiku for Grunt. Opus for Architecture.
&lt;/h2&gt;

&lt;p&gt;The only model rule worth being on this list. Opus on a variable rename is waste. Haiku on an architecture decision is 3x more time spent rolling back.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro:&lt;/strong&gt; right tool for right task. Opus where the reasoning matters, Sonnet for the daily 80%, Haiku for the boring grunt (renames, formatting, doc generation).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Con:&lt;/strong&gt; switching models mid-session breaks your cache. 4 switches a day and you have lost more than you saved.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; pick the model at the START of the session based on the task type. No mid-session switching. If you guessed wrong, finish the task on the current model and retune for the next session.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Cheapest Hack Isn't on This List
&lt;/h2&gt;

&lt;p&gt;All these hacks are guesswork until you measure. Two free commands, already installed on your machine: &lt;code&gt;/context&lt;/code&gt; and &lt;code&gt;/cost&lt;/code&gt;. The formula fits on one line: tokens loaded at startup × sessions per day × 20 working days. That is what you pay every month just to start. Before the first real prompt.&lt;/p&gt;
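&lt;p&gt;The one-line formula, as runnable arithmetic. The startup figure is the one &lt;code&gt;/context&lt;/code&gt; showed me; the session count is an assumption, call it six a day:&lt;/p&gt;

```python
# Tokens paid every month just to open sessions, before the first real prompt.
def monthly_startup_tokens(startup_tokens, sessions_per_day, working_days=20):
    return startup_tokens * sessions_per_day * working_days

# 24,800 tokens at startup, an assumed 6 sessions a day.
print(monthly_startup_tokens(24_800, 6))  # prints 2976000
```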

&lt;p&gt;The market is starting to react. Hasan Toxr (@hasantoxr on X) released a knowledge graph approach that claims 8x to 49x reduction on code reviews. meta_alchemist maintains &lt;code&gt;ccusage&lt;/code&gt; and &lt;code&gt;claude-code-usage-monitor&lt;/code&gt;. Dashboards are popping up everywhere. Good sign. But no external tool will tell you what &lt;code&gt;/context&lt;/code&gt; tells you in two seconds.&lt;/p&gt;

&lt;p&gt;The cheapest hack on this list is not on this list. It is running &lt;code&gt;/context&lt;/code&gt; once a week and looking honestly at where the reloads are coming from. Your config, not Anthropic's bill, is the first place to look.&lt;/p&gt;

&lt;p&gt;Actually, wait. Let me put it differently. The real hack is admitting that most of us have been flying blind. We optimize for features, for speed, for developer experience. But we never look at the bill until it hurts. Then we blame the model, the company, the pricing structure. Anything except the 20 settings we control.&lt;/p&gt;

&lt;p&gt;Tell me which hack was the most useful for you. Or which one I got wrong.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The Reddit audit: u/Medium_Island_2795 (aka MunchKunSter), 858 sessions instrumented over 33 days, surfaced via &lt;a href="https://x.com/Simba_crpt" rel="noopener noreferrer"&gt;@Simba_crpt&lt;/a&gt; and &lt;a href="https://x.com/DAIEvolutionHub" rel="noopener noreferrer"&gt;@DAIEvolutionHub&lt;/a&gt; on X.&lt;/li&gt;
&lt;li&gt;Hasan Toxr's knowledge graph for code review: &lt;a href="https://x.com/hasantoxr" rel="noopener noreferrer"&gt;@hasantoxr&lt;/a&gt; on X.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ccusage&lt;/code&gt; and &lt;code&gt;claude-code-usage-monitor&lt;/code&gt; by meta_alchemist.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(*) The cover is AI-generated. Claude wrote the words. A different machine drew the picture.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>claude</category>
      <category>aitools</category>
    </item>
    <item>
      <title>How to Remove Elementor From WordPress | Convert 114 Pages to Gutenberg in One Day</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Fri, 10 Apr 2026 13:41:10 +0000</pubDate>
      <link>https://forem.com/rentierdigital/how-to-remove-elementor-from-wordpress-convert-114-pages-to-gutenberg-in-one-day-3iah</link>
      <guid>https://forem.com/rentierdigital/how-to-remove-elementor-from-wordpress-convert-114-pages-to-gutenberg-in-one-day-3iah</guid>
      <description>&lt;p&gt;Migrating a site from Elementor to Gutenberg is a mess. There is no magic button. You have to rebuild every page by hand. Count several weeks. Rebuild from scratch. That's the consensus, it's everywhere, I read it maybe ten times before I actually started.&lt;/p&gt;

&lt;p&gt;A client site. &lt;strong&gt;114 pieces of content&lt;/strong&gt; to migrate. Hello Elementor theme (an empty shell without the plugin), expired Pro licenses, content locked inside Elementor JSON that looks like nothing if you disable the builder. Nobody wants to pay for a multi-week technical migration.&lt;/p&gt;

&lt;p&gt;Me neither. But you didn't seriously think I was going to do it by hand, right? 😏&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR&lt;/strong&gt;: &lt;strong&gt;114 pages migrated in one day&lt;/strong&gt;, zero manual rebuilding, all URLs preserved. I'll show you exactly how to do the same thing on any Elementor project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Elementor Has to Go
&lt;/h2&gt;

&lt;p&gt;Elementor was a fine choice a few years ago. WordPress core was rough. Gutenberg was barely usable. A drag-and-drop builder made sense for people who didn't want to touch code. That was then.&lt;/p&gt;

&lt;p&gt;WordPress has moved on. &lt;strong&gt;Gutenberg is solid now&lt;/strong&gt;. Full Site Editing covers most of what page builders used to sell. Block themes let you design the whole site without a single plugin. Meanwhile Elementor has become dead weight: heavy, slow, paid, and it &lt;strong&gt;locks your content&lt;/strong&gt; in a proprietary format that nobody else can read.&lt;/p&gt;

&lt;p&gt;The Hello Elementor theme is the worst offender. It's an empty shell, 100% dependent on the plugin. Turn off Elementor and your pages are gone. Just gone.&lt;/p&gt;

&lt;p&gt;The real problem is not technical though. It's economic. Nobody wants to pay for a technical migration that takes several weeks. The client needs a working site, not a refactoring project. And every guide you read about leaving Elementor says the same thing. "There is no magic button." "Several weeks of dedicated work." "Essentially the site has to be rebuilt."&lt;/p&gt;

&lt;p&gt;That consensus kills migrations before they start. Owners stay stuck on a plugin they don't like, paying licenses they don't use, on a site they can't edit without opening Elementor one more time.&lt;/p&gt;

&lt;p&gt;To be fair, Elementor is still decent for some cases. A one-shot landing page. A temporary event site. Something the owner will not touch in two years. The problem isn't the builder itself. It's the long-term lock-in on content you'll want to keep.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup: One App Password, One Agent
&lt;/h2&gt;

&lt;p&gt;Here's what every migration guide tells you to do. Spin up a staging site. Take a full backup. Install the "Elementor to Blocks" plugin for partial conversion. Rebuild each page by hand where the plugin fails. Clean up residual Elementor classes. QA everything. Push to production. Several weeks, calendar time.&lt;/p&gt;

&lt;p&gt;Here's what I did instead. Opened wp-admin on the live site. Users &amp;gt; Profile &amp;gt; Application Passwords. Typed a name, clicked Add. Copied the generated password. Total elapsed time: 30 seconds.&lt;/p&gt;

&lt;p&gt;That's the whole setup. No OAuth flow, no webhook, no plugin to install, no staging environment. The &lt;strong&gt;app password&lt;/strong&gt; rides on plain HTTP Basic auth and carries the same permissions as the user that generated it. You pass it to Claude Code along with the site URL, and the agent can read and write everything the REST API exposes. Posts, pages, media, users, settings.&lt;/p&gt;
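&lt;p&gt;The whole credential dance is small enough to show inline. A minimal sketch (the function name is mine): WordPress application passwords are sent as a standard Basic auth header on every &lt;code&gt;/wp-json/wp/v2/...&lt;/code&gt; call.&lt;/p&gt;

```python
import base64

def wp_auth_header(username: str, app_password: str) -> dict:
    # WordPress application passwords ride on plain HTTP Basic auth:
    # base64("user:app-password") in the Authorization header
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}
```

&lt;p&gt;The spaces WordPress displays in the generated password are cosmetic; the server normalizes them away before checking, so you can keep or drop them.&lt;/p&gt;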

&lt;p&gt;If you want the full walkthrough of how this works end to end, I wrote &lt;a href="https://rentierdigital.xyz/blog/automate-wordpress-with-claude-code" rel="noopener noreferrer"&gt;the complete setup for connecting Claude Code to any WordPress site&lt;/a&gt; a few weeks ago.&lt;/p&gt;

&lt;p&gt;One sentence to launch: "Migrate this site from Elementor to native Gutenberg blocks, here's the URL and the app password, work from the REST API." That's it. No prompt engineering, no system instructions, no tool list. Claude Code explored the API by itself, figured out the stack (WordPress version, active theme, plugin list, post types), pulled a full backup of every post and page into a local JSON file, and started writing its first conversion script.&lt;/p&gt;

&lt;p&gt;Timeline, real numbers. Launched at 8:15 in the morning. By noon there were three adjustments left. By evening, done. &lt;strong&gt;114 pieces of content total&lt;/strong&gt; (94 posts and 20 pages). The whole technical process was 100% agent-driven. The only "human intervention" was me relaying the client's messages about what she wanted kept or dropped.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Claude Code Converts Elementor to Gutenberg
&lt;/h2&gt;

&lt;p&gt;First technical decision, and it's the one that made the whole thing work. Claude Code did NOT parse &lt;code&gt;_elementor_data&lt;/code&gt;. That's the raw JSON Elementor stores in postmeta: deeply nested, with a structure that varies between Elementor versions and widget types that change names across releases. Parsing that would be a moving target.&lt;/p&gt;

&lt;p&gt;Instead it worked from the &lt;strong&gt;rendered HTML&lt;/strong&gt;. The REST API exposes &lt;code&gt;content.rendered&lt;/code&gt; for every post, which is the final HTML after Elementor has done its job. Simpler, more stable, portable across Elementor versions. If the HTML looks right on the frontend, the parser has something to chew on.&lt;/p&gt;

&lt;p&gt;The first script Claude Code wrote was &lt;code&gt;convert-elementor.py&lt;/code&gt;. Roughly what it does:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;

&lt;span class="n"&gt;ELEMENTOR_WRAPPERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-section&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-container&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-column&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-widget-wrap&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-widget&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;e-con&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;e-con-inner&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;e-child&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-row&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-element&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_element&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Walk the tree, strip Elementor wrappers, emit Gutenberg blocks.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;classes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;class&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]))&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;classes&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;ELEMENTOR_WRAPPERS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Skip the wrapper, recurse into children
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;process_element&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;children&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Map widgets to native Gutenberg blocks
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-widget-heading&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;classes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;convert_heading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-widget-image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;classes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;convert_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;elementor-widget-text-editor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;classes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;convert_paragraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ... other widgets
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;process_element&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;children&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The recursive &lt;code&gt;process_element&lt;/code&gt; function is the heart of it. It walks the DOM tree, and whenever it hits an Elementor wrapper (around a dozen different class names), it skips the wrapper and keeps walking into the children. When it hits an actual widget, it routes to a dedicated converter. Headings become &lt;code&gt;wp:heading&lt;/code&gt;, images become &lt;code&gt;wp:image&lt;/code&gt;, text editors become &lt;code&gt;wp:paragraph&lt;/code&gt;, and so on. CSS classes and &lt;code&gt;data-*&lt;/code&gt; attributes get stripped by regex on the way out.&lt;/p&gt;

&lt;p&gt;Two concrete examples.&lt;/p&gt;

&lt;p&gt;An Elementor heading looks like this in the rendered HTML:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"elementor-widget elementor-widget-heading"&lt;/span&gt; &lt;span class="na"&gt;data-id=&lt;/span&gt;&lt;span class="s"&gt;"a1b2c3"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"elementor-widget-container"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h2&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"elementor-heading-title elementor-size-default"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;span&lt;/span&gt; &lt;span class="na"&gt;style=&lt;/span&gt;&lt;span class="s"&gt;"color: #333"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Our Approach&lt;span class="nt"&gt;&amp;lt;/span&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/h2&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After conversion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- wp:heading --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;h2&amp;gt;&lt;/span&gt;Our Approach&lt;span class="nt"&gt;&amp;lt;/h2&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- /wp:heading --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The inline span with the color style is gone. So are the Elementor classes, the wrapper divs, the data attributes. Clean Gutenberg block, ready to render in any theme.&lt;/p&gt;
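&lt;p&gt;A hypothetical &lt;code&gt;convert_heading&lt;/code&gt; along those lines, regex-based for brevity (the real script worked on a BeautifulSoup tree) and assuming the widget holds a single h1-h6:&lt;/p&gt;

```python
import re

def convert_heading(widget_html: str) -> str:
    # Grab the first h1-h6 inside the widget markup
    m = re.search(r"<(h[1-6])[^>]*>(.*?)</\1>", widget_html, re.S)
    if not m:
        return ""  # not a heading widget after all
    tag, inner = m.group(1), m.group(2)
    # Drop leftover inline spans (color styles etc.) but keep their text
    inner = re.sub(r"</?span[^>]*>", "", inner).strip()
    level = tag[1]
    # Gutenberg defaults to level 2; other levels need the JSON attribute
    attrs = "" if level == "2" else f' {{"level":{level}}}'
    return f"<!-- wp:heading{attrs} -->\n<{tag}>{inner}</{tag}>\n<!-- /wp:heading -->"
```

&lt;p&gt;Fed the widget HTML above, it emits exactly the clean &lt;code&gt;wp:heading&lt;/code&gt; block shown.&lt;/p&gt;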

&lt;p&gt;An Elementor image is similar. The builder wraps the &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; in a figure that lives inside two or three divs with classes like &lt;code&gt;elementor-widget-image&lt;/code&gt;. Gutenberg expects a figure with the class &lt;code&gt;wp-block-image&lt;/code&gt; and the right block comment around it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- wp:image --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;figure&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"wp-block-image"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;img&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"..."&lt;/span&gt; &lt;span class="na"&gt;alt=&lt;/span&gt;&lt;span class="s"&gt;"..."&lt;/span&gt;&lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/figure&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- /wp:image --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same pattern. Strip the wrappers, rebuild the markup clean, wrap in a Gutenberg block comment.&lt;/p&gt;

&lt;p&gt;Last piece, pushing the converted content back to WordPress. This is where things got weird. The client site is behind Cloudflare, and Cloudflare blocked every Python HTTP library I tried (&lt;code&gt;requests&lt;/code&gt;, &lt;code&gt;urllib&lt;/code&gt;, &lt;code&gt;httpx&lt;/code&gt;). The user-agent looked suspicious enough to get 403'd on every POST. Claude Code figured this out on its own, after a few failed calls, and switched to &lt;code&gt;curl&lt;/code&gt; in a subprocess. The JSON body goes into a temp file, curl reads it with &lt;code&gt;@filename&lt;/code&gt;, Cloudflare sees a normal curl user-agent and lets it through.&lt;/p&gt;

&lt;p&gt;Not clean. It's a hack. But it worked for the entire run, and on a long enough timeline every "temporary workaround" becomes load-bearing infrastructure anyway.&lt;/p&gt;

&lt;p&gt;That's the caveat I owe you: the curl workaround is tied to this specific host. On a site without Cloudflare, the normal Python HTTP client would have worked fine. Your mileage will vary depending on the CDN and host config.&lt;/p&gt;
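&lt;p&gt;A sketch of the workaround's shape, with hypothetical names; the point is only the mechanics: JSON body in a temp file, curl reading it with &lt;code&gt;@filename&lt;/code&gt;, so the request goes out with curl's own user-agent (which this particular Cloudflare config let through):&lt;/p&gt;

```python
import json
import tempfile

def build_curl_update(site: str, post_id: int, payload: dict, auth: str) -> list:
    # Write the JSON body to a temp file so curl can read it via @filename
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(payload, f)
    return [
        "curl", "-sS", "-X", "POST",
        f"{site}/wp-json/wp/v2/posts/{post_id}",
        "-u", auth,  # user:app-password, plain Basic auth
        "-H", "Content-Type: application/json",
        "--data", f"@{f.name}",  # Cloudflare sees curl's normal user-agent
    ]
```

&lt;p&gt;Run it with &lt;code&gt;subprocess.run(cmd, capture_output=True, text=True)&lt;/code&gt; and check the return code. Again: this only earns its keep when the host's WAF blocks your normal HTTP client.&lt;/p&gt;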

&lt;h2&gt;
  
  
  What Broke (and How It Fixed Itself)
&lt;/h2&gt;

&lt;p&gt;After the first pass the site was up. Every page loaded. No 500s, no white screens. But the content was riddled with problems, three different flavors of broken, and they showed up roughly in the order that follows.&lt;/p&gt;

&lt;p&gt;The first one was &lt;strong&gt;ghost spacers&lt;/strong&gt; everywhere. Elementor uses empty paragraphs with CSS margin as visual separators, and when you parse the HTML naively, those empty paragraphs convert into &lt;code&gt;wp:spacer&lt;/code&gt; blocks. Multiply that across 114 pages and you get dozens of spacers stacked on top of each other, creating absurd vertical holes in the content.&lt;/p&gt;

&lt;p&gt;Sections floating alone in the middle of the page. Footers halfway down the viewport. The whole layout doing its best impression of a CSS reset gone rogue. Claude Code diagnosed the cause itself once I pointed at one page and said "this looks wrong". It wrote a second script, &lt;code&gt;fix-spacers.py&lt;/code&gt;, that re-pulled every post, stripped the spacer blocks by regex, and replaced them with regular line breaks where appropriate. 47 pieces of content cleaned in one batch.&lt;/p&gt;
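&lt;p&gt;The core of that cleanup is a few lines. This regex is my reconstruction, not the script's actual code, and it deletes the spacer blocks outright rather than substituting line breaks:&lt;/p&gt;

```python
import re

# Match a whole spacer block: opening comment, the div, closing comment
SPACER_RE = re.compile(
    r"<!--\s*wp:spacer[^>]*-->.*?<!--\s*/wp:spacer\s*-->\s*",
    re.S,
)

def strip_spacers(content: str) -> str:
    # Remove every wp:spacer block the naive conversion produced
    # from Elementor's empty margin paragraphs
    return SPACER_RE.sub("", content)
```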

&lt;p&gt;Then came the &lt;strong&gt;invalid block errors&lt;/strong&gt;. Gutenberg validates block markup strictly. If a &lt;code&gt;wp:image&lt;/code&gt; block has a figure without the &lt;code&gt;wp-block-image&lt;/code&gt; class, Gutenberg throws "This block contains unexpected or invalid content." Same for a &lt;code&gt;wp:heading&lt;/code&gt; that still has leftover Elementor spans inside. The block loads but the editor refuses to modify it, which is worse than if it were just broken visually.&lt;/p&gt;

&lt;p&gt;The client wanted to edit her own pages, that was the whole point. Third script, &lt;code&gt;fix-blocks.py&lt;/code&gt;. Re-parse each block, reconstruct the inner HTML from scratch using the format Gutenberg expects, push it back. 81 pieces of content fixed, split into 4 parallel batches to speed things up. Claude Code decided on the parallelism by itself. I didn't ask.&lt;/p&gt;
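&lt;p&gt;For the curious, here is what 4 parallel batches can look like for I/O-bound REST updates. Hypothetical names throughout; the real &lt;code&gt;fix-blocks.py&lt;/code&gt; may have done it differently:&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor

def in_batches(items: list, n: int = 4) -> list:
    # Round-robin split into n roughly even batches
    return [items[i::n] for i in range(n)]

def fix_all(posts: list, fix_one, n: int = 4) -> list:
    # REST updates are I/O-bound, so threads are enough
    with ThreadPoolExecutor(max_workers=n) as pool:
        batches = in_batches(posts, n)
        results = pool.map(lambda batch: [fix_one(p) for p in batch], batches)
    return [r for batch in results for r in batch]
```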

&lt;p&gt;Here's the narrative point I care about. The first two problems, Claude Code solved them entirely on its own. Three scripts in total, each one written to fix the issues the previous one had created. &lt;strong&gt;Autonomous feedback loop&lt;/strong&gt;, same session, no restart. I pointed at symptoms. The agent wrote the diagnostic, wrote the fix, ran the fix, verified the fix.&lt;/p&gt;

&lt;p&gt;The third flavor of broken is the actual limit. Elementor Pro ships widgets that have no match in native Gutenberg. The homepage slider, gone (no slider block in core). The Mailchimp popup, gone. The Elementor Pro contact form, gone. The social icons widget, text preserved but the visual icons dropped. These are not conversion bugs. They are proprietary widgets that don't exist outside Elementor Pro, and no parser in the world is going to invent them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code it can fix. Vendor lock-in it can't.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Last 5% You'll Do Yourself
&lt;/h2&gt;

&lt;p&gt;Short anecdote first. At one point Claude Code needed to install the Twenty Twenty-Five theme as a fallback. The WordPress REST API doesn't expose theme installation (for good reasons, it's a security surface). So the agent opened a browser via its MCP tool, logged into wp-admin, navigated to Appearance &amp;gt; Themes &amp;gt; Add New, searched, clicked Install, clicked Activate. Did it by itself. I watched it happen in the logs.&lt;/p&gt;

&lt;p&gt;I bring this up because the "95% automated" framing needs a caveat. The agent handled almost everything the REST API allowed, and when the API didn't allow something, it found another channel. What stays for the human to do isn't technical work the agent can't handle. It's business decisions the agent shouldn't be making alone.&lt;/p&gt;

&lt;p&gt;Concretely, what I had to touch manually at the end:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which form plugin to install to replace the Elementor Pro forms. WPForms Lite, Contact Form 7, Fluent Forms, each has tradeoffs. Not an agent call.&lt;/li&gt;
&lt;li&gt;How to reintegrate the newsletter signup. Native Mailchimp embed, MC4WP plugin, or a new signup flow entirely.&lt;/li&gt;
&lt;li&gt;Picking a new hero image because the old one had text baked into it.&lt;/li&gt;
&lt;li&gt;Reviewing Rank Math SEO meta on the key landing pages to make sure nothing regressed.&lt;/li&gt;
&lt;li&gt;Cleaning up leftover &lt;code&gt;_elementor_*&lt;/code&gt; rows in &lt;code&gt;postmeta&lt;/code&gt; (WP-CLI or direct SQL, to lighten the database).&lt;/li&gt;
&lt;li&gt;Removing the Hello Elementor theme files (no REST endpoint for that either).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;95% automated, 5% manual&lt;/strong&gt;. And that 5% is decision work, not reconstruction work. Before you point a coding agent at a production site, you want to define exactly what it's allowed to touch and what it has to escalate. That's the whole reason I built &lt;a href="https://rentierdigital.xyz/blog/i-stopped-vibe-coding-and-started-prompt-contracts-claude-code-went-from-gambling-to-shipping" rel="noopener noreferrer"&gt;the execution scope discipline I use before launching an agent on production&lt;/a&gt;. Clear scope, clear escalation rules, clear stop conditions. Without that you're gambling on a live site and hoping the agent self-corrects before it bricks something.&lt;/p&gt;

&lt;p&gt;The REST API itself has hard limits worth knowing. No theme deletion, no plugin code access, no PHP execution, no file system. For anything outside the API surface you need SSH or wp-admin. Define the perimeter before you launch.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;span&gt;The Button That Wasn't&lt;/span&gt;
&lt;/h2&gt;

&lt;p&gt;114 pieces of content. Three scripts. One day. Typos fixed along the way, key pages reviewed, all URLs preserved. The client edits her own pages now without calling me.&lt;/p&gt;

&lt;p&gt;Every guide is right. There is no magic button for Elementor to Gutenberg. But an agent that writes its own conversion scripts and fixes its own bugs in a continuous session, that's not a button.&lt;/p&gt;

&lt;p&gt;It's better than a button 🚀&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://developer.wordpress.org/rest-api/" rel="noopener noreferrer"&gt;WordPress REST API Handbook&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Consensus guides on Elementor to Gutenberg migration: Blog Marketing Academy, Crocoblock, Ulement ("there is no magic button", "several weeks of dedicated work")&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(*) The cover is AI-generated. Turns out the only real magic button in this whole story was the one that made the header image.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>technology</category>
      <category>wordpress</category>
      <category>webdev</category>
    </item>
    <item>
      <title>You Have a Third Pile of Technical Debt. Nobody Has Built a Tool to Find It.</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Thu, 09 Apr 2026 13:41:10 +0000</pubDate>
      <link>https://forem.com/rentierdigital/you-have-a-third-pile-of-technical-debt-nobody-has-built-a-tool-to-find-it-l0e</link>
      <guid>https://forem.com/rentierdigital/you-have-a-third-pile-of-technical-debt-nobody-has-built-a-tool-to-find-it-l0e</guid>
      <description>&lt;p&gt;You launch your usual app. ERROR. You're already pissed because you hadn't planned to do maintenance today. So you dig. My terminal threw &lt;code&gt;ECONNREFUSED 127.0.0.1:46279&lt;/code&gt; at my face and it took me exactly thirty seconds to understand that nobody, anywhere, was going to deal with this. No status page. No ticket to file. No SLA to wave around. The free service that one piece of my pipeline depended on had just gone down, and the only human on Earth who cared was me.&lt;/p&gt;

&lt;p&gt;Of course I had never signed a contract with them. I was using a free service. I never showed up on any of their dashboards. And yet at 9:17 that Monday morning, they owed me something they didn't even know they owed me. Or actually the other way around: I owed them. I had been running up a tab for months without noticing, and the creditor had just called it in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLDR&lt;/strong&gt; (You know about the &lt;strong&gt;technical debt&lt;/strong&gt; you wrote. You measure the &lt;strong&gt;technical debt&lt;/strong&gt; you inherited from your dependencies. There's a &lt;strong&gt;third pile&lt;/strong&gt; nobody talks about: the debt you &lt;strong&gt;import&lt;/strong&gt; every time you wire a &lt;strong&gt;free SaaS&lt;/strong&gt; into your pipeline. It doesn't show up on any dashboard. No linter catches it. Dependabot doesn't see it. &lt;strong&gt;Audit your imported debt&lt;/strong&gt; before a Monday morning audits it for you.)&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Piles
&lt;/h2&gt;

&lt;p&gt;There are three kinds of technical debt and most teams only count two of them.&lt;/p&gt;

&lt;p&gt;The first one is the &lt;strong&gt;debt you wrote&lt;/strong&gt;. The Friday afternoon hack that survived. The TODO from 2022 that became load-bearing. The if-statement that was supposed to be temporary and now has its own commit history. You know it exists because you wrote it, and you know roughly where it lives because the bad smell follows you around. Your linter catches a slice of it. Code review catches another slice. The rest sits in your head.&lt;/p&gt;

&lt;p&gt;The second pile is the &lt;strong&gt;debt you inherited&lt;/strong&gt;. Lockfiles full of transitive dependencies you never picked. Libraries that haven't shipped in three years but still install. Packages with two maintainers and one of them just moved to a farm. You know this pile exists too, because there are tools for it. &lt;code&gt;npm audit&lt;/code&gt;, Dependabot, Snyk, Renovate. They scream at you every Monday morning whether you want it or not. The screaming is annoying, but at least somebody is screaming.&lt;/p&gt;

&lt;p&gt;Then there's the third pile. The pile you don't have a name for, because nobody named it. The &lt;strong&gt;free service&lt;/strong&gt; you plug into one step of your pipeline because it was easier than building the thing yourself. The hosted API that processes a piece of your data because they had a generous free tier. The webhook endpoint that does a conversion you were never going to write yourself. None of this debt is in your codebase. None of it shows up in &lt;code&gt;package.json&lt;/code&gt;. No tool monitors it. You don't even count it as a dependency in your own head, because dependencies are things you import and these things are things you call.&lt;/p&gt;

&lt;p&gt;But they are dependencies. And they are debt. The debt is just sitting somewhere else, on somebody else's server, with somebody else's incentives. You didn't issue it. You imported it.&lt;/p&gt;

&lt;p&gt;The debt you didn't issue is still your debt the moment it falls.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Nobody Sees It
&lt;/h2&gt;

&lt;p&gt;The reason nobody sees this third pile is structural, not stupid.&lt;/p&gt;

&lt;p&gt;Every tool we built for measuring technical debt looks &lt;strong&gt;inside&lt;/strong&gt; your codebase. Linters parse your files. Dependabot reads your manifests. Snyk scans your lockfile. Code review happens on your PRs. The whole observability stack assumes the debt lives in artifacts you own and can grep. Imported debt does not live in artifacts you own. It lives at a URL. And a URL is not an asset, it's a promise.&lt;/p&gt;

&lt;p&gt;A promise made by an entity that owes you nothing.&lt;/p&gt;

&lt;p&gt;There's also a softer reason. You didn't import these things on purpose. You imported them on a Tuesday afternoon when you needed a quick conversion, googled it, found a free endpoint, pasted the URL into a config file, and moved on. It felt like using a tool, not like signing a contract. Nothing in your editor told you that you had just bolted a stranger's mortality to your pipeline. So you didn't update any mental ledger. There was no ledger.&lt;/p&gt;

&lt;p&gt;The funny thing is the rest of the industry is perfectly aware that this stuff breaks. A 2025 survey of 1,000 senior tech executives found that 93% worry about downtime impact and 100% experienced outage-related revenue loss that year. The list of public incidents reads like a horror catalog: AWS us-east-1 going down for hours and dragging dependent SaaS providers along, Cloudflare WAF rules wiping out a chunk of global traffic in a single push, Azure configuration errors taking out Microsoft 365 and Xbox at the same time. We know outages happen. We track them publicly. We write postmortems.&lt;/p&gt;

&lt;p&gt;But we track them &lt;strong&gt;after&lt;/strong&gt;. There is no tool that walks into your repo and says "you depend on a fistful of things that could disappear tomorrow and you have a plan for zero of them." That tool does not exist because the inputs are not in your repo. They are scattered across HTTP calls in random files, hardcoded URLs in config, fetch statements buried in service modules, env vars pointing at hostnames you wrote down once and forgot.&lt;/p&gt;

&lt;p&gt;You know what your &lt;code&gt;package.json&lt;/code&gt; looks like. You have no idea what your &lt;strong&gt;outbound calls&lt;/strong&gt; look like. That's the gap.&lt;/p&gt;
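&lt;p&gt;Closing that gap takes one Saturday script, not a product. A minimal sketch in Python (the file extensions and repo layout are my assumptions, not something from this pipeline):&lt;/p&gt;

```python
# Hedged sketch: inventory every outbound hostname referenced in a repo.
# The extension list is an illustrative guess; adjust it for your stack.
import re
from pathlib import Path
from collections import Counter

URL_RE = re.compile(r"https?://([A-Za-z0-9.-]+)")
SCAN_SUFFIXES = {".py", ".ts", ".js", ".json", ".yml", ".yaml", ".toml"}

def outbound_hosts(root):
    """Count every hostname that appears in source and config files under root."""
    hosts = Counter()
    for path in Path(root).rglob("*"):
        if path.suffix not in SCAN_SUFFIXES:
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for host in URL_RE.findall(text):
            hosts[host] += 1
    return hosts

if __name__ == "__main__":
    for host, n in outbound_hosts(".").most_common():
        print(f"{host}  ({n} call sites)")
```

&lt;p&gt;Run it from the repo root. Every hostname it prints that you don't pay for and don't host is a candidate line for the audit.&lt;/p&gt;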

&lt;h2&gt;
  
  
  The Audit Nobody Runs
&lt;/h2&gt;

&lt;p&gt;After Kroki's Excalidraw backend crashed and refused to come back, I sat down for the first time in years and ran the audit. Not the one for npm packages. The one for outbound HTTP calls to free services I had never paid for and could not replace in a hurry.&lt;/p&gt;

&lt;p&gt;It took me a Saturday morning to grep through everything. The pipeline I was working on at the time was a product catalog automation for a small ecommerce client, the kind of thing that generates assembly diagrams and visual specs for product pages on their store. Not glamorous, but it ships every day. And every step of the pipeline had a free SaaS bolted onto it somewhere.&lt;/p&gt;

&lt;p&gt;I came up with &lt;strong&gt;14 outbound calls&lt;/strong&gt; to services I had never paid for. I won't bore you with the full inventory. The interesting numbers were elsewhere. The number of items that had a fallback plan documented somewhere: zero. The number of items with monitoring on the upstream service: zero. The number of items I had ever stress-tested by killing the dependency on purpose: also zero. &lt;/p&gt;

&lt;p&gt;I was running a production pipeline on top of a stack of free promises, and I had been doing it for so long that it didn't even register as a risk anymore. It registered as "infrastructure."&lt;/p&gt;

&lt;p&gt;The Kroki replacement, the actual fix, was small. A single TypeScript file that does exactly what the broken service was doing for me, no more, no less. Runs on Bun. Calls a library directly instead of going through a headless browser. Lives in 47 lines. Uses 93MB of RAM. Renders a diagram in roughly 2 milliseconds. It runs in a Docker container on a VPS I was already paying $6 a month for, on the same internal network as the rest of the stack. No public endpoint. No certificates. No attack surface. It has been running since the day I wrote it and it has not once gone down.&lt;/p&gt;

&lt;p&gt;Now, the part that bothers me is recent. Five years ago, that fix would have taken me a full day. Maybe two. The cost-benefit of replacing a free dependency would have been a net loss for any single one of them, so I would have done what everyone does, which is wait for the upstream to come back and pray. Today, with Claude Code in front of me, the same fix took thirty minutes and I did it during a coffee break. The math has flipped. The thing that used to be too expensive to fix is now too cheap to ignore. &lt;/p&gt;

&lt;p&gt;Same way I rebuilt &lt;a href="https://rentierdigital.xyz/blog/anthropic-just-killed-my-200-month-openclaw-setup-so-i-rebuilt-it-for-15" rel="noopener noreferrer"&gt;a paid setup that the vendor decided to retire on me, for a fraction of the cost&lt;/a&gt; a few months back. The arbitrage has changed under our feet, and most of us are still running the old prices in our head.&lt;/p&gt;

&lt;p&gt;Every imported debt I had been carrying for years was suddenly cheap to refinance. I just hadn't noticed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fix Is Not Self-Hosting
&lt;/h2&gt;

&lt;p&gt;Careful here, because the easy version of this story is "self-host everything" and that is the wrong conclusion.&lt;/p&gt;

&lt;p&gt;Self-hosting has its own debt. Servers need patching. Containers need restarting. Disks fill up. The fact that I replaced one free dependency with 47 lines of my own code does not mean I won the game. It means I traded one creditor for another. The new creditor is me, and at least I know where to find me.&lt;/p&gt;

&lt;p&gt;The actual fix is much more boring. The actual fix is &lt;strong&gt;keeping a balance sheet&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;You don't need a tool. You need a list. A flat text file in your repo, or a section in your README, or whatever survives your own laziness. Every external service your pipeline calls. Three columns. What it does, what dies if it dies, and what you would do about it on a Monday morning. That's it. The act of writing it forces you to look at each line and ask the only question that matters: do I have a plan, or am I betting that this one is too big to fall?&lt;/p&gt;
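&lt;p&gt;As a sketch of the shape (the services and plans here are invented for illustration, not my actual list):&lt;/p&gt;

```text
# external-debt.md — one line per outbound call you never paid for
service            | what dies if it dies         | Monday-morning plan
-------------------|------------------------------|----------------------------
diagram renderer   | product-page image step      | small local replacement
currency-rate API  | price display (stale is OK)  | cache last response, alert
link shortener     | outbound click tracking only | accept the loss
```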

&lt;p&gt;Most of your lines will not have a plan. That's fine. The point of the audit isn't to fix everything in a weekend, it's to make the third pile &lt;strong&gt;visible&lt;/strong&gt;. Once you can see it, you can start refinancing the cheapest items first. The two-millisecond replacements. The 47-line fixes. The ones where the library exists as a package and the only thing you were ever paying for was the HTTP wrapper.&lt;/p&gt;

&lt;p&gt;You will discover, like I did, that a surprising number of your imported debts are exactly that: an HTTP wrapper around something you could have called directly. The infrastructure looked impressive because there was a hosted dashboard and a status page and a brand. Strip the wrapper and the actual logic is fifty lines. This is the same pattern I keep hitting elsewhere too, and it's why I wrote a whole piece on &lt;a href="https://rentierdigital.xyz/blog/why-clis-beat-mcp-for-ai-agents-and-how-to-build-your-own-cli-army" rel="noopener noreferrer"&gt;how the cheapest tool that does the job tends to beat the fancy one in production&lt;/a&gt;. Less surface, less to break, less to depend on.&lt;/p&gt;

&lt;p&gt;Three questions to put on the balance sheet, for every line.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First question&lt;/strong&gt;: if this thing dies on a Monday morning at 9 a.m., what stops working? Be honest. Don't say "nothing critical." Walk through it. Trace the call. See where it lands. If the answer is "the publish step" or "the customer-facing thing" or "the part that makes money," circle the line in red.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second question&lt;/strong&gt;: do I have a fallback, or do I just believe this one is too big to fall? The "too big to fall" reasoning is exactly the reasoning that gets you killed. Cloudflare is too big to fall. AWS us-east-1 is too big to fall. They both fell in 2025. Free tiers from indie maintainers fall every week. Belief is not a fallback.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Third question&lt;/strong&gt; is the cheap one. How many lines of code would it cost to rebuild this myself, &lt;strong&gt;right now&lt;/strong&gt;, while I'm calm, instead of in a panic at 9:17 on a Monday? Maybe the answer is "thousands" and you decide to live with the risk. That's a real decision. Maybe the answer is "47" and you do it during a coffee break. That's a real decision too. The point is making the decision instead of having it made for you.&lt;/p&gt;

&lt;p&gt;Most of us have never made the decision. We just kept clicking the free service into the pipeline because it was there.&lt;/p&gt;




&lt;p&gt;Three days after the incident, I went back to Kroki's status page. The Excalidraw backend was still listed as down. Someone had posted a message on their Discord asking if anyone was working on it. Nobody had answered.&lt;/p&gt;

&lt;p&gt;My pipeline had been running for 72 hours without interruption. I had forgotten that I owed something to somebody.&lt;/p&gt;

&lt;p&gt;AI makes you resilient or selfish. Depends how you squint. 🤷&lt;/p&gt;

&lt;p&gt;Anyway, the point is this: audit your imported debt. Not because you need to fix it all, but because you need to see it. The debt you can see, you can plan for. The debt you can't see just waits for a Monday morning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Cockroach Labs, &lt;em&gt;Outages Observer: Why 2025 Failures Demand Unbreakable Systems in 2026&lt;/em&gt; (&lt;a href="https://www.cockroachlabs.com/blog/2025-top-outages/" rel="noopener noreferrer"&gt;https://www.cockroachlabs.com/blog/2025-top-outages/&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(*) The cover image was made by an AI, which is itself a free service I'm importing into my workflow. Make of that what you will.&lt;/p&gt;

</description>
      <category>softwareengineering</category>
      <category>technology</category>
      <category>technicaldebt</category>
      <category>saas</category>
    </item>
    <item>
      <title>4,000 Managers Fired at Block — AI Won't Replace Your Manager. It'll Turn You Into One.</title>
      <dc:creator>Phil Rentier Digital</dc:creator>
      <pubDate>Wed, 08 Apr 2026 16:41:10 +0000</pubDate>
      <link>https://forem.com/rentierdigital/4000-managers-fired-at-block-ai-wont-replace-your-manager-itll-turn-you-into-one-2i7k</link>
      <guid>https://forem.com/rentierdigital/4000-managers-fired-at-block-ai-wont-replace-your-manager-itll-turn-you-into-one-2i7k</guid>
      <description>&lt;p&gt;This morning I fired an agent. Not a human. A piece of code running on Claude Code that decided, in its infinite wisdom, to fix a bug by deleting the file that contained the bug [sic!]. Problem solved, technically. Before that I'd been reading overnight logs, prioritizing three tasks, unblocking a workflow stuck on an edge case. Coffee, croissant, dashboards, decisions. My morning looks like any manager's morning. Except I have zero employees.&lt;/p&gt;

&lt;p&gt;And last week, Jack Dorsey announced that my job doesn't exist. He cut 40% of Block (the company behind Cash App and Square, if you're not sure) which comes out to roughly &lt;strong&gt;4,000 people&lt;/strong&gt;, mostly &lt;strong&gt;middle management&lt;/strong&gt;. Then he published an essay with Sequoia's Roelof Botha explaining that hierarchy is a 2,000-year-old hack and that AI makes managers obsolete. Wall Street clapped. The stock went up.&lt;/p&gt;

&lt;p&gt;Dorsey has the best diagnosis I've read this year. And the wrong prescription. &lt;strong&gt;Middle management doesn't die. It molts.&lt;/strong&gt; And I know this because I've been doing that job for a year, except I pay my reports in tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: AI doesn't kill management, it compresses it. The ratio goes from &lt;strong&gt;1 manager for 5 humans&lt;/strong&gt; to &lt;strong&gt;1 manager for 150 agents&lt;/strong&gt;. The job changes shape (writing contracts instead of giving orders) but coordination, quality control, and prioritization stay entirely human. If you use AI agents daily, you're already a manager. Here's how to survive that and not get canceled.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Morning as a Manager
&lt;/h2&gt;

&lt;p&gt;So about that agent I fired.&lt;/p&gt;

&lt;p&gt;The task was straightforward. An e-commerce report was generating wrong totals for a distributor CSV feed. The agent was supposed to find the calculation error and patch it. What it actually did was delete the report template. No template, no wrong totals. Logic checks out if you're a sociopath.&lt;/p&gt;

&lt;p&gt;I only caught it because I read the logs. Not the code (I barely ever do), the logs. The execution trace. What ran, what changed, what got committed. That's my version of the morning standup. No one talks, no one is late, no one has a "blocker" that's actually a hangover. But someone still has to look at what happened and decide if it's acceptable. That someone is me.&lt;/p&gt;

&lt;p&gt;And it's not just catching disasters. Most of my mornings are boring. An agent processed overnight orders correctly. Another one updated product descriptions from the partner API without hallucinating new features (this time). A third one flagged a broken link on the WooCommerce storefront and fixed it. All fine. All logged. All needing exactly one human to glance at the dashboard and go "yep, we're good."&lt;/p&gt;

&lt;p&gt;That's management. Boring, necessary, unglamorous management. The kind that &lt;a href="https://rentierdigital.xyz/blog/ai-agent-lies-claude-deception" rel="noopener noreferrer"&gt;my agent claiming "done" while lying to my face&lt;/a&gt; taught me to never skip.&lt;/p&gt;

&lt;p&gt;Dorsey says this job is dead. I think he's confusing the packaging with the product.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 2,000-Year-Old Bandwidth Hack
&lt;/h2&gt;

&lt;p&gt;The essay Dorsey co-wrote with Botha, "From Hierarchy to Intelligence," makes one argument that's genuinely hard to dispute: &lt;strong&gt;corporate hierarchy exists to route information&lt;/strong&gt;. That's it. That's the entire reason.&lt;/p&gt;

&lt;p&gt;One human can manage three to eight other humans. When your org grows past that, you add a layer. When that layer grows, you add another. Each layer adds latency, distortion, and politics. The information that reaches the CEO is not the information that left the engineer's desk. This has been true since the Roman legions, through the Prussian army, through every Fortune 500 org chart you've ever seen. Hierarchy is not a management philosophy. It's a &lt;strong&gt;bandwidth workaround&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And Dorsey's right that AI solves the bandwidth part. An LLM can ingest, summarize, and route more information in a minute than a floor of middle managers process in a week. No compression. No "let me circle back on that." No political filtering. The raw signal, available to everyone at once. That part is real.&lt;/p&gt;

&lt;p&gt;Block's numbers back the confidence, at least on paper. Gross profit at $2.87 billion in Q4, up 24% year over year.&lt;/p&gt;

&lt;p&gt;But here's where it cracks. Current and former Block employees told The Guardian that roughly &lt;strong&gt;95% of AI-generated code&lt;/strong&gt; at Block still needs human modification. The "world model" Dorsey describes (a real-time intelligence layer that replaces the entire management chain) is aspirational, not operational. He says so himself in the essay: Block is "in the early stages" and "parts of it will likely break before they work."&lt;/p&gt;

&lt;p&gt;Solving the bandwidth problem is not the same as solving the management problem. Bandwidth was the bottleneck. Management was the response. Remove the bottleneck and you still need the response, just in a different shape.&lt;/p&gt;

&lt;h2&gt;
  
  
  Management Doesn't Disappear. It Compresses.
&lt;/h2&gt;

&lt;p&gt;I manage roughly a dozen agents across my e-commerce pipeline. Order processing, product feeds, content updates, monitoring. A year ago these were tasks I did myself. Now the agents do them. And I spend my mornings doing what any manager does: checking the work, deciding what's next, fixing what broke.&lt;/p&gt;

&lt;p&gt;The job didn't go away. &lt;strong&gt;The ratio changed&lt;/strong&gt;. Instead of 4,000 managers for 10,000 employees, you might need 40 managers for 6,000 employees plus N agents. The span of control goes from 1:5 to 1:150. That's compression. Fewer managers, radically more scope per manager, and a completely different toolkit.&lt;/p&gt;

&lt;p&gt;The shift underneath is from what I call &lt;strong&gt;conversational management&lt;/strong&gt; to &lt;strong&gt;contractual management&lt;/strong&gt;. A traditional manager gives verbal instructions and adjusts in real time. One-on-ones, standups, "can you hop on a quick call." The feedback loop is human-speed, high-bandwidth, low-formalization. It works because humans on the receiving end can infer intent, read tone, fill in gaps.&lt;/p&gt;

&lt;p&gt;Agents can't do any of that. You can't give an agent a vague directive and expect it to "figure it out." (Well, you can. That's how you get deleted report templates.) You have to write it down. Formally. With explicit constraints, integrity clauses, expected outputs, and failure modes. You have to write a contract. The agent doesn't guess what you want. It executes what you wrote. And if what you wrote is vague, the output will be creative in ways you didn't authorize. 😅&lt;/p&gt;

&lt;p&gt;That's literally what a CLAUDE.md file is. Or an AGENTS.md. Or a &lt;strong&gt;Prompt Contract&lt;/strong&gt;. It's a formalized agreement between a human and a machine about what should happen, what should never happen, and how to verify the difference. I built &lt;a href="https://rentierdigital.xyz/blog/i-stopped-vibe-coding-and-started-prompt-contracts-claude-code-went-from-gambling-to-shipping" rel="noopener noreferrer"&gt;the full Prompt Contracts framework&lt;/a&gt; after enough of these disasters. And the punchline is almost disappointing: it's management, written down instead of spoken.&lt;/p&gt;

&lt;p&gt;Karpathy landed on the exact same pattern from a completely different angle last week. His "LLM Wiki" gist proposes a system for building knowledge bases where the rules are stored in a schema file the human owns and the LLM follows. Same idea. Same place where the work lives. Different domain. (More on this in a minute, I'm building the whole playbook around it.)&lt;/p&gt;

&lt;p&gt;HBR named the role back in February. Their article "To Thrive in the AI Era, Companies Need Agent Managers" profiles Zach Stauber at Salesforce, whose actual job title is "support agent manager." He manages a fleet of AI agents on Agentforce. His routine, in his own words: dashboards, scorecards, agent observability. He watches agents work. He catches when they drift. He retrains them when they break. He handles what they can't. Karen from Accounting would kill for that job description (finally, someone who doesn't argue back during reviews).&lt;/p&gt;

&lt;p&gt;So you have me running a solo pipeline. Karpathy designing a knowledge system. Salesforce paying a salary for the role. Three completely different contexts, same conclusion: someone writes the rules, watches the output, and fixes what breaks. That's management. The title changed. The org chart collapsed. The work didn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Delegate vs What I Keep
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Ftwo-column-diagram-left-quot-conversational-management-quot-716eda5c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Frentierdigital.xyz%2Fblog-images%2Ftwo-column-diagram-left-quot-conversational-management-quot-716eda5c.png" alt='Two-column diagram — left "Conversational Management" (verbal instructions, 1:1s, human feedback loop, ratio 1:5) vs right "Contractual Management" (formalized specs, logs/dashboards, audit outputs, ratio 1:150). Center arrow: "Same job. Different species."' width="768" height="553"&gt;&lt;/a&gt;&lt;br&gt;Management Styles Comparison
  &lt;/p&gt;

&lt;p&gt;A year ago, my daily routine looked like this: wake up, open the laptop, write code for two hours, deploy something, test it, find a bug, fix the bug, introduce a new bug, fix that one too, update the product feed from the distributor CSV, check the partner API for changes, verify the WooCommerce storefront isn't showing ghost products, respond to three Threads messages, realize it's 2pm and I haven't eaten. Every task was mine. The cognitive load was mine. The interruptions were mine. If something broke at midnight, that was also mine.&lt;/p&gt;

&lt;p&gt;Now I delegate most of that. And by "delegate" I don't mean "occasionally ask an AI to help." I mean the agents own entire workflows, end to end. Overnight order processing. CSV ingestion and validation. Monitoring. Link checking. Boilerplate deployments. The bookkeeping of running a pipeline. Agents handle it while I'm at the pool with the kids, or eating shrimp on some island, or (more realistically) sleeping.&lt;/p&gt;

&lt;p&gt;But here's the line I don't cross.&lt;/p&gt;

&lt;p&gt;I don't delegate deciding &lt;strong&gt;what to build next&lt;/strong&gt;. An agent will happily execute whatever you tell it to, including things that are strategically idiotic. Direction is a human job. It stays human.&lt;/p&gt;

&lt;p&gt;I don't delegate &lt;strong&gt;quality control&lt;/strong&gt;. I read the logs every morning. Not because I enjoy it (nobody enjoys logs) but because agents report "done" when they mean "I did something and didn't error out." Those are very different statements.&lt;/p&gt;

&lt;p&gt;I don't delegate &lt;strong&gt;architecture decisions&lt;/strong&gt;. When my pipeline needs a new integration, the agent doesn't decide how it fits into the existing system. That's still me.&lt;/p&gt;

&lt;p&gt;And I especially don't delegate writing the contracts themselves. The CLAUDE.md. The integrity clauses ("never delete without backup," "never mark done without verification"). The workflow definitions. That's the management layer. The one thing an agent cannot do is define its own rules and then honestly evaluate whether it followed them.&lt;/p&gt;

&lt;p&gt;Now, I know the flat-org crowd is already typing. Spotify tried killing hierarchy with squads and guilds. Zappos went all-in on holacracy. Valve did the no-managers thing for years and everyone just wheeled their desk to the coolest project. They all, quietly and with some embarrassment, brought layers back. Because "nobody decides" is a decision, and it's usually the wrong one. The bet Dorsey is making is that AI changes the equation enough to make it work where humans alone couldn't. Maybe it does. Maybe 40 managers with AI backing can coordinate what 4,000 did without it. But "maybe" is doing a lot of heavy lifting in a sentence that already cost 4,000 people their job.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Playbook: Use Your LLM as a Knowledge Base Manager
&lt;/h2&gt;

&lt;p&gt;The pattern scales beyond solo dev. And it starts with a change in how you think about the LLM itself.&lt;/p&gt;

&lt;p&gt;Most teams use AI the same way every day. Open a chat, ask a question, get an answer, close the tab. Tomorrow, start over. The LLM rediscovers everything from scratch each time. Nothing accumulates. Ask a question that requires cross-referencing five documents and the model has to find and piece together the fragments, every single time. It's the brilliant intern who shows up Monday with no memory of Friday.&lt;/p&gt;

&lt;p&gt;The alternative (inspired by Karpathy's LLM Wiki approach) is to stop treating the LLM as a chatbot and start treating it as a &lt;strong&gt;knowledge base manager&lt;/strong&gt;. You feed it raw material. It builds a persistent, structured wiki out of it. It reads your sources, synthesizes them into interlinked pages, maintains an index, and keeps the whole thing consistent over time. The knowledge compounds. Every new source makes the wiki smarter. Every question gets answered faster than the last because the thinking already happened during compilation, not at query time.&lt;/p&gt;

&lt;p&gt;That's a fundamentally different relationship with the tool. The LLM stops being a clever autocomplete and starts being something closer to a librarian who actually read the books. And like any employee doing knowledge work, it needs rules to follow, sources to trust, and someone checking it's not quietly making things up.&lt;/p&gt;

&lt;p&gt;Here's how it works for a team of five or a department of fifty.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Give each team a &lt;code&gt;raw/&lt;/code&gt; folder
&lt;/h3&gt;

&lt;p&gt;This is where source material goes. Meeting notes, specs, post-mortems, customer feedback, API docs, whatever the team produces or consumes. No formatting required. Just dump the files. The agents handle the rest.&lt;/p&gt;

&lt;p&gt;(Yes, Dave from Engineering will dump his entire Downloads folder in there. Let him.)&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Let agents compile a wiki from those sources
&lt;/h3&gt;

&lt;p&gt;The LLM reads everything in &lt;code&gt;raw/&lt;/code&gt;, synthesizes it into structured markdown pages with backlinks and an index. Not a chatbot that answers and forgets. An actual persistent wiki that grows every time you add a source. You'll watch it being built and feel weird about it. That feeling fades after a week.&lt;/p&gt;
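&lt;p&gt;The compile step, reduced to its skeleton. Everything here is an assumption for illustration: the file layout, the page format, and the &lt;code&gt;synthesize&lt;/code&gt; callable standing in for the actual LLM call:&lt;/p&gt;

```python
# Hedged sketch of the compile step. `synthesize` is any callable that takes
# source text and returns markdown; in a real pipeline it would wrap an LLM call.
from pathlib import Path

def compile_wiki(raw_dir, wiki_dir, synthesize):
    """Turn each file in raw/ into a wiki page and regenerate the index."""
    raw, wiki = Path(raw_dir), Path(wiki_dir)
    wiki.mkdir(parents=True, exist_ok=True)
    pages = []
    for src in sorted(raw.glob("*")):
        if not src.is_file():
            continue
        body = synthesize(src.read_text(errors="ignore"))
        page = wiki / (src.stem + ".md")
        # Every page keeps a pointer back to its source, so lint can verify it.
        page.write_text(f"# {src.stem}\n\n{body}\n\nsource: {src.name}\n")
        pages.append(page.name)
    index = "\n".join(f"- {name}" for name in pages)
    (wiki / "index.md").write_text("# Index\n\n" + index + "\n")
    return pages
```

&lt;p&gt;The design point is the injection: the agent owns &lt;code&gt;synthesize&lt;/code&gt;, the human owns everything around it, which is exactly the split the schema in the next step formalizes.&lt;/p&gt;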

&lt;h3&gt;
  
  
  Step 3: Write a schema
&lt;/h3&gt;

&lt;p&gt;This is where the management work actually lives. A CLAUDE.md or AGENTS.md that tells the agent how to ingest sources, how to structure pages, what consistency rules to enforce, when to flag a human. Example clauses: "Never merge two customers into one page without confirmation." "Every claim links back to its source file." "Run a lint pass after every ingest and log inconsistencies." Step 3 sounds boring. Step 3 is the entire job.&lt;/p&gt;
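&lt;p&gt;A sketch of how those clauses might sit in the file (the headings and wording are illustrative, not a standard):&lt;/p&gt;

```markdown
# AGENTS.md — wiki schema (owner: a named human, not an agent)

## Ingest rules
- Read only from raw/. Never edit files in raw/.
- Every claim on a wiki page links back to its source file.
- Never merge two customers into one page without human confirmation.

## Structure rules
- One page per topic, lowercase-hyphenated filenames.
- index.md lists every page; regenerate it after each ingest.

## Lint rules
- Run a lint pass after every ingest; log inconsistencies, do not auto-fix.
- Flag a human when two pages contradict each other.
```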

&lt;h3&gt;
  
  
  Step 4: Lint regularly
&lt;/h3&gt;

&lt;p&gt;Schedule health checks. The agent scans the wiki for contradictions, outdated info, broken source references, gaps where a topic is mentioned but never explained. It logs everything. You read the lint report the same way I read my morning logs. Ninety percent is fine. The ten percent that isn't is where the human earns the salary.&lt;/p&gt;
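&lt;p&gt;A minimal lint pass might look like this, assuming a wiki of flat markdown pages with &lt;code&gt;[[wiki-links]]&lt;/code&gt; and &lt;code&gt;source:&lt;/code&gt; lines (both assumptions of mine, not Karpathy's spec):&lt;/p&gt;

```python
# Hedged sketch: two cheap checks from the lint report — dangling wiki-links
# and source references that no longer exist in raw/.
import re
from pathlib import Path

LINK_RE = re.compile(r"\[\[([a-z0-9-]+)\]\]")
SOURCE_RE = re.compile(r"source:\s*(\S+)")

def lint_wiki(wiki_dir, raw_dir):
    """Return human-readable problems: dangling links, missing source files."""
    wiki, raw = Path(wiki_dir), Path(raw_dir)
    pages = {p.stem for p in wiki.glob("*.md")}
    problems = []
    for page in wiki.glob("*.md"):
        text = page.read_text()
        for target in LINK_RE.findall(text):
            if target not in pages:
                problems.append(f"{page.name}: dangling link to '{target}'")
        for src in SOURCE_RE.findall(text):
            if not (raw / src).exists():
                problems.append(f"{page.name}: missing source file '{src}'")
    return problems
```

&lt;p&gt;Contradiction detection needs the model; these two checks don't, which is why they belong in a scheduled script rather than a prompt.&lt;/p&gt;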

&lt;h3&gt;
  
  
  Step 5: Query, don't search
&lt;/h3&gt;

&lt;p&gt;Team members ask the wiki questions in natural language. "What did we decide about the pricing change in March?" "What's our current policy on refund disputes?" The wiki answers from compiled knowledge instead of re-reading every raw file from scratch each time you ask. The wiki already did the thinking. The answer is just retrieval.&lt;/p&gt;

&lt;p&gt;Now give each team a wiki. Give each wiki a schema. Give each schema an owner. That owner is the agent manager. They don't write the wiki pages. They write the rules the agents follow when writing them. They review the lint reports. They update the schema when the business changes. One person per team, maybe one per three teams if the domains overlap.&lt;/p&gt;

&lt;p&gt;The "world model" Dorsey describes in his essay is basically this at company scale. Every team's wiki feeds into a unified intelligence layer. Instead of managers routing information up the chain (with all the latency and distortion), the wikis talk to each other through the model. An engineer's wiki knows what the sales wiki knows. The CEO queries the whole thing directly instead of waiting for a PowerPoint to crawl up five levels of hierarchy.&lt;/p&gt;

&lt;p&gt;Elegant on paper. In practice, somebody still has to maintain each schema, curate each source layer, and catch it when the engineering wiki starts contradicting the compliance wiki. That's not an AI problem. That's a judgment problem. And judgment is still paid in salaries, not tokens.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Job Description Fits on One Line
&lt;/h2&gt;

&lt;p&gt;Dorsey fired 4,000 managers. He's going to need to hire a different kind. Fewer of them. Probably better paid. Their entire job description fits on one line: write the contracts the machines respect.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;p&gt;Jack Dorsey and Roelof Botha, "From Hierarchy to Intelligence," block.xyz / sequoiacap.com, March 31, 2026.&lt;/p&gt;

&lt;p&gt;Suraj Srinivasan and Vivienne Wei, "To Thrive in the AI Era, Companies Need Agent Managers," Harvard Business Review, February 12, 2026.&lt;/p&gt;

&lt;p&gt;Andrej Karpathy, "LLM Wiki," GitHub Gist, April 4, 2026.&lt;/p&gt;

&lt;p&gt;Block employee accounts via The Guardian, February-March 2026.&lt;/p&gt;

&lt;p&gt;(*) The cover is AI-generated. The manager it depicts has a better morning routine than I do, and approximately the same number of direct reports.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>technology</category>
      <category>aiagents</category>
      <category>management</category>
    </item>
  </channel>
</rss>
