<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mladen Stepanić</title>
    <description>The latest articles on Forem by Mladen Stepanić (@crawleyprint_71).</description>
    <link>https://forem.com/crawleyprint_71</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F226731%2Ff2bfeea3-c45e-4ea5-8059-0845c68de321.jpg</url>
      <title>Forem: Mladen Stepanić</title>
      <link>https://forem.com/crawleyprint_71</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/crawleyprint_71"/>
    <language>en</language>
    <item>
      <title>Remote Slop with Claude Code</title>
      <dc:creator>Mladen Stepanić</dc:creator>
      <pubDate>Fri, 20 Mar 2026 21:42:38 +0000</pubDate>
      <link>https://forem.com/crawleyprint_71/remote-slop-with-claude-code-329c</link>
      <guid>https://forem.com/crawleyprint_71/remote-slop-with-claude-code-329c</guid>
      <description>&lt;p&gt;In my &lt;a href="https://dev.to/crawleyprint_71/workflow-engineering-prompt-engineering-3pep"&gt;last post on agentic workflows&lt;/a&gt;, I talked about workflow engineering — how the pipeline you design around AI matters more than the AI itself. I built a book inventory app that way. Skills, OpenSpecs, beads, parallel agents in tmux. It worked great.&lt;/p&gt;

&lt;p&gt;But there was an asterisk I didn't mention: I was sitting at my desk the whole time.&lt;/p&gt;

&lt;p&gt;What happens when you're not?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;Claude Code recently got a Telegram integration. The idea is simple: you control Claude Code from your phone through a Telegram bot. Same skills, same workflow, same project — different interface. If you've used the plugin system at all, setup is straightforward. Documentation walks you through it, you're chatting with your agent in minutes.&lt;/p&gt;

&lt;p&gt;In fact, as of today, Anthropic officially shipped this as &lt;a href="https://code.claude.com/docs/en/channels" rel="noopener noreferrer"&gt;Claude Code Channels&lt;/a&gt; — a plugin-based feature that lets you push messages from &lt;a href="https://telegram.org" rel="noopener noreferrer"&gt;Telegram&lt;/a&gt; or &lt;a href="https://discord.com" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; into a running Claude Code session on your machine. Your session processes the request with full filesystem, MCP, and git access, then replies through the same chat. It's built on MCP, which means it slots into the existing plugin ecosystem cleanly. I ran my experiment the day it launched, so what you're reading is a day-one field report.&lt;/p&gt;

&lt;p&gt;I wanted to test this properly, so I gave it a real task. Not a toy. A refactoring session on an existing codebase, the kind of thing I'd normally spend a focused afternoon on. Except today, I wasn't at my desk and I wasn't focused. I was doing something else, checking Telegram between other things.&lt;/p&gt;

&lt;p&gt;Buckle up.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Permission Wall
&lt;/h2&gt;

&lt;p&gt;Immediately: a wall.&lt;/p&gt;

&lt;p&gt;Claude Code has a permission system. It asks before it does anything potentially destructive — file writes, shell commands, external calls. At your desk, this is fine. You see the prompt, you approve, you move on.&lt;/p&gt;

&lt;p&gt;From Telegram? The bot doesn't forward those permission prompts. Your agent hits a permission check, and it just... stops, silently. You're staring at Telegram wondering why nothing is happening, and the answer is that Claude is staring at your terminal wondering why you're not approving anything.&lt;/p&gt;

&lt;p&gt;This is the first thing you'll hit, and there's no elegant workaround. Either you solve the permission problem or you don't use this workflow.&lt;/p&gt;
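&lt;p&gt;One partial mitigation worth knowing about: Claude Code can pre-approve specific tools through its settings file, so routine reads, edits, and known-safe commands never prompt at all. A rough sketch of the idea (the patterns below are illustrative, so check the current docs before copying):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.claude/settings.json:

{
  "permissions": {
    "allow": [
      "Read",
      "Edit",
      "Bash(git commit:*)",
      "Bash(bun test:*)"
    ],
    "deny": [
      "Bash(rm:*)"
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It helps, but an allow-list can't anticipate everything a full day of unsupervised work will want to run.&lt;/p&gt;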

&lt;h2&gt;
  
  
  The Context Ceiling
&lt;/h2&gt;

&lt;p&gt;Second problem: context. Opus ships with a 1M token context window, which sounds like a lot. And it is — for a focused session. But "entire day, away from your machine, no way to reset" is a different budget. You can't &lt;code&gt;/clear&lt;/code&gt; the session from Telegram. If the conversation gets heavy, you can't start fresh. You're stuck with whatever context you've accumulated.&lt;/p&gt;

&lt;p&gt;For a day of casual back-and-forth this turned out to be manageable. But it's something you have to plan around, not something you can ignore.&lt;/p&gt;

&lt;h2&gt;
  
  
  Biting the Bullet
&lt;/h2&gt;

&lt;p&gt;So I did the thing you're not supposed to do: &lt;code&gt;--dangerously-skip-permissions&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I know. The flag name exists for a reason. But here's my reasoning: I wasn't installing new external dependencies. My Claude Code workflow is already scoped — skills are loaded, the project is defined, the agent knows its boundaries.&lt;/p&gt;
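&lt;p&gt;For the record, the launch is a single flag on the CLI. Run it only inside a project you can afford to lose:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;claude --dangerously-skip-permissions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;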

&lt;p&gt;And it worked. The permission wall disappeared. The agent could actually run.&lt;/p&gt;

&lt;p&gt;Not great, not terrible. I'm still not comfortable disabling guardrails as a general practice. For this specific experiment, with this specific setup, it was a calculated risk. Use at your own discretion. Or better — don't. Anyway, don't blame me if Claude decides your &lt;code&gt;main&lt;/code&gt; branch needs to have a different history and you don't have branch protection in place. You've been warned.&lt;/p&gt;

&lt;h2&gt;
  
  
  Saving Context With Agent Teams
&lt;/h2&gt;

&lt;p&gt;The context problem needed a different solution. If I can't clear the session, I need to use less of it.&lt;/p&gt;

&lt;p&gt;This is where the workflow from my previous post paid off. If you haven't read it: I use a pipeline where work gets decomposed into &lt;em&gt;beads&lt;/em&gt; — small, focused tasks based on Steve Yegge's &lt;a href="https://github.com/steveyegge/beads" rel="noopener noreferrer"&gt;beads concept&lt;/a&gt;. Each bead is scoped tightly enough that a sub-agent can pick it up and run with it without needing the full conversation history.&lt;/p&gt;

&lt;p&gt;So I instructed the main agent to delegate aggressively. The pipeline looked like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;I describe what I want via Telegram&lt;/li&gt;
&lt;li&gt;Main agent runs a spec creator sub-agent that generates an OpenSpec (a structured definition of the change) and opens a draft PR on GitHub&lt;/li&gt;
&lt;li&gt;I review the spec from my phone and approve&lt;/li&gt;
&lt;li&gt;Spec gets decomposed into beads&lt;/li&gt;
&lt;li&gt;Sub-agents pick up beads, implement, commit, and update the PR &lt;/li&gt;
&lt;/ol&gt;
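&lt;p&gt;The standing instruction to the main agent was essentially a delegation policy. Paraphrased from memory, not the literal prompt:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;you are a coordinator, not an implementer
- for every request, spawn a spec creator sub-agent first
- once I approve a spec, decompose it into beads
- hand each bead to a fresh sub-agent; never implement in the main session
- report back only with a spec link, a PR link, or a blocker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;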

&lt;p&gt;The main session barely touched any implementation detail. It just coordinated. By the end of the day, I'd used about 30% of the context window. Granted, I wasn't going crazy with requests — but the pattern held up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Claude Code on the Web?
&lt;/h2&gt;

&lt;p&gt;Fair question. Claude Code has a web interface now. I could've used that from my phone. No Telegram bot, no permission hacks.&lt;/p&gt;

&lt;p&gt;The answer is boring: my tools. My skills are loaded locally. My workflow is configured. I have Playwright set up for visual verification — I literally had the agent screenshot pages to confirm layout changes actually landed. That's not something you get from the web interface.&lt;/p&gt;

&lt;p&gt;When you've invested in a workshop, you want to use it. Even from a distance.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Slow Loop
&lt;/h2&gt;

&lt;p&gt;Here's the real downside, and it's not about permissions or context.&lt;/p&gt;

&lt;p&gt;The feedback loop is slow.&lt;/p&gt;

&lt;p&gt;At my desk, I have hot reload. I change something, I see it instantly. From Telegram, the loop is: agent pushes to GitHub → Vercel builds a preview → I check the preview on my phone. That's minutes, not milliseconds. For layout work especially, it's painful. You're doing the development equivalent of texting someone in the next room instead of just talking to them.&lt;/p&gt;

&lt;p&gt;I could live with it because I was mostly multitasking — checking in on the agent between other things. But if this were my primary way of working? The latency would get to me.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;I had Claude Code analyze the session after the fact. Here's what a day of remote agent work actually looks like:&lt;/p&gt;

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Total tool calls&lt;/td&gt;
&lt;td&gt;507&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agents spawned&lt;/td&gt;
&lt;td&gt;22 across 5 teams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PRs created / merged&lt;/td&gt;
&lt;td&gt;5 / 5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Telegram messages&lt;/td&gt;
&lt;td&gt;34&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Playwright interactions&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenSpec tasks verified&lt;/td&gt;
&lt;td&gt;76&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Agents
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;QA&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spec Writer&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Explorer&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reviewers&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standalone&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Features
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Crop inset fix&lt;/td&gt;
&lt;td&gt;merged&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Remove impressum crop&lt;/td&gt;
&lt;td&gt;merged&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scrollable mapping&lt;/td&gt;
&lt;td&gt;merged&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reading status&lt;/td&gt;
&lt;td&gt;in review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenSpec cleanup (3 lists)&lt;/td&gt;
&lt;td&gt;76 tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;507 tool calls, 22 agents, 34 Telegram messages, 5 PRs. From my phone.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Verdict
&lt;/h2&gt;

&lt;p&gt;At the end of the day, I sat down and did what I always do: reviewed the code myself. Read every change, checked every decision. And it was fine. A few minor optimizations I would've caught in real-time at my desk, but nothing structural. Nothing that made me regret the experiment.&lt;/p&gt;

&lt;p&gt;So here's the honest scorecard:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What works:&lt;/strong&gt; Your full workflow, from Telegram. Skills, beads, agent teams, even Playwright screenshots. If you've built a good pipeline, it travels.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What doesn't:&lt;/strong&gt; Permission prompts don't reach you. Context can't be reset. The feedback loop trades seconds for minutes. And you'll probably end up running with &lt;code&gt;--dangerously-skip-permissions&lt;/code&gt;, which is exactly as comfortable as it sounds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who is this for:&lt;/strong&gt; Someone who's already set up their Claude Code workflow and wants to keep things moving while away from their desk. Not as a primary dev environment — as an extension of one you already trust.&lt;/p&gt;

&lt;p&gt;Would I do it again? Yeah, probably. But I'd plan the work differently. Bigger, well-defined refactorings that don't need rapid visual feedback. The kind of work where you can fire and forget, then review later. Not pixel-pushing. Not exploratory coding. Structured changes through a structured pipeline.&lt;/p&gt;

&lt;p&gt;Remote slop? A little. But manageable slop, with a review step at the end. And sometimes that's enough.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>agents</category>
    </item>
    <item>
      <title>Stop Repeating Yourself. Stop Repeating Yourself. No, Seriously — Put It in a Skill.</title>
      <dc:creator>Mladen Stepanić</dc:creator>
      <pubDate>Sat, 07 Mar 2026 14:24:11 +0000</pubDate>
      <link>https://forem.com/crawleyprint_71/stop-repeating-yourself-stop-repeating-yourself-no-seriously-put-it-in-a-skill-4gha</link>
      <guid>https://forem.com/crawleyprint_71/stop-repeating-yourself-stop-repeating-yourself-no-seriously-put-it-in-a-skill-4gha</guid>
      <description>&lt;p&gt;In my &lt;a href="https://dev.to/crawleyprint_71/workflow-engineering-prompt-engineering-3pep"&gt;last post&lt;/a&gt; I talked about how the workflow is the work, how designing the pipeline matters more than any individual prompt. I still believe that. But I've now hit the next layer of the onion: what happens when the pipeline itself becomes repetitive?&lt;/p&gt;

&lt;p&gt;I've been using Claude Code across five projects. A React + Hono book inventory app, a legacy .NET modernization, a Swift macOS menu bar utility, a Rust audio DAW, and a Rust TUI for task visualization. Different languages, different domains, apparently very similar habits. And I didn't notice until I asked.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Prompt That Started It
&lt;/h2&gt;

&lt;p&gt;Credit where it's due. &lt;a href="https://x.com/chintanturakhia/status/2030089465679728763" rel="noopener noreferrer"&gt;Chintan Turakhia posted a tweet&lt;/a&gt; saying "Run this prompt frequently. You're welcome." alongside a screenshot of a Claude Code prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scrape all of my claude sessions on this computer. give me a breakdown
of all the things i do, things that are worth making into skills vs
plugins vs agents vs claude.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So I did. My version had typos, his didn't, but the idea was the same: ask Claude Code to introspect on itself. Read through every session I'd ever had and find the patterns I couldn't see because I was too close to them.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Claude Found
&lt;/h2&gt;

&lt;p&gt;Claude spawned three subagents in parallel. One explored my global config and plugin structure. Another crawled across all my repos reading every &lt;code&gt;CLAUDE.md&lt;/code&gt; and &lt;code&gt;AGENTS.md&lt;/code&gt; file. The third dug into specific project architectures and specs. In about a minute, they'd collectively mapped my entire Claude Code ecosystem.&lt;/p&gt;

&lt;p&gt;The findings were equal parts validating and embarrassing.&lt;/p&gt;

&lt;p&gt;Validating because I clearly had strong workflows. I'd organically developed a &lt;a href="https://dev.to/crawleyprint_71/workflow-engineering-prompt-engineering-3pep"&gt;feature planning pipeline&lt;/a&gt;: openspec proposal, beads decomposition, worktree, implement, merge, clean up. I had a sophisticated agent team pattern with typed agents and context-aware respawning. I had a review pipeline with dedicated expert agents for security, architecture, and code quality. Real workflows. Stuff that worked.&lt;/p&gt;

&lt;p&gt;Embarrassing because I was re-stating the same ground rules in virtually every session. Never use Python, always use Bun, use Claude Code's native tools instead of sed and awk, agents must commit frequently but not step on each other, all features start with an openspec. These rules were scattered across &lt;code&gt;AGENTS.md&lt;/code&gt; files in every project, re-stated inline whenever Claude forgot, and occasionally contradicting each other between repos. I was spending real tokens and real time repeating myself to an LLM. The irony of a "workflow engineer" with a messy workshop is not lost on me.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pivot
&lt;/h2&gt;

&lt;p&gt;I looked at the analysis and immediately decided to act on it. Same session, no break:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;let's do the following:
- add recommended things to global claude.md and remove them from projects
- create a new project in ~/repos/ to host all of the recommended plugins
- implement openspec-and-beads skill
- implement agent-team skill
- write a comprehensive readme doc with implemented skills and a list of todo items

Do not connect any of the new skills, I'll make a github repo and publish them there
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where the session pivoted from analysis to execution. And here's the thing: this pivot itself followed the pattern Claude had just identified. Analysis, planning, execution. The openspec-and-beads workflow, applied to itself. Recursive in the best possible way.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Got Built
&lt;/h2&gt;

&lt;p&gt;Claude entered plan mode, loaded two meta-skills, &lt;code&gt;executing-plans&lt;/code&gt; and &lt;code&gt;writing-skills&lt;/code&gt;, and broke the work into three batches.&lt;/p&gt;

&lt;p&gt;First batch was the lowest-hanging fruit: creating a global &lt;code&gt;~/.claude/CLAUDE.md&lt;/code&gt; with all my universal conventions, then surgically removing the duplicated rules from each project's config. Every repo got trimmed to only project-specific information. One file, in one place, read by every future Claude session. No more "never use Python" on repeat.&lt;/p&gt;
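&lt;p&gt;To give you a feel for it, the global file is just a short list of conventions. A condensed, illustrative excerpt (not my literal config):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ~/.claude/CLAUDE.md (excerpt)

- Never use Python for scripting; use Bun.
- Prefer Claude Code's native file tools over sed and awk.
- Commit at every logical checkpoint.
- One agent per file; agents must not step on each other.
- Every feature starts with an openspec proposal.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;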

&lt;p&gt;Second batch was openspec integration. Instead of embedding massive documentation inline in every project, each repo got a one-liner pointing to the new skill.&lt;/p&gt;

&lt;p&gt;Third batch was the main event: the &lt;code&gt;claude-skills&lt;/code&gt; repository with three complete skills.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;openspec-and-beads&lt;/strong&gt; skill formalized my most-used workflow: gather project context, scaffold a change proposal with motivation and delta specs, decompose into prioritized beads linked to an epic, track during implementation, archive when done. Before this existed as a skill, I was re-explaining the concept every time I started a new feature. Now it's a single invocation.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;agent-team&lt;/strong&gt; skill captured something more subtle: the coordination patterns for running multiple agents in parallel. The key insight was the "cover agent" pattern: when an agent hits roughly 50% of its context window, you let it finish its current task, create a new bead for the remaining work, and respawn a fresh agent to pick it up. The skill also codifies rules I'd learned the hard way. One agent per file to avoid merge conflicts. Commit at every logical checkpoint. Shut down in reverse dependency order.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;use-bun&lt;/strong&gt; skill was just a quick reference card. A lookup table for Bun equivalents of common Node/npm patterns. &lt;code&gt;bunx tsc --noEmit&lt;/code&gt; instead of &lt;code&gt;npx tsc&lt;/code&gt;, &lt;code&gt;Bun.file()&lt;/code&gt; instead of &lt;code&gt;fs.readFile&lt;/code&gt;. Tiny, but it eliminated a whole class of "how do I do X with Bun again?" questions.&lt;/p&gt;
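&lt;p&gt;The whole skill is essentially a two-column list of substitutions, along these lines:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install              -&gt; bun install
npm run dev              -&gt; bun run dev
npx tsc --noEmit         -&gt; bunx tsc --noEmit
fs.readFile(path)        -&gt; Bun.file(path).text()
jest / vitest            -&gt; bun test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;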

&lt;h2&gt;
  
  
  The TDD Override
&lt;/h2&gt;

&lt;p&gt;Here's a moment that stuck with me. The &lt;code&gt;writing-skills&lt;/code&gt; meta-skill recommended full TDD for new skills: write tests, watch them fail, implement. Claude's response was pragmatic: this content was already empirically validated across hundreds of messages and 240 sessions. The "tests" had already been run, organically, over weeks of real usage. It proceeded directly to writing the specification.&lt;/p&gt;

&lt;p&gt;That felt like the right call. TDD for a skill isn't the same as TDD for code. The validation had already happened in practice. The whole point of this exercise was to &lt;em&gt;capture&lt;/em&gt; what was already working, not to discover new behavior. Sometimes the best test suite is "I did this 50 times and it works."&lt;/p&gt;

&lt;h2&gt;
  
  
  The Structure
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://github.com/Crawleyprint/claude-skills" rel="noopener noreferrer"&gt;repo&lt;/a&gt; ended up looking like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~/repos/claude-skills/
├── README.md
├── plugin.json
├── marketplace.json
├── skills/
│   ├── openspec-and-beads/
│   │   └── SKILL.md
│   ├── agent-team/
│   │   └── SKILL.md
│   └── use-bun/
│       └── SKILL.md
└── plugins/
    └── README.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There was a brief moment of confusion about plugin manifests (&lt;code&gt;plugin.json&lt;/code&gt; vs &lt;code&gt;marketplace.json&lt;/code&gt;, and what each needed), but it resolved quickly. The repo was initialized, committed, and pushed to GitHub in the same session.&lt;/p&gt;
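&lt;p&gt;For anyone hitting the same confusion: as I understand it, &lt;code&gt;plugin.json&lt;/code&gt; describes a single plugin, while &lt;code&gt;marketplace.json&lt;/code&gt; indexes the plugins a repo offers for installation. The field names below are from memory and may not match the current schema, so verify against the docs before shipping:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plugin.json        { "name": "claude-skills", "description": "...", "version": "0.1.0" }
marketplace.json   { "name": "...", "plugins": [ { "name": "claude-skills", "source": "./" } ] }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;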

&lt;h2&gt;
  
  
  What This Actually Means
&lt;/h2&gt;

&lt;p&gt;Your LLM conversations are data. 240 sessions contained patterns I couldn't see because I was living inside them. Having Claude analyze its own session history was like running a profiler on your own workflow: the hot paths become obvious. If you haven't done this yet, do it. Chintan was right.&lt;/p&gt;

&lt;p&gt;Global config eliminates an entire class of wasted tokens. Every "never use Python" I typed was burning context and attention. A single &lt;code&gt;CLAUDE.md&lt;/code&gt; in &lt;code&gt;~/.claude/&lt;/code&gt; fixed that permanently. If you're using Claude Code across multiple projects, this is probably the highest-ROI thing you can do right now.&lt;/p&gt;

&lt;p&gt;Skills are just formalized habits. I didn't settle on the openspec-and-beads workflow during this session. I'd been doing it for weeks. The skill just wrote down what was already true. If you find yourself explaining the same process to Claude more than twice, it belongs in a skill. Not a prompt, not a CLAUDE.md entry, a skill. The distinction matters because skills carry context, structure, and sequencing that a flat config file can't.&lt;/p&gt;

&lt;p&gt;The "cover agent" pattern deserves to be a first-class feature. The idea of monitoring an agent's context utilization and strategically respawning it before it degrades, assigning the remaining work as a new task, is something I'm doing manually. In 2026. While Anthropic ships yet another benchmark blog post. The fact that I have to write a skill to manage context windows because the tool won't do it for me is... a choice. Anthropic, if you're reading this: please steal this idea. I'm begging you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The TODO list from this session is still open. A beads MCP plugin so agents can query and update task status natively instead of shelling out to &lt;code&gt;bd&lt;/code&gt; commands. A review pipeline skill for spawning expert agents and turning their findings into follow-up beads. A branch cleanup skill because the merge-delete-remove dance is identical every single time and I'm tired of typing it.&lt;/p&gt;

&lt;p&gt;The session that produced all of this took about 22 minutes. Three subagents, a lot of pattern recognition, and a pivot from "hmm, I wonder what I actually do" to shipping a plugin repo on GitHub. Not bad for a Saturday morning.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tooling</category>
      <category>workflow</category>
    </item>
    <item>
      <title>Workflow Engineering &gt; Prompt Engineering</title>
      <dc:creator>Mladen Stepanić</dc:creator>
      <pubDate>Sun, 01 Mar 2026 10:42:02 +0000</pubDate>
      <link>https://forem.com/crawleyprint_71/workflow-engineering-prompt-engineering-3pep</link>
      <guid>https://forem.com/crawleyprint_71/workflow-engineering-prompt-engineering-3pep</guid>
      <description>&lt;p&gt;... it's early 2026. Remember when I said AI is a tool? I still believe that. But I've been using Claude Code for a few months now and I need to update the nuance a bit: AI is a tool, but the way you set up the workshop matters more than the tool itself.&lt;/p&gt;

&lt;p&gt;I built a book inventory app. Nothing fancy, it tracks books across households, lets users invite others to share their collections. Hono on the backend, React with Vite on the frontend, Neon Postgres for the database, deployed on Vercel. A boring stack for a boring app. And I mean that as a compliment. Oh, and before you ask - no, it doesn't have any AI-powered features. No "smart recommendations," no "AI-curated reading lists." It's a CRUD app. It stores books. The irony is not lost on me.&lt;/p&gt;

&lt;p&gt;But the way I built it? That part wasn't boring at all.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Codex Chapter
&lt;/h2&gt;

&lt;p&gt;I started with Codex. I had high hopes. Same skills loaded, same setup, same project. Progress was slow. Not because Codex is bad, it's not, but because the ergonomics didn't work for me. Worktrees felt awkward. Parallel agents were running but I couldn't see what they were doing well enough to react in time. I was spending more energy managing the tool than building the app. That said, this could easily be a skill issue on my part. Codex might click better for someone with a different workflow or habits, I'm not here to tell you it's a bad tool. It just wasn't the right fit for how I work.&lt;/p&gt;

&lt;p&gt;So I switched to Claude Code. And things exploded.&lt;/p&gt;

&lt;p&gt;Same skills. Same project. Different interface. Claude Code's TUI let me see parallel agents running in tmux, let me react fast, let me stay in the flow. That's it. That's the difference. Not smarter AI, not better models, better ergonomics. If you take one thing from this post: capability is table stakes. The interface determines your productivity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Workflow Is the Skill
&lt;/h2&gt;

&lt;p&gt;Here's where it gets interesting. I didn't just open Claude Code and say "build me a book inventory app." That's how you get a mess.&lt;/p&gt;

&lt;p&gt;Instead, I designed a pipeline. Every feature goes through the same stages:&lt;/p&gt;

&lt;p&gt;First, brainstorming. I use &lt;a href="https://github.com/obra/superpowers" rel="noopener noreferrer"&gt;obra's superpowers&lt;/a&gt; skill, specifically the brainstorming mode, for the planning phase. The quality of the definitions jumped noticeably compared to vanilla planning. The output here isn't code, it's clarity about what I'm building and why.&lt;/p&gt;

&lt;p&gt;Then, specification. The planning phase generates an &lt;a href="https://openspec.dev" rel="noopener noreferrer"&gt;OpenSpec&lt;/a&gt;, a structured definition of what needs to be built. Still no code.&lt;/p&gt;

&lt;p&gt;Then, decomposition. The spec gets broken into &lt;a href="https://github.com/steveyegge/beads" rel="noopener noreferrer"&gt;beads&lt;/a&gt; (if you haven't seen Steve Yegge's beads repo on GitHub, go look). Each bead is a tight, focused task. This is where the magic starts, because beads keep everything scoped. No context window bloat, no agents wandering off into tangents.&lt;/p&gt;

&lt;p&gt;Then, implementation. This is where I hand the OpenSpec and the beads to Claude Code and say: go. Parallel agents in tmux pick up the tasks and run.&lt;/p&gt;
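&lt;p&gt;If you want to poke at beads yourself, the &lt;code&gt;bd&lt;/code&gt; CLI drives it. From memory, the day-to-day loop looks roughly like the following; the exact subcommands and flags may have drifted, so trust &lt;code&gt;bd --help&lt;/code&gt; over this sketch:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bd create "Add invitation acceptance endpoint"   # file a new bead
bd ready                                         # list beads with no blockers
bd update ISSUE_ID --status in_progress          # claim one
bd close ISSUE_ID                                # mark it done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;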

&lt;h2&gt;
  
  
  The One-Shot That Wasn't
&lt;/h2&gt;

&lt;p&gt;Let me tell you about the household feature. Users living in the same space should have access to all books in that space. And I needed an invitation system so the household creator could invite others.&lt;/p&gt;

&lt;p&gt;I kicked off the pipeline. Brainstorming produced a clean definition. OpenSpec captured the full scope. Beads broke it into tasks. I handed it to Claude Code and the parallel agents basically one-shotted the whole thing, household concept and invitations, implemented and working.&lt;/p&gt;

&lt;p&gt;Sounds impressive, right?&lt;/p&gt;

&lt;p&gt;But was it really a one-shot? The implementation was, sure. But I front-loaded the intelligence into planning, specification, and decomposition. The "shot" landed cleanly because the planning was rigorous. Take away the pipeline and ask Claude Code to "implement household sharing with invitations" cold? You'll get something. Whether you'll get something good is another question.&lt;/p&gt;

&lt;p&gt;This mirrors how experienced developers actually work. Nobody good just starts coding a multi-faceted feature. You think it through, you break it down, then you execute. I just happened to have AI on both sides, doing the thinking and the executing. My job was designing the workflow between them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Layers of "Does It Actually Work?"
&lt;/h2&gt;

&lt;p&gt;I can hear the skeptics: "Sure, AI generated the code. But does it work?"&lt;/p&gt;

&lt;p&gt;Fair question. Here's my answer: TDD is in place from the start. Yes, through a skill. Tests exist before implementation, not as an afterthought.&lt;/p&gt;

&lt;p&gt;But tests only tell you the logic is correct. So I also use a Playwright skill with Chrome to watch actual end-to-end runs. I see the app doing what it's supposed to do. No manual clicking through screens, no "I think it works." I watch it work.&lt;/p&gt;

&lt;p&gt;And then, at the end of each meaningful session, I spawn dedicated reviewer agents, one for frontend, one for backend, one for security. Their findings go into a follow-up PR.&lt;/p&gt;

&lt;p&gt;Three layers: TDD catches logic errors. Playwright catches visual and integration errors. Reviewers catch architectural and security issues. None of them manual.&lt;/p&gt;

&lt;p&gt;And then there's the fourth layer: me. I do a thorough manual code review after all of this. AI catches a lot, but I still read the code myself. I need to understand what's in my codebase, I need to know why decisions were made, and I need to catch the things that automated tools miss. The subtle logic that's technically correct but wrong for the product, the naming that'll confuse me in three months, the architectural drift that no linter will flag. If you skip this step, you're not building software, you're accumulating code you don't own.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cross-Model Twist
&lt;/h2&gt;

&lt;p&gt;Here's something that might raise eyebrows: I've been experimenting with using Codex as my reviewer.&lt;/p&gt;

&lt;p&gt;Yes, that Codex. The one I moved away from for building. Turns out, Codex 5.3 with maxed out thinking produces genuinely valuable review feedback. And it makes sense when you think about it, review is a different cognitive task than generation. You're evaluating against criteria, not creating from scratch. Codex's deep thinking mode suits that well. The ergonomics that frustrated me during building don't matter for review because it's single-threaded, focused work.&lt;/p&gt;

&lt;p&gt;I'm not loyal to one tool. I'm assembling the best pipeline I can from whatever works. Claude Code plans, specs, decomposes, and implements. Codex reviews. Each playing to its strengths.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Actually Means
&lt;/h2&gt;

&lt;p&gt;In my last post I said AI is a tool and that you need to know what you're doing to use it well. I stand by that. But now I'd add: the emerging skill isn't coding, and it isn't prompting either. It's pipeline design.&lt;/p&gt;

&lt;p&gt;Knowing which skills to load, when to brainstorm vs. spec vs. decompose, when to run agents in parallel, when to bring in a different model entirely: that's the craft now. The code is the output. The workflow is the work.&lt;/p&gt;

&lt;p&gt;Will this change again in six months? Probably. But the principle won't: understand what you're building, design a process that keeps quality high, and use whatever tools make that process smooth. Boring? Maybe. But boring apps that work are what people actually need.&lt;/p&gt;

&lt;p&gt;And I still think I'll have a lifetime of work fixing vibe-coded messes. Some things don't change.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>tooling</category>
      <category>devex</category>
    </item>
    <item>
      <title>AI Won't Take Your Job (But Fear Might)</title>
      <dc:creator>Mladen Stepanić</dc:creator>
      <pubDate>Mon, 08 Dec 2025 15:10:50 +0000</pubDate>
      <link>https://forem.com/crawleyprint_71/ai-wont-take-your-job-but-fear-might-5e3o</link>
      <guid>https://forem.com/crawleyprint_71/ai-wont-take-your-job-but-fear-might-5e3o</guid>
      <description>&lt;p&gt;... it's the end of 2025. And the development world as we know it is at turning point. It's been at turning point a lot of times since I've started doing this some 16 years ago, and a lot more times before that. But this one time is special, this one time is different: AI threatens to replace us.&lt;/p&gt;

&lt;p&gt;Relax, I'm dramatic on purpose; we're not going anywhere. I'm writing this post mostly for younger folks, who I don't envy because they're in a position where they are threatened by, scared by, and forced into the AI bubble. People are losing their minds over whether AI will make them obsolete, and they're listening to false prophets who tell them they should be learning a real craft, a physical skill that won't be touched by AI any time soon. But what's the reality? The reality is that AI is a tool (last time I checked). A powerful tool, and it can be a great asset to a developer who knows what they're doing. If you don't know the basics: security, user experience, performance... no amount of AI will help you make a good app. That's the truth, and anyone telling you otherwise is likely fearmongering to inflate their own importance, or to sell you an AI-powered service or one of their dime-a-dozen courses. These people are not your friends!&lt;/p&gt;

&lt;p&gt;I'm not saying you need to avoid AI until it disappears. On the contrary, it's here to stay - just maybe not in the areas AI doom merchants want you to believe it will. You should definitely learn how to work &lt;strong&gt;with AI&lt;/strong&gt;. Large Language Models (LLMs) changed the game for me: I can get to a prototype faster, I can debug faster. I abstracted away boring multi-file edits, tests, and boilerplate, which are generally time-consuming. Do I work more? No. Do I output more? Yes, but probably not as much as my CEO would like me to. I use it to buy myself time to learn core stuff that I'm missing, stuff that AI may know but is unsure how to apply (or applies incorrectly). I'm improving my infrastructure and backend knowledge; I use AI to brainstorm and then have it explain the choices it made. Then I question its choices, which are often either overkill or outright wrong. And that happens more often than any of the doom merchants would like to admit.&lt;/p&gt;

&lt;p&gt;It's easy to pay $200+ a month and have Claude or any agent write the app for you, but what happens when you get a data breach and you don't understand the code well enough to fix it? When you can't explain to your users what went wrong because the AI made decisions you never questioned? Will you be able to sell your service when everyone can use that same $200+ subscription to build their own? Is that sustainable?&lt;/p&gt;

&lt;p&gt;I don't think so.&lt;/p&gt;

&lt;p&gt;Instead, people will still need well-thought-out software that they can use without the hassle of setting up 10 cloud services, or worrying about backups, availability, or disaster recovery.&lt;/p&gt;

&lt;p&gt;Will our lives change?&lt;br&gt;
You bet they will. Which way - that's up to you. You need to decide whether you'll chase the agent-of-the-day or use AI to produce boring solutions that actually work.&lt;/p&gt;

&lt;p&gt;Your call.&lt;/p&gt;

&lt;p&gt;I've got this figured out for myself. I think I'll have a lifetime of work fixing the vibe-coded messes false prophets will inevitably create. It'll likely be boring, but sometimes boring is good.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>opinion</category>
    </item>
  </channel>
</rss>
