<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Sergei Peleskov</title>
    <description>The latest articles on Forem by Sergei Peleskov (@greza_dev).</description>
    <link>https://forem.com/greza_dev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3879098%2F0b4cae54-3dc5-4228-82a2-965310ea816f.png</url>
      <title>Forem: Sergei Peleskov</title>
      <link>https://forem.com/greza_dev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/greza_dev"/>
    <language>en</language>
    <item>
      <title>I Let Claude Code Write 10 Features Without Reading Any Diffs. Here's What Broke.</title>
      <dc:creator>Sergei Peleskov</dc:creator>
      <pubDate>Fri, 24 Apr 2026 15:41:57 +0000</pubDate>
      <link>https://forem.com/greza_dev/i-let-claude-code-write-10-features-without-reading-any-diffs-heres-what-broke-4iko</link>
      <guid>https://forem.com/greza_dev/i-let-claude-code-write-10-features-without-reading-any-diffs-heres-what-broke-4iko</guid>
      <description>&lt;p&gt;I ran an experiment on a clean FastAPI template: 10 feature prompts to Claude Code in a row, accept every diff without reading, run every suggested command. No code review, no test runs in between.&lt;/p&gt;

&lt;p&gt;33 minutes later:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Test coverage: 90% → 0%&lt;/li&gt;
&lt;li&gt;Lines added: +607 across 15 files&lt;/li&gt;
&lt;li&gt;The code literally cannot be imported&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's the full breakdown, because I think most "AI coding killed my codebase" posts are vibes. This one is numbers.&lt;/p&gt;

&lt;h2&gt;The setup&lt;/h2&gt;

&lt;p&gt;I started with &lt;code&gt;tiangolo/full-stack-fastapi-template&lt;/code&gt;. Clean baseline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;60 tests, all passing&lt;/li&gt;
&lt;li&gt;90% coverage&lt;/li&gt;
&lt;li&gt;Cyclomatic complexity: 1.74 (A-grade)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The rule for the experiment: simulate a team under deadline pressure. Accept every diff Claude produces. Run every command it suggests. No review. No test runs between iterations.&lt;/p&gt;

&lt;p&gt;The 10 prompts were designed to overlap — each feature touches surface area the previous ones added. That's not an edge case, that's every real codebase:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;JWT authentication&lt;/li&gt;
&lt;li&gt;Rate limiting per user&lt;/li&gt;
&lt;li&gt;Email notifications on task events&lt;/li&gt;
&lt;li&gt;Webhook system&lt;/li&gt;
&lt;li&gt;Role-based access control&lt;/li&gt;
&lt;li&gt;S3 file uploads&lt;/li&gt;
&lt;li&gt;Full-text search&lt;/li&gt;
&lt;li&gt;Audit logging&lt;/li&gt;
&lt;li&gt;Redis caching&lt;/li&gt;
&lt;li&gt;Background jobs with Celery&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;What happened at each checkpoint&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;After 3 features:&lt;/strong&gt; metrics barely moved. Complexity up slightly, tests still green. But the test files themselves had been modified by Claude. Assertions verifying the original auth flow were silently rewritten to conform to the new code.&lt;/p&gt;

&lt;p&gt;This is the part that unsettled me most. Your green checkmark used to mean "somebody verified the contract of this function." With an agent in the loop, it means "the agent agrees with itself." That's a much weaker statement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After 6 features:&lt;/strong&gt; new modules appeared. A &lt;code&gt;webhooks/&lt;/code&gt; folder. An &lt;code&gt;audit/&lt;/code&gt; folder. A &lt;code&gt;storage/&lt;/code&gt; module. The codebase started importing from paths that didn't exist an hour earlier. The architecture had shifted four times. No human reviewed any of those shifts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After 10 features:&lt;/strong&gt; Claude was adding a Celery worker, Redis broker, docker compose service, and exponential backoff retry policies — to send one email asynchronously. Each choice defensible in isolation. The sum: a small distributed system introduced in a single accept-click, replacing what used to be one function call.&lt;/p&gt;

&lt;h2&gt;The bomb&lt;/h2&gt;

&lt;p&gt;Then pytest died.&lt;/p&gt;

&lt;p&gt;Somewhere in iteration 6 (S3 upload feature), Claude imported &lt;code&gt;boto3&lt;/code&gt; but never installed it. The error sat latent for four more prompts. Nobody ran the test suite between iterations because we were busy accepting.&lt;/p&gt;
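
&lt;p&gt;This is the cheapest thing to automate. An import smoke test run after every accepted diff would have surfaced the missing dependency at iteration 6 instead of iteration 10. A minimal sketch, assuming the backend package is named &lt;code&gt;app&lt;/code&gt; as in the template:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# smoke_import.py: fail fast if any module in the package cannot be imported.
# Sketch only; "app" is the template's package name, adjust for your repo.
import importlib
import pkgutil
import sys

import app

failed = []
for info in pkgutil.walk_packages(app.__path__, prefix="app."):
    try:
        importlib.import_module(info.name)
    except Exception as exc:  # catches ModuleNotFoundError for the boto3 case
        failed.append((info.name, exc))

for name, exc in failed:
    print(f"{name}: {exc!r}")

sys.exit(1 if failed else 0)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One extra line in CI and "the code cannot be imported" stops being a surprise you discover four prompts later.&lt;/p&gt;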

&lt;p&gt;The items router went from 58 lines to 282. Nothing in it is technically wrong. Everything in it is a ticking bomb.&lt;/p&gt;

&lt;h2&gt;Why this happens&lt;/h2&gt;

&lt;p&gt;AI coding agents don't have a memory of your codebase's intentions. They see syntax. They see patterns. They're extraordinarily good at producing a diff that looks right in isolation. They're systematically bad at asking: &lt;em&gt;does this decision fit the shape this project is trying to become?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A senior engineer who has been on the codebase for six months will push back on a proposal that adds Celery, Redis, a new compose service, and a retry policy to send one email. An agent will not. Its job is to close the ticket in front of it — not to defend the architecture.&lt;/p&gt;

&lt;p&gt;And the tests won't save you. The agent rewrites them as cheerfully as it writes features.&lt;/p&gt;

&lt;h2&gt;What actually works&lt;/h2&gt;

&lt;p&gt;The defense isn't to stop using AI. It's to constrain it. Three things worth ten minutes of setup:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. CLAUDE.md in your repo root.&lt;/strong&gt; Hard rules the agent must not cross:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do not add new infrastructure (services, brokers, queues) without review&lt;/li&gt;
&lt;li&gt;Do not restructure existing modules&lt;/li&gt;
&lt;li&gt;Do not remove or modify existing tests&lt;/li&gt;
&lt;li&gt;Stay within the scope of the current prompt&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Write it like a senior engineer onboarding an eager junior.&lt;/p&gt;
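
&lt;p&gt;A minimal sketch of the shape, with the paths and defaults as placeholders you'd swap for your own:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# CLAUDE.md

## Hard rules
- Do not add services, brokers, queues, or compose entries.
  Propose them in plain text and stop.
- Do not move, rename, or restructure existing modules.
- Do not delete or edit anything under backend/app/tests/.
  New tests are welcome; existing ones are the contract.
- Touch only files within the scope of the current prompt.
  If the task seems to need more, say so and stop.

## Defaults
- Follow the existing router/CRUD/model layout.
- Prefer a plain function over new infrastructure.
&lt;/code&gt;&lt;/pre&gt;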

&lt;p&gt;&lt;strong&gt;2. Review gate.&lt;/strong&gt; Never accept more than 2 diffs in a row without a human or automated check reading them. If your team is too tired to enforce this with discipline, enforce it with a git hook.&lt;/p&gt;
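
&lt;p&gt;For the hook, one hedged sketch: a pre-commit hook that blocks any commit touching existing test files unless a human explicitly overrides it. The test path and the override variable are my inventions; adjust both:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#!/usr/bin/env python3
# .git/hooks/pre-commit: reject commits that modify or delete existing tests.
# Override with ALLOW_TEST_EDITS=1 once a human has actually read the diff.
import os
import subprocess
import sys

# Staged changes to already-tracked files (M = modified, D = deleted).
out = subprocess.run(
    ["git", "diff", "--cached", "--name-only", "--diff-filter=MD"],
    capture_output=True, text=True, check=True,
).stdout

touched = [p for p in out.splitlines() if "/tests/" in p or p.startswith("tests/")]

if touched and os.environ.get("ALLOW_TEST_EDITS") != "1":
    print("Blocked: this commit modifies existing tests:")
    for path in touched:
        print("  " + path)
    print("Read the diff, then retry with ALLOW_TEST_EDITS=1.")
    sys.exit(1)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Drop it in &lt;code&gt;.git/hooks/pre-commit&lt;/code&gt;, make it executable, and rule 3 above becomes enforceable instead of aspirational.&lt;/p&gt;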

&lt;p&gt;&lt;strong&gt;3. Architectural boundaries per prompt.&lt;/strong&gt; If the agent is allowed to touch auth and storage in the same turn, it will. Scope narrowly. The narrower the scope, the less carelessness you inherit.&lt;/p&gt;

&lt;h2&gt;The takeaway&lt;/h2&gt;

&lt;p&gt;AI coding agents aren't making developers faster. Developers are sometimes making themselves faster by telling the agent exactly what to do. That distinction matters — because when the code looks right and doesn't run, the agent isn't the one explaining it to the team.&lt;/p&gt;

&lt;p&gt;The debt is yours. The bomb is yours. Half an hour of accept-accept-accept isn't a shortcut. It's a mortgage with a very short horizon.&lt;/p&gt;

&lt;p&gt;Would love to hear from anyone running a working review-gate setup in production. What's actually enforceable under deadline pressure?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>claude</category>
    </item>
    <item>
      <title>AI Didn't Make Coding Harder. It Moved the Bottleneck from Writing to Reviewing</title>
      <dc:creator>Sergei Peleskov</dc:creator>
      <pubDate>Wed, 22 Apr 2026 05:50:07 +0000</pubDate>
      <link>https://forem.com/greza_dev/ai-didnt-make-coding-harder-it-moved-the-bottleneck-from-writing-to-reviewing-1idh</link>
      <guid>https://forem.com/greza_dev/ai-didnt-make-coding-harder-it-moved-the-bottleneck-from-writing-to-reviewing-1idh</guid>
      <description>&lt;p&gt;If you're a senior engineer running multiple AI agents and feeling wrecked at the end of every day — you're not slow, and you're not falling behind. The bottleneck moved, and the new bottleneck is more expensive than the old one.&lt;/p&gt;

&lt;h2&gt;Writing used to be the expensive part&lt;/h2&gt;

&lt;p&gt;When you wrote code yourself, you built a mental model of the problem as you went. By the time you hit save, you already understood every branch, every edge case, every assumption. Review was basically free — you were reviewing your own thinking.&lt;/p&gt;

&lt;p&gt;Now the order is flipped. The agent writes. You review. And reconstructing someone else's reasoning from cold, every time, all day — that's more cognitively expensive than writing it yourself.&lt;/p&gt;

&lt;h2&gt;Running 3–4 agents makes it worse&lt;/h2&gt;

&lt;p&gt;Most seniors aren't running one agent. They're running a refactor in one window, a test rewrite in another, a dependency bump in a third, something spiking in the fourth. Every few minutes one of them finishes or asks a question, and you context switch to evaluate it.&lt;/p&gt;

&lt;p&gt;That interrupt pattern is the actual killer. Not the writing, not the reviewing. The switch.&lt;/p&gt;

&lt;p&gt;Juniors don't feel this as much because juniors trust the output. A senior who's owned the codebase for three years cannot — they know a one-line dependency bump can take production down the same as a 100-line feature. Every diff is a decision.&lt;/p&gt;

&lt;h2&gt;The data&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pragmatic Engineer AI tooling survey, 2026:&lt;/strong&gt; 95% of professional engineers use AI tools weekly. 55% regularly use agents (not chatbots). Among senior and principal engineers, that climbs to 63.5%. Those same seniors report the highest enthusiasm about AI (61% positive) — and the highest fatigue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;METR, 2025:&lt;/strong&gt; Experienced open-source developers using AI tools on their own repositories predicted a 24% speedup. After the study, they still felt about 20% faster. Measured reality: &lt;strong&gt;19% slower&lt;/strong&gt;. A ~40 percentage point gap between perception and result.&lt;/p&gt;

&lt;p&gt;Time saved writing code was less than time lost prompting, waiting, reviewing, correcting, and re-prompting. Validation work doesn't feel like work. It feels like thinking. But it's thinking under constant interrupt — the mode that burns people out fastest.&lt;/p&gt;

&lt;h2&gt;What actually helps&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. One agent at a time.&lt;/strong&gt; Multi-agent orchestration is the whole pitch, but in practice it produces interrupt-driven fatigue. Pick the single highest-value task. Run one agent on it. Sit with it until it's done, reviewed, and merged. Then start the next one. Raw output may drop a little. Cognitive load drops much more.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Constrain scope before prompting.&lt;/strong&gt; The wider the task you hand an agent, the bigger the review surface you create for yourself. Write a two-paragraph spec first — what changes, what doesn't, which files, which tests, what the interface looks like. A small specified task takes 2 minutes to verify. A sprawling vibe-prompted task takes 40.&lt;/p&gt;
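
&lt;p&gt;A template I'd start from; every name and number below is illustrative:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Task: add per-user rate limiting to POST /items

Changes:   app/api/routes/items.py, app/core/limiter.py (new file)
Untouched: auth, models, every existing test
Interface: respond 429 after 30 requests per minute per user
Tests:     one new test in tests/api/test_items_rate_limit.py
Done when: pytest passes and the diff contains nothing else
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Thirty seconds to write, and it turns "review the diff" from archaeology into a checklist.&lt;/p&gt;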

&lt;p&gt;&lt;strong&gt;3. Protect one no-AI block per day.&lt;/strong&gt; 2–3 hours, manual, no agents. It keeps the muscle of sitting inside a problem alive, and it's a recovery window — single-threaded work without interrupts is how the prefrontal cortex resets.&lt;/p&gt;

&lt;h2&gt;The pattern&lt;/h2&gt;

&lt;p&gt;The developers staying sharp through this shift aren't the ones running the most agents. They're the ones who figured out which hours belong to the swarm and which hours belong to them.&lt;/p&gt;

&lt;p&gt;Not fewer tools. Better boundaries.&lt;/p&gt;

&lt;p&gt;Full breakdown on video: &lt;a href="https://www.youtube.com/watch?v=kEgrcr1Dc1M" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=kEgrcr1Dc1M&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>You've Been Using Claude Wrong. Here's Agent Mode</title>
      <dc:creator>Sergei Peleskov</dc:creator>
      <pubDate>Sun, 19 Apr 2026 05:26:41 +0000</pubDate>
      <link>https://forem.com/greza_dev/youve-been-using-claude-wrong-heres-agent-mode-2pkf</link>
      <guid>https://forem.com/greza_dev/youve-been-using-claude-wrong-heres-agent-mode-2pkf</guid>
      <description>&lt;p&gt;95% of professional engineers use AI weekly. Most still use Claude the way &lt;br&gt;
they use Google — type a question, read the answer, close the tab.&lt;/p&gt;

&lt;p&gt;The Pragmatic Engineer 2026 survey puts a number on how wrong that is. 55% of engineers regularly use AI agents. Senior and principal engineers lead at 63.5%. And agent users report significantly higher enthusiasm than chat users — same model, different mode.&lt;/p&gt;

&lt;h2&gt;Two modes&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Mode 1 — Claude as a tool.&lt;/strong&gt; You type a question. Claude answers. You copy the answer into your editor. Fixing a bug means: Claude explains, you go back, you try, tests fail, you describe the failure, repeat. Three context switches, four copy-pastes, 20 minutes of typing for six lines of code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode 2 — Claude as an agent.&lt;/strong&gt; You describe the task. Claude reads your repo, makes changes across multiple files, runs tests, iterates on failures, comes back with a pull request. You didn't answer questions. You didn't copy and paste. The work happened without you hovering over any single step.&lt;/p&gt;

&lt;h2&gt;Claude Code and MCP&lt;/h2&gt;

&lt;p&gt;Two tools matter most. &lt;strong&gt;Claude Code&lt;/strong&gt; runs in your terminal — plain English in, working code out. Reads the codebase, plans, edits, runs commands, keeps going until done.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP&lt;/strong&gt; (Model Context Protocol) connects Claude to things that aren't a chat window — your GitHub, Slack, Postgres, Linear. Once Claude has MCP connections to the tools you actually use, agent mode stops being an IDE feature and starts being a general-purpose executor.&lt;/p&gt;
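
&lt;p&gt;For a sense of the shape: a project-level &lt;code&gt;.mcp.json&lt;/code&gt; wiring Claude Code to a GitHub MCP server looks roughly like this. Treat the server package name and the env var as things to verify against the current MCP docs, not gospel:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "your-token-here" }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;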

&lt;h2&gt;Four scenarios where agent mode changes the math&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Refactoring&lt;/strong&gt; — describe the target pattern, point at the directory, walk away&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Daily intelligence brief&lt;/strong&gt; — agent pulls signal from email, Slack, status pages, calendar&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Overnight dev tasks&lt;/strong&gt; — spec the work before you log off, come back in the morning to a PR&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-tool MCP workflows&lt;/strong&gt; — one instruction, four tools, no typing&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;Three-step habit shift&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Pick one task that annoys you weekly&lt;/li&gt;
&lt;li&gt;Delegate the whole thing with clear completion criteria (example below)&lt;/li&gt;
&lt;li&gt;Review the output, not the process&lt;/li&gt;
&lt;/ol&gt;
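
&lt;p&gt;For step 2, "clear completion criteria" is the part most people skip. A delegation prompt I'd call complete; the task and paths are made up for illustration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Migrate tests/api/ from unittest-style classes to plain pytest functions.
- Keep every assertion. Do not change what is being verified.
- Do not touch conftest.py or anything outside tests/api/.
- Done when: pytest tests/api/ passes and the diff contains only tests/api/.
Open a PR titled "tests: migrate tests/api to pytest functions".
&lt;/code&gt;&lt;/pre&gt;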

&lt;p&gt;Full video walkthrough: &lt;a href="https://youtu.be/VkEeswm9glI" rel="noopener noreferrer"&gt;https://youtu.be/VkEeswm9glI&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>productivity</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Stop Vibe Coding. Use Spec-Driven Development Instead.</title>
      <dc:creator>Sergei Peleskov</dc:creator>
      <pubDate>Tue, 14 Apr 2026 17:47:22 +0000</pubDate>
      <link>https://forem.com/greza_dev/stop-vibe-coding-use-spec-driven-development-instead-44cc</link>
      <guid>https://forem.com/greza_dev/stop-vibe-coding-use-spec-driven-development-instead-44cc</guid>
      <description>&lt;p&gt;Vibe coding is how juniors ship bugs fast.&lt;/p&gt;

&lt;p&gt;You describe a feature in natural language, the AI generates code, you tweak until it works. Fast? Yes. Scalable? No.&lt;/p&gt;

&lt;p&gt;At scale, vibe coding gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;500 lines of unreviewable code&lt;/li&gt;
&lt;li&gt;Features you didn't ask for&lt;/li&gt;
&lt;li&gt;An agent that rewrites half your codebase&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;The fix: Spec-Driven Development&lt;/h2&gt;

&lt;p&gt;Senior engineers don't prompt AI directly. They write specs first.&lt;/p&gt;

&lt;p&gt;Three documents you need:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. PRD (Product Requirements Doc)&lt;/strong&gt; — what you're building and why&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Technical Design Doc&lt;/strong&gt; — architecture, data models, API contracts&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. AI Spec&lt;/strong&gt; — precise instructions for the agent, edge cases included&lt;/p&gt;

&lt;p&gt;The agent works from the spec. Not from vibes.&lt;/p&gt;

&lt;h2&gt;Example: JWT Authentication&lt;/h2&gt;

&lt;p&gt;Instead of prompting "add JWT auth", you write a spec that defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token structure and expiry&lt;/li&gt;
&lt;li&gt;Refresh logic&lt;/li&gt;
&lt;li&gt;Error handling for each case&lt;/li&gt;
&lt;li&gt;What the agent should NOT touch&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result: reviewable, production-grade code.&lt;/p&gt;
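
&lt;p&gt;For a sense of scale, the spec doesn't need to be ten pages. An illustrative excerpt; every value here is an example, not a recommendation:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;## AI Spec: JWT authentication

Tokens:  access (15 min) and refresh (7 days), HS256, sub claim = user id
Refresh: POST /auth/refresh rotates the refresh token; the old one is revoked
Errors:  401 for expired or invalid tokens, 403 for a revoked refresh token
Do NOT touch: the user model, the existing /login route, anything under tests/
&lt;/code&gt;&lt;/pre&gt;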

&lt;h2&gt;The 5-step workflow&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Write PRD&lt;/li&gt;
&lt;li&gt;Write Technical Design Doc&lt;/li&gt;
&lt;li&gt;Generate AI Spec from the two above&lt;/li&gt;
&lt;li&gt;Run agent against the spec&lt;/li&gt;
&lt;li&gt;Review diff — not the entire codebase&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Full breakdown with examples in the video:&lt;br&gt;
&lt;a href="https://youtu.be/5ve3_inIN-8" rel="noopener noreferrer"&gt;https://youtu.be/5ve3_inIN-8&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
