<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Alex Metelli</title>
    <description>The latest articles on Forem by Alex Metelli (@ametel01).</description>
    <link>https://forem.com/ametel01</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3682218%2F0e10bca3-b9b5-4704-b417-61d5092e72a5.png</url>
      <title>Forem: Alex Metelli</title>
      <link>https://forem.com/ametel01</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ametel01"/>
    <language>en</language>
    <item>
      <title>Advanced Context Engineering for Coding Agents</title>
      <dc:creator>Alex Metelli</dc:creator>
      <pubDate>Sun, 28 Dec 2025 06:13:00 +0000</pubDate>
      <link>https://forem.com/ametel01/advanced-context-engineering-for-coding-agents-11p7</link>
      <guid>https://forem.com/ametel01/advanced-context-engineering-for-coding-agents-11p7</guid>
      <description>&lt;p&gt;&lt;strong&gt;Full reference:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
📺 &lt;em&gt;Advanced Context Engineering for Coding Agents&lt;/em&gt;&lt;br&gt;&lt;br&gt;
🎤 Dex Horthy&lt;br&gt;&lt;br&gt;
🔗 &lt;a href="https://youtu.be/rmvDxxNubIg?si=GtPAqK-lnY58dlIO" rel="noopener noreferrer"&gt;https://youtu.be/rmvDxxNubIg?si=GtPAqK-lnY58dlIO&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;AI coding agents have dramatically increased developer throughput. However, in real-world usage—especially in large, long-lived (“brownfield”) codebases—many teams observe a mismatch between &lt;em&gt;output&lt;/em&gt; and &lt;em&gt;progress&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;This post is a faithful technical distillation of Dex Horthy’s talk on &lt;strong&gt;advanced context engineering&lt;/strong&gt;: practical techniques for making today’s LLMs effective, reliable, and scalable for serious software engineering.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Productivity ≠ Progress
&lt;/h2&gt;

&lt;p&gt;Large-scale surveys of developers show a consistent pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI increases code shipped&lt;/li&gt;
&lt;li&gt;Code churn increases even more&lt;/li&gt;
&lt;li&gt;Teams repeatedly rework AI-generated output&lt;/li&gt;
&lt;li&gt;Brownfield codebases suffer the worst outcomes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI performs well for greenfield projects, prototypes, and dashboards. But in complex systems with legacy constraints, naive agent usage becomes a &lt;strong&gt;tech-debt factory&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This aligns with the lived experience of many senior engineers and founders.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Happens: Context Is the Only Control Surface
&lt;/h2&gt;

&lt;p&gt;Large language models are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stateless&lt;/strong&gt; (no memory between sessions)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Non-deterministic&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Entirely governed by the &lt;strong&gt;current context window&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every decision—tool usage, file edits, hallucinations—is determined by the tokens currently in context.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Better tokens in → better tokens out.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;More tokens do &lt;em&gt;not&lt;/em&gt; mean better outcomes.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Dumb Zone
&lt;/h2&gt;

&lt;p&gt;As context usage grows, model quality degrades. Empirically, this often begins around &lt;strong&gt;~40% of the context window&lt;/strong&gt;, depending on task complexity.&lt;/p&gt;

&lt;p&gt;This region is referred to as the &lt;strong&gt;dumb zone&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Common causes
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Large tool outputs (JSON, UUIDs, logs)&lt;/li&gt;
&lt;li&gt;Unfiltered file dumps&lt;/li&gt;
&lt;li&gt;Repeated correction loops&lt;/li&gt;
&lt;li&gt;MCP servers dumping irrelevant data&lt;/li&gt;
&lt;li&gt;Long chat histories full of noise&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once in the dumb zone, agents become unreliable regardless of model quality.&lt;/p&gt;
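
&lt;p&gt;As a rough illustration, an agent harness can watch context utilization and force compaction before entering the dumb zone. The 4-characters-per-token estimate, the 200k window, and the 40% threshold below are illustrative assumptions, not measured values:&lt;/p&gt;

```python
# Illustrative guard against the "dumb zone": track estimated context usage
# and trigger compaction before quality degrades. The numbers here are
# rough assumptions (4 chars/token, 200k window, 40% threshold).

CONTEXT_WINDOW_TOKENS = 200_000   # hypothetical model limit
DUMB_ZONE_THRESHOLD = 0.40        # degradation often begins around here

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token for English."""
    return len(text) // 4

def should_compact(messages: list[str]) -> bool:
    """Return True once the conversation approaches the dumb zone."""
    used = sum(estimate_tokens(m) for m in messages)
    return used / CONTEXT_WINDOW_TOKENS >= DUMB_ZONE_THRESHOLD

history = ["x" * 200_000, "y" * 150_000]   # ~87,500 estimated tokens
print(should_compact(history))             # True: past the 40% mark
```

&lt;p&gt;In practice you would use the model's real tokenizer rather than a character heuristic; the point is that the check fires well before the window is full.&lt;/p&gt;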




&lt;h2&gt;
  
  
  Trajectory Matters
&lt;/h2&gt;

&lt;p&gt;LLMs learn patterns &lt;em&gt;within a conversation&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;If the conversation looks like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Model makes a mistake
&lt;/li&gt;
&lt;li&gt;Human scolds the model
&lt;/li&gt;
&lt;li&gt;Model makes another mistake
&lt;/li&gt;
&lt;li&gt;Human scolds again
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The most likely continuation is… another mistake.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bad trajectories reinforce failure modes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is why restarting sessions or compressing context is often more effective than continued correction.&lt;/p&gt;




&lt;h2&gt;
  
  
  Intentional Compaction
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Intentional compaction&lt;/strong&gt; is the deliberate compression of context into a minimal, high-signal representation.&lt;/p&gt;

&lt;p&gt;Instead of dragging an ever-growing conversation forward, you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Summarize the current state into a markdown artifact&lt;/li&gt;
&lt;li&gt;Review and validate it as a human&lt;/li&gt;
&lt;li&gt;Start a fresh context seeded with that artifact&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What to compact
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Relevant files and line ranges&lt;/li&gt;
&lt;li&gt;Verified architectural behavior&lt;/li&gt;
&lt;li&gt;Decisions already made&lt;/li&gt;
&lt;li&gt;Explicit constraints and non-goals&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What not to compact
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Raw logs&lt;/li&gt;
&lt;li&gt;Tool traces&lt;/li&gt;
&lt;li&gt;Full file contents&lt;/li&gt;
&lt;li&gt;Repetitive error explanations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Compaction converts exploration into a one-time cost instead of a recurring tax.&lt;/p&gt;
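
&lt;p&gt;For concreteness, a compaction artifact can be a short markdown file with exactly these four sections. The file names, line ranges, and decisions below are hypothetical, shown only to illustrate the shape:&lt;/p&gt;

```markdown
# Compaction: payment-retry work (state as of this session)

## Relevant files
- `billing/retry.ts:45-180`: retry scheduling logic (verified by reading)
- `billing/queue.ts:12-60`: job enqueue path

## Verified behavior
- Retries are scheduled via `enqueueRetry()`, not by cron.

## Decisions already made
- Keep exponential backoff; change only the max-attempts cap.

## Constraints and non-goals
- Do not touch the webhook handlers.
```

&lt;p&gt;A fresh session seeded with an artifact like this starts near full capability instead of deep in a noisy history.&lt;/p&gt;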




&lt;h2&gt;
  
  
  Sub-Agents Are About Context, Not Roles
&lt;/h2&gt;

&lt;p&gt;Sub-agents are frequently misunderstood.&lt;/p&gt;

&lt;p&gt;They are &lt;strong&gt;not&lt;/strong&gt; about mirroring human roles like “frontend agent” or “QA agent”.&lt;/p&gt;

&lt;p&gt;They exist to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fork a clean context window&lt;/li&gt;
&lt;li&gt;Perform large exploratory reads&lt;/li&gt;
&lt;li&gt;Return a &lt;strong&gt;succinct factual summary&lt;/strong&gt; to a parent agent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sub-agent scans a large repo&lt;/li&gt;
&lt;li&gt;Returns:
&lt;em&gt;“Relevant logic is in &lt;code&gt;foo/bar.ts:120–340&lt;/code&gt;, entrypoint is &lt;code&gt;BazHandler&lt;/code&gt;”&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The parent agent then reads only what matters.&lt;/p&gt;

&lt;p&gt;This is how you scale context without entering the dumb zone.&lt;/p&gt;
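
&lt;p&gt;A minimal sketch of the pattern, with a stubbed model call standing in for a real LLM. &lt;code&gt;run_llm&lt;/code&gt;, the prompts, and the returned summary are all hypothetical:&lt;/p&gt;

```python
# Sketch of the sub-agent pattern: the parent forks a fresh context for a
# large exploratory read, and only a short summary crosses back. The
# run_llm stub stands in for a real model call.

def run_llm(context: list[str]) -> str:
    """Stub: a real implementation would send this context to a model."""
    # Pretend the model located the relevant code during exploration.
    return "Relevant logic is in foo/bar.ts:120-340, entrypoint is BazHandler"

def sub_agent_explore(task: str, repo_files: dict[str, str]) -> str:
    """Run exploration in an isolated context; return only a summary."""
    context = [task] + [f"{path}:\n{body}" for path, body in repo_files.items()]
    return run_llm(context)  # thousands of tokens in, one sentence out

def parent_agent(task: str, repo_files: dict[str, str]) -> list[str]:
    """The parent's context receives the summary, never the raw files."""
    summary = sub_agent_explore(f"Locate code for: {task}", repo_files)
    return [f"Task: {task}", f"Research finding: {summary}"]

repo = {"foo/bar.ts": "...large file...", "foo/baz.ts": "...another..."}
print(parent_agent("fix retry bug", repo)[1])
```

&lt;p&gt;The design choice that matters is the return type: the sub-agent hands back a sentence, not its transcript, so the parent's window stays small.&lt;/p&gt;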




&lt;h2&gt;
  
  
  The Research–Plan–Implement Workflow
&lt;/h2&gt;

&lt;p&gt;This workflow is not “spec-driven development”. That term has become too diffuse to mean anything specific.&lt;/p&gt;

&lt;p&gt;RPI is about &lt;strong&gt;systematic compaction&lt;/strong&gt; at every stage.&lt;/p&gt;




&lt;h3&gt;
  
  
  Research: Compressing Truth
&lt;/h3&gt;

&lt;p&gt;Goal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understand how the system &lt;em&gt;actually&lt;/em&gt; works&lt;/li&gt;
&lt;li&gt;Identify authoritative files and flows&lt;/li&gt;
&lt;li&gt;Eliminate assumptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Characteristics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read code, not docs&lt;/li&gt;
&lt;li&gt;Produce a short research artifact&lt;/li&gt;
&lt;li&gt;Validate findings manually&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If agents are not onboarded with accurate context, they will fabricate.&lt;/p&gt;

&lt;p&gt;This mirrors &lt;em&gt;Memento&lt;/em&gt;: without memory, agents invent narratives.&lt;/p&gt;
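
&lt;p&gt;A research artifact can be as short as a dozen lines: what was read, what was verified, and what remains unknown. Everything named below is hypothetical:&lt;/p&gt;

```markdown
# Research: how payment retries actually work

## Authoritative files
- `billing/retry.ts`: owns scheduling; `scheduleRetry()` is the only entry.
- `billing/queue.ts`: enqueues jobs; no retry logic of its own.

## Verified (by reading code, not docs)
- Backoff is exponential, base 30s, no upper bound on attempts today.

## Open questions
- Is `failed_permanent` consumed anywhere downstream?
```

&lt;p&gt;The "open questions" section is what keeps the agent from fabricating: unknowns are stated instead of guessed.&lt;/p&gt;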




&lt;h3&gt;
  
  
  Plan: Compressing Intent
&lt;/h3&gt;

&lt;p&gt;Planning is the &lt;strong&gt;highest-leverage activity&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A good plan:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lists exact steps&lt;/li&gt;
&lt;li&gt;References concrete files and snippets&lt;/li&gt;
&lt;li&gt;Specifies validation after each change&lt;/li&gt;
&lt;li&gt;Makes failure modes obvious&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A solid plan dramatically constrains agent behavior.&lt;/p&gt;

&lt;p&gt;Bad plans produce dozens of bad lines of code.&lt;br&gt;&lt;br&gt;
Bad research produces hundreds.&lt;/p&gt;
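
&lt;p&gt;To make "exact steps with validation" concrete, here is a hypothetical plan artifact; the file names, line numbers, and commands are illustrative, not prescriptive:&lt;/p&gt;

```markdown
# Plan: cap payment retries at 5 attempts

## Step 1: add the cap
- Edit `billing/retry.ts:152` (`scheduleRetry`): return early when
  `attempt >= MAX_ATTEMPTS`; define `MAX_ATTEMPTS = 5` beside it.
- Validate: `npm test -- retry.spec.ts` passes, including a new case
  asserting no retry is scheduled at attempt 5.

## Step 2: surface the terminal state
- Edit `billing/queue.ts:40`: mark the job `failed_permanent`.
- Validate: integration test shows no sixth enqueue for a failing job.

## Failure modes to watch
- Jobs already past attempt 5 at deploy time must not retry forever.
```

&lt;p&gt;Every step names a file, a change, and a check, so an agent executing it has almost no room to improvise.&lt;/p&gt;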




&lt;h3&gt;
  
  
  Implement: Mechanical Execution
&lt;/h3&gt;

&lt;p&gt;Once the plan is correct:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Execution becomes mechanical&lt;/li&gt;
&lt;li&gt;Context remains small&lt;/li&gt;
&lt;li&gt;Reliability increases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where token spend actually pays off.&lt;/p&gt;




&lt;h2&gt;
  
  
  Mental Alignment and Code Review
&lt;/h2&gt;

&lt;p&gt;Code review is primarily about &lt;strong&gt;shared understanding&lt;/strong&gt;, not syntax.&lt;/p&gt;

&lt;p&gt;As AI output scales, reviewing thousands of lines becomes unsustainable.&lt;/p&gt;

&lt;p&gt;High-performing teams:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Review research and plans&lt;/li&gt;
&lt;li&gt;Attach agent transcripts or AMP threads to PRs&lt;/li&gt;
&lt;li&gt;Show exact steps and test results&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reviewing plans preserves architectural coherence as throughput increases.&lt;/p&gt;




&lt;h2&gt;
  
  
  Limits: AI Does Not Replace Thinking
&lt;/h2&gt;

&lt;p&gt;AI amplifies the quality of thinking already done.&lt;/p&gt;

&lt;p&gt;In cases like deep architectural refactors or legacy systems with hidden invariants, teams must return to human design first.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;There is no perfect prompt.&lt;br&gt;&lt;br&gt;
There is no silver bullet.&lt;br&gt;&lt;br&gt;
Thinking cannot be outsourced.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Choosing the Right Level of Context Engineering
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task Type&lt;/th&gt;
&lt;th&gt;Recommended Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;UI tweak&lt;/td&gt;
&lt;td&gt;Direct instruction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Small feature&lt;/td&gt;
&lt;td&gt;Light plan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-repo change&lt;/td&gt;
&lt;td&gt;Research + plan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deep refactor&lt;/td&gt;
&lt;td&gt;Full RPI + human design&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The ceiling of problem difficulty rises with context discipline.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Comes Next
&lt;/h2&gt;

&lt;p&gt;Coding agents will be commoditized.&lt;/p&gt;

&lt;p&gt;The real challenge is adapting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Team workflows&lt;/li&gt;
&lt;li&gt;SDLC processes&lt;/li&gt;
&lt;li&gt;Cultural norms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this, teams risk:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Juniors shipping slop&lt;/li&gt;
&lt;li&gt;Seniors cleaning it up&lt;/li&gt;
&lt;li&gt;Technical debt scaling with AI usage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a &lt;strong&gt;workflow and leadership problem&lt;/strong&gt;, not a tooling one.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Context is the only lever that matters&lt;/li&gt;
&lt;li&gt;More tokens often reduce correctness&lt;/li&gt;
&lt;li&gt;Intentional compaction is mandatory&lt;/li&gt;
&lt;li&gt;Research and planning are the highest ROI activities&lt;/li&gt;
&lt;li&gt;AI amplifies thinking—it does not replace it&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Source &amp;amp; Attribution
&lt;/h2&gt;

&lt;p&gt;This article is a faithful technical adaptation of:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dex Horthy — &lt;em&gt;Advanced Context Engineering for Coding Agents&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
📺 &lt;a href="https://youtu.be/rmvDxxNubIg?si=GtPAqK-lnY58dlIO" rel="noopener noreferrer"&gt;https://youtu.be/rmvDxxNubIg?si=GtPAqK-lnY58dlIO&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All ideas, terminology, and frameworks originate from the referenced talk.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>development</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
