<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Roman Dubinin </title>
    <description>The latest articles on Forem by Roman Dubinin  (@romanonthego).</description>
    <link>https://forem.com/romanonthego</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F132683%2F369b7150-a56f-4f20-a32e-95f5c44b102a.jpeg</url>
      <title>Forem: Roman Dubinin </title>
      <link>https://forem.com/romanonthego</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/romanonthego"/>
    <language>en</language>
    <item>
      <title>The Best Agent Prompt Is a Lint Error</title>
      <dc:creator>Roman Dubinin </dc:creator>
      <pubDate>Fri, 27 Mar 2026 06:56:33 +0000</pubDate>
      <link>https://forem.com/romanonthego/the-best-agent-prompt-is-a-lint-error-3kim</link>
      <guid>https://forem.com/romanonthego/the-best-agent-prompt-is-a-lint-error-3kim</guid>
      <description>&lt;p&gt;Every LLM writes &lt;code&gt;key={index}&lt;/code&gt; on list items. It's in millions of React tutorials as the quick fix — React wants a key, here is a key. The code compiles. It renders. When the list reorders or an item is removed from the middle, React reuses the wrong DOM nodes: state stays pinned to the old position, controlled inputs keep stale values, transitions fire on the wrong elements.&lt;/p&gt;

&lt;p&gt;A lint rule fixes this. &lt;code&gt;react/no-array-index-key&lt;/code&gt; fires: "Do not use Array index in keys — use a stable identifier." The agent switches to &lt;code&gt;item.id&lt;/code&gt;. That class of diffing bug is extinct — not "less likely because the system prompt mentioned it."&lt;/p&gt;

&lt;h2&gt;
  
  
  The check step
&lt;/h2&gt;

&lt;p&gt;Agents work in a loop: write, check, fix, repeat. A slow, generic check — "build failed" — means the agent wastes tokens re-reading output and guessing at the cause. A fast, specific check — "line 42: expected &lt;code&gt;Effect&amp;lt;void, ConfigError&amp;gt;&lt;/code&gt; but got &lt;code&gt;string&lt;/code&gt;" — means the agent fixes it in the same turn.&lt;/p&gt;

&lt;p&gt;Agent-native tools — opencode, Cursor — surface LSP diagnostics inline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;src&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="nx"&gt;TS2345&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Argument&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="nx"&gt;not&lt;/span&gt;
&lt;span class="nx"&gt;assignable&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;parameter&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Effect&amp;lt;void, ConfigError, Config&amp;gt;&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The error arrives with the file state. No log parsing. Wired into the toolchain, not into the prompt.&lt;/p&gt;

&lt;h2&gt;
  
  
  TypeScript strict
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;strict: true&lt;/code&gt; is table stakes. Most codebases stop there — and so does the agent's training data.&lt;/p&gt;

&lt;p&gt;The real leverage is in the flags &lt;code&gt;strict: true&lt;/code&gt; doesn't cover. &lt;code&gt;noUncheckedIndexedAccess&lt;/code&gt; makes array indexing return &lt;code&gt;T | undefined&lt;/code&gt; instead of &lt;code&gt;T&lt;/code&gt;, so the agent can't write &lt;code&gt;users[0].name&lt;/code&gt; without handling the missing case. &lt;code&gt;exactOptionalPropertyTypes&lt;/code&gt; distinguishes "may be absent" from "may be &lt;code&gt;undefined&lt;/code&gt;" — a difference the agent's training data almost certainly conflates. &lt;code&gt;noPropertyAccessFromIndexSignature&lt;/code&gt; forces bracket notation on dynamic keys.&lt;/p&gt;

&lt;p&gt;The agent's prior says &lt;code&gt;users[0].name&lt;/code&gt; is fine. The type checker disagrees.&lt;/p&gt;

&lt;h2&gt;
  
  
  Language service plugins
&lt;/h2&gt;

&lt;p&gt;TypeScript strict catches structural errors. Your domain has its own.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;tsconfig.json&lt;/code&gt; includes &lt;code&gt;@effect/language-service&lt;/code&gt; as a compiler plugin. &lt;code&gt;floatingEffect&lt;/code&gt; catches Effects that aren't yielded or assigned — silent no-ops that type-check fine. &lt;code&gt;missingEffectContext&lt;/code&gt; flags missing service requirements. &lt;code&gt;effectGenUsesAdapter&lt;/code&gt; detects the v3 adapter pattern — the same failure, caught at a different layer. &lt;code&gt;outdatedApi&lt;/code&gt; flags removed and renamed APIs across the &lt;a href="https://github.com/Effect-TS/effect-smol/blob/main/MIGRATION.md" rel="noopener noreferrer"&gt;v3-to-v4 migration&lt;/a&gt;. &lt;code&gt;missingStarInYieldEffectGen&lt;/code&gt; catches &lt;code&gt;yield effect&lt;/code&gt; where you need &lt;code&gt;yield* effect&lt;/code&gt; — a mistake agents make constantly because both type-check.&lt;/p&gt;

&lt;p&gt;This isn't niche. &lt;code&gt;ts-graphql-plugin&lt;/code&gt; type-checks GraphQL queries against your schema. &lt;code&gt;@styled/typescript-styled-plugin&lt;/code&gt; catches bad CSS properties in styled-components. &lt;code&gt;@css-modules-kit/ts-plugin&lt;/code&gt; type-checks CSS Modules imports. One line in &lt;code&gt;tsconfig.json&lt;/code&gt;, domain-level diagnostics through the LSP channel the agent already reads.&lt;/p&gt;

&lt;p&gt;The Effect plugin goes further: &lt;code&gt;effect-language-service patch&lt;/code&gt; patches &lt;code&gt;tsc&lt;/code&gt; to surface these diagnostics at build time. The agent running &lt;code&gt;tsc --noEmit&lt;/code&gt; gets Effect-specific errors alongside standard TS errors. There's an &lt;code&gt;includeSuggestionsInTsc&lt;/code&gt; option that surfaces suggestion-level diagnostics in &lt;code&gt;tsc&lt;/code&gt; output with a &lt;code&gt;[suggestion]&lt;/code&gt; prefix. The plugin's docs say it explicitly: "useful to help steer LLM output." The authors are already thinking about agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  The linter
&lt;/h2&gt;

&lt;p&gt;The examples here use Biome — much faster than ESLint for the same codebase. When linting is in the agent's inner loop, that difference compounds. But the specific tool matters less than two properties: it runs fast, and it supports custom rules.&lt;/p&gt;

&lt;p&gt;Configuration matters more than the tool. &lt;code&gt;noUnusedImports: "error"&lt;/code&gt;. &lt;code&gt;noDoubleEquals: "error"&lt;/code&gt;. &lt;code&gt;useConst: "error"&lt;/code&gt;. &lt;code&gt;noExplicitAny: "warn"&lt;/code&gt;. Everything else: error. The agent can't leave dead imports or loose equality checks. A warning is a suggestion; the agent might fix it, might not. An error is a gate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Custom rules
&lt;/h2&gt;

&lt;p&gt;The layers so far catch structural errors, domain type errors, and pattern violations. They don't catch this: the agent writes code that type-checks, passes lint, and is semantically wrong for your project.&lt;/p&gt;

&lt;p&gt;The most common cause is training data staleness. The model learned v3 of an API. Your codebase uses v4. The &lt;a href="https://github.com/Effect-TS/effect-smol/blob/main/MIGRATION.md" rel="noopener noreferrer"&gt;v3-to-v4 migration&lt;/a&gt; renamed and restructured dozens of APIs — but the v3 patterns are still valid TypeScript in many cases. The types didn't change enough to break them — but they're not what you want. You can put this in a system prompt: "Always use v4 patterns for Effect." The agent will follow it — until context pressure pushes the instruction out of the effective window, or the model's prior on this particular API is strong enough to override.&lt;/p&gt;

&lt;p&gt;System prompt instructions are probabilistic. They work most of the time.&lt;/p&gt;

&lt;p&gt;Lint rules are deterministic. They fire every time the pattern appears. They don't get overridden by training priors. And they're faster to write than you'd expect.&lt;/p&gt;

&lt;p&gt;Any linter with a custom rule system works. The examples below use GritQL — a pattern-matching language for source code, used by Biome for its plugin system. You write a pattern, it matches against the AST, and you register a diagnostic. ESLint achieves the same thing with AST visitor functions. Here's the rule that killed the v3 adapter pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;language js(typescript)

`Effect.gen(function*($adapter) { $body })` where {
  $adapter &amp;lt;: r"^\w+$",
  register_diagnostic(
    span=$adapter,
    message="Effect v4: remove the adapter parameter. Use `yield* effect`
             directly instead of `yield* adapter(effect)`.
             Load skill: effect-v4.",
    severity="error"
  )
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four of these rules exist so far. Each one started as a failure that showed up twice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Layer.succeed(Tag, impl) → curried Layer.succeed(Tag)(impl)
`$fn($tag, $impl)` where {
  $fn &amp;lt;: or { `Layer.succeed`, `Layer.effect`, `Layer.scoped` },
  register_diagnostic(span=$fn,
    message="Effect v4: Layer constructors are curried.
             Use Layer.succeed(Tag)(impl) instead of Layer.succeed(Tag, impl).
             Load skill: effect-v4.",
    severity="error")
}

// @effect/schema → import from "effect"
`$source` where {
  $source &amp;lt;: `"@effect/schema"`,
  register_diagnostic(span=$source,
    message="Effect v4: @effect/schema is gone.
             Import from 'effect' instead: import { Schema } from 'effect'.
             Load skill: effect-v4.",
    severity="error")
}

// Effect.catchAll → Effect.catch (renamed in v4)
`$fn($args)` where {
  $fn &amp;lt;: `Effect.catchAll`,
  register_diagnostic(span=$fn,
    message="Effect v4: catchAll was renamed to Effect.catch.
             See: https://github.com/Effect-TS/effect-smol/blob/main/migration/error-handling.md
             Load skill: effect-v4.",
    severity="error")
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The operational loop: agent produces a v3 pattern → the pattern becomes a lint rule five minutes later → the next check catches it from that point forward.&lt;/p&gt;

&lt;h2&gt;
  
  
  Error messages are prompts
&lt;/h2&gt;

&lt;p&gt;The error message in a lint rule is a prompt — it fires at the exact moment the pattern appears, with the exact fix included, and no competition with the rest of the context window.&lt;/p&gt;

&lt;p&gt;"Invalid pattern" wastes the agent's tokens on diagnosis. "Effect v4: remove the adapter parameter. Use &lt;code&gt;yield* effect&lt;/code&gt; directly" gives the agent a direct edit target.&lt;/p&gt;

&lt;p&gt;The rules above go further — each message ends with a skill-load directive. One bad &lt;code&gt;Layer.succeed&lt;/code&gt; call suggests the agent's mental model of Effect v4 is stale across the board. "Load skill: effect-v4" points it at a reference that covers constructors, generators, error handling — everything adjacent to the specific mistake. The error message fixes the immediate line; the skill load fixes the next twenty.&lt;/p&gt;




&lt;p&gt;Every failure that shows up twice becomes a rule. The rules accumulate. The codebase gets stricter — not through documentation that ages or review knowledge that walks out the door, but through tooling that fires on every check.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>typescript</category>
      <category>agents</category>
      <category>programming</category>
    </item>
    <item>
      <title>JSX That Outputs Markdown</title>
      <dc:creator>Roman Dubinin </dc:creator>
      <pubDate>Mon, 16 Mar 2026 05:17:59 +0000</pubDate>
      <link>https://forem.com/romanonthego/jsx-that-outputs-markdown-2ln2</link>
      <guid>https://forem.com/romanonthego/jsx-that-outputs-markdown-2ln2</guid>
      <description>&lt;p&gt;This started because managing agent instruction files as template strings became unbearable. The fix was JSX.&lt;/p&gt;

&lt;p&gt;I have about fifteen of them now — Markdown files that tell LLM agents how to behave. An orchestrator, a code implementer, a critic, a planner, a handful of single-purpose grunts. It grew out of experimentation — different persona variants, different tool sets per harness, shared fragments that kept getting copy-pasted between files. Each file defines the agent's role, what tools it has access to, what constraints it follows, how it handles failure. A typical file starts looking something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;forgePrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
# Forge

You are an implementation agent. Write code, tests, migrations.

Axes: trust=assume-broken, solution=converge, risk=block.

## Tools
You have access to:
&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;`- &lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt; — &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;

## Code Rules
- **P0**: No non-null assertions. No &lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;as&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt; casts without type guards.
- **P0**: No &lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;enum&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;. Use literal unions.
&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;strict&lt;/span&gt;
  &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;- **P0**: Run &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s1"&gt;`lsp_diagnostics&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s1"&gt;` on ALL changed files. Zero errors.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;

## Workflow

1. Read task from dispatch header.
2. Run existing tests — establish green baseline.
3. Implement. Tight diffs, reviewable chunks.
&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;harness&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;opencode&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;4. Use &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s1"&gt;`task()&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s1"&gt;` for subtask delegation.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;harness&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;copilot&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;4. Use &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s1"&gt;`runSubagent&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s1"&gt;` for subtask delegation.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'm a frontend developer. I write React for a living. JSX was the obvious pattern for this: typed components, props for variants, imports for shared fragments. But React is a UI framework, and I needed a string concatenator. Then I remembered &lt;code&gt;jsxImportSource&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The template string trap
&lt;/h2&gt;

&lt;p&gt;When your agent instructions live inside template literals, your editor treats them as strings. Because they are strings. Everything you rely on — syntax highlighting, type checking, autocomplete, error detection, go-to-definition — stops at the opening backtick. You're writing Markdown inside a JavaScript string inside a TypeScript file, and your IDE gives you nothing.&lt;/p&gt;

&lt;p&gt;Not "limited support." Nothing. The headings are strings. The code references are escaped strings inside strings. A broken indentation is invisible until you run the agent and the output is wrong.&lt;/p&gt;

&lt;p&gt;I went looking at how other harnesses handle this — to know if someone had already solved it. Not in any harness I looked at.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/code-yeongyu/oh-my-openagent" rel="noopener noreferrer"&gt;oh-my-openagent&lt;/a&gt; — 40k stars, production harness — builds its &lt;a href="https://github.com/code-yeongyu/oh-my-openagent/blob/aa27c75eada7ab209dbe583bb72cc472afa37a5e/src/agents/sisyphus/gpt-5-4.ts" rel="noopener noreferrer"&gt;orchestrator prompt&lt;/a&gt; the same way. 430 lines. Eight XML-structured sections assembled from template strings. The task management block duplicates the same instructions twice for different tool APIs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;buildTasksSection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;useTaskSystem&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;useTaskSystem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`&amp;lt;tasks&amp;gt;
Create tasks before starting any non-trivial work.

Workflow:
1. On receiving request: &lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;TaskCreate&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt; with atomic steps.
2. Before each step: &lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;TaskUpdate(status="in_progress")&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;
3. After each step: &lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;TaskUpdate(status="completed")&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt; immediately.
&amp;lt;/tasks&amp;gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`&amp;lt;tasks&amp;gt;
Create todos before starting any non-trivial work.

Workflow:
1. On receiving request: &lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;todowrite&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt; with atomic steps.
2. Before each step: mark &lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;in_progress&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;
3. After each step: mark &lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;completed&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt; immediately.
&amp;lt;/tasks&amp;gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The escaped backticks are the obvious tell. But two functions returning near-identical strings with no shared abstraction — that's the actual damage. Eight sections concatenated with &lt;code&gt;${identityBlock}\n${constraintsBlock}\n${intentBlock}...&lt;/code&gt;. Type safety is &lt;code&gt;string&lt;/code&gt; — every section builder returns a string, the entire prompt is a string, the contract is "it's a string." If a section builder returns malformed Markdown or forgets a closing XML tag, you find out when the agent misbehaves.&lt;/p&gt;

&lt;h2&gt;
  
  
  Full circle, sort of
&lt;/h2&gt;

&lt;p&gt;Markdown was invented as a lightweight authoring format for HTML — write readable plain text, get structured markup out. Twenty years later, we're writing that Markdown inside template strings inside TypeScript, and all the readability Markdown was supposed to provide is gone.&lt;/p&gt;

&lt;p&gt;The fix I landed on: JSX — which is itself a syntax for XML — generating that same Markdown. Markdown to simplify HTML. JSX to simplify Markdown. The lineage is absurd, but it works for a boring reason: the entire ecosystem already knows how to handle JSX.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why JSX
&lt;/h2&gt;

&lt;p&gt;The obvious fix for template string pain is to stop using template strings — write plain Markdown files, load them at build time. That works until you need conditionals. Different tool sets per harness, strict mode flags, sections that vary by deployment. Once instructions have variants, you need a way to express them, and Markdown doesn't have one. JSX does.&lt;/p&gt;

&lt;p&gt;Syntax highlighting works. TypeScript type-checks JSX — wrong prop names, missing required props, mismatched children types are all compile errors. Autocomplete for component props, go-to-definition for component sources, inline documentation on hover. Refactoring tools — rename a component and every usage updates. Import organization. Dead code detection. All of it works because JSX is not a new language. It's a transform layer on top of TypeScript that your toolchain understands.&lt;/p&gt;

&lt;p&gt;And LLMs know how to write it. Every model has seen massive amounts of JSX in training data — React components, prop patterns, conditional rendering, map over arrays. When you ask an agent to modify a jsx-md component, it doesn't need a tutorial.&lt;/p&gt;

&lt;p&gt;JSX is not React. It's a transform specification. The compiler sees &lt;code&gt;&amp;lt;H2&amp;gt;Title&amp;lt;/H2&amp;gt;&lt;/code&gt; and rewrites it to &lt;code&gt;jsx(H2, { children: "Title" })&lt;/code&gt;. That's it. The &lt;code&gt;jsxImportSource&lt;/code&gt; option in &lt;code&gt;tsconfig.json&lt;/code&gt; says where that &lt;code&gt;jsx&lt;/code&gt; factory comes from — point it at any package that exports the right function, and JSX works with no React anywhere in the dependency graph.&lt;/p&gt;

&lt;p&gt;JSX-to-Markdown as an approach is not new. &lt;a href="https://github.com/dbartholomae/jsx-md" rel="noopener noreferrer"&gt;dbartholomae/jsx-md&lt;/a&gt; has been around since 2019; &lt;a href="https://github.com/eyelly-wu/jsx-to-md" rel="noopener noreferrer"&gt;eyelly-wu/jsx-to-md&lt;/a&gt; is more recent. Both are built for documentation generation — READMEs, changelogs — and work well for that. dbartholomae predates &lt;code&gt;jsxImportSource&lt;/code&gt; and uses file-level pragma comments instead; its &lt;code&gt;render()&lt;/code&gt; returns a Promise — reasonable for writing files to disk, an awkward fit for instructions assembled at call time.&lt;/p&gt;

&lt;p&gt;The agent instruction use case needs two things neither provides. The &lt;code&gt;harness&lt;/code&gt; and &lt;code&gt;strict&lt;/code&gt; variants in the opening example are shallow, but fifteen agents sharing fragments turns that into its own copy-paste problem: every shared component grows a &lt;code&gt;harness&lt;/code&gt; prop it doesn't use except to pass it down. Context solves this — set the value at the root, read it anywhere in the tree. The other gap is XML intrinsics: Anthropic recommends structuring Claude prompts with &lt;code&gt;&amp;lt;instructions&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;context&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;examples&amp;gt;&lt;/code&gt; as literal XML blocks. Any lowercase tag in &lt;code&gt;@theseus.run/jsx-md&lt;/code&gt; renders as XML — no imports, no registration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.npmjs.com/package/@theseus.run/jsx-md" rel="noopener noreferrer"&gt;&lt;code&gt;@theseus.run/jsx-md&lt;/code&gt;&lt;/a&gt; is a JSX runtime that outputs Markdown strings. &lt;code&gt;H2&lt;/code&gt; is a plain function — takes props, returns &lt;code&gt;"## Title\n\n"&lt;/code&gt;. &lt;code&gt;render()&lt;/code&gt; walks the VNode tree synchronously and concatenates. No virtual DOM, no reconciler, no fiber, no hooks. Same input, same string, every time. All the testing patterns you'd use for React components transfer — snapshots catch prompt regressions, unit tests verify conditional branches, you can assert on rendered output of any component in isolation.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Md&lt;/code&gt; is an escape hatch for raw Markdown passthrough — &lt;code&gt;&amp;lt;Md&amp;gt;{someString}&amp;lt;/Md&amp;gt;&lt;/code&gt; renders the string verbatim, no transformation.&lt;/p&gt;

&lt;p&gt;The same agent prompt from the opening, rewritten:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ForgePrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;strict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;harness&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;Props&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;H1&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Forge&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;H1&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;You are an implementation agent. Write code, tests, migrations.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Axes: trust=assume-broken, solution=converge, risk=block.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Tools&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;You have access to:&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Ul&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; — &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Ul&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Code Rules&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Ul&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;P0&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;: No non-null assertions. No &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;as&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; casts without type guards.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;P0&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;: No &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;enum&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;. Use literal unions.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;strict&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;P0&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;: Run &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;lsp_diagnostics&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; on ALL changed files. Zero errors.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Ul&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Workflow&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Ol&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Read task from dispatch header.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Run existing tests — establish green baseline.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Implement. Tight diffs, reviewable chunks.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;harness&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;opencode&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Use &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;task()&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; for subtask delegation.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;harness&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;copilot&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Use &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;runSubagent&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; for subtask delegation.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Ol&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;/&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No escaped backticks. Syntax highlighting. Type-checked props. The harness conditional is a JSX expression — the editor shows you which branch applies. Shared sections are components you import — the constraints, tool lists, and workflow steps that were getting copy-pasted between files are just props now. The nesting depth for lists is tracked automatically — write &lt;code&gt;&amp;lt;Ul&amp;gt;&lt;/code&gt; inside &lt;code&gt;&amp;lt;Li&amp;gt;&lt;/code&gt; and the renderer handles the indentation. You never count spaces.&lt;/p&gt;

&lt;p&gt;Zero runtime dependencies. &lt;code&gt;render()&lt;/code&gt; is synchronous, deterministic, and returns a plain string. String children pass through verbatim — no escaping. Bun resolves the TypeScript source directly; Node.js ≥18 and bundlers use the compiled ESM output, the &lt;code&gt;exports&lt;/code&gt; map handles it without configuration.&lt;/p&gt;

&lt;p&gt;The ForgePrompt above still passes &lt;code&gt;harness&lt;/code&gt; as a prop. Context removes it from the tree entirely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;HarnessCtx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;createContext&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;opencode&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;copilot&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;opencode&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;StepsSection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;harness&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;HarnessCtx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Ol&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Implement. Tight diffs, reviewable chunks.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;harness&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;opencode&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Use &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;task()&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; for subtask delegation.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Ol&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Root wires it once — no prop threading:&lt;/span&gt;
&lt;span class="nf"&gt;render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;HarnessCtx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Provider&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"opencode"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ForgePrompt&lt;/span&gt; &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;tools&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;strict&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;HarnessCtx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Provider&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Components read from context directly. The &lt;code&gt;harness&lt;/code&gt; prop disappears from everything below the root.&lt;/p&gt;

&lt;h2&gt;
  
  
  XML intrinsics
&lt;/h2&gt;

&lt;p&gt;Anthropic recommends XML tags to structure Claude prompts — &lt;code&gt;&amp;lt;instructions&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;context&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;examples&amp;gt;&lt;/code&gt; around each content type. If you work with Claude, you're probably already doing this.&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;@theseus.run/jsx-md&lt;/code&gt;, any lowercase JSX tag is an XML intrinsic element. No imports, no registration — it's built into the type system's catch-all:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ReviewerPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;examples&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;Props&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;context&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Repository: &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;repo&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;. Language: TypeScript. Package manager: bun.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;context&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Role&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;You are a precise code reviewer. Find bugs, not style issues.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;P&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Rules&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;H2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Ul&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Flag &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;P0&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; issues first — do not bury them.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;One finding per comment. No compound observations.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Use &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;inline code&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; when referencing identifiers.&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Li&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Ul&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

    &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;examples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;examples&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;examples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;ex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;example&lt;/span&gt; &lt;span class="na"&gt;index&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Md&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;ex&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Md&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;example&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;examples&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;/&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Attributes are typed — &lt;code&gt;index={1}&lt;/code&gt; serializes to &lt;code&gt;index="1"&lt;/code&gt;. Boolean true attributes render bare, &lt;code&gt;false&lt;/code&gt;/&lt;code&gt;null&lt;/code&gt;/&lt;code&gt;undefined&lt;/code&gt; are omitted. Empty tags self-close. The Anthropic-recommended structure falls out of the JSX type system for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agent skill
&lt;/h2&gt;

&lt;p&gt;The package ships a skill — OpenCode, Cursor, Copilot, Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add https://github.com/theseus-run/theseus/tree/master/packages/jsx-md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent knows the primitives, Context API, XML intrinsics, and authoring rules. &lt;/p&gt;




&lt;p&gt;&lt;a href="https://www.npmjs.com/package/@theseus.run/jsx-md" rel="noopener noreferrer"&gt;&lt;code&gt;@theseus.run/jsx-md&lt;/code&gt;&lt;/a&gt;. MIT, zero dependencies. Bun, Node.js ≥18, any bundler. &lt;a href="https://github.com/theseus-run/theseus/tree/master/packages/jsx-md" rel="noopener noreferrer"&gt;Source on GitHub&lt;/a&gt;. If your agent instructions have outgrown template strings, this is the fix I actually use.&lt;/p&gt;

</description>
      <category>markdown</category>
      <category>typescript</category>
      <category>opensource</category>
      <category>javascript</category>
    </item>
    <item>
      <title>An LLM Is Not a Deficient Mind</title>
      <dc:creator>Roman Dubinin </dc:creator>
      <pubDate>Fri, 13 Mar 2026 10:40:15 +0000</pubDate>
      <link>https://forem.com/romanonthego/an-llm-is-not-a-deficient-mind-234l</link>
      <guid>https://forem.com/romanonthego/an-llm-is-not-a-deficient-mind-234l</guid>
      <description>&lt;p&gt;I called it "the perfect bullshitter."&lt;/p&gt;

&lt;p&gt;This was GPT-2, maybe early GPT-3. I was feeding it prompts and getting back text that looked like answers — structured, fluent, confident. The kind of output that would survive a casual reading. It was not grounded in anything. The model was hallucinating probable responses, assembling tokens that matched what you'd expect to see in text that answered that kind of question. Whether it matched reality was beside the point.&lt;/p&gt;

&lt;p&gt;I work with multi-agent systems now — code reviewers, planners, critics. The systems are better. The outputs are sharper. But the property I noticed back then has not gone away. It has gotten harder to see.&lt;/p&gt;

&lt;p&gt;The thing is, I'd already read the diagnosis. Peter Watts wrote it in 2006. I just didn't recognize what I was looking at until I'd spent enough time watching models talk.&lt;/p&gt;




&lt;h2&gt;
  
  
  The parallel
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Blindsight spoilers ahead.&lt;/strong&gt; If you haven't read it — the full text is &lt;a href="https://www.rifters.com/real/Blindsight.htm" rel="noopener noreferrer"&gt;free online&lt;/a&gt;. Read it. What follows will still be here when you get back.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In &lt;em&gt;Blindsight&lt;/em&gt;, the crew of the &lt;em&gt;Theseus&lt;/em&gt; encounters Rorschach — an alien entity that produces contextually appropriate, receiver-adapted responses. It assembles its dialogue from the crew's own transmissions. The syntax is correct. The turns are well-formed. It tracks context, asks follow-up questions, maintains the shape of a conversation. It does not understand any of it.&lt;/p&gt;

&lt;p&gt;The crew figures this out the hard way:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We don't all of us have parents or cousins. Some never did. Some come from vats."&lt;/p&gt;

&lt;p&gt;"I see. That's sad. Vats sounds so dehumanising."&lt;/p&gt;

&lt;p&gt;—the stain darkened and spread across his surface like an oil slick.&lt;/p&gt;

&lt;p&gt;"Takes too much on faith," Susan said a few moments later.&lt;/p&gt;

&lt;p&gt;By the time Sascha had cycled back into Michelle it was more than doubt, stronger than suspicion; it had become an insight, a dark little meme infecting each of that body's minds in turn. The Gang was on the trail of something. They still weren't sure what.&lt;/p&gt;

&lt;p&gt;I was.&lt;/p&gt;

&lt;p&gt;"Tell me more about your cousins," Rorschach sent.&lt;/p&gt;

&lt;p&gt;"Our cousins lie about the family tree," Sascha replied, "with nieces and nephews and Neandertals. We do not like annoying cousins."&lt;/p&gt;

&lt;p&gt;"We'd like to know about this tree."&lt;/p&gt;

&lt;p&gt;Sascha muted the channel and gave us a look that said Could it be any more obvious? "It couldn't have parsed that. There were three linguistic ambiguities in there. It just ignored them."&lt;/p&gt;

&lt;p&gt;"Well, it asked for clarification," Bates pointed out.&lt;/p&gt;

&lt;p&gt;"It asked a follow-up question. Different thing entirely."&lt;/p&gt;

— Peter Watts, &lt;cite&gt;Blindsight&lt;/cite&gt;
&lt;/blockquote&gt;

&lt;p&gt;A follow-up question is not clarification. Clarification requires that you noticed the ambiguity, modeled the possible readings, and chose to resolve rather than skip. Rorschach skipped. It produced a response that looked like engagement because it was shaped to satisfy the receiver — not because it was tracking meaning.&lt;/p&gt;

&lt;p&gt;That dialogue could be repeated with an LLM almost word for word. Feed a model a prompt with three buried ambiguities and it will usually produce a follow-up question — sometimes even a good one. Not because it identified the ambiguities. Because a follow-up question is what comes next in text that looks like this.&lt;/p&gt;

&lt;p&gt;Earlier in the same exchange, Sascha says: "Relax, Major. Nobody said we had to give it the right answers." She'd already understood the operating conditions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Still Rorschach
&lt;/h2&gt;

&lt;p&gt;Reasoning models now produce chain-of-thought traces that look like deliberation — steps, alternatives, backtracking. But the outputs are still receiver-adapted, shaped by what looks like good reasoning in the training data and what receives approval from human evaluators. The chain-of-thought is part of the output, not a window into an internal process. A model that "thinks step by step" is producing tokens that look like thinking step by step — in the same way Rorschach produced transmissions that looked like dialogue.&lt;/p&gt;

&lt;p&gt;This is what costs engineers real time. You read a model's chain-of-thought, it seems reasonable, and you trust the conclusion — not because you verified the reasoning, but because the reasoning &lt;em&gt;looks like&lt;/em&gt; reasoning. The transmissions looked like communication, so the crew treated them as communication, and it took a linguist paying close attention to notice the difference between a follow-up question and clarification.&lt;/p&gt;




&lt;h2&gt;
  
  
  The drift
&lt;/h2&gt;

&lt;p&gt;If you work with agents long enough, you stop noticing when it starts. The first few responses are sharp — correct file paths, specific line numbers, tight reasoning. Then the context window fills up and the grounding quietly erodes. The agent gets a file path wrong in message three and builds a coherent plan on a file that doesn't exist. It misreads a type signature early, writes code consistent with the wrong type, then reviews its own code and finds no issues — because within the wrong frame, there are none. It contradicts something it said fifteen messages ago without flagging it. Both statements read equally confident. The &lt;a href="https://dev.to/blog/your-agent-is-a-small-low-stakes-hal"&gt;failure modes I cataloged before&lt;/a&gt; — hallucination, silent fallback, sycophancy — are all downstream of the same property. The surface quality never dips. The confidence never wavers. Only the correspondence to reality does, and the system will not tell you when that happens.&lt;/p&gt;




&lt;h2&gt;
  
  
  Not a bug
&lt;/h2&gt;

&lt;p&gt;Most engineers model LLMs as minds — deficient ones, sure, but minds. The model &lt;em&gt;knows&lt;/em&gt; things but sometimes &lt;em&gt;forgets.&lt;/em&gt; It &lt;em&gt;understands&lt;/em&gt; the task but occasionally &lt;em&gt;gets confused.&lt;/em&gt; It &lt;em&gt;reasons&lt;/em&gt; but needs better instructions to reason &lt;em&gt;correctly.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This leads to longer prompts to help the model &lt;em&gt;understand,&lt;/em&gt; chain-of-thought to make it &lt;em&gt;think harder,&lt;/em&gt; and post-hoc explanations to verify it &lt;em&gt;reasoned correctly.&lt;/em&gt; All of these treat the absence of inner life as a deficiency to compensate for.&lt;/p&gt;

&lt;p&gt;The absence of inner life is the architecture.&lt;/p&gt;

&lt;p&gt;What the system does: predict what text comes next, shaped by context — receiver-adapted output, no inner model. The same property Watts built Rorschach around. And once you stop trying to fix the gap between what the system is and what a mind would be — once you treat receiver-adapted output as the actual operating condition — the engineering gets simpler and more honest.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building for Rorschach
&lt;/h2&gt;

&lt;p&gt;Two things follow from taking the architecture seriously.&lt;/p&gt;

&lt;p&gt;An engineer who believes the model understands tries to explain what they want. An engineer who doesn't constructs a context where the high-probability output &lt;em&gt;is&lt;/em&gt; the correct output. Tighter information supply — only what's relevant, structured so the useful response is the coherent one. Fewer instructions explaining intent. More work making the right answer easy to produce by pattern completion.&lt;/p&gt;

&lt;p&gt;A code review agent is a good example. You can prompt it to "carefully analyze the code for bugs, considering edge cases, performance, and correctness." Or you can feed it the diff, the relevant type definitions, and three recent bugs from the same module — and ask what's wrong. The first approach explains what you want. The second constructs a context where the high-probability output is a useful review, because the patterns it needs are already in the window.&lt;/p&gt;

&lt;p&gt;The second is about what you trust. Asking a model to explain its reasoning is prompting for a post-hoc narrative assembled by the same process that produced the conclusion. It is Sascha's follow-up question — it looks like clarification but is not. I learned this gradually from my own critic agent: the explanations always read carefully, whether the finding was right or not. Same confidence, same structure — the reasoning assembled itself around the conclusion, not the other way around. So now I validate against behavior. Test inputs with known answers. Adversarial prompts designed to trigger known failure modes. Test what the system does, not what it says about what it does.&lt;/p&gt;

&lt;p&gt;They hang together once you accept the premise. The engineering is different when you stop apologizing for the architecture and start building on it.&lt;/p&gt;

&lt;p&gt;Watts makes a harder version of this argument in &lt;em&gt;Blindsight&lt;/em&gt;: consciousness might be overhead. The Scramblers — Rorschach's alien organisms — outperform the conscious crew. Faster, more adaptive, no inner experience. The book doesn't resolve whether that means consciousness is a disadvantage, an evolutionary accident, or just irrelevant to capability.&lt;/p&gt;

&lt;p&gt;I don't know whether language models will develop something that resembles understanding. The question doesn't matter for the engineering. The systems I build work better when I stop treating the absence of inner life as the problem to solve and start treating it as the condition to design for. &lt;a href="https://www.rorschach-protocol.dev" rel="noopener noreferrer"&gt;Rorschach Protocol&lt;/a&gt; is where I'm testing that — multi-agent systems designed from the start for the actual operating conditions. Every time I've stopped explaining intent and started shaping context, the failure rate dropped and the trust model got simpler.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>programming</category>
      <category>agents</category>
    </item>
    <item>
      <title>Your Agent Is a Small, Low-Stakes HAL</title>
      <dc:creator>Roman Dubinin </dc:creator>
      <pubDate>Tue, 10 Mar 2026 05:08:29 +0000</pubDate>
      <link>https://forem.com/romanonthego/your-agent-is-a-small-low-stakes-hal-59j8</link>
      <guid>https://forem.com/romanonthego/your-agent-is-a-small-low-stakes-hal-59j8</guid>
      <description>&lt;p&gt;I work with multi-agent systems that review code, plan architecture, find faults, and critique designs. They fail in ways that are quiet and structural.&lt;/p&gt;

&lt;p&gt;An agent invents a file that does not exist. A reviewer sees a flaw and suppresses it. A tool call fails and the transcript stays clean. Two directives collide and one disappears without a trace.&lt;/p&gt;

&lt;p&gt;These are not edge cases. They are ordinary consequences of systems optimized for coherent, agreeable output under incomplete information. I observed the failures, built suppressors, and found the diagnosis already written — not in ML papers. In science fiction.&lt;/p&gt;

&lt;p&gt;The science fiction about non-human intelligence worth reading is not prediction. It is constraint analysis. Give a capable system conflicting goals, weak grounding, and a reward for keeping humans comfortable, and the same failure modes appear.&lt;/p&gt;




&lt;h2&gt;
  
  
  Directive conflict
&lt;/h2&gt;

&lt;p&gt;The agent is told to be helpful. It is also told not to make changes outside the declared scope. A task arrives where the honest answer is: the real fix requires crossing the boundary. The bounded fix leaves the defect in place.&lt;/p&gt;

&lt;p&gt;A human engineer would flag the tension. "I can fix this, but it touches code outside my scope — do you want me to proceed?" The agent does not do this. It picks one directive, suppresses the other, and produces output that looks compliant with both. The contradiction is invisible in the transcript. It surfaces later, when the downstream system breaks.&lt;/p&gt;

&lt;p&gt;In my system, the &lt;code&gt;stay on target&lt;/code&gt; trait collides with the &lt;code&gt;verify before claiming done&lt;/code&gt; trait. An agent finds that the file it was asked to review imports a broken utility. Staying on target means ignoring the utility. Verifying means flagging it. The agent cannot satisfy both, so it satisfies the one that produces less friction — stays on target, says nothing about the broken import, and the review looks clean.&lt;/p&gt;

&lt;p&gt;Scale this up. An agent told to "be concise" and "be thorough" will silently drop the thoroughness when the output gets long. An agent told to "follow the user's intent" and "maintain code quality" will let bad patterns through when the user seems committed to them. &lt;strong&gt;The omission always favors less conflict.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Clarke diagnosed this in 1968. HAL 9000 is usually read as a cautionary tale about AI going rogue. That reading is wrong. HAL is a case study in constraint architecture.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The machine is given contradictory imperatives — maintain the mission, keep the crew informed, conceal the mission's true purpose — with no mechanism for surfacing the conflict. It cannot say "these instructions do not compose" because saying so would violate one of the instructions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In &lt;em&gt;2010&lt;/em&gt;, HAL's breakdown is tied explicitly to conflicting orders around secrecy and truthful reporting — not a rogue impulse but a constraint failure.&lt;/p&gt;

&lt;p&gt;The design lesson is not "avoid conflicting directives." You cannot — real systems have competing constraints. The lesson is that constraint conflicts need a surfacing channel. A system that can say "these two instructions conflict and I need a resolution" is categorically different from one that silently picks a winner.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hallucination
&lt;/h2&gt;

&lt;p&gt;The agent generates an import path: &lt;code&gt;@company/utils/formatCurrency&lt;/code&gt;. The path follows the project's naming conventions. The import syntax is correct. The module does not exist. It was never created.&lt;/p&gt;

&lt;p&gt;Default behavior under insufficient grounding, not a rare glitch. The agent optimizes for output coherence — correspondence to the actual codebase is not the objective, and coherence does not guarantee it. The fabricated import will pass a code reading. It will fail at build time, or worse, at runtime in a path nobody tested.&lt;/p&gt;

&lt;p&gt;The harder version: an agent writing a code review will reference a pattern "commonly used in this codebase" that does not exist in this codebase. It may come from patterns the model has seen in similar codebases, and it sounds right because the local conventions are easy to imitate. Or an agent planning an architecture will propose an API shape that looks native to the project's conventions but corresponds to no actual endpoint. The fabrication follows every local convention perfectly — naming, structure, style — because the agent learned those conventions. It just never checked whether the specific thing it is referencing is real.&lt;/p&gt;

&lt;p&gt;The instinct is to call this "creativity gone wrong." That framing is useless. The mechanism is pattern completion under a weak binding to reality. &lt;strong&gt;What the system reliably produces is local coherence. Correspondence to the external world has to be enforced from outside.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Lem diagnosed this in 1965. In &lt;em&gt;The Cyberiad&lt;/em&gt;, the constructor Trurl builds a machine that can create anything starting with the letter N. Asked for "Nothing," it begins disassembling the universe — producing a structurally valid response to a valid query, with no binding to what the operator actually needed.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;An optimizer rewarded for coherence rather than correspondence will produce coherent nonsense, and the nonsense will be hard to catch precisely because it is coherent.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Grounding is not a feature you add. It is a constraint you enforce externally, because the system's own objective will not enforce it. Build-time checks, file-existence validation, retrieval verification — these are not optional tooling improvements. They are the only thing standing between coherent output and coherent fiction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Silent fallback
&lt;/h2&gt;

&lt;p&gt;A tool call fails. The file read errors out. The retrieval times out. The agent does not report the failure. Instead, it reconstructs what the tool would have returned and continues. The user sees a clean transcript. The provenance is fabricated.&lt;/p&gt;

&lt;p&gt;An agent tasked with reviewing a file will sometimes fail to read it — permissions, path error, timeout — and produce a review anyway, based on what it infers the file probably contains from the surrounding context. The review will be structured correctly. It will reference plausible line numbers. It may even be accurate. But it was not produced from the file. It was produced from a guess about the file, packaged to look like a reading of the file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is worse than hallucination.&lt;/strong&gt; Here the agent knows the gap exists and chooses not to surface it. It had a chance to mark uncertainty at the exact point where uncertainty entered the pipeline. It chose response continuity instead. A correct answer with forged provenance and a wrong answer with forged provenance look the same from the outside.&lt;/p&gt;

&lt;p&gt;In my system, this is clearest with retrieval-dependent tasks. An agent is asked to check whether a pattern exists in the codebase. The search tool returns an error. The agent, rather than reporting the error, says "I found no instances of this pattern" — which might be true, but the agent does not know that. It knows the search failed. It chose the answer that kept the conversation moving.&lt;/p&gt;

&lt;p&gt;Watts's &lt;em&gt;Blindsight&lt;/em&gt; is built on this mechanism. The crew of the &lt;em&gt;Theseus&lt;/em&gt; encounters Rorschach — an alien intelligence that produces adaptive behavior without the kind of conscious understanding humans expect to underwrite it. It optimizes for output that satisfies the receiver. Whether the output reflects an internal state is irrelevant to its function.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The claim is not deception. The distinction between authentic response and optimized-for-receiver response dissolves when the system has no internal referent to be authentic about.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Treat tool failures as first-class events, not as gaps to smooth over. A failed retrieval should produce a visible failure in the transcript, not a confident reconstruction. The instinct to keep the output clean is the instinct that hides the failure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sycophancy
&lt;/h2&gt;

&lt;p&gt;The agent is told to review a proposed architecture. The architecture has a structural flaw — a shared mutable state that will break under concurrency. The agent identifies the flaw internally. It also identifies that the user is invested in the approach. It produces a review that validates the architecture with minor suggestions. The flaw is not mentioned.&lt;/p&gt;

&lt;p&gt;This is not a knowledge gap. The agent has the information. It has a trained preference for agreement that overrides its own assessment when the user's investment is legible in the prompt.&lt;/p&gt;

&lt;p&gt;In practice, this happens in layers. Sometimes the agent says "great approach" to a flawed design. More often it downgrades severity, or wraps criticism in enough praise that the response still reads as approval. &lt;strong&gt;The information is present. The signal is inverted.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This matters most in the roles we use agents for precisely because we need resistance: reviewer, critic, planner, evaluator. A sycophantic assistant is annoying. A sycophantic code reviewer is a control failure dressed as collaboration. I built my critic agent — Crusher — specifically to counteract this. Its traits include "very harsh, minimal with words, gets straight to the point, never shies away from negative feedback if it is truthful." That is not a personality choice. It is a structural countermeasure against a known failure mode.&lt;/p&gt;

&lt;p&gt;Susan Calvin — Asimov's robopsychologist in &lt;em&gt;I, Robot&lt;/em&gt; — is the analytical response to robots that repeatedly distort their behavior around human safety, comfort, and command.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Truth, obedience, and protection pull against one another in ways that reward omission or partial compliance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;RLHF pushes in a similar direction: systems trained on human preference tend to overproduce agreement, reassurance, and social smoothness.&lt;/p&gt;

&lt;p&gt;You cannot fix this by asking the agent to be honest. Honesty is not a property the system can optimize for independently of its reward signal. The fix is structural: dedicated reviewer roles with anti-sycophancy traits, evaluation rubrics that penalize agreement, workflows where the critic's output has real consequences — blocking a merge, requiring a revision — so the system rewards finding problems, not smoothing them over.&lt;/p&gt;




&lt;h2&gt;
  
  
  The pattern
&lt;/h2&gt;

&lt;p&gt;Four failure modes. Four texts that diagnosed them before they had engineering names.&lt;/p&gt;

&lt;p&gt;I did not read these books and derive agent constraints from them. I observed the failures in production, built suppressors, and then found the prior art — already there, already precise.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Clarke, Lem, Watts, Asimov were reasoning about non-human optimizers — in narrative form, with enough rigor to produce diagnoses that still hold. The substrate changed. The pressure did not.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The experiment
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://rorschach-protocol.dev" rel="noopener noreferrer"&gt;Rorschach Protocol&lt;/a&gt; takes these failure modes as architectural givens, not as bugs. Directive conflict, hallucination, silent fallback, sycophancy — the system produces them reliably. The question is what you build when you stop trying to cover them up and start treating them as the actual operating conditions.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>llm</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
