<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: yossi</title>
    <description>The latest articles on Forem by yossi (@yoshiakist).</description>
    <link>https://forem.com/yoshiakist</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3782983%2Fccea90fe-bcf5-4eda-b7d6-f8d0522642d8.jpeg</url>
      <title>Forem: yossi</title>
      <link>https://forem.com/yoshiakist</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/yoshiakist"/>
    <language>en</language>
    <item>
      <title>How I Launched a Steam Store Page in 10 Days using Spec-Driven Development (SDD)</title>
      <dc:creator>yossi</dc:creator>
      <pubDate>Thu, 19 Mar 2026 21:56:14 +0000</pubDate>
      <link>https://forem.com/yoshiakist/how-i-launched-a-steam-store-page-in-10-days-using-spec-driven-development-sdd-1b2n</link>
      <guid>https://forem.com/yoshiakist/how-i-launched-a-steam-store-page-in-10-days-using-spec-driven-development-sdd-1b2n</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In the early hours of this morning, I finally hit "Publish" on the Steam store page for my solo project, &lt;a href="https://store.steampowered.com/app/4526000/LOGOMANCY/" rel="noopener noreferrer"&gt;&lt;strong&gt;LOGOMANCY&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LOGOMANCY&lt;/strong&gt; is a word game that mashes together typing, physics puzzles, and magic. Imagine Suika Game meets Mojipittan (a Japanese word-building classic): you type words, letter blocks drop with physics, and forming valid words triggers spells to clear the board.&lt;/p&gt;

&lt;p&gt;It took exactly &lt;strong&gt;10 days&lt;/strong&gt; from the initial concept to a public Steam page. In that window, I managed to handle graphics, sound, and build a functional vertical slice good enough for a trailer. The secret sauce? A &lt;strong&gt;Spec-Driven Development (SDD)&lt;/strong&gt; cycle powered by a tool I built called &lt;strong&gt;specre&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this post, I want to share how SDD—especially when paired with Godot Engine and AI agents—can supercharge solo game development.&lt;/p&gt;

&lt;h3&gt;
  
  
  specre on GitHub (Tutorial Included)
&lt;/h3&gt;

&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/yoshiakist" rel="noopener noreferrer"&gt;
        yoshiakist
      &lt;/a&gt; / &lt;a href="https://github.com/yoshiakist/specre" rel="noopener noreferrer"&gt;
        specre
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Atomic, living specification cards for AI-agent-friendly development. Minimal, agnostic, and traceable.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;&lt;a href="https://github.com/yoshiakist/specre/README.md" rel="noopener noreferrer"&gt;English&lt;/a&gt; | &lt;a href="https://github.com/yoshiakist/specre/README.ja.md" rel="noopener noreferrer"&gt;日本語&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;specre&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Atomic, living specification cards for AI-agent-friendly development.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;specre ( /spékré/ ) is a minimal specification format and toolkit for Spec-Driven Development (SDD). Each specre is a single Markdown file describing exactly one behavior, with machine-readable front-matter for lifecycle tracking and agent navigation.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;The Problem&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;Specifications are essential for keeping development intent visible and traceable. But in practice, they rot:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Specs drift from code in silence.&lt;/strong&gt; No one notices when an implementation diverges from its specification — until the next developer (or AI agent) builds on stale assumptions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monolithic specs waste AI context.&lt;/strong&gt; Large specification documents force agents to parse entire features just to understand a single behavior, consuming the finite context window that should be reserved for code and tests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Small changes never get specced.&lt;/strong&gt; When the cost of writing a specification is high, only greenfield features get documented. Bug fixes, refactors, and incremental changes slip…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/yoshiakist/specre" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;




&lt;h2&gt;
  
  
  What is Spec-Driven Development (SDD)?
&lt;/h2&gt;

&lt;p&gt;If TDD (Test-Driven Development) is about writing tests first to ensure code behaves correctly, &lt;strong&gt;SDD (Spec-Driven Development)&lt;/strong&gt; is the "upper layer." It’s about writing the specification first to clarify what you are building before a single line of logic exists.&lt;/p&gt;

&lt;p&gt;Standard design docs often become massive, monolithic files in Google Docs or Confluence that start "rotting" the moment they are written. As implementation progresses, the spec and the code drift apart until the documentation is useless.&lt;/p&gt;

&lt;p&gt;SDD solves this by managing specifications in the &lt;strong&gt;same lifecycle as your code&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  specre: An Atomic Specification Format
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/yoshiakist/specre" rel="noopener noreferrer"&gt;specre&lt;/a&gt; is a lightweight toolkit designed for SDD. It’s built on five core pillars:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Atomic Granularity (One File, One Behavior)&lt;/strong&gt;&lt;br&gt;
In &lt;code&gt;specre&lt;/code&gt;, one Markdown file describes exactly one behavior. Not a "feature" like "Game Over," but a specific behavior: "When letters connected to the floor exceed the deadline, trigger Game Over and display the UI." This granularity is what makes collaboration with AI agents actually work.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Context Window Optimization&lt;/strong&gt;&lt;br&gt;
LLM context windows are finite. Passing a 50-page design doc wastes tokens and confuses the agent. Because &lt;code&gt;specre&lt;/code&gt; is atomic, the AI only needs to read the spec for the specific behavior it’s currently implementing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Living Specs&lt;/strong&gt;&lt;br&gt;
Each card tracks its &lt;code&gt;status&lt;/code&gt; (&lt;code&gt;draft&lt;/code&gt; → &lt;code&gt;in-development&lt;/code&gt; → &lt;code&gt;stable&lt;/code&gt; → &lt;code&gt;deprecated&lt;/code&gt;) and a &lt;code&gt;last_verified&lt;/code&gt; date. This makes "spec rot" visible and actionable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Process Agnostic&lt;/strong&gt;&lt;br&gt;
It’s a format convention, not a rigid workflow. It works whether you use TDD, BDD, or just "vibe-based" coding.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tool Agnostic&lt;/strong&gt;&lt;br&gt;
It’s just Markdown with YAML Front-matter. No proprietary IDE or SaaS required. Use Git, use your favorite editor, and stay in the flow.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
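
&lt;p&gt;To make the "Living Specs" idea concrete, here is a minimal Python sketch that reads a card's front-matter and sorts cards so the least recently verified come first. It is not part of specre itself: the parser only handles flat &lt;code&gt;key: "value"&lt;/code&gt; lines, and the function names &lt;code&gt;parse_front_matter&lt;/code&gt; and &lt;code&gt;oldest_first&lt;/code&gt; are illustrative assumptions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from datetime import date

# A trimmed card using the front-matter fields described in this post.
CARD = '''---
id: "01KKZD4XXEH1M3AAGKSKZRV23Q"
name: "Display 'New Record' during Game Over if the score is a high score"
status: "stable"
last_verified: "2026-03-18"
---

## Feature Overview
...
'''


def parse_front_matter(text):
    """Return the front-matter of one card as a dict of strings.

    Assumption: only flat key: "value" lines appear between the --- lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter of the front-matter
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return fields


def oldest_first(cards):
    """Sort parsed cards so the least recently verified come first."""
    return sorted(cards, key=lambda c: date.fromisoformat(c["last_verified"]))


card = parse_front_matter(CARD)
print(card["status"], card["last_verified"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Sorting by &lt;code&gt;last_verified&lt;/code&gt; is one simple way to surface the cards most likely to have drifted. A real check could also filter on &lt;code&gt;status&lt;/code&gt;.&lt;/p&gt;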

&lt;h2&gt;
  
  
  The SDD Cycle in LOGOMANCY
&lt;/h2&gt;

&lt;p&gt;To give you a better idea, here is a simplified version of a real &lt;code&gt;specre&lt;/code&gt; card from the project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;01KKZD4XXEH1M3AAGKSKZRV23Q"&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Display&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;'New&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Record'&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;during&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Game&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Over&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;if&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;score&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;is&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;high&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;score"&lt;/span&gt;
&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stable"&lt;/span&gt;
&lt;span class="na"&gt;last_verified&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-03-18"&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gu"&gt;## Related Files&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`logomancy_godot/src/stage/game_over_score_display.gd`&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="sb"&gt;`logomancy_godot/tests/unit/stage/test_game_over_score_display.gd`&lt;/span&gt;

&lt;span class="gu"&gt;## Feature Overview&lt;/span&gt;
After the score count-up finishes on the Game Over screen:
&lt;span class="p"&gt;1.&lt;/span&gt; "HIGH SCORE: 000000" slides in from the bottom.
&lt;span class="p"&gt;2.&lt;/span&gt; If the final score exceeds the previous high score, a "NEW RECORD!" label fades in, centered below the high score.

&lt;span class="gu"&gt;## Design Intent&lt;/span&gt;
By clearly showing the player they’ve improved, we increase the sense of growth and the motivation to "try just one more time."

&lt;span class="gu"&gt;## Scenarios&lt;/span&gt;
&lt;span class="gu"&gt;### Happy Path: High score update&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Final score &amp;gt; saved high score.
&lt;span class="p"&gt;2.&lt;/span&gt; High score is saved to disk.
&lt;span class="p"&gt;3.&lt;/span&gt; HIGH SCORE line appears with the new value.
&lt;span class="p"&gt;4.&lt;/span&gt; "NEW RECORD!" text fades in.
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two key takeaways here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Subject-Predicate Title: The filename is the behavior.&lt;/li&gt;
&lt;li&gt;The "Intent" Field: This explains the value for the player. If you let an AI write this, it often makes excuses about library limitations. Keeping the "Intent" focused on player value ensures the implementation remains "contractually" sound.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The 4-Phase Workflow
&lt;/h3&gt;

&lt;p&gt;I used Claude Code with a custom SDD workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Analysis&lt;/strong&gt;: Search existing cards to understand adjacent behaviors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spec Creation&lt;/strong&gt;: Write the new behavior as a &lt;code&gt;specre&lt;/code&gt; card.

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Checkpoint&lt;/em&gt;: Human reviews the spec.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Test-First Implementation&lt;/strong&gt;: Write tests based on the scenarios, then implement the code.&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Maintenance&lt;/strong&gt;: Update &lt;code&gt;status&lt;/code&gt; to &lt;code&gt;stable&lt;/code&gt; and commit.&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The beauty of this is that the &lt;strong&gt;human only needs to review the spec&lt;/strong&gt;. If the spec is right, the AI can handle the heavy lifting of implementation and testing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Godot is the Perfect Match for SDD
&lt;/h2&gt;

&lt;p&gt;I’ve found that the &lt;code&gt;specre&lt;/code&gt; × SDD cycle is particularly effective with Godot Engine for three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GDScript Readability&lt;/strong&gt;&lt;br&gt;
GDScript’s Python-like syntax is incredibly concise. When an AI generates code from a &lt;code&gt;specre&lt;/code&gt; card, it doesn't waste the context window on C++ boilerplate. You get pure behavior.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scene Tree &amp;amp; Responsibility Segregation&lt;/strong&gt;&lt;br&gt;
Godot’s Node system encourages separating concerns. A script like &lt;code&gt;game_over_score_display.gd&lt;/code&gt; has exactly one job, which maps 1:1 to a single &lt;code&gt;specre&lt;/code&gt; card.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GUT (Godot Unit Testing) Integration&lt;/strong&gt;&lt;br&gt;
I use the GUT framework. Since &lt;code&gt;specre&lt;/code&gt; scenarios are written step-by-step, they translate almost directly into GUT test methods. This mapping makes it much harder for the AI to "hallucinate" incorrect logic.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Note: While GDScript is dynamic, having a suite of tests derived from specs provides the safety net needed for major refactoring.&lt;/p&gt;
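
&lt;p&gt;As a rough illustration of how scenarios map onto GUT test methods, the Python sketch below turns each scenario heading in a card into an empty GDScript function whose name starts with &lt;code&gt;test_&lt;/code&gt;, the prefix GUT uses to find tests. The helper names are hypothetical and this is not how specre or my workflow generates tests. It only shows the shape of the mapping.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re


def scenario_titles(card_text):
    """Return the text of every level-3 heading (one per scenario)."""
    return [
        line[4:].strip()
        for line in card_text.splitlines()
        if line.startswith("### ")
    ]


def to_test_name(title):
    """Turn a scenario title into a snake_case name with GUT's test_ prefix."""
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return f"test_{slug}"


def gdscript_stubs(card_text):
    """Emit one empty GDScript test function per scenario (hypothetical helper)."""
    return "\n\n".join(
        f"func {to_test_name(title)}():\n\tpass  # TODO: assert the scenario steps"
        for title in scenario_titles(card_text)
    )


card = (
    "## Scenarios\n"
    "### Happy Path: High score update\n"
    "1. Final score exceeds the saved high score.\n"
)
print(gdscript_stubs(card))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each numbered step in a scenario would then become one or more assertions inside the generated function.&lt;/p&gt;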

&lt;h3&gt;
  
  
  Lessons from 10 Days of Sprints
&lt;/h3&gt;

&lt;p&gt;The "Project Map" Effect&lt;br&gt;
By categorizing specre cards into domains, the project structure becomes visible:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;letter/&lt;/code&gt; (9 cards): Input, kerning, dictionary validation.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;stage/&lt;/code&gt; (20+ cards): Game loop, physics, level progression.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;magic/&lt;/code&gt; (8 cards): Elemental effects, attraction logic.&lt;/li&gt;
&lt;li&gt;...and so on.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The number of cards in a directory tells you exactly where the complexity of your game lies.&lt;/p&gt;
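
&lt;p&gt;Counting cards per domain takes only a few lines. The sketch below assumes the cards are &lt;code&gt;.md&lt;/code&gt; files stored under one root directory with a sub-directory per domain. That layout and the function name &lt;code&gt;cards_per_domain&lt;/code&gt; are assumptions for illustration, not a specre command.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from collections import Counter
from pathlib import Path


def cards_per_domain(spec_root):
    """Count Markdown cards under each top-level domain directory."""
    root = Path(spec_root)
    counts = Counter()
    for card in root.rglob("*.md"):
        relative = card.relative_to(root)
        if len(relative.parts) == 1:
            continue  # a file directly under the root belongs to no domain
        counts[relative.parts[0]] += 1
    return counts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Calling &lt;code&gt;cards_per_domain("specs").most_common()&lt;/code&gt;, where &lt;code&gt;"specs"&lt;/code&gt; is a placeholder path, lists the domains from most to fewest cards.&lt;/p&gt;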

&lt;h3&gt;
  
  
  Lowering the Cost of Specs
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;specre&lt;/code&gt; README says: "&lt;em&gt;Make specifications so small and so cheap to write that there is no reason to skip them.&lt;/em&gt;"&lt;/p&gt;

&lt;p&gt;Traditional docs fail because they are "expensive." If writing a spec takes 1–2 minutes (with an AI drafting it for you), it’s faster to write the spec than it is to debug a feature you didn't think through.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Reaching a Steam-ready state in 10 days wasn't about "crunching"; it was about &lt;strong&gt;clarity of intent&lt;/strong&gt;. &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Atomic Specs&lt;/strong&gt; = Clear instructions for AI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated Workflow&lt;/strong&gt; = Less mental overhead on "what's next."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spec Review &amp;gt; Code Review&lt;/strong&gt; = Higher decision throughput for the human dev.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you're interested in the game, please check out the Steam page and add it to your Wishlist!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://store.steampowered.com/app/4526000/LOGOMANCY/" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftydcj8u8x4jx2e03u7rw.png" alt="The Main Visual of Steam game: LOGOMANCY."&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;I’m a designer/developer who loves pixel art and efficient systems. You can follow my progress and see my pixel animations on &lt;a href="https://bsky.app/profile/yoshiakist.bsky.social" rel="noopener noreferrer"&gt;Bluesky&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftg3nj89l5u4m70n3kdxy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftg3nj89l5u4m70n3kdxy.gif" alt="Pixel animation of Synta Xavier. He’s a Logomancer-in-training who can’t let go of his dictionary just yet"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>godot</category>
      <category>gamedev</category>
      <category>tooling</category>
      <category>ai</category>
    </item>
    <item>
      <title>What if we extracted literally EVERY behavior of dev.to into Markdown? (An AI Agent Experiment)</title>
      <dc:creator>yossi</dc:creator>
      <pubDate>Thu, 05 Mar 2026 10:50:02 +0000</pubDate>
      <link>https://forem.com/yoshiakist/what-if-we-extracted-literally-every-behavior-of-devto-into-markdown-an-ai-agent-experiment-2k7l</link>
      <guid>https://forem.com/yoshiakist/what-if-we-extracted-literally-every-behavior-of-devto-into-markdown-an-ai-agent-experiment-2k7l</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;I am running an experiment where I forked Forem, the OSS behind dev.to, and used AI agents to extract &lt;strong&gt;"every single behavior"&lt;/strong&gt; as Markdown specification documents&lt;/li&gt;
&lt;li&gt;Agents autonomously analyzed file dependencies spanning Rails Controllers, React Components, and Workers, and successfully constructed a network of specifications with bidirectional links (ULIDs)&lt;/li&gt;
&lt;li&gt;I am building a new SDD (Spec-Driven Development) toolkit called &lt;strong&gt;&lt;code&gt;specre&lt;/code&gt;&lt;/strong&gt; to make this approach possible&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A question for you, the reader
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;In a large codebase like Forem, can you immediately grasp where a specific feature (e.g., inviting a member to an Organization) has an impact?&lt;/li&gt;
&lt;li&gt;Can an AI understand that efficiently and accurately?&lt;/li&gt;
&lt;li&gt;Have you ever been frustrated when an AI failed to grasp context that you thought was perfectly clear?&lt;/li&gt;
&lt;/ul&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/yoshiakist" rel="noopener noreferrer"&gt;
        yoshiakist
      &lt;/a&gt; / &lt;a href="https://github.com/yoshiakist/specre" rel="noopener noreferrer"&gt;
        specre
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Atomic, living specification cards for AI-agent-friendly development. Minimal, agnostic, and traceable.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;&lt;a href="https://github.com/yoshiakist/specre/README.md" rel="noopener noreferrer"&gt;English&lt;/a&gt; | &lt;a href="https://github.com/yoshiakist/specre/README.ja.md" rel="noopener noreferrer"&gt;日本語&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;specre&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Atomic, living specification cards for AI-agent-friendly development.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;specre ( /spékré/ ) is a minimal specification format and toolkit for Spec-Driven Development (SDD). Each specre is a single Markdown file describing exactly one behavior, with machine-readable front-matter for lifecycle tracking and agent navigation.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;The Problem&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;Specifications are essential for keeping development intent visible and traceable. But in practice, they rot:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Specs drift from code in silence.&lt;/strong&gt; No one notices when an implementation diverges from its specification — until the next developer (or AI agent) builds on stale assumptions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monolithic specs waste AI context.&lt;/strong&gt; Large specification documents force agents to parse entire features just to understand a single behavior, consuming the finite context window that should be reserved for code and tests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Small changes never get specced.&lt;/strong&gt; When the cost of writing a specification is high, only greenfield features get documented. Bug fixes, refactors, and incremental changes slip…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/yoshiakist/specre" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h3&gt;
  
  
  What was actually built
&lt;/h3&gt;

&lt;p&gt;I'll go into detail later in this article, but first — just take a look at these images!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;All of these behavior specifications were generated by a finely tuned workflow and scripts.&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiqu16mhoh1kqxjfinz1h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiqu16mhoh1kqxjfinz1h.png" alt="A screenshot showing a list of specification cards for the Forem codebase." width="761" height="1070"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Related files, overview, intent, key members, scenarios, and failure patterns can be handed to a coding agent as context right away.&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9648t8q4hoqlmjglmni.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9648t8q4hoqlmjglmni.png" alt="A screenshot showing the format of specre card." width="800" height="564"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Each file carries a "specre tag" built from a ULID. The tags create bidirectional links to specifications, and together those links form a network.&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F79929ol3o7p457ydtf24.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F79929ol3o7p457ydtf24.png" alt="A screenshot showing the specre marker at the top of the source code." width="581" height="236"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  1. Why I built &lt;code&gt;specre&lt;/code&gt;
&lt;/h2&gt;
&lt;h3&gt;
  
  
  The limits of existing specifications and TDD
&lt;/h3&gt;

&lt;p&gt;Specifications always rot. SDD toolkits are now widely used, but even with tools that treat spec Markdown files as part of the pipeline, those specs begin to drift from reality the moment they are created.&lt;br&gt;
It's a natural thought that if we reuse code, we should reuse specifications too. But in practice, the harsh truth is that solving this problem with natural language alone is extremely difficult.&lt;/p&gt;

&lt;p&gt;I always write code test-first (and I imagine most of you do the same at work). I do the same when having AI write code, and I enjoy designing workflows that way. At the same time, however, I'm starting to feel the limits of TDD in this era of Vibe Coding.&lt;br&gt;
You can prove the correctness of behavior to a machine, but for humans — or for PMs and QA — it's not intuitively clear from the current codebase "what value is being delivered to users." Not without clicking through screen after screen, again and again.&lt;br&gt;
Why is it important? Why is it necessary? Why won't a different approach work? These fragments of an engineer's thinking sometimes survive as code comments, but there is no way for an AI to grasp a coherent "intent" that spans multiple files.&lt;/p&gt;
&lt;h3&gt;
  
  
  The context window limits of AI agents
&lt;/h3&gt;

&lt;p&gt;Providing value through a set of features requires writing massive specification documents, but that wastes the LLM's context window unnecessarily. When an agent (or a subagent receiving instructions) only cares about "one specific behavior," a workflow that repeatedly runs grep across the entire codebase and reads entire files that are likely irrelevant is extremely inefficient.&lt;/p&gt;

&lt;p&gt;Furthermore, the more context you feed an AI agent, the more each piece of it gets diluted. Important information in the middle of a long context is more likely to be overlooked, so the chance that it shapes the final output keeps dropping.&lt;/p&gt;
&lt;h3&gt;
  
  
  The specre philosophy: "One Markdown, one behavior"
&lt;/h3&gt;

&lt;p&gt;Here I'll briefly describe what makes specre distinctive. For full details, please check out the &lt;a href="https://github.com/yoshiakist/specre/blob/main/README.md" rel="noopener noreferrer"&gt;README&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bidirectional traceability between source code (&lt;code&gt;@specre&lt;/code&gt; comments) and Markdown specifications using ULIDs. An agent that wants to know "what is the intent of this code" or "what code implements this intent" can follow the link instead of reasoning it out, which cuts that cost to nearly zero&lt;/li&gt;
&lt;li&gt;Lifecycle management via &lt;code&gt;status&lt;/code&gt; and &lt;code&gt;last_verified&lt;/code&gt; fields in front-matter, enabling detection of stale specs and drift between spec and reality&lt;/li&gt;
&lt;li&gt;A fast CLI written in Rust. In particular, &lt;code&gt;search&lt;/code&gt; combined with a project-specific vocabulary glossary efficiently delivers a small amount of targeted information plus hints for what to search next, rather than an overwhelming flood of results&lt;/li&gt;
&lt;li&gt;An MCP server makes specre commands the first-choice tool for coding agents during planning and initial exploration&lt;/li&gt;
&lt;/ul&gt;
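
&lt;p&gt;To illustrate the traceability idea, here is a minimal Python sketch that scans file contents for &lt;code&gt;@specre&lt;/code&gt; markers and builds an index from ULID to the files that carry it. The exact marker syntax is an assumption (the word &lt;code&gt;@specre&lt;/code&gt; followed by a 26-character ULID), the paths and the ULID are placeholders, and the real specre CLI is written in Rust and may work quite differently.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re
from collections import defaultdict

# Assumed marker shape: "@specre" followed by a 26-character ULID
# (Crockford Base32, which omits the letters I, L, O and U).
SPECRE_TAG = re.compile(r"@specre\s+([0-9A-HJKMNP-TV-Z]{26})")


def index_tags(sources):
    """Map each ULID to the sorted list of file paths that mention it.

    sources is a dict of {path: file contents}.
    """
    index = defaultdict(set)
    for path, text in sources.items():
        for ulid in SPECRE_TAG.findall(text):
            index[ulid].add(path)
    return {ulid: sorted(paths) for ulid, paths in index.items()}


# Placeholder paths and ULID, not taken from the Forem repository.
sources = {
    "app/controllers/example_controller.rb": "# @specre 01ARZ3NDEKTSV4RRFFQ69G5FAV\nclass ExampleController\nend\n",
    "app/javascript/example/Example.jsx": "// @specre 01ARZ3NDEKTSV4RRFFQ69G5FAV\n",
    "app/models/untagged.rb": "class Untagged\nend\n",
}
print(index_tags(sources))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the same ULID also appears in the card's &lt;code&gt;id&lt;/code&gt; field, a lookup in either direction becomes a plain dictionary access rather than a codebase-wide search.&lt;/p&gt;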
&lt;h2&gt;
  
  
  2. Applying specre to Forem
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Why Forem?
&lt;/h3&gt;

&lt;p&gt;Forem — the codebase behind dev.to — is a practical, large-scale codebase that developers all over the world know. With over 3,000 Ruby files and 700+ JS/JSX files, it felt like exactly the right size to explore specre's practicality and applicability.&lt;br&gt;
And honestly, I've always had a small dream of posting something on dev.to as a developer. (I'm genuinely excited right now!)&lt;/p&gt;
&lt;h3&gt;
  
  
  How the extraction works
&lt;/h3&gt;

&lt;p&gt;Simple RAG and existing AST tools cannot describe "intent." To generate behavior specifications all at once, I built a four-phase multi-agent workflow.&lt;/p&gt;
&lt;h4&gt;
  
  
  &lt;strong&gt;Phase 1–2 (The Brain — Claude 4.6 Opus):&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;To save tokens, the agent reads only the AST (Abstract Syntax Tree) structural map, not the raw source code.&lt;br&gt;
A custom script is used to extract structure only. If a domain is too large, this AST extraction script autonomously proposes splitting it into sub-domains to prevent token explosion.&lt;/p&gt;

&lt;p&gt;In Phase 1, the files and dependencies related to a domain name are identified across the codebase, and the most strongly related files are grouped together.&lt;/p&gt;

&lt;p&gt;Prompt to the agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/specre-generate feed domain, pls!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Agent begins discovery:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 .claude/commands/scripts/domain-discovery.py feeds &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="nt"&gt;--root&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; 2&amp;gt;&amp;amp;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script's output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Domain discovery: feeds
Seed class names: FeedConfig, FeedEvent, FeedbackMessage, Feeds, ...
Total: 290 files &lt;span class="o"&gt;(&lt;/span&gt;258 untagged, 32 tagged&lt;span class="o"&gt;)&lt;/span&gt;

Output &lt;span class="nb"&gt;split &lt;/span&gt;into 3 parts &lt;span class="o"&gt;(&amp;gt;&lt;/span&gt;100 files&lt;span class="o"&gt;)&lt;/span&gt;:
  /tmp/specre-discovery-feeds-part1.json
  /tmp/specre-discovery-feeds-part2.json
  /tmp/specre-discovery-feeds-part3.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One of the output JSON files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"domain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"feeds"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"seed_class_names"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"FeedConfig"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"FeedEvent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FeedbackMessage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Feeds"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"stats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"stage1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;108&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"stage2"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;140&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"stage3"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"total"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;290&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"untagged"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;258&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"tagged"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"part"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total_parts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"files"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"app/assets/javascripts/lib/xss.js"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"stage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"import: from app/javascript/articles/__tests__/Feed.test.jsx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"specre_tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"app/assets/javascripts/utilities/timeAgo.js"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"stage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"import: from app/javascript/articles/__tests__/Feed.test.jsx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"specre_tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Phase 2, another script optimized for Ruby and JS is run against specific files identified in Phase 1, extracting structure that includes method names and what each method returns.&lt;/p&gt;

&lt;p&gt;From this reasonably reliable structural information, the agent infers "cross-layer behaviors" spanning Rails Controllers through React components and Workers, and designs a behavior catalog.&lt;br&gt;
This catalog contains only the specification names (e.g., &lt;code&gt;user_can_signup_with_email.md&lt;/code&gt;) and the files related to each one.&lt;/p&gt;

&lt;p&gt;At this stage, naming, classification, and granularity are critically important. The workflow includes a self-review step where the agent checks whether the specification naming accurately captures the behavior and value, and whether anything has been overlooked.&lt;/p&gt;

&lt;p&gt;The single most important rule throughout Phases 1 and 2 is that reading the actual files is strictly forbidden. Let Opus — smart as it is — focus exclusively on file paths, dependencies, method names, and return values.&lt;/p&gt;
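&lt;p&gt;To make the shape of that catalog concrete, here is a minimal sketch of a catalog entry plus a mechanical check that could run before the agent's self-review. It is not the actual specre workflow: the &lt;code&gt;CatalogEntry&lt;/code&gt; type, the &lt;code&gt;review_catalog&lt;/code&gt; helper, the naming pattern, and the file path are all assumptions made up for illustration.&lt;/p&gt;

```python
import re
from dataclasses import dataclass, field

# Hypothetical naming rule inferred from examples such as
# user_can_signup_with_email.md: an actor, the word "can", then an action.
NAME_PATTERN = re.compile(r"^[a-z]+_can_[a-z0-9_]+\.md$")


@dataclass
class CatalogEntry:
    """One planned specification: a name plus related file paths, no file contents."""
    spec_name: str
    related_files: list = field(default_factory=list)


def review_catalog(entries):
    """Return a list of human-readable problems found in a catalog draft."""
    problems = []
    for entry in entries:
        if not NAME_PATTERN.match(entry.spec_name):
            problems.append(f"{entry.spec_name}: name does not follow actor_can_action.md")
        if not entry.related_files:
            problems.append(f"{entry.spec_name}: no related files listed")
    return problems


# Illustrative entries only; the controller path is a made-up example.
catalog = [
    CatalogEntry("user_can_signup_with_email.md", ["app/controllers/registrations_controller.rb"]),
    CatalogEntry("BroadcastStuff.md", []),
]
for problem in review_catalog(catalog):
    print(problem)
```

&lt;p&gt;A check like this only catches formal problems. Whether a name really captures the behavior and its value still needs the agent's self-review and, ultimately, a human.&lt;/p&gt;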
&lt;h4&gt;
  
  
  &lt;strong&gt;Phase 3 (The Workers — Claude 4.6 Sonnet):&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Based on the catalog Opus produced, multiple Sonnet subagents are launched in parallel.&lt;br&gt;
This is the first point at which actual code is read and natural-language scenarios are written.&lt;br&gt;
At this stage, the &lt;code&gt;status&lt;/code&gt; in front-matter is kept as &lt;code&gt;draft&lt;/code&gt;.&lt;br&gt;
Only if a test is judged to sufficiently cover the behavior is the status upgraded to &lt;code&gt;stable&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Additionally, &lt;code&gt;@specre &amp;lt;ULID&amp;gt;&lt;/code&gt; tags are automatically and rapidly embedded into the source code via MCP server calls to the specre command. With these tags in place, when implementing or fixing a feature, an agent can run &lt;code&gt;specre trace&lt;/code&gt; via MCP to instantly cross-reference spec and source in both directions.&lt;/p&gt;
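&lt;p&gt;The two-way lookup that these tags make possible can be sketched as follows. This is not specre's implementation: it assumes a tag is the literal &lt;code&gt;@specre&lt;/code&gt; followed by a 26-character ULID inside a comment, and the &lt;code&gt;build_index&lt;/code&gt; helper, the file contents, and the ULID value are hypothetical.&lt;/p&gt;

```python
import re
from collections import defaultdict

# Assumption: a tag is "@specre" followed by a 26-character ULID
# (Crockford base32). The real specre tag grammar may differ.
TAG_PATTERN = re.compile(r"@specre\s+([0-9A-HJKMNP-TV-Z]{26})")


def build_index(sources):
    """Map spec ULIDs to files and files to spec ULIDs.

    sources is a dict of file path to file text.
    """
    spec_to_files = defaultdict(set)
    file_to_specs = defaultdict(set)
    for path, text in sources.items():
        for ulid in TAG_PATTERN.findall(text):
            spec_to_files[ulid].add(path)
            file_to_specs[path].add(ulid)
    return spec_to_files, file_to_specs


# Hypothetical file contents and ULID, for illustration only.
sources = {
    "app/models/broadcast.rb": "# @specre 01HZX3V9K2M4N6P8Q0R2S4T6V8\nclass Broadcast; end\n",
    "app/javascript/broadcast.jsx": "// @specre 01HZX3V9K2M4N6P8Q0R2S4T6V8\n",
}
spec_to_files, file_to_specs = build_index(sources)
print(sorted(spec_to_files["01HZX3V9K2M4N6P8Q0R2S4T6V8"]))
```

&lt;p&gt;Because the index comes from literal tags rather than similarity scores, the same input always yields the same mapping in both directions.&lt;/p&gt;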
&lt;h4&gt;
  
  
  Examples of actual output
&lt;/h4&gt;

&lt;p&gt;The output is basically what you see in the images above, but for more detail please refer to the links below. These point to my fork of Forem.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/yoshiakist/forem-with-specre/tree/main/docs/specres/organizations" rel="noopener noreferrer"&gt;List of specifications in the organizations domain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/yoshiakist/forem-with-specre/blob/main/docs/specres/media/author_can_upload_image.md" rel="noopener noreferrer"&gt;Behavior Markdown for the spec "Author can upload an image"&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/yoshiakist/forem-with-specre/blob/main/.claude/commands/specre-generate.md" rel="noopener noreferrer"&gt;The workflow that generates these specre cards&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note 1: Even in environments other than Claude Code that do not support subagents, the workflow should be usable with minor adjustments to the prompt wording.&lt;br&gt;
Note 2: Parser scripts for languages other than Ruby/JS/JSX have not yet been created. See "Challenges" below for details.&lt;/p&gt;
&lt;h2&gt;
  
  
  3. The power of a network woven from code and specifications
&lt;/h2&gt;

&lt;p&gt;What has actually happened now that specre has been applied to Forem?&lt;br&gt;
Please watch the following videos.&lt;/p&gt;

&lt;p&gt;Demo of searching specification cards in natural language:&lt;br&gt;
  &lt;iframe src="https://www.youtube.com/embed/XXksP8u-VC8"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Demo of tracing code intent in natural language:&lt;br&gt;
  &lt;iframe src="https://www.youtube.com/embed/wxU3iuVOVRA"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  A revolution for coding agents
&lt;/h3&gt;

&lt;p&gt;The reasoning cost of exploration drops to nearly zero. When you ask an agent to change a feature or fix a bug, it follows a path like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Human: "Change the spec where XXX does AAA so that it does BBB instead."&lt;/li&gt;
&lt;li&gt;Agent: &lt;code&gt;specre search "xxx aaa"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Agent: "Got 8 specre cards. One likely describes this specification. Let me look at it."&lt;/li&gt;
&lt;li&gt;Agent: "I fully understand the specification and intent. Now let me investigate the related files..."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From there, 5–6 reads are enough to form a rough plan for fixing the feature.&lt;br&gt;
In my experience, when an existing spec is available, the total number of commands the agent issues before forming a fix plan is around 10–15.&lt;/p&gt;

&lt;p&gt;What makes this groundbreaking? A coding agent normally moves in the order: broad exploration → grasp behavior and spec → grasp intent. With specre, the order becomes: grasp intent and behavior simultaneously → read related files to "confirm." That reversal dramatically reduces the agent's reasoning cost.&lt;/p&gt;
&lt;h4&gt;
  
  
  The importance of a deterministic approach
&lt;/h4&gt;

&lt;p&gt;Another important point is that these processes are deterministic rather than probabilistic. If the related files for an atomic specification have been verified, then exploration to second- and third-order nodes is also deterministic. This is still on the specre roadmap, but the Rust CLI should eventually be able to instantly answer questions like "what is the potential blast radius of this change?"&lt;/p&gt;

&lt;p&gt;Existing AI agents, when fixing bugs, gather likely-related files through &lt;code&gt;grep&lt;/code&gt; or vector search — probabilistically, by guess — and as a result they sometimes break unrelated files or miss critical dependencies. With specre's ULID tags, however, an agent can deterministically identify the scope of impact of any change.&lt;/p&gt;
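&lt;p&gt;As a rough illustration of that kind of traversal, the sketch below walks from one file to the specs that tag it and on to the other files of those specs, up to a fixed number of hops. It is only a sketch under assumptions: the planned Rust CLI feature may work differently, and the &lt;code&gt;blast_radius&lt;/code&gt; function, the spec names, and the file names are hypothetical.&lt;/p&gt;

```python
from collections import deque


def blast_radius(start_file, spec_to_files, max_hops=2):
    """Breadth-first walk from a file to its specs, then to their other files."""
    # Invert the spec-to-files mapping so we can look up specs per file.
    file_to_specs = {}
    for spec, files in spec_to_files.items():
        for path in files:
            file_to_specs.setdefault(path, set()).add(spec)

    seen_files = {start_file}
    seen_specs = set()
    queue = deque([(start_file, 0)])
    while queue:
        path, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the requested depth
        for spec in sorted(file_to_specs.get(path, ())):
            seen_specs.add(spec)
            for neighbor in sorted(spec_to_files[spec]):
                if neighbor not in seen_files:
                    seen_files.add(neighbor)
                    queue.append((neighbor, hops + 1))
    return seen_specs, seen_files - {start_file}


# Hypothetical spec names and files, for illustration only.
links = {
    "author_can_change_cover_image": {"a.rb", "b.jsx"},
    "mobile_app_can_fetch_article": {"b.jsx", "c.rb"},
    "admin_can_create_broadcast": {"d.rb"},
}
specs, files = blast_radius("a.rb", links, max_hops=2)
print(sorted(specs), sorted(files))
```

&lt;p&gt;Sorting the neighbors keeps the visiting order stable, so the same tags always produce the same answer. The result is only as trustworthy as the verified links it is built from.&lt;/p&gt;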
&lt;h3&gt;
  
  
  Value for humans (developers, PMs, QA)
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Reducing onboarding cost
&lt;/h4&gt;

&lt;p&gt;With specre, even someone completely new to using or contributing to Forem can immediately understand what features the system provides and which files are involved in each behavior. For example, if you wonder "what is a broadcast, and how is it created?", running &lt;code&gt;specre search "create broadcast"&lt;/code&gt; will surface the relevant spec Markdown instantly. If you want to know "what features involve email?", looking at the &lt;code&gt;email&lt;/code&gt; directory under &lt;code&gt;specres&lt;/code&gt; makes it immediately clear from the listing.&lt;br&gt;
specre minimizes the onboarding cost for new contributors.&lt;/p&gt;
&lt;h4&gt;
  
  
  Reclaiming control over complex systems
&lt;/h4&gt;

&lt;p&gt;When modifying a feature, you can edit the existing specre Markdown as a requirements definition and hand it to the agent — a consistent implementation patch comes out. The same applies when adding variations to existing behavior.&lt;br&gt;
Of course, adding an entirely new behavior (i.e., a new feature) is straightforward too. Start by writing the intent and a rough sketch of the scenarios. The agent will then refine it into a polished spec following the specre format, complete with feature overview, intent, scenarios, and failure/edge cases. All you need to do is review and approve.&lt;br&gt;
Once you try it, you'll notice that even standard Vibe Coding steps feel more structured and grounded. This gives you a stronger sense of control, even when implementing against a complex domain.&lt;/p&gt;

&lt;p&gt;Lately, there's a narrative around Vibe Coding suggesting that the human's role is only to inject "intent" — that humans should let go of micro-level control. For software at a certain scale, that's probably true.&lt;br&gt;
But what about complex software where the behavior of one domain affects multiple other domains or multiple microservices? How confident are you that a coding agent won't introduce a single bug just because you told it the intent? For example, if you ask an AI to "fix the feature that changes an article's cover image" in Forem, are you comfortable fully delegating to the AI whether that change safely propagates to the mobile app API (ForemMobile) and CDN cache invalidation?&lt;/p&gt;

&lt;p&gt;With specre, humans only need to review the "scenarios" and "failure pattern definitions" in the Markdown to reclaim control over whether the AI has missed any edge cases.&lt;br&gt;
Personally, I don't think we should abandon human intuition about cross-domain side effects just yet.&lt;/p&gt;
&lt;h4&gt;
  
  
  As a foundation for cross-functional collaboration
&lt;/h4&gt;

&lt;p&gt;For non-engineers such as PMs and QA, these natural-language scenarios also become a hub for accurately understanding system behavior. By understanding behavior in more detail than the vague layer of "intent," the gap between expectations and actual deliverables can be minimized.&lt;/p&gt;

&lt;p&gt;This also delivers a particularly strong ROI for QA. For many QA engineers, analyzing unit tests offers limited value. Even if you write a script to effectively extract and read &lt;code&gt;it&lt;/code&gt; and &lt;code&gt;describe&lt;/code&gt; blocks, test cases are often too simple, or the descriptions in &lt;code&gt;it&lt;/code&gt;/&lt;code&gt;describe&lt;/code&gt; are omitted when the intent is obvious from the test code itself.&lt;br&gt;
If the goal is to explore edge cases, the right starting point is not unit tests but documentation that describes the expected behavior of the system in scenario-based natural language. With specre, the QA team should get a description of the code's current behavior that it can actually reference.&lt;/p&gt;
&lt;h3&gt;
  
  
  Drift detection
&lt;/h3&gt;

&lt;p&gt;I'll keep this brief since it's something I'm actively working on as part of the current roadmap: specre enables detection of stale specifications.&lt;br&gt;
Code is a living thing, and when developing in a team, it's inevitable that spec updates get forgotten. By having an agent patrol on a CI or heartbeat schedule, we can detect and report discrepancies between specs and actual code, ordered by oldest &lt;code&gt;last_verified&lt;/code&gt; date.&lt;br&gt;
When human users are modifying a feature or editing related files, they'll also naturally notice that the spec itself is outdated — or it will serve as a clue when trying to figure out whether the spec or the code is wrong.&lt;br&gt;
The specre philosophy that specifications should have a lifecycle largely originates from this perspective.&lt;/p&gt;
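&lt;p&gt;The patrol order can be sketched in a few lines. This assumes each spec Markdown file starts with simple &lt;code&gt;key: value&lt;/code&gt; front-matter between &lt;code&gt;---&lt;/code&gt; lines and that &lt;code&gt;last_verified&lt;/code&gt; holds an ISO date. Both are assumptions about the format, and &lt;code&gt;parse_front_matter&lt;/code&gt; and &lt;code&gt;stalest_first&lt;/code&gt; are hypothetical helpers, not specre commands.&lt;/p&gt;

```python
from datetime import date


def parse_front_matter(markdown_text):
    """Parse simple 'key: value' lines between the leading '---' markers."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def stalest_first(specs):
    """Order spec names by last_verified, oldest first; never-verified specs lead."""
    def sort_key(item):
        name, text = item
        raw = parse_front_matter(text).get("last_verified")
        verified = date.fromisoformat(raw) if raw else date.min
        return (verified, name)
    return [name for name, _ in sorted(specs.items(), key=sort_key)]


# Hypothetical spec contents and dates, for illustration only.
specs = {
    "admin_can_create_broadcast.md": "---\nstatus: stable\nlast_verified: 2026-03-01\n---\n",
    "author_can_upload_image.md": "---\nstatus: draft\n---\n",
    "user_can_signup_with_email.md": "---\nstatus: stable\nlast_verified: 2026-01-15\n---\n",
}
print(stalest_first(specs))
```

&lt;p&gt;A scheduled agent could take the first few names from this list and compare each spec against its tagged files, so the least recently verified specs are checked first.&lt;/p&gt;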
&lt;h2&gt;
  
  
  4. Challenges and dilemmas
&lt;/h2&gt;

&lt;p&gt;I've been painting quite a rosy picture so far, but of course this is not magic. Let me share the real-world hurdles too.&lt;/p&gt;
&lt;h3&gt;
  
  
  The adoption cost paradox
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Token cost
&lt;/h4&gt;

&lt;p&gt;One of specre's goals is eco-friendly AI development: being kind to both your wallet and the planet by saving tokens. However, I also discovered that the initial bootstrapping requires enormous computational resources. Even with Claude Max ($100), it took 5 full days of manual orchestration to process all of the 40-plus domains.&lt;/p&gt;

&lt;p&gt;My hypothesis for a solution: if you have access to a higher-tier plan (such as Claude Max $200), processing all of the project's domains in one batch might yield dramatically better cost efficiency in the end.&lt;br&gt;
With Opus 4.6 + 3-parallel Sonnet 4.6 subagents, roughly 20 minutes of inference per domain was needed, depending on domain size.&lt;br&gt;
Whether that feels like a reasonable cost or an enormous time investment — what do you think?&lt;/p&gt;

&lt;p&gt;When adopting specre into an existing project, what specre is fundamentally doing is "pre-paying the search cost."&lt;br&gt;
But the larger the team and the longer the project, the more powerfully this pays off. After all, the network of code and specifications lives in the repository itself — one person pays the cost, and everyone else receives the benefit.&lt;/p&gt;
&lt;h4&gt;
  
  
  Human effort cost too
&lt;/h4&gt;

&lt;p&gt;I also had to wrestle quite a bit with decomposing Forem's internal structure and designing domain granularity, since I was seeing it for the first time. Ideally, the person who knows the codebase best (who is usually also the busiest person on the team) would need to oversee this entire workflow.&lt;/p&gt;

&lt;p&gt;For example, in Forem, &lt;code&gt;tags&lt;/code&gt; and &lt;code&gt;liquid_tags&lt;/code&gt; are entirely different domains with separate concerns. I didn't notice that at first: I ran the specre generation workflow for &lt;code&gt;tags&lt;/code&gt; as a whole, realized something was off, and eventually had to start over. You should start from the smallest, most specific domains (in this case &lt;code&gt;liquid_tags&lt;/code&gt;) and work outward toward larger, more comprehensive ones, but this depends entirely on human knowledge of the system's characteristics.&lt;br&gt;
Based on my experience so far, I don't believe an AI can yet autonomously reason about and correctly design this kind of execution order.&lt;/p&gt;

&lt;p&gt;Note: Partway through, I also experimented with &lt;a href="https://github.com/yoshiakist/forem-with-specre/blob/main/.claude/commands/specre-generate-cocoindex.md" rel="noopener noreferrer"&gt;an exploration approach using AST and vector indexes built with cocoindex&lt;/a&gt;, instead of my custom scripts. If you want to support all programming languages and frameworks, that approach may have better fundamentals.&lt;br&gt;
Both workflows are documented in the forked repository.&lt;/p&gt;
&lt;h3&gt;
  
  
  Human review is still essential
&lt;/h3&gt;

&lt;p&gt;Opus's reasoning ability is extraordinary, but the step where a maintainer (a human) with domain knowledge verifies "is this specification actually correct?" cannot be skipped. The &lt;a href="https://github.com/yoshiakist/specre/blob/main/docs/guides/adoption-strategy.md" rel="noopener noreferrer"&gt;specre adoption guide&lt;/a&gt; also strongly recommends that humans review this step.&lt;br&gt;
That said, reading through every single specification across all domains from scratch during bootstrapping is genuinely hard work. Most people's gut reaction would be to refuse outright.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Hey Forem maintainers (if you are reading this!)&lt;/em&gt;: After reading &lt;a href="https://github.com/yoshiakist/forem-with-specre/tree/main/docs/specres/articles" rel="noopener noreferrer"&gt;a few of the specifications I generated in my environment&lt;/a&gt;, do they seem reasonable to you? Since I can't review them with any real authority, there's a very real possibility that these specs — wrung out of Claude Code through sheer persistence — are nothing but a pile of garbage.&lt;br&gt;
(And if that's the case, please don't hesitate to say so. If I need to fundamentally rethink some aspect of the approach or methodology, I want to know.)&lt;/p&gt;
&lt;h3&gt;
  
  
  AI-generated specifications are inherently probabilistic
&lt;/h3&gt;

&lt;p&gt;In generating specs for every behavior in Forem, I relied heavily on AI. This means that even though I claim "verified specres enable deterministic traversal of the network," the network itself was generated probabilistically.&lt;br&gt;
Did the AI truly describe every behavior? Did it list all related files — comprehensively and without redundancy? The truth is unknowable without verification by a core maintainer.&lt;/p&gt;

&lt;p&gt;For example, Forem has a broadcast feature (a user-facing announcement displayed near the nav bar), so I instructed Claude to cover that domain with specre. The output spec titles were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;admin_can_create_broadcast&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;admin_can_view_broadcast&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;admin_can_list_broadcast&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;admin_can_edit_broadcast&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;admin_can_delete_broadcast&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I asked: "Okay. But who actually sees a broadcast, and where? Who benefits from a broadcast existing?" Claude replied: "I forgot to include the behavior where a user views it — shall I add that?"&lt;/p&gt;

&lt;p&gt;...And yet, there's still some reason for hope.&lt;br&gt;
Even I, having signed up on dev.to only two weeks ago and tentatively poking around Forem for the first time, could catch that. There's no way a maintainer wouldn't.&lt;br&gt;
And the specre generation workflow is still evolving. There's still plenty to be done: building verification chains, adding constraint-check flows delegated to separate subagents, designing dedicated commands for validation, and more.&lt;/p&gt;
&lt;h2&gt;
  
  
  5. Closing thoughts and a question for the community
&lt;/h2&gt;

&lt;p&gt;We're still very much in the middle of this journey, but I genuinely feel that this approach has the potential to fundamentally change the future of SDD (Spec-Driven Development).&lt;br&gt;
I myself develop indie games using Godot Engine and use specre for that work. So even if nobody else in the world uses it, I'll probably keep developing specre for my own benefit for a while.&lt;br&gt;
That said, what might this look like in five years? A model that can internalize a multi-gigabyte project in an instant and fully grasp its behavior as though it were one giant function — something like that wouldn't surprise me too much if it emerged. But until that day comes, I believe that atomic specification documents can serve as a reliable guide for engineers and product teams navigating complexity.&lt;/p&gt;
&lt;h3&gt;
  
  
  Discussion
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;If your team's project had a "network of specifications linked to code" like this, would that be valuable? Would it make collaboration with AI easier?&lt;/li&gt;
&lt;li&gt;Could it work as a communication format for specifications with non-engineers such as PMs or QA?&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;specre&lt;/code&gt; is a brand-new project. If this approach resonates with you, please check out the GitHub repo and leave a star!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://github.com/yoshiakist/specre" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Star specre on GitHub ⭐️&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  A note on AI use in writing this article
&lt;/h3&gt;

&lt;p&gt;This article was written by me in Japanese, then translated into English with the assistance of AI.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>devto</category>
      <category>forem</category>
      <category>vibecoding</category>
    </item>
  </channel>
</rss>
