<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Sebastian Schürmann</title>
    <description>The latest articles on Forem by Sebastian Schürmann (@sebs).</description>
    <link>https://forem.com/sebs</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F20049%2Fdfdb5f55-23bd-4bc1-9605-bd548fc3b62d.jpeg</url>
      <title>Forem: Sebastian Schürmann</title>
      <link>https://forem.com/sebs</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/sebs"/>
    <language>en</language>
    <item>
      <title>Local AI Will Save Us All (The Math Says So, Trust Me)</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Wed, 15 Apr 2026 14:05:13 +0000</pubDate>
      <link>https://forem.com/sebs/local-ai-will-save-us-all-the-math-says-so-trust-me-4m22</link>
      <guid>https://forem.com/sebs/local-ai-will-save-us-all-the-math-says-so-trust-me-4m22</guid>
      <description>&lt;p&gt;Every few weeks a take goes viral in tech circles making the case for ditching cloud AI and running models locally. The argument is always roughly the same: cloud costs add up, your data is being shipped to American servers of dubious legal standing, and a one-time GPU purchase pays for itself in 18 months. Bold claim. Simple math. Lots of hashtags.&lt;/p&gt;

&lt;p&gt;It deserves a closer look.&lt;/p&gt;

&lt;p&gt;The typical version of this argument runs something like: two RTX PRO 6000 Blackwells, 1,200W draw, six hours a day, €0.32 per kWh — "about €48/month" in electricity. The cards themselves cost around €16,000. Cloud AI, by comparison, runs €100–200 per developer per month. Eight developers, 18 months, done.&lt;/p&gt;

&lt;p&gt;Except the electricity bill is already wrong. &lt;strong&gt;1.2 kW × 6h × 30 days × €0.32 = €69.12.&lt;/strong&gt; Not €48. A 44% error in the opening calculation of an argument whose entire appeal is rigorous arithmetic.&lt;/p&gt;

&lt;p&gt;The break-even math has bigger problems. €100–200/month per developer implies roughly 20 million tokens consumed per person per month. That is not a power user. That is a token foundry. For any team using AI at normal human rates, the break-even slides quietly past two years — by which point the GPU generation is already dated.&lt;/p&gt;
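&lt;p&gt;The arithmetic is simple enough to check in a few lines. The sketch below uses only the figures quoted above; the half-spend scenario is a hypothetical illustration of lower usage, not a measured number.&lt;/p&gt;

```python
# Back-of-envelope check of the local-AI cost claim, using the
# figures quoted in the argument above (not measured values).

GPU_COST_EUR = 16_000      # two RTX PRO 6000 Blackwells
POWER_KW = 1.2             # sustained draw under load
HOURS_PER_DAY = 6
DAYS_PER_MONTH = 30
EUR_PER_KWH = 0.32
DEVELOPERS = 8

# Monthly electricity: kW x hours/day x days/month x price/kWh
electricity = POWER_KW * HOURS_PER_DAY * DAYS_PER_MONTH * EUR_PER_KWH

# Break-even month m solves:
#   DEVELOPERS * cloud_per_dev * m = GPU_COST_EUR + electricity * m
def break_even_months(cloud_per_dev: float) -> float:
    return GPU_COST_EUR / (DEVELOPERS * cloud_per_dev - electricity)

print(f"electricity/month: EUR {electricity:.2f}")               # 69.12, not 48
print(f"at claimed spend (EUR 150/dev): {break_even_months(150):.1f} months")  # ~14
print(f"at half the spend (EUR 75/dev): {break_even_months(75):.1f} months")   # ~30
```

At the claimed per-developer cloud spend the 18-month story roughly holds; halve the usage and the break-even is already past two and a half years, before cooling, labor, or a spare card enter the spreadsheet.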

&lt;p&gt;The €16,000 hardware figure also never travels with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cooling.&lt;/strong&gt; 1,200W sustained is a serious heat load. Office HVAC was not designed for this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Labor.&lt;/strong&gt; Keeping local model infrastructure running — version management, security patches, prompt compatibility across model updates — is real engineering work that doesn't appear in these spreadsheets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hardware failure.&lt;/strong&gt; Cloud providers have SLAs. Your server closet does not.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Noise.&lt;/strong&gt; Two RTX PRO 6000 Blackwells under full load exceed 50 dB — a loud dishwasher, sustained, all day. In a dedicated server room, fine. In a shared office, your colleagues will have opinions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability.&lt;/strong&gt; The RTX PRO 6000 Blackwell is a new, high-demand professional card with constrained supply and multi-week lead times. If one card fails, you are not buying a replacement over the weekend. You wait — potentially a month or more. Keeping a spare sounds prudent; that spare costs another ~€8,000 and is equally hard to source. A single-point-of-failure setup with no redundancy and a six-week replacement window is not infrastructure. It is optimism.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where the Argument Has a Point
&lt;/h2&gt;

&lt;p&gt;Data sovereignty is real. GDPR compliance for third-country data transfers is genuinely complex, vendor terms change, and strategic dependence on external model providers is a risk that tends to get underweighted until it isn't. The upfront capital requirement is the actual barrier for most teams, not the long-run economics.&lt;/p&gt;

&lt;p&gt;But the most important question gets skipped entirely: &lt;strong&gt;is the local model actually as good?&lt;/strong&gt; Two Blackwells with 192GB VRAM can run serious open-weight models — this is not a toy setup. But if developers need two or three attempts to get what a frontier cloud model produces in one, the labor savings evaporate and the break-even never arrives.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Local AI infrastructure can make sense — for teams with heavy, sensitive workloads, strong in-house ops capability, and the capital to do it properly, including redundancy, cooling, and the realistic assumption that hardware will occasionally fail at inconvenient times.&lt;/p&gt;

&lt;p&gt;What it is not is a simple 18-month arbitrage available to anyone with a GPU and a spreadsheet.&lt;/p&gt;

&lt;p&gt;The sovereignty argument is the strongest card in the deck. Lead with that. The cost argument needs a lot more columns in the spreadsheet before it holds up.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mba</category>
      <category>operations</category>
    </item>
    <item>
      <title>Down the Rabbit Hole: Building the Reference List for the Pair-Programming Book</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Mon, 13 Apr 2026 11:20:07 +0000</pubDate>
      <link>https://forem.com/sebs/down-the-rabbit-hole-building-the-reference-list-for-the-pair-programming-book-367n</link>
      <guid>https://forem.com/sebs/down-the-rabbit-hole-building-the-reference-list-for-the-pair-programming-book-367n</guid>
      <description>&lt;p&gt;There's a particular kind of humbling that happens when you sit down to write a book and realize you need to actually &lt;em&gt;read&lt;/em&gt; the papers you've been casually citing for years.&lt;/p&gt;

&lt;p&gt;That's more or less where I found myself when I started assembling the reference list for the Pair Programming Book. What started as "I'll just gather the key papers" turned into a months-long excavation through decades of software engineering research. The current estimate: somewhere between 250 and 500 relevant papers. And counting.&lt;/p&gt;

&lt;p&gt;Here's what that journey looked like.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Papers You Know But Haven't Read
&lt;/h2&gt;

&lt;p&gt;Every field has its citation folklore — papers so frequently referenced that they've achieved the status of common knowledge without anyone actually opening them. Pair programming research is no exception.&lt;/p&gt;

&lt;p&gt;I had a mental list of "classics" I'd been nodding at for years. Williams et al., 2000. Cockburn and Williams. The early XP studies. I knew their conclusions the way you know the plot of a movie you've never seen — through cultural osmosis, hallway conversations, and abstracts alone.&lt;/p&gt;

&lt;p&gt;Actually reading them was a different experience. Some held up beautifully. Others were more nuanced, more conditional, more &lt;em&gt;contested&lt;/em&gt; than the canonical summary suggested. A few conclusions that had calcified into "everyone knows that pair programming does X" turned out to rest on a single study with 41 undergraduates.&lt;/p&gt;

&lt;p&gt;The lesson: citation chains in a young field are fragile things. You owe it to your readers — and yourself — to go back to the source.&lt;/p&gt;

&lt;h2&gt;
  
  
  Laurie Williams Deserves a Prize
&lt;/h2&gt;

&lt;p&gt;If pair programming research has a GOAT, it is, without question, &lt;strong&gt;Laurie Williams&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The sheer volume of rigorous, foundational work she has produced on the subject is staggering. While others were still debating whether pair programming was a gimmick, Williams was running controlled studies, developing frameworks, and building the empirical case that made the whole conversation possible. Decade after decade.&lt;/p&gt;

&lt;p&gt;Writing this book without her work would be like writing about relativity and hoping Einstein doesn't come up. She doesn't just appear in the bibliography — she &lt;em&gt;is&lt;/em&gt; a substantial portion of it.&lt;/p&gt;

&lt;p&gt;If there is ever a formal prize for contributions to software engineering research, the pair programming category should be named after her.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Questionable Corners of the Literature
&lt;/h2&gt;

&lt;p&gt;Not every paper in the pile earned its place gracefully.&lt;/p&gt;

&lt;p&gt;Some announced themselves with titles that made me wince before I even opened the PDF. You know the genre. A combination of buzzwords, a forced acronym, and a vague promise of insight that the abstract doesn't quite deliver on. I won't name names. But I have a folder.&lt;/p&gt;

&lt;p&gt;More substantively: a surprising amount of pair programming research is built on frameworks that the broader scientific community has quietly retired. &lt;strong&gt;Personality type taxonomies&lt;/strong&gt; are the main offender. Myers-Briggs in particular makes repeated appearances — studies earnestly classifying programmers into 16 types and drawing conclusions about pairing compatibility. The problem is that the psychometric foundation for these instruments has been thoroughly undermined. They're not useless as casual conversation tools, but basing empirical research claims on them is shaky ground.&lt;/p&gt;

&lt;p&gt;The same applies to some of the "introvert vs. extrovert" dichotomy work, which tends to treat personality as a binary switch rather than the distributed, context-dependent trait that modern personality psychology describes.&lt;/p&gt;

&lt;p&gt;This doesn't mean the research is worthless — often the observations are real even when the interpretive framework is suspect. But it does mean a lot of careful reading, and a lot of footnotes that essentially say: &lt;em&gt;the finding is interesting, the taxonomy it's hung on is not.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What 250–500 Papers Looks Like
&lt;/h2&gt;

&lt;p&gt;It looks like a lot of tabs.&lt;/p&gt;

&lt;p&gt;It also looks, honestly, like a field that is richer and more contested than its popular summary suggests. Pair programming is not simply "proven effective" or "proven ineffective." The evidence is contextual, domain-specific, experience-level-dependent, and shaped enormously by how you define and measure "effective" in the first place.&lt;/p&gt;

&lt;p&gt;That complexity is exactly why the book needs to exist. The practitioner literature tends toward confident prescriptions. The academic literature is full of hedges, replications, and contradictions that rarely make it into the conference talk or the blog post.&lt;/p&gt;

&lt;p&gt;The reference list is the honest accounting of that complexity. Every citation is a commitment: &lt;em&gt;I looked at this, I understand what it claims, and I'm representing it faithfully.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That's the job. It's slower than I expected. It's also more interesting.&lt;/p&gt;

</description>
      <category>pairprogramming</category>
      <category>writing</category>
      <category>research</category>
    </item>
    <item>
      <title>From Cardboard to Code</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Fri, 10 Apr 2026 22:43:15 +0000</pubDate>
      <link>https://forem.com/sebs/from-cardboard-to-code-29d5</link>
      <guid>https://forem.com/sebs/from-cardboard-to-code-29d5</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;The design challenge isn't understanding board games. It's turning prose rules into structures a software team can actually act on.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are thousands of board games. Most of them contain fascinating design work: carefully balanced economies, elegant interaction models, loop structures refined over years of playtesting. Almost none of them exist as digital games. The barrier is real work — translating a 40-page rulebook into a game design document, a feature backlog, an architecture diagram, user stories — before a single line of code is written.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/sebs/ruleforge" rel="noopener noreferrer"&gt;RuleForge&lt;/a&gt; automates that translation. You hand it a PDF. It hands you a developer bundle.&lt;/p&gt;




&lt;h2&gt;
  
  
  What it actually does
&lt;/h2&gt;

&lt;p&gt;At its core, RuleForge is a suite of Claude Code slash commands stored in a &lt;code&gt;.claude/commands/&lt;/code&gt; directory. Each command is a focused AI workflow targeting one specific phase of the board-game-to-digital-game translation process. They can be run individually, or chained together through the main &lt;code&gt;/ruleforge&lt;/code&gt; pipeline command.&lt;/p&gt;

&lt;p&gt;The full pipeline runs 16 stages and produces a self-contained output directory scoped to the game — something like &lt;code&gt;output/catan/&lt;/code&gt; or &lt;code&gt;output/terraforming-mars/&lt;/code&gt; — filled with structured files ready for a development team.&lt;/p&gt;




&lt;h2&gt;
  
  
  The pipeline, step by step
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. &lt;code&gt;/complexity-estimate&lt;/code&gt; — Quick pre-flight scan&lt;/strong&gt;&lt;br&gt;
Before committing to the full pipeline, get a fast complexity estimate. How long is the rulebook? How many mechanics? Is this a 20-minute job or a 2-hour one?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. &lt;code&gt;/ruleforge&lt;/code&gt; — Full pipeline, PDF to developer bundle&lt;/strong&gt;&lt;br&gt;
The main event. Extracts rules, identifies mechanics, generates the game loop diagram, writes the GDD, builds the feature list, creates user stories, outputs architecture diagrams. Resumable if interrupted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. &lt;code&gt;/card-database&lt;/code&gt; + &lt;code&gt;/economy-flow&lt;/code&gt; — Domain-specific extraction&lt;/strong&gt;&lt;br&gt;
Card-heavy games need their component databases structured. Economy-driven games need their resource flows mapped — sources, sinks, conversions. These commands go deeper on those specific concerns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. &lt;code&gt;/accessibility-audit&lt;/code&gt; — Check for digital barriers&lt;/strong&gt;&lt;br&gt;
Audits the extracted design across five accessibility dimensions: visual, motor, cognitive, hearing, and communication. Digital ports are an opportunity to do better than the physical original.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. &lt;code&gt;/realtime-forge&lt;/code&gt; — Translate to interactive game design&lt;/strong&gt;&lt;br&gt;
The big leap. Takes the RuleForge output and translates it into a real-time or interactive digital game design — covering analysis, a revised GDD, architecture, balance sheets, asset specifications, and prototype prompts. Seven waves, roughly 30 output files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. &lt;code&gt;/dev-bundle&lt;/code&gt; — Validate and package&lt;/strong&gt;&lt;br&gt;
Validates all output files including Mermaid diagram syntax, checks for completeness, and packages everything into a clean bundle ready to hand off.&lt;/p&gt;




&lt;h2&gt;
  
  
  The full command library
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Extraction &amp;amp; Analysis
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/extract-rules&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Parse and summarize the rules from a PDF. The raw input layer.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/identify-mechanics&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Classify game mechanics across 25 standard types — Worker Placement, Deck Building, Area Control, Engine Building, and so on.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/game-loop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generate a Mermaid diagram of atomic, primary, secondary, and tertiary game loops.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/validate-loop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Check the game loop for structural soundness and state reachability. Catches design dead ends.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/adaptation-gap&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Report on how much work a digital port actually requires — No Change / Simple Adaptation / Redesign.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/flag-ambiguities&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Surface rules that are unclear, contradictory, or likely to cause bugs when implemented.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/confidence-score&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Self-assessment of extraction quality. Useful for knowing when to do a manual review.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Design &amp;amp; Documentation
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/generate-gdd&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full Game Design Document. Chunked automatically for complex games.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/balance-sheet&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Extract balance parameters with digital annotations and sensitivity analysis.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/feature-list&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Prioritized feature list output as both CSV and Markdown, with a dependency diagram.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/user-stories&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;User stories with granularity selector and acceptance criteria. Outputs to Stories.csv.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/onboarding-design&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tutorial and onboarding flow design — how a new player learns the game digitally.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/interaction-model&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Component interaction model — how game entities relate to and affect each other.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Architecture &amp;amp; Prototyping
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/architecture-diagram&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;System architecture in Mermaid. Supports Unity, Godot, Phaser, Web, or generic targets.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/prototype-prompts&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;AI prototyping prompts for Rosebud, v0, Bolt, Lovable, or generic tooling.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/economy-flow&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resource economy diagram — where resources come from, where they go, and how they convert.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/card-database&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Extracts individual card, tile, or component data into a structured database.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Standalone Utilities
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/game-mixer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Blend mechanics from two or more games into hybrid designs, with iteration support.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/decompose-idea&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Break down a game idea using a 7-category ludemic framework.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/ludeme-generator&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generate a Ludii game description file (.lud) from a concept.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/game-fitness&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Analyze a game concept across 6 fitness dimensions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/playtest-design&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Design an automated playtesting plan with fitness functions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/procedural-generator&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Design procedural generation systems using the Watson et al. (2008) workflow.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/game-comparison&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Side-by-side comparison of two RuleForge extractions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/pdf-to-markdown&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Convert any PDF to clean, well-structured Markdown. Useful as a standalone tool.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The output structure
&lt;/h2&gt;

&lt;p&gt;Every command writes into a game-scoped directory under &lt;code&gt;output/&lt;/code&gt;. The game slug is derived automatically from the title in the rulebook. A &lt;code&gt;.context.json&lt;/code&gt; metadata file lets downstream commands pick up where upstream ones left off — that's what makes the pipeline resumable.&lt;/p&gt;
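&lt;p&gt;As a purely hypothetical illustration (the real schema is whatever the repo defines; every field name here is made up), such a context file might record something like:&lt;/p&gt;

```json
{
  "game": "terraforming-mars",
  "source_pdf": "terraforming-mars-rules.pdf",
  "completed_stages": ["extract-rules", "identify-mechanics", "game-loop"],
  "last_updated": "2026-04-10T22:00:00Z"
}
```

A downstream command can then skip the stages already listed instead of re-running the whole pipeline from the PDF.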

&lt;p&gt;A typical output for something like Terraforming Mars would contain a GDD, a feature CSV, a user stories CSV, Mermaid files for the game loop and architecture, a balance sheet, an onboarding flow design, and prototype prompts ready to paste into your AI prototyping tool of choice.&lt;/p&gt;




&lt;h2&gt;
  
  
  The solo dungeon bash
&lt;/h2&gt;

&lt;p&gt;The repository also ships a &lt;code&gt;solo-dungeon-bash/&lt;/code&gt; directory — a worked example of the pipeline in action on a solo dungeon-crawl game. It's useful both as a reference output and as a test case to understand what the extraction quality actually looks like on a real game with real rules.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why slash commands, not a CLI tool?
&lt;/h2&gt;

&lt;p&gt;This is a deliberate choice. Claude Code's slash command system makes each step conversational and inspectable. You can run &lt;code&gt;/identify-mechanics&lt;/code&gt;, read the output, decide the model missed a nuance, correct it manually, and then continue with &lt;code&gt;/game-loop&lt;/code&gt;. That feedback loop would be much harder to preserve in a fully automated CLI pipeline.&lt;/p&gt;

&lt;p&gt;It also means the tool is essentially zero-setup. Clone the repo, point Claude Code at the directory, and the commands are available. No build step, no package install, no configuration.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;The project is on GitHub at &lt;a href="https://github.com/sebs/ruleforge" rel="noopener noreferrer"&gt;github.com/sebs/ruleforge&lt;/a&gt;. Clone it, drop a rulebook PDF next to it, and start with &lt;code&gt;/complexity-estimate path/to/your-game.pdf&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The design is intentionally modular — you don't have to run the full pipeline. If you just need a GDD from a rulebook, run &lt;code&gt;/generate-gdd&lt;/code&gt;. If you want to compare two games, run &lt;code&gt;/game-comparison&lt;/code&gt;. Each command is independently useful.&lt;/p&gt;

&lt;p&gt;Board games are some of the most densely designed interactive systems humans have made. RuleForge is a bet that those designs are worth bringing into software — and that AI can do a lot of the translation work.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>gamedev</category>
      <category>gamedesign</category>
    </item>
    <item>
      <title>Leading With "I Don't Know"</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Mon, 30 Mar 2026 19:46:23 +0000</pubDate>
      <link>https://forem.com/sebs/leading-with-i-dont-know-324h</link>
      <guid>https://forem.com/sebs/leading-with-i-dont-know-324h</guid>
      <description>&lt;p&gt;&lt;em&gt;A powerful thing a tech lead can say isn't an answer. It's an honest admission — about your team's code, about AI's trajectory, about a world in crisis — followed by the only thing that matters: what you do next.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There's a version of tech leadership that never actually exists but haunts every leader anyway: the person who has seen every edge case, knows where the technology is heading, understands the macro forces shaping the business, and fields every question with calm, grounded certainty.&lt;/p&gt;

&lt;p&gt;It's a fiction. And quietly chasing it is one of the most corrosive things a leader can do.&lt;/p&gt;

&lt;p&gt;The real job — leading developers through ambiguous problems, positioning teams in the face of transformative technology, making business decisions while the world keeps breaking in unpredictable ways — requires a completely different posture. It starts with saying three words without flinching: &lt;strong&gt;I don't know.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Leaders Resist Saying It
&lt;/h2&gt;

&lt;p&gt;The fear is understandable. You got the role because you were sharp. Your team looks to you. Admitting ignorance feels like handing back your credentials in front of everyone who gave them to you.&lt;/p&gt;

&lt;p&gt;But engineers are a perceptive group. They know when an answer is being improvised. They can feel the difference between grounded confidence and performed certainty. And nothing erodes trust faster than a leader who bluffs — especially when it costs the team direction, time, or morale.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Faking knowledge doesn't protect your authority. It slowly transfers it to whoever actually knows.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Admitting "I don't know" is one of the highest-signal things a leader can do. It tells your team that you operate in reality — that their trust is well-placed because you won't lead them off a cliff to protect your ego.&lt;/p&gt;

&lt;p&gt;But the admission isn't the end. It's an opening move. Three areas, in particular, are where honest not-knowing is most consequential right now.&lt;/p&gt;




&lt;h2&gt;
  
  
  Your Team: The Daily Not-Knowing
&lt;/h2&gt;

&lt;p&gt;At the most immediate level, this is about the problems that land on your desk every morning: the architectural decision someone needs a call on, the production incident whose root cause is unclear, the technical direction your team is asking you to set on a system you haven't touched in six months.&lt;/p&gt;

&lt;p&gt;In this context, "I don't know" is a team-safety tool. When a lead normalises it, developers stop pretending too. They surface problems earlier. They ask questions instead of grinding silently for two hours. They admit blockers instead of heroically absorbing them.&lt;/p&gt;

&lt;h3&gt;
  
  
  When you don't know → concrete alternatives
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Move&lt;/th&gt;
&lt;th&gt;What it sounds like&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;01&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Name who does know&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"I don't know, but Sarah has been closest to that service — let's pull her in." Directing to expertise is leadership, not deferral.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;02&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Define the investigation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"I don't know, but I think the answer is in the caching config. Can we spike on it this afternoon?" Turn fog into a task.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;03&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Reason out loud together&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"I don't know — walk me through what you're seeing and let's think it through." Your value isn't always the answer; it's the thinking process.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;04&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Surface the systemic gap&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"I don't know — and that's a signal we have a documentation problem worth fixing." Use your ignorance diagnostically.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The whole team starts operating in reality rather than in the performance of competence — and reality, however uncomfortable, is a much better place to build software.&lt;/p&gt;




&lt;h2&gt;
  
  
  AI: The Impact No One Can Honestly Quantify
&lt;/h2&gt;

&lt;p&gt;Then there's the larger question your CTO, your board, your reports, and your peers are all asking — the one that gets dressed up in confident slides and frameworks but remains stubbornly, genuinely open: &lt;em&gt;what does AI actually do to how we build software, and how we work?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The honest answer, right now, is that nobody knows.&lt;/p&gt;

&lt;p&gt;We have data points. AI coding assistants measurably change output velocity in some contexts. Some categories of junior tasks look automatable; others that seemed automatable turned out to require more human judgment than assumed. Certain roles are being restructured; others are being amplified. The second- and third-order effects — on team structure, on the value of different skills, on how we hire and what seniority means — are genuinely unresolved.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Any leader who tells you they know exactly what AI will do to their team in two years is either guessing confidently or selling something.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is not a reason for paralysis. It's a reason for a particular kind of leadership: one that acknowledges the uncertainty explicitly, moves deliberately rather than reactively, and builds in the organisational capacity to adapt as clarity arrives.&lt;/p&gt;

&lt;h3&gt;
  
  
  What "I don't know" looks like on AI strategy
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Run time-boxed experiments with clear hypotheses instead of committing to wholesale transformations based on hype&lt;/li&gt;
&lt;li&gt;Tell your team honestly: "We're going to try this, observe what changes, and adjust — we're not doing a big bet we can't reverse"&lt;/li&gt;
&lt;li&gt;Resist pressure to make confident AI roadmap calls purely for optics; say "we're still learning" to stakeholders when that's true&lt;/li&gt;
&lt;li&gt;Watch the teams two years ahead of you on adoption and study what they're actually saying now vs. what they said then&lt;/li&gt;
&lt;li&gt;Invest in the capabilities that remain valuable regardless of how AI develops: systems thinking, communication, judgment under ambiguity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The leaders doing the most honest, useful work on this right now are the ones who've stopped trying to predict AI's impact and started building teams good at navigating whatever it turns out to be.&lt;/p&gt;




&lt;h2&gt;
  
  
  The World on Fire: Crises That Reach Your Sprint Board
&lt;/h2&gt;

&lt;p&gt;And then there's everything else.&lt;/p&gt;

&lt;p&gt;Tech leads used to be able to bracket the world's problems at the office door. That boundary has been dissolving for years — and in the current moment, it's essentially gone. Supply chain shocks affect infrastructure budgets. Geopolitical instability affects where you can hire and what data sovereignty rules apply to your systems. Economic turbulence reshapes what your company thinks the engineering team should be building. Social crises affect your team members directly, and those team members expect leadership to notice.&lt;/p&gt;

&lt;p&gt;None of this has clean answers. The honest position is that most leaders — most people — don't know how these crises resolve, what the downstream business effects will be, or exactly what the right response is. Pretending otherwise doesn't help your team. It insults their intelligence.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Your team doesn't need you to have solved geopolitics. They need to know you're not pretending it isn't happening.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Crisis uncertainty → alternatives to false confidence
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Move&lt;/th&gt;
&lt;th&gt;What it looks like&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;01&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Name the uncertainty in planning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Build explicit contingency into roadmaps. "This timeline assumes current conditions hold; here's our branch if they don't." Uncertainty acknowledged is uncertainty managed.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;02&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Separate "I don't know" from "we're watching"&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Distinguish between things you're genuinely uncertain about and things you're actively monitoring. Give your team a sense of the signals you're tracking even when you can't give conclusions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;03&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Acknowledge impact without performing solutions&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;When crises affect team members directly, you don't need a policy or a fix. Sometimes "I see this is real and I don't have the answers" is more valuable than a five-point plan.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;04&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Influence decisions above you with honest data&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;When the business is being steered by false certainty about external conditions, your job is to put accurate uncertainty on the table — even when that's unwelcome. Especially then.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;05&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Prioritise reversibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;When the environment is genuinely unpredictable, bias toward decisions that can be undone. Make "how reversible is this?" a standard question in planning when the context is volatile.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Second Half of the Sentence
&lt;/h2&gt;

&lt;p&gt;Across all three registers — your team, the technology, the world — the structure is the same. "I don't know" is never the complete sentence. It's always followed by momentum.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I don't know — but here's how we're going to move anyway."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That pivot is everything. You're not outsourcing the problem or performing helplessness. You're modelling how a technically mature, psychologically honest person handles uncertainty: they acknowledge it, then they act on it. They find what they can know. They reduce the blast radius of what they can't. They keep moving.&lt;/p&gt;

&lt;p&gt;There's a kind of confidence that doesn't depend on having the answers. It's the confidence that comes from trusting your ability to navigate uncertainty — to find information, connect people, ask the right questions, and make reasonable calls under ambiguity. That's the confidence your team needs from you. Not omniscience. Not a human forecast engine.&lt;/p&gt;

&lt;p&gt;Just someone who can say "I don't know where we are" without panicking — and then get out the compass.&lt;/p&gt;




&lt;p&gt;The best leads I've worked with share one trait: they made it feel completely ordinary to not have an answer. And they made it equally obvious that not having one was never the end of the story. Just the beginning of figuring it out together.&lt;/p&gt;

&lt;p&gt;That combination — honesty first, momentum second — is what leading a team looks like when the world keeps changing faster than any of us can confidently predict. Which, as far as I can tell, is the world we're in now.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>leadership</category>
      <category>crisismode</category>
    </item>
    <item>
      <title>Your build pipeline is not your trust boundary</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Tue, 24 Mar 2026 16:27:07 +0000</pubDate>
      <link>https://forem.com/sebs/your-build-pipeline-is-not-your-trust-boundary-1bnn</link>
      <guid>https://forem.com/sebs/your-build-pipeline-is-not-your-trust-boundary-1bnn</guid>
      <description>&lt;p&gt;Some teams deploying software to AWS have two registries and think of them as a logistics detail. One holds what came out of CI. The other holds what goes into production. The relationship between those two things — the decision about what is allowed to cross from one into the other, and who makes that decision, and what happens when the answer is no — is not a logistics detail. It is a security architecture decision, and treating it as anything less is how production incidents happen.&lt;/p&gt;

&lt;p&gt;The bulkhead pattern is old. It comes from naval engineering, where ships are divided into watertight compartments so that flooding in one section does not sink the whole vessel. The insight is that you do not prevent damage by building a perfect hull. You prevent catastrophic loss by limiting how far damage can travel. Software engineers rediscovered this principle independently and applied it to distributed systems, microservices, and fault tolerance. It belongs equally in a deployment pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem with a single registry
&lt;/h2&gt;

&lt;p&gt;When your CI pipeline pushes directly to the registry your ECS cluster pulls from, you have made a consequential choice that probably did not feel like a choice. You have decided that the build environment and the production environment share a trust boundary. Anything that can write to your CI pipeline — any engineer, any compromised dependency, any malformed Dockerfile, any branch that passes tests — can, directly or indirectly, place an artifact into the registry that production infrastructure will consume without further scrutiny.&lt;/p&gt;

&lt;p&gt;This is not a theoretical concern. Supply chain attacks against CI systems have become routine. A compromised build dependency installs a malicious binary during the build phase. The resulting image passes your existing image scan if the scanner's definitions are not current, or if the binary is not yet known to the scanner. The image gets tagged and pushed. On the next deploy, ECS pulls it and runs it in your production environment. At no point did anything behave unexpectedly from a pipeline perspective. Every light was green. That is the problem.&lt;/p&gt;

&lt;p&gt;The deeper issue is that a single-registry architecture conflates two fundamentally different questions. The first question is: did this build succeed? The second question is: is this artifact trustworthy enough to run in production? CI answers the first question. Only a deliberate validation gate — one that runs independently of the build environment, with different permissions and different tooling — can answer the second.&lt;/p&gt;

&lt;h2&gt;
  
  
  The structure of a bulkhead deployment
&lt;/h2&gt;

&lt;p&gt;The architecture worth building has four distinct zones, each with clearly scoped responsibilities and explicitly limited permissions between them.&lt;/p&gt;

&lt;p&gt;The first zone is your GitLab CI pipeline. Its job is to build. It runs your tests, compiles your code, assembles your container image, and pushes that image to the GitLab Container Registry. The GitLab registry in this architecture is intentionally treated as ephemeral and untrusted. It is a staging area. Images land there the way packages land on a loading dock: present, but not yet cleared for entry. CI runners have write access to the GitLab registry. They have no access to AWS whatsoever. Not to IAM, not to ECR, not to ECS. If your CI environment is compromised, the blast radius is bounded to the GitLab registry.&lt;/p&gt;

&lt;p&gt;The second zone is the deliver pipeline. This is the bulkhead. It is triggered — on a tag, on a merge to a protected branch, on whatever promotion event your organization has decided represents a release candidate — and its sole purpose is to evaluate whether an image from the GitLab registry is trustworthy enough to enter the AWS trust boundary. It pulls the image, runs validation: vulnerability scanning, signature verification, policy checks, SBOM attestation, whatever your threat model requires. If validation passes, it pushes the image to ECR and tags it with a provenance marker. If validation fails, it stops there. Nothing enters AWS. The deliver pipeline is the only principal in your entire system with write access to ECR.&lt;/p&gt;
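&lt;p&gt;A minimal sketch of what such a deliver job can look like in GitLab CI. The job name, the choice of Trivy as the scanner, and the &lt;code&gt;$ECR_URL&lt;/code&gt; variable are illustrative placeholders, not a prescribed setup:&lt;/p&gt;

```yaml
# Illustrative deliver job — scanner choice, names, and variables are placeholders.
deliver:
  stage: deliver
  rules:
    - if: $CI_COMMIT_TAG          # promote only on release tags
  script:
    # Pull the candidate image from the untrusted staging registry
    - docker pull "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG"
    # Validate: any critical finding fails the job, and nothing enters AWS
    - trivy image --exit-code 1 --severity CRITICAL "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG"
    # Only on success: re-tag and push into ECR, the trusted registry
    - aws ecr get-login-password | docker login --username AWS --password-stdin "$ECR_URL"
    - docker tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG" "$ECR_URL/app:$CI_COMMIT_TAG"
    - docker push "$ECR_URL/app:$CI_COMMIT_TAG"
```

&lt;p&gt;The important structural property is not the specific commands but the credentials: this job, and only this job, holds both a read credential for the GitLab registry and a write credential for ECR.&lt;/p&gt;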

&lt;p&gt;The third zone is ECR. In this architecture, ECR is not just a faster registry. It is a trust signal. The presence of an image in ECR means exactly one thing: the deliver pipeline evaluated it and cleared it. No image arrives in ECR through any other path. Your ECS tasks can therefore pull from ECR with confidence that the contents were not placed there by a CI runner, a developer with elevated credentials, or an automated process that bypassed validation. ECR's access policy reflects this: the deliver pipeline can write, ECS task roles can read, and nothing else has write access.&lt;/p&gt;
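&lt;p&gt;That access model can be stated directly in the repository policy. An illustrative ECR repository policy sketch — the account ID and role names are placeholders:&lt;/p&gt;

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DeliverPipelineCanPush",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:role/deliver-pipeline" },
      "Action": [
        "ecr:PutImage",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload"
      ]
    },
    {
      "Sid": "EcsTasksCanPull",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:role/ecs-task" },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
```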

&lt;p&gt;The fourth zone is the deploy pipeline and ECS cluster. The deploy pipeline runs inside AWS, typically on a runner with an IAM role scoped to the specific ECS actions it needs. It reads from ECR, updates the task definition, and triggers a rolling deployment. It has no awareness of GitLab's registry. It does not cross back outside the AWS trust boundary for any artifact. The deployment is entirely self-contained within the environment it controls.&lt;/p&gt;
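&lt;p&gt;Sketched as a pipeline job (cluster and service names are placeholders), the deploy stage needs nothing beyond its scoped AWS credentials:&lt;/p&gt;

```yaml
# Illustrative deploy job — cluster and service names are placeholders.
deploy:
  stage: deploy
  environment: production
  script:
    # The runner's IAM role is scoped to the ECS describe/update actions it
    # needs. It holds no GitLab registry credentials and never handles the
    # artifact itself — ECS pulls the image from ECR.
    - aws ecs update-service --cluster prod --service web --force-new-deployment
```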

&lt;h2&gt;
  
  
  Why the boundary placement matters
&lt;/h2&gt;

&lt;p&gt;You could draw the bulkhead in a different place. You could run validation inside the CI pipeline, before the push to GitLab's registry, and use a single registry throughout. Many teams do this. It is better than no validation at all. But it is not a bulkhead. A bulkhead only works if the compartments it separates are genuinely isolated — if flooding one compartment cannot automatically flood the other. Validation that runs inside the same environment as the build is subject to all the same compromises as the build. A malicious package can interfere with test execution. A malicious script can tamper with scanner output. The environment in which validation runs cannot be the same environment that produced the artifact being validated, if you want the validation to mean anything.&lt;/p&gt;

&lt;p&gt;The deliver pipeline solves this because it runs in a clean context with no dependency on the build environment. It does not trust the image. It does not trust the metadata the build produced. It pulls the image, treats it as an opaque artifact of unknown provenance, and evaluates it from scratch. The only thing it takes on faith is that the image digest it pulls from the GitLab registry corresponds to what CI claims to have built — and even that can be addressed with build attestation and signed manifests if your threat model demands it.&lt;/p&gt;

&lt;p&gt;There is also an operational argument separate from the security argument. When validation and promotion are separated from build, you can change your validation requirements without touching your build configuration. You can introduce a new scanner, tighten a policy, or add a new required attestation by changing the deliver pipeline. CI keeps running the same way it always has. The operational surface of security changes shrinks considerably.&lt;/p&gt;

&lt;h2&gt;
  
  
  Permissions as documentation
&lt;/h2&gt;

&lt;p&gt;One of the most underappreciated properties of this architecture is what the permission model tells you. When you look at your IAM policies and your GitLab CI variable scopes, the structure of your trust boundaries is legible. GitLab runners have credentials that can push to the GitLab registry. They have nothing in AWS. The deliver pipeline has credentials to read from the GitLab registry and write to ECR. ECS task roles can read from ECR. The deploy pipeline can describe and update ECS services. Nothing has more than it needs. Nothing can reach across a zone boundary it has no business crossing.&lt;/p&gt;

&lt;p&gt;This matters because permissions-as-documentation is honest in a way that comments and runbooks are not. Runbooks say what is supposed to be true. IAM policies say what is actually true. When your access model is correctly scoped, reading it is equivalent to reading the architecture. When your access model has accumulated scope over time — when CI runners have ECR write access because someone needed to debug something once and never cleaned it up — the permissions tell you that the architecture has quietly collapsed. The bulkhead no longer holds because the compartments are no longer sealed.&lt;/p&gt;

&lt;p&gt;Keeping the permission model clean is not just security hygiene. It is architectural discipline. Every time you are tempted to give a component access to something outside its designated zone — to let CI push directly to ECR "just this once," to give the deploy pipeline GitLab credentials "because it's easier" — you are being asked to trade architectural clarity for convenience. The answer should almost always be no.&lt;/p&gt;

&lt;h2&gt;
  
  
  The cost
&lt;/h2&gt;

&lt;p&gt;This architecture is not free. You have a third pipeline to maintain, with its own failure modes and operational requirements. The deliver pipeline becomes a single point of failure in your promotion path: if it is broken, no image reaches production regardless of how healthy your build and deploy pipelines are. You need to monitor it, alert on it, and be capable of diagnosing failures in it quickly.&lt;/p&gt;

&lt;p&gt;The deliver pipeline also adds latency to your release cycle. Validation takes time. Scans take time. If your threat model requires extensive policy evaluation, the gap between a successful build and a deployable artifact may be measured in minutes rather than seconds. This is usually acceptable, but it is a real tradeoff that your organization needs to make consciously rather than discover in the middle of an incident.&lt;/p&gt;

&lt;p&gt;The answer to both of these costs is not to eliminate the bulkhead. It is to treat the deliver pipeline with the same engineering seriousness as the rest of your infrastructure. It deserves good observability, clear failure messages, documented recovery procedures, and regular testing. A security boundary that cannot be maintained is not actually a security boundary.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this is not
&lt;/h2&gt;

&lt;p&gt;A bulkhead is not a substitute for secure coding practices. An image that passes every validation check you have defined can still contain application-level vulnerabilities. The bulkhead protects you against supply chain compromise in the build environment and enforces a consistent set of standards on every artifact that reaches production. It does not protect you against vulnerabilities you have not checked for or logic errors in your application code.&lt;/p&gt;

&lt;p&gt;A bulkhead is also not a guarantee of immutability. An image that passes validation today may have a vulnerability discovered tomorrow. Your ECR should be configured with immutable tags so that an existing image digest cannot be overwritten, and you should have a process for responding to newly discovered vulnerabilities in images that are already in production. The bulkhead tells you about the state of an artifact at the moment it crossed the boundary. Keeping that assessment current over time is a different problem, requiring different tooling.&lt;/p&gt;
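&lt;p&gt;Tag immutability is a single setting. In Terraform it looks roughly like the following fragment — the repository name is a placeholder:&lt;/p&gt;

```hcl
# Illustrative — repository name is a placeholder.
resource "aws_ecr_repository" "app" {
  name                 = "app"
  image_tag_mutability = "IMMUTABLE"  # an existing tag can never be re-pointed
}
```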

&lt;p&gt;What a bulkhead is, at its most fundamental, is a decision about what it means to trust an artifact. Defining that decision explicitly, embodying it in a pipeline stage with clear inputs and clear outputs, and enforcing it as the mandatory path between your build environment and your production environment — that is the entire value of the pattern. The implementation details matter less than the clarity of the decision. Before you build anything, you should be able to answer: what does it mean for an image to be trustworthy? Who decides? What happens when the answer is no? If those questions have clear answers, you have an architecture. If they do not, you have a pipeline.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>bulkhead</category>
      <category>wellarchitected</category>
      <category>aws</category>
    </item>
    <item>
      <title>From Idea to Implementation-Ready: A Six-Phase Pipeline with Rewelo</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Tue, 17 Mar 2026 19:13:14 +0000</pubDate>
      <link>https://forem.com/sebs/from-idea-to-implementation-ready-a-six-phase-pipeline-with-rewelo-42lp</link>
      <guid>https://forem.com/sebs/from-idea-to-implementation-ready-a-six-phase-pipeline-with-rewelo-42lp</guid>
      <description>&lt;p&gt;Most projects start with a vague idea and a Jira board. The gap between "we should build X" and "here is a fully specified, dependency-ordered, priority-scored backlog ready for sprint planning" is usually traversed through a series of meetings, half-written requirements documents, and optimistic estimates scribbled on sticky notes.&lt;/p&gt;

&lt;p&gt;This post documents a different approach: a six-phase pipeline in which each phase produces structured, machine-readable artifacts that feed directly into the next. The backbone is &lt;a href="https://github.com/sebs/rewelo" rel="noopener noreferrer"&gt;Rewelo&lt;/a&gt; — a CLI and MCP server for relative-weight backlog prioritization built on DuckDB — which transforms the back half of the process from intuition-based ticket-sorting into a transparent, reproducible calculation.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The question is not "how do we decide what to build first?" but "how do we make the decision auditable, reversible, and legible to every stakeholder?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The result, across a real product build: 113 scored and tagged tickets, 124 dependency relations organized into five dependency layers, and a backlog that can be re-ranked in seconds when priorities shift.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gherkin feature files&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BDD scenarios&lt;/td&gt;
&lt;td&gt;~160&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scored tickets&lt;/td&gt;
&lt;td&gt;113&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dependency relations&lt;/td&gt;
&lt;td&gt;124&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Pipeline
&lt;/h2&gt;

&lt;p&gt;The process is structured into six sequential phases, each with a defined entry condition, a set of artifacts it produces, and a quality gate before the output is accepted downstream. Two phases have explicit iteration loops; three feedback paths run from the final review back into earlier phases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1 — Vision &amp;amp; Concept
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Lock the problem space before touching architecture.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The phase begins with four documents: &lt;code&gt;concept.md&lt;/code&gt; (what and why), &lt;code&gt;elements.mmd&lt;/code&gt; (a Mermaid diagram of the major domain entities), &lt;code&gt;estimates.md&lt;/code&gt; (rough sizing and constraints), and &lt;code&gt;trlc-cheatsheet.md&lt;/code&gt; (a quick-reference for the requirements language used throughout). Before anything moves forward, the four documents undergo a cross-consistency review — checking that the entity model matches the concept, that estimates are grounded in scope, and that the requirements language is applied uniformly.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Artifacts:&lt;/em&gt; &lt;code&gt;concept.md&lt;/code&gt; · &lt;code&gt;elements.mmd&lt;/code&gt; · &lt;code&gt;estimates.md&lt;/code&gt; · &lt;code&gt;trlc-cheatsheet.md&lt;/code&gt; · cross-consistency review&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 2 — Architecture &amp;amp; Specifications
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Define how the system actually works, then find the gaps.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With the vision locked, &lt;code&gt;architecture.md&lt;/code&gt; documents the major components, their data flows, and any specialized concerns (in this project, a CRDT-to-Git synchronization layer). A gap analysis follows — specifically looking for what is needed to ship a first version versus what is aspirational. Only what passes that bar makes it into the three specification documents: &lt;code&gt;artifact_schemas.md&lt;/code&gt;, &lt;code&gt;api_contract.md&lt;/code&gt;, and &lt;code&gt;operational.md&lt;/code&gt;, which together define the technical surface area that will be tested and implemented.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Artifacts:&lt;/em&gt; &lt;code&gt;architecture.md&lt;/code&gt; · gap analysis → v1 · &lt;code&gt;artifact_schemas.md&lt;/code&gt; · &lt;code&gt;api_contract.md&lt;/code&gt; · &lt;code&gt;operational.md&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 3 — Behavioral Specifications
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Sixteen feature files. ~160 scenarios. One explicit loop.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is where requirements become falsifiable. Gherkin scenarios are written for every feature identified in Phase 2 — Given, When, Then triples that can drive automated tests and that make edge-case thinking explicit. A best-practices review is applied to the full scenario set: are scenarios atomic? Are they written from the user's perspective? Do they avoid implementation detail? If the answer to any of these is "no", the feature files are reworked. Only when the scenario set passes the gate does the process continue.&lt;/p&gt;

&lt;p&gt;This is the most iterative phase, and deliberately so — fixing an ambiguous scenario at this stage costs minutes; finding the same ambiguity during implementation costs days.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Artifacts:&lt;/em&gt; 16 feature files · ~160 scenarios · best-practices review · explicit rework loop&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 4 — Decisions &amp;amp; Quality Gates
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Capture the choices so future team members can understand them.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Three Architecture Decision Records are authored at this stage: one mapping the technology stack to the problem constraints, one capturing the rationale for the BDD approach, and one documenting the Rewelo integration decision itself. The Definition of Ready and Definition of Done are written here as well — these become the acceptance criteria applied in Phase 6. Finally, reusable templates for User Stories and ADRs are finalized, ensuring that new tickets and decisions added later follow a consistent structure.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Artifacts:&lt;/em&gt; ADR: tech mapping · ADR: BDD rationale · ADR: Rewelo · Definition of Ready · Definition of Done · Story and ADR templates&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 5 — Backlog
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;113 tickets. Scored, tagged, and dependency-ordered.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every story derived from the BDD scenarios and architecture documents is entered into Rewelo and scored across four dimensions: Benefit, Penalty, Estimate, and Risk. Tags group tickets by feature, team, and state. Ticket relations — &lt;code&gt;blocks&lt;/code&gt;, &lt;code&gt;depends-on&lt;/code&gt;, &lt;code&gt;relates-to&lt;/code&gt; — are declared explicitly, yielding 124 relations that sort the backlog into five logical dependency layers. The top of the backlog is not a product manager's gut feel; it is the output of a priority formula.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Artifacts:&lt;/em&gt; 113 tickets · B/P/E/R scores · 124 relations · 5 dependency layers&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 6 — Four Amigos Review
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Four perspectives, one approval gate, three feedback paths.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Four Amigos — Product Owner, Developer, QA Engineer, and UX Designer — each review the backlog through their own lens, informed by AI-simulated personas. The PO checks value propositions and acceptance criteria. The Developer flags architecture misalignment and re-scores Estimate and Risk. QA surfaces edge cases and cross-feature risks. The UX designer reviews interaction states, cognitive load, and missing flows. The gate is a four-way approval. If it doesn't pass, three feedback loops are available: back to requirements (for conceptual gaps), back to the feature files (for BDD issues), or back to ticket refinement (for scope or scoring problems).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Personas:&lt;/em&gt; Product Owner · Developer · QA Engineer · UX Designer · three feedback loops&lt;/p&gt;




&lt;h2&gt;
  
  
  Rewelo at the Center
&lt;/h2&gt;

&lt;p&gt;The pipeline would be useful without Rewelo — structured documents and BDD scenarios alone are a meaningful step up from most engineering processes. But the backlog phase is where the approach goes from "disciplined" to "genuinely different."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/sebs/rewelo" rel="noopener noreferrer"&gt;Rewelo&lt;/a&gt; is a CLI and MCP server for relative-weight backlog prioritization. It stores tickets in an embedded DuckDB database — no server required — and calculates a priority score at runtime based on four dimensions, normalized across the full backlog or any tagged subset.&lt;/p&gt;

&lt;p&gt;Each ticket receives four scores on the Fibonacci scale (1, 2, 3, 5, 8, 13, 21):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Measures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;B&lt;/strong&gt; — Benefit&lt;/td&gt;
&lt;td&gt;Value delivered by implementing this story&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;P&lt;/strong&gt; — Penalty&lt;/td&gt;
&lt;td&gt;Cost of &lt;em&gt;not&lt;/em&gt; implementing — the downside of deferral&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;E&lt;/strong&gt; — Estimate&lt;/td&gt;
&lt;td&gt;Resources required for implementation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;R&lt;/strong&gt; — Risk&lt;/td&gt;
&lt;td&gt;Uncertainty or complexity in the implementation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At runtime, Rewelo calculates:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Value vs Cost, normalized across the backlog
Value    = Benefit + Penalty
Cost     = Estimate + Risk
Priority = Value / Cost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Higher priority means better return on investment. The scores are normalized relative to the whole backlog — or any subset filtered by tag — so re-ranking is instantaneous when new tickets are added or when the team changes their weighting preferences. &lt;code&gt;rw calc priority&lt;/code&gt; is a single command away.&lt;/p&gt;
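&lt;p&gt;The calculation is simple enough to sketch in a few lines. The normalization shown here — dividing each score by the backlog total — is an assumption for illustration; Rewelo's exact normalization may differ:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    key: str
    benefit: int   # B — value delivered
    penalty: int   # P — cost of deferral
    estimate: int  # E — resources required
    risk: int      # R — uncertainty

def priority(t: Ticket) -> float:
    # Value vs Cost, as in the formula above
    return (t.benefit + t.penalty) / (t.estimate + t.risk)

def ranked(tickets: list) -> list:
    # Normalize against the whole set passed in (full backlog or a tagged
    # subset), so scores stay comparable; re-ranking is just re-running this.
    total = sum(priority(t) for t in tickets)
    return sorted(
        ((t.key, priority(t) / total) for t in tickets),
        key=lambda pair: pair[1],
        reverse=True,
    )
```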

&lt;p&gt;This matters in the Four Amigos phase specifically. When a Developer argues that a ticket's Estimate score is too optimistic, or a QA engineer surfaces a hidden dependency that increases Risk, the scores are updated and the backlog re-sorts itself. The discussion produces data, not just minutes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tag-driven organization
&lt;/h3&gt;

&lt;p&gt;Rewelo uses a flexible &lt;code&gt;prefix:value&lt;/code&gt; tag system rather than fixed fields. In this project, tags covered state (&lt;code&gt;state:backlog&lt;/code&gt;, &lt;code&gt;state:wip&lt;/code&gt;, &lt;code&gt;state:done&lt;/code&gt;), feature grouping (&lt;code&gt;feature:auth&lt;/code&gt;, &lt;code&gt;feature:checkout&lt;/code&gt;), and team (&lt;code&gt;team:platform&lt;/code&gt;). Because every tag assignment is logged in an audit trail, the tag history also yields lead time and cycle time data from &lt;code&gt;state:&lt;/code&gt; transitions — a useful side effect.&lt;/p&gt;
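&lt;p&gt;To illustrate that side effect, here is a sketch assuming a hypothetical audit-trail row shape of &lt;code&gt;(ticket, tag, timestamp)&lt;/code&gt; — the real Rewelo log format may differ:&lt;/p&gt;

```python
from datetime import datetime

def cycle_time(events, ticket):
    # events: hypothetical audit-trail rows of (ticket, tag, timestamp).
    # Cycle time here is the span between the timestamps at which the
    # ticket was tagged state:wip and state:done.
    times = {tag: ts for t, tag, ts in events if t == ticket}
    return times["state:done"] - times["state:wip"]
```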

&lt;h3&gt;
  
  
  Dependency ordering
&lt;/h3&gt;

&lt;p&gt;The 124 relation declarations (&lt;code&gt;blocks&lt;/code&gt;, &lt;code&gt;depends-on&lt;/code&gt;, &lt;code&gt;relates-to&lt;/code&gt;) produce a directed graph of the backlog. Rewelo uses this to expose a five-layer topological ordering: the tickets in layer one have no upstream dependencies and can be started immediately; each subsequent layer becomes unblocked as the previous one completes. This is far more actionable than a flat, priority-sorted list.&lt;/p&gt;
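&lt;p&gt;The layering is a standard topological peel over the dependency graph. A sketch of the idea — not Rewelo's actual implementation:&lt;/p&gt;

```python
def dependency_layers(tickets, depends_on):
    # depends_on: dict mapping a ticket to the set of tickets it depends on.
    # Each pass peels off the tickets whose dependencies all sit in earlier
    # layers; layer one is therefore the set with no upstream dependencies.
    layers, placed = [], set()
    remaining = set(tickets)
    while remaining:
        layer = {t for t in remaining
                 if depends_on.get(t, set()).issubset(placed)}
        if not layer:
            raise ValueError("dependency cycle detected")
        layers.append(sorted(layer))
        placed.update(layer)
        remaining.difference_update(layer)
    return layers
```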




&lt;h2&gt;
  
  
  The Four Amigos Review in Detail
&lt;/h2&gt;

&lt;p&gt;The Four Amigos is a well-established agile practice: before any story reaches a sprint, it should be reviewed by representatives of the four key perspectives. What makes this pipeline's implementation unusual is that the review is run against AI-simulated personas — each grounded in the artifact set produced by the earlier phases — before involving the human team. This surfaces structural problems in the backlog without consuming sprint planning time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Product Owner&lt;/strong&gt; reviews value propositions, acceptance criteria, and Benefit/Penalty scores. Asks: does this story deliver the outcome described in the concept? Are the acceptance criteria in the feature file comprehensive? Would a user recognize this as solving their problem?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Developer&lt;/strong&gt; reviews architecture fit, implementation plausibility, and Estimate/Risk scores. Asks: is this story implementable given the architecture defined in Phase 2? Are the E and R scores realistic? Are there hidden technical dependencies not captured in the relation graph?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;QA Engineer&lt;/strong&gt; reviews edge cases and cross-feature risks. Asks: are the Gherkin scenarios sufficient to catch regressions? Are there error states or boundary conditions missing from the feature files? Do any of these stories interact in ways that could produce surprising failures?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UX Designer&lt;/strong&gt; reviews interaction states, transitions, and cognitive load. Asks: are all the states this feature can be in represented in the acceptance criteria? Is the described flow consistent with how users actually think about the task? Where might a user get confused?&lt;/p&gt;

&lt;p&gt;The gate is a four-way approval. If any persona finds a material gap, one of three feedback paths is taken:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;↩ Refine requirements.&lt;/strong&gt; Conceptual gaps or value misalignments send the work back to Phase 1's cross-consistency review — the deepest and most expensive loop.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;↩ Rework feature files.&lt;/strong&gt; Missing scenarios, incomplete edge cases, or poorly specified acceptance criteria send individual feature files back to Phase 3 for revision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;↩ Refine stories.&lt;/strong&gt; Mis-scored tickets, missing dependencies, or scope problems are addressed directly in the Rewelo backlog — the shallowest and most common loop.&lt;/p&gt;

&lt;p&gt;The three loops are tiered by cost: story refinement is cheap (minutes), BDD rework is moderate (hours), requirements revision is expensive (days). The earlier a problem is found, the cheaper it is to fix — which is the central argument for front-loading structure in the first place.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Output Looks Like
&lt;/h2&gt;

&lt;p&gt;When the Four Amigos gate passes, the output is a Rewelo project containing 113 tickets that have been scored by all four personas, organized into five dependency layers, and validated against 160 behavioral scenarios. The implementation team can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run &lt;code&gt;rw calc priority&lt;/code&gt; to get an instant priority ranking&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;rw report dashboard&lt;/code&gt; to generate an HTML dashboard showing backlog health, score distribution, and tag breakdowns&lt;/li&gt;
&lt;li&gt;Export to CSV or JSON for integration with any downstream tool&lt;/li&gt;
&lt;li&gt;Start a sprint immediately from layer one, knowing every ticket in that layer is dependency-free and has passed four distinct review perspectives&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What the team cannot do is argue about what to build next without data. That is, perhaps, the most useful property of the whole pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running Rewelo as an MCP Server
&lt;/h2&gt;

&lt;p&gt;One of Rewelo's less obvious capabilities is its MCP server mode. Run &lt;code&gt;rw serve&lt;/code&gt; (or deploy the Docker container and configure Claude to point to it), and the AI assistant can manage the entire backlog — creating tickets, updating scores, assigning tags, running calculations — through natural language. This is how the Four Amigos review phase was implemented: each persona is a system prompt, the Rewelo MCP server provides the backlog as context, and the review runs as a structured conversation.&lt;/p&gt;

&lt;p&gt;The configuration is straightforward. Rewelo's &lt;code&gt;.mcp.json&lt;/code&gt; file in the repository shows the exact setup. Because the data lives in a named Docker volume, the database persists across container restarts and the full audit trail — every score change, every tag transition — is preserved.&lt;/p&gt;
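&lt;p&gt;For orientation, a generic MCP stdio client entry has roughly this shape. The field names follow the common MCP client configuration schema; the command and args below are an assumption based on &lt;code&gt;rw serve&lt;/code&gt;, and the authoritative values live in the repository's &lt;code&gt;.mcp.json&lt;/code&gt;:&lt;/p&gt;

```json
{
  "mcpServers": {
    "rewelo": {
      "command": "rw",
      "args": ["serve"]
    }
  }
}
```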




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The pipeline described here is not lightweight. Six structured phases, 16 feature files, 113 tickets, three ADRs, and a four-persona review process add up to a meaningful investment before any implementation begins. The argument for making that investment is simple: the cost of ambiguity grows exponentially the later it is found. A missing acceptance criterion discovered during sprint planning costs a conversation. The same gap found during code review costs a rewrite. Found in production, it costs users.&lt;/p&gt;

&lt;p&gt;Rewelo sits at the center of this because the backlog is where ambiguity historically hides most effectively — in vague story descriptions, in optimistic estimates, in priorities that change with whoever spoke last at the planning meeting. Replacing that with a transparent scoring formula, a dependency graph, and a full revision history is not bureaucracy. It is engineering applied to the product development process itself.&lt;/p&gt;

&lt;p&gt;The repository is at &lt;a href="https://github.com/sebs/rewelo" rel="noopener noreferrer"&gt;github.com/sebs/rewelo&lt;/a&gt;. It is experimental software, as the README notes — but the ideas it implements are not.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agenticagile</category>
      <category>agile</category>
      <category>productivity</category>
    </item>
    <item>
      <title>I Was So Angry, I Actually Shipped It</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Fri, 13 Mar 2026 22:31:40 +0000</pubDate>
      <link>https://forem.com/sebs/i-was-so-angry-i-actually-shipped-it-2m19</link>
      <guid>https://forem.com/sebs/i-was-so-angry-i-actually-shipped-it-2m19</guid>
      <description>&lt;p&gt;A while ago I wrote about how I was fed up enough with project management tools to build my own. No URL. No code. Just a rant and some screenshots of a half-baked UI.&lt;/p&gt;

&lt;p&gt;Several people in the comments called it a tease. They weren't wrong.&lt;/p&gt;

&lt;p&gt;So here's the follow-up that nobody ... erm ... at least three people actually asked for.&lt;/p&gt;

&lt;h2&gt;
  
  
  The UI Didn't Happen
&lt;/h2&gt;

&lt;p&gt;Let me be upfront: I didn't build the fancy web UI I was implicitly promising. I started down that road a couple of times, got bored fighting CSS and component state, and asked myself the honest question — &lt;em&gt;who is this actually for?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Me. It's for me.&lt;/p&gt;

&lt;p&gt;And I live in the terminal.&lt;/p&gt;

&lt;p&gt;So I threw out the frontend entirely and built a CLI instead. No regrets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Meet rewelo
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;rewelo&lt;/strong&gt; — Relative Weight Backlogs for the CLI and MCP.&lt;/p&gt;

&lt;p&gt;It does exactly what I said I wanted: it prioritizes work using four dimensions instead of the fictional psychic measurement known as story points.&lt;/p&gt;

&lt;p&gt;Every ticket gets scored on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Benefit&lt;/strong&gt; — value gained by doing the thing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Penalty&lt;/strong&gt; — cost of &lt;em&gt;not&lt;/em&gt; doing the thing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Estimate&lt;/strong&gt; — how much work it actually is&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Risk&lt;/strong&gt; — how uncertain or gnarly the implementation is&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From those four numbers, priority calculates itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Value    = Benefit + Penalty
Cost     = Estimate + Risk
Priority = Value / Cost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Higher priority = better return on investment. It's not rocket science. It's just math that most tools refuse to let you do.&lt;/p&gt;
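&lt;p&gt;The whole model fits in a few lines of Python. This is my own sketch of the formula, not rewelo's actual implementation:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    benefit: int   # B: value gained by doing the thing
    penalty: int   # P: cost of not doing the thing
    estimate: int  # E: how much work it actually is
    risk: int      # R: how uncertain the implementation is

def priority(t: Ticket) -> float:
    """Priority = (Benefit + Penalty) / (Estimate + Risk)."""
    value = t.benefit + t.penalty
    cost = t.estimate + t.risk
    return value / cost

# Rank a tiny backlog by descending priority (higher = better ROI).
backlog = [Ticket(8, 5, 3, 2), Ticket(3, 1, 8, 5), Ticket(13, 8, 5, 3)]
ranked = sorted(backlog, key=priority, reverse=True)
```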

&lt;h2&gt;
  
  
  DuckDB Was The Right Call
&lt;/h2&gt;

&lt;p&gt;One of the decisions I'm most happy about: no server.&lt;/p&gt;

&lt;p&gt;I spent zero hours configuring a database daemon, zero hours fighting connection pools, and zero hours explaining to myself why postgres was running at 3am. The whole thing runs on DuckDB — an embedded analytical database that lives in a single file.&lt;/p&gt;

&lt;p&gt;This meant I could focus on the actual problem instead of infrastructure theater. Turns out a project management tool for one person doesn't need a distributed SQL cluster.&lt;/p&gt;

&lt;p&gt;Who knew.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tags Instead of Fixed Fields
&lt;/h2&gt;

&lt;p&gt;The state machine I wanted to build kept getting complicated. So I simplified it down to a tag system.&lt;/p&gt;

&lt;p&gt;Every ticket gets tags in &lt;code&gt;prefix:value&lt;/code&gt; format: &lt;code&gt;state:backlog&lt;/code&gt;, &lt;code&gt;state:wip&lt;/code&gt;, &lt;code&gt;state:done&lt;/code&gt;, &lt;code&gt;feature:checkout&lt;/code&gt;, &lt;code&gt;team:platform&lt;/code&gt;. Whatever you need. The system doesn't care — it just tracks every assignment and removal in an audit log.&lt;/p&gt;

&lt;p&gt;The beautiful side effect: since every &lt;code&gt;state:&lt;/code&gt; tag change is recorded with a timestamp, cycle time and lead time fall out of the data for free. No extra instrumentation. No dashboards you have to manually update. Just the log.&lt;/p&gt;
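&lt;p&gt;A minimal sketch of the principle, assuming a hypothetical log-row shape of (timestamp, ticket, tag, action) rather than rewelo's actual schema:&lt;/p&gt;

```python
from datetime import datetime

# Hypothetical audit-log rows: (timestamp, ticket_id, tag, action).
log = [
    (datetime(2026, 3, 1, 9, 0),  "T-1", "state:backlog", "assign"),
    (datetime(2026, 3, 3, 10, 0), "T-1", "state:wip",     "assign"),
    (datetime(2026, 3, 6, 16, 0), "T-1", "state:done",    "assign"),
]

def cycle_time(log, ticket_id):
    """Time from first state:wip to first state:done, straight from the log."""
    def first(tag):
        return next(ts for ts, tid, t, a in log
                    if tid == ticket_id and t == tag and a == "assign")
    return first("state:done") - first("state:wip")
```

No extra instrumentation needed: the metric is a query over data the tag system already records.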

&lt;h2&gt;
  
  
  Revision History Because I've Been Burned
&lt;/h2&gt;

&lt;p&gt;Every change to a ticket creates a snapshot of what it looked like before. Not just the scores — the tags too.&lt;/p&gt;

&lt;p&gt;This means you can reconstruct the exact state of your backlog at any point in time. Remember that estimation session three weeks ago? You can see the numbers from before the panic re-estimation happened. This turned out to be more useful than I expected. Past me was making different tradeoffs than present me, and it's actually worth knowing when that changed and why.&lt;/p&gt;
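&lt;p&gt;Point-in-time reconstruction is then just a lookup over timestamped snapshots. A sketch under the assumption that each revision stores a full snapshot (field names are illustrative, not rewelo's):&lt;/p&gt;

```python
from datetime import datetime
from bisect import bisect_right

# Hypothetical revisions: (timestamp, snapshot of the ticket at that time).
revisions = [
    (datetime(2026, 2, 20), {"estimate": 3, "tags": ["state:backlog"]}),
    (datetime(2026, 3, 1),  {"estimate": 8, "tags": ["state:wip"]}),   # panic re-estimate
    (datetime(2026, 3, 9),  {"estimate": 8, "tags": ["state:done"]}),
]

def state_at(revisions, when):
    """Return the last snapshot at or before `when` (None if before history)."""
    idx = bisect_right([ts for ts, _ in revisions], when)
    return revisions[idx - 1][1] if idx else None
```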

&lt;h2&gt;
  
  
  The MCP Part Is The Interesting Part
&lt;/h2&gt;

&lt;p&gt;Here's where it gets weird in a good way.&lt;/p&gt;

&lt;p&gt;The CLI doubles as an MCP server over stdio. Which means Claude — or any AI assistant that speaks MCP — can manage your backlog directly. Create tickets, assign tags, run priority calculations, generate reports. All from a conversation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj7e5lti6ti1xyzxv764l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj7e5lti6ti1xyzxv764l.png" alt=" " width="800" height="919"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I wrote in the original post that I wanted to bind agent integration to workflows, to have some control over machine-made changes. This is the answer to that. The MCP tools are the workflow. The AI calls them explicitly and the audit log catches everything it touches. Nothing happens silently.&lt;/p&gt;

&lt;p&gt;In Claude's own words:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I extracted all 18 Gherkin feature files from your features/ directory and converted each "Rule" block into a rewelo ticket — 104 stories total — with acceptance criteria derived from the scenarios, Fibonacci scores for benefit/penalty/estimate/risk, and system tags. I created the golden-season project in rewelo from scratch, set up 21 tags (3 part tags for B2G/NLS/24H and 18 system tags), and assigned every ticket its corresponding system:* tag. The backlog is now fully populated and prioritised, ready for sprint planning or further refinement like assigning part:* tags to map stories to the three-part implementation roadmap.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  A Word on Scope Creep Not Happening
&lt;/h2&gt;

&lt;p&gt;I am genuinely proud of what I did &lt;em&gt;not&lt;/em&gt; build.&lt;/p&gt;

&lt;p&gt;No user accounts. No sharing. No real-time collaboration. No mobile app. No integrations with Slack, GitHub, Linear, Jira, or anything that would require me to maintain OAuth tokens at 2am.&lt;/p&gt;

&lt;p&gt;This tool is for one person — me — and it does that job well. The moment I start building for an imaginary team of five, I stop building for myself and start building a worse version of tools that already exist.&lt;/p&gt;

&lt;p&gt;I've read enough HN threads to know how that ends.&lt;/p&gt;

&lt;h2&gt;
  
  
  It Exists. You Can Download It.
&lt;/h2&gt;

&lt;p&gt;Here it is: &lt;a href="https://github.com/sebs/rewelo" rel="noopener noreferrer"&gt;github.com/sebs/rewelo&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sometimes the best tool is the one that you actually finish.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vibecoding</category>
      <category>agile</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Generator Generator</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Tue, 10 Mar 2026 12:39:08 +0000</pubDate>
      <link>https://forem.com/sebs/generator-generator-1m0h</link>
      <guid>https://forem.com/sebs/generator-generator-1m0h</guid>
      <description>&lt;p&gt;Suppose you need to produce a physical set of Agile Workshop Tokens for your development team. Forty distinct pieces: planning poker chips embossed with Fibonacci values, sprint coins, retrospective cards, role medallions. You have two ways to ask an AI for help.&lt;/p&gt;

&lt;p&gt;The first way: ask a generative model to show you what an Agile Workshop Token looks like. It obliges. You receive a pleasant render of a melted-looking coin bearing the legend "SCRROM MSTR." It has no dimensional accuracy, no awareness of Agile methodologies, and no manufacturing utility whatsoever. This is the Nano Banana approach — prompting a black-box model to spit out a singular, static artifact. It yields a statistically probable object frozen in time: zero parametric control, no hierarchical semantics, and a complete ignorance of physical constraints. It is an artifact stripped of its axioms.&lt;/p&gt;

&lt;p&gt;The second way is the subject of this essay.&lt;/p&gt;

&lt;p&gt;Instead of asking for a token, you ask the AI to design a classical procedural grammar for manufacturing an entire ecosystem of tokens. The output is not an image. It is a generative engine — a formal system of rules, constraints, and stochastic parameters that, when executed, produces exactly 40 watertight, functionally distinct, 3D-printable models. This is the paradigm shift: moving from the discrete generation of objects to the synthesis of generative systems.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;f(Direct_Prompt) → X_static
f(System_Prompt) → Parametric_Engine(Θ)
Σ Parametric_Engine(p, t) = ∞
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The token set will reappear throughout this essay as our running example, grounding each theoretical claim in the concrete output of a real pipeline.&lt;/p&gt;




&lt;h2&gt;
  
  
  I. The Nano Banana Fallacy
&lt;/h2&gt;

&lt;p&gt;Direct asset generation treats artificial intelligence as a glorified vending machine. The fundamental problem is not quality — modern diffusion models produce convincing images. The problem is structure. A static output has no memory of how it was made, no handles by which it can be adjusted, and no awareness of the constraints that govern the domain it represents.&lt;/p&gt;

&lt;p&gt;In game development, the fallacy is obvious. A neural network hallucinating a 3D mesh of a sword gives you messy topology that cannot be animated, lacks collision volumes, and has no semantic awareness of its own edge flow. The Nano Banana sword is useless the moment you need a second sword that is longer, or rustier, or held by a different character.&lt;/p&gt;

&lt;p&gt;A generated system, by contrast, outputs the deterministic procedural grammar to forge a million weapons. It defines the hierarchical L-system of the hilt, the Bézier constraints of the blade curvature, and the algorithmic distribution of surface wear based on a runtime age parameter. It generates the mathematical forge, not the singular sword.&lt;/p&gt;

&lt;p&gt;Apply this lens to our token set. A direct prompt gives us one melted coin. The meta-generative approach gives us a shape grammar:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;G = (V, Σ, R, S)

Where V is the set of abstract token categories,
      Σ is the terminal geometries (hexagons, discs, shields, rectangles),
      R represents the substitution rules,
      S is the starting axiom.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That grammar, when executed, produces not one token but a coherent family of forty — each geometrically distinct, each semantically correct, each printable.&lt;/p&gt;
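&lt;p&gt;To make the SPLIT/SUB machinery concrete, here is a toy Python rendering of the grammar. The dictionaries mirror the counts and shapes from the essay; everything else (function names, the token dict) is an illustrative sketch, not the real pipeline:&lt;/p&gt;

```python
import random

# Toy version of G = (V, Σ, R, S): abstract categories (V) are substituted
# into terminal geometries (Σ) by rules (R), starting from the axiom "Token".
SPLIT = {"VotingChip": 13, "SprintCoin": 8, "RetroCard": 12, "RoleMedallion": 7}
SUB = {
    "VotingChip":    ("disc",      ["0", "½", "1", "2", "3", "5", "8", "13", "21", "?"]),
    "SprintCoin":    ("hexagon",   ["⚡", "↗", "✓", "🔥"]),
    "RetroCard":     ("rectangle", ["+", "Δ", "−"]),
    "RoleMedallion": ("shield",    ["SM", "PO", "Dev", "QA", "UX", "STK", "Coach"]),
}

def run_grammar(seed=0xA91ECAFE):
    rng = random.Random(seed)
    tokens = []
    for category, count in SPLIT.items():      # [SPLIT] partition the axiom
        shape, glyphs = SUB[category]          # [SUB]   bind category to geometry
        for i in range(count):                 # [TERMINAL] instantiate
            tokens.append({"category": category, "shape": shape,
                           "glyph": glyphs[i % len(glyphs)],
                           "seed": rng.getrandbits(32)})
    return tokens
```

Running the same seed always yields the same family of forty, which is the point: the artifact is the engine, not any single coin.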




&lt;h2&gt;
  
  
  II. The Three Phases of Meta-Generation
&lt;/h2&gt;

&lt;p&gt;The pipeline that produced our token set operates in three distinct phases. These phases are universal — they apply equally to architectural design, material science, and synthetic data generation. Understanding them is the key to applying the meta-generative approach in any domain.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase One: Semantic Partitioning (SPLIT)
&lt;/h3&gt;

&lt;p&gt;The generator's first task is not to draw anything. It is to understand the domain well enough to partition the problem space correctly. For the token set, this means recognising that "Agile Workshop" implies a specific ecosystem of functional object types with specific real-world usage distributions.&lt;/p&gt;

&lt;p&gt;The system allocates thirteen VotingChips — because planning poker requires a full Fibonacci sequence plus variants — eight SprintCoins, twelve RetroCards, and seven RoleMedallions. These numbers are not arbitrary. They reflect the actual ratio of pieces required for a workshop of eight to twelve people. A Nano Banana generator ignores this entirely; it does not know what a sprint retrospective is.&lt;/p&gt;

&lt;p&gt;In architectural morphogenesis, the equivalent phase generates Voronoi tessellation logic for load-bearing steel, partitioning the structural problem into zones parameterized against wind shear and solar radiation. The output is not a building; it is a spatial algorithm.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase Two: Parametric Substitution (SUB)
&lt;/h3&gt;

&lt;p&gt;Once the semantic partitions exist, the generator binds each abstract category to concrete geometry and encodes domain-specific rules as mathematical constraints. This is where the system demonstrates genuine understanding.&lt;/p&gt;

&lt;p&gt;For VotingChips, the generator correctly applies the Fibonacci sequence to the value distribution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;f(n) = f(n−1) + f(n−2)  ∀ VotingChip_value
Values: 0, ½, 1, 2, 3, 5, 8, 13, 21, ?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not a cosmetic choice. Fibonacci values in planning poker reflect a deliberate epistemological decision: the gaps between numbers encode increasing uncertainty. A generator that assigned random integers would produce a set that looks like poker chips but fails as a planning tool. The meta-generative system encodes the mathematical rule directly into the substitution grammar.&lt;/p&gt;
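&lt;p&gt;The deck itself is easy to derive from the recurrence. A small sketch; note that 0, ½, and ? are special cards that sit outside the Fibonacci rule:&lt;/p&gt;

```python
def poker_values(n_fib=7):
    """Fibonacci core of the planning-poker deck, plus the special cards
    0, ½ and ? that do not follow f(n) = f(n-1) + f(n-2)."""
    fib = [1, 2]
    for _ in range(n_fib - 2):
        fib.append(fib[-1] + fib[-2])
    return ["0", "½"] + [str(v) for v in fib] + ["?"]
```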

&lt;p&gt;SprintCoins become hexagons (stable, grippable, stackable); RetroCards become rectangles with write-on surfaces and lane indicators (+/Δ/−) that map to the three-column retrospective format; RoleMedallions become shields, differentiated by weight and finish from the lighter functional tokens.&lt;/p&gt;

&lt;p&gt;The industrial engineering parallel is the generation of topology optimisation algorithms for triply periodic minimal surfaces — gyroids and diamond lattices — where the substitution rules encode physical constraints like energy absorption and mass minimisation rather than Fibonacci sequences and retrospective formats. The structure of the problem is identical.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase Three: Instantiation (TERMINAL)
&lt;/h3&gt;

&lt;p&gt;The grammar resolves. Terminal rules execute Boolean mesh operations, stamp glyphs onto base geometry, apply edge-banding for physical grip, assign material properties, and export watertight models. For the token set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AXIOM: Token → queued

────────────────────────────────────────────────────────────
[SPLIT]    Token → VotingChip    ×13 instances
[SPLIT]    Token → SprintCoin    ×8  instances
[SPLIT]    Token → RetroCard     ×12 instances
[SPLIT]    Token → RoleMedallion ×7  instances

────────────────────────────────────────────────────────────
[SUB]      VotingChip    → BaseDisc    + ValueBadge(0,½,1,2,3,5,8,13,21,?)
[SUB]      SprintCoin    → BaseHex     + FaceGlyph(⚡×3, ↗×2, ✓×2, 🔥×1)
[SUB]      RetroCard     → BaseRect    + WriteSurface + CategoryBar(+/Δ/−)
[SUB]      RoleMedallion → BaseShield  + RoleGlyph(SM,PO,Dev,QA,UX,STK,Coach)

────────────────────────────────────────────────────────────
[TERMINAL] 40 meshes instantiated, glyphs stamped, edge-bands placed
[TERMINAL] GENERATION COMPLETE: 40 tokens, seed 0xA91ECAFE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same three-phase structure appears in every successful application of meta-generation: partition the domain, bind abstract categories to mathematically constrained geometry, instantiate. The domain changes; the architecture does not.&lt;/p&gt;




&lt;h2&gt;
  
  
  III. Where the Paradigm Extends
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Parametric Architecture and Urban Morphogenesis
&lt;/h3&gt;

&lt;p&gt;When an architect uses a basic generative AI to create a render of a futuristic building, they receive a beautiful hallucination that fundamentally ignores thermodynamics, zoning laws, and material tensile strength. It is, like the melted coin, a useless image.&lt;/p&gt;

&lt;p&gt;A meta-generative system does not output a building. It outputs a spatial algorithm. It generates Voronoi tessellation logic for load-bearing steel, parameterizing geometry against wind shear and solar radiation. Instead of a static blueprint, the system defines a generative grammar where floorplans dynamically restructure themselves based on traffic flow optimisations and HVAC efficiency constraints. Critically, the geometry produced can be mathematically constrained by Cauchy's equilibrium equation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;∇ · σ + F = 0
σ = C : ε
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every procedural strut is guaranteed to actually support its structural load — something no hallucinated render can promise.&lt;/p&gt;

&lt;p&gt;Note the structural echo of the token pipeline. The architect's grammar partitions a building into zones (SPLIT), substitutes each zone with geometrically constrained elements (SUB), and instantiates watertight, structurally valid meshes (TERMINAL). The domain is different. The architecture is the same.&lt;/p&gt;

&lt;h3&gt;
  
  
  Synthetic Data and Kinematic Reality
&lt;/h3&gt;

&lt;p&gt;Machine learning models for autonomous driving or robotics cannot be trained on static images. They require rich, physically accurate synthetic environments. A direct-to-image AI cannot generate a functioning physics simulation.&lt;/p&gt;

&lt;p&gt;A meta-generative system writes the deterministic rules for a dynamic world. It parameterises the friction coefficients of procedural asphalt, generates the optical scattering algorithms of simulated fog, and orchestrates the localised stochastic behaviours of pedestrian traffic models. The system continuously updates its probability distributions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;P(A | B) = P(B | A) · P(A) / P(B)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The generative system iterates its own parameters to generate edge-case scenarios — the long tail of rare events that directly trains physical robotics.&lt;/p&gt;

&lt;p&gt;The token pipeline's stochastic layer encodes the same principle at a smaller scale. Hue is not fixed; it is drawn from a normal distribution centred on the family colour with standard deviation 12°, creating warm and cool variants within each token family. The spatial seed map ensures that tokens physically adjacent on a print sheet share correlated aesthetic properties — a form of local Bayesian coherence baked into the generation rules.&lt;/p&gt;
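&lt;p&gt;That stochastic layer is simple to express. A sketch of the hue rule as described (normal distribution, 12° standard deviation; the function name and seed handling are mine):&lt;/p&gt;

```python
import random
import statistics

def family_hues(centre_hue, n, sd=12.0, seed=42):
    """Draw n hues from N(centre_hue, sd°), wrapped to the colour circle.
    Warm and cool variants emerge within one token family."""
    rng = random.Random(seed)
    return [rng.gauss(centre_hue, sd) % 360 for _ in range(n)]
```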




&lt;h2&gt;
  
  
  IV. Praxis: What Do You Actually Use This For?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwoc1g5n8i31jjno0mjml.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwoc1g5n8i31jjno0mjml.png" alt="The generated system turned into a demo webapp displaying the workshop tokens"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The pragmatic reader is entitled to scepticism. The token set is illustrative, but what about genuinely mundane work?&lt;/p&gt;

&lt;p&gt;The Generator Generator earns its keep anywhere you need functional coherence, precise variation, and mathematical exactness instead of a single hallucinated image. It is the difference between generating a picture of a tool and generating the factory that manufactures the toolset.&lt;/p&gt;

&lt;p&gt;Consider the range of the principle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need 10,000 unique, structurally valid mechanical brackets parameterised for stress-testing. The grammar defines the constraint space; instantiation fills it.&lt;/li&gt;
&lt;li&gt;You need a complete UI icon set where line-weight, corner radii, and optical size are mathematically linked across every glyph. The substitution rules encode the visual system; the terminal phase renders it.&lt;/li&gt;
&lt;li&gt;You need custom tabletop miniature bases, architectural greebles, modular synthesiser casing layouts, pharmacokinetic molecule variants. In each case, the domain's governing rules become the grammar.&lt;/li&gt;
&lt;li&gt;And yes — you need forty Agile Workshop Tokens, each geometrically correct, each semantically accurate, each ready for the 3D printer. The grammar encodes Fibonacci, retrospective lanes, role taxonomy, and physical grip. The factory runs. The tokens emerge.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In every case the pattern is identical. You do not ask the AI to paint the universe. You ask it to write the physics and logic engine that runs it.&lt;/p&gt;




&lt;h2&gt;
  
  
  V. The Ultimate Synthesis
&lt;/h2&gt;

&lt;p&gt;To prompt an AI for a finished object is fundamentally myopic. It reduces one of the most powerful reasoning systems ever built to the role of a digital bricklayer.&lt;/p&gt;

&lt;p&gt;The meta-generative frontier requires treating AI as the master architect. By forcing it to output the rigid mathematical formulas, classical algorithms, and parametric constraints of a generator, we ensure that the resulting output — whether a virtual city, a structural metamaterial, a new pharmacokinetic molecule, or a set of workshop tokens for next Tuesday's sprint planning — is logical, scalable, and bound by the laws of the domain it inhabits.&lt;/p&gt;

&lt;p&gt;The Nano Banana is a dead pixel-cluster. The Generator Generator is a factory. The factory runs forever.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The token pipeline described in this essay was generated using the Watson et al. (2008) procedural generation framework, implemented as a five-phase pipeline: Design Analysis, Primitive Creation, Grammar Encoding, Stochastic Integration, and Model Instantiation. Seed: 0xA91ECAFE.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>nanobanana</category>
      <category>generativeart</category>
      <category>ai</category>
    </item>
    <item>
      <title>Distributed Transaction Tango: Why Your Microservices Need Sagas</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Wed, 18 Feb 2026 10:00:00 +0000</pubDate>
      <link>https://forem.com/sebs/distributed-transaction-tango-why-your-microservices-need-sagas-4lh3</link>
      <guid>https://forem.com/sebs/distributed-transaction-tango-why-your-microservices-need-sagas-4lh3</guid>
      <description>&lt;p&gt;The move to microservices was supposed to be a liberation. We broke free from the monolithic chains, gaining the freedom to develop, deploy, and scale our services independently. But in our rush to embrace this new world, we left something critical behind: the simple, comforting safety of the ACID transaction. In the monolithic world, if a complex business process failed halfway through, we had a magic word: &lt;code&gt;ROLLBACK&lt;/code&gt;. It was our ultimate undo button, a guarantee that our data would never be left in a messy, inconsistent state. In the distributed chaos of microservices, where each service has its own private database, that safety net is gone. We have traded the simplicity of a single, atomic transaction for a new kind of fear—the constant, nagging anxiety that a partial failure will leave our system permanently broken.&lt;/p&gt;

&lt;p&gt;Our first instinct in this new reality is often to try and recreate the old one. We might reach for complex, heavyweight protocols like two-phase commits in a desperate attempt to stretch a transaction across multiple services. This approach is a trap. It reintroduces the very coupling we sought to escape, creating a brittle, slow, and unscalable system where the failure of one service can bring the entire process to a grinding halt. An even more common, and far more dangerous, response is to simply ignore the problem. We write our services to handle the “happy path,” crossing our fingers and hoping that the network is reliable and every service is always available. This is not engineering; it is wishful thinking. It inevitably leads to disaster: a customer is billed for an item that is out of stock, a user’s account is debited but their access is not granted, and our data drifts into a state of irreconcilable chaos.&lt;/p&gt;

&lt;p&gt;We must accept that in a distributed system, partial failure is not an edge case; it is a certainty. The Saga pattern offers a way out of this trap by forcing us to confront this reality head-on. It is a fundamental shift in thinking: instead of trying to prevent failure with a single, all-or-nothing transaction, we manage it with a series of small, reversible steps. A saga is a sequence of local transactions, where each step is a self-contained operation within a single service. The magic lies in the second half of the pattern: for every action that moves the process forward, we must define a corresponding “compensating action” that can undo it. The saga doesn’t prevent failure; it provides a clear, automated path to recovery. The relationship is straightforward:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Compensating Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Create Order&lt;/td&gt;
&lt;td&gt;Order Service&lt;/td&gt;
&lt;td&gt;Delete Order&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reserve Item&lt;/td&gt;
&lt;td&gt;Inventory Service&lt;/td&gt;
&lt;td&gt;Release Item&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Process Payment&lt;/td&gt;
&lt;td&gt;Payment Service&lt;/td&gt;
&lt;td&gt;Refund Payment&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This sequence of actions and compensating actions can be managed in one of two primary ways. The first approach is orchestration, where a central coordinator acts like a conductor, telling each service what to do and when. It calls the customer service, then the inventory service, then the billing service. If any step fails, the orchestrator takes responsibility for calling the necessary compensating actions in reverse order to clean up the mess. The alternative is choreography, a more decentralized dance where each service, upon completing its local transaction, simply emits an event. The next service in the chain listens for this event and is triggered to perform its own work. In this model, there is no central brain; the logic is distributed across the event streams. Choosing between them is a trade-off between having a single point of control and visibility versus a more decoupled, and potentially more complex, event-driven architecture.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Orchestration&lt;/th&gt;
&lt;th&gt;Choreography&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Coordination&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Centralized coordinator manages all steps&lt;/td&gt;
&lt;td&gt;Decentralized; services react to each other's events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Control&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High; logic is in one place&lt;/td&gt;
&lt;td&gt;Low; logic is distributed across services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Visibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High; easy to see the state of a saga&lt;/td&gt;
&lt;td&gt;Low; requires monitoring event streams to trace a saga&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Coupling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Tightly coupled to the orchestrator&lt;/td&gt;
&lt;td&gt;Loosely coupled; services only know about events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simpler for sagas with few participants&lt;/td&gt;
&lt;td&gt;Can become complex to track with many participants&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
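&lt;p&gt;To make the contrast concrete, here is a minimal, hypothetical choreography sketch in Python: an in-process event bus stands in for a real message broker, and each "service" is just a handler that reacts to one event and emits the next. All service names, event names, and payload fields are illustrative.&lt;/p&gt;

```python
import collections

class EventBus:
    """Tiny in-process event bus; a stand-in for a real message broker."""
    def __init__(self):
        self.handlers = collections.defaultdict(list)

    def subscribe(self, event_name, handler):
        self.handlers[event_name].append(handler)

    def publish(self, event_name, payload):
        for handler in list(self.handlers[event_name]):
            handler(payload)

bus = EventBus()
trace = []  # records what each "service" did, for illustration

def inventory_on_order_created(order):
    trace.append("inventory reserved")
    bus.publish("item.reserved", order)

def payment_on_item_reserved(order):
    if order.get("card_ok"):
        trace.append("payment processed")
        bus.publish("payment.processed", order)
    else:
        trace.append("payment failed")
        bus.publish("payment.failed", order)

def inventory_on_payment_failed(order):
    # The compensating action, triggered purely by an event.
    trace.append("inventory released")
    bus.publish("item.released", order)

bus.subscribe("order.created", inventory_on_order_created)
bus.subscribe("item.reserved", payment_on_item_reserved)
bus.subscribe("payment.failed", inventory_on_payment_failed)
```

&lt;p&gt;Publishing &lt;code&gt;order.created&lt;/code&gt; with a failing card walks the failure path: the inventory handler reserves, the payment handler fails, and the failure event triggers the release. No single component holds the whole picture, which is exactly the visibility trade-off shown in the table above.&lt;/p&gt;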

&lt;p&gt;Adopting the Saga pattern is not a free lunch. It introduces a new kind of complexity, demanding that we explicitly design for failure and recovery. We must build, test, and maintain these compensating transactions, which adds to the development overhead. It also forces us to embrace the concept of eventual consistency, accepting that there will be brief moments where the system is in an intermediate state. But the payoff is a system that is resilient by design. It is a system that can gracefully handle the inevitable failures of a distributed world without losing data or requiring manual intervention. Sagas are more than a design pattern; they are an acknowledgment that the world of microservices is messy and unpredictable. By embracing this reality, we can finally build systems that are not just scalable and independent, but also truly robust.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>distributedsystems</category>
      <category>acid</category>
    </item>
    <item>
      <title>The Build vs. Buy Trap: Why You Should Be Assembling Instead</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Tue, 17 Feb 2026 09:00:00 +0000</pubDate>
      <link>https://forem.com/sebs/the-build-vs-buy-trap-why-you-should-be-assembling-instead-29f4</link>
      <guid>https://forem.com/sebs/the-build-vs-buy-trap-why-you-should-be-assembling-instead-29f4</guid>
      <description>&lt;p&gt;For a while, engineering teams have been trapped in a false dichotomy, a binary choice that has dictated the shape of our projects and the fate of our budgets: do we build it or do we buy it? The "build" path is a siren song of ultimate control, promising a bespoke solution perfectly tailored to our unique needs. We imagine crafting a flawless system from the ground up, but we conveniently ignore the brutal reality of the resources it will consume, the maintenance burden it will become, and the high cost of pivoting when our perfect requirements inevitably change. On the other side lies the pragmatic allure of "buy," the promise of an off-the-shelf solution that gets us to market faster. Yet, this path often leads to the frustration of shoehorning a generic product into a specific problem, the operational headache of running and patching someone else’s software, and the creeping dread of vendor lock-in. We have been conditioned to see these two paths as our only options, but this rigid mindset is a relic of a bygone era, and it is holding us back.&lt;/p&gt;

&lt;p&gt;The first real evolution beyond this binary trap was the rise of the “rent” model, a paradigm shift powered by the SaaS explosion. Instead of buying the software and running it ourselves, we could simply pay a subscription and outsource the entire problem. For a company needing to perform a complex data processing task, this meant no longer building a dedicated compute cluster or managing a licensed software suite; they could just call a third-party API. This approach offers undeniable advantages: near-zero operational overhead, instant access to specialized expertise, and a predictable cost model. However, it comes at the steep price of control. When you rent, you are a tenant in someone else’s ecosystem. Your destiny is tied to their SLA, their feature roadmap, and their security posture. The service is a black box, and when it fails, you are left powerless, endlessly refreshing a status page. It is the pinnacle of convenience, but it forces a trade-off between ease and ownership that many businesses are rightly hesitant to make.&lt;/p&gt;

&lt;p&gt;This is where the truly cloud-native paradigm emerges, offering a fourth option that transcends the old debate: we can “assemble.” This isn’t about building from scratch, but about composing a sophisticated solution from a palette of smaller, fully managed, best-of-breed services—the Lego bricks provided by a modern cloud platform. Instead of renting a black-box API, a team can assemble its own data-processing pipeline. They use a cloud storage service for raw data, a message queue to trigger jobs, and a managed compute service to perform the transformation. The critical distinction is that they own the &lt;em&gt;workflow&lt;/em&gt;, the &lt;em&gt;logic&lt;/em&gt;, and the &lt;em&gt;configuration&lt;/em&gt;, but they are completely liberated from managing the underlying &lt;em&gt;infrastructure&lt;/em&gt;. This is the synthesis we have been searching for: the customization and control of the “build” world combined with the operational simplicity of the “rent” world.&lt;/p&gt;

&lt;p&gt;Adopting this assembly-line mindset requires a fundamental shift in how we evaluate cost and effort, because teams consistently fall into the trap of miscalculation. We dramatically overestimate our ability to build a solution quickly and cheaply, forgetting that the initial development cost is merely the tip of the Total Cost of Ownership (TCO) iceberg. We ignore the immense, ongoing costs of maintenance, security patching, scaling, and the operational staff required to keep a custom service alive. We also underestimate the complexity of running a “bought” solution, which is never as simple as the sales pitch suggests. The “assemble” model may appear to have a higher cost per transaction, but its TCO is often drastically lower because entire categories of work—like managing servers, planning capacity, or patching operating systems—are eliminated for significant parts of an application. This requires a cultural shift in financial thinking, moving from a world of upfront capital expenditure to a more fluid, pay-as-you-go operational model.&lt;/p&gt;

&lt;p&gt;Ultimately, the most strategic question is not about cost, but about focus. The decision to build, buy, rent, or assemble should be a conscious choice about where to invest your team’s most valuable and finite resource: their attention. It makes little sense to divert your best engineers to build a mediocre version of a solved problem, like a message queue or a workflow engine, when they could be working on the unique features that actually differentiate your business and create a competitive edge. The assembly model allows us to outsource the undifferentiated heavy lifting to the cloud provider, freeing our teams to focus on the core business logic where they can create the most value. The future of engineering is not about being the best at building everything from scratch; it is about being the best at intelligently assembling the powerful components that are already at our fingertips.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>tco</category>
    </item>
    <item>
      <title>Culture Trap: Why Your DevOps Transformation is Failing</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Mon, 16 Feb 2026 11:04:14 +0000</pubDate>
      <link>https://forem.com/sebs/culture-trap-why-your-devops-transformation-is-failing-22l6</link>
      <guid>https://forem.com/sebs/culture-trap-why-your-devops-transformation-is-failing-22l6</guid>
      <description>&lt;p&gt;The arrival of DevOps promised a revolution, but for many, it has become a frustrating exercise in cargo cultism. We have dutifully acquired the artifacts of modern engineering: the CI/CD pipelines are humming, the Kubernetes clusters are provisioned, and our infrastructure is meticulously defined in Terraform. We track DORA metrics, run our daily stand-ups, and our dashboards are a kaleidoscope of real-time data. From the outside, it looks like a textbook transformation. We have all the visible symbols of a high-functioning DevOps environment. Yet, the deployments remain fraught with anxiety, the blame-game after an outage is as fierce as ever, and the wall between development and operations, while perhaps more technologically advanced, stands as tall as ever. We have fallen for the great illusion of our industry: mistaking the visible tools and rituals for the culture itself, and in doing so, we have completely missed the point.&lt;/p&gt;

&lt;p&gt;This disconnect becomes painfully obvious when you compare what we say to what we do. Our internal wikis and design documents are filled with the noble espoused values of the DevOps movement. We champion "collaboration over silos," we preach the gospel of "continuous improvement," and we proudly display Werner Vogels’ mantra, "You build it, you run it," on our conference room screens. We justify architectural decisions like microservices or immutable infrastructure with these very principles. But these stated beliefs often serve as a thin veneer over a contradictory reality. We talk about collaboration, but developers still throw code over the wall for Ops to handle at 3 AM. We advocate for continuous improvement, but our post-mortems devolve into finger-pointing sessions. We claim teams have ownership, but a developer still needs five layers of approval to provision a new database. The principles we claim to hold dear are not the principles that actually govern our behavior, creating a cynical gap between the culture we advertise and the one we actually live.&lt;/p&gt;

&lt;p&gt;The truth is that no amount of tooling or inspirational posters can fix a problem that lies at a much deeper, invisible level. The real drivers of an organization's culture are not the visible artifacts or the stated values, but the unspoken, taken-for-granted assumptions that shape every decision and action. These are the beliefs so deeply ingrained that we no longer question them. Do we, as an organization, truly believe that failure is an opportunity to learn, or do we instinctively search for the person to blame? Do our engineers feel the psychological safety to admit a mistake or ask a “stupid” question, or do they fear looking incompetent? Do we fundamentally assume that our people are responsible professionals who can be trusted with autonomy, or do we assume they need to be constrained by rigid processes and approvals to prevent chaos? Until these foundational beliefs are confronted and changed, we are just rearranging the deck chairs on the Titanic.&lt;/p&gt;

&lt;p&gt;This is why so many top-down DevOps initiatives fail. They focus on changing the visible things—the tools and the processes—while leaving the invisible, underlying assumptions untouched. A true transformation works from the inside out. It begins by fostering a genuine, shared belief in collective ownership, where the team responsible for building a service is also truly empowered and responsible for running it in production. It requires leadership to model a new response to failure, treating it not as a punishable offense but as an invaluable, inevitable part of innovation. When these core assumptions shift, the espoused values become authentic, and the artifacts of DevOps naturally follow as their logical expression. A team that genuinely believes in ownership will naturally gravitate towards IaC and robust monitoring because those tools empower them to fulfill their responsibilities. A culture that truly sees failure as a learning opportunity will conduct blameless post-mortems as a matter of course.&lt;/p&gt;

&lt;p&gt;Ultimately, DevOps is not a technical specification or a process framework that can be installed. It is a cultural outcome that emerges from a set of deeply held, shared assumptions about how people work together to build and deliver software. The tools are secondary; they are the means, not the end. The journey to a healthy DevOps culture is not about buying a new platform or mandating a new workflow. It is the much harder, more human work of examining our own unspoken beliefs about trust, failure, and responsibility. It is only when we change those foundational assumptions that we can escape the culture trap and begin to realize the true promise of a collaborative, resilient, and humane way of working.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>cicd</category>
      <category>terraform</category>
    </item>
    <item>
      <title>The Over-Abstraction Trap: Why We Need to Stop Over-Engineering Our Infrastructure</title>
      <dc:creator>Sebastian Schürmann</dc:creator>
      <pubDate>Fri, 23 Jan 2026 16:35:50 +0000</pubDate>
      <link>https://forem.com/sebs/the-over-abstraction-trap-why-we-need-to-stop-over-engineering-our-infrastructure-3737</link>
      <guid>https://forem.com/sebs/the-over-abstraction-trap-why-we-need-to-stop-over-engineering-our-infrastructure-3737</guid>
      <description>&lt;p&gt;The arrival of Infrastructure as Code (IaC) promised a fundamental shift in how we manage our digital environments, offering a future where automation, repeatability, and clarity would replace the chaos of manual configuration. Tools like Terraform, Bicep, and the AWS CDK rapidly became industry standards, delivering on that promise by allowing us to version our infrastructure alongside our application code. However, as these tools have matured, a subtle but pervasive anti-pattern has emerged within the industry: a tendency toward excessive abstraction that prioritizes theoretical "best practices" over the practical reality of reading and maintaining code. We have reached a point where the pursuit of "clean code" is ironically leading to systems that are opaque, fragile, and far more difficult to manage than the manual processes they replaced.&lt;/p&gt;

&lt;p&gt;If you have worked in a modern DevOps environment, you have likely encountered the "best practice" trap firsthand. It usually begins when an engineer attempts to define a simple resource, like an S3 bucket or a virtual machine, only to be blocked during code review because they didn't use the company's standardized module. The justification for this pushback is almost always rooted in the principles of software engineering, specifically the desire to keep code DRY (Don't Repeat Yourself) and to enforce governance at scale. Consequently, engineers find themselves under immense social pressure to wrap their simple declarative logic in layers of modules and variable maps, forcing them to defend a straightforward solution against a complex one that is perceived as superior simply because it is more abstract.&lt;/p&gt;

&lt;p&gt;The hidden cost of this approach is that these layers of abstraction are, in effect, software, yet they lack the rigorous testing standards we apply to actual application code. When a team wraps a Terraform resource in complex logic to make it "reusable," they are essentially writing an untested library that sits at the very foundation of their production stack. This introduces a significant amount of cognitive overhead for anyone trying to debug the system later; instead of simply reading a file to see what infrastructure will be deployed, an engineer must mentally compile the code, tracing variables through multiple files and modules to understand the final state. The declarative beauty of "what I want" is entirely lost in favor of the procedural complexity of "how I generate it," and frequently, these "reusable" modules are so tightly coupled to a specific use case that they are never actually reused, rendering the entire exercise a waste of time.&lt;/p&gt;

&lt;p&gt;A stark and refreshing contrast to this trend can be found by observing the community that has grown around the Hetzner Cloud (hcloud) Terraform provider. Unlike the complex, multi-layered architectures often seen in AWS or Azure implementations, the Hetzner community culture embraces a philosophy of aggressive simplicity where configurations are usually flat, explicit, and incredibly easy to read. While an enterprise team might obscure a server definition behind a generic "compute" module with thirty different variable toggles, a typical hcloud user will simply declare the resource directly, specifying the image and server type in plain text. This difference highlights a crucial realization: the tool itself, Terraform, does not mandate complexity; rather, it is the culture surrounding the tool that dictates how it is used.&lt;/p&gt;
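&lt;p&gt;For illustration, the flat style looks something like this. The &lt;code&gt;hcloud_server&lt;/code&gt; resource and its &lt;code&gt;image&lt;/code&gt;, &lt;code&gt;server_type&lt;/code&gt;, and &lt;code&gt;location&lt;/code&gt; arguments come from the Hetzner provider; the specific name and values are placeholders:&lt;/p&gt;

```hcl
terraform {
  required_providers {
    hcloud = {
      source = "hetznercloud/hcloud"
    }
  }
}

# One server, declared directly: no wrapper module, no variable maps.
# Reading this file tells you exactly what will be deployed.
resource "hcloud_server" "web" {
  name        = "web-1"
  image       = "ubuntu-24.04"
  server_type = "cx22"
  location    = "nbg1"
}
```

&lt;p&gt;There is nothing to mentally compile here: the declarative “what I want” is the whole file.&lt;/p&gt;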

&lt;p&gt;This cultural divergence likely stems from the different motivations of the two groups, but it is also significantly influenced by the extensive training and certification ecosystems that surround major hyperscalers like AWS and Azure. While the Hetzner community often prioritizes immediate utility and speed, the enterprise cloud world naturally leans toward the comprehensive, standardized patterns taught in certification courses and official reference architectures. These frameworks are designed to manage massive scale and complexity, but an unintentional side effect is that teams often adopt these sophisticated structures as the default simply "because it is written," applying enterprise-grade abstraction to projects that might benefit from a lighter touch. It is easy to fall into the habit of implementing a complex pattern just because it aligns with a Well-Architected Framework, rather than stepping back to ask if it effectively serves the specific needs of the current project. Ultimately, the goal is not to reject these established best practices, but to apply them with intention; we must balance the robust standards of the hyperscalers with the practical clarity found in simpler communities, ensuring that our infrastructure code remains a helpful map for our teams rather than just a testament to our compliance with a textbook.&lt;/p&gt;

</description>
      <category>infrastructureascode</category>
      <category>terraform</category>
      <category>cdk</category>
      <category>bicep</category>
    </item>
  </channel>
</rss>
