<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Tai Dang</title>
    <description>The latest articles on Forem by Tai Dang (@dttai71).</description>
    <link>https://forem.com/dttai71</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3790503%2Fc03d7404-9808-4ddd-85dc-477d722958ab.png</url>
      <title>Forem: Tai Dang</title>
      <link>https://forem.com/dttai71</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dttai71"/>
    <language>en</language>
    <item>
      <title>How We Built a Governance Loop for AI Coding Agents</title>
      <dc:creator>Tai Dang</dc:creator>
      <pubDate>Tue, 24 Feb 2026 23:52:33 +0000</pubDate>
      <link>https://forem.com/dttai71/how-we-built-a-governance-loop-for-ai-coding-agents-da8</link>
      <guid>https://forem.com/dttai71/how-we-built-a-governance-loop-for-ai-coding-agents-da8</guid>
      <description>&lt;p&gt;AI coding agents are fast. Claude Code, Cursor, Copilot — they can generate hundreds of lines in seconds. But here's the uncomfortable truth we learned after testing five different multi-agent tools across five production projects:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without governance, speed just amplifies mistakes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the story of how we built &lt;a href="https://github.com/Minh-Tam-Solution/tinysdlc" rel="noopener noreferrer"&gt;TinySDLC&lt;/a&gt; — a minimal, open-source agent orchestrator that adds SDLC role discipline to AI coding. 8 roles, structured handoffs, separation of duties, security hardening — all local, zero external dependencies.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Starting Point: A Skeptical Team and a Non-Coding CEO
&lt;/h2&gt;

&lt;p&gt;Before we talk about architecture, let me be honest about where this started.&lt;/p&gt;

&lt;p&gt;In May 2025, I had a problem. My development team at MTS was slow to adopt AI coding tools. They were skeptical — and frankly, they had a point. They thought AI-generated code was full of bugs. "More time fixing than coding," they said. They used ChatGPT and Gemini individually for quick prompts, but had no team-wide process.&lt;/p&gt;

&lt;p&gt;I'm a CEO. Not a professional software developer — at least, not for the past 30 years. The last time I wrote code was 1994: Assembler and Borland C++ for my graduation thesis, a graphics application for designing electronic circuit boards. Object-oriented programming deeply shaped how I think about systems. Then I moved into management and never looked back. My team knew that. And when I pushed for AI adoption, they were polite but unconvinced: &lt;em&gt;the boss isn't a real software engineer.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I brought in experts. Ran a Claude Code workshop. The team was still slow to change.&lt;/p&gt;

&lt;p&gt;So I made a decision: I would learn it myself.&lt;/p&gt;

&lt;p&gt;I started with Python. Then free tools: LM Studio, and Ollama with Continue.dev. Then paid ones: GitHub Copilot, Cursor, Claude Code. Small apps first. Then I moved to real enterprise platforms: Bflow (&lt;a href="https://www.bflow.vn" rel="noopener noreferrer"&gt;bflow.vn&lt;/a&gt;, an ERP+BPM platform for Vietnamese SMEs, built by my MTS team and launched in October 2024); its successor Bflow 2.0 (ERP+BPM+AI, conversation-first, with AI as a core pillar); and NQH-Bot (an AI-powered workforce management platform for the Vietnamese F&amp;amp;B market, at Nhat Quang Holding, my second startup).&lt;/p&gt;




&lt;h2&gt;
  
  
  The Crisis: 679 Mocks and 78% Failure
&lt;/h2&gt;

&lt;p&gt;NQH-Bot, our workforce management platform for the Vietnamese F&amp;amp;B market, covered auto-scheduling, multi-tenant SaaS, and regional compliance. We were using AI coding tools heavily. The speed was incredible.&lt;/p&gt;

&lt;p&gt;Then we deployed to production.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;679 out of ~900 implementations were mock code (placeholder &lt;code&gt;// TODO: implement&lt;/code&gt; patterns)&lt;/li&gt;
&lt;li&gt;78% of production endpoints failed on real traffic&lt;/li&gt;
&lt;li&gt;6 weeks of debugging to untangle what the AI had generated vs. what was actually working&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI tools weren't broken. Our process was. We had no gates, no evidence capture, no structured review. The agents generated code, we skimmed it, and we shipped it.&lt;/p&gt;

&lt;p&gt;My team's skepticism was validated — but for the wrong reason. The problem wasn't AI. The problem was &lt;strong&gt;ungoverned AI&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That crisis gave birth to what we now call the &lt;strong&gt;Zero Mock Policy&lt;/strong&gt; — and eventually, to a complete governance framework.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Tried First: Five Multi-Agent Tools
&lt;/h2&gt;

&lt;p&gt;Over the following months, we experimented with different multi-agent orchestration approaches:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;What It Did&lt;/th&gt;
&lt;th&gt;Why It Wasn't Enough&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;TinyClaw&lt;/td&gt;
&lt;td&gt;@mention-based agent routing&lt;/td&gt;
&lt;td&gt;No governance loop, just routing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenClaw&lt;/td&gt;
&lt;td&gt;Lane-based message queue + failover&lt;/td&gt;
&lt;td&gt;Great infra, no quality gates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NanoBot&lt;/td&gt;
&lt;td&gt;Tool-context isolation + shell guards&lt;/td&gt;
&lt;td&gt;Security focused, not governance focused&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PicoClaw&lt;/td&gt;
&lt;td&gt;Lightweight single-agent wrapper&lt;/td&gt;
&lt;td&gt;Too simple for team workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ZeroClaw&lt;/td&gt;
&lt;td&gt;Output scrubbing + query classification&lt;/td&gt;
&lt;td&gt;Post-hoc safety, not pre-hoc governance&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each tool solved a piece of the puzzle. But none of them answered the fundamental question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;How do you ensure AI-generated code meets quality standards before it enters your codebase?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Architecture: Role Discipline + Structured Handoffs
&lt;/h2&gt;

&lt;p&gt;TinySDLC's architecture is built on two principles: &lt;strong&gt;separation of duties&lt;/strong&gt; and &lt;strong&gt;structured handoffs&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The methodology (MTS-SDLC-Lite) defines the governance loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────┐     ┌────────────┐     ┌────────────┐     ┌────────────┐
│    Spec    │────&amp;gt;│    Gate    │────&amp;gt;│  Evidence  │────&amp;gt;│  Approval  │
│  (Define)  │     │ (Validate) │     │ (Capture)  │     │ (Sign-off) │
└────────────┘     └────────────┘     └────────────┘     └────────────┘
       ↑                                                        │
       └───────────────────── Feedback loop ────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
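
&lt;p&gt;One way to picture the loop is as a tiny state machine. The sketch below is illustrative only and is not TinySDLC's code: the &lt;code&gt;STAGES&lt;/code&gt; list and the &lt;code&gt;advance&lt;/code&gt; function are hypothetical names, and the rule that a failed check sends work back to Spec is an assumption drawn from the feedback arrow in the diagram.&lt;/p&gt;

```python
# Illustrative sketch only; names are hypothetical, not TinySDLC's API.
STAGES = ["spec", "gate", "evidence", "approval"]


def advance(stage, passed):
    """Return the next stage of the governance loop.

    A passing check moves work one step forward. A failing check, or a
    completed approval, feeds back into a new spec.
    """
    if stage not in STAGES:
        raise ValueError("unknown stage: " + stage)
    if not passed or stage == "approval":
        return "spec"
    return STAGES[STAGES.index(stage) + 1]


if __name__ == "__main__":
    stage = "spec"
    for passed in (True, True, True):
        stage = advance(stage, passed)
    print(stage)  # approval
```

&lt;p&gt;In practice the "passed" decision would come from a person or a reviewing agent, not from code. The point of the sketch is only that work cannot reach Approval without passing through Gate and Evidence first.&lt;/p&gt;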



&lt;p&gt;TinySDLC enforces this loop through &lt;strong&gt;role constraints&lt;/strong&gt;, not automated gates:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Role isolation&lt;/strong&gt;: Each agent has a defined workspace, tool permissions, and scope. The coder can't approve its own output. The reviewer can't skip the tester. Separation of duties is structural, not optional.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structured handoffs&lt;/strong&gt;: Agents communicate through &lt;code&gt;@agent: message&lt;/code&gt; mentions. Work flows from researcher → architect → coder → reviewer → tester with explicit handoff points. No silent pass-through.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Event logging&lt;/strong&gt;: Every agent action is logged as a JSON event with correlation IDs — which agent did what, when, in response to what request. This gives you traceability, not just chat history:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"event"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"handoff"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"from_role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"coder"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"to_role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"reviewer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"correlation_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"conv-a1b2c3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"submit_for_review"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-02-18T14:32:01Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@reviewer: Auth service implementation ready for review"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
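
&lt;p&gt;To show why correlation IDs matter, here is a minimal sketch that builds an event with the same fields as the example above and then pulls out the trail for one conversation. The function names &lt;code&gt;make_handoff_event&lt;/code&gt; and &lt;code&gt;trace&lt;/code&gt; are assumptions made for illustration; TinySDLC's actual logging code may look quite different.&lt;/p&gt;

```python
import json
from datetime import datetime, timezone


def make_handoff_event(from_role, to_role, correlation_id, action, message, now=None):
    """Build a handoff event using the field names from the example above."""
    now = now or datetime.now(timezone.utc)
    return {
        "event": "handoff",
        "from_role": from_role,
        "to_role": to_role,
        "correlation_id": correlation_id,
        "action": action,
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "message": message,
    }


def trace(events, correlation_id):
    """Return the events of one conversation, oldest first.

    UTC timestamps in this fixed format sort correctly as plain strings.
    """
    matching = [e for e in events if e["correlation_id"] == correlation_id]
    return sorted(matching, key=lambda e: e["timestamp"])


if __name__ == "__main__":
    fixed = datetime(2026, 2, 18, 14, 32, 1, tzinfo=timezone.utc)
    event = make_handoff_event(
        "coder", "reviewer", "conv-a1b2c3", "submit_for_review",
        "@reviewer: Auth service implementation ready for review", now=fixed,
    )
    # One JSON object per line is easy to append to a file and to search.
    print(json.dumps(event))
```

&lt;p&gt;Filtering by &lt;code&gt;correlation_id&lt;/code&gt; is what turns a flat log into an audit trail: every handoff that belongs to one request can be read back in order.&lt;/p&gt;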



&lt;p&gt;&lt;strong&gt;Human checkpoints&lt;/strong&gt;: The methodology defines when a human should review. TinySDLC provides the structure; your team provides the judgment.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important distinction&lt;/strong&gt;: TinySDLC is a minimal agent orchestrator extracted from a larger internal system. It provides structure and role discipline — real governance with zero infrastructure. It's a complete, standalone tool, not a crippled version of something else.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The 8 Agent Roles
&lt;/h2&gt;

&lt;p&gt;TinySDLC defines 8 specialized roles, each with scoped permissions and responsibilities:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────────────────────────────┐
│                    Governance Layer                    │
│                                                        │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐   │
│  │Researcher│ │    PM    │ │   PJM    │ │ Architect│   │
│  │ (Explore)│ │(Require) │ │ (Track)  │ │ (Design) │   │
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘   │
│                                                        │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐   │
│  │  Coder   │ │ Reviewer │ │  Tester  │ │  DevOps  │   │
│  │(Generate)│ │ (Review) │ │  (Test)  │ │ (Deploy) │   │
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘   │
│                                                        │
└────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each role has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Defined tool permissions&lt;/strong&gt; (what the agent can access — isolated workspaces)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System prompt&lt;/strong&gt; (what the agent's responsibilities are)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scope constraints&lt;/strong&gt; (what the agent is NOT allowed to do — enforced separation of duties)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handoff responsibilities&lt;/strong&gt; (which role receives its output next)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't about restricting AI. It's about giving structure to a multi-agent workflow so that a reviewer can't be bypassed, a coder can't self-approve, and every handoff is explicit.&lt;/p&gt;
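
&lt;p&gt;As a rough sketch of what such a role definition could look like in code, the example below models three of the eight roles with the four properties listed above. Everything here is hypothetical: the &lt;code&gt;Role&lt;/code&gt; class, the tool and action names, and the two check functions are assumptions for illustration, not TinySDLC's real configuration format.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    system_prompt: str            # what the agent is responsible for
    allowed_tools: frozenset      # what the agent may access
    forbidden_actions: frozenset  # scope constraints
    hands_off_to: str             # role that receives its output next


# Hypothetical definitions for three of the eight roles.
ROLES = {
    role.name: role
    for role in (
        Role("coder", "Implement the approved design.",
             frozenset({"read", "write"}), frozenset({"approve", "deploy"}), "reviewer"),
        Role("reviewer", "Review the coder's changes.",
             frozenset({"read"}), frozenset({"write", "deploy"}), "tester"),
        Role("tester", "Verify the reviewed changes.",
             frozenset({"read", "run_tests"}), frozenset({"approve", "deploy"}), "devops"),
    )
}


def check_action(role_name, action):
    """Raise PermissionError if the role's scope constraints forbid the action."""
    if action in ROLES[role_name].forbidden_actions:
        raise PermissionError(role_name + " may not " + action)


def check_handoff(from_role, to_role):
    """Raise PermissionError unless to_role is the next role in the chain."""
    expected = ROLES[from_role].hands_off_to
    if to_role != expected:
        raise PermissionError(from_role + " must hand off to " + expected)


if __name__ == "__main__":
    check_handoff("coder", "reviewer")  # allowed, returns silently
    try:
        check_action("coder", "approve")
    except PermissionError as error:
        print(error)  # coder may not approve
```

&lt;p&gt;Because the constraints live in data and not in a prompt, a coder trying to self-approve or to hand work straight to the tester fails the same way every time.&lt;/p&gt;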




&lt;h2&gt;
  
  
  Key Design Decisions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Local-first, zero external dependencies
&lt;/h3&gt;

&lt;p&gt;TinySDLC runs on your machine. File-based queue (incoming → processing → outgoing), no Redis, no Postgres, no cloud services. Install and run in under 5 minutes.&lt;/p&gt;
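
&lt;p&gt;A file-based queue of this shape needs very little code. The sketch below is an assumption about how the three stages could work, not TinySDLC's implementation: the directory names follow the incoming, processing, outgoing flow described above, while the function names and the file-naming scheme are made up for illustration.&lt;/p&gt;

```python
import json
import tempfile
import time
import uuid
from pathlib import Path

STAGES = ("incoming", "processing", "outgoing")


def init_queue(root):
    """Create one directory per stage and return the queue root."""
    root = Path(root)
    for stage in STAGES:
        (root / stage).mkdir(parents=True, exist_ok=True)
    return root


def enqueue(root, payload):
    """Write a message into incoming/ under a time-ordered file name."""
    name = "{}-{}.json".format(time.time_ns(), uuid.uuid4().hex)
    path = Path(root) / "incoming" / name
    path.write_text(json.dumps(payload), encoding="utf-8")
    return path


def move_next(root, src, dst):
    """Move the oldest message from src/ to dst/; return its new path or None."""
    pending = sorted((Path(root) / src).glob("*.json"))
    if not pending:
        return None
    target = Path(root) / dst / pending[0].name
    # A rename is atomic on POSIX when both paths are on the same filesystem.
    pending[0].rename(target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        queue = init_queue(tmp)
        enqueue(queue, {"to": "coder", "text": "implement login"})
        claimed = move_next(queue, "incoming", "processing")
        print(json.loads(claimed.read_text(encoding="utf-8")))
        done = move_next(queue, "processing", "outgoing")
        print(done.parent.name)  # outgoing
```

&lt;p&gt;The appeal of this design is that the queue state is just files: you can inspect a stuck message with a text editor, and there is no broker to install or keep running.&lt;/p&gt;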

&lt;h3&gt;
  
  
  2. Multi-channel from day one
&lt;/h3&gt;

&lt;p&gt;Discord, Telegram, WhatsApp, Zalo — your team works where they already are. Agents respond in the same channel. No context switching.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Security hardening built in
&lt;/h3&gt;

&lt;p&gt;This came directly from our ZeroClaw experiments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Credential scrubbing&lt;/strong&gt;: Agent output is scanned for leaked API keys, tokens, passwords before it reaches the channel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment variable scrubbing&lt;/strong&gt;: &lt;code&gt;.env&lt;/code&gt; contents never appear in agent responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input sanitization&lt;/strong&gt;: 12 injection patterns blocked for external channel content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shell guards&lt;/strong&gt;: 8 deny patterns + path traversal detection for any shell operations&lt;/li&gt;
&lt;/ul&gt;
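
&lt;p&gt;To make the first and last items concrete, here is a minimal sketch of output scrubbing and a shell guard built on regular expressions. The patterns are simplified examples chosen for illustration; they are not the 12 injection patterns or 8 deny patterns that TinySDLC ships, and the function names are hypothetical.&lt;/p&gt;

```python
import re

# Example patterns only; a real deployment would need a broader, tested set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),  # API-key-like token
    re.compile(r"(password|token|api_key)\s*[=:]\s*\S+", re.IGNORECASE),
]

DENY_PATTERNS = [
    re.compile(r"\brm\s+-rf\b"),  # recursive forced delete
    re.compile(r"\bsudo\b"),      # privilege escalation
    re.compile(r"\.\./"),         # path traversal
]


def scrub(text):
    """Replace anything that looks like a credential with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def is_shell_command_allowed(command):
    """Return False if the command matches any deny pattern."""
    return not any(pattern.search(command) for pattern in DENY_PATTERNS)


if __name__ == "__main__":
    print(scrub("password = hunter2 ok"))               # [REDACTED] ok
    print(is_shell_command_allowed("ls -la"))           # True
    print(is_shell_command_allowed("cat ../../etc/x"))  # False
```

&lt;p&gt;Pattern lists like these are a safety net, not a guarantee. They catch common leaks and obviously dangerous commands, which is why they sit alongside role constraints instead of replacing them.&lt;/p&gt;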

&lt;h3&gt;
  
  
  4. Role constraints that enforce discipline
&lt;/h3&gt;

&lt;p&gt;A coder can't approve its own code or skip the review step. A tester can't deploy. These aren't suggestions; they're structural constraints in the agent definitions. This was a hard lesson from the NQH-Bot crisis: when governance is optional, it gets skipped.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Methodology: MTS-SDLC-Lite
&lt;/h2&gt;

&lt;p&gt;TinySDLC is the tool. But governance needs more than code — it needs a methodology.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Minh-Tam-Solution/MTS-SDLC-Lite" rel="noopener noreferrer"&gt;MTS-SDLC-Lite&lt;/a&gt; is the community edition of our SDLC 6.1.0 framework. It's pure documentation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Core concepts&lt;/strong&gt;: Design Thinking, Systems Thinking, 10-Stage Lifecycle&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Roles and teams&lt;/strong&gt;: 4 team archetypes for different project sizes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Playbooks&lt;/strong&gt;: Step-by-step guides for common workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Templates&lt;/strong&gt;: Spec templates, gate checklists, evidence formats&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Case studies&lt;/strong&gt;: Real examples from our production projects&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's tool-agnostic. Use it with Claude, GPT, Copilot, Cursor, or pen and paper. The methodology works regardless of which AI tool you choose.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Learned
&lt;/h2&gt;

&lt;p&gt;After 12 iterations of the framework and 5 production projects, here are our key takeaways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Governance is not overhead — it's insurance&lt;/strong&gt;. The time spent on gates and evidence capture pays back 10x when something breaks in production and you need to trace the root cause.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AI doesn't need fewer rules — it needs better rules&lt;/strong&gt;. The agents are eager to follow structure. Give them clear constraints and they produce better output than with vague "be careful" instructions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Methodology outlives tools&lt;/strong&gt;. We've switched AI providers three times, and the SDLC framework carried over each time. Invest in your process, not your tool vendor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Not being an expert can be an advantage&lt;/strong&gt;. Professional developers have ingrained habits: "this is how we've always done it." As a non-coding CEO, I had no muscle memory to override and no legacy patterns to defend. I was ready to learn whatever was new, because everything was new. Sometimes the beginner's mind sees what the expert's mind filters out. In a sense, we are all programming throughout our lives, and with AI today, anyone with design thinking, systems thinking, and domain knowledge can quickly experiment and turn ideas into products.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start minimal&lt;/strong&gt;. TinySDLC is deliberately small. You don't need a full enterprise platform to start governing AI output. You need a loop: Spec → Gate → Evidence → Approval.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What TinySDLC Does NOT Solve
&lt;/h2&gt;

&lt;p&gt;Transparency matters more than polish. Here's what TinySDLC intentionally does not do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It does not guarantee code quality.&lt;/strong&gt; It structures the workflow — the quality of output still depends on your AI provider and your prompts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It does not replace CI/CD or SAST.&lt;/strong&gt; No automated test execution, no static analysis. Those belong in your pipeline, not your orchestrator.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It does not eliminate bad architecture decisions.&lt;/strong&gt; If your spec is wrong, governed agents will build the wrong thing — just more traceably.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It adds structure, not intelligence.&lt;/strong&gt; The agents are still AI. TinySDLC constrains how they interact, not what they think.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Governance is a constraint system, not a magic layer. TinySDLC makes multi-agent workflows auditable and disciplined — nothing more, nothing less.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# The tool&lt;/span&gt;
git clone https://github.com/Minh-Tam-Solution/tinysdlc.git
&lt;span class="nb"&gt;cd &lt;/span&gt;tinysdlc &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm run build
./tinysdlc.sh start    &lt;span class="c"&gt;# Interactive setup wizard&lt;/span&gt;

&lt;span class="c"&gt;# The methodology&lt;/span&gt;
git clone https://github.com/Minh-Tam-Solution/MTS-SDLC-Lite.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both repos are MIT licensed. Use them, fork them, improve them.&lt;/p&gt;

&lt;p&gt;If you're building with AI coding agents and want to talk about governance approaches, find me on &lt;a href="https://www.linkedin.com/in/the-tai-dang-a81bb710/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or open an issue on GitHub.&lt;/p&gt;

&lt;p&gt;AI is fast. Governance must be faster.&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;Tai Dang&lt;/strong&gt;, CEO/Founder MTS &amp;amp; CEO Nhat Quang Holding&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>devops</category>
      <category>softwareengineering</category>
    </item>
  </channel>
</rss>
