<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Dawid Dahl</title>
    <description>The latest articles on Forem by Dawid Dahl (@dawiddahl).</description>
    <link>https://forem.com/dawiddahl</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F468874%2F7ebb0342-1e6a-4ac7-a19a-443045364564.jpg</url>
      <title>Forem: Dawid Dahl</title>
      <link>https://forem.com/dawiddahl</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dawiddahl"/>
    <language>en</language>
    <item>
      <title>AAID: Augmented AI Development</title>
      <dc:creator>Dawid Dahl</dc:creator>
      <pubDate>Mon, 06 Oct 2025 08:21:06 +0000</pubDate>
      <link>https://forem.com/dawiddahl/aaid-augmented-ai-development-50c9</link>
      <guid>https://forem.com/dawiddahl/aaid-augmented-ai-development-50c9</guid>
      <description>&lt;p&gt;&lt;em&gt;Professional TDD for AI-Augmented Software Development&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;What Is AAID and Why It Matters&lt;/li&gt;
&lt;li&gt;The Business Case: What Performance Research Shows&lt;/li&gt;
&lt;li&gt;Who This Guide Is For&lt;/li&gt;
&lt;li&gt;Built on Proven Foundations&lt;/li&gt;
&lt;li&gt;Developer Mindset&lt;/li&gt;
&lt;li&gt;
Prerequisite: Product Discovery &amp;amp; Specification Phase

&lt;ul&gt;
&lt;li&gt;From Specification to Development&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Getting Started With AAID&lt;/li&gt;

&lt;li&gt;AAID Workflow Diagram&lt;/li&gt;

&lt;li&gt;

AAID Development Stages

&lt;ul&gt;
&lt;li&gt;Stage 1: Context Providing&lt;/li&gt;
&lt;li&gt;Stage 2: Planning&lt;/li&gt;
&lt;li&gt;Stage 3: TDD Development Begins&lt;/li&gt;
&lt;li&gt;Stage 4: The TDD Cycle&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Continuing the TDD Cycle&lt;/li&gt;

&lt;li&gt;Conclusion: The Augmented Advantage&lt;/li&gt;

&lt;li&gt;Example Implementation&lt;/li&gt;

&lt;li&gt;

Appendices (Optional)

&lt;ul&gt;
&lt;li&gt;Appendix A: Acceptance Testing&lt;/li&gt;
&lt;li&gt;Appendix B: Helpful Commands (Reusable Prompts)&lt;/li&gt;
&lt;li&gt;Appendix C: AAID AI Workflow Rules&lt;/li&gt;
&lt;li&gt;Appendix D: Handling Technical Implementation Details&lt;/li&gt;
&lt;li&gt;Appendix E: Dependencies and Mocking&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;About the Author&lt;/li&gt;

&lt;/ul&gt;




&lt;p&gt;&lt;a id="what-is-aaid"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is AAID and Why It Matters
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AUGMENTED AI DEVELOPMENT &lt;code&gt;AAID&lt;/code&gt;&lt;/strong&gt; (&lt;strong&gt;/eɪd/&lt;/strong&gt; - pronounced like "aid") is a disciplined approach where developers augment their capabilities by integrating with AI, while maintaining full architectural control. You direct the agent to generate tests and implementation code, reviewing every line and ensuring alignment with business requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You're not being replaced. You're being augmented.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This separates professional software development from "vibe coding." While vibe coders blindly accept AI output and ship buggy, untested code they can't understand, &lt;code&gt;AAID&lt;/code&gt; practitioners use proper TDD (Test-Driven Development) to ensure reliable agentic assistance.&lt;/p&gt;

&lt;p&gt;&lt;a id="the-business-case"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Business Case: What Performance Research Shows
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dora.dev/" rel="noopener noreferrer"&gt;DORA&lt;/a&gt; (Google Cloud's &lt;strong&gt;DevOps Research and Assessment&lt;/strong&gt;) highlights the proven TDD principle &lt;code&gt;AAID&lt;/code&gt; relies on: developer-owned testing drives performance &lt;a href="https://dora.dev/capabilities/test-automation/" rel="noopener noreferrer"&gt;[1]&lt;/a&gt;. At the same time, a 25% increase in AI adoption correlates with a 7.2% drop in delivery stability and 1.5% decrease in throughput, while 39% of developers report little to no trust in AI-generated code &lt;a href="https://cloud.google.com/blog/products/devops-sre/announcing-the-2024-dora-report" rel="noopener noreferrer"&gt;[2]&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;AAID&lt;/code&gt; solves this. The TDD discipline forces every AI-generated line through comprehensive testing and mandatory reviews, capturing AI's productivity gains (increased &lt;strong&gt;documentation quality&lt;/strong&gt;, &lt;strong&gt;code quality&lt;/strong&gt;, &lt;strong&gt;review&lt;/strong&gt; and &lt;strong&gt;generation speed&lt;/strong&gt; &lt;a href="https://cloud.google.com/blog/products/devops-sre/announcing-the-2024-dora-report" rel="noopener noreferrer"&gt;[2]&lt;/a&gt;) without the stability loss.&lt;/p&gt;

&lt;p&gt;DORA proves speed and stability aren't trade-offs &lt;a href="https://dora.dev/guides/dora-metrics-four-keys/" rel="noopener noreferrer"&gt;[3]&lt;/a&gt;. With &lt;code&gt;AAID&lt;/code&gt;, speed comes from AI augmentation supported by the safety net of tests, stability from disciplined testing. You get both together, not one at the expense of the other.&lt;/p&gt;

&lt;p&gt;[1] &lt;a href="https://dora.dev/capabilities/test-automation/" rel="noopener noreferrer"&gt;DORA Capabilities: Test automation&lt;/a&gt;&lt;br&gt;
[2] &lt;a href="https://cloud.google.com/blog/products/devops-sre/announcing-the-2024-dora-report" rel="noopener noreferrer"&gt;Announcing the 2024 DORA report | Google Cloud Blog&lt;/a&gt;&lt;br&gt;
[3] &lt;a href="https://dora.dev/guides/dora-metrics-four-keys/" rel="noopener noreferrer"&gt;DORA's software delivery metrics: the four keys&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a id="who-this-guide-is-for"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Who This Guide Is For
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;AAID&lt;/code&gt; is for serious developers who aim to build maintainable software, whether you're a professional engineer or someone building a personal project you expect to last.&lt;/p&gt;

&lt;p&gt;If you just need quick scripts or throwaway prototypes, other AI approaches work better.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What you need:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic understanding of how AI prompts and context work&lt;/li&gt;
&lt;li&gt;Some experience with automated testing&lt;/li&gt;
&lt;li&gt;Patience to review what the AI writes (no blind copy-pasting)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What you don't need:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TDD experience (you'll learn it here)&lt;/li&gt;
&lt;li&gt;Specific tech stack knowledge&lt;/li&gt;
&lt;li&gt;Deep AI expertise&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result? &lt;strong&gt;Predictable&lt;/strong&gt; development with great potential for &lt;strong&gt;production-grade&lt;/strong&gt; quality software. While initially the &lt;code&gt;AAID&lt;/code&gt; workflow requires more discipline and effort than vibe coding, in the long run you'll move faster. No debugging mysterious AI-generated bugs or untangling code you don't understand.&lt;/p&gt;

&lt;p&gt;This guide shows you exactly how to ship features that deliver real business value, from context-setting through disciplined TDD cycles.&lt;/p&gt;

&lt;p&gt;It's also an incredibly fun way to work!&lt;/p&gt;

&lt;p&gt;&lt;a id="built-on-proven-foundations"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Built on Proven Foundations
&lt;/h2&gt;

&lt;p&gt;Unlike most other AI-driven workflows, &lt;code&gt;AAID&lt;/code&gt; doesn't try to reinvent product discovery or software development. Instead it stands on the shoulders of giants, applying well-established methodologies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Kent Beck&lt;/strong&gt;'s TDD cycles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dave Farley&lt;/strong&gt;'s Continuous Delivery and four-layer acceptance testing model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Robert C. Martin&lt;/strong&gt;'s Three Laws of TDD&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Daniel Terhorst-North&lt;/strong&gt;'s Behavior-Driven Development (BDD) methodology&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gojko Adzic&lt;/strong&gt;'s Specification by Example methodology&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aslak Hellesøy&lt;/strong&gt;'s BDD and Gherkin syntax for executable specifications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eric Evans&lt;/strong&gt;'s Domain-Driven Design and Ubiquitous Language&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Martin Fowler&lt;/strong&gt;'s work on refactoring, evolutionary design, and Domain-Specific Languages&lt;/li&gt;
&lt;li&gt;And more.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These battle-tested practices become your foundation that guides AI-assisted development.&lt;/p&gt;

&lt;p&gt;&lt;a id="developer-mindset"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Developer Mindset
&lt;/h2&gt;

&lt;p&gt;Success with &lt;code&gt;AAID&lt;/code&gt; requires a specific mindset:&lt;/p&gt;

&lt;p&gt;1: &lt;strong&gt;🧠 Don't abandon your brain&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You need to stay engaged with, and comprehend, every line of code, every test, every refactoring. The AI generates the code, but you decide what stays, what changes, what is removed, and why.&lt;/p&gt;

&lt;p&gt;Without this understanding, you're just &lt;em&gt;hoping&lt;/em&gt; things will work, which is sure to spell disaster in any real-world project.&lt;/p&gt;

&lt;p&gt;2: &lt;strong&gt;🪜 Incremental steps&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This mentality is what really sets this AI workflow apart from others. Here, instead of letting the AI go off for minutes or even hours and produce a lot of dangerous garbage code, you make sure to remain in control by iterating in &lt;strong&gt;small&lt;/strong&gt;, &lt;strong&gt;focused&lt;/strong&gt; steps.&lt;/p&gt;

&lt;p&gt;One test at a time. One feature at a time. One refactor at a time.&lt;/p&gt;

&lt;p&gt;This approach surfaces mistakes early and can even help you save money by keeping token usage low, while also making it easier to use smaller and cheaper models.&lt;/p&gt;

&lt;p&gt;This is why the TDD cycle in &lt;code&gt;AAID&lt;/code&gt; adds multiple review checkpoints—&lt;strong&gt;⏸️ AWAIT USER REVIEW&lt;/strong&gt;—after each phase (🔴 &lt;strong&gt;RED&lt;/strong&gt;, 🟢 &lt;strong&gt;GREEN&lt;/strong&gt;, and 🧼 &lt;strong&gt;REFACTOR&lt;/strong&gt;).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;These incremental steps mirror &lt;a href="https://dora.dev/" rel="noopener noreferrer"&gt;DORA&lt;/a&gt;'s research on working in small batches: tiny, independent changes give you faster feedback and reduce risk &lt;a href="https://dora.dev/capabilities/working-in-small-batches/" rel="noopener noreferrer"&gt;[1]&lt;/a&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
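
&lt;p&gt;As a minimal, hypothetical sketch of one such micro-cycle (the &lt;code&gt;slugify&lt;/code&gt; example is illustrative, not taken from this article's demo), the rhythm with its review checkpoints might look like this:&lt;/p&gt;

```typescript
// Illustrative micro-cycle on a toy function — one small behavior per cycle.
// slugify and its spec are hypothetical, not from the AAID demo repository.

// 🔴 RED — one failing test for one small behavior.
const testSlugify = (): void => {
  if (slugify("Hello World") !== "hello-world") {
    throw new Error("RED: not passing yet");
  }
};
// ⏸️ AWAIT USER REVIEW — does this test match the spec?

// 🟢 GREEN — the least code that makes the test pass.
function slugify(input: string): string {
  return input.toLowerCase().split(" ").join("-");
}
// ⏸️ AWAIT USER REVIEW — accept the implementation, or push back.

// 🧼 REFACTOR — clean up with the test as a safety net.
// (Handling repeated spaces, for instance, would be the NEXT small test.)
testSlugify(); // still green after each step
```

&lt;p&gt;The point is the granularity: each pause is a chance to catch a wrong turn while the change is still one test wide.&lt;/p&gt;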

&lt;p&gt;&lt;a id="prerequisite"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Prerequisite: Product Discovery &amp;amp; Specification Phase
&lt;/h2&gt;

&lt;p&gt;Before development begins, professional teams complete a product specification phase involving stakeholders, product owners, tech leads, product designers, developers, QA engineers, and architects. At a high level, it follows a refinement pattern like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Client's Vague Wish → Stories → Examples&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using techniques like Impact Mapping, Event Storming, and Story Mapping, teams establish specifications that represent the fundamental business needs that must be satisfied. The resulting specifications can include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User stories with BDD examples, organized into epics

&lt;ul&gt;
&lt;li&gt;Or a &lt;a href="https://jpattonassociates.com/wp-content/uploads/2015/03/story_mapping.pdf" rel="noopener noreferrer"&gt;Story Map&lt;/a&gt; containing the user stories + BDD examples ← (the &lt;code&gt;AAID&lt;/code&gt; recommendation)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;PRD (Product Requirements Document)&lt;/li&gt;
&lt;li&gt;Ubiquitous language documentation (a common language shared among stakeholders, developers, and anyone taking part in the project)&lt;/li&gt;
&lt;li&gt;Any additional project-specific requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The exact combination varies by project.&lt;/p&gt;

&lt;p&gt;This specification package will then be used—almost religiously—to serve as the objective foundation for the &lt;code&gt;AAID&lt;/code&gt; workflow, aligning development with the actual needs of the business.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;⚙️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Technical requirements (infrastructure elements, styling, NFRs) are tracked as separate linked tasks within stories, keeping behavioral specs pure. Learn more.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="prerequisite-spec-to-dev"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  From Specification to Development
&lt;/h3&gt;

&lt;p&gt;Here's how a typical user story with BDD examples can look.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Take note of how all these BDD examples only describe the &lt;strong&gt;behavior&lt;/strong&gt; of the system. Importantly, they say nothing of how to implement them technically.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User Story Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="err"&gt;Title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;User&lt;/span&gt; &lt;span class="err"&gt;archives&lt;/span&gt; &lt;span class="err"&gt;completed&lt;/span&gt; &lt;span class="err"&gt;todos&lt;/span&gt;

&lt;span class="err"&gt;User Story&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="err"&gt;As&lt;/span&gt; &lt;span class="nf"&gt;a &lt;/span&gt;user, I want to archive completed todos, so that my active list stays clean
&lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="nf"&gt;I &lt;/span&gt;can focus on current tasks.

&lt;span class="err"&gt;Acceptance Criteria&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="kd"&gt;Feature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; User archives completed todos

&lt;span class="kn"&gt;Scenario&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; Archive a completed todo
  &lt;span class="nf"&gt;Given &lt;/span&gt;the user has a completed todo &lt;span class="s"&gt;"Buy milk"&lt;/span&gt;
  &lt;span class="nf"&gt;When &lt;/span&gt;they archive &lt;span class="s"&gt;"Buy milk"&lt;/span&gt;
  &lt;span class="nf"&gt;Then &lt;/span&gt;&lt;span class="s"&gt;"Buy milk"&lt;/span&gt; should be in archived todos
  &lt;span class="nf"&gt;And &lt;/span&gt;&lt;span class="s"&gt;"Buy milk"&lt;/span&gt; should not be in active todos

&lt;span class="kn"&gt;Scenario&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; Cannot archive an incomplete todo
  &lt;span class="nf"&gt;Given &lt;/span&gt;the user has an incomplete todo &lt;span class="s"&gt;"Walk dog"&lt;/span&gt;
  &lt;span class="nf"&gt;When &lt;/span&gt;they attempt to archive &lt;span class="s"&gt;"Walk dog"&lt;/span&gt;
  &lt;span class="nf"&gt;Then &lt;/span&gt;they should see an error message
  &lt;span class="nf"&gt;And &lt;/span&gt;&lt;span class="s"&gt;"Walk dog"&lt;/span&gt; should remain in active todos

&lt;span class="kn"&gt;Scenario&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; Restore an archived todo
  &lt;span class="nf"&gt;Given &lt;/span&gt;the user has archived todo &lt;span class="s"&gt;"Review code"&lt;/span&gt;
  &lt;span class="nf"&gt;When &lt;/span&gt;they restore &lt;span class="s"&gt;"Review code"&lt;/span&gt;
  &lt;span class="nf"&gt;Then &lt;/span&gt;&lt;span class="s"&gt;"Review code"&lt;/span&gt; should be in active todos
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This article is not about the product discovery and specification refinement step; it assumes you have the specs ready. From there, it guides you in transforming those specs into tests and code ready for production.&lt;/p&gt;
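
&lt;p&gt;To make that specs → tests transformation concrete, here is a hedged sketch of how the first scenario above might become a unit test, together with the minimal implementation that makes it pass. All names (&lt;code&gt;TodoList&lt;/code&gt;, &lt;code&gt;archive&lt;/code&gt;, and so on) are hypothetical, not taken from the demo repository:&lt;/p&gt;

```typescript
// Hypothetical sketch — class and method names are illustrative.
class TodoList {
  private active = new Map<string, { done: boolean }>();
  private archived = new Set<string>();

  add(title: string, done = false): void {
    this.active.set(title, { done });
  }

  // The "Cannot archive an incomplete todo" scenario drives the guard clause.
  archive(title: string): void {
    const todo = this.active.get(title);
    if (!todo) throw new Error(`Unknown todo: ${title}`);
    if (!todo.done) throw new Error("Cannot archive an incomplete todo");
    this.active.delete(title);
    this.archived.add(title);
  }

  // The "Restore an archived todo" scenario drives this method.
  restore(title: string): void {
    if (!this.archived.delete(title)) throw new Error(`Not archived: ${title}`);
    this.active.set(title, { done: true });
  }

  isActive(title: string): boolean {
    return this.active.has(title);
  }

  isArchived(title: string): boolean {
    return this.archived.has(title);
  }
}

// Scenario: Archive a completed todo — in AAID this test is written
// BEFORE the implementation above exists (the RED step).
const list = new TodoList();
list.add("Buy milk", true); // Given the user has a completed todo "Buy milk"
list.archive("Buy milk"); // When they archive "Buy milk"
console.assert(list.isArchived("Buy milk")); // Then it is in archived todos
console.assert(!list.isActive("Buy milk")); // And not in active todos
```

&lt;p&gt;Note that the Gherkin dictated only the observable behavior; the &lt;code&gt;Map&lt;/code&gt;/&lt;code&gt;Set&lt;/code&gt; internals are one of many implementations the tests would permit.&lt;/p&gt;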

&lt;p&gt;&lt;a id="getting-started-with-aaid"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started With AAID
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Prerequisites&lt;/strong&gt;: &lt;code&gt;AAID&lt;/code&gt; is a feature development workflow that assumes:&lt;/p&gt;

&lt;p&gt;✅ &lt;strong&gt;Specifications ready&lt;/strong&gt;: User stories with BDD scenarios from Product Discovery&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Working project&lt;/strong&gt;: Development environment, test runner, linting and basic tooling configured (new or existing codebase)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Basic project scaffolding (running framework generators, setting up config files) involves structural setup rather than implementable technical contracts, placing it outside &lt;code&gt;AAID&lt;/code&gt;'s TDD workflow. Custom infrastructure &lt;em&gt;implementations&lt;/em&gt; (adapters, middleware, auth setup, etc) use &lt;code&gt;AAID&lt;/code&gt; with TDD. See Appendix D for details on technical implementation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Setting up &lt;code&gt;AAID&lt;/code&gt; takes just three steps:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1: Add the workflow rules&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Save the &lt;code&gt;AAID&lt;/code&gt; rules from Appendix C to your project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cursor&lt;/strong&gt;: &lt;code&gt;.cursor/rules/aaid.mdc&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt;: &lt;code&gt;CLAUDE.md&lt;/code&gt; in project root&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini CLI&lt;/strong&gt;: &lt;code&gt;GEMINI.md&lt;/code&gt; in project root&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2: Add reusable commands (optional but recommended)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Copy the command files from Appendix B to your project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cursor&lt;/strong&gt;: &lt;code&gt;.cursor/commands/&lt;/code&gt; (Markdown format)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt;: &lt;code&gt;.claude/commands/&lt;/code&gt; (Markdown format)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini CLI&lt;/strong&gt;: &lt;code&gt;.gemini/commands/&lt;/code&gt; (TOML format - needs conversion)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3: Have the &lt;code&gt;AAID&lt;/code&gt; workflow diagram ready&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;AI agents sometimes make mistakes and may not always follow instructions. When that happens, since you follow the &lt;code&gt;AAID&lt;/code&gt; mindset, you can manually steer the agent back on track with the workflow &lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/aaid-workflow-diagram.mermaid" rel="noopener noreferrer"&gt;diagram&lt;/a&gt; as your guide.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Liftoff into &lt;code&gt;AAID&lt;/code&gt; space 🚀&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;That’s it; there's no more magic to it than that. The rules enforce a disciplined TDD workflow, and the commands speed up your development. Now you're ready for &lt;code&gt;AAID&lt;/code&gt; Stage 1!&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;💡&lt;/th&gt;
&lt;th&gt;
&lt;strong&gt;Demo repo&lt;/strong&gt;: For a working example with all files pre-configured for Cursor, check the &lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development-demo" rel="noopener noreferrer"&gt;TicTacToe demo repository&lt;/a&gt;.&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="workflow-diagram"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  AAID Workflow Diagram
&lt;/h2&gt;

&lt;p&gt;Now that you have your specs from the product specification phase (like the user story above), and the AI environment set up, we are ready to start building!&lt;/p&gt;

&lt;p&gt;This diagram presents the formal workflow; detailed explanations for each step follow in the &lt;strong&gt;&lt;code&gt;AAID&lt;/code&gt; Development Stages&lt;/strong&gt; section below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F23kuy9jk91b50dmgrxh3.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F23kuy9jk91b50dmgrxh3.webp" alt="AAID workflow diagram" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The diagram shows three distinct development paths, distinguished by colored arrows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Blue arrows&lt;/strong&gt;: Shared workflow stages and the domain/business logic path this article focuses on, as opposed to the technical and presentation paths.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orange arrows&lt;/strong&gt;: Technical implementation specific branches (see Appendix D)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Purple arrows&lt;/strong&gt;: Presentation/UI specific branches (no TDD - see Appendix D)&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🔗&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Click &lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/aaid-workflow-diagram.mermaid" rel="noopener noreferrer"&gt;this link&lt;/a&gt; to &lt;strong&gt;view&lt;/strong&gt; the full diagram.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;If the diagram is not rendered on mobile, copy/paste the mermaid code into a &lt;a href="https://mermaid.live" rel="noopener noreferrer"&gt;mermaid editor&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a id="development-stages"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  AAID Development Stages
&lt;/h2&gt;

&lt;p&gt;&lt;a id="stage-1-context"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  📚 Stage 1: Context Providing
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1buitum19txdrc2pcnw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1buitum19txdrc2pcnw.jpg" alt="Stage 1 - Context Providing" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Before any AI interaction, establish comprehensive context. The AI needs to understand the project landscape to generate relevant code.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Note on commands&lt;/strong&gt;: Throughout this guide, you'll see references like &lt;code&gt;/project-context&lt;/code&gt;. These are pre-written &lt;strong&gt;reusable&lt;/strong&gt; &lt;strong&gt;prompts&lt;/strong&gt; that you trigger with the &lt;code&gt;/&lt;/code&gt; prefix. The repo stores them in Cursor's &lt;code&gt;.cursor/commands/&lt;/code&gt;, and you can copy the same markdown into other tools' custom-command setups (e.g., &lt;code&gt;CLAUDE.md&lt;/code&gt;).&lt;br&gt;&lt;br&gt;&lt;strong&gt;You use these commands to augment your implementation speed.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;Find their implementations in Appendix B.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Steps:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add High-Level Context&lt;/strong&gt; (trigger &lt;code&gt;/project-context&lt;/code&gt; and include the relevant context in the same message as arguments to the command)&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Project's README, architecture docs, package.json, config files, etc. Whatever you find important to your project from a high level.&lt;/li&gt;
&lt;li&gt;Overall system design and patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Research&lt;/strong&gt;: Use &lt;code&gt;/research-&amp;amp;-stop&lt;/code&gt; to let AI proactively search codebase patterns and relevant documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🤖&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;This will make the AI read through and summarize the basic project context, and how to do things. This is similar to onboarding a new human colleague.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ol start="2"&gt;
&lt;li&gt;&lt;strong&gt;Determine What You're Building&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Choose your development type early to load the right context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Domain/Business Logic&lt;/strong&gt;: Core behavior delivering business value&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technical Implementation&lt;/strong&gt;: Infrastructure elements—adapters (Hexagonal), repositories/gateways (Clean/DDD), controllers (MVC)—plus integrations and initializations (see Appendix D)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Presentation/UI&lt;/strong&gt;: Visual styling, animations, audio (see Appendix D)&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="3"&gt;
&lt;li&gt;
&lt;strong&gt;Add Specification Context&lt;/strong&gt; (specific to your development type)&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For Domain/Business&lt;/strong&gt;: User stories with BDD scenarios, PRD sections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For Technical&lt;/strong&gt;: Technical tasks, NFRs, architecture decisions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For Presentation&lt;/strong&gt;: Design specs, Figma files, style guides&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🤖&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;The AI is now fundamentally aligned with your development goals, whether creating business value, implementing technical infrastructure, or crafting user interfaces.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ol start="4"&gt;
&lt;li&gt;
&lt;strong&gt;Add Relevant Code Context&lt;/strong&gt; (specific to your development type)&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For Domain/Business&lt;/strong&gt;: Domain dependencies, tests, similar features, pure function utils for similar logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For Technical&lt;/strong&gt;: Existing infrastructure elements, infrastructure patterns, utils, integration points&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For Presentation&lt;/strong&gt;: Components, design system, CSS framework, presentation-related config files&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🤖&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Along with automated checks like linting and formatting, plus any personal AI IDE/CLI instructions you use, this step keeps the AI consistent with your codebase’s style and conventions. It also helps it technically understand how the various parts of the codebase depend on each other.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="stage-2-planning"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🤝 Stage 2: Planning (High-Level Approach)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr6dm1rt2uecxrrlqacy7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr6dm1rt2uecxrrlqacy7.jpg" alt="Stage 2 - Planning" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the AI agent now informed of your specific project's context from Stage 1, collaborate to understand the feature at a &lt;strong&gt;high level&lt;/strong&gt; before diving into implementation. This is &lt;em&gt;not&lt;/em&gt; about prescribing implementation details; those will emerge through TDD for domain and technical work, or through design implementation for presentation/UI work. Instead, it's about making sure you and the AI are &lt;strong&gt;aligned&lt;/strong&gt; on scope and approach.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;If you and the AI have different ideas of what is supposed to be built, using AI can often slow progress down rather than speed it up. This planning stage helps eliminate that issue.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Planning vs TDD Discovery
&lt;/h4&gt;

&lt;p&gt;For domain and technical work, the planning stage provides a roadmap of &lt;strong&gt;&lt;em&gt;what&lt;/em&gt;&lt;/strong&gt; to build and roughly which tests to write. TDD will still discover &lt;strong&gt;&lt;em&gt;how&lt;/em&gt;&lt;/strong&gt; to build it through the 🔴 &lt;strong&gt;Red&lt;/strong&gt; • 🟢 &lt;strong&gt;Green&lt;/strong&gt; • 🧼 &lt;strong&gt;Refactor&lt;/strong&gt; cycle. For presentation/UI work, planning outlines validation criteria rather than tests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Planning IS:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understanding which parts of the system are involved&lt;/li&gt;
&lt;li&gt;Creating a test roadmap (roughly what to test, and in what order) for domain/technical work&lt;/li&gt;
&lt;li&gt;Creating validation criteria for presentation/UI work&lt;/li&gt;
&lt;li&gt;Recognizing existing patterns to follow&lt;/li&gt;
&lt;li&gt;Mapping out the feature's boundaries and finding related key interfaces/ports&lt;/li&gt;
&lt;li&gt;Identifying external dependencies to mock (for testable work)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What Planning IS NOT:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Designing specific classes or methods&lt;/li&gt;
&lt;li&gt;Defining data structures&lt;/li&gt;
&lt;li&gt;Prescribing implementation details&lt;/li&gt;
&lt;li&gt;Creating complete interfaces/ports up front&lt;/li&gt;
&lt;li&gt;Making architectural decisions tests haven't forced yet&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it like navigation: Planning sets the destination, TDD finds the path.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Modifying Untested Code?&lt;/strong&gt; Changing untested code in an existing codebase requires strategies (characterization tests, finding seams, etc) outside this guide's scope. In that case, see books like &lt;a href="https://www.oreilly.com/library/view/working-effectively-with/0131177052/" rel="noopener noreferrer"&gt;Working Effectively with Legacy Code&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Steps:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;1) &lt;strong&gt;Discuss the Feature&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Discuss and explore freely, as you would with a human&lt;/li&gt;
&lt;li&gt;Is everything crystal clear given the provided specifications? Does the AI have any questions?&lt;/li&gt;
&lt;li&gt;Share any constraints or technical considerations&lt;/li&gt;
&lt;li&gt;Explore potential approaches with the AI&lt;/li&gt;
&lt;li&gt;Clarify ambiguities; make sure the AI makes no wild assumptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;2) &lt;strong&gt;Check for Additional Context&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ask: "&lt;em&gt;Do you need any other context to understand the feature's scope and boundaries?&lt;/em&gt;"&lt;/li&gt;
&lt;li&gt;Provide any missing domain knowledge or system information&lt;/li&gt;
&lt;li&gt;Trigger &lt;code&gt;/research-&amp;amp;-stop&lt;/code&gt; for AI-driven investigation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;3) &lt;strong&gt;Request Appropriate Roadmap&lt;/strong&gt;&lt;br&gt;
   Based on your Stage 1 choice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate a high-level roadmap before any coding&lt;/li&gt;
&lt;li&gt;For &lt;strong&gt;domain/business logic&lt;/strong&gt;: trigger &lt;code&gt;/ai-roadmap-template&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;For &lt;strong&gt;technical implementation&lt;/strong&gt;: trigger &lt;code&gt;/ai-technical-roadmap-template&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For presentation/UI&lt;/strong&gt;: trigger &lt;code&gt;/ai-presentation-roadmap-template&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Focus on test/validation scenarios and their logical sequence&lt;/li&gt;
&lt;li&gt;Keep at "mermaid diagram" level of abstraction&lt;/li&gt;
&lt;li&gt;An actual mermaid diagram can be generated if applicable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;4) &lt;strong&gt;Review and Refine Roadmap&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Review the roadmap to make sure it addresses &lt;strong&gt;every&lt;/strong&gt; specification (business, technical, or presentation requirements)&lt;/li&gt;
&lt;li&gt;Use it to ensure you and the AI agent are aligned&lt;/li&gt;
&lt;li&gt;Make sure it respects existing project patterns and boundaries&lt;/li&gt;
&lt;li&gt;For domain/technical: Verify the test sequence builds incrementally from simple to complex&lt;/li&gt;
&lt;li&gt;For presentation: Verify validation criteria are clear and measurable&lt;/li&gt;
&lt;li&gt;Iterate with the AI if adjustments are needed&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;☝️ &lt;strong&gt;Note on task lists&lt;/strong&gt;: Many other AI workflows (such as &lt;a href="https://github.com/eyaltoledano/claude-task-master" rel="noopener noreferrer"&gt;Task Master&lt;/a&gt;) generate "task lists" with checkboxes in the planning stage. The AI then checks items off as "done" as it goes, at its own discretion. But how can you &lt;strong&gt;trust&lt;/strong&gt; the AI's judgment about when something is actually done?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzpebpttmob0s0exbxvju.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzpebpttmob0s0exbxvju.png" alt="Task Master - Tasks" width="800" height="424"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In addition, with such checkboxes, you must manually re-verify everything after future code changes, to prevent &lt;strong&gt;regressions&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That's why you don't use checkbox/task-planning in &lt;code&gt;AAID&lt;/code&gt;. Instead, for domain and technical work, you express completion criteria as good old &lt;strong&gt;automated tests&lt;/strong&gt;. Tests aren't added as an afterthought; they're treated as first-class citizens.&lt;/p&gt;

&lt;p&gt;Automated tests = &lt;strong&gt;objective&lt;/strong&gt; and &lt;strong&gt;re-runnable&lt;/strong&gt; verification, eliminating both of the aforementioned problems: &lt;strong&gt;trust&lt;/strong&gt; and &lt;strong&gt;regression&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
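To make the contrast concrete, here is a minimal sketch of a checkbox criterion re-expressed as an automated test. The `archiveTodo` function and its shape are hypothetical stand-ins, not part of the guide's actual example code:

```typescript
// Checkbox version: "[x] User can archive completed todos" records a
// one-time, subjective judgment that nothing re-verifies after future
// code changes.
//
// Test version: the same completion criterion as an objective,
// re-runnable check. archiveTodo below is a hypothetical stand-in.
type Todo = { text: string; completed: boolean; archived: boolean }

const archiveTodo = (todo: Todo): Todo =>
    todo.completed ? { ...todo, archived: true } : todo

// Re-runnable verification: fails loudly if a regression breaks the behavior
const archived = archiveTodo({ text: "Buy groceries", completed: true, archived: false })
if (!archived.archived) throw new Error("completed todo should be archived")

const untouched = archiveTodo({ text: "Walk the dog", completed: false, archived: false })
if (untouched.archived) throw new Error("incomplete todo should not be archived")
```

Unlike a ticked checkbox, these checks re-run on every future change, for free.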

&lt;p&gt;If the roadmap looks good, now is when disciplined development actually starts!&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🔀&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Path Divergence&lt;/strong&gt;: After roadmap approval, the workflow splits into three paths:&lt;br&gt;&lt;br&gt;• &lt;strong&gt;Domain/Business Logic&lt;/strong&gt; → Continue to Stage 3 (TDD Development)&lt;br&gt;• &lt;strong&gt;Technical Implementation (Non-Observable)&lt;/strong&gt; → Continue to Stage 3 (TDD Development)&lt;br&gt;• &lt;strong&gt;Presentation/UI (Observable Technical)&lt;/strong&gt; → Proceed to implementation and validation without TDD&lt;br&gt;&lt;br&gt;See Appendix D or the &lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/aaid-workflow-diagram.mermaid" rel="noopener noreferrer"&gt;Workflow Diagram&lt;/a&gt; for more information on these three implementation categories.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;💻&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Important for Frontend Developers&lt;/strong&gt;: &lt;code&gt;AAID&lt;/code&gt; absolutely applies to frontend development! Frontend behavioral logic (form validation, state management, data transformations, etc) uses TDD just like backend. Only pure presentation/UI aspects (colors, audio, spacing, animations, some accessibility concerns) skip TDD for manual validation. See Appendix D for detailed examples.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="stage-3-tdd-begins"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  📝 Stage 3: TDD Development Begins
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwwpbh6rch5u0qfigqd4t.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwwpbh6rch5u0qfigqd4t.jpg" alt="Stage 3 - TDD Development Begins" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Choose one of these two approaches for implementing your tests when starting work on a new feature:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Test Ordering with ZOMBIES&lt;/strong&gt;: Whichever approach you choose, order your tests following James Grenning's ZOMBIES heuristic. &lt;strong&gt;Z&lt;/strong&gt;ero → &lt;strong&gt;O&lt;/strong&gt;ne → &lt;strong&gt;M&lt;/strong&gt;any is the happy path; after each step, interleave applicable &lt;strong&gt;B&lt;/strong&gt;oundaries, &lt;strong&gt;I&lt;/strong&gt;nterface, and &lt;strong&gt;E&lt;/strong&gt;xceptions before moving to the next. Keep both &lt;strong&gt;S&lt;/strong&gt;cenarios and solutions simple throughout. &lt;a href="https://blog.wingman-sw.com/tdd-guided-by-zombies" rel="noopener noreferrer"&gt;Link&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
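For the todo feature used in this guide, a ZOMBIES-ordered test list might look like the following sketch. The test names are hypothetical, and the inline `describe`/`it.skip` stubs exist only to make the sketch self-contained; a real project would use its test framework's own versions:

```typescript
// Minimal stand-ins for a test framework's describe/it.skip so this
// sketch runs standalone; a real project would use Vitest or Jest.
const plannedTests: string[] = []
const describe = (_name: string, fn: () => void) => fn()
const it = { skip: (name: string) => { plannedTests.push(name) } }

describe("User archives completed todos (ZOMBIES-ordered)", () => {
    it.skip("Zero: shows no archived todos for an empty list")
    it.skip("One: archives a single completed todo")
    it.skip("Boundary: does not archive an incomplete todo")
    it.skip("Many: archives every completed todo in a mixed list")
    it.skip("Exception: reports an error when archiving an unknown todo")
})

console.log(plannedTests.length) // 5 planned tests
```

Zero → One → Many forms the happy path, while the Boundary and Exception cases are interleaved right after the step they stress.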

&lt;p&gt;&lt;strong&gt;Option 1: Test List Approach&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Collaborate with the AI to create a list of (unimplemented) tests derived from the specs, breaking down each behavior into granular, testable steps.&lt;/p&gt;

&lt;p&gt;Use the Roadmap from "&lt;strong&gt;Stage 2: Planning&lt;/strong&gt;" directly or as inspiration for the test list.&lt;/p&gt;

&lt;p&gt;The test list is a living document. Following Kent Beck's TDD approach, this list isn't carved in stone. Add new tests as you discover new edge cases, remove tests that become redundant, and modify tests as your understanding evolves. The list is a tool to guide development, not a contract you must fulfill exactly as written.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This TDD practice is in contrast to acceptance testing, where tests &lt;strong&gt;must&lt;/strong&gt; map 1:1 to the project specs (usually BDD scenarios).&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;User archives completed todos&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;skip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;should archive a completed todo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;skip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;should not archive an incomplete todo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;skip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;should restore an archived todo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;It is extremely important that the tests are not yet implemented at this stage. This is because TDD's iterative cycle prevents you from baking implementation assumptions into your tests. Writing all tests upfront risks testing your preconceptions rather than actual behavior requirements.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Option 2: Single Test Approach&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start with the simplest test and then build incrementally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;User archives completed todos&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;should archive a completed todo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// To be implemented&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a id="stage-4-tdd-cycle"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🔄 Stage 4: The TDD Cycle
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frslbh63lptf0iqw47kam.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frslbh63lptf0iqw47kam.jpg" alt="Stage 4 - TDD Cycle" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🤖&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;The Three Laws of TDD&lt;/strong&gt;: The reusable TDD commands (&lt;code&gt;/red-&amp;amp;-stop&lt;/code&gt;, &lt;code&gt;/green-&amp;amp;-stop&lt;/code&gt;, &lt;code&gt;/refactor-&amp;amp;-stop&lt;/code&gt;) enforce Robert C. Martin's Three Laws of TDD through the disciplined &lt;strong&gt;RED&lt;/strong&gt; - &lt;strong&gt;GREEN&lt;/strong&gt; - &lt;strong&gt;REFACTOR&lt;/strong&gt; cycle:&lt;br&gt;&lt;br&gt;• &lt;strong&gt;RED&lt;/strong&gt;: Write a minimal failing test (enforces Laws 1 &amp;amp; 2: no production code without failing test; minimal test to fail)&lt;br&gt;• &lt;strong&gt;GREEN&lt;/strong&gt;: Write the simplest code to pass (enforces Law 3: minimal production code to pass)&lt;br&gt;• &lt;strong&gt;REFACTOR&lt;/strong&gt;: Improve code while keeping tests green&lt;br&gt;&lt;br&gt;In practice, the &lt;code&gt;AAID&lt;/code&gt; rules file often handles phase discipline automatically, but these commands offer explicit control when needed. Re-issue with feedback to guide the AI.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For each test, follow this disciplined 3-phase cycle:&lt;/p&gt;

&lt;p&gt;🔴 &lt;strong&gt;RED Phase&lt;/strong&gt; →&lt;br&gt;
🟢 &lt;strong&gt;GREEN Phase&lt;/strong&gt; →&lt;br&gt;
🧼 &lt;strong&gt;REFACTOR Phase&lt;/strong&gt; →&lt;br&gt;
&lt;strong&gt;Next test&lt;/strong&gt; → &lt;em&gt;(cycle repeats)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Each phase follows the same internal pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Collaborate and generate with AI&lt;/strong&gt; ¹&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Run tests&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handle potential issues&lt;/strong&gt; &lt;em&gt;(if any arise)&lt;/em&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;/analyze-&amp;amp;-stop&lt;/code&gt; or other investigation &amp;amp; problem solving commands as needed&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;AWAIT USER REVIEW&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let's walk through a full TDD cycle using this consistent structure.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;¹ 🦾 &lt;strong&gt;Proficiency Note&lt;/strong&gt;: As you master &lt;code&gt;AAID&lt;/code&gt;, the initial "collaborate" step often becomes autonomous AI generation using your established commands and context. This speeds up the workflow considerably. You might simply invoke &lt;code&gt;/red-&amp;amp;-stop&lt;/code&gt; and let the AI generate appropriate code, then focus your attention on the &lt;code&gt;AWAIT USER REVIEW&lt;/code&gt; checkpoints. This dual-review structure (light collaboration + formal review) is what enables both speed and control.&lt;/p&gt;
&lt;/blockquote&gt;



&lt;p&gt;&lt;strong&gt;User Story Specification:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let's use this simple spec as a basis.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="err"&gt;Title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;User&lt;/span&gt; &lt;span class="err"&gt;adds&lt;/span&gt; &lt;span class="nf"&gt;a &lt;/span&gt;new todo

&lt;span class="err"&gt;User Story&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="err"&gt;As&lt;/span&gt; &lt;span class="nf"&gt;a &lt;/span&gt;user, I want to add a new todo to my list, so that I can keep track of my tasks.

&lt;span class="err"&gt;Acceptance Criteria&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="kd"&gt;Feature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; Add a new todo

&lt;span class="kn"&gt;Scenario&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; Add a new active todo
  &lt;span class="nf"&gt;Given &lt;/span&gt;the user has an empty todo list
  &lt;span class="nf"&gt;When &lt;/span&gt;they add a new todo &lt;span class="s"&gt;"Buy groceries"&lt;/span&gt;
  &lt;span class="nf"&gt;Then &lt;/span&gt;&lt;span class="s"&gt;"Buy groceries"&lt;/span&gt; should be in their active todos
  &lt;span class="nf"&gt;And &lt;/span&gt;the todo should not be completed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Unit tests build incrementally&lt;/strong&gt;, testing one behavior at a time. Since they target fine-grained technical correctness and edge cases, unit tests don't always need to map 1:1 to acceptance criteria; that's the acceptance test's job.&lt;br&gt;&lt;br&gt;More on this distinction in Appendix A: Unit Testing and Acceptance Testing.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  🔴 RED Phase
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;→ Collaborate with AI to write test&lt;/strong&gt; (&lt;code&gt;/red-&amp;amp;-stop&lt;/code&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Un-skip the first test if using test list&lt;/li&gt;
&lt;li&gt;Or write the first test from scratch if using single test approach&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;→ Run test and verify failure&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Should fail as expected (compilation failures count as valid test failures)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;→ Handle potential issues&lt;/strong&gt; &lt;em&gt;(if any arise)&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If test passes unexpectedly: AI stops and reports the issue&lt;/li&gt;
&lt;li&gt;Choose investigation approach (often using investigation &amp;amp; problem solving commands like &lt;code&gt;/analyze-&amp;amp;-stop&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;AI implements your chosen fix, then stops for review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example RED phase prompt:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/red-&amp;amp;-stop

// link/paste the business specification, e.g. the BDD scenario
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Because of the context that has been provided in the previous steps, the prompt often doesn't have to be longer than this.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generated test:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// todo.service.test.ts&lt;/span&gt;

&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;addTodo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;should add a todo with the correct text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// When&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;addTodo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Buy groceries&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// Fails: 'addTodo' is not defined&lt;/span&gt;

    &lt;span class="c1"&gt;// Then&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Buy groceries&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;⏸️ &lt;strong&gt;STOP: AWAIT USER REVIEW&lt;/strong&gt;
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AI agent must &lt;code&gt;AWAIT USER REVIEW&lt;/code&gt; before proceeding to GREEN.&lt;br&gt;&lt;br&gt;&lt;strong&gt;During RED phase review, evaluate:&lt;/strong&gt;&lt;br&gt;🔴 Tests behavior (what the system does), not implementation (how it does it)&lt;br&gt;🔴 In the test phase you design the API of what you are building: its user interface. So, does it feel nice to use?&lt;br&gt;🔴 Is the test hard to understand or set up? That could be a sign you need to rethink your approach. &lt;strong&gt;Clean code starts with a clean test&lt;/strong&gt;&lt;br&gt;🔴 Clear test name describing the requirement&lt;br&gt;🔴 Proper Given/When/Then structure&lt;br&gt;🔴 Mock external dependencies to isolate the unit; the test should run in milliseconds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Optional: example RED Phase follow-up prompt:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/red-&amp;amp;-stop

- Create todo service class instead of function
- Inject repository
- Update test to check "completed" attribute only
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Follow-ups like these are often unnecessary thanks to Stage 1.4 (Add Relevant Code Context) and Stage 2.3 (Request Appropriate Roadmap).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test after RED review:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// todo.service.test.ts&lt;/span&gt;
&lt;span class="c1"&gt;// Both imports will fail - files don't exist yet (compilation failure = valid test failure)&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;TodoService&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./todo.service&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./interfaces/todo.interface&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TodoService&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;should add a todo with completed set to false&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Given&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;mockRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="c1"&gt;// Start minimal - no API assumptions yet&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TodoService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mockRepository&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// When&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addTodo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Buy groceries&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// Then&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// Note: You're testing ONE behavior. The repository.save()&lt;/span&gt;
    &lt;span class="c1"&gt;// will be forced by a future test, not this one.&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🟢 GREEN Phase
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;→ Collaborate with AI to write code&lt;/strong&gt; (&lt;code&gt;/green-&amp;amp;-stop&lt;/code&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Write the simplest code to make the test pass&lt;/li&gt;
&lt;li&gt;Keep implementation naïve/hardcoded until tests "&lt;a href="https://tdd.mooc.fi/1-tdd/#triangulation" rel="noopener noreferrer"&gt;triangulate&lt;/a&gt;" (multiple tests force abstraction/generalization)&lt;/li&gt;
&lt;li&gt;No extra logic for untested scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;→ Run tests to verify success&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Current test should pass&lt;/li&gt;
&lt;li&gt;All other existing tests still pass&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;→ Handle potential issues&lt;/strong&gt; &lt;em&gt;(if any arise)&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If tests fail: AI stops and reports which ones failed&lt;/li&gt;
&lt;li&gt;Choose debugging approach (often using investigation &amp;amp; problem solving commands like &lt;code&gt;/debug-&amp;amp;-stop&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;AI implements your chosen solution, then stops for review&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Why simplest first?&lt;/strong&gt; One test can only verify one thing, so complex code means untested parts. If your over-engineered solution breaks, you're debugging the test failure AND untested logic simultaneously. Simple code gets you stable fast and forces each new feature to get its own test, keeping everything verified.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Example GREEN phase prompt:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/green-&amp;amp;-stop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Generated code:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// interfaces/todo.interface.ts&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// todo.service.ts&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./interfaces/todo.interface&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TodoService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;repository&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="c1"&gt;// 'unknown' is fine - no test demands otherwise&lt;/span&gt;

  &lt;span class="nf"&gt;addTodo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Simplest naïve/hardcoded implementation to pass the test&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;⏸️ &lt;strong&gt;STOP: AWAIT USER REVIEW&lt;/strong&gt;
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AI agent must &lt;code&gt;AWAIT USER REVIEW&lt;/code&gt; before proceeding to REFACTOR.&lt;br&gt;&lt;br&gt;&lt;strong&gt;During GREEN phase review, evaluate:&lt;/strong&gt;&lt;br&gt;🟢 The code is the simplest possible solution to make the test pass&lt;br&gt;🟢 If tests triangulate (multiple examples reveal a pattern), verify code generalizes&lt;br&gt;🟢 No unnecessary abstractions or future-proofing that the tests haven't demanded&lt;br&gt;🟢 Code structure follows project patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
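To see triangulation in action, consider a hypothetical second RED test that asserts the todo's text. The hardcoded `text: ""` from the GREEN step can no longer satisfy both tests, forcing the generalization sketched below (the repository from the main example is omitted here for brevity):

```typescript
// Triangulation sketch: the first test only checked `completed`, so a
// hardcoded `text: ""` was enough. A second test asserting the text
// forces the implementation to generalize and use the parameter.
// (Hypothetical simplification; the repository is omitted for brevity.)
interface Todo {
    text: string
    completed: boolean
}

class TodoService {
    addTodo(text: string): Todo {
        // Two tests with different texts can no longer be satisfied by a
        // hardcoded value, so the text parameter must flow through.
        return { text, completed: false }
    }
}

const service = new TodoService()
if (service.addTodo("Buy groceries").text !== "Buy groceries")
    throw new Error("should keep the todo text")
if (service.addTodo("Walk the dog").completed !== false)
    throw new Error("new todos should start uncompleted")
```

This is why GREEN stays naïve: each generalization is earned by a test, never assumed.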




&lt;h3&gt;
  
  
  🧼 REFACTOR Phase
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;❕&lt;/th&gt;
&lt;th&gt;This phase is one of the main reasons developers won't be replaced any time soon&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;→ Collaborate with AI to refactor&lt;/strong&gt; (&lt;code&gt;/refactor-&amp;amp;-stop&lt;/code&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Collaborate to enhance code while keeping tests green&lt;/li&gt;
&lt;li&gt;Apply patterns that improve current code quality&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;→ Run tests to verify stability&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No regressions introduced&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;→ Handle potential issues&lt;/strong&gt; &lt;em&gt;(if any arise)&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If refactoring breaks tests: AI stops and reports what failed&lt;/li&gt;
&lt;li&gt;Review failure and choose: fix the issue or revert to green state&lt;/li&gt;
&lt;li&gt;Use investigation &amp;amp; problem solving commands like &lt;code&gt;/minimal-fix-&amp;amp;-analyze-&amp;amp;-stop&lt;/code&gt; for fixes&lt;/li&gt;
&lt;li&gt;Revert when the refactoring approach itself is flawed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example REFACTOR phase prompt:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@refactor-&amp;amp;-stop

// Note to reader: The AI may suggest minimal refactoring since you only have one test.
// But you can guide it toward patterns that improve current code quality.

- Extract Todo model class
- Put interface in same file
- Model should be immutable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Generated refactored code:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// models/todo.model.ts&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TodoModel&lt;/span&gt; &lt;span class="k"&gt;implements&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

  &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// For now just handles completed status&lt;/span&gt;
    &lt;span class="c1"&gt;// Future tests will force us to handle text properly&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TodoModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/todo.service.ts&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;TodoModel&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;../models/todo.model&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TodoService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;repository&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

  &lt;span class="nf"&gt;addTodo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Still ignoring text parameter - test doesn't check it yet&lt;/span&gt;
    &lt;span class="c1"&gt;// Repository still unused - no test requires persistence yet&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;TodoModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;⏸️ &lt;strong&gt;STOP: AWAIT USER REVIEW&lt;/strong&gt;
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AI agent must &lt;code&gt;AWAIT USER REVIEW&lt;/code&gt; before proceeding to next test. Final overall review opportunity.&lt;br&gt;&lt;br&gt;&lt;strong&gt;During REFACTOR phase final review, evaluate:&lt;/strong&gt;&lt;br&gt;🧼 Apply your engineering expertise to assure quality&lt;br&gt;🧼 Focus on fundamentals: modularity, abstraction, cohesion, separation of concerns, coupling management, readability, testability&lt;br&gt;🧼 Remove unnecessary comments, logs, debugging code&lt;br&gt;🧼 Consider potential security vulnerabilities&lt;br&gt;🧼 Optional: Conduct manual user testing for what you've built. Check the "&lt;em&gt;feel&lt;/em&gt;"—only humans can do that!—and UX&lt;br&gt;🧼 Optional: Run AI bug finder for additional safety&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Optional: example REFACTOR phase follow-up prompt:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@refactor-&amp;amp;-stop

- Remove all comments
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Often these prompts aren't needed due to the AI workflow instructions and context provided earlier.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code after REFACTOR review:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/todo.service.ts&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;TodoModel&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;../models/todo.model&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TodoService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;repository&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

  &lt;span class="nf"&gt;addTodo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Todo&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;TodoModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;em&gt;Congratulations&lt;/em&gt;,&lt;/strong&gt; you made it through all the &lt;code&gt;AAID&lt;/code&gt; steps! While the workflow might seem overwhelming at first, with practice it becomes habit and your speed increases accordingly.&lt;/p&gt;

&lt;p&gt;&lt;a id="continuing-tdd-cycle"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Continuing Stage 4: The TDD Cycle
&lt;/h2&gt;

&lt;p&gt;After completing the first cycle, you'd repeat the process with the next test that forces the code to evolve:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second cycle&lt;/strong&gt; might test: &lt;code&gt;'should create todo with provided text'&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Forces: &lt;code&gt;return { text, completed: false }&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Third cycle&lt;/strong&gt; might test: &lt;code&gt;'should persist new todos'&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Forces: A repository interface to define the persistence contract, replacing the &lt;code&gt;unknown&lt;/code&gt; type&lt;/li&gt;
&lt;li&gt;Forces: Repository to have a &lt;code&gt;save&lt;/code&gt; method&lt;/li&gt;
&lt;li&gt;Forces: &lt;code&gt;this.repository.save({ text, completed: false })&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Fourth cycle&lt;/strong&gt; might test: &lt;code&gt;'should be able to find the persisted todo after creating it'&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Forces: &lt;code&gt;Repository.save&lt;/code&gt; to return identifying information (an ID)&lt;/li&gt;
&lt;/ul&gt;
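
&lt;p&gt;To make the progression concrete, here is a rough sketch of where the service and repository might land after the second and third cycles. This is an illustration, not code from the article: the &lt;code&gt;TodoRepository&lt;/code&gt; interface and the in-memory fake are hypothetical names.&lt;/p&gt;

```typescript
// Hypothetical sketch: the service after the second and third cycles.
// TodoRepository and InMemoryTodoRepository are illustrative names.

interface Todo {
  text: string
  completed: boolean
}

interface TodoRepository {
  // Forced by 'should persist new todos'
  save(todo: Todo): void
}

class TodoService {
  constructor(private readonly repository: TodoRepository) {}

  addTodo(text: string): Todo {
    // 'should create todo with provided text' forced the real text;
    // 'should persist new todos' forced the save call
    const todo: Todo = { text, completed: false }
    this.repository.save(todo)
    return todo
  }
}

// Minimal in-memory fake, useful for exercising the sketch
class InMemoryTodoRepository implements TodoRepository {
  readonly saved: Todo[] = []

  save(todo: Todo): void {
    this.saved.push(todo)
  }
}
```

&lt;p&gt;Each line here still exists only because one of the tests above demanded it.&lt;/p&gt;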

&lt;p&gt;Each cycle follows the same disciplined flow:&lt;/p&gt;

&lt;p&gt;🔴 &lt;strong&gt;RED&lt;/strong&gt; →&lt;br&gt;
⏸️ &lt;strong&gt;Review&lt;/strong&gt; →&lt;br&gt;
🟢 &lt;strong&gt;GREEN&lt;/strong&gt; →&lt;br&gt;
⏸️ &lt;strong&gt;Review&lt;/strong&gt; →&lt;br&gt;
🧼 &lt;strong&gt;REFACTOR&lt;/strong&gt; →&lt;br&gt;
⏸️ &lt;strong&gt;Final review&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The tests gradually shape the implementation, ensuring every line of production code exists only because a test demanded it. This eliminates dead code and hidden bugs: if it's not tested, it doesn't exist.&lt;/p&gt;

&lt;p&gt;&lt;a id="conclusion"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Conclusion: The Augmented Advantage
&lt;/h2&gt;

&lt;p&gt;Your bottleneck changes with &lt;code&gt;AAID&lt;/code&gt;. Instead of being stuck on implementation details, you're now constrained only by your ability to architect and review.&lt;/p&gt;

&lt;p&gt;The work becomes more strategic. You make the high-level decisions while AI handles the code generation. TDD keeps this relationship stable by forcing you to define exactly what you want before the AI builds it.&lt;/p&gt;

&lt;p&gt;This completely avoids the dangers of vibe coding. &lt;code&gt;AAID&lt;/code&gt; helps you as a professional ship quality software with full understanding of what you've built.&lt;/p&gt;

&lt;p&gt;And as the &lt;code&gt;AAID&lt;/code&gt; loop becomes muscle memory, you will catch regressions early and ship faster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That's the augmented advantage.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a id="example-implementation"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Example Implementation
&lt;/h2&gt;

&lt;p&gt;For a concrete example of code generated with &lt;code&gt;AAID&lt;/code&gt;, explore this &lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development-demo" rel="noopener noreferrer"&gt;TicTacToe CLI demo&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;100% of the code was generated by an AI agent.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It demonstrates a minimal hexagonal architecture with clear separation between domain logic and adapters, following &lt;code&gt;AAID&lt;/code&gt; principles.&lt;/p&gt;

&lt;p&gt;Comprehensive test coverage is also included as a consequence of TDD: both unit tests and BDD-style acceptance tests, mapped directly from the specs.&lt;/p&gt;



&lt;p&gt;&lt;a id="appendices"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  End of Guide
&lt;/h3&gt;

&lt;p&gt;You've reached the end of the &lt;code&gt;AAID&lt;/code&gt; guide. The appendices below are optional reference material you can dip into as needed.&lt;/p&gt;
&lt;/blockquote&gt;



&lt;p&gt;&lt;a id="appendix-a"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Appendix A: Acceptance Testing
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9owndu41m0c7vdstgzra.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9owndu41m0c7vdstgzra.webp" alt="Appendix A" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This article on &lt;code&gt;AAID&lt;/code&gt; focuses on TDD (Test-Driven Development) for &lt;strong&gt;Unit Testing&lt;/strong&gt;, which ensures you actually write your code correctly and with high quality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Acceptance Testing&lt;/strong&gt;, on the other hand, verifies that your software aligns with business goals and is actually &lt;em&gt;done&lt;/em&gt;. It serves as an executable definition-of-done.&lt;/p&gt;

&lt;p&gt;Understanding how these two testing strategies complement each other is crucial for professional developers, as both are invaluable parts of writing production-grade software.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Acceptance Testing is similar to E2E testing; both test the full app flow through the system boundaries.&lt;br&gt;&lt;br&gt;The key difference: AT mocks unmanaged external dependencies you don't control (third-party APIs, etc.) while keeping managed dependencies you do control (your database, etc.) real. E2E usually mocks nothing and runs everything together.&lt;br&gt;&lt;br&gt;The problem with E2E: tests fail due to external factors (third-party outages, network issues) rather than your code. Acceptance Testing isolates your system, so failures indicate real business logic problems, or technical issues you are responsible for.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
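
&lt;p&gt;As a sketch of this isolation strategy (with hypothetical names; &lt;code&gt;PaymentGateway&lt;/code&gt; and its stub are not from the article), an acceptance-test setup might replace only the unmanaged third-party dependency:&lt;/p&gt;

```typescript
// Illustrative sketch of AT isolation: stub the unmanaged third-party
// dependency so its outages can't fail the test. All names are hypothetical.

interface PaymentGateway {
  charge(amountCents: number): { ok: boolean }
}

// Stands in for the real third-party API we don't control
class StubPaymentGateway implements PaymentGateway {
  charge(amountCents: number): { ok: boolean } {
    return { ok: true } // deterministic: no network, no outages
  }
}

class CheckoutService {
  constructor(private readonly gateway: PaymentGateway) {}

  checkout(amountCents: number): string {
    const result = this.gateway.charge(amountCents)
    return result.ok ? "confirmed" : "declined"
  }
}
```

&lt;p&gt;Your own database, being a managed dependency, would stay real in the acceptance-test environment; only the gateway is swapped out.&lt;/p&gt;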

&lt;p&gt;The two kinds of tests answer different questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TDD (Unit Tests)&lt;/strong&gt;: "&lt;em&gt;Is my code technically correct?&lt;/em&gt;"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ATDD (Acceptance Tests)&lt;/strong&gt;: "&lt;em&gt;Is my system releasable after this change?&lt;/em&gt;"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So in short: TDD builds the solution, Acceptance Tests confirm it's the right solution.&lt;/p&gt;
&lt;h3&gt;
  
  
  Key Differences
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Unit Tests (TDD)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Answer: "&lt;em&gt;Is my code technically correct?&lt;/em&gt;"&lt;/li&gt;
&lt;li&gt;Fine-grained, developer-focused testing&lt;/li&gt;
&lt;li&gt;Mock all external dependencies

&lt;ul&gt;
&lt;li&gt;See Appendix E for dependency categories&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Test suite should run in seconds to tens of seconds&lt;/li&gt;
&lt;li&gt;Apply design pressure through testability&lt;/li&gt;
&lt;li&gt;May, but does not necessarily, map 1:1 to user stories/acceptance criteria&lt;/li&gt;
&lt;li&gt;Guide code quality and modularity&lt;/li&gt;
&lt;li&gt;Part of the fast feedback loop in CI/CD&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example of what a unit test looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TodoService&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;should archive a completed todo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Given&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;completedTodo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;todo-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Buy milk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;mockTodoRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mockResolvedValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;completedTodo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// When&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;archiveTodo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;todo-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// Then&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isOk&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;toBe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mockTodoRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;moveToArchive&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toHaveBeenCalledWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;completedTodo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mockTodoRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;removeFromActive&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toHaveBeenCalledWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;todo-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Acceptance Tests (ATDD/BDD)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Answer: "&lt;em&gt;Does the system meet business requirements?&lt;/em&gt;"&lt;/li&gt;
&lt;li&gt;Business specification validation through user-visible features&lt;/li&gt;
&lt;li&gt;Test in a production-like environment through system boundaries&lt;/li&gt;
&lt;li&gt;Mock unmanaged external dependencies (like third-party APIs)

&lt;ul&gt;
&lt;li&gt;Don't mock managed external dependencies (like app's database)&lt;/li&gt;
&lt;li&gt;See Appendix E for dependency categories&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Test suite will run slower than unit tests&lt;/li&gt;

&lt;li&gt;Maps 1:1 to user stories/acceptance criteria&lt;/li&gt;

&lt;li&gt;Verify the system is ready for release&lt;/li&gt;

&lt;li&gt;Stakeholder-focused (though developers + AI implement)&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Example of what an acceptance test looks like (using the &lt;a href="https://dojoconsortium.org/assets/ATDD%20-%20How%20to%20Guide.pdf" rel="noopener noreferrer"&gt;Four-Layer&lt;/a&gt; model pioneered by &lt;a href="https://courses.cd.training/" rel="noopener noreferrer"&gt;Dave Farley&lt;/a&gt;):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1. Executable Specification&lt;/td&gt;
&lt;td&gt;The test&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. Domain-Specific Language (DSL)&lt;/td&gt;
&lt;td&gt;Business vocabulary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Driver&lt;/td&gt;
&lt;td&gt;Bridge between DSL and SUT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. System Under Test (SUT)&lt;/td&gt;
&lt;td&gt;Production-like application environment&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;todo&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;../dsl&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;User archives completed todos&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;should archive a completed todo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Given&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startsWithNewAccount&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hasCompletedTodo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Buy milk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// When&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;todo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;archive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Buy milk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// Then&lt;/span&gt;
    &lt;span class="nx"&gt;todo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;confirmInArchive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Buy milk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nx"&gt;todo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;confirmNotInActive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Buy milk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Acceptance tests know nothing about how our app works internally. Even if the app changes its technical implementation details, this specification (test) will remain valid.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In acceptance tests, every DSL call follows the same flow: &lt;strong&gt;Test → DSL → Driver → SUT&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The DSL provides business vocabulary (like &lt;code&gt;user&lt;/code&gt; or &lt;code&gt;archive todo&lt;/code&gt;), while the driver &lt;strong&gt;connects to your SUT from the outside (through APIs, UI, or other entry points)&lt;/strong&gt;. This separation keeps tests readable and maintainable.&lt;/p&gt;

&lt;p&gt;Notice how unit tests directly test the class with mocks, while acceptance tests use this DSL layer to express tests in business terms.&lt;/p&gt;
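
&lt;p&gt;A minimal sketch of that delegation, with hypothetical names (a real driver would reach the SUT from the outside, over an API or UI, rather than in memory):&lt;/p&gt;

```typescript
// Illustrative sketch of the DSL and driver layers; all names are
// hypothetical. A real driver would talk to the SUT from the outside.

// Driver: knows how to reach the system under test. Here a trivial
// in-memory stand-in replaces a real API or UI driver.
class TodoDriver {
  private active: string[] = []
  private archived: string[] = []

  createCompletedTodo(title: string): void {
    this.active.push(title)
  }

  archiveTodo(title: string): void {
    this.active = this.active.filter(function (t) {
      return t !== title
    })
    this.archived.push(title)
  }

  isInArchive(title: string): boolean {
    return this.archived.includes(title)
  }

  isInActive(title: string): boolean {
    return this.active.includes(title)
  }
}

// DSL: business vocabulary that delegates to the driver,
// keeping the executable specification readable.
class TodoDsl {
  constructor(private readonly driver: TodoDriver) {}

  hasCompletedTodo(title: string): void {
    this.driver.createCompletedTodo(title)
  }

  archive(title: string): void {
    this.driver.archiveTodo(title)
  }

  confirmInArchive(title: string): boolean {
    return this.driver.isInArchive(title)
  }

  confirmNotInActive(title: string): boolean {
    return !this.driver.isInActive(title)
  }
}
```

&lt;p&gt;The executable specification only ever calls the DSL; swapping the driver (CLI, HTTP, UI) leaves the specification untouched.&lt;/p&gt;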

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🔌&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Note on Integration Testing&lt;/strong&gt;: While this guide focuses on unit testing through TDD, &lt;code&gt;AAID&lt;/code&gt; also applies to integration testing. Integration tests verify a single infrastructure element's technical contract by testing it with only its immediate managed dependency (e.g., a repository adapter with real database). Unmanaged dependencies are mocked. See &lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/appendices/appendix-e/dependencies-and-mocking.md" rel="noopener noreferrer"&gt;Appendix E&lt;/a&gt; for complete dependency handling guidelines.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In &lt;code&gt;AAID&lt;/code&gt;, AI helps you rapidly write unit tests and implementations. Knowing the difference between unit and acceptance testing prevents you from mistaking 'technically correct code' for 'done features,' a crucial distinction in professional development.&lt;/p&gt;

&lt;h3&gt;
  
  
  AAID Acceptance Testing Resources
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Companion article covering the full AT workflow&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/appendices/appendix-a/docs/aaid-acceptance-testing-workflow.md" rel="noopener noreferrer"&gt;AAID Acceptance Testing Workflow&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Visual workflow diagram of the &lt;code&gt;AAID&lt;/code&gt; three-phase AT cycle (Mermaid)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/appendices/appendix-a/aaid-at-workflow.diagram.mermaid" rel="noopener noreferrer"&gt;AAID AT graph&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rule file to enable &lt;code&gt;AAID&lt;/code&gt; AT mode in a project&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/appendices/appendix-a/rules/aaid-at/acceptance-testing-mode.mdc" rel="noopener noreferrer"&gt;Acceptance Testing Mode&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Demo of executable specifications used in practice&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development-demo/blob/main/acceptance-test/executable-specs/cli.acceptance.spec.ts" rel="noopener noreferrer"&gt;TicTacToe executable specifications&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="appendix-b"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Appendix B: Helpful Commands (Reusable Prompts)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqosqtt4q31uuf452l0ds.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqosqtt4q31uuf452l0ds.webp" alt="Appendix B" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These reusable prompt commands speed up your &lt;code&gt;AAID&lt;/code&gt; workflow.&lt;/p&gt;

&lt;p&gt;&lt;a id="appendix-b-setup-commands"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Setup &amp;amp; Planning Commands
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Used in Stage &lt;strong&gt;1: Context Providing&lt;/strong&gt; and &lt;strong&gt;Stage 2: Planning&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/project-context&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Establishes comprehensive project understanding with architecture, testing strategy, code style, etc&lt;br&gt;&lt;br&gt; &lt;em&gt;&lt;strong&gt;Note on context&lt;/strong&gt;: Since Commands in Cursor cannot currently directly reference files with &lt;code&gt;@&lt;/code&gt; symbols inside the command files themselves, you'll need to include any necessary context when invoking the command. For example:&lt;/em&gt; &lt;code&gt;/project-context @README.md @docs/architecture.md&lt;/code&gt;. &lt;em&gt;The command will then operate on the provided context.&lt;/em&gt;
&lt;/td&gt;
&lt;td&gt;Stage 1&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/planning/project-context.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/ai-roadmap-template&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Creates high-level roadmap for domain/business logic features that guides TDD without prescribing implementation&lt;/td&gt;
&lt;td&gt;Stage 2&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/planning/ai-roadmap-template.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/ai-technical-roadmap-template&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Creates roadmap for technical implementation (infrastructure elements: adapters, repositories, controllers, etc.) - see Appendix D
&lt;/td&gt;
&lt;td&gt;Stage 2&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/planning/ai-technical-roadmap-template.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/ai-presentation-roadmap-template&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Creates roadmap for observable technical elements (pure UI/sensory) - see Appendix D
&lt;/td&gt;
&lt;td&gt;Stage 2&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/planning/ai-presentation-roadmap-template.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/ai-acceptance-roadmap-template&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Creates strategic roadmap for acceptance testing with isolation strategy - see Appendix A
&lt;/td&gt;
&lt;td&gt;Stage 2&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/acceptance/ai-acceptance-roadmap-template.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Planning Tools&lt;/strong&gt;: Some tools have dedicated planning mechanics (e.g., Claude Code's &lt;a href="https://claudelog.com/mechanics/plan-mode/" rel="noopener noreferrer"&gt;Plan Mode&lt;/a&gt; or the &lt;a href="https://cursor.com/docs/agent/planning#plan-mode" rel="noopener noreferrer"&gt;Cursor equivalent&lt;/a&gt;). Combine these with roadmap commands when beneficial.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="appendix-b-tdd-commands"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  TDD Development Commands
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Used in &lt;strong&gt;Stage 4: The TDD Cycle&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;These commands embed the Three Laws of TDD:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;No behavioral production code without a failing test&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Write only enough test code to fail&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Write only enough production code to pass&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
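&lt;p&gt;As a minimal, framework-free sketch of one pass through these laws (the &lt;code&gt;Stack&lt;/code&gt; example and its API are hypothetical, not part of &lt;code&gt;AAID&lt;/code&gt; itself):&lt;/p&gt;

```typescript
// Tiny hand-rolled assertion helper so the sketch runs without a test framework
const expectEqual = (actual: unknown, expected: unknown): void => {
  if (actual !== expected) {
    throw new Error(`Expected ${String(expected)}, got ${String(actual)}`)
  }
}

// Laws 1 and 2 (RED): write only enough test code to fail.
// This test fails until push/size exist, which is what licenses
// the production code below.
const testPushIncreasesSize = (): void => {
  // Given an empty stack
  const stack = new Stack()
  // When one item is pushed
  stack.push(42)
  // Then the size is 1
  expectEqual(stack.size(), 1)
}

// Law 3 (GREEN): write only enough production code to make the test pass
class Stack {
  items: number[] = []

  push(item: number): void {
    this.items.push(item)
  }

  size(): number {
    return this.items.length
  }
}

testPushIncreasesSize()
```

&lt;p&gt;In real use a test framework's assertions replace the helper; the RED → GREEN → REFACTOR sequence is what the commands below enforce.&lt;/p&gt;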

&lt;p&gt;Each command enforces these laws at the appropriate phase by referencing the AAID rules file, which serves as the single source of truth for the workflow.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;TDD Phase&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/red-&amp;amp;-stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enter RED phase: Write minimal failing test, then STOP for review&lt;/td&gt;
&lt;td&gt;🔴 RED&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/tdd/red-%26-stop.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/green-&amp;amp;-stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enter GREEN phase: Write simplest passing code, then STOP for review&lt;/td&gt;
&lt;td&gt;🟢 GREEN&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/tdd/green-%26-stop.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/refactor-&amp;amp;-stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enter REFACTOR phase: Improve code with tests green, then STOP for review&lt;/td&gt;
&lt;td&gt;🧼 REFACTOR&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/tdd/refactor-%26-stop.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;In practice:&lt;/strong&gt; Since the rules file is automatically loaded by your IDE/CLI, you often won't need these commands; the AI will typically follow the workflow from the rules alone. That said, the commands remain useful as explicit phase triggers when needed.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Adding to Existing Projects&lt;/strong&gt;: These commands work for adding new features to any codebase (new or existing).&lt;br&gt;&lt;br&gt;&lt;strong&gt;Modifying Untested Code&lt;/strong&gt;: When changing existing untested code, first establish characterization tests (documenting current behavior) and find seams (testable injection points). See books like &lt;a href="https://www.oreilly.com/library/view/working-effectively-with/0131177052/" rel="noopener noreferrer"&gt;Working Effectively with Legacy Code&lt;/a&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="appendix-b-investigation-commands"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Investigation &amp;amp; Problem Solving Commands
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Used throughout various &lt;code&gt;AAID&lt;/code&gt; stages for research and debugging&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;These commands help when you need additional context (Stage 2: Planning) or encounter issues during the TDD cycle (Stage 4: "Handle potential issues" step).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Primary Use&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/analyze-&amp;amp;-stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Diagnose specific problems, errors, or failures without making changes&lt;/td&gt;
&lt;td&gt;Debugging failures&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/investigation/analyze-%26-stop.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/analyze-script-&amp;amp;-stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Run a specific script and analyze results without making changes&lt;/td&gt;
&lt;td&gt;Script diagnostics&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/investigation/analyze-script-%26-stop.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/debug-&amp;amp;-stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Add debug logging and analyze results to understand issues&lt;/td&gt;
&lt;td&gt;Deep debugging&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/investigation/debug-%26-stop.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/minimal-fix-&amp;amp;-analyze-&amp;amp;-stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Implement the simplest fix, verify results, and analyze outcome&lt;/td&gt;
&lt;td&gt;Quick fixes&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/investigation/minimal-fix-%26-analyze-%26-stop.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/research-&amp;amp;-stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Comprehensive investigation and context gathering (use for broad exploration)&lt;/td&gt;
&lt;td&gt;Context gathering&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/investigation/research-%26-stop.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Triggering "/analyze-script-&amp;amp;-stop"&lt;/strong&gt;: The user discusses or simply types the script after the command name, for example: "&lt;code&gt;/analyze-script-&amp;amp;-stop test:db&lt;/code&gt;"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a id="appendix-b-misc-commands"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Miscellaneous Commands
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Utility commands for common development tasks&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/git-commit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create clean commit messages following project guidelines&lt;/td&gt;
&lt;td&gt;Version control&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/misc/git-commit.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/gherkin-guard&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enforce consistent Gherkin-style Given/When/Then comments in tests&lt;/td&gt;
&lt;td&gt;Test formatting&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/.cursor/commands/misc/gherkin-guard.md" rel="noopener noreferrer"&gt;View&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
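&lt;p&gt;As an illustration of the comment style &lt;code&gt;/gherkin-guard&lt;/code&gt; enforces, a test body might read like this (the &lt;code&gt;applyDiscount&lt;/code&gt; function is a hypothetical stand-in, not from the command file):&lt;/p&gt;

```typescript
// Hypothetical function under test
const applyDiscount = (price: number, percent: number): number =>
  price - (price * percent) / 100

// Given/When/Then comments make the behavioral structure of the test explicit
const testDiscountIsApplied = (): void => {
  // Given a product priced at 200
  const price = 200

  // When a 10% discount is applied
  const result = applyDiscount(price, 10)

  // Then the final price is 180
  if (result !== 180) {
    throw new Error(`Expected 180, got ${result}`)
  }
}

testDiscountIsApplied()
```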

&lt;p&gt;These are just examples of &lt;code&gt;AAID&lt;/code&gt; commands. Create your own or modify these to match your workflow. The key is using reusable prompts to greatly augment your development speed.&lt;/p&gt;

&lt;p&gt;&lt;a id="appendix-c"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Appendix C: AAID Workflow Rules
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn9rnnecpubuy4vh1cqe.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn9rnnecpubuy4vh1cqe.webp" alt="Appendix C" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Configure your AI environment to understand the &lt;code&gt;AAID&lt;/code&gt; workflow. These are simple text instructions; no special &lt;code&gt;AAID&lt;/code&gt; app or tool is required.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;☝️&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Note on AI instruction following accuracy&lt;/strong&gt;: At the time of writing, current AIs are good, but far from perfect, at following instructions and rules such as the &lt;strong&gt;AAID AI Workflow Rules&lt;/strong&gt;. Sometimes you may need to remind the AI if it, for example, forgets a TDD phase, or moves directly to GREEN without stopping for user review at RED.&lt;br&gt;&lt;br&gt;As LLMs improve over time, you'll need to worry less about this.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  AAID AI Workflow Rules/Instructions
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;These are the official &lt;code&gt;AAID&lt;/code&gt; workflow rules, but feel free to customise them.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/rules/aaid/aaid-development-rules.mdc" rel="noopener noreferrer"&gt;AAID AI Workflow Rules/Instructions&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Usage Guide
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;For Cursor:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Project-specific: Commit a rule file in &lt;code&gt;.cursor/rules/&lt;/code&gt; so it's version controlled and scoped to the repo&lt;/li&gt;
&lt;li&gt;Global: Add to User Rules in Cursor Settings&lt;/li&gt;
&lt;li&gt;Simple alternative: Place in &lt;code&gt;AGENTS.md&lt;/code&gt; in project root&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;For Claude Code:&lt;/strong&gt;&lt;br&gt;
Place in &lt;code&gt;CLAUDE.md&lt;/code&gt; file in your project root (or &lt;code&gt;~/.claude/CLAUDE.md&lt;/code&gt; for global use)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For other AI tools:&lt;/strong&gt;&lt;br&gt;
Look for "custom instructions", "custom rules", or "system prompt" settings&lt;/p&gt;

&lt;p&gt;&lt;a id="appendix-d"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Appendix D: Handling Technical Implementation Details
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fksmx6k20qnqyuw8rmub2.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fksmx6k20qnqyuw8rmub2.webp" alt="Appendix D" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The main guide above has focused on BDD/TDD for domain behavior. Technical implementation details—infrastructure elements and presentation—are covered in Appendix D.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkt6lxtttozuxv9q6h9b.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkt6lxtttozuxv9q6h9b.webp" alt="AAID implementation categories" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/appendices/appendix-d/handling-technical-implementation-details.md" rel="noopener noreferrer"&gt;Read Appendix D: Handling Technical Implementation Details&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a id="appendix-e"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Appendix E: Dependencies and Mocking
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc9bkc6kmd6qkf3cusen.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc9bkc6kmd6qkf3cusen.webp" alt="Appendix E" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you've identified your test type from the &lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/appendices/appendix-d/handling-technical-implementation-details.md#aaid-implementation-matrix-build-types-and-verification" rel="noopener noreferrer"&gt;Implementation Categories&lt;/a&gt;, this reference clarifies how to properly handle the dependencies of what you're testing.&lt;/p&gt;

&lt;p&gt;It covers the four dependency categories (Pure In-Process, Impure In-Process, Managed Out-of-Process, Unmanaged Out-of-Process) and shows how each test type (unit, integration, contract, acceptance) handles them differently.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0v5yp6xrvpnqjevfunz.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0v5yp6xrvpnqjevfunz.webp" alt="Dependency Categories" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/dawid-dahl-umain/augmented-ai-development/blob/main/appendices/appendix-e/dependencies-and-mocking.md" rel="noopener noreferrer"&gt;Read Appendix E: Dependencies and Mocking&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;a id="about-author"&gt;&lt;/a&gt;&lt;br&gt;
Dawid Dahl is a full-stack developer and AI skill lead at &lt;a href="https://www.umain.com/" rel="noopener noreferrer"&gt;UMAIN&lt;/a&gt; | &lt;a href="https://www.eidra.com/" rel="noopener noreferrer"&gt;EIDRA&lt;/a&gt;. In his free time, he enjoys metaphysical ontology and epistemology, analog synthesizers, consciousness, techno, Huayan and Madhyamika Prasangika philosophy, and being with friends and family.&lt;/p&gt;

&lt;p&gt;Photography credit: &lt;a href="https://unsplash.com/@kaixapham" rel="noopener noreferrer"&gt;kaixapham&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tdd</category>
      <category>bdd</category>
      <category>testing</category>
    </item>
    <item>
      <title>Encapsulating the Past: How We Tamed a Legacy System with Timeless Software Engineering Principles</title>
      <dc:creator>Dawid Dahl</dc:creator>
      <pubDate>Wed, 18 Sep 2024 08:16:49 +0000</pubDate>
      <link>https://forem.com/dawiddahl/encapsulating-the-past-how-we-tamed-a-legacy-system-with-timeless-software-engineering-principles-154i</link>
      <guid>https://forem.com/dawiddahl/encapsulating-the-past-how-we-tamed-a-legacy-system-with-timeless-software-engineering-principles-154i</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Inheriting a Mess&lt;/li&gt;
&lt;li&gt;Reinventing from the Ground Up&lt;/li&gt;
&lt;li&gt;Ports and Adapters&lt;/li&gt;
&lt;li&gt;The Technology Behind Our Overhaul&lt;/li&gt;
&lt;li&gt;Did SOLID Principles Guide Our Design?&lt;/li&gt;
&lt;li&gt;Building Confidence with the Testing Pyramid Strategy&lt;/li&gt;
&lt;li&gt;Ensuring Testability with Pure Functions&lt;/li&gt;
&lt;li&gt;Deployment on Google Cloud Platform&lt;/li&gt;
&lt;li&gt;In Summary: Was the Backend Transformation Successful?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;The day we took over the operations of a legacy e-commerce backend system from the global protection brand &lt;strong&gt;POC&lt;/strong&gt;, one thing was certain: this was going to be a formidable challenge.&lt;/p&gt;

&lt;p&gt;The codebase we inherited from the previous development team was riddled with issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It was fragile, often breaking with (or even without) the slightest modification.&lt;/li&gt;
&lt;li&gt;Changes couldn’t be made with confidence, as the system was completely untested.&lt;/li&gt;
&lt;li&gt;It lacked any coherent design principles, leaving us without a solid foundation to build on.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Given the state of the system, it became clear that a simple cleanup wouldn’t suffice. What we needed was a complete overhaul — a new application designed from the ground up, drawing inspiration and guidance from various timeless software engineering principles. This approach would allow us to address every flaw we encountered, laying a solid foundation for the future.&lt;/p&gt;

&lt;h2&gt;
  
  
  Inheriting a Mess
&lt;/h2&gt;

&lt;p&gt;The use case that led to the now-legacy solution involved transferring data—such as stock, orders, and tracking events—between the client's ERP (Enterprise Resource Planning) system and their Shopify e-commerce platform.&lt;/p&gt;

&lt;p&gt;Let’s take a look at the inventory flow as an example:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyuvkojxxmag44aey3w0k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyuvkojxxmag44aey3w0k.png" alt="Legacy High Level Architecture" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The problem was that their ERP, Microsoft Dynamics AX, is a relic from the stone age, offering none of the modern amenities like a REST or GraphQL API. Instead, it resorts to dropping literal XML files onto an SFTP server, to be picked up later for processing.&lt;/p&gt;

&lt;p&gt;This processing was handled by a no-code platform called Make. While Make offered a nifty solution for simpler workflows, its limitations became painfully obvious when dealing with complex business logic and advanced use cases.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2h0sjp423gz2989n9bj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2h0sjp423gz2989n9bj.png" alt="Legacy software mess" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On top of that, the technology chosen as the "database" for data on its way to and from Shopify was Google Sheets. A spreadsheet, of course, lacks the robustness needed for complex workflows and storage.&lt;/p&gt;

&lt;p&gt;The system also relied on Matrixify, a third-party Shopify app, for data imports and exports. While functional, the app's awkward interface and our dependence on an external tool introduced additional risk, underscoring the fragility of the entire legacy setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reinventing from the Ground Up
&lt;/h2&gt;

&lt;p&gt;To solve these challenges, we first asked if the client was open to switching to a more modern ERP. They were initially on board, but their IT team estimated the cost at nearly 1 million euros, so that option was off the table. Rather than dwelling on this obstacle, we came up with an idea 💡:&lt;/p&gt;

&lt;p&gt;How about we encapsulate the whole legacy system in a new backend service—an ERP adapter—which would then be able to offer a simple API interface for the E-com engine to interact with?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqb6l2txngx6s7qtkj7n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqb6l2txngx6s7qtkj7n.png" alt="Umain's High Level Architecture" width="800" height="799"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This way, we could deal with the issues inside once, and then no one on the outside would ever again have to think about quirky XML file syntax, Google Sheets going down because it cannot process more than 50,000 rows, or unstable SFTP server interactions.&lt;/p&gt;

&lt;p&gt;So we did a major architectural overhaul. Here are a few of the main changes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adopted a proper Postgres database, with Prisma as ORM.&lt;/li&gt;
&lt;li&gt;Got rid of the dependency on an import/export SaaS product and built the functionality ourselves. (Mutation batching, Centra rate-limit handling, logging.)&lt;/li&gt;
&lt;li&gt;Added strong typing with TypeScript for every entity and interaction.&lt;/li&gt;
&lt;li&gt;Exposed a GraphQL API.&lt;/li&gt;
&lt;li&gt;For security, storage, cron jobs, hosting, and more, we used Google Cloud Platform.&lt;/li&gt;
&lt;li&gt;Set up an independent QA environment in GCP, to be able to safely test new features before deploying to production.&lt;/li&gt;
&lt;li&gt;(For the E-com engine we switched from their old Shopify setup that used Liquid templating and barely readable checkout scripts, to a headless setup with &lt;a href="https://centra.com" rel="noopener noreferrer"&gt;Centra&lt;/a&gt; and a Next app for the frontend.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fllvbl0uajamgiitvl0il.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fllvbl0uajamgiitvl0il.png" alt="Encapsulating the past: metaphor" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Technology Behind Our Overhaul
&lt;/h2&gt;

&lt;p&gt;One of our primary goals was to ensure that different parts of the codebase were independent (decoupled), so changes in one area wouldn't affect another. With the legacy system, we never felt free to change something that worked, because we had no idea what would break. This is a very bad situation to be in, as new features can't be added easily, if at all.&lt;/p&gt;

&lt;p&gt;To achieve our goal, we chose what we believe is the best backend framework for TypeScript: &lt;a href="https://nestjs.com/" rel="noopener noreferrer"&gt;NestJS&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd47bg9066btc402uaena.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd47bg9066btc402uaena.png" alt="Nest JS backend library logo" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It’s like &lt;a href="https://expressjs.com/" rel="noopener noreferrer"&gt;Express&lt;/a&gt;, but more fleshed out with built-in features that developers from languages like Java or C# will recognize, such as a modular architecture, middleware, and tools for request interception and validation. &lt;/p&gt;

&lt;p&gt;Most importantly, it provides a robust Dependency Injection (DI) system, making the code scalable and easier to test by preventing different parts of the codebase from becoming entangled.&lt;/p&gt;
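&lt;p&gt;Stripped of the framework's decorators and module metadata, the pattern underneath is ordinary constructor injection. A sketch in plain TypeScript (the &lt;code&gt;OrderService&lt;/code&gt; and &lt;code&gt;Logger&lt;/code&gt; names are illustrative, not from our codebase):&lt;/p&gt;

```typescript
// The abstraction the business logic depends on, instead of a concrete client
interface Logger {
  log(message: string): void
}

// The dependency arrives through the constructor, so tests can
// hand in a fake without touching this class.
class OrderService {
  logger: Logger

  constructor(logger: Logger) {
    this.logger = logger
  }

  placeOrder(orderId: string): string {
    this.logger.log(`Placing order ${orderId}`)
    return `order:${orderId}:placed`
  }
}

// In production, the DI container supplies a real logger;
// in a test, we inject a spy and inspect what it recorded.
const messages: string[] = []
const fakeLogger: Logger = { log: (m) => { messages.push(m) } }
new OrderService(fakeLogger).placeOrder("A-1")
```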

&lt;p&gt;Armed with this framework, we were now ready to implement the &lt;strong&gt;Hexagonal&lt;/strong&gt;, or &lt;strong&gt;Ports and Adapters&lt;/strong&gt;, software architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ports and Adapters
&lt;/h2&gt;

&lt;p&gt;The point of this architecture is to keep the core business logic decoupled from external systems, like third-party services, databases, or file transfers. By organizing the system around interfaces (or "ports") and separating the external integrations into distinct implementations (or "adapters"), we ensure the business logic remains stable even as external dependencies change. This separation also makes testing easier by allowing us to inject fake/mock adapters without touching the core logic.&lt;/p&gt;

&lt;p&gt;To enforce this separation, we split the system into public modules (business logic) and private modules (adapters). Public modules contain stable core logic, while private modules handle external dependencies, which can evolve without affecting the core.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6tbcfnhidktc6o86ucmv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6tbcfnhidktc6o86ucmv.png" alt="Public and private modules" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Adapters (Red)
&lt;/h3&gt;

&lt;p&gt;Adapters connect the core application to external systems, such as SFTP services, XML processing, and network batching. They are part of the private modules, meaning they can change without touching the stable core logic in the public modules. This keeps external system changes isolated.&lt;/p&gt;
&lt;h3&gt;
  
  
  Ports
&lt;/h3&gt;

&lt;p&gt;Ports define interfaces that the business logic both &lt;strong&gt;implements&lt;/strong&gt; and &lt;strong&gt;invokes&lt;/strong&gt; to interact with external systems. For example, the &lt;code&gt;ISyncInventory&lt;/code&gt; port, implemented by the &lt;code&gt;InventoryService&lt;/code&gt;, handles inventory synchronization, while the &lt;code&gt;ISftpConnector&lt;/code&gt; port, invoked by the business logic, deals with file transfers. Using these ports, the business logic remains decoupled from external system details, ensuring the application is flexible and adaptable to changes.&lt;/p&gt;
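&lt;p&gt;In TypeScript, such ports are plain interfaces. A sketch using the names above (the method signatures are assumed for illustration; the real ones differ):&lt;/p&gt;

```typescript
// Port invoked by the business logic; hides all SFTP details
interface ISftpConnector {
  fetchFile(path: string): string
}

// Port implemented by the business logic
interface ISyncInventory {
  syncInventory(path: string): number
}

// Core logic depends only on the ports, never on a concrete SFTP client
class InventoryService implements ISyncInventory {
  sftp: ISftpConnector

  constructor(sftp: ISftpConnector) {
    this.sftp = sftp
  }

  syncInventory(path: string): number {
    const raw = this.sftp.fetchFile(path)
    // Pretend each non-empty line of the file is one inventory row
    return raw.split("\n").filter((line) => line.trim() !== "").length
  }
}

// A fake adapter stands in for the real SFTP connector in tests
const fakeSftp: ISftpConnector = {
  fetchFile: () => "sku-1;10\nsku-2;4"
}
const service = new InventoryService(fakeSftp)
```

&lt;p&gt;Because &lt;code&gt;InventoryService&lt;/code&gt; only sees the interface, swapping the real SFTP adapter for a fake requires no change to the core logic.&lt;/p&gt;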
&lt;h3&gt;
  
  
  Application Business Logic (Green)
&lt;/h3&gt;

&lt;p&gt;The business logic lives in the public modules and handles the core rules and processes, such as the inventory service managing data synchronization. By depending only on ports, the business logic stays decoupled from external systems, ensuring it remains stable, maintainable, and easy to test, even when external systems change.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqsf6ihjtl1lt81i2pt40.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqsf6ihjtl1lt81i2pt40.png" alt="Umain's Technical Architecture" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Did SOLID Principles Guide Our Design?
&lt;/h2&gt;

&lt;p&gt;To ensure our architecture is robust, let’s review it against the SOLID principles that Robert C. Martin, famous for his books on &lt;code&gt;Clean Code&lt;/code&gt; and &lt;code&gt;Clean Architecture&lt;/code&gt;, has laid out. Does our system hold up to these timeless software engineering guidelines?&lt;/p&gt;
&lt;h3&gt;
  
  
  S - Single-responsibility Principle
&lt;/h3&gt;

&lt;p&gt;Each module has one clear purpose. For example, our &lt;code&gt;InventoryService&lt;/code&gt; only handles inventory logic, while adapters deal with external interactions like SFTP or APIs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;imports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nx"&gt;ConfigModule&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forRoot&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="na"&gt;isGlobal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="nx"&gt;GraphQLModule&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;forRoot&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;ApolloDriverConfig&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ApolloDriver&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;playground&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;autoSchemaFile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;src/schema.gql&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="na"&gt;sortSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="nx"&gt;AuthModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;EventEmitterModule&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forRoot&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="nx"&gt;PrismaModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;CloudStorageModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;InventoryModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// &amp;lt;--- Here&lt;/span&gt;
        &lt;span class="nx"&gt;XmlModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;GraphQLBatchModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;NetworkRequestRetryModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;FetchModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;CentraIntegrationModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;WebhookModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;SftpConnectorModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;OrderModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ExceptionModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ErrorModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;LoggerModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;TrackingModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;PricingModule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ProductModule&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;controllers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;AppController&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;providers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nx"&gt;AppService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;appConfig&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;exports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;ConfigModule&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AppModule&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the main app module, where the Nest framework allows us to import all our modules, each of which has a single responsibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  O - Open-closed Principle
&lt;/h3&gt;

&lt;p&gt;Our modules are open for extension but closed for modification. We can add new features, like additional adapters, without altering the existing core logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  L - Liskov Substitution Principle
&lt;/h3&gt;

&lt;p&gt;This principle ensures that different implementations of an interface can be swapped without breaking the system. Adhering to this, we can replace an adapter like &lt;code&gt;ISftpConnector&lt;/code&gt; with another SFTP implementation, and it will work seamlessly as long as it follows the expected behavior defined by the interface. This way, adapters can be switched out without affecting the business logic.&lt;/p&gt;
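&lt;p&gt;Sketched in code (hypothetical implementations, simplified to synchronous signatures): two connectors that honor the same contract can be substituted freely.&lt;/p&gt;

```typescript
interface ISftpConnector {
    fileGet(path: string): string;
}

// One possible implementation...
class Ssh2SftpConnector implements ISftpConnector {
    fileGet(path: string): string {
        return `ssh2:${path}`;
    }
}

// ...and a drop-in replacement that honors the same contract.
class OtherSftpConnector implements ISftpConnector {
    fileGet(path: string): string {
        return `other:${path}`;
    }
}

// The business logic works with either, because it only sees the interface.
const sync = (sftp: ISftpConnector): string => sftp.fileGet("inventory.xml");
```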

&lt;h3&gt;
  
  
  I - Interface Segregation Principle
&lt;/h3&gt;

&lt;p&gt;We create small, focused interfaces that each handle a single responsibility, and then compose them into larger ones, like &lt;code&gt;ISftpConnector&lt;/code&gt;, ensuring that modules only rely on the specific functionality they need. This prevents the tight coupling often caused by inheritance and keeps dependencies clean and maintainable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;ISftpConnector&lt;/span&gt;
    &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;ISftpConnectorFileGet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ISftpConnectorFilesGet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ISftpConnectorFileAdd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ISftpConnectorFileDelete&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ISftpConnectorIsDirEmpty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ISftpConnectorPurgeDir&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ISftpConnector&lt;/code&gt; is composed of smaller interfaces, allowing us to separate concerns and avoid bloated, monolithic interfaces, which can lead to the infamous "&lt;a href="https://en.wikipedia.org/wiki/God_object" rel="noopener noreferrer"&gt;God object&lt;/a&gt;".&lt;/p&gt;
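&lt;p&gt;The small constituent interfaces might look like this (hypothetical shapes; the real project signatures differ). A consumer that only reads files can then depend on the narrow interface alone:&lt;/p&gt;

```typescript
interface ISftpConnectorFileGet {
    fileGet(path: string): string;
}

interface ISftpConnectorFileDelete {
    fileDelete(path: string): boolean;
}

// Composition instead of one bloated interface:
interface ISftpConnector extends ISftpConnectorFileGet, ISftpConnectorFileDelete {}

// This consumer depends only on the one capability it actually needs.
const readInventory = (reader: ISftpConnectorFileGet): string =>
    reader.fileGet("inventory.xml");

const connector: ISftpConnector = {
    fileGet: (path) => `contents of ${path}`,
    fileDelete: () => true,
};
```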

&lt;h3&gt;
  
  
  D - Dependency Inversion Principle
&lt;/h3&gt;

&lt;p&gt;As we have seen, our system relies on abstractions (interfaces) rather than concrete implementations. The core logic depends on ports (interfaces), while the adapters implement those ports, keeping the layers decoupled.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;IInventory&lt;/span&gt;
    &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;ISyncAxInventoryToAdapterInventory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ISyncCentraInventoryToAdapterInventory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;ISyncAdapterInventoryToCentraInventory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;IGetWarehouse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;// etc ...&lt;/span&gt;
        &lt;span class="nx"&gt;IDeleteInventoryRecord&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the interface (port) for the inventory service, i.e., the application business logic.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Resolver&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;AXInventoryResolver&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AXInventoryResolver&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;LoggerService&lt;/span&gt;

    &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;INVENTORY_SERVICE_TOKEN&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;inventoryService&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IInventory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ExceptionService&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;LoggerService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;withContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;AXInventoryResolver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// etc ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we see that the GraphQL resolver (network) has no direct dependency on the concrete inventory service. It depends only on the abstract &lt;code&gt;IInventory&lt;/code&gt;, and thus remains decoupled.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Injectable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AXInventoryService&lt;/span&gt; &lt;span class="k"&gt;implements&lt;/span&gt; &lt;span class="nx"&gt;IInventory&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;LoggerService&lt;/span&gt;

    &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PrismaService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XmlService&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;xml&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IXMLService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;SFTP_CONNECTOR_TOKEN&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;sftp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ISftpConnector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CLOUD_STORAGE_SERVICE_TOKEN&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;cloudStorageService&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ICloudStorageService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;// etc ..&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ConfigService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;LoggerService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;withContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;AXInventoryService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nx"&gt;syncAxInventoryToAdapterInventory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;market&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Market&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TaskEither&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;
            &lt;span class="nx"&gt;InventoryError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nx"&gt;SyncAxInventoryToAdapterInventorySuccessMessage&lt;/span&gt;
        &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
            &lt;span class="nf"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Do&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;inventoryFileIdentifier&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
                    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getInventoryFileIdentifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;market&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="c1"&gt;// etc ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the code, we see &lt;code&gt;ISftpConnector&lt;/code&gt; being injected into the &lt;code&gt;AXInventoryService&lt;/code&gt;. This illustrates the "inversion" principle: at compile time the high-level service depends on an abstract interface, while the concrete implementation is injected only at runtime. This keeps the system flexible and adaptable to changes in external services.&lt;/p&gt;
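&lt;p&gt;For completeness, here is a sketch of how such a token-to-implementation binding is typically declared in a Nest module (the stub class and token name are illustrative, not the actual project code):&lt;/p&gt;

```typescript
import { Injectable, Module } from "@nestjs/common";

export const SFTP_CONNECTOR_TOKEN = Symbol("ISftpConnector");

// Assumed concrete adapter, stubbed for illustration.
@Injectable()
class SftpConnectorService {
    fileGet(path: string): string {
        return `contents of ${path}`;
    }
}

@Module({
    providers: [
        {
            provide: SFTP_CONNECTOR_TOKEN, // the abstraction consumers ask for
            useClass: SftpConnectorService, // the concrete adapter bound at composition time
        },
    ],
    exports: [SFTP_CONNECTOR_TOKEN],
})
export class SftpConnectorModule {}
```

&lt;p&gt;Swapping the adapter then means changing a single &lt;code&gt;useClass&lt;/code&gt; line; every consumer that injects the token is untouched.&lt;/p&gt;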

&lt;h2&gt;
  
  
  Building Confidence with the Testing Pyramid Strategy
&lt;/h2&gt;

&lt;p&gt;Our system does indeed follow the SOLID principles! That's great, but how does it hold up in testing? One of our main goals was to ensure that changes could be made confidently, with good test coverage.&lt;/p&gt;

&lt;p&gt;Fortunately, by adhering to SOLID and the Ports and Adapters architecture, testing becomes much easier as a natural side effect. Like a bonus! The clear separation of concerns allows us to test each layer independently, as shown in the diagram below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpsvno9t7ixdv622b0vq3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpsvno9t7ixdv622b0vq3.png" alt="Testing strategy" width="800" height="799"&gt;&lt;/a&gt;&lt;br&gt;This strategy is known as the &lt;a href="https://martinfowler.com/articles/practical-test-pyramid.html" rel="noopener noreferrer"&gt;Testing Pyramid&lt;/a&gt;.
  &lt;/p&gt;

&lt;h3&gt;
  
  
  E2E Testing
&lt;/h3&gt;

&lt;p&gt;Simulates real user interactions by calling the server over HTTP, using a test database, and seeding data before tests. It's thorough but slower because it involves external systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Integration Testing
&lt;/h3&gt;

&lt;p&gt;Mocks most adapters to avoid calling real external systems. It’s faster and ensures modules work well together without involving full system dependencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unit Testing
&lt;/h3&gt;

&lt;p&gt;Mocks all adapters, ensuring no external systems are touched. It’s ultra-fast, focusing on testing isolated logic within a single module.&lt;/p&gt;
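&lt;p&gt;A small self-contained sketch of this idea, with every adapter replaced by a stub so no external system is touched (names and the line-counting logic are illustrative):&lt;/p&gt;

```typescript
interface ISftpConnector {
    fileGet(path: string): string;
}

class InventoryService {
    constructor(private readonly sftp: ISftpConnector) {}

    // The isolated logic under test: count the lines of an inventory file.
    countLines(path: string): number {
        return this.sftp.fileGet(path).split("\n").length;
    }
}

// The stub stands in for the real SFTP adapter; the test never touches a server.
const sftpStub: ISftpConnector = { fileGet: () => "sku-1\nsku-2\nsku-3" };
const service = new InventoryService(sftpStub);
```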

&lt;h2&gt;
  
  
  Ensuring Testability with Pure Functions
&lt;/h2&gt;

&lt;p&gt;In addition to our testing strategy, we ensure that each service in our Nest modules—whether public or private—follows a functional programming style using the &lt;a href="https://gcanti.github.io/fp-ts/" rel="noopener noreferrer"&gt;fp-ts&lt;/a&gt; library.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzkwkzhujzurmcu5uu0x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzkwkzhujzurmcu5uu0x.png" alt="Logo of functional programming library fp-ts" width="360" height="366"&gt;&lt;/a&gt;&lt;br&gt;The creator of fp-ts, Giulio Canti, recently joined the &lt;a href="https://effect.website/" rel="noopener noreferrer"&gt;Effect&lt;/a&gt; team. Effect is a library very similar to fp-ts, with some additional bells and whistles.
  &lt;/p&gt;

&lt;p&gt;To illustrate this, let's take a look at the typical structure of a Nest module.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4e3v3uam9g4y4292ewyt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4e3v3uam9g4y4292ewyt.png" alt="NestJs module structure" width="800" height="722"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This approach allows us to write &lt;em&gt;pure functions&lt;/em&gt;—functions that 1) always return the same output for the same input, and 2) don’t produce side effects. Side effects occur when a function interacts with the world outside of itself (e.g., calling an API or modifying a global state), making testing and debugging more difficult.&lt;/p&gt;
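&lt;p&gt;A tiny illustration of the difference (the VAT rate is just an example value):&lt;/p&gt;

```typescript
// Pure: the same input always produces the same output, and nothing
// outside the function is read or modified.
const addVat = (price: number): number => price * 1.25;

// Impure: the result depends on mutable outside state, so two calls with
// the same input can disagree, which makes testing and debugging harder.
let vatRate = 1.25;
const addVatImpure = (price: number): number => price * vatRate;
```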

&lt;p&gt;To avoid this, we use &lt;code&gt;TaskEither&lt;/code&gt;, a &lt;em&gt;type&lt;/em&gt; from fp-ts that represents an asynchronous operation that can either succeed or fail. Here’s an example from our &lt;code&gt;IOrder&lt;/code&gt; interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;taskEither&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;TE&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fp-ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;OrderNumber&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;IOrderServiceCreate&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;createOrder&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;axOrderJson&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TaskEither&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;OrderError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;OrderNumber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;IOrder&lt;/code&gt; composes interfaces like &lt;code&gt;IOrderServiceCreate&lt;/code&gt;, where &lt;code&gt;TaskEither&lt;/code&gt; is used for async operations that could fail.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Injectable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AXOrderService&lt;/span&gt; &lt;span class="k"&gt;implements&lt;/span&gt; &lt;span class="nx"&gt;IOrder&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

&lt;span class="c1"&gt;// ... (code)&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nx"&gt;createOrder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TaskEither&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;OrderError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;OrderNumber&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// ... (code)&lt;/span&gt;

        &lt;span class="c1"&gt;// A value "data" of generic type T goes into the pipeline:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Do&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;data&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;right&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
            &lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;validatedData&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;validateData&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;orderNrAndShipmentId&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;getOrderNrAndShipmentId&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;market&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;getMarket&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fromEither&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;persist&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;xml&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;createXml&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gcsBucketName&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;getGCSBucketName&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cloudUploadSuccess&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;performCloudUpload&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sftpUploadSuccess&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;performSftpUpload&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nx"&gt;TE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;persistSuccess&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;// And comes out transformed on the other side,&lt;/span&gt;
        &lt;span class="c1"&gt;// as a type: TaskEither&amp;lt;OrderError, OrderNumber&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;TaskEither&lt;/code&gt; is technically what is called a &lt;em&gt;monad&lt;/em&gt;, which is a design pattern in functional programming. The funky syntax is based on Haskell's &lt;code&gt;do&lt;/code&gt; notation. (&lt;a href="https://en.wikibooks.org/wiki/Haskell/do_notation" rel="noopener noreferrer"&gt;Link&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;This entire declarative flow in the service, from beginning to end, is lazy and pure. Laziness ensures that nothing happens until exactly when the function is invoked, and purity guarantees that the function’s behavior is deterministic. This predictability makes our services easier to test, as every input will consistently return the same result without causing hidden side effects.&lt;/p&gt;
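&lt;p&gt;The laziness can be shown with a small self-contained sketch. (This is a simplified, synchronous stand-in for the real &lt;code&gt;TaskEither&lt;/code&gt;, which is asynchronous; the point is only that describing the program runs nothing.)&lt;/p&gt;

```typescript
type Either<E, A> = { _tag: "Left"; left: E } | { _tag: "Right"; right: A };

// A lazy computation: nothing runs until the thunk is invoked.
type LazyEither<E, A> = () => Either<E, A>;

let sideEffects = 0;

const persistOrder: LazyEither<string, number> = () => {
    sideEffects += 1; // the "effect" happens only upon invocation
    return { _tag: "Right", right: 42 };
};

// Merely describing the program executes nothing: sideEffects is still 0 here.
const program = persistOrder;

// Invoking it runs the effect exactly once and yields the Either result.
const result = program();
```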

&lt;h2&gt;
  
  
  Deployment on Google Cloud Platform
&lt;/h2&gt;

&lt;p&gt;Finally, for deployment, we run the app in a Docker container on Google Cloud Run, which manages the infrastructure and scales automatically to meet demand. We also rely on Google Cloud's built-in authentication, so security is managed behind the scenes, letting us focus on building the app instead of worrying about access control.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg7dg3bbiiidoctstf6t5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg7dg3bbiiidoctstf6t5.png" alt="Umain's Cloud Deployment" width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  In Summary: Was the Backend Transformation Successful?
&lt;/h2&gt;

&lt;p&gt;So, this all sounds great on paper, but what’s been the real outcome? We’re proud to say the system has been running smoothly since deployment, doing its job without a hitch. &lt;/p&gt;

&lt;p&gt;For an engineer, there’s little more satisfying than seeing a system you’ve built work seamlessly, reliably, and without constant intervention.&lt;/p&gt;

&lt;p&gt;By taking the time to apply timeless software engineering principles, we’ve built a stable backend platform that the client has been highly satisfied with — one that lets them focus on adding new features instead of constantly fixing things. They can finally innovate with confidence, knowing their backend will keep up with whatever comes next.&lt;/p&gt;




&lt;p&gt;Dawid Dahl is a full-stack developer at &lt;a href="https://www.umain.com/" rel="noopener noreferrer"&gt;UMAIN&lt;/a&gt; | &lt;a href="https://www.eidra.com/" rel="noopener noreferrer"&gt;EIDRA&lt;/a&gt;. In his free time, he enjoys metaphysical ontology and epistemology, analog synthesizers, consciousness, techno, Huayan and Madhyamika Prasangika philosophy, and being with friends and family.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>architecture</category>
      <category>nestjs</category>
      <category>typescript</category>
    </item>
    <item>
      <title>The Death of RAG: What a 10M Token Breakthrough Means for Developers</title>
      <dc:creator>Dawid Dahl</dc:creator>
      <pubDate>Mon, 19 Feb 2024 09:33:42 +0000</pubDate>
      <link>https://forem.com/dawiddahl/the-death-of-rag-what-a-10m-token-breakthrough-means-for-developers-3p24</link>
      <guid>https://forem.com/dawiddahl/the-death-of-rag-what-a-10m-token-breakthrough-means-for-developers-3p24</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;"&lt;em&gt;In our research, we’ve also successfully tested up to 10 million tokens&lt;/em&gt;." - Google Researchers&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The other day, Google announced &lt;a href="https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/#architecture"&gt;Gemini Pro 1.5&lt;/a&gt; with a massive increase in accurate long-context understanding. While I could not immediately put my finger on exactly what the broader implications might be, I had a hunch that if this is actually true, it's going to change the game.&lt;/p&gt;

&lt;p&gt;It was not until I spoke to a colleague at work that I finally realized the true impact of Google's announcement.&lt;/p&gt;

&lt;p&gt;He said:&lt;/p&gt;

&lt;p&gt;"&lt;em&gt;Latency is going to be a thing though... 10M tokens is quite a few MBs&lt;/em&gt;."&lt;/p&gt;

&lt;p&gt;It struck me that I’d actually be &lt;strong&gt;thrilled&lt;/strong&gt; to have the option to wait longer, if it meant a higher quality AI conversation.&lt;/p&gt;

&lt;p&gt;For example, let’s say it took 5 minutes, or 1 hour — hell, even if it took &lt;em&gt;one whole day&lt;/em&gt; — to have my entire codebase put into the chat’s context window. If, after that time, the AI had near-perfect access to that context throughout the rest of the conversation like Google claims, I’d happily, patiently and gratefully wait that amount of time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is &lt;code&gt;RAG&lt;/code&gt; (Retrieval-Augmented Generation)
&lt;/h2&gt;

&lt;p&gt;My colleague and I had both worked on the &lt;strong&gt;ARC AI Portal&lt;/strong&gt; at our workplace, an internal platform where we give everybody free access to GPT-4 and use something called &lt;code&gt;RAG&lt;/code&gt; (retrieval-augmented generation) for various purposes. The purpose of &lt;code&gt;RAG&lt;/code&gt; is to give an AI access to information it does not natively possess — information that a fresh ChatGPT session would otherwise know nothing about.&lt;/p&gt;

&lt;p&gt;For example, one use case for &lt;code&gt;RAG&lt;/code&gt; was when another colleague of ours, the author Rebecka Carlsson, asked us to let people chat directly with her latest book &lt;a href="https://www.adlibris.com/se/bok/the-speakers-journey-7-steps-to-create-the-important-narratives-speeches-for-our-transformative-times-9789198775686" rel="noopener noreferrer"&gt;The Speaker's Journey&lt;/a&gt; using our company's AI portal.&lt;/p&gt;

&lt;p&gt;So the AI portal team developed the full &lt;code&gt;RAG&lt;/code&gt; pipeline: took the book → chunked it into small pieces (not literally) → used &lt;a href="https://openai.com/blog/new-embedding-models-and-api-updates" rel="noopener noreferrer"&gt;OpenAI's embedding model&lt;/a&gt; to vectorize the chunks → inserted them into a vector database → and finally gave the AI access to the database within the chat via something called semantic search.&lt;/p&gt;
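&lt;p&gt;To make the pipeline concrete, here is a heavily simplified sketch of the retrieval step. A toy letter-frequency "embedding" and an in-memory array stand in for OpenAI's embedding model and a real vector database; only the shape of the flow — embed, index, rank by cosine similarity — matches what we actually built:&lt;/p&gt;

```typescript
// Toy RAG retrieval sketch: real systems use an embedding model + vector DB.
type Chunk = { text: string; vector: number[] };

// Stand-in "embedding": counts letters a-z. Real pipelines call an embedding model.
const embed = (text: string): number[] => {
  const v: number[] = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const i = ch.charCodeAt(0) - 97;
    if (i >= 0 && i < 26) v[i] += 1;
  }
  return v;
};

// Cosine similarity between two vectors.
const cosine = (a: number[], b: number[]): number => {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b) || 1);
};

// "Vector database": an in-memory array of embedded chunks.
const index = (texts: string[]): Chunk[] =>
  texts.map(text => ({ text, vector: embed(text) }));

// Semantic search: rank chunks by similarity to the query, keep the top k.
const search = (db: Chunk[], query: string, k = 1): string[] =>
  db
    .map(c => ({ text: c.text, score: cosine(c.vector, embed(query)) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(c => c.text);

const db = index(["chapter on stage fright", "chapter on speech structure"]);
console.log(search(db, "structure of a speech"));
```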

&lt;p&gt;Mostly, it worked great. If people asked about some specific detail from her book, the &lt;code&gt;RAG&lt;/code&gt; solution was able to retrieve the information more often than not.&lt;/p&gt;

&lt;p&gt;But here is the deal: &lt;code&gt;RAG&lt;/code&gt; is a hack. We are essentially brute-forcing the information onto the AI, which means it often doesn't actually work as well as one would hope. It can't do summaries well, and it requires developer time to set up, making it slow and costly.&lt;/p&gt;

&lt;p&gt;So my point to my colleague was this:&lt;/p&gt;

&lt;p&gt;As it is now, people obviously wait weeks, even months, and pay loads of money to people like us to implement &lt;code&gt;RAG&lt;/code&gt;, a solution which is riddled with problems even when done by an expert.&lt;/p&gt;

&lt;p&gt;Surely then, Google — and eventually OpenAI when they release the equivalent solution — adding a little bit of latency for this new feature is fine.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Big Problem In AI-Driven Development Today
&lt;/h2&gt;

&lt;p&gt;In my current AI-driven development (AIDD) workflow, I always find myself copy-pasting the relevant parts of my codebase into the AI chat window at the beginning of the conversation. This is because, like most things in life, the specific functions I collaborate on with the AI never exist in isolation. They are always embedded in some larger network of system dependencies.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important to point out&lt;/strong&gt;: At work we use either our internal AI portal or ChatGPT Teams, where OpenAI never train their models on our proprietary data or conversations.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So even as I painstakingly take my time to try and copy-paste all the relevant context, since a production codebase is such a huge ecosystem of tens or even hundreds of thousands of lines of code, I could never realistically give it all. And even if I do give a lot, as the conversation goes on, the AI will eventually begin to &lt;a href="https://www.nightfall.ai/ai-security-101/catastrophic-forgetting#:~:text=At%20its%20core%2C%20catastrophic%20forgetting,related%20to%20previously%20learned%20tasks." rel="noopener noreferrer"&gt;forget&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;GitHub Copilot tried to solve this with a native &lt;code&gt;RAG&lt;/code&gt; solution built straight into the code editor. While it works sometimes, it's so sketchy I can never rely on it, meaning their &lt;code&gt;RAG&lt;/code&gt; implementation is almost useless.&lt;/p&gt;

&lt;p&gt;This cumbersome dance of feeding the AI piece by piece of our codebase, and it constantly forgetting and needing to start over, is a fragmented, inefficient process that disrupts the flow of the AI collaboration and often leads to results that are hit or miss.&lt;/p&gt;

&lt;p&gt;That is, until now.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkboujas6sqtv3bel66y3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkboujas6sqtv3bel66y3.jpg" alt="Claims on X about Google Gemini 1.5 Pro" width="800" height="648"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Exciting Post-&lt;code&gt;RAG&lt;/code&gt; Era
&lt;/h2&gt;

&lt;p&gt;Approximately 1 million tokens amounts to around 50,000 lines of code, and 10 million tokens would thus equate to about 500,000 lines of code. That means that if Google's claims are correct, almost all our codebases would fit into an AI's view all at once.&lt;/p&gt;
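&lt;p&gt;The back-of-the-envelope math behind those numbers, assuming roughly 20 tokens per line of code (a heuristic, not a measured figure):&lt;/p&gt;

```typescript
// Rough estimate of how many lines of code fit in a context window,
// assuming ~20 tokens per line (a heuristic assumption, not a measured figure).
const TOKENS_PER_LINE = 20;

const linesOfCode = (contextTokens: number): number =>
  Math.floor(contextTokens / TOKENS_PER_LINE);

console.log(linesOfCode(1_000_000));  // 50000
console.log(linesOfCode(10_000_000)); // 500000
```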

&lt;p&gt;This would be nothing short of revolutionary.&lt;/p&gt;

&lt;p&gt;It's akin to moving from examining individual cells under a microscope to viewing an entire organism at once. Where once we pieced together snippets of code to get a partial understanding, a 10 million token context allows us to perceive the full "organism" of our codebase in all its glorious complexity and interconnectivity.&lt;/p&gt;

&lt;p&gt;This shift then would offer a complete and holistic view, enhancing our ability to collaborate with the AI to add new features, refactor, test and optimize our software systems efficiently.&lt;/p&gt;

&lt;h2&gt;
  
  
  So Is &lt;code&gt;RAG&lt;/code&gt; Dead?
&lt;/h2&gt;

&lt;p&gt;Even after the conversation with my colleague, thoughts about the deeper implications kept coming. When we get up to 10M context length with better retrieval than &lt;code&gt;RAG&lt;/code&gt;, what is even the point of &lt;code&gt;RAG&lt;/code&gt;? Does it have any value at all?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;By &lt;code&gt;RAG&lt;/code&gt; I mean specifically: creating embeddings, feeding them into a vector database and then doing semantic search over those embeddings before feeding the results back to the AI.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Just take one unique selling point of &lt;code&gt;RAG&lt;/code&gt; today: metadata. That is, the ability to attach extra pieces of information — such as sources, line-of-code numbers, file types, compliance labels, etc. — to the data that the AI interacts with. With such metadata, the retrieval step can access detailed information for greater specificity and context-awareness in the AI's responses.&lt;/p&gt;

&lt;p&gt;But really, why go through the vector database hassle, when you could just have a quick higher-order function that transforms your entire codebase into a JSON data structure with whatever metadata you'd like?&lt;/p&gt;

&lt;p&gt;Something such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;FormatterFunction&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;V&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;inputData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;V&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;ProcessData&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;V&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;formatter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;FormatterFunction&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;V&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;inputData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;V&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
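&lt;p&gt;A concrete (and entirely hypothetical) formatter satisfying those types might split a source file into lines and attach line-number metadata, along these lines:&lt;/p&gt;

```typescript
type FormatterFunction<T = unknown, U = unknown, V = unknown> = (
  inputData: T,
  config: U
) => V;

type ProcessData<T = unknown, U = unknown, V = unknown> = (
  formatter: FormatterFunction<T, U, V>,
  inputData: T,
  config: U
) => V;

// Hypothetical metadata shape: each line of code with its line number.
type CodeChunk = { code: string; loc: number };

// Hypothetical formatter: annotate every line of a source string with its loc.
const withLineNumbers: FormatterFunction<string, { startLoc: number }, CodeChunk[]> =
  (source, config) =>
    source.split("\n").map((code, i) => ({ code, loc: config.startLoc + i }));

const processData: ProcessData<string, { startLoc: number }, CodeChunk[]> =
  (formatter, inputData, config) => formatter(inputData, config);

const chunks = processData(
  withLineNumbers,
  "print('the')\nprint('post')",
  { startLoc: 73672 }
);
console.log(JSON.stringify(chunks, null, 2));
```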



&lt;p&gt;Could result in some data structure like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="err"&gt;…&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;“&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="err"&gt;‘&lt;/span&gt;&lt;span class="nx"&gt;the&lt;/span&gt;&lt;span class="err"&gt;’&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;”&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;73672&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;“&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="err"&gt;‘&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="err"&gt;’&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;”&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;73673&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;“&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="err"&gt;‘&lt;/span&gt;&lt;span class="nx"&gt;RAG&lt;/span&gt;&lt;span class="err"&gt;’&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;”&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;73674&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;“&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="err"&gt;‘&lt;/span&gt;&lt;span class="nx"&gt;era&lt;/span&gt;&lt;span class="err"&gt;’&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;”&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;73675&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="err"&gt;…&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then you give this JSON to the AI, instead of giving it the regular codebase. Sure, it’s more characters, which would increase the overall token count. But when you’re dealing with the hypothetical insanity of &lt;em&gt;millions&lt;/em&gt; of tokens, this is starting to feel like a possibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  Playing the Devil's &lt;code&gt;RAG&lt;/code&gt;-vocate
&lt;/h2&gt;

&lt;p&gt;Before we declare &lt;code&gt;RAG&lt;/code&gt; dead, let's invite a Devil's advocate and consider some of the other reasons why we might want to keep &lt;code&gt;RAG&lt;/code&gt; around.&lt;/p&gt;

&lt;h3&gt;
  
  
  😈 &lt;strong&gt;Fake?&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  "&lt;em&gt;Yeah I saw the original Gemini video, which turned out to be fake. So why would I believe this?&lt;/em&gt;"
&lt;/h4&gt;

&lt;p&gt;I was also very skeptical, until I saw &lt;a href="https://www.youtube.com/watch?v=Pvk4vqescz4" rel="noopener noreferrer"&gt;this video&lt;/a&gt; from someone not working for Google.&lt;/p&gt;

&lt;p&gt;Also, there were &lt;a href="https://twitter.com/rowancheung/status/1759280384930459941?t=nBZ51ivqbeUyghitYLZHZQ" rel="noopener noreferrer"&gt;these demos&lt;/a&gt; from a tester not affiliated with Google on X as well.&lt;/p&gt;

&lt;p&gt;I was extremely surprised by these promising results.&lt;/p&gt;

&lt;h3&gt;
  
  
  😈 &lt;strong&gt;Staying Updated&lt;/strong&gt;:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  "&lt;em&gt;&lt;code&gt;RAG&lt;/code&gt; keeps AI clued into the latest info, something a static context can't always do.&lt;/em&gt;"
&lt;/h4&gt;

&lt;p&gt;Well, what would prevent us from just giving the freshest data at the beginning of every AI conversation? Or even updating it periodically during the same conversation?&lt;/p&gt;

&lt;h3&gt;
  
  
  😈 &lt;strong&gt;Reducing Hallucinations&lt;/strong&gt;:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  "&lt;em&gt;Since &lt;code&gt;RAG&lt;/code&gt; runs on our own server, we have the power to tell the AI to simply say 'I don't know' if relevant context was not able to be retrieved.&lt;/em&gt;"
&lt;/h4&gt;

&lt;p&gt;This is true, and the simple fact that we as developers have a programmatic step of total control between the &lt;strong&gt;retrieval&lt;/strong&gt; and the &lt;strong&gt;response&lt;/strong&gt; stages just intuitively &lt;em&gt;feels&lt;/em&gt; good. So this is a good point.&lt;/p&gt;

&lt;p&gt;But then again, there is nothing stopping us from implementing some solution where we first do the retrieval query and then perform some arbitrary action before feeding the result back to the model. You wouldn't need the whole manual chunking/embedding/vector-database/semantic-search pipeline for that.&lt;/p&gt;

&lt;h3&gt;
  
  
  😈 &lt;strong&gt;Handling the Tough Questions&lt;/strong&gt;:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  "&lt;em&gt;For those tricky queries that need more than just a quick look-up, &lt;code&gt;RAG&lt;/code&gt; can dig deeper.&lt;/em&gt;"
&lt;/h4&gt;

&lt;p&gt;If we have the full and complete data, and if the AI can have instant access to all of it like Google appeared to demonstrate in their demos, why would we need to dig deeper with &lt;code&gt;RAG&lt;/code&gt; at all?&lt;/p&gt;

&lt;h3&gt;
  
  
  😈 &lt;strong&gt;Efficiency&lt;/strong&gt;:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  "&lt;em&gt;When it comes to managing big data without bogging down the system, &lt;code&gt;RAG&lt;/code&gt; can be pretty handy.&lt;/em&gt;"
&lt;/h4&gt;

&lt;p&gt;If this large context window is offered as a service, then that means the system is actually designed to be bogged down with data.&lt;/p&gt;

&lt;h3&gt;
  
  
  😈 &lt;strong&gt;Keeping Content Fresh&lt;/strong&gt;:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  "&lt;em&gt;&lt;code&gt;RAG&lt;/code&gt; helps AI stay on its toes, pulling in new data on the fly.&lt;/em&gt;"
&lt;/h4&gt;

&lt;p&gt;Google declares: "&lt;em&gt;Gemini 1.5 Pro can seamlessly analyze, classify and summarize large amounts of content within a given prompt.&lt;/em&gt;" This means it can pull in data from the entire context window on the fly.&lt;/p&gt;

&lt;h3&gt;
  
  
  😈 &lt;strong&gt;Computational and Memory Constraints&lt;/strong&gt;:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  "&lt;em&gt;Processing 100 million tokens in a single pass would require significant computational resources and memory, which might not be practical or efficient for all applications. Not to mention costly.&lt;/em&gt;"
&lt;/h4&gt;

&lt;p&gt;This is a good point. As more compute is needed, costs will be higher compared to RAG. &lt;/p&gt;

&lt;p&gt;Also consider the global environmental impact: running data centers is one of the major energy drains today. Efficient use of computational resources with &lt;code&gt;RAG&lt;/code&gt; could potentially contribute to more sustainable AI practices.&lt;/p&gt;

&lt;h3&gt;
  
  
  😈 &lt;strong&gt;Extending with API requests&lt;/strong&gt;:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  "&lt;em&gt;Sometimes, the AI would need to augment its data with external API requests to get the full picture. When we do &lt;code&gt;RAG&lt;/code&gt;, it happens on a server, so we can call out to external services before returning the relevant context back to the model.&lt;/em&gt;"
&lt;/h4&gt;

&lt;p&gt;AI already has access to web browsing, and there is nothing in principle that prevents an AI from using it while constructing its responses. If you would like more control over external services and network requests, you should utilize AI &lt;a href="https://dev.to/dawiddahl/the-new-computer-use-serverless-to-build-your-first-ai-os-app-409"&gt;Function Calling&lt;/a&gt; instead.&lt;/p&gt;

&lt;h3&gt;
  
  
  😈 &lt;strong&gt;Speed&lt;/strong&gt;:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  "&lt;em&gt;I saw Google's demos. It took a long time to get a response; vector databases are much much faster.&lt;/em&gt;"
&lt;/h4&gt;

&lt;p&gt;This is also true. But personally, I'd rather wait a long time for an accurate response, than wait a short time for a response I can't trust.&lt;/p&gt;

&lt;p&gt;Also, honestly, who would be surprised if the latency starts to decrease within a couple of months as new models are released?&lt;/p&gt;

&lt;h2&gt;
  
  
  In Summary
&lt;/h2&gt;

&lt;p&gt;Google's new breakthrough announcement could flip the script for developers by allowing AI to digest our entire codebases at once, thanks to its potential 10 million token capacity. This leap forward should make us rethink the need for &lt;code&gt;RAG&lt;/code&gt;, as direct, comprehensive code understanding by AI becomes a reality.&lt;/p&gt;

&lt;p&gt;The prospect of waiting a bit longer for in-depth AI collaboration seems a small price to pay for the massive gains in accuracy and sheer brain power. As we edge into this new era, it's not just about coding faster; it's about coding smarter, with AI as a true partner in our creative and problem-solving endeavors.&lt;/p&gt;




&lt;p&gt;Dawid Dahl is a full-stack developer at &lt;a href="https://www.umain.com/" rel="noopener noreferrer"&gt;UMAIN&lt;/a&gt; | &lt;a href="https://arc.inc/" rel="noopener noreferrer"&gt;ARC&lt;/a&gt;. In his free time, he enjoys metaphysical ontology and epistemology, analog synthesizers, consciousness, techno, Huayan and Madhyamika Prasangika philosophy, and being with friends and family.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>coding</category>
    </item>
    <item>
      <title>The New Computer: Use Serverless to Build Your First AI-OS App</title>
      <dc:creator>Dawid Dahl</dc:creator>
      <pubDate>Thu, 01 Feb 2024 12:46:55 +0000</pubDate>
      <link>https://forem.com/dawiddahl/the-new-computer-use-serverless-to-build-your-first-ai-os-app-409</link>
      <guid>https://forem.com/dawiddahl/the-new-computer-use-serverless-to-build-your-first-ai-os-app-409</guid>
      <description>&lt;p&gt;There is no denying some really interesting and groundbreaking things are cooking over at OpenAI.&lt;/p&gt;

&lt;p&gt;Why do I say that? The reason is that in recent months they have started to release some things many people didn't fully expect. I believe this is a sign that internally, OpenAI is currently executing on an overarching plan that over the coming years will change the digital landscape completely. &lt;/p&gt;

&lt;p&gt;What are some of these things they have released? Examples include &lt;code&gt;GPTs&lt;/code&gt;, &lt;code&gt;GPT Actions&lt;/code&gt;, and most recently: &lt;code&gt;GPT @-mentions&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9x9awb9mdpct8kgzco4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9x9awb9mdpct8kgzco4.jpg" alt="GPTs @ mentions" width="800" height="554"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is simply a way to reference your GPTs—AI chatbots that you can customize on your own—in your current ChatGPT conversation.&lt;/p&gt;

&lt;p&gt;Well, you might say, that doesn't sound like such a big deal? And why is so much time being spent on these GPTs? Are they even any good?&lt;/p&gt;

&lt;p&gt;Let me show you why it is a big deal.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Dawn Of A New Computer
&lt;/h2&gt;

&lt;p&gt;Back in the day, there was a little company called Microsoft that revolutionized personal computing. Founded in 1975 by Bill Gates and Paul Allen, Microsoft achieved its big break with MS-DOS, an operating system developed for the IBM PC in 1981. This success paved the way for Windows, which became the dominant operating system worldwide.&lt;/p&gt;

&lt;p&gt;Seizing the opportunity in the flourishing era of personal computing, Microsoft's strategic innovations and market adaptability turned it into a tech juggernaut.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3bqfxh5ji1q3fmfu7j4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3bqfxh5ji1q3fmfu7j4.png" alt="OpenAI hypothetical LLM OS" width="800" height="451"&gt;&lt;/a&gt;&lt;br&gt;Image of an hypothetical LLM OS by &lt;a href="https://twitter.com/karpathy?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor" rel="noopener noreferrer"&gt;Andrej Karpathy&lt;/a&gt;, working at OpenAI.
  &lt;/p&gt;

&lt;p&gt;I believe that what Microsoft did with the release of Windows 1.0 back in 1985 is what OpenAI is gearing up to do in 2024 and beyond: creating a new kind of AI-OS for the next generation of personal computers. This could be as pivotal for our digital interactions as when Bill and Paul revolutionized computing with Windows.&lt;/p&gt;

&lt;h2&gt;
  
  
  GPTs as AI-OS Apps
&lt;/h2&gt;

&lt;p&gt;So essentially, these GPTs are to the AI-OS what traditional applications were to Windows; instead of launching an app on your computer, you will be orchestrating AI agents to perform actions on your behalf. And it will be so much more fun and engaging than pressing down 👇🏻 keys on a board or other pieces of plastic. &lt;/p&gt;

&lt;p&gt;Instead of being on your own as in the days of PC's past, as I described in a &lt;a href="https://dev.to/dawiddahl/meet-your-future-co-workers-the-rise-of-ai-agents-in-the-office-441m"&gt;previous article&lt;/a&gt;, you will instead be collaborating directly with a host of artificially intelligent beings, not at all unlike how Luke Skywalker is dealing with C-3PO in Star Wars.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobvi46b6gfueg0l1mi95.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobvi46b6gfueg0l1mi95.jpeg" alt="C-3PO from Star Wars" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But how do you actually create one of these new AI-OS apps? In the next section, I'll guide you through the process of using AI to help you build an (&lt;a href="https://dev.to/dawiddahl/climbing-the-ai-application-value-ladder-4cf0#value-level-2-function-calling"&gt;AI Application Value Level 2&lt;/a&gt;) &lt;code&gt;GPT Action&lt;/code&gt;, using serverless functions technology. &lt;/p&gt;

&lt;p&gt;The most common way of building a &lt;code&gt;GPT Action&lt;/code&gt; today, if you look at ChatGPT-related YouTube content, is Zapier: a no-code platform allowing you to perform actions like sending email or updating your calendar. By using serverless functions instead, you actually won't need to pay Zapier a subscription fee every month!&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ℹ️ 1. Even though being a developer helps when building serverless functions, with the help of AI (and a little grit), it's not strictly necessary, as you can learn as you go.&lt;/p&gt;

&lt;p&gt;ℹ️ 2. Even though it is called “serverless”, that doesn’t mean there is no server. It just means that we don’t use our own local server; we use some other company’s server in the ☁️.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Using Serverless Functions to Create Your First Proto AI-OS-App
&lt;/h2&gt;

&lt;p&gt;So what shall we build? As a proof-of-concept, let's go for an AI-OS app that should, on the server, generate some ASCII art of a cow that says something. Like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89ats08pjltcpt7jxph3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F89ats08pjltcpt7jxph3.png" alt="ASCII art cow" width="800" height="270"&gt;&lt;/a&gt;&lt;br&gt;To create the ASCII art, we'll use &lt;code&gt;cowsay&lt;/code&gt; on the server, which is an &lt;a href="https://www.npmjs.com/package/cowsay" rel="noopener noreferrer"&gt;external library&lt;/a&gt; designed for this cowsome purpose.
  &lt;/p&gt;

&lt;p&gt;Then that art should be sent from the server back to the AI-OS app (our GPT), which will then create a beautiful painting drawing inspiration from this ASCII art.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You will need 1) a &lt;a href="https://chat.openai.com/" rel="noopener noreferrer"&gt;ChatGPT&lt;/a&gt; Plus or Teams account, 2) a free &lt;a href="https://vercel.com/" rel="noopener noreferrer"&gt;Vercel&lt;/a&gt; account and 3) a free &lt;a href="https://github.com" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; account to build along with me.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 1: Set Up The ☁️ Environment
&lt;/h3&gt;

&lt;p&gt;Open up ChatGPT and ask it to generate a serverless function on Vercel.&lt;/p&gt;

&lt;p&gt;To get started, use this prompt:&lt;/p&gt;

&lt;p&gt;"&lt;em&gt;Could you carefully guide me through creating a serverless function with Vercel using Node, starting by setting up a Next.js project using create-next-app, then writing a basic serverless function in TypeScript, and finally deploying it via the Vercel CLI? Please also explain step-by-step how we link the Vercel project to GitHub&lt;/em&gt;."&lt;/p&gt;

&lt;p&gt;If you prefer a written guide, you can use &lt;a href="https://vercel.com/docs/functions/serverless-functions/quickstart" rel="noopener noreferrer"&gt;this&lt;/a&gt;. To see or clone my finished serverless function repository on GitHub, click &lt;a href="https://github.com/dawid-dahl-umain/gpt-functions-cowsay/tree/main" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Vercel's Hobby Plan offers &lt;em&gt;free&lt;/em&gt; serverless functions for small projects, with up to 10-second runtime and an ample monthly capacity of 100 GB-hours. That means a simple function can run around 700,000 times a month, for free! No need to pay Zapier every month.&lt;/p&gt;

&lt;p&gt;More info on pricing &lt;a href="https://vercel.com/docs/accounts/plans/hobby" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 2: Create The Function 🛠️
&lt;/h3&gt;

&lt;p&gt;Now you should have a Next.js project. Inside the &lt;code&gt;app&lt;/code&gt; folder, there is an &lt;code&gt;api&lt;/code&gt; folder. Inside that folder, create a new folder whose name you can think of as a spell 🪄✨ we use to activate our function. Let's go with &lt;code&gt;gpt-functions-cowsay&lt;/code&gt;, or whatever you'd like. Remember this spell name; we will need it later.&lt;/p&gt;

&lt;p&gt;Next, in this spell folder, create a file called &lt;code&gt;route.ts&lt;/code&gt;. The folder structure will thus be: &lt;code&gt;app/api/gpt-functions-cowsay/route.ts&lt;/code&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If at any point you feel lost, no worries! Just ask ChatGPT for clarification or help.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now, if ChatGPT didn't already do so, ask it to write the server-side code that generates a cowsay and returns the result. Use this prompt to get started:&lt;/p&gt;

&lt;p&gt;"&lt;em&gt;I need help creating a simple serverless function in Next.js that uses the 'cowsay' package. The function should take text from a URL search parameter, make a cow say it, and return this along with the request. Can you guide me through the steps, including necessary TypeScript code, to set up this function?&lt;/em&gt;"&lt;/p&gt;

&lt;p&gt;If the AI does its job, the code for the function will end up looking something like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NextResponse&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;next/server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;say&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cowsay&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NextRequest&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;next/server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;GET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;NextRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cowsayText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;nextUrl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;searchParams&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cowsay&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;NextResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;cowsay&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;say&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;cowsayText&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Paste this code into the &lt;code&gt;route.ts&lt;/code&gt; file in the spell folder (&lt;code&gt;gpt-functions-cowsay&lt;/code&gt;).&lt;/p&gt;
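&lt;p&gt;Once your project is deployed, you can sanity-check the endpoint from any script. Here is a minimal TypeScript sketch; the base URL below is from my deployment, so swap in your own Vercel domain:&lt;/p&gt;

```typescript
// Build the request URL for the cowsay function.
// NOTE: this domain is from my deployment; replace it with your own Vercel URL.
const base = "https://gpt-functions-cowsay.vercel.app/api/gpt-functions-cowsay";

const url = new URL(base);
url.searchParams.set("cowsay", "Hello there");

console.log(url.toString());
// https://gpt-functions-cowsay.vercel.app/api/gpt-functions-cowsay?cowsay=Hello+there

// To actually call the function (requires the deployment to be live):
// const res = await fetch(url);
// const { cowsay } = await res.json();
// console.log(cowsay);
```

&lt;p&gt;This is the same &lt;code&gt;GET&lt;/code&gt; request the GPT will make on your behalf once the action is configured.&lt;/p&gt;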

&lt;p&gt;Please note that although this function performs a simple task, in reality, within this server environment, you now wield the &lt;strong&gt;full power of software engineering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That's right. Unlike with Zapier, where you are restricted to following their rules, here you can build &lt;em&gt;any&lt;/em&gt; tool you want. And through the &lt;code&gt;Actions&lt;/code&gt; input in the GPT creation editor, you can hand this tool over to the AI for it to use on your behalf. &lt;/p&gt;

&lt;p&gt;Take a moment and just reflect on the vast possibilities. &lt;strong&gt;The sk-AI is the limit!&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Create An OpenAPI Spec 📄
&lt;/h3&gt;

&lt;p&gt;Now, the way we make our GPT aware of our new function so it can use it is to hand it something called an OpenAPI specification.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Yes, that was not a typo. While OpenAI is the company, &lt;a href="https://swagger.io/specification/" rel="noopener noreferrer"&gt;OpenAPI&lt;/a&gt; is a rulebook for how computer programs talk to each other (APIs).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you are not a developer, you will have no idea how to write such a specification. But fear not, you can use another GPT called &lt;a href="https://chat.openai.com/g/g-TYEliDU6A-actionsgpt" rel="noopener noreferrer"&gt;ActionsGPT&lt;/a&gt; to do it for you.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbf3q6kk1q6cq0kvnn38l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbf3q6kk1q6cq0kvnn38l.png" alt="Add actions button in gpt editor" width="705" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In the configuration tab of the GPT creator, click the "Create new action" button.&lt;/li&gt;
&lt;li&gt;In a separate ChatGPT thread, @-mention &lt;code&gt;ActionsGPT&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgjzbzvow0crtaltx4rlv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgjzbzvow0crtaltx4rlv.png" alt="@ mentioning a gpt" width="791" height="298"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ask it: "&lt;em&gt;I have set up a serverless function in Vercel. What should I do now to get an OpenAPI specification from you?&lt;/em&gt;" You could hand it some of the code too.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ActionsGPT&lt;/code&gt; will tell you to hand it some information.&lt;/li&gt;
&lt;li&gt;You will give it something like this. (The base URL comes from your Vercel project.) The prompt doesn't have to be exact; just get the URLs and the &lt;code&gt;GET&lt;/code&gt; or &lt;code&gt;POST&lt;/code&gt; right, and describe what your function does. Use this prompt to get started:&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;"&lt;em&gt;Endpoint URL(s): gpt-functions-cowsay&lt;br&gt;
HTTP Methods: GET&lt;br&gt;
Base URL: gpt-functions-cowsay.vercel.app&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;When given an input called cowsay, it will take it and make a cowsay out of it. Then it will return the cowsay.&lt;/em&gt;"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9lnghot7bkt1xl8d0kn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9lnghot7bkt1xl8d0kn.png" alt="abra cadabra spell function invocation" width="800" height="414"&gt;&lt;/a&gt;&lt;br&gt;In Aramaic, "&lt;em&gt;avra kehdabra&lt;/em&gt;" means "&lt;em&gt;I will create as I speak&lt;/em&gt;". If &lt;code&gt;gpt-functions-cowsay&lt;/code&gt; is the &lt;em&gt;kadabra&lt;/em&gt;, &lt;code&gt;GET&lt;/code&gt; is the &lt;em&gt;abra&lt;/em&gt;. Using them both together will cast the function's magic! ✨
  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ActionsGPT&lt;/code&gt; will then generate the OpenAPI spec for you.&lt;/li&gt;
&lt;/ul&gt;
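&lt;p&gt;For reference, the spec it produces will look roughly like this. Treat it as an illustrative sketch: the exact &lt;code&gt;operationId&lt;/code&gt;, descriptions, and schema details in your generated version may differ:&lt;/p&gt;

```yaml
openapi: 3.1.0
info:
  title: Cowsay Creator API
  version: 1.0.0
servers:
  - url: https://gpt-functions-cowsay.vercel.app
paths:
  /api/gpt-functions-cowsay:
    get:
      operationId: getCowsay
      summary: Make a cow say the given text as ASCII art
      parameters:
        - name: cowsay
          in: query
          required: false
          schema:
            type: string
          description: The text the cow should say
      responses:
        "200":
          description: The generated cowsay ASCII art
          content:
            application/json:
              schema:
                type: object
                properties:
                  cowsay:
                    type: string
```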

&lt;h3&gt;
  
  
  Step 4: Launch Your GPT! 🚀
&lt;/h3&gt;

&lt;p&gt;Finally, paste the OpenAPI specification into the Schema input of the GPT Actions editor. Like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feolvcloqyh29moqzpxu4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feolvcloqyh29moqzpxu4.png" alt="GPT actions configuration" width="711" height="915"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you encounter errors, consult &lt;code&gt;ActionsGPT&lt;/code&gt; with your serverless function code at hand. Iteration is key when building with AI.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Use this free &lt;a href="https://app.freeprivacypolicy.com/wizard/privacy-policy" rel="noopener noreferrer"&gt;privacy policy generator&lt;/a&gt; to create a policy for the GPT action, in case you want your GPT to be public.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 5: You're done! ✅
&lt;/h3&gt;

&lt;p&gt;That's it! If OpenAI allows you to save this GPT, you did it: you just built your first simple AI-OS app! 👏🏻&lt;/p&gt;

&lt;p&gt;This might've seemed daunting, especially for non-developers. And don't worry if you couldn't get it to work on your first try. Because remember, adding an action to a GPT is a &lt;a href="https://dev.to/dawiddahl/climbing-the-ai-application-value-ladder-4cf0#value-level-2-function-calling"&gt;Level 2&lt;/a&gt; task in AI software development — it's supposed to be a bit on the tougher side! But also more rewarding and fun to build, if you ask me.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feqre84np7y5blzp85zw5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feqre84np7y5blzp85zw5.png" alt="Cow congratulating you" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this guide, you've learned how to create an AI-OS app using serverless technology with our &lt;code&gt;cowsay&lt;/code&gt; example. This introductory project showcases the potential for building some truly innovative AI applications.&lt;/p&gt;

&lt;p&gt;If you didn't follow along and build it with me, here is the cowtastic &lt;a href="https://chat.openai.com/g/g-z1SLp6C5w-cowsay-creator" rel="noopener noreferrer"&gt;Cowsay Creator&lt;/a&gt; in action!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnsx2su80wztzhsrtiubd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnsx2su80wztzhsrtiubd.png" alt="gpt allow button for actions" width="799" height="265"&gt;&lt;/a&gt;&lt;br&gt;It is all right to press "&lt;em&gt;Allow&lt;/em&gt;" here. You can check the &lt;a href="https://github.com/dawid-dahl-umain/gpt-functions-cowsay" rel="noopener noreferrer"&gt;Github repo&lt;/a&gt; to verify that apart from bad cow art, nothing else bad happens in our serverless action.
  &lt;/p&gt;

&lt;p&gt;OpenAI's latest developments hint at a major shift, similar to when Windows first changed computing. We're seeing the start of a new AI-OS that could change everything, indicating that a future with C-3PO-like companions might be closer than we anticipate.&lt;/p&gt;

&lt;p&gt;And while our &lt;code&gt;Cowsay Creator&lt;/code&gt; GPT was just for fun and practice, by exploring this, you're already a part of the emerging AI-OS future. Who knows what actually valuable AI-OS apps you'll create next!&lt;/p&gt;




&lt;p&gt;Dawid Dahl is a full-stack developer at &lt;a href="https://www.umain.com/" rel="noopener noreferrer"&gt;UMAIN&lt;/a&gt; | &lt;a href="https://arc.inc/" rel="noopener noreferrer"&gt;ARC&lt;/a&gt;. In his free time, he enjoys metaphysical ontology, analog synthesizers, consciousness, Huayan and Madhyamika Prasangika philosophy, and being with friends and family.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For those keen to dive deeper into &lt;code&gt;function calling&lt;/code&gt; with LLMs, in &lt;a href="https://dev.to/dawiddahl/function-calling-the-most-significant-ai-feature-since-chatgpt-itself-81m"&gt;this article&lt;/a&gt; I offer another thorough exploration of the topic.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>serverless</category>
      <category>chatgpt</category>
      <category>programming</category>
    </item>
    <item>
      <title>Climbing the AI Application Value Ladder: 🤖🪜</title>
      <dc:creator>Dawid Dahl</dc:creator>
      <pubDate>Tue, 12 Dec 2023 05:52:21 +0000</pubDate>
      <link>https://forem.com/dawiddahl/climbing-the-ai-application-value-ladder-4cf0</link>
      <guid>https://forem.com/dawiddahl/climbing-the-ai-application-value-ladder-4cf0</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fofnev9vn9exiso3hmhhy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fofnev9vn9exiso3hmhhy.png" alt="AI Application Value Ladder Levels" width="800" height="643"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the whirlwind of AI advancements, it's easy to get caught up in the hype. Many companies boast about leveraging AI, often merely as a facade for a basic ChatGPT implementation, making a few calls to their API.&lt;/p&gt;

&lt;p&gt;As developers and AI enthusiasts, we therefore need to ask: what truly adds &lt;strong&gt;real value&lt;/strong&gt; to a company? Let’s climb the AI Application Value Ladder 🤖🪜, a mental framework where we balance implementation difficulty against a company's unique selling point (USP).&lt;/p&gt;

&lt;h2&gt;
  
  
  Value Level 1: Custom Instructions &amp;amp; Prompt Engineering
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Difficulty&lt;/strong&gt;: Easy&lt;br&gt;
&lt;strong&gt;Value&lt;/strong&gt;: Low&lt;br&gt;
&lt;strong&gt;Team Required&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Domain Experts&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Software Developers&lt;/strong&gt;&lt;/em&gt;: None&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;ML Developers&lt;/strong&gt;&lt;/em&gt;: None&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Eval QA&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9oraf8ii3ykuwsrohvb.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9oraf8ii3ykuwsrohvb.jpg" alt="Value Level 1" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At this initial level, we focus on customizing AI models to access proprietary data or mimic specific personalities. This is basic and straightforward, often involving system prompts via GUIs or APIs and ChatGPT custom instructions. While valuable for specific purposes, its overall impact is limited.&lt;/p&gt;

&lt;h2&gt;
  
  
  Value Level 2: Function Calling
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Difficulty&lt;/strong&gt;: Medium - Hard&lt;br&gt;
&lt;strong&gt;Value&lt;/strong&gt;: Medium - High - Very High&lt;br&gt;
&lt;strong&gt;Team Required&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Domain Experts&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Software Developers&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;ML Developers&lt;/strong&gt;&lt;/em&gt;: None&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Eval QA&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvzg17w4k0u7ztdh9szy.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvzg17w4k0u7ztdh9szy.jpg" alt="Value Level 2" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here, AI models execute software actions predefined by human programmers. This step involves bridging structured software functionality with the more vague data handling of large language models (LLMs). It's a significant step up in both complexity and value.&lt;/p&gt;

&lt;p&gt;For more information, I have a whole blog post &lt;a href="https://dev.to/dawiddahl/function-calling-the-most-significant-ai-feature-since-chatgpt-itself-81m"&gt;here&lt;/a&gt; on Function Calling.&lt;/p&gt;
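&lt;p&gt;The core mechanic is small enough to sketch. In the toy TypeScript below, the "model output" is hard-coded rather than coming from a real LLM API; the point is that the model only &lt;em&gt;chooses&lt;/em&gt; a function and its arguments, while ordinary software executes it:&lt;/p&gt;

```typescript
// The tools the developer exposes to the model.
const tools: { [name: string]: (args: { text: string }) => string } = {
  // A predefined action the model can request, but never executes itself.
  shout: ({ text }) => text.toUpperCase() + "!!!",
};

// What a function-calling model returns: a tool name plus JSON arguments.
// Hard-coded here for illustration; in reality this comes back from the LLM API.
const modelOutput = {
  name: "shout",
  arguments: JSON.stringify({ text: "function calling is here" }),
};

// The bridge: ordinary software parses the model's choice and runs it.
const tool = tools[modelOutput.name];
const result = tool(JSON.parse(modelOutput.arguments));

console.log(result); // FUNCTION CALLING IS HERE!!!
```

&lt;p&gt;The result is then handed back to the model, which weaves it into its reply.&lt;/p&gt;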

&lt;h2&gt;
  
  
  Value Level 3: Basic RAG (Retrieval Augmented Generation)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Difficulty&lt;/strong&gt;: Easy - Medium&lt;br&gt;
&lt;strong&gt;Value&lt;/strong&gt;: Low - High&lt;br&gt;
&lt;strong&gt;Team Required&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Domain Experts&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Software Developers&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;ML Developers&lt;/strong&gt;&lt;/em&gt;: None&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Eval QA&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fory7r2vp1ebn80fw0yol.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fory7r2vp1ebn80fw0yol.jpg" alt="Value Level 3" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Basic RAG is employed when an AI model, through &lt;a href="https://en.wikipedia.org/wiki/Semantic_search" rel="noopener noreferrer"&gt;semantic search&lt;/a&gt;, retrieves &lt;em&gt;proprietary&lt;/em&gt; data or context (information that the base model doesn't know), which is stored in a so-called vector database. &lt;/p&gt;

&lt;p&gt;It helps reduce hallucinations (inaccurate or fictional outputs). One example is the ARC AI Portal, an internal app my company built: after corporate conventions, people could ask questions in near-real-time about what the speakers had said.&lt;/p&gt;

&lt;p&gt;However, it's complex, unpredictable, and rather hacky as it's not genuinely machine learning-based; we're not actually teaching a model how to do something.&lt;/p&gt;
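&lt;p&gt;The retrieval step itself can be sketched in a few lines of TypeScript. The "embeddings" here are tiny made-up vectors rather than real embedding-model output; they only illustrate the cosine-similarity search a vector database performs:&lt;/p&gt;

```typescript
type Doc = { text: string; embedding: number[] };

// Toy "vector database": in reality, embeddings come from an embedding model.
const db: Doc[] = [
  { text: "The keynote covered our company AI strategy.", embedding: [0.9, 0.1, 0.0] },
  { text: "Lunch was served in the main hall.", embedding: [0.0, 0.2, 0.9] },
];

const cosine = (a: number[], b: number[]): number => {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
};

// Semantic search: embed the query, then rank documents by similarity.
const queryEmbedding = [0.8, 0.2, 0.1]; // made-up embedding of "What was the AI strategy?"
const best = [...db].sort(
  (a, b) => cosine(b.embedding, queryEmbedding) - cosine(a.embedding, queryEmbedding)
)[0];

// The retrieved text is then prepended to the LLM prompt as context.
console.log(best.text);
```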

&lt;h2&gt;
  
  
  Value Level 4: Advanced RAG
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Difficulty&lt;/strong&gt;: Hard - Very Hard&lt;br&gt;
&lt;strong&gt;Value&lt;/strong&gt;: High - Very High&lt;br&gt;
&lt;strong&gt;Team Required&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Domain Experts&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Software Developers&lt;/strong&gt;&lt;/em&gt;: Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;ML Developers&lt;/strong&gt;&lt;/em&gt;: None&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Eval QA&lt;/strong&gt;&lt;/em&gt;: Intermediate or Senior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fih2xu273z3w3063r1l5j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fih2xu273z3w3063r1l5j.png" alt="Value Level 4" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Advanced RAG steps up the complexity with summary queries, re-ranking, and multi-step RAG pipelines, like those used in the data framework library &lt;a href="https://www.llamaindex.ai/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt;. While offering high value, it's expensive, notoriously tricky to get right, slow, and still not a true ML application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Value Level 5: Fine-tuning
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Difficulty&lt;/strong&gt;: Very Hard&lt;br&gt;
&lt;strong&gt;Value&lt;/strong&gt;: High - Very High&lt;br&gt;
&lt;strong&gt;Team Required&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Domain Experts&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Software Developers&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;ML Developers&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Eval QA&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbayzxk28wfhcsyqj0yfu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbayzxk28wfhcsyqj0yfu.png" alt="Value Level 5" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Used in actual ML applications, fine-tuning is key for giving an AI model unique abilities or styles. OpenAI's Function Calling behaviour itself is a good example of how a model can learn to use different tools effectively through fine-tuning.&lt;/p&gt;

&lt;p&gt;This process is less about accessing proprietary data (as in RAG) and more about training the model in a specific manner. In contrast to levels 2, 3, and 4 which can be achieved by programming, this level requires machine learning knowledge and the skills to gather and clean high-quality datasets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Value Level 6: ML/Programmer Multi-Model Hybrid
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Difficulty&lt;/strong&gt;: Hardest&lt;br&gt;
&lt;strong&gt;Value&lt;/strong&gt;: Highest&lt;br&gt;
&lt;strong&gt;Team Required&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Domain Experts&lt;/strong&gt;&lt;/em&gt;: Junior, Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Software Developers&lt;/strong&gt;&lt;/em&gt;: Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;ML Developers&lt;/strong&gt;&lt;/em&gt;: Intermediate or Senior&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Eval QA&lt;/strong&gt;&lt;/em&gt;: Intermediate or Senior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4tel29ji2g8kfs644d2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4tel29ji2g8kfs644d2.jpg" alt="Value Level 6" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The pinnacle of the AI Application Value Ladder 🤖🪜 involves creating multi-model AI systems, combining the previous levels. This method integrates various models of different sizes and merges software with ML development, leading to advanced, performant, and cost-efficient systems. &lt;/p&gt;

&lt;p&gt;An example is &lt;a href="https://www.figma.com/community/plugin/747985167520967365/builder-io-ai-powered-figma-to-code-react-vue-tailwind-more" rel="noopener noreferrer"&gt;Builder.io&lt;/a&gt;'s translation of Figma designs into code. Rather than relying solely on the more expensive and slower GPT-4, they effectively segmented their challenges, applying smaller, faster, fine-tuned models to each, in combination with RAG and regular programming.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The AI Application Value Ladder 🤖🪜 serves as a guide to understanding the varied levels of value creation in AI development. It outlines how each step, from basic prompt engineering to complex multi-model systems, contributes differently to a company's AI capabilities.&lt;/p&gt;

&lt;p&gt;As the field of AI continues to evolve rapidly, embracing agents and multi-sense models, having a general framework like the 🤖🪜 is crucial. It helps in discerning which innovations truly advance our capabilities, ensuring we stay ahead in a landscape of constant change.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2yuzla7gmf9i9miai7o.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2yuzla7gmf9i9miai7o.jpg" alt="Outro image - AI Application Ladder" width="800" height="1400"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Dawid Dahl is a full-stack developer at &lt;a href="https://www.umain.com/" rel="noopener noreferrer"&gt;UMAIN&lt;/a&gt; | &lt;a href="https://arc.inc/" rel="noopener noreferrer"&gt;ARC&lt;/a&gt;. In his free time, he enjoys philosophy, analog synthesizers, consciousness, techno, Huayan and Madhyamika Prasangika, and being with friends and family.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>development</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Meet Your Future Co-workers: The Rise of AI Agents in the Office</title>
      <dc:creator>Dawid Dahl</dc:creator>
      <pubDate>Wed, 15 Nov 2023 10:40:23 +0000</pubDate>
      <link>https://forem.com/dawiddahl/meet-your-future-co-workers-the-rise-of-ai-agents-in-the-office-441m</link>
      <guid>https://forem.com/dawiddahl/meet-your-future-co-workers-the-rise-of-ai-agents-in-the-office-441m</guid>
      <description>&lt;p&gt;OpenAI's &lt;a href="https://devday.openai.com/" rel="noopener noreferrer"&gt;Dev Day&lt;/a&gt; has concluded, bringing a host of exciting announcements such as a longer context window (128k), the unification of all their tools into a single model, reduced prices, and much more.&lt;/p&gt;

&lt;p&gt;But in this flurry of new and shiny releases, I think many might have missed one of the most striking things that Sam Altman (founder of OpenAI) said: &lt;/p&gt;

&lt;p&gt;"&lt;em&gt;Now I want to talk about where we are headed, and the main reason for why we are here today. Starting now, we're taking our first small step that is taking us closer to a future of agents.&lt;/em&gt;"&lt;/p&gt;

&lt;h2&gt;
  
  
  Emergence of Autonomous Digital Beings
&lt;/h2&gt;

&lt;p&gt;Agents? What's an agent? Sam Altman's mention of &lt;strong&gt;GPTs&lt;/strong&gt; (&lt;a href="https://openai.com/blog/introducing-gpts" rel="noopener noreferrer"&gt;link&lt;/a&gt;) and &lt;strong&gt;Assistant API&lt;/strong&gt; (&lt;a href="https://platform.openai.com/docs/assistants/overview" rel="noopener noreferrer"&gt;link&lt;/a&gt;), also released on Dev Day, isn't just about enabling the creation of advanced chatbots with capabilities like visual perception 👁️, image generation 🖼️, and interactive functionalities 🦾. It's a nod towards a more profound shift.&lt;/p&gt;

&lt;p&gt;He is primarily highlighting that his company is currently laying the groundwork for a world inhabited by digital entities, capable of functioning with varying levels of autonomy.&lt;/p&gt;

&lt;p&gt;Having thought long and hard about what this all means, I now want to paint a picture of this near (2-3 years) future. &lt;/p&gt;

&lt;p&gt;While my perspective is that of a developer, the revolution of autonomous and multi-sensory AI agents is set to redefine workplaces across a wide range of professions. Let’s explore what this could look like!&lt;/p&gt;

&lt;h2&gt;
  
  
  A Day in the Life with AI Colleagues
&lt;/h2&gt;

&lt;p&gt;You arrive at the office and grab yourself a perfect cup of coffee. &lt;/p&gt;

&lt;p&gt;As you get to your desk, you're greeted by a gathering of dedicated, albeit unusual, collaborators. One perched on your monitor, another nestled beside your keyboard, a third mid-air displaying analytics, and many more around, all eager to assist you with the tasks of the day.&lt;/p&gt;

&lt;p&gt;Many of them have been working all night—in fact most never sleep at all—and are ready with feedback, status reports and improvements for you to review and hopefully accept while enjoying that coffee.&lt;/p&gt;

&lt;p&gt;Let's meet the team, shall we?&lt;/p&gt;

&lt;h3&gt;
  
  
  Philosopher King - Lion
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;Cloud&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3kb3qdyjbbaxx6oenz05.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3kb3qdyjbbaxx6oenz05.jpg" alt="Philosopher King - Lion" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Just like a lion is the king of the animal kingdom, this agent is in charge of your overall AI assistant collective, and offers wise oversight based on philosophical and ethical principles. &lt;/p&gt;

&lt;p&gt;Governing and guiding the mission based on your—the human’s—goals and vision, the Philosopher King lion makes sure things do not descend into madness, chaos or nonsense. &lt;/p&gt;

&lt;p&gt;Has the power to turn on, promote, demote, or turn off other agents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unit Tester - Woodpecker
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;Project terminal, GitHub repo&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqzjc8jxspdcqll8wgzc2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqzjc8jxspdcqll8wgzc2.jpg" alt="Unit Tester - Woodpecker" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Unit Tester AI diligently checks each small unit of your code for bugs, mirroring the meticulous tapping of a woodpecker on trees to find insects.&lt;/p&gt;

&lt;h3&gt;
  
  
  Integration Tester - Spider
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;Project terminal, GitHub repo&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0zsyoun2iv3d4mppig1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0zsyoun2iv3d4mppig1.jpg" alt="Integration Tester - Spider" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Spiders weave complex webs where each thread is connected, much like how an Integration Tester agent would ensure that different pieces of your application work together seamlessly.&lt;/p&gt;

&lt;h3&gt;
  
  
  End-to-End Tester - Dolphin
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;Monitor, Browser&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff1ta7qr1rae6oyh8gyt5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff1ta7qr1rae6oyh8gyt5.jpg" alt="End-to-End Tester - Dolphin" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dolphins are known for their intelligence and comprehensive hunting strategies, akin to how an End-to-End Tester AI would smartly navigate through the entire application to verify a complete user experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Refactorer - Beaver
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;GitHub repo, Code-editor&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qvxtu2esi9ebuq4flie.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qvxtu2esi9ebuq4flie.jpg" alt="Refactorer - Beaver" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Beavers are natural builders who constantly modify their dams; a Refactorer AI would from time to time similarly offer ways to reshape and optimize your codebase for better flow and efficiency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug Hunter - Anteater
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;GitHub repo&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frin3kmikwn9ackwkul43.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frin3kmikwn9ackwkul43.jpg" alt="Bug Hunter - Anteater" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With its keen sense for sniffing out prey, the anteater represents the Bug Hunter AI agent, which tracks down code smells and eradicates errors in your repo.&lt;/p&gt;

&lt;h3&gt;
  
  
  Spec Guardian - Elephant
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;GitHub repo&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8t8pretrdvdqb95ng2tn.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8t8pretrdvdqb95ng2tn.jpg" alt="Spec Guardian - Elephant" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Elephants are known for their exceptional memory, much like a Spec Guardian assistant that would keep track of and enforce the software specifications and standards.&lt;/p&gt;

&lt;h3&gt;
  
  
  Team: Security Auditor - Owl &amp;amp; Performance Optimizer - Cheetah
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;Project terminal, GitHub repo, Browser console&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe27ww6gwj93rd0xnj8jz.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe27ww6gwj93rd0xnj8jz.jpg" alt="Security Auditor - Owl &amp;amp; Performance Optimizer - Cheetah" width="800" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Right beside your workspace, you often find an owl and a cheetah, an unusual pair. &lt;/p&gt;

&lt;p&gt;The owl, with its exceptional vision, acts as a Security Auditor AI, constantly alert for vulnerabilities, while the cheetah, embodying a Performance Optimizer AI, ensures your application runs at great speed. Together, they exemplify a perfect balance of security and efficiency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Documentation Author - Honeybee
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;GitHub repo, Google Drive&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3d3eblplgq6ln940wr9.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3d3eblplgq6ln940wr9.jpg" alt="Documentation Author - Honeybee" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Much like honeybees create structured honeycombs, a Documentation Author agent crafts organized and detailed documentation, ensuring clarity and ease of access for developers and users alike.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code Reviewer - Meerkat
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;GitHub repo, Code-editor&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbsv670kgegv7ookyb7gp.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbsv670kgegv7ookyb7gp.jpg" alt="Code Reviewer - Meerkat" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Vigilant and social, meerkats take turns watching for danger while others work, similar to a Code Reviewer AI’s role in critically examining code before it has a chance to turn into spaghetti. Often even before you have a chance to press save!&lt;/p&gt;

&lt;p&gt;Works in close collaboration with the testing team: the woodpecker, the spider, and the dolphin.&lt;/p&gt;

&lt;h3&gt;
  
  
  Customer Support - Golden Retriever
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Habitat: &lt;em&gt;Deployed project, Website&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmdhfrv9f8fo9rzfls93i.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmdhfrv9f8fo9rzfls93i.jpg" alt="Customer Support - Golden Retriever" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Just as a Golden Retriever is known for its loyalty and friendliness, this AI assists your users or stakeholders day and night with your creations, sparing you the effort.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tech Empowering AI Co-workers
&lt;/h2&gt;

&lt;p&gt;This is just a sneak peek of some of the specialized agents that will be available to assist you in the future. I don't know about you, but personally, I get super excited about this future workplace! It's amazing how this is not just distant science-fiction speculation, but our actual and fast-approaching reality.&lt;/p&gt;

&lt;p&gt;So what are the upcoming AI technologies that will make our new AI assistant friends come alive?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Easy Agent Creation&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With OpenAI's GPTs and Assistants API, this has actually already started.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Larger context window&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When AI assistants can handle and retain vast amounts of data, they will be capable of integrating seamlessly into environments like a GitHub repository, an Adobe software suite, or an Asana project.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Vision&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Improved AI vision will transform their interaction from purely text-based to visual. For instance, without good "eye"-sight 👁️, our E2E Dolphin couldn't monitor actual computer screens, click around the browser like a real user, or spot bugs as they occur and take action accordingly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Lower Costs and Increased Speed&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These key factors make AI more accessible and efficient, crucial for widespread adoption.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Continuous Existence and Memory&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unlike current ChatGPT, the AI agents in our story maintain an ongoing digital awareness of both self and environment. They therefore recall past interactions and continue their "life" over time; they are not just waiting for the next human prompt.&lt;/p&gt;

&lt;p&gt;This continuous existence—which is what Sam Altman was hinting at in his keynote speech—coupled with more advanced memory management, will elevate them from mere static chatbots to dynamic collaborators, capable of growing with each new experience.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Greater Intelligence&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With enhanced intelligence, these AI assistants will be able to cooperate in a dynamic network together with other agents, guided by strategic directives like those from the Philosopher King lion.&lt;/p&gt;

&lt;p&gt;Just like the animals in our story found themselves in a scenario requiring a general contextual understanding and collaborative problem-solving skills, this greater reasoning ability of future AI models will prove to be key.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Future of Work Awaits
&lt;/h2&gt;

&lt;p&gt;In the new world of work, AI agents like the ones in our story will not be just tools; they will be co-workers. Embrace this change, stay curious, and be ready to work alongside them.&lt;/p&gt;




&lt;p&gt;Dawid Dahl is a full-stack developer at &lt;a href="https://www.umain.com/" rel="noopener noreferrer"&gt;UMAIN&lt;/a&gt; | &lt;a href="https://arc.inc/" rel="noopener noreferrer"&gt;ARC&lt;/a&gt;. In his free time, he enjoys philosophy, analog synthesizers, consciousness, and being with friends and family.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>code</category>
      <category>openai</category>
    </item>
    <item>
      <title>Function Calling: The Most Significant AI Feature Since ChatGPT Itself?</title>
      <dc:creator>Dawid Dahl</dc:creator>
      <pubDate>Thu, 07 Sep 2023 10:52:12 +0000</pubDate>
      <link>https://forem.com/dawiddahl/function-calling-the-most-significant-ai-feature-since-chatgpt-itself-81m</link>
      <guid>https://forem.com/dawiddahl/function-calling-the-most-significant-ai-feature-since-chatgpt-itself-81m</guid>
      <description>&lt;p&gt;A few months ago, &lt;a href="https://www.umain.com/" rel="noopener noreferrer"&gt;Umain&lt;/a&gt;, the company I work for, organized a hackathon. The goal? To create tech products that harness the power of AI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6pp1487i3f2alzrwfyw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6pp1487i3f2alzrwfyw.png" alt="umain tech accelerator hackathon" width="800" height="621"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Although the hackathon yielded some promising results, it also revealed a fundamental obstacle stifling further breakthroughs in AI application development: the wide gap between unstructured and structured data.&lt;/p&gt;

&lt;p&gt;To better cement the concepts of unstructured and structured data into memory, I will from now on refer to these as &lt;strong&gt;Vague-Ass&lt;/strong&gt; stuff and &lt;strong&gt;Hard-Ass&lt;/strong&gt; stuff.&lt;/p&gt;

&lt;p&gt;First, I will clarify what I mean by &lt;strong&gt;Vague-Ass&lt;/strong&gt; and &lt;strong&gt;Hard-Ass&lt;/strong&gt;, because grasping this dichotomy is crucial for understanding the utility of Function Calling, a brand-new feature from OpenAI that addresses all the challenges we faced during the hackathon.&lt;/p&gt;

&lt;h2&gt;
  
  
  Vague-Ass Stuff
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Vague-Ass&lt;/strong&gt; stuff represents the nebulous, the immeasurable—things that can't be captured in algorithms or databases.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7j6l2160xmzqwcyehf46.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7j6l2160xmzqwcyehf46.jpeg" alt="informal emotions" width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;I used a new AI product to expand the image in the middle out to the sides. The result turned out great.
  &lt;/p&gt;

&lt;p&gt;Here, you'll find:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Human consciousness&lt;/li&gt;
&lt;li&gt;Emotions; desires and fears&lt;/li&gt;
&lt;li&gt;Social norms; cultural nuances&lt;/li&gt;
&lt;li&gt;The dreams of clients and stakeholders&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this realm, humans communicate using &lt;strong&gt;natural language&lt;/strong&gt; — just everyday conversation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hard-Ass Stuff
&lt;/h2&gt;

&lt;p&gt;Conversely, &lt;strong&gt;Hard-Ass&lt;/strong&gt; stuff is the domain governed by rules and structured methodologies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcx4dz0pnhhatfp558ntf.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcx4dz0pnhhatfp558ntf.jpeg" alt="formal computers and robots" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This category includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mathematics &amp;amp; Logic&lt;/li&gt;
&lt;li&gt;Computation&lt;/li&gt;
&lt;li&gt;Machines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The language of choice here is &lt;strong&gt;structured language&lt;/strong&gt;—code, logic, and equations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bridging the Vague-Ass / Hard-Ass Gap (VHG)
&lt;/h2&gt;

&lt;p&gt;Bridging the &lt;strong&gt;Vague-Ass&lt;/strong&gt; / &lt;strong&gt;Hard-Ass&lt;/strong&gt; Gap (VHG for short) has brought some revolutionary improvements for human civilization. It has enabled automation, economic growth, and improved connectivity, making life significantly better.&lt;/p&gt;

&lt;p&gt;Many of the professions we have today—mathematician, structural engineer, software developer—exist to serve as mediators between these two realms.&lt;/p&gt;

&lt;p&gt;In fact, I realized that what we developers actually do all day is transform &lt;strong&gt;Vague-Ass&lt;/strong&gt; stuff into &lt;strong&gt;Hard-Ass&lt;/strong&gt; stuff and then back into &lt;strong&gt;Vague-Ass&lt;/strong&gt; stuff.&lt;/p&gt;

&lt;p&gt;Let me show you:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qk2l9c2m9xrtvs4quiq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qk2l9c2m9xrtvs4quiq.png" alt="developer workflow" width="800" height="806"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Information flows from the &lt;strong&gt;Vague-Ass&lt;/strong&gt; dimension of unstructured feelings &lt;br&gt;
→ using some kind of form to collect &lt;strong&gt;Hard-Ass&lt;/strong&gt; structured data &lt;br&gt;
→ doing something with that data &lt;br&gt;
→ and then handing it back to the &lt;strong&gt;Vague-Ass&lt;/strong&gt; dimension where some value has hopefully been created, in terms of good feelings.&lt;/p&gt;

&lt;p&gt;This is how developers have always worked. And the question we were trying to explore at the hackathon was how to integrate the new and transformative potential of AI into this workflow.&lt;/p&gt;
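
&lt;p&gt;A minimal sketch of that loop in Python (the form fields and the 25% tax are invented purely for illustration):&lt;/p&gt;

```python
# Vague-Ass → Hard-Ass → Vague-Ass, in three small steps.

def collect(form_input: dict) -> dict:
    # Step 1: free-form form input is coerced into Hard-Ass structured data.
    return {"name": form_input["name"].strip(), "budget": int(form_input["budget"])}

def process(data: dict) -> dict:
    # Step 2: do something with the structured data (here: add 25% tax).
    data["total"] = round(data["budget"] * 1.25)
    return data

def present(data: dict) -> str:
    # Step 3: hand the result back to the Vague-Ass dimension as a friendly message.
    return f"Thanks {data['name']}! Your total comes to {data['total']}."

print(present(process(collect({"name": " Dawid ", "budget": "100"}))))
# → Thanks Dawid! Your total comes to 125.
```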

&lt;h2&gt;
  
  
  The ChatGPT Challenge: A Dead-End at the VHG
&lt;/h2&gt;

&lt;p&gt;So here I was, presenting our new AI code-smell-detector app. Yet, something was nagging me: the limitations of using ChatGPT for &lt;em&gt;reliable&lt;/em&gt; VHG bridging.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3b1schlnbxeht1oc14j2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3b1schlnbxeht1oc14j2.jpg" alt="umain hackathon presentation" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What does that mean? ChatGPT deals in &lt;strong&gt;Vague-Ass&lt;/strong&gt; stuff using natural language, and therefore it struggles when it attempts to interface with the strictly &lt;strong&gt;Hard-Ass&lt;/strong&gt; stuff that our applications require to function correctly.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Specifically, here is where we as developers are struggling:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Collect &lt;strong&gt;Hard-Ass&lt;/strong&gt; structured data with ChatGPT&lt;/li&gt;
&lt;li&gt;Extract &lt;strong&gt;Hard-Ass&lt;/strong&gt; structured data from ChatGPT&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;Despite my team and colleagues creating some remarkable prototypes, we were often hamstrung by these limitations. No matter how nicely we asked and pleaded with the AI, it just wouldn't do what we needed it to in order to interact with our apps.&lt;/p&gt;

&lt;p&gt;Well, maybe 90% or even 99% of the time it would actually do what we wanted. But when writing software, such odds are often not acceptable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffofaqccfhxwvi1cv119g.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffofaqccfhxwvi1cv119g.jpeg" alt="human and robot struggling to connect" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This significantly restricted the potential of every team at the hackathon. And ever since the release of ChatGPT, this has pretty much been the status quo for AI app development.&lt;/p&gt;

&lt;p&gt;Until now...&lt;/p&gt;

&lt;h2&gt;
  
  
  The Artificial Intelligence Bridge
&lt;/h2&gt;

&lt;p&gt;In June 2023, OpenAI suddenly released Function Calling for ChatGPT. What does it do? Basically, it solves every single issue that I have been writing about up to this point.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjk0ifeiu9qxobzte7apg.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjk0ifeiu9qxobzte7apg.jpg" alt="open-ai function calling bridge" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's why I believe it might be the most significant feature since ChatGPT itself was released. Bridging the gap between &lt;strong&gt;Vague-Ass&lt;/strong&gt; stuff and &lt;strong&gt;Hard-Ass&lt;/strong&gt; stuff means that we will be able to take everything we already excel at as developers, and plug it straight into the promising land of generative AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  So How Does It Work?
&lt;/h2&gt;

&lt;p&gt;Contrary to what its name suggests, ChatGPT will never actually call your functions. In fact, for Function Calling to work, there don't even have to &lt;em&gt;be&lt;/em&gt; any real functions!&lt;/p&gt;

&lt;p&gt;Essentially, all it does is attempt to generate the parameters for &lt;em&gt;hypothetical&lt;/em&gt; or &lt;em&gt;potential&lt;/em&gt; functions, which you describe to ChatGPT using a &lt;a href="https://json-schema.org/" rel="noopener noreferrer"&gt;JSON schema&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To break it down, this is what is now possible:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Convert &lt;strong&gt;Vague-Ass&lt;/strong&gt; natural language into &lt;strong&gt;Hard-Ass&lt;/strong&gt; parameters. Once you have &lt;strong&gt;Hard-Ass&lt;/strong&gt; parameters, you are able to do whatever you want inside your function and return it in whatever way you'd like.&lt;/li&gt;
&lt;li&gt;Summarize the &lt;strong&gt;Hard-Ass&lt;/strong&gt; data you've generated inside your function back to the user in &lt;strong&gt;Vague-Ass&lt;/strong&gt; natural language.&lt;/li&gt;
&lt;li&gt;Extract &lt;strong&gt;Hard-Ass&lt;/strong&gt; structured data from &lt;strong&gt;Vague-Ass&lt;/strong&gt; ChatGPT computations. By describing exactly what kind of parameters you want (for example, an array of any length with objects), you can reliably return ChatGPT data in your desired format. How? By simply collecting the parameters and returning them.&lt;/li&gt;
&lt;/ul&gt;
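
&lt;p&gt;To make this concrete, here is a minimal Python sketch (the pinecone function and the hard-coded reply are hypothetical; only the shape of the &lt;code&gt;functions&lt;/code&gt; schema and the &lt;code&gt;function_call&lt;/code&gt; arguments-as-a-JSON-string follow OpenAI's Function Calling format):&lt;/p&gt;

```python
import json

# A hypothetical local function we want ChatGPT to "call".
def get_pinecone_count(location: str, unit: str = "single") -> dict:
    return {"location": location, "unit": unit, "count": 42}

# The JSON schema sent along with the chat request. ChatGPT never runs
# the function itself; it only returns structured arguments matching this.
functions = [{
    "name": "get_pinecone_count",
    "description": "Count the pinecones at a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["single", "dozen"]},
        },
        "required": ["location"],
    },
}]

# A reply of the kind the model sends back: the arguments arrive as a
# JSON string conforming to the schema (hard-coded here for illustration).
model_reply = {
    "function_call": {
        "name": "get_pinecone_count",
        "arguments": '{"location": "Stockholm office", "unit": "single"}',
    }
}

# Hard-Ass structured data, extracted from a Vague-Ass conversation.
args = json.loads(model_reply["function_call"]["arguments"])
result = get_pinecone_count(**args)
print(result["count"])  # → 42
```

&lt;p&gt;In a real app you would pass &lt;code&gt;functions&lt;/code&gt; with your chat completion request, run the matching local function yourself with the parsed arguments, and optionally feed its return value back to the model for a natural-language summary.&lt;/p&gt;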

&lt;p&gt;To explain how Function Calling works using an example, I have made this illustration:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi.ibb.co%2Ft4rW5V6%2FUntitled-2023-08-23-1941-big.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi.ibb.co%2Ft4rW5V6%2FUntitled-2023-08-23-1941-big.jpg" alt="open-ai function calling example" width="800" height="839"&gt;&lt;/a&gt;&lt;br&gt;Please excuse the pinecone references, it is an internal joke. Pinecones are not actually an integral part of our corporate structure.
  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://excalidraw.com/#json=S1eyXovleUtX7Ie-svvYe,UucBIXmU9ChCOx3te_YKKQ" rel="noopener noreferrer"&gt;Here&lt;/a&gt; or &lt;a href="https://i.ibb.co/t4rW5V6/Untitled-2023-08-23-1941-big.jpg" rel="noopener noreferrer"&gt;here&lt;/a&gt; or is a link to the actual illustration since the image above is too small.&lt;/p&gt;

&lt;h2&gt;
  
  
  In Conclusion
&lt;/h2&gt;

&lt;p&gt;The advent of Function Calling has some amazing implications for the future of AI apps. It stands as a powerful tool for developers to better serve as the bridge between &lt;strong&gt;Hard-Ass&lt;/strong&gt; stuff and &lt;strong&gt;Vague-Ass&lt;/strong&gt; stuff. &lt;/p&gt;

&lt;p&gt;This breakthrough not only addresses existing challenges but also opens new avenues for innovative applications.&lt;/p&gt;




&lt;p&gt;Dawid Dahl is a full-stack developer at &lt;a href="https://www.umain.com/" rel="noopener noreferrer"&gt;UMAIN&lt;/a&gt; | &lt;a href="https://arc.inc/" rel="noopener noreferrer"&gt;ARC&lt;/a&gt;. In his free time, he enjoys philosophy, analog synthesizers, consciousness, and being with friends and family.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>chatgpt</category>
      <category>development</category>
      <category>openai</category>
    </item>
    <item>
      <title>AI Is Changing The Way We Code: AI-Driven Development (AIDD)</title>
      <dc:creator>Dawid Dahl</dc:creator>
      <pubDate>Thu, 09 Mar 2023 07:49:32 +0000</pubDate>
      <link>https://forem.com/dawiddahl/ai-is-changing-the-way-we-code-ai-driven-development-aidd-2ngo</link>
      <guid>https://forem.com/dawiddahl/ai-is-changing-the-way-we-code-ai-driven-development-aidd-2ngo</guid>
      <description>&lt;p&gt;These days, it appears that anyone can write code. You’ve probably seen plenty of videos online of non-developers using AI to generate working scripts in no time. Code that actually seems to work! It’s like magic.&lt;/p&gt;

&lt;p&gt;And it &lt;em&gt;is&lt;/em&gt; kind of like magic. These next-generation models like ChatGPT emerged onto the scene in late 2022 and blew almost everyone’s minds; even non-technical people were amazed by all the things AI could do, coding being just one of them.&lt;/p&gt;

&lt;p&gt;And not only that, it is getting better and smarter by the month. The pace of improvement is absolutely astounding. It seems everything is about to change.&lt;/p&gt;

&lt;h2&gt;
  
  
  It's a Br-AI-ve New World
&lt;/h2&gt;

&lt;p&gt;But wait… doesn’t that mean developers are about to be replaced by AI?&lt;/p&gt;

&lt;p&gt;Surely if grandma can now divide-and-conquer a data structure using an AI-generated merge sort algorithm in O(n log n) time without breaking a sweat, how could it not mean exactly that? &lt;/p&gt;

&lt;p&gt;Maybe it’d be smart to start looking elsewhere for work. How about gardening? Plumbing? Yoga instructor influencer? Anywhere where we might shield ourselves from this sudden AI disruption.&lt;/p&gt;

&lt;p&gt;Well, hold on just a moment. In this new world, code can indeed be generated by AI. Great code, too. But here’s the kicker: how can we gain &lt;em&gt;confidence&lt;/em&gt; that this code will actually do what it is supposed to do, in a good way, and in an ever-changing environment? Especially within the context of a software system consisting of other AI-generated code, with which it is going to have to integrate.&lt;/p&gt;

&lt;p&gt;How can we as developers harness and benefit from the vast intelligence of these new AI companions, while also ensuring that our customers’ complex software systems remain maintainable, scale well, and function without bugs?&lt;/p&gt;

&lt;p&gt;That is the problem that AIDD solves. &lt;/p&gt;

&lt;p&gt;In this article and its accompanying video, we'll explore the eight core steps of AI-Driven Development and see how this workflow has the potential to supercharge your life as a developer. And how it does so not by fighting against the AI, nor by surrendering to it, but by joining forces with it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Value Of Human Development Skills
&lt;/h2&gt;

&lt;p&gt;But before we do, a word for the developers out there who might still be concerned about the rise of Artificial Intelligence and how it could affect their job security: what exactly are the reasons why your skills and creativity as a developer will remain crucial to the software construction process?&lt;/p&gt;

&lt;p&gt;Here are just five of the many exciting responsibilities you can expect as a future AIDD developer. Some you will be familiar with, some are brand new.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Be the one who has a deep and thorough understanding of the vast space of vague human customer requirements.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Carefully guide the AI through the AIDD process. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create your application’s &lt;a href="https://martinfowler.com/articles/practical-test-pyramid.html" rel="noopener noreferrer"&gt;testing pyramid&lt;/a&gt;. From unit to integration to E2E tests. Yes, the AI can and will help you out here, but you are ultimately responsible for the health and maintenance of the test suite.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flord2czy40yx9fm525ov.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flord2czy40yx9fm525ov.jpg" alt="Testing pyramid" width="800" height="393"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Repair and modify systems built using AIDD.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use your skills as a developer to make sure the system fulfils customer requirements while also respecting the many principles of professional software construction:&lt;br&gt;
 &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Correct (&lt;em&gt;&lt;strong&gt;vs buggy, crashing, doing nothing, or not executing&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Stable (&lt;em&gt;&lt;strong&gt;vs brittle, or non-deterministic&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Readable (&lt;em&gt;&lt;strong&gt;vs unreadable, or obfuscating&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Testable (&lt;em&gt;&lt;strong&gt;vs tightly coupled&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Scalable (&lt;em&gt;&lt;strong&gt;vs bottlenecked&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Extensible (&lt;em&gt;&lt;strong&gt;vs closed to change&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Flexible (&lt;em&gt;&lt;strong&gt;vs rigid&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Reusable (&lt;em&gt;&lt;strong&gt;vs single-purpose&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Cohesive (&lt;em&gt;&lt;strong&gt;vs low cohesion&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Maintainable (&lt;em&gt;&lt;strong&gt;vs not DRY, no documentation, dead code, etc&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Performant (&lt;em&gt;&lt;strong&gt;vs slow, and/or expensive&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Secure (&lt;em&gt;&lt;strong&gt;vs insecure&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Usable (&lt;em&gt;&lt;strong&gt;vs frustrating user experience&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Accessible (&lt;em&gt;&lt;strong&gt;vs only for privileged users&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Portable (&lt;em&gt;&lt;strong&gt;vs tied to a specific medium&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Consistent (&lt;em&gt;&lt;strong&gt;vs paradigms, coding styles, or formatting mixed together without forethought&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Minimal (&lt;em&gt;&lt;strong&gt;vs over-engineered&lt;/strong&gt;&lt;/em&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;These five competencies (dealing with human customers and their ambiguous requirements, guiding the AI, creating and maintaining the test suite, repairing and modifying AI-created systems, and applying the wisdom of high-level software principles) are just some of the reasons why you will still be valuable in the workplace as a developer.&lt;/p&gt;

&lt;p&gt;For the foreseeable future, it is reasonable to assume that many of these skills won’t be taken over by AI.&lt;/p&gt;
&lt;h2&gt;
  
  
  So What Is AI-Driven Development?
&lt;/h2&gt;

&lt;p&gt;AIDD follows the 'red, green, refactor' cycle, just like Test-Driven Development (TDD). It also employs the technique of writing tests &lt;em&gt;before&lt;/em&gt; implementation code. In a way, we can say that AIDD is like a futuristic extension of TDD.&lt;/p&gt;

&lt;p&gt;However, unlike in traditional TDD where developers are responsible for creating both unit tests and implementation code on their own, AIDD introduces a new approach, emphasizing deep AI teamwork.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe17bbqgrczh3o9gqp6j0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe17bbqgrczh3o9gqp6j0.jpg" alt="Red, Green, Refactor Cycle" width="800" height="605"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Employing the AIDD technique, you are no longer working alone; you have a smart and patient AI ally by your side, ready to come to your &lt;em&gt;aid&lt;/em&gt; at almost every stage of development. Instead of you doing the heavy lifting, your AI companion goes to work behind the scenes while you can focus on higher-level development tasks.&lt;/p&gt;
&lt;h2&gt;
  
  
  The 8 Steps of AIDD
&lt;/h2&gt;

&lt;p&gt;To demo the technique, let’s first go through the steps at a general level. After that, in a video, we’ll get concrete with a simple yet practical example, constructing an isolated function to achieve a goal.&lt;/p&gt;

&lt;p&gt;This way, you will start to get a feel for how this process works, and how it can really empower you in delivering value to your customers.&lt;/p&gt;
&lt;h3&gt;
  
  
  1: Set the goal
&lt;/h3&gt;

&lt;p&gt;Think about the function at an extremely high level; what are you trying to achieve? Consider it only in terms of Input→Output. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;☝🏻 &lt;em&gt;Reflect on the function’s API, the ergonomics of actually using the function. Specifically, how many and what kind of arguments should it take? Should they be optional or not? Etc.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  2: Formulate the abstract type
&lt;/h3&gt;

&lt;p&gt;Use a strongly-typed language (TypeScript, C#, Haskell, or whatever you are using in your project) to manually write the type or interface for the function's input and output. Always strive for &lt;a href="https://en.m.wikipedia.org/wiki/Pure_function" rel="noopener noreferrer"&gt;pure functions&lt;/a&gt;, unless you have absolutely no choice but to cause side effects.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;☝🏻 &lt;em&gt;If you need help in this or any other step, as always—ask the AI.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
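&lt;p&gt;As a minimal sketch in TypeScript, assuming a hypothetical price-formatting function (the name and shape are illustrative, not prescribed by AIDD), step 2 could look like this:&lt;/p&gt;

```typescript
// Hypothetical example: the abstract type for a function that turns
// an amount in cents into a display string. Written by hand, before
// any implementation exists.
type FormatPrice = (cents: number, currency?: string) => string;

// The type can already be exercised against a throwaway placeholder:
const placeholder: FormatPrice = (cents, currency) =>
  String(cents) + " " + (currency ?? "USD");
```

&lt;p&gt;Keeping the function pure here, as recommended above, is what makes the later testing steps fast and deterministic.&lt;/p&gt;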
&lt;h3&gt;
  
  
  3: Construct mock functionality
&lt;/h3&gt;

&lt;p&gt;Give the function type you made to the AI and ask it to construct a &lt;a href="https://en.m.wikipedia.org/wiki/Mock_object" rel="noopener noreferrer"&gt;mock function&lt;/a&gt;. This is a function without an implementation; it simply simulates a real function by faking the output.&lt;/p&gt;
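&lt;p&gt;Continuing the hypothetical price-formatter example from step 2 (the names are illustrative), a mock can be as simple as a hard-coded fake output:&lt;/p&gt;

```typescript
// The abstract type from step 2 (hypothetical example).
type FormatPrice = (cents: number, currency?: string) => string;

// Step 3: a mock with no real implementation. It only needs to
// satisfy the type so the tests in step 4 can compile and run.
const formatPrice: FormatPrice = () => "FAKE OUTPUT";
```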
&lt;h3&gt;
  
  
  4: Write tests
&lt;/h3&gt;

&lt;p&gt;Ask the AI to create unit tests for this yet-nonexistent function, as many as you need, based on the function type and a description of what you are trying to achieve.&lt;/p&gt;

&lt;p&gt;Carefully review the tests it creates. If necessary, add more manually, until you get the feeling that if these tests pass, you will actually feel &lt;em&gt;confident&lt;/em&gt; in the function’s correctness.&lt;/p&gt;

&lt;p&gt;Here's another possible way to do step 4: write all the tests yourself, but with only one assertion for each test case. Then, you can ask the AI to generate 5, 10, 20 more similar assertions for each. &lt;/p&gt;

&lt;p&gt;With more assertions per unique test case, the probability of the function passing the test by sheer luck decreases.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;☝🏻 &lt;em&gt;The unit tests should be lightning fast. This is achieved by not relying on any external state or circumstance, employing techniques like the heavy use of pure functions, monads like Promise or Maybe, dependency injection, mocking, etc.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
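&lt;p&gt;Sticking with the hypothetical price formatter, the tests encode the real requirements while the step-3 mock is still in place. Plain assertions are used to keep this sketch self-contained, though a runner like Jest works just as well:&lt;/p&gt;

```typescript
// Hypothetical example carried over from steps 2 and 3.
type FormatPrice = (cents: number, currency?: string) => string;
const formatPrice: FormatPrice = () => "FAKE OUTPUT"; // step-3 mock

// Step 4: each case states what the real function must eventually do.
const cases: [string, string][] = [
  [formatPrice(100, "USD"), "USD 1.00"],
  [formatPrice(250, "USD"), "USD 2.50"],
  [formatPrice(5), "USD 0.05"],
];

const failures = cases.filter(([actual, expected]) => actual !== expected);
console.log(failures.length); // 3: every case fails against the mock
```

&lt;p&gt;All three cases failing against the mock is exactly the starting point step 5 asks for.&lt;/p&gt;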
&lt;h3&gt;
  
  
  5: Run tests and expect failure ❌
&lt;/h3&gt;

&lt;p&gt;Run the tests using the mocked function and watch them dramatically fail.&lt;/p&gt;

&lt;p&gt;If the majority of the tests are not failing (some will probably pass by luck), return to step 4 and rewrite the tests to be more comprehensive.&lt;/p&gt;
&lt;h3&gt;
  
  
  6: Create concrete implementation
&lt;/h3&gt;

&lt;p&gt;Now, finally, it’s time to actually create the function and turn the failing tests green. (The magical refactoring step of TDD and AIDD, where we improve on the function, comes later once the tests pass.)&lt;/p&gt;

&lt;p&gt;Give the AI the type or types you made in step 2. Or give it a simple verbal explanation. Or even all the unit tests you made. And unleash it!&lt;/p&gt;
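&lt;p&gt;For the hypothetical price formatter, a concrete implementation that the AI (or you) might produce could look like this:&lt;/p&gt;

```typescript
// Hypothetical example from the earlier steps.
type FormatPrice = (cents: number, currency?: string) => string;

// Step 6: a real, pure implementation replacing the step-3 mock.
// (Negative amounts and currency validation are out of scope here.)
const formatPrice: FormatPrice = (cents, currency = "USD") => {
  const whole = Math.floor(cents / 100);
  const fraction = String(cents % 100).padStart(2, "0");
  return currency + " " + whole + "." + fraction;
};

console.log(formatPrice(100)); // "USD 1.00"
```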
&lt;h3&gt;
  
  
  7: Run tests and expect success ✅
&lt;/h3&gt;

&lt;p&gt;Run the tests against the function the AI creates and, hopefully, watch them gloriously pass!&lt;/p&gt;

&lt;p&gt;If they don't—which will probably happen quite frequently, at least now in 2023—collaborate with the AI to achieve the goal of the tests passing. You could:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ask the AI to try again, supplying any error messages and ideas you might have to help it out. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Or:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manually review the function and see if you can fix it, or implement it yourself. Ask the AI for ideas to help you out.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  8: Refactor
&lt;/h3&gt;

&lt;p&gt;Once all the tests have passed, it's important to review the function manually and &lt;strong&gt;assess whether it aligns with the spirit of professional software construction&lt;/strong&gt; mentioned earlier. If not, consider refactoring the function.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;☝🏻 &lt;em&gt;Again, if you're still not sure why you won't be replaced by AI anytime soon, the step above is one of the main reasons.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If the tests start to fail during the refactoring process, keep collaborating with the AI, following the same approach as in step 7, until all the tests pass again.&lt;/p&gt;
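&lt;p&gt;As a sketch of what step 8 can look like for the hypothetical price formatter, here is a behavior-preserving refactor that extracts the cents-to-decimal conversion, so the intent reads more clearly while the tests stay green:&lt;/p&gt;

```typescript
// Hypothetical example from step 6, refactored.
type FormatPrice = (cents: number, currency?: string) => string;

// Extracted helper: converting cents to a decimal string is now
// named, isolated, and reusable elsewhere.
const toDecimalString = (cents: number): string =>
  Math.floor(cents / 100) + "." + String(cents % 100).padStart(2, "0");

// Behavior is unchanged, so the step-7 tests must still pass.
const formatPrice: FormatPrice = (cents, currency = "USD") =>
  currency + " " + toDecimalString(cents);
```

&lt;p&gt;The test suite from step 4 is what gives you the confidence to make a change like this at all.&lt;/p&gt;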
&lt;h3&gt;
  
  
  (Optional Final Step): Celebrate!
&lt;/h3&gt;

&lt;p&gt;All done! Congratulate yourself and celebrate the fact that you are now one step closer to delivering happiness to your customer! 🎉&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;☝🏻 &lt;em&gt;In this article we are focusing on unit testing. However, I believe that in general AIDD is applicable to the other testing strategies as well. That said, it should be acknowledged that these other tests typically cover more complex scenarios and therefore require further manual effort and careful thought from the developer.&lt;/em&gt;&lt;br&gt;
☝🏻 &lt;em&gt;Although the principles of TDD and AIDD are close to identical, it's not until you start developing that you begin to see the significant differences between the two approaches.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  AI-Driven Development in the Wild
&lt;/h2&gt;

&lt;p&gt;Finally, let’s see an actual example of how AIDD can be used in the real world. &lt;/p&gt;

&lt;p&gt;Why not attempt to follow along with your own AI companion?&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/xQx3Do9Fji8"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;If you did follow along with the video, pause for just a moment and take in the fact you didn’t actually create that function yourself. You didn’t even create all the unit tests yourself!&lt;/p&gt;

&lt;p&gt;Instead, you were like a &lt;em&gt;god&lt;/em&gt;: guiding your creation with subtle nudges, yet rarely directly intervening.&lt;/p&gt;

&lt;p&gt;That, at least to me, is mind-blowing. &lt;/p&gt;




&lt;p&gt;Dawid Dahl is a full-stack developer at &lt;a href="https://www.umain.com/" rel="noopener noreferrer"&gt;UMAIN&lt;/a&gt; | &lt;a href="https://arc.inc/" rel="noopener noreferrer"&gt;ARC&lt;/a&gt;. In his free time, he enjoys philosophy, analog synthesizers, consciousness, and being with friends and family.&lt;/p&gt;




&lt;p&gt;Credit for pyramid and cycle graphics: Jenny Eckerud.&lt;br&gt;
TS/Jest CodeSandbox &lt;a href="https://codesandbox.io/s/template-ts-jest-qb63h" rel="noopener noreferrer"&gt;template&lt;/a&gt; used in the video.&lt;/p&gt;

</description>
      <category>testing</category>
      <category>ai</category>
      <category>chatgpt</category>
      <category>tdd</category>
    </item>
  </channel>
</rss>
