<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Vinod W</title>
    <description>The latest articles on Forem by Vinod W (@vinod_wa).</description>
    <link>https://forem.com/vinod_wa</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3633340%2F04e88f3c-1074-414f-ad4d-0775fd6e7176.jpg</url>
      <title>Forem: Vinod W</title>
      <link>https://forem.com/vinod_wa</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/vinod_wa"/>
    <language>en</language>
    <item>
      <title>AI Agents Roadmap: Zero to Production</title>
      <dc:creator>Vinod W</dc:creator>
      <pubDate>Fri, 03 Apr 2026 19:38:33 +0000</pubDate>
      <link>https://forem.com/vinod_wa/ai-agents-roadmap-zero-to-production-2ohe</link>
      <guid>https://forem.com/vinod_wa/ai-agents-roadmap-zero-to-production-2ohe</guid>
      <description>&lt;p&gt;&lt;strong&gt;What if your AI could stop just answering questions and start finishing entire projects?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's the promise of AI agents: systems that plan, use tools, remember context, and loop until the job is done. Not chatbots. Not autocomplete. Autonomous problem-solvers.&lt;/p&gt;

&lt;p&gt;This guide walks you through every layer of building them: from understanding why LLMs can reason at all, to wiring multi-agent teams that collaborate on complex workflows, to monitoring them in production so they don't hallucinate their way into trouble.&lt;/p&gt;

&lt;p&gt;Whether you write code daily or prefer visual builders, there's a path here for you.&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 1: What Actually Makes Something an "Agent"?
&lt;/h2&gt;

&lt;p&gt;Forget the hype-cycle definitions. Let's build one from scratch.&lt;/p&gt;

&lt;p&gt;You're a freelance consultant. A new client emails you asking for a competitive analysis report by Friday. To deliver that, you need to research three competitors, pull their recent financials, compare their product strategies, draft a 10-page report, format it in their brand template, and email the final PDF. You're at &lt;strong&gt;Point A&lt;/strong&gt; (the email) and need to reach &lt;strong&gt;Point B&lt;/strong&gt; (report delivered).&lt;/p&gt;

&lt;p&gt;Today, you'd do all of that manually. An AI agent would do it &lt;em&gt;for you&lt;/em&gt;, autonomously deciding what to research, which tools to use, and how to structure the output.&lt;/p&gt;

&lt;p&gt;But "going from A to B" is too vague, a GPS does that too. So let's sharpen the definition:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;An AI agent is an LLM-powered system that reaches a goal by planning, making decisions, using tools, and learning from its environment, while retaining memory across steps.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Five properties packed in there:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LLM-powered&lt;/strong&gt;: The reasoning comes from a language model's deep understanding of language and logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Planning &amp;amp; decisions&lt;/strong&gt;: It doesn't just execute a script. It evaluates options (which competitor metrics are important? what format does the client prefer?) and adapts when things go wrong.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools&lt;/strong&gt;: It can search the web, call APIs, query databases, run calculations, generate files, anything you expose to it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment interaction&lt;/strong&gt;: It receives feedback (wrong data source? client replied with a correction?) and adjusts course.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: It remembers what it's already done so it doesn't repeat searches or lose context mid-workflow.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Agency&lt;/strong&gt; is the &lt;em&gt;degree&lt;/em&gt; of autonomy. A chatbot that summarizes one article has low agency. An agent that autonomously researches, drafts, revises, and delivers a complete report has high agency. More agency = more value, but also more risk, which is why observability (Phase 10) matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 2: The Engine : Why LLMs Can Reason
&lt;/h2&gt;

&lt;p&gt;Agents are only as smart as their reasoning engine.&lt;/p&gt;

&lt;p&gt;Here's the core insight most tutorials skip: &lt;strong&gt;LLMs are prediction machines that accidentally learned to reason.&lt;/strong&gt; They're trained to predict the next token in a sequence, a seemingly simple task. But to do that well across billions of text examples, the model has to internalize grammar, logic, cause-and-effect, even common-sense relationships.&lt;/p&gt;

&lt;p&gt;Think of it like this: if you trained someone to complete any sentence in any book ever written, they'd &lt;em&gt;have&lt;/em&gt; to understand how language, arguments, and narratives work. That's what happens at scale with transformer-based models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How large is "large"?&lt;/strong&gt; Linear regression has 2 parameters. GPT-3 has 175 billion. That's not a typo. The sheer number of parameters is what allows these models to capture the complexity of human language. Researchers have observed that certain capabilities (multi-step math, code generation, analogical reasoning) only &lt;em&gt;emerge&lt;/em&gt; past a certain model size, a phenomenon called &lt;strong&gt;emergent abilities&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For agent builders, the practical implication: you don't train your own LLM. You leverage one (GPT-4, Claude, Llama, Qwen) and focus on how you &lt;em&gt;prompt it&lt;/em&gt;, what &lt;em&gt;tools&lt;/em&gt; you give it, and how you &lt;em&gt;structure its workflow&lt;/em&gt;. The model's reasoning quality is your foundation; everything else you build sits on top.&lt;/p&gt;

&lt;p&gt;LLMs go through two training stages that matter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pre-training&lt;/strong&gt;: The model ingests massive text corpora and learns language patterns through next-token prediction. This produces a &lt;strong&gt;foundation model&lt;/strong&gt;, capable but unrefined.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fine-tuning&lt;/strong&gt;: The foundation model is adapted with curated data to follow instructions, hold conversations, or specialize in domains. This is what turns a raw language model into the assistant you interact with.&lt;/li&gt;
&lt;/ul&gt;
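&lt;p&gt;To make "prediction machine" concrete, here's a toy sketch of next-token prediction built from raw bigram counts. Real models learn billions of parameters instead of counting word pairs, but the training objective, predicting what comes next, is the same idea.&lt;/p&gt;

```python
# Toy next-token predictor: count which word follows which in a tiny
# corpus, then always predict the most frequent successor. This is a
# hand-rolled illustration, not how any real LLM is implemented.
from collections import Counter, defaultdict

corpus = "the agent plans the task and the agent uses tools".split()

successors = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    successors[current][following] += 1

def predict_next(word):
    """Return the most frequent token observed after `word`, or None."""
    counts = successors[word]
    if not counts:
        return None
    return counts.most_common(1)[0][0]

print(predict_next("the"))   # "agent" follows "the" twice, "task" once
```

&lt;p&gt;Scale the corpus to the internet and the counter to a transformer with billions of parameters, and "predicting the next word" starts to require modeling grammar, logic, and cause-and-effect.&lt;/p&gt;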




&lt;h2&gt;
  
  
  Phase 3: The Heartbeat : ReAct and the Thought-Action-Observation Loop
&lt;/h2&gt;

&lt;p&gt;A raw LLM generates text. An agent generates text &lt;em&gt;and takes actions&lt;/em&gt;. The bridge is &lt;strong&gt;ReAct&lt;/strong&gt; (Reason + Act).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why not just Chain-of-Thought?&lt;/strong&gt; Chain-of-Thought (CoT) prompting tells the model to think step-by-step. It dramatically improves reasoning. But CoT has a fatal flaw: the model reasons &lt;em&gt;only from its own knowledge&lt;/em&gt;. If a needed fact isn't in its training data, it confidently invents one.&lt;/p&gt;
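&lt;p&gt;At the prompt level, CoT is nothing more than added instruction text. A minimal sketch (the model call itself is omitted; only the prompt is built):&lt;/p&gt;

```python
# Chain-of-Thought is a prompting pattern, not an API feature: you append
# a step-by-step instruction to the question before sending it to the model.
def make_cot_prompt(question):
    return question + "\n\nLet's think step by step."

prompt = make_cot_prompt("A train covers 120 km in 1.5 hours. What is its average speed?")
print(prompt)
```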

&lt;p&gt;&lt;strong&gt;ReAct breaks the vacuum&lt;/strong&gt; by interleaving reasoning with real-world actions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "What's the current market cap of NVIDIA and how does it 
       compare to its value 12 months ago?"

Thought: I need NVIDIA's current market cap. Let me look it up.
Action:  web_search("NVIDIA current market cap 2026")
Observation: "NVIDIA's market cap is approximately $3.2 trillion."

Thought: Now I need the value from 12 months ago for comparison.
Action:  web_search("NVIDIA market cap April 2025")
Observation: "In April 2025, NVIDIA's market cap was ~$2.6 trillion."

Thought: I can now calculate the change and answer.
Answer: NVIDIA's market cap grew from ~$2.6T to ~$3.2T over the
        past year, roughly a 23% increase.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each cycle is one iteration of the &lt;strong&gt;Thought → Action → Observation&lt;/strong&gt; loop. The agent keeps looping until it determines it has enough information to produce a final answer. This pattern is the operational heartbeat of nearly every production agent today.&lt;/p&gt;

&lt;p&gt;The key paper behind this is &lt;a href="https://arxiv.org/abs/2210.03629" rel="noopener noreferrer"&gt;ReAct: Synergizing Reasoning and Acting in Language Models&lt;/a&gt; (Yao et al., 2022), which demonstrated that interleaving reasoning traces with tool actions significantly outperforms either approach alone. Pure reasoning hallucinates facts. Pure action-taking lacks planning. ReAct combines both.&lt;/p&gt;
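&lt;p&gt;The Thought → Action → Observation cycle can be sketched as a plain loop. Here `call_llm` and `run_tool` are hypothetical stand-ins for a real model API and a real tool registry; frameworks implement exactly this shape with more robust parsing.&lt;/p&gt;

```python
# Minimal ReAct loop sketch. The model emits "Thought:"/"Action:" text,
# the framework executes the action and feeds back an "Observation:",
# and the loop ends when the model emits "Answer:".
def react_loop(task, call_llm, run_tool, max_steps=5):
    """Alternate Thought/Action with Observations until a final answer."""
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        step = call_llm(transcript)          # model produces the next step
        transcript += step + "\n"
        if "Answer:" in step:
            return step.split("Answer:", 1)[1].strip()
        if "Action:" in step:
            action = step.split("Action:", 1)[1].strip()
            observation = run_tool(action)   # execute the requested tool call
            transcript += f"Observation: {observation}\n"
    return transcript  # step budget exhausted without a final answer
```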




&lt;h2&gt;
  
  
  Phase 4: Tools : Giving Your Agent Hands
&lt;/h2&gt;

&lt;p&gt;An LLM can only read and generate text. It can't browse the web, run calculations, query a database, or send an email. &lt;strong&gt;Tools&lt;/strong&gt; bridge the gap.&lt;/p&gt;

&lt;p&gt;A tool is any function the agent can invoke. As the &lt;a href="https://huggingface.co/learn/agents-course/" rel="noopener noreferrer"&gt;Hugging Face agents course&lt;/a&gt; puts it: tools are what allow the assistant to perform additional tasks beyond text generation. You define the tool's name, description, and input schema, the LLM uses that description to decide &lt;em&gt;when&lt;/em&gt; and &lt;em&gt;how&lt;/em&gt; to call it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does the LLM "use" a tool?&lt;/strong&gt; Through prompting. You describe available tools in the system message, specify the invocation format, and the agent framework intercepts tool calls from the model's output, executes them, and feeds results back. Frameworks like LangChain and SmolAgents automate this prompt engineering, but under the hood, it's always text-in, text-out.&lt;/p&gt;
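&lt;p&gt;To make the text-in, text-out point concrete, here's a hedged sketch of how a framework might render tool descriptions into the system message. The tool name, fields, and invocation format are illustrative, not any real framework's wire format.&lt;/p&gt;

```python
# Hypothetical tool registry rendered into plain prompt text. The model
# never "calls" anything directly; it only sees and produces text.
TOOLS = [{
    "name": "get_weather",
    "description": "Returns current weather for a city name.",
    "params": {"city": "string, required"},
}]

def build_system_prompt(tools):
    lines = ["You can call these tools by replying with Action: name(args)."]
    for tool in tools:
        lines.append(f"- {tool['name']}: {tool['description']} Params: {tool['params']}")
    return "\n".join(lines)

print(build_system_prompt(TOOLS))
```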

&lt;p&gt;&lt;strong&gt;Tool design is the #1 determinant of agent quality.&lt;/strong&gt; From real-world experience building agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Clear descriptions&lt;/strong&gt;: The model selects tools based on their text descriptions. A description like "does stuff with data" will cause wrong tool selection. A description like "Queries the PostgreSQL inventory database and returns product stock levels for a given SKU" works.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strict input schemas&lt;/strong&gt;: Don't let the agent pass free-form strings where structured parameters are needed. Define types, constraints, required fields.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Informative errors&lt;/strong&gt;: When a tool fails, return a message the agent can reason about ("Rate limited, retry in 30s") rather than a stack trace.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single responsibility&lt;/strong&gt;: One tool, one job. A tool that "searches the web and also sends emails" will confuse the model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complement the LLM's weaknesses&lt;/strong&gt;: Give it tools for the things the model is bad at, such as exact math, live data, file I/O, and API calls. Don't wrap things the model already handles well (summarization, translation).&lt;/li&gt;
&lt;/ul&gt;
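&lt;p&gt;Putting those guidelines together, here's a sketch of a single-purpose tool with a precise description, a strict input check, and errors the model can reason about. The inventory dict is a stand-in for a real database query; nothing here is a specific framework's API.&lt;/p&gt;

```python
# One tool, one job: look up stock for a single SKU. The docstring is what
# the LLM reads to decide when to call this tool, so it must be precise.
def get_stock_level(sku):
    """Queries the inventory database and returns stock for one SKU.

    Args:
        sku: Product SKU, e.g. "ABC-1234". Must be a non-empty string.
    """
    if not isinstance(sku, str) or not sku.strip():
        # Informative error the agent can act on, not a stack trace
        return "Error: 'sku' must be a non-empty string like 'ABC-1234'."
    inventory = {"ABC-1234": 17}     # stand-in for a real database query
    if sku not in inventory:
        return f"Error: unknown SKU '{sku}'. Check the product catalog."
    return f"{sku}: {inventory[sku]} units in stock"
```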




&lt;h2&gt;
  
  
  Phase 5: Memory : The Difference Between a Demo and a Product
&lt;/h2&gt;

&lt;p&gt;Without memory, an agent forgets everything between loop iterations. Ask it to compare quarterly revenue across three business units, and it'll analyze Q1, then start Q2 with zero recollection of Q1's numbers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Short-term memory&lt;/strong&gt; is the conversation history and scratchpad within a single task. Every Thought, Action, and Observation gets appended so the agent can reference what it already tried. This is what lets an agent handle a 15-step workflow without losing the thread.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long-term memory&lt;/strong&gt; persists &lt;em&gt;across&lt;/em&gt; sessions. User preferences, past interactions, and learned facts are typically stored in a vector database (ChromaDB, Pinecone, Weaviate) and retrieved via semantic search when relevant. This is what makes the agent smarter over time.&lt;/p&gt;

&lt;p&gt;The practical impact: short-term memory prevents the agent from repeating itself within a task. Long-term memory prevents it from repeating itself across &lt;em&gt;weeks&lt;/em&gt;: remembering that your client prefers bullet-point summaries, that your database password changed last Tuesday, or that you already researched this competitor in March.&lt;/p&gt;
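&lt;p&gt;A minimal sketch of the two layers, with a plain list as the short-term scratchpad and naive keyword overlap standing in for the semantic search a real vector database would provide:&lt;/p&gt;

```python
# Illustrative memory sketch: the scratchpad is cleared per task, while
# long-term facts persist and are recalled by (very naive) word overlap.
# A production system would use embeddings and a vector store instead.
class AgentMemory:
    def __init__(self):
        self.scratchpad = []      # short-term: Thought/Action/Observation log
        self.long_term = []       # long-term: facts kept across sessions

    def log_step(self, entry):
        self.scratchpad.append(entry)

    def remember(self, fact):
        self.long_term.append(fact)

    def recall(self, query):
        """Return stored facts sharing at least one word with the query."""
        words = set(query.lower().split())
        return [f for f in self.long_term if words.intersection(f.lower().split())]

memory = AgentMemory()
memory.remember("Client prefers bullet-point summaries")
memory.log_step("Searched competitor pricing")
print(memory.recall("what summaries does the client prefer"))
```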




&lt;h2&gt;
  
  
  Phase 6: Choose Your Framework
&lt;/h2&gt;

&lt;p&gt;Now that you understand &lt;em&gt;what&lt;/em&gt; an agent needs (reasoning, tools, and memory), you can evaluate frameworks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code-First (Maximum Control)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt; : A message-passing framework where nodes do the work and edges determine what runs next. You define a graph of processing steps with conditional branches, loops, and a shared state that every node reads and updates as execution progresses. Best for non-linear workflows where execution paths depend on intermediate results. (&lt;a href="https://langchain-ai.github.io/langgraph/" rel="noopener noreferrer"&gt;Docs&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LlamaIndex&lt;/strong&gt; : The go-to for Agentic RAG. Agents dynamically decide &lt;em&gt;when&lt;/em&gt; and &lt;em&gt;how&lt;/em&gt; to retrieve information from large document sets. Offers RouterQueryEngine for automatic question routing, LlamaParse for intelligent document parsing, and LlamaHub for 40+ pre-built tool connectors. (&lt;a href="https://docs.llamaindex.ai/" rel="noopener noreferrer"&gt;Docs&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SmolAgents&lt;/strong&gt; : Hugging Face's minimalist library, where the core agent logic fits in roughly 1,000 lines of code. Its CodeAgent writes tool calls as Python snippets rather than JSON; this approach is highly expressive, allowing for complex logic, control flow, and the ability to combine tools, loop, and transform data. It's model-agnostic, supporting any LLM from local models to OpenAI, Anthropic, and others via LiteLLM integration. (&lt;a href="https://huggingface.co/docs/smolagents/" rel="noopener noreferrer"&gt;Docs&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AutoGen (Microsoft)&lt;/strong&gt; : Models AI applications as conversations between multiple specialized agents. One agent generates code, another critiques it, a third tests it. Supports group chats, hierarchical delegation, and human-in-the-loop.&lt;/p&gt;

&lt;h3&gt;
  
  
  Low-Code (Rapid Orchestration)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt; : Enables you to define specialized autonomous agents with specific roles, goals, and expertise areas, assign tasks based on their capabilities, and establish clear dependencies between tasks. The framework mirrors human team structures: a crew is a collective of agents collaborating to accomplish a predefined set of tasks using sequential, hierarchical, or parallel processes. Backed by over 100,000 certified developers. (&lt;a href="https://docs.crewai.com/" rel="noopener noreferrer"&gt;Docs&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;n8n&lt;/strong&gt; : An open-source visual automation tool (like Zapier, but self-hostable). Connect AI nodes to 400+ apps such as Gmail, Sheets, Slack, databases, and webhooks. No code required. (&lt;a href="https://docs.n8n.io/" rel="noopener noreferrer"&gt;Docs&lt;/a&gt;)&lt;/p&gt;

&lt;h3&gt;
  
  
  Decision Guide
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;If you need...&lt;/th&gt;
&lt;th&gt;Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Conditional branches, loops, explicit state control&lt;/td&gt;
&lt;td&gt;LangGraph&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Smart retrieval over documents&lt;/td&gt;
&lt;td&gt;LlamaIndex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token-efficient, code-generating agents&lt;/td&gt;
&lt;td&gt;SmolAgents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Role-based team collaboration&lt;/td&gt;
&lt;td&gt;CrewAI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No-code business automation with AI&lt;/td&gt;
&lt;td&gt;n8n&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Phase 7: Build It : SmolAgents (Code-First, Minimal)
&lt;/h2&gt;

&lt;p&gt;SmolAgents is the fastest path from zero to working agent. From the &lt;a href="https://huggingface.co/docs/smolagents/" rel="noopener noreferrer"&gt;official docs&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;smolagents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CodeAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DuckDuckGoSearchTool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;InferenceClientModel&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;InferenceClientModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CodeAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;DuckDuckGoSearchTool&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What were NVIDIA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s Q4 2025 earnings?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's a working agent in three lines. But what happens under the hood?&lt;/p&gt;

&lt;p&gt;SmolAgents provides first-class support for Code Agents, where actions are written as Python code rather than JSON, enabling natural composability through function nesting, loops, and conditionals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The CodeAgent loop:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Task arrives&lt;/strong&gt; → added to agent memory with a system prompt describing its role and available tools&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM generates Python code&lt;/strong&gt;  → e.g., &lt;code&gt;results = web_search("NVIDIA Q4 2025 earnings")&lt;/code&gt; followed by parsing logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Framework executes the code&lt;/strong&gt; in a sandboxed environment and captures output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observation logged to memory&lt;/strong&gt; → agent sees what the tool returned&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Loop repeats&lt;/strong&gt;  → LLM generates the next code snippet with full history&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent calls &lt;code&gt;final_answer(result)&lt;/code&gt;&lt;/strong&gt; → loop ends&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Why code instead of JSON? Because the agent can use Python's full expressiveness in a single action: loops to iterate over search results, conditionals to handle edge cases, string processing to extract data. A JSON-based agent would need multiple separate tool calls for the same work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world example : Stock Research Agent:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;smolagents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CodeAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;InferenceClientModel&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_stock_price&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Gets the current stock price for a given ticker symbol.
    Args:
        ticker: The stock ticker symbol (e.g., &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;AAPL&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;NVDA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;yfinance&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;yf&lt;/span&gt;
    &lt;span class="n"&gt;stock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;yf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Ticker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticker&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;currentPrice&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;N/A&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ticker&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CodeAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;get_stock_price&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; 
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;InferenceClientModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Compare the current prices of AAPL, MSFT, and GOOGL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent will write Python code that calls &lt;code&gt;get_stock_price&lt;/code&gt; three times, collects the results, and formats a comparison. One thought cycle, three tool calls, done.&lt;/p&gt;

&lt;p&gt;SmolAgents also supports &lt;strong&gt;ToolCallingAgent&lt;/strong&gt; (JSON-style, more predictable), &lt;strong&gt;Vision Agents&lt;/strong&gt; (process images), and &lt;strong&gt;multi-agent hierarchies&lt;/strong&gt; where one agent manages others.&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 8: Build It : LangGraph (Graph-Based, Full Control)
&lt;/h2&gt;

&lt;p&gt;LangGraph gives you explicit control over every decision point. If you've ever drawn a flowchart, you already know LangGraph's model; the difference is that a LangGraph graph is executable code where every box becomes a function and every arrow becomes an edge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The three building blocks&lt;/strong&gt; (&lt;a href="https://docs.langchain.com/oss/python/langgraph/graph-api" rel="noopener noreferrer"&gt;from the docs&lt;/a&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nodes&lt;/strong&gt;: Python functions that receive the current state as input, perform computation or side effects, and return an updated state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edges&lt;/strong&gt;: Connections between nodes either fixed ("always go to Node B after Node A") or conditional ("if the classification is 'urgent', go to escalation; otherwise go to auto-reply").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State&lt;/strong&gt;: A shared data structure (typically a &lt;code&gt;TypedDict&lt;/code&gt;) that persists throughout execution. Every node reads and writes to this state.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Real-world example : Customer Support Ticket Router:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TypedDict&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;START&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TicketState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ticket_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;        &lt;span class="c1"&gt;# "billing", "technical", "general"
&lt;/span&gt;    &lt;span class="n"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;        &lt;span class="c1"&gt;# "high", "low"
&lt;/span&gt;    &lt;span class="n"&gt;response_draft&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;escalated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;classify_ticket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TicketState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# LLM classifies the ticket into category + priority
&lt;/span&gt;    &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;billing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;priority&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;high&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;draft_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TicketState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# LLM drafts a response based on category
&lt;/span&gt;    &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response_draft&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;escalate_to_human&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TicketState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;escalated&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_by_priority&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TicketState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;escalate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;priority&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;high&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;draft&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Wire the graph
&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TicketState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;classify&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classify_ticket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;draft&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;draft_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;escalate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;escalate_to_human&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;START&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;classify&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;classify&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;route_by_priority&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;escalate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;escalate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;draft&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;draft&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;draft&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;escalate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The execution flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;START → classify → [high priority?]
    ├── Yes → escalate_to_human → END
    └── No  → draft_response → END
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This conditional branching is &lt;em&gt;visible in the graph structure&lt;/em&gt;, not buried in prompt engineering. You can render it as a diagram, debug any path, and extend it (add a "send_to_slack" node after drafting) without rewriting the core logic.&lt;/p&gt;

&lt;p&gt;LangGraph also excels at &lt;strong&gt;tool-use loops&lt;/strong&gt; where an assistant node calls tools, observes results, and loops back until it has a complete answer. This is the ReAct loop implemented as an explicit graph cycle.&lt;/p&gt;
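&lt;p&gt;Before reaching for a framework, the loop itself fits in a few lines of plain Python. The sketch below uses a deterministic stand-in for the model (&lt;code&gt;fake_model&lt;/code&gt;) and a single hypothetical &lt;code&gt;lookup&lt;/code&gt; tool; these names are illustrative assumptions, not LangGraph APIs, and exist only to make the cycle concrete:&lt;/p&gt;

```python
# Minimal tool-use loop: the "model" decides between calling a tool
# and giving a final answer; we loop until it answers or we hit a cap.
def lookup(city: str) -> str:
    # Stand-in tool: a real agent would call a weather API here.
    return {"Paris": "18C, sunny"}.get(city, "unknown")

def fake_model(history: list) -> dict:
    # Deterministic stand-in for an LLM: request the tool once, then answer.
    if not any(msg["role"] == "tool" for msg in history):
        return {"action": "lookup", "input": "Paris"}
    return {"answer": f"Weather in Paris: {history[-1]['content']}"}

def run_agent(question: str, max_steps: int = 5) -> str:
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = fake_model(history)
        if "answer" in decision:            # model is done: exit the loop
            return decision["answer"]
        result = lookup(decision["input"])  # execute the requested tool
        history.append({"role": "tool", "content": result})
    return "step budget exhausted"

print(run_agent("What is the weather in Paris?"))  # → Weather in Paris: 18C, sunny
```

&lt;p&gt;The graph version of this is the same shape: an assistant node, a tools node, and a conditional edge that either loops back or exits.&lt;/p&gt;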




&lt;h2&gt;
  
  
  Phase 9: Build It: Agentic RAG with LlamaIndex
&lt;/h2&gt;

&lt;p&gt;Traditional RAG retrieves documents once and generates once. It breaks on complex queries that need multiple passes, heterogeneous sources, or reasoning across results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agentic RAG&lt;/strong&gt; puts an agent in the driver's seat: it decides &lt;em&gt;what&lt;/em&gt; to retrieve, &lt;em&gt;whether&lt;/em&gt; to retrieve more, and &lt;em&gt;how&lt;/em&gt; to combine findings.&lt;/p&gt;

&lt;p&gt;Implementation with LlamaIndex:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;SimpleDirectoryReader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;SummaryIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Settings&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;QueryEngineTool&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core.agent.workflow&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentWorkflow&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Load and chunk your documents
&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SimpleDirectoryReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;annual_report.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;load_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Create two different indexes over the same data
&lt;/span&gt;&lt;span class="n"&gt;vector_index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# for specific facts
&lt;/span&gt;&lt;span class="n"&gt;summary_index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SummaryIndex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;       &lt;span class="c1"&gt;# for overviews
&lt;/span&gt;
&lt;span class="c1"&gt;# 3. Wrap each as a tool with clear descriptions
&lt;/span&gt;&lt;span class="n"&gt;detail_tool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;QueryEngineTool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_defaults&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query_engine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vector_index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_query_engine&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Retrieves specific facts, figures, and details from the annual report.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;summary_tool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;QueryEngineTool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_defaults&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query_engine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;summary_index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_query_engine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tree_summarize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Provides high-level summaries and overviews of the annual report.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 4. Create the agent
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AgentWorkflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_tools_or_functions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;tools_or_functions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;detail_tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;summary_tool&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a financial analyst assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 5. Ask complex questions
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize the key revenue trends and give me the exact Q3 margin percentage.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent routes the summary request to &lt;code&gt;summary_tool&lt;/code&gt; and the specific margin question to &lt;code&gt;detail_tool&lt;/code&gt;, then synthesizes both into a unified answer. A static RAG pipeline would either miss the exact figure or give a shallow overview. The agent retrieves &lt;em&gt;twice&lt;/em&gt; with different strategies.&lt;/p&gt;
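&lt;p&gt;The routing decision can be pictured with a toy, keyword-based router. This is illustrative only: the real agent lets the LLM match each sub-query against the tool descriptions rather than matching keywords.&lt;/p&gt;

```python
# Toy illustration of the routing step (not LlamaIndex internals):
# a compound query is split and each part is matched to the tool
# whose description best fits it.
TOOLS = {
    "summary_tool": "high-level summaries and overviews",
    "detail_tool": "specific facts, figures, and details",
}

def route(sub_query: str) -> str:
    q = sub_query.lower()
    summary_words = ("summarize", "overview", "trends")
    if any(word in q for word in summary_words):
        return "summary_tool"
    return "detail_tool"

parts = ["Summarize the key revenue trends",
         "give me the exact Q3 margin percentage"]
plan = [(p, route(p)) for p in parts]
print(plan)  # first part routes to summary_tool, second to detail_tool
```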

&lt;p&gt;You can extend this further: add a &lt;code&gt;WebSearchTool&lt;/code&gt; for current market data, a &lt;code&gt;CalculatorTool&lt;/code&gt; for on-the-fly computations, or plug in connectors from &lt;a href="https://llamahub.ai/" rel="noopener noreferrer"&gt;LlamaHub&lt;/a&gt; (Google Drive, Slack, databases, 40+ integrations).&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 10: Build It: CrewAI (Multi-Agent Teams)
&lt;/h2&gt;

&lt;p&gt;Some problems decompose naturally into roles. CrewAI lets you define the team.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Three building blocks&lt;/strong&gt; (from the &lt;a href="https://docs.crewai.com/" rel="noopener noreferrer"&gt;CrewAI docs&lt;/a&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agent&lt;/strong&gt; — an autonomous entity with a &lt;code&gt;role&lt;/code&gt;, &lt;code&gt;goal&lt;/code&gt;, and &lt;code&gt;backstory&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Task&lt;/strong&gt; — an assignment with &lt;code&gt;description&lt;/code&gt;, &lt;code&gt;expected_output&lt;/code&gt;, and responsible &lt;code&gt;agent&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Crew&lt;/strong&gt; — brings agents and tasks together with a workflow process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Real-world example: Automated Due Diligence Pipeline:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;crewai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Crew&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Process&lt;/span&gt;

&lt;span class="c1"&gt;# Define specialized agents
&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Due Diligence Researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;goal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Gather comprehensive background information on target companies&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;backstory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re a senior analyst at a PE firm who digs deep into &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
              &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;financials, leadership, litigation history, and market position.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;risk_analyst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Risk Assessment Analyst&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;goal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Identify and quantify potential risks in acquisition targets&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;backstory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You specialize in spotting red flags : regulatory issues, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
              &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;debt structures, customer concentration, and market headwinds.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;memo_writer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Investment Memo Writer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;goal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Synthesize research into a clear, actionable investment memo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;backstory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You write concise memos that partners actually read  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
              &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;structured, evidence-based, with clear recommendations.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Define tasks
&lt;/span&gt;&lt;span class="n"&gt;research_task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Research {company}: financials, leadership team, recent news, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;competitive landscape, and any notable events in the last 24 months.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;expected_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A structured research brief with sections for financials, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;leadership, competitive position, and recent developments.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;risk_task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Based on the research brief, identify the top 5 risks of &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;acquiring {company}. Quantify where possible.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;expected_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A ranked risk assessment with severity ratings and mitigation suggestions.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;risk_analyst&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;memo_task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a 2-page investment memo synthesizing the research and &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;risk assessment. Include a clear recommendation.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;expected_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A polished investment memo in markdown with executive summary, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key findings, risks, and recommendation.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;memo_writer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;output_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memo.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Assemble and run the crew
&lt;/span&gt;&lt;span class="n"&gt;crew&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Crew&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;researcher&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;risk_analyst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memo_writer&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;research_task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;risk_task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memo_task&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;process&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sequential&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;crew&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kickoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Acme Robotics Inc.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The crew runs sequentially: researcher gathers data → risk analyst identifies red flags → writer produces the memo. Each agent works autonomously on its task, but the crew passes context between them.&lt;/p&gt;
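&lt;p&gt;Conceptually, the sequential process is a fold over tasks: each one reads the context accumulated so far and appends its own output for the next task to use. A plain-Python sketch of that idea (illustrative shapes, not CrewAI internals):&lt;/p&gt;

```python
# Sketch of a sequential process: each task function receives the
# context produced by earlier tasks, and its output is appended
# under its own name for later tasks to read.
def research(context: dict) -> str:
    return f"brief on {context['company']}"

def assess_risk(context: dict) -> str:
    return f"top risks derived from: {context['research']}"

def write_memo(context: dict) -> str:
    return f"memo combining {context['research']} and {context['assess_risk']}"

def run_sequential(tasks: list, inputs: dict) -> dict:
    context = dict(inputs)
    for task in tasks:
        context[task.__name__] = task(context)  # chain output forward
    return context

result = run_sequential([research, assess_risk, write_memo],
                        {"company": "Acme Robotics Inc."})
print(result["write_memo"])
```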

&lt;p&gt;&lt;strong&gt;Workflow options&lt;/strong&gt; from the &lt;a href="https://docs.crewai.com/" rel="noopener noreferrer"&gt;CrewAI docs&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sequential&lt;/strong&gt; → tasks run in order, output chains forward&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hierarchical&lt;/strong&gt; → a manager agent dynamically delegates and validates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel&lt;/strong&gt; → independent tasks run simultaneously&lt;/li&gt;
&lt;li&gt;Agents can use &lt;strong&gt;tools&lt;/strong&gt; (web search, file reading, custom functions) declared on individual tasks&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Phase 11: Build It: n8n (No-Code Automation)
&lt;/h2&gt;

&lt;p&gt;Not every agent needs custom Python. Sometimes you need an LLM integrated into a business workflow; n8n lets you build that visually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world example: Automated Meeting Notes Pipeline:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Calendar Trigger (new event ends) 
    → Fetch transcript from recording tool
    → Send to OpenAI ("Extract action items, decisions, and owners")
    → Filter (only items with assigned owners)
    → Create tasks in project management tool
    → Send Slack summary to #team-updates
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You build this by dragging nodes onto a canvas and connecting them. No code. n8n supports 400+ integrations, conditional logic, loops, error handling, webhooks, and cron scheduling. It's fair-code licensed and free to self-host.&lt;/p&gt;

&lt;p&gt;For teams that need AI-powered automation without a development team, n8n is the fastest path from idea to running workflow.&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 12: Observability: Don't Ship Blind
&lt;/h2&gt;

&lt;p&gt;An agent that works in testing will eventually hallucinate in production. You need visibility into every decision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Langfuse&lt;/strong&gt; and &lt;strong&gt;Arize Phoenix&lt;/strong&gt; are the two leading observability platforms. They provide:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tracing&lt;/strong&gt;: a timeline of every step: which node fired, what the LLM reasoned, which tools were called, and what they returned. For the support ticket router, you'd see: &lt;code&gt;classify → "billing/high" → escalate_to_human → done&lt;/code&gt;. If the agent misclassified, you can pinpoint exactly where.&lt;/p&gt;
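&lt;p&gt;The core idea is simple enough to sketch by hand: record one span per step with its input, output, and duration. The names below are hypothetical, not the Langfuse or Phoenix API:&lt;/p&gt;

```python
import time

# Minimal trace recorder (hypothetical, not a real platform's API):
# wrap each agent step so its name, input, and output are logged in order.
TRACE = []

def traced(step_fn):
    def wrapper(state):
        start = time.perf_counter()
        output = step_fn(state)
        TRACE.append({"step": step_fn.__name__,
                      "input": dict(state),
                      "output": output,
                      "seconds": time.perf_counter() - start})
        return output
    return wrapper

@traced
def classify(state):
    return {"category": "billing", "priority": "high"}

@traced
def escalate_to_human(state):
    return {"escalated": True}

state = {"ticket": "I was double-charged"}
state.update(classify(state))
if state["priority"] == "high":
    state.update(escalate_to_human(state))

print([span["step"] for span in TRACE])  # → ['classify', 'escalate_to_human']
```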

&lt;p&gt;&lt;strong&gt;Evaluation metrics&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Faithfulness&lt;/strong&gt; → Is the output grounded in retrieved data?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relevance&lt;/strong&gt; → Does the output address what was asked?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool accuracy&lt;/strong&gt; → Right tool called with correct parameters?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trustworthiness&lt;/strong&gt; → Composite score of consistency and factual accuracy&lt;/li&gt;
&lt;/ul&gt;
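&lt;p&gt;In practice these metrics are usually scored by an LLM-as-judge. As a crude but runnable proxy, faithfulness can be approximated by checking how many of the answer's content words actually appear in the retrieved context (illustrative only; real evaluators are far more robust):&lt;/p&gt;

```python
# Crude faithfulness proxy: what fraction of the answer's content
# words appear in the retrieved context? Low overlap suggests the
# answer contains ungrounded claims.
STOPWORDS = {"the", "a", "in", "was", "is", "of", "to"}

def content_words(text: str) -> set:
    cleaned = text.lower().replace(".", " ").replace(",", " ")
    return set(cleaned.split()) - STOPWORDS

def faithfulness(answer: str, context: str) -> float:
    answer_terms = content_words(answer)
    if not answer_terms:
        return 1.0
    grounded = answer_terms.intersection(content_words(context))
    return len(grounded) / len(answer_terms)

ctx = "Q3 gross margin was 42 percent, up from 38 percent in Q2."
good = "Q3 gross margin was 42 percent."
bad = "Q3 margin doubled due to crypto gains."
# The grounded answer scores 1.0; the fabricated one scores about 0.33.
print(faithfulness(good, ctx), faithfulness(bad, ctx))
```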

&lt;p&gt;&lt;strong&gt;Evaluation dashboards&lt;/strong&gt;: aggregate scores across runs. Filter to high-hallucination traces for targeted debugging. Compare agent versions to measure whether a prompt change or model swap actually improved quality.&lt;/p&gt;

&lt;p&gt;Both platforms integrate directly with the frameworks covered here: Langfuse has native support for LangGraph and SmolAgents, while Phoenix integrates with LlamaIndex via callback handlers.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Complete Sequence
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Phase 1:  Define the agent (autonomous goal completion, not just Q&amp;amp;A)
Phase 2:  Understand the engine (next-token prediction → emergent reasoning)
Phase 3:  Learn the loop (Thought → Action → Observation)
Phase 4:  Design tools (strict schemas, clear descriptions, informative errors)
Phase 5:  Architect memory (short-term context + long-term persistence)
Phase 6:  Evaluate frameworks (LangGraph / LlamaIndex / SmolAgents / CrewAI / n8n)
Phase 7:  Build with SmolAgents (code-first, 3-line quickstart)
Phase 8:  Build with LangGraph (graph-based conditional workflows)
Phase 9:  Build Agentic RAG with LlamaIndex (dynamic multi-index retrieval)
Phase 10: Build multi-agent teams with CrewAI (roles, tasks, crews)
Phase 11: Automate with n8n (visual workflows, 400+ integrations)
Phase 12: Monitor with Langfuse / Arize Phoenix (trace, evaluate, improve)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each phase builds on the last. Skip tool design and your agent hallucinates actions. Skip memory and it forgets at step 3. Skip observability and you'll never know &lt;em&gt;why&lt;/em&gt; it failed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Research Papers
&lt;/h2&gt;

&lt;p&gt;→ &lt;strong&gt;Chain-of-Thought Prompting&lt;/strong&gt; (&lt;a href="https://arxiv.org/abs/2201.11903" rel="noopener noreferrer"&gt;Wei et al., 2022&lt;/a&gt;),  Step-by-step reasoning dramatically improves LLM performance on complex tasks&lt;br&gt;
→ &lt;strong&gt;ReAct&lt;/strong&gt; (&lt;a href="https://arxiv.org/abs/2210.03629" rel="noopener noreferrer"&gt;Yao et al., 2022&lt;/a&gt;), The interleaved reasoning-and-acting paradigm that became the industry standard&lt;br&gt;
→ &lt;strong&gt;Toolformer&lt;/strong&gt; (&lt;a href="https://arxiv.org/abs/2302.04761" rel="noopener noreferrer"&gt;Schick et al., 2023&lt;/a&gt;) , LLMs can learn to autonomously decide when and how to use external tools&lt;br&gt;
→ &lt;strong&gt;Generative Agents&lt;/strong&gt; (&lt;a href="https://arxiv.org/abs/2304.03442" rel="noopener noreferrer"&gt;Park et al., 2023&lt;/a&gt;), Believable simulations of human behavior using LLM agents with memory and reflection&lt;/p&gt;




&lt;h2&gt;
  
  
  Framework Quick-Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Framework&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Docs&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LangGraph&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code&lt;/td&gt;
&lt;td&gt;Non-linear workflows, conditional routing&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.langchain.com/langgraph" rel="noopener noreferrer"&gt;langchain.com/langgraph&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LlamaIndex&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code&lt;/td&gt;
&lt;td&gt;Document Q&amp;amp;A, Agentic RAG&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.llamaindex.ai/" rel="noopener noreferrer"&gt;llamaindex.ai&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SmolAgents&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code&lt;/td&gt;
&lt;td&gt;Lightweight code-generating agents&lt;/td&gt;
&lt;td&gt;&lt;a href="https://huggingface.co/docs/smolagents/" rel="noopener noreferrer"&gt;huggingface.co/docs/smolagents&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CrewAI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low-code&lt;/td&gt;
&lt;td&gt;Multi-agent team collaboration&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.crewai.com/" rel="noopener noreferrer"&gt;docs.crewai.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;n8n&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No-code&lt;/td&gt;
&lt;td&gt;Business automation, 400+ integrations&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.n8n.io/" rel="noopener noreferrer"&gt;docs.n8n.io&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Langfuse&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;Tracing, evaluation dashboards&lt;/td&gt;
&lt;td&gt;&lt;a href="https://langfuse.com/" rel="noopener noreferrer"&gt;langfuse.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Arize Phoenix&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;Open-source LLM debugging&lt;/td&gt;
&lt;td&gt;&lt;a href="https://phoenix.arize.com/" rel="noopener noreferrer"&gt;phoenix.arize.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;This guide draws on concepts explored in the official framework documentation from LangGraph, SmolAgents, LlamaIndex, and CrewAI, and the foundational research papers that launched the field. If you found it useful, follow for more deep dives into agent architectures and production ML systems.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>tutorial</category>
      <category>learning</category>
    </item>
    <item>
      <title>From AI Chat tool to Autonomous Solvers: A Developer’s Guide to AI Agents</title>
      <dc:creator>Vinod W</dc:creator>
      <pubDate>Thu, 02 Apr 2026 19:13:36 +0000</pubDate>
      <link>https://forem.com/vinod_wa/from-ai-chat-tool-to-autonomous-solvers-a-developers-guide-to-ai-agents-38dk</link>
      <guid>https://forem.com/vinod_wa/from-ai-chat-tool-to-autonomous-solvers-a-developers-guide-to-ai-agents-38dk</guid>
      <description>&lt;p&gt;The world of AI is moving beyond simple text generation. We are entering the era of AI Agents systems that don't just answer questions but execute complex workflows autonomously. This guide provides a sequential path to understanding, building, and deploying your own agents.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjf9w11jqo6eo28tn8ohn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjf9w11jqo6eo28tn8ohn.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 1: Understanding the Core "Brain"
&lt;/h2&gt;

&lt;p&gt;Before building, you must understand the foundation. AI agents are powered by Large Language Models (LLMs), which act as their reasoning engine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next-Token Prediction&lt;/strong&gt;:&lt;br&gt;
At their simplest, LLMs are engines with billions of parameters trained to predict the next token in a sequence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Emergent Abilities&lt;/strong&gt;:&lt;br&gt;
As these models scale, they develop "emergent abilities": they begin to grasp both the form and the meaning of language, letting them solve tasks they weren't explicitly trained for.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 2: The Heartbeat of an Agent (ReAct &amp;amp; TAO)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Agency&lt;/strong&gt;:&lt;br&gt;
An agent’s "agency" is its level of autonomy. While a chatbot just talks, an agent takes you from Point A (a request) to Point B (a finished outcome, like a booked trip) by planning and making decisions.&lt;/p&gt;

&lt;p&gt;To turn a "static" LLM into an "active" agent, you must implement a reasoning framework. The industry standard is ReAct (Reason + Act).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The TAO Loop&lt;/strong&gt;:&lt;br&gt;
Agents operate in a Thought → Action → Observation cycle:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thought&lt;/strong&gt;:&lt;br&gt;
The agent reasons about the next step.&lt;br&gt;
&lt;strong&gt;Action&lt;/strong&gt;:&lt;br&gt;
It invokes a tool (e.g., a search engine or calculator).&lt;br&gt;
&lt;strong&gt;Observation&lt;/strong&gt;:&lt;br&gt;
It sees the tool's result and updates its memory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory Importance&lt;/strong&gt;:&lt;br&gt;
Without memory, an agent is "stateless" and forgets its progress. Effective agents use short-term and long-term memory to retain context across the loop.&lt;/p&gt;
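The Thought → Action → Observation cycle above can be sketched in a few lines of plain Python. This is a minimal illustration, not any framework's API: `fake_llm` and the `calculator` tool are stand-ins for a real model and real tool integrations, and the `memory` list plays the role of short-term memory.

```python
# Minimal Thought -> Action -> Observation loop with a stubbed LLM.
# fake_llm and the tools below are illustrative stand-ins, not a real model.

def calculator(expression: str) -> str:
    """A trivial 'tool' the agent can invoke."""
    return str(eval(expression))  # fine for a demo; never eval untrusted input

TOOLS = {"calculator": calculator}

def fake_llm(memory: list) -> dict:
    """Stand-in reasoner: decides the next step from the memory so far."""
    if not any(step["type"] == "observation" for step in memory):
        return {"type": "action", "tool": "calculator", "input": "6 * 7"}
    return {"type": "final", "answer": memory[-1]["content"]}

def run_agent(task: str, max_steps: int = 5) -> str:
    memory = [{"type": "task", "content": task}]  # short-term memory
    for _ in range(max_steps):
        step = fake_llm(memory)                      # Thought: pick next step
        if step["type"] == "final":
            return step["answer"]
        result = TOOLS[step["tool"]](step["input"])  # Action: invoke a tool
        memory.append({"type": "observation", "content": result})  # Observation
    return "gave up"

print(run_agent("What is 6 * 7?"))  # prints 42
```

Because each observation is appended to `memory`, the next "thought" sees everything that came before; delete that append and the agent becomes the stateless loop described above, repeating the same action forever.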

&lt;h2&gt;
  
  
  Phase 3: Choose Your Implementation Framework
&lt;/h2&gt;

&lt;p&gt;Depending on your coding preference, you can implement agents using different tiers of frameworks:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Code-First (High Control)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt;:&lt;br&gt;
Best for non-linear workflows. Unlike linear chains, it uses a graph (Nodes, Edges, and State) to allow for loops and complex decision-making.&lt;br&gt;
&lt;strong&gt;LlamaIndex&lt;/strong&gt;:&lt;br&gt;
The leader for Agentic RAG. It allows agents to dynamically decide when and how to fetch data from massive document sets.&lt;br&gt;
&lt;strong&gt;SmolAgents&lt;/strong&gt;:&lt;br&gt;
A minimalist library where agents solve tasks by writing and executing Python code directly, which can be 30% more efficient than traditional JSON-based agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Low-Code (Rapid Orchestration)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt;:&lt;br&gt;
Designed for Multi-Agent Systems. You can define a "Crew" of specialized agents (e.g., a Researcher and a Writer) with specific backstories and goals to collaborate on a single project.&lt;br&gt;
&lt;strong&gt;n8n&lt;/strong&gt;:&lt;br&gt;
A visual editor where you can connect AI nodes to thousands of apps like Gmail or Google Sheets to automate repetitive business tasks without deep coding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 4: A Sequential Implementation Example
&lt;/h2&gt;

&lt;p&gt;If you want to see immediate results, follow this sequential logic to build an Email Sorting Butler:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Define State&lt;/strong&gt;:&lt;br&gt;
Create a shared data object to hold the email content and an &lt;code&gt;is_spam&lt;/code&gt; flag.&lt;br&gt;
&lt;strong&gt;Node 1 (Classify)&lt;/strong&gt;:&lt;br&gt;
Send the email text to an LLM to determine if it is "Spam" or "Ham".&lt;br&gt;
&lt;strong&gt;Conditional Edge&lt;/strong&gt;:&lt;br&gt;
If "Spam," route to a "Delete" node; if "Ham," route to a "Draft Reply" node.&lt;br&gt;
&lt;strong&gt;Node 2 (Draft)&lt;/strong&gt;:&lt;br&gt;
Use the LLM to write a polite response based on the original content.&lt;br&gt;
&lt;strong&gt;Node 3 (Notify)&lt;/strong&gt;:&lt;br&gt;
Present the final draft to the user for review.&lt;/p&gt;
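The five steps above can be sketched in plain Python, with each node as a function that reads and updates a shared state dict. The keyword classifier and the canned draft are stand-ins for LLM calls; a real build would express the same shape as LangGraph nodes and conditional edges.

```python
# Sketch of the email-butler flow: nodes as functions over a shared state dict.
# The keyword classifier and canned reply are stand-ins for LLM calls.

def classify(state: dict) -> dict:                 # Node 1 (Classify)
    spam_words = ("winner", "free money", "click now")
    state["is_spam"] = any(w in state["email"].lower() for w in spam_words)
    return state

def delete(state: dict) -> dict:                   # "Delete" node
    state["result"] = "deleted"
    return state

def draft_reply(state: dict) -> dict:              # Node 2 (Draft, stubbed LLM)
    state["draft"] = f"Thanks for your note: '{state['email'][:30]}...'"
    return state

def notify(state: dict) -> dict:                   # Node 3 (Notify)
    state["result"] = f"review draft: {state['draft']}"
    return state

def run(email: str) -> dict:
    state = classify({"email": email})             # Define State + Node 1
    if state["is_spam"]:                           # Conditional Edge
        return delete(state)
    return notify(draft_reply(state))

print(run("You are a WINNER, claim free money now!")["result"])  # prints: deleted
```

The conditional edge is just the `if` in `run`; in a graph framework it becomes a routing function, which is what lets you add loops (e.g. "redraft until approved") without restructuring the nodes.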

&lt;h2&gt;
  
  
  Phase 5: Observability &amp;amp; Evaluation
&lt;/h2&gt;

&lt;p&gt;Once your agent is running, you must monitor its performance to prevent hallucinations.&lt;/p&gt;

&lt;p&gt;Tools like Langfuse or Arize Phoenix allow you to:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trace Execution&lt;/strong&gt;:&lt;br&gt;
See exactly which tool the agent called and what it thought at every step.&lt;br&gt;
&lt;strong&gt;Evaluate Quality&lt;/strong&gt;:&lt;br&gt;
Score outputs based on:&lt;br&gt;
Faithfulness (is it grounded in facts?)&lt;br&gt;
Relevance (does it answer the prompt?)&lt;/p&gt;
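In production these scores usually come from an LLM-as-judge. As a self-contained illustration, here is a toy scorer that approximates faithfulness and relevance with word overlap; the metric names follow the article, but the overlap heuristic is purely a placeholder for a real judge model.

```python
# Toy evaluation scores: word-overlap stand-ins for LLM-as-judge metrics.
# Faithfulness: how much of the answer is grounded in the retrieved context.
# Relevance: how much of the question's vocabulary the answer addresses.
import string

def _words(text: str) -> set:
    return {w.strip(string.punctuation) for w in text.lower().split()}

def _overlap(answer: str, reference: str) -> float:
    a, r = _words(answer), _words(reference)
    return len(a & r) / len(a) if a else 0.0

def evaluate(question: str, answer: str, context: str) -> dict:
    return {
        "faithfulness": round(_overlap(answer, context), 2),
        "relevance": round(_overlap(answer, question), 2),
    }

scores = evaluate(
    question="When was the Eiffel Tower built?",
    answer="the eiffel tower was built in 1889",
    context="The Eiffel Tower was completed in 1889 in Paris.",
)
print(scores)
```

Swapping `_overlap` for a judge-model call gives you the same interface that tools like Langfuse expect when you attach scores to a trace.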

&lt;p&gt;By following this sequence (understanding the LLM brain, implementing a TAO loop, and monitoring with Langfuse), you can build robust, production-ready AI agents.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>agentskills</category>
      <category>learning</category>
    </item>
    <item>
      <title>Building a Full-Stack AI Memory System in 2 Weeks with Kiro AI IDE</title>
      <dc:creator>Vinod W</dc:creator>
      <pubDate>Thu, 27 Nov 2025 19:08:31 +0000</pubDate>
      <link>https://forem.com/vinod_wa/building-a-full-stack-ai-memory-system-in-2-weeks-with-kiro-ai-ide-5e9h</link>
      <guid>https://forem.com/vinod_wa/building-a-full-stack-ai-memory-system-in-2-weeks-with-kiro-ai-ide-5e9h</guid>
      <description>&lt;h1&gt;
  
  
  &lt;strong&gt;Building a Full-Stack AI Memory System in 2 Weeks with Kiro AI IDE&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;I built &lt;strong&gt;Memory Layer&lt;/strong&gt;, a Chrome extension + Next.js dashboard + FastAPI backend, in 2 weeks using the Kiro AI IDE. Kiro’s spec-driven development, hooks, and steering docs cut development time by almost 70% and helped me ship a complex multi-language system quickly.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;The Challenge: Building a Frankenstein AI System&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Hackathons push you to build way more than you should in way less time.&lt;/p&gt;

&lt;p&gt;My goal for Kiroween 2025:&lt;br&gt;
&lt;strong&gt;Build a universal AI memory system&lt;/strong&gt; that captures a user's conversations across LLMs and enhances future prompts with relevant context.&lt;/p&gt;

&lt;p&gt;The problem?&lt;br&gt;
The stack was a 3-headed monster:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FastAPI + FAISS + Embeddings&lt;/strong&gt; (Python)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Next.js 14 + shadcn/ui&lt;/strong&gt; (TypeScript)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chrome Extension MV3&lt;/strong&gt; (JavaScript)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything needed to integrate &lt;em&gt;flawlessly&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Normally, this setup causes weeks of API mismatches, inconsistent models, and debugging hell.&lt;/p&gt;

&lt;p&gt;Kiro changed that story.&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;What is Kiro?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://kiro.dev" rel="noopener noreferrer"&gt;Kiro&lt;/a&gt; is an AI-powered IDE that blends:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vibe Coding&lt;/strong&gt; (conversational code generation)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Spec-Driven Development&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent Hooks&lt;/strong&gt; (tests, security scans, workflows)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Steering Docs&lt;/strong&gt; (teach the AI your coding style)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of being “autocomplete on steroids”, Kiro acts like a &lt;strong&gt;junior engineer who follows your rules&lt;/strong&gt;, reads your architecture, and writes aligned code.&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;Part 1 -&amp;gt; Specs: The Secret Weapon for Multi-Language Projects&lt;/strong&gt;
&lt;/h2&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Why vibe coding alone wasn’t enough&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;I started by asking Kiro:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Build a FastAPI endpoint to save prompts.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It generated something usable, but…&lt;br&gt;
the frontend expected different fields, the extension sent different names, and the backend validated something else.&lt;/p&gt;

&lt;p&gt;Example mismatch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Backend expected: &lt;code&gt;{ user_id: string, prompt: string }&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Extension sent: &lt;code&gt;{ userId: string, text: string, platform: string }&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Result: hours lost debugging HTTP 400 errors.&lt;/li&gt;
&lt;/ul&gt;
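A tiny adapter would have papered over that mismatch (field names taken from the example above), but it only hides the drift; the durable fix was a shared spec, described next.

```python
# Band-aid adapter translating the extension's payload into the backend's
# schema. Field names come from the mismatch example above; the real fix
# was a shared spec so no adapter is needed.

def to_backend(payload: dict) -> dict:
    return {
        "user_id": payload["userId"],
        "prompt": payload["text"],
        # "platform" was extra on the wire; keep it for logging if present
        "platform": payload.get("platform", "unknown"),
    }

print(to_backend({"userId": "u1", "text": "hello", "platform": "chatgpt"}))
# {'user_id': 'u1', 'prompt': 'hello', 'platform': 'chatgpt'}
```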
&lt;h3&gt;
  
  
  &lt;strong&gt;Fix: a 3-file spec system&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;I introduced a simple, repeatable spec structure:&lt;/p&gt;


&lt;h3&gt;
  
  
  &lt;code&gt;requirements.md&lt;/code&gt; -&amp;gt; &lt;em&gt;What to build&lt;/em&gt;
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;### FR-1: Save Prompt Endpoint&lt;/span&gt;
&lt;span class="gs"&gt;**Priority:**&lt;/span&gt; High  
The backend must accept and store user prompts.

&lt;span class="gs"&gt;**Acceptance Criteria:**&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; AC-1.1: Accepts user_id, prompt text, platform
&lt;span class="p"&gt;-&lt;/span&gt; AC-1.2: Responds within 200ms
&lt;span class="p"&gt;-&lt;/span&gt; AC-1.3: Stores vector embedding in FAISS
&lt;span class="p"&gt;-&lt;/span&gt; AC-1.4: Returns prompt_id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;design.md&lt;/code&gt; -&amp;gt; &lt;em&gt;How to build it&lt;/em&gt;
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;### Save Prompt Endpoint Design&lt;/span&gt;

&lt;span class="gs"&gt;**Route:**&lt;/span&gt; POST /save-prompt  
&lt;span class="gs"&gt;**Validation:**&lt;/span&gt; Pydantic SavePromptRequest  
&lt;span class="gs"&gt;**Storage:**&lt;/span&gt; FAISS IndexFlatL2  
&lt;span class="gs"&gt;**Response:**&lt;/span&gt; { success: bool, prompt_id: int }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;tasks.md&lt;/code&gt; -&amp;gt; &lt;em&gt;Step-by-step implementation&lt;/em&gt;
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;### TASK-6: Implement Save Prompt Endpoint&lt;/span&gt;
&lt;span class="gs"&gt;**Status:**&lt;/span&gt; TODO  
&lt;span class="gs"&gt;**Acceptance Criteria:**&lt;/span&gt; AC-1.1 → AC-1.4
&lt;span class="p"&gt;
1.&lt;/span&gt; Create Pydantic model
&lt;span class="p"&gt;2.&lt;/span&gt; Implement POST route
&lt;span class="p"&gt;3.&lt;/span&gt; Add FAISS embedding + metadata
&lt;span class="p"&gt;4.&lt;/span&gt; Return { success, prompt_id }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
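TASK-6 maps almost line-for-line onto code. A standalone sketch of the endpoint logic follows, with stdlib dataclasses standing in for the Pydantic model and a plain dict standing in for the FAISS index so it runs without the project's dependencies.

```python
# Standalone sketch of TASK-6: validate -> store -> return {success, prompt_id}.
# A dataclass stands in for the Pydantic model, a dict for FAISS + metadata.
from dataclasses import dataclass

@dataclass
class SavePromptRequest:          # AC-1.1: user_id, prompt text, platform
    user_id: str
    prompt: str
    platform: str

_index: dict[int, SavePromptRequest] = {}  # stand-in for the FAISS index

def save_prompt(request: SavePromptRequest) -> dict:
    prompt_id = len(_index) + 1   # AC-1.4: return prompt_id
    _index[prompt_id] = request   # AC-1.3: store (embedding step elided)
    return {"success": True, "prompt_id": prompt_id}

print(save_prompt(SavePromptRequest("u1", "remember this", "chatgpt")))
# {'success': True, 'prompt_id': 1}
```

The response shape matches the `{ success: bool, prompt_id: int }` contract in `design.md`, which is exactly what let Kiro generate matching frontend and extension calls.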



&lt;h3&gt;
  
  
  &lt;strong&gt;Impact&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Once specs existed, Kiro:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generated consistent backend code&lt;/li&gt;
&lt;li&gt;Generated frontend API functions that matched the contract&lt;/li&gt;
&lt;li&gt;Generated Chrome extension network calls using the same model&lt;/li&gt;
&lt;li&gt;Prevented drift completely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Estimated time saved: ~15 hours&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;Part 2 -&amp;gt; Agent Hooks: Automated Testing &amp;amp; Security&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Kiro’s hooks became my personal QA team.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Hook 1 -&amp;gt; Test on Save (Python)&lt;/strong&gt;
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Run Tests on Save"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"trigger"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"onSave"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"filePattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"**/*.py"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"actions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"executeCommand"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pytest -v --tb=short"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This caught:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Incorrect return types&lt;/li&gt;
&lt;li&gt;Missing type hints&lt;/li&gt;
&lt;li&gt;FAISS dimension mismatch&lt;/li&gt;
&lt;li&gt;A similarity threshold bug&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Hook 2 -&amp;gt; AI Security Scanner&lt;/strong&gt;
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Security Scan"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"trigger"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"onSave"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"filePattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"**/{auth,api,main}.py"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"actions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Scan ${file} for security issues: injections, secrets, weak validation."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;It flagged:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Improper JWT handling&lt;/li&gt;
&lt;li&gt;Missing input validation&lt;/li&gt;
&lt;li&gt;Overly verbose error messages&lt;/li&gt;
&lt;li&gt;Potential rate-limit bypass&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These weren’t theoretical -&amp;gt; they were real vulnerabilities caught early.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Hook 3 -&amp;gt; Lint on Save (Disabled)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Too noisy during fast prototyping.&lt;br&gt;
Lesson learned: &lt;strong&gt;match tooling to development phase&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;Part 3 -&amp;gt; Steering Docs: Teaching Kiro to Code Like Me&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Without steering docs, Kiro generated inconsistent styles.&lt;/p&gt;

&lt;p&gt;With them, it produced code like a trained team member.&lt;/p&gt;
&lt;h3&gt;
  
  
  Example steering doc:
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# FastAPI Patterns&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Use async def for all routes
&lt;span class="p"&gt;-&lt;/span&gt; Use Pydantic models for validation
&lt;span class="p"&gt;-&lt;/span&gt; Use Depends() for auth
&lt;span class="p"&gt;-&lt;/span&gt; Add type hints everywhere
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Example output &lt;strong&gt;after&lt;/strong&gt; steering:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SavePromptRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Depends&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;get_current_user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;SavePromptResponse&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Save user prompt to memory layer.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;prompt_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;SavePromptResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;success&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean. Typed. Validated. Secure.&lt;br&gt;
Generated automatically.&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;Part 4 -&amp;gt; Hybrid Workflow: Specs + Vibe Coding = Best of Both&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;My final workflow:&lt;/p&gt;
&lt;h3&gt;
  
  
  Use &lt;strong&gt;Specs&lt;/strong&gt; for:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;API contracts&lt;/li&gt;
&lt;li&gt;Multi-service features&lt;/li&gt;
&lt;li&gt;Authentication&lt;/li&gt;
&lt;li&gt;Data models&lt;/li&gt;
&lt;li&gt;Vector search workflows&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Use &lt;strong&gt;Vibe Coding&lt;/strong&gt; for:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;UI components&lt;/li&gt;
&lt;li&gt;Animations&lt;/li&gt;
&lt;li&gt;Utility functions&lt;/li&gt;
&lt;li&gt;Chrome extension DOM logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This hybrid approach hit the perfect balance.&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;Case Study -&amp;gt; Building the Chrome Extension&lt;/strong&gt;
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Step 1: Spec the flow
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;### Prompt Enhancement Flow&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Capture typed prompt
&lt;span class="p"&gt;-&lt;/span&gt; Save to backend
&lt;span class="p"&gt;-&lt;/span&gt; Fetch relevant context (&amp;lt;2s)
&lt;span class="p"&gt;-&lt;/span&gt; Inject enhanced prompt
&lt;span class="p"&gt;-&lt;/span&gt; Auto-click send
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Design it
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="p"&gt;1.&lt;/span&gt; Listen to textarea
&lt;span class="p"&gt;2.&lt;/span&gt; POST prompt to backend
&lt;span class="p"&gt;3.&lt;/span&gt; GET memory/context
&lt;span class="p"&gt;4.&lt;/span&gt; Build enhanced prompt
&lt;span class="p"&gt;5.&lt;/span&gt; Insert + send
&lt;span class="p"&gt;6.&lt;/span&gt; Capture response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Generate the extension using vibes
&lt;/h3&gt;

&lt;p&gt;My prompt to Kiro:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Generate a content script that implements the 7-step enhancement flow, observes responses, and injects a Memory Layer button.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Result:&lt;br&gt;
~300 lines of working code with DOM listeners, a MutationObserver, UI injection, and error handling.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 4: Hooks catch early issues
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Wrong selector&lt;/li&gt;
&lt;li&gt;Missing null-check&lt;/li&gt;
&lt;li&gt;Response observer timing bug&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Total dev time: &lt;strong&gt;3 hours (vs ~12 hours manually)&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;Advanced Techniques I Used&lt;/strong&gt;
&lt;/h2&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;1. DRY Specs with Cross-References&lt;/strong&gt;
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# API Contract (Backend)&lt;/span&gt;
interface SavePromptRequest {
  user_id: string;
  prompt: string;
  platform: string;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Frontend just references it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;See backend/specs for SavePromptRequest shape
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Kiro keeps them synced.&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;2. Conditional Steering&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;inclusion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;fileMatch&lt;/span&gt;
&lt;span class="na"&gt;fileMatchPattern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**/*.tsx"&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# React Patterns...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Steering applies only where relevant.&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;3. Hook Chaining&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Save file
 → Run tests
   → Security scan
     → Type-check
       → Commit allowed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  &lt;strong&gt;4. Steering Inheritance&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Base style + language-specific style = perfect consistency.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Results&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Development Speed&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;15,000+ lines&lt;/strong&gt; generated&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;6+ weeks → 2 weeks&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;~&lt;strong&gt;70% faster&lt;/strong&gt; overall&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Code Quality&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;100% typed&lt;/li&gt;
&lt;li&gt;85% test coverage&lt;/li&gt;
&lt;li&gt;12 bugs caught pre-commit&lt;/li&gt;
&lt;li&gt;7 real security issues fixed early&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Component Breakdown&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Backend: 2.5k LOC → 8 hours (would be 30+)&lt;/li&gt;
&lt;li&gt;Web app: 1.2k LOC → 6 hours&lt;/li&gt;
&lt;li&gt;Extension: 1.5k LOC → 10 hours&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;What I’d Do Differently&lt;/strong&gt;
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Avoid over-specifying UI&lt;/li&gt;
&lt;li&gt;Start with fewer hooks&lt;/li&gt;
&lt;li&gt;Make steering docs more specific earlier&lt;/li&gt;
&lt;li&gt;Depend on tasks.md sooner&lt;/li&gt;
&lt;li&gt;Never ignore hook warnings&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Final Architecture Overview&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chrome Extension → FastAPI → FAISS → OpenAI Embeddings
        ↑                 ↓
        └── Next.js Dashboard + Supabase Auth
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All glued together by specs + hooks + steering.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Try It Yourself&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Install Kiro
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://kiro.dev" rel="noopener noreferrer"&gt;https://kiro.dev&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Create your first spec
&lt;/h3&gt;

&lt;h3&gt;
  
  
  3. Add a steering doc
&lt;/h3&gt;

&lt;h3&gt;
  
  
  4. Add a test hook
&lt;/h3&gt;

&lt;h3&gt;
  
  
  5. Vibe code your first feature
&lt;/h3&gt;

&lt;h3&gt;
  
  
  6. Watch everything integrate on the first try
&lt;/h3&gt;

&lt;p&gt;Starter template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/vinodwaghmare/webapp-memgenx-kiro
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Building Memory Layer proved one thing clearly:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;AI-assisted development doesn’t replace engineers -&amp;gt; it amplifies them.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With specs, hooks, and steering docs, Kiro let me build a multi-language, multi-repo, fully integrated AI product in 2 weeks.&lt;/p&gt;

&lt;p&gt;If you’re building anything full-stack + AI, try this workflow once. You’ll never go back.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Never lose context again. Built with Kiro. 🎃&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Resources&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Project: &lt;a href="https://memgen-x-webapp.vercel.app/" rel="noopener noreferrer"&gt;https://memgen-x-webapp.vercel.app/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Kiro IDE: &lt;a href="https://kiro.dev" rel="noopener noreferrer"&gt;https://kiro.dev&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Demo Videos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Backend: &lt;a href="https://youtube.com/watch?v=assOjyddKb0" rel="noopener noreferrer"&gt;https://youtube.com/watch?v=assOjyddKb0&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Extension: &lt;a href="https://youtube.com/watch?v=MR6LTC0IZBE" rel="noopener noreferrer"&gt;https://youtube.com/watch?v=MR6LTC0IZBE&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Dashboard: &lt;a href="https://youtube.com/watch?v=ya8lWP_vPZ8" rel="noopener noreferrer"&gt;https://youtube.com/watch?v=ya8lWP_vPZ8&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




</description>
      <category>kiro</category>
      <category>ai</category>
      <category>hackathon</category>
      <category>fullstack</category>
    </item>
  </channel>
</rss>
