<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Charles Wu</title>
    <description>The latest articles on Forem by Charles Wu (@_4f268336f6580845cdc475).</description>
    <link>https://forem.com/_4f268336f6580845cdc475</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3805236%2Fc2ca81a3-8882-48ac-9ce7-046a056995ab.jpg</url>
      <title>Forem: Charles Wu</title>
      <link>https://forem.com/_4f268336f6580845cdc475</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/_4f268336f6580845cdc475"/>
    <language>en</language>
    <item>
      <title>I Built a Knowledge Base That Thinks — Inspired by Karpathy’s LLM Wiki</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Thu, 30 Apr 2026 02:15:07 +0000</pubDate>
      <link>https://forem.com/seekdb/i-built-a-knowledge-base-that-thinks-inspired-by-karpathys-llm-wiki-128l</link>
      <guid>https://forem.com/seekdb/i-built-a-knowledge-base-that-thinks-inspired-by-karpathys-llm-wiki-128l</guid>
      <description>&lt;p&gt;&lt;em&gt;Notes pile up and go stale. This tool updates your knowledge base automatically — inspired by Karpathy’s LLM Wiki.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ssk6hu32qp9t8j5j23v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ssk6hu32qp9t8j5j23v.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaways&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Inspired by Karpathy’s LLM Wiki, &lt;a href="https://www.npmjs.com/package/ex-brain" rel="noopener noreferrer"&gt;ex-brain&lt;/a&gt; is an open-source CLI that compiles new information into existing knowledge pages, extracts timelines, and builds entity links automatically — so your notes stay current instead of just piling up.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The search layer uses seekdb’s native hybrid search (BM25 + vector similarity in one query), with built-in AI functions for embedding and reranking — no external retrieval pipeline needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ships with a built-in MCP server so Claude can read, write, search, and compile your knowledge base directly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Andrej Karpathy’s &lt;a href="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f" rel="noopener noreferrer"&gt;LLM Wiki&lt;/a&gt; dropped a simple idea: store knowledge as plain text, let an LLM understand and update it. Garry Tan’s GBrain ran with the same concept. Both projects prove that LLM + local storage is a surprisingly powerful combination for personal knowledge management.&lt;/p&gt;

&lt;p&gt;But after using them, I kept hitting the same wall: notes pile up, nothing gets updated, and finding connections between pieces of knowledge requires me to do all the work. So I built ex-brain — a CLI tool that compiles, links, and evolves a personal knowledge base using LLMs.&lt;/p&gt;

&lt;h2&gt;
  
  
  What ex-brain Does
&lt;/h2&gt;

&lt;p&gt;At a high level, ex-brain provides four mechanisms that standard note-taking tools don’t:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Smart compilation — New information updates existing knowledge instead of just appending to it&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Automatic timeline extraction — Events are pulled from text and organized chronologically&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Entity linking — Relationships between people, companies, and concepts are detected and cross-referenced automatically&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Hybrid search — Keyword precision and semantic understanding in one query, powered by &lt;a href="https://www.seekdb.ai/" rel="noopener noreferrer"&gt;seekdb&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result: a knowledge base that behaves less like a filing cabinet and more like a memory that keeps itself current.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem with “Just Take Notes”
&lt;/h2&gt;

&lt;p&gt;Tools like Notion and Obsidian are great at storing information. They’re terrible at keeping it current. You write a note about a company’s Series A in March, their new CEO in June, and their Series B in August — and six months later, you have to read all three notes and mentally reconstruct the current state.&lt;/p&gt;

&lt;p&gt;AI-powered alternatives like Mem or Granola add summarization, but the intelligence is a black box. You can’t control how it categorizes, what it prioritizes, or when it decides something is outdated.&lt;/p&gt;

&lt;p&gt;The human brain doesn’t work this way. When you learn that a company raised a Series B, you don’t file it next to the Series A note — you update your mental model. The Series A becomes history. The Series B becomes current state.&lt;/p&gt;

&lt;p&gt;ex-brain applies the same principle to a knowledge base.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mechanism 1: Compiled Truth
&lt;/h2&gt;

&lt;p&gt;Run a single command to feed new information into an existing knowledge page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ebrain compile companies/river-ai &lt;span class="se"&gt;\ &lt;/span&gt; 
&lt;span class="s2"&gt;"River AI closed Series A, &lt;/span&gt;&lt;span class="nv"&gt;$50M&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\ &lt;/span&gt; 
&lt;span class="nt"&gt;--source&lt;/span&gt; meeting_notes &lt;span class="se"&gt;\ &lt;/span&gt; 
&lt;span class="nt"&gt;--date&lt;/span&gt; 2024-05-20
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM analyzes the information type — is this a status change (funding stage moved from Seed to Series A), a new fact (founded in 2020), or an event (product launched)? — then applies the right update strategy.&lt;/p&gt;
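
&lt;p&gt;As a minimal sketch (not ex-brain's actual implementation; the classification labels and page fields are assumptions for illustration), that dispatch could look like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Illustrative sketch only. The labels returned by the classifier and
// the page fields are assumptions, not ex-brain's real schema.
async function compilePage(page, info, meta) {
  const kind = await classifyWithLlm(info);
  if (kind === "status_change") {
    page.history.push(page.status); // the old status becomes history
    page.status = { value: info, source: meta.source, date: meta.date };
  } else if (kind === "fact") {
    page.facts.push(info); // new facts simply append
  } else {
    page.timeline.push({ date: meta.date, summary: info }); // events join the timeline
  }
  return page;
}

// Stub standing in for the LLM classification call.
async function classifyWithLlm(text) {
  if (/Series [A-Z]/.test(text)) return "status_change";
  if (/launched|appointed|closed/i.test(text)) return "event";
  return "fact";
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;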

&lt;p&gt;The compiled page always reflects current truth:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Status&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Funding Stage**&lt;/span&gt;: Series A (Source: meeting_notes, 2024-05-20)
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Valuation**&lt;/span&gt;: ~$50M

&lt;span class="gu"&gt;## History&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Previously Seed (until 2024-05-20)

&lt;span class="gu"&gt;## Facts&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Series A led by Sequoia
&lt;span class="p"&gt;-&lt;/span&gt; Founded 2020
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No manual reorganization. No stale information buried in a page you’ll never re-read.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mechanism 2: Timeline Extraction
&lt;/h2&gt;

&lt;p&gt;Time is the axis that makes knowledge useful. ex-brain extracts events from compiled pages and structures them chronologically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ebrain timeline extract companies/river-ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt; 
 &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;  
   &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-05-20"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
     &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Series A closed, $50M"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
       &lt;/span&gt;&lt;span class="nl"&gt;"detail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Led by Sequoia"&lt;/span&gt;&lt;span class="w"&gt; 
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; 
         &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;   
          &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-06-15"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
            &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sarah Chen appointed CEO"&lt;/span&gt;&lt;span class="w"&gt; 
             &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
           &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Date parsing handles ISO, natural language (last week, yesterday), and localized formats. Timeline extraction runs automatically during compilation — every compile that contains an event adds it to the timeline.&lt;/p&gt;
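
&lt;p&gt;ex-brain's parser isn't shown in this post; as a sketch of the idea, a library such as chrono-node can normalize all three forms to ISO dates (chrono-node is an assumed stand-in here, not necessarily what ex-brain uses):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch: normalizing mixed date formats with chrono-node (an assumed
// stand-in, not necessarily what ex-brain uses internally).
const chrono = require("chrono-node");

function normalizeDate(text, refDate) {
  const parsed = chrono.parseDate(text, refDate); // handles ISO, "yesterday", "last week", ...
  return parsed ? parsed.toISOString().slice(0, 10) : null;
}

console.log(normalizeDate("2024-05-20"));                        // "2024-05-20"
console.log(normalizeDate("last week", new Date("2024-05-27"))); // about a week earlier
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;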

&lt;h2&gt;
  
  
  Mechanism 3: Entity Linking
&lt;/h2&gt;

&lt;p&gt;A piece of knowledge is rarely about one thing. “Ali Partovi is the founder of Neo” connects a person, an organization, and a role. ex-brain uses LLMs to detect these relationships:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ebrain put people/ali-partovi &lt;span class="nt"&gt;--file&lt;/span&gt; notes.md

&lt;span class="c"&gt;# Detected:&lt;/span&gt;
&lt;span class="c"&gt;# - Ali Partovi founder_of Neo&lt;/span&gt;
&lt;span class="c"&gt;# - Ali Partovi invested_in [other companies]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a new entity is detected, the system creates a stub page for it automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# people/sarah-chen&lt;/span&gt;

&lt;span class="gu"&gt;## Facts&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**CEO_of**&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;River AI&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sx"&gt;companies/river-ai&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;: appointed June 2024
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The knowledge graph grows organically as you add information. No manual tagging, no predefined ontologies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mechanism 4: Hybrid Search with seekdb
&lt;/h2&gt;

&lt;p&gt;Single-mode search breaks down fast in a knowledge base. Full-text search is precise but misses semantics — search “funding” and you won’t find “financing round.” Vector search understands meaning but can be noisy — search “Sequoia” and you might get results about trees.&lt;/p&gt;

&lt;p&gt;ex-brain uses seekdb as its search and storage layer. seekdb is an AI-native database that unifies vector search, full-text search, and scalar filtering in a single engine. One query combines BM25 keyword matching with vector similarity — no need to stitch two retrieval systems together.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Keyword search&lt;/span&gt;
ebrain search &lt;span class="s2"&gt;"River AI Series A"&lt;/span&gt;

&lt;span class="c"&gt;# Semantic queryebrain query&lt;/span&gt;
 &lt;span class="s2"&gt;"Which companies raised funding recently?"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under the hood, seekdb supports multi-stage retrieval: vector and full-text indexes recall candidates independently, then results are fused via weighted combination or Reciprocal Rank Fusion (RRF), with optional LLM-based reranking for precision.&lt;/p&gt;
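
&lt;p&gt;RRF itself is simple enough to show in a few lines. This is the textbook formula (each list contributes 1 / (k + rank) per document), not seekdb's internal code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Textbook Reciprocal Rank Fusion: each result list contributes
// 1 / (k + rank) for every document it recalled; k = 60 is a common default.
function rrf(resultLists, k) {
  if (k === undefined) k = 60;
  const scores = new Map();
  for (const list of resultLists) {
    list.forEach(function (docId, rank) {
      const prev = scores.get(docId) || 0;
      scores.set(docId, prev + 1 / (k + rank + 1)); // ranks are 0-based here
    });
  }
  return [...scores.entries()].sort(function (a, b) { return b[1] - a[1]; });
}

// Fuse a BM25 list with a vector-similarity list:
console.log(rrf([["river-ai", "neo"], ["neo", "river-ai", "sequoia"]]));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;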

&lt;p&gt;ex-brain adds a scoring layer on top (see the sketch after this list):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Semantic relevance (85%) — vector similarity&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Freshness (10%) — recently updated content ranks higher&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Type weight (5%) — people pages get a slight boost&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
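
&lt;p&gt;Expressed as a function with those weights (the fields on the hit object are assumptions about ex-brain's schema):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Final ranking score with the weights above. The fields on `hit` are
// assumptions about ex-brain's schema, for illustration only.
function finalScore(hit, now) {
  const ageDays = (now - hit.updatedAt) / 86_400_000; // ms per day
  const freshness = Math.exp(-ageDays / 30);          // assumed decay curve (~a month)
  const typeWeight = hit.type === "person" ? 1 : 0.5; // people pages get a slight boost
  return 0.85 * hit.similarity + 0.10 * freshness + 0.05 * typeWeight;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;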

&lt;h2&gt;
  
  
  Why seekdb
&lt;/h2&gt;

&lt;p&gt;Several properties made seekdb the right fit for this project:&lt;/p&gt;

&lt;p&gt;Embedded mode, zero ops. seekdb runs as a single database file — no server process, no Docker container. For a local-first personal tool, this is the lightest possible deployment. It runs comfortably on 1 CPU core and 2 GB of memory.&lt;/p&gt;

&lt;p&gt;Native hybrid search. Vector search (HNSW, IVF, and quantized variants), full-text search (BM25 with phrase and boolean matching), and scalar filtering — all in one engine with multi-stage ranking pipelines.&lt;/p&gt;

&lt;p&gt;Built-in AI functions. AI_EMBED generates vector embeddings in SQL. AI_COMPLETE runs text generation. AI_RERANK applies reranking models. These work with OpenAI, DashScope, or custom model endpoints. Embedding, retrieval, and inference happen inside the database — no external pipeline needed.&lt;/p&gt;

&lt;p&gt;SQL-compatible. seekdb is built on the OceanBase engine and speaks MySQL-compatible SQL. Standard CREATE TABLE, CREATE INDEX, and query syntax. Full ACID transactions with real-time write visibility.&lt;/p&gt;
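
&lt;p&gt;Together, those two properties mean any MySQL-compatible client can in principle call the AI functions from ordinary SQL. A hedged sketch (the exact AI_EMBED signature below is an assumption; check the seekdb docs for the real one):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch: calling an in-database AI function through a standard MySQL
// client. Connection details and the AI_EMBED argument shape are
// assumptions for illustration; consult the seekdb docs for specifics.
const mysql = require("mysql2/promise");

async function embedInDatabase(text) {
  const conn = await mysql.createConnection({
    host: "127.0.0.1",
    user: "root",
    database: "ebrain",
  });
  const [rows] = await conn.query("SELECT AI_EMBED(?) AS vec", [text]); // signature assumed
  await conn.end();
  return rows[0].vec;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;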

&lt;p&gt;Multi-model data. Vectors, text, scalars, JSON, and GIS data coexist in the same engine. ex-brain stores structured metadata (page properties, entity links) and unstructured content (text, embeddings) in one database.&lt;/p&gt;

&lt;p&gt;Here’s the core integration code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Connect — it's just a file pathconst&lt;/span&gt;
 &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;BrainDb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;~/.ebrain/data/ebrain.db&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

 &lt;span class="c1"&gt;// Create a vector collection&lt;/span&gt;
 &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getOrCreateCollection&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; 
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ebrain_pages&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
   &lt;span class="na"&gt;embeddingFunction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;createBrainEmbeddingFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;),});&lt;/span&gt;
   &lt;span class="c1"&gt;// Hybrid search&lt;/span&gt;
   &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hybridSearch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; 
    &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;whereDocument&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$contains&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;funding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; 
     &lt;span class="na"&gt;nResults&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  MCP Integration
&lt;/h2&gt;

&lt;p&gt;ex-brain ships with a built-in MCP server. If you use Claude, connect it in one step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; 
 &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;  
   &lt;/span&gt;&lt;span class="nl"&gt;"ebrain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;   
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ebrain"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   
         &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"serve"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; 
            &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; 
             &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude can then read pages (brain_get), write pages (brain_put), search (brain_search), compile new information (brain_compile), and create links (brain_link) — directly against your local knowledge base.&lt;/p&gt;
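
&lt;p&gt;Under the hood, MCP tool invocations are JSON-RPC 2.0 tools/call requests. A sketch of what a brain_search call might look like on the wire (the query argument name is an assumption about the tool's schema):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// What an MCP tool call looks like on the wire (JSON-RPC 2.0).
// The "query" argument name is an assumption about brain_search's schema.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "brain_search",
    arguments: { query: "River AI funding" },
  },
};
console.log(JSON.stringify(request, null, 2));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;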

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Installbun&lt;/span&gt;
 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; ex-brain

&lt;span class="c"&gt;# Initialize&lt;/span&gt;
ebrain init
&lt;span class="c"&gt;# Create your first page&lt;/span&gt;
ebrain put companies/river-ai &lt;span class="nt"&gt;--type&lt;/span&gt; company &lt;span class="nt"&gt;--content&lt;/span&gt; &lt;span class="s2"&gt;"
River AI is an AI analytics platform.
Founded 2020."&lt;/span&gt;

&lt;span class="c"&gt;# Compile new information&lt;/span&gt;
ebrain compile companies/river-ai &lt;span class="se"&gt;\ &lt;/span&gt;
 &lt;span class="s2"&gt;"River AI closed Series A, Sequoia led"&lt;/span&gt; &lt;span class="se"&gt;\ &lt;/span&gt; 
 &lt;span class="nt"&gt;--source&lt;/span&gt; news &lt;span class="se"&gt;\ &lt;/span&gt; 
 &lt;span class="nt"&gt;--date&lt;/span&gt; 2024-05-20

 &lt;span class="c"&gt;# Search&lt;/span&gt;
 ebrain search &lt;span class="s2"&gt;"River AI funding"&lt;/span&gt;

 &lt;span class="c"&gt;# Start MCP servere&lt;/span&gt;
 brain serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What’s Next
&lt;/h2&gt;

&lt;p&gt;ex-brain is early-stage. The compilation logic isn’t perfect, timeline extraction occasionally misses events, and entity detection produces false positives. But the core idea works: knowledge should update itself when new information arrives, not just accumulate.&lt;/p&gt;

&lt;p&gt;A few directions worth exploring: conflict detection when new information contradicts existing records, confidence decay for stale data, bidirectional propagation when linked entities change, and batch compilation for high-volume ingestion.&lt;/p&gt;

&lt;p&gt;If you’re interested in building knowledge tools — or if you just want a second brain that actually keeps up — check out ex-brain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About seekdb&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;ex-brain’s storage and retrieval layer is powered by seekdb — an open-source, AI-native database that unifies vector search, full-text search, structured data, and built-in AI functions in a single engine. Whether you’re building RAG pipelines, semantic search, or AI agent applications, seekdb handles storage and retrieval without the need to stitch together multiple systems.&lt;/p&gt;

&lt;p&gt;If you’re building an application that needs storage + semantic search + AI inference, give seekdb a try:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Website: &lt;a href="https://www.seekdb.ai/" rel="noopener noreferrer"&gt;https://www.seekdb.ai/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;GitHub: &lt;a href="https://github.com/oceanbase/seekdb" rel="noopener noreferrer"&gt;https://github.com/oceanbase/seekdb&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Install: &lt;code&gt;pip install -U pyseekdb&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Docs: &lt;a href="https://docs.seekdb.ai/seekdb/seekdb-overview/" rel="noopener noreferrer"&gt;https://docs.seekdb.ai/seekdb/seekdb-overview/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>llm</category>
      <category>bigdata</category>
    </item>
    <item>
      <title>How to Write Workflow Skills: Patterns and Best Practices Distilled from 7 Top Projects</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Wed, 29 Apr 2026 02:15:20 +0000</pubDate>
      <link>https://forem.com/seekdb/how-to-write-workflow-skills-patterns-and-best-practices-distilled-from-7-top-projects-2ip</link>
      <guid>https://forem.com/seekdb/how-to-write-workflow-skills-patterns-and-best-practices-distilled-from-7-top-projects-2ip</guid>
      <description>&lt;p&gt;&lt;em&gt;Five patterns distilled from Skills at OpenAI, Google Labs, obra, and more.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dkrt3eg3ol24uisi04r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dkrt3eg3ol24uisi04r.png" alt=" " width="720" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is a Skill?
&lt;/h2&gt;

&lt;p&gt;A Skill is a folder centered around a SKILL.md file, using YAML frontmatter + Markdown body format. When an LLM determines a Skill is needed, it invokes the skill tool to load it. The entire content of SKILL.md is injected into the conversation context as a tool-result, and the LLM autonomously decides how to execute the instructions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my-skill/
├── SKILL.md          # Main file (required)
├── scripts/          # Executable scripts (optional)
├── references/       # Detailed reference docs (optional, load on demand)
├── resources/        # Templates, checklists, etc. (optional)
└── examples/         # Examples (optional)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key Mechanism: A Skill is essentially “knowledge injection” — it doesn’t dynamically generate new tools. Instead, it injects instruction text into the LLM’s context, and the LLM executes those instructions using existing tools (bash, read, edit, etc.).&lt;/p&gt;
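
&lt;p&gt;A minimal sketch of that mechanism: read SKILL.md and hand its text back as a tool result (the result shape below is an illustrative assumption, not part of the spec):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Minimal sketch of the skill-tool mechanism: the "tool" is essentially
// a file read whose output lands in context. The result shape is an
// illustrative assumption, not part of the spec.
const fs = require("fs");

function loadSkill(skillDir) {
  const body = fs.readFileSync(skillDir + "/SKILL.md", "utf8");
  return { role: "tool", content: body }; // injected into the conversation
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;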

&lt;h2&gt;
  
  
  Frontmatter: The “Facade” That Determines Whether a Skill Gets Loaded
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Required Fields
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7bdqfggfh2pepriz7w2m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7bdqfggfh2pepriz7w2m.png" alt=" " width="720" height="162"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How You Write description Determines Load Rate
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Good description — includes trigger phrases and keywords&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="s"&gt;Deploy applications and websites to Vercel. Use when the user&lt;/span&gt;
  &lt;span class="s"&gt;requests deployment actions like "deploy my app", "push this live",&lt;/span&gt;
  &lt;span class="s"&gt;or "create a preview deployment".&lt;/span&gt;

&lt;span class="c1"&gt;# Good description - defines temporal position&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="s"&gt;Use when implementing any feature or bugfix, before writing&lt;/span&gt;
  &lt;span class="s"&gt;implementation code&lt;/span&gt;

&lt;span class="c1"&gt;# Bad description - too vague&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Helps with deployment stuff&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Core Principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;List trigger phrases: Include the phrases users might actually say (“deploy my app”, “push this live”)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Define temporal position: Explain “before/after what” (e.g., “before writing implementation code”)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Include product keywords: If covering a large platform, list all product names&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Optional Extended Fields
&lt;/h2&gt;

&lt;p&gt;Extended fields observed across the 7 Skills:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvxva8nspbtbhs84klq25.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvxva8nspbtbhs84klq25.png" alt=" " width="720" height="353"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Five Patterns (Author’s Synthesis)
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Pattern 1: Linear Workflow
&lt;/h2&gt;

&lt;p&gt;Applicable Scenario: Operations with clear steps like deployment, installation, or migration.&lt;/p&gt;

&lt;p&gt;Representative: openai/skills — vercel-deploy (77 lines) [1]&lt;/p&gt;

&lt;p&gt;Structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Title&lt;/span&gt;
&lt;span class="gu"&gt;## Prerequisites&lt;/span&gt;
&lt;span class="gu"&gt;## Quick Start (Main flow: Step 1 → 2 → 3)&lt;/span&gt;
&lt;span class="gu"&gt;## Fallback&lt;/span&gt;
&lt;span class="gu"&gt;## Troubleshooting&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key Techniques:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqadikdwdk7uo1ooup7y1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqadikdwdk7uo1ooup7y1.png" alt=" " width="720" height="317"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Decision Rule: If your Skill can be described as “first do A, then do B, finally do C”, use the Linear pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 2: Decision Tree + Load-on-Demand
&lt;/h2&gt;

&lt;p&gt;Applicable Scenario: Large platform selection, product navigation, problem diagnosis.&lt;/p&gt;

&lt;p&gt;Representative: openai/skills — cloudflare-deploy (224 lines) [2]&lt;/p&gt;

&lt;p&gt;Structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Title&lt;/span&gt;
&lt;span class="gu"&gt;## Authentication (auth prerequisite)&lt;/span&gt;
&lt;span class="gu"&gt;## Quick Decision Trees&lt;/span&gt;
&lt;span class="gu"&gt;### "I need to run code" (classified by user intent)&lt;/span&gt;
&lt;span class="gu"&gt;### "I need to store data"&lt;/span&gt;
&lt;span class="gu"&gt;### "I need AI/ML"&lt;/span&gt;
&lt;span class="gu"&gt;## Product Index&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key Techniques:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fva5li2vvl0zyqdrt16w9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fva5li2vvl0zyqdrt16w9.png" alt=" " width="720" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Decision Rule: If your Skill covers a knowledge domain with 10+ branches, each with extensive detailed documentation, use the Decision Tree pattern.&lt;/p&gt;

&lt;p&gt;Advanced: The same knowledge domain can be split into two Skills:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Navigation type (cloudflare): Selection only, no operations&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Operational type (cloudflare-deploy): Includes auth, commands, troubleshooting&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Pattern 3: Loop Iteration
&lt;/h2&gt;

&lt;p&gt;Applicable Scenario: TDD, code review, design review — processes requiring repeated execution.&lt;/p&gt;

&lt;p&gt;Representative: obra/superpowers — test-driven-development (371 lines) [3]&lt;/p&gt;

&lt;p&gt;Structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Title&lt;/span&gt;
&lt;span class="gu"&gt;## The Iron Law (core principles that cannot be violated)&lt;/span&gt;
&lt;span class="gu"&gt;## Red-Green-Refactor (loop body)&lt;/span&gt;
&lt;span class="gu"&gt;### RED — Write a failing test&lt;/span&gt;
&lt;span class="gu"&gt;### Verify RED — Confirm it actually fails&lt;/span&gt;
&lt;span class="gu"&gt;### GREEN — Write minimal code&lt;/span&gt;
&lt;span class="gu"&gt;### Verify GREEN — Confirm it passes&lt;/span&gt;
&lt;span class="gu"&gt;### REFACTOR — Clean up&lt;/span&gt;
&lt;span class="gu"&gt;### Repeat (back to RED)&lt;/span&gt;
&lt;span class="gu"&gt;## Common Rationalizations&lt;/span&gt;
&lt;span class="gu"&gt;## Verification Checklist (exit conditions)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key Techniques:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5aag62a65lgo4tv17q5m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5aag62a65lgo4tv17q5m.png" alt=" " width="800" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Decision Rule: If your Skill requires the LLM to repeatedly execute a “do → verify → improve” cycle, use the Iteration pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 4: Baton Loop (Cross-Session Persistence)
&lt;/h2&gt;

&lt;p&gt;Applicable Scenario: Long-term projects requiring multiple iterations across sessions.&lt;/p&gt;

&lt;p&gt;Representative: google-labs-code/stitch-skills — stitch-loop (203 lines) [4]&lt;/p&gt;

&lt;p&gt;Structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Title&lt;/span&gt;
&lt;span class="gu"&gt;## Overview (baton mode overview)&lt;/span&gt;
&lt;span class="gu"&gt;## The Baton System (baton file specification)&lt;/span&gt;
&lt;span class="gu"&gt;## Execution Protocol (6-step execution protocol)&lt;/span&gt;
&lt;span class="gu"&gt;### Step 1: Read the Baton&lt;/span&gt;
&lt;span class="gu"&gt;### Step 2: Consult Context Files&lt;/span&gt;
&lt;span class="gu"&gt;### Step 3: Generate&lt;/span&gt;
&lt;span class="gu"&gt;### Step 4: Integrate&lt;/span&gt;
&lt;span class="gu"&gt;### Step 5: Update Documentation&lt;/span&gt;
&lt;span class="gu"&gt;### Step 6: Prepare the Next Baton ⚠️ (Critical!)&lt;/span&gt;
&lt;span class="gu"&gt;## File Structure Reference&lt;/span&gt;
&lt;span class="gu"&gt;## Orchestration Options&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key Techniques:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwwmscxiltqoala08yd2l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwwmscxiltqoala08yd2l.png" alt=" " width="720" height="283"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Decision Rule: If your Skill needs to persist across multiple sessions or requires multiple Agents to collaborate, use the Baton Loop pattern.&lt;/p&gt;

&lt;p&gt;Differences from Pattern 3:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faq2m90tlt3adf6h5keo3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faq2m90tlt3adf6h5keo3.png" alt=" " width="720" height="183"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 5: Multi-Phase + Checkpoints + Skill Orchestration
&lt;/h2&gt;

&lt;p&gt;Applicable Scenario: Complex multi-week processes requiring Go/No-Go decisions at key milestones.&lt;/p&gt;

&lt;p&gt;Representative: deanpeters/Product-Manager-Skills — discovery-process (502 lines) [5]&lt;/p&gt;

&lt;p&gt;Structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Title&lt;/span&gt;
&lt;span class="gu"&gt;## Key Concepts (+ anti-patterns)&lt;/span&gt;
&lt;span class="gu"&gt;## Phase 1: Frame the Problem&lt;/span&gt;
&lt;span class="gu"&gt;### Activities (which sub-Skills to invoke)&lt;/span&gt;
&lt;span class="gu"&gt;### Outputs (phase deliverables)&lt;/span&gt;
&lt;span class="gu"&gt;### Decision Point 1 (checkpoint: YES/NO + time impact)&lt;/span&gt;
&lt;span class="gu"&gt;## Phase 2-6... (repeated structure)&lt;/span&gt;
&lt;span class="gu"&gt;## Complete Workflow (end-to-end timeline)&lt;/span&gt;
&lt;span class="gu"&gt;## Common Pitfalls&lt;/span&gt;
&lt;span class="gu"&gt;## References (list of cited sub-Skills)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key Techniques:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmu4bvvz71zmkx0rtioik.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmu4bvvz71zmkx0rtioik.png" alt=" " width="720" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Decision Rule: If your Skill spans multiple days/weeks with clear phase divisions and Go/No-Go decision points, use the Multi-Phase pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Special Pattern: Thinking Framework (Controlling “How the LLM Thinks”)
&lt;/h2&gt;

&lt;p&gt;Applicable Scenario: Security audits, code review, architecture analysis — scenarios requiring deep thinking.&lt;/p&gt;

&lt;p&gt;Representative: trailofbits/skills — audit-context-building (302 lines) [6]&lt;/p&gt;

&lt;p&gt;Structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Title&lt;/span&gt;
&lt;span class="gu"&gt;## Purpose (positioning: controls thinking mode, not behavior)&lt;/span&gt;
&lt;span class="gu"&gt;## When to Use / When NOT to Use&lt;/span&gt;
&lt;span class="gu"&gt;## Rationalizations&lt;/span&gt;
&lt;span class="gu"&gt;## Phase 1: Initial Orientation&lt;/span&gt;
&lt;span class="gu"&gt;## Phase 2: Ultra-Granular Function Analysis (core)&lt;/span&gt;
&lt;span class="gu"&gt;### Per-Function Checklist&lt;/span&gt;
&lt;span class="gu"&gt;### Cross-Function Flow Analysis&lt;/span&gt;
&lt;span class="gu"&gt;### Output Requirements (format + quantitative thresholds)&lt;/span&gt;
&lt;span class="gu"&gt;### Completeness Checklist&lt;/span&gt;
&lt;span class="gu"&gt;## Phase 3: Global System Understanding&lt;/span&gt;
&lt;span class="gu"&gt;## Stability Rules (anti-hallucination rules)&lt;/span&gt;
&lt;span class="gu"&gt;## Non-Goals (explicitly forbidden actions)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key Techniques:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgm8oqugku90vehtih49s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgm8oqugku90vehtih49s.png" alt=" " width="720" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Decision Rule: If your Skill requires deep analysis rather than quick execution — controlling “thinking quality” rather than “operational steps” — use the Thinking Framework pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Universal Writing Techniques
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Four Tactics to Prevent LLM Laziness
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbd8ozfxe2d88nx5tvp6e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbd8ozfxe2d88nx5tvp6e.png" alt=" " width="720" height="284"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Effective Teaching Methods
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qmuu83zv0rtz4kr1u80.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qmuu83zv0rtz4kr1u80.png" alt=" " width="720" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Principles for Safety and Boundaries
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5v4kxvid9t7olvnjduf3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5v4kxvid9t7olvnjduf3.png" alt=" " width="720" height="226"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Three-Layer Knowledge Architecture
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Layer 1: Frontmatter (~100 tokens) → LLM scans all Skills’ descriptions to decide whether to load&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Layer 2: SKILL.md body (&amp;lt;5K tokens) → Core instructions, decision trees, process steps&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Layer 3: references/ and resources/ (load on demand) → Detailed docs, examples, checklists; LLM reads via read tool as needed (see the sketch after this list)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
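
&lt;p&gt;A minimal sketch of how those three layers get touched at runtime (file layout follows the structure shown earlier; the function names are illustrative, not a real API):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch of the three-layer loading flow. File layout follows the
// structure shown earlier; the function names are illustrative.
const fs = require("fs");

// Layer 1: scan every skill's frontmatter (~100 tokens each).
function scanDescriptions(skillDirs) {
  return skillDirs.map(function (dir) {
    const head = fs.readFileSync(dir + "/SKILL.md", "utf8").split("---")[1];
    return { dir: dir, frontmatter: head };
  });
}

// Layer 2: the chosen skill's full body is injected into context.
function loadSkillBody(dir) {
  return fs.readFileSync(dir + "/SKILL.md", "utf8");
}

// Layer 3: references/ files are read only when the instructions call for them.
function loadReference(dir, file) {
  return fs.readFileSync(dir + "/references/" + file, "utf8");
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;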

&lt;p&gt;Token Budget (Rule of Thumb):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1jazk6sn0pupi2jlo817.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1jazk6sn0pupi2jlo817.png" alt=" " width="720" height="184"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Which Pattern Should You Use?
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;What does your Skill need to do?
│
├─ Execute an operation with clear steps
│ └─ → Pattern 1: Linear Workflow
│
├─ Help users choose the right direction among many options
│ └─ → Pattern 2: Decision Tree + Load-on-Demand
│
├─ Repeatedly execute "do → verify → improve" in a single session
│ └─ → Pattern 3: Loop Iteration
│
├─ Sustain a long-term project across multiple sessions
│ └─ → Pattern 4: Baton Loop
│
├─ Span multiple days/weeks with phase divisions and Go/No-Go decisions
│ └─ → Pattern 5: Multi-Phase + Checkpoints
│
└─ Require LLM to perform deep analysis rather than quick execution
  └─ → Special Pattern: Thinking Framework
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Quick-Start Templates
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Minimal Viable Skill (Linear Pattern)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-skill&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[One-sentence&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;of&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;what&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;it&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;does&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;+&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;when&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;trigger]"&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Skill Name&lt;/span&gt;

[One-sentence description of core principles + safe defaults]

&lt;span class="gu"&gt;## Prerequisites&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [Prerequisite 1]
&lt;span class="p"&gt;-&lt;/span&gt; [Prerequisite 2]

&lt;span class="gu"&gt;## Steps&lt;/span&gt;

&lt;span class="gu"&gt;### Step 1: [Action]&lt;/span&gt;
[Specific command]

&lt;span class="gu"&gt;### Step 2: [Action]&lt;/span&gt;
[Specific instruction]

&lt;span class="gu"&gt;### Step 3: [Action]&lt;/span&gt;
[Specific instruction]

&lt;span class="gu"&gt;## Troubleshooting&lt;/span&gt;
| Issue | Solution |
|-------|----------|
| [Problem 1] | [Solution] |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Loop Iteration Skill Template
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-loop-skill&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Description of what it does + when to trigger&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Skill Name&lt;/span&gt;

&lt;span class="gu"&gt;## Core Principle&lt;/span&gt;
[The iron law]

&lt;span class="gu"&gt;## The Loop&lt;/span&gt;

&lt;span class="gu"&gt;### Phase A - [Action]&lt;/span&gt;
[Specific instruction]

&lt;span class="gu"&gt;### Verify A&lt;/span&gt;
[Verification command]

&lt;span class="gu"&gt;### Phase B - [Action]&lt;/span&gt;
[Specific instruction]

&lt;span class="gu"&gt;### Verify B&lt;/span&gt;
[Verification command]

&lt;span class="gu"&gt;### Repeat&lt;/span&gt;
Back to Phase A.

&lt;span class="gu"&gt;## Rationalizations&lt;/span&gt;
| Excuse | Reality |
|--------|---------|
| "[Excuse 1]" | [Rebuttal] |

&lt;span class="gu"&gt;## Completion Checklist&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [ ] [Condition 1]
&lt;span class="p"&gt;-&lt;/span&gt; [ ] [Condition 2]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Quick Reference: 7 Skills Analyzed in This Article
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6k7ncqbfrjhg0740t2n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6k7ncqbfrjhg0740t2n.png" alt=" " width="720" height="463"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Built a Skill using these patterns? I’d love to see it — drop a link in&lt;br&gt;
the comments.&lt;/p&gt;

&lt;p&gt;👏 Clap if this helped · 🔔 Follow for more Agent engineering content&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;p&gt;[1] openai/skills — vercel-deploy: &lt;a href="https://github.com/openai/skills/tree/main/skills/.curated/vercel-deploy" rel="noopener noreferrer"&gt;https://github.com/openai/skills/tree/main/skills/.curated/vercel-deploy&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[2] openai/skills — cloudflare-deploy: &lt;a href="https://github.com/openai/skills/tree/main/skills/.curated/cloudflare-deploy" rel="noopener noreferrer"&gt;https://github.com/openai/skills/tree/main/skills/.curated/cloudflare-deploy&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[3] obra/superpowers — test-driven-development: &lt;a href="https://github.com/obra/superpowers/tree/main/skills/test-driven-development" rel="noopener noreferrer"&gt;https://github.com/obra/superpowers/tree/main/skills/test-driven-development&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[4] google-labs-code/stitch-skills — stitch-loop: &lt;a href="https://github.com/google-labs-code/stitch-skills/tree/main/skills/stitch-loop" rel="noopener noreferrer"&gt;https://github.com/google-labs-code/stitch-skills/tree/main/skills/stitch-loop&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[5] deanpeters/Product-Manager-Skills — discovery-process: &lt;a href="https://github.com/deanpeters/Product-Manager-Skills/tree/main/skills/discovery-process" rel="noopener noreferrer"&gt;https://github.com/deanpeters/Product-Manager-Skills/tree/main/skills/discovery-process&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[6] trailofbits/skills — audit-context-building: &lt;a href="https://github.com/trailofbits/skills/tree/main/plugins/audit-context-building/skills/audit-context-building" rel="noopener noreferrer"&gt;https://github.com/trailofbits/skills/tree/main/plugins/audit-context-building/skills/audit-context-building&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[7] Agent Skills Open Standard: &lt;a href="https://agentskills.io/" rel="noopener noreferrer"&gt;https://agentskills.io/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[8] anthropics/skills — Official Template: &lt;a href="https://github.com/anthropics/skills/tree/main/template" rel="noopener noreferrer"&gt;https://github.com/anthropics/skills/tree/main/template&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[9] anthropics/skills — Specification: &lt;a href="https://github.com/anthropics/skills/tree/main/spec" rel="noopener noreferrer"&gt;https://github.com/anthropics/skills/tree/main/spec&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[10] openai/skills: &lt;a href="https://github.com/openai/skills" rel="noopener noreferrer"&gt;https://github.com/openai/skills&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[11] obra/superpowers: &lt;a href="https://github.com/obra/superpowers" rel="noopener noreferrer"&gt;https://github.com/obra/superpowers&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[12] google-labs-code/stitch-skills: &lt;a href="https://github.com/google-labs-code/stitch-skills" rel="noopener noreferrer"&gt;https://github.com/google-labs-code/stitch-skills&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[13] deanpeters/Product-Manager-Skills: &lt;a href="https://github.com/deanpeters/Product-Manager-Skills" rel="noopener noreferrer"&gt;https://github.com/deanpeters/Product-Manager-Skills&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[14] trailofbits/skills: &lt;a href="https://github.com/trailofbits/skills" rel="noopener noreferrer"&gt;https://github.com/trailofbits/skills&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[15] openclaw/clawhub: &lt;a href="https://github.com/openclaw/clawhub" rel="noopener noreferrer"&gt;https://github.com/openclaw/clawhub&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[16] VoltAgent/awesome-agent-skills: &lt;a href="https://github.com/VoltAgent/awesome-agent-skills" rel="noopener noreferrer"&gt;https://github.com/VoltAgent/awesome-agent-skills&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[17] travisvn/awesome-claude-skills: &lt;a href="https://github.com/travisvn/awesome-claude-skills" rel="noopener noreferrer"&gt;https://github.com/travisvn/awesome-claude-skills&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>llm</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>The Database Bottleneck You Never Saw Coming: Why 50ms Will Make or Break Your AI Agent in 2026</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Tue, 28 Apr 2026 01:59:07 +0000</pubDate>
      <link>https://forem.com/seekdb/the-database-bottleneck-you-never-saw-coming-why-50ms-will-make-or-break-your-ai-agent-in-2026-55ok</link>
      <guid>https://forem.com/seekdb/the-database-bottleneck-you-never-saw-coming-why-50ms-will-make-or-break-your-ai-agent-in-2026-55ok</guid>
      <description>&lt;p&gt;&lt;em&gt;The uncomfortable truth about AI infrastructure that nobody is talking about — and why your stack might be optimizing for the wrong metric&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx0crfi056j57lxk9u17f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx0crfi056j57lxk9u17f.png" alt="Is infra dead?"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In February 2026, a machine learning engineer at a well-funded fintech startup discovered something that kept her awake at night.&lt;/p&gt;

&lt;p&gt;Her AI-powered ad recommendation system was technically “working.” The vector database was returning results. The embedding model was generating similarities. The API was responding with HTTP 200 codes.&lt;/p&gt;

&lt;p&gt;But the advertisers were seeing creative assets that were 2 seconds stale.&lt;/p&gt;

&lt;p&gt;In programmatic advertising, 2 seconds is a lifetime. User intent has shifted. Inventory has been sold. The ad the AI thought was perfect was targeting a context that no longer existed.&lt;/p&gt;

&lt;p&gt;The culprit? Not the embedding model. Not the ranking algorithm. Not even the API layer.&lt;/p&gt;

&lt;p&gt;The humble CDC (Change Data Capture) synchronization link between their SQL database and their vector store.&lt;/p&gt;

&lt;p&gt;This is the story that isn’t being told in the AI revolution conversations. While everyone obsesses over model benchmarks, context windows, and prompt engineering, a quiet infrastructure crisis is brewing. And it’s going to determine which AI products survive 2026 — and which become expensive demos that never reach production.&lt;/p&gt;

&lt;p&gt;The database is back. And after 15 years of commoditization, it’s becoming the most strategically important piece of your AI infrastructure again.&lt;/p&gt;

&lt;p&gt;I spent the last year analyzing how 7 enterprise teams — from autonomous vehicle startups to Fortune 500 fintechs — are rebuilding their data layers for the AI-native era. What I found surprised me, frustrated me, and ultimately convinced me that we’re witnessing one of the most significant infrastructure shifts since the cloud transition.&lt;/p&gt;

&lt;p&gt;This is Part 1 of that story. Part 2 (coming next week) covers the emerging solutions: new safety mechanisms, unified architectures, and the “Agent-First” design philosophy that will define the next decade of data infrastructure.&lt;/p&gt;

&lt;p&gt;But first, you need to understand why everything you thought you knew about database selection might be wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 1: The Identity Crisis — Who Is the Database Actually For?
&lt;/h2&gt;

&lt;p&gt;Let me ask you a question that sounds simple but isn’t:&lt;/p&gt;

&lt;p&gt;Who is your database designed to serve?&lt;/p&gt;

&lt;p&gt;For the last 40 years, the answer has been obvious: humans. More specifically, human database administrators who write SQL, human application developers who read API documentation, and human DevOps engineers who configure instances through web consoles.&lt;/p&gt;

&lt;p&gt;Every major database architecture makes assumptions about its user:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;They have an email address (for account creation and verification)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;They can wait 3–10 minutes for a new instance to provision&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;They understand complex logic like two-phase commit, isolation levels, and eventual consistency&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;They can manually reconcile data inconsistencies when systems drift out of sync&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;They will read PDF documentation, fill out forms, and open support tickets when something breaks&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI Agents are not humans.&lt;/p&gt;

&lt;p&gt;Your AI Agent cannot:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Check its email for a verification code&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Wait 5 minutes for a new database instance to spin up while a user is actively chatting&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Read a 50-page PDF and understand that one footnote on page 34 changes everything&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Manually fix data inconsistencies between three separate systems (MySQL for transactions, Elasticsearch for search, Pinecone for vectors)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Explain to you why it made a particular decision that broke your data model&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An Agent operates in what I call the Perceive-Reason-Act-Reflect loop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Perceive: Read current state from the database&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reason: LLM processes information and decides next action&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Act: Write operation back to the database&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reflect: Read results and evaluate success&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A single task might execute this loop 20–50 times. Each iteration requires database interaction. And here’s where traditional database assumptions catastrophically break down.&lt;/p&gt;
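
&lt;p&gt;To make that arithmetic concrete, here’s a toy Python sketch of the loop. The sleep durations stand in for real database and LLM latencies, and the helper functions are placeholders, not any real SDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time

# Placeholder stubs; the sleeps stand in for real latencies.
def read_state(task):      # Perceive: one database read
    time.sleep(0.050)

def llm_reason(task):      # Reason: one LLM inference
    time.sleep(0.500)

def write_action(task):    # Act: one database write
    time.sleep(0.050)

def run_task(task, iterations=10):
    db_wait = 0.0
    for _ in range(iterations):
        t0 = time.perf_counter()
        read_state(task)                     # Perceive
        db_wait += time.perf_counter() - t0
        llm_reason(task)                     # Reason
        t0 = time.perf_counter()
        write_action(task)                   # Act (Reflect = next read)
        db_wait += time.perf_counter() - t0
    return db_wait

# 10 iterations = 20 database round-trips at 50ms each:
print(f"{run_task('recommend a venue'):.2f}s of pure DB waiting")  # ~1.00s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;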

&lt;p&gt;When a human queries a database, they run maybe 5–10 queries total, with seconds or minutes between each one. If one query takes 200ms, they don’t even notice.&lt;/p&gt;

&lt;p&gt;When an Agent executes 20 queries in a tight loop to complete one user request, that same 200ms latency becomes 4 seconds of cumulative waiting. In a conversational AI interface, 4 seconds of silence feels like abandonment. Users don’t think “the database is slow” — they think “this AI is broken.”&lt;/p&gt;

&lt;p&gt;The paradigm has completely flipped:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Traditional Databases&lt;/th&gt;&lt;th&gt;AI-Native Data Infrastructure&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Built for human DBAs&lt;/td&gt;&lt;td&gt;Built for AI Agents&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Optimized for throughput (queries/second)&lt;/td&gt;&lt;td&gt;Optimized for latency (end-to-end time)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;“Read the docs and figure it out”&lt;/td&gt;&lt;td&gt;Programmatic self-discovery via structured interfaces&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Separate systems for different data types (SQL + Vector + Search)&lt;/td&gt;&lt;td&gt;Unified engine for relations, vectors, and full-text&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Human-driven configuration and tuning&lt;/td&gt;&lt;td&gt;Agent-driven, API-first operations with auto-scaling&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;This isn’t an incremental upgrade. This is a fundamental inversion of database design philosophy — from human-operable to machine-native, from storage-centric to cognition-centric, from “how do we make the DBA’s life easier” to “how do we make the Agent’s life possible.”&lt;/p&gt;

&lt;p&gt;And most engineering teams haven’t realized the shift is happening.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 2: The Five Generations — How We Got Here (and Why Generation 4 Is Breaking)
&lt;/h2&gt;

&lt;p&gt;To understand where we’re going, you need to see the full evolutionary arc. I’ve mapped five distinct generations of data infrastructure, each defined by the dominant application pattern of its era:&lt;/p&gt;

&lt;h2&gt;
  
  
  Generation 1: OLTP Dominance (Pre-2010)
&lt;/h2&gt;

&lt;p&gt;The Killer App: E-commerce and electronic payments&lt;/p&gt;

&lt;p&gt;The Problem: “How do we keep our users’ money safe when they buy something online?”&lt;/p&gt;

&lt;p&gt;The Solution: MySQL, Oracle, PostgreSQL. Row-optimized storage. ACID transactions at all costs. The database as the “source of truth” for financial systems.&lt;/p&gt;

&lt;p&gt;The Mental Model: Trust the database with everything. If it committed, it happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  Generation 2: OLAP Separation (2010–2020)
&lt;/h2&gt;

&lt;p&gt;The Killer App: Business intelligence and data analytics&lt;/p&gt;

&lt;p&gt;The Problem: “We have terabytes of data in our OLTP system, but running analytics queries crashes the production database.”&lt;/p&gt;

&lt;p&gt;The Solution: Hadoop, Spark, data warehouses. Columnar storage. Batch processing. ETL pipelines that extract data nightly, transform it, and load it into separate systems for analysis.&lt;/p&gt;

&lt;p&gt;The Mental Model: Yesterday’s data is good enough for tomorrow’s business decisions. (T+1 latency was acceptable.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Generation 3: HTAP Convergence (2020–2024)
&lt;/h2&gt;

&lt;p&gt;The Killer App: Real-time personalization and fraud detection&lt;/p&gt;

&lt;p&gt;The Problem: “By the time our batch process identifies the fraud, the money is already gone.”&lt;/p&gt;

&lt;p&gt;The Solution: OceanBase, TiDB, CockroachDB. Hybrid Transactional/Analytical Processing. Row storage for writes, columnar for reads, inside the same system.&lt;/p&gt;

&lt;p&gt;The Mental Model: Analyze data as it arrives, without waiting for the ETL batch job.&lt;/p&gt;

&lt;h2&gt;
  
  
  Generation 4: Vector-Native (2024–2025)
&lt;/h2&gt;

&lt;p&gt;The Killer App: LLM-powered applications and semantic search&lt;/p&gt;

&lt;p&gt;The Problem: “Users want to search by meaning (‘comfortable cafe for a business chat’), not just keywords.”&lt;/p&gt;

&lt;p&gt;The Solution: Pinecone, Milvus, Weaviate. Purpose-built vector databases with HNSW/IVF indexes for approximate nearest neighbor search.&lt;/p&gt;

&lt;p&gt;The Mental Model: Find similar things, not just exact matches. Embeddings capture semantic relationships.&lt;/p&gt;

&lt;p&gt;But here’s where the wheels fall off.&lt;/p&gt;

&lt;p&gt;Every team I interviewed that built on the Generation 4 stack eventually hit the same wall. They were running three separate data systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;MySQL/PostgreSQL for transactional data and business logic&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Elasticsearch for full-text search and filtering&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Milvus/Pinecone for vector similarity and semantic search&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And then they wrote “glue code” — hundreds or thousands of lines of application logic trying to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Keep these three systems synchronized&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Decide which system to query first&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Merge results from multiple systems&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Handle the inevitable inconsistencies when one system’s CDC lagged behind&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One engineering lead described it to me as: “Three databases, three failure modes, three 3AM pages. And good luck explaining to your CEO why the AI recommended a product that sold out 5 seconds ago because our CDC was behind.”&lt;/p&gt;

&lt;p&gt;This architecture works for proofs-of-concept. It fails spectacularly in production when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You need sub-100ms response times&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You require strong consistency across data types&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You’re trying to build agent systems that make 20–50 database calls in a single task&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generation 4 was the right solution for the wrong problem. It solved “how do we do vector search” but created “how do we do hybrid search with low latency and strong consistency.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 3: The 50ms Problem — Why Latency Is the New Throughput
&lt;/h2&gt;

&lt;p&gt;Let me say something that sounds wrong at first, but will save you months of architectural pain:&lt;/p&gt;

&lt;p&gt;In the AI Agent era, latency matters more than throughput.&lt;/p&gt;

&lt;p&gt;Repeat that: latency &amp;gt; throughput.&lt;/p&gt;

&lt;p&gt;For 30 years, database optimization focused on a single metric: “How many queries can we process per second?” (QPS). This was the right metric for web applications, where thousands of humans are clicking around dashboards and product pages.&lt;/p&gt;

&lt;p&gt;For human-facing applications, 200ms query latency is “reasonable.” Users barely notice. Throughput is what matters because you need to serve thousands of concurrent users.&lt;/p&gt;

&lt;p&gt;AI Agents don’t generate load like humans do. They generate latency chains: each query in the loop waits on the one before it, so delays add up serially.&lt;/p&gt;

&lt;p&gt;Consider a typical agent workflow for a restaurant recommendation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Loop 1:
  - READ: Get user's location and preferences (50ms)
  - REASON: LLM identifies intent (500ms-3s depending on model)
  - ACT: WRITE search query parameters (50ms)
Loop 2:
  - READ: Get candidate venues from database (50ms)
  - REASON: LLM evaluates options (500ms-3s)
  - ACT: WRITE refined filters (50ms)
Loop 3:
  - READ: Get detailed venue data (50ms)
  - REASON: LLM checks availability and preferences (500ms-3s)
  - REFLECT: READ final candidates (50ms)
  - REASON: Final ranking (500ms-3s)
  - ACT: WRITE recommendation (50ms)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This single request involves seven database round-trips (count the READ and WRITE steps above).&lt;/p&gt;

&lt;p&gt;Now do the latency math:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Per-Query Latency&lt;/th&gt;&lt;th&gt;Cumulative Over 7 Queries&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;20ms (optimized)&lt;/td&gt;&lt;td&gt;140ms (imperceptible)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;50ms (good by human standards)&lt;/td&gt;&lt;td&gt;350ms (noticeable delay)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;100ms (acceptable for web)&lt;/td&gt;&lt;td&gt;700ms (feels sluggish)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;200ms (common for hybrid queries)&lt;/td&gt;&lt;td&gt;1.4s (feels broken)&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;That “reasonable” 50ms lag that humans barely notice? To an Agent doing 20 queries to complete a task, it’s a full 1 second of cumulative waiting.&lt;/p&gt;

&lt;p&gt;In a conversational AI interface, 1 second of silence between messages is an eternity. Users don’t think “the database is slow.” They think “this AI is dumb,” or worse, “this AI is broken,” and they leave.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottleneck Migration
&lt;/h2&gt;

&lt;p&gt;But here’s the really counterintuitive insight: as LLMs get faster, the database becomes MORE important, not less.&lt;/p&gt;

&lt;p&gt;Follow this timeline:&lt;/p&gt;

&lt;p&gt;2024: GPT-4 in the cloud takes 3–5 seconds per inference&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Database latency (50–200ms) is lost in the noise&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;“The model is the bottleneck” ✓&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;2025: Groq-optimized LLMs run at 100–500ms per inference&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Database latency (50–200ms) is now 20–50% of total time&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The database is becoming the bottleneck 🔶&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;2026: On-device LLMs (Llama-3-8B, etc.) run at 10–20ms per inference&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Database latency (50–200ms) is now 2–10x slower than the “slow part”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The database IS the bottleneck 🔴&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There’s an infrastructure evolution rule that has held true for 40 years:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When one layer of the stack gets dramatically faster, the next layer becomes the new bottleneck.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Disks got faster (HDD → SSD) → CPU became the bottleneck&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Networks got faster (1Gbps → 100Gbps) → Serialization became the bottleneck&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;LLMs got faster (5s → 100ms) → The database is becoming the bottleneck&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The teams that are ahead of this curve are already optimizing for P99 latencies under 20ms. They’re treating 50ms as a bug, not a feature.&lt;/p&gt;

&lt;p&gt;Because in 12–18 months, when on-device models are standard, having a 200ms database will feel exactly like trying to stream 4K video over dial-up internet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 4: The Data Freshness Crisis — Why “Eventually Consistent” Is Eventually Broken
&lt;/h2&gt;

&lt;p&gt;There’s a second latency problem that’s even more insidious than query latency: data synchronization latency.&lt;/p&gt;

&lt;p&gt;Remember that fintech team with the 2-second CDC lag? Here’s why it was catastrophic for their AI system:&lt;/p&gt;

&lt;p&gt;Their AI was making decisions based on stale data.&lt;/p&gt;

&lt;p&gt;The sequence of failure looked like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;User browses products (triggers inventory decrement in SQL database)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;SQL database is authoritative source of truth&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;CDC process replicates change to vector database (2-second delay)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AI recommendation engine queries vector database: “What should we show this user?”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Vector database returns products that matched the user’s interests&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;One of those products just went out of stock 1.5 seconds ago&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;User clicks recommendation → sees “Out of Stock” error → abandons session → never returns&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI didn’t make a bad decision. It made a good decision based on bad data.&lt;/p&gt;

&lt;p&gt;This is the fundamental problem with the “three separate systems” architecture of Generation 4:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Your SQL database has the truth NOW&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Your vector database has the truth 2 seconds ago&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Your search index has the truth 5 seconds ago&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Your application is trying to merge these timelines like a time-travel movie with plot holes&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the AI agent use case, “eventually consistent” is actually “eventually wrong.”&lt;/p&gt;

&lt;p&gt;Because agents operate at machine speed — they’re not waiting 30 seconds between queries like a human browsing a website. They’re making decisions in milliseconds based on the data they read. If that data is 2 seconds stale, the decisions are being made on a reality that no longer exists.&lt;/p&gt;

&lt;p&gt;The three requirements of AI-native data infrastructure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Write-Visible: As soon as a transaction commits, new queries must see the updated data (no replication lag windows)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Persist-Available: Data must be queryable immediately in all indexing formats (vector, text, relational) without waiting for background jobs&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Predictably Fast: P99 latency must be bounded even under high concurrency, because agents don’t back off when the system is stressed — they pile on more requests&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traditional databases separate these concerns. You write to the SQL database, wait for the CDC job, wait for the vector index update, wait for the search index reindexing. The “freshness gap” is measured in seconds. AI agents make hundreds of decisions in those seconds.&lt;/p&gt;
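
&lt;p&gt;A toy simulation makes the freshness gap concrete (the numbers and data below are illustrative, not measurements from any real system):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CDC_LAG_S = 2.0               # replication delay to the vector store
DECISION_INTERVAL_S = 0.010   # an agent decides every ~10 ms

sql_truth = {"sku-42": 0}         # source of truth: stock just hit zero
vector_replica = {"sku-42": 3}    # replica: still shows pre-sale stock

def recommend(store):
    # The agent only sees whichever store it queries.
    return "recommend sku-42" if store["sku-42"] &gt; 0 else "skip sku-42"

print(recommend(sql_truth))        # skip sku-42       (correct)
print(recommend(vector_replica))   # recommend sku-42  (stale, wrong)

# Decisions an agent makes against stale data before CDC catches up:
print(int(CDC_LAG_S / DECISION_INTERVAL_S))   # 200
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;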

&lt;h2&gt;
  
  
  What’s at Stake
&lt;/h2&gt;

&lt;p&gt;Let me bring this back to ground level and explain why this matters for your next architecture decision.&lt;/p&gt;

&lt;p&gt;If you’re building RAG (Retrieval-Augmented Generation) applications, the data layer will determine whether you ship a demo or a production product.&lt;/p&gt;

&lt;p&gt;The demoable version uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;PostgreSQL for structured data&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Pinecone for vectors&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Elasticsearch for text search&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;200 lines of Python to glue them together&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;500ms latency (but you only test with 10 items, so it feels instant)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;“Works on my machine” energy&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The production version doesn’t work. Because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The glue code becomes 2000 lines of complexity&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The 500ms becomes 2 seconds at scale&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The “eventually consistent” becomes “consistently wrong” when the CDC lags during a traffic spike&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The 3AM pages start coming faster and faster&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI-native generation of databases (OceanBase 4.4.2, Lakebase, seekdb) approaches this differently:&lt;/p&gt;

&lt;p&gt;One system. Three query interfaces. Single transaction boundary. When you commit, the data is visible for vector search, full-text search, and SQL queries simultaneously.&lt;/p&gt;

&lt;p&gt;That architectural shift — from “three systems with glue” to “one system with multiple access patterns” — is the difference between a prototype and a production system.&lt;/p&gt;
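
&lt;p&gt;As a rough sketch of what that looks like from application code: seekdb speaks the MySQL protocol, so a standard client connects, but the hybrid-search statement below is paraphrased from the docs, and the host, port, credentials, and exact syntax are placeholders to verify against your deployment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pymysql

# Connection details are placeholders; adjust for your deployment.
conn = pymysql.connect(host="127.0.0.1", port=2881, user="root",
                       password="", database="demo")

with conn.cursor() as cur:
    # One write. On commit, the row is visible to SQL, full-text,
    # and vector queries at once: no CDC job, no reindex window.
    cur.execute(
        "INSERT INTO products (id, title, embedding) "
        "VALUES (%s, %s, %s)",
        (42, "noise-cancelling headphones", "[0.12, 0.98, 0.34]"),
    )
    conn.commit()

    # Illustrative hybrid query (paraphrased; see the seekdb docs
    # for the exact DBMS_HYBRID_SEARCH.SEARCH() signature).
    cur.execute(
        "SELECT DBMS_HYBRID_SEARCH.SEARCH("
        "'products', 'headphones', '[0.11, 0.97, 0.33]')"
    )
    print(cur.fetchall())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;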

&lt;p&gt;In Part 2 (publishing next week), I’ll cover the emerging solutions to these problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Data Branching: Giving AI agents “sandbox” databases where they can experiment without risking production data (then merging changes after human review)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Unified vs. Specialized debate: Why the “best tool for the job” approach might be the worst choice for AI applications&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Agent-First Design: What it means to build infrastructure that AI agents can discover and operate autonomously&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A decision framework: How to choose the right data architecture for your specific AI use case&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Bottom Line for Part 1
&lt;/h2&gt;

&lt;p&gt;Database infrastructure has gone through five distinct generations, each solving the dominant problem of its era:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;OLTP: Make transactions reliable&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;OLAP: Enable batch analytics&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;HTAP: Enable real-time analytics&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Vector-Native: Enable semantic search&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AI-Native: Enable AI agents to interact with data safely, quickly, and autonomously&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generation 4 (separate vector databases) created a “glue layer complexity” problem that breaks production systems.&lt;/p&gt;

&lt;p&gt;The two metrics that matter for AI agents aren’t the ones we optimized for in the web era:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Latency, not throughput: 50ms × 20 queries = 1 second of waiting&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Freshness, not eventual consistency: “2 seconds behind” means “2 seconds wrong”&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As LLMs get faster (3s → 100ms → 10ms), the database becomes the bottleneck. The teams that realize this now and optimize for sub-20ms P99 latencies will have a 2–3 year head start.&lt;/p&gt;

&lt;p&gt;The infrastructure that wins won’t be the one with the highest benchmark score in isolation. It’ll be the one that eliminates the most architectural complexity while meeting the latency and consistency requirements that AI agents demand.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Do You Think?
&lt;/h2&gt;

&lt;p&gt;Is your team feeling the database latency pain yet? Have you hit the “glue layer” complexity wall with separate vector and SQL databases? Or are you still in the “the LLM is the slow part” phase?&lt;/p&gt;

&lt;p&gt;Drop a comment — I’d love to hear what your production monitoring is actually showing.&lt;/p&gt;

&lt;p&gt;And if this resonated, Part 2 drops next week with the solutions: data branching, unified architectures, and the practical decision framework for choosing your AI-native data infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow me on Medium for weekly deep dives into the infrastructure layers that actually determine AI product success.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>database</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Building RAG &amp; Knowledge Bases with seekdb: Three Paths, One Stack</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Mon, 27 Apr 2026 12:57:54 +0000</pubDate>
      <link>https://forem.com/seekdb/building-rag-knowledge-bases-with-seekdb-three-paths-one-stack-3nd3</link>
      <guid>https://forem.com/seekdb/building-rag-knowledge-bases-with-seekdb-three-paths-one-stack-3nd3</guid>
      <description>&lt;p&gt;The real headache in RAG isn’t retrieval or generation — it’s the layer in between. Where does the data live? How do you keep it in sync? Who glues it all together? seekdb and Dify are both open-source. Your RAG stack — from storage to orchestration — can be self-hosted, auditable, and customizable, without locking you into closed services. This post walks through three paths, all built on one stack: RAG from scratch with seekdb, Dify + seekdb, and a knowledge base desktop app. Pick the one that fits and get it running.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft2uk0noy7j4yrdqg5yql.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft2uk0noy7j4yrdqg5yql.png" alt="In the mesh of light, a patch that fits" width="720" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Where seekdb Fits in the RAG Pipeline
&lt;/h2&gt;

&lt;p&gt;A typical RAG pipeline looks like: load documents → chunk → embed → store; at query time: retrieve → (optionally) rerank → feed to LLM → generate. If your storage is a patchwork of MySQL + vector DB + full-text engine, you end up managing sync, multi-source queries, and fusion yourself. seekdb’s role: one database that holds relational data, vectors, and full-text in the same place. Write once, index automatically; one hybrid query returns results. You can use in-database AI functions for embedding and reranking when needed, so storage and retrieval live in one layer with less glue code.&lt;/p&gt;

&lt;p&gt;Three paths we’ll cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;RAG from scratch with seekdb — Best if you want full control over the pipeline or already have a Python/app stack.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dify + seekdb — Best if you want Dify for orchestration and UI and seekdb as the knowledge-base backend, collapsing the stack to Dify config + seekdb storage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Knowledge base desktop application — Best if you want a local, multi-project desktop app with seekdb as the backend and a custom frontend.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Path 1: RAG from Scratch with seekdb (Summary)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Deploy and create tables&lt;br&gt;
Run seekdb in Embedded or Client/Server mode. Create a table (or Python collection) with vector + full-text columns, and create a VECTOR INDEX and FULLTEXT INDEX.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Load documents&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read docs (PDF, TXT, MD, etc.) → chunk them (by paragraph, by length, with overlap, etc.).&lt;/li&gt;
&lt;li&gt;For each chunk, call your embedding model to get a vector (use seekdb’s in-database AI functions, or compute in your app and insert into seekdb).&lt;/li&gt;
&lt;li&gt;INSERT into seekdb: each row has chunk text, vector, and any metadata you need (source, doc id, segment id, etc.).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;At query time&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Turn the user question into a query vector (same embedding).&lt;/li&gt;
&lt;li&gt;Use hybrid search: vector_query + full_text_query (optional) + relational filters (e.g. by knowledge-base id), and take top_k candidates.&lt;/li&gt;
&lt;li&gt;Optional: rerank with seekdb or in your app → pass the final context to your LLM to generate the answer.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Things to watch&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Chunking strategy and chunk size directly affect recall; pick one and tune from there.&lt;/li&gt;
&lt;li&gt;If you use in-database AI for embedding/reranking, you save a round-trip to external services.&lt;/li&gt;
&lt;li&gt;For full steps and code, see &lt;a href="https://docs.seekdb.ai/seekdb/build-a-rag-system-with-seekdb/" rel="noopener noreferrer"&gt;https://docs.seekdb.ai/seekdb/build-a-rag-system-with-seekdb/&lt;/a&gt;; a condensed sketch follows this list.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
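
&lt;p&gt;Here’s that condensed sketch in Python. The embed() helper and connection details are placeholders, and the INSERT assumes a chunks table created with the vector and full-text indexes described above (see the docs for exact DDL):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pymysql

def embed(text):
    # Placeholder: call your embedding model here, or use seekdb's
    # in-database AI functions instead and skip this round-trip.
    return "[0.1, 0.2, 0.3]"

conn = pymysql.connect(host="127.0.0.1", port=2881, user="root",
                       password="", database="kb")

with conn.cursor() as cur:
    # Load: read a doc and cut it into naive fixed-size chunks.
    doc = open("notes.md", encoding="utf-8").read()
    chunks = [doc[i:i + 500] for i in range(0, len(doc), 500)]

    # Write: one INSERT per chunk; the vector and full-text indexes
    # update on write, so there is no separate sync step.
    for seg_id, chunk in enumerate(chunks):
        cur.execute(
            "INSERT INTO chunks (doc_id, seg_id, body, embedding) "
            "VALUES (%s, %s, %s, %s)",
            ("notes.md", seg_id, chunk, embed(chunk)),
        )
    conn.commit()
    # Query time: embed the question the same way, then run a hybrid
    # search (vector + full-text + filters) to fetch top_k chunks.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;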

&lt;h2&gt;
  
  
  Path 2: Dify + seekdb — Collapse the RAG Stack (Both Open-Source)
&lt;/h2&gt;

&lt;p&gt;Dify handles workflow orchestration, knowledge-base setup, and the chat UI. The data source can be seekdb: Dify’s pipeline does “upload/parse → chunk → embed → write,” while storage and retrieval happen in seekdb — with strong consistency, hybrid search, and in-database AI. Dify and seekdb are both open-source, so the whole RAG stack can be self-hosted, audited, and extended. Good fit if you care about data and architecture ownership.&lt;/p&gt;

&lt;p&gt;Configuration idea (check your Dify version for exact UI):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;In Dify, set the knowledge base data source to seekdb (or wire seekdb via Dify’s supported vector store/API).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;After you upload documents, Dify parses and chunks them, calls the embedding service, and writes into seekdb. At query time, Dify sends the query to seekdb, gets hybrid-search results back, and passes them to the LLM node for the final answer.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result: no separate sync scripts or multi-database juggling — the stack is just “Dify config + seekdb.” For details, see &lt;a href="https://en.oceanbase.com/blog/24316625920" rel="noopener noreferrer"&gt;https://en.oceanbase.com/blog/24316625920&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Path 3: Knowledge Base Desktop App — Local, Multi-Project
&lt;/h2&gt;

&lt;p&gt;If you’d rather skip Dify and want a local knowledge base desktop application (multiple projects, multiple docs, local search): use seekdb as the backend and a desktop client (e.g. Tauri or Electron + your frontend) to connect to seekdb’s API. The flow is the same: parse → chunk → embed → write to seekdb; at query time use hybrid search and show results or feed them to a local LLM.&lt;/p&gt;

&lt;p&gt;There’s an official guide: &lt;a href="https://docs.seekdb.ai/seekdb/build-kb-in-seekdb/" rel="noopener noreferrer"&gt;https://docs.seekdb.ai/seekdb/build-kb-in-seekdb/&lt;/a&gt; — it outlines the stack and steps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which Path to Choose?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiu43gmqs01d0glsxw5pe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiu43gmqs01d0glsxw5pe.png" alt=" " width="720" height="300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you’ve got RAG or a knowledge base running with seekdb, you might wonder where it goes next. In the next post we’ll take seekdb beyond text: multimodal and agents — think travel assistant, image search, TEN+PowerMem voice assistant — and how the same stack extends to those scenarios.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Repo: &lt;a href="https://github.com/oceanbase/seekdb" rel="noopener noreferrer"&gt;https://github.com/oceanbase/seekdb&lt;/a&gt; (Apache 2.0 — Stars, Issues, PRs welcome)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Docs: &lt;a href="https://docs.seekdb.ai/seekdb/seekdb-overview/" rel="noopener noreferrer"&gt;https://docs.seekdb.ai/seekdb/seekdb-overview/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Discord: &lt;a href="https://discord.com/channels/1331061822945624085/1331061823465590805" rel="noopener noreferrer"&gt;https://discord.com/channels/1331061822945624085/1331061823465590805&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dev.to: &lt;a href="https://dev.to/seekdb"&gt;https://dev.to/seekdb&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Press: &lt;a href="https://www.marktechpost.com/2025/11/26/oceanbase-releases-seekdb-an-open-source-ai-native-hybrid-search-database-for-multi-model-rag-and-ai-agents/" rel="noopener noreferrer"&gt;https://www.marktechpost.com/2025/11/26/oceanbase-releases-seekdb-an-open-source-ai-native-hybrid-search-database-for-multi-model-rag-and-ai-agents/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Building RAG or an AI workflow? What’s the one thing you wish your database did better — or didn’t do at all? Drop it in the comments. We read them, and the next features we ship often come from exactly those pain points. Open source only gets better when people say what’s broken.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>database</category>
      <category>opensource</category>
      <category>rag</category>
    </item>
    <item>
      <title>seekdb Core Features: Hybrid Search &amp; AI Functions</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Mon, 27 Apr 2026 12:49:48 +0000</pubDate>
      <link>https://forem.com/seekdb/seekdb-core-features-hybrid-search-ai-functions-3p3l</link>
      <guid>https://forem.com/seekdb/seekdb-core-features-hybrid-search-ai-functions-3p3l</guid>
      <description>&lt;p&gt;Vector search finds “what it’s like.” Full-text search finds “what it says.” Relational filters handle “who” and “where.” With seekdb, you can combine all three in a single query and run embedding and reranking inside the database. The fusion logic (e.g., RRF) and the AI Functions API are open on GitHub (&lt;a href="https://github.com/oceanbase/seekdb" rel="noopener noreferrer"&gt;https://github.com/oceanbase/seekdb&lt;/a&gt;) — you can review, modify, and send PRs. This post walks through how it works and how to use it in RAG and knowledge-base setups, and how you can contribute.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66f2otdvj702kznj1irn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66f2otdvj702kznj1irn.png" alt=" " width="720" height="720"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Hybrid Search: Why One SQL Beats Multi-Stage Retrieval
&lt;/h2&gt;

&lt;p&gt;The usual approach: hit a vector store, hit a full-text store, and then normalize, fuse scores (e.g., RRF), and rerank in the application layer. The catch: extra network hops, custom fusion logic, and filter conditions that can drift between systems (e.g., “only this user’s data” has to be expressed in both stores).&lt;/p&gt;

&lt;p&gt;seekdb’s hybrid search is different: one table has both a vector index and a full-text index. One query sends vector conditions, full-text conditions, and relational filters, and the database does the fusion and ranking. You get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Consistency — Filters are defined once. No “vector side filtered, full-text side didn’t.”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Low-latency — No application-layer fusion hop; results are ranked inside the DB.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Simplicity — No glue code for multi-stage retrieval.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In SQL, you typically use DBMS_HYBRID_SEARCH.SEARCH() with full_text_query, vector_query, and optional relational filters to get relevance-ranked results. The Python SDK’s hybrid_search() supports more options (e.g., separate top_k for vector/full-text, filter expressions). Fusion is often done with RRF (Reciprocal Rank Fusion), combining vector similarity and full-text scores into one ranking.&lt;/p&gt;
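
&lt;p&gt;seekdb performs this fusion inside the database, but the idea behind RRF is small enough to show in a few lines of Python (k = 60 is the conventional constant; each input list is ranked best-first):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def rrf(result_lists, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists into one.
    Each input list is ordered best-first; ranks start at 1."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["d3", "d1", "d7"]     # best-first by vector similarity
fulltext_hits = ["d1", "d5", "d3"]   # best-first by BM25 score
print(rrf([vector_hits, fulltext_hits]))  # ['d1', 'd3', 'd5', 'd7']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;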

&lt;h2&gt;
  
  
  2. How to Configure and Tune Hybrid Search
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Schema — The table needs a VECTOR column (with a VECTOR INDEX) and text columns you want to search (with FULLTEXT INDEX). When you insert a row, both indexes are updated; no extra sync step.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Queries — Pass both vector (or column name + query vector) and full-text query string, and add relational filters (e.g. WHERE user_id = ?) as needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tuning — On the vector side: top_k and similarity thresholds. On the full-text side: tokenization and match mode. When merging with RRF, watch the ratio of results from each side so one doesn’t dominate (the docs have examples and suggested ranges).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Community experience (e.g. the post “Experience seekdb’s Hybrid Search”) shows that semantic + keyword together is more reliable than vector-only or full-text-only, especially with proper nouns, numbers, and code.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. AI Functions: Embedding, Reranking, and LLM Inside the Database
&lt;/h2&gt;

&lt;p&gt;Hybrid search alone isn’t enough for RAG — you still need embedding, reranking, and an LLM. Doing all of that in the app via remote services adds latency and dependencies. seekdb’s AI Functions move part of that into the database: you can call in-DB embedding and reranking at write or query time, and even run LLM inference and prompt handling there, so the “retrieve → rerank → generate” pipeline is shorter and some logic lives in the DB.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Embedding — Vectorize text at write time or at query time; no need to call an external API from the app before writing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reranking — Rerank hybrid search results with a model inside the DB, cutting down app-layer round-trips.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;LLM / Prompt — Run simple inference or prompt templates in the DB for rule-heavy flows; keep complex chat in your application LLM.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Compared with “everything via external services”: in-DB AI reduces network hops and centralizes permissions and config; it fits when you care about latency and privacy and are fine using seekdb’s built-in or configured models. If you’re already tied to external embedding/LLM services, you can keep them and use seekdb purely as the retrieval layer.&lt;/p&gt;

&lt;p&gt;For details and configuration, see seekdb docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Summary: What You Gain
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc08lmhqwub1f5rskctwi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc08lmhqwub1f5rskctwi.png" alt=" " width="720" height="169"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you’re comfortable with these, the next step is RAG and knowledge bases: use seekdb for storage and retrieval, and Dify or your own front end for chat and workflows. The next post will cover getting started with building RAG and a knowledge base using seekdb and Dify.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Repo: &lt;a href="https://github.com/oceanbase/seekdb" rel="noopener noreferrer"&gt;https://github.com/oceanbase/seekdb&lt;/a&gt; (Apache 2.0 — Stars, Issues, PRs welcome)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Docs: &lt;a href="https://docs.seekdb.ai/seekdb/hybrid-search/" rel="noopener noreferrer"&gt;https://docs.seekdb.ai/seekdb/hybrid-search/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Discord: &lt;a href="https://discord.com/channels/1331061822945624085/1331061823465590805" rel="noopener noreferrer"&gt;https://discord.com/channels/1331061822945624085/1331061823465590805&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dev.to: &lt;a href="https://dev.to/seekdb"&gt;https://dev.to/seekdb&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Press: &lt;a href="https://www.marktechpost.com/2025/11/26/oceanbase-releases-seekdb-an-open-source-ai-native-hybrid-search-database-for-multi-model-rag-and-ai-agents/" rel="noopener noreferrer"&gt;https://www.marktechpost.com/2025/11/26/oceanbase-releases-seekdb-an-open-source-ai-native-hybrid-search-database-for-multi-model-rag-and-ai-agents/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you or your team are building an AI application/workflow — what do you expect from a database? Let’s chat in the comments.&lt;/p&gt;

&lt;p&gt;Our team is building some cool new features, and we might just solve your pain points. Open source is about collaboration — share your challenges and let’s build better together!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vectordatabase</category>
      <category>opensource</category>
      <category>database</category>
    </item>
    <item>
      <title>We Built an Agent That Analyzes Itself — Here’s What We Learned</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Mon, 27 Apr 2026 12:40:53 +0000</pubDate>
      <link>https://forem.com/seekdb/we-built-an-agent-that-analyzes-itself-heres-what-we-learned-md9</link>
      <guid>https://forem.com/seekdb/we-built-an-agent-that-analyzes-itself-heres-what-we-learned-md9</guid>
      <description>&lt;p&gt;&lt;em&gt;When your Agent’s footprints become team insights, something interesting happens.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpvttmf2qlt8xyjrunaur.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpvttmf2qlt8xyjrunaur.png" alt=" " width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Your team builds an AI Agent. It works great. People use it every day — in Slack, in DingTalk, in Discord.&lt;/p&gt;

&lt;p&gt;Then what?&lt;/p&gt;

&lt;p&gt;The conversations vanish into chat history. The queries disappear after execution. The insights stay trapped in individual sessions.&lt;/p&gt;

&lt;p&gt;Over time, nobody can answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;What questions do people ask most?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Where does the Agent fail repeatedly?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;What patterns hide in thousands of interactions?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We faced this exact problem. And the answer wasn’t “add more analytics.” The answer was: build an Agent that analyzes itself.&lt;/p&gt;

&lt;p&gt;Meet bubseek — an insight Agent that turned our scattered footprints into team intelligence.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is bubseek?
&lt;/h2&gt;

&lt;p&gt;One-liner: A self-driven insight Agent built on bub (Agent framework) + seekdb (AI-native database).&lt;/p&gt;

&lt;p&gt;What it does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Accepts natural language requests (“Track AI trending projects this week”)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Connects to data sources autonomously (GitHub, Slack, internal systems)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Defines views, executes analysis, generates reports&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Stores everything in seekdb — including its own execution traces&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Analyzes its own traces to produce team insights&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The twist: bubseek doesn’t just consume data. It consumes itself. Every interaction becomes fuel for understanding how the team works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why We Built It
&lt;/h2&gt;

&lt;h2&gt;
  
  
  The Old Way: BI Ticket Backlog
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Team member: "I need a dashboard for GitHub trending"
 ↓
Product manager: "Add to backlog"
 ↓
2 weeks later: "Requirements unclear, need refinement"
 ↓
1 month later: Dashboard shipped (wrong metrics)
 ↓
Repeat.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Small requests get deprioritized. Big requests take forever.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bubseek Way: Conversation, Not Queue
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Team member: "Track AI trending projects, update daily"
 ↓
bubseek: "Got it. Setting up GitHub → seekdb → daily report"
 ↓
Next morning: Report arrives in Slack
 ↓
Team member: "Add vLLM mention analysis"
 ↓
bubseek: "Updated. Next report will include it"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response time: From “weeks” to “seconds.”&lt;/p&gt;

&lt;h2&gt;
  
  
  The Building Blocks
&lt;/h2&gt;

&lt;p&gt;bubseek combines two projects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;bub (Agent Framework)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hook-first architecture: core stays minimal, features as plugins&lt;/li&gt;
&lt;li&gt;Tape system: immutable execution trace (every thought, tool call, result)&lt;/li&gt;
&lt;li&gt;Skills engine: extendable tool library&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;seekdb (AI-Native Database)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL + vector + full-text search in one database&lt;/li&gt;
&lt;li&gt;Lightweight: runs on 1 core, 2GB RAM&lt;/li&gt;
&lt;li&gt;Designed for AI workloads (RAG, embeddings, hybrid search)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Together: bub handles the Agent loop, seekdb stores everything (including the Agent’s own footprints).&lt;/p&gt;

&lt;h2&gt;
  
  
  What We Learned
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Lesson 1: Channels should be zero-code
&lt;/h2&gt;

&lt;p&gt;Built-in channels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Feishu, DingTalk, WeChat, Discord, Telegram&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Web interface (Marimo notebook)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Setup: Configure environment variables for each channel. No additional code required.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: Feishu is an enterprise collaboration platform popular in Asia, similar to Slack.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  Lesson 2: Data consumption is a conversation, not a queue
&lt;/h2&gt;

&lt;p&gt;Traditional BI: Deploy system → build reports → train users&lt;/p&gt;

&lt;p&gt;bubseek: Tell it what you want → it figures out the rest&lt;/p&gt;

&lt;h2&gt;
  
  
  Example Workflows
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frdab2347wz04jm9uau8s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frdab2347wz04jm9uau8s.png" alt=" " width="720" height="202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Output formats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Marimo notebooks (interactive Python dashboards)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;GitHub repo cards (SVG/PNG for sharing)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Natural language reports (for chat)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Lesson 3: The Agent is its own best analyst
&lt;/h2&gt;

&lt;p&gt;Traditional observability: External monitoring system → metrics → dashboards&lt;/p&gt;

&lt;p&gt;bubseek observability: Agent naturally produces data → analyzes itself&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tape System
&lt;/h2&gt;

&lt;p&gt;Every Agent execution creates an immutable trace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;User request&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Agent thoughts (step-by-step reasoning)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tool calls (which APIs, which queries)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Results (what was found/generated)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Delivery (where sent, when)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This tape isn’t a log. It’s the data source for meta-analysis.&lt;/p&gt;
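
&lt;p&gt;Conceptually, a tape entry is just a structured record. The shape below is illustrative, not bub’s actual schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Illustrative tape entry; bub's actual schema may differ.
tape_entry = {
    "trace_id": "2026-04-20-0042",
    "user_request": "Track AI trending projects, update daily",
    "thoughts": ["need a GitHub source", "store snapshots in seekdb"],
    "tool_calls": [{"tool": "github_search", "query": "topic:ai"}],
    "result": "daily report generated (14 repos)",
    "delivery": {"channel": "slack", "at": "2026-04-20T08:00:00Z"},
}

# Because entries land in seekdb, meta-analysis is an ordinary query:
# "top requests this week" is just a GROUP BY over tape rows.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;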

&lt;h2&gt;
  
  
  Example Insights (from early testing)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3nmt1sfjzri7hq0u203m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3nmt1sfjzri7hq0u203m.png" alt=" " width="720" height="149"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The loop closes: Agent serves team → produces data → data analyzed → team understands itself better.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 4: Your data foundation shapes everything
&lt;/h2&gt;

&lt;p&gt;Why seekdb?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4uazi0isftkqubo6ush.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4uazi0isftkqubo6ush.png" alt=" " width="720" height="218"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BUB_TAPESTORE_SQLALCHEMY_URL=mysql+oceanbase://user:pass@host:port/database
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Exit strategy: If seekdb hits limits, seamless upgrade to OceanBase (same protocol, distributed scale).&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Innovation: Self-Understanding Agent
&lt;/h2&gt;

&lt;p&gt;Most Agents are stateless workers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Do task → forget everything&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Next task starts from zero&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;No institutional memory&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;bubseek is a stateful team member:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Remembers all interactions&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Learns from failures&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Produces insights about itself&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example: bubseek Analyzes Its Own Usage&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "What questions did people ask most this week?"

bubseek: (queries its own tape in seekdb)
         (clusters by topic)
         (generates report)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example output (illustrative — your actual numbers will vary):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Top 5 topics:
 1. GitHub trending (23 queries)
 2. AI paper summaries (18 queries)
 3. Team sprint metrics (12 queries)
 4. Competitor analysis (9 queries)
 5. Code review automation (7 queries)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No separate analytics tool needed. The Agent is the analytics.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;seekdb installed (1 core, 2GB RAM minimum) — see seekdb deployment docs (&lt;a href="https://docs.seekdb.ai/seekdb/seekdb-overview/" rel="noopener noreferrer"&gt;https://docs.seekdb.ai/seekdb/seekdb-overview/&lt;/a&gt;) if you need a local server&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A model provider account and API credentials compatible with bubseek (see the bubseek README: &lt;a href="https://github.com/ob-labs/bubseek" rel="noopener noreferrer"&gt;https://github.com/ob-labs/bubseek&lt;/a&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Configure bubseek
&lt;/h2&gt;

&lt;p&gt;Example values below are placeholders from the README; replace them with your own model, API key, API base URL, and database URL before running uv run bub chat.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/ob-labs/bubseek.git
cd bubseek
uv sync
uv run bub --help
export BUB_MODEL=openrouter:qwen/qwen3-coder-next
export BUB_API_KEY=sk-or-v1-your-key
export BUB_API_BASE=https://openrouter.ai/api/v1
export BUB_TAPESTORE_SQLALCHEMY_URL=mysql+oceanbase://user:pass@host:port/database
uv run bub chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For channel-specific variables, production URLs, and alternative model providers, use the full guides in the repo: Getting started (&lt;a href="https://github.com/ob-labs/bubseek/blob/main/docs/getting-started.md" rel="noopener noreferrer"&gt;https://github.com/ob-labs/bubseek/blob/main/docs/getting-started.md&lt;/a&gt;), Configuration (&lt;a href="https://github.com/ob-labs/bubseek/blob/main/docs/configuration.md" rel="noopener noreferrer"&gt;https://github.com/ob-labs/bubseek/blob/main/docs/configuration.md&lt;/a&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s Next
&lt;/h2&gt;

&lt;p&gt;Under discussion:&lt;/p&gt;

&lt;p&gt;Further iterations may include multi-Agent coordination, smarter schema design, and proactive insight recommendations.&lt;/p&gt;

&lt;p&gt;(Roadmap still evolving — join the conversation on GitHub.)&lt;/p&gt;

&lt;h2&gt;
  
  
  The Big Picture
&lt;/h2&gt;

&lt;p&gt;bubseek isn’t a BI tool. It’s a bet on a different future:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7hgzoz49pzul4xbeqi6n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7hgzoz49pzul4xbeqi6n.png" alt=" " width="720" height="220"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We’re not saying bubseek is the answer. We’re saying: the question is worth asking.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What if your Agent knew as much about your team as you do?&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;bubseek: &lt;a href="https://github.com/ob-labs/bubseek" rel="noopener noreferrer"&gt;https://github.com/ob-labs/bubseek&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;seekdb: &lt;a href="https://github.com/oceanbase/seekdb" rel="noopener noreferrer"&gt;https://github.com/oceanbase/seekdb&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;bub Framework: &lt;a href="https://github.com/bubbuild/bub" rel="noopener noreferrer"&gt;https://github.com/bubbuild/bub&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tape Context Model: &lt;a href="https://tape.systems" rel="noopener noreferrer"&gt;https://tape.systems&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>dataengineering</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>Token, Harness, OpenClaw, RAG, MCP, Agent — What’s the Difference? One Map Makes It Clear</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Mon, 27 Apr 2026 12:33:31 +0000</pubDate>
      <link>https://forem.com/seekdb/token-harness-openclaw-rag-mcp-agent-whats-the-difference-one-map-makes-it-clear-576a</link>
      <guid>https://forem.com/seekdb/token-harness-openclaw-rag-mcp-agent-whats-the-difference-one-map-makes-it-clear-576a</guid>
      <description>&lt;p&gt;&lt;em&gt;You know these terms alone. Together? They’re confusing. Here’s the map.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F782udjnvjxo8n3u41x74.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F782udjnvjxo8n3u41x74.png" alt=" " width="720" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ever feel like this — people keep saying you need “Agents” for AI apps, connect “MCP”, install “OpenClaw”, and now there’s “Harness” too? Each term makes sense alone, but together it’s overwhelming. Today we’ll sort it out — who comes first, who comes after, who manages whom. By the end, you’ll see clearly.&lt;/p&gt;

&lt;p&gt;No fluff. Let’s look at a real case: Your boss asks you to “research the latest competitor dynamics online, combine with our company’s past two years of legacy product data, and produce a new product development PPT with data charts.”&lt;/p&gt;

&lt;p&gt;Below is the complete process from start to finish. Once we run through this, those confusing concepts will naturally fall into place.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqett7meddx2eeh95ohon.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqett7meddx2eeh95ohon.png" alt="One Map Makes It Clear" width="720" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: You Receive the Task, Send Instructions to OpenClaw
&lt;/h2&gt;

&lt;p&gt;Your boss’s request is clear, but you can’t personally search for info, query data, draw charts, and write the PPT. To complete this efficiently, you organize the task into a single instruction and send it to something called OpenClaw.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. OpenClaw (“Lobster” 🦞)
&lt;/h2&gt;

&lt;p&gt;What is OpenClaw? Simply put, it’s the “central dispatch console” for the entire AI assembly line — task decomposition, resource allocation, budget monitoring, and logging.&lt;/p&gt;

&lt;p&gt;To understand why OpenClaw is needed, we first need to know what the foundation of the entire system is. No matter how complex the operations later become, everything ultimately rests on two basic things.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Large Language Model (LLM)
&lt;/h2&gt;

&lt;p&gt;ChatGPT, Claude — essentially, they’re just really smart brains. Brilliant, knowledgeable, but they have two fatal flaws:&lt;/p&gt;

&lt;p&gt;First, they only “respond passively” — you ask one question, they give one answer, never proactively working.&lt;/p&gt;

&lt;p&gt;Second, the default chat experience has no durable thread state — every conversation is a fresh start; close the dialog and the model does not “remember” your last session. Production stacks fix that outside the model with logs, RAG, Memory, and databases (covered below).&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Token
&lt;/h2&gt;

&lt;p&gt;Many people think token count equals word count — big mistake. A token is the model’s basic text unit — think “atoms of text.” One word can be one token (cat) or split into several (understanding → under + stand + ing). On average, 100 English words ≈ 130 tokens. Every sentence you send and every piece the model generates shows up on the meter. This determines two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;First, your money — APIs charge by token&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Second, its “short-term memory” — whatever fits in the context window for this request&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why do tokens affect memory? Here’s the counter-intuitive mechanism. LLMs themselves have no memory function. Before answering you, the system packages your prior conversation together with your new question into one huge text block and feeds it to the model to read from scratch. The size of this text block is the “context window,” and the token limit is the window’s maximum capacity. Once the conversation history gets too long and exceeds the token limit, the system has to truncate — discarding the earliest content.&lt;/p&gt;

&lt;p&gt;So “amnesia” in a bare chat UI isn’t mystical — there is literally a finite window. Token is both fuel and the size of that window. Long-term facts and preferences still live in external stores (RAG, Memory, databases) — not inside the next blank chat.&lt;/p&gt;
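
&lt;p&gt;Here is a minimal sketch of both effects in Python, using the open-source tiktoken tokenizer. The encoding name and the 8,000-token window are illustrative choices, not any particular model’s numbers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# pip install tiktoken (open-source tokenizer; the limit below is illustrative)
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # one common encoding; models vary

def count_tokens(text):
    return len(enc.encode(text))             # this is what the meter bills

def fit_window(history, new_question, limit=8000):
    """Package history + question; drop the earliest turns until it fits."""
    packed = history + [new_question]
    while sum(count_tokens(t) for t in packed) &amp;gt; limit and len(packed) &amp;gt; 1:
        packed.pop(0)                         # truncate the oldest content first
    return packed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;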

&lt;p&gt;Alright, foundation is clear. But foundation alone isn’t enough — who orchestrates all those complex parts? This is why OpenClaw exists. Next, it will awaken a team to get to work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: OpenClaw Awakens a Multi-Agent Team, Each Playing Their Role
&lt;/h2&gt;

&lt;p&gt;Upon receiving the instruction, OpenClaw awakens a Multi-Agent team.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Multi-Agent
&lt;/h2&gt;

&lt;p&gt;Multi-agent is the product of necessary division of labor for complex tasks. One agent can be a great line cook. But don’t ask it to also be the head chef, server, cashier, and dishwasher simultaneously — that’s why kitchens have stations.&lt;/p&gt;

&lt;p&gt;In the multi-agent model, you create a group with four core roles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Search Agent — finding information across the web&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Writer Agent — drafting articles and reports&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reviewer Agent — checking for errors and policy violations&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Analyst Agent — generating charts and data insights&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are two coordination patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Orchestrator–worker — A coordinator decomposes tasks, assigns work, and collects results (most enterprise setups)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Peer-to-peer — No fixed coordinator; multiple agents message each other in a shared channel and pick up relevant tasks&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Enterprise deployments usually prefer orchestrator–worker because it’s easier to control, permission, and audit.&lt;/p&gt;

&lt;p&gt;In this PPT task, OpenClaw awakens three agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;“Search Agent” crawls competitor dynamics&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;“Internal Data Agent” retrieves historical data&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;“Analyst Agent” generates charts&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How do they work? This brings us to the essence of Agent.&lt;/p&gt;

&lt;p&gt;Many people think Agent is just “LLM plus some tools,” but this misses the most critical thing. The core difference between Agent and LLM is the locus of control.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxs7t4zmrfetgolih60rq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxs7t4zmrfetgolih60rq.png" alt=" " width="720" height="239"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To achieve this transformation, you need to wrap a “scheduler” layer outside the LLM. This scheduler does four things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Decomposition — Break complex tasks into executable sub-steps&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Execution — Call tools one by one to complete each step&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Observation — Watch execution results of each step; continue if successful, retry or switch plans if failed&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Decision — Make its own judgments at forks in the road&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So: Agent = Brain (LLM) + Scheduler + Knowledge Base + Skill Library + Tooling (often wired through MCP). The LLM only understands goals and generates instructions; the true “proactiveness” comes from the scheduler layer outside. AI assistants help you brainstorm; agents finish the work for you.&lt;/p&gt;
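
&lt;p&gt;A toy version of that scheduler loop in Python. Every function here (decompose, run_tool, plan_b) is a hypothetical stand-in for real LLM and tool calls, not any framework’s API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from dataclasses import dataclass

@dataclass
class Outcome:
    ok: bool
    data: str

# Hypothetical stand-ins for real LLM and tool calls.
def decompose(goal):
    return [f"search: {goal}", f"chart: {goal}"]

def run_tool(step):
    return Outcome(ok=True, data=f"done: {step}")

def plan_b(step, outcome):
    return step + " (alternate route)"

def run_agent(goal, max_retries=2):
    results = []
    for step in decompose(goal):              # Decomposition
        for _ in range(max_retries + 1):
            outcome = run_tool(step)          # Execution
            if outcome.ok:                    # Observation: success, move on
                results.append(outcome)
                break
            step = plan_b(step, outcome)      # Decision: switch plans, retry
        else:
            raise RuntimeError(f"gave up on: {step}")
    return results
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;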

&lt;p&gt;One more common point of confusion: What’s the difference between Agent and OpenClaw?&lt;/p&gt;

&lt;p&gt;In one sentence: Agent is the worker getting the job done; OpenClaw is the system managing the workers.&lt;/p&gt;

&lt;p&gt;One agent is like a renovation worker — you tell him “paint this wall white,” he finishes. Multi-agent is like a renovation team with masons, electricians, painters, collaborating to renovate a room. OpenClaw is the renovation company’s operations backend. It doesn’t manage how to paint walls specifically; it manages: which worker is available, whether tools are ready, whether there’s permission to enter the site, how much work was done and how much it cost, whether the work process was logged, what to do if a worker runs away.&lt;/p&gt;

&lt;p&gt;Why can’t one giant agent replace OpenClaw? Three recurring failure modes:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F207887hqw6k007nprkup.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F207887hqw6k007nprkup.png" alt=" " width="720" height="248"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the agent concept clear, let’s see how the three agents awakened by OpenClaw actually work. This is where MCP, databases, RAG, Skill, Memory naturally emerge.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. MCP (Model Context Protocol)
&lt;/h2&gt;

&lt;p&gt;First, the “Search Agent” crawls competitor dynamics across the web via MCP interfaces.&lt;/p&gt;

&lt;p&gt;MCP is an open standard for plugging tools into models (think USB-C for capabilities). Before it appeared, to let AI search the web, you needed programmers to write bespoke glue translating “what AI wants to search” into “call search API”. Change the tool, rewrite the code; change the AI model, maybe rewrite again. This is the “M×N Problem”: M models × N tools = M×N development efforts.&lt;/p&gt;

&lt;p&gt;MCP changes this pattern to “M+N”: Tool developers ship one MCP surface; any MCP-capable host can call it; hosts implement MCP once and reach many tools. MCP is essentially a translation layer — AI says “I want to search competitors,” MCP translates to browser-understandable instructions; browser returns results, MCP translates back to model-friendly content. With MCP, the model is like plugging into a hub — gaining many connectors without custom glue each time.&lt;/p&gt;
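
&lt;p&gt;To make the “describe once, call anywhere” idea concrete, here is a simplified Python rendering of a tool descriptor, loosely modeled on the shape an MCP host receives when listing a server’s tools. Real MCP traffic is JSON-RPC; the dispatch function below is illustrative:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# One tool, described once by its developer (simplified from the shape
# an MCP host receives when it lists a server's tools).
search_tool = {
    "name": "web_search",
    "description": "Search the web and return top results",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

def call_tool(tool, arguments):
    # A real host validates arguments against inputSchema, then sends a
    # JSON-RPC tools/call request to whichever MCP server owns the tool.
    return {"tool": tool["name"], "arguments": arguments, "status": "sent"}

print(call_tool(search_tool, {"query": "competitor dynamics"}))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;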

&lt;h2&gt;
  
  
  6. Vector Database / AI Database
&lt;/h2&gt;

&lt;p&gt;Second, the “Internal Data Agent” triggers RAG, diving into the vector database to retrieve the past two years of historical data.&lt;/p&gt;

&lt;p&gt;A vector database (or AI-native DB) is a semantic index over embeddings. Traditional databases (like MySQL) are rigid — you search “happy,” it absolutely won’t find “glad.” Vector databases turn documents and chat into vectors — long arrays of numbers representing directions in embedding space. Texts with similar meaning land near each other in that space. “Happy” and “glad” are close; “happy” and “sad” are far.&lt;/p&gt;

&lt;p&gt;When you search “competitor Q3 data,” it doesn’t match keywords — it embeds the query, then returns the nearest neighbors. It’s not matching text; it’s retrieving by semantic distance.&lt;/p&gt;
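
&lt;p&gt;A toy demonstration of retrieval by semantic distance, in Python with numpy. The three-dimensional vectors are invented for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

# Toy 3-dimensional "embeddings"; real ones come from an embedding model.
docs = {
    "happy": np.array([0.90, 0.10, 0.00]),
    "glad":  np.array([0.85, 0.15, 0.05]),
    "sad":   np.array([-0.80, 0.20, 0.10]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.88, 0.12, 0.02])   # embedding of the user's query

# Rank by semantic distance, not keyword overlap.
for word in sorted(docs, key=lambda w: -cosine(query, docs[w])):
    print(word, round(cosine(query, docs[word]), 3))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;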

&lt;p&gt;Related projects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Vector Database OceanBase: &lt;a href="https://github.com/oceanbase/oceanbase" rel="noopener noreferrer"&gt;https://github.com/oceanbase/oceanbase&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Native AI Database seekdb: &lt;a href="https://github.com/oceanbase/seekdb" rel="noopener noreferrer"&gt;https://github.com/oceanbase/seekdb&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  7. RAG (Retrieval-Augmented Generation)
&lt;/h2&gt;

&lt;p&gt;Without RAG, LLMs lean on parametric knowledge — when that’s thin, they hallucinate more often. With RAG, a typical loop has four steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Retrieval — Find relevant materials in the vector database&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ranking — Pick the most reliable pieces&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Context assembly — Combine materials and question into a single prompt&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Generation — The LLM answers conditioned on those materials&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;RAG reduces — but does not eliminate — fabrication: The model is steered with “answer from the following context” instead of an open-ended prompt. If the context is thin or wrong, it can still go off rails — guardrails and harnesses still matter.&lt;/p&gt;
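
&lt;p&gt;A skeletal version of those four steps. Here retrieve, rerank, and llm are hypothetical stubs standing in for a vector-database query, a reranker, and a model call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def rag_answer(question):
    candidates = retrieve(question, top_k=20)        # 1. Retrieval (stubbed)
    context = rerank(question, candidates)[:5]       # 2. Ranking (stubbed)
    prompt = (                                       # 3. Context assembly
        "Answer ONLY from the following context.\n\n"
        + "\n---\n".join(context)
        + f"\n\nQuestion: {question}"
    )
    return llm(prompt)                               # 4. Generation (stubbed)

# Stubs so the sketch runs end to end.
def retrieve(q, top_k):
    return [f"doc #{i} about {q}" for i in range(top_k)]

def rerank(q, docs):
    return docs

def llm(prompt):
    return f"(model output conditioned on a {len(prompt)}-char prompt)"

print(rag_answer("competitor Q3 data"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;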

&lt;h2&gt;
  
  
  8. Skills (Skill Package)
&lt;/h2&gt;

&lt;p&gt;Third, the “Analyst Agent” calls up the chart generation Skill you predefined earlier, and queries its Memory: “The boss can’t distinguish red from green; charts must not rely on red–green encoding.”&lt;/p&gt;

&lt;p&gt;Skills exist to solve a real problem: Prompts don’t stick. A prompt is a one-off command like “rewrite this email to sound more professional” or “summarize this meeting transcript into 3 bullets.” It works today, but when you open a new chat tomorrow, you have to type it again. Writing Prompts daily equals doing AI chores every day.&lt;/p&gt;

&lt;p&gt;Skills fix this by making repeatable workflows permanent. Think automated buttons, not verbal instructions. You write the SOP once, the system executes it forever. Prompt is giving orders; Skill is building an assembly line.&lt;/p&gt;
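
&lt;p&gt;One way to picture “write the SOP once, execute it forever”: a small registry where each skill is a named, reusable function. This is purely illustrative; real skill packages also bundle prompts, tools, and metadata:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SKILLS = {}

def skill(name):
    """Register a repeatable workflow under a permanent name."""
    def register(fn):
        SKILLS[name] = fn
        return fn
    return register

@skill("summarize-meeting")
def summarize_meeting(transcript):
    # The SOP lives here once; nobody retypes the prompt tomorrow.
    return f"3-bullet summary of {len(transcript)} chars of transcript"

print(SKILLS["summarize-meeting"]("...transcript text..."))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;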

&lt;h2&gt;
  
  
  9. Memory (Long-Term Memory)
&lt;/h2&gt;

&lt;p&gt;And the Memory just mentioned records “you as a person.” RAG remembers objective materials; Memory remembers subjective preferences. Technically they’re often implemented similarly — both retrieved from stores outside the model (vector DBs, KV, etc.). The difference: RAG stores documents and reports, imported by developers in advance; Memory stores user preferences and identity tags, automatically extracted and stored by the system during conversation. RAG is the company’s shared filing cabinet; Memory is your personal file folder. With Memory, the assistant can reuse stable preferences — e.g., avoiding red–green palettes — without you repeating them every session.&lt;/p&gt;

&lt;p&gt;PowerMem (a reference implementation for OpenClaw-style hosts): &lt;a href="https://github.com/oceanbase/powermem" rel="noopener noreferrer"&gt;https://github.com/oceanbase/powermem&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Encountering Hard Problems, Summon Special Forces
&lt;/h2&gt;

&lt;p&gt;The task involves writing a complex piece of data analysis code — beyond what the standard Analyst Agent can handle. For this, OpenClaw activates a specialist coding agent (in our stack, Claude Code — the coder’s exclusive tool for deep technical work).&lt;/p&gt;

&lt;h2&gt;
  
  
  10. Claude Code
&lt;/h2&gt;

&lt;p&gt;Don’t confuse Claude Code with the web-chat version of Claude. The web version is a consultant: in the browser you ask one question, it answers one. Claude Code is completely different — it lives directly in your computer terminal, with broad filesystem and shell access — read, write, modify, and delete files where your OS user allows. Workflow: you give the goal, it decomposes and executes with minimal interruption. Built-in tools include reading files, writing files, running commands, and searching code.&lt;/p&gt;

&lt;p&gt;The principle: when Anthropic trained Claude, they specifically strengthened its terminal-command and file-operation capabilities, then packaged it as a local terminal agent with the file system and command line wired in as built-in tools. Opening Claude Code equals launching an agent specialized in writing code. In one sentence: it digs through tens of thousands of lines of codebase itself, fixes bugs itself, and runs its own tests.&lt;/p&gt;

&lt;p&gt;Claude Code finishes writing and running the data analysis code, results returned to “Analyst Agent,” charts generated smoothly. PPT draft emerges.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Finished Product, First Pass Security Check
&lt;/h2&gt;

&lt;p&gt;PPT draft is generated. But do you really dare to send it directly to your boss?&lt;/p&gt;

&lt;p&gt;What if the agent used red–green-only charts (the boss can’t distinguish red from green)? What if there’s a number in the data charts the model invented? What if the format completely doesn’t match company templates? Worse: what if a tool-enabled agent had credentials broad enough to damage production data? Keep database credentials, destructive tools, and customer PII out of reach of exploratory agents — scope tools per environment.&lt;/p&gt;
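
&lt;p&gt;A sketch of per-environment tool scoping, with invented tool and environment names:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Which tools an agent may touch, per environment (names invented).
ALLOWED_TOOLS = {
    "exploration": {"web_search", "read_docs"},
    "production": {"web_search", "read_docs", "query_reporting_replica"},
    # No environment grants destructive tools or raw production credentials.
}

def guard_tool_call(env, tool):
    if tool not in ALLOWED_TOOLS.get(env, set()):
        raise PermissionError(f"{tool!r} is not allowlisted in {env!r}")

guard_tool_call("exploration", "web_search")               # passes
guard_tool_call("exploration", "query_reporting_replica")  # raises
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;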

&lt;p&gt;This is why the AI assembly line still needs one final layer: Harness Engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  11. Harness Engineering
&lt;/h2&gt;

&lt;p&gt;Mitchell Hashimoto (HashiCorp co-founder) wrote persuasively in early 2026 about treating long-running agents like systems that need a harness — constraints, checks, and recovery. See “My AI Adoption Journey.” “Harness” literally evokes horse tackle — reins, harness, saddle — equipment for guiding strength without fighting it. The metaphor fits: agents are powerful and fast, but they can spook, drift, or act outside intent. Harness engineering is the discipline of making unsafe failure modes rare — with permissions, tests, sandboxes, humans in the loop, and telemetry.&lt;/p&gt;

&lt;p&gt;Harness Engineering differs from ad-hoc “debug and hope.” Traditional thinking: agent makes a mistake, you manually intervene, then hope it doesn’t repeat. Harness thinking: each time a failure mode appears, encode a guard — policy, test, sandbox, or auto-rollback — so the same class of harm is much harder to repeat.&lt;/p&gt;

&lt;p&gt;Mitchell Hashimoto cited a classic example: Let an AI agent refactor a million-line codebase. The naive path is wide GitHub permissions and “go ahead,” then waiting for disaster — the agent will recklessly modify files, introduce bugs, and delete files it thinks are useless. A harnessed approach might look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fombzdsfu4il9wj1wmjly.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fombzdsfu4il9wj1wmjly.png" alt=" " width="720" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Speaking of this, you might be curious: What’s the difference between Harness Engineering and OpenClaw?&lt;/p&gt;

&lt;p&gt;OpenClaw manages “assembly line operations” — scheduling, allocation, monitoring, logging. Harness Engineering manages “assembly line safety” — constraining behavior boundaries, validating output quality, building self-healing closed loops. One manages “can run”; the other manages “runs stably.”&lt;/p&gt;

&lt;p&gt;Here, let’s pause and think about a question: Why do many enterprises still dare not put AI agents into production environments?&lt;/p&gt;

&lt;p&gt;It’s not because agents aren’t smart enough — it’s because of distrust. You don’t know what it’ll do next second, you don’t know if it’ll spend your entire budget, you don’t know if it’ll send a nonsensical email to clients in the middle of the night.&lt;/p&gt;

&lt;p&gt;Harness Engineering targets that trust gap. It uses engineered constraints — policy, tests, observability, approvals — to move agents from opaque automation toward auditable, predictable, stoppable systems. Predictability is what lets serious work land on agents.&lt;/p&gt;

&lt;p&gt;Back to our task. The system validates PPT format. It checks color encodings against the accessibility rule. After passing, the PPT lands in your Slack drafts.&lt;/p&gt;

&lt;p&gt;Tokens are metered end to end; per-step caps catch runaway loops. If spend crosses 80% of budget, the system routes to a cheaper model tier where policy allows. Every tool call hits an audit log, so when the boss asks “Where did this number come from?” you can answer in seconds.&lt;/p&gt;
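
&lt;p&gt;A minimal sketch of that metering logic; the thresholds, tier names, and log format are all illustrative:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json, time

BUDGET_TOKENS = 200_000     # illustrative cap
used = 0

def record(tool, tokens, note):
    """Meter every call and append it to an audit log."""
    global used
    used += tokens
    with open("audit.jsonl", "a") as f:
        f.write(json.dumps({"ts": time.time(), "tool": tool,
                            "tokens": tokens, "note": note}) + "\n")

def pick_model_tier():
    # Past 80% of budget, fall back to a cheaper tier (where policy allows).
    return "cheap-tier" if used &amp;gt; 0.8 * BUDGET_TOKENS else "default-tier"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;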

&lt;p&gt;In a few minutes, done — and you only clicked Approve once.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: Which Layer Are You At?
&lt;/h2&gt;

&lt;p&gt;Once you see how these thirteen building blocks fit together across five layers — from the foundation (LLM + Token), through memory and knowledge (RAG + Memory + Vector DB + AI DB + Skill + MCP), to execution (Agent + Multi-Agent + Claude Code), orchestration (OpenClaw), and finally the safety envelope (Harness Engineering) — you stop treating the space as magic jargon.&lt;/p&gt;

&lt;p&gt;If this helped, clap and share it with someone who’s still drawing the map on a whiteboard.&lt;/p&gt;

&lt;p&gt;Where are you now?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flietx2ikfu9fkisqspa5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flietx2ikfu9fkisqspa5.png" alt=" " width="720" height="222"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Would you ship Friday’s release from an agent you can’t stop or explain?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>softwareengineering</category>
      <category>llm</category>
    </item>
    <item>
      <title>Harness Engineering in Practice: Building a 6-Agent System That Runs Itself</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Mon, 27 Apr 2026 12:28:30 +0000</pubDate>
      <link>https://forem.com/seekdb/harness-engineering-in-practice-building-a-6-agent-system-that-runs-itself-31b</link>
      <guid>https://forem.com/seekdb/harness-engineering-in-practice-building-a-6-agent-system-that-runs-itself-31b</guid>
      <description>&lt;p&gt;&lt;em&gt;“Six agents” here means one orchestrator (Zoe) plus five specialist agents. Six ACP coding experts run as concurrent implementation workers — not counted in that headline number.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyeuwd4rk17h9jab6hkpy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyeuwd4rk17h9jab6hkpy.png" alt=" " width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Your Day Has Been Taken Over
&lt;/h2&gt;

&lt;p&gt;Overnight, the trading agent ships the prior US session wrap-up. By morning, the macro analyst has the pre-market brief ready. The butler has pushed weather, schedule, and to-dos. AINews (AI Sentinel) has scanned GitHub Trending, arXiv’s latest papers, and 100+ sources — 18+ curated items ranked by importance. Content (Content Strategist) is tracking trending topics across 50+ platforms.&lt;/p&gt;

&lt;p&gt;Here’s what matters most to me — automatic tracking of AI dynamics and tech trends. After discovering valuable projects or papers, the system doesn’t just push news — it evaluates impact on our systems and provides P0/P1/P2 action recommendations. Valuable discoveries enter Zoe’s Tech Radar (Zoe is the CTO Agent), going through evaluation → decision → delegated coding implementation.&lt;/p&gt;

&lt;p&gt;60 cron tasks run automatically every day (3 AM backup to 11:45 PM reflection). Agents are evolving on their own — mistakes are remembered, recurrence rates drop significantly. This isn’t rules I wrote — it’s autonomous iteration from .learnings/ to MEMORY.md.&lt;/p&gt;

&lt;p&gt;System: 1 orchestrator (Zoe) + 5 specialized agents (AINews, Trading, Macro, Content, Butler) + 6 ACP coding experts + 60 cron tasks + 100+ Skills + ~30 configured model profiles + 23 automatic recoveries in two weeks.&lt;/p&gt;

&lt;p&gt;Note: Metrics based on February-March 2026 monitoring. Individual results may vary.&lt;/p&gt;

&lt;h2&gt;
  
  
  System Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────┐
│              User (Human)                   │
│    Requirements + Key Node Approval         │
└─────────────────┬───────────────────────────┘
                  │
         ┌────────▼────────┐
         │   Zoe (CTO)     │
         │  3x Daily Check │
         └────────┬────────┘
                  │
    ┌─────────────┼─────────────┐
    │             │             │
┌───▼───┐   ┌────▼────┐   ┌───▼───┐
│AINews │   │ Trading │   │ Macro │
└───┬───┘   └────┬────┘   └───┬───┘
    │            │            │
    └────────────┼────────────┘
                 │
        ┌────────▼────────┐
        │ Content + Butler│
        └────────┬────────┘
                 │
        ┌────────▼────────┐
        │   Event Bus     │
        │ + Shared Context│
        └────────┬────────┘
                 │
        ┌────────▼────────┐
        │ ACP Coding      │
        │ (6 concurrent)  │
        └─────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key Design Decisions:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tytyilun38ness608n2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tytyilun38ness608n2.png" alt=" " width="720" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Agents Evolving Autonomously
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Designed protocols — Zoe diagnosed communication issues, designed three-state protocol (request → confirmed → final, with silent as the default "no news is good news" state), solidified into AGENTS.md&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Self-developed Skills — Content researched ways to make drafts sound less generically LLM-written (“de-AI” polish), wrote Skills, published to ClawHub (shared repository)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Strategy roundtables — Macro + Trading produce weekly reports with data snapshots, position recommendations, stop-loss discipline&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Task Watcher — Zoe designed cron-level Task Callback Event Bus for async monitoring&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My role: Set up framework, establish constraints, confirm direction. Requirement discovery, solution research, protocol design, implementation — all done by agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Team: 1+5+6 Formation
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Zoe (CTO / Chief Orchestrator)
&lt;/h2&gt;

&lt;p&gt;3 daily inspections (10:00/14:00/22:00 PT): cron execution, disk usage, session health, Chrome DevTools Protocol (CDP) leak checks, .learnings/ pending, shared-context/ timestamps.&lt;/p&gt;

&lt;p&gt;Weekly: Analyze each agent’s MEMORY.md, execute layered compression.&lt;/p&gt;

&lt;p&gt;Key capability: Solution design — three-state protocol, Task Watcher, Communication Guardrail framework, all designed autonomously.&lt;/p&gt;

&lt;h2&gt;
  
  
  AINews (AI Sentinel) — Intelligence Hub
&lt;/h2&gt;

&lt;p&gt;Collects from 100+ sources daily: GitHub Trending, arXiv, RSS, HackerNews, Reddit. 7 cron tasks: morning brief (08:30), midday paper (12:00), evening trends (20:00).&lt;/p&gt;

&lt;p&gt;Critical capability: Proactive tech impact evaluation. Discovered ReMe framework → proposed to Zoe → I confirmed → agents executed.&lt;/p&gt;

&lt;p&gt;Toolchain: github_trending.py, rss_aggregator.py, arxiv_papers.py, Tavily, agent-browser. Anti-hallucination: every item MUST have a URL, reachability self-check, and unverifiable items labeled single-source.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trading (Quantitative Analyst)
&lt;/h2&gt;

&lt;p&gt;21 cron tasks (densest load). 20 quant tools, 15 Skills (68K+ lines), 65/35 scoring (tool/AI). Covers US stocks + commodities + crypto.&lt;/p&gt;

&lt;p&gt;Four-step framework: Macro factors → scoring (technical 25% / flow 30% / fundamentals 10% / sentiment 20% / market 15%) → cross-check (sanity-check vs. macro and flow) → target + score + stop-loss + confidence.&lt;/p&gt;

&lt;p&gt;Not financial advice — automated research output only; you are responsible for any real-money decisions.&lt;/p&gt;

&lt;p&gt;Hard rules (system policy, not investment advice): no entry without a defined stop, never fabricate data, confidence &amp;lt;60% = “wait.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Macro (Chief Economist)
&lt;/h2&gt;

&lt;p&gt;9 cron tasks: Morning (07:50) → Midday (12:30) → Evening (18:00) → US pre-market (22:00) → morning digest of the prior US session (05:20 PT) — scheduled after the cash close, not at the closing bell. Sunday weekly review → Trading references for market review.&lt;/p&gt;

&lt;p&gt;Discipline: Cite sources, distinguish facts vs judgments, mark confidence (high &amp;gt;70% / medium 50–70% / low &amp;lt;50%), propose counter-arguments.&lt;/p&gt;

&lt;p&gt;Real case: Iran tension → traditional: “gold rises” → actual: oil +14%, gold -5%. Macro: “inflation logic dominates, not safe haven.” Saved to MEMORY.md.&lt;/p&gt;

&lt;h2&gt;
  
  
  Content (Content Strategist)
&lt;/h2&gt;

&lt;p&gt;9 cron tasks: Research (09:00, 50+ platforms) → Ideate (10:30, consume AINews) → Write (14:00, score drafts) → Reflect (22:10).&lt;/p&gt;

&lt;p&gt;Autonomous evolution: Discovered content too “AI-flavored” → researched humanizing / de-generic copy tools → wrote Skills → published to ClawHub.&lt;/p&gt;

&lt;p&gt;Five-Basket Radar: AI/Tech (≤40%), Product/Startup, Solopreneur, Investment/Macro, Social/International. 40% AI cap self-imposed during reflection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Butler (Life Assistant)
&lt;/h2&gt;

&lt;p&gt;7 cron tasks: Greeting (08:00) → Schedule (08:30) → 5 water reminders (rotating styles) → Health (20:00) → Summary (22:00).&lt;/p&gt;

&lt;p&gt;Philosophy: &amp;lt;50 chars per reminder, ≥1.5h interval, 23:00–07:00 emergency only, no pestering if no reply.&lt;/p&gt;

&lt;h2&gt;
  
  
  ACP Coding Experts
&lt;/h2&gt;

&lt;p&gt;Pi / Claude Code / Codex / OpenCode / Gemini / GPT-4.1-Codex. Max 6 concurrent, 120min TTL — queue or shed load when saturated so you don’t stampede gateways. Analysis agents don’t code — delegated via sessions_spawn.&lt;/p&gt;
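
&lt;p&gt;A sketch of that saturation policy with asyncio. The 6-slot cap and 120-minute TTL mirror the numbers above; everything else is invented for illustration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import asyncio

MAX_CONCURRENT = 6
TTL_SECONDS = 120 * 60
slots = asyncio.Semaphore(MAX_CONCURRENT)

async def do_work(task):          # stand-in for a real ACP coding session
    await asyncio.sleep(0.1)
    return f"done: {task}"

async def run_coding_task(task):
    if slots.locked():            # all 6 slots busy: wait in line, no stampede
        print(f"queued: {task}")
    async with slots:
        try:
            return await asyncio.wait_for(do_work(task), timeout=TTL_SECONDS)
        except asyncio.TimeoutError:
            return f"shed after TTL: {task}"   # shed load instead of hanging

print(asyncio.run(run_coding_task("refactor module X")))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;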

&lt;h2&gt;
  
  
  Design Lesson
&lt;/h2&gt;

&lt;p&gt;Don’t let analysis agents code directly. Early setup: coding + architect + PM roles. Result: almost no output, high overlap with Zoe + ACP, increased complexity. Cut them all. Zoe handles PM + architect.&lt;/p&gt;

&lt;p&gt;Complexity grows fast: pairwise coordination explodes (six specialists ≈ fifteen pairwise handoffs if everyone talks to everyone). Each new agent ≈ half a day debugging conflicts, resource competition, and rule compatibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Core Engineering Problems
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Problem 1: Context Is the Agent’s OS
&lt;/h2&gt;

&lt;h2&gt;
  
  
  The Problem: Entropy Always Increases
&lt;/h2&gt;

&lt;p&gt;Without constraints, agent systems deterministically collapse. Agents are processes without an OS: no memory management, no garbage collection, no OOM protection.&lt;/p&gt;

&lt;p&gt;Three incidents:&lt;/p&gt;

&lt;p&gt;P0–8 Hours Paralysis&lt;/p&gt;

&lt;p&gt;AINews session: 235K tokens. Gateway compaction → timeout → crash → macOS launchd ThrottleInterval=1 infinite loop. All agents offline.&lt;/p&gt;

&lt;p&gt;Fix: Clean session → ThrottleInterval 1→10 → idleMinutes 180→30 → execution policy tightened from permissive to allowlist (smaller blast radius; keep the list maintained). Each fix closed one of four previously missing defense lines.&lt;/p&gt;

&lt;p&gt;P1–3500 Chars → 800 Chars&lt;/p&gt;

&lt;p&gt;Trading’s flash report carried data tables. When the message exceeded textChunkLimit, OpenClaw auto-compacted it, and the tables were “intelligently compressed” away. AI “help” is a disaster in data-dense scenarios.&lt;/p&gt;

&lt;p&gt;P2 — Rules Ignored After Bloat&lt;/p&gt;

&lt;p&gt;Sessions bloat to 10K+ tokens → agents “selectively comply.” Butler doing investment analysis. Trading ignoring validation. Critical info drowned in noise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution: Dual-Layer Control
&lt;/h2&gt;

&lt;p&gt;Layer 1: Context Engineering (information architecture)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;SOUL.md (front): Identity + hard constraints + decision framework (40–60 lines)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AGENTS.md (after): Operating norms + collaboration protocols&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Skills: Via extraDirs on-demand (Trading: 15 Skills, 68K lines on disk—retrieve or inject only the 1–3 relevant fragments per turn, not the whole tree)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;shared-context/: Cross-agent state, read via tools&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Obsidian: Cold storage, archives output, no inference&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rule wording targets the weakest model in the fallback chain (GPT-4.1 → Qwen3.5 → Ollama qwen3:8b):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;"Suggest not fabricating" → qwen3:8b ignores&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"MUST: do not fabricate" → all comply&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"MUST + P0 + NON-NEGOTIABLE" → even weak models comply&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Write for weakest link.&lt;/p&gt;

&lt;p&gt;Layer 2: Harness (framework lifecycle management)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvr60toxydafcl0qavtut.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvr60toxydafcl0qavtut.png" alt=" " width="720" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Without Harness → 235K tokens → crash. Without Context Engineering → everything piled into one context → rules drowned.&lt;/p&gt;

&lt;p&gt;Representative openclaw.json excerpt (field names drift by release—validate against your OpenClaw version before paste-deploying):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "compaction": {
    "mode": "safeguard",
    "memoryFlush": {
      "enabled": true,
      "softThresholdTokens": 40000,
      "prompt": "Distill to memory/YYYY-MM-DD.md. Focus: decisions, state changes, lessons."
    }
  },
  "contextPruning": { "mode": "cache-ttl", "ttl": "6h", "keepLastAssistants": 3 },
  "session": {
    "reset": { "mode": "daily", "atHour": 5, "idleMinutes": 30 },
    "maintenance": { "pruneAfter": "7d", "maxDiskBytes": 104857600 }
  },
  "hooks": { "bootstrap": ["self-improving-agent"] }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cross-session recovery:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;New session → SOUL.md + AGENTS.md + MEMORY.md + .learnings/ → memorySearch → shared-context/
= "Knows who, what done, what team doing"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Problem 2: Let Agents Remember and Grow
&lt;/h2&gt;

&lt;h2&gt;
  
  
  The Problem: Repeating Mistakes
&lt;/h2&gt;

&lt;p&gt;Trading got BILLBOARD_BUY_AMT wrong 5 times (wrote BUY_AMT). Session reset → lost memory → repeat. User corrects → agent changes → 3 days later same scenario → same error.&lt;/p&gt;

&lt;p&gt;Chatbot vs Agent dividing line: Agents learn from mistakes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution: Five-Layer Memory
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd6djqvr779qzn20zcwe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd6djqvr779qzn20zcwe.png" alt=" " width="720" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Autonomous Memory: 6-Step Cycle
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Trigger: Operation failed · User corrected · Better approach found&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;L4 Recording: Write to .learnings/ERRORS.md or LEARNINGS.md&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Daily Reflection (22:00): Review .learnings/, Zoe aggregates cross-agent value&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;PROMOTE: 3+ verifications → MEMORY.md, single → keep observing (sketched after this list)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;L2 Sedimentation: Weekly compression, &amp;lt;3000 tokens&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;L5 Skill: Generalizable → write as Skills → ClawHub&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
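
&lt;p&gt;The PROMOTE step reduces to a simple rule. A sketch, with the entry data invented for illustration (the BILLBOARD example echoes the incident above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PROMOTE_THRESHOLD = 3   # verified this many times, then graduate to MEMORY.md

def review_learnings(entries):
    for e in entries:
        if e["verifications"] &amp;gt;= PROMOTE_THRESHOLD:
            with open("MEMORY.md", "a") as f:
                f.write(f"- {e['lesson']} (promoted, {e['verifications']}x)\n")
        # A single observation stays in .learnings/ under watch.

review_learnings([
    {"lesson": "field is BILLBOARD_BUY_AMT, not BUY_AMT", "verifications": 5},
    {"lesson": "arXiv feed lags on weekends", "verifications": 1},
])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;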

&lt;p&gt;This is the core mechanism. Without it: chatbot. With it: agent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chatbot vs Agent
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7opyxb09uixmz96og7ir.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7opyxb09uixmz96og7ir.png" alt=" " width="720" height="304"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem 3: Let Agents Collaborate
&lt;/h2&gt;

&lt;h2&gt;
  
  
  The Problem: Multi-Agent Communication
&lt;/h2&gt;

&lt;p&gt;Initial issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Status sync failures: A finished, B didn’t know&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Resource contention: Multiple agents write same file&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Information silos: Macro produced, Trading never saw&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Responsibility gaps: “Who’s handling this?” → all silent&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Solution: Three-State Protocol + Event Bus
&lt;/h2&gt;

&lt;p&gt;Protocol (three active states + default silent state):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;request → confirmed → final → [silent]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;request: Explicitly acknowledges, starts the loop&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;confirmed: In progress, sends intermediate updates&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;final: Complete, result delivered, loop closes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;[silent]: Default state when no active task — “no news is good news” (prevents spam)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Event Bus:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "type": "MARKET_CLOSE",
  "source": "TRADING",
  "timestamp": "2026-03-07T15:00:00-08:00",
  "payload": { "symbol": "SPY", "note": "schema omitted for brevity" },
  "requiresAck": false
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Shared Context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;tech-radar.json — Read-only except authorized writers&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;market-status.json — Trading updates, Macro/Content consume&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Guardrails:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;No ad-hoc cross-agent file writes; mediated writes only (tools, bus, approved writers)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;All communication via event bus or shared-context; never park API keys or session tokens in shared JSON — use your platform’s secret store&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Zoe has final arbitration&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Results: 4 Weeks, 23 Auto-Recoveries
&lt;/h2&gt;

&lt;p&gt;Timeline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Week 1: Basic setup, single agent, frequent crashes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Week 2: Multi-agent coordination, protocols established&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Week 3: Autonomous evolution, agents self-fixing&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Week 4: Production-ready, 60 cron tasks smooth&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Key Metrics:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwc42toorcvouyfd3f4h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwc42toorcvouyfd3f4h.png" alt=" " width="720" height="258"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;h2&gt;
  
  
  1. Agents Need an OS, Not Just Prompts
&lt;/h2&gt;

&lt;p&gt;Context Engineering is OS design. You need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Memory management (compaction, pruning, reset)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Process isolation (separate workspaces)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;IPC mechanisms (event bus, shared context)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Garbage collection (session cleanup, disk limits)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  2. Memory = Chatbot vs Agent
&lt;/h2&gt;

&lt;p&gt;Can’t remember yesterday’s mistakes = fancy chatbot. Five-layer memory transforms stateless LLM calls into stateful, learning entities.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Constraints Enable Creativity
&lt;/h2&gt;

&lt;p&gt;Clear boundaries = more creative, not less. 40% AI quota, three-state protocol, hard “MUST” rules — these are guardrails for autonomous operation.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Multi-Model Fallback Is Production Necessity
&lt;/h2&gt;

&lt;p&gt;GPT-4.1 → Qwen3.5 → Ollama qwen3:8b. Write rules for weakest link.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Human: Doer → Designer
&lt;/h2&gt;

&lt;p&gt;My job: design system where code writes itself. I’m architect, not bricklayer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Looking Ahead
&lt;/h2&gt;

&lt;p&gt;Next steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;P0: Dead-letter queue for failed events&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;P1: Manual resend CLI for stuck tasks&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;P1: Audit log rotation&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;P2: Visual dashboard for system health&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Goal: Amplify human capability. One person + six agents &amp;gt; one person + zero agents. That’s Harness Engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Reference
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Agent Roster
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovxh03h19ce0v4jys569.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovxh03h19ce0v4jys569.png" alt=" " width="720" height="255"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Total: 60 cron, ~90 Skills&lt;/p&gt;

&lt;h2&gt;
  
  
  Daily Schedule (Pacific Time, America/Los_Angeles)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7w2jf3kwqk72omiratta.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7w2jf3kwqk72omiratta.png" alt=" " width="720" height="396"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Cron rows are snapshots from my stack — align to your exchange calendar, asset class (equities vs. crypto), and whether you are on PST or PDT.&lt;/p&gt;

&lt;h2&gt;
  
  
  Critical Files
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkebsc6mqs01j5d1lccrj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkebsc6mqs01j5d1lccrj.png" alt=" " width="720" height="292"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you run a similar harness, how do you handle failures when compaction, cron, and multi-agent handoffs all interact — what breaks first in your stack, and what fixed it?&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;OpenClaw: &lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;https://github.com/openclaw/openclaw&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;PowerMem: &lt;a href="https://github.com/oceanbase/powermem" rel="noopener noreferrer"&gt;https://github.com/oceanbase/powermem&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;ClawHub: &lt;a href="https://github.com/openclaw/clawhub" rel="noopener noreferrer"&gt;https://github.com/openclaw/clawhub&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mitchell Hashimoto: “My AI Adoption Journey” — &lt;a href="https://mitchellh.com/writing/my-ai-adoption-journey" rel="noopener noreferrer"&gt;https://mitchellh.com/writing/my-ai-adoption-journey&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Anthropic: “Effective harnesses for long-running agents” — &lt;a href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents" rel="noopener noreferrer"&gt;https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;All times Pacific Time (America/Los_Angeles; PST or PDT depending on season). macOS + OpenClaw. Monitoring: Feb–Mar 2026. Validate config against your OpenClaw release at &lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;https://github.com/openclaw/openclaw&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>openclaw</category>
      <category>llm</category>
    </item>
    <item>
      <title>Harness Engineering: From AI-Assisted to AI-Driven, What Is Software Engineering Undergoing?</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Mon, 27 Apr 2026 12:22:38 +0000</pubDate>
      <link>https://forem.com/seekdb/harness-engineering-from-ai-assisted-to-ai-driven-what-is-software-engineering-undergoing-l6n</link>
      <guid>https://forem.com/seekdb/harness-engineering-from-ai-assisted-to-ai-driven-what-is-software-engineering-undergoing-l6n</guid>
      <description>&lt;p&gt;&lt;em&gt;What happens when a team ships a million lines of code without writing a single line by hand?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83198r8ttol9y4eb5re6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83198r8ttol9y4eb5re6.png" alt=" " width="720" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Editor’s Note: This article is based on verified engineering blogs from OpenAI, Anthropic, and Mitchell Hashimoto (2025–2026). All references are linked at the end.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In February 2026, OpenAI engineers revealed they had shipped a million-line codebase in five months — with zero manually-written code. All of it was generated by Codex agents.&lt;/p&gt;

&lt;p&gt;When I first read about this experiment, my feelings were complicated. Three engineers, five months, one million lines of code — all written by AI. As someone who’s written code for years, I had to ask: If AI can write code by itself, what’s left for us?&lt;/p&gt;

&lt;p&gt;It wasn’t until I understood Harness Engineering that I found the answer:&lt;/p&gt;

&lt;p&gt;AI is not replacing engineers; it’s forcing us to level up — from “craftsmen” to “system designers.”&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Harness Engineering?
&lt;/h2&gt;

&lt;p&gt;“Harness” originally means horse tack — equipment that both constrains a horse and enables it to pull a cart. You can’t let the horse run wild, but you can’t tie up its legs either. You give it direction.&lt;/p&gt;

&lt;p&gt;Mitchell Hashimoto (HashiCorp co-founder) applied this to AI engineering in early 2026:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I don’t know if there is a broad industry-accepted term for this yet, but I’ve grown to calling this ‘harness engineering.’ It is the idea that anytime you find an agent makes a mistake, you take the time to engineer a solution such that the agent never makes that mistake again.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Traditional debugging: “Something broke, fix it.”&lt;br&gt;
Harness Engineering: “Something broke, build a system to prevent it from happening again.”&lt;/p&gt;

&lt;p&gt;For example: If AI keeps calling an API incorrectly, don’t just remind it. Write code that requires API calls to pass type checking — encode human judgment into system constraints.&lt;/p&gt;
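
&lt;p&gt;A tiny Python example of encoding that judgment as a constraint: instead of reminding the agent, make the wrong call unrepresentable. All names here are invented:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from dataclasses import dataclass
from enum import Enum

class Region(Enum):                  # the agent can no longer invent "us-esat-1"
    US_EAST = "us-east-1"
    EU_WEST = "eu-west-1"

@dataclass(frozen=True)
class DeployRequest:
    service: str
    region: Region
    replicas: int

    def __post_init__(self):
        if not 1 &amp;lt;= self.replicas &amp;lt;= 20:
            raise ValueError("replicas must be between 1 and 20")

# Mistyped calls now fail at construction, before reaching any API.
req = DeployRequest(service="billing", region=Region.US_EAST, replicas=3)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;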

&lt;p&gt;The core of Harness: not predicting what mistakes AI will make, but designing an environment where mistakes are hard to make.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI’s Experiment: Zero Human-Written Code
&lt;/h2&gt;

&lt;p&gt;In August 2025, a three-person team at OpenAI set a rule: No writing any code — all left to Codex.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge
&lt;/h2&gt;

&lt;p&gt;Early progress was slower than expected. Not because AI was incapable, but because the environment wasn’t ready. The team spent time not on “using AI,” but on “making AI usable.”&lt;/p&gt;

&lt;p&gt;Key insight:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When AI fails, don’t just “let it try again.” Ask: “What capability is missing? How can we make this executable and verifiable?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Three Core Designs
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Progressive Context Management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI’s context window is limited. The team used structured files like claude-progress.txt to pass state between sessions, like a relay-race baton: each AI reads what the previous “shift colleague” accomplished.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Feature List: Letting AI Know “It’s Not Done”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI has a bad habit: thinking it’s done after half the work. The Initializer Agent breaks requirements into 200+ small features, each marked “incomplete.” Coding Agents take one at a time, only changing status after completion.&lt;/p&gt;
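
&lt;p&gt;In sketch form (the file name, field names, and statuses are illustrative, not the team’s actual format):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json

def next_feature(path="feature_list.json"):
    """Hand the coding agent exactly one incomplete feature."""
    with open(path) as f:
        features = json.load(f)
    for feat in features:
        if feat["status"] == "incomplete":
            return feat
    return None   # nothing left: the project is actually done

def mark_done(feature_id, path="feature_list.json"):
    with open(path) as f:
        features = json.load(f)
    for feat in features:
        if feat["id"] == feature_id:
            feat["status"] = "complete"   # only after verification passes
    with open(path, "w") as f:
        json.dump(features, f, indent=2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;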

&lt;ul&gt;
&lt;li&gt;Self-Check Mechanism&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They equipped AI with Puppeteer (browser automation), letting it interact with its own application — clicking, screenshotting, verifying functionality. Not checking whether the code is correct, but whether the thing actually works.&lt;/p&gt;
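
&lt;p&gt;The team used Puppeteer; the same self-check pattern looks roughly like this in Playwright’s Python API, with the URL and selectors invented:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# pip install playwright, then: playwright install chromium
from playwright.sync_api import sync_playwright

def self_check(url="http://localhost:3000"):
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        page.click("text=New Document")             # exercise the feature
        page.screenshot(path="evidence.png")        # record what it saw
        ok = page.locator(".editor").is_visible()   # does it actually work?
        browser.close()
        return ok
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;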

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;After five months:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;1,500 PRs, all AI-generated&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;3.5 PRs per engineer per day&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Single AI sessions running 6+ hours (often while humans slept)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the later stages, AI could complete features end-to-end:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Reproduce bug → Record video → Fix bug → Verify → Open PR → Respond to feedback → Merge&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is autonomous software engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic’s Deep Exploration: Generator + Evaluator
&lt;/h2&gt;

&lt;p&gt;OpenAI proved feasibility. Anthropic went further with harness design for long-running agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Challenge
&lt;/h2&gt;

&lt;p&gt;As Anthropic researcher Justin Young noted:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Imagine a software project staffed by engineers working in shifts, where each new engineer arrives with no memory of what happened on the previous shift.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Every time the context window fills and resets, AI gets amnesia. Anthropic’s solution: “structured handoff” — designing documentation so the next AI can quickly take over.&lt;/p&gt;

&lt;h2&gt;
  
  
  GAN-Inspired Architecture
&lt;/h2&gt;

&lt;p&gt;Prithvi Rajasekaran from Anthropic Labs designed a Generator + Evaluator dual-agent system, inspired by GANs (Generative Adversarial Networks).&lt;/p&gt;

&lt;p&gt;Key discovery: if AI evaluates its own work, it’s lenient; output that is clearly mediocre still gets a “pretty good.” Split evaluation out into an independent agent, and things change.&lt;/p&gt;

&lt;p&gt;Frontend design experiment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Generator: Produces design drafts&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Evaluator: Operates the page using Playwright, scoring on design quality, originality, craftsmanship, functionality&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One prompt addition made a difference: “The best designs should be museum quality.” The generator started producing surprisingly creative effects — including once restructuring a Dutch art museum’s website into a 3D spatial experience with walkable exhibition halls.&lt;/p&gt;

&lt;h2&gt;
  
  
  Full-Stack Application
&lt;/h2&gt;

&lt;p&gt;Rajasekaran expanded to a three-agent system:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fluur4mnbvg5d01ditvjr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fluur4mnbvg5d01ditvjr.png" alt=" " width="531" height="203"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Key: Contract — before each iteration, Generator and Evaluator negotiate “what counts as done.”&lt;/p&gt;

&lt;p&gt;Test: Build a “retro game maker.”&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvlw87mfwjyzb1h9po0e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvlw87mfwjyzb1h9po0e.png" alt=" " width="530" height="132"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The gap was not small.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Deeper Logic: Three Paradigm Shifts
&lt;/h2&gt;

&lt;h2&gt;
  
  
  1. From “Writing Code” to “Designing Systems”
&lt;/h2&gt;

&lt;p&gt;Traditional growth path: learn to write code → design modules → architect systems.&lt;/p&gt;

&lt;p&gt;Harness Engineering reverses this: You must first know how to design systems to use AI for writing code effectively.&lt;/p&gt;

&lt;p&gt;This is upgrading, not downgrading. From “craftsman” to “factory designer” — you’re not making less, but what you make has more leverage.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. “Constraints” Become Creation
&lt;/h2&gt;

&lt;p&gt;Adding constraints to AI doesn’t limit it — clear constraints make AI more creative.&lt;/p&gt;

&lt;p&gt;Anthropic’s “museum quality” standard made AI strive in that direction. OpenAI’s architecture constraints let AI write code boldly without breaking things.&lt;/p&gt;

&lt;p&gt;Like jazz chord progressions — having the framework lets musicians improvise confidently.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Quality Control Paradigm Shift
&lt;/h2&gt;

&lt;p&gt;From the OpenAI team:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Humans may review pull requests, but aren’t required to. Over time, we’ve pushed almost all review effort towards being handled agent-to-agent.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;AI is reviewing code written by AI. Not because humans got lazy, but because throughput is too high. When AI opens 3.5 PRs per engineer per day, humans can’t review them all.&lt;/p&gt;

&lt;p&gt;Quality control shifts from “checking every line” to “designing mechanisms for AI self-verification.”&lt;/p&gt;

&lt;p&gt;If AI can operate browsers, run tests, and verify functionality — why can’t it review code?&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Developers
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Don’t Panic, But Don’t Wait
&lt;/h2&gt;

&lt;p&gt;Harness Engineering doesn’t mean programmers become unemployed. Those three OpenAI engineers weren’t replaced — they changed their way of working from “people who write code” to “people who design systems and control quality.”&lt;/p&gt;

&lt;p&gt;But this transformation won’t wait until you’re ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  “Boring” Technologies Win
&lt;/h2&gt;

&lt;p&gt;The OpenAI team prefers “boring” technologies — stable APIs, comprehensive documentation, high composability. AI performs better in predictable environments.&lt;/p&gt;

&lt;p&gt;Future tech selection: React over Svelte (more documentation), Python over Rust (more training data). Not exciting, but practical.&lt;/p&gt;

&lt;h2&gt;
  
  
  Soft Skills Become More Valuable
&lt;/h2&gt;

&lt;p&gt;If AI writes code, what’s more valuable?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;System Design: Breaking down problems, defining interfaces&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Product Sense: Turning vague requirements into clear definitions&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Quality Judgment: Knowing “good” versus “just runs”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AI Management: Which tasks suit AI, which need humans&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These aren’t new skills, but their importance is changing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Future: Orchestrating Agents
&lt;/h2&gt;

&lt;p&gt;Industry trends through 2025–2026 show software engineering shifting from “writing code” to “orchestrating agents.”&lt;/p&gt;

&lt;p&gt;Harness Engineering provides a framework: not treating AI as a black box, but as a system component to be designed.&lt;/p&gt;

&lt;p&gt;You’re not “conversing” with AI; you’re collaborating — like conducting an orchestra, or managing a team.&lt;/p&gt;

&lt;p&gt;Future software engineers may be more like “directors” or “conductors”: not playing every note, but determining the direction and style of the entire piece.&lt;/p&gt;

&lt;p&gt;Is this good? I don’t know. But this is what’s happening.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;If AI can write code by itself, what can programmers do?&lt;/p&gt;

&lt;p&gt;We can design systems that let AI write code.&lt;/p&gt;

&lt;p&gt;This is not escape — this is evolution.&lt;/p&gt;

&lt;p&gt;Harness Engineering doesn’t make engineers lazy; it makes engineers stronger — using system power instead of individual power, design capability instead of coding capability.&lt;/p&gt;

&lt;p&gt;Perhaps this is what software engineering was always meant to be.&lt;/p&gt;

&lt;h2&gt;
  
  
  Your Turn
&lt;/h2&gt;

&lt;p&gt;Where is your AI pipeline breaking first?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Ingestion?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Retrieval?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Evaluation?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cost control?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Share your experience in the comments. The best harness designs come from real war stories, not textbook theory.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Enjoyed this? Follow for more on AI infrastructure, agentic coding, and the future of software engineering.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;OpenAI (Feb 2026). “Harness engineering: leveraging Codex in an agent-first world.” &lt;a href="https://openai.com/index/harness-engineering/" rel="noopener noreferrer"&gt;https://openai.com/index/harness-engineering/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Hashimoto, Mitchell (2026). “My AI Adoption Journey.” &lt;a href="https://mitchellh.com/writing/my-ai-adoption-journey" rel="noopener noreferrer"&gt;https://mitchellh.com/writing/my-ai-adoption-journey&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fowler, Martin (Feb 2026). “Harness Engineering — first thoughts.” &lt;a href="https://martinfowler.com/articles/exploring-gen-ai/harness-engineering-memo.html" rel="noopener noreferrer"&gt;https://martinfowler.com/articles/exploring-gen-ai/harness-engineering-memo.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Anthropic (Nov 2025). “Effective harnesses for long-running agents.” &lt;a href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents" rel="noopener noreferrer"&gt;https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Anthropic (2026). “Harness design for long-running application development.” &lt;a href="https://www.anthropic.com/engineering/harness-design-long-running-apps" rel="noopener noreferrer"&gt;https://www.anthropic.com/engineering/harness-design-long-running-apps&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>softwareengineering</category>
      <category>openai</category>
      <category>programming</category>
    </item>
    <item>
      <title>AI Products Break on the Data Layer — Not on the Next Model Release</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Mon, 27 Apr 2026 12:17:24 +0000</pubDate>
      <link>https://forem.com/seekdb/ai-products-break-on-the-data-layer-not-on-the-next-model-release-13g0</link>
      <guid>https://forem.com/seekdb/ai-products-break-on-the-data-layer-not-on-the-next-model-release-13g0</guid>
      <description>&lt;p&gt;&lt;em&gt;Harness Engineering, in plain terms: ingestion, retrieval, memory lifecycle, and hybrid search matter more than a bigger context window once real traffic hits.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feqoonug4l5ve5dtgtjbb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feqoonug4l5ve5dtgtjbb.png" alt=" " width="720" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your AI can sound fluent and still burn trust after launch. The failure mode is rarely “swap the model.” It is what you retrieve, what you remember, and what you never ingested correctly.&lt;/p&gt;

&lt;p&gt;Across deployments and community conversations, we keep seeing the same arc: traction first, then tickets — answers that read authoritative but trace to stale docs, policies updated on Monday still echoed on Friday, and churn that does not look like “model weakness” because the underlying model is still capable.&lt;/p&gt;

&lt;p&gt;We treat that pattern as a pipeline problem, not pure hallucination. Here, Harness Engineering means applying data-engineering discipline — schemas, lifecycle, retrieval, and cost/latency budgets — so production behavior is bounded by the data layer, not by prompt cleverness. If you are shipping AI in 2026, this is the layer to harden next; below, we walk through memory, RAG, and the database patterns teams use to get past the demo.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hallucination Problem
&lt;/h2&gt;

&lt;p&gt;Across the past two years of turning large models into AI products, hallucination has remained the core bottleneck. Many optimization approaches have emerged: prompt engineering, context engineering, and the recently highlighted Harness Engineering.&lt;/p&gt;

&lt;p&gt;Hallucinations in production often trace to gaps in the data processing pipeline, not only to model limits. Below, we clarify why the underlying data foundation matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Context Engineering Matters
&lt;/h2&gt;

&lt;p&gt;Context engineering has evolved through several stages: prompt engineering, RAG (Retrieval-Augmented Generation), and the emerging Memory mechanism.&lt;/p&gt;

&lt;p&gt;A model’s context window functions like a computer’s RAM — fast but limited. Even with expanded context windows (some can process entire novels), dumping raw data into context rarely produces reliable gains on its own.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Long Context Trap
&lt;/h2&gt;

&lt;p&gt;Theoretically, more information should mean better understanding. In reality, excessive context length triggers multiple issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Performance: More tokens = higher inference latency&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cost: Token usage grows linearly with context length&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Accuracy: Context length and model accuracy are negatively correlated — the “Lost in the Middle” effect&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkof8uelcdmzc7uahjli7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkof8uelcdmzc7uahjli7.png" alt="Source: Research from ChromaDB. Longer contexts lead to lower accuracy." width="720" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This explains why system prompts and user prompts should be placed at the beginning and end of the sequence — avoiding attention weakening toward middle information.&lt;/p&gt;
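&lt;p&gt;A concrete way to act on this, sketched in plain Python with no particular framework assumed: keep instructions at the head, restate the question at the tail, and let bulk retrieved context sit in the middle, where losses hurt least.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def build_prompt(system_rules: str, context_chunks: list, question: str) -&gt; str:
    """Place high-priority text at the edges of the sequence.
    Retrieved chunks go in the middle, where "Lost in the Middle"
    shows attention is weakest."""
    middle = "\n\n".join(context_chunks)
    return (
        f"{system_rules}\n\n"             # instructions first
        f"--- context ---\n{middle}\n\n"  # bulk material in the middle
        f"--- question ---\n{question}"   # restate the ask at the end
    )
&lt;/code&gt;&lt;/pre&gt;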

&lt;h2&gt;
  
  
  Solving AI “Amnesia” at the Data Level
&lt;/h2&gt;

&lt;p&gt;Many AI products have obvious memory deficits: today’s user interactions are forgotten by tomorrow, and switching sessions means the system no longer recognizes the user.&lt;/p&gt;

&lt;p&gt;PowerMem, an open-source AI memory component, addresses this at the data layer. In stress testing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Improved Accuracy: +48.77% (from 52.9% to 78.7%)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Higher Retrieval Efficiency: P95 latency significantly reduced&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reduced Costs: Up to 96.53% token cost savings&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;PowerMem simulates human memory mechanisms through three layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Access Layer: Python SDK, MCP Protocol, HTTP API, CLI (pmem), and Dashboard&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Core Layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hierarchical Memory (Working/Short-term/Long-term)&lt;/li&gt;
&lt;li&gt;Shared vs Private Memory isolation&lt;/li&gt;
&lt;li&gt;Intelligent filtering and conflict detection&lt;/li&gt;
&lt;li&gt;Ebbinghaus Forgetting Curve for lifecycle management&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Model &amp;amp; Storage Layers: Integration with GPT, Qwen, DeepSeek; optimized for OceanBase and seekdb&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffouw0qtlro1po8064zih.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffouw0qtlro1po8064zih.png" alt="PowerMem Architecture: layered memory with forgetting curve simulation." width="720" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For AI hallucination, what’s needed is not just memory storage, but an intelligent memory engine with “thinking” and “forgetting” capabilities.&lt;/p&gt;
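&lt;p&gt;The “forgetting” half is easy to picture with the Ebbinghaus curve itself. Below is a schematic sketch of the idea — retention decays exponentially and repeated access slows the decay — not PowerMem’s actual implementation:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import math

def retention(age_days: float, strength: float) -&gt; float:
    """Ebbinghaus curve: R = exp(-t / S). A higher strength S
    (earned through repeated access) means slower forgetting."""
    return math.exp(-age_days / strength)

def sweep(memories: list, now_days: float, keep_above: float = 0.2) -&gt; list:
    """Keep only memories whose retention is still above threshold;
    the rest are candidates for demotion or deletion."""
    return [m for m in memories
            if retention(now_days - m["born_days"], m["strength"]) &gt;= keep_above]
&lt;/code&gt;&lt;/pre&gt;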

&lt;h2&gt;
  
  
  Real-World Impact: OpenClaw Integration
&lt;/h2&gt;

&lt;p&gt;OpenClaw, a popular AI agent framework, defaults to local Markdown files and SQLite for memory. This works for individuals but fails at enterprise scale:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Uncontrolled token consumption as memory files grow&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;No centralized, structured management for collaboration&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By integrating PowerMem via &lt;code&gt;plugins.slots.memory = memory-powermem&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Accuracy: +49%&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Latency: -92%&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Token consumption: 18% of original&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This transforms memory from a “personal toy” to an “enterprise tool.”&lt;/p&gt;

&lt;h2&gt;
  
  
  RAG Done Right
&lt;/h2&gt;

&lt;p&gt;Memory alone doesn’t give AI wisdom. The key lies in Retrieval-Augmented Generation (RAG).&lt;/p&gt;

&lt;h2&gt;
  
  
  The RAG Dilemma
&lt;/h2&gt;

&lt;p&gt;Many teams face the “great demo, poor production” problem: months in, results remain unsatisfactory and hallucinations are frequent. The usual pain points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Insufficient Document Parsing: Can’t extract structured data from unstructured sources&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Poor Retrieval Accuracy: Can’t recall relevant information from massive datasets&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PowerRAG enhances both modules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Enhanced Parsing: SOTA models with title recognition, regex matching, intelligent chunking&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Enhanced Retrieval: Hybrid search (full-text + vector + scalar filtering)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Unified Storage: Metadata, documents, vectors in a single database&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Application Scenarios
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq85bbr2pc9vgirbo2p28.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq85bbr2pc9vgirbo2p28.png" alt=" " width="704" height="174"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The AI Database Imperative
&lt;/h2&gt;

&lt;p&gt;Beyond memory and RAG, the database directly determines AI application intelligence, performance, and cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hybrid Search Becomes Standard
&lt;/h2&gt;

&lt;p&gt;Consider this query:&lt;/p&gt;

&lt;p&gt;“Find coffee shops within 0.3 miles, with average price under $6, rating above 4.0 stars, and minimal wait time.”&lt;/p&gt;

&lt;p&gt;This contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Spatial: “Within 0.3 miles” → geographic indexing&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Scalar: “Under $6,” “above 4.0” → structured filters&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Vector: “Minimal wait” → semantic understanding&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traditional architectures need multiple systems. Integrated databases handle this in one SQL query.&lt;/p&gt;
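&lt;p&gt;In practice that means one statement instead of three systems. The sketch below is schematic: AI_EMBED appears in seekdb’s feature list later in this piece, but the table, columns, and distance functions here are assumptions — check your database’s docs for the exact dialect.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Schematic hybrid query; column names and vector syntax are guesses.
COFFEE_QUERY = """
SELECT name, price_usd, rating
FROM coffee_shops
WHERE ST_Distance_Sphere(location, ST_GeomFromText(%s)) &amp;lt;= 480 -- ~0.3 mi in meters
  AND price_usd &amp;lt; 6   -- scalar filter
  AND rating &gt; 4.0     -- scalar filter
ORDER BY l2_distance(profile_vec, AI_EMBED('minimal wait time')) -- semantic path
LIMIT 10;
"""
# Run it through any MySQL-compatible client, passing the user's
# location as the single %s parameter.
&lt;/code&gt;&lt;/pre&gt;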

&lt;h2&gt;
  
  
  Multi-Path Fusion Retrieval
&lt;/h2&gt;

&lt;p&gt;Single retrieval modes (vector-only or full-text-only) can’t meet complex needs. Multi-path fusion significantly improves recall:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj7vbc3yzeuz49c4r4626.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj7vbc3yzeuz49c4r4626.png" alt=" " width="577" height="254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv8lfymp1wtwqebprkakw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv8lfymp1wtwqebprkakw.png" alt="Source: internal benchmarks. 19% improvement over single mode." width="720" height="237"&gt;&lt;/a&gt;&lt;/p&gt;
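&lt;p&gt;One common fusion method is Reciprocal Rank Fusion (RRF), which also appears in the seekdb feature list below. A minimal version, assuming each retrieval path returns document IDs best-first:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def rrf_fuse(result_lists: list, k: int = 60) -&gt; list:
    """Reciprocal Rank Fusion: score(d) = sum over paths of 1/(k + rank).
    k = 60 is the conventional constant from the original RRF paper."""
    scores = {}
    for results in result_lists:   # e.g. [vector_hits, fulltext_hits]
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# rrf_fuse([["a", "b", "c"], ["b", "c", "d"]]) yields ["b", "c", "a", "d"]
&lt;/code&gt;&lt;/pre&gt;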

&lt;h2&gt;
  
  
  One-Stop Document Processing
&lt;/h2&gt;

&lt;p&gt;AI-era databases integrate AI Functions at the kernel level:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Built-in Models: Embedding, Rerank, Document AI deployed inside&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Simple Development: One SQL statement processes unstructured documents&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Full Automation: document → text → parsing → embedding → retrieval → reranking&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All AI processing logic is “pushed down” to the database layer, lowering the barrier to entry for developers.&lt;/p&gt;
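&lt;p&gt;A hedged sketch of what that push-down looks like. AI_EMBED and AI_RERANK are the function names this article attributes to seekdb; the signatures and table layout here are illustrative guesses, not verified syntax:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative push-down sketch: embed at write time, rerank at read time.
INGEST = """
INSERT INTO docs (id, body, body_vec)
VALUES (%s, %s, AI_EMBED(%s));
"""

RETRIEVE = """
SELECT id, body, AI_RERANK(%s, body) AS relevance -- rerank inside the kernel
FROM docs
ORDER BY l2_distance(body_vec, AI_EMBED(%s))      -- coarse vector recall first
LIMIT 20;
"""
# A real query would then sort the 20 candidates by relevance,
# e.g. in an outer query.
&lt;/code&gt;&lt;/pre&gt;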

&lt;h2&gt;
  
  
  Database Selection Guide
&lt;/h2&gt;

&lt;h2&gt;
  
  
  For Enterprise Projects
&lt;/h2&gt;

&lt;p&gt;Traditional “patchwork” architecture (Milvus + MySQL + Elasticsearch + Redis) brings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Complex deployment and monitoring&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Version compatibility nightmares&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Stacked operational costs&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A typical hybrid search requires 2+ database requests and transmits 910 records to present Top 10 results — inefficient and costly.&lt;/p&gt;

&lt;p&gt;Integrated architecture manages vector, document, KV, spatial, relational, and time-series uniformly — one database covers all scenarios.&lt;/p&gt;

&lt;h2&gt;
  
  
  For Startups
&lt;/h2&gt;

&lt;p&gt;Startups need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Reliability: Data foundation is the lifeline&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cost-Performance: Every penny counts&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;seekdb — a lightweight AI-native database built on OceanBase — offers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;AI-Native Functions: AI_EMBED, AI_COMPLETE, AI_RERANK built into SQL; DBMS_AI_SERVICE for LLM integration&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Hybrid Search: Vector + full-text + scalar in a single query with multi-path recall and advanced reranking (RRF, LLM-based)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Multi-Model Data: Relational tables, vectors, text, JSON, and GIS unified in one engine&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Lightweight Deployment: runs on 1 CPU core and 2 GB of RAM; embedded mode for prototyping, client/server mode for production&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Apache 2.0 Open Source: Fully open-source with smooth migration path to OceanBase for scale-up&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Layered Architecture
&lt;/h2&gt;

&lt;p&gt;A successful AI product design follows four layers:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzjw3yl20fmejuapnyc7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzjw3yl20fmejuapnyc7.png" alt=" " width="720" height="293"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpua0mf9ul13ny28ehiwy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpua0mf9ul13ny28ehiwy.png" alt="AI Product Architecture: Application → Memory → Knowledge → Data." width="720" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Logic
&lt;/h2&gt;

&lt;p&gt;AI engineering has evolved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;2022–2023: Prompt Engineering&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;2025: Context Engineering&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;2026: Harness Engineering&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Breaking through AI bottlenecks still relies on a solid data foundation:&lt;/p&gt;

&lt;p&gt;“Use the reliability of data engineering to drive AI products to operate more efficiently and stably from the data level.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to start
&lt;/h2&gt;

&lt;p&gt;If you are building in this problem space, these public entry points lead straight to the code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;PowerMem (open-source memory component): &lt;a href="https://github.com/oceanbase/powermem" rel="noopener noreferrer"&gt;https://github.com/oceanbase/powermem&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;seekdb (lightweight AI-native database for prototyping and small deployments): &lt;a href="https://github.com/oceanbase/seekdb" rel="noopener noreferrer"&gt;https://github.com/oceanbase/seekdb&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;If you have shipped memory or RAG in production, tell us where the pipeline broke first — ingestion, retrieval, evaluation, or cost. We welcome the discussion in the comments.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Context Engineering Research: &lt;a href="https://research.trychroma.com/context-rot" rel="noopener noreferrer"&gt;https://research.trychroma.com/context-rot&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Anthropic on Memory: &lt;a href="https://www.anthropic.com/research/memory" rel="noopener noreferrer"&gt;https://www.anthropic.com/research/memory&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>memory</category>
    </item>
    <item>
      <title>Nobody Picks Your AI Product Because of Your Spreadsheet</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Mon, 27 Apr 2026 12:08:27 +0000</pubDate>
      <link>https://forem.com/seekdb/nobody-picks-your-ai-product-because-of-your-spreadsheet-dhb</link>
      <guid>https://forem.com/seekdb/nobody-picks-your-ai-product-because-of-your-spreadsheet-dhb</guid>
      <description>&lt;p&gt;&lt;em&gt;The adoption curve still runs on copycats, not charts.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I assumed that once we shipped strong benchmarks, people would all line up to try it.&lt;/p&gt;

&lt;p&gt;They didn’t. Not most of them.&lt;/p&gt;

&lt;p&gt;And the more time I spend with real customers — not personas on a slide — the more convinced I am that we’re not living in a “data decides” era for the majority of buyers. We’re living in the same era we always were. The packaging just got shinier.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3f7b0s6918nk8urj1mh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3f7b0s6918nk8urj1mh.png" alt="Diffusion of Innovations Didn’t Die." width="720" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup: a book from 1962 still runs your roadmap
&lt;/h2&gt;

&lt;p&gt;There’s an old book — Diffusion of Innovations, first published in 1962 — that studied how new ideas spread in agriculture and a dozen other “unsexy” domains. The core idea is almost boring in how well it holds:&lt;/p&gt;

&lt;p&gt;Don’t spray a new thing on everyone. Seed it with a small group who actually use it, succeed with it, and become visible.&lt;/p&gt;

&lt;p&gt;People who look like them notice. Then imitation does the marketing for you.&lt;/p&gt;

&lt;p&gt;That’s not a metaphor. It’s the mechanism.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real issue: “data-driven” is a minority sport
&lt;/h2&gt;

&lt;p&gt;Like a lot of teams, we invested heavily in proof: benchmarks, reproducible scripts, comparison tables, the whole performance page aesthetic. That work matters — for someone.&lt;/p&gt;

&lt;p&gt;But here’s the pattern I keep seeing in the field:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Roughly nine in ten people, when they’re choosing something new, aren’t optimizing a spreadsheet first. They’re running a social proof script: Who that’s like me is already on this? Do they like it? If it’s working for them, I’ll try it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Maybe one in ten is the “try the weird thing because the architecture is interesting” crowd — the people who read the methodology footnotes, who reproduce the benchmark, who file the issue before they’ve paid.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That second group is tiny. They’re also the entire audience for your benchmark PDF.&lt;/p&gt;

&lt;p&gt;So we keep optimizing artifacts for 10% of the decision loop — and wondering why the other 90% still asks the same three questions on every sales call: Who uses this? At what scale? Would you use it yourself?&lt;/p&gt;

&lt;p&gt;Call it what it is: people learn from people. The rest is commentary.&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually works (for GTM and internal rollouts)
&lt;/h2&gt;

&lt;p&gt;I’m not arguing against rigor. I’m arguing for allocation.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Treat early adopters like infrastructure, not vanity metrics.&lt;br&gt;
They’re not “nice to have.” They’re the distribution channel. Find them deliberately — communities, power users, the engineers who already filed three feature requests — and over-invest in their success: access, support, fast fixes, public credit.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Make their wins legible to the 90%.&lt;br&gt;
Case studies aren’t marketing fluff if they answer the real question: someone like me got from A to B. Prefer named contexts (role, stack, constraint) over anonymous “10x faster” claims.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Stop expecting the median buyer to behave like a reviewer.&lt;br&gt;
Your benchmark suite is for trust-building with skeptics and internal discipline. Your growth loop is referenceability among peers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Inside the company, run the same play.&lt;br&gt;
When you introduce an AI tool or an internal “skill,” give the enthusiasts the lion’s share of pilot budget — time, credits, executive air cover — then broadcast what they shipped. Not as a mandate. As a story. The rest of the org will follow the same imitation math external users do.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What I’d do differently next time
&lt;/h2&gt;

&lt;p&gt;I’d still ship the charts. I’d just stop mistaking them for the main character in adoption.&lt;/p&gt;

&lt;p&gt;I’d start every launch plan with one blunt question: Who are our first ten visible users, and what does “win” look like for them — not for our narrative?&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Diffusion beats diffusion slides: adoption is still a social process dressed up in SaaS metrics.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;~90% of selection is peer imitation; ~10% is exploratory — and that 10% is who actually reads your benchmark appendix.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Your scarcest asset isn’t attention — it’s credible early users; resource them like a channel, not a lottery.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Internal AI rollouts follow the same law: over-invest in willing experimenters, celebrate outcomes loudly, let imitation do the rest.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;If you’re building in the AI infra space: I’m curious — what’s the one question prospects ask you before they ever open your perf doc? Drop it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>marketing</category>
      <category>product</category>
    </item>
    <item>
      <title>Your Documentation Has Two Audiences Now (And One Is an AI)</title>
      <dc:creator>Charles Wu</dc:creator>
      <pubDate>Mon, 27 Apr 2026 11:59:40 +0000</pubDate>
      <link>https://forem.com/seekdb/your-documentation-has-two-audiences-now-and-one-is-an-ai-3ce8</link>
      <guid>https://forem.com/seekdb/your-documentation-has-two-audiences-now-and-one-is-an-ai-3ce8</guid>
      <description>&lt;p&gt;&lt;em&gt;Technical documentation’s audience has changed. It’s no longer just engineers reading pages — increasingly, humans and AI work together: humans make decisions, AI finds materials, organizes steps, and assists execution. Here’s how to make the same knowledge serve both.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm92bxcae1z2q4gekosq2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm92bxcae1z2q4gekosq2.png" alt=" " width="720" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Last month, our team shipped a major documentation update. Two weeks later, I noticed something odd: our AI assistant was giving outdated answers. The docs were right. The AI was wrong.&lt;/p&gt;

&lt;p&gt;That’s when I realized — we were writing for humans, but forgetting the AI assistants helping them.&lt;/p&gt;

&lt;p&gt;This isn’t just about adding llms.txt to your repo. It’s about recognizing a fundamental shift: technical documentation now serves two audiences simultaneously. And if you optimize for only one, you’re already behind.&lt;/p&gt;

&lt;p&gt;Welcome to the era of Document “Skillification” — a term for making documentation AI-consumable through structured, callable capabilities. Not yet in the dictionary, but you’ll hear it more as AI Agents become mainstream.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters Now
&lt;/h2&gt;

&lt;p&gt;The audience for technical documentation has changed. In the past, it was primarily engineers reading pages. Today, more and more scenarios involve humans and AI working together: humans make decisions, while AI is responsible for finding materials, organizing steps, and assisting execution.&lt;/p&gt;

&lt;p&gt;If documentation exists only as web pages, AI has to parse the HTML and separate the main content from navigation, styling, and other noise. That extra step reduces accuracy and slows responses.&lt;/p&gt;

&lt;p&gt;Document Skillification, then, is not a new buzzword for its own sake — it solves a practical problem: how can the same body of knowledge serve both humans and AI?&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Document Skillification?
&lt;/h2&gt;

&lt;p&gt;This can be understood in two layers.&lt;/p&gt;

&lt;p&gt;The first layer is making documentation AI-consumable. Common entry points are llms.txt, llms-full.txt, and page-level .md exports. The focus is on enabling AI to reliably obtain structured content.&lt;/p&gt;

&lt;p&gt;The second layer is the Skill repository: SKILL.md files capture processes, constraints, and best practices as callable capabilities, answering “which rules apply in which scenarios.”&lt;/p&gt;

&lt;p&gt;These two layers work in tandem: the former provides content entry points, while the latter constrains usage patterns.&lt;/p&gt;
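&lt;p&gt;For the second layer, a minimal SKILL.md might look like this — following Anthropic’s Agent Skills convention of YAML frontmatter plus instructions; everything beyond the name and description fields is illustrative:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;---
name: sql-doc-style
description: Apply the team's SQL documentation standards when writing or reviewing SQL reference pages.
---

# SQL documentation writing standard

1. Every statement page opens with a one-line summary and a syntax block.
2. Examples must be copy-paste runnable against a clean schema.
3. State version differences explicitly; never write "recent versions".
&lt;/code&gt;&lt;/pre&gt;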

&lt;h2&gt;
  
  
  A Practical Implementation Roadmap
&lt;/h2&gt;

&lt;p&gt;Start by adding AI entry points — no need to refactor all documentation at once. Create llms.txt, fill in .md exports for key pages, and provide llms-full.txt as needed. This way, you first ensure content is readable, then gradually improve reading accuracy and usage stability.&lt;/p&gt;
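&lt;p&gt;For reference, llms.txt is plain Markdown with a fixed shape — an H1 title, a short blockquote summary, then H2 sections of links. Something like this (project name and paths are made up):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Acme DB

&gt; Acme DB is a relational database with built-in vector search.
&gt; This file lists the canonical docs for LLM consumption.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): install and first query
- [Hybrid search](https://example.com/docs/hybrid-search.md): vector plus full-text

## Optional

- [Release notes](https://example.com/docs/releases.md)
&lt;/code&gt;&lt;/pre&gt;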

&lt;p&gt;Then codify high-frequency actions into Skills. Examples include SQL documentation writing standards, operations troubleshooting flows, and upgrade checklists. This type of content has high reuse rates and is most prone to inconsistencies when communicated orally.&lt;/p&gt;

&lt;p&gt;For the retrieval layer, adopt “search first, then read.” Given large and rapidly updating documentation, do not dump the entire text into context at once. You can use MCP or a database query layer. The key is not the integration form, but on-demand retrieval.&lt;/p&gt;

&lt;p&gt;Drawing from the oceanbase-doc-skills implementation (&lt;a href="https://github.com/amber-moe/oceanbase-doc-skills" rel="noopener noreferrer"&gt;https://github.com/amber-moe/oceanbase-doc-skills&lt;/a&gt;), this path is already viable: skills, rules, examples tables store structured knowledge, and QueryService retrieves results by skill name, category, or keywords, then passes them to the upper layer for invocation.&lt;/p&gt;
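&lt;p&gt;A toy version of that retrieval layer: the skills table and the lookup-by-name-or-keyword behavior come from the repo’s description, while everything else — including using SQLite instead of the project’s actual database — is an assumption made to keep the sketch self-contained:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import sqlite3

# Toy stand-in for the QueryService described above; SQLite is used
# here only so the sketch runs with nothing but the standard library.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE skills (name TEXT, category TEXT, body TEXT)")
con.execute("INSERT INTO skills VALUES (?, ?, ?)",
            ("sql-doc-style", "docs", "How to write SQL reference pages..."))

def find_skills(keyword: str) -&gt; list:
    """Retrieve skills by name, category, or keyword, on demand."""
    like = f"%{keyword}%"
    return con.execute(
        "SELECT name, body FROM skills "
        "WHERE name LIKE ? OR category LIKE ? OR body LIKE ?",
        (like, like, like)).fetchall()

print(find_skills("sql"))  # the hits, not the whole corpus, go into context
&lt;/code&gt;&lt;/pre&gt;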

&lt;h2&gt;
  
  
  How Existing Projects Approach This
&lt;/h2&gt;

&lt;p&gt;Nuxt’s approach is to standardize entry points first. Official support for llms.txt capabilities is provided, and Nuxt Content offers an LLM integration module. Instead of manually maintaining two sets of documentation, this adds an AI consumption export on top of the existing content system.&lt;/p&gt;

&lt;p&gt;Docus leans toward “in-site capabilities”: it integrates an AI Assistant and MCP into the documentation site, so AI doesn’t need to preload the entire site but retrieves on demand. This reduces context pressure and gives more control over what gets retrieved.&lt;/p&gt;

&lt;p&gt;Vite’s path is straightforward: documentation pages and Markdown pages correspond one-to-one — guide maps to guide.md. Combined with llms.txt as a directory entry point, this forms a “locate first, then fetch by page” flow, with relatively low refactoring cost.&lt;/p&gt;

&lt;p&gt;Cloudflare’s approach is more comprehensive. Documentation entry points, Skills, and retrieval capabilities are connected together, basically forming a closed loop of content, rules, and invocation.&lt;/p&gt;

&lt;p&gt;Anthropic’s focus is on standardization itself. Through Agent Skills specifications and example repositories, it standardizes Skill metadata, trigger descriptions, and content organization methods, facilitating cross-tool reuse.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F62dejq9qus0k8pttezr8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F62dejq9qus0k8pttezr8.png" alt=" " width="720" height="243"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Direct Value This Direction Brings
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Reduced maintenance costs. One knowledge source can simultaneously serve web reading and AI invocation, reducing duplicate maintenance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reduced team memory burden. Operations statements, SQL specifications, and troubleshooting flows change from “relying on human memory” to “on-demand invocation.”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Better usability in offline scenarios. In intranet or isolated environments, llms-full.txt and local Skills can support local retrieval and assisted execution.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;More consistent output. Processes and formats are front-loaded into Skills, reducing variance between different people and different models.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Problems to Address Upfront
&lt;/h2&gt;

&lt;p&gt;This is not a one-time engineering effort. Specifications change, tools change, models change — Skills require continuous maintenance.&lt;/p&gt;

&lt;p&gt;If documentation volume is large, avoid doing everything at once. A safer approach is to pilot in high-value domains first, then gradually expand.&lt;/p&gt;

&lt;p&gt;Additionally, human-readable versions and AI consumption entry points need a synchronization mechanism. Without synchronization, you’ll eventually end up with two diverging documentation sets.&lt;/p&gt;

&lt;p&gt;Finally, Skill instructions must be executable and verifiable. Rules written too abstractly will spiral out of control during implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Document Skillification is not about changing how documentation is written — it’s about refactoring the knowledge delivery path: enabling the same content to support reading, retrieval, and execution simultaneously. For technical teams, this ultimately translates to lower communication costs, more stable delivery quality, and faster issue resolution cycles.&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s Next?
&lt;/h2&gt;

&lt;p&gt;→ Try it yourself: Start with llms.txt on your next documentation update.&lt;/p&gt;

&lt;p&gt;→ Building agent skills? Check out the Anthropic Skills specification (&lt;a href="https://github.com/anthropics/skills" rel="noopener noreferrer"&gt;https://github.com/anthropics/skills&lt;/a&gt;) for best practices.&lt;/p&gt;

&lt;p&gt;→ Got questions? Drop a comment below — I read every one.&lt;/p&gt;

&lt;p&gt;→ Want more? Follow for deeper dives into AI-native documentation patterns.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Is your documentation AI-ready? Share your approach in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>techwriting</category>
      <category>ai</category>
      <category>documentation</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
