<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Moshe Simantov</title>
    <description>The latest articles on Forem by Moshe Simantov (@moshe_io).</description>
    <link>https://forem.com/moshe_io</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3705784%2F13c477c7-0533-431c-b8bd-3bcca00bf0bf.jpg</url>
      <title>Forem: Moshe Simantov</title>
      <link>https://forem.com/moshe_io</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/moshe_io"/>
    <language>en</language>
    <item>
      <title>llms.txt Is Just a Table of Contents. Most AI Tools Stop There.</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Thu, 16 Apr 2026 21:34:24 +0000</pubDate>
      <link>https://forem.com/moshe_io/llmstxt-is-just-a-table-of-contents-most-ai-tools-stop-there-38be</link>
      <guid>https://forem.com/moshe_io/llmstxt-is-just-a-table-of-contents-most-ai-tools-stop-there-38be</guid>
      <description>&lt;p&gt;If you've spent any time in the AI tooling space recently, you've probably seen llms.txt popping up everywhere. React Aria, Anthropic, Svelte, Next.js, MUI — a growing list of projects now ship an &lt;code&gt;llms.txt&lt;/code&gt; file at their site root. The idea, &lt;a href="https://llmstxt.org/" rel="noopener noreferrer"&gt;proposed by Jeremy Howard&lt;/a&gt; and inspired by &lt;code&gt;robots.txt&lt;/code&gt; and &lt;code&gt;sitemap.xml&lt;/code&gt;, is simple: give AI tools a structured entry point to your documentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But there's a catch that most tools miss.&lt;/strong&gt; An &lt;code&gt;llms.txt&lt;/code&gt; file is a discovery index — a table of contents with section headers and links to the actual documentation pages. It is not the documentation itself. And most tools that claim llms.txt support stop at reading the index.&lt;/p&gt;

&lt;h2&gt;
  
  
  What llms.txt actually is (and isn't)
&lt;/h2&gt;

&lt;p&gt;An &lt;code&gt;llms.txt&lt;/code&gt; file is a single Markdown file at a site's root that lists documentation sections and links to detail pages. Think of it as a map — it tells you what exists and where to find it.&lt;/p&gt;

&lt;p&gt;The structure is straightforward: headings group topics, and each topic links to one or more documentation pages. Some sites also publish an &lt;code&gt;llms-full.txt&lt;/code&gt; that bundles everything inline — but most don't, because their documentation is too large to fit in a single file.&lt;/p&gt;
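Per the format described at llmstxt.org, a minimal index looks like this (the project name and links are illustrative):

```markdown
# Project Name

> One-line summary of what the project is.

## Docs
- [Quick start](https://example.com/docs/quick-start.md): Install and first steps
- [API reference](https://example.com/docs/api.md): Full API surface

## Optional
- [Changelog](https://example.com/changelog.md): Release history
```

An H1 title, a blockquote summary, then H2 sections whose link lists point at the detail pages — that list of links is exactly what a tool has to follow.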

&lt;p&gt;&lt;strong&gt;The important distinction:&lt;/strong&gt; &lt;code&gt;llms.txt&lt;/code&gt; is a discovery mechanism, not a documentation format. It points to docs. It doesn't contain them.&lt;/p&gt;

&lt;p&gt;This matters because of how tools use it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The "half the answer" problem
&lt;/h2&gt;

&lt;p&gt;Here's what happens with most llms.txt-aware tools today: they fetch the &lt;code&gt;llms.txt&lt;/code&gt; file, feed it to the model, and call it done. Your AI assistant gets a list of section titles and one-line descriptions — a menu, not a meal.&lt;/p&gt;

&lt;p&gt;Take a real example. A popular framework's &lt;code&gt;llms.txt&lt;/code&gt; is about 84 KB with 8 major sections. That sounds like a lot, but it's almost entirely links and brief descriptions. The actual documentation — the API signatures, code examples, migration guides, edge cases — lives behind those links. Without following them, your AI assistant is working with an outline.&lt;/p&gt;

&lt;p&gt;This creates a frustrating failure mode. The model &lt;em&gt;knows&lt;/em&gt; the API exists (it saw the link title), but it doesn't have the details. So it does what LLMs do — it fills in the gaps from training data. You get answers that sound right but reference deprecated patterns, wrong parameter names, or APIs from the wrong version.&lt;/p&gt;

&lt;p&gt;Cloud-based tools like GitMCP do read &lt;code&gt;llms.txt&lt;/code&gt;, but they still bounce queries through a remote service — adding latency, rate limits, and routing your codebase questions through someone else's infrastructure. The &lt;a href="https://dev.to/blog/2026-02-19/local-first-documentation-for-ai"&gt;local-first approach&lt;/a&gt; avoids all of that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The missing piece is simple: follow the links, fetch the real docs, and store them locally.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;context add &amp;lt;website&amp;gt;&lt;/code&gt; — the new path
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dev.to/context"&gt;@neuledge/context&lt;/a&gt; now supports adding documentation directly from any website that publishes an &lt;code&gt;llms.txt&lt;/code&gt; file. No git repo needed, no manual &lt;code&gt;.db&lt;/code&gt; file construction — just point it at a URL.&lt;/p&gt;

&lt;p&gt;Three usage patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bare domain&lt;/strong&gt; — auto-discovers &lt;code&gt;llms-full.txt&lt;/code&gt;, then falls back to &lt;code&gt;llms.txt&lt;/code&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  context add https://react-aria.adobe.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Direct file URL&lt;/strong&gt; — skips discovery, uses the specified file:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  context add https://mui.com/material-ui/llms.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Custom package name&lt;/strong&gt; — overrides the default hostname-based name:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  context add https://react-aria.adobe.com &lt;span class="nt"&gt;--name&lt;/span&gt; react-aria
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under the hood, any HTTPS URL that isn't a &lt;code&gt;.db&lt;/code&gt; file or a git host is treated as a &lt;code&gt;website&lt;/code&gt; source. Context tries &lt;code&gt;llms-full.txt&lt;/code&gt; first (the complete bundle), then &lt;code&gt;llms.txt&lt;/code&gt; (the index). If it finds the full version, you get everything in one fetch. If it finds the index, it does something most tools skip — it follows every link.&lt;/p&gt;

&lt;h2&gt;
  
  
  Following the links (why the index isn't enough)
&lt;/h2&gt;

&lt;p&gt;When Context detects an &lt;code&gt;llms.txt&lt;/code&gt; index (as opposed to &lt;code&gt;llms-full.txt&lt;/code&gt;), it doesn't stop at the table of contents. It parses the Markdown links grouped by section header, then fetches each linked document concurrently.&lt;/p&gt;
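The parsing-and-filtering step can be sketched like this — extract the Markdown links, keep only same-origin ones, cap the total (the function, regex, and cap are illustrative, not the actual implementation):

```python
import re
from urllib.parse import urlparse

# Markdown inline links: [title](https://...)
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")

def extract_links(llms_txt: str, base_url: str, max_links: int = 500) -> list[str]:
    """Return same-origin doc URLs from an llms.txt index, deduped, in order."""
    origin = urlparse(base_url).netloc
    links: list[str] = []
    for _title, url in LINK_RE.findall(llms_txt):
        if urlparse(url).netloc != origin:
            continue  # same-origin only: skip external references
        if url not in links:
            links.append(url)
        if len(links) >= max_links:
            break
    return links

index = """# Example Docs

## Components
- [Button](https://example.com/docs/button): Accessible button
- [Dialog](https://example.com/docs/dialog): Modal dialog

## Elsewhere
- [Blog post](https://other-site.com/post): External link, skipped
"""

print(extract_links(index, "https://example.com"))
# ['https://example.com/docs/button', 'https://example.com/docs/dialog']
```

The fetching itself would then run over this list with a bounded worker pool, catching per-link failures so one 404 doesn't abort the build.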

&lt;p&gt;The defaults are practical:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Concurrency:&lt;/strong&gt; 5 parallel fetches&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout:&lt;/strong&gt; 30 seconds per link&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Max links:&lt;/strong&gt; 500 (covers even massive documentation sites)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Same-origin only:&lt;/strong&gt; links to external sites are skipped — you asked for React Aria docs, not the random blog posts those docs happen to link to&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-link failure tolerance:&lt;/strong&gt; one 404 doesn't kill the whole build&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fetched documents get consolidated with the index and passed through the same package builder that handles git repos — deduplication, semantic chunking, FTS5 indexing into a portable SQLite &lt;code&gt;.db&lt;/code&gt; file.&lt;/p&gt;
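The end result behaves like any FTS5 database. A toy illustration of the kind of query this enables (the schema and rows are hypothetical, not the actual `.db` layout):

```python
import sqlite3

# In-memory FTS5 table standing in for a built documentation package.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [
        ("useCalendar", "Provides the behavior for a calendar component."),
        ("useButton", "Provides the behavior for a button component."),
    ],
)
# Full-text match, best results first.
rows = conn.execute(
    "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank", ("calendar",)
).fetchall()
print(rows)  # [('useCalendar',)]
```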

&lt;p&gt;&lt;strong&gt;Before (index only):&lt;/strong&gt; 8 sections, 84 KB — a table of contents.&lt;br&gt;
&lt;strong&gt;After (links followed):&lt;/strong&gt; hundreds of pages of actual documentation, deduped and indexed into a searchable local database.&lt;/p&gt;

&lt;p&gt;Same-origin filtering matters for both signal and security: the resulting package contains only the documentation you asked for, never arbitrary third-party content those docs happen to reference.&lt;/p&gt;
&lt;h2&gt;
  
  
  A real example, end to end
&lt;/h2&gt;

&lt;p&gt;Let's walk through adding React Aria's documentation. Their site publishes an &lt;code&gt;llms.txt&lt;/code&gt; at the root.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;context add https://react-aria.adobe.com &lt;span class="nt"&gt;--name&lt;/span&gt; react-aria
&lt;span class="go"&gt;Fetching https://react-aria.adobe.com/llms-full.txt... not found
Fetching https://react-aria.adobe.com/llms.txt... found
Detected llms.txt index with 147 linked documents
Fetching linked documents...
Fetched 139/147 documents (8 failed)
Building package "react-aria"...
Package built: .context/react-aria.db (139 documents)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Eight links returned 404s — probably outdated references in the &lt;code&gt;llms.txt&lt;/code&gt;. That's fine. The 139 that succeeded contain the actual component APIs, hooks documentation, styling guides, and accessibility patterns.&lt;/p&gt;

&lt;p&gt;Now wire it into your MCP client. If you're using Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add context &lt;span class="nt"&gt;--&lt;/span&gt; npx @neuledge/context mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Cursor or VS Code, add to your settings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"@neuledge/context"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now ask your AI assistant something specific — not "what is React Aria?" but something that requires the real docs: "How do I implement a custom calendar with React Aria's useCalendar hook, including locale support and disabled date ranges?"&lt;/p&gt;

&lt;p&gt;Without the package, your assistant would cobble together an answer from training data — probably mixing up hook names or missing the &lt;code&gt;createCalendar&lt;/code&gt; dependency. With the indexed docs, it searches the actual React Aria reference for &lt;code&gt;useCalendar&lt;/code&gt;, finds the parameters, the locale configuration, and the &lt;code&gt;isDateUnavailable&lt;/code&gt; callback. Grounded answers instead of educated guesses.&lt;/p&gt;

&lt;p&gt;You can inspect what got indexed with &lt;code&gt;context browse react-aria&lt;/code&gt; to see the full list of documents in the package.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this fits in the bigger picture
&lt;/h2&gt;

&lt;p&gt;There are now three ways to get documentation into @neuledge/context:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Community registry&lt;/strong&gt; — &lt;code&gt;context install npm/react&lt;/code&gt; — &lt;a href="https://dev.to/blog/2026-03-07/community-registry-pre-built-mcp-documentation-packages"&gt;116+ pre-built packages&lt;/a&gt; ready to download.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Any llms.txt site&lt;/strong&gt; — &lt;code&gt;context add https://...&lt;/code&gt; — the capability covered in this article.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Any git repo with docs&lt;/strong&gt; — &lt;code&gt;context add https://github.com/...&lt;/code&gt; — the original path, with &lt;a href="https://dev.to/blog/2026-04-01/beyond-markdown-multi-format-documentation-for-ai"&gt;multi-format support&lt;/a&gt; for Markdown, reStructuredText, and AsciiDoc.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The llms.txt path closes an important gap. The registry covers popular libraries, but it can't cover everything. If a library publishes an &lt;code&gt;llms.txt&lt;/code&gt; — and &lt;a href="https://llmstxt.org/" rel="noopener noreferrer"&gt;the list keeps growing&lt;/a&gt; — you can grab its docs even if nobody has added it to the registry yet.&lt;/p&gt;

&lt;p&gt;For library authors, this creates a clear path: publish an &lt;code&gt;llms.txt&lt;/code&gt;, and your users can instantly index your documentation into their AI tooling. No PR to any registry required. Just ship the file and let the tools follow the links.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Find a library you use that ships an &lt;code&gt;llms.txt&lt;/code&gt;. Run &lt;code&gt;context add &amp;lt;url&amp;gt;&lt;/code&gt;. Then ask your AI assistant the hardest question about that library — the one it usually gets wrong.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @neuledge/context add https://docs.anthropic.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://dev.to/context"&gt;Product page&lt;/a&gt; — features, architecture, and how Context works&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/docs"&gt;Documentation&lt;/a&gt; — quick start and editor configuration&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/blog/2026-03-07/community-registry-pre-built-mcp-documentation-packages"&gt;Community registry&lt;/a&gt; — 116+ pre-built packages&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://llmstxt.org/" rel="noopener noreferrer"&gt;llmstxt.org&lt;/a&gt; — the llms.txt specification&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>documentation</category>
      <category>llm</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Your AI Coding Assistant Isn't Stupid — It's Starving for Context</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Tue, 07 Apr 2026 03:17:52 +0000</pubDate>
      <link>https://forem.com/moshe_io/your-ai-coding-assistant-isnt-stupid-its-starving-for-context-5ben</link>
      <guid>https://forem.com/moshe_io/your-ai-coding-assistant-isnt-stupid-its-starving-for-context-5ben</guid>
      <description>&lt;p&gt;Every few months, a new model drops and developers upgrade their AI coding assistant expecting the hallucinations to finally stop. GPT-4 to GPT-5 to GPT-5.4. Claude 3.5 to 4 to Opus 4.6. Gemini 2 to 3 to 3.1. The benchmarks go up. The confident-but-wrong suggestions keep coming.&lt;/p&gt;

&lt;p&gt;At some point you have to ask: if the model keeps getting smarter and the output keeps being wrong in the same ways, maybe the model was never the problem.&lt;/p&gt;

&lt;p&gt;It isn't. &lt;strong&gt;The bottleneck in AI coding accuracy is context, not capability&lt;/strong&gt; — and upgrading the model is the least effective lever you have.&lt;/p&gt;

&lt;h2&gt;
  
  
  The model upgrade treadmill
&lt;/h2&gt;

&lt;p&gt;Here's the loop most teams are stuck in. The assistant suggests a deprecated API. You blame the model. A new model ships. You upgrade. The assistant suggests a &lt;em&gt;different&lt;/em&gt; deprecated API. You blame the model again.&lt;/p&gt;

&lt;p&gt;Look at what actually causes these failures in practice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Wrong API signatures.&lt;/strong&gt; Your assistant calls &lt;code&gt;fetch(url, { json: true })&lt;/code&gt; because it learned a pattern from 2021 Node.js libraries. The current &lt;code&gt;fetch&lt;/code&gt; doesn't take that option. The model can reason fine — it just learned an obsolete fact and has no way to know it's obsolete.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deprecated method suggestions.&lt;/strong&gt; It reaches for &lt;code&gt;componentWillMount&lt;/code&gt; or &lt;code&gt;useEffect&lt;/code&gt; patterns from React 16. The model isn't broken. The training data is just a blur of every React version ever written.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version-mismatched code.&lt;/strong&gt; You're on Next.js 15, the assistant writes Next.js 13 patterns because that's where most of its training data lives. &lt;a href="https://dev.to/blog/2026-02-18/version-specific-documentation-ai-coding-assistants"&gt;Every major version is blended together&lt;/a&gt; with no version labels.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are reasoning failures. A human given the same inputs would make the same mistakes. These are &lt;strong&gt;context failures&lt;/strong&gt; — the model is answering the question it was asked with the information it was given, and that information is wrong.&lt;/p&gt;

&lt;p&gt;A smarter model won't fix any of this. It'll just be wrong more confidently.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the research actually says
&lt;/h2&gt;

&lt;p&gt;This isn't a hunch. The research community has been converging on the same conclusion for about a year now.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ETH Zurich's study on AGENTS.md files&lt;/strong&gt; showed that structured, project-specific context files dramatically improved the accuracy of AI coding output — using the same underlying models. The delta came entirely from what was in the context window, not from which model read it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The New Stack published "Context Is AI Coding's Real Bottleneck in 2026"&lt;/strong&gt; documenting the same pattern across multiple tools and vendors. The industry is quietly realizing that "upgrade the model" is diminishing returns and "upgrade the context" is where the wins are hiding.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hallucination rate gaps tell the story.&lt;/strong&gt; Leading models hit 0.7–0.9% hallucination rates on well-grounded tasks. The industry average hovers around 9.2%. The gap between "best" and "average" isn't model capability — it's how well the context is curated.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Put another way: if context quality were held constant, most of the gap between GPT-5 and GPT-5.4 — or between Claude 4 and Opus 4.6 — would disappear. The gains developers attribute to new models are largely gains from better default prompts, better retrieval, and better system instructions that ship alongside them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three context failures
&lt;/h2&gt;

&lt;p&gt;When an AI coding assistant gives you wrong code, the root cause is almost always one of three context problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Missing context.&lt;/strong&gt; The model was never shown the library's docs at all. It's guessing from pattern similarity with other libraries. Confident, plausible, and wrong — because it's literally making the API up by analogy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stale context.&lt;/strong&gt; The model was trained on v3 of a library, you're on v6, and nobody told it. It knows &lt;em&gt;an&lt;/em&gt; API; it just knows the wrong one. This is the most common failure mode for anything that ships faster than model training cycles (which is most things).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Noisy context.&lt;/strong&gt; The model has too much information, not too little. You dumped 200 KB of docs into the context window and the signal for your specific question drowned. The relevant paragraph was there — buried under everything else.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's the uncomfortable part: &lt;strong&gt;all three of these get worse, not better, as context windows grow.&lt;/strong&gt; A million-token context window doesn't fix missing docs. It doesn't un-stale training data. And it actively encourages the noisy-context failure by giving teams permission to throw everything at the model and hope.&lt;/p&gt;

&lt;p&gt;The fix isn't a bigger pipe. It's a cleaner one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fixing context at the source
&lt;/h2&gt;

&lt;p&gt;If context quality is the lever, the question becomes: what does a good context pipeline look like? Three properties matter, and they compound:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Version-specific, not latest-only.&lt;/strong&gt; The assistant needs docs for &lt;em&gt;your&lt;/em&gt; version, not the most recent release. A cloud doc service that indexes HEAD is useless if you're pinned to &lt;code&gt;react@18&lt;/code&gt;. Versioning has to be first-class, not an afterthought.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local-first, not network-bound.&lt;/strong&gt; If retrieving docs takes 300ms over the network, the agent starts skipping retrievals for "simple" questions. Sub-10ms local lookups mean retrieval is always on, for every question, even the trivial ones. Latency determines behavior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pre-indexed, not lazily scraped.&lt;/strong&gt; On-the-fly scraping is fragile — sites rate-limit, pages move, layouts change. Pre-built packages that ship the parsed, structured docs eliminate an entire class of flakiness.&lt;/li&gt;
&lt;/ul&gt;
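The "version-specific" property starts with your own manifest — a trivial sketch of deriving the pin from `package.json` (hypothetical helper, not any particular tool's behavior):

```python
import json

# Hypothetical sketch: derive a doc-version pin from the project's own
# package.json instead of defaulting to "latest".
def pin_from_manifest(manifest: str, dep: str) -> str:
    spec = json.loads(manifest)["dependencies"][dep]
    # Strip range operators like ^, ~, >=, keep the major version.
    major = spec.lstrip("^~>=<").split(".")[0]
    return f"{dep}@{major}"

print(pin_from_manifest('{"dependencies": {"react": "^18.2.0"}}', "react"))
# react@18
```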

&lt;p&gt;This is the thesis behind &lt;a href="https://dev.to/context"&gt;@neuledge/context&lt;/a&gt;: an MCP documentation server that gives your AI assistant accurate, version-pinned library docs from a local SQLite database. It isn't magic — it's the boring answer to what a fixed context pipeline looks like. Version-specific packages, sub-10ms local retrieval, and a &lt;a href="https://dev.to/blog/2026-03-07/community-registry-pre-built-mcp-documentation-packages"&gt;community registry of 116+ pre-built packages&lt;/a&gt; so you don't build anything from source unless you want to.&lt;/p&gt;

&lt;p&gt;MCP (Model Context Protocol) matters here because it's the standard interface that lets any coding assistant — Claude Code, Cursor, Continue, and a growing list of others — plug into the same documentation source. Fix your context pipeline once and every tool on your machine gets the benefit. No per-editor integration, no vendor lock-in.&lt;/p&gt;

&lt;p&gt;The point isn't "use this tool." The point is that the context problem has concrete, fixable causes, and you should use &lt;em&gt;something&lt;/em&gt; that addresses them. Several tools in this space exist. Pick one. What you don't want to do is keep waiting for the next model release to fix a problem the model never caused.&lt;/p&gt;

&lt;h2&gt;
  
  
  Before and after
&lt;/h2&gt;

&lt;p&gt;The difference is easier to see than to describe. Take a question almost every React developer has asked an AI assistant in the last year:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How do I fetch data in a React Server Component with suspense?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Without good context&lt;/strong&gt;, a typical assistant reaches for training data. You get code that looks right at a glance — &lt;code&gt;use client&lt;/code&gt;, &lt;code&gt;useEffect&lt;/code&gt;, a loading state — except React Server Components don't use &lt;code&gt;useEffect&lt;/code&gt;. That's a Client Component pattern from the pre–RSC era. The assistant mixed two React paradigms because both are in its training data and neither was labeled as "wrong for this context." The answer isn't nonsense; it's just an answer from 2022.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With version-pinned React 19 docs in context&lt;/strong&gt;, the same model gives you a Server Component that &lt;code&gt;await&lt;/code&gt;s the fetch directly, wrapped in a &lt;code&gt;&amp;lt;Suspense&amp;gt;&lt;/code&gt; boundary at the parent. No &lt;code&gt;use client&lt;/code&gt;. No &lt;code&gt;useEffect&lt;/code&gt;. Because the actual React 19 docs say so, and the model is no longer guessing.&lt;/p&gt;

&lt;p&gt;Same model. Same question. Different answer — because the context was different. Set it up once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @neuledge/context add react@19
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your assistant now has the React 19 reference at sub-10ms local latency. The next time you ask about Server Components, it's reading the docs, not dredging them up from a 2023 blog post.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stop upgrading models. Start upgrading context.
&lt;/h2&gt;

&lt;p&gt;The model-upgrade treadmill is a comfortable place to be — there's always a new release, the benchmarks always go up, and the problem always feels like it's about to be solved. It isn't. The hallucinations you're seeing today will still be there in the next model, because they aren't reasoning failures. They're context failures wearing a reasoning failure's clothes.&lt;/p&gt;

&lt;p&gt;The good news is that context is the &lt;em&gt;easier&lt;/em&gt; problem. You can fix it this afternoon:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Audit what your assistant actually has in its context window. Is it version-specific? Is it fresh? Is it the docs for the libraries you actually use?&lt;/li&gt;
&lt;li&gt;Set up a retrieval pipeline that's local, fast, and pre-indexed. &lt;a href="https://dev.to/context"&gt;@neuledge/context&lt;/a&gt; is one option; use whatever fits your stack.&lt;/li&gt;
&lt;li&gt;Pick your most frustrating AI coding scenario — the one that made you blame the model last week — and try it again with real docs in context. See if the model suddenly gets smarter.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It won't have gotten smarter. It'll just finally have the information it needed the first time.&lt;/p&gt;

&lt;p&gt;Try it with the scenario that annoyed you most this week. If it still gets the answer wrong after that, &lt;em&gt;then&lt;/em&gt; start blaming the model. Most people never have to.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Want the hands-on version? Read &lt;a href="https://dev.to/blog/2026-02-17/getting-started-with-neuledge-context"&gt;Getting Started with @neuledge/context&lt;/a&gt; for the setup walkthrough, or browse &lt;a href="https://dev.to/docs"&gt;the docs&lt;/a&gt; to pin your first package.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
    <item>
      <title>Your AI Coding Assistant Can Finally Read Django and Spring Boot Docs</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Wed, 01 Apr 2026 02:46:15 +0000</pubDate>
      <link>https://forem.com/moshe_io/beyond-markdown-how-neuledgecontext-indexes-python-java-and-any-documentation-format-3ab2</link>
      <guid>https://forem.com/moshe_io/beyond-markdown-how-neuledgecontext-indexes-python-java-and-any-documentation-format-3ab2</guid>
      <description>&lt;p&gt;Most AI documentation tools make a quiet assumption: &lt;strong&gt;your library's docs are in Markdown&lt;/strong&gt;. If they are, great. If they aren't, you're out of luck.&lt;/p&gt;

&lt;p&gt;That's a problem, because some of the most important frameworks in software development don't use Markdown at all. Python's ecosystem standardized on &lt;strong&gt;reStructuredText&lt;/strong&gt; (.rst) — Django, Flask, and most Sphinx-based projects write their docs in it. Many Java projects, including Spring Boot, use &lt;strong&gt;AsciiDoc&lt;/strong&gt; (.adoc) for their reference documentation.&lt;/p&gt;

&lt;p&gt;If your AI documentation tool can only parse Markdown, it can't index Django. It can't index Spring Boot. It's locked out of entire ecosystems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/context"&gt;@neuledge/context v0.3.0&lt;/a&gt; fixes this with native support for all three formats.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three formats, zero configuration
&lt;/h2&gt;

&lt;p&gt;Context now parses three documentation formats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Markdown&lt;/strong&gt; (.md, .mdx, .qmd, .rmd) — the existing default&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;reStructuredText&lt;/strong&gt; (.rst) — Python ecosystem: Django, Flask, Sphinx-based docs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AsciiDoc&lt;/strong&gt; (.adoc) — Java ecosystem: Spring Boot, enterprise documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Format detection is automatic.&lt;/strong&gt; Context reads the file extension and selects the right parser. No configuration flags, no format declarations. Point it at a repo and it figures out the rest.&lt;/p&gt;

&lt;p&gt;This means a single repository with mixed formats — say, Markdown README files alongside .rst API reference docs — gets parsed correctly without any extra steps. Each file is handled by its extension.&lt;/p&gt;

&lt;h2&gt;
  
  
  Python ecosystem: Django, FastAPI, Flask
&lt;/h2&gt;

&lt;p&gt;Let's walk through indexing Django's documentation. If you're using the &lt;a href="https://dev.to/blog/2026-03-07/community-registry-pre-built-mcp-documentation-packages"&gt;community registry&lt;/a&gt;, it's one command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context &lt;span class="nb"&gt;install &lt;/span&gt;pip/django
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That downloads a pre-built package with Django's full .rst documentation already parsed, chunked, and indexed into a searchable SQLite database.&lt;/p&gt;

&lt;p&gt;Want to see what versions are available? The registry pulls version data from &lt;strong&gt;PyPI's REST API&lt;/strong&gt; automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context browse pip/django
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you'd rather build from source — maybe you're tracking a development branch or using a fork:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://github.com/django/django
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Context will clone the repo, detect the .rst files in Django's &lt;code&gt;docs/&lt;/code&gt; directory, and parse them into the same indexed format. &lt;strong&gt;Django's documentation is extensive&lt;/strong&gt; — hundreds of .rst files covering models, views, middleware, forms, and the admin interface. All of it becomes searchable by your AI assistant.&lt;/p&gt;

&lt;p&gt;The same workflow works for the rest of the Python ecosystem:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FastAPI:&lt;/strong&gt; &lt;code&gt;context install pip/fastapi&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flask:&lt;/strong&gt; &lt;code&gt;context install pip/flask&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pydantic:&lt;/strong&gt; &lt;code&gt;context install pip/pydantic&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once indexed, your AI coding assistant gets &lt;strong&gt;version-specific, accurate answers&lt;/strong&gt; instead of guessing from training data. Ask about Django 5.1 middleware and you get Django 5.1 middleware docs — not a hallucinated blend of version 3, 4, and 5.&lt;/p&gt;

&lt;h2&gt;
  
  
  Java ecosystem: Spring Boot
&lt;/h2&gt;

&lt;p&gt;The AsciiDoc parser opens up Java's documentation world. Spring Boot's reference documentation is written entirely in .adoc files, and Context handles it natively:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context &lt;span class="nb"&gt;install &lt;/span&gt;maven/spring-boot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Version discovery works through &lt;strong&gt;Maven Central's API&lt;/strong&gt;, so &lt;code&gt;context browse maven/spring-boot&lt;/code&gt; shows every published version. Install the one that matches your project.&lt;/p&gt;

&lt;p&gt;Building from source follows the same pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://github.com/spring-projects/spring-boot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Context detects the .adoc files and parses Spring Boot's reference docs — configuration properties, auto-configuration, actuator endpoints, and deployment guides. Instead of your AI assistant guessing at Spring Boot configuration, it can search the actual reference documentation for your exact version.&lt;/p&gt;

&lt;p&gt;The registry also includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;JUnit:&lt;/strong&gt; &lt;code&gt;context install maven/junit&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Micrometer:&lt;/strong&gt; &lt;code&gt;context install maven/micrometer&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Build from any git repo
&lt;/h2&gt;

&lt;p&gt;The multi-format support isn't limited to registry packages. &lt;strong&gt;Any git repo with .rst or .adoc files works.&lt;/strong&gt; This is especially useful for teams with internal documentation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://github.com/your-org/internal-docs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Context scans the repository, auto-detects formats by extension, and parses everything it finds. If your team maintains API docs in reStructuredText and architecture docs in Markdown, both get indexed correctly from the same repo.&lt;/p&gt;
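&lt;p&gt;The format detection boils down to file extensions. As an illustrative sketch (Context's real dispatch logic may differ):&lt;/p&gt;

```typescript
// Illustrative extension-based format detection; Context's real dispatch
// logic may differ.
type DocFormat = "markdown" | "restructuredtext" | "asciidoc" | null;

function detectFormat(path: string): DocFormat {
  switch (path.toLowerCase().split(".").pop()) {
    case "md":
      return "markdown";
    case "rst":
      return "restructuredtext";
    case "adoc":
      return "asciidoc";
    default:
      return null; // not a recognized docs file; skipped during indexing
  }
}

console.log(detectFormat("docs/topics/http/middleware.rst")); // prints: restructuredtext
```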

&lt;p&gt;This also works for libraries that aren't in the registry yet. Found an open-source Python library with great .rst docs? Just point Context at the repo. No need to wait for someone to add it to the registry — though if it's a popular library, consider &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;submitting a registry entry&lt;/a&gt; so others can benefit too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;The Markdown assumption has been a blind spot for AI documentation tooling. Python has one of the largest developer communities in the world. Java remains the backbone of enterprise software. &lt;strong&gt;Excluding these ecosystems from AI documentation tools meant excluding millions of developers.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With multi-format support, &lt;a href="https://dev.to/context"&gt;@neuledge/context&lt;/a&gt; is no longer a JavaScript/TypeScript documentation tool. It's a documentation tool for any ecosystem that writes docs in Markdown, reStructuredText, or AsciiDoc — which covers the vast majority of open-source projects.&lt;/p&gt;

&lt;p&gt;Your AI assistant shouldn't be limited to libraries that happen to use Markdown. Try indexing a Python or Java library and see the difference accurate, &lt;a href="https://dev.to/blog/2026-02-18/version-specific-ai-docs"&gt;version-specific documentation&lt;/a&gt; makes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @neuledge/context &lt;span class="nb"&gt;install &lt;/span&gt;pip/django
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Browse the full &lt;a href="https://dev.to/blog/2026-03-07/community-registry-pre-built-mcp-documentation-packages"&gt;community registry&lt;/a&gt; for pre-built packages, check the &lt;a href="https://dev.to/docs"&gt;documentation&lt;/a&gt; for setup instructions, or explore how &lt;a href="https://dev.to/blog/2026-02-19/local-first-documentation-for-ai"&gt;local-first documentation&lt;/a&gt; keeps everything fast and private.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>documentation</category>
      <category>tooling</category>
    </item>
    <item>
      <title>RAG vs. Fine-Tuning vs. Grounding: Which One Does Your AI Actually Need?</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Sun, 15 Mar 2026 17:02:52 +0000</pubDate>
      <link>https://forem.com/moshe_io/rag-vs-fine-tuning-vs-grounding-which-one-does-your-ai-actually-need-5429</link>
      <guid>https://forem.com/moshe_io/rag-vs-fine-tuning-vs-grounding-which-one-does-your-ai-actually-need-5429</guid>
      <description>&lt;p&gt;I've watched three teams this year burn weeks fine-tuning models that just needed access to their own docs. One spent $12K on GPU time training a customer support model to stop hallucinating product features — features that were already documented in their help center. The fix was a retrieval pipeline that took an afternoon to set up.&lt;/p&gt;

&lt;p&gt;The problem isn't that these teams were stupid. It's that "RAG vs. fine-tuning" is the wrong question, and most content online frames it that way because the authors are selling one or the other.&lt;/p&gt;

&lt;p&gt;Here's the actual question: &lt;strong&gt;what kind of wrong is your LLM being?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The misdiagnosis that costs you weeks
&lt;/h2&gt;

&lt;p&gt;When an LLM gives bad output, developers reach for one of two fixes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RAG&lt;/strong&gt; — stuff relevant documents into the context window before generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fine-tuning&lt;/strong&gt; — retrain the model on examples of correct output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both work. Neither works for everything. And confusion about when to use each leads to the most expensive mistake in AI development: solving the right problem with the wrong tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fine-tuning a model to know facts is like teaching someone a foreign language by having them memorize a dictionary.&lt;/strong&gt; They'll learn the words, but they'll still make things up when you ask about something that wasn't in the dictionary. Fine-tuning changes &lt;em&gt;how&lt;/em&gt; a model responds — its tone, reasoning style, output format. It does not reliably teach it &lt;em&gt;what is true&lt;/em&gt;. A fine-tuned model will confidently produce wrong facts in exactly the style you trained it on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RAG without good retrieval is like handing someone a library card and expecting them to write a PhD thesis.&lt;/strong&gt; Access to information isn't the same as accessing the &lt;em&gt;right&lt;/em&gt; information. If your retrieval returns noisy, irrelevant, or stale documents, the model will dutifully weave that garbage into a polished-sounding response.&lt;/p&gt;

&lt;p&gt;Both techniques are tools. The goal they serve is &lt;strong&gt;grounding&lt;/strong&gt; — anchoring every claim the model makes to a verifiable source.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diagnose first, then pick your tool
&lt;/h2&gt;

&lt;p&gt;Match the symptom to the fix:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"It says wrong facts about our product."&lt;/strong&gt; Your model was trained on internet data, not your docs. It doesn't know that you renamed the API in v3, deprecated the old auth flow, or added a new pricing tier last month. This is a retrieval problem. Give it access to your documentation at query time — don't try to bake every fact into model weights.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"It responds in the wrong format."&lt;/strong&gt; You want JSON with specific fields. Or you want the model to follow your company's support tone. Or you need it to reason through multi-step problems in a specific way. This is a behavior problem. Fine-tune on examples of the format and style you want.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"It hallucinates even when I give it context."&lt;/strong&gt; Your retrieval is returning the wrong documents, too many documents, or documents with conflicting information. This is a retrieval &lt;em&gt;quality&lt;/em&gt; problem. Fix your chunking, your ranking, your filtering — don't add more infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"It doesn't follow complex instructions well."&lt;/strong&gt; The model's instruction-following capability is the bottleneck, not its knowledge. Fine-tune for reasoning patterns, AND ground it in real data so it has something accurate to reason over.&lt;/p&gt;

&lt;p&gt;Here's the pattern: &lt;strong&gt;if the problem is what the model knows → grounding. If the problem is how the model behaves → fine-tuning.&lt;/strong&gt; Most production issues are knowledge problems.&lt;/p&gt;
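&lt;p&gt;If it helps, the diagnosis above can be written as a decision rule. This is purely illustrative, not any library's API:&lt;/p&gt;

```typescript
// The diagnosis above as a decision rule. Purely illustrative, not an API.
type Symptom =
  | "wrong_facts"               // knowledge problem
  | "wrong_format"              // behavior problem
  | "hallucinates_with_context" // retrieval quality problem
  | "weak_instructions";        // behavior AND knowledge

function recommendFix(symptom: Symptom): string[] {
  switch (symptom) {
    case "wrong_facts":
      return ["grounding"];
    case "wrong_format":
      return ["fine-tuning"];
    case "hallucinates_with_context":
      return ["fix retrieval quality"]; // chunking, ranking, filtering
    case "weak_instructions":
      return ["fine-tuning", "grounding"];
  }
}

console.log(recommendFix("wrong_facts")); // [ 'grounding' ]
```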

&lt;h2&gt;
  
  
  Why fine-tuning is almost never the right first step
&lt;/h2&gt;

&lt;p&gt;Fine-tuning has a seductive pitch: "make the model work exactly how you want." But the costs are real and often underestimated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Training data curation.&lt;/strong&gt; You need hundreds to thousands of high-quality input/output examples. Someone has to write or curate those. That's weeks of work before you even start training.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compute costs.&lt;/strong&gt; A single fine-tuning run on a capable model runs $500–$5,000 depending on the provider, dataset size, and model. Multiple iterations are normal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model lock-in.&lt;/strong&gt; When Anthropic ships Claude 4.7 or OpenAI releases GPT-5, your fine-tuned weights don't transfer. You retrain from scratch. Every model upgrade resets you to zero.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The accuracy ceiling.&lt;/strong&gt; After all that investment, the model still can't answer questions about facts not in the training data. Your product docs changed last Tuesday? The fine-tuned model doesn't know.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Compare that to a grounding pipeline: set up retrieval, point it at your docs, done. When the docs change, the model's answers change immediately. No retraining, no dataset curation, no compute budget.&lt;/p&gt;

&lt;p&gt;Research backs this up. RAG-based grounding &lt;a href="https://arxiv.org/abs/2311.09210" rel="noopener noreferrer"&gt;reduces hallucinations by 42–68%&lt;/a&gt; with no model modification at all. That's the kind of improvement that makes fine-tuning an optimization for later, not a starting point.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The right sequence: ground first, measure what's still wrong, fine-tune only if the remaining problems are behavioral.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What good grounding actually looks like
&lt;/h2&gt;

&lt;p&gt;Bad grounding is "dump all the docs into the prompt." Good grounding is an architecture:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Right data, right time.&lt;/strong&gt; Not all data is the same. Static docs (API references, guides, policies) change per release — index them once and search locally. Live data (prices, inventory, status) changes per minute — query it at request time. Mixing these up is how you end up quoting yesterday's prices with today's confidence.&lt;/p&gt;
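&lt;p&gt;One way to picture this split is a freshness map that routes each topic either to the local index or to a live query. The topic names and categories below are illustrative, not part of any real API:&lt;/p&gt;

```typescript
// Sketch: route each topic to the local index or a live query based on how
// fast it goes stale. Topic names and categories are illustrative.
const freshness: { [topic: string]: "static" | "live" } = {
  "api-reference": "static", // changes per release: index once, search locally
  "pricing": "live",         // changes per minute: query at request time
  "inventory": "live",
};

function route(topic: string): "search_index" | "query_live_source" {
  // Unknown topics fall back to the index in this simplified sketch.
  return freshness[topic] === "live" ? "query_live_source" : "search_index";
}

console.log(route("pricing"));       // prints: query_live_source
console.log(route("api-reference")); // prints: search_index
```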

&lt;p&gt;&lt;strong&gt;2. Selective context.&lt;/strong&gt; Don't send 20 documents to the model. Send the 3 most relevant ones. More context means more noise for the model to latch onto. The model doesn't need your entire knowledge base — it needs the specific answer to the specific question.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Source traceability.&lt;/strong&gt; Every fact the model cites should trace back to a source document with a URL, version, and timestamp. If it can't cite a source, it should say so instead of guessing.&lt;/p&gt;
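&lt;p&gt;In code, source traceability means carrying the citation alongside the claim. A minimal sketch, with hypothetical field names:&lt;/p&gt;

```typescript
// A citation-carrying answer, with hypothetical field names.
interface GroundedFact {
  claim: string;
  source?: { url: string; version: string; fetchedAt: string };
}

// Cite the source, or decline instead of guessing.
function render(fact: GroundedFact): string {
  if (!fact.source) {
    return "No source found for that claim, so I won't guess.";
  }
  const { url, version, fetchedAt } = fact.source;
  return `${fact.claim} (source: ${url}, v${version}, fetched ${fetchedAt})`;
}

console.log(
  render({
    claim: "Middleware is configured via the MIDDLEWARE setting",
    source: {
      url: "https://docs.djangoproject.com/en/5.1/topics/http/middleware/",
      version: "5.1",
      fetchedAt: "2026-04-01",
    },
  })
);
```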

&lt;p&gt;In practice, this means two layers. For documentation and reference material, use something that indexes docs into a local, searchable store — we built &lt;a href="https://dev.to/context"&gt;@neuledge/context&lt;/a&gt; for this, which packages docs as SQLite databases with sub-10ms full-text search, served as an &lt;a href="https://dev.to/docs"&gt;MCP server&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@neuledge/context"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the &lt;a href="https://dev.to/blog/2026-03-07/community-registry-pre-built-mcp-documentation-packages"&gt;community registry&lt;/a&gt;, you don't even need to build packages yourself — 116+ libraries are pre-built and ready to install.&lt;/p&gt;

&lt;p&gt;For live operational data, use a semantic data layer like &lt;a href="https://dev.to/graph"&gt;@neuledge/graph&lt;/a&gt; that queries structured sources at request time and returns clean JSON the model can reason over.&lt;/p&gt;

&lt;p&gt;The combination covers both failure modes: stale knowledge (retrieval from indexed docs) and stale data (live queries to operational systems).&lt;/p&gt;

&lt;h2&gt;
  
  
  When you actually need fine-tuning
&lt;/h2&gt;

&lt;p&gt;Fine-tuning isn't useless — it's just not the first thing to reach for. There are specific situations where it's the right tool:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consistent output format.&lt;/strong&gt; You need every response to follow a strict JSON schema, or match a specific tone, or produce a particular reasoning structure. Prompt engineering can get you 80% there, but fine-tuning locks it in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain reasoning patterns.&lt;/strong&gt; Your use case requires the model to reason through problems in a domain-specific way — medical differential diagnosis, legal contract analysis, financial risk assessment. The model needs to &lt;em&gt;think&lt;/em&gt; differently, not just know different facts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficiency at scale.&lt;/strong&gt; You're making millions of API calls and a fine-tuned smaller model could replace a larger one with enough quality for your use case. This is a cost optimization, not an accuracy play.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The common thread: &lt;strong&gt;fine-tuning changes behavior, not knowledge.&lt;/strong&gt; If you fine-tune AND ground, you get a model that reasons the way you want about facts that are actually true. That's the combination that production systems eventually land on — but grounding comes first because it solves the bigger, more common problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bottom line
&lt;/h2&gt;

&lt;p&gt;Stop asking "RAG or fine-tuning?" and start asking "what's actually wrong?"&lt;/p&gt;

&lt;p&gt;Wrong facts → ground it. Wrong behavior → fine-tune it. Wrong everything → ground first, then fine-tune, because a model that behaves perfectly while confidently lying is worse than one that's awkwardly correct.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Get started:&lt;/strong&gt; Install &lt;a href="https://dev.to/context"&gt;@neuledge/context&lt;/a&gt; for documentation grounding and &lt;a href="https://dev.to/graph"&gt;@neuledge/graph&lt;/a&gt; for live data. Both are free, open source, and work with any MCP-compatible AI agent. The &lt;a href="https://dev.to/docs"&gt;getting started guide&lt;/a&gt; walks through the full setup.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>machinelearning</category>
      <category>rag</category>
    </item>
    <item>
      <title>116 Pre-Built Documentation Packages for Your AI Coding Assistant</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Sat, 07 Mar 2026 14:17:53 +0000</pubDate>
      <link>https://forem.com/moshe_io/we-built-a-community-registry-for-neuledgecontext-heres-how-it-works-3ble</link>
      <guid>https://forem.com/moshe_io/we-built-a-community-registry-for-neuledgecontext-heres-how-it-works-3ble</guid>
      <description>&lt;p&gt;Every time someone set up &lt;a href="https://dev.to/context"&gt;@neuledge/context&lt;/a&gt; for a new project, they'd do the same thing: clone the React docs repo, find the right directory, build a package. Then do it again for Next.js. And Tailwind. And Prisma.&lt;/p&gt;

&lt;p&gt;I kept seeing the same repos show up in GitHub traffic. Hundreds of developers, all independently building identical documentation packages for the same popular libraries. That felt like a problem worth solving.&lt;/p&gt;

&lt;p&gt;So we built a community registry — a shared collection of pre-built documentation packages that anyone can download instead of building from source.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem was simple repetition
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;context add&lt;/code&gt; workflow works great. You point it at a repo, it finds the docs, builds a searchable SQLite database. Done.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add &lt;span class="nt"&gt;--name&lt;/span&gt; react https://github.com/reactjs/react.dev /src/content/reference
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But for popular libraries, this is redundant work. You need to know the right repo URL, find the correct docs directory (which sometimes takes a few minutes of browsing), and wait for the build. Multiply that by every developer who uses React, and it's a lot of collective time spent producing the exact same &lt;code&gt;.db&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;The registry just short-circuits that. Someone builds the package once, and everyone else downloads it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's actually in a registry package
&lt;/h2&gt;

&lt;p&gt;Same thing you'd get building locally — a SQLite &lt;code&gt;.db&lt;/code&gt; file with FTS5 full-text search, containing semantically chunked documentation. There's no difference between a package you build yourself and one from the registry. Same format, same search quality, same everything.&lt;/p&gt;

&lt;p&gt;Right now the registry has packages for three ecosystems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;npm (109 packages):&lt;/strong&gt; React, Next.js, Angular, Vue, Svelte, Astro, Tailwind CSS, Express, Fastify, NestJS, Prisma, Drizzle, and a lot more&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pip (4 packages):&lt;/strong&gt; Django, FastAPI, Flask, Pydantic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;maven (3 packages):&lt;/strong&gt; Spring Boot, JUnit, Micrometer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Packages get rebuilt daily through GitHub Actions. When Next.js ships a new version, the registry picks it up automatically. No one needs to do anything.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to use it
&lt;/h2&gt;

&lt;p&gt;The simplest way is &lt;code&gt;context install&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context &lt;span class="nb"&gt;install &lt;/span&gt;npm/react
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That downloads the pre-built package and makes it available to your AI coding assistant immediately. If you want a specific version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context browse npm/react        &lt;span class="c"&gt;# see what's available&lt;/span&gt;
context &lt;span class="nb"&gt;install &lt;/span&gt;npm/next 15.0   &lt;span class="c"&gt;# install a specific version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or with &lt;code&gt;npx&lt;/code&gt; if you haven't installed &lt;code&gt;@neuledge/context&lt;/code&gt; globally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @neuledge/context &lt;span class="nb"&gt;install &lt;/span&gt;npm/react
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're running &lt;code&gt;@neuledge/context&lt;/code&gt; as an &lt;a href="https://dev.to/docs"&gt;MCP server&lt;/a&gt;, your AI agent can also find and install packages on its own. It has two tools for this — &lt;code&gt;search_packages&lt;/code&gt; to find what's available, and &lt;code&gt;download_package&lt;/code&gt; to install it. So if it encounters a library it doesn't have docs for, it can just go grab them from the registry without you doing anything.&lt;/p&gt;
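&lt;p&gt;Conceptually, the agent's tool calls look something like this. The tool names come from above; the argument shapes are assumptions and may differ from the actual MCP schema:&lt;/p&gt;

```json
[
  { "tool": "search_packages", "arguments": { "query": "django" } },
  { "tool": "download_package", "arguments": { "package": "pip/django" } }
]
```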

&lt;h2&gt;
  
  
  How the registry works
&lt;/h2&gt;

&lt;p&gt;The pipeline is pretty straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Registry entries&lt;/strong&gt; are YAML files that map a package name to a git repo and docs path. Each one says "for this library, clone this repo, look in this directory."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;A daily GitHub Actions workflow&lt;/strong&gt; checks for new library versions. When it finds one, it clones the repo, builds the documentation package, and publishes it to the registry API.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The API&lt;/strong&gt; at &lt;code&gt;api.context.neuledge.com&lt;/code&gt; serves search and download endpoints. Search to find packages, download to get the &lt;code&gt;.db&lt;/code&gt; file.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The packages themselves&lt;/strong&gt; are the same SQLite databases &lt;code&gt;@neuledge/context&lt;/code&gt; uses locally — &lt;code&gt;meta&lt;/code&gt; table for metadata, &lt;code&gt;chunks&lt;/code&gt; table for the documentation sections, and &lt;code&gt;chunks_fts&lt;/code&gt; for full-text search.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you want to add a library that's missing, you submit a YAML file to the &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; with the package mapping. The build pipeline handles everything else from there.&lt;/p&gt;
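&lt;p&gt;A registry entry might look something like this. The field names here are hypothetical; check the repo for the actual schema:&lt;/p&gt;

```yaml
# Hypothetical registry entry; see the GitHub repo for the actual schema.
name: pip/your-library
repo: https://github.com/your-org/your-library
docs: /docs
```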

&lt;h2&gt;
  
  
  What's covered
&lt;/h2&gt;

&lt;p&gt;A quick overview of the categories:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frontend:&lt;/strong&gt; React, Next.js, Angular, Vue, Svelte, SvelteKit, Astro, Solid, Remix, Nuxt, Gatsby&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CSS:&lt;/strong&gt; Tailwind CSS, Sass, PostCSS, Styled Components, Emotion&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Backend:&lt;/strong&gt; Express, Fastify, NestJS, Hono, Django, FastAPI, Flask, Spring Boot&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Database/ORM:&lt;/strong&gt; Prisma, Drizzle, TypeORM, Mongoose, Sequelize, Knex&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Testing:&lt;/strong&gt; Jest, Vitest, Playwright, Cypress, Testing Library&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI SDKs:&lt;/strong&gt; OpenAI SDK, Anthropic SDK, LangChain, Vercel AI SDK&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build tools:&lt;/strong&gt; Vite, Webpack, esbuild, Turbo, Bun, Deno&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Infra:&lt;/strong&gt; Docker, Kubernetes, Terraform, AWS CDK&lt;/p&gt;

&lt;p&gt;If you're working with a typical stack — say React, Next.js, Prisma, and Tailwind — that's four install commands and your AI assistant has accurate, version-specific docs for everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it out
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @neuledge/context &lt;span class="nb"&gt;install &lt;/span&gt;npm/react
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or set up &lt;code&gt;@neuledge/context&lt;/code&gt; as an MCP server and let your AI agent discover packages on its own. The &lt;a href="https://dev.to/docs"&gt;getting started guide&lt;/a&gt; walks through the full setup.&lt;/p&gt;

&lt;p&gt;The registry is open source and free. If your favorite library isn't there yet, &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;adding it&lt;/a&gt; is one YAML file.&lt;/p&gt;

</description>
      <category>documentation</category>
      <category>productivity</category>
      <category>showdev</category>
      <category>tooling</category>
    </item>
    <item>
      <title>How @neuledge/graph Gives AI Agents Access to Live Data</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Sun, 22 Feb 2026 15:03:31 +0000</pubDate>
      <link>https://forem.com/moshe_io/how-neuledgegraph-gives-ai-agents-access-to-live-data-4hep</link>
      <guid>https://forem.com/moshe_io/how-neuledgegraph-gives-ai-agents-access-to-live-data-4hep</guid>
      <description>&lt;p&gt;Your customer asks the AI agent: "What's the current price for the Pro plan?" The agent responds with $29/month — the price from six months ago. You raised it to $39 in January. Yesterday the same agent told a prospect you have 200 units in stock. You have 12.&lt;/p&gt;

&lt;p&gt;These aren't hallucinations in the traditional sense. The model isn't making things up from nothing — it's answering from stale training data because it has no connection to your live systems. Prices change, inventory moves, statuses update. &lt;strong&gt;If your AI agent can't access current data, it will confidently serve outdated facts.&lt;/strong&gt; And outdated facts are often worse than no facts at all.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/blog/2026-02-20/what-is-llm-grounding"&gt;RAG&lt;/a&gt; solves this for documentation. But structured operational data — prices, inventory, order statuses — needs a different approach.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why RAG alone isn't enough for live data
&lt;/h2&gt;

&lt;p&gt;RAG was designed for documents. It chunks text, embeds it into vectors, and retrieves relevant passages. That works well for documentation, knowledge bases, and guides. But it breaks down with live operational data for three reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Documents vs. structured data.&lt;/strong&gt; RAG returns text fragments. When an agent needs the current price of SKU-1234, it needs a number — not a paragraph that might contain a number from last week's catalog export.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Staleness matters more.&lt;/strong&gt; Documentation might be acceptable at a week old. Pricing data is wrong after an hour. Inventory counts are wrong after a minute. Different data types have fundamentally different freshness requirements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Too many API tools create a selection problem.&lt;/strong&gt; The alternative to RAG — giving your agent direct API access — means exposing 10 or 20 separate tools. The LLM has to pick the right endpoint, format the right parameters, and parse the response. This is fragile and error-prone, and it gets worse as you add more data sources.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need something between "embed everything into vectors" and "give the agent raw API access." A unified data layer that handles routing, caching, and structured responses.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Graph approach — one tool, all your data
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/neuledge/graph" rel="noopener noreferrer"&gt;@neuledge/graph&lt;/a&gt; is a semantic data layer for AI agents. Instead of giving your agent many API tools to choose from, Graph provides a single &lt;code&gt;lookup()&lt;/code&gt; tool. The agent describes what it needs in natural language; Graph routes it to the right data source and returns structured JSON.&lt;/p&gt;

&lt;p&gt;The core idea:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Connect your data sources&lt;/strong&gt; — APIs, databases, or any structured data endpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Expose a single lookup tool&lt;/strong&gt; — the agent calls &lt;code&gt;lookup()&lt;/code&gt; with a natural language query&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get structured JSON back&lt;/strong&gt; — not free text, but exact values the LLM can reason over&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Responses are pre-cached and return in under 100ms. The LLM doesn't wait for upstream APIs during a conversation — Graph handles that in the background.&lt;/p&gt;
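&lt;p&gt;The caching idea is simple enough to sketch. This is not Graph's implementation, just an illustration of why a cached lookup can answer in milliseconds while refreshes happen out of band:&lt;/p&gt;

```typescript
// Minimal TTL cache. Not Graph's implementation; just shows why a cached
// lookup can answer in milliseconds while refreshes happen out of band.
class TtlCache {
  private store = new Map<string, { value: unknown; expiresAt: number }>();
  constructor(private ttlMs: number) {}

  get(key: string): unknown {
    const hit = this.store.get(key);
    return hit && hit.expiresAt > Date.now() ? hit.value : undefined;
  }

  set(key: string, value: unknown): void {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

const prices = new TtlCache(60_000); // pricing goes stale within minutes
prices.set("pro-plan-usd", 39);
console.log(prices.get("pro-plan-usd")); // 39
```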

&lt;h2&gt;
  
  
  Setting up Graph
&lt;/h2&gt;

&lt;p&gt;Install Graph and its peer dependency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @neuledge/graph zod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Initialize the client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NeuledgeGraph&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@neuledge/graph&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraph&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the minimal setup. Graph connects to the Neuledge knowledge graph service, which provides access to structured data sources. No API key is required for basic usage (100 requests/day).&lt;/p&gt;

&lt;p&gt;For production workloads, sign up for a free API key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @neuledge/graph sign-up your-email@example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then configure it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;dotenv/config&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraph&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;NEULEDGE_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 10,000 requests/month&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Querying data
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;lookup()&lt;/code&gt; method is the single interface your agent uses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cities.tokyo.weather&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// Returns: { status: "matched", match: {...}, value: {...} }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Responses come back as structured JSON — not free text. The LLM can reason over exact values (prices, counts, statuses) instead of parsing unstructured paragraphs.&lt;/p&gt;
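Because the response carries an explicit status, your wrapper code can refuse to hand anything but a confirmed match to the model. A minimal sketch, assuming a simplified result shape inferred from the example above (the "unmatched" variant is an assumption, not documented API):

```typescript
// Hypothetical shape of a lookup() response, inferred from the example
// response above; the real @neuledge/graph types may differ.
type Matched = { status: "matched"; value: { [key: string]: unknown } };
type Unmatched = { status: "unmatched" };
type LookupResult = Matched | Unmatched;

// Narrow on `status` before trusting `value`, so a failed lookup never
// reaches the model as if it were a real answer.
function valueOrNull(result: LookupResult) {
  return result.status === "matched" ? result.value : null;
}

const ok: LookupResult = {
  status: "matched",
  value: { temperature: 62, unit: "fahrenheit" },
};
console.log(valueOrNull(ok)); // { temperature: 62, unit: 'fahrenheit' }
```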

&lt;h2&gt;
  
  
  Connecting Graph to your AI agent
&lt;/h2&gt;

&lt;p&gt;Graph is designed as a first-class tool for AI agent frameworks. You pass &lt;code&gt;graph.lookup&lt;/code&gt; directly as a tool — no wrapper code needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vercel AI SDK
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@ai-sdk/anthropic&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NeuledgeGraph&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@neuledge/graph&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;stepCountIs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ToolLoopAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tool&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraph&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ToolLoopAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude-sonnet-4-5&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;stopWhen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;stepCountIs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;textStream&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What's the current weather in Tokyo?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  OpenAI Agents SDK
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tool&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@openai/agents&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NeuledgeGraph&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@neuledge/graph&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraph&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Data Assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4.1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What is the current price of Apple stock?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  LangChain
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NeuledgeGraph&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@neuledge/graph&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tool&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;langchain&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraph&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lookup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai:gpt-4.1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What is the exchange rate from USD to EUR?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pattern is the same across frameworks: create a &lt;code&gt;NeuledgeGraph&lt;/code&gt; instance, pass &lt;code&gt;graph.lookup&lt;/code&gt; as a tool, and let the agent call it when it needs live data.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the agent experience looks like
&lt;/h2&gt;

&lt;p&gt;With Graph connected, a conversation that needs live data looks like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User:&lt;/strong&gt; "What's the current weather in San Francisco?"&lt;/p&gt;

&lt;p&gt;The agent calls &lt;code&gt;graph.lookup({ query: "cities.san-francisco.weather" })&lt;/code&gt; and gets back structured JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"matched"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"temperature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;62&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"unit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fahrenheit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"partly cloudy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"humidity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;68&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"updated_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-02-22T14:30:00Z"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent sees exact numbers, not prose. It can tell the user the temperature is 62°F with 68% humidity — not "approximately in the low 60s" based on historical averages from training data.&lt;/p&gt;

&lt;p&gt;This structured format matters. LLMs reason more accurately over explicit values than extracted text. A JSON response with &lt;code&gt;"price": 39.00&lt;/code&gt; is unambiguous. A paragraph that says "the price was recently updated to around $39" leaves room for the model to hedge, round, or misinterpret.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building a custom data server
&lt;/h2&gt;

&lt;p&gt;For proprietary data sources — your product catalog, internal pricing API, inventory system — you can run your own Graph server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NeuledgeGraphRouter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@neuledge/graph-router&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NeuledgeGraphMemoryRegistry&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@neuledge/graph-memory-registry&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Fastify&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fastify&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraphMemoryRegistry&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Register your data sources&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;products.{sku}.price&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;resolver&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;match&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sku&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sku&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`https://api.internal/pricing?sku=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;sku&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;inventory.{sku}.stock&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;resolver&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;match&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sku&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sku&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`https://api.internal/inventory?sku=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;sku&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraphRouter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;registry&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Fastify&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/lookup&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then point your Graph client at the custom server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraph&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:3000&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you full control over data sources, caching, and access policies while keeping the same &lt;code&gt;lookup()&lt;/code&gt; interface for your AI agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Graph + Context — the full grounding stack
&lt;/h2&gt;

&lt;p&gt;Graph handles live operational data. But AI agents also need static knowledge — library docs, API references, guides. That's where &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;@neuledge/context&lt;/a&gt; comes in.&lt;/p&gt;

&lt;p&gt;The two tools complement each other:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Context&lt;/strong&gt; grounds your agent in &lt;a href="https://dev.to/blog/2026-02-17/getting-started-with-neuledge-context"&gt;static documentation&lt;/a&gt; — library docs, internal wikis, API references. Indexes into SQLite, serves via MCP, sub-10ms queries. Best for knowledge that changes with releases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph&lt;/strong&gt; grounds your agent in live data — product catalogs, pricing, inventory, system status. Pre-cached structured responses, single lookup tool. Best for data that changes continuously.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An AI coding assistant uses Context for accurate, &lt;a href="https://dev.to/blog/2026-02-18/version-specific-documentation-ai-coding-assistants"&gt;version-specific documentation&lt;/a&gt;. A customer-facing agent uses Graph for current prices and availability. A sophisticated agent uses both: Context for how-to knowledge, Graph for current facts.&lt;/p&gt;

&lt;p&gt;Together, they form the &lt;a href="https://dev.to/blog/2026-02-15/building-ai-agents-that-dont-hallucinate"&gt;grounding architecture&lt;/a&gt; that eliminates the most common categories of hallucination: outdated documentation and stale operational data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;Install Graph and connect it to your agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @neuledge/graph zod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NeuledgeGraph&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@neuledge/graph&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraph&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;your-data-query-here&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For production usage, &lt;a href="https://github.com/neuledge/graph" rel="noopener noreferrer"&gt;sign up for a free API key&lt;/a&gt; to get 10,000 requests/month. For proprietary data, set up a custom server with &lt;code&gt;@neuledge/graph-router&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Your AI agent should answer from your data, not from six-month-old training patterns. Ground it.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/neuledge/graph" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; — source code, API reference, and examples&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/docs"&gt;Documentation&lt;/a&gt; — quick start guide and configuration&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/blog/2026-02-17/getting-started-with-neuledge-context"&gt;Getting started with Context&lt;/a&gt; — complement Graph with documentation grounding&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/blog/2026-02-20/what-is-llm-grounding"&gt;What is LLM grounding?&lt;/a&gt; — the concept behind tools like Graph and Context&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>llm</category>
      <category>rag</category>
    </item>
    <item>
      <title>What Is LLM Grounding? A Developer's Guide</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Fri, 20 Feb 2026 21:55:49 +0000</pubDate>
      <link>https://forem.com/moshe_io/what-is-llm-grounding-a-developers-guide-3cj4</link>
      <guid>https://forem.com/moshe_io/what-is-llm-grounding-a-developers-guide-3cj4</guid>
      <description>&lt;p&gt;Ask an AI coding assistant to use the &lt;code&gt;useAgent&lt;/code&gt; hook from Vercel's AI SDK. If the model was trained before v6 shipped, you'll get a confident answer referencing &lt;code&gt;Experimental_Agent&lt;/code&gt; — an API that was renamed months ago. The code looks right. The types look right. It's wrong.&lt;/p&gt;

&lt;p&gt;This is what happens when a language model has no connection to current reality. LLMs are powerful pattern matchers trained on internet snapshots. They have no access to your docs, your APIs, or your data. When they lack information, they fill the gap with plausible-sounding fiction. This isn't a bug — it's a fundamental limitation of how these models work. Researchers call it "hallucination," but that implies randomness. In practice, it's worse: the model generates answers that are structurally correct but factually outdated, and there's nothing in the output that tells you which parts are real.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Grounding is the architectural solution.&lt;/strong&gt; Instead of hoping the training data is current enough, you connect the model to real data sources at the time it generates a response. The result: answers based on facts, not patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is LLM grounding?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;LLM grounding is the process of connecting a language model to external data sources at inference time&lt;/strong&gt;, so it can retrieve and reason over real information instead of relying solely on its training data.&lt;/p&gt;

&lt;p&gt;It's not a single technique but an umbrella term for a family of approaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval-Augmented Generation (RAG)&lt;/strong&gt; — fetching relevant documents before generation. The model searches a knowledge base, retrieves matching content, and uses it as context for its response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool use / function calling&lt;/strong&gt; — letting the model query APIs, databases, or services directly. Instead of guessing a price, it calls a pricing API.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge retrieval&lt;/strong&gt; — structured access to specific facts through knowledge graphs, lookup tables, or semantic search indexes. Not just document chunks, but precise answers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Grounding is the goal. RAG, tool use, and knowledge retrieval are techniques to achieve it. Most production systems combine more than one.&lt;/p&gt;
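
&lt;p&gt;The RAG flavor of grounding reduces to a small loop: score a corpus against the query, keep the best matches, and prepend them to the prompt. Here's a deliberately naive sketch (word-overlap scoring stands in for the BM25 or embedding search a real system would use):&lt;/p&gt;

```typescript
// Naive RAG sketch. Word-overlap scoring is a stand-in for real
// BM25 or embedding-based retrieval; only the pattern matters.
const corpus = [
  "ToolLoopAgent replaces Experimental_Agent in AI SDK v6.",
  "Middleware in Next.js runs before a request completes.",
  "SQLite FTS5 provides full-text search over local tables.",
];

// Count how many query words appear in a document.
function score(query: string, doc: string): number {
  const words = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  return doc.toLowerCase().split(/\W+/).filter((w) => words.has(w)).length;
}

// Keep the k best-matching documents.
function retrieve(query: string, k: number): string[] {
  return [...corpus].sort((a, b) => score(query, b) - score(query, a)).slice(0, k);
}

// Inject retrieved context ahead of the user's question.
function groundedPrompt(query: string): string {
  const context = retrieve(query, 1).join("\n");
  return `Answer using ONLY these sources:\n${context}\n\nQuestion: ${query}`;
}

const prompt = groundedPrompt("What replaces Experimental_Agent?");
```

&lt;p&gt;The model then answers from the injected sources instead of its training mix; everything beyond this sketch (chunking, ranking quality, context budgets) is where real RAG systems earn their keep.&lt;/p&gt;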

&lt;h2&gt;
  
  
  Types of grounding by data source
&lt;/h2&gt;

&lt;p&gt;Not all grounding is the same. Different data sources have different characteristics, and the right approach depends on what kind of data your model needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Static documentation&lt;/strong&gt; — library docs, API references, internal guides. Changes infrequently (per release cycle). Best approach: index locally, serve via search. Full-text search or vector embeddings work well here because the content is stable enough to pre-index. Read more about &lt;a href="https://dev.to/blog/2026-02-19/local-first-documentation-for-ai"&gt;local-first documentation&lt;/a&gt; for a deep dive on this approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live operational data&lt;/strong&gt; — prices, inventory, system status, feature flags. Changes continuously (hours, minutes, or seconds). Best approach: query via API or database at request time with appropriate cache TTLs. RAG doesn't work well here because by the time you've embedded and indexed the data, it's already stale. A customer-facing agent quoting yesterday's prices is worse than quoting no price at all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structured knowledge&lt;/strong&gt; — facts, relationships, taxonomies, entity data. Best approach: knowledge graphs or semantic lookup tools that return structured JSON rather than document fragments. When the model needs "the current price of SKU-1234," it needs a number, not a paragraph that might contain a number.&lt;/p&gt;
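
&lt;p&gt;The difference is easy to see in code. A document chunk makes the model parse prose for the number; a structured lookup hands it the number directly. (The &lt;code&gt;PriceFact&lt;/code&gt; shape below is illustrative, not taken from any real tool's API.)&lt;/p&gt;

```typescript
// Illustrative contrast between a prose chunk and a structured fact.
// The PriceFact shape is hypothetical, not any real tool's API.
const proseChunk =
  "Our spring catalog lists SKU-1234 at $49, though regional pricing may vary.";

interface PriceFact {
  sku: string;
  price: number;
  currency: string;
  asOf: string; // when the fact was fetched, ISO 8601
}

// In a real system this would hit a pricing API at request time.
function lookupPrice(sku: string): PriceFact {
  return { sku, price: 49.0, currency: "USD", asOf: new Date().toISOString() };
}

const fact = lookupPrice("SKU-1234");
// The model receives a number it can use as-is,
// not a paragraph that might contain a number.
```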

&lt;p&gt;The distinction matters because mixing up approaches creates subtle failures. Embedding live pricing data into a vector database gives you yesterday's prices with today's confidence. Querying a documentation API in real time adds latency and fragility where a local index would be instant and reliable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The grounding architecture
&lt;/h2&gt;

&lt;p&gt;When an AI agent receives a query, a grounded system follows a consistent pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The agent identifies what external data it needs.&lt;/strong&gt; Is this a question about API usage (docs), current pricing (live data), or general knowledge (training data is fine)?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The retrieval layer fetches relevant context.&lt;/strong&gt; This could be a local doc search, an API call, a database query — or all three.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context is injected into the prompt&lt;/strong&gt; alongside the user's query. The model now has both the question and the facts needed to answer it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The model generates a response grounded in retrieved facts&lt;/strong&gt; rather than training-time patterns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;(Optional) A verification layer checks the output&lt;/strong&gt; against the sources to catch remaining hallucinations.&lt;/li&gt;
&lt;/ol&gt;
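
&lt;p&gt;The five steps translate almost mechanically into code. In this skeleton the model call and the retrieval backends are stubbed out; what's real is the orchestration pattern:&lt;/p&gt;

```typescript
// Grounding pipeline skeleton. The model and retrievers are stubs;
// the orchestration pattern is the point.
type Source = "docs" | "live" | "none";

// Step 1: decide what external data the query needs.
function identifySource(query: string): Source {
  if (/price|inventory|status/i.test(query)) return "live";
  if (/how do i|api|configure/i.test(query)) return "docs";
  return "none";
}

// Step 2: fetch relevant context (stubbed).
function retrieveContext(source: Source, query: string): string {
  if (source === "docs") return "[doc excerpt matching: " + query + "]";
  if (source === "live") return "[API response for: " + query + "]";
  return "";
}

// Step 4: generate a response (stubbed model call).
function generate(prompt: string): string {
  return "Answer derived from: " + prompt;
}

function answer(query: string): string {
  const source = identifySource(query);
  const context = retrieveContext(source, query);
  // Step 3: inject retrieved context alongside the question.
  const prompt = context
    ? "Context:\n" + context + "\n\nQuestion: " + query
    : query;
  const response = generate(prompt);
  // Step 5 (optional): verify `response` against `context` here
  // before returning it to the user.
  return response;
}

const out = answer("How do I configure authentication?");
```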

&lt;p&gt;This is sometimes called a "grounding pipeline" — and it's the core architecture behind &lt;a href="https://dev.to/blog/2026-02-15/building-ai-agents-that-dont-hallucinate"&gt;AI agents that don't hallucinate&lt;/a&gt;. The specifics vary (what retrieval systems you use, how you compose the prompt, whether you add verification), but the pattern is consistent.&lt;/p&gt;

&lt;p&gt;The key insight: &lt;strong&gt;grounding is an architectural concern, not a prompt engineering trick.&lt;/strong&gt; You can't reliably ground a model by telling it "only use facts." You need infrastructure that provides those facts.&lt;/p&gt;

&lt;p&gt;Notice that step 1 is the hardest. Knowing when to retrieve and what to retrieve requires understanding the query's intent. A question about "how to configure authentication" needs docs. A question about "what's the current subscription price" needs live data. A good grounding system handles this routing automatically — the agent doesn't need to know the implementation details of each data source.&lt;/p&gt;

&lt;h2&gt;
  
  
  Grounding vs. fine-tuning
&lt;/h2&gt;

&lt;p&gt;This is a common source of confusion. Fine-tuning and grounding solve different problems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fine-tuning&lt;/strong&gt; changes the model's behavior — its tone, reasoning style, domain vocabulary, output format. You're adjusting how it thinks by training on task-specific examples. But the facts it knows still come from training data. Fine-tuning a model on medical terminology doesn't keep it current on drug interactions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Grounding&lt;/strong&gt; changes what the model knows at query time. You're giving it access to current facts without modifying the model itself. The model's behavior stays the same, but its answers reflect real data instead of training patterns.&lt;/p&gt;

&lt;p&gt;The decision framework is straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Need factual accuracy about things that change?&lt;/strong&gt; Use grounding. Current docs, live data, version-specific APIs — grounding handles these because it provides facts at inference time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Need the model to behave differently?&lt;/strong&gt; Use fine-tuning. Domain-specific output formats, specialized reasoning patterns, company tone — fine-tuning handles these because they're behavioral.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Building a production system?&lt;/strong&gt; You probably need both. Fine-tune for behavior, ground for facts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Fine-tuning without grounding gives you a model that sounds like a domain expert but still hallucinates about current data. Grounding without fine-tuning gives you accurate facts delivered in a generic style. The combination is where production systems land.&lt;/p&gt;

&lt;h2&gt;
  
  
  Grounding in practice with MCP
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;Model Context Protocol (MCP)&lt;/a&gt; makes grounding practical by standardizing how AI agents connect to external data sources. Instead of building custom integrations for every model and every data source, MCP defines a common interface: data sources expose "tools" through MCP servers, and AI agents query them through a standard protocol.&lt;/p&gt;

&lt;p&gt;This matters for grounding because it means you can compose multiple grounding sources without custom integration code. A coding assistant can pull library docs from one MCP server and live API data from another — same protocol, same agent, different data. And because MCP is an open standard, you're not locked into any particular vendor or model provider.&lt;/p&gt;

&lt;p&gt;Here's what a practical grounding setup looks like with MCP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@neuledge/context"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This configuration gives your AI agent access to local documentation through &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;@neuledge/context&lt;/a&gt; — a tool that indexes library docs into local SQLite databases and serves them via MCP. The agent gets &lt;a href="https://dev.to/blog/2026-02-18/version-specific-documentation-ai-coding-assistants"&gt;version-specific documentation&lt;/a&gt; with sub-10ms queries, no cloud dependency, and no rate limits.&lt;/p&gt;

&lt;p&gt;For live data grounding, &lt;a href="https://github.com/neuledge/graph" rel="noopener noreferrer"&gt;@neuledge/graph&lt;/a&gt; provides a semantic data layer that connects agents to operational data sources — pricing APIs, inventory systems, databases — through a single &lt;code&gt;lookup()&lt;/code&gt; tool with pre-cached responses and structured JSON output.&lt;/p&gt;

&lt;p&gt;The combination covers both grounding categories: static documentation via Context, live operational data via Graph. Both run locally, both expose tools through MCP, and both work with any AI agent that supports the protocol. Check the &lt;a href="https://dev.to/integrations"&gt;integrations page&lt;/a&gt; for setup guides with Claude Code, Cursor, Windsurf, and other editors.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;Start with the type of grounding that matches your biggest pain point:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Your AI keeps using wrong API versions&lt;/strong&gt; → Ground it with &lt;a href="https://dev.to/blog/2026-02-17/getting-started-with-neuledge-context"&gt;local documentation&lt;/a&gt;. Index the docs for the exact versions you use, serve them to your assistant via MCP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Your AI needs live data&lt;/strong&gt; (prices, statuses, inventory) → Ground it with a &lt;a href="https://dev.to/graph"&gt;data layer&lt;/a&gt;. Connect your operational APIs and let the agent query structured facts instead of guessing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Your AI hallucinates entirely&lt;/strong&gt; → Read the &lt;a href="https://dev.to/blog/2026-02-15/building-ai-agents-that-dont-hallucinate"&gt;hallucination prevention architecture guide&lt;/a&gt; for the full four-layer approach.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Grounding isn't optional for production AI systems. Research shows that RAG-based grounding alone reduces hallucinations by 42–68%, and combining grounding with verification can push accuracy even higher. An ungrounded agent is a liability — it will confidently deliver wrong answers that look right. A grounded agent is a tool — it delivers answers based on your data, your docs, your reality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start grounding your LLM today:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;@neuledge/context&lt;/a&gt; for documentation grounding&lt;/li&gt;
&lt;li&gt;Install &lt;a href="https://github.com/neuledge/graph" rel="noopener noreferrer"&gt;@neuledge/graph&lt;/a&gt; for live data grounding&lt;/li&gt;
&lt;li&gt;Read the &lt;a href="https://dev.to/docs"&gt;docs&lt;/a&gt; and &lt;a href="https://dev.to/integrations"&gt;integration guides&lt;/a&gt; for setup with your editor&lt;/li&gt;
&lt;li&gt;Compare &lt;a href="https://dev.to/compare"&gt;grounding tools&lt;/a&gt; to understand your options&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>programming</category>
      <category>rag</category>
    </item>
    <item>
      <title>Local-First Documentation: What It Is and Why Your AI Agent Needs It</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Thu, 19 Feb 2026 02:27:08 +0000</pubDate>
      <link>https://forem.com/moshe_io/local-first-documentation-what-it-is-and-why-your-ai-agent-needs-it-1l0g</link>
      <guid>https://forem.com/moshe_io/local-first-documentation-what-it-is-and-why-your-ai-agent-needs-it-1l0g</guid>
      <description>&lt;p&gt;You're mid-session with your AI coding assistant. It's been writing solid code for the last twenty minutes — referencing the right framework APIs, using current patterns. Then it starts hallucinating. The cloud documentation service hit its rate limit, and your assistant fell back to its training data. Now it's confidently suggesting APIs that were deprecated two versions ago.&lt;/p&gt;

&lt;p&gt;This is the fundamental reliability problem with cloud-based documentation for AI agents. Local-first documentation solves it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is local-first documentation?
&lt;/h2&gt;

&lt;p&gt;Local-first documentation means indexing library docs into a local database and serving them to your AI agent without any network calls. Instead of your assistant querying a cloud API every time it needs to reference a framework, it reads from a file on your machine.&lt;/p&gt;

&lt;p&gt;The concept borrows from the broader &lt;a href="https://www.inkandswitch.com/local-first/" rel="noopener noreferrer"&gt;local-first software&lt;/a&gt; movement: your data lives on your device, works offline, and doesn't depend on someone else's server being up. Applied to AI documentation, it means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Docs are stored locally&lt;/strong&gt; — typically as a SQLite database or similar portable format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queries never leave your machine&lt;/strong&gt; — sub-10ms lookups instead of 100–500ms cloud round-trips&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No network dependency&lt;/strong&gt; — works on a plane, in an air-gapped environment, or when your Wi-Fi drops&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You control the version&lt;/strong&gt; — index docs for the exact library version you're using&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't a new idea for developer tools. &lt;a href="https://devdocs.io/" rel="noopener noreferrer"&gt;DevDocs&lt;/a&gt;, &lt;a href="https://zealdocs.org/" rel="noopener noreferrer"&gt;Zeal&lt;/a&gt;, and &lt;a href="https://kapeli.com/dash" rel="noopener noreferrer"&gt;Dash&lt;/a&gt; have offered offline documentation browsing for years. What's new is applying this architecture to AI agents — giving your coding assistant the same offline, instant, version-accurate access to docs that you'd want for yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem with cloud documentation services
&lt;/h2&gt;

&lt;p&gt;Cloud documentation services solve a real problem: AI coding assistants need access to current docs that aren't in their training data. Services like Context7 provide this by hosting documentation and serving it through an API.&lt;/p&gt;

&lt;p&gt;But cloud-first architecture introduces its own failure modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rate limits cut you off mid-session.&lt;/strong&gt; Most services cap requests at 60 per hour. A single complex coding session can burn through that in minutes, especially with agentic workflows where the AI makes dozens of tool calls. Once you hit the limit, your assistant loses access to docs entirely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency adds up.&lt;/strong&gt; Each cloud lookup takes 100–500ms. In a session with 30+ doc queries, that's 3–15 seconds of accumulated waiting — enough to noticeably slow down an interactive coding session.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version mismatch.&lt;/strong&gt; Most cloud services index only the latest version of a library. If your project is pinned to Next.js 15 but the service indexed Next.js 16, &lt;a href="https://dev.to/blog/2026-02-18/version-specific-documentation-ai-coding-assistants"&gt;every answer references the wrong API&lt;/a&gt;. The version lag cuts both ways — if you're on the latest and the service is behind, you still get wrong answers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy exposure.&lt;/strong&gt; Every query goes to a third-party server. For teams working with proprietary codebases, internal APIs, or sensitive project structures, that's a non-trivial concern. The queries themselves reveal what you're building and what you're struggling with.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost scales with usage.&lt;/strong&gt; Free tiers have tight limits. Paid plans charge per query or per month. For teams with multiple developers using AI assistants heavily, costs compound.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are deal-breakers for casual use. If you're prototyping something quick and always-latest docs are fine, cloud services work. The problems surface when reliability and accuracy matter — production codebases, version-pinned dependencies, teams that can't afford their AI assistant going dark mid-session.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why local-first is a better fit for AI agents
&lt;/h2&gt;

&lt;p&gt;AI agents have different access patterns than human developers browsing docs. A developer might look up a few API references per hour. An AI agent in an agentic coding session might query docs 50+ times in a single task — checking types, verifying method signatures, reading examples for each file it touches.&lt;/p&gt;

&lt;p&gt;This high-frequency access pattern is exactly where local-first shines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No rate limits, ever.&lt;/strong&gt; Your agent can query docs hundreds of times per session. The database is a file on disk — there's no server to throttle you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sub-10ms latency.&lt;/strong&gt; SQLite queries against a local FTS5 index return in under 10 milliseconds. That's fast enough that doc lookups add zero perceptible delay to your coding session.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version pinning.&lt;/strong&gt; Index docs for the exact Git tag your project uses. When you're on &lt;code&gt;ai@6.0.86&lt;/code&gt;, you get v6 docs — not a blend of every version that existed at training time, and not whatever "latest" the cloud service indexed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Works everywhere.&lt;/strong&gt; Airplane mode, air-gapped networks, coffee shop Wi-Fi that drops every five minutes. Once the docs are indexed locally, your AI never loses access.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Free and unlimited.&lt;/strong&gt; No per-query pricing, no monthly subscriptions, no tier limits. Index as many libraries as you need, query as often as you want.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Private by default.&lt;/strong&gt; Your queries stay on your machine. No third party sees what APIs you're looking up, what frameworks you're using, or what internal docs you've indexed.&lt;/li&gt;
&lt;/ul&gt;
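
&lt;p&gt;The "no server to throttle you" point is worth making concrete. A local index has the same access pattern as the toy below: build once, then query at memory speed as often as the agent likes. (Real tools use SQLite FTS5 with BM25 ranking; this word-level inverted index is only an illustration.)&lt;/p&gt;

```typescript
// Toy inverted index: local, network-free doc lookup with no rate limits.
// A stand-in for SQLite FTS5 + BM25, not how any real tool works.
const docs = [
  "useChat sendMessage replaces append in AI SDK v6.",
  "ToolLoopAgent accepts instructions instead of system.",
  "generateObject is deprecated; use generateText with output.",
];

// Build once: token -> ids of documents containing it.
const index = new Map<string, Set<number>>();
docs.forEach((doc, id) => {
  for (const token of doc.toLowerCase().split(/\W+/)) {
    if (!token) continue;
    if (!index.has(token)) index.set(token, new Set());
    index.get(token)!.add(id);
  }
});

// Query any number of times: a Map lookup, no server, no throttling.
function search(query: string): string[] {
  const hits = new Map<number, number>();
  for (const token of query.toLowerCase().split(/\W+/)) {
    for (const id of index.get(token) ?? []) {
      hits.set(id, (hits.get(id) ?? 0) + 1);
    }
  }
  return [...hits.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => docs[id]);
}

const results = search("what replaces append");
```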

&lt;h2&gt;
  
  
  How local-first documentation works
&lt;/h2&gt;

&lt;p&gt;The architecture is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Point at a source.&lt;/strong&gt; Give the tool a Git repository URL (or a local directory). It clones the repo's docs — typically Markdown files in a &lt;code&gt;/docs&lt;/code&gt; folder.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pick a version.&lt;/strong&gt; Select the exact Git tag or branch you want. This is what makes version pinning possible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Index into a local database.&lt;/strong&gt; The tool parses documentation into semantically chunked sections and indexes them with full-text search (FTS5 + BM25 ranking) into a portable SQLite &lt;code&gt;.db&lt;/code&gt; file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serve via MCP.&lt;/strong&gt; The tool starts a local &lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;Model Context Protocol&lt;/a&gt; server. Your AI coding assistant — Claude Code, Cursor, VS Code Copilot, Windsurf — connects to it and queries docs through the standard MCP protocol.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The result: your AI assistant asks "How do I create middleware in Next.js?" and gets an answer from the exact version of Next.js docs you indexed, in under 10ms, without touching the internet.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;@neuledge/context&lt;/a&gt; implements this architecture. Three commands to set up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @neuledge/context
context add https://github.com/vercel/next.js &lt;span class="nt"&gt;--tag&lt;/span&gt; v16.0.0
context mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;.db&lt;/code&gt; files are portable — check them into your repo or share them on a drive. Every developer on your team gets the same indexed docs with zero setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Local-first vs. cloud documentation: when to use each
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Local-First&lt;/th&gt;
&lt;th&gt;Cloud&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rate limits&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;60 req/hour typical&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&amp;lt;10ms&lt;/td&gt;
&lt;td&gt;100–500ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Offline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Version pinning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Exact tags&lt;/td&gt;
&lt;td&gt;Latest only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Privacy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;100% local&lt;/td&gt;
&lt;td&gt;Cloud-processed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;$10+/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3 commands&lt;/td&gt;
&lt;td&gt;API key + config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Internal docs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes, free&lt;/td&gt;
&lt;td&gt;Paid or unsupported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Use local-first when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're working on a production codebase pinned to specific dependency versions&lt;/li&gt;
&lt;li&gt;You're in an offline or air-gapped environment&lt;/li&gt;
&lt;li&gt;Privacy matters — proprietary code, internal APIs, sensitive projects&lt;/li&gt;
&lt;li&gt;Your AI workflow is agentic (high-frequency doc queries that would hit rate limits)&lt;/li&gt;
&lt;li&gt;You want to index internal documentation alongside open-source libraries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use cloud when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're prototyping and always-latest docs are acceptable&lt;/li&gt;
&lt;li&gt;You want zero-setup, zero-install convenience&lt;/li&gt;
&lt;li&gt;Your AI usage is light enough that rate limits don't matter&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both approaches have their place. Cloud services offer convenience for light use. Local-first offers reliability and accuracy when it counts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;If your AI coding assistant keeps hitting rate limits, suggesting deprecated APIs, or losing access to docs mid-session, local-first documentation fixes all three:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @neuledge/context
context add https://github.com/vercel/next.js
claude mcp add context &lt;span class="nt"&gt;--&lt;/span&gt; npx @neuledge/context mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://dev.to/blog/2026-02-17/getting-started-with-neuledge-context"&gt;Getting started tutorial&lt;/a&gt; — full setup walkthrough with editor integration&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/context"&gt;Product page&lt;/a&gt; — features, architecture, and comparison table&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/compare"&gt;Compare alternatives&lt;/a&gt; — cloud services vs. local-first documentation&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/docs"&gt;Documentation&lt;/a&gt; — quick start and CLI reference&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; — source, issues, and contributions&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>llm</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>Version-Specific Documentation: Why Your AI Coding Assistant Gets It Wrong</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Wed, 18 Feb 2026 03:55:26 +0000</pubDate>
      <link>https://forem.com/moshe_io/version-specific-documentation-why-your-ai-coding-assistant-gets-it-wrong-564n</link>
      <guid>https://forem.com/moshe_io/version-specific-documentation-why-your-ai-coding-assistant-gets-it-wrong-564n</guid>
      <description>&lt;p&gt;Your AI assistant just built you an agent. Clean code, right structure, reasonable-looking tool definitions. You run it — nothing works. An hour later, you discover that &lt;code&gt;Experimental_Agent&lt;/code&gt; was renamed to &lt;code&gt;ToolLoopAgent&lt;/code&gt; in the AI SDK version you're using. And &lt;code&gt;system&lt;/code&gt; is now &lt;code&gt;instructions&lt;/code&gt;. And &lt;code&gt;parameters&lt;/code&gt; is now &lt;code&gt;inputSchema&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The error message didn't say any of that. It just said something vague about undefined properties. So you spent an hour debugging your agent's logic when the bug was actually a renamed class.&lt;/p&gt;

&lt;p&gt;This happens constantly. And it's not a hallucination — your assistant generated code that was correct. In AI SDK v5. You're on v6.&lt;/p&gt;

&lt;h2&gt;
  
  
  It's not the model. It's the data.
&lt;/h2&gt;

&lt;p&gt;Your assistant doesn't know what version you're running. Here's why the suggestions are wrong:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Training data blends every version together.&lt;/strong&gt; The AI SDK shipped three major architectural shifts from v3 to v6. All of them are in training data, mixed together with no version labels. Your assistant learned patterns from all three eras at once.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud doc services index the latest version.&lt;/strong&gt; If you're pinned to &lt;code&gt;ai@5.x&lt;/code&gt; but the service indexed v6, every answer you get is from the wrong API. It works the other way too — if you're on v6 but the service is behind, you still get v5 answers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Blog posts and tutorials don't say which version they're for.&lt;/strong&gt; A 2025 post using &lt;code&gt;generateObject&lt;/code&gt; looks identical to a 2026 post using the new &lt;code&gt;generateText&lt;/code&gt; + &lt;code&gt;output&lt;/code&gt; pattern. &lt;code&gt;generateObject&lt;/code&gt; was deprecated in v6 and your assistant has no way to know that from its training data alone.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The AI SDK is a perfect case study
&lt;/h2&gt;

&lt;p&gt;The agent pattern changed with every major release. Ask your assistant: &lt;em&gt;"How do I build a multi-step agent that calls tools?"&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;v3/v4:&lt;/strong&gt; Write a manual loop — &lt;code&gt;generateText&lt;/code&gt; with &lt;code&gt;maxSteps&lt;/code&gt;, manage each step yourself&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;v5:&lt;/strong&gt; &lt;code&gt;new Experimental_Agent({ system: "...", tools })&lt;/code&gt; — note the experimental prefix&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;v6:&lt;/strong&gt; &lt;code&gt;new ToolLoopAgent({ instructions: "...", tools })&lt;/code&gt; — stable, renamed, different param name&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Three correct answers, three incompatible versions. Without knowing which you're on, your assistant picks one and gets it wrong two-thirds of the time.&lt;/p&gt;

&lt;p&gt;The rest of the API has the same problem:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;parameters&lt;/code&gt; / &lt;code&gt;result&lt;/code&gt; → &lt;code&gt;inputSchema&lt;/code&gt; / &lt;code&gt;output&lt;/code&gt;&lt;/strong&gt; — tool definitions changed shape; old field names fail silently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;generateObject&lt;/code&gt; deprecated&lt;/strong&gt; — still runs, but warns; breaks in the next major version&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;useChat&lt;/code&gt; &lt;code&gt;append()&lt;/code&gt; → &lt;code&gt;sendMessage()&lt;/code&gt;&lt;/strong&gt; — plus the hook now expects you to manage input state yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Silent failures are the cruelest kind. When a renamed field breaks your tool calls without throwing an error, you debug your agent's &lt;em&gt;behavior&lt;/em&gt; for an hour before you even think to check the SDK changelog.&lt;/p&gt;
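
&lt;p&gt;Here's a stripped-down reproduction of that failure mode (no real SDK involved, just the shape of the bug): the library reads the new field name, the caller passes the old one, and nothing complains.&lt;/p&gt;

```typescript
// Stripped-down reproduction of a silent rename failure.
// No real SDK here; only the shape of the bug.

// The library reads the v6-style field name...
function defineTool(opts: Record<string, unknown>) {
  // If the field is absent there's no throw and no warning,
  // just `undefined` flowing downstream.
  return { schema: opts["inputSchema"] };
}

// ...but the caller still uses the v5-style name.
const tool = defineTool({ parameters: { query: "string" } });

// tool.schema is undefined: the tool "exists" but can never
// validate input, so calls fail later, far from the actual bug.
```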

&lt;h2&gt;
  
  
  The fix: docs pinned to your actual version
&lt;/h2&gt;

&lt;p&gt;Your assistant gives the right answer when it has the right docs. Without version-pinned docs, it generates this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// what your assistant produces from its training mix&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Experimental_Agent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With v6 docs indexed, it generates this instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// what it produces with AI SDK v6.0.86 docs&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ToolLoopAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same question. Right answer, because it's reading from the right docs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;@neuledge/context&lt;/a&gt; indexes docs from a specific Git tag and serves them to your AI assistant via MCP. Two commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://github.com/vercel/ai &lt;span class="nt"&gt;--tag&lt;/span&gt; v6.0.86
npx @neuledge/context mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, when you ask about building an agent, your assistant reads v6 docs and generates &lt;code&gt;ToolLoopAgent&lt;/code&gt; with &lt;code&gt;instructions&lt;/code&gt; and &lt;code&gt;inputSchema&lt;/code&gt; — not whatever blend of versions it was trained on.&lt;/p&gt;

&lt;p&gt;Works for any fast-moving library. React Router. Next.js App Router. Tailwind CSS. Anything where the API today isn't the API from a year ago.&lt;/p&gt;

&lt;p&gt;See our &lt;a href="https://dev.to/blog/2026-02-17/getting-started-with-neuledge-context"&gt;step-by-step tutorial&lt;/a&gt; for editor setup (Claude Code, Cursor, VS Code, Windsurf).&lt;/p&gt;

&lt;h2&gt;
  
  
  What about cloud documentation services?
&lt;/h2&gt;

&lt;p&gt;They solve a real problem — zero-setup access to docs your assistant wouldn't have otherwise. But most serve only the latest version. &lt;strong&gt;If you're on v5 and the service indexed v6, you still get the wrong answers.&lt;/strong&gt; The version lag cuts both ways.&lt;/p&gt;

&lt;p&gt;For production codebases pinned to a specific version, local version-pinned docs are the cleaner solution. See our &lt;a href="https://dev.to/compare"&gt;comparison page&lt;/a&gt; for the full breakdown.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;If your AI-generated agentic code keeps using old patterns — &lt;code&gt;Experimental_Agent&lt;/code&gt; when &lt;code&gt;ToolLoopAgent&lt;/code&gt; exists, &lt;code&gt;parameters&lt;/code&gt; when the field is now &lt;code&gt;inputSchema&lt;/code&gt; — three commands fix it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @neuledge/context
context add https://github.com/vercel/ai &lt;span class="nt"&gt;--tag&lt;/span&gt; v6.0.86
claude mcp add context &lt;span class="nt"&gt;--&lt;/span&gt; npx @neuledge/context mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://dev.to/blog/2026-02-17/getting-started-with-neuledge-context"&gt;Getting started tutorial&lt;/a&gt; — full setup walkthrough&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/docs"&gt;Documentation&lt;/a&gt; — quick start and CLI reference&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/compare"&gt;Compare alternatives&lt;/a&gt; — cloud services vs local version-pinned docs&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; — source, issues, and CLI reference&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>programming</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>Getting Started with @neuledge/context</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Tue, 17 Feb 2026 03:25:54 +0000</pubDate>
      <link>https://forem.com/moshe_io/getting-started-with-neuledgecontext-337a</link>
      <guid>https://forem.com/moshe_io/getting-started-with-neuledgecontext-337a</guid>
      <description>&lt;p&gt;Your AI coding assistant just suggested &lt;code&gt;getServerSideProps&lt;/code&gt; for your Next.js 16 project. That API was deprecated two major versions ago. Yesterday it generated Tailwind classes that don't exist. Last week it used the old AI SDK callback pattern instead of the new agent loop API.&lt;/p&gt;

&lt;p&gt;This isn't a model problem — it's a data problem. Your assistant is working from training data that's months or years out of date. When it doesn't have the right docs, it fills the gap with confident-sounding fiction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;&lt;strong&gt;@neuledge/context&lt;/strong&gt;&lt;/a&gt; fixes this. It indexes library documentation into local SQLite files and serves them to your AI assistant via MCP (Model Context Protocol). No cloud service, no rate limits, sub-10ms queries. This tutorial walks you through setting it up from scratch with a real project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you start, you'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Node.js 18+&lt;/strong&gt; — check with &lt;code&gt;node --version&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An AI coding assistant that supports MCP&lt;/strong&gt; — Claude Code, Cursor, VS Code with Copilot, Windsurf, or any MCP-compatible client&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A project you're working on&lt;/strong&gt; — we'll use a Next.js + Tailwind CSS stack as our example, but Context works with any library that has Markdown docs&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Installing Context
&lt;/h2&gt;

&lt;p&gt;Install Context globally so it's available across all your projects:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @neuledge/context
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This installs the &lt;code&gt;context&lt;/code&gt; CLI tool. There's no background daemon, no system service — just a command-line tool that runs when you call it.&lt;/p&gt;

&lt;p&gt;If you prefer not to install globally, you can use &lt;code&gt;npx&lt;/code&gt; instead. Every command in this tutorial works with the &lt;code&gt;npx @neuledge/context&lt;/code&gt; prefix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @neuledge/context &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Adding your first library
&lt;/h2&gt;

&lt;p&gt;Let's index the Next.js documentation. Point Context at the GitHub repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://github.com/vercel/next.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Context will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Shallow-clone the repo&lt;/strong&gt; — only the docs, not the full git history&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Show available version tags&lt;/strong&gt; — pick the version that matches your project (e.g., &lt;code&gt;v16.0.0&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detect the docs directory&lt;/strong&gt; — it scans for &lt;code&gt;docs/&lt;/code&gt;, &lt;code&gt;documentation/&lt;/code&gt;, or &lt;code&gt;doc/&lt;/code&gt; automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parse every Markdown file&lt;/strong&gt; — extracting frontmatter, splitting content into semantically meaningful chunks by H2 headings (~800 tokens per chunk)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Index into SQLite&lt;/strong&gt; — full-text search with FTS5 and BM25 ranking, stored in a single &lt;code&gt;.db&lt;/code&gt; file&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The result is a portable database file at &lt;code&gt;~/.context/packages/nextjs@16.0.0.db&lt;/code&gt;. That file contains every piece of Next.js 16 documentation, pre-indexed and ready for instant queries.&lt;/p&gt;
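&lt;p&gt;The chunking in step 4 is what makes search results coherent later. Here's a minimal TypeScript sketch of H2-based splitting — illustrative only, not the actual indexer, which also parses frontmatter and enforces the ~800-token budget:&lt;br&gt;
&lt;/p&gt;

```typescript
// Split a Markdown document into chunks at H2 boundaries.
// Illustrative sketch only; the real indexer also handles
// frontmatter and a ~800-token budget per chunk.
function chunkByH2(markdown: string): { heading: string; body: string }[] {
  const chunks: { heading: string; body: string }[] = [];
  let current = { heading: "(intro)", body: "" };
  for (const line of markdown.split("\n")) {
    if (line.startsWith("## ")) {
      if (current.body.trim()) chunks.push(current); // close the previous chunk
      current = { heading: line.slice(3).trim(), body: "" };
    } else {
      current.body += line + "\n";
    }
  }
  if (current.body.trim()) chunks.push(current);
  return chunks;
}
```

&lt;p&gt;Each chunk keeps its own heading, which is what lets the search layer weight section titles above body text.&lt;/p&gt;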

&lt;p&gt;Want to pin a specific version without the interactive prompt? Use the &lt;code&gt;--tag&lt;/code&gt; flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://github.com/vercel/next.js &lt;span class="nt"&gt;--tag&lt;/span&gt; v16.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Adding multiple libraries
&lt;/h2&gt;

&lt;p&gt;A real project uses more than one library. Let's add Tailwind CSS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://github.com/tailwindlabs/tailwindcss
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pick the version tag that matches your project, and Context creates another &lt;code&gt;.db&lt;/code&gt; file. Each library gets its own database — clean, isolated, and independently updatable.&lt;/p&gt;

&lt;p&gt;To see everything you've indexed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll see something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nextjs@16.0.0          ~/.context/packages/nextjs@16.0.0.db
tailwindcss@4.0.0      ~/.context/packages/tailwindcss@4.0.0.db
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add as many libraries as your project uses. The &lt;a href="https://github.com/vercel/ai" rel="noopener noreferrer"&gt;Vercel AI SDK&lt;/a&gt;, &lt;a href="https://github.com/facebook/react" rel="noopener noreferrer"&gt;React&lt;/a&gt;, your component library — if it has Markdown docs in a Git repo, Context can index it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting to your editor
&lt;/h2&gt;

&lt;p&gt;Context serves docs via MCP — the Model Context Protocol, an open standard backed by Anthropic, OpenAI, Google, and Microsoft. Here's how to connect it to your editor.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Code
&lt;/h3&gt;

&lt;p&gt;One command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add context &lt;span class="nt"&gt;--&lt;/span&gt; npx @neuledge/context mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cursor
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;.cursor/mcp.json&lt;/code&gt; in your project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"@neuledge/context"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  VS Code / Copilot
&lt;/h3&gt;

&lt;p&gt;Add to &lt;code&gt;.vscode/settings.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"@neuledge/context"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Requires VS Code 1.99+ with the GitHub Copilot extension.&lt;/p&gt;

&lt;h3&gt;
  
  
  Windsurf
&lt;/h3&gt;

&lt;p&gt;Add to &lt;code&gt;~/.codeium/windsurf/mcp_config.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"@neuledge/context"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For other MCP clients, the server command is &lt;code&gt;npx @neuledge/context mcp&lt;/code&gt; using stdio transport. See our &lt;a href="https://dev.to/integrations"&gt;integrations page&lt;/a&gt; for the full list.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using Context in practice
&lt;/h2&gt;

&lt;p&gt;Once connected, your AI assistant automatically has access to the &lt;code&gt;resolve&lt;/code&gt; tool — it can search your indexed docs whenever it needs accurate information.&lt;/p&gt;

&lt;p&gt;Here's the difference in action. Say you ask your assistant: &lt;em&gt;"How do I create a middleware in Next.js 16 that redirects unauthenticated users?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without Context:&lt;/strong&gt; The assistant relies on training data. It might generate a &lt;code&gt;middleware.ts&lt;/code&gt; file using the old &lt;code&gt;NextResponse.redirect()&lt;/code&gt; pattern with the wrong import path, or reference a configuration option that was renamed two versions ago.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With Context:&lt;/strong&gt; The assistant queries your indexed Next.js 16 docs, finds the current middleware documentation, and generates code that matches the exact API of the version you're using. The correct imports, the current configuration format, the right patterns.&lt;/p&gt;

&lt;p&gt;The same applies to Tailwind. Ask about a utility class and the assistant pulls from your indexed v4 docs instead of guessing based on v3 training data.&lt;/p&gt;

&lt;p&gt;This happens transparently — your assistant calls the &lt;code&gt;resolve&lt;/code&gt; tool when it needs docs, gets results in under 10ms, and uses them to ground its response. No extra prompting needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tips for power users
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Pin exact versions
&lt;/h3&gt;

&lt;p&gt;Always match the indexed version to what's in your &lt;code&gt;package.json&lt;/code&gt;. If you're on Next.js 16.0.0, index that exact tag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://github.com/vercel/next.js &lt;span class="nt"&gt;--tag&lt;/span&gt; v16.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you upgrade, add the new version. Old &lt;code&gt;.db&lt;/code&gt; files stay around so you can switch back if needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Index your own docs
&lt;/h3&gt;

&lt;p&gt;Context works with any Git repo that has Markdown files — including yours:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add ./docs &lt;span class="nt"&gt;--name&lt;/span&gt; my-project &lt;span class="nt"&gt;--pkg-version&lt;/span&gt; 1.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Index your internal API docs, runbooks, or design system documentation. Your AI assistant gets grounded access to company knowledge, completely private, no cloud service involved.&lt;/p&gt;

&lt;h3&gt;
  
  
  Share &lt;code&gt;.db&lt;/code&gt; files with your team
&lt;/h3&gt;

&lt;p&gt;Each documentation package is a single, self-contained &lt;code&gt;.db&lt;/code&gt; file. You can share them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build and export to a specific location&lt;/span&gt;
context add https://github.com/your-org/design-system &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; design-system &lt;span class="nt"&gt;--pkg-version&lt;/span&gt; 3.1 &lt;span class="nt"&gt;--save&lt;/span&gt; ./packages/

&lt;span class="c"&gt;# Teammates install the pre-built package instantly&lt;/span&gt;
context add ./packages/design-system@3.1.db
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Commit &lt;code&gt;.db&lt;/code&gt; files to your repo, upload them to S3, or put them on a shared drive. No build step on the receiving end — the pre-indexed database installs instantly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Update when a new version releases
&lt;/h3&gt;

&lt;p&gt;When a library you depend on releases a new version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://github.com/vercel/next.js &lt;span class="nt"&gt;--tag&lt;/span&gt; v16.1.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The old version's &lt;code&gt;.db&lt;/code&gt; file stays intact. You can keep multiple versions indexed simultaneously.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's happening under the hood
&lt;/h2&gt;

&lt;p&gt;If you're curious about the internals: Context uses SQLite with FTS5 (full-text search) and BM25 ranking. When your AI assistant queries for "middleware authentication," the search engine:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Tokenizes the query&lt;/strong&gt; using Porter stemming — so "authenticating" matches "authentication"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runs FTS5 search&lt;/strong&gt; across all indexed chunks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ranks results with BM25&lt;/strong&gt; — section titles weighted 10x, doc titles 5x over body content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filters low-relevance results&lt;/strong&gt; — anything below 50% of the top score gets dropped&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Merges adjacent chunks&lt;/strong&gt; — so your assistant sees coherent documentation sections, not fragments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caps at a token budget&lt;/strong&gt; — keeping the response focused without flooding the context window&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Total latency: under 10ms. Compare that to 100–500ms for a cloud round-trip.&lt;/p&gt;
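&lt;p&gt;Steps 4 and 5 are easy to express in code. Here's a rough TypeScript sketch — the &lt;code&gt;Chunk&lt;/code&gt; shape is hypothetical, not the actual source, but the 50% threshold and adjacency rule are the ones described above:&lt;br&gt;
&lt;/p&gt;

```typescript
interface Chunk {
  docId: string;  // which document the chunk came from
  index: number;  // position of the chunk within its document
  score: number;  // BM25 relevance (higher is better)
  text: string;
}

// Drop anything below 50% of the top score, then merge chunks that
// sit next to each other in the same document, so the assistant sees
// coherent sections instead of fragments.
function filterAndMerge(results: Chunk[]): Chunk[] {
  if (results.length === 0) return [];
  const top = Math.max(...results.map((c) => c.score));
  const kept = results
    .filter((c) => c.score >= top * 0.5)
    .sort((a, b) => a.docId.localeCompare(b.docId) || a.index - b.index);
  const merged: Chunk[] = [];
  for (const c of kept) {
    const last = merged[merged.length - 1];
    if (last && last.docId === c.docId && c.index === last.index + 1) {
      last.text += "\n" + c.text;       // extend the previous section
      last.index = c.index;
      last.score = Math.max(last.score, c.score);
    } else {
      merged.push({ ...c });
    }
  }
  return merged;
}
```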

&lt;p&gt;If you also need live data access beyond static documentation — product catalogs, pricing, inventory — check out &lt;a href="https://github.com/neuledge/graph" rel="noopener noreferrer"&gt;@neuledge/graph&lt;/a&gt;, which provides a semantic data layer for AI agents with pre-cached, sub-100ms responses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get started now
&lt;/h2&gt;

&lt;p&gt;Install Context, index the docs for your current project, and connect to your editor. The whole setup is three commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @neuledge/context
context add https://github.com/vercel/next.js
claude mcp add context &lt;span class="nt"&gt;--&lt;/span&gt; npx @neuledge/context mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your AI coding assistant just went from hallucinating outdated APIs to having instant, offline access to the exact documentation it needs.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; — source code, issues, and full CLI reference&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/docs"&gt;Documentation&lt;/a&gt; — quick start guide and MCP configuration&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/integrations"&gt;Integrations&lt;/a&gt; — setup guides for every supported editor&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/compare"&gt;Compare alternatives&lt;/a&gt; — see how Context stacks up against cloud services&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>mcp</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Building AI Agents That Don't Hallucinate: A Practical Architecture Guide</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Sun, 15 Feb 2026 16:35:18 +0000</pubDate>
      <link>https://forem.com/moshe_io/building-ai-agents-that-dont-hallucinate-a-practical-architecture-guide-16am</link>
      <guid>https://forem.com/moshe_io/building-ai-agents-that-dont-hallucinate-a-practical-architecture-guide-16am</guid>
      <description>&lt;p&gt;So your AI agent just told a customer that your product supports a feature it doesn't have. Yesterday it cited an API endpoint that doesn't exist. And last week? It invented a compliance regulation that sounded convincing enough to pass internal review.&lt;/p&gt;

&lt;p&gt;If this sounds familiar, you're not alone. Hallucination rates across current models range from 0.7% to 9.2%. Even at the low end, that's dozens of wrong answers per day at any real scale.&lt;/p&gt;

&lt;p&gt;Here's the thing though — this is a solved problem. Not with a better prompt or a bigger model, but with architecture. You need a grounding pipeline that connects your agent to real data before it ever opens its mouth.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why do agents hallucinate?
&lt;/h2&gt;

&lt;p&gt;Hallucinations aren't random. They follow predictable patterns, and each one points to a specific fix:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stale training data&lt;/strong&gt; — you shipped a new API last month, but the model trained six months ago. It fills the gap with fiction. Fix: &lt;em&gt;retrieval.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weak domain signal&lt;/strong&gt; — your internal docs and jargon are barely represented in training data. Fix: &lt;em&gt;domain-specific sources.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context overload&lt;/strong&gt; — dump 20 documents into a prompt and the model can't tell what matters. Fix: &lt;em&gt;selective context.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Helpfulness bias&lt;/strong&gt; — LLMs would rather sound right than say "I don't know." Fix: &lt;em&gt;output constraints.&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's basically the architecture we're about to build.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 4-layer grounding architecture
&lt;/h2&gt;

&lt;p&gt;No single trick solves this. What works is a pipeline where each layer catches what the previous one missed. Together, they reduce hallucinations by 42–68%.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Knowledge retrieval
&lt;/h3&gt;

&lt;p&gt;The core rule: &lt;strong&gt;never let your agent answer from memory.&lt;/strong&gt; Every query hits retrieval first.&lt;/p&gt;

&lt;p&gt;What that looks like depends on your data:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For documentation&lt;/strong&gt; — package your docs for instant local search:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Index your docs into a local SQLite database&lt;/span&gt;
npx @neuledge/context add ./docs &lt;span class="nt"&gt;--pkg-version&lt;/span&gt; 3.2

&lt;span class="c"&gt;# Serve them as an MCP tool&lt;/span&gt;
npx @neuledge/context mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sub-10ms access, no network calls, every answer traced to a specific doc version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For structured data&lt;/strong&gt; (catalogs, pricing, inventory) — use a unified query interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NeuledgeGraph&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@neuledge/graph&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraph&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;products&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.internal/products&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;pricing&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.internal/pricing&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;inventory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.internal/inventory&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;ttl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// 5-minute cache&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Your agent describes what it needs — the graph routes it&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;current price for product SKU-1234&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;For internal knowledge bases&lt;/strong&gt; (wikis, Confluence, Slack) — these are the hardest. Best bet: consolidate the critical stuff into proper docs first, then use the approaches above.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 2: Context management
&lt;/h3&gt;

&lt;p&gt;Getting documents is half the battle. The other half is deciding what goes into the prompt.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Send 3–5 chunks, not everything.&lt;/strong&gt; More context = more noise for the model to latch onto.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attach metadata to every chunk&lt;/strong&gt; — source URL, title, version. Gives the model something real to cite.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One retrieval per topic.&lt;/strong&gt; If the query touches pricing &lt;em&gt;and&lt;/em&gt; availability, run two searches. Mixing concerns muddies results.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What authentication methods does the API support?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The API supports OAuth 2.0 and API key authentication...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Authentication Guide&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://docs.example.com/auth&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;3.2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Layer 3: Output constraints
&lt;/h3&gt;

&lt;p&gt;Even with perfect retrieval, the model can still make stuff up. These constraints make that harder:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Require citations.&lt;/strong&gt; No source? The agent says so instead of guessing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use structured output.&lt;/strong&gt; JSON with source fields forces every claim to link to a document.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add confidence signals.&lt;/strong&gt; "Based on the v3.2 docs..." tells users the answer is grounded.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;responseSchema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;source_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;source_title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;high | medium | low&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;unsupported_claims&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Layer 4: Verification
&lt;/h3&gt;

&lt;p&gt;The safety net for everything the other layers missed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fact-check programmatically.&lt;/strong&gt; Compare claims against retrieved docs. Flag anything unsupported.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-agent review.&lt;/strong&gt; A second LLM checks the first — specifically hunting for unsupported claims, not trying to be helpful.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-in-the-loop for high stakes.&lt;/strong&gt; Medical, legal, financial — 76% of enterprises already do this. Build the workflow for it.&lt;/li&gt;
&lt;/ul&gt;
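&lt;p&gt;A minimal sketch of the programmatic fact-check, in TypeScript. Everything here (&lt;code&gt;checkClaims&lt;/code&gt;, the keyword-overlap threshold) is illustrative rather than an existing API; real systems would use entailment models or embedding similarity, but even crude lexical overlap catches the worst fabrications:&lt;/p&gt;

```typescript
// Hypothetical sketch: flag answer claims with no lexical support in the
// retrieved documents. checkClaims and Claim are illustrative names.
interface Claim {
  text: string;
  sourceUrl: string;
}

// A claim counts as "supported" here if most of its significant words
// appear in the document it cites. This keyword overlap is a crude floor,
// not a real entailment check.
function checkClaims(
  claims: Claim[],
  docs: Map<string, string>, // url -> retrieved document text
  threshold = 0.6,
): Claim[] {
  const unsupported: Claim[] = [];
  for (const claim of claims) {
    const doc = (docs.get(claim.sourceUrl) ?? "").toLowerCase();
    // Significant words: 4+ alphanumeric characters, lowercased.
    const words = claim.text.toLowerCase().match(/[a-z0-9]{4,}/g) ?? [];
    const hits = words.filter((w) => doc.includes(w)).length;
    if (words.length === 0 || hits / words.length < threshold) {
      unsupported.push(claim);
    }
  }
  return unsupported;
}
```

&lt;p&gt;Anything this flags either goes back for a retry with fresh retrieval or gets surfaced to the user as an unsupported claim.&lt;/p&gt;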

&lt;h2&gt;
  
  
  Putting it all together
&lt;/h2&gt;

&lt;p&gt;Four steps to ground a developer-facing AI agent:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Ground your docs.&lt;/strong&gt; Two commands, instant versioned access:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @neuledge/context add ./docs &lt;span class="nt"&gt;--version&lt;/span&gt; latest
npx @neuledge/context serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2: Connect live data.&lt;/strong&gt; A graph layer for anything that changes faster than your docs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NeuledgeGraph&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@neuledge/graph&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NeuledgeGraph&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.internal/docs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://status.internal/api&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Wire it up.&lt;/strong&gt; Connect both tools to your agent — every query hits grounding before generation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Watch the gaps.&lt;/strong&gt; Track ungrounded queries. They tell you exactly what docs to write next.&lt;/p&gt;
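&lt;p&gt;A sketch of what tracking ungrounded queries can look like in practice (the &lt;code&gt;GroundingLog&lt;/code&gt; name and shape are made up for illustration):&lt;/p&gt;

```typescript
// Hypothetical sketch: count queries that retrieval could not ground.
// The most-missed queries become the backlog of docs to write next.
class GroundingLog {
  private misses = new Map<string, number>();

  // Call after each retrieval; record the query when no sources came back.
  record(query: string, sourcesFound: number): void {
    if (sourcesFound > 0) return;
    const key = query.toLowerCase().trim();
    this.misses.set(key, (this.misses.get(key) ?? 0) + 1);
  }

  // Most frequently ungrounded queries, best candidates first.
  top(n = 10): Array<[string, number]> {
    return [...this.misses.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, n);
  }
}
```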

&lt;h2&gt;
  
  
  What doesn't actually work
&lt;/h2&gt;

&lt;p&gt;Save yourself some time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;"Don't hallucinate" prompts&lt;/strong&gt; — the model doesn't know what it doesn't know. You can't prompt your way out of an architecture problem.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lower temperature&lt;/strong&gt; — less random ≠ more accurate. A confident hallucination at temp 0 is still wrong.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bigger models&lt;/strong&gt; — GPT-4 still hallucinates. Even best-in-class rates of around 0.7% still mean wrong answers every day at scale.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fine-tuning on correct answers&lt;/strong&gt; — teaches style, not facts. And fine-tuned models hallucinate with &lt;em&gt;more&lt;/em&gt; confidence.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why this matters more than you think
&lt;/h2&gt;

&lt;p&gt;Hallucination damage is asymmetric: one fabricated answer wipes out the trust built by 99 correct ones. Users don't average their experience — they remember the time your agent made up a feature or quoted the wrong price.&lt;/p&gt;

&lt;p&gt;It compounds fast. Compliance risk from fabricated regulations. Developer hours debugging phantom endpoints. Customer churn from confident wrong answers.&lt;/p&gt;

&lt;p&gt;Grounding isn't something you get to eventually. It's the difference between an AI feature people trust and one they route around.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;Start with retrieval — biggest bang for your effort. Use &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;@neuledge/context&lt;/a&gt; for documentation grounding (local SQLite, MCP server, sub-10ms). Add &lt;a href="https://github.com/neuledge/graph" rel="noopener noreferrer"&gt;@neuledge/graph&lt;/a&gt; for structured live data (unified lookup, pre-cached, &amp;lt;100ms). Build up from there.&lt;/p&gt;

&lt;p&gt;Your agents are only as good as the data they're grounded in. Give them the right data, and they stop making things up.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>architecture</category>
      <category>rag</category>
    </item>
    <item>
      <title>I Built a Context7 Local-First Alternative With Claude Code</title>
      <dc:creator>Moshe Simantov</dc:creator>
      <pubDate>Sun, 08 Feb 2026 18:01:26 +0000</pubDate>
      <link>https://forem.com/moshe_io/i-built-a-context7-local-first-alternative-with-claude-code-a6f</link>
      <guid>https://forem.com/moshe_io/i-built-a-context7-local-first-alternative-with-claude-code-a6f</guid>
      <description>&lt;p&gt;I've been using Context7 as an MCP server for months. It's a neat idea: your AI agent queries a cloud service for up-to-date library docs instead of relying on stale training data. It mostly worked. Until it didn't.&lt;/p&gt;

&lt;p&gt;In January 2026, Context7 slashed their free tier from ~6,000 to 1,000 requests per month and added a 60 requests/hour rate limit. I hit those limits within the first week. Suddenly, in the middle of a coding session, my AI assistant would just... stop being helpful. It'd fall back to hallucinating Next.js 14 patterns when I needed Next.js 16, or suggest the old AI SDK v4 &lt;code&gt;streamText&lt;/code&gt; callback style when v6 has a completely different agent loop API with built-in tool orchestration. The exact problem Context7 was supposed to solve.&lt;/p&gt;

&lt;p&gt;So I built my own. It took about a week, most of it pair-programming with Claude Code. The result is &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;Context&lt;/a&gt; — a local-first documentation tool for AI agents. No cloud. No rate limits. Sub-10ms queries. And the docs are portable &lt;code&gt;.db&lt;/code&gt; files you build once and share with your whole team.&lt;/p&gt;

&lt;p&gt;Here's how it happened and what I learned.&lt;/p&gt;




&lt;h2&gt;
  
  
  The "Aha" Moment: Why Not Just Store Docs Locally?
&lt;/h2&gt;

&lt;p&gt;The core insight was embarrassingly simple. Cloud doc services like Context7 and Deepcon do three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Clone a library's docs repo&lt;/li&gt;
&lt;li&gt;Index the markdown into searchable chunks&lt;/li&gt;
&lt;li&gt;Serve results via API&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Steps 1 and 2 only need to happen &lt;strong&gt;once per library version&lt;/strong&gt;. But these services run them on their servers and charge you per query for step 3. Every. Single. Time.&lt;/p&gt;

&lt;p&gt;Why not do steps 1 and 2 locally, store the result as a file, and skip the network entirely?&lt;/p&gt;

&lt;p&gt;That's the whole idea. &lt;code&gt;context add https://github.com/vercel/next.js&lt;/code&gt; clones the repo, parses the docs, indexes everything into a SQLite database, and stores it at &lt;code&gt;~/.context/packages/nextjs@16.0.db&lt;/code&gt;. Done. That &lt;code&gt;.db&lt;/code&gt; file now contains every piece of Next.js 16 documentation, pre-indexed and ready for instant queries. No internet needed. No rate limits. No monthly bill.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building It With Claude Code
&lt;/h2&gt;

&lt;p&gt;I built the entire thing using Claude Code as my primary development partner. Not as a "generate boilerplate and fix it" assistant — as an actual collaborator on architecture decisions, implementation, and debugging.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Stack
&lt;/h3&gt;

&lt;p&gt;The project is a TypeScript monorepo. Here's what's under the hood:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;better-sqlite3&lt;/code&gt;&lt;/strong&gt; — Embedded database. No servers, no config, just a file. This is the critical choice that makes the whole thing work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQLite FTS5&lt;/strong&gt; — Full-text search with BM25 ranking and Porter stemming. The search quality is surprisingly good for what's essentially a few lines of SQL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;@modelcontextprotocol/sdk&lt;/code&gt;&lt;/strong&gt; — The MCP server SDK. This is what lets Claude, Cursor, VS Code Copilot, and others query the docs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;remark-parse&lt;/code&gt; + &lt;code&gt;unified&lt;/code&gt;&lt;/strong&gt; — Markdown AST parsing. Needed for intelligent chunking rather than dumb text splitting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;commander&lt;/code&gt; + &lt;code&gt;@inquirer/prompts&lt;/code&gt;&lt;/strong&gt; — CLI framework with interactive prompts for tag selection.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  How the Build Pipeline Works
&lt;/h3&gt;

&lt;p&gt;When you run &lt;code&gt;context add &amp;lt;repo&amp;gt;&lt;/code&gt;, here's what actually happens:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Source detection.&lt;/strong&gt; The CLI figures out if you gave it a git URL, a local directory, or a pre-built &lt;code&gt;.db&lt;/code&gt; file. Git URL parsing alone handles GitHub, GitLab, Bitbucket, Codeberg, SSH shorthand (&lt;code&gt;git@host:user/repo&lt;/code&gt;), and monorepo URL patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Shallow clone.&lt;/strong&gt; &lt;code&gt;git clone --depth 1&lt;/code&gt; — we only need the docs, not the full history. The CLI fetches available tags and lets you pick a version interactively, or you can pass &lt;code&gt;--tag v16.0.0&lt;/code&gt; for automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Docs folder detection.&lt;/strong&gt; Auto-scans for &lt;code&gt;docs/&lt;/code&gt;, &lt;code&gt;documentation/&lt;/code&gt;, or &lt;code&gt;doc/&lt;/code&gt; directories. Respects &lt;code&gt;.gitignore&lt;/code&gt;. Filters by language — defaults to English but supports &lt;code&gt;--lang all&lt;/code&gt; for multilingual repos.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Markdown parsing and chunking.&lt;/strong&gt; This is where it gets interesting. The parser:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extracts YAML frontmatter for titles and descriptions&lt;/li&gt;
&lt;li&gt;Chunks content by H2 headings (the natural unit of documentation)&lt;/li&gt;
&lt;li&gt;Targets ~800 tokens per chunk with a hard limit of 1,200&lt;/li&gt;
&lt;li&gt;Splits oversized sections at code block boundaries first, then paragraph boundaries&lt;/li&gt;
&lt;li&gt;Filters out table-of-contents sections (detected by link ratio &amp;gt;50%)&lt;/li&gt;
&lt;li&gt;Strips MDX-specific React tags (&lt;code&gt;&amp;lt;AppOnly&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;PagesOnly&amp;gt;&lt;/code&gt;, etc.)&lt;/li&gt;
&lt;li&gt;Deduplicates identical sections using content hashing&lt;/li&gt;
&lt;/ul&gt;
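&lt;p&gt;A simplified sketch of the H2 chunking idea. The real parser walks a remark AST; this line-based version only shows the splitting, size-limit, and dedupe logic (the string itself stands in for a content hash):&lt;/p&gt;

```typescript
// Simplified chunker sketch: split on H2 headings, respect a soft token
// target and hard limit, dedupe identical section bodies.
interface Chunk { title: string; content: string }

const TARGET_TOKENS = 800;  // soft target per chunk
const HARD_LIMIT = 1200;    // never exceed this
const estimateTokens = (s: string) => Math.ceil(s.length / 4); // rough heuristic

function chunkByH2(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  const seen = new Set<string>(); // dedupe identical sections
  const sections = markdown.split(/\n(?=## )/);
  for (const section of sections) {
    const title = section.match(/^## (.+)/)?.[1] ?? "(intro)";
    const body = section.replace(/^## .+\n?/, "").trim();
    if (!body) continue;
    // Split oversized sections at paragraph boundaries.
    const parts: string[] = [];
    let current = "";
    for (const para of body.split(/\n\n+/)) {
      const candidate = current ? current + "\n\n" + para : para;
      if (
        current &&
        (estimateTokens(candidate) > HARD_LIMIT ||
          estimateTokens(current) >= TARGET_TOKENS)
      ) {
        parts.push(current);
        current = para;
      } else {
        current = candidate;
      }
    }
    if (current) parts.push(current);
    for (const content of parts) {
      if (seen.has(content)) continue;
      seen.add(content);
      chunks.push({ title, content });
    }
  }
  return chunks;
}
```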

&lt;p&gt;&lt;strong&gt;5. SQLite packaging.&lt;/strong&gt; Everything goes into a single &lt;code&gt;.db&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;doc_path&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;doc_title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;section_title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;has_code&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;VIRTUAL&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;chunks_fts&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;fts5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;doc_title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section_title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'chunks'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content_rowid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;tokenize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'porter unicode61'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The FTS5 virtual table with Porter stemming means "authentication middleware" matches "authenticating in middleware" without any fancy NLP. BM25 ranking weights section titles at 10x and doc titles at 5x over body content, which makes results feel relevant without needing embeddings.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Search Pipeline: Keeping It Simple
&lt;/h2&gt;

&lt;p&gt;When Claude (or any MCP client) calls &lt;code&gt;get_docs({ library: "nextjs@16.0", topic: "middleware" })&lt;/code&gt;, the search pipeline runs entirely in-process:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FTS5 query → BM25 ranking → Relevance filter → Token budget → Merge adjacent → Format
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The relevance filter drops anything scoring below 50% of the top result. The token budget caps output at 2,000 tokens — enough to be useful without flooding the context window. Adjacent chunks from the same document get merged back together so the AI sees coherent sections instead of fragments.&lt;/p&gt;
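&lt;p&gt;The three post-ranking stages can be sketched like this (field names are illustrative, and scores are assumed normalized so that higher means more relevant; raw FTS5 &lt;code&gt;bm25()&lt;/code&gt; scores are negative):&lt;/p&gt;

```typescript
// Sketch of the relevance filter, token budget, and adjacent-chunk merge.
interface Result {
  docPath: string;
  chunkId: number;
  content: string;
  score: number;  // higher is more relevant (assumed normalized upstream)
  tokens: number;
}

function postProcess(ranked: Result[], budget = 2000): Result[] {
  if (ranked.length === 0) return [];
  // 1. Relevance filter: drop anything below 50% of the top score.
  const floor = ranked[0].score * 0.5;
  const relevant = ranked.filter((r) => r.score >= floor);
  // 2. Token budget: stop once the output would exceed the cap.
  const kept: Result[] = [];
  let used = 0;
  for (const r of relevant) {
    if (used + r.tokens > budget) break;
    kept.push(r);
    used += r.tokens;
  }
  // 3. Merge adjacent chunks from the same document into coherent sections.
  kept.sort((a, b) =>
    a.docPath === b.docPath
      ? a.chunkId - b.chunkId
      : a.docPath.localeCompare(b.docPath),
  );
  const merged: Result[] = [];
  for (const r of kept) {
    const prev = merged[merged.length - 1];
    if (prev && prev.docPath === r.docPath && r.chunkId === prev.chunkId + 1) {
      prev.content += "\n\n" + r.content;
      prev.chunkId = r.chunkId;
      prev.tokens += r.tokens;
    } else {
      merged.push({ ...r });
    }
  }
  return merged;
}
```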

&lt;p&gt;Total latency: under 10ms. Compare that to 100-500ms for a cloud round-trip, plus the time your AI agent spends waiting before it can continue reasoning.&lt;/p&gt;

&lt;p&gt;This matters more than it sounds. AI coding agents make dozens of tool calls per session. If each doc lookup adds 300ms of network latency, that's seconds of dead time per interaction. Locally, it's effectively free.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Win: Build Once, Share Everywhere
&lt;/h2&gt;

&lt;p&gt;Here's the feature I'm most excited about, and the one I think cloud services fundamentally can't match.&lt;/p&gt;

&lt;p&gt;When you build a documentation package, the result is a single &lt;code&gt;.db&lt;/code&gt; file. That file is completely self-contained — metadata, content, search index, everything. You can:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build and export&lt;/span&gt;
context add https://github.com/your-org/design-system &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; design-system &lt;span class="nt"&gt;--pkg-version&lt;/span&gt; 3.1 &lt;span class="nt"&gt;--save&lt;/span&gt; ./packages/

&lt;span class="c"&gt;# The result: a portable file&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; packages/design-system@3.1.db
&lt;span class="c"&gt;# 2.4 MB - your entire design system docs, indexed and ready&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now share that file however you want. Upload it to an S3 bucket. Commit it to a repo. Put it on a shared drive. Your teammates install it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;context add https://your-cdn.com/design-system@3.1.db
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No build step on their end. No cloning repos. No waiting for indexing. The pre-built package installs instantly because it's already indexed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is the key architectural advantage of local-first.&lt;/strong&gt; With cloud services, every user pays the query cost. With local packages, you pay the build cost once and distribute the result. It's the same principle as compiled binaries vs. interpreted scripts — do the expensive work ahead of time.&lt;/p&gt;

&lt;p&gt;For internal libraries, this is huge. You can document your internal APIs, build a package in CI, publish it alongside your npm package, and every developer on the team has instant, private, offline access to up-to-date docs. No cloud service sees your proprietary API queries.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned Building With Claude Code
&lt;/h2&gt;

&lt;p&gt;A few honest observations from using Claude Code as my primary development tool:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It's genuinely good at plumbing code.&lt;/strong&gt; Git URL parsing, CLI argument handling, SQLite schema design — the kind of code that's tedious but needs to be correct. Claude Code knocked these out quickly and accurately. The git module handles edge cases I wouldn't have thought of: monorepo tag formats like &lt;code&gt;@ai-sdk/gateway@1.2.3&lt;/code&gt;, SSH shorthand URLs, stripping &lt;code&gt;-docs&lt;/code&gt; suffixes from repo names.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It struggles with "taste" decisions.&lt;/strong&gt; Things like: what should the chunk size be? How aggressively should we filter low-relevance results? What BM25 weights feel right? These needed human judgment and iteration. I'd try values, test against real docs, adjust, repeat. Claude Code helped implement each variation quickly, but the decision of which one felt right was mine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The iteration speed is the real superpower.&lt;/strong&gt; The whole project — CLI, build pipeline, search engine, MCP server, tests — came together in about a week. Not because the code is trivial (the markdown parsing alone handles a dozen edge cases), but because the feedback loop was tight. Describe what you want, review what you get, adjust, move on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test-driven prompting works well.&lt;/strong&gt; I'd often describe the behavior I wanted in terms of test cases: "this markdown input should produce these chunks." Claude Code would write both the implementation and the tests. When they didn't match, we'd figure out why together.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;Here's where Context stands versus the cloud alternatives:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Context7&lt;/th&gt;
&lt;th&gt;Deepcon&lt;/th&gt;
&lt;th&gt;Neuledge&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Price&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$10/month&lt;/td&gt;
&lt;td&gt;$8/month&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Free tier&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1,000 req/month&lt;/td&gt;
&lt;td&gt;100 req/month&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rate limits&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;60 req/hour&lt;/td&gt;
&lt;td&gt;Throttled&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;100-500ms&lt;/td&gt;
&lt;td&gt;100-300ms&lt;/td&gt;
&lt;td&gt;&amp;lt;10ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Works offline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Privacy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cloud&lt;/td&gt;
&lt;td&gt;Cloud&lt;/td&gt;
&lt;td&gt;100% local&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Private repos&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$15/1M tokens&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Setting It Up
&lt;/h2&gt;

&lt;p&gt;If you want to try it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @neuledge/context

&lt;span class="c"&gt;# Add some docs&lt;/span&gt;
context add https://github.com/vercel/next.js
context add https://github.com/vercel/ai

&lt;span class="c"&gt;# Connect to your AI agent (Claude Code example)&lt;/span&gt;
claude mcp add context &lt;span class="nt"&gt;--&lt;/span&gt; context serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works with Claude Desktop, Cursor, VS Code Copilot, Windsurf, Zed, and Goose. Any MCP-compatible agent, really. The MCP server exposes a single &lt;code&gt;get_docs&lt;/code&gt; tool with a dynamic enum of installed libraries — the AI sees exactly what's available and queries it when relevant.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The search is currently keyword-based (FTS5 + BM25). It works well for direct queries like "middleware authentication" or "ai sdk agent loop," but it doesn't understand semantic similarity. "How do I protect routes?" won't match a section titled "Authentication Guards" unless the words overlap.&lt;/p&gt;

&lt;p&gt;I'm planning to add local embeddings for semantic search — still fully offline, probably using ONNX Runtime with a small model. The SQLite architecture makes this straightforward: add an embeddings table, compute vectors at build time, query with cosine similarity at search time.&lt;/p&gt;

&lt;p&gt;I'm also thinking about a GraphRAG-style relations table for traversing connected documentation. When you ask about middleware, you probably also want to know about authentication, routing, and error handling. A relations graph could surface those automatically.&lt;/p&gt;

&lt;p&gt;And a package registry — a GitHub-based index where the community can discover and share pre-built documentation packages. Instead of everyone independently building the same Next.js docs, build it once and publish it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;The core lesson from this project: &lt;strong&gt;not everything needs to be a cloud service.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Documentation for AI agents is a perfect case for local-first. The data changes infrequently (per library version), the queries need to be fast (agents make lots of them), privacy matters (you're asking about your codebase), and the "build once, use forever" model is a natural fit.&lt;/p&gt;

&lt;p&gt;If you're frustrated with rate limits, latency, or paying monthly for something that should be a static file — &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;give it a try&lt;/a&gt;. It's open source (Apache-2.0), it's free, and it works offline.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Context MCP is open source at &lt;a href="https://github.com/neuledge/context" rel="noopener noreferrer"&gt;github.com/neuledge/context&lt;/a&gt;. Published on npm as &lt;code&gt;@neuledge/context&lt;/code&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>vibecoding</category>
      <category>mcp</category>
      <category>llm</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
