<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Daniel da Rosa</title>
    <description>The latest articles on Forem by Daniel da Rosa (@daniel_darosa_8b4804e356).</description>
    <link>https://forem.com/daniel_darosa_8b4804e356</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3626284%2F38260b7e-208a-470f-8422-f8f64d29f65c.jpg</url>
      <title>Forem: Daniel da Rosa</title>
      <link>https://forem.com/daniel_darosa_8b4804e356</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/daniel_darosa_8b4804e356"/>
    <language>en</language>
    <item>
      <title>Why I Stopped Sending Data to LLMs: Introducing "Zero-Data Transport" Architecture</title>
      <dc:creator>Daniel da Rosa</dc:creator>
      <pubDate>Mon, 24 Nov 2025 01:35:31 +0000</pubDate>
      <link>https://forem.com/daniel_darosa_8b4804e356/why-i-stopped-sending-data-to-llms-introducing-zero-data-transport-architecture-fl4</link>
      <guid>https://forem.com/daniel_darosa_8b4804e356/why-i-stopped-sending-data-to-llms-introducing-zero-data-transport-architecture-fl4</guid>
      <description>&lt;h1&gt;
  
  
  Why I Stopped Sending Data to LLMs: Introducing "Zero-Data Transport" Architecture
&lt;/h1&gt;

&lt;h2&gt;
  
  
  The Problem with "Chat with your Data"
&lt;/h2&gt;

&lt;p&gt;Let's be honest: the standard approach to &lt;strong&gt;RAG (Retrieval-Augmented Generation)&lt;/strong&gt; for structured data is broken.&lt;/p&gt;

&lt;p&gt;You know the drill: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;User asks a question -&amp;gt; You run a query -&amp;gt; You fetch 500 rows -&amp;gt; You stuff those 500 rows into the LLM context -&amp;gt; You pray it doesn't hallucinate (or go broke on token costs).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I realized this approach doesn't scale for enterprise ERPs with huge schemas and large result sets. So I decided to flip the script.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What if we never sent the data to the AI?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Meet ADA: The "Zero-Data Transport" Agent
&lt;/h2&gt;

&lt;p&gt;I’ve been architecting &lt;strong&gt;ADA (Autonomous Data Agent)&lt;/strong&gt;, a system designed to solve the "Context Window" problem using a technique I call &lt;strong&gt;Zero-Data Transport&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The concept is simple but powerful: treat data context like a &lt;strong&gt;.zip file&lt;/strong&gt;. &lt;br&gt;
Instead of sending the raw data payload to the LLM, we send a &lt;strong&gt;Context ID&lt;/strong&gt;. The data remains in the database/cache, and the AI only manipulates the logic, never the rows.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Secret Sauce: CTE Injection Strategy
&lt;/h2&gt;

&lt;p&gt;Here is the engineering breakthrough. When a user asks to "filter the previous results", instead of re-sending the data, we use &lt;strong&gt;Common Table Expressions (CTEs)&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. The "Zip" (Redis)
&lt;/h3&gt;

&lt;p&gt;When a query runs, we save the SQL logic and metadata in Redis. The rows go back to the user as usual, but the Agent receives only a Token ID (e.g., &lt;code&gt;CTX_123&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9gsg0y8n3tzp4lva7z4l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9gsg0y8n3tzp4lva7z4l.png" alt="The Zip Strategy Diagram" width="800" height="604"&gt;&lt;/a&gt;&lt;/p&gt;
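&lt;p&gt;Here is a minimal Python sketch of that caching step (the real system uses Java and Redis with a TTL via &lt;code&gt;SETEX&lt;/code&gt;; the &lt;code&gt;ContextStore&lt;/code&gt; class and its in-memory dict below are illustrative stand-ins):&lt;/p&gt;

```python
import json
import uuid

class ContextStore:
    """In-memory stand-in for the Redis session store."""

    def __init__(self):
        self._cache = {}

    def save(self, sql, columns):
        """Cache the executed SQL plus metadata, return an opaque token."""
        ctx_id = "CTX_" + uuid.uuid4().hex[:8]
        self._cache[ctx_id] = json.dumps({"sql": sql, "columns": columns})
        return ctx_id

    def load(self, ctx_id):
        """Retrieve the cached logic by token, never the rows themselves."""
        return json.loads(self._cache[ctx_id])

store = ContextStore()
token = store.save(
    "SELECT item, SUM(total) AS total FROM sales GROUP BY item",
    ["item", "total"],
)
# The Agent only ever sees `token`, never the result set.
```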
&lt;h3&gt;
  
  
  2. The Logic (LLM)
&lt;/h3&gt;

&lt;p&gt;The LLM receives a prompt like: &lt;em&gt;"You have a virtual table called PREVIOUS_RESULT. User wants top 5. Write the SQL."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The LLM output is tiny and cheap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;PREVIOUS_RESULT&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt; &lt;span class="k"&gt;FETCH&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;ONLY&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
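&lt;p&gt;The compact prompt can be assembled from the cached metadata alone — a sketch (the function name and exact wording are my own, not a fixed API):&lt;/p&gt;

```python
def build_prompt(columns, user_request):
    """Build the compact prompt: the virtual table's schema, never its rows."""
    return (
        "You have a virtual table called PREVIOUS_RESULT "
        f"with columns: {', '.join(columns)}. "
        f"User wants: {user_request}. Write the SQL."
    )

# Only column names and the user's intent are sent -- a handful of tokens.
prompt = build_prompt(["item", "total"], "top 5 by total")
```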



&lt;h3&gt;
  
  
  3. The Injection (Backend)
&lt;/h3&gt;

&lt;p&gt;My orchestrator (Java) pulls the original SQL from Redis and injects it into a CTE:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;PREVIOUS_RESULT&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="c1"&gt;-- The original monster query from Redis is injected here&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="nb"&gt;year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2024&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;PREVIOUS_RESULT&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt; &lt;span class="k"&gt;FETCH&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;ONLY&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
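&lt;p&gt;The injection step itself is a small string transformation. My orchestrator is Java, but the logic fits in a few lines of Python (names here are illustrative):&lt;/p&gt;

```python
def inject_cte(cached_sql, agent_sql, alias="PREVIOUS_RESULT"):
    """Wrap the cached query in a CTE and append the LLM-written follow-up."""
    # Strip a trailing semicolon: Oracle rejects one inside a WITH clause.
    cached_sql = cached_sql.strip().rstrip(";")
    return f"WITH {alias} AS (\n    {cached_sql}\n)\n{agent_sql.strip()}"

final_sql = inject_cte(
    "SELECT item, SUM(total) AS total FROM sales WHERE year = 2024 GROUP BY item;",
    "SELECT * FROM PREVIOUS_RESULT ORDER BY total DESC FETCH FIRST 5 ROWS ONLY",
)
```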



&lt;p&gt;&lt;strong&gt;Result?&lt;/strong&gt; Zero data transferred to the cloud. The model never sees the rows, so it cannot hallucinate their values. &lt;strong&gt;Roughly 95% token savings.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57l9cbqa7nmkm5nbagjy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57l9cbqa7nmkm5nbagjy.png" alt="CTE Injection Flowchart" width="800" height="2325"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Stack (Enterprise Grade)
&lt;/h2&gt;

&lt;p&gt;To make this robust, I moved away from simple vector stores and built a &lt;strong&gt;Converged Architecture&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Oracle 23ai:&lt;/strong&gt; Handles both Relational Data and Vector Embeddings in the same engine. No more sync lag.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Neo4j:&lt;/strong&gt; Acts as the "GPS". It validates JOIN paths so the LLM doesn't invent relationships that don't exist.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis:&lt;/strong&gt; The ephemeral memory (Session Store) for our Context IDs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ie5mv349c7zziwvxq7l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ie5mv349c7zziwvxq7l.png" alt="ADA Architecture Stack" width="800" height="574"&gt;&lt;/a&gt;&lt;/p&gt;
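&lt;p&gt;To illustrate the Neo4j "GPS" idea: validating a JOIN path is a graph reachability check over known foreign-key edges. The real system runs a Cypher query; the pure-Python walk and the schema edges below are invented for illustration:&lt;/p&gt;

```python
from collections import deque

# Invented schema graph: edges are known FK relationships.
SCHEMA_EDGES = {
    "sales": ["items", "customers"],
    "items": ["sales", "products"],
    "customers": ["sales"],
    "products": ["items"],
    "audit_log": [],  # deliberately disconnected
}

def join_path_exists(start, goal):
    """BFS over the schema graph: only JOINs along real FK edges are allowed."""
    seen, queue = {start}, deque([start])
    while queue:
        table = queue.popleft()
        if table == goal:
            return True
        for neighbor in SCHEMA_EDGES.get(table, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False
```

&lt;p&gt;If the LLM proposes a JOIN between two tables with no path, the orchestrator rejects the SQL before it ever runs.&lt;/p&gt;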

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;We are moving from "Chatbots" to &lt;strong&gt;Agentic Engineering&lt;/strong&gt;. By using &lt;strong&gt;Semantic Compression&lt;/strong&gt; (mapping complex schemas to simple aliases) and &lt;strong&gt;CTE Injection&lt;/strong&gt;, we turn fragile demos into robust, secure, and cheap enterprise software.&lt;/p&gt;
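&lt;p&gt;Semantic Compression can be as simple as a two-way alias map: the LLM reasons over short names, and the orchestrator expands them before execution. The verbose table names below are invented for illustration:&lt;/p&gt;

```python
import re

# Invented verbose enterprise table names, mapped to short aliases for the LLM.
ALIASES = {
    "WMS_SALES_ORDER_HEADER_V2": "sales",
    "WMS_ORDER_LINE_ITEM_V2": "items",
}
REVERSE = {alias: real for real, alias in ALIASES.items()}

def expand_sql(sql):
    """Rewrite the LLM's alias-based SQL to real table names before execution."""
    for alias, real in REVERSE.items():
        sql = re.sub(rf"\b{alias}\b", real, sql)
    return sql

expanded = expand_sql("SELECT * FROM sales JOIN items ON sales.id = items.sale_id")
```

&lt;p&gt;The prompt stays tiny because the model never sees the real schema names, and the database never sees the aliases.&lt;/p&gt;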

&lt;p&gt;I'm currently implementing a &lt;strong&gt;Self-Optimization Layer&lt;/strong&gt; where the system learns from "Context Misses" to create its own guardrails. But that's a topic for Part 2.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffnr8cf5m2v1qemshwtuj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffnr8cf5m2v1qemshwtuj.png" alt="Self Optimization Loop" width="800" height="105"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does your RAG architecture handle 10k rows without breaking the bank? Let's discuss in the comments!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>rag</category>
      <category>oracle</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
