<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ganesh Navale</title>
    <description>The latest articles on Forem by Ganesh Navale (@ganesh_navale).</description>
    <link>https://forem.com/ganesh_navale</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3719154%2F2e29b508-97f9-47f5-84f9-d2e2db56c69c.jpeg</url>
      <title>Forem: Ganesh Navale</title>
      <link>https://forem.com/ganesh_navale</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ganesh_navale"/>
    <language>en</language>
    <item>
      <title>How to Stream AI Responses from Rails API to React (SSE Magic ✨)</title>
      <dc:creator>Ganesh Navale</dc:creator>
      <pubDate>Sat, 31 Jan 2026 09:40:01 +0000</pubDate>
      <link>https://forem.com/ganesh_navale/how-to-stream-ai-responses-from-rails-api-to-react-sse-magic--bd4</link>
      <guid>https://forem.com/ganesh_navale/how-to-stream-ai-responses-from-rails-api-to-react-sse-magic--bd4</guid>
      <description>&lt;h2&gt;
  
  
  🤔 Ever wonder how to stream AI responses to React from Rails?
&lt;/h2&gt;

&lt;p&gt;You know that feeling when your users click "Improve with AI" and stare at a loading spinner for 30 seconds? Yeah, not great UX.&lt;/p&gt;

&lt;p&gt;Let me show you how to make AI responses appear &lt;strong&gt;word-by-word&lt;/strong&gt; like ChatGPT, without the complexity of WebSockets.&lt;/p&gt;




&lt;h2&gt;
  
  
  👉 The Problem
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Traditional approach:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;React app → Rails API → OpenAI → Wait 30 seconds → Full response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What user sees:&lt;/strong&gt; Loading spinner forever 😴&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The user experience:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clicks button&lt;/li&gt;
&lt;li&gt;Waits... and waits...&lt;/li&gt;
&lt;li&gt;Wonders if it's broken&lt;/li&gt;
&lt;li&gt;Considers closing the tab&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ✨ The Solution: Server-Sent Events (SSE)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Modern approach:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;React app → Rails SSE endpoint → Streams tokens → Real-time updates
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What user sees:&lt;/strong&gt; AI typing word-by-word, just like ChatGPT ✨&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why SSE?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Simpler than WebSockets&lt;/li&gt;
&lt;li&gt;✅ Works over standard HTTP&lt;/li&gt;
&lt;li&gt;✅ Native browser streaming support (ReadableStream)&lt;/li&gt;
&lt;li&gt;✅ Perfect for one-way streaming (server → client)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🛠️ Let's Build It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Rails API Backend
&lt;/h3&gt;

&lt;p&gt;First, create the SSE helper:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/sse.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SSE&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"event: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;
    &lt;span class="vi"&gt;@stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_json&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;close&lt;/span&gt;
    &lt;span class="vi"&gt;@stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Create the Controller
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/controllers/api/v1/ai_improvements_controller.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Api::V1::AiImprovementsController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="kp"&gt;include&lt;/span&gt; &lt;span class="no"&gt;ActionController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Live&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'Content-Type'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'text/event-stream'&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'X-Accel-Buffering'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'no'&lt;/span&gt; &lt;span class="c1"&gt;# Disable nginx buffering&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'Access-Control-Allow-Origin'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'FRONTEND_URL'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;sse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;SSE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:content&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Your AI service that streams tokens&lt;/span&gt;
    &lt;span class="no"&gt;AiContentImprover&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;improve_streaming&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
      &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="ss"&gt;token: &lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="s1"&gt;'token'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="ss"&gt;done: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="s1"&gt;'complete'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;IOError&lt;/span&gt;
    &lt;span class="c1"&gt;# Client disconnected - this is normal&lt;/span&gt;
  &lt;span class="k"&gt;ensure&lt;/span&gt;
    &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: AI Service with Streaming
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/ai_content_improver.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiContentImprover&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;
    &lt;span class="vi"&gt;@client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;access_token: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'OPENAI_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;improve_streaming&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4-turbo-preview"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;messages: &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="s2"&gt;"Improve this content. Make it clearer and more engaging."&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="vi"&gt;@content&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="ss"&gt;stream: &lt;/span&gt;&lt;span class="nb"&gt;proc&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_bytesize&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
          &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"delta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;
        &lt;span class="k"&gt;end&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Routes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/routes.rb&lt;/span&gt;
&lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;application&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;routes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;draw&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;namespace&lt;/span&gt; &lt;span class="ss"&gt;:api&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;namespace&lt;/span&gt; &lt;span class="ss"&gt;:v1&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="s1"&gt;'ai_improvements'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;to: &lt;/span&gt;&lt;span class="s1"&gt;'ai_improvements#create'&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ⚛️ React Frontend
&lt;/h2&gt;

&lt;p&gt;Now the fun part - consuming the SSE stream in React:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useEffect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;AiAssistant&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;suggestion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setSuggestion&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setLoading&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useRef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="c1"&gt;// Cleanup on unmount&lt;/span&gt;
  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;improveContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Cancel any existing request&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nf"&gt;setLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;setSuggestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AbortController&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;API_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:3000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;API_URL&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/api/v1/ai_improvements`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Accept&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/event-stream&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="na"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;

      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`HTTP error! status: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getReader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;

      &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;

        &lt;span class="nx"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nx"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;

        &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;line&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;data: &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;

          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nf"&gt;setSuggestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AbortError&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Stream error:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;setLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"ai-assistant"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt; 
        &lt;span class="na"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;improveContent&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; 
        &lt;span class="na"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
        &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"improve-btn"&lt;/span&gt;
      &lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;✨ AI thinking...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;✨ Improve with AI&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;suggestion&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"suggestion-box"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;suggestion&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;AiAssistant&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🎨 Add Some Style
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight css"&gt;&lt;code&gt;&lt;span class="nc"&gt;.ai-assistant&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20px&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nc"&gt;.improve-btn&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;linear-gradient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;135deg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;#667eea&lt;/span&gt; &lt;span class="m"&gt;0%&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;#764ba2&lt;/span&gt; &lt;span class="m"&gt;100%&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nl"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="no"&gt;white&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;border&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;none&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;12px&lt;/span&gt; &lt;span class="m"&gt;24px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;border-radius&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;pointer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;font-weight&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;transition&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt; &lt;span class="m"&gt;0.2s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nc"&gt;.improve-btn&lt;/span&gt;&lt;span class="nd"&gt;:hover&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;translateY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;-2px&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nc"&gt;.improve-btn&lt;/span&gt;&lt;span class="nd"&gt;:disabled&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;opacity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;not-allowed&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nc"&gt;.suggestion-box&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;margin-top&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;16px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;16px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#f7fafc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;border-left&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4px&lt;/span&gt; &lt;span class="nb"&gt;solid&lt;/span&gt; &lt;span class="m"&gt;#667eea&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;border-radius&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;line-height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1.6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;white-space&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pre-wrap&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🎉 The Magic Happens
&lt;/h2&gt;

&lt;p&gt;When the user clicks "✨ Improve with AI":&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Browser creates connection using fetch with &lt;code&gt;ReadableStream&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Rails streams AI tokens one-by-one&lt;/li&gt;
&lt;li&gt;JavaScript appends each token to the UI&lt;/li&gt;
&lt;li&gt;User sees AI "typing" in real-time&lt;/li&gt;
&lt;li&gt;Connection closes automatically when done&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✨ Token-by-token streaming&lt;/li&gt;
&lt;li&gt;📱 Native browser API (no library needed)&lt;/li&gt;
&lt;li&gt;🔄 Easy to add retry logic if needed&lt;/li&gt;
&lt;li&gt;🚀 Much cleaner than WebSockets for this use case&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔥 Advanced: Add Rate Limiting
&lt;/h2&gt;

&lt;p&gt;Prevent users from spamming your expensive AI endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/controllers/api/v1/ai_improvements_controller.rb&lt;/span&gt;
&lt;span class="n"&gt;before_action&lt;/span&gt; &lt;span class="ss"&gt;:check_rate_limit&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_rate_limit&lt;/span&gt;
  &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ai_improve:&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;minute&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;

  &lt;span class="n"&gt;sse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;SSE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="ss"&gt;error: &lt;/span&gt;&lt;span class="s2"&gt;"⏸️ Slow down! Try again in a minute."&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Update React to handle errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In the stream parsing loop&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  💰 Bonus: Cost Tracking
&lt;/h2&gt;

&lt;p&gt;Track your AI usage to avoid surprise bills:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/models/ai_usage.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiUsage&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:user&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;total_cost_today&lt;/span&gt;
    &lt;span class="n"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at &amp;gt;= ?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:estimated_cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# In your controller:&lt;/span&gt;
&lt;span class="n"&gt;usage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;AiUsage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="ss"&gt;user: &lt;/span&gt;&lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;action: &lt;/span&gt;&lt;span class="s1"&gt;'content_improvement'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;input_tokens: &lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;length&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# rough estimate&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# After streaming completes:&lt;/span&gt;
&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;output_tokens: &lt;/span&gt;&lt;span class="n"&gt;token_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;estimated_cost: &lt;/span&gt;&lt;span class="n"&gt;calculate_cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Tip:&lt;/strong&gt; For production, use the &lt;code&gt;rack-cors&lt;/code&gt; gem instead of inline headers to properly handle preflight requests.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  📊 SSE vs WebSocket vs Polling
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;SSE (fetch)&lt;/th&gt;
&lt;th&gt;WebSocket&lt;/th&gt;
&lt;th&gt;Long Polling&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⭐ Simple&lt;/td&gt;
&lt;td&gt;⭐⭐⭐ Complex&lt;/td&gt;
&lt;td&gt;⭐⭐ Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HTTP Methods&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ All (GET, POST)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;✅ All&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Custom Headers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Full control&lt;/td&gt;
&lt;td&gt;⚠️ Limited&lt;/td&gt;
&lt;td&gt;✅ Full control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bidirectional&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ Server → Client&lt;/td&gt;
&lt;td&gt;✅ Both ways&lt;/td&gt;
&lt;td&gt;❌ Client → Server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Auto Reconnect&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ Manual&lt;/td&gt;
&lt;td&gt;❌ Manual&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Connection Overhead&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low (single HTTP)&lt;/td&gt;
&lt;td&gt;Low (persistent)&lt;/td&gt;
&lt;td&gt;High (repeated requests)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⭐⭐ Good&lt;/td&gt;
&lt;td&gt;⭐⭐⭐ Excellent&lt;/td&gt;
&lt;td&gt;⭐ Poor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Proxy/Firewall Friendly&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes (HTTP)&lt;/td&gt;
&lt;td&gt;⚠️ Sometimes blocked&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Browser Support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ All modern&lt;/td&gt;
&lt;td&gt;✅ All modern&lt;/td&gt;
&lt;td&gt;✅ All&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best For&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AI streaming, notifications&lt;/td&gt;
&lt;td&gt;Real-time chat, gaming&lt;/td&gt;
&lt;td&gt;Legacy systems&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;For AI streaming: SSE wins!&lt;/strong&gt; ✨&lt;/p&gt;

&lt;p&gt;Why? AI responses are inherently one-directional (server → client), making WebSocket's bidirectional capability unnecessary overhead. SSE with &lt;code&gt;fetch&lt;/code&gt; gives you POST support for sending prompts while streaming responses back.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 What's Next?
&lt;/h2&gt;

&lt;p&gt;Want to level up? Try:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add Anthropic Claude streaming&lt;/strong&gt; (different API, same concept)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implement retry logic&lt;/strong&gt; for failed streams&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add progress indicators&lt;/strong&gt; showing % complete&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache similar requests&lt;/strong&gt; to save costs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Support multiple AI models&lt;/strong&gt; (let users choose)&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  💬 Over to You
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Have you used SSE before? How was your experience?&lt;/li&gt;
&lt;li&gt;Would you choose SSE or WebSockets for AI streaming?&lt;/li&gt;
&lt;li&gt;What other AI features are you building in your Rails + React apps?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Drop a comment below! 👇&lt;/p&gt;




&lt;h2&gt;
  
  
  🔗 Useful Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events" rel="noopener noreferrer"&gt;MDN: Server-Sent Events&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.openai.com/docs/api-reference/streaming" rel="noopener noreferrer"&gt;OpenAI Streaming API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://api.rubyonrails.org/classes/ActionController/Live.html" rel="noopener noreferrer"&gt;Rails ActionController::Live&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Found this helpful?&lt;/strong&gt; Give it a ❤️ and follow me for more Rails + React + AI tips!&lt;/p&gt;

&lt;p&gt;Happy coding! 🚀&lt;/p&gt;

&lt;h1&gt;
  
  
  rails #react #ai #sse #openai #streaming #webdev
&lt;/h1&gt;

</description>
      <category>rails</category>
      <category>react</category>
      <category>ai</category>
      <category>sse</category>
    </item>
    <item>
      <title>How to Stream AI Responses from Rails API to React (SSE Magic ✨)</title>
      <dc:creator>Ganesh Navale</dc:creator>
      <pubDate>Sat, 31 Jan 2026 09:20:38 +0000</pubDate>
      <link>https://forem.com/ganesh_navale/how-to-stream-ai-responses-from-rails-api-to-react-sse-magic--1akc</link>
      <guid>https://forem.com/ganesh_navale/how-to-stream-ai-responses-from-rails-api-to-react-sse-magic--1akc</guid>
      <description>&lt;h2&gt;
  
  
  🤔 Ever wonder how to stream AI responses to React from Rails?
&lt;/h2&gt;

&lt;p&gt;You know that feeling when your users click "Improve with AI" and stare at a loading spinner for 30 seconds? Yeah, not great UX.&lt;/p&gt;

&lt;p&gt;Let me show you how to make AI responses appear &lt;strong&gt;word-by-word&lt;/strong&gt; like ChatGPT, without the complexity of WebSockets.&lt;/p&gt;




&lt;h2&gt;
  
  
  👉 The Problem
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Traditional approach:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;React app → Rails API → OpenAI → Wait 30 seconds → Full response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What user sees:&lt;/strong&gt; Loading spinner forever 😴&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The user experience:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clicks button&lt;/li&gt;
&lt;li&gt;Waits... and waits...&lt;/li&gt;
&lt;li&gt;Wonders if it's broken&lt;/li&gt;
&lt;li&gt;Considers closing the tab&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ✨ The Solution: Server-Sent Events (SSE)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Modern approach:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;React app → Rails SSE endpoint → Streams tokens → Real-time updates
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What user sees:&lt;/strong&gt; AI typing word-by-word, just like ChatGPT ✨&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why SSE?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Simpler than WebSockets&lt;/li&gt;
&lt;li&gt;✅ Works over standard HTTP&lt;/li&gt;
&lt;li&gt;✅ Native browser streaming support (ReadableStream)&lt;/li&gt;
&lt;li&gt;✅ Perfect for one-way streaming (server → client)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🛠️ Let's Build It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Rails API Backend
&lt;/h3&gt;

&lt;p&gt;First, create the SSE helper:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/sse.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SSE&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"event: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;
    &lt;span class="vi"&gt;@stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_json&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;close&lt;/span&gt;
    &lt;span class="vi"&gt;@stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Create the Controller
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/controllers/api/v1/ai_improvements_controller.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Api::V1::AiImprovementsController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="kp"&gt;include&lt;/span&gt; &lt;span class="no"&gt;ActionController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Live&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'Content-Type'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'text/event-stream'&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'X-Accel-Buffering'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'no'&lt;/span&gt; &lt;span class="c1"&gt;# Disable nginx buffering&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'Access-Control-Allow-Origin'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'FRONTEND_URL'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;sse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;SSE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:content&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Your AI service that streams tokens&lt;/span&gt;
    &lt;span class="no"&gt;AiContentImprover&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;improve_streaming&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
      &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="ss"&gt;token: &lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="s1"&gt;'token'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="ss"&gt;done: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="s1"&gt;'complete'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;IOError&lt;/span&gt;
    &lt;span class="c1"&gt;# Client disconnected - this is normal&lt;/span&gt;
  &lt;span class="k"&gt;ensure&lt;/span&gt;
    &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: AI Service with Streaming
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/ai_content_improver.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiContentImprover&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;
    &lt;span class="vi"&gt;@client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;access_token: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'OPENAI_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;improve_streaming&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4-turbo-preview"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;messages: &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="s2"&gt;"Improve this content. Make it clearer and more engaging."&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="vi"&gt;@content&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="ss"&gt;stream: &lt;/span&gt;&lt;span class="nb"&gt;proc&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_bytesize&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
          &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"delta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;
        &lt;span class="k"&gt;end&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Routes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/routes.rb&lt;/span&gt;
&lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;application&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;routes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;draw&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;namespace&lt;/span&gt; &lt;span class="ss"&gt;:api&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;namespace&lt;/span&gt; &lt;span class="ss"&gt;:v1&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="s1"&gt;'ai_improvements'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;to: &lt;/span&gt;&lt;span class="s1"&gt;'ai_improvements#create'&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ⚛️ React Frontend
&lt;/h2&gt;

&lt;p&gt;Now the fun part - consuming the SSE stream in React:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useEffect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;AiAssistant&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;suggestion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setSuggestion&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setLoading&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useRef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="c1"&gt;// Cleanup on unmount&lt;/span&gt;
  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;improveContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Cancel any existing request&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nf"&gt;setLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;setSuggestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AbortController&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;API_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:3000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;API_URL&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/api/v1/ai_improvements`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Accept&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/event-stream&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="na"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;

      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`HTTP error! status: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getReader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;

      &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;

        &lt;span class="nx"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nx"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;

        &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;line&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;data: &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;

          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nf"&gt;setSuggestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AbortError&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Stream error:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;setLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nx"&gt;abortControllerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"ai-assistant"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt; 
        &lt;span class="na"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;improveContent&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; 
        &lt;span class="na"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
        &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"improve-btn"&lt;/span&gt;
      &lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;✨ AI thinking...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;✨ Improve with AI&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;suggestion&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"suggestion-box"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;suggestion&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;AiAssistant&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🎨 Add Some Style
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight css"&gt;&lt;code&gt;&lt;span class="nc"&gt;.ai-assistant&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20px&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nc"&gt;.improve-btn&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;linear-gradient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;135deg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;#667eea&lt;/span&gt; &lt;span class="m"&gt;0%&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;#764ba2&lt;/span&gt; &lt;span class="m"&gt;100%&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nl"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="no"&gt;white&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;border&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;none&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;12px&lt;/span&gt; &lt;span class="m"&gt;24px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;border-radius&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;pointer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;font-weight&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;transition&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;transform&lt;/span&gt; &lt;span class="m"&gt;0.2s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nc"&gt;.improve-btn&lt;/span&gt;&lt;span class="nd"&gt;:hover&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;translateY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;-2px&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nc"&gt;.improve-btn&lt;/span&gt;&lt;span class="nd"&gt;:disabled&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;opacity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;not-allowed&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nc"&gt;.suggestion-box&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;margin-top&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;16px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;16px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#f7fafc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;border-left&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4px&lt;/span&gt; &lt;span class="nb"&gt;solid&lt;/span&gt; &lt;span class="m"&gt;#667eea&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;border-radius&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;line-height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1.6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;white-space&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pre-wrap&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🎉 The Magic Happens
&lt;/h2&gt;

&lt;p&gt;When the user clicks "✨ Improve with AI":&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Browser creates connection using fetch with &lt;code&gt;ReadableStream&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Rails streams AI tokens one-by-one&lt;/li&gt;
&lt;li&gt;JavaScript appends each token to the UI&lt;/li&gt;
&lt;li&gt;User sees AI "typing" in real-time&lt;/li&gt;
&lt;li&gt;Connection closes automatically when done&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✨ Token-by-token streaming&lt;/li&gt;
&lt;li&gt;📱 Native browser API (no library needed)&lt;/li&gt;
&lt;li&gt;🔄 Easy to add retry logic if needed&lt;/li&gt;
&lt;li&gt;🚀 Much cleaner than WebSockets for this use case&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔥 Advanced: Add Rate Limiting
&lt;/h2&gt;

&lt;p&gt;Prevent users from spamming your expensive AI endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/controllers/api/v1/ai_improvements_controller.rb&lt;/span&gt;
&lt;span class="n"&gt;before_action&lt;/span&gt; &lt;span class="ss"&gt;:check_rate_limit&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_rate_limit&lt;/span&gt;
  &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ai_improve:&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;minute&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;

  &lt;span class="n"&gt;sse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;SSE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="ss"&gt;error: &lt;/span&gt;&lt;span class="s2"&gt;"⏸️ Slow down! Try again in a minute."&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;sse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Update React to handle errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In the stream parsing loop&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  💰 Bonus: Cost Tracking
&lt;/h2&gt;

&lt;p&gt;Track your AI usage to avoid surprise bills:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/models/ai_usage.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiUsage&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:user&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;total_cost_today&lt;/span&gt;
    &lt;span class="n"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at &amp;gt;= ?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:estimated_cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# In your controller:&lt;/span&gt;
&lt;span class="n"&gt;usage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;AiUsage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="ss"&gt;user: &lt;/span&gt;&lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;action: &lt;/span&gt;&lt;span class="s1"&gt;'content_improvement'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;input_tokens: &lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;length&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# rough estimate&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# After streaming completes:&lt;/span&gt;
&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;output_tokens: &lt;/span&gt;&lt;span class="n"&gt;token_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;estimated_cost: &lt;/span&gt;&lt;span class="n"&gt;calculate_cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Tip:&lt;/strong&gt; For production, use the &lt;code&gt;rack-cors&lt;/code&gt; gem instead of inline headers to properly handle preflight requests.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  📊 SSE vs WebSocket vs Polling
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;SSE (fetch)&lt;/th&gt;
&lt;th&gt;WebSocket&lt;/th&gt;
&lt;th&gt;Long Polling&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⭐ Simple&lt;/td&gt;
&lt;td&gt;⭐⭐⭐ Complex&lt;/td&gt;
&lt;td&gt;⭐⭐ Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HTTP Methods&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ All (GET, POST)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;✅ All&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Custom Headers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Full control&lt;/td&gt;
&lt;td&gt;⚠️ Limited&lt;/td&gt;
&lt;td&gt;✅ Full control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bidirectional&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ Server → Client&lt;/td&gt;
&lt;td&gt;✅ Both ways&lt;/td&gt;
&lt;td&gt;❌ Client → Server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Auto Reconnect&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ Manual&lt;/td&gt;
&lt;td&gt;❌ Manual&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Connection Overhead&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low (single HTTP)&lt;/td&gt;
&lt;td&gt;Low (persistent)&lt;/td&gt;
&lt;td&gt;High (repeated requests)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⭐⭐ Good&lt;/td&gt;
&lt;td&gt;⭐⭐⭐ Excellent&lt;/td&gt;
&lt;td&gt;⭐ Poor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Proxy/Firewall Friendly&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes (HTTP)&lt;/td&gt;
&lt;td&gt;⚠️ Sometimes blocked&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Browser Support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ All modern&lt;/td&gt;
&lt;td&gt;✅ All modern&lt;/td&gt;
&lt;td&gt;✅ All&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best For&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AI streaming, notifications&lt;/td&gt;
&lt;td&gt;Real-time chat, gaming&lt;/td&gt;
&lt;td&gt;Legacy systems&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;For AI streaming: SSE wins!&lt;/strong&gt; ✨&lt;/p&gt;

&lt;p&gt;Why? AI responses are inherently one-directional (server → client), making WebSocket's bidirectional capability unnecessary overhead. SSE with &lt;code&gt;fetch&lt;/code&gt; gives you POST support for sending prompts while streaming responses back.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 What's Next?
&lt;/h2&gt;

&lt;p&gt;Want to level up? Try:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add Anthropic Claude streaming&lt;/strong&gt; (different API, same concept)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implement retry logic&lt;/strong&gt; for failed streams&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add progress indicators&lt;/strong&gt; showing % complete&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache similar requests&lt;/strong&gt; to save costs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Support multiple AI models&lt;/strong&gt; (let users choose)&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  💬 Over to You
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Have you used SSE before? How was your experience?&lt;/li&gt;
&lt;li&gt;Would you choose SSE or WebSockets for AI streaming?&lt;/li&gt;
&lt;li&gt;What other AI features are you building in your Rails + React apps?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Drop a comment below! 👇&lt;/p&gt;




&lt;h2&gt;
  
  
  🔗 Useful Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events" rel="noopener noreferrer"&gt;MDN: Server-Sent Events&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.openai.com/docs/api-reference/streaming" rel="noopener noreferrer"&gt;OpenAI Streaming API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://api.rubyonrails.org/classes/ActionController/Live.html" rel="noopener noreferrer"&gt;Rails ActionController::Live&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Found this helpful?&lt;/strong&gt; Give it a ❤️ and follow me for more Rails + React + AI tips!&lt;/p&gt;

&lt;p&gt;Happy coding! 🚀&lt;/p&gt;

&lt;h1&gt;
  
  
  rails #react #ai #sse #openai #streaming #webdev
&lt;/h1&gt;

</description>
      <category>rails</category>
      <category>react</category>
      <category>ai</category>
      <category>sse</category>
    </item>
    <item>
      <title>RAG vs Fine-Tuning: Choosing the Right Approach for Your LLM Application</title>
      <dc:creator>Ganesh Navale</dc:creator>
      <pubDate>Sat, 24 Jan 2026 08:28:57 +0000</pubDate>
      <link>https://forem.com/ganesh_navale/rag-vs-fine-tuning-choosing-the-right-approach-for-your-llm-application-383g</link>
      <guid>https://forem.com/ganesh_navale/rag-vs-fine-tuning-choosing-the-right-approach-for-your-llm-application-383g</guid>
      <description>&lt;p&gt;When building applications with Large Language Models (LLMs), you'll often face a critical decision: should you use Retrieval-Augmented Generation (RAG) or fine-tune your model? Both approaches can enhance your LLM's performance, but they solve different problems and come with distinct trade-offs.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is RAG?
&lt;/h2&gt;

&lt;p&gt;Retrieval-Augmented Generation is a technique that combines an LLM with an external knowledge base. When a user asks a question, the system first retrieves relevant information from a database or document collection, then feeds that context to the LLM alongside the query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How RAG Works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User submits a query&lt;/li&gt;
&lt;li&gt;System converts query into embeddings&lt;/li&gt;
&lt;li&gt;Relevant documents are retrieved from a vector database&lt;/li&gt;
&lt;li&gt;Retrieved context is added to the prompt&lt;/li&gt;
&lt;li&gt;LLM generates a response using both the query and retrieved context&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Example Use Case:&lt;/strong&gt;&lt;br&gt;
A customer support chatbot that needs to answer questions based on your company's latest documentation, product updates, and knowledge base articles.&lt;/p&gt;
&lt;h2&gt;
  
  
  What is Fine-Tuning?
&lt;/h2&gt;

&lt;p&gt;Fine-tuning involves training an existing LLM on a specialized dataset to adapt its behavior, writing style, or domain knowledge. You're essentially teaching the model new patterns by updating its weights through additional training.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How Fine-Tuning Works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Prepare a dataset of input-output pairs&lt;/li&gt;
&lt;li&gt;Initialize from a pre-trained model&lt;/li&gt;
&lt;li&gt;Train the model on your custom dataset&lt;/li&gt;
&lt;li&gt;Adjust hyperparameters and validate performance&lt;/li&gt;
&lt;li&gt;Deploy the customized model&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Example Use Case:&lt;/strong&gt;&lt;br&gt;
A medical diagnosis assistant that needs to understand specialized medical terminology and respond with the appropriate clinical tone and precision.&lt;/p&gt;
&lt;h2&gt;
  
  
  Key Differences at a Glance
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;RAG&lt;/th&gt;
&lt;th&gt;Fine-Tuning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Knowledge Updates&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Easy - just update the database&lt;/td&gt;
&lt;td&gt;Requires retraining the model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lower - no model training needed&lt;/td&gt;
&lt;td&gt;Higher - GPU costs for training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup Time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fast - days to implement&lt;/td&gt;
&lt;td&gt;Slower - weeks of preparation and training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Accuracy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Source-attributable and verifiable&lt;/td&gt;
&lt;td&gt;Absorbed into model weights&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Requirements&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Can work with small datasets&lt;/td&gt;
&lt;td&gt;Needs substantial training data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maintenance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ongoing content management&lt;/td&gt;
&lt;td&gt;Periodic retraining cycles&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  When to Use RAG
&lt;/h2&gt;

&lt;p&gt;RAG is your best choice when:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dynamic Knowledge Requirements&lt;/strong&gt;: Your information changes frequently. If you're working with news, documentation, or any rapidly updating content, RAG allows you to update your knowledge base without retraining.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Source Attribution Matters&lt;/strong&gt;: You need to cite sources or show users where information came from. RAG naturally provides this since it retrieves specific documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limited Budget&lt;/strong&gt;: You want to avoid the computational costs of training. RAG works with off-the-shelf models and standard infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick Iteration&lt;/strong&gt;: You need to get something working fast and iterate based on user feedback. RAG systems can be set up and modified much more quickly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Large, Diverse Knowledge Bases&lt;/strong&gt;: You're working with extensive documentation where the model would struggle to memorize everything through fine-tuning.&lt;/p&gt;
&lt;h2&gt;
  
  
  When to Use Fine-Tuning
&lt;/h2&gt;

&lt;p&gt;Fine-tuning makes sense when:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Behavioral Consistency&lt;/strong&gt;: You need the model to consistently follow specific formats, tones, or styles. Fine-tuning can make these behaviors more reliable than prompting alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Domain-Specific Language&lt;/strong&gt;: Your field uses specialized terminology, syntax, or reasoning patterns that general models don't handle well. Medical, legal, or technical domains often benefit from fine-tuning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reduced Latency&lt;/strong&gt;: You want faster responses without the overhead of retrieval. Fine-tuned knowledge is baked into the model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Proprietary Knowledge&lt;/strong&gt;: Your training data contains sensitive information that shouldn't be stored in a retrievable database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Task-Specific Performance&lt;/strong&gt;: You're optimizing for a narrow, well-defined task where fine-tuning can significantly boost accuracy.&lt;/p&gt;
&lt;h2&gt;
  
  
  Can You Use Both?
&lt;/h2&gt;

&lt;p&gt;Absolutely! Many production systems combine RAG and fine-tuning for optimal results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fine-tune for style, tone, and domain language&lt;/li&gt;
&lt;li&gt;Use RAG for factual, up-to-date information&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This hybrid approach gives you the best of both worlds - a model that speaks your domain's language while staying current with the latest information.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Architecture:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Query → Fine-tuned LLM (understands domain) 
          → RAG System (retrieves latest facts)
          → Combined Response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Practical Decision Framework
&lt;/h2&gt;

&lt;p&gt;Ask yourself these questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;How often does your knowledge change?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Daily/Weekly → RAG&lt;/li&gt;
&lt;li&gt;Rarely → Fine-tuning&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;What's your primary goal?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Access to information → RAG&lt;/li&gt;
&lt;li&gt;Behavioral modification → Fine-tuning&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;What's your budget?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Limited → RAG&lt;/li&gt;
&lt;li&gt;Substantial → Consider fine-tuning&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Do you need source citations?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Yes → RAG&lt;/li&gt;
&lt;li&gt;No → Either works&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;How much training data do you have?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Limited (&amp;lt;1000 examples) → RAG&lt;/li&gt;
&lt;li&gt;Substantial (&amp;gt;10,000 examples) → Fine-tuning possible&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Real-World Examples
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;RAG Success Story&lt;/strong&gt;: A legal tech company built a contract analysis tool using RAG. They maintain a database of legal precedents and regulations that updates weekly. RAG allows them to incorporate new rulings immediately without retraining, and they can show lawyers exactly which precedents informed each analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fine-Tuning Success Story&lt;/strong&gt;: A healthcare startup fine-tuned a model on thousands of medical conversations to handle patient triage. The model learned to ask the right follow-up questions and use appropriate medical terminology, providing a consistent experience that RAG alone couldn't achieve.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Starting with RAG:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Choose a vector database (Pinecone, Weaviate, Chroma)&lt;/li&gt;
&lt;li&gt;Prepare your documents and create embeddings&lt;/li&gt;
&lt;li&gt;Implement retrieval logic&lt;/li&gt;
&lt;li&gt;Integrate with your LLM API&lt;/li&gt;
&lt;li&gt;Test and refine your retrieval strategy&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Starting with Fine-Tuning:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Collect and clean your training data&lt;/li&gt;
&lt;li&gt;Format as input-output pairs&lt;/li&gt;
&lt;li&gt;Choose a base model and platform (OpenAI, Hugging Face)&lt;/li&gt;
&lt;li&gt;Train with proper validation&lt;/li&gt;
&lt;li&gt;Evaluate on held-out test data&lt;/li&gt;
&lt;li&gt;Deploy and monitor&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Neither RAG nor fine-tuning is universally better - they excel in different scenarios. RAG shines when you need flexibility, quick updates, and source attribution. Fine-tuning wins when you need behavioral consistency, domain expertise, and reduced latency.&lt;/p&gt;

&lt;p&gt;For many production applications, a hybrid approach leveraging both techniques provides the most robust solution. Start with RAG for its speed and flexibility, then consider fine-tuning if you identify specific behavioral improvements that prompting can't solve.&lt;/p&gt;

&lt;p&gt;The key is understanding your specific requirements, constraints, and goals. With this framework, you can make an informed decision that sets your LLM application up for success.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your experience with RAG and fine-tuning? Have you found one more effective than the other for your use cases? Let me know in the comments!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>rag</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Why My Second RAG System Was Built in Rails, Not Python’s FastAPI</title>
      <dc:creator>Ganesh Navale</dc:creator>
      <pubDate>Tue, 20 Jan 2026 15:43:30 +0000</pubDate>
      <link>https://forem.com/ganesh_navale/why-my-second-rag-system-was-built-in-rails-not-pythons-fastapi-3h06</link>
      <guid>https://forem.com/ganesh_navale/why-my-second-rag-system-was-built-in-rails-not-pythons-fastapi-3h06</guid>
      <description>&lt;p&gt;Built the same RAG system in FastAPI and Ruby on Rails. FastAPI took weeks, Rails took 24 hours. Here's what that taught me about choosing frameworks for AI products.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A hands-on comparison from a Rails developer's first AI project&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fziv0ubciumpps6r4th5w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fziv0ubciumpps6r4th5w.png" alt="Diagram showing Rails framework integrating with AI components: OpenAI API, embeddings, documents, and knowledge representations" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Picture this: You’re starting a new RAG project. You open your laptop and wonder: &lt;em&gt;“Do I really need Python for this?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Every tutorial assumes Python. Almost every example uses FastAPI or Flask. And there I was thinking, &lt;em&gt;“But… I already know Rails.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I’m a Ruby on Rails developer, and I love Rails. I decided to dive in anyway and build a RAG system with FastAPI, learning embeddings, vector databases, and how to make an LLM actually answer questions from documents.&lt;/p&gt;

&lt;p&gt;Then our company announced an AI hackathon. I had 48 hours to build another RAG system. This time, I built it in Rails. It wasn’t about rewriting; it was a practical choice under tight deadlines.&lt;/p&gt;

&lt;p&gt;Same features. Same vector database. Same LLM. Same project, different framework.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h2&gt;
  
  
  What the RAG system actually does
&lt;/h2&gt;

&lt;p&gt;This RAG system is essentially a question-answering engine over your own documents. A user uploads PDFs or text, the system embeds and stores them, and when a query is submitted it finds the most relevant content and uses an LLM to answer based on that retrieved context.&lt;br&gt;
For example, instead of asking a model “What is the capital of France?” — which it might already know — the system answers specific questions about a custom set of documents that the user has uploaded&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This article shares my experience building the same RAG system twice, what changed, and what I learned about Rails, FastAPI, and building AI-powered features in real projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Built the same RAG system in FastAPI (took weeks) and Rails (took 24 hours)&lt;/li&gt;
&lt;li&gt;The actual AI logic was identical—the difference was infrastructure&lt;/li&gt;
&lt;li&gt;Rails' mature tooling (Sidekiq, ActiveRecord, console) made development faster&lt;/li&gt;
&lt;li&gt;Python still wins for ML-heavy experimentation&lt;/li&gt;
&lt;li&gt;My recommendation: Rails for the app, Python microservices only when needed&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;You don't need to rewrite your Rails app in Python to add AI features&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I built (in both versions)
&lt;/h2&gt;

&lt;p&gt;Both RAG systems had the same core functionality:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The tech stack:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;FastAPI version:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FastAPI for the API layer&lt;/li&gt;
&lt;li&gt;Celery for background jobs&lt;/li&gt;
&lt;li&gt;SQLAlchemy for database access&lt;/li&gt;
&lt;li&gt;OpenAI API for embeddings and completions&lt;/li&gt;
&lt;li&gt;pgvector for similarity search&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Rails version:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rails API backend&lt;/li&gt;
&lt;li&gt;React frontend&lt;/li&gt;
&lt;li&gt;Sidekiq for background jobs&lt;/li&gt;
&lt;li&gt;ActiveRecord for database access&lt;/li&gt;
&lt;li&gt;Same OpenAI API and pgvector setup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI logic was identical. The framework wrapping it was different.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture (Same for Both)
&lt;/h2&gt;

&lt;p&gt;Both implementations follow this exact flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Document Upload&lt;/strong&gt; → User uploads PDF via web interface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text Extraction&lt;/strong&gt; → Extract and clean text from PDF&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chunking&lt;/strong&gt; → Split document into ~500 token chunks with overlap&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding Generation&lt;/strong&gt; → Send chunks to OpenAI's embedding API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector Storage&lt;/strong&gt; → Store embeddings in Pinecone with metadata&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Question Processing&lt;/strong&gt; → When user asks a question:

&lt;ul&gt;
&lt;li&gt;Generate embedding for the question(text-embedding-3-small model)&lt;/li&gt;
&lt;li&gt;Query Pinecone for top 5 similar chunks&lt;/li&gt;
&lt;li&gt;Send question + context to GPT-4&lt;/li&gt;
&lt;li&gt;Stream response back to user&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Next Up: Where the Differences Actually Showed Up
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The FastAPI Version: Where I Spent My Time
&lt;/h3&gt;

&lt;p&gt;Building the FastAPI version worked. But I spent more time on infrastructure than on the actual AI features.&lt;/p&gt;

&lt;p&gt;In theory, Celery handles async tasks. In practice, I became the one handling Celery.&lt;/p&gt;

&lt;p&gt;When an embedding job failed (and they did, API timeouts, rate limits, malformed PDFs), here's what debugging looked like:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Background jobs became a maintenance burden:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# FastAPI/Celery - Replay a failed embedding job
# 1. Find the task ID in logs
# 2. Check Celery flower or redis
# 3. Manually construct retry logic
# 4. Hope the async session doesn't break again
&lt;/span&gt;
&lt;span class="nd"&gt;@celery_app.task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embed_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;get_db_session&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# embedding logic
&lt;/span&gt;            &lt;span class="k"&gt;pass&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;countdown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Compare this to Rails:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Rails - Replay failed job from console or UI&lt;/span&gt;
&lt;span class="no"&gt;DocumentEmbeddingJob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Built-in retry with exponential backoff&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DocumentEmbeddingJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;retry_on&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;wait: :exponentially_longer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;attempts: &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;embedding: &lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Rails, I can see failed jobs in Sidekiq's web UI, click "Retry," and watch it work. In FastAPI, I was writing custom monitoring and retry logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Database session management became a daily puzzle:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every async endpoint needed careful session handling. I'd write a feature, run tests, and watch them randomly fail because some session somewhere wasn't properly closed. I spent more time reading asyncio documentation than building features.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# FastAPI - Manual session lifecycle everywhere
&lt;/span&gt;&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DocumentCreate&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;get_db_session&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="c1"&gt;# Don't forget to close this!
&lt;/span&gt;            &lt;span class="c1"&gt;# Or rollback on error!
&lt;/span&gt;            &lt;span class="c1"&gt;# Or handle connection pool limits!
&lt;/span&gt;            &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Meanwhile in Rails? ActiveRecord just handles it. I never thought about sessions once during the hackathon.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment felt fragile:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I had to manually configure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Async workers&lt;/li&gt;
&lt;li&gt;Job queues&lt;/li&gt;
&lt;li&gt;Retry policies&lt;/li&gt;
&lt;li&gt;Monitoring dashboards&lt;/li&gt;
&lt;li&gt;Database connection pools for async&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rails gives me all of this out of the box.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Rails version: exactly what I needed
&lt;/h3&gt;

&lt;p&gt;During the hackathon, I had 48 hours to ship a working demo. Not a prototype. Not a proof-of-concept. A working system that non-technical people could use.&lt;/p&gt;

&lt;p&gt;I chose Rails not because it's "better for AI" (it's probably not), but because I knew exactly where my time would go: building features, not configuring infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here's what the system did:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ingested documentation from multiple sources&lt;/li&gt;
&lt;li&gt;Split content into chunks&lt;/li&gt;
&lt;li&gt;Generated embeddings via OpenAI&lt;/li&gt;
&lt;li&gt;Stored vectors in Postgres with pgvector&lt;/li&gt;
&lt;li&gt;Answered questions using retrieved context&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Actually ship it before the demo&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The entire backend:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Model&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;has_neighbors&lt;/span&gt; &lt;span class="ss"&gt;:embedding&lt;/span&gt;

  &lt;span class="n"&gt;after_create_commit&lt;/span&gt; &lt;span class="ss"&gt;:enqueue_embedding_job&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;enqueue_embedding_job&lt;/span&gt;
    &lt;span class="no"&gt;DocumentEmbeddingJob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# Background job&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DocumentEmbeddingJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;retry_on&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;wait: :exponentially_longer&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;split_into_chunks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
      &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generate_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="no"&gt;DocumentChunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="ss"&gt;document: &lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;embedding: &lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;input: &lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# Query service&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RagQueryService&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
    &lt;span class="vi"&gt;@embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generate_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Convert question to vector&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;answer&lt;/span&gt;
    &lt;span class="c1"&gt;# Find the 5 most similar document chunks&lt;/span&gt;
    &lt;span class="n"&gt;relevant_chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;DocumentChunk&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;nearest_neighbors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="vi"&gt;@embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;distance: &lt;/span&gt;&lt;span class="s2"&gt;"cosine"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Combine them into context&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;relevant_chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="ss"&gt;:content&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Ask GPT-4 with the context&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;messages: &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="s2"&gt;"Answer based on this context: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="vi"&gt;@query&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Reality check:&lt;/strong&gt; This code is simplified for the article. The real version had error handling, logging, and rate limiting. But the core logic? Pretty much this.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;That's it.&lt;/strong&gt; The entire RAG pipeline in ~60 lines of readable Ruby.&lt;br&gt;
No async session management. No custom retry logic. No Celery flower dashboard. Just Rails doing what Rails does best: letting you build features instead of infrastructure.&lt;/p&gt;

&lt;p&gt;Could I have made the FastAPI version this clean? Maybe. But I didn't have time to figure it out. And that's the point.&lt;/p&gt;

&lt;h3&gt;
  
  
  What made Rails faster for me
&lt;/h3&gt;

&lt;p&gt;I shipped the Rails version in 24 hours. The FastAPI version took weeks to get stable.&lt;/p&gt;

&lt;p&gt;Here's why:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Background jobs are a solved problem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sidekiq gives me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Web UI to monitor jobs&lt;/li&gt;
&lt;li&gt;Automatic retries with backoff&lt;/li&gt;
&lt;li&gt;Dead job queues&lt;/li&gt;
&lt;li&gt;Performance metrics&lt;/li&gt;
&lt;li&gt;One-click job replay&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I didn't write any of this. It was already there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Database access is predictable&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No async session managers. No connection pool tuning. No event loop surprises. It just works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Debugging is straightforward&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When an embedding job failed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open Sidekiq UI&lt;/li&gt;
&lt;li&gt;See the error and full backtrace&lt;/li&gt;
&lt;li&gt;Click "Retry"&lt;/li&gt;
&lt;li&gt;Check logs if needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In FastAPI, I was tailing Celery logs and rebuilding context manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. The ecosystem has what I needed&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ruby-openai&lt;/code&gt; gem for API calls&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;neighbor&lt;/code&gt; gem for vector similarity&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pgvector&lt;/code&gt; extension for Postgres&lt;/li&gt;
&lt;li&gt;Standard Rails patterns for everything else&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No async complications. No compatibility issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. The developer workflow is integrated, not assembled&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is easy to underestimate until you feel it.&lt;/p&gt;

&lt;p&gt;Rails gives you a tight feedback loop by default:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A powerful interactive &lt;strong&gt;Rails console&lt;/strong&gt; for debugging live data and jobs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database migrations&lt;/strong&gt; that are simple, versioned, and reversible&lt;/li&gt;
&lt;li&gt;A test setup that’s predictable and deeply integrated with the framework&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When I wanted to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inspect a document’s embeddings&lt;/li&gt;
&lt;li&gt;Replay a failed job&lt;/li&gt;
&lt;li&gt;Tweak a schema and re-run ingestion&lt;/li&gt;
&lt;li&gt;Write or debug a failing test&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When I needed to inspect embeddings, replay a failed job, or tweak the schema, I did it from the console in seconds.&lt;/p&gt;

&lt;p&gt;In the FastAPI setup, these same tasks required more manual work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Managing Alembic migrations explicitly&lt;/li&gt;
&lt;li&gt;Configuring async test fixtures&lt;/li&gt;
&lt;li&gt;Debugging through logs instead of an interactive console&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this is impossible in Python — but it &lt;em&gt;is&lt;/em&gt; more fragmented.&lt;/p&gt;

&lt;p&gt;Rails optimizes for &lt;strong&gt;flow&lt;/strong&gt;.&lt;br&gt;
FastAPI optimizes for &lt;strong&gt;flexibility&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When you're iterating on AI features under a deadline, that difference compounds daily.&lt;/p&gt;

&lt;h2&gt;
  
  
  The key insight: AI primitives are framework-agnostic
&lt;/h2&gt;

&lt;p&gt;After building both versions, here's what became clear:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The actual AI logic was identical.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Both systems used the same process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Split documents into chunks&lt;/li&gt;
&lt;li&gt;Generate embeddings via OpenAI API&lt;/li&gt;
&lt;li&gt;Store vectors in Postgres&lt;/li&gt;
&lt;li&gt;Retrieve similar chunks&lt;/li&gt;
&lt;li&gt;Pass context to LLM&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The intelligence came from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quality of input data&lt;/li&gt;
&lt;li&gt;Chunking strategy&lt;/li&gt;
&lt;li&gt;Prompt engineering&lt;/li&gt;
&lt;li&gt;Retrieval precision&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Not from the framework.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rails didn’t make the model smarter. It made the system easier to reason about, operate, and change.&lt;/p&gt;

&lt;p&gt;And for a product engineer shipping features, that matters more than access to the latest ML libraries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Python still clearly wins
&lt;/h2&gt;

&lt;p&gt;Let me be clear: there are cases where Python is the right choice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Python when you need:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Advanced document loaders (LangChain, LlamaIndex)&lt;/li&gt;
&lt;li&gt;Custom re-ranking models&lt;/li&gt;
&lt;li&gt;Sophisticated evaluation frameworks&lt;/li&gt;
&lt;li&gt;Rapid ML experimentation&lt;/li&gt;
&lt;li&gt;Fine-tuning workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For research and ML-heavy work, Python is unmatched.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Examples where I'd choose Python:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're experimenting with multiple embedding models weekly&lt;/li&gt;
&lt;li&gt;You need custom re-ranking with a BERT model&lt;/li&gt;
&lt;li&gt;You're running A/B tests on different chunking strategies&lt;/li&gt;
&lt;li&gt;Your team is already Python-first&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;But for building a production RAG feature in an existing Rails app?&lt;/strong&gt; You probably don't need to rewrite everything in Python.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Approach Now: Hybrid Architecture
&lt;/h2&gt;

&lt;p&gt;After building both versions, I use this mental model:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rails handles the application:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API endpoints&lt;/li&gt;
&lt;li&gt;Background jobs&lt;/li&gt;
&lt;li&gt;Database models&lt;/li&gt;
&lt;li&gt;User authentication&lt;/li&gt;
&lt;li&gt;Business logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Python microservices for ML-specific work:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Custom re-ranking models&lt;/li&gt;
&lt;li&gt;Advanced document parsing&lt;/li&gt;
&lt;li&gt;Specialized ML pipelines&lt;/li&gt;
&lt;li&gt;Evaluation frameworks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why this works:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product code stays stable and maintainable&lt;/li&gt;
&lt;li&gt;AI experiments stay isolated&lt;/li&gt;
&lt;li&gt;Infrastructure stays simple&lt;/li&gt;
&lt;li&gt;Team velocity stays high&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of migrating my entire Rails app to FastAPI, I integrate Python only where it adds specific value.&lt;/p&gt;

&lt;h3&gt;
  
  
  The real takeaway
&lt;/h3&gt;

&lt;p&gt;Strip away the AI layer, and a RAG system is still just a distributed application:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Background jobs&lt;/li&gt;
&lt;li&gt;Database transactions&lt;/li&gt;
&lt;li&gt;Retries and failures&lt;/li&gt;
&lt;li&gt;User-facing latency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The framework you choose determines how painful these problems are to live with.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's why this comparison isn't really about Rails vs FastAPI.&lt;/p&gt;

&lt;p&gt;It's about choosing tools that let you focus on &lt;strong&gt;product behavior&lt;/strong&gt; instead of &lt;strong&gt;infrastructure glue&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;Building the same RAG system in two different frameworks taught me something simple but important:&lt;/p&gt;

&lt;p&gt;The hard parts of production software aren't the AI API calls. They're:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reliable background processing&lt;/li&gt;
&lt;li&gt;Debugging production failures&lt;/li&gt;
&lt;li&gt;Managing deployments&lt;/li&gt;
&lt;li&gt;Maintaining clean architecture&lt;/li&gt;
&lt;li&gt;Shipping quickly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're already working in Rails (or Django, or any mature web framework), you already have solutions for these problems. Adding AI features doesn't change that.&lt;/p&gt;

&lt;p&gt;Python has incredible AI tooling. Rails has incredible application tooling.&lt;/p&gt;

&lt;p&gt;You don't need to choose one or the other. You can use both strategically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're a Rails developer wondering whether you need to learn FastAPI to build AI features: you probably don't.&lt;/strong&gt; Start with Rails. Add Python services only when you hit a real limitation.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Building AI features in non-Python frameworks? I'd love to hear about your experience. Drop a comment or connect with me on &lt;a href="https://linkedin.com/in/ganeshnavale21" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>fastapi</category>
      <category>rails</category>
      <category>ai</category>
      <category>rag</category>
    </item>
  </channel>
</rss>
