<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ataur Rahman</title>
    <description>The latest articles on Forem by Ataur Rahman (@ataur39n).</description>
    <link>https://forem.com/ataur39n</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F627681%2F23d52ab1-1b18-4845-9524-c5655cf6427f.jpeg</url>
      <title>Forem: Ataur Rahman</title>
      <link>https://forem.com/ataur39n</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ataur39n"/>
    <language>en</language>
    <item>
      <title>From Terminal to UI: Building Your First Local AI Assistant with Node.js</title>
      <dc:creator>Ataur Rahman</dc:creator>
      <pubDate>Mon, 21 Jul 2025 14:37:06 +0000</pubDate>
      <link>https://forem.com/ataur39n/from-terminal-to-ui-building-your-first-local-ai-assistant-with-nodejs-10ok</link>
      <guid>https://forem.com/ataur39n/from-terminal-to-ui-building-your-first-local-ai-assistant-with-nodejs-10ok</guid>
      <description>&lt;p&gt;Hi everyone! How's your journey with AI going? Each day feels more exciting than the last. We're living through a technological revolution, witnessing rapid innovation in AI like never before.&lt;/p&gt;

&lt;p&gt;I won't spend time here trying to explain what AI is capable of - that's already clear. The real question is: how can we benefit from it? If I can complete 10 tasks in a day, but AI helps me get those done in half the time, I can spend the rest doing more meaningful work - or just resting. That's the magic of automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But let's be clear:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;We shouldn't become addicted to AI. Instead, we should learn how to make the most of it. That means staying updated and understanding the fundamentals - what's happening behind the scenes. Once you grasp how current AI systems work, you'll find yourself ready to build and innovate with confidence.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;Quick Note Before We Begin&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apologies&lt;/strong&gt; for the delay - it's been 1.5 months since my last post. I've been under the weather, dealing with job pressure, and learning a lot of new things. But now I'm back, and the good news is: &lt;strong&gt;I've already finished testing demo apps for the next 5–6 posts!&lt;/strong&gt; That means new content will be rolling out much faster - so stay tuned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In our previous blog,&lt;/strong&gt; I explained the essential tools and topics like Ollama, LangChain, and how local models work. I won't repeat those here - please check out that post if you haven't yet: &lt;a href="https://medium.com/javascript-in-plain-english/building-an-ai-assistant-essential-tools-and-concepts-a8f12497cd65" rel="noopener noreferrer"&gt;read the previous post in detail here&lt;/a&gt;. I will only mention those tools in passing as I use them. Before jumping into the application, make sure you have covered the following:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites&lt;/strong&gt;&lt;br&gt;
Make sure you have the following set up locally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; &lt;strong&gt;Node.js&lt;/strong&gt; installed → &lt;a href="https://nodejs.org/en/download" rel="noopener noreferrer"&gt;Download Node.js&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Ollama&lt;/strong&gt; installed → &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Download Ollama&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; A local model pulled with Ollama.
For this demo, I'm using the lightweight model &lt;strong&gt;llama3.2:3b-instruct-q4_K_M (approx. 2–2.5GB)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;And you have already read this post: &lt;a href="https://medium.com/javascript-in-plain-english/building-an-ai-assistant-essential-tools-and-concepts-a8f12497cd65" rel="noopener noreferrer"&gt;Building an AI Assistant with Nodejs, Ollama and Langchain: Essential Tools and Concepts&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can pull and run it using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ollama run llama3.2:3b-instruct-q4_K_M
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;To check your setup,&lt;/strong&gt; run the command above in your terminal. You should see the model's interactive prompt, which confirms everything is ready.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnj6905oj5fdwo5ole6id.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnj6905oj5fdwo5ole6id.png" alt="Testing in terminal if everything is ok" width="800" height="618"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see all the models available on your machine by running the &lt;strong&gt;ollama ls&lt;/strong&gt; command. I have 5 models, as you can see in the screenshot. If this is your first time and you followed the prerequisites, your list will show only one.&lt;/p&gt;

&lt;h2&gt;
  Let's Get Started
&lt;/h2&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;So what are we going to do in this tutorial?&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;If you run "&lt;strong&gt;&lt;code&gt;ollama run llama3.2:3b-instruct-q4_K_M&lt;/code&gt;&lt;/strong&gt;" command in your terminal, you should see the model response interface and you can a conversation with model. You can end or exit from the conversation by sending - "&lt;strong&gt;/bye&lt;/strong&gt;" message.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84oeepl3fykxrh9urvrf.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84oeepl3fykxrh9urvrf.gif" alt="Interact with model from terminal" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;But what is our goal?&lt;/u&gt;&lt;/strong&gt;&lt;br&gt;
Building our own AI assistant with many capabilities - which is not possible from the terminal. We need an application that interacts with the model and adds smart capabilities on top. But before talking about capabilities, we need a basic application where we can interact with the model from a UI. In other words, what we are doing now from the terminal, our application should do for us: no more terminal needed to talk to the model. We will talk through our application, and it will manage how to interact with the model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw1je96713xefgwb6hcb.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw1je96713xefgwb6hcb.gif" alt="Basic AI assistant" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;How can we achieve that?&lt;/u&gt;&lt;/strong&gt;&lt;br&gt;
I will do everything in TypeScript (JavaScript). So I chose Next.js for the front-end and Node.js (with the Express.js framework) for the back-end. A simple Express.js application can handle our basic needs.&lt;/p&gt;

&lt;p&gt;The first step is the back-end.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Initialize a Node.js application&lt;/strong&gt; and set up a basic Express app by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm init //initaial project

//install packages
npm &lt;span class="nb"&gt;install &lt;/span&gt;express @types/express cors @types/cors dotenv @langchain/core @langchain/community @langchain/ollama 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After installing the packages, create an &lt;strong&gt;index.ts&lt;/strong&gt; file with a basic /stream route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;cors&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cors&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dotenv&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;dotenv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;9000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;express&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cors&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="cm"&gt;/* 
    This is the endpoint that will be used to stream the response from the AI to the client.
*/&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/stream&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;//we will do our business logic here step by step&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Server is running on port &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this route, we expect the user to send at least a message, like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Hi"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the /stream route, we will catch and process it like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;   &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Message is required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ChatOllama&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llama3.2:3b-instruct-q4_K_M&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:11434&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;formatedMessages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;formatedMessages&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Stream error:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Stream error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait, wait, wait! We're not done yet. Before jumping to the next step, let's first understand what I did here.&lt;/p&gt;

&lt;p&gt;We are &lt;strong&gt;configuring&lt;/strong&gt; our model in code with &lt;strong&gt;LangChain&lt;/strong&gt;. The &lt;strong&gt;ChatOllama&lt;/strong&gt; class comes from the &lt;strong&gt;@langchain/ollama&lt;/strong&gt; package. Here we are configuring the &lt;strong&gt;model&lt;/strong&gt; information. When we install &lt;strong&gt;Ollama&lt;/strong&gt;, it exposes &lt;strong&gt;port 11434&lt;/strong&gt; by default, and we can communicate with Ollama through that port. Now the question is:&lt;/p&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;Since Ollama itself exposes a REST API, why are we using LangChain instead of talking to Ollama directly?&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;If you have this question in your head, I'd say: good catch. You are genuinely curious about how this AI stack works. Back to the point: to understand the answer, you first have to know &lt;strong&gt;what Ollama and LangChain are&lt;/strong&gt;. As I mentioned, I have a detailed post about those topics - &lt;a href="https://medium.com/javascript-in-plain-english/building-an-ai-assistant-essential-tools-and-concepts-a8f12497cd65" rel="noopener noreferrer"&gt;I highly recommend reading it here&lt;/a&gt;. But let me recap briefly.&lt;/p&gt;

&lt;p&gt;We could communicate with Ollama through its default REST API, and it would work. But then our application would be &lt;strong&gt;tightly coupled&lt;/strong&gt; to Ollama. In that case we could not easily use other models such as &lt;strong&gt;OpenAI&lt;/strong&gt;, &lt;strong&gt;Claude&lt;/strong&gt;, or &lt;strong&gt;Gemini&lt;/strong&gt; in our system. &lt;strong&gt;Actually we could, but we would need a different configuration for each provider.&lt;/strong&gt; This is where LangChain enters the picture. In our current context, LangChain acts as a wrapper around the configuration of many model providers, and the configuration looks similar for all of them. Since we are using the Ollama provider, LangChain offers us the ChatOllama class to communicate with our local model. You can see the other integrations &lt;a href="https://js.langchain.com/docs/integrations/chat/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
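&lt;p&gt;To make that concrete: because every LangChain chat model shares the same interface, switching providers is essentially a one-line change. Here is a sketch - the commented-out ChatOpenAI line assumes the @langchain/openai package and an API key, neither of which is part of this tutorial:&lt;/p&gt;

```typescript
import { ChatOllama } from "@langchain/ollama";
// import { ChatOpenAI } from "@langchain/openai"; // hypothetical swap, needs an OPENAI_API_KEY

// Local model via Ollama:
const model = new ChatOllama({
    model: "llama3.2:3b-instruct-q4_K_M",
    baseUrl: "http://localhost:11434",
    temperature: 0,
});

// A hosted model would have the same shape:
// const model = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 });

// Either way, the rest of the code stays unchanged:
// const stream = await model.stream(formatedMessages);
```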

&lt;blockquote&gt;
&lt;p&gt;Right now this config looks very simple, which may raise the question of what a more complex configuration looks like. Keep searching for the answer, and let me know your questions in the comment section too.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To configure the model, we need to pass which model we want to use and what the endpoint is. In our case, we installed &lt;strong&gt;llama3.2:3b-instruct-q4_K_M&lt;/strong&gt; earlier, and the base URL is the provider URL. In our case, the provider is Ollama, and it is exposed on port 11434.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;       &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ChatOllama&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llama3.2:3b-instruct-q4_K_M&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:11434&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
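&lt;p&gt;As a taste of what a more complex configuration can look like, ChatOllama accepts additional options beyond these three. This is a sketch - option names like numCtx and keepAlive come from the @langchain/ollama documentation, so double-check them against the version you installed:&lt;/p&gt;

```typescript
import { ChatOllama } from "@langchain/ollama";

const model = new ChatOllama({
    model: "llama3.2:3b-instruct-q4_K_M",
    baseUrl: "http://localhost:11434",
    temperature: 0,  // 0 = deterministic, higher = more varied answers
    numCtx: 4096,    // context window size in tokens
    keepAlive: "5m", // keep the model loaded in memory between requests
});
```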





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;formatedMessages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;formatedMessages&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;Why do I wrap the user message in the HumanMessage class, and why do I put it in an array?&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Answer: We could use the user message directly, but by using this class we ensure it is passed in exactly the shape the model expects. You can also call it like this, but &lt;strong&gt;the first approach is recommended:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const stream = await model.stream("Hello, how are you?");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
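&lt;p&gt;Under the hood, a HumanMessage is little more than the message text tagged with a "human" role, so every provider receives it in a shape it understands. A plain-object sketch of the idea (this is an illustration, not LangChain's actual internal representation):&lt;/p&gt;

```typescript
// Illustration only: a simplified picture of what a chat message carries.
type Role = "system" | "human" | "ai";

interface SimpleChatMessage {
    role: Role;
    content: string;
}

// Roughly what `new HumanMessage(message)` gives us:
function toHumanMessage(text: string): SimpleChatMessage {
    return { role: "human", content: text };
}

// Roughly what `formatedMessages` holds for the request body { "message": "Hi" }:
const formatted: SimpleChatMessage[] = [toHumanMessage("Hi")];
console.log(formatted[0].role); // "human"
```

&lt;p&gt;The array matters because a conversation is a list of messages; later we can prepend a system message or past turns without changing how we call the model.&lt;/p&gt;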



&lt;p&gt;If you run your code and hit your endpoint, you will see some console logs in your terminal. If you remember, we put a console.log of the content inside the for loop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9eulxyqhpbgs7a3eib0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9eulxyqhpbgs7a3eib0.png" alt="terminal response" width="800" height="312"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Yooo, congratulations!!! You have successfully completed the first step. The output looks almost the same as when we communicate with the model directly in the terminal. Now let's stream the response to the client.&lt;br&gt;
After the stream line, update your code as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;formatedMessages&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// await streamChunksToTextResponse(res, stream);&lt;/span&gt;
        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text/plain&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Transfer-Encoding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;chunked&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;part&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;part&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;part&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
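&lt;p&gt;The branching above exists because chunk.content can be either a plain string or an array of content parts, depending on the model. If you prefer, the same logic can be factored into a small pure helper (a sketch of the logic above, with a hypothetical function name):&lt;/p&gt;

```typescript
// Normalize a chunk's content (string or array of parts) into plain text.
// Mirrors the branching inside the /stream route's for-await loop.
function chunkToText(content: unknown): string {
    if (typeof content === "string") {
        return content;
    }
    if (Array.isArray(content)) {
        let text = "";
        for (const part of content) {
            if (typeof part === "object") {
                if (part !== null) {
                    if (typeof (part as { text?: unknown }).text === "string") {
                        text += (part as { text: string }).text;
                    }
                }
            }
        }
        return text;
    }
    return "";
}

// Inside the loop, the body then shrinks to:
// for await (const chunk of stream) {
//     res.write(chunkToText(chunk?.content));
// }
```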



&lt;p&gt;After updating your code, if you hit your endpoint with a tool like Postman, you can see the streamed response.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvd9d2nhik296eumvv2f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvd9d2nhik296eumvv2f.png" alt="Postman response" width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Wow! Your back-end API is ready. Now you can use it from the front-end to view the response in a chat interface.&lt;br&gt;
Actually, I am a &lt;strong&gt;Node.js developer&lt;/strong&gt;, and I only know a little front-end - minus the CSS 😁😁. So I am sharing only the request-handling part here, not how to set up the entire project. I hope you know how to create a Next.js application and can design a chat interface - or you can use v0.dev for the design, like I did. Let's jump into the next part.&lt;/p&gt;
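&lt;p&gt;Before wiring up the UI, it helps to see the core of consuming a chunked response. The sketch below uses the standard Web Streams API (available in browsers and Node 18+); a local ReadableStream stands in for the response body you would get from fetching the chat API:&lt;/p&gt;

```typescript
// A stand-in for `(await fetch(...)).body`: a stream of encoded text chunks.
function makeDemoStream(chunks: string[]) {
    const encoder = new TextEncoder();
    return new ReadableStream({
        start(controller) {
            for (const c of chunks) controller.enqueue(encoder.encode(c));
            controller.close();
        },
    });
}

// Read a chunked body to completion, calling onChunk as each piece of text arrives.
async function readAll(body: ReadableStream, onChunk: (text: string) => void) {
    const reader = body.getReader();
    const decoder = new TextDecoder();
    let full = "";
    while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        const text = decoder.decode(value, { stream: true });
        full += text;
        onChunk(text); // in the UI, append this to the visible assistant message
    }
    return full;
}

// Usage: with a real back-end this would be the body of a fetch to your API route.
readAll(makeDemoStream(["Hello", ", ", "world"]), (t) => process.stdout.write(t))
    .then((full) => console.log("\ndone:", full));
```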

&lt;p&gt;&lt;strong&gt;First,&lt;/strong&gt; I create a Next.js API route that calls my back-end API and forwards the streamed response to the client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/api/chat/route.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NextRequest&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;next/server&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;runtime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;edge&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;NextRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:9000/stream&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="nx"&gt;message&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/event-stream&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Cache-Control&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;no-cache&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;keep-alive&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the client, I call my Next.js API route as follows and handle the streamed response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;getReader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;utf-8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No stream reader found.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
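&lt;p&gt;One subtle point about the client code above: &lt;code&gt;decoder.decode(value)&lt;/code&gt; can garble output if a multi-byte UTF-8 character happens to be split across two chunks. Passing &lt;code&gt;{ stream: true }&lt;/code&gt; tells &lt;code&gt;TextDecoder&lt;/code&gt; to buffer the incomplete bytes until the next chunk arrives. A minimal sketch (runnable in Node.js) showing the difference:&lt;/p&gt;

```javascript
// "café" is 5 bytes in UTF-8: the "é" takes 2 bytes (0xC3 0xA9)
const bytes = new TextEncoder().encode("café");
const chunk1 = bytes.slice(0, 4); // ends in the middle of "é"
const chunk2 = bytes.slice(4);

// Naive decoding treats each chunk independently and garbles the split character
const naive = new TextDecoder("utf-8");
const broken = naive.decode(chunk1) + naive.decode(chunk2);

// Streaming mode buffers the incomplete sequence until more bytes arrive
const streaming = new TextDecoder("utf-8");
const correct =
  streaming.decode(chunk1, { stream: true }) + streaming.decode(chunk2);

console.log(broken);  // "caf" followed by replacement characters
console.log(correct); // "café"
```

&lt;p&gt;In the read loop above, that would mean calling &lt;code&gt;decoder.decode(value, { stream: true })&lt;/code&gt; for every chunk.&lt;/p&gt;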



&lt;p&gt;If you run this function, you will see the same chunks logged in the console that the back-end streamed. You can manage your chat-history messages with state. In my application, I created a hook that handles all chat-message-related tasks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;use client&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Message&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/types&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useChat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;([])&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setIsLoading&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;isTyping&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setIsTyping&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sendMessage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;

    &lt;span class="c1"&gt;// Add user message to chat&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nf"&gt;setIsLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;setIsTyping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Send message to API using server action&lt;/span&gt;
      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;getReader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;utf-8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No stream reader found.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="c1"&gt;// Add initial assistant message&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;assistantMessageId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;assistantMessageId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;
      &lt;span class="p"&gt;}])&lt;/span&gt;

      &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

        &lt;span class="c1"&gt;// Update message in real-time&lt;/span&gt;
        &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
          &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;assistantMessageId&lt;/span&gt;
            &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;
        &lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Error sending message:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="c1"&gt;// Add error message&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;errorMessage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Sorry, there was an error processing your request. Please try again.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;errorMessage&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;setIsLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nf"&gt;setIsTyping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;isTyping&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;sendMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here I am managing the chat-messages state: as each streamed chunk arrives, I update the state, so the UI renders the response in real time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Congratulations!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You've just built the foundation of your own AI assistant! You can now send messages to a local LLM from a web UI using LangChain, Express, and Ollama. We've had a successful kick-start, but there is still a long path to our goal. Our current application does not have chat memory yet: if you give it some information in one message and ask about it in the next, it cannot answer. We will fix that step by step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's Next?&lt;/strong&gt;&lt;br&gt;
We've laid the groundwork. Next up, we'll refine this setup, add memory and RAG (retrieval-augmented generation), and enable tool usage.&lt;br&gt;
&lt;strong&gt;Stick around - it's about to get exciting.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;💭 Final Thoughts&lt;/strong&gt;&lt;br&gt;
The goal of this series is not just to build something cool - but to help you understand how modern AI works under the hood. If you're a JavaScript developer, you don't need to feel left out of the AI world anymore. You've got the tools. Now let's build!&lt;/p&gt;

&lt;p&gt;🔗 Follow me for updates, and thank you for joining our mission of building our own AI assistant!&lt;/p&gt;

&lt;p&gt;👉 💬 Got questions or thoughts? Drop them in the comments - I'd love to hear what you're building.&lt;/p&gt;

&lt;p&gt;👉 Stay tuned for the next post in this series!&lt;/p&gt;

&lt;p&gt;💖 If you're finding value in my posts and want to help me continue creating, feel free to support me here [&lt;a href="https://cutt.ly/0rvsCQkd" rel="noopener noreferrer"&gt;Buy me a Coffee&lt;/a&gt;]! Every contribution helps, and I truly appreciate it! Thank You. 🙌&lt;/p&gt;

&lt;p&gt;Happy Coding! 🚀&lt;/p&gt;

</description>
      <category>langchain</category>
      <category>node</category>
      <category>ollama</category>
      <category>llm</category>
    </item>
    <item>
      <title>Building an AI Assistant with NodeJs: Essential Tools and Concepts</title>
      <dc:creator>Ataur Rahman</dc:creator>
      <pubDate>Sat, 31 May 2025 20:42:10 +0000</pubDate>
      <link>https://forem.com/ataur39n/building-an-ai-assistant-with-nodejs-essential-tools-and-concepts-2n2p</link>
      <guid>https://forem.com/ataur39n/building-an-ai-assistant-with-nodejs-essential-tools-and-concepts-2n2p</guid>
      <description>&lt;h2&gt;
  
  
  Hi everyone,
&lt;/h2&gt;

&lt;p&gt;— especially those who’ve been eagerly waiting for my series, and particularly all the JavaScript developers out there. How’s your day going in this booming age of AI? 🚀&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI&lt;/strong&gt; is growing at an incredible pace. Haven’t started yet? Feeling overwhelmed with all the new technologies? Not sure how they connect or where to begin? Trust me, you’re not alone. Developer life is full of confusion — and now AI has added a whole new level. But we need to overcome that fear.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;truth&lt;/strong&gt; is: AI isn’t as complicated as it seems. You don’t need to know everything from start to finish. Everyone has limitations — scientists innovate, engineers scale, and developers build. We don’t need to play every role. Instead, we’ll start with small steps, grow our knowledge, build something meaningful, and explore new ideas. Day by day, our understanding will deepen. But the key is to take that first step.&lt;/p&gt;

&lt;p&gt;I don’t claim to know everything, but I explore and learn every day — and face plenty of challenges along the way. That’s why I decided to document my journey, sharing what I know in the hope it might help someone else. And I’d be grateful if it does. If you’ve seen our plan and roadmap, you know what we’re trying to achieve. If you haven’t yet, &lt;strong&gt;&lt;em&gt;&lt;a href="https://ataur39n.medium.com/build-your-own-ai-assistant-with-node-js-my-roadmap-and-journey-d3ae60b2f645" rel="noopener noreferrer"&gt;take a look here&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;. That was the kickstarter post — this is our official entry into the journey. I hope you’ll enjoy it and join me. So, let’s kick things off and &lt;strong&gt;&lt;em&gt;start building our own AI assistant with NodeJs.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Welcome, &lt;em&gt;buddy&lt;/em&gt;&lt;/strong&gt;. We’re all in the same boat. I’m thinking of those who at least know &lt;strong&gt;JavaScript fundamentals — functions, variables, loops, basic types&lt;/strong&gt;. If you know more, that’s a bonus. But I’m assuming we’re all starting from a similar place. &lt;strong&gt;To build an AI assistant,&lt;/strong&gt; we’ll need to become familiar with some &lt;strong&gt;new concepts and technologies.&lt;/strong&gt; It’s essential, because without understanding them, we won’t know how the system works during development.&lt;/p&gt;

&lt;p&gt;Let’s highlight a few key terms we’ll encounter throughout this project:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;&amp;gt; Agent, Model, Ollama, LangChain, PGVector, RAG, Tools, Memory, Redis, Postgres, MongoDB, AI SDK, MCP Server (stdio and streamable HTTP), MCP Client, Docker, Embedding Engine, Semantic Search.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each of these topics deserves its own detailed blog post to really understand what’s going on. So, I’ve prepared a standalone post for each (&lt;strong&gt;&lt;em&gt;highly recommended&lt;/em&gt;&lt;/strong&gt;). Writing them took time after I shared the roadmap, but as promised, I’ll keep you updated. I also used ChatGPT to help with summarizing and formatting and to save some time, but I’ve reviewed everything personally — so you can read with confidence. If you spot any issues or missing information, please comment and I’ll fix it. Today, we’ll explore these topics, and in the next day or so, we’ll configure our environment to get hands-on.&lt;/p&gt;

&lt;p&gt;Stay tuned and let’s dive in! 🚀&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧠 Agent&lt;/strong&gt;&lt;br&gt;
An Agent is the brain that decides what actions to take based on user input. It manages reasoning and tool usage to handle dynamic queries. For example, if asked for the weather, the Agent fetches live data and crafts a response. Agents prevent hardcoding logic for every possible query, keeping the system flexible. They’re essential for creating AI that acts intelligently and naturally. 🔗 Curious about how Agents work? &lt;a href="https://ataur39n.medium.com/agent-the-brain-behind-intelligent-ai-workflows-c72a87bae483" rel="noopener noreferrer"&gt;Read the full Agent post!&lt;/a&gt;&lt;/p&gt;
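&lt;p&gt;To make the idea concrete, here is a tiny agent loop in plain Node.js. The tools and the keyword routing are invented stand-ins for illustration; a real agent lets the model do this reasoning step:&lt;/p&gt;

```javascript
// A tiny agent: pick a tool based on the user's intent, run it,
// and wrap the result into a reply. The keyword check below stands
// in for the model-driven reasoning a real agent would use.
const tools = {
  weather: (city) => ({ city, tempC: 22, sky: "clear" }), // stubbed "live" data
  time: () => new Date().toISOString(),
};

function agent(input) {
  const text = input.toLowerCase();
  if (text.includes("weather")) {
    const report = tools.weather("Dhaka");
    return `It's ${report.tempC}°C and ${report.sky} in ${report.city}.`;
  }
  if (text.includes("time")) {
    return `The current time is ${tools.time()}.`;
  }
  return "I don't have a tool for that yet.";
}
```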

&lt;p&gt;&lt;strong&gt;📚 Model&lt;/strong&gt;&lt;br&gt;
The Model is the core engine, responsible for understanding and generating text. It processes input using deep neural networks, outputting natural responses. For instance, asking “What’s LangChain?” gives a complete, contextual reply. Models enable language understanding and flexibility, far beyond simple rule-based systems. 🔗 Uncover the full magic of Models! &lt;a href="https://ataur39n.medium.com/the-heart-of-ai-assistants-understanding-the-model-0147adbe70c3" rel="noopener noreferrer"&gt;Explore the full Model blog!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⚙️ Ollama&lt;/strong&gt;&lt;br&gt;
Ollama makes it easy to run large models locally without complex setups. It provides a simple interface to models like LLaMA and Mistral, making advanced AI accessible. With Ollama, you can run models on your machine for privacy and offline use. It’s perfect for developers experimenting with local setups. Ollama also handles model loading, tokenization, and optimization, reducing manual configuration. For example, you can run a chatbot locally using &lt;code&gt;ollama serve&lt;/code&gt;. 🔗 Dive into local model magic! &lt;a href="https://ataur39n.medium.com/ollama-bringing-ai-models-to-your-local-machine-00ba3be32663" rel="noopener noreferrer"&gt;Check out the Ollama blog!&lt;/a&gt;&lt;/p&gt;
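&lt;p&gt;A quick sketch of what that looks like from Node.js, assuming Ollama is serving on its default port (11434) and a model has been pulled:&lt;/p&gt;

```javascript
// Ollama exposes a local REST API. With `ollama serve` running,
// POSTing this payload to /api/generate returns a JSON object whose
// `response` field holds the generated text.
function buildGenerateRequest(model, prompt) {
  return {
    url: "http://localhost:11434/api/generate",
    body: JSON.stringify({ model, prompt, stream: false }),
  };
}

// requires Node 18+ for the built-in fetch
async function ask(model, prompt) {
  const req = buildGenerateRequest(model, prompt);
  const res = await fetch(req.url, { method: "POST", body: req.body });
  const data = await res.json();
  return data.response;
}
```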

&lt;p&gt;&lt;strong&gt;🔗 LangChain&lt;/strong&gt;&lt;br&gt;
LangChain connects the dots between models, tools, and memory to create powerful workflows. It helps the assistant fetch data, handle steps, and respond intelligently. It supports modular design and allows integration of different tools seamlessly. LangChain enables chaining of complex tasks like database queries, search, and content generation. For example, it can pull invoices and draft emails based on a simple user request. 🔗 Master LangChain’s potential! &lt;a href="https://ataur39n.medium.com/%EF%B8%8F-langchain-overview-workflow-magic-for-ai-assistants-359563e6b152" rel="noopener noreferrer"&gt;Read the LangChain blog!&lt;/a&gt;&lt;/p&gt;
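&lt;p&gt;This isn’t LangChain’s actual API, but the core “chaining” idea fits in a few lines of plain JavaScript. The invoice and email steps below are made up to mirror the example above:&lt;/p&gt;

```javascript
// Chaining in miniature: each step receives the previous step's output.
// LangChain formalizes this pattern with models, retrievers, and tools
// as steps; these step functions are invented stand-ins.
const pipeline = (...steps) => (input) =>
  steps.reduce((acc, step) => step(acc), input);

const fetchInvoice = (userId) => ({ userId, invoiceId: "INV-42", total: 120 });
const draftEmail = (inv) =>
  `Hi! Your invoice ${inv.invoiceId} for $${inv.total} is attached.`;

const invoiceEmailChain = pipeline(fetchInvoice, draftEmail);
```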

&lt;p&gt;&lt;strong&gt;📦 PGVector&lt;/strong&gt;&lt;br&gt;
PGVector stores embeddings in PostgreSQL, enabling semantic search and fast data retrieval. It allows the assistant to store vector representations of documents and compare them to incoming queries. This makes searches faster and more meaningful than traditional keyword matching. PGVector supports indexing and similarity metrics, making it scalable for large datasets. For example, it can find relevant documents even with different phrasing. 🔗 Learn how PGVector supercharges search! &lt;a href="https://ataur39n.medium.com/pgvector-explained-boost-semantic-search-and-retrieval-with-postgres-7b86d047d9a2" rel="noopener noreferrer"&gt;Explore the PGVector blog!&lt;/a&gt;&lt;/p&gt;
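&lt;p&gt;A minimal schema sketch (table and column names are examples, and the vector size depends on your embedding model):&lt;/p&gt;

```sql
-- Enable pgvector and store one embedding per document.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE docs (
  id bigserial PRIMARY KEY,
  content text,
  embedding vector(768)
);

-- Nearest neighbours by cosine distance; pgvector also offers
-- operator syntax and indexes for large datasets.
SELECT content
FROM docs
ORDER BY cosine_distance(embedding, $1)
LIMIT 3;
```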

&lt;p&gt;&lt;strong&gt;🔎 RAG&lt;/strong&gt;&lt;br&gt;
RAG enhances model generation by grounding responses in real data. Instead of guessing, the assistant fetches relevant content and combines it with generation. This makes answers more accurate and context-aware, reducing errors and hallucinations. It powers document-based QA, FAQs, and retrieval of critical information in real time. RAG improves reliability by providing references alongside generated responses. For example, it retrieves best Docker practices before responding. 🔗 See RAG in action! &lt;a href="https://ataur39n.medium.com/rag-simplified-enhance-ai-accuracy-with-real-time-retrieval-075338e54d83" rel="noopener noreferrer"&gt;Check the RAG blog!&lt;/a&gt;&lt;/p&gt;
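&lt;p&gt;Here’s the retrieve-then-generate flow in miniature. The scoring is naive word overlap just to show the shape; a real setup would use embeddings and a vector store:&lt;/p&gt;

```javascript
// RAG in miniature: retrieve the most relevant snippet first,
// then ground the prompt in it before generation.
const docs = [
  "Docker best practice: keep images small and use multi-stage builds.",
  "Redis stores data in memory for microsecond reads.",
];

function retrieve(query) {
  const words = query.toLowerCase().split(/\s+/);
  const score = (doc) =>
    words.filter((w) => doc.toLowerCase().includes(w)).length;
  return docs.slice().sort((a, b) => score(b) - score(a))[0];
}

function buildPrompt(query) {
  const context = retrieve(query);
  return `Answer using this context:\n${context}\n\nQuestion: ${query}`;
}
```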

&lt;p&gt;&lt;strong&gt;⚡ Redis&lt;/strong&gt;&lt;br&gt;
Redis is a lightning-fast in-memory database that manages session data, caches, and keeps the system responsive. It stores conversation history, user states, and real-time data for smooth interactions. Redis supports data structures like lists, hashes, and sorted sets for flexible use cases. It also enables features like rate limiting and temporary data storage. For example, Redis can track user sessions during a multi-step form. 🔗 Unlock Redis magic! &lt;a href="https://ataur39n.medium.com/redis-for-ai-speed-up-conversations-and-manage-memory-2f099675df3b" rel="noopener noreferrer"&gt;Explore the Redis blog!&lt;/a&gt;&lt;/p&gt;
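&lt;p&gt;The session-tracking pattern looks like this. A &lt;code&gt;Map&lt;/code&gt; stands in for the Redis server so the sketch is self-contained; with the real &lt;code&gt;redis&lt;/code&gt; client the calls would be &lt;code&gt;client.set(key, value, { EX: seconds })&lt;/code&gt; and &lt;code&gt;client.get(key)&lt;/code&gt;:&lt;/p&gt;

```javascript
// Session state the Redis way: set a value with a TTL, read it back,
// and let it expire automatically.
const store = new Map();

function setEx(key, value, ttlMs, now = Date.now()) {
  store.set(key, { value, expiresAt: now + ttlMs });
}

function get(key, now = Date.now()) {
  const entry = store.get(key);
  if (!entry) return null;
  if (now >= entry.expiresAt) {
    store.delete(key); // expired, same as Redis evicting the key
    return null;
  }
  return entry.value;
}
```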

&lt;p&gt;&lt;strong&gt;🗄️ Postgres&lt;/strong&gt;&lt;br&gt;
Postgres is the structured database for storing user profiles, settings, and transactional data. It ensures data integrity and handles complex queries with ACID compliance. Postgres supports foreign keys, indexing, and constraints to maintain data relationships. It scales well for large datasets and integrates with extensions like PGVector. For example, Postgres can store user subscription details and fetch them on request. 🔗 Get deep into Postgres! &lt;a href="https://ataur39n.medium.com/postgres-made-easy-manage-structured-data-in-ai-projects-bb314de7713b" rel="noopener noreferrer"&gt;Read the Postgres blog!&lt;/a&gt;&lt;/p&gt;
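&lt;p&gt;For the subscription example, a schema sketch might look like this (names are illustrative):&lt;/p&gt;

```sql
-- Structured, relational storage: a foreign key links subscriptions
-- to users, so Postgres enforces the relationship for us.
CREATE TABLE users (
  id bigserial PRIMARY KEY,
  email text UNIQUE NOT NULL
);

CREATE TABLE subscriptions (
  id bigserial PRIMARY KEY,
  user_id bigint NOT NULL REFERENCES users (id),
  plan text NOT NULL,
  renews_at timestamptz
);

-- fetch a user's subscription details on request
SELECT u.email, s.plan, s.renews_at
FROM subscriptions s
JOIN users u ON u.id = s.user_id
WHERE u.email = 'someone@example.com';
```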

&lt;p&gt;&lt;strong&gt;🗃️ MongoDB&lt;/strong&gt;&lt;br&gt;
MongoDB handles flexible data like logs and activity records. Its document model allows easy adaptation to changing data formats, perfect for chat logs or analytics. Documents can have nested structures and varied fields without requiring schema changes. MongoDB scales horizontally through sharding for large datasets. For example, chat sessions and logs can be stored and queried efficiently. 🔗 Discover MongoDB’s flexibility! &lt;a href="https://ataur39n.medium.com/storing-flexible-data-for-ai-assistants-with-mongodb-69092ee58072" rel="noopener noreferrer"&gt;Dive into the MongoDB blog!&lt;/a&gt;&lt;/p&gt;
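&lt;p&gt;The flexibility shows up in the documents themselves: two chat-log entries with different shapes can live side by side. With the real driver this would be &lt;code&gt;db.collection("chat_logs").insertMany(logs)&lt;/code&gt; and a &lt;code&gt;find&lt;/code&gt; query; here plain objects keep the sketch self-contained:&lt;/p&gt;

```javascript
// Two documents, different fields, same collection: no schema change needed.
const logs = [
  { userId: "u1", role: "user", text: "hi", ts: 1 },
  { userId: "u1", role: "assistant", text: "hello!", ts: 2, toolCalls: ["echo"] },
];

// querying by a shared field works regardless of each document's shape
const findByUser = (collection, userId) =>
  collection.filter((doc) => doc.userId === userId);
```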

&lt;p&gt;&lt;strong&gt;💡 AI SDK&lt;/strong&gt;&lt;br&gt;
The AI SDK simplifies working with various AI providers like OpenAI or Anthropic. It standardizes model calls, letting you switch models with minimal code changes, making development faster and cleaner. It supports text generation, embeddings, and function calling in a consistent interface. The SDK also handles streaming responses for interactive UIs. For example, generating a summary from a user prompt using OpenAI’s GPT-4. 🔗 Simplify AI integration! &lt;a href="https://ataur39n.medium.com/ai-sdk-simplified-integrate-multiple-ai-models-with-easeai-sdk-simplified-integrate-multiple-ai-701abf5d1233" rel="noopener noreferrer"&gt;Check out the AI SDK blog!&lt;/a&gt;&lt;/p&gt;
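&lt;p&gt;The core idea is one call shape across many providers. These providers are stubs, not the SDK’s real clients, but they show why swapping models becomes a one-line change:&lt;/p&gt;

```javascript
// A uniform facade over interchangeable providers. A real AI SDK wires
// this same kind of interface to OpenAI, Anthropic, local models, etc.
const providers = {
  "openai:gpt-4": (prompt) => `[gpt-4] summary of: ${prompt}`,
  "anthropic:claude": (prompt) => `[claude] summary of: ${prompt}`,
};

function generateText({ model, prompt }) {
  return { text: providers[model](prompt) };
}
```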

&lt;p&gt;&lt;strong&gt;🌐 MCP (Server &amp;amp; Client)&lt;/strong&gt;&lt;br&gt;
MCP standardizes communication between the assistant and external tools. The server exposes tools, while the client manages calls. This architecture allows seamless integration with different tools and APIs. MCP supports transports like stdio for local setups and streamable HTTP for remote servers, making it versatile and scalable. It provides a unified protocol for tools, making integration simpler and more modular. For example, it can fetch weather data using a modular tool without complex API setups. 🔗 Discover MCP’s power! &lt;a href="https://ataur39n.medium.com/mcp-standardizing-tool-access-in-ai-workflows-ed403f74b5c0" rel="noopener noreferrer"&gt;Read the MCP blog!&lt;/a&gt;&lt;/p&gt;
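&lt;p&gt;MCP messages follow JSON-RPC, and two methods do most of the work: &lt;code&gt;tools/list&lt;/code&gt; and &lt;code&gt;tools/call&lt;/code&gt;. This toy handler shows the shape; real servers also negotiate capabilities and run over stdio or streamable HTTP transports:&lt;/p&gt;

```javascript
// A toy MCP-style server core: one registry of tools, two methods.
const toolRegistry = {
  get_weather: (args) => ({ city: args.city, tempC: 21 }), // stubbed tool
};

function handle(message) {
  if (message.method === "tools/list") {
    return { id: message.id, result: { tools: Object.keys(toolRegistry) } };
  }
  if (message.method === "tools/call") {
    const { name, arguments: args } = message.params;
    return { id: message.id, result: toolRegistry[name](args) };
  }
  // standard JSON-RPC "method not found" error code
  return { id: message.id, error: { code: -32601, message: "method not found" } };
}
```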

&lt;p&gt;&lt;strong&gt;🐳 Docker&lt;/strong&gt;&lt;br&gt;
Docker packages services into containers, ensuring consistent environments and smooth deployments. It isolates dependencies and allows running multiple services without conflicts. Docker simplifies local development and cloud deployments by using containers and orchestration tools. It supports scaling and automation through Compose and Swarm. For example, running Ollama, Redis, and PGVector with one &lt;code&gt;docker-compose&lt;/code&gt; command. 🔗 Master container magic! &lt;a href="https://ataur39n.medium.com/docker-for-scalable-and-clean-ai-environments-f1475a9206de" rel="noopener noreferrer"&gt;Explore the Docker blog!&lt;/a&gt;&lt;/p&gt;
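&lt;p&gt;A compose sketch for exactly that stack might look like this (image tags are examples; pin versions you’ve tested):&lt;/p&gt;

```yaml
# One command (`docker compose up`) brings the whole stack up together.
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
  redis:
    image: redis:7
    ports:
      - "6379:6379"
  db:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_PASSWORD: example
    ports:
      - "5432:5432"
```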

&lt;p&gt;&lt;strong&gt;🧬 Embedding Engine&lt;/strong&gt;&lt;br&gt;
The embedding engine converts text into vectors that capture meaning, crucial for semantic search and RAG. It works by using pre-trained models to map text to high-dimensional vectors that reflect semantic relationships. These embeddings power document retrieval, contextual responses, and even recommendation systems. Keeping embeddings consistent and versioned is vital. It enables finding contextually relevant data and reducing irrelevant matches. 🔗 Understand embeddings deeply! &lt;a href="https://ataur39n.medium.com/embedding-engines-explained-fuel-semantic-search-in-ai-9a4ea4b38a2f" rel="noopener noreferrer"&gt;Read the Embedding Engine blog!&lt;/a&gt;&lt;/p&gt;
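&lt;p&gt;To see the text-to-vector shape without any model, here’s a toy “embedding engine”. It only shows the mechanics; real engines use trained models so that similar meanings land close together:&lt;/p&gt;

```javascript
// Map text into a fixed-length numeric vector. Deterministic, so the
// same text always produces the same vector (consistency matters for
// search), but hash-based, so it carries no real semantics.
function embed(text, dims = 8) {
  const vector = new Array(dims).fill(0);
  Array.from(text.toLowerCase()).forEach((ch, i) => {
    vector[i % dims] += ch.charCodeAt(0) / 1000;
  });
  return vector;
}
```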

&lt;p&gt;&lt;strong&gt;🔍 Semantic Search&lt;/strong&gt;&lt;br&gt;
Semantic search retrieves data based on meaning rather than keywords. It uses embeddings to find the most relevant documents, ensuring accurate and helpful responses. It’s the engine behind natural, user-friendly searches in the assistant. It works by comparing vector similarities, enabling matching of related but differently phrased queries. Combining semantic search with metadata filters can further improve precision and recall. 🔗 Discover semantic magic! &lt;a href="https://ataur39n.medium.com/smarter-search-for-ai-with-semantic-understanding-acb2c51d4096" rel="noopener noreferrer"&gt;Dive into the Semantic Search blog!&lt;/a&gt;&lt;/p&gt;
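&lt;p&gt;Under the hood this is vector math: cosine similarity measures how closely two embedding vectors point the same way (1 means identical direction). The vectors below are hand-made for illustration:&lt;/p&gt;

```javascript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  a.forEach((v, i) => {
    dot += v * b[i];
    normA += v * v;
    normB += b[i] * b[i];
  });
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents by similarity to the query vector, best match first.
function search(queryVec, docsWithVecs) {
  return docsWithVecs
    .slice()
    .sort((x, y) =>
      cosineSimilarity(y.vec, queryVec) - cosineSimilarity(x.vec, queryVec));
}
```

&lt;p&gt;In practice you’d store the vectors in PGVector and let the database do this comparison at scale, combined with metadata filters for precision.&lt;/p&gt;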

&lt;p&gt;&lt;strong&gt;Wow! &lt;em&gt;Congratulations&lt;/em&gt;&lt;/strong&gt;. You just explored &lt;em&gt;14 new topics&lt;/em&gt;. &lt;strong&gt;Great work&lt;/strong&gt;. That’s it for today — take a break. I highly recommend reading each topic’s detailed post. They’re short, maybe &lt;em&gt;2–3 minutes&lt;/em&gt; each, but they’ll leave you &lt;strong&gt;clear as water&lt;/strong&gt;. I’ve tried to explain everything in simple words with easy examples. Best wishes!&lt;/p&gt;

&lt;p&gt;🔗 Follow me for updates, and let’s build an amazing AI Assistant together!&lt;/p&gt;

&lt;p&gt;👉 Got questions? Leave them below!&lt;br&gt;
👉 Stay tuned for the next post in this series!&lt;/p&gt;

&lt;p&gt;💖 If you’re finding value in my posts and want to help me continue creating, feel free to support me here &lt;a href="https://cutt.ly/0rvsCQkd" rel="noopener noreferrer"&gt;[Buy me a Coffee]&lt;/a&gt;! Every contribution helps, and I truly appreciate it! Thank you.🙌&lt;/p&gt;

</description>
      <category>node</category>
      <category>langchain</category>
      <category>ollama</category>
      <category>mcp</category>
    </item>
    <item>
      <title>🚀 Build Your Own AI Assistant with Node.js: My Roadmap and Journey 🌟</title>
      <dc:creator>Ataur Rahman</dc:creator>
      <pubDate>Sun, 25 May 2025 01:39:15 +0000</pubDate>
      <link>https://forem.com/ataur39n/build-your-own-ai-assistant-with-nodejs-my-roadmap-and-journey-46c7</link>
      <guid>https://forem.com/ataur39n/build-your-own-ai-assistant-with-nodejs-my-roadmap-and-journey-46c7</guid>
      <description>&lt;p&gt;Hey everyone! 👋&lt;/p&gt;

&lt;p&gt;I’m excited to kick off a new &lt;strong&gt;blog series&lt;/strong&gt; where I’ll walk you through my journey of &lt;strong&gt;building a custom AI Assistant&lt;/strong&gt; using &lt;strong&gt;Node.js&lt;/strong&gt;, &lt;strong&gt;LangChain&lt;/strong&gt;, and other cutting-edge tools. 💻✨&lt;/p&gt;

&lt;p&gt;This series is not just about coding – it’s about &lt;strong&gt;learning, experimenting, and sharing&lt;/strong&gt; everything I discover along the way. Whether you’re a developer like me, curious about AI, or just love diving into cool projects, you’re welcome to join me on this adventure! 🙌&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 Here’s the Roadmap I’ll Be Following:
&lt;/h2&gt;

&lt;p&gt;🔹 &lt;strong&gt;1. Introduction: Understanding Tools and Setting Up the Environment&lt;/strong&gt;&lt;br&gt;
In this stage, we’ll explore the essential tools and technologies like Node.js, LangChain, PGVector, ai-sdk, and Redis. You’ll learn how to configure your local machine, install dependencies, and prepare a robust environment.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Setting up a scalable and developer-friendly environment saves future debugging time.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;2. Building a General Chat Assistant&lt;/strong&gt;&lt;br&gt;
We’ll create a basic chat assistant capable of handling conversations.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frontend Focus: Use ai-sdk to quickly build an interactive UI that sends queries to a local LLM (Large Language Model) and renders responses.&lt;/li&gt;
&lt;li&gt;Backend Focus: With LangChain, develop a backend where the model logic resides, and the UI just handles input/output. This approach is ideal for scalable control.
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Understand the trade-offs between frontend-heavy and backend-controlled architectures.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔹 &lt;strong&gt;3. Connecting a Database to Our Chat Assistant&lt;/strong&gt;&lt;br&gt;
Integrate a database (PostgreSQL, MongoDB, etc.) to store conversation history, user preferences, and tool usage logs.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: A database transforms a stateless chatbot into a persistent, context-aware assistant.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;4. Setting Up Chat Memory&lt;/strong&gt;&lt;br&gt;
Implement memory techniques like Redis, local storage, or LangChain memory modules.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Memory management is crucial for context retention in multi-turn conversations.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;5. Understanding PGVector and Vector Embedding Engines&lt;/strong&gt;&lt;br&gt;
Explore how embedding models convert text into numerical vectors and how PGVector stores and retrieves these vectors efficiently.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Embedding vectors enable semantic understanding, letting the assistant retrieve relevant information.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;6. Integrating PGVector and Embedding Engines into Our Chat Backend&lt;/strong&gt;&lt;br&gt;
Connect embeddings to the backend for contextually relevant query results.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Merging embeddings into the chat logic enhances response quality and relevance.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;7. What is RAG (Retrieval-Augmented Generation)?&lt;/strong&gt;&lt;br&gt;
Learn how RAG combines retrieval systems with language models to generate accurate, dynamic responses.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: RAG makes assistants factually accurate by grounding answers in reliable sources.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;8. Configuring RAG for Our Project&lt;/strong&gt;&lt;br&gt;
Set up a basic RAG system in the backend with PGVector.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Correctly configured RAG enables high-quality, up-to-date responses.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;9. Integrating RAG with Our Backend&lt;/strong&gt;&lt;br&gt;
Connect RAG into the chatbot flow for seamless retrieval and generation.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Integration ensures smooth handoffs between retrieval and generation steps.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;10. Adding Tools to Our Backend with LangChain&lt;/strong&gt;&lt;br&gt;
Expand capabilities with custom tools using LangChain’s tools architecture.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Custom tools enhance functionality, making the assistant more versatile.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;11. What is MCP? Why Do We Need It?&lt;/strong&gt;&lt;br&gt;
Explore MCP (Model Context Protocol) for managing tools more flexibly than LangChain alone.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: MCP offers a structured approach to tool calling beyond LangChain’s built-ins.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;12. Building Simple Stdio and Streamable HTTP Servers&lt;/strong&gt;&lt;br&gt;
Learn to build basic servers for tool management and AI-generated responses.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Streamable servers provide real-time interaction and efficient resource management.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;13. Organizing the Streamable Server&lt;/strong&gt;&lt;br&gt;
Organize the server for simple request handling and error management.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: A well-organized server ensures reliable performance in basic use cases.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;14. Connecting MCP with LangChain Backend&lt;/strong&gt;&lt;br&gt;
Integrate MCP with LangChain to enable tool calling and result handling.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: This connection brings dynamic tool calling into the assistant’s workflow.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;15. Tool Calling Ideologies&lt;/strong&gt;&lt;br&gt;
Explore two strategies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Intent-Based: Explicit tool invocation based on user intent.&lt;/li&gt;
&lt;li&gt;Free Decision: LLMs decide autonomously which tool to call.
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Each strategy has use cases; understanding them helps design the right experience.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔹 &lt;strong&gt;16. Wrapping It All Together&lt;/strong&gt;&lt;br&gt;
Combine everything: memory, RAG, MCP, and LangChain backend to create a complete, experimental AI assistant system.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Integration delivers a seamless assistant with advanced features.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;17. Bonus: Exploring ai-sdk for Full Integration&lt;/strong&gt;&lt;br&gt;
Explore building the same system using ai-sdk, comparing approaches for deeper understanding.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Exploring multiple frameworks broadens skill sets and insight.&lt;/p&gt;




&lt;h2&gt;
  
  
  🗓 My Posting Schedule
&lt;/h2&gt;

&lt;p&gt;I’ll aim to cover one topic per day. However, since testing and building take time, it might not be possible to post daily. Rest assured, I’ll share each new piece as soon as I can! 💪&lt;/p&gt;




&lt;h2&gt;
  
  
  💬 Let’s Learn Together!
&lt;/h2&gt;

&lt;p&gt;As a &lt;strong&gt;JavaScript developer&lt;/strong&gt;, especially in &lt;strong&gt;Node.js&lt;/strong&gt;, I’ll approach this project from my own perspective. I’ll share:&lt;br&gt;
✅ My learnings and discoveries&lt;br&gt;
✅ Challenges and solutions&lt;br&gt;
✅ Mistakes and how I corrected them&lt;br&gt;
✅ Helpful code snippets and explanations&lt;/p&gt;

&lt;p&gt;I’m not perfect – I’ll definitely make mistakes. If you spot something wrong, or have suggestions, please leave a comment and help me (and others) learn and improve. 🙏 Let’s make this journey collaborative! 🚀&lt;/p&gt;




&lt;p&gt;🔗 &lt;strong&gt;Follow me for updates&lt;/strong&gt;, and let’s build an amazing AI Assistant together! &lt;a href="https://ataur39n.medium.com/build-your-own-ai-assistant-with-node-js-my-roadmap-and-journey-d3ae60b2f645" rel="noopener noreferrer"&gt;Also on Medium&lt;/a&gt;&lt;br&gt;
👉 Got questions? Leave them below!&lt;br&gt;
👉 Stay tuned for the next post in this series!&lt;/p&gt;

&lt;p&gt;💖 &lt;strong&gt;If you’d like to support my work and help me continue sharing, you can contribute here - &lt;a href="https://cutt.ly/0rvsCQkd" rel="noopener noreferrer"&gt;buy me a coffee&lt;/a&gt;. Every little bit helps – thank you! 🙏&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;💬 &lt;strong&gt;Join the Journey with Me!&lt;/strong&gt;&lt;br&gt;
Whether you’re diving in solo, bringing a friend, or joining as a team—come along on this learning adventure! 🚀 Let’s grow together, one step at a time.&lt;/p&gt;




</description>
      <category>node</category>
      <category>langchain</category>
      <category>vectordatabase</category>
      <category>mcp</category>
    </item>
    <item>
      <title>`echo "Hello, World!"` — the classic start to every dev journey... and also the intro to mine. 🔥</title>
      <dc:creator>Ataur Rahman</dc:creator>
      <pubDate>Thu, 15 May 2025 21:20:46 +0000</pubDate>
      <link>https://forem.com/ataur39n/echo-hello-world-the-classic-start-to-every-dev-journey-and-also-the-intro-to-mine-25c2</link>
      <guid>https://forem.com/ataur39n/echo-hello-world-the-classic-start-to-every-dev-journey-and-also-the-intro-to-mine-25c2</guid>
      <description>&lt;p&gt;This video might look simple — just printing a message — but guess what?&lt;br&gt;
That message is coming straight from my &lt;strong&gt;custom tool&lt;/strong&gt;, bound into my very own &lt;strong&gt;AI assistant backend&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It’s a basic tool (just echoing input) — but that’s the point.&lt;br&gt;
We’re testing the foundation.&lt;br&gt;
And just like that, a new journey begins. 🚀&lt;/p&gt;

&lt;p&gt;Right now, we’ve set up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Local LLMs via &lt;strong&gt;Ollama&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;✅ Flow + tool binding with &lt;strong&gt;LangChain&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;✅ Embeddings + search using &lt;strong&gt;PGVector&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;✅ Session memory plan&lt;/li&gt;
&lt;li&gt;✅ Streaming UI — fully working and shown in the video&lt;/li&gt;
&lt;li&gt;✅ Tools bound and functional (even this echo)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’re now diving into the &lt;strong&gt;MCP server&lt;/strong&gt; — exploring advanced tool orchestration and how to scale across multiple servers.&lt;/p&gt;

&lt;p&gt;But let’s be clear:&lt;br&gt;
👉 What we’ve done so far is &lt;strong&gt;just the beginning&lt;/strong&gt;.&lt;br&gt;
We're still in a small zone of a big vision — and there’s &lt;strong&gt;a LOT left to build.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;🎯 &lt;strong&gt;Blog series coming soon.&lt;/strong&gt;&lt;br&gt;
Maybe videos too — though I want to focus on building first.&lt;/p&gt;

&lt;p&gt;And maybe… just maybe…&lt;br&gt;
We can turn this into a &lt;strong&gt;bootcamp-style learning group&lt;/strong&gt; or &lt;strong&gt;live workshop&lt;/strong&gt; where we explore, test, and learn together.&lt;/p&gt;

&lt;p&gt;I watched tons of tutorials, read docs, debugged endlessly… but never found a complete, JS-focused guide that connects everything together — or maybe I just didn’t find the one that worked for me. So, I’m making one.&lt;/p&gt;

&lt;p&gt;But what’s more important is &lt;strong&gt;how&lt;/strong&gt; we’re building it.&lt;/p&gt;

&lt;p&gt;We’re doing everything from scratch — manually configuring each part.&lt;br&gt;
Why? Because we want to understand the core.&lt;/p&gt;

&lt;p&gt;There are definitely easier ways. We could’ve used pre-built SDKs, hosted platforms, or plug-and-play services.&lt;br&gt;
But once we truly understand how everything connects — from embeddings to vector search to tool invocation — we’ll have the power to use any provider, or even build our own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We’re not just learning tools.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;We’re learning how to build our own AI brains — with control, understanding, and creativity.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Whether you're:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A beginner? &lt;strong&gt;Let’s cook together.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Already familiar with some of these tools? &lt;strong&gt;Drop advice! I’m listening.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Confused or stuck? &lt;strong&gt;Comment your question&lt;/strong&gt; — maybe someone here can help you, or I’ll try!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/yoY2bNLwv_M"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

</description>
      <category>helloworld</category>
      <category>langchain</category>
      <category>ollama</category>
      <category>node</category>
    </item>
  </channel>
</rss>
