<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ataur Rahman</title>
    <description>The latest articles on Forem by Ataur Rahman (@ataur39n).</description>
    <link>https://forem.com/ataur39n</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F627681%2F23d52ab1-1b18-4845-9524-c5655cf6427f.jpeg</url>
      <title>Forem: Ataur Rahman</title>
      <link>https://forem.com/ataur39n</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ataur39n"/>
    <language>en</language>
    <item>
      <title>From Terminal to UI: Building Your First Local AI Assistant with Node.js</title>
      <dc:creator>Ataur Rahman</dc:creator>
      <pubDate>Mon, 21 Jul 2025 14:37:06 +0000</pubDate>
      <link>https://forem.com/ataur39n/from-terminal-to-ui-building-your-first-local-ai-assistant-with-nodejs-10ok</link>
      <guid>https://forem.com/ataur39n/from-terminal-to-ui-building-your-first-local-ai-assistant-with-nodejs-10ok</guid>
      <description>&lt;p&gt;Hi everyone! How's your journey with AI going? Each day feels more exciting than the last. We're living through a technological revolution, witnessing rapid innovation in AI like never before.&lt;/p&gt;

&lt;p&gt;I won't spend time here trying to explain what AI is capable of - that's already clear. The real question is: how can we benefit from it? If I can complete 10 tasks in a day, but AI helps me get those done in half the time, I can spend the rest doing more meaningful work - or just resting. That's the magic of automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But let's be clear:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;We shouldn't become addicted to AI. Instead, we should learn how to make the most of it. That means staying updated and understanding the fundamentals - what's happening behind the scenes. Once you grasp how current AI systems work, you'll find yourself ready to build and innovate with confidence.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;Quick Note Before We Begin&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apologies&lt;/strong&gt; for the delay - it's been 1.5 months since my last post. I've been under the weather, dealing with job pressure, and learning a lot of new things. But now I'm back, and the good news is: &lt;strong&gt;I've already finished testing demo apps for the next 5–6 posts!&lt;/strong&gt; That means new content will be rolling out much faster - so stay tuned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In our previous blog,&lt;/strong&gt; I explained the essential tools and topics like Ollama, LangChain, and how local models work. I won't repeat those here - please check out that post if you haven't yet: &lt;a href="https://medium.com/javascript-in-plain-english/building-an-ai-assistant-essential-tools-and-concepts-a8f12497cd65" rel="noopener noreferrer"&gt;read the previous post in detail here&lt;/a&gt;. I will only mention those tools in passing as I use them. Before jumping into the application, make sure you have covered the following:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites&lt;/strong&gt;&lt;br&gt;
Make sure you have the following set up locally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; &lt;strong&gt;Node.js&lt;/strong&gt; installed → &lt;a href="https://nodejs.org/en/download" rel="noopener noreferrer"&gt;Download Node.js&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Ollama&lt;/strong&gt; installed → &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Download Ollama&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; A local model pulled with Ollama.
For this demo, I'm using the lightweight model &lt;strong&gt;llama3.2:3b-instruct-q4_K_M (approx. 2–2.5GB)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;And you have already read this post: &lt;a href="https://medium.com/javascript-in-plain-english/building-an-ai-assistant-essential-tools-and-concepts-a8f12497cd65" rel="noopener noreferrer"&gt;Building an AI Assistant with Nodejs, Ollama and Langchain: Essential Tools and Concepts&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can pull and run it using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ollama run llama3.2:3b-instruct-q4_K_M
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;To check your setup,&lt;/strong&gt; run the command above in your terminal. You should see the model's interactive prompt, which confirms everything is ready.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnj6905oj5fdwo5ole6id.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnj6905oj5fdwo5ole6id.png" alt="Testing in terminal if everything is ok" width="800" height="618"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see all the models available on your machine by running the &lt;strong&gt;ollama ls&lt;/strong&gt; command. I have 5 models, as you can see in the screenshot. If this is your first time and you followed the prerequisites, your list will show only one.&lt;/p&gt;

&lt;h2&gt;
  Let's Get Started
&lt;/h2&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;So what are we going to do in this tutorial?&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;If you run "&lt;strong&gt;&lt;code&gt;ollama run llama3.2:3b-instruct-q4_K_M&lt;/code&gt;&lt;/strong&gt;" command in your terminal, you should see the model response interface and you can a conversation with model. You can end or exit from the conversation by sending - "&lt;strong&gt;/bye&lt;/strong&gt;" message.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84oeepl3fykxrh9urvrf.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84oeepl3fykxrh9urvrf.gif" alt="Interact with model from terminal" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;But what is our goal?&lt;/u&gt;&lt;/strong&gt;&lt;br&gt;
Building our own AI assistant with many capabilities - which is not possible from the terminal. We need an application that interacts with the model and adds smart capabilities on top. But before talking about capabilities, we need a basic application where we can interact with the model from a UI. In other words, what we are doing now from the terminal, our application should do for us: no more terminal needed to talk to the model. We will talk through our application, and it will manage how to interact with the model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw1je96713xefgwb6hcb.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw1je96713xefgwb6hcb.gif" alt="Basic AI assistant" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;How can we achieve that?&lt;/u&gt;&lt;/strong&gt;&lt;br&gt;
I will do everything in TypeScript (JavaScript). So I chose Next.js for the front-end and Node.js (with the Express.js framework) for the back-end. A simple Express.js application can handle our basic needs.&lt;/p&gt;

&lt;p&gt;The first step is the back-end.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Initialize a Node.js application&lt;/strong&gt; and set up a basic Express app by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm init //initaial project

//install packages
npm &lt;span class="nb"&gt;install &lt;/span&gt;express @types/express cors @types/cors dotenv @langchain/core @langchain/community @langchain/ollama 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After installing the packages, create an &lt;strong&gt;index.ts&lt;/strong&gt; file with a basic /stream route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;cors&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cors&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dotenv&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;dotenv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;9000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;express&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cors&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="cm"&gt;/* 
    This is the endpoint that will be used to stream the response from the AI to the client.
*/&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/stream&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;//we will do our business logic here step by step&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Server is running on port &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this route, we expect the user to send at least a message, like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Hi"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the /stream route, we will catch and process it like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;   &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Message is required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ChatOllama&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llama3.2:3b-instruct-q4_K_M&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:11434&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;formatedMessages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;formatedMessages&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Stream error:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Stream error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait, wait, wait! We're not done yet. Before jumping to the next step, let's first understand what I did here.&lt;/p&gt;

&lt;p&gt;We are &lt;strong&gt;configuring&lt;/strong&gt; our model in code with &lt;strong&gt;LangChain&lt;/strong&gt;. The &lt;strong&gt;ChatOllama&lt;/strong&gt; class comes from the &lt;strong&gt;@langchain/ollama&lt;/strong&gt; package. Here we are configuring the &lt;strong&gt;model&lt;/strong&gt; information. When we install &lt;strong&gt;Ollama&lt;/strong&gt;, it exposes &lt;strong&gt;port 11434&lt;/strong&gt; by default, and we can communicate with Ollama through that port. Now the question is:&lt;/p&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;Since Ollama itself exposes a REST API, why are we using LangChain instead of talking to Ollama directly?&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;If you have this question in your head, I'd say: good catch. You are genuinely curious about how this AI stack works. Back to the point: to understand the answer, you first have to know &lt;strong&gt;what Ollama and LangChain are&lt;/strong&gt;. As I mentioned, I have a detailed post about those topics - &lt;a href="https://medium.com/javascript-in-plain-english/building-an-ai-assistant-essential-tools-and-concepts-a8f12497cd65" rel="noopener noreferrer"&gt;I highly recommend reading it here&lt;/a&gt;. But let me recap briefly.&lt;/p&gt;

&lt;p&gt;We could communicate with Ollama through its default REST API, and it would work. But then our application would be &lt;strong&gt;tightly coupled&lt;/strong&gt; to Ollama. In that case we could not easily use other models such as &lt;strong&gt;OpenAI&lt;/strong&gt;, &lt;strong&gt;Claude&lt;/strong&gt;, or &lt;strong&gt;Gemini&lt;/strong&gt; in our system. &lt;strong&gt;Actually we could, but we would need a different configuration for each provider.&lt;/strong&gt; This is where LangChain enters the picture. In our current context, LangChain acts as a wrapper around the configuration of many model providers, and the configuration looks similar for all of them. Since we are using the Ollama provider, LangChain offers us the ChatOllama class to communicate with our local model. You can see the other integrations &lt;a href="https://js.langchain.com/docs/integrations/chat/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
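&lt;p&gt;To make that concrete: because every LangChain chat model shares the same interface, switching providers is essentially a one-line change. Here is a sketch - the commented-out ChatOpenAI line assumes the @langchain/openai package and an API key, neither of which is part of this tutorial:&lt;/p&gt;

```typescript
import { ChatOllama } from "@langchain/ollama";
// import { ChatOpenAI } from "@langchain/openai"; // hypothetical swap, needs an OPENAI_API_KEY

// Local model via Ollama:
const model = new ChatOllama({
    model: "llama3.2:3b-instruct-q4_K_M",
    baseUrl: "http://localhost:11434",
    temperature: 0,
});

// A hosted model would have the same shape:
// const model = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 });

// Either way, the rest of the code stays unchanged:
// const stream = await model.stream(formatedMessages);
```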

&lt;blockquote&gt;
&lt;p&gt;Right now this config looks very simple, which may raise the question of what a more complex configuration looks like. Keep searching for the answer, and let me know your questions in the comment section too.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To configure the model, we need to pass which model we want to use and what the endpoint is. In our case, we installed &lt;strong&gt;llama3.2:3b-instruct-q4_K_M&lt;/strong&gt; earlier, and the base URL is the provider URL. In our case, the provider is Ollama, and it is exposed on port 11434.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;       &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ChatOllama&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llama3.2:3b-instruct-q4_K_M&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:11434&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
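&lt;p&gt;As a taste of what a more complex configuration can look like, ChatOllama accepts additional options beyond these three. This is a sketch - option names like numCtx and keepAlive come from the @langchain/ollama documentation, so double-check them against the version you installed:&lt;/p&gt;

```typescript
import { ChatOllama } from "@langchain/ollama";

const model = new ChatOllama({
    model: "llama3.2:3b-instruct-q4_K_M",
    baseUrl: "http://localhost:11434",
    temperature: 0,  // 0 = deterministic, higher = more varied answers
    numCtx: 4096,    // context window size in tokens
    keepAlive: "5m", // keep the model loaded in memory between requests
});
```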





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;formatedMessages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;formatedMessages&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;Why do I wrap the user message in the HumanMessage class, and why do I put it in an array?&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Answer: We could use the user message directly, but by using this class we ensure it is passed in exactly the shape the model expects. You can also call it like this, but &lt;strong&gt;the first approach is recommended:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const stream = await model.stream("Hello, how are you?");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
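&lt;p&gt;Under the hood, a HumanMessage is little more than the message text tagged with a "human" role, so every provider receives it in a shape it understands. A plain-object sketch of the idea (this is an illustration, not LangChain's actual internal representation):&lt;/p&gt;

```typescript
// Illustration only: a simplified picture of what a chat message carries.
type Role = "system" | "human" | "ai";

interface SimpleChatMessage {
    role: Role;
    content: string;
}

// Roughly what `new HumanMessage(message)` gives us:
function toHumanMessage(text: string): SimpleChatMessage {
    return { role: "human", content: text };
}

// Roughly what `formatedMessages` holds for the request body { "message": "Hi" }:
const formatted: SimpleChatMessage[] = [toHumanMessage("Hi")];
console.log(formatted[0].role); // "human"
```

&lt;p&gt;The array matters because a conversation is a list of messages; later we can prepend a system message or past turns without changing how we call the model.&lt;/p&gt;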



&lt;p&gt;If you run your code and hit your endpoint, you will see some console logs in your terminal. If you remember, we put a console.log of the content inside the for loop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9eulxyqhpbgs7a3eib0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9eulxyqhpbgs7a3eib0.png" alt="terminal response" width="800" height="312"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Yooo, congratulations!!! You have successfully completed the first step. The output looks almost the same as when we communicate with the model directly in the terminal. Now let's stream the response to the client.&lt;br&gt;
After the stream line, update your code as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;formatedMessages&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// await streamChunksToTextResponse(res, stream);&lt;/span&gt;
        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text/plain&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Transfer-Encoding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;chunked&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;part&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;part&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;part&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
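&lt;p&gt;The branching above exists because chunk.content can be either a plain string or an array of content parts, depending on the model. If you prefer, the same logic can be factored into a small pure helper (a sketch of the logic above, with a hypothetical function name):&lt;/p&gt;

```typescript
// Normalize a chunk's content (string or array of parts) into plain text.
// Mirrors the branching inside the /stream route's for-await loop.
function chunkToText(content: unknown): string {
    if (typeof content === "string") {
        return content;
    }
    if (Array.isArray(content)) {
        let text = "";
        for (const part of content) {
            if (typeof part === "object") {
                if (part !== null) {
                    if (typeof (part as { text?: unknown }).text === "string") {
                        text += (part as { text: string }).text;
                    }
                }
            }
        }
        return text;
    }
    return "";
}

// Inside the loop, the body then shrinks to:
// for await (const chunk of stream) {
//     res.write(chunkToText(chunk?.content));
// }
```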



&lt;p&gt;After updating your code, if you hit your endpoint with a tool like Postman, you can see the streamed response.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvd9d2nhik296eumvv2f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvd9d2nhik296eumvv2f.png" alt="Postman response" width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Wow! Your back-end API is ready. Now you can use it from the front-end to view the response in a chat interface.&lt;br&gt;
Actually, I am a &lt;strong&gt;Node.js developer&lt;/strong&gt;, and I only know a little front-end - minus the CSS 😁😁. So I am sharing only the request-handling part here, not how to set up the entire project. I hope you know how to create a Next.js application and can design a chat interface - or you can use v0.dev for the design, like I did. Let's jump into the next part.&lt;/p&gt;
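&lt;p&gt;Before wiring up the UI, it helps to see the core of consuming a chunked response. The sketch below uses the standard Web Streams API (available in browsers and Node 18+); a local ReadableStream stands in for the response body you would get from fetching the chat API:&lt;/p&gt;

```typescript
// A stand-in for `(await fetch(...)).body`: a stream of encoded text chunks.
function makeDemoStream(chunks: string[]) {
    const encoder = new TextEncoder();
    return new ReadableStream({
        start(controller) {
            for (const c of chunks) controller.enqueue(encoder.encode(c));
            controller.close();
        },
    });
}

// Read a chunked body to completion, calling onChunk as each piece of text arrives.
async function readAll(body: ReadableStream, onChunk: (text: string) => void) {
    const reader = body.getReader();
    const decoder = new TextDecoder();
    let full = "";
    while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        const text = decoder.decode(value, { stream: true });
        full += text;
        onChunk(text); // in the UI, append this to the visible assistant message
    }
    return full;
}

// Usage: with a real back-end this would be the body of a fetch to your API route.
readAll(makeDemoStream(["Hello", ", ", "world"]), (t) => process.stdout.write(t))
    .then((full) => console.log("\ndone:", full));
```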

&lt;p&gt;&lt;strong&gt;First,&lt;/strong&gt; I create a Next.js API route that calls my back-end API and forwards the streamed response to the client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/api/chat/route.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NextRequest&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;next/server&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;runtime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;edge&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;NextRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:9000/stream&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="nx"&gt;message&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/event-stream&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Cache-Control&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;no-cache&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;keep-alive&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the client, I call my Next.js API route as follows and handle the streamed response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;getReader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;utf-8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No stream reader found.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
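&lt;p&gt;One subtle point about the client code above: &lt;code&gt;decoder.decode(value)&lt;/code&gt; can garble output if a multi-byte UTF-8 character happens to be split across two chunks. Passing &lt;code&gt;{ stream: true }&lt;/code&gt; tells &lt;code&gt;TextDecoder&lt;/code&gt; to buffer the incomplete bytes until the next chunk arrives. A minimal sketch (runnable in Node.js) showing the difference:&lt;/p&gt;

```javascript
// "café" is 5 bytes in UTF-8: the "é" takes 2 bytes (0xC3 0xA9)
const bytes = new TextEncoder().encode("café");
const chunk1 = bytes.slice(0, 4); // ends in the middle of "é"
const chunk2 = bytes.slice(4);

// Naive decoding treats each chunk independently and garbles the split character
const naive = new TextDecoder("utf-8");
const broken = naive.decode(chunk1) + naive.decode(chunk2);

// Streaming mode buffers the incomplete sequence until more bytes arrive
const streaming = new TextDecoder("utf-8");
const correct =
  streaming.decode(chunk1, { stream: true }) + streaming.decode(chunk2);

console.log(broken);  // "caf" followed by replacement characters
console.log(correct); // "café"
```

&lt;p&gt;In the read loop above, that would mean calling &lt;code&gt;decoder.decode(value, { stream: true })&lt;/code&gt; for every chunk.&lt;/p&gt;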



&lt;p&gt;If you run this function, you will see the same chunks logged in the console that the back-end streamed. You can manage your chat-history messages with state. In my application, I created a hook that handles all chat-message-related tasks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;use client&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Message&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/types&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useChat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;([])&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setIsLoading&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;isTyping&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setIsTyping&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sendMessage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;

    &lt;span class="c1"&gt;// Add user message to chat&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nf"&gt;setIsLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;setIsTyping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Send message to API using server action&lt;/span&gt;
      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;getReader&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;utf-8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No stream reader found.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="c1"&gt;// Add initial assistant message&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;assistantMessageId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;assistantMessageId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;
      &lt;span class="p"&gt;}])&lt;/span&gt;

      &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

        &lt;span class="c1"&gt;// Update message in real-time&lt;/span&gt;
        &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
          &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;assistantMessageId&lt;/span&gt;
            &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;
        &lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Error sending message:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="c1"&gt;// Add error message&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;errorMessage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Sorry, there was an error processing your request. Please try again.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;errorMessage&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;setIsLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nf"&gt;setIsTyping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;isTyping&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;sendMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here I am managing the chat-messages state: as each streamed chunk arrives, I update the state, so the UI renders the response in real time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Congratulations!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You've just built the foundation of your own AI assistant! You can now send messages to a local LLM from a web UI using LangChain, Express, and Ollama. We've had a successful kick-start, but there is still a long path to our goal. Our current application does not have chat memory yet: if you give it some information in one message and ask about it in the next, it cannot answer. We will fix that step by step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's Next?&lt;/strong&gt;&lt;br&gt;
We've laid the groundwork. Next up, we'll refine this setup, add memory and RAG (retrieval-augmented generation), and enable tool usage.&lt;br&gt;
&lt;strong&gt;Stick around - it's about to get exciting.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;💭 Final Thoughts&lt;/strong&gt;&lt;br&gt;
The goal of this series is not just to build something cool - but to help you understand how modern AI works under the hood. If you're a JavaScript developer, you don't need to feel left out of the AI world anymore. You've got the tools. Now let's build!&lt;/p&gt;

&lt;p&gt;🔗 Follow me for updates, and thank you for joining our mission of building our own AI assistant!&lt;/p&gt;

&lt;p&gt;👉 💬 Got questions or thoughts? Drop them in the comments - I'd love to hear what you're building.&lt;/p&gt;

&lt;p&gt;👉 Stay tuned for the next post in this series!&lt;/p&gt;

&lt;p&gt;💖 If you're finding value in my posts and want to help me continue creating, feel free to support me here [&lt;a href="https://cutt.ly/0rvsCQkd" rel="noopener noreferrer"&gt;Buy me a Coffee&lt;/a&gt;]! Every contribution helps, and I truly appreciate it! Thank You. 🙌&lt;/p&gt;

&lt;p&gt;Happy Coding! 🚀&lt;/p&gt;

</description>
      <category>langchain</category>
      <category>node</category>
      <category>ollama</category>
      <category>llm</category>
    </item>
    <item>
      <title>Building an AI Assistant with NodeJs: Essential Tools and Concepts</title>
      <dc:creator>Ataur Rahman</dc:creator>
      <pubDate>Sat, 31 May 2025 20:42:10 +0000</pubDate>
      <link>https://forem.com/ataur39n/building-an-ai-assistant-with-nodejs-essential-tools-and-concepts-2n2p</link>
      <guid>https://forem.com/ataur39n/building-an-ai-assistant-with-nodejs-essential-tools-and-concepts-2n2p</guid>
      <description>&lt;h2&gt;
  
  
  Hi everyone,
&lt;/h2&gt;

&lt;p&gt;— especially those who’ve been eagerly waiting for my series, and particularly all the JavaScript developers out there. How’s your day going in this booming age of AI? 🚀&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI&lt;/strong&gt; is growing at an incredible pace. Haven’t started yet? Feeling overwhelmed with all the new technologies? Not sure how they connect or where to begin? Trust me, you’re not alone. Developer life is full of confusion — and now AI has added a whole new level. But we need to overcome that fear.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;truth&lt;/strong&gt; is: AI isn’t as complicated as it seems. You don’t need to know everything from start to finish. Everyone has limitations — scientists innovate, engineers scale, and developers build. We don’t need to play every role. Instead, we’ll start with small steps, grow our knowledge, build something meaningful, and explore new ideas. Day by day, our understanding will deepen. But the key is to take that first step.&lt;/p&gt;

&lt;p&gt;I don’t claim to know everything, but I explore and learn every day — and face plenty of challenges along the way. That’s why I decided to document my journey, sharing what I know in the hope it might help someone else. And I’d be grateful if it does. If you’ve seen our plan and roadmap, you know what we’re trying to achieve. If you haven’t yet, &lt;strong&gt;&lt;em&gt;&lt;a href="https://ataur39n.medium.com/build-your-own-ai-assistant-with-node-js-my-roadmap-and-journey-d3ae60b2f645" rel="noopener noreferrer"&gt;take a look here&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;. That was the kickstarter post — this is our official entry into the journey. I hope you’ll enjoy it and join me. So, let’s kick things off and &lt;strong&gt;&lt;em&gt;start building our own AI assistant with NodeJs.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Welcome, &lt;em&gt;buddy&lt;/em&gt;&lt;/strong&gt;. We’re all in the same boat. I’m thinking of those who at least know &lt;strong&gt;JavaScript fundamentals — functions, variables, loops, basic types&lt;/strong&gt;. If you know more, that’s a bonus. But I’m assuming we’re all starting from a similar place. &lt;strong&gt;To build an AI assistant,&lt;/strong&gt; we’ll need to become familiar with some &lt;strong&gt;new concepts and technologies.&lt;/strong&gt; It’s essential, because without understanding them, we won’t know how the system works during development.&lt;/p&gt;

&lt;p&gt;Let’s highlight a few key terms we’ll encounter throughout this project:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;&amp;gt; Agent, Model, Ollama, LangChain, PGVector, RAG, Tools, Memory, Redis, Postgres, MongoDB, AI SDK, MCP Server (stdio and streamable HTTP), MCP Client, Docker, Embedding Engine, Semantic Search.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each of these topics deserves its own detailed blog post to really understand what’s going on. So, I’ve prepared a standalone post for each (&lt;strong&gt;&lt;em&gt;highly recommended&lt;/em&gt;&lt;/strong&gt;). Writing them took time after I shared the roadmap, but as promised, I’ll keep you updated. I also used ChatGPT to help with summarizing and formatting and to save some time, but I’ve reviewed everything personally — so you can read with confidence. If you spot any issues or missing information, please comment and I’ll fix it. Today, we’ll explore these topics, and in the next day or so, we’ll configure our environment to get hands-on.&lt;/p&gt;

&lt;p&gt;Stay tuned and let’s dive in! 🚀&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🧠 Agent&lt;/strong&gt;&lt;br&gt;
An Agent is the brain that decides what actions to take based on user input. It manages reasoning and tool usage to handle dynamic queries. For example, if asked for the weather, the Agent fetches live data and crafts a response. Agents prevent hardcoding logic for every possible query, keeping the system flexible. They’re essential for creating AI that acts intelligently and naturally. 🔗 Curious about how Agents work? &lt;a href="https://ataur39n.medium.com/agent-the-brain-behind-intelligent-ai-workflows-c72a87bae483" rel="noopener noreferrer"&gt;Read the full Agent post!&lt;/a&gt;&lt;/p&gt;
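&lt;p&gt;To make the idea concrete, here is a tiny agent loop in plain Node.js. The tools and the keyword routing are invented stand-ins for illustration; a real agent lets the model do this reasoning step:&lt;/p&gt;

```javascript
// A tiny agent: pick a tool based on the user's intent, run it,
// and wrap the result into a reply. The keyword check below stands
// in for the model-driven reasoning a real agent would use.
const tools = {
  weather: (city) => ({ city, tempC: 22, sky: "clear" }), // stubbed "live" data
  time: () => new Date().toISOString(),
};

function agent(input) {
  const text = input.toLowerCase();
  if (text.includes("weather")) {
    const report = tools.weather("Dhaka");
    return `It's ${report.tempC}°C and ${report.sky} in ${report.city}.`;
  }
  if (text.includes("time")) {
    return `The current time is ${tools.time()}.`;
  }
  return "I don't have a tool for that yet.";
}
```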

&lt;p&gt;&lt;strong&gt;📚 Model&lt;/strong&gt;&lt;br&gt;
The Model is the core engine, responsible for understanding and generating text. It processes input using deep neural networks, outputting natural responses. For instance, asking “What’s LangChain?” gives a complete, contextual reply. Models enable language understanding and flexibility, far beyond simple rule-based systems. 🔗 Uncover the full magic of Models! &lt;a href="https://ataur39n.medium.com/the-heart-of-ai-assistants-understanding-the-model-0147adbe70c3" rel="noopener noreferrer"&gt;Explore the full Model blog!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⚙️ Ollama&lt;/strong&gt;&lt;br&gt;
Ollama makes it easy to run large models locally without complex setups. It provides a simple interface to models like LLaMA and Mistral, making advanced AI accessible. With Ollama, you can run models on your machine for privacy and offline use. It’s perfect for developers experimenting with local setups. Ollama also handles model loading, tokenization, and optimization, reducing manual configuration. For example, you can run a chatbot locally using &lt;code&gt;ollama serve&lt;/code&gt;. 🔗 Dive into local model magic! &lt;a href="https://ataur39n.medium.com/ollama-bringing-ai-models-to-your-local-machine-00ba3be32663" rel="noopener noreferrer"&gt;Check out the Ollama blog!&lt;/a&gt;&lt;/p&gt;
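&lt;p&gt;A quick sketch of what that looks like from Node.js, assuming Ollama is serving on its default port (11434) and a model has been pulled:&lt;/p&gt;

```javascript
// Ollama exposes a local REST API. With `ollama serve` running,
// POSTing this payload to /api/generate returns a JSON object whose
// `response` field holds the generated text.
function buildGenerateRequest(model, prompt) {
  return {
    url: "http://localhost:11434/api/generate",
    body: JSON.stringify({ model, prompt, stream: false }),
  };
}

// requires Node 18+ for the built-in fetch
async function ask(model, prompt) {
  const req = buildGenerateRequest(model, prompt);
  const res = await fetch(req.url, { method: "POST", body: req.body });
  const data = await res.json();
  return data.response;
}
```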

&lt;p&gt;&lt;strong&gt;🔗 LangChain&lt;/strong&gt;&lt;br&gt;
LangChain connects the dots between models, tools, and memory to create powerful workflows. It helps the assistant fetch data, handle steps, and respond intelligently. It supports modular design and allows integration of different tools seamlessly. LangChain enables chaining of complex tasks like database queries, search, and content generation. For example, it can pull invoices and draft emails based on a simple user request. 🔗 Master LangChain’s potential! &lt;a href="https://ataur39n.medium.com/%EF%B8%8F-langchain-overview-workflow-magic-for-ai-assistants-359563e6b152" rel="noopener noreferrer"&gt;Read the LangChain blog!&lt;/a&gt;&lt;/p&gt;
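&lt;p&gt;This isn’t LangChain’s actual API, but the core “chaining” idea fits in a few lines of plain JavaScript. The invoice and email steps below are made up to mirror the example above:&lt;/p&gt;

```javascript
// Chaining in miniature: each step receives the previous step's output.
// LangChain formalizes this pattern with models, retrievers, and tools
// as steps; these step functions are invented stand-ins.
const pipeline = (...steps) => (input) =>
  steps.reduce((acc, step) => step(acc), input);

const fetchInvoice = (userId) => ({ userId, invoiceId: "INV-42", total: 120 });
const draftEmail = (inv) =>
  `Hi! Your invoice ${inv.invoiceId} for $${inv.total} is attached.`;

const invoiceEmailChain = pipeline(fetchInvoice, draftEmail);
```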

&lt;p&gt;&lt;strong&gt;📦 PGVector&lt;/strong&gt;&lt;br&gt;
PGVector stores embeddings in PostgreSQL, enabling semantic search and fast data retrieval. It allows the assistant to store vector representations of documents and compare them to incoming queries. This makes searches faster and more meaningful than traditional keyword matching. PGVector supports indexing and similarity metrics, making it scalable for large datasets. For example, it can find relevant documents even with different phrasing. 🔗 Learn how PGVector supercharges search! &lt;a href="https://ataur39n.medium.com/pgvector-explained-boost-semantic-search-and-retrieval-with-postgres-7b86d047d9a2" rel="noopener noreferrer"&gt;Explore the PGVector blog!&lt;/a&gt;&lt;/p&gt;
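&lt;p&gt;A minimal schema sketch (table and column names are examples, and the vector size depends on your embedding model):&lt;/p&gt;

```sql
-- Enable pgvector and store one embedding per document.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE docs (
  id bigserial PRIMARY KEY,
  content text,
  embedding vector(768)
);

-- Nearest neighbours by cosine distance; pgvector also offers
-- operator syntax and indexes for large datasets.
SELECT content
FROM docs
ORDER BY cosine_distance(embedding, $1)
LIMIT 3;
```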

&lt;p&gt;&lt;strong&gt;🔎 RAG&lt;/strong&gt;&lt;br&gt;
RAG enhances model generation by grounding responses in real data. Instead of guessing, the assistant fetches relevant content and combines it with generation. This makes answers more accurate and context-aware, reducing errors and hallucinations. It powers document-based QA, FAQs, and retrieval of critical information in real time. RAG improves reliability by providing references alongside generated responses. For example, it retrieves best Docker practices before responding. 🔗 See RAG in action! &lt;a href="https://ataur39n.medium.com/rag-simplified-enhance-ai-accuracy-with-real-time-retrieval-075338e54d83" rel="noopener noreferrer"&gt;Check the RAG blog!&lt;/a&gt;&lt;/p&gt;
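&lt;p&gt;Here’s the retrieve-then-generate flow in miniature. The scoring is naive word overlap just to show the shape; a real setup would use embeddings and a vector store:&lt;/p&gt;

```javascript
// RAG in miniature: retrieve the most relevant snippet first,
// then ground the prompt in it before generation.
const docs = [
  "Docker best practice: keep images small and use multi-stage builds.",
  "Redis stores data in memory for microsecond reads.",
];

function retrieve(query) {
  const words = query.toLowerCase().split(/\s+/);
  const score = (doc) =>
    words.filter((w) => doc.toLowerCase().includes(w)).length;
  return docs.slice().sort((a, b) => score(b) - score(a))[0];
}

function buildPrompt(query) {
  const context = retrieve(query);
  return `Answer using this context:\n${context}\n\nQuestion: ${query}`;
}
```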

&lt;p&gt;&lt;strong&gt;⚡ Redis&lt;/strong&gt;&lt;br&gt;
Redis is a lightning-fast in-memory database that manages session data, caches, and keeps the system responsive. It stores conversation history, user states, and real-time data for smooth interactions. Redis supports data structures like lists, hashes, and sorted sets for flexible use cases. It also enables features like rate limiting and temporary data storage. For example, Redis can track user sessions during a multi-step form. 🔗 Unlock Redis magic! &lt;a href="https://ataur39n.medium.com/redis-for-ai-speed-up-conversations-and-manage-memory-2f099675df3b" rel="noopener noreferrer"&gt;Explore the Redis blog!&lt;/a&gt;&lt;/p&gt;
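&lt;p&gt;The session-tracking pattern looks like this. A &lt;code&gt;Map&lt;/code&gt; stands in for the Redis server so the sketch is self-contained; with the real &lt;code&gt;redis&lt;/code&gt; client the calls would be &lt;code&gt;client.set(key, value, { EX: seconds })&lt;/code&gt; and &lt;code&gt;client.get(key)&lt;/code&gt;:&lt;/p&gt;

```javascript
// Session state the Redis way: set a value with a TTL, read it back,
// and let it expire automatically.
const store = new Map();

function setEx(key, value, ttlMs, now = Date.now()) {
  store.set(key, { value, expiresAt: now + ttlMs });
}

function get(key, now = Date.now()) {
  const entry = store.get(key);
  if (!entry) return null;
  if (now >= entry.expiresAt) {
    store.delete(key); // expired, same as Redis evicting the key
    return null;
  }
  return entry.value;
}
```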

&lt;p&gt;&lt;strong&gt;🗄️ Postgres&lt;/strong&gt;&lt;br&gt;
Postgres is the structured database for storing user profiles, settings, and transactional data. It ensures data integrity and handles complex queries with ACID compliance. Postgres supports foreign keys, indexing, and constraints to maintain data relationships. It scales well for large datasets and integrates with extensions like PGVector. For example, Postgres can store user subscription details and fetch them on request. 🔗 Get deep into Postgres! &lt;a href="https://ataur39n.medium.com/postgres-made-easy-manage-structured-data-in-ai-projects-bb314de7713b" rel="noopener noreferrer"&gt;Read the Postgres blog!&lt;/a&gt;&lt;/p&gt;
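&lt;p&gt;For the subscription example, a schema sketch might look like this (names are illustrative):&lt;/p&gt;

```sql
-- Structured, relational storage: a foreign key links subscriptions
-- to users, so Postgres enforces the relationship for us.
CREATE TABLE users (
  id bigserial PRIMARY KEY,
  email text UNIQUE NOT NULL
);

CREATE TABLE subscriptions (
  id bigserial PRIMARY KEY,
  user_id bigint NOT NULL REFERENCES users (id),
  plan text NOT NULL,
  renews_at timestamptz
);

-- fetch a user's subscription details on request
SELECT u.email, s.plan, s.renews_at
FROM subscriptions s
JOIN users u ON u.id = s.user_id
WHERE u.email = 'someone@example.com';
```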

&lt;p&gt;&lt;strong&gt;🗃️ MongoDB&lt;/strong&gt;&lt;br&gt;
MongoDB handles flexible data like logs and activity records. Its document model allows easy adaptation to changing data formats, perfect for chat logs or analytics. Documents can have nested structures and varied fields without requiring schema changes. MongoDB scales horizontally through sharding for large datasets. For example, chat sessions and logs can be stored and queried efficiently. 🔗 Discover MongoDB’s flexibility! &lt;a href="https://ataur39n.medium.com/storing-flexible-data-for-ai-assistants-with-mongodb-69092ee58072" rel="noopener noreferrer"&gt;Dive into the MongoDB blog!&lt;/a&gt;&lt;/p&gt;
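&lt;p&gt;The flexibility shows up in the documents themselves: two chat-log entries with different shapes can live side by side. With the real driver this would be &lt;code&gt;db.collection("chat_logs").insertMany(logs)&lt;/code&gt; and a &lt;code&gt;find&lt;/code&gt; query; here plain objects keep the sketch self-contained:&lt;/p&gt;

```javascript
// Two documents, different fields, same collection: no schema change needed.
const logs = [
  { userId: "u1", role: "user", text: "hi", ts: 1 },
  { userId: "u1", role: "assistant", text: "hello!", ts: 2, toolCalls: ["echo"] },
];

// querying by a shared field works regardless of each document's shape
const findByUser = (collection, userId) =>
  collection.filter((doc) => doc.userId === userId);
```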

&lt;p&gt;&lt;strong&gt;💡 AI SDK&lt;/strong&gt;&lt;br&gt;
The AI SDK simplifies working with various AI providers like OpenAI or Anthropic. It standardizes model calls, letting you switch models with minimal code changes, making development faster and cleaner. It supports text generation, embeddings, and function calling in a consistent interface. The SDK also handles streaming responses for interactive UIs. For example, generating a summary from a user prompt using OpenAI’s GPT-4. 🔗 Simplify AI integration! &lt;a href="https://ataur39n.medium.com/ai-sdk-simplified-integrate-multiple-ai-models-with-easeai-sdk-simplified-integrate-multiple-ai-701abf5d1233" rel="noopener noreferrer"&gt;Check out the AI SDK blog!&lt;/a&gt;&lt;/p&gt;
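&lt;p&gt;The core idea is one call shape across many providers. These providers are stubs, not the SDK’s real clients, but they show why swapping models becomes a one-line change:&lt;/p&gt;

```javascript
// A uniform facade over interchangeable providers. A real AI SDK wires
// this same kind of interface to OpenAI, Anthropic, local models, etc.
const providers = {
  "openai:gpt-4": (prompt) => `[gpt-4] summary of: ${prompt}`,
  "anthropic:claude": (prompt) => `[claude] summary of: ${prompt}`,
};

function generateText({ model, prompt }) {
  return { text: providers[model](prompt) };
}
```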

&lt;p&gt;&lt;strong&gt;🌐 MCP (Server &amp;amp; Client)&lt;/strong&gt;&lt;br&gt;
MCP standardizes communication between the assistant and external tools. The server exposes tools, while the client manages calls. This architecture allows seamless integration with different tools and APIs. MCP supports transports like stdio for local setups and streamable HTTP for remote servers, making it versatile and scalable. It provides a unified protocol for tools, making integration simpler and more modular. For example, it can fetch weather data using a modular tool without complex API setups. 🔗 Discover MCP’s power! &lt;a href="https://ataur39n.medium.com/mcp-standardizing-tool-access-in-ai-workflows-ed403f74b5c0" rel="noopener noreferrer"&gt;Read the MCP blog!&lt;/a&gt;&lt;/p&gt;
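&lt;p&gt;MCP messages follow JSON-RPC, and two methods do most of the work: &lt;code&gt;tools/list&lt;/code&gt; and &lt;code&gt;tools/call&lt;/code&gt;. This toy handler shows the shape; real servers also negotiate capabilities and run over stdio or streamable HTTP transports:&lt;/p&gt;

```javascript
// A toy MCP-style server core: one registry of tools, two methods.
const toolRegistry = {
  get_weather: (args) => ({ city: args.city, tempC: 21 }), // stubbed tool
};

function handle(message) {
  if (message.method === "tools/list") {
    return { id: message.id, result: { tools: Object.keys(toolRegistry) } };
  }
  if (message.method === "tools/call") {
    const { name, arguments: args } = message.params;
    return { id: message.id, result: toolRegistry[name](args) };
  }
  // standard JSON-RPC "method not found" error code
  return { id: message.id, error: { code: -32601, message: "method not found" } };
}
```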

&lt;p&gt;&lt;strong&gt;🐳 Docker&lt;/strong&gt;&lt;br&gt;
Docker packages services into containers, ensuring consistent environments and smooth deployments. It isolates dependencies and allows running multiple services without conflicts. Docker simplifies local development and cloud deployments by using containers and orchestration tools. It supports scaling and automation through Compose and Swarm. For example, running Ollama, Redis, and PGVector with one &lt;code&gt;docker-compose&lt;/code&gt; command. 🔗 Master container magic! &lt;a href="https://ataur39n.medium.com/docker-for-scalable-and-clean-ai-environments-f1475a9206de" rel="noopener noreferrer"&gt;Explore the Docker blog!&lt;/a&gt;&lt;/p&gt;
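&lt;p&gt;A compose sketch for exactly that stack might look like this (image tags are examples; pin versions you’ve tested):&lt;/p&gt;

```yaml
# One command (`docker compose up`) brings the whole stack up together.
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
  redis:
    image: redis:7
    ports:
      - "6379:6379"
  db:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_PASSWORD: example
    ports:
      - "5432:5432"
```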

&lt;p&gt;&lt;strong&gt;🧬 Embedding Engine&lt;/strong&gt;&lt;br&gt;
The embedding engine converts text into vectors that capture meaning, crucial for semantic search and RAG. It works by using pre-trained models to map text to high-dimensional vectors that reflect semantic relationships. These embeddings power document retrieval, contextual responses, and even recommendation systems. Keeping embeddings consistent and versioned is vital. It enables finding contextually relevant data and reducing irrelevant matches. 🔗 Understand embeddings deeply! &lt;a href="https://ataur39n.medium.com/embedding-engines-explained-fuel-semantic-search-in-ai-9a4ea4b38a2f" rel="noopener noreferrer"&gt;Read the Embedding Engine blog!&lt;/a&gt;&lt;/p&gt;
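&lt;p&gt;To see the text-to-vector shape without any model, here’s a toy “embedding engine”. It only shows the mechanics; real engines use trained models so that similar meanings land close together:&lt;/p&gt;

```javascript
// Map text into a fixed-length numeric vector. Deterministic, so the
// same text always produces the same vector (consistency matters for
// search), but hash-based, so it carries no real semantics.
function embed(text, dims = 8) {
  const vector = new Array(dims).fill(0);
  Array.from(text.toLowerCase()).forEach((ch, i) => {
    vector[i % dims] += ch.charCodeAt(0) / 1000;
  });
  return vector;
}
```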

&lt;p&gt;&lt;strong&gt;🔍 Semantic Search&lt;/strong&gt;&lt;br&gt;
Semantic search retrieves data based on meaning rather than keywords. It uses embeddings to find the most relevant documents, ensuring accurate and helpful responses. It’s the engine behind natural, user-friendly searches in the assistant. It works by comparing vector similarities, enabling matching of related but differently phrased queries. Combining semantic search with metadata filters can further improve precision and recall. 🔗 Discover semantic magic! &lt;a href="https://ataur39n.medium.com/smarter-search-for-ai-with-semantic-understanding-acb2c51d4096" rel="noopener noreferrer"&gt;Dive into the Semantic Search blog!&lt;/a&gt;&lt;/p&gt;
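&lt;p&gt;Under the hood this is vector math: cosine similarity measures how closely two embedding vectors point the same way (1 means identical direction). The vectors below are hand-made for illustration:&lt;/p&gt;

```javascript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  a.forEach((v, i) => {
    dot += v * b[i];
    normA += v * v;
    normB += b[i] * b[i];
  });
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents by similarity to the query vector, best match first.
function search(queryVec, docsWithVecs) {
  return docsWithVecs
    .slice()
    .sort((x, y) =>
      cosineSimilarity(y.vec, queryVec) - cosineSimilarity(x.vec, queryVec));
}
```

&lt;p&gt;In practice you’d store the vectors in PGVector and let the database do this comparison at scale, combined with metadata filters for precision.&lt;/p&gt;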

&lt;p&gt;&lt;strong&gt;Wow! &lt;em&gt;Congratulations&lt;/em&gt;&lt;/strong&gt;. You just explored &lt;em&gt;14 new topics&lt;/em&gt;. &lt;strong&gt;Great work&lt;/strong&gt;. That’s it for today — take a break. I highly recommend reading each topic’s detailed post. They’re short, maybe &lt;em&gt;2–3 minutes&lt;/em&gt; each, but they’ll leave you &lt;strong&gt;clear as water&lt;/strong&gt;. I’ve tried to explain everything in simple words with easy examples. Best wishes!&lt;/p&gt;

&lt;p&gt;🔗 Follow me for updates, and let’s build an amazing AI Assistant together!&lt;/p&gt;

&lt;p&gt;👉 Got questions? Leave them below!&lt;br&gt;
👉 Stay tuned for the next post in this series!&lt;/p&gt;

&lt;p&gt;💖 If you’re finding value in my posts and want to help me continue creating, feel free to support me here &lt;a href="https://cutt.ly/0rvsCQkd" rel="noopener noreferrer"&gt;[Buy me a Coffee]&lt;/a&gt;! Every contribution helps, and I truly appreciate it! Thank you.🙌&lt;/p&gt;

</description>
      <category>node</category>
      <category>langchain</category>
      <category>ollama</category>
      <category>mcp</category>
    </item>
    <item>
      <title>🚀 Build Your Own AI Assistant with Node.js: My Roadmap and Journey 🌟</title>
      <dc:creator>Ataur Rahman</dc:creator>
      <pubDate>Sun, 25 May 2025 01:39:15 +0000</pubDate>
      <link>https://forem.com/ataur39n/build-your-own-ai-assistant-with-nodejs-my-roadmap-and-journey-46c7</link>
      <guid>https://forem.com/ataur39n/build-your-own-ai-assistant-with-nodejs-my-roadmap-and-journey-46c7</guid>
      <description>&lt;p&gt;Hey everyone! 👋&lt;/p&gt;

&lt;p&gt;I’m excited to kick off a new &lt;strong&gt;blog series&lt;/strong&gt; where I’ll walk you through my journey of &lt;strong&gt;building a custom AI Assistant&lt;/strong&gt; using &lt;strong&gt;Node.js&lt;/strong&gt;, &lt;strong&gt;LangChain&lt;/strong&gt;, and other cutting-edge tools. 💻✨&lt;/p&gt;

&lt;p&gt;This series is not just about coding – it’s about &lt;strong&gt;learning, experimenting, and sharing&lt;/strong&gt; everything I discover along the way. Whether you’re a developer like me, curious about AI, or just love diving into cool projects, you’re welcome to join me on this adventure! 🙌&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 Here’s the Roadmap I’ll Be Following:
&lt;/h2&gt;

&lt;p&gt;🔹 &lt;strong&gt;1. Introduction: Understanding Tools and Setting Up the Environment&lt;/strong&gt;&lt;br&gt;
In this stage, we’ll explore the essential tools and technologies like Node.js, LangChain, PGVector, ai-sdk, and Redis. You’ll learn how to configure your local machine, install dependencies, and prepare a robust environment.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Setting up a scalable and developer-friendly environment saves future debugging time.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;2. Building a General Chat Assistant&lt;/strong&gt;&lt;br&gt;
We’ll create a basic chat assistant capable of handling conversations.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frontend Focus: Use ai-sdk to quickly build an interactive UI that sends queries to a local LLM (Large Language Model) and renders responses.&lt;/li&gt;
&lt;li&gt;Backend Focus: With LangChain, develop a backend where the model logic resides, and the UI just handles input/output. This approach is ideal for scalable control.
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Understand the trade-offs between frontend-heavy and backend-controlled architectures.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔹 &lt;strong&gt;3. Connecting a Database to Our Chat Assistant&lt;/strong&gt;&lt;br&gt;
Integrate a database (PostgreSQL, MongoDB, etc.) to store conversation history, user preferences, and tool usage logs.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: A database transforms a stateless chatbot into a persistent, context-aware assistant.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;4. Setting Up Chat Memory&lt;/strong&gt;&lt;br&gt;
Implement memory techniques like Redis, local storage, or LangChain memory modules.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Memory management is crucial for context retention in multi-turn conversations.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;5. Understanding PGVector and Vector Embedding Engines&lt;/strong&gt;&lt;br&gt;
Explore how embedding models convert text into numerical vectors and how PGVector stores and retrieves these vectors efficiently.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Embedding vectors enable semantic understanding, letting the assistant retrieve relevant information.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;6. Integrating PGVector and Embedding Engines into Our Chat Backend&lt;/strong&gt;&lt;br&gt;
Connect embeddings to the backend for contextually relevant query results.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Merging embeddings into the chat logic enhances response quality and relevance.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;7. What is RAG (Retrieval-Augmented Generation)?&lt;/strong&gt;&lt;br&gt;
Learn how RAG combines retrieval systems with language models to generate accurate, dynamic responses.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: RAG makes assistants factually accurate by grounding answers in reliable sources.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;8. Configuring RAG for Our Project&lt;/strong&gt;&lt;br&gt;
Set up a basic RAG system in the backend with PGVector.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Correctly configured RAG enables high-quality, up-to-date responses.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;9. Integrating RAG with Our Backend&lt;/strong&gt;&lt;br&gt;
Connect RAG into the chatbot flow for seamless retrieval and generation.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Integration ensures smooth handoffs between retrieval and generation steps.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;10. Adding Tools to Our Backend with LangChain&lt;/strong&gt;&lt;br&gt;
Expand capabilities with custom tools using LangChain’s tools architecture.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Custom tools enhance functionality, making the assistant more versatile.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;11. What is MCP? Why Do We Need It?&lt;/strong&gt;&lt;br&gt;
Explore MCP (Model Context Protocol) for managing tools more flexibly than LangChain alone.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: MCP offers a structured approach to tool calling beyond LangChain’s built-ins.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;12. Building Simple Stdio and Streamable HTTP Servers&lt;/strong&gt;&lt;br&gt;
Learn to build basic servers for tool management and AI-generated responses.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Streamable servers provide real-time interaction and efficient resource management.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;13. Organizing the Streamable Server&lt;/strong&gt;&lt;br&gt;
Organize the server for simple request handling and error management.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: A well-organized server ensures reliable performance in basic use cases.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;14. Connecting MCP with LangChain Backend&lt;/strong&gt;&lt;br&gt;
Integrate MCP with LangChain to enable tool calling and result handling.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: This connection brings dynamic tool calling into the assistant’s workflow.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;15. Tool Calling Ideologies&lt;/strong&gt;&lt;br&gt;
Explore two strategies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Intent-Based: Explicit tool invocation based on user intent.&lt;/li&gt;
&lt;li&gt;Free Decision: LLMs decide autonomously which tool to call.
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Each strategy has use cases; understanding them helps design the right experience.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔹 &lt;strong&gt;16. Wrapping It All Together&lt;/strong&gt;&lt;br&gt;
Combine everything: memory, RAG, MCP, and LangChain backend to create a complete, experimental AI assistant system.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Integration delivers a seamless assistant with advanced features.&lt;/p&gt;

&lt;p&gt;🔹 &lt;strong&gt;17. Bonus: Exploring ai-sdk for Full Integration&lt;/strong&gt;&lt;br&gt;
Explore building the same system using ai-sdk, comparing approaches for deeper understanding.&lt;br&gt;
👉 &lt;em&gt;Key Takeaway&lt;/em&gt;: Exploring multiple frameworks broadens skill sets and insight.&lt;/p&gt;




&lt;h2&gt;
  
  
  🗓 My Posting Schedule
&lt;/h2&gt;

&lt;p&gt;I’ll aim to cover one topic per day. However, since testing and building take time, it might not be possible to post daily. Rest assured, I’ll share each new piece as soon as I can! 💪&lt;/p&gt;




&lt;h2&gt;
  
  
  💬 Let’s Learn Together!
&lt;/h2&gt;

&lt;p&gt;As a &lt;strong&gt;JavaScript developer&lt;/strong&gt;, especially in &lt;strong&gt;Node.js&lt;/strong&gt;, I’ll approach this project from my own perspective. I’ll share:&lt;br&gt;
✅ My learnings and discoveries&lt;br&gt;
✅ Challenges and solutions&lt;br&gt;
✅ Mistakes and how I corrected them&lt;br&gt;
✅ Helpful code snippets and explanations&lt;/p&gt;

&lt;p&gt;I’m not perfect – I’ll definitely make mistakes. If you spot something wrong, or have suggestions, please leave a comment and help me (and others) learn and improve. 🙏 Let’s make this journey collaborative! 🚀&lt;/p&gt;




&lt;p&gt;🔗 &lt;strong&gt;Follow me for updates&lt;/strong&gt;, and let’s build an amazing AI Assistant together! &lt;a href="https://ataur39n.medium.com/build-your-own-ai-assistant-with-node-js-my-roadmap-and-journey-d3ae60b2f645" rel="noopener noreferrer"&gt;Also on Medium&lt;/a&gt;&lt;br&gt;
👉 Got questions? Leave them below!&lt;br&gt;
👉 Stay tuned for the next post in this series!&lt;/p&gt;

&lt;p&gt;💖 &lt;strong&gt;If you’d like to support my work and help me continue sharing, you can contribute here - &lt;a href="https://cutt.ly/0rvsCQkd" rel="noopener noreferrer"&gt;buy me a coffee&lt;/a&gt;. Every little bit helps – thank you! 🙏&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;💬 &lt;strong&gt;Join the Journey with Me!&lt;/strong&gt;&lt;br&gt;
Whether you’re diving in solo, bringing a friend, or joining as a team—come along on this learning adventure! 🚀 Let’s grow together, one step at a time.&lt;/p&gt;




</description>
      <category>node</category>
      <category>langchain</category>
      <category>vectordatabase</category>
      <category>mcp</category>
    </item>
    <item>
      <title>`echo "Hello, World!"` — the classic start to every dev journey... and also the intro to mine. 🔥</title>
      <dc:creator>Ataur Rahman</dc:creator>
      <pubDate>Thu, 15 May 2025 21:20:46 +0000</pubDate>
      <link>https://forem.com/ataur39n/echo-hello-world-the-classic-start-to-every-dev-journey-and-also-the-intro-to-mine-25c2</link>
      <guid>https://forem.com/ataur39n/echo-hello-world-the-classic-start-to-every-dev-journey-and-also-the-intro-to-mine-25c2</guid>
      <description>&lt;p&gt;This video might look simple — just printing a message — but guess what?&lt;br&gt;
That message is coming straight from my &lt;strong&gt;custom tool&lt;/strong&gt;, bound into my very own &lt;strong&gt;AI assistant backend&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It’s a basic tool (just echoing input) — but that’s the point.&lt;br&gt;
We’re testing the foundation.&lt;br&gt;
And just like that, a new journey begins. 🚀&lt;/p&gt;

&lt;p&gt;Right now, we’ve set up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Local LLMs via &lt;strong&gt;Ollama&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;✅ Flow + tool binding with &lt;strong&gt;LangChain&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;✅ Embeddings + search using &lt;strong&gt;PGVector&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;✅ Session memory plan&lt;/li&gt;
&lt;li&gt;✅ Streaming UI — fully working and shown in the video&lt;/li&gt;
&lt;li&gt;✅ Tools bound and functional (even this echo)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’re now diving into the &lt;strong&gt;MCP server&lt;/strong&gt; — exploring advanced tool orchestration and how to scale across multiple servers.&lt;/p&gt;

&lt;p&gt;But let’s be clear:&lt;br&gt;
👉 What we’ve done so far is &lt;strong&gt;just the beginning&lt;/strong&gt;.&lt;br&gt;
We're still in a small zone of a big vision — and there’s &lt;strong&gt;a LOT left to build.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;🎯 &lt;strong&gt;Blog series coming soon.&lt;/strong&gt;&lt;br&gt;
Maybe videos too — though I want to focus on building first.&lt;/p&gt;

&lt;p&gt;And maybe… just maybe…&lt;br&gt;
We can turn this into a &lt;strong&gt;bootcamp-style learning group&lt;/strong&gt; or &lt;strong&gt;live workshop&lt;/strong&gt; where we explore, test, and learn together.&lt;/p&gt;

&lt;p&gt;I watched tons of tutorials, read docs, debugged endlessly… but never found a complete, JS-focused guide that connects everything together — or maybe I just didn’t find the one that worked for me. So, I’m making one.&lt;/p&gt;

&lt;p&gt;But what’s more important is &lt;strong&gt;how&lt;/strong&gt; we’re building it.&lt;/p&gt;

&lt;p&gt;We’re doing everything from scratch — manually configuring each part.&lt;br&gt;
Why? Because we want to understand the core.&lt;/p&gt;

&lt;p&gt;There are definitely easier ways. We could’ve used pre-built SDKs, hosted platforms, or plug-and-play services.&lt;br&gt;
But once we truly understand how everything connects — from embeddings to vector search to tool invocation — we’ll have the power to use any provider, or even build our own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We’re not just learning tools.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;We’re learning how to build our own AI brains — with control, understanding, and creativity.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Whether you're:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A beginner? &lt;strong&gt;Let’s cook together.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Already familiar with some of these tools? &lt;strong&gt;Drop advice! I’m listening.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Confused or stuck? &lt;strong&gt;Comment your question&lt;/strong&gt; — maybe someone here can help you, or I’ll try!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/yoY2bNLwv_M"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

</description>
      <category>helloworld</category>
      <category>langchain</category>
      <category>ollama</category>
      <category>node</category>
    </item>
  </channel>
</rss>
