<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Sunil_</title>
    <description>The latest articles on Forem by Sunil_ (@sunil8521).</description>
    <link>https://forem.com/sunil8521</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1820176%2Ff4d97f60-5b5c-4b04-a435-f984b5455f71.jpg</url>
      <title>Forem: Sunil_</title>
      <link>https://forem.com/sunil8521</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/sunil8521"/>
    <language>en</language>
    <item>
      <title>Building a RAG System with Vertex AI, Pinecone, and LangChain (Step-by-Step Guide)</title>
      <dc:creator>Sunil_</dc:creator>
      <pubDate>Sat, 13 Sep 2025 05:22:58 +0000</pubDate>
      <link>https://forem.com/sunil8521/building-a-rag-system-with-vertex-ai-pinecone-and-langchain-step-by-step-guide-2d18</link>
      <guid>https://forem.com/sunil8521/building-a-rag-system-with-vertex-ai-pinecone-and-langchain-step-by-step-guide-2d18</guid>
      <description>&lt;h2&gt;
  
  
  What is RAG (Retrieval-Augmented Generation)?
&lt;/h2&gt;

&lt;p&gt;RAG stands for &lt;strong&gt;Retrieval-Augmented Generation&lt;/strong&gt;. It is a way to make AI models smarter by giving them &lt;strong&gt;external knowledge&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Normally, an LLM (like GPT or Gemini) only knows what it learned during training.&lt;/li&gt;
&lt;li&gt;With RAG, you &lt;strong&gt;embed your documents into vectors&lt;/strong&gt; and &lt;strong&gt;store them in a vector database&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;When a user asks a question, the system &lt;strong&gt;finds the most relevant documents&lt;/strong&gt; and passes them to the LLM.&lt;/li&gt;
&lt;li&gt;The LLM then generates an answer &lt;strong&gt;based on the retrieved documents&lt;/strong&gt;, so it can answer questions about content it hasn’t seen before.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Simple analogy:&lt;/strong&gt;&lt;br&gt;
Think of it like a student using a &lt;strong&gt;textbook&lt;/strong&gt;. The LLM is the student, and the vector store is the textbook. When asked a question, the student looks up the textbook and answers accurately instead of just guessing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7muw1tvxkerkaakcfkey.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7muw1tvxkerkaakcfkey.png" alt="workflow of RAG" width="606" height="644"&gt;&lt;/a&gt;&lt;/p&gt;
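
&lt;p&gt;In code, that whole loop boils down to three steps: retrieve, augment, and generate. Here’s a minimal conceptual sketch — the two helper functions are placeholders I made up for illustration, not a real API; we build the real Pinecone and Vertex AI versions later in this guide:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Conceptual RAG loop — placeholder helpers, replaced by real Pinecone/Vertex AI calls later.
async function retrieveRelevantChunks(question: string, k: number) {
  return ["...relevant chunk 1...", "...relevant chunk 2..."]; // placeholder data
}
async function generateAnswer(prompt: string) {
  return "an answer based only on the context in the prompt"; // placeholder
}

async function answerWithRag(question: string) {
  // 1. Retrieve: find the document chunks most similar to the question
  const chunks = await retrieveRelevantChunks(question, 3);

  // 2. Augment: put the retrieved chunks into the prompt as context
  const prompt = `Answer only from this context:\n${chunks.join("\n\n")}\n\nQuestion: ${question}`;

  // 3. Generate: the LLM answers from that context instead of guessing
  return generateAnswer(prompt);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
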
&lt;h3&gt;
  
  
  1. Create a Google Cloud Account
&lt;/h3&gt;

&lt;p&gt;To get started, you need a Google Cloud account. Google gives every new user &lt;strong&gt;$300 in free credits&lt;/strong&gt; that you can spend on services like Vertex AI. &lt;/p&gt;

&lt;p&gt;Go to &lt;a href="https://cloud.google.com/" rel="noopener noreferrer"&gt;Google Cloud&lt;/a&gt; and sign up with your Google account.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Enable the Vertex AI API
&lt;/h3&gt;

&lt;p&gt;Once your Google Cloud account is ready, the next step is to turn on the &lt;strong&gt;Vertex AI API&lt;/strong&gt; for your project.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In the Google Cloud console, open the menu and go to &lt;strong&gt;“APIs &amp;amp; Services” → “Library.”&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Search for &lt;strong&gt;Vertex AI API.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click on it and hit &lt;strong&gt;“Enable.”&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  3. Create a Service Account
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;In the Google Cloud console, go to &lt;strong&gt;“IAM &amp;amp; Admin” → “Service Accounts.”&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;“Create Service Account.”&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Give it a name (for example: &lt;code&gt;vertex-rag-sa&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Assign it a role such as &lt;strong&gt;Vertex AI User&lt;/strong&gt; (this allows it to access Vertex AI).&lt;/li&gt;
&lt;li&gt;After creating the service account, click on it, go to the &lt;strong&gt;“Keys”&lt;/strong&gt; section, and choose &lt;strong&gt;“Add Key → Create New Key.”&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;JSON&lt;/strong&gt; as the key type and download the file. You’ll use this JSON file later to authenticate your JavaScript code when calling Vertex AI.&lt;/li&gt;
&lt;/ol&gt;


&lt;h3&gt;
  
  
  4. Initialize Node.js Project and Install Libraries
&lt;/h3&gt;

&lt;p&gt;First, let’s create a new Node.js project. Open your terminal and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm init &lt;span class="nt"&gt;-y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a basic &lt;code&gt;package.json&lt;/code&gt; file that holds your project’s metadata and will track the dependencies we install next.&lt;/p&gt;

&lt;p&gt;Next, we need some libraries to make RAG work with Vertex AI and Pinecone. Install them like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @langchain/pinecone @langchain/google-vertexai @pinecone-database/pinecone @langchain/textsplitters @langchain/community
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s what they do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;@langchain/pinecone&lt;/code&gt; → lets LangChain talk to Pinecone.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@langchain/google-vertexai&lt;/code&gt; → lets us use Vertex AI embeddings and LLMs.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@pinecone-database/pinecone&lt;/code&gt; → the Pinecone client to store and search your vectors.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@langchain/community&lt;/code&gt; → provides document loaders, such as the PDF loader we’ll use to read files into LangChain.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@langchain/textsplitters&lt;/code&gt; → splits large documents into smaller chunks so each embedding captures focused, searchable context.&lt;/li&gt;
&lt;/ul&gt;
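
&lt;p&gt;As an optional sanity check (it needs the Google credentials from the service-account JSON, set up in the &lt;code&gt;.env&lt;/code&gt; section later), you can embed one sentence and confirm the vector size. This is a hedged sketch — the explicit &lt;code&gt;model&lt;/code&gt; option is my assumption, so adjust it to your setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { VertexAIEmbeddings } from "@langchain/google-vertexai";

// Embed a single sentence and print the vector length.
// text-embedding-004 produces 768-dimensional vectors, which is why the
// Pinecone index we create in the next step uses dimension 768.
const embeddings = new VertexAIEmbeddings({ model: "text-embedding-004" });
const vector = await embeddings.embedQuery("Hello RAG");
console.log(vector.length); // → 768
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;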

&lt;h3&gt;
  
  
  5. What we are building
&lt;/h3&gt;

&lt;p&gt;Before writing any code, let’s talk about what we are going to build. We’ll create a &lt;strong&gt;RAG (Retrieval-Augmented Generation) system&lt;/strong&gt; that can answer questions about a company’s policy.&lt;/p&gt;

&lt;p&gt;Basically:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We take a &lt;strong&gt;PDF document&lt;/strong&gt; that has the company policy.&lt;/li&gt;
&lt;li&gt;We break it into small chunks and &lt;strong&gt;store it in a vector database&lt;/strong&gt; (Pinecone).&lt;/li&gt;
&lt;li&gt;When we ask a question, our system &lt;strong&gt;retrieves the most relevant parts&lt;/strong&gt; from the PDF and gives an answer using Vertex AI.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;For this tutorial, you don’t need a real company PDF. I’m using a &lt;strong&gt;dummy PDF&lt;/strong&gt; that we can generate using GPT. You can replace it with your own document later.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  6. Load the document and set up Pinecone
&lt;/h3&gt;

&lt;p&gt;Now we need two things:&lt;br&gt;
1. &lt;strong&gt;Load our PDF&lt;/strong&gt; into LangChain.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use the PDF loader from &lt;code&gt;@langchain/community&lt;/code&gt; to read the file.&lt;/li&gt;
&lt;li&gt;Split the text into smaller chunks using &lt;code&gt;@langchain/textsplitters&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Smaller chunks give the embedding model focused text to work with, which improves retrieval quality.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;2. &lt;strong&gt;Set up Pinecone&lt;/strong&gt; to store the embeddings.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to &lt;a href="https://www.pinecone.io/" rel="noopener noreferrer"&gt;Pinecone&lt;/a&gt; and create a free account.&lt;/li&gt;
&lt;li&gt;Create a new &lt;strong&gt;index&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Choose &lt;strong&gt;Custom settings&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Since we’re using the &lt;code&gt;text-embedding-004&lt;/code&gt; model from Vertex AI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dimension&lt;/strong&gt; → &lt;code&gt;768&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector Type&lt;/strong&gt; → &lt;code&gt;Dense&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metric&lt;/strong&gt; → &lt;code&gt;Cosine&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you use another embedding model, make sure to update the &lt;strong&gt;dimension value&lt;/strong&gt; based on that model’s output size. (If you’d rather create the index from code, see the sketch right after this list.)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;After creating the index, go to the &lt;strong&gt;API Keys section&lt;/strong&gt; in Pinecone. Copy your &lt;strong&gt;API Key&lt;/strong&gt; and add it to your &lt;code&gt;.env&lt;/code&gt; file like this:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PINECONE_API_KEY=your-pinecone-api-key-here
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;/ul&gt;
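
&lt;p&gt;If you’d rather create the index from code than click through the dashboard, here’s a minimal sketch using the Pinecone Node client. The index name &lt;code&gt;google-embed&lt;/code&gt; and the serverless cloud/region are assumptions — use whatever matches your account:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY as string });

// Dimension 768 matches Vertex AI's text-embedding-004 output size.
await pc.createIndex({
  name: "google-embed",   // assumed index name; reused in the code below
  dimension: 768,
  metric: "cosine",
  spec: { serverless: { cloud: "aws", region: "us-east-1" } }, // assumption: a serverless index
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;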

&lt;h3&gt;
  
  
  7. Load the document and push it into Pinecone
&lt;/h3&gt;

&lt;p&gt;Now let’s write the code to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Load our PDF&lt;/strong&gt; (Company Policy).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Split it into chunks&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate embeddings with Vertex AI&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Store them inside Pinecone&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here’s the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;PDFLoader&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/community/document_loaders/fs/pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;RecursiveCharacterTextSplitter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/textsplitters&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;VertexAIEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/google-vertexai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;PineconeStore&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/pinecone&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Pinecone&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;PineconeClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@pinecone-database/pinecone&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// 1. Setup Pinecone client&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PineconeClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PINECONE_API_KEY&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pineconeIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;google-embed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// "google-embed" = your Pinecone index name&lt;/span&gt;

&lt;span class="c1"&gt;// 2. Setup embeddings model (Vertex AI)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embeddings_model_google&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;VertexAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// 3. Create vector store connection&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PineconeStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;embeddings_model_google&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;pineconeIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;maxConcurrency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// 4. Load PDF, split it, and add to Pinecone&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PDFLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pdf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;splitPages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;✅ Document loaded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;textSplitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;chunkSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// each chunk ~500 characters&lt;/span&gt;
      &lt;span class="na"&gt;chunkOverlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="c1"&gt;// overlap to keep context&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;texts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;textSplitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;splitText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;pageContent&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;✅ Document split into chunks&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;pageContent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt;
    &lt;span class="p"&gt;}));&lt;/span&gt;

    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;⏳ Adding to vector database...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addDocuments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;🎉 Added to Pinecone database!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;er&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;❌ Error while processing document&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;er&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Run the loader with our PDF&lt;/span&gt;
&lt;span class="nf"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./Company_Policy.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🔑 Important: &lt;code&gt;.env&lt;/code&gt; Setup
&lt;/h3&gt;

&lt;p&gt;Make sure you have the following in your &lt;strong&gt;&lt;code&gt;.env&lt;/code&gt; file&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PINECONE_API_KEY=your-pinecone-api-key
GOOGLE_APPLICATION_CREDENTIALS=embeddings-test-471806-e1b74d261b5b.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;👉 &lt;code&gt;GOOGLE_APPLICATION_CREDENTIALS&lt;/code&gt; points to the JSON key file we downloaded in step 3.&lt;/p&gt;
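
&lt;p&gt;👉 One more note: Node.js doesn’t read &lt;code&gt;.env&lt;/code&gt; files on its own, and the snippets above just assume the variables are already in &lt;code&gt;process.env&lt;/code&gt;. A common way to load them (an assumption about your setup, not something the code above includes) is the &lt;code&gt;dotenv&lt;/code&gt; package — install it with &lt;code&gt;npm install dotenv&lt;/code&gt; and add this at the very top of your entry file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Loads .env into process.env before any other code runs,
// so PINECONE_API_KEY and GOOGLE_APPLICATION_CREDENTIALS are available.
import "dotenv/config";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Newer Node versions can also do this without a package via &lt;code&gt;node --env-file=.env&lt;/code&gt;.&lt;/p&gt;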

&lt;p&gt;👉 After running the loader script, your PDF chunks should now be stored in Pinecone.&lt;br&gt;
You can go to your &lt;strong&gt;Pinecone dashboard&lt;/strong&gt; and open the index you created.&lt;/p&gt;

&lt;p&gt;There, you’ll see something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7626w2zl1nqs6vujbvtv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7626w2zl1nqs6vujbvtv.png" alt="Pinecone dashboard" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  8. Retrieve documents and use LLM to answer
&lt;/h3&gt;

&lt;p&gt;Now that we’ve stored our company policy PDF in Pinecone, we can &lt;strong&gt;search for relevant chunks&lt;/strong&gt; and use Vertex AI (LLM) to answer questions.&lt;/p&gt;

&lt;p&gt;👉 We’ll use the &lt;code&gt;vectorStore&lt;/code&gt; we created earlier to run &lt;strong&gt;similarity search&lt;/strong&gt; on the vector database. This will pull out the most relevant pieces of text related to our question. Then we’ll pass those chunks into an LLM (Gemini on Vertex AI) to get the final answer.&lt;/p&gt;

&lt;p&gt;Here’s the vector store setup we wrote earlier, shown again for reference:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// embeddings_model_google and pineconeIndex &lt;/span&gt;
&lt;span class="c1"&gt;// were already defined in the previous step&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PineconeStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;embeddings_model_google&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;pineconeIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;maxConcurrency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here’s the retrieval and answering code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./prepare.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ChatVertexAI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/google-vertexai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;SystemMessage&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/core/messages&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// 1. Setup LLM&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ChatVertexAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// fast + good for QA&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;            &lt;span class="c1"&gt;// deterministic answers&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// 2. System prompt for strict rules&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="s2"&gt;`You are a helpful and knowledgeable assistant named General Dao. 
   Your role is to answer user questions only from the given context.
   Rules:
   - If the answer exists in the context, give a clear and concise response.
   - If it’s not in the context, say you don’t have enough information.
   - Don’t make up answers. Stay professional and helpful.`&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// 3. Main function&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What is the company policy on leave?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Example question, this query is hardcoded, but you can make it dynamic using an API or user input&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Run similarity search → get 3 most relevant chunks&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similaritySearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Merge results into one context string&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;final_res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pageContent&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Build human prompt&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;HUMAN_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`Q: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;q&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\n\nContext:\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;final_res&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Ask the LLM&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;aiMsg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;HUMAN_PROMPT&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="c1"&gt;// Print answer&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Answer:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;aiMsg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;er&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;❌ Error:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;er&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🔑 &lt;strong&gt;How it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User asks a question.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;vectorStore.similaritySearch()&lt;/code&gt; finds the most relevant chunks from Pinecone.&lt;/li&gt;
&lt;li&gt;We send those chunks + question to Vertex AI (Gemini model).&lt;/li&gt;
&lt;li&gt;The LLM generates an answer strictly from the retrieved context.&lt;/li&gt;
&lt;/ol&gt;
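
&lt;p&gt;The question in the example is hardcoded, but as the comment in the code says, you can make it dynamic. One hedged way to do that — reusing the &lt;code&gt;llm&lt;/code&gt;, &lt;code&gt;SYSTEM_PROMPT&lt;/code&gt;, and &lt;code&gt;vectorStore&lt;/code&gt; defined above (the wrapper function name is mine, not part of the original code) — is to wrap the same steps in a small function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Reusable wrapper around the same retrieve → augment → generate steps.
// Assumes llm, SYSTEM_PROMPT, vectorStore, and HumanMessage from the code above are in scope.
async function answerQuestion(question: string) {
  const docs = await vectorStore.similaritySearch(question, 3);
  const context = docs.map((d) =&amp;gt; d.pageContent).join("\n\n");

  const aiMsg = await llm.invoke([
    SYSTEM_PROMPT,
    new HumanMessage(`Q: ${question}\n\nContext:\n${context}`),
  ]);
  return aiMsg.content;
}

// Example: call it with any question, e.g. from an API route or CLI input.
console.log(await answerQuestion("How many paid leave days do employees get?"));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;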

&lt;h2&gt;
  
  
  📚 What’s Next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;You can read more about &lt;strong&gt;LangChain vector store integrations&lt;/strong&gt; in the official docs:
👉 &lt;a href="https://docs.langchain.com/oss/javascript/integrations/vectorstores" rel="noopener noreferrer"&gt;LangChain VectorStores Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>rag</category>
    </item>
  </channel>
</rss>
