<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Zarrar Shaikh</title>
    <description>The latest articles on Forem by Zarrar Shaikh (@zlash65).</description>
    <link>https://forem.com/zlash65</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3315475%2Ffc52242e-2344-43e4-a172-997aa6d6410e.JPG</url>
      <title>Forem: Zarrar Shaikh</title>
      <link>https://forem.com/zlash65</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/zlash65"/>
    <language>en</language>
    <item>
      <title>PostgreSQL MCP Server with Built-in SSH Tunneling</title>
      <dc:creator>Zarrar Shaikh</dc:creator>
      <pubDate>Sat, 27 Dec 2025 14:57:43 +0000</pubDate>
      <link>https://forem.com/zlash65/postgresql-mcp-server-with-built-in-ssh-tunneling-2geb</link>
      <guid>https://forem.com/zlash65/postgresql-mcp-server-with-built-in-ssh-tunneling-2geb</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Zlash65" rel="noopener noreferrer"&gt;
        Zlash65
      &lt;/a&gt; / &lt;a href="https://github.com/Zlash65/postgresql-ssh-mcp" rel="noopener noreferrer"&gt;
        postgresql-ssh-mcp
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      PostgreSQL MCP server with SSH tunneling for Claude Desktop and ChatGPT
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;PostgreSQL SSH MCP Server&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a href="https://www.npmjs.com/package/@zlash65/postgresql-ssh-mcp" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/0fe02abe95cd41ebd13fcfa416f83e91c299cdadf8d558e895fd67ee367c7b20/68747470733a2f2f696d672e736869656c64732e696f2f6e706d2f762f407a6c61736836352f706f737467726573716c2d7373682d6d63703f636f6c6f723d326636666562266c6162656c3d6e706d" alt="npm version"&gt;&lt;/a&gt;
&lt;a href="https://www.npmjs.com/package/@zlash65/postgresql-ssh-mcp" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/6e79f7dbe9cb10409dbe0ecaf4133495f6340fc7e2bf3cba7b8cea2077864f73/68747470733a2f2f696d672e736869656c64732e696f2f6e706d2f646d2f407a6c61736836352f706f737467726573716c2d7373682d6d63703f636f6c6f723d326636666562" alt="npm downloads"&gt;&lt;/a&gt;
&lt;a href="https://github.com/Zlash65/postgresql-ssh-mcp/LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/3177ca653891902fd58620feab1ad78364600fc2e38dcabe581857985f1d6e97/68747470733a2f2f696d672e736869656c64732e696f2f6e706d2f6c2f407a6c61736836352f706f737467726573716c2d7373682d6d63703f636f6c6f723d326636666562" alt="license"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;A secure PostgreSQL MCP server with built-in SSH tunneling. Connect to databases through bastion hosts automatically — no manual &lt;code&gt;ssh -L&lt;/code&gt; required.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Features&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dual Transport&lt;/strong&gt; — STDIO for Claude Desktop, Streamable HTTP for ChatGPT&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSH Tunneling&lt;/strong&gt; — Built-in tunnel with auto-reconnect and TOFU (trust on first use)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read-Only by Default&lt;/strong&gt; — Safe for production; enable writes explicitly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OAuth Support&lt;/strong&gt; — Auth0 integration for secure ChatGPT connections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection Pooling&lt;/strong&gt; — Efficient resource management with configurable limits&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Architecture&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/Zlash65/postgresql-ssh-mcp/assets/architecture.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2FZlash65%2Fpostgresql-ssh-mcp%2Fassets%2Farchitecture.png" alt="Architecture"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Quick Start&lt;/h2&gt;

&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Claude Desktop (STDIO)&lt;/h3&gt;

&lt;/div&gt;
&lt;p&gt;Add to your Claude Desktop config:&lt;/p&gt;
&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Config Location&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;&lt;code&gt;~/Library/Application Support/Claude/claude_desktop_config.json&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows&lt;/td&gt;
&lt;td&gt;&lt;code&gt;%APPDATA%/Claude/claude_desktop_config.json&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;div class="highlight highlight-source-json notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"mcpServers"&lt;/span&gt;: {
    &lt;span class="pl-ent"&gt;"postgres"&lt;/span&gt;: {
      &lt;span class="pl-ent"&gt;"command"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;npx&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"args"&lt;/span&gt;: [&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;-y&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;a class="mentioned-user" href="https://dev.to/zlash65"&gt;@zlash65&lt;/a&gt;/postgresql-ssh-mcp&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;],
      &lt;span class="pl-ent"&gt;"env"&lt;/span&gt;: {
        &lt;span class="pl-ent"&gt;"DATABASE_URI"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;postgresql://user:password@localhost:5432/mydb&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
      }
    }
  }
}&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;ChatGPT (Streamable HTTP)&lt;/h3&gt;

&lt;/div&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;DATABASE_URI=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;postgresql://user:pass@localhost:5432/mydb&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; npx &lt;a class="mentioned-user" href="https://dev.to/zlash65"&gt;@zlash65&lt;/a&gt;/postgresql-ssh-mcp-http&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Then configure ChatGPT to connect to…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Zlash65/postgresql-ssh-mcp" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;







&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Why SSH Tunneling Matters&lt;/li&gt;
&lt;li&gt;Architecture&lt;/li&gt;
&lt;li&gt;
Deployment Options

&lt;ul&gt;
&lt;li&gt;Method 1: Claude Desktop (Local STDIO)&lt;/li&gt;
&lt;li&gt;Remote HTTP Server Deployment&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Available Tools&lt;/li&gt;

&lt;li&gt;Security&lt;/li&gt;

&lt;li&gt;Troubleshooting&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why SSH Tunneling Matters
&lt;/h2&gt;

&lt;p&gt;Production databases live in private networks (VPCs, private subnets) isolated from the public internet. You can't simply provide a &lt;code&gt;DATABASE_URL&lt;/code&gt; to an AI and expect it to connect.&lt;/p&gt;

&lt;p&gt;Companies use &lt;strong&gt;bastion hosts&lt;/strong&gt; as secure gateways to private infrastructure—servers that authenticate via SSH and act as the only entry point to internal resources.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Without SSH Tunneling:
   AI → Database URL → Connection Refused (private network)

With SSH Tunneling (this server):
   AI → MCP Server → SSH Tunnel → Bastion → Database
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Traditional approach&lt;/strong&gt; (manual):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Terminal 1: SSH tunnel&lt;/span&gt;
ssh &lt;span class="nt"&gt;-L&lt;/span&gt; 5432:db.internal:5432 user@bastion.company.com

&lt;span class="c"&gt;# Terminal 2: MCP server&lt;/span&gt;
&lt;span class="nv"&gt;DATABASE_URI&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;postgresql://localhost:5432/db npx @zlash65/postgresql-ssh-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;This server&lt;/strong&gt; (automatic):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"DATABASE_URI"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"postgresql://db.internal:5432/db"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"SSH_ENABLED"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"SSH_HOST"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bastion.company.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"SSH_PRIVATE_KEY_PATH"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.ssh/id_rsa"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The server handles tunnel establishment, database traffic forwarding, auto-reconnection on failure, and clean shutdown automatically.&lt;/p&gt;

&lt;p&gt;Without built-in SSH tunneling, you're stuck with bad options: VPN for everyone (overhead), exposed databases (security risk), or manual SSH tunneling (requires terminal skills). Product managers, operations teams, and analysts who need database insights shouldn't have to learn SSH commands or manage private keys. Configure it once, and it just works.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔐 &lt;strong&gt;Automatic SSH tunneling&lt;/strong&gt; with trust-on-first-use and auto-reconnection&lt;/li&gt;
&lt;li&gt;🛡️ &lt;strong&gt;Read-only by default&lt;/strong&gt; with smart SQL validation (60+ blocked patterns)&lt;/li&gt;
&lt;li&gt;🔄 &lt;strong&gt;Dual transport&lt;/strong&gt; — STDIO for local, HTTP + OAuth for remote&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;12 database tools&lt;/strong&gt; — query, schema discovery, monitoring&lt;/li&gt;
&lt;li&gt;⚡ &lt;strong&gt;Connection pooling&lt;/strong&gt; with cursor-based result limiting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://www.npmjs.com/package/@zlash65/postgresql-ssh-mcp" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Try it now: npx @zlash65/postgresql-ssh-mcp&lt;/a&gt;
&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FZlash65%2Fpostgresql-ssh-mcp%2Fmain%2Fassets%2Farchitecture.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FZlash65%2Fpostgresql-ssh-mcp%2Fmain%2Fassets%2Farchitecture.png" alt="PostgreSQL SSH MCP Architecture"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Local:  Claude Desktop → STDIO → MCP Server → [SSH Tunnel] → PostgreSQL
Remote: Claude/ChatGPT → HTTPS → MCP HTTP Server → [SSH Tunnel] → PostgreSQL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Deployment Options
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Client&lt;/th&gt;
&lt;th&gt;Transport&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude Desktop&lt;/td&gt;
&lt;td&gt;STDIO&lt;/td&gt;
&lt;td&gt;Local development, direct DB access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude Desktop&lt;/td&gt;
&lt;td&gt;HTTP + OAuth&lt;/td&gt;
&lt;td&gt;Remote DB via Connectors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;HTTP + OAuth&lt;/td&gt;
&lt;td&gt;Remote DB via Developer Mode&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Method 1: Claude Desktop (Local STDIO)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Local development, databases accessible from your machine&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Find Config File
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Location&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;&lt;code&gt;~/Library/Application Support/Claude/claude_desktop_config.json&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows&lt;/td&gt;
&lt;td&gt;&lt;code&gt;%APPDATA%/Claude/claude_desktop_config.json&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In Claude Desktop: &lt;strong&gt;Settings &amp;gt; Developer &amp;gt; Edit Config&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Add MCP Server
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Basic (local database):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"postgres"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@zlash65/postgresql-ssh-mcp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"DATABASE_URI"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"postgresql://user:password@localhost:5432/mydb"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;
  With SSH Tunnel (production database)
  &lt;br&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"postgres-prod"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@zlash65/postgresql-ssh-mcp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"DATABASE_URI"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"postgresql://dbuser:dbpass@db.internal:5432/mydb"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"SSH_ENABLED"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"SSH_HOST"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bastion.example.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"SSH_USER"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ec2-user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"SSH_PRIVATE_KEY_PATH"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/Users/you/.ssh/id_rsa"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Optional: &lt;code&gt;SSH_PRIVATE_KEY_PASSPHRASE&lt;/code&gt;, &lt;code&gt;SSH_MAX_RECONNECT_ATTEMPTS&lt;/code&gt;&lt;/p&gt;



&lt;/p&gt;

&lt;p&gt;
  Enable Write Mode
  &lt;br&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"DATABASE_URI"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"postgresql://user:password@localhost:5432/mydb"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"READ_ONLY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"false"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;⚠️ Only enable for non-production databases.&lt;/p&gt;



&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Restart Claude Desktop
&lt;/h3&gt;

&lt;p&gt;Done! Ask Claude: "What tables are in my database?"&lt;/p&gt;




&lt;h2&gt;
  
  
  Remote HTTP Server Deployment
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Shared team access, production databases, ChatGPT integration&lt;/p&gt;

&lt;p&gt;This section covers deploying the MCP HTTP server with OAuth on a Linux server. Both Claude Desktop (via Connectors) and ChatGPT connect to the same server.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Linux server (Ubuntu 22.04/24.04 recommended)&lt;/li&gt;
&lt;li&gt;Domain name with DNS access&lt;/li&gt;
&lt;li&gt;Firewall allowing ports 22, 80, 443&lt;/li&gt;
&lt;li&gt;PostgreSQL database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FZlash65%2Fpostgresql-ssh-mcp%2Fmain%2Fassets%2Foauth-flow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FZlash65%2Fpostgresql-ssh-mcp%2Fmain%2Fassets%2Foauth-flow.png" alt="OAuth Flow"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 1: Configure DNS
&lt;/h3&gt;

&lt;p&gt;Create an A record pointing your domain to your server's public IP:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;td&gt;your-subdomain&lt;/td&gt;
&lt;td&gt;Server IP&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Verify: &lt;code&gt;nslookup your-subdomain.example.com&lt;/code&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 2: Install Dependencies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Connect to server&lt;/span&gt;
ssh &lt;span class="nt"&gt;-i&lt;/span&gt; your-key.pem ubuntu@your-server-ip

&lt;span class="c"&gt;# Update system&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;apt upgrade &lt;span class="nt"&gt;-y&lt;/span&gt;

&lt;span class="c"&gt;# Install Node.js via nvm&lt;/span&gt;
curl &lt;span class="nt"&gt;-o-&lt;/span&gt; https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.bashrc
nvm &lt;span class="nb"&gt;install &lt;/span&gt;22

&lt;span class="c"&gt;# Install nginx and Certbot&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; nginx certbot python3-certbot-nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 3: Configure nginx
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;vim /etc/nginx/sites-available/mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;server_name&lt;/span&gt; &lt;span class="s"&gt;your-subdomain.example.com&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:3000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_http_version&lt;/span&gt; &lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Host&lt;/span&gt; &lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Real-IP&lt;/span&gt; &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-Proto&lt;/span&gt; &lt;span class="nv"&gt;$scheme&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Upgrade&lt;/span&gt; &lt;span class="nv"&gt;$http_upgrade&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Connection&lt;/span&gt; &lt;span class="s"&gt;"upgrade"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_read_timeout&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable and reload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; /etc/nginx/sites-available/mcp /etc/nginx/sites-enabled/
&lt;span class="nb"&gt;sudo &lt;/span&gt;nginx &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl reload nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 4: Obtain SSL Certificate
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;certbot &lt;span class="nt"&gt;--nginx&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; your-subdomain.example.com

&lt;span class="c"&gt;# Verify auto-renewal&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;certbot renew &lt;span class="nt"&gt;--dry-run&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 5: Install MCP Server
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /opt/mcp-server
&lt;span class="nb"&gt;sudo chown &lt;/span&gt;ubuntu:ubuntu /opt/mcp-server
&lt;span class="nb"&gt;cd&lt;/span&gt; /opt/mcp-server

git clone https://github.com/zlash65/postgresql-ssh-mcp.git
&lt;span class="nb"&gt;cd &lt;/span&gt;postgresql-ssh-mcp
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm run build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 6: Configure Auth0
&lt;/h3&gt;

&lt;p&gt;
  6.1: Create Auth0 Tenant
  &lt;ol&gt;
&lt;li&gt;Sign up at &lt;a href="https://auth0.com" rel="noopener noreferrer"&gt;Auth0&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Set tenant domain (e.g., &lt;code&gt;postgresql-ssh-mcp&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Select region&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Your &lt;code&gt;AUTH0_DOMAIN&lt;/code&gt;: &lt;code&gt;{tenant-domain}.{region}.auth0.com&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4haixptnfrp8y9fstlh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4haixptnfrp8y9fstlh.png" alt="Auth0 Tenant"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;/p&gt;

&lt;p&gt;
  6.2: Create API
  &lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Applications &amp;gt; APIs &amp;gt; + Create API&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Name: &lt;code&gt;PostgreSQL SSH MCP&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Identifier: &lt;code&gt;https://your-subdomain.example.com/mcp&lt;/code&gt; (becomes &lt;code&gt;AUTH0_AUDIENCE&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Signing Algorithm: &lt;code&gt;RS256&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1uig94u6ssvwugahvkfe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1uig94u6ssvwugahvkfe.png" alt="Create API"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;/p&gt;

&lt;p&gt;
  6.3: Set Default Audience
  &lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Settings &amp;gt; General &amp;gt; API Authorization Settings&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Set Default Audience to your API Identifier&lt;/li&gt;
&lt;li&gt;Save&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv121yivx9cooduyop5u7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv121yivx9cooduyop5u7.png" alt="Default Audience"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;/p&gt;

&lt;p&gt;
  6.4: Enable DCR
  &lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Settings &amp;gt; Advanced&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Enable &lt;strong&gt;Dynamic Client Registration (DCR)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsfws404bnazdzt9umg1r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsfws404bnazdzt9umg1r.png" alt="DCR"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;/p&gt;

&lt;p&gt;
  6.5: Configure Database Connection
  &lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Authentication &amp;gt; Database &amp;gt; Username-Password-Authentication&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Enable: &lt;strong&gt;Disable Sign Ups&lt;/strong&gt; and &lt;strong&gt;Promote Connection to Domain Level&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhbuu92hzyvr0md3v5dyy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhbuu92hzyvr0md3v5dyy.png" alt="Database"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;/p&gt;

&lt;p&gt;
  6.6: Create User
  &lt;ol&gt;
&lt;li&gt;&lt;strong&gt;User Management &amp;gt; Users &amp;gt; + Create User&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Connection: &lt;code&gt;Username-Password-Authentication&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Email and password&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Remember these credentials for authentication.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9ic77j5aqkz6rin7ukw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9ic77j5aqkz6rin7ukw.png" alt="Create User"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 7: Configure Environment
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vim /opt/mcp-server/postgresql-ssh-mcp/.env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Database&lt;/span&gt;
&lt;span class="nv"&gt;DATABASE_URI&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;postgresql://user:password@your-db-host:5432/your-database
&lt;span class="nv"&gt;DATABASE_SSL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false
&lt;/span&gt;&lt;span class="nv"&gt;READ_ONLY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# SSH Tunnel (optional)&lt;/span&gt;
&lt;span class="c"&gt;# SSH_ENABLED=true&lt;/span&gt;
&lt;span class="c"&gt;# SSH_HOST=bastion.example.com&lt;/span&gt;
&lt;span class="c"&gt;# SSH_USER=ubuntu&lt;/span&gt;
&lt;span class="c"&gt;# SSH_PRIVATE_KEY_PATH=/home/ubuntu/.ssh/id_rsa&lt;/span&gt;

&lt;span class="c"&gt;# HTTP Server&lt;/span&gt;
&lt;span class="nv"&gt;MCP_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.0.0.0

&lt;span class="c"&gt;# Auth0&lt;/span&gt;
&lt;span class="nv"&gt;MCP_AUTH_MODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;oauth
&lt;span class="nv"&gt;AUTH0_DOMAIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-tenant.us.auth0.com
&lt;span class="nv"&gt;AUTH0_AUDIENCE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://your-subdomain.example.com/mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 8: Create systemd Service
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;vim /etc/systemd/system/postgresql-ssh-mcp.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;PostgreSQL SSH MCP Server&lt;/span&gt;
&lt;span class="py"&gt;After&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network.target&lt;/span&gt;

&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;simple&lt;/span&gt;
&lt;span class="py"&gt;User&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;ubuntu&lt;/span&gt;
&lt;span class="py"&gt;WorkingDirectory&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/opt/mcp-server/postgresql-ssh-mcp&lt;/span&gt;
&lt;span class="py"&gt;EnvironmentFile&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/opt/mcp-server/postgresql-ssh-mcp/.env&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;PATH=/home/ubuntu/.nvm/versions/node/v22.21.1/bin:/usr/bin:/bin&lt;/span&gt;
&lt;span class="py"&gt;ExecStart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/home/ubuntu/.nvm/versions/node/v22.21.1/bin/node /opt/mcp-server/postgresql-ssh-mcp/dist/http.js&lt;/span&gt;
&lt;span class="py"&gt;Restart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;on-failure&lt;/span&gt;
&lt;span class="py"&gt;RestartSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;10&lt;/span&gt;

&lt;span class="nn"&gt;[Install]&lt;/span&gt;
&lt;span class="py"&gt;WantedBy&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;multi-user.target&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Check your Node.js path with &lt;code&gt;which node&lt;/code&gt; and update accordingly.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;postgresql-ssh-mcp
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start postgresql-ssh-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 9: Verify Deployment
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check status&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status postgresql-ssh-mcp

&lt;span class="c"&gt;# View logs&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; postgresql-ssh-mcp &lt;span class="nt"&gt;-f&lt;/span&gt;

&lt;span class="c"&gt;# Test health endpoint&lt;/span&gt;
curl https://your-subdomain.example.com/health
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.x.x"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;
  Useful Commands
  &lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sudo systemctl restart postgresql-ssh-mcp&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Restart service&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sudo journalctl -u postgresql-ssh-mcp -f&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Live logs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sudo nginx -t &amp;amp;&amp;amp; sudo systemctl reload nginx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Reload nginx&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sudo certbot renew&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Renew SSL&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;/p&gt;




&lt;h3&gt;
  
  
  Method 2: Connect Claude Desktop (via Connectors)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Open Claude Desktop &lt;strong&gt;Settings &amp;gt; Connectors&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click on the &lt;strong&gt;Add custom connector&lt;/strong&gt; button&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoph4o1z04fkczvdvoyz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoph4o1z04fkczvdvoyz.png" alt="Claude Add custom connector"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Click &lt;strong&gt;Add&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Authenticate with Auth0 (credentials from Step 6.6)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw84fa4x3shc0bdhdfhu8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw84fa4x3shc0bdhdfhu8.png" alt="Auth0 Login"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Method 3: Connect ChatGPT
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Profile &amp;gt; Settings &amp;gt; Apps &amp;gt; Enable Developer Mode&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create App&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Name: &lt;code&gt;PostgreSQL SSH MCP&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;MCP Server URL: &lt;code&gt;https://your-subdomain.example.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Authentication: &lt;code&gt;OAuth&lt;/code&gt; (leave Client ID/Secret empty)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flhc05rtt09fmlf7qpw9c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flhc05rtt09fmlf7qpw9c.png" alt="ChatGPT Create App"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Authenticate with Auth0 (credentials from Step 6.6)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw84fa4x3shc0bdhdfhu8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw84fa4x3shc0bdhdfhu8.png" alt="Auth0 Login"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In chat: &lt;strong&gt;+ &amp;gt; More&lt;/strong&gt; &amp;gt; Select your MCP server&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Available Tools
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Query
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;execute_query&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SQL with parameterized queries (results capped by &lt;code&gt;MAX_ROWS&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;explain_query&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;EXPLAIN plans in text/JSON/YAML/XML&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Schema
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_schemas&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Database schemas (excludes system)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_tables&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tables with row counts and sizes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;describe_table&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Columns, constraints, indexes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_databases&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All databases with sizes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Monitoring
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_connection_status&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pool stats, tunnel state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_active_connections&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Active connections&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_long_running_queries&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Slow queries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_database_version&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;PostgreSQL version&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_database_size&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Size breakdown&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_table_stats&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Vacuum/analyze stats&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Security
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Read-only by default&lt;/strong&gt; — The server blocks &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, &lt;code&gt;DELETE&lt;/code&gt;, &lt;code&gt;DROP&lt;/code&gt;, &lt;code&gt;TRUNCATE&lt;/code&gt;, &lt;code&gt;ALTER&lt;/code&gt;, &lt;code&gt;CREATE&lt;/code&gt;, &lt;code&gt;GRANT&lt;/code&gt;, &lt;code&gt;REVOKE&lt;/code&gt;, even &lt;code&gt;SELECT INTO&lt;/code&gt; and &lt;code&gt;PREPARE&lt;/code&gt;/&lt;code&gt;EXECUTE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SSH Host Key Verification&lt;/strong&gt; — Trust-on-First-Use (TOFU) saves unknown keys on first connection and verifies subsequent connections. Disable with &lt;code&gt;SSH_TRUST_ON_FIRST_USE=false&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Connection string obfuscation&lt;/strong&gt; — Passwords are redacted in logs: &lt;code&gt;postgresql://user:***@host:5432/mydb&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Query limits&lt;/strong&gt; — &lt;code&gt;MAX_ROWS=1000&lt;/code&gt;, &lt;code&gt;QUERY_TIMEOUT=30000&lt;/code&gt;, &lt;code&gt;MAX_CONCURRENT_QUERIES=10&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CORS/Origin validation&lt;/strong&gt; (HTTP mode):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;MCP_ALLOWED_ORIGINS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://chatgpt.com,https://chat.openai.com"&lt;/span&gt;
&lt;span class="nv"&gt;MCP_ALLOWED_HOSTS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-subdomain.example.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;
  Environment Variables Reference
  &lt;br&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Database&lt;/span&gt;
&lt;span class="nv"&gt;DATABASE_URI&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"postgresql://user:pass@host:5432/db"&lt;/span&gt;
&lt;span class="nv"&gt;DATABASE_SSL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;
&lt;span class="nv"&gt;DATABASE_SSL_CA&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/path/to/ca.pem"&lt;/span&gt;

&lt;span class="c"&gt;# SSH Tunnel&lt;/span&gt;
&lt;span class="nv"&gt;SSH_ENABLED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;
&lt;span class="nv"&gt;SSH_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"bastion.example.com"&lt;/span&gt;
&lt;span class="nv"&gt;SSH_USER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"ec2-user"&lt;/span&gt;
&lt;span class="nv"&gt;SSH_PRIVATE_KEY_PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/path/to/key"&lt;/span&gt;
&lt;span class="nv"&gt;SSH_TRUST_ON_FIRST_USE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;
&lt;span class="nv"&gt;SSH_MAX_RECONNECT_ATTEMPTS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"5"&lt;/span&gt;

&lt;span class="c"&gt;# Query&lt;/span&gt;
&lt;span class="nv"&gt;READ_ONLY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;
&lt;span class="nv"&gt;MAX_ROWS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"1000"&lt;/span&gt;
&lt;span class="nv"&gt;QUERY_TIMEOUT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"30000"&lt;/span&gt;

&lt;span class="c"&gt;# HTTP Server&lt;/span&gt;
&lt;span class="nv"&gt;PORT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"3000"&lt;/span&gt;
&lt;span class="nv"&gt;MCP_AUTH_MODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"oauth"&lt;/span&gt;
&lt;span class="nv"&gt;AUTH0_DOMAIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"tenant.us.auth0.com"&lt;/span&gt;
&lt;span class="nv"&gt;AUTH0_AUDIENCE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://your-domain.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;/p&gt;




&lt;h2&gt;
  
  
  Troubleshooting
&lt;/h2&gt;

&lt;p&gt;
  Claude Desktop: Server Not Starting
  &lt;p&gt;Check logs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;macOS: &lt;code&gt;~/Library/Logs/Claude/mcp*.log&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Windows: &lt;code&gt;%APPDATA%/Claude/logs/mcp*.log&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Common issues: Invalid &lt;code&gt;DATABASE_URI&lt;/code&gt;, SSH key permissions (should be 600), PostgreSQL not accessible.&lt;/p&gt;



&lt;/p&gt;

&lt;p&gt;
  ChatGPT: "Unable to connect"
  &lt;br&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Verify server&lt;/span&gt;
curl https://your-subdomain.example.com/health

&lt;span class="c"&gt;# Check OAuth metadata&lt;/span&gt;
curl https://your-subdomain.example.com/.well-known/oauth-protected-resource
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Verify Auth0: Default Audience set, DCR enabled, database connection promoted.&lt;/p&gt;



&lt;/p&gt;

&lt;p&gt;
  401 Unauthorized
  &lt;ul&gt;
&lt;li&gt;
&lt;code&gt;AUTH0_AUDIENCE&lt;/code&gt; must match API Identifier exactly&lt;/li&gt;
&lt;li&gt;Check Default Audience in Auth0 tenant settings&lt;/li&gt;
&lt;li&gt;Tokens may have expired&lt;/li&gt;
&lt;/ul&gt;



&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.npmjs.com/package/@zlash65/postgresql-ssh-mcp" rel="noopener noreferrer"&gt;npm package&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Zlash65/postgresql-ssh-mcp#readme" rel="noopener noreferrer"&gt;Full documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Zlash65/postgresql-ssh-mcp/issues" rel="noopener noreferrer"&gt;Report issues&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://github.com/Zlash65/postgresql-ssh-mcp" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Star on GitHub&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>ssh</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Fullstack RAG PDFBot - From Prototype to Production-Ready-ish</title>
      <dc:creator>Zarrar Shaikh</dc:creator>
      <pubDate>Mon, 07 Jul 2025 12:06:15 +0000</pubDate>
      <link>https://forem.com/zlash65/rag-pdfbot-v3-from-prototype-to-production-ready-ish-32f7</link>
      <guid>https://forem.com/zlash65/rag-pdfbot-v3-from-prototype-to-production-ready-ish-32f7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;🔗 If you're new to this project, start with the original guide here:&lt;br&gt;&lt;br&gt;
&lt;a href="https://dev.to/zlash65/building-a-rag-powered-pdf-chatbot-with-langchain-streamlit-and-faiss-9i9"&gt;Building a RAG-powered PDF Chatbot - V1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🔗 Follow-up guide after the first iteration of the bot:&lt;br&gt;&lt;br&gt;
&lt;a href="https://dev.to/zlash65/refactoring-rag-pdfbot-modular-design-with-langchain-streamlit-and-chromadb-41fn"&gt;Refactoring RAG PDFBot - V2&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;In &lt;a href="https://dev.to/zlash65/refactoring-rag-pdfbot-modular-design-with-langchain-streamlit-and-chromadb-41fn"&gt;Version 2&lt;/a&gt;, we built on our &lt;a href="https://dev.to/zlash65/building-a-rag-powered-pdf-chatbot-with-langchain-streamlit-and-faiss-9i9"&gt;Version 1&lt;/a&gt; foundation by splitting everything into separate files. That was great - we cleaned up our monolithic code and gave our chatbot more structure. But let’s be honest: everything still lived inside one Streamlit app. The logic for uploading files, generating answers, and even managing the vector store - all of it was handled inside Streamlit. That’s fine for a prototype, but not quite production-ready.&lt;/p&gt;

&lt;p&gt;With Version 3, we’ve taken a major step forward.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📦 Source Code V3: &lt;a href="https://github.com/Zlash65/rag-bot-fastapi" rel="noopener noreferrer"&gt;Zlash65/rag-bot-fastapi&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🚀 What’s New in Iteration 3?
&lt;/h2&gt;

&lt;p&gt;We’ve &lt;strong&gt;split the application into a real Frontend and Backend&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Built using &lt;strong&gt;Streamlit&lt;/strong&gt;, it handles all the UI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: Powered by &lt;strong&gt;FastAPI&lt;/strong&gt;, it takes care of PDF processing, vector storage, querying, and AI interactions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ✅ Why this split?
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Separation of Concerns&lt;/strong&gt;: UI doesn’t need to know how AI logic or embeddings work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexibility&lt;/strong&gt;: Want to use Gradio, or React for your UI? Now you can, without touching the backend.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: This separation allows better logging, monitoring, and potential deployment on different servers.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F457lzqizftletediiws6.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F457lzqizftletediiws6.gif" alt="RAG PDFBot Demo" width="720" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;👆 Here's a quick look&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧱 Our Project Structure
&lt;/h2&gt;

&lt;p&gt;We now have two separate folders:&lt;/p&gt;

&lt;h3&gt;
  
  
  📂 &lt;code&gt;client/&lt;/code&gt; - Streamlit Frontend
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;client/
├── app.py                      # Main entrypoint for Streamlit
├── components/                 # Chat UI, inspector, sidebar
│   ├── chat.py
│   ├── inspector.py
│   └── sidebar.py
├── state/
│   └── session.py              # Session setup and helper functions
├── utils/
│   ├── api.py                  # API calls to FastAPI server
│   ├── config.py               # API URL config
│   └── helpers.py              # High-level API abstractions
├── requirements.txt
└── README.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Stateless API interactions via &lt;code&gt;requests&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;UI elements handled via sidebar, chat input, and toggleable views&lt;/li&gt;
&lt;li&gt;Modular components for Chat, Inspector, and Uploads&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📂 &lt;code&gt;server/&lt;/code&gt; - FastAPI Backend
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;server/
├── api/
│   ├── routes.py               # API endpoints for upload, chat, models etc.
│   └── schemas.py              # Input/output data validation with Pydantic
├── core/
│   ├── document_processor.py   # PDF handling: save, chunk, split
│   ├── llm_chain_factory.py    # LLM, embeddings, chain creation
│   └── vector_database.py      # ChromaDB handling: load, upsert, search
├── config/
│   └── settings.py             # API keys, model setup, directories
├── utils/
│   └── logger.py               # Logging for debugging and monitoring
├── main.py                     # FastAPI app setup
├── requirements.txt
└── README.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our backend now has full control over:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What LLM is used&lt;/li&gt;
&lt;li&gt;Which model the user selects&lt;/li&gt;
&lt;li&gt;How PDFs are stored and processed&lt;/li&gt;
&lt;li&gt;What embeddings we generate&lt;/li&gt;
&lt;li&gt;What responses we send back&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means we can extend things easily - add another model, another embedding technique, or change vector store - without touching our frontend.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔄 What Changed from Iteration 2
&lt;/h2&gt;

&lt;p&gt;Here’s a quick breakdown of how we evolved:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Iteration 2&lt;/th&gt;
&lt;th&gt;Iteration 3&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Codebase&lt;/td&gt;
&lt;td&gt;One Streamlit app&lt;/td&gt;
&lt;td&gt;Separate client (UI) + server (logic)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PDF Handling&lt;/td&gt;
&lt;td&gt;Inside frontend&lt;/td&gt;
&lt;td&gt;Via FastAPI API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM Response&lt;/td&gt;
&lt;td&gt;Direct from Streamlit&lt;/td&gt;
&lt;td&gt;API-based response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embeddings + Vectorstore&lt;/td&gt;
&lt;td&gt;Managed by UI&lt;/td&gt;
&lt;td&gt;Fully controlled by backend&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inspector&lt;/td&gt;
&lt;td&gt;Inside sidebar (cramped)&lt;/td&gt;
&lt;td&gt;Main UI toggle - cleaner&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extending Models&lt;/td&gt;
&lt;td&gt;Needed code change in UI&lt;/td&gt;
&lt;td&gt;Plug-and-play via config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File Validation&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;PDF size/type check in backend&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Future Extensions&lt;/td&gt;
&lt;td&gt;Hard&lt;/td&gt;
&lt;td&gt;Clean hooks for scaling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UX&lt;/td&gt;
&lt;td&gt;Basic&lt;/td&gt;
&lt;td&gt;Toggle-based views, downloads, resets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Text Splitting&lt;/td&gt;
&lt;td&gt;RecursiveCharacterTextSplitter&lt;/td&gt;
&lt;td&gt;TokenTextSplitter (LLM-aware, cleaner splits)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🔍 Why We Switched to TokenTextSplitter
&lt;/h2&gt;

&lt;p&gt;In earlier versions, we used &lt;code&gt;RecursiveCharacterTextSplitter&lt;/code&gt; to chunk our documents. It works by splitting the text at "natural" breakpoints - like paragraphs, then sentences, then words, then characters - to get close to the target chunk size &lt;strong&gt;in characters&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But here's the problem: LLMs like GPT, Claude, or Gemini don’t read text in characters - they read &lt;strong&gt;tokens&lt;/strong&gt;. A token is roughly 3–4 characters or 0.75 words. That means your 1000-character chunk might be 300 tokens… or 1200 tokens. It’s unpredictable.&lt;/p&gt;

&lt;p&gt;To fix this, we now use &lt;code&gt;TokenTextSplitter&lt;/code&gt;, which splits based on &lt;strong&gt;actual token counts&lt;/strong&gt;, giving precise control over chunk size and overlap. This leads to more reliable inputs and avoids going over model limits.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔬 Simple Example
&lt;/h3&gt;

&lt;p&gt;Let’s take this sentence:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"LangChain helps developers build applications with LLMs more efficiently."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s &lt;strong&gt;76 characters&lt;/strong&gt; but around &lt;strong&gt;12 tokens&lt;/strong&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  RecursiveCharacterTextSplitter
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;With &lt;code&gt;RecursiveCharacterTextSplitter(chunk_size=30)&lt;/code&gt;, we might get:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chunk 1: "LangChain helps developers "
Chunk 2: "build applications with LLMs "
Chunk 3: "more efficiently."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Visually clean, but &lt;strong&gt;token count varies&lt;/strong&gt; and could overflow model limits.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm34f7hvtq601hi9zrupz.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm34f7hvtq601hi9zrupz.gif" alt="RAG PDFBot RecursiveCharacterTextSplitter" width="720" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;em&gt;Notice how the bot was not able to give correct response to our question because of improper chunking&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  TokenTextSplitter
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;With &lt;code&gt;TokenTextSplitter(chunk_size=10, chunk_overlap=2)&lt;/code&gt;, we get chunks like:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chunk 1: ["Lang", "Chain", "helps", ..., "efficiently"]
Chunk 2: ["applications", ..., "efficiently"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each chunk is &lt;strong&gt;exactly 10 tokens&lt;/strong&gt;, making it predictable and LLM-friendly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbcvs9t1li36qpe2p7r2p.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbcvs9t1li36qpe2p7r2p.gif" alt="RAG PDFBot TokenTextSplitter" width="720" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ &lt;em&gt;We get more accurate responses when splitting chunks by token&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By using &lt;code&gt;TokenTextSplitter&lt;/code&gt;, we gain &lt;strong&gt;better control, consistency, and contextual accuracy&lt;/strong&gt; - making our RAG pipeline more reliable.&lt;/p&gt;




&lt;h2&gt;
  
  
  ✨ Better User Experience
&lt;/h2&gt;

&lt;p&gt;Previously, the inspector tool was a bit hidden. We crammed it into the sidebar and showed the results there too - not the best experience.&lt;/p&gt;

&lt;p&gt;In this iteration, we made it &lt;strong&gt;visible in the main chat area&lt;/strong&gt;. There's a toggle in the sidebar where we can switch between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;💬 Chat View&lt;/li&gt;
&lt;li&gt;🔬 Inspector View&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ze169rgmtz6ai7gcmyt.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ze169rgmtz6ai7gcmyt.gif" alt="RAG PDFBot Different Views" width="720" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, we get full-width, readable responses whether we’re chatting with PDFs or inspecting our vectorstore. It’s simple and intuitive.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚡ Why a Production-Ready Backend Matters
&lt;/h2&gt;

&lt;p&gt;Splitting the codebase into frontend and backend isn’t just good structure - it unlocks real power:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Async APIs by Default&lt;/strong&gt;: Our FastAPI backend supports async endpoints. That means heavy operations like PDF uploads can later be offloaded to background task queues like &lt;strong&gt;Celery&lt;/strong&gt; or &lt;strong&gt;RQ&lt;/strong&gt;, keeping the app responsive.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Plug-and-Play Model Integration&lt;/strong&gt;: Want to add OpenAI, Cohere, or any other LLM provider? Just update the model config in &lt;code&gt;settings.py&lt;/code&gt;. The frontend automatically reflects the new options - no need to touch UI code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Independent Scalability&lt;/strong&gt;: The backend can now be scaled separately. You could deploy it on a more powerful server or container, while keeping the Streamlit frontend lightweight.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Extendability&lt;/strong&gt;: You can now plug in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Authentication &amp;amp; authorization&lt;/li&gt;
&lt;li&gt;Persistent chat history&lt;/li&gt;
&lt;li&gt;User sessions&lt;/li&gt;
&lt;li&gt;Rate limiting&lt;/li&gt;
&lt;li&gt;Admin dashboards&lt;/li&gt;
&lt;li&gt;and more...&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cleaner Logs &amp;amp; Traceability&lt;/strong&gt;: Errors, API calls, and internal processing can now be logged systematically using &lt;code&gt;utils/logger.py&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ready for Containerization&lt;/strong&gt;: Frontend and backend can be deployed on different services and containers (e.g. Streamlit Cloud + Render, EC2, etc).&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Frontend Agnostic&lt;/strong&gt;: Want a more custom UI? You can now build one in React, Gradio, or even mobile - and keep using the same backend APIs.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  🌐 How the Frontend Talks to the Backend
&lt;/h2&gt;

&lt;p&gt;The Streamlit frontend acts purely as a UI renderer. Every interaction routes through the FastAPI backend via well-defined HTTP endpoints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Model &amp;amp; Provider Fetching&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;GET /llm&lt;/code&gt; → Fetches available providers.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GET /llm/{model_provider}&lt;/code&gt; → Fetches models for the selected provider.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;PDF Upload &amp;amp; Processing&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;POST /upload_and_process_pdfs&lt;/code&gt; → Uploads selected PDFs, splits them, creates embeddings, and stores them.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Inspector Tools&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;GET /vector_store/count/{model_provider}&lt;/code&gt; → Gets the number of indexed documents.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /vector_store/search&lt;/code&gt; → Returns top document matches for a query.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Chat Endpoint&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;POST /chat&lt;/code&gt; → Sends user message + model info, and returns LLM-generated response.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;This setup ensures &lt;strong&gt;separation of responsibilities&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frontend: handles layout, inputs, displaying results.&lt;/li&gt;
&lt;li&gt;Backend: handles logic, computation, storage, and integration with LLMs.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🏗️ Why We Still Use Streamlit for the Frontend
&lt;/h2&gt;

&lt;p&gt;While we’ve upgraded our backend, we’re sticking with &lt;strong&gt;Streamlit&lt;/strong&gt; for the frontend - for now. Here’s why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🧱 &lt;strong&gt;Rapid Prototyping&lt;/strong&gt;: We can build interactive UIs in minutes, not days.&lt;/li&gt;
&lt;li&gt;💬 &lt;strong&gt;Built-in Components&lt;/strong&gt;: Features like &lt;code&gt;chat_input&lt;/code&gt;, &lt;code&gt;expander&lt;/code&gt;, &lt;code&gt;sidebar&lt;/code&gt;, and &lt;code&gt;st.tabs&lt;/code&gt; simplify layout.&lt;/li&gt;
&lt;li&gt;🧪 &lt;strong&gt;Focus on Learning AI&lt;/strong&gt;: We avoid the overhead of building a custom UI in React or HTML/CSS - saving our energy for improving LLM workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Eventually, we’ll likely switch to a custom-built frontend. But until then, Streamlit lets us move fast and learn faster.&lt;/p&gt;




&lt;h2&gt;
  
  
  📦 Architecture Benefits at a Glance
&lt;/h2&gt;

&lt;p&gt;Here’s what we gain by this separation of concerns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Maintainability&lt;/strong&gt;: Code is modular and easier to debug or extend.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Scalability&lt;/strong&gt;: Frontend and backend can grow independently.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Developer Experience&lt;/strong&gt;: No fear adding new models, chains, or workflows.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Deployment Flexibility&lt;/strong&gt;: Deploy frontend and backend to different services with ease.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Tooling Support&lt;/strong&gt;: Easier to add monitoring, tracing, logging, background jobs, or security layers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This structure mirrors how real-world AI products are built.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔁 Recap
&lt;/h2&gt;

&lt;p&gt;Let’s summarize our journey so far:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Iteration 1&lt;/strong&gt;: A single-file prototype with Streamlit and FAISS. Quick and dirty.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Iteration 2&lt;/strong&gt;: Modularized the logic but still kept everything inside one Streamlit app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Iteration 3&lt;/strong&gt;: Split into a decoupled frontend (Streamlit) and backend (FastAPI), creating a scalable, production-leaning RAG bot.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;From hacking things together to building an extensible, maintainable system - we're not just playing with AI anymore. We're engineering real tools.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  📦 Source Code
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Version 1: &lt;a href="https://github.com/Zlash65/rag-bot-basic" rel="noopener noreferrer"&gt;Zlash65/rag-bot-basic&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Version 2: &lt;a href="https://github.com/Zlash65/rag-bot-chroma" rel="noopener noreferrer"&gt;Zlash65/rag-bot-chroma&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Version 3: &lt;a href="https://github.com/Zlash65/rag-bot-fastapi" rel="noopener noreferrer"&gt;Zlash65/rag-bot-fastapi&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  💭 Final thoughts
&lt;/h2&gt;

&lt;p&gt;We started with a single file prototype. Then, we broke things into modules. Now, we’ve split the app into an actual frontend and backend. And that’s a &lt;strong&gt;huge deal&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you’ve made it this far, you’ve not only built something functional - you’ve learned how real-world AI tools are structured.&lt;/p&gt;

&lt;p&gt;Don't stop here. Keep exploring. Keep tweaking. Build weird stuff. Break things and fix them.&lt;/p&gt;

&lt;p&gt;This is how engineers grow.&lt;/p&gt;

&lt;p&gt;Let’s keep shipping and improving - one iteration at a time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Happy building! 🛠️&lt;/strong&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Refactoring RAG PDFBot: Modular Design with LangChain, Streamlit and ChromaDB</title>
      <dc:creator>Zarrar Shaikh</dc:creator>
      <pubDate>Sat, 05 Jul 2025 20:04:50 +0000</pubDate>
      <link>https://forem.com/zlash65/refactoring-rag-pdfbot-modular-design-with-langchain-streamlit-and-chromadb-41fn</link>
      <guid>https://forem.com/zlash65/refactoring-rag-pdfbot-modular-design-with-langchain-streamlit-and-chromadb-41fn</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;🔗 If you're new to this project, start with the original guide here: &lt;a href="https://dev.to/zlash65/building-a-rag-powered-pdf-chatbot-with-langchain-streamlit-and-faiss-9i9"&gt;Building a RAG-powered PDF Chatbot&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;In real-world production systems, it’s common practice to split responsibilities into multiple well-defined modules. Instead of cramming everything into a single file, code is grouped based on functionality - making it easier to debug, scale, and maintain. In this version of the RAG PDFBot, we’re simulating that same structure.&lt;/p&gt;

&lt;p&gt;In this post we will walk through how we can evolve our original chatbot into a modular, production-style app using &lt;strong&gt;LangChain&lt;/strong&gt;, &lt;strong&gt;ChromaDB&lt;/strong&gt;, and &lt;strong&gt;Streamlit&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbz8s915wuq8o3jhdyyy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbz8s915wuq8o3jhdyyy.gif" alt="Working demo" width="760" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;👆 Here's a quick look at what you'll be building in this guide.&lt;/p&gt;

&lt;p&gt;📦 Source Code: &lt;a href="https://github.com/Zlash65/rag-bot-chroma" rel="noopener noreferrer"&gt;Zlash65/rag-bot-chroma&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧱 What's New in the Modular Version
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;Original Version&lt;/th&gt;
&lt;th&gt;Modular Edition&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;File Structure&lt;/td&gt;
&lt;td&gt;One big &lt;code&gt;app.py&lt;/code&gt; file&lt;/td&gt;
&lt;td&gt;Multiple logical modules (chat, sidebar, LLM, PDF, vectorstore, config)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PDF Parser&lt;/td&gt;
&lt;td&gt;PyPDF2&lt;/td&gt;
&lt;td&gt;Switched to &lt;code&gt;pypdf&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embedding Store&lt;/td&gt;
&lt;td&gt;FAISS&lt;/td&gt;
&lt;td&gt;Switched to &lt;code&gt;ChromaDB&lt;/code&gt; (for learning &amp;amp; experimentation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM Chains&lt;/td&gt;
&lt;td&gt;Simple &lt;code&gt;load_qa_chain&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;LangChain &lt;code&gt;RetrievalChain&lt;/code&gt; with structured prompts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompting&lt;/td&gt;
&lt;td&gt;Static prompt template&lt;/td&gt;
&lt;td&gt;Modular prompt with system/human roles&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dev Tools&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Built-in vectorstore inspector&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🔁 From FAISS to ChromaDB
&lt;/h2&gt;

&lt;p&gt;Both &lt;strong&gt;FAISS&lt;/strong&gt; and &lt;strong&gt;ChromaDB&lt;/strong&gt; are popular options for storing and searching vector embeddings.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ In this version, we're switching to &lt;strong&gt;ChromaDB&lt;/strong&gt; - not because FAISS isn't good, but to experiment with a different vector database and learn its tradeoffs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;FAISS&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;ChromaDB&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Persistence&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;In-memory by default (requires manual saving/loading with &lt;code&gt;.save_local()&lt;/code&gt; / &lt;code&gt;.load_local()&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Persistent by default (creates &lt;code&gt;chroma&lt;/code&gt; directory and auto-saves)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simple for in-memory; more manual steps for persistence&lt;/td&gt;
&lt;td&gt;Plug-and-play with auto-persistence&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Metadata Support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stores metadata, but querying/filtering support is limited&lt;/td&gt;
&lt;td&gt;Rich metadata filtering and querying support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Built-in Filtering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Minimal (not intuitive for metadata-based filtering)&lt;/td&gt;
&lt;td&gt;Native filtering with conditions on metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Performance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Highly optimized for similarity search at scale (especially with GPU)&lt;/td&gt;
&lt;td&gt;Good performance, but not optimized for billion-scale datasets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Indexing Options&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multiple indexing algorithms (Flat, IVF, HNSW, etc.)&lt;/td&gt;
&lt;td&gt;Abstracted away - we don't control indexing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Use &lt;strong&gt;FAISS&lt;/strong&gt; if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want &lt;strong&gt;high performance&lt;/strong&gt; similarity search.&lt;/li&gt;
&lt;li&gt;You’re comfortable managing &lt;strong&gt;manual persistence&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;You’re deploying &lt;strong&gt;on-device or at scale&lt;/strong&gt;, especially with GPU acceleration.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use &lt;strong&gt;ChromaDB&lt;/strong&gt; if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want &lt;strong&gt;auto-persistence&lt;/strong&gt; with minimal setup.&lt;/li&gt;
&lt;li&gt;You need &lt;strong&gt;metadata filtering&lt;/strong&gt; (e.g., retrieve only documents from a specific source).&lt;/li&gt;
&lt;li&gt;You're in &lt;strong&gt;rapid prototyping&lt;/strong&gt; mode and want a simple dev experience.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Code Snippet: ChromaDB Setup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_chroma_vectorstore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_texts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;persist_directory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./data/chroma_store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🔍 VectorStore Inspector
&lt;/h2&gt;

&lt;p&gt;One of the highlights of this version is we added a &lt;strong&gt;vectorstore inspector&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In the previous version, the vectorstore was a black box. Now, we can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run ad-hoc test queries&lt;/li&gt;
&lt;li&gt;See matching chunks returned from Chroma&lt;/li&gt;
&lt;li&gt;Visually debug which documents were used for answering&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Code Snippet: Vector Inspector
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;inspect_vectorstore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subheader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔬 Vectorstore Inspector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Enter a test query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**Result &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;**&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;code&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Example:
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9vbg626ophvd1tv4897e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9vbg626ophvd1tv4897e.png" alt="VectorStore Inspector" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 Improved Prompt &amp;amp; Chain Logic
&lt;/h2&gt;

&lt;p&gt;In the original version, we used:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;load_qa_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stuff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That worked - but now we’re using LangChain's &lt;code&gt;RetrievalQA&lt;/code&gt; with a cleaner, modular prompt built using &lt;code&gt;ChatPromptTemplate&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code Snippet: New Chain Logic
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_qa_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_messages&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant. Use the provided context to answer.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{question}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_chain_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stuff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;chain_type_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why it’s better:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;We separate system and user roles clearly&lt;/li&gt;
&lt;li&gt;It's easier to extend for follow-up questions or history&lt;/li&gt;
&lt;li&gt;It aligns better with modern LLM chat paradigms&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧩 UI &amp;amp; Handler Logic: Cleaner, Separated, Smarter
&lt;/h2&gt;

&lt;p&gt;The user interface behavior is mostly the same - but under the hood, it's been broken into logical handlers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sidebar_handler.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Handles model selection, API key input, PDF upload, and utility buttons&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;chat_handler.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Handles rendering chat bubbles, input box, and chat history download&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;llm_handler.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Manages chain and prompt setup for different model providers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vectorstore_handler.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Embeds and stores PDF chunks into ChromaDB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;pdf_handler.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Extracts and chunks text from uploaded PDFs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;developer_mode.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Adds optional vectorstore inspector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;config.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Holds model metadata and keys from &lt;code&gt;.env&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🎛️ Smarter UI Behavior with &lt;code&gt;disabled&lt;/code&gt; Components
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Previous Version
&lt;/h3&gt;

&lt;p&gt;We used conditions like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Which meant entire sections of the UI wouldn’t render at all until something was selected.&lt;/p&gt;

&lt;h4&gt;
  
  
  Example
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F16knaryxno4b3bqowp8p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F16knaryxno4b3bqowp8p.png" alt="RAG PDFBot V1" width="800" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Current Version
&lt;/h3&gt;

&lt;p&gt;In this version, &lt;strong&gt;all components are always rendered&lt;/strong&gt;, but &lt;strong&gt;disabled&lt;/strong&gt; until their prerequisites are met.&lt;/p&gt;

&lt;h4&gt;
  
  
  Why this matters:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;The UI feels more responsive and intuitive&lt;/li&gt;
&lt;li&gt;Users can "see" what steps are required&lt;/li&gt;
&lt;li&gt;No jumping around or missing UI elements&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Example:
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7jx2avkmyonrfc8m7ja.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7jx2avkmyonrfc8m7ja.png" alt="RAG PDFBot V2" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The model select dropdown is active only after choosing a provider&lt;/li&gt;
&lt;li&gt;The pdf uploader is active only after choosing a model&lt;/li&gt;
&lt;li&gt;The chat input is shown but disabled until PDFs are submitted&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach improves clarity, especially for new users.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 Want to Try It?
&lt;/h2&gt;

&lt;p&gt;You can find the full source code here 👉 &lt;a href="https://github.com/Zlash65/rag-bot-chroma" rel="noopener noreferrer"&gt;Zlash65/rag-bot-chroma&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Zlash65/rag-bot-chroma.git
&lt;span class="nb"&gt;cd &lt;/span&gt;rag-bot-chroma

python3 &lt;span class="nt"&gt;-m&lt;/span&gt; venv venv
&lt;span class="nb"&gt;source &lt;/span&gt;venv/bin/activate

pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file for your API keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;GROQ_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-groq-key
&lt;span class="nv"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-google-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then launch the app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;streamlit run app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  💭 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;This version of RAG PDFBot isn’t just a refactor - it’s a learning step toward building production-grade RAG apps. With ChromaDB, internal tools, modular code, and more intuitive UI, it's easier to maintain and extend.&lt;/p&gt;

&lt;p&gt;Still learning?&lt;/p&gt;

&lt;p&gt;👉 Start here: &lt;a href="https://dev.to/zlash65/building-a-rag-powered-pdf-chatbot-with-langchain-streamlit-and-faiss-9i9"&gt;Building a RAG-powered PDF Chatbot&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then come back and modularize like a pro.&lt;/p&gt;

&lt;p&gt;Happy building! 🛠️&lt;/p&gt;

</description>
      <category>llm</category>
      <category>rag</category>
      <category>langchain</category>
      <category>chromadb</category>
    </item>
    <item>
      <title>Building a RAG-powered PDF Chatbot with LangChain, Streamlit and FAISS</title>
      <dc:creator>Zarrar Shaikh</dc:creator>
      <pubDate>Fri, 04 Jul 2025 11:15:10 +0000</pubDate>
      <link>https://forem.com/zlash65/building-a-rag-powered-pdf-chatbot-with-langchain-streamlit-and-faiss-9i9</link>
      <guid>https://forem.com/zlash65/building-a-rag-powered-pdf-chatbot-with-langchain-streamlit-and-faiss-9i9</guid>
      <description>&lt;p&gt;In this guide, we’re going to build a working AI chatbot that can read our PDFs and answer questions from them. We’ll use a method called &lt;strong&gt;Retrieval-Augmented Generation (RAG)&lt;/strong&gt; to help our chatbot connect the dots between static AI models and our own documents.&lt;/p&gt;

&lt;p&gt;The thing with large language models (LLMs) is - they’re great with language, but they don’t know anything about our specific data. Their knowledge is frozen at the point they were trained. They can’t read our lease agreements, product manuals, or meeting notes. That’s where RAG comes in.&lt;/p&gt;

&lt;p&gt;RAG gives us a way to feed our data into the model. So instead of asking, “When does this lease expire?” and hoping the AI knows what we’re talking about, we give it the lease PDF - and it finds the answer based on what’s actually written in there.&lt;/p&gt;

&lt;h3&gt;
  
  
  Some useful ways we can apply this:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Legal documents&lt;/strong&gt; - Ask the AI to summarize a case or find key terms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Research papers&lt;/strong&gt; - Get summaries or compare studies without reading it all ourselves.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customer support&lt;/strong&gt; - Use our product docs and chat history to create a smart support assistant.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HR/Policy docs&lt;/strong&gt; - Help our team quickly find rules, policies, and procedures.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’ll walk through this step by step, using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Streamlit&lt;/strong&gt; for a simple, clean frontend&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangChain&lt;/strong&gt; to handle the LLM logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FAISS&lt;/strong&gt; for fast vector search over our document content&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our goal is to build something useful, easy to understand, and flexible enough to plug in any LLM we want later.&lt;/p&gt;

&lt;p&gt;Let’s get started.&lt;/p&gt;




&lt;h2&gt;
  
  
  🏗️ What We're Building
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Working Demo
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgvyfec4lw2i8r98tu49.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgvyfec4lw2i8r98tu49.gif" alt="g-bot-basicar" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  📐 System Design
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnbfmqr68u99q844g80kf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnbfmqr68u99q844g80kf.png" alt="rag-bot-basic" width="800" height="331"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We’re building &lt;strong&gt;RAG PDFBot&lt;/strong&gt; - a chatbot that lets us upload PDFs, choose a model provider (like Groq or Gemini), and ask natural-language questions based on the content of those PDFs.&lt;/p&gt;

&lt;p&gt;All the core pieces - PDF parsing, embedding, vector search, and LLM interaction - will be stitched together using LangChain, FAISS, and Streamlit.&lt;/p&gt;

&lt;p&gt;And we’re keeping the architecture modular, so we can plug in new models or features anytime.&lt;/p&gt;




&lt;h2&gt;
  
  
  📚 Core Concepts (Explained with Real Examples)
&lt;/h2&gt;

&lt;p&gt;Before we dive into the code, let’s quickly go over the key ideas behind what we’re building - explained in a way that makes sense without needing a PhD.&lt;/p&gt;

&lt;h3&gt;
  
  
  Large Language Models (LLMs)
&lt;/h3&gt;

&lt;p&gt;LLMs are programs trained to guess the next word in a sentence - but they’ve seen billions of examples, so their guesses are usually spot on.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;What is the capital of France?&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Paris.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We’ll use an LLM later in our chatbot to generate answers based on the content we retrieve from our PDFs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Embeddings
&lt;/h3&gt;

&lt;p&gt;Embeddings turn our text into numbers - or more accurately, into &lt;strong&gt;vectors&lt;/strong&gt;. These numbers capture meaning. For example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Cats drink milk.”&lt;/em&gt; → [0.12, -0.34, 0.89, …]&lt;br&gt;
&lt;em&gt;“Kittens consume dairy.”&lt;/em&gt; → [0.10, -0.31, 0.87, …]&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The two sentences mean nearly the same thing, and their embeddings are close too. That’s how we compare meanings using math, not just words.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vectors and Vector Databases
&lt;/h3&gt;

&lt;p&gt;A &lt;strong&gt;vector&lt;/strong&gt; is just a list of numbers. A &lt;strong&gt;vector database&lt;/strong&gt; stores lots of these vectors and helps us search by meaning.&lt;/p&gt;

&lt;p&gt;Let’s say we ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Who signed this document?”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Instead of keyword matching, our app finds chunks like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Authorized representative: John Doe”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We’re using &lt;strong&gt;FAISS&lt;/strong&gt; - a super fast, lightweight vector store that runs locally and keeps things snappy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Retrieval-Augmented Generation (RAG)
&lt;/h3&gt;

&lt;p&gt;This is the secret sauce. Instead of dumping the entire PDF into the LLM, we do something smarter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval&lt;/strong&gt; - We find the chunks that best match our question&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Augmented&lt;/strong&gt; - We add those chunks to the prompt&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generation&lt;/strong&gt; - The model uses them to craft a relevant answer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So when we ask a question, it doesn’t guess blindly - it answers based on the actual content from our documents.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧬 Stack Overview
&lt;/h2&gt;

&lt;p&gt;Here’s what we’re using to build our chatbot - and why each piece matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.langchain.com/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
The glue holding everything together. LangChain helps us connect to LLMs, manage prompts, load chains, and run contextual queries. It saves us from writing a ton of boilerplate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.streamlit.io/" rel="noopener noreferrer"&gt;Streamlit&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
A Python-based web framework that turns our scripts into a working UI - no need to mess with HTML or JavaScript. Just write Python functions, and Streamlit builds the interface.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://groq.com/docs/" rel="noopener noreferrer"&gt;Groq&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
A service that runs open-source models like LLaMA 3 at lightning speed using custom chips. It’s perfect for low-latency responses. And with their generous free tier, it’s great for getting started.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://ai.google.dev/" rel="noopener noreferrer"&gt;Google Gemini&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
Google’s LLM platform and a strong alternative to ChatGPT. We can access advanced models like Gemini Flash for reasoning and dialogue. The free tier gives us more than enough for prototyping.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/facebookresearch/faiss" rel="noopener noreferrer"&gt;FAISS&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
A fast, local vector store from Facebook AI. It lets us search our document chunks by meaning, not just keywords. Lightweight, efficient, and easy to use with embeddings.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;em&gt;We’re skipping OpenAI for now because of cost - both Groq and Gemini have fast and generous free tiers.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧰 Setting Up the Project
&lt;/h2&gt;

&lt;p&gt;Let’s start by creating a fresh project from scratch - and version-controlling it right from the beginning.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Create a New GitHub Repository
&lt;/h3&gt;

&lt;p&gt;Head over to your GitHub profile and create a new repository.&lt;br&gt;
You can name it something like &lt;code&gt;rag-pdf-chatbot&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep it &lt;strong&gt;public&lt;/strong&gt; or &lt;strong&gt;private&lt;/strong&gt; - up to you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add a &lt;code&gt;.gitignore&lt;/code&gt;&lt;/strong&gt; and choose the &lt;strong&gt;Python&lt;/strong&gt; template.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once created, copy the repo’s URL (use HTTPS).&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Clone the Repo to Your Local Machine
&lt;/h3&gt;

&lt;p&gt;Open your terminal and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/your-username/rag-pdf-chatbot.git
&lt;span class="nb"&gt;cd &lt;/span&gt;rag-pdf-chatbot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace &lt;code&gt;your-username&lt;/code&gt; with your actual GitHub username.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Set Up a Virtual Environment
&lt;/h3&gt;

&lt;p&gt;Let’s keep our project dependencies clean and isolated from the rest of our system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate  &lt;span class="c"&gt;# For MacOS / Linux&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;🧪 &lt;strong&gt;Why use a virtual environment?&lt;/strong&gt;&lt;br&gt;
It keeps our project’s packages separate from everything else on our machine. This avoids version conflicts and makes our setup easier to manage and share.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧩 Code Overview
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Directory Structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;rag-pdf-chatbot/
│
├── app.py                 # Streamlit frontend
├── requirements.txt       # Dependencies
├── data/                  # FAISS vector store
├── README.md              # Description and instructions
├── .env                   # API keys (not committed to Git)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;requirements.txt&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pandas
PyPDF2
streamlit
faiss-cpu
langchain
langchain-community
langchain-groq
langchain-google-genai
sentence-transformers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;app.py&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;streamlit&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PyPDF2&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PdfReader&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.text_splitter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains.question_answering&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_qa_chain&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HuggingFaceEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FAISS&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_google_genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ChatGoogleGenerativeAI&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_groq&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatGroq&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These cover everything from PDF parsing and embedding to using Groq and Gemini with LangChain.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 If you plan to use a different LLM provider later (like OpenAI or Mistral), you can add their package when needed.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  📂 Full Code Reference
&lt;/h2&gt;

&lt;p&gt;You can find the complete code for this project here:&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://github.com/Zlash65/rag-bot-basic" rel="noopener noreferrer"&gt;GitHub Repo - Zlash65/rag-bot-basic&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Feel free to give it a ⭐ if you like it!&lt;/p&gt;




&lt;h2&gt;
  
  
  🔍 Full Walkthrough of &lt;code&gt;main()&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;This walkthrough follows the top-down order of the &lt;code&gt;main()&lt;/code&gt; function in your app and explains utility functions in-depth as they're introduced.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. 🚀 App Initialization and State Setup
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
  &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_page_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RAG PDFBot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;centered&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;👽 RAG PDFBot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Chat with multiple PDFs :books:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The app starts by setting a page title and layout, followed by a heading and caption. This makes the UI friendly and helps users immediately understand the purpose of the app.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdfs_submitted&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector_store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdf_files&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;last_provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unsubmitted_files&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This loop initializes &lt;code&gt;st.session_state&lt;/code&gt; with default values. Streamlit reruns the script on every interaction, so we use the session state to persist key pieces of information across reruns, such as uploaded files, model selections, chat history, and whether reprocessing is needed.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. 🎛️ Sidebar: Model Configuration UI
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sidebar&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expander&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;⚙️ Configuration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expanded&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sidebar groups all model-related configurations inside an expandable section to keep the interface organized and uncluttered.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Below the imports, define model options.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;MODEL_OPTIONS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Groq&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;playground&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://console.groq.com/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;models&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama-3.1-8b-instant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3-70b-8192&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;playground&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://ai.google.dev&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;models&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.0-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Inside the sidebar expander, add this.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;      &lt;span class="n"&gt;model_provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;selectbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔌 Model Provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Select a model provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Groq&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Select a model provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user is required to pick either Groq or Gemini as their LLM provider. If nothing is selected, the function returns early. This prevents the rest of the interface from loading and avoids initializing model-specific logic prematurely.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Still inside the same sidebar expander, add this.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;      &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔑 Enter your API Key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Get API key from [here](&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MODEL_OPTIONS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;playground&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This input prompts the user for their API key. It dynamically links to the relevant model provider's API dashboard. Again, if no key is entered, we return early, which ensures no model or embedding operations run without credentials.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Still inside the same sidebar expander, add this.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;      &lt;span class="n"&gt;models&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODEL_OPTIONS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;models&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;selectbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🧠 Select a model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The available model options are loaded based on the selected provider. For Groq, it includes LLaMA variants, and for Gemini, it includes Gemini 2.0 and 2.5.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. 📥 PDF Upload and Submission
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Still inside the same sidebar expander, add this.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;      &lt;span class="n"&gt;uploaded_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;file_uploader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;📚 Upload PDFs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;accept_multiple_files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdf_uploader&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;uploaded_files&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;uploaded_files&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unsubmitted_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Users can upload one or more PDFs. If the uploaded files differ from the current state, we flag them as “unsubmitted.” This allows us to prompt users later to submit their files explicitly, avoiding silent reprocessing.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Still inside the same sidebar expander, add this.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;button&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;➡️ Submit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;uploaded_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;spinner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing PDFs...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="nf"&gt;process_and_store_pdfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uploaded_files&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uploaded_files&lt;/span&gt;
            &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unsubmitted_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
            &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PDFs processed successfully!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;icon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No files uploaded.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the user clicks Submit, we process the uploaded files and generate their vector representation. This is done only after confirmation to prevent accidental reprocessing or loading large documents unintentionally.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 ⚙️ Utility: &lt;code&gt;process_and_store_pdfs()&lt;/code&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add this function above your &lt;code&gt;main()&lt;/code&gt; function.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_and_store_pdfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdfs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="n"&gt;raw_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_pdf_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdfs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_text_chunks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_vectorstore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;
  &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdfs_submitted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function is the core of the ingestion pipeline. It extracts all text from the uploaded PDFs, chunks that text into manageable overlapping segments, embeds them, stores them in a FAISS vectorstore, and then keeps the store in session memory.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.2 📄 &lt;code&gt;get_pdf_text()&lt;/code&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add this function above your &lt;code&gt;process_and_store_pdfs()&lt;/code&gt; function.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_pdf_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdf_files&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pdf_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PdfReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extract_text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function loops through each uploaded PDF and extracts raw text from every page using &lt;code&gt;PyPDF2&lt;/code&gt;. If a page doesn't contain extractable text (like a scanned image), &lt;code&gt;extract_text()&lt;/code&gt; may return &lt;code&gt;None&lt;/code&gt;, so we use &lt;code&gt;or ""&lt;/code&gt; to ensure the process doesn’t fail. The function returns a long string containing all text concatenated together.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
If you upload a PDF with 2 pages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Page 1: &lt;code&gt;"Terms and Conditions"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Page 2: &lt;code&gt;"Refunds will not be issued after 30 days."&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then this function will return:&lt;br&gt;
&lt;code&gt;"Terms and ConditionsRefunds will not be issued after 30 days."&lt;/code&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  3.3 ✂️ &lt;code&gt;get_text_chunks()&lt;/code&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add this function below your &lt;code&gt;get_pdf_text()&lt;/code&gt; function.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_text_chunks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="n"&gt;splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function breaks the extracted text into overlapping chunks using LangChain’s &lt;code&gt;RecursiveCharacterTextSplitter&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;chunk_size&lt;/code&gt; is set to 5000 characters, and each chunk overlaps the previous one by 500 characters. This ensures that if important context spans across two chunks, the LLM doesn’t lose meaning due to hard boundaries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let’s say you have the following 150-character text:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"In case of early termination, the lessee shall forfeit the security deposit. A 60-day written notice is mandatory for all cancellations."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With a chunk size of 80 and an overlap of 20, the chunks would be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Chunk 1:
&lt;code&gt;"In case of early termination, the lessee shall forfeit the security deposit."&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Chunk 2:
&lt;code&gt;"shall forfeit the security deposit. A 60-day written notice is mandatory"&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This overlap ensures LLMs don’t miss context when answering questions later.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.4 🧠 &lt;code&gt;get_vectorstore()&lt;/code&gt; and &lt;code&gt;get_embeddings()&lt;/code&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add these function below your &lt;code&gt;get_text_chunks()&lt;/code&gt; function.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_vectorstore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FAISS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_texts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save_local&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./data/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_vector_store.faiss&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function creates the actual FAISS vectorstore. It first retrieves the appropriate embedding function using &lt;code&gt;get_embeddings()&lt;/code&gt;, applies it to each chunk, and stores the resulting vectors in a FAISS index saved locally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;groq&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;HuggingFaceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sentence-transformers/all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;models/embedding-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;google_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unsupported provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since LangChain does not yet offer an official embedding model for Groq, we use a general-purpose HuggingFace embedding model called &lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt;, which works well for a wide range of semantic search tasks. For Gemini, LangChain provides official support for Google’s embedding API (&lt;code&gt;models/embedding-001&lt;/code&gt;), which integrates seamlessly and is optimized for use with Gemini models.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. 🔁 Auto-Reprocess on Provider Change
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Place this right below the Submit button inside the same sidebar block.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;spinner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reprocessing PDFs...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="nf"&gt;process_and_store_pdfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf_files&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PDFs reprocessed successfully!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;icon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔁&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the user changes the provider (e.g., from Groq to Gemini), the system automatically reprocesses existing PDFs using the new embedding model. This prevents mismatched embeddings and ensures consistency.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. 🛠 Sidebar Tools
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Still inside the sidebar but outside the config expander, add this.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expander&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🛠️ Tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expanded&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
      &lt;span class="n"&gt;col1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;col1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;button&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔄 Reset&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Select a model provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rerun&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;col2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;button&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🧹 Clear Chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdfs_submitted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Chat and PDF cleared.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;icon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🧼&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;col3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;button&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;↩️ Undo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rerun&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The following tools allow users to reset, clear, or undo chat state, making the UI much more usable and fault-tolerant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reset:&lt;/strong&gt; Clears everything and resets the dropdown&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clear Chat:&lt;/strong&gt; Removes chat and uploaded data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Undo:&lt;/strong&gt; Removes the last chat interaction only&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  6. 📎 Show Uploaded File List
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add this function below your &lt;code&gt;process_and_store_pdfs()&lt;/code&gt; function.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;render_uploaded_files&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
  &lt;span class="n"&gt;pdf_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pdf_files&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pdf_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expander&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**📎 Uploaded Files:**&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
      &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pdf_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;render_uploaded_files()&lt;/code&gt; function shows the names of all submitted PDF files in a collapsible section. It gives users quick visual confirmation of which files are currently active in the chatbot.&lt;/p&gt;

&lt;p&gt;We only call this function &lt;strong&gt;after the PDFs have been submitted and processed&lt;/strong&gt;, using:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Place this outside the sidebar, right &lt;em&gt;below&lt;/em&gt; the entire sidebar block in &lt;code&gt;main()&lt;/code&gt;.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdfs_submitted&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="nf"&gt;render_uploaded_files&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This avoids showing the file list prematurely or when the files are not yet embedded, keeping the UI clean and relevant.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. 📖 Show Chat History
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add this inside &lt;code&gt;main()&lt;/code&gt; just after uploaded files list.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
      &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
      &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This loop recreates all past questions and responses in chat-style bubbles.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. ⚠️ Warn About Unsubmitted Files
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add this inside &lt;code&gt;main()&lt;/code&gt; after chat history.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unsubmitted_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;📄 New PDFs uploaded. Please submit before chatting.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prevents the user from asking questions before submitting the newly uploaded PDFs, ensuring only processed documents are used for answering.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. 💬 Chat Input and Answer Generation
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add this next inside &lt;code&gt;main()&lt;/code&gt; after the unsubmitted check.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdfs_submitted&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;💬 Ask a Question from the PDF Files&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;spinner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Thinking...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
          &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_qa_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
              &lt;span class="n"&gt;return_only_outputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
            &lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;pdf_names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pdf_files&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pdf_names&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;📄 Please upload and submit PDFs to start chatting.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This section renders the chat input only &lt;strong&gt;after&lt;/strong&gt; PDFs have been successfully submitted. When a user asks a question, it performs a similarity search over the FAISS vector store to find relevant document chunks. It then sends those chunks and the user’s question to the selected LLM (Groq or Gemini) using a prompt chain, and returns a detailed answer. The result, along with metadata, is saved into session state for chat history and optional download.&lt;/p&gt;

&lt;h3&gt;
  
  
  🧠 9.1 &lt;code&gt;get_qa_chain()&lt;/code&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add this function below your &lt;code&gt;get_vectorstore()&lt;/code&gt; function.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_qa_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PromptTemplate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Answer the question as detailed as possible.
    If the question cannot be answered using the provided context, please say &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t know.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;

    Context:
    {context}

    Question:
    {question}?

    Answer:
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;input_variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatGroq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;groq&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="nc"&gt;ChatGoogleGenerativeAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;load_qa_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stuff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This utility creates a custom QA chain using LangChain’s &lt;code&gt;load_qa_chain&lt;/code&gt; method. The chain is configured to respond strictly based on context, and not to hallucinate when the answer isn’t found. It supports both Groq and Gemini as LLM backends, selecting the appropriate one based on the provider.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. 💾 Download Chat History
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Add this function below your &lt;code&gt;render_uploaded_files()&lt;/code&gt; function.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;render_download_chat_history&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
  &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Model Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PDF File&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expander&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**📎 Download Chat History:**&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sidebar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;download_button&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;📥 Download Chat History&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;mime&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text/csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function creates a downloadable CSV file containing the full chat history. It uses Pandas to format the data and adds a download button in the sidebar for users to save their conversations along with model and file metadata.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;At the bottom of &lt;code&gt;main()&lt;/code&gt;, add this.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;render_download_chat_history&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This checks if there’s any chat history and, if so, calls the utility function to render the download option.&lt;/p&gt;

&lt;h3&gt;
  
  
  💡 What This Does
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;If any chat has occurred, a new expander shows up in the sidebar.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Users can download a &lt;code&gt;.csv&lt;/code&gt; file with all conversation metadata:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Questions, Answers&lt;/li&gt;
&lt;li&gt;Model Provider and Model Name&lt;/li&gt;
&lt;li&gt;PDF files used&lt;/li&gt;
&lt;li&gt;Timestamp of each entry&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  ✅ Final Behavior Example
&lt;/h3&gt;

&lt;p&gt;A downloaded CSV might look like:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Question&lt;/th&gt;
&lt;th&gt;Answer&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Model Name&lt;/th&gt;
&lt;th&gt;PDF File&lt;/th&gt;
&lt;th&gt;Timestamp&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;What’s the lease end date?&lt;/td&gt;
&lt;td&gt;June 30, 2025&lt;/td&gt;
&lt;td&gt;Groq&lt;/td&gt;
&lt;td&gt;llama-3.1-8b-instant&lt;/td&gt;
&lt;td&gt;lease.pdf&lt;/td&gt;
&lt;td&gt;2025-07-04 14:02:00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Who signed the agreement?&lt;/td&gt;
&lt;td&gt;John Doe&lt;/td&gt;
&lt;td&gt;Groq&lt;/td&gt;
&lt;td&gt;llama-3.1-8b-instant&lt;/td&gt;
&lt;td&gt;lease.pdf&lt;/td&gt;
&lt;td&gt;2025-07-04 14:03:15&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  11. 🎬 Launch the Script
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;🧾 &lt;strong&gt;Finally, call &lt;code&gt;main()&lt;/code&gt; at the bottom of your file.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This standard Python entry point ensures the app runs when &lt;code&gt;app.py&lt;/code&gt; is executed directly.&lt;/p&gt;




&lt;h2&gt;
  
  
  12. 🏁 How to Run the Code
&lt;/h2&gt;

&lt;p&gt;Once everything is set up, running the app is simple. Just use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;streamlit run app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command will start a local web server and open the chatbot in your browser - usually at &lt;a href="http://localhost:8501" rel="noopener noreferrer"&gt;http://localhost:8501&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpyuomkyjnmrfnf4huj5d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpyuomkyjnmrfnf4huj5d.png" alt="rag-bot-basic" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From there, you’ll be able to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Upload one or more PDFs&lt;/li&gt;
&lt;li&gt;Choose a model provider (Groq or Gemini)&lt;/li&gt;
&lt;li&gt;Enter your API key&lt;/li&gt;
&lt;li&gt;Start asking questions based on your documents&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🧠 If the browser doesn’t open automatically, just copy the &lt;code&gt;localhost&lt;/code&gt; URL from the terminal and paste it into your browser.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  💭 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;That’s it - our chatbot is ready to rock.&lt;/p&gt;

&lt;p&gt;It reads PDFs, finds context using RAG, and gives relevant answers using your choice of LLM - all from a clean, modular setup that’s easy to extend.&lt;/p&gt;

&lt;p&gt;Here are some ideas you can explore next:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🧠 Add memory so it remembers past messages&lt;/li&gt;
&lt;li&gt;📁 Support multiple PDFs at once&lt;/li&gt;
&lt;li&gt;🌐 Deploy it on the web (Streamlit Cloud or Hugging Face Spaces)&lt;/li&gt;
&lt;li&gt;🔌 Try swapping in different LLMs like Claude or Mistral&lt;/li&gt;
&lt;li&gt;🧪 Add advanced features like source highlighting or confidence scores&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn’t just a chatbot - it’s a real-world template we can build upon to create context-aware AI tools. No retraining, no black-box magic. Just good engineering and the right tools.&lt;/p&gt;

&lt;p&gt;So let’s fork it, build on it, break it, fix it - and make it ours.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Happy building. 🛠️&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>rag</category>
      <category>langchain</category>
      <category>faiss</category>
    </item>
    <item>
      <title>Building an AI Chatbot with LangChain, FastAPI &amp; Streamlit</title>
      <dc:creator>Zarrar Shaikh</dc:creator>
      <pubDate>Wed, 02 Jul 2025 18:33:07 +0000</pubDate>
      <link>https://forem.com/zlash65/building-an-ai-chatbot-with-langchain-fastapi-streamlit-377m</link>
      <guid>https://forem.com/zlash65/building-an-ai-chatbot-with-langchain-fastapi-streamlit-377m</guid>
      <description>&lt;p&gt;In this comprehensive guide, we’ll build a robust and modular AI chatbot from scratch using powerful, free-to-use large language models (LLMs) like Groq and Gemini. We’ll integrate optional web search functionality to enhance the chatbot’s responses.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk2je8d31spnecv4c0y2w.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk2je8d31spnecv4c0y2w.gif" alt="Agentic AI Chatbot demo GIF" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Throughout this project, we’ll learn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dynamically select AI models
&lt;/li&gt;
&lt;li&gt;Create responsive AI agents
&lt;/li&gt;
&lt;li&gt;Develop structured backend services
&lt;/li&gt;
&lt;li&gt;Design an intuitive frontend user interface
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the end of this guide, we’ll have a solid understanding of how to integrate AI models into practical applications and how to maintain code modularity, enabling easy extension and future-proofing of our projects.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Technologies Explained
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LLM (Large Language Model):&lt;/strong&gt; Advanced AI models capable of understanding context and generating human-like text. &lt;a href="https://en.wikipedia.org/wiki/Large_language_model" rel="noopener noreferrer"&gt;Read more&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Groq:&lt;/strong&gt; Provides extremely fast inference using open‑source models like LLaMA. It’s free to use and ideal for rapid prototyping. &lt;a href="https://groq.com/" rel="noopener noreferrer"&gt;Learn more&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini:&lt;/strong&gt; Google’s advanced LLM with strong performance in reasoning and dialogue tasks. &lt;a href="https://ai.google.dev/" rel="noopener noreferrer"&gt;Explore Gemini&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangChain:&lt;/strong&gt; Simplifies the use of LLMs by providing tools for prompts, memory management, chaining models with various tools and APIs. &lt;a href="https://python.langchain.com/docs/get_started/introduction" rel="noopener noreferrer"&gt;LangChain documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangGraph:&lt;/strong&gt; A LangChain extension to structure and manage stateful AI agents, allowing complex decision‑making processes. &lt;a href="https://github.com/langchain-ai/langgraph" rel="noopener noreferrer"&gt;LangGraph Overview&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FastAPI:&lt;/strong&gt; A modern, high‑performance Python framework for quickly building robust APIs. It’s intuitive, fast, and easy to learn. &lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI docs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streamlit:&lt;/strong&gt; Enables us to build interactive and attractive web applications quickly and effortlessly using Python. &lt;a href="https://streamlit.io/" rel="noopener noreferrer"&gt;Streamlit official site&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why Groq and Gemini and not OpenAI?
&lt;/h3&gt;

&lt;p&gt;We chose Groq and Gemini instead of OpenAI primarily because OpenAI models are not available for free. Groq and Gemini, on the other hand, provide free access to advanced, capable language models like LLaMA and Google’s own Gemini models, making them perfect for experimenting, learning, and prototyping without worrying about burning through our pocket money and living like a hermit for the rest of the month.&lt;/p&gt;




&lt;h2&gt;
  
  
  Project Overview
&lt;/h2&gt;

&lt;h3&gt;
  
  
  System Design
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffparitczooz0pemhd8da.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffparitczooz0pemhd8da.png" alt="Agentic AI Chatbot System Design" width="800" height="571"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Code Structure
&lt;/h3&gt;

&lt;p&gt;You can find the source code here - &lt;a href="https://github.com/Zlash65/agentic-ai-chatbot-example" rel="noopener noreferrer"&gt;Zlash65/agentic-ai-chatbot-example&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agentic-ai-chatbot-example/
├── .env
├── requirements.txt
├── main.py                     # FastAPI entry point
├── agents/                     # Phase 1 - AI logic
│   ├── llm_provider.py         # Select LLM provider
│   ├── tools.py                # Tavily search tool
│   └── ai_agent.py             # Build LangGraph agent
├── backend/                    # Phase 2 - Backend API
│   ├── config.py               # Load env vars
│   ├── schema.py               # Pydantic schema
│   ├── router.py               # /chat endpoint
├── frontend/                   # Phase 3 - Streamlit UI
│   └── streamlit_app.py        # Streamlit chat UI
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Phase 1 - AI Agent Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;llm_provider.py&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This code snippet dynamically chooses the appropriate AI model provider (Groq or Gemini). We structured this code to be modular, allowing easy future addition of more LLM providers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_groq&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatGroq&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_google_genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatGoogleGenerativeAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;backend.config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GROQ_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;GOOGLE_API_KEY&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;groq&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;GROQ_API_KEY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GROQ_API_KEY is not set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ChatGroq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;GROQ_API_KEY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_API_KEY is not set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ChatGoogleGenerativeAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid model provider: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;tools.py&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;We conditionally include a web search tool. This gives users the flexibility to enable or disable web searching as needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.tools.tavily_search&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TavilySearchResults&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_tools&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;allow_search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;allow_search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TavilySearchResults&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;ai_agent.py&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This function is the core engine of our AI chatbot. It actually calls the LLM (Groq or Gemini), optionally enables tools like web search, and returns the AI-generated message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.prebuilt&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_react_agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.messages&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AIMessage&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents.llm_provider&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_llm&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_tools&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_response_from_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;allow_search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_tools&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;allow_search&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_react_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_messages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;

    &lt;span class="n"&gt;ai_messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ai_messages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ai_messages&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No response from AI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why only the last user message?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We’re simulating a stateless, simple interaction to keep responses fast and reduce LLM token usage.
&lt;/li&gt;
&lt;li&gt;Passing all previous messages would allow the LLM to remember context and have a longer conversation memory.
&lt;/li&gt;
&lt;li&gt;We can extend this later by feeding in more &lt;code&gt;HumanMessage&lt;/code&gt; and &lt;code&gt;AIMessage&lt;/code&gt; history for memory support.
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Phase 2 - Backend Setup (FastAPI)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;config.py&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Securely load environment variables from the &lt;code&gt;.env&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;GROQ_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GROQ_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;GOOGLE_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;TAVILY_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TAVILY_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;schema.py&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Define structured input from the frontend:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;allow_search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;router.py&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;APIRouter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;backend.schema&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatRequest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents.ai_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_response_from_agent&lt;/span&gt;

&lt;span class="n"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;APIRouter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;ALLOWED_PROVIDERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;groq&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;ALLOWED_MODELS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama-3.1-8b-instant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3-70b-8192&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.0-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nd"&gt;@router.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ChatRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model_provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;
    &lt;span class="n"&gt;allow_search&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;allow_search&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model_provider&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ALLOWED_PROVIDERS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid model provider: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Must be one of &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ALLOWED_PROVIDERS&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ALLOWED_MODELS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid model name: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Must be one of &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ALLOWED_MODELS&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_response_from_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;allow_search&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;allow_search&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;user_messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ALLOWED_PROVIDERS&lt;/code&gt; &amp;amp; &lt;code&gt;ALLOWED_MODELS&lt;/code&gt; - safety checks to block unsupported data.
&lt;/li&gt;
&lt;li&gt;We define a &lt;strong&gt;POST&lt;/strong&gt; endpoint at &lt;code&gt;/chat&lt;/code&gt;; FastAPI auto‑validates the body against &lt;code&gt;ChatRequest&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;The validated data flows into &lt;code&gt;get_response_from_agent()&lt;/code&gt;, which handles model init, tools, and response building.
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Phase 3 - Frontend Setup (Streamlit)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;streamlit_app.py&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Find the full code &lt;a href="https://github.com/Zlash65/agentic-ai-chatbot-example/blob/main/frontend/streamlit_app.py" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Dynamic Model Selection
&lt;/h4&gt;

&lt;p&gt;To prevent invalid provider‑model combinations, we dynamically update model selections based on the chosen provider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;MODEL_OPTIONS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Groq&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama-3.1-8b-instant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3-70b-8192&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.0-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;model_provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;selectbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔌 Model Provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MODEL_OPTIONS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;span class="n"&gt;model_choices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODEL_OPTIONS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model_provider&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model_choices&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model_choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;selectbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🧠 Model Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_choices&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Running Our Application
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Terminal tab 1 - backend&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
uvicorn main:app &lt;span class="nt"&gt;--reload&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Terminal tab 2 - frontend&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
streamlit run frontend/streamlit_app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Final Chatbot in Action
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F484tcwsqbzz7icynpjgp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F484tcwsqbzz7icynpjgp.png" alt="Agentic AI Chatbot" width="800" height="432"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 Wrapping Up
&lt;/h2&gt;

&lt;p&gt;By following this detailed guide, we’ve learned how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dynamically integrate multiple powerful AI providers
&lt;/li&gt;
&lt;li&gt;Construct structured backend services using FastAPI
&lt;/li&gt;
&lt;li&gt;Create user‑friendly interfaces with Streamlit
&lt;/li&gt;
&lt;li&gt;Keep our code modular and easy to extend
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔧 What We Can Explore Next
&lt;/h2&gt;

&lt;p&gt;Now that we have a solid foundation, here are a few ideas to level up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Add context memory&lt;/strong&gt; - retain previous questions and answers for richer conversations.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable streaming responses&lt;/strong&gt; - stream long responses token by token for a smoother user experience.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plug in more tools&lt;/strong&gt; - let the AI do math, call APIs, or browse docs via LangChain tools.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy our app&lt;/strong&gt; - share your chatbot with the world on Render, Railway, or Hugging Face Spaces.
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🚀 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;This project isn’t just a chatbot - it’s a starter kit for any LLM‑powered product. Whether you’re prototyping an assistant, automating research, or experimenting with multi‑agent workflows, this architecture gives you room to grow.&lt;/p&gt;

&lt;p&gt;So tweak it, break it, improve it - and most importantly, &lt;strong&gt;have fun while learning&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Happy building! 💻✨&lt;/p&gt;

</description>
      <category>llm</category>
      <category>genai</category>
      <category>langchain</category>
      <category>fastapi</category>
    </item>
    <item>
      <title>Build a Gen-AI Dockerfile Generator with AWS Bedrock, Lambda and Terraform</title>
      <dc:creator>Zarrar Shaikh</dc:creator>
      <pubDate>Wed, 02 Jul 2025 13:40:19 +0000</pubDate>
      <link>https://forem.com/zlash65/build-a-gen-ai-dockerfile-generator-with-aws-bedrock-lambda-and-terraform-17n8</link>
      <guid>https://forem.com/zlash65/build-a-gen-ai-dockerfile-generator-with-aws-bedrock-lambda-and-terraform-17n8</guid>
      <description>&lt;p&gt;Let’s be honest - AWS is powerful but not always the friendliest thing to set up. What started as a simple curiosity - “Can I generate a Dockerfile using an Amazon Bedrock?” - quickly turned into a full-blown, end-to-end project.&lt;/p&gt;

&lt;p&gt;In the process, I learned how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set up and use Amazon Bedrock&lt;/li&gt;
&lt;li&gt;Create an AWS Lambda function that talks to an LLM&lt;/li&gt;
&lt;li&gt;Use Terraform to build and tear down infrastructure automatically&lt;/li&gt;
&lt;li&gt;Deal with Bedrock’s quirks and debug confusing prompt behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re new to Bedrock, this project is a great way to learn how to actually use it - not just browse through the documentation. We’ll build a real service that accepts a programming language name via an HTTP request and returns a Dockerfile generated by an LLM. All of this is done using AWS Lambda, API Gateway, S3, and Bedrock - with Terraform managing the setup.&lt;/p&gt;

&lt;p&gt;Here’s what we’re building:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqpzwt3zb4n7mo6bweuvp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqpzwt3zb4n7mo6bweuvp.png" alt="Architecture" width="800" height="862"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We’ll go step by step. You don’t need to know everything up front. By the end, you’ll understand how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configure Amazon Bedrock and request model access&lt;/li&gt;
&lt;li&gt;Write a Python-based Lambda function that calls Bedrock&lt;/li&gt;
&lt;li&gt;Store and retrieve files in S3 securely&lt;/li&gt;
&lt;li&gt;Wire everything together with API Gateway&lt;/li&gt;
&lt;li&gt;Use Terraform to deploy and manage all of it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s get started.&lt;/p&gt;


&lt;h2&gt;
  
  
  Create an AWS Account (Skip if already have one)
&lt;/h2&gt;

&lt;p&gt;Go to &lt;a href="//aws.amazon.com"&gt;aws.amazon.com&lt;/a&gt; and create an account.&lt;br&gt;
Once you’re in, head to the AWS Management Console.&lt;/p&gt;


&lt;h2&gt;
  
  
  Enable Bedrock and Request Model Access
&lt;/h2&gt;

&lt;p&gt;In the search bar, type Amazon Bedrock and open it. Choose a region that supports Bedrock models - I used &lt;code&gt;ap-south-1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In the left sidebar, click &lt;em&gt;Model access&lt;/em&gt; and request access to &lt;strong&gt;Meta Llama 3 8B Instruct&lt;/strong&gt;. That’s the model we’ll use to generate the Dockerfile.&lt;/p&gt;

&lt;p&gt;Access might take a few minutes to be granted.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbostez1gh8hslnfai4v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbostez1gh8hslnfai4v.png" alt="Amazon Bedrock" width="800" height="410"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also use any other model of your choice. I chose this model because of its low price and because I have been using Llama model for local LLM.&lt;/p&gt;


&lt;h2&gt;
  
  
  Install Terraform CLI
&lt;/h2&gt;

&lt;p&gt;We’ll use Terraform to set up all the infrastructure - Lambda, API Gateway, IAM roles, and so on. Install Terraform CLI locally:&lt;/p&gt;

&lt;p&gt;On macOS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew tap hashicorp/tap
brew &lt;span class="nb"&gt;install &lt;/span&gt;hashicorp/tap/terraform
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or follow the instructions for your OS: &lt;a href="https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli" rel="noopener noreferrer"&gt;Install Terraform&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Create a Terraform Cloud Account
&lt;/h2&gt;

&lt;p&gt;Go to &lt;a href="https://app.terraform.io/" rel="noopener noreferrer"&gt;Terraform Cloud&lt;/a&gt; and sign up.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a new organization. I named mine &lt;strong&gt;zlash65-ai-ml&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Inside the org, create a workspace named &lt;strong&gt;aws-bedrock-example&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’ll connect this workspace to GitHub later to deploy infrastructure from code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Set Up AWS Credentials for Terraform
&lt;/h2&gt;

&lt;h4&gt;
  
  
  In the AWS Console:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;em&gt;IAM&lt;/em&gt; &amp;gt; &lt;em&gt;Users&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Create a user called &lt;strong&gt;Terraform User&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Enable &lt;strong&gt;programmatic access&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Attach the &lt;strong&gt;AdministratorAccess&lt;/strong&gt; policy&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;This gives Terraform permission to create any AWS resource&lt;/li&gt;
&lt;li&gt;If you’re a paranoid person (and rightly so), you can limit the scope to only services you need - like Lambda, S3, IAM, Bedrock, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ux4zm0inv4fta9u1x34.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ux4zm0inv4fta9u1x34.png" alt="IAM User" width="800" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Now back in Terraform Cloud:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;em&gt;Organization Settings&lt;/em&gt; &amp;gt; &lt;em&gt;Variable Sets&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Click on &lt;em&gt;Create organization variable set&lt;/em&gt; button&lt;/li&gt;
&lt;li&gt;Add:

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Attach the variable set to the &lt;strong&gt;aws-bedrock-example&lt;/strong&gt; workspace.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0vdkob64b1db4jpb752.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0vdkob64b1db4jpb752.png" alt="Organization Variable Set" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Terraform?
&lt;/h2&gt;

&lt;p&gt;If you’ve never used Terraform before, think of it like this: instead of setting everything up manually in the AWS Console - clicking through pages, configuring settings, creating roles - you write everything you need in a few &lt;code&gt;.tf&lt;/code&gt; files. Terraform reads those files and builds the infrastructure for you.&lt;/p&gt;

&lt;p&gt;That might sound like extra work upfront, but here’s why it’s actually a huge win:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trackability:&lt;/strong&gt; Everything is written in code. You can track it in version control, collaborate with others, and roll back changes if something breaks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reusability:&lt;/strong&gt; Want to recreate the same setup for another region or project? Just tweak a variable and redeploy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automation:&lt;/strong&gt; With a single command, Terraform provisions AWS services like Lambda, API Gateway, S3, IAM roles, and more.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clean Teardown:&lt;/strong&gt; Done with your experiment? A simple terraform destroy removes every resource you created. No more forgetting something and getting billed for it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For this project specifically, Terraform helps us avoid a manual setup that would’ve involved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creating a Lambda function and uploading a custom layer&lt;/li&gt;
&lt;li&gt;Writing IAM roles and policies manually&lt;/li&gt;
&lt;li&gt;Setting up an S3 bucket with the correct permissions&lt;/li&gt;
&lt;li&gt;Building an API Gateway that connects to the Lambda function&lt;/li&gt;
&lt;li&gt;Wiring it all together&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By using Terraform, we’re making this setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Faster to build&lt;/li&gt;
&lt;li&gt;Easier to understand&lt;/li&gt;
&lt;li&gt;Simpler to reproduce and delete&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re building anything that involves multiple AWS services, Terraform isn’t just a nice-to-have - it’s a necessity.&lt;/p&gt;

&lt;p&gt;Instead of creating Lambda, S3, and API Gateway one by one, we’ll just run a command and let Terraform do the work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftonucvq769urrvrdm7ya.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftonucvq769urrvrdm7ya.png" alt="Terraform Resource Creation" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Set Up Project Structure
&lt;/h2&gt;

&lt;p&gt;Create a GitHub repo called &lt;code&gt;aws-bedrock-example&lt;/code&gt;. Then clone it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/YOUR_USERNAME/aws-bedrock-example.git
&lt;span class="nb"&gt;cd &lt;/span&gt;aws-bedrock-example
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create this structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws-bedrock-example/
├── .gitignore
├── README.md
├── lambda/
└── terraform/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add &lt;code&gt;.gitignore&lt;/code&gt; and &lt;code&gt;.terraformignore&lt;/code&gt; with Python and Terraform-specific ignores.&lt;/p&gt;




&lt;h2&gt;
  
  
  Write Lambda Code
&lt;/h2&gt;

&lt;p&gt;We’ll use &lt;strong&gt;Python&lt;/strong&gt; and &lt;strong&gt;boto3&lt;/strong&gt; to talk to Amazon Bedrock.&lt;/p&gt;

&lt;p&gt;Add the following to &lt;code&gt;requirements.txt&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;boto3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Lambda Code: Unpacked
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;app.py&lt;/code&gt; file is the heart of our project.&lt;/p&gt;

&lt;p&gt;Here's a breakdown of what each function does:&lt;/p&gt;

&lt;p&gt;✅ &lt;code&gt;generate_dockerfile(language: str) -&amp;gt; str&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This function creates a structured prompt and sends it to Amazon Bedrock to generate a Dockerfile based on the language input.&lt;/p&gt;

&lt;p&gt;We build a plain prompt that tells the model exactly what to return. Avoiding extra formatting like chat headers helps get a usable output. We’ll use a simplified prompt format (more on that later) -&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_dockerfile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="n"&gt;formatted_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
  ONLY generate an ideal Dockerfile for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; with best practices. Do not provide any explanation.
  Include:
  - Base image
  - Installing dependencies
  - Setting working directory
  - Adding source code
  - Running the application
  &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These parameters control the model’s output length and creativity -&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;formatted_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_gen_len&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;top_p&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This block calls Bedrock, reads the JSON response, and extracts the generated Dockerfile -&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;bedrock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-south-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                            &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;botocore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read_timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_attempts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;}))&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;meta.llama3-8b-instruct-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response_content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dockerfile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dockerfile&lt;/span&gt;
  &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error generating Dockerfile:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✅ &lt;code&gt;save_dockerfile(s3_key, s3_bucket, dockerfile)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This function uploads the Dockerfile to S3 and returns a presigned URL.&lt;/p&gt;

&lt;p&gt;Uploads the Dockerfile into a folder named dockerfiles/ inside the specified bucket -&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_dockerfile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s3_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s3_bucket&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dockerfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expiry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="n"&gt;s3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-south-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endpoint_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://s3.ap-south-1.amazonaws.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dockerfile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;s3_bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;s3_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;ContentType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text/plain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Dockerfile saved to S3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns a temporary, secure URL so anyone can download the file without needing AWS credentials -&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_presigned_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get_object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;s3_bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;s3_key&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;ExpiresIn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;expiry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✅ &lt;code&gt;handler(event, context)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This is the entrypoint Lambda uses when triggered by API Gateway.&lt;/p&gt;

&lt;p&gt;Extracts the language from the incoming POST body -&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;language&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;language&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the Dockerfile was generated successfully, we:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store it in S3 with a timestamped key&lt;/li&gt;
&lt;li&gt;Generate a presigned URL&lt;/li&gt;
&lt;li&gt;Return a 200 status with the URL&lt;/li&gt;
&lt;li&gt;If something failed, we return a 500 error
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;dockerfile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate_dockerfile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;%H-%M-%S&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;s3_bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zlash65-aws-bedrock-example&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dockerfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;s3_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dockerfiles/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;current_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.Dockerfile&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;dockerfile_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;save_dockerfile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s3_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s3_bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dockerfile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Dockerfile generated&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;dockerfile_url&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to generate Dockerfile&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Create Lambda Build Script
&lt;/h2&gt;

&lt;p&gt;In the root directory, add &lt;code&gt;build.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; lambda_build lambda.zip
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; lambda_build
pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; lambda/requirements.txt &lt;span class="nt"&gt;-t&lt;/span&gt; lambda_build
&lt;span class="nb"&gt;cp &lt;/span&gt;lambda/app.py lambda_build
&lt;span class="nb"&gt;cd &lt;/span&gt;lambda_build &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; zip &lt;span class="nt"&gt;-r&lt;/span&gt; ../lambda.zip &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ..
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x build.sh
./build.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This script installs dependencies and packages the Lambda function into a zip file for Terraform to deploy.&lt;/p&gt;




&lt;h2&gt;
  
  
  Write Terraform Code
&lt;/h2&gt;

&lt;p&gt;Inside the &lt;code&gt;terraform/&lt;/code&gt; folder, create the following files and fill them with the linked code:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Zlash65/aws-bedrock-example/blob/main/terraform/backends.tf" rel="noopener noreferrer"&gt;&lt;code&gt;backends.tf&lt;/code&gt;&lt;/a&gt; - Connects to your Terraform Cloud workspace.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Zlash65/aws-bedrock-example/blob/main/terraform/providers.tf" rel="noopener noreferrer"&gt;&lt;code&gt;providers.tf&lt;/code&gt;&lt;/a&gt; - Sets the AWS provider and region.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Zlash65/aws-bedrock-example/blob/main/terraform/variables.tf" rel="noopener noreferrer"&gt;&lt;code&gt;variables.tf&lt;/code&gt;&lt;/a&gt; - Stores common variables like region.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Zlash65/aws-bedrock-example/blob/main/terraform/outputs.tf" rel="noopener noreferrer"&gt;&lt;code&gt;outputs.tf&lt;/code&gt;&lt;/a&gt; - Outputs the API URL and Lambda function name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Zlash65/aws-bedrock-example/blob/main/terraform/bedrock.tf" rel="noopener noreferrer"&gt;&lt;code&gt;bedrock.tf&lt;/code&gt;&lt;/a&gt; - This code sets up logging for Amazon Bedrock model invocations using CloudWatch Logs. Basically, if your Bedrock model fails or returns an unexpected response and you don’t see anything in the Lambda logs - this setup helps capture what’s happening behind the scenes in Bedrock itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://github.com/Zlash65/aws-bedrock-example/blob/main/terraform/main.tf" rel="noopener noreferrer"&gt;&lt;code&gt;main.tf&lt;/code&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;This file is the central piece of infrastructure setup for the entire project.&lt;/p&gt;

&lt;h4&gt;
  
  
  🔧 What This File Does Overall
&lt;/h4&gt;

&lt;p&gt;This Terraform file provisions everything needed to expose our Lambda function as an API, give it permissions, and wire it up to Amazon Bedrock and S3. It’s the core infrastructure-as-code file for the project.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creates a Lambda function that:

&lt;ul&gt;
&lt;li&gt;Invokes Amazon Bedrock to generate Dockerfiles&lt;/li&gt;
&lt;li&gt;Stores the output in an S3 bucket&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Sets up an S3 bucket to:

&lt;ul&gt;
&lt;li&gt;Store the generated Dockerfiles&lt;/li&gt;
&lt;li&gt;Allow presigned access for download&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Configures API Gateway to:

&lt;ul&gt;
&lt;li&gt;Expose the Lambda as a public-facing HTTP POST endpoint (&lt;code&gt;/generate-dockerfile&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Grants IAM permissions to the Lambda function so it can:

&lt;ul&gt;
&lt;li&gt;Write logs to CloudWatch&lt;/li&gt;
&lt;li&gt;Read and write objects in S3&lt;/li&gt;
&lt;li&gt;Call Bedrock models via &lt;code&gt;bedrock:InvokeModel&lt;/code&gt; permission&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;In short: this file is the heart of our project infrastructure. Once applied, we get a fully working, callable API that uses AI (via Bedrock) to return a Dockerfile and give a downloadable link via S3.&lt;/p&gt;

&lt;h4&gt;
  
  
  Format everything:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;terraform
terraform &lt;span class="nb"&gt;fmt&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Push Code to GitHub and Deploy
&lt;/h2&gt;

&lt;p&gt;Commit and push your code to GitHub.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add &lt;span class="nb"&gt;.&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"feat: lambda and infra code for aws-bedrock-example"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then in Terraform Cloud:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Connect the repo to your &lt;strong&gt;aws-bedrock-example&lt;/strong&gt; workspace&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzxr0nwgu4wu0urv3uwc1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzxr0nwgu4wu0urv3uwc1.png" alt="Terraform VCS Setup" width="800" height="444"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Trigger a new run&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ya06q27mxxdy08dw1ip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ya06q27mxxdy08dw1ip.png" alt="Terraform Workspace New Run" width="800" height="166"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Review and Confirm &amp;amp; Apply&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb66vitmp9k6ov9llj3zi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb66vitmp9k6ov9llj3zi.png" alt="Terraform Apply" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Terraform will create everything: Lambda, Gateway, S3, permissions - all in one go.&lt;/p&gt;




&lt;h2&gt;
  
  
  Test Using Postman
&lt;/h2&gt;

&lt;p&gt;Once deployed, Terraform will show the &lt;strong&gt;API Gateway URL&lt;/strong&gt; in outputs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0zjzt17upbmobdqs2vw3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0zjzt17upbmobdqs2vw3.png" alt="Terraform Output" width="800" height="376"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In Postman:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set method: &lt;code&gt;POST&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;URL: &lt;code&gt;API Gateway URL&lt;/code&gt; from Terraform output&lt;/li&gt;
&lt;li&gt;Body: &lt;code&gt;{"language": "node"}&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;em&gt;Send&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If everything went well, you’ll get an &lt;code&gt;Presigned URL&lt;/code&gt; pointing to your generated Dockerfile in S3.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvftw170puqlmtww2eue8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvftw170puqlmtww2eue8.png" alt="Postman" width="800" height="348"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Debugging: The Prompt Format Got Me
&lt;/h2&gt;

&lt;p&gt;This part tripped me up for an hour or some.&lt;/p&gt;

&lt;p&gt;I followed &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-runtime_example_bedrock-runtime_InvokeModel_MetaLlama3_section.html" rel="noopener noreferrer"&gt;this official example&lt;/a&gt; and wrote my prompt like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;formatted_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
&amp;lt;|begin_of_text|&amp;gt;&amp;lt;|start_header_id|&amp;gt;user&amp;lt;|end_header_id|&amp;gt;
ONLY Generate an ideal Dockerfile for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...
&amp;lt;|eot_id|&amp;gt;&amp;lt;|start_header_id|&amp;gt;assistant&amp;lt;|end_header_id|&amp;gt;
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks right, but it didn’t work.&lt;/p&gt;

&lt;p&gt;My Lambda ran fine. No errors. But the response from Bedrock was -&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;{‘generation’: ‘’, ‘prompt_token_count’: 64, ‘generation_token_count’: 1, ‘stop_reason’: ‘stop’}&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Turns out this prompt format is for &lt;strong&gt;chat-based streaming&lt;/strong&gt;. What I needed was a &lt;strong&gt;simple single-shot prompt&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;formatted_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
ONLY generate an ideal Dockerfile for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; with best practices. Do not provide any explanation.
Include:
- Base image
- Installing dependencies
- Setting working directory
- Adding source code
- Running the application
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once I removed all the fancy tokens, it worked instantly.&lt;/p&gt;

&lt;p&gt;This is also why we enabled Bedrock logging in &lt;code&gt;bedrock.tf&lt;/code&gt;. If Lambda logs don’t help, Bedrock logs will.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxaanwnvxxsta6qjhz02d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxaanwnvxxsta6qjhz02d.png" alt="Cloudwatch Logs for Amazon Bedrock" width="800" height="281"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Folder Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws-bedrock-example/
├── .gitignore
├── .terraformignore
├── build.sh
├── lambda/
│   ├── app.py
│   └── requirements.txt
├── lambda.zip
├── README.md
└── terraform/
    ├── backends.tf
    ├── bedrock.tf
    ├── main.tf
    ├── outputs.tf
    ├── providers.tf
    └── variables.tf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🧹 Cleanup AWS Resources
&lt;/h2&gt;

&lt;p&gt;Now that our project is complete, it’s time to clean up. And because we used &lt;strong&gt;Terraform Cloud&lt;/strong&gt;, deleting everything is just as easy as provisioning it.&lt;/p&gt;

&lt;p&gt;No manual AWS cleanup. No guessing what resources you created. No surprise charges a month later. Just a few clicks.&lt;/p&gt;

&lt;p&gt;To destroy all AWS resources via Terraform Cloud:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to &lt;strong&gt;Terraform Cloud&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Open your organization and the workspace you used (e.g., &lt;strong&gt;zlash-ai-ml&lt;/strong&gt; &amp;gt; &lt;strong&gt;aws-bedrock-example&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Go to &lt;em&gt;Workspace Settings&lt;/em&gt; &amp;gt; &lt;em&gt;Destruction and Deletion&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;em&gt;Queue destroy plan&lt;/em&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2u58ttnb1pn45ue14c0n.png" alt="Terraform Destroy Trigger" width="800" height="285"&gt;
&lt;/li&gt;
&lt;li&gt;Wait for the plan to complete, then click on &lt;em&gt;Confirm and apply&lt;/em&gt; button to remove the resources from AWS
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fng37c3hzqb6z374bx3qp.png" alt="Terraform Destroy" width="800" height="452"&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Terraform Cloud will now safely tear down every AWS resource it created - Lambda function, S3 bucket, API Gateway, IAM roles - everything.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ Important: You won’t be able to recover anything once it’s destroyed, so double-check before proceeding.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;All the code is here: &lt;a href="https://github.com/Zlash65/aws-bedrock-example" rel="noopener noreferrer"&gt;Zlash65/aws-bedrock-example&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It’s working, simplified, and ready to be cloned.&lt;/p&gt;

&lt;p&gt;I hope this gives you a clear, real-world path to start building with Amazon Bedrock. If you run into weird issues—start with the prompt. Always the prompt.&lt;/p&gt;

</description>
      <category>bedrock</category>
      <category>genai</category>
      <category>llm</category>
      <category>terraform</category>
    </item>
  </channel>
</rss>
