<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Yasser B.</title>
    <description>The latest articles on Forem by Yasser B. (@geekyfox90).</description>
    <link>https://forem.com/geekyfox90</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3785198%2F1098d18e-2c79-44f4-bcdd-1d928ee365f0.png</url>
      <title>Forem: Yasser B.</title>
      <link>https://forem.com/geekyfox90</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/geekyfox90"/>
    <language>en</language>
    <item>
      <title>PostgreSQL Connection Pooling with PgBouncer: A Complete Guide</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Tue, 31 Mar 2026 15:22:09 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-connection-pooling-with-pgbouncer-a-complete-guide-2fam</link>
      <guid>https://forem.com/geekyfox90/postgresql-connection-pooling-with-pgbouncer-a-complete-guide-2fam</guid>
      <description>&lt;p&gt;You launch your app. Traffic is light, everything works. A few weeks later you start seeing &lt;code&gt;FATAL: remaining connection slots are reserved for non-replication superuser connections&lt;/code&gt;. Your PostgreSQL server is out of connections and your app is falling over.&lt;/p&gt;

&lt;p&gt;This is one of the most common PostgreSQL scaling problems, and connection pooling is the fix. But the fix has its own complexity: PgBouncer has three modes with different tradeoffs, the configuration is full of footguns, and if you get it wrong you get subtle bugs that are much harder to debug than the original connection error.&lt;/p&gt;

&lt;p&gt;This guide covers how PostgreSQL connections actually work, how to set up and configure PgBouncer correctly, and how to choose the right pool mode for your application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why PostgreSQL Connections Are Expensive
&lt;/h2&gt;

&lt;p&gt;PostgreSQL handles each connection with a dedicated server process. When a client connects, Postgres forks a new OS process. That process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Allocates its own memory (typically 5-10 MB per connection including shared memory overhead)&lt;/li&gt;
&lt;li&gt;Maintains its own backend state, transaction state, and lock tables&lt;/li&gt;
&lt;li&gt;Requires the kernel to schedule it like any other process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At 50 connections, this is fine. At 500 connections, you have 500 OS processes and the scheduler starts showing up in your performance profiles. At 1,000 connections, you are likely hitting the &lt;code&gt;max_connections&lt;/code&gt; limit (default 100 in stock PostgreSQL) and your app is returning errors.&lt;/p&gt;

&lt;p&gt;The naive fix is to increase &lt;code&gt;max_connections&lt;/code&gt;. Don't do that without thinking it through. Each connection costs memory. Set &lt;code&gt;max_connections = 1000&lt;/code&gt; on a server with 8 GB of RAM and you've committed most of the machine's RAM to idle connections before a single query runs. The &lt;code&gt;shared_buffers&lt;/code&gt; and &lt;code&gt;work_mem&lt;/code&gt; math goes sideways fast.&lt;/p&gt;
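
&lt;p&gt;A quick back-of-envelope calculation makes the point. A minimal sketch, assuming 8 MB per backend as a midpoint of the 5-10 MB range above:&lt;/p&gt;

```python
# Rough RAM committed to PostgreSQL backend processes alone.
# The 8 MB per-backend figure is an assumption, not a measurement.
def connection_overhead_gb(max_connections, mb_per_backend=8):
    return max_connections * mb_per_backend / 1024

print(connection_overhead_gb(100))   # about 0.78 GB: comfortable
print(connection_overhead_gb(1000))  # about 7.81 GB: nearly all of an 8 GB server
```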

&lt;p&gt;The right fix is to reduce the number of actual connections to PostgreSQL. That's what connection poolers do.&lt;/p&gt;

&lt;h2&gt;
  
  
  What PgBouncer Does
&lt;/h2&gt;

&lt;p&gt;PgBouncer sits between your application and PostgreSQL. Your app thinks it's talking to Postgres, but it's actually talking to PgBouncer. PgBouncer maintains a pool of real connections to Postgres and hands them out to client requests.&lt;/p&gt;

&lt;p&gt;The numbers look like this in practice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Before PgBouncer:&lt;/strong&gt; 300 app threads, 300 Postgres connections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After PgBouncer (transaction mode):&lt;/strong&gt; 300 app threads, 20 actual Postgres connections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those 20 connections serve 300 clients because most clients are not actually executing SQL at any given moment. They're waiting for network I/O, processing results, or sitting idle. Transaction mode takes advantage of this by returning a connection to the pool the moment a transaction commits.&lt;/p&gt;
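
&lt;p&gt;The arithmetic behind that ratio is Little's law. A sketch with illustrative numbers, not measurements:&lt;/p&gt;

```python
# Average number of clients inside a transaction at any instant
# (Little's law: concurrency = arrival rate x time in system).
def busy_connections(clients, avg_txn_ms, txns_per_client_per_sec):
    return clients * txns_per_client_per_sec * (avg_txn_ms / 1000)

# 300 clients, each running 10 transactions/sec at 5 ms apiece:
print(busy_connections(300, avg_txn_ms=5, txns_per_client_per_sec=10))  # 15.0
```

&lt;p&gt;With only around 15 transactions in flight on average, a pool of 20 leaves headroom for bursts.&lt;/p&gt;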

&lt;h2&gt;
  
  
  Installing PgBouncer
&lt;/h2&gt;

&lt;p&gt;On Ubuntu/Debian:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install &lt;/span&gt;pgbouncer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On macOS with Homebrew:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;pgbouncer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configuring PgBouncer
&lt;/h2&gt;

&lt;p&gt;A minimal working config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[databases]&lt;/span&gt;
&lt;span class="py"&gt;mydb&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;host=127.0.0.1 port=5432 dbname=mydb&lt;/span&gt;

&lt;span class="nn"&gt;[pgbouncer]&lt;/span&gt;
&lt;span class="py"&gt;listen_addr&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;127.0.0.1&lt;/span&gt;
&lt;span class="py"&gt;listen_port&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;6432&lt;/span&gt;
&lt;span class="py"&gt;auth_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;scram-sha-256&lt;/span&gt;
&lt;span class="py"&gt;auth_file&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/etc/pgbouncer/userlist.txt&lt;/span&gt;
&lt;span class="py"&gt;pool_mode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;transaction&lt;/span&gt;
&lt;span class="py"&gt;max_client_conn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;1000&lt;/span&gt;
&lt;span class="py"&gt;default_pool_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;25&lt;/span&gt;
&lt;span class="py"&gt;min_pool_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;5&lt;/span&gt;
&lt;span class="py"&gt;reserve_pool_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;5&lt;/span&gt;
&lt;span class="py"&gt;server_idle_timeout&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;600&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For production, use hashed passwords. Generate &lt;code&gt;userlist.txt&lt;/code&gt; entries from &lt;code&gt;pg_authid&lt;/code&gt; (requires superuser):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;concat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rolname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'" "'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rolpassword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_authid&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;rolname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'myuser'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your app connects to port 6432 instead of 5432. Nothing else changes in the app code.&lt;/p&gt;
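
&lt;p&gt;In application code, the change really is just the port. A hedged sketch (the helper and DSN below are illustrative, not part of PgBouncer):&lt;/p&gt;

```python
# Rewrite a libpq-style DSN so the app targets PgBouncer instead of Postgres.
def via_pgbouncer(dsn, pgbouncer_port=6432):
    parts = []
    for kv in dsn.split():
        key = kv.partition("=")[0]
        parts.append(f"port={pgbouncer_port}" if key == "port" else kv)
    return " ".join(parts)

print(via_pgbouncer("host=127.0.0.1 port=5432 dbname=mydb user=myuser"))
# host=127.0.0.1 port=6432 dbname=mydb user=myuser
```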

&lt;h2&gt;
  
  
  The Three Pool Modes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Session Mode
&lt;/h3&gt;

&lt;p&gt;A server connection is assigned when the client connects and held until the client disconnects. This is the safest mode: it behaves identically to a direct PostgreSQL connection. Prepared statements, advisory locks, and &lt;code&gt;LISTEN/NOTIFY&lt;/code&gt; all work correctly.&lt;/p&gt;

&lt;p&gt;Session mode does not help much with connection counts at steady state. Use it when you need full compatibility and your problem is peak load, not constant high concurrency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Transaction Mode
&lt;/h3&gt;

&lt;p&gt;A server connection is assigned for the duration of a transaction and returned to the pool immediately after. This gives you the dramatic reduction in server connections.&lt;/p&gt;

&lt;p&gt;The tradeoff: &lt;strong&gt;session-level state does not persist across transactions&lt;/strong&gt;. This breaks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PREPARE&lt;/code&gt; and server-side prepared statement caching&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SET&lt;/code&gt; commands that change session settings (&lt;code&gt;SET LOCAL&lt;/code&gt;, scoped to a single transaction, is safe)&lt;/li&gt;
&lt;li&gt;Advisory locks (session-scoped)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LISTEN&lt;/code&gt; and &lt;code&gt;NOTIFY&lt;/code&gt; subscriptions&lt;/li&gt;
&lt;li&gt;Temp tables that are supposed to persist across transactions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Transaction mode works well for stateless web applications using pg (Node.js), psycopg (Python), or JDBC (Java), as long as the driver and framework avoid session-level features.&lt;/p&gt;

&lt;h3&gt;
  
  
  Statement Mode
&lt;/h3&gt;

&lt;p&gt;A connection is held only for a single SQL statement, then returned. Breaks multi-statement transactions entirely. Rarely the right choice for web applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Choosing Your Mode
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Your app&lt;/th&gt;
&lt;th&gt;Recommended mode&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Stateless API using ORM (Django, Rails, Prisma)&lt;/td&gt;
&lt;td&gt;Transaction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long-lived connections with prepared statements&lt;/td&gt;
&lt;td&gt;Session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Connection count issues at peak only&lt;/td&gt;
&lt;td&gt;Session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Connection count issues at steady state&lt;/td&gt;
&lt;td&gt;Transaction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Serverless functions&lt;/td&gt;
&lt;td&gt;Transaction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Application using LISTEN/NOTIFY&lt;/td&gt;
&lt;td&gt;Session&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Pool Sizing
&lt;/h2&gt;

&lt;p&gt;A reasonable starting formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;default_pool_size = (number of PostgreSQL CPU cores) * 2 + number of disks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
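
&lt;p&gt;The same starting formula, as code you can plug your own host numbers into (core and disk counts are things you measure on the database server; the result is a starting point, not a law):&lt;/p&gt;

```python
def starting_pool_size(cpu_cores, disks=1):
    # (cores * 2) + disks, per the starting formula above
    return cpu_cores * 2 + disks

print(starting_pool_size(8, disks=2))   # 18
print(starting_pool_size(16))           # 33
```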



&lt;p&gt;In practice, 20-30 works well for most web applications. Check pool utilization while the app is under load:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="n"&gt;POOLS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;cl_waiting &amp;gt; 0&lt;/code&gt; sustained means the pool is undersized. &lt;code&gt;sv_idle&lt;/code&gt; consistently high means it's oversized.&lt;/p&gt;
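
&lt;p&gt;That heuristic can be written down as a check against values you would read out of &lt;code&gt;SHOW POOLS&lt;/code&gt;. A sketch; the 80% idle threshold is an illustrative assumption:&lt;/p&gt;

```python
def pool_verdict(cl_waiting, sv_idle, pool_size):
    # Clients queuing for a server connection: pool too small.
    if cl_waiting > 0:
        return "undersized"
    # Most server connections idle: pool larger than the workload needs.
    if sv_idle > pool_size * 0.8:
        return "oversized"
    return "ok"

print(pool_verdict(cl_waiting=4, sv_idle=0, pool_size=25))   # undersized
print(pool_verdict(cl_waiting=0, sv_idle=24, pool_size=25))  # oversized
```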

&lt;h2&gt;
  
  
  Monitoring PgBouncer
&lt;/h2&gt;

&lt;p&gt;Connect to the admin console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;psql &lt;span class="nt"&gt;-h&lt;/span&gt; 127.0.0.1 &lt;span class="nt"&gt;-p&lt;/span&gt; 6432 &lt;span class="nt"&gt;-U&lt;/span&gt; pgbouncer pgbouncer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key commands: &lt;code&gt;SHOW POOLS&lt;/code&gt;, &lt;code&gt;SHOW STATS&lt;/code&gt;, &lt;code&gt;SHOW CLIENTS&lt;/code&gt;, &lt;code&gt;SHOW SERVERS&lt;/code&gt;, &lt;code&gt;RELOAD&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SHOW STATS&lt;/code&gt; tells you &lt;code&gt;avg_query_time&lt;/code&gt; and &lt;code&gt;avg_wait_time&lt;/code&gt;. If either is climbing, something is backing up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Pitfalls
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Prepared statements in transaction mode:&lt;/strong&gt; If your app uses server-side prepared statements, transaction mode will break it unless your PgBouncer is new enough to track them (1.21+ with &lt;code&gt;max_prepared_statements&lt;/code&gt;). Otherwise, disable them in your driver: &lt;code&gt;prepare_threshold=None&lt;/code&gt; in psycopg 3 (psycopg2 does not use server-side prepared statements by default), &lt;code&gt;prepare: false&lt;/code&gt; in postgres.js, and no named queries in pg (Node.js), since naming a query creates a prepared statement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Connection storms at startup:&lt;/strong&gt; Set &lt;code&gt;reserve_pool_size = 5&lt;/code&gt; and &lt;code&gt;max_client_conn&lt;/code&gt; to at least 2-3x your expected peak.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SSL:&lt;/strong&gt; Always use SSL for both client-to-PgBouncer and PgBouncer-to-PostgreSQL in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Skip PgBouncer
&lt;/h2&gt;

&lt;p&gt;You probably don't need it if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You see fewer than 50 concurrent connections at peak&lt;/li&gt;
&lt;li&gt;Your ORM's client-side pool (Prisma, Django, Rails) already caps total connections well below &lt;code&gt;max_connections&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Your serverless functions hold 1-2 connections each and instance counts stay low (at scale, serverless is exactly where a pooler helps)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need it if multiple processes or pods each maintain their own pool and the total exceeds &lt;code&gt;max_connections&lt;/code&gt;.&lt;/p&gt;
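
&lt;p&gt;The failure mode is simple multiplication. Illustrative numbers:&lt;/p&gt;

```python
# Each process or pod keeps its own client-side pool; totals multiply.
def total_db_connections(pods, pool_per_pod):
    return pods * pool_per_pod

demand = total_db_connections(pods=30, pool_per_pod=10)
print(demand)        # 300
print(demand > 100)  # True: over a default max_connections of 100
```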

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;PostgreSQL connection pooling is not optional at scale. PgBouncer in transaction mode is the right default for most web applications.&lt;/p&gt;

&lt;p&gt;The main things to get right: pool size (start at 20-30 for OLTP), mode (transaction for stateless apps, session for anything using prepared statements or advisory locks), monitoring (&lt;code&gt;cl_waiting&lt;/code&gt; and &lt;code&gt;avg_wait_time&lt;/code&gt;), and SSL in production.&lt;/p&gt;

&lt;p&gt;Connection pooling is unglamorous infrastructure, but it's the difference between an app that falls over at traffic spikes and one that just handles them.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgresql-connection-pooling-pgbouncer" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>tutorial</category>
      <category>database</category>
      <category>webdev</category>
    </item>
    <item>
      <title>PostgreSQL Full Text Search: A Complete Guide</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Mon, 30 Mar 2026 14:49:54 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-full-text-search-a-complete-guide-2nj9</link>
      <guid>https://forem.com/geekyfox90/postgresql-full-text-search-a-complete-guide-2nj9</guid>
      <description>&lt;p&gt;There's a database already running in your stack. It has your users, your content, your transactions. And buried in that same PostgreSQL instance is a full text search engine you've probably never turned on.&lt;/p&gt;

&lt;p&gt;PostgreSQL full text search has been production ready for over a decade. It handles stemming, stop words, multiple languages, weighted ranking, and trigram fuzzy matching. You don't need Elasticsearch for a search feature. You don't need Algolia if your data is already in Postgres. For most applications, especially those with under a few million documents, built-in full text search is the right call.&lt;/p&gt;

&lt;p&gt;This guide covers everything you need to ship full text search in PostgreSQL: how the underlying model works, how to index correctly, how to rank results, and how it compares to vector search with pgvector.&lt;/p&gt;

&lt;h2&gt;
  
  
  How PostgreSQL Full Text Search Works
&lt;/h2&gt;

&lt;p&gt;PostgreSQL doesn't search raw text. It converts text into a normalized representation called a &lt;code&gt;tsvector&lt;/code&gt;, then matches queries expressed as &lt;code&gt;tsquery&lt;/code&gt; objects. This two-step process is what makes it fast.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;tsvector&lt;/code&gt; is a sorted list of lexemes: normalized word forms that strip suffixes and reduce words to their base form. The word "running" becomes "run". "Postgres" becomes "postgr". Stop words like "the", "a", "an" are dropped entirely.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'The quick brown fox jumps over the lazy dog'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- 'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A &lt;code&gt;tsquery&lt;/code&gt; is what you match against:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'jumping'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: 'jump'&lt;/span&gt;

&lt;span class="c1"&gt;-- Boolean operators&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgres &amp;amp; search'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;-- AND&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgres | mysql'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;-- OR&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'database &amp;amp; !oracle'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;-- NOT&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The match operator is &lt;code&gt;@@&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'PostgreSQL is a powerful database'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'powerful'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting Up Full Text Search
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Option 1: Generated column (recommended for PostgreSQL 12+)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
  &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="n"&gt;TSVECTOR&lt;/span&gt;
  &lt;span class="k"&gt;GENERATED&lt;/span&gt; &lt;span class="n"&gt;ALWAYS&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'B'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;STORED&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;setweight&lt;/code&gt; assigns priority to fields: 'A' (highest) to title, 'B' to body. Documents where the search term appears in the title rank higher.&lt;/p&gt;

&lt;p&gt;Query with ranking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgresql'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option 2: Trigger-maintained column
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="n"&gt;TSVECTOR&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;articles_search_vector_trigger&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
  &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'B'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="n"&gt;articles_search_vector_update&lt;/span&gt;
  &lt;span class="k"&gt;BEFORE&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
  &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;articles_search_vector_trigger&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Indexing for Performance
&lt;/h2&gt;

&lt;p&gt;Without an index, full text search does a full table scan. GIN (Generalized Inverted Index) is purpose-built for tsvectors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;articles_search_idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this index, full text search queries return in milliseconds even on tables with millions of rows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ranking Results
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgresql &amp;amp; database'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ts_rank_cd&lt;/code&gt; uses cover density (how close matching terms are to each other) and often gives better results for multi-word queries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Snippet Generation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgresql'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;'MaxWords=50, MinWords=15, StartSel=&amp;lt;mark&amp;gt;, StopSel=&amp;lt;/mark&amp;gt;'&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;snippet&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgresql'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: &lt;code&gt;ts_headline&lt;/code&gt; reparses the original document text and does not use the GIN index. Call it only on the rows you actually return (after &lt;code&gt;LIMIT&lt;/code&gt;), not on every match.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling User Input Safely
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- websearch_to_tsquery: safe, handles Google-style syntax&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;websearch_to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'"full text search" postgres -oracle'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;websearch_to_tsquery&lt;/code&gt; (PostgreSQL 11+) is the best default for user input. It's injection-safe, never raises syntax errors on malformed input, and supports quoted phrases and exclusions.&lt;/p&gt;
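&lt;p&gt;To see how it degrades gracefully, feed it deliberately broken input (illustrative; the exact tsquery produced depends on your dictionary configuration):&lt;/p&gt;

```sql
-- websearch_to_tsquery never raises a syntax error, whatever the user typed
SELECT websearch_to_tsquery('english', 'AND OR (unbalanced "dangling');

-- compare: to_tsquery('english', 'AND OR') fails with a syntax error
```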

&lt;h2&gt;
  
  
  Full Text Search vs pgvector
&lt;/h2&gt;

&lt;p&gt;They solve different problems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full text search&lt;/strong&gt; finds documents containing specific words or phrases. Fast, precise, no ML model required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;pgvector&lt;/strong&gt; finds semantically similar documents, even without shared keywords. Needs an embedding model.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Best approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Blog/docs search&lt;/td&gt;
&lt;td&gt;Full text search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Semantic Q&amp;amp;A, RAG&lt;/td&gt;
&lt;td&gt;pgvector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E-commerce&lt;/td&gt;
&lt;td&gt;Often both&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Many systems use both, merging scores with Reciprocal Rank Fusion. For RAG pipelines, see our guides on &lt;a href="https://rivestack.io/blog/getting-started-with-pgvector" rel="noopener noreferrer"&gt;getting started with pgvector&lt;/a&gt; and &lt;a href="https://rivestack.io/blog/how-to-use-pgvector-with-python" rel="noopener noreferrer"&gt;pgvector with Python&lt;/a&gt;.&lt;/p&gt;
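&lt;p&gt;Reciprocal Rank Fusion itself is only a few lines of code. A minimal sketch in pure Python (document ids and the two input rankings are illustrative; 60 is the conventional default constant):&lt;/p&gt;

```python
# Merge ranked result lists with Reciprocal Rank Fusion:
# each document scores 1 / (k + rank) per list it appears in.
def rrf_merge(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["a3", "b7", "c1"]   # from full text search, best first
semantic_hits = ["b7", "d9", "a3"]  # from pgvector, best first
print(rrf_merge([keyword_hits, semantic_hits]))  # → ['b7', 'a3', 'd9', 'c1']
```

Documents that rank well in both lists float to the top without any score normalization, which is why RRF is a popular default for hybrid search.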

&lt;h2&gt;
  
  
  Complete Production Setup
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;BIGSERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="n"&gt;TSVECTOR&lt;/span&gt; &lt;span class="k"&gt;GENERATED&lt;/span&gt; &lt;span class="n"&gt;ALWAYS&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'B'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;STORED&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;articles_search_idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;search_articles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_text&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_size&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_offset&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;snippet&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="n"&gt;FLOAT4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;DECLARE&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="n"&gt;TSQUERY&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;websearch_to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query_text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
  &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="n"&gt;QUERY&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'MaxWords=40, MinWords=10, StartSel=&amp;lt;mark&amp;gt;, StopSel=&amp;lt;/mark&amp;gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;
  &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
  &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="n"&gt;page_size&lt;/span&gt; &lt;span class="k"&gt;OFFSET&lt;/span&gt; &lt;span class="n"&gt;page_offset&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Usage&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;search_articles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'postgresql replication'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keyword-based search, relevance ranking, HTML-ready snippets, and pagination, all within PostgreSQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It on Rivestack
&lt;/h2&gt;

&lt;p&gt;PostgreSQL full text search and pgvector work side by side on the same database. No separate infrastructure for keyword vs semantic search.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;Rivestack&lt;/a&gt; gives you a fully managed PostgreSQL instance with pgvector, pg_trgm, and all standard extensions enabled by default. Try it free, no credit card required.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgres-full-text-search" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>tutorial</category>
      <category>database</category>
      <category>sql</category>
    </item>
    <item>
      <title>PostgreSQL High Availability: A Practical Guide for Production</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Sun, 29 Mar 2026 16:08:53 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-high-availability-a-practical-guide-for-production-1a63</link>
      <guid>https://forem.com/geekyfox90/postgresql-high-availability-a-practical-guide-for-production-1a63</guid>
      <description>&lt;p&gt;Your application is live. Customers are using it. The database goes down.&lt;/p&gt;

&lt;p&gt;How long before traffic routes around the failure? Ten seconds? Five minutes? Never, because you're paged at 2 AM and have to manually promote a replica while the on-call engineer Slacks you asking if the database is "doing a thing"?&lt;/p&gt;

&lt;p&gt;PostgreSQL high availability is one of those topics that looks straightforward in blog posts and turns out to be deeply humbling when you actually implement it in production. This guide covers how PostgreSQL HA actually works, the main tools people use, what typically goes wrong, and when the complexity of DIY HA stops being worth it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "High Availability" Means for PostgreSQL
&lt;/h2&gt;

&lt;p&gt;High availability means your database keeps serving requests even when individual components fail. For PostgreSQL, that typically requires three things working together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Data replication&lt;/strong&gt; — at least one copy of your data exists on a server other than the primary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failure detection&lt;/strong&gt; — something notices when the primary is unreachable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic failover&lt;/strong&gt; — the replica promotes itself to primary without a human in the loop&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;PostgreSQL ships with excellent replication primitives but no built-in automatic failover. The replication part is solid and well-understood. The failover part is where teams get into trouble.&lt;/p&gt;

&lt;h3&gt;
  
  
  Streaming Replication: The Foundation
&lt;/h3&gt;

&lt;p&gt;PostgreSQL replication is based on Write-Ahead Log (WAL) shipping. Every write to the primary is first written to the WAL. Replicas connect to the primary and stream that WAL in near-real-time, replaying it to stay current.&lt;/p&gt;

&lt;p&gt;Setting up a basic standby looks like this in &lt;code&gt;postgresql.conf&lt;/code&gt; on the primary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- postgresql.conf on the primary&lt;/span&gt;
&lt;span class="n"&gt;wal_level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;replica&lt;/span&gt;
&lt;span class="n"&gt;max_wal_senders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;wal_keep_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;GB&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And in &lt;code&gt;pg_hba.conf&lt;/code&gt;, you allow the replica to connect for replication:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="c"&gt;# pg_hba.conf on the primary
&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt; &lt;span class="n"&gt;replication&lt;/span&gt; &lt;span class="n"&gt;replicator&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;.&lt;span class="m"&gt;0&lt;/span&gt;.&lt;span class="m"&gt;0&lt;/span&gt;.&lt;span class="m"&gt;2&lt;/span&gt;/&lt;span class="m"&gt;32&lt;/span&gt; &lt;span class="n"&gt;scram&lt;/span&gt;-&lt;span class="n"&gt;sha&lt;/span&gt;-&lt;span class="m"&gt;256&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The replica connects with a &lt;code&gt;primary_conninfo&lt;/code&gt; in its configuration and starts streaming WAL from the primary. Once streaming, the replica is typically only milliseconds behind.&lt;/p&gt;
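&lt;p&gt;For reference, the standby side might look like this (host, user, and name are placeholders; &lt;code&gt;pg_basebackup -R&lt;/code&gt; can generate these settings for you):&lt;/p&gt;

```conf
# postgresql.conf on the replica (PostgreSQL 12+)
primary_conninfo = 'host=10.0.0.1 port=5432 user=replicator application_name=replica1'
# plus an empty standby.signal file in the data directory
```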

&lt;h3&gt;
  
  
  Synchronous vs Asynchronous Replication
&lt;/h3&gt;

&lt;p&gt;By default, PostgreSQL replication is &lt;strong&gt;asynchronous&lt;/strong&gt;: the primary commits a transaction and returns success to the client before confirming the replica received the data. If the primary dies at exactly the wrong moment, you can lose the last few transactions.&lt;/p&gt;
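&lt;p&gt;You can measure how far behind each standby is at any moment by querying the primary (the lag columns are available on PostgreSQL 10+):&lt;/p&gt;

```sql
-- On the primary: streaming status and lag per replica
SELECT application_name, state, sync_state,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;
```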

&lt;p&gt;&lt;strong&gt;Synchronous replication&lt;/strong&gt; waits for at least one replica to confirm it has received and written the WAL before reporting the commit as successful:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- postgresql.conf on the primary&lt;/span&gt;
&lt;span class="n"&gt;synchronous_standby_names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'replica1'&lt;/span&gt;
&lt;span class="n"&gt;synchronous_commit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you zero-RPO (recovery point objective) — no committed data is ever lost. The tradeoff is latency: every write waits for a round trip to the replica. On a local network this is typically 1–5ms. Across availability zones it can be 10–30ms depending on the cloud provider.&lt;/p&gt;

&lt;p&gt;Most production setups use synchronous replication for the hot standby and asynchronous replication for additional read replicas or DR standbys that are geographically distant.&lt;/p&gt;
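&lt;p&gt;With more than one standby, you can also request quorum commit instead of naming a single node, so any one acknowledgment is enough (PostgreSQL 9.6+; standby names are illustrative):&lt;/p&gt;

```sql
-- postgresql.conf on the primary: wait for any one of two standbys
synchronous_standby_names = 'ANY 1 (replica1, replica2)'
```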

&lt;h2&gt;
  
  
  The Problem: PostgreSQL Doesn't Fail Over Itself
&lt;/h2&gt;

&lt;p&gt;With streaming replication running, you have your data in two places. But if the primary goes down, PostgreSQL doesn't automatically promote the replica. You have two choices:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Manual failover&lt;/strong&gt; — someone runs &lt;code&gt;pg_ctl promote&lt;/code&gt; or &lt;code&gt;SELECT pg_promote()&lt;/code&gt; on the replica. Fast if someone is awake, catastrophic if not.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated failover via an HA tool&lt;/strong&gt; — a separate process watches the primary and promotes the replica when it detects a failure.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Almost every production PostgreSQL HA setup uses one of three tools for automated failover: Patroni, pg_auto_failover, or repmgr. They all solve the same problem; they have meaningfully different complexity and tradeoff profiles.&lt;/p&gt;

&lt;h2&gt;
  
  
  Patroni: The Industry Standard (and Why It's Hard)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/zalando/patroni" rel="noopener noreferrer"&gt;Patroni&lt;/a&gt; is what most teams with serious PostgreSQL HA requirements end up using. It's battle-tested, highly configurable, and runs at scale. It's also genuinely complex to operate.&lt;/p&gt;

&lt;p&gt;Patroni uses a distributed consensus store — etcd, Consul, or ZooKeeper — to maintain cluster state and elect a primary. A minimal production Patroni setup is 3 etcd nodes + 2–3 PostgreSQL nodes + HAProxy. That's 5–6 servers before you've even started.&lt;/p&gt;

&lt;p&gt;A minimal &lt;code&gt;patroni.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;scope&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prod-cluster&lt;/span&gt;
&lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/db/&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pg-node-1&lt;/span&gt;

&lt;span class="na"&gt;restapi&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;listen&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0.0.0.0:8008&lt;/span&gt;
  &lt;span class="na"&gt;connect_address&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10.0.0.1:8008&lt;/span&gt;

&lt;span class="na"&gt;etcd3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10.0.1.1:2379,10.0.1.2:2379,10.0.1.3:2379&lt;/span&gt;

&lt;span class="na"&gt;bootstrap&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;dcs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ttl&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
    &lt;span class="na"&gt;loop_wait&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
    &lt;span class="na"&gt;retry_timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
    &lt;span class="na"&gt;maximum_lag_on_failover&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1048576&lt;/span&gt;
    &lt;span class="na"&gt;synchronous_mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;postgresql&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;listen&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0.0.0.0:5432&lt;/span&gt;
  &lt;span class="na"&gt;connect_address&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10.0.0.1:5432&lt;/span&gt;
  &lt;span class="na"&gt;data_dir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/var/lib/postgresql/data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;HAProxy needs health checks against the Patroni REST API (port 8008), not just the PostgreSQL port — that's how it knows which node is the current primary.&lt;/p&gt;
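&lt;p&gt;A typical HAProxy stanza looks roughly like this (a sketch; addresses are placeholders, and Patroni's &lt;code&gt;/primary&lt;/code&gt; endpoint returns HTTP 200 only on the current leader):&lt;/p&gt;

```conf
# haproxy.cfg fragment (illustrative)
listen postgres_write
    bind *:5432
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg-node-1 10.0.0.1:5432 check port 8008
    server pg-node-2 10.0.0.2:5432 check port 8008
```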

&lt;p&gt;Patroni is genuinely good software. But "run Patroni in production" is a weeks-long project, not an afternoon task.&lt;/p&gt;

&lt;h2&gt;
  
  
  pg_auto_failover: Simpler, More Opinionated
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://pg-auto-failover.readthedocs.io/" rel="noopener noreferrer"&gt;pg_auto_failover&lt;/a&gt; uses a dedicated monitor node instead of etcd:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On the monitor node&lt;/span&gt;
pg_autoctl create monitor &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pgdata&lt;/span&gt; /var/lib/postgresql/monitor &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pgport&lt;/span&gt; 5000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--hostname&lt;/span&gt; monitor.internal

&lt;span class="c"&gt;# On the primary node&lt;/span&gt;
pg_autoctl create postgres &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pgdata&lt;/span&gt; /var/lib/postgresql/data &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--monitor&lt;/span&gt; postgres://autoctl_node@monitor.internal:5000/pg_auto_failover &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--hostname&lt;/span&gt; primary.internal &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pgport&lt;/span&gt; 5432
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Easier to set up than Patroni, but the monitor is a single point of failure for failover decisions (the databases keep serving traffic while it's down, but no failover can happen), and it's less flexible for complex topologies. Good choice for teams that want something working quickly without running etcd.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Goes Wrong in Production
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Split-Brain
&lt;/h3&gt;

&lt;p&gt;The most dangerous failure mode: a network partition causes both nodes to think they're primary and accept writes. Patroni prevents this with etcd distributed locks — a node can only be primary if it holds the lock. If it can't reach etcd, it demotes itself. pg_auto_failover prevents it by centralizing all promotion decisions in the monitor.&lt;/p&gt;

&lt;h3&gt;
  
  
  Replication Lag at Failover Time
&lt;/h3&gt;

&lt;p&gt;With async replication, a lagged replica that promotes will be missing the last N transactions. Patroni's &lt;code&gt;maximum_lag_on_failover&lt;/code&gt; controls this — but set it too conservatively and failover blocks entirely if all replicas are lagged after a network partition.&lt;/p&gt;

&lt;h3&gt;
  
  
  Application Not Reconnecting
&lt;/h3&gt;

&lt;p&gt;After failover, point your connection string at a VIP or load balancer, not a node's IP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;postgresql://ha-proxy.internal:5432/mydb?target_session_attrs=read-write
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;target_session_attrs=read-write&lt;/code&gt; parameter tells libpq to reject connections to read-only servers, which helps clients find the current primary automatically.&lt;/p&gt;
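&lt;p&gt;If you'd rather skip the load balancer, libpq (PostgreSQL 10+) can also try several hosts in order until one accepts writes (hostnames here are placeholders):&lt;/p&gt;

```plaintext
postgresql://pg-node-1.internal:5432,pg-node-2.internal:5432/mydb?target_session_attrs=read-write
```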

&lt;h3&gt;
  
  
  Testing Your Failover
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Kill the primary while watching application logs&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl stop postgresql@17-main

&lt;span class="c"&gt;# Check the cluster state&lt;/span&gt;
patronictl &lt;span class="nt"&gt;-c&lt;/span&gt; /etc/patroni/config.yml list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you haven't tested your failover, you don't have HA. You have a plan that might work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Failover Time Question
&lt;/h2&gt;

&lt;p&gt;Typical Patroni failover: 10–30 seconds. The timeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Primary unreachable (0s)&lt;/li&gt;
&lt;li&gt;Patroni TTL expires, primary declared dead (default: 30s)&lt;/li&gt;
&lt;li&gt;Replica acquires DCS lock and promotes (1–2s)&lt;/li&gt;
&lt;li&gt;HAProxy detects new primary (2–5s)&lt;/li&gt;
&lt;li&gt;Application reconnects (depends on pool settings)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Reducing TTL speeds detection but increases false positives from transient network blips. Most teams settle on 15–30 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  When Managed PostgreSQL High Availability Makes Sense
&lt;/h2&gt;

&lt;p&gt;DIY HA works. Large companies run Patroni at massive scale. But it means running 5+ nodes, maintaining etcd, and debugging consensus edge cases at the worst possible time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://rivestack.io/blog/why-managed-postgres-for-ai" rel="noopener noreferrer"&gt;As we've broken down before&lt;/a&gt;, the real cost of self-hosted PostgreSQL HA includes infrastructure, tooling, and engineering time — and engineering time dominates.&lt;/p&gt;

&lt;p&gt;Rivestack's HA clusters handle streaming replication, automatic failover, and connection routing automatically. Failover happens in seconds, your connection string doesn't change, and there's no etcd cluster to maintain. HA clusters start at $99/month with NVMe storage, automated backups, point-in-time recovery, and monitoring included.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Do If You're Starting From Scratch
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;For a managed setup:&lt;/strong&gt; &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;Try Rivestack&lt;/a&gt; — spin up an HA cluster, verify failover works from the dashboard, and move on to building your application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For a self-managed setup:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start with pg_auto_failover for 2–3 nodes and fast setup&lt;/li&gt;
&lt;li&gt;Move to Patroni if you need multi-datacenter support or fine-grained control&lt;/li&gt;
&lt;li&gt;Use HAProxy with Patroni REST API health checks (port 8008) for connection routing&lt;/li&gt;
&lt;li&gt;Enable synchronous replication if your application can't tolerate data loss&lt;/li&gt;
&lt;li&gt;Test failover in staging before relying on it in production&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;PostgreSQL has excellent HA building blocks. Streaming replication is fast, reliable, and well-understood. The gap is automated failover, and Patroni or pg_auto_failover fills it — but adds real operational complexity.&lt;/p&gt;

&lt;p&gt;Test your failover before you need it. The worst time to discover your replica is 10 minutes behind is during an actual outage.&lt;/p&gt;

&lt;p&gt;If you want to skip the infrastructure work, &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;Rivestack&lt;/a&gt; handles the HA layer for you — including pgvector if you're building AI applications. See the &lt;a href="https://rivestack.io/blog/getting-started-with-pgvector" rel="noopener noreferrer"&gt;getting started guide&lt;/a&gt; if that's your stack.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgres-high-availability" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>devops</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Use pgvector with Python: A Complete Guide</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Sat, 28 Mar 2026 13:58:17 +0000</pubDate>
      <link>https://forem.com/geekyfox90/how-to-use-pgvector-with-python-a-complete-guide-808</link>
      <guid>https://forem.com/geekyfox90/how-to-use-pgvector-with-python-a-complete-guide-808</guid>
      <description>&lt;p&gt;You've decided to use PostgreSQL for your vector embeddings. Smart move. Now you need to wire it up from Python — and if you've landed here, you've probably already noticed that there are a few different libraries involved, the syntax isn't immediately obvious, and the official pgvector docs give you the C extension but leave the Python story somewhat scattered.&lt;/p&gt;

&lt;p&gt;This guide covers the whole picture: installing the Python client, connecting with both psycopg3 and SQLAlchemy, storing and querying embeddings, building indexes, and wiring it up into a real RAG pipeline. By the end you'll have a working setup you can actually ship.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You Need Before You Start
&lt;/h2&gt;

&lt;p&gt;You'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A PostgreSQL database with the &lt;code&gt;vector&lt;/code&gt; extension enabled&lt;/li&gt;
&lt;li&gt;Python 3.8+&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;pgvector&lt;/code&gt; Python package&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're running PostgreSQL locally, install the pgvector extension from &lt;a href="https://github.com/pgvector/pgvector" rel="noopener noreferrer"&gt;the pgvector GitHub repo&lt;/a&gt; and run &lt;code&gt;CREATE EXTENSION vector;&lt;/code&gt;. If you're using a managed PostgreSQL service, the extension is typically pre-installed — on Rivestack, it's enabled by default on every database.&lt;/p&gt;

&lt;p&gt;Install the Python package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pgvector
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll also want a way to talk to the database. The two most common choices for Python are &lt;strong&gt;psycopg3&lt;/strong&gt; (a direct driver: fast, modern, recommended) and &lt;strong&gt;SQLAlchemy&lt;/strong&gt; (an ORM toolkit that runs on top of a driver). We'll cover both.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# For psycopg3&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"psycopg[binary]"&lt;/span&gt;

&lt;span class="c"&gt;# For SQLAlchemy&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;sqlalchemy psycopg2-binary
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting Up the Vector Extension
&lt;/h2&gt;

&lt;p&gt;Connect to your database and enable the extension if it isn't already:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run this once per database. You can verify it's installed with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_extension&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;extname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'vector'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Using pgvector with psycopg3
&lt;/h2&gt;

&lt;p&gt;psycopg3 is the modern PostgreSQL driver for Python. It's faster than psycopg2, has proper async support, and the pgvector Python package integrates with it natively.&lt;/p&gt;

&lt;h3&gt;
  
  
  Connecting and Registering the Vector Type
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pgvector.psycopg&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;register_vector&lt;/span&gt;

&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;register_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;register_vector&lt;/code&gt; call is important — it tells psycopg how to serialize and deserialize Python lists/numpy arrays into the &lt;code&gt;vector&lt;/code&gt; type PostgreSQL expects. Without it, you'll get type errors when inserting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating a Table with a Vector Column
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        CREATE TABLE IF NOT EXISTS documents (
            id BIGSERIAL PRIMARY KEY,
            content TEXT NOT NULL,
            embedding VECTOR(1536)
        )
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;1536&lt;/code&gt; dimension matches OpenAI's &lt;code&gt;text-embedding-3-small&lt;/code&gt; model. If you're using a different model, adjust accordingly:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Dimensions&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI text-embedding-3-small&lt;/td&gt;
&lt;td&gt;1536&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI text-embedding-3-large&lt;/td&gt;
&lt;td&gt;3072&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cohere embed-v4&lt;/td&gt;
&lt;td&gt;1024&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google text-embedding-005&lt;/td&gt;
&lt;td&gt;768&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;all-MiniLM-L6-v2 (local)&lt;/td&gt;
&lt;td&gt;384&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
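A dimension mismatch is a common failure mode: inserting a 3072-dimension embedding into a <code>VECTOR(1536)</code> column fails at the database level with a cryptic error. A small guard in application code catches it earlier — this is just a sketch (the helper and the model-to-dimension mapping mirror the table above; they aren't part of pgvector):

```python
# Known embedding sizes, mirroring the table above.
MODEL_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "all-MiniLM-L6-v2": 384,
}

EXPECTED_DIM = 1536  # must match VECTOR(1536) in the table definition

def check_embedding(embedding: list[float], expected: int = EXPECTED_DIM) -> list[float]:
    """Raise early if the vector won't fit the column, instead of a DB error at insert time."""
    if len(embedding) != expected:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, column expects {expected}"
        )
    return embedding

# A 1536-dim vector passes; a 3072-dim one raises ValueError.
check_embedding([0.0] * MODEL_DIMS["text-embedding-3-small"])
```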

&lt;h3&gt;
  
  
  Inserting Embeddings
&lt;/h3&gt;

&lt;p&gt;In practice, you generate embeddings with an API call or a local model, then insert them alongside your content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;

&lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PostgreSQL is a powerful open-source relational database.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector adds vector similarity search to PostgreSQL.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Python is a high-level programming language.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO documents (content, embedding) VALUES (%s, %s)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;pgvector's psycopg integration accepts plain Python lists directly — no special wrapping needed.&lt;/p&gt;
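Under the hood, the adapter converts lists to pgvector's text representation, which is simply a bracketed list of floats. If you ever work with a driver that has no adapter, you can build the literal yourself — a minimal sketch (the helper name is ours, not part of pgvector):

```python
def to_vector_literal(values: list[float]) -> str:
    """Render a Python list in pgvector's text format, e.g. '[0.1,0.2,0.3]'.
    The string can be passed as a query parameter and cast with ::vector."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

print(to_vector_literal([0.1, 0.2, 0.3]))  # [0.1,0.2,0.3]
```

With a raw query you would then write something like `INSERT ... VALUES (%s::vector)` and pass the literal string as the parameter.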

&lt;h3&gt;
  
  
  Querying by Similarity
&lt;/h3&gt;

&lt;p&gt;To find the documents most similar to a query, compute the query embedding and use one of pgvector's distance operators:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; — cosine distance (most common for text embeddings)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt; — L2 (Euclidean) distance&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt; — negative inner product
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How does vector search work in PostgreSQL?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        SELECT content, 1 - (embedding &amp;lt;=&amp;gt; %s) AS similarity
        FROM documents
        ORDER BY embedding &amp;lt;=&amp;gt; %s
        LIMIT 5
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; operator returns cosine &lt;em&gt;distance&lt;/em&gt; (0 = identical, 2 = opposite), so &lt;code&gt;1 - distance&lt;/code&gt; gives you cosine &lt;em&gt;similarity&lt;/em&gt; if you want a score between 0 and 1.&lt;/p&gt;
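To build intuition for what these operators compute, here is each one reproduced in plain Python on toy vectors — purely for illustration; pgvector computes these in C inside the database:

```python
import math

def l2_distance(a, b):        # what <-> computes
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):      # <#> returns the *negative* of this
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):    # what <=> computes
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - inner_product(a, b) / (norm_a * norm_b)

a = [1.0, 0.0]
print(cosine_distance(a, [1.0, 0.0]))    # 0.0 -> similarity 1.0 (same direction)
print(cosine_distance(a, [0.0, 1.0]))    # 1.0 -> similarity 0.0 (orthogonal)
print(cosine_distance(a, [-1.0, 0.0]))   # 2.0 -> similarity -1.0 (opposite)
```

Note that `1 - distance` lands in [0, 1] only when embeddings never point in opposing directions, which holds in practice for most text embedding models.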

&lt;h2&gt;
  
  
  Using pgvector with SQLAlchemy
&lt;/h2&gt;

&lt;p&gt;If your project uses SQLAlchemy for ORM models, pgvector has first-class support there too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_engine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BigInteger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sqlalchemy.orm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;declarative_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Session&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pgvector.sqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Vector&lt;/span&gt;

&lt;span class="n"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_engine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql+psycopg2://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;Base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;declarative_base&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;__tablename__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BigInteger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;autoincrement&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nullable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inserting with SQLAlchemy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector makes PostgreSQL a capable vector store.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector makes PostgreSQL a capable vector store.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Querying with SQLAlchemy — the &lt;code&gt;Vector&lt;/code&gt; column type exposes &lt;code&gt;l2_distance&lt;/code&gt;, &lt;code&gt;cosine_distance&lt;/code&gt;, and &lt;code&gt;max_inner_product&lt;/code&gt; comparator methods directly on the model attribute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector search with PostgreSQL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;order_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cosine_distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Adding an Index for Production
&lt;/h2&gt;

&lt;p&gt;The queries above work fine for development and small datasets — PostgreSQL does an exact nearest-neighbor scan. But with more than ~50,000 vectors, you'll want an approximate nearest-neighbor (ANN) index to keep queries fast.&lt;/p&gt;

&lt;p&gt;pgvector supports two index types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HNSW&lt;/strong&gt; — faster queries, higher memory usage during build, better recall&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IVFFlat&lt;/strong&gt; — faster build, less memory, slightly lower recall&lt;/li&gt;
&lt;/ul&gt;
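If you do opt for IVFFlat, it needs a <code>lists</code> parameter at creation time. The pgvector docs suggest roughly <code>rows / 1000</code> for up to about a million rows and <code>sqrt(rows)</code> beyond that — a small helper encoding that rule of thumb (our sketch, not part of pgvector):

```python
import math

def ivfflat_lists(row_count: int) -> int:
    """Rule-of-thumb `lists` parameter for an IVFFlat index, per pgvector's
    guidance: rows/1000 up to ~1M rows, sqrt(rows) above that."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))

print(ivfflat_lists(100_000))    # 100
print(ivfflat_lists(4_000_000))  # 2000
```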

&lt;p&gt;For most production use cases, HNSW is the right choice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        CREATE INDEX IF NOT EXISTS documents_embedding_idx
        ON documents
        USING hnsw (embedding vector_cosine_ops)
        WITH (m = 16, ef_construction = 64)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;vector_cosine_ops&lt;/code&gt; tells pgvector to optimize the index for cosine distance — make sure this matches the operator you use in queries (&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;). If you're using L2 distance, use &lt;code&gt;vector_l2_ops&lt;/code&gt; instead.&lt;/p&gt;
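For each distance metric, the SQL operator, the index operator class, and the SQLAlchemy comparator method must all agree, or the planner won't use your index. One way to keep them straight is a lookup table — the structure here is our own, but the names are pgvector's documented ones:

```python
# Each metric ties together three names that must match: the SQL operator
# used in ORDER BY, the index operator class, and the pgvector-python
# SQLAlchemy comparator method.
DISTANCE_METRICS = {
    "cosine":        {"operator": "<=>", "opclass": "vector_cosine_ops", "comparator": "cosine_distance"},
    "l2":            {"operator": "<->", "opclass": "vector_l2_ops",     "comparator": "l2_distance"},
    "inner_product": {"operator": "<#>", "opclass": "vector_ip_ops",     "comparator": "max_inner_product"},
}

print(DISTANCE_METRICS["cosine"]["opclass"])  # vector_cosine_ops
```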

&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; Build the index &lt;em&gt;after&lt;/em&gt; loading your initial data — bulk-building an HNSW index is significantly faster than growing it row by row. For IVFFlat the order matters even more: its cluster centroids are computed from whatever data exists at build time, so an index built on an empty or unrepresentative table gives worse recall.&lt;/p&gt;

&lt;h3&gt;
  
  
  HNSW parameters
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;m&lt;/code&gt; — number of connections per layer (default 16, range 2–100). Higher = better recall, more memory.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ef_construction&lt;/code&gt; — search depth during build (default 64, range 4–1000). Higher = better recall, slower build.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the query-time speed/recall tradeoff, set &lt;code&gt;hnsw.ef_search&lt;/code&gt; (default 40). Note that a plain &lt;code&gt;SET&lt;/code&gt; lasts for the rest of the session; use &lt;code&gt;SET LOCAL&lt;/code&gt; inside a transaction to scope it to a single query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SET hnsw.ef_search = 100&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Now run your similarity query
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Building a RAG Pipeline with pgvector and Python
&lt;/h2&gt;

&lt;p&gt;Here's a minimal but complete Retrieval-Augmented Generation pipeline using everything above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pgvector.psycopg&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;register_vector&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;openai_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;register_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;openai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;index_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Store a document and its embedding.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO documents (content, embedding) VALUES (%s, %s)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Find the k most relevant documents for a query.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
            SELECT content
            FROM documents
            ORDER BY embedding &amp;lt;=&amp;gt; %s
            LIMIT %s
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Retrieve context and generate an answer.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;context_docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context_docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Answer questions using only the provided context. Be concise.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Context:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Question: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;

&lt;span class="c1"&gt;# Index some documents
&lt;/span&gt;&lt;span class="nf"&gt;index_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector supports HNSW and IVFFlat indexes.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;index_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cosine similarity is best for text embedding comparisons.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;index_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Rivestack provides managed PostgreSQL with pgvector pre-installed.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Ask a question
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What index types does pgvector support?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Metadata Filtering
&lt;/h2&gt;

&lt;p&gt;One of the big advantages of pgvector over standalone vector databases is that you can filter by regular columns in the same query. If you have multi-tenant data or want to search within a category, just add a &lt;code&gt;WHERE&lt;/code&gt; clause:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Add a source column to your table
&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    ALTER TABLE documents ADD COLUMN IF NOT EXISTS source TEXT
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Query with metadata filter
&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    SELECT content, embedding &amp;lt;=&amp;gt; %s AS distance
    FROM documents
    WHERE source = &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;internal-wiki&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;
    ORDER BY embedding &amp;lt;=&amp;gt; %s
    LIMIT 5
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No syncing, no secondary filtering step, no extra infrastructure. It's just SQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connection Pooling
&lt;/h2&gt;

&lt;p&gt;pgvector queries are fast, but embedding vectors are large (1536 floats ≈ 6KB per vector). If you're running a web application with concurrent requests, use a connection pool to avoid exhausting PostgreSQL's connection limit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;psycopg_pool&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ConnectionPool&lt;/span&gt;

&lt;span class="n"&gt;pool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ConnectionPool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;min_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;configure&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;register_vector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT content FROM documents ORDER BY embedding &amp;lt;=&amp;gt; %s LIMIT 5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On Rivestack, connection pooling is built into the platform so you don't need to manage PgBouncer yourself — you connect to the pooler endpoint and the rest is handled for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Async Support
&lt;/h2&gt;

&lt;p&gt;If you're building with FastAPI, asyncio, or any other async Python framework, psycopg3 has full async support:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pgvector.psycopg&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;register_vector_async&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AsyncConnection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;register_vector_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
            SELECT content FROM documents
            ORDER BY embedding &amp;lt;=&amp;gt; %s LIMIT 5
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the &lt;code&gt;register_vector_async&lt;/code&gt; — you need the async version when working with &lt;code&gt;AsyncConnection&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Mistakes
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Using the wrong distance operator for your index.&lt;/strong&gt; If you create an HNSW index with &lt;code&gt;vector_cosine_ops&lt;/code&gt; but query with &lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt; (L2 distance), the index won't be used and you'll get a full sequential scan. Match the operator to the index ops class.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Forgetting &lt;code&gt;register_vector&lt;/code&gt;.&lt;/strong&gt; You'll get &lt;code&gt;can't adapt type 'list'&lt;/code&gt; errors. Call it once per connection (or once on the pool).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Building IVFFlat indexes on empty tables.&lt;/strong&gt; IVFFlat computes its cluster centroids from the rows that exist at index creation time, so an index built on an empty table clusters nothing and gives poor recall once data arrives. Load your data first, then create the index. (HNSW builds incrementally, so it doesn't have this problem.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not setting &lt;code&gt;ef_search&lt;/code&gt; for precision-sensitive use cases.&lt;/strong&gt; The default is 40, which is fast but sacrifices some recall. For RAG where missing a relevant document matters, &lt;code&gt;SET hnsw.ef_search = 100&lt;/code&gt; is worth the small latency increase.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to Go From Here
&lt;/h2&gt;

&lt;p&gt;You now have everything you need to build vector search into a Python application using pgvector:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;psycopg3 for direct, fast database access&lt;/li&gt;
&lt;li&gt;SQLAlchemy if you prefer an ORM&lt;/li&gt;
&lt;li&gt;HNSW indexes for production performance&lt;/li&gt;
&lt;li&gt;Metadata filtering using standard SQL &lt;code&gt;WHERE&lt;/code&gt; clauses&lt;/li&gt;
&lt;li&gt;A full RAG pipeline skeleton you can extend&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The operational side — backups, connection pooling, high availability, keeping pgvector updated — is where the real work begins if you're self-hosting. If you'd rather not deal with that, &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;try Rivestack&lt;/a&gt;: managed PostgreSQL with pgvector pre-installed, NVMe storage for fast index traversal, and automatic backups. You get the same SQL interface you've just built, without the 3 AM pages.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/how-to-use-pgvector-with-python" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>python</category>
      <category>tutorial</category>
      <category>ai</category>
    </item>
    <item>
      <title>pgvector Cosine Distance: How &lt;=&gt; Actually Works</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Fri, 27 Mar 2026 21:17:05 +0000</pubDate>
      <link>https://forem.com/geekyfox90/pgvector-cosine-distance-how-actually-works-1458</link>
      <guid>https://forem.com/geekyfox90/pgvector-cosine-distance-how-actually-works-1458</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;When we first wired up pgvector for semantic search, I assumed &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; (cosine distance) was just "the normal one." Then a colleague asked why our similarity scores were sometimes above 1.0, and I realized I'd been cargo-culting the operator without understanding what it actually returns.&lt;/p&gt;

&lt;p&gt;The pgvector cosine distance operator &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; is probably the most used distance function in similarity search — and the most misunderstood. Here's what's actually happening under the hood, when to use it, and when to switch.&lt;/p&gt;

&lt;h2&gt;
  
  
  What &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; Returns
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; returns &lt;strong&gt;cosine distance&lt;/strong&gt;, not cosine similarity. The relationship is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cosine_distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;cosine_similarity&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So two identical vectors return &lt;code&gt;0&lt;/code&gt; (not &lt;code&gt;1&lt;/code&gt;), and two orthogonal vectors return &lt;code&gt;1&lt;/code&gt;. Two vectors pointing in opposite directions return &lt;code&gt;2&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Identical vectors: distance = 0&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'[1,0,0]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[1,0,0]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: 0&lt;/span&gt;

&lt;span class="c1"&gt;-- Opposite vectors: distance = 2&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'[1,0,0]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[-1,0,0]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: 2&lt;/span&gt;

&lt;span class="c1"&gt;-- Find the 5 nearest neighbors by cosine distance&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[0.1, 0.3, ...]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[0.1, 0.3, ...]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This trips people up constantly. If you're used to thinking "higher = more similar," you need to flip your mental model. With &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;, lower scores are better.&lt;/p&gt;
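&lt;p&gt;You can sanity-check these values with a few lines of NumPy (a standalone sketch; the function name is mine, not pgvector's):&lt;/p&gt;

```python
import numpy as np

def cosine_distance(a, b):
    # 1 minus cosine similarity, matching what pgvector's cosine operator returns
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_distance([1, 0, 0], [1, 0, 0]))   # identical  -- 0.0
print(cosine_distance([1, 0, 0], [0, 1, 0]))   # orthogonal -- 1.0
print(cosine_distance([1, 0, 0], [-1, 0, 0]))  # opposite   -- 2.0
```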

&lt;h2&gt;
  
  
  Three Operators, Three Use Cases
&lt;/h2&gt;

&lt;p&gt;pgvector exposes three distance operators:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operator&lt;/th&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cosine distance&lt;/td&gt;
&lt;td&gt;NLP embeddings, semantic search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;L2 (Euclidean) distance&lt;/td&gt;
&lt;td&gt;Image embeddings, dense numeric vectors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Negative inner product&lt;/td&gt;
&lt;td&gt;When vectors are already unit-normalized&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; when&lt;/strong&gt;: your embeddings come from an NLP model (OpenAI, Cohere, Mistral, etc.) and you care about directional similarity rather than magnitude. Most language model embeddings encode meaning in the direction of the vector, not its length.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt; when&lt;/strong&gt;: magnitude matters. Image feature vectors or tabular embeddings where distance in absolute space is meaningful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt; when&lt;/strong&gt;: your vectors are already L2-normalized (unit length). It returns the negative dot product, and for unit vectors the dot product equals the cosine similarity, so ordering by &lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt; ascending produces the same ranking as &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;. It's also faster, because it skips the normalization step.&lt;/p&gt;
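&lt;p&gt;The ranking equivalence is easy to verify in NumPy: for unit vectors, negative inner product and cosine distance differ only by a constant shift, so they sort identically (a standalone sketch with random data):&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(100, 8))
docs /= np.linalg.norm(docs, axis=1, keepdims=True)  # L2-normalize each row
query = rng.normal(size=8)
query /= np.linalg.norm(query)

neg_ip = -(docs @ query)         # what the inner-product operator returns
cos_dist = 1.0 - (docs @ query)  # what the cosine operator returns for unit vectors

# Same ordering: the two scores differ only by a constant offset of 1
assert (np.argsort(neg_ip) == np.argsort(cos_dist)).all()
print("rankings match")
```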

&lt;h2&gt;
  
  
  The Normalization Shortcut
&lt;/h2&gt;

&lt;p&gt;If you pre-normalize your embeddings before inserting them, &lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt; is strictly faster than &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; and produces equivalent ranking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Normalize on insert (Python)&lt;/span&gt;
&lt;span class="n"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;def&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;norm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vec&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt; &lt;span class="n"&gt;if&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;vec&lt;/span&gt;

&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;-- Then query with inner product&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;#&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_vec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;#&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_vec&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We benchmarked this on a 1M-row table with 1536-dim vectors. Pre-normalizing and using &lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt; cut query time by ~18% vs &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; on an IVFFlat index. Not huge, but free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Index Compatibility
&lt;/h2&gt;

&lt;p&gt;One thing that bites teams: &lt;strong&gt;your index operator class must match your query operator&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- For cosine distance queries (&amp;lt;=&amp;gt;), use vector_cosine_ops&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;ivfflat&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector_cosine_ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- For L2 queries (&amp;lt;-&amp;gt;), use vector_l2_ops&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;ivfflat&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector_l2_ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you create an &lt;code&gt;ivfflat&lt;/code&gt; index with &lt;code&gt;vector_l2_ops&lt;/code&gt; and then query with &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;, Postgres will do a full sequential scan. No error, no warning — just slow queries that silently skip the index. Run &lt;code&gt;EXPLAIN&lt;/code&gt; and look for &lt;code&gt;Index Scan&lt;/code&gt; vs &lt;code&gt;Seq Scan&lt;/code&gt; to confirm your index is being used.&lt;/p&gt;
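&lt;p&gt;If you want to automate that check, a tiny helper over the EXPLAIN output is enough (a sketch; the helper name and plan text here are illustrative):&lt;/p&gt;

```python
def uses_index(plan_lines):
    """True if any line of EXPLAIN output reports an index scan."""
    return any("Index Scan" in line for line in plan_lines)

# With psycopg you'd fetch the plan rows from an EXPLAIN query, e.g.:
#   cur.execute("EXPLAIN SELECT ... ORDER BY embedding ... LIMIT 5", (vec,))
#   plan = [row[0] for row in cur.fetchall()]
plan = [
    "Limit  (cost=12.34..56.78 rows=5 width=40)",
    "  ->  Index Scan using documents_embedding_idx on documents",
]
print(uses_index(plan))                       # True: the index is being used
print(uses_index(["Seq Scan on documents"]))  # False: silent sequential scan
```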

&lt;h2&gt;
  
  
  Picking Lists and Probes
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;lists&lt;/code&gt; is the number of clusters IVFFlat partitions your data into. A good starting point is &lt;code&gt;sqrt(row_count)&lt;/code&gt;. At query time, &lt;code&gt;ivfflat.probes&lt;/code&gt; controls how many clusters to search:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Set probes at session level (or per query)&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;ivfflat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;probes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[...]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dist&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;dist&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Higher &lt;code&gt;probes&lt;/code&gt; = better recall, slower queries. For most semantic search workloads, &lt;code&gt;probes = 10&lt;/code&gt; gives 95%+ recall at reasonable speed. Test with your actual data.&lt;/p&gt;
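&lt;p&gt;Those rules of thumb fit in a small helper (my own naming; these are starting points to tune against recall on your own data, not hard rules):&lt;/p&gt;

```python
import math

def suggest_ivfflat_params(row_count):
    """Rule-of-thumb IVFFlat tuning: lists near sqrt(row_count),
    probes near sqrt(lists). Starting points only -- measure
    recall and latency on your actual workload."""
    lists = max(10, round(math.sqrt(row_count)))
    probes = max(1, round(math.sqrt(lists)))
    return lists, probes

print(suggest_ivfflat_params(1_000_000))  # (1000, 32)
```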

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;The pgvector cosine distance operator &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; is the right default for language model embeddings — but know what it returns (distance, not similarity), verify your index operator class matches your query operator, and consider pre-normalizing if you're chasing extra performance.&lt;/p&gt;

&lt;p&gt;If you want pgvector running on managed Postgres without tuning kernel params or fighting storage I/O, &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;Rivestack&lt;/a&gt; handles that — provisioned in 30 seconds with the pgvector extension pre-enabled.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>pgvector</category>
      <category>database</category>
    </item>
    <item>
      <title>I built a managed pgvector service and here's what I learned about vector search performance</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Wed, 25 Mar 2026 09:01:38 +0000</pubDate>
      <link>https://forem.com/geekyfox90/i-built-a-managed-pgvector-service-and-heres-what-i-learned-about-vector-search-performance-ehl</link>
      <guid>https://forem.com/geekyfox90/i-built-a-managed-pgvector-service-and-heres-what-i-learned-about-vector-search-performance-ehl</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrmi6fijilv2pvd27uua.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrmi6fijilv2pvd27uua.png" alt="NVMe vs cloud SSD pgvector performance benchmark visualization in Firewatch art style" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I ran pgvector on NVMe vs cloud SSD, and the difference shocked me: 2,000 queries per second at under 4ms, on a $35/month server. Let me tell you how I got there and what I had to build to make it work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem started with a side project
&lt;/h2&gt;

&lt;p&gt;I was building a RAG pipeline. Standard stuff: OpenAI embeddings, PostgreSQL, pgvector extension, HNSW index. Everything worked fine in development. Then I moved it to a managed database on a regular cloud provider and watched my query latency go from 4ms to 47ms under any real load.&lt;/p&gt;

&lt;p&gt;I spent two days thinking my index was wrong. Wrong ef_search value. Wrong m parameter. Wrong dimension count. None of that was it.&lt;br&gt;
The problem was the disk.&lt;/p&gt;

&lt;p&gt;HNSW does not behave like a normal database query&lt;/p&gt;

&lt;p&gt;Most database queries are sequential reads. Your disk reads a chunk of data in order, hands it back, done. SSDs are fast at this. Even gp3 cloud SSDs are fast at this.&lt;/p&gt;

&lt;p&gt;HNSW is different. An HNSW index traversal is essentially a graph walk. You start at an entry point, compare distances, jump to neighbors, compare again, jump again. Each jump is a random read to a different location on disk. The more vectors you have, the more jumps, the more random reads.&lt;/p&gt;

&lt;p&gt;Cloud SSDs are slow at random reads. NVMe drives are very fast at random reads. That gap, which barely matters for most database workloads, is everything for pgvector at scale.&lt;br&gt;
I didn't believe it until I benchmarked it myself.&lt;/p&gt;

&lt;p&gt;The actual numbers&lt;/p&gt;

&lt;p&gt;I tested 1 million vectors at 1536 dimensions (OpenAI text-embedding-3-small), HNSW index, cosine distance, ef_search=40, 16 concurrent clients.&lt;/p&gt;
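
&lt;p&gt;As a rough sketch of that setup in SQL (the table and column names are illustrative, not my actual benchmark schema):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- pgvector setup matching the benchmark parameters above
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
    id        bigserial PRIMARY KEY,
    embedding vector(1536)  -- text-embedding-3-small dimensions
);

CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops);

-- per-session search width used in the benchmark
SET hnsw.ef_search = 40;

-- the benchmarked query shape: $1 is a 1536-dim query vector
SELECT id
FROM items
ORDER BY embedding &amp;lt;=&amp;gt; $1::vector
LIMIT 10;
&lt;/code&gt;&lt;/pre&gt;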

&lt;p&gt;On a cloud SSD backed instance: around 410 QPS, p95 latency at 18ms.&lt;br&gt;
On NVMe: 2,150 QPS, p95 at 2.8ms.&lt;br&gt;
Same PostgreSQL version. Same pgvector version. Same query. Same index parameters. Five times the throughput, six times lower latency. Just from the storage layer.&lt;/p&gt;

&lt;p&gt;That's when I decided to build Rivestack.&lt;/p&gt;

&lt;p&gt;Why PostgreSQL and not a dedicated vector database&lt;/p&gt;

&lt;p&gt;Honest answer: because I didn't want to manage two databases.&lt;br&gt;
The moment you move your vectors to Pinecone or Qdrant or Weaviate, you have split your data across two systems. Your relational data lives in Postgres. Your vectors live somewhere else. Every query that needs both involves a round trip between systems. That latency adds up fast.&lt;/p&gt;

&lt;p&gt;With pgvector you write one query. You can filter by user_id, join against your documents table, and do a vector similarity search in a single SQL statement. That's not a minor convenience. It changes how you architect the whole application.&lt;/p&gt;

&lt;p&gt;The only real argument against pgvector is scale. If you have a billion vectors, you need a dedicated system. For the other 99% of applications, pgvector on NVMe is genuinely competitive.&lt;/p&gt;
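
&lt;p&gt;Here is roughly what that single statement looks like; the documents and chunks tables are hypothetical, just to show the shape:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- relational filter, join, and vector search in one statement
SELECT d.id, d.title
FROM chunks c
JOIN documents d ON d.id = c.doc_id
WHERE d.user_id = 42
ORDER BY c.embedding &amp;lt;=&amp;gt; $1::vector  -- $1: query embedding
LIMIT 5;
&lt;/code&gt;&lt;/pre&gt;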

&lt;p&gt;What I actually built&lt;/p&gt;

&lt;p&gt;Rivestack is a managed PostgreSQL service where pgvector is the whole point, not an afterthought. NVMe storage on every plan. HNSW pre-configured with sensible defaults. Daily backups, point-in-time recovery, HA failover if you want it.&lt;/p&gt;

&lt;p&gt;The thing I kept running into with other managed Postgres providers is that pgvector is just... there. An extension you can enable. Nobody has tuned the storage layer for it. Nobody has written documentation for RAG use cases. Nobody has thought about what happens to your index when you have 5 million vectors and 50 concurrent clients.&lt;/p&gt;

&lt;p&gt;That's the gap I'm filling.&lt;/p&gt;

&lt;p&gt;The honest trade-offs&lt;/p&gt;

&lt;p&gt;Rivestack is not for everyone. If you need built-in auth, storage buckets, or a real-time WebSocket layer, use Supabase. They're great at that. If you have hundreds of millions of vectors and need automatic sharding, use Pinecone. They're built for that.&lt;/p&gt;

&lt;p&gt;Rivestack is for teams who need pgvector to actually perform under load and don't want to spend three days tuning PostgreSQL configuration to get there.&lt;/p&gt;

&lt;p&gt;What I'd do differently&lt;/p&gt;

&lt;p&gt;I spent too long on the benchmark tooling before I had a single user. Classic founder mistake. The infrastructure is solid but I should have shipped earlier and iterated based on real workloads instead of synthetic ones.&lt;/p&gt;

&lt;p&gt;Also I underestimated how much developers care about EU data residency. It's come up in every conversation I've had since launch. If you're building anything GDPR-adjacent, knowing your vectors never leave EU territory is not a minor detail.&lt;/p&gt;

&lt;p&gt;Try it if you're building with embeddings&lt;/p&gt;

&lt;p&gt;Free shared tier at rivestack.io, no credit card required. Paid plans start at $35/month for a dedicated node with 2 vCPU, 4GB RAM and 55GB NVMe.&lt;br&gt;
If you're already running pgvector somewhere, migration is just pg_dump and pg_restore. Takes about five minutes for most databases.&lt;/p&gt;

&lt;p&gt;What are you using for vector storage right now? And what's the biggest pain point you've hit with it? Genuinely curious whether the storage layer issue I ran into is common or whether I just had unusually bad luck with my cloud provider.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>vectordatabase</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>I indexed 30 million Hacker News posts into a single Postgres database. Here is what I learned about pgvector at scale.</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Sun, 22 Feb 2026 15:53:49 +0000</pubDate>
      <link>https://forem.com/geekyfox90/i-indexed-30-million-hacker-news-posts-into-a-single-postgres-database-here-is-what-i-learned-a3n</link>
      <guid>https://forem.com/geekyfox90/i-indexed-30-million-hacker-news-posts-into-a-single-postgres-database-here-is-what-i-learned-a3n</guid>
      <description>&lt;p&gt;I have been running Kubernetes clusters and CI/CD pipelines for a living for years. Big org, 5000 developers, the usual enterprise stuff. But on the side I have always been building things. A few weeks ago I launched Rivestack, a managed Postgres service built around pgvector for AI workloads. The problem was obvious: nobody knew it existed and I had zero marketing budget.&lt;/p&gt;

&lt;p&gt;So instead of writing blog posts nobody would read or tweeting into the void, I decided to build something useful and give it away. I built a semantic search engine over the entire Hacker News archive.&lt;br&gt;
You can try it here: &lt;a href="https://ask.rivestack.io" rel="noopener noreferrer"&gt;https://ask.rivestack.io&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Type any question in plain English and it finds relevant HN threads by meaning, not by matching keywords. Search for "how to deal with burnout as a solo founder" and you will get threads where people talked about exactly that, even if they never used the word burnout.&lt;/p&gt;

&lt;p&gt;The whole thing runs on a single Postgres instance with pgvector. No Pinecone. No Weaviate. No Qdrant. Just Postgres.&lt;/p&gt;

&lt;p&gt;Here is what I learned building it.&lt;/p&gt;

&lt;p&gt;pgvector is way faster than people think&lt;/p&gt;

&lt;p&gt;There is this narrative that pgvector is fine for prototyping but you need a "real" vector database for production. That has not been my experience at all.&lt;/p&gt;

&lt;p&gt;With HNSW indexes on NVMe storage, I am getting sub 4ms query latency at 99% recall. For context, that is fast enough that users cannot tell the difference between this and a keyword search. The bottleneck in most RAG and search applications is not the vector lookup. It is everything else around it.&lt;/p&gt;

&lt;p&gt;I spent a lot of time tuning HNSW parameters and the biggest takeaway is that the defaults are actually pretty good for most workloads. The main thing that matters is having enough memory for the index and using NVMe storage. If your index fits in memory, pgvector flies.&lt;/p&gt;

&lt;p&gt;HNSW vs IVFFlat is not even a close decision anymore&lt;/p&gt;

&lt;p&gt;I started with IVFFlat because it uses less memory. Bad idea. The recall was inconsistent and it degrades as you add and delete data because the centroids get stale. I switched to HNSW and never looked back. The build time is slower but the query performance and recall stability are worth it.&lt;/p&gt;
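
&lt;p&gt;For reference, building the HNSW index is a single statement; the values shown are pgvector's defaults, and the table name is illustrative:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- m: neighbors per graph node; ef_construction: build-time candidate list.
-- Raising either improves recall at the cost of build time and memory.
CREATE INDEX ON posts
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
&lt;/code&gt;&lt;/pre&gt;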

&lt;p&gt;If you are starting a new project in 2026, just use HNSW. The memory overhead is manageable and the results are dramatically better.&lt;/p&gt;

&lt;p&gt;Keeping everything in one database is a superpower&lt;/p&gt;

&lt;p&gt;This is the part that surprised me the most. When your embeddings live next to your relational data, things get simple in ways you do not expect.&lt;/p&gt;

&lt;p&gt;Want to filter search results by date? Just add a WHERE clause. Want to boost recent posts? Order by a combination of vector distance and recency. Want to update an embedding when the source data changes? Do it in a single transaction and you never have stale data.&lt;/p&gt;
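
&lt;p&gt;A sketch of those three patterns, against a hypothetical posts(id, body, created_at, embedding) table:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- filter by date: just a WHERE clause next to the vector search
SELECT id
FROM posts
WHERE created_at &amp;gt; now() - interval '1 year'
ORDER BY embedding &amp;lt;=&amp;gt; $1::vector
LIMIT 10;

-- boost recent posts: mix distance with a small recency penalty
-- (a combined expression like this bypasses the HNSW index, so in
-- practice apply it to a pre-fetched candidate set)
SELECT id
FROM posts
ORDER BY (embedding &amp;lt;=&amp;gt; $1::vector)
       + extract(epoch FROM now() - created_at) / 1e9
LIMIT 10;

-- update text and embedding atomically, so they never drift apart
BEGIN;
UPDATE posts SET body = $2, embedding = $3 WHERE id = $1;
COMMIT;
&lt;/code&gt;&lt;/pre&gt;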

&lt;p&gt;With a separate vector database you need to build syncing logic, handle failures where one system updated but the other did not, and manage two sets of credentials, backups, and monitoring. With pgvector it is just Postgres. The same backups, the same monitoring, the same failover you already know.&lt;/p&gt;

&lt;p&gt;Embeddings are the easy part. Data cleaning is the hard part.&lt;/p&gt;

&lt;p&gt;I expected the embedding generation to be the big challenge. It was not. The real work was cleaning and preparing the HN data. Dealing with deleted posts, HTML entities, encoding issues, extremely long comments that need to be chunked, and short comments that are just "this" or "+1" and add noise to the index.&lt;/p&gt;
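
&lt;p&gt;Much of that cleanup is plain SQL on a staging table before you ever generate an embedding. A sketch, with a hypothetical hn_comments(id, body, deleted) table:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- drop deleted posts and near-empty comments before embedding:
-- "this" and "+1" style replies only add noise to the index
DELETE FROM hn_comments
WHERE deleted
   OR length(trim(body)) &amp;lt; 20;
&lt;/code&gt;&lt;/pre&gt;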

&lt;p&gt;I probably spent 70% of my time on data preparation and 30% on everything else combined. If you are building something similar, budget your time accordingly.&lt;/p&gt;

&lt;p&gt;The cost surprised me too&lt;/p&gt;

&lt;p&gt;I keep hearing that vector search is expensive. Maybe it is if you are paying Pinecone prices at scale. Running this on a single Postgres instance with pgvector costs me almost nothing compared to what a dedicated vector database would charge for the same dataset. The entire infrastructure for ask.rivestack.io costs less than what some teams spend on their vector database alone.&lt;/p&gt;

&lt;p&gt;That is not a knock on Pinecone or Qdrant. They solve real problems at very large scale. But for the vast majority of applications, the ones with under 50 million vectors, pgvector on properly configured Postgres is more than enough.&lt;/p&gt;

&lt;p&gt;What I would do differently&lt;/p&gt;

&lt;p&gt;If I were starting over, three things:&lt;/p&gt;

&lt;p&gt;First, I would use HNSW from day one instead of wasting a week on IVFFlat.&lt;/p&gt;

&lt;p&gt;Second, I would invest more in data quality upfront. Garbage embeddings give garbage results no matter how good your index is.&lt;/p&gt;

&lt;p&gt;Third, I would set up monitoring for recall quality from the start. It is easy to ship something that seems to work and not notice that results are degrading as your dataset grows.&lt;/p&gt;

&lt;p&gt;Why I built this as a product&lt;/p&gt;

&lt;p&gt;After going through all of this, I realized that the hard part of pgvector is not pgvector itself. It is everything around it: provisioning, backups, HA, monitoring, index tuning, keeping Postgres updated. The vector search part is actually the easy part once the infrastructure is solid.&lt;/p&gt;

&lt;p&gt;That is why I built Rivestack (&lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;https://rivestack.io&lt;/a&gt;). It is managed Postgres with pgvector already configured and optimized. You get a database in minutes with backups, metrics, SSL, and autoscaling. There is a free tier if you want to try it.&lt;/p&gt;

&lt;p&gt;I am a one person team running this alongside my day job so I am not trying to compete with Supabase or Neon on features. I am just trying to be the simplest option for people who want pgvector to work without thinking about infrastructure.&lt;/p&gt;

&lt;p&gt;Try the search&lt;/p&gt;

&lt;p&gt;The semantic search over HN is free and will stay free. Go play with it: &lt;a href="https://ask.rivestack.io" rel="noopener noreferrer"&gt;https://ask.rivestack.io&lt;/a&gt;&lt;br&gt;
If you are building something with pgvector and have questions about tuning or architecture, drop them in the comments. I have made most of the mistakes already so I might be able to save you some time.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>database</category>
      <category>postgres</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
