<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Yasser B.</title>
    <description>The latest articles on Forem by Yasser B. (@geekyfox90).</description>
    <link>https://forem.com/geekyfox90</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3785198%2F1098d18e-2c79-44f4-bcdd-1d928ee365f0.png</url>
      <title>Forem: Yasser B.</title>
      <link>https://forem.com/geekyfox90</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/geekyfox90"/>
    <language>en</language>
    <item>
      <title>PostgreSQL Row Level Security: A Complete Guide</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Wed, 22 Apr 2026 18:20:31 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-row-level-security-a-complete-guide-2l4</link>
      <guid>https://forem.com/geekyfox90/postgresql-row-level-security-a-complete-guide-2l4</guid>
      <description>&lt;p&gt;Your application code knows which tenant owns which row. Your ORM always filters by &lt;code&gt;WHERE tenant_id = $1&lt;/code&gt;. Your team has reviewed the queries and they look fine.&lt;/p&gt;

&lt;p&gt;Then someone forgets the WHERE clause. Or a bulk operation skips the filter. Or a new developer writes a raw query without knowing the convention. Suddenly one tenant can read another tenant's data, and you find out from a support ticket two weeks later.&lt;/p&gt;

&lt;p&gt;Row Level Security (RLS) moves the tenant isolation logic inside PostgreSQL itself. The database enforces the policy automatically on every access, regardless of how the query was written.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Row Level Security Does
&lt;/h2&gt;

&lt;p&gt;Enable RLS on a table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="n"&gt;ENABLE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without any policies, no rows are visible to non-superusers. The safe default is deny, not permit. Then create a policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;documents_tenant_isolation&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.tenant_id'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting the Tenant Context
&lt;/h2&gt;

&lt;p&gt;Always use &lt;code&gt;SET LOCAL&lt;/code&gt; (not &lt;code&gt;SET&lt;/code&gt;) with connection poolers. &lt;code&gt;SET LOCAL&lt;/code&gt; resets when the transaction ends, so pooled connections do not carry the wrong tenant context into the next request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="k"&gt;LOCAL&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'550e8400-e29b-41d4-a716-446655440000'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- your queries here&lt;/span&gt;
&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  FORCE ROW LEVEL SECURITY
&lt;/h2&gt;

&lt;p&gt;Table owners bypass RLS by default. Close this gap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;FORCE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without it, an application connecting as the table owner silently ignores all policies. This is the most common RLS gotcha.&lt;/p&gt;

&lt;h2&gt;
  
  
  Permissive vs Restrictive Policies
&lt;/h2&gt;

&lt;p&gt;Multiple policies on the same operation combine with OR by default (permissive). For rules that must always apply, use &lt;code&gt;AS RESTRICTIVE&lt;/code&gt;. Restrictive policies combine with AND against all other policies.&lt;/p&gt;
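&lt;p&gt;As a sketch of the restrictive form (the &lt;code&gt;deleted_at&lt;/code&gt; column here is hypothetical, for illustration): a policy that hides soft-deleted rows regardless of what the permissive tenant policy allows could look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Combined with AND against every other policy on the table
CREATE POLICY documents_hide_deleted
  ON documents AS RESTRICTIVE FOR ALL
  USING (deleted_at IS NULL);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;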

&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;Add an index on the tenant_id column:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_documents_tenant_id&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without it, the policy's &lt;code&gt;tenant_id&lt;/code&gt; comparison cannot use an index, and every query the policy filters degrades to a full table scan.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Mistakes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Not using &lt;code&gt;FORCE ROW LEVEL SECURITY&lt;/code&gt; when the app connects as the table owner&lt;/li&gt;
&lt;li&gt;Using &lt;code&gt;SET&lt;/code&gt; instead of &lt;code&gt;SET LOCAL&lt;/code&gt; with PgBouncer in transaction mode (tenant context leaks between clients)&lt;/li&gt;
&lt;li&gt;Missing the index on the &lt;code&gt;tenant_id&lt;/code&gt; column&lt;/li&gt;
&lt;li&gt;Not testing cross-tenant access explicitly in your test suite&lt;/li&gt;
&lt;/ul&gt;
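
&lt;p&gt;The last point is worth making concrete. A minimal cross-tenant check (the UUIDs are placeholders, assuming the &lt;code&gt;app.tenant_id&lt;/code&gt; convention from above) might look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;BEGIN;
SET LOCAL app.tenant_id = '11111111-1111-1111-1111-111111111111';
-- The policy should make the other tenant's rows invisible,
-- so this should count zero rows
SELECT count(*) FROM documents
WHERE tenant_id = '22222222-2222-2222-2222-222222222222';
ROLLBACK;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;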

&lt;p&gt;For the full guide with multi-tenant schema setup, testing patterns, EXPLAIN output, and inspecting existing policies, read the full post at rivestack.io.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgresql-row-level-security" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgresql</category>
      <category>security</category>
      <category>database</category>
      <category>webdev</category>
    </item>
    <item>
      <title>PostgreSQL Monitoring: A Practical Guide to Metrics, Tools, and Alerts</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Wed, 22 Apr 2026 15:17:27 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-monitoring-a-practical-guide-to-metrics-tools-and-alerts-ink</link>
      <guid>https://forem.com/geekyfox90/postgresql-monitoring-a-practical-guide-to-metrics-tools-and-alerts-ink</guid>
      <description>&lt;p&gt;Something is wrong with your database. You just don't know it yet.&lt;/p&gt;

&lt;p&gt;Maybe it's a query that takes 200ms on average but spikes to 8 seconds every few minutes. Maybe autovacuum fell behind three weeks ago and your tables are now bloated to twice their logical size. Maybe replication lag crept from 100ms to 4 seconds and nobody noticed because the app still works. Until it doesn't.&lt;/p&gt;

&lt;p&gt;PostgreSQL gives you everything you need to catch these problems early. The issue is knowing which of the hundreds of available metrics actually matter, how to collect them without adding overhead, and what to alert on versus what to watch passively. This guide walks through all of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why PostgreSQL Monitoring Is Different
&lt;/h2&gt;

&lt;p&gt;PostgreSQL is not a black box. It exposes a rich set of internal statistics through system views that you can query with plain SQL. No special agent, no proprietary protocol. The challenge is not access but interpretation: there are dozens of views, each tracking a different subsystem, and the relationships between them are not obvious until you've debugged a few production incidents.&lt;/p&gt;

&lt;p&gt;The other challenge is overhead. Collecting metrics too aggressively can itself impact performance, particularly on systems with high connection counts or heavy write workloads.&lt;/p&gt;

&lt;p&gt;A good monitoring setup has three layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Metrics collection&lt;/strong&gt;: snapshot key numbers every 15 to 60 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query analysis&lt;/strong&gt;: track which queries are slow and how often they run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alerts&lt;/strong&gt;: fire on conditions that require human action, not on every blip&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Metrics That Actually Matter
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Connection Usage
&lt;/h3&gt;

&lt;p&gt;PostgreSQL creates a new OS process for every client connection. This isn't free. Each connection consumes around 5 to 10MB of RAM, and the OS has to schedule and context-switch between all of them.&lt;/p&gt;

&lt;p&gt;The most important connection metric is what fraction of your &lt;code&gt;max_connections&lt;/code&gt; limit you're actually using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_connections&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FILTER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;active&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FILTER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'idle'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;idle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FILTER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'idle in transaction'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;idle_in_transaction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;setting&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_settings&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'max_connections'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;max_connections&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_activity&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pg_backend_pid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;idle in transaction&lt;/code&gt; connections are particularly dangerous. A connection sitting in an open transaction holds locks and prevents autovacuum from cleaning up dead rows. If you see this number climbing, you likely have application code that opens transactions and then waits on external resources before committing.&lt;/p&gt;

&lt;p&gt;For most applications, if you're regularly exceeding 80% of &lt;code&gt;max_connections&lt;/code&gt;, you need &lt;a href="https://dev.to/blog/postgresql-connection-pooling-pgbouncer"&gt;connection pooling&lt;/a&gt;. PgBouncer in transaction mode can multiplex hundreds of application connections through a small handful of actual Postgres connections.&lt;/p&gt;

&lt;h3&gt;
  
  
  Query Performance
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;pg_stat_statements&lt;/code&gt; is the single most useful view for understanding production performance. It tracks execution statistics for every distinct query seen since the extension was last reset, aggregated across all calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;calls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_exec_time&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean_exec_time&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;mean_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stddev_exec_time&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;stddev_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;rows&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_statements&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_exec_time&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;total_exec_time&lt;/code&gt; column gives you the queries consuming the most cumulative time. This is usually the right place to start: a fast query that runs a million times a day often costs more total time than a slow query that runs once.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;stddev_exec_time&lt;/code&gt; column is underused. A query with a mean of 10ms and a stddev of 200ms has pathological behavior: most calls are fast but occasional calls are catastrophically slow. This pattern often means missing indexes on filtered columns or a query plan that changes depending on parameter values.&lt;/p&gt;
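
&lt;p&gt;To surface those high-variance queries directly, you can reuse the same view and sort by the spread instead (the &lt;code&gt;calls&lt;/code&gt; floor is an arbitrary filter to cut out one-off noise):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT
  query,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  round(stddev_exec_time::numeric, 2) AS stddev_ms
FROM pg_stat_statements
WHERE calls &amp;gt; 100
ORDER BY stddev_exec_time DESC
LIMIT 20;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;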

&lt;p&gt;To enable &lt;code&gt;pg_stat_statements&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Add to postgresql.conf:&lt;/span&gt;
&lt;span class="c1"&gt;-- shared_preload_libraries = 'pg_stat_statements'&lt;/span&gt;

&lt;span class="c1"&gt;-- After restart:&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;pg_stat_statements&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cache Hit Ratio
&lt;/h3&gt;

&lt;p&gt;PostgreSQL uses shared buffers as its in-memory cache. If your working set fits in memory, most reads are served from cache rather than disk. The buffer hit ratio tells you how well this is working:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_read&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;heap_read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_hit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;heap_hit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_hit&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;
    &lt;span class="k"&gt;nullif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_hit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;heap_blks_read&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;hit_ratio_pct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_statio_user_tables&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A ratio below 90% on an OLTP database usually means your &lt;code&gt;shared_buffers&lt;/code&gt; setting is too small, or your working set is genuinely larger than available RAM. The fix is either more RAM or a read replica to distribute load.&lt;/p&gt;

&lt;h3&gt;
  
  
  Table Bloat and Autovacuum
&lt;/h3&gt;

&lt;p&gt;PostgreSQL uses MVCC (multi-version concurrency control). When you update or delete a row, the old version stays on disk until autovacuum cleans it up. If autovacuum falls behind, tables accumulate dead rows: "bloat." A highly bloated table is slower to scan and wastes disk space.&lt;/p&gt;

&lt;p&gt;This query shows which tables have the most dead tuples relative to live ones:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;schemaname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;relname&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;n_live_tup&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;n_dead_tup&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_dead_tup&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;nullif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_live_tup&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;n_dead_tup&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dead_pct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;last_autovacuum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;last_autoanalyze&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_user_tables&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;n_live_tup&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;n_dead_tup&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tables with a &lt;code&gt;dead_pct&lt;/code&gt; above 20% need attention. If autovacuum is running but not keeping up, you need to tune it: either lower &lt;code&gt;autovacuum_vacuum_scale_factor&lt;/code&gt; for large tables, or increase &lt;code&gt;autovacuum_max_workers&lt;/code&gt; if you have many tables that need vacuuming simultaneously.&lt;/p&gt;

&lt;p&gt;For tables that see extremely high write throughput, you can override autovacuum settings at the table level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;autovacuum_vacuum_scale_factor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="n"&gt;autovacuum_vacuum_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells autovacuum to trigger a vacuum on the &lt;code&gt;orders&lt;/code&gt; table after 1% of rows change, rather than the default 20%. For a 10 million row table, the default means waiting until 2 million rows are dead before cleaning up.&lt;/p&gt;

&lt;h3&gt;
  
  
  Replication Lag
&lt;/h3&gt;

&lt;p&gt;If you're running streaming replication (either for HA or read replicas), monitoring lag is critical. On the primary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;client_addr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;sent_lsn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;write_lsn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;flush_lsn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;replay_lsn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;pg_wal_lsn_diff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sent_lsn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replay_lsn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;replication_lag_bytes&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_replication&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;replication_lag_bytes&lt;/code&gt; tells you how far behind each replica is in bytes of WAL. You can convert this to approximate time, but the byte measure is more reliable: elapsed time depends on how busy the primary is, whereas the byte count directly measures how much data has not yet been applied.&lt;/p&gt;
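
&lt;p&gt;If you do want a time-based view, PostgreSQL 10 and later expose interval-valued lag columns in the same view, derived from feedback messages sent by each standby:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT client_addr, write_lag, flush_lag, replay_lag
FROM pg_stat_replication;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;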

&lt;p&gt;For a complete picture of how high availability and replication interact, see our &lt;a href="https://dev.to/blog/postgres-high-availability"&gt;PostgreSQL High Availability guide&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lock Waits
&lt;/h3&gt;

&lt;p&gt;Lock contention is one of the sneakiest performance problems. A query that normally takes 10ms will wait indefinitely if it needs a lock held by another transaction. This query shows active locks and what they're waiting on:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;blocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;blocked_pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;blocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;blocked_query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;blocking&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;blocking_pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;blocking&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;blocking_query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;blocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wait_event_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;blocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wait_event&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_activity&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;blocked&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;pg_stat_activity&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;blocking&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;blocking&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;ANY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pg_blocking_pids&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;cardinality&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pg_blocking_pids&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Long-running &lt;code&gt;idle in transaction&lt;/code&gt; connections are the most common culprit. A connection that opens a transaction, acquires a lock, and then waits on application code blocks every other transaction that needs the same lock. Setting &lt;code&gt;idle_in_transaction_session_timeout = '30s'&lt;/code&gt; (available since PostgreSQL 9.6) makes the server automatically terminate any session that sits idle inside an open transaction for longer than 30 seconds.&lt;/p&gt;
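
&lt;p&gt;A sketch of applying the timeout (the 30-second value and the &lt;code&gt;app_user&lt;/code&gt; role name are illustrative; tune both for your workload):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Cluster-wide, picked up on config reload
ALTER SYSTEM SET idle_in_transaction_session_timeout = '30s';
SELECT pg_reload_conf();

-- Or scoped to the application role only
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '30s';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;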

&lt;h2&gt;
  
  
  Tools for Collecting PostgreSQL Metrics
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prometheus and postgres_exporter
&lt;/h3&gt;

&lt;p&gt;The most common open source stack is &lt;a href="https://github.com/prometheus-community/postgres_exporter" rel="noopener noreferrer"&gt;postgres_exporter&lt;/a&gt; feeding into Prometheus, with Grafana for visualization. postgres_exporter runs as a sidecar, connects to your database with a read-only user, and scrapes all the stats views above on a configurable interval.&lt;/p&gt;

&lt;p&gt;The advantage is flexibility: you control exactly which metrics you collect, how long you retain them, and how you visualize them. The disadvantage is that you own all of it. Setting up a reliable Prometheus stack with proper retention, alerting, and visualization takes real engineering time.&lt;/p&gt;

&lt;p&gt;A minimal postgres_exporter setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# docker-compose.yml&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;postgres-exporter&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prometheuscommunity/postgres-exporter&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;DATA_SOURCE_NAME&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://monitoring_user:password@postgres:5432/mydb?sslmode=require"&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;9187:9187"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;monitoring_user&lt;/code&gt; needs these grants:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;USER&lt;/span&gt; &lt;span class="n"&gt;monitoring_user&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;PASSWORD&lt;/span&gt; &lt;span class="s1"&gt;'password'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;GRANT&lt;/span&gt; &lt;span class="n"&gt;pg_monitor&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;monitoring_user&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;pg_monitor&lt;/code&gt; is a predefined role in PostgreSQL 10+ that grants read access to all monitoring-related views without superuser privileges.&lt;/p&gt;

&lt;h3&gt;
  
  
  pganalyze
&lt;/h3&gt;

&lt;p&gt;pganalyze is a hosted service purpose-built for PostgreSQL monitoring. It provides query performance analysis, index recommendations, and automated bloat detection. The main differentiator is that it normalizes query statistics: it groups queries by their structure (ignoring parameter values) and tracks performance over time with a timeline view that shows when a query got slower and whether that correlates with a schema change or data growth.&lt;/p&gt;

&lt;p&gt;The free tier covers one database and is enough to get started. The main tradeoff versus Prometheus is that you're sending query statistics to an external service, which requires a data processing agreement for regulated industries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Datadog and CloudWatch
&lt;/h3&gt;

&lt;p&gt;If you're already paying for Datadog or AWS CloudWatch, the PostgreSQL integrations are solid and require minimal setup. Datadog's postgres integration scrapes all the key stats views and provides out-of-the-box dashboards and alert policies.&lt;/p&gt;

&lt;p&gt;The tradeoff is cost: Datadog pricing is per host, which gets expensive at scale. CloudWatch RDS Enhanced Monitoring is cheaper but only applies to RDS instances.&lt;/p&gt;

&lt;h3&gt;
  
  
  pg_activity
&lt;/h3&gt;

&lt;p&gt;For interactive debugging, &lt;code&gt;pg_activity&lt;/code&gt; is a top-like interface for PostgreSQL. It shows running queries, their duration, locks, I/O, and CPU usage in real time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pg_activity
pg_activity &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;--dbname&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;mydb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn't for automated monitoring; it's for when you're actively debugging an incident and need to see what's happening right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Alert On
&lt;/h2&gt;

&lt;p&gt;Not every metric worth tracking needs an alert. Alert fatigue is real: if your alerts fire frequently and rarely require action, engineers start ignoring them.&lt;/p&gt;

&lt;p&gt;These are the conditions worth waking someone up for:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;High connection usage.&lt;/strong&gt; Alert when active connections exceed 85% of &lt;code&gt;max_connections&lt;/code&gt;. Once the limit is reached (minus &lt;code&gt;superuser_reserved_connections&lt;/code&gt;), new connections are refused outright, which surfaces as application errors.&lt;/p&gt;
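
&lt;p&gt;A quick way to see where you stand, using &lt;code&gt;pg_stat_activity&lt;/code&gt; (the 85% threshold above is a starting point, not a hard rule):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT
  count(*) AS used_connections,
  current_setting('max_connections')::int AS max_connections,
  round(100.0 * count(*) / current_setting('max_connections')::int, 1) AS pct_used
FROM pg_stat_activity;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;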

&lt;p&gt;&lt;strong&gt;Replication lag above threshold.&lt;/strong&gt; What the threshold is depends on your recovery time objective. For most applications, lag above 60 seconds on a replica that serves traffic is worth alerting on. For HA failover replicas, even 10 seconds of lag means potential data loss in a failover scenario.&lt;/p&gt;
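
&lt;p&gt;On the primary, &lt;code&gt;pg_stat_replication&lt;/code&gt; reports per-replica lag (&lt;code&gt;replay_lag&lt;/code&gt; is available since PostgreSQL 10):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT
  client_addr,
  pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
  replay_lag
FROM pg_stat_replication;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;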

&lt;p&gt;&lt;strong&gt;Long-running transactions.&lt;/strong&gt; Any transaction open for more than 5 minutes is almost certainly stuck or abandoned. This is a lock risk and a bloat risk.&lt;/p&gt;
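
&lt;p&gt;To find them before the alert fires, query &lt;code&gt;pg_stat_activity&lt;/code&gt; for transactions older than the threshold:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT
  pid,
  state,
  now() - xact_start AS xact_age,
  query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start &amp;gt; interval '5 minutes'
ORDER BY xact_age DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;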

&lt;p&gt;&lt;strong&gt;Autovacuum not running.&lt;/strong&gt; If &lt;code&gt;last_autovacuum&lt;/code&gt; on a high-write table is more than a few hours old, autovacuum is either failing or misconfigured. The table will bloat.&lt;/p&gt;
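
&lt;p&gt;A quick check for neglected tables (&lt;code&gt;n_dead_tup&lt;/code&gt; is an estimate, but a large value alongside a stale &lt;code&gt;last_autovacuum&lt;/code&gt; is a red flag):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT
  relname,
  last_autovacuum,
  n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;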

&lt;p&gt;&lt;strong&gt;Disk usage above 80%.&lt;/strong&gt; PostgreSQL panics and shuts down when it can no longer write WAL, and recovering from a full disk can require manual intervention. Alert early and plan for disk growth.&lt;/p&gt;
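
&lt;p&gt;Disk alerts belong at the OS or cloud-provider level, but tracking database growth from SQL is a useful complement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT
  datname,
  pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;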

&lt;p&gt;&lt;strong&gt;Checkpoint frequency too high.&lt;/strong&gt; If &lt;code&gt;checkpoints_req&lt;/code&gt; in &lt;code&gt;pg_stat_bgwriter&lt;/code&gt; is growing faster than &lt;code&gt;checkpoints_timed&lt;/code&gt;, your &lt;code&gt;checkpoint_completion_target&lt;/code&gt; and &lt;code&gt;max_wal_size&lt;/code&gt; settings need tuning. Frequent required checkpoints hurt write throughput. (In PostgreSQL 17+, these counters moved to &lt;code&gt;pg_stat_checkpointer&lt;/code&gt; as &lt;code&gt;num_timed&lt;/code&gt; and &lt;code&gt;num_requested&lt;/code&gt;.)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;checkpoints_timed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;checkpoints_req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkpoints_req&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;numeric&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;nullif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkpoints_timed&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;checkpoints_req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;pct_required&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_bgwriter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;pct_required&lt;/code&gt; is above 10%, increase &lt;code&gt;max_wal_size&lt;/code&gt;.&lt;/p&gt;
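
&lt;p&gt;&lt;code&gt;max_wal_size&lt;/code&gt; can be raised without a restart; the value below is illustrative, not a recommendation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;ALTER SYSTEM SET max_wal_size = '4GB';
SELECT pg_reload_conf();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;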

&lt;h2&gt;
  
  
  Index Health
&lt;/h2&gt;

&lt;p&gt;Monitoring query performance tells you when queries are slow, but it doesn't always tell you why. Two index-specific queries help diagnose the most common cause: missing or unused indexes.&lt;/p&gt;

&lt;p&gt;Unused indexes waste write performance and vacuum time without helping any query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;schemaname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;relname&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;indexrelname&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;index_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;idx_scan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;pg_size_pretty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pg_relation_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexrelid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;index_size&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_user_indexes&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;idx_scan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;indisprimary&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;indisunique&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;pg_relation_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexrelid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Indexes with &lt;code&gt;idx_scan = 0&lt;/code&gt; since the last stats reset (or since the database was restarted) are candidates for removal. Check the &lt;a href="https://dev.to/blog/postgresql-indexing-guide"&gt;PostgreSQL Indexing Guide&lt;/a&gt; for the full picture on when to drop versus keep.&lt;/p&gt;
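
&lt;p&gt;When you do decide to drop one, &lt;code&gt;DROP INDEX CONCURRENTLY&lt;/code&gt; avoids blocking writes on the table (the index name here is hypothetical):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;DROP INDEX CONCURRENTLY IF EXISTS idx_orders_legacy_status;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;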

&lt;h2&gt;
  
  
  Monitoring in Production: Practical Setup
&lt;/h2&gt;

&lt;p&gt;A minimal but effective monitoring setup for a production PostgreSQL database:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable &lt;code&gt;pg_stat_statements&lt;/code&gt;&lt;/strong&gt; as a shared preload library. This has almost no overhead and is the single highest-value observability addition you can make.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Create a dedicated monitoring user&lt;/strong&gt; with &lt;code&gt;pg_monitor&lt;/code&gt; role. Never use superuser credentials for monitoring.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Collect metrics every 30 seconds&lt;/strong&gt; to a time-series store (Prometheus, CloudWatch, or Datadog). This gives you enough resolution to catch spikes without overwhelming the database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Set up a weekly query review.&lt;/strong&gt; Look at the top 10 queries by total time in &lt;code&gt;pg_stat_statements&lt;/code&gt; every week. This is where most optimization opportunities hide.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Alert on the conditions above&lt;/strong&gt;, not on every metric. Keep the signal-to-noise ratio high so alerts remain meaningful.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Back up your monitoring data separately from your database.&lt;/strong&gt; If your database is having a problem, you need to be able to look at historical metrics from before the incident started. See our guide on &lt;a href="https://dev.to/blog/postgresql-backup-recovery-guide"&gt;PostgreSQL backup and recovery&lt;/a&gt; for a full production backup strategy.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What Rivestack Handles for You
&lt;/h2&gt;

&lt;p&gt;If you're running PostgreSQL on Rivestack, the monitoring fundamentals are already in place. Rivestack exposes connection counts, active query duration, replication lag, and disk usage through a built-in dashboard, and alerts fire automatically on the conditions above without requiring any setup on your part.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pg_stat_statements&lt;/code&gt; is enabled by default on every Rivestack database. You can query it directly through your preferred SQL client or connect to it via the Rivestack metrics API.&lt;/p&gt;

&lt;p&gt;For teams that want deeper observability, Rivestack supports read-only monitoring credentials so you can connect postgres_exporter or pganalyze without any additional configuration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Starting Points
&lt;/h2&gt;

&lt;p&gt;The most important thing is to start simple. You don't need the full Prometheus stack before you've shipped. What you do need, immediately, is &lt;code&gt;pg_stat_statements&lt;/code&gt; enabled and a dashboard that shows connection usage and disk growth. Those two things will catch 80% of production database problems before they become incidents.&lt;/p&gt;

&lt;p&gt;Add alerts for connection spikes and replication lag. Set &lt;code&gt;idle_in_transaction_session_timeout&lt;/code&gt;. Review slow queries weekly. Everything else you can add as you learn what actually causes problems in your specific workload.&lt;/p&gt;
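
&lt;p&gt;The timeout can be set cluster-wide without a restart; 5 minutes is a reasonable starting point for most web workloads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';
SELECT pg_reload_conf();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;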

&lt;p&gt;When you're ready to take the operational burden off your team entirely, &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;try Rivestack&lt;/a&gt; for managed PostgreSQL with monitoring built in from day one.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgresql-monitoring-guide" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>tutorial</category>
      <category>database</category>
      <category>devops</category>
    </item>
    <item>
      <title>PostgreSQL Backup and Recovery: A Complete Guide</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Sat, 18 Apr 2026 08:24:11 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-backup-and-recovery-a-complete-guide-34j2</link>
      <guid>https://forem.com/geekyfox90/postgresql-backup-and-recovery-a-complete-guide-34j2</guid>
      <description>&lt;p&gt;Your database has never failed. That's not because you're careful. It's because you haven't been running long enough.&lt;/p&gt;

&lt;p&gt;Hardware dies. Disk controllers corrupt data silently for weeks before anyone notices. An engineer runs a &lt;code&gt;DELETE&lt;/code&gt; without a &lt;code&gt;WHERE&lt;/code&gt; clause on production. A cloud provider has a multi-zone outage. The only thing standing between you and catastrophic data loss is a backup you can actually restore from.&lt;/p&gt;

&lt;p&gt;This guide covers how PostgreSQL backup actually works: the difference between logical and physical backups, how point-in-time recovery lets you undo bad queries, and what a production-grade backup strategy looks like. We will go through the tools, the tradeoffs, and the one mistake most teams make that turns a "we have backups" story into a "we lost two days of data" incident.&lt;/p&gt;

&lt;h2&gt;
  
  
  Logical vs Physical Backups
&lt;/h2&gt;

&lt;p&gt;Before touching any tools, you need to understand the two categories of PostgreSQL backup:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Logical backups&lt;/strong&gt; capture the data as SQL. The output is a file of &lt;code&gt;CREATE TABLE&lt;/code&gt;, &lt;code&gt;INSERT&lt;/code&gt;, and &lt;code&gt;COPY&lt;/code&gt; statements that recreate your schema and data from scratch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Physical backups&lt;/strong&gt; copy the raw files that PostgreSQL uses to store data on disk, including the base directory and Write-Ahead Log (WAL) files.&lt;/p&gt;

&lt;p&gt;Each has a different purpose:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Logical (pg_dump)&lt;/th&gt;
&lt;th&gt;Physical (pg_basebackup)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Portability&lt;/td&gt;
&lt;td&gt;High: restore to different Postgres versions&lt;/td&gt;
&lt;td&gt;Low: must match major version&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Granularity&lt;/td&gt;
&lt;td&gt;Table, schema, database&lt;/td&gt;
&lt;td&gt;Entire cluster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Restore time&lt;/td&gt;
&lt;td&gt;Slow for large databases&lt;/td&gt;
&lt;td&gt;Fast: just copy files back&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Point-in-time recovery&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (with WAL archiving)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consistency&lt;/td&gt;
&lt;td&gt;Always consistent&lt;/td&gt;
&lt;td&gt;Consistent (coordination handled by pg_basebackup)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For small databases or specific table exports, logical backups with &lt;code&gt;pg_dump&lt;/code&gt; are perfectly reasonable. For production databases over a few gigabytes, or whenever you need point-in-time recovery, physical backups are the right foundation.&lt;/p&gt;

&lt;h2&gt;
  
  
  pg_dump: Logical Backups for Specific Needs
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;pg_dump&lt;/code&gt; is a utility that comes bundled with PostgreSQL. It connects to a running database and produces a self-consistent snapshot you can restore elsewhere.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic usage
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Dump a database to a plain SQL file&lt;/span&gt;
pg_dump &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres mydb &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; mydb.sql

&lt;span class="c"&gt;# Dump in custom format (recommended: faster restore, selective restore)&lt;/span&gt;
pg_dump &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-Fc&lt;/span&gt; mydb &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; mydb.dump

&lt;span class="c"&gt;# Dump to a directory with parallel workers&lt;/span&gt;
pg_dump &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-Fd&lt;/span&gt; &lt;span class="nt"&gt;-j&lt;/span&gt; 4 &lt;span class="nt"&gt;-f&lt;/span&gt; mydb_dir/ mydb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The custom format (&lt;code&gt;-Fc&lt;/code&gt;) is almost always better than plain SQL. It compresses the output, supports parallel restore, and lets you restore individual tables or schemas selectively.&lt;/p&gt;

&lt;h3&gt;
  
  
  Restoring a pg_dump backup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Restore a plain SQL dump&lt;/span&gt;
psql &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres newdb &amp;lt; mydb.sql

&lt;span class="c"&gt;# Restore a custom format dump&lt;/span&gt;
pg_restore &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-d&lt;/span&gt; newdb mydb.dump

&lt;span class="c"&gt;# Restore with parallel workers (much faster for large databases)&lt;/span&gt;
pg_restore &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-j&lt;/span&gt; 4 &lt;span class="nt"&gt;-d&lt;/span&gt; newdb mydb.dump
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Limitations of pg_dump
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;pg_dump&lt;/code&gt; runs inside a single transaction so that the dump is a consistent snapshot of your running database. For large databases that transaction can stay open for hours, and restoring is slower still: rebuilding 500 GB with &lt;code&gt;pg_restore&lt;/code&gt; means several hours of downtime.&lt;/p&gt;

&lt;p&gt;More importantly, &lt;code&gt;pg_dump&lt;/code&gt; gives you a snapshot at a point in time. If you run it at midnight and your application corrupts data at 10 PM the following day, you lose 22 hours of data. That might be acceptable for some workloads. For most production databases, it is not.&lt;/p&gt;

&lt;h2&gt;
  
  
  pg_basebackup: Physical Backups of the Entire Cluster
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;pg_basebackup&lt;/code&gt; creates a physical copy of the PostgreSQL data directory. Combined with WAL archiving, it forms the foundation of production backup strategies.&lt;/p&gt;

&lt;h3&gt;
  
  
  What pg_basebackup does
&lt;/h3&gt;

&lt;p&gt;PostgreSQL writes every change to the Write-Ahead Log before applying it to the actual data files. This means a physical backup taken at any moment is internally consistent if you also keep the WAL generated during the backup. &lt;code&gt;pg_basebackup&lt;/code&gt; handles the coordination automatically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pg_basebackup &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-U&lt;/span&gt; replicator &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-D&lt;/span&gt; /backups/base &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-Ft&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-P&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--wal-method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;stream
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;replicator&lt;/code&gt; user needs &lt;code&gt;REPLICATION&lt;/code&gt; privilege:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;ROLE&lt;/span&gt; &lt;span class="n"&gt;replicator&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;REPLICATION&lt;/span&gt; &lt;span class="n"&gt;LOGIN&lt;/span&gt; &lt;span class="n"&gt;PASSWORD&lt;/span&gt; &lt;span class="s1"&gt;'yourpassword'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Restoring from pg_basebackup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl stop postgresql
&lt;span class="nb"&gt;sudo rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/postgresql/16/main/&lt;span class="k"&gt;*&lt;/span&gt;
&lt;span class="nb"&gt;sudo tar&lt;/span&gt; &lt;span class="nt"&gt;-xzf&lt;/span&gt; /backups/base/base.tar.gz &lt;span class="nt"&gt;-C&lt;/span&gt; /var/lib/postgresql/16/main/
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start postgresql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you the database as it was at the time of the backup. To recover to a point after the backup, you need WAL archiving.&lt;/p&gt;

&lt;h2&gt;
  
  
  WAL Archiving and Point-In-Time Recovery
&lt;/h2&gt;

&lt;p&gt;Point-in-time recovery (PITR) lets you restore a database to any moment between your last base backup and now. It works by replaying WAL files on top of a base backup.&lt;/p&gt;

&lt;h3&gt;
  
  
  Enabling WAL archiving
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;postgresql.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;wal_level&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;replica&lt;/span&gt;
&lt;span class="py"&gt;archive_mode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;on&lt;/span&gt;
&lt;span class="py"&gt;archive_command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'cp %p /mnt/wal-archive/%f'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In production, ship to object storage instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;archive_command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'aws s3 cp %p s3://my-wal-bucket/wal/%f'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Performing a point-in-time recovery
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# PostgreSQL 12+: create a recovery signal file&lt;/span&gt;
&lt;span class="nb"&gt;touch&lt;/span&gt; /var/lib/postgresql/16/main/recovery.signal

&lt;span class="c"&gt;# Add to postgresql.conf&lt;/span&gt;
restore_command &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'cp /mnt/wal-archive/%f %p'&lt;/span&gt;
recovery_target_time &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-04-18 14:30:00'&lt;/span&gt;
recovery_target_action &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'promote'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also recover to a named restore point you create before migrations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;pg_create_restore_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'before_migration'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  pgBackRest: Production-Grade Backup Management
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;pgBackRest&lt;/code&gt; handles everything from incremental backups to parallel compression and verification.&lt;/p&gt;

&lt;h3&gt;
  
  
  Minimal pgBackRest configuration
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;/etc/pgbackrest/pgbackrest.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[global]&lt;/span&gt;
&lt;span class="py"&gt;repo1-path&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/var/lib/pgbackrest&lt;/span&gt;
&lt;span class="py"&gt;repo1-retention-full&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;
&lt;span class="py"&gt;repo1-retention-diff&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;7&lt;/span&gt;
&lt;span class="py"&gt;process-max&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;4&lt;/span&gt;

&lt;span class="nn"&gt;[main]&lt;/span&gt;
&lt;span class="py"&gt;pg1-path&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/var/lib/postgresql/16/main&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In &lt;code&gt;postgresql.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;archive_command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'pgbackrest --stanza=main archive-push %p'&lt;/span&gt;
&lt;span class="py"&gt;archive_mode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;on&lt;/span&gt;
&lt;span class="py"&gt;wal_level&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;replica&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running and restoring backups
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Full backup&lt;/span&gt;
&lt;span class="nb"&gt;sudo&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; postgres pgbackrest &lt;span class="nt"&gt;--stanza&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;main &lt;span class="nt"&gt;--type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;full backup

&lt;span class="c"&gt;# Restore to a specific point in time&lt;/span&gt;
&lt;span class="nb"&gt;sudo&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; postgres pgbackrest &lt;span class="nt"&gt;--stanza&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;main restore &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;time&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"2026-04-18 14:30:00"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Backup Verification: The Step Most Teams Skip
&lt;/h2&gt;

&lt;p&gt;An untested backup is not a backup. It is an assumption.&lt;/p&gt;

&lt;p&gt;WAL segments can be silently corrupted. Archive commands can silently succeed (exit 0) while producing empty files. &lt;code&gt;pg_dump&lt;/code&gt; output can be truncated. You will not know until you try to restore.&lt;/p&gt;

&lt;p&gt;Monitor archive health continuously:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;archived_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;last_archived_wal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;last_archived_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;failed_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;last_failed_wal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;last_failed_time&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_archiver&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;failed_count&lt;/code&gt; is climbing and &lt;code&gt;last_archived_time&lt;/code&gt; is stale, your WAL archive is silently broken.&lt;/p&gt;

&lt;p&gt;For mission-critical databases, run automated restore drills: take a backup, restore to an isolated instance, compare row counts against production. This sounds like overhead, and it is. It is also the thing that saves teams from finding out their backups were corrupted during an actual disaster.&lt;/p&gt;
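
&lt;p&gt;A simple drill check: run the same query on production and on the restored instance and diff the output. &lt;code&gt;n_live_tup&lt;/code&gt; is an estimate, so expect small discrepancies; exact verification requires &lt;code&gt;count(*)&lt;/code&gt; per table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY relname;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;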

&lt;h2&gt;
  
  
  Backup Retention and Recovery Time
&lt;/h2&gt;

&lt;p&gt;A common production policy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Daily full backups kept for 7 days&lt;/li&gt;
&lt;li&gt;Continuous WAL archiving for PITR within the retention window&lt;/li&gt;
&lt;li&gt;Weekly backups kept for 4 weeks&lt;/li&gt;
&lt;li&gt;Monthly snapshots kept for 12 months&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For recovery time: as a rough rule of thumb, &lt;code&gt;pg_restore&lt;/code&gt; with a single worker processes about 1 GB per minute, so a 100 GB database takes around 100 minutes. With parallel restore (&lt;code&gt;-j 4&lt;/code&gt;), that drops to 30 to 40 minutes. A physical restore from pgBackRest over a 1 Gbps connection takes under 15 minutes for the base backup, plus WAL replay time.&lt;/p&gt;

&lt;p&gt;Know your restore time before the incident, not during it.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Practical Backup Checklist
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Base backups running on a schedule (at least daily)&lt;/li&gt;
&lt;li&gt;[ ] WAL archiving enabled and destination confirmed&lt;/li&gt;
&lt;li&gt;[ ] Backup verification job running at least weekly&lt;/li&gt;
&lt;li&gt;[ ] Restore procedure documented and tested&lt;/li&gt;
&lt;li&gt;[ ] Retention policy defined and enforced&lt;/li&gt;
&lt;li&gt;[ ] Monitoring for archive failures via &lt;code&gt;pg_stat_archiver&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Offsite storage (not on the same machine or availability zone as the primary)&lt;/li&gt;
&lt;li&gt;[ ] Recovery time measured on actual restore drill&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;PostgreSQL gives you the primitives for a solid backup strategy. The gap between "we have backups" and "we can actually recover" is almost always process: untested restores, silent archive failures, and retention policies set once and never revisited.&lt;/p&gt;

&lt;p&gt;If you want backups that work without the operational overhead, &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;try Rivestack&lt;/a&gt;. Continuous backup, point-in-time recovery, and restore in a few clicks.&lt;/p&gt;

&lt;p&gt;For more production PostgreSQL reading, see our guides on &lt;a href="https://rivestack.io/blog/postgresql-connection-pooling-pgbouncer" rel="noopener noreferrer"&gt;connection pooling with PgBouncer&lt;/a&gt; and &lt;a href="https://rivestack.io/blog/postgresql-indexing-guide" rel="noopener noreferrer"&gt;PostgreSQL indexing strategies&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgresql-backup-recovery-guide" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>database</category>
      <category>devops</category>
      <category>postgres</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Search Gmail Attachments by File Type, Date, or Sender</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Sun, 12 Apr 2026 13:41:23 +0000</pubDate>
      <link>https://forem.com/geekyfox90/how-to-search-gmail-attachments-by-file-type-date-or-sender-3838</link>
      <guid>https://forem.com/geekyfox90/how-to-search-gmail-attachments-by-file-type-date-or-sender-3838</guid>
      <description>&lt;p&gt;You know the file is somewhere in your inbox. A PDF invoice, a spreadsheet from your accountant, a signed contract someone sent three months ago. You scroll, you search, you give up and ask the sender to resend it. Sound familiar?&lt;/p&gt;

&lt;p&gt;Gmail holds thousands of messages for most professionals, and attachments get buried fast. The built-in search bar is more powerful than most people realize, though. With the right search operators, you can filter attachments by file type, narrow results to a specific date range, isolate emails from one sender, and combine all of these filters to pinpoint exactly what you need in seconds.&lt;/p&gt;

&lt;p&gt;This guide covers every useful technique for searching Gmail attachments, from basic operators to advanced combinations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Gmail Search Feels Broken (And How to Fix It)
&lt;/h2&gt;

&lt;p&gt;The default Gmail search works reasonably well for finding emails by keyword or sender name. But when you type "invoice PDF" into the search bar, Gmail searches the full body of every message, not just attachments. You get hundreds of results, most of them irrelevant.&lt;/p&gt;

&lt;p&gt;The fix is search operators: special keywords and syntax that tell Gmail exactly what to look for and where. Once you know a handful of these, you'll never spend more than a few seconds hunting for an attachment.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Foundation: Finding Emails with Attachments
&lt;/h2&gt;

&lt;p&gt;The simplest operator for attachment search is &lt;code&gt;has:attachment&lt;/code&gt;. Add it to any Gmail search and your results will only show emails that include attached files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That single operator cuts out all the noise. From here, you layer on additional filters to zero in on exactly what you need.&lt;/p&gt;

&lt;h2&gt;
  
  
  Searching Gmail Attachments by File Type
&lt;/h2&gt;

&lt;p&gt;Gmail lets you filter attachments by file type using the &lt;code&gt;filename:&lt;/code&gt; operator, which matches against the attachment's name and extension. You can search by extension or by the actual filename if you know it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Search by file extension
&lt;/h3&gt;

&lt;p&gt;To find all PDFs in your inbox:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment filename:pdf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Excel files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment filename:xlsx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Word documents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment filename:docx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For ZIP archives:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment filename:zip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For CSV files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment filename:csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Search by partial filename
&lt;/h3&gt;

&lt;p&gt;If you remember part of the filename, Gmail does partial matching:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment filename:invoice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Search by exact filename
&lt;/h3&gt;

&lt;p&gt;If you know the full filename, wrap it in quotes for an exact match:&lt;br&gt;
&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;filename:"Q1 2026 Report.pdf"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Searching Gmail Attachments by Date
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Using after: and before:
&lt;/h3&gt;

&lt;p&gt;To find attachments within a specific range:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment after:2026/01/01 before:2026/02/01
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using older_than: and newer_than:
&lt;/h3&gt;

&lt;p&gt;Find attachments received in the last 30 days:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment newer_than:30d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Find PDFs received in the past three months:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment filename:pdf newer_than:3m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Searching Gmail Attachments by Sender
&lt;/h2&gt;

&lt;p&gt;Search by email address:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment from:john@company.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Search by domain (useful for vendors or clients):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment from:@company.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Combining Filters for Precision Search
&lt;/h2&gt;

&lt;p&gt;PDFs from a specific sender this year:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment filename:pdf from:accountant@firm.com after:2026/01/01
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Large attachments from anyone:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment larger:5M
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Using OR Logic for Multiple File Types
&lt;/h2&gt;

&lt;p&gt;Curly braces are Gmail's OR syntax, so this finds emails with either file type:&lt;br&gt;
&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;has:attachment {filename:pdf filename:docx}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Searching for Attachments You Sent
&lt;/h2&gt;

&lt;p&gt;To limit results to messages you sent, add &lt;code&gt;in:sent&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;in:sent has:attachment
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Limits of Gmail's Built-in Search
&lt;/h2&gt;

&lt;p&gt;Gmail's operators are powerful but have friction points: no bulk download, no attachment-level folder organization, and search doesn't look inside attachment content. Storage management still requires manual action per email.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Dioveo Extends Gmail Attachment Management
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dioveo.com" rel="noopener noreferrer"&gt;Dioveo&lt;/a&gt; is built specifically for Gmail attachment management. Where Gmail's search helps you find attachments, Dioveo helps you act on them at scale — automatically saving attachments to Google Drive based on rules you define: sender, file type, subject keywords, or label.&lt;/p&gt;

&lt;p&gt;It also handles the storage problem. Dioveo identifies large and old attachments, gives you a clear view of what's using space, and lets you clean up in bulk. Start with the free plan at &lt;a href="https://dioveo.com" rel="noopener noreferrer"&gt;dioveo.com&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Reference: Gmail Attachment Search Operators
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What you want&lt;/th&gt;
&lt;th&gt;Operator&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Emails with any attachment&lt;/td&gt;
&lt;td&gt;&lt;code&gt;has:attachment&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Specific file type&lt;/td&gt;
&lt;td&gt;&lt;code&gt;filename:pdf&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;From a specific sender&lt;/td&gt;
&lt;td&gt;&lt;code&gt;from:email@domain.com&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;From a domain&lt;/td&gt;
&lt;td&gt;&lt;code&gt;from:@company.com&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;After a date&lt;/td&gt;
&lt;td&gt;&lt;code&gt;after:2026/01/01&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Last N days&lt;/td&gt;
&lt;td&gt;&lt;code&gt;newer_than:30d&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Larger than a size&lt;/td&gt;
&lt;td&gt;&lt;code&gt;larger:5M&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sent by you&lt;/td&gt;
&lt;td&gt;&lt;code&gt;in:sent has:attachment&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
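
&lt;p&gt;These operators also work as the &lt;code&gt;q&lt;/code&gt; parameter of the Gmail API's &lt;code&gt;messages.list&lt;/code&gt; endpoint, so they compose well in scripts. A minimal sketch (the helper name and its defaults are our own, not anything Gmail defines):&lt;/p&gt;

```python
def gmail_attachment_query(file_type=None, sender=None, newer_than=None, larger=None):
    """Build a Gmail search string from optional attachment filters.

    The result is plain Gmail search syntax, usable in the search bar
    or as the `q` parameter of the Gmail API's messages.list call.
    """
    parts = ["has:attachment"]
    if file_type:
        parts.append(f"filename:{file_type}")
    if sender:
        parts.append(f"from:{sender}")
    if newer_than:
        parts.append(f"newer_than:{newer_than}")
    if larger:
        parts.append(f"larger:{larger}")
    return " ".join(parts)

print(gmail_attachment_query(file_type="pdf", sender="@company.com", newer_than="30d"))
# has:attachment filename:pdf from:@company.com newer_than:30d
```

&lt;p&gt;Each keyword argument maps directly to one row of the table above, which keeps scripted searches consistent with what you type by hand.&lt;/p&gt;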




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://dioveo.com/blog/how-to-search-gmail-attachments" rel="noopener noreferrer"&gt;dioveo.com/blog/how-to-search-gmail-attachments&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>gmail</category>
      <category>productivity</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>PostgreSQL JSONB: A Complete Guide to Storing and Querying JSON Data</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Sun, 12 Apr 2026 13:14:50 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-jsonb-a-complete-guide-to-storing-and-querying-json-data-155k</link>
      <guid>https://forem.com/geekyfox90/postgresql-jsonb-a-complete-guide-to-storing-and-querying-json-data-155k</guid>
      <description>&lt;p&gt;Every application reaches a point where the relational model gets awkward. User preferences. API responses. Feature flags. Event payloads. You could create a new column for each field, run a migration every time the schema changes, and normalize everything into lookup tables. Or you could store the data as JSON.&lt;/p&gt;

&lt;p&gt;PostgreSQL gives you both options, and unlike databases that bolt on JSON support as an afterthought, PostgreSQL treats JSONB as a first class data type with real indexing, operators, and query planning. You get the flexibility of a document store with the guarantees of a relational database: transactions, constraints, joins, and a query planner that actually knows what it's doing.&lt;/p&gt;

&lt;p&gt;This guide covers everything you need to work with JSONB in production: when to use it, how to query it, how to index it, and where teams get tripped up.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgresql-jsonb-guide" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>sql</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>pgvector with LangChain: Build a RAG Pipeline on PostgreSQL</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Fri, 10 Apr 2026 09:19:05 +0000</pubDate>
      <link>https://forem.com/geekyfox90/pgvector-with-langchain-build-a-rag-pipeline-on-postgresql-1bl8</link>
      <guid>https://forem.com/geekyfox90/pgvector-with-langchain-build-a-rag-pipeline-on-postgresql-1bl8</guid>
      <description>&lt;p&gt;LangChain has a vectorstore abstraction that lets you swap out the underlying vector database without rewriting your application logic. Swap Chroma for Pinecone, Pinecone for Weaviate, whatever. In theory, it's clean. In practice, most teams end up staying with whatever they picked first, because migration is never as simple as swapping a class name.&lt;/p&gt;

&lt;p&gt;So the decision matters. And if you're building on PostgreSQL, the answer is almost always: use pgvector. Your embeddings live in the same database as your users, documents, and application state. No sync pipeline. No eventual consistency. Full SQL.&lt;/p&gt;

&lt;p&gt;This guide walks through the LangChain &lt;code&gt;PGVector&lt;/code&gt; integration from scratch, including document loading, embedding, similarity search, metadata filtering, and wiring it into a working retrieval chain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why pgvector Over a Dedicated Vectorstore
&lt;/h2&gt;

&lt;p&gt;Before the code, let's be direct about the tradeoff.&lt;/p&gt;

&lt;p&gt;Dedicated vectorstores like Pinecone are fast and scale to billions of vectors without you thinking about infrastructure. If you're building something where vectors are the entire product, they're a reasonable choice.&lt;/p&gt;

&lt;p&gt;But most applications aren't like that. You have users. You have documents. You have metadata. You have access control logic. When your vectors live in Pinecone and everything else lives in PostgreSQL, you've just created a sync problem. A document gets deleted from Postgres, but its embedding stays in Pinecone. Want to filter by &lt;code&gt;user_id&lt;/code&gt;? Now you need to implement that in a separate system. Want a transaction that inserts a document and its embedding atomically? You can't have one.&lt;/p&gt;

&lt;p&gt;pgvector collapses this. Everything lives in one place. JOINs work. Transactions work. Your existing backup strategy, monitoring setup, and connection pooler all work with zero changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up
&lt;/h2&gt;

&lt;p&gt;You'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PostgreSQL with the &lt;code&gt;vector&lt;/code&gt; extension enabled&lt;/li&gt;
&lt;li&gt;Python 3.9+&lt;/li&gt;
&lt;li&gt;An OpenAI API key (or any LangChain-compatible embeddings model)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Install dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;langchain langchain-openai langchain-postgres psycopg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;langchain-postgres&lt;/code&gt; package is the current home for the &lt;code&gt;PGVector&lt;/code&gt; integration. On a managed PostgreSQL service with pgvector pre-installed, just run this once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Connecting LangChain to pgvector
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;PGVector&lt;/code&gt; class takes a connection string and an embeddings object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_postgres&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PGVector&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;

&lt;span class="n"&gt;connection_string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql+psycopg://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PGVector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;connection_string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;use_jsonb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Always set &lt;code&gt;use_jsonb=True&lt;/code&gt; — it stores metadata as JSONB, unlocking proper metadata filtering. LangChain creates the required tables automatically on first use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Loading and Embedding Documents
&lt;/h2&gt;

&lt;p&gt;Wrap your content in &lt;code&gt;Document&lt;/code&gt; objects and hand them to the store:&lt;br&gt;
&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.documents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Document&lt;/span&gt;

&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PostgreSQL supports ACID transactions and is MVCC-based.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgres-intro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector adds vector similarity search using HNSW and IVFFlat indexes.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector-intro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;extensions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For real use cases, split long documents first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;

&lt;span class="n"&gt;splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;split_docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;split_docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Similarity Search
&lt;/h2&gt;

&lt;p&gt;Query the store directly, with or without similarity scores:&lt;br&gt;
&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How does pgvector index work?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;results_with_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search_with_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector index&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results_with_scores&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; — &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lower scores mean higher similarity when using cosine distance.&lt;/p&gt;
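
&lt;p&gt;If the scale of those scores is unfamiliar: cosine distance is 1 minus cosine similarity, so identical directions score 0 and values grow toward 2 for opposite vectors. A quick standalone check (plain Python, no pgvector needed):&lt;/p&gt;

```python
import math

def cosine_distance(a, b):
    # Cosine distance = 1 - cosine similarity, the metric pgvector
    # uses for cosine-based similarity search.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # 0.0  (identical direction)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # 1.0  (orthogonal)
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # 2.0  (opposite)
```

&lt;p&gt;pgvector also supports Euclidean and inner-product distance; the LangChain integration defaults to cosine.&lt;/p&gt;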

&lt;h2&gt;
  
  
  Metadata Filtering
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;filter&lt;/code&gt; argument narrows results to documents whose metadata matches:&lt;br&gt;
&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Filter by a single field
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database indexing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Multi-tenant: scope to a specific user
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With &lt;code&gt;use_jsonb=True&lt;/code&gt;, this translates to a JSONB containment query that can be indexed.&lt;/p&gt;
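
&lt;p&gt;Concretely, the filter becomes a containment predicate on the JSONB metadata column. A rough, illustrative sketch of that shape, hand-built as a string (not the exact SQL LangChain emits; &lt;code&gt;cmetadata&lt;/code&gt; is the metadata column name in LangChain's default schema):&lt;/p&gt;

```python
import json

def containment_clause(filter_dict):
    # Approximate the JSONB containment predicate a metadata filter
    # maps to: cmetadata @&gt; '{"key": "value"}'. Illustrative only.
    return f"cmetadata @&gt; '{json.dumps(filter_dict)}'"

print(containment_clause({"user_id": "user_123"}))
```

&lt;p&gt;A GIN index on the metadata column lets PostgreSQL serve these containment checks from the index instead of scanning every row.&lt;/p&gt;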

&lt;h2&gt;
  
  
  MMR Search for Diverse Results
&lt;/h2&gt;

&lt;p&gt;Maximum Marginal Relevance avoids returning five chunks that all say the same thing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max_marginal_relevance_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PostgreSQL extensions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;fetch_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;lambda_mult&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;  &lt;span class="c1"&gt;# 0 = max diversity, 1 = max relevance
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Building a Retrieval Chain
&lt;/h2&gt;

&lt;p&gt;Wire the retriever, a prompt, and a model together with LangChain's pipe composition:&lt;br&gt;
&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.runnables&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RunnablePassthrough&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.output_parsers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StrOutputParser&lt;/span&gt;

&lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;search_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mmr&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;search_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fetch_k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Answer the question based only on the following context:

{context}

Question: {question}
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
     &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;RunnablePassthrough&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nc"&gt;StrOutputParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What indexes does pgvector support?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For multi-tenant RAG, scope the retriever per authenticated user:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;search_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;current_user_id&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
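&lt;p&gt;A small factory keeps the tenant filter out of individual request handlers. A minimal sketch; &lt;code&gt;build_tenant_retriever&lt;/code&gt; and its argument names are illustrative helpers, not part of &lt;code&gt;langchain-postgres&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

```python
def tenant_search_kwargs(user_id, k=4):
    # Per-request kwargs: every retrieval is filtered to one tenant.
    return {"k": k, "filter": {"user_id": user_id}}


def build_tenant_retriever(vectorstore, user_id, k=4):
    # Build the scoped retriever once per authenticated request;
    # never cache it across users.
    return vectorstore.as_retriever(search_kwargs=tenant_search_kwargs(user_id, k))
```

&lt;p&gt;Call the factory with the authenticated user's ID on each request; sharing a retriever between users silently leaks the first user's filter.&lt;/p&gt;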



&lt;h2&gt;
  
  
  HNSW Indexing for Production
&lt;/h2&gt;

&lt;p&gt;By default, pgvector uses exact k-NN search. For datasets over 100k vectors, add an HNSW index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;langchain_pg_embedding&lt;/span&gt;
&lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;hnsw&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector_cosine_ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ef_construction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- At query time:&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;hnsw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ef_search&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LangChain won't create this index automatically. Run it once after initial load.&lt;/p&gt;
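&lt;p&gt;One way to make that one-time step repeatable is to ship the DDL with your ingestion script. A sketch assuming the default &lt;code&gt;langchain_pg_embedding&lt;/code&gt; table; the index name is arbitrary:&lt;br&gt;
&lt;/p&gt;

```python
# DDL mirroring the index above; langchain_pg_embedding is the default
# table name used by langchain-postgres.
HNSW_DDL = (
    "CREATE INDEX IF NOT EXISTS idx_embedding_hnsw "
    "ON langchain_pg_embedding "
    "USING hnsw (embedding vector_cosine_ops) "
    "WITH (m = 16, ef_construction = 64);"
)


def ensure_hnsw_index(execute):
    # `execute` is any callable that runs SQL (for example a psycopg
    # cursor's execute method); IF NOT EXISTS keeps the call idempotent.
    execute(HNSW_DDL)
```

&lt;p&gt;Running this at the end of every bulk load is safe: the index is created once and the call is a no-op afterward.&lt;/p&gt;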

&lt;h2&gt;
  
  
  Connection Pooling
&lt;/h2&gt;

&lt;p&gt;Configure an engine with a pool instead of relying on per-instance connections:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_engine&lt;/span&gt;

&lt;span class="n"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_engine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;connection_string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pool_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_overflow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pool_pre_ping&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PGVector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;use_jsonb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Async Usage
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PGVector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;async_engine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;use_jsonb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;async_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;asimilarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ainvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What indexes does pgvector support?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Document Upsert and Deletion
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Upsert by ID
&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;updated_docs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc_id_1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc_id_2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Delete by ID
&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc_id_1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
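&lt;p&gt;Upserts only behave as upserts if IDs are deterministic. One approach (illustrative, not a LangChain convention) is to derive each ID from the source path and chunk position with &lt;code&gt;uuid5&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

```python
import uuid

# Fixed namespace so IDs are reproducible across ingestion runs;
# the namespace string itself is arbitrary but must never change.
_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "rag-chunk-ids")


def chunk_id(source_path, chunk_index):
    # Same source + position always yields the same ID, so re-ingesting
    # an updated file overwrites its chunks instead of duplicating them.
    return str(uuid.uuid5(_NAMESPACE, f"{source_path}#{chunk_index}"))
```

&lt;p&gt;Then &lt;code&gt;add_documents(chunks, ids=[chunk_id(path, i) for i, c in enumerate(chunks)])&lt;/code&gt; replaces stale chunks on every re-ingestion instead of accumulating duplicates.&lt;/p&gt;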



&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;pgvector with LangChain is production-ready. The &lt;code&gt;langchain-postgres&lt;/code&gt; package handles embeddings storage, similarity search, MMR, and metadata filtering. The underlying database is PostgreSQL, so transactions, JOINs, and your existing operational tooling all work.&lt;/p&gt;

&lt;p&gt;Key things to get right: use &lt;code&gt;use_jsonb=True&lt;/code&gt;, create an HNSW index before your dataset grows large, configure a connection pool at the engine level, and scope retrievers to the authenticated user for multi-tenant workloads.&lt;/p&gt;

&lt;p&gt;If you want a managed PostgreSQL setup with pgvector pre-installed, PgBouncer included, and backups handled: check out &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;Rivestack&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/pgvector-langchain" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>ai</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>PostgreSQL Connection Pooling with PgBouncer: A Complete Guide</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Tue, 31 Mar 2026 15:22:09 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-connection-pooling-with-pgbouncer-a-complete-guide-2fam</link>
      <guid>https://forem.com/geekyfox90/postgresql-connection-pooling-with-pgbouncer-a-complete-guide-2fam</guid>
      <description>&lt;p&gt;You launch your app. Traffic is light, everything works. A few weeks later you start seeing &lt;code&gt;FATAL: remaining connection slots are reserved for non-replication superuser connections&lt;/code&gt;. Your PostgreSQL server is out of connections and your app is falling over.&lt;/p&gt;

&lt;p&gt;This is one of the most common PostgreSQL scaling problems, and connection pooling is the fix. But the fix has its own complexity: PgBouncer has three modes with different tradeoffs, the configuration is full of footguns, and if you get it wrong you get subtle bugs that are much harder to debug than the original connection error.&lt;/p&gt;

&lt;p&gt;This guide covers how PostgreSQL connections actually work, how to set up and configure PgBouncer correctly, and how to choose the right pool mode for your application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why PostgreSQL Connections Are Expensive
&lt;/h2&gt;

&lt;p&gt;PostgreSQL handles each connection with a dedicated server process. When a client connects, Postgres forks a new OS process. That process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Allocates its own memory (typically 5-10 MB per connection including shared memory overhead)&lt;/li&gt;
&lt;li&gt;Maintains its own backend state, transaction state, and lock tables&lt;/li&gt;
&lt;li&gt;Requires the kernel to schedule it like any other process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At 50 connections, this is fine. At 500 connections, you have 500 OS processes and the scheduler starts showing up in your performance profiles. At 1,000 connections, you have likely exhausted &lt;code&gt;max_connections&lt;/code&gt; (the stock default is only 100) and your app is returning errors.&lt;/p&gt;

&lt;p&gt;The naive fix is to increase &lt;code&gt;max_connections&lt;/code&gt;. Don't do that without thinking it through. Each connection costs memory. Set &lt;code&gt;max_connections = 1000&lt;/code&gt; on a server with 8 GB of RAM and you've allocated the entire heap to idle connections before a single query runs. The &lt;code&gt;shared_buffers&lt;/code&gt; and &lt;code&gt;work_mem&lt;/code&gt; math goes sideways fast.&lt;/p&gt;
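&lt;p&gt;The arithmetic is worth making explicit. A back-of-envelope sketch using the midpoint of the 5-10 MB per-connection estimate above:&lt;br&gt;
&lt;/p&gt;

```python
def connection_overhead_gb(max_connections, mb_per_connection=7.5):
    # Rough memory committed to connection backends alone, before any
    # query runs; 7.5 MB is the midpoint of the estimate above.
    return max_connections * mb_per_connection / 1024


# 1000 connections at ~7.5 MB each is roughly 7.3 GB: nearly all of an
# 8 GB server gone to idle backends.
```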

&lt;p&gt;The right fix is to reduce the number of actual connections to PostgreSQL. That's what connection poolers do.&lt;/p&gt;

&lt;h2&gt;
  
  
  What PgBouncer Does
&lt;/h2&gt;

&lt;p&gt;PgBouncer sits between your application and PostgreSQL. Your app thinks it's talking to Postgres, but it's actually talking to PgBouncer. PgBouncer maintains a pool of real connections to Postgres and hands them out to client requests.&lt;/p&gt;

&lt;p&gt;The numbers look like this in practice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Before PgBouncer:&lt;/strong&gt; 300 app threads, 300 Postgres connections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After PgBouncer (transaction mode):&lt;/strong&gt; 300 app threads, 20 actual Postgres connections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those 20 connections serve 300 clients because most clients are not actually executing SQL at any given moment. They're waiting for network I/O, processing results, or sitting idle. Transaction mode takes advantage of this by returning a connection to the pool the moment a transaction commits.&lt;/p&gt;
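&lt;p&gt;That ratio can be sanity-checked with a simplified model. It ignores queueing and bursts, so treat the result as a floor for your pool size, not a target:&lt;br&gt;
&lt;/p&gt;

```python
import math


def busy_connections(clients, busy_percent):
    # Simplified model: the pool only needs as many server connections
    # as there are clients actually executing SQL at the same moment.
    return math.ceil(clients * busy_percent / 100)


# 300 clients, each inside a transaction ~5% of the time, need about
# 15 server connections; a pool of 20 leaves headroom for bursts.
```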

&lt;h2&gt;
  
  
  Installing PgBouncer
&lt;/h2&gt;

&lt;p&gt;On Ubuntu/Debian:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install &lt;/span&gt;pgbouncer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On macOS with Homebrew:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;pgbouncer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configuring PgBouncer
&lt;/h2&gt;

&lt;p&gt;A minimal working config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[databases]&lt;/span&gt;
&lt;span class="py"&gt;mydb&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;host=127.0.0.1 port=5432 dbname=mydb&lt;/span&gt;

&lt;span class="nn"&gt;[pgbouncer]&lt;/span&gt;
&lt;span class="py"&gt;listen_addr&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;127.0.0.1&lt;/span&gt;
&lt;span class="py"&gt;listen_port&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;6432&lt;/span&gt;
&lt;span class="py"&gt;auth_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;scram-sha-256&lt;/span&gt;
&lt;span class="py"&gt;auth_file&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/etc/pgbouncer/userlist.txt&lt;/span&gt;
&lt;span class="py"&gt;pool_mode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;transaction&lt;/span&gt;
&lt;span class="py"&gt;max_client_conn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;1000&lt;/span&gt;
&lt;span class="py"&gt;default_pool_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;25&lt;/span&gt;
&lt;span class="py"&gt;min_pool_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;5&lt;/span&gt;
&lt;span class="py"&gt;reserve_pool_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;5&lt;/span&gt;
&lt;span class="py"&gt;server_idle_timeout&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;600&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For production, use hashed passwords. Generate them with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;concat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rolname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'" "'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rolpassword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_authid&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;rolname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'myuser'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your app connects to port 6432 instead of 5432. Nothing else changes in the app code.&lt;/p&gt;
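&lt;p&gt;In practice the switch is often a one-line DSN change. A small stdlib-only helper (hypothetical, not from any driver) that rewrites an existing DSN to point at PgBouncer:&lt;br&gt;
&lt;/p&gt;

```python
from urllib.parse import urlsplit, urlunsplit


def via_pgbouncer(dsn, port=6432):
    # Swap only the port so the same credentials, database, and query
    # parameters now go through PgBouncer instead of Postgres directly.
    parts = urlsplit(dsn)
    auth = ""
    if parts.username:
        auth = parts.username
        if parts.password:
            auth += ":" + parts.password
        auth += "@"
    netloc = "{}{}:{}".format(auth, parts.hostname, port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```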

&lt;h2&gt;
  
  
  The Three Pool Modes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Session Mode
&lt;/h3&gt;

&lt;p&gt;A server connection is assigned when the client connects and held until the client disconnects. This is the safest mode: it behaves identically to a direct PostgreSQL connection. Prepared statements, advisory locks, and &lt;code&gt;LISTEN/NOTIFY&lt;/code&gt; all work correctly.&lt;/p&gt;

&lt;p&gt;Session mode does not help much with connection counts at steady state. Use it when you need full compatibility and your problem is peak load, not constant high concurrency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Transaction Mode
&lt;/h3&gt;

&lt;p&gt;A server connection is assigned for the duration of a transaction and returned to the pool immediately after. This gives you the dramatic reduction in server connections.&lt;/p&gt;

&lt;p&gt;The tradeoff: &lt;strong&gt;session-level state does not persist across transactions&lt;/strong&gt;. This breaks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PREPARE&lt;/code&gt; and server-side prepared statement caching&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SET&lt;/code&gt; commands that are not wrapped in a transaction&lt;/li&gt;
&lt;li&gt;Advisory locks (session-scoped)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LISTEN&lt;/code&gt; and &lt;code&gt;NOTIFY&lt;/code&gt; subscriptions&lt;/li&gt;
&lt;li&gt;Temp tables that are supposed to persist across transactions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Transaction mode works well for stateless web applications using pg (Node.js), psycopg2/psycopg3 (Python), or JDBC (Java), as long as those frameworks don't use session-level features.&lt;/p&gt;

&lt;h3&gt;
  
  
  Statement Mode
&lt;/h3&gt;

&lt;p&gt;A connection is held only for a single SQL statement, then returned. Breaks multi-statement transactions entirely. Rarely the right choice for web applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Choosing Your Mode
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Your app&lt;/th&gt;
&lt;th&gt;Recommended mode&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Stateless API using ORM (Django, Rails, Prisma)&lt;/td&gt;
&lt;td&gt;Transaction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long-lived connections with prepared statements&lt;/td&gt;
&lt;td&gt;Session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Connection count issues at peak only&lt;/td&gt;
&lt;td&gt;Session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Connection count issues at steady state&lt;/td&gt;
&lt;td&gt;Transaction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Serverless functions&lt;/td&gt;
&lt;td&gt;Transaction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Application using LISTEN/NOTIFY&lt;/td&gt;
&lt;td&gt;Session&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Pool Sizing
&lt;/h2&gt;

&lt;p&gt;A reasonable starting formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;default_pool_size = (number of PostgreSQL CPU cores) * 2 + number of disks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, 20-30 works well for most web applications. Check pool utilization under real load:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="n"&gt;POOLS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;cl_waiting &amp;gt; 0&lt;/code&gt; sustained means the pool is undersized. &lt;code&gt;sv_idle&lt;/code&gt; consistently high means it's oversized.&lt;/p&gt;
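&lt;p&gt;That check is easy to automate. A sketch that flags sizing problems from &lt;code&gt;SHOW POOLS&lt;/code&gt; rows; the idle threshold is an illustrative starting point, not a PgBouncer default:&lt;br&gt;
&lt;/p&gt;

```python
def pool_health(pools, idle_threshold=10):
    # `pools` are rows from SHOW POOLS as dicts with at least the
    # database, cl_waiting, and sv_idle columns.
    issues = []
    for p in pools:
        if p["cl_waiting"] > 0:
            issues.append(p["database"] + ": clients waiting; pool likely undersized")
        elif p["sv_idle"] >= idle_threshold:
            issues.append(p["database"] + ": many idle servers; pool likely oversized")
    return issues
```

&lt;p&gt;Run it from your monitoring job against the admin console and alert on any non-empty result sustained over a few minutes; a single momentary &lt;code&gt;cl_waiting&lt;/code&gt; spike is normal.&lt;/p&gt;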

&lt;h2&gt;
  
  
  Monitoring PgBouncer
&lt;/h2&gt;

&lt;p&gt;Connect to the admin console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;psql &lt;span class="nt"&gt;-h&lt;/span&gt; 127.0.0.1 &lt;span class="nt"&gt;-p&lt;/span&gt; 6432 &lt;span class="nt"&gt;-U&lt;/span&gt; pgbouncer pgbouncer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key commands: &lt;code&gt;SHOW POOLS&lt;/code&gt;, &lt;code&gt;SHOW STATS&lt;/code&gt;, &lt;code&gt;SHOW CLIENTS&lt;/code&gt;, &lt;code&gt;SHOW SERVERS&lt;/code&gt;, &lt;code&gt;RELOAD&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SHOW STATS&lt;/code&gt; tells you &lt;code&gt;avg_query_time&lt;/code&gt; and &lt;code&gt;avg_wait_time&lt;/code&gt;. If either is climbing, something is backing up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Pitfalls
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Prepared statements in transaction mode:&lt;/strong&gt; If your app uses server-side prepared statements, transaction mode will break it. Disable them in your driver: &lt;code&gt;prepare_threshold=None&lt;/code&gt; in psycopg3 (psycopg2 does not use server-side prepared statements), and in Node.js avoid named queries in pg or pass &lt;code&gt;prepare: false&lt;/code&gt; to postgres.js.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Connection storms at startup:&lt;/strong&gt; Set &lt;code&gt;reserve_pool_size = 5&lt;/code&gt; and &lt;code&gt;max_client_conn&lt;/code&gt; to at least 2-3x your expected peak.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SSL:&lt;/strong&gt; Always use SSL for both client-to-PgBouncer and PgBouncer-to-PostgreSQL in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Skip PgBouncer
&lt;/h2&gt;

&lt;p&gt;You probably don't need it if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fewer than 50 concurrent connections at peak&lt;/li&gt;
&lt;li&gt;Using a modern ORM with client-side pooling already (Prisma, Django, Rails)&lt;/li&gt;
&lt;li&gt;Serverless workloads with 1-2 connections per function instance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need it if multiple processes or pods each maintain their own pool and the total exceeds &lt;code&gt;max_connections&lt;/code&gt;.&lt;/p&gt;
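&lt;p&gt;The test for that case is simple multiplication. A quick check, assuming the stock &lt;code&gt;max_connections&lt;/code&gt; of 100:&lt;br&gt;
&lt;/p&gt;

```python
def needs_pgbouncer(pods, pool_size_per_pod, max_connections=100):
    # Client-side pools multiply per process: 20 pods with a pool of 25
    # means 500 direct connections, well past a stock limit of 100.
    return pods * pool_size_per_pod > max_connections
```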

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;PostgreSQL connection pooling is not optional at scale. PgBouncer in transaction mode is the right default for most web applications.&lt;/p&gt;

&lt;p&gt;The main things to get right: pool size (start at 20-30 for OLTP), mode (transaction for stateless apps, session for anything using prepared statements or advisory locks), monitoring (&lt;code&gt;cl_waiting&lt;/code&gt; and &lt;code&gt;avg_wait_time&lt;/code&gt;), and SSL in production.&lt;/p&gt;

&lt;p&gt;Connection pooling is unglamorous infrastructure, but it's the difference between an app that falls over at traffic spikes and one that just handles them.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgresql-connection-pooling-pgbouncer" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>tutorial</category>
      <category>database</category>
      <category>webdev</category>
    </item>
    <item>
      <title>PostgreSQL Full Text Search: A Complete Guide</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Mon, 30 Mar 2026 14:49:54 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-full-text-search-a-complete-guide-2nj9</link>
      <guid>https://forem.com/geekyfox90/postgresql-full-text-search-a-complete-guide-2nj9</guid>
      <description>&lt;p&gt;There's a database already running in your stack. It has your users, your content, your transactions. And buried in that same PostgreSQL instance is a full text search engine you've probably never turned on.&lt;/p&gt;

&lt;p&gt;PostgreSQL full text search has been production-ready for over a decade. It handles stemming, stop words, multiple languages, weighted ranking, and trigram fuzzy matching. You don't need Elasticsearch for a search feature. You don't need Algolia if your data is already in Postgres. For most applications, especially those with under a few million documents, built-in full text search is the right call.&lt;/p&gt;

&lt;p&gt;This guide covers everything you need to ship full text search in PostgreSQL: how the underlying model works, how to index correctly, how to rank results, and how it compares to vector search with pgvector.&lt;/p&gt;

&lt;h2&gt;
  
  
  How PostgreSQL Full Text Search Works
&lt;/h2&gt;

&lt;p&gt;PostgreSQL doesn't search raw text. It converts text into a normalized representation called a &lt;code&gt;tsvector&lt;/code&gt;, then matches queries expressed as &lt;code&gt;tsquery&lt;/code&gt; objects. This two-step process is what makes it fast.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;tsvector&lt;/code&gt; is a sorted list of lexemes: normalized word forms that strip suffixes and reduce words to their base form. The word "running" becomes "run". "Postgres" becomes "postgr". Stop words like "the", "a", "an" are dropped entirely.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'The quick brown fox jumps over the lazy dog'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- 'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A &lt;code&gt;tsquery&lt;/code&gt; is what you match against:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'jumping'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: 'jump'&lt;/span&gt;

&lt;span class="c1"&gt;-- Boolean operators&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgres &amp;amp; search'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;-- AND&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgres | mysql'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;-- OR&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'database &amp;amp; !oracle'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;-- NOT&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The match operator is &lt;code&gt;@@&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'PostgreSQL is a powerful database'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'powerful'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting Up Full Text Search
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Option 1: Generated column (recommended for PostgreSQL 12+)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
  &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="n"&gt;TSVECTOR&lt;/span&gt;
  &lt;span class="k"&gt;GENERATED&lt;/span&gt; &lt;span class="n"&gt;ALWAYS&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'B'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;STORED&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;setweight&lt;/code&gt; assigns priority to fields: 'A' (highest) to title, 'B' to body. Documents where the search term appears in the title rank higher.&lt;/p&gt;

&lt;p&gt;Query with ranking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgresql'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option 2: Trigger-maintained column
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="n"&gt;TSVECTOR&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;articles_search_vector_trigger&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
  &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'B'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="n"&gt;articles_search_vector_update&lt;/span&gt;
  &lt;span class="k"&gt;BEFORE&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
  &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;articles_search_vector_trigger&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Indexing for Performance
&lt;/h2&gt;

&lt;p&gt;Without an index, every full text query scans the entire table. GIN (Generalized Inverted Index) is purpose-built for tsvectors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;articles_search_idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this index, full text search queries typically return in milliseconds, even on tables with millions of rows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ranking Results
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgresql &amp;amp; database'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ts_rank_cd&lt;/code&gt; uses cover density (how close matching terms are to each other) and often gives better results for multi-word queries.&lt;/p&gt;
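
&lt;p&gt;To try cover-density ranking, swap the ranking function in the query above (same schema and query as before):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT id, title,
  ts_rank_cd(search_vector, query) AS rank
FROM articles,
  to_tsquery('english', 'postgresql &amp;amp; database') query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 10;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;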

&lt;h2&gt;
  
  
  Snippet Generation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgresql'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;'MaxWords=50, MinWords=15, StartSel=&amp;lt;mark&amp;gt;, StopSel=&amp;lt;/mark&amp;gt;'&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;snippet&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'postgresql'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: &lt;code&gt;ts_headline&lt;/code&gt; does not use the GIN index; it re-parses the raw document text for every row it touches. Compute it only for the rows you actually return, after ranking and pagination, not across the full result set.&lt;/p&gt;
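
&lt;p&gt;One way to follow that advice, assuming the same &lt;code&gt;articles&lt;/code&gt; schema as above, is to rank and paginate in a subquery and build snippets only for the surviving rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Rank and paginate first, then build snippets for just one page
SELECT id, title,
  ts_headline('english', body, q, 'MaxWords=50, MinWords=15') AS snippet
FROM (
  SELECT id, title, body, query AS q
  FROM articles, to_tsquery('english', 'postgresql') query
  WHERE search_vector @@ query
  ORDER BY ts_rank(search_vector, query) DESC
  LIMIT 5
) page;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;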

&lt;h2&gt;
  
  
  Handling User Input Safely
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- websearch_to_tsquery: safe, handles Google-style syntax&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;websearch_to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'"full text search" postgres -oracle'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;websearch_to_tsquery&lt;/code&gt; (PostgreSQL 11+) is the best default for user input. It's injection-safe, never raises a syntax error on malformed input, and supports quoted phrases and &lt;code&gt;-&lt;/code&gt; exclusions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Full Text Search vs pgvector
&lt;/h2&gt;

&lt;p&gt;They solve different problems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full text search&lt;/strong&gt; finds documents containing specific words or phrases. Fast, precise, no ML model required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;pgvector&lt;/strong&gt; finds semantically similar documents, even without shared keywords. Needs an embedding model.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Best approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Blog/docs search&lt;/td&gt;
&lt;td&gt;Full text search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Semantic Q&amp;amp;A, RAG&lt;/td&gt;
&lt;td&gt;pgvector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E-commerce&lt;/td&gt;
&lt;td&gt;Often both&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Many systems use both, merging scores with Reciprocal Rank Fusion. For RAG pipelines, see our guides on &lt;a href="https://rivestack.io/blog/getting-started-with-pgvector" rel="noopener noreferrer"&gt;getting started with pgvector&lt;/a&gt; and &lt;a href="https://rivestack.io/blog/how-to-use-pgvector-with-python" rel="noopener noreferrer"&gt;pgvector with Python&lt;/a&gt;.&lt;/p&gt;
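
&lt;p&gt;A minimal sketch of the fusion step, assuming a hypothetical pgvector &lt;code&gt;embedding&lt;/code&gt; column on &lt;code&gt;articles&lt;/code&gt; and a query embedding bound as &lt;code&gt;$1&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Reciprocal Rank Fusion: sum 1 / (k + rank) from each ranker (k = 60)
WITH fts AS (
  SELECT id, row_number() OVER
    (ORDER BY ts_rank(search_vector, q) DESC) AS r
  FROM articles, websearch_to_tsquery('english', 'postgres replication') q
  WHERE search_vector @@ q
  ORDER BY r LIMIT 50
),
vec AS (
  SELECT id, row_number() OVER (ORDER BY embedding &amp;lt;=&amp;gt; $1) AS r
  FROM articles
  ORDER BY embedding &amp;lt;=&amp;gt; $1 LIMIT 50
)
SELECT id,
  coalesce(1.0 / (60 + fts.r), 0) + coalesce(1.0 / (60 + vec.r), 0) AS score
FROM fts FULL JOIN vec USING (id)
ORDER BY score DESC
LIMIT 20;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;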

&lt;h2&gt;
  
  
  Complete Production Setup
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;BIGSERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="n"&gt;TSVECTOR&lt;/span&gt; &lt;span class="k"&gt;GENERATED&lt;/span&gt; &lt;span class="n"&gt;ALWAYS&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
    &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s1"&gt;'B'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;STORED&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;articles_search_idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;search_articles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_text&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_size&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_offset&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;snippet&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="n"&gt;FLOAT4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;DECLARE&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="n"&gt;TSQUERY&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;websearch_to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query_text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
  &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="n"&gt;QUERY&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'MaxWords=40, MinWords=10, StartSel=&amp;lt;mark&amp;gt;, StopSel=&amp;lt;/mark&amp;gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;
  &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
  &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="n"&gt;page_size&lt;/span&gt; &lt;span class="k"&gt;OFFSET&lt;/span&gt; &lt;span class="n"&gt;page_offset&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Usage&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;search_articles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'postgresql replication'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keyword-based search, relevance ranking, HTML-ready snippets, and pagination, all within PostgreSQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It on Rivestack
&lt;/h2&gt;

&lt;p&gt;PostgreSQL full text search and pgvector work side by side on the same database. No separate infrastructure for keyword vs semantic search.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;Rivestack&lt;/a&gt; gives you a fully managed PostgreSQL instance with pgvector, pg_trgm, and all standard extensions enabled by default. Try it free, no credit card required.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgres-full-text-search" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>tutorial</category>
      <category>database</category>
      <category>sql</category>
    </item>
    <item>
      <title>PostgreSQL High Availability: A Practical Guide for Production</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Sun, 29 Mar 2026 16:08:53 +0000</pubDate>
      <link>https://forem.com/geekyfox90/postgresql-high-availability-a-practical-guide-for-production-1a63</link>
      <guid>https://forem.com/geekyfox90/postgresql-high-availability-a-practical-guide-for-production-1a63</guid>
      <description>&lt;p&gt;Your application is live. Customers are using it. The database goes down.&lt;/p&gt;

&lt;p&gt;How long before traffic routes around the failure? Ten seconds? Five minutes? Never, because you're paged at 2 AM and have to manually promote a replica while the on-call engineer Slacks you asking if the database is "doing a thing"?&lt;/p&gt;

&lt;p&gt;PostgreSQL high availability is one of those topics that looks straightforward in blog posts and turns out to be deeply humbling when you actually implement it in production. This guide covers how PostgreSQL HA actually works, the main tools people use, what typically goes wrong, and when the complexity of DIY HA stops being worth it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "High Availability" Means for PostgreSQL
&lt;/h2&gt;

&lt;p&gt;High availability means your database keeps serving requests even when individual components fail. For PostgreSQL, that typically requires three things working together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Data replication&lt;/strong&gt; — at least one copy of your data exists on a server other than the primary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failure detection&lt;/strong&gt; — something notices when the primary is unreachable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic failover&lt;/strong&gt; — the replica promotes itself to primary without a human in the loop&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;PostgreSQL ships with excellent replication primitives but no built-in automatic failover. The replication part is solid and well-understood. The failover part is where teams get into trouble.&lt;/p&gt;

&lt;h3&gt;
  
  
  Streaming Replication: The Foundation
&lt;/h3&gt;

&lt;p&gt;PostgreSQL replication is based on Write-Ahead Log (WAL) shipping. Every write to the primary is first written to the WAL. Replicas connect to the primary and stream that WAL in near-real-time, replaying it to stay current.&lt;/p&gt;

&lt;p&gt;Setting up a basic standby looks like this in &lt;code&gt;postgresql.conf&lt;/code&gt; on the primary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- postgresql.conf on the primary&lt;/span&gt;
&lt;span class="n"&gt;wal_level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;replica&lt;/span&gt;
&lt;span class="n"&gt;max_wal_senders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;wal_keep_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;GB&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And in &lt;code&gt;pg_hba.conf&lt;/code&gt;, you allow the replica to connect for replication:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="c"&gt;# pg_hba.conf on the primary
&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt; &lt;span class="n"&gt;replication&lt;/span&gt; &lt;span class="n"&gt;replicator&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;.&lt;span class="m"&gt;0&lt;/span&gt;.&lt;span class="m"&gt;0&lt;/span&gt;.&lt;span class="m"&gt;2&lt;/span&gt;/&lt;span class="m"&gt;32&lt;/span&gt; &lt;span class="n"&gt;scram&lt;/span&gt;-&lt;span class="n"&gt;sha&lt;/span&gt;-&lt;span class="m"&gt;256&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The replica connects with a &lt;code&gt;primary_conninfo&lt;/code&gt; in its configuration and starts streaming WAL from the primary. Once streaming, the replica is typically only milliseconds behind.&lt;/p&gt;
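
&lt;p&gt;Creating that replica is usually a single &lt;code&gt;pg_basebackup&lt;/code&gt; call, which clones the primary's data directory and writes &lt;code&gt;primary_conninfo&lt;/code&gt; for you (hostname and paths here are placeholders):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Run on the empty replica host
pg_basebackup \
  --host=10.0.0.1 \
  --username=replicator \
  --pgdata=/var/lib/postgresql/data \
  --wal-method=stream \
  --write-recovery-conf \
  --progress
# --write-recovery-conf creates standby.signal and sets primary_conninfo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;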

&lt;h3&gt;
  
  
  Synchronous vs Asynchronous Replication
&lt;/h3&gt;

&lt;p&gt;By default, PostgreSQL replication is &lt;strong&gt;asynchronous&lt;/strong&gt;: the primary commits a transaction and returns success to the client before confirming the replica received the data. If the primary dies at exactly the wrong moment, you can lose the last few transactions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Synchronous replication&lt;/strong&gt; waits for at least one replica to confirm it has received and written the WAL before reporting the commit as successful:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- postgresql.conf on the primary&lt;/span&gt;
&lt;span class="n"&gt;synchronous_standby_names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'replica1'&lt;/span&gt;
&lt;span class="n"&gt;synchronous_commit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you zero-RPO (recovery point objective) — no committed data is ever lost. The tradeoff is latency: every write waits for a round trip to the replica. On a local network this is typically 1–5ms. Across availability zones it can be 10–30ms depending on the cloud provider.&lt;/p&gt;

&lt;p&gt;Most production setups use synchronous replication for the hot standby and asynchronous replication for additional read replicas or DR standbys that are geographically distant.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: PostgreSQL Doesn't Fail Over Itself
&lt;/h2&gt;

&lt;p&gt;With streaming replication running, you have your data in two places. But if the primary goes down, PostgreSQL doesn't automatically promote the replica. You have two choices:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Manual failover&lt;/strong&gt; — someone runs &lt;code&gt;pg_ctl promote&lt;/code&gt; or &lt;code&gt;SELECT pg_promote()&lt;/code&gt; on the replica. Fast if someone is awake, catastrophic if not.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated failover via an HA tool&lt;/strong&gt; — a separate process watches the primary and promotes the replica when it detects a failure.&lt;/li&gt;
&lt;/ol&gt;
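
&lt;p&gt;For reference, the manual promotion itself is one command on the replica (the data directory path is an example):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Via pg_ctl on the replica host...
pg_ctl promote -D /var/lib/postgresql/data

# ...or from a SQL session on the replica (PostgreSQL 12+)
psql -c 'SELECT pg_promote();'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;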

&lt;p&gt;Almost every production PostgreSQL HA setup uses one of three tools for automated failover: Patroni, pg_auto_failover, or repmgr. They all solve the same problem, but their complexity and tradeoffs differ meaningfully.&lt;/p&gt;

&lt;h2&gt;
  
  
  Patroni: The Industry Standard (and Why It's Hard)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/zalando/patroni" rel="noopener noreferrer"&gt;Patroni&lt;/a&gt; is what most teams with serious PostgreSQL HA requirements end up using. It's battle-tested, highly configurable, and runs at scale. It's also genuinely complex to operate.&lt;/p&gt;

&lt;p&gt;Patroni uses a distributed consensus store — either etcd, Consul, or ZooKeeper — to maintain cluster state and elect a primary. A minimal production Patroni setup is 3 etcd nodes + 2–3 PostgreSQL nodes + HAProxy. That's 5–6 servers before you've even started.&lt;/p&gt;

&lt;p&gt;A minimal &lt;code&gt;patroni.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;scope&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prod-cluster&lt;/span&gt;
&lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/db/&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pg-node-1&lt;/span&gt;

&lt;span class="na"&gt;restapi&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;listen&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0.0.0.0:8008&lt;/span&gt;
  &lt;span class="na"&gt;connect_address&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10.0.0.1:8008&lt;/span&gt;

&lt;span class="na"&gt;etcd3&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10.0.1.1:2379,10.0.1.2:2379,10.0.1.3:2379&lt;/span&gt;

&lt;span class="na"&gt;bootstrap&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;dcs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ttl&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
    &lt;span class="na"&gt;loop_wait&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
    &lt;span class="na"&gt;retry_timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
    &lt;span class="na"&gt;maximum_lag_on_failover&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1048576&lt;/span&gt;
    &lt;span class="na"&gt;synchronous_mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;postgresql&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;listen&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0.0.0.0:5432&lt;/span&gt;
  &lt;span class="na"&gt;connect_address&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10.0.0.1:5432&lt;/span&gt;
  &lt;span class="na"&gt;data_dir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/var/lib/postgresql/data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;HAProxy needs health checks against the Patroni REST API (port 8008), not just the PostgreSQL port — that's how it knows which node is the current primary.&lt;/p&gt;
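
&lt;p&gt;A sketch of the corresponding HAProxy fragment, following the pattern from the Patroni documentation (addresses and names are examples):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;# Route writes to whichever node Patroni reports as primary
listen postgres_primary
    bind *:5432
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg-node-1 10.0.0.1:5432 maxconn 100 check port 8008
    server pg-node-2 10.0.0.2:5432 maxconn 100 check port 8008
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;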

&lt;p&gt;Patroni is genuinely good software. But "run Patroni in production" is a weeks-long project, not an afternoon task.&lt;/p&gt;

&lt;h2&gt;
  
  
  pg_auto_failover: Simpler, More Opinionated
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://pg-auto-failover.readthedocs.io/" rel="noopener noreferrer"&gt;pg_auto_failover&lt;/a&gt; uses a dedicated monitor node instead of etcd:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On the monitor node&lt;/span&gt;
pg_autoctl create monitor &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pgdata&lt;/span&gt; /var/lib/postgresql/monitor &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pgport&lt;/span&gt; 5000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--hostname&lt;/span&gt; monitor.internal

&lt;span class="c"&gt;# On the primary node&lt;/span&gt;
pg_autoctl create postgres &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pgdata&lt;/span&gt; /var/lib/postgresql/data &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--monitor&lt;/span&gt; postgres://autoctl_node@monitor.internal:5000/pg_auto_failover &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--hostname&lt;/span&gt; primary.internal &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pgport&lt;/span&gt; 5432
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Easier to set up than Patroni, but the monitor is a single point of failure for failover decisions (data nodes keep serving traffic if it goes down, but no automatic failover can happen until it's back), and it's less flexible for complex topologies. Good choice for teams that want something working quickly without running etcd.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Goes Wrong in Production
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Split-Brain
&lt;/h3&gt;

&lt;p&gt;The most dangerous failure mode: a network partition causes both nodes to think they're primary and accept writes. Patroni prevents this with etcd distributed locks — a node can only be primary if it holds the lock. If it can't reach etcd, it demotes itself. pg_auto_failover prevents it by centralizing all promotion decisions in the monitor.&lt;/p&gt;
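&lt;p&gt;The lease mechanics are easy to see in miniature. Here's a toy Python model (an illustration only, not Patroni's actual code) of "a node is primary only while its lock lease is fresh":&lt;/p&gt;

```python
import time

class LeaderLease:
    """Toy model of a DCS leader lock with a TTL. Illustration only,
    not Patroni's actual implementation."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.renewed_at = None

    def renew(self, now=None):
        # In Patroni this is an atomic update of the leader key in etcd.
        self.renewed_at = time.monotonic() if now is None else now

    def is_primary(self, now=None):
        # A node acts as primary only while its lease is fresh. If it
        # cannot reach the DCS to renew, the lease expires and the node
        # must demote itself; that is what prevents split-brain.
        if self.renewed_at is None:
            return False
        now = time.monotonic() if now is None else now
        return self.ttl > (now - self.renewed_at)
```

&lt;p&gt;Both nodes can lose contact with each other, but only one can hold the lease, so only one accepts writes.&lt;/p&gt;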

&lt;h3&gt;
  
  
  Replication Lag at Failover Time
&lt;/h3&gt;

&lt;p&gt;With async replication, a lagged replica that promotes will be missing the last N transactions. Patroni's &lt;code&gt;maximum_lag_on_failover&lt;/code&gt; controls this — but set it too conservatively and failover blocks entirely if all replicas are lagged after a network partition.&lt;/p&gt;
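&lt;p&gt;PostgreSQL reports WAL positions as LSNs like &lt;code&gt;16/B374D848&lt;/code&gt;, and lag in bytes is just the difference between two of them. A small sketch of the byte arithmetic (helper names are ours, not Patroni's internals) shows how a threshold like &lt;code&gt;maximum_lag_on_failover: 1048576&lt;/code&gt; maps onto LSNs:&lt;/p&gt;

```python
def lsn_to_bytes(lsn):
    """Convert a PostgreSQL LSN like '16/B374D848' to an absolute byte
    position: the part before the slash is the high 32 bits, the part
    after is the low 32 bits."""
    high, low = lsn.split("/")
    return int(high, 16) * (2 ** 32) + int(low, 16)

def replica_may_promote(primary_lsn, replica_lsn, max_lag_bytes=1048576):
    """A replica is eligible for promotion only if its WAL position is
    within max_lag_bytes of the primary's last known position."""
    lag = lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn)
    return max_lag_bytes >= lag
```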

&lt;h3&gt;
  
  
  Application Not Reconnecting
&lt;/h3&gt;

&lt;p&gt;After failover, point your connection string at a VIP or load balancer, not a node's IP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;postgresql://ha-proxy.internal:5432/mydb?target_session_attrs=read-write
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;target_session_attrs=read-write&lt;/code&gt; parameter tells libpq to reject connections to read-only servers, which helps clients find the current primary automatically.&lt;/p&gt;
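&lt;p&gt;libpq also accepts multiple hosts in a single connection string and tries them in order, which gives clients a fallback path even without a load balancer. A small helper (the function name is ours) that builds such a DSN:&lt;/p&gt;

```python
def build_failover_dsn(hosts, dbname, port=5432):
    """Build a libpq multi-host connection string. libpq tries each host
    in order; with target_session_attrs=read-write it skips any node that
    answers as read-only, so after a failover the client lands on
    whichever node is now the primary."""
    host_list = ",".join(f"{host}:{port}" for host in hosts)
    return f"postgresql://{host_list}/{dbname}?target_session_attrs=read-write"
```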

&lt;h3&gt;
  
  
  Testing Your Failover
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Kill the primary while watching application logs&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl stop postgresql@17-main

&lt;span class="c"&gt;# Check the cluster state&lt;/span&gt;
patronictl &lt;span class="nt"&gt;-c&lt;/span&gt; /etc/patroni/config.yml list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you haven't tested your failover, you don't have HA. You have a plan that might work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Failover Time Question
&lt;/h2&gt;

&lt;p&gt;Typical Patroni failover: 10–30 seconds. The timeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Primary unreachable (0s)&lt;/li&gt;
&lt;li&gt;Patroni TTL expires and the primary is declared dead (up to the TTL, default 30s)&lt;/li&gt;
&lt;li&gt;Replica acquires DCS lock and promotes (1–2s)&lt;/li&gt;
&lt;li&gt;HAProxy detects new primary (2–5s)&lt;/li&gt;
&lt;li&gt;Application reconnects (depends on pool settings)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Reducing TTL speeds detection but increases false positives from transient network blips. Most teams settle on 15–30 seconds.&lt;/p&gt;
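&lt;p&gt;Summing the timeline gives a rough worst case. A quick back-of-the-envelope sketch (illustrative numbers matching the steps above; the function name is ours):&lt;/p&gt;

```python
def worst_case_failover_seconds(ttl=30, promote=2, lb_detect=5, reconnect=3):
    """Rough upper bound on client-visible downtime during automated
    failover: failure detection (up to the full TTL) + replica promotion
    + load balancer health-check detection + application reconnect.
    Defaults are illustrative."""
    return ttl + promote + lb_detect + reconnect
```

&lt;p&gt;Halving the TTL roughly halves the worst case, at the cost of more spurious failovers.&lt;/p&gt;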

&lt;h2&gt;
  
  
  When Managed PostgreSQL High Availability Makes Sense
&lt;/h2&gt;

&lt;p&gt;DIY HA works. Large companies run Patroni at massive scale. But it means running 5+ nodes (a three-node etcd cluster plus at least two PostgreSQL nodes), maintaining etcd, and debugging edge cases at the worst possible time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://rivestack.io/blog/why-managed-postgres-for-ai" rel="noopener noreferrer"&gt;As we've broken down before&lt;/a&gt;, the real cost of self-hosted PostgreSQL HA includes infrastructure, tooling, and engineering time — and engineering time dominates.&lt;/p&gt;

&lt;p&gt;Rivestack's HA clusters handle streaming replication, automatic failover, and connection routing automatically. Failover happens in seconds, your connection string doesn't change, and there's no etcd cluster to maintain. HA clusters start at $99/month with NVMe storage, automated backups, point-in-time recovery, and monitoring included.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Do If You're Starting From Scratch
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;For a managed setup:&lt;/strong&gt; &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;Try Rivestack&lt;/a&gt; — spin up an HA cluster, verify failover works from the dashboard, and move on to building your application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For a self-managed setup:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start with pg_auto_failover for 2–3 nodes and fast setup&lt;/li&gt;
&lt;li&gt;Move to Patroni if you need multi-datacenter support or fine-grained control&lt;/li&gt;
&lt;li&gt;Use HAProxy with Patroni REST API health checks (port 8008) for connection routing&lt;/li&gt;
&lt;li&gt;Enable synchronous replication if your application can't tolerate data loss&lt;/li&gt;
&lt;li&gt;Test failover in staging before relying on it in production&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;PostgreSQL has excellent HA building blocks. Streaming replication is fast, reliable, and well-understood. The gap is automated failover, and Patroni or pg_auto_failover fills it — but adds real operational complexity.&lt;/p&gt;

&lt;p&gt;Test your failover before you need it. The worst time to discover your replica is 10 minutes behind is during an actual outage.&lt;/p&gt;

&lt;p&gt;If you want to skip the infrastructure work, &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;Rivestack&lt;/a&gt; handles the HA layer for you — including pgvector if you're building AI applications. See the &lt;a href="https://rivestack.io/blog/getting-started-with-pgvector" rel="noopener noreferrer"&gt;getting started guide&lt;/a&gt; if that's your stack.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/postgres-high-availability" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>devops</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Use pgvector with Python: A Complete Guide</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Sat, 28 Mar 2026 13:58:17 +0000</pubDate>
      <link>https://forem.com/geekyfox90/how-to-use-pgvector-with-python-a-complete-guide-808</link>
      <guid>https://forem.com/geekyfox90/how-to-use-pgvector-with-python-a-complete-guide-808</guid>
      <description>&lt;p&gt;You've decided to use PostgreSQL for your vector embeddings. Smart move. Now you need to wire it up from Python — and if you've landed here, you've probably already noticed that there are a few different libraries involved, the syntax isn't immediately obvious, and the official pgvector docs give you the C extension but leave the Python story somewhat scattered.&lt;/p&gt;

&lt;p&gt;This guide covers the whole picture: installing the Python client, connecting with both psycopg3 and SQLAlchemy, storing and querying embeddings, building indexes, and wiring it up into a real RAG pipeline. By the end you'll have a working setup you can actually ship.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You Need Before You Start
&lt;/h2&gt;

&lt;p&gt;You'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A PostgreSQL database with the &lt;code&gt;vector&lt;/code&gt; extension enabled&lt;/li&gt;
&lt;li&gt;Python 3.8+&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;pgvector&lt;/code&gt; Python package&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're running PostgreSQL locally, install the pgvector extension from &lt;a href="https://github.com/pgvector/pgvector" rel="noopener noreferrer"&gt;the pgvector GitHub repo&lt;/a&gt; and run &lt;code&gt;CREATE EXTENSION vector;&lt;/code&gt;. If you're using a managed PostgreSQL service, the extension is typically pre-installed — on Rivestack, it's enabled by default on every database.&lt;/p&gt;

&lt;p&gt;Install the Python package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;pgvector
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll also want a database driver. We'll cover the two most common setups for Python: &lt;strong&gt;psycopg3&lt;/strong&gt; (a direct, fast driver, recommended) and &lt;strong&gt;SQLAlchemy&lt;/strong&gt; on top of psycopg2 (for ORM-based projects).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# For psycopg3&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"psycopg[binary]"&lt;/span&gt;

&lt;span class="c"&gt;# For SQLAlchemy&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;sqlalchemy psycopg2-binary
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting Up the Vector Extension
&lt;/h2&gt;

&lt;p&gt;Connect to your database and enable the extension if it isn't already:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run this once per database. You can verify it's installed with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_extension&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;extname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'vector'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Using pgvector with psycopg3
&lt;/h2&gt;

&lt;p&gt;psycopg3 is the modern PostgreSQL driver for Python. It's faster than psycopg2, has proper async support, and the pgvector Python package integrates with it natively.&lt;/p&gt;

&lt;h3&gt;
  
  
  Connecting and Registering the Vector Type
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pgvector.psycopg&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;register_vector&lt;/span&gt;

&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;register_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;register_vector&lt;/code&gt; call is important — it tells psycopg how to serialize and deserialize Python lists/numpy arrays into the &lt;code&gt;vector&lt;/code&gt; type PostgreSQL expects. Without it, you'll get type errors when inserting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating a Table with a Vector Column
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        CREATE TABLE IF NOT EXISTS documents (
            id BIGSERIAL PRIMARY KEY,
            content TEXT NOT NULL,
            embedding VECTOR(1536)
        )
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;1536&lt;/code&gt; dimension matches OpenAI's &lt;code&gt;text-embedding-3-small&lt;/code&gt; model. If you're using a different model, adjust accordingly:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Dimensions&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI text-embedding-3-small&lt;/td&gt;
&lt;td&gt;1536&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI text-embedding-3-large&lt;/td&gt;
&lt;td&gt;3072&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cohere embed-v4&lt;/td&gt;
&lt;td&gt;1024&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google text-embedding-005&lt;/td&gt;
&lt;td&gt;768&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;all-MiniLM-L6-v2 (local)&lt;/td&gt;
&lt;td&gt;384&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
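&lt;p&gt;A dimension mismatch only surfaces at insert time, so it can help to derive the column size from the model name. A small lookup built from the table above (verify the numbers against your provider's docs; the helper name is ours):&lt;/p&gt;

```python
# Embedding dimensions per model, from the table above.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "embed-v4": 1024,
    "text-embedding-005": 768,
    "all-MiniLM-L6-v2": 384,
}

def documents_ddl(model):
    """Build the CREATE TABLE statement with the vector dimension that
    matches the chosen embedding model."""
    dims = EMBEDDING_DIMS[model]
    return (
        "CREATE TABLE IF NOT EXISTS documents ("
        "id BIGSERIAL PRIMARY KEY, "
        "content TEXT NOT NULL, "
        f"embedding VECTOR({dims}))"
    )
```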

&lt;h3&gt;
  
  
  Inserting Embeddings
&lt;/h3&gt;

&lt;p&gt;In practice, you generate embeddings with an API call or a local model, then insert them alongside your content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;

&lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PostgreSQL is a powerful open-source relational database.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector adds vector similarity search to PostgreSQL.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Python is a high-level programming language.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO documents (content, embedding) VALUES (%s, %s)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;pgvector's psycopg integration accepts plain Python lists directly — no special wrapping needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Querying by Similarity
&lt;/h3&gt;

&lt;p&gt;To find the documents most similar to a query, compute the query embedding and use one of pgvector's distance operators:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; — cosine distance (most common for text embeddings)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt; — L2 (Euclidean) distance&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt; — negative inner product
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How does vector search work in PostgreSQL?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        SELECT content, 1 - (embedding &amp;lt;=&amp;gt; %s) AS similarity
        FROM documents
        ORDER BY embedding &amp;lt;=&amp;gt; %s
        LIMIT 5
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;similarity&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; operator returns cosine &lt;em&gt;distance&lt;/em&gt; (0 = identical, 2 = opposite), so &lt;code&gt;1 - distance&lt;/code&gt; gives you cosine &lt;em&gt;similarity&lt;/em&gt; if you want a score between 0 and 1.&lt;/p&gt;
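&lt;p&gt;The distance/similarity relationship is easy to verify in pure Python with toy vectors:&lt;/p&gt;

```python
import math

def cosine_distance(a, b):
    """Cosine distance the way pgvector's cosine operator computes it:
    1 minus cosine similarity. 0 means identical direction, 1 means
    orthogonal, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)
```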

&lt;h2&gt;
  
  
  Using pgvector with SQLAlchemy
&lt;/h2&gt;

&lt;p&gt;If your project uses SQLAlchemy for ORM models, pgvector has first-class support there too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_engine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BigInteger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sqlalchemy.orm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;declarative_base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Session&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pgvector.sqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Vector&lt;/span&gt;

&lt;span class="n"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_engine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql+psycopg2://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;Base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;declarative_base&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;__tablename__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BigInteger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;autoincrement&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nullable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inserting with SQLAlchemy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector makes PostgreSQL a capable vector store.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector makes PostgreSQL a capable vector store.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Querying with SQLAlchemy — use the comparator methods pgvector adds to the column: &lt;code&gt;l2_distance&lt;/code&gt;, &lt;code&gt;cosine_distance&lt;/code&gt;, or &lt;code&gt;max_inner_product&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pgvector.sqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cosine_distance&lt;/span&gt;

&lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector search with PostgreSQL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;order_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cosine_distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Adding an Index for Production
&lt;/h2&gt;

&lt;p&gt;The queries above work fine for development and small datasets — PostgreSQL does an exact nearest-neighbor scan. But with more than ~50,000 vectors, you'll want an approximate nearest-neighbor (ANN) index to keep queries fast.&lt;/p&gt;

&lt;p&gt;pgvector supports two index types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HNSW&lt;/strong&gt; — faster queries, higher memory usage during build, better recall&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IVFFlat&lt;/strong&gt; — faster build, less memory, slightly lower recall&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most production use cases, HNSW is the right choice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        CREATE INDEX IF NOT EXISTS documents_embedding_idx
        ON documents
        USING hnsw (embedding vector_cosine_ops)
        WITH (m = 16, ef_construction = 64)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;vector_cosine_ops&lt;/code&gt; operator class tells pgvector to build the index for cosine distance — make sure it matches the operator you use in queries (&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;). If you're using L2 distance, use &lt;code&gt;vector_l2_ops&lt;/code&gt; instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; Build the index &lt;em&gt;after&lt;/em&gt; loading your initial data. Creating it on an empty table and inserting afterwards also works, but a bulk build on the full dataset is significantly faster, and for IVFFlat the cluster centroids are only meaningful when trained on real data.&lt;/p&gt;

&lt;h3&gt;
  
  
  HNSW parameters
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;m&lt;/code&gt; — number of connections per layer (default 16, range 2–100). Higher = better recall, more memory.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ef_construction&lt;/code&gt; — search depth during build (default 64, range 4–1000). Higher = better recall, slower build.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For query-time speed/recall tradeoff, set &lt;code&gt;hnsw.ef_search&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SET hnsw.ef_search = 100&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Now run your similarity query
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Building a RAG Pipeline with pgvector and Python
&lt;/h2&gt;

&lt;p&gt;Here's a minimal but complete Retrieval-Augmented Generation pipeline using everything above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pgvector.psycopg&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;register_vector&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;openai_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;register_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;openai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;index_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Store a document and its embedding.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO documents (content, embedding) VALUES (%s, %s)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Find the k most relevant documents for a query.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
            SELECT content
            FROM documents
            ORDER BY embedding &amp;lt;=&amp;gt; %s
            LIMIT %s
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Retrieve context and generate an answer.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;context_docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context_docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Answer questions using only the provided context. Be concise.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Context:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Question: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;

&lt;span class="c1"&gt;# Index some documents
&lt;/span&gt;&lt;span class="nf"&gt;index_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgvector supports HNSW and IVFFlat indexes.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;index_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cosine similarity is best for text embedding comparisons.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;index_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Rivestack provides managed PostgreSQL with pgvector pre-installed.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Ask a question
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What index types does pgvector support?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Metadata Filtering
&lt;/h2&gt;

&lt;p&gt;One of the big advantages of pgvector over standalone vector databases is that you can filter by regular columns in the same query. If you have multi-tenant data or want to search within a category, just add a &lt;code&gt;WHERE&lt;/code&gt; clause:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Add a source column to your table
&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    ALTER TABLE documents ADD COLUMN IF NOT EXISTS source TEXT
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Query with metadata filter
&lt;/span&gt;&lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    SELECT content, embedding &amp;lt;=&amp;gt; %s AS distance
    FROM documents
    WHERE source = &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;internal-wiki&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;
    ORDER BY embedding &amp;lt;=&amp;gt; %s
    LIMIT 5
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No syncing, no secondary filtering step, no extra infrastructure. It's just SQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connection Pooling
&lt;/h2&gt;

&lt;p&gt;pgvector queries are fast, but embedding vectors are large (1536 floats ≈ 6KB per vector). If you're running a web application with concurrent requests, use a connection pool to avoid exhausting PostgreSQL's connection limit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;psycopg_pool&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ConnectionPool&lt;/span&gt;

&lt;span class="n"&gt;pool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ConnectionPool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;min_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;configure&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;register_vector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT content FROM documents ORDER BY embedding &amp;lt;=&amp;gt; %s LIMIT 5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On Rivestack, connection pooling is built into the platform so you don't need to manage PgBouncer yourself — you connect to the pooler endpoint and the rest is handled for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Async Support
&lt;/h2&gt;

&lt;p&gt;If you're building with FastAPI, asyncio, or any other async Python framework, psycopg3 has full async support:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pgvector.psycopg&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;register_vector_async&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;psycopg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AsyncConnection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:password@localhost:5432/mydb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;register_vector_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
            SELECT content FROM documents
            ORDER BY embedding &amp;lt;=&amp;gt; %s LIMIT 5
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the &lt;code&gt;register_vector_async&lt;/code&gt; — you need the async version when working with &lt;code&gt;AsyncConnection&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Mistakes
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Using the wrong distance operator for your index.&lt;/strong&gt; If you create an HNSW index with &lt;code&gt;vector_cosine_ops&lt;/code&gt; but query with &lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt; (L2 distance), the index won't be used and you'll get a full sequential scan. Match the operator to the index ops class.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Forgetting &lt;code&gt;register_vector&lt;/code&gt;.&lt;/strong&gt; You'll get &lt;code&gt;can't adapt type 'list'&lt;/code&gt; errors. Call it once per connection (or once on the pool).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Building indexes on empty tables.&lt;/strong&gt; The index will exist, but for IVFFlat in particular the centroids are trained at build time, so an index built before any data is loaded gives poor recall. Load your data first, then build.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not setting &lt;code&gt;ef_search&lt;/code&gt; for precision-sensitive use cases.&lt;/strong&gt; The default is 40, which is fast but sacrifices some recall. For RAG where missing a relevant document matters, &lt;code&gt;SET hnsw.ef_search = 100&lt;/code&gt; is worth the small latency increase.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to Go From Here
&lt;/h2&gt;

&lt;p&gt;You now have everything you need to build vector search into a Python application using pgvector:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;psycopg3 for direct, fast database access&lt;/li&gt;
&lt;li&gt;SQLAlchemy if you prefer an ORM&lt;/li&gt;
&lt;li&gt;HNSW indexes for production performance&lt;/li&gt;
&lt;li&gt;Metadata filtering using standard SQL &lt;code&gt;WHERE&lt;/code&gt; clauses&lt;/li&gt;
&lt;li&gt;A full RAG pipeline skeleton you can extend&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The operational side — backups, connection pooling, high availability, keeping pgvector updated — is where the real work begins if you're self-hosting. If you'd rather not deal with that, &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;try Rivestack&lt;/a&gt;: managed PostgreSQL with pgvector pre-installed, NVMe storage for fast index traversal, and automatic backups. You get the same SQL interface you've just built, without the 3 AM pages.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://rivestack.io/blog/how-to-use-pgvector-with-python" rel="noopener noreferrer"&gt;rivestack.io&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>python</category>
      <category>tutorial</category>
      <category>ai</category>
    </item>
    <item>
      <title>pgvector Cosine Distance: How &lt;=&gt; Actually Works</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Fri, 27 Mar 2026 21:17:05 +0000</pubDate>
      <link>https://forem.com/geekyfox90/pgvector-cosine-distance-how-actually-works-1458</link>
      <guid>https://forem.com/geekyfox90/pgvector-cosine-distance-how-actually-works-1458</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;When we first wired up pgvector for semantic search, I assumed &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; (cosine distance) was just "the normal one." Then a colleague asked why our similarity scores were sometimes above 1.0, and I realized I'd been cargo-culting the operator without understanding what it actually returns.&lt;/p&gt;

&lt;p&gt;The pgvector cosine distance operator &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; is probably the most used distance function in similarity search — and the most misunderstood. Here's what's actually happening under the hood, when to use it, and when to switch.&lt;/p&gt;

&lt;h2&gt;
  
  
  What &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; Returns
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; returns &lt;strong&gt;cosine distance&lt;/strong&gt;, not cosine similarity. The relationship is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cosine_distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;cosine_similarity&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So two identical vectors return &lt;code&gt;0&lt;/code&gt; (not &lt;code&gt;1&lt;/code&gt;), and two orthogonal vectors return &lt;code&gt;1&lt;/code&gt;. Two vectors pointing in opposite directions return &lt;code&gt;2&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Identical vectors: distance = 0&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'[1,0,0]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[1,0,0]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: 0&lt;/span&gt;

&lt;span class="c1"&gt;-- Opposite vectors: distance = 2&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'[1,0,0]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[-1,0,0]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: 2&lt;/span&gt;

&lt;span class="c1"&gt;-- Find the 5 nearest neighbors by cosine distance&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[0.1, 0.3, ...]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[0.1, 0.3, ...]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This trips people up constantly. If you're used to thinking "higher = more similar," you need to flip your mental model. With &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;, lower scores are better.&lt;/p&gt;
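To make the flipped mental model concrete, here is a pure-Python sketch of the same computation (no database required; the example vectors are illustrative):

```python
import math

def cosine_distance(a, b):
    # What pgvector's cosine distance operator computes:
    # 1 - cos(angle between a and b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

print(cosine_distance([1, 0, 0], [1, 0, 0]))   # identical directions -> 0.0
print(cosine_distance([1, 0, 0], [2, 0, 0]))   # scaled copy -> 0.0 (magnitude is ignored)
print(cosine_distance([1, 0, 0], [0, 1, 0]))   # orthogonal -> 1.0
print(cosine_distance([1, 0, 0], [-1, 0, 0]))  # opposite directions -> 2.0
```

Lower is better, and scaling a vector does not change the result, which is exactly why this metric suits embeddings that encode meaning in direction rather than length.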

&lt;h2&gt;
  
  
  Three Operators, Three Use Cases
&lt;/h2&gt;

&lt;p&gt;pgvector exposes three distance operators:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operator&lt;/th&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cosine distance&lt;/td&gt;
&lt;td&gt;NLP embeddings, semantic search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;L2 (Euclidean) distance&lt;/td&gt;
&lt;td&gt;Image embeddings, dense numeric vectors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Negative inner product&lt;/td&gt;
&lt;td&gt;When vectors are already unit-normalized&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; when&lt;/strong&gt;: your embeddings come from an NLP model (OpenAI, Cohere, Mistral, etc.) and you care about directional similarity rather than magnitude. Most language model embeddings encode meaning in the direction of the vector, not its length.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt; when&lt;/strong&gt;: magnitude matters. Image feature vectors or tabular embeddings where distance in absolute space is meaningful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt; when&lt;/strong&gt;: your vectors are already L2-normalized (unit length). It returns the negative dot product, which is equivalent to cosine similarity for unit vectors — and it's faster because it skips the normalization step.&lt;/p&gt;
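&lt;p&gt;The three metrics are easy to sanity-check outside Postgres. A toy NumPy sketch (plain Python, not pgvector itself) that mirrors what each operator computes:&lt;/p&gt;

```python
import numpy as np

def cosine_distance(a, b):
    # What the cosine operator computes: 1 - cos(angle between a and b)
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def l2_distance(a, b):
    # What the L2 operator computes: straight-line (Euclidean) distance
    return float(np.linalg.norm(a - b))

def neg_inner_product(a, b):
    # What the inner-product operator computes: negated dot product
    return float(-np.dot(a, b))

# Two unit-length toy vectors
a = np.array([1.0, 0.0])
b = np.array([0.6, 0.8])

print(round(cosine_distance(a, b), 6))    # 0.4
print(round(l2_distance(a, b), 3))        # 0.894
print(round(neg_inner_product(a, b), 6))  # -0.6
```

For the unit vectors above, cosine distance (0.4) is exactly 1 plus the negative inner product (-0.6), which is why the two operators rank unit-normalized vectors identically.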

&lt;h2&gt;
  
  
  The Normalization Shortcut
&lt;/h2&gt;

&lt;p&gt;If you pre-normalize your embeddings before inserting them, &lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt; is strictly faster than &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; and produces equivalent ranking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Normalize on insert (Python)&lt;/span&gt;
&lt;span class="n"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;def&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;norm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vec&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt; &lt;span class="n"&gt;if&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;vec&lt;/span&gt;

&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;-- Then query with inner product&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;#&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_vec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;#&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_vec&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We benchmarked this on a 1M-row table with 1536-dim vectors. Pre-normalizing and using &lt;code&gt;&amp;lt;#&amp;gt;&lt;/code&gt; cut query time by ~18% vs &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; on an IVFFlat index. Not huge, but free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Index Compatibility
&lt;/h2&gt;

&lt;p&gt;One thing that bites teams: &lt;strong&gt;your index operator class must match your query operator&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- For cosine distance queries (&amp;lt;=&amp;gt;), use vector_cosine_ops&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;ivfflat&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector_cosine_ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- For L2 queries (&amp;lt;-&amp;gt;), use vector_l2_ops&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;ivfflat&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector_l2_ops&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you create an &lt;code&gt;ivfflat&lt;/code&gt; index with &lt;code&gt;vector_l2_ops&lt;/code&gt; and then query with &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;, Postgres will do a full sequential scan. No error, no warning — just slow queries that silently skip the index. Run &lt;code&gt;EXPLAIN&lt;/code&gt; and look for &lt;code&gt;Index Scan&lt;/code&gt; vs &lt;code&gt;Seq Scan&lt;/code&gt; to confirm your index is being used.&lt;/p&gt;
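&lt;p&gt;A small guard in application code can catch the mismatch before it ships. A hypothetical helper (the opclass names are pgvector's own; everything else is illustrative):&lt;/p&gt;

```python
# Map each pgvector distance metric to the index opclass it requires.
OPCLASS_FOR_METRIC = {
    "cosine": "vector_cosine_ops",     # cosine distance queries
    "l2": "vector_l2_ops",             # Euclidean distance queries
    "inner_product": "vector_ip_ops",  # negative inner product queries
}

def required_opclass(metric: str) -> str:
    """Return the opclass an ivfflat (or hnsw) index needs for this metric."""
    try:
        return OPCLASS_FOR_METRIC[metric]
    except KeyError:
        raise ValueError(f"unknown pgvector metric: {metric!r}") from None

print(required_opclass("cosine"))  # vector_cosine_ops
```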

&lt;h2&gt;
  
  
  Picking Lists and Probes
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;lists&lt;/code&gt; is the number of clusters IVFFlat partitions your data into. A good starting point is &lt;code&gt;sqrt(row_count)&lt;/code&gt;. At query time, &lt;code&gt;ivfflat.probes&lt;/code&gt; controls how many clusters to search:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Set probes at session level (or per query)&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;ivfflat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;probes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[...]'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dist&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;dist&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Higher &lt;code&gt;probes&lt;/code&gt; = better recall, slower queries. For most semantic search workloads, &lt;code&gt;probes = 10&lt;/code&gt; gives 95%+ recall at reasonable speed. Test with your actual data.&lt;/p&gt;
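&lt;p&gt;The sqrt heuristic and a probes starting point are easy to encode. A rough sketch (the function names and the 1% probes ratio are illustrative defaults, not pgvector settings — always validate against your own recall numbers):&lt;/p&gt;

```python
import math

def starting_lists(row_count: int) -> int:
    """sqrt(row_count): the starting point for ivfflat's 'lists' parameter."""
    return max(1, round(math.sqrt(row_count)))

def starting_probes(lists: int, ratio: float = 0.01) -> int:
    # Searching ~1% of clusters matches the probes = 10 example above
    # for a 1M-row table (lists = 1000). Tune against real recall.
    return max(1, round(lists * ratio))

lists = starting_lists(1_000_000)
print(lists)                   # 1000
print(starting_probes(lists))  # 10
```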

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;The pgvector cosine distance operator &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; is the right default for language model embeddings — but know what it returns (distance, not similarity), verify your index operator class matches your query operator, and consider pre-normalizing if you're chasing extra performance.&lt;/p&gt;

&lt;p&gt;If you want pgvector running on managed Postgres without tuning kernel params or fighting storage I/O, &lt;a href="https://rivestack.io" rel="noopener noreferrer"&gt;Rivestack&lt;/a&gt; handles that — provisioned in 30 seconds with the pgvector extension pre-enabled.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>pgvector</category>
      <category>database</category>
    </item>
    <item>
      <title>I built a managed pgvector service and here's what I learned about vector search performance</title>
      <dc:creator>Yasser B.</dc:creator>
      <pubDate>Wed, 25 Mar 2026 09:01:38 +0000</pubDate>
      <link>https://forem.com/geekyfox90/i-built-a-managed-pgvector-service-and-heres-what-i-learned-about-vector-search-performance-ehl</link>
      <guid>https://forem.com/geekyfox90/i-built-a-managed-pgvector-service-and-heres-what-i-learned-about-vector-search-performance-ehl</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrmi6fijilv2pvd27uua.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrmi6fijilv2pvd27uua.png" alt="NVMe vs cloud SSD pgvector performance benchmark visualization in Firewatch art style" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;I Ran pgvector on NVMe vs Cloud SSD. The Difference Shocked Me.&lt;/h2&gt;

&lt;p&gt;2,000 queries per second at under 4ms. That's what I'm getting on a $35/month server. Let me tell you how I got there and what I had to build to make it work.&lt;/p&gt;

&lt;h2&gt;The problem started with a side project&lt;/h2&gt;

&lt;p&gt;I was building a RAG pipeline. Standard stuff: OpenAI embeddings, PostgreSQL, pgvector extension, HNSW index. Everything worked fine in development. Then I moved it to a managed database on a regular cloud provider and watched my query latency go from 4ms to 47ms under any real load.&lt;/p&gt;

&lt;p&gt;I spent two days thinking my index was wrong. Wrong ef_search value. Wrong m parameter. Wrong dimension count. None of that was it.&lt;br&gt;
The problem was the disk.&lt;/p&gt;

&lt;h2&gt;HNSW does not behave like a normal database query&lt;/h2&gt;

&lt;p&gt;Most database queries are sequential reads. Your disk reads a chunk of data in order, hands it back, done. SSDs are fast at this. Even gp3 cloud SSDs are fast at this.&lt;/p&gt;

&lt;p&gt;HNSW is different. An HNSW index traversal is essentially a graph walk. You start at an entry point, compare distances, jump to neighbors, compare again, jump again. Each jump is a random read to a different location on disk. The more vectors you have, the more jumps, the more random reads.&lt;/p&gt;

&lt;p&gt;Cloud SSDs are slow at random reads. NVMe drives are very fast at random reads. That gap, which barely matters for most database workloads, is everything for pgvector at scale.&lt;br&gt;
I didn't believe it until I benchmarked it myself.&lt;/p&gt;

&lt;h2&gt;The actual numbers&lt;/h2&gt;

&lt;p&gt;I tested 1 million vectors at 1536 dimensions (OpenAI text-embedding-3-small), HNSW index, cosine distance, ef_search=40, 16 concurrent clients.&lt;/p&gt;

&lt;p&gt;On a cloud SSD backed instance: around 410 QPS, p95 latency at 18ms.&lt;br&gt;
On NVMe: 2,150 QPS, p95 at 2.8ms.&lt;br&gt;
Same PostgreSQL version. Same pgvector version. Same query. Same index parameters. Five times the throughput, six times lower latency. Just from the storage layer.&lt;/p&gt;

&lt;p&gt;That's when I decided to build Rivestack.&lt;/p&gt;

&lt;h2&gt;Why PostgreSQL and not a dedicated vector database&lt;/h2&gt;

&lt;p&gt;Honest answer: because I didn't want to manage two databases.&lt;br&gt;
The moment you move your vectors to Pinecone or Qdrant or Weaviate, you have split your data across two systems. Your relational data lives in Postgres. Your vectors live somewhere else. Every query that needs both involves a round trip between systems. That latency adds up fast.&lt;/p&gt;

&lt;p&gt;With pgvector you write one query. You can filter by user_id, join against your documents table, and do a vector similarity search in a single SQL statement. That's not a minor convenience. It changes how you architect the whole application.&lt;/p&gt;

&lt;p&gt;The only real argument against pgvector is scale. If you have a billion vectors, you need a dedicated system. For the other 99% of applications, pgvector on NVMe is genuinely competitive.&lt;/p&gt;

&lt;h2&gt;What I actually built&lt;/h2&gt;

&lt;p&gt;Rivestack is a managed PostgreSQL service where pgvector is the whole point, not an afterthought. NVMe storage on every plan. HNSW pre-configured with sensible defaults. Daily backups, point-in-time recovery, HA failover if you want it.&lt;/p&gt;

&lt;p&gt;The thing I kept running into with other managed Postgres providers is that pgvector is just... there. An extension you can enable. Nobody has tuned the storage layer for it. Nobody has written documentation for RAG use cases. Nobody has thought about what happens to your index when you have 5 million vectors and 50 concurrent clients.&lt;/p&gt;

&lt;p&gt;That's the gap I'm filling.&lt;/p&gt;

&lt;h2&gt;The honest trade-offs&lt;/h2&gt;

&lt;p&gt;Rivestack is not for everyone. If you need built-in auth, storage buckets, or a real-time WebSocket layer, use Supabase. They're great at that. If you have hundreds of millions of vectors and need automatic sharding, use Pinecone. They're built for that.&lt;/p&gt;

&lt;p&gt;Rivestack is for teams who need pgvector to actually perform under load and don't want to spend three days tuning PostgreSQL configuration to get there.&lt;/p&gt;

&lt;h2&gt;What I'd do differently&lt;/h2&gt;

&lt;p&gt;I spent too long on the benchmark tooling before I had a single user. Classic founder mistake. The infrastructure is solid but I should have shipped earlier and iterated based on real workloads instead of synthetic ones.&lt;/p&gt;

&lt;p&gt;Also I underestimated how much developers care about EU data residency. It's come up in every conversation I've had since launch. If you're building anything GDPR-adjacent, knowing your vectors never leave EU territory is not a minor detail.&lt;/p&gt;

&lt;h2&gt;Try it if you're building with embeddings&lt;/h2&gt;

&lt;p&gt;Free shared tier at rivestack.io, no credit card required. Paid plans start at $35/month for a dedicated node with 2 vCPU, 4GB RAM and 55GB NVMe.&lt;br&gt;
If you're already running pgvector somewhere, migration is just pg_dump and pg_restore. Takes about five minutes for most databases.&lt;/p&gt;

&lt;p&gt;What are you using for vector storage right now? And what's the biggest pain point you've hit with it? Genuinely curious whether the storage layer issue I ran into is common or whether I just had unusually bad luck with my cloud provider.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>vectordatabase</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
