<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Max Shafiei</title>
    <description>The latest articles on Forem by Max Shafiei (@max_shafiei_f46a24e31e2cb).</description>
    <link>https://forem.com/max_shafiei_f46a24e31e2cb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3153027%2F5e528edd-66f9-4b50-8676-8491751f8aa1.jpeg</url>
      <title>Forem: Max Shafiei</title>
      <link>https://forem.com/max_shafiei_f46a24e31e2cb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/max_shafiei_f46a24e31e2cb"/>
    <language>en</language>
    <item>
      <title>PostgreSQL Ignored My Index — Until I Fought Back (A Planner Battle Story)</title>
      <dc:creator>Max Shafiei</dc:creator>
      <pubDate>Mon, 12 May 2025 09:42:18 +0000</pubDate>
      <link>https://forem.com/max_shafiei_f46a24e31e2cb/postgresql-ignored-my-index-until-i-fought-back-a-planner-battle-story-5ao0</link>
      <guid>https://forem.com/max_shafiei_f46a24e31e2cb/postgresql-ignored-my-index-until-i-fought-back-a-planner-battle-story-5ao0</guid>
      <description>&lt;p&gt;I noticed that a seemingly harmless query on one of our lookup tables was responsible for &lt;strong&gt;~13%&lt;/strong&gt; of total CPU usage in production. This was surprising, especially since the query targeted a text field backed by a trigram GIN index — which should’ve made it lightning fast.&lt;/p&gt;

&lt;p&gt;But PostgreSQL had other ideas. It kept choosing a sequential scan, leading to unnecessary CPU load.&lt;/p&gt;

&lt;p&gt;This post walks through how I investigated the problem, why I rejected some common solutions, and the subtle one-line trick that ended up working — without extensions, transactions, or rewriting the query.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔥 The Problem
&lt;/h2&gt;

&lt;p&gt;Here's the query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;company_entities&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;category_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;123&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;ILIKE&lt;/span&gt; &lt;span class="s1"&gt;'%alpha%'&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;ILIKE&lt;/span&gt; &lt;span class="s1"&gt;'%group%'&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;ILIKE&lt;/span&gt; &lt;span class="s1"&gt;'%inc%'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even though a trigram GIN index existed on &lt;code&gt;name&lt;/code&gt;, PostgreSQL consistently chose a sequential scan:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Seq&lt;/span&gt; &lt;span class="n"&gt;Scan&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;company_entities&lt;/span&gt;
  &lt;span class="n"&gt;Filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;ILIKE&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
  &lt;span class="k"&gt;Rows&lt;/span&gt; &lt;span class="n"&gt;Removed&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;Filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;9996&lt;/span&gt;
  &lt;span class="n"&gt;Execution&lt;/span&gt; &lt;span class="nb"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="mi"&gt;46&lt;/span&gt; &lt;span class="n"&gt;ms&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This would be understandable for a massive table — but this one had only ~17,000 rows.&lt;/p&gt;

&lt;p&gt;Turns out, Postgres’s planner estimated that scanning the whole table would be cheaper than using the index — which simply wasn't true under real production load.&lt;/p&gt;




&lt;h3&gt;
  
  
  🧪 Can We Force the Index?
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Option 1: &lt;code&gt;SET LOCAL enable_seqscan = off&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;Yes, this works. Wrap the query in a transaction and disable sequential scans for that scope:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"SET LOCAL enable_seqscan = off"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="no"&gt;CompanyEntity&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;category_id: &lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"name ILIKE ANY (ARRAY[?, ?, ?])"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"%alpha%"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"%group%"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"%inc%"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✅ It forces the planner to use the GIN index&lt;br&gt;&lt;br&gt;
✅ Execution time drops by ~10×&lt;/p&gt;

&lt;p&gt;But here’s the catch.&lt;/p&gt;
&lt;h5&gt;
  
  
  🚫 Why we &lt;em&gt;didn’t&lt;/em&gt; go with it:
&lt;/h5&gt;

&lt;p&gt;My colleagues raised valid concerns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It requires wrapping read-only queries in transactions, which can surprise developers or break expectations.&lt;/li&gt;
&lt;li&gt;It adds noise and complexity to otherwise simple scopes or service calls.&lt;/li&gt;
&lt;li&gt;It doesn't compose well — if the calling code already opens a transaction, behavior can become unpredictable.&lt;/li&gt;
&lt;li&gt;It smells like a workaround, not a long-term fix.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we looked for something safer and less intrusive.&lt;/p&gt;


&lt;h4&gt;
  
  
  Option 2: A Harmless Dummy Condition
&lt;/h4&gt;

&lt;p&gt;After testing a few ideas, one simple trick worked:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;char_length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This does &lt;strong&gt;nothing&lt;/strong&gt; logically — unless &lt;code&gt;name&lt;/code&gt; is null (which it isn't in our dataset). But Postgres treats it as a filtering condition on the &lt;code&gt;name&lt;/code&gt; column, and that’s enough to make the planner reevaluate its assumptions.&lt;/p&gt;

&lt;h5&gt;
  
  
  The query plan switched to:
&lt;/h5&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Bitmap&lt;/span&gt; &lt;span class="k"&gt;Index&lt;/span&gt; &lt;span class="n"&gt;Scan&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;index_company_entities_on_name&lt;/span&gt;
&lt;span class="n"&gt;Execution&lt;/span&gt; &lt;span class="nb"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="n"&gt;ms&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s over 10× faster, without changing query logic or semantics.&lt;/p&gt;

&lt;p&gt;We also verified the results were identical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;query_without_dummy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pluck&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;query_with_dummy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pluck&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="s2"&gt;"Mismatch!"&lt;/span&gt; &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✅ Same data&lt;br&gt;&lt;br&gt;
✅ Less CPU&lt;br&gt;&lt;br&gt;
✅ No weird transactional behavior&lt;/p&gt;

&lt;p&gt;I tried several other logically harmless dummy conditions to influence the planner. The idea was always the same: nudge PostgreSQL into using the GIN index without changing results. But most of these failed, for example:&lt;/p&gt;

&lt;p&gt;This seemed like an obvious candidate. All name values are non-null in our dataset, so the condition is always true.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Result: Still triggered a sequential scan.&lt;br&gt;
Likely because Postgres sees this as a no-op for planning purposes — not selective, not expressive enough.&lt;/p&gt;


&lt;h4&gt;
  
  
  Option 3: Query Hints or Extensions?
&lt;/h4&gt;

&lt;p&gt;We also considered using the &lt;code&gt;pg_hint_plan&lt;/code&gt; extension to force index usage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="cm"&gt;/*+ BitmapScan(company_entities index_company_entities_on_name) */&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But installing a C extension in a large organization (especially one on managed Postgres) is... ambitious.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It requires infra buy-in
&lt;/li&gt;
&lt;li&gt;It's harder to test and maintain
&lt;/li&gt;
&lt;li&gt;It adds a new surface area for query tuning most devs won’t understand&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we skipped it.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧩 Summary of Options
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;Index Used?&lt;/th&gt;
&lt;th&gt;Practical in Prod?&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SET LOCAL enable_seqscan = off&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;⚠️ Not ideal&lt;/td&gt;
&lt;td&gt;Works, but adds transactional complexity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query Hints or Extensions&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Seems hacky and adds dependency on an extension&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;char_length(name) &amp;gt;= 0&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Subtle, safe, and effective&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  ✅ Final Takeaway
&lt;/h2&gt;

&lt;p&gt;PostgreSQL’s query planner is smart — but not perfect. Especially when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The table is small
&lt;/li&gt;
&lt;li&gt;You’re using &lt;code&gt;%...%&lt;/code&gt; patterns
&lt;/li&gt;
&lt;li&gt;The index exists but looks “too expensive” at first glance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In our case, the best fix was &lt;strong&gt;not&lt;/strong&gt; to rewrite the query, install an extension, or wrap everything in transactions.&lt;/p&gt;

&lt;p&gt;Instead, it was a &lt;strong&gt;single, harmless condition&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;char_length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🧠 Low effort&lt;br&gt;&lt;br&gt;
🔒 Safe for production&lt;br&gt;&lt;br&gt;
⚡ Fast and effective&lt;/p&gt;

&lt;p&gt;Have you ever fought the Postgres planner? I’d love to hear how you nudged it in the right direction.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>database</category>
      <category>postgres</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
