<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Sakthivel C</title>
    <description>The latest articles on Forem by Sakthivel C (@sakthivel_c_98e5dce09e5d9).</description>
    <link>https://forem.com/sakthivel_c_98e5dce09e5d9</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3871941%2F1a94c8cb-b864-4630-b69b-75bce8bd88b5.jpg</url>
      <title>Forem: Sakthivel C</title>
      <link>https://forem.com/sakthivel_c_98e5dce09e5d9</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/sakthivel_c_98e5dce09e5d9"/>
    <language>en</language>
    <item>
      <title>Why Your AWS EKS Cluster Isn't Scaling Down — The PDB Trap With Stateless Services</title>
      <dc:creator>Sakthivel C</dc:creator>
      <pubDate>Thu, 16 Apr 2026 14:50:12 +0000</pubDate>
      <link>https://forem.com/sakthivel_c_98e5dce09e5d9/why-your-aws-eks-cluster-isnt-scaling-down-the-pdb-trap-with-stateless-services-1063</link>
      <guid>https://forem.com/sakthivel_c_98e5dce09e5d9/why-your-aws-eks-cluster-isnt-scaling-down-the-pdb-trap-with-stateless-services-1063</guid>
      <description>&lt;h2&gt;
  
  
  Table Of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;What is a Pod Disruption Budget?&lt;/li&gt;
&lt;li&gt;The Problem - PDB Blocking Node Scale Down&lt;/li&gt;
&lt;li&gt;Why This Is Easy To Miss&lt;/li&gt;
&lt;li&gt;The Fix — PDB Only For Stateful Services&lt;/li&gt;
&lt;li&gt;Key Takeaway&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Kubernetes cost optimization on AWS EKS often focuses on scaling up efficiently, but scaling down is where hidden costs live. Cluster Autoscaler identifies nodes that are consuming few resources and removes them (as long as the node count stays above the minimum configured for the cluster) to reduce overall EKS cost. In a production environment we worked on, we noticed that nodes weren't being scaled down by the autoscaler even when resource usage was very low. After investigating, the culprit turned out to be something small and easy to overlook: a Pod Disruption Budget configured on a stateless service.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is a Pod Disruption Budget?
&lt;/h2&gt;

&lt;p&gt;A Pod Disruption Budget (PDB) is a Kubernetes resource that limits how many pods of an application can be down at the same time during voluntary disruptions, such as node drains, cluster upgrades, or autoscaler scale-down events. It is used to keep critical services available through those disruptions. It does not protect against involuntary disruptions such as node failures or pod OOM events.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;policy/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PodDisruptionBudget&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-service-pdb&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;minAvailable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-service&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With minAvailable: 1, this tells Kubernetes: "at least 1 pod of this service must always be running."&lt;/p&gt;

&lt;p&gt;For stateful services like Redis or Kafka this makes complete sense. You don't want all the pods of these services going down unexpectedly; some minimum number of pods must stay available at all costs. The problem starts when you apply this same logic to stateless services.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem - PDB Blocking Node Scale Down
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Here's the exact scenario we ran into:
&lt;/h3&gt;

&lt;p&gt;A stateless service was running with 1 replica and had a PDB with minAvailable: 1. The pod was consuming very little CPU and memory.&lt;br&gt;
Cluster Autoscaler on EKS identified the node as underutilized and tried to scale it down.&lt;br&gt;
To remove the node, it first had to evict the pod, but the PDB required at least 1 pod to be available at all times. With only 1 replica, evicting it would violate the PDB.&lt;/p&gt;

&lt;p&gt;The result: the autoscaler couldn't evict the pod, so the node stayed up indefinitely. The node was essentially stuck: too empty to be useful, too protected to be removed.&lt;br&gt;
Cluster Autoscaler → tries to drain node&lt;br&gt;
                   → attempts to evict pod&lt;br&gt;
                   → PDB blocks eviction (minAvailable: 1, replicas: 1)&lt;br&gt;
                   → node scale down blocked&lt;br&gt;
                   → you keep paying for an underutilized node&lt;/p&gt;
&lt;h2&gt;
  
  
  Why This Is Easy To Miss
&lt;/h2&gt;

&lt;p&gt;The pod itself showed no issues. CPU and memory were fine. HPA wasn't triggering. Everything looked healthy from an application perspective. The only sign was nodes not scaling down during low traffic periods — which is easy to dismiss as "autoscaler being slow" rather than investigating deeper.&lt;/p&gt;
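
&lt;p&gt;One place the problem does show up is the PDB's own status. As an illustrative sketch (not captured from our cluster), running kubectl get pdb my-service-pdb -o yaml against the single-replica setup described above would be expected to report something like the excerpt below. The status field names come from the policy/v1 API; the values are what the minAvailable: 1 and 1-replica combination implies.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Illustrative excerpt of: kubectl get pdb my-service-pdb -o yaml
spec:
  minAvailable: 1
status:
  expectedPods: 1         # pods matched by the selector
  currentHealthy: 1       # the single replica is healthy
  desiredHealthy: 1       # minAvailable requires 1 healthy pod
  disruptionsAllowed: 0   # currentHealthy - desiredHealthy: no eviction is permitted
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A disruptionsAllowed value of 0 on a healthy workload is a strong hint that the PDB, not the autoscaler, is what keeps the node alive.&lt;/p&gt;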
&lt;h2&gt;
  
  
  The Fix — PDB Only For Stateful Services
&lt;/h2&gt;

&lt;p&gt;The solution was straightforward once we identified the cause:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Removed PDBs entirely from stateless services.&lt;/li&gt;
&lt;li&gt;Kept PDBs only for stateful services such as Redis, Kafka, and similar infrastructure components.&lt;/li&gt;
&lt;li&gt;Moved those stateful services to a dedicated node group. A PDB only guards against voluntary evictions; it does nothing when stateless pods on the same node consume so many resources that the stateful pods are affected.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With this setup, stateful services run in a dedicated node group, isolated from stateless pods, and their PDBs keep these critical infrastructure pods available during drain events so that a drain does not turn into a production-wide outage.&lt;/p&gt;
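
&lt;p&gt;As a rough sketch of the dedicated node group idea, a stateful workload's pod template could pin itself to that group with a nodeSelector and a toleration. The label workload-type: stateful and the taint dedicated=stateful:NoSchedule are hypothetical names chosen for this example; the node group itself would need to carry the matching label and taint.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Pod template fragment for a stateful workload (for example a Redis StatefulSet)
spec:
  template:
    spec:
      nodeSelector:
        workload-type: stateful    # hypothetical label on the dedicated node group
      tolerations:
        - key: dedicated           # hypothetical taint on the dedicated node group
          operator: Equal
          value: stateful
          effect: NoSchedule
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The nodeSelector keeps the stateful pods on the dedicated nodes, while the taint keeps stateless pods (which lack the toleration) off them.&lt;/p&gt;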

&lt;p&gt;Stateless services by definition can handle being evicted and rescheduled; that's the whole point of being stateless. They don't need disruption protection by default. If even brief disruptions are not acceptable, PDBs with the maxUnavailable option can be considered, but isolating such services in a separate EKS node group, on a higher-tier instance type chosen for whether they are CPU- or memory-intensive, would be a better choice.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;#  PDB makes sense here — stateful service&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;policy/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PodDisruptionBudget&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-pdb&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;minAvailable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;#  Avoid this — stateless service with single replica&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;policy/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PodDisruptionBudget&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-stateless-service-pdb&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;minAvailable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-stateless-service&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
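
&lt;p&gt;If a stateless service still needs some disruption protection, a PDB based on maxUnavailable is one possible shape. The sketch below reuses the example names from above and is not something we deployed. With maxUnavailable: 1 an eviction of one pod at a time is always permitted, so node drains are never blocked.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch: stateless service with a PDB that never blocks a drain
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-stateless-service-pdb
spec:
  maxUnavailable: 1    # at most one pod may be evicted at a time
  selector:
    matchLabels:
      app: my-stateless-service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that with a single replica this still means a short gap while the pod is rescheduled; the budget only becomes meaningful protection once the service runs two or more replicas.&lt;/p&gt;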



&lt;h2&gt;
  
  
  Key Takeaway
&lt;/h2&gt;

&lt;p&gt;If Cluster Autoscaler on your AWS EKS cluster isn't scaling down nodes during low-traffic periods, check your PDBs before anything else. A PDB with minAvailable: 1 on a single-replica stateless service is effectively telling your cluster: "this node can never be removed."&lt;br&gt;
Reserve PDBs for services that genuinely need them. Your AWS bill will thank you.&lt;/p&gt;

&lt;p&gt;Have you run into unexpected autoscaler behaviour in EKS? Drop a comment — would love to hear other gotchas people have faced.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>aws</category>
      <category>devops</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Why Your Kubernetes Pods Scale Slowly (And How to Fix It)</title>
      <dc:creator>Sakthivel C</dc:creator>
      <pubDate>Fri, 10 Apr 2026 15:40:24 +0000</pubDate>
      <link>https://forem.com/sakthivel_c_98e5dce09e5d9/why-your-kubernetes-pods-scale-slowly-and-how-to-fix-it-4ca9</link>
      <guid>https://forem.com/sakthivel_c_98e5dce09e5d9/why-your-kubernetes-pods-scale-slowly-and-how-to-fix-it-4ca9</guid>
      <description>&lt;h2&gt;
  
  
  Table Of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The Problem&lt;/li&gt;
&lt;li&gt;Why Autoscaling Feels Slow&lt;/li&gt;
&lt;li&gt;The Fix: Placeholder Pods&lt;/li&gt;
&lt;li&gt;How to Set It Up&lt;/li&gt;
&lt;li&gt;What Happens During a Real Spike&lt;/li&gt;
&lt;li&gt;Things to Keep in Mind&lt;/li&gt;
&lt;li&gt;Wrapping Up&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;You've set up the &lt;strong&gt;Horizontal Pod Autoscaler (HPA)&lt;/strong&gt; in your cluster. Your app gets a sudden spike in traffic, and your existing pods start to throttle under the heavy load.&lt;/p&gt;

&lt;p&gt;The HPA kicks in: &lt;em&gt;"Hey, I need 3 more pods to service this traffic!"&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;But instead of scaling instantly, those pods sit in a &lt;strong&gt;Pending&lt;/strong&gt; state for 4–5 minutes. In that window:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requests are dropped.&lt;/li&gt;
&lt;li&gt;Latency spikes.&lt;/li&gt;
&lt;li&gt;Users hit errors, and some of them leave.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why are the pods stuck?
&lt;/h3&gt;

&lt;p&gt;The Kubernetes scheduler can't place your pods because there is no room left on your existing nodes. This triggers the &lt;strong&gt;Cluster Autoscaler (CA)&lt;/strong&gt; to provision a brand new node. &lt;/p&gt;

&lt;p&gt;That process is slow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;VM Provisioning:&lt;/strong&gt; The cloud provider has to spin up a new instance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Node Bootstrapping:&lt;/strong&gt; Joining the node to the cluster and installing dependencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image Pulling:&lt;/strong&gt; Downloading your container images to the new node.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By the time the node is ready, the damage is already done.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Autoscaling Feels Slow
&lt;/h2&gt;

&lt;p&gt;Kubernetes autoscaling operates in two distinct layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HPA (Horizontal Pod Autoscaler):&lt;/strong&gt; Scales pods based on metrics. This is &lt;strong&gt;fast&lt;/strong&gt; (seconds).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CA (Cluster Autoscaler):&lt;/strong&gt; Adds new nodes when pods can't be scheduled. This is &lt;strong&gt;slow&lt;/strong&gt; (3–5 minutes).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;HPA reacts in seconds, but CA reacts in minutes. That gap is where your availability suffers.&lt;/p&gt;
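
&lt;p&gt;For context, the fast layer might be defined along the lines of the sketch below. The names my-app and my-app-hpa, the replica bounds, and the 70% CPU target are assumptions made up for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch of an HPA that scales a Deployment on average CPU utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods when average CPU exceeds 70% of requests
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The HPA only changes the replica count. Whether the new pods actually start quickly depends on free node capacity, which is the slow layer.&lt;/p&gt;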




&lt;h2&gt;
  
  
  The Fix: Placeholder Pods
&lt;/h2&gt;


&lt;div class="crayons-card c-embed"&gt;

  &lt;br&gt;
&lt;strong&gt;The Concept:&lt;/strong&gt; Keep "dummy" pods running on your nodes to reserve space. They do nothing but hold capacity. When a real pod needs that space, Kubernetes evicts the dummy immediately, and your real pod schedules without waiting.&lt;br&gt;

&lt;/div&gt;


&lt;p&gt;The evicted dummy then has nowhere to go, which signals the Cluster Autoscaler to provision a new node. The dummy lands there—restoring the buffer for the next spike.&lt;/p&gt;

&lt;p&gt;This ensures you always have &lt;strong&gt;warm capacity&lt;/strong&gt; ready. The slow provisioning happens in the background, not in your user's critical path.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Set It Up
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Create a Low-Priority Class
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scheduling.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PriorityClass&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;placeholder-pod-priority&lt;/span&gt;
&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;-1&lt;/span&gt;
&lt;span class="na"&gt;globalDefault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Used&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;placeholder&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pods&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;that&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;can&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;be&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;evicted&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;anytime"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



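&lt;p&gt;A negative priority ensures that any real pod, which defaults to priority 0, will always win. When a real pod can't be scheduled, the scheduler preempts the placeholder to make room for it. Don't go much lower than -1, though: by default Cluster Autoscaler treats pods with a priority below -10 as expendable and won't add nodes for them, which would break the buffer refill.&lt;/p&gt;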

&lt;h3&gt;
  
  
  Step 2: Deploy the Placeholder Pods
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;placeholder&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;placeholder&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;placeholder&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;priorityClassName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;placeholder-pod-priority&lt;/span&gt;
      &lt;span class="na"&gt;terminationGracePeriodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;placeholder&lt;/span&gt;
          &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;registry.k8s.io/pause:3.9&lt;/span&gt;
          &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;500m"&lt;/span&gt;
              &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;512Mi"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key details in this manifest:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;pause image:&lt;/strong&gt; A tiny container that does nothing but sleep, so its actual resource usage is virtually zero.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;resources.requests:&lt;/strong&gt; This tells Kubernetes to reserve this specific amount of space. Match this roughly to your app's requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;terminationGracePeriodSeconds: 0:&lt;/strong&gt; Ensures the eviction is instant, handing the spot to your real pod without any shutdown delay.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Verify Your App's Priority
&lt;/h3&gt;

&lt;p&gt;If you haven't explicitly set a priorityClassName on your application deployment, and no PriorityClass in the cluster is marked globalDefault: true, its pods default to priority 0. Since 0 is higher than -1, the scheduler can always preempt the placeholders to make room for your real pods.&lt;/p&gt;
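
&lt;p&gt;One way to check this is to look at the resolved priority on the running pods, for example with kubectl get pod -o yaml. The excerpt below is an illustrative sketch of what to expect, assuming the application pod has no priorityClassName and the cluster has no globalDefault PriorityClass.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Application pod (no priorityClassName set)
spec:
  priority: 0                                  # resolved default priority
---
# Placeholder pod
spec:
  priorityClassName: placeholder-pod-priority
  priority: -1                                 # resolved from the PriorityClass value
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the application pods show a priority lower than or equal to the placeholder's, preemption will not happen and the buffer gives no benefit.&lt;/p&gt;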




&lt;h2&gt;
  
  
  What Happens During a Real Spike
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Traffic increases → HPA requests 3 new pods.&lt;/li&gt;
&lt;li&gt;Scheduler looks for space → finds it (placeholder pods are holding it).&lt;/li&gt;
&lt;li&gt;Placeholder pods get evicted instantly → real pods schedule in seconds.&lt;/li&gt;
&lt;li&gt;Evicted placeholders are now in Pending state.&lt;/li&gt;
&lt;li&gt;Cluster Autoscaler sees Pending pods → provisions a new node.&lt;/li&gt;
&lt;li&gt;Placeholders land on the new node → buffer is restored for next time.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Things to Keep in Mind
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost Trade-off:&lt;/strong&gt; Placeholder pods reserve real node capacity, meaning you are essentially paying for "warm" standby nodes.&lt;/li&gt;
&lt;li&gt;
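&lt;strong&gt;Namespace Scope:&lt;/strong&gt; Priority and preemption work across namespaces, so what matters is that placeholders can schedule onto the same nodes as your workloads. You can keep them in the same namespace as your workloads, or run separate placeholder deployments per namespace sized by criticality.&lt;/li&gt;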
&lt;li&gt;
&lt;strong&gt;Works Best with CA:&lt;/strong&gt; This pattern targets the node provisioning delay specifically. If your nodes already have massive amounts of spare capacity, you don't need this.&lt;/li&gt;
&lt;/ul&gt;
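
&lt;p&gt;If your cluster has several node groups, the placeholders only help when they reserve space on the nodes your application can actually use. A minimal sketch, assuming a hypothetical node label node-group: app-workers that your application pods also select:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Fragment of the placeholder Deployment's pod template
spec:
  template:
    spec:
      priorityClassName: placeholder-pod-priority
      nodeSelector:
        node-group: app-workers    # hypothetical label shared with the application's nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The same applies to taints: if the application's nodes are tainted, the placeholders need the matching tolerations.&lt;/p&gt;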




&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;Cluster Autoscaler is not broken—it's just slow by design because provisioning VMs takes time. Placeholder pods let you work with that constraint. Your HPA scales instantly into pre-warmed capacity, and the slow provisioning happens in the background where it belongs.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>aws</category>
      <category>cloud</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
