<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Anupam Kushwaha</title>
    <description>The latest articles on Forem by Anupam Kushwaha (@anupam_kushwaha_85).</description>
    <link>https://forem.com/anupam_kushwaha_85</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3758282%2Fbdc81381-ebe4-4df9-83a2-1c63d64c7025.jpg</url>
      <title>Forem: Anupam Kushwaha</title>
      <link>https://forem.com/anupam_kushwaha_85</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/anupam_kushwaha_85"/>
    <language>en</language>
    <item>
      <title>Why My Microservices Broke on OpenShift — And How a Hidden Kubernetes Quota Nearly Cost Me Days</title>
      <dc:creator>Anupam Kushwaha</dc:creator>
      <pubDate>Wed, 06 May 2026 21:17:50 +0000</pubDate>
      <link>https://forem.com/anupam_kushwaha_85/why-my-microservices-broke-on-openshift-and-how-a-hidden-kubernetes-quota-nearly-cost-me-days-22g5</link>
      <guid>https://forem.com/anupam_kushwaha_85/why-my-microservices-broke-on-openshift-and-how-a-hidden-kubernetes-quota-nearly-cost-me-days-22g5</guid>
      <description>&lt;p&gt;Deploying four Spring Boot microservices to OpenShift Developer Sandbox, I hit two silent failures — a ReplicaSet quota exhaustion and a gateway routing to localhost. Here is the full debugging story.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;If you're deploying microservices on OpenShift's free Developer Sandbox (or any resource-constrained Kubernetes cluster), this post might save you hours of debugging.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;I built a production-grade mobile application backed by a &lt;strong&gt;microservices architecture&lt;/strong&gt; — four Spring Boot services deployed to &lt;strong&gt;Red Hat OpenShift Developer Sandbox&lt;/strong&gt; via a fully automated &lt;strong&gt;GitHub Actions CI/CD pipeline&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flutter&lt;/strong&gt; mobile frontend (automated APK releases)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Gateway&lt;/strong&gt; (Spring Cloud Gateway) — single entry point for all client requests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth Service&lt;/strong&gt; — handles registration, login, OTP verification, JWT tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User Service&lt;/strong&gt; — user profiles, preferences, settings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Core Service&lt;/strong&gt; — main business logic, AI features, data processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MongoDB Atlas&lt;/strong&gt; — separate databases per service&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Container Registry (GHCR)&lt;/strong&gt; — Docker image hosting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenShift Developer Sandbox&lt;/strong&gt; — free-tier Kubernetes hosting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything was containerized, secrets-managed, health-probed, and CI/CD automated. It worked flawlessly on localhost. Then I deployed it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It broke. For two days.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;Here is how the system is wired:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F482vv1yxora0dt0lpc9x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F482vv1yxora0dt0lpc9x.png" alt="Architecture" width="800" height="597"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The mobile app hits the &lt;strong&gt;API Gateway&lt;/strong&gt; via an OpenShift Route (HTTPS). The gateway reads the URL path and forwards it to the correct internal microservice via &lt;strong&gt;Kubernetes Service DNS names&lt;/strong&gt; (e.g., &lt;code&gt;http://app-auth-svc:8080&lt;/code&gt;).&lt;/p&gt;
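
&lt;p&gt;For context, each of those DNS names is just a Kubernetes Service selecting the pods behind it. A minimal sketch of what backs &lt;code&gt;app-auth-svc&lt;/code&gt; (the label and ports are assumptions, not the exact manifest):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: app-auth-svc     # becomes the in-namespace DNS name app-auth-svc
spec:
  selector:
    app: app-auth        # matches the auth service pods
  ports:
    - port: 8080         # the port the gateway calls
      targetPort: 8080   # the container port inside the pod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;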

&lt;h3&gt;
  
  
  The CI/CD Pipeline
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fxbefubaqog7vslfb3c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fxbefubaqog7vslfb3c.png" alt="CICD flow" width="800" height="91"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every push to &lt;code&gt;main&lt;/code&gt; triggers a GitHub Actions workflow that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Builds all 4 services with Maven&lt;/li&gt;
&lt;li&gt;Creates Docker images and pushes to GHCR&lt;/li&gt;
&lt;li&gt;Logs into OpenShift via CLI (&lt;code&gt;oc login&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Creates/updates Kubernetes secrets (MongoDB URIs, JWT secret, API keys)&lt;/li&gt;
&lt;li&gt;Applies all deployment manifests&lt;/li&gt;
&lt;li&gt;Runs &lt;code&gt;oc rollout restart&lt;/code&gt; on each deployment&lt;/li&gt;
&lt;li&gt;Waits for health checks to pass&lt;/li&gt;
&lt;/ol&gt;
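
&lt;p&gt;The OpenShift half of that workflow boils down to a handful of &lt;code&gt;oc&lt;/code&gt; commands. A minimal sketch, with placeholder names (&lt;code&gt;k8s/&lt;/code&gt;, the server and token variables) standing in for the real ones:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Log in with a token stored as a GitHub Actions secret
oc login "$OPENSHIFT_SERVER" --token="$OPENSHIFT_TOKEN"

# Apply every manifest, then force new pods to pick up the fresh images
oc apply -f k8s/
for dep in app-auth app-user app-core app-gateway; do
  oc rollout restart "deployment/$dep"
  oc rollout status "deployment/$dep" --timeout=300s   # fails the job if pods never go ready
done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;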

&lt;p&gt;Sounds bulletproof, right? Here is where it fell apart.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Symptom
&lt;/h2&gt;

&lt;p&gt;After deploying, the app showed one message on every action — registration, login, anything:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Something went wrong. Please try again later."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The classic generic error that tells you absolutely nothing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Bug #1: The Silent Quota Killer
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What I Saw
&lt;/h3&gt;

&lt;p&gt;The CI/CD pipeline failed with this in the logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AuthServiceApplication - Started AuthServiceApplication in 36.106 seconds
...
Error from server (BadRequest): previous terminated container "app-auth" 
in pod "app-auth-xxxxx-xxxxx" not found
Error: Process completed with exit code 1.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Confusing, right? The auth service &lt;strong&gt;clearly started successfully&lt;/strong&gt; (36 seconds, listening on port 8080). But the deployment was marked as &lt;strong&gt;failed&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Digging Deeper
&lt;/h3&gt;

&lt;p&gt;Looking at the pod events, I found:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Readiness probe failed: Get "http://10.x.x.x:8080/actuator/health": 
  connection refused
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And buried further down, the &lt;strong&gt;real&lt;/strong&gt; error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;replicasets.apps is forbidden: exceeded quota
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What Actually Happened
&lt;/h3&gt;

&lt;p&gt;Here is what most people do not know about Kubernetes deployments:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Every time you run &lt;code&gt;oc rollout restart&lt;/code&gt; (or &lt;code&gt;kubectl rollout restart&lt;/code&gt;), Kubernetes does not just restart your pods.&lt;/strong&gt; It creates an entirely &lt;strong&gt;new ReplicaSet&lt;/strong&gt; while keeping the old ones around as rollback history.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzn5208485fjpgrf6pqa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzn5208485fjpgrf6pqa.png" alt="quota" width="333" height="699"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By default, Kubernetes keeps the &lt;strong&gt;last 10 old ReplicaSets&lt;/strong&gt; per deployment, on top of the active one (controlled by &lt;code&gt;revisionHistoryLimit&lt;/code&gt;, which defaults to &lt;code&gt;10&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Now multiply that by 4 microservices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;4 services × 10 ReplicaSets = 40 ReplicaSets&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The OpenShift Developer Sandbox (free tier) has a &lt;strong&gt;strict quota&lt;/strong&gt; on the total number of ReplicaSets allowed in your namespace. After just a few CI/CD runs, I silently hit that ceiling.&lt;/p&gt;
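
&lt;p&gt;Two quick commands show where you stand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# How many ReplicaSets have piled up in the namespace?
oc get rs --no-headers | wc -l

# What does the namespace quota actually allow?
oc describe quota
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;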

&lt;p&gt;When the quota is exceeded:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Kubernetes &lt;strong&gt;cannot create new ReplicaSets&lt;/strong&gt; for the rollout&lt;/li&gt;
&lt;li&gt;No new ReplicaSet = &lt;strong&gt;no new pods&lt;/strong&gt; get scheduled&lt;/li&gt;
&lt;li&gt;No pods = readiness probe has &lt;strong&gt;nothing to connect to&lt;/strong&gt; → &lt;code&gt;connection refused&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Rollout waits... and eventually times out → &lt;code&gt;context deadline exceeded&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Pipeline fails with &lt;code&gt;exit code 1&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;The app code was perfectly fine. Kubernetes just silently refused to create pods.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Fix
&lt;/h3&gt;

&lt;p&gt;There are two approaches, and I recommend using &lt;strong&gt;both&lt;/strong&gt;:&lt;/p&gt;

&lt;h4&gt;
  
  
  Fix A: Set &lt;code&gt;revisionHistoryLimit&lt;/code&gt; in Your Deployments (Best Practice)
&lt;/h4&gt;

&lt;p&gt;Add &lt;code&gt;revisionHistoryLimit: 1&lt;/code&gt; to &lt;strong&gt;every&lt;/strong&gt; deployment manifest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-service&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;revisionHistoryLimit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;    &lt;span class="c1"&gt;# Only keep 1 old ReplicaSet&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-service&lt;/span&gt;
  &lt;span class="c1"&gt;# ... rest of your spec&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why &lt;code&gt;1&lt;/code&gt; and not &lt;code&gt;0&lt;/code&gt;?&lt;/strong&gt; Setting it to &lt;code&gt;0&lt;/code&gt; means Kubernetes keeps &lt;strong&gt;zero rollback history&lt;/strong&gt;. If a bad deployment goes out, you cannot do &lt;code&gt;oc rollout undo&lt;/code&gt; to instantly revert. Keeping &lt;code&gt;1&lt;/code&gt; gives you exactly one rollback point — enough for safety without wasting quota. This is the &lt;strong&gt;best practice&lt;/strong&gt; because if anything goes wrong with a new deployment, you still have an instant rollback option.&lt;/p&gt;
&lt;/blockquote&gt;
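
&lt;p&gt;That one retained revision is what makes the instant revert possible:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Inspect the available revisions, then roll back to the previous ReplicaSet
oc rollout history deployment/app-auth
oc rollout undo deployment/app-auth
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;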

&lt;p&gt;With 4 services at &lt;code&gt;revisionHistoryLimit: 1&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;4 services × (1 current + 1 old) = 8 ReplicaSets&lt;/strong&gt; — well within any quota.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Fix B: Add Cleanup to Your Deploy Script (Recovery Safety Net)
&lt;/h4&gt;

&lt;p&gt;Add this &lt;strong&gt;before&lt;/strong&gt; the rollout restart commands in your deployment script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clean up old ReplicaSets to avoid quota issues on free-tier clusters&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Cleaning up old ReplicaSets..."&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;dep &lt;span class="k"&gt;in &lt;/span&gt;app-auth app-user app-core app-gateway&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="c"&gt;# Get all ReplicaSets for this deployment, sorted oldest first&lt;/span&gt;
  &lt;span class="c"&gt;# Delete all except the most recent one&lt;/span&gt;
  &lt;span class="nv"&gt;OLD_RS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;oc get rs &lt;span class="nt"&gt;-l&lt;/span&gt; &lt;span class="s2"&gt;"app=&lt;/span&gt;&lt;span class="nv"&gt;$dep&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;.metadata.creationTimestamp &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-o&lt;/span&gt; name 2&amp;gt;/dev/null | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="nt"&gt;-1&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$OLD_RS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$OLD_RS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | xargs oc delete
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  Cleaned old ReplicaSets for &lt;/span&gt;&lt;span class="nv"&gt;$dep&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;fi
done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Fix B is useful for &lt;strong&gt;one-time recovery&lt;/strong&gt; when you have already hit the quota, or as a safety net alongside Fix A. But &lt;strong&gt;Fix A is the real solution&lt;/strong&gt; — it is declarative, permanent, and prevents the problem from ever occurring again.&lt;/p&gt;
&lt;/blockquote&gt;
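
&lt;p&gt;If the quota has already locked you out and you just need to recover once, a blunter one-off also works. Here is a sketch that deletes every ReplicaSet currently scaled to zero (i.e., retired rollout history); review the output of &lt;code&gt;oc get rs&lt;/code&gt; before running it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Column 2 of `oc get rs` is DESIRED; 0 desired replicas = old history
oc get rs --no-headers | awk '$2 == 0 {print $1}' | xargs -r oc delete rs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;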




&lt;h2&gt;
  
  
  Bug #2: The Gateway That Routed to Itself
&lt;/h2&gt;

&lt;p&gt;Even after fixing the quota issue and getting all pods running, the app &lt;strong&gt;still&lt;/strong&gt; did not work. Registration still failed.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Clue
&lt;/h3&gt;

&lt;p&gt;I hit the gateway health endpoint directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://my-gateway-route.apps.openshiftapps.com/actuator/health
&lt;span class="c"&gt;# → 200 OK ✅&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gateway was healthy. But hitting an actual API route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://my-gateway-route.apps.openshiftapps.com/api/auth/signup
&lt;span class="c"&gt;# → 502 Bad Gateway ❌&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;My API Gateway's &lt;code&gt;application.yml&lt;/code&gt; had &lt;strong&gt;hardcoded localhost URLs&lt;/strong&gt; for routing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spring&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;cloud&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;routes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;auth-service&lt;/span&gt;
          &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:7071&lt;/span&gt;        &lt;span class="c1"&gt;# Works on my laptop&lt;/span&gt;
          &lt;span class="na"&gt;predicates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Path=/api/auth/**&lt;/span&gt;

        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user-service&lt;/span&gt;
          &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:7072&lt;/span&gt;        &lt;span class="c1"&gt;# Works on my laptop&lt;/span&gt;
          &lt;span class="na"&gt;predicates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Path=/api/users/**&lt;/span&gt;

        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;core-service&lt;/span&gt;
          &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:7073&lt;/span&gt;        &lt;span class="c1"&gt;# Works on my laptop&lt;/span&gt;
          &lt;span class="na"&gt;predicates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Path=/api/core/**&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;On my machine&lt;/strong&gt;, all 4 services run on the same host (localhost) on different ports. It works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;On Kubernetes&lt;/strong&gt;, each service runs in a &lt;strong&gt;separate pod&lt;/strong&gt; with its own network namespace. &lt;code&gt;localhost:7071&lt;/code&gt; inside the gateway pod is just... the gateway pod itself. There is nothing listening on port 7071 there.&lt;/p&gt;
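
&lt;p&gt;You can see this from inside the gateway pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# localhost inside the gateway pod is the gateway itself: nothing on 7071
oc exec &amp;lt;gateway-pod&amp;gt; -- curl -s http://localhost:7071/actuator/health
# → connection refused ❌

# The Service DNS name is what actually reaches the auth pod
oc exec &amp;lt;gateway-pod&amp;gt; -- curl -s http://app-auth-svc:8080/actuator/health
# → {"status":"UP"} ✅
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;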

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frd0c48av32nhskn75b3a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frd0c48av32nhskn75b3a.png" alt="gateway self routing" width="697" height="702"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Irony
&lt;/h3&gt;

&lt;p&gt;My deploy script &lt;strong&gt;already created&lt;/strong&gt; the correct internal URLs as Kubernetes secrets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;oc create secret generic app-secrets &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--from-literal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;AUTH_SERVICE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://app-auth-svc:8080"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--from-literal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;USER_SERVICE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://app-user-svc:8080"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--from-literal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;CORE_SERVICE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://app-core-svc:8080"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And my other services &lt;strong&gt;correctly used them&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# core-service application.yml — Correct&lt;/span&gt;
&lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;user-service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;base-url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${USER_SERVICE_URL:http://localhost:7072}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Only the gateway was missed. The env vars were injected into the pod but never referenced in the routing config.&lt;/p&gt;
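
&lt;p&gt;For completeness, this is roughly how that injection side looks, assuming the deployment pulls the secret in via &lt;code&gt;secretKeyRef&lt;/code&gt; (the exact manifest may differ):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# gateway deployment (excerpt)
containers:
  - name: app-gateway
    env:
      - name: AUTH_SERVICE_URL
        valueFrom:
          secretKeyRef:
            name: app-secrets        # the secret created by the deploy script
            key: AUTH_SERVICE_URL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;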

&lt;h3&gt;
  
  
  The Fix
&lt;/h3&gt;

&lt;p&gt;Replace hardcoded URLs with environment variable references (with localhost as the default for local development):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spring&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;cloud&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;routes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;auth-service&lt;/span&gt;
          &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${AUTH_SERVICE_URL:http://localhost:7071}&lt;/span&gt;
          &lt;span class="na"&gt;predicates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Path=/api/auth/**&lt;/span&gt;

        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user-service&lt;/span&gt;
          &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${USER_SERVICE_URL:http://localhost:7072}&lt;/span&gt;
          &lt;span class="na"&gt;predicates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Path=/api/users/**&lt;/span&gt;

        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;core-service&lt;/span&gt;
          &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${CORE_SERVICE_URL:http://localhost:7073}&lt;/span&gt;
          &lt;span class="na"&gt;predicates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Path=/api/core/**&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;${ENV_VAR:default}&lt;/code&gt; syntax means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;On Kubernetes&lt;/strong&gt;: uses the injected secret value → &lt;code&gt;http://app-auth-svc:8080&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On localhost&lt;/strong&gt;: falls back to the default → &lt;code&gt;http://localhost:7071&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One config, works everywhere.&lt;/p&gt;
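
&lt;p&gt;A quick way to sanity-check the fallback locally (the jar name is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# No env vars set → routes fall back to the localhost defaults
java -jar app-gateway.jar

# Env var set (as Kubernetes would inject it) → routes use the Service DNS name
AUTH_SERVICE_URL=http://app-auth-svc:8080 java -jar app-gateway.jar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;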




&lt;h2&gt;
  
  
  The Complete Debugging Checklist
&lt;/h2&gt;

&lt;p&gt;If your microservices work locally but fail on OpenShift/Kubernetes, run through this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Check&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Are pods actually running?&lt;/td&gt;
&lt;td&gt;&lt;code&gt;oc get pods&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Are readiness probes passing?&lt;/td&gt;
&lt;td&gt;&lt;code&gt;oc describe pod &amp;lt;pod-name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Can the pod start at all?&lt;/td&gt;
&lt;td&gt;&lt;code&gt;oc logs &amp;lt;pod-name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Is there a quota issue?&lt;/td&gt;
&lt;td&gt;&lt;code&gt;oc describe quota&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;How many ReplicaSets exist?&lt;/td&gt;
&lt;td&gt;&lt;code&gt;oc get rs&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Is the gateway routing correctly?&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl &amp;lt;gateway-url&amp;gt;/actuator/health&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Are env vars injected properly?&lt;/td&gt;
&lt;td&gt;&lt;code&gt;oc exec &amp;lt;pod-name&amp;gt; -- env&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Is the service DNS resolving?&lt;/td&gt;
&lt;td&gt;&lt;code&gt;oc exec &amp;lt;gateway-pod&amp;gt; -- curl http://app-auth-svc:8080/actuator/health&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. "It works on my machine" extends to Kubernetes
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;localhost&lt;/code&gt; routing is the microservice equivalent of "works on my machine." Always use environment variables with sensible defaults so the same config works in both environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Kubernetes fails silently in ways you do not expect
&lt;/h3&gt;

&lt;p&gt;The ReplicaSet quota error did not crash my app. It did not log a warning. It just silently prevented new pods from being created, and the symptoms (readiness probe failure, connection refused) pointed me in completely the wrong direction.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Free-tier clusters have hidden constraints
&lt;/h3&gt;

&lt;p&gt;OpenShift Developer Sandbox, Google Cloud free tier, Azure free tier — they all have resource quotas that do not exist in your local Minikube or Docker Desktop Kubernetes. Always run &lt;code&gt;oc describe quota&lt;/code&gt; (or &lt;code&gt;kubectl describe quota&lt;/code&gt;) in your namespace to know your limits.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Set &lt;code&gt;revisionHistoryLimit&lt;/code&gt; from day one
&lt;/h3&gt;

&lt;p&gt;Do not wait until you hit the quota. Add &lt;code&gt;revisionHistoryLimit: 1&lt;/code&gt; to every deployment manifest as a standard practice. It keeps your cluster clean, stays within quotas, and still gives you one rollback point for safety.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. CI/CD amplifies configuration bugs
&lt;/h3&gt;

&lt;p&gt;When you deploy manually, you might catch issues because you are watching the logs. When CI/CD deploys automatically on every push, a configuration bug silently breaks production while you are still writing code, thinking everything is fine.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bug&lt;/th&gt;
&lt;th&gt;Root Cause&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pods not starting&lt;/td&gt;
&lt;td&gt;ReplicaSet quota exceeded from accumulated rollout history&lt;/td&gt;
&lt;td&gt;Set &lt;code&gt;revisionHistoryLimit: 1&lt;/code&gt; in deployment manifests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gateway 502&lt;/td&gt;
&lt;td&gt;Route URIs hardcoded to &lt;code&gt;localhost&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Use the &lt;code&gt;${ENV_VAR:default}&lt;/code&gt; pattern in the gateway config&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you are deploying microservices on a free-tier Kubernetes cluster and your deployments mysteriously stop working after a few CI/CD runs — &lt;strong&gt;check your ReplicaSet count&lt;/strong&gt;. That silent quota limit is probably the culprit.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have you hit weird Kubernetes issues on free-tier clusters? I would love to hear about them — connect with me on &lt;a href="https://linkedin.com/in/anupamkushwaha85" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or check out more on &lt;a href="https://anupamkushwaha.me" rel="noopener noreferrer"&gt;anupamkushwaha.me&lt;/a&gt;. The full blog post is &lt;a href="https://anupamkushwaha.me/blog/openshift-microservices-quota-debugging" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>softwareengineering</category>
      <category>microservices</category>
    </item>
    <item>
      <title>How I cut AI calls by 95% without losing quality</title>
      <dc:creator>Anupam Kushwaha</dc:creator>
      <pubDate>Tue, 05 May 2026 14:03:08 +0000</pubDate>
      <link>https://forem.com/anupam_kushwaha_85/how-i-cut-ai-calls-by-95-without-losing-quality-28m8</link>
      <guid>https://forem.com/anupam_kushwaha_85/how-i-cut-ai-calls-by-95-without-losing-quality-28m8</guid>
      <description>&lt;h2&gt;
  
  
  The Hidden Cost of Calling AI Too Early
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;I stopped calling AI on every request — and everything got better.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;In one of my projects, I was generating AI-based insights from user activity.&lt;/p&gt;

&lt;p&gt;The initial design was simple:&lt;/p&gt;

&lt;p&gt;Every request for today’s insight → call the AI model → return a fresh response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /api/insights/today
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At first, this felt clean and correct.&lt;/p&gt;

&lt;p&gt;But in practice, it created serious problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;429 rate limit errors within hours&lt;/li&gt;
&lt;li&gt;Daily quota exhausted before noon&lt;/li&gt;
&lt;li&gt;Random failures affecting users&lt;/li&gt;
&lt;li&gt;Costs scaling linearly with traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system was working — but it wasn’t sustainable.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Issue
&lt;/h2&gt;

&lt;p&gt;The problem wasn’t the AI provider.&lt;/p&gt;

&lt;p&gt;It was the &lt;strong&gt;trigger model&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The system never asked basic questions before making an expensive call:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Has anything actually changed?&lt;/li&gt;
&lt;li&gt;Did I already generate a response recently?&lt;/li&gt;
&lt;li&gt;Is the user even active today?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without these checks, every request was treated as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Generate a new insight now.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That assumption was the real bug.&lt;/p&gt;




&lt;h2&gt;
  
  
  The New Approach
&lt;/h2&gt;

&lt;p&gt;Instead of adding caching on top, I redesigned the system into an &lt;strong&gt;event-driven pipeline&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;AI became the &lt;strong&gt;last step&lt;/strong&gt;, not the default.&lt;/p&gt;




&lt;h2&gt;
  
  
  System Flow
&lt;/h2&gt;

&lt;p&gt;Here’s the simplified request flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart TD
    A[Request for today's insight] --&amp;gt; B{Activity today?}
    B -- No --&amp;gt; C[Reuse latest insight or fallback]
    B -- Yes --&amp;gt; D{Meaningful change?}
    D -- No --&amp;gt; C
    D -- Yes --&amp;gt; E{Cooldown passed?}
    E -- No --&amp;gt; C
    E -- Yes --&amp;gt; F{Daily cap reached?}
    F -- Yes --&amp;gt; C
    F -- No --&amp;gt; G{Global AI limit reached?}
    G -- Yes --&amp;gt; H[Use deterministic fallback]
    G -- No --&amp;gt; I[Call AI model]
    I --&amp;gt; J[Persist insight]
    H --&amp;gt; J
    C --&amp;gt; J
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Most requests now end at a simple database read — not an AI call.&lt;/p&gt;
&lt;/blockquote&gt;










&lt;h2&gt;
  
  
  The Five-Layer Redesign
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Activity Gate
&lt;/h3&gt;

&lt;p&gt;Start with the cheapest check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;hasActivity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;activityService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hasActivityToday&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;hasActivity&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;getLatestOrFallback&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If nothing happened → don’t call AI.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Event-Driven Triggers
&lt;/h3&gt;

&lt;p&gt;AI should only run when something meaningful changes.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;user updates intent&lt;/li&gt;
&lt;li&gt;significant behavior change&lt;/li&gt;
&lt;li&gt;threshold crossed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No change → reuse previous insight.&lt;/p&gt;
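
&lt;p&gt;A minimal sketch of that gate, assuming a simple numeric activity score and the &lt;code&gt;activity-delta&lt;/code&gt; threshold from the config shown later (the names are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;// Compare today's activity against what the last insight was based on
int delta = Math.abs(todayScore - lastInsight.activityScore());

if (delta &amp;lt; activityDeltaThreshold) {   // e.g. insight.activity-delta: 30
    return getLatestOrFallback(userId, today);   // nothing meaningful changed
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;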




&lt;h3&gt;
  
  
  3. Cooldown Window
&lt;/h3&gt;

&lt;p&gt;Avoid frequent re-generation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;Duration cooldown = Duration.ofMinutes(30);
// lastGeneratedAt: when the previous insight was generated (illustrative name)
Duration elapsed = Duration.between(lastGeneratedAt, Instant.now());

// Duration has no &amp;lt; operator in Java; compare with compareTo
if (elapsed.compareTo(cooldown) &amp;lt; 0) {
    return getLatestOrFallback(userId, today);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prevents unnecessary repeated calls.&lt;/p&gt;




&lt;h3&gt;
  
  
  4. Per-User Daily Cap
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;todayCount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;getLatestOrFallback&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even active users shouldn’t trigger unlimited AI calls.&lt;/p&gt;




&lt;h3&gt;
  
  
  5. Global AI Guard
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dailyAiCalls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;useFallback&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This acts as a system-wide circuit breaker.&lt;/p&gt;
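
&lt;p&gt;One caveat: a &lt;code&gt;get()&lt;/code&gt; check followed by a separate increment can race under concurrent requests. A sketch of an atomic variant, assuming &lt;code&gt;dailyAiCalls&lt;/code&gt; is an &lt;code&gt;AtomicInteger&lt;/code&gt; that is reset daily:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;// Reserve a slot atomically; release it if we went over budget
if (dailyAiCalls.incrementAndGet() &amp;gt; maxAiCallsPerDay) {
    dailyAiCalls.decrementAndGet();
    useFallback = true;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;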




&lt;h2&gt;
  
  
  Configuration
&lt;/h2&gt;

&lt;p&gt;All thresholds are configurable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;insight&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;activity-delta&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
  &lt;span class="na"&gt;cooldown-minutes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
  &lt;span class="na"&gt;daily-cap-per-user&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="na"&gt;max-ai-calls-per-day&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;
  &lt;span class="na"&gt;freshness-window-hours&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This allows tuning without redeploying code.&lt;/p&gt;
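
&lt;p&gt;In Spring Boot, one natural way to bind that block is a typed properties record. A sketch (the class name is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;// Binds the insight.* properties shown above into one typed object.
// Register it via @ConfigurationPropertiesScan (or @EnableConfigurationProperties).
@ConfigurationProperties(prefix = "insight")
public record InsightProperties(
        int activityDelta,
        int cooldownMinutes,
        int dailyCapPerUser,
        int maxAiCallsPerDay,
        int freshnessWindowHours) {
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;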




&lt;h2&gt;
  
  
  What Changed
&lt;/h2&gt;

&lt;p&gt;After this redesign:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI calls dropped from ~100/day → ~5–10/day&lt;/li&gt;
&lt;li&gt;Rate limit errors disappeared&lt;/li&gt;
&lt;li&gt;Most requests became fast database reads&lt;/li&gt;
&lt;li&gt;Free-tier usage became sustainable&lt;/li&gt;
&lt;li&gt;System behavior became more predictable&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Engineering Takeaway
&lt;/h2&gt;

&lt;p&gt;AI should be the &lt;strong&gt;exception&lt;/strong&gt;, not the rule.&lt;/p&gt;

&lt;p&gt;A well-designed backend should first decide:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Is this request even worth sending to the model?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That decision layer — gating, triggers, cooldowns — is where the real engineering happens.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;If most requests can be handled using deterministic logic or cached state:&lt;/p&gt;

&lt;p&gt;Do that first.&lt;/p&gt;

&lt;p&gt;Use AI only when it actually adds value.&lt;/p&gt;

&lt;p&gt;That single shift can make your system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cheaper&lt;/li&gt;
&lt;li&gt;faster&lt;/li&gt;
&lt;li&gt;more reliable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;—and much easier to scale.&lt;/p&gt;

&lt;p&gt;Blog link: &lt;a href="https://anupamkushwaha.me/blog/stopped-calling-ai-on-every-request" rel="noopener noreferrer"&gt;https://anupamkushwaha.me/blog/stopped-calling-ai-on-every-request&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>backend</category>
      <category>buildinpublic</category>
      <category>techtalks</category>
    </item>
    <item>
      <title>From Scaffolding to Debugging: Spring Boot with GitHub Copilot CLI</title>
      <dc:creator>Anupam Kushwaha</dc:creator>
      <pubDate>Mon, 09 Feb 2026 18:33:58 +0000</pubDate>
      <link>https://forem.com/anupam_kushwaha_85/from-scaffolding-to-debugging-spring-boot-with-github-copilot-cli-3e3l</link>
      <guid>https://forem.com/anupam_kushwaha_85/from-scaffolding-to-debugging-spring-boot-with-github-copilot-cli-3e3l</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-01-21"&gt;GitHub Copilot CLI Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I built a production-ready Spring Boot REST API secured with JWT authentication.&lt;/p&gt;

&lt;p&gt;The application includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User registration and authentication&lt;/li&gt;
&lt;li&gt;Stateless JWT-based authorization&lt;/li&gt;
&lt;li&gt;Task management (CRUD operations)&lt;/li&gt;
&lt;li&gt;Layered architecture (controller, service, repository, entity, DTO)&lt;/li&gt;
&lt;li&gt;Multi-database support:

&lt;ul&gt;
&lt;li&gt;H2 for local development&lt;/li&gt;
&lt;li&gt;PostgreSQL for production readiness&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;This project explores whether GitHub Copilot CLI can act as a real engineering assistant — not just a code generator — while building a secure backend following real-world Spring Boot best practices.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;🔗 &lt;strong&gt;Repository:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://github.com/anupamkushwaha85/copilot-cli-springboot-jwt" rel="noopener noreferrer"&gt;https://github.com/anupamkushwaha85/copilot-cli-springboot-jwt&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Screenshots
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Project scaffolding with GitHub Copilot CLI&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2zb5c8gjcp8kfwt9i8ar.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2zb5c8gjcp8kfwt9i8ar.png" alt="GitHub Copilot CLI generating the complete Spring Boot project structure, including entities, repositories, services, controllers, and security configuration." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JWT security and service layer generation&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn2t86g6xkg7dl2asf7e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn2t86g6xkg7dl2asf7e.png" alt="Copilot CLI generating JWT authentication, Spring Security filters, service layer logic, and REST controllers using iterative prompts." width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debugging Spring Security (403 Forbidden issue)&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0wjwrvxixwgyn8d6d0y5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0wjwrvxixwgyn8d6d0y5.png" alt="Initial 403 Forbidden error when testing authentication endpoints, highlighting a real Spring Security configuration issue." width="800" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API testing with Postman&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi02fg5rinwy5ffq3xix4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi02fg5rinwy5ffq3xix4.png" alt="Successful user registration and authentication using JWT tokens." width="800" height="667"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Protected endpoints in action&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3c71lzpyoblfvxib529b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3c71lzpyoblfvxib529b.png" alt="JWT-secured task APIs accessed with authenticated requests." width="800" height="673"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  My Experience with GitHub Copilot CLI
&lt;/h2&gt;

&lt;p&gt;GitHub Copilot CLI was used throughout the entire development lifecycle — not just for initial scaffolding.&lt;/p&gt;

&lt;p&gt;Copilot CLI assisted with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generating the complete project structure and Maven configuration&lt;/li&gt;
&lt;li&gt;Creating entities, repositories, services, controllers, and DTOs&lt;/li&gt;
&lt;li&gt;Implementing JWT authentication and Spring Security configuration&lt;/li&gt;
&lt;li&gt;Adapting the application for H2 (development) and PostgreSQL (production)&lt;/li&gt;
&lt;li&gt;Generating documentation alongside the code&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Real Debugging with Copilot CLI
&lt;/h3&gt;

&lt;p&gt;One of the most valuable moments was debugging a &lt;code&gt;403 Forbidden&lt;/code&gt; error on the authentication endpoints.&lt;/p&gt;

&lt;p&gt;Instead of trial-and-error:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Copilot CLI analyzed the Spring Security configuration&lt;/li&gt;
&lt;li&gt;Identified default form login and HTTP basic auth interference&lt;/li&gt;
&lt;li&gt;Suggested disabling them explicitly&lt;/li&gt;
&lt;li&gt;Provided the exact fix in &lt;code&gt;SecurityConfig&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
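
&lt;p&gt;For readers hitting the same 403, the shape of that fix, sketched with the Spring Security lambda DSL (illustrative, not the repository's exact code):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;// SecurityConfig excerpt: disable the defaults that intercepted the
// auth endpoints, and leave /api/auth/** open for registration/login
http
    .csrf(csrf -&amp;gt; csrf.disable())
    .formLogin(form -&amp;gt; form.disable())
    .httpBasic(basic -&amp;gt; basic.disable())
    .authorizeHttpRequests(auth -&amp;gt; auth
        .requestMatchers("/api/auth/**").permitAll()
        .anyRequest().authenticated());
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;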

&lt;p&gt;This experience showed that Copilot CLI is effective not only for writing code, but also for systematic debugging and root-cause analysis. It significantly reduced setup friction and allowed me to focus on architecture and correctness instead of boilerplate.&lt;/p&gt;

&lt;h4&gt;
  
  
  Outcome:
&lt;/h4&gt;

&lt;p&gt;A production-ready backend built end-to-end with GitHub Copilot CLI, demonstrating real-world usage beyond code generation.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
      <category>cli</category>
      <category>githubcopilot</category>
    </item>
  </channel>
</rss>
