<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Akshat Jain</title>
    <description>The latest articles on Forem by Akshat Jain (@akshatjme).</description>
    <link>https://forem.com/akshatjme</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3792066%2Fbb56929c-3411-4d44-9737-f4b6d57a235c.png</url>
      <title>Forem: Akshat Jain</title>
      <link>https://forem.com/akshatjme</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/akshatjme"/>
    <language>en</language>
    <item>
      <title>How I Built an LLM Service That Converts Natural Language into Database Events</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Tue, 05 May 2026 16:08:45 +0000</pubDate>
      <link>https://forem.com/akshatjme/how-i-built-an-llm-service-that-converts-natural-language-into-database-events-cka</link>
      <guid>https://forem.com/akshatjme/how-i-built-an-llm-service-that-converts-natural-language-into-database-events-cka</guid>
      <description>&lt;p&gt;You open the app, fill in fields, select options, and submit.&lt;/p&gt;

&lt;p&gt;It works, but it adds friction.&lt;/p&gt;

&lt;p&gt;I wanted something simpler.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;What if a user could just say:&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;“Netflix ₹499 monthly”&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;…and the system handles everything?&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdsoo9w8nbkpr13hpc2iu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdsoo9w8nbkpr13hpc2iu.png" alt="How I Built an LLM Service That Converts Natural Language into Database Events" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Core Idea
&lt;/h3&gt;

&lt;p&gt;Instead of forcing users to adapt to the system…&lt;/p&gt;

&lt;p&gt;Make the system adapt to the user.&lt;/p&gt;

&lt;p&gt;The pipeline looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfpsmpxayt47jxni8c0e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfpsmpxayt47jxni8c0e.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each step reduces ambiguity and moves toward structured data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1 — Handling Voice &amp;amp; Text Input
&lt;/h3&gt;

&lt;p&gt;The system doesn’t just rely on one type of input.&lt;/p&gt;

&lt;p&gt;Users can either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Speak&lt;/strong&gt; (“Netflix ₹499 monthly”)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Type a quick message&lt;/strong&gt; (just like a notification or note)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the first step is to normalize everything into &lt;strong&gt;plain text&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If the input is voice, we convert it using a speech-to-text service.&lt;br&gt;&lt;br&gt;
If it’s already text, we process it directly.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The goal is simple: everything becomes text before any processing begins.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Example Input
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Case 1: User typed a message (like a quick note)
&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Netflix 499 monthly&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Case 2: Voice input (after speech-to-text conversion)
&lt;/span&gt;&lt;span class="n"&gt;voice_transcribed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Spotify 199 per month&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Basic Handling Layer
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;normalize_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;input_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;voice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Simulated speech-to-text (replace with a real API call)
&lt;/span&gt;        &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;input_data&lt;/span&gt;  &lt;span class="c1"&gt;# already transcribed
&lt;/span&gt;    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;input_data&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Example usage
&lt;/span&gt;&lt;span class="n"&gt;text_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;normalize_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;voice_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;normalize_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;voice_transcribed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;voice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;voice_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Why This Step Matters
&lt;/h3&gt;

&lt;p&gt;This step might look simple, but it’s critical.&lt;/p&gt;

&lt;p&gt;Because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  It creates a &lt;strong&gt;single entry point&lt;/strong&gt; for all inputs&lt;/li&gt;
&lt;li&gt;  It keeps downstream logic clean&lt;/li&gt;
&lt;li&gt;  It allows you to support multiple input methods easily&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And more importantly:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;It makes the system feel natural — users can just “say” or “type” what they did.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Step 2 — Lightweight Regex Filtering
&lt;/h3&gt;

&lt;p&gt;Before sending everything to the LLM, I added a simple filter.&lt;/p&gt;

&lt;p&gt;Why?&lt;/p&gt;

&lt;p&gt;Because not all inputs are subscription-related.&lt;/p&gt;

&lt;p&gt;This saves cost and improves accuracy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_subscription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Any single match is enough to pass the filter
&lt;/span&gt;    &lt;span class="n"&gt;patterns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;\b(monthly|yearly|weekly)\b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;₹\d+&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;\b(netflix|spotify|amazon|prime)\b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;patterns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Example
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is_subscription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# True
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If it’s not a subscription, we can route it elsewhere.&lt;/p&gt;
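
&lt;p&gt;A minimal sketch of that routing step. The &lt;code&gt;route_input&lt;/code&gt; helper and the two route names are hypothetical, and the filter is a simplified copy of the one above so the snippet runs on its own.&lt;/p&gt;

```python
import re

# Simplified copy of the Step 2 filter so the sketch is self-contained.
PATTERNS = [
    r"\b(monthly|yearly|weekly)\b",
    r"₹\d+",
    r"\b(netflix|spotify|amazon|prime)\b",
]


def is_subscription(text):
    return any(re.search(p, text.lower()) for p in PATTERNS)


def route_input(text):
    """Return a hypothetical route name for the input."""
    if is_subscription(text):
        return "subscription_parser"  # continues to the LLM step
    return "general_notes"  # never reaches the LLM, so no API cost


print(route_input("Netflix 499 monthly"))
print(route_input("call mom tomorrow"))
```

&lt;p&gt;Because any single pattern is enough to pass, the filter is deliberately permissive: it is meant to drop obvious non-matches cheaply, not to classify precisely.&lt;/p&gt;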

&lt;h3&gt;
  
  
  Step 3 — LLM Parsing
&lt;/h3&gt;

&lt;p&gt;Now comes the important part — extracting structured data.&lt;/p&gt;

&lt;p&gt;We send the filtered input to an LLM with a strict prompt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# reads OPENAI_API_KEY from the environment
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;parse_subscription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Extract subscription details from the input.
    Return JSON with fields:
    name, cost, billing_cycle
    Input: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;# JSON mode, so the reply can be passed straight to json.loads in Step 4
&lt;/span&gt;        &lt;span class="n"&gt;response_format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json_object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;

&lt;span class="c1"&gt;# Example
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parse_subscription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;  
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Netflix"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
  &lt;/span&gt;&lt;span class="nl"&gt;"cost"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;499&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
  &lt;/span&gt;&lt;span class="nl"&gt;"billing_cycle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"monthly"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4 — Structuring the Event
&lt;/h3&gt;

&lt;p&gt;Now we convert this into a system event.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed_json&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUBSCRIPTION_CREATED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;# Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payload&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;billing_cycle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;billing_cycle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;

&lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5 — Saving to Database
&lt;/h3&gt;

&lt;p&gt;Finally, store it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_to_db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Replace with actual DB logic
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Saving to DB:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;save_to_db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
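
&lt;p&gt;One way the placeholder could be filled in, sketched with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The article does not name a database, so SQLite, the &lt;code&gt;events&lt;/code&gt; table layout, the extra &lt;code&gt;conn&lt;/code&gt; parameter, and the sample event are all assumptions made for illustration.&lt;/p&gt;

```python
import json
import sqlite3


def save_to_db(conn, event):
    """Insert one event row; the payload is stored as a JSON string."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "type TEXT NOT NULL, "
        "timestamp TEXT NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    # Parameterized query: values are bound, never formatted into the SQL.
    conn.execute(
        "INSERT INTO events (type, timestamp, payload) VALUES (?, ?, ?)",
        (event["type"], event["timestamp"], json.dumps(event["payload"])),
    )
    conn.commit()


# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")
sample_event = {
    "type": "SUBSCRIPTION_CREATED",
    "timestamp": "2026-01-01T00:00:00+00:00",  # made-up sample value
    "payload": {"name": "Netflix", "cost": 499, "billing_cycle": "monthly"},
}
save_to_db(conn, sample_event)
rows = conn.execute("SELECT type, payload FROM events").fetchall()
print(rows)
```

&lt;p&gt;Storing the payload as JSON keeps the table generic, so new event types would not need a schema change.&lt;/p&gt;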



&lt;h3&gt;
  
  
  Why This Works
&lt;/h3&gt;

&lt;p&gt;This system feels simple, but a few design decisions make it powerful:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Regex Before LLM
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Filters irrelevant input&lt;/li&gt;
&lt;li&gt;  Reduces cost&lt;/li&gt;
&lt;li&gt;  Improves signal&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. LLM for Structure, Not Logic
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  LLM extracts meaning&lt;/li&gt;
&lt;li&gt;  System enforces rules&lt;/li&gt;
&lt;/ul&gt;
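
&lt;p&gt;A small sketch of what "system enforces rules" could look like: the model's reply is treated as untrusted input and checked in plain code. The &lt;code&gt;validate_subscription&lt;/code&gt; helper and the allowed billing cycles are assumptions, not part of the original service.&lt;/p&gt;

```python
import json

# Hypothetical rule set; a real service would define its own.
ALLOWED_CYCLES = {"weekly", "monthly", "yearly"}


def validate_subscription(raw_json):
    """Parse the model's reply and enforce the rules in plain code."""
    data = json.loads(raw_json)  # malformed JSON raises a ValueError subclass

    name = str(data.get("name", "")).strip()
    if not name:
        raise ValueError("name is missing")

    # KeyError if the field is absent, ValueError for a non-numeric string.
    cost = float(data["cost"])
    if not cost > 0:
        raise ValueError("cost must be positive")

    cycle = str(data.get("billing_cycle", "")).lower()
    if cycle not in ALLOWED_CYCLES:
        raise ValueError(f"unsupported billing_cycle: {cycle!r}")

    return {"name": name, "cost": cost, "billing_cycle": cycle}


reply = '{"name": "Netflix", "cost": 499, "billing_cycle": "Monthly"}'
clean = validate_subscription(reply)
print(clean)
```

&lt;p&gt;Anything that fails validation can be rejected or sent back to the user for confirmation instead of being written to the database.&lt;/p&gt;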

&lt;h3&gt;
  
  
  3. Event-Based Design
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Everything becomes an event&lt;/li&gt;
&lt;li&gt;  Easy to extend (notifications, analytics, etc.)&lt;/li&gt;
&lt;/ul&gt;
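
&lt;p&gt;To illustrate why events are easy to extend, here is a minimal in-process dispatcher. The &lt;code&gt;on&lt;/code&gt; decorator, &lt;code&gt;dispatch&lt;/code&gt; function, and both handlers are hypothetical; a production system might use a message queue instead.&lt;/p&gt;

```python
from collections import defaultdict

# Maps an event type to the functions that should react to it.
handlers = defaultdict(list)


def on(event_type):
    """Decorator that registers a handler for one event type."""
    def register(func):
        handlers[event_type].append(func)
        return func
    return register


def dispatch(event):
    """Call every handler registered for the event's type, in order."""
    return [handler(event) for handler in handlers[event["type"]]]


@on("SUBSCRIPTION_CREATED")
def schedule_reminder(event):
    return f"reminder set for {event['payload']['name']}"


@on("SUBSCRIPTION_CREATED")
def record_analytics(event):
    return f"tracked cost {event['payload']['cost']}"


event = {
    "type": "SUBSCRIPTION_CREATED",
    "payload": {"name": "Netflix", "cost": 499, "billing_cycle": "monthly"},
}
results = dispatch(event)
print(results)
```

&lt;p&gt;Adding a new behavior means registering one more handler; the code that creates the event does not change.&lt;/p&gt;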

&lt;h3&gt;
  
  
  Where This Gets Interesting
&lt;/h3&gt;

&lt;p&gt;Once this pipeline is in place, you can extend it easily:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Add reminders automatically&lt;/li&gt;
&lt;li&gt;  Trigger notifications&lt;/li&gt;
&lt;li&gt;  Detect duplicates&lt;/li&gt;
&lt;li&gt;  Categorize spending&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And most importantly:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The user doesn’t feel like they’re using a system.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They just type or speak naturally. With the user’s permission, the system could also extract subscription messages directly from their phone.&lt;/p&gt;

&lt;h3&gt;
  
  
  Final Thought
&lt;/h3&gt;

&lt;p&gt;This isn’t about AI.&lt;/p&gt;

&lt;p&gt;It’s about reducing friction.&lt;/p&gt;

&lt;p&gt;Forms make users adapt to systems.&lt;br&gt;&lt;br&gt;
Natural language lets systems adapt to users.&lt;/p&gt;

&lt;p&gt;And that small shift makes everything feel… effortless.&lt;/p&gt;

</description>
      <category>database</category>
      <category>llm</category>
      <category>nlp</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Observability: You Can’t Fix What You Can’t See</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Sun, 03 May 2026 15:09:41 +0000</pubDate>
      <link>https://forem.com/akshatjme/observability-you-cant-fix-what-you-cant-see-5hhf</link>
      <guid>https://forem.com/akshatjme/observability-you-cant-fix-what-you-cant-see-5hhf</guid>
      <description>&lt;h4&gt;
  
  
  Understanding system behavior beyond logs and dashboards
&lt;/h4&gt;

&lt;p&gt;In previous parts, we explored how systems fail under load and how design decisions influence performance.&lt;/p&gt;

&lt;p&gt;But identifying failures is a different challenge.&lt;/p&gt;

&lt;p&gt;A system may be slow, unstable, or partially broken, yet the cause is not always visible.&lt;/p&gt;

&lt;p&gt;This is where observability becomes important.&lt;/p&gt;

&lt;p&gt;Observability is not just about collecting data.&lt;br&gt;&lt;br&gt;
It is about understanding how a system behaves internally by looking at its outputs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo3l657nx1o2d7yzzgy5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo3l657nx1o2d7yzzgy5.png" alt="Observability: You Can’t Fix What You Can’t See" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Logs, metrics, and traces
&lt;/h3&gt;

&lt;p&gt;Observability is built on three main signals.&lt;/p&gt;

&lt;p&gt;Logs provide discrete records of events.&lt;br&gt;&lt;br&gt;
They show what happened at a specific point in time.&lt;/p&gt;

&lt;p&gt;Metrics provide aggregated numerical data.&lt;br&gt;&lt;br&gt;
They show trends such as latency, error rates, and throughput.&lt;/p&gt;

&lt;p&gt;Traces provide request-level visibility.&lt;br&gt;&lt;br&gt;
They show how a single request moves through different components.&lt;/p&gt;

&lt;p&gt;Each of these serves a different purpose.&lt;/p&gt;

&lt;p&gt;Logs help in understanding specific events.&lt;br&gt;&lt;br&gt;
Metrics help in identifying patterns.&lt;br&gt;&lt;br&gt;
Traces help in connecting events across the system.&lt;/p&gt;

&lt;p&gt;None of them is sufficient on its own.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lack of visibility delays fixes
&lt;/h3&gt;

&lt;p&gt;When systems lack observability, problems remain hidden.&lt;/p&gt;

&lt;p&gt;Failures may exist in small forms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  slight latency increases&lt;/li&gt;
&lt;li&gt;  occasional errors&lt;/li&gt;
&lt;li&gt;  resource usage spikes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These signals are often missed without proper visibility.&lt;/p&gt;

&lt;p&gt;Over time, these small issues grow.&lt;/p&gt;

&lt;p&gt;By the time they become noticeable, the system is already under stress or failing.&lt;/p&gt;

&lt;p&gt;Lack of visibility does not prevent problems.&lt;br&gt;&lt;br&gt;
It delays their discovery.&lt;/p&gt;

&lt;h3&gt;
  
  
  Correlation is key
&lt;/h3&gt;

&lt;p&gt;Modern systems are distributed.&lt;/p&gt;

&lt;p&gt;A single request may pass through multiple services, databases, and external APIs.&lt;/p&gt;

&lt;p&gt;Observing each component separately is not enough.&lt;/p&gt;

&lt;p&gt;The key is to &lt;strong&gt;connect events across components&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Correlation makes it possible to understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  how one service affects another&lt;/li&gt;
&lt;li&gt;  where latency is introduced&lt;/li&gt;
&lt;li&gt;  how failures propagate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without correlation, data remains fragmented.&lt;/p&gt;

&lt;p&gt;With correlation, it becomes possible to identify root causes instead of symptoms.&lt;/p&gt;
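
&lt;p&gt;One common way to correlate events is to stamp every log line with a shared request ID. The sketch below uses only the Python standard library; the logger name, the ID format, and the two functions are made up for illustration, and across real services the ID would travel in a request header.&lt;/p&gt;

```python
import contextvars
import io
import logging
import uuid

# Holds the ID of the request currently being handled.
request_id = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every log record."""
    def filter(self, record):
        record.request_id = request_id.get()
        return True


stream = io.StringIO()  # captured in memory so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(request_id)s %(name)s %(message)s"))
handler.addFilter(RequestIdFilter())

logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def charge_card():
    logger.info("charging card")


def handle_request():
    request_id.set(uuid.uuid4().hex[:8])
    logger.info("request received")
    charge_card()  # logs with the same ID without passing it explicitly


handle_request()
lines = stream.getvalue().splitlines()
print(lines)
```

&lt;p&gt;Both lines carry the same ID, so searching for it reconstructs the path of a single request.&lt;/p&gt;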

&lt;h3&gt;
  
  
  The problem of too many metrics
&lt;/h3&gt;

&lt;p&gt;Collecting more data does not always improve observability.&lt;/p&gt;

&lt;p&gt;Large systems often generate thousands of metrics.&lt;/p&gt;

&lt;p&gt;This creates noise.&lt;/p&gt;

&lt;p&gt;When everything is measured, it becomes harder to identify what actually matters.&lt;/p&gt;

&lt;p&gt;Important signals get lost among less relevant data.&lt;/p&gt;

&lt;p&gt;Effective observability focuses on meaningful metrics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  latency&lt;/li&gt;
&lt;li&gt;  error rates&lt;/li&gt;
&lt;li&gt;  system saturation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is not to measure everything, but to measure what reflects system behavior.&lt;/p&gt;
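
&lt;p&gt;As a rough illustration, two of these signals can be derived from raw request samples in a few lines. The sample data is invented, and the nearest-rank percentile is just one of several accepted definitions; saturation would come from resource gauges (CPU, queue depth) that are not modeled here.&lt;/p&gt;

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical request samples: (latency in ms, HTTP status).
samples = [(120, 200), (95, 200), (110, 200), (980, 500), (105, 200),
           (130, 200), (90, 200), (101, 200), (115, 200), (1250, 503)]

latencies = [ms for ms, _ in samples]
errors = sum(1 for _, status in samples if status >= 500)

print("p50 latency (ms):", percentile(latencies, 50))
print("p95 latency (ms):", percentile(latencies, 95))
print("error rate:", errors / len(samples))
```

&lt;p&gt;The median looks healthy while the tail does not, which is why percentiles tend to say more about user experience than averages.&lt;/p&gt;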

&lt;h3&gt;
  
  
  Observability as a system property
&lt;/h3&gt;

&lt;p&gt;Observability is not something added later.&lt;/p&gt;

&lt;p&gt;It must be part of system design.&lt;/p&gt;

&lt;p&gt;Systems should be built so that their internal state can be inferred from their external outputs.&lt;/p&gt;

&lt;p&gt;This includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  structured logging&lt;/li&gt;
&lt;li&gt;  consistent metrics&lt;/li&gt;
&lt;li&gt;  traceable request flows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this, understanding system behavior becomes difficult, especially under load.&lt;/p&gt;
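
&lt;p&gt;For structured logging, a minimal sketch with the standard &lt;code&gt;logging&lt;/code&gt; module is shown below. The &lt;code&gt;JsonFormatter&lt;/code&gt; class and the field names &lt;code&gt;request_id&lt;/code&gt; and &lt;code&gt;latency_ms&lt;/code&gt; are assumptions chosen for the example.&lt;/p&gt;

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""
    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in ("request_id", "latency_ms"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)


stream = io.StringIO()  # captured in memory so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("order placed", extra={"request_id": "abc123", "latency_ms": 42})
parsed = json.loads(stream.getvalue())
print(parsed)
```

&lt;p&gt;Because every line is machine-readable, logs can be filtered and aggregated by field rather than by fragile text matching.&lt;/p&gt;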

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Observability defines how well a system can be understood from the outside.&lt;/p&gt;

&lt;p&gt;Without it, diagnosing issues becomes slow and uncertain.&lt;/p&gt;

&lt;p&gt;With it, systems become easier to analyze, debug, and improve.&lt;/p&gt;

&lt;p&gt;Performance issues, failures, and bottlenecks are not always obvious.&lt;br&gt;&lt;br&gt;
They must be observed, connected, and interpreted.&lt;/p&gt;

&lt;p&gt;In the next part, we will look at common scaling myths that often mislead developers when designing systems.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>systemdesignconcepts</category>
      <category>distributedsystems</category>
      <category>backenddevelopment</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>Load Testing: Why Most Developers Do It Wrong</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Fri, 01 May 2026 15:31:42 +0000</pubDate>
      <link>https://forem.com/akshatjme/load-testing-why-most-developers-do-it-wrong-b0h</link>
      <guid>https://forem.com/akshatjme/load-testing-why-most-developers-do-it-wrong-b0h</guid>
      <description>&lt;h4&gt;
  
  
  Why testing for stability often hides the real limits of your system
&lt;/h4&gt;

&lt;p&gt;In previous parts, we explored how systems behave under pressure.&lt;/p&gt;

&lt;p&gt;Load testing is meant to reveal those behaviors before they appear in production.&lt;/p&gt;

&lt;p&gt;However, many systems still fail unexpectedly, even after being tested.&lt;/p&gt;

&lt;p&gt;The issue is not the absence of testing.&lt;br&gt;&lt;br&gt;
It is how testing is approached.&lt;/p&gt;

&lt;p&gt;Load testing is often treated as a validation step, rather than a method to understand system limits.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F1%2A0aTnB1LaKS1NmFGW2m7CLQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F1%2A0aTnB1LaKS1NmFGW2m7CLQ.png" alt="Load Testing" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  Testing average instead of peak
&lt;/h3&gt;

&lt;p&gt;Most load tests simulate normal conditions.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  expected number of users&lt;/li&gt;
&lt;li&gt;  typical request patterns&lt;/li&gt;
&lt;li&gt;  stable traffic levels&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Under these conditions, systems usually perform well.&lt;/p&gt;

&lt;p&gt;However, real failures occur under &lt;strong&gt;peak conditions&lt;/strong&gt;, not average ones.&lt;/p&gt;

&lt;p&gt;Traffic spikes, sudden bursts, and extreme concurrency reveal issues that normal testing cannot.&lt;/p&gt;

&lt;p&gt;Testing only average load gives a false sense of confidence.&lt;br&gt;&lt;br&gt;
It confirms that the system works, but not how it behaves under stress.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unrealistic test scenarios
&lt;/h3&gt;

&lt;p&gt;Load tests often use simplified or artificial traffic patterns.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  uniform request distribution&lt;/li&gt;
&lt;li&gt;  predictable intervals&lt;/li&gt;
&lt;li&gt;  identical requests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Real user behavior is different.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  traffic comes in bursts&lt;/li&gt;
&lt;li&gt;  request patterns vary&lt;/li&gt;
&lt;li&gt;  some endpoints are used more than others&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because of this mismatch, tests fail to capture real-world complexity.&lt;/p&gt;

&lt;p&gt;The system passes the test but fails in production, where conditions are less predictable.&lt;/p&gt;
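
&lt;p&gt;As a rough illustration, a test plan can be generated with bursts and an uneven endpoint mix instead of a flat request rate. This is a minimal sketch: the endpoint names, weights, and rates below are assumptions made for the example, not measurements from a real system.&lt;/p&gt;

```python
import random

# Hypothetical endpoint mix: the weights are illustrative, not measured.
ENDPOINT_WEIGHTS = {"/feed": 0.6, "/search": 0.3, "/checkout": 0.1}


def bursty_schedule(seconds, base_rps, burst_rps, burst_every, seed=0):
    """Return a list of (second, endpoint) pairs with periodic bursts."""
    rng = random.Random(seed)
    endpoints = list(ENDPOINT_WEIGHTS)
    weights = list(ENDPOINT_WEIGHTS.values())
    schedule = []
    for second in range(seconds):
        # Every `burst_every` seconds the rate jumps from base_rps to burst_rps.
        rate = burst_rps if second % burst_every == 0 else base_rps
        # Endpoints are drawn by weight, so some are hit more than others.
        for endpoint in rng.choices(endpoints, weights=weights, k=rate):
            schedule.append((second, endpoint))
    return schedule


plan = bursty_schedule(seconds=10, base_rps=5, burst_rps=50, burst_every=5)
print(len(plan), "requests planned")  # prints: 140 requests planned
```

&lt;p&gt;A plan like this can then be replayed by whichever load tool you already use.&lt;/p&gt;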

&lt;h3&gt;
  
  
  Ignoring system limits
&lt;/h3&gt;

&lt;p&gt;A key purpose of load testing is to identify limits.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  maximum throughput&lt;/li&gt;
&lt;li&gt;  latency thresholds&lt;/li&gt;
&lt;li&gt;  resource saturation points&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, many tests stop once the system appears stable.&lt;/p&gt;

&lt;p&gt;They measure success instead of exploring failure.&lt;/p&gt;

&lt;p&gt;Without pushing the system to its limits, it is not possible to understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  when performance starts degrading&lt;/li&gt;
&lt;li&gt;  how quickly failures spread&lt;/li&gt;
&lt;li&gt;  which component fails first&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Understanding limits is more valuable than confirming stability.&lt;/p&gt;
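
&lt;p&gt;One way to explore limits is a step test: raise the load in fixed increments and record the first rate at which latency crosses a threshold. The sketch below uses a toy latency model in place of a real measurement, so the function names, the capacity, and the threshold are all assumptions.&lt;/p&gt;

```python
def simulated_latency_ms(rps, capacity_rps=1000, base_ms=20):
    """Toy model standing in for a real measurement (an assumption)."""
    utilization = min(rps / capacity_rps, 0.99)
    # Latency climbs sharply as utilization approaches 100%.
    return base_ms / (1 - utilization)


def find_degradation_point(start_rps, step_rps, max_rps, slo_ms):
    """Increase load step by step; return the first rate that breaks the threshold."""
    for rps in range(start_rps, max_rps + 1, step_rps):
        latency = simulated_latency_ms(rps)
        print(f"{rps:5d} rps: {latency:7.1f} ms")
        if latency > slo_ms:
            return rps
    return None  # the threshold held for every tested rate


knee = find_degradation_point(start_rps=100, step_rps=100, max_rps=1500, slo_ms=150)
print("degradation starts at", knee, "rps")
```

&lt;p&gt;In a real test, the toy model would be replaced by latency measured from the system under load.&lt;/p&gt;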

&lt;h3&gt;
  
  
  No continuous testing
&lt;/h3&gt;

&lt;p&gt;Load testing is often treated as a one-time activity.&lt;/p&gt;

&lt;p&gt;It is performed before release and then ignored.&lt;/p&gt;

&lt;p&gt;However, systems evolve over time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  new features are added&lt;/li&gt;
&lt;li&gt;  traffic patterns change&lt;/li&gt;
&lt;li&gt;  dependencies are updated&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These changes affect performance.&lt;/p&gt;

&lt;p&gt;A system that was stable earlier may degrade gradually.&lt;/p&gt;

&lt;p&gt;Without continuous testing, these changes go unnoticed until failure occurs in production.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lack of failure analysis
&lt;/h3&gt;

&lt;p&gt;Many load tests focus on metrics like response time and throughput.&lt;/p&gt;

&lt;p&gt;But they do not analyze how the system fails.&lt;/p&gt;

&lt;p&gt;Important questions are often ignored:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  does the system degrade gradually or suddenly&lt;/li&gt;
&lt;li&gt;  which component fails first&lt;/li&gt;
&lt;li&gt;  how failures propagate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Understanding failure behavior is essential for improving system design.&lt;/p&gt;

&lt;p&gt;Without it, testing provides limited insight.&lt;/p&gt;

&lt;h3&gt;
  
  
  No correlation with real metrics
&lt;/h3&gt;

&lt;p&gt;Load testing results are often viewed in isolation.&lt;/p&gt;

&lt;p&gt;They are not always compared with real system metrics such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  CPU usage&lt;/li&gt;
&lt;li&gt;  memory consumption&lt;/li&gt;
&lt;li&gt;  database performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this correlation, it is difficult to identify the root cause of performance issues.&lt;/p&gt;

&lt;p&gt;Testing shows that a problem exists, but not why it exists.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Load testing is not just about checking if a system works.&lt;/p&gt;

&lt;p&gt;It is about understanding how the system behaves under pressure.&lt;/p&gt;

&lt;p&gt;Testing only average conditions, using unrealistic scenarios, and stopping before the system reaches its limits all lead to incomplete results.&lt;/p&gt;

&lt;p&gt;To be effective, load testing must explore extremes, reflect real-world usage, and evolve with the system.&lt;/p&gt;

&lt;p&gt;In the next part, we will look at observability and why understanding system behavior is essential for fixing performance issues.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>scalability</category>
      <category>backenddevelopment</category>
      <category>systemdesignconcepts</category>
      <category>softwaretesting</category>
    </item>
    <item>
      <title>How I Built a Decision-Tree Based Help and Support System</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Wed, 29 Apr 2026 16:10:54 +0000</pubDate>
      <link>https://forem.com/akshatjme/how-i-built-a-decision-tree-based-help-and-support-system-14c0</link>
      <guid>https://forem.com/akshatjme/how-i-built-a-decision-tree-based-help-and-support-system-14c0</guid>
      <description>&lt;h4&gt;
  
  
  80% of user problems are repeated patterns.
&lt;/h4&gt;

&lt;p&gt;So why are we solving them manually every time?&lt;/p&gt;

&lt;p&gt;If you’ve ever built a &lt;strong&gt;help and support system&lt;/strong&gt;, you’ve probably done this:&lt;br&gt;&lt;br&gt;
added a few FAQs, maybe a help page, and a “Contact Us” button.&lt;/p&gt;

&lt;p&gt;It feels like enough.&lt;/p&gt;

&lt;p&gt;But then users start reaching out… and you notice something strange.&lt;/p&gt;

&lt;p&gt;They’re asking the same questions. Over and over again.&lt;/p&gt;

&lt;p&gt;“How do I add a subscription?”&lt;br&gt;&lt;br&gt;
“Why is my billing date wrong?”&lt;br&gt;&lt;br&gt;
“Where can I see my payments?”&lt;/p&gt;

&lt;p&gt;At first, it feels like users aren’t reading.&lt;/p&gt;

&lt;p&gt;But that’s not the real problem.&lt;/p&gt;

&lt;p&gt;The real problem is this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Most help systems are designed for information.&lt;br&gt;&lt;br&gt;
Users need guidance.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A static FAQ assumes the user already knows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  what their problem is&lt;/li&gt;
&lt;li&gt;  what to search for&lt;/li&gt;
&lt;li&gt;  which answer applies to them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In reality, most users are confused at the first step.&lt;/p&gt;

&lt;p&gt;They don’t think in terms of categories like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  “billing issues”&lt;/li&gt;
&lt;li&gt;  “subscription errors”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They think in situations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  “something is not working”&lt;/li&gt;
&lt;li&gt;  “I don’t understand this screen”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And here’s where things get interesting.&lt;/p&gt;

&lt;p&gt;When I started looking at support requests closely, I realized something:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;A large percentage of problems were repeated patterns.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not unique cases.&lt;/p&gt;

&lt;p&gt;Just the same issues showing up again and again:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  multiple users struggling to add a subscription&lt;/li&gt;
&lt;li&gt;  users misunderstanding renewal dates&lt;/li&gt;
&lt;li&gt;  confusion around monthly vs yearly billing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This changed how I looked at the problem.&lt;/p&gt;

&lt;p&gt;Instead of building a better FAQ…&lt;/p&gt;

&lt;p&gt;I started thinking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;What if the system could guide users step-by-step to the solution instead of expecting them to find it?&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That idea is what led to building a &lt;strong&gt;decision-tree based help system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl94q5iqpq2r7iddj19e1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl94q5iqpq2r7iddj19e1.png" alt="How I Built a Decision-Tree Based Help and Support System" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Thinking Like a System, Not a Page
&lt;/h3&gt;

&lt;p&gt;After noticing that most user issues were repetitive, the problem became clearer.&lt;/p&gt;

&lt;p&gt;The issue wasn’t lack of content.&lt;br&gt;&lt;br&gt;
It was lack of &lt;strong&gt;direction&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Traditional help systems are built like documentation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Lists of FAQs&lt;/li&gt;
&lt;li&gt;  Search bars&lt;/li&gt;
&lt;li&gt;  Static categories&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But users don’t navigate problems like that.&lt;/p&gt;

&lt;p&gt;They don’t think:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Let me go to the billing section and read all options.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They think:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Something is wrong. What do I do next?”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That shift is important.&lt;/p&gt;

&lt;p&gt;Instead of designing a &lt;strong&gt;help page&lt;/strong&gt;, I started designing a &lt;strong&gt;guided system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A system that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  asks the right questions&lt;/li&gt;
&lt;li&gt;  narrows down the problem&lt;/li&gt;
&lt;li&gt;  leads the user to a solution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Almost like how a support agent would think.&lt;/p&gt;

&lt;p&gt;And that’s where the idea of a &lt;strong&gt;decision tree&lt;/strong&gt; fits naturally.&lt;/p&gt;

&lt;p&gt;Instead of overwhelming users with options, you guide them step by step:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  What’s the issue?&lt;/li&gt;
&lt;li&gt;  What exactly went wrong?&lt;/li&gt;
&lt;li&gt;  When did it happen?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each answer moves them closer to the solution.&lt;/p&gt;

&lt;p&gt;This approach does two things really well:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Reduces user confusion&lt;/li&gt;
&lt;li&gt;  Reduces repeated support requests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because now, instead of 20 users asking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“How do I add a subscription?”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The system guides them through the exact steps automatically.&lt;/p&gt;

&lt;p&gt;At this point, the help system stops being passive.&lt;/p&gt;

&lt;p&gt;It becomes &lt;strong&gt;interactive and problem-solving&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Designing the Decision Tree Structure
&lt;/h3&gt;

&lt;p&gt;Once the idea of a guided system was clear, the next step was structuring it properly.&lt;/p&gt;

&lt;p&gt;At its core, the help system is just a &lt;strong&gt;decision tree&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Simple concept:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Each &lt;strong&gt;node&lt;/strong&gt; = a question&lt;/li&gt;
&lt;li&gt;  Each &lt;strong&gt;branch&lt;/strong&gt; = a user choice&lt;/li&gt;
&lt;li&gt;  Each &lt;strong&gt;leaf&lt;/strong&gt; = a solution or a real person/agent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of showing everything at once, the system reveals only what’s needed at each step.&lt;/p&gt;

&lt;p&gt;Here’s a simple example:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhg92m3wb1ti6jb35n39g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhg92m3wb1ti6jb35n39g.png" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Tree Structure&lt;/p&gt;

&lt;p&gt;Now compare this to a typical FAQ page.&lt;/p&gt;

&lt;p&gt;Instead of scanning 10–15 questions, the user just answers 2–3 guided steps and reaches the solution.&lt;/p&gt;
&lt;h3&gt;
  
  
  Why This Structure Works
&lt;/h3&gt;

&lt;p&gt;This works well because of one key observation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Most user problems fall into a limited number of patterns.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Many users struggle with adding subscriptions&lt;/li&gt;
&lt;li&gt;  Many get confused about billing cycles&lt;/li&gt;
&lt;li&gt;  Many face similar payment issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So instead of handling each request individually, we &lt;strong&gt;categorize and guide&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This reduces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  repeated support queries&lt;/li&gt;
&lt;li&gt;  manual intervention&lt;/li&gt;
&lt;li&gt;  user frustration&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Designing It Properly
&lt;/h3&gt;

&lt;p&gt;While building this, a few principles mattered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Keep questions simple&lt;/li&gt;
&lt;li&gt;  Avoid deep nesting (3–5 levels max)&lt;/li&gt;
&lt;li&gt;  Always provide an exit (contact support)&lt;/li&gt;
&lt;li&gt;  Log where users drop off&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because if users abandon the flow, that’s where your system needs improvement.&lt;/p&gt;
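
&lt;p&gt;Drop-off logging can be as simple as recording which question each session saw last and whether it finished. Here is a minimal sketch; the event format and the sample log are hypothetical and exist only to show the idea.&lt;/p&gt;

```python
from collections import Counter


def drop_off_counts(events):
    """Count, per question, how many sessions stopped there without finishing."""
    last_question = {}
    finished = set()
    for session_id, kind, question in events:
        if kind == "asked":
            last_question[session_id] = question
        elif kind == "finished":
            finished.add(session_id)
    return Counter(
        question
        for session_id, question in last_question.items()
        if session_id not in finished
    )


# Hypothetical event log: (session_id, event_kind, question)
events = [
    ("a", "asked", "issue_type"),
    ("a", "asked", "payment_problem"),
    ("a", "finished", None),
    ("b", "asked", "issue_type"),
    ("b", "asked", "subscription_problem"),
    ("c", "asked", "issue_type"),
]

print(drop_off_counts(events))
```

&lt;p&gt;Questions with high counts are the first candidates for rewording or restructuring.&lt;/p&gt;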

&lt;p&gt;At this point, the structure is clear.&lt;/p&gt;

&lt;p&gt;Next step is making it &lt;strong&gt;work in code&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Implementing the Decision Tree (Python Code)
&lt;/h3&gt;

&lt;p&gt;Once the structure was clear, implementing it was surprisingly simple.&lt;/p&gt;

&lt;p&gt;You don’t need complex frameworks.&lt;br&gt;&lt;br&gt;
A decision tree can be represented using basic objects.&lt;/p&gt;

&lt;p&gt;At its core, each node needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  a question (or condition)&lt;/li&gt;
&lt;li&gt;  possible next steps&lt;/li&gt;
&lt;li&gt;  or a final action (solution)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Basic Implementation
&lt;/h3&gt;

&lt;p&gt;Here’s a clean and minimal version:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Node:
    """A question with options, or a leaf that carries a final action."""

    def __init__(self, question=None, options=None, action=None):
        self.question = question
        self.options = options or {}
        self.action = action

    def evaluate(self, context):
        # Leaf node: run the final action.
        if self.action:
            return self.action(context)

        # Look up the answer to this node's question.
        answer = context.get(self.question)

        if answer in self.options:
            return self.options[answer].evaluate(context)

        # Missing or unknown answer: hand over to a human.
        return escalate_to_agent(context)


# Actions
def resolved(ctx):
    return "Issue Resolved"


def retry_payment(ctx):
    if ctx.get("retry_success"):
        return "Payment Successful"
    return escalate_to_agent(ctx)


def escalate_to_agent(ctx):
    return "Escalating to Customer Support Agent"


# Tree Construction
tree = Node(
    question="issue_type",
    options={
        "subscription": Node(
            question="subscription_problem",
            options={
                "add": Node(
                    question="add_issue_type",
                    options={
                        "ui": Node(action=resolved),
                        "error": Node(action=escalate_to_agent),
                    },
                ),
                "manage": Node(
                    question="manage_issue",
                    options={
                        "edit": Node(action=resolved),
                        "delete": Node(action=escalate_to_agent),
                    },
                ),
            },
        ),
        "payment": Node(
            question="payment_problem",
            options={
                "failed": Node(action=retry_payment),
                "incorrect_charge": Node(action=escalate_to_agent),
            },
        ),
    },
)

# Example Context
context = {
    "issue_type": "payment",
    "payment_problem": "failed",
    "retry_success": False,
}

# The retry did not succeed, so this prints: Escalating to Customer Support Agent
print(tree.evaluate(context))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What This Gives You
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  A flexible structure&lt;/li&gt;
&lt;li&gt;  Easy to extend (just add nodes)&lt;/li&gt;
&lt;li&gt;  Clear separation of logic&lt;/li&gt;
&lt;li&gt;  No hardcoded if-else chains&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And most importantly:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;You can model real user journeys instead of writing scattered logic.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In a real system, the answers wouldn’t come from a prefilled context dictionary.&lt;/p&gt;

&lt;p&gt;Instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  UI handles selections&lt;/li&gt;
&lt;li&gt;  Backend returns next node&lt;/li&gt;
&lt;li&gt;  State is maintained per session&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now the final step is connecting this to your actual system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Integrating It into the Real Project
&lt;/h3&gt;

&lt;p&gt;The best part about this help and support system is that it doesn’t need to live separately from the main application.&lt;/p&gt;

&lt;p&gt;I integrated it as its own &lt;strong&gt;Help/Support Service&lt;/strong&gt; inside the project architecture.&lt;/p&gt;

&lt;p&gt;The flow is simple:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4qx4b16cg1lnzn20ywre.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4qx4b16cg1lnzn20ywre.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When a user taps &lt;strong&gt;Help&lt;/strong&gt;, the mobile app sends the selected category or current screen context to the service.&lt;/p&gt;
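
&lt;p&gt;The exchange between the app and the service can be sketched as a function that advances one step per call and keeps the current position per session. This is an illustration under assumptions: the node ids, question texts, and the in-memory &lt;code&gt;SESSIONS&lt;/code&gt; dictionary are hypothetical stand-ins, not the actual service code.&lt;/p&gt;

```python
# Hypothetical tree: node ids, questions and option labels are illustrative.
TREE = {
    "root": {"question": "What is the issue about?",
             "options": {"subscription": "sub", "payment": "pay"}},
    "sub": {"question": "What went wrong with the subscription?",
            "options": {"add": "resolved", "delete": "agent"}},
    "pay": {"question": "What went wrong with the payment?",
            "options": {"failed": "agent", "incorrect_charge": "agent"}},
    "resolved": {"result": "Issue Resolved"},
    "agent": {"result": "Escalating to Customer Support Agent"},
}

# session_id -> current node id (in-memory stand-in for a real session store)
SESSIONS = {}


def next_step(session_id, answer=None):
    """Advance one step for this session and return what the UI should show."""
    current = SESSIONS.get(session_id, "root")
    node = TREE[current]
    if answer is not None:
        # Unknown answers fall back to a human agent.
        current = node["options"].get(answer, "agent")
        node = TREE[current]
    SESSIONS[session_id] = current
    if "result" in node:
        SESSIONS.pop(session_id, None)  # flow finished, drop the session state
        return {"done": True, "result": node["result"]}
    return {"done": False, "question": node["question"],
            "options": list(node["options"])}


print(next_step("s1"))             # first question
print(next_step("s1", "payment"))  # follow-up question
print(next_step("s1", "failed"))   # final outcome
```

&lt;p&gt;The UI only ever sees one question and its options, which keeps the client simple.&lt;/p&gt;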

&lt;p&gt;For example, if the user is on the &lt;em&gt;Add Subscription&lt;/em&gt; page and opens support, the system can already start from a relevant branch of the tree instead of asking generic questions.&lt;/p&gt;

&lt;p&gt;This makes the experience feel much smarter.&lt;/p&gt;
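
&lt;p&gt;One way to start from a relevant branch is to let the current screen prefill some answers and then ask only the first unanswered question. The sketch below assumes hypothetical screen names; the question keys mirror the ones used in the tree above, and only questions and branches are modelled.&lt;/p&gt;

```python
# Hypothetical screen names mapped to answers the app already knows.
SCREEN_PREFILL = {
    "add_subscription": {"issue_type": "subscription", "subscription_problem": "add"},
    "payments": {"issue_type": "payment"},
}

# Only questions and branches are modelled here; leaves are None.
QUESTIONS = {
    "question": "issue_type",
    "options": {
        "subscription": {
            "question": "subscription_problem",
            "options": {
                "add": {"question": "add_issue_type",
                        "options": {"ui": None, "error": None}},
                "manage": {"question": "manage_issue",
                           "options": {"edit": None, "delete": None}},
            },
        },
        "payment": {"question": "payment_problem",
                    "options": {"failed": None, "incorrect_charge": None}},
    },
}


def first_unanswered(node, context):
    """Walk the tree using known answers and return the next question to ask."""
    while node is not None:
        answer = context.get(node["question"])
        if answer not in node["options"]:
            return node["question"]
        node = node["options"][answer]
    return None  # every question on this path is already answered


def starting_question(screen):
    return first_unanswered(QUESTIONS, dict(SCREEN_PREFILL.get(screen, {})))


print(starting_question("add_subscription"))  # prints: add_issue_type
print(starting_question("home"))              # prints: issue_type
```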

&lt;h3&gt;
  
  
  Reducing Human Effort
&lt;/h3&gt;

&lt;p&gt;The biggest win was reducing repeated manual support effort.&lt;/p&gt;

&lt;p&gt;Earlier, if 20 users had trouble adding a new subscription, all 20 would:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  read the same FAQ&lt;/li&gt;
&lt;li&gt;  message support&lt;/li&gt;
&lt;li&gt;  wait for a response&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, the tree handles the majority of these repeated issues automatically.&lt;/p&gt;

&lt;p&gt;Some common examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  unable to add a new subscription&lt;/li&gt;
&lt;li&gt;  confusion between monthly and yearly plans&lt;/li&gt;
&lt;li&gt;  payment failure after renewal&lt;/li&gt;
&lt;li&gt;  missing notification alerts&lt;/li&gt;
&lt;li&gt;  dashboard analytics not updating&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are &lt;strong&gt;pattern-based problems&lt;/strong&gt;, which makes them perfect for tree traversal.&lt;/p&gt;

&lt;p&gt;This means human agents only need to handle edge cases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Escalation Path
&lt;/h3&gt;

&lt;p&gt;Every branch ends with one of two outcomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Resolved automatically&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Escalate to human agent&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That fallback is important.&lt;/p&gt;

&lt;p&gt;Because no matter how good the tree is, some cases will always need human judgment.&lt;/p&gt;

&lt;p&gt;The system should help users first, not trap them.&lt;/p&gt;

&lt;p&gt;That balance is what makes it practical inside a larger product.&lt;/p&gt;

&lt;h3&gt;
  
  
  What I Learned Building This
&lt;/h3&gt;

&lt;p&gt;Building a &lt;strong&gt;help and support system&lt;/strong&gt; like this taught me something simple but important:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Most problems are not unique; they’re repeated patterns.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once you accept that, the solution becomes clearer.&lt;/p&gt;

&lt;p&gt;You don’t need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  more FAQs&lt;/li&gt;
&lt;li&gt;  more documentation&lt;/li&gt;
&lt;li&gt;  more support agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need a system that can &lt;strong&gt;recognize patterns and guide users&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The decision-tree approach worked well because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  it simplifies user choices&lt;/li&gt;
&lt;li&gt;  it reduces cognitive load&lt;/li&gt;
&lt;li&gt;  it scales without increasing support effort&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But it’s not perfect.&lt;/p&gt;

&lt;p&gt;Some trade-offs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Deep trees can become hard to manage&lt;/li&gt;
&lt;li&gt;  Poorly designed questions can confuse users&lt;/li&gt;
&lt;li&gt;  Edge cases still require human support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the goal isn’t to replace support.&lt;/p&gt;

&lt;p&gt;It’s to &lt;strong&gt;handle the predictable 70–80% of issues automatically&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;A help system shouldn’t just exist; it should &lt;strong&gt;actively solve problems&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Most applications treat support as a secondary feature:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  static pages&lt;/li&gt;
&lt;li&gt;  long FAQs&lt;/li&gt;
&lt;li&gt;  contact forms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But users don’t want information.&lt;/p&gt;

&lt;p&gt;They want &lt;strong&gt;resolution&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;By turning the help system into a &lt;strong&gt;decision-tree based flow&lt;/strong&gt;, you shift from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  passive content → guided experience&lt;/li&gt;
&lt;li&gt;  repeated queries → automated solutions&lt;/li&gt;
&lt;li&gt;  manual effort → scalable support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And the result is something that feels natural.&lt;/p&gt;

&lt;p&gt;Users don’t feel like they’re navigating a system.&lt;br&gt;&lt;br&gt;
They feel like the system understands them.&lt;/p&gt;

&lt;p&gt;That’s when support stops being a feature…&lt;/p&gt;

&lt;p&gt;and starts becoming part of the product experience.&lt;/p&gt;

</description>
      <category>systemdesignconcepts</category>
      <category>softwareengineering</category>
      <category>backenddevelopment</category>
      <category>userexperience</category>
    </item>
    <item>
      <title>Async Processing: The Secret to Surviving Spikes</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Mon, 27 Apr 2026 16:05:51 +0000</pubDate>
      <link>https://forem.com/akshatjme/async-processing-the-secret-to-surviving-spikes-4n45</link>
      <guid>https://forem.com/akshatjme/async-processing-the-secret-to-surviving-spikes-4n45</guid>
      <description>&lt;h4&gt;
  
  
  How decoupling work from requests helps systems stay stable under load
&lt;/h4&gt;

&lt;p&gt;In the previous part, we saw the limitations of synchronous systems.&lt;/p&gt;

&lt;p&gt;When every request waits for all operations to complete, performance suffers under load. Resources remain blocked, and slow dependencies affect the entire flow.&lt;/p&gt;

&lt;p&gt;Asynchronous processing takes a different approach.&lt;/p&gt;

&lt;p&gt;Instead of doing all work during the request, it separates immediate responses from background work.&lt;/p&gt;

&lt;p&gt;This shift changes how systems handle load, especially during traffic spikes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu15n1j4s1npi631d3wya.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu15n1j4s1npi631d3wya.png" alt="Async Processing" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Decoupling work from requests
&lt;/h3&gt;

&lt;p&gt;In an asynchronous system, not all work is done in real time.&lt;/p&gt;

&lt;p&gt;The request handles only what is necessary for an immediate response.&lt;br&gt;&lt;br&gt;
The remaining work is moved to background processing.&lt;/p&gt;

&lt;p&gt;This reduces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  request duration&lt;/li&gt;
&lt;li&gt;  resource usage during the request&lt;/li&gt;
&lt;li&gt;  dependency on slow operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By decoupling work, the system avoids holding resources for long periods and improves overall throughput.&lt;/p&gt;
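
&lt;p&gt;A minimal sketch of this split, using only the Python standard library: the request handler enqueues the slow work and returns immediately, while a background worker drains the queue. The function names and the "receipt" job are assumptions made for the example; a production system would typically use a durable queue or broker instead of an in-process one.&lt;/p&gt;

```python
import queue
import threading

jobs = queue.Queue()
processed = []


def handle_request(order_id):
    """Do only the work needed for the response, then defer the rest."""
    jobs.put(order_id)  # slow work is handed to the background worker
    return {"status": "accepted", "order_id": order_id}


def worker():
    """Background worker: drains the queue independently of requests."""
    while True:
        order_id = jobs.get()
        if order_id is None:  # sentinel value: stop the worker
            jobs.task_done()
            break
        # Stand-in for slow work such as sending an email.
        processed.append(f"receipt sent for order {order_id}")
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [handle_request(i) for i in range(3)]

jobs.join()     # demo only: wait until the background work is done
jobs.put(None)  # ask the worker to stop
thread.join()

print(responses[0])
print(processed)
```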

&lt;h3&gt;
  
  
  Queues absorb traffic spikes
&lt;/h3&gt;

&lt;p&gt;Queues are a core part of asynchronous systems.&lt;/p&gt;

&lt;p&gt;Instead of processing all requests immediately, incoming tasks are stored in a queue and processed at a controlled rate.&lt;/p&gt;

&lt;p&gt;This creates a buffer between incoming traffic and system capacity.&lt;/p&gt;

&lt;p&gt;During traffic spikes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  requests are queued instead of rejected&lt;/li&gt;
&lt;li&gt;  processing happens gradually&lt;/li&gt;
&lt;li&gt;  system load remains stable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Queues do not eliminate load, but they prevent sudden overload.&lt;/p&gt;
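
&lt;p&gt;The buffering effect can be seen in a small simulation: arrivals vary per tick, but the system processes at most a fixed number of tasks per tick. The traffic numbers are illustrative, not taken from a real system.&lt;/p&gt;

```python
def simulate_queue(arrivals_per_tick, capacity_per_tick):
    """Return the backlog after each tick for a fixed processing rate."""
    backlog = 0
    history = []
    for arrivals in arrivals_per_tick:
        backlog += arrivals
        # Process at most `capacity_per_tick` tasks in this tick.
        backlog -= min(backlog, capacity_per_tick)
        history.append(backlog)
    return history


# Illustrative traffic: a steady 5 tasks per tick with one spike of 40.
arrivals = [5, 5, 40, 5, 5, 5, 5, 5]
print(simulate_queue(arrivals, capacity_per_tick=10))
# prints: [0, 0, 30, 25, 20, 15, 10, 5]
```

&lt;p&gt;The spike becomes a backlog that drains over the following ticks, while the processing rate never exceeds the configured capacity.&lt;/p&gt;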

&lt;h3&gt;
  
  
  Improved user experience
&lt;/h3&gt;

&lt;p&gt;Asynchronous systems improve perceived performance.&lt;/p&gt;

&lt;p&gt;Users receive faster responses because the system does not wait for all operations to complete.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  a request can be accepted immediately&lt;/li&gt;
&lt;li&gt;  heavy processing happens in the background&lt;/li&gt;
&lt;li&gt;  results are delivered later&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This reduces user wait time and makes the system feel more responsive.&lt;/p&gt;

&lt;h3&gt;
  
  
  Event-driven architecture basics
&lt;/h3&gt;

&lt;p&gt;Asynchronous systems are often built around events.&lt;/p&gt;

&lt;p&gt;Instead of calling services directly and waiting for responses, components emit events when something happens.&lt;/p&gt;

&lt;p&gt;Other components react to these events independently.&lt;/p&gt;

&lt;p&gt;This model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  reduces direct dependencies between services&lt;/li&gt;
&lt;li&gt;  allows work to happen in parallel&lt;/li&gt;
&lt;li&gt;  improves system flexibility&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Event-driven systems shift the focus from request flow to state changes.&lt;/p&gt;
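
&lt;p&gt;The emit-and-react model can be sketched with a tiny in-process event bus. This is a simplification: handlers here run synchronously in the same process, whereas a real event-driven system would usually deliver events through a broker. The event name and handlers are hypothetical.&lt;/p&gt;

```python
from collections import defaultdict

# event name -> list of handler functions
subscribers = defaultdict(list)


def subscribe(event_name, handler):
    subscribers[event_name].append(handler)


def emit(event_name, payload):
    """Notify every subscriber; the emitter does not know who is listening."""
    for handler in subscribers[event_name]:
        handler(payload)


log = []
subscribe("order_placed", lambda event: log.append(f"email for order {event['id']}"))
subscribe("order_placed", lambda event: log.append(f"analytics for order {event['id']}"))

emit("order_placed", {"id": 42})
print(log)
```

&lt;p&gt;Adding a new reaction means adding a subscriber, without touching the code that emits the event.&lt;/p&gt;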

&lt;h3&gt;
  
  
  Better resource utilization
&lt;/h3&gt;

&lt;p&gt;Asynchronous processing allows better use of system resources.&lt;/p&gt;

&lt;p&gt;Since requests are shorter and less blocking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  threads are freed faster&lt;/li&gt;
&lt;li&gt;  connections are reused efficiently&lt;/li&gt;
&lt;li&gt;  overall throughput increases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Background workers can process tasks independently, making better use of available capacity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Isolation of failures
&lt;/h3&gt;

&lt;p&gt;In synchronous systems, failure in one step affects the entire request.&lt;/p&gt;

&lt;p&gt;In asynchronous systems, failures can be isolated.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  a background job can fail without blocking user requests&lt;/li&gt;
&lt;li&gt;  retries can be handled separately&lt;/li&gt;
&lt;li&gt;  issues remain contained within specific components&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This reduces the impact of failures on the overall system.&lt;/p&gt;
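
&lt;p&gt;A rough sketch of handling retries separately: each background job is retried a few times, and jobs that keep failing are set aside instead of affecting anything else. The job names and the dead-letter list are assumptions made for the example.&lt;/p&gt;

```python
def run_with_retries(job, max_attempts=3):
    """Run a background job; return (succeeded, attempts_used)."""
    for attempt in range(1, max_attempts + 1):
        try:
            job()
            return True, attempt
        except Exception:
            continue  # a real worker would log the error and back off here
    return False, max_attempts


calls = {"count": 0}


def flaky_job():
    """Fails twice, then succeeds on the third call."""
    calls["count"] += 1
    if calls["count"] != 3:
        raise RuntimeError("temporary failure")


def broken_job():
    raise RuntimeError("permanent failure")


dead_letter = []  # jobs that failed every attempt, kept for later inspection
results = {}
for name, job in [("flaky", flaky_job), ("broken", broken_job)]:
    ok, attempts = run_with_retries(job)
    results[name] = (ok, attempts)
    if not ok:
        dead_letter.append(name)

print(results)
print(dead_letter)
```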

&lt;h3&gt;
  
  
  Trade-offs of asynchronous systems
&lt;/h3&gt;

&lt;p&gt;Asynchronous systems are not without challenges.&lt;/p&gt;

&lt;p&gt;They introduce:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  increased system complexity&lt;/li&gt;
&lt;li&gt;  delayed consistency&lt;/li&gt;
&lt;li&gt;  need for monitoring background jobs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Debugging becomes harder because work is distributed across multiple components.&lt;/p&gt;

&lt;p&gt;Despite these trade-offs, the benefits are significant for systems under variable load.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Asynchronous processing changes how systems handle work.&lt;/p&gt;

&lt;p&gt;By separating immediate responses from background tasks, systems can reduce load, improve responsiveness, and handle traffic spikes more effectively.&lt;/p&gt;

&lt;p&gt;This approach is especially useful in environments where demand is unpredictable.&lt;/p&gt;

&lt;p&gt;In the next part, we will explore why APIs feel slow even when backend systems are fast.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>backend</category>
      <category>performance</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>The Hidden Cost of Synchronous Systems</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Sat, 25 Apr 2026 15:05:22 +0000</pubDate>
      <link>https://forem.com/akshatjme/the-hidden-cost-of-synchronous-systems-d33</link>
      <guid>https://forem.com/akshatjme/the-hidden-cost-of-synchronous-systems-d33</guid>
      <description>&lt;h4&gt;
  
  
  Why waiting for every step to finish can quietly slow down your entire backend
&lt;/h4&gt;

&lt;p&gt;In previous parts, we explored how system design choices affect performance.&lt;/p&gt;

&lt;p&gt;One such choice is how work is executed.&lt;/p&gt;

&lt;p&gt;Many backend systems follow a &lt;strong&gt;synchronous model&lt;/strong&gt;, where each step waits for the previous one to complete.&lt;/p&gt;

&lt;p&gt;This approach is simple and easy to reason about.&lt;/p&gt;

&lt;p&gt;However, under load, it introduces hidden costs that affect performance, scalability, and user experience.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F33wj1a8fzd1ymdvsd9ae.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F33wj1a8fzd1ymdvsd9ae.png" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Blocking requests
&lt;/h3&gt;

&lt;p&gt;In a synchronous system, a request waits until all operations are complete.&lt;/p&gt;

&lt;p&gt;During this time, system resources remain occupied.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  threads stay blocked&lt;/li&gt;
&lt;li&gt;  connections remain open&lt;/li&gt;
&lt;li&gt;  memory is held&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This reduces the number of requests the system can handle at the same time.&lt;/p&gt;

&lt;p&gt;As traffic increases, blocked resources begin to accumulate, leading to slower responses and reduced throughput.&lt;/p&gt;

&lt;h3&gt;
  
  
  Slow dependencies lead to slow systems
&lt;/h3&gt;

&lt;p&gt;A synchronous flow depends on the speed of each component.&lt;/p&gt;

&lt;p&gt;If one dependency is slow, the entire request becomes slow.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  database queries&lt;/li&gt;
&lt;li&gt;  external APIs&lt;/li&gt;
&lt;li&gt;  internal services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each step adds to the total response time.&lt;/p&gt;

&lt;p&gt;The system’s performance becomes limited by its slowest dependency.&lt;/p&gt;

&lt;p&gt;This creates a chain where delays propagate across the entire request lifecycle.&lt;/p&gt;

&lt;h3&gt;
  
  
  User-perceived latency
&lt;/h3&gt;

&lt;p&gt;In synchronous systems, users wait for the full operation to complete.&lt;/p&gt;

&lt;p&gt;Even if some parts of the work are not immediately required, the response is delayed until everything finishes.&lt;/p&gt;

&lt;p&gt;This increases perceived latency.&lt;/p&gt;

&lt;p&gt;From the user’s perspective, the system feels slow, even if individual operations are fast.&lt;/p&gt;

&lt;p&gt;Reducing perceived latency is not only about speed, but also about how responses are structured.&lt;/p&gt;

&lt;h3&gt;
  
  
  No parallelism advantage
&lt;/h3&gt;

&lt;p&gt;Synchronous execution processes tasks in sequence.&lt;/p&gt;

&lt;p&gt;This limits the ability to use available resources efficiently.&lt;/p&gt;

&lt;p&gt;Many operations can be performed independently, but in a synchronous flow they are executed one after another.&lt;/p&gt;

&lt;p&gt;This results in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  underutilized resources&lt;/li&gt;
&lt;li&gt;  longer total processing time&lt;/li&gt;
&lt;li&gt;  lower system efficiency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Parallel execution can reduce total latency, but synchronous systems do not take full advantage of it.&lt;/p&gt;
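
&lt;p&gt;As a minimal sketch of the difference, the example below runs three independent calls first in sequence and then concurrently with &lt;code&gt;asyncio.gather&lt;/code&gt;. The dependency names and the 100 ms delays are made up for illustration; &lt;code&gt;asyncio.sleep&lt;/code&gt; stands in for real I/O.&lt;/p&gt;

```python
import asyncio
import time


async def call_dependency(name: str, seconds: float) -> str:
    """Stand-in for an I/O-bound call such as a query or an HTTP request."""
    await asyncio.sleep(seconds)
    return name


async def sequential() -> float:
    start = time.perf_counter()
    await call_dependency("profile", 0.1)
    await call_dependency("orders", 0.1)
    await call_dependency("recommendations", 0.1)
    return time.perf_counter() - start


async def concurrent() -> float:
    start = time.perf_counter()
    # All three calls wait at the same time instead of one after another.
    await asyncio.gather(
        call_dependency("profile", 0.1),
        call_dependency("orders", 0.1),
        call_dependency("recommendations", 0.1),
    )
    return time.perf_counter() - start


sequential_seconds = asyncio.run(sequential())
concurrent_seconds = asyncio.run(concurrent())
print(f"sequential: {sequential_seconds:.2f}s, concurrent: {concurrent_seconds:.2f}s")
```

&lt;p&gt;The sequential version takes roughly the sum of the delays, while the concurrent version takes roughly the longest single delay.&lt;/p&gt;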

&lt;h3&gt;
  
  
  Limited scalability under load
&lt;/h3&gt;

&lt;p&gt;As traffic increases, synchronous systems struggle to scale.&lt;/p&gt;

&lt;p&gt;Each request holds resources for its entire duration.&lt;br&gt;&lt;br&gt;
More requests require more threads, more connections, and more memory.&lt;/p&gt;

&lt;p&gt;At some point, the system reaches its limits.&lt;/p&gt;

&lt;p&gt;This makes scaling more expensive and less efficient compared to systems that release resources early.&lt;/p&gt;

&lt;h3&gt;
  
  
  Coupling between operations
&lt;/h3&gt;

&lt;p&gt;Synchronous flows create tight coupling between steps.&lt;/p&gt;

&lt;p&gt;Each operation depends on the previous one to complete successfully.&lt;/p&gt;

&lt;p&gt;If one step fails or slows down, the entire request is affected.&lt;/p&gt;

&lt;p&gt;This reduces flexibility and makes systems more sensitive to failures.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Synchronous systems are simple and predictable, but they come with trade-offs.&lt;/p&gt;

&lt;p&gt;They block resources, amplify the impact of slow dependencies, and limit how efficiently a system can scale.&lt;/p&gt;

&lt;p&gt;These costs are not always visible at small scale, but they become significant under load.&lt;/p&gt;

&lt;p&gt;Understanding these limitations is important when designing systems that need to handle real-world traffic.&lt;/p&gt;

&lt;p&gt;In the next part, we will explore asynchronous processing and how it helps systems handle load more efficiently.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>design</category>
      <category>softwaredevelopment</category>
      <category>performance</category>
      <category>systemsthinking</category>
    </item>
    <item>
      <title>Why Microservices Make Performance Worse (If Done Wrong)</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Thu, 23 Apr 2026 16:16:27 +0000</pubDate>
      <link>https://forem.com/akshatjme/why-microservices-make-performance-worse-if-done-wrong-2i5e</link>
      <guid>https://forem.com/akshatjme/why-microservices-make-performance-worse-if-done-wrong-2i5e</guid>
      <description>&lt;h4&gt;
  
  
  How breaking your system into services can increase complexity and slow everything down
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://medium.com/codetodeploy/designing-systems-that-dont-collapse-under-pressure-11ae443fcdfc" rel="noopener noreferrer"&gt;In the previous part&lt;/a&gt;, we discussed how to design systems that survive under pressure.&lt;/p&gt;

&lt;p&gt;Microservices are often seen as a solution to scaling and reliability.&lt;/p&gt;

&lt;p&gt;But in practice, many systems become slower and harder to manage after moving to microservices.&lt;/p&gt;

&lt;p&gt;The problem is not microservices themselves.&lt;br&gt;&lt;br&gt;
The problem is how they are used.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm0307za24k7wtsx7licv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm0307za24k7wtsx7licv.png" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Too many network calls
&lt;/h3&gt;

&lt;p&gt;In a monolithic system, components communicate in memory.&lt;/p&gt;

&lt;p&gt;In microservices, communication happens over the network.&lt;/p&gt;

&lt;p&gt;Every request between services adds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  network latency&lt;/li&gt;
&lt;li&gt;  serialization and deserialization cost&lt;/li&gt;
&lt;li&gt;  additional failure points&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A single user request may trigger multiple internal calls.&lt;/p&gt;

&lt;p&gt;This increases total response time.&lt;/p&gt;

&lt;p&gt;What was once a fast internal function call becomes a slower network operation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chatty services problem
&lt;/h3&gt;

&lt;p&gt;Microservices often become too dependent on each other.&lt;/p&gt;

&lt;p&gt;Instead of one efficient call, services make many small calls.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  service A calls service B&lt;/li&gt;
&lt;li&gt;  service B calls service C&lt;/li&gt;
&lt;li&gt;  service C returns partial data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This creates a chain of requests.&lt;/p&gt;

&lt;p&gt;Each call adds latency.&lt;br&gt;&lt;br&gt;
Together, they create significant overhead.&lt;/p&gt;

&lt;p&gt;This pattern is known as chatty services.&lt;/p&gt;

&lt;p&gt;It is one of the most common causes of slow systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Distributed failures
&lt;/h3&gt;

&lt;p&gt;In a distributed system, failures spread easily.&lt;/p&gt;

&lt;p&gt;If one service becomes slow or unavailable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  dependent services are affected&lt;/li&gt;
&lt;li&gt;  requests start timing out&lt;/li&gt;
&lt;li&gt;  retries increase traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This can lead to cascading failures across the system.&lt;/p&gt;

&lt;p&gt;Unlike a monolith, where a failure usually stays inside a single process, microservices increase the surface area of failure: every network hop is another place where a request can fail or stall.&lt;/p&gt;

&lt;h3&gt;
  
  
  Harder debugging
&lt;/h3&gt;

&lt;p&gt;Debugging performance issues becomes more complex.&lt;/p&gt;

&lt;p&gt;In a single system, it is easier to trace a request.&lt;/p&gt;

&lt;p&gt;In microservices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  requests pass through multiple services&lt;/li&gt;
&lt;li&gt;  logs are spread across systems&lt;/li&gt;
&lt;li&gt;  latency is distributed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finding the root cause requires tracing across multiple components.&lt;/p&gt;

&lt;p&gt;Without proper observability, diagnosing issues becomes difficult.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data consistency challenges
&lt;/h3&gt;

&lt;p&gt;Microservices often manage separate databases.&lt;/p&gt;

&lt;p&gt;This improves independence but creates consistency challenges.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  data may not be updated at the same time&lt;/li&gt;
&lt;li&gt;  systems may temporarily disagree&lt;/li&gt;
&lt;li&gt;  additional logic is required to handle this&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Managing consistency adds complexity and can impact performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Overengineering too early
&lt;/h3&gt;

&lt;p&gt;Microservices are often adopted too early.&lt;/p&gt;

&lt;p&gt;For small systems, they introduce:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  more services to manage&lt;/li&gt;
&lt;li&gt;  more deployment complexity&lt;/li&gt;
&lt;li&gt;  more communication overhead&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before scaling becomes a real problem, this added complexity slows development and hurts performance.&lt;/p&gt;

&lt;p&gt;A simple system becomes unnecessarily complicated.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Microservices are powerful, but they are not a default solution.&lt;/p&gt;

&lt;p&gt;They introduce network overhead, increase system complexity, and make failures harder to manage.&lt;/p&gt;

&lt;p&gt;When used correctly, they help systems scale.&lt;br&gt;&lt;br&gt;
When used too early or without proper design, they make performance worse.&lt;/p&gt;

&lt;p&gt;Choosing the right architecture depends on the problem, not the trend.&lt;/p&gt;

&lt;p&gt;In the next part, we will look at synchronous systems and how waiting on responses can slow down your backend.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>designsystems</category>
      <category>backenddevelopment</category>
      <category>microservicearchitecture</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Designing Systems That Don’t Collapse Under Pressure</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Tue, 21 Apr 2026 15:49:46 +0000</pubDate>
      <link>https://forem.com/akshatjme/designing-systems-that-dont-collapse-under-pressure-2jcg</link>
      <guid>https://forem.com/akshatjme/designing-systems-that-dont-collapse-under-pressure-2jcg</guid>
      <description>&lt;h4&gt;
  
  
  How to build backend systems that continue to work even when things go wrong
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://medium.com/@akshatjme/list/the-hidden-reasons-your-backend-fails-under-pressure-909d1c28564d" rel="noopener noreferrer"&gt;In earlier parts&lt;/a&gt;, we saw how systems fail under load.&lt;/p&gt;

&lt;p&gt;Traffic increases, dependencies slow down, and small issues turn into full outages.&lt;/p&gt;

&lt;p&gt;The goal of system design is not to avoid failure completely.&lt;/p&gt;

&lt;p&gt;It is to &lt;strong&gt;handle failure in a controlled way&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A well-designed system does not collapse under pressure.&lt;br&gt;&lt;br&gt;
It adapts, limits damage, and continues to function.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhp4bg7s6vtwz08oi6kyj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhp4bg7s6vtwz08oi6kyj.png" alt="Designing Systems That Don’t Collapse Under Pressure" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Design for failure, not perfection
&lt;/h3&gt;

&lt;p&gt;No system runs perfectly all the time.&lt;/p&gt;

&lt;p&gt;Dependencies fail. Networks slow down. Traffic becomes unpredictable.&lt;/p&gt;

&lt;p&gt;Designing for perfect conditions creates fragile systems.&lt;/p&gt;

&lt;p&gt;Instead, systems should assume that failures will happen.&lt;/p&gt;

&lt;p&gt;This changes how components are built:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  what happens if a service is unavailable&lt;/li&gt;
&lt;li&gt;  how the system responds to delays&lt;/li&gt;
&lt;li&gt;  how errors are handled&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Planning for failure makes systems more stable under real conditions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add timeouts everywhere
&lt;/h3&gt;

&lt;p&gt;Every external call should have a timeout.&lt;/p&gt;

&lt;p&gt;Without timeouts, a request can wait indefinitely for a response.&lt;br&gt;&lt;br&gt;
This blocks threads, connections, and memory.&lt;/p&gt;

&lt;p&gt;Under load, these blocked resources accumulate and create pressure on the system.&lt;/p&gt;

&lt;p&gt;Timeouts ensure that requests fail fast instead of waiting too long.&lt;/p&gt;

&lt;p&gt;This helps in freeing resources and preventing cascading slowdowns.&lt;/p&gt;
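
&lt;p&gt;One way to express this in Python is &lt;code&gt;asyncio.wait_for&lt;/code&gt;, which cancels a call once its deadline passes. This is only a sketch: &lt;code&gt;slow_dependency&lt;/code&gt; is a hypothetical stand-in for a real external call, and the 100 ms budget is an arbitrary choice.&lt;/p&gt;

```python
import asyncio


async def slow_dependency() -> str:
    """Stand-in for a dependency that has stopped responding promptly."""
    await asyncio.sleep(5)
    return "payload"


async def fetch_with_timeout(timeout_seconds: float) -> str:
    try:
        # wait_for cancels the call once the deadline passes.
        return await asyncio.wait_for(slow_dependency(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # Fail fast so the caller's resources are released.
        return "timed out"


result = asyncio.run(fetch_with_timeout(0.1))
print(result)
```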

&lt;h3&gt;
  
  
  Use retries carefully
&lt;/h3&gt;

&lt;p&gt;Retries are useful, but they can also be harmful.&lt;/p&gt;

&lt;p&gt;When a request fails, retrying may succeed if the failure is temporary.&lt;/p&gt;

&lt;p&gt;However, under high load, retries increase traffic.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  one request becomes multiple requests&lt;/li&gt;
&lt;li&gt;  load increases on already stressed services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Uncontrolled retries can worsen the situation.&lt;/p&gt;

&lt;p&gt;Retries should be limited, delayed, and used only when necessary.&lt;/p&gt;
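
&lt;p&gt;A minimal sketch of “limited and delayed” retries, assuming a hypothetical &lt;code&gt;TransientError&lt;/code&gt; that marks failures worth retrying. The attempt limit and delays are illustrative and would need tuning for a real service.&lt;/p&gt;

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def call_with_retries(operation, max_attempts=3, base_delay=0.05):
    """Run operation, retrying transient failures a limited number of times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up instead of adding more load
            # Exponential backoff with jitter so clients do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            time.sleep(delay + random.uniform(0, delay))


attempts = []


def flaky_operation():
    """Fails twice, then succeeds, to imitate a temporary failure."""
    attempts.append(1)
    if len(attempts) != 3:
        raise TransientError("temporary failure")
    return "ok"


result = call_with_retries(flaky_operation)
print(result, "after", len(attempts), "attempts")
```

&lt;p&gt;Only errors known to be temporary are retried here; anything else propagates immediately.&lt;/p&gt;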

&lt;h3&gt;
  
  
  Introduce circuit breakers
&lt;/h3&gt;

&lt;p&gt;A circuit breaker stops requests to a failing service.&lt;/p&gt;

&lt;p&gt;When a dependency is slow or unavailable, continuing to call it wastes resources.&lt;/p&gt;

&lt;p&gt;Circuit breakers detect failures and temporarily block calls.&lt;/p&gt;

&lt;p&gt;This prevents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  unnecessary load on failing services&lt;/li&gt;
&lt;li&gt;  delays in dependent systems&lt;/li&gt;
&lt;li&gt;  spread of failures across the system&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once the service recovers, requests can resume.&lt;/p&gt;
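
&lt;p&gt;The idea can be sketched in a few lines. This is a simplified, single-threaded illustration rather than a production implementation; the class name, threshold, and cool-down period are all assumptions made for the example.&lt;/p&gt;

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.reset_seconds = reset_seconds
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.reset_seconds > time.monotonic() - self.opened_at:
                raise CircuitOpenError("dependency is marked as failing")
            # Cool-down elapsed: let one trial call through (half-open).
            self.opened_at = None
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # open the circuit
            raise
        self.failures = 0  # a success closes the circuit fully
        return result


def failing_service():
    """Hypothetical dependency that is currently down."""
    raise ConnectionError("service unavailable")


breaker = CircuitBreaker(failure_threshold=2, reset_seconds=60.0)
outcomes = []
for _ in range(4):
    try:
        breaker.call(failing_service)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ConnectionError:
        outcomes.append("failed")

print(outcomes)
```

&lt;p&gt;After two real failures the breaker opens, and the remaining calls are rejected locally without touching the failing service.&lt;/p&gt;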

&lt;h3&gt;
  
  
  Decouple components
&lt;/h3&gt;

&lt;p&gt;Tightly coupled systems fail together.&lt;/p&gt;

&lt;p&gt;If one component depends directly on another, failure spreads quickly.&lt;/p&gt;

&lt;p&gt;Decoupling reduces this risk.&lt;/p&gt;

&lt;p&gt;This can be done using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  asynchronous communication&lt;/li&gt;
&lt;li&gt;  message queues&lt;/li&gt;
&lt;li&gt;  clear service boundaries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Loose coupling ensures that one failure does not bring down the entire system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use queues to absorb spikes
&lt;/h3&gt;

&lt;p&gt;Traffic is not always steady.&lt;/p&gt;

&lt;p&gt;Sudden spikes can overload services.&lt;/p&gt;

&lt;p&gt;Queues act as buffers.&lt;/p&gt;

&lt;p&gt;Instead of processing everything immediately, requests are stored and handled gradually.&lt;/p&gt;

&lt;p&gt;This helps in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  smoothing traffic&lt;/li&gt;
&lt;li&gt;  protecting downstream services&lt;/li&gt;
&lt;li&gt;  maintaining stability during bursts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Queues do not remove load, but they control how it is handled.&lt;/p&gt;
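
&lt;p&gt;A small in-process sketch of this buffering, using Python’s standard &lt;code&gt;queue&lt;/code&gt; and &lt;code&gt;threading&lt;/code&gt; modules. A real system would more likely use a message broker; the queue size and the 10 ms of simulated work are assumptions for the example.&lt;/p&gt;

```python
import queue
import threading
import time

jobs = queue.Queue(maxsize=100)  # bounded, so the buffer itself cannot grow forever
processed = []


def worker():
    """Drain the queue at the pace the downstream service can sustain."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            break
        time.sleep(0.01)  # stand-in for real work
        processed.append(job)


consumer = threading.Thread(target=worker)
consumer.start()

# A burst of 20 requests arrives at once; enqueueing returns immediately.
for request_id in range(20):
    jobs.put(request_id)

jobs.put(None)
consumer.join()
print(f"processed {len(processed)} jobs, in arrival order: {processed == list(range(20))}")
```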

&lt;h3&gt;
  
  
  Monitor meaningful metrics
&lt;/h3&gt;

&lt;p&gt;System health cannot be understood without visibility.&lt;/p&gt;

&lt;p&gt;Important metrics include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  latency&lt;/li&gt;
&lt;li&gt;  error rate&lt;/li&gt;
&lt;li&gt;  throughput&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These metrics show how the system behaves under load.&lt;/p&gt;

&lt;p&gt;Monitoring helps in detecting problems early and understanding where pressure is building.&lt;/p&gt;

&lt;p&gt;Collecting too many metrics is not useful. Focus should be on signals that reflect real system behavior.&lt;/p&gt;

&lt;h3&gt;
  
  
  Keep buffer capacity
&lt;/h3&gt;

&lt;p&gt;Systems should not run at full capacity.&lt;/p&gt;

&lt;p&gt;If CPU, memory, or connections are always near their limits, even a small increase in load can cause failure.&lt;/p&gt;

&lt;p&gt;Keeping buffer capacity provides room to handle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  sudden traffic spikes&lt;/li&gt;
&lt;li&gt;  temporary slowdowns&lt;/li&gt;
&lt;li&gt;  unexpected events&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This headroom is important for stability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Graceful degradation
&lt;/h3&gt;

&lt;p&gt;When a system is under stress, it should not fail completely.&lt;/p&gt;

&lt;p&gt;Instead, it should reduce functionality in a controlled way.&lt;/p&gt;

&lt;p&gt;Examples include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  returning partial data&lt;/li&gt;
&lt;li&gt;  disabling non-critical features&lt;/li&gt;
&lt;li&gt;  serving cached responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This allows the system to remain usable even during issues.&lt;/p&gt;

&lt;p&gt;Graceful degradation improves user experience and prevents total outages.&lt;/p&gt;
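
&lt;p&gt;As an illustration, the sketch below builds a page even when a non-critical dependency fails, falling back to a cached value. The function names, the cache contents, and the &lt;code&gt;degraded&lt;/code&gt; flag are hypothetical.&lt;/p&gt;

```python
cached_recommendations = {"user-1": ["book", "lamp"]}  # hypothetical stale cache


def fetch_recommendations(user_id):
    """Stand-in for a call to a non-critical service that is currently down."""
    raise TimeoutError("recommendation service did not respond")


def build_home_page(user_id):
    page = {"user": user_id, "degraded": False}
    try:
        page["recommendations"] = fetch_recommendations(user_id)
    except (TimeoutError, ConnectionError):
        # Serve stale data (or nothing) rather than failing the whole page.
        page["recommendations"] = cached_recommendations.get(user_id, [])
        page["degraded"] = True
    return page


page = build_home_page("user-1")
print(page)
```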

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;System design is not just about performance.&lt;/p&gt;

&lt;p&gt;It is about how the system behaves under stress.&lt;/p&gt;

&lt;p&gt;Failures are unavoidable, but uncontrolled failures are not.&lt;/p&gt;

&lt;p&gt;By designing for failure, limiting impact, and maintaining control over load, systems can remain stable even under pressure.&lt;/p&gt;

&lt;p&gt;In the next part, we will look at microservices and how they can introduce new performance challenges if not designed carefully.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>scalability</category>
      <category>backenddevelopment</category>
      <category>systemdesignconcepts</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Why Your Database Becomes the Bottleneck</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Sun, 19 Apr 2026 15:59:40 +0000</pubDate>
      <link>https://forem.com/akshatjme/why-your-database-becomes-the-bottleneck-40ab</link>
      <guid>https://forem.com/akshatjme/why-your-database-becomes-the-bottleneck-40ab</guid>
      <description>&lt;h4&gt;
  
  
  Why most backend performance issues eventually lead back to the database
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://akshatjme.medium.com/why-your-backend-stops-performing-overnight-2d5e3a2f263f" rel="noopener noreferrer"&gt;In Part 1&lt;/a&gt;, we saw how systems collapse under pressure.&lt;br&gt;&lt;br&gt;
&lt;a href="https://akshatjme.medium.com/caching-mistakes-that-kill-performance-c0e64ef00cd8" rel="noopener noreferrer"&gt;In Part 2&lt;/a&gt;, we saw how caching can help or hurt.&lt;/p&gt;

&lt;p&gt;Now we look at the most common bottleneck in backend systems:&lt;/p&gt;

&lt;p&gt;The database.&lt;/p&gt;

&lt;p&gt;Almost every request touches it.&lt;br&gt;&lt;br&gt;
So when it slows down, everything slows down.&lt;/p&gt;

&lt;h3&gt;
  
  
  Every request depends on the database
&lt;/h3&gt;

&lt;p&gt;Most backend operations rely on the database.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  fetching data&lt;/li&gt;
&lt;li&gt;  storing updates&lt;/li&gt;
&lt;li&gt;  validating state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes it a central dependency.&lt;/p&gt;

&lt;p&gt;If the database is slow, your entire system feels slow.&lt;br&gt;&lt;br&gt;
There is no easy fallback.&lt;/p&gt;

&lt;h3&gt;
  
  
  Connection pool exhaustion
&lt;/h3&gt;

&lt;p&gt;A database supports only a limited number of concurrent connections, so applications share them through a connection pool.&lt;/p&gt;

&lt;p&gt;Under high traffic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  all connections get used&lt;/li&gt;
&lt;li&gt;  new requests wait in queue&lt;/li&gt;
&lt;li&gt;  latency increases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This happens even before the query runs.&lt;/p&gt;

&lt;p&gt;If the wait time grows, requests start failing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Slow queries under load
&lt;/h3&gt;

&lt;p&gt;Queries that look fast at low traffic become slow at scale.&lt;/p&gt;

&lt;p&gt;Because now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  many queries run together&lt;/li&gt;
&lt;li&gt;  resources are shared&lt;/li&gt;
&lt;li&gt;  contention increases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even a small delay per query becomes a big problem when multiplied across thousands of requests.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lack of proper indexing
&lt;/h3&gt;

&lt;p&gt;Without indexes, the database has to scan large amounts of data, often the entire table, to find results.&lt;/p&gt;

&lt;p&gt;At small scale, it may work.&lt;br&gt;&lt;br&gt;
At large scale, it becomes expensive.&lt;/p&gt;

&lt;p&gt;This increases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  response time&lt;/li&gt;
&lt;li&gt;  CPU usage&lt;/li&gt;
&lt;li&gt;  overall system load&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Indexes are one of the simplest and most ignored optimizations.&lt;/p&gt;
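
&lt;p&gt;The effect is easy to observe with SQLite’s &lt;code&gt;EXPLAIN QUERY PLAN&lt;/code&gt;. The sketch below uses an in-memory database with a made-up &lt;code&gt;users&lt;/code&gt; table; the exact plan wording varies between SQLite versions, and other databases have their own equivalent of this command.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)


def query_plan(sql, params):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


lookup = "SELECT id FROM users WHERE email = ?"
plan_before = query_plan(lookup, ("user500@example.com",))

conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup, ("user500@example.com",))

print("before:", plan_before)
print("after: ", plan_after)
```

&lt;p&gt;Before the index exists the plan reports a scan of the table; afterwards it reports a search that uses &lt;code&gt;idx_users_email&lt;/code&gt;.&lt;/p&gt;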

&lt;h3&gt;
  
  
  N plus 1 query problem
&lt;/h3&gt;

&lt;p&gt;Instead of one efficient query, the system makes many small queries.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  fetch list&lt;/li&gt;
&lt;li&gt;  then fetch details one by one&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This increases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  number of DB calls&lt;/li&gt;
&lt;li&gt;  total latency&lt;/li&gt;
&lt;li&gt;  load on database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At scale, this becomes a major bottleneck.&lt;/p&gt;
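
&lt;p&gt;Here is a small sketch of the pattern and one common fix, using an in-memory SQLite database. The &lt;code&gt;authors&lt;/code&gt; and &lt;code&gt;posts&lt;/code&gt; tables are invented for the example.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors (id, name) VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO posts (author_id, title) VALUES (1, 'A1'), (2, 'B1'), (3, 'C1');
    """
)

# N+1: one query for the list, then one more query per row.
n_plus_one_queries = 0
posts = conn.execute("SELECT id, author_id, title FROM posts ORDER BY id").fetchall()
n_plus_one_queries += 1
slow_result = []
for _post_id, author_id, title in posts:
    name = conn.execute(
        "SELECT name FROM authors WHERE id = ?", (author_id,)
    ).fetchone()[0]
    n_plus_one_queries += 1
    slow_result.append((title, name))

# Single query: let the database join the two tables.
fast_result = conn.execute(
    """
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON authors.id = posts.author_id
    ORDER BY posts.id
    """
).fetchall()

print(n_plus_one_queries, "queries versus 1; same rows:", slow_result == fast_result)
```

&lt;p&gt;With three posts the difference is four queries against one. With thousands of rows, the per-row queries dominate the request.&lt;/p&gt;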

&lt;h3&gt;
  
  
  Write heavy operations
&lt;/h3&gt;

&lt;p&gt;Writes are usually more expensive than reads: each one has to update indexes, be recorded durably, and often take locks.&lt;/p&gt;

&lt;p&gt;Frequent writes can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  lock rows&lt;/li&gt;
&lt;li&gt;  block reads&lt;/li&gt;
&lt;li&gt;  increase contention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When reads and writes happen together, they slow each other down.&lt;/p&gt;

&lt;h3&gt;
  
  
  No read write separation
&lt;/h3&gt;

&lt;p&gt;Using a single database for everything creates pressure.&lt;/p&gt;

&lt;p&gt;Reads and writes compete for the same resources.&lt;/p&gt;

&lt;p&gt;A better approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  primary database for writes&lt;/li&gt;
&lt;li&gt;  replicas for reads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this, scaling becomes harder.&lt;/p&gt;

&lt;h3&gt;
  
  
  Inefficient data modeling
&lt;/h3&gt;

&lt;p&gt;Poor schema design creates long-term problems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  too many joins&lt;/li&gt;
&lt;li&gt;  deeply nested relations&lt;/li&gt;
&lt;li&gt;  unnecessary complexity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes queries slower and harder to optimize.&lt;/p&gt;

&lt;p&gt;Good design reduces work before optimization is even needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unbounded queries
&lt;/h3&gt;

&lt;p&gt;Queries without limits can become dangerous.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  fetching too much data&lt;/li&gt;
&lt;li&gt;  no pagination&lt;/li&gt;
&lt;li&gt;  large scans&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These queries consume more memory and take longer to execute.&lt;/p&gt;

&lt;p&gt;Under load, they affect other queries as well.&lt;/p&gt;
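
&lt;p&gt;One way to bound a query is keyset pagination: fetch a fixed-size page and resume from the last id seen. The sketch below assumes a hypothetical &lt;code&gt;events&lt;/code&gt; table in an in-memory SQLite database and a page size of 10.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)", [(f"event-{i}",) for i in range(25)]
)


def fetch_page(after_id, page_size=10):
    """Return at most page_size rows with an id greater than after_id."""
    return conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()


page_sizes = []
last_seen_id = 0
while True:
    page = fetch_page(last_seen_id)
    if not page:
        break
    page_sizes.append(len(page))
    last_seen_id = page[-1][0]  # resume after the last row already returned

print(page_sizes)
```

&lt;p&gt;No single query ever returns more than one page, regardless of how large the table grows.&lt;/p&gt;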

&lt;h3&gt;
  
  
  Locking and contention
&lt;/h3&gt;

&lt;p&gt;When multiple operations try to access the same data, locks are created.&lt;/p&gt;

&lt;p&gt;Too many locks lead to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  waiting queries&lt;/li&gt;
&lt;li&gt;  slower execution&lt;/li&gt;
&lt;li&gt;  reduced throughput&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is common in write-heavy systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Database scaling limits
&lt;/h3&gt;

&lt;p&gt;Databases have limits.&lt;/p&gt;

&lt;p&gt;Vertical scaling can only go so far:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  CPU limits&lt;/li&gt;
&lt;li&gt;  memory limits&lt;/li&gt;
&lt;li&gt;  cost increases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Beyond a point, adding more power does not help.&lt;/p&gt;

&lt;p&gt;You need better design, not just bigger machines.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;The database has limited resources and handles critical operations.&lt;br&gt;&lt;br&gt;
As load increases, small inefficiencies become visible.&lt;/p&gt;

&lt;p&gt;Most performance issues are not sudden.&lt;br&gt;&lt;br&gt;
They build slowly and show up when the system is under pressure.&lt;/p&gt;

&lt;p&gt;Understanding these patterns helps in avoiding common mistakes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://akshatjme.medium.com/why-your-database-becomes-the-bottleneck-f325cc73e605" rel="noopener noreferrer"&gt;In the next part&lt;/a&gt;, we will look at rate limiting and how controlling traffic can prevent overload.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>systemdesignconcepts</category>
      <category>bottlenecker</category>
      <category>why</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Rate Limiting: The Most Underrated Backend Skill</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Sun, 19 Apr 2026 15:59:33 +0000</pubDate>
      <link>https://forem.com/akshatjme/rate-limiting-the-most-underrated-backend-skill-1n10</link>
      <guid>https://forem.com/akshatjme/rate-limiting-the-most-underrated-backend-skill-1n10</guid>
      <description>&lt;h4&gt;
  
  
  Why controlling traffic matters more than handling it
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://akshatjme.medium.com/why-your-backend-stops-performing-overnight-2d5e3a2f263f" rel="noopener noreferrer"&gt;In Part 1&lt;/a&gt;, we saw how systems collapse under pressure.&lt;br&gt;&lt;br&gt;
&lt;a href="https://akshatjme.medium.com/caching-mistakes-that-kill-performance-c0e64ef00cd8" rel="noopener noreferrer"&gt;In Part 2&lt;/a&gt; and &lt;a href="https://akshatjme.medium.com/why-your-database-becomes-the-bottleneck-f325cc73e605" rel="noopener noreferrer"&gt;Part 3&lt;/a&gt;, we looked at caching and database bottlenecks.&lt;/p&gt;

&lt;p&gt;But there is one concept that directly controls pressure:&lt;/p&gt;

&lt;p&gt;Rate limiting.&lt;/p&gt;

&lt;p&gt;Most systems fail not because they lack resources, but because they accept more traffic than they can handle.&lt;/p&gt;

&lt;h3&gt;
  
  
  Uncontrolled traffic is dangerous
&lt;/h3&gt;

&lt;p&gt;Backend systems are designed with limits.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  CPU is limited&lt;/li&gt;
&lt;li&gt;  memory is limited&lt;/li&gt;
&lt;li&gt;  connections are limited&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If too many requests come in at once, these limits are quickly reached.&lt;/p&gt;

&lt;p&gt;Without control, the system keeps accepting requests until it slows down or crashes.&lt;/p&gt;

&lt;p&gt;In many cases, the failure is not caused by low capacity as such, but by the fact that nothing stops traffic from exceeding it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Not all users should be equal
&lt;/h3&gt;

&lt;p&gt;Treating all requests equally can harm the system.&lt;/p&gt;

&lt;p&gt;Some requests are more important than others.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  critical APIs&lt;/li&gt;
&lt;li&gt;  authenticated users&lt;/li&gt;
&lt;li&gt;  internal services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If everything is handled the same way, important requests can get blocked by less important ones.&lt;/p&gt;

&lt;p&gt;Rate limiting allows prioritization, so critical traffic continues even under load.&lt;/p&gt;

&lt;h3&gt;
  
  
  Handling traffic spikes
&lt;/h3&gt;

&lt;p&gt;Traffic is not always consistent.&lt;/p&gt;

&lt;p&gt;Sudden spikes can happen due to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  new feature releases&lt;/li&gt;
&lt;li&gt;  external events&lt;/li&gt;
&lt;li&gt;  viral traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even a well-designed system can struggle with sudden bursts.&lt;/p&gt;

&lt;p&gt;Rate limiting smooths these spikes by controlling how fast requests are processed.&lt;/p&gt;

&lt;p&gt;This prevents the system from being overwhelmed instantly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Protecting against abuse
&lt;/h3&gt;

&lt;p&gt;Not all traffic is valid.&lt;/p&gt;

&lt;p&gt;Bots, scripts, and malicious users can send a large number of requests in a short time.&lt;/p&gt;

&lt;p&gt;Without limits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  APIs get overloaded&lt;/li&gt;
&lt;li&gt;  resources are wasted&lt;/li&gt;
&lt;li&gt;  real users are affected&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rate limiting acts as a basic protection layer against such abuse.&lt;/p&gt;

&lt;h3&gt;
  
  
  Global vs per user limits
&lt;/h3&gt;

&lt;p&gt;Rate limiting can be applied in different ways.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  global limits control total system traffic&lt;/li&gt;
&lt;li&gt;  per-user limits control individual usage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both are useful.&lt;/p&gt;

&lt;p&gt;Global limits protect the system as a whole.&lt;br&gt;&lt;br&gt;
Per-user limits prevent a single user from consuming too many resources.&lt;/p&gt;

&lt;p&gt;Choosing the right strategy depends on system design.&lt;/p&gt;
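
&lt;p&gt;As a sketch of per-user limiting, the example below keeps one token bucket per user id. The capacity of 5 and the refill rate of one token per second are arbitrary, and the in-memory dictionary is an assumption that only works within a single process.&lt;/p&gt;

```python
import time


class TokenBucket:
    """Allow short bursts up to capacity, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated_at = clock()

    def allow(self):
        now = self.clock()
        elapsed = now - self.updated_at
        self.updated_at = now
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


buckets = {}  # one bucket per user id


def allow_request(user_id):
    if user_id not in buckets:
        buckets[user_id] = TokenBucket(capacity=5, refill_per_second=1)
    return buckets[user_id].allow()


burst = [allow_request("alice") for _ in range(7)]
bob_allowed = allow_request("bob")
print(burst)        # alice's requests in one quick burst
print(bob_allowed)  # bob has his own bucket
```

&lt;p&gt;A global limit would be the same idea with a single shared bucket in front of all users.&lt;/p&gt;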

&lt;h3&gt;
  
  
  Failing gracefully
&lt;/h3&gt;

&lt;p&gt;When a system is overloaded, it must make a choice.&lt;/p&gt;

&lt;p&gt;Either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  accept all requests and risk crashing&lt;/li&gt;
&lt;li&gt;  reject some requests and stay stable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rate limiting helps in rejecting requests in a controlled way.&lt;/p&gt;

&lt;p&gt;Returning a failure response is better than letting the entire system go down.&lt;/p&gt;

&lt;h3&gt;
  
  
  Backpressure concept
&lt;/h3&gt;

&lt;p&gt;Backpressure means slowing down incoming traffic when the system is under stress.&lt;/p&gt;

&lt;p&gt;Instead of accepting everything, the system signals that it cannot handle more load.&lt;/p&gt;

&lt;p&gt;This helps in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  reducing pressure&lt;/li&gt;
&lt;li&gt;  stabilizing performance&lt;/li&gt;
&lt;li&gt;  avoiding cascading failures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It allows the system to recover instead of collapsing.&lt;/p&gt;
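
&lt;p&gt;A minimal sketch of that signal: work is admitted into a bounded queue, and once it is full the caller receives HTTP status 429 (Too Many Requests) with a &lt;code&gt;Retry-After&lt;/code&gt; hint. The queue size and the &lt;code&gt;accept&lt;/code&gt; function are hypothetical.&lt;/p&gt;

```python
import queue

pending = queue.Queue(maxsize=3)  # hypothetical limit on work waiting to be processed


def accept(request_id):
    """Admit the request if there is room; otherwise tell the caller to back off."""
    try:
        pending.put_nowait(request_id)
    except queue.Full:
        # 429 Too Many Requests, with a hint about when to try again.
        return 429, {"Retry-After": "1"}
    return 202, {}


statuses = [accept(request_id)[0] for request_id in range(5)]
print(statuses)
```

&lt;p&gt;Rejecting the overflow immediately keeps the admitted work bounded, so the system stays responsive for the requests it did accept.&lt;/p&gt;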

&lt;h3&gt;
  
  
  Ignoring rate limiting in internal services
&lt;/h3&gt;

&lt;p&gt;Rate limiting is often applied only to external APIs.&lt;/p&gt;

&lt;p&gt;But internal services can also overload each other.&lt;/p&gt;

&lt;p&gt;In microservice architectures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  one service may send too many requests to another&lt;/li&gt;
&lt;li&gt;  internal traffic can grow quickly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without limits, this leads to internal failures that spread across the system.&lt;/p&gt;

&lt;p&gt;Rate limiting should exist both externally and internally.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Rate limiting is not just about blocking requests.&lt;/p&gt;

&lt;p&gt;It is about controlling how the system behaves under pressure.&lt;/p&gt;

&lt;p&gt;Without it, even well-designed systems can fail when traffic increases.&lt;/p&gt;

&lt;p&gt;With it, systems can stay stable by managing load instead of reacting to failure.&lt;/p&gt;

&lt;p&gt;In the next part, we will look at how to design systems that continue to work even when components fail.&lt;/p&gt;

&lt;p&gt;I’ve also explored rate limiting strategies in detail in &lt;a href="https://akshatjme.medium.com/rate-limiting-101-how-to-protect-your-apis-at-scale-8d0a8c666dd3" rel="noopener noreferrer"&gt;a previous article&lt;/a&gt;, where I break down common approaches like token bucket and sliding window, and their real-world trade-offs.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>api</category>
      <category>systemdesignconcepts</category>
      <category>backenddevelopment</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Why Your Database Becomes the Bottleneck</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Fri, 17 Apr 2026 15:35:06 +0000</pubDate>
      <link>https://forem.com/akshatjme/why-your-database-becomes-the-bottleneck-1pfg</link>
      <guid>https://forem.com/akshatjme/why-your-database-becomes-the-bottleneck-1pfg</guid>
      <description>&lt;h4&gt;
  
  
  Why most backend performance issues eventually lead back to the database
&lt;/h4&gt;

&lt;p&gt;In Part 1, we saw how systems collapse under pressure.&lt;br&gt;&lt;br&gt;
In Part 2, we saw how caching can help or hurt.&lt;/p&gt;

&lt;p&gt;Now we look at the most common bottleneck in backend systems:&lt;/p&gt;

&lt;p&gt;The database.&lt;/p&gt;

&lt;p&gt;Almost every request touches it.&lt;br&gt;&lt;br&gt;
So when it slows down, everything slows down.&lt;/p&gt;

&lt;h3&gt;
  
  
  Every request depends on the database
&lt;/h3&gt;

&lt;p&gt;Most backend operations rely on the database.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  fetching data&lt;/li&gt;
&lt;li&gt;  storing updates&lt;/li&gt;
&lt;li&gt;  validating state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes it a central dependency.&lt;/p&gt;

&lt;p&gt;If the database is slow, your entire system feels slow.&lt;br&gt;&lt;br&gt;
There is no easy fallback.&lt;/p&gt;

&lt;h3&gt;
  
  
  Connection pool exhaustion
&lt;/h3&gt;

&lt;p&gt;A database supports only a limited number of concurrent connections, so applications share them through a connection pool.&lt;/p&gt;

&lt;p&gt;Under high traffic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  all connections get used&lt;/li&gt;
&lt;li&gt;  new requests wait in queue&lt;/li&gt;
&lt;li&gt;  latency increases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This happens even before the query runs.&lt;/p&gt;

&lt;p&gt;If the wait time grows, requests start failing.&lt;/p&gt;
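
&lt;p&gt;A minimal sketch of this behavior, assuming a hypothetical &lt;code&gt;ConnectionPool&lt;/code&gt; class built on Python's standard library. In-memory SQLite stands in for a real database, and the pool size and timeout are illustrative values, not recommendations.&lt;/p&gt;

```python
import queue
import sqlite3
from contextlib import contextmanager


class PoolExhausted(Exception):
    """Raised when no connection frees up within the timeout."""


class ConnectionPool:
    """Hypothetical fixed-size pool, for illustration only."""

    def __init__(self, size, timeout_s):
        self._timeout_s = timeout_s
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # In-memory SQLite stands in for a real database connection.
            self._idle.put(sqlite3.connect(":memory:"))

    @contextmanager
    def connection(self):
        try:
            # Blocks while every connection is checked out.
            conn = self._idle.get(timeout=self._timeout_s)
        except queue.Empty:
            raise PoolExhausted("no free connection") from None
        try:
            yield conn
        finally:
            # Always hand the connection back to the pool.
            self._idle.put(conn)


pool = ConnectionPool(size=2, timeout_s=0.05)

# Two requests hold both connections, so the third one waits and then fails.
with pool.connection() as first, pool.connection() as second:
    try:
        with pool.connection():
            third_request = "served"
    except PoolExhausted:
        third_request = "failed before running any query"
```

&lt;p&gt;The third request never reaches the database. It spends its whole lifetime waiting for a connection, which is why pool wait time is worth tracking separately from query time.&lt;/p&gt;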

&lt;h3&gt;
  
  
  Slow queries under load
&lt;/h3&gt;

&lt;p&gt;Queries that look fast at low traffic become slow at scale.&lt;/p&gt;

&lt;p&gt;Because now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  many queries run together&lt;/li&gt;
&lt;li&gt;  resources are shared&lt;/li&gt;
&lt;li&gt;  contention increases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even a small delay per query becomes a big problem when multiplied across thousands of requests.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lack of proper indexing
&lt;/h3&gt;

&lt;p&gt;Without indexes, the database has to scan entire tables to find results.&lt;/p&gt;

&lt;p&gt;At small scale, it may work.&lt;br&gt;&lt;br&gt;
At large scale, it becomes expensive.&lt;/p&gt;

&lt;p&gt;This increases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  response time&lt;/li&gt;
&lt;li&gt;  CPU usage&lt;/li&gt;
&lt;li&gt;  overall system load&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Indexes are one of the simplest and most ignored optimizations.&lt;/p&gt;
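
&lt;p&gt;One way to see the difference is to ask the database for its query plan before and after adding an index. The sketch below uses in-memory SQLite as a stand-in. The &lt;code&gt;orders&lt;/code&gt; table, its columns, and the index name are assumptions made up for the example.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, amount) VALUES (?, ?)",
    [(i % 100, 9.99) for i in range(1000)],
)


def query_plan(sql, params):
    """Return SQLite's plan for a query as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[3] for row in rows)


lookup = "SELECT * FROM orders WHERE user_id = ?"

# Without an index, the only option is to scan the whole table.
plan_before = query_plan(lookup, (42,))

conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")

# With the index, SQLite can search it instead of scanning.
plan_after = query_plan(lookup, (42,))

print(plan_before)
print(plan_after)
```

&lt;p&gt;The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of &lt;code&gt;orders&lt;/code&gt; and the second a search using &lt;code&gt;idx_orders_user&lt;/code&gt;. Most databases expose a similar &lt;code&gt;EXPLAIN&lt;/code&gt; command.&lt;/p&gt;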

&lt;h3&gt;
  
  
  N+1 query problem
&lt;/h3&gt;

&lt;p&gt;Instead of one efficient query, the system makes many small queries.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  fetch list&lt;/li&gt;
&lt;li&gt;  then fetch details one by one&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This increases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  number of DB calls&lt;/li&gt;
&lt;li&gt;  total latency&lt;/li&gt;
&lt;li&gt;  load on database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At scale, this becomes a major bottleneck.&lt;/p&gt;
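
&lt;p&gt;The pattern is easy to reproduce. The sketch below counts database calls for both approaches, using in-memory SQLite and a hypothetical &lt;code&gt;posts&lt;/code&gt; and &lt;code&gt;authors&lt;/code&gt; schema invented for the example.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors (id, name) VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO posts (author_id, title) VALUES (1, 'A'), (2, 'B'), (3, 'C');
""")

calls = {"count": 0}


def run(sql, params=()):
    """Execute a query and count it as one database call."""
    calls["count"] += 1
    return conn.execute(sql, params).fetchall()


# N+1: one query for the list, then one more query per row.
posts = run("SELECT id, author_id, title FROM posts ORDER BY id")
slow = [
    (title, run("SELECT name FROM authors WHERE id = ?", (author_id,))[0][0])
    for _post_id, author_id, title in posts
]
n_plus_one_calls = calls["count"]

# Single query: let the database join the rows itself.
calls["count"] = 0
fast = run(
    "SELECT p.title, a.name FROM posts p "
    "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
)
joined_calls = calls["count"]
```

&lt;p&gt;Both approaches return the same rows. With three posts the first one makes four calls and the second makes one, and the gap grows with every row in the list.&lt;/p&gt;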

&lt;h3&gt;
  
  
  Write-heavy operations
&lt;/h3&gt;

&lt;p&gt;Writes are usually more expensive than reads, because each one may also update indexes and take locks.&lt;/p&gt;

&lt;p&gt;Frequent writes can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  lock rows&lt;/li&gt;
&lt;li&gt;  block reads&lt;/li&gt;
&lt;li&gt;  increase contention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When reads and writes happen together, they slow each other down.&lt;/p&gt;

&lt;h3&gt;
  
  
  No read/write separation
&lt;/h3&gt;

&lt;p&gt;Using a single database for everything creates pressure.&lt;/p&gt;

&lt;p&gt;Reads and writes compete for the same resources.&lt;/p&gt;

&lt;p&gt;A better approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  primary database for writes&lt;/li&gt;
&lt;li&gt;  replicas for reads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this, scaling becomes harder.&lt;/p&gt;
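
&lt;p&gt;A rough sketch of how an application might route queries in this setup. The &lt;code&gt;Router&lt;/code&gt; class and the connection names are hypothetical, and plain strings stand in for real database connections.&lt;/p&gt;

```python
import itertools

# Statements that change data and therefore must go to the primary.
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE"}


class Router:
    """Hypothetical router: writes to the primary, reads across replicas."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # Round-robin over replicas to spread the read load.
        self._replicas = itertools.cycle(replicas)

    def pick(self, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in WRITE_VERBS:
            return self._primary
        return next(self._replicas)


router = Router("primary", ["replica-1", "replica-2"])
queries = [
    "SELECT * FROM users",
    "INSERT INTO users VALUES (1)",
    "SELECT 1",
    "UPDATE users SET name = 'x'",
]
targets = [router.pick(sql) for sql in queries]
```

&lt;p&gt;Replicas usually lag slightly behind the primary, so a read that must see a write made just before it may still need to go to the primary.&lt;/p&gt;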

&lt;h3&gt;
  
  
  Inefficient data modeling
&lt;/h3&gt;

&lt;p&gt;Poor schema design creates long-term problems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  too many joins&lt;/li&gt;
&lt;li&gt;  deeply nested relations&lt;/li&gt;
&lt;li&gt;  unnecessary complexity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes queries slower and harder to optimize.&lt;/p&gt;

&lt;p&gt;Good design reduces work before optimization is even needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unbounded queries
&lt;/h3&gt;

&lt;p&gt;Queries without limits can become dangerous.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  fetching too much data&lt;/li&gt;
&lt;li&gt;  no pagination&lt;/li&gt;
&lt;li&gt;  large scans&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These queries consume more memory and take longer to execute.&lt;/p&gt;

&lt;p&gt;Under load, they affect other queries as well.&lt;/p&gt;
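
&lt;p&gt;One common way to bound a query is keyset pagination: fetch a fixed-size page, remember the last id, and continue from there. The sketch below uses in-memory SQLite with a hypothetical &lt;code&gt;events&lt;/code&gt; table. The page size of 10 is an arbitrary choice for the example.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(f"event-{i}",) for i in range(25)],
)


def fetch_page(after_id, page_size):
    """Return at most page_size rows with an id greater than after_id."""
    return conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()


page_sizes = []
last_id = 0
while True:
    page = fetch_page(last_id, page_size=10)
    if not page:
        break
    page_sizes.append(len(page))
    # Continue from the last row seen instead of using a growing offset.
    last_id = page[-1][0]
```

&lt;p&gt;Each call touches at most ten rows, no matter how large the table grows, so memory use and execution time per query stay predictable.&lt;/p&gt;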

&lt;h3&gt;
  
  
  Locking and contention
&lt;/h3&gt;

&lt;p&gt;When multiple operations try to access the same data, the database uses locks to keep it consistent.&lt;/p&gt;

&lt;p&gt;Too many locks lead to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  waiting queries&lt;/li&gt;
&lt;li&gt;  slower execution&lt;/li&gt;
&lt;li&gt;  reduced throughput&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is common in write-heavy systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Database scaling limits
&lt;/h3&gt;

&lt;p&gt;Databases have limits.&lt;/p&gt;

&lt;p&gt;Vertical scaling can only go so far:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  CPU limits&lt;/li&gt;
&lt;li&gt;  memory limits&lt;/li&gt;
&lt;li&gt;  cost increases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Beyond a point, adding more power does not help.&lt;/p&gt;

&lt;p&gt;You need better design, not just bigger machines.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;The database has limited resources and handles critical operations.&lt;br&gt;&lt;br&gt;
As load increases, small inefficiencies become visible.&lt;/p&gt;

&lt;p&gt;Most performance issues are not sudden.&lt;br&gt;&lt;br&gt;
They build slowly and show up when the system is under pressure.&lt;/p&gt;

&lt;p&gt;Understanding these patterns helps in avoiding common mistakes.&lt;/p&gt;

&lt;p&gt;In the next part, we will look at rate limiting and how controlling traffic can prevent overload.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>systemdesignconcepts</category>
      <category>bottlenecker</category>
      <category>why</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Caching Mistakes That Kill Performance</title>
      <dc:creator>Akshat Jain</dc:creator>
      <pubDate>Wed, 15 Apr 2026 15:45:05 +0000</pubDate>
      <link>https://forem.com/akshatjme/caching-mistakes-that-kill-performance-lfd</link>
      <guid>https://forem.com/akshatjme/caching-mistakes-that-kill-performance-lfd</guid>
      <description>&lt;h4&gt;
  
  
  Simple caching mistakes that silently slow down your backend and make systems fragile
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://akshatjme.medium.com/why-your-backend-stops-performing-overnight-2d5e3a2f263f" rel="noopener noreferrer"&gt;In the previous part&lt;/a&gt;, we saw how systems fail when pressure increases.&lt;/p&gt;

&lt;p&gt;Caching is often used to solve that problem.&lt;/p&gt;

&lt;p&gt;It reduces load, improves response time, and helps systems handle more traffic.&lt;/p&gt;

&lt;p&gt;But caching is not always a win.&lt;/p&gt;

&lt;p&gt;If done wrong, it can make systems harder to manage and sometimes even slower.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F1%2A4rzm6zWs-AXP6p7BqqnKLg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F1%2A4rzm6zWs-AXP6p7BqqnKLg.png" alt="Caching Mistakes That Kill Performance" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Caching everything blindly
&lt;/h3&gt;

&lt;p&gt;A common mistake is caching everything.&lt;/p&gt;

&lt;p&gt;Not all data needs caching.&lt;/p&gt;

&lt;p&gt;If the data is rarely accessed or changes frequently, caching adds extra complexity without real benefit. You end up managing cache logic, invalidation, and storage for little gain.&lt;/p&gt;

&lt;p&gt;Caching should be selective.&lt;/p&gt;

&lt;p&gt;It works best for data that is read often and changes less frequently.&lt;/p&gt;

&lt;h3&gt;
  
  
  No cache invalidation strategy
&lt;/h3&gt;

&lt;p&gt;Caching introduces a new problem: stale data.&lt;/p&gt;

&lt;p&gt;If cached data is not updated or cleared correctly, users may see outdated information.&lt;/p&gt;

&lt;p&gt;In many systems, stale data becomes a bigger issue than slow responses.&lt;/p&gt;

&lt;p&gt;Without a clear invalidation strategy, the cache slowly becomes unreliable.&lt;/p&gt;

&lt;p&gt;Handling updates properly is as important as caching itself.&lt;/p&gt;
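
&lt;p&gt;A small sketch of the difference, using plain dictionaries as stand-ins for the database and the cache. The function names are made up for the example, and deleting the key on write is only one of several possible invalidation strategies.&lt;/p&gt;

```python
database = {"user:1": "Asha"}  # stand-in for the source of truth
cache = {}


def read(key):
    if key not in cache:
        # Cache-aside: fill the cache on a miss.
        cache[key] = database[key]
    return cache[key]


def write_without_invalidation(key, value):
    # The cached copy is left untouched and goes stale.
    database[key] = value


def write_with_invalidation(key, value):
    database[key] = value
    # Drop the cached copy so the next read fetches the new value.
    cache.pop(key, None)


read("user:1")  # warms the cache with "Asha"

write_without_invalidation("user:1", "Asha K.")
stale = read("user:1")  # still the old value from the cache

write_with_invalidation("user:1", "Asha K.")
fresh = read("user:1")  # refetched from the database
```

&lt;p&gt;After the first write, readers keep seeing the old name for as long as the entry stays cached. After the second write, the very next read returns the updated value.&lt;/p&gt;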

&lt;h3&gt;
  
  
  Over-reliance on cache
&lt;/h3&gt;

&lt;p&gt;A cache should improve performance, not become a dependency.&lt;/p&gt;

&lt;p&gt;If your system breaks when the cache is unavailable, the design is fragile.&lt;/p&gt;

&lt;p&gt;Cache failures should not stop core functionality. The system should still work, even if responses become slower.&lt;/p&gt;

&lt;p&gt;A good system treats cache as an optimization, not a requirement.&lt;/p&gt;
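
&lt;p&gt;As a sketch of what this can look like in code, the read path below falls back to the database whenever the cache raises an error. &lt;code&gt;FlakyCache&lt;/code&gt;, &lt;code&gt;CacheDown&lt;/code&gt;, and &lt;code&gt;load_from_db&lt;/code&gt; are hypothetical names standing in for a real cache client and data layer.&lt;/p&gt;

```python
class CacheDown(Exception):
    """Raised by the stand-in cache when it is unavailable."""


class FlakyCache:
    """Hypothetical cache client that can be switched off for the demo."""

    def __init__(self):
        self.available = True
        self._data = {}

    def get(self, key):
        if not self.available:
            raise CacheDown()
        return self._data.get(key)

    def set(self, key, value):
        if not self.available:
            raise CacheDown()
        self._data[key] = value


def load_from_db(key):
    # Stand-in for the slower, authoritative lookup.
    return f"row for {key}"


def get(cache, key):
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
    except CacheDown:
        # Cache outage: degrade to the database instead of failing.
        return load_from_db(key)
    value = load_from_db(key)
    try:
        cache.set(key, value)
    except CacheDown:
        pass  # Failing to populate the cache must not fail the request.
    return value


cache = FlakyCache()
with_cache = get(cache, "a")
cache.available = False
without_cache = get(cache, "a")
```

&lt;p&gt;Both calls return the same row. The second one is slower in a real system, but the request still succeeds. The database also needs enough headroom to absorb that extra traffic during a cache outage.&lt;/p&gt;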

&lt;h3&gt;
  
  
  Caching at the wrong layer
&lt;/h3&gt;

&lt;p&gt;Caching can be applied at different levels.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  database level&lt;/li&gt;
&lt;li&gt;  backend services&lt;/li&gt;
&lt;li&gt;  API layer&lt;/li&gt;
&lt;li&gt;  frontend&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Choosing the wrong layer reduces its effectiveness.&lt;/p&gt;

&lt;p&gt;For example, caching only at the frontend may not reduce backend load. Caching too deep in the database may not help repeated API calls.&lt;/p&gt;

&lt;p&gt;The goal is to cache where it reduces the most work.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ignoring cache hit rate
&lt;/h3&gt;

&lt;p&gt;A cache is only useful if requests actually hit it.&lt;/p&gt;

&lt;p&gt;If most requests miss the cache, it does not provide real value.&lt;/p&gt;

&lt;p&gt;Low hit rate means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  unnecessary memory usage&lt;/li&gt;
&lt;li&gt;  extra complexity&lt;/li&gt;
&lt;li&gt;  no performance improvement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Monitoring hit rate helps in understanding whether the cache is actually effective.&lt;/p&gt;
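
&lt;p&gt;Hit rate is simply hits divided by total lookups. A minimal sketch of tracking it inside the cache itself, with a hypothetical &lt;code&gt;CountingCache&lt;/code&gt; class. Real cache servers usually expose similar counters in their own statistics.&lt;/p&gt;

```python
class CountingCache:
    """Hypothetical in-process cache that records hits and misses."""

    def __init__(self):
        self._data = {}
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        if key in self._data:
            self.hits += 1
        else:
            self.misses += 1
            self._data[key] = loader(key)
        return self._data[key]

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        # Report 0.0 before any lookup to avoid dividing by zero.
        return self.hits / total if total else 0.0


cache = CountingCache()
for key in ["a", "b", "a", "a"]:
    cache.get_or_load(key, str.upper)

print(cache.hits, cache.misses, cache.hit_rate)
```

&lt;p&gt;Here two of the four lookups are served from the cache, so the hit rate is 0.5. Watching this number over time shows whether the cached keys match what the system actually reads.&lt;/p&gt;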

&lt;h3&gt;
  
  
  Large object caching mistakes
&lt;/h3&gt;

&lt;p&gt;Caching large responses can create new problems.&lt;/p&gt;

&lt;p&gt;Large objects consume more memory and take longer to read and write. This increases pressure on the cache system itself.&lt;/p&gt;

&lt;p&gt;Instead of improving performance, it can slow things down.&lt;/p&gt;

&lt;p&gt;It is often better to cache smaller, frequently used pieces of data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cold start problem
&lt;/h3&gt;

&lt;p&gt;Caches are empty at the start, for example right after a deploy or restart.&lt;/p&gt;

&lt;p&gt;If traffic arrives before the cache fills, every request goes directly to the database. This creates a spike in load.&lt;/p&gt;

&lt;p&gt;If the database cannot handle this spike, the system slows down or fails.&lt;/p&gt;

&lt;p&gt;This is known as the cold start problem.&lt;/p&gt;

&lt;p&gt;Proper warming strategies or gradual traffic handling can reduce this risk.&lt;/p&gt;

&lt;h3&gt;
  
  
  No expiration strategy (TTL issues)
&lt;/h3&gt;

&lt;p&gt;Every cache needs a clear expiration policy.&lt;/p&gt;

&lt;p&gt;If data lives too long, it becomes stale.&lt;br&gt;&lt;br&gt;
If it expires too quickly, the system keeps fetching fresh data and loses the benefit of caching.&lt;/p&gt;

&lt;p&gt;Choosing the right TTL depends on how often the data changes and how critical freshness is.&lt;/p&gt;

&lt;p&gt;Poor TTL decisions reduce the effectiveness of caching.&lt;/p&gt;
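
&lt;p&gt;To make the mechanics concrete, here is a minimal TTL cache sketch. The &lt;code&gt;TTLCache&lt;/code&gt; class is hypothetical, the 30-second TTL is an arbitrary value, and the clock is injected only so the example can move time forward without sleeping.&lt;/p&gt;

```python
import time


class TTLCache:
    """Hypothetical cache where every entry expires after ttl_s seconds."""

    def __init__(self, ttl_s, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl_s)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller refetches it.
            del self._data[key]
            return None
        return value


now = [0.0]  # fake clock, advanced by hand
cache = TTLCache(ttl_s=30, clock=lambda: now[0])
cache.set("price", 499)

now[0] = 29.0
before_expiry = cache.get("price")  # still cached

now[0] = 30.0
after_expiry = cache.get("price")  # expired, treated as a miss
```

&lt;p&gt;A longer TTL keeps the entry around for more reads but widens the window in which it can be stale. A shorter one does the opposite.&lt;/p&gt;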

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Caching is a powerful tool, but it is easy to misuse.&lt;/p&gt;

&lt;p&gt;It does not fix underlying system problems. It only hides them if used incorrectly.&lt;/p&gt;

&lt;p&gt;A good caching strategy focuses on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  what to cache&lt;/li&gt;
&lt;li&gt;  where to cache&lt;/li&gt;
&lt;li&gt;  how to update it&lt;/li&gt;
&lt;li&gt;  when to expire it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When done right, caching improves performance and stability. When done wrong, it adds complexity without real benefit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://medium.com/@akshatjme/f325cc73e605" rel="noopener noreferrer"&gt;In the next part&lt;/a&gt;, we will look at why databases often become the main bottleneck in backend systems.&lt;/p&gt;

</description>
      <category>backenddevelopment</category>
      <category>softwareengineering</category>
      <category>preformance</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
