<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: CodeWithDhanian</title>
    <description>The latest articles on Forem by CodeWithDhanian (@code_2).</description>
    <link>https://forem.com/code_2</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2598648%2F680b87c2-d957-4cdd-b8c1-0009b0f55328.jpg</url>
      <title>Forem: CodeWithDhanian</title>
      <link>https://forem.com/code_2</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/code_2"/>
    <language>en</language>
    <item>
      <title>What is Horizontal vs Vertical Scaling?</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Fri, 08 May 2026 08:03:13 +0000</pubDate>
      <link>https://forem.com/code_2/what-is-horizontal-vs-vertical-scaling-1o3e</link>
      <guid>https://forem.com/code_2/what-is-horizontal-vs-vertical-scaling-1o3e</guid>
      <description>&lt;p&gt;&lt;strong&gt;Scaling&lt;/strong&gt; is the fundamental process of increasing a system’s capacity to handle greater workloads, more users, or higher traffic without compromising performance. In system design, two primary strategies exist: &lt;strong&gt;vertical scaling&lt;/strong&gt; and &lt;strong&gt;horizontal scaling&lt;/strong&gt;. Each approach addresses growth differently, carries unique architectural implications, and demands distinct engineering considerations. Understanding both is essential for building systems that remain reliable, cost-effective, and performant as demand evolves.&lt;/p&gt;

&lt;h3&gt;
  
  
  Understanding Vertical Scaling
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Vertical scaling&lt;/strong&gt;, also known as &lt;strong&gt;scaling up&lt;/strong&gt;, involves enhancing the capabilities of a single server or instance by adding more resources to it. This typically means increasing CPU cores, RAM, storage capacity, or network bandwidth on the existing machine.&lt;/p&gt;

&lt;p&gt;The process is straightforward. Consider a web server running on a machine with 4 CPU cores and 8 GB of RAM. When traffic grows, the operations team upgrades that same machine to 16 CPU cores and 64 GB of RAM. No additional servers are introduced; the workload continues to run on the upgraded hardware.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vertical scaling&lt;/strong&gt; shines in scenarios where the application is monolithic or tightly coupled to a single process. Databases often benefit from this approach during early growth phases because a larger instance can process more queries per second without requiring data partitioning logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advantages of vertical scaling&lt;/strong&gt; include simplicity of implementation, lower operational overhead, and minimal changes to application code. Latency between components remains low since everything runs within one machine. Management is easier because there is only a single instance to monitor, backup, and secure.&lt;/p&gt;

&lt;p&gt;However, &lt;strong&gt;vertical scaling&lt;/strong&gt; has hard physical limits. Hardware vendors offer only finite maximum configurations for any server type. Beyond a certain point, upgrading becomes prohibitively expensive or technically impossible. A single point of failure exists: if that upgraded machine crashes, the entire system goes down. Upgrades frequently require downtime while the instance is stopped, resized, and restarted. In cloud environments, this translates to higher costs for larger instance types that may be over-provisioned during low-traffic periods.&lt;/p&gt;

&lt;h3&gt;
  
  
  Understanding Horizontal Scaling
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Horizontal scaling&lt;/strong&gt;, also known as &lt;strong&gt;scaling out&lt;/strong&gt;, involves adding more servers or instances to distribute the workload across multiple machines. Instead of making one server more powerful, the system grows by increasing the number of identical servers working together.&lt;/p&gt;

&lt;p&gt;A load balancer sits in front of the fleet of servers and routes incoming requests intelligently across them. As demand increases, new instances are spun up automatically or manually, and traffic is spread evenly. This approach aligns naturally with cloud-native architectures and microservices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Horizontal scaling&lt;/strong&gt; raises the capacity ceiling far above what any single machine offers, because more machines can be added as demand grows, at least until a shared dependency such as the database or the network becomes the bottleneck. It also improves fault tolerance: if one server fails, the remaining servers continue serving traffic. Several small commodity instances are often cheaper than one very large machine of equivalent capacity. Upgrades can occur without downtime by adding new instances before removing old ones.&lt;/p&gt;

&lt;p&gt;The trade-offs are significant. &lt;strong&gt;Horizontal scaling&lt;/strong&gt; introduces complexity in areas such as data synchronization, session management, and inter-service communication. Network latency between machines becomes a factor. Applications must be designed to be stateless or use external shared stores for state. Distributed system challenges like consistency, leader election, and failure detection emerge. Debugging across multiple nodes is more difficult than on a single machine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Comparing Vertical and Horizontal Scaling in Practice
&lt;/h3&gt;

&lt;p&gt;The choice between &lt;strong&gt;vertical scaling&lt;/strong&gt; and &lt;strong&gt;horizontal scaling&lt;/strong&gt; depends on the application’s architecture, expected growth curve, team expertise, and budget.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vertical scaling&lt;/strong&gt; suits early-stage startups, legacy monolithic applications, or workloads with heavy in-memory computations where splitting data is impractical. &lt;strong&gt;Horizontal scaling&lt;/strong&gt; becomes necessary when traffic exceeds what any single machine can handle or when high availability is non-negotiable.&lt;/p&gt;

&lt;p&gt;Real-world systems frequently combine both strategies. A database might use &lt;strong&gt;vertical scaling&lt;/strong&gt; for its primary instance while employing &lt;strong&gt;horizontal scaling&lt;/strong&gt; through read replicas or sharded clusters for read-heavy workloads.&lt;/p&gt;

&lt;h3&gt;
  
  
  Designing Applications for Horizontal Scaling: A Practical Example
&lt;/h3&gt;

&lt;p&gt;To succeed with &lt;strong&gt;horizontal scaling&lt;/strong&gt;, applications must be stateless whenever possible. The following complete code example illustrates the difference.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stateful example (problematic for horizontal scaling)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;  &lt;span class="c1"&gt;# Global variable stored in memory
&lt;/span&gt;
&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/increment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="n"&gt;counter&lt;/span&gt;
    &lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Counter: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this service runs on multiple instances behind a load balancer, each instance maintains its own counter. Users hitting different instances receive inconsistent values. This breaks correctness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stateless example (ready for horizontal scaling)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;redis_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;redis-shared-store&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6379&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/increment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;incr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;global_counter&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Counter: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



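&lt;p&gt;All instances share the same Redis store, so the counter remains consistent regardless of which instance processes the request. Because Redis executes &lt;code&gt;INCR&lt;/code&gt; atomically, concurrent requests from different instances do not lose updates. The application tier can now scale out by adding instances without code changes, although Redis itself becomes a shared dependency that must be sized and kept available.&lt;/p&gt;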
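
&lt;p&gt;The difference between the two designs can be reproduced without Flask or Redis. The following is a minimal sketch, not the article's code: the class names &lt;code&gt;InMemoryInstance&lt;/code&gt; and &lt;code&gt;SharedStoreInstance&lt;/code&gt; and the helper &lt;code&gt;send_requests&lt;/code&gt; are hypothetical, and a plain dictionary stands in for the shared Redis store.&lt;/p&gt;

```python
import itertools


class InMemoryInstance:
    """One app instance that keeps the counter in its own process memory."""

    def __init__(self):
        self.counter = 0

    def increment(self):
        self.counter += 1
        return self.counter


class SharedStoreInstance:
    """One app instance that delegates the counter to a shared store."""

    def __init__(self, store):
        # Assumption: a dict stands in for the shared Redis client.
        # Unlike Redis INCR, this read-modify-write is not atomic.
        self.store = store

    def increment(self):
        self.store["global_counter"] = self.store.get("global_counter", 0) + 1
        return self.store["global_counter"]


def send_requests(instances, count):
    """Spread `count` requests across instances in round-robin order."""
    rotation = itertools.cycle(instances)
    return [next(rotation).increment() for _ in range(count)]


# Three instances, six requests, two designs.
stateful_replies = send_requests([InMemoryInstance() for _ in range(3)], 6)
shared_store = {}
stateless_replies = send_requests(
    [SharedStoreInstance(shared_store) for _ in range(3)], 6
)

print(stateful_replies)   # every instance counts on its own
print(stateless_replies)  # one counter shared by all instances
```

&lt;p&gt;With in-memory state the replies repeat because each instance starts from zero, while the shared store produces a single increasing sequence.&lt;/p&gt;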

&lt;h3&gt;
  
  
  Implementing Horizontal Scaling with Nginx Load Balancer
&lt;/h3&gt;

&lt;p&gt;A complete Nginx configuration demonstrates how to distribute traffic across multiple application instances.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="c1"&gt;# /etc/nginx/nginx.conf&lt;/span&gt;
&lt;span class="k"&gt;user&lt;/span&gt; &lt;span class="s"&gt;www-data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;worker_processes&lt;/span&gt; &lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;pid&lt;/span&gt; &lt;span class="n"&gt;/run/nginx.pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;events&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;worker_connections&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;http&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;upstream&lt;/span&gt; &lt;span class="s"&gt;backend&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="nf"&gt;app-instance-1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="nf"&gt;app-instance-2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="nf"&gt;app-instance-3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="c1"&gt;# Add more servers here as you scale out&lt;/span&gt;
        &lt;span class="kn"&gt;least_conn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;# Distribute to the least busy server&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;server_name&lt;/span&gt; &lt;span class="s"&gt;example.com&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://backend&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Host&lt;/span&gt; &lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Real-IP&lt;/span&gt; &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-Proto&lt;/span&gt; &lt;span class="nv"&gt;$scheme&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="c1"&gt;# Health checks for automatic removal of unhealthy instances&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_next_upstream&lt;/span&gt; &lt;span class="s"&gt;error&lt;/span&gt; &lt;span class="s"&gt;timeout&lt;/span&gt; &lt;span class="s"&gt;http_500&lt;/span&gt; &lt;span class="s"&gt;http_502&lt;/span&gt; &lt;span class="s"&gt;http_503&lt;/span&gt; &lt;span class="s"&gt;http_504&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This configuration defines an &lt;strong&gt;upstream&lt;/strong&gt; block listing all application instances. The &lt;strong&gt;least_conn&lt;/strong&gt; directive sends each new request to the instance with the fewest active connections. The &lt;code&gt;proxy_set_header&lt;/code&gt; lines pass the original host, client IP, and protocol to the backend, and &lt;code&gt;proxy_next_upstream&lt;/code&gt; retries a failed request on another instance. As demand grows, add more &lt;strong&gt;server&lt;/strong&gt; lines or use orchestration tools to update the upstream list dynamically. Reload Nginx with &lt;code&gt;nginx -s reload&lt;/code&gt; to apply changes without dropping existing connections.&lt;/p&gt;
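
&lt;p&gt;Updating the upstream list dynamically usually means generating it from whatever inventory the orchestration tool exposes. As a rough sketch of that idea, the hypothetical helper &lt;code&gt;render_upstream&lt;/code&gt; below builds the text of an upstream block from a list of host names. The function name, its parameters, and the fourth instance name are assumptions made for illustration.&lt;/p&gt;

```python
def render_upstream(name, hosts, port=5000, method="least_conn"):
    """Return the text of an Nginx upstream block for the given hosts."""
    lines = [f"upstream {name} {{"]
    if method:
        # Balancing method directive, for example least_conn or ip_hash.
        lines.append(f"    {method};")
    for host in hosts:
        lines.append(f"    server {host}:{port};")
    lines.append("}")
    return "\n".join(lines)


# Scaling out from three instances to four only changes the host list.
hosts = ["app-instance-1", "app-instance-2", "app-instance-3", "app-instance-4"]
config = render_upstream("backend", hosts)
print(config)
```

&lt;p&gt;The generated text could be written to a file that the main configuration includes, checked with &lt;code&gt;nginx -t&lt;/code&gt;, and then applied with &lt;code&gt;nginx -s reload&lt;/code&gt;.&lt;/p&gt;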

&lt;h3&gt;
  
  
  When to Choose Each Strategy
&lt;/h3&gt;

&lt;p&gt;Start with &lt;strong&gt;vertical scaling&lt;/strong&gt; when the system is small, the team is focused on rapid feature delivery, and the application does not yet require distributed coordination. Transition to &lt;strong&gt;horizontal scaling&lt;/strong&gt; when traffic patterns show sustained growth, when downtime during upgrades becomes unacceptable, or when cloud costs for larger instances exceed the expense of multiple smaller ones.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Horizontal scaling&lt;/strong&gt; is the foundation of modern resilient systems. It forces thoughtful design decisions that pay dividends in reliability and flexibility far beyond raw capacity.&lt;/p&gt;


&lt;p&gt;&lt;strong&gt;Visualizing the Concept&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgn4kwj66bavf8lpz6d5m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgn4kwj66bavf8lpz6d5m.png" alt="Horizontal vs vertical scaling comparison" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>What is Load Balancing?</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Fri, 08 May 2026 07:43:53 +0000</pubDate>
      <link>https://forem.com/code_2/what-is-load-balancing-5egi</link>
      <guid>https://forem.com/code_2/what-is-load-balancing-5egi</guid>
      <description>&lt;p&gt;&lt;strong&gt;Load balancing&lt;/strong&gt; is the fundamental technique used in modern distributed systems to distribute incoming network traffic across multiple backend &lt;strong&gt;servers&lt;/strong&gt; or &lt;strong&gt;resources&lt;/strong&gt; in order to ensure no single server becomes overwhelmed, thereby improving &lt;strong&gt;responsiveness&lt;/strong&gt;, &lt;strong&gt;availability&lt;/strong&gt;, and &lt;strong&gt;scalability&lt;/strong&gt;. At its core, a &lt;strong&gt;load balancer&lt;/strong&gt; acts as a traffic cop that sits between clients and the actual application servers, intelligently routing each request to the most appropriate server based on predefined rules and real-time conditions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Load Balancing Is Essential in System Design
&lt;/h3&gt;

&lt;p&gt;In any production-grade application that serves millions of users, relying on a single server is impractical and risky. A sudden spike in traffic, such as during a flash sale or viral event, can cause that server to slow down, crash, or become unresponsive. &lt;strong&gt;Load balancing&lt;/strong&gt; solves this by enabling &lt;strong&gt;horizontal scaling&lt;/strong&gt; — the ability to add more servers dynamically — while maintaining a seamless user experience. It also provides &lt;strong&gt;fault tolerance&lt;/strong&gt;: if one server fails, the &lt;strong&gt;load balancer&lt;/strong&gt; automatically stops sending traffic to it and redirects requests to healthy servers. This ensures the system remains highly available even during hardware failures, maintenance windows, or unexpected load surges.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Components of a Load Balancer
&lt;/h3&gt;

&lt;p&gt;A typical &lt;strong&gt;load balancer&lt;/strong&gt; consists of the following essential elements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend Listener&lt;/strong&gt;: The entry point that accepts incoming client requests on specific ports (usually 80 for HTTP or 443 for HTTPS).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend Pool&lt;/strong&gt;: A group of healthy application servers (often called targets or origins) that actually process the requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Health Check Mechanism&lt;/strong&gt;: Continuous monitoring that probes each backend server to verify it is responding correctly. A failed health check removes the server from the active pool until it recovers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Routing Engine&lt;/strong&gt;: The brain that applies &lt;strong&gt;load balancing algorithms&lt;/strong&gt; and rules to decide which server receives each request.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session Persistence Layer&lt;/strong&gt; (optional): Ensures that a user’s subsequent requests are routed to the same server when necessary (also known as sticky sessions).&lt;/li&gt;
&lt;/ul&gt;
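
&lt;p&gt;The health check mechanism and the backend pool interact in a simple way: repeated failed probes take a server out of rotation, and a successful probe brings it back. The sketch below illustrates that bookkeeping only. The class &lt;code&gt;BackendPool&lt;/code&gt;, its &lt;code&gt;max_fails&lt;/code&gt; threshold, and the server names are hypothetical, and the probe results are supplied by hand instead of coming from real network checks.&lt;/p&gt;

```python
class BackendPool:
    """Tracks backends and removes them from rotation after repeated failed probes."""

    def __init__(self, backends, max_fails=3):
        self.max_fails = max_fails
        self.failures = {backend: 0 for backend in backends}
        self.unhealthy = set()

    def record_probe(self, backend, healthy):
        """Update a backend's state from the result of one health probe."""
        if healthy:
            # A successful probe resets the count and restores the backend.
            self.failures[backend] = 0
            self.unhealthy.discard(backend)
            return
        self.failures[backend] += 1
        if self.failures[backend] == self.max_fails:
            self.unhealthy.add(backend)

    def active(self):
        """Return the backends that may currently receive traffic."""
        return [b for b in self.failures if b not in self.unhealthy]


pool = BackendPool(["app-1", "app-2", "app-3"])

# Three consecutive failed probes remove app-2 from the active pool.
for _ in range(3):
    pool.record_probe("app-2", healthy=False)
after_failures = pool.active()

# One successful probe puts it back.
pool.record_probe("app-2", healthy=True)
after_recovery = pool.active()

print(after_failures)
print(after_recovery)
```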

&lt;h3&gt;
  
  
  How Load Balancing Works Step by Step
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;A client (browser, mobile app, or another service) sends a request to the public IP or domain of the &lt;strong&gt;load balancer&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;load balancer&lt;/strong&gt; inspects the request headers, source IP, or other metadata.&lt;/li&gt;
&lt;li&gt;Using its configured algorithm and current server metrics (CPU load, active connections, response time), the &lt;strong&gt;load balancer&lt;/strong&gt; selects the optimal backend server.&lt;/li&gt;
&lt;li&gt;The request is forwarded (proxied) to the chosen server.&lt;/li&gt;
&lt;li&gt;The backend server processes the request and sends the response back through the &lt;strong&gt;load balancer&lt;/strong&gt; to the client.&lt;/li&gt;
&lt;li&gt;Along the way, the &lt;strong&gt;load balancer&lt;/strong&gt; may also perform TLS termination, compression, or request rewriting.&lt;/li&gt;
&lt;/ol&gt;
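
&lt;p&gt;The steps above can be traced in a small in-process model. This is a sketch under simplifying assumptions: backends are plain Python functions instead of remote servers, selection is simple rotation, and the names &lt;code&gt;make_backend&lt;/code&gt; and &lt;code&gt;LoadBalancer&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import itertools


def make_backend(name):
    """Return a hypothetical backend handler that reports what it received."""

    def handle(request):
        client = request["headers"]["X-Forwarded-For"]
        body = f"{name} served {request['path']} for {client}"
        return {"status": 200, "body": body}

    return handle


class LoadBalancer:
    """Receives client requests and proxies each one to a backend."""

    def __init__(self, backends):
        # Selection policy: rotate through the backends in order.
        self._rotation = itertools.cycle(backends)

    def handle(self, client_ip, path):
        # Step 2: inspect the request and record the original client address.
        request = {"path": path, "headers": {"X-Forwarded-For": client_ip}}
        # Step 3: select a backend.
        backend = next(self._rotation)
        # Step 4: forward the request to the chosen backend.
        response = backend(request)
        # Step 5: relay the backend's response to the client.
        return response


balancer = LoadBalancer([make_backend("app-1"), make_backend("app-2")])

# Step 1: the same client sends three requests to the load balancer.
replies = [balancer.handle("203.0.113.7", "/orders")["body"] for _ in range(3)]
for line in replies:
    print(line)
```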

&lt;h3&gt;
  
  
  Types of Load Balancers
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Load balancers&lt;/strong&gt; are broadly classified by the OSI layer at which they operate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Layer 4 (Transport Layer) Load Balancers&lt;/strong&gt;: Operate at the TCP/UDP level. They forward packets based on IP address and port without inspecting the actual content of the request. Examples include AWS Network Load Balancer and HAProxy in TCP mode. They are extremely fast and suitable for high-throughput scenarios but cannot make routing decisions based on HTTP headers or URL paths.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Layer 7 (Application Layer) Load Balancers&lt;/strong&gt;: Operate at the HTTP/HTTPS level. They can read the full request, including URL, headers, cookies, and method. This allows advanced routing such as sending image requests to one pool and API requests to another. Examples include AWS Application Load Balancer, NGINX, and Envoy. They support content-based routing, rate limiting, and header manipulation but introduce slightly higher latency due to inspection.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
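
&lt;p&gt;The practical difference is what information the routing decision can use. The sketch below contrasts the two: the Layer 4 function sees only an address and port, while the Layer 7 function can branch on the URL path. The pool names, path prefixes, and function names are assumptions chosen for illustration, and CRC32 is used only as a convenient deterministic hash.&lt;/p&gt;

```python
import zlib

# Hypothetical backend pools.
POOLS = {
    "images": ["img-1", "img-2"],
    "api": ["api-1", "api-2"],
    "default": ["web-1", "web-2"],
}


def pick(pool, key):
    """Deterministically pick a server from a pool by hashing a key."""
    return pool[zlib.crc32(key.encode()) % len(pool)]


def route_layer4(client_ip, client_port):
    """Layer 4: only addresses and ports are visible, so one pool serves everything."""
    return pick(POOLS["default"], f"{client_ip}:{client_port}")


def route_layer7(client_ip, path):
    """Layer 7: the URL path is visible, so requests can be split by content."""
    if path.startswith("/images/"):
        return pick(POOLS["images"], client_ip)
    if path.startswith("/api/"):
        return pick(POOLS["api"], client_ip)
    return pick(POOLS["default"], client_ip)


image_target = route_layer7("198.51.100.4", "/images/logo.png")
api_target = route_layer7("198.51.100.4", "/api/users")
l4_target = route_layer4("198.51.100.4", 51234)
print(image_target, api_target, l4_target)
```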

&lt;p&gt;&lt;strong&gt;Load balancers&lt;/strong&gt; can also be deployed as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hardware appliances&lt;/strong&gt; (F5 BIG-IP, Citrix ADC) — expensive but offer high performance and specialized ASIC chips.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Software solutions&lt;/strong&gt; (NGINX, HAProxy, Traefik) — run on commodity servers or containers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud-managed services&lt;/strong&gt; (AWS ELB, Google Cloud Load Balancing, Azure Load Balancer) — fully managed with auto-scaling built in.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Popular Load Balancing Algorithms
&lt;/h3&gt;

&lt;p&gt;The choice of algorithm directly impacts system performance. Here are the most widely used ones with detailed explanations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Round Robin&lt;/strong&gt;: Requests are distributed sequentially across the backend servers in a cyclic order. Simple and fair when all servers have identical capacity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Weighted Round Robin&lt;/strong&gt;: Each server is assigned a weight based on its capacity. A more powerful server receives proportionally more requests.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Least Connections&lt;/strong&gt;: The &lt;strong&gt;load balancer&lt;/strong&gt; routes the next request to the server currently handling the fewest active connections. Excellent for uneven workloads.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Least Response Time&lt;/strong&gt;: Routes to the server with the lowest average response time, often weighing it together with the number of active connections.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;IP Hash&lt;/strong&gt;: Uses the client’s IP address to consistently route requests to the same server. Useful for session persistence without cookies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Random&lt;/strong&gt;: Selects a server at random. Surprisingly effective and simple to implement.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
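
&lt;p&gt;Several of these algorithms fit in a few lines each. The functions below are simplified sketches, not how any particular load balancer implements them: the weighted variant expands the server list naively, whereas production balancers often interleave picks more smoothly, and the IP hash uses CRC32 purely for illustration. All function and server names are hypothetical.&lt;/p&gt;

```python
import itertools
import zlib


def round_robin(servers):
    """Yield servers in a repeating cycle."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Yield each server as many times per cycle as its weight (naive expansion)."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active):
    """Return the server with the fewest active connections."""
    return min(active, key=active.get)


def ip_hash(servers, client_ip):
    """Map a client IP to the same server on every call."""
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


servers = ["app-1", "app-2", "app-3"]

rr = round_robin(servers)
rr_picks = [next(rr) for _ in range(4)]

wrr = weighted_round_robin({"app-1": 3, "app-2": 1})
wrr_picks = [next(wrr) for _ in range(4)]

lc_pick = least_connections({"app-1": 12, "app-2": 4, "app-3": 9})

first_hash = ip_hash(servers, "203.0.113.7")
second_hash = ip_hash(servers, "203.0.113.7")

print(rr_picks)
print(wrr_picks)
print(lc_pick)
print(first_hash == second_hash)
```

&lt;p&gt;Note that a modulo-based IP hash remaps most clients whenever the server list changes, which is one reason consistent hashing is often preferred when the pool size varies.&lt;/p&gt;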

&lt;h3&gt;
  
  
  Complete NGINX Configuration Example
&lt;/h3&gt;

&lt;p&gt;Below is an NGINX configuration that demonstrates &lt;strong&gt;load balancing&lt;/strong&gt; with weighted least connections, passive failure detection, and TLS termination. The key directives are explained after the listing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Global settings&lt;/span&gt;
&lt;span class="k"&gt;worker_processes&lt;/span&gt; &lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;events&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;worker_connections&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;http&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;# Define the upstream (backend pool)&lt;/span&gt;
    &lt;span class="kn"&gt;upstream&lt;/span&gt; &lt;span class="s"&gt;backend_servers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;# Least Connections algorithm with weights&lt;/span&gt;
        &lt;span class="kn"&gt;least_conn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="nf"&gt;app-server-1.example.com&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8080&lt;/span&gt; &lt;span class="s"&gt;weight=3&lt;/span&gt; &lt;span class="s"&gt;max_fails=3&lt;/span&gt; &lt;span class="s"&gt;fail_timeout=30s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="nf"&gt;app-server-2.example.com&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8080&lt;/span&gt; &lt;span class="s"&gt;weight=2&lt;/span&gt; &lt;span class="s"&gt;max_fails=3&lt;/span&gt; &lt;span class="s"&gt;fail_timeout=30s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="nf"&gt;app-server-3.example.com&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8080&lt;/span&gt; &lt;span class="s"&gt;weight=1&lt;/span&gt; &lt;span class="s"&gt;max_fails=3&lt;/span&gt; &lt;span class="s"&gt;fail_timeout=30s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;# Health check (requires nginx-plus or open-source module)&lt;/span&gt;
        &lt;span class="c1"&gt;# In open-source NGINX, use external tools like consul-template&lt;/span&gt;
        &lt;span class="kn"&gt;keepalive&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;443&lt;/span&gt; &lt;span class="s"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;server_name&lt;/span&gt; &lt;span class="s"&gt;myapp.com&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;# SSL termination happens here&lt;/span&gt;
        &lt;span class="kn"&gt;ssl_certificate&lt;/span&gt; &lt;span class="n"&gt;/etc/nginx/ssl/fullchain.pem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;ssl_certificate_key&lt;/span&gt; &lt;span class="n"&gt;/etc/nginx/ssl/privkey.pem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;# Forward request to the upstream pool&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://backend_servers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="c1"&gt;# Preserve original host and client IP&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Host&lt;/span&gt; &lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Real-IP&lt;/span&gt; &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-Proto&lt;/span&gt; &lt;span class="nv"&gt;$scheme&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="c1"&gt;# Append Secure and HttpOnly flags to cookies set by the backend (this is not session persistence)&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_cookie_path&lt;/span&gt; &lt;span class="n"&gt;/&lt;/span&gt; &lt;span class="s"&gt;"/&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="kn"&gt;secure&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="kn"&gt;HttpOnly"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="c1"&gt;# Timeout settings for reliability&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_connect_timeout&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_send_timeout&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="kn"&gt;proxy_read_timeout&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Explanation of key directives&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;upstream backend_servers&lt;/code&gt; defines the pool of servers.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;least_conn&lt;/code&gt; activates the Least Connections algorithm.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;weight=3&lt;/code&gt; biases selection toward app-server-1, so it receives roughly three times as much traffic as app-server-3.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;max_fails=3 fail_timeout=30s&lt;/code&gt; marks a server as unavailable for 30 seconds once three attempts to reach it fail within a 30-second window.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;proxy_pass http://backend_servers&lt;/code&gt; forwards traffic to the chosen server.&lt;/li&gt;
&lt;li&gt;Header directives ensure the backend knows the original client information.&lt;/li&gt;
&lt;/ul&gt;
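
&lt;p&gt;To make the interaction between &lt;code&gt;least_conn&lt;/code&gt; and &lt;code&gt;weight&lt;/code&gt; more concrete, here is a minimal Python sketch of weighted least-connections selection. It is a simplified model and not NGINX's actual implementation; the &lt;code&gt;Backend&lt;/code&gt; class, the weights, and the connection counts are assumptions made for the example.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    weight: int = 1
    active: int = 0  # connections currently in flight


def pick_least_conn(backends: list[Backend]) -> Backend:
    # Choose the backend with the fewest active connections per unit of weight.
    return min(backends, key=lambda b: b.active / b.weight)


# Hypothetical pool state at the moment a new request arrives.
pool = [
    Backend("app-server-1", weight=3, active=6),
    Backend("app-server-2", weight=2, active=3),
    Backend("app-server-3", weight=1, active=2),
]
chosen = pick_least_conn(pool)
chosen.active += 1  # the new request now counts against that server
print(chosen.name)
```

&lt;p&gt;Dividing by the weight is what lets a larger server carry more open connections before it stops being the preferred target.&lt;/p&gt;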

&lt;h3&gt;
  
  
  Advanced Load Balancing Concepts
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Consistent Hashing&lt;/strong&gt; is often combined with &lt;strong&gt;load balancing&lt;/strong&gt; to minimize disruption when servers are added or removed. Instead of rehashing everything, only a small portion of traffic is affected.&lt;/p&gt;
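
&lt;p&gt;The effect described above can be sketched with a small hash ring. This is an illustrative Python model, assuming SHA-256 as the hash and 100 virtual nodes per server; the &lt;code&gt;HashRing&lt;/code&gt; class and the server names are hypothetical and not taken from any particular load balancer.&lt;/p&gt;

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._keys = []   # sorted hash positions on the ring
        self._owner = {}  # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.sha256(value.encode()).hexdigest(), 16)

    def add(self, node: str) -> None:
        # Each node is placed on the ring many times to even out the split.
        for i in range(self._vnodes):
            h = self._hash(f"{node}#{i}")
            bisect.insort(self._keys, h)
            self._owner[h] = node

    def get(self, key: str) -> str:
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[idx]]


ring = HashRing(["app-server-1", "app-server-2", "app-server-3"])
keys = [f"user-{n}" for n in range(1000)]
before = {k: ring.get(k) for k in keys}
ring.add("app-server-4")
after = {k: ring.get(k) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys moved to a different server")
```

&lt;p&gt;Only keys that now fall on the new server's ring positions change owner; every other key keeps its original server, which is the property that makes scaling events less disruptive.&lt;/p&gt;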

&lt;p&gt;&lt;strong&gt;Global Server Load Balancing (GSLB)&lt;/strong&gt; extends the concept across multiple data centers, using DNS-based routing (such as GeoDNS) or Anycast addressing to direct users to the nearest healthy region.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auto-scaling integration&lt;/strong&gt; allows the &lt;strong&gt;load balancer&lt;/strong&gt; to dynamically register new instances launched by Kubernetes Horizontal Pod Autoscaler or AWS Auto Scaling Groups.&lt;/p&gt;

&lt;p&gt;If you found this deep dive into &lt;strong&gt;load balancing&lt;/strong&gt; valuable and want to master the remaining 99 system design concepts with equally detailed explanations, code examples, and diagrams, grab the complete System Design eBook at &lt;a href="https://codewithdhanian.gumroad.com/l/urcjee" rel="noopener noreferrer"&gt;https://codewithdhanian.gumroad.com/l/urcjee&lt;/a&gt;. You can also support the creation of more high-quality technical content by buying me a coffee at &lt;a href="https://ko-fi.com/codewithdhanian" rel="noopener noreferrer"&gt;https://ko-fi.com/codewithdhanian&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55d08ru8tyxz54bg1a4r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55d08ru8tyxz54bg1a4r.png" alt="Understanding load balancing concepts" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Retry &amp; Exponential Backoff in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Sun, 05 Apr 2026 10:17:33 +0000</pubDate>
      <link>https://forem.com/code_2/retry-exponential-backoff-in-system-design-3ho6</link>
      <guid>https://forem.com/code_2/retry-exponential-backoff-in-system-design-3ho6</guid>
      <description>&lt;p&gt;In &lt;strong&gt;distributed systems&lt;/strong&gt; and &lt;strong&gt;microservices architectures&lt;/strong&gt;, transient failures are common. Network glitches, temporary service overloads, brief database contention, or momentary unavailability of third-party APIs frequently resolve themselves within seconds. The &lt;strong&gt;Retry&lt;/strong&gt; mechanism combined with &lt;strong&gt;Exponential Backoff&lt;/strong&gt; provides a fundamental resilience strategy that intelligently re-attempts failed operations instead of failing immediately. This pattern significantly improves overall system reliability and user experience by handling flaky conditions gracefully without overwhelming the failing service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retry &amp;amp; Exponential Backoff&lt;/strong&gt; forms one of the core building blocks of fault-tolerant design, often used alongside the &lt;strong&gt;Circuit Breaker Pattern&lt;/strong&gt;, &lt;strong&gt;timeouts&lt;/strong&gt;, &lt;strong&gt;idempotency&lt;/strong&gt;, and &lt;strong&gt;bulkhead isolation&lt;/strong&gt;. When implemented correctly, it reduces unnecessary errors while protecting downstream services from retry storms that could lead to cascading failures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Retry Mechanisms
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;retry&lt;/strong&gt; is simply the act of re-executing a failed operation after a short delay. Not every failure deserves a retry. Only &lt;strong&gt;idempotent&lt;/strong&gt; operations or those that are safe to repeat should be retried. Non-idempotent operations require careful handling, often through &lt;strong&gt;idempotency keys&lt;/strong&gt; or unique transaction identifiers to prevent duplicate effects.&lt;/p&gt;
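
&lt;p&gt;As a rough illustration of how an idempotency key makes a non-idempotent operation safe to retry, consider the sketch below. The &lt;code&gt;PaymentService&lt;/code&gt; class and its in-memory dictionary are assumptions for the example; a real service would typically keep the keys in durable storage.&lt;/p&gt;

```python
import uuid


class PaymentService:
    """Toy server that remembers responses by idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> stored response
        self.charges = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A repeated key returns the stored response instead of charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges += 1
        response = {"charge_id": self.charges, "amount_cents": amount_cents}
        self._results[idempotency_key] = response
        return response


service = PaymentService()
key = str(uuid.uuid4())  # generated once per logical operation, reused on retry
first = service.charge(key, 2500)
retry = service.charge(key, 2500)  # e.g. the client never saw the first response
print(first == retry, service.charges)
```

&lt;p&gt;The client creates the key once and sends the same value on every retry, so the server can tell a repeated request apart from a new one.&lt;/p&gt;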

&lt;p&gt;Common transient failure scenarios suitable for retries include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Network timeouts or connection resets&lt;/li&gt;
&lt;li&gt;HTTP 503 Service Unavailable or 429 Too Many Requests&lt;/li&gt;
&lt;li&gt;Temporary database deadlocks or lock contention&lt;/li&gt;
&lt;li&gt;Rate limiting responses from external services&lt;/li&gt;
&lt;li&gt;Brief unavailability during scaling events or deployments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Permanent failures such as validation errors (HTTP 400), authentication failures (401/403), or business logic errors should not trigger retries.&lt;/p&gt;
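
&lt;p&gt;One way to encode this distinction is a small classification helper. The sketch below uses only the status codes named above; the function names are hypothetical, and the &lt;code&gt;Retry-After&lt;/code&gt; handling covers only the delay-in-seconds form of that header, which is an assumption made to keep the example short.&lt;/p&gt;

```python
from typing import Mapping, Optional

RETRYABLE_STATUS = frozenset({429, 503})       # transient: worth another attempt
PERMANENT_STATUS = frozenset({400, 401, 403})  # retrying will not change the outcome


def should_retry(status_code: int) -> bool:
    return status_code in RETRYABLE_STATUS


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[float]:
    # Servers may send Retry-After with 429 or 503 responses.
    # The header can also carry an HTTP date, which this sketch ignores.
    value = headers.get("Retry-After", "").strip()
    return float(value) if value.isdigit() else None


print(should_retry(503), should_retry(400))
print(retry_after_seconds({"Retry-After": "7"}))
```

&lt;p&gt;When the server supplies a wait hint, honoring it is usually preferable to the client's own computed delay.&lt;/p&gt;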

&lt;h2&gt;
  
  
  Exponential Backoff Strategy
&lt;/h2&gt;

&lt;p&gt;Simple fixed-delay retries can create &lt;strong&gt;thundering herd&lt;/strong&gt; problems where many clients retry simultaneously, overwhelming the recovering service. &lt;strong&gt;Exponential Backoff&lt;/strong&gt; solves this by increasing the wait time between retries exponentially. The delay typically follows the formula:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;delay = base_delay × 2^retry_attempt&lt;/strong&gt;&lt;/p&gt;
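
&lt;p&gt;As a quick worked example of the formula, the snippet below lists the delays for the first five retries, assuming a base delay of 100 ms and a retry counter that starts at zero (both values are illustrative).&lt;/p&gt;

```python
def backoff_schedule(base_delay_ms: int, retries: int) -> list[int]:
    # delay = base_delay * 2^retry_attempt, with retry_attempt starting at 0
    return [base_delay_ms * 2 ** attempt for attempt in range(retries)]


schedule = backoff_schedule(100, 5)
print(schedule)       # [100, 200, 400, 800, 1600]
print(sum(schedule))  # 3100 ms of total waiting across the five retries
```

&lt;p&gt;Each delay is double the previous one, so the total wait grows quickly and a maximum cap becomes important once the retry count rises.&lt;/p&gt;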

&lt;p&gt;To prevent synchronization of retries across clients, &lt;strong&gt;jitter&lt;/strong&gt; (random variation) is added to the calculated delay.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full delay formula with jitter&lt;/strong&gt;:&lt;br&gt;
&lt;strong&gt;delay = min(cap, base_delay × 2^retry_attempt) + random(0, jitter)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Common variations include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Full Jitter&lt;/strong&gt;: Random delay between 0 and the computed exponential value&lt;/li&gt;
&lt;li&gt;
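&lt;strong&gt;Equal Jitter&lt;/strong&gt;: Half of the computed delay is kept fixed, and a random amount up to the other half is added&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decorrelated Jitter&lt;/strong&gt;: Each delay is drawn at random between the base delay and a multiple (commonly three times) of the previous delay, then capped&lt;/li&gt;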
&lt;/ul&gt;
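
&lt;p&gt;The three variations can be sketched as small functions. The formulas below follow the way these strategies are commonly described (for example in the AWS Architecture Blog post on backoff and jitter); the function names, parameter names, and the factor of three are assumptions for illustration.&lt;/p&gt;

```python
import random


def capped_backoff(base: float, cap: float, attempt: int) -> float:
    return min(cap, base * 2 ** attempt)


def full_jitter(base: float, cap: float, attempt: int, rng=random) -> float:
    # Anywhere between 0 and the capped exponential value.
    return rng.uniform(0, capped_backoff(base, cap, attempt))


def equal_jitter(base: float, cap: float, attempt: int, rng=random) -> float:
    # Half of the capped value is fixed; the other half is randomized.
    half = capped_backoff(base, cap, attempt) / 2
    return half + rng.uniform(0, half)


def decorrelated_jitter(base: float, cap: float, previous: float, rng=random) -> float:
    # Depends on the previous delay rather than on the attempt number.
    return min(cap, rng.uniform(base, previous * 3))


rng = random.Random(42)  # seeded only so the demo is repeatable
previous = 0.1
for attempt in range(5):
    previous = decorrelated_jitter(0.1, 10.0, previous, rng)
    print(
        attempt,
        round(full_jitter(0.1, 10.0, attempt, rng), 3),
        round(equal_jitter(0.1, 10.0, attempt, rng), 3),
        round(previous, 3),
    )
```

&lt;p&gt;Full jitter spreads clients most widely, while equal jitter guarantees a minimum wait; which trade-off is preferable depends on how sensitive the downstream service is to bursts.&lt;/p&gt;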

&lt;p&gt;&lt;strong&gt;Exponential Backoff with Jitter&lt;/strong&gt; dramatically improves system stability under load by spreading retry attempts over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Detailed Implementation of Retry with Exponential Backoff
&lt;/h2&gt;

&lt;p&gt;Production-grade implementations must handle concurrency safely, respect maximum retry limits, support different backoff strategies, and integrate with logging and monitoring.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pseudocode for Retry with Exponential Backoff
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class RetryWithBackoff {
    int maxAttempts;
    long baseDelayMs;
    long maxDelayMs;
    double jitterFactor;

    Object executeWithRetry(Callable operation) {
        Exception lastException = null;

        for (int attempt = 0; attempt &amp;lt; maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (TransientException e) {
                lastException = e;
                if (attempt == maxAttempts - 1) {
                    break;  // Final attempt failed
                }
                long delay = calculateDelay(attempt);
                sleep(delay);
            } catch (PermanentException e) {
                throw e;  // Do not retry
            }
        }
        throw lastException;  // Propagate after exhausting retries
    }

    private long calculateDelay(int attempt) {
        long exponentialDelay = baseDelayMs * (1L &amp;lt;&amp;lt; attempt);  // 2^attempt
        long cappedDelay = min(exponentialDelay, maxDelayMs);

        // Add proportional jitter on top of the capped delay (full jitter would instead pick random(0, cappedDelay))
        long jitter = random(0, (long)(cappedDelay * jitterFactor));
        return cappedDelay + jitter;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Complete Python Implementation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
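&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TransientError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_transient_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Illustrative policy (an assumption): retry connection errors, timeouts, and 429/503 responses
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HTTPError&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TransientError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ConnectionError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Timeout&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;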

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retry_with_exponential_backoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;max_attempts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;base_delay&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;# 100ms
&lt;/span&gt;    &lt;span class="n"&gt;max_delay&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;10.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;# 10 seconds
&lt;/span&gt;    &lt;span class="n"&gt;jitter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;backoff_factor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;wrapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;last_exception&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_attempts&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;last_exception&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;

                    &lt;span class="c1"&gt;# Check if error is transient (custom logic)
&lt;/span&gt;                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;is_transient_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                        &lt;span class="k"&gt;raise&lt;/span&gt;  &lt;span class="c1"&gt;# Permanent error - do not retry
&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;max_attempts&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="k"&gt;break&lt;/span&gt;  &lt;span class="c1"&gt;# Last attempt failed
&lt;/span&gt;
                    &lt;span class="c1"&gt;# Calculate exponential backoff
&lt;/span&gt;                    &lt;span class="n"&gt;delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base_delay&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backoff_factor&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_delay&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;jitter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;delay&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uniform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;delay&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 10% jitter
&lt;/span&gt;
                    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="c1"&gt;# Optional: log retry attempt
&lt;/span&gt;                    &lt;span class="c1"&gt;# logger.warning(f"Retry {attempt+1}/{max_attempts} after {delay:.2f}s")
&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;last_exception&lt;/span&gt;  &lt;span class="c1"&gt;# Re-raise after all retries exhausted
&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;wrapper&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;

&lt;span class="c1"&gt;# Example usage
&lt;/span&gt;&lt;span class="nd"&gt;@retry_with_exponential_backoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call_external_api&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Simulate network call that may fail transiently
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.example.com/users/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Java Conceptual Structure (Resilience4j Style)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;RetryConfig&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RetryConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;custom&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxAttempts&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;// The wait interval comes from intervalFunction alone; newer Resilience4j versions reject a config that also sets waitDuration&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;retryOnException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nc"&gt;TransientException&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;intervalFunction&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;IntervalFunction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofExponentialBackoff&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;Retry&lt;/span&gt; &lt;span class="n"&gt;retry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Retry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"externalService"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="nc"&gt;Callable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;retryableCall&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Retry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;decorateCallable&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retry&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;callExternalService&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Try&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofCallable&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retryableCall&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;recover&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;fallbackResponse&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These implementations demonstrate key elements: configurable attempt limits, proper classification of transient versus permanent errors, exponential delay calculation, jitter for load distribution, and clean separation of concerns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best Practices for Retry &amp;amp; Exponential Backoff
&lt;/h2&gt;

&lt;p&gt;Effective use of this pattern requires attention to several critical details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Idempotency&lt;/strong&gt;: Always ensure retried operations are idempotent or use &lt;strong&gt;idempotency keys&lt;/strong&gt; (unique request identifiers stored server-side) to prevent duplicate side effects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout Integration&lt;/strong&gt;: Combine retries with appropriate per-attempt timeouts to avoid hanging requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Circuit Breaker Synergy&lt;/strong&gt;: Use &lt;strong&gt;circuit breakers&lt;/strong&gt; to stop retries entirely when a service is confirmed unhealthy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring &amp;amp; Observability&lt;/strong&gt;: Track retry counts, success-after-retry rates, and backoff delays using tools like &lt;strong&gt;Prometheus&lt;/strong&gt; and &lt;strong&gt;Grafana&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum Delay Caps&lt;/strong&gt;: Prevent excessively long waits by capping delays.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client-Specific Backoff&lt;/strong&gt;: Different clients or services may need tailored backoff parameters based on their importance and load characteristics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid Retry Storms&lt;/strong&gt;: Jitter and randomized delays are essential in large-scale systems with thousands of instances.&lt;/li&gt;
&lt;/ul&gt;
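
&lt;p&gt;Several of these practices can be combined in one loop. The sketch below caps each delay and also enforces an overall deadline, so a caller never waits longer than its own time budget. It assumes the operation enforces its own per-attempt timeout and signals it with the built-in &lt;code&gt;TimeoutError&lt;/code&gt;; the function name and default values are hypothetical, and jitter is left out to keep the example short.&lt;/p&gt;

```python
import time


def retry_until_deadline(operation, deadline_s, base_delay=0.1, max_delay=2.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Retry `operation` with capped backoff, but never wait past the deadline."""
    start = clock()
    attempt = 0
    while True:
        try:
            return operation()
        except TimeoutError:
            delay = min(max_delay, base_delay * 2 ** attempt)
            remaining = deadline_s - (clock() - start)
            if delay >= remaining:
                raise  # the next attempt would start after the deadline
            sleep(delay)
            attempt += 1


calls = {"count": 0}


def flaky_call():
    # Stand-in for a request that enforces its own per-attempt timeout.
    calls["count"] += 1
    if 3 > calls["count"]:
        raise TimeoutError("attempt timed out")
    return "ok"


result = retry_until_deadline(flaky_call, deadline_s=1.0, base_delay=0.01)
print(result, calls["count"])
```

&lt;p&gt;Passing the clock and sleep functions as parameters is a design choice that keeps the loop easy to test without real waiting.&lt;/p&gt;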

&lt;p&gt;In &lt;strong&gt;event-driven architectures&lt;/strong&gt; using &lt;strong&gt;message queues&lt;/strong&gt; like &lt;strong&gt;Kafka&lt;/strong&gt; or &lt;strong&gt;RabbitMQ&lt;/strong&gt;, retries are often handled through dead-letter queues and delayed message redelivery rather than in-process loops.&lt;/p&gt;
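
&lt;p&gt;The queue-based approach can be modeled in a few lines. This is an in-memory sketch, not Kafka or RabbitMQ client code: the redelivery limit, the function name, and the handler are assumptions, and a real broker would also delay each redelivery instead of re-queuing immediately.&lt;/p&gt;

```python
from collections import deque

MAX_DELIVERIES = 3  # assumed redelivery limit before a message is parked


def process_with_dlq(messages, handler):
    queue = deque((msg, 1) for msg in messages)  # (payload, delivery count)
    dead_letters = []
    while queue:
        msg, deliveries = queue.popleft()
        try:
            handler(msg)
        except Exception:
            if deliveries >= MAX_DELIVERIES:
                dead_letters.append(msg)  # park it for later inspection
            else:
                queue.append((msg, deliveries + 1))  # redeliver later
    return dead_letters


handled = []


def handler(msg):
    if msg == "bad":
        raise ValueError("cannot process")
    handled.append(msg)


dead = process_with_dlq(["a", "bad", "b"], handler)
print(handled, dead)
```

&lt;p&gt;Because the failing message goes to the back of the queue, healthy messages keep flowing while it waits for its next delivery.&lt;/p&gt;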

&lt;h2&gt;
  
  
  Real-World Considerations
&lt;/h2&gt;

&lt;p&gt;In high-scale systems, &lt;strong&gt;Retry &amp;amp; Exponential Backoff&lt;/strong&gt; must be applied judiciously. Overly aggressive retries can still contribute to overload. Many modern service meshes (such as Istio) and API gateways provide built-in retry capabilities at the infrastructure layer, allowing application code to focus on business logic.&lt;/p&gt;

&lt;p&gt;The combination of &lt;strong&gt;Retry&lt;/strong&gt; with &lt;strong&gt;Exponential Backoff&lt;/strong&gt; remains one of the simplest yet most powerful techniques for improving resilience in &lt;strong&gt;distributed systems&lt;/strong&gt;. When paired with proper &lt;strong&gt;idempotency&lt;/strong&gt;, &lt;strong&gt;timeouts&lt;/strong&gt;, and &lt;strong&gt;circuit breakers&lt;/strong&gt;, it enables applications to withstand transient issues while maintaining high availability and responsive user experiences.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxi5lybmdphpwrqwimj57.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxi5lybmdphpwrqwimj57.png" alt="Retry with Exponential Backoff" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;System Design Handbook&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
For more in-depth insights and comprehensive coverage of system design topics, consider purchasing the System Design Handbook at &lt;a href="https://codewithdhanian.gumroad.com/l/ntmcf" rel="noopener noreferrer"&gt;https://codewithdhanian.gumroad.com/l/ntmcf&lt;/a&gt;. It will equip you with the knowledge to master complex distributed systems.  &lt;/p&gt;

&lt;p&gt;Buy me coffee to support my content at: &lt;a href="https://ko-fi.com/codewithdhanian" rel="noopener noreferrer"&gt;https://ko-fi.com/codewithdhanian&lt;/a&gt;&lt;/p&gt;

</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Circuit Breaker Pattern in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Fri, 03 Apr 2026 08:07:13 +0000</pubDate>
      <link>https://forem.com/code_2/circuit-breaker-pattern-in-system-design-4l24</link>
      <guid>https://forem.com/code_2/circuit-breaker-pattern-in-system-design-4l24</guid>
      <description>&lt;p&gt;In &lt;strong&gt;distributed systems&lt;/strong&gt; and &lt;strong&gt;microservices architectures&lt;/strong&gt;, failures are inevitable. Network latency, service overload, database slowdowns, or third-party API outages can quickly cascade into widespread system instability. The &lt;strong&gt;Circuit Breaker Pattern&lt;/strong&gt; serves as a critical &lt;strong&gt;resilience&lt;/strong&gt; mechanism that prevents these cascading failures by intelligently isolating faulty components. Inspired by electrical circuit breakers that interrupt current flow during overloads, the software version acts as a protective proxy around remote calls, allowing systems to fail fast, degrade gracefully, and recover automatically.&lt;/p&gt;

&lt;p&gt;This pattern is essential for building robust, highly available applications where services depend on each other across network boundaries. By monitoring failure rates and response times, the &lt;strong&gt;circuit breaker&lt;/strong&gt; stops repeated attempts to reach an unhealthy service, giving it time to recover while providing immediate feedback or fallback responses to callers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Circuit Breaker Pattern
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Circuit Breaker Pattern&lt;/strong&gt; functions as a stateful wrapper around operations that interact with external services or resources. Instead of allowing every request to reach a failing downstream service—which could overwhelm it further and degrade the entire system—the &lt;strong&gt;circuit breaker&lt;/strong&gt; tracks metrics such as error counts, latency, or exceptions. When failure thresholds are breached, it “trips” and redirects traffic away from the problematic service.&lt;/p&gt;

&lt;p&gt;Key benefits include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prevention of &lt;strong&gt;cascading failures&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Reduction in resource consumption on both caller and callee sides&lt;/li&gt;
&lt;li&gt;Faster response times through immediate failure detection&lt;/li&gt;
&lt;li&gt;Graceful degradation via &lt;strong&gt;fallback mechanisms&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Automatic recovery without manual intervention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern works best when combined with complementary techniques such as &lt;strong&gt;retries with exponential backoff&lt;/strong&gt;, &lt;strong&gt;timeouts&lt;/strong&gt;, &lt;strong&gt;rate limiting&lt;/strong&gt;, and &lt;strong&gt;bulkhead isolation&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three States of a Circuit Breaker
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;circuit breaker&lt;/strong&gt; maintains one of three distinct states, each dictating how incoming requests are handled. These states form a finite state machine that transitions based on observed behavior and configurable thresholds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Closed State&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
This is the normal operating state. All requests pass through to the protected service. The &lt;strong&gt;circuit breaker&lt;/strong&gt; monitors outcomes, counting failures within a sliding time window or consecutive failure count. If the failure rate or count exceeds a predefined threshold (for example, 50% errors in the last 10 seconds or 5 consecutive failures), the breaker transitions to the &lt;strong&gt;Open&lt;/strong&gt; state. Successes reset or decrement failure counters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Open State&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
When the circuit is &lt;strong&gt;open&lt;/strong&gt;, the breaker immediately rejects all requests without forwarding them to the downstream service. This prevents further load on the failing component and avoids long timeouts or resource exhaustion. Instead, the caller receives an immediate exception or a &lt;strong&gt;fallback&lt;/strong&gt; response. A timeout timer (reset timeout) starts, after which the breaker moves to the &lt;strong&gt;Half-Open&lt;/strong&gt; state to test recovery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Half-Open State&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
This transitional state allows a limited number of probe requests (often just one, or a small configurable count) to reach the service. If the probes succeed, the &lt;strong&gt;circuit breaker&lt;/strong&gt; assumes recovery and returns to the &lt;strong&gt;Closed&lt;/strong&gt; state, resetting failure counters. If any probe fails, the breaker reverts to the &lt;strong&gt;Open&lt;/strong&gt; state and restarts the reset timeout. This cautious probing checks that the service has stabilized before full traffic resumes.&lt;/p&gt;

&lt;p&gt;These state transitions enable self-healing while protecting system stability.&lt;/p&gt;
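&lt;p&gt;The Closed-state rule based on a failure rate over a sliding time window can be sketched as follows. This is a minimal illustration, not how any specific library does it: the class name &lt;code&gt;SlidingWindowFailureRate&lt;/code&gt;, the &lt;code&gt;min_calls&lt;/code&gt; guard, and the injectable &lt;code&gt;clock&lt;/code&gt; are assumptions added for the example.&lt;/p&gt;

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class SlidingWindowFailureRate:
    """Tracks call outcomes over the last window_seconds and reports the failure rate."""

    def __init__(self, window_seconds: float = 10.0, threshold: float = 0.5,
                 min_calls: int = 5, clock: Callable[[], float] = time.monotonic):
        self.window_seconds = window_seconds
        self.threshold = threshold    # 0.5 means 50% failures
        self.min_calls = min_calls    # assumption: do not trip on a tiny sample
        self.clock = clock
        self._events: Deque[Tuple[float, bool]] = deque()  # (timestamp, failed)

    def record(self, failed: bool) -> None:
        self._events.append((self.clock(), failed))
        self._evict()

    def _evict(self) -> None:
        # Drop outcomes that are older than the window.
        cutoff = self.clock() - self.window_seconds
        while self._events and cutoff > self._events[0][0]:
            self._events.popleft()

    def failure_rate(self) -> float:
        self._evict()
        if not self._events:
            return 0.0
        failures = sum(1 for _, failed in self._events if failed)
        return failures / len(self._events)

    def should_open(self) -> bool:
        self._evict()
        enough_calls = len(self._events) >= self.min_calls
        return enough_calls and self.failure_rate() >= self.threshold
```

&lt;p&gt;A breaker in the Closed state would call &lt;code&gt;record&lt;/code&gt; after every request and move to Open when &lt;code&gt;should_open&lt;/code&gt; returns true.&lt;/p&gt;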

&lt;h2&gt;
  
  
  Detailed Implementation of the Circuit Breaker Pattern
&lt;/h2&gt;

&lt;p&gt;Implementing a &lt;strong&gt;circuit breaker&lt;/strong&gt; from scratch requires careful handling of concurrency, metrics tracking, and state persistence. In production, developers typically use battle-tested libraries such as &lt;strong&gt;Resilience4j&lt;/strong&gt; (Java), &lt;strong&gt;Hystrix&lt;/strong&gt; (legacy Java), &lt;strong&gt;Polly&lt;/strong&gt; (.NET), or &lt;strong&gt;pybreaker&lt;/strong&gt; (Python). Below are illustrative code structures rather than production-ready implementations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pseudocode for a Generic Circuit Breaker
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    State currentState = CLOSED;
    int failureCount = 0;
    int successCount = 0;
    long lastFailureTime = 0;
    Configuration config;  // failureThreshold, timeout, successThreshold, etc.

    Object execute(Callable operation) {
        if (currentState == OPEN) {
            if (isTimeoutExpired()) {
                transitionTo(HALF_OPEN);
            } else {
                return invokeFallback();  // or throw CircuitOpenException
            }
        }

        try {
            Object result = operation.call();
            onSuccess();
            return result;
        } catch (Exception e) {
            onFailure(e);
            return invokeFallback();
        }
    }

    private void onSuccess() {
        failureCount = 0;
        successCount++;
        if (currentState == HALF_OPEN &amp;amp;&amp;amp; successCount &amp;gt;= config.successThreshold) {
            transitionTo(CLOSED);
        }
    }

    private void onFailure(Exception e) {
        failureCount++;
        lastFailureTime = currentTime();
        if (failureCount &amp;gt;= config.failureThreshold || currentState == HALF_OPEN) {
            transitionTo(OPEN);
        }
    }

    private boolean isTimeoutExpired() {
        return (currentTime() - lastFailureTime) &amp;gt; config.resetTimeout;
    }

    private void transitionTo(State newState) {
        currentState = newState;
        // Log state change, notify monitoring system
        if (newState == HALF_OPEN) {
            successCount = 0;
        }
    }

    private Object invokeFallback() {
        // Execute fallback logic, e.g., return cached data or default value
        return defaultResponse();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Java Example Using Resilience4j Style (Conceptual Full Structure)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Configuration&lt;/span&gt;
&lt;span class="nc"&gt;CircuitBreakerConfig&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CircuitBreakerConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;custom&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;failureRateThreshold&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;           &lt;span class="c1"&gt;// Open if failure rate &amp;gt;= 50%&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;waitDurationInOpenState&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofSeconds&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;// Reset timeout&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;permittedNumberOfCallsInHalfOpenState&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;         &lt;span class="c1"&gt;// Test calls&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;slidingWindowSize&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;              &lt;span class="c1"&gt;// Window for metrics&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;CircuitBreaker&lt;/span&gt; &lt;span class="n"&gt;circuitBreaker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CircuitBreaker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"paymentService"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Decorator usage&lt;/span&gt;
&lt;span class="nc"&gt;Supplier&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;decoratedSupplier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CircuitBreaker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;decorateSupplier&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;circuitBreaker&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; 
    &lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;callPaymentService&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;// remote call&lt;/span&gt;
&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// With fallback (Try comes from the Vavr library)&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Try&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofSupplier&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;decoratedSupplier&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;recover&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;throwable&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;fallbackPaymentResponse&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Python Example Using a Simple Custom Implementation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CircuitState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;CLOSED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;closed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;OPEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;open&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;HALF_OPEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;half_open&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CircuitBreaker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;failure_threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reset_timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;success_threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CircuitState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CLOSED&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failure_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;failure_threshold&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reset_timeout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reset_timeout&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;success_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;success_threshold&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failure_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;success_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_failure_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;CircuitState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_failure_time&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reset_timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CircuitState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HALF_OPEN&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;success_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;CircuitBreakerOpenException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Circuit breaker is OPEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_on_success&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_on_failure&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt;  &lt;span class="c1"&gt;# re-raise the original error, or handle with a fallback
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_on_success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failure_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;success_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;CircuitState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HALF_OPEN&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;success_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;success_threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CircuitState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CLOSED&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_on_failure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failure_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_failure_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failure_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failure_threshold&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;CircuitState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HALF_OPEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CircuitState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OPEN&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CircuitBreakerOpenException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These implementations highlight the essential elements: configurable thresholds, state management, fallback execution, and safe transitions. Neither custom version above is thread-safe as written; in real systems, state changes must be guarded with locks or atomic operations, and state transitions should be exported to monitoring tools such as &lt;strong&gt;Prometheus&lt;/strong&gt;.&lt;/p&gt;
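&lt;p&gt;The thread-safety point can be sketched with a single lock around every state change. This is a simplified, hypothetical variant (the class name &lt;code&gt;ThreadSafeBreaker&lt;/code&gt;, the single-probe rule, and the injectable &lt;code&gt;clock&lt;/code&gt; are assumptions for the example); it closes again after one successful probe and always returns the fallback on failure.&lt;/p&gt;

```python
import threading
import time
from typing import Any, Callable


class ThreadSafeBreaker:
    """Consecutive-failure breaker whose state changes are guarded by one lock."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._lock = threading.Lock()
        self._state = "closed"
        self._failures = 0
        self._opened_at = 0.0

    def _allow(self) -> bool:
        with self._lock:
            if self._state == "open":
                if self._clock() - self._opened_at >= self.reset_timeout:
                    self._state = "half_open"  # this caller becomes the single probe
                    return True
                return False
            if self._state == "half_open":
                return False                   # a probe is already in flight
            return True

    def _record(self, ok: bool) -> None:
        with self._lock:
            if ok:
                self._state, self._failures = "closed", 0
                return
            self._failures += 1
            if self._state == "half_open" or self._failures >= self.failure_threshold:
                self._state, self._opened_at = "open", self._clock()

    def call(self, func: Callable[[], Any], fallback: Callable[[], Any]) -> Any:
        if not self._allow():
            return fallback()
        try:
            result = func()  # the remote call itself runs outside the lock
        except Exception:
            self._record(False)
            return fallback()
        self._record(True)
        return result
```

&lt;p&gt;Holding the lock only while reading or updating state, and never during the remote call, keeps slow downstream calls from blocking other threads.&lt;/p&gt;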

&lt;h2&gt;
  
  
  When and How to Use the Circuit Breaker Pattern
&lt;/h2&gt;

&lt;p&gt;Apply the &lt;strong&gt;Circuit Breaker Pattern&lt;/strong&gt; to any synchronous or asynchronous call to external services, databases, or third-party APIs where failure could propagate. Common scenarios include &lt;strong&gt;microservices&lt;/strong&gt; communication, payment gateways, inventory checks, or recommendation engines.&lt;/p&gt;

&lt;p&gt;Best practices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Combine with &lt;strong&gt;timeouts&lt;/strong&gt; to avoid indefinite waits.&lt;/li&gt;
&lt;li&gt;Implement meaningful &lt;strong&gt;fallbacks&lt;/strong&gt;—cached data, default values, or queued operations.&lt;/li&gt;
&lt;li&gt;Monitor state transitions and metrics for observability.&lt;/li&gt;
&lt;li&gt;Tune thresholds based on service characteristics and traffic patterns.&lt;/li&gt;
&lt;li&gt;Ensure &lt;strong&gt;idempotency&lt;/strong&gt; for operations that may be retried.&lt;/li&gt;
&lt;/ul&gt;
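&lt;p&gt;The first two practices, timeouts and fallbacks, can be combined in a small helper. The sketch below uses Python's standard &lt;code&gt;concurrent.futures&lt;/code&gt; module; the helper name &lt;code&gt;call_with_timeout&lt;/code&gt; and the pool size are assumptions for illustration.&lt;/p&gt;

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable

# Shared worker pool; the size of 4 is an arbitrary choice for this sketch.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(func: Callable[[], Any], timeout_s: float,
                      fallback: Callable[[], Any]) -> Any:
    """Run func on a worker thread; return fallback() if it is too slow or raises."""
    future = _pool.submit(func)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        future.cancel()  # only has an effect if the task has not started yet
        return fallback()
    except Exception:
        return fallback()
```

&lt;p&gt;Note that a timed-out call keeps running on its worker thread, so the remote client should also set its own network timeout where it supports one.&lt;/p&gt;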

&lt;p&gt;The pattern shines in high-traffic environments but adds slight overhead in normal operation due to metric collection. For extremely latency-sensitive paths, evaluate whether the protection justifies the cost.&lt;/p&gt;

&lt;p&gt;Mastering the &lt;strong&gt;Circuit Breaker Pattern&lt;/strong&gt; equips system designers with a powerful tool to build resilient, fault-tolerant distributed systems that maintain availability even when individual components fail.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmb2zr7xrho642c7ufv56.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmb2zr7xrho642c7ufv56.png" alt="Circuit breaker pattern diagram in design" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;


</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Rate Limiting &amp; Throttling in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Fri, 03 Apr 2026 07:54:26 +0000</pubDate>
      <link>https://forem.com/code_2/rate-limiting-throttling-in-system-design-3o83</link>
      <guid>https://forem.com/code_2/rate-limiting-throttling-in-system-design-3o83</guid>
      <description>&lt;p&gt;In large-scale &lt;strong&gt;distributed systems&lt;/strong&gt; and &lt;strong&gt;microservices&lt;/strong&gt; architectures, uncontrolled incoming traffic can quickly lead to resource exhaustion, degraded performance, or complete service outages. &lt;strong&gt;Rate limiting&lt;/strong&gt; and &lt;strong&gt;throttling&lt;/strong&gt; serve as critical defensive mechanisms that protect backend services, ensure fair usage among clients, prevent abuse, and maintain overall system stability under varying load conditions. These techniques control the flow of requests to APIs, databases, or other resources, allowing systems to operate reliably even during traffic spikes or malicious attacks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Rate Limiting
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Rate limiting&lt;/strong&gt; is a technique that enforces a strict upper bound on the number of requests a client, user, IP address, or API key can make within a defined time window. The primary goals include protecting against &lt;strong&gt;DDoS attacks&lt;/strong&gt;, ensuring &lt;strong&gt;fair resource allocation&lt;/strong&gt;, enforcing &lt;strong&gt;business quotas&lt;/strong&gt;, and preventing any single client from monopolizing shared resources.&lt;/p&gt;

&lt;p&gt;When a request exceeds the allowed limit, the system typically rejects it immediately and returns an &lt;strong&gt;HTTP 429 Too Many Requests&lt;/strong&gt; status code, often accompanied by headers such as &lt;strong&gt;Retry-After&lt;/strong&gt; to inform the client when it may retry.&lt;/p&gt;
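&lt;p&gt;A rejection of this kind might be built as shown below. This is a framework-neutral sketch: the function name &lt;code&gt;too_many_requests&lt;/code&gt;, the JSON body, and the choice to round up to at least one second are assumptions for the example.&lt;/p&gt;

```python
import math
from typing import Dict, Tuple


def too_many_requests(seconds_until_reset: float) -> Tuple[int, Dict[str, str], str]:
    """Build a (status, headers, body) triple for a rejected request."""
    # Retry-After carries whole seconds here; round up and never send 0.
    retry_after = max(1, math.ceil(seconds_until_reset))
    headers = {"Retry-After": str(retry_after), "Content-Type": "application/json"}
    body = '{"error": "rate limit exceeded"}'
    return 429, headers, body
```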

&lt;p&gt;&lt;strong&gt;Rate limiting&lt;/strong&gt; operates at multiple layers: at the &lt;strong&gt;API gateway&lt;/strong&gt;, within individual &lt;strong&gt;microservices&lt;/strong&gt;, at the &lt;strong&gt;load balancer&lt;/strong&gt;, or even at the &lt;strong&gt;edge&lt;/strong&gt; using &lt;strong&gt;content delivery networks&lt;/strong&gt;. In &lt;strong&gt;distributed environments&lt;/strong&gt;, the rate limiter must maintain consistent state across multiple nodes, typically using a centralized store such as &lt;strong&gt;Redis&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Throttling
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Throttling&lt;/strong&gt; differs from &lt;strong&gt;rate limiting&lt;/strong&gt; by focusing on controlling the processing speed or flow of requests rather than imposing a hard rejection limit. Instead of outright denying excess requests, &lt;strong&gt;throttling&lt;/strong&gt; slows down, queues, or paces the handling of requests to maintain a steady load on the system.&lt;/p&gt;

&lt;p&gt;While &lt;strong&gt;rate limiting&lt;/strong&gt; answers the question “Is this request allowed?”, &lt;strong&gt;throttling&lt;/strong&gt; addresses “How fast should this request be processed?”. &lt;strong&gt;Throttling&lt;/strong&gt; is particularly useful for smoothing bursty traffic, protecting downstream services with their own rate limits, or gracefully handling temporary overload without dropping legitimate requests.&lt;/p&gt;

&lt;p&gt;Common &lt;strong&gt;throttling&lt;/strong&gt; strategies include introducing artificial delays, queuing requests in &lt;strong&gt;message queues&lt;/strong&gt;, or dynamically reducing the processing rate based on current system metrics such as CPU usage or queue length.&lt;/p&gt;
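&lt;p&gt;The artificial-delay strategy can be sketched as a pacer that spaces calls evenly instead of rejecting them. The class name &lt;code&gt;Pacer&lt;/code&gt; and the injectable &lt;code&gt;clock&lt;/code&gt; and &lt;code&gt;sleep&lt;/code&gt; parameters are assumptions for this example, and the sketch is single-threaded for brevity.&lt;/p&gt;

```python
import time
from typing import Callable


class Pacer:
    """Spaces calls at least 1/rate seconds apart by delaying, never rejecting."""

    def __init__(self, rate_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self.interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_slot = 0.0  # earliest time the next call may start

    def wait_turn(self) -> float:
        """Block until this call's slot arrives; return the delay that was applied."""
        now = self._clock()
        start = max(now, self._next_slot)
        self._next_slot = start + self.interval
        delay = start - now
        if delay > 0:
            self._sleep(delay)
        return delay
```

&lt;p&gt;A worker would call &lt;code&gt;wait_turn()&lt;/code&gt; before each downstream request, so a burst of arrivals is stretched out to the configured rate at the cost of added latency.&lt;/p&gt;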

&lt;h2&gt;
  
  
  Key Differences Between Rate Limiting and Throttling
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Rate limiting&lt;/strong&gt; provides a hard cap and immediate rejection for excess requests, making it ideal for quota enforcement and abuse prevention. &lt;strong&gt;Throttling&lt;/strong&gt; prioritizes smoothing traffic and improving user experience by avoiding abrupt denials, often at the cost of increased latency for some requests. Many production systems combine both: &lt;strong&gt;rate limiting&lt;/strong&gt; at the entry point for protection and &lt;strong&gt;throttling&lt;/strong&gt; internally for traffic shaping.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Rate Limiting Algorithms
&lt;/h2&gt;

&lt;p&gt;Several well-established algorithms exist for implementing &lt;strong&gt;rate limiting&lt;/strong&gt;, each offering different trade-offs in terms of burst tolerance, accuracy, memory usage, and implementation complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Token Bucket Algorithm
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;token bucket&lt;/strong&gt; algorithm is one of the most widely adopted approaches due to its flexibility and ability to handle controlled bursts. It models capacity as a bucket that accumulates &lt;strong&gt;tokens&lt;/strong&gt; at a constant refill rate up to a maximum capacity. Each incoming request consumes one token. If tokens are available, the request is allowed; otherwise, it is rejected.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key parameters&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Refill rate&lt;/strong&gt; (r): Tokens added per unit time (e.g., 10 tokens per second).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bucket capacity&lt;/strong&gt; (b): Maximum number of tokens the bucket can hold, determining burst size.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This algorithm allows short bursts up to the bucket capacity while enforcing the long-term average rate. It is particularly suitable for public APIs where users may send occasional bursts of requests after periods of inactivity.&lt;/p&gt;
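&lt;p&gt;Before looking at a distributed version, the refill rule and burst behavior can be seen in a minimal in-process sketch. The class name &lt;code&gt;TokenBucket&lt;/code&gt; and the injectable &lt;code&gt;clock&lt;/code&gt; are assumptions for this example; it keeps fractional tokens and is not thread-safe.&lt;/p&gt;

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory token bucket for a single process."""

    def __init__(self, refill_rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.refill_rate = refill_rate  # r: tokens added per second
        self.capacity = capacity        # b: maximum tokens, which sets the burst size
        self._clock = clock
        self._tokens = capacity         # start full so an idle client can burst
        self._last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend cost tokens."""
        now = self._clock()
        elapsed = now - self._last_refill
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._last_refill = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

&lt;p&gt;With a refill rate of 10 and a capacity of 3, three back-to-back requests pass, the fourth is rejected, and the bucket is full again after 0.3 seconds of inactivity.&lt;/p&gt;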

&lt;h4&gt;
  
  
  Complete Token Bucket Implementation Example Using Redis (Lua Script for Atomicity)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Token Bucket Lua Script for Redis&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;                  &lt;span class="c1"&gt;-- e.g., "rate:limit:user:123"&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;        &lt;span class="c1"&gt;-- current timestamp in seconds&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;refill_rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="c1"&gt;-- tokens per second&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;capacity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;   &lt;span class="c1"&gt;-- max bucket size&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;tokens_requested&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="c1"&gt;-- Get current tokens and last refill time&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;last_refill&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"HGET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"last_refill"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"HGET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"tokens"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;-- Calculate new tokens to add&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;last_refill&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;new_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;math.floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;refill_rate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;math.min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;new_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;-- Check if enough tokens available&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;tokens_requested&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;tokens_requested&lt;/span&gt;
    &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"HSET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"tokens"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"HSET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"last_refill"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"EXPIRE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;-- expire after 1 hour for cleanup&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;               &lt;span class="c1"&gt;-- allowed, remaining tokens&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;               &lt;span class="c1"&gt;-- rejected, remaining tokens&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Redis executes a &lt;strong&gt;Lua script&lt;/strong&gt; atomically: no other command runs between the script's reads and writes, so concurrent requests for the same key cannot both spend the same tokens. Clients invoke the script with &lt;strong&gt;EVAL&lt;/strong&gt;, or with &lt;strong&gt;EVALSHA&lt;/strong&gt; once the script has been loaded and cached on the server.&lt;/p&gt;

&lt;h3&gt;
  
  
  Leaky Bucket Algorithm
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;leaky bucket&lt;/strong&gt; algorithm treats requests as water pouring into a bucket with a small hole at the bottom. Requests enter the bucket and are processed (leaked) at a constant fixed rate. If the bucket overflows, incoming requests are rejected or queued.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Leaky bucket&lt;/strong&gt; excels at smoothing traffic to a steady output rate, making it ideal for scenarios requiring predictable load, such as payment processing or integration with external services that have strict rate limits. Unlike &lt;strong&gt;token bucket&lt;/strong&gt;, it does not permit large bursts; excess requests are either delayed or dropped.&lt;/p&gt;
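
&lt;p&gt;The behavior described above can be sketched in a few lines. This is a minimal single-process illustration, not a production limiter: it uses the leaky bucket as a meter (rejecting on overflow rather than queueing), and the class name &lt;code&gt;LeakyBucket&lt;/code&gt;, its parameters, and the explicit &lt;code&gt;now&lt;/code&gt; argument are assumptions made for the example.&lt;/p&gt;

```python
class LeakyBucket:
    """Leaky bucket used as a meter: the level drains at a constant rate."""

    def __init__(self, capacity, leak_rate_per_sec, now=0.0):
        self.capacity = capacity              # how much the bucket can hold
        self.leak_rate = leak_rate_per_sec    # units drained per second
        self.level = 0.0                      # current amount of "water"
        self.last_checked = now

    def allow(self, now):
        # Drain whatever leaked out since the previous call.
        elapsed = now - self.last_checked
        self.level = max(0.0, self.level - elapsed * self.leak_rate)
        self.last_checked = now
        # Admit the request only if one more unit still fits.
        if self.capacity >= self.level + 1:
            self.level += 1
            return True
        return False


bucket = LeakyBucket(capacity=2, leak_rate_per_sec=1.0)
# Three requests at t=0 and one at t=1: the third overflows, the fourth fits
# again because one unit has leaked out in the meantime.
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

&lt;p&gt;Passing the timestamp in explicitly keeps the sketch deterministic; a real limiter would read a monotonic clock instead.&lt;/p&gt;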

&lt;h3&gt;
  
  
  Fixed Window Counter Algorithm
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;fixed window&lt;/strong&gt; algorithm divides time into fixed intervals (e.g., one minute or one hour) and counts the number of requests within each window. A per-window counter is incremented on every request. Once the counter exceeds the limit for the current window, further requests are rejected until the next window begins.&lt;/p&gt;

&lt;p&gt;This approach is simple and memory-efficient but suffers from the &lt;strong&gt;boundary burst problem&lt;/strong&gt;: a client can send up to twice the limit in a short span that straddles a window edge (e.g., with a limit of 100 per minute, 100 requests in the last second of one minute and another 100 in the first second of the next are all allowed).&lt;/p&gt;

&lt;h4&gt;
  
  
  Simple Fixed Window Pseudocode
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
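&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function isAllowed(clientId, limit, windowSeconds):
    // Every timestamp inside the same window maps to the same integer index
    currentWindow = floor(currentTime / windowSeconds)
    counterKey = "rate:" + clientId + ":" + currentWindow
    count = redis.INCR(counterKey)    // atomic; creates the key at 1 if missing
    if count == 1:
        // First request in this window: let the key expire with the window.
        // INCR and EXPIRE are separate commands, so a crash between them leaves
        // a key that never expires; a Lua script or MULTI/EXEC avoids that.
        redis.EXPIRE(counterKey, windowSeconds)
    return count &amp;lt;= limit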
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Sliding Window Algorithms
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Sliding window&lt;/strong&gt; approaches provide higher accuracy by using a continuously moving time frame instead of rigid boundaries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sliding Window Log&lt;/strong&gt;: Maintains a sorted list or set of timestamps for every request made by a client within the window. On each request, remove old timestamps outside the window and check if the remaining count is below the limit. This offers precise control but consumes significant memory for high-traffic clients.&lt;/p&gt;
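
&lt;p&gt;As a rough illustration of the log variant, the sketch below keeps one timestamp per accepted request in a deque. It is a single-process example under assumed names (&lt;code&gt;SlidingWindowLog&lt;/code&gt;, &lt;code&gt;allow&lt;/code&gt;); a distributed version would more likely keep the timestamps in a shared store such as a Redis sorted set.&lt;/p&gt;

```python
from collections import deque


class SlidingWindowLog:
    """Keeps the timestamp of every accepted request inside the window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = deque()   # oldest timestamp on the left

    def allow(self, now):
        # Drop timestamps that have fallen out of the window (age >= window).
        cutoff = now - self.window
        while self.timestamps and cutoff >= self.timestamps[0]:
            self.timestamps.popleft()
        # Accept only while fewer than `limit` requests remain in the window.
        if self.limit > len(self.timestamps):
            self.timestamps.append(now)
            return True
        return False


log = SlidingWindowLog(limit=2, window_seconds=10)
results = [log.allow(0), log.allow(1), log.allow(2), log.allow(10), log.allow(10.5), log.allow(11)]
```

&lt;p&gt;Memory grows with the number of accepted requests per window, which is the cost the paragraph above refers to.&lt;/p&gt;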

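&lt;p&gt;&lt;strong&gt;Sliding Window Counter&lt;/strong&gt;: A hybrid that keeps fixed-window counters but weights them to approximate a sliding window. It tracks the counts for the current and previous windows and estimates the sliding count as the current count plus the previous count multiplied by the fraction of the previous window that still overlaps the sliding period. The estimate assumes requests in the previous window were evenly spread, so it is an approximation, but it needs only two counters per client.&lt;/p&gt;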
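
&lt;p&gt;A small sketch may make the weighting concrete. The class and method names here are hypothetical, and the code assumes a single process with timestamps passed in explicitly; it is meant to show the arithmetic, not a finished limiter.&lt;/p&gt;

```python
class SlidingWindowCounter:
    """Approximates a sliding window from two fixed-window counters."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.current_index = 0      # index of the fixed window being counted
        self.current_count = 0
        self.previous_count = 0

    def _roll(self, now):
        index = int(now // self.window)
        if index > self.current_index:
            # The old "current" window becomes "previous" only if it is adjacent.
            adjacent = index == self.current_index + 1
            self.previous_count = self.current_count if adjacent else 0
            self.current_count = 0
            self.current_index = index

    def allow(self, now):
        self._roll(now)
        elapsed_fraction = (now % self.window) / self.window
        # Weight the previous window by the share that still overlaps.
        estimate = self.current_count + self.previous_count * (1.0 - elapsed_fraction)
        if self.limit >= estimate + 1:
            self.current_count += 1
            return True
        return False


limiter = SlidingWindowCounter(limit=10, window_seconds=60)
first_window = [limiter.allow(30) for _ in range(11)]   # 10 accepted, 1 rejected
# At t=75 a quarter of the new window has elapsed, so the previous count of 10
# contributes 10 * 0.75 = 7.5 and only two more requests fit under the limit.
second_window = [limiter.allow(75) for _ in range(3)]
```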

&lt;h2&gt;
  
  
  Distributed Rate Limiting Considerations
&lt;/h2&gt;

&lt;p&gt;In &lt;strong&gt;microservices&lt;/strong&gt; or multi-node deployments, a single in-memory rate limiter is insufficient. Designers must ensure consistency across instances using a shared &lt;strong&gt;distributed cache&lt;/strong&gt; such as &lt;strong&gt;Redis&lt;/strong&gt;, &lt;strong&gt;Memcached&lt;/strong&gt;, or a dedicated rate-limiting service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consistent hashing&lt;/strong&gt; can route requests for the same client to the same shard, while &lt;strong&gt;Lua scripts&lt;/strong&gt; or atomic operations guarantee correctness under concurrency. For extremely high scale, consider &lt;strong&gt;Redis Cluster&lt;/strong&gt;, which shards keys across nodes, combined with local caching for hot clients.&lt;/p&gt;
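
&lt;p&gt;To show how a client ID can be mapped to a shard, here is a minimal hash ring with virtual nodes. The node names (&lt;code&gt;redis-a&lt;/code&gt; and so on), the replica count, and the choice of SHA-256 are illustrative assumptions, not details taken from a particular deployment.&lt;/p&gt;

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring: each node owns many points on the ring."""

    def __init__(self, nodes, replicas=100):
        points = []
        for node in nodes:
            for i in range(replicas):
                points.append((self._hash(f"{node}#{i}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._nodes = [n for _, n in points]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, client_id):
        # The first point clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._hashes, self._hash(client_id))
        return self._nodes[idx % len(self._nodes)]


ring = HashRing(["redis-a", "redis-b", "redis-c"])
shard = ring.node_for("client-42")   # the same client always maps to the same shard
```

&lt;p&gt;The useful property is that adding a node only takes keys away from existing nodes; keys never shuffle between the nodes that were already present.&lt;/p&gt;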

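&lt;p&gt;Proper &lt;strong&gt;error handling&lt;/strong&gt; matters as much as the limit itself: rejected requests typically receive HTTP 429 (Too Many Requests), and responses should carry clear &lt;strong&gt;rate limit headers&lt;/strong&gt; (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) so that clients can adjust their behavior gracefully. Making retried operations &lt;strong&gt;idempotent&lt;/strong&gt; keeps those client retries safe.&lt;/p&gt;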
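
&lt;p&gt;One way to assemble these headers is sketched below. The function name and parameters are hypothetical, and the sketch assumes &lt;code&gt;X-RateLimit-Reset&lt;/code&gt; carries an epoch timestamp in seconds; providers differ on that convention, so treat it as one possible choice. &lt;code&gt;Retry-After&lt;/code&gt; and status 429 are standard HTTP.&lt;/p&gt;

```python
import math


def rate_limit_response(allowed, limit, remaining, reset_at, now):
    """Return (status_code, headers) for a rate-limited endpoint."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(int(reset_at)),   # assumed: epoch seconds
    }
    if allowed:
        return 200, headers
    # Tell the client how many whole seconds to wait before retrying.
    headers["Retry-After"] = str(max(0, math.ceil(reset_at - now)))
    return 429, headers


status, headers = rate_limit_response(
    allowed=False, limit=100, remaining=0, reset_at=1060, now=1000.5
)
```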

&lt;h2&gt;
  
  
  Best Practices for Implementing Rate Limiting &amp;amp; Throttling
&lt;/h2&gt;

&lt;p&gt;Apply &lt;strong&gt;rate limiting&lt;/strong&gt; at multiple levels: edge (CDN or API gateway), service level, and database level. Choose the algorithm based on requirements: &lt;strong&gt;token bucket&lt;/strong&gt; for burst-tolerant APIs, &lt;strong&gt;leaky bucket&lt;/strong&gt; for traffic shaping, and &lt;strong&gt;sliding window counter&lt;/strong&gt; for limits that stay close to exact at low memory cost.&lt;/p&gt;

&lt;p&gt;Use &lt;strong&gt;Redis&lt;/strong&gt; with &lt;strong&gt;Lua scripts&lt;/strong&gt; for atomicity in distributed setups. Always return informative headers and consider &lt;strong&gt;adaptive rate limiting&lt;/strong&gt; that dynamically adjusts limits based on system load. Combine with &lt;strong&gt;circuit breakers&lt;/strong&gt;, &lt;strong&gt;bulkheads&lt;/strong&gt;, and &lt;strong&gt;monitoring&lt;/strong&gt; (Prometheus, Grafana) to detect and respond to abuse patterns.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;throttling&lt;/strong&gt;, integrate with &lt;strong&gt;message queues&lt;/strong&gt; (Kafka, RabbitMQ) to queue excess requests or apply exponential backoff and jitter on retries.&lt;/p&gt;
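
&lt;p&gt;On the client side, exponential backoff with jitter might look like the sketch below. It uses the "full jitter" variant (a random delay between zero and a capped exponential ceiling); the function names, default values, and the decision to retry only on &lt;code&gt;ConnectionError&lt;/code&gt; are assumptions for the example.&lt;/p&gt;

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=10.0, rng=random):
    """Full jitter: random delay between 0 and min(cap, base * 2**attempt)."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


def call_with_retries(operation, max_attempts=5, sleep=time.sleep, rng=random):
    """Run `operation`, retrying on ConnectionError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise                      # out of attempts: surface the error
            sleep(backoff_delay(attempt, rng=rng))
```

&lt;p&gt;The jitter spreads retries from many clients over time, so they do not all hit the service again at the same instant after a throttling event.&lt;/p&gt;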

&lt;p&gt;&lt;strong&gt;Rate limiting&lt;/strong&gt; and &lt;strong&gt;throttling&lt;/strong&gt; form foundational resilience patterns in &lt;strong&gt;system design&lt;/strong&gt;. Proper implementation protects services, improves user experience, and enables sustainable scaling of &lt;strong&gt;distributed systems&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvspobzgxoanq63oddaun.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvspobzgxoanq63oddaun.png" alt="Rate limiting vs throttling comparison" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;System Design Handbook&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
For more in-depth insights and comprehensive coverage of system design topics, consider purchasing the System Design Handbook at &lt;a href="https://codewithdhanian.gumroad.com/l/ntmcf" rel="noopener noreferrer"&gt;https://codewithdhanian.gumroad.com/l/ntmcf&lt;/a&gt;. It will equip you with the knowledge to master complex distributed systems.  &lt;/p&gt;

&lt;p&gt;Buy me coffee to support my content at: &lt;a href="https://ko-fi.com/codewithdhanian" rel="noopener noreferrer"&gt;https://ko-fi.com/codewithdhanian&lt;/a&gt;&lt;/p&gt;

</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Idempotency in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Fri, 03 Apr 2026 07:05:37 +0000</pubDate>
      <link>https://forem.com/code_2/idempotency-in-system-design-9pj</link>
      <guid>https://forem.com/code_2/idempotency-in-system-design-9pj</guid>
      <description>&lt;p&gt;In the realm of &lt;strong&gt;distributed systems&lt;/strong&gt; and &lt;strong&gt;microservices architectures&lt;/strong&gt;, ensuring that operations produce the same result regardless of how many times they are executed stands as a fundamental requirement for building reliable and fault-tolerant applications. &lt;strong&gt;Idempotency&lt;/strong&gt; addresses this need by guaranteeing that repeated invocations of the same operation yield identical outcomes without causing unintended side effects. This concept becomes especially critical when dealing with &lt;strong&gt;network failures&lt;/strong&gt;, &lt;strong&gt;retries&lt;/strong&gt;, &lt;strong&gt;message queues&lt;/strong&gt;, and &lt;strong&gt;distributed transactions&lt;/strong&gt;, where the same request may arrive multiple times due to timeouts, duplicate deliveries, or client retries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Idempotency
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Idempotency&lt;/strong&gt; derives from mathematics, where a function f is idempotent if applying it multiple times produces the same result as applying it once, that is, f(f(x)) = f(x). In &lt;strong&gt;system design&lt;/strong&gt;, an &lt;strong&gt;idempotent operation&lt;/strong&gt; ensures that executing the same request several times has the same effect as executing it exactly once.&lt;/p&gt;

&lt;p&gt;Key characteristics of &lt;strong&gt;idempotent operations&lt;/strong&gt; include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Safe repetition&lt;/strong&gt;: Repeating the request does not create duplicate resources, charge a customer multiple times, or alter system state unexpectedly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No cumulative side effects&lt;/strong&gt;: The system state after N identical requests equals the state after a single request.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictable outcomes&lt;/strong&gt;: Clients can safely retry failed operations without fear of inconsistency.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Common examples of &lt;strong&gt;idempotent operations&lt;/strong&gt; include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Retrieving a resource via GET in &lt;strong&gt;REST APIs&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Updating a resource with a complete replacement (PUT)&lt;/li&gt;
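&lt;li&gt;Deleting a resource (DELETE): repeating the call leaves the resource deleted, even if later responses differ (for example, 404 instead of 204)&lt;/li&gt;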
&lt;li&gt;Processing a payment with a unique &lt;strong&gt;idempotency key&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Non-idempotent operations, such as creating a new resource with POST or incrementing a counter, require special handling to prevent duplicates.&lt;/p&gt;
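
&lt;p&gt;The difference is easy to see with two toy operations on a dictionary. The function names and the &lt;code&gt;user&lt;/code&gt; record are made up for the illustration.&lt;/p&gt;

```python
def set_email(user, email):
    """Idempotent: a full replacement gives the same state however often it runs."""
    user["email"] = email


def add_credit(user, amount):
    """Not idempotent: every repetition changes the state again."""
    user["credit"] += amount


user = {"email": "old@example.com", "credit": 0}

# A retried "set" is harmless.
set_email(user, "new@example.com")
set_email(user, "new@example.com")

# A retried "add" double-counts unless the server deduplicates it.
add_credit(user, 50)
add_credit(user, 50)
```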

&lt;h2&gt;
  
  
  Why Idempotency Matters in Distributed Systems
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Distributed systems&lt;/strong&gt; face inherent unreliability due to &lt;strong&gt;network partitions&lt;/strong&gt;, &lt;strong&gt;service failures&lt;/strong&gt;, and &lt;strong&gt;timeout issues&lt;/strong&gt;. Clients and intermediaries often implement &lt;strong&gt;retry mechanisms&lt;/strong&gt; with &lt;strong&gt;exponential backoff&lt;/strong&gt;. Without &lt;strong&gt;idempotency&lt;/strong&gt;, these retries can lead to serious problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Duplicate orders in e-commerce platforms&lt;/li&gt;
&lt;li&gt;Multiple charges on customer payment methods&lt;/li&gt;
&lt;li&gt;Inconsistent inventory levels&lt;/li&gt;
&lt;li&gt;Corrupted data in financial ledgers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Idempotency&lt;/strong&gt; serves as a critical defense mechanism that allows safe retries, simplifies error recovery, and supports &lt;strong&gt;at-least-once delivery&lt;/strong&gt; semantics commonly found in &lt;strong&gt;message queues&lt;/strong&gt; like &lt;strong&gt;Kafka&lt;/strong&gt;, &lt;strong&gt;RabbitMQ&lt;/strong&gt;, and &lt;strong&gt;SQS&lt;/strong&gt;. It forms an essential building block alongside patterns such as the &lt;strong&gt;Circuit Breaker&lt;/strong&gt;, &lt;strong&gt;Retry &amp;amp; Exponential Backoff&lt;/strong&gt;, and &lt;strong&gt;Saga&lt;/strong&gt; for &lt;strong&gt;distributed transactions&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementing Idempotency with Idempotency Keys
&lt;/h2&gt;

&lt;p&gt;The most robust approach to achieving &lt;strong&gt;idempotency&lt;/strong&gt; involves the use of &lt;strong&gt;idempotency keys&lt;/strong&gt;. A client generates a unique identifier for each logical operation and includes it in every request. The server stores the result associated with this key and reuses it for subsequent identical requests.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Components of Idempotency Key Implementation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Idempotency Key&lt;/strong&gt;: A unique string or UUID generated by the client, typically tied to a specific business operation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage Layer&lt;/strong&gt;: A durable store (database table, Redis cache, or dedicated service) that records processed keys along with their outcomes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validation Logic&lt;/strong&gt;: Server-side checks to detect duplicate requests and return cached responses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Expiration Mechanism&lt;/strong&gt;: Optional time-based cleanup of old keys to prevent unbounded storage growth.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Complete Implementation Example: Idempotent Payment API
&lt;/h3&gt;

&lt;p&gt;Consider a payment processing endpoint that must remain &lt;strong&gt;idempotent&lt;/strong&gt; even under heavy retry scenarios.&lt;/p&gt;

&lt;h4&gt;
  
  
  Server-Side Controller (Idempotent Endpoint)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@RestController
@RequestMapping("/payments")
class PaymentController {

    private final PaymentService paymentService;
    private final IdempotencyStore idempotencyStore;  // Backed by Redis or Database

    // Constructor injection: Spring supplies both collaborators
    PaymentController(PaymentService paymentService, IdempotencyStore idempotencyStore) {
        this.paymentService = paymentService;
        this.idempotencyStore = idempotencyStore;
    }

    @PostMapping
    ResponseEntity&amp;lt;PaymentResponse&amp;gt; processPayment(
            @RequestHeader("Idempotency-Key") String idempotencyKey,
            @RequestBody PaymentRequest request) {

        // Step 1: Validate key format and business rules before touching the store
        if (!isValidIdempotencyKey(idempotencyKey)) {
            throw new InvalidIdempotencyKeyException();
        }

        // Step 2: Check for an existing result
        Optional&amp;lt;PaymentResponse&amp;gt; existingResult = idempotencyStore.getResult(idempotencyKey);
        if (existingResult.isPresent()) {
            return ResponseEntity.ok(existingResult.get());  // Return cached response
        }

        // Note: this check-then-execute sequence is not atomic. Two concurrent
        // requests with the same key can both reach Step 3 unless the key is
        // reserved first (for example with Redis SET NX or a unique constraint).
        try {
            // Step 3: Execute the actual business logic
            PaymentResult result = paymentService.processPayment(request);

            PaymentResponse response = mapToResponse(result);

            // Step 4: Store the response so later retries with this key receive it
            idempotencyStore.storeResult(idempotencyKey, response, Duration.ofHours(24));

            return ResponseEntity.ok(response);

        } catch (Exception e) {
            // Record the failure for diagnostics. With the Redis store shown here it
            // is kept under a separate key, so a retry runs the payment again.
            idempotencyStore.storeFailure(idempotencyKey, e.getMessage());
            throw e;
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Idempotency Store Implementation (Using Redis for High Performance)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
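&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class RedisIdempotencyStore implements IdempotencyStore {

    private static final String PREFIX = "idempotency:";

    private final RedisTemplate&amp;lt;String, Object&amp;gt; redisTemplate;

    RedisIdempotencyStore(RedisTemplate&amp;lt;String, Object&amp;gt; redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    // Interface methods must be declared public in the implementing class
    @Override
    public Optional&amp;lt;PaymentResponse&amp;gt; getResult(String key) {
        String stored = (String) redisTemplate.opsForValue().get(PREFIX + key);
        if (stored == null) {
            return Optional.empty();
        }
        return Optional.of(deserializeResponse(stored));
    }

    @Override
    public void storeResult(String key, PaymentResponse response, Duration ttl) {
        String serialized = serializeResponse(response);
        // The TTL bounds storage growth; it should outlast the client's retry window
        redisTemplate.opsForValue().set(PREFIX + key, serialized, ttl);
    }

    @Override
    public void storeFailure(String key, String errorMessage) {
        // Kept under a separate key, so getResult never returns a failure
        redisTemplate.opsForValue().set(
            PREFIX + key + ":failure",
            errorMessage,
            Duration.ofHours(1)
        );
    }
}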
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this flow, a retry that arrives after the first response has been stored, whether it comes from a network retry, load balancer duplication, or client-side retry logic, receives the cached response instead of triggering a second payment. Requests that share an &lt;strong&gt;idempotency key&lt;/strong&gt; but arrive concurrently can still both pass the lookup, so a production implementation should also reserve the key atomically before executing the payment.&lt;/p&gt;
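
&lt;p&gt;A lookup followed by a separate write leaves a window in which two concurrent requests with the same key can both execute. A common remedy is to reserve the key atomically before doing the work. The sketch below shows the idea with an in-memory store guarded by a lock; with Redis, the reservation would typically be a &lt;code&gt;SET&lt;/code&gt; with the &lt;code&gt;NX&lt;/code&gt; option. All names here (&lt;code&gt;InMemoryIdempotencyStore&lt;/code&gt;, &lt;code&gt;handle_payment&lt;/code&gt;, the 409 response for in-flight duplicates) are assumptions for the example, not part of the code above.&lt;/p&gt;

```python
import threading


class InMemoryIdempotencyStore:
    IN_PROGRESS = object()   # marker for a key whose request is still running

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def try_reserve(self, key):
        """Atomically claim the key; True only for the first caller."""
        with self._lock:
            if key in self._entries:
                return False
            self._entries[key] = self.IN_PROGRESS
            return True

    def complete(self, key, response):
        with self._lock:
            self._entries[key] = response

    def release(self, key):
        with self._lock:
            self._entries.pop(key, None)

    def get(self, key):
        with self._lock:
            return self._entries.get(key)


def handle_payment(store, key, charge):
    if not store.try_reserve(key):
        existing = store.get(key)
        if existing is None or existing is InMemoryIdempotencyStore.IN_PROGRESS:
            # Another request with this key is running (or just failed).
            return {"status": 409, "body": "request in progress, retry later"}
        return existing                    # replay the stored response
    try:
        response = {"status": 200, "body": charge()}
    except Exception:
        store.release(key)                 # allow a later retry after a failure
        raise
    store.complete(key, response)
    return response
```

&lt;p&gt;Only the caller that wins the reservation runs &lt;code&gt;charge&lt;/code&gt;; everyone else either gets the stored response or is asked to retry.&lt;/p&gt;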

&lt;h3&gt;
  
  
  Best Practices for Idempotency
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Client Responsibilities&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate a unique &lt;strong&gt;idempotency key&lt;/strong&gt; (UUID v4 recommended) per logical operation before sending the request.&lt;/li&gt;
&lt;li&gt;Store the key locally and reuse it for all retries of the same operation.&lt;/li&gt;
&lt;li&gt;Include the key in a standard header such as &lt;code&gt;Idempotency-Key&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Server Responsibilities&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reject requests missing a valid &lt;strong&gt;idempotency key&lt;/strong&gt; for non-safe operations.&lt;/li&gt;
&lt;li&gt;Make storage and business logic execution atomic where possible.&lt;/li&gt;
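&lt;li&gt;Expire stored results with a TTL that comfortably exceeds the client's retry window (the example above uses 24 hours); a TTL that is too short lets a late retry run as a new operation.&lt;/li&gt;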
&lt;li&gt;Design all compensating actions in &lt;strong&gt;Saga&lt;/strong&gt; patterns to also be &lt;strong&gt;idempotent&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Idempotency in Message-Driven Systems&lt;/strong&gt;:&lt;br&gt;
When consuming from &lt;strong&gt;message queues&lt;/strong&gt;, attach &lt;strong&gt;idempotency keys&lt;/strong&gt; to messages. Consumers should check the key before processing and acknowledge only after successful storage of the result. This pattern supports &lt;strong&gt;at-least-once&lt;/strong&gt; delivery while preventing duplicate side effects.&lt;/p&gt;
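
&lt;p&gt;A consumer following this pattern could look roughly like the sketch below. The message shape (a dict with &lt;code&gt;idempotency_key&lt;/code&gt; and &lt;code&gt;payload&lt;/code&gt;), the &lt;code&gt;ack&lt;/code&gt; callback, and the in-memory result map are stand-ins for whatever the broker client and durable store provide.&lt;/p&gt;

```python
class IdempotentConsumer:
    """Processes each idempotency key once, even if the broker redelivers."""

    def __init__(self, handler):
        self.handler = handler
        self.results = {}    # idempotency key -> result (durable store in practice)

    def on_message(self, message, ack):
        key = message["idempotency_key"]
        if key not in self.results:
            # First delivery: do the work and record the result.
            self.results[key] = self.handler(message["payload"])
        # Acknowledge only after the result is recorded, so a crash before
        # this point leads to redelivery rather than a lost message.
        ack()
        return self.results[key]
```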

&lt;h2&gt;
  
  
  Idempotency in Broader System Design Patterns
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Idempotency&lt;/strong&gt; integrates deeply with other critical concepts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In &lt;strong&gt;distributed transactions&lt;/strong&gt; using the &lt;strong&gt;Saga&lt;/strong&gt; pattern, every compensating transaction must be &lt;strong&gt;idempotent&lt;/strong&gt; to handle repeated failure recoveries safely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Gateways&lt;/strong&gt; and &lt;strong&gt;service meshes&lt;/strong&gt; can enforce &lt;strong&gt;idempotency&lt;/strong&gt; at the infrastructure layer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event-Driven Architectures&lt;/strong&gt; rely on &lt;strong&gt;idempotent&lt;/strong&gt; event handlers to maintain consistency across services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retry &amp;amp; Exponential Backoff&lt;/strong&gt; strategies become safe only when paired with proper &lt;strong&gt;idempotency&lt;/strong&gt; controls.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In &lt;strong&gt;high-scale systems&lt;/strong&gt;, &lt;strong&gt;idempotency&lt;/strong&gt; often combines with &lt;strong&gt;rate limiting&lt;/strong&gt;, &lt;strong&gt;circuit breakers&lt;/strong&gt;, and &lt;strong&gt;distributed caching&lt;/strong&gt; to create resilient request pipelines.&lt;/p&gt;

&lt;p&gt;Mastering &lt;strong&gt;idempotency&lt;/strong&gt; enables &lt;strong&gt;system designers&lt;/strong&gt; to build applications that gracefully handle the realities of distributed environments while maintaining data integrity and providing consistent user experiences.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb6cguda5d5l5d20hm2w7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb6cguda5d5l5d20hm2w7.png" alt="Idempotency in system workflows" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;


</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Distributed Transactions (2PC, Saga) in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Fri, 03 Apr 2026 06:52:41 +0000</pubDate>
      <link>https://forem.com/code_2/distributed-transactions-2pc-saga-in-system-design-3llp</link>
      <guid>https://forem.com/code_2/distributed-transactions-2pc-saga-in-system-design-3llp</guid>
      <description>&lt;p&gt;In the complex landscape of modern &lt;strong&gt;distributed systems&lt;/strong&gt;, maintaining &lt;strong&gt;data consistency&lt;/strong&gt; across multiple independent services and databases presents one of the most challenging problems in &lt;strong&gt;system design&lt;/strong&gt;. &lt;strong&gt;Distributed transactions&lt;/strong&gt; aim to ensure that operations spanning several resources either succeed completely or fail entirely, preserving as much as possible of the &lt;strong&gt;ACID properties&lt;/strong&gt; of &lt;strong&gt;atomicity&lt;/strong&gt;, &lt;strong&gt;consistency&lt;/strong&gt;, &lt;strong&gt;isolation&lt;/strong&gt;, and &lt;strong&gt;durability&lt;/strong&gt;. This article explores two primary approaches to handling &lt;strong&gt;distributed transactions&lt;/strong&gt;: the &lt;strong&gt;Two-Phase Commit&lt;/strong&gt; protocol, commonly known as &lt;strong&gt;2PC&lt;/strong&gt;, and the &lt;strong&gt;Saga&lt;/strong&gt; pattern. Both coordinate work that spans several resources where a single-database transaction falls short: &lt;strong&gt;2PC&lt;/strong&gt; targets short, strongly consistent commits, while &lt;strong&gt;Saga&lt;/strong&gt; targets &lt;strong&gt;long-running transactions&lt;/strong&gt; that can tolerate eventual consistency.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Are Distributed Transactions
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;distributed transaction&lt;/strong&gt; involves multiple participating resources, such as separate &lt;strong&gt;databases&lt;/strong&gt;, &lt;strong&gt;microservices&lt;/strong&gt;, or external systems, that must coordinate to achieve a unified outcome. Unlike local transactions confined to a single resource, &lt;strong&gt;distributed transactions&lt;/strong&gt; must manage &lt;strong&gt;cross-service consistency&lt;/strong&gt; while dealing with network latency, partial failures, and independent scaling of components. &lt;/p&gt;

&lt;p&gt;The core requirement remains the same as in monolithic systems: the entire operation must appear &lt;strong&gt;atomic&lt;/strong&gt; to the end user. If any part fails, all changes must be undone. However, achieving this in a &lt;strong&gt;distributed environment&lt;/strong&gt; introduces significant complexity because each participant operates autonomously, and communication occurs over unreliable networks. &lt;strong&gt;System designers&lt;/strong&gt; must therefore select protocols that balance &lt;strong&gt;strong consistency&lt;/strong&gt; with &lt;strong&gt;availability&lt;/strong&gt; and &lt;strong&gt;performance&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Two-Phase Commit Protocol (2PC)
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Two-Phase Commit&lt;/strong&gt; protocol, or &lt;strong&gt;2PC&lt;/strong&gt;, stands as the classic solution for achieving &lt;strong&gt;strong consistency&lt;/strong&gt; in &lt;strong&gt;distributed transactions&lt;/strong&gt;. Introduced in the 1970s, &lt;strong&gt;2PC&lt;/strong&gt; relies on a central &lt;strong&gt;coordinator&lt;/strong&gt; and multiple &lt;strong&gt;participants&lt;/strong&gt; to ensure all-or-nothing semantics across heterogeneous resources.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Components of 2PC
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Coordinator&lt;/strong&gt;: The central authority responsible for driving the transaction. It receives the initial transaction request and manages the voting and decision process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Participants&lt;/strong&gt;: The individual resources (databases or services) that perform local work and respond to the &lt;strong&gt;coordinator&lt;/strong&gt;'s instructions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transaction Manager&lt;/strong&gt;: Often implemented using standards such as &lt;strong&gt;XA&lt;/strong&gt; (eXtended Architecture) for database interactions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phases of the 2PC Protocol
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;2PC&lt;/strong&gt; operates in two distinct phases, ensuring safety before any permanent changes occur.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prepare Phase&lt;/strong&gt; (Voting Phase):&lt;br&gt;&lt;br&gt;
The &lt;strong&gt;coordinator&lt;/strong&gt; sends a &lt;strong&gt;prepare&lt;/strong&gt; message to all &lt;strong&gt;participants&lt;/strong&gt;. Each &lt;strong&gt;participant&lt;/strong&gt; performs the necessary local operations, acquires locks, writes changes to a durable log, and responds with either &lt;strong&gt;ready&lt;/strong&gt; (vote yes) or &lt;strong&gt;abort&lt;/strong&gt; (vote no). If any &lt;strong&gt;participant&lt;/strong&gt; votes no or fails to respond, the &lt;strong&gt;coordinator&lt;/strong&gt; decides to abort.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Commit Phase&lt;/strong&gt; (Decision Phase):&lt;br&gt;&lt;br&gt;
If all &lt;strong&gt;participants&lt;/strong&gt; vote &lt;strong&gt;ready&lt;/strong&gt;, the &lt;strong&gt;coordinator&lt;/strong&gt; logs the global commit decision and sends &lt;strong&gt;commit&lt;/strong&gt; messages to every &lt;strong&gt;participant&lt;/strong&gt;. Each &lt;strong&gt;participant&lt;/strong&gt; then applies the changes permanently and releases locks. If the decision is to abort, the &lt;strong&gt;coordinator&lt;/strong&gt; sends &lt;strong&gt;rollback&lt;/strong&gt; messages, and &lt;strong&gt;participants&lt;/strong&gt; undo their local changes using the prepared log entries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pseudocode Implementation of 2PC Coordinator
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
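&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class TwoPhaseCommitCoordinator {
    List&amp;lt;Participant&amp;gt; participants;
    TransactionLog log;

    void beginTransaction(Transaction tx) {
        log.write("BEGIN_TX", tx.id);
        boolean allReady = true;

        // Prepare Phase: stop at the first "no" vote
        for (Participant participant : participants) {
            boolean ready;
            try {
                ready = participant.prepare(tx).isReady();
            } catch (Exception e) {
                ready = false;  // an unreachable participant counts as a "no" vote
            }
            if (!ready) {
                allReady = false;
                break;
            }
        }

        // Decision: persist it durably before telling any participant
        if (allReady) {
            log.write("GLOBAL_COMMIT", tx.id);
            for (Participant participant : participants) {
                participant.commit(tx);
            }
        } else {
            log.write("GLOBAL_ABORT", tx.id);
            // Sent to every participant, including ones that never prepared,
            // so rollback must be safe to call on an unprepared transaction
            for (Participant participant : participants) {
                participant.rollback(tx);
            }
        }
    }
}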
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pseudocode for a 2PC Participant
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class DatabaseParticipant implements Participant {
    LocalDatabase db;
    UndoLog undoLog;

    public Response prepare(Transaction tx) {
        try {
            db.acquireLocks(tx.operations);
            undoLog.recordUndoInfo(tx);           // capture old values before changing them
            db.executeOperations(tx.operations);  // tentative changes
            return new Response(true, "READY");
        } catch (Exception e) {
            rollback(tx);  // undo partial work and release any locks already taken
            return new Response(false, "ABORT");
        }
    }

    public void commit(Transaction tx) {
        db.makeChangesPermanent(tx);
        db.releaseLocks(tx);
        undoLog.clear(tx);
    }

    // Assumed to be a no-op when nothing was recorded or locked for tx, so it is
    // safe to call repeatedly or for a transaction that never prepared
    public void rollback(Transaction tx) {
        db.applyUndo(undoLog.getUndoInfo(tx));
        db.releaseLocks(tx);
        undoLog.clear(tx);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These code structures illustrate the blocking nature of &lt;strong&gt;2PC&lt;/strong&gt;: &lt;strong&gt;participants&lt;/strong&gt; hold locks from the &lt;strong&gt;prepare&lt;/strong&gt; phase until the final decision arrives. The &lt;strong&gt;coordinator&lt;/strong&gt; must persist its decision durably before proceeding, ensuring recoverability after crashes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Limitations of 2PC
&lt;/h3&gt;

&lt;p&gt;While &lt;strong&gt;2PC&lt;/strong&gt; guarantees &lt;strong&gt;strong consistency&lt;/strong&gt;, it suffers from several critical drawbacks. The &lt;strong&gt;coordinator&lt;/strong&gt; becomes a &lt;strong&gt;single point of failure&lt;/strong&gt; and a &lt;strong&gt;performance bottleneck&lt;/strong&gt;. The protocol is &lt;strong&gt;blocking&lt;/strong&gt;—if the &lt;strong&gt;coordinator&lt;/strong&gt; fails after the &lt;strong&gt;prepare&lt;/strong&gt; phase, &lt;strong&gt;participants&lt;/strong&gt; remain locked indefinitely until recovery. Network partitions can cause prolonged unavailability. In high-throughput &lt;strong&gt;microservices&lt;/strong&gt; environments, these issues make &lt;strong&gt;2PC&lt;/strong&gt; impractical for long-lived operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Saga Pattern
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Saga&lt;/strong&gt; pattern offers a fundamentally different approach to &lt;strong&gt;distributed transactions&lt;/strong&gt; by embracing &lt;strong&gt;eventual consistency&lt;/strong&gt; instead of immediate &lt;strong&gt;strong consistency&lt;/strong&gt;. Originally described in the 1980s for handling &lt;strong&gt;long-lived transactions&lt;/strong&gt;, a &lt;strong&gt;Saga&lt;/strong&gt; decomposes a large &lt;strong&gt;distributed transaction&lt;/strong&gt; into a sequence of smaller, local transactions. Each local transaction has an associated &lt;strong&gt;compensating transaction&lt;/strong&gt; that undoes its effects if later steps fail.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Principles of Saga
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Local Transactions&lt;/strong&gt;: Each service executes its part independently and commits immediately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compensating Transactions&lt;/strong&gt;: Each step has a compensating operation that semantically undoes its committed effects, restoring a consistent state without a global rollback.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Global Lock&lt;/strong&gt;: Resources remain available throughout the process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eventual Consistency&lt;/strong&gt;: The system converges to a consistent state over time rather than instantly.&lt;/li&gt;
&lt;/ul&gt;
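&lt;p&gt;These principles can be sketched as a tiny executor that runs local steps in order and, on failure, runs the compensations of the completed steps in reverse. This is a simplified illustration under stated assumptions: &lt;code&gt;run_saga&lt;/code&gt; and the step names are hypothetical, and a real implementation would also persist progress and retry compensations.&lt;/p&gt;

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    On failure, run the compensations of the completed steps in reverse
    order and return False; return True when every step succeeds.
    """
    completed = []
    for name, action, compensation in steps:
        try:
            action()  # local transaction, committed immediately
        except Exception as exc:
            print(f"step {name!r} failed: {exc}")
            for done_name, undo in reversed(completed):
                undo()
                print(f"compensated {done_name!r}")
            return False
        completed.append((name, compensation))
    return True


log = []


def fail_inventory():
    raise RuntimeError("out of stock")


steps = [
    ("create_order",
     lambda: log.append("order created"),
     lambda: log.append("order cancelled")),
    ("charge_payment",
     lambda: log.append("payment charged"),
     lambda: log.append("payment refunded")),
    ("reserve_inventory",
     fail_inventory,
     lambda: log.append("inventory released")),
]

ok = run_saga(steps)
print(ok, log)
```

&lt;p&gt;The failed step itself is not compensated because it never committed; only the two earlier steps are undone, newest first.&lt;/p&gt;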

&lt;h3&gt;
  
  
  Two Implementation Styles of Saga
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Choreography-Based Saga&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
Services communicate directly through &lt;strong&gt;events&lt;/strong&gt;. Each service listens for events from previous steps and publishes its own events upon completion or failure. No central controller exists. This style promotes &lt;strong&gt;loose coupling&lt;/strong&gt; but can become difficult to trace as the number of services grows.&lt;/p&gt;
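&lt;p&gt;A choreography can be sketched with an in-memory event bus standing in for a real broker. Everything here is an assumption made for illustration: the &lt;code&gt;EventBus&lt;/code&gt; class, the handler names, and the event names are hypothetical, and delivery is synchronous rather than asynchronous.&lt;/p&gt;

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-memory stand-in for a message broker."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.history = []  # event types in publication order

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, order):
        self.history.append(event_type)
        for handler in self.handlers[event_type]:
            handler(order)


def wire_services(bus, stock):
    """Register each service's reactions; no service calls another directly."""

    def payment_on_order_placed(order):
        bus.publish("PaymentProcessed", order)

    def inventory_on_payment_processed(order):
        available = stock.get(order["product_id"], 0)
        if available >= order["quantity"]:
            stock[order["product_id"]] = available - order["quantity"]
            bus.publish("InventoryReserved", order)
        else:
            bus.publish("InventoryUnavailable", order)

    def payment_on_inventory_unavailable(order):
        bus.publish("PaymentRefunded", order)  # compensation

    def order_on_payment_refunded(order):
        bus.publish("OrderCancelled", order)  # compensation

    bus.subscribe("OrderPlaced", payment_on_order_placed)
    bus.subscribe("PaymentProcessed", inventory_on_payment_processed)
    bus.subscribe("InventoryUnavailable", payment_on_inventory_unavailable)
    bus.subscribe("PaymentRefunded", order_on_payment_refunded)


bus = EventBus()
wire_services(bus, stock={"PROD-111": 1})
bus.publish("OrderPlaced",
            {"order_id": "ORD-1", "product_id": "PROD-111", "quantity": 2})
print(bus.history)
```

&lt;p&gt;The flow is only visible by reading the event history, which is the tracing difficulty that grows with the number of services.&lt;/p&gt;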

&lt;p&gt;&lt;strong&gt;Orchestration-Based Saga&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
A central &lt;strong&gt;Saga Orchestrator&lt;/strong&gt; coordinates the flow by sending commands to services and reacting to their responses or events. The &lt;strong&gt;orchestrator&lt;/strong&gt; maintains the overall state and decides the next step or triggers compensation. This approach provides clearer visibility into the transaction flow and simplifies error handling.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete Orchestration-Based Saga Example: E-commerce Order Processing
&lt;/h3&gt;

&lt;p&gt;Consider an online store where placing an order involves three services: &lt;strong&gt;Order Service&lt;/strong&gt;, &lt;strong&gt;Payment Service&lt;/strong&gt;, and &lt;strong&gt;Inventory Service&lt;/strong&gt;. The &lt;strong&gt;Saga&lt;/strong&gt; ensures that if payment fails, the order is cancelled before any inventory is deducted, and if inventory is unavailable, the payment is refunded and the order is cancelled.&lt;/p&gt;

&lt;h4&gt;
  
  
  Saga Orchestrator Pseudocode (Full Structure)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class OrderSagaOrchestrator {
    OrderService orderService;
    PaymentService paymentService;
    InventoryService inventoryService;
    SagaStateRepository stateRepo;

    void startOrderSaga(OrderRequest request) {
        SagaInstance saga = new SagaInstance(request.orderId);
        stateRepo.save(saga);

        // Step 1: Create Order (local transaction)
        Order order = orderService.createOrder(request);
        saga.updateStep("ORDER_CREATED", order);
        stateRepo.save(saga);  // persist progress so a restarted orchestrator can resume

        Payment payment = null;  // declared outside try so the catch block can refund it
        try {
            // Step 2: Process Payment
            payment = paymentService.processPayment(order);
            saga.updateStep("PAYMENT_SUCCESS", payment);
            stateRepo.save(saga);

            // Step 3: Reserve Inventory
            InventoryReservation reservation = inventoryService.reserveInventory(order);
            saga.updateStep("INVENTORY_RESERVED", reservation);

            saga.complete();
        } catch (PaymentFailedException e) {
            // Compensation: Cancel Order
            compensateOrder(order);
            saga.fail("PAYMENT_FAILED");
        } catch (InventoryUnavailableException e) {
            // Compensation chain, in reverse order of the completed steps
            compensatePayment(payment);
            compensateOrder(order);
            saga.fail("INVENTORY_FAILED");
        }
        stateRepo.save(saga);  // persist the final outcome
    }

    // Compensating transaction examples
    void compensatePayment(Payment payment) {
        paymentService.refundPayment(payment);  // idempotent refund
    }

    void compensateOrder(Order order) {
        orderService.cancelOrder(order);  // cancels the order created in step 1
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Service-Level Local Transaction Example (Inventory Service)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class InventoryService {
    InventoryRepository repo;

    InventoryReservation reserveInventory(Order order) {
        // Local transaction - fully committed immediately
        return repo.withinTransaction(() -&amp;gt; {
            Stock stock = repo.findStock(order.productId);
            if (stock.quantity &amp;lt; order.quantity) {
                throw new InventoryUnavailableException();
            }
            stock.quantity -= order.quantity;
            repo.save(stock);
            // Keep the productId so the compensation knows which stock to restore
            return new InventoryReservation(order.orderId, order.productId, order.quantity);
        });
    }

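    // Compensating transaction - public and idempotent
    void releaseInventory(InventoryReservation reservation) {
        repo.withinTransaction(() -&amp;gt; {
            // Skip if this reservation was already released, so retries are safe
            // (isReleased / markReleased are illustrative repository methods)
            if (repo.isReleased(reservation)) {
                return;
            }
            Stock stock = repo.findStock(reservation.productId);
            stock.quantity += reservation.quantity;
            repo.save(stock);
            repo.markReleased(reservation);
        });
    }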
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This full code structure demonstrates how the &lt;strong&gt;orchestrator&lt;/strong&gt; drives the &lt;strong&gt;Saga&lt;/strong&gt; while each service remains responsible only for its local &lt;strong&gt;ACID&lt;/strong&gt; transaction and its &lt;strong&gt;compensating transaction&lt;/strong&gt;. &lt;strong&gt;Idempotency keys&lt;/strong&gt; should be included in every command and compensation to handle retries safely after network failures.&lt;/p&gt;
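&lt;p&gt;One way to handle idempotency keys is for each service to remember the result of the first execution per key and return it on retries. The sketch below is a simplified in-memory illustration; the &lt;code&gt;PaymentService&lt;/code&gt; class, its &lt;code&gt;charge&lt;/code&gt; method, and the key format are hypothetical.&lt;/p&gt;

```python
class PaymentService:
    """Hypothetical service that deduplicates commands by idempotency key."""

    def __init__(self):
        self.total_charged = 0
        self.results = {}  # idempotency key -> result of the first execution

    def charge(self, idempotency_key, amount):
        # A retry with a known key returns the stored result and has no
        # further side effects.
        if idempotency_key in self.results:
            return self.results[idempotency_key]
        self.total_charged += amount
        result = {"status": "CHARGED", "amount": amount}
        self.results[idempotency_key] = result
        return result


service = PaymentService()
first = service.charge("ORD-98765:charge", 150)
retry = service.charge("ORD-98765:charge", 150)  # e.g. resent after a timeout
print(first == retry, service.total_charged)
```

&lt;p&gt;In a real service the key lookup and the side effect would need to happen in the same local transaction; otherwise a crash between them could still produce a duplicate.&lt;/p&gt;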

&lt;h3&gt;
  
  
  Advantages of the Saga Pattern
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Saga&lt;/strong&gt; excels in &lt;strong&gt;microservices&lt;/strong&gt; because it avoids long-held locks, improves &lt;strong&gt;availability&lt;/strong&gt;, and scales horizontally. Failures trigger targeted &lt;strong&gt;compensations&lt;/strong&gt; rather than global aborts. The pattern naturally fits &lt;strong&gt;event-driven architectures&lt;/strong&gt; and works seamlessly with &lt;strong&gt;message queues&lt;/strong&gt; such as &lt;strong&gt;Kafka&lt;/strong&gt; or &lt;strong&gt;RabbitMQ&lt;/strong&gt; for reliable event delivery.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing Between 2PC and Saga
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;2PC&lt;/strong&gt; suits scenarios demanding immediate &lt;strong&gt;strong consistency&lt;/strong&gt;, such as financial systems where partial states are unacceptable. &lt;strong&gt;Saga&lt;/strong&gt; fits better for business processes that tolerate temporary inconsistencies, prioritize &lt;strong&gt;high availability&lt;/strong&gt;, and involve long-running workflows across many services. In practice, many &lt;strong&gt;system designs&lt;/strong&gt; combine both: &lt;strong&gt;2PC&lt;/strong&gt; for critical synchronous steps within a bounded context and &lt;strong&gt;Saga&lt;/strong&gt; for cross-context orchestration.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Saga&lt;/strong&gt; pattern, supported by modern frameworks, has become a common choice for &lt;strong&gt;distributed transactions&lt;/strong&gt; in cloud-native &lt;strong&gt;microservices&lt;/strong&gt; due to its resilience and performance characteristics. Proper implementation requires careful design of &lt;strong&gt;compensating transactions&lt;/strong&gt;, &lt;strong&gt;idempotency&lt;/strong&gt;, and comprehensive &lt;strong&gt;monitoring&lt;/strong&gt; of &lt;strong&gt;Saga&lt;/strong&gt; instances to detect and resolve stuck workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Distributed Transactions (2PC, Saga) in System Design&lt;/strong&gt; remains a cornerstone topic that every professional &lt;strong&gt;system designer&lt;/strong&gt; must master to build reliable, scalable, and maintainable large-scale applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkj4h6vr0lsaolgptv5i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkj4h6vr0lsaolgptv5i.png" alt="Two-Phase Commit vs Saga pattern" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;


</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Event-Driven Architecture in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Fri, 03 Apr 2026 06:34:44 +0000</pubDate>
      <link>https://forem.com/code_2/event-driven-architecture-in-system-design-4fgg</link>
      <guid>https://forem.com/code_2/event-driven-architecture-in-system-design-4fgg</guid>
      <description>&lt;p&gt;&lt;strong&gt;Event-Driven Architecture&lt;/strong&gt; represents a foundational paradigm in modern &lt;strong&gt;system design&lt;/strong&gt; where the entire flow of processing is triggered and governed by &lt;strong&gt;events&lt;/strong&gt; rather than direct synchronous calls between components. In this approach, systems detect meaningful changes in state, encapsulate them as &lt;strong&gt;events&lt;/strong&gt;, and propagate them asynchronously across decoupled services. This enables &lt;strong&gt;real-time responsiveness&lt;/strong&gt;, &lt;strong&gt;loose coupling&lt;/strong&gt;, and &lt;strong&gt;high scalability&lt;/strong&gt; while allowing individual components to evolve independently without breaking the overall system.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Event-Driven Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Event-Driven Architecture&lt;/strong&gt; shifts away from the traditional &lt;strong&gt;request-response model&lt;/strong&gt; commonly seen in &lt;strong&gt;RESTful APIs&lt;/strong&gt; or &lt;strong&gt;monolithic applications&lt;/strong&gt;. Instead of one service directly invoking another and waiting for a reply, &lt;strong&gt;producers&lt;/strong&gt; emit &lt;strong&gt;events&lt;/strong&gt; that signal something has occurred. &lt;strong&gt;Consumers&lt;/strong&gt; then react to these &lt;strong&gt;events&lt;/strong&gt; at their own pace. This asynchronous nature eliminates blocking calls, reduces latency bottlenecks, and supports massive concurrency.&lt;/p&gt;

&lt;p&gt;At its core, &lt;strong&gt;Event-Driven Architecture&lt;/strong&gt; treats &lt;strong&gt;events&lt;/strong&gt; as first-class citizens. An &lt;strong&gt;event&lt;/strong&gt; is an immutable record of a fact that happened in the past, carrying both metadata and payload data. Examples include &lt;strong&gt;OrderPlaced&lt;/strong&gt;, &lt;strong&gt;UserRegistered&lt;/strong&gt;, or &lt;strong&gt;PaymentProcessed&lt;/strong&gt;. These &lt;strong&gt;events&lt;/strong&gt; drive the behavior of the entire distributed system without requiring tight integration between services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Components of Event-Driven Architecture
&lt;/h2&gt;

&lt;p&gt;Every &lt;strong&gt;Event-Driven Architecture&lt;/strong&gt; relies on four essential building blocks that work together to ensure reliable &lt;strong&gt;event&lt;/strong&gt; flow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Events
&lt;/h3&gt;

&lt;p&gt;An &lt;strong&gt;event&lt;/strong&gt; is the fundamental unit of communication. It must be &lt;strong&gt;immutable&lt;/strong&gt; and self-contained, and it should carry a unique identifier so that &lt;strong&gt;consumers&lt;/strong&gt; can process it &lt;strong&gt;idempotently&lt;/strong&gt;. Each &lt;strong&gt;event&lt;/strong&gt; typically includes a unique &lt;strong&gt;event ID&lt;/strong&gt;, &lt;strong&gt;timestamp&lt;/strong&gt;, &lt;strong&gt;event type&lt;/strong&gt;, &lt;strong&gt;source&lt;/strong&gt;, and a &lt;strong&gt;payload&lt;/strong&gt; with domain-specific data. &lt;strong&gt;Events&lt;/strong&gt; are never altered after creation; instead, new &lt;strong&gt;events&lt;/strong&gt; represent subsequent state changes.&lt;/p&gt;
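&lt;p&gt;As a minimal sketch, such an event could be modelled with a frozen dataclass. The field names follow the list above; the &lt;code&gt;Event&lt;/code&gt; class, the &lt;code&gt;to_json&lt;/code&gt; helper, and the &lt;code&gt;order-service&lt;/code&gt; source name are assumptions for illustration.&lt;/p&gt;

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Event:
    event_type: str
    source: str
    payload: dict
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self):
        """Serialize the event for publication to a broker."""
        return json.dumps(asdict(self))


event = Event(
    event_type="OrderPlaced",
    source="order-service",
    payload={"order_id": "ORD-98765", "total_amount": 149.99},
)
print(event.to_json())
```

&lt;p&gt;Note that &lt;code&gt;frozen=True&lt;/code&gt; only blocks reassigning fields; the payload dictionary itself is still mutable, so a stricter design would copy or freeze it as well.&lt;/p&gt;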

&lt;h3&gt;
  
  
  Producers
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Producers&lt;/strong&gt; are the components responsible for detecting state changes and publishing &lt;strong&gt;events&lt;/strong&gt; to the central &lt;strong&gt;event broker&lt;/strong&gt;. A &lt;strong&gt;producer&lt;/strong&gt; can be a &lt;strong&gt;microservice&lt;/strong&gt;, a database trigger, or an external system. Upon generating an &lt;strong&gt;event&lt;/strong&gt;, the &lt;strong&gt;producer&lt;/strong&gt; serializes it into a standard format such as &lt;strong&gt;JSON&lt;/strong&gt; or &lt;strong&gt;Avro&lt;/strong&gt; and sends it reliably to the broker.&lt;/p&gt;

&lt;h3&gt;
  
  
  Consumers
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Consumers&lt;/strong&gt; subscribe to one or more &lt;strong&gt;event&lt;/strong&gt; streams and execute business logic when matching &lt;strong&gt;events&lt;/strong&gt; arrive. A single &lt;strong&gt;event&lt;/strong&gt; can be consumed by multiple &lt;strong&gt;consumers&lt;/strong&gt; simultaneously, enabling parallel processing. &lt;strong&gt;Consumers&lt;/strong&gt; can be grouped into &lt;strong&gt;consumer groups&lt;/strong&gt; to achieve &lt;strong&gt;load balancing&lt;/strong&gt; and &lt;strong&gt;horizontal scaling&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Event Broker
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;event broker&lt;/strong&gt; serves as the reliable messaging backbone. It receives &lt;strong&gt;events&lt;/strong&gt; from &lt;strong&gt;producers&lt;/strong&gt;, persists them durably, and delivers them to &lt;strong&gt;consumers&lt;/strong&gt;. Popular &lt;strong&gt;event brokers&lt;/strong&gt; include &lt;strong&gt;Apache Kafka&lt;/strong&gt; for high-throughput streaming and &lt;strong&gt;RabbitMQ&lt;/strong&gt; for flexible routing. Depending on the broker and how producers and consumers are configured, delivery is &lt;strong&gt;at-most-once&lt;/strong&gt; or &lt;strong&gt;at-least-once&lt;/strong&gt;; Kafka can additionally provide &lt;strong&gt;exactly-once&lt;/strong&gt; processing through idempotent producers and transactions.&lt;/p&gt;
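&lt;p&gt;The difference between at-most-once and at-least-once often comes down to whether the consumer commits its offset before or after processing. The simulation below is a simplified sketch with no real broker involved; the &lt;code&gt;consume&lt;/code&gt; function and its parameters are hypothetical.&lt;/p&gt;

```python
def consume(messages, commit_first, crash_at):
    """Simulate one consumer crash and restart over a list of messages.

    commit_first=True  -> at-most-once  (offset committed before processing)
    commit_first=False -> at-least-once (offset committed after processing)
    crash_at is the index of the message during which the consumer crashes,
    between the two operations.
    """
    processed = []
    committed = 0      # next offset to read after a restart
    crashed = False
    offset = 0
    while offset != len(messages):
        msg = messages[offset]
        if commit_first:
            committed = offset + 1
        else:
            processed.append(msg)
        if offset == crash_at and not crashed:
            crashed = True
            offset = committed  # restart from the last committed offset
            continue
        if commit_first:
            processed.append(msg)
        else:
            committed = offset + 1
        offset += 1
    return processed


at_least_once = consume(["a", "b", "c"], commit_first=False, crash_at=1)
at_most_once = consume(["a", "b", "c"], commit_first=True, crash_at=1)
print(at_least_once)  # "b" is processed twice
print(at_most_once)   # "b" is lost
```

&lt;p&gt;At-least-once delivery therefore needs idempotent consumers, while at-most-once accepts that a message may be lost.&lt;/p&gt;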

&lt;h2&gt;
  
  
  How Event-Driven Architecture Works in Practice
&lt;/h2&gt;

&lt;p&gt;Consider an &lt;strong&gt;e-commerce platform&lt;/strong&gt; built with &lt;strong&gt;Event-Driven Architecture&lt;/strong&gt;. When a customer places an order, the &lt;strong&gt;Order Service&lt;/strong&gt; acts as a &lt;strong&gt;producer&lt;/strong&gt; and publishes an &lt;strong&gt;OrderPlaced&lt;/strong&gt; &lt;strong&gt;event&lt;/strong&gt;. This &lt;strong&gt;event&lt;/strong&gt; flows to the &lt;strong&gt;event broker&lt;/strong&gt; and is immediately available to three independent &lt;strong&gt;consumers&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Inventory Service&lt;/strong&gt; subtracts stock and publishes an &lt;strong&gt;InventoryUpdated&lt;/strong&gt; &lt;strong&gt;event&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Payment Service&lt;/strong&gt; processes the transaction and publishes a &lt;strong&gt;PaymentProcessed&lt;/strong&gt; &lt;strong&gt;event&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Notification Service&lt;/strong&gt; sends an email confirmation to the customer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these services call each other directly. They remain completely decoupled, allowing each to scale, fail, or be updated independently while the &lt;strong&gt;event broker&lt;/strong&gt; ensures reliable delivery.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Patterns in Event-Driven Architecture
&lt;/h2&gt;

&lt;p&gt;Several proven patterns elevate &lt;strong&gt;Event-Driven Architecture&lt;/strong&gt; from basic messaging to sophisticated &lt;strong&gt;system design&lt;/strong&gt; solutions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Event Sourcing
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Event Sourcing&lt;/strong&gt; stores the complete history of &lt;strong&gt;events&lt;/strong&gt; rather than just the current state of an entity. The current state of any object is reconstructed by replaying its sequence of &lt;strong&gt;events&lt;/strong&gt; in order, often starting from a periodic snapshot to keep replays short. This pattern provides a complete audit trail, time-travel debugging, and straightforward recovery from failures.&lt;/p&gt;
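&lt;p&gt;Replaying events to rebuild state can be sketched in a few lines. The account example, the event names, and the &lt;code&gt;apply_event&lt;/code&gt; and &lt;code&gt;replay&lt;/code&gt; functions are assumptions chosen for illustration.&lt;/p&gt;

```python
def apply_event(state, event):
    """Return the new account state after one event (pure function)."""
    kind, amount = event["type"], event["amount"]
    if kind == "Deposited":
        return {"balance": state["balance"] + amount}
    if kind == "Withdrawn":
        return {"balance": state["balance"] - amount}
    raise ValueError(f"unknown event type: {kind}")


def replay(events, upto=None):
    """Rebuild state from the first `upto` events (all events if None)."""
    state = {"balance": 0}
    for event in events[:upto]:
        state = apply_event(state, event)
    return state


event_log = [
    {"type": "Deposited", "amount": 100},
    {"type": "Withdrawn", "amount": 30},
    {"type": "Deposited", "amount": 5},
]

current = replay(event_log)          # state after every event
earlier = replay(event_log, upto=2)  # state after the first two events
print(current, earlier)
```

&lt;p&gt;Because the log is never modified, replaying a prefix of it yields the state at any earlier point, which is what makes auditing and time-travel debugging possible.&lt;/p&gt;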

&lt;h3&gt;
  
  
  Command Query Responsibility Segregation (CQRS)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;CQRS&lt;/strong&gt; separates the write path (&lt;strong&gt;commands&lt;/strong&gt;) from the read path (&lt;strong&gt;queries&lt;/strong&gt;). &lt;strong&gt;Commands&lt;/strong&gt; generate &lt;strong&gt;events&lt;/strong&gt; that update the write model. A separate read model is kept synchronized through &lt;strong&gt;event&lt;/strong&gt; subscriptions. This allows optimized data structures for reads while consistency rules are enforced on the write side; the read model is typically eventually consistent and may briefly lag behind writes.&lt;/p&gt;
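&lt;p&gt;A minimal sketch of this split, assuming an order domain: the write model validates commands and emits events, and a read model builds a denormalized view from them. The class names and the synchronous &lt;code&gt;publish&lt;/code&gt; callback are hypothetical simplifications; in practice the events would travel through a broker.&lt;/p&gt;

```python
class OrderWriteModel:
    """Command side: validates commands and emits events."""

    def __init__(self, publish):
        self.orders = {}
        self.publish = publish

    def place_order(self, order_id, user_id, total):
        if order_id in self.orders:
            raise ValueError("order already exists")
        self.orders[order_id] = {"user_id": user_id, "total": total}
        self.publish({"type": "OrderPlaced", "order_id": order_id,
                      "user_id": user_id, "total": total})


class SpendPerUserReadModel:
    """Query side: a denormalized view updated from events."""

    def __init__(self):
        self.totals = {}

    def on_event(self, event):
        if event["type"] == "OrderPlaced":
            user = event["user_id"]
            self.totals[user] = self.totals.get(user, 0) + event["total"]

    def total_spent(self, user_id):
        return self.totals.get(user_id, 0)


read_model = SpendPerUserReadModel()
write_model = OrderWriteModel(publish=read_model.on_event)
write_model.place_order("ORD-1", "USR-1", 50)
write_model.place_order("ORD-2", "USR-1", 25)
print(read_model.total_spent("USR-1"))
```

&lt;p&gt;Queries never touch the write model's storage, so the read side can use whatever shape answers them fastest.&lt;/p&gt;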

&lt;h3&gt;
  
  
  Saga Pattern
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Saga Pattern&lt;/strong&gt; orchestrates long-running &lt;strong&gt;distributed transactions&lt;/strong&gt; without relying on traditional &lt;strong&gt;two-phase commit&lt;/strong&gt;. Each step in a business process publishes a completion &lt;strong&gt;event&lt;/strong&gt; or a compensating &lt;strong&gt;event&lt;/strong&gt; on failure. For example, if an order fails payment, a &lt;strong&gt;CancelOrder&lt;/strong&gt; &lt;strong&gt;event&lt;/strong&gt; triggers compensating actions across services to maintain overall consistency.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation Example Using Apache Kafka
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Apache Kafka&lt;/strong&gt; is a widely used &lt;strong&gt;event broker&lt;/strong&gt; for high-scale &lt;strong&gt;Event-Driven Architecture&lt;/strong&gt;. Below are illustrative code snippets in Python using &lt;strong&gt;confluent-kafka&lt;/strong&gt;, Confluent's Python client library. Broker addresses, topic names, and error handling would need to be adapted before production use.&lt;/p&gt;

&lt;h3&gt;
  
  
  Kafka Producer Implementation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Producer&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;delivery_callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Message delivery failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Message delivered to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; [&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;partition&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;conf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bootstrap.servers&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;kafka-broker-1:9092,kafka-broker-2:9092&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;client.id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gethostname&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;acks&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                    &lt;span class="c1"&gt;# Wait for all in-sync replicas
&lt;/span&gt;    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;enable.idempotence&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;        &lt;span class="c1"&gt;# Prevent duplicate events
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Publish an OrderPlaced event
&lt;/span&gt;&lt;span class="n"&gt;event_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;event_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;evt-1234567890&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;event_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OrderPlaced&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-04-03T06:18:00Z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payload&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;order_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORD-98765&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;USR-54321&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;items&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;product_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PROD-111&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;quantity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;149.99&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;produce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;orders&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_data&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;event_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payload&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;order_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  &lt;span class="c1"&gt;# Ensures ordering per order
&lt;/span&gt;    &lt;span class="n"&gt;callback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;delivery_callback&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Ensure all messages are sent before exit
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



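&lt;p&gt;With idempotence enabled and &lt;code&gt;acks=all&lt;/code&gt;, this &lt;strong&gt;producer&lt;/strong&gt; waits for all in-sync replicas to acknowledge each &lt;strong&gt;event&lt;/strong&gt; and avoids duplicates caused by its own retries. End-to-end &lt;strong&gt;exactly-once&lt;/strong&gt; processing additionally requires Kafka transactions or idempotent consumers.&lt;/p&gt;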

&lt;h3&gt;
  
  
  Kafka Consumer Implementation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Consumer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;KafkaError&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;conf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bootstrap.servers&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;kafka-broker-1:9092,kafka-broker-2:9092&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;group.id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;inventory-service-group&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;auto.offset.reset&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;earliest&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;enable.auto.commit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;# Manual offset commit (at-least-once when committed after processing)
&lt;/span&gt;    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;isolation.level&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;read_committed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;consumer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Consumer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;orders&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;code&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;KafkaError&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_PARTITION_EOF&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;

    &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;value&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;event_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OrderPlaced&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Process inventory deduction
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing inventory for order &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;payload&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;order_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# ... business logic here ...
&lt;/span&gt;
        &lt;span class="c1"&gt;# Publish follow-up event
&lt;/span&gt;        &lt;span class="c1"&gt;# (In real systems this would use a separate producer)
&lt;/span&gt;
        &lt;span class="c1"&gt;# Manual commit only after successful processing
&lt;/span&gt;        &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;consumer&lt;/strong&gt; belongs to a &lt;strong&gt;consumer group&lt;/strong&gt;, processes &lt;strong&gt;events&lt;/strong&gt; in order within each partition, and commits offsets only after successful business logic execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges in Event-Driven Architecture
&lt;/h2&gt;

&lt;p&gt;While powerful, &lt;strong&gt;Event-Driven Architecture&lt;/strong&gt; introduces specific complexities that must be addressed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Eventual Consistency&lt;/strong&gt;: Data across services may temporarily differ until all &lt;strong&gt;events&lt;/strong&gt; propagate.
&lt;/li&gt;
&lt;li&gt;
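&lt;strong&gt;Event Ordering&lt;/strong&gt;: Brokers such as Kafka guarantee order only within a single partition, so preserving the order of related &lt;strong&gt;events&lt;/strong&gt; requires careful &lt;strong&gt;partitioning&lt;/strong&gt; and &lt;strong&gt;key&lt;/strong&gt; selection.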
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotency&lt;/strong&gt;: &lt;strong&gt;Consumers&lt;/strong&gt; must handle duplicate &lt;strong&gt;events&lt;/strong&gt; gracefully using &lt;strong&gt;event IDs&lt;/strong&gt; or &lt;strong&gt;deduplication tables&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging Distributed Flows&lt;/strong&gt;: Tracing a single business transaction across dozens of &lt;strong&gt;events&lt;/strong&gt; requires &lt;strong&gt;distributed tracing&lt;/strong&gt; tools.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema Evolution&lt;/strong&gt;: &lt;strong&gt;Events&lt;/strong&gt; must support forward and backward compatibility through &lt;strong&gt;schema registries&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
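
&lt;p&gt;The idempotency point can be illustrated with a minimal sketch. It assumes every event carries a unique &lt;code&gt;event_id&lt;/code&gt; field and uses an in-memory set as a stand-in for a deduplication table; the &lt;code&gt;IdempotentHandler&lt;/code&gt; class, the &lt;code&gt;sku&lt;/code&gt; and &lt;code&gt;quantity&lt;/code&gt; fields, and the stock figures are hypothetical names chosen for illustration.&lt;/p&gt;

```python
import json


class IdempotentHandler:
    """Applies each event at most once by remembering processed event IDs."""

    def __init__(self):
        # Hypothetical in-memory stand-in for a durable deduplication table.
        self.processed_ids = set()
        self.stock = {"sku-1": 10}

    def handle(self, raw_event):
        event = json.loads(raw_event)
        event_id = event["event_id"]
        if event_id in self.processed_ids:
            return False  # duplicate delivery: skip the side effect
        sku = event["payload"]["sku"]
        self.stock[sku] -= event["payload"]["quantity"]
        self.processed_ids.add(event_id)
        return True


handler = IdempotentHandler()
order = json.dumps({
    "event_id": "evt-42",
    "event_type": "OrderPlaced",
    "payload": {"sku": "sku-1", "quantity": 3},
})
first = handler.handle(order)
second = handler.handle(order)  # simulated redelivery of the same event
print(first, second, handler.stock["sku-1"])
```

&lt;p&gt;In a real service the set would likely be a database table written in the same transaction as the business change, so that a crash cannot separate the two.&lt;/p&gt;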

&lt;h2&gt;
  
  
  Best Practices for Event-Driven Architecture
&lt;/h2&gt;

&lt;p&gt;To build robust &lt;strong&gt;Event-Driven Architecture&lt;/strong&gt; systems, always:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Design &lt;strong&gt;events&lt;/strong&gt; as facts about the past, never as commands.
&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Avro&lt;/strong&gt; or &lt;strong&gt;Protobuf&lt;/strong&gt; with a &lt;strong&gt;schema registry&lt;/strong&gt; for type safety.
&lt;/li&gt;
&lt;li&gt;Implement &lt;strong&gt;dead-letter queues&lt;/strong&gt; for failed &lt;strong&gt;events&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Monitor &lt;strong&gt;event&lt;/strong&gt; lag, throughput, and consumer health continuously.
&lt;/li&gt;
&lt;li&gt;Version &lt;strong&gt;events&lt;/strong&gt; explicitly and maintain backward compatibility.
&lt;/li&gt;
&lt;li&gt;Combine &lt;strong&gt;Event-Driven Architecture&lt;/strong&gt; with &lt;strong&gt;CQRS&lt;/strong&gt; and &lt;strong&gt;Event Sourcing&lt;/strong&gt; only when business requirements justify the added complexity.&lt;/li&gt;
&lt;/ul&gt;
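
&lt;p&gt;As a rough sketch of the dead-letter practice, the snippet below retries a failing event a bounded number of times and then parks it in a dead-letter list. The queues are plain in-memory Python structures, and &lt;code&gt;MAX_ATTEMPTS&lt;/code&gt;, &lt;code&gt;process&lt;/code&gt;, and &lt;code&gt;drain&lt;/code&gt; are hypothetical names rather than part of any broker API.&lt;/p&gt;

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget before an event is dead-lettered


def process(event):
    """Hypothetical handler that rejects events without an order_id."""
    if "order_id" not in event:
        raise ValueError("missing order_id")
    return event["order_id"]


def drain(events):
    """Process events, retrying failures and dead-lettering exhausted ones."""
    main_queue = deque((event, 1) for event in events)
    processed, dead_letters = [], []
    while main_queue:
        event, attempt = main_queue.popleft()
        try:
            processed.append(process(event))
        except ValueError as exc:
            if attempt >= MAX_ATTEMPTS:
                # Keep the failure reason so the event can be inspected later.
                dead_letters.append(
                    {"event": event, "error": str(exc), "attempts": attempt}
                )
            else:
                main_queue.append((event, attempt + 1))
    return processed, dead_letters


processed, dead_letters = drain(
    [{"order_id": "A1"}, {"bad": True}, {"order_id": "A2"}]
)
print(processed)
print(dead_letters)
```

&lt;p&gt;Healthy events keep flowing while the malformed one is set aside with its error message, which is the property a dead-letter queue is meant to provide.&lt;/p&gt;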

&lt;p&gt;&lt;strong&gt;Event-Driven Architecture&lt;/strong&gt; empowers &lt;strong&gt;system design&lt;/strong&gt; teams to create resilient, scalable, and maintainable distributed systems that respond to real-world changes in near real time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkfu8p151725ttierbl4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkfu8p151725ttierbl4.png" alt="Event-driven architecture diagram" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;


</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Message Queues (Kafka, RabbitMQ, SQS) in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Fri, 03 Apr 2026 06:09:40 +0000</pubDate>
      <link>https://forem.com/code_2/message-queues-kafka-rabbitmq-sqs-in-system-design-57jl</link>
      <guid>https://forem.com/code_2/message-queues-kafka-rabbitmq-sqs-in-system-design-57jl</guid>
      <description>&lt;h2&gt;
  
  
  Introduction to Message Queues in System Design
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Message queues&lt;/strong&gt; serve as the backbone of &lt;strong&gt;asynchronous communication&lt;/strong&gt; in &lt;strong&gt;distributed systems&lt;/strong&gt;. They enable &lt;strong&gt;producers&lt;/strong&gt; to send &lt;strong&gt;messages&lt;/strong&gt; that &lt;strong&gt;consumers&lt;/strong&gt; process independently, without requiring both parties to be available simultaneously. This &lt;strong&gt;decoupling&lt;/strong&gt; eliminates tight temporal dependencies, buffers traffic during spikes, and provides &lt;strong&gt;fault tolerance&lt;/strong&gt; by allowing &lt;strong&gt;retries&lt;/strong&gt; and &lt;strong&gt;dead-letter queues&lt;/strong&gt; for failed messages. &lt;/p&gt;

&lt;p&gt;In &lt;strong&gt;system design&lt;/strong&gt;, &lt;strong&gt;message queues&lt;/strong&gt; address critical challenges such as &lt;strong&gt;scalability&lt;/strong&gt;, &lt;strong&gt;reliability&lt;/strong&gt;, and &lt;strong&gt;throughput&lt;/strong&gt;. They support patterns including &lt;strong&gt;event-driven architecture&lt;/strong&gt;, &lt;strong&gt;microservices communication&lt;/strong&gt;, &lt;strong&gt;task queues&lt;/strong&gt;, and &lt;strong&gt;log aggregation&lt;/strong&gt;. By persisting &lt;strong&gt;messages&lt;/strong&gt; durably, they ensure data survives failures; log-based systems such as Kafka can also replay retained &lt;strong&gt;messages&lt;/strong&gt; when necessary. &lt;strong&gt;Message queues&lt;/strong&gt; come in two broad categories: traditional &lt;strong&gt;task queues&lt;/strong&gt; focused on routing and delivery, and &lt;strong&gt;event streams&lt;/strong&gt; optimized for high-volume, ordered, replayable data.&lt;/p&gt;

&lt;p&gt;Three prominent implementations dominate modern &lt;strong&gt;system design&lt;/strong&gt;: &lt;strong&gt;Apache Kafka&lt;/strong&gt; for high-throughput &lt;strong&gt;event streaming&lt;/strong&gt;, &lt;strong&gt;RabbitMQ&lt;/strong&gt; for flexible &lt;strong&gt;routing&lt;/strong&gt; and &lt;strong&gt;messaging patterns&lt;/strong&gt;, and &lt;strong&gt;Amazon SQS&lt;/strong&gt; for fully managed, serverless &lt;strong&gt;queue&lt;/strong&gt; operations. Each offers distinct strengths in &lt;strong&gt;architecture&lt;/strong&gt;, &lt;strong&gt;delivery guarantees&lt;/strong&gt;, and operational model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Apache Kafka: Distributed Event Streaming Platform
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Apache Kafka&lt;/strong&gt; functions as a distributed &lt;strong&gt;event streaming platform&lt;/strong&gt; rather than a simple &lt;strong&gt;message queue&lt;/strong&gt;. It excels in scenarios demanding massive &lt;strong&gt;throughput&lt;/strong&gt;, &lt;strong&gt;durability&lt;/strong&gt;, and &lt;strong&gt;replayability&lt;/strong&gt; across thousands of &lt;strong&gt;producers&lt;/strong&gt; and &lt;strong&gt;consumers&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Architecture of Kafka
&lt;/h3&gt;

&lt;p&gt;A &lt;strong&gt;Kafka cluster&lt;/strong&gt; consists of multiple &lt;strong&gt;brokers&lt;/strong&gt; that store and serve data. &lt;strong&gt;Topics&lt;/strong&gt; act as logical categories for &lt;strong&gt;messages&lt;/strong&gt;, each divided into &lt;strong&gt;partitions&lt;/strong&gt; for parallelism. Every &lt;strong&gt;partition&lt;/strong&gt; is an ordered, immutable log stored on a &lt;strong&gt;broker&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Replication&lt;/strong&gt; ensures fault tolerance: each &lt;strong&gt;partition&lt;/strong&gt; has a &lt;strong&gt;leader&lt;/strong&gt; that handles reads and writes by default, plus &lt;strong&gt;followers&lt;/strong&gt; that copy its data. Followers that are fully caught up belong to the in-sync replica (ISR) set. If the &lt;strong&gt;leader&lt;/strong&gt; fails, a new leader is elected from the in-sync &lt;strong&gt;followers&lt;/strong&gt;. &lt;strong&gt;Producers&lt;/strong&gt; write to &lt;strong&gt;topics&lt;/strong&gt;, while &lt;strong&gt;consumers&lt;/strong&gt; read from &lt;strong&gt;partitions&lt;/strong&gt; and commit &lt;strong&gt;offsets&lt;/strong&gt; to track progress. &lt;strong&gt;Consumer groups&lt;/strong&gt; enable load balancing, with each &lt;strong&gt;partition&lt;/strong&gt; assigned to exactly one &lt;strong&gt;consumer&lt;/strong&gt; in the group.&lt;/p&gt;
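
&lt;p&gt;To make the partition and consumer-group rules concrete, here is a small simulation that needs no broker. It is a sketch under stated assumptions: &lt;code&gt;zlib.crc32&lt;/code&gt; stands in for the murmur2 hash that Kafka's Java client applies to keys, and the round-robin loop is a simplified stand-in for the group's real assignment strategies. The function names are hypothetical.&lt;/p&gt;

```python
import zlib


def partition_for(key, num_partitions):
    """Map a record key to a partition (stand-in hash, not Kafka's murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_round_robin(num_partitions, consumers):
    """Give every partition exactly one owner within the consumer group."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


assignment = assign_round_robin(6, ["c1", "c2", "c3", "c4"])
print(assignment)

# With more consumers than partitions, the extra consumer stays idle.
idle_case = assign_round_robin(2, ["c1", "c2", "c3"])
print(idle_case)

# The same key always maps to the same partition, preserving per-key order.
print(partition_for("user-17", 6) == partition_for("user-17", 6))
```

&lt;p&gt;The second call shows why the partition count caps the useful size of a consumer group: a third consumer on a two-partition topic receives nothing.&lt;/p&gt;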

&lt;p&gt;&lt;strong&gt;Kafka&lt;/strong&gt; supports &lt;strong&gt;exactly-once semantics&lt;/strong&gt; for read-process-write pipelines that stay within Kafka, through &lt;strong&gt;idempotent producers&lt;/strong&gt; (using sequence numbers to deduplicate retries) and &lt;strong&gt;transactions&lt;/strong&gt; (atomic produce-and-commit operations). Modern deployments use &lt;strong&gt;KRaft mode&lt;/strong&gt; for metadata management via the &lt;strong&gt;Raft consensus protocol&lt;/strong&gt;, eliminating the external ZooKeeper dependency.&lt;/p&gt;
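
&lt;p&gt;The sequence-number idea behind idempotent producers can be sketched with a toy log. This is a deliberate simplification rather than the broker's actual bookkeeping: it tracks one sequence number per record instead of per batch, and &lt;code&gt;PartitionLog&lt;/code&gt; and its return strings are hypothetical.&lt;/p&gt;

```python
class PartitionLog:
    """Toy partition log that drops retried writes by sequence number."""

    def __init__(self):
        self.records = []
        self.last_sequence = {}  # producer_id mapped to last accepted sequence

    def append(self, producer_id, sequence, value):
        last = self.last_sequence.get(producer_id, -1)
        if sequence == last + 1:
            # Next expected sequence: accept the write.
            self.records.append(value)
            self.last_sequence[producer_id] = sequence
            return "appended"
        if last >= sequence:
            # Already stored: a retry after a lost acknowledgement.
            return "duplicate"
        # A gap means an earlier write is missing.
        return "out_of_order"


log = PartitionLog()
results = [
    log.append(7, 0, "a"),
    log.append(7, 1, "b"),
    log.append(7, 1, "b"),  # retry of an already accepted write
    log.append(7, 3, "d"),  # sequence 2 never arrived
]
print(results)
print(log.records)
```

&lt;p&gt;The retried write is acknowledged without being stored twice, which is how a producer can retry safely after a lost acknowledgement.&lt;/p&gt;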

&lt;p&gt;&lt;strong&gt;Kafka&lt;/strong&gt; delivers &lt;strong&gt;high throughput&lt;/strong&gt; via &lt;strong&gt;batching&lt;/strong&gt;, &lt;strong&gt;compression&lt;/strong&gt;, and &lt;strong&gt;zero-copy&lt;/strong&gt; transfers. It supports &lt;strong&gt;log compaction&lt;/strong&gt; for stateful streams and integrates seamlessly with stream processing frameworks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Producer and Consumer Implementation in Kafka
&lt;/h3&gt;

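&lt;p&gt;Here is a &lt;strong&gt;Python&lt;/strong&gt; implementation using the &lt;code&gt;kafka-python&lt;/code&gt; library for a basic &lt;strong&gt;producer&lt;/strong&gt; and &lt;strong&gt;consumer&lt;/strong&gt;. This demonstrates &lt;strong&gt;idempotent&lt;/strong&gt; production and &lt;strong&gt;offset&lt;/strong&gt; management. Note that &lt;code&gt;enable_idempotence&lt;/code&gt; requires a &lt;code&gt;kafka-python&lt;/code&gt; release with idempotent producer support; older 2.0.x releases reject the option as an unrecognized config.&lt;br&gt;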
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kafka&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KafkaProducer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;KafkaConsumer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kafka.errors&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KafkaError&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="c1"&gt;# Producer configuration
&lt;/span&gt;&lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KafkaProducer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;bootstrap_servers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;localhost:9092&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;value_serializer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;acks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                    &lt;span class="c1"&gt;# Wait for all in-sync replicas
&lt;/span&gt;    &lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                     &lt;span class="c1"&gt;# Retry on transient failures
&lt;/span&gt;    &lt;span class="n"&gt;enable_idempotence&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# Prevent duplicates on retries
&lt;/span&gt;    &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;              &lt;span class="c1"&gt;# Batch records for efficiency
&lt;/span&gt;    &lt;span class="n"&gt;linger_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;                    &lt;span class="c1"&gt;# Wait briefly to fill batches
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Send messages
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;event_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;payload-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;record_metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Message sent to topic &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record_metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; partition &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record_metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;partition&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; offset &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record_metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;KafkaError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to send: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



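&lt;p&gt;&lt;strong&gt;Explanation of producer code&lt;/strong&gt;: &lt;code&gt;bootstrap_servers&lt;/code&gt; lists the brokers used for the initial connection to the cluster. &lt;code&gt;acks='all'&lt;/code&gt; ensures &lt;strong&gt;durability&lt;/strong&gt; by waiting until all in-sync replicas have the record. &lt;code&gt;enable_idempotence=True&lt;/code&gt; prevents duplicates caused by producer retries. &lt;code&gt;batch_size&lt;/code&gt; and &lt;code&gt;linger_ms&lt;/code&gt; control batching, which raises &lt;strong&gt;throughput&lt;/strong&gt;. The &lt;code&gt;send&lt;/code&gt; method returns a future for asynchronous confirmation; calling &lt;code&gt;future.get()&lt;/code&gt; after every send, as this example does, blocks until each record is acknowledged, so high-throughput producers usually attach callbacks instead.&lt;br&gt;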
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Consumer configuration
&lt;/span&gt;&lt;span class="n"&gt;consumer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KafkaConsumer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;bootstrap_servers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;localhost:9092&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;group_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;event-processors&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;value_deserializer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="n"&gt;auto_offset_reset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;earliest&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;# Start from beginning if no offset
&lt;/span&gt;    &lt;span class="n"&gt;enable_auto_commit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;        &lt;span class="c1"&gt;# Commit manually after processing
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Consumed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; from partition &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;partition&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; offset &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
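    &lt;span class="c1"&gt;# Process message here
&lt;/span&gt;    &lt;span class="c1"&gt;# Commit the offset only after processing succeeds (at-least-once delivery)
&lt;/span&gt;    &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;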

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Explanation of consumer code&lt;/strong&gt;: The &lt;code&gt;group_id&lt;/code&gt; joins a &lt;strong&gt;consumer group&lt;/strong&gt; for parallel processing. &lt;code&gt;auto_offset_reset&lt;/code&gt; controls the starting position when the group has no committed offset. Disabling auto-commit and committing &lt;strong&gt;offsets&lt;/strong&gt; only after successful handling gives at-least-once delivery, so the handler should tolerate duplicates. Exactly-once processing additionally requires committing the consumed offsets inside a producer transaction.&lt;/p&gt;

&lt;h2&gt;
  
  
  RabbitMQ: Flexible Messaging Broker with Advanced Routing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;RabbitMQ&lt;/strong&gt; implements the &lt;strong&gt;AMQP&lt;/strong&gt; protocol as a robust &lt;strong&gt;message broker&lt;/strong&gt; optimized for complex &lt;strong&gt;routing&lt;/strong&gt; and reliable delivery. It suits &lt;strong&gt;task queues&lt;/strong&gt;, &lt;strong&gt;work distribution&lt;/strong&gt;, and scenarios requiring sophisticated message patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Architecture of RabbitMQ
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;RabbitMQ&lt;/strong&gt; uses &lt;strong&gt;exchanges&lt;/strong&gt; to route &lt;strong&gt;messages&lt;/strong&gt; to &lt;strong&gt;queues&lt;/strong&gt; based on &lt;strong&gt;bindings&lt;/strong&gt; and &lt;strong&gt;routing keys&lt;/strong&gt;. Producers publish to &lt;strong&gt;exchanges&lt;/strong&gt;; &lt;strong&gt;consumers&lt;/strong&gt; receive from &lt;strong&gt;queues&lt;/strong&gt;, usually by subscribing so that the broker pushes deliveries to them.&lt;/p&gt;

&lt;p&gt;Four main &lt;strong&gt;exchange&lt;/strong&gt; types exist:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Direct&lt;/strong&gt;: Routes based on exact &lt;strong&gt;routing key&lt;/strong&gt; match.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fanout&lt;/strong&gt;: Broadcasts to all bound &lt;strong&gt;queues&lt;/strong&gt; (ignores key).&lt;/li&gt;
&lt;li&gt;
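&lt;strong&gt;Topic&lt;/strong&gt;: Uses pattern matching on &lt;strong&gt;routing keys&lt;/strong&gt; (e.g., &lt;code&gt;user.*.event&lt;/code&gt;, where &lt;code&gt;*&lt;/code&gt; matches exactly one word and &lt;code&gt;#&lt;/code&gt; matches zero or more).&lt;/li&gt;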
&lt;li&gt;
&lt;strong&gt;Headers&lt;/strong&gt;: Routes based on message header attributes.&lt;/li&gt;
&lt;/ul&gt;
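
&lt;p&gt;The matching rule for &lt;strong&gt;topic&lt;/strong&gt; exchanges is easy to get wrong, so here is a minimal pure-Python sketch of it. It assumes the standard AMQP wildcards (&lt;code&gt;*&lt;/code&gt; for exactly one dot-separated word, &lt;code&gt;#&lt;/code&gt; for zero or more words); &lt;code&gt;topic_matches&lt;/code&gt;, &lt;code&gt;route&lt;/code&gt;, and the binding names are hypothetical and this is not how the broker implements routing internally.&lt;/p&gt;

```python
def _match(pattern_words, key_words):
    """Recursively match binding pattern words against routing key words."""
    if not pattern_words:
        return not key_words
    head, rest = pattern_words[0], pattern_words[1:]
    if head == "#":
        # '#' may absorb zero or more words.
        return any(
            _match(rest, key_words[i:]) for i in range(len(key_words) + 1)
        )
    if not key_words:
        return False
    if head == "*" or head == key_words[0]:
        # '*' absorbs exactly one word; a literal must match exactly.
        return _match(rest, key_words[1:])
    return False


def topic_matches(pattern, routing_key):
    return _match(pattern.split("."), routing_key.split("."))


def route(bindings, routing_key):
    """Return the queues whose binding pattern matches the routing key."""
    return [
        queue for pattern, queue in bindings
        if topic_matches(pattern, routing_key)
    ]


bindings = [
    ("user.*.event", "audit"),
    ("user.#", "user-all"),
    ("order.created", "orders"),
]
print(route(bindings, "user.signup.event"))
print(route(bindings, "user.deleted"))
```

&lt;p&gt;A message can land in several queues at once, as the first call shows. A &lt;strong&gt;direct&lt;/strong&gt; exchange corresponds to the literal-only case, and a &lt;strong&gt;fanout&lt;/strong&gt; exchange behaves as if every binding matched.&lt;/p&gt;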

&lt;p&gt;&lt;strong&gt;Queues&lt;/strong&gt; can be &lt;strong&gt;durable&lt;/strong&gt; (survive restarts), &lt;strong&gt;exclusive&lt;/strong&gt;, or &lt;strong&gt;auto-delete&lt;/strong&gt;. &lt;strong&gt;Acknowledgments&lt;/strong&gt; ensure reliable delivery: &lt;strong&gt;consumers&lt;/strong&gt; explicitly acknowledge after processing. &lt;strong&gt;Prefetch&lt;/strong&gt; limits unacknowledged messages per &lt;strong&gt;consumer&lt;/strong&gt; for flow control. &lt;strong&gt;Dead-letter queues&lt;/strong&gt; capture failed messages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RabbitMQ&lt;/strong&gt; supports &lt;strong&gt;clustering&lt;/strong&gt; for high availability and &lt;strong&gt;quorum queues&lt;/strong&gt; for replication (classic &lt;strong&gt;mirrored queues&lt;/strong&gt; were deprecated and then removed in RabbitMQ 4.0). It provides &lt;strong&gt;low-latency&lt;/strong&gt; delivery and works across multiple protocols.&lt;/p&gt;

&lt;h3&gt;
  
  
  Producer and Consumer Implementation in RabbitMQ
&lt;/h3&gt;

&lt;p&gt;Here is a complete &lt;strong&gt;Python&lt;/strong&gt; implementation using the &lt;code&gt;pika&lt;/code&gt; library.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pika&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="c1"&gt;# Establish connection and channel
&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pika&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;BlockingConnection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pika&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ConnectionParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;channel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Declare durable queue and direct exchange
&lt;/span&gt;&lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exchange_declare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exchange&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exchange_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;direct&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;durable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;queue_declare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;event-processor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;durable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;queue_bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exchange&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;event-processor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;routing_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user.created&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Producer: publish message
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;event_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user.created&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;payload-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basic_publish&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;exchange&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;routing_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user.created&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;properties&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pika&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;BasicProperties&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;delivery_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Persistent
&lt;/span&gt;        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Published event &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



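&lt;p&gt;&lt;strong&gt;Explanation of producer code&lt;/strong&gt;: The &lt;strong&gt;exchange&lt;/strong&gt; and &lt;strong&gt;queue&lt;/strong&gt; are declared durable, so their definitions survive a broker restart. Publishing with &lt;code&gt;delivery_mode=2&lt;/code&gt; marks each &lt;strong&gt;message&lt;/strong&gt; as persistent, so the broker writes it to disk; a message can still be lost if the broker fails before the write completes, which is the gap publisher confirms are meant to close. The &lt;strong&gt;routing key&lt;/strong&gt; &lt;code&gt;user.created&lt;/code&gt; matches the &lt;strong&gt;binding&lt;/strong&gt; key, so the direct exchange delivers each message to &lt;code&gt;event-processor&lt;/code&gt;.&lt;br&gt;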
&lt;/p&gt;
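&lt;p&gt;To make the routing rule concrete, here is a minimal in-memory sketch of how a direct exchange matches routing keys against bindings. The &lt;code&gt;DirectExchange&lt;/code&gt; class is a hypothetical toy model written for illustration; it is not part of &lt;code&gt;pika&lt;/code&gt; or RabbitMQ, and it ignores durability, acknowledgements, and networking.&lt;/p&gt;

```python
from collections import defaultdict


class DirectExchange:
    """Toy in-memory model of a direct exchange (illustration only, not pika)."""

    def __init__(self):
        # routing key -> names of queues bound with that key
        self.bindings = defaultdict(set)
        # queue name -> list of delivered message bodies
        self.queues = defaultdict(list)

    def bind(self, queue, routing_key):
        self.bindings[routing_key].add(queue)

    def publish(self, routing_key, body):
        # A direct exchange delivers only to queues whose binding key
        # equals the routing key of the message exactly.
        targets = sorted(self.bindings.get(routing_key, ()))
        for queue in targets:
            self.queues[queue].append(body)
        return targets


exchange = DirectExchange()
exchange.bind("event-processor", "user.created")

delivered = exchange.publish("user.created", "payload-0")
dropped = exchange.publish("user.deleted", "payload-1")
print(delivered)  # ['event-processor']
print(dropped)    # []
```

&lt;p&gt;The second publish reaches no queue because nothing is bound with &lt;code&gt;user.deleted&lt;/code&gt;. A real broker likewise discards unroutable messages by default unless the publisher sets the mandatory flag or an alternate exchange is configured.&lt;/p&gt;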

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Consumer setup
&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pika&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;BlockingConnection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pika&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ConnectionParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;channel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;queue_declare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;event-processor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;durable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basic_ack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delivery_tag&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;delivery_tag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Manual ack
&lt;/span&gt;
&lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basic_qos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prefetch_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Fair dispatch
&lt;/span&gt;&lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basic_consume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;event-processor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_message_callback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Waiting for messages...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start_consuming&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Explanation of consumer code&lt;/strong&gt;: &lt;code&gt;basic_qos&lt;/code&gt; with &lt;code&gt;prefetch_count=1&lt;/code&gt; limits each consumer to one unacknowledged message at a time, so a slow worker does not accumulate a backlog while others sit idle. Manual &lt;code&gt;basic_ack&lt;/code&gt; confirms successful processing; if the channel or connection closes before the ack, the broker requeues the message. Because a requeued message can be delivered again, handlers should be &lt;strong&gt;idempotent&lt;/strong&gt;, which also makes &lt;strong&gt;retry&lt;/strong&gt; logic safe.&lt;/p&gt;
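&lt;p&gt;One common way to tolerate redelivery is to track which events have already been handled. The sketch below assumes each message carries a unique &lt;code&gt;event_id&lt;/code&gt;, as in the producer above; the &lt;code&gt;processed_ids&lt;/code&gt; set and the &lt;code&gt;handle_once&lt;/code&gt; function are hypothetical names, and a real deployment would need a persistent, shared store rather than process memory.&lt;/p&gt;

```python
import json

processed_ids = set()  # hypothetical dedup store; a real system would persist this


def handle_once(body):
    """Process a message unless its event_id was already handled."""
    message = json.loads(body)
    event_id = message["event_id"]
    if event_id in processed_ids:
        return False  # duplicate delivery: skip side effects, still safe to ack
    # ... side effects would go here ...
    processed_ids.add(event_id)
    return True


first = handle_once(json.dumps({"event_id": 7, "type": "user.created"}))
again = handle_once(json.dumps({"event_id": 7, "type": "user.created"}))
print(first, again)  # True False
```

&lt;p&gt;Inside the &lt;code&gt;pika&lt;/code&gt; callback, a function like this could run before &lt;code&gt;basic_ack&lt;/code&gt;, so a redelivered message is acknowledged without repeating its side effects.&lt;/p&gt;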

&lt;h2&gt;
  
  
  Amazon SQS: Fully Managed Serverless Queues
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Amazon SQS&lt;/strong&gt; provides a fully managed &lt;strong&gt;message queue&lt;/strong&gt; service that removes infrastructure overhead. It focuses on simplicity and seamless integration within cloud-native architectures.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Architecture of SQS
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;SQS&lt;/strong&gt; offers two queue types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Standard queues&lt;/strong&gt;: Provide &lt;strong&gt;at-least-once&lt;/strong&gt; delivery with high &lt;strong&gt;throughput&lt;/strong&gt; and best-effort ordering. Messages may arrive out of order or more than once.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FIFO queues&lt;/strong&gt;: Preserve strict ordering within each &lt;strong&gt;message group&lt;/strong&gt; and provide &lt;strong&gt;exactly-once&lt;/strong&gt; processing by discarding duplicate sends that arrive within the 5-minute deduplication interval, in exchange for lower throughput limits than standard queues.&lt;/li&gt;
&lt;/ul&gt;
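&lt;p&gt;As a rough sketch of what sending to a FIFO queue involves, the helper below builds the keyword arguments for &lt;code&gt;send_message&lt;/code&gt;. &lt;code&gt;MessageGroupId&lt;/code&gt; and &lt;code&gt;MessageDeduplicationId&lt;/code&gt; are SQS parameters; the helper name &lt;code&gt;fifo_send_kwargs&lt;/code&gt;, the queue URL, the group ID, and the way the deduplication ID is derived are all assumptions made for illustration.&lt;/p&gt;

```python
import json


def fifo_send_kwargs(queue_url, event, group_id):
    """Build keyword arguments for sqs.send_message on a FIFO queue."""
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps(event),
        # Messages sharing a group ID are delivered in order
        "MessageGroupId": group_id,
        # Sends repeating this ID inside the deduplication interval are dropped
        "MessageDeduplicationId": f"{group_id}-{event['event_id']}",
    }


kwargs = fifo_send_kwargs(
    "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue.fifo",  # placeholder URL
    {"event_id": 1, "data": "payload-1"},
    group_id="user-42",
)
print(kwargs["MessageDeduplicationId"])  # user-42-1
# With a real client: sqs.send_message(**kwargs)
```

&lt;p&gt;Choosing the group ID is the main design decision: a single group serializes every message, while one group per user or per order keeps ordering where it matters and lets unrelated groups proceed in parallel.&lt;/p&gt;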

&lt;p&gt;&lt;strong&gt;Producers&lt;/strong&gt; send messages via API; &lt;strong&gt;consumers&lt;/strong&gt; poll for messages. &lt;strong&gt;Visibility timeout&lt;/strong&gt; hides a received message temporarily, preventing concurrent processing. If not deleted within the timeout, the message reappears. &lt;strong&gt;Long polling&lt;/strong&gt; waits up to 20 seconds for messages, reducing empty responses. &lt;strong&gt;Dead-letter queues&lt;/strong&gt; capture messages failing after a configurable receive count.&lt;/p&gt;
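&lt;p&gt;These settings are queue attributes. The sketch below builds an attribute dictionary that attaches a &lt;strong&gt;dead-letter queue&lt;/strong&gt; and sets long polling and the visibility timeout as queue defaults. The attribute names are SQS queue attributes; the helper name &lt;code&gt;dlq_attributes&lt;/code&gt;, the ARN, and the chosen values are placeholders for illustration.&lt;/p&gt;

```python
import json


def dlq_attributes(dlq_arn, max_receive_count=5):
    """Build an Attributes dict that attaches a dead-letter queue to a source queue."""
    return {
        # After max_receive_count receives without a delete, SQS moves the message
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": max_receive_count,
        }),
        "ReceiveMessageWaitTimeSeconds": "20",  # long polling as the queue default
        "VisibilityTimeout": "30",              # seconds a received message stays hidden
    }


attrs = dlq_attributes("arn:aws:sqs:us-east-1:123456789012:my-queue-dlq")  # placeholder ARN
print(json.loads(attrs["RedrivePolicy"])["maxReceiveCount"])  # 5
# With a real client: sqs.set_queue_attributes(QueueUrl=queue_url, Attributes=attrs)
```

&lt;p&gt;Attribute values are passed as strings, and the redrive policy itself is a JSON string nested inside the attribute map.&lt;/p&gt;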

&lt;p&gt;&lt;strong&gt;SQS&lt;/strong&gt; handles replication, encryption, and scaling automatically. It integrates natively with other cloud services for &lt;strong&gt;event-driven&lt;/strong&gt; workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Producer and Consumer Implementation in SQS
&lt;/h3&gt;

&lt;p&gt;Here is a complete &lt;strong&gt;Python&lt;/strong&gt; implementation using &lt;code&gt;boto3&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;sqs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sqs&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;queue_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://sqs.us-east-1.amazonaws.com/123456789012/my-queue&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;  &lt;span class="c1"&gt;# Replace with actual URL
&lt;/span&gt;
&lt;span class="c1"&gt;# Producer: send message
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;event_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;payload-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sqs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;QueueUrl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;queue_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;MessageBody&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;MessageAttributes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;EventType&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;DataType&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;String&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;StringValue&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user.created&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sent message ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;MessageId&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Explanation of producer code&lt;/strong&gt;: &lt;code&gt;send_message&lt;/code&gt; accepts a &lt;strong&gt;body&lt;/strong&gt; and optional &lt;strong&gt;message attributes&lt;/strong&gt;, structured metadata such as the event type that consumers can read without parsing the body. &lt;strong&gt;SQS&lt;/strong&gt; stores each message redundantly and handles distribution automatically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Consumer: receive and process
&lt;/span&gt;&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sqs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;receive_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;QueueUrl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;queue_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;MaxNumberOfMessages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;WaitTimeSeconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# Long polling
&lt;/span&gt;        &lt;span class="n"&gt;VisibilityTimeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;         &lt;span class="c1"&gt;# Hide for 30 seconds
&lt;/span&gt;    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Messages&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Messages&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Received and processing: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Process logic here
&lt;/span&gt;
        &lt;span class="c1"&gt;# Delete after success
&lt;/span&gt;        &lt;span class="n"&gt;sqs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;QueueUrl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;queue_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;ReceiptHandle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ReceiptHandle&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No messages available&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



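&lt;p&gt;&lt;strong&gt;Explanation of consumer code&lt;/strong&gt;: &lt;strong&gt;Long polling&lt;/strong&gt; via &lt;code&gt;WaitTimeSeconds&lt;/code&gt; cuts down on empty responses. The &lt;strong&gt;visibility timeout&lt;/strong&gt; keeps other consumers from receiving the message while it is being processed, but it does not rule out duplicates: if processing takes longer than the 30 seconds set here, the message becomes visible again and may be delivered a second time. &lt;code&gt;delete_message&lt;/code&gt; removes the &lt;strong&gt;message&lt;/strong&gt; permanently using the &lt;strong&gt;receipt handle&lt;/strong&gt; from the most recent receive.&lt;/p&gt;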
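&lt;p&gt;The consumer loop above deletes every message it receives. A slightly more defensive variant deletes only what was processed successfully and leaves failures to reappear after the visibility timeout. The sketch below shows that decision on plain dictionaries shaped like entries of the &lt;code&gt;Messages&lt;/code&gt; list; &lt;code&gt;process_messages&lt;/code&gt; and &lt;code&gt;flaky_handler&lt;/code&gt; are hypothetical names, and no AWS call is made.&lt;/p&gt;

```python
def process_messages(messages, handler):
    """Return receipt handles for messages the handler processed without raising."""
    to_delete = []
    for message in messages:
        try:
            handler(message["Body"])
        except Exception as exc:
            # Leave the message alone: it reappears after the visibility timeout
            print(f"Processing failed, will retry: {exc}")
            continue
        to_delete.append(message["ReceiptHandle"])
    return to_delete


def flaky_handler(body):  # hypothetical handler for illustration
    if body == "bad":
        raise ValueError("cannot parse body")


sample = [
    {"Body": "ok", "ReceiptHandle": "rh-1"},
    {"Body": "bad", "ReceiptHandle": "rh-2"},
]
handles = process_messages(sample, flaky_handler)
print(handles)  # ['rh-1']
```

&lt;p&gt;Each returned handle would then be passed to &lt;code&gt;delete_message&lt;/code&gt;. Combined with a dead-letter queue, a message that keeps failing is eventually set aside instead of being retried forever.&lt;/p&gt;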

&lt;h2&gt;
  
  
  Choosing the Right Message Queue
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Kafka&lt;/strong&gt; shines for &lt;strong&gt;event streaming&lt;/strong&gt; with replayability and massive scale. &lt;strong&gt;RabbitMQ&lt;/strong&gt; excels in &lt;strong&gt;complex routing&lt;/strong&gt; and traditional &lt;strong&gt;task queues&lt;/strong&gt;. &lt;strong&gt;SQS&lt;/strong&gt; offers zero-ops management for simple &lt;strong&gt;decoupling&lt;/strong&gt; in cloud environments. Evaluate based on &lt;strong&gt;throughput&lt;/strong&gt; needs, &lt;strong&gt;ordering&lt;/strong&gt; requirements, &lt;strong&gt;operational burden&lt;/strong&gt;, and &lt;strong&gt;delivery semantics&lt;/strong&gt; when designing systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffapyqcci4zcdvj77fyk5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffapyqcci4zcdvj77fyk5.png" alt="Message queues in system design" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>API Design (REST, GraphQL, gRPC) in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Fri, 03 Apr 2026 05:47:43 +0000</pubDate>
      <link>https://forem.com/code_2/api-design-rest-graphql-grpc-in-system-design-4nci</link>
      <guid>https://forem.com/code_2/api-design-rest-graphql-grpc-in-system-design-4nci</guid>
      <description>&lt;p&gt;In &lt;strong&gt;system design&lt;/strong&gt;, &lt;strong&gt;API design&lt;/strong&gt; is the communication layer that lets components, services, and clients interact at scale. The choice of &lt;strong&gt;API&lt;/strong&gt; paradigm directly influences performance, maintainability, scalability, and developer experience. Three approaches dominate modern &lt;strong&gt;distributed systems&lt;/strong&gt;: &lt;strong&gt;REST&lt;/strong&gt;, &lt;strong&gt;GraphQL&lt;/strong&gt;, and &lt;strong&gt;gRPC&lt;/strong&gt;. Each addresses distinct challenges in &lt;strong&gt;data exchange&lt;/strong&gt;, &lt;strong&gt;efficiency&lt;/strong&gt;, and &lt;strong&gt;real-time capabilities&lt;/strong&gt;, and each fits particular architectural patterns such as &lt;strong&gt;microservices&lt;/strong&gt;, &lt;strong&gt;event-driven architectures&lt;/strong&gt;, and &lt;strong&gt;high-throughput applications&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;REST&lt;/strong&gt; API Design
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;REST&lt;/strong&gt;, or Representational State Transfer, is an architectural style introduced by Roy Fielding that emphasizes a stateless, client-server model, most commonly implemented over &lt;strong&gt;HTTP&lt;/strong&gt;. In &lt;strong&gt;system design&lt;/strong&gt;, &lt;strong&gt;REST&lt;/strong&gt; APIs excel when simplicity, cacheability, and broad compatibility are paramount. The constraints of &lt;strong&gt;REST&lt;/strong&gt; are &lt;strong&gt;client-server separation&lt;/strong&gt;, &lt;strong&gt;statelessness&lt;/strong&gt;, &lt;strong&gt;cacheability&lt;/strong&gt;, &lt;strong&gt;uniform interface&lt;/strong&gt;, &lt;strong&gt;layered system&lt;/strong&gt;, and the optional &lt;strong&gt;code on demand&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;REST&lt;/strong&gt; API treats data as &lt;strong&gt;resources&lt;/strong&gt; identified by &lt;strong&gt;URIs&lt;/strong&gt;. Operations on these &lt;strong&gt;resources&lt;/strong&gt; map to &lt;strong&gt;HTTP methods&lt;/strong&gt;: &lt;strong&gt;GET&lt;/strong&gt; for retrieval, &lt;strong&gt;POST&lt;/strong&gt; for creation, &lt;strong&gt;PUT&lt;/strong&gt; for full updates, &lt;strong&gt;PATCH&lt;/strong&gt; for partial updates, and &lt;strong&gt;DELETE&lt;/strong&gt; for removal. This mapping aligns with &lt;strong&gt;CRUD&lt;/strong&gt; operations and carries the &lt;strong&gt;idempotency&lt;/strong&gt; semantics of HTTP: &lt;strong&gt;GET&lt;/strong&gt;, &lt;strong&gt;PUT&lt;/strong&gt;, and &lt;strong&gt;DELETE&lt;/strong&gt; are idempotent, whereas &lt;strong&gt;POST&lt;/strong&gt; is not, and &lt;strong&gt;PATCH&lt;/strong&gt; is idempotent only when the server implements it that way.&lt;/p&gt;
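&lt;p&gt;The difference between idempotent and non-idempotent methods is easiest to see by repeating a request. The following language-agnostic sketch, written in Python with a plain dictionary standing in for the server, uses the hypothetical functions &lt;code&gt;post_user&lt;/code&gt; and &lt;code&gt;put_user&lt;/code&gt; to mimic the two behaviors; it does not involve HTTP at all.&lt;/p&gt;

```python
users = {}
next_id = 1


def post_user(data):
    """POST-like create: every call allocates a new resource."""
    global next_id
    user_id = next_id
    next_id += 1
    users[user_id] = dict(data)
    return user_id


def put_user(user_id, data):
    """PUT-like replace: repeating the call leaves the same final state."""
    users[user_id] = dict(data)


post_user({"name": "Alice"})
post_user({"name": "Alice"})  # a retried POST creates a second resource
count_after_posts = len(users)

put_user(1, {"name": "Alicia"})
put_user(1, {"name": "Alicia"})  # a retried PUT changes nothing further
count_after_puts = len(users)
print(count_after_posts, count_after_puts)  # 2 2
```

&lt;p&gt;This is why clients and proxies can safely retry a timed-out &lt;strong&gt;PUT&lt;/strong&gt; or &lt;strong&gt;DELETE&lt;/strong&gt;, while a retried &lt;strong&gt;POST&lt;/strong&gt; usually needs extra protection such as a client-supplied idempotency key.&lt;/p&gt;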

&lt;p&gt;&lt;strong&gt;HTTP status codes&lt;/strong&gt; provide explicit feedback: &lt;strong&gt;2xx&lt;/strong&gt; for success, &lt;strong&gt;4xx&lt;/strong&gt; for client errors, and &lt;strong&gt;5xx&lt;/strong&gt; for server errors. &lt;strong&gt;REST&lt;/strong&gt; leverages &lt;strong&gt;HTTP headers&lt;/strong&gt; for metadata, such as &lt;strong&gt;Content-Type&lt;/strong&gt;, &lt;strong&gt;Authorization&lt;/strong&gt;, and &lt;strong&gt;Cache-Control&lt;/strong&gt;. &lt;strong&gt;Versioning&lt;/strong&gt; is typically handled via &lt;strong&gt;URI&lt;/strong&gt; paths (/v1/users), &lt;strong&gt;headers&lt;/strong&gt;, or &lt;strong&gt;query parameters&lt;/strong&gt; to maintain backward compatibility in evolving &lt;strong&gt;systems&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here is a compact &lt;strong&gt;REST&lt;/strong&gt; API implementation using &lt;strong&gt;Node.js&lt;/strong&gt; and &lt;strong&gt;Express&lt;/strong&gt; that illustrates resource management for a user service in a &lt;strong&gt;microservices&lt;/strong&gt; environment. It keeps users in memory for demonstration, so it is a starting point rather than production code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;express&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="c1"&gt;// In-memory store for demonstration (replace with database in production)&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Alice&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;alice@example.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}];&lt;/span&gt;

&lt;span class="c1"&gt;// GET all users - retrieval with optional pagination and filtering&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;startIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;paginatedUsers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;startIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;startIndex&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;paginatedUsers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;page&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// GET single user by ID - resource-specific endpoint&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/users/:id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// POST create user - non-idempotent creation with validation&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;email&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Name and email required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
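  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newUser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;email&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;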
  &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;newUser&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;newUser&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// PUT full update - idempotent&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/users/:id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userIndex&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="c1"&gt;// Spread the body first so the id from the URL always wins over any id in the payload&lt;/span&gt;
  &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;userIndex&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;userIndex&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// PATCH partial update&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/users/:id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
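  &lt;span class="c1"&gt;// Merge only the supplied fields; re-apply the existing id so the payload cannot change it&lt;/span&gt;
  &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;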
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// DELETE user - idempotent removal&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/users/:id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userIndex&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;splice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;204&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;REST API running on port 3000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This snippet demonstrates &lt;strong&gt;stateless&lt;/strong&gt; request handling: each request carries everything the server needs (resource ID, pagination parameters, payload), and no per-client session is kept between calls. The in-memory array is only a stand-in for a database. In a real &lt;strong&gt;system&lt;/strong&gt;, add &lt;strong&gt;database indexing&lt;/strong&gt;, &lt;strong&gt;caching&lt;/strong&gt; with &lt;strong&gt;Redis&lt;/strong&gt;, and &lt;strong&gt;rate limiting&lt;/strong&gt; at the &lt;strong&gt;API gateway&lt;/strong&gt; level so the service can scale horizontally.&lt;/p&gt;
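
&lt;p&gt;To make the caching suggestion concrete, here is a minimal cache-aside sketch. It is an illustration under stated assumptions, not a production setup: a &lt;code&gt;Map&lt;/code&gt; stands in for &lt;strong&gt;Redis&lt;/strong&gt;, and the names &lt;code&gt;createCache&lt;/code&gt;, &lt;code&gt;getOrLoad&lt;/code&gt;, and &lt;code&gt;loadUser&lt;/code&gt; are hypothetical.&lt;/p&gt;

```javascript
// Cache-aside sketch. A Map stands in for Redis here (assumption); a real
// deployment would use a shared cache so every API instance sees the same entries.
function createCache(ttlMs, now = Date.now) {
  const entries = new Map(); // key -> { value, expiresAt }

  return {
    // Return the cached value, or call loader(), store its result, and return it.
    getOrLoad(key, loader) {
      const hit = entries.get(key);
      const fresh = hit ? hit.expiresAt > now() : false;
      if (fresh) return hit.value;
      const value = loader();
      entries.set(key, { value, expiresAt: now() + ttlMs });
      return value;
    },
    // Drop an entry after a write so the next read reloads fresh data.
    invalidate(key) {
      entries.delete(key);
    }
  };
}

// Hypothetical usage with a fake clock so expiry is easy to follow.
let fakeNow = 0;
let dbReads = 0;
const cache = createCache(1000, () => fakeNow);
const loadUser = () => {
  dbReads += 1; // stands in for a database query
  return { id: 1, name: 'Alice' };
};

cache.getOrLoad('user:1', loadUser); // miss: reads the "database"
cache.getOrLoad('user:1', loadUser); // hit: served from the cache
fakeNow = 1500;                      // move past the 1000 ms TTL
cache.getOrLoad('user:1', loadUser); // expired: reads the "database" again
console.log(dbReads);
```

&lt;p&gt;Calling &lt;code&gt;invalidate&lt;/code&gt; from the PUT, PATCH, and DELETE handlers would keep cached reads from going stale after a write.&lt;/p&gt;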

&lt;h2&gt;
  
  
  &lt;strong&gt;GraphQL&lt;/strong&gt; API Design
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GraphQL&lt;/strong&gt; is a query language and runtime for &lt;strong&gt;APIs&lt;/strong&gt;, created by Facebook, that lets clients request exactly the data they need in a single round trip, avoiding the &lt;strong&gt;over-fetching&lt;/strong&gt; and &lt;strong&gt;under-fetching&lt;/strong&gt; common in &lt;strong&gt;REST&lt;/strong&gt;. In &lt;strong&gt;system design&lt;/strong&gt;, &lt;strong&gt;GraphQL&lt;/strong&gt; shines with complex, hierarchical data models and client-driven applications backed by &lt;strong&gt;microservices&lt;/strong&gt;, where flexibility is essential.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;GraphQL&lt;/strong&gt; schema defines types, queries, mutations, and subscriptions using &lt;strong&gt;Schema Definition Language (SDL)&lt;/strong&gt;. A single &lt;strong&gt;endpoint&lt;/strong&gt; (conventionally /graphql) handles all operations, typically via &lt;strong&gt;POST&lt;/strong&gt; requests whose body contains a &lt;strong&gt;query&lt;/strong&gt; or &lt;strong&gt;mutation&lt;/strong&gt; document. &lt;strong&gt;Resolvers&lt;/strong&gt; map fields to data-fetching logic, so a nested query is answered in one &lt;strong&gt;HTTP&lt;/strong&gt; call instead of several.&lt;/p&gt;
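
&lt;p&gt;As a rough illustration of what such a request looks like on the wire, the sketch below builds the options for a single &lt;strong&gt;POST&lt;/strong&gt; carrying a nested query. The helper name &lt;code&gt;buildGraphQLRequest&lt;/code&gt;, the field names in the query, and the URL in the trailing comment are assumptions made for the example.&lt;/p&gt;

```javascript
// Build the options for an HTTP POST to a GraphQL endpoint.
// The JSON body carries the query document and its variables.
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables })
  };
}

// One document asks for a user and, nested inside it, that user's posts.
const query = `
  query GetUser($id: ID!) {
    user(id: $id) {
      name
      posts { title }
    }
  }
`;

const request = buildGraphQLRequest(query, { id: '1' });
console.log(request.method, JSON.parse(request.body).variables);

// Sending it (assumes a server at this hypothetical URL and a runtime with fetch):
// fetch('http://localhost:4000/graphql', request).then(res => res.json()).then(console.log);
```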

&lt;p&gt;&lt;strong&gt;GraphQL&lt;/strong&gt; supports &lt;strong&gt;introspection&lt;/strong&gt; for self-documentation and &lt;strong&gt;subscriptions&lt;/strong&gt; over &lt;strong&gt;WebSockets&lt;/strong&gt; for real-time updates. Tools like &lt;strong&gt;DataLoader&lt;/strong&gt; add &lt;strong&gt;batching&lt;/strong&gt; and &lt;strong&gt;caching&lt;/strong&gt; at the resolver level to prevent &lt;strong&gt;N+1 query problems&lt;/strong&gt;.&lt;/p&gt;
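
&lt;p&gt;The batching idea can be sketched without any dependency. The loader below is a hand-rolled illustration in the spirit of &lt;strong&gt;DataLoader&lt;/strong&gt;, not that library's API: &lt;code&gt;createBatchLoader&lt;/code&gt;, &lt;code&gt;usersById&lt;/code&gt;, and &lt;code&gt;userLoader&lt;/code&gt; are hypothetical names, and error handling is kept to a minimum.&lt;/p&gt;

```javascript
// Minimal batching loader in the spirit of DataLoader (hand-rolled sketch,
// not the DataLoader library's API). Keys requested in the same tick are
// collected and passed to batchFn in a single call.
function createBatchLoader(batchFn) {
  let queue = [];          // pending { key, resolve, reject } entries
  const cache = new Map(); // key -> promise, so repeated keys are loaded once

  function dispatch() {
    const batch = queue;
    queue = [];
    const keys = batch.map(item => item.key);
    Promise.resolve(batchFn(keys))
      .then(values => batch.forEach((item, i) => item.resolve(values[i])))
      .catch(err => batch.forEach(item => item.reject(err)));
  }

  return {
    load(key) {
      if (cache.has(key)) return cache.get(key);
      const promise = new Promise((resolve, reject) => {
        queue.push({ key, resolve, reject });
        // Schedule one dispatch when the first key of a batch arrives.
        if (queue.length === 1) queueMicrotask(dispatch);
      });
      cache.set(key, promise);
      return promise;
    }
  };
}

// Hypothetical data and batch function: one lookup serves many keys.
const usersById = { '1': 'Alice', '2': 'Bob', '3': 'Carol' };
let batchCalls = 0;
const userLoader = createBatchLoader(keys => {
  batchCalls += 1; // stands in for one "WHERE id IN (...)" query
  return keys.map(key => usersById[key]);
});

// Three loads issued together result in a single batch call.
const namesPromise = Promise.all([
  userLoader.load('1'),
  userLoader.load('2'),
  userLoader.load('1') // duplicate key is served from the per-loader cache
]);
namesPromise.then(names => console.log(names, batchCalls));
```

&lt;p&gt;In a resolver, each field would call &lt;code&gt;load&lt;/code&gt; for its own key, and the loader would turn N separate lookups into one batched query. Such loaders are usually created per request so the cache does not leak data between users.&lt;/p&gt;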

&lt;p&gt;Here is a complete &lt;strong&gt;GraphQL&lt;/strong&gt; setup using &lt;strong&gt;Node.js&lt;/strong&gt; with &lt;strong&gt;Apollo Server&lt;/strong&gt; and an in-memory store, showcasing a user service with nested relationships:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ApolloServer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gql&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apollo-server&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Schema definition&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;typeDefs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;gql&lt;/span&gt;&lt;span class="s2"&gt;`
  type User {
    id: ID!
    name: String!
    email: String!
    posts: [Post!]!
  }
  type Post {
    id: ID!
    title: String!
    content: String!
    author: User!
  }
  type Query {
    users: [User!]!
    user(id: ID!): User
  }
  type Mutation {
    createUser(name: String!, email: String!): User!
  }
  type Subscription {
    userCreated: User!
  }
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// In-memory data for the resolvers below; production resolvers would batch these lookups with DataLoader to avoid N+1 queries&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Alice&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;alice@example.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;101&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}];&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;101&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GraphQL Basics&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Deep dive...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;authorId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}];&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;resolvers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;users&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;Mutation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;createUser&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;email&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newUser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;newUser&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;newUser&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;User&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
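  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;Post&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Resolve the non-null author field from the stored authorId&lt;/span&gt;
    &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;authorId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;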
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ApolloServer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;typeDefs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;resolvers&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(({&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`GraphQL server ready at &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structure centralizes &lt;strong&gt;data fetching&lt;/strong&gt; while giving clients precise control over the shape of each response. In &lt;strong&gt;distributed systems&lt;/strong&gt;, &lt;strong&gt;GraphQL&lt;/strong&gt; integrates with &lt;strong&gt;API gateways&lt;/strong&gt; that perform &lt;strong&gt;schema stitching&lt;/strong&gt; across &lt;strong&gt;microservices&lt;/strong&gt; and apply &lt;strong&gt;rate limiting&lt;/strong&gt; based on operation complexity rather than raw request count.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;gRPC&lt;/strong&gt; API Design
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;gRPC&lt;/strong&gt; is a high-performance, open-source &lt;strong&gt;RPC&lt;/strong&gt; framework developed by Google that uses &lt;strong&gt;HTTP/2&lt;/strong&gt; for transport and &lt;strong&gt;Protocol Buffers&lt;/strong&gt; for binary serialization. In &lt;strong&gt;system design&lt;/strong&gt;, &lt;strong&gt;gRPC&lt;/strong&gt; is preferred for internal &lt;strong&gt;service-to-service&lt;/strong&gt; communication in &lt;strong&gt;polyglot microservices&lt;/strong&gt; due to its low latency, built-in &lt;strong&gt;streaming&lt;/strong&gt;, and strong typing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;gRPC&lt;/strong&gt; contracts are defined in &lt;strong&gt;.proto&lt;/strong&gt; files, which generate client and server stubs in multiple languages. It supports four communication patterns: &lt;strong&gt;unary&lt;/strong&gt;, &lt;strong&gt;server streaming&lt;/strong&gt;, &lt;strong&gt;client streaming&lt;/strong&gt;, and &lt;strong&gt;bidirectional streaming&lt;/strong&gt;. &lt;strong&gt;HTTP/2&lt;/strong&gt; multiplexing enables concurrent requests over a single connection, with automatic &lt;strong&gt;flow control&lt;/strong&gt; and &lt;strong&gt;header compression&lt;/strong&gt;.&lt;/p&gt;
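
&lt;p&gt;The four patterns differ only in which side sends a stream. As a rough mental model, they can be mimicked with plain async functions and async generators. The sketch below uses no &lt;strong&gt;gRPC&lt;/strong&gt; library, and every function name and record in it is hypothetical.&lt;/p&gt;

```javascript
// The four gRPC call shapes, modelled with plain async functions and async
// generators. No gRPC library is used; names and data are hypothetical.
const store = new Map([
  ['1', { id: '1', name: 'Alice' }],
  ['2', { id: '2', name: 'Bob' }]
]);

// Unary: one request in, one response out.
async function getUser(request) {
  return store.get(request.id);
}

// Server streaming: one request in, a stream of responses out.
async function* listUsers(request) {
  let sent = 0;
  for (const user of store.values()) {
    if (sent === request.limit) return;
    yield user;
    sent += 1;
  }
}

// Client streaming: a stream of requests in, one response out.
async function createUsers(requests) {
  let created = 0;
  for await (const req of requests) {
    store.set(req.id, req);
    created += 1;
  }
  return { created };
}

// Bidirectional streaming: responses are produced while requests keep arriving.
async function* streamUserUpdates(updates) {
  for await (const update of updates) {
    const merged = { ...store.get(update.id), ...update };
    store.set(update.id, merged);
    yield merged;
  }
}

// Helper that turns an array into an async stream of messages.
async function* streamOf(items) {
  for (const item of items) yield item;
}

async function demo() {
  const unary = await getUser({ id: '1' });
  const listed = [];
  for await (const user of listUsers({ limit: 1 })) listed.push(user.name);
  const summary = await createUsers(streamOf([{ id: '3', name: 'Carol' }]));
  const updated = [];
  for await (const user of streamUserUpdates(streamOf([{ id: '2', name: 'Robert' }]))) {
    updated.push(user.name);
  }
  return { unary: unary.name, listed, created: summary.created, updated };
}

const demoResult = demo();
demoResult.then(result => console.log(result));
```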

&lt;p&gt;Here is a complete &lt;strong&gt;gRPC&lt;/strong&gt; definition and implementation example using a &lt;strong&gt;.proto&lt;/strong&gt; file for a user service, followed by a Python server snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="na"&gt;syntax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"proto3"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;users&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;service&lt;/span&gt; &lt;span class="n"&gt;UserService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;GetUser&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;UserRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;UserResponse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;ListUsers&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;Pagination&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;UserResponse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;CreateUser&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;UserCreateRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;UserResponse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;StreamUserUpdates&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;UserUpdate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;UserResponse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;UserRequest&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;Pagination&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int32&lt;/span&gt; &lt;span class="na"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int32&lt;/span&gt; &lt;span class="na"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;UserCreateRequest&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;UserResponse&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;UserUpdate&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;field&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



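&lt;p&gt;A corresponding Python server using &lt;strong&gt;grpcio&lt;/strong&gt; (a minimal implementation that returns stubbed data, assuming &lt;code&gt;users_pb2&lt;/code&gt; and &lt;code&gt;users_pb2_grpc&lt;/code&gt; were generated from the definition above):&lt;br&gt;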
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;concurrent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;futures&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;users_pb2&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;users_pb2_grpc&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users_pb2_grpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UserServiceServicer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;GetUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Simulate database lookup
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;users_pb2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;UserResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ListUsers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request_iterator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pagination&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;request_iterator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Stream users with pagination
&lt;/span&gt;            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pagination&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;users_pb2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;UserResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;CreateUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;users_pb2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;UserResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;StreamUserUpdates&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request_iterator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;update&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;request_iterator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;users_pb2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;UserResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Updated&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;updated@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;users_pb2_grpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_UserServiceServicer_to_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;UserService&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_insecure_port&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[::]:50051&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait_for_termination&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;gRPC&lt;/strong&gt; uses the &lt;strong&gt;binary&lt;/strong&gt; Protocol Buffers wire format for smaller payloads and native &lt;strong&gt;streaming&lt;/strong&gt; over HTTP/2 for real-time scenarios, which makes it well suited to &lt;strong&gt;low-latency&lt;/strong&gt; &lt;strong&gt;distributed systems&lt;/strong&gt;. &lt;strong&gt;Service discovery&lt;/strong&gt; and &lt;strong&gt;load balancing&lt;/strong&gt; are usually delegated to the platform, for example &lt;strong&gt;Kubernetes&lt;/strong&gt; or a service mesh such as &lt;strong&gt;Istio&lt;/strong&gt;.&lt;/p&gt;
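
&lt;p&gt;To make the "smaller payloads" point concrete, here is a minimal sketch that compares a JSON encoding with a fixed binary layout built with Python's standard &lt;code&gt;struct&lt;/code&gt; module. The record and its fields are hypothetical, and the layout is not the Protocol Buffers wire format (which uses field numbers and variable-length integers); it only illustrates why omitting field names and digit characters saves bytes.&lt;/p&gt;

```python
import json
import struct

# Hypothetical record used only for this comparison.
user = {"id": 12345, "age": 30, "score": 98.5}

# Text encoding: field names and digits travel as characters.
json_bytes = json.dumps(user).encode("utf-8")

# Fixed binary layout (an assumption, not Protocol Buffers):
# unsigned 32-bit id, unsigned 8-bit age, 64-bit float score,
# native byte order with standard sizes and no padding.
LAYOUT = "=IBd"
binary_bytes = struct.pack(LAYOUT, user["id"], user["age"], user["score"])

print(len(json_bytes), "bytes as JSON")
print(len(binary_bytes), "bytes as packed binary")

# Round-trip to confirm that no information was lost.
decoded = struct.unpack(LAYOUT, binary_bytes)
print(decoded)
```

&lt;p&gt;The binary form is smaller because both sides already agree on the schema, which is the same role the &lt;code&gt;.proto&lt;/code&gt; file plays for gRPC.&lt;/p&gt;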

&lt;h2&gt;
  
  
  Choosing the Right &lt;strong&gt;API&lt;/strong&gt; Design in &lt;strong&gt;System Design&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;REST&lt;/strong&gt; prioritizes simplicity and &lt;strong&gt;caching&lt;/strong&gt; for public-facing &lt;strong&gt;APIs&lt;/strong&gt; where broad client support matters. &lt;strong&gt;GraphQL&lt;/strong&gt; solves data flexibility challenges in frontend-heavy applications and complex &lt;strong&gt;data graphs&lt;/strong&gt;. &lt;strong&gt;gRPC&lt;/strong&gt; delivers superior performance and strict contracts for internal &lt;strong&gt;microservices&lt;/strong&gt; communication, especially under high load or with &lt;strong&gt;polyglot&lt;/strong&gt; teams.&lt;/p&gt;

&lt;p&gt;Each option has costs: &lt;strong&gt;REST&lt;/strong&gt; can be verbose, &lt;strong&gt;GraphQL&lt;/strong&gt; requires managing query complexity, and &lt;strong&gt;gRPC&lt;/strong&gt; has a less mature ecosystem, with browsers in particular needing gRPC-Web. In a complete &lt;strong&gt;system&lt;/strong&gt;, hybrid approaches are common: &lt;strong&gt;REST&lt;/strong&gt; for external clients, &lt;strong&gt;GraphQL&lt;/strong&gt; for mobile/web, and &lt;strong&gt;gRPC&lt;/strong&gt; for backend &lt;strong&gt;service meshes&lt;/strong&gt;. Security considerations (&lt;strong&gt;OAuth&lt;/strong&gt;, &lt;strong&gt;JWT&lt;/strong&gt;, &lt;strong&gt;mTLS&lt;/strong&gt;) apply to all three, while &lt;strong&gt;observability&lt;/strong&gt; tools like &lt;strong&gt;distributed tracing&lt;/strong&gt; help maintain reliability across paradigms.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqxp4qm007y5npytrshp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqxp4qm007y5npytrshp.png" alt="API design comparison: REST, GraphQL, gRPC" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;


</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Consistency Patterns (Strong, Eventual, Weak) in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Thu, 02 Apr 2026 20:18:47 +0000</pubDate>
      <link>https://forem.com/code_2/consistency-patterns-strong-eventual-weak-in-system-design-2pjf</link>
      <guid>https://forem.com/code_2/consistency-patterns-strong-eventual-weak-in-system-design-2pjf</guid>
      <description>&lt;h2&gt;
  
  
  Understanding Consistency in Distributed Systems
&lt;/h2&gt;

&lt;p&gt;In distributed systems, &lt;strong&gt;consistency&lt;/strong&gt; defines how and when updates to data become visible across multiple &lt;strong&gt;nodes&lt;/strong&gt; or &lt;strong&gt;replicas&lt;/strong&gt;. When a client performs a &lt;strong&gt;write&lt;/strong&gt; operation on one node, the system must decide whether subsequent &lt;strong&gt;read&lt;/strong&gt; operations on any other node will immediately reflect that change or tolerate some delay. This decision directly influences &lt;strong&gt;availability&lt;/strong&gt;, &lt;strong&gt;latency&lt;/strong&gt;, &lt;strong&gt;throughput&lt;/strong&gt;, and overall system behavior under network partitions or failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consistency patterns&lt;/strong&gt; provide structured guarantees that help architects balance these competing requirements. The three primary patterns—&lt;strong&gt;Strong Consistency&lt;/strong&gt;, &lt;strong&gt;Eventual Consistency&lt;/strong&gt;, and &lt;strong&gt;Weak Consistency&lt;/strong&gt;—form a spectrum from the strictest guarantees to the most relaxed. Each pattern addresses different real-world demands, from financial accuracy to massive-scale user-generated content.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Strong Consistency&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Strong Consistency&lt;/strong&gt;, also known as &lt;strong&gt;linearizability&lt;/strong&gt;, guarantees that once a &lt;strong&gt;write&lt;/strong&gt; operation completes successfully, every subsequent &lt;strong&gt;read&lt;/strong&gt; operation—regardless of which &lt;strong&gt;replica&lt;/strong&gt; or client issues it—will return the most recent &lt;strong&gt;write&lt;/strong&gt; value or a newer one. There is no window for &lt;strong&gt;stale data&lt;/strong&gt;. All operations appear to execute in a single, global, sequential order as if on a single atomic copy of the data.&lt;/p&gt;

&lt;p&gt;This pattern enforces &lt;strong&gt;immediate visibility&lt;/strong&gt; of updates. If Client A writes a value and receives confirmation, Client B reading immediately afterward will always see the updated value. The system achieves this through tight synchronization mechanisms such as &lt;strong&gt;quorum-based replication&lt;/strong&gt;, &lt;strong&gt;consensus protocols&lt;/strong&gt; like &lt;strong&gt;Paxos&lt;/strong&gt; or &lt;strong&gt;Raft&lt;/strong&gt;, or &lt;strong&gt;two-phase commit (2PC)&lt;/strong&gt; for distributed transactions.&lt;/p&gt;

&lt;h3&gt;
  
  
  How &lt;strong&gt;Strong Consistency&lt;/strong&gt; Works in Practice
&lt;/h3&gt;

&lt;p&gt;A typical architecture uses a &lt;strong&gt;primary node&lt;/strong&gt; (or leader) that coordinates writes. When a &lt;strong&gt;write&lt;/strong&gt; arrives:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The primary applies the change locally.&lt;/li&gt;
&lt;li&gt;It replicates the update synchronously to a &lt;strong&gt;quorum&lt;/strong&gt; of replicas (for example, in a five-node cluster a majority quorum needs acknowledgment from at least three nodes).&lt;/li&gt;
&lt;li&gt;Only after the quorum confirms does the primary acknowledge success to the client.&lt;/li&gt;
&lt;li&gt;Any &lt;strong&gt;read&lt;/strong&gt; request, whether directed to the primary or a replica, is served only after verifying it reflects the latest committed state.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This ensures &lt;strong&gt;linearizability&lt;/strong&gt; but introduces latency because writes block until synchronization completes. Failures or network partitions can temporarily reduce &lt;strong&gt;availability&lt;/strong&gt; until consensus is restored.&lt;/p&gt;
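
&lt;p&gt;The write path described above can be sketched as follows. This is a minimal in-memory illustration, not a real consensus protocol: the &lt;code&gt;Replica&lt;/code&gt; and &lt;code&gt;Primary&lt;/code&gt; classes and the &lt;code&gt;reachable&lt;/code&gt; flag are hypothetical names, and there is no leader election, no rollback of unacknowledged entries, and no real network.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Replica:
    """Hypothetical in-memory replica; `reachable` stands in for the network."""
    name: str
    reachable: bool = True
    log: Dict[str, Tuple[int, str]] = field(default_factory=dict)

    def replicate(self, key: str, version: int, value: str) -> bool:
        # True acknowledges the update; False models an unreachable node.
        if not self.reachable:
            return False
        self.log[key] = (version, value)
        return True


class Primary:
    def __init__(self, followers: List[Replica]):
        self.followers = followers
        self.committed: Dict[str, Tuple[int, str]] = {}
        # Majority of the whole cluster (followers plus the primary itself).
        self.quorum = (len(followers) + 1) // 2 + 1

    def write(self, key: str, value: str) -> bool:
        version = self.committed.get(key, (0, ""))[0] + 1
        acks = 1  # The primary's own copy counts as one acknowledgment.
        for follower in self.followers:  # Synchronous replication.
            if follower.replicate(key, version, value):
                acks += 1
        if acks >= self.quorum:  # Acknowledge the client only after a quorum.
            self.committed[key] = (version, value)
            return True
        # Too few acknowledgments: reject instead of risking divergence.
        # (A real protocol would also clean up the partial replication.)
        return False


followers = [Replica(f"r{i}") for i in range(4)]  # five nodes with the primary
primary = Primary(followers)
ok_all_up = primary.write("seat_12A", "alice")

followers[0].reachable = False
followers[1].reachable = False
ok_two_down = primary.write("seat_12A", "bob")      # 3 of 5 acknowledge

followers[2].reachable = False
ok_three_down = primary.write("seat_12A", "carol")  # only 2 of 5 acknowledge

print(ok_all_up, ok_two_down, ok_three_down)
```

&lt;p&gt;The last call shows the availability cost: with three of five nodes unreachable, the write is refused even though the primary itself is healthy.&lt;/p&gt;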

&lt;h3&gt;
  
  
  Complete Implementation Structure and Code Example
&lt;/h3&gt;

&lt;p&gt;Consider a simplified &lt;strong&gt;Python&lt;/strong&gt; simulation of a strongly consistent key-value store that uses a single threading lock as a stand-in for quorum coordination. It reproduces the guarantee offered by production systems like &lt;strong&gt;Google Spanner&lt;/strong&gt; or &lt;strong&gt;etcd&lt;/strong&gt;, not their internal design.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;StrongConsistencyStore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_replicas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;          &lt;span class="c1"&gt;# Shared data store
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;      &lt;span class="c1"&gt;# Version tracking for linearizability
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;            &lt;span class="c1"&gt;# Global lock simulating quorum coordination
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_replicas&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;quorum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicas&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;  &lt;span class="c1"&gt;# Majority quorum
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# Simulate synchronous quorum acknowledgment
&lt;/span&gt;            &lt;span class="n"&gt;current_version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_version&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write committed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (version &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="c1"&gt;# In real systems, this would wait for quorum replicas to acknowledge
&lt;/span&gt;            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Simulate network round-trip latency
&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# All reads go through coordinated check
&lt;/span&gt;            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Read returned: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (version &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="c1"&gt;# Usage demonstration
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StrongConsistencyStore&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;account_balance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1500.00&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Client A writes balance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Immediate read from any "replica" (simulated)
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Client B reads balance:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;account_balance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;account_balance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1400.00&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Client C reads updated balance:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;account_balance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code enforces &lt;strong&gt;strong consistency&lt;/strong&gt; by serializing all operations under a single lock, so every &lt;strong&gt;read&lt;/strong&gt; sees the latest &lt;strong&gt;write&lt;/strong&gt;. In a real distributed deployment, the lock would be replaced by a &lt;strong&gt;consensus algorithm&lt;/strong&gt; that requires quorum acknowledgments from live replicas. Note that the version counter here only records how many times each key has been written, and the computed quorum size is never used; it is the lock, not the counter, that rules out stale reads in this simulation. In a replicated store, such versions are what let a reader identify the newest copy.&lt;/p&gt;
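
&lt;p&gt;Once there are separate replicas, version numbers start to do real work. The following is a minimal sketch of a versioned quorum read, assuming N = 3 replicas, a write quorum W = 2, and a hypothetical &lt;code&gt;quorum_read&lt;/code&gt; helper that simply contacts the first R replicas. It shows only the overlap argument (W + R &gt; N); it is not a full linearizable protocol.&lt;/p&gt;

```python
from typing import Dict, List, Optional, Tuple

Versioned = Tuple[int, str]  # (version, value)


def quorum_read(replicas: List[Dict[str, Versioned]], key: str,
                read_quorum: int) -> Optional[str]:
    """Ask `read_quorum` replicas and return the value with the highest version."""
    answers = [r[key] for r in replicas[:read_quorum] if key in r]
    if not answers:
        return None
    _version, value = max(answers, key=lambda pair: pair[0])
    return value


# A write with W = 2 reached replicas 1 and 2 only,
# so replica 0 still holds the older version.
replica_0 = {"account_balance": (1, "1500.00")}
replica_1 = {"account_balance": (2, "1400.00")}
replica_2 = {"account_balance": (2, "1400.00")}
cluster = [replica_0, replica_1, replica_2]

# R = 2 gives W + R > N, so the read set must overlap the write set.
latest = quorum_read(cluster, "account_balance", read_quorum=2)

# R = 1 may land on the lagging replica alone.
possibly_stale = quorum_read(cluster, "account_balance", read_quorum=1)

print(latest, possibly_stale)
```

&lt;p&gt;With R = 2 the reader sees at least one copy at the newest version and picks it; with R = 1 it can return the older balance.&lt;/p&gt;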

&lt;h3&gt;
  
  
  Use Cases for &lt;strong&gt;Strong Consistency&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Strong Consistency&lt;/strong&gt; is essential in domains where correctness cannot be compromised:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Financial systems&lt;/strong&gt; and banking applications where account balances must reflect every transaction instantly to prevent double-spending.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inventory management&lt;/strong&gt; in e-commerce to ensure stock counts are accurate across all regional warehouses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reservation systems&lt;/strong&gt; such as airline seat booking or hotel room allocation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Trade-offs of &lt;strong&gt;Strong Consistency&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;While it guarantees that no read returns stale data, &lt;strong&gt;Strong Consistency&lt;/strong&gt; sacrifices &lt;strong&gt;availability&lt;/strong&gt; during network partitions (per the &lt;strong&gt;CAP theorem&lt;/strong&gt;) and increases &lt;strong&gt;latency&lt;/strong&gt;, because every write must wait for acknowledgments from multiple replicas. During a partition, the system returns errors or blocks operations rather than serve potentially inconsistent data.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Eventual Consistency&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Eventual Consistency&lt;/strong&gt; relaxes the immediate guarantee. It promises that if no new &lt;strong&gt;writes&lt;/strong&gt; occur to a data item, all &lt;strong&gt;replicas&lt;/strong&gt; will eventually converge to the same latest value after some unspecified but finite time. Temporary divergence is allowed, and &lt;strong&gt;reads&lt;/strong&gt; may return &lt;strong&gt;stale data&lt;/strong&gt; during the propagation window.&lt;/p&gt;

&lt;p&gt;This pattern relies on &lt;strong&gt;asynchronous replication&lt;/strong&gt;. Writes are acknowledged quickly after being applied to a single node or a minimal set, then propagated in the background through mechanisms like &lt;strong&gt;gossip protocols&lt;/strong&gt;, &lt;strong&gt;anti-entropy&lt;/strong&gt;, &lt;strong&gt;read repair&lt;/strong&gt;, or &lt;strong&gt;hinted handoff&lt;/strong&gt;.&lt;/p&gt;
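
&lt;p&gt;As one illustration of these mechanisms, here is a minimal sketch of &lt;strong&gt;read repair&lt;/strong&gt;. The function name &lt;strong&gt;read_with_repair&lt;/strong&gt; and the representation of replicas as plain dictionaries of (version, value) pairs are assumptions made for this example. Real systems typically contact only a subset of replicas per read and repair over the network.&lt;/p&gt;

```python
from typing import Dict, List, Optional, Tuple

Versioned = Tuple[int, str]  # (version, value)


def read_with_repair(replicas: List[Dict[str, Versioned]], key: str) -> Optional[str]:
    """Return the newest value for key and overwrite stale replicas with it."""
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None
    newest = max(found, key=lambda pair: pair[0])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair: bring the stale or missing copy up to date
    return newest[1]


if __name__ == "__main__":
    replicas = [{"user_post": (2, "Hello again")}, {"user_post": (1, "Hello world")}, {}]
    print(read_with_repair(replicas, "user_post"))  # Hello again
    print(replicas[1], replicas[2])                 # both now hold version 2
```

&lt;p&gt;The read itself does the reconciliation, so frequently read keys converge quickly while rarely read keys depend on background anti-entropy.&lt;/p&gt;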

&lt;h3&gt;
  
  
  How &lt;strong&gt;Eventual Consistency&lt;/strong&gt; Works in Practice
&lt;/h3&gt;

&lt;p&gt;A &lt;strong&gt;write&lt;/strong&gt; succeeds as soon as it reaches one or more nodes (often a single node for maximum &lt;strong&gt;availability&lt;/strong&gt;). Background processes then push the update to other &lt;strong&gt;replicas&lt;/strong&gt;. &lt;strong&gt;Conflict resolution&lt;/strong&gt; strategies reconcile concurrent updates when replicas synchronize: &lt;strong&gt;last-write-wins (LWW)&lt;/strong&gt; keeps the update with the highest timestamp, while &lt;strong&gt;vector clocks&lt;/strong&gt; detect which updates are truly concurrent so that they can be merged or handed to the application to resolve.&lt;/p&gt;

&lt;p&gt;Clients may see different values briefly, but the system self-heals without manual intervention.&lt;/p&gt;
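
&lt;p&gt;To make the role of &lt;strong&gt;vector clocks&lt;/strong&gt; in conflict resolution concrete, the sketch below compares two clocks and reports whether one update supersedes the other or whether they are concurrent. The helper names (&lt;strong&gt;dominates&lt;/strong&gt;, &lt;strong&gt;compare&lt;/strong&gt;, &lt;strong&gt;merge&lt;/strong&gt;) and the node ids are hypothetical and chosen only for illustration.&lt;/p&gt;

```python
from typing import Dict

VectorClock = Dict[str, int]  # node id -> number of updates seen from that node


def dominates(a: VectorClock, b: VectorClock) -> bool:
    """True if clock a has seen every update that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    a_ge_b, b_ge_a = dominates(a, b), dominates(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a_newer"
    if b_ge_a:
        return "b_newer"
    return "concurrent"  # neither saw the other's update: a real conflict


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Clock for a reconciled value: the per-node maximum of both clocks."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in set(a) | set(b)}


if __name__ == "__main__":
    print(compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 1}))  # a_newer
    print(compare({"n1": 2}, {"n2": 1}))                    # concurrent
```

&lt;p&gt;Last-write-wins would silently pick one of the two concurrent updates by timestamp. A vector clock instead exposes the conflict, which lets the system keep both versions until they are merged.&lt;/p&gt;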

&lt;h3&gt;
  
  
  Complete Implementation Structure and Code Example
&lt;/h3&gt;

&lt;p&gt;Below is a &lt;strong&gt;Python&lt;/strong&gt; simulation of an eventually consistent store that uses an asynchronous queue and a background propagation thread. It loosely models the behavior of stores such as &lt;strong&gt;Amazon DynamoDB&lt;/strong&gt; (whose reads are eventually consistent by default) or &lt;strong&gt;Apache Cassandra&lt;/strong&gt; at low consistency levels; it does not reproduce their actual architecture.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EventualConsistencyStore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_replicas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_replicas&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_replicas&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write_queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replica_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_replicas&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="c1"&gt;# Start background propagation thread
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;propagator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_propagate_updates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;daemon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;propagator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;current_version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_version&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write acknowledged on replica &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Read from replica &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (version &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_propagate_updates&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Simulate network delay before the update reaches the other replicas
&lt;/span&gt;                &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;rid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replica_count&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                        &lt;span class="n"&gt;current_version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;rid&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;current_version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;rid&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
                            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;rid&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Empty&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;

&lt;span class="c1"&gt;# Usage demonstration
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;EventualConsistencyStore&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_post&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello world&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Client A reads immediately (may be stale):&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_post&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Allow propagation
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Client B reads after convergence:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_post&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



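&lt;p&gt;The background thread drives &lt;strong&gt;eventual convergence&lt;/strong&gt; as long as writes to the same key do not race on different replicas. Because each replica keeps its own counter, two concurrent writes to different replicas can receive the same version number, and the propagator's version comparison then refuses to overwrite either one, leaving those replicas permanently diverged. In production, &lt;strong&gt;vector clocks&lt;/strong&gt; or &lt;strong&gt;CRDTs&lt;/strong&gt; would replace simple version numbers to detect and resolve such conflicts.&lt;/p&gt;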
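
&lt;p&gt;To show what a &lt;strong&gt;CRDT&lt;/strong&gt; looks like, here is a minimal sketch of a grow-only counter (commonly called a G-Counter), one of the simplest CRDTs. The class name &lt;strong&gt;GCounter&lt;/strong&gt; and the node ids are assumptions for this example, and the sketch assumes increments are always positive.&lt;/p&gt;

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each node increments only its own slot."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Taking the per-node maximum is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repeated merges.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


if __name__ == "__main__":
    a, b = GCounter("a"), GCounter("b")
    a.increment()
    a.increment()
    b.increment()          # concurrent update on another replica
    a.merge(b)
    b.merge(a)
    print(a.value(), b.value())  # 3 3
```

&lt;p&gt;No update is lost even though both replicas were written concurrently, which is exactly the case where a single version number per key breaks down.&lt;/p&gt;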

&lt;h3&gt;
  
  
  Use Cases for &lt;strong&gt;Eventual Consistency&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Eventual Consistency&lt;/strong&gt; excels in high-scale, high-availability scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Social media feeds&lt;/strong&gt; and like counters where slight delays in visibility are acceptable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content delivery networks (CDN)&lt;/strong&gt; and caching layers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DNS systems&lt;/strong&gt; and email propagation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Weak Consistency&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Weak Consistency&lt;/strong&gt; provides the fewest guarantees. After a &lt;strong&gt;write&lt;/strong&gt;, subsequent &lt;strong&gt;reads&lt;/strong&gt; may or may not see the update, and there is no assurance that &lt;strong&gt;replicas&lt;/strong&gt; will ever converge. Divergence can persist indefinitely unless explicitly resolved.&lt;/p&gt;

&lt;p&gt;This model prioritizes raw performance and &lt;strong&gt;availability&lt;/strong&gt; above all else. Updates are fire-and-forget, with no automatic synchronization or conflict resolution built into the core protocol.&lt;/p&gt;

&lt;h3&gt;
  
  
  How &lt;strong&gt;Weak Consistency&lt;/strong&gt; Works in Practice
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Writes&lt;/strong&gt; are applied locally to whichever node receives them. &lt;strong&gt;Reads&lt;/strong&gt; return whatever local state exists at that moment. Reconciliation, if any, happens only through application-level logic or manual intervention. There are no background propagators or quorum requirements.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complete Implementation Structure and Code Example
&lt;/h3&gt;

&lt;p&gt;Here is a &lt;strong&gt;Python&lt;/strong&gt; simulation of a weakly consistent store with no synchronization between replicas:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WeakConsistencyStore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_replicas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_replicas&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Only for internal safety, not for consistency
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write applied only to replica &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Read from replica &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (no convergence guarantee)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;

&lt;span class="c1"&gt;# Usage demonstration
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;WeakConsistencyStore&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;game_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;150&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Client reads from replica 1:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;game_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# Likely None
&lt;/span&gt;    &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;game_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;200&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Client reads from replica 0:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;game_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replica_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# Still 150
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No propagation occurs. Each &lt;strong&gt;replica&lt;/strong&gt; remains isolated unless the application adds custom logic.&lt;/p&gt;
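
&lt;p&gt;As a rough illustration of the "custom logic" an application might add, the sketch below runs a last-writer-wins merge across replicas. The &lt;code&gt;write&lt;/code&gt; and &lt;code&gt;sync_replicas&lt;/code&gt; helpers and the caller-supplied version number are assumptions made for this example and are not part of the store shown above.&lt;/p&gt;

```python
# Hypothetical sketch: each replica is a plain dict mapping key to (version, value).
# The version counter and both helper functions are illustrative assumptions.

def write(replica, key, value, version):
    """Store a value together with the version number supplied by the caller."""
    replica[key] = (version, value)


def sync_replicas(replicas):
    """Copy the highest-versioned entry for every key to all replicas."""
    merged = {}
    for replica in replicas:
        for key, entry in replica.items():
            # Tuples compare by version first, so max() keeps the newest write.
            merged[key] = max(entry, merged.get(key, entry))
    for replica in replicas:
        replica.update(merged)


replicas = [{}, {}, {}]
write(replicas[0], "game_score", "150", version=1)
write(replicas[2], "game_score", "200", version=2)

before = [r.get("game_score") for r in replicas]  # replicas disagree
sync_replicas(replicas)
after = [r.get("game_score") for r in replicas]   # replicas agree
print(before)
print(after)
```

&lt;p&gt;Without a pass like this, the three replicas keep returning different answers for the same key indefinitely.&lt;/p&gt;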

&lt;h3&gt;
  
  
  Use Cases for &lt;strong&gt;Weak Consistency&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Weak Consistency&lt;/strong&gt; suits applications where freshness is secondary to speed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multiplayer game leaderboards&lt;/strong&gt; where occasional staleness does not affect gameplay.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-critical caching layers&lt;/strong&gt; or analytics dashboards.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Highly partitioned sensor networks&lt;/strong&gt; that tolerate data loss.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Comparing the Patterns
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Strong Consistency&lt;/strong&gt; delivers immediate correctness at the cost of higher latency and reduced &lt;strong&gt;availability&lt;/strong&gt; during failures. &lt;strong&gt;Eventual Consistency&lt;/strong&gt; trades immediate accuracy for superior scalability and &lt;strong&gt;availability&lt;/strong&gt;, relying on time for convergence. &lt;strong&gt;Weak Consistency&lt;/strong&gt; maximizes performance by removing all guarantees, making it suitable only when temporary or permanent divergence is tolerable.&lt;/p&gt;

&lt;p&gt;Each pattern aligns with specific positions on the &lt;strong&gt;CAP spectrum&lt;/strong&gt;: &lt;strong&gt;Strong Consistency&lt;/strong&gt; typically favors &lt;strong&gt;CP&lt;/strong&gt; systems, while &lt;strong&gt;Eventual&lt;/strong&gt; and &lt;strong&gt;Weak Consistency&lt;/strong&gt; enable &lt;strong&gt;AP&lt;/strong&gt; designs that remain responsive even under partitions.&lt;/p&gt;

&lt;p&gt;The choice depends entirely on the application's tolerance for &lt;strong&gt;stale data&lt;/strong&gt;, required &lt;strong&gt;throughput&lt;/strong&gt;, and acceptable &lt;strong&gt;failure modes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0mtf8nu8ip1wwsq7agsj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0mtf8nu8ip1wwsq7agsj.png" alt="Consistency patterns in system design" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;


</description>
      <category>systemdesign</category>
    </item>
    <item>
      <title>SQL vs. NoSQL in System Design</title>
      <dc:creator>CodeWithDhanian</dc:creator>
      <pubDate>Thu, 02 Apr 2026 19:51:30 +0000</pubDate>
      <link>https://forem.com/code_2/sql-vs-nosql-in-system-design-6h5</link>
      <guid>https://forem.com/code_2/sql-vs-nosql-in-system-design-6h5</guid>
      <description>&lt;h2&gt;
  
  
  Foundations of &lt;strong&gt;SQL&lt;/strong&gt; Databases
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SQL&lt;/strong&gt; databases, also known as &lt;strong&gt;relational databases&lt;/strong&gt;, organize data into structured tables with predefined &lt;strong&gt;schemas&lt;/strong&gt;. Each table consists of rows and columns, where columns enforce specific data types and constraints. &lt;strong&gt;Relationships&lt;/strong&gt; between tables are established through &lt;strong&gt;foreign keys&lt;/strong&gt;, enabling complex &lt;strong&gt;joins&lt;/strong&gt; to retrieve interconnected data efficiently. This model follows the principles of &lt;strong&gt;normalization&lt;/strong&gt; to minimize data redundancy and ensure &lt;strong&gt;data integrity&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The core strength of &lt;strong&gt;SQL&lt;/strong&gt; lies in its adherence to &lt;strong&gt;ACID&lt;/strong&gt; properties: &lt;strong&gt;Atomicity&lt;/strong&gt;, &lt;strong&gt;Consistency&lt;/strong&gt;, &lt;strong&gt;Isolation&lt;/strong&gt;, and &lt;strong&gt;Durability&lt;/strong&gt;. These guarantees make &lt;strong&gt;SQL&lt;/strong&gt; ideal for applications requiring strict &lt;strong&gt;transactional consistency&lt;/strong&gt;, such as financial systems or e-commerce platforms where a partially applied update is unacceptable.&lt;/p&gt;

&lt;p&gt;A complete &lt;strong&gt;SQL&lt;/strong&gt; schema for an e-commerce system illustrates this structure. Consider the following full code snippet using &lt;strong&gt;PostgreSQL&lt;/strong&gt; syntax:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Complete SQL schema for e-commerce system&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;username&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;password_hash&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stock_quantity&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;category_id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;categories&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;category_id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;order_date&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_amount&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;order_items&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;order_item_id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;quantity&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;price_at_purchase&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Example transaction ensuring ACID compliance&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;99&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;order_items&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price_at_purchase&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'orders_order_id_seq'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;49&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;99&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;stock_quantity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stock_quantity&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structure enforces &lt;strong&gt;referential integrity&lt;/strong&gt; through &lt;strong&gt;foreign keys&lt;/strong&gt;, and the &lt;strong&gt;cascading deletes&lt;/strong&gt; on &lt;code&gt;orders&lt;/code&gt; and &lt;code&gt;order_items&lt;/code&gt; remove dependent rows when their parent row is deleted. The transaction block guarantees atomicity: either all three statements commit or none do. How concurrent transactions see each other's changes is governed by the isolation level in use. In system design, such &lt;strong&gt;SQL&lt;/strong&gt; setups are typically deployed with &lt;strong&gt;master-slave replication&lt;/strong&gt; (also called primary-replica replication) for read scalability, and they rely on &lt;strong&gt;vertical scaling&lt;/strong&gt;, that is, upgrading the hardware of a single server.&lt;/p&gt;
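
&lt;p&gt;The all-or-nothing behaviour can be observed in a small, self-contained sketch. It assumes SQLite's in-memory engine (via Python's standard &lt;code&gt;sqlite3&lt;/code&gt; module) as a stand-in for PostgreSQL, a cut-down version of the schema, and a hypothetical &lt;code&gt;place_order&lt;/code&gt; helper. The &lt;code&gt;CHECK&lt;/code&gt; constraint on stock is also an addition made for the example.&lt;/p&gt;

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so the sketch runs
# without a server. Tables are a cut-down version of the e-commerce schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        stock_quantity INTEGER NOT NULL CHECK (stock_quantity >= 0)
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        total_amount REAL NOT NULL
    );
    INSERT INTO products (product_id, stock_quantity) VALUES (5, 1);
""")


def place_order(conn, product_id, quantity, total):
    """Insert the order and decrement stock as one atomic unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO orders (total_amount) VALUES (?)", (total,))
            conn.execute(
                "UPDATE products SET stock_quantity = stock_quantity - ? "
                "WHERE product_id = ?",
                (quantity, product_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the stock update; the insert is undone too.
        return False


# Only one unit is in stock, so ordering two must fail as a whole.
ok = place_order(conn, product_id=5, quantity=2, total=99.98)
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
stock = conn.execute("SELECT stock_quantity FROM products").fetchone()[0]
print(ok, order_count, stock)
```

&lt;p&gt;Because the stock update violates the constraint, the order row inserted a moment earlier is rolled back with it, so no order exists without a matching stock change.&lt;/p&gt;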

&lt;h2&gt;
  
  
  Foundations of &lt;strong&gt;NoSQL&lt;/strong&gt; Databases
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;NoSQL&lt;/strong&gt; databases, or &lt;strong&gt;non-relational databases&lt;/strong&gt;, replace the fixed table structure with flexible data models. They store data in formats such as &lt;strong&gt;documents&lt;/strong&gt;, &lt;strong&gt;key-value pairs&lt;/strong&gt;, &lt;strong&gt;wide-column stores&lt;/strong&gt;, or &lt;strong&gt;graphs&lt;/strong&gt;. &lt;strong&gt;Schemas&lt;/strong&gt; are often &lt;strong&gt;dynamic&lt;/strong&gt; or &lt;strong&gt;schema-less&lt;/strong&gt;, so new fields can be added without a blocking schema migration. Many of these systems trade strict consistency for &lt;strong&gt;scalability&lt;/strong&gt; and &lt;strong&gt;performance&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NoSQL&lt;/strong&gt; systems typically follow &lt;strong&gt;BASE&lt;/strong&gt; principles: &lt;strong&gt;Basically Available&lt;/strong&gt;, &lt;strong&gt;Soft state&lt;/strong&gt;, and &lt;strong&gt;Eventual consistency&lt;/strong&gt;. They excel in &lt;strong&gt;horizontal scaling&lt;/strong&gt; across distributed clusters, making them suitable for high-throughput applications like social media feeds, real-time analytics, or content management systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NoSQL&lt;/strong&gt; encompasses four primary types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document stores&lt;/strong&gt; (e.g., &lt;strong&gt;MongoDB&lt;/strong&gt;) store self-contained &lt;strong&gt;JSON&lt;/strong&gt;-like documents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key-value stores&lt;/strong&gt; (e.g., &lt;strong&gt;Redis&lt;/strong&gt;) provide ultra-fast lookups.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Column-family stores&lt;/strong&gt; (e.g., &lt;strong&gt;Cassandra&lt;/strong&gt;) handle massive sparse data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph databases&lt;/strong&gt; (e.g., &lt;strong&gt;Neo4j&lt;/strong&gt;) optimize for relationship-heavy queries.&lt;/li&gt;
&lt;/ul&gt;
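
&lt;p&gt;To make the four models more concrete, the sketch below mimics each one with plain Python structures. It is only an illustration of the data shapes: every name and value is hypothetical, and no real database client is involved.&lt;/p&gt;

```python
# Illustrative only: plain Python structures standing in for each NoSQL model.
# All names and values here are hypothetical.

# Document store: one self-contained, nested record per entity.
document = {
    "_id": "u1",
    "username": "johndoe",
    "addresses": [{"city": "Nairobi", "country": "Kenya"}],
}

# Key-value store: an opaque value looked up by a single key.
key_value = {"session:u1": "token-abc"}

# Column-family store: a row key, then only the columns that row actually has.
column_family = {
    "u1": {"username": "johndoe", "last_login": "2026-04-02"},
    "u2": {"username": "janedoe"},  # sparse: no last_login column
}

# Graph database: nodes plus explicit edges that queries traverse.
edges = [("u1", "FOLLOWS", "u2"), ("u2", "FOLLOWS", "u3")]


def followed_by(user, edges):
    """Return the users that `user` follows by scanning the edge list."""
    return [dst for src, rel, dst in edges if src == user and rel == "FOLLOWS"]


city = document["addresses"][0]["city"]          # nested field access
token = key_value["session:u1"]                  # single-key lookup
has_login = "last_login" in column_family["u2"]  # missing column in a sparse row
follows = followed_by("u1", edges)               # one-hop traversal
print(city, token, has_login, follows)
```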

&lt;p&gt;A complete &lt;strong&gt;NoSQL&lt;/strong&gt; implementation example uses &lt;strong&gt;MongoDB&lt;/strong&gt; for the same e-commerce scenario. The following full code snippet demonstrates document-based storage and operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Complete MongoDB setup and operations for e-commerce system&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;MongoClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mongodb&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mongodb://localhost:27017&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;MongoClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ecommerce&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Collections are created implicitly on first insert&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;usersCollection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;users&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;productsCollection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;products&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ordersCollection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;orders&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Insert a user document (schema-less)&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newUser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;johndoe&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;john@example.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;passwordHash&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hashedpassword123&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;addresses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;  &lt;span class="c1"&gt;// Embedded array for flexibility&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;street&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;123 Main St&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;city&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Nairobi&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;country&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Kenya&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userResult&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;usersCollection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;newUser&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Insert product with dynamic fields&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newProduct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Wireless Headphones&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Noise-cancelling over-ear&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;99.99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;stockQuantity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Electronics&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;audio&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;wireless&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;// Flexible array&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;productsCollection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;newProduct&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Complete order as a single document with embedded items (denormalized for speed)&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newOrder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;insertedId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;orderDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;totalAmount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;99.99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pending&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;productObjectIdHere&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Wireless Headphones&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;priceAtPurchase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;99.99&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="c1"&gt;// Embedded shipping info without separate table&lt;/span&gt;
      &lt;span class="na"&gt;shippingAddress&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;street&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;123 Main St&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;city&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Nairobi&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;country&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Kenya&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ordersCollection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;newOrder&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Query example with aggregation for analytics&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;analytics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ordersCollection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;aggregate&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;completed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;totalRevenue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$sum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$totalAmount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Analytics:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;analytics&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code demonstrates &lt;strong&gt;denormalization&lt;/strong&gt;: related data is embedded directly in the document, so reading an order does not require a &lt;strong&gt;join&lt;/strong&gt;. Writes to a single document are atomic. To scale horizontally, &lt;strong&gt;MongoDB&lt;/strong&gt; distributes data across the shards of a &lt;strong&gt;sharded cluster&lt;/strong&gt; using a shard key with either ranged or hashed &lt;strong&gt;sharding&lt;/strong&gt; (consistent hashing is the approach used by stores such as &lt;strong&gt;Cassandra&lt;/strong&gt;), while each &lt;strong&gt;replica set&lt;/strong&gt; provides &lt;strong&gt;replication&lt;/strong&gt; for high availability.&lt;/p&gt;
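&lt;p&gt;To make the idea of hash-based partitioning concrete, here is a minimal sketch of routing a shard key to a shard. It is an illustration under stated assumptions, not MongoDB's actual implementation: the shard names, the &lt;code&gt;fnv1a&lt;/code&gt; helper, and the plain modulo step are all hypothetical simplifications, and a real cluster assigns ranges of hash values to shards rather than taking a modulo.&lt;/p&gt;

```javascript
// Hypothetical shard names used only for this sketch.
const SHARDS = ["shard-a", "shard-b", "shard-c"];

// FNV-1a-style 32-bit hash computed over the code points of the key.
function fnv1a(text) {
  let hash = 0x811c9dc5; // 32-bit FNV offset basis
  for (const ch of String(text)) {
    hash ^= ch.codePointAt(0);
    hash = Math.imul(hash, 0x01000193); // multiply by the 32-bit FNV prime
  }
  return hash >>> 0; // convert to an unsigned 32-bit integer
}

// Map a shard key (for example a customer ID) to one shard.
function shardFor(shardKey, shards = SHARDS) {
  return shards[fnv1a(shardKey) % shards.length];
}

// Group keys by the shard that would own them.
function distribute(keys, shards = SHARDS) {
  const placement = Object.fromEntries(shards.map((name) => [name, []]));
  for (const key of keys) {
    placement[shardFor(key, shards)].push(key);
  }
  return placement;
}

console.log(distribute(["cust-1", "cust-2", "cust-3", "cust-4", "cust-5", "cust-6"]));
```

&lt;p&gt;The same key always lands on the same shard, which is what lets a router send a query to one node instead of all of them. A plain modulo reshuffles most keys when the shard count changes, which is one reason production systems use range-of-hash chunks or consistent hashing instead.&lt;/p&gt;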

&lt;h2&gt;
  
  
  Comparative Analysis in System Design
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SQL&lt;/strong&gt; and &lt;strong&gt;NoSQL&lt;/strong&gt; differ fundamentally in how they handle data, scale, and guarantee consistency, directly impacting architectural decisions.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;
&lt;strong&gt;SQL&lt;/strong&gt; Databases&lt;/th&gt;
&lt;th&gt;
&lt;strong&gt;NoSQL&lt;/strong&gt; Databases&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Data Model&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Relational&lt;/strong&gt; tables with fixed &lt;strong&gt;schema&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Flexible&lt;/strong&gt; documents, key-value, etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Rigid&lt;/strong&gt; and enforced&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Dynamic&lt;/strong&gt; or &lt;strong&gt;schema-less&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query Language&lt;/td&gt;
&lt;td&gt;Standardized &lt;strong&gt;SQL&lt;/strong&gt; with &lt;strong&gt;joins&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Database-specific (e.g., MongoDB Query Language)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consistency&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;ACID&lt;/strong&gt; (strong guarantees)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;BASE&lt;/strong&gt; (often eventual or tunable consistency)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scalability&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Vertical&lt;/strong&gt; scaling preferred&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Horizontal&lt;/strong&gt; scaling across clusters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Use Case Fit&lt;/td&gt;
&lt;td&gt;Transactions, complex relationships&lt;/td&gt;
&lt;td&gt;High volume, unstructured data, real-time&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;SQL&lt;/strong&gt; shines when &lt;strong&gt;data integrity&lt;/strong&gt; and &lt;strong&gt;complex queries&lt;/strong&gt; are paramount. &lt;strong&gt;NoSQL&lt;/strong&gt; excels when &lt;strong&gt;velocity&lt;/strong&gt;, &lt;strong&gt;variety&lt;/strong&gt;, and &lt;strong&gt;volume&lt;/strong&gt; dominate, as in big data pipelines or global user bases.&lt;/p&gt;

&lt;p&gt;In a distributed system design, a hybrid approach often emerges. &lt;strong&gt;User authentication&lt;/strong&gt; and &lt;strong&gt;financial transactions&lt;/strong&gt; might reside in &lt;strong&gt;PostgreSQL&lt;/strong&gt; for &lt;strong&gt;ACID&lt;/strong&gt; compliance, while &lt;strong&gt;user activity logs&lt;/strong&gt; and &lt;strong&gt;product recommendations&lt;/strong&gt; use &lt;strong&gt;Cassandra&lt;/strong&gt; or &lt;strong&gt;MongoDB&lt;/strong&gt; for &lt;strong&gt;horizontal scaling&lt;/strong&gt; and &lt;strong&gt;eventual consistency&lt;/strong&gt;. &lt;strong&gt;NoSQL&lt;/strong&gt; stores typically have &lt;strong&gt;sharding&lt;/strong&gt; built in, whereas &lt;strong&gt;SQL&lt;/strong&gt; deployments more often scale reads with &lt;strong&gt;read replicas&lt;/strong&gt; and add table &lt;strong&gt;partitioning&lt;/strong&gt; or application-level sharding once a single primary is no longer enough.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Implementation in Distributed Systems
&lt;/h2&gt;

&lt;p&gt;System designers must evaluate &lt;strong&gt;trade-offs&lt;/strong&gt; when selecting databases. For a high-traffic social platform, &lt;strong&gt;SQL&lt;/strong&gt; might store &lt;strong&gt;user profiles&lt;/strong&gt; with &lt;strong&gt;strong consistency&lt;/strong&gt;, resorting to &lt;strong&gt;two-phase commits&lt;/strong&gt; only for the rare transactions that span services. &lt;strong&gt;NoSQL&lt;/strong&gt; would manage &lt;strong&gt;feeds&lt;/strong&gt; using an &lt;strong&gt;event-driven architecture&lt;/strong&gt; in which &lt;strong&gt;message queues&lt;/strong&gt; publish changes that propagate eventually.&lt;/p&gt;
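&lt;p&gt;The eventual propagation of feed updates can be sketched with an in-memory stand-in for a message queue. Everything here is hypothetical and chosen for illustration: the &lt;code&gt;InMemoryQueue&lt;/code&gt; class, the &lt;code&gt;PostCreated&lt;/code&gt; event shape, and the &lt;code&gt;feedStore&lt;/code&gt; map that plays the role of a NoSQL feed store. A real system would use a broker such as Kafka or RabbitMQ and a real database client.&lt;/p&gt;

```javascript
// Minimal in-memory stand-in for a message queue (hypothetical).
class InMemoryQueue {
  constructor() {
    this.messages = [];
    this.subscribers = [];
  }

  subscribe(handler) {
    this.subscribers.push(handler);
  }

  publish(message) {
    this.messages.push(message);
  }

  // Deliver all pending messages. Until this runs, consumers see stale data.
  drain() {
    while (this.messages.length > 0) {
      const message = this.messages.shift();
      for (const handler of this.subscribers) {
        handler(message);
      }
    }
  }
}

const queue = new InMemoryQueue();
const feedStore = new Map(); // stands in for the NoSQL feed store

// Consumer: append each new post to the follower's feed.
queue.subscribe((event) => {
  if (event.type === "PostCreated") {
    const feed = feedStore.get(event.followerId) || [];
    feed.push({ author: event.author, text: event.text });
    feedStore.set(event.followerId, feed);
  }
});

// Producer: publish one event per follower instead of writing feeds directly.
function createPost(author, text, followers) {
  for (const followerId of followers) {
    queue.publish({ type: "PostCreated", followerId, author, text });
  }
}

createPost("alice", "Hello", ["bob", "carol"]);
console.log(feedStore.get("bob")); // undefined: the event has not been delivered yet
queue.drain();
console.log(feedStore.get("bob")); // [ { author: 'alice', text: 'Hello' } ]
```

&lt;p&gt;The gap between publishing and draining is the window of eventual consistency: the post exists, but followers do not see it until the consumer catches up.&lt;/p&gt;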

&lt;p&gt;Many &lt;strong&gt;NoSQL&lt;/strong&gt; clusters rely on &lt;strong&gt;leader election&lt;/strong&gt; and &lt;strong&gt;consensus algorithms&lt;/strong&gt; such as &lt;strong&gt;Raft&lt;/strong&gt; to stay available when nodes fail, although leaderless designs such as &lt;strong&gt;Cassandra&lt;/strong&gt; use quorum reads and writes instead. &lt;strong&gt;SQL&lt;/strong&gt; clusters usually run a single primary with replicas and failover; &lt;strong&gt;multi-master replication&lt;/strong&gt; is also possible but requires careful conflict resolution.&lt;/p&gt;

&lt;p&gt;The choice influences &lt;strong&gt;microservices architecture&lt;/strong&gt;: each service owns its database, enforcing &lt;strong&gt;database-per-service&lt;/strong&gt; patterns. &lt;strong&gt;API gateways&lt;/strong&gt; route requests, while &lt;strong&gt;circuit breakers&lt;/strong&gt; and &lt;strong&gt;retry mechanisms&lt;/strong&gt; handle transient failures across database boundaries.&lt;/p&gt;
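&lt;p&gt;As a rough sketch of the circuit breaker and retry ideas, the code below wraps a database call so that repeated failures stop reaching the database for a cooldown period. The &lt;code&gt;CircuitBreaker&lt;/code&gt; class, its option names, and &lt;code&gt;withRetry&lt;/code&gt; are hypothetical and written for this example. The sketch is synchronous to stay short; a real database call would be asynchronous, and retries would normally wait with backoff between attempts.&lt;/p&gt;

```javascript
class CircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 5000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, which makes the breaker easy to test
    this.failures = 0;
    this.openedAt = null; // null means the circuit is closed
  }

  call(operation) {
    if (this.openedAt !== null) {
      const elapsed = this.now() - this.openedAt;
      if (this.cooldownMs > elapsed) {
        throw new Error("circuit open: call rejected without reaching the database");
      }
      // Cooldown elapsed: fall through and allow one trial call (half-open).
    }
    try {
      const result = operation();
      this.failures = 0;
      this.openedAt = null; // success closes the circuit
      return result;
    } catch (error) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) {
        this.openedAt = this.now(); // open (or re-open) the circuit
      }
      throw error;
    }
  }
}

// Retry a transient failure a fixed number of times, then give up.
function withRetry(operation, attempts = 3) {
  let remaining = attempts;
  let lastError;
  while (remaining > 0) {
    remaining -= 1;
    try {
      return operation();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Usage with a fake clock and a query that always fails (both hypothetical).
let fakeTime = 0;
const breaker = new CircuitBreaker({ failureThreshold: 2, cooldownMs: 1000, now: () => fakeTime });
const failingQuery = () => {
  throw new Error("database unavailable");
};

for (const attempt of [1, 2, 3]) {
  try {
    breaker.call(failingQuery);
  } catch (error) {
    console.log(`attempt ${attempt}: ${error.message}`);
  }
}
```

&lt;p&gt;The first two attempts reach the failing query and report its error; the third is rejected by the open circuit, which protects a struggling database from further load. Retries suit short, transient faults, while the breaker handles sustained outages.&lt;/p&gt;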

&lt;p&gt;This comprehensive understanding equips system designers to architect resilient, scalable solutions tailored to specific requirements.&lt;/p&gt;

&lt;p&gt;The diagram below summarizes the comparison between SQL and NoSQL databases:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rjgkyry26hib8apzeet.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rjgkyry26hib8apzeet.png" alt="SQL vs NoSQL database comparison" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;


</description>
      <category>architecture</category>
      <category>database</category>
      <category>sql</category>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
