<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Dylan Dumont</title>
    <description>The latest articles on Forem by Dylan Dumont (@dylan_dumont_266378d98367).</description>
    <link>https://forem.com/dylan_dumont_266378d98367</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3853448%2F34f28cc2-c576-4b86-8a09-73e1aeb86ed4.png</url>
      <title>Forem: Dylan Dumont</title>
      <link>https://forem.com/dylan_dumont_266378d98367</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dylan_dumont_266378d98367"/>
    <language>en</language>
    <item>
      <title>The RED Method: Request Rate, Errors, and Duration as Your Core SLIs</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Sun, 19 Apr 2026 12:39:38 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/the-red-method-request-rate-errors-and-duration-as-your-core-slis-4jk</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/the-red-method-request-rate-errors-and-duration-as-your-core-slis-4jk</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;"Noise drowns out signal; focus on the three metrics that actually indicate system health."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are instrumenting a Go-based HTTP handler to expose the three RED metrics, Request Rate, Errors, and Duration, which serve as the core Service Level Indicators (SLIs). This scope excludes internal tracing spans and database metrics, focusing strictly on the API gateway surface to ensure consistency across a distributed backend. The goal is to replace legacy monitoring scripts with structured metrics that feed directly into a Prometheus stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Instrument the Middleware
&lt;/h2&gt;

&lt;p&gt;The first step is intercepting incoming requests before they reach the application logic. You need a middleware function that wraps the handler and captures the timing start point.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;RequestInfo&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Start&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;RequestMetricsMiddleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Handler&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandlerFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;reqInfo&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;RequestInfo&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;

        &lt;span class="c"&gt;// Wrap the original handler logic here&lt;/span&gt;
        &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServeHTTP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c"&gt;// Extract duration&lt;/span&gt;
        &lt;span class="n"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reqInfo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This separation ensures the application logic remains clean while observability concerns are handled at the infrastructure boundary.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2 — Aggregate Request Counts
&lt;/h2&gt;

&lt;p&gt;Counters track the total volume of requests. You should maintain separate counters for 4xx errors and 5xx errors to distinguish client failures from server failures.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;totalRequests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prometheus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCounter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prometheus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CounterOpts&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"api_total_requests_total"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Help&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Total number of API requests."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;error5xx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prometheus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCounter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prometheus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CounterOpts&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"api_errors_5xx_total"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Help&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Server-side errors."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Counters only ever increase; Prometheus derives the per-second Request Rate from them at query time (for example with &lt;code&gt;rate()&lt;/code&gt;), which feeds capacity planning thresholds.&lt;/p&gt;
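The rate itself is derived from two counter samples; the arithmetic is just a difference divided by the scrape interval. A minimal sketch (the `ratePerSecond` helper and sample numbers are hypothetical, not from the article):

```go
package main

import "fmt"

// ratePerSecond computes a per-second rate from two samples of a
// monotonically increasing counter taken intervalSeconds apart.
func ratePerSecond(prev, curr, intervalSeconds float64) float64 {
	return (curr - prev) / intervalSeconds
}

func main() {
	// Hypothetical samples: the counter moved from 1200 to 1500
	// across one 15-second scrape interval.
	fmt.Printf("%.1f req/s\n", ratePerSecond(1200, 1500, 15)) // prints "20.0 req/s"
}
```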

&lt;h2&gt;
  
  
  Step 3 — Classify Error Labels
&lt;/h2&gt;

&lt;p&gt;Do not just count errors; label them. Use status code classes (2xx, 4xx, 5xx) as labels so you can query specific failure modes later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;recordError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="m"&gt;500&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;error5xx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Inc&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="c"&gt;// Record 4xx in a similar gauge or counter with a label&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This specificity allows you to distinguish between a rate-limiting issue (429) and a database crash (500) during incident response.&lt;/p&gt;
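A small stdlib-only helper can derive the label value from a raw status code (the `statusClass` function is hypothetical, shown only to make the classification concrete):

```go
package main

import "fmt"

// statusClass maps an HTTP status code to its label class (2xx, 4xx, 5xx).
func statusClass(status int) string {
	switch {
	case status >= 500:
		return "5xx"
	case status >= 400:
		return "4xx"
	case status >= 200 && status < 300:
		return "2xx"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(statusClass(429)) // "4xx": rate limiting is a client-side failure
	fmt.Println(statusClass(500)) // "5xx": server failure
}
```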

&lt;h2&gt;
  
  
  Step 4 — Measure Latency Histograms
&lt;/h2&gt;

&lt;p&gt;Duration needs more than an average. A histogram lets you compute percentiles (p50, p95, p99), revealing the tail latency that actually impacts user experience.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reqInfo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;apiDurationHistogram&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Observe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Seconds&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Histograms record the full latency distribution in buckets, so a flood of fast requests cannot hide a slow tail the way a simple average can.&lt;/p&gt;
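To see why the tail matters, here is a stdlib-only sketch (the sample values and the `percentile` helper are hypothetical, not from the article) comparing the mean against p99 for a sample with a few slow outliers:

```go
package main

import (
	"fmt"
	"sort"
)

// percentile returns the value at quantile q (0..1) of a sorted sample
// using a lower nearest-rank index; real systems use histogram buckets.
func percentile(sorted []float64, q float64) float64 {
	idx := int(q * float64(len(sorted)-1))
	return sorted[idx]
}

func main() {
	// Hypothetical sample: 98 fast requests and 2 slow outliers.
	latencies := make([]float64, 0, 100)
	for i := 0; i < 98; i++ {
		latencies = append(latencies, 0.01)
	}
	latencies = append(latencies, 2.0, 2.0)
	sort.Float64s(latencies)

	sum := 0.0
	for _, v := range latencies {
		sum += v
	}
	// The mean (~0.050s) looks healthy while p99 is 2.000s.
	fmt.Printf("mean=%.3fs p50=%.3fs p99=%.3fs\n",
		sum/100, percentile(latencies, 0.50), percentile(latencies, 0.99))
}
```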

&lt;h2&gt;
  
  
  Step 5 — Export Metrics via HTTP Endpoint
&lt;/h2&gt;

&lt;p&gt;The final step is exposing these values so a collector like Prometheus can scrape them on its configured interval (commonly 15 seconds). Keep the metrics endpoint lightweight so scrapes never block request handling.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;startServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;mux&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServeMux&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/metrics"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prometheus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ListenAndServe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;":8080"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Standard HTTP endpoints provide the necessary protocol compliance for cloud-native observability stacks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Request Rate&lt;/strong&gt; provides visibility into traffic volume and helps identify capacity saturation points in real-time.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Errors&lt;/strong&gt; must be labeled by status code to allow engineers to differentiate between client and server failures.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Duration&lt;/strong&gt; histograms are superior to averages because they reveal the tail latency that causes actual user complaints.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Instrumentation&lt;/strong&gt; should happen at the edge, ensuring that metrics reflect the contract presented to the client, not internal implementation details.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;SLOs&lt;/strong&gt; derived from these RED metrics drive meaningful alerts rather than noise from every internal dependency failure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Next, define Service Level Objectives (SLOs) based on the p99.9 of the Duration histogram. Calculate error budgets to determine how much failure is acceptable before slowing down feature deployment. Finally, implement alerting rules that fire when the 5xx error rate derived from the error5xx counter exceeds your threshold for a sustained window, such as one minute.&lt;/p&gt;
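Error-budget arithmetic is just the complement of the SLO; a minimal sketch with assumed traffic numbers (the `allowedFailures` helper is hypothetical):

```go
package main

import "fmt"

// allowedFailures returns how many requests may fail in a window
// before the availability SLO is breached (the error budget).
func allowedFailures(totalRequests, slo float64) float64 {
	return totalRequests * (1 - slo)
}

func main() {
	// Hypothetical: 10M requests per month under a 99.9% SLO
	// leaves a budget of roughly 10,000 failed requests.
	fmt.Printf("%.0f failures allowed\n", allowedFailures(10_000_000, 0.999))
}
```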

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;&lt;/strong&gt; — Essential for understanding how to structure systems to handle the data flow that metrics represent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;&lt;/strong&gt; — Relevant for managing the complexity trade-offs when instrumenting every layer of a backend system.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Part of the &lt;strong&gt;Architecture Patterns&lt;/strong&gt; series.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>observability</category>
      <category>backend</category>
      <category>devops</category>
    </item>
    <item>
      <title>Building a Job Queue in Rust: Persistent Tasks With Retry Logic</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Fri, 17 Apr 2026 12:40:11 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/building-a-job-queue-in-rust-persistent-tasks-with-retry-logic-5n9</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/building-a-job-queue-in-rust-persistent-tasks-with-retry-logic-5n9</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;"Transient failures are inevitable; durable execution requires state to survive the crash."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are constructing a resilient worker service in Rust that processes background tasks from a persistent queue. This example prioritizes data durability over peak throughput, ensuring that failed jobs are never lost: they either eventually succeed or land in a dead letter queue. We will use async Rust with sqlx and a SQL database for storage, demonstrating how to structure state transitions that survive application restarts. The focus is on architectural correctness over raw performance, building a foundation for long-running background processing systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Define the Job State Machine
&lt;/h2&gt;

&lt;p&gt;The worker must track a job's lifecycle without relying on volatile memory alone. We start by defining an enum that explicitly tracks every state transition, ensuring the logic is exhaustive.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;JobStatus&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;Pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;Running&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;Succeeded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;Failed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;DeadLetter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This choice matters because explicit states prevent the silent state drift that often plagues long-running daemon processes. By forcing the developer to handle every case, we reduce the chance of forgetting to update a database column after a panic.&lt;/p&gt;
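To make the exhaustiveness argument concrete, here is a hypothetical transition guard built on the same enum; centralizing legality checks keeps illegal jumps out of the database:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JobStatus {
    Pending,
    Running,
    Succeeded,
    Failed,
    DeadLetter,
}

// Central guard for legal lifecycle transitions; any pair not listed
// is rejected, so illegal jumps (e.g. Succeeded -> Running) cannot
// slip into the database through a buggy code path.
pub fn can_transition(from: JobStatus, to: JobStatus) -> bool {
    use JobStatus::*;
    matches!(
        (from, to),
        (Pending, Running)
            | (Running, Succeeded)
            | (Running, Failed)
            | (Failed, Pending)    // retry re-queues the job
            | (Failed, DeadLetter) // retries exhausted
    )
}

fn main() {
    assert!(can_transition(JobStatus::Pending, JobStatus::Running));
    assert!(!can_transition(JobStatus::Succeeded, JobStatus::Running));
    println!("transition guard ok");
}
```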

&lt;h2&gt;
  
  
  Step 2 — Persist Job State in Storage
&lt;/h2&gt;

&lt;p&gt;A transient failure of the application worker must not result in data loss. We model the job table to include columns for status, retry count, and last attempt timestamp, creating a source of truth that survives restarts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[derive(sqlx::FromRow)]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;Job&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;JobStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;retry_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DateTime&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Utc&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;last_attempted&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Option&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;DateTime&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Utc&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Storing metadata here allows us to query for pending work and resume processing from exactly where the application died. UUIDs keep job IDs unique across workers without any coordination.&lt;/p&gt;
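For reference, a plausible schema backing this struct might look like the following (column names and defaults are assumptions inferred from the struct fields, not taken from the article):

```sql
-- Hypothetical table matching the Job struct above.
CREATE TABLE jobs (
    id             UUID PRIMARY KEY,
    status         TEXT NOT NULL DEFAULT 'pending',
    retry_count    INTEGER NOT NULL DEFAULT 0,
    created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
    last_attempted TIMESTAMPTZ
);
```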

&lt;h2&gt;
  
  
  Step 3 — Implement Exponential Backoff Logic
&lt;/h2&gt;

&lt;p&gt;When a job fails, we must wait before retrying to prevent database overload. We generate a delay based on the current retry count, using a &lt;code&gt;tokio::time::sleep&lt;/code&gt; to enforce a pause before the next attempt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;calculate_delay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retry_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Duration&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Start with 1 second delay and double it with each retry&lt;/span&gt;
  &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;base_duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_secs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;max_duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_secs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;raw_delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base_duration&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retry_count&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;u32&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;capped_delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;raw_delay&lt;/span&gt;&lt;span class="nf"&gt;.min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_duration&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Add jitter to prevent thundering herd issues&lt;/span&gt;
  &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;jitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_millis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;random&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;u64&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nn"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_secs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;capped_delay&lt;/span&gt;&lt;span class="nf"&gt;.as_secs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;jitter&lt;/span&gt;&lt;span class="nf"&gt;.as_secs&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using exponential backoff instead of a fixed delay ensures that transient network issues resolve without overwhelming the system resources. The jitter component is critical for preventing multiple workers from retrying at the exact same second, which can cause spikes in database load.&lt;/p&gt;
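The deterministic part of the schedule is easy to verify in isolation; this stdlib-only sketch reproduces the 1 s base, doubling, and 30 s cap, with jitter omitted for reproducibility:

```rust
use std::time::Duration;

// Deterministic part of the backoff schedule: 1s base, doubling per
// retry, capped at 30s. Jitter from the article is omitted here so
// the output is reproducible.
fn backoff(retry_count: u32) -> Duration {
    let base = Duration::from_secs(1);
    let cap = Duration::from_secs(30);
    (base * (1u32 << retry_count.min(10))).min(cap)
}

fn main() {
    // Walks 1s, 2s, 4s, 8s, 16s, then holds at the 30s cap.
    for n in 0..6 {
        println!("retry {} -> wait {:?}", n, backoff(n));
    }
}
```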

&lt;h2&gt;
  
  
  Step 4 — Handle Permanent Failures in a DLQ
&lt;/h2&gt;

&lt;p&gt;A job should not be retried infinitely if the error is irrecoverable. If the retry count exceeds a threshold, we transition the state to &lt;code&gt;DeadLetter&lt;/code&gt; to prevent an infinite loop and allow operators to manually inspect or discard the job.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;should_retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="py"&gt;.retry_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;MAX_RETRIES&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Mark as DeadLetter&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This separation isolates error handling from success paths, adhering to the principle of separation of concerns. The &lt;code&gt;DeadLetter&lt;/code&gt; state acts as a final repository for problematic jobs, ensuring the system doesn't block on them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;Building a durable job queue requires treating state as an external truth source rather than application memory. By defining a strict state machine and persisting it in a relational database, we ensure that no work is ever lost even if the worker process crashes. The retry logic with exponential backoff protects system health, while the dead letter queue allows for manual intervention on permanent failures. This pattern scales well for any background processing system that values correctness over speed. The separation of concerns—logic for success, logic for retry, logic for failure—ensures that the code remains maintainable and the architecture remains robust against transient failures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next
&lt;/h2&gt;

&lt;p&gt;To expand on this pattern, consider adding concurrency controls to process jobs in parallel without overloading the database write locks. Investigate how &lt;code&gt;postgres&lt;/code&gt; connection pooling interacts with long-running transactions when processing large payloads. Finally, review the logging strategies for tracking job lifecycle events in a distributed system context to ensure observability aligns with operational expectations. You might also consider implementing a metrics pipeline to track average processing times per job type.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reading
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Books
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;&lt;/strong&gt;: Covers the tradeoffs between durability and availability that inform our database schema choices.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;&lt;/strong&gt;: The chapter on coupling applies to how we separate the retry logic from the processing logic.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Further Reading
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://rust-lang.github.io/async-book/" rel="noopener noreferrer"&gt;Rust Async Book&lt;/a&gt; for deeper async patterns.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/launchbadge/sqlx" rel="noopener noreferrer"&gt;SQL Alchemy for Rust&lt;/a&gt; for database interaction.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Part of the &lt;strong&gt;Architecture Patterns&lt;/strong&gt; series.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>systems</category>
      <category>architecture</category>
      <category>backend</category>
    </item>
    <item>
      <title>Log-Structured Merge Trees: The Data Structure That Powers Modern Databases</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Thu, 16 Apr 2026 12:43:31 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/log-structured-merge-trees-the-data-structure-that-powers-modern-databases-ck4</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/log-structured-merge-trees-the-data-structure-that-powers-modern-databases-ck4</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;LSM trees optimize write performance by buffering changes in memory before flushing to disk sequentially.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are implementing a simplified LSM tree architecture to understand the mechanics behind high-throughput databases like Cassandra and RocksDB. This scope focuses on the core trade-off between write speed and storage durability. We will explore how write-heavy workloads are decoupled from read-heavy operations by leveraging sequential disk access rather than random seeking. This pattern is essential for modern backend systems handling massive logging or ingestion streams.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — In-Memory Write Buffer
&lt;/h2&gt;

&lt;p&gt;The core innovation of LSM trees is the in-memory buffer called a memtable. In this simplified version, incoming writes are appended to a vector rather than hitting the disk immediately; production memtables typically keep entries sorted (in a skip list or balanced tree) so that flushes produce sorted runs. Buffering drastically reduces the number of expensive seek operations required to update the dataset, and it lets the system absorb burst traffic by queuing updates until the buffer capacity is reached.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;MemTable&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Vec&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;capacity_limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;MemTable&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;Self&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;MemTable&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;Vec&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;capacity_limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;capacity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.entries&lt;/span&gt;&lt;span class="nf"&gt;.push&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Rust vectors allocate contiguous memory, which keeps appends cheap and lets the eventual flush proceed as a single sequential write to storage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2 — Flushing to SSTables
&lt;/h2&gt;

&lt;p&gt;Once the memtable reaches its capacity limit, the buffer is frozen and flushed to disk as a Sorted String Table (SSTable). The flush runs in the background: the frozen buffer becomes an immutable file while a fresh memtable accepts new writes, and a write-ahead log covers the window before the flush completes. The file is written sequentially, which avoids the random-write amplification that plagues traditional B-trees under heavy update loads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;flush_memtable_to_disk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;MemTable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Simulate serialization to SSTable format&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;serde_json&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;to_vec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="py"&gt;.entries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="c1"&gt;// In reality, this would be written to disk with compression&lt;/span&gt;
    &lt;span class="nn"&gt;String&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"sst-000001"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Immutable files allow multiple versions to coexist without locking, enabling readers to see consistent snapshots of the database state.&lt;/p&gt;
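&lt;p&gt;Because SSTables must be sorted by key, the flush typically sorts the frozen buffer before writing. A minimal sketch, assuming the &lt;code&gt;(String, String)&lt;/code&gt; entry pairs from the memtable above:&lt;/p&gt;

```rust
// Freeze the memtable buffer and sort it by key so the resulting
// SSTable supports binary search and ordered range scans.
fn freeze_and_sort(mut entries: Vec<(String, String)>) -> Vec<(String, String)> {
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}
```

&lt;p&gt;Sorting once at flush time is cheap relative to the disk I/O, and it is what later lets compaction merge files in a single linear pass.&lt;/p&gt;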

&lt;h2&gt;
  
  
  Step 3 — Indexing for Random Reads
&lt;/h2&gt;

&lt;p&gt;Scanning every SSTable for a key would make reads slow, so each table carries auxiliary structures for fast lookups. A bloom filter answers "might this key exist in this SSTable?" before any disk read, and a sparse index narrows the search to a small block within the file. Both checks are fast and require minimal disk I/O, keeping reads efficient on large datasets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;collections&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;HashSet&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;check_bloom_filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sst_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Check bit array to determine existence probability&lt;/span&gt;
    &lt;span class="n"&gt;sst_key&lt;/span&gt;&lt;span class="nf"&gt;.contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"valid"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A bloom filter may return false positives but never false negatives, so existing keys are never missed; a "no" answer lets the reader skip the file entirely.&lt;/p&gt;
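&lt;p&gt;The simulated check above can be swapped for a real, if tiny, bloom filter: each insert sets &lt;em&gt;k&lt;/em&gt; bits derived from &lt;em&gt;k&lt;/em&gt; hashes of the key, and a lookup reports "maybe present" only if all &lt;em&gt;k&lt;/em&gt; bits are set. A minimal sketch using only the standard library (sizes and hash count are illustrative):&lt;/p&gt;

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct BloomFilter {
    bits: Vec<bool>,
    hashes: u64, // number of hash functions (k)
}

impl BloomFilter {
    fn new(size: usize, hashes: u64) -> Self {
        BloomFilter { bits: vec![false; size], hashes }
    }

    // Derive the i-th hash by seeding the hasher with i before the key.
    fn bit_index(&self, key: &str, i: u64) -> usize {
        let mut h = DefaultHasher::new();
        i.hash(&mut h);
        key.hash(&mut h);
        (h.finish() as usize) % self.bits.len()
    }

    fn insert(&mut self, key: &str) {
        for i in 0..self.hashes {
            let idx = self.bit_index(key, i);
            self.bits[idx] = true;
        }
    }

    // "false" is definitive; "true" may be a false positive.
    fn might_contain(&self, key: &str) -> bool {
        (0..self.hashes).all(|i| self.bits[self.bit_index(key, i)])
    }
}
```

&lt;p&gt;A production filter would use a packed bit array and independent hash functions, but the contract is the same: no false negatives.&lt;/p&gt;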

&lt;h2&gt;
  
  
  Step 4 — Asynchronous Compaction
&lt;/h2&gt;

&lt;p&gt;Over time the number of SSTable files grows, and obsolete versions of the same key accumulate across them. A background compaction process merges files, dropping superseded duplicates and reclaiming space, so storage stays proportional to the live data set rather than the total write history.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;compact_sstables&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sst_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Vec&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Simulate merging files into a new sorted file&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;merged&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Vec&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sst_files&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;merged&lt;/span&gt;&lt;span class="nf"&gt;.push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;format!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"merged-{}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;merged&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This process runs on a separate thread to ensure that write throughput is not degraded by background cleanup tasks.&lt;/p&gt;
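&lt;p&gt;Running compaction off the write path can be sketched with a plain standard-library thread (a real engine would use its runtime's blocking pool and coordinate with readers):&lt;/p&gt;

```rust
use std::thread;

// Run the merge on a background thread so the foreground write path
// is never blocked by compaction (mirrors the simulated merge above).
fn spawn_compaction(sst_files: Vec<String>) -> thread::JoinHandle<Vec<String>> {
    thread::spawn(move || {
        sst_files
            .into_iter()
            .map(|f| format!("merged-{}", f))
            .collect()
    })
}
```

&lt;p&gt;The caller keeps accepting writes and joins the handle only when it needs the merged file list.&lt;/p&gt;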

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Write Amplification:&lt;/strong&gt; LSM trees reduce write amplification by grouping updates and writing them sequentially.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sequential I/O:&lt;/strong&gt; By avoiding random seeks, the system utilizes the high bandwidth of modern storage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Durability:&lt;/strong&gt; Data is durable once it reaches the write-ahead log or an immutable SSTable on disk; the in-memory buffer alone is volatile.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrency:&lt;/strong&gt; The write buffer allows high concurrency without contention on the storage device.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Explore immutable snapshots for point-in-time recovery.&lt;/li&gt;
&lt;li&gt;Investigate how reads that must consult the memtable plus several SSTables introduce read amplification.&lt;/li&gt;
&lt;li&gt;Consider the trade-offs when implementing bloom filters in low-memory environments.&lt;/li&gt;
&lt;li&gt;Compare LSM tree implementations across database engines such as LevelDB, Cassandra, and RocksDB.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://amzn.to/4c2jE8D" rel="noopener noreferrer"&gt;Computer Systems: A Programmer's Perspective (Bryant &amp;amp; O'Hallaron)&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://amzn.to/4mekBiY" rel="noopener noreferrer"&gt;Python Crash Course (Matthes)&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://amzn.to/41FQGXh" rel="noopener noreferrer"&gt;Cracking the Coding Interview (McDowell)&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://amzn.to/3O0yNPF" rel="noopener noreferrer"&gt;AI Engineering (Chip Huyen)&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://amzn.to/4sPlPDL" rel="noopener noreferrer"&gt;Learn Rust in a Month of Lunches (MacLeod)&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>distributed</category>
      <category>systems</category>
      <category>architecture</category>
      <category>backend</category>
    </item>
    <item>
      <title>Exactly-Once Delivery Is a Lie: How Systems Approximate It Anyway</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Tue, 14 Apr 2026 12:44:01 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/exactly-once-delivery-is-a-lie-how-systems-approximate-it-anyway-2m6j</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/exactly-once-delivery-is-a-lie-how-systems-approximate-it-anyway-2m6j</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Reliability in distributed systems does not mean eliminating failures; it means ensuring failures do not corrupt state.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are implementing a financial order service that processes thousands of transactions per second across multiple availability zones. The requirement is strict consistency where money is concerned: we cannot charge a customer twice, even if the network retries, and we must handle timeouts without creating ghost orders or losing revenue. This architecture assumes the network is unreliable and that retries are inevitable, so it focuses on making retries safe rather than pretending they never happen.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — The Idempotency Key
&lt;/h2&gt;

&lt;p&gt;Every incoming request must carry a unique identifier. When the service receives a payment instruction, it checks whether that identifier has already been processed. If a duplicate arrives, the service short-circuits and returns the original result without executing the business logic again, preventing double charges when a client resends a failed request. The identifier must be recorded before the response is acknowledged so a crash cannot leave the two out of sync.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Service&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;CreateOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;OrderRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IdempotencyKey&lt;/span&gt;

    &lt;span class="c"&gt;// Check existing state in a separate store&lt;/span&gt;
    &lt;span class="n"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetProcessedKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="c"&gt;// Return previously created order, don't process again&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c"&gt;// ... process order&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InsertOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2 — The Outbox Pattern
&lt;/h2&gt;

&lt;p&gt;A database commit and a message-queue publish cannot be made atomic across systems: one can succeed while the other fails. We sidestep this by writing the event to a local outbox table inside the same transaction as the order record. A separate worker then reads the outbox and publishes to the external queue. This guarantees the event is durably logged before it is sent, without requiring two-phase commit across services.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Inside a transaction&lt;/span&gt;
&lt;span class="n"&gt;tx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Begin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InsertOrderRecord&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// Write data&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InsertOutboxEntry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c"&gt;// Write event log&lt;/span&gt;
&lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c"&gt;// Worker reads 'id' and sends to MQ asynchronously&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3 — State Machine Validation
&lt;/h2&gt;

&lt;p&gt;We track the lifecycle of each order through states such as Created, Paid, and Completed. Transitions are idempotent: applying the Created-to-Paid transition twice has no further effect, and a retry against an already Paid order is ignored. If the database is down, the state remains Created; once it recovers, we replay the log. The state machine refuses to move backward or skip steps, which would violate accounting invariants, and it enforces valid transitions at every request boundary.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4 — Compensating Transactions
&lt;/h2&gt;

&lt;p&gt;If a payment fails after an order is created, we must undo the side effects. This is a compensating transaction. We record a cancellation event with the same unique identifier logic as the creation request. If a cancellation arrives, we check the current state. If the state is still Created, we update it to Cancelled and refund the account. This ensures we never hold funds indefinitely without a valid order. The worker processes cancellations with the same priority as creation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Service&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;CancelOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FindByOrderKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"not found"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;State&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;"CREATED"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="c"&gt;// State is already processed&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UpdateOrderState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"CANCELLED"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Idempotency keys protect against duplicate requests but require client support. The Outbox pattern ensures messages are not lost during DB failures. State machines validate valid lifecycle paths and prevent invalid updates. Compensating transactions clean up partial failures and maintain financial integrity. Together, these patterns approximate exactly-once semantics by acknowledging network unreliability and designing for eventual consistency.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;The final step is monitoring. We track the ratio of retries to successes. If the retry rate spikes, we check the queue depth. We also monitor the deduplication store for memory usage. High cardinality of keys can cause performance degradation. We plan to shard the state storage to handle large volumes of concurrent requests. Next, we will explore distributed tracing to observe these flows across service boundaries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;em&gt;&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;&lt;/em&gt; by Martin Kleppmann.&lt;/li&gt;
&lt;li&gt;  &lt;em&gt;&lt;a href="https://amzn.to/4c2jE8D" rel="noopener noreferrer"&gt;Computer Systems: A Programmer's Perspective (Bryant &amp;amp; O'Hallaron)&lt;/a&gt;&lt;/em&gt; by Bryant and O'Hallaron.&lt;/li&gt;
&lt;li&gt;  &lt;em&gt;&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;&lt;/em&gt; by John Ousterhout.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Architecture Patterns Series
&lt;/h2&gt;

&lt;p&gt;Part of the &lt;strong&gt;Architecture Patterns&lt;/strong&gt; series.&lt;/p&gt;

</description>
      <category>distributed</category>
      <category>systems</category>
      <category>architecture</category>
      <category>backend</category>
    </item>
    <item>
      <title>Serde Deep Dive: Custom Serialization for Wire Protocols</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Tue, 14 Apr 2026 06:13:38 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/serde-deep-dive-custom-serialization-for-wire-protocols-1hn0</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/serde-deep-dive-custom-serialization-for-wire-protocols-1hn0</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;"When standard serialization attributes can't satisfy your wire protocol constraints, you must descend to manual implementation for precision."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are constructing a production-grade Rust module that bridges an internal domain model with an external network protocol. The goal is to enforce strict field ordering, handle optional legacy fields, and ensure binary-compatible output without the rigidity of default JSON serialization. This pattern addresses scenarios where you cannot use generic JSON and need to optimize for specific wire protocols like gRPC, FlatBuffers, or custom binary streams. We will use &lt;code&gt;serde&lt;/code&gt; for its flexibility while demonstrating where the derive macros fall short, necessitating a custom approach to maintain architectural integrity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Define the Internal Domain Model
&lt;/h2&gt;

&lt;p&gt;First, establish the clean Rust type that represents your business logic. This structure should contain only the fields strictly necessary for your application logic. The wire protocol will be an abstraction built around this, not a replacement for it. You need to isolate the core state from the transport concerns to ensure your business logic remains decoupled from network specifics.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[derive(Debug)]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;UserUpdate&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Option&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2 — Override Serialization Attributes
&lt;/h2&gt;

&lt;p&gt;Out of the box, a derived &lt;code&gt;Serialize&lt;/code&gt; writes an &lt;code&gt;Option&lt;/code&gt; field's &lt;code&gt;None&lt;/code&gt; as &lt;code&gt;null&lt;/code&gt;; omitting the field requires an explicit attribute. For wire protocols that demand strict field ordering or forbid nulls, use &lt;code&gt;serialize_with&lt;/code&gt;, field attributes such as &lt;code&gt;skip_serializing_if&lt;/code&gt;, or a manual trait implementation to enforce the exact formatting your downstream consumers require.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[derive(Serialize)]&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;UserUpdate&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;#[serde(skip_serializing_if&lt;/span&gt; &lt;span class="nd"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Option::is_none"&lt;/span&gt;&lt;span class="nd"&gt;)]&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Option&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3 — Manage Optional and Deprecated Fields
&lt;/h2&gt;

&lt;p&gt;Wire protocols often require deprecated fields to be removed from payloads or sent as specific boolean flags rather than &lt;code&gt;null&lt;/code&gt;. This prevents version drift and ensures forward compatibility. By marking optional fields with &lt;code&gt;skip_serializing_if&lt;/code&gt; or custom flags, you control the exact byte stream sent over the network without altering the internal struct definition.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4 — Implement Custom Deserialization Logic
&lt;/h2&gt;

&lt;p&gt;Deserialization is where version mismatches surface. A derived implementation fails with a generic type error that tells the client little; a custom &lt;code&gt;Deserialize&lt;/code&gt; implementation lets you map unknown or missing fields to safe defaults and return actionable error messages, which improves resilience when handling version drift between clients and servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5 — Control Versioning and Protocols
&lt;/h2&gt;

&lt;p&gt;Versioning in wire protocols is often managed via a header or a specific enum tag. In Rust, this maps to &lt;code&gt;#[serde(flatten)]&lt;/code&gt; or internally tagged enums (&lt;code&gt;#[serde(tag = "...")]&lt;/code&gt;) for handling different schema versions within the same payload. Implementing a &lt;code&gt;VersionedMessage&lt;/code&gt; wrapper allows multiple schema versions to travel through a single wire packet structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Type Safety:&lt;/strong&gt; Custom serialization ensures your internal state maps exactly to your wire format without unexpected fields leaking into the network stream.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Protocol Agnosticism:&lt;/strong&gt; This pattern allows you to switch underlying transport mechanisms (HTTP, UDP, TCP) by swapping the serialization context without changing the business logic.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Backward Compatibility:&lt;/strong&gt; By explicitly managing how optional fields are handled, you prevent breaking changes when deploying new service versions.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Error Granularity:&lt;/strong&gt; Custom deserialization provides precise error messages that help clients fix issues quickly rather than facing generic serialization failures.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Code Maintainability:&lt;/strong&gt; Centralizing wire protocol logic in custom &lt;code&gt;Serialize&lt;/code&gt; implementations keeps business logic clean and separated from transport concerns.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Performance:&lt;/strong&gt; Custom serialization allows you to avoid allocating unnecessary types during the serialization process, leading to more efficient wire formats.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Explore how to integrate &lt;code&gt;serde&lt;/code&gt; with binary protocols like Protobuf by generating Rust structs from &lt;code&gt;.proto&lt;/code&gt; definitions. Investigate how to handle compression transparently using &lt;code&gt;zstd&lt;/code&gt; wrappers over the serialized payload. Finally, consider implementing &lt;code&gt;serde&lt;/code&gt;-like abstractions in Go using &lt;code&gt;encoding/json&lt;/code&gt; or &lt;code&gt;msgp&lt;/code&gt; to compare architectural trade-offs between the two ecosystems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;&lt;/strong&gt; — This book explains the fundamental trade-offs between storage formats that directly impacts how you structure your serialization logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4c2jE8D" rel="noopener noreferrer"&gt;Computer Systems: A Programmer's Perspective (Bryant &amp;amp; O'Hallaron)&lt;/a&gt;&lt;/strong&gt; — Understanding the underlying memory and network layers helps you appreciate why serialization overhead matters in high-performance systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;&lt;/strong&gt; — Learning to separate core logic from peripheral details like network serialization aligns with the book's principles of deep modules and shallow hierarchies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Part of the &lt;strong&gt;Architecture Patterns&lt;/strong&gt; series.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>systems</category>
      <category>architecture</category>
      <category>backend</category>
    </item>
    <item>
      <title>Cooperative Cancellation: Propagating Shutdown Signals Through Async Trees</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Mon, 13 Apr 2026 07:09:37 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/cooperative-cancellation-propagating-shutdown-signals-through-async-trees-21bi</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/cooperative-cancellation-propagating-shutdown-signals-through-async-trees-21bi</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Fire-and-forget concurrency leaves dangling resources until you explicitly stop the engine.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are implementing a worker tree in which a shutdown signal from the root propagates to every leaf without dropping tasks or leaking memory. We define a control signal that flows down the &lt;code&gt;async&lt;/code&gt; task tree, so that when the parent task terminates, each child receives a clean interrupt and can run its finalizers. We focus on Rust with &lt;code&gt;tokio&lt;/code&gt; because its ownership model makes cleanup deterministic. The goal is a robust shutdown pattern applicable to gRPC servers, message queues, and high-throughput data pipelines, where abrupt termination causes data loss.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Establish a Broadcast Channel
&lt;/h2&gt;

&lt;p&gt;You need a mechanism to send a single shutdown command to many tasks simultaneously. Instead of passing mutable references, create a &lt;code&gt;tokio::sync::broadcast&lt;/code&gt; channel. The receiver (&lt;code&gt;Receiver&lt;/code&gt;) lives with the task logic, while the sender (&lt;code&gt;Sender&lt;/code&gt;) moves into the &lt;code&gt;shutdown&lt;/code&gt; function at the root level.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shutdown_tx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;shutdown_rx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;sync&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;broadcast&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// ... pass shutdown_tx to parent ...&lt;/span&gt;
&lt;span class="c1"&gt;// ... pass shutdown_rx to children ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Rust's ownership model prevents data races here: each receiver is moved into exactly one task, and additional subscribers must be created explicitly via &lt;code&gt;subscribe()&lt;/code&gt; rather than shared implicitly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2 — Distribute Tokens to Children
&lt;/h2&gt;

&lt;p&gt;When spawning a worker, create a fresh receiver with &lt;code&gt;shutdown_tx.subscribe()&lt;/code&gt; and move it in along with the task payload (broadcast receivers do not implement &lt;code&gt;Clone&lt;/code&gt;). This makes the child task an independent subscriber to the same control stream. Do not create a receiver for every logical sub-operation; pass one per task instance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;worker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Process data...&lt;/span&gt;
    &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nd"&gt;select!&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;do_work&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shutdown_rx&lt;/span&gt;&lt;span class="nf"&gt;.recv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(()),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Creating subscribers is cheap, and handing exactly one to each task preserves the logical tree structure required for propagation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3 — Listen in a Select Statement
&lt;/h2&gt;

&lt;p&gt;To react immediately to a shutdown signal, wrap your core logic in a &lt;code&gt;tokio::select!&lt;/code&gt; block. This combinator chooses between the successful completion of work and the reception of a cancellation message. This pattern prioritizes the interrupt over normal flow, preventing the task from getting stuck in a blocking operation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nd"&gt;select!&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;task_data_channel&lt;/span&gt;&lt;span class="nf"&gt;.recv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shutdown_rx&lt;/span&gt;&lt;span class="nf"&gt;.recv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt; &lt;span class="n"&gt;Loop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;timer&lt;/span&gt;&lt;span class="nf"&gt;.tick&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nd"&gt;panic!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Timeout"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;select!&lt;/code&gt; provides a non-blocking path for interruption, which is critical for maintaining liveness in a tree.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4 — Implement Cleanup Closures
&lt;/h2&gt;

&lt;p&gt;A shutdown signal must trigger resource release, such as closing file handles or dropping database connections. Wrap the logic in a match arm that executes cleanup code when the loop breaks. This ensures that &lt;code&gt;Drop&lt;/code&gt; implementations run even if the task was interrupted mid-process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;worker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="nf"&gt;do_work&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nd"&gt;println!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Work done"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;handle_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="c1"&gt;// On shutdown, ensure resources are released&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shutdown_logic&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;eprintln!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Interrupted: {}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Handling errors explicitly avoids silently swallowing the reason for cancellation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5 — Trigger from the Root
&lt;/h2&gt;

&lt;p&gt;The parent task holds the &lt;code&gt;Sender&lt;/code&gt;. When application exit is imminent, call &lt;code&gt;shutdown_tx.send(true)&lt;/code&gt;. Because it is a broadcast channel, all registered children receive the signal instantly. The propagation stops at the leaf nodes where tasks exit their loops. This top-down approach prevents "zombie" threads in an async runtime.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;start_server&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;rx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;sync&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;broadcast&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nf"&gt;spawn_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
    &lt;span class="n"&gt;shutdown_tx&lt;/span&gt;&lt;span class="nf"&gt;.send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single broadcast ensures a consistent state where no child task is left unaware of the shutdown decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Broadcast Channels&lt;/strong&gt; — Single signals can interrupt multiple independent tasks simultaneously without needing shared mutable state.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Select Patterns&lt;/strong&gt; — Using &lt;code&gt;tokio::select!&lt;/code&gt; races the shutdown signal against long-running work, so a task never awaits past a cancellation request.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Ownership Safety&lt;/strong&gt; — Rust's ownership rules make explicit which task holds each sender and receiver, so control handles are never shared implicitly.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Cleanup Hooks&lt;/strong&gt; — Explicit error matching allows you to finalize resources rather than relying solely on Drop.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Tree Propagation&lt;/strong&gt; — Top-down shutdown prevents straggling tasks that waste CPU cycles on orphaned logic.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Extend this pattern to handle network I/O that times out automatically. Investigate how to implement exponential backoff if a task fails immediately after a shutdown signal. Consider adding metrics to track how long each level takes to drain during shutdown. Finally, integrate this into your CI pipeline to ensure no resource leaks occur under load.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;&lt;/strong&gt; — Explains high-level concurrency models and why graceful shutdown matters in distributed systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;&lt;/strong&gt; — Discusses how complexity arises from coupling and why decoupling shutdown logic reduces fragility.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Part of the &lt;strong&gt;Architecture Patterns&lt;/strong&gt; series.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>concurrency</category>
      <category>systems</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Rate Limiting Deep Dive: Token Bucket vs Leaky Bucket vs Sliding Window</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Sun, 12 Apr 2026 07:09:14 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/rate-limiting-deep-dive-token-bucket-vs-leaky-bucket-vs-sliding-window-47b7</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/rate-limiting-deep-dive-token-bucket-vs-leaky-bucket-vs-sliding-window-47b7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Protect your backend infrastructure from resource exhaustion by controlling traffic admission with precision.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;This article guides you through implementing robust rate limiting strategies in Go. We explore the core mechanics of Token Bucket, Leaky Bucket, and Sliding Window algorithms, explaining their trade-offs in handling bursts versus sustained load. We also address persistence for distributed systems and selection criteria to ensure your system remains stable under pressure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Define the Rate Limit Contract
&lt;/h2&gt;

&lt;p&gt;Before selecting an algorithm, you must establish a data structure to track time and tokens. This contract ensures type safety and serialization compatibility. In Go, you typically use an &lt;code&gt;atomic&lt;/code&gt; integer for simple counters or a &lt;code&gt;sync.Map&lt;/code&gt; for complex state. For a Token Bucket, you track &lt;code&gt;tokens&lt;/code&gt; and &lt;code&gt;lastRefillTime&lt;/code&gt;. For a Sliding Window, you track a map of &lt;code&gt;timeWindow&lt;/code&gt; to &lt;code&gt;count&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;RateLimiter&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt;       &lt;span class="kt"&gt;int64&lt;/span&gt;
    &lt;span class="n"&gt;capacity&lt;/span&gt;     &lt;span class="kt"&gt;int64&lt;/span&gt;
    &lt;span class="n"&gt;lastRefill&lt;/span&gt;   &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;
    &lt;span class="n"&gt;refillRate&lt;/span&gt;   &lt;span class="kt"&gt;int64&lt;/span&gt; &lt;span class="c"&gt;// tokens per second&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;RateLimiter&lt;/code&gt; struct encapsulates the state required for a single-node implementation. By centralizing the state here, you ensure that every request goes through a consistent logic path. This also simplifies debugging. If the rate limit is too strict, you can increase &lt;code&gt;capacity&lt;/code&gt; or &lt;code&gt;refillRate&lt;/code&gt; without changing the structural layout. This approach supports testing by isolating the logic.&lt;/p&gt;
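
&lt;p&gt;As a minimal sketch, a constructor pins down the initial state; &lt;code&gt;NewRateLimiter&lt;/code&gt; is a hypothetical helper (not part of the struct above) that starts the bucket full so a cold start can absorb an initial burst:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"time"
)

// RateLimiter mirrors the struct above: token count, capacity,
// last refill timestamp, and refill rate in tokens per second.
type RateLimiter struct {
	tokens     int64
	capacity   int64
	lastRefill time.Time
	refillRate int64
}

// NewRateLimiter starts the bucket full, so the first burst
// up to capacity is admitted immediately.
func NewRateLimiter(capacity, refillRate int64) *RateLimiter {
	return &RateLimiter{
		tokens:     capacity,
		capacity:   capacity,
		lastRefill: time.Now(),
		refillRate: refillRate,
	}
}

func main() {
	rl := NewRateLimiter(100, 10) // 100-token burst, 10 tokens/sec sustained
	fmt.Println(rl.tokens, rl.capacity)
}
```

&lt;p&gt;Starting full versus empty is a policy choice: full favors availability after a restart, empty favors protection while caches warm up.&lt;/p&gt;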

&lt;h2&gt;
  
  
  Step 2 — Implement Token Bucket for Burst Tolerance
&lt;/h2&gt;

&lt;p&gt;This algorithm simulates a bucket of tokens that fills up at a fixed rate. A request consumes one token. If empty, the request is rejected. This allows bursts up to the bucket capacity. When no requests arrive, tokens accumulate at &lt;code&gt;refillRate&lt;/code&gt; per second until the bucket reaches &lt;code&gt;capacity&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;RateLimiter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Allow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c"&gt;// Refill logic&lt;/span&gt;
    &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lastRefill&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Microseconds&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1000000&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;rl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;refillRate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;newTokens&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;rl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;newTokens&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;rl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;capacity&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;newTokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;capacity&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;rl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;newTokens&lt;/span&gt;
    &lt;span class="n"&gt;rl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lastRefill&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;rl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The refill calculation uses &lt;code&gt;elapsed&lt;/code&gt; to determine how many tokens to add. This handles sudden traffic spikes better than fixed windows, which reset abruptly at the end of a period. For example, with a capacity of 100 tokens, a burst of 50 requests is admitted immediately; the bucket then refills at the configured rate before another burst of that size can be served. This reduces latency for legitimate high-demand periods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;      Tokens
      ^
      |
      |      [ Bucket Fills ]
      |     /
      |    /
      |---/----------------&amp;gt; Time
      |  /    [ Burst Allowed ]
      | /
      |/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3 — Implement Leaky Bucket for Traffic Smoothing
&lt;/h2&gt;

&lt;p&gt;Unlike the Token Bucket, the Leaky Bucket processes requests at a constant rate regardless of arrival rate. Requests enter a queue. If the bucket is full, new requests are dropped. The "leak" rate represents the backend processing speed. This prevents downstream overload from upstream bursts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lb&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;LeakyBucket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;lb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;queueDepth&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;lb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;maxDepth&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt; &lt;span class="c"&gt;// Drop request&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;lb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;queueDepth&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;
    &lt;span class="c"&gt;// Simulate leak&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;lb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;leakRate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;lb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;queueDepth&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;lb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;queueDepth&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;go&lt;/code&gt; routine simulates the leak. In production, guard &lt;code&gt;queueDepth&lt;/code&gt; with a mutex or atomic operations, and drive the drain from a single ticker goroutine rather than spawning one goroutine per request. The core logic is the comparison of &lt;code&gt;queueDepth&lt;/code&gt; against &lt;code&gt;maxDepth&lt;/code&gt;: when depth exceeds the limit, the request is rejected with a 429 status. This algorithm is ideal for protecting a specific downstream service from high variance in incoming traffic.&lt;/p&gt;
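
&lt;p&gt;A thread-safe admission check can be sketched with a mutex; the field names follow the snippet above, and the single-ticker &lt;code&gt;Leak&lt;/code&gt; method is an assumption about how the drain would be driven:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sync"
)

// LeakyBucket guarded by a mutex, so Enqueue callers and the
// leak goroutine never race on queueDepth.
type LeakyBucket struct {
	mu         sync.Mutex
	queueDepth int64
	maxDepth   int64
}

// TryEnqueue admits a request if there is room, otherwise rejects it.
func (lb *LeakyBucket) TryEnqueue() bool {
	lb.mu.Lock()
	defer lb.mu.Unlock()
	if lb.queueDepth >= lb.maxDepth {
		return false // caller should respond with 429
	}
	lb.queueDepth++
	return true
}

// Leak is called by one ticker goroutine at the leak rate.
func (lb *LeakyBucket) Leak() {
	lb.mu.Lock()
	defer lb.mu.Unlock()
	if lb.queueDepth > 0 {
		lb.queueDepth--
	}
}

func main() {
	lb := &LeakyBucket{maxDepth: 2}
	fmt.Println(lb.TryEnqueue(), lb.TryEnqueue(), lb.TryEnqueue()) // true true false
	lb.Leak()
	fmt.Println(lb.TryEnqueue()) // true — room again after one leak
}
```
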

&lt;h2&gt;
  
  
  Step 4 — Implement Sliding Window for Precision
&lt;/h2&gt;

&lt;p&gt;The Token Bucket and Leaky Bucket are approximations. The Sliding Window counts the requests that arrived within the most recent interval (e.g., the last 1 second) and compares that count against the limit. Because the window's start advances continuously with the clock, the limit is enforced over every possible one-second span rather than over fixed, aligned periods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sw&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;SlidingWindow&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;window&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timeWindow&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;sw&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;
        &lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach is accurate because it counts exactly how many requests have arrived in the current window. However, it is more complex to implement efficiently without a database or cache: a naive implementation stores a slice or map of timestamps per client. For high throughput, a common compromise is the sliding window counter, which keeps one counter per fixed window and weights the previous window's count by how much of it still overlaps the sliding window. The benefit is that the limit decays gradually rather than resetting abruptly, preventing the "sawtooth" pattern where clients exhaust the full quota just before a fixed window resets.&lt;/p&gt;
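
&lt;p&gt;One way to get a gradual reset without storing every timestamp is to weight the previous fixed window's count by how much of it still overlaps the sliding window. A minimal sketch, with illustrative names not taken from the code above:&lt;/p&gt;

```go
package main

import "fmt"

// SlidingWindowCounter approximates a true sliding log with two
// counters: the previous fixed window's total and the current one.
type SlidingWindowCounter struct {
	limit     float64
	prevCount float64 // requests in the previous fixed window
	currCount float64 // requests in the current fixed window
}

// Allow admits a request given elapsed, the fraction (0..1) of the
// current fixed window that has already passed.
func (sw *SlidingWindowCounter) Allow(elapsed float64) bool {
	// The previous window still contributes (1 - elapsed) of its count.
	estimated := sw.prevCount*(1-elapsed) + sw.currCount
	if estimated >= sw.limit {
		return false
	}
	sw.currCount++
	return true
}

func main() {
	sw := &SlidingWindowCounter{limit: 100, prevCount: 80}
	fmt.Println(sw.Allow(0.1)) // 80*0.9 + 0 = 72 < 100 → true
	fmt.Println(sw.Allow(0.9)) // 80*0.1 + 1 = 9  < 100 → true
}
```

&lt;p&gt;This costs O(1) memory per client at the price of assuming requests were spread evenly across the previous window.&lt;/p&gt;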

&lt;h2&gt;
  
  
  Step 5 — Persist State for Distributed Systems
&lt;/h2&gt;

&lt;p&gt;Single-node implementations break down once traffic is balanced across multiple instances, because each node sees only its own counters. A common solution is to keep the limiter state in a shared store such as Redis: when a request arrives, the application atomically checks and updates the Redis key, giving every node a consistent view of the remaining quota.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Example Redis Lua Script&lt;/span&gt;
&lt;span class="nb"&gt;local &lt;/span&gt;tokens &lt;span class="o"&gt;=&lt;/span&gt; tonumber&lt;span class="o"&gt;(&lt;/span&gt;redis.call&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GET'&lt;/span&gt;, KEYS[1]&lt;span class="o"&gt;)&lt;/span&gt; or &lt;span class="s2"&gt;"100"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;local &lt;/span&gt;now &lt;span class="o"&gt;=&lt;/span&gt; tonumber&lt;span class="o"&gt;(&lt;/span&gt;redis.call&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'SET'&lt;/span&gt;, KEYS[1], tokens - 1, &lt;span class="s1"&gt;'PX'&lt;/span&gt;, 1000&lt;span class="o"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;return &lt;/span&gt;now
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Lua script executes atomically inside Redis. If you used separate &lt;code&gt;GET&lt;/code&gt; and &lt;code&gt;SET&lt;/code&gt; commands, a concurrent request might read a stale value and succeed, bypassing the limit. The &lt;code&gt;PX&lt;/code&gt; parameter sets an expiration on the key so stale counters expire instead of accumulating. This setup lets independently deployed microservice instances enforce a single global limit through the shared store.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6 — Select the Right Algorithm
&lt;/h2&gt;

&lt;p&gt;Your choice depends on traffic patterns and downstream constraints.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Burst Tolerance&lt;/td&gt;
&lt;td&gt;Token Bucket&lt;/td&gt;
&lt;td&gt;Allows spikes up to capacity.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Constant Rate&lt;/td&gt;
&lt;td&gt;Leaky Bucket&lt;/td&gt;
&lt;td&gt;Smooths output for fragile APIs.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Precision&lt;/td&gt;
&lt;td&gt;Sliding Window&lt;/td&gt;
&lt;td&gt;Accurate counting without abrupt resets.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Global Limits&lt;/td&gt;
&lt;td&gt;Redis + Lua&lt;/td&gt;
&lt;td&gt;Shared state across distributed instances.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you need to protect a specific user or IP, use the Sliding Window. If you want to smooth out traffic spikes for a shared resource, use the Leaky Bucket. For general API protection, Token Bucket is the default choice. The decision matrix ensures you match the algorithm to the specific problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;State is Key:&lt;/strong&gt; You must maintain state to track limits.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Atomicity:&lt;/strong&gt; Use Lua scripts for Redis to prevent race conditions.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Algorithm Choice:&lt;/strong&gt; Pick based on whether you want burst tolerance (Token) or smoothing (Leaky).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Complexity vs. Accuracy:&lt;/strong&gt; Sliding Window is most accurate but computationally heavier.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Persistence:&lt;/strong&gt; Use Redis for distributed setups to share state across instances.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Retry Logic:&lt;/strong&gt; Clients should retry with exponential backoff if rate limited.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Extend this by implementing adaptive rate limiting where the limit adjusts based on response times. You can monitor the tail latency of your backend and increase the limit if the error rate drops. This ensures you utilize available capacity while protecting against overload. You can also add custom headers to inform clients of their current quota.&lt;/p&gt;
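
&lt;p&gt;As a starting point, the adjustment policy could be a simple additive-increase/multiplicative-decrease loop; the thresholds below are assumptions for illustration, not recommendations:&lt;/p&gt;

```go
package main

import "fmt"

// adjustLimit is a toy AIMD policy: grow the limit additively while
// the backend is healthy, halve it when the error rate climbs.
func adjustLimit(limit int64, errorRate float64) int64 {
	switch {
	case errorRate > 0.05: // backend struggling: back off hard
		return limit / 2
	case errorRate < 0.01: // healthy: probe for more capacity
		return limit + 10
	default: // in between: hold steady
		return limit
	}
}

func main() {
	limit := int64(100)
	limit = adjustLimit(limit, 0.001) // healthy → 110
	limit = adjustLimit(limit, 0.10)  // degraded → 55
	fmt.Println(limit)
}
```
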

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;&lt;/strong&gt; — Explains the complexity of distributed state management.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;&lt;/strong&gt; — Discusses how to handle trade-offs between flexibility and performance.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>backend</category>
      <category>architecture</category>
      <category>performance</category>
    </item>
    <item>
      <title>Two-Phase Commit Demystified: When Distributed Transactions Are Unavoidable</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Sat, 11 Apr 2026 12:45:53 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/two-phase-commit-demystified-when-distributed-transactions-are-unavoidable-1678</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/two-phase-commit-demystified-when-distributed-transactions-are-unavoidable-1678</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;When a banking payment must update inventory in one database and capture funds in another, no single-database transaction can guarantee atomicity across both — that is exactly where two-phase commit becomes necessary.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are implementing a core logic layer for a high-value e-commerce platform where data integrity is paramount. This system must reserve inventory in one database and capture payment in another atomically. If the inventory is reserved but payment fails, refunding becomes a complex operational nightmare involving manual reconciliation and customer dissatisfaction. We need atomicity across distinct databases that cannot share a single local transaction, forcing us to rely on a distributed transaction protocol. A two-phase commit (2PC) ensures that the order is either processed fully or rolled back completely. This architecture assumes a strong consistency requirement, such as financial ledgers or critical reservation slots, justifying the latency cost of synchronous coordination between distributed nodes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Define the Transaction Coordinator
&lt;/h2&gt;

&lt;p&gt;The coordinator acts as the conductor of the distributed orchestra, managing the state of the transaction across all participants. It owns the global transaction ID (XID) and issues instructions to each resource. In our design, the coordinator is an internal RPC service written in Go, utilizing &lt;code&gt;context.Context&lt;/code&gt; for cancellation. This choice matters because goroutines and context cancellation naturally model the synchronous wait for participant readiness without spawning heavy threads. We define the &lt;code&gt;ResourceManager&lt;/code&gt; interface to abstract the underlying database drivers, ensuring loose coupling between the coordinator and specific storage engines.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;ResourceManager&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;Commit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
        &lt;span class="n"&gt;Rollback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2 — Request Prepared States
&lt;/h2&gt;

&lt;p&gt;The coordinator sends a &lt;code&gt;PREPARE&lt;/code&gt; message to each registered resource, asking if they can safely commit the changes requested. This phase involves the resources checking internal locks, network connectivity, and storage health. Resources must write their &lt;code&gt;vote&lt;/code&gt; to stable storage before responding to ensure they can survive a crash between the vote and the final commit decision. This step effectively pauses the transaction until every participant confirms its ability to participate, which introduces latency but guarantees that no partial updates will be applied.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Coordinator&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;prepareAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;ResourceManager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;responses&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prepareResponse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="n"&gt;ResourceManager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;responses&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;prepareResponse&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Status&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Err&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;responses&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Failed&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3 — Aggregate the Vote
&lt;/h2&gt;

&lt;p&gt;Once responses arrive, the coordinator aggregates the votes to determine the final decision. If any resource returns a negative status, the coordinator must abort the transaction immediately to preserve data integrity. Unlike quorum-based consensus, 2PC requires unanimity: every participant must vote to commit before the coordinator can proceed. This prevents the scenario where one node commits while another fails, which would leave half-applied updates and an inconsistent global state across the distributed environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Coordinator&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;decide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;votes&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;votes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;votes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"ABORT"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"one resource voted to abort"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4 — Finalize the Outcome
&lt;/h2&gt;

&lt;p&gt;After voting, the coordinator broadcasts a &lt;code&gt;COMMIT&lt;/code&gt; or &lt;code&gt;ROLLBACK&lt;/code&gt; instruction based on the aggregate result. In the commit case, resources apply their changes and release locks. The coordinator must persist this decision locally to an atomic log before notifying resources to prevent a state where a resource has committed but the coordinator lost its record, leading to an inconsistent global state upon restart. Conversely, if the vote was negative, the rollback is mandatory. This ensures no data is left in an intermediate state. The protocol requires two stable storage writes (vote and decision) and ensures that no resource moves forward unless the coordinator has a record of the vote.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance and Trade-offs
&lt;/h2&gt;

&lt;p&gt;Two-phase commit introduces performance overhead. The coordinator must wait for all participants before proceeding, which creates a bottleneck. In high-throughput e-commerce, this can delay order confirmation. However, the alternative is data inconsistency, which is unacceptable for financial systems. Modern implementations optimize 2PC with asynchronous networking or by batching commits, but the protocol remains fundamentally synchronous. It is crucial to monitor the latency of the prepare phase, because participants hold locks for the full duration of the protocol and can starve other transactions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Implementing a two-phase commit protocol is a complex but necessary strategy for systems requiring distributed atomicity. While it introduces latency and complexity, it provides the reliability needed for critical financial or inventory management applications. By carefully coordinating resources and handling failure states, we ensure that the ledger remains consistent across distributed nodes.&lt;/p&gt;

&lt;p&gt;Part of the &lt;strong&gt;Architecture Patterns&lt;/strong&gt; series.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>backend</category>
      <category>distributedsystems</category>
      <category>go</category>
    </item>
    <item>
      <title>Database Connection Pooling: What Every Backend Developer Should Know</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Sat, 11 Apr 2026 06:31:13 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/database-connection-pooling-what-every-backend-developer-should-know-1cb6</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/database-connection-pooling-what-every-backend-developer-should-know-1cb6</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Without a pool, every request forces a TCP handshake, database authentication, and context switch, turning latency spikes into a denial-of-service scenario.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are implementing a production-grade connection management pattern in Go. This scope focuses on configuring limits, handling lifecycle, and monitoring health to prevent resource exhaustion in a high-concurrency API service. We assume a standard PostgreSQL backend where drivers like &lt;code&gt;pgx&lt;/code&gt; are used to expose pool metrics.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Initialize the Pool at Startup
&lt;/h2&gt;

&lt;p&gt;Database connections should never be instantiated on-demand per request in a concurrent environment. Instead, the pool is initialized once when the application service starts, ensuring resources are pre-warmed. This approach avoids the latency of TCP handshakes for every incoming HTTP request.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sql&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"postgres"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"conn=..."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to open database: %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This specific choice matters because &lt;code&gt;sql.Open&lt;/code&gt; only returns a client handle; actual connection allocation happens lazily based on configured limits.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2 — Configure Limits and Timeouts
&lt;/h2&gt;

&lt;p&gt;Call &lt;code&gt;SetMaxOpenConns&lt;/code&gt; to cap the number of connections the driver will open and &lt;code&gt;SetConnMaxLifetime&lt;/code&gt; to ensure connections are recycled. Setting a timeout on idle connections prevents the pool from holding stale sockets indefinitely, which is critical when the network is unstable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetMaxOpenConns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetMaxIdleConns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;25&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetConnMaxLifetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Minute&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This specific choice matters because every open connection consumes memory and a backend process on the database server; an unbounded pool can exhaust the server's connection limit and file descriptors under load.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3 — Manage Idle Connection Recycling
&lt;/h2&gt;

&lt;p&gt;Idle connections are candidates for eviction if they sit too long without being used. The &lt;code&gt;SetMaxIdleConns&lt;/code&gt; value combined with &lt;code&gt;SetConnMaxIdleTime&lt;/code&gt; ensures that unused connections are closed automatically. This prevents the accumulation of "zombie" connections that consume file descriptors but provide no throughput.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetConnMaxIdleTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;30&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This specific choice matters because idle connections often become invalid due to firewall timeout settings or intermediate proxy resets, causing sudden failures when accessed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4 — Monitor Metrics for Latency
&lt;/h2&gt;

&lt;p&gt;Production systems require visibility into pool health. The &lt;code&gt;Stats()&lt;/code&gt; method returns a snapshot including open, in-use, and idle connection counts plus cumulative wait statistics. Check these metrics to detect whether requests are blocking while waiting for a free connection because the pool is saturated.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stats&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OpenConnections&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MaxOpenConnections&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Pool is full: %d open, %d waiting"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OpenConnections&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WaitCount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This specific choice matters because relying solely on HTTP response times means you only discover saturation after requests have already queued waiting for a connection; the pool stats surface the problem directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    +---------+      +---------+      +---------+
    |  Client | ----&amp;gt;|  Pool  | ----&amp;gt;|Database |
    +---------+      +---------+      +---------+
         |              |                 |
         | (Acquire)    | (Execute)       |
         v              v                 v
    [Wait Queue]  &amp;lt;- [Idle Connection] -&amp;gt; [Result]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Eager Initialization:&lt;/strong&gt; The pool should be created and pre-warmed at startup rather than built per request, minimizing latency during peak traffic.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Strict Limits:&lt;/strong&gt; Enforcing &lt;code&gt;MaxOpenConns&lt;/code&gt; prevents your application from consuming all available network sockets on the server, which could crash the database.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Recycle Logic:&lt;/strong&gt; Recycling idle connections based on time is crucial to handling network instability where long-lived sockets might be dropped by proxies.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Visibility:&lt;/strong&gt; Monitoring the pool stats allows you to react to saturation before it results in timeouts, rather than reacting to HTTP 503 errors after the fact.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;To further optimize the service, consider using the &lt;code&gt;pgx&lt;/code&gt; driver which provides &lt;code&gt;pgxpool&lt;/code&gt; for finer control over connection recycling. You could also implement circuit breakers to stop hitting the database if the pool metrics indicate persistent saturation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt; &lt;em&gt;&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;&lt;/em&gt; by Martin Kleppmann&lt;/li&gt;
&lt;li&gt; &lt;em&gt;&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;&lt;/em&gt; by John Ousterhout&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;strong&gt;Part of the Architecture Patterns series.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>backend</category>
      <category>database</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Building a TCP Server in Rust With tokio: From Accept Loop to Graceful Shutdown</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Fri, 10 Apr 2026 06:25:28 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/building-a-tcp-server-in-rust-with-tokio-from-accept-loop-to-graceful-shutdown-2bh2</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/building-a-tcp-server-in-rust-with-tokio-from-accept-loop-to-graceful-shutdown-2bh2</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Handling concurrent network connections requires managing lifecycles without blocking the event loop; how you do this dictates both throughput and fault isolation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are constructing a production-grade TCP server using the Tokio runtime. The scope includes binding a socket to a local interface, establishing a non-blocking accept loop, spawning isolated tasks for each connection, and implementing a graceful shutdown mechanism that listens for &lt;code&gt;SIGINT&lt;/code&gt; or &lt;code&gt;SIGTERM&lt;/code&gt;. The resulting architecture ensures that a single failing client does not crash the entire server and allows the system to drain pending requests before exiting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Initialize the Runtime and Listener
&lt;/h2&gt;

&lt;p&gt;Before accepting connections, you must boot the asynchronous runtime and configure the TCP socket. You bind the listener to a specific address to restrict traffic to the intended interface.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;net&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;TcpListener&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;#[tokio::main]&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nb"&gt;Box&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;listener&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;TcpListener&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"127.0.0.1:8080"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nd"&gt;println!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Listening on port 8080..."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;loop&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Accept loop logic follows&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using &lt;code&gt;tokio::net::TcpListener&lt;/code&gt; ensures the accept operation does not block the current thread, allowing the runtime to handle I/O multiplexing efficiently.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2 — The Non-Blocking Accept Loop
&lt;/h2&gt;

&lt;p&gt;The core logic involves polling the listener for incoming connections. You iterate indefinitely, pulling streams from the accept queue and handling transient errors, such as file-descriptor exhaustion, gracefully.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ... previous setup&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="nf"&gt;.accept&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;println!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"New connection from: {}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// Spawn connection handler&lt;/span&gt;
    &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Handle stream...&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that &lt;code&gt;while let Ok&lt;/code&gt; stops accepting on the first error; a production server would match on the result and continue after transient failures. Spawning each stream into its own task keeps the accept loop free to take the next connection immediately instead of stalling on a slow client.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3 — Isolate Connection Tasks
&lt;/h2&gt;

&lt;p&gt;Each incoming connection must be handled by its own task. This gives every connection an isolated failure domain: an error in one handler does not affect the others, and the runtime can schedule them concurrently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Process stream buffer&lt;/span&gt;
    &lt;span class="c1"&gt;// Drop `stream` when done to release resources&lt;/span&gt;
    &lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; 
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Spawning a task per connection ensures resource independence, which is critical for high-concurrency backends where a malicious or misbehaving client must not starve legitimate users.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4 — Implement Graceful Shutdown
&lt;/h2&gt;

&lt;p&gt;Stopping the server abruptly drops active sockets, potentially causing timeouts for clients. Graceful shutdown requires draining incoming requests and waiting for active handlers to finish.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;time&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Create a shutdown channel&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shutdown_tx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;shutdown_rx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;mpsc&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;loop&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nd"&gt;select!&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;listener&lt;/span&gt;&lt;span class="nf"&gt;.accept&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nd"&gt;println!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"New connection from: {}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Handle stream here&lt;/span&gt;
                &lt;span class="p"&gt;});&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shutdown_rx&lt;/span&gt;&lt;span class="nf"&gt;.recv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nd"&gt;println!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Shutting down..."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Wait for Ctrl-C, then signal shutdown&lt;/span&gt;
&lt;span class="nn"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;ctrl_c&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="nf"&gt;.expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to listen for ctrl_c"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shutdown_tx&lt;/span&gt;&lt;span class="nf"&gt;.send&lt;/span&gt;&lt;span class="p"&gt;(())&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This architecture uses &lt;code&gt;tokio::select!&lt;/code&gt; to race between accepting connections and receiving shutdown signals, ensuring the runtime waits for cleanup before terminating.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Async I/O&lt;/strong&gt;: Blocking calls inside the accept loop stall the event loop, so the non-blocking &lt;code&gt;tokio::net::TcpListener&lt;/code&gt; is essential.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Task Isolation&lt;/strong&gt;: Spawning one handler task per connection keeps failures contained, so a single misbehaving connection cannot cascade through the server.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Signal Handling&lt;/strong&gt;: Using &lt;code&gt;tokio::signal&lt;/code&gt; allows the application to listen for OS-level interrupts and trigger cleanup logic instead of hard exiting.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Resource Cleanup&lt;/strong&gt;: Dropping handles ensures the operating system reclaims file descriptors once the server stops accepting new connections.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Channel Synchronization&lt;/strong&gt;: Using channels to signal shutdown allows the main loop to exit cleanly without relying on external processes.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Error Propagation&lt;/strong&gt;: Handling &lt;code&gt;Result&lt;/code&gt; types ensures that binding failures or listener accept errors do not silently ignore critical infrastructure states.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;To harden this implementation, you should implement TLS encryption using &lt;code&gt;tokio-rustls&lt;/code&gt; to secure data in transit. Adding metrics via Prometheus or OpenTelemetry allows you to monitor active connection counts and response times in real time. Next, consider migrating to a framework like Actix-web to integrate this handler into a full API surface with routing support. Finally, implement circuit breaker patterns to prevent cascading failures if upstream dependencies become unstable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;&lt;/strong&gt; — Encapsulating connection handling logic helps maintain modularity and complexity boundaries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;&lt;/strong&gt; — The book covers stream processing patterns relevant to managing high-volume TCP connections.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://amzn.to/4sPlPDL" rel="noopener noreferrer"&gt;Learn Rust in a Month of Lunches (MacLeod)&lt;/a&gt;&lt;/strong&gt; — Provides practical examples for managing async runtimes and error handling in daily coding.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Part of a series on building resilient network services in Rust.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with ❤️ by your team.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rust</category>
      <category>backend</category>
      <category>networking</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Stream Processing vs Batch Processing: When to Use Each</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Thu, 09 Apr 2026 17:17:02 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/stream-processing-vs-batch-processing-when-to-use-each-3n69</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/stream-processing-vs-batch-processing-when-to-use-each-3n69</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Batch processing offers throughput and accuracy, while stream processing offers low latency. Your architecture must decide which metric survives production requirements.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are designing a financial transaction monitoring system. This system requires real-time fraud detection for immediate blocking but also daily reconciliation for regulatory compliance. We cannot choose one paradigm exclusively. We must structure our data pipelines to handle high-volume historical data efficiently while processing live events with low latency. The core trade-off involves computational resources, consistency models, and operational complexity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Define Latency Requirements
&lt;/h2&gt;

&lt;p&gt;The first step is distinguishing between micro-batch and real-time needs. Micro-batches are small chunks processed periodically. Real-time implies processing each record as it arrives. We define a latency threshold &lt;code&gt;T&lt;/code&gt;. If user experience demands T &amp;lt; 100ms, batch is insufficient. If the system tolerates T &amp;lt; 5 minutes, batch becomes viable for cost savings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Latency configuration struct
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dataclasses&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dataclass&lt;/span&gt;

&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProcessingConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;batch_interval_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;stream_latency_threshold_ms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;consistency_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ProcessingConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;batch_interval_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stream_latency_threshold_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;consistency_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;at_least_once&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Defining these constraints upfront prevents over-engineering the infrastructure with unnecessary state stores.&lt;/p&gt;
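&lt;p&gt;The thresholds from Step 1 can be turned into an explicit decision rule. The helper below is a hypothetical sketch; the function name and cutoffs are illustrative, mirroring the article's 100 ms and 5 minute examples, not any framework API:&lt;/p&gt;

```python
# Hypothetical helper: apply the latency threshold T from Step 1.
# The 100 ms and 5 minute cutoffs mirror the article's examples.
def choose_paradigm(required_latency_ms: int) -> str:
    if required_latency_ms < 100:
        return "stream"  # batch intervals cannot meet sub-100ms UX demands
    if required_latency_ms <= 5 * 60 * 1000:
        return "batch"  # periodic micro-batches are viable and cheaper
    return "batch"  # generous SLAs always favor batch throughput

print(choose_paradigm(50))      # fraud detection path
print(choose_paradigm(60_000))  # reconciliation path
```

&lt;p&gt;Encoding the rule in code makes the paradigm choice reviewable alongside the configuration rather than leaving it implicit in the infrastructure.&lt;/p&gt;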

&lt;h2&gt;
  
  
  Step 2 — Implement Batch Layer
&lt;/h2&gt;

&lt;p&gt;For the reconciliation task, we ingest terabytes of historical logs. Batch processing allows us to parallelize work across a cluster without worrying about message ordering for every single event. We use Apache Spark logic to aggregate data over fixed windows. This approach maximizes throughput by utilizing local disk storage and minimizing network overhead for heavy computation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudo-code for Spark Batch Job
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_daily_batch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;spark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parquet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;windowed_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;windowed_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parquet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;spark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createTable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;daily_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;windowed_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Batch processing is cost-effective for large datasets where immediate insight is not critical to business logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3 — Implement Stream Layer
&lt;/h2&gt;

&lt;p&gt;For fraud detection, we need a continuous consumer. We subscribe to a message broker like Apache Kafka. The stream processor maintains state in memory or in RocksDB to detect patterns within sliding windows. This requires handling backpressure and ensuring exactly-once semantics if financial integrity is non-negotiable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudo-code for Stream Consumer
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_stream_events&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;consumer_group&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;kafka_consumer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;risk_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;risk_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;risk_score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;alert_service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;notify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;sink&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alert_log&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stream processing introduces overhead but enables immediate intervention, critical for high-stakes environments like finance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4 — Manage Stateful Processing
&lt;/h2&gt;

&lt;p&gt;State management is the hidden cost of both paradigms. Batch layers often store state in Parquet files on object storage. Stream layers often require managed systems like Flink or Kafka Streams to retain checkpoints. Without managing state, you lose context required for windowing. A stateless stream consumer cannot aggregate totals or detect sequence anomalies across a session.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[BATCH] -&amp;gt; [SINK STORE] -&amp;gt; [LOAD ON DEMAND]
            |
[STREAM] &amp;lt;- [STATE STORE] &amp;lt;- [CHECKPOINT]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Separating state management from computation simplifies scaling and debugging during failures.&lt;/p&gt;
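&lt;p&gt;What that state actually looks like can be sketched in pure Python. This in-memory dictionary is a stand-in for a RocksDB or Flink state store (illustrative only; a real store adds checkpointing and eviction):&lt;/p&gt;

```python
from collections import defaultdict

# In-memory stand-in for a keyed state store: per-user totals
# inside tumbling windows, keyed by (user_id, window_id).
class WindowedTotals:
    def __init__(self, window_seconds: int):
        self.window_seconds = window_seconds
        self.state = defaultdict(float)

    def add(self, user_id: str, amount: float, event_ts: int) -> float:
        window_id = event_ts // self.window_seconds
        self.state[(user_id, window_id)] += amount
        return self.state[(user_id, window_id)]  # running window total

totals = WindowedTotals(window_seconds=60)
totals.add("u1", 10.0, event_ts=5)
print(totals.add("u1", 15.0, event_ts=30))  # same window: 25.0
print(totals.add("u1", 99.0, event_ts=65))  # next window starts fresh: 99.0
```

&lt;p&gt;A stateless consumer has no equivalent of &lt;code&gt;self.state&lt;/code&gt;, which is exactly why it cannot aggregate totals or detect sequence anomalies across a session.&lt;/p&gt;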

&lt;h2&gt;
  
  
  Step 5 — Operational Trade-offs
&lt;/h2&gt;

&lt;p&gt;Finally, consider the operational reality. Batch jobs are easier to pause and restart but introduce a cold start latency. Stream jobs require constant monitoring of consumer lag and state store health. Hybrid architectures must orchestrate both via a service bus. Overlapping state stores can lead to data duplication. You must design APIs that unify these views.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Hybrid orchestration logic
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;orchestrate_system&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;BATCH_INTERVAL&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;trigger_batch_job&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;continue_stream_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Balancing these requirements ensures robustness without sacrificing performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latency vs. Throughput&lt;/strong&gt; — Batch wins on volume and cost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency Models&lt;/strong&gt; — Batch offers strong consistency per run. Stream typically settles for eventual consistency unless you pay for exactly-once machinery.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State Management&lt;/strong&gt; — Batch uses durable storage. Stream uses in-memory checkpoints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Schemas&lt;/strong&gt; — Batch can revalidate the full schema on every run. Stream must tolerate schema evolution mid-flight.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failure Handling&lt;/strong&gt; — Batch recovers by re-running the job. Stream recovers by replaying the log from the last checkpoint.&lt;/li&gt;
&lt;/ul&gt;
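&lt;p&gt;The consistency bullet deserves one concrete illustration: under at-least-once delivery, tracking processed event IDs makes an aggregation idempotent, so replays do not corrupt totals (a minimal sketch, not a Kafka or Flink API):&lt;/p&gt;

```python
# At-least-once delivery may redeliver events; deduplicating on
# event IDs makes the aggregation effectively-once.
def apply_events(events, seen_ids, totals):
    for event in events:
        if event["id"] in seen_ids:  # replayed duplicate: skip
            continue
        seen_ids.add(event["id"])
        totals[event["user"]] = totals.get(event["user"], 0) + event["amount"]
    return totals

seen, totals = set(), {}
batch = [{"id": "e1", "user": "u1", "amount": 10}]
apply_events(batch, seen, totals)
apply_events(batch, seen, totals)  # redelivery changes nothing
print(totals)  # {'u1': 10}
```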

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Next, you might explore event-driven architectures using Apache Kafka and Apache Flink. You can look into optimizing micro-batch latency for near real-time needs. Consider whether a Lambda architecture or a Kappa architecture best fits your team's constraints.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt; — A philosophy of software design for data systems.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt; — Covers abstraction barriers in large codebases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Part of the &lt;strong&gt;Architecture Patterns&lt;/strong&gt; series.&lt;/p&gt;

</description>
      <category>distributed</category>
      <category>systems</category>
      <category>architecture</category>
      <category>backend</category>
    </item>
    <item>
      <title>Timeout Propagation: Why Your Deadlines Need to Flow Through the Entire Call Chain</title>
      <dc:creator>Dylan Dumont</dc:creator>
      <pubDate>Wed, 08 Apr 2026 07:04:12 +0000</pubDate>
      <link>https://forem.com/dylan_dumont_266378d98367/timeout-propagation-why-your-deadlines-need-to-flow-through-the-entire-call-chain-fkn</link>
      <guid>https://forem.com/dylan_dumont_266378d98367/timeout-propagation-why-your-deadlines-need-to-flow-through-the-entire-call-chain-fkn</guid>
      <description>&lt;p&gt;Timeout Propagation: Why Your Deadlines Need to Flow Through the Entire Call Chain&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Ignoring a timeout on an API call doesn't isolate the failure; it poisons the shared thread pool and starves concurrent requests.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;We are implementing a deadline-aware client in a Go microservice. Our goal is to ensure that when a service receives a request, it knows exactly how much time remains for the entire transaction, not just the current function. This involves calculating the time budget before entering the call chain and passing it explicitly to every downstream dependency.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 — Establish the Transaction Deadline
&lt;/h2&gt;

&lt;p&gt;Every handler must declare the maximum time allowed for the whole request lifecycle. This prevents downstream services from running indefinitely if the parent service is overwhelmed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;maxTransactionTime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;maxTransactionTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c"&gt;// Use ctx in subsequent logic&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Defining a global constant prevents ad-hoc timeout logic from appearing scattered across the codebase. It sets a hard boundary for the entire request scope.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2 — Calculate Remaining Window
&lt;/h2&gt;

&lt;p&gt;Before making an external call, compute how much of the transaction budget remains after local processing has consumed its share. You cannot reuse the full initial deadline for every nested call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;processPayment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;deadline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deadline&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deadline&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c"&gt;// Visualize the budget&lt;/span&gt;
        &lt;span class="c"&gt;//     [================] -&amp;gt; 5s (Total)&lt;/span&gt;
        &lt;span class="c"&gt;//        [======] -&amp;gt; 2s (Processing)&lt;/span&gt;
        &lt;span class="c"&gt;//     [============] -&amp;gt; 3s (Remaining)&lt;/span&gt;
        &lt;span class="n"&gt;externalCtx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c"&gt;// Call external API with 3s remaining&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;externalCtx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures that if the first call takes too long, the downstream API is not given a window larger than the original client allowed. It protects your system from cascading delays.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3 — Propagate Context Downstream
&lt;/h2&gt;

&lt;p&gt;When calling out to external libraries, you must pass the new context rather than passing the original background context. The HTTP client should accept the context to handle timeouts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;fetchUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRequestWithContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MethodGet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userURL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Timeout&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Millisecond&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c"&gt;// Fallback per-call limit&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Do&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="n"&gt;User&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewDecoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Passing &lt;code&gt;ctx&lt;/code&gt; ensures that if the transaction times out, the request is aborted immediately. Without this, the connection remains open, draining file descriptors and memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4 — Listen for Deadline Exceeded
&lt;/h2&gt;

&lt;p&gt;When the deadline passes, the context's &lt;code&gt;Done&lt;/code&gt; channel closes and &lt;code&gt;ctx.Err()&lt;/code&gt; returns a &lt;code&gt;context.DeadlineExceeded&lt;/code&gt; error. Handle this case explicitly so the failure surfaces as a structured error rather than an indefinite hang.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;After&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="c"&gt;// Normal execution path&lt;/span&gt;
&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DeadlineExceeded&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"deadline exceeded"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This logic allows the code to exit gracefully or return a specific HTTP 504 Gateway Timeout status. It converts an indefinite hang into a recoverable application error.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5 — Isolate Parallelism Budgets
&lt;/h2&gt;

&lt;p&gt;When launching goroutines or channels, you must ensure they do not consume the parent's entire budget. Split the remaining time among parallel tasks to prevent one slow child from starving others.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// If parent has 3s remaining, do not give all 3s to every child&lt;/span&gt;
&lt;span class="n"&gt;childCtx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1.5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Process in background&lt;/span&gt;
&lt;span class="p"&gt;}()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Allocating the full remaining budget to every parallel worker lets a single lagging worker consume the entire window. Splitting the budget ensures that even a slow task cannot block the whole request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Deadline Budgeting&lt;/strong&gt;: Calculate and subtract local processing time to prevent passing infinite time windows to downstream dependencies.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Cancellation Propagation&lt;/strong&gt;: Pass the context explicitly so that a timeout at the entry point aborts all nested operations immediately.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Resource Starvation&lt;/strong&gt;: Unbounded timeouts drain thread pools and file descriptors, leading to system-wide hangs rather than isolated errors.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Context Management&lt;/strong&gt;: Every external call must receive a context derived from the original, so the effective deadline shrinks as it propagates downstream.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Graceful Failure&lt;/strong&gt;: Catch &lt;code&gt;context.DeadlineExceeded&lt;/code&gt; to return meaningful HTTP 504 errors instead of opaque internal failures.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Parallel Safety&lt;/strong&gt;: Split time budgets for concurrent tasks to prevent a single slow worker from blocking the entire transaction.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Observe&lt;/strong&gt;: Expose metrics for how often &lt;code&gt;context.DeadlineExceeded&lt;/code&gt; fires. Use this to tune timeouts for different tiers.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Retry&lt;/strong&gt;: Implement backoff logic only for retryable errors. Treat timeouts as terminal errors unless you have a specific transient failure pattern.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Document&lt;/strong&gt;: Maintain a table of expected latency SLAs for your services. Use these numbers to set base timeouts in contracts.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://amzn.to/4saY8oe" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications (Kleppmann)&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://amzn.to/4m8wG9e" rel="noopener noreferrer"&gt;A Philosophy of Software Design (Ousterhout)&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  Computer Systems&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>backend</category>
      <category>architecture</category>
      <category>distributedsystems</category>
    </item>
  </channel>
</rss>
