<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Esteban Alvarez</title>
    <description>The latest articles on Forem by Esteban Alvarez (@etalazz).</description>
    <link>https://forem.com/etalazz</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3572714%2Fb0e5f5a6-a0f0-4bde-8ee4-8c27fc8f7955.jpg</url>
      <title>Forem: Esteban Alvarez</title>
      <link>https://forem.com/etalazz</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/etalazz"/>
    <language>en</language>
    <item>
      <title>Stop writing every request to your database: a tiny pattern that potentially saves 95–99% of writes</title>
      <dc:creator>Esteban Alvarez</dc:creator>
      <pubDate>Sat, 18 Oct 2025 15:33:48 +0000</pubDate>
      <link>https://forem.com/etalazz/stop-writing-every-request-to-your-database-a-tiny-pattern-that-potentially-saves-95-99-of-writes-5860</link>
      <guid>https://forem.com/etalazz/stop-writing-every-request-to-your-database-a-tiny-pattern-that-potentially-saves-95-99-of-writes-5860</guid>
      <description>&lt;p&gt;If you’ve ever implemented rate limiting, quotas, or high‑churn counters, you’ve probably felt the pain of per‑request writes to Redis/DB. It’s simple, but it gets slow and expensive fast.&lt;/p&gt;

&lt;p&gt;Here’s a small idea that fixes that: keep two numbers per key in memory — a stable &lt;code&gt;scalar&lt;/code&gt; (what’s already persisted) and a volatile &lt;code&gt;vector&lt;/code&gt; (net changes not yet persisted). Make decisions from both, and only persist when the vector crosses a threshold or at shutdown.&lt;/p&gt;

&lt;p&gt;I call this the Vector–Scalar Accumulator (VSA). It’s simple, fast, and easy to drop into real services.&lt;/p&gt;




&lt;h3&gt;
  
  
  The motivation (in one minute)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Most updates cancel out quickly (add/remove, like/unlike, reserve/release). Persisting every micro‑event wastes I/O.&lt;/li&gt;
&lt;li&gt;With VSA, you:

&lt;ul&gt;
&lt;li&gt;Decide admits/denies in nanoseconds from in‑memory state.&lt;/li&gt;
&lt;li&gt;Defer and batch persistence (e.g., every 50 net updates).&lt;/li&gt;
&lt;li&gt;Flush leftovers on graceful shutdown.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Result: thousands of requests turn into a handful of writes (a 95–99% reduction for thresholds between 20 and 100 net updates) without changing the correctness of the decision path.&lt;/p&gt;




&lt;h3&gt;
  
  
  The mental model
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;scalar (S)&lt;/code&gt;: the durable base (e.g., 1000 requests allowed).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;vector (V)&lt;/code&gt;: in‑memory net usage since the last commit.&lt;/li&gt;
&lt;li&gt;Availability: &lt;code&gt;Available = S - |V|&lt;/code&gt;.&lt;/li&gt;
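&lt;li&gt;When &lt;code&gt;|V| &amp;gt;= threshold&lt;/code&gt;, persist the net and apply &lt;code&gt;Commit(V)&lt;/code&gt;, which subtracts the persisted amount from both &lt;code&gt;S&lt;/code&gt; and &lt;code&gt;V&lt;/code&gt;. Availability is preserved, and &lt;code&gt;V&lt;/code&gt; returns to zero if no new updates arrived in the meantime.&lt;/li&gt;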
&lt;/ul&gt;
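
&lt;p&gt;To make the model concrete, here is a minimal sketch in Go. It assumes a mutex‑guarded struct and non‑negative usage (as in rate limiting); the type layout and field names are illustrative and may differ from the actual implementation in the repo.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "sync"
)

// VSA is an illustrative Vector-Scalar Accumulator (names are assumptions).
type VSA struct {
    mu     sync.Mutex
    scalar int64 // S: durable base
    vector int64 // V: net usage not yet persisted
}

// New creates an accumulator with S = s and V = 0.
func New(s int64) *VSA { return &amp;VSA{scalar: s} }

func abs(x int64) int64 {
    if x &lt; 0 {
        return -x
    }
    return x
}

// Available returns S - |V|.
func (v *VSA) Available() int64 {
    v.mu.Lock()
    defer v.mu.Unlock()
    return v.scalar - abs(v.vector)
}

// TryConsume checks availability and adds n to V under a single lock,
// so two callers cannot both take the last unit.
func (v *VSA) TryConsume(n int64) bool {
    v.mu.Lock()
    defer v.mu.Unlock()
    if v.scalar-abs(v.vector) &lt; n {
        return false
    }
    v.vector += n
    return true
}

// Commit folds an already-persisted amount into S and removes it from V.
func (v *VSA) Commit(vec int64) {
    v.mu.Lock()
    defer v.mu.Unlock()
    v.scalar -= vec
    v.vector -= vec
}

func main() {
    v := New(1000)
    for i := 0; i &lt; 50; i++ {
        v.TryConsume(1)
    }
    before := v.Available()
    v.Commit(50)
    fmt.Println(before, v.Available()) // prints: 950 950
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The two printed values are equal: committing moves usage from &lt;code&gt;V&lt;/code&gt; into &lt;code&gt;S&lt;/code&gt; without changing what callers see as available.&lt;/p&gt;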




&lt;h3&gt;
  
  
  Architecture at a glance
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart LR
  Client --&amp;gt; API[/HTTP /check?api_key=.../]
  API --&amp;gt; Store[Per-key VSA Store]
  Store --&amp;gt;|TryConsume(1)| API
  Store --&amp;gt; Worker
  Worker --&amp;gt;|commitLoop (threshold)| Persister
  Worker --&amp;gt;|evictionLoop| Persister
  Worker --&amp;gt;|final flush on Stop| Persister
  Persister --&amp;gt; DB[(Durable sink)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;API path is zero‑hop: no network I/O per request.&lt;/li&gt;
&lt;li&gt;Background worker does the slow stuff (batch commits, eviction, final flush).&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Request timeline (what actually happens)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sequenceDiagram
  participant U as User
  participant A as API
  participant V as VSA (S,V)
  participant W as Worker
  participant P as Persister

  U-&amp;gt;&amp;gt;A: GET /check?api_key=alice
  A-&amp;gt;&amp;gt;V: TryConsume(1)
  V--&amp;gt;&amp;gt;A: ok (remaining = S - |V|)
  A--&amp;gt;&amp;gt;U: 200 OK + X-RateLimit-Remaining

  Note over W: every commitInterval
  W-&amp;gt;&amp;gt;V: CheckCommit(threshold)
  alt |V| &amp;gt;= threshold
    W-&amp;gt;&amp;gt;P: CommitBatch(key, vector)
    P--&amp;gt;&amp;gt;W: ok
    W-&amp;gt;&amp;gt;V: Commit(vector) // preserves availability
  end

  Note over A,P: On shutdown: Worker runs final flush for any non-zero V
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Minimal code you can reason about
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1) Atomic, fair admission (no last‑token race)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Given: v := vsa.New(1000) // S=1000&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TryConsume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Deny (429): no tokens left&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Allow (200): remaining = v.Available()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;TryConsume(1)&lt;/code&gt; atomically checks &lt;code&gt;Available&lt;/code&gt; and increments the in‑memory &lt;code&gt;vector&lt;/code&gt;. Two concurrent calls cannot both grab the last token.&lt;/p&gt;

&lt;h4&gt;
  
  
  2) A tiny HTTP handler
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;handleCheckRateLimit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"api_key"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"API key is required"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;userVSA&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetOrCreate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;userVSA&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TryConsume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Header&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"X-RateLimit-Status"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Exceeded"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Header&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Retry-After"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"60"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Too Many Requests"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;429&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Header&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"X-RateLimit-Limit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rateLimit&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Header&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"X-RateLimit-Remaining"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userVSA&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Available&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Header&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"X-RateLimit-Status"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"OK"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WriteHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"OK"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  3) The worker that saves your I/O budget
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Worker&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;commitLoop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ticker&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewTicker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;commitInterval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;ticker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ticker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;runCommitCycle&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// commit any key with |vector| &amp;gt;= threshold&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stopChan&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;runFinalFlush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c"&gt;// commit any non-zero vector (remainders)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside &lt;code&gt;runCommitCycle()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;shouldCommit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vec&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vsa&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CheckCommit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;shouldCommit&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;persister&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CommitBatch&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="n"&gt;Commit&lt;/span&gt;&lt;span class="p"&gt;{{&lt;/span&gt;&lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Vector&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;}})&lt;/span&gt;
    &lt;span class="n"&gt;vsa&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Commit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// S := S - vec; V := V - vec (preserves availability)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
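
&lt;p&gt;The sequence diagram above applies &lt;code&gt;Commit&lt;/code&gt; only after the persister answers &lt;code&gt;ok&lt;/code&gt;. A hedged sketch of that ordering follows. It assumes &lt;code&gt;CommitBatch&lt;/code&gt; returns an &lt;code&gt;error&lt;/code&gt;; the &lt;code&gt;Persister&lt;/code&gt; interface, the &lt;code&gt;commitIfPersisted&lt;/code&gt; helper, and both test sinks are hypothetical and not taken from the repo.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "errors"
    "fmt"
)

// Commit is one key's net change to persist.
type Commit struct {
    Key    string
    Vector int64
}

// Persister is a hypothetical interface; the real one may differ.
type Persister interface {
    CommitBatch(batch []Commit) error
}

// failing simulates an unavailable sink.
type failing struct{}

func (failing) CommitBatch([]Commit) error { return errors.New("sink unavailable") }

// memory simulates a working sink by summing what it receives.
type memory struct{ total int64 }

func (m *memory) CommitBatch(batch []Commit) error {
    for _, c := range batch {
        m.total += c.Vector
    }
    return nil
}

// commitIfPersisted applies the in-memory commit only after the sink accepted the batch.
func commitIfPersisted(p Persister, key string, vec int64, apply func(int64)) error {
    if err := p.CommitBatch([]Commit{{Key: key, Vector: vec}}); err != nil {
        return err // leave V untouched so the next cycle can retry
    }
    apply(vec)
    return nil
}

func main() {
    applied := int64(0)
    apply := func(v int64) { applied += v }

    err := commitIfPersisted(failing{}, "alice-key", 50, apply)
    fmt.Println(err != nil, applied) // prints: true 0

    sink := &amp;memory{}
    _ = commitIfPersisted(sink, "alice-key", 50, apply)
    fmt.Println(sink.total, applied) // prints: 50 50
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the write fails, the vector stays in memory, so nothing is dropped and the next commit cycle sees the same (or a larger) value.&lt;/p&gt;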






&lt;h3&gt;
  
  
  What the logs look like (realistic demo)
&lt;/h3&gt;

&lt;p&gt;During load:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[2025-10-17T12:00:01-06:00] Persisting batch of 1 commits...
  - KEY: alice-key            VECTOR: 50
[2025-10-17T12:00:02-06:00] Persisting batch of 1 commits...
  - KEY: alice-key            VECTOR: 51
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On shutdown (graceful final flush):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Shutting down server...
Stopping background worker...
[2025-10-17T18:23:22-06:00] Persisting batch of 2 commits...
  - KEY: alice-key            VECTOR: 43
  - KEY: bob-key              VECTOR: 1
Server gracefully stopped.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: &lt;code&gt;bob-key&lt;/code&gt; didn’t reach the threshold during runtime, so it shows up in the final flush.&lt;/p&gt;




&lt;h3&gt;
  
  
  Why this is fast and fair
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Zero per‑request network I/O: decisions are in memory only.&lt;/li&gt;
&lt;li&gt;Admission‑invariant commits: moving &lt;code&gt;vector → scalar&lt;/code&gt; doesn’t change &lt;code&gt;Available&lt;/code&gt;, so you don’t get oscillation bugs near batch boundaries.&lt;/li&gt;
&lt;li&gt;Atomic last‑token: &lt;code&gt;TryConsume&lt;/code&gt; prevents double‑spend of the final unit under high concurrency.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Typical outcome with &lt;code&gt;commitThreshold = 50&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1001 requests (1000 admitted, 1 denied) → about 20 threshold commits during runtime, or a single final batch if the server shuts down before any commit cycle runs.&lt;/li&gt;
&lt;li&gt;That’s ~98% fewer writes compared to “write every request”.&lt;/li&gt;
&lt;/ul&gt;
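
&lt;p&gt;The arithmetic behind that figure can be checked with a small sketch. It assumes every admitted request is a net +1 (no cancelling updates) and one write per threshold crossing; &lt;code&gt;writeReduction&lt;/code&gt; is a hypothetical helper, not part of the repo.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import "fmt"

// writeReduction returns the number of commits and the fraction of writes
// avoided compared with one write per admitted request.
// admitted and threshold must both be positive.
func writeReduction(admitted, threshold int) (commits int, saved float64) {
    commits = admitted / threshold
    if admitted%threshold != 0 {
        commits++ // the leftover is written by the final flush
    }
    saved = 1 - float64(commits)/float64(admitted)
    return commits, saved
}

func main() {
    commits, saved := writeReduction(1000, 50)
    fmt.Printf("%d commits, %.0f%% fewer writes\n", commits, saved*100)
    // prints: 20 commits, 98% fewer writes
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Workloads with cancelling updates (like/unlike, reserve/release) should do better than this estimate, because the net vector grows more slowly than the request count.&lt;/p&gt;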




&lt;h3&gt;
  
  
  How to try it (end‑to‑end quick test)
&lt;/h3&gt;

&lt;p&gt;The project repo is at &lt;a href="https://github.com/etalazz/vsa" rel="noopener noreferrer"&gt;https://github.com/etalazz/vsa&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Start the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go run ./cmd/ratelimiter-api/main.go
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Drive traffic with the script (Bash/Git Bash/WSL):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./scripts/test_ratelimiter.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What you’ll see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Client hits &lt;code&gt;/check&lt;/code&gt; until Alice gets a &lt;code&gt;429&lt;/code&gt; on the 1001st request.&lt;/li&gt;
&lt;li&gt;Server prints periodic batched commits (&lt;code&gt;VECTOR: 50/51&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;On Ctrl+C, a final flush (e.g., &lt;code&gt;bob-key: 1&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Prefer manual poking?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s2"&gt;"http://localhost:8080/check?api_key=alice-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Trade‑offs (and easy mitigations)
&lt;/h3&gt;

&lt;ul&gt;
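&lt;li&gt;A crash before flush loses each key’s uncommitted vector: roughly up to &lt;code&gt;commitThreshold&lt;/code&gt; per key, plus whatever accumulates between a threshold crossing and the next commit tick.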

&lt;ul&gt;
&lt;li&gt;Mitigate by lowering the threshold, shortening the interval, or pairing with a write‑ahead log (Kafka/Redis Streams) for critical keys.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Single node by default.

&lt;ul&gt;
&lt;li&gt;For strict global limits across many nodes, add token leasing: lease chunks from a central &lt;code&gt;remaining&lt;/code&gt; counter and serve locally from VSA. You keep the zero‑hop hot path; coordination happens only on lease boundaries.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
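
&lt;p&gt;A rough sketch of the leasing idea, under stated assumptions: the in‑process &lt;code&gt;Central&lt;/code&gt; type stands in for the shared &lt;code&gt;remaining&lt;/code&gt; counter (in practice a database or Redis call), and &lt;code&gt;Node&lt;/code&gt; stands in for a per‑node VSA whose budget is the current lease. All names are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "sync"
)

// Central stands in for the shared remaining counter (hypothetical).
type Central struct {
    mu        sync.Mutex
    remaining int64
}

// Lease hands out up to n tokens and returns how many were granted.
func (c *Central) Lease(n int64) int64 {
    c.mu.Lock()
    defer c.mu.Unlock()
    if n &gt; c.remaining {
        n = c.remaining
    }
    c.remaining -= n
    return n
}

// Node serves requests from its local lease and refills only when it runs out.
type Node struct {
    mu      sync.Mutex
    central *Central
    local   int64 // tokens left in the current lease
    chunk   int64 // lease size requested from Central
}

// TryConsume admits one request from the local lease, leasing a new chunk if needed.
func (n *Node) TryConsume() bool {
    n.mu.Lock()
    defer n.mu.Unlock()
    if n.local == 0 {
        n.local = n.central.Lease(n.chunk) // the only coordination point
    }
    if n.local == 0 {
        return false
    }
    n.local--
    return true
}

func main() {
    c := &amp;Central{remaining: 1000}
    a := &amp;Node{central: c, chunk: 100}
    b := &amp;Node{central: c, chunk: 100}

    admitted := 0
    for i := 0; i &lt; 1200; i++ {
        node := a
        if i%2 == 1 {
            node = b
        }
        if node.TryConsume() {
            admitted++
        }
    }
    fmt.Println(admitted) // prints: 1000
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a chunk size of 100, the two nodes contact the central counter only about a dozen times for 1200 requests, and the global limit of 1000 is still enforced exactly. The cost is that tokens leased to an idle or crashed node are unavailable to others until they are returned or expire, which this sketch does not model.&lt;/p&gt;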




&lt;h3&gt;
  
  
  When to use this
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Rate limits, quotas, usage metering&lt;/li&gt;
&lt;li&gt;Like/unlike, view counters, telemetry aggregation&lt;/li&gt;
&lt;li&gt;Reservations: cart holds, connection pools, job slots&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Any place with commutative deltas and lots of short‑lived churn.&lt;/p&gt;




&lt;h3&gt;
  
  
  Wrap‑up
&lt;/h3&gt;

&lt;p&gt;The VSA pattern is tiny but powerful: keep two numbers per key, decide from both, and write only when it matters. You’ll cut write amplification dramatically, keep tail latency predictable, and preserve fairness under load.&lt;/p&gt;

&lt;p&gt;If you build backends in Go, this is one of those practical patterns you can implement in an afternoon and benefit from for years.&lt;/p&gt;

</description>
      <category>go</category>
      <category>ratelimiter</category>
    </item>
  </channel>
</rss>
