<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Harrison Guo</title>
    <description>The latest articles on Forem by Harrison Guo (@harrisonsec).</description>
    <link>https://forem.com/harrisonsec</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3809272%2F593698c5-7201-4bb0-898e-055cdbc0a2d2.png</url>
      <title>Forem: Harrison Guo</title>
      <link>https://forem.com/harrisonsec</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/harrisonsec"/>
    <language>en</language>
    <item>
      <title>Node Turns Waiting Into Events. Go Moves Context Switching Into User Space.</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Tue, 28 Apr 2026 18:01:12 +0000</pubDate>
      <link>https://forem.com/harrisonsec/node-turns-waiting-into-events-go-moves-context-switching-into-user-space-58ik</link>
      <guid>https://forem.com/harrisonsec/node-turns-waiting-into-events-go-moves-context-switching-into-user-space-58ik</guid>
      <description>&lt;p&gt;Most discussions of TypeScript/Node vs Go concurrency stop at the surface: &lt;em&gt;Node is async, Go is threaded.&lt;/em&gt; That framing isn't wrong — it just isn't deep enough to be useful when you're picking a runtime, debugging a tail-latency problem, or explaining to your team why one of the services keeps falling over under CPU load.&lt;/p&gt;

&lt;p&gt;The real difference is not async vs threaded. It's a question about where, in the system, suspended work lives — and what shape it takes when it's resumed.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — Both Node and Go refuse to let the CPU sit idle while a request waits on I/O. They disagree on the unit of scheduling. Node's unit is the &lt;em&gt;continuation&lt;/em&gt; — the tail of an async function captured as a heap closure. Go's unit is the &lt;em&gt;goroutine&lt;/em&gt; — a full call stack the runtime can suspend and resume in user space. That single decision cascades into every other property of each runtime.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Wrong Question
&lt;/h2&gt;

&lt;p&gt;"Async vs threaded" is the wrong frame because it makes you think the choice is between paradigms. It isn't. Both runtimes have already made the &lt;em&gt;same&lt;/em&gt; fundamental decision: do not block an OS thread waiting for slow external work. The interesting choice is &lt;em&gt;how&lt;/em&gt; they implement that.&lt;/p&gt;

&lt;p&gt;The actually useful question is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When a request is waiting for I/O — for a database, an HTTP call, a Redis round-trip, a file read — &lt;strong&gt;what does the CPU do, and where does the suspended state of that request live?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once you frame it that way, Node and Go aren't opposites. They're two answers to the same question — and each answer cascades into a different language shape, a different library style, and a different failure mode under load.&lt;/p&gt;

&lt;p&gt;The naive blocking model answers the question with "an OS thread waits for the syscall to return." That model collapses around a few thousand concurrent connections — memory per thread, scheduler overhead, kernel context-switch cost. By 40,000 connections you're out of RAM, not CPU. Node and Go both refuse to do this. They diverge on &lt;em&gt;which resource gets freed up&lt;/em&gt; and &lt;em&gt;how the suspended work is captured for later resumption.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Node's Answer: Turn Waiting Into an Event
&lt;/h2&gt;

&lt;p&gt;Node's model can be summarized in one line: &lt;strong&gt;the JS main thread only executes code that's already ready to run.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Look at this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It reads as if the function is paused, blocking on the database. It isn't. When V8 compiles an &lt;code&gt;async&lt;/code&gt; function, it effectively rewrites the body into a resumable state machine, with each &lt;code&gt;await&lt;/code&gt; becoming a suspension point.&lt;/p&gt;

&lt;p&gt;The function above gets transformed into something equivalent to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;asyncFn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;promise&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;closure&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;                  &lt;span class="c1"&gt;// heap object holding locals&lt;/span&gt;

    &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;switch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;     &lt;span class="c1"&gt;// await → register continuation&lt;/span&gt;
          &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                         &lt;span class="c1"&gt;// ← function POPS here&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="nx"&gt;closure&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;           &lt;span class="c1"&gt;// resume: locals live in closure&lt;/span&gt;
          &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;closure&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;promise&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things to notice:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;await&lt;/code&gt; is not a pause.&lt;/strong&gt; It's the point at which V8 returns from the function and pops the JS stack frame. The "rest of the function" is captured as a continuation registered on the awaited Promise via &lt;code&gt;.then&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local variables move to the heap.&lt;/strong&gt; Because the stack frame is gone, locals (&lt;code&gt;user&lt;/code&gt; here) live in a heap closure, accessible only when the state machine resumes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Each &lt;code&gt;await&lt;/code&gt; slices the function into another state.&lt;/strong&gt; A function with two &lt;code&gt;await&lt;/code&gt;s runs in three event-loop turns, with three independently-pushed JS frames, with all live state stored in heap closures between them.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That third point is the most non-obvious. A single &lt;code&gt;async&lt;/code&gt; function is &lt;strong&gt;not&lt;/strong&gt; one unit of execution — it's a sequence of fresh frames separated by event-loop turns:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2Fc2VxdWVuY2VEaWFncmFtCiAgICBhdXRvbnVtYmVyCiAgICBwYXJ0aWNpcGFudCBFTCBhcyBFdmVudCBMb29wIChsaWJ1dikKICAgIHBhcnRpY2lwYW50IEpTIGFzIEpTIE1haW4gVGhyZWFkIChWOCkKICAgIHBhcnRpY2lwYW50IEggYXMgSGVhcCAoY2xvc3VyZXMpCiAgICBwYXJ0aWNpcGFudCBLIGFzIEtlcm5lbCAvIEkvTwoKICAgIHJlY3QgcmdiKDI1NCwgMjQzLCAxOTkpCiAgICBOb3RlIG92ZXIgRUwsSzogVHVybiAxCiAgICBFTC0-PkpTOiBkaXNwYXRjaCBoYW5kbGVyKCkKICAgIGFjdGl2YXRlIEpTCiAgICBOb3RlIG92ZXIgSlM6IGNvbnN0IGEgPSAxCiAgICBKUy0-PkpTOiBjYWxsIGNvbXB1dGUxKCkg4oaSIHJldHVybnMgUHJvbWlzZQogICAgSlMtPj5IOiBWOCBzdG9yZXMgY2xvc3VyZSB7c3RhdGU6MSwgYX0KICAgIEpTLT4-SDogcmVnaXN0ZXIgc3RlcCBhcyAudGhlbiBoYW5kbGVyCiAgICBKUy0tPj5FTDogaGFuZGxlciBmcmFtZSBQT1BQRUQsIHJldHVybnMgUHJvbWlzZQogICAgZGVhY3RpdmF0ZSBKUwogICAgZW5kCgogICAgRUwtPj5LOiBlcG9sbF93YWl0IChubyBtaWNyb3Rhc2tzKQogICAgTm90ZSBvdmVyIEVMLEs6IC4uLiB0aW1lIHBhc3NlcywgT1MgdGhyZWFkIHBhcmtlZCAuLi4KICAgIEstLT4-RUw6IEkvTyByZWFkeSAoY29tcHV0ZTEgcmVzb2x2ZWQpCiAgICBFTC0-PkVMOiBlbnF1ZXVlIHN0ZXAgaW4gVjggbWljcm90YXNrIHF1ZXVlCgogICAgcmVjdCByZ2IoMjE5LCAyMzQsIDI1NCkKICAgIE5vdGUgb3ZlciBFTCxLOiBUdXJuIDIKICAgIEVMLT4-SlM6IGludm9rZSBzdGVwKHZhbHVlKSDigJQgTkVXIGZyYW1lCiAgICBhY3RpdmF0ZSBKUwogICAgSlMtPj5IOiBsb2FkIGNsb3N1cmUge3N0YXRlOjEsIGF9CiAgICBOb3RlIG92ZXIgSlM6IHggPSB2YWx1ZSwgc3RhdGUg4oaSIDIKICAgIEpTLT4-SlM6IGNhbGwgY29tcHV0ZTIoKSDihpIgcmV0dXJucyBQcm9taXNlCiAgICBKUy0-Pkg6IHJlZ2lzdGVyIHN0ZXAgKG5leHQgc3RhdGUpCiAgICBKUy0tPj5FTDogZnJhbWUgUE9QUEVEIGFnYWluCiAgICBkZWFjdGl2YXRlIEpTCiAgICBlbmQKCiAgICBLLS0-PkVMOiBjb21wdXRlMiByZXNvbHZlZAogICAgRUwtPj5FTDogZW5xdWV1ZSBzdGVwCgogICAgcmVjdCByZ2IoMjIwLCAyNTIsIDIzMSkKICAgIE5vdGUgb3ZlciBFTCxLOiBUdXJuIDMKICAgIEVMLT4-SlM6IGludm9rZSBzdGVwKHZhbHVlKSDigJQgeWV0IGFub3RoZXIgbmV3IGZyYW1lCiAgICBhY3RpdmF0ZSBKUwogICAgSlMtPj5IOiBsb2FkIGNsb3N1cmUge3N0YXRlOjIsIGEsIHh9CiAgICBOb3RlIG92ZXIgSlM6IHkgPSB2YWx1ZSwgc3RhdGUg4oaSIGRvbmUKICAgIEpTLT4-SlM6IHJlcy5qc29uKGEgKyB4ICsgeSkKICAgIEpTLS0-PkVMOiBoYW5kbGVyJ3MgUHJvbWlzZSByZXNvbHZlZAogICAgZGVhY3RpdmF0ZSBKUwogICAgZW5k" class="article-body-image-wrapper"&gt;&lt;img 
src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2Fc2VxdWVuY2VEaWFncmFtCiAgICBhdXRvbnVtYmVyCiAgICBwYXJ0aWNpcGFudCBFTCBhcyBFdmVudCBMb29wIChsaWJ1dikKICAgIHBhcnRpY2lwYW50IEpTIGFzIEpTIE1haW4gVGhyZWFkIChWOCkKICAgIHBhcnRpY2lwYW50IEggYXMgSGVhcCAoY2xvc3VyZXMpCiAgICBwYXJ0aWNpcGFudCBLIGFzIEtlcm5lbCAvIEkvTwoKICAgIHJlY3QgcmdiKDI1NCwgMjQzLCAxOTkpCiAgICBOb3RlIG92ZXIgRUwsSzogVHVybiAxCiAgICBFTC0-PkpTOiBkaXNwYXRjaCBoYW5kbGVyKCkKICAgIGFjdGl2YXRlIEpTCiAgICBOb3RlIG92ZXIgSlM6IGNvbnN0IGEgPSAxCiAgICBKUy0-PkpTOiBjYWxsIGNvbXB1dGUxKCkg4oaSIHJldHVybnMgUHJvbWlzZQogICAgSlMtPj5IOiBWOCBzdG9yZXMgY2xvc3VyZSB7c3RhdGU6MSwgYX0KICAgIEpTLT4-SDogcmVnaXN0ZXIgc3RlcCBhcyAudGhlbiBoYW5kbGVyCiAgICBKUy0tPj5FTDogaGFuZGxlciBmcmFtZSBQT1BQRUQsIHJldHVybnMgUHJvbWlzZQogICAgZGVhY3RpdmF0ZSBKUwogICAgZW5kCgogICAgRUwtPj5LOiBlcG9sbF93YWl0IChubyBtaWNyb3Rhc2tzKQogICAgTm90ZSBvdmVyIEVMLEs6IC4uLiB0aW1lIHBhc3NlcywgT1MgdGhyZWFkIHBhcmtlZCAuLi4KICAgIEstLT4-RUw6IEkvTyByZWFkeSAoY29tcHV0ZTEgcmVzb2x2ZWQpCiAgICBFTC0-PkVMOiBlbnF1ZXVlIHN0ZXAgaW4gVjggbWljcm90YXNrIHF1ZXVlCgogICAgcmVjdCByZ2IoMjE5LCAyMzQsIDI1NCkKICAgIE5vdGUgb3ZlciBFTCxLOiBUdXJuIDIKICAgIEVMLT4-SlM6IGludm9rZSBzdGVwKHZhbHVlKSDigJQgTkVXIGZyYW1lCiAgICBhY3RpdmF0ZSBKUwogICAgSlMtPj5IOiBsb2FkIGNsb3N1cmUge3N0YXRlOjEsIGF9CiAgICBOb3RlIG92ZXIgSlM6IHggPSB2YWx1ZSwgc3RhdGUg4oaSIDIKICAgIEpTLT4-SlM6IGNhbGwgY29tcHV0ZTIoKSDihpIgcmV0dXJucyBQcm9taXNlCiAgICBKUy0-Pkg6IHJlZ2lzdGVyIHN0ZXAgKG5leHQgc3RhdGUpCiAgICBKUy0tPj5FTDogZnJhbWUgUE9QUEVEIGFnYWluCiAgICBkZWFjdGl2YXRlIEpTCiAgICBlbmQKCiAgICBLLS0-PkVMOiBjb21wdXRlMiByZXNvbHZlZAogICAgRUwtPj5FTDogZW5xdWV1ZSBzdGVwCgogICAgcmVjdCByZ2IoMjIwLCAyNTIsIDIzMSkKICAgIE5vdGUgb3ZlciBFTCxLOiBUdXJuIDMKICAgIEVMLT4-SlM6IGludm9rZSBzdGVwKHZhbHVlKSDigJQgeWV0IGFub3RoZXIgbmV3IGZyYW1lCiAgICBhY3RpdmF0ZSBKUwogICAgSlMtPj5IOiBsb2FkIGNsb3N1cmUge3N0YXRlOjIsIGEsIHh9CiAgICBOb3RlIG92ZXIgSlM6IHkgPSB2YWx1ZSwgc3RhdGUg4oaSIGRvbmUKICAgIEpTLT4-SlM6IHJlcy5qc29uKGEgKyB4ICsgeSkKICAgIEpTLS0-PkVMOiBoYW5kbGVyJ3MgUHJvbWlzZSByZXNvbHZlZAogICAgZGVhY3RpdmF0ZSBKUwogICAgZW5k" alt="sequenceDiagram" width="1103" height="1604"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There is no "paused" function. There are only &lt;em&gt;captured continuations&lt;/em&gt; and &lt;em&gt;fresh frames that resume them&lt;/em&gt;. The event loop is the dispatcher: it watches for I/O readiness via libuv, for resolved Promises (via V8's microtask queue), for timers — and pulls the corresponding continuation onto the JS thread when it's ready to run. One thread can manage tens of thousands of concurrent connections, because at any moment only a handful of them have work that's actually ready.&lt;/p&gt;

&lt;p&gt;This is event-driven concurrency in its precise sense — the runtime turns "waiting" into a registered event, and only resumes the captured continuation when the event fires.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Visible Side Effect: Function Color
&lt;/h3&gt;

&lt;p&gt;Because the suspension point has to be marked at compile time, async-ness becomes part of the function's &lt;em&gt;type&lt;/em&gt;. A function that does I/O returns &lt;code&gt;Promise&amp;lt;T&amp;gt;&lt;/code&gt;. Its callers must &lt;code&gt;await&lt;/code&gt; it. Once they &lt;code&gt;await&lt;/code&gt;, they themselves return &lt;code&gt;Promise&amp;lt;T&amp;gt;&lt;/code&gt;. The "color" propagates up the call stack until you hit an async-aware entry point — typically the top of an HTTP handler or the event loop itself.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/" rel="noopener noreferrer"&gt;Bob Nystrom named this the function color problem&lt;/a&gt; in 2015. It's not a notation choice — it's a &lt;strong&gt;logical consequence of the stackless coroutine model&lt;/strong&gt;. V8 cannot save and restore arbitrary JS call stacks. The only way to express suspension is "return a Promise and be marked &lt;code&gt;async&lt;/code&gt;," and once one function does that, every function on the way up has to do the same.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBOb2RlWyI8Yj5Ob2RlIOKAlCBDb2xvciBDYXNjYWRlcyBVcCB0aGUgQ2FsbCBTdGFjazwvYj4iXQogICAgICAgIGRpcmVjdGlvbiBUQgogICAgICAgIG4xWyI8Yj5yZWFkRnJvbURCKCk8L2I-IPCfn6U8YnIvPuKGkiBQcm9taXNlJmx0O0RhdGEmZ3Q7PGJyLz48Yj5kb2VzIEkvTzwvYj4iXQogICAgICAgIG4yWyI8Yj5mZXRjaFVzZXIoKTwvYj4g8J-fpTxici8-4oaSIFByb21pc2UmbHQ7VXNlciZndDs8YnIvPjxiPm11c3QgYXdhaXQgcmVhZEZyb21EQjwvYj4iXQogICAgICAgIG4zWyI8Yj5oYW5kbGVSZXF1ZXN0KCk8L2I-IPCfn6U8YnIvPuKGkiBQcm9taXNlJmx0O1Jlc3BvbnNlJmd0Ozxici8-PGI-bXVzdCBhd2FpdCBmZXRjaFVzZXI8L2I-Il0KICAgICAgICBuNFsiPGI-cm91dGUoJy91c2VyJywgaGFuZGxlcik8L2I-IPCfn6U8YnIvPjxiPm11c3QgYWNjZXB0IFByb21pc2UgcmV0dXJuPC9iPiJdCiAgICAgICAgbjVbIjxiPm1haW4oKTwvYj4g8J-fpTxici8-4oaSIFByb21pc2UmbHQ7dm9pZCZndDs8YnIvPjxiPnRvcC1sZXZlbCBuZWVkcyBhd2FpdDwvYj4iXQogICAgICAgIG4xIC0uY29sb3IgaW5mZWN0cy4tPiBuMgogICAgICAgIG4yIC0uY29sb3IgaW5mZWN0cy4tPiBuMwogICAgICAgIG4zIC0uY29sb3IgaW5mZWN0cy4tPiBuNAogICAgICAgIG40IC0uY29sb3IgaW5mZWN0cy4tPiBuNQogICAgZW5kCgogICAgc3ViZ3JhcGggR29bIjxiPkdvIOKAlCBObyBDb2xvciwgTm8gQ2FzY2FkZTwvYj4iXQogICAgICAgIGRpcmVjdGlvbiBUQgogICAgICAgIGcxWyI8Yj5yZWFkRnJvbURCKCk8L2I-IOKsnDxici8-4oaSIERhdGE8YnIvPjxiPmJsb2NrcyBvbiBJL08gaW50ZXJuYWxseTwvYj4iXQogICAgICAgIGcyWyI8Yj5mZXRjaFVzZXIoKTwvYj4g4qycPGJyLz7ihpIgVXNlcjxici8-PGI-cGxhaW4gY2FsbDwvYj4iXQogICAgICAgIGczWyI8Yj5oYW5kbGVSZXF1ZXN0KCk8L2I-IOKsnDxici8-4oaSIFJlc3BvbnNlPGJyLz48Yj5wbGFpbiBjYWxsPC9iPiJdCiAgICAgICAgZzRbIjxiPnJvdXRlKCcvdXNlcicsIGhhbmRsZXIpPC9iPiDirJw8YnIvPjxiPmhhbmRsZXIgaXMgYSBwbGFpbiBmdW5jPC9iPiJdCiAgICAgICAgZzVbIjxiPm1haW4oKTwvYj4g4qycPGJyLz48Yj5wbGFpbiBmdW5jPC9iPiJdCiAgICAgICAgZzEgLS0-IGcyCiAgICAgICAgZzIgLS0-IGczCiAgICAgICAgZzMgLS0-IGc0CiAgICAgICAgZzQgLS0-IGc1CiAgICBlbmQKCiAgICBOb2RlIH5-fiBHbwoKICAgIGNsYXNzRGVmIHJlZENsYXNzIGZpbGw6I2ZlZTJlMixzdHJva2U6I2RjMjYyNixzdHJva2Utd2lkdGg6MnB4LGNvbG9yOiM3ZjFkMWQKICAgIGNsYXNzRGVmIHBsYWluQ2xhc3MgZmlsbDojZjNmNGY2LHN0cm9rZTojNmI3MjgwLHN0cm9rZS13aWR0aDoycHgsY29sb3I6IzExMTgyNwoKICAgIGNsYXNzIG4xLG4yLG4zLG40LG41IHJlZENsYXNzCiAgICBjbGFzcyBnMSxnMixnMyxnNCxnNSBwbGFpbkNsYXNz" class="article-body-image-wrapper"&gt;&lt;img 
src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBOb2RlWyI8Yj5Ob2RlIOKAlCBDb2xvciBDYXNjYWRlcyBVcCB0aGUgQ2FsbCBTdGFjazwvYj4iXQogICAgICAgIGRpcmVjdGlvbiBUQgogICAgICAgIG4xWyI8Yj5yZWFkRnJvbURCKCk8L2I-IPCfn6U8YnIvPuKGkiBQcm9taXNlJmx0O0RhdGEmZ3Q7PGJyLz48Yj5kb2VzIEkvTzwvYj4iXQogICAgICAgIG4yWyI8Yj5mZXRjaFVzZXIoKTwvYj4g8J-fpTxici8-4oaSIFByb21pc2UmbHQ7VXNlciZndDs8YnIvPjxiPm11c3QgYXdhaXQgcmVhZEZyb21EQjwvYj4iXQogICAgICAgIG4zWyI8Yj5oYW5kbGVSZXF1ZXN0KCk8L2I-IPCfn6U8YnIvPuKGkiBQcm9taXNlJmx0O1Jlc3BvbnNlJmd0Ozxici8-PGI-bXVzdCBhd2FpdCBmZXRjaFVzZXI8L2I-Il0KICAgICAgICBuNFsiPGI-cm91dGUoJy91c2VyJywgaGFuZGxlcik8L2I-IPCfn6U8YnIvPjxiPm11c3QgYWNjZXB0IFByb21pc2UgcmV0dXJuPC9iPiJdCiAgICAgICAgbjVbIjxiPm1haW4oKTwvYj4g8J-fpTxici8-4oaSIFByb21pc2UmbHQ7dm9pZCZndDs8YnIvPjxiPnRvcC1sZXZlbCBuZWVkcyBhd2FpdDwvYj4iXQogICAgICAgIG4xIC0uY29sb3IgaW5mZWN0cy4tPiBuMgogICAgICAgIG4yIC0uY29sb3IgaW5mZWN0cy4tPiBuMwogICAgICAgIG4zIC0uY29sb3IgaW5mZWN0cy4tPiBuNAogICAgICAgIG40IC0uY29sb3IgaW5mZWN0cy4tPiBuNQogICAgZW5kCgogICAgc3ViZ3JhcGggR29bIjxiPkdvIOKAlCBObyBDb2xvciwgTm8gQ2FzY2FkZTwvYj4iXQogICAgICAgIGRpcmVjdGlvbiBUQgogICAgICAgIGcxWyI8Yj5yZWFkRnJvbURCKCk8L2I-IOKsnDxici8-4oaSIERhdGE8YnIvPjxiPmJsb2NrcyBvbiBJL08gaW50ZXJuYWxseTwvYj4iXQogICAgICAgIGcyWyI8Yj5mZXRjaFVzZXIoKTwvYj4g4qycPGJyLz7ihpIgVXNlcjxici8-PGI-cGxhaW4gY2FsbDwvYj4iXQogICAgICAgIGczWyI8Yj5oYW5kbGVSZXF1ZXN0KCk8L2I-IOKsnDxici8-4oaSIFJlc3BvbnNlPGJyLz48Yj5wbGFpbiBjYWxsPC9iPiJdCiAgICAgICAgZzRbIjxiPnJvdXRlKCcvdXNlcicsIGhhbmRsZXIpPC9iPiDirJw8YnIvPjxiPmhhbmRsZXIgaXMgYSBwbGFpbiBmdW5jPC9iPiJdCiAgICAgICAgZzVbIjxiPm1haW4oKTwvYj4g4qycPGJyLz48Yj5wbGFpbiBmdW5jPC9iPiJdCiAgICAgICAgZzEgLS0-IGcyCiAgICAgICAgZzIgLS0-IGczCiAgICAgICAgZzMgLS0-IGc0CiAgICAgICAgZzQgLS0-IGc1CiAgICBlbmQKCiAgICBOb2RlIH5-fiBHbwoKICAgIGNsYXNzRGVmIHJlZENsYXNzIGZpbGw6I2ZlZTJlMixzdHJva2U6I2RjMjYyNixzdHJva2Utd2lkdGg6MnB4LGNvbG9yOiM3ZjFkMWQKICAgIGNsYXNzRGVmIHBsYWluQ2xhc3MgZmlsbDojZjNmNGY2LHN0cm9rZTojNmI3MjgwLHN0cm9rZS13aWR0aDoycHgsY29sb3I6IzExMTgyNwoKICAgIGNsYXNzIG4xLG4yLG4zLG40LG41IHJlZENsYXNzCiAgICBjbGFzcyBnMSxnMixnMyxnNCxnNSBwbGFpbkNsYXNz" alt="flowchart LR" width="714" height="997"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Hard Limit
&lt;/h3&gt;

&lt;p&gt;The model fails the moment your code stops waiting. A single CPU-bound operation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* heavy work */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…holds the JS main thread, and &lt;em&gt;every other request on this process is dead&lt;/em&gt; until it returns. The event loop has nowhere else to go. Worker threads, child processes, or splitting CPU work into a separate service are real fixes, but they're escape hatches — they exist because the core model gives you exactly one thread executing JS.&lt;/p&gt;




&lt;h2&gt;
  
  
  Go's Answer: Move Context Switching Into User Space
&lt;/h2&gt;

&lt;p&gt;In Go, you write synchronous code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sendResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is no &lt;code&gt;await&lt;/code&gt;. There is no callback. The function looks like it blocks on the database. And yet the program scales to hundreds of thousands of concurrent operations on modest hardware.&lt;/p&gt;

&lt;p&gt;The trick is that the &lt;em&gt;scheduling boundary has been moved.&lt;/em&gt; Where Node has the programmer mark the suspension point with &lt;code&gt;await&lt;/code&gt; and the runtime captures a continuation, Go lets the programmer write straight-line code and has the &lt;em&gt;runtime&lt;/em&gt; suspend the entire goroutine when it hits a blocking I/O call.&lt;/p&gt;

&lt;p&gt;This is the central insight, and the cleanest one-line statement of Go's concurrency model:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Go's essence is the user-space-ification of context switching.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A goroutine isn't an OS thread. It's a small (initially 2 KB) growable stack and a register snapshot, managed by the Go runtime. The runtime maps a large number of goroutines (G) onto a small number of OS threads (M) using scheduling contexts (P). This is the GMP model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;G&lt;/strong&gt; — a goroutine. The unit of scheduling. Cheap to create, cheap to suspend.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;M&lt;/strong&gt; — an OS thread. Only a handful actively run Go code; extra Ms may exist parked in blocking syscalls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;P&lt;/strong&gt; — a scheduling context ("processor"). Exactly &lt;code&gt;GOMAXPROCS&lt;/code&gt; of them, each with its own run queue of Gs; an M must hold a P to run Go code.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;many G  →  Go scheduler  →  few M  →  CPU cores
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a goroutine hits a blocking syscall or a channel wait, the Go runtime suspends the goroutine — saves its stack and registers — detaches it from the current M, and schedules another runnable goroutine onto that M. When the original goroutine's wait completes, it's marked runnable again, and some M eventually picks it up and resumes execution from the suspension point. &lt;strong&gt;None of this enters the kernel.&lt;/strong&gt; No &lt;code&gt;clone(2)&lt;/code&gt;, no kernel-mediated thread switch, no kernel scheduler queue. The bookkeeping is all in user space.&lt;/p&gt;

&lt;p&gt;That's the user-space-ification. The CPU still has to switch contexts when work shifts between goroutines, but the cost is roughly a function call plus a stack swap — not a kernel-mediated thread switch.&lt;/p&gt;
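
&lt;p&gt;To make the cost side of this concrete, here's a rough sketch you can run: park a large number of goroutines and see how much runtime memory each suspended goroutine costs. The exact figure varies by Go version and platform; expect a few kilobytes each, dominated by the initial stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "runtime"
    "sync"
)

// Park 100k goroutines on a channel and measure how much memory the runtime
// asked the OS for. The interesting number is the order of magnitude:
// a few KB per suspended goroutine, versus the ~8 MB of stack a default
// pthread reserves.
func main() {
    const n = 100_000
    block := make(chan struct{})
    var started sync.WaitGroup

    var before runtime.MemStats
    runtime.GC()
    runtime.ReadMemStats(&amp;amp;before)

    started.Add(n)
    for i := 0; i &amp;lt; n; i++ {
        go func() {
            started.Done()
            &amp;lt;-block // park: the runtime saves this goroutine's stack and registers
        }()
    }
    started.Wait() // every goroutine exists and is parked

    var after runtime.MemStats
    runtime.GC()
    runtime.ReadMemStats(&amp;amp;after)

    fmt.Printf("%d parked goroutines, ~%d bytes each\n", n, (after.Sys-before.Sys)/n)
    close(block)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;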

&lt;p&gt;The key contrast with Node's model is in &lt;em&gt;where the suspended state lives:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBOb2RlWyI8Yj5Ob2RlIOKAlCBTdGFja2xlc3MgQ29yb3V0aW5lPC9iPiJdCiAgICAgICAgZGlyZWN0aW9uIFRCCiAgICAgICAgblN0YWNrWyI8Yj5KUyBDYWxsIFN0YWNrPC9iPjxici8-KG9uZSBmcmFtZSBhdCBhIHRpbWUpPGJyLz7ilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIE8YnIvPuKaoCA8Yj5jdXJyZW50bHkgZW1wdHk8L2I-PGJyLz4oYWxsIGFzeW5jIGZucyBwb3BwZWQsPGJyLz53YWl0aW5nIGluIGV2ZW50IGxvb3ApIl0KICAgICAgICBuSGVhcFsiPGI-SGVhcDwvYj4iXQogICAgICAgIG5DMVsiPGI-Y29udGludWF0aW9uICMxPC9iPjxici8-eyBzdGF0ZTogMSw8YnIvPiZuYnNwOyZuYnNwO2xvY2Fsczoge3JlcSwgcmVzLCBhfSw8YnIvPiZuYnNwOyZuYnNwO3N0ZXA6IGZuIHB0ciB9Il0KICAgICAgICBuQzJbIjxiPmNvbnRpbnVhdGlvbiAjMjwvYj48YnIvPnsgc3RhdGU6IDAsIC4uLiB9Il0KICAgICAgICBuQzNbIjxiPmNvbnRpbnVhdGlvbiAjMzwvYj48YnIvPnsgc3RhdGU6IDIsIC4uLiB9Il0KICAgICAgICBuSGVhcCAtLT4gbkMxCiAgICAgICAgbkhlYXAgLS0-IG5DMgogICAgICAgIG5IZWFwIC0tPiBuQzMKICAgICAgICBuTm90ZVsiPGI-RWFjaCA8Y29kZT5hd2FpdDwvY29kZT4gcG9wcyB0aGUgZnJhbWUuPC9iPjxici8-U3RhdGUgbGl2ZXMgb25seSBpbiBoZWFwIGNsb3N1cmVzLjxici8-U3RhY2sgaXMgcmV1c2VkIGFjcm9zcyBhbGwgdHVybnMuIl0KICAgIGVuZAoKICAgIHN1YmdyYXBoIEdvWyI8Yj5HbyDigJQgU3RhY2tmdWwgQ29yb3V0aW5lPC9iPiJdCiAgICAgICAgZGlyZWN0aW9uIFRCCiAgICAgICAgZ01bIjxiPk9TIFRocmVhZCAoTSk8L2I-PGJyLz5jdXJyZW50bHkgcnVubmluZyBHMyDilrYiXQogICAgICAgIGdIZWFwWyI8Yj5IZWFwPC9iPiJdCiAgICAgICAgZ0cxWyI8Yj5nb3JvdXRpbmUgRzE8L2I-ICgyIEtCIHN0YWNrKTxici8-4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSBPGJyLz5wcm9jZXNzKCk8YnIvPiZuYnNwOyZuYnNwO-KGsyBzbG93RG91YmxlKCk8YnIvPiZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwO-KGsyB0aW1lLlNsZWVwKCkg4piFcGFya2VkIl0KICAgICAgICBnRzJbIjxiPmdvcm91dGluZSBHMjwvYj4gKDIgS0Igc3RhY2spPGJyLz7ilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIE8YnIvPmhhbmRsZXIoKTxici8-Jm5ic3A7Jm5ic3A74oazIGRiLlF1ZXJ5KCkg4piFcGFya2VkIl0KICAgICAgICBnRzNbIjxiPmdvcm91dGluZSBHMzwvYj4gKDIgS0Igc3RhY2spPGJyLz7ilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIE8YnIvPmN1cnJlbnRseSBvbiBNIOKWtiJdCiAgICAgICAgZ0hlYXAgLS0-IGdHMQogICAgICAgIGdIZWFwIC0tPiBnRzIKICAgICAgICBnSGVhcCAtLT4gZ0czCiAgICAgICAgZ05vdGVbIjxiPkVhY2ggZ29yb3V0aW5lIG93bnMgaXRzIGZ1bGwgc3RhY2suPC9iPjxici8-UnVudGltZSBzYXZlcy9yZXN0b3JlcyBlbnRpcmUgc3RhY2s8YnIvPm9uIHN1c3BlbmQuIE5vIGZyYW1lIHBvcCBuZWVkZWQuIl0KICAgIGVuZAoKICAgIE5vZGUgfn5-IEdvCgogICAgY2xhc3NEZWYgbm9kZUFsZXJ0IGZpbGw6I2ZlZTJlMixzdHJva2U6I2RjMjYyNixzdHJva2Utd2lkdGg6M3B4LGNvbG9yOiM3ZjFkMWQKICAgIGNsYXNzRGVmIG5vZGVDbGFzcyBmaWxsOiNmZWYzYzcsc3Ryb2tlOiNkOTc3MDYsY29sb3I6IzExMTgyNwogICAgY2xhc3NEZWYgZ29DbGFzcyBmaWxsOiNkYmVhZmUsc3Ryb2tlOiMyNTYzZWIsY29sb3I6IzExMTgyNwogICAgY2xhc3NEZWYgbm90ZUNsYXNzIGZpbGw6I2ZmZmZmZixzdHJva2U6IzM3NDE1MSxzdHJva2Utd2lkdGg6MS41cHgsY29sb3I6IzExMTgyNwoKICAgIGNsYXNzIG5TdGFjayBub2RlQWxlcnQKICAgIGNsYXNzIG5IZWFwLG5DMSxuQzIsbkMzIG5vZGVDbGFzcwogICAgY2xhc3MgZ00sZ0hlYXAsZ0cxLGdHMixnRzMgZ29DbGFzcwogICAgY2xhc3Mgbk5vdGUsZ05vdGUgbm90ZUNsYXNz" class="article-body-image-wrapper"&gt;&lt;img 
src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBOb2RlWyI8Yj5Ob2RlIOKAlCBTdGFja2xlc3MgQ29yb3V0aW5lPC9iPiJdCiAgICAgICAgZGlyZWN0aW9uIFRCCiAgICAgICAgblN0YWNrWyI8Yj5KUyBDYWxsIFN0YWNrPC9iPjxici8-KG9uZSBmcmFtZSBhdCBhIHRpbWUpPGJyLz7ilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIE8YnIvPuKaoCA8Yj5jdXJyZW50bHkgZW1wdHk8L2I-PGJyLz4oYWxsIGFzeW5jIGZucyBwb3BwZWQsPGJyLz53YWl0aW5nIGluIGV2ZW50IGxvb3ApIl0KICAgICAgICBuSGVhcFsiPGI-SGVhcDwvYj4iXQogICAgICAgIG5DMVsiPGI-Y29udGludWF0aW9uICMxPC9iPjxici8-eyBzdGF0ZTogMSw8YnIvPiZuYnNwOyZuYnNwO2xvY2Fsczoge3JlcSwgcmVzLCBhfSw8YnIvPiZuYnNwOyZuYnNwO3N0ZXA6IGZuIHB0ciB9Il0KICAgICAgICBuQzJbIjxiPmNvbnRpbnVhdGlvbiAjMjwvYj48YnIvPnsgc3RhdGU6IDAsIC4uLiB9Il0KICAgICAgICBuQzNbIjxiPmNvbnRpbnVhdGlvbiAjMzwvYj48YnIvPnsgc3RhdGU6IDIsIC4uLiB9Il0KICAgICAgICBuSGVhcCAtLT4gbkMxCiAgICAgICAgbkhlYXAgLS0-IG5DMgogICAgICAgIG5IZWFwIC0tPiBuQzMKICAgICAgICBuTm90ZVsiPGI-RWFjaCA8Y29kZT5hd2FpdDwvY29kZT4gcG9wcyB0aGUgZnJhbWUuPC9iPjxici8-U3RhdGUgbGl2ZXMgb25seSBpbiBoZWFwIGNsb3N1cmVzLjxici8-U3RhY2sgaXMgcmV1c2VkIGFjcm9zcyBhbGwgdHVybnMuIl0KICAgIGVuZAoKICAgIHN1YmdyYXBoIEdvWyI8Yj5HbyDigJQgU3RhY2tmdWwgQ29yb3V0aW5lPC9iPiJdCiAgICAgICAgZGlyZWN0aW9uIFRCCiAgICAgICAgZ01bIjxiPk9TIFRocmVhZCAoTSk8L2I-PGJyLz5jdXJyZW50bHkgcnVubmluZyBHMyDilrYiXQogICAgICAgIGdIZWFwWyI8Yj5IZWFwPC9iPiJdCiAgICAgICAgZ0cxWyI8Yj5nb3JvdXRpbmUgRzE8L2I-ICgyIEtCIHN0YWNrKTxici8-4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSB4pSBPGJyLz5wcm9jZXNzKCk8YnIvPiZuYnNwOyZuYnNwO-KGsyBzbG93RG91YmxlKCk8YnIvPiZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwO-KGsyB0aW1lLlNsZWVwKCkg4piFcGFya2VkIl0KICAgICAgICBnRzJbIjxiPmdvcm91dGluZSBHMjwvYj4gKDIgS0Igc3RhY2spPGJyLz7ilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIE8YnIvPmhhbmRsZXIoKTxici8-Jm5ic3A7Jm5ic3A74oazIGRiLlF1ZXJ5KCkg4piFcGFya2VkIl0KICAgICAgICBnRzNbIjxiPmdvcm91dGluZSBHMzwvYj4gKDIgS0Igc3RhY2spPGJyLz7ilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIHilIE8YnIvPmN1cnJlbnRseSBvbiBNIOKWtiJdCiAgICAgICAgZ0hlYXAgLS0-IGdHMQogICAgICAgIGdIZWFwIC0tPiBnRzIKICAgICAgICBnSGVhcCAtLT4gZ0czCiAgICAgICAgZ05vdGVbIjxiPkVhY2ggZ29yb3V0aW5lIG93bnMgaXRzIGZ1bGwgc3RhY2suPC9iPjxici8-UnVudGltZSBzYXZlcy9yZXN0b3JlcyBlbnRpcmUgc3RhY2s8YnIvPm9uIHN1c3BlbmQuIE5vIGZyYW1lIHBvcCBuZWVkZWQuIl0KICAgIGVuZAoKICAgIE5vZGUgfn5-IEdvCgogICAgY2xhc3NEZWYgbm9kZUFsZXJ0IGZpbGw6I2ZlZTJlMixzdHJva2U6I2RjMjYyNixzdHJva2Utd2lkdGg6M3B4LGNvbG9yOiM3ZjFkMWQKICAgIGNsYXNzRGVmIG5vZGVDbGFzcyBmaWxsOiNmZWYzYzcsc3Ryb2tlOiNkOTc3MDYsY29sb3I6IzExMTgyNwogICAgY2xhc3NEZWYgZ29DbGFzcyBmaWxsOiNkYmVhZmUsc3Ryb2tlOiMyNTYzZWIsY29sb3I6IzExMTgyNwogICAgY2xhc3NEZWYgbm90ZUNsYXNzIGZpbGw6I2ZmZmZmZixzdHJva2U6IzM3NDE1MSxzdHJva2Utd2lkdGg6MS41cHgsY29sb3I6IzExMTgyNwoKICAgIGNsYXNzIG5TdGFjayBub2RlQWxlcnQKICAgIGNsYXNzIG5IZWFwLG5DMSxuQzIsbkMzIG5vZGVDbGFzcwogICAgY2xhc3MgZ00sZ0hlYXAsZ0cxLGdHMixnRzMgZ29DbGFzcwogICAgY2xhc3Mgbk5vdGUsZ05vdGUgbm90ZUNsYXNz" alt="flowchart LR" width="1904" height="261"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In Node, the JS call stack is shared and almost always near-empty — every async function in flight has already popped, with its state sitting in a heap closure. In Go, every goroutine owns its full call chain on its own heap-allocated stack; suspended goroutines look like frozen frames waiting for the runtime to resume them on some OS thread.&lt;/p&gt;

&lt;p&gt;This is also why neither language can simply borrow the other's model. &lt;strong&gt;Node runs on V8&lt;/strong&gt;, which was designed in 2008 for browser JS — single call stack, synchronous semantics, no concept of saving stacks across yields. Adding stackful coroutines would mean rewriting the engine, which is roughly what Java's Project Loom did to the JVM at huge cost. &lt;strong&gt;Go was designed from scratch&lt;/strong&gt; with a runtime that owns stacks, can grow them, and can save them. The choice is locked in by runtime architecture, not language taste.&lt;/p&gt;




&lt;h2&gt;
  
  
  What "User-Space" Actually Buys You
&lt;/h2&gt;

&lt;p&gt;The slogan only matters if user-space context switching is meaningfully cheaper than the kernel-mediated kind. It is — by more than an order of magnitude.&lt;/p&gt;

&lt;p&gt;The setup: two goroutines pinned to one OS thread (&lt;code&gt;GOMAXPROCS=1&lt;/code&gt;), ping-ponging via &lt;code&gt;runtime.Gosched()&lt;/code&gt; and via an unbuffered channel; versus two pthreads pinned to one core (&lt;code&gt;taskset -c 0&lt;/code&gt;), ping-ponging via &lt;code&gt;pthread_mutex&lt;/code&gt; + &lt;code&gt;pthread_cond&lt;/code&gt;. (Reproduction code at the end of the post.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Measured on Intel N100, Ubuntu 24.04 (kernel 6.8.0), Go 1.23.4, gcc 13.3:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;ns / switch&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Goroutine yield (&lt;code&gt;runtime.Gosched&lt;/code&gt;, GOMAXPROCS=1)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~102 ns&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Goroutine round-trip via unbuffered channel&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;~436 ns&lt;/strong&gt; (≈218 ns per G-switch + channel coordination)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pthread switch (mutex+cond ping-pong, single core)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;~2,900 ns&lt;/strong&gt; (range 2,818–3,611 across 5 runs of 2M iterations)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Ratio: roughly &lt;strong&gt;28× cheaper&lt;/strong&gt; for the bare scheduler yield, &lt;strong&gt;~13× cheaper&lt;/strong&gt; for the apples-to-apples synchronized round-trip.&lt;/p&gt;

&lt;p&gt;Where the gap comes from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mode switch.&lt;/strong&gt; The user → kernel → user round-trip alone is ~100 ns of entry/exit and ABI-mandated register save/restore. A goroutine switch never crosses that line.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduler work in kernel space.&lt;/strong&gt; Linux's scheduler (CFS, or EEVDF on recent kernels) keeps per-CPU runqueues of runnable threads behind locks, plus cross-CPU load balancing. The Go scheduler does the same job in user space with per-P local runqueues and lock-free fast paths — and skips the kernel locks entirely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache and TLB effects.&lt;/strong&gt; A kernel scheduler may migrate a thread to a different core, costing you cold L1/L2 and an instruction-cache reload. Goroutines normally stay on the same M, so the cache stays warm.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What the model does &lt;em&gt;not&lt;/em&gt; buy you: a goroutine that makes a real blocking syscall still pays for a real OS thread switch — the runtime detaches the G from its M and may spin up another M so the rest of the goroutines keep running. Async preemption (Go 1.14+, signal-based) is the runtime's answer to tight loops that never yield, and it has its own cost. Once you saturate &lt;code&gt;GOMAXPROCS&lt;/code&gt;, the user-space runqueue itself starts to show up in profiles.&lt;/p&gt;
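
&lt;p&gt;The async-preemption point is easy to see for yourself. A minimal sketch, assuming Go 1.14 or newer: pin the runtime to a single P, spin one goroutine in a pure-CPU loop with no function calls, and note that the main goroutine still gets scheduled. On Go 1.13 and earlier the same program never gets past the &lt;code&gt;Sleep&lt;/code&gt;, because cooperative preemption only happened at function-call boundaries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "runtime"
    "time"
)

func main() {
    runtime.GOMAXPROCS(1) // one P: only one goroutine can execute Go code at a time

    go func() {
        for {
            // pure CPU, no function calls, no cooperative yield points
        }
    }()

    // The hot loop owns the only P. Signal-based async preemption (Go 1.14+)
    // interrupts it so the main goroutine can run again when the timer fires.
    time.Sleep(100 * time.Millisecond)
    fmt.Println("still responsive despite the hot loop")
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;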

&lt;p&gt;The "user-space-ification" buys you &lt;strong&gt;cheap G-to-G switching on a hot M.&lt;/strong&gt; That's where the order-of-magnitude lives. The syscalls, the M-to-M handoffs, the actual kernel work — those are still as expensive as they always were. The model wins by making the &lt;em&gt;common case&lt;/em&gt; — many concurrent goroutines, mostly waiting, occasionally running — almost free.&lt;/p&gt;

&lt;p&gt;(N100 is a low-power Alder Lake-N E-core; absolute numbers will be smaller on a server-class Xeon or EPYC, but the ratio is expected to hold.)&lt;/p&gt;




&lt;h2&gt;
  
  
  The Unit of Scheduling
&lt;/h2&gt;

&lt;p&gt;The cleanest comparison is to ask what each runtime actually schedules:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Node / TypeScript&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Go&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Unit of scheduling&lt;/td&gt;
&lt;td&gt;callback / Promise continuation&lt;/td&gt;
&lt;td&gt;goroutine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What's captured at suspension&lt;/td&gt;
&lt;td&gt;tail of an async function as a heap closure&lt;/td&gt;
&lt;td&gt;full call stack + registers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;How code looks&lt;/td&gt;
&lt;td&gt;explicit &lt;code&gt;async&lt;/code&gt;/&lt;code&gt;await&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;straight-line synchronous&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Suspension marked by&lt;/td&gt;
&lt;td&gt;the programmer (&lt;code&gt;await&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;the runtime (any blocking op)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Suspended state lives in&lt;/td&gt;
&lt;td&gt;V8 microtask queue + heap closure&lt;/td&gt;
&lt;td&gt;goroutine stack on the user-space heap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel involvement&lt;/td&gt;
&lt;td&gt;epoll/kqueue/IOCP via libuv&lt;/td&gt;
&lt;td&gt;epoll/kqueue/IOCP via netpoller&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU parallelism&lt;/td&gt;
&lt;td&gt;one main JS thread; needs workers/cluster for cores&lt;/td&gt;
&lt;td&gt;M:N scheduler runs goroutines across cores natively&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Function color&lt;/td&gt;
&lt;td&gt;yes (Promise infects up the call stack)&lt;/td&gt;
&lt;td&gt;no (any function may block)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What breaks under CPU load&lt;/td&gt;
&lt;td&gt;the entire event loop&lt;/td&gt;
&lt;td&gt;nothing — scheduler runs another G on another M&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The two columns describe deeply different mental models, but they belong to the same family. They are both &lt;em&gt;user-space concurrency runtimes that avoid kernel thread-per-request.&lt;/em&gt; They differ in where the suspension is captured (the language vs. the call stack) and how broad the scheduler's mandate is.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where the Boundaries Diverge: CPU-Bound Work
&lt;/h2&gt;

&lt;p&gt;Node and Go look interchangeable on I/O-bound workloads. They diverge sharply the moment CPU work enters the picture.&lt;/p&gt;

&lt;p&gt;Node's event loop has one job: dispatch ready callbacks onto a single JS thread. If a callback runs for 200 ms doing JSON parsing or hashing, the loop is &lt;em&gt;frozen&lt;/em&gt; for those 200 ms. Every other suspended continuation has to wait. Throughput collapses.&lt;/p&gt;

&lt;p&gt;Go's runtime has a different mandate. It doesn't only manage waiting — it also manages execution. If you spawn:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;task1&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;task2&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;task3&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…the scheduler is happy to put each goroutine on a different M, run them on different cores in true parallel, and preempt long-running goroutines so they don't starve the rest of the runtime. CPU-bound goroutines aren't a special case to work around. They're just goroutines.&lt;/p&gt;
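
&lt;p&gt;A small sketch of what that buys you in practice (the &lt;code&gt;burn&lt;/code&gt; helper and the 300 ms figure are placeholders for real CPU-bound work): run one CPU-bound goroutine per core, and the wall-clock time stays close to the cost of a single task, because the scheduler spreads the goroutines across OS threads on separate cores.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "runtime"
    "sync"
    "time"
)

// burn stands in for a CPU-bound task: no I/O, nothing but spinning.
func burn(d time.Duration) {
    deadline := time.Now().Add(d)
    for time.Now().Before(deadline) {
    }
}

func main() {
    n := runtime.NumCPU()
    fmt.Printf("running %d CPU-bound goroutines on %d cores\n", n, n)

    start := time.Now()
    var wg sync.WaitGroup
    for i := 0; i &amp;lt; n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            burn(300 * time.Millisecond)
        }()
    }
    wg.Wait()

    // Expect roughly 300 ms, not n x 300 ms: true parallelism, no event loop to freeze.
    fmt.Println("elapsed:", time.Since(start).Round(time.Millisecond))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;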

&lt;p&gt;That's why Go's concurrency model covers more ground:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Node's model mainly solves non-CPU-bound concurrency — network I/O, database waits, downstream API calls. Go's model solves I/O waiting &lt;em&gt;and&lt;/em&gt; CPU parallelism with the same primitive.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This isn't a knock on Node. The event loop is brilliant at what it's designed for: lots of slow waits, light per-request CPU. It's the natural shape of API gateways, BFFs, websocket hubs, real-time aggregation, and most of the JSON-shuffling that makes up modern web backends. But sustained CPU work, mixed CPU + I/O pipelines, long-lived infrastructure services — those are workloads where Go's scheduler-driven model has more headroom built in.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two Answers to the Same Question
&lt;/h2&gt;

&lt;p&gt;Strip away the implementation details and the two runtimes are answering the same question with different abstractions:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Concurrency at scale is the problem of what to do with the CPU while a request waits on I/O.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Node's answer: turn the wait into an event, capture the rest of the function as a continuation, resume the continuation when the event fires. One thread cycling through ready continuations.&lt;/p&gt;

&lt;p&gt;Go's answer: run the request on a goroutine, suspend the goroutine in user space when it blocks, schedule another runnable goroutine onto the OS thread, resume the original when its wait completes.&lt;/p&gt;

&lt;p&gt;Two ways of solving the same waste. One state-machines it. The other lowers the cost of context switching far enough that you can afford to keep one execution flow per request.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Two answers to one question: one is events, implemented as a state machine. The other is low-cost user-space context switching.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But there's a deeper layer worth surfacing. The two answers also disagree about &lt;em&gt;whether suspension should be visible in the type system.&lt;/em&gt; Node says yes — &lt;code&gt;Promise&amp;lt;T&amp;gt;&lt;/code&gt; is part of the signature, &lt;code&gt;async&lt;/code&gt; is part of the contract, function color propagates. Go says no — any function may block, and the type doesn't carry that information.&lt;/p&gt;

&lt;p&gt;This visibility-vs-uniformity trade-off shows up far beyond Node and Go. It's the same shape as monadic IO vs implicit IO in Haskell, checked vs unchecked exceptions in Java, capability-based security vs ambient authority. Each pair makes the same trade: composable static reasoning vs ergonomic uniform code. Node and Go are picking sides of a much bigger question.&lt;/p&gt;

&lt;p&gt;You see the consequence in the libraries. Node libraries publish &lt;code&gt;fs.readFile&lt;/code&gt; &lt;em&gt;and&lt;/em&gt; &lt;code&gt;fs.readFileSync&lt;/code&gt;, two retry helpers (one for sync ops, one for async), &lt;code&gt;p-limit&lt;/code&gt;-style bounded-concurrency wrappers around &lt;code&gt;Promise.all&lt;/code&gt;. Go libraries publish &lt;code&gt;os.ReadFile&lt;/code&gt; (one function), one &lt;code&gt;Retry(op func() error, n int) error&lt;/code&gt;, twenty lines of &lt;code&gt;chan&lt;/code&gt; + &lt;code&gt;WaitGroup&lt;/code&gt; for bounded concurrency. The Go versions aren't simpler because Go developers are smarter — they're simpler because the runtime hides the same complexity that Node's type system insists on exposing.&lt;/p&gt;
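
&lt;p&gt;To ground that last claim, here is roughly what those "twenty lines of &lt;code&gt;chan&lt;/code&gt; + &lt;code&gt;WaitGroup&lt;/code&gt;" look like. A sketch, not a library: &lt;code&gt;processItem&lt;/code&gt; and the limit of 4 are placeholders, and the buffered channel is doing the job &lt;code&gt;p-limit&lt;/code&gt; does on the Node side.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "sync"
)

// processItem stands in for whatever blocking work is being fanned out.
func processItem(id int) { fmt.Println("processed", id) }

func main() {
    const limit = 4                   // at most 4 goroutines in flight
    sem := make(chan struct{}, limit) // buffered channel used as a semaphore
    var wg sync.WaitGroup

    for id := 1; id &amp;lt;= 100; id++ {
        wg.Add(1)
        sem &amp;lt;- struct{}{} // acquire a slot; blocks once `limit` workers are running
        go func(id int) {
            defer wg.Done()
            defer func() { &amp;lt;-sem }() // release the slot
            processItem(id)
        }(id)
    }
    wg.Wait()
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;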




&lt;h2&gt;
  
  
  The Closing Line
&lt;/h2&gt;

&lt;p&gt;If you remember one thing from this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Node turns waiting into events. Go turns execution flows into schedulable units. Both refuse to let the CPU sit idle while I/O blocks — they just disagree on what the unit of scheduling should be.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Or, if you want the deeper layer:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Node makes "this function might suspend" visible at the type level. Go makes it invisible.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's the whole story. Everything else — &lt;code&gt;await&lt;/code&gt; vs &lt;code&gt;go&lt;/code&gt;, libuv vs the netpoller, V8's microtask queue vs GMP, single-thread bottleneck vs CPU-bound resilience, libraries that look complicated vs libraries that look simple — falls out of that one disagreement.&lt;/p&gt;




&lt;h2&gt;
  
  
  Appendix: Reproduce the Benchmark
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;goroutine_switch_test.go&lt;/code&gt;&lt;/strong&gt; — &lt;code&gt;GOMAXPROCS=1 go test -bench=. -benchtime=5s -count=5&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;bench&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"runtime"&lt;/span&gt;
    &lt;span class="s"&gt;"sync"&lt;/span&gt;
    &lt;span class="s"&gt;"testing"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Channel ping-pong: each iter is a full round-trip = 2 G-switches.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;BenchmarkGoroutineSwitchChannel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="n"&gt;done&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{}{}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResetTimer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{}{}&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ch&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StopTimer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nb"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Bare scheduler yield. Each iter ≈ 1 G-switch.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;BenchmarkGoroutineSwitchGosched&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;wg&lt;/span&gt; &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WaitGroup&lt;/span&gt;
    &lt;span class="n"&gt;wg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;half&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;half&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Gosched&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;wg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResetTimer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;half&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Gosched&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;wg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Wait&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;pthread_switch.c&lt;/code&gt;&lt;/strong&gt; — &lt;code&gt;gcc -O2 -o pthread_switch pthread_switch.c -lpthread &amp;amp;&amp;amp; taskset -c 0 ./pthread_switch 2000000&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;#define _GNU_SOURCE
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;pthread.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdlib.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;time.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;stdint.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;pthread_mutex_t&lt;/span&gt; &lt;span class="n"&gt;mu&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PTHREAD_MUTEX_INITIALIZER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;pthread_cond_t&lt;/span&gt;  &lt;span class="n"&gt;cv&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PTHREAD_COND_INITIALIZER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;    &lt;span class="n"&gt;turn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt;            &lt;span class="n"&gt;iters&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;my_turn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;intptr_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;pthread_mutex_lock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;iters&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;turn&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;my_turn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;pthread_cond_wait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;turn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;my_turn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;pthread_cond_broadcast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cv&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;pthread_mutex_unlock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="nf"&gt;now_ns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;timespec&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;clock_gettime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CLOCK_MONOTONIC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tv_sec&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1e9&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tv_nsec&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;iters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argc&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;atol&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1000000L&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;pthread_t&lt;/span&gt; &lt;span class="n"&gt;t0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now_ns&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;pthread_create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;t0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;intptr_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;pthread_create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;t1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="kt"&gt;intptr_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;pthread_join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;pthread_join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now_ns&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ns / switch: %.1f&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;iters&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;GOMAXPROCS=1&lt;/code&gt; forces both goroutines onto the same M so we measure pure G-to-G switching, not cross-core migration. &lt;code&gt;taskset -c 0&lt;/code&gt; pins both pthreads to one CPU so they actually have to context-switch (otherwise they run in parallel on two cores and there is nothing to measure). Both benches do the simplest possible synchronized hand-off — no I/O, no real work — so what is left is the cost of the switch itself.&lt;/p&gt;

</description>
      <category>go</category>
      <category>node</category>
      <category>concurrency</category>
      <category>javascript</category>
    </item>
    <item>
      <title>gRPC Interceptors in Production: Design Patterns That Survive Real Load</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Mon, 20 Apr 2026 17:02:20 +0000</pubDate>
      <link>https://forem.com/harrisonsec/grpc-interceptors-in-production-design-patterns-that-survive-real-load-372h</link>
      <guid>https://forem.com/harrisonsec/grpc-interceptors-in-production-design-patterns-that-survive-real-load-372h</guid>
      <description>&lt;p&gt;gRPC interceptors are the middleware pattern, specialized for gRPC. If you've written HTTP middleware before, the shape is familiar — a function that wraps a call: it can observe or modify the request, pass it to the next handler, then observe or modify the response. The difference: gRPC's type system makes the flavors (unary, server-stream, client-stream, bidi) explicit, and chain ordering matters more than most people realize.&lt;/p&gt;

&lt;p&gt;Most online examples show a single toy interceptor. Production systems stack five to ten of them per service. Getting the composition right — ordering, concern separation, testability — is half of running a gRPC-based microservice well.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — gRPC interceptors are middleware with more explicit types. Chain them outside-in: observability wraps everything, then throttling, then auth, then retry, then the actual service. Keep each interceptor focused on one concern; the moment an interceptor does two things you're writing coupled middleware. Stream interceptors are trickier than unary — don't copy-paste unary logic into stream without thinking. Test the chain composition with bufconn, not just each interceptor in isolation.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Four Interceptor Types
&lt;/h2&gt;

&lt;p&gt;gRPC has four interceptor signatures, two for client, two for server:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unary server interceptor&lt;/strong&gt;: wraps a single request → single response call.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stream server interceptor&lt;/strong&gt;: wraps streaming RPCs (server-stream, client-stream, bidi).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unary client interceptor&lt;/strong&gt;: wraps the client side of a unary call.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stream client interceptor&lt;/strong&gt;: wraps the client side of a streaming call.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unary interceptors are easy. Stream interceptors are harder because you're wrapping a bidirectional wire, not a single call.&lt;/p&gt;

&lt;p&gt;Example unary server interceptor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;loggingInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryServerInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"method=%s duration=%s err=%v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FullMethod&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Register it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loggingInterceptor&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Straightforward. Now stack five of them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chaining and Order
&lt;/h2&gt;

&lt;p&gt;Real services need multiple interceptors. grpc-go gives you &lt;code&gt;grpc.ChainUnaryInterceptor(...)&lt;/code&gt; and &lt;code&gt;grpc.ChainStreamInterceptor(...)&lt;/code&gt; built in, or you can use the chaining helpers from &lt;code&gt;github.com/grpc-ecosystem/go-grpc-middleware&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChainUnaryInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;observabilityInterceptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c"&gt;// outermost&lt;/span&gt;
        &lt;span class="n"&gt;rateLimitInterceptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;authInterceptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;validationInterceptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;businessLogicContextInterceptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c"&gt;// innermost&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Chain order matters &lt;em&gt;enormously&lt;/em&gt;. Interceptors execute outside-in on the way to the handler, inside-out on the way back. Put an interceptor on the wrong side of another and you get bugs that are hard to debug.&lt;/p&gt;

&lt;p&gt;Canonical order I use:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBDbGllbnQoW2dSUEMgY2xpZW50XSkgLS0-IEkxCiAgICBJMVsiT2JzZXJ2YWJpbGl0eTxici8-dHJhY2luZyDCtyBtZXRyaWNzIMK3IGxvZ2dpbmciXSAtLT4gSTIKICAgIEkyWyJSYXRlIGxpbWl0aW5nIC8gcXVvdGEiXSAtLT4gSTMKICAgIEkzWyJBdXRoPGJyLz5hdXRobiDCtyBhdXRoeiJdIC0tPiBJNAogICAgSTRbIlZhbGlkYXRpb24iXSAtLT4gSTUKICAgIEk1WyJSZXRyeSAvIGlkZW1wb3RlbmN5Il0gLS0-IEk2CiAgICBJNlsiQ29udGV4dCBlbnJpY2htZW50Il0gLS0-IEhhbmRsZXJ7eyJCdXNpbmVzcyBoYW5kbGVyIn19CgogICAgY2xhc3NEZWYgb3V0ZXIgZmlsbDojZmVmNWU3LHN0cm9rZTojYjc3OTFmCiAgICBjbGFzc0RlZiBtaWQgZmlsbDojZThmNGY4LHN0cm9rZTojMmM1MjgyCiAgICBjbGFzc0RlZiBpbm5lciBmaWxsOiNmMGZmZjQsc3Ryb2tlOiMyZjg1NWEKICAgIGNsYXNzIEkxIG91dGVyCiAgICBjbGFzcyBJMixJMyxJNCBtaWQKICAgIGNsYXNzIEk1LEk2IGlubmVy" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBDbGllbnQoW2dSUEMgY2xpZW50XSkgLS0-IEkxCiAgICBJMVsiT2JzZXJ2YWJpbGl0eTxici8-dHJhY2luZyDCtyBtZXRyaWNzIMK3IGxvZ2dpbmciXSAtLT4gSTIKICAgIEkyWyJSYXRlIGxpbWl0aW5nIC8gcXVvdGEiXSAtLT4gSTMKICAgIEkzWyJBdXRoPGJyLz5hdXRobiDCtyBhdXRoeiJdIC0tPiBJNAogICAgSTRbIlZhbGlkYXRpb24iXSAtLT4gSTUKICAgIEk1WyJSZXRyeSAvIGlkZW1wb3RlbmN5Il0gLS0-IEk2CiAgICBJNlsiQ29udGV4dCBlbnJpY2htZW50Il0gLS0-IEhhbmRsZXJ7eyJCdXNpbmVzcyBoYW5kbGVyIn19CgogICAgY2xhc3NEZWYgb3V0ZXIgZmlsbDojZmVmNWU3LHN0cm9rZTojYjc3OTFmCiAgICBjbGFzc0RlZiBtaWQgZmlsbDojZThmNGY4LHN0cm9rZTojMmM1MjgyCiAgICBjbGFzc0RlZiBpbm5lciBmaWxsOiNmMGZmZjQsc3Ryb2tlOiMyZjg1NWEKICAgIGNsYXNzIEkxIG91dGVyCiAgICBjbGFzcyBJMixJMyxJNCBtaWQKICAgIGNsYXNzIEk1LEk2IGlubmVy" alt="Client([gRPC client]) --&amp;gt; I1" width="1784" height="94"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Outside-in on the way to the handler, inside-out on the way back. Observability must wrap everything — so it sees every rejection, every rate-limit hit, every failed auth — otherwise you have operational blind spots. Details:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Observability (tracing + metrics + logging)&lt;/strong&gt; — outermost. You want to see every request, including the ones that get rejected by later interceptors. If observability is inside auth, unauth'd attempts are invisible — a security-relevant blind spot.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rate limiting / quota&lt;/strong&gt; — before auth. Why? Because auth involves token verification (DB lookup, JWT parsing, external identity service), and you don't want unauthenticated requests to cost you CPU. Rate-limit first, authenticate second.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Auth (authentication + authorization)&lt;/strong&gt; — before business logic. Reject unauthenticated/unauthorized requests early.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Validation (request shape, basic sanity)&lt;/strong&gt; — before business logic. Catches malformed requests before they hit service code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Retry / idempotency handling&lt;/strong&gt; — closer to business. Only retry what actually made it through auth.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Request context enrichment (trace IDs, user metadata)&lt;/strong&gt; — innermost. Populate context with validated data for the service to use.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Inverted order produces real bugs. I've seen auth outside observability (auth failures weren't logged). Retry outside rate limiter (a retry storm blew through the rate limit). Validation outside observability (validation failures invisible in metrics). Each one a real incident.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keeping Interceptors Focused
&lt;/h2&gt;

&lt;p&gt;The rule: &lt;strong&gt;one concern per interceptor&lt;/strong&gt;. The moment you have an "auth-and-logging" interceptor, you're coupling concerns that should evolve separately.&lt;/p&gt;

&lt;p&gt;Concretely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Don't: single "observability" interceptor that does tracing, metrics, and logging in one function.&lt;/li&gt;
&lt;li&gt;Do: three interceptors (&lt;code&gt;tracingInterceptor&lt;/code&gt;, &lt;code&gt;metricsInterceptor&lt;/code&gt;, &lt;code&gt;loggingInterceptor&lt;/code&gt;), chained.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cost: three function-call overheads instead of one. Marginal.&lt;/p&gt;

&lt;p&gt;Benefit: you can swap tracing backends without touching logging. You can disable metrics in tests without disabling tracing. Each interceptor is testable in isolation.&lt;/p&gt;

&lt;p&gt;This is the same argument for Unix pipes over monolithic commands. Composition beats monoliths.&lt;/p&gt;
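
&lt;p&gt;Wired up, that composition is one chain call. A sketch (the three interceptor names are placeholders for your own implementations):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;s := grpc.NewServer(
    grpc.ChainUnaryInterceptor(
        tracingInterceptor, // spans only
        metricsInterceptor, // histograms only
        loggingInterceptor, // log lines only
    ),
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;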

&lt;h2&gt;
  
  
  Common Interceptor Recipes
&lt;/h2&gt;

&lt;p&gt;Real interceptors I've written variants of many times:&lt;/p&gt;

&lt;h3&gt;
  
  
  Tracing (OpenTelemetry)
&lt;/h3&gt;

&lt;p&gt;Use the &lt;code&gt;otelgrpc&lt;/code&gt; integration from &lt;code&gt;go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc&lt;/code&gt;. Don't write your own — the ecosystem is mature. Current idiomatic setup uses a &lt;code&gt;StatsHandler&lt;/code&gt;, which hooks deeper than the interceptor chain and captures stream events correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"&lt;/span&gt;

&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StatsHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;otelgrpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServerHandler&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChainUnaryInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="c"&gt;/* your app interceptors */&lt;/span&gt; &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Older codebases still use &lt;code&gt;otelgrpc.UnaryServerInterceptor()&lt;/code&gt; and &lt;code&gt;otelgrpc.StreamServerInterceptor()&lt;/code&gt; — those are deprecated but still work. Migrate when convenient; don't rewrite in a panic.&lt;/p&gt;
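
&lt;p&gt;For reference, the older interceptor-based registration you'll still see in those codebases looks roughly like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Deprecated wiring: interceptor-based otelgrpc, common in pre-StatsHandler code.
s := grpc.NewServer(
    grpc.UnaryInterceptor(otelgrpc.UnaryServerInterceptor()),
    grpc.StreamInterceptor(otelgrpc.StreamServerInterceptor()),
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;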

&lt;h3&gt;
  
  
  Metrics
&lt;/h3&gt;

&lt;p&gt;Prometheus histogram of request duration per method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;reqDuration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;promauto&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewHistogramVec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;prometheus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HistogramOpts&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"grpc_server_request_duration_seconds"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Buckets&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prometheus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefBuckets&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;metricsInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryServerInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;code&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;reqDuration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithLabelValues&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FullMethod&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Observe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Seconds&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: cardinality of &lt;code&gt;method&lt;/code&gt; is bounded (you know your service's methods). Cardinality of &lt;code&gt;code&lt;/code&gt; is bounded (gRPC codes are a fixed enum). Don't add user-id or request-id as labels — that's cardinality-explosion territory.&lt;/p&gt;

&lt;h3&gt;
  
  
  Auth
&lt;/h3&gt;

&lt;p&gt;Extract bearer token from metadata, verify, inject user context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;authInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryServerInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FromIncomingContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;codes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unauthenticated&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"no metadata"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"authorization"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;codes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unauthenticated&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"no auth token"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;verifyToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;codes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unauthenticated&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"invalid token"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// Skip certain public methods&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;isPublic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FullMethod&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userCtxKey&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key detail: add the user context here, near the boundary. Service code reads it from context. You don't pass claims as argument through every service method.&lt;/p&gt;
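
&lt;p&gt;On the service side, keep that context read in one small accessor. A sketch, assuming the claims type is whatever your &lt;code&gt;verifyToken&lt;/code&gt; returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// UserFromContext returns the claims the auth interceptor stored, if any.
func UserFromContext(ctx context.Context) (*Claims, bool) {
    claims, ok := ctx.Value(userCtxKey{}).(*Claims)
    return claims, ok
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;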

&lt;h3&gt;
  
  
  Rate limiting
&lt;/h3&gt;

&lt;p&gt;Token bucket per caller or per method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;rateLimitInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limiter&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Limiter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryServerInterceptor&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;
        &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryServerInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;limiter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Allow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;codes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResourceExhausted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"rate limited"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Production rate limiting is fancier — per-tenant, distributed state in Redis, burst capacity — but the shape is the same. Reject with &lt;code&gt;ResourceExhausted&lt;/code&gt; before doing work.&lt;/p&gt;
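
&lt;p&gt;The per-caller shape is a map of buckets keyed by identity. A sketch, with illustrative limits and no eviction (real deployments evict idle entries or move the state to Redis):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;type callerLimiters struct {
    mu       sync.Mutex
    limiters map[string]*rate.Limiter
}

// get lazily creates one token bucket per caller identity.
func (c *callerLimiters) get(caller string) *rate.Limiter {
    c.mu.Lock()
    defer c.mu.Unlock()
    l, ok := c.limiters[caller]
    if !ok {
        l = rate.NewLimiter(rate.Limit(100), 200) // 100 req/s, burst 200: illustrative numbers
        c.limiters[caller] = l
    }
    return l
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;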

&lt;h3&gt;
  
  
  Retry (client-side)
&lt;/h3&gt;

&lt;p&gt;Client interceptor that retries on transient errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;retryClientInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempts&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryClientInterceptor&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reply&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;
        &lt;span class="n"&gt;cc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClientConn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;invoker&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryInvoker&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;opts&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CallOption&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;attempts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;invoker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;isRetryable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="n"&gt;backoff&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;uint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Millisecond&lt;/span&gt;
            &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;After&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backoff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Retry is one of the most dangerous interceptors. Get it wrong (no idempotency keys, retry non-idempotent operations, retry storm during outage) and it causes more production incidents than it prevents. Pair with &lt;a href="https://github.com/grpc-ecosystem/go-grpc-middleware" rel="noopener noreferrer"&gt;&lt;code&gt;grpc-middleware/retry&lt;/code&gt;&lt;/a&gt; if you can; it's battle-tested.&lt;/p&gt;
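
&lt;p&gt;The &lt;code&gt;isRetryable&lt;/code&gt; helper above is left undefined on purpose. One conservative version (an assumption; tune it to your error model) retries only codes that signal transient server-side trouble:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;func isRetryable(err error) bool {
    switch status.Code(err) {
    case codes.Unavailable, codes.ResourceExhausted:
        return true // transient; safe to retry idempotent calls
    default:
        return false // never retry InvalidArgument, PermissionDenied, etc.
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;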

&lt;h2&gt;
  
  
  The Stream Interceptor Trap
&lt;/h2&gt;

&lt;p&gt;Stream interceptors are harder. The interceptor signature gives you a &lt;code&gt;grpc.ServerStream&lt;/code&gt;, which is a bidirectional channel. Logging becomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;loggingStreamInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;srv&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="n"&gt;ss&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServerStream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StreamServerInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StreamHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;srv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ss&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"stream=%s duration=%s err=%v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FullMethod&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This only logs at stream-end, not per message. If you want per-message observability, you need to wrap the &lt;code&gt;ServerStream&lt;/code&gt; itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;observedStream&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServerStream&lt;/span&gt;
    &lt;span class="n"&gt;sent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;recv&lt;/span&gt; &lt;span class="kt"&gt;int64&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;observedStream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;SendMsg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;atomic&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddInt64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServerStream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SendMsg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;observedStream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;RecvMsg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServerStream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RecvMsg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;atomic&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddInt64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;recv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then pass the wrapper to the handler. This is the pattern for any stream interceptor that needs per-message visibility.&lt;/p&gt;
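
&lt;p&gt;Putting it together, a sketch of the interceptor that hands the wrapper to the handler and reports the counters at stream end (the function name is mine):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;func countingStreamInterceptor(srv interface{}, ss grpc.ServerStream,
    info *grpc.StreamServerInfo, handler grpc.StreamHandler) error {
    wrapped := &amp;amp;observedStream{ServerStream: ss}
    err := handler(srv, wrapped)
    log.Printf("stream=%s sent=%d recv=%d err=%v", info.FullMethod,
        atomic.LoadInt64(&amp;amp;wrapped.sent), atomic.LoadInt64(&amp;amp;wrapped.recv), err)
    return err
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;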

&lt;p&gt;Common mistakes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Forgetting to propagate context to the wrapper.&lt;/strong&gt; The wrapped stream's &lt;code&gt;Context()&lt;/code&gt; should be the enriched context (see the sketch after this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-message overhead blows up long streams.&lt;/strong&gt; A message-level log line is fine at 100 msgs/sec. At 100K msgs/sec, it's your dominant cost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State in the wrapper not thread-safe.&lt;/strong&gt; Streams can be concurrent on the &lt;code&gt;Send&lt;/code&gt; and &lt;code&gt;Recv&lt;/code&gt; sides. Protect counters.&lt;/li&gt;
&lt;/ul&gt;
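
&lt;p&gt;For the context-propagation mistake, the fix is to carry the enriched context in the wrapper and override &lt;code&gt;Context()&lt;/code&gt;. A sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;type ctxStream struct {
    grpc.ServerStream
    ctx context.Context // the enriched context built in the interceptor
}

// Context overrides the embedded stream's Context so the handler sees the enrichment.
func (s *ctxStream) Context() context.Context { return s.ctx }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;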

&lt;h2&gt;
  
  
  Testing Interceptor Chains
&lt;/h2&gt;

&lt;p&gt;Unit test each interceptor in isolation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestAuthInterceptor_NoToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// no metadata&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnaryServerInfo&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;FullMethod&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"/my.Service/Method"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"handler should not be called"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;authInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;require&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Equal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;codes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unauthenticated&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Integration-test the chain end-to-end using &lt;code&gt;bufconn&lt;/code&gt; (in-memory connection):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestChain_Ordering&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;lis&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;bufconn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;lis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChainUnaryInterceptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;observability&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;business&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RegisterMyServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;realImpl&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Serve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"bufnet"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithContextDialer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;lis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DialContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithTransportCredentials&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;insecure&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCredentials&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewMyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Method&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;// assert on behavior end-to-end&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Integration tests catch bugs that unit tests don't: metadata propagation, interceptor ordering, context enrichment visible to the handler. Don't skip them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Patterns That Save Time
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;grpc-middleware/v2&lt;/code&gt;&lt;/strong&gt; (&lt;code&gt;github.com/grpc-ecosystem/go-grpc-middleware/v2&lt;/code&gt;) for chain helpers, recovery, and batteries-included interceptors. Don't reinvent every wheel.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep error semantics consistent&lt;/strong&gt;. Every interceptor should return &lt;code&gt;status.Error(code, msg)&lt;/code&gt; for failures. Don't return raw Go errors — clients can't parse them properly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skip-list for public methods.&lt;/strong&gt; Auth and rate limiting often need to skip health-check and reflection endpoints. Keep the skip list in one place (a minimal sketch follows this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-service vs global interceptors&lt;/strong&gt;. Most interceptors are global (tracing, metrics, auth). A few might be per-service (e.g., a bespoke rate limiter for a specific hot endpoint). Compose accordingly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Panic recovery at the outermost layer&lt;/strong&gt;. A panic in a handler shouldn't kill the server. Use the &lt;code&gt;recovery&lt;/code&gt; middleware from &lt;code&gt;grpc-middleware&lt;/code&gt; or write your own, and put it first in the chain.&lt;/li&gt;
&lt;/ul&gt;
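
&lt;p&gt;To make the skip-list idea concrete, here is a minimal sketch of an auth interceptor that bypasses public endpoints. The &lt;code&gt;publicMethods&lt;/code&gt; set and the &lt;code&gt;validateToken&lt;/code&gt; helper are illustrative assumptions, not a prescribed implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// imports: context, google.golang.org/grpc, google.golang.org/grpc/codes,
// google.golang.org/grpc/metadata, google.golang.org/grpc/status

// publicMethods is the single place the skip list lives (hypothetical names).
var publicMethods = map[string]bool{
    "/grpc.health.v1.Health/Check": true,
    "/grpc.reflection.v1alpha.ServerReflection/ServerReflectionInfo": true,
}

func authInterceptor(ctx context.Context, req any, info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler) (any, error) {
    if publicMethods[info.FullMethod] {
        return handler(ctx, req) // public endpoint: skip auth entirely
    }
    md, _ := metadata.FromIncomingContext(ctx)
    if err := validateToken(md.Get("authorization")); err != nil { // validateToken is assumed
        return nil, status.Error(codes.Unauthenticated, "invalid token")
    }
    return handler(ctx, req)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A rate-limiting interceptor can consult the same &lt;code&gt;publicMethods&lt;/code&gt; map, which keeps the skip list from drifting between interceptors.&lt;/p&gt;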

&lt;h2&gt;
  
  
  The Discipline That Makes This Work
&lt;/h2&gt;

&lt;p&gt;Interceptors are the right tool for cross-cutting concerns — the things every RPC needs but the service code shouldn't have to think about. The discipline is: one concern per interceptor, careful ordering, consistent error semantics, tested end-to-end.&lt;/p&gt;

&lt;p&gt;The services I've seen do this well have clean business logic (because the cross-cutting stuff is outside it) and reliable operational behavior (because the interceptor chain is tested as a unit, not just piece-by-piece). The services that do it poorly have auth logic sprinkled through their handlers, tracing that randomly misses requests, and rate limiters that certain code paths bypass entirely.&lt;/p&gt;

&lt;p&gt;Interceptor order is one of those details that looks tactical and turns out to be architectural. Get it right once; the service's behavior improves every release.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-context-distributed-systems-production/" rel="noopener noreferrer"&gt;Go Context in Distributed Systems: What Actually Works in Production&lt;/a&gt; — the context that flows through every interceptor.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/rpc-vs-nats-who-owns-completion/" rel="noopener noreferrer"&gt;RPC vs NATS: It's Not About Sync vs Async — It's About Who Owns Completion&lt;/a&gt; — the shape of gRPC calls as one side of the bigger messaging picture.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/observability-cost-attribution-dual-path-architecture/" rel="noopener noreferrer"&gt;Observability and Cost Attribution: Why One Pipeline Isn't Enough&lt;/a&gt; — why tracing interceptors alone aren't enough for business attribution.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>grpc</category>
      <category>interceptors</category>
    </item>
    <item>
      <title>Go Generics, One Year In: Which Promises Held, Which Didn't</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Mon, 20 Apr 2026 16:30:05 +0000</pubDate>
      <link>https://forem.com/harrisonsec/go-generics-one-year-in-which-promises-held-which-didnt-44m7</link>
      <guid>https://forem.com/harrisonsec/go-generics-one-year-in-which-promises-held-which-didnt-44m7</guid>
      <description>&lt;p&gt;Go 1.18 shipped generics in March 2022. The two years before that were dominated by hopeful blog posts ("finally, a real type system!") and the two years after by the predictable backlash ("why did we even bother, Go was simpler"). I've written production Go before and after. The honest answer is somewhere in the middle and closer to "useful for a narrower set of problems than we expected."&lt;/p&gt;

&lt;p&gt;This is a look back from someone who has shipped generic code in anger and reviewed a lot more of it. What held up. What didn't. What habits to adopt and which to avoid.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — Go generics are genuinely valuable for &lt;strong&gt;parametric operations on container-shaped types&lt;/strong&gt; — slices, maps, channels, any-key lookup tables, min/max/sum utilities. Less valuable for "clever abstractions" that dress up control flow as type magic. The clearest gains are in the standard library itself (&lt;code&gt;slices&lt;/code&gt;, &lt;code&gt;maps&lt;/code&gt;) and in domain-specific utility packages. Most application code didn't need generics before and doesn't need them after. The mistake is not using generics; it's using them for things interfaces already handled fine.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What Generics Actually Are
&lt;/h2&gt;

&lt;p&gt;Go generics are &lt;strong&gt;type parameters on functions and types&lt;/strong&gt;. A function like &lt;code&gt;slices.Contains&lt;/code&gt; can be written once, work for any slice element type, and still be type-checked at compile time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;Contains&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;S&lt;/span&gt; &lt;span class="err"&gt;~&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;E&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt; &lt;span class="n"&gt;comparable&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;S&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three features you should know:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Type parameters&lt;/strong&gt;: the &lt;code&gt;[E any]&lt;/code&gt; or &lt;code&gt;[E comparable]&lt;/code&gt; in brackets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Constraints&lt;/strong&gt;: tell the compiler what operations the type parameter supports. &lt;code&gt;any&lt;/code&gt;, &lt;code&gt;comparable&lt;/code&gt;, or custom interfaces like &lt;code&gt;constraints.Ordered&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Approximate constraints&lt;/strong&gt;: &lt;code&gt;~[]E&lt;/code&gt; means "any type whose underlying type is &lt;code&gt;[]E&lt;/code&gt;" — lets you be flexible about named slice types (a short example follows this list).&lt;/li&gt;
&lt;/ul&gt;
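
&lt;p&gt;A quick illustration of the third point, reusing the &lt;code&gt;Contains&lt;/code&gt; function from above (the &lt;code&gt;UserIDs&lt;/code&gt; type is invented for the example):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;type UserIDs []string // a named slice type

func hasAlice(ids UserIDs) bool {
    // Because the constraint is ~[]E, S is inferred as UserIDs and E as string.
    // No conversion to []string is needed at the call site.
    return Contains(ids, "alice")
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;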

&lt;p&gt;What they aren't: Java-style wildcards, C++ SFINAE, or anything that mimics variance. The design is deliberately narrower than most prior languages. It's more like Rust's generics, minus the trait system's complexity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Generics Clearly Win
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Standard-library style container and utility functions
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;slices&lt;/code&gt; and &lt;code&gt;maps&lt;/code&gt; packages in the standard library are the canonical example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;slices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"alice"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;slices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;numbers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;maps&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Keys&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;maps&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before generics, these were either hand-written per-type (tedious, error-prone), done via &lt;code&gt;interface{}&lt;/code&gt; (type-unsafe, slow), or done via &lt;code&gt;reflect&lt;/code&gt; (slow and error-prone). Generics are strictly better for these.&lt;/p&gt;

&lt;p&gt;The same pattern shows up in third-party libraries: &lt;code&gt;samber/lo&lt;/code&gt; (JS-style utilities), &lt;code&gt;thoas/go-funk&lt;/code&gt; (functional helpers), and many domain-specific ones. If you reach for lodash-style helpers in JavaScript, you'll want something similar in Go, and generics made that workable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Concurrency helpers
&lt;/h3&gt;

&lt;p&gt;Generic worker pools, futures, result types — these all benefit from generics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Future&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;done&lt;/span&gt; &lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="n"&gt;val&lt;/span&gt;  &lt;span class="n"&gt;T&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt;  &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Future&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before generics, you'd have had an &lt;code&gt;interface{}&lt;/code&gt; return and a type assertion at the call site. Now you can express "this future produces a T" in the type. Cleaner at the boundary, safer at the call site.&lt;/p&gt;
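
&lt;p&gt;For completeness, here's roughly how such a future might be produced. The snippet above omits the constructor, so this is a sketch under that assumption; the name &lt;code&gt;Go&lt;/code&gt; is invented:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Go runs fn in its own goroutine and resolves the future with its result.
func Go[T any](fn func() (T, error)) *Future[T] {
    f := &amp;amp;Future[T]{done: make(chan struct{})}
    go func() {
        f.val, f.err = fn() // fill in the result
        close(f.done)       // unblock every Get
    }()
    return f
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A call site then reads &lt;code&gt;fut := Go(func() (*User, error) { return loadUser(id) })&lt;/code&gt; followed by &lt;code&gt;u, err := fut.Get()&lt;/code&gt;, where &lt;code&gt;u&lt;/code&gt; is statically a &lt;code&gt;*User&lt;/code&gt; with no type assertion (&lt;code&gt;User&lt;/code&gt; and &lt;code&gt;loadUser&lt;/code&gt; are placeholders).&lt;/p&gt;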

&lt;h3&gt;
  
  
  Typed collections
&lt;/h3&gt;

&lt;p&gt;If your system has a genuinely typed container use case — say, an ordered map keyed by a domain ID — generics let you write it once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;OrderedMap&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;K&lt;/span&gt; &lt;span class="n"&gt;comparable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt; &lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;order&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;K&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt;  &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="n"&gt;V&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a rare case where "custom generic container" is the right tool. The majority of code doesn't need this. But when you do need it, the generics version is much better than the &lt;code&gt;interface{}&lt;/code&gt; alternative.&lt;/p&gt;
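
&lt;p&gt;A couple of methods, sketched to show how the type parameters flow through (not a full implementation):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;func NewOrderedMap[K comparable, V any]() *OrderedMap[K, V] {
    return &amp;amp;OrderedMap[K, V]{data: make(map[K]V)}
}

func (m *OrderedMap[K, V]) Set(k K, v V) {
    if _, exists := m.data[k]; !exists {
        m.order = append(m.order, k) // remember first-insertion order
    }
    m.data[k] = v
}

func (m *OrderedMap[K, V]) Get(k K) (V, bool) {
    v, ok := m.data[k]
    return v, ok
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;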

&lt;h3&gt;
  
  
  Numerical / algorithmic code
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;constraints.Ordered&lt;/code&gt; (or its post-1.21 replacement &lt;code&gt;cmp.Ordered&lt;/code&gt;) is the key constraint for "works for any numeric or ordered type":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;Max&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;cmp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Ordered&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Math helpers, min/max, sum, average — all cleanly generic. Readable, type-safe, performant.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Generics Don't Help, Or Hurt
&lt;/h2&gt;

&lt;h3&gt;
  
  
  "Generic services" and similar framework-y code
&lt;/h3&gt;

&lt;p&gt;I've seen codebases where someone wrote a generic "repository" type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Repository&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Repository&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="n"&gt;FindByID&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Repository&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="n"&gt;Save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The instinct — "all repositories do the same thing" — is mostly wrong. Real repositories differ in query shape, error cases, caching rules, transaction boundaries. Forcing them behind a generic interface either (a) produces a lowest-common-denominator API that doesn't fit any actual use, or (b) gets so many type parameters that readability collapses.&lt;/p&gt;

&lt;p&gt;The Go idiom is usually better: one non-generic &lt;code&gt;UserRepository&lt;/code&gt;, one &lt;code&gt;OrderRepository&lt;/code&gt;, etc. Each concrete, each tuned to its domain.&lt;/p&gt;

&lt;h3&gt;
  
  
  Over-constrained helpers
&lt;/h3&gt;

&lt;p&gt;If your "generic" function has five type parameters with custom constraints each, readability dies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;Complicated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;comparable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;K&lt;/span&gt; &lt;span class="n"&gt;Hashable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;V&lt;/span&gt; &lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;F&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;M&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is technically legal. Reading it, you realize it's a glorified map-with-cache-and-error. Interfaces or function types would have been clearer. Generics don't make complex APIs simple; they just let you make them complex in a type-checked way.&lt;/p&gt;
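
&lt;p&gt;For contrast, one way the same job could be written with fewer type parameters and a plain function argument. It's a sketch with invented names, not a drop-in replacement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Two type parameters instead of five; the cache is a plain map the caller owns.
func ProcessAll[T comparable, V any](items []T, cache map[T]V, f func(T) (V, error)) error {
    for _, item := range items {
        if _, ok := cache[item]; ok {
            continue // already computed
        }
        v, err := f(item)
        if err != nil {
            return err
        }
        cache[item] = v
    }
    return nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;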

&lt;h3&gt;
  
  
  Behavioral polymorphism
&lt;/h3&gt;

&lt;p&gt;Interfaces are still the right tool when different types have &lt;strong&gt;different behavior&lt;/strong&gt;. A generic &lt;code&gt;Process[T any](x T) error&lt;/code&gt; doesn't help if you actually want different logic per type. You want an interface with a method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Good use of interface&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Processor&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Bad use of generics&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;ProcessGeneric&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// can't actually differentiate behavior&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The separation: &lt;strong&gt;generics for parametric operations (same logic, any type), interfaces for polymorphic behavior (different logic per type).&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance: Usually a Wash
&lt;/h2&gt;

&lt;p&gt;The performance story is more nuanced than either "generics are slow" or "generics are free."&lt;/p&gt;

&lt;p&gt;Go's current generic implementation uses &lt;strong&gt;GCShape stenciling&lt;/strong&gt; — one compiled version per "GC shape" (roughly, per memory layout). This is between full monomorphization (one version per type, like Rust) and type-erased dispatch (one version total, as with Java's erasure).&lt;/p&gt;

&lt;p&gt;Practical implications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Small primitive types (int, int64)&lt;/strong&gt; often get specialized versions. Competitive with hand-written.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pointer and interface types&lt;/strong&gt; share one compiled version, since they share a GC shape. Slightly slower than a hand-specialized version but usually faster than interface-based dispatch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Call overhead is similar to function calls&lt;/strong&gt;, not interface dispatch. No devirtualization issue.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compile times increase&lt;/strong&gt;, especially for libraries with many instantiations. This is the real cost.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Benchmarks I've seen: generic versions are within 5% of hand-written equivalents, and consistently faster than &lt;code&gt;interface{}&lt;/code&gt;-based alternatives. Performance is almost never the deciding factor — readability and design fit matter more.&lt;/p&gt;
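
&lt;p&gt;If you want to check that claim on your own code, a benchmark pair like the following is enough (a sketch: &lt;code&gt;Max&lt;/code&gt; is the generic version from earlier, &lt;code&gt;maxIface&lt;/code&gt; an invented &lt;code&gt;interface{}&lt;/code&gt;-based stand-in):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;func maxIface(a, b interface{}) interface{} {
    if a.(int) &amp;gt; b.(int) {
        return a
    }
    return b
}

func BenchmarkMaxGeneric(b *testing.B) {
    for i := 0; i &amp;lt; b.N; i++ {
        _ = Max(i, i+1)
    }
}

func BenchmarkMaxIface(b *testing.B) {
    for i := 0; i &amp;lt; b.N; i++ {
        _ = maxIface(i, i+1).(int)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;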

&lt;h2&gt;
  
  
  Idioms That Emerged
&lt;/h2&gt;

&lt;p&gt;Over the years since 1.18, a few conventions have stuck:&lt;/p&gt;

&lt;h3&gt;
  
  
  Prefer &lt;code&gt;any&lt;/code&gt; to &lt;code&gt;interface{}&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;any&lt;/code&gt; is a type alias for &lt;code&gt;interface{}&lt;/code&gt; added in 1.18. Shorter, clearer. Use it everywhere.&lt;/p&gt;

&lt;h3&gt;
  
  
  Single-letter type parameters for simple cases, descriptive for complex
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;T&lt;/code&gt;, &lt;code&gt;K&lt;/code&gt;, &lt;code&gt;V&lt;/code&gt; for the obvious cases. More descriptive when the role is specific:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;Reduce&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;In&lt;/span&gt; &lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt; &lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;In&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;In&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;initial&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Out&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
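
&lt;p&gt;A body for that signature, plus a call site, to show where the descriptive names pay off (&lt;code&gt;Order&lt;/code&gt; and &lt;code&gt;orders&lt;/code&gt; are hypothetical):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;func Reduce[In any, Out any](items []In, f func(Out, In) Out, initial Out) Out {
    acc := initial
    for _, item := range items {
        acc = f(acc, item)
    }
    return acc
}

// Usage: Out is inferred from the initial argument, so the accumulator type
// is explicit at the call site.
//   total := Reduce(orders, func(sum int, o Order) int { return sum + o.Cents }, 0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;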



&lt;h3&gt;
  
  
  Put constraints in a dedicated package
&lt;/h3&gt;

&lt;p&gt;If you have several custom constraints, group them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;constraints&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Ordered&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="err"&gt;~&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="err"&gt;~&lt;/span&gt;&lt;span class="kt"&gt;int64&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="err"&gt;~&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="err"&gt;~&lt;/span&gt;&lt;span class="kt"&gt;float64&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Numeric&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="err"&gt;~&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="err"&gt;~&lt;/span&gt;&lt;span class="kt"&gt;int64&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="err"&gt;~&lt;/span&gt;&lt;span class="kt"&gt;float64&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The standard &lt;code&gt;golang.org/x/exp/constraints&lt;/code&gt; (and later &lt;code&gt;cmp.Ordered&lt;/code&gt; in 1.21) set the pattern.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use &lt;code&gt;~T&lt;/code&gt; approximations for flexibility
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;~[]E&lt;/code&gt; includes named slice types. &lt;code&gt;~int&lt;/code&gt; includes &lt;code&gt;type MyInt int&lt;/code&gt;. Almost always the right choice for generic parametric code: callers with named types aren't locked out, while the constraint still pins down the underlying type the function can rely on.&lt;/p&gt;

&lt;h3&gt;
  
  
  Never overload generic helpers to do too much
&lt;/h3&gt;

&lt;p&gt;Each generic function should do one parametric thing. Generic helpers that try to be many things at once collapse under type-parameter weight.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Standard Library Won
&lt;/h2&gt;

&lt;p&gt;The clearest vindication of Go generics is what happened to the standard library. &lt;code&gt;slices&lt;/code&gt;, &lt;code&gt;maps&lt;/code&gt;, &lt;code&gt;cmp.Ordered&lt;/code&gt; — these additions are uncontroversially better than the pre-1.18 alternatives. A lot of code that used to be hand-rolled or based on &lt;code&gt;sort.Interface&lt;/code&gt; has cleaner replacements.&lt;/p&gt;

&lt;p&gt;The user-land picture is more mixed. Libraries that benefit from generics genuinely use them well (&lt;code&gt;samber/lo&lt;/code&gt;, &lt;code&gt;kelindar/column&lt;/code&gt;, many others). Libraries that don't need them mostly haven't been retrofitted with them.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Do Now
&lt;/h2&gt;

&lt;p&gt;A few simple rules I apply:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Prefer standard library generic helpers over hand-rolled.&lt;/strong&gt; &lt;code&gt;slices.Contains&lt;/code&gt;, &lt;code&gt;slices.Sort&lt;/code&gt;, &lt;code&gt;maps.Keys&lt;/code&gt; — use them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Write a generic helper only when I have at least two concrete use cases for it.&lt;/strong&gt; One use case is a pattern waiting to be born, not necessarily a generic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prefer functions to methods on generic types&lt;/strong&gt; when possible. Methods have more friction: they can't declare additional type parameters of their own, and you can't add methods outside the defining package.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep constraints simple.&lt;/strong&gt; &lt;code&gt;any&lt;/code&gt;, &lt;code&gt;comparable&lt;/code&gt;, &lt;code&gt;cmp.Ordered&lt;/code&gt;, and domain-specific single-type-union constraints cover 95% of cases. More complex constraints usually mean the abstraction is wrong.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Never turn interfaces into generics just because you can.&lt;/strong&gt; If the types have genuinely different behavior, an interface is right.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Where Generics Actually Sit Now
&lt;/h2&gt;

&lt;p&gt;Generics were oversold before they landed ("Go finally becomes a real language!") and over-applied in the aftermath ("generics everywhere!"). The truth is narrower and more boring: they're a useful addition for a specific class of problems, mostly centered on parametric operations over containers and numerics. They improved the standard library. They haven't changed the shape of most Go code.&lt;/p&gt;

&lt;p&gt;If you've been writing Go and wondering whether you're missing out by not using generics, the answer is almost certainly no. Code without them is still idiomatic. Code with them, when the use case fits, is cleaner. Neither is dominant. Both are fine.&lt;/p&gt;

&lt;p&gt;The one concrete thing I'd say: &lt;strong&gt;learn the generic parts of the standard library&lt;/strong&gt;. &lt;code&gt;slices&lt;/code&gt;, &lt;code&gt;maps&lt;/code&gt;, &lt;code&gt;cmp.Ordered&lt;/code&gt;. Use them reflexively. Stop hand-rolling &lt;code&gt;indexOf&lt;/code&gt; and &lt;code&gt;contains&lt;/code&gt;. Everything else can wait until you have a real problem that generics solve.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-profiling-pprof-escape-analysis-inlining/" rel="noopener noreferrer"&gt;Go Profiling in Anger: pprof, Escape Analysis, and Inlining Without Magic&lt;/a&gt; — the performance toolchain that tells you whether your generic code actually matches the hand-written version.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-sync-pool-buffer-reuse-when-it-helps/" rel="noopener noreferrer"&gt;sync.Pool in Go: When It Actually Helps, and When It Quietly Hurts&lt;/a&gt; — another feature most commonly misapplied.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/scale-up-scale-out-every-language-wins-somewhere/" rel="noopener noreferrer"&gt;Scale-Up vs Scale-Out: Why Every Language Wins Somewhere&lt;/a&gt; — the meta-question behind every language-feature debate.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>generics</category>
      <category>typeparameters</category>
    </item>
    <item>
      <title>Go Profiling in Anger: pprof, Escape Analysis, and Inlining Without Magic</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Mon, 20 Apr 2026 16:29:23 +0000</pubDate>
      <link>https://forem.com/harrisonsec/go-profiling-in-anger-pprof-escape-analysis-and-inlining-without-magic-3ij</link>
      <guid>https://forem.com/harrisonsec/go-profiling-in-anger-pprof-escape-analysis-and-inlining-without-magic-3ij</guid>
      <description>&lt;p&gt;Go's performance culture has a ritual quality. "Use sync.Pool." "Avoid interface boxing." "Preallocate slices." Copy-pasted from blog posts and applied without measurement. Sometimes helpful. Often hollow.&lt;/p&gt;

&lt;p&gt;The honest answer is that Go performance work is mostly &lt;strong&gt;just profiling&lt;/strong&gt;. Good profiling tells you what's actually slow. Bad profiling — or no profiling — leaves you guessing. The toolchain that Go ships with is genuinely excellent; more engineers should use it, and fewer should follow checklist optimizations they haven't measured.&lt;/p&gt;

&lt;p&gt;This is a practical, end-to-end guide to pprof, escape analysis, and inlining — the three Go-specific tools that answer most performance questions.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — Start every Go perf investigation with a CPU pprof of the hot path under realistic load. 80% of issues are obvious in the flame graph. For the remaining 20%, add a heap profile and look for allocation pressure driving GC. Only after you've localized the problem with real data should you reach for micro-optimizations: escape analysis via &lt;code&gt;-gcflags='-m'&lt;/code&gt;, inlining hints, and targeted benchmark-driven rewrites. Skip the profile step, and you are optimizing the wrong thing.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Investigation Flow
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IFRECiAgICBTdGFydChbUGVyZm9ybWFuY2UgY29uY2Vybl0pIC0tPiBDUFVbVGFrZSBDUFUgcHJvZmlsZTxici8-LWh0dHAgcHByb2YgwrcgMzBzIHVuZGVyIGxvYWRdCiAgICBDUFUgLS0-IEhvdHtIb3QgY29kZTxici8-b2J2aW91cz99CiAgICBIb3QgLS0-fFllc3wgRml4MVtGaXggdGhlIGhvdCBwYXRoIMK3IHJlLW1lYXN1cmVdCiAgICBIb3QgLS0-fE5vIMK3IEdDIGhpZ2h8IEhlYXBbVGFrZSBoZWFwIC8gYWxsb2MgcHJvZmlsZV0KICAgIEhlYXAgLS0-IEFsbG9jU2l0ZXtTcGVjaWZpYzxici8-YWxsb2Mgc2l0ZT99CiAgICBBbGxvY1NpdGUgLS0-fFllc3wgRXNjYXBlW0NoZWNrIC1nY2ZsYWdzPSctbSc8YnIvPmZvciB0aGF0IGZ1bmN0aW9uXQogICAgQWxsb2NTaXRlIC0tPnxOb3wgQmVuY2hNaWNyb1tJc29sYXRlIGluIGJlbmNobWFyazxici8-LWJlbmNobWVtIMK3IC1jb3VudD01XQogICAgRXNjYXBlIC0tPiBGaXgyW0ZpeCBhbGxvYyDCtyByZS1tZWFzdXJlXQogICAgQmVuY2hNaWNybyAtLT4gRml4M1tPcHRpbWl6ZSBvciBhY2NlcHRdCiAgICBGaXgxIC0tPiBWZXJpZnlbUHJvZmlsZSBhZ2FpbiDCtyBjb25maXJtXQogICAgRml4MiAtLT4gVmVyaWZ5CiAgICBGaXgzIC0tPiBWZXJpZnkKCiAgICBjbGFzc0RlZiBzdGFydCBmaWxsOiNlOGY0Zjgsc3Ryb2tlOiMyYzUyODIKICAgIGNsYXNzRGVmIGFjdGlvbiBmaWxsOiNmMGZmZjQsc3Ryb2tlOiMyZjg1NWEKICAgIGNsYXNzRGVmIHZlcmlmeSBmaWxsOiNmZWY1ZTcsc3Ryb2tlOiNiNzc5MWYKICAgIGNsYXNzIFN0YXJ0IHN0YXJ0CiAgICBjbGFzcyBGaXgxLEZpeDIsRml4MyBhY3Rpb24KICAgIGNsYXNzIFZlcmlmeSB2ZXJpZnk%3D" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IFRECiAgICBTdGFydChbUGVyZm9ybWFuY2UgY29uY2Vybl0pIC0tPiBDUFVbVGFrZSBDUFUgcHJvZmlsZTxici8-LWh0dHAgcHByb2YgwrcgMzBzIHVuZGVyIGxvYWRdCiAgICBDUFUgLS0-IEhvdHtIb3QgY29kZTxici8-b2J2aW91cz99CiAgICBIb3QgLS0-fFllc3wgRml4MVtGaXggdGhlIGhvdCBwYXRoIMK3IHJlLW1lYXN1cmVdCiAgICBIb3QgLS0-fE5vIMK3IEdDIGhpZ2h8IEhlYXBbVGFrZSBoZWFwIC8gYWxsb2MgcHJvZmlsZV0KICAgIEhlYXAgLS0-IEFsbG9jU2l0ZXtTcGVjaWZpYzxici8-YWxsb2Mgc2l0ZT99CiAgICBBbGxvY1NpdGUgLS0-fFllc3wgRXNjYXBlW0NoZWNrIC1nY2ZsYWdzPSctbSc8YnIvPmZvciB0aGF0IGZ1bmN0aW9uXQogICAgQWxsb2NTaXRlIC0tPnxOb3wgQmVuY2hNaWNyb1tJc29sYXRlIGluIGJlbmNobWFyazxici8-LWJlbmNobWVtIMK3IC1jb3VudD01XQogICAgRXNjYXBlIC0tPiBGaXgyW0ZpeCBhbGxvYyDCtyByZS1tZWFzdXJlXQogICAgQmVuY2hNaWNybyAtLT4gRml4M1tPcHRpbWl6ZSBvciBhY2NlcHRdCiAgICBGaXgxIC0tPiBWZXJpZnlbUHJvZmlsZSBhZ2FpbiDCtyBjb25maXJtXQogICAgRml4MiAtLT4gVmVyaWZ5CiAgICBGaXgzIC0tPiBWZXJpZnkKCiAgICBjbGFzc0RlZiBzdGFydCBmaWxsOiNlOGY0Zjgsc3Ryb2tlOiMyYzUyODIKICAgIGNsYXNzRGVmIGFjdGlvbiBmaWxsOiNmMGZmZjQsc3Ryb2tlOiMyZjg1NWEKICAgIGNsYXNzRGVmIHZlcmlmeSBmaWxsOiNmZWY1ZTcsc3Ryb2tlOiNiNzc5MWYKICAgIGNsYXNzIFN0YXJ0IHN0YXJ0CiAgICBjbGFzcyBGaXgxLEZpeDIsRml4MyBhY3Rpb24KICAgIGNsYXNzIFZlcmlmeSB2ZXJpZnk%3D" alt="Start([Performance concern]) --&amp;gt; CPU[Take CPU profile&amp;lt;br/&amp;gt;-http pprof · 30s under load]" width="803" height="1086"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  CPU Profiling: The First Thing, Always
&lt;/h2&gt;

&lt;p&gt;Every Go binary can expose a pprof HTTP endpoint in two lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="s"&gt;"net/http/pprof"&lt;/span&gt;
&lt;span class="c"&gt;// later&lt;/span&gt;
&lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ListenAndServe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"localhost:6060"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under load, grab a CPU profile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go tool pprof &lt;span class="nt"&gt;-http&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;:9999 http://localhost:6060/debug/pprof/profile?seconds&lt;span class="o"&gt;=&lt;/span&gt;30
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This opens a flame graph in your browser. The wide blocks are where CPU time is spent. Usually the answer is immediate — "oh, JSON encoding is 40% of my CPU; let me switch to a faster encoder." Or "regex compilation is in the hot path because someone forgot to pre-compile."&lt;/p&gt;

&lt;p&gt;A few things that look surprising on first profile but shouldn't:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;runtime.mallocgc&lt;/code&gt; taking 10%+&lt;/strong&gt; is GC pressure. You're allocating a lot. Look at heap profile next.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;runtime.schedule&lt;/code&gt; or &lt;code&gt;runtime.findrunnable&lt;/code&gt; taking 5%+&lt;/strong&gt; means you have too many goroutines churning. Check if you're spawning per-request.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;syscall.Syscall&lt;/code&gt; high&lt;/strong&gt; means you're system-call-heavy — usually I/O. Either buffer/batch, or consider epoll-direct if it's in your hot path.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;mutex.Lock&lt;/code&gt; visible&lt;/strong&gt; means contention. Either shrink the lock hold time or shard the lock.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Don't guess your way through these. Click into each, read the stack, find the user code that caused it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Heap Profiling: When CPU Points to GC
&lt;/h2&gt;

&lt;p&gt;If &lt;code&gt;runtime.mallocgc&lt;/code&gt; shows up in your CPU profile as a non-trivial chunk, heap profile tells you why:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go tool pprof &lt;span class="nt"&gt;-http&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;:9999 http://localhost:6060/debug/pprof/heap
&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go tool pprof &lt;span class="nt"&gt;-http&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;:9999 http://localhost:6060/debug/pprof/allocs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;heap&lt;/code&gt; shows current memory usage. &lt;code&gt;allocs&lt;/code&gt; shows cumulative allocations since program start — this is usually what you want to optimize.&lt;/p&gt;

&lt;p&gt;In the flame graph, look for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Specific allocation sites taking disproportionate share.&lt;/strong&gt; A single line of code creating 50% of allocations is an obvious target.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calls to &lt;code&gt;makeslice&lt;/code&gt;, &lt;code&gt;makemap&lt;/code&gt;, &lt;code&gt;newobject&lt;/code&gt;&lt;/strong&gt; with known-size inputs. If you know the size, preallocate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interface boxing in hot paths.&lt;/strong&gt; Every time you pass a concrete type through an &lt;code&gt;interface{}&lt;/code&gt; argument in a tight loop, the runtime may heap-allocate the boxed value.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;String concatenation with &lt;code&gt;+&lt;/code&gt;.&lt;/strong&gt; This is the textbook preventable allocation — use &lt;code&gt;strings.Builder&lt;/code&gt; (a before/after sketch follows this list).&lt;/li&gt;
&lt;/ul&gt;
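
&lt;p&gt;The last point in practice, as a small before/after sketch (function names are invented):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Before: every += copies the whole string so far into a new allocation.
func joinPlus(parts []string) string {
    s := ""
    for _, p := range parts {
        s += p
    }
    return s
}

// After: one growable buffer, typically a couple of allocations total.
func joinBuilder(parts []string) string {
    var b strings.Builder
    for _, p := range parts {
        b.WriteString(p)
    }
    return b.String()
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;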

&lt;p&gt;The goal isn't "zero allocations" — that's usually not practical. The goal is "allocations per operation in a tight, repeated path are bounded and understood."&lt;/p&gt;

&lt;h2&gt;
  
  
  Escape Analysis: The Compiler's Story
&lt;/h2&gt;

&lt;p&gt;Go's compiler decides at compile time whether a variable lives on the stack (essentially free, reclaimed when the function returns) or the heap (allocated, GC-tracked). This is called escape analysis.&lt;/p&gt;

&lt;p&gt;To see the analysis for your code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go build &lt;span class="nt"&gt;-gcflags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'-m'&lt;/span&gt; ./...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./foo.go:12:6: can inline hotFunction
./foo.go:15:10: &amp;amp;Thing{} escapes to heap
./foo.go:18:14: make([]int, 100) does not escape
./foo.go:22:6: parameter "x" escapes to heap
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key things to read for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;escapes to heap&lt;/code&gt;&lt;/strong&gt; — this allocation is heap-allocated. If it's in a hot path, investigate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;does not escape&lt;/code&gt;&lt;/strong&gt; — stack-allocated, free. You want most short-lived locals to do this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;parameter escapes to heap&lt;/code&gt;&lt;/strong&gt; — the caller's passed value escapes because this function keeps a reference to it. Often fixable by taking a copy or not storing a reference.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The most common surprise: &lt;strong&gt;passing a value to a function that eventually hands it to &lt;code&gt;interface{}&lt;/code&gt; causes the value to escape&lt;/strong&gt;. A pattern like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;handleRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"got request"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// req.ID boxes to interface{} and may escape&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;req.ID&lt;/code&gt; escapes because of the &lt;code&gt;...interface{}&lt;/code&gt; argument. In a tight path, this is measurable. Fix: use a typed logger that takes concrete types, or accept the cost because logging on the hot path is usually not the hot path.&lt;/p&gt;
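
&lt;p&gt;As one concrete option for the typed-logger route (assuming &lt;code&gt;req.ID&lt;/code&gt; is an &lt;code&gt;int64&lt;/code&gt;): &lt;code&gt;log/slog&lt;/code&gt;'s typed attribute constructors carry small values without pushing them through &lt;code&gt;...interface{}&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;import (
    "context"
    "log/slog"
)

func handleRequest(ctx context.Context, req *Request) {
    // slog.Int64 packs the value into a slog.Value without heap-allocating it.
    slog.LogAttrs(ctx, slog.LevelInfo, "got request", slog.Int64("id", req.ID))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;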

&lt;p&gt;Escape analysis is one of those things where reading the output a few times is worth it. You start seeing your code differently.&lt;/p&gt;

&lt;h2&gt;
  
  
  Inlining: When the Compiler Eliminates the Call
&lt;/h2&gt;

&lt;p&gt;Go's compiler inlines small functions to avoid call overhead. Seeing what got inlined:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go build &lt;span class="nt"&gt;-gcflags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'-m'&lt;/span&gt; ./... 2&amp;gt;&amp;amp;1 | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s1"&gt;'can inline|cannot inline'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./foo.go:12:6: can inline hotFunction
./foo.go:18:6: cannot inline bigFunction: function too complex: cost 117 exceeds budget 80
./foo.go:22:6: cannot inline interfacingFunction: call to unknown method
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Default budget is 80 AST nodes. Hard blockers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Calls through interfaces.&lt;/strong&gt; The compiler doesn't know what concrete method gets called. No inlining.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calls to functions that contain loops with &lt;code&gt;for range&lt;/code&gt; over a channel.&lt;/strong&gt; Historically blocked, though the mid-stack inliner has improved this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recursive functions.&lt;/strong&gt; Obvious.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Functions over the budget.&lt;/strong&gt; Refactor smaller if the call is hot.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When to care:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Never in normal code. Go inlines what it can; your code runs.&lt;/li&gt;
&lt;li&gt;Sometimes in tight hot loops where the call overhead is 10%+ of the total work. Benchmark shows it.&lt;/li&gt;
&lt;li&gt;Occasionally when you control an interface boundary and can replace it with a concrete type on a hot path.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Don't structure your code around inlining. Code readability beats hypothetical call-overhead wins in nearly every case.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmarks: The Ground Truth
&lt;/h2&gt;

&lt;p&gt;Every perf claim should be backed by a benchmark. &lt;code&gt;testing.B&lt;/code&gt; is the tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;BenchmarkEncodeResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;newResponse&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReportAllocs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResetTimer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="nt"&gt;-bench&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;BenchmarkEncode &lt;span class="nt"&gt;-benchmem&lt;/span&gt; &lt;span class="nt"&gt;-count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;-count=5&lt;/code&gt; runs each bench 5 times, so you can compare variance. Don't trust a single run. Hardware, OS scheduling, thermals — all add noise.&lt;/p&gt;

&lt;p&gt;For comparing two implementations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="nt"&gt;-bench&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;BenchmarkEncodeResponse &lt;span class="nt"&gt;-benchmem&lt;/span&gt; &lt;span class="nt"&gt;-count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10 ./... &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; old.txt
&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;change code&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="nt"&gt;-bench&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;BenchmarkEncodeResponse &lt;span class="nt"&gt;-benchmem&lt;/span&gt; &lt;span class="nt"&gt;-count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10 ./... &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; new.txt
&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;benchstat old.txt new.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;benchstat&lt;/code&gt; (&lt;code&gt;golang.org/x/perf/cmd/benchstat&lt;/code&gt;) gives you statistical significance. If the difference isn't statistically meaningful, you didn't actually improve anything — you just rolled the dice differently.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 80/20 of Go Performance
&lt;/h2&gt;

&lt;p&gt;After enough of this work, a few patterns dominate the real wins:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Query shape, not language.&lt;/strong&gt; A slow endpoint is usually doing 10 DB queries when it could do 1. Go is almost never the bottleneck; the data layer is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network hop count.&lt;/strong&gt; Every inter-service call adds latency. Merging two small services or co-locating tight integrations beats any language-level optimization.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching at the right layer.&lt;/strong&gt; A well-placed LRU cache saves more than micro-optimizing the uncached path.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Preallocating known-size slices/maps.&lt;/strong&gt; &lt;code&gt;make([]int, 0, n)&lt;/code&gt;, when you know &lt;code&gt;n&lt;/code&gt;, is almost free. The default &lt;code&gt;make([]int, 0)&lt;/code&gt; reallocates repeatedly as you append (see the sketch after this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoiding interface boxing in loops.&lt;/strong&gt; This is the one micro-optimization that regularly shows up in real profiles.&lt;/li&gt;
&lt;/ol&gt;
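
&lt;p&gt;Point 4 in code, as a quick sketch (function names invented):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Capacity 0: append re-allocates and copies as the slice grows.
func doubleGrow(inputs []int) []int {
    out := make([]int, 0)
    for _, v := range inputs {
        out = append(out, v*2)
    }
    return out
}

// Known final size: one allocation up front, no copying.
func doublePrealloc(inputs []int) []int {
    out := make([]int, 0, len(inputs))
    for _, v := range inputs {
        out = append(out, v*2)
    }
    return out
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;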

&lt;p&gt;Everything else — &lt;code&gt;sync.Pool&lt;/code&gt;, escape analysis hand-tuning, loop unrolling — is a long-tail optimization. Worth it when profiling tells you it is. Premature otherwise.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Habit I Recommend
&lt;/h2&gt;

&lt;p&gt;Before adding any optimization, do exactly three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Take a profile with the optimization off. Save it.&lt;/li&gt;
&lt;li&gt;Apply the optimization.&lt;/li&gt;
&lt;li&gt;Take a profile with the optimization on. Compare.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the comparison doesn't show clear improvement on the metric you cared about, revert. Do not add complexity without evidence.&lt;/p&gt;

&lt;p&gt;This sounds obvious. Almost nobody does it. Most perf work in Go codebases accumulates dead optimizations that add nothing or actively hurt — but nobody knows which, because nobody benchmarked.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Habit That Compounds
&lt;/h2&gt;

&lt;p&gt;Go's performance tooling is better than Go's performance culture gives it credit for. pprof, escape analysis, inlining diagnostics, and benchmarks are built in. They're precise. They tell you the truth.&lt;/p&gt;

&lt;p&gt;The reason most Go code isn't as fast as it could be isn't that Go is slow (it isn't). It's that engineers copy-paste optimizations they haven't measured, call the work done, and move on. The few engineers who profile first and optimize second write code that's actually fast — and usually simpler than the ritual-heavy version.&lt;/p&gt;

&lt;p&gt;Profile first. Everything else follows.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-sync-pool-buffer-reuse-when-it-helps/" rel="noopener noreferrer"&gt;sync.Pool in Go: When It Actually Helps, and When It Quietly Hurts&lt;/a&gt; — the one Go optimization most likely to be misapplied.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-millions-connections-user-space-context-switching/" rel="noopener noreferrer"&gt;Why Go Handles Millions of Connections: User-Space Context Switching, Explained&lt;/a&gt; — understanding the runtime is the prerequisite to understanding profiles.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/testing-real-world-go-backends/" rel="noopener noreferrer"&gt;Testing Real-World Go Backends Isn't What Many People Think&lt;/a&gt; — benchmarking is the last mile of testing.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>performance</category>
      <category>pprof</category>
    </item>
    <item>
      <title>sync.Pool in Go: When It Actually Helps, and When It Quietly Hurts</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Mon, 20 Apr 2026 16:28:42 +0000</pubDate>
      <link>https://forem.com/harrisonsec/syncpool-in-go-when-it-actually-helps-and-when-it-quietly-hurts-2676</link>
      <guid>https://forem.com/harrisonsec/syncpool-in-go-when-it-actually-helps-and-when-it-quietly-hurts-2676</guid>
      <description>&lt;p&gt;&lt;code&gt;sync.Pool&lt;/code&gt; is one of those Go features that shows up prominently in "how to write fast Go" blog posts and then gets applied to everything. The result is a codebase sprinkled with pools that don't help and sometimes hurt. Most Go code I review does not need &lt;code&gt;sync.Pool&lt;/code&gt;. The code that does need it often uses it wrong.&lt;/p&gt;

&lt;p&gt;This is a working engineer's take on when pooling actually helps, when it's wasted effort, and the specific traps it creates.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — &lt;code&gt;sync.Pool&lt;/code&gt; is a GC-pressure reducer for workloads that allocate large-ish, short-lived objects at high frequency. It is not a general-purpose optimization. The cases where it clearly helps: per-request buffers in HTTP handlers, encoder/decoder instances, JSON buffers, protocol frame buffers. The cases where it hurts or is wasted: small objects, infrequent allocations, long-lived state, and any code that forgets to reset pooled items. Benchmark before and after — always.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What sync.Pool Actually Does
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;sync.Pool&lt;/code&gt; is a free-list for objects that the GC can clear. You &lt;code&gt;Get()&lt;/code&gt; an object (fresh or recycled). You use it. You &lt;code&gt;Put()&lt;/code&gt; it back. The runtime tries to give you a recycled one next time, but reserves the right to drop the whole pool on GC.&lt;/p&gt;

&lt;p&gt;Key properties:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The GC drains pools during collection.&lt;/strong&gt; (Since Go 1.13 a victim cache lets an item survive one extra cycle, but no longer than that.) This is crucial. Pools are not a long-term cache — they're a hint to the runtime that "if you're going to collect these, wait a moment in case they get reused first."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-P (per scheduler processor) local storage.&lt;/strong&gt; Most &lt;code&gt;Get()&lt;/code&gt;/&lt;code&gt;Put()&lt;/code&gt; calls hit the current P's local pool with no contention. Scaling across cores is nearly free.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No guarantees.&lt;/strong&gt; A &lt;code&gt;Get()&lt;/code&gt; might return a fresh object. A &lt;code&gt;Put()&lt;/code&gt; might be discarded if the pool is full or the GC just fired.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This design is exactly right for "reusable scratch space." It's wrong for "cached resources I need to stay around" (use a real cache instead).&lt;/p&gt;

&lt;h2&gt;
  
  
  Should You Pool This?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IFRECiAgICBTdGFydChbQ29uc2lkZXJpbmcgc3luYy5Qb29sP10pIC0tPiBRMXtIYXZlIHlvdTxici8-YmVuY2htYXJrZWQ8YnIvPi1iZW5jaG1lbT99CiAgICBRMSAtLT58Tm98IFNraXAxW0JlbmNobWFyayBmaXJzdC48YnIvPk1vc3QgY29kZSBkb2Vzbid0IG5lZWQgdGhpcy5dCiAgICBRMSAtLT58WWVzfCBRMntPYmplY3Qgc2l6ZTxici8-PiAxIEtCP30KICAgIFEyIC0tPnxObyDCtyBzbWFsbCBvYmplY3R8IFNraXAyW1Bvb2wgb3ZlcmhlYWQgZXhjZWVkczxici8-YWxsb2MgY29zdC4gVXNlICduZXcnLl0KICAgIFEyIC0tPnxZZXN8IFEze0FsbG9jYXRpb25zPGJyLz5mcmVxdWVudD88YnIvPjEwMDBzL3NlY30KICAgIFEzIC0tPnxObyDCtyByYXJlfCBTa2lwM1tHQyBoYW5kbGVzIHRoaXMgZmluZS48YnIvPlNraXAuXQogICAgUTMgLS0-fFllc3wgUTR7U2hvcnQtbGl2ZWQ8YnIvPmFuZCBlYXNpbHk8YnIvPnJlc2V0P30KICAgIFE0IC0tPnxObyDCtyBsb25nLWxpdmVkfCBTa2lwNFtVc2UgYSByZWFsIGNhY2hlPGJyLz5vciByZXNvdXJjZSBwb29sLl0KICAgIFE0IC0tPnxZZXN8IFVzZVtVc2Ugc3luYy5Qb29sLjxici8-QWx3YXlzIFJlc2V0IG9uIEdldCBhbmQgUHV0Ll0KCiAgICBjbGFzc0RlZiBza2lwIGZpbGw6I2ZlZDdkNyxzdHJva2U6I2M1MzAzMAogICAgY2xhc3NEZWYgdXNlIGZpbGw6I2YwZmZmNCxzdHJva2U6IzJmODU1YQogICAgY2xhc3MgU2tpcDEsU2tpcDIsU2tpcDMsU2tpcDQgc2tpcAogICAgY2xhc3MgVXNlIHVzZQ%3D%3D" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IFRECiAgICBTdGFydChbQ29uc2lkZXJpbmcgc3luYy5Qb29sP10pIC0tPiBRMXtIYXZlIHlvdTxici8-YmVuY2htYXJrZWQ8YnIvPi1iZW5jaG1lbT99CiAgICBRMSAtLT58Tm98IFNraXAxW0JlbmNobWFyayBmaXJzdC48YnIvPk1vc3QgY29kZSBkb2Vzbid0IG5lZWQgdGhpcy5dCiAgICBRMSAtLT58WWVzfCBRMntPYmplY3Qgc2l6ZTxici8-PiAxIEtCP30KICAgIFEyIC0tPnxObyDCtyBzbWFsbCBvYmplY3R8IFNraXAyW1Bvb2wgb3ZlcmhlYWQgZXhjZWVkczxici8-YWxsb2MgY29zdC4gVXNlICduZXcnLl0KICAgIFEyIC0tPnxZZXN8IFEze0FsbG9jYXRpb25zPGJyLz5mcmVxdWVudD88YnIvPjEwMDBzL3NlY30KICAgIFEzIC0tPnxObyDCtyByYXJlfCBTa2lwM1tHQyBoYW5kbGVzIHRoaXMgZmluZS48YnIvPlNraXAuXQogICAgUTMgLS0-fFllc3wgUTR7U2hvcnQtbGl2ZWQ8YnIvPmFuZCBlYXNpbHk8YnIvPnJlc2V0P30KICAgIFE0IC0tPnxObyDCtyBsb25nLWxpdmVkfCBTa2lwNFtVc2UgYSByZWFsIGNhY2hlPGJyLz5vciByZXNvdXJjZSBwb29sLl0KICAgIFE0IC0tPnxZZXN8IFVzZVtVc2Ugc3luYy5Qb29sLjxici8-QWx3YXlzIFJlc2V0IG9uIEdldCBhbmQgUHV0Ll0KCiAgICBjbGFzc0RlZiBza2lwIGZpbGw6I2ZlZDdkNyxzdHJva2U6I2M1MzAzMAogICAgY2xhc3NEZWYgdXNlIGZpbGw6I2YwZmZmNCxzdHJva2U6IzJmODU1YQogICAgY2xhc3MgU2tpcDEsU2tpcDIsU2tpcDMsU2tpcDQgc2tpcAogICAgY2xhc3MgVXNlIHVzZQ%3D%3D" alt="Start([Considering sync.Pool?]) --&amp;gt; Q1{Have you&amp;lt;br/&amp;gt;benchmarked&amp;lt;br/&amp;gt;-benchmem?}" width="918" height="1217"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most paths in real code exit this flow long before hitting "use". That's correct.&lt;/p&gt;

&lt;h2&gt;
  
  
  When Pooling Helps: Per-Request Buffers
&lt;/h2&gt;

&lt;p&gt;Canonical case. An HTTP handler serializes a response to a buffer, writes the buffer, moves on. The next request does the same thing. Without pooling, the GC collects the buffer every request. With pooling, the buffer is reused:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;bufferPool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewBuffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;bufferPool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;bufferPool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;

    &lt;span class="n"&gt;writeResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bytes&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under realistic load (thousands of requests per second), this typically reduces allocation pressure by 20-40% and measurably lowers GC pause times. The exact number depends on your allocation pattern, but the principle holds: &lt;strong&gt;large, frequent, short-lived allocations are exactly what pooling is for&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;What makes this the canonical case:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Buffers are big enough (4KB initial) that the allocation actually matters.&lt;/li&gt;
&lt;li&gt;They're frequent — thousands per second.&lt;/li&gt;
&lt;li&gt;Short-lived — used within one request.&lt;/li&gt;
&lt;li&gt;Easy to reset — &lt;code&gt;buf.Reset()&lt;/code&gt; clears it cleanly.&lt;/li&gt;
&lt;li&gt;Same shape every time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you see a request-scoped buffer that fits all five, pooling almost always pays.&lt;/p&gt;

&lt;h2&gt;
  
  
  When Pooling Is Wasted Effort
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Small objects.&lt;/strong&gt; Pooling a 24-byte struct with three fields is almost never worth it. The pool's own overhead (per-P lookup, interface boxing) is larger than the allocation. Benchmark to confirm — you'll see &lt;code&gt;allocs/op&lt;/code&gt; go down but &lt;code&gt;ns/op&lt;/code&gt; stay the same or go up.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Not worth it:&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Small&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;smallPool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Small&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="p"&gt;}}&lt;/span&gt;

&lt;span class="c"&gt;// Just use new(Small) or &amp;amp;Small{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Infrequent allocations.&lt;/strong&gt; If your code path runs once an hour, pooling saves nothing meaningful. The GC handles a handful of allocations just fine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long-lived state.&lt;/strong&gt; Connection objects, database handles, caches. These shouldn't be in &lt;code&gt;sync.Pool&lt;/code&gt; — they should be in a proper cache or connection pool (like &lt;code&gt;*sql.DB&lt;/code&gt;, which internally manages connections without &lt;code&gt;sync.Pool&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anything you can't reliably reset.&lt;/strong&gt; If an object has state that needs to be "returned to zero," and you can forget to zero it, you're one typo away from data leaking between requests.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Reset Trap
&lt;/h2&gt;

&lt;p&gt;The single most dangerous mistake with &lt;code&gt;sync.Pool&lt;/code&gt;: forgetting to reset the object before putting it back, or reusing it before clearing whatever was in it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Wrong:&lt;/span&gt;
&lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;responseData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// might not start empty&lt;/span&gt;
&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bytes&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// buf still has data; next caller might see it&lt;/span&gt;

&lt;span class="c"&gt;// Right:&lt;/span&gt;
&lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// ← explicit&lt;/span&gt;
&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;responseData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bytes&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This has caused real production incidents. Pooled buffers across request handlers have leaked bearer tokens, user PII, and password reset codes when a reset was missed. The runtime doesn't help — there's no "enforce reset" mechanism. You have to do it.&lt;/p&gt;

&lt;p&gt;Habits that reduce the risk:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Always pair &lt;code&gt;Get&lt;/code&gt; with a &lt;code&gt;defer Reset+Put&lt;/code&gt;&lt;/strong&gt; at the top of the function.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reset at both ends&lt;/strong&gt; (on Get and on Put) — paranoid but effective; the wrapper sketch after this list builds this in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For byte slices, shrink before return&lt;/strong&gt;: &lt;code&gt;buf.Reset()&lt;/code&gt; on a &lt;code&gt;bytes.Buffer&lt;/code&gt; resets length but keeps capacity — that's usually what you want. For a raw &lt;code&gt;[]byte&lt;/code&gt;, use &lt;code&gt;buf[:0]&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Make your &lt;code&gt;New&lt;/code&gt; function return a pre-reset object.&lt;/strong&gt; Don't assume it's always "fresh."&lt;/li&gt;
&lt;/ul&gt;
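
&lt;p&gt;One way to make the pairing hard to get wrong is a small typed wrapper around the pool. A sketch, not a standard API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// bufPool wraps sync.Pool so every Get and every Put goes through a Reset.
type bufPool struct{ p sync.Pool }

func newBufPool(capacity int) *bufPool {
    return &amp;amp;bufPool{p: sync.Pool{
        New: func() interface{} { return bytes.NewBuffer(make([]byte, 0, capacity)) },
    }}
}

func (bp *bufPool) Get() *bytes.Buffer {
    buf := bp.p.Get().(*bytes.Buffer)
    buf.Reset() // defensive: never trust a recycled buffer to arrive empty
    return buf
}

func (bp *bufPool) Put(buf *bytes.Buffer) {
    buf.Reset() // and never hand data back to the pool
    bp.p.Put(buf)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
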

&lt;h2&gt;
  
  
  The Alloc Benchmark Methodology
&lt;/h2&gt;

&lt;p&gt;The only honest way to know whether pooling is helping is &lt;code&gt;go test -bench=. -benchmem&lt;/code&gt;. Here's what a useful benchmark looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;BenchmarkWithoutPool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReportAllocs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResetTimer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewBuffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;writeResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exampleRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bytes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;BenchmarkWithPool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReportAllocs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResetTimer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;bufferPool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;writeResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exampleRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bytes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;bufferPool&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;go &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="nt"&gt;-bench&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-benchmem&lt;/span&gt;
&lt;span class="go"&gt;BenchmarkWithoutPool-10    200000    8431 ns/op    4352 B/op    3 allocs/op
BenchmarkWithPool-10       500000    3214 ns/op     128 B/op    1 allocs/op
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look for two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;allocs/op&lt;/code&gt; drops significantly&lt;/strong&gt; (here: 3 → 1).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;ns/op&lt;/code&gt; drops or stays flat&lt;/strong&gt; (here: 8431 → 3214).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If &lt;code&gt;allocs/op&lt;/code&gt; drops but &lt;code&gt;ns/op&lt;/code&gt; goes up, pooling is adding overhead without saving enough GC pressure to justify itself. That's the "wasted effort" signal.&lt;/p&gt;

&lt;p&gt;The benchmark alone isn't enough, though — you also need production evidence. pprof heap profiles before and after deployment should show reduced allocation. If the prod numbers don't match the benchmark, you're measuring the wrong thing.&lt;/p&gt;
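
&lt;p&gt;Assuming the service exposes &lt;code&gt;net/http/pprof&lt;/code&gt;, collecting that evidence looks roughly like this (host and port are placeholders):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;$ # before the deploy
$ go tool pprof -sample_index=alloc_space -top http://localhost:6060/debug/pprof/heap
$ # after the deploy: same command, compare the allocation attributed to the handler
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
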

&lt;h2&gt;
  
  
  A Pattern That Actually Works: Scoped Pools
&lt;/h2&gt;

&lt;p&gt;One pattern I've found useful: &lt;strong&gt;scope the pool to the type of work it serves&lt;/strong&gt;. Don't have one giant pool that everything pulls from.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// JSON response buffer pool&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;jsonBufPool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewBuffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Protocol frame buffer pool (different typical size)&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;frameBufPool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewBuffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;64&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="m"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why separate pools matter: if you have one shared pool, you might &lt;code&gt;Get()&lt;/code&gt; a 64KB buffer when you needed a 4KB one and waste memory. Or worse, you might &lt;code&gt;Get()&lt;/code&gt; a 4KB one for a 64KB job and grow it (defeating pooling's purpose).&lt;/p&gt;

&lt;p&gt;Separate pools stay close to their intended sizes. Each pool's items are homogeneous. The New function's initial capacity reflects the typical workload.&lt;/p&gt;
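
&lt;p&gt;A usage sketch for the JSON pool (the &lt;code&gt;writeJSON&lt;/code&gt; helper is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;func writeJSON(w http.ResponseWriter, v interface{}) error {
    buf := jsonBufPool.Get().(*bytes.Buffer)
    buf.Reset()                // reset on Get
    defer jsonBufPool.Put(buf) // back to the JSON pool, never the frame pool

    if err := json.NewEncoder(buf).Encode(v); err != nil {
        return err
    }
    w.Header().Set("Content-Type", "application/json")
    _, err := w.Write(buf.Bytes())
    return err
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
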

&lt;h2&gt;
  
  
  The Big Thing &lt;code&gt;sync.Pool&lt;/code&gt; Isn't
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;sync.Pool&lt;/code&gt; is not a replacement for bounded resource pools (database connections, HTTP clients, goroutine worker pools). Those need explicit lifecycle management, health checks, and non-discardable state. Use a real pool library for them.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sync.Pool&lt;/code&gt; is also not a cache. A cache holds items you want to find again. &lt;code&gt;sync.Pool&lt;/code&gt; holds items you might reuse if one's convenient, and discards them otherwise. Different primitive for a different problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Matters
&lt;/h2&gt;

&lt;p&gt;Most Go code is fast enough without pooling. Before adding &lt;code&gt;sync.Pool&lt;/code&gt; to your hot path, ask:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Have I actually benchmarked this with &lt;code&gt;-benchmem&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;Are the objects I'd pool both large and frequent?&lt;/li&gt;
&lt;li&gt;Can I reliably reset them?&lt;/li&gt;
&lt;li&gt;Is GC pressure in pprof profiles actually a problem?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If any answer is no, skip the pool. The simpler code is almost always the better code.&lt;/p&gt;

&lt;p&gt;The cases where pooling pays are real but narrower than internet wisdom suggests. Per-request buffers, protocol frame buffers, encoder/decoder state, crypto scratch space. Beyond that, the pool usually adds more lines of code than it saves nanoseconds — and each of those lines is one more place where a missing &lt;code&gt;Reset()&lt;/code&gt; can leak bytes between requests.&lt;/p&gt;

&lt;p&gt;Measure. Then decide.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-chan-context-structure-not-speed/" rel="noopener noreferrer"&gt;Go's Concurrency Is About Structure, Not Speed&lt;/a&gt; — the bigger principle: Go optimizes for correct structure, not raw speed.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/testing-real-world-go-backends/" rel="noopener noreferrer"&gt;Testing Real-World Go Backends Isn't What Many People Think&lt;/a&gt; — how to actually benchmark and prove a pool helps.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>performance</category>
      <category>syncpool</category>
    </item>
    <item>
      <title>IronSys: A Production Blueprint for Modern Concurrency</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Mon, 20 Apr 2026 16:28:01 +0000</pubDate>
      <link>https://forem.com/harrisonsec/ironsys-a-production-blueprint-for-modern-concurrency-8c9</link>
      <guid>https://forem.com/harrisonsec/ironsys-a-production-blueprint-for-modern-concurrency-8c9</guid>
      <description>&lt;p&gt;In the last post I walked through the four concurrency pillars — shared memory + locks, CSP, actors, STM — and argued that real systems mix them on purpose. Someone reasonably asked: &lt;em&gt;okay, but what does that actually look like?&lt;/em&gt; Fair question. Abstract taxonomy is less useful than a worked example.&lt;/p&gt;

&lt;p&gt;IronSys is that worked example. It's a composite blueprint — not a real service, but representative of a class of services I've designed, helped design, or debugged in production. Let's say it's a mid-sized backend system: public API, stateful user sessions, streaming data in, aggregation and reporting out. The kind of thing that appears in the middle of any serious platform.&lt;/p&gt;

&lt;p&gt;The interesting part isn't the features. It's which concurrency primitive shows up where, and why.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — IronSys is a composite production blueprint: a multi-service Go backend with stateful user sessions, streaming ingest, and usage aggregation. It uses CSP channels for pipelines and coordination, a goroutine-per-entity actor pattern for stateful sessions, mutexes and atomics for hot shared counters, and durable queues for cross-service handoff. Each primitive is picked for a specific failure mode. The pattern is not "mix for variety"; it's "match the primitive to the work."&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The System Shape
&lt;/h2&gt;

&lt;p&gt;Before deciding on concurrency primitives, sketch the work shapes. IronSys has four:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Public API&lt;/strong&gt; — request/response, modest concurrency, latency-sensitive. The classic HTTP backend.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live sessions&lt;/strong&gt; — stateful, long-lived per-user entities. Think multiplayer game server, collaborative editor, real-time dashboard.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming ingest&lt;/strong&gt; — high-throughput events arriving over Kafka/NATS, fanned out to workers for processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch aggregation&lt;/strong&gt; — periodic rollup jobs that read from storage, compute, write back.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Four shapes, four concurrency patterns. The wrong design would apply the same primitive to all four. The right design picks each separately.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBTaGFwZXNbIldvcmsgc2hhcGVzIl0KICAgICAgICBTMVsiMS4gUHVibGljIEFQSTxici8-c3RhdGVsZXNzIMK3IHJlcXVlc3QvcmVzcG9uc2UiXQogICAgICAgIFMyWyIyLiBMaXZlIHNlc3Npb25zPGJyLz5zdGF0ZWZ1bCDCtyBsb25nLWxpdmVkIl0KICAgICAgICBTM1siMy4gU3RyZWFtaW5nIGluZ2VzdDxici8-aGlnaCB0aHJvdWdocHV0IMK3IHN0YXRlbGVzcyJdCiAgICAgICAgUzRbIjQuIEJhdGNoIGFnZ3JlZ2F0aW9uPGJyLz5waXBlbGluZSDCtyBzY2hlZHVsZWQiXQogICAgZW5kCgogICAgc3ViZ3JhcGggUHJpbWl0aXZlc1siQ29uY3VycmVuY3kgcHJpbWl0aXZlcyJdCiAgICAgICAgUDFbIkdvcm91dGluZSArIG11dGV4PGJyLz5wZXItcmVxdWVzdCBoYW5kbGVyIl0KICAgICAgICBQMlsiR29yb3V0aW5lLXBlci1lbnRpdHk8YnIvPmFjdG9yLWxpa2UgwrcgcHJpdmF0ZSBzdGF0ZSJdCiAgICAgICAgUDNbIkJvdW5kZWQgY2hhbm5lbCArIHdvcmtlciBwb29sPGJyLz5DU1AgwrcgYmFja3ByZXNzdXJlIl0KICAgICAgICBQNFsiQ1NQIHBpcGVsaW5lICsgZXJyZ3JvdXA8YnIvPnN0YWdlZCDCtyBjYW5jZWxsYWJsZSJdCiAgICBlbmQKCiAgICBTMSAtLT4gUDEKICAgIFMyIC0tPiBQMgogICAgUzMgLS0-IFAzCiAgICBTNCAtLT4gUDQKCiAgICBjbGFzc0RlZiBzaGFwZSBmaWxsOiNlOGY0Zjgsc3Ryb2tlOiMyYzUyODIKICAgIGNsYXNzRGVmIHByaW0gZmlsbDojZjBmZmY0LHN0cm9rZTojMmY4NTVhCiAgICBjbGFzcyBTaGFwZXMgc2hhcGUKICAgIGNsYXNzIFByaW1pdGl2ZXMgcHJpbQ%3D%3D" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBTaGFwZXNbIldvcmsgc2hhcGVzIl0KICAgICAgICBTMVsiMS4gUHVibGljIEFQSTxici8-c3RhdGVsZXNzIMK3IHJlcXVlc3QvcmVzcG9uc2UiXQogICAgICAgIFMyWyIyLiBMaXZlIHNlc3Npb25zPGJyLz5zdGF0ZWZ1bCDCtyBsb25nLWxpdmVkIl0KICAgICAgICBTM1siMy4gU3RyZWFtaW5nIGluZ2VzdDxici8-aGlnaCB0aHJvdWdocHV0IMK3IHN0YXRlbGVzcyJdCiAgICAgICAgUzRbIjQuIEJhdGNoIGFnZ3JlZ2F0aW9uPGJyLz5waXBlbGluZSDCtyBzY2hlZHVsZWQiXQogICAgZW5kCgogICAgc3ViZ3JhcGggUHJpbWl0aXZlc1siQ29uY3VycmVuY3kgcHJpbWl0aXZlcyJdCiAgICAgICAgUDFbIkdvcm91dGluZSArIG11dGV4PGJyLz5wZXItcmVxdWVzdCBoYW5kbGVyIl0KICAgICAgICBQMlsiR29yb3V0aW5lLXBlci1lbnRpdHk8YnIvPmFjdG9yLWxpa2UgwrcgcHJpdmF0ZSBzdGF0ZSJdCiAgICAgICAgUDNbIkJvdW5kZWQgY2hhbm5lbCArIHdvcmtlciBwb29sPGJyLz5DU1AgwrcgYmFja3ByZXNzdXJlIl0KICAgICAgICBQNFsiQ1NQIHBpcGVsaW5lICsgZXJyZ3JvdXA8YnIvPnN0YWdlZCDCtyBjYW5jZWxsYWJsZSJdCiAgICBlbmQKCiAgICBTMSAtLT4gUDEKICAgIFMyIC0tPiBQMgogICAgUzMgLS0-IFAzCiAgICBTNCAtLT4gUDQKCiAgICBjbGFzc0RlZiBzaGFwZSBmaWxsOiNlOGY0Zjgsc3Ryb2tlOiMyYzUyODIKICAgIGNsYXNzRGVmIHByaW0gZmlsbDojZjBmZmY0LHN0cm9rZTojMmY4NTVhCiAgICBjbGFzcyBTaGFwZXMgc2hhcGUKICAgIGNsYXNzIFByaW1pdGl2ZXMgcHJpbQ%3D%3D" alt="S1[" width="686" height="596"&gt;&lt;/a&gt;stateless · request/response"]"/&amp;gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The API Handlers
&lt;/h2&gt;

&lt;p&gt;Nothing fancy. Stock Go HTTP server. Each request is its own goroutine (Go's runtime does this automatically). Shared state — rate limiters, cache, config — is protected by mutexes or atomics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;RateLimiter&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;mu&lt;/span&gt;      &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Mutex&lt;/span&gt;
    &lt;span class="n"&gt;buckets&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;RateLimiter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Allow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;buckets&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;newBucket&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;buckets&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;allow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Obvious choice. The contention is bounded by request rate, the state is small, a mutex is the simplest possible tool. Over-engineering here — sharded maps, lock-free data structures — buys nothing.&lt;/p&gt;
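
&lt;p&gt;The bucket itself isn't the interesting part, but for completeness: a minimal token-bucket sketch of what &lt;code&gt;newBucket&lt;/code&gt; and &lt;code&gt;allow&lt;/code&gt; might look like (capacity and refill rate are invented):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;const (
    bucketCap  = 10.0 // burst size
    refillRate = 5.0  // tokens added per second
)

type bucket struct {
    tokens float64
    last   time.Time
}

func newBucket() *bucket {
    return &amp;amp;bucket{tokens: bucketCap, last: time.Now()}
}

// allow runs with the RateLimiter's mutex already held, so it needs no locking of its own.
func (b *bucket) allow() bool {
    now := time.Now()
    b.tokens += now.Sub(b.last).Seconds() * refillRate
    if b.tokens &amp;gt; bucketCap {
        b.tokens = bucketCap
    }
    b.last = now
    if b.tokens &amp;lt; 1 {
        return false
    }
    b.tokens--
    return true
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
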

&lt;p&gt;What IronSys does here that many teams miss: &lt;strong&gt;every handler is context-aware from request entry&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;HandleFoo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parseReq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;writeResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Context flows everywhere downstream. The handler layer is boring; that's the point.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Live Sessions — Actor Pattern in Go
&lt;/h2&gt;

&lt;p&gt;Each active user session is a long-lived goroutine with an inbox channel. I call this the &lt;strong&gt;goroutine-per-entity pattern&lt;/strong&gt; — it's Erlang actors without the runtime, built from Go primitives.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Session&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;       &lt;span class="n"&gt;SessionID&lt;/span&gt;
    &lt;span class="n"&gt;mailbox&lt;/span&gt;  &lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;SessionCmd&lt;/span&gt;  &lt;span class="c"&gt;// the "actor" inbox&lt;/span&gt;
    &lt;span class="n"&gt;shutdown&lt;/span&gt; &lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="n"&gt;state&lt;/span&gt;    &lt;span class="n"&gt;sessionState&lt;/span&gt;      &lt;span class="c"&gt;// private to this goroutine&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;SessionCmd&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;op&lt;/span&gt;     &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt;   &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="n"&gt;reply&lt;/span&gt;  &lt;span class="k"&gt;chan&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;SessionReply&lt;/span&gt; &lt;span class="c"&gt;// optional reply channel&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;runSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="nb"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mailbox&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mailbox&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shutdown&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// persist final state&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this pattern, not "session is a struct with a mutex"?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;State is private to one goroutine&lt;/strong&gt;. No sharing, no locks, no lock-ordering bugs. The session state is accessed by exactly one execution context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serial message processing&lt;/strong&gt;. Commands process one at a time, in FIFO order. Business invariants hold naturally.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Natural location for cross-session coordination&lt;/strong&gt;. Each session is a message destination. Broadcasting to all sessions, or routing a command to a specific session, is just "send on its inbox."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clean lifecycle&lt;/strong&gt;. The goroutine runs until &lt;code&gt;shutdown&lt;/code&gt; or &lt;code&gt;ctx.Done&lt;/code&gt;. State is flushed once, on exit. No race between "is this session still alive" and "did we finish writing its state."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The manager that creates and routes to sessions looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;SessionManager&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;mu&lt;/span&gt;       &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RWMutex&lt;/span&gt;
    &lt;span class="n"&gt;sessions&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;SessionID&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;SessionManager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;SessionID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RLock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RUnlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;SessionManager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;SessionID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;newSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;runSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// supervisor goroutine&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the mixing: the manager uses a mutex-protected map (shared state with a clear owner), individual sessions use the actor pattern (isolated state, message-passing). Two primitives, picked per-job.&lt;/p&gt;

&lt;p&gt;This pattern scales to millions of sessions because goroutines are cheap. I've seen this exact pattern serve 400K concurrent sessions on a single pod.&lt;/p&gt;
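
&lt;p&gt;The &lt;code&gt;SessionCmd&lt;/code&gt; struct carries an optional reply channel. A sketch of the calling side of that request/reply, in the same package as the structs above (the &lt;code&gt;Ask&lt;/code&gt; helper and its cancellation handling are illustrative, not part of IronSys):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Ask routes a command to a session's mailbox and waits for a single reply,
// giving up if the caller's context is cancelled first.
func Ask(ctx context.Context, s *Session, op string, args interface{}) (SessionReply, error) {
    reply := make(chan SessionReply, 1) // buffered: the session never blocks on the reply
    cmd := SessionCmd{op: op, args: args, reply: reply}

    select {
    case s.mailbox &amp;lt;- cmd:
    case &amp;lt;-ctx.Done():
        var zero SessionReply
        return zero, ctx.Err()
    }

    select {
    case r := &amp;lt;-reply:
        return r, nil
    case &amp;lt;-ctx.Done():
        var zero SessionReply
        return zero, ctx.Err()
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
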

&lt;h2&gt;
  
  
  The Streaming Ingest — Bounded Worker Pool (CSP)
&lt;/h2&gt;

&lt;p&gt;Kafka consumer feeding a worker pool. Canonical CSP territory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;runConsumer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cons&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;kafka&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Consumer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;jobs&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;256&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;wg&lt;/span&gt; &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WaitGroup&lt;/span&gt;

    &lt;span class="c"&gt;// Fixed worker pool&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;workerCount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;wg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;wg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;jobs&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// Producer&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="nb"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;cons&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;jobs&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;

    &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;wg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Wait&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The bounded channel is the concurrency clamp. Kafka can push as fast as it wants; the worker pool consumes at its own pace; backpressure propagates back to Kafka's consumer offset naturally.&lt;/p&gt;

&lt;p&gt;Why not actors here? Because the work items are stateless — you're processing events, not maintaining per-entity state. The overhead of an actor (mailbox, dispatch, ownership) is unjustified. CSP is the right fit.&lt;/p&gt;

&lt;p&gt;Why not mutex + a worker loop? You could, but the channel primitive is exactly the right shape — bounded capacity + safe cross-goroutine handoff + graceful shutdown — without needing to build those three features yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Batch Aggregation — Pipelines + errgroup
&lt;/h2&gt;

&lt;p&gt;Nightly rollup: read from storage, compute per-account aggregates, write back.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;runRollup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gctx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;errgroup&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// Stage 1: parse&lt;/span&gt;
    &lt;span class="n"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;ParsedEvent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Go&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="nb"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;parseStage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c"&gt;// Stage 2: aggregate (keyed by account)&lt;/span&gt;
    &lt;span class="n"&gt;agged&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;Aggregate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Go&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="nb"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agged&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;aggregateStage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agged&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c"&gt;// Stage 3: persist&lt;/span&gt;
    &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Go&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;persistStage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agged&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Wait&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three stages in a pipeline. Each stage is a goroutine, connected by bounded channels. &lt;code&gt;errgroup&lt;/code&gt; ties them together: first error cancels the whole pipeline.&lt;/p&gt;

&lt;p&gt;The aggregation stage internally uses a map protected by a mutex. Today a single goroutine owns that map, so there's no contention and the lock is essentially free; the protection is there so the stage stays correct if a future change introduces more readers.&lt;/p&gt;

&lt;p&gt;This is textbook CSP: &lt;em&gt;the topology of channels is the architecture&lt;/em&gt;. Read the code and the shape of the computation is obvious.&lt;/p&gt;
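
&lt;p&gt;For concreteness, here's a minimal sketch of what that middle stage could look like. It assumes &lt;code&gt;ParsedEvent&lt;/code&gt; carries an &lt;code&gt;AccountID&lt;/code&gt; and that &lt;code&gt;Aggregate&lt;/code&gt; has an &lt;code&gt;Add&lt;/code&gt; method returning the updated value; both are stand-ins, not IronSys code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;func aggregateStage(ctx context.Context, in &amp;lt;-chan ParsedEvent, out chan&amp;lt;- Aggregate) error {
    var mu sync.Mutex                       // uncontended today; defensive against future readers
    byAccount := make(map[string]Aggregate) // keyed by account ID

    for {
        select {
        case ev, ok := &amp;lt;-in:
            if !ok {
                // Input closed: push the per-account results downstream.
                for _, agg := range byAccount {
                    select {
                    case out &amp;lt;- agg:
                    case &amp;lt;-ctx.Done():
                        return ctx.Err()
                    }
                }
                return nil
            }
            mu.Lock()
            byAccount[ev.AccountID] = byAccount[ev.AccountID].Add(ev) // assumed: Add folds the event in
            mu.Unlock()
        case &amp;lt;-ctx.Done():
            return ctx.Err()
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
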

&lt;h2&gt;
  
  
  The Cross-Service Handoff — Durable Queues
&lt;/h2&gt;

&lt;p&gt;IronSys talks to two other services: a billing service (async, eventually consistent) and an auth service (sync, immediate).&lt;/p&gt;

&lt;p&gt;For billing: a dedicated NATS JetStream subject with at-least-once delivery. Usage events go in one end; the billing service reads them. The emission codepath has a local write-ahead log so that if NATS is briefly down, events buffer on disk and replay when the connection recovers.&lt;/p&gt;

&lt;p&gt;For auth: gRPC with tight timeouts. Caller owns completion. If auth is slow, the API handler's deadline fires and the request fails fast.&lt;/p&gt;
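
&lt;p&gt;A minimal sketch of the auth call shape. The &lt;code&gt;authpb&lt;/code&gt; client and the 150ms budget are placeholders; the point is that the deadline rides on the caller's context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Caller owns completion: a tight deadline bounds how long the handler waits.
func checkAuth(ctx context.Context, client authpb.AuthClient, token string) (*authpb.CheckReply, error) {
    ctx, cancel := context.WithTimeout(ctx, 150*time.Millisecond)
    defer cancel()

    // If auth is slow, this returns context.DeadlineExceeded and the API
    // handler fails fast instead of queueing behind a sick dependency.
    return client.Check(ctx, &amp;amp;authpb.CheckRequest{Token: token})
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
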

&lt;p&gt;Two different ownership models for two different shapes of work. See: &lt;a href="https://harrisonsec.com/blog/rpc-vs-nats-who-owns-completion/" rel="noopener noreferrer"&gt;RPC vs NATS: Who Owns Completion&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Primitives Map
&lt;/h2&gt;

&lt;p&gt;Summarizing which primitive serves which job in IronSys:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Work shape&lt;/th&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;HTTP request handling&lt;/td&gt;
&lt;td&gt;Stock &lt;code&gt;net/http&lt;/code&gt; + goroutine per request&lt;/td&gt;
&lt;td&gt;Language default, right for stateless&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hot shared state (rate limiter, cache)&lt;/td&gt;
&lt;td&gt;Mutex / atomic&lt;/td&gt;
&lt;td&gt;Simplest primitive that works&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stateful user sessions&lt;/td&gt;
&lt;td&gt;Goroutine-per-entity (actor-like)&lt;/td&gt;
&lt;td&gt;Isolated state, message-passing, serial processing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Session directory&lt;/td&gt;
&lt;td&gt;RWMutex-protected map&lt;/td&gt;
&lt;td&gt;Shared lookup, read-heavy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Streaming event processing&lt;/td&gt;
&lt;td&gt;Bounded channel + worker pool (CSP)&lt;/td&gt;
&lt;td&gt;Backpressure, parallelism, graceful shutdown&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-stage data pipeline&lt;/td&gt;
&lt;td&gt;CSP pipeline + errgroup&lt;/td&gt;
&lt;td&gt;Stage topology = architecture; first-error cancels all&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Async cross-service handoff&lt;/td&gt;
&lt;td&gt;Durable queue (NATS JetStream / Kafka)&lt;/td&gt;
&lt;td&gt;Receiver owns completion, at-least-once delivery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sync cross-service call&lt;/td&gt;
&lt;td&gt;gRPC with ctx timeout&lt;/td&gt;
&lt;td&gt;Caller owns completion, fast failure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Notice: &lt;strong&gt;three of the four concurrency pillars show up&lt;/strong&gt;. Mutexes in the rate limiter. CSP in the event pipeline. Actors (in pattern) in the session runtime. (STM is the missing fourth; it would show up if I were doing this in Clojure or Haskell.)&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Architecture Gets Wrong
&lt;/h2&gt;

&lt;p&gt;Every architecture has weaknesses. IronSys's are real:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The actor pattern isn't real actors.&lt;/strong&gt; Without Erlang-style supervision, if a session goroutine panics, Go's default behavior is to kill the whole process. Adding per-session panic recovery is easy but not free. In practice, most teams hit this six months in, add a recovery wrapper (see the sketch after this list), and move on.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bounded channels can mask slow downstream.&lt;/strong&gt; If a channel fills up and the producer blocks, that's backpressure — great. But if the channel is buffered too large, you can buffer a lot of work into memory before realizing downstream is slow. Tune buffer sizes with measurements, not guesses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Goroutine-per-entity has a per-session baseline cost.&lt;/strong&gt; Cheap but not free. A million sessions is ~2.5GB of goroutine stacks. For services where most entities are inactive, a lazy pattern (spin up on activity, suspend to disk on idle) is better.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mixing paradigms cognitively.&lt;/strong&gt; New engineers have to learn four patterns instead of one. The productivity hit is real for the first two weeks; the payoff is in the next two years.&lt;/li&gt;
&lt;/ul&gt;
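
&lt;p&gt;For reference, a minimal per-session recovery wrapper might look like the sketch below. A production version would cap or back off restarts so a hot panic loop doesn't quietly mask a real bug.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// goSupervised runs fn in its own goroutine and restarts it only if it panics.
// Hypothetical helper, not IronSys code.
func goSupervised(ctx context.Context, name string, fn func(context.Context)) {
    var run func()
    run = func() {
        defer func() {
            if r := recover(); r != nil {
                log.Printf("session %s panicked: %v (restarting)", name, r)
                if ctx.Err() == nil {
                    go run() // restart; cap or back off in real code
                }
            }
        }()
        fn(ctx)
    }
    go run()
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
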

&lt;h2&gt;
  
  
  What This Blueprint Is Really Selling
&lt;/h2&gt;

&lt;p&gt;A system with four work shapes should have four concurrency patterns, not one stretched to cover everything. The four pillars aren't theoretical; they map to real design decisions, and production Go services that use them deliberately are easier to reason about than those that don't.&lt;/p&gt;

&lt;p&gt;What IronSys is really selling is &lt;strong&gt;intentional heterogeneity&lt;/strong&gt;. Every primitive is there for a reason. Every reason is traceable to a specific failure mode you want to prevent. The architecture should be legible — a new engineer reading the code should understand why a channel is there instead of a mutex, why a session has its own goroutine instead of being a struct in a shared map, why billing goes through a durable queue instead of a gRPC call.&lt;/p&gt;

&lt;p&gt;If you can't answer "why this primitive here," the code isn't finished. It's just working, for now.&lt;/p&gt;

&lt;p&gt;Blueprints are useful precisely because they're generic. The specifics of your system will be different. But the decision framework — what's the work shape, what's the failure mode, what's the right primitive — is the same every time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/four-pillars-modern-concurrency-locks-to-actors/" rel="noopener noreferrer"&gt;From Locks to Actors: The Four Pillars of Modern Concurrency&lt;/a&gt; — the taxonomy behind the choices in IronSys.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-chan-context-structure-not-speed/" rel="noopener noreferrer"&gt;Go's Concurrency Is About Structure, Not Speed&lt;/a&gt; — chan and context as the glue across all of these.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/rpc-vs-nats-who-owns-completion/" rel="noopener noreferrer"&gt;RPC vs NATS: It's Not About Sync vs Async — It's About Who Owns Completion&lt;/a&gt; — the cross-service handoff choices.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/testing-real-world-go-backends/" rel="noopener noreferrer"&gt;Testing Real-World Go Backends Isn't What Many People Think&lt;/a&gt; — how you verify a system like this actually holds up.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>concurrency</category>
      <category>systemdesign</category>
      <category>go</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Docker Kubernetes: What They Really Changed (It's Not What You Think)</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Mon, 20 Apr 2026 07:13:19 +0000</pubDate>
      <link>https://forem.com/harrisonsec/docker-x-kubernetes-what-they-really-changed-its-not-what-you-think-1972</link>
      <guid>https://forem.com/harrisonsec/docker-x-kubernetes-what-they-really-changed-its-not-what-you-think-1972</guid>
      <description>&lt;p&gt;"A Docker container is basically a lightweight VM, right?" No. That sentence alone causes more architectural misunderstandings than any other in modern backend engineering. A VM virtualizes hardware. A container is a set of Linux kernel features — namespaces, cgroups, overlay filesystems — wrapped in a nicer CLI. Same host kernel, same memory space, same attack surface if the kernel has a bug. The marketing that says otherwise has cost teams real money in misconfigured production.&lt;/p&gt;

&lt;p&gt;Kubernetes gets the same treatment. "It's a tool for running containers." Also not really. Kubernetes is a distributed scheduler, service-discovery and load-balancing layer, declarative control plane, and reconciliation engine. Containers are one of the things it happens to run. Treating Kubernetes as "container orchestration" produces systems that break in predictable, frustrating ways — because the team never learned that the reconciliation loop, not the container, is the thing that actually matters.&lt;/p&gt;

&lt;p&gt;This is a working engineer's re-read of what Docker and Kubernetes actually changed. Not the marketing story. The underneath-the-hood story that tells you when to reach for them and when they're overkill.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — Docker didn't invent Linux namespaces, cgroups, or filesystem layering; it packaged them into a developer-friendly workflow. That workflow is what changed. Kubernetes didn't invent distributed scheduling, service discovery, or rolling deployments; it standardized the declarative, reconciliation-loop pattern for all of them. That pattern is what changed. Understanding these primitives (namespaces + cgroups + reconciliation loops) tells you when to reach for the tools and when the tools are overkill.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What Docker Actually Is
&lt;/h2&gt;

&lt;p&gt;Docker is a set of Linux kernel features wrapped in a nice CLI and an image format. The features existed before Docker; they just weren't accessible.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Linux namespaces&lt;/strong&gt; — process, mount, network, IPC, UTS, user, cgroup. Each namespace gives a process its own view of that resource. When your container thinks it has PID 1, it really thinks so; inside its PID namespace, the host's init is invisible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cgroups (v1/v2)&lt;/strong&gt; — resource accounting and limits. How much CPU, memory, I/O bandwidth a group of processes can use. This is why a misconfigured container can eat a host's memory and take everything else down.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Union / overlay filesystems&lt;/strong&gt; — the thing that lets you stack "base image" + "layer 1" + "layer 2" without copying. OverlayFS on modern kernels.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image format (OCI)&lt;/strong&gt; — a standard way to package a root filesystem plus metadata into something reproducible.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Docker's innovation was not inventing any of this. It was making them &lt;strong&gt;accessible&lt;/strong&gt;. &lt;code&gt;docker run -p 8080:80 nginx&lt;/code&gt; hides a beautiful horror of namespace creation, iptables rules, virtual ethernet pairs, overlay mounts, and cgroup assignment. Before Docker, you'd have spent a week reading &lt;code&gt;unshare(2)&lt;/code&gt; and &lt;code&gt;ip netns add&lt;/code&gt; to reproduce this. After Docker, you did it in a workshop afternoon.&lt;/p&gt;

&lt;p&gt;What actually changed: &lt;strong&gt;deployments became reproducible&lt;/strong&gt;. The image you built on your laptop contained everything needed to run — OS libraries, Python version, environment. "Works on my machine" stopped being a coping mechanism and started being a legitimate development artifact. That's the Docker revolution. Not containers. Reproducible, portable environments.&lt;/p&gt;

&lt;p&gt;The thing that is &lt;em&gt;not&lt;/em&gt; true, despite the marketing: Docker containers are not VMs. They share the host kernel. A kernel exploit in one container can reach the host and other containers. Containers are a soft isolation — good enough for most production multi-tenant workloads, not good enough for hostile tenants.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Kubernetes Actually Is
&lt;/h2&gt;

&lt;p&gt;Kubernetes is a declarative control plane built on the &lt;strong&gt;reconciliation loop&lt;/strong&gt; pattern. This is the single most important idea to internalize.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You write a manifest describing the &lt;strong&gt;desired state&lt;/strong&gt;: "three replicas of this deployment, exposed through this service, attached to this config."&lt;/li&gt;
&lt;li&gt;You hand the manifest to the control plane: "make it so."&lt;/li&gt;
&lt;li&gt;Kubernetes runs an unending loop: observe the current state, compare to desired, take actions to close the gap.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything Kubernetes does follows this pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Deployment&lt;/code&gt; controllers watch the pod count, scale up if low, scale down if high.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ReplicaSet&lt;/code&gt; controllers ensure N identical pods exist.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Service&lt;/code&gt; controllers maintain the iptables / IPVS / eBPF rules that route virtual IPs.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Ingress&lt;/code&gt; controllers watch Ingress resources and configure the edge proxy.&lt;/li&gt;
&lt;li&gt;The scheduler watches for unscheduled pods and binds them to nodes.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Node&lt;/code&gt; controller watches node health and evicts pods from unhealthy nodes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your application is just the &lt;strong&gt;data&lt;/strong&gt; in the reconciliation loop. The loops run forever, closing gaps. That's Kubernetes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBHaXRbKEdpdCDCtyBtYW5pZmVzdHM8YnIvPnNvdXJjZSBvZiB0cnV0aCldIC0tPiBBUElbS3ViZXJuZXRlcyBBUEkgc2VydmVyXQoKICAgIHN1YmdyYXBoIExvb3BbIlJlY29uY2lsaWF0aW9uIGxvb3AgwrcgZm9yZXZlciJdCiAgICAgICAgRGVzaXJlZFsiRGVzaXJlZCBzdGF0ZTxici8-ZnJvbSBtYW5pZmVzdCJdIC0tPiBDb21wYXJle01hdGNoP30KICAgICAgICBPYnNlcnZlZFsiT2JzZXJ2ZWQgc3RhdGU8YnIvPmZyb20gY2x1c3RlciJdIC0tPiBDb21wYXJlCiAgICAgICAgQ29tcGFyZSAtLT58Tm8gwrcgYWN0fCBBY3Rpb25bIkNvbnRyb2xsZXIgdGFrZXMgYWN0aW9uPGJyLz5zY2FsZSDCtyBzY2hlZHVsZSDCtyBldmljdCDCtyByb3V0ZSJdCiAgICAgICAgQWN0aW9uIC0tPiBPYnNlcnZlZAogICAgICAgIENvbXBhcmUgLS0-fFllcyDCtyB3YWl0fCBPYnNlcnZlZAogICAgZW5kCgogICAgQVBJIC0tPiBEZXNpcmVkCiAgICBBUEkgLS0-IE9ic2VydmVkCgogICAgVXNlcihbWW91IMK3IGt1YmVjdGwgYXBwbHldKSAtLT58dXBkYXRlIG1hbmlmZXN0fCBHaXQKCiAgICBjbGFzc0RlZiBsb29wIGZpbGw6I2U4ZjRmOCxzdHJva2U6IzJjNTI4MixzdHJva2Utd2lkdGg6MnB4CiAgICBjbGFzcyBMb29wIGxvb3A%3D" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBHaXRbKEdpdCDCtyBtYW5pZmVzdHM8YnIvPnNvdXJjZSBvZiB0cnV0aCldIC0tPiBBUElbS3ViZXJuZXRlcyBBUEkgc2VydmVyXQoKICAgIHN1YmdyYXBoIExvb3BbIlJlY29uY2lsaWF0aW9uIGxvb3AgwrcgZm9yZXZlciJdCiAgICAgICAgRGVzaXJlZFsiRGVzaXJlZCBzdGF0ZTxici8-ZnJvbSBtYW5pZmVzdCJdIC0tPiBDb21wYXJle01hdGNoP30KICAgICAgICBPYnNlcnZlZFsiT2JzZXJ2ZWQgc3RhdGU8YnIvPmZyb20gY2x1c3RlciJdIC0tPiBDb21wYXJlCiAgICAgICAgQ29tcGFyZSAtLT58Tm8gwrcgYWN0fCBBY3Rpb25bIkNvbnRyb2xsZXIgdGFrZXMgYWN0aW9uPGJyLz5zY2FsZSDCtyBzY2hlZHVsZSDCtyBldmljdCDCtyByb3V0ZSJdCiAgICAgICAgQWN0aW9uIC0tPiBPYnNlcnZlZAogICAgICAgIENvbXBhcmUgLS0-fFllcyDCtyB3YWl0fCBPYnNlcnZlZAogICAgZW5kCgogICAgQVBJIC0tPiBEZXNpcmVkCiAgICBBUEkgLS0-IE9ic2VydmVkCgogICAgVXNlcihbWW91IMK3IGt1YmVjdGwgYXBwbHldKSAtLT58dXBkYXRlIG1hbmlmZXN0fCBHaXQKCiAgICBjbGFzc0RlZiBsb29wIGZpbGw6I2U4ZjRmOCxzdHJva2U6IzJjNTI4MixzdHJva2Utd2lkdGg6MnB4CiAgICBjbGFzcyBMb29wIGxvb3A%3D" alt="Git[(Git · manifests&amp;lt;br/&amp;gt;source of truth)] --&amp;gt; API[Kubernetes API server]" width="1723" height="308"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every Kubernetes feature — Deployments, Services, Ingresses, HPAs, CronJobs, StatefulSets — is some controller running this exact pattern. Once you see it, the platform stops being magic.&lt;/p&gt;
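
&lt;p&gt;The loop itself is simple enough to sketch. This isn't Kubernetes source, just the shape every controller follows, written in Go because that's what most controllers are written in; the helpers (&lt;code&gt;loadDesiredState&lt;/code&gt;, &lt;code&gt;observeClusterState&lt;/code&gt;, &lt;code&gt;diff&lt;/code&gt;, &lt;code&gt;apply&lt;/code&gt;) are placeholders.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// The shape of a controller: observe, compare, act, repeat forever.
func reconcileLoop(ctx context.Context, interval time.Duration) {
    ticker := time.NewTicker(interval)
    defer ticker.Stop()

    for {
        select {
        case &amp;lt;-ctx.Done():
            return
        case &amp;lt;-ticker.C:
            desired := loadDesiredState()     // from the API server / manifests
            observed := observeClusterState() // list pods, nodes, endpoints
            for _, action := range diff(desired, observed) {
                if err := apply(ctx, action); err != nil {
                    log.Printf("reconcile: %v", err) // log and try again next tick
                }
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Real controllers are driven by watches on the API server rather than a fixed ticker, but the observe-compare-act shape is the same.&lt;/p&gt;
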

&lt;p&gt;What actually changed because of this: &lt;strong&gt;the operational model became shared across companies&lt;/strong&gt;. Before Kubernetes, every engineering team had a bespoke orchestration system: a collection of Chef/Puppet/Ansible recipes, some custom scripts, a deploy button, and a few senior engineers who knew which knobs to turn during incidents. Different at every company. Opaque to new hires. Sensitive to key-person risk.&lt;/p&gt;

&lt;p&gt;Kubernetes is many things, but the single biggest thing it did was replace a hundred bespoke orchestration glues with one standard. It's not the best tool for every problem — Nomad is simpler, ECS is more managed, Cloud Run hides the thing entirely — but it's the standard, and "it's the standard" has real value: hires know it, vendors build for it, books exist, the job market is liquid.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Mental Model Most People Miss
&lt;/h2&gt;

&lt;p&gt;Once you see "reconciliation loop," you stop asking questions Kubernetes doesn't answer.&lt;/p&gt;

&lt;p&gt;"How do I deploy?" You don't. You update a manifest. A controller observes the change and reconciles.&lt;/p&gt;

&lt;p&gt;"How do I roll back?" You don't. You update the manifest back. A controller observes the change and reconciles in the other direction.&lt;/p&gt;

&lt;p&gt;"Why did my pod get killed?" Because a controller decided the current state (this pod is here, on this node) didn't match the desired state (node is draining, or pod is over its memory limit, or a replica count decreased). It closed the gap.&lt;/p&gt;

&lt;p&gt;"Why can't I SSH in and hand-edit things?" Because the next reconcile loop will undo your edit. The manifest is the source of truth. If you want to change behavior, change the manifest.&lt;/p&gt;

&lt;p&gt;This is a shift from imperative ops ("run these commands to deploy") to declarative ops ("the system should look like this; make it so"). Git becomes the history of what your infrastructure should be. Time travel works. Change review works. Disaster recovery becomes "re-apply the manifests to a new cluster." When it clicks, you stop fighting the platform.&lt;/p&gt;

&lt;p&gt;Until it clicks, the platform feels maddening. "I just want to run a container" — yes, but the platform doesn't care about the one-off action you want to perform. It cares about the state that should hold continuously. Every &lt;code&gt;kubectl apply&lt;/code&gt; is a statement of desired state, not an imperative command.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Changed in Practice
&lt;/h2&gt;

&lt;p&gt;Concretely, what looks different on a team that's moved from "SSH into the box and &lt;code&gt;systemctl restart&lt;/code&gt;" to a reconciled-state model:&lt;/p&gt;

&lt;h3&gt;
  
  
  Deployment became a git push
&lt;/h3&gt;

&lt;p&gt;Before: log into the bastion, pull the latest build, restart the service, watch the log.&lt;br&gt;
After: merge to main, CI pushes image to registry, ArgoCD/Flux observes the manifest change, the Deployment controller updates the ReplicaSet, pods roll gradually.&lt;/p&gt;

&lt;p&gt;Benefits: change review, audit trail, rollback by git revert, consistent deploys across teams.&lt;br&gt;
Costs: debugging a broken deploy requires understanding the CD pipeline, the manifest, and the controller that's reconciling. The failure mode surface is wider.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scaling became a number in a file
&lt;/h3&gt;

&lt;p&gt;Before: write a script that watches metrics, calls the cloud API, hopes for the best.&lt;br&gt;
After: &lt;code&gt;replicas: 10&lt;/code&gt; in a manifest, or an HPA (Horizontal Pod Autoscaler) that watches metrics and adjusts the Deployment.&lt;/p&gt;

&lt;p&gt;Benefits: declarative, versioned, reproducible.&lt;br&gt;
Costs: HPA behavior is subtle — wrong thresholds cause thrashing, wrong metrics cause over/underscaling. Many teams never invest in tuning.&lt;/p&gt;

&lt;h3&gt;
  
  
  Service discovery became DNS
&lt;/h3&gt;

&lt;p&gt;Before: register in Consul, read from Consul, keep a catalog. Or hardcode IPs. Or run a homegrown service registry.&lt;br&gt;
After: &lt;code&gt;my-service.my-namespace.svc.cluster.local&lt;/code&gt; resolves to a stable virtual IP. Kube-proxy or CNI load-balances to healthy pods.&lt;/p&gt;

&lt;p&gt;Benefits: services don't need to know how other services run. Standard DNS.&lt;br&gt;
Costs: the DNS / networking layer is one of the hardest parts of Kubernetes to debug. When service discovery breaks, you're reading iptables or eBPF maps, not a Consul dashboard.&lt;/p&gt;
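
&lt;p&gt;Still, from the application's point of view it really is just DNS. A sketch, with a made-up service name and port:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Inside the cluster, the Service's DNS name resolves to a stable virtual IP;
// kube-proxy (or the CNI) spreads connections across healthy pods.
func callBilling(ctx context.Context) (*http.Response, error) {
    req, err := http.NewRequestWithContext(ctx, http.MethodGet,
        "http://billing.payments.svc.cluster.local:8080/v1/health", nil)
    if err != nil {
        return nil, err
    }
    return http.DefaultClient.Do(req)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
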

&lt;h3&gt;
  
  
  Configuration became a manifest
&lt;/h3&gt;

&lt;p&gt;Before: environment variables, .env files, maybe Consul KV.&lt;br&gt;
After: ConfigMaps and Secrets, mounted as env vars or volumes.&lt;/p&gt;

&lt;p&gt;Benefits: versioned, reviewed, separate from code.&lt;br&gt;
Costs: changing a ConfigMap doesn't automatically restart pods. You have to annotate the Deployment or use something like reloader. New users get bitten by this constantly.&lt;/p&gt;

&lt;h2&gt;
  
  
  When Kubernetes Is Overkill
&lt;/h2&gt;

&lt;p&gt;I'll say it directly: most teams adopting Kubernetes for the first time don't need it.&lt;/p&gt;

&lt;p&gt;Rules of thumb:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Two or three services, one team&lt;/strong&gt;: you don't need Kubernetes. ECS, Nomad, Cloud Run, or even systemd + Ansible will do. The operational overhead of Kubernetes exceeds its benefit at this scale.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ten to twenty services, small team&lt;/strong&gt;: Kubernetes starts breaking even if you pick a managed service (EKS, GKE, AKS). Don't run your own control plane.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fifty+ services, multiple teams, serious release engineering needs&lt;/strong&gt;: Kubernetes is probably the right call. The cost of complexity is amortized over the benefits of a shared declarative platform.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The dangerous zone is 5-15 services on a small team. At that scale, Kubernetes often wins the resume-driven-development vote and loses the actual-outcomes vote. Pick a simpler tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  When Kubernetes Is the Right Answer
&lt;/h2&gt;

&lt;p&gt;The jobs where Kubernetes genuinely shines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-service, multi-team engineering orgs&lt;/strong&gt; where consistency matters more than per-service optimality.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scale-out workloads with heterogeneous shapes&lt;/strong&gt; — web apps, job runners, ML batch jobs, stateful databases, all on one platform.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Teams that want declarative infrastructure&lt;/strong&gt; — GitOps via ArgoCD/Flux, infra PRs reviewed like code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workloads with nontrivial scheduling&lt;/strong&gt; — affinity rules, taints, GPU allocation, spot instances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operators ecosystem&lt;/strong&gt; — Kubernetes operators (Prometheus operator, cert-manager, etc.) let you extend the same reconciliation model to application-specific concerns.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Notice the pattern: Kubernetes wins when you want the platform's primitives — declarative state, reconciliation, operators — beyond just container scheduling. If you only want "run my container," you're buying a jumbo jet to fly to the next town.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Tell a Team Starting Fresh
&lt;/h2&gt;

&lt;p&gt;Two concrete takeaways I'd hand to engineers thinking about Docker and Kubernetes.&lt;/p&gt;

&lt;p&gt;For Docker: the image isn't the point. Reproducibility is. An image built on your laptop that runs unchanged in CI and production — that's the contract you got. Break it (say, by mutating state inside the running container) and you lose the value. The container is a delivery mechanism for a reproducible environment.&lt;/p&gt;

&lt;p&gt;For Kubernetes: the manifest is the source of truth. Every piece of your infrastructure — deployments, services, secrets, ingresses, policies — lives in git. Every change is a git change. Every rollback is a git revert. If you find yourself running &lt;code&gt;kubectl edit&lt;/code&gt; on production, something is wrong with your workflow, not with Kubernetes.&lt;/p&gt;

&lt;p&gt;Both tools won because they codified patterns that were already emerging in sophisticated shops. They didn't invent the patterns. They made them accessible, portable, and standard. That's the fifteen-year revolution. Not containers. Not YAML. The standardization of patterns that used to require a senior infrastructure team to implement from scratch at every company.&lt;/p&gt;

&lt;p&gt;When you work with the grain of the pattern — reproducible environments for Docker, reconciled declarative state for Kubernetes — both tools get out of the way. When you fight the grain, they fight back.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-millions-connections-user-space-context-switching/" rel="noopener noreferrer"&gt;Why Go Handles Millions of Connections&lt;/a&gt; — Linux primitives that Docker is built on, seen from the language side.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/observability-cost-attribution-dual-path-architecture/" rel="noopener noreferrer"&gt;Observability and Cost Attribution: Why One Pipeline Isn't Enough&lt;/a&gt; — what happens to operational complexity when you have dozens of services.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/scale-up-scale-out-every-language-wins-somewhere/" rel="noopener noreferrer"&gt;Scale-Up vs Scale-Out: Why Every Language Wins Somewhere&lt;/a&gt; — the architectural decision that drives whether you need Kubernetes at all.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>docker</category>
      <category>kubernetes</category>
      <category>containers</category>
      <category>devops</category>
    </item>
    <item>
      <title>Observability and Cost Attribution: Why One Pipeline Isn't Enough</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Mon, 20 Apr 2026 07:13:17 +0000</pubDate>
      <link>https://forem.com/harrisonsec/observability-and-cost-attribution-why-one-pipeline-isnt-enough-1283</link>
      <guid>https://forem.com/harrisonsec/observability-and-cost-attribution-why-one-pipeline-isnt-enough-1283</guid>
      <description>&lt;p&gt;A team I worked with tried to build their billing system on top of their tracing pipeline. The idea was clean: every operation already generates a span; spans already have duration and attributes; adding &lt;code&gt;user_id&lt;/code&gt; and &lt;code&gt;billable_units&lt;/code&gt; to each span lets finance query the trace store to compute invoices. One pipeline, less infrastructure. Beautiful.&lt;/p&gt;

&lt;p&gt;Six weeks before the first billing cycle, the wheels came off. The tracing system was sampling at 10% because full-capture was too expensive. The sampler was head-based, meaning whether a trace got kept was decided at request entry, long before the code knew whether the request was billable. Some users got charged for 10% of their actual usage; others got free service. Nobody's invoice agreed with the other team's report.&lt;/p&gt;

&lt;p&gt;The workaround — "don't sample billable traces" — sounded reasonable, broke the tracing pipeline's cost model immediately, and created a dozen new edge cases around which requests counted as "billable." Within a month the team was reluctantly building a second pipeline for billing. They still had the first one for traces. Now they had two pipelines that disagreed with each other.&lt;/p&gt;

&lt;p&gt;The postmortem landed on a single sentence: &lt;strong&gt;observability and cost attribution aren't the same problem, and pretending they are is expensive twice.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — Tracing and metrics optimize for signal-to-noise — you want the interesting outliers, sampling is OK, dropping data is tolerable. Billing optimizes for completeness and auditability — every event must be captured and durably recorded, end of story. The two pipelines have opposite trade-offs on sampling, retention, schema evolution, and cost. Building them as one pipeline forces one of the two to lose. Build them as two, share primitives where possible, let each specialize where it must.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Why They Look Alike
&lt;/h2&gt;

&lt;p&gt;Observability pipelines and billing pipelines do look eerily similar from a distance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Both capture events from production systems.&lt;/li&gt;
&lt;li&gt;Both attach metadata to those events.&lt;/li&gt;
&lt;li&gt;Both aggregate events over time windows.&lt;/li&gt;
&lt;li&gt;Both export to a query layer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's tempting — especially to engineers who like clean architecture — to say &lt;em&gt;these are the same problem&lt;/em&gt; and build one system. The similarity is surface. The constraints are opposite.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;Observability&lt;/th&gt;
&lt;th&gt;Billing&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Loss tolerance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High (sampling is fine)&lt;/td&gt;
&lt;td&gt;Zero&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency tolerance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Seconds to minutes&lt;/td&gt;
&lt;td&gt;Minutes to hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Retention&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Days to weeks&lt;/td&gt;
&lt;td&gt;Years&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Schema evolution&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fast, frequent&lt;/td&gt;
&lt;td&gt;Slow, with audit trail&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cardinality profile&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low cardinality on hot dims&lt;/td&gt;
&lt;td&gt;Arbitrary (per user, per resource)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Consumers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SRE, engineering, on-call&lt;/td&gt;
&lt;td&gt;Finance, legal, customer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Failure mode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Blind spot in a dashboard&lt;/td&gt;
&lt;td&gt;Wrong invoice, legal exposure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The one that really matters: &lt;strong&gt;loss tolerance&lt;/strong&gt;. Everything else follows from it.&lt;/p&gt;

&lt;p&gt;A tracing pipeline that drops 10% of spans is fine. You still see the outliers. You still find the slow paths. The system does its job.&lt;/p&gt;

&lt;p&gt;A billing pipeline that drops 10% of events is a disaster. Some users underpay. Some users overpay. Finance reconciliation fails. You end up manually auditing transactions for weeks.&lt;/p&gt;

&lt;p&gt;The moment one pipeline has to satisfy zero-loss and the other can tolerate 90% sampling, you have two different systems whether you wanted one or two.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Dual-Path Architecture
&lt;/h2&gt;

&lt;p&gt;The design I keep reaching back to is straightforward: &lt;strong&gt;two pipelines, shared ingest, separate durability and query paths&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBBcHBbQXBwbGljYXRpb24gY29kZV0gLS0-IHxPVExQIHNwYW5zfCBUcmFjZUNvbAogICAgQXBwIC0tPiB8c3RydWN0dXJlZCB1c2FnZSBldmVudHM8YnIvPnVuc2FtcGxlZHwgVXNhZ2VRCgogICAgc3ViZ3JhcGggVHJhY2VQYXRoWyJUcmFjaW5nIHBhdGgg4oCUIGxvc3MtdG9sZXJhbnQsIGZhc3QiXQogICAgICAgIFRyYWNlQ29sWyJUcmFjaW5nIGNvbGxlY3RvciJdIC0tPiBTYW1wbGVyWyJTYW1wbGVyPGJyLz5oZWFkIG9yIHRhaWwgwrcgfjEwJSJdCiAgICAgICAgU2FtcGxlciAtLT4gSG90U3RvcmVbIkhvdCB0cmFjZSBzdG9yZTxici8-VGVtcG8gLyBKYWVnZXI8YnIvPmRheXMgcmV0ZW50aW9uIl0KICAgIGVuZAoKICAgIHN1YmdyYXBoIEJpbGxpbmdQYXRoWyJCaWxsaW5nIHBhdGgg4oCUIHplcm8tbG9zcywgYXVkaXRhYmxlIl0KICAgICAgICBVc2FnZVFbIkR1cmFibGUgcXVldWU8YnIvPkthZmthIC8gTkFUUyBKZXRTdHJlYW08YnIvPldBTC1kdXJhYmxlIl0gLS0-IFdhcmVob3VzZVsiQ29sdW1uYXIgd2FyZWhvdXNlPGJyLz5CaWdRdWVyeSAvIFNub3dmbGFrZSAvIENsaWNrSG91c2U8YnIvPnllYXJzIHJldGVudGlvbiJdCiAgICBlbmQKCiAgICBjbGFzc0RlZiB0cmFjZSBmaWxsOiNmZWY1ZTcsc3Ryb2tlOiNiNzc5MWYKICAgIGNsYXNzRGVmIGJpbGwgZmlsbDojZjBmZmY0LHN0cm9rZTojMmY4NTVhLHN0cm9rZS13aWR0aDoycHgKICAgIGNsYXNzIFRyYWNlUGF0aCB0cmFjZQogICAgY2xhc3MgQmlsbGluZ1BhdGggYmlsbA%3D%3D" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBBcHBbQXBwbGljYXRpb24gY29kZV0gLS0-IHxPVExQIHNwYW5zfCBUcmFjZUNvbAogICAgQXBwIC0tPiB8c3RydWN0dXJlZCB1c2FnZSBldmVudHM8YnIvPnVuc2FtcGxlZHwgVXNhZ2VRCgogICAgc3ViZ3JhcGggVHJhY2VQYXRoWyJUcmFjaW5nIHBhdGgg4oCUIGxvc3MtdG9sZXJhbnQsIGZhc3QiXQogICAgICAgIFRyYWNlQ29sWyJUcmFjaW5nIGNvbGxlY3RvciJdIC0tPiBTYW1wbGVyWyJTYW1wbGVyPGJyLz5oZWFkIG9yIHRhaWwgwrcgfjEwJSJdCiAgICAgICAgU2FtcGxlciAtLT4gSG90U3RvcmVbIkhvdCB0cmFjZSBzdG9yZTxici8-VGVtcG8gLyBKYWVnZXI8YnIvPmRheXMgcmV0ZW50aW9uIl0KICAgIGVuZAoKICAgIHN1YmdyYXBoIEJpbGxpbmdQYXRoWyJCaWxsaW5nIHBhdGgg4oCUIHplcm8tbG9zcywgYXVkaXRhYmxlIl0KICAgICAgICBVc2FnZVFbIkR1cmFibGUgcXVldWU8YnIvPkthZmthIC8gTkFUUyBKZXRTdHJlYW08YnIvPldBTC1kdXJhYmxlIl0gLS0-IFdhcmVob3VzZVsiQ29sdW1uYXIgd2FyZWhvdXNlPGJyLz5CaWdRdWVyeSAvIFNub3dmbGFrZSAvIENsaWNrSG91c2U8YnIvPnllYXJzIHJldGVudGlvbiJdCiAgICBlbmQKCiAgICBjbGFzc0RlZiB0cmFjZSBmaWxsOiNmZWY1ZTcsc3Ryb2tlOiNiNzc5MWYKICAgIGNsYXNzRGVmIGJpbGwgZmlsbDojZjBmZmY0LHN0cm9rZTojMmY4NTVhLHN0cm9rZS13aWR0aDoycHgKICAgIGNsYXNzIFRyYWNlUGF0aCB0cmFjZQogICAgY2xhc3MgQmlsbGluZ1BhdGggYmlsbA%3D%3D" alt="App[Application code] --&amp;gt; |OTLP spans| TraceCol" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Two emission paths from the application. Two pipelines behind them. Each tuned for its job.&lt;/p&gt;

&lt;h3&gt;
  
  
  The tracing path
&lt;/h3&gt;

&lt;p&gt;Stays conventional. OpenTelemetry SDK emits spans. Collector applies head-based or tail-based sampling. Hot store (Tempo, Jaeger, Grafana Cloud) gets 10-20% of the volume. Retention a few days to a few weeks. Query layer is for engineers debugging incidents.&lt;/p&gt;

&lt;p&gt;What I optimize for here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost per span&lt;/strong&gt; — you're keeping billions; every byte matters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query latency&lt;/strong&gt; — on-call wants answers in seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-instrumentation coverage&lt;/strong&gt; — the fewer things you have to manually instrument, the better.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What I don't care about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full capture. Sampling is fine.&lt;/li&gt;
&lt;li&gt;Long retention. You're debugging last Tuesday, not last fiscal year.&lt;/li&gt;
&lt;li&gt;Per-user accuracy. If a single user's trace got dropped, nobody cares.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The usage-event path
&lt;/h3&gt;

&lt;p&gt;The dedicated billing pipeline. Every billable operation emits a &lt;strong&gt;usage event&lt;/strong&gt; — a small, structured record with everything finance needs and nothing it doesn't.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"event_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ue_01HFNGR..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"occurred_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-02-14T18:22:30.145Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"account_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"acc_12345"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"resource_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"res_6789"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"operation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"api.request"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dimensions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tier"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"standard"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"units"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"requests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"cpu_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;147&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"egress_bytes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8342&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"idempotency_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"req_abc_20260214182230"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rules on this path:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unsampled.&lt;/strong&gt; Every billable operation emits exactly one event. No head sampling. No tail sampling. No "approximate."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Durable writes.&lt;/strong&gt; Emitter has a local write-ahead log or durable queue. If the downstream is down, events buffer locally until delivery. No dropped events under partial failure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotency keys.&lt;/strong&gt; Every event has a unique ID (or composite key) so downstream dedup is trivial. This lets you retry safely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema versioned and immutable.&lt;/strong&gt; Once an event shape is shipped, it doesn't mutate. New fields add a new version. Old versions keep working until you intentionally deprecate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long retention.&lt;/strong&gt; Years, usually. Auditors ask for 2023's data in 2027.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The downstream infrastructure matches: &lt;strong&gt;Kafka or NATS JetStream with high replication factor&lt;/strong&gt; for ingest, &lt;strong&gt;columnar warehouse&lt;/strong&gt; (BigQuery, Snowflake, ClickHouse) for aggregation and query, &lt;strong&gt;separate auth and access control&lt;/strong&gt; from engineering-facing tools.&lt;/p&gt;
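
&lt;p&gt;A sketch of an emit path that follows these rules. The &lt;code&gt;WAL&lt;/code&gt;, &lt;code&gt;Publisher&lt;/code&gt;, and &lt;code&gt;UsageEvent&lt;/code&gt; types and the subject name are placeholders; what matters is the order of operations: durable local append first, publish second, dedupe downstream on the idempotency key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Emit exactly one usage event per billable operation. Append to a local
// write-ahead log before publishing, so a downstream outage buffers events
// on disk instead of dropping them.
func emitUsage(ctx context.Context, wal *WAL, q Publisher, ev UsageEvent) error {
    if ev.IdempotencyKey == "" {
        return errors.New("usage event missing idempotency key")
    }
    if err := wal.Append(ev); err != nil { // durable before anything else
        return err
    }
    if err := q.Publish(ctx, "usage.events", ev); err != nil {
        // Leave it in the WAL; a background loop replays undelivered events.
        return nil
    }
    return wal.MarkDelivered(ev.IdempotencyKey)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
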

&lt;h3&gt;
  
  
  What the two paths share
&lt;/h3&gt;

&lt;p&gt;Not nothing. They share:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The trace/request ID.&lt;/strong&gt; Usage events include the trace ID of the request that generated them. This is the &lt;em&gt;one&lt;/em&gt; cross-pipeline link that matters — when finance escalates "this user says they were charged for X requests but they swear they only made Y," you want to be able to find the traces of those Y requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenTelemetry as the emission library.&lt;/strong&gt; OTel can emit both spans and custom events. Using it for both keeps the instrumentation codepaths uniform. But the pipelines behind the emitter are different.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The application's definition of an "operation."&lt;/strong&gt; Both pipelines have opinions about what counts as one operation. Keep that definition single-source.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Head-Sampling Kills Billing
&lt;/h2&gt;

&lt;p&gt;Worth dwelling on the specific thing that breaks when you try to unify.&lt;/p&gt;

&lt;p&gt;Head-based sampling decides whether to record a trace at entry, based on trace ID. It's O(1), stateless, and fair across traffic shapes — the standard default.&lt;/p&gt;
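
&lt;p&gt;For concreteness, the entire decision a head sampler makes fits in a few lines, and nothing about the request's billability is visible to it. This is a sketch of the idea, not any particular SDK's implementation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Head-based sampling: keep a fixed fraction (ratio) of traces, decided purely
// from the trace ID at request entry. Whether the request turns out to be
// billable is unknowable at this point.
func sampleHead(traceID [16]byte, ratio float64) bool {
    v := binary.BigEndian.Uint64(traceID[8:]) // low 8 bytes as a uniform hash
    return float64(v) &amp;lt; ratio*float64(math.MaxUint64)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
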

&lt;p&gt;The failure: &lt;strong&gt;at entry time, the system has no idea whether this request will be billable.&lt;/strong&gt; The sampler doesn't know if the user is on a paid plan, if the request will succeed, if it will hit a billable feature. It just picks randomly.&lt;/p&gt;

&lt;p&gt;Tail-based sampling fixes part of this — you decide after the fact, based on span attributes. Now you can keep all errors, all slow requests, all requests from paid users. Better, but still subject to buffering limits. Heavy tail-based samplers sit in front of your trace ingest pipeline and drop spans when buffers fill, which still gives you lossy billing during traffic bursts.&lt;/p&gt;

&lt;p&gt;The only sampler that's correct for billing is "capture everything." And "capture everything" is what the tracing pipeline tries to avoid, because that's what makes it expensive.&lt;/p&gt;

&lt;p&gt;You can do "capture everything for billable operations, sample everything else" in one pipeline. It works. It also ends up being the most complex sampler you've ever written, with an exception branch that duplicates the decision logic from your actual billing code. The dedicated usage-event path is simpler.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cardinality and the Per-User Problem
&lt;/h2&gt;

&lt;p&gt;A related anti-pattern: attaching user ID as a Prometheus label.&lt;/p&gt;

&lt;p&gt;Prometheus (and most metrics systems) store one time series per label combination. Add a &lt;code&gt;user_id&lt;/code&gt; label to a metric that ten thousand users hit, and you just created ten thousand time series. Add a &lt;code&gt;request_type&lt;/code&gt; label alongside, and that's ten thousand × request-type-count. Cardinality explodes. Your metrics storage bill goes with it.&lt;/p&gt;

&lt;p&gt;The instinct is fine — "I want to track per-user throughput" — but the mechanism is wrong. A high-cardinality metric label is the wrong-shaped container for per-user data; a usage event is the right one. Emit a usage event with &lt;code&gt;account_id&lt;/code&gt; as a dimension, then aggregate per user in the warehouse at query time.&lt;/p&gt;

&lt;p&gt;Rule I use: &lt;strong&gt;metrics for engineering-facing dashboards, events for business-facing attribution&lt;/strong&gt;. If the label cardinality could exceed ~1,000 distinct values, it belongs in an event, not a label.&lt;/p&gt;
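
&lt;p&gt;The rule in code, as a sketch: the Prometheus counter keeps a bounded label set, and the per-account detail rides on an unsampled usage event instead. &lt;code&gt;emitUsageEvent&lt;/code&gt; and &lt;code&gt;UsageEvent&lt;/code&gt; stand in for the durable emit path described above.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;// Engineering-facing: bounded label set, safe for Prometheus.
var apiRequests = prometheus.NewCounterVec(
    prometheus.CounterOpts{Name: "api_requests_total"},
    []string{"route", "status"}, // low cardinality only; no user_id label
)

// Business-facing: per-account detail goes into an unsampled usage event.
func recordRequest(ctx context.Context, route, status, accountID string) {
    apiRequests.WithLabelValues(route, status).Inc()
    emitUsageEvent(ctx, UsageEvent{
        AccountID:  accountID,
        Operation:  "api.request",
        Dimensions: map[string]string{"route": route},
        Units:      map[string]int{"requests": 1},
    })
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
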

&lt;h2&gt;
  
  
  The Boring Operational Details
&lt;/h2&gt;

&lt;p&gt;Where the two pipelines actually differ in day-to-day ops:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retention&lt;/strong&gt;. Tracing a few weeks, maybe. Billing store, years. Warehouse partitioning by date and account_id makes multi-year queries practical. Archive older partitions to object storage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Access control&lt;/strong&gt;. Traces: engineers. Billing events: accounting + support + an audit-only read path for legal. Not the same principals, not the same ACL model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Schema governance&lt;/strong&gt;. Traces: OTel semantic conventions, loose. Billing events: your own schema with a proto or Avro definition, version bumps tracked in a migration log, additive only.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reconciliation&lt;/strong&gt;. Billing needs to agree with itself. Daily reconciliation job that asserts "yesterday's event count per user equals the sum of the per-hour counts" catches silent drops early. No equivalent makes sense for tracing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Replay&lt;/strong&gt;. When a billing bug is discovered, you need to replay historical events through a fixed pipeline. Kafka's offset model makes this natural; NATS JetStream has it too. The tracing pipeline rarely needs replay — if the last two weeks of traces have a bug, you shrug and fix forward.&lt;/p&gt;

&lt;h2&gt;
  
  
  When You Can Get Away With One
&lt;/h2&gt;

&lt;p&gt;Small workloads with no audit requirement, usage-based pricing below ~$1/user, and a team of three — one pipeline is fine. Add user attributes to spans, store them all, build a nightly aggregation job, call it billing. It works.&lt;/p&gt;

&lt;p&gt;The threshold where it stops working is somewhere around:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Revenue per customer exceeds the cost of a mistake.&lt;/strong&gt; At $10k/month per customer, a dropped event is a $10k issue.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The first auditor asks "show me exactly what this customer used in March 2024."&lt;/strong&gt; Unsampled, durable, retrievable, signed — those are the table stakes for audit-grade billing, and sampled traces can't meet any of them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engineering starts wanting cheaper traces.&lt;/strong&gt; When the tracing pipeline outgrows your budget and someone proposes "let's sample more aggressively," you're about to break billing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When any of those lights up, separate the pipelines. The cheapest time to separate is before you've built tools on top of the unified one.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Invest in Splitting the Pipelines
&lt;/h2&gt;

&lt;p&gt;Observability and cost attribution are adjacent problems that optimize for opposite things. A tracing pipeline that compromises on completeness becomes a bad billing pipeline. A billing pipeline that compromises on cardinality and retention becomes a bad tracing pipeline. Building one system that satisfies both usually produces two systems that satisfy neither.&lt;/p&gt;

&lt;p&gt;The dual-path design isn't more complex. It's just &lt;em&gt;honest&lt;/em&gt; about the constraints. Same emission library, same operation definition, two paths behind the emitter, each tuned for its job.&lt;/p&gt;

&lt;p&gt;If you're about to launch usage-based pricing and you're planning to compute invoices from your trace store, rethink it now. The sooner you split, the cheaper the split.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/nats-kafka-mqtt-same-category-different-jobs/" rel="noopener noreferrer"&gt;NATS vs Kafka vs MQTT: Same Category, Very Different Jobs&lt;/a&gt; — why the durability choice on the billing path matters so much.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/rpc-vs-nats-who-owns-completion/" rel="noopener noreferrer"&gt;RPC vs NATS: It's Not About Sync vs Async — It's About Who Owns Completion&lt;/a&gt; — completion ownership applies to the emit path, too.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>observability</category>
      <category>billing</category>
      <category>costattribution</category>
      <category>opentelemetry</category>
    </item>
    <item>
      <title>The 90% Problem: Why Most AI Agents Are Still Broken</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Sat, 18 Apr 2026 23:02:07 +0000</pubDate>
      <link>https://forem.com/harrisonsec/the-90-problem-why-most-ai-agents-are-still-broken-3pd4</link>
      <guid>https://forem.com/harrisonsec/the-90-problem-why-most-ai-agents-are-still-broken-3pd4</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmreka6ybk5ljcyqpu2sc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmreka6ybk5ljcyqpu2sc.jpg" alt="The 90% Problem" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Your Agent Works Great. Until It Doesn't.
&lt;/h2&gt;

&lt;p&gt;You built an AI agent over the weekend. It calls tools, remembers context, follows instructions. You demo it to your team. Everyone's impressed.&lt;/p&gt;

&lt;p&gt;Monday morning, a user types "rename Ember to Infernia." Your agent loops 15 times, burns through your API budget, and returns a response that doesn't contain the word "Infernia." A rename. One entity. One operation.&lt;/p&gt;

&lt;p&gt;I've been there. I ran an eval suite on a production agent — 5 test cases, 5 runs each. Pass rate: &lt;strong&gt;40%.&lt;/strong&gt; Not on hard tasks. On things like "update the right character out of six" and "rename one entity." The model was GPT-4 class. Plenty capable. The problem was everything &lt;em&gt;around&lt;/em&gt; the model.&lt;/p&gt;

&lt;p&gt;This is the 90% problem:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Building the core loop (perceive → reason → act):  10% of the work
Making it not break in production:                  90% of the work
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It took me a while to see where the problem actually was. The real gap wasn't missing features. It was open loops: verification that doesn't retry, memory that doesn't decay, compression that doesn't circuit-break.&lt;/p&gt;

&lt;p&gt;Here's what I found — from analyzing Claude Code's leaked source, where a 1,729-line &lt;code&gt;query.ts&lt;/code&gt; file contains a 1,421-line &lt;code&gt;while(true)&lt;/code&gt; loop inside a roughly 512,000-line codebase, and from fixing a production agent's pass rate with code changes alone. No model upgrade. No prompt magic. Just engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Five Pillars — And Where Agents Actually Fail
&lt;/h2&gt;

&lt;p&gt;Every production agent needs five things. Most only build two of them well.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pillar&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;th&gt;Who Does the Work&lt;/th&gt;
&lt;th&gt;Most Agents' Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context Management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;What the LLM sees&lt;/td&gt;
&lt;td&gt;Code orchestrates; LLM helps compress&lt;/td&gt;
&lt;td&gt;Dump everything, hope for the best&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory Management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;What persists across sessions&lt;/td&gt;
&lt;td&gt;Code orchestrates; LLM helps recall&lt;/td&gt;
&lt;td&gt;Basic store/retrieve, no lifecycle&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reflection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent checks its own output&lt;/td&gt;
&lt;td&gt;Code triggers; LLM judges&lt;/td&gt;
&lt;td&gt;Not implemented or logs-only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Planning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent thinks before acting&lt;/td&gt;
&lt;td&gt;LLM (decompose tasks, sequence steps)&lt;/td&gt;
&lt;td&gt;Decent — LLMs are good at this&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tool Use&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent interacts with the world&lt;/td&gt;
&lt;td&gt;LLM selects, Code executes&lt;/td&gt;
&lt;td&gt;Decent — most mature pillar&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm48fwvkvq4f2foh8qsrz.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm48fwvkvq4f2foh8qsrz.jpg" alt="The Five Pillars of Agent Architecture" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Planning and Tool Use work reasonably well because &lt;strong&gt;they ride on model improvements&lt;/strong&gt;. GPT-3.5 struggled with tool calling; Claude Opus 4.6 is reliable. You get these improvements for free with model upgrades.&lt;/p&gt;

&lt;p&gt;Context and Memory are where agents fail because &lt;strong&gt;they're engineering problems, not model problems&lt;/strong&gt;. Reflection sits in the middle: the LLM can judge quality, but code still has to trigger that check, route the result, and do something with it. No model upgrade will fix a context pipeline that dumps 10 irrelevant entities into the prompt.&lt;/p&gt;

&lt;h2&gt;
  
  
  The LLM vs Code Divide
&lt;/h2&gt;

&lt;p&gt;This is the most important insight for anyone building agents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HIGH LLM dependence (improves with better models):
  Planning        → LLM generates the plan
  Reflection      → LLM evaluates quality
  Tool Selection  → LLM picks the right tool

LOW LLM dependence (never improves from model upgrades):
  Context Management  → Code sorts, filters, compresses
  Memory Management   → Code stores, retrieves, scores, decays
  Error Handling      → Code classifies errors, retries, circuit-breaks
  Tool Execution      → Code runs tools, parallelizes, batches
  State Management    → Code tracks progress, checkpoints
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdnw3ua812lnk1q6bumjl.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdnw3ua812lnk1q6bumjl.jpg" alt="LLM vs Code Divide" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But low LLM dependence does not mean zero model calls. It means the failure mode is mostly in the orchestration. Even code-dominated pillars still use models in three very different ways:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdiiw89cv7gosi43o10s.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdiiw89cv7gosi43o10s.jpg" alt="Three Ways Agents Call LLMs" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Direct LLM call (sideQuery)&lt;/strong&gt; — Code asks a narrow question, takes the answer, and moves on. Example: Claude Code's memory recall uses a single Sonnet side-query to choose 5 relevant memories from roughly 200 files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Forked sub-agent&lt;/strong&gt; — Code delegates a bounded task to a child agent with its own context, tools, and loop. Example: Claude Code's autocompact hands summarization to a child agent instead of forcing the main loop to do it inline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Tool-use loop&lt;/strong&gt; — The LLM decides which tool to call, the program executes it, and the result flows back into the next turn. This is the main agent loop.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Simple question (which memories are relevant?)  → Direct call
Complex but bounded task (summarize this)       → Forked sub-agent
Open-ended execution (build this feature)       → Tool-use loop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This choice is not academic. It changes latency, token cost, and failure modes. In Claude Code's memory system, a side-query is cheap. A forked summarizer is much heavier. Using the wrong pattern wastes budget or hurts reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The trap:&lt;/strong&gt; Teams chase model upgrades ("let's switch to Claude Opus") instead of fixing their context pipeline. Better models help — but in my experience, fixing the context pipeline delivers a larger improvement per dollar than upgrading the model.&lt;/p&gt;

&lt;p&gt;In one production system, fixing context management alone — without changing the model — moved quality from 40% to 60%. Seven out of eight fixes were pure code, zero LLM cost. The model was always capable. The context was holding it back.&lt;/p&gt;

&lt;h2&gt;
  
  
  What 90% Actually Looks Like — From Claude Code's Source
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxpfl91l7pkun6clj45qx.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxpfl91l7pkun6clj45qx.jpg" alt="The Anatomy of Production: Claude Code" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Claude Code's leaked source is roughly 512,000 lines. Here's the useful way to think about that split:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query.ts orchestrator file:             1,729 lines    (~0.3%)
Core while(true) loop inside it:        1,421 lines
Everything else:                     ~510,000 lines    (~99.7%)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That "everything else" is the 90%:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context Management (3,960 lines in src/services/compact/):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;5-level progressive compression pipeline&lt;/li&gt;
&lt;li&gt;Microcompact with dual code paths based on cache state&lt;/li&gt;
&lt;li&gt;Token estimation without API calls (&amp;lt;5% error)&lt;/li&gt;
&lt;li&gt;Post-compression recovery (restore last 5 files, skills, agent state)&lt;/li&gt;
&lt;li&gt;Circuit breaker: 3 consecutive failures → stop (after 250K API calls/day were wasted without it)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Memory System (1,736 lines in src/memdir/):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;4-type closed taxonomy with structured frontmatter&lt;/li&gt;
&lt;li&gt;Sonnet side-query for semantic retrieval (250ms, async prefetch)&lt;/li&gt;
&lt;li&gt;Background extraction agent with mutual exclusion&lt;/li&gt;
&lt;li&gt;Trust verification (eval went 0/2 → 3/3 with this addition)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Error Handling (spread across entire codebase):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Message normalization: fix orphan tool_use/tool_result pairs from crashes&lt;/li&gt;
&lt;li&gt;Prompt-Too-Long recovery: reactive compression as last resort&lt;/li&gt;
&lt;li&gt;Tool failure classification: timeout vs permission vs not-found&lt;/li&gt;
&lt;li&gt;Max output token escalation: 8K default → 64K on truncation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Permission System (multi-layer):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tool-level risk classification&lt;/li&gt;
&lt;li&gt;User confirmation for dangerous operations&lt;/li&gt;
&lt;li&gt;Sandbox isolation for high-risk tools&lt;/li&gt;
&lt;li&gt;Context injection scanning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this is intellectually exciting. It's plumbing. But without it, the "exciting" part — the agent loop — crashes on every non-trivial conversation.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenClaw and Hermes Surface the Same Pattern
&lt;/h2&gt;

&lt;p&gt;Two open-source agents worth watching right now — OpenClaw and Hermes Agent — illustrate the same architectural lesson.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Context management: still basic; I don't see the kind of progressive compression Claude Code built&lt;/li&gt;
&lt;li&gt;Memory: Markdown + SQLite (more sophisticated than Claude Code's storage layer)&lt;/li&gt;
&lt;li&gt;Reflection: limited; I don't yet see a strong closed verification loop&lt;/li&gt;
&lt;li&gt;Security: public reports in early 2026 highlighted exposed instances and malicious marketplace skills; &lt;code&gt;openclaw security audit&lt;/code&gt; exists, but tools alone don't close the operational loop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hermes Agent:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Context management: still basic&lt;/li&gt;
&lt;li&gt;Memory: SQLite + full-text search + &lt;code&gt;MEMORY.md&lt;/code&gt; dual-layer&lt;/li&gt;
&lt;li&gt;Reflection: self-evolving skills generated from completed tasks&lt;/li&gt;
&lt;li&gt;Error handling: layered on paper, but still early&lt;/li&gt;
&lt;li&gt;Maturity: promising, but I haven't seen evidence yet that the self-iteration loop holds up at scale&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both can complete tasks. My point is not that they don't work. It's that the hardest production loops — compression, failure accounting, verification retries, and memory hygiene — still look only partially closed. The features exist; &lt;strong&gt;the loops aren't closed&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfz0xlyi4ev29uyb768f.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfz0xlyi4ev29uyb768f.jpg" alt="Features Exist vs Loops Are Closed" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  "Features Exist" vs "Loops Are Closed"
&lt;/h2&gt;

&lt;p&gt;This is the most overlooked distinction in agent architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Open loop:   Build a verification step → log issues → done
Closed loop: Build a verification step → log issues → retry with feedback → verify again

Open loop:   Score memory relevance → store the score → done  
Closed loop: Score memory relevance → reinforce high-scoring memories → decay low-scoring → improve retrieval over time

Open loop:   Detect compression failure → log it → continue
Closed loop: Detect compression failure → count consecutive failures → circuit-break after 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In every open-source agent I've analyzed so far, some of the critical loops are still open. The infrastructure is there. The wiring is only partially connected.&lt;/p&gt;

&lt;p&gt;Here's the test: look at your agent's verification step. Does it log a failure and move on? That's an open loop. Does it log, retry with the failure as feedback, and verify again? That's closed. The difference is one &lt;code&gt;if&lt;/code&gt; statement and a retry call — but it's the difference between "we have quality checks" and "we actually catch errors before users see them."&lt;/p&gt;
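
&lt;p&gt;A minimal Go sketch of the closed version: &lt;code&gt;generate&lt;/code&gt; and &lt;code&gt;verify&lt;/code&gt; stand in for whatever model calls your agent makes, and the three-attempt cap is an arbitrary illustration, not anyone's production setting.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Closed loop: verify the output, and on failure retry with the verifier's
// feedback folded into the next attempt, instead of logging and moving on.
// generate and verify are stubs for your own model calls.
func runWithVerification(ctx context.Context, task string) (string, error) {
    feedback := ""
    for attempt := 1; attempt &amp;lt;= 3; attempt++ {
        out, err := generate(ctx, task, feedback) // LLM call (stub)
        if err != nil {
            return "", err
        }
        issues, err := verify(ctx, task, out) // model-as-judge or rule check (stub)
        if err != nil {
            return "", err
        }
        if len(issues) == 0 {
            return out, nil // loop closed: output passed verification
        }
        log.Printf("attempt %d failed verification: %v", attempt, issues)
        feedback = strings.Join(issues, "; ") // failures become input to the retry
    }
    return "", errors.New("verification failed after 3 attempts")
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
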

&lt;p&gt;This is the hardest 10% of the 90%. Not building the infrastructure — connecting it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpbhfohjvdiddbtk9g8lf.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpbhfohjvdiddbtk9g8lf.jpg" alt="The Proof: 40% to 60%" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Proof: 40% to 60% With Code Alone
&lt;/h2&gt;

&lt;p&gt;I ran A/B evals on a production agent. Same model, same test cases, different code. Result: &lt;strong&gt;40% → 60% pass rate.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The breakdown: 8 fixes total. &lt;strong&gt;7 were pure code — zero LLM cost.&lt;/strong&gt; Context prioritization, structured error classification, round limits, conclusion preservation during truncation, circuit breakers. The only fix that used an LLM call was a pre-loop planning step at $0.003 per request.&lt;/p&gt;

&lt;p&gt;The model was always capable. The context was holding it back.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Full case study with per-test breakdown: How I Improved an AI Agent from 40% to 60%)&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Builders
&lt;/h2&gt;

&lt;p&gt;If you're building an AI agent:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Don't start with the model.&lt;/strong&gt; Start with context management. Clean, prioritized, bounded input is the highest-leverage investment you can make.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Close your loops.&lt;/strong&gt; If you built a verification step, make it retry. If you built memory scoring, wire the reinforcement. Half-built infrastructure is worse than none — it gives false confidence.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Measure before you upgrade.&lt;/strong&gt; Before switching to a more expensive model, run an eval suite on your current one. The bottleneck is probably context, not capability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Budget 90% of your time for the 90%.&lt;/strong&gt; The agent loop is a weekend project. Error handling, compression, memory lifecycle, permission systems — that's the real work. Plan accordingly.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The model is a commodity. The engineering around it is the product.&lt;/p&gt;

&lt;p&gt;Ask yourself: what percentage of your agent's codebase is the core loop, and what percentage is everything else? If you don't know the answer, that's where to start.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Diagrams from this essay packaged as a single-file reference: &lt;a href="https://harrisonsec.com/downloads/Engineering_Reliable_Agents.pdf" rel="noopener noreferrer"&gt;Engineering Reliable Agents (PDF)&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the AI Agent Architecture series.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Deep dives into the 90%:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="https://harrisonsec.com/blog/claude-code-context-engineering-compression-pipeline/" rel="noopener noreferrer"&gt;Claude Code Part 3: The 5-Level Compression Pipeline&lt;/a&gt; — how Anthropic solved context management&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="https://harrisonsec.com/blog/claude-code-memory-first-principles-tradeoffs/" rel="noopener noreferrer"&gt;Claude Code Part 4: Why Markdown Instead of Vector DBs&lt;/a&gt; — first-principles memory tradeoffs&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;40% to 60% With A/B Data — the full case study behind the numbers in this article&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agents</category>
      <category>architecture</category>
      <category>claudecode</category>
      <category>openclaw</category>
    </item>
    <item>
      <title>Testing Real-World Go Backends Isn't What Many People Think</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Sat, 18 Apr 2026 00:18:33 +0000</pubDate>
      <link>https://forem.com/harrisonsec/testing-real-world-go-backends-isnt-what-many-people-think-12nl</link>
      <guid>https://forem.com/harrisonsec/testing-real-world-go-backends-isnt-what-many-people-think-12nl</guid>
      <description>&lt;p&gt;I've reviewed enough Go backend test suites to notice a pattern. The services with the most unit tests are often the ones with the most production incidents. Not because unit tests cause incidents — because the teams writing unit tests and calling it a day weren't testing the things that actually broke.&lt;/p&gt;

&lt;p&gt;Production bugs in distributed Go backends don't usually look like "function computed wrong value." They look like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"The context deadline didn't propagate into the background goroutine, so under load it leaked."&lt;/li&gt;
&lt;li&gt;"Two services agreed on the happy path, but the error-shape contract diverged six months ago, and now one returns &lt;code&gt;status.Code(codes.Unavailable)&lt;/code&gt; where the other expects &lt;code&gt;codes.ResourceExhausted&lt;/code&gt;."&lt;/li&gt;
&lt;li&gt;"The retry logic is race-y. With test-scale traffic it works; at 10x production it double-charges."&lt;/li&gt;
&lt;li&gt;"The database migration works on SQLite (our test DB) but not Postgres 15's stricter planner."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No unit test catches those. A different set of test shapes does.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — Stop framing tests as "unit vs integration." That's a level-of-isolation axis, and it's the least interesting one. The axes that matter for production Go: deterministic behavior (controlled clocks, seeded randomness), concurrency correctness (race detector, stress tests), contract fidelity (shared schemas, real downstreams), and environment fidelity (real DBs, real networks). Design your test suite around those; coverage follows.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Wrong Taxonomy
&lt;/h2&gt;

&lt;p&gt;"Unit tests test one function. Integration tests test several. E2E tests test the whole system."&lt;/p&gt;

&lt;p&gt;That framing is a starting point for junior engineers. It stops being useful the moment you're debugging why your Go service silently dropped a message in production. The level of isolation isn't the interesting axis. What is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic vs non-deterministic behavior.&lt;/strong&gt; Do the same inputs produce the same outputs every time?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrency correctness.&lt;/strong&gt; Do the race conditions stay caught?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contract fidelity.&lt;/strong&gt; Do your assumptions about downstreams match what they actually do?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment fidelity.&lt;/strong&gt; Does your test environment reproduce the production runtime closely enough to catch real bugs?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A test can be "unit" on the isolation axis but score on two or three of these. A test can be "integration" and miss all four.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deterministic Behavior: The One Thing Every Test Should Have
&lt;/h2&gt;

&lt;p&gt;If you can't run your test a thousand times and get the same result, you have a flaky test, and flaky tests are worse than no tests — they train the team to ignore failures.&lt;/p&gt;

&lt;p&gt;The three sources of non-determinism in Go test suites, in order of prevalence:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Time
&lt;/h3&gt;

&lt;p&gt;Any test that calls &lt;code&gt;time.Now()&lt;/code&gt;, &lt;code&gt;time.After()&lt;/code&gt;, &lt;code&gt;time.Sleep()&lt;/code&gt;, or depends on wall-clock intervals is a landmine. It works on the developer's laptop and fails in a slow CI runner where GC decided to kick in.&lt;/p&gt;

&lt;p&gt;Fix: inject a clock. A minimal clock interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Clock&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;
    &lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;After&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;realClock&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;realClock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;            &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;realClock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;realClock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;After&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;After&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In production, &lt;code&gt;realClock&lt;/code&gt;. In tests, a &lt;code&gt;FakeClock&lt;/code&gt; that advances manually. Libraries like &lt;code&gt;github.com/benbjohnson/clock&lt;/code&gt; give you this for free.&lt;/p&gt;

&lt;p&gt;Payoff: a test that verifies "retries happen every 500ms for 3 attempts" becomes deterministic — advance the fake clock 500ms, observe a retry, advance another 500ms, observe again. No sleeping in the test.&lt;/p&gt;
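
&lt;p&gt;If you'd rather not pull in a dependency, a fake clock for the interface above fits in a few dozen lines. This is a sketch, not the &lt;code&gt;benbjohnson/clock&lt;/code&gt; API: it covers only what the retry example needs, and &lt;code&gt;Sleep&lt;/code&gt; blocks until some other goroutine calls &lt;code&gt;Advance&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// A minimal FakeClock satisfying the Clock interface above. Advance moves
// time forward and fires any After timers that have come due. Sketch only:
// real fake-clock libraries handle Sleep and concurrent waiters more carefully.
type FakeClock struct {
    mu     sync.Mutex
    now    time.Time
    timers []fakeTimer
}

type fakeTimer struct {
    at time.Time
    ch chan time.Time
}

func (f *FakeClock) Now() time.Time {
    f.mu.Lock()
    defer f.mu.Unlock()
    return f.now
}

// Sleep blocks until Advance has moved the clock past d.
func (f *FakeClock) Sleep(d time.Duration) { &amp;lt;-f.After(d) }

func (f *FakeClock) After(d time.Duration) &amp;lt;-chan time.Time {
    f.mu.Lock()
    defer f.mu.Unlock()
    ch := make(chan time.Time, 1) // buffered so Advance never blocks on send
    f.timers = append(f.timers, fakeTimer{at: f.now.Add(d), ch: ch})
    return ch
}

// Advance moves the fake clock forward and fires every timer that is now due.
func (f *FakeClock) Advance(d time.Duration) {
    f.mu.Lock()
    defer f.mu.Unlock()
    f.now = f.now.Add(d)
    remaining := f.timers[:0]
    for _, tm := range f.timers {
        if !tm.at.After(f.now) {
            tm.ch &amp;lt;- f.now
        } else {
            remaining = append(remaining, tm)
        }
    }
    f.timers = remaining
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
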

&lt;h3&gt;
  
  
  2. Randomness
&lt;/h3&gt;

&lt;p&gt;Anything that shuffles, samples, picks a random ID, or generates random test data needs a seeded random source. &lt;code&gt;math/rand.Intn&lt;/code&gt; with the default source is process-global shared state; two tests running in parallel can interfere with each other's sequences.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt; &lt;span class="kt"&gt;int64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Service&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Service&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;))}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In tests, pass a known seed. In production, &lt;code&gt;rand.NewSource(time.Now().UnixNano())&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Concurrency ordering
&lt;/h3&gt;

&lt;p&gt;The nasty one. A test that creates goroutines and checks a result has to either (a) synchronize on a deterministic completion signal (a channel, a &lt;code&gt;WaitGroup&lt;/code&gt;) or (b) poll with a timeout — which is back to non-determinism.&lt;/p&gt;

&lt;p&gt;The best habit: design for deterministic completion. If you're testing "five goroutines should all complete and total the result," use &lt;code&gt;sync.WaitGroup.Wait()&lt;/code&gt; or close a channel. Don't sleep. Don't poll.&lt;/p&gt;
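
&lt;p&gt;A minimal example of that habit, with the worker shape kept deliberately trivial: the test blocks on the &lt;code&gt;WaitGroup&lt;/code&gt;, so it finishes for the same reason on every run.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Deterministic completion: the test waits on the WaitGroup, not a sleep,
// so it passes or fails for the same reason on every run.
func TestWorkersCompleteAndTotal(t *testing.T) {
    var (
        wg    sync.WaitGroup
        mu    sync.Mutex
        total int
    )
    for i := 1; i &amp;lt;= 5; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            mu.Lock()
            total += n
            mu.Unlock()
        }(i)
    }
    wg.Wait() // completion signal; no time.Sleep anywhere

    if total != 15 {
        t.Fatalf("total = %d, want 15", total)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
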

&lt;h2&gt;
  
  
  Concurrency Correctness: The Race Detector Is Not Optional
&lt;/h2&gt;

&lt;p&gt;Go ships with a race detector. Running &lt;code&gt;go test -race&lt;/code&gt; is one flag and it catches an entire category of bugs that will otherwise show up as "works on my machine." In my experience, any production Go service will, on first &lt;code&gt;-race&lt;/code&gt; run, surface at least one real data race that had been silently ignored.&lt;/p&gt;

&lt;p&gt;The race detector adds ~5-10x runtime overhead, so people skip it on every-save tests. Fine. Run it in CI. Run it on nightly integration tests. Run it on anything touching shared state. Some configurations I've seen work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Every PR&lt;/strong&gt;: run unit tests with &lt;code&gt;-race&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nightly&lt;/strong&gt;: run full integration suite with &lt;code&gt;-race&lt;/code&gt; and a longer timeout.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pre-release&lt;/strong&gt;: run stress tests with &lt;code&gt;-race&lt;/code&gt; against a production-sized dataset.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The cost of running with &lt;code&gt;-race&lt;/code&gt; is engineering discipline. The payoff is not debugging a data race at 2 AM.&lt;/p&gt;
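
&lt;p&gt;In command form, one way to wire those tiers (the &lt;code&gt;-short&lt;/code&gt; and build-tag split is a convention, not the only one):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# every PR: unit tests under the race detector
go test -race -short ./...

# nightly: full integration suite, longer timeout
go test -race -tags integration -timeout 30m ./...

# pre-release: stress tests only, repeated
go test -race -run Stress -count 5 ./...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
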

&lt;p&gt;Beyond the race detector, &lt;strong&gt;stress tests&lt;/strong&gt; are undervalued. A test that runs your concurrent path 1,000 times with different goroutine interleavings catches bugs that a single-iteration test never will.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestConcurrentWorkers_Stress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Short&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Skip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"stress test"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"iter%d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="c"&gt;// ... actual test body ...&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;t.Parallel()&lt;/code&gt; + 1,000 iterations + &lt;code&gt;-race&lt;/code&gt; finds race conditions that a single deterministic run happily misses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Contract Fidelity: The Bug Class Everyone Misses
&lt;/h2&gt;

&lt;p&gt;Say your service calls a downstream gRPC service for payments. You write a mock that returns a successful response. Your tests pass. The downstream team changes their error code vocabulary. Your service now misinterprets their new error. Production finds out first.&lt;/p&gt;

&lt;p&gt;Contract testing addresses this. Two approaches work in practice:&lt;/p&gt;

&lt;h3&gt;
  
  
  Shared schema, shared types
&lt;/h3&gt;

&lt;p&gt;If the downstream service publishes a protobuf file (they should), your service imports it directly. Your tests use types generated from the real contract. If the downstream changes the proto in a way that breaks your usage, your next build fails — loudly, at compile time.&lt;/p&gt;

&lt;p&gt;This is the simplest and often best answer for Go services with gRPC downstreams. The contract is literally the shared protobuf.&lt;/p&gt;

&lt;h3&gt;
  
  
  Consumer-driven contract tests
&lt;/h3&gt;

&lt;p&gt;Each consumer writes tests that capture its expectations of the downstream. Those tests run against the real downstream (or through a contract-testing framework like Pact). When the downstream changes, the contract tests catch it before the contract as written and the downstream's actual behavior diverge.&lt;/p&gt;

&lt;p&gt;This helps for REST APIs where there's no single source of truth schema. It's more ceremony. For most gRPC Go services, shared protobufs cover it.&lt;/p&gt;

&lt;h3&gt;
  
  
  The "mock everything" antipattern
&lt;/h3&gt;

&lt;p&gt;If your test suite consists of mocks that return whatever your test needs, you're not testing integration. You're testing that your code calls your mocks correctly. That's a tautology. Real integration bugs live in the gap between your mock's behavior and the downstream's actual behavior.&lt;/p&gt;

&lt;p&gt;Have at least one test per integration point that hits the real downstream — either in a staging environment or via Testcontainers. Keep the mocks for fast feedback, but don't pretend they're the only tests you need.&lt;/p&gt;

&lt;h2&gt;
  
  
  Environment Fidelity: Use Real Infra Where It Matters
&lt;/h2&gt;

&lt;p&gt;The sharpest line in my test taxonomy is between "close to production runtime" and "not close."&lt;/p&gt;

&lt;p&gt;Things that matter and are worth running on real infrastructure in tests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Databases.&lt;/strong&gt; SQLite is not Postgres is not MySQL. Query planner, isolation levels, and error shapes differ. Test with the DB you ship with.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message brokers.&lt;/strong&gt; Kafka's ordering and offset semantics cannot be faked well. Use a real Kafka (or Redpanda) in tests that exercise ordering or replay.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caches.&lt;/strong&gt; Redis has specific failover and eviction semantics. A fake in-memory map doesn't reproduce them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time-sensitive downstream APIs.&lt;/strong&gt; Anything with rate limits or TTLs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Things that rarely matter and are fine with fakes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Object storage.&lt;/strong&gt; A local file-system backend usually reproduces S3 well enough.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metrics / tracing exporters.&lt;/strong&gt; Tests don't need a real Prometheus.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email / SMS.&lt;/strong&gt; A mock recording calls is plenty.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern: &lt;strong&gt;test with real infra for anything where semantic difference is possible&lt;/strong&gt;. Testcontainers (&lt;code&gt;github.com/testcontainers/testcontainers-go&lt;/code&gt;) makes this painless:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;setupPostgres&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;postgres&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RunContainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;testcontainers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"postgres:15-alpine"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;postgres&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithDatabase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"testdb"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;postgres&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithUsername&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"testuser"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;postgres&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithPassword&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"testpass"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;require&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NoError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Cleanup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Terminate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;dsn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ConnectionString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"sslmode=disable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;require&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NoError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dsn&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Slow? Yes — each container takes a few seconds to start. But you can run them once per test package with a &lt;code&gt;TestMain&lt;/code&gt;, and the bugs they catch are the ones most worth catching.&lt;/p&gt;
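
&lt;p&gt;One way to amortize that startup cost is a package-level &lt;code&gt;TestMain&lt;/code&gt; that starts the container once and shares the DSN. A minimal sketch, reusing the same setup calls as above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// One container per test package: TestMain starts Postgres once, shares the
// DSN through a package-level variable, and tears the container down at the
// end. Sketch only; error handling kept minimal.
var testDSN string

func TestMain(m *testing.M) {
    ctx := context.Background()
    c, err := postgres.RunContainer(ctx,
        testcontainers.WithImage("postgres:15-alpine"),
        postgres.WithDatabase("testdb"),
        postgres.WithUsername("testuser"),
        postgres.WithPassword("testpass"),
    )
    if err != nil {
        log.Fatalf("start postgres: %v", err)
    }

    dsn, err := c.ConnectionString(ctx, "sslmode=disable")
    if err != nil {
        log.Fatalf("connection string: %v", err)
    }
    testDSN = dsn // individual tests open their own connections from this

    code := m.Run()
    _ = c.Terminate(ctx)
    os.Exit(code)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
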

&lt;h2&gt;
  
  
  A Real Taxonomy
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBGYXN0WyJSdW4gb24gZXZlcnkgc2F2ZSJdCiAgICAgICAgVDFbIkZhc3QgdGVzdHM8YnIvPnB1cmUgZnVuY3Rpb25zIMK3IGFsZ29yaXRobXMiXQogICAgZW5kCgogICAgc3ViZ3JhcGggUFJbIlJ1biBvbiBldmVyeSBQUiJdCiAgICAgICAgVDJbIkNvbmN1cnJlbmN5IHRlc3RzPGJyLz4tcmFjZSDCtyBzdHJlc3MiXQogICAgICAgIFQzWyJEZXRlcm1pbmlzdGljIGludGVncmF0aW9uPGJyLz5mYWtlIGNsb2NrIMK3IGZha2UgZG93bnN0cmVhbSJdCiAgICAgICAgVDRbIlJlYWwtaW5mcmEgaW50ZWdyYXRpb248YnIvPlRlc3Rjb250YWluZXJzIFBvc3RncmVzIC8gUmVkaXMgLyBLYWZrYSJdCiAgICAgICAgVDVbIkNvbnRyYWN0IHRlc3RzPGJyLz5zaGFyZWQgc2NoZW1hcyDCtyBwcm90byB2ZXJzaW9ucyJdCiAgICBlbmQKCiAgICBzdWJncmFwaCBOaWdodGx5WyJSdW4gb24gc2NoZWR1bGUiXQogICAgICAgIFQ2WyJTdHJlc3MgdGVzdHM8YnIvPjEwMDAtaXRlciAtcmFjZSJdCiAgICAgICAgVDdbIkVuZC10by1lbmQ8YnIvPnJlYWwgc2VydmljZXMgwrcgc3RhZ2luZyJdCiAgICBlbmQKCiAgICBGYXN0IC0tPiBQUiAtLT4gTmlnaHRseQoKICAgIGNsYXNzRGVmIGZhc3QgZmlsbDojZjBmZmY0LHN0cm9rZTojMmY4NTVhCiAgICBjbGFzc0RlZiBwciBmaWxsOiNlOGY0Zjgsc3Ryb2tlOiMyYzUyODIKICAgIGNsYXNzRGVmIG5pZ2h0bHkgZmlsbDojZmVmNWU3LHN0cm9rZTojYjc3OTFmCiAgICBjbGFzcyBGYXN0IGZhc3QKICAgIGNsYXNzIFBSIHByCiAgICBjbGFzcyBOaWdodGx5IG5pZ2h0bHk%3D" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBGYXN0WyJSdW4gb24gZXZlcnkgc2F2ZSJdCiAgICAgICAgVDFbIkZhc3QgdGVzdHM8YnIvPnB1cmUgZnVuY3Rpb25zIMK3IGFsZ29yaXRobXMiXQogICAgZW5kCgogICAgc3ViZ3JhcGggUFJbIlJ1biBvbiBldmVyeSBQUiJdCiAgICAgICAgVDJbIkNvbmN1cnJlbmN5IHRlc3RzPGJyLz4tcmFjZSDCtyBzdHJlc3MiXQogICAgICAgIFQzWyJEZXRlcm1pbmlzdGljIGludGVncmF0aW9uPGJyLz5mYWtlIGNsb2NrIMK3IGZha2UgZG93bnN0cmVhbSJdCiAgICAgICAgVDRbIlJlYWwtaW5mcmEgaW50ZWdyYXRpb248YnIvPlRlc3Rjb250YWluZXJzIFBvc3RncmVzIC8gUmVkaXMgLyBLYWZrYSJdCiAgICAgICAgVDVbIkNvbnRyYWN0IHRlc3RzPGJyLz5zaGFyZWQgc2NoZW1hcyDCtyBwcm90byB2ZXJzaW9ucyJdCiAgICBlbmQKCiAgICBzdWJncmFwaCBOaWdodGx5WyJSdW4gb24gc2NoZWR1bGUiXQogICAgICAgIFQ2WyJTdHJlc3MgdGVzdHM8YnIvPjEwMDAtaXRlciAtcmFjZSJdCiAgICAgICAgVDdbIkVuZC10by1lbmQ8YnIvPnJlYWwgc2VydmljZXMgwrcgc3RhZ2luZyJdCiAgICBlbmQKCiAgICBGYXN0IC0tPiBQUiAtLT4gTmlnaHRseQoKICAgIGNsYXNzRGVmIGZhc3QgZmlsbDojZjBmZmY0LHN0cm9rZTojMmY4NTVhCiAgICBjbGFzc0RlZiBwciBmaWxsOiNlOGY0Zjgsc3Ryb2tlOiMyYzUyODIKICAgIGNsYXNzRGVmIG5pZ2h0bHkgZmlsbDojZmVmNWU3LHN0cm9rZTojYjc3OTFmCiAgICBjbGFzcyBGYXN0IGZhc3QKICAgIGNsYXNzIFBSIHByCiAgICBjbGFzcyBOaWdodGx5IG5pZ2h0bHk%3D" alt="T1[" width="" height=""&gt;&lt;/a&gt;pure functions · algorithms"]"/&amp;gt;&lt;/p&gt;

&lt;p&gt;Here's the taxonomy I actually use when designing a test suite for a Go backend:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fast tests&lt;/strong&gt; (seconds for the whole file): pure functions, algorithms, small state machines. Run on every save.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrency tests&lt;/strong&gt; (seconds to a minute): anything with goroutines. Run with &lt;code&gt;-race&lt;/code&gt;. Run in PR.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic integration tests&lt;/strong&gt; (single-digit seconds per test): one module + fakes + fake clock. Fast enough to keep in the main test run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-infra integration tests&lt;/strong&gt; (seconds per test): one module + real DB / Kafka / Redis via Testcontainers. Run in PR, longer timeout.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contract tests&lt;/strong&gt; (milliseconds): verify shared schemas with downstreams. Run on every schema change.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stress tests&lt;/strong&gt; (minutes): high-iteration, high-concurrency, with &lt;code&gt;-race&lt;/code&gt;. Run nightly or on schedule.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;End-to-end tests&lt;/strong&gt; (minutes): real services, real network, against a staging environment. Run pre-release.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What you'll notice: "unit" and "integration" don't appear as categories. That's on purpose. The level of isolation is an implementation detail. The purpose of the test is the taxonomy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Small Habits That Pay Off
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;t.Cleanup&lt;/code&gt; over &lt;code&gt;defer&lt;/code&gt;.&lt;/strong&gt; Cleanups run in LIFO order, can be registered from helpers as well as the test body, and still run after parallel subtests finish, which a plain &lt;code&gt;defer&lt;/code&gt; in the test function can't guarantee.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prefer table-driven tests.&lt;/strong&gt; Twenty tests as rows in a slice beats twenty nearly-identical test functions (see the sketch after this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fail tests with &lt;code&gt;t.Fatalf&lt;/code&gt;, not &lt;code&gt;t.Errorf&lt;/code&gt;, for setup failures.&lt;/strong&gt; A broken setup should abort the test immediately; a failed assertion can use &lt;code&gt;t.Errorf&lt;/code&gt; so the test keeps running and collects more failures.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Golden files for complex outputs.&lt;/strong&gt; If you're verifying a generated SQL query, a serialized event, or a JSON response, a golden file comparison is more readable than a long string literal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Separate &lt;code&gt;_test.go&lt;/code&gt; files for slow tests with a build tag.&lt;/strong&gt; &lt;code&gt;//go:build integration&lt;/code&gt; lets you opt in explicitly with &lt;code&gt;go test -tags integration&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
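
&lt;p&gt;For the table-driven habit, a minimal sketch; &lt;code&gt;parseTimeout&lt;/code&gt; is a hypothetical function under test, not anything from the standard library:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// A table-driven test: each case is a row, and the same body runs once per
// row as a named subtest.
func TestParseTimeout(t *testing.T) {
    cases := []struct {
        name    string
        in      string
        want    time.Duration
        wantErr bool
    }{
        {name: "plain seconds", in: "30s", want: 30 * time.Second},
        {name: "minutes", in: "2m", want: 2 * time.Minute},
        {name: "empty input", in: "", wantErr: true},
        {name: "garbage", in: "soon", wantErr: true},
    }

    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            got, err := parseTimeout(tc.in)
            if tc.wantErr {
                if err == nil {
                    t.Fatalf("parseTimeout(%q): expected error, got %v", tc.in, got)
                }
                return
            }
            if err != nil {
                t.Fatalf("parseTimeout(%q): unexpected error: %v", tc.in, err)
            }
            if got != tc.want {
                t.Fatalf("parseTimeout(%q) = %v, want %v", tc.in, got, tc.want)
            }
        })
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
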

&lt;h2&gt;
  
  
  The Shift That Changed My Testing
&lt;/h2&gt;

&lt;p&gt;Coverage numbers lie. The question is not "what percent of lines are executed by tests" — it's "what percent of the risky behaviors are covered by tests that will actually fail when those behaviors break."&lt;/p&gt;

&lt;p&gt;A codebase with 95% line coverage and zero race tests, zero real-DB tests, and mock-heavy integration tests is brittle. A codebase with 60% line coverage, &lt;code&gt;go test -race&lt;/code&gt; in CI, Testcontainers for the DB, and a stress test for every hot concurrent path is not.&lt;/p&gt;

&lt;p&gt;The single biggest shift I recommend: &lt;strong&gt;stop thinking about tests in terms of isolation level, and start thinking about them in terms of the production failure modes you're actually afraid of&lt;/strong&gt;. Map each failure mode to a test shape. If you don't have a test shape for a failure mode, you don't really have that failure mode covered — you just hope it doesn't happen.&lt;/p&gt;

&lt;p&gt;Production has opinions about what you hope.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-chan-context-structure-not-speed/" rel="noopener noreferrer"&gt;Go's Concurrency Is About Structure, Not Speed&lt;/a&gt; — the concurrency patterns that make production-shape Go possible.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-context-distributed-systems-production/" rel="noopener noreferrer"&gt;Go Context in Distributed Systems: What Actually Works in Production&lt;/a&gt; — the single most common test gap in Go services I review.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/fail-fast-bounded-resilience-distributed-systems/" rel="noopener noreferrer"&gt;Why Your "Fail-Fast" Strategy is Killing Your Distributed System&lt;/a&gt; — a production failure mode that's hard to test unless you design the test for it.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>testing</category>
      <category>backendengineering</category>
    </item>
    <item>
      <title>Scale-Up vs Scale-Out: Why Every Language Wins Somewhere</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Sat, 18 Apr 2026 00:18:32 +0000</pubDate>
      <link>https://forem.com/harrisonsec/scale-up-vs-scale-out-why-every-language-wins-somewhere-3k6l</link>
      <guid>https://forem.com/harrisonsec/scale-up-vs-scale-out-why-every-language-wins-somewhere-3k6l</guid>
      <description>&lt;p&gt;I worked with a team that rewrote a critical service from Go to Rust because "performance." Six months later, the service was 30% faster, the team was miserable, and feature velocity had dropped to a crawl. Meanwhile the competitor team, still on Go, had shipped four new features.&lt;/p&gt;

&lt;p&gt;We did the postmortem eventually. The service handled maybe 2,000 requests per second on a 4-core machine. CPU utilization sat around 20%. Rust's extra speed bought us exactly nothing — the bottleneck was downstream database latency. What it cost us was every feature we didn't ship while writing unsafe, fighting the borrow checker, and nursing the team through the learning curve.&lt;/p&gt;

&lt;p&gt;That incident taught me the question I wish I'd learned earlier: &lt;strong&gt;what are you actually scaling, and does the language buy you the right kind of scale?&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — Language benchmarks optimize for one axis: per-request performance. Real systems have multiple axes — throughput, latency, concurrency, developer velocity, operational complexity, memory efficiency. Rust, Go, Java, Python aren't competing to be "fastest." They're different answers to different bets about what you're going to scale. Pick by fit, not by leaderboard.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Two Kinds of Scale
&lt;/h2&gt;

&lt;p&gt;At the top level, two strategies dominate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scale-up&lt;/strong&gt;: make one machine do more. Vertical scaling. Faster CPUs, more RAM, specialized hardware, lower per-operation cost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scale-out&lt;/strong&gt;: add more machines. Horizontal scaling. Cheaper commodity hardware, more concurrency, lots of work running in parallel.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These aren't just infrastructure decisions. They're reflected in the language and ecosystem you pick. A language optimized for scale-up (Rust, C++) has different priorities than one optimized for scale-out (Go, Elixir) or one optimized for neither but for developer leverage (Python, Ruby).&lt;/p&gt;

&lt;p&gt;The big confusion comes from mixing axes. "Rust is faster than Go" is true on per-op microbenchmarks and irrelevant if your workload is I/O-bound service-to-service traffic. "Python is slow" is true in a compute-bound loop and irrelevant for a 500-QPS API that spends 95% of its time waiting on PostgreSQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Each Language Actually Wins
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FcXVhZHJhbnRDaGFydAogICAgdGl0bGUgTGFuZ3VhZ2UgZml0IGJ5IHdoYXQgeW91J3JlIHNjYWxpbmcKICAgIHgtYXhpcyBTY2FsZS1vdXQgKG1hbnkgbWFjaGluZXMgLyBjaGVhcCBjb25jdXJyZW5jeSkgLS0-IFNjYWxlLXVwIChvbmUgbWFjaGluZSwgcHVzaGVkIGhhcmQpCiAgICB5LWF4aXMgUHJvdG90eXBlIHZlbG9jaXR5IC0tPiBQcm9kdWN0aW9uIHJpZ29yCiAgICBxdWFkcmFudC0xICJTY2FsZS11cCArIHJpZ29yPGJyLz4oUnVzdCDCtyBDKysgwrcgWmlnKSIKICAgIHF1YWRyYW50LTIgIlNjYWxlLW91dCArIHJpZ29yPGJyLz4oR28gwrcgSmF2YS9Lb3RsaW4pIgogICAgcXVhZHJhbnQtMyAiU2NhbGUtb3V0ICsgdmVsb2NpdHk8YnIvPihQeXRob24gwrcgUnVieSDCtyBOb2RlKSIKICAgIHF1YWRyYW50LTQgIlNjYWxlLXVwICsgdmVsb2NpdHk8YnIvPihuYXJyb3cgbmljaGUpIgogICAgUnVzdDogWzAuODUsIDAuODVdCiAgICAiQysrIjogWzAuOTIsIDAuODhdCiAgICBHbzogWzAuMjUsIDAuNzVdCiAgICAiSmF2YS9Lb3RsaW4iOiBbMC4zMCwgMC44MF0KICAgIFB5dGhvbjogWzAuMjUsIDAuMjVdCiAgICBSdWJ5OiBbMC4yNSwgMC4zMF0KICAgIE5vZGU6IFswLjMwLCAwLjM1XQ%3D%3D" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FcXVhZHJhbnRDaGFydAogICAgdGl0bGUgTGFuZ3VhZ2UgZml0IGJ5IHdoYXQgeW91J3JlIHNjYWxpbmcKICAgIHgtYXhpcyBTY2FsZS1vdXQgKG1hbnkgbWFjaGluZXMgLyBjaGVhcCBjb25jdXJyZW5jeSkgLS0-IFNjYWxlLXVwIChvbmUgbWFjaGluZSwgcHVzaGVkIGhhcmQpCiAgICB5LWF4aXMgUHJvdG90eXBlIHZlbG9jaXR5IC0tPiBQcm9kdWN0aW9uIHJpZ29yCiAgICBxdWFkcmFudC0xICJTY2FsZS11cCArIHJpZ29yPGJyLz4oUnVzdCDCtyBDKysgwrcgWmlnKSIKICAgIHF1YWRyYW50LTIgIlNjYWxlLW91dCArIHJpZ29yPGJyLz4oR28gwrcgSmF2YS9Lb3RsaW4pIgogICAgcXVhZHJhbnQtMyAiU2NhbGUtb3V0ICsgdmVsb2NpdHk8YnIvPihQeXRob24gwrcgUnVieSDCtyBOb2RlKSIKICAgIHF1YWRyYW50LTQgIlNjYWxlLXVwICsgdmVsb2NpdHk8YnIvPihuYXJyb3cgbmljaGUpIgogICAgUnVzdDogWzAuODUsIDAuODVdCiAgICAiQysrIjogWzAuOTIsIDAuODhdCiAgICBHbzogWzAuMjUsIDAuNzVdCiAgICAiSmF2YS9Lb3RsaW4iOiBbMC4zMCwgMC44MF0KICAgIFB5dGhvbjogWzAuMjUsIDAuMjVdCiAgICBSdWJ5OiBbMC4yNSwgMC4zMF0KICAgIE5vZGU6IFswLjMwLCAwLjM1XQ%3D%3D" alt="title Language fit by what you're scaling" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Rough positioning — not a benchmark, a fit map. The language you pick should live near the kind of scaling your system actually demands.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rust / C++ / Zig — Scale-up champions
&lt;/h3&gt;

&lt;p&gt;These languages dominate when &lt;strong&gt;per-machine throughput is the bottleneck&lt;/strong&gt; and you can afford the engineering cost. That's a narrower set of problems than Twitter would have you believe, but the problems that exist are real:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-frequency trading engines — microseconds matter, GC pauses are unacceptable, every cache line counts.&lt;/li&gt;
&lt;li&gt;Inference engines — llama.cpp, vLLM, mistral.rs. Memory layout, SIMD, custom kernels.&lt;/li&gt;
&lt;li&gt;Databases and storage engines — ScyllaDB, TiKV, FoundationDB internals. State machines that live forever and must not leak.&lt;/li&gt;
&lt;li&gt;Network data planes — Cloudflare's Pingora, proxies at the edge.&lt;/li&gt;
&lt;li&gt;Game engines, audio/video encoding, embedded.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern: &lt;strong&gt;one box, pushed hard, for years&lt;/strong&gt;. Memory safety matters because bugs compound over time. Performance matters because throughput per core is the product.&lt;/p&gt;

&lt;p&gt;The cost: every commit is slower. Refactoring is expensive. Onboarding is measured in months, not weeks. The compile times are what they are. You pay this cost every day the service exists.&lt;/p&gt;

&lt;h3&gt;
  
  
  Go — Scale-out champion
&lt;/h3&gt;

&lt;p&gt;Go hits a specific sweet spot: &lt;strong&gt;cheap concurrency, predictable performance, fast-to-ship code, and easy to hire for&lt;/strong&gt;. It's a scale-out language.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Thousands of goroutines per core, 2KB starting stacks, user-space context switching. The "cost of one more waiter" is nearly zero (see the sketch after this list).&lt;/li&gt;
&lt;li&gt;Standard library is enough for 80% of backend work — HTTP server, JSON, SQL, crypto.&lt;/li&gt;
&lt;li&gt;Compilation is fast enough to stay in flow. Iteration loop feels similar to a dynamic language.&lt;/li&gt;
&lt;li&gt;Minimalism is aggressive. One person can read the whole language in a weekend. New hires are productive in days.&lt;/li&gt;
&lt;/ul&gt;
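
&lt;p&gt;Roughly what that looks like in code (the handler and &lt;code&gt;callDatabase&lt;/code&gt; below are illustrative placeholders, not a real API): each request gets its own goroutine, and blocking on a slow downstream parks a few kilobytes of user-space stack instead of an OS thread.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch only: one goroutine per request. callDatabase stands in for any
// slow downstream dependency; the names are made up for illustration.
package main

import (
    "fmt"
    "net/http"
    "time"
)

func callDatabase() string {
    time.Sleep(50 * time.Millisecond) // this goroutine parks; the OS thread moves on
    return "rows"
}

func handler(w http.ResponseWriter, r *http.Request) {
    // net/http already runs each request in its own goroutine, so blocking
    // here is cheap: a small stack, not a kernel thread.
    rows := callDatabase()
    fmt.Fprintln(w, rows)
}

func main() {
    http.HandleFunc("/", handler)
    http.ListenAndServe(":8080", nil)
}
&lt;/code&gt;&lt;/pre&gt;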

&lt;p&gt;Where it loses: per-op performance. Go's GC is fine but not invisible. Zero-copy generic code is harder to write than in Rust. The type system doesn't prevent the entire class of bugs Rust's does.&lt;/p&gt;

&lt;p&gt;Go's bet: the problem you're most likely to have is "I need to handle 10x the concurrent work with 2x the code." Not "I need this loop to be 5% faster." For most backend services, that bet is right.&lt;/p&gt;

&lt;h3&gt;
  
  
  Java / Kotlin — Mature scale-out with runtime depth
&lt;/h3&gt;

&lt;p&gt;The JVM is what you want when the workload is scale-out but you need runtime flexibility Go doesn't give you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A mature JIT that optimizes hot paths beyond what AOT can.&lt;/li&gt;
&lt;li&gt;Rich profiling and monitoring (JFR/Flight Recorder, async-profiler) that make post-deploy tuning feasible.&lt;/li&gt;
&lt;li&gt;An ecosystem that, after 25 years, has a mature library for basically anything.&lt;/li&gt;
&lt;li&gt;Kotlin on top gives you modern syntax and coroutines without leaving the ecosystem.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Where it loses: startup time, memory overhead, operational complexity (GC tuning is a real job), the occasional "it works on my JDK 11 but the prod JDK 17 changed something." Also: hiring is harder than it is for Go these days, at least in my corner of the industry.&lt;/p&gt;

&lt;p&gt;Java's bet: "you'll still be running this service in ten years, and you want to be able to tune its runtime when that day comes." For large enterprises with deep infrastructure, that bet pays off. For a startup shipping its first three services, the overhead is not worth it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python / Ruby — Developer-velocity champions
&lt;/h3&gt;

&lt;p&gt;The forgotten-but-dominant answer: languages that optimize neither scale-up nor scale-out, but &lt;strong&gt;scale-the-team&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fast to write, fast to read, fast to debug.&lt;/li&gt;
&lt;li&gt;Massive libraries for data, ML, scripting, DSLs.&lt;/li&gt;
&lt;li&gt;Easy to onboard anyone — CS students, data scientists, analysts.&lt;/li&gt;
&lt;li&gt;Prototype-to-production path is shorter than anywhere else.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Where they lose: per-core throughput, concurrency (the GIL is real), memory. Python and Ruby are not your language for a 100K QPS service.&lt;/p&gt;

&lt;p&gt;But a lot of real companies don't need a 100K QPS service. They need to get a thing working, put it in front of users, and iterate. If your current problem is "we need to ship the next feature this week," Python might be the right answer even if a Rust version would technically run faster.&lt;/p&gt;

&lt;p&gt;Python's bet: throughput isn't the constraint yet. Time-to-shipped-feature is. For most companies most of the time, that's correct.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Axes Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Beyond scale-up/scale-out, a few axes decide more projects than raw performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Developer-velocity per week
&lt;/h3&gt;

&lt;p&gt;"I can ship a feature and have it in production by Friday" beats "this service is 2x faster" most of the time. Measure it. If your current stack requires a two-day ceremony to deploy a one-line change, throughput is not your problem. Velocity is.&lt;/p&gt;

&lt;h3&gt;
  
  
  Operational complexity
&lt;/h3&gt;

&lt;p&gt;Scale-up is operationally cheaper than scale-out. One machine, one process, one log. Scale-out gives you better redundancy but also distributed-systems problems — consistency, ordering, partial failure, chaos engineering. If your team is three people, the operational complexity of a 20-node scale-out cluster may eat more time than the language choice saves.&lt;/p&gt;

&lt;h3&gt;
  
  
  Memory efficiency per dollar
&lt;/h3&gt;

&lt;p&gt;At cloud scale, memory is expensive. A Rust service that fits in 2GB where a Java service needs 8GB is a 4x savings on every instance. Multiply by thousands of instances and "per-op performance" stops being the interesting number — per-GB cost starts to matter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hiring pool
&lt;/h3&gt;

&lt;p&gt;The language with the deepest talent pool in your market is usually the right answer for a new system, all else equal. A marginal technical improvement isn't worth a six-month hiring pipeline.&lt;/p&gt;

&lt;h3&gt;
  
  
  Learning curve shape
&lt;/h3&gt;

&lt;p&gt;Some languages have shallow onboarding (Go, Python) and a long tail of depth. Others have steep onboarding (Rust, Haskell) and you're productive only after the ramp. For a senior team on a long-lived system, steep is fine. For a fast-moving team, steep is expensive.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pattern I See Repeated
&lt;/h2&gt;

&lt;p&gt;A company starts small, picks Python or Ruby, builds the thing, ships to production. Ten employees. One codebase. Life is fast.&lt;/p&gt;

&lt;p&gt;They grow to fifty engineers. The monolith cracks. Some services get rewritten in Go for concurrency and operational simplicity. A few performance-critical ones get written in Rust. Data infra sits on the JVM (Kafka, Spark, Flink). A few internal tools stay in Python because the team knows it and it works.&lt;/p&gt;

&lt;p&gt;Five years in, the stack is polyglot. Nobody regrets it. What they regret is the six months they spent trying to make a single-language stack work past its comfort zone — the Python team pushing for "just async more things," or the Rust team fighting the borrow checker on code that could have been Go, or the Java team explaining to a new hire why the stack trace is 400 lines long.&lt;/p&gt;

&lt;p&gt;The pattern: &lt;strong&gt;pick the language that fits the service, not the service that fits the language&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Ask the Question Now
&lt;/h2&gt;

&lt;p&gt;When someone proposes "let's build this new thing in X," I ask:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What's the expected traffic profile, and what's the per-request work shape?&lt;/li&gt;
&lt;li&gt;Is this scale-up limited (per-machine throughput) or scale-out limited (concurrent work)?&lt;/li&gt;
&lt;li&gt;Who's going to write this, and how fast do we need them productive?&lt;/li&gt;
&lt;li&gt;Who's going to operate this, and what's their tooling comfort?&lt;/li&gt;
&lt;li&gt;Does this interact with an existing ecosystem (JVM data platform, Rust security infra)?&lt;/li&gt;
&lt;li&gt;How long does it have to live?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The answers to those six questions usually land me on one of three languages for 80% of the systems I see: Go, Rust, or (for data-adjacent work) Kotlin on the JVM. Python still shows up for tools and glue. Everything else is contextual.&lt;/p&gt;

&lt;p&gt;The benchmarks don't help. Per-op microbenchmarks answer questions nobody is actually asking. The right question is which axes matter for &lt;em&gt;this&lt;/em&gt; system, and which language's bet lines up with those axes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Argument I've Stopped Having
&lt;/h2&gt;

&lt;p&gt;I still see engineers argue about whether Rust or Go is "better." Both are good languages. Both are bad choices for problems they weren't designed for. The meaningful question is which kind of scale you're paying for — and the honest answer is almost always a mix, evolving over time.&lt;/p&gt;

&lt;p&gt;The Rust rewrite I opened with wasn't a bad decision because Rust is a bad language. It was a bad decision because we weren't scale-up limited. We were downstream-database limited. No language could help with that.&lt;/p&gt;

&lt;p&gt;Know which scale you're buying, and buy it on purpose.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-millions-connections-user-space-context-switching/" rel="noopener noreferrer"&gt;Why Go Handles Millions of Connections: User-Space Context Switching, Explained&lt;/a&gt; — the design decision behind Go's scale-out bet.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-chan-context-structure-not-speed/" rel="noopener noreferrer"&gt;Go's Concurrency Is About Structure, Not Speed&lt;/a&gt; — what you actually get with Go, and what you don't.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/nats-kafka-mqtt-same-category-different-jobs/" rel="noopener noreferrer"&gt;NATS vs Kafka vs MQTT: Same Category, Very Different Jobs&lt;/a&gt; — applying the same fit-vs-benchmark thinking to messaging.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>programminglanguages</category>
      <category>systemdesign</category>
      <category>scale</category>
      <category>rust</category>
    </item>
    <item>
      <title>From Locks to Actors: The Four Pillars of Modern Concurrency</title>
      <dc:creator>Harrison Guo</dc:creator>
      <pubDate>Fri, 17 Apr 2026 05:50:27 +0000</pubDate>
      <link>https://forem.com/harrisonsec/from-locks-to-actors-the-four-pillars-of-modern-concurrency-3o50</link>
      <guid>https://forem.com/harrisonsec/from-locks-to-actors-the-four-pillars-of-modern-concurrency-3o50</guid>
      <description>&lt;p&gt;Most working engineers have spent ninety percent of their concurrent-programming life in one model: shared memory protected by locks. Threads that all see the same variables. Mutexes around the critical sections. Hope and care. It's the model every OS textbook teaches, every mainstream language supports, and every senior engineer has a horror story about.&lt;/p&gt;

&lt;p&gt;It's also not the only option. Or even the best one, for many of the problems it gets used for. Three other models — CSP, actors, and software transactional memory — have been around for decades, are mature enough for production, and each solves a class of problems that lock-based designs handle poorly.&lt;/p&gt;

&lt;p&gt;This is a map of all four, from a working backend engineer who uses each of them for different jobs, and a take on when each is the right answer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBQMVsiMSDCtyBTaGFyZWQgTWVtb3J5ICsgTG9ja3MiXQogICAgICAgIE0xWyJUaHJlYWRzIHNoYXJlIGFkZHJlc3Mgc3BhY2UiXQogICAgICAgIE0yWyJNdXRleCDCtyBhdG9taWNzIMK3IGNvbmQgdmFyIl0KICAgICAgICBNM1siRGVhZGxvY2tzIMK3IHJhY2VzIMK3IGludmlzaWJsZSBidWdzIl0KICAgIGVuZAoKICAgIHN1YmdyYXBoIFAyWyIyIMK3IENTUCDigJQgQ29tbXVuaWNhdGluZyBTZXF1ZW50aWFsIFByb2Nlc3NlcyJdCiAgICAgICAgQzFbIkdvcm91dGluZXMgKyBjaGFubmVscyJdCiAgICAgICAgQzJbIk93bmVyc2hpcCBtb3ZlcyB3aXRoIG1lc3NhZ2UiXQogICAgICAgIEMzWyJCYWNrcHJlc3N1cmUgYnVpbHQtaW4iXQogICAgZW5kCgogICAgc3ViZ3JhcGggUDNbIjMgwrcgQWN0b3JzIl0KICAgICAgICBBMVsiTmFtZWQgZW50aXR5ICsgbWFpbGJveCJdCiAgICAgICAgQTJbIlByaXZhdGUgc3RhdGUgwrcgbm8gc2hhcmluZyJdCiAgICAgICAgQTNbIlN1cGVydmlzaW9uIMK3IGxldCBpdCBjcmFzaCJdCiAgICBlbmQKCiAgICBzdWJncmFwaCBQNFsiNCDCtyBTb2Z0d2FyZSBUcmFuc2FjdGlvbmFsIE1lbW9yeSJdCiAgICAgICAgUzFbIk9wdGltaXN0aWMgdHJhbnNhY3Rpb25zIl0KICAgICAgICBTMlsiQ29tcG9zYWJsZSDCtyByZXRyeSBvbiBjb25mbGljdCJdCiAgICAgICAgUzNbIk5vIGxvY2tzLCBubyBkZWFkbG9ja3MiXQogICAgZW5kCgogICAgY2xhc3NEZWYgcGlsbGFyIGZpbGw6I2U4ZjRmOCxzdHJva2U6IzJjNTI4MixzdHJva2Utd2lkdGg6MnB4CiAgICBjbGFzcyBQMSxQMixQMyxQNCBwaWxsYXI%3D" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmermaid.ink%2Fimg%2FZmxvd2NoYXJ0IExSCiAgICBzdWJncmFwaCBQMVsiMSDCtyBTaGFyZWQgTWVtb3J5ICsgTG9ja3MiXQogICAgICAgIE0xWyJUaHJlYWRzIHNoYXJlIGFkZHJlc3Mgc3BhY2UiXQogICAgICAgIE0yWyJNdXRleCDCtyBhdG9taWNzIMK3IGNvbmQgdmFyIl0KICAgICAgICBNM1siRGVhZGxvY2tzIMK3IHJhY2VzIMK3IGludmlzaWJsZSBidWdzIl0KICAgIGVuZAoKICAgIHN1YmdyYXBoIFAyWyIyIMK3IENTUCDigJQgQ29tbXVuaWNhdGluZyBTZXF1ZW50aWFsIFByb2Nlc3NlcyJdCiAgICAgICAgQzFbIkdvcm91dGluZXMgKyBjaGFubmVscyJdCiAgICAgICAgQzJbIk93bmVyc2hpcCBtb3ZlcyB3aXRoIG1lc3NhZ2UiXQogICAgICAgIEMzWyJCYWNrcHJlc3N1cmUgYnVpbHQtaW4iXQogICAgZW5kCgogICAgc3ViZ3JhcGggUDNbIjMgwrcgQWN0b3JzIl0KICAgICAgICBBMVsiTmFtZWQgZW50aXR5ICsgbWFpbGJveCJdCiAgICAgICAgQTJbIlByaXZhdGUgc3RhdGUgwrcgbm8gc2hhcmluZyJdCiAgICAgICAgQTNbIlN1cGVydmlzaW9uIMK3IGxldCBpdCBjcmFzaCJdCiAgICBlbmQKCiAgICBzdWJncmFwaCBQNFsiNCDCtyBTb2Z0d2FyZSBUcmFuc2FjdGlvbmFsIE1lbW9yeSJdCiAgICAgICAgUzFbIk9wdGltaXN0aWMgdHJhbnNhY3Rpb25zIl0KICAgICAgICBTMlsiQ29tcG9zYWJsZSDCtyByZXRyeSBvbiBjb25mbGljdCJdCiAgICAgICAgUzNbIk5vIGxvY2tzLCBubyBkZWFkbG9ja3MiXQogICAgZW5kCgogICAgY2xhc3NEZWYgcGlsbGFyIGZpbGw6I2U4ZjRmOCxzdHJva2U6IzJjNTI4MixzdHJva2Utd2lkdGg6MnB4CiAgICBjbGFzcyBQMSxQMixQMyxQNCBwaWxsYXI%3D" alt="M1[" width="953" height="754"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; — Concurrency has four viable pillars: shared memory + locks (threads, mutexes), CSP (channels, Go), actors (mailboxes, Erlang), and STM (transactional memory, Clojure). None is universally better. Each solves a different problem and has a different failure mode. Senior designs often mix three of them in one system. Mutex-for-everything works until it doesn't — usually at exactly the scale you promised you'd never reach.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Pillar 1: Shared Memory + Locks
&lt;/h2&gt;

&lt;p&gt;The default. Threads, mutexes, atomics, condition variables. Every mainstream language has them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;: multiple threads of execution share the same address space. They read and write the same data. Mutexes make sure only one thread touches a critical section at a time. Atomics do the same for single-word operations without a full lock.&lt;/p&gt;
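
&lt;p&gt;A minimal Go sketch of the two most common shapes: an atomic counter next to a mutex-guarded map. The names are illustrative.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch: atomic for a single shared word, a mutex for a small piece of
// shared state with an invariant. Names are made up for illustration.
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

type cache struct {
    mu   sync.Mutex        // guards data: one goroutine in the critical section at a time
    data map[string]string // shared state every goroutine can reach
}

func (c *cache) put(k, v string) {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.data[k] = v
}

func main() {
    var hits atomic.Int64 // single-word counter: no lock needed
    c := &amp;amp;cache{data: map[string]string{}}

    var wg sync.WaitGroup
    for i := 0; i &amp;lt; 8; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            c.put(fmt.Sprintf("k%d", n), "v")
            hits.Add(1)
        }(i)
    }
    wg.Wait()
    fmt.Println(hits.Load()) // 8
}
&lt;/code&gt;&lt;/pre&gt;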

&lt;p&gt;&lt;strong&gt;Where it shines&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simple shared counters and caches.&lt;/strong&gt; &lt;code&gt;atomic.AddInt64&lt;/code&gt;, &lt;code&gt;sync.Map&lt;/code&gt;, LRU caches. The right tool.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tight single-process coordination&lt;/strong&gt; where the code is small enough for one person to hold in their head.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance-critical paths&lt;/strong&gt; where the overhead of channel sends or actor dispatches is too much.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Failure modes&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deadlocks.&lt;/strong&gt; Two threads acquire locks in opposite order. Happens.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Priority inversion.&lt;/strong&gt; Low-priority thread holds the lock, high-priority thread waits, work piles up.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lock-ordering bugs at scale.&lt;/strong&gt; When N components each take M locks, the number of acquisition orders you have to reason about grows combinatorially.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory-model weirdness.&lt;/strong&gt; What one thread writes, another may not immediately see. You start caring about happens-before, acquire/release semantics, and why &lt;code&gt;volatile&lt;/code&gt; in Java is not what you thought.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Invisible races.&lt;/strong&gt; The worst kind. Tests pass; production fails weirdly twice a month.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use mutexes for small, localized shared state. Once the shared state has three collaborators or more, or a nontrivial invariant across fields, reach for one of the other models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pillar 2: CSP (Communicating Sequential Processes)
&lt;/h2&gt;

&lt;p&gt;Tony Hoare's 1978 paper, popularized by Occam and now Go. The model Rob Pike and Ken Thompson picked for Go's concurrency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;: processes don't share memory; they send messages on named &lt;strong&gt;channels&lt;/strong&gt;. Senders and receivers rendezvous on the channel. Ownership of data moves with the message. "Do not communicate by sharing memory; share memory by communicating."&lt;/p&gt;
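
&lt;p&gt;A toy two-stage pipeline in Go makes the model concrete (stage names are made up): closing a channel ends the downstream loop, and an unbuffered send is built-in backpressure.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Toy pipeline: generate, then square, then print. Each stage is a goroutine,
// connected by channels; ownership of each value moves with the send.
package main

import "fmt"

func generate(nums ...int) &amp;lt;-chan int {
    out := make(chan int)
    go func() {
        for _, n := range nums {
            out &amp;lt;- n // blocks until the next stage receives: backpressure
        }
        close(out) // signals "no more values" to the receiver
    }()
    return out
}

func square(in &amp;lt;-chan int) &amp;lt;-chan int {
    out := make(chan int)
    go func() {
        for n := range in { // loop ends when generate closes its channel
            out &amp;lt;- n * n
        }
        close(out)
    }()
    return out
}

func main() {
    for v := range square(generate(2, 3, 4)) {
        fmt.Println(v) // 4, 9, 16
    }
}
&lt;/code&gt;&lt;/pre&gt;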

&lt;p&gt;&lt;strong&gt;Where it shines&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pipelines.&lt;/strong&gt; Data flows through stages, each a goroutine, connected by channels. Clean to read.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fan-out / fan-in.&lt;/strong&gt; One producer, many workers, one aggregator. The channel topology &lt;em&gt;is&lt;/em&gt; the architecture.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backpressure.&lt;/strong&gt; A bounded channel blocks the producer when full. No extra flow control needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cancellation coordination.&lt;/strong&gt; &lt;code&gt;select&lt;/code&gt; with &lt;code&gt;&amp;lt;-ctx.Done()&lt;/code&gt; is a clean primitive.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lifecycle control.&lt;/strong&gt; Closing a channel is a broadcast to every listener.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Failure modes&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deadlocks remain possible.&lt;/strong&gt; Two goroutines each waiting on the other's channel. Cycles in the channel graph are lethal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory leaks via unclosed channels.&lt;/strong&gt; A goroutine blocked on a send that will never be received lives forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Awkward request/reply.&lt;/strong&gt; You end up passing a reply channel with each request, which works but feels verbose.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Order isn't free.&lt;/strong&gt; Channel ordering is only per-channel. If you fan out and fan in, the aggregation is unordered unless you sort.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use CSP for coordination-heavy designs. When the structure of "who's alive, who sends to whom, when do things stop" is the architecture, channels make that visible in the code.&lt;/p&gt;

&lt;p&gt;Go is the obvious exemplar, but CSP-style is also available in Rust (&lt;code&gt;crossbeam-channel&lt;/code&gt;, &lt;code&gt;tokio::sync::mpsc&lt;/code&gt;), Kotlin (coroutines with channels), Python (&lt;code&gt;asyncio.Queue&lt;/code&gt;), and C# (&lt;code&gt;System.Threading.Channels&lt;/code&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  Pillar 3: Actors
&lt;/h2&gt;

&lt;p&gt;Carl Hewitt's 1973 paper. Made practical by Erlang (1986) and later Akka (Scala/Java). The model behind WhatsApp, a decade of telecom, and most fault-tolerant messaging infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;: an &lt;strong&gt;actor&lt;/strong&gt; is a named entity with private state and a mailbox. Other actors send messages to its address. Messages are processed one at a time from the mailbox. No shared memory. Parent actors supervise children; when a child crashes, the parent decides to restart, escalate, or ignore. Crashes are normal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it shines&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fault isolation at scale.&lt;/strong&gt; One actor crashing is expected; it doesn't take down the system. Supervision hierarchies make "let it crash" a sensible engineering strategy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stateful services.&lt;/strong&gt; Each actor holds its own state. Conceptually clean: no shared global state, no locks around it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Location transparency.&lt;/strong&gt; An actor can live in the same process, another process, or another machine. The sender doesn't know. This is where actors shine in distributed systems — the model scales across the network boundary natively.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Massive concurrency with stateful semantics.&lt;/strong&gt; Erlang routinely runs millions of actors per node. Each is cheap.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Failure modes&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mailbox unboundedness.&lt;/strong&gt; If a producer sends faster than the actor can process, the mailbox grows without bound. Bounded mailboxes exist; use them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message-ordering assumptions break across the network.&lt;/strong&gt; Within one node, delivery order is preserved per sender. Across nodes, all bets are off without explicit sequencing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Testing is harder.&lt;/strong&gt; Actors make their own state opaque; you test behavior through message exchange. Good frameworks help, but the habits needed are different from testing normal code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conceptual mismatch in CRUD-style backends.&lt;/strong&gt; If your business logic is "select some rows, transform them, insert result," actors are overkill. They shine on long-lived stateful entities (a game character, a connected device, a user session), not on stateless request handlers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Erlang and Elixir are the canonical runtimes. Akka brings actors to the JVM. Pony is a rare actor-first typed language. In Go, you can simulate actors with a goroutine + channel-as-mailbox pattern, but you lose Erlang's supervision and "let it crash" semantics unless you build them yourself.&lt;/p&gt;
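
&lt;p&gt;Roughly, that simulation looks like this (the &lt;code&gt;account&lt;/code&gt; actor and its message type are illustrative; note there is no supervision, so a panic just kills the goroutine):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Actor-ish sketch: one goroutine owns the state, a channel is the mailbox,
// and all mutation happens by sending it messages. Names are made up.
package main

import "fmt"

type deposit struct {
    amount int
    reply  chan int // carries the new balance back to the sender
}

// account is the "actor": private state, one message processed at a time.
func account(mailbox &amp;lt;-chan deposit) {
    balance := 0
    for msg := range mailbox {
        balance += msg.amount
        msg.reply &amp;lt;- balance
    }
}

func main() {
    mailbox := make(chan deposit, 16) // bounded mailbox: senders block when it fills
    go account(mailbox)

    reply := make(chan int)
    mailbox &amp;lt;- deposit{amount: 50, reply: reply}
    fmt.Println(&amp;lt;-reply) // 50
}
&lt;/code&gt;&lt;/pre&gt;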

&lt;p&gt;Use actors when you have &lt;strong&gt;long-lived stateful entities with fault requirements&lt;/strong&gt;. Telecom, messaging, multiplayer game servers, IoT device shadows, any system where "this particular entity has its own state machine, and we really care when it crashes" is the shape.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pillar 4: Software Transactional Memory (STM)
&lt;/h2&gt;

&lt;p&gt;Imagine database transactions, but for in-memory data. That's STM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;: critical sections are wrapped in transactions. The runtime tracks reads and writes optimistically. On commit, if any data touched was modified by another transaction, the current one rolls back and retries. No explicit locks. Composability — two transactions can be combined into a larger one without redesigning the locking order.&lt;/p&gt;
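
&lt;p&gt;Go has no STM, but the optimistic core of the idea (read a snapshot, compute, commit only if nothing changed, otherwise retry) can be sketched on a single word with compare-and-swap. Real STM generalizes this to whole read/write sets; the sketch below is only the shape of the loop.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Not STM, just the optimistic read/validate/retry loop it is built on,
// collapsed onto one atomic word for illustration.
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func addOptimistically(v *atomic.Int64, delta int64) {
    for {
        old := v.Load() // read the "snapshot"
        if v.CompareAndSwap(old, old+delta) {
            return // commit succeeded: nobody else wrote in between
        }
        // conflict: another writer won the race, so retry with a fresh read
    }
}

func main() {
    var total atomic.Int64
    var wg sync.WaitGroup
    for i := 0; i &amp;lt; 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            addOptimistically(&amp;amp;total, 1)
        }()
    }
    wg.Wait()
    fmt.Println(total.Load()) // 100
}
&lt;/code&gt;&lt;/pre&gt;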

&lt;p&gt;&lt;strong&gt;Where it shines&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Composable concurrent code.&lt;/strong&gt; Combining operations that were individually correct usually stays correct under STM. Lock-based code famously does not.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read-mostly workloads.&lt;/strong&gt; STM with multi-version concurrency control scales reads without blocking.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoiding the lock-ordering bug class.&lt;/strong&gt; No locks, no deadlocks. The failure mode is retry storms, which are easier to reason about.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Failure modes&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;I/O inside transactions is awful.&lt;/strong&gt; Transactions may retry. If you did I/O, you may have done it multiple times. Either separate I/O from transactional state, or the runtime has to forbid I/O inside transactions (Haskell's STM monad does this at the type level).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retry storms under contention.&lt;/strong&gt; Heavy write contention on the same data means constant retries. In the worst case, throughput can be worse than locks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Limited language support.&lt;/strong&gt; Clojure (built-in), Haskell (&lt;code&gt;STM&lt;/code&gt;), Scala (&lt;code&gt;scala-stm&lt;/code&gt;), Rust (experimental &lt;code&gt;stm&lt;/code&gt; crates). Not a mainstream feature of Go/Java/C#.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Clojure is the canonical "STM as a first-class citizen" language — its refs and transactions are idiomatic. Haskell's &lt;code&gt;STM&lt;/code&gt; monad is arguably the cleanest realization. In other ecosystems, STM exists as libraries but hasn't displaced mutexes.&lt;/p&gt;

&lt;p&gt;Use STM when the concurrent state is small-to-medium, the access pattern is read-heavy with occasional writes, and you want the composability. For the rare problems that fit, STM is strictly simpler to reason about than locks. For problems that don't fit (I/O-heavy, write-contention-heavy), STM is worse.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Real Systems Mix Them
&lt;/h2&gt;

&lt;p&gt;The surprise for engineers who've only used one model: &lt;strong&gt;mature systems mix three of them in one codebase&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A typical backend service I'd build today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mutexes / atomics&lt;/strong&gt; for the inner loops — counters, caches, rate-limiter state, anything performance-critical with one clear owner.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Channels (CSP)&lt;/strong&gt; for coordination — worker pools, pipelines, cancellation, shutdown signaling, bounded queues.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Actors (in a sense)&lt;/strong&gt; for long-lived stateful entities — each connected client session, each in-flight request, each background job. In Go I'd model this as "one goroutine per entity, communicating via channels," which isn't formal actors but inherits the useful semantics: isolated state, message-passing, crash-isolation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And I wouldn't use STM in that stack. Not because it's bad, but because the language runtime doesn't make it first-class. If I were writing Clojure, STM would be a natural fit for the in-memory state machines that would otherwise be locked maps.&lt;/p&gt;

&lt;p&gt;The old "pick one concurrency model" debate was always a false choice. The real decision is per-problem: what shape is the concurrent work, what's the state-sharing pattern, and what failure semantics do I want.&lt;/p&gt;

&lt;h2&gt;
  
  
  Decision Guide
&lt;/h2&gt;

&lt;p&gt;Quick map:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;I have a counter that multiple goroutines read and update.&lt;/strong&gt; → atomic or mutex.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I have a pipeline of work that flows through stages.&lt;/strong&gt; → channels (CSP).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I have a fleet of long-lived sessions, each with its own state and lifetime.&lt;/strong&gt; → actor pattern (goroutine + mailbox channel, or real actor framework).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I have a fleet of connected devices each with a state machine that must survive crashes.&lt;/strong&gt; → actor framework with supervision (Erlang, Akka, or Go with explicit crash/restart logic).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I have complex shared state with nontrivial invariants across fields, and updates are occasional but important to compose.&lt;/strong&gt; → STM if your language supports it; otherwise, lots of careful mutex discipline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I have a request/response flow with fan-out to downstreams.&lt;/strong&gt; → CSP with &lt;code&gt;errgroup.WithContext&lt;/code&gt; (sketched after this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I have no idea what I have.&lt;/strong&gt; → Start with mutexes, switch when it hurts. Don't over-engineer the first version.&lt;/li&gt;
&lt;/ul&gt;
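
&lt;p&gt;For the fan-out case, a minimal sketch with &lt;code&gt;errgroup.WithContext&lt;/code&gt; (the &lt;code&gt;fetch&lt;/code&gt; helper and URLs are placeholders): both downstream calls run concurrently, the first error cancels the shared context, and &lt;code&gt;Wait&lt;/code&gt; reports it.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch of fan-out to two downstreams with golang.org/x/sync/errgroup.
// fetch and the URLs are illustrative placeholders.
package main

import (
    "context"
    "fmt"
    "io"
    "net/http"

    "golang.org/x/sync/errgroup"
)

func fetch(ctx context.Context, url string) (string, error) {
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return "", err
    }
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    body, err := io.ReadAll(resp.Body)
    return string(body), err
}

func main() {
    g, ctx := errgroup.WithContext(context.Background())

    var users, orders string
    g.Go(func() error {
        var err error
        users, err = fetch(ctx, "https://users.internal/api") // placeholder URL
        return err
    })
    g.Go(func() error {
        var err error
        orders, err = fetch(ctx, "https://orders.internal/api") // placeholder URL
        return err
    })

    // Wait returns the first non-nil error; the shared ctx is canceled on failure.
    if err := g.Wait(); err != nil {
        fmt.Println("fan-out failed:", err)
        return
    }
    fmt.Println(len(users), len(orders))
}
&lt;/code&gt;&lt;/pre&gt;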

&lt;h2&gt;
  
  
  The Real Lesson
&lt;/h2&gt;

&lt;p&gt;Most people who get bitten by concurrency bugs got bitten because they used the wrong model, not because they used it wrong. A mutex-heavy design for a workload that's really a pipeline is fragile. A channels-for-everything design when there's a shared counter underneath ends up with awkward rendezvous. An actors-everywhere design when the business is CRUD requests reads like over-engineering.&lt;/p&gt;

&lt;p&gt;The four pillars aren't competing theories of concurrency. They're four tools, each good at specific jobs. Senior engineers know all four and reach for the right one. Junior engineers reach for the only one they know and force-fit it.&lt;/p&gt;

&lt;p&gt;If your career so far has been mostly mutexes, spend a weekend reading the other three. Write a toy pipeline in Go channels. Read Erlang's supervision documentation. Play with Clojure refs. The investment pays back every time you sit in a design review and someone proposes locking their way out of a structural problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-chan-context-structure-not-speed/" rel="noopener noreferrer"&gt;Go's Concurrency Is About Structure, Not Speed&lt;/a&gt; — CSP applied concretely in Go.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/go-millions-connections-user-space-context-switching/" rel="noopener noreferrer"&gt;Why Go Handles Millions of Connections&lt;/a&gt; — the runtime characteristics that make CSP cheap in Go.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harrisonsec.com/blog/scale-up-scale-out-every-language-wins-somewhere/" rel="noopener noreferrer"&gt;Scale-Up vs Scale-Out: Why Every Language Wins Somewhere&lt;/a&gt; — the language-level view of the same question.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>concurrency</category>
      <category>systemdesign</category>
      <category>go</category>
      <category>erlang</category>
    </item>
  </channel>
</rss>
