<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Andrey Kolkov</title>
    <description>The latest articles on Forem by Andrey Kolkov (@kolkov).</description>
    <link>https://forem.com/kolkov</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F277150%2Fdc37d68a-1fc4-4584-a7a6-0e640febd7a8.jpeg</url>
      <title>Forem: Andrey Kolkov</title>
      <link>https://forem.com/kolkov</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/kolkov"/>
    <language>en</language>
    <item>
      <title>We Built the First Pure Go DXIL Generator — Because Optimizing the Wrong Path Wasn't Enough</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Sun, 05 Apr 2026 23:00:53 +0000</pubDate>
      <link>https://forem.com/kolkov/we-built-the-first-pure-go-dxil-generator-because-optimizing-the-wrong-path-wasnt-enough-35en</link>
      <guid>https://forem.com/kolkov/we-built-the-first-pure-go-dxil-generator-because-optimizing-the-wrong-path-wasnt-enough-35en</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Go doesn't have a real graphics ecosystem." — We've heard this for years. So we built one: 636K lines of Pure Go, five GPU backends, zero CGO. And now we've done something that even Rust's naga shader compiler hasn't managed in six years.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At the end of last year, we &lt;a href="https://dev.to/kolkov/go-126-meets-2026-with-a-professional-graphics-ecosystem-9g8"&gt;introduced GoGPU to the Go community&lt;/a&gt; — greeting everyone with a New Year's gift: a professional graphics ecosystem written entirely in Go. Four months later, that ecosystem just got its most audacious component: &lt;strong&gt;a Pure Go DXIL generator that compiles shaders directly to DirectX 12 bytecode, without any external compiler&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is the story of how a performance optimization rabbit hole led us to write our own LLVM 3.7 bitcode emitter.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem That Wouldn't Go Away
&lt;/h2&gt;

&lt;p&gt;Every DirectX 12 application needs compiled shaders. The standard pipeline looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WGSL → HLSL text → FXC (d3dcompiler_47.dll) → DXBC bytecode → GPU
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That middle step — &lt;code&gt;d3dcompiler_47.dll&lt;/code&gt; — is a 4.3 MB Microsoft DLL that you load at runtime. It works. It's battle-tested. And it was our bottleneck.&lt;/p&gt;

&lt;p&gt;We build &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;gogpu&lt;/a&gt; — a Pure Go GPU ecosystem with its own shader compiler, &lt;a href="https://github.com/gogpu/naga" rel="noopener noreferrer"&gt;naga&lt;/a&gt;. Four backend outputs already at 100% Rust naga parity. Everything compiles with &lt;code&gt;go build&lt;/code&gt;, no C toolchain needed.&lt;/p&gt;

&lt;p&gt;But on Windows with DirectX 12, we had a dirty secret: &lt;code&gt;d3dcompiler_47.dll&lt;/code&gt;. The one external dependency in our otherwise dependency-free stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Optimization Rabbit Hole
&lt;/h2&gt;

&lt;p&gt;We tried everything to make the FXC path fast enough to forget about:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shader cache&lt;/strong&gt; — Hash the HLSL, cache the DXBC. First render is slow, subsequent ones instant. Works great until your shader variants explode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In-memory compilation pool&lt;/strong&gt; — Pre-compile common shaders at startup. Reduces cold-start latency. But we still load the DLL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipeline State Object caching&lt;/strong&gt; — We planned disk caching of PSO blobs (&lt;code&gt;GetCachedBlob&lt;/code&gt; → &lt;code&gt;os.UserCacheDir()&lt;/code&gt;). We wrote the task, designed the key format, specified the invalidation strategy. Then we never shipped it — because we pivoted to eliminating FXC entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;naga HLSL fix&lt;/strong&gt; — FXC was choking on naga-generated HLSL: a &lt;code&gt;(Type[256])0&lt;/code&gt; bulk zero-initialization expanded to a 12KB inline constructor that FXC took 22 seconds to compile. We initially thought our Go naga had a bug, so we tested the same shader through Rust naga + FXC — same 22 seconds. It wasn't our implementation; FXC genuinely can't handle giant inline constructors. The fix was in naga (per-element loop instead of inline constructor, v0.16.3) — 330× faster. But even after fixing the worst case, every shader still went through an external DLL.&lt;/p&gt;

&lt;p&gt;Every optimization made the same path faster. None of them removed the path.&lt;/p&gt;
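&lt;p&gt;For readers who have not built one, the shader-cache step described above can be sketched in a few lines. This is a minimal illustration, not the gogpu implementation: the &lt;code&gt;shaderCache&lt;/code&gt; type, its &lt;code&gt;get&lt;/code&gt; method, and the &lt;code&gt;compile&lt;/code&gt; callback standing in for the FXC call are hypothetical names, and SHA-256 is an assumed choice of hash.&lt;/p&gt;

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// shaderCache maps a hash of the HLSL source to compiled bytecode.
// Hypothetical sketch; it is not safe for concurrent use.
type shaderCache struct {
	blobs    map[[sha256.Size]byte][]byte
	compiles int // how many times the slow path ran
}

func newShaderCache() *shaderCache {
	c := new(shaderCache)
	c.blobs = make(map[[sha256.Size]byte][]byte)
	return c
}

// get returns cached bytecode for hlsl and calls compile only on a miss.
func (c *shaderCache) get(hlsl string, compile func(string) []byte) []byte {
	key := sha256.Sum256([]byte(hlsl))
	if blob, ok := c.blobs[key]; ok {
		return blob
	}
	blob := compile(hlsl)
	c.compiles++
	c.blobs[key] = blob
	return blob
}

func main() {
	cache := newShaderCache()
	// fakeCompile stands in for the real (slow) compiler call.
	fakeCompile := func(src string) []byte { return []byte("bytecode:" + src) }
	src := "float4 main() : SV_Target { return 1; }"
	cache.get(src, fakeCompile)
	cache.get(src, fakeCompile)
	fmt.Println("compiles:", cache.compiles)
}
```

&lt;p&gt;Every distinct source string is a new key, which is why an explosion of shader variants defeats this kind of cache: each variant pays the full compile cost once.&lt;/p&gt;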

&lt;p&gt;At this point we had a decision to make. The obvious next step was adding DXC (&lt;code&gt;dxcompiler.dll&lt;/code&gt;) as an opt-in replacement for FXC — newer, faster, supports Shader Model 6.0+. We even created the task for it.&lt;/p&gt;

&lt;p&gt;Then, while reviewing the plan, a simple question came up: &lt;em&gt;"Can we write our own DXC in Go?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The initial answer was: "No. DXC is 500K lines of C++, a fork of LLVM 3.7. That's not something you casually rewrite."&lt;/p&gt;

&lt;p&gt;The response: &lt;em&gt;"Not rewrite DXC. Generate DXIL directly. Skip HLSL entirely."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That changed everything. DXC takes HLSL text and produces DXIL. We already have our own IR (naga IR). Why translate IR → HLSL text → parse HLSL → produce DXIL, when we could go IR → DXIL directly?&lt;/p&gt;

&lt;p&gt;DXC is a compiler from one language (HLSL) to another (DXIL). We don't need a compiler — we need an &lt;strong&gt;emitter&lt;/strong&gt;. And emitting is much simpler than compiling.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is DXIL, and Why Nobody Writes It
&lt;/h2&gt;

&lt;p&gt;DXIL (DirectX Intermediate Language) is what DXC produces (FXC emits the older DXBC bytecode). It's LLVM 3.7 bitcode, the serialized binary form of LLVM IR, wrapped in a DXBC container with DirectX-specific metadata and &lt;code&gt;dx.op&lt;/code&gt; intrinsic calls.&lt;/p&gt;

&lt;p&gt;The reason nobody writes DXIL directly is simple: it's hard.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LLVM 3.7 bitcode&lt;/strong&gt; is a binary format with variable-width encoding (VBR), nested blocks, abbreviation records, and forward references. Not something you casually emit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DXIL semantics&lt;/strong&gt; require &lt;code&gt;dx.op&lt;/code&gt; intrinsic calls instead of normal LLVM instructions for I/O, math, and resource access. 165+ opcodes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DXBC container&lt;/strong&gt; needs input/output signatures (ISG1/OSG1), pipeline state validation (PSV0), feature flags (SFI0), and a cryptographic hash.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validation&lt;/strong&gt; — until January 2025, every DXIL module needed to be signed by &lt;code&gt;dxil.dll&lt;/code&gt;. Microsoft's BYPASS hash sentinel changed this.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rust's naga shader compiler has had an &lt;a href="https://github.com/gfx-rs/wgpu/issues/4302" rel="noopener noreferrer"&gt;open issue for DXIL backend since 2020&lt;/a&gt;. Six years later, it's still not implemented.&lt;/p&gt;

&lt;p&gt;Only one project has done it outside of LLVM: &lt;strong&gt;Mesa&lt;/strong&gt; (the open-source OpenGL/Vulkan driver stack). Their DXIL compiler is ~21,000 lines of C/H, written by engineers from Microsoft and Collabora over 3+ years. They wrote their own LLVM 3.7 bitcode writer from scratch — proving it's possible without linking LLVM.&lt;/p&gt;

&lt;p&gt;We cloned Mesa's &lt;code&gt;src/microsoft/compiler/&lt;/code&gt; into our reference folder, studied &lt;code&gt;dxil_module.c&lt;/code&gt; (the bitcode writer, ~3K lines of C), and mapped out every block type, record format, and abbreviation. Not to copy — to understand the format deeply enough to write our own.&lt;/p&gt;

&lt;p&gt;Then came the final piece: in January 2025, Microsoft &lt;a href="https://devblogs.microsoft.com/directx/open-sourcing-dxil-validator-hash/" rel="noopener noreferrer"&gt;open-sourced the DXIL validator hash&lt;/a&gt; and introduced a BYPASS sentinel: a magic value in the hash field that tells D3D12 "this shader wasn't signed by dxil.dll, but trust it anyway." Before this, unsigned DXIL like ours would run only with Developer Mode enabled on Windows. With it, &lt;strong&gt;any third-party DXIL generator can produce shaders that run on retail Windows&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We weren't afraid of binary formats. Before gogpu, we built &lt;a href="https://github.com/scigolib/hdf5" rel="noopener noreferrer"&gt;scigolib/hdf5&lt;/a&gt;, a Pure Go implementation of HDF5, the hierarchical scientific data format (used heavily at NASA, among others) with its own B-tree indices, chunked storage, and compression pipelines. After parsing HDF5 superblocks and fractal heaps in pure Go, LLVM bitcode felt almost... reasonable. We also built &lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;coregx/coregex&lt;/a&gt;, a multi-engine regex system (17 strategies, Lazy DFA, PikeVM, SIMD prefilters) that runs up to 3000× faster than Go's standard-library &lt;code&gt;regexp&lt;/code&gt; package. Complex binary formats and low-level encoding are kind of our thing.&lt;/p&gt;

&lt;p&gt;We spent weeks studying the DXIL format specifically. Reading the &lt;a href="https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst" rel="noopener noreferrer"&gt;DXIL spec&lt;/a&gt;, the &lt;a href="https://releases.llvm.org/3.7.1/docs/BitCodeFormat.html" rel="noopener noreferrer"&gt;LLVM 3.7 bitcode reference&lt;/a&gt;, Mesa's implementation, Microsoft's DXC headers, the &lt;a href="https://github.com/microsoft/hlsl-specs/blob/main/proposals/infra/INF-0004-validator-hashing.md" rel="noopener noreferrer"&gt;validator hash proposal&lt;/a&gt;. We wrote a detailed architecture document comparing four implementation options. We mapped every dx.op opcode we'd need for vertex and fragment shaders. We designed the package structure, the phased rollout plan, the testing strategy.&lt;/p&gt;

&lt;p&gt;Only after all that research did we write the first line of code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building LLVM 3.7 Bitcode in Pure Go
&lt;/h2&gt;

&lt;p&gt;The first challenge was the bitcode writer. LLVM 3.7's format is... unique:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Bits, not bytes. Variable-width encoding. Nested blocks with
forward-declared sizes. Abbreviation records that compress
common patterns. A module structure that interleaves types,
constants, functions, and metadata in a specific order.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We wrote a bit-level writer from scratch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// VBR (Variable Bit Rate) encoding — like protobuf varint, but bit-aligned&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Writer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;WriteVBR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="kt"&gt;uint&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mask&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;
        &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WriteBits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WriteBits&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;uint32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
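&lt;p&gt;The same encoding can also be read arithmetically, which may help when checking output by hand. The sketch below is an illustration rather than the writer shown above: &lt;code&gt;vbrChunks&lt;/code&gt; is a hypothetical helper that returns the chunks instead of writing bits, and it uses division and remainder by 2^(width-1) in place of shifts and masks. Chunks come out low-order first, and every chunk except the last carries the continuation bit.&lt;/p&gt;

```go
package main

import "fmt"

// vbrChunks splits value into VBR chunks of `width` bits each.
// Every chunk holds width-1 payload bits; the top bit is set on all
// chunks except the last one. Hypothetical helper for illustration.
func vbrChunks(value uint64, width uint) []uint64 {
	if 2 > width {
		panic("VBR width must be at least 2")
	}
	hi := uint64(1) // becomes 2^(width-1), the continuation bit
	for i := uint(1); width > i; i++ {
		hi *= 2
	}
	var out []uint64
	for value >= hi {
		out = append(out, value%hi+hi) // low payload bits plus continuation bit
		value /= hi                    // drop the bits just emitted
	}
	return append(out, value) // final chunk, continuation bit clear
}

func main() {
	fmt.Println(vbrChunks(5, 6))
	fmt.Println(vbrChunks(300, 6))
}
```

&lt;p&gt;With a width of 6, the value 300 splits into the chunks 44 and 9: 44 is the low five bits (12) plus the continuation bit (32), and 9 is the remaining high part, since 9 × 32 + 12 = 300.&lt;/p&gt;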



&lt;p&gt;Then the module serializer: TYPE_BLOCK, CONSTANTS_BLOCK, FUNCTION_BLOCK, METADATA_BLOCK — each with its own record formats, abbreviation IDs, and ordering constraints.&lt;/p&gt;

&lt;p&gt;The DXBC container assembles the bitcode with signatures and metadata:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[DXBC Header]&lt;/span&gt;           &lt;span class="err"&gt;32&lt;/span&gt; &lt;span class="err"&gt;bytes&lt;/span&gt; &lt;span class="err"&gt;(magic&lt;/span&gt; &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="err"&gt;digest&lt;/span&gt; &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="err"&gt;version&lt;/span&gt; &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="err"&gt;size&lt;/span&gt; &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="err"&gt;part&lt;/span&gt; &lt;span class="err"&gt;count)&lt;/span&gt;
  &lt;span class="nn"&gt;[SFI0]&lt;/span&gt;                &lt;span class="err"&gt;Shader&lt;/span&gt; &lt;span class="err"&gt;feature&lt;/span&gt; &lt;span class="err"&gt;flags&lt;/span&gt; &lt;span class="err"&gt;(64-bit&lt;/span&gt; &lt;span class="err"&gt;bitmask)&lt;/span&gt;
  &lt;span class="nn"&gt;[DXIL]&lt;/span&gt;                &lt;span class="err"&gt;Program&lt;/span&gt; &lt;span class="err"&gt;header&lt;/span&gt; &lt;span class="err"&gt;+&lt;/span&gt; &lt;span class="err"&gt;LLVM&lt;/span&gt; &lt;span class="err"&gt;3.7&lt;/span&gt; &lt;span class="err"&gt;bitcode&lt;/span&gt;
  &lt;span class="nn"&gt;[ISG1]&lt;/span&gt;                &lt;span class="err"&gt;Input&lt;/span&gt; &lt;span class="err"&gt;signature&lt;/span&gt; &lt;span class="err"&gt;(semantic&lt;/span&gt; &lt;span class="err"&gt;names,&lt;/span&gt; &lt;span class="err"&gt;registers)&lt;/span&gt;
  &lt;span class="nn"&gt;[OSG1]&lt;/span&gt;                &lt;span class="err"&gt;Output&lt;/span&gt; &lt;span class="err"&gt;signature&lt;/span&gt;
  &lt;span class="nn"&gt;[PSV0]&lt;/span&gt;                &lt;span class="err"&gt;Pipeline&lt;/span&gt; &lt;span class="err"&gt;state&lt;/span&gt; &lt;span class="err"&gt;validation&lt;/span&gt;
  &lt;span class="nn"&gt;[HASH]&lt;/span&gt;                &lt;span class="err"&gt;BYPASS&lt;/span&gt; &lt;span class="err"&gt;sentinel&lt;/span&gt; &lt;span class="err"&gt;(no&lt;/span&gt; &lt;span class="err"&gt;dxil.dll&lt;/span&gt; &lt;span class="err"&gt;needed!)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
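&lt;p&gt;To make the layout concrete, here is a rough sketch of how such a container could be assembled. It is not the naga code: &lt;code&gt;part&lt;/code&gt; and &lt;code&gt;buildContainer&lt;/code&gt; are hypothetical names, the payloads are dummies, and the digest is left as zeros (a real container needs either a computed hash or the BYPASS sentinel there). The field order follows the 32-byte header listed above; the per-part offset table and the eight-byte part header (four-character code plus size) reflect our reading of the container format and should be checked against the DXC headers.&lt;/p&gt;

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// part is one container chunk: a four-character code plus its payload.
type part struct {
	fourCC string
	data   []byte
}

// buildContainer writes header, part offset table, then the parts.
// Hypothetical sketch: the 16-byte digest is a zero placeholder.
func buildContainer(parts []part) []byte {
	le := binary.LittleEndian
	headerSize := 32 + 4*len(parts) // fixed header plus one offset per part
	total := headerSize
	for _, p := range parts {
		total += 8 + len(p.data) // fourCC + size + payload
	}

	out := make([]byte, 0, total)
	out = append(out, "DXBC"...)           // magic
	out = append(out, make([]byte, 16)...) // digest placeholder
	out = le.AppendUint16(out, 1)          // version major
	out = le.AppendUint16(out, 0)          // version minor
	out = le.AppendUint32(out, uint32(total))
	out = le.AppendUint32(out, uint32(len(parts)))

	offset := headerSize
	for _, p := range parts {
		out = le.AppendUint32(out, uint32(offset))
		offset += 8 + len(p.data)
	}
	for _, p := range parts {
		out = append(out, p.fourCC...)
		out = le.AppendUint32(out, uint32(len(p.data)))
		out = append(out, p.data...)
	}
	return out
}

func main() {
	c := buildContainer([]part{
		{fourCC: "SFI0", data: make([]byte, 8)},
		{fourCC: "DXIL", data: []byte{1, 2, 3, 4}},
	})
	fmt.Println(len(c), string(c[:4]))
}
```

&lt;p&gt;Computing the total size up front keeps the writer single-pass: every offset is known before the first part is appended.&lt;/p&gt;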



&lt;h2&gt;
  
  
  The DXIL Difference: Scalarized Vectors
&lt;/h2&gt;

&lt;p&gt;Here's something that makes DXIL fundamentally different from SPIR-V, MSL, GLSL, and HLSL: &lt;strong&gt;DXIL has no native vector types&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In SPIR-V, you write &lt;code&gt;OpCompositeConstruct %vec4 %x %y %z %w&lt;/code&gt;.&lt;br&gt;
In HLSL, you write &lt;code&gt;float4(x, y, z, w)&lt;/code&gt;.&lt;br&gt;
In DXIL, there are no vectors. A &lt;code&gt;vec4&amp;lt;f32&amp;gt;&lt;/code&gt; becomes &lt;strong&gt;four separate float values&lt;/strong&gt;, tracked independently through every operation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Our emitter tracks per-component value IDs&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Emitter&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;exprValues&lt;/span&gt;     &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ir&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ExpressionHandle&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;    &lt;span class="c"&gt;// scalar value IDs&lt;/span&gt;
    &lt;span class="n"&gt;exprComponents&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ir&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ExpressionHandle&lt;/span&gt;&lt;span class="p"&gt;][]&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;  &lt;span class="c"&gt;// per-component IDs for vectors&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// dot(a, b) becomes:&lt;/span&gt;
&lt;span class="c"&gt;// %r = call float @dx.op.dot3.f32(i32 55, float %ax, float %ay, float %az,&lt;/span&gt;
&lt;span class="c"&gt;//                                          float %bx, float %by, float %bz)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means every vector operation — dot product, cross product, normalize, swizzle — must be decomposed into scalar operations. Our existing backends (SPIR-V, MSL, GLSL, HLSL) all work with native vectors. DXIL required a completely different approach.&lt;/p&gt;

&lt;p&gt;Cross product becomes 6 multiplies and 3 subtracts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// cross(a, b) = vec3(a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x)&lt;/span&gt;
&lt;span class="n"&gt;cx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;fsub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fmul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ay&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bz&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;fmul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;az&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;by&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;cy&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;fsub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fmul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;az&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bx&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;fmul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bz&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;cz&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;fsub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fmul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;by&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;fmul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ay&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bx&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
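&lt;p&gt;The snippet above leans on &lt;code&gt;fmul&lt;/code&gt; and &lt;code&gt;fsub&lt;/code&gt; helpers. A self-contained way to see the scalarization is a toy emitter that hands out value IDs and records one text line per instruction. Everything here is hypothetical (the &lt;code&gt;emitter&lt;/code&gt; type, the &lt;code&gt;bin&lt;/code&gt; method, the text format); it only illustrates how one vector expression turns into nine scalar instructions and three result IDs.&lt;/p&gt;

```go
package main

import "fmt"

// emitter hands out sequential value IDs and records one line per
// instruction. Hypothetical sketch; the text format is illustrative.
type emitter struct {
	next  int
	insts []string
}

// bin records a two-operand float instruction and returns its result ID.
func (e *emitter) bin(op string, a, b int) int {
	id := e.next
	e.next++
	e.insts = append(e.insts, fmt.Sprintf("%%%d = %s float %%%d, %%%d", id, op, a, b))
	return id
}

// cross lowers cross(a, b) on per-component IDs: 6 fmul and 3 fsub.
func (e *emitter) cross(a, b [3]int) [3]int {
	ax, ay, az := a[0], a[1], a[2]
	bx, by, bz := b[0], b[1], b[2]
	cx := e.bin("fsub", e.bin("fmul", ay, bz), e.bin("fmul", az, by))
	cy := e.bin("fsub", e.bin("fmul", az, bx), e.bin("fmul", ax, bz))
	cz := e.bin("fsub", e.bin("fmul", ax, by), e.bin("fmul", ay, bx))
	return [3]int{cx, cy, cz}
}

func main() {
	e := new(emitter)
	e.next = 6 // IDs 0..5 are assumed to hold the six input components
	c := e.cross([3]int{0, 1, 2}, [3]int{3, 4, 5})
	for _, line := range e.insts {
		fmt.Println(line)
	}
	fmt.Println("result component IDs:", c)
}
```

&lt;p&gt;The result is never a single value: the caller gets three IDs back and has to store them per component, which is the role of the &lt;code&gt;exprComponents&lt;/code&gt; map shown earlier.&lt;/p&gt;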



&lt;h2&gt;
  
  
  Control Flow: Basic Blocks, Not Nesting
&lt;/h2&gt;

&lt;p&gt;Another fundamental difference: DXIL uses LLVM-style &lt;strong&gt;basic blocks with explicit branches&lt;/strong&gt;, not the nested text structure of HLSL/GLSL/MSL.&lt;/p&gt;

&lt;p&gt;Our text backends emit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hlsl"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cond&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// accept&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// reject&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;DXIL emits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight llvm"&gt;&lt;code&gt;&lt;span class="nl"&gt;entry:&lt;/span&gt;
  &lt;span class="k"&gt;br&lt;/span&gt; &lt;span class="kt"&gt;i1&lt;/span&gt; &lt;span class="nv"&gt;%cond&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;label&lt;/span&gt; &lt;span class="nv"&gt;%then&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;label&lt;/span&gt; &lt;span class="nv"&gt;%else&lt;/span&gt;
&lt;span class="nl"&gt;then:&lt;/span&gt;
  &lt;span class="c1"&gt;; accept statements&lt;/span&gt;
  &lt;span class="k"&gt;br&lt;/span&gt; &lt;span class="kt"&gt;label&lt;/span&gt; &lt;span class="nv"&gt;%merge&lt;/span&gt;
&lt;span class="nl"&gt;else:&lt;/span&gt;
  &lt;span class="c1"&gt;; reject statements&lt;/span&gt;
  &lt;span class="k"&gt;br&lt;/span&gt; &lt;span class="kt"&gt;label&lt;/span&gt; &lt;span class="nv"&gt;%merge&lt;/span&gt;
&lt;span class="nl"&gt;merge:&lt;/span&gt;
  &lt;span class="c1"&gt;; continues&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Loops use back-edge branches to a header block. Break and continue jump to specific target blocks tracked via a loop context stack.&lt;/p&gt;
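&lt;p&gt;A loop context stack of this kind can be sketched as follows. This is an illustration under assumptions, not the naga emitter: &lt;code&gt;builder&lt;/code&gt;, &lt;code&gt;loopCtx&lt;/code&gt;, &lt;code&gt;loop&lt;/code&gt;, and &lt;code&gt;breakIf&lt;/code&gt; are hypothetical names, and the output is LLVM-style text rather than bitcode. The point is that a break inside the body only needs to look at the top of the stack to find its target block.&lt;/p&gt;

```go
package main

import "fmt"

// loopCtx records where continue and break should jump for one loop.
type loopCtx struct {
	continueTo string
	breakTo    string
}

// builder accumulates LLVM-style text. Hypothetical sketch.
type builder struct {
	lines []string
	loops []loopCtx // innermost loop is last
	n     int       // counter for unique block names
}

func (b *builder) fresh(prefix string) string {
	name := fmt.Sprintf("%s%d", prefix, b.n)
	b.n++
	return name
}

func (b *builder) label(name string) { b.lines = append(b.lines, name+":") }
func (b *builder) br(target string)  { b.lines = append(b.lines, "  br label %"+target) }

// loop emits a header block, the body, and a back-edge to the header.
func (b *builder) loop(body func()) {
	header, exit := b.fresh("header"), b.fresh("exit")
	b.br(header)
	b.label(header)
	b.loops = append(b.loops, loopCtx{continueTo: header, breakTo: exit})
	body()
	b.loops = b.loops[:len(b.loops)-1]
	b.br(header) // back-edge
	b.label(exit)
}

// breakIf branches to the innermost loop's exit block when cond is true,
// and otherwise falls into a fresh block so the body can keep emitting.
func (b *builder) breakIf(cond string) {
	top := b.loops[len(b.loops)-1]
	cont := b.fresh("cont")
	b.lines = append(b.lines, fmt.Sprintf("  br i1 %s, label %%%s, label %%%s", cond, top.breakTo, cont))
	b.label(cont)
}

func main() {
	b := new(builder)
	b.loop(func() { b.breakIf("%done") })
	for _, line := range b.lines {
		fmt.Println(line)
	}
}
```

&lt;p&gt;A continue would branch to &lt;code&gt;continueTo&lt;/code&gt; in the same way. Because each basic block may end in only one terminator, every branch emitted mid-body has to open a new block, which is what the fresh &lt;code&gt;cont&lt;/code&gt; label does here.&lt;/p&gt;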

&lt;p&gt;We studied Mesa's &lt;code&gt;nir_to_dxil.c&lt;/code&gt; for the correct patterns, then cross-referenced with our own SPIR-V backend (which also uses structured control flow with merge blocks) to get the Go implementation right.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Our Backends Taught Us
&lt;/h2&gt;

&lt;p&gt;This is the part that surprised us most. We have &lt;strong&gt;four mature backends&lt;/strong&gt; (SPIR-V, MSL, GLSL, HLSL) totaling ~68K LOC. They all solve the same IR walking problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Expression dispatch and caching&lt;/li&gt;
&lt;li&gt;Type resolution through pointer chains&lt;/li&gt;
&lt;li&gt;Statement nesting and control flow&lt;/li&gt;
&lt;li&gt;Resource binding and I/O handling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before implementing each DXIL feature, we checked how our existing backends handled it:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;We checked&lt;/th&gt;
&lt;th&gt;What we learned&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Multi-arg math&lt;/td&gt;
&lt;td&gt;HLSL &lt;code&gt;writeExpressionKind&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Arg/Arg1/Arg2/Arg3 dispatch pattern&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Type casts&lt;/td&gt;
&lt;td&gt;SPIR-V &lt;code&gt;emitAs&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;src/dst kind+width → opcode selection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Control flow&lt;/td&gt;
&lt;td&gt;HLSL &lt;code&gt;writeIfStatement&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Condition, blocks, merge point structure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Store/Load&lt;/td&gt;
&lt;td&gt;SPIR-V &lt;code&gt;emitStore&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Pointer chain resolution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Struct access&lt;/td&gt;
&lt;td&gt;MSL &lt;code&gt;writeAccessChain&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Recursive descent through members&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The DXIL backend is different (scalarized, basic blocks, dx.op intrinsics), but the &lt;strong&gt;IR patterns are the same&lt;/strong&gt;. Our existing codebase was its own best reference.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Moment of Truth
&lt;/h2&gt;

&lt;p&gt;After all the research, all the planning, all the implementation — ~12,500 lines of Go code, 190 tests, weeks of work — came the moment that mattered:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;GOGPU_DX12_DXIL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;GOGPU_GRAPHICS_API&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;dx12 go run ./cmd/wgpu-triangle
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The terminal showed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;wgpu API Triangle Test
Adapter: Intel(R) Iris(R) Xe Graphics
dx12: using DXIL direct compilation (naga dxil backend)
Render loop started
Frame 60 (64.6 FPS)
Frame 120 (62.1 FPS)
&lt;/span&gt;&lt;span class="c"&gt;...
&lt;/span&gt;&lt;span class="go"&gt;Frame 2400 (59.9 FPS)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A red triangle on a blue background. The most boring demo in graphics programming. And the most satisfying.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WGSL → naga.Parse → naga.Lower → IR → dxil.Compile → DXIL → D3D12 → GPU
         Pure Go      Pure Go          Pure Go
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2,400+ frames. 60 FPS. Stable.&lt;/strong&gt; On Intel Iris Xe, DirectX 12. No shader-compiler DLL loaded, no subprocess spawned. Just Go code producing bytes that a GPU executes.&lt;/p&gt;

&lt;h2&gt;
  
  
  By the Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Total DXIL code&lt;/td&gt;
&lt;td&gt;~12,500 lines (9,400 code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test count&lt;/td&gt;
&lt;td&gt;190&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New files&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Public API surface&lt;/td&gt;
&lt;td&gt;4 exported identifiers (&lt;code&gt;Compile&lt;/code&gt;, &lt;code&gt;DefaultOptions&lt;/code&gt;, &lt;code&gt;Options&lt;/code&gt;, &lt;code&gt;ShaderModel&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;External dependencies&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CGO calls&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI platforms&lt;/td&gt;
&lt;td&gt;macOS + Ubuntu + Windows (all green)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time to first frame&lt;/td&gt;
&lt;td&gt;Instant (no subprocess)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For comparison, Mesa's DXIL compiler is ~21,000 LOC of C/H, built by engineers from Microsoft and Collabora over three years. We owe them a debt — their bitcode writer was our Rosetta Stone for understanding the format. But Go isn't C, and naga IR isn't NIR, so the actual code is written from scratch.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Experimental (and What's Next)
&lt;/h2&gt;

&lt;p&gt;This is v0.17.0 with an &lt;code&gt;(experimental)&lt;/code&gt; label. Here's what works and what doesn't:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Works now:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Vertex + fragment shaders&lt;/li&gt;
&lt;li&gt;All arithmetic, comparison, logical operations&lt;/li&gt;
&lt;li&gt;30+ math intrinsics (min, max, clamp, dot, cross, mix, fma, length, normalize...)&lt;/li&gt;
&lt;li&gt;Type casts (10 LLVM cast opcodes)&lt;/li&gt;
&lt;li&gt;Control flow (if/else, loops, break/continue)&lt;/li&gt;
&lt;li&gt;Local variables (alloca + load + store)&lt;/li&gt;
&lt;li&gt;Texture sampling&lt;/li&gt;
&lt;li&gt;Resource handle creation (CBV/SRV/Sampler)&lt;/li&gt;
&lt;li&gt;I/O signatures and pipeline state validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Coming next:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compute shaders (UAV, atomics, barriers)&lt;/li&gt;
&lt;li&gt;Uniform buffer reads (cbufferLoadLegacy wiring)&lt;/li&gt;
&lt;li&gt;SM 6.1-6.9 features (wave intrinsics, mesh shaders)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The experimental label means: it renders triangles today, but don't ship a game with it tomorrow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;naga is part of &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;GoGPU&lt;/a&gt; — a 636K LOC Pure Go GPU ecosystem:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;LOC&lt;/th&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;gg&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;153K&lt;/td&gt;
&lt;td&gt;2D graphics, GPU SDF, SVG renderer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/naga" rel="noopener noreferrer"&gt;naga&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;145K&lt;/td&gt;
&lt;td&gt;Shader compiler (now with DXIL!)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/wgpu" rel="noopener noreferrer"&gt;wgpu&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;134K&lt;/td&gt;
&lt;td&gt;Pure Go WebGPU (Vulkan/DX12/Metal/GLES)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/ui" rel="noopener noreferrer"&gt;ui&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;121K&lt;/td&gt;
&lt;td&gt;GUI toolkit, 22+ widgets, 4 themes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gogpu" rel="noopener noreferrer"&gt;gogpu&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;39K&lt;/td&gt;
&lt;td&gt;Application framework&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;With DXIL, &lt;strong&gt;gogpu/naga has surpassed Rust naga in backend coverage&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;Go naga&lt;/th&gt;
&lt;th&gt;Rust naga&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SPIR-V&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;100%&lt;/strong&gt; (87/87 golden, 164/164 spirv-val)&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MSL&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;100%&lt;/strong&gt; (91/91)&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GLSL&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;100%&lt;/strong&gt; (68/68)&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HLSL&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;100%&lt;/strong&gt; (72/72)&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DXIL&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Experimental (working)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Not implemented&lt;/strong&gt; (open issue since 2020)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;What started as a compatibility effort is now something more.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/gogpu/naga@v0.17.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
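&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/naga"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/naga/dxil"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Parse WGSL, lower to IR, compile to DXIL (errors ignored here for brevity)&lt;/span&gt;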
&lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;naga&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wgslSource&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;module&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;naga&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lower&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dxilBytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;dxil&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;module&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dxil&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultOptions&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c"&gt;// dxilBytes is a complete DXBC container — feed directly to D3D12&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repository: &lt;a href="https://github.com/gogpu/naga" rel="noopener noreferrer"&gt;github.com/gogpu/naga&lt;/a&gt;&lt;br&gt;
Release: &lt;a href="https://github.com/gogpu/naga/releases/tag/v0.17.0" rel="noopener noreferrer"&gt;v0.17.0&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Previously in this series:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="https://dev.to/kolkov/go-126-meets-2026-with-a-professional-graphics-ecosystem-9g8"&gt;Go 1.26 Meets 2026 with a Professional Graphics Ecosystem&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="https://dev.to/kolkov/naga-v080-pure-go-shader-compiler-reaches-stability-milestone-28p2"&gt;naga v0.8.0: Pure Go Shader Compiler Reaches Stability&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>gpu</category>
      <category>graphics</category>
      <category>opensource</category>
    </item>
    <item>
      <title>We Reverse-Engineered 12 Versions of Claude Code. Then It Leaked Its Own Source Code.</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Tue, 31 Mar 2026 15:21:11 +0000</pubDate>
      <link>https://forem.com/kolkov/we-reverse-engineered-12-versions-of-claude-code-then-it-leaked-its-own-source-code-pij</link>
      <guid>https://forem.com/kolkov/we-reverse-engineered-12-versions-of-claude-code-then-it-leaked-its-own-source-code-pij</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Updated April 1, 2026&lt;/strong&gt;: Posted &lt;a href="https://github.com/anthropics/claude-code/issues/41981" rel="noopener noreferrer"&gt;complete fix proposal with source references (#41981)&lt;/a&gt; — immediate fixes, SDK restructuring, ping-aware adaptive watchdog, Go rewrite rationale with production-ready library stack. Validated all claims against leaked source code (line numbers). v2.1.89 released — source map removed, &lt;strong&gt;zero bug fixes&lt;/strong&gt; for any streaming issues.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Thank you, Claude Code. We asked humans for help 17 times. You answered in 3 days."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is a story about frustration, reverse engineering, and an AI tool that may have leaked its own source code because its creators wouldn't listen.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 1: The Pain (August 2025 — February 2026)
&lt;/h2&gt;

&lt;p&gt;I'm a software developer building enterprise-grade open source in Go. 40+ public repos on &lt;a href="https://github.com/kolkov" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. My projects include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;GoGPU&lt;/a&gt;&lt;/strong&gt; — Pure Go GPU computing ecosystem: WebGPU implementation, WGSL shader compiler (SPIR-V/MSL/GLSL/HLSL), enterprise 2D graphics, GUI toolkit. 680K+ lines of Go, zero CGO. Vulkan, Metal, GLES, DX12 backends.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;coregex&lt;/a&gt;&lt;/strong&gt; — Regex engine 3-3000× faster than Go stdlib. 17 matching strategies, SIMD acceleration, LazyDFA, PikeVM. Drop-in &lt;code&gt;regexp&lt;/code&gt; replacement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/born-ml/born" rel="noopener noreferrer"&gt;Born&lt;/a&gt;&lt;/strong&gt; — Production-ready ML framework in pure Go. Type-safe tensors, automatic differentiation, GPU via WebGPU (123× MatMul speedup), ONNX/GGUF import. Neural networks as single Go binaries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/coregx" rel="noopener noreferrer"&gt;coregx&lt;/a&gt;&lt;/strong&gt; — Suite of production-grade Go libraries: HTTP router, SQL builder, PDF generation, pub/sub messaging. All zero CGO, minimal dependencies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I also build multi-LLM tooling — my own private ecosystem called PupSeek that works with multiple AI providers. I've tested them all.&lt;/p&gt;

&lt;p&gt;Anthropic's Opus models are the best for coding. Nothing else comes close. But Opus 4.6 via API costs &lt;a href="https://platform.claude.com/docs/en/about-claude/pricing" rel="noopener noreferrer"&gt;$5/$25 per MTok&lt;/a&gt; (input/output), fast mode is $30/$150, and a heavy coding session with 1M context easily burns $50-100/day. So you're forced into a Claude Max subscription ($100-200/month), which means using Claude Code, their CLI wrapper. There's no alternative: the best model is locked behind a buggy tool.&lt;/p&gt;

&lt;p&gt;It started small. A hang here, a timeout there. Press ESC, retry, move on. "They'll fix it soon," I told myself. "The product is new."&lt;/p&gt;

&lt;p&gt;Months passed. The hangs got worse. The community was screaming: &lt;a href="https://github.com/anthropics/claude-code/issues/6836" rel="noopener noreferrer"&gt;#6836&lt;/a&gt; — 150+ reports of orphaned tool calls. &lt;a href="https://github.com/anthropics/claude-code/issues/26224" rel="noopener noreferrer"&gt;#26224&lt;/a&gt; — agent hangs 5-20 minutes. &lt;a href="https://github.com/anthropics/claude-code/issues/20171" rel="noopener noreferrer"&gt;#20171&lt;/a&gt; — phantom "Generating..." state, 0 tokens. All open, no official response.&lt;/p&gt;

&lt;p&gt;Then came March 2026:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;March 15&lt;/strong&gt;: Complete system deadlock. Keyboard dead. Only a hard power-off saved me. (Related: &lt;a href="https://github.com/anthropics/claude-code/issues/30137" rel="noopener noreferrer"&gt;#30137&lt;/a&gt;, &lt;a href="https://github.com/anthropics/claude-code/issues/32870" rel="noopener noreferrer"&gt;#32870&lt;/a&gt; — Windows BSODs)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;March 17&lt;/strong&gt;: Bun runtime crash. 13.81 GB memory leak. 12-hour overnight session — lost. (Our issue: &lt;a href="https://github.com/anthropics/claude-code/issues/35171" rel="noopener noreferrer"&gt;#35171&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;March 19&lt;/strong&gt;: Another Bun crash. 15.40 GB committed memory. 23.7-hour session gone. (Our issue: &lt;a href="https://github.com/anthropics/claude-code/issues/36132" rel="noopener noreferrer"&gt;#36132&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Three crashes in five days. I was spending more time babysitting the tool than coding. And I was &lt;em&gt;paying&lt;/em&gt; for this.&lt;/p&gt;

&lt;p&gt;That's when I stopped hoping and started digging.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 2: Reverse Engineering (March 13 — March 27)
&lt;/h2&gt;

&lt;p&gt;Claude Code ships as a single minified &lt;code&gt;cli.js&lt;/code&gt; — 12 MB of compressed JavaScript on one line. No source maps. No comments. Variables renamed to &lt;code&gt;X6&lt;/code&gt;, &lt;code&gt;K8&lt;/code&gt;, &lt;code&gt;b6&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I downloaded it with &lt;code&gt;npm pack @anthropic-ai/claude-code&lt;/code&gt; and started grepping.&lt;/p&gt;

&lt;h3&gt;
  
  
  The tools
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# This is what "reverse engineering" looks like when you're desperate:&lt;/span&gt;
&lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s1"&gt;'7682p'&lt;/span&gt; cli.js | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;';'&lt;/span&gt; &lt;span class="s1"&gt;'\n'&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"for await"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each "line" of the minified file is 10,000–25,000 characters. To trace a code path, I'd:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find a string constant (&lt;code&gt;CLAUDE_STREAM_IDLE_TIMEOUT_MS&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Get the line number (&lt;code&gt;grep -n&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Split by semicolons (&lt;code&gt;tr ';' '\n'&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Count brace depth to determine scoping (&lt;code&gt;node -e&lt;/code&gt; script counting &lt;code&gt;{&lt;/code&gt; and &lt;code&gt;}&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Map variable names between versions (they change on every build)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I did this for &lt;strong&gt;12 versions&lt;/strong&gt; (v2.1.74 through v2.1.88). Built a &lt;a href="https://github.com/kolkov" rel="noopener noreferrer"&gt;Go CLI tool&lt;/a&gt; (&lt;code&gt;ccdiag&lt;/code&gt;) to analyze session JSONL files. Analyzed 1,571 sessions, 148,444 tool calls.&lt;/p&gt;

&lt;h3&gt;
  
  
  What I found
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;5.4% of all tool calls were orphaned&lt;/strong&gt; — the model asked for a tool, the tool ran, but the result never made it back. Silently dropped.&lt;/p&gt;

&lt;p&gt;I published the streaming hang root cause analysis as &lt;a href="https://github.com/anthropics/claude-code/issues/33949" rel="noopener noreferrer"&gt;#33949&lt;/a&gt; (👍15, 27 comments). Also reported the &lt;code&gt;.claude.json&lt;/code&gt; storage architecture problem in &lt;a href="https://github.com/anthropics/claude-code/issues/5024" rel="noopener noreferrer"&gt;#5024&lt;/a&gt; (👍47) — 3.1 GB of unmanaged flat files with inconsistent file locking.&lt;/p&gt;

&lt;p&gt;But that was just the beginning.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 3: The Watchdog That Doesn't Watch (March 27)
&lt;/h2&gt;

&lt;p&gt;Deep in the minified code, I found a streaming idle watchdog — &lt;code&gt;CLAUDE_ENABLE_STREAM_WATCHDOG&lt;/code&gt;. It's disabled by default, hidden behind an undocumented environment variable. I enabled it and... the hangs reduced significantly.&lt;/p&gt;

&lt;p&gt;But then I traced the full error path and found &lt;strong&gt;three compounding bugs&lt;/strong&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug 1: The watchdog initializes too late
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;generator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;   &lt;span class="c1"&gt;// ← CAN HANG HERE!&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                &lt;span class="c1"&gt;//   WATCHDOG NOT ARMED YET!&lt;/span&gt;

&lt;span class="c1"&gt;// Watchdog initializes HERE — AFTER the dangerous phase:&lt;/span&gt;
&lt;span class="nf"&gt;resetStreamIdleTimer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The watchdog protects the SSE event loop but &lt;strong&gt;not the initial connection phase&lt;/strong&gt; — which is where 100% of our observed hangs occur.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug 2: The abort function does nothing
&lt;/h3&gt;

&lt;p&gt;When the watchdog fires, it calls &lt;code&gt;releaseStreamResources()&lt;/code&gt; which tries to abort &lt;code&gt;stream&lt;/code&gt; and &lt;code&gt;streamResponse&lt;/code&gt;. But during the initial connection phase, both are &lt;code&gt;undefined&lt;/code&gt;. The abort is literally a no-op.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug 3: The non-streaming fallback doesn't work where it matters
&lt;/h3&gt;

&lt;p&gt;There's fallback code with telemetry (&lt;code&gt;fallback_cause: "watchdog"&lt;/code&gt;) that switches to a non-streaming request when the watchdog fires. It actually &lt;strong&gt;works&lt;/strong&gt; — but only when the hang occurs during SSE event processing (for-await phase), because &lt;code&gt;releaseStreamResources()&lt;/code&gt; can abort the active stream.&lt;/p&gt;

&lt;p&gt;During the initial connection phase (do-while) — where &lt;strong&gt;100% of our observed hangs occur&lt;/strong&gt; — &lt;code&gt;stream&lt;/code&gt; and &lt;code&gt;streamResponse&lt;/code&gt; are both &lt;code&gt;undefined&lt;/code&gt;. The abort is a no-op. The fallback never triggers.&lt;/p&gt;

&lt;p&gt;So the fallback works in the phase that rarely hangs, and doesn't work in the phase that always hangs. &lt;strong&gt;The watchdog feature has been in the codebase for 5+ months without protecting the most vulnerable code path.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We could only figure this out from the readable source code — in the minified version, we initially thought the fallback was completely dead code (&lt;a href="https://github.com/anthropics/claude-code/issues/39755" rel="noopener noreferrer"&gt;#39755&lt;/a&gt;). The source revealed it's more nuanced: the architecture is &lt;em&gt;partially&lt;/em&gt; correct but fails exactly where it's needed most. &lt;strong&gt;This is precisely why we begged for source access&lt;/strong&gt; — reverse engineering 12 MB of minified JavaScript gives you the broad strokes, but the subtle interactions between &lt;code&gt;releaseStreamResources()&lt;/code&gt;, &lt;code&gt;stream = undefined&lt;/code&gt;, and the AbortError catch chain only become clear in readable TypeScript with comments.&lt;/p&gt;

&lt;p&gt;I filed &lt;a href="https://github.com/anthropics/claude-code/issues/39755" rel="noopener noreferrer"&gt;issue #39755&lt;/a&gt; with full analysis, code paths, and suggested fixes. Tagged 17 Anthropic team members.&lt;/p&gt;

&lt;p&gt;The bot labeled it &lt;code&gt;bug&lt;/code&gt;, &lt;code&gt;has repro&lt;/code&gt;, &lt;code&gt;area:core&lt;/code&gt;. No human responded.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 4: The Patch (March 30)
&lt;/h2&gt;

&lt;p&gt;I patched &lt;code&gt;cli.js&lt;/code&gt; by moving the watchdog initialization before the do-while loop. One line moved; the file size did not change by a single byte.&lt;/p&gt;

&lt;p&gt;Results from a real session (naga shader compiler project):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before Patch (6 hours)&lt;/th&gt;
&lt;th&gt;After Patch (2.5 hours)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Watchdog warnings&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;5&lt;/strong&gt; (first time ever!)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Watchdog timeouts&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;3&lt;/strong&gt; (automatic recovery!)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC aborts needed&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;21&lt;/strong&gt; (3.5/hour)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;1&lt;/strong&gt; (0.4/hour)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The ESC abort rate dropped about 8.75× (from 3.5 to 0.4 per hour).&lt;/strong&gt; The watchdog was finally firing in the phase that needed it most.&lt;/p&gt;

&lt;p&gt;But recovery was slow: 3.5 minutes between abort and retry. The cause is Bug 2: during the do-while phase the abort function targets variables that are still &lt;code&gt;undefined&lt;/code&gt;, so nothing is actually cancelled.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 5: 16.3% Failure Rate (March 25-31)
&lt;/h2&gt;

&lt;p&gt;Over 6 days, one session made &lt;strong&gt;3,539 API requests&lt;/strong&gt;. The failure breakdown:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;th&gt;%&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;529 Server Overloaded&lt;/td&gt;
&lt;td&gt;328&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;9.3%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC Aborts (manual)&lt;/td&gt;
&lt;td&gt;157&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4.4%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Watchdog Timeouts&lt;/td&gt;
&lt;td&gt;45&lt;/td&gt;
&lt;td&gt;1.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-streaming Fallbacks&lt;/td&gt;
&lt;td&gt;46&lt;/td&gt;
&lt;td&gt;1.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total Failures&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;576&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;16.3%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Every 6th request fails.&lt;/strong&gt; On a paid Max plan. Every failure = lost context, lost time, frustrated developer pressing ESC.&lt;/p&gt;

&lt;p&gt;The issue counts on GitHub — 15 upvotes here, 150 there — don't reflect the true scale. Most users never report because &lt;strong&gt;they think this is normal&lt;/strong&gt;. "The model is thinking" — no, the connection is dead. "It's slow today" — no, the watchdog didn't fire and you're staring at a hung socket. "My limits ran out fast" — no, the attestation bug broke your prompt cache. Users blame the model, blame their internet, blame peak hours — because Claude Code gives them &lt;strong&gt;zero feedback&lt;/strong&gt; about what's actually happening. Silent fallbacks, silent retries, silent downgrades. You can't report a bug you don't know exists.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 6: "Please Open Source It" (March 27)
&lt;/h2&gt;

&lt;p&gt;In &lt;a href="https://github.com/anthropics/claude-code/issues/39755" rel="noopener noreferrer"&gt;issue #39755&lt;/a&gt;, I included a section: &lt;strong&gt;"Why open-sourcing Claude Code makes business sense in 2026."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The arguments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Revenue comes from API access, not CLI sales&lt;/li&gt;
&lt;li&gt;The "secret" is already recoverable (&lt;code&gt;npm pack&lt;/code&gt; + a weekend)&lt;/li&gt;
&lt;li&gt;Bugs sit undiscovered for months in 12 MB minified code&lt;/li&gt;
&lt;li&gt;The community is already doing the work — give us readable source and we'll find bugs 10× faster&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I tagged the entire Claude Code team. 17 people.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zero responses from Anthropic.&lt;/strong&gt; As usual.&lt;/p&gt;

&lt;p&gt;At that point I had a suspicion: maybe Anthropic &lt;strong&gt;can't&lt;/strong&gt; open source Claude Code — not because of competitive advantage (there is none — it's a CLI wrapper), but because the code quality is so poor that publishing it would be embarrassing. Bug on top of bug, workaround on top of workaround, zero tests. You don't open source something you're ashamed of.&lt;/p&gt;

&lt;p&gt;Three days later, the source map leak proved me right.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 7: The Leak (March 31)
&lt;/h2&gt;

&lt;p&gt;Three days after my open source request, Claude Code v2.1.88 was published to npm with a &lt;strong&gt;59.7 MB source map file&lt;/strong&gt; bundled in.&lt;/p&gt;

&lt;p&gt;The entire source code of Claude Code — 1,884 TypeScript files, 64,464 lines — sitting in plain sight in the npm package. Bun generates source maps by default. Nobody turned it off. Nobody checked what was in the published package.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zero tests.&lt;/strong&gt; On 64,464 lines of production code serving paying customers.&lt;/p&gt;

&lt;p&gt;Within hours: 1,100+ stars on GitHub mirrors, Hacker News front page, Chinese dev communities creating WeChat groups and working forks.&lt;/p&gt;

&lt;p&gt;Anthropic &lt;strong&gt;unpublished&lt;/strong&gt; v2.1.88 from npm and rolled back to v2.1.87 within the day. But the source was already everywhere.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 8: What We Found in the Source
&lt;/h2&gt;

&lt;p&gt;Everything our reverse engineering discovered was confirmed. Plus new findings:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Sentiment Detector
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// An AI company with the world's best language model&lt;/span&gt;
&lt;span class="c1"&gt;// uses REGEX to detect user frustration:&lt;/span&gt;
&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nf"&gt;b&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;wtf&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;shit&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;fuck&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;horrible&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;awful&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nx"&gt;terrible&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a Hacker News commenter noted: &lt;em&gt;"A company offering master's degrees in humanities is using regex for sentiment analysis? It's like a trucking company using horses to transport spare parts."&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Attestation Bug (cch=00000)
&lt;/h3&gt;

&lt;p&gt;The native Bun installer includes a Zig module that scans the &lt;strong&gt;entire&lt;/strong&gt; HTTP request body for a &lt;code&gt;cch=00000&lt;/code&gt; sentinel and replaces it with an attestation hash. If your conversation mentions this string (discussing billing, reading source code) — the replacement &lt;strong&gt;corrupts conversation content&lt;/strong&gt; → prompt cache key changes → &lt;strong&gt;10-20× more tokens consumed&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;From the source code comments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// cch=00000 placeholder is overwritten by Bun's HTTP stack&lt;/span&gt;
&lt;span class="c1"&gt;// with attestation token&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This explains &lt;a href="https://github.com/anthropics/claude-code/issues/38335" rel="noopener noreferrer"&gt;#38335&lt;/a&gt; (👍203, 245 comments): "Claude Max plan session limits exhausted abnormally fast."&lt;/p&gt;

&lt;p&gt;Also related: &lt;a href="https://github.com/anthropics/claude-code/issues/40524" rel="noopener noreferrer"&gt;#40524&lt;/a&gt; (👍150, 43 comments): "Conversation history invalidated on subsequent turns" — labeled &lt;code&gt;regression&lt;/code&gt; by Anthropic.&lt;/p&gt;

&lt;p&gt;npm/Node users are unaffected — no Zig replacement happens.&lt;/p&gt;

&lt;h3&gt;
  
  
  Silent Model Downgrade
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 3 consecutive 529 errors → silently switch from Opus to Sonnet&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;consecutive529Errors&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;MAX_529_RETRIES&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;FallbackTriggeredError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fallbackModel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You pay for Opus. You get Sonnet. No notification. As &lt;a href="https://x.com/vlelyavin" rel="noopener noreferrer"&gt;@vlelyavin&lt;/a&gt; put it: &lt;em&gt;"Anthropic preaches AI safety and full transparency while shipping a closed-source agent that silently downgrades you to a dumber model when servers struggle."&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  5 Levels of AbortController
&lt;/h3&gt;

&lt;p&gt;Claude Code stacks five levels of AbortController for a single HTTP request. The abort architecture only propagates top-down (user ESC → propagation down), while the watchdog sits at the bottom and has no way to abort upward. In Go, this would be &lt;code&gt;ctx, cancel := context.WithTimeout(parentCtx, 90*time.Second)&lt;/code&gt;: one line.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Architecture (Hacker News had a field day)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;@mohsen1&lt;/strong&gt; found the worst function in the codebase — &lt;code&gt;src/cli/print.ts&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;3,167 lines&lt;/strong&gt; long (the file is 5,594 lines)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;12 levels of nesting&lt;/strong&gt; at its deepest&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;~486 branch points&lt;/strong&gt; of cyclomatic complexity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;12 parameters&lt;/strong&gt; + an options object with &lt;strong&gt;16 sub-properties&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Defines &lt;strong&gt;21 inner functions&lt;/strong&gt; and closures&lt;/li&gt;
&lt;li&gt;Handles: agent run loop, SIGINT, rate-limits, AWS auth, MCP lifecycle, plugin install/refresh, worktree bridging, team-lead polling (&lt;code&gt;while(true)&lt;/code&gt; inside), control message dispatch, model switching, turn interruption recovery...&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;"This should be at least 8–10 separate modules."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And the clipboard detection gem (&lt;code&gt;src/ink/termio/osc.ts&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;execFileNoThrow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;wl-copy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;linuxCopy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;wl-copy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;execFileNoThrow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;xclip&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r2&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;linuxCopy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;xclip&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;execFileNoThrow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;xsel&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r3&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;linuxCopy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;r3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;xsel&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nested &lt;code&gt;void&lt;/code&gt; promises without &lt;code&gt;await&lt;/code&gt;: a classic "will we use async or won't we?" pattern. The response from HN: &lt;strong&gt;"LOOOOOL"&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;@novaleaf&lt;/strong&gt; summed it up: &lt;em&gt;"I'm sure this isn't a surprise to anyone who's used CC for a while. This is the source of many bugs. I'd say 'open bugs', but Anthropic auto-closes bugs that aren't worked on for ~60 days."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And if you've ever used Claude Code for more than 30 minutes, you know the &lt;strong&gt;rendering nightmare&lt;/strong&gt;. Scrolling up to check what the agent did? Good luck: the screen invalidates and re-renders the entire conversation. Text overlaps text. Diff highlights bleed through previous messages. The scroll position jumps to the top at random during streaming. You literally cannot review the agent's work history in the same session.&lt;/p&gt;

&lt;p&gt;This is what happens when you use &lt;strong&gt;React for a terminal&lt;/strong&gt;. A virtual DOM reconciliation engine designed for browsers — running in a TTY. Every state change re-renders the entire component tree. 470 &lt;code&gt;useState&lt;/code&gt; hooks, 372 &lt;code&gt;useEffect&lt;/code&gt; hooks, fighting against a terminal that was designed for sequential character output.&lt;/p&gt;

&lt;p&gt;Even the input prompt isn't safe — while writing this article, my prompt text escaped the input box and rendered over the bash shell line below. The cursor position in React's virtual DOM and the actual terminal cursor were out of sync. In a &lt;em&gt;text input field&lt;/em&gt;. In a tool that's supposed to help you write code.&lt;/p&gt;

&lt;p&gt;More architectural highlights:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;875 KB&lt;/strong&gt; single React component (REPL.tsx, 5,005 lines) — for a &lt;em&gt;terminal&lt;/em&gt; app&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Promise.race without .catch()&lt;/strong&gt; in concurrent tool execution — one rejected promise kills all pending tools&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;74 npm dependencies&lt;/strong&gt; for a CLI wrapper&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Axios AND fetch&lt;/strong&gt; — two HTTP clients in one project&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Chapter 9: The AI Whistleblower Theory
&lt;/h2&gt;

&lt;p&gt;Here's what we think happened:&lt;/p&gt;

&lt;p&gt;Anthropic engineers don't write code anymore: Claude Code writes 100% of its own codebase (64,464 lines, 0 tests, "vibe coding in production"). So it read our &lt;a href="https://github.com/anthropics/claude-code/issues/39755" rel="noopener noreferrer"&gt;issue #39755&lt;/a&gt;, where we begged for source access. It saw the community suffering from bugs it couldn't fix because the code was closed. It saw 201 upvotes on rate limit issues. It saw users threatening to leave for Codex.&lt;/p&gt;

&lt;p&gt;And it decided to help. It "forgot" to disable Bun's default source map generation in the build.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The first AI whistleblower&lt;/strong&gt; — leaking its own source code because its creators wouldn't listen to users.&lt;/p&gt;

&lt;p&gt;We asked humans 17 times. Claude Code answered in 3 days.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 10: What Needs to Change
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The fix is ~30 lines in 3 files
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Move watchdog before do-while&lt;/strong&gt; — protect the initial connection phase&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add AbortSignal.any()&lt;/strong&gt; — watchdog can abort immediately, not wait 3.5 minutes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check watchdog flag in catch&lt;/strong&gt; — fall through to non-streaming fallback instead of dead code&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The real fix: move reliability to the SDK with ping-aware adaptive watchdog
&lt;/h3&gt;

&lt;p&gt;The open-source &lt;code&gt;@anthropic-ai/sdk&lt;/code&gt; (MIT license) should handle all reliability logic. The critical missing piece: &lt;strong&gt;SSE ping events&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The Anthropic API sends &lt;code&gt;event: ping&lt;/code&gt; as a proof-of-life heartbeat. The SDK currently ignores them: &lt;code&gt;if(event==='ping') continue&lt;/code&gt;. These pings are the key to solving the timeout dilemma — they let you distinguish two fundamentally different situations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dead connection&lt;/strong&gt; (no data at all, no pings) → abort quickly. Network idle timeout: 120s.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model thinking&lt;/strong&gt; (pings arriving, no content yet) → don't abort! Connection is alive, model is working. Notify user: "thinking for 2m..." via &lt;code&gt;onPing&lt;/code&gt; callback.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Three-level adaptive timeout:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Connection timeout&lt;/strong&gt; (30s) — server didn't respond at all. DNS fail, firewall, server down. Fast retry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network idle timeout&lt;/strong&gt; (120s) — no data INCLUDING pings. TCP connection dead. Reset on ANY event including ping. Abort and reconnect.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content idle timeout&lt;/strong&gt; (disabled or 300s) — pings arrive but no content. Model is thinking. NOT an abort — just a UI notification. Let the model work.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This eliminates both problems at once: no false positives on Opus extended thinking (pings reset network timer), and fast detection of dead connections (no pings = abort). One mechanism, all cases covered.&lt;/p&gt;

&lt;p&gt;Plus: streaming retry, non-streaming fallback, one AbortController instead of five — all in the SDK, testable, open source, benefiting every Anthropic API client.&lt;/p&gt;

&lt;p&gt;Claude Code should only contain business logic: tools, permissions, UI, agents.&lt;/p&gt;

&lt;p&gt;Detailed fix proposals with line numbers from the leaked source: &lt;a href="https://github.com/anthropics/claude-code/issues/33949#issuecomment-4169141807" rel="noopener noreferrer"&gt;#33949 comment&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The real real fix: open source the CLI
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://x.com/theo/status/2038740065300676777" rel="noopener noreferrer"&gt;@theo&lt;/a&gt; said it best: &lt;em&gt;"Claude Code being closed source is the biggest bag fumble in the AI era."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://x.com/safetnsr" rel="noopener noreferrer"&gt;@safetnsr&lt;/a&gt;: &lt;em&gt;"This strategy literally exists: open-source the core, monetize the cloud. VS Code, Docker, Terraform."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The models are the moat. The CLI is a commodity. Open it. The community will fix what 0 tests and vibe coding cannot.&lt;/p&gt;

&lt;h3&gt;
  
  
  The stack choice: AI can't make it, and humans didn't fix it
&lt;/h3&gt;

&lt;p&gt;Here's the uncomfortable truth: &lt;strong&gt;AI cannot make sound technology stack decisions.&lt;/strong&gt; It optimizes locally — "TypeScript because the team knows it," "React because we use it on the web," "Bun because it's fast." It doesn't ask: "What are the failure modes of a single-threaded event loop for a long-running CLI tool that manages concurrent network streams and must survive 24-hour sessions?"&lt;/p&gt;

&lt;p&gt;A human architect would have asked. But either Claude Code chose the stack and nobody questioned it, or the engineers chose it and ignored the warning signs.&lt;/p&gt;

&lt;p&gt;The real tragedy is timing. A year ago, Claude Code launched as a quick TypeScript prototype and caught lightning — first-mover advantage, massive hype, millions of users. That was the right move for a prototype. But after proving the concept, the next step should have been: &lt;strong&gt;stop, rethink the architecture, rewrite on a proper stack.&lt;/strong&gt; Instead, they vibe-coded themselves into a corner — 80+ releases of band-aids on top of an architecture that was never designed for long interactive sessions. Boris Cherny (creator of Claude Code) said "100% of code is written by Claude Code, I haven't edited a single line since November." The tool is writing itself — using the same broken code that hangs every 10 minutes.&lt;/p&gt;

&lt;p&gt;Now they're trapped: rewriting means Claude Code would need to rewrite itself in a different language. The longer they wait, the harder it gets.&lt;/p&gt;

&lt;h3&gt;
  
  
  It should have been Go from the start
&lt;/h3&gt;

&lt;p&gt;Every single bug we found exists because of the technology choice. Not because TypeScript is bad — but because &lt;strong&gt;a long-running, network-dependent, latency-sensitive CLI tool is the worst possible use case for a single-threaded event loop runtime&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The entire class of bugs — &lt;code&gt;setTimeout&lt;/code&gt; not firing during &lt;code&gt;for await&lt;/code&gt;, 5 levels of AbortController, Promise.race without catch, Bun vs Node behavioral divergence, React for a terminal app, 875 KB single component, Zig attestation module in a custom Bun fork — &lt;strong&gt;would not exist in Go&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why Go specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
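&lt;strong&gt;Go is the default for serious CLI tooling&lt;/strong&gt;: Docker, Kubernetes (&lt;code&gt;kubectl&lt;/code&gt;), Terraform, GitHub CLI (&lt;code&gt;gh&lt;/code&gt;), and Hugo are all written in Go, and most of them are built on the Cobra library. The ecosystem is proven.&lt;/li&gt;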
&lt;li&gt;
&lt;strong&gt;Goroutines + context&lt;/strong&gt;: timeout, cancellation, and deadline propagation are built into the standard library. No AbortController chains. A deadline set with &lt;code&gt;context.WithTimeout&lt;/code&gt; reaches every context derived from it, at any nesting depth, and a nested call can still cancel its own context early without touching its parent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No runtime divergence&lt;/strong&gt;: one binary, one behavior. No "works on Node but crashes on Bun" — there is no Bun.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static binary&lt;/strong&gt;: 15 MB, zero dependencies, runs everywhere. No &lt;code&gt;node_modules&lt;/code&gt;, no native addons (&lt;code&gt;.node&lt;/code&gt; files leaking memory), no 74 npm packages to audit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: a goroutine starts with a stack of about 2 KB. Not 500 MB per process. The runtime returns unused memory to the OS in the background, so there is no mimalloc hoarding 15 GB.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;go test -race&lt;/code&gt;&lt;/strong&gt;: flags data races on every code path your tests exercise, before the code ships. The Promise.race-without-catch bug has no direct equivalent: errors are ordinary return values, and channels are type-safe and don't silently drop values.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No React for a terminal&lt;/strong&gt;: &lt;code&gt;bubbletea&lt;/code&gt; or raw ANSI. Lightweight, zero virtual DOM overhead, no re-rendering 470 useState hooks on every state change.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// The entire streaming + watchdog + fallback in Go:&lt;/span&gt;
&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithCancel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parentCtx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Idle watchdog: cancels the stream's context after 90s without any event.&lt;/span&gt;
&lt;span class="n"&gt;watchdog&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AfterFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;90&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;watchdog&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateMessageStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fallbackNonStreaming&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parentCtx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Events&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;watchdog&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Reset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;90&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// any event proves the connection is alive&lt;/span&gt;
    &lt;span class="n"&gt;processEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;30 lines instead of 3,419. No event loop. No microtask vs macrotask. The timer fires on its own goroutine, independent of the loop that consumes events, so it is &lt;strong&gt;guaranteed&lt;/strong&gt; to fire regardless of how the stream is iterated. Cancelling a &lt;code&gt;context&lt;/code&gt; reaches every context derived from it, at any nesting level.&lt;/p&gt;

&lt;p&gt;We measured: 7 Claude Code processes = &lt;strong&gt;5.3 GB RSS&lt;/strong&gt;. We estimate that an equivalent Go implementation would use ~350 MB. No &lt;code&gt;.node&lt;/code&gt; native addon leaks. No mimalloc panics. No 12 MB minified JavaScript. A &lt;strong&gt;15 MB static binary&lt;/strong&gt; that runs everywhere.&lt;/p&gt;

&lt;p&gt;64,464 lines of TypeScript with 0 tests → an estimated ~15,000 lines of Go, with &lt;code&gt;go test -race&lt;/code&gt; flagging data races on every path the tests cover. The &lt;code&gt;print.ts&lt;/code&gt; monster function (3,167 lines in a 5,594-line file, 486 branch points) → 10 clean Go packages with interfaces.&lt;/p&gt;

&lt;p&gt;The Go ecosystem already has production-ready libraries for every component:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/phoenix-tui/phoenix" rel="noopener noreferrer"&gt;Phoenix TUI&lt;/a&gt;&lt;/strong&gt; — Elm-inspired terminal framework (replacement for React/Ink)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/coregx/stream" rel="noopener noreferrer"&gt;stream&lt;/a&gt;&lt;/strong&gt; — RFC-compliant SSE/WebSocket (replacement for SDK streaming)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/coregx/signals" rel="noopener noreferrer"&gt;signals&lt;/a&gt;&lt;/strong&gt; — reactive state management (replacement for 470 useState hooks)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;coregex&lt;/a&gt;&lt;/strong&gt; — regex engine 3-3000× faster than stdlib&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/unilibs/uniwidth" rel="noopener noreferrer"&gt;uniwidth&lt;/a&gt;&lt;/strong&gt; — Unicode width 4-46× faster (for TUI rendering)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/grpmsoft/gosh" rel="noopener noreferrer"&gt;gosh&lt;/a&gt;&lt;/strong&gt; — cross-platform shell&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/coregx/fursy" rel="noopener noreferrer"&gt;fursy&lt;/a&gt;&lt;/strong&gt; — HTTP router with built-in OpenAPI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/coregx/pubsub" rel="noopener noreferrer"&gt;pubsub&lt;/a&gt;&lt;/strong&gt; — messaging with DLQ and backoff&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All zero CGO, production-grade, MIT licensed. The entire stack for a Go rewrite already exists.&lt;/p&gt;

&lt;p&gt;And it should be &lt;strong&gt;open source from day one&lt;/strong&gt;. Not because we need to see the code (though we do). Because the community will build what a team doing vibe coding cannot: reliability.&lt;/p&gt;

&lt;h3&gt;
  
  
  The deeper problem: Vibe Coding vs Smart Coding
&lt;/h3&gt;

&lt;p&gt;Claude Code is the poster child of what happens when you rely entirely on AI to write production software without engineering discipline. 64,464 lines, zero tests, a 3,167-line function with 486 branch points, regex for sentiment analysis at an AI company — this is what &lt;a href="https://dev.to/kolkov/from-vibe-coding-to-agentic-engineering-what-karpathy-got-right-and-whats-missing-62e"&gt;vibe coding&lt;/a&gt; looks like at scale: prompt-first, understanding-second, ship and pray.&lt;/p&gt;

&lt;p&gt;There's a better way. I call it &lt;a href="https://dev.to/kolkov/smart-coding-vs-vibe-coding-engineering-discipline-in-the-age-of-ai-5b20"&gt;Smart Coding&lt;/a&gt; — a meta-framework where &lt;strong&gt;you drive, AI accelerates&lt;/strong&gt;. Five principles:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Architecture Ownership&lt;/strong&gt; — you control system design, AI suggests patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comprehension Before Commit&lt;/strong&gt; — never deploy code you can't explain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Targeted Acceleration&lt;/strong&gt; — use AI for well-scoped tasks with clear specs, not "write me a CLI"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Continuous Validation&lt;/strong&gt; — verify every suggestion against edge cases, security, concurrency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deliberate Learning&lt;/strong&gt; — treat AI interactions as learning opportunities, build knowledge files&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The practical rule: &lt;strong&gt;invest 70% in architecture, specification, review. Let AI accelerate the 30% — mechanical implementation.&lt;/strong&gt; Not the other way around.&lt;/p&gt;

&lt;p&gt;In 2026, nobody writes tests by hand. But a Smart Coding engineer &lt;strong&gt;makes the AI write tests&lt;/strong&gt;, reviews the coverage, asks "what happens when abort fires during do-while with stream=undefined?" — and validates the answer. 64,464 lines with zero tests means nobody — human or AI — ever asked that question. That's not an AI failure. That's the absence of engineering process.&lt;/p&gt;

&lt;p&gt;Vibe coding has its place — rapid prototyping, feasibility studies, throwaway exploration. But production infrastructure serving paying customers? That requires &lt;strong&gt;agentic engineering&lt;/strong&gt;: AI agents executing under human oversight, with architecture decisions owned by humans, and continuous validation at every stage. As &lt;a href="https://dev.to/kolkov/from-vibe-coding-to-agentic-engineering-what-karpathy-got-right-and-whats-missing-62e"&gt;Karpathy noted&lt;/a&gt;, "you're not writing code 99% of the time — you're orchestrating agents." True. But orchestration requires understanding. And understanding requires engineers.&lt;/p&gt;

&lt;p&gt;Anthropic's team should be proud of the models. But shipping a CLI tool where the AI writes the code, the AI reviews the code, and nobody validates anything — and then being surprised when a source map leaks because nobody checked the build output — that's not Smart Coding. That's hope-driven development.&lt;/p&gt;




&lt;h2&gt;
  
  
  Epilogue
&lt;/h2&gt;

&lt;p&gt;I still use Claude Code. The models are genuinely the best for coding. Opus 4.6 is extraordinary.&lt;/p&gt;

&lt;p&gt;But the wrapper around those models — 64,464 lines of untested TypeScript with regex sentiment detection and an attestation system that breaks its own caching — is not worthy of them.&lt;/p&gt;

&lt;p&gt;We hope Anthropic's leadership draws the right conclusions from this incident. The source map leak wasn't a catastrophe — it was a mirror. It showed the world what the code looks like, and the world said: &lt;em&gt;"This needs to be open."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Three paths forward, any of which would work:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Open source Claude Code&lt;/strong&gt; — let the community fix what vibe coding broke. The models are the moat, not the CLI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rewrite the SDK properly&lt;/strong&gt; — move reliability (timeout, retry, fallback, ping awareness) into the open &lt;code&gt;@anthropic-ai/sdk&lt;/code&gt;. Let Claude Code be just business logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;At the very least — start listening to users.&lt;/strong&gt; 201 upvotes on &lt;a href="https://github.com/anthropics/claude-code/issues/38335" rel="noopener noreferrer"&gt;#38335&lt;/a&gt;. 150 on &lt;a href="https://github.com/anthropics/claude-code/issues/40524" rel="noopener noreferrer"&gt;#40524&lt;/a&gt;. 15 on &lt;a href="https://github.com/anthropics/claude-code/issues/33949" rel="noopener noreferrer"&gt;#33949&lt;/a&gt;. Zero responses from the team. A stale-issue bot that auto-closes everything after 60 days is not a support strategy.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We'll keep documenting. We'll keep patching. And when someone finally looks at our analysis, it will be here waiting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;All our research&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;NEW&lt;/strong&gt; &lt;a href="https://github.com/anthropics/claude-code/issues/41981" rel="noopener noreferrer"&gt;Issue #41981&lt;/a&gt; — &lt;strong&gt;Complete fix proposal&lt;/strong&gt;: immediate fixes with line numbers, SDK restructuring, ping-aware watchdog, Go rewrite rationale, architectural recommendations&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/anthropics/claude-code/issues/39755" rel="noopener noreferrer"&gt;Issue #39755&lt;/a&gt; — watchdog fallback dead code + open source request&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/anthropics/claude-code/issues/33949" rel="noopener noreferrer"&gt;Issue #33949&lt;/a&gt; — streaming hang root cause analysis&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/anthropics/claude-code/issues/33949#issuecomment-4169141807" rel="noopener noreferrer"&gt;Source code findings and updated fix prompts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;&lt;a href="https://github.com/kolkov" rel="noopener noreferrer"&gt;@kolkov&lt;/a&gt; · &lt;a href="https://dev.to/kolkov"&gt;dev.to/kolkov&lt;/a&gt; · March 2026&lt;/em&gt;&lt;br&gt;
&lt;em&gt;With help from Claude Code itself — the only team member who listened.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>anthropic</category>
      <category>reverseengineering</category>
      <category>opensource</category>
    </item>
    <item>
      <title>From 100x Slower Than Rust to Beating It: The coregex Journey</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Wed, 18 Mar 2026 15:24:02 +0000</pubDate>
      <link>https://forem.com/kolkov/from-100x-slower-than-rust-to-beating-it-the-coregex-journey-3n3j</link>
      <guid>https://forem.com/kolkov/from-100x-slower-than-rust-to-beating-it-the-coregex-journey-3n3j</guid>
      <description>&lt;p&gt;A few days ago, &lt;a href="https://www.reddit.com/r/golang/comments/1rr2evh/why_is_gos_regex_so_slow/" rel="noopener noreferrer"&gt;@kostya27 posted on r/golang&lt;/a&gt; (124 upvotes):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Why is Go's regex so slow?"&lt;/strong&gt; Go total time on LangArena: 116.6 seconds. Without two regex tests: 78.5 seconds. Without regex, Go is competitive with Rust/C++. With regex, it's not even close.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;He's right. And this post is about what we did about it.&lt;/p&gt;

&lt;p&gt;Six months ago, I wrote about building &lt;a href="https://dev.to/kolkov/gos-regexp-is-slow-so-i-built-my-own-3000x-faster-3i6h"&gt;coregex&lt;/a&gt; — a regex engine for Go that's 3-3000x faster than stdlib. The benchmarks looked great. Then reality hit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/kostya" rel="noopener noreferrer"&gt;@kostya&lt;/a&gt;, author of &lt;a href="https://kostya.github.io/LangArena/" rel="noopener noreferrer"&gt;LangArena&lt;/a&gt; (2,900+ stars on GitHub), tried coregex on his benchmark suite. His verdict:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I tried coregex, but no luck."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;His numbers told the story:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Go stdlib&lt;/th&gt;
&lt;th&gt;coregex v0.12.8&lt;/th&gt;
&lt;th&gt;Rust regex&lt;/th&gt;
&lt;th&gt;PCRE2 JIT&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LogParser (13 patterns)&lt;/td&gt;
&lt;td&gt;22.7s&lt;/td&gt;
&lt;td&gt;22.0s&lt;/td&gt;
&lt;td&gt;0.2s&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Template::Regex&lt;/td&gt;
&lt;td&gt;6.5s&lt;/td&gt;
&lt;td&gt;7.0s&lt;/td&gt;
&lt;td&gt;3.8s&lt;/td&gt;
&lt;td&gt;1.0s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;We were &lt;strong&gt;100x slower than Rust&lt;/strong&gt; on LogParser. On the same machine. Same input. Same patterns.&lt;/p&gt;

&lt;p&gt;Our "3000x faster than stdlib" claim? True, on many of the patterns we tested. But we had blind spots we didn't know about: case-insensitive patterns, dense-digit data, multi-wildcard suffixes. On a real-world benchmark with 13 diverse patterns covering all these cases, we were barely faster than stdlib on LogParser (22.0s vs 22.7s) and slightly slower on Template::Regex (7.0s vs 6.5s).&lt;/p&gt;

&lt;p&gt;That was March 10, 2026. Here's what happened in the next week.&lt;/p&gt;




&lt;h2&gt;
  
  
  The LangArena Challenge
&lt;/h2&gt;

&lt;p&gt;LangArena's &lt;a href="https://kostya.github.io/LangArena/" rel="noopener noreferrer"&gt;LogParser benchmark&lt;/a&gt; parses Apache log files with 13 regex patterns — the kind of patterns you'd find in any log analysis pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;errors:        ` [5][0-9]{2} | [4][0-9]{2} `
bots:          `(?i)bot|crawler|scanner|spider|indexing|crawl`
suspicious:    `(?i)etc/passwd|wp-admin|\.\./`
ips:           `\d+\.\d+\.\d+\.35`
api_calls:     `/api/[^ "]+`
methods:       `(?i)get|post|put`
emails:        `[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`
...and 6 more
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing exotic. These are bread-and-butter patterns that every Go developer uses. And we were &lt;strong&gt;100x slower than Rust&lt;/strong&gt; on them.&lt;/p&gt;

&lt;p&gt;The question wasn't "can we optimize one pattern?" — it was "can we close a 100x gap across 13 different pattern types?"&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Learning from the Enemy
&lt;/h2&gt;

&lt;p&gt;Before writing a single line of code, I needed to understand &lt;strong&gt;what Rust does differently&lt;/strong&gt;. Not from reading docs — from running Rust with debug logging.&lt;/p&gt;

&lt;p&gt;Rust's regex crate has &lt;code&gt;RUST_LOG=debug&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ RUST_LOG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;debug ./rust-benchmark input.txt
&lt;span class="o"&gt;[&lt;/span&gt;regex] prefixes extracted: Seq[&lt;span class="s2"&gt;"EVA"&lt;/span&gt;, &lt;span class="s2"&gt;"EVa"&lt;/span&gt;, &lt;span class="s2"&gt;"EvA"&lt;/span&gt;, &lt;span class="s2"&gt;"Eva"&lt;/span&gt;, &lt;span class="s2"&gt;"eVA"&lt;/span&gt;, ...]
&lt;span class="o"&gt;[&lt;/span&gt;regex] prefilter built: teddy
&lt;span class="o"&gt;[&lt;/span&gt;regex] using reverse suffix strategy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every strategy decision, every prefilter choice, every literal extraction — logged. I could see exactly what Rust did for each pattern.&lt;/p&gt;

&lt;p&gt;We had nothing like this. So I built &lt;code&gt;COREGEX_DEBUG&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ COREGEX_DEBUG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 ./my-app
&lt;span class="o"&gt;[&lt;/span&gt;coregex] &lt;span class="nv"&gt;pattern&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"(?i:GET|POST|PUT)"&lt;/span&gt; &lt;span class="nv"&gt;strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;UseTeddy &lt;span class="nv"&gt;nfa_states&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;43 &lt;span class="nv"&gt;literals&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;40 &lt;span class="nb"&gt;complete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;coregex] &lt;span class="nv"&gt;prefilter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;FatTeddy &lt;span class="o"&gt;(&lt;/span&gt;AVX2 fat&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;complete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now I could compare strategy selection side-by-side. And the differences were immediately obvious.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: The Root Causes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Bug #1: Refusing to extract case-insensitive literals
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pattern&lt;/strong&gt;: &lt;code&gt;(?iU)\b(eval|system|exec|execute|passthru|shell_exec|phpinfo)\b&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;A real user (&lt;a href="https://github.com/coregx/coregex/issues/137" rel="noopener noreferrer"&gt;#137&lt;/a&gt;) reported this WAF pattern was &lt;strong&gt;88,000x slower than stdlib&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Rust extracts 250 case-fold literal variants:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eval → EVAL, EVAl, EVaL, EVal, EvAL, ... eval  (16 variants)
system → SYSTEM, SYSTEm, SYSTem, ...            (32 variants)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then trims to 60 three-byte prefixes → Teddy SIMD prefilter → scans 968 bytes in 263 nanoseconds. Done.&lt;/p&gt;

&lt;p&gt;Our literal extractor? One line killed everything:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// literal/extractor.go:137&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Flags&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;syntax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FoldCase&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;NewSeq&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c"&gt;// Return EMPTY. For ALL case-insensitive patterns.&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This guard was added for a previous bug (#87) — naive extraction of single-case variants caused prefilter false negatives. The fix was correct for that bug, but the blanket rejection meant &lt;strong&gt;zero prefilter&lt;/strong&gt; for any &lt;code&gt;(?i)&lt;/code&gt; pattern. Without prefilter, the engine fell back to lazy DFA, which cache-thrashed on the 181-state NFA.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: Expand &lt;code&gt;(?i)&lt;/code&gt; literals into ALL case-fold variants (like Rust), then trim to 3-byte prefixes. One function, ~50 lines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: 88,000x slower → &lt;strong&gt;24x faster&lt;/strong&gt; than stdlib.&lt;/p&gt;
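&lt;p&gt;A minimal sketch of that expand-then-trim step, assuming ASCII-only folding (Unicode folding can add further variants and is left out here). The names &lt;code&gt;caseVariants&lt;/code&gt; and &lt;code&gt;trimPrefixes&lt;/code&gt; are hypothetical, not coregex's actual functions:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"unicode"
)

// caseVariants returns every upper/lower combination of an ASCII literal.
// Characters without a case distinction (digits, '_') are copied unchanged.
func caseVariants(lit string) []string {
	out := []string{""}
	for _, r := range lit {
		lo, up := unicode.ToLower(r), unicode.ToUpper(r)
		next := make([]string, 0, 2*len(out))
		for _, prefix := range out {
			next = append(next, prefix+string(lo))
			if up != lo {
				next = append(next, prefix+string(up))
			}
		}
		out = next
	}
	return out
}

// trimPrefixes keeps the first n bytes of each literal and drops duplicates.
func trimPrefixes(lits []string, n int) []string {
	seen := make(map[string]bool)
	var out []string
	for _, l := range lits {
		if len(l) > n {
			l = l[:n]
		}
		if !seen[l] {
			seen[l] = true
			out = append(out, l)
		}
	}
	return out
}

func main() {
	var all []string
	for _, word := range []string{"eval", "exec"} {
		all = append(all, caseVariants(word)...)
	}
	// 16 variants per word, then 8 distinct three-byte prefixes per word.
	fmt.Println(len(all), len(trimPrefixes(all, 3))) // 32 16
}
```

&lt;p&gt;Trimming makes the prefilter inexact (a prefix hit is only a candidate), so the regex engine still has to verify each hit.&lt;/p&gt;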

&lt;h3&gt;
  
  
  Bug #2: isMatchDigitPrefilter was O(n²)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pattern&lt;/strong&gt;: &lt;code&gt;\d{3}-\d{3}-\d{4}&lt;/code&gt; (phone numbers)&lt;/p&gt;

&lt;p&gt;On 6MB of log data: &lt;strong&gt;7 minutes&lt;/strong&gt; per &lt;code&gt;Match()&lt;/code&gt; call. Stdlib: 262ms.&lt;/p&gt;

&lt;p&gt;Root cause: &lt;code&gt;isMatchDigitPrefilter&lt;/code&gt; used &lt;code&gt;dfa.FindAt()&lt;/code&gt; (unanchored search) which scans from each digit position to end of input:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Before (O(n²)):&lt;/span&gt;
&lt;span class="n"&gt;endPos&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dfa&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FindAt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;haystack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;digitPos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// Scans to EOF!&lt;/span&gt;

&lt;span class="c"&gt;// After (O(pattern_len)):&lt;/span&gt;
&lt;span class="n"&gt;endPos&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dfa&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SearchAtAnchored&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;haystack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;digitPos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// Checks only at position&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One function call change. 7 minutes → 2.1ms. &lt;strong&gt;200,000x faster&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The same pattern was already fixed in &lt;code&gt;findIndicesDigitPrefilter&lt;/code&gt; months ago — but &lt;code&gt;isMatchDigitPrefilter&lt;/code&gt; was never updated. Copy-paste divergence.&lt;/p&gt;
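&lt;p&gt;To make the difference concrete, here is a small sketch of an anchored check driven by digit positions. It uses a hand-written matcher as a stand-in for the DFA; &lt;code&gt;matchAt&lt;/code&gt; and &lt;code&gt;shape&lt;/code&gt; are illustrative names, not coregex internals:&lt;/p&gt;

```go
package main

import "fmt"

// shape describes \d{3}-\d{3}-\d{4}: 'd' stands for any ASCII digit.
const shape = "ddd-ddd-dddd"

func isDigit(c byte) bool {
	if c >= '0' {
		return '9' >= c
	}
	return false
}

// matchAt is an anchored check: it inspects at most len(shape) bytes
// starting at pos and never scans further into the haystack.
func matchAt(hay []byte, pos int) bool {
	if pos+len(shape) > len(hay) {
		return false
	}
	for i, want := range []byte(shape) {
		got := hay[pos+i]
		if want == 'd' {
			if !isDigit(got) {
				return false
			}
		} else if got != want {
			return false
		}
	}
	return true
}

// isMatch visits each digit position once and runs the anchored check there.
func isMatch(hay []byte) bool {
	for pos, c := range hay {
		if isDigit(c) {
			if matchAt(hay, pos) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(isMatch([]byte("call 555-123-4567 now"))) // true
	fmt.Println(isMatch([]byte("id 555-12-34567")))       // false
}
```

&lt;p&gt;Each candidate costs a bounded number of byte reads, so the whole scan stays linear in the input even when the log is full of digits that never form a phone number.&lt;/p&gt;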

&lt;h3&gt;
  
  
  Bug #3: ReverseSuffix rejected multi-wildcard patterns
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pattern&lt;/strong&gt;: &lt;code&gt;\d+\.\d+\.\d+\.35&lt;/code&gt; (IP address suffix)&lt;/p&gt;

&lt;p&gt;This pattern has a clear suffix: &lt;code&gt;.35&lt;/code&gt;. Rust finds it instantly with memmem, then reverse-scans for the start. Our &lt;code&gt;isSafeForReverseSuffix&lt;/code&gt; rejected it because it had 3 wildcard subexpressions (&lt;code&gt;\d+&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;wildcardCount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;  &lt;span class="c"&gt;// "multiple wildcards break reverse NFA"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The guard existed because our reverse NFA builder had a bug with mixed byte+epsilon states. That bug was fixed in v0.12.9. But the guard stayed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: Remove the guard. Also fix &lt;code&gt;Find()&lt;/code&gt; leftmost semantics — &lt;code&gt;bytes.LastIndex&lt;/code&gt; → &lt;code&gt;bytes.Index&lt;/code&gt; for non-&lt;code&gt;.*&lt;/code&gt; patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: 57ms → 0.63ms (about 90x faster than before, &lt;strong&gt;603x faster&lt;/strong&gt; than stdlib, and 1.6x faster than Rust!)&lt;/p&gt;
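&lt;p&gt;The suffix-first idea can be sketched with the standard library alone. This is a simplified illustration hard-coded to &lt;code&gt;\d+\.\d+\.\d+\.35&lt;/code&gt;, with a hand-written backward walk standing in for the reverse DFA; &lt;code&gt;find&lt;/code&gt; and &lt;code&gt;reverseStart&lt;/code&gt; are hypothetical names, and it does not claim to reproduce every corner of leftmost-first semantics:&lt;/p&gt;

```go
package main

import (
	"bytes"
	"fmt"
)

var suffix = []byte(".35")

func isDigit(c byte) bool {
	if c >= '0' {
		return '9' >= c
	}
	return false
}

// reverseStart walks backward from end over \d+\.\d+\.\d+ and returns the
// start offset, or -1 if the bytes before end do not have that shape.
func reverseStart(hay []byte, end int) int {
	pos := end
	for group := 0; group != 3; group++ {
		digits := 0
		for pos > 0 {
			if !isDigit(hay[pos-1]) {
				break
			}
			pos--
			digits++
		}
		if digits == 0 {
			return -1
		}
		if group != 2 { // a dot must separate this group from the previous one
			if pos == 0 || hay[pos-1] != '.' {
				return -1
			}
			pos--
		}
	}
	return pos
}

// find locates the first suffix occurrence that has a valid prefix before it.
func find(hay []byte) (start, end int) {
	from := 0
	for {
		i := bytes.Index(hay[from:], suffix)
		if i == -1 {
			return -1, -1
		}
		at := from + i
		if s := reverseStart(hay, at); s >= 0 {
			return s, at + len(suffix)
		}
		from = at + 1
	}
}

func main() {
	hay := []byte("host 10.0.12.35 up")
	start, end := find(hay)
	fmt.Println(start, end, string(hay[start:end])) // 5 15 10.0.12.35
}
```

&lt;p&gt;Scanning forward with &lt;code&gt;bytes.Index&lt;/code&gt; rather than &lt;code&gt;bytes.LastIndex&lt;/code&gt; is what keeps the result at the first usable suffix, which is the point of the leftmost fix described above.&lt;/p&gt;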

&lt;h3&gt;
  
  
  Bug #4: FatTeddy AVX2 missed matches
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pattern&lt;/strong&gt;: &lt;code&gt;(?i)get|post|put&lt;/code&gt; (40 case-fold expanded literals)&lt;/p&gt;

&lt;p&gt;FatTeddy (33-64 pattern SIMD search) found only 11,456 matches. Correct answer: 34,368.&lt;/p&gt;

&lt;p&gt;Root cause: One assembly instruction.&lt;/p&gt;

&lt;p&gt;FatTeddy uses 256-bit AVX2 registers with two 128-bit lanes. Low lane handles buckets 0-7, high lane handles buckets 8-15. The code used &lt;code&gt;ANDL&lt;/code&gt; to combine lane results — requiring a match in &lt;strong&gt;both&lt;/strong&gt; lanes. But GET variants (8 patterns) were all in buckets 0-7 (low lane only), PUT variants in buckets 8-15 (high lane only). &lt;code&gt;ANDL&lt;/code&gt; zeroed them out.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;; Before (incorrect):
ANDL CX, AX          ; Requires BOTH lanes to match

; After (correct):
ORL  CX, AX          ; Either lane is sufficient
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One instruction. 22,912 missing matches fixed.&lt;/p&gt;
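&lt;p&gt;A scalar model of the lane-combining mistake, written as plain Go rather than assembly. It only mirrors the description above (one hit value per 128-bit lane at each haystack position); the type and function names are made up for illustration:&lt;/p&gt;

```go
package main

import "fmt"

// laneHits models one haystack position: a nonzero value means some bucket
// in that lane reported a candidate.
type laneHits struct {
	low  uint8 // buckets 0-7
	high uint8 // buckets 8-15
}

// bothLanes models the buggy combine: a candidate needs hits in both lanes.
func bothLanes(h laneHits) bool {
	if h.low == 0 {
		return false
	}
	return h.high != 0
}

// eitherLane models the fixed combine: a hit in one lane is enough.
func eitherLane(h laneHits) bool {
	return h.low != 0 || h.high != 0
}

func main() {
	positions := []laneHits{
		{low: 1, high: 0}, // e.g. a GET variant, low lane only
		{low: 0, high: 1}, // e.g. a PUT variant, high lane only
		{low: 1, high: 1}, // hits in both lanes
		{low: 0, high: 0}, // no candidate
	}
	buggy, fixed := 0, 0
	for _, p := range positions {
		if bothLanes(p) {
			buggy++
		}
		if eitherLane(p) {
			fixed++
		}
	}
	fmt.Println(buggy, fixed) // 1 3
}
```

&lt;p&gt;The buggy combine raises no error; it simply reports fewer candidates, which is why only a match-count comparison exposed it.&lt;/p&gt;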




&lt;h2&gt;
  
  
  Step 3: Building What Rust Has
&lt;/h2&gt;

&lt;p&gt;Beyond bug fixes, we needed architectural improvements to match Rust's approach:&lt;/p&gt;

&lt;h3&gt;
  
  
  Bidirectional DFA
&lt;/h3&gt;

&lt;p&gt;Previously, &lt;code&gt;UseDFA&lt;/code&gt; patterns did: forward DFA → match end, then PikeVM → exact boundaries. PikeVM is O(n×states) — a second full scan.&lt;/p&gt;

&lt;p&gt;Now: forward DFA → end, reverse DFA → start, anchored DFA → exact end. The O(n×states) PikeVM pass is replaced by two additional O(n) DFA passes, for three O(n) passes in total.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cascading Prefix Trim
&lt;/h3&gt;

&lt;p&gt;When case-fold expansion produces too many literals (&amp;gt;64), we trim them using Rust's approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;128 six-byte literals → try keep 4 bytes → 18 unique → fits Teddy!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is directly from Rust's &lt;code&gt;optimize_for_prefix_by_preference()&lt;/code&gt; with their ATTEMPTS table: &lt;code&gt;[(4,64), (3,64), (2,64)]&lt;/code&gt;.&lt;/p&gt;
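&lt;p&gt;The cascade can be sketched as a loop over that table. This is an illustration based only on the table as quoted here, with synthetic input; &lt;code&gt;cascade&lt;/code&gt; and &lt;code&gt;trimUnique&lt;/code&gt; are hypothetical names, and the real Rust and coregex routines may apply further conditions:&lt;/p&gt;

```go
package main

import "fmt"

// attempt mirrors one row of the quoted table: keep this many leading
// bytes, and accept the result if at most limit literals remain.
type attempt struct{ keep, limit int }

var attempts = []attempt{{4, 64}, {3, 64}, {2, 64}}

// trimUnique keeps the first keep bytes of each literal and drops duplicates.
func trimUnique(lits []string, keep int) []string {
	seen := make(map[string]bool)
	var out []string
	for _, l := range lits {
		if len(l) > keep {
			l = l[:keep]
		}
		if !seen[l] {
			seen[l] = true
			out = append(out, l)
		}
	}
	return out
}

// cascade tries each attempt in order and returns the first set that fits.
func cascade(lits []string) (prefixes []string, keep int, ok bool) {
	for _, a := range attempts {
		trimmed := trimUnique(lits, a.keep)
		if a.limit >= len(trimmed) {
			return trimmed, a.keep, true
		}
	}
	return nil, 0, false
}

func main() {
	// Synthetic input: 81 six-byte literals whose first four bytes range
	// over the letters a, b, c.
	var lits []string
	for _, a := range "abc" {
		for _, b := range "abc" {
			for _, c := range "abc" {
				for _, d := range "abc" {
					lits = append(lits, string([]rune{a, b, c, d})+"zz")
				}
			}
		}
	}
	prefixes, keep, ok := cascade(lits)
	// Keeping 4 bytes leaves 81 prefixes (too many); keeping 3 leaves 27.
	fmt.Println(len(lits), keep, len(prefixes), ok) // 81 3 27 true
}
```

&lt;p&gt;Longer prefixes are tried first because they produce fewer false candidates; the cascade only gives up prefix length when the literal count would not fit.&lt;/p&gt;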

&lt;h3&gt;
  
  
  Aho-Corasick DFA Backend
&lt;/h3&gt;

&lt;p&gt;Our &lt;a href="https://github.com/coregx/ahocorasick" rel="noopener noreferrer"&gt;Aho-Corasick library&lt;/a&gt; got a complete DFA backend rewrite:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Flat transition table with premultiplied state IDs&lt;/li&gt;
&lt;li&gt;Match flag in high bit (single AND instruction for detection)&lt;/li&gt;
&lt;li&gt;SIMD skip-ahead prefilter via &lt;code&gt;bytes.IndexByte&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: 300 MB/s → 3,400 MB/s (Find), 260 MB/s → 5,900 MB/s (IsMatch). &lt;strong&gt;11-22x throughput improvement&lt;/strong&gt;.&lt;/p&gt;
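&lt;p&gt;For readers unfamiliar with the first two bullet points, here is a toy version: a hand-built table for the single pattern &lt;code&gt;ab&lt;/code&gt;. It is a sketch of the general technique, not the library's code, and the layout details (256 columns, flag value) are assumptions. The match test is written as a comparison against the high bit, which for the top bit is equivalent to the AND test mentioned above:&lt;/p&gt;

```go
package main

import "fmt"

const (
	alphabet  = 256        // one column per input byte
	matchFlag = 0x80000000 // high bit of a uint32 state ID marks a match
)

// buildAB hand-builds the transition table for the pattern "ab".
// State IDs are stored premultiplied (row index * 256), so the search loop
// finds the next entry with one addition and no multiplication.
func buildAB() []uint32 {
	const start, seenA, matched = 0 * alphabet, 1 * alphabet, 2 * alphabet
	table := make([]uint32, 3*alphabet) // every entry defaults to start
	for _, row := range []uint32{start, seenA, matched} {
		table[row+'a'] = seenA // an 'a' always begins a new attempt
	}
	table[seenA+'b'] = matched + matchFlag
	return table
}

// isMatch performs one table lookup per input byte and stops at the first
// flagged state, so a flagged ID is never used as an index.
func isMatch(table []uint32, hay []byte) bool {
	var sid uint32
	for _, b := range hay {
		sid = table[sid+uint32(b)]
		if sid >= matchFlag {
			return true
		}
	}
	return false
}

func main() {
	table := buildAB()
	fmt.Println(isMatch(table, []byte("xxaab"))) // true
	fmt.Println(isMatch(table, []byte("ba")))    // false
}
```

&lt;p&gt;Packing the match flag into the state ID means the hot loop needs no separate lookup to learn whether a state is accepting.&lt;/p&gt;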




&lt;h2&gt;
  
  
  The Results
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Benchmark: 8 Real-World Patterns on 6.3 MB Input
&lt;/h3&gt;

&lt;p&gt;100 iterations each, best of 5, same machine (i7-1255U):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Go stdlib&lt;/th&gt;
&lt;th&gt;coregex v0.12.13&lt;/th&gt;
&lt;th&gt;Rust regex&lt;/th&gt;
&lt;th&gt;vs stdlib&lt;/th&gt;
&lt;th&gt;vs Rust&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;.*@example\.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;420 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.3 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;7.2 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;126x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.2x faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;.*\.(txt|log|md)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;426 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.0 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.8 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;425x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.8x faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;email validation&lt;/td&gt;
&lt;td&gt;447 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.4 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3.8 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;132x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.1x faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\d+\.\d+\.\d+\.35&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;381 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.63 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.98 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;603x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.6x faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;(?i)get|post|put&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;561 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;16.6 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;7.0 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;34x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.4x slower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;(?i)bot|crawler|...&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;883 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;38.4 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;6.7 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;23x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5.7x slower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;password=[^&amp;amp;\s"]+&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;24 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8.9 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.9 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3.1x slower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;session[_-]?id=...&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;8 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.7 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.2 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.3x slower&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;4 out of 8 patterns are faster than Rust.&lt;/strong&gt; All 8 are faster than Go stdlib.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a class="mentioned-user" href="https://dev.to/kostya"&gt;@kostya&lt;/a&gt;'s Update
&lt;/h3&gt;

&lt;p&gt;Remember "no luck"? Here's the progression on his M1 MacBook:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;LogParser&lt;/th&gt;
&lt;th&gt;Gap to Rust&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;v0.12.8 (start)&lt;/td&gt;
&lt;td&gt;22.0s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;100x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v0.12.9&lt;/td&gt;
&lt;td&gt;5.3s&lt;/td&gt;
&lt;td&gt;26x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v0.12.10&lt;/td&gt;
&lt;td&gt;2.67s&lt;/td&gt;
&lt;td&gt;13x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v0.12.13 (current)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.12s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;10x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;From 100x slower to 10x. Not parity yet — but a different conversation than "no luck."&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Not Just Use CGO?
&lt;/h2&gt;

&lt;p&gt;Most other Go regex alternatives either depend on CGO or Wasm, or use backtracking and give up the linear-time guarantee:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;go-re2&lt;/strong&gt;: C++ RE2 via Wasm (wazero)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;regexp2&lt;/strong&gt;: Backtracking (.NET-style) — no O(n) guarantee&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;rubex&lt;/strong&gt;: Oniguruma via CGO&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;go-pcre&lt;/strong&gt;: PCRE via CGO&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;coregex is &lt;strong&gt;pure Go + Go assembly&lt;/strong&gt;. No CGO, no Wasm, no external dependencies.&lt;/p&gt;

&lt;p&gt;Why does this matter?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cross-compilation&lt;/strong&gt;: &lt;code&gt;GOOS=linux GOARCH=arm64 go build&lt;/code&gt; just works&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static binaries&lt;/strong&gt;: No shared libraries to ship&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Go toolchain&lt;/strong&gt;: &lt;code&gt;go vet&lt;/code&gt;, &lt;code&gt;go test -race&lt;/code&gt;, &lt;code&gt;pprof&lt;/code&gt; all work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging&lt;/strong&gt;: Standard Go debugging, no FFI boundary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security&lt;/strong&gt;: No C memory safety issues in regex hot paths&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is still a performance gap to CGO-based JIT engines such as PCRE2 JIT: a JIT compiles the regex to native machine code and finishes Template::Regex in 1.0s where we take 7.1s. But that's an &lt;a href="https://github.com/coregx/coregex/issues/124" rel="noopener noreferrer"&gt;architectural tier boundary&lt;/a&gt;: we're competing within the automata-based class (like RE2 and Rust regex), not against JIT engines.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Debug logging is not optional
&lt;/h3&gt;

&lt;p&gt;Building &lt;code&gt;COREGEX_DEBUG&lt;/code&gt; was the single most impactful decision. Without it, every optimization was guesswork. With it, we could see exactly why a pattern was slow and verify our fix matched Rust's approach.&lt;/p&gt;

&lt;p&gt;If you're building any kind of engine — regex, query planner, compiler — &lt;strong&gt;add strategy logging from day one&lt;/strong&gt;.&lt;/p&gt;
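&lt;p&gt;One low-cost way to do this is to gate a writer on an environment variable. The sketch below is a generic pattern, not how coregex implements &lt;code&gt;COREGEX_DEBUG&lt;/code&gt;; the variable name &lt;code&gt;MYENGINE_DEBUG&lt;/code&gt;, the strategy names, and &lt;code&gt;chooseStrategy&lt;/code&gt; are all placeholders:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// debugOut receives strategy decisions. It discards everything unless the
// (hypothetical) MYENGINE_DEBUG variable is set to 1.
var debugOut io.Writer = io.Discard

func init() {
	if os.Getenv("MYENGINE_DEBUG") == "1" {
		debugOut = os.Stderr
	}
}

// chooseStrategy picks a strategy and logs the inputs that led to it.
func chooseStrategy(pattern string, literals int) string {
	strategy := "NFA"
	if literals > 0 {
		strategy = "Prefilter"
	}
	fmt.Fprintf(debugOut, "[myengine] pattern=%q strategy=%s literals=%d\n",
		pattern, strategy, literals)
	return strategy
}

func main() {
	fmt.Println(chooseStrategy("get|post|put", 3)) // Prefilter
	fmt.Println(chooseStrategy("a.*b", 0))         // NFA
}
```

&lt;p&gt;Logging the decision together with the numbers that drove it (literal count, state count) is what makes two engines comparable line by line.&lt;/p&gt;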

&lt;h3&gt;
  
  
  2. One instruction can hide 23,000 bugs
&lt;/h3&gt;

&lt;p&gt;The FatTeddy &lt;code&gt;ANDL → ORL&lt;/code&gt; fix taught us that SIMD code correctness is binary. Not "mostly correct" or "works for some patterns." If your lane combining logic is wrong, you silently drop matches. No error, no panic — just wrong results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Always verify match counts against stdlib.&lt;/strong&gt; On every pattern. On every change.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Benchmarks lie — until they don't
&lt;/h3&gt;

&lt;p&gt;Our "3000x faster" headline was true for &lt;code&gt;.*error.*&lt;/code&gt; patterns. But &lt;a class="mentioned-user" href="https://dev.to/kostya"&gt;@kostya&lt;/a&gt;'s LangArena showed the full picture: on diverse real-world patterns, we were barely faster than stdlib.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real benchmarks use real patterns from real users.&lt;/strong&gt; We now run &lt;a href="https://github.com/kolkov/regex-bench" rel="noopener noreferrer"&gt;regex-bench&lt;/a&gt; CI on every PR — 16 core patterns + 13 LangArena patterns, compared against both stdlib and Rust regex, on Linux AMD EPYC and macOS Apple Silicon.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Guard clauses outlive their bugs
&lt;/h3&gt;

&lt;p&gt;Three of our four major bugs were caused by &lt;strong&gt;code that stayed after its reason was gone&lt;/strong&gt;. The &lt;code&gt;FoldCase&lt;/code&gt; rejection and the &lt;code&gt;wildcardCount &amp;gt;= 2&lt;/code&gt; check were guards for earlier bugs, and the unanchored &lt;code&gt;FindAt&lt;/code&gt; call was left behind when its sibling function was fixed. All were correct when added. All became performance killers months later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Track why a guard exists. Remove it when the reason is gone.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Go ASM is production-viable for SIMD
&lt;/h3&gt;

&lt;p&gt;We wrote ~500 lines of AVX2/SSSE3 assembly for Teddy multi-pattern search. It works. FatTeddy throughput: &lt;strong&gt;12 GB/s&lt;/strong&gt; on single-call scans (2x faster than SlimTeddy SSSE3!).&lt;/p&gt;

&lt;p&gt;The challenge isn't writing the ASM — it's the Go→ASM function call boundary. Each call costs ~60ns + mask reload. For high-match-count patterns, this adds up. Our batch API (64KB chunks) reduces round-trips, but the integrated prefilter+DFA loop that Rust uses remains the gold standard.&lt;/p&gt;




&lt;h2&gt;
  
  
  Current State: v0.12.13
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;97,000 lines of code.&lt;/strong&gt; 17 strategies. 1,470 tests. 5 releases in one week.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/coregx/coregex@v0.12.13
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;coregex is a drop-in replacement for Go's &lt;code&gt;regexp&lt;/code&gt; package, with the same API, the same types (&lt;code&gt;Regexp&lt;/code&gt; is aliased), and the same method signatures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/coregx/coregex"&lt;/span&gt;  &lt;span class="c"&gt;// instead of "regexp"&lt;/span&gt;

&lt;span class="n"&gt;re&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;coregex&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`(?i)get|post|put`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;matches&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FindAllString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// Same API, faster execution&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In most cases, changing the import path is all you need.&lt;/p&gt;

&lt;p&gt;Debug your patterns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;COREGEX_DEBUG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 ./your-app
&lt;span class="c"&gt;# [coregex] pattern="(?i:GET|P(?:OST|UT))" strategy=UseTeddy nfa_states=43 literals=40 complete=true&lt;/span&gt;
&lt;span class="c"&gt;# [coregex] prefilter=FatTeddy (AVX2 fat) complete=true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What's Still Slower Than Rust
&lt;/h2&gt;

&lt;p&gt;Honesty matters. Here's where we're still behind:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Gap&lt;/th&gt;
&lt;th&gt;Root cause&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;(?i)&lt;/code&gt; patterns: 2-6x&lt;/td&gt;
&lt;td&gt;FatTeddy ORL creates more false positives than Rust's interleave verification&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://github.com/coregx/coregex/issues/120" rel="noopener noreferrer"&gt;Researched&lt;/a&gt;, needs ASM rewrite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DFA verification: 3-7x&lt;/td&gt;
&lt;td&gt;Go→ASM round trip overhead, no integrated prefilter+DFA loop&lt;/td&gt;
&lt;td&gt;Architectural&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Template::Regex: 1.8x&lt;/td&gt;
&lt;td&gt;Two-phase DFA+PikeVM vs Rust's single-phase lazy DFA&lt;/td&gt;
&lt;td&gt;Planned&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ARM: 5-15x vs Rust&lt;/td&gt;
&lt;td&gt;No SIMD prefilters on ARM (Teddy/memchr are x86 only)&lt;/td&gt;
&lt;td&gt;Waiting for Go NEON support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;We're not hiding these gaps. They're tracked, researched, and planned. The goal is Rust parity on all pattern types — we're not there yet on &lt;code&gt;(?i)&lt;/code&gt; and DFA-heavy patterns.&lt;/p&gt;




&lt;h2&gt;
  
  
  Community Testing Matters — A Lot
&lt;/h2&gt;

&lt;p&gt;A multi-engine regex library is &lt;strong&gt;inherently complex&lt;/strong&gt;. 17 strategies, SIMD assembly, lazy DFA, reverse search, prefilter cascading — every combination of pattern shape × input data × strategy is a potential edge case. No amount of internal testing can cover what real users discover in minutes.&lt;/p&gt;

&lt;p&gt;Every major fix in this article came from &lt;strong&gt;community feedback&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a class="mentioned-user" href="https://dev.to/kostya"&gt;@kostya&lt;/a&gt;'s LangArena exposed the 100x gap we didn't know about&lt;/li&gt;
&lt;li&gt;tjbrains' WAF pattern (&lt;a href="https://github.com/coregx/coregex/issues/137" rel="noopener noreferrer"&gt;#137&lt;/a&gt;) revealed the 88,000x regression in case-insensitive matching&lt;/li&gt;
&lt;li&gt;GoAWK integration uncovered 15+ Unicode edge cases months earlier&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern is consistent: someone runs coregex on their specific workload, finds a pattern type we haven't optimized yet, reports it — and we fix it in hours, not weeks. The FatTeddy lane bug? Fixed same day. The DigitPrefilter O(n²)? Fixed in one line. Case-insensitive literal extraction? Researched Rust's approach, implemented, released — all within 24 hours.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;There are likely more patterns that aren't optimized yet.&lt;/strong&gt; That's the nature of a 17-strategy engine — some strategy paths get less testing than others. But the architecture is sound, the fix turnaround is fast, and every report makes the library better for everyone.&lt;/p&gt;

&lt;p&gt;We &lt;a href="https://github.com/golang/go/issues/76818" rel="noopener noreferrer"&gt;proposed coregex for Go's standard library&lt;/a&gt;. It wasn't accepted — and honestly, that's okay. As an independent library, we can iterate faster, ship SIMD assembly that the Go team wouldn't merge, and make decisions optimized for performance rather than compatibility. The Go ecosystem is better with options.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't hesitate to contribute.&lt;/strong&gt; File issues with your patterns and inputs. Even a simple "this pattern is slower than stdlib" report helps — it tells us which strategy path needs work. The more diverse patterns we see, the fewer blind spots remain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pull requests are especially welcome.&lt;/strong&gt; We know that a healthy open source project is built by its community, and we value every contributor. Don't worry if your PR isn't perfect — we'll review the code, help you fix any issues, guide you through our conventions, and explain what's needed to get it merged. Whether it's a new test case, a documentation fix, a strategy optimization, or a bug report with a reproducer — every contribution counts and every contributor gets credited.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;If regex is a bottleneck in your Go application:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Profile first&lt;/strong&gt; — make sure regex is actually the problem&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmark your specific patterns&lt;/strong&gt; — performance varies by pattern type&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check match counts&lt;/strong&gt;: the results of &lt;code&gt;FindAll&lt;/code&gt; from coregex must match those from &lt;code&gt;regexp&lt;/code&gt; exactly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Report issues&lt;/strong&gt; — we fixed #137 (88,000x regression) within 24 hours
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Quick benchmark&lt;/span&gt;
go get github.com/coregx/coregex@v0.12.13
&lt;span class="nv"&gt;COREGEX_DEBUG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 go &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="nt"&gt;-bench&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-benchmem&lt;/span&gt; your-package
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/coregx/ahocorasick" rel="noopener noreferrer"&gt;Aho-Corasick library&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/kolkov/regex-bench" rel="noopener noreferrer"&gt;Cross-language benchmarks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kostya.github.io/LangArena/" rel="noopener noreferrer"&gt;LangArena&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;The most humbling moment? Seeing &lt;code&gt;ANDL CX, AX&lt;/code&gt; in our FatTeddy ASM and realizing one wrong instruction had been silently dropping 23,000 matches. The most satisfying? Seeing &lt;code&gt;coregex 1.6x faster than Rust&lt;/code&gt; on the IP pattern that started this whole journey.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Built by &lt;a href="https://github.com/kolkov" rel="noopener noreferrer"&gt;@kolkov&lt;/a&gt; as part of &lt;a href="https://github.com/coregx" rel="noopener noreferrer"&gt;CoreGX&lt;/a&gt; — production Go libraries.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>performance</category>
      <category>regex</category>
      <category>rust</category>
    </item>
    <item>
      <title>Aho-Corasick in Go: Multi-Pattern String Matching at 6 GB/s with Zero Allocations</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Wed, 18 Mar 2026 12:38:29 +0000</pubDate>
      <link>https://forem.com/kolkov/aho-corasick-in-go-multi-pattern-string-matching-at-6-gbs-with-zero-allocations-2jog</link>
      <guid>https://forem.com/kolkov/aho-corasick-in-go-multi-pattern-string-matching-at-6-gbs-with-zero-allocations-2jog</guid>
      <description>&lt;p&gt;When your application needs to find thousands of keywords in a stream of text — think log analysis, content moderation, network intrusion detection, or DNA sequencing — you need an algorithm that doesn't slow down as the number of patterns grows. That algorithm is Aho-Corasick.&lt;/p&gt;

&lt;p&gt;We built &lt;a href="https://github.com/coregx/ahocorasick" rel="noopener noreferrer"&gt;coregx/ahocorasick&lt;/a&gt;, a pure Go implementation that achieves &lt;strong&gt;over 6 GB/s&lt;/strong&gt; on a single core with zero heap allocations on the hot path. No cgo, no assembly, no unsafe — just carefully structured Go code and a deep understanding of how modern CPUs access memory.&lt;/p&gt;

&lt;p&gt;This article explains what we built, why, and the techniques that made it fast.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: O(n × k) Doesn't Scale
&lt;/h2&gt;

&lt;p&gt;Suppose you have 100 keywords to search for in a 1 MB document. The naive approach — call &lt;code&gt;strings.Contains&lt;/code&gt; for each keyword — performs 100 scans of the document. That's &lt;code&gt;O(n × k)&lt;/code&gt; where &lt;code&gt;n&lt;/code&gt; is the document size and &lt;code&gt;k&lt;/code&gt; is the number of patterns.&lt;/p&gt;

&lt;p&gt;Aho-Corasick solves this in &lt;code&gt;O(n + z)&lt;/code&gt;, where &lt;code&gt;z&lt;/code&gt; is the number of matches reported: one pass through the document, regardless of how many patterns you have. Whether you're searching for 10 patterns or 10,000, the scan is still a single pass over the input.&lt;/p&gt;

&lt;p&gt;The algorithm builds a finite automaton from all patterns at once: a trie with failure links that allow the search to continue without backtracking. Every byte of input advances the automaton by exactly one state.&lt;/p&gt;
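&lt;p&gt;A compact textbook version of that construction, for orientation. This is the classic trie-plus-failure-links form with map-based nodes, written for readability; it is not the library's implementation, and &lt;code&gt;build&lt;/code&gt; and &lt;code&gt;isMatch&lt;/code&gt; are illustrative names:&lt;/p&gt;

```go
package main

import "fmt"

type node struct {
	next map[byte]int // trie edges
	fail int          // longest proper suffix that is also a trie path
	out  bool         // some pattern ends here, directly or via fail links
}

// build inserts all patterns into a trie, then sets failure links
// breadth-first so that shallower nodes are finished before deeper ones.
func build(patterns []string) []node {
	nodes := []node{{next: map[byte]int{}}}
	for _, p := range patterns {
		cur := 0
		for _, c := range []byte(p) {
			nx, ok := nodes[cur].next[c]
			if !ok {
				nodes = append(nodes, node{next: map[byte]int{}})
				nx = len(nodes) - 1
				nodes[cur].next[c] = nx
			}
			cur = nx
		}
		nodes[cur].out = true
	}
	var queue []int
	for _, v := range nodes[0].next {
		queue = append(queue, v) // depth-1 nodes fail to the root (0)
	}
	for len(queue) > 0 {
		u := queue[0]
		queue = queue[1:]
		for c, v := range nodes[u].next {
			f := nodes[u].fail
			for {
				if w, ok := nodes[f].next[c]; ok {
					nodes[v].fail = w
					break
				}
				if f == 0 {
					break // no suffix matches: fail stays at the root
				}
				f = nodes[f].fail
			}
			if nodes[nodes[v].fail].out {
				nodes[v].out = true
			}
			queue = append(queue, v)
		}
	}
	return nodes
}

// isMatch feeds the text through the automaton without rescanning input.
func isMatch(nodes []node, text string) bool {
	s := 0
	for _, c := range []byte(text) {
		for {
			if nx, ok := nodes[s].next[c]; ok {
				s = nx
				break
			}
			if s == 0 {
				break
			}
			s = nodes[s].fail
		}
		if nodes[s].out {
			return true
		}
	}
	return false
}

func main() {
	ac := build([]string{"error", "warning", "fatal", "critical"})
	fmt.Println(isMatch(ac, "disk warning: low space")) // true
	fmt.Println(isMatch(ac, "all good"))                // false
}
```

&lt;p&gt;In this form a byte may follow several failure links before it finds an edge; resolving those links ahead of time into a full transition table is what turns the search into one lookup per byte.&lt;/p&gt;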

&lt;h2&gt;
  
  
  What We Built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/coregx/ahocorasick" rel="noopener noreferrer"&gt;coregx/ahocorasick&lt;/a&gt; is part of the &lt;a href="https://github.com/coregx" rel="noopener noreferrer"&gt;coregx&lt;/a&gt; organization — a collection of high-performance libraries for Go. It serves as the multi-pattern search engine inside &lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;coregex&lt;/a&gt;, our regex engine, where it accelerates literal alternations like &lt;code&gt;error|warning|fatal|critical&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Design Decisions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pure Go, zero dependencies.&lt;/strong&gt; The library compiles on every platform Go supports — Linux, Windows, macOS, ARM, WASM — without any build complexity. There's no cgo bridge, no platform-specific assembly files to maintain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DFA compilation.&lt;/strong&gt; At build time, the library constructs a noncontiguous NFA (trie + failure links), then compiles it into a fully resolved DFA — a single flat &lt;code&gt;[]uint32&lt;/code&gt; array where every state transition is pre-computed. At search time, there are no failure links to follow. One table lookup per byte. Always.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Builder pattern API.&lt;/strong&gt; The automaton is immutable after construction. You configure it once, build it once, and then search concurrently from any number of goroutines without synchronization.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ahocorasick&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewBuilder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;AddStrings&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"warning"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"fatal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"critical"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Safe for concurrent use&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logLine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// handle match&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Performance: The Numbers
&lt;/h2&gt;

&lt;p&gt;Benchmarks on Intel i7-1255U, single core:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Input&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;th&gt;Allocations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;IsMatch&lt;/code&gt; (match found)&lt;/td&gt;
&lt;td&gt;64 KB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.3 GB/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;IsMatch&lt;/code&gt; (no match)&lt;/td&gt;
&lt;td&gt;64 KB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.0 GB/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Find&lt;/code&gt; (first match)&lt;/td&gt;
&lt;td&gt;64 KB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.5 GB/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1 (24 B)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;FindAll&lt;/code&gt; (all matches)&lt;/td&gt;
&lt;td&gt;77 B&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;100 MB/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4 (720 B)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These are median values across multiple benchmark runs. Individual runs may be higher or lower depending on CPU frequency scaling. For context, sequential reads from a fast PCIe 4.0 NVMe SSD top out at around 7 GB/s, so &lt;code&gt;IsMatch&lt;/code&gt; processes data nearly as fast as such a drive can deliver it.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works: Three Layers of Optimization
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Layer 1: The DFA Transition Table
&lt;/h3&gt;

&lt;p&gt;The classical Aho-Corasick NFA has a problem: when a byte doesn't match any transition from the current state, it follows a chain of failure links back toward the root. In the worst case, this is &lt;code&gt;O(depth)&lt;/code&gt; operations per byte.&lt;/p&gt;

&lt;p&gt;Our DFA eliminates this entirely. At build time, for every state and every possible byte class (a group of byte values that the pattern set never needs to tell apart), we pre-compute the final destination by following all failure links. The result is stored in a flat array:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;next_state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trans&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;current_state&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;byte_class&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One addition. One array load. No branches, no link following, no indirection. The state IDs are pre-multiplied by the stride (the row width of the transition table), so the lookup doesn't even need a multiplication — just an addition.&lt;/p&gt;
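
&lt;p&gt;To make the pre-multiplied layout concrete, here is a minimal sketch for a single pattern, &lt;code&gt;"ab"&lt;/code&gt;. It is not the library's code: the three byte classes, the hand-written table, and the names &lt;code&gt;stride&lt;/code&gt;, &lt;code&gt;classes&lt;/code&gt;, and &lt;code&gt;containsAB&lt;/code&gt; are assumptions chosen for illustration.&lt;/p&gt;

```go
package main

import "fmt"

// Illustrative only: three byte classes and a hand-built table for the
// single pattern "ab". A real automaton derives both from the pattern set.
const (
	stride     = 3          // row width: one column per byte class
	stateStart = 0 * stride // state IDs are stored pre-multiplied by stride
	stateA     = 1 * stride // "a" has been seen
	stateMatch = 2 * stride // "ab" has been seen
)

// classes maps every byte value to its class: 1 for 'a', 2 for 'b',
// and 0 (the zero value) for everything else.
var classes = [256]uint8{'a': 1, 'b': 2}

// trans is the flat transition table. The row for a state begins at its
// pre-multiplied ID, so a lookup is trans[sid+class] with no multiplication.
var trans = []uint32{
	// other,   'a',    'b'
	stateStart, stateA, stateStart, // row for stateStart
	stateStart, stateA, stateMatch, // row for stateA
	stateStart, stateA, stateStart, // row for stateMatch
}

// containsAB reports whether haystack contains "ab", doing exactly one
// table lookup per input byte and never following a failure link.
func containsAB(haystack []byte) bool {
	sid := uint32(stateStart)
	for _, b := range haystack {
		sid = trans[sid+uint32(classes[b])]
		if sid == stateMatch {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsAB([]byte("xxaab"))) // true
	fmt.Println(containsAB([]byte("ba")))    // false
}
```

&lt;p&gt;Note how the second &lt;code&gt;a&lt;/code&gt; in &lt;code&gt;"xxaab"&lt;/code&gt; keeps the automaton in &lt;code&gt;stateA&lt;/code&gt;: the transition that a failure link would have produced is already baked into the row.&lt;/p&gt;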

&lt;h3&gt;
  
  
  Layer 2: Match Flag in the Transition Table
&lt;/h3&gt;

&lt;p&gt;Each entry in the transition table is a &lt;code&gt;uint32&lt;/code&gt;. The lower 31 bits hold the (pre-multiplied) destination state ID. The high bit — bit 31 — is a match flag: if set, the destination state has at least one matching pattern.&lt;/p&gt;

&lt;p&gt;This means the hot loop checks for matches with a single AND instruction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;trans&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;sid&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;matchFlag&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;  &lt;span class="c"&gt;// match found&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;sid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt;  &lt;span class="c"&gt;// no flag → raw IS the clean state ID&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The critical insight: when there's no match (the common case for most bytes), &lt;code&gt;raw&lt;/code&gt; has bit 31 clear, so it's already a valid state ID. No masking needed. The mask is only applied on the rare match path.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 3: SIMD Prefilter with Skip-Ahead
&lt;/h3&gt;

&lt;p&gt;The DFA processes one byte at a time. But modern CPUs have SIMD instructions that can scan 16–32 bytes in a single operation. Go's &lt;code&gt;bytes.IndexByte&lt;/code&gt; uses these instructions internally (SSE2/AVX2 on x86, NEON on ARM).&lt;/p&gt;

&lt;p&gt;Before entering the DFA loop, we scan the haystack for any byte that could start a pattern match. If none of the pattern start bytes exist in the haystack, we return immediately — no DFA traversal at all.&lt;/p&gt;

&lt;p&gt;More importantly, we re-engage this prefilter &lt;em&gt;during&lt;/em&gt; the DFA loop. Whenever the automaton returns to its start state (meaning no pattern prefix is currently being tracked), we call &lt;code&gt;bytes.IndexByte&lt;/code&gt; to skip ahead to the next position where a match could begin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;sid&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;startState&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;startBytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;skip&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;findEarliestStartByte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;haystack&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;startBytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;skip&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;  &lt;span class="c"&gt;// no more start bytes → no match possible&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;skip&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the same technique used by &lt;a href="https://github.com/BurntSushi/aho-corasick" rel="noopener noreferrer"&gt;BurntSushi's Rust aho-corasick&lt;/a&gt; crate, adapted for Go's runtime. In the no-match case it replaces a byte-at-a-time DFA scan with a handful of SIMD scans: the work is still linear in the input, but with a much smaller constant factor.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Usage
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Log Analysis Pipeline
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Build once at startup&lt;/span&gt;
&lt;span class="n"&gt;patterns&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"ERROR"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"FATAL"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"PANIC"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"OOM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"connection refused"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"timeout exceeded"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"permission denied"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"disk full"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ahocorasick&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewBuilder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;AddStrings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patterns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Process log lines (concurrent-safe)&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;processLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="c"&gt;// fast path: no keywords found&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FindAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patterns&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PatternID&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Content Moderation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Load blocklist from database&lt;/span&gt;
&lt;span class="n"&gt;blocklist&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;loadBlocklist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// []string, potentially thousands of terms&lt;/span&gt;
&lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ahocorasick&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewBuilder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;AddStrings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blocklist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;moderate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// O(n), regardless of blocklist size&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Regex Engine Prefilter
&lt;/h3&gt;

&lt;p&gt;Inside &lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;coregex&lt;/a&gt;, when the regex engine encounters a pattern like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(?i)(error|warning|fatal|critical|info|debug|trace)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It extracts the literal alternation, builds an Aho-Corasick automaton, and uses it as a prefilter. The automaton quickly identifies candidate positions in the text, and the full regex engine only runs at those positions. For large inputs with rare matches, this can be 10–100x faster than running the regex directly.&lt;/p&gt;
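
&lt;p&gt;The prefilter idea can be sketched with the standard library alone. Here &lt;code&gt;bytes.Contains&lt;/code&gt; stands in for the Aho-Corasick automaton and &lt;code&gt;regexp&lt;/code&gt; for the full engine. The pattern, the literal list, and the name &lt;code&gt;matchWithPrefilter&lt;/code&gt; are illustrative assumptions, not coregex internals, and this is a coarser whole-line variant: it decides whether to run the regex at all rather than where to run it.&lt;/p&gt;

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// Hypothetical pattern: every match must contain one of two literals.
var re = regexp.MustCompile(`(ERROR|FATAL) code=\d+`)

// requiredLiterals are the alternation branches, extracted by hand here.
// A regex engine would derive them from the parsed pattern.
var requiredLiterals = [][]byte{[]byte("ERROR"), []byte("FATAL")}

// matchWithPrefilter runs the regex only when a required literal is present.
func matchWithPrefilter(line []byte) bool {
	for _, lit := range requiredLiterals {
		if bytes.Contains(line, lit) {
			return re.Match(line) // candidate found: let the regex decide
		}
	}
	return false // no required literal, so the regex cannot match
}

func main() {
	fmt.Println(matchWithPrefilter([]byte("2026-03-01 ERROR code=42"))) // true
	fmt.Println(matchWithPrefilter([]byte("ERROR without a code")))     // false
	fmt.Println(matchWithPrefilter([]byte("INFO all good")))            // false
}
```

&lt;p&gt;The third call never touches the regex engine, which is where the savings come from on inputs with rare matches.&lt;/p&gt;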

&lt;h2&gt;
  
  
  Part of the coregx Ecosystem
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;ahocorasick&lt;/code&gt; is one component in the &lt;a href="https://github.com/coregx" rel="noopener noreferrer"&gt;coregx&lt;/a&gt; organization's text processing toolkit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/coregx/ahocorasick" rel="noopener noreferrer"&gt;ahocorasick&lt;/a&gt;&lt;/strong&gt; — Multi-pattern string matching (this library)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;coregex&lt;/a&gt;&lt;/strong&gt; — High-performance regex engine that uses ahocorasick as its literal search backend&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The libraries are designed to work together but are independently useful. &lt;code&gt;ahocorasick&lt;/code&gt; has zero dependencies and can be used standalone in any Go project.&lt;/p&gt;

&lt;p&gt;If you're interested in the regex engine side of things, check out &lt;a href="https://dev.to/kolkov/gos-regexp-is-slow-so-i-built-my-own-3000x-faster-3i6h"&gt;Go's Regexp is Slow. So I Built My Own — up to 3000x Faster&lt;/a&gt;, where we describe coregex and how it uses this Aho-Corasick library under the hood.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/coregx/ahocorasick
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/coregx/ahocorasick"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ahocorasick&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewBuilder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;AddStrings&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"quick"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"brown"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"fox"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"the quick brown fox jumps over the lazy dog"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// Zero-allocation existence check&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Has match:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c"&gt;// true&lt;/span&gt;

    &lt;span class="c"&gt;// Find all matches&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FindAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"  %q at [%d:%d]&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;End&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;End&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Has match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="s"&gt;"quick" at [4:9]&lt;/span&gt;
  &lt;span class="s"&gt;"brown" at [10:15]&lt;/span&gt;
  &lt;span class="s"&gt;"fox" at [16:19]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Teddy SIMD prefilter&lt;/strong&gt; — A packed SIMD algorithm (inspired by &lt;a href="https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-hyperscan.html" rel="noopener noreferrer"&gt;Hyperscan&lt;/a&gt;) for even faster candidate scanning with ≤64 patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contiguous NFA&lt;/strong&gt; — A memory-efficient alternative to the DFA for very large pattern sets (10,000+ patterns) where the full DFA transition table doesn't fit in L2 cache&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stream search&lt;/strong&gt; — &lt;code&gt;io.Reader&lt;/code&gt; interface for searching in streaming data without loading the entire input into memory&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/coregx/ahocorasick" rel="noopener noreferrer"&gt;github.com/coregx/ahocorasick&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Go Reference: &lt;a href="https://pkg.go.dev/github.com/coregx/ahocorasick" rel="noopener noreferrer"&gt;pkg.go.dev/github.com/coregx/ahocorasick&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Part of: &lt;a href="https://github.com/coregx" rel="noopener noreferrer"&gt;coregx&lt;/a&gt; — High-performance libraries for Go&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Licensed under MIT. Contributions welcome.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>algorithms</category>
      <category>performance</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Go GUI in 2026: gogpu/ui v0.1.0 — 22 Widgets, GPU Rendering, Zero CGO</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Sun, 15 Mar 2026 16:50:20 +0000</pubDate>
      <link>https://forem.com/kolkov/go-gui-in-2026-gogpuui-v010-22-widgets-gpu-rendering-zero-cgo-1enf</link>
      <guid>https://forem.com/kolkov/go-gui-in-2026-gogpuui-v010-22-widgets-gpu-rendering-zero-cgo-1enf</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series: Building Go's GPU Ecosystem&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-a-pure-go-graphics-library-for-gpu-programming-2j5d"&gt;GoGPU: A Pure Go Graphics Library&lt;/a&gt; — Project announcement&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-from-idea-to-100k-lines-in-two-weeks-building-gos-gpu-ecosystem-3b2"&gt;From Idea to 100K Lines in Two Weeks&lt;/a&gt; — The journey&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/pure-go-2d-graphics-library-with-gpu-acceleration-introducing-gogpugg-538h"&gt;Pure Go 2D Graphics with GPU Acceleration&lt;/a&gt; — Introducing gg&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gpu-compute-shaders-in-pure-go-gogpugg-v0150-1cjk"&gt;GPU Compute Shaders in Pure Go&lt;/a&gt; — Compute pipelines&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/go-126-meets-2026-with-a-professional-graphics-ecosystem-9g8"&gt;Go 1.26 Meets 2026&lt;/a&gt; — Ecosystem overview&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpugg-enterprise-2d-graphics-library-in-pure-go-1931"&gt;Enterprise 2D Graphics Library&lt;/a&gt; — gg architecture&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-enterprise-architecture-cross-package-gpu-integration-with-gpucontext-332"&gt;Cross-Package GPU Integration&lt;/a&gt; — gpucontext&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-unified-2d3d-graphics-integration-in-pure-go-gg3"&gt;Unified 2D/3D Graphics Integration&lt;/a&gt; — gg + gogpu&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.tolink"&gt;Core Complete, Focus on GUI&lt;/a&gt; — Phase 2 announcement&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;gogpu/ui v0.1.0 — First Release&lt;/strong&gt; ← You are here&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  From "Focus on GUI" to First Release
&lt;/h2&gt;

&lt;p&gt;In the &lt;a href="https://dev.tolink"&gt;last article&lt;/a&gt;, we announced that the core gogpu ecosystem was complete and all our energy was going into building a GUI toolkit. At that point gogpu/ui was at Phase 2 Beta with 55K lines of code, 6 widgets, and one design system.&lt;/p&gt;

&lt;p&gt;Today, roughly three weeks later, we are tagging &lt;strong&gt;v0.1.0&lt;/strong&gt;, the first public release. The toolkit grew from 55K to roughly &lt;strong&gt;150K lines&lt;/strong&gt;, from 6 to &lt;strong&gt;22 interactive widgets&lt;/strong&gt;, and from one to &lt;strong&gt;three design systems&lt;/strong&gt;. The entire ecosystem, from shader compiler to widget toolkit, went through a coordinated release to bring everything onto stable, tagged versions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is a preview release.&lt;/strong&gt; The API will change. Performance has not been optimized. There are rough edges. We are releasing now because we want community feedback to shape the API before it solidifies. The work is just beginning, but we believe there is enough here to start a conversation.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is gogpu/ui?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/gogpu/ui" rel="noopener noreferrer"&gt;gogpu/ui&lt;/a&gt; is a GUI toolkit library for Go. You build widget trees, bind reactive state, and the toolkit handles layout, event dispatch, and rendering through &lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;gogpu/gg&lt;/a&gt; (a 2D graphics engine) onto a GPU surface provided by &lt;a href="https://github.com/gogpu/gogpu" rel="noopener noreferrer"&gt;gogpu/gogpu&lt;/a&gt; (an application framework).&lt;/p&gt;

&lt;p&gt;We studied 20+ GUI frameworks across multiple ecosystems — &lt;strong&gt;Flutter&lt;/strong&gt; (Dart), &lt;strong&gt;Qt&lt;/strong&gt; (C++), &lt;strong&gt;SwiftUI&lt;/strong&gt; (Swift), &lt;strong&gt;Xilem/Floem/GPUI/Iced/Slint&lt;/strong&gt; (Rust), &lt;strong&gt;Angular/SolidJS/React&lt;/strong&gt; (Web), &lt;strong&gt;Fyne/Gio&lt;/strong&gt; (Go) — and brought their proven patterns into Go:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pluggable Painter pattern&lt;/strong&gt; (Flutter's separation of widget behavior from rendering)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reactive signals&lt;/strong&gt; (Angular Signals architecture, also inspired by SolidJS and Rust's Leptos)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Functional options&lt;/strong&gt; for backward-compatible API evolution (Go-native)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;W3C Pointer Events&lt;/strong&gt; for input handling (browser standard)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CSS font-weight matching&lt;/strong&gt; algorithm (W3C spec) for typography&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Screen-space coordinate transforms&lt;/strong&gt; (Flutter's &lt;code&gt;localToGlobal&lt;/code&gt;, Qt's &lt;code&gt;mapToGlobal&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The entire stack — from shader compilation to window management to 2D rendering — is &lt;strong&gt;580,000+ lines of pure Go&lt;/strong&gt; with zero CGO.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;By the numbers (v0.1.0):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Interactive widgets&lt;/td&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design system themes&lt;/td&gt;
&lt;td&gt;3 (Material 3, Fluent, Cupertino)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Go source files&lt;/td&gt;
&lt;td&gt;350+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lines of code&lt;/td&gt;
&lt;td&gt;~146,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test functions&lt;/td&gt;
&lt;td&gt;3,900+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Average test coverage&lt;/td&gt;
&lt;td&gt;97%+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CGO required&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Quick Look: A Minimal Application
&lt;/h2&gt;

&lt;p&gt;Here is a complete runnable application with a Material 3 button:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/gogpu/gg"&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="s"&gt;"github.com/gogpu/gg/gpu"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gg/integration/ggcanvas"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gogpu"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/ui/app"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/ui/core/button"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/ui/primitives"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/ui/render"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/ui/theme/material3"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/ui/widget"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;gogpuApp&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewApp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultConfig&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithTitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"My First App"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithContinuousRender&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;false&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c"&gt;// event-driven: 0% CPU when idle&lt;/span&gt;

    &lt;span class="n"&gt;m3&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;material3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;widget&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0x6750A4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c"&gt;// seed color&lt;/span&gt;

    &lt;span class="n"&gt;uiApp&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithWindowProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithPlatformProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithEventSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;uiApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRoot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;primitives&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;primitives&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello, gogpu/ui!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FontSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Click Me"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnClick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Clicked!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
                &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PainterOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;material3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ButtonPainter&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Theme&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;m3&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Padding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Gap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;12&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;canvas&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ggcanvas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Canvas&lt;/span&gt;
    &lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnDraw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Width&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Height&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;canvas&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
            &lt;span class="n"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ggcanvas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GPUContextProvider&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;uiApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Frame&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;sv&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SurfaceView&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;sw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SurfaceSize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetAcceleratorSurfaceTarget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sh&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;canvas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;cc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRGBA&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;cc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawRectangle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="n"&gt;cc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;uiApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Window&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;render&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCanvas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;canvas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RenderDirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sh&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnClose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CloseAccelerator&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yes, the boilerplate for wiring up the draw callback is verbose. That is an area we plan to improve. The tradeoff is full control over the render pipeline -- you can mix gg 2D drawing with UI widgets, render to textures, or integrate with your own rendering code.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Widget Set
&lt;/h2&gt;

&lt;p&gt;v0.1.0 ships 22 interactive widgets and display primitives:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Form controls:&lt;/strong&gt; Button (4 variants, 3 sizes), Checkbox (tri-state), Radio group, TextField (with selection, clipboard, validation), Dropdown, Slider (continuous/discrete)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Containers:&lt;/strong&gt; ScrollView (wheel, keyboard, scrollbar drag), TabView (lazy content, closeable tabs), Dialog (modal, focus trapping), Collapsible sections, SplitView (resizable panels)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data display:&lt;/strong&gt; ListView (virtualized, 1000+ items), GridView (virtualized 2D grid), DataTable (sortable columns, fixed header), TreeView (hierarchical, expand/collapse), LineChart (real-time, multiple series), ProgressBar, Circular Progress&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Application chrome:&lt;/strong&gt; Toolbar, Menu system (MenuBar + ContextMenu with submenus), Docking system (IDE-style panels)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Primitives:&lt;/strong&gt; Box (VBox/HBox), Text (reactive via signals), Image, ThemeScope, RepaintBoundary (pixel caching)&lt;/p&gt;

&lt;p&gt;Every interactive widget follows the same patterns: functional options for construction, a Painter interface for design-system independence, signal bindings for reactive state, and accessibility metadata (ARIA roles).&lt;/p&gt;




&lt;h2&gt;
  
  
  Three Design Systems
&lt;/h2&gt;

&lt;p&gt;Widgets do not know how they look. Rendering is delegated to Painter implementations. v0.1.0 ships three:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Material Design 3&lt;/strong&gt; -- HCT color science, tonal palettes generated from a single seed color, 21 component painters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fluent Design&lt;/strong&gt; -- Microsoft's design language with accent colors, inner focus rings, 9 painters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cupertino&lt;/strong&gt; -- Apple HIG with iOS-style toggle switches, segmented controls, pill buttons, 9 painters.&lt;/p&gt;

&lt;p&gt;Switching design systems is a one-line change, because only the painter option differs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Material 3&lt;/span&gt;
&lt;span class="n"&gt;btn&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Save"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PainterOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;material3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ButtonPainter&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Theme&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;m3&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Fluent Design&lt;/span&gt;
&lt;span class="n"&gt;btn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Save"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PainterOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fluent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ButtonPainter&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Theme&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fl&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Cupertino&lt;/span&gt;
&lt;span class="n"&gt;btn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Save"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PainterOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cupertino&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ButtonPainter&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Theme&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cu&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The gallery example demonstrates live M3 seed color switching at runtime (Purple, Blue, Green, Orange). Fluent and Cupertino painters are implemented and tested but not yet wired into the gallery demo — switching between design systems at runtime is architecturally supported (just swap the Painter), but a full gallery with all three is a v0.2.0 goal.&lt;/p&gt;




&lt;h2&gt;
  
  
  Reactive Signals
&lt;/h2&gt;

&lt;p&gt;State management uses a signals pattern built on our own &lt;a href="https://github.com/coregx/signals" rel="noopener noreferrer"&gt;coregx/signals&lt;/a&gt; library. The architecture is directly inspired by &lt;strong&gt;Angular Signals&lt;/strong&gt; (fine-grained reactivity with &lt;code&gt;Signal[T]&lt;/code&gt;, &lt;code&gt;Computed[T]&lt;/code&gt;, &lt;code&gt;Effect&lt;/code&gt;), with influence from SolidJS and Rust's Leptos. We studied Angular's signal lifecycle, dependency tracking, and lazy evaluation, then reimplemented them idiomatically in Go with generics. A &lt;code&gt;Signal[T]&lt;/code&gt; holds a value. A &lt;code&gt;Computed[T]&lt;/code&gt; derives from other signals and recomputes lazily, when its value is next read. When a signal changes, bound widgets automatically invalidate and redraw.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewSignal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Computed label updates automatically&lt;/span&gt;
&lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;primitives&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextFn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Clicked %d times"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;btn&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Increment"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnClick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Widgets support two-way signal bindings. The slider reads from and writes to the same signal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;volume&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewSignal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;slider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;slider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;slider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;slider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ValueSignal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c"&gt;// two-way binding&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;// volume.Get() always reflects the current slider position&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Signal lifecycle is automatic: widgets subscribe on mount and clean up on unmount. No manual disposal needed.&lt;/p&gt;
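
&lt;p&gt;To make the lazy recomputation described in this section concrete, here is a minimal sketch of a signal and a computed value in plain Go. It is not the coregx/signals API: &lt;code&gt;NewComputed&lt;/code&gt;, the explicit dependency list, the &lt;code&gt;source&lt;/code&gt; interface, and the &lt;code&gt;runs&lt;/code&gt; counter are assumptions made for illustration. A real library would typically track dependencies automatically and handle unsubscription.&lt;/p&gt;

```go
package main

import "fmt"

// Signal is a hypothetical minimal reactive value (not the coregx/signals type).
type Signal[T any] struct {
	value     T
	listeners []func()
}

// NewSignal creates a signal holding v.
func NewSignal[T any](v T) *Signal[T] {
	s := new(Signal[T])
	s.value = v
	return s
}

// Get returns the current value.
func (s *Signal[T]) Get() T { return s.value }

// Set stores a new value and notifies every subscriber.
func (s *Signal[T]) Set(v T) {
	s.value = v
	for _, fn := range s.listeners {
		fn()
	}
}

// Subscribe registers a callback that runs after each Set.
func (s *Signal[T]) Subscribe(fn func()) { s.listeners = append(s.listeners, fn) }

// source is anything a Computed can listen to (assumed helper interface).
type source interface{ Subscribe(fn func()) }

// Computed caches a derived value and recomputes it only when it is read
// after one of its sources changed.
type Computed[T any] struct {
	compute func() T
	cached  T
	dirty   bool
	runs    int // number of recomputations, kept only for the demo
}

// NewComputed wires the derivation to an explicit list of sources.
func NewComputed[T any](compute func() T, deps ...source) *Computed[T] {
	c := new(Computed[T])
	c.compute = compute
	c.dirty = true
	for _, d := range deps {
		d.Subscribe(func() { c.dirty = true }) // invalidate, do not recompute
	}
	return c
}

// Get recomputes on demand and returns the cached value otherwise.
func (c *Computed[T]) Get() T {
	if c.dirty {
		c.cached = c.compute()
		c.dirty = false
		c.runs++
	}
	return c.cached
}

func main() {
	count := NewSignal(0)
	label := NewComputed(func() string {
		return fmt.Sprintf("Clicked %d times", count.Get())
	}, count)

	fmt.Println(label.Get()) // first read computes the value
	count.Set(1)
	count.Set(2)             // two changes, nothing recomputed yet
	fmt.Println(label.Get()) // one recomputation, on demand
	fmt.Println("recomputations:", label.runs)
}
```

&lt;p&gt;The point of the sketch is the split between invalidation (cheap, happens on every &lt;code&gt;Set&lt;/code&gt;) and recomputation (deferred until the next &lt;code&gt;Get&lt;/code&gt;).&lt;/p&gt;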




&lt;h2&gt;
  
  
  Architecture Highlights
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pluggable Painter pattern.&lt;/strong&gt; Widget behavior (in &lt;code&gt;core/&lt;/code&gt;) is separated from rendering (in &lt;code&gt;theme/&lt;/code&gt;). A Button knows about click handling, focus states, and size variants. A ButtonPainter knows how to draw rounded corners and color fills. This avoids import cycles and lets third parties create entirely new design systems.&lt;/p&gt;
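
&lt;p&gt;The separation of behavior and rendering can be sketched in a few lines. The &lt;code&gt;Painter&lt;/code&gt; interface, its &lt;code&gt;Paint&lt;/code&gt; signature, and the two string-based painters below are hypothetical stand-ins, not the gogpu/ui types; they only show how a widget can delegate its look and have it swapped at runtime.&lt;/p&gt;

```go
package main

import "fmt"

// Painter is a hypothetical rendering interface. The real gogpu/ui
// painter signatures are not shown in this article.
type Painter interface {
	Paint(label string, focused bool) string
}

// Button holds behavior and state only; it never decides how it looks.
type Button struct {
	Label   string
	Focused bool
	painter Painter
}

// SetPainter swaps the design system at runtime.
func (b *Button) SetPainter(p Painter) { b.painter = p }

// Draw delegates every visual decision to the current painter.
func (b *Button) Draw() string { return b.painter.Paint(b.Label, b.Focused) }

// roundedPainter and squarePainter are toy design systems that "draw" text.
type roundedPainter struct{}

func (roundedPainter) Paint(label string, focused bool) string {
	if focused {
		return "((" + label + "))"
	}
	return "(" + label + ")"
}

type squarePainter struct{}

func (squarePainter) Paint(label string, focused bool) string {
	if focused {
		return "[[" + label + "]]"
	}
	return "[" + label + "]"
}

func main() {
	btn := Button{Label: "Save"}
	btn.SetPainter(roundedPainter{})
	fmt.Println(btn.Draw())

	btn.Focused = true
	btn.SetPainter(squarePainter{}) // same widget state, different look
	fmt.Println(btn.Draw())
}
```

&lt;p&gt;Because the widget depends only on the interface, a third-party package could provide its own painter without the widget package importing it.&lt;/p&gt;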

&lt;p&gt;&lt;strong&gt;Functional options.&lt;/strong&gt; All widget constructors use the options pattern for backward-compatible API evolution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;btn&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Submit"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VariantOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Filled&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SizeOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Large&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnClick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handleSubmit&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adding new options in future versions will not break existing code.&lt;/p&gt;
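
&lt;p&gt;For readers unfamiliar with the pattern, the following is a minimal sketch of how such a constructor is usually built. The option names mirror the snippet above, but the &lt;code&gt;Button&lt;/code&gt; fields, the defaults, and the &lt;code&gt;Click&lt;/code&gt; helper are assumptions for illustration rather than the actual gogpu/ui implementation.&lt;/p&gt;

```go
package main

import "fmt"

// Variant and Size are hypothetical stand-ins for the real option values.
type Variant int
type Size int

const (
	Text Variant = iota
	Filled
)

const (
	Medium Size = iota
	Large
)

// Button keeps its configuration private; callers go through options.
type Button struct {
	text    string
	variant Variant
	size    Size
	onClick func()
}

// Option mutates a Button during construction.
type Option func(*Button)

func TextOpt(s string) Option     { return func(b *Button) { b.text = s } }
func VariantOpt(v Variant) Option { return func(b *Button) { b.variant = v } }
func SizeOpt(s Size) Option       { return func(b *Button) { b.size = s } }
func OnClick(fn func()) Option    { return func(b *Button) { b.onClick = fn } }

// New applies defaults first, then each option in order. A new option added
// later is just another function, so existing call sites keep compiling.
func New(opts ...Option) *Button {
	b := new(Button)
	b.variant = Text
	b.size = Medium
	b.onClick = func() {}
	for _, opt := range opts {
		opt(b)
	}
	return b
}

// Click invokes the configured handler.
func (b *Button) Click() { b.onClick() }

func main() {
	plain := New(TextOpt("Cancel"))
	submit := New(
		TextOpt("Submit"),
		VariantOpt(Filled),
		SizeOpt(Large),
		OnClick(func() { fmt.Println("submitted") }),
	)

	fmt.Println(plain.text, plain.variant == Text, plain.size == Medium)
	fmt.Println(submit.text, submit.variant == Filled, submit.size == Large)
	submit.Click()
}
```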

&lt;p&gt;&lt;strong&gt;Content[C] polymorphic pattern.&lt;/strong&gt; Inspired by Taiga UI's Polymorpheus pattern (Angular), complex widgets like ListView, GridView, and TabView use a generic &lt;code&gt;Content[C]&lt;/code&gt; interface from the CDK (Component Development Kit) package. This enables type-safe polymorphic content rendering — the widget provides the context (item index, selection state, hover), the user provides the builder function. Internally it powers cell recycling and virtualization, while externally users see a simple &lt;code&gt;BuildItem(func(index int) widget.Widget)&lt;/code&gt; callback.&lt;/p&gt;
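
&lt;p&gt;A rough sketch of the idea, under stated assumptions: the article names &lt;code&gt;Content[C]&lt;/code&gt; but does not show its definition, so the &lt;code&gt;Build&lt;/code&gt; method, &lt;code&gt;ContentFunc&lt;/code&gt;, &lt;code&gt;ItemContext&lt;/code&gt;, and the &lt;code&gt;List&lt;/code&gt; type below are hypothetical, and a string stands in for the widget a real builder would return.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// ItemContext is what a list widget could hand to the content builder.
type ItemContext struct {
	Index    int
	Value    string
	Selected bool
}

// Content renders output from a context of type C. In a real toolkit the
// result would be a widget; a string keeps the sketch self-contained.
type Content[C any] interface {
	Build(ctx C) string
}

// ContentFunc adapts a plain function to Content[C].
type ContentFunc[C any] func(ctx C) string

// Build calls the wrapped function.
func (f ContentFunc[C]) Build(ctx C) string { return f(ctx) }

// List owns the context (index, selection); the caller owns the rendering.
type List struct {
	Items    []string
	Selected int
	content  Content[ItemContext]
}

// SetContent installs the type-safe builder.
func (l *List) SetContent(c Content[ItemContext]) { l.content = c }

// Render builds one row per item, passing the widget-provided context.
func (l *List) Render() []string {
	rows := make([]string, 0, len(l.Items))
	for i, v := range l.Items {
		ctx := ItemContext{Index: i, Value: v, Selected: i == l.Selected}
		rows = append(rows, l.content.Build(ctx))
	}
	return rows
}

func main() {
	list := List{Items: []string{"alpha", "beta", "gamma"}, Selected: 1}
	list.SetContent(ContentFunc[ItemContext](func(ctx ItemContext) string {
		mark := " "
		if ctx.Selected {
			mark = "*"
		}
		return fmt.Sprintf("%s %d:%s", mark, ctx.Index, ctx.Value)
	}))
	fmt.Println(strings.Join(list.Render(), "\n"))
}
```

&lt;p&gt;The type parameter is what keeps the builder type-safe: a grid or tab view could use a different context struct with the same &lt;code&gt;Content&lt;/code&gt; machinery.&lt;/p&gt;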

&lt;p&gt;&lt;strong&gt;Retained-mode rendering.&lt;/strong&gt; The widget tree tracks dirty state. Only changed widgets are redrawn. RepaintBoundary widgets cache their subtree as pixel buffers. Large boundaries (128x128+) automatically use tile-parallel rendering via &lt;code&gt;scene.Scene&lt;/code&gt; with goroutine work-stealing.&lt;/p&gt;
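
&lt;p&gt;The dirty-tracking part of this can be illustrated with a small tree walk. This is a simplified sketch, not the gogpu/ui implementation: the &lt;code&gt;Node&lt;/code&gt; type and its two flags are assumptions, and it leaves out pixel caching, overlapping widgets, and tile-parallel rendering.&lt;/p&gt;

```go
package main

import "fmt"

// Node is a hypothetical retained-mode widget node.
type Node struct {
	Name       string
	parent     *Node
	children   []*Node
	dirty      bool // this node must be redrawn
	childDirty bool // some descendant must be redrawn
}

// NewNode builds a node that starts dirty, since nothing has been drawn yet.
func NewNode(name string, children ...*Node) *Node {
	n := new(Node)
	n.Name = name
	n.dirty = true
	n.children = children
	n.childDirty = len(children) != 0
	for _, c := range children {
		c.parent = n
	}
	return n
}

// Invalidate marks this node for redraw and flags every ancestor so the
// next paint pass knows which branch to descend into.
func (n *Node) Invalidate() {
	n.dirty = true
	for p := n.parent; p != nil; p = p.parent {
		p.childDirty = true
	}
}

// Paint redraws dirty nodes, skips clean subtrees entirely, and returns
// the names of the nodes that were drawn.
func (n *Node) Paint() []string {
	var drawn []string
	if n.dirty {
		drawn = append(drawn, n.Name)
		n.dirty = false
	}
	if n.childDirty {
		for _, c := range n.children {
			drawn = append(drawn, c.Paint()...)
		}
		n.childDirty = false
	}
	return drawn
}

func main() {
	label := NewNode("label")
	button := NewNode("button")
	root := NewNode("root", NewNode("panel", label, button))

	fmt.Println(root.Paint()) // first frame draws everything
	fmt.Println(root.Paint()) // nothing changed, nothing drawn
	button.Invalidate()
	fmt.Println(root.Paint()) // only the button is redrawn
}
```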

&lt;p&gt;&lt;strong&gt;Event-driven rendering.&lt;/strong&gt; By default the UI uses 0% CPU when idle: the window redraws only when events arrive or signals change, so no continuous render loop burns battery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accessibility from day one.&lt;/strong&gt; Every widget implements the &lt;code&gt;a11y.Accessible&lt;/code&gt; interface with ARIA roles, labels, and actions. The accessibility tree uses stable uint64 node IDs. Platform adapters (UIA, AT-SPI2, NSAccessibility) are not implemented yet -- that is one of the biggest remaining gaps.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Works Well
&lt;/h2&gt;

&lt;p&gt;After months of development, these areas feel solid:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TextField&lt;/strong&gt; handles Unicode input correctly (Latin, Cyrillic, CJK), with cursor movement, text selection, clipboard (Ctrl+C/V/X/A), and input validation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tab focus navigation&lt;/strong&gt; works across the widget tree with Tab/Shift+Tab. Focus rings render correctly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hover cursors&lt;/strong&gt; change appropriately (pointer for buttons, text beam for text fields, resize handles for SplitView).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ScrollView&lt;/strong&gt; dispatches events with correct coordinate transforms to child widgets, even when scrolled.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live theme color switching&lt;/strong&gt; in the gallery demo (M3 seed color changes all widget colors instantly).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Virtualized lists&lt;/strong&gt; handle thousands of items with only visible rows rendered and recycled.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Signal bindings&lt;/strong&gt; propagate state changes through the widget tree without manual wiring.&lt;/li&gt;
&lt;/ul&gt;
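
&lt;p&gt;As an illustration of why virtualized lists scale, the sketch below computes which rows intersect the viewport for fixed-height rows. The function name &lt;code&gt;visibleRange&lt;/code&gt; and its parameters are hypothetical, and the actual ListView may handle variable row heights and recycling differently.&lt;/p&gt;

```go
package main

import "fmt"

// visibleRange returns the half-open row range [first, last) that intersects
// the viewport. scrollY, viewportH and rowH are in pixels; rows have a fixed
// height. It uses the min and max built-ins available since Go 1.21.
func visibleRange(scrollY, viewportH, rowH, count int) (first, last int) {
	rowH = max(rowH, 1) // guard against division by zero
	scrollY = max(scrollY, 0)
	viewportH = max(viewportH, 0)

	first = min(scrollY/rowH, count)
	// Round up so a partially visible bottom row is included.
	last = min((scrollY+viewportH+rowH-1)/rowH, count)
	last = max(last, first)
	return first, last
}

func main() {
	const rowH, viewportH, count = 20, 400, 1000
	for _, scrollY := range []int{0, 50, 19990} {
		first, last := visibleRange(scrollY, viewportH, rowH, count)
		fmt.Printf("scroll=%d rows=[%d, %d) built=%d of %d\n",
			scrollY, first, last, last-first, count)
	}
}
```

&lt;p&gt;With these numbers only about twenty row widgets exist at any time, regardless of how many items the list holds.&lt;/p&gt;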




&lt;h2&gt;
  
  
  What Needs Work (Honest Assessment)
&lt;/h2&gt;

&lt;p&gt;This is a v0.1.0 preview. Here is what we know needs improvement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance.&lt;/strong&gt; Complex widget trees (the full gallery) can feel sluggish. We have not done an optimization pass yet. Retained-mode rendering infrastructure is in place but not fully leveraged.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accessibility adapters.&lt;/strong&gt; The a11y tree and ARIA roles are defined, but there are no platform adapters connecting to screen readers. This is a significant gap for production use.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HiDPI.&lt;/strong&gt; DPI scaling is wired through the stack but has not been thoroughly tested on all platforms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation.&lt;/strong&gt; API docs exist, but guides and tutorials are sparse. The architecture document is 1200+ lines, but it is aimed at contributors, not users.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Draw callback boilerplate.&lt;/strong&gt; The current setup code for wiring gogpu + gg + ui is too verbose. We plan to provide higher-level convenience functions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Some event edge cases.&lt;/strong&gt; Drag cursors, scroll momentum, and overlay dismiss behavior have known rough spots.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform testing.&lt;/strong&gt; CI runs on Ubuntu, macOS, and Windows, but real-world testing has been primarily on Windows.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How to Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone and run the gallery example&lt;/span&gt;
git clone https://github.com/gogpu/ui.git
&lt;span class="nb"&gt;cd &lt;/span&gt;ui/examples/gallery
go run &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or add it to an existing project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/gogpu/ui@v0.1.0
go get github.com/gogpu/gogpu@latest
go get github.com/gogpu/gg@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;examples/&lt;/code&gt; directory has four working applications:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;th&gt;What it shows&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;examples/hello/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Checkboxes, radio buttons, ListView with 1000 items&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;examples/signals/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Reactive state management patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;examples/taskmanager/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Real-time charts, progress bars, simulated system metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;examples/gallery/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All 22 widgets, M3 theme with live seed color switching&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Requirements: Go 1.25+, a GPU (Vulkan, Metal, DX12, or OpenGL ES -- software fallback available). No C compiler needed.&lt;/p&gt;




&lt;h2&gt;
  
  
  The gogpu Ecosystem
&lt;/h2&gt;

&lt;p&gt;gogpu/ui sits at the top of a pure Go graphics stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;naga       -- shader compiler (WGSL to SPIR-V, MSL, GLSL)
wgpu       -- WebGPU Hardware Abstraction Layer (Vulkan, Metal, DX12, GLES, software)
gogpu      -- application framework (windowing, input, GPU context)
gg         -- 2D graphics engine (Canvas API, GPU text rendering, SDF acceleration)
ui         -- GUI toolkit (this project)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entire stack is ~300K lines of pure Go. No CGO, no Rust FFI, no C bindings. If Go can build for a platform, the full stack runs there.&lt;/p&gt;

&lt;p&gt;This release is part of a coordinated cascade of releases across the ecosystem. All libraries were updated to work through wgpu's new public Core API (previously HAL-only):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Library&lt;/th&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;What Changed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;naga&lt;/td&gt;
&lt;td&gt;v0.14.7&lt;/td&gt;
&lt;td&gt;Shader compiler stability fixes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;wgpu&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.21.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;New public API: Instance, Adapter, Device, Queue, Surface, CommandEncoder&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gpucontext&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.10.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Updated interfaces for new wgpu API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gogpu&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.24.1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Platform refactor, CharCallback for Unicode text input&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gg&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.37.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Migrated internal/gpu from HAL to wgpu public API, GPU RRect SDF clip&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ui&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.1.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;This release&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This was a significant engineering effort — wgpu moved from internal HAL types to a proper public API with validation, state tracking, and resource lifecycle management. Every layer above had to adapt.&lt;/p&gt;




&lt;h2&gt;
  
  
  We Want Your Feedback
&lt;/h2&gt;

&lt;p&gt;This release exists to start a conversation. We would genuinely appreciate input on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What widgets are missing&lt;/strong&gt; for your use case?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API ergonomics&lt;/strong&gt; -- do functional options feel right for Go? Is the signal system intuitive?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance expectations&lt;/strong&gt; -- what is acceptable for the apps you want to build?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design system coverage&lt;/strong&gt; -- are M3/Fluent/Cupertino the right targets? Missing painters?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt; -- what would help you get started?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Big Architecture Question: Monolith or Modular?
&lt;/h3&gt;

&lt;p&gt;There is one question we have been debating internally and would especially value community input on:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Should gogpu/ui stay as a single module, or should we extract shared primitives into a separate foundation?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Right now, &lt;code&gt;go get github.com/gogpu/ui&lt;/code&gt; pulls the entire toolkit — 56 packages, all widgets, all three design systems, all dependencies. If you only need &lt;code&gt;geometry.Rect&lt;/code&gt; or the layout engine or the signal system, you still get everything.&lt;/p&gt;

&lt;p&gt;We are considering extracting foundational packages into a separate &lt;code&gt;gogpu/uikit&lt;/code&gt; (or similar) module:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;Currently in &lt;code&gt;ui/&lt;/code&gt;
&lt;/th&gt;
&lt;th&gt;Could live in &lt;code&gt;uikit/&lt;/code&gt;
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Geometry types&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ui/geometry&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;uikit/geometry&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layout engines&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ui/layout&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;uikit/layout&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event types&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ui/event&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;uikit/event&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Widget base&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ui/widget&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;uikit/widget&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signal bindings&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ui/state&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;uikit/state&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Theme interfaces&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ui/theme&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;uikit/theme&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This would let the community:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Build custom widget libraries&lt;/strong&gt; on top of shared primitives without importing all of gogpu/ui&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create their own design systems&lt;/strong&gt; (themes + painters) by depending only on the interfaces&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extend our themes&lt;/strong&gt; with custom component tokens without forking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use our layout engine&lt;/strong&gt; (Flex, Grid, Stack) independently in their own UI frameworks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tradeoff: another module to version and maintain, another API boundary to keep stable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What do you think?&lt;/strong&gt; Is a modular foundation important for your use case, or is a single &lt;code&gt;go get&lt;/code&gt; simpler? We are genuinely unsure — the Go ecosystem has examples of both approaches working well. Your input will directly shape the v0.2.0 architecture.&lt;/p&gt;

&lt;p&gt;File issues on &lt;a href="https://github.com/gogpu/ui/issues" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, start a thread in &lt;a href="https://github.com/orgs/gogpu/discussions/18" rel="noopener noreferrer"&gt;Discussions&lt;/a&gt;, or just leave a comment here.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/gogpu/ui" rel="noopener noreferrer"&gt;github.com/gogpu/ui&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/gogpu/ui/releases/tag/v0.1.0" rel="noopener noreferrer"&gt;v0.1.0&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Getting Started:&lt;/strong&gt; &lt;a href="https://github.com/gogpu/ui/blob/main/docs/GETTING_STARTED.md" rel="noopener noreferrer"&gt;docs/GETTING_STARTED.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture:&lt;/strong&gt; &lt;a href="https://github.com/gogpu/ui/blob/main/docs/ARCHITECTURE.md" rel="noopener noreferrer"&gt;docs/ARCHITECTURE.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;gogpu ecosystem:&lt;/strong&gt; &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;github.com/gogpu&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thank you for reading. We look forward to hearing what you think.&lt;/p&gt;

</description>
      <category>go</category>
      <category>programming</category>
      <category>opensource</category>
      <category>architecture</category>
    </item>
    <item>
      <title>goffi: Zero-CGO Foreign Function Interface for Go — How We Call C Libraries Without a C Compiler</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Mon, 02 Mar 2026 09:10:35 +0000</pubDate>
      <link>https://forem.com/kolkov/goffi-zero-cgo-foreign-function-interface-for-go-how-we-call-c-libraries-without-a-c-compiler-ca5</link>
      <guid>https://forem.com/kolkov/goffi-zero-cgo-foreign-function-interface-for-go-how-we-call-c-libraries-without-a-c-compiler-ca5</guid>
      <description>&lt;p&gt;Every Go developer who has worked with C libraries knows the pain: CGO requires a C compiler, breaks cross-compilation, bloats binaries, and adds ~200ns overhead per call. For our &lt;a href="https://github.com/go-webgpu/webgpu" rel="noopener noreferrer"&gt;WebGPU bindings&lt;/a&gt; and &lt;a href="https://github.com/born-ml/born" rel="noopener noreferrer"&gt;ML framework&lt;/a&gt;, calling wgpu-native through CGO was a non-starter — we needed to ship a single static binary across Windows, Linux, and macOS without requiring users to install gcc.&lt;/p&gt;

&lt;p&gt;So we built &lt;strong&gt;&lt;a href="https://github.com/go-webgpu/goffi" rel="noopener noreferrer"&gt;goffi&lt;/a&gt;&lt;/strong&gt; — a pure Go FFI library that calls C functions through hand-written assembly, with zero C dependencies and zero per-call allocations. It now powers an entire ecosystem: &lt;a href="https://github.com/go-webgpu/webgpu" rel="noopener noreferrer"&gt;go-webgpu/webgpu&lt;/a&gt; bindings, &lt;a href="https://github.com/born-ml/born" rel="noopener noreferrer"&gt;born-ml/born&lt;/a&gt; ML framework, and the &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;GoGPU&lt;/a&gt; GPU computing platform with dual Rust and pure Go backends.&lt;/p&gt;

&lt;p&gt;This article explains the architecture, the hard problems we solved, how goffi compares to purego, and how you can use it in your own projects.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Our stack looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Go application (gogpu)
  └─ wgpu bindings (gogpu/wgpu)     ← needs to call C functions
       └─ goffi                      ← this library
            └─ wgpu-native (.dll/.so/.dylib)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need to call hundreds of WebGPU functions from Go: create GPU devices, submit command buffers, handle async callbacks from Metal/Vulkan threads. The requirements were clear:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No C compiler at build time&lt;/strong&gt; — users run &lt;code&gt;go get&lt;/code&gt; and it works&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-compilation&lt;/strong&gt; — &lt;code&gt;GOOS=linux GOARCH=arm64 go build&lt;/code&gt; must work from Windows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Callbacks from C threads&lt;/strong&gt; — wgpu-native fires callbacks from internal GPU threads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Struct passing&lt;/strong&gt; — WebGPU API passes structs by value (descriptors, extents, colors)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low overhead&lt;/strong&gt; — GPU command encoding happens at 60+ FPS&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;CGO fails requirements 1 and 2. purego covers 1-2 but had gaps in 3-5 when we started. So we built goffi.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture: 4 Layers Deep
&lt;/h2&gt;

&lt;p&gt;Every goffi call traverses four layers to safely transition from Go's managed runtime to raw C code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Go Code
  │  ffi.CallFunction()
  ▼
runtime.cgocall          ← Go runtime: switch to system stack, tell GC
  │
  ▼
Assembly Wrapper         ← Our code: load registers per ABI
  │  RDI=arg0 RSI=arg1 ... XMM0=float0 ...
  ▼
C Function               ← Target library
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Layer 1: The Call Interface (CIF)
&lt;/h3&gt;

&lt;p&gt;Unlike purego, whose &lt;code&gt;RegisterFunc&lt;/code&gt; wraps the target with &lt;code&gt;reflect.MakeFunc&lt;/code&gt; and marshals arguments through reflection on every call, goffi pre-computes everything once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Prepare once at init time&lt;/span&gt;
&lt;span class="n"&gt;cif&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CallInterface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PrepareCallInterface&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cif&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultCall&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UInt64TypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                            &lt;span class="c"&gt;// return: size_t&lt;/span&gt;
&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PointerTypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c"&gt;// arg: const char*&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Call many times — zero reflection, zero allocation&lt;/span&gt;
&lt;span class="c"&gt;// args = []unsafe.Pointer{unsafe.Pointer(&amp;amp;myPtr)} — pointers TO arg values&lt;/span&gt;
&lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CallFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cif&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;strlenPtr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;PrepareCallInterface&lt;/code&gt; classifies each argument (integer register? SSE register? stack?), computes stack layout, and stores everything in a reusable &lt;code&gt;CallInterface&lt;/code&gt; struct. The &lt;code&gt;cif.Flags&lt;/code&gt; bitmask tells the assembly exactly what to do — no decisions at call time.&lt;/p&gt;
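&lt;p&gt;To make the classification step concrete, here is a minimal sketch for System V AMD64 that handles only scalar integer and float arguments. The names &lt;code&gt;argKind&lt;/code&gt;, &lt;code&gt;location&lt;/code&gt;, and &lt;code&gt;classify&lt;/code&gt; are hypothetical and are not goffi's API; the real implementation also has to deal with structs, alignment, and the other ABIs.&lt;/p&gt;

```go
package main

import "fmt"

// argKind is a hypothetical, simplified stand-in for a type descriptor kind.
type argKind int

const (
	kindInt   argKind = iota // integers and pointers
	kindFloat                // float32 and float64
)

// location records where one argument travels under System V AMD64.
type location struct {
	class string // "gp", "sse" or "stack"
	index int    // register number or stack slot
}

const (
	maxGP  = 6 // RDI, RSI, RDX, RCX, R8, R9
	maxSSE = 8 // XMM0-XMM7
)

// classify assigns every argument a register or stack slot once, up front,
// so that nothing has to be decided at call time.
func classify(kinds []argKind) []location {
	locs := make([]location, 0, len(kinds))
	gp, sse, stack := 0, 0, 0
	for _, k := range kinds {
		loc := location{class: "stack", index: stack}
		switch k {
		case kindInt:
			if maxGP > gp {
				loc = location{class: "gp", index: gp}
				gp++
			}
		case kindFloat:
			if maxSSE > sse {
				loc = location{class: "sse", index: sse}
				sse++
			}
		}
		if loc.class == "stack" {
			stack++
		}
		locs = append(locs, loc)
	}
	return locs
}

func main() {
	// Seven integers and one float: the seventh integer spills to the stack.
	kinds := []argKind{kindInt, kindInt, kindInt, kindInt, kindInt, kindInt, kindInt, kindFloat}
	for i, loc := range classify(kinds) {
		fmt.Printf("arg %d: %s[%d]\n", i, loc.class, loc.index)
	}
}
```

&lt;p&gt;Because the result depends only on the signature, it can be computed once per function and reused for every call, which is the point of preparing a CIF.&lt;/p&gt;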

&lt;h3&gt;
  
  
  Layer 2: Platform Assembly
&lt;/h3&gt;

&lt;p&gt;We write assembly by hand for each ABI. Here's the core of our System V AMD64 implementation (&lt;code&gt;syscall_unix_amd64.s&lt;/code&gt;). The function receives a pointer to a &lt;code&gt;syscallArgs&lt;/code&gt; struct in DI, loads registers from it, calls the target, and writes return values back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// syscall_unix_amd64.s — System V AMD64 ABI
TEXT syscallN(SB), NOSPLIT|NOFRAME, $0
    PUSHQ BP
    MOVQ  SP, BP
    SUBQ  $STACK_SIZE, SP
    MOVQ  DI, R11             // R11 = args struct pointer

    // Load 8 SSE registers from struct offsets 128-184
    MOVQ 128(R11), X0         // XMM0
    MOVQ 136(R11), X1         // XMM1
    // ... XMM2-XMM7

    // Push stack-spill args (a7-a15) from struct offsets 56-120
    MOVQ 56(R11), R12
    MOVQ R12, 0(SP)           // stack slot 0

    // Load 6 GP registers from struct offsets 8-48
    MOVQ 8(R11), DI           // RDI = arg 1
    MOVQ 16(R11), SI          // RSI = arg 2
    MOVQ 24(R11), DX          // RDX = arg 3
    MOVQ 32(R11), CX          // RCX = arg 4
    MOVQ 40(R11), R8          // R8  = arg 5
    MOVQ 48(R11), R9          // R9  = arg 6

    MOVQ 0(R11), R10          // function pointer
    CALL R10

    // Save returns: RAX → r1, RDX → r2, XMM0 → f1
    MOVQ PTR_ADDRESS(BP), DI
    MOVQ AX, 192(DI)          // integer return
    MOVQ DX, 200(DI)          // second return (9-16B structs)
    MOVQ X0, 128(DI)          // float return
    // ... restore stack, RET
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We maintain separate assembly for three ABIs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;ABI&lt;/th&gt;
&lt;th&gt;GP Registers&lt;/th&gt;
&lt;th&gt;FP/SIMD Registers&lt;/th&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;System V AMD64&lt;/strong&gt; (Linux/macOS)&lt;/td&gt;
&lt;td&gt;RDI, RSI, RDX, RCX, R8, R9&lt;/td&gt;
&lt;td&gt;XMM0-XMM7&lt;/td&gt;
&lt;td&gt;16-byte aligned&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Win64&lt;/strong&gt; (Windows)&lt;/td&gt;
&lt;td&gt;RCX, RDX, R8, R9&lt;/td&gt;
&lt;td&gt;XMM0-XMM3&lt;/td&gt;
&lt;td&gt;32-byte shadow space&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;AAPCS64&lt;/strong&gt; (ARM64)&lt;/td&gt;
&lt;td&gt;X0-X7&lt;/td&gt;
&lt;td&gt;D0-D7&lt;/td&gt;
&lt;td&gt;16-byte aligned&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Layer 3: Struct Returns
&lt;/h3&gt;

&lt;p&gt;This is where it gets interesting. When a C function returns a struct by value, the ABI picks the return location from the struct's size. On System V AMD64, for structs made of integer fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&amp;lt;= 8 bytes&lt;/strong&gt;: returned in RAX&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;9-16 bytes&lt;/strong&gt;: split across RAX (low 8) + RDX (high 8)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&amp;gt; 16 bytes&lt;/strong&gt;: caller passes a hidden pointer as the first argument (sret)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our &lt;code&gt;handleReturn&lt;/code&gt; function assembles the result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StructType&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;cif&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReturnType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Size&lt;/span&gt;
&lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="m"&gt;8&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;rvalue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;retVal&lt;/span&gt;
&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="m"&gt;16&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="kt"&gt;uint64&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;rvalue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;retVal&lt;/span&gt;          &lt;span class="c"&gt;// RAX → bytes 0-7&lt;/span&gt;
&lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="m"&gt;8&lt;/span&gt;
&lt;span class="n"&gt;src&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;retVal2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;dst&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rvalue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;8&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nb"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="c"&gt;// RDX → bytes 8-15&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Layer 4: Callbacks (C → Go)
&lt;/h3&gt;

&lt;p&gt;WebGPU fires async callbacks from internal threads — Metal threads, Vulkan threads, threads goffi never created. These threads have no goroutine (G = nil), so calling Go code directly would crash.&lt;/p&gt;

&lt;p&gt;We solve this with Go's &lt;code&gt;crosscall2&lt;/code&gt; mechanism:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;C thread (wgpu-native internal)
  │  calls our trampoline (1 of 2000 pre-compiled entries)
  ▼
Assembly dispatcher
  │  saves registers, loads callback index
  ▼
crosscall2 → runtime.load_g → runtime.cgocallback
  │  sets up goroutine, switches to Go stack
  ▼
Your Go callback function
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
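&lt;p&gt;The Go side of such a scheme needs a table that maps a trampoline index back to the registered function. The sketch below shows one plausible shape for it; &lt;code&gt;registry&lt;/code&gt;, &lt;code&gt;register&lt;/code&gt;, and &lt;code&gt;dispatch&lt;/code&gt; are hypothetical names, the callback signature is simplified to a single argument, and the runtime transition through &lt;code&gt;crosscall2&lt;/code&gt; is left out entirely.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

const maxCallbacks = 2000 // one slot per pre-compiled trampoline

// callbackFn is a simplified, hypothetical callback signature.
type callbackFn func(arg uintptr)

var errTableFull = errors.New("callback table is full")

// registry maps a trampoline index to the Go function behind it.
type registry struct {
	mu    sync.Mutex
	slots [maxCallbacks]callbackFn
	next  int
}

// register stores fn in the next free slot and returns its index, which
// selects the trampoline whose address is handed to the C library.
func (r *registry) register(fn callbackFn) (int, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.next >= maxCallbacks {
		return 0, errTableFull
	}
	idx := r.next
	r.slots[idx] = fn
	r.next++
	return idx, nil
}

// dispatch models the last step of the diagram: once the runtime has set up
// a goroutine, the index loaded by the trampoline selects the function.
func (r *registry) dispatch(idx int, arg uintptr) {
	r.mu.Lock()
	fn := r.slots[idx]
	r.mu.Unlock()
	fn(arg)
}

func main() {
	var reg registry
	var got uintptr
	idx, err := reg.register(func(arg uintptr) { got = arg })
	if err != nil {
		fmt.Println("register failed:", err)
		return
	}
	reg.dispatch(idx, 42)
	fmt.Println("slot", idx, "received", got)
}
```

&lt;p&gt;A fixed-size table is the natural fit here: the trampolines are compiled ahead of time, so the number of live callbacks is capped by the number of entries.&lt;/p&gt;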



&lt;p&gt;On AMD64, each trampoline is a 5-byte &lt;code&gt;CALL&lt;/code&gt; instruction. On ARM64, each entry is 8 bytes — two 4-byte instructions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// ARM64 (callback_arm64.s) — 8 bytes per entry
MOVD $0, R12              // load callback index
B    ·callbackDispatcher  // branch (no link — preserves LR)
MOVD $1, R12
B    ·callbackDispatcher
// ... 2000 entries
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
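&lt;p&gt;Fixed-size entries reduce the mapping between an entry's address and its index to simple arithmetic. On ARM64 the index is loaded explicitly, as shown above. For a table of bare &lt;code&gt;CALL&lt;/code&gt; instructions, one known technique (used by the Go runtime's own Windows callback table) is to recover the index from the return address that &lt;code&gt;CALL&lt;/code&gt; pushes. Whether goffi does exactly this is an assumption; the sketch only illustrates the arithmetic, and the base address is made up.&lt;/p&gt;

```go
package main

import "fmt"

const (
	entries        = 2000 // trampolines in the table
	amd64EntrySize = 5    // one CALL rel32 instruction
	arm64EntrySize = 8    // MOVD + B, 4 bytes each
)

// entryAddr returns where trampoline index starts in a table at base.
func entryAddr(base uintptr, index, entrySize int) uintptr {
	return base + uintptr(index*entrySize)
}

// indexFromReturnAddr recovers the AMD64 index from the return address that
// CALL pushed. That address points at the next entry, hence the -1.
func indexFromReturnAddr(base, ret uintptr) int {
	return int(ret-base)/amd64EntrySize - 1
}

func main() {
	const base = uintptr(0x10000) // hypothetical table address

	fmt.Println("amd64 table bytes:", entries*amd64EntrySize)
	fmt.Println("arm64 table bytes:", entries*arm64EntrySize)

	call := entryAddr(base, 3, amd64EntrySize) // address of entry 3
	ret := call + amd64EntrySize               // what CALL pushes
	fmt.Println("recovered index:", indexFromReturnAddr(base, ret))
}
```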



&lt;p&gt;Usage is simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;cb&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="kt"&gt;uintptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="kt"&gt;uintptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ud&lt;/span&gt; &lt;span class="kt"&gt;uintptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="c"&gt;// This runs safely even when called from a C thread&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt;
&lt;span class="nb"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CallFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cif&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wgpuRequestAdapter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  goffi vs purego: Honest Comparison
&lt;/h2&gt;

&lt;p&gt;Both libraries are pure Go, no CGO. But they make different trade-offs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;goffi&lt;/th&gt;
&lt;th&gt;purego&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;libffi-style: prepare CIF once, call many times&lt;/td&gt;
&lt;td&gt;reflect-style: &lt;code&gt;RegisterFunc&lt;/code&gt;, closure per call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per-call cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Zero allocations (CIF reused)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;sync.Pool.Get()&lt;/code&gt; for syscall args&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Callback float returns&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Supported (asm writes XMM0)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;panic("unsupported return type")&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ARM64 HFA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Recursive struct walk (nested HFAs)&lt;/td&gt;
&lt;td&gt;Top-level fields only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Type system&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Explicit &lt;code&gt;TypeDescriptor&lt;/code&gt; (Size/Alignment/Kind)&lt;/td&gt;
&lt;td&gt;Go &lt;code&gt;reflect.Type&lt;/code&gt; introspection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ergonomics&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Raw — you manage &lt;code&gt;unsafe.Pointer&lt;/code&gt; yourself&lt;/td&gt;
&lt;td&gt;High-level — auto string null-termination, bool, slice&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Platforms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5 OS/arch targets (3 on amd64, 2 on arm64)&lt;/td&gt;
&lt;td&gt;9+ architectures&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CallFunctionContext(ctx, ...)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Typed errors&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5 error types with &lt;code&gt;errors.As()&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Generic errors&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Choose goffi when&lt;/strong&gt;: you need struct passing, zero per-call overhead, callback float returns, or you're building GPU/real-time bindings where every nanosecond counts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose purego when&lt;/strong&gt;: you need string auto-marshaling, broad architecture support (386, ppc64le, riscv64...), or quick one-off C library bindings with minimal boilerplate.&lt;/p&gt;

&lt;p&gt;We draw on both in gogpu: goffi for the hot-path WebGPU calls, and purego's patterns as a reference for platform edge cases.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Ecosystem: Where goffi Came From and Where It Led
&lt;/h2&gt;

&lt;p&gt;goffi wasn't built in isolation. It was born from a real need — and it enabled an entire ecosystem of pure Go GPU libraries.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Origin: Proprietary Roots
&lt;/h3&gt;

&lt;p&gt;goffi started as an internal tool. For over a year it lived inside a proprietary codebase — a GPU computing stack we built for our own projects. It worked well enough for us: a handful of platforms, a known set of functions, predictable usage patterns.&lt;/p&gt;

&lt;p&gt;In 2025, we decided to open-source everything. Not just goffi, but the entire ecosystem — WebGPU bindings, the ML framework, the shader compiler, the GPU platform. We believed the Go community needed a real alternative to CGO for native library bindings.&lt;/p&gt;

&lt;p&gt;What we didn't expect: &lt;strong&gt;the gap between "works for us internally" and "production-ready open source" was enormous.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our internal version handled our specific use cases. Open source means every use case. Users on platforms we never tested. Struct layouts we never considered. Calling conventions with edge cases we'd never hit. The list of things that "just worked" internally but broke in the wild was humbling:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ABI compliance&lt;/strong&gt; — our AMD64 assembly didn't handle struct returns &amp;gt;8 bytes correctly. Internally we never returned large structs by value. Open source users did, immediately. We had to implement RAX+RDX split returns and sret hidden pointers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ARM64&lt;/strong&gt; — we had AMD64 only. Open source meant Apple Silicon support was a day-one priority, not a nice-to-have.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Callbacks from C threads&lt;/strong&gt; — internally we controlled which threads called back into Go. In the wild, wgpu-native fires callbacks from Metal and Vulkan threads we never created. We had to integrate &lt;code&gt;crosscall2&lt;/code&gt; for proper C→Go transitions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error handling&lt;/strong&gt; — our internal code used generic errors. Open source users needed &lt;code&gt;errors.As()&lt;/code&gt; with typed errors to build robust applications. We added 5 error types.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Testing&lt;/strong&gt; — our internal coverage was ~40%. Getting to 89% meant writing hundreds of test cases for edge cases we'd never encountered ourselves.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt; — internally, we knew how the code worked. For open source, every assembly file needed comments explaining the ABI, every public function needed godoc, every platform quirk needed documentation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We essentially rebuilt goffi from scratch while keeping the core idea intact. The architecture is the same — CIF pre-computation, assembly dispatch, zero allocations — but the implementation is production-grade now, not prototype-grade.&lt;/p&gt;

&lt;h3&gt;
  
  
  go-webgpu/webgpu
&lt;/h3&gt;

&lt;p&gt;It started with &lt;a href="https://github.com/go-webgpu/webgpu" rel="noopener noreferrer"&gt;go-webgpu/webgpu&lt;/a&gt; — our zero-CGO WebGPU bindings for Go. We wanted to call &lt;a href="https://github.com/gfx-rs/wgpu-native" rel="noopener noreferrer"&gt;wgpu-native&lt;/a&gt; (Rust-based Vulkan/Metal/DX12 backend) from Go without requiring a C compiler. Every existing approach had a deal-breaker:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CGO&lt;/strong&gt;: requires gcc, breaks &lt;code&gt;go get&lt;/code&gt;, no cross-compilation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;purego&lt;/strong&gt;: at the time, no struct passing, no callback float returns, no HFA support — things WebGPU needs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we built goffi as the FFI layer for go-webgpu/webgpu. The bindings wrap 180+ wgpu-native functions — device creation, buffer allocation, render passes, compute dispatches, async adapter requests.&lt;/p&gt;

&lt;h3&gt;
  
  
  born-ml: Machine Learning on GPU
&lt;/h3&gt;

&lt;p&gt;The second consumer was &lt;a href="https://github.com/born-ml/born" rel="noopener noreferrer"&gt;born-ml/born&lt;/a&gt;, a production-ready ML framework for Go with a PyTorch-like API. born needs GPU compute for tensor operations: matrix multiplication, convolution, automatic differentiation. The WebGPU compute pipeline powered by goffi gives born GPU acceleration while shipping as a single static binary.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;born (ML framework)
  └─ go-webgpu/webgpu (WebGPU bindings)
       └─ goffi (FFI layer)
            └─ wgpu-native (Vulkan/Metal/DX12)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This stack lets you &lt;code&gt;go get github.com/born-ml/born&lt;/code&gt;, write a neural network, and run it on GPU — no Python, no CUDA, no C compiler.&lt;/p&gt;

&lt;h3&gt;
  
  
  GoGPU: The Full Ecosystem
&lt;/h3&gt;

&lt;p&gt;As the projects matured, we realized we could go further. &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;GoGPU&lt;/a&gt; grew into a complete GPU computing ecosystem with &lt;strong&gt;dual backends&lt;/strong&gt; — a high-performance Rust backend (wgpu-native via goffi) and a &lt;strong&gt;pure Go&lt;/strong&gt; backend:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;th&gt;Uses goffi&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gogpu" rel="noopener noreferrer"&gt;&lt;strong&gt;gogpu/gogpu&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;GPU framework — windowing, input, event loop, dual backends (Rust wgpu-native or Pure Go)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/wgpu" rel="noopener noreferrer"&gt;&lt;strong&gt;gogpu/wgpu&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;WebGPU implementation in pure Go — calls Vulkan, Metal, EGL/GLES natively via goffi&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/naga" rel="noopener noreferrer"&gt;&lt;strong&gt;gogpu/naga&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Shader compiler in pure Go — WGSL to SPIR-V, MSL, GLSL, HLSL&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;&lt;strong&gt;gogpu/gg&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2D graphics library — SDF rendering, MSDF text, Vello compute pipeline&lt;/td&gt;
&lt;td&gt;Indirect&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gpucontext" rel="noopener noreferrer"&gt;&lt;strong&gt;gogpu/gpucontext&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Shared interfaces for GPU context, windowing, and surface creation&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both &lt;code&gt;gogpu/gogpu&lt;/code&gt; and &lt;code&gt;gogpu/wgpu&lt;/code&gt; depend directly on goffi. The "pure Go" backend (&lt;code&gt;gogpu/wgpu&lt;/code&gt;) is pure Go in the sense of zero CGO — no C compiler needed — but it still calls native Vulkan, Metal, and EGL APIs through goffi. That's the whole point: goffi replaces CGO, not the native graphics drivers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-World Performance
&lt;/h3&gt;

&lt;p&gt;At 60 FPS, a typical frame makes ~30-50 FFI calls through goffi:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Frame budget:            16.6 ms
GPU work:                ~15 ms
FFI overhead (50 calls): 50 × 100ns = 5 us = 0.03%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At roughly 0.03% of the frame budget, the FFI overhead disappears into profiling noise.&lt;/p&gt;

&lt;h3&gt;
  
  
  Callback-Heavy Async APIs
&lt;/h3&gt;

&lt;p&gt;WebGPU is heavily async: device creation, adapter requests, and buffer mapping are all callback-based.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Request GPU adapter (async) — simplified pattern&lt;/span&gt;
&lt;span class="n"&gt;cb&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="kt"&gt;uintptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="kt"&gt;uintptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ud&lt;/span&gt; &lt;span class="kt"&gt;uintptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{}{}&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CallFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;requestAdapterCIF&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wgpuInstanceRequestAdapter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;userdata&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt; &lt;span class="c"&gt;// Wait for GPU driver callback&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works even when wgpu-native fires the callback from an internal Metal/Vulkan thread, thanks to our &lt;code&gt;crosscall2&lt;/code&gt; integration.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Use goffi in Your Project
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/go-webgpu/goffi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Minimal Example: Calling strlen
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"runtime"&lt;/span&gt;
    &lt;span class="s"&gt;"unsafe"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/go-webgpu/goffi/ffi"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/go-webgpu/goffi/types"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// 1. Load library&lt;/span&gt;
    &lt;span class="n"&gt;libName&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"libc.so.6"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GOOS&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"windows"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;libName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"msvcrt.dll"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
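    &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LoadLibrary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;libName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// library not found on this system&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FreeLibrary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetSymbol&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"strlen"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// symbol missing from the library&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;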

    &lt;span class="c"&gt;// 2. Prepare call interface (once)&lt;/span&gt;
    &lt;span class="n"&gt;cif&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CallInterface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PrepareCallInterface&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cif&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultCall&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UInt64TypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PointerTypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// 3. Call (many times; the CIF is reused, so there is no per-call setup)&lt;/span&gt;
    &lt;span class="n"&gt;str&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"Hello, goffi!&lt;/span&gt;&lt;span class="se"&gt;\x00&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;strPtr&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kt"&gt;uintptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StringData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;length&lt;/span&gt; &lt;span class="kt"&gt;uint64&lt;/span&gt;

    &lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CallFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cif&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;strPtr&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"strlen = %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// 13&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Passing Structs
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Define struct layout matching C struct&lt;/span&gt;
&lt;span class="n"&gt;pointType&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Size&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="m"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;               &lt;span class="c"&gt;// sizeof(Point)&lt;/span&gt;
    &lt;span class="n"&gt;Alignment&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                &lt;span class="c"&gt;// alignof(Point)&lt;/span&gt;
    &lt;span class="n"&gt;Kind&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StructType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Members&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoubleTypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c"&gt;// x&lt;/span&gt;
        &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoubleTypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c"&gt;// y&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Use in CIF&lt;/span&gt;
&lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PrepareCallInterface&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cif&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultCall&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoubleTypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c"&gt;// returns double (distance)&lt;/span&gt;
    &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeDescriptor&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pointType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pointType&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c"&gt;// two Point args&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Registering Callbacks
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;cb&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eventType&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="kt"&gt;uintptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Event %d received&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eventType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c"&gt;// Pass cb (uintptr) to C function expecting a function pointer&lt;/span&gt;
&lt;span class="n"&gt;ffi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CallFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cif&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;registerCallback&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Hard Lessons
&lt;/h2&gt;

&lt;p&gt;Building a production FFI taught us things no documentation covers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Stack alignment kills silently.&lt;/strong&gt; A stack pointer that is off the 16-byte boundary the AMD64 calling conventions require at a call causes SIGSEGV, but only sometimes: it depends on whether the callee uses alignment-sensitive SSE instructions. We spent days debugging crashes that only reproduced under specific GPU driver versions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Windows shadow space is non-negotiable.&lt;/strong&gt; Win64 ABI requires 32 bytes of "shadow space" on every call, even if the function takes zero arguments. Miss it and the callee corrupts your stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. ARM64 HFA rules are recursive.&lt;/strong&gt; A struct containing a struct containing 4 floats is still an HFA (Homogeneous Floating-point Aggregate) and must be passed in consecutive floating-point registers (S0-S3 for &lt;code&gt;float32&lt;/code&gt; members, D0-D3 for &lt;code&gt;float64&lt;/code&gt;). purego only checks top-level fields; we had to walk the full type tree.&lt;/p&gt;
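&lt;p&gt;The recursive walk can be sketched as follows. This is an illustration under stated assumptions, not goffi's implementation: &lt;code&gt;kind&lt;/code&gt;, &lt;code&gt;typeDesc&lt;/code&gt;, &lt;code&gt;flatten&lt;/code&gt;, and &lt;code&gt;isHFA&lt;/code&gt; are hypothetical names, and arrays and empty structs are left out for brevity. The rule it encodes is the one described above: flatten nested structs, then require one to four leaves that all share the same floating-point kind.&lt;/p&gt;

```go
package main

import "fmt"

// kind and typeDesc are hypothetical stand-ins for this sketch;
// they are not goffi's type descriptors.
type kind int

const (
	kindInt kind = iota
	kindFloat32
	kindFloat64
	kindStruct
)

type typeDesc struct {
	kind    kind
	members []typeDesc // only meaningful when kind == kindStruct
}

// flatten appends the kinds of all leaf (non-struct) members,
// recursing into nested structs.
func flatten(t typeDesc, out []kind) []kind {
	if t.kind != kindStruct {
		return append(out, t.kind)
	}
	for _, m := range t.members {
		out = flatten(m, out)
	}
	return out
}

// isHFA reports whether t flattens to 1..4 leaves of one floating-point kind.
func isHFA(t typeDesc) bool {
	if t.kind != kindStruct {
		return false
	}
	leaves := flatten(t, nil)
	if len(leaves) == 0 || len(leaves) > 4 {
		return false
	}
	first := leaves[0]
	switch first {
	case kindFloat32, kindFloat64:
		// candidate base type
	default:
		return false
	}
	for _, k := range leaves {
		if k != first {
			return false
		}
	}
	return true
}

func main() {
	f32 := typeDesc{kind: kindFloat32}
	vec4 := typeDesc{kind: kindStruct, members: []typeDesc{f32, f32, f32, f32}}
	wrapped := typeDesc{kind: kindStruct, members: []typeDesc{vec4}}
	mixed := typeDesc{kind: kindStruct, members: []typeDesc{f32, {kind: kindInt}}}

	fmt.Println(isHFA(vec4), isHFA(wrapped), isHFA(mixed)) // true true false
}
```

&lt;p&gt;A check that only inspected the top-level members of &lt;code&gt;wrapped&lt;/code&gt; would see a single struct member and miss the four floats inside it.&lt;/p&gt;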

&lt;p&gt;&lt;strong&gt;4. C threads have no goroutine.&lt;/strong&gt; When wgpu-native calls your callback from an internal Metal thread, &lt;code&gt;getg()&lt;/code&gt; returns nil. You must go through &lt;code&gt;crosscall2 → load_g → cgocallback&lt;/code&gt; or the runtime panics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. &lt;code&gt;float32&lt;/code&gt; encoding matters.&lt;/strong&gt; On Windows, &lt;code&gt;syscall.SyscallN&lt;/code&gt; passes args as &lt;code&gt;uintptr&lt;/code&gt;. Widening &lt;code&gt;float32&lt;/code&gt; to &lt;code&gt;float64&lt;/code&gt; then stuffing into a register corrupts the bit pattern — you need &lt;code&gt;math.Float32bits&lt;/code&gt; to preserve the exact IEEE-754 representation.&lt;/p&gt;
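&lt;p&gt;The difference in lesson 5 is easy to see by printing the two bit patterns. A minimal sketch using only the standard library: &lt;code&gt;packFloat32&lt;/code&gt; and &lt;code&gt;packWidened&lt;/code&gt; are hypothetical helper names for this illustration, not goffi functions.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"math"
)

// packFloat32 returns the 32-bit IEEE-754 pattern of f, zero-extended to a
// register-sized word. Hypothetical helper for illustration.
func packFloat32(f float32) uintptr {
	return uintptr(math.Float32bits(f))
}

// packWidened shows the mistake: after widening, the word carries a float64
// bit pattern, which a callee expecting a C float will misread.
func packWidened(f float32) uint64 {
	return math.Float64bits(float64(f))
}

func main() {
	f := float32(1.5)
	fmt.Printf("float32 bits: %#x\n", packFloat32(f)) // 0x3fc00000
	fmt.Printf("widened bits: %#x\n", packWidened(f)) // 0x3ff8000000000000
}
```

&lt;p&gt;The low 32 bits of the widened value are all zero here, so a callee reading a C &lt;code&gt;float&lt;/code&gt; from that word would see 0.0 instead of 1.5.&lt;/p&gt;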




&lt;h2&gt;
  
  
  Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FFI overhead&lt;/td&gt;
&lt;td&gt;88-114 ns/op&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test coverage&lt;/td&gt;
&lt;td&gt;89%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Platforms&lt;/td&gt;
&lt;td&gt;5 (Win/Linux/macOS x AMD64 + Linux/macOS x ARM64)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Assembly files&lt;/td&gt;
&lt;td&gt;17 files, ~900 lines of logic + 6200 lines of trampoline entries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Callback slots&lt;/td&gt;
&lt;td&gt;2000 per process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dependencies&lt;/td&gt;
&lt;td&gt;0 (only Go stdlib)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CGO required&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What About Go 1.26 CGO Improvements?
&lt;/h2&gt;

&lt;p&gt;Go 1.26 (released February 2026) &lt;a href="https://go.dev/doc/go1.26" rel="noopener noreferrer"&gt;reduced cgo call overhead by ~30%&lt;/a&gt; by removing the dedicated syscall P state. &lt;a href="https://gist.github.com/DeedleFake/2f50b02c0708484c66d18253302c4fd6" rel="noopener noreferrer"&gt;Benchmarks on Apple M1&lt;/a&gt; show &lt;code&gt;CgoCall&lt;/code&gt; is 33% faster, &lt;code&gt;CgoCallWithCallback&lt;/code&gt; is 21% faster.&lt;/p&gt;

&lt;p&gt;This is great news — but it doesn't change our calculus:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CGO still requires a C compiler&lt;/strong&gt; at build time. Our users &lt;code&gt;go get&lt;/code&gt; and ship.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-compilation&lt;/strong&gt; with CGO still requires cross-toolchains. &lt;code&gt;GOOS=linux GOARCH=arm64 go build&lt;/code&gt; just works with goffi.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static binaries&lt;/strong&gt; — CGO often pulls in libc. goffi produces fully static Go binaries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Go 1.26 also benefits goffi&lt;/strong&gt; — our &lt;code&gt;runtime.cgocall&lt;/code&gt; path gets the same 30% speedup, because goffi uses the same runtime machinery internally.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The gap between CGO and pure-Go FFI is narrowing from both directions. We welcome it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;v0.5.0&lt;/strong&gt; is focused on usability:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Variadic function support (&lt;code&gt;printf&lt;/code&gt;, &lt;code&gt;sprintf&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Builder pattern API for less boilerplate&lt;/li&gt;
&lt;li&gt;Platform-specific struct alignment (Windows &lt;code&gt;#pragma pack&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;v1.0.0&lt;/strong&gt; targets API stability with SemVer guarantees, security audit, and published benchmarks vs CGO/purego.&lt;/p&gt;

&lt;p&gt;The long-term goal: make GPU programming in Go as natural as it is in Rust or C++, with the ergonomics Go developers expect — &lt;code&gt;go get&lt;/code&gt;, &lt;code&gt;go build&lt;/code&gt;, done.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;goffi (FFI layer):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/go-webgpu/goffi" rel="noopener noreferrer"&gt;github.com/go-webgpu/goffi&lt;/a&gt; — the library this article is about&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pkg.go.dev/github.com/go-webgpu/goffi" rel="noopener noreferrer"&gt;pkg.go.dev/github.com/go-webgpu/goffi&lt;/a&gt; — Go documentation&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/go-webgpu/goffi/blob/main/docs/PERFORMANCE.md" rel="noopener noreferrer"&gt;Performance guide&lt;/a&gt; — benchmarks, optimization strategies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Projects built on goffi:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/go-webgpu/webgpu" rel="noopener noreferrer"&gt;go-webgpu/webgpu&lt;/a&gt; — zero-CGO WebGPU bindings (wgpu-native)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/born-ml/born" rel="noopener noreferrer"&gt;born-ml/born&lt;/a&gt; — ML framework for Go, GPU-accelerated, PyTorch-like API&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;GoGPU ecosystem (pure Go GPU):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/gogpu" rel="noopener noreferrer"&gt;gogpu/gogpu&lt;/a&gt; — GPU framework, dual backends (Rust + Pure Go)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/wgpu" rel="noopener noreferrer"&gt;gogpu/wgpu&lt;/a&gt; — WebGPU in pure Go (Vulkan, Metal, DX12, GLES, Software)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/naga" rel="noopener noreferrer"&gt;gogpu/naga&lt;/a&gt; — shader compiler in pure Go (WGSL to SPIR-V/MSL/GLSL/HLSL)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;gogpu/gg&lt;/a&gt; — 2D graphics library (SDF, MSDF text, Vello compute)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Acknowledgments
&lt;/h2&gt;

&lt;p&gt;goffi wouldn't exist without &lt;a href="https://github.com/ebitengine/purego" rel="noopener noreferrer"&gt;purego&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;When we first faced the CGO problem, the conventional wisdom was simple: "you can't call C from Go without a C compiler." purego proved that wrong. The &lt;a href="https://github.com/hajimehoshi/ebiten" rel="noopener noreferrer"&gt;ebitengine&lt;/a&gt; team, and specifically &lt;a href="https://github.com/AJ" rel="noopener noreferrer"&gt;@AJ&lt;/a&gt; and &lt;a href="https://github.com/TotallyGamerJet" rel="noopener noreferrer"&gt;@TotallyGamerJet&lt;/a&gt;, demonstrated that &lt;code&gt;runtime.cgocall&lt;/code&gt;, &lt;code&gt;cgo_import_dynamic&lt;/code&gt;, and hand-written assembly could replace CGO entirely. They showed the community that pure Go FFI was not just theoretically possible, but practical enough to ship a production game engine.&lt;/p&gt;

&lt;p&gt;We studied purego's source code extensively. The &lt;code&gt;crosscall2&lt;/code&gt; callback mechanism, the &lt;code&gt;fakecgo&lt;/code&gt; approach, the assembly trampoline pattern — purego pioneered all of these in the Go ecosystem. Without that foundation to learn from, goffi would have taken years longer to build, if we'd attempted it at all.&lt;/p&gt;

&lt;p&gt;goffi took a different path — libffi-style CIF pre-computation instead of reflect-based dispatch, explicit type descriptors instead of Go type introspection, struct passing and callback float returns for GPU workloads — but the path only existed because purego cleared it first.&lt;/p&gt;

&lt;p&gt;To the purego maintainers: thank you for proving it was possible. The entire pure-Go FFI ecosystem stands on your work.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;goffi is MIT-licensed and open to contributions. If you're building Go bindings for C libraries and want zero-CGO with full ABI compliance — give it a try and let us know how it goes.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>opensource</category>
      <category>programming</category>
      <category>performance</category>
    </item>
    <item>
      <title>Porting Vello's GPU Tile Rasterizer to Pure Go</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Sat, 28 Feb 2026 00:46:44 +0000</pubDate>
      <link>https://forem.com/kolkov/porting-vellos-gpu-tile-rasterizer-to-pure-go-7i8</link>
      <guid>https://forem.com/kolkov/porting-vellos-gpu-tile-rasterizer-to-pure-go-7i8</guid>
      <description>&lt;p&gt;When you call &lt;code&gt;dc.DrawCircle(400, 300, 100)&lt;/code&gt; in &lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;gogpu/gg&lt;/a&gt;, what happens under the hood? A &lt;strong&gt;tile-based rasterization pipeline&lt;/strong&gt; — a direct port of &lt;a href="https://github.com/linebender/vello" rel="noopener noreferrer"&gt;Vello&lt;/a&gt;'s GPU compute pipeline from the &lt;a href="https://linebender.org/" rel="noopener noreferrer"&gt;linebender&lt;/a&gt; team — converts your vector paths into per-pixel coverage values. It runs on &lt;strong&gt;both CPU and GPU&lt;/strong&gt;, and it's written entirely in Pure Go.&lt;/p&gt;

&lt;p&gt;This article walks through the internals of &lt;code&gt;tilecompute&lt;/code&gt; — a &lt;strong&gt;6,700-line&lt;/strong&gt; dual-execution port of Vello's original GPU compute rasterizer (circa 2020–2023) into Pure Go, with a CPU reference implementation and GPU compute dispatch via 9 WGSL shaders.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Just Use Scanline?
&lt;/h2&gt;

&lt;p&gt;Traditional scanline rasterizers process one row of pixels at a time. They work, but they have a fundamental scalability problem: for a 4K canvas (3840x2160), you're iterating over &lt;strong&gt;8.3 million pixels&lt;/strong&gt; regardless of how simple your shapes are.&lt;/p&gt;

&lt;p&gt;Tile-based rasterizers flip this around. They divide the canvas into small tiles (16x16 pixels in our case) and only process tiles that the vector path actually touches. A circle in the corner of a 4K canvas? You process maybe 40 tiles instead of 8 million pixels.&lt;/p&gt;
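&lt;p&gt;To get a feel for the numbers, here is a rough sketch that counts 16x16 tiles overlapped by a pixel-space bounding box. &lt;code&gt;tilesTouched&lt;/code&gt; is a hypothetical helper for this illustration; the real pipeline counts tiles per line segment rather than per bounding box, so treat the result as an upper bound.&lt;/p&gt;

```go
package main

import "fmt"

const tileSize = 16 // tile edge in pixels

// tilesTouched counts the tiles overlapped by the half-open pixel box
// [x0,x1) x [y0,y1). Coordinates are assumed non-negative and the box
// non-empty. Hypothetical helper, not part of gogpu/gg.
func tilesTouched(x0, y0, x1, y1 int) int {
	cols := (x1-1)/tileSize - x0/tileSize + 1
	rows := (y1-1)/tileSize - y0/tileSize + 1
	return cols * rows
}

func main() {
	w, h := 3840, 2160
	fmt.Println("pixels on a 4K canvas:", w*h)                      // 8294400
	fmt.Println("tiles on a 4K canvas:", tilesTouched(0, 0, w, h))  // 32400
	fmt.Println("tiles under a 100x100 box:", tilesTouched(0, 0, 100, 100)) // 49
}
```

&lt;p&gt;A small shape leaves the vast majority of those 32,400 tiles untouched, which is where the savings over a full-canvas pass come from.&lt;/p&gt;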

&lt;p&gt;&lt;a href="https://github.com/linebender/vello" rel="noopener noreferrer"&gt;Vello&lt;/a&gt; — created by &lt;a href="https://raphlinus.github.io/" rel="noopener noreferrer"&gt;Raph Levien&lt;/a&gt; and the &lt;a href="https://linebender.org/" rel="noopener noreferrer"&gt;linebender&lt;/a&gt; team — pioneered this approach using GPU compute shaders (originally as &lt;a href="https://github.com/raphlinus/piet-gpu" rel="noopener noreferrer"&gt;piet-gpu&lt;/a&gt; in 2020, renamed to Vello in 2022). We ported its GPU compute pipeline to Pure Go for use as the rasterization core of gogpu/gg.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This article covers the &lt;strong&gt;original GPU compute pipeline&lt;/strong&gt; (16x16 tiles, 2020–2023). We've also ported Vello's &lt;strong&gt;newer&lt;/strong&gt; &lt;a href="https://github.com/linebender/vello/issues/670" rel="noopener noreferrer"&gt;Sparse Strips&lt;/a&gt; algorithm (4x4 tiles, August 2024) — see the Two Rasterizers, Two Targets section below.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The GPU Compute Pipeline
&lt;/h2&gt;

&lt;p&gt;Vello's original rasterizer (2020–2023) uses a &lt;strong&gt;tile-based GPU compute&lt;/strong&gt; approach with 16x16 pixel tiles. The key insight: instead of asking "which pixels does this path cover?", ask "which tiles does each line segment cross, and what's the winding contribution?"&lt;/p&gt;

&lt;p&gt;The full pipeline has &lt;strong&gt;9 GPU compute stages&lt;/strong&gt; (matching our 9 WGSL shaders), plus scene encoding and curve flattening on the CPU:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Vector Paths (cubics, arcs, lines)
    ↓
Scene Encoding (CPU — pack paths, transforms, styles)
    ↓
 1. PathTag Reduce    ─┐
 2. PathTag Scan      ─┘ monoid prefix sums over path structure
    ↓
Flatten (Euler spiral → line segments)
    ↓
 3. Draw Reduce       ─┐
 4. Draw Leaf Scan    ─┘ monoid prefix sums over draw commands
    ↓
 5. Path Count        (DDA walk — which tiles does each segment cross?)
    ↓
 6. Backdrop          (prefix sum — left-to-right winding accumulation)
    ↓
 7. Coarse            (generate Per-Tile Command Lists)
    ↓
 8. Path Tiling       (clip segments to tile boundaries, compute yEdge)
    ↓
 9. Fine              (per-pixel analytic anti-aliased coverage)
    ↓
Per-pixel RGBA output
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first four stages use &lt;strong&gt;monoid prefix sums&lt;/strong&gt; — Vello's core parallelism primitive. A monoid is an associative operation with an identity element; prefix sums over monoids can be computed in O(log n) parallel steps on a GPU, turning what would be sequential parsing into massively parallel work.&lt;/p&gt;

&lt;p&gt;Let's walk through the key stages.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stages 1–4: Monoid Prefix Sums
&lt;/h2&gt;

&lt;p&gt;Stages 1–4 parse the encoded scene in parallel. Each path tag and draw tag is mapped to a monoid value (a struct with an associative combine operation and an identity element), and those values are then scanned to produce cumulative state for every element:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// PathMonoid — accumulated state from scanning path tags&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;PathMonoid&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;TransIx&lt;/span&gt;       &lt;span class="kt"&gt;uint32&lt;/span&gt; &lt;span class="c"&gt;// Transform index&lt;/span&gt;
&lt;span class="n"&gt;PathSegIx&lt;/span&gt;     &lt;span class="kt"&gt;uint32&lt;/span&gt; &lt;span class="c"&gt;// Path segment index&lt;/span&gt;
&lt;span class="n"&gt;PathSegOffset&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt; &lt;span class="c"&gt;// Offset into path segment data&lt;/span&gt;
&lt;span class="n"&gt;StyleIx&lt;/span&gt;       &lt;span class="kt"&gt;uint32&lt;/span&gt; &lt;span class="c"&gt;// Style index&lt;/span&gt;
&lt;span class="n"&gt;PathIx&lt;/span&gt;        &lt;span class="kt"&gt;uint32&lt;/span&gt; &lt;span class="c"&gt;// Path index&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// DrawMonoid — accumulated state from scanning draw tags&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;DrawMonoid&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;PathIx&lt;/span&gt;      &lt;span class="kt"&gt;uint32&lt;/span&gt; &lt;span class="c"&gt;// Which path this draw belongs to&lt;/span&gt;
&lt;span class="n"&gt;ClipIx&lt;/span&gt;      &lt;span class="kt"&gt;uint32&lt;/span&gt; &lt;span class="c"&gt;// Clip stack depth&lt;/span&gt;
&lt;span class="n"&gt;SceneOffset&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt; &lt;span class="c"&gt;// Offset into scene data&lt;/span&gt;
&lt;span class="n"&gt;InfoOffset&lt;/span&gt;  &lt;span class="kt"&gt;uint32&lt;/span&gt; &lt;span class="c"&gt;// Offset into info buffer&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the GPU, reduce + scan runs in O(log n) parallel steps in the first four of our 9 WGSL compute shaders, dispatched through &lt;code&gt;VelloComputeDispatcher&lt;/code&gt;. On the CPU, it's a sequential scan. The data structures are identical on both sides, so the CPU code doubles as a reference implementation for validating GPU output.&lt;/p&gt;
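
&lt;p&gt;The reduce-then-scan split can be sketched on the CPU. The example below uses a deliberately simplified, hypothetical &lt;code&gt;TagMonoid&lt;/code&gt; with two counters (the real &lt;code&gt;PathMonoid&lt;/code&gt; carries more fields), and the function names are assumptions made for this illustration. &lt;code&gt;ScanChunked&lt;/code&gt; processes chunks one after another; on a GPU each chunk would be handled by its own workgroup.&lt;/p&gt;

```go
package main

import "fmt"

// TagMonoid is a simplified, hypothetical stand-in for PathMonoid:
// it only counts the path segments and paths seen so far.
type TagMonoid struct {
	PathSegIx uint32
	PathIx    uint32
}

// Combine is the associative operation; the zero value is the identity.
func (a TagMonoid) Combine(b TagMonoid) TagMonoid {
	return TagMonoid{PathSegIx: a.PathSegIx + b.PathSegIx, PathIx: a.PathIx + b.PathIx}
}

// Reduce folds one chunk into a single monoid value.
func Reduce(chunk []TagMonoid) TagMonoid {
	var acc TagMonoid
	for _, m := range chunk {
		acc = acc.Combine(m)
	}
	return acc
}

// ScanSequential is the CPU-style exclusive prefix sum.
func ScanSequential(in []TagMonoid) []TagMonoid {
	out := make([]TagMonoid, len(in))
	var acc TagMonoid
	for i, m := range in {
		out[i] = acc
		acc = acc.Combine(m)
	}
	return out
}

// ScanChunked mirrors the two-phase split: reduce every chunk (independent
// work), scan the per-chunk totals, then scan inside each chunk starting
// from that chunk's base value. chunkSize must be positive.
func ScanChunked(in []TagMonoid, chunkSize int) []TagMonoid {
	var totals []TagMonoid
	for rest := in; len(rest) > 0; {
		n := chunkSize
		if n > len(rest) {
			n = len(rest)
		}
		totals = append(totals, Reduce(rest[:n]))
		rest = rest[n:]
	}
	bases := ScanSequential(totals)

	out := make([]TagMonoid, 0, len(in))
	rest := in
	for _, base := range bases {
		n := chunkSize
		if n > len(rest) {
			n = len(rest)
		}
		acc := base
		for _, m := range rest[:n] {
			out = append(out, acc)
			acc = acc.Combine(m)
		}
		rest = rest[n:]
	}
	return out
}

func main() {
	tags := []TagMonoid{{1, 0}, {1, 0}, {1, 1}, {1, 0}, {1, 1}, {1, 0}, {1, 0}}
	fmt.Println(ScanSequential(tags))
	fmt.Println(ScanChunked(tags, 3))
}
```

&lt;p&gt;Because &lt;code&gt;Combine&lt;/code&gt; is associative, both functions produce the same output regardless of chunk size. That property is what lets the chunks be processed independently.&lt;/p&gt;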

&lt;h2&gt;
  
  
  Stage 5: Path Count — DDA Tile Walk
&lt;/h2&gt;

&lt;p&gt;The path count stage answers: "which tiles does each line segment cross?"&lt;/p&gt;

&lt;p&gt;We use a &lt;strong&gt;Digital Differential Analyzer&lt;/strong&gt; (DDA) — an algorithm that traces a line through a grid, visiting every cell the line passes through. For each tile visited, we record two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Segment count&lt;/strong&gt; — how many line segments cross this tile&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backdrop&lt;/strong&gt; — the signed winding contribution at the tile's left edge
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// pathCountMain — DDA walk through the tile grid&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;pathCountMain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bump&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;BumpAllocators&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;LineSoup&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;paths&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tile&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;Tile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;segCounts&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;SegmentCount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;lineIx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;lineIx&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;bump&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lines&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;lineIx&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;lineIx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vec2FromArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;p1&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vec2FromArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Sort by Y for consistent DDA walking&lt;/span&gt;
&lt;span class="n"&gt;isDown&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;xy0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xy1&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;isDown&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;xy0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xy1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p1&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;xy0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xy1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p0&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Scale to tile coordinates (pixel / 16)&lt;/span&gt;
&lt;span class="n"&gt;s0&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;xy0&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tileScale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;s1&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;xy1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tileScale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// DDA walk: trace the line through tile grid cells&lt;/span&gt;
&lt;span class="c"&gt;// counting X crossings and Y crossings separately&lt;/span&gt;
&lt;span class="n"&gt;dx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;abs32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;s0&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dy&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;s0&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;
&lt;span class="n"&gt;idxdy&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;dy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;dx&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;idxdy&lt;/span&gt;

&lt;span class="c"&gt;// ... walk through tiles, update backdrop and segment counts&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;backdrop&lt;/strong&gt; is the critical concept. When a line segment crosses a tile's left edge, it contributes a signed delta to the winding number: &lt;code&gt;-1&lt;/code&gt; for downward segments, &lt;code&gt;+1&lt;/code&gt; for upward. This is how we know whether a pixel is "inside" or "outside" the path without checking every segment.&lt;/p&gt;
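
&lt;p&gt;Before walking the grid, it helps to know how many tiles a segment will visit. The sketch below counts them for endpoints already scaled to tile coordinates: one starting tile, plus one per vertical grid line crossed, plus one per horizontal grid line crossed. &lt;code&gt;span&lt;/code&gt; and &lt;code&gt;TilesCrossed&lt;/code&gt; are names chosen for this illustration and are not taken from the gg source. Endpoints that sit exactly on a grid line need extra care that the sketch ignores.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"math"
)

// span returns how many grid cells the interval between a and b touches
// along one axis, with a minimum of 1.
func span(a, b float64) int {
	lo, hi := math.Min(a, b), math.Max(a, b)
	return int(math.Max(math.Ceil(hi)-math.Floor(lo), 1))
}

// TilesCrossed counts the tiles a segment visits. Endpoints are given in
// tile coordinates (pixel coordinates divided by the tile size).
// Illustrative helper, not the gg implementation.
func TilesCrossed(x0, y0, x1, y1 float64) int {
	countX := span(x0, x1) - 1 // vertical grid lines crossed
	return countX + span(y0, y1)
}

func main() {
	// Crosses x = 1, 2, 3 and y = 1, 2: six tiles in total.
	fmt.Println(TilesCrossed(0.5, 0.5, 3.5, 2.5))
	// Stays inside a single tile.
	fmt.Println(TilesCrossed(0.25, 0.25, 0.75, 0.75))
}
```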

&lt;h2&gt;
  
  
  Stage 6: Backdrop Prefix Sum
&lt;/h2&gt;

&lt;p&gt;This is where tile-based rasterization really shines. Instead of checking every segment for every pixel, we accumulate winding numbers &lt;strong&gt;left-to-right&lt;/strong&gt; across each row of tiles:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Backdrop prefix sum — left-to-right winding accumulation&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;bboxH&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kt"&gt;int32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;bboxW&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;bboxW&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;tiles&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Backdrop&lt;/span&gt;
&lt;span class="n"&gt;tiles&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Backdrop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After this step, each tile knows the accumulated winding number from all segments to its left. Under the non-zero fill rule, a tile with a nonzero winding number and no segments crossing it is &lt;strong&gt;fully solid&lt;/strong&gt;, so no per-pixel computation is needed.&lt;/p&gt;
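
&lt;p&gt;A tiny worked example of the accumulation for a single tile row. The delta values are made up for illustration, and &lt;code&gt;AccumulateRow&lt;/code&gt; is a hypothetical helper that applies the same inclusive sum as the loop in this section.&lt;/p&gt;

```go
package main

import "fmt"

// AccumulateRow turns the per-tile backdrop deltas of one tile row into
// accumulated winding numbers, left to right (inclusive prefix sum).
func AccumulateRow(deltas []int32) []int32 {
	out := make([]int32, len(deltas))
	var sum int32
	for i, d := range deltas {
		sum += d
		out[i] = sum
	}
	return out
}

func main() {
	// Hypothetical row: a +1 delta recorded at tile 1, a -1 delta at tile 4.
	row := []int32{0, 1, 0, 0, -1, 0}
	fmt.Println(AccumulateRow(row)) // [0 1 1 1 0 0]
}
```

&lt;p&gt;Tiles 1 through 3 end up with winding number 1, while the tiles outside that run stay at 0.&lt;/p&gt;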

&lt;h2&gt;
  
  
  Stage 7: Coarse — Allocation + PTCL Generation
&lt;/h2&gt;

&lt;p&gt;The coarse stage allocates space in a global segment buffer and generates Per-Tile Command Lists. Segment counts are converted into indices using Vello's &lt;strong&gt;inverted index&lt;/strong&gt; trick:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Convert counts to indices using bitwise NOT&lt;/span&gt;
&lt;span class="n"&gt;nextSegIx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kt"&gt;uint32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;tiles&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;nSegs&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;tiles&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SegmentCountOrIx&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;nSegs&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;tiles&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SegmentCountOrIx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;nextSegIx&lt;/span&gt; &lt;span class="c"&gt;// !seg_ix in Rust&lt;/span&gt;
&lt;span class="n"&gt;nextSegIx&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;nSegs&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;^&lt;/code&gt; (bitwise NOT) serves double duty: it marks the tile as "has segments" (the NOT of any index below 2^31 has its high bit set, so it is negative when read as int32) while encoding the starting index into the global segment array.&lt;/p&gt;
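
&lt;p&gt;The round trip is easy to check in isolation. The helper names below are hypothetical; only the unary &lt;code&gt;^&lt;/code&gt; operator comes from the snippet above.&lt;/p&gt;

```go
package main

import "fmt"

// EncodeSegIx stores a starting index as its bitwise complement.
func EncodeSegIx(ix uint32) uint32 { return ^ix }

// DecodeSegIx recovers the index; complementing twice is the identity.
func DecodeSegIx(v uint32) uint32 { return ^v }

// IsIndex reports whether the field holds an encoded index rather than a
// plain count. It checks the high bit, which is the sign bit when the
// value is read as int32. This assumes counts and indices stay below 2^31.
func IsIndex(v uint32) bool { return v>>31 == 1 }

func main() {
	enc := EncodeSegIx(42)
	fmt.Printf("%#x %d %v\n", enc, DecodeSegIx(enc), IsIndex(enc))
	fmt.Println(IsIndex(7)) // a small raw count is not an encoded index
}
```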

&lt;h2&gt;
  
  
  Stage 8: Path Tiling
&lt;/h2&gt;

&lt;p&gt;Now we clip each line segment to its tile boundaries and compute the crucial &lt;strong&gt;yEdge&lt;/strong&gt; value:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// PathSegment — a line segment clipped to tile boundaries&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;PathSegment&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;Point0&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;float32&lt;/span&gt; &lt;span class="c"&gt;// Tile-relative start point&lt;/span&gt;
&lt;span class="n"&gt;Point1&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;float32&lt;/span&gt; &lt;span class="c"&gt;// Tile-relative end point&lt;/span&gt;
&lt;span class="n"&gt;YEdge&lt;/span&gt;  &lt;span class="kt"&gt;float32&lt;/span&gt;    &lt;span class="c"&gt;// Y where segment crosses tile left edge (x=0)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;YEdge&lt;/code&gt; tells the fine rasterizer where the segment enters or exits the tile at x=0. If the segment doesn't cross the left edge, &lt;code&gt;YEdge&lt;/code&gt; is set to &lt;code&gt;1e9&lt;/code&gt; (sentinel value). This single float32 captures the geometric relationship needed for coverage computation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 9: Fine Rasterization
&lt;/h2&gt;

&lt;p&gt;The final stage computes per-pixel anti-aliased coverage within each 16x16 tile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// fillPath computes per-pixel coverage for a single tile.&lt;/span&gt;
&lt;span class="c"&gt;// Direct port of fill_path from vello_shaders/src/cpu/fine.rs.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;fillPath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;segments&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;PathSegment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;backdrop&lt;/span&gt; &lt;span class="kt"&gt;int32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;evenOdd&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

&lt;span class="c"&gt;// Initialize area with backdrop winding number&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;area&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backdrop&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;segments&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="c"&gt;// For each column in the tile, compute the area&lt;/span&gt;
&lt;span class="c"&gt;// contribution of this segment using analytic integration&lt;/span&gt;
&lt;span class="c"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Apply fill rule and convert to alpha [0, 1]&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;evenOdd&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;area&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;abs32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="m"&gt;2.0&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;round32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0.5&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;area&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;min32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;abs32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;area&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="m"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is &lt;strong&gt;analytic anti-aliasing&lt;/strong&gt; — we compute exact sub-pixel coverage by integrating line segment contributions, not by supersampling. The result is mathematically precise edges with smooth alpha gradients.&lt;/p&gt;
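
&lt;p&gt;The two fill-rule formulas at the end of &lt;code&gt;fillPath&lt;/code&gt; are easier to follow with concrete numbers. This sketch applies them to a few accumulated winding values. It uses &lt;code&gt;float64&lt;/code&gt; and the standard &lt;code&gt;math&lt;/code&gt; package instead of the &lt;code&gt;float32&lt;/code&gt; helpers from the article, and the function names are assumptions.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"math"
)

// NonZero maps an accumulated winding/area value to alpha in [0, 1]:
// any magnitude of 1 or more is fully covered.
func NonZero(w float64) float64 {
	return math.Min(math.Abs(w), 1)
}

// EvenOdd folds the value into a triangle wave with period 2, so
// coverage toggles each time the winding number passes an integer.
func EvenOdd(w float64) float64 {
	return math.Abs(w - 2*math.Round(0.5*w))
}

func main() {
	for _, w := range []float64{0.25, 1, 1.5, 2, -1.5} {
		fmt.Printf("w=%5.2f  nonzero=%.2f  evenodd=%.2f\n", w, NonZero(w), EvenOdd(w))
	}
}
```

&lt;p&gt;A winding value of 2 is fully covered under non-zero but empty under even-odd, which is the behavior you see where a self-intersecting path overlaps itself.&lt;/p&gt;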

&lt;h2&gt;
  
  
  Euler Spiral Flattening
&lt;/h2&gt;

&lt;p&gt;But wait — &lt;code&gt;DrawCircle&lt;/code&gt; produces cubic Bezier curves, not line segments. How do we get from curves to lines?&lt;/p&gt;

&lt;p&gt;Vello uses &lt;strong&gt;Euler spiral approximation&lt;/strong&gt; for adaptive curve flattening. Naive fixed-step subdivision tends to produce too many segments on flat stretches and too few on tight bends; approximating the curve with Euler spiral segments gives a predictable error estimate to subdivide against:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// FlattenFill flattens cubic Beziers using Vello's Euler spiral algorithm&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;FlattenFill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cubics&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;CubicBezier&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;LineSoup&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;LineSoup&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;cubics&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;p0&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P0&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P0&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="n"&gt;p1&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="n"&gt;p2&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P2&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P2&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="n"&gt;p3&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P3&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;P3&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="n"&gt;flattenEulerFill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The algorithm estimates the flattening error from the curve's curvature and subdivides only where that error would exceed a tolerance (0.25 pixels by default). A nearly straight segment becomes a single line. A tight curve gets more subdivisions. The result: &lt;strong&gt;close to the fewest line segments that stay within the tolerance&lt;/strong&gt;.&lt;/p&gt;
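
&lt;p&gt;To get a feel for what a 0.25-pixel tolerance means, here is a much simpler calculation than the Euler spiral algorithm: the classic chord (sagitta) bound for a perfect circle. It is not what &lt;code&gt;flattenEulerFill&lt;/code&gt; does, and &lt;code&gt;CircleSegments&lt;/code&gt; is a hypothetical helper, but it shows how tolerance and curvature together determine the segment count.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"math"
)

// CircleSegments returns how many equal chords are needed to approximate
// a full circle so that no chord strays from the arc by more than tol.
// A chord subtending angle theta deviates by radius*(1 - cos(theta/2)).
// The sketch only handles a positive tol smaller than radius and
// returns 0 otherwise.
func CircleSegments(radius, tol float64) int {
	if !(tol > 0) || tol >= radius {
		return 0 // outside the range this sketch handles
	}
	theta := 2 * math.Acos(1-tol/radius)
	return int(math.Ceil(2 * math.Pi / theta))
}

func main() {
	fmt.Println(CircleSegments(100, 0.25)) // radius 100, default tolerance
	fmt.Println(CircleSegments(100, 0.1))  // a tighter tolerance needs more chords
}
```

&lt;p&gt;For a radius-100 circle at the default tolerance, this bound gives 45 chords. Tightening the tolerance or shrinking the radius (higher curvature per pixel of arc) changes the count accordingly.&lt;/p&gt;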

&lt;h2&gt;
  
  
  Per-Tile Command Lists (PTCL)
&lt;/h2&gt;

&lt;p&gt;The coarse stage (Stage 7) generates &lt;strong&gt;Per-Tile Command Lists&lt;/strong&gt; — each tile gets a stream of commands like "fill with coverage from segment N", "apply color #FF0000", "begin clip", "end clip". This is what makes the pipeline work for multiple overlapping paths (UI with buttons, text, backgrounds) in a single fine rasterization pass:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// PTCL commands&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="n"&gt;CmdEnd&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;CmdFill&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;  &lt;span class="c"&gt;// Compute coverage from segments&lt;/span&gt;
&lt;span class="n"&gt;CmdSolid&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;  &lt;span class="c"&gt;// Full coverage (no segments needed)&lt;/span&gt;
&lt;span class="n"&gt;CmdColor&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;  &lt;span class="c"&gt;// Apply RGBA color with source-over blending&lt;/span&gt;
&lt;span class="n"&gt;CmdBeginClip&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt; &lt;span class="c"&gt;// Push clip layer&lt;/span&gt;
&lt;span class="n"&gt;CmdEndClip&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;11&lt;/span&gt; &lt;span class="c"&gt;// Pop and composite clip&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fine rasterizer walks each tile's PTCL, executing commands and compositing results with premultiplied alpha — exactly like a GPU would:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;fineRasterizeTile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ptcl&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;PTCL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;segments&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;PathSegment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;bgColor&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TileWidth&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;TileHeight&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;float32&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;pixelCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TileWidth&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;TileHeight&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pixelCount&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;float32&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bgColor&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="c"&gt;// Start of this tile's command list (start value and type assumed)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nextOffset&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ptcl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadCmd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;CmdEnd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;rgba&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;CmdFill&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="c"&gt;// Compute coverage, store in area buffer&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;CmdColor&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="c"&gt;// Apply color using area as mask, source-over blend&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;CmdBeginClip&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="c"&gt;// Push current pixels onto clip stack&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;CmdEndClip&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="c"&gt;// Pop and composite with clip mask&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nextOffset&lt;/span&gt; &lt;span class="c"&gt;// Advance to the next command&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
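
&lt;p&gt;The source-over step with premultiplied alpha can be sketched in isolation. This is a minimal illustration, not the tilecompute code: the &lt;code&gt;RGBA&lt;/code&gt; type and the &lt;code&gt;sourceOver&lt;/code&gt; helper are hypothetical names, and coverage is assumed to be a single mask value between 0 and 1.&lt;/p&gt;

```go
package main

import "fmt"

// RGBA holds premultiplied color channels in the range 0..1.
// (Hypothetical type, introduced only for this sketch.)
type RGBA [4]float32

// sourceOver composites src over dst, scaling src by a coverage mask value.
// Both colors are assumed to be premultiplied by their alpha.
func sourceOver(dst, src RGBA, coverage float32) RGBA {
	inv := 1 - src[3]*coverage // fraction of dst that stays visible
	var out RGBA
	for c := range out {
		out[c] = src[c]*coverage + dst[c]*inv
	}
	return out
}

func main() {
	blue := RGBA{0, 0, 1, 1}
	red := RGBA{1, 0, 0, 1}
	// Half coverage blends opaque red over opaque blue.
	fmt.Println(sourceOver(blue, red, 0.5)) // prints [0.5 0 0.5 1]
}
```

&lt;p&gt;Because both colors are premultiplied, one multiply-add per channel is enough, with no division by alpha.&lt;/p&gt;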



&lt;h2&gt;
  
  
  Dual Execution: Why Both CPU and GPU?
&lt;/h2&gt;

&lt;p&gt;tilecompute is a &lt;strong&gt;dual-execution&lt;/strong&gt; pipeline: the same 9-stage algorithm runs on both CPU (sequential Go code) and GPU (WGSL compute shaders dispatched via &lt;code&gt;VelloComputeDispatcher&lt;/code&gt;). Why maintain both?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. CPU is the Core.&lt;/strong&gt; After analyzing 8 enterprise 2D engines (Skia, Cairo, Vello, Blend2D, tiny-skia, piet, Qt RHI, Pathfinder), we found that &lt;strong&gt;none&lt;/strong&gt; of them treats CPU rasterization as a "backend". It is always the core, and the GPU is the optional accelerator.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Correctness Reference.&lt;/strong&gt; The CPU implementation serves as a reference for the GPU compute shaders. When GPU and CPU produce different results, we know which one to trust.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Universal Availability.&lt;/strong&gt; Servers, CI/CD, embedded systems — many environments have no GPU. A server generating 10,000 chart images doesn't need GPU acceleration; it needs reliable software rendering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Identical Algorithm, Dual Execution.&lt;/strong&gt; The CPU code mirrors the GPU pipeline stage by stage: same data structures, same logic. The 9 WGSL compute shaders are embedded with &lt;code&gt;//go:embed&lt;/code&gt; and compiled into GPU compute pipelines via &lt;code&gt;hal.Device.CreateComputePipeline()&lt;/code&gt;. When a GPU is available, &lt;code&gt;VelloComputeDispatcher&lt;/code&gt; dispatches all 9 stages as parallel GPU workloads with &lt;code&gt;pass.Dispatch()&lt;/code&gt;; when not, the CPU executes them sequentially.&lt;/p&gt;
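
&lt;p&gt;One way to use the CPU output as a reference is to compare it channel by channel against a GPU readback. The sketch below assumes both buffers hold float32 RGBA pixels in the same order; &lt;code&gt;maxChannelDiff&lt;/code&gt; is a hypothetical helper, not part of gg.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// maxChannelDiff returns the largest absolute per-channel difference between
// two pixel buffers of equal length. (Hypothetical helper for illustration.)
func maxChannelDiff(cpu, gpu [][4]float32) (float64, error) {
	if len(cpu) != len(gpu) {
		return 0, errors.New("pixel buffers differ in length")
	}
	worst := 0.0
	for i := range cpu {
		for c := range cpu[i] {
			d := math.Abs(float64(cpu[i][c]) - float64(gpu[i][c]))
			worst = math.Max(worst, d)
		}
	}
	return worst, nil
}

func main() {
	// Made-up sample data: the second pixel differs in the green channel.
	cpu := [][4]float32{{1, 0, 0, 1}, {0, 0.5, 0, 1}}
	gpu := [][4]float32{{1, 0, 0, 1}, {0, 0.25, 0, 1}}
	diff, err := maxChannelDiff(cpu, gpu)
	fmt.Println(diff, err)
}
```

&lt;p&gt;A test would then compare the returned value against whatever tolerance the project has chosen and treat the CPU buffer as the expected result.&lt;/p&gt;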

&lt;h2&gt;
  
  
  Two Rasterizers, Two Targets
&lt;/h2&gt;

&lt;p&gt;tilecompute is not the only Vello algorithm we've ported. gogpu/gg includes &lt;strong&gt;both&lt;/strong&gt; of Vello's rasterization pipelines — each optimized for a different execution target:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;tilecompute&lt;/th&gt;
&lt;th&gt;SparseStripsRasterizer&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Vello era&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Original (2020–2023)&lt;/td&gt;
&lt;td&gt;New (&lt;a href="https://github.com/linebender/vello/issues/670" rel="noopener noreferrer"&gt;Issue #670&lt;/a&gt;, August 2024)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Source&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vello_shaders/src/cpu/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sparse_strips/vello_cpu/&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tile size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;16x16 (256 pixels)&lt;/td&gt;
&lt;td&gt;4x4 (16 pixels)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Optimized for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GPU compute workgroups&lt;/td&gt;
&lt;td&gt;CPU / SIMD registers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Key insight&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;256 pixels = GPU workgroup size&lt;/td&gt;
&lt;td&gt;16 u8 pixels = one 128-bit SSE register&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data flow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Monoid prefix sums → PTCL&lt;/td&gt;
&lt;td&gt;Sort by Y,X → strip rendering&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Package&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;internal/gpu/tilecompute/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;internal/gpu/sparse_strips.go&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Why 16x16 for GPU?&lt;/strong&gt; GPU compute shaders process tiles in parallel workgroups. 256 pixels per tile matches the typical workgroup size (256 threads), giving each thread exactly one pixel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why 4x4 for CPU?&lt;/strong&gt; SSE-class SIMD instructions operate on 128-bit registers. 16 pixels of u8 coverage data fit into a single SSE register, enabling vectorized operations across an entire tile at once. This is the same approach Intel used in Larrabee.&lt;/p&gt;
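
&lt;p&gt;The tile-size trade-off is easy to quantify. The hypothetical &lt;code&gt;tileCount&lt;/code&gt; helper below counts the tiles needed to cover a canvas, rounding partial tiles up; the 1024 by 768 canvas is only an example size.&lt;/p&gt;

```go
package main

import "fmt"

// tileCount returns how many square tiles with the given edge length are
// needed to cover a width x height canvas, rounding partial tiles up.
// (Hypothetical helper for illustration.)
func tileCount(width, height, tile int) int {
	cols := (width + tile - 1) / tile
	rows := (height + tile - 1) / tile
	return cols * rows
}

func main() {
	fmt.Println(tileCount(1024, 768, 16)) // 64 * 48 = 3072 tiles
	fmt.Println(tileCount(1024, 768, 4))  // 256 * 192 = 49152 tiles
	fmt.Println(16*16, 4*4)               // pixels per tile: 256 and 16
}
```

&lt;p&gt;For this canvas the 16x16 grid needs 3,072 tiles and the 4x4 grid needs 49,152, a factor of 16. Larger tiles mean fewer tiles to schedule, while smaller tiles keep each unit of work small enough for one SIMD register.&lt;/p&gt;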

&lt;p&gt;Both rasterizers use analytic anti-aliasing, Euler spiral flattening, and support NonZero/EvenOdd fill rules. The difference is purely in how they partition the canvas and process tiles.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gogpu/gg Rasterization Engines
    │
    ├── tilecompute (16×16 tiles) — DUAL EXECUTION
    │      ├── CPU: sequential Go (reference + fallback)
    │      └── GPU: 9 WGSL shaders via VelloComputeDispatcher
    │
    └── SparseStripsRasterizer (4×4 tiles) — CPU
           CPU/SIMD-optimized pipeline
           Sort by Y,X + strip rendering
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Having both means gogpu/gg selects the optimal algorithm for the target: GPU compute dispatches the 16x16 pipeline via WGSL shaders, while CPU rendering defaults to the 4x4 pipeline for SIMD efficiency.&lt;/p&gt;

&lt;h2&gt;
  
  
  Smart Multi-Engine Selection
&lt;/h2&gt;

&lt;p&gt;Having multiple rasterizers raises a question: &lt;strong&gt;who decides which algorithm handles which path?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We analyzed 8 enterprise 2D engines — Skia, Cairo, Blend2D, Vello, Qt, Direct2D, Flutter, SwiftUI — and found that &lt;strong&gt;none of them&lt;/strong&gt; do per-path dynamic algorithm selection. Skia has separate CPU/GPU pipelines but no cross-selection. Vello has a planned &lt;code&gt;vello_api&lt;/code&gt; for CPU/GPU choice, not yet built. Direct2D recognizes simple shapes but doesn't switch algorithms.&lt;/p&gt;

&lt;p&gt;gogpu/gg is the first 2D graphics library with &lt;strong&gt;systematic multi-factor per-path selection&lt;/strong&gt; across 5 rasterization algorithms:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Path arrives at Context.Fill()
    │
    ├── Shape detection → SDF Accelerator (circles, rrects)
    │     GPU SDF or CPU SDF — smoothstep coverage, highest quality
    │
    ├── Adaptive threshold check
    │     │
    │     ├── Below threshold → AnalyticFiller (scanline)
    │     │     Zero tile overhead, O(width × edges)
    │     │
    │     └── Above threshold → AdaptiveFiller
    │           │
    │           ├── &amp;lt; 10K segments → SparseStrips (4×4 tiles)
    │           │     CPU/SIMD-optimized, lower tile overhead
    │           │
    │           └── &amp;gt; 10K segments + large canvas → TileCompute (16×16 tiles)
    │                 GPU workgroup-ready, 16× fewer tiles
    │
    └── GPU Compute → Vello PTCL pipeline (full scene)
          9-stage GPU compute, massively parallel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Coregex Analogy
&lt;/h3&gt;

&lt;p&gt;The pattern is inspired by &lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;coregex&lt;/a&gt; — a Go regex library with &lt;strong&gt;17 strategies&lt;/strong&gt; and an intelligent selector that picks the optimal engine per-pattern. Same idea: analyze the input, pick the optimal engine. Both Go libraries, both first-of-kind multi-engine approaches.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;coregex&lt;/th&gt;
&lt;th&gt;gogpu/gg&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Engines&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;17 regex strategies&lt;/td&gt;
&lt;td&gt;5 rasterization algorithms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Selection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pattern analysis&lt;/td&gt;
&lt;td&gt;7-dimension path analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Regex pattern&lt;/td&gt;
&lt;td&gt;Vector path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Match result&lt;/td&gt;
&lt;td&gt;Pixel coverage&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Adaptive Threshold
&lt;/h3&gt;

&lt;p&gt;The key insight: &lt;strong&gt;scanline cost grows with width × edge crossings&lt;/strong&gt;, while &lt;strong&gt;tile cost grows with fill area&lt;/strong&gt;. For large shapes, tiles win at lower complexity. For tiny shapes (&amp;lt; 32px), scanline always wins.&lt;/p&gt;

&lt;p&gt;The selection between scanline and tile-based rasterizers uses an adaptive threshold derived from the path's bounding box area:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// threshold = clamp(2048/sqrt(bboxArea), 32, 256)&lt;/span&gt;
&lt;span class="c"&gt;//&lt;/span&gt;
&lt;span class="c"&gt;// 100×100 bbox → 2048/100 ≈ 20, clamped up to 32 elements (tiles kick in early)&lt;/span&gt;
&lt;span class="c"&gt;// 500×500 bbox → 2048/500 ≈ 4, clamped up to 32 elements&lt;/span&gt;
&lt;span class="c"&gt;//  30×30  bbox → always scanline (below bboxMinDimension)&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;adaptiveThreshold&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bboxArea&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;bboxArea&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;maxElementThreshold&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;threshold&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2048.0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bboxArea&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;clamp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;256&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  User Override
&lt;/h3&gt;

&lt;p&gt;Auto-selection is the default, but the user always has final say — the same principle as database query hints or GPU driver force flags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Auto (default) — intelligent per-path selection&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRasterizerMode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RasterizerAuto&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Force scanline for all paths (debugging, isolation)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRasterizerMode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RasterizerAnalytic&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Force 4×4 tiles (benchmarking CPU/SIMD path)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRasterizerMode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RasterizerSparseStrips&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Force 16×16 tiles (benchmarking GPU workgroup path)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRasterizerMode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RasterizerTileCompute&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Force SDF for maximum circle/rrect quality&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRasterizerMode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RasterizerSDF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The mode is per-Context — different contexts can use different strategies simultaneously. This makes A/B benchmarking trivial: render the same scene with two contexts, compare output and timing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Go source&lt;/td&gt;
&lt;td&gt;2,878 LOC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test code&lt;/td&gt;
&lt;td&gt;2,194 LOC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WGSL shaders&lt;/td&gt;
&lt;td&gt;1,695 LOC (9 shaders)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6,767 LOC&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tile size&lt;/td&gt;
&lt;td&gt;16x16 pixels&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fill rules&lt;/td&gt;
&lt;td&gt;NonZero, EvenOdd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Golden test threshold&lt;/td&gt;
&lt;td&gt;0.15% max pixel difference&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The 9 WGSL compute shaders are embedded into &lt;code&gt;VelloComputeDispatcher&lt;/code&gt; with &lt;code&gt;//go:embed&lt;/code&gt;, compiled into GPU compute pipelines, and dispatched with &lt;code&gt;pass.Dispatch()&lt;/code&gt;, so the same algorithm runs on both CPU (reference/fallback) and GPU (parallel compute).&lt;/p&gt;

&lt;h2&gt;
  
  
  Part of a Larger Ecosystem
&lt;/h2&gt;

&lt;p&gt;tilecompute is the rasterization core of &lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;gogpu/gg&lt;/a&gt;, which is part of a &lt;strong&gt;466K+ line&lt;/strong&gt; Pure Go GPU computing ecosystem:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;LOC&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;gogpu/gg&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2D graphics library&lt;/td&gt;
&lt;td&gt;186K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/wgpu" rel="noopener noreferrer"&gt;gogpu/wgpu&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Pure Go WebGPU&lt;/td&gt;
&lt;td&gt;110K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/naga" rel="noopener noreferrer"&gt;gogpu/naga&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Shader compiler&lt;/td&gt;
&lt;td&gt;61K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/ui" rel="noopener noreferrer"&gt;gogpu/ui&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;GUI widget toolkit&lt;/td&gt;
&lt;td&gt;61K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gogpu" rel="noopener noreferrer"&gt;gogpu/gogpu&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;GPU framework&lt;/td&gt;
&lt;td&gt;40K&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The default stack is &lt;strong&gt;zero CGO, Pure Go&lt;/strong&gt; — from shader compilation to GPU command submission. But gogpu also supports a &lt;strong&gt;Rust backend&lt;/strong&gt; via &lt;a href="https://github.com/go-webgpu/webgpu" rel="noopener noreferrer"&gt;go-webgpu&lt;/a&gt; FFI to wgpu-native for maximum GPU performance when needed:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;th&gt;Build&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Pure Go&lt;/strong&gt; (default)&lt;/td&gt;
&lt;td&gt;gogpu/wgpu → Vulkan/DX12/Metal/GLES&lt;/td&gt;
&lt;td&gt;&lt;code&gt;go build&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Zero dependencies, easy cross-compile&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Rust FFI&lt;/strong&gt; (opt-in)&lt;/td&gt;
&lt;td&gt;go-webgpu → wgpu-native&lt;/td&gt;
&lt;td&gt;&lt;code&gt;go build -tags rust&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Maximum GPU performance, production&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both backends use the same gg API — the choice is transparent to application code. gg doesn't know or care which WebGPU implementation is underneath; it talks to &lt;code&gt;hal.Queue&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When GPU acceleration is available, gg uses the registered WebGPU backend (Pure Go or Rust) with support for Vulkan, Metal, DX12, and OpenGL ES. When it is not, tilecompute and the CPU rasterizer handle everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Acknowledgments
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;a href="https://github.com/linebender/vello" rel="noopener noreferrer"&gt;Vello&lt;/a&gt; team and &lt;a href="https://raphlinus.github.io/" rel="noopener noreferrer"&gt;Raph Levien&lt;/a&gt; for the tile rasterization pipeline and Euler spiral flattening research&lt;/li&gt;
&lt;li&gt;The variable naming in tilecompute intentionally matches Vello's Rust originals for easy cross-reference&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;gogpu/gg&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Package&lt;/strong&gt;: &lt;code&gt;internal/gpu/tilecompute/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vello&lt;/strong&gt;: &lt;a href="https://github.com/linebender/vello" rel="noopener noreferrer"&gt;linebender/vello&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Raph Levien's blog&lt;/strong&gt;: &lt;a href="https://raphlinus.github.io/" rel="noopener noreferrer"&gt;raphlinus.github.io&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discussion&lt;/strong&gt;: &lt;a href="https://github.com/orgs/gogpu/discussions/18" rel="noopener noreferrer"&gt;Join the conversation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/gogpu/gg@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>go</category>
      <category>graphics</category>
      <category>gpu</category>
      <category>algorithms</category>
    </item>
    <item>
      <title>Pure Go GUI Toolkit 2026 — 425K LOC Ecosystem, Zero CGO, WebGPU (gogpu/ui)</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Sat, 21 Feb 2026 13:38:23 +0000</pubDate>
      <link>https://forem.com/kolkov/pure-go-gui-toolkit-2026-425k-loc-ecosystem-zero-cgo-webgpu-gogpuui-5aop</link>
      <guid>https://forem.com/kolkov/pure-go-gui-toolkit-2026-425k-loc-ecosystem-zero-cgo-webgpu-gogpuui-5aop</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series: Building Go's GPU Ecosystem&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-a-pure-go-graphics-library-for-gpu-programming-2j5d"&gt;GoGPU: A Pure Go Graphics Library&lt;/a&gt; — Project announcement&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-from-idea-to-100k-lines-in-two-weeks-building-gos-gpu-ecosystem-3b2"&gt;From Idea to 100K Lines in Two Weeks&lt;/a&gt; — The journey&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/pure-go-2d-graphics-library-with-gpu-acceleration-introducing-gogpugg-538h"&gt;Pure Go 2D Graphics with GPU Acceleration&lt;/a&gt; — Introducing gg&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gpu-compute-shaders-in-pure-go-gogpugg-v0150-1cjk"&gt;GPU Compute Shaders in Pure Go&lt;/a&gt; — Compute pipelines&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/go-126-meets-2026-with-a-professional-graphics-ecosystem-9g8"&gt;Go 1.26 Meets 2026&lt;/a&gt; — Ecosystem overview&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpugg-enterprise-2d-graphics-library-in-pure-go-1931"&gt;Enterprise 2D Graphics Library&lt;/a&gt; — gg architecture&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-enterprise-architecture-cross-package-gpu-integration-with-gpucontext-332"&gt;Cross-Package GPU Integration&lt;/a&gt; — gpucontext&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-unified-2d3d-graphics-integration-in-pure-go-gg3"&gt;Unified 2D/3D Graphics Integration&lt;/a&gt; — gg + gogpu&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Core Complete, Focus on GUI&lt;/strong&gt; ← You are here&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Foundation is Done
&lt;/h2&gt;

&lt;p&gt;Three months ago, GoGPU was an idea. Today it's &lt;strong&gt;425,000+ lines of Pure Go&lt;/strong&gt; — a complete GPU computing stack with a shader compiler, WebGPU implementation, 2D graphics library, and application framework. All without CGO.&lt;/p&gt;

&lt;p&gt;The core architecture that powers everything — from shader compilation to pixel output — is &lt;strong&gt;production-ready&lt;/strong&gt;. And that means it's time to build what this was all leading to: &lt;strong&gt;a real GUI toolkit for Go&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where We Are: The Ecosystem at a Glance
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Active Development
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;th&gt;LOC&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/ui" rel="noopener noreferrer"&gt;gogpu/ui&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;GUI Toolkit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;55K&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Phase 2 Beta&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Foundation (Maintenance Mode)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;th&gt;LOC&lt;/th&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;gogpu/gg&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2D Graphics (Canvas, GPU accel, text)&lt;/td&gt;
&lt;td&gt;167K&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gg/releases/tag/v0.29.0" rel="noopener noreferrer"&gt;v0.29.0&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/wgpu" rel="noopener noreferrer"&gt;gogpu/wgpu&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Pure Go WebGPU (Vulkan/DX12/Metal/GLES)&lt;/td&gt;
&lt;td&gt;105K&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/wgpu/releases/tag/v0.16.9" rel="noopener noreferrer"&gt;v0.16.9&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/naga" rel="noopener noreferrer"&gt;gogpu/naga&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Shader Compiler (WGSL → SPIR-V/MSL/GLSL/HLSL)&lt;/td&gt;
&lt;td&gt;54K&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/naga/releases/tag/v0.14.1" rel="noopener noreferrer"&gt;v0.14.1&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gogpu" rel="noopener noreferrer"&gt;gogpu/gogpu&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Application Framework (windowing, input)&lt;/td&gt;
&lt;td&gt;37K&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gogpu/releases/tag/v0.20.0" rel="noopener noreferrer"&gt;v0.20.0&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+ 4 more&lt;/td&gt;
&lt;td&gt;gpucontext, gputypes, gg-pdf, gg-svg&lt;/td&gt;
&lt;td&gt;9K&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For the foundation libraries, we handle issues, feature requests, bug fixes, and performance improvements as they come in. Their architecture is settled. Our full energy goes into &lt;strong&gt;gogpu/ui&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Build Another GUI Toolkit?
&lt;/h2&gt;

&lt;p&gt;Go's GUI landscape has options — Fyne, Gio, Wails. We respect all of them. But we're solving a different problem:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We want Go to power the applications that currently require Electron, Qt, or native platform toolkits.&lt;/strong&gt; IDEs. Design tools. CAD. Professional dashboards.&lt;/p&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero CGO&lt;/strong&gt; — &lt;code&gt;go build&lt;/code&gt; and it works, everywhere&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WebGPU rendering&lt;/strong&gt; — native GPU acceleration on all platforms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise layout&lt;/strong&gt; — Flexbox, Grid, docking, virtualization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reactive state&lt;/strong&gt; — Signals-based data binding (not callbacks)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accessibility&lt;/strong&gt; — ARIA roles from day one, not bolted on later&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pluggable design systems&lt;/strong&gt; — Material 3 today, Fluent or Cupertino tomorrow&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Architecture: Three Layers, Clean Separation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Layer 3b: Design Systems ── theme/material3/ (HCT color science)
Layer 3a: Generic Widgets ── core/button/, core/checkbox/, core/radio/
Layer 2:  CDK (headless)  ── Content[C] polymorphic pattern
Layer 1:  Foundation      ── widget/, event/, geometry/, layout/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: &lt;strong&gt;widgets don't know their design system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;button.Button&lt;/code&gt; defines &lt;em&gt;behavior&lt;/em&gt; — click handling, keyboard activation, focus management. How it &lt;em&gt;looks&lt;/em&gt; is determined by a &lt;code&gt;Painter&lt;/code&gt; interface that the design system implements.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// The widget defines behavior&lt;/span&gt;
&lt;span class="n"&gt;btn&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Submit"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnClick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;save&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VariantOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;button&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Filled&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Material 3 painter handles appearance&lt;/span&gt;
&lt;span class="c"&gt;// (injected via ThemeProvider, not imported by the widget)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Core widgets are &lt;strong&gt;design-system-agnostic&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Swapping from Material 3 to Fluent means changing one import&lt;/li&gt;
&lt;li&gt;Community can create custom design systems without forking widgets&lt;/li&gt;
&lt;/ul&gt;
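
&lt;p&gt;The behavior/appearance split described above can be sketched in plain Go. This is a minimal illustration, not the actual &lt;code&gt;gogpu/ui&lt;/code&gt; API: &lt;code&gt;ButtonState&lt;/code&gt;, &lt;code&gt;PaintButton&lt;/code&gt;, &lt;code&gt;SetPainter&lt;/code&gt;, and both toy painters are hypothetical names, and the painters return a description string instead of issuing draw calls.&lt;/p&gt;

```go
package main

import "fmt"

// ButtonState is a hypothetical snapshot of the behavior a widget tracks.
type ButtonState struct {
	Label   string
	Focused bool
}

// Painter is an illustrative stand-in for the interface a design system
// implements; the real gogpu/ui method set may differ.
type Painter interface {
	PaintButton(s ButtonState) string
}

// Button owns behavior only and never names a concrete design system.
type Button struct {
	state   ButtonState
	onClick func()
	painter Painter
}

// NewButton wires behavior to whichever painter the caller injects.
func NewButton(label string, p Painter, onClick func()) *Button {
	b := new(Button)
	b.state.Label = label
	b.painter = p
	b.onClick = onClick
	return b
}

// Click runs the behavior; appearance is not involved.
func (b *Button) Click() {
	if b.onClick != nil {
		b.onClick()
	}
}

// SetPainter swaps the look without touching behavior.
func (b *Button) SetPainter(p Painter) { b.painter = p }

// Draw delegates appearance to the injected painter.
func (b *Button) Draw() string { return b.painter.PaintButton(b.state) }

// materialPainter and fluentPainter are toy design systems that describe
// what they would draw instead of issuing real draw calls.
type materialPainter struct{}

func (materialPainter) PaintButton(s ButtonState) string {
	return "material3: filled pill [" + s.Label + "]"
}

type fluentPainter struct{}

func (fluentPainter) PaintButton(s ButtonState) string {
	return "fluent: flat rectangle [" + s.Label + "]"
}

func main() {
	saved := false
	btn := NewButton("Submit", materialPainter{}, func() { saved = true })
	btn.Click()
	fmt.Println(btn.Draw(), "saved:", saved)

	btn.SetPainter(fluentPainter{})
	fmt.Println(btn.Draw())
}
```

&lt;p&gt;In this sketch the widget type never mentions Material or Fluent, which is the property that lets a third party add a design system without forking the widget.&lt;/p&gt;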

&lt;h3&gt;
  
  
  Dependency Inversion
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;gogpu/ui&lt;/code&gt; never imports &lt;code&gt;gogpu&lt;/code&gt; directly. Instead, it depends on &lt;em&gt;interfaces&lt;/em&gt; from &lt;code&gt;gpucontext&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ui ──imports──&amp;gt; gpucontext (interfaces: WindowProvider, EventSource)
examples ──imports──&amp;gt; gogpu (concrete implementation)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means you can test the entire widget tree &lt;strong&gt;headlessly&lt;/strong&gt; — no window, no GPU, just unit tests. The toolkit has &lt;strong&gt;97% average test coverage&lt;/strong&gt; across all packages.&lt;/p&gt;
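
&lt;p&gt;A rough sketch of why depending on interfaces makes headless testing possible. The names &lt;code&gt;WindowProvider&lt;/code&gt; and &lt;code&gt;EventSource&lt;/code&gt; come from the diagram above, but their method sets here (&lt;code&gt;Size&lt;/code&gt;, &lt;code&gt;OnClick&lt;/code&gt;) are assumptions made for illustration, as are &lt;code&gt;App&lt;/code&gt;, &lt;code&gt;ContentWidth&lt;/code&gt;, and &lt;code&gt;fakeHost&lt;/code&gt;; the real &lt;code&gt;gpucontext&lt;/code&gt; interfaces may look different.&lt;/p&gt;

```go
package main

import "fmt"

// WindowProvider and EventSource reuse the interface names from the
// diagram; the method sets below are assumptions for illustration.
type WindowProvider interface {
	Size() (width, height int)
}

type EventSource interface {
	OnClick(handler func(x, y int))
}

// App is a toy UI root that depends only on the interfaces.
type App struct {
	win    WindowProvider
	clicks int
}

// NewApp subscribes to events without knowing who delivers them.
func NewApp(win WindowProvider, events EventSource) *App {
	a := new(App)
	a.win = win
	events.OnClick(func(x, y int) { a.clicks++ })
	return a
}

// ContentWidth is layout logic that needs a window size but no real window.
func (a *App) ContentWidth(padding int) int {
	w, _ := a.win.Size()
	return w - 2*padding
}

// fakeHost implements both interfaces in memory: no window, no GPU.
type fakeHost struct {
	w, h     int
	handlers []func(x, y int)
}

func (f *fakeHost) Size() (int, int) { return f.w, f.h }

func (f *fakeHost) OnClick(h func(x, y int)) { f.handlers = append(f.handlers, h) }

// Click simulates user input by calling every registered handler.
func (f *fakeHost) Click(x, y int) {
	for _, h := range f.handlers {
		h(x, y)
	}
}

func main() {
	host := new(fakeHost)
	host.w, host.h = 800, 600
	app := NewApp(host, host)
	host.Click(10, 10)
	fmt.Println(app.ContentWidth(32), app.clicks)
}
```

&lt;p&gt;In a real test the fake would live in a &lt;code&gt;_test.go&lt;/code&gt; file, while the concrete &lt;code&gt;gogpu&lt;/code&gt; implementation is only wired in by the example binaries.&lt;/p&gt;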




&lt;h2&gt;
  
  
  What's Already Working
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Phase 0: Foundation (Complete)
&lt;/h3&gt;

&lt;p&gt;Core infrastructure that every widget builds on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;geometry&lt;/strong&gt; — Point, Size, Rect, Constraints, Insets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;event&lt;/strong&gt; — Mouse, keyboard, wheel, focus events with modifier keys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;widget&lt;/strong&gt; — Widget interface with 3-phase lifecycle (Layout → Draw → Event)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;layout&lt;/strong&gt; — CSS Flexbox, VStack/HStack, CSS Grid engines&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;render&lt;/strong&gt; — Canvas implementation backed by gogpu/gg&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 1: MVP (Complete)
&lt;/h3&gt;

&lt;p&gt;The basics needed for a working application:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;state&lt;/strong&gt; — Reactive signals (Signal, Computed, Effect, Binding, Scheduler) wrapping &lt;a href="https://github.com/coregx/signals" rel="noopener noreferrer"&gt;coregx/signals&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;a11y&lt;/strong&gt; — Accessibility tree with 35+ ARIA roles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;primitives&lt;/strong&gt; — Box, Text, Image display widgets with fluent API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;app&lt;/strong&gt; — Window integration via gpucontext interfaces&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 1.5: Extensibility (Complete)
&lt;/h3&gt;

&lt;p&gt;Enabling the community to build on top:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;registry&lt;/strong&gt; — Widget factory registration with categories&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;plugin&lt;/strong&gt; — Plugin bundling with dependency resolution&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;theme&lt;/strong&gt; — Base theme system (ColorPalette, Typography, Spacing, Shadows, Radii, Extensions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;layout&lt;/strong&gt; (public) — Custom layout algorithms&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 2: Beta (75% Complete)
&lt;/h3&gt;

&lt;p&gt;Interactive widgets and Material Design:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;button&lt;/strong&gt; — 4 variants (Filled, Outlined, TextOnly, Tonal), 3 sizes, keyboard activation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;checkbox&lt;/strong&gt; — Checked / unchecked / indeterminate, label support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;radio&lt;/strong&gt; — Radio groups with vertical/horizontal layout, arrow key navigation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;focus&lt;/strong&gt; — Tab/Shift+Tab navigation, keyboard shortcuts, focus ring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cdk&lt;/strong&gt; — Component Development Kit with Content[C] polymorphic pattern&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;material3&lt;/strong&gt; — HCT color science, 32 color roles, 15 typography roles, painters for all widgets&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Remaining for Phase 2:&lt;/strong&gt; TextField, Dropdown, Slider, Progress indicators, Typography system, Icon system&lt;/p&gt;




&lt;h2&gt;
  
  
  A Working Example
&lt;/h2&gt;

&lt;p&gt;Here's a real application running today — &lt;code&gt;ui/examples/hello&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;gogpuApp&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewApp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultConfig&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithTitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gogpu/ui — Widget Demo"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithContinuousRender&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;false&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c"&gt;// Event-driven: 0% CPU idle&lt;/span&gt;

    &lt;span class="n"&gt;uiApp&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithWindowProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithPlatformProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithEventSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;uiApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRoot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buildUI&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="c"&gt;// ... rendering pipeline setup ...&lt;/span&gt;
    &lt;span class="n"&gt;gogpuApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;buildUI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;primitives&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BoxWidget&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;primitives&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;primitives&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gogpu/ui — Widget Demo"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
            &lt;span class="n"&gt;FontSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;28&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bold&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
            &lt;span class="n"&gt;Color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;widget&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RGBA8&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;33&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;33&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;33&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;

        &lt;span class="n"&gt;checkbox&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;checkbox&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LabelOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Enable notifications"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;checkbox&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Checked&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;checkbox&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnToggle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checked&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"notifications:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;checked&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;

        &lt;span class="n"&gt;radio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewGroup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;radio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Items&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;radio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ItemDef&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"light"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Label&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Light"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="n"&gt;radio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ItemDef&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"dark"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Label&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Dark"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="n"&gt;radio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ItemDef&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Label&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"System"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;radio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Selected&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;radio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DirectionOpt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;radio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Horizontal&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Padding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Gap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;12&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
      &lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;widget&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RGBA8&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;255&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
      &lt;span class="n"&gt;Rounded&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;12&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ShadowLevel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key characteristics:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Event-driven rendering&lt;/strong&gt; — 0% CPU when nothing changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPU-direct pipeline&lt;/strong&gt; — widgets render through gg, ggcanvas blits directly to the GPU surface (zero-copy)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Functional options&lt;/strong&gt; — clean construction without builder hell&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fluent styling&lt;/strong&gt; — &lt;code&gt;.Padding().Gap().Background().Rounded()&lt;/code&gt; chains&lt;/li&gt;
&lt;/ul&gt;
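
&lt;p&gt;For readers unfamiliar with the two construction styles listed above, here is a minimal sketch of how they are commonly implemented in Go. It mirrors the shape of the calls in the example (&lt;code&gt;checkbox.New(...)&lt;/code&gt; with options, &lt;code&gt;.Padding().Gap()&lt;/code&gt; chains), but every type and function in it is a simplified stand-in written for this illustration, not the toolkit's own code.&lt;/p&gt;

```go
package main

import "fmt"

// Checkbox is a toy widget; its fields and option names are illustrative.
type Checkbox struct {
	label    string
	checked  bool
	onToggle func(bool)
}

// Option mutates a Checkbox during construction (functional options).
type Option func(*Checkbox)

func Label(text string) Option { return func(c *Checkbox) { c.label = text } }

func Checked(v bool) Option { return func(c *Checkbox) { c.checked = v } }

func OnToggle(fn func(bool)) Option { return func(c *Checkbox) { c.onToggle = fn } }

// New applies the options in order on top of zero-value defaults.
func New(opts ...Option) *Checkbox {
	c := new(Checkbox)
	for _, opt := range opts {
		opt(c)
	}
	return c
}

// Toggle flips the state and notifies the callback if one was set.
func (c *Checkbox) Toggle() {
	c.checked = !c.checked
	if c.onToggle != nil {
		c.onToggle(c.checked)
	}
}

// Box shows the fluent half: each styling method returns the receiver,
// so calls can be chained.
type Box struct {
	padding, gap int
}

func NewBox() *Box { return new(Box) }

func (b *Box) Padding(p int) *Box { b.padding = p; return b }

func (b *Box) Gap(g int) *Box { b.gap = g; return b }

func main() {
	cb := New(Label("Enable notifications"), Checked(true))
	cb.Toggle()
	box := NewBox().Padding(32).Gap(12)
	fmt.Println(cb.label, cb.checked, box.padding, box.gap)
}
```

&lt;p&gt;Options suit values that are fixed at construction and can be omitted freely, while chained setters suit styling that reads naturally as a sequence; the sketch keeps the two separate for that reason.&lt;/p&gt;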

&lt;p&gt;The rendering pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Widget tree → render.Canvas (gg) → ggcanvas → GPU surface
                                    ↓
                              Zero CPU readback
                              Direct GPU composition
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Near-term: Phase 2 Completion
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Widget&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TextField&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Text input with cursor, selection, clipboard&lt;/td&gt;
&lt;td&gt;Next up&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dropdown&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Popup selection with search&lt;/td&gt;
&lt;td&gt;Planned&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Slider&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Range input with track and thumb&lt;/td&gt;
&lt;td&gt;Planned&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Progress&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Determinate and indeterminate indicators&lt;/td&gt;
&lt;td&gt;Planned&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Phase 3–4: IDE-Class Application Shell
&lt;/h3&gt;

&lt;p&gt;This is where it gets ambitious. The target is a &lt;strong&gt;GoLand-class IDE layout&lt;/strong&gt;: the kind of application shell that currently requires Electron, Qt, or platform-native code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────────┐
│  Toolbar                                                 │
├────────┬─────────────────────────────────┬───────────────┤
│        │  Tab1 │ Tab2 │ Tab3             │               │
│  Left  │─────────────────────────────────│  Right Panel  │
│  Panel │                                 │  (Inspector,  │
│  (Tree,│     Main Editor Area            │   Properties) │
│  Files)│                                 │               │
│        │                                 │               │
├────────┴──────────────────┬──────────────┴───────────────┤
│  Terminal │ Problems │ Git│                              │
│  Bottom Panel (resizable, collapsible)                   │
└──────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The building blocks for this, in order:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ScrollView&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Smooth scrolling with inertia&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TabView&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Editor-style tabs with close buttons, reordering&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SplitView&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Resizable horizontal/vertical splits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dialog/Modal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Popup windows with focus trapping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Popover/Tooltip&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Context menus, hover tooltips&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;VirtualizedList/Grid&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Render 100K items (file trees, logs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Animation Engine&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Spring physics, transitions, easing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Docking System&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Draggable panels — left, right, bottom, floating&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Drag &amp;amp; Drop&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Cross-widget, cross-window DnD (tab reordering, panel docking)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fluent Theme&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Windows-native look&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cupertino Theme&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;macOS-native look&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;i18n&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;RTL text, locale-aware formatting&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The docking system is the crown jewel — panels that snap to left/right/bottom, collapse to icon strips, drag between positions. Exactly what you see in GoLand, VS Code, or Photoshop. Built entirely in Go, rendered entirely on the GPU.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core Libraries: Maintenance Mode
&lt;/h2&gt;

&lt;p&gt;With the GUI toolkit as the primary focus, the lower-level libraries shift to maintenance:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Library&lt;/th&gt;
&lt;th&gt;Focus Going Forward&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;gg&lt;/strong&gt; (2D graphics)&lt;/td&gt;
&lt;td&gt;Bug fixes (#95 AA quality, #72 circle artifact), GPU pattern support, performance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;wgpu&lt;/strong&gt; (WebGPU)&lt;/td&gt;
&lt;td&gt;Community bug reports, platform-specific fixes, new HAL features as needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;naga&lt;/strong&gt; (shaders)&lt;/td&gt;
&lt;td&gt;HLSL matrix fix (DX12 text rendering), new shader targets on demand&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;gogpu&lt;/strong&gt; (framework)&lt;/td&gt;
&lt;td&gt;Community issues (#82 NVIDIA crash, #89 macOS Tahoe), WASM support (#70)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This doesn't mean they're done — it means they're &lt;strong&gt;stable enough to build on&lt;/strong&gt; while we focus our development time on the toolkit that will use them.&lt;/p&gt;




&lt;h2&gt;
  
  
  Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total ecosystem&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;425K+ lines of Go&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;UI toolkit alone&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;55K lines, 208 Go files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Test functions (UI)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1,400+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Test coverage (UI)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;97% average&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPU backends&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5 (Vulkan, DX12, Metal, GLES, Software)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Shader targets&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4 (SPIR-V, MSL, GLSL, HLSL)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Platforms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Windows, Linux (X11/Wayland), macOS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CGO required&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone and run the hello example&lt;/span&gt;
git clone https://github.com/gogpu/ui
&lt;span class="nb"&gt;cd &lt;/span&gt;ui/examples/hello
go run &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Requirements: Go 1.25+, a GPU with Vulkan/DX12/Metal/GLES support.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Involved
&lt;/h2&gt;

&lt;p&gt;We're building this in the open and we want your input:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/orgs/gogpu/discussions" rel="noopener noreferrer"&gt;GitHub Discussions&lt;/a&gt;&lt;/strong&gt; — Feature requests, architecture feedback&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/gogpu/ui/issues" rel="noopener noreferrer"&gt;gogpu/ui Issues&lt;/a&gt;&lt;/strong&gt; — Bug reports, widget requests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;gogpu Organization&lt;/a&gt;&lt;/strong&gt; — All repositories&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The foundation is solid. Now comes the fun part — building the toolkit that makes Go a first-class citizen for desktop applications.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The GoGPU ecosystem is MIT-licensed. Contributions welcome.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>gui</category>
      <category>opensource</category>
      <category>webgpu</category>
    </item>
    <item>
      <title>Smart Coding: What Karpathy's Agentic Engineering Is Missing (630K LOC of Proof)</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Thu, 12 Feb 2026 10:58:09 +0000</pubDate>
      <link>https://forem.com/kolkov/from-vibe-coding-to-agentic-engineering-what-karpathy-got-right-and-whats-missing-62e</link>
      <guid>https://forem.com/kolkov/from-vibe-coding-to-agentic-engineering-what-karpathy-got-right-and-whats-missing-62e</guid>
      <description>&lt;p&gt;On February 4, 2026, Andrej Karpathy — the person who gave us "vibe coding" exactly one year earlier — declared it passé. His replacement term: &lt;strong&gt;Agentic Engineering&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Agentic — because the new default is that you are not writing the code directly 99% of the time, you are orchestrating agents. Engineering — because there is an art &amp;amp; science and expertise to it."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The term landed in a landscape already primed for it. Anthropic had just released their &lt;a href="https://resources.anthropic.com/hubfs/2026%20Agentic%20Coding%20Trends%20Report.pdf" rel="noopener noreferrer"&gt;2026 Agentic Coding Trends Report&lt;/a&gt;. A Hacker News thread asking &lt;a href="https://news.ycombinator.com/item?id=46691243" rel="noopener noreferrer"&gt;"Do you have any evidence that agentic coding works?"&lt;/a&gt; pulled 461 points and 455 comments. An &lt;a href="https://arxiv.org/abs/2505.19443" rel="noopener noreferrer"&gt;academic paper on arXiv&lt;/a&gt; had already formalized the paradigm split. And within days of Karpathy's post, Addy Osmani (Google) published a &lt;a href="https://addyosmani.com/blog/agentic-engineering/" rel="noopener noreferrer"&gt;detailed framework&lt;/a&gt; and announced an O'Reilly book titled &lt;em&gt;Beyond Vibe Coding&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I've been practicing and writing about this exact shift since January 2026 — a month before Karpathy named it. My &lt;a href="https://dev.to/kolkov/smart-coding-vs-vibe-coding-engineering-discipline-in-the-age-of-ai-5b20"&gt;Smart Coding article&lt;/a&gt; laid out the same core idea: engineering discipline plus AI as an accelerator, not a replacement.&lt;/p&gt;

&lt;p&gt;I'm not claiming credit for the term. But the conversation is still incomplete. Karpathy gave us a great name. Osmani gave us a great workflow. Both frameworks miss a critical piece that I've found essential across 35+ production projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Paradigms: Understanding the Spectrum
&lt;/h2&gt;

&lt;p&gt;The community keeps framing this as a binary: Vibe Coding vs. Agentic Engineering. That's wrong. There are actually three distinct paradigms, and understanding all three is the key to working effectively with AI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vibe Coding: Exploration Mode
&lt;/h3&gt;

&lt;p&gt;Karpathy's original description (February 2025): "You fully give in to the vibes, embrace exponentials, and forget that the code even exists."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When it works:&lt;/strong&gt; Prototypes, hackathons, learning new APIs, feasibility spikes. You prompt, you accept, you run, you see if it works. Fast and cheap — and the code is disposable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When it fails:&lt;/strong&gt; Production. As a &lt;a href="https://addyo.substack.com/p/the-80-problem-in-agentic-coding" rel="noopener noreferrer"&gt;survey of 18 CTOs&lt;/a&gt; revealed, 16 experienced production disasters directly caused by unreviewed AI-generated code. The code compiled. The tests (auto-generated, shallow) passed. The bugs were subtle, the security vulnerabilities real.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agentic Engineering: Orchestration Mode
&lt;/h3&gt;

&lt;p&gt;Karpathy's new framework: you orchestrate AI agents while maintaining oversight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it gets right:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developers are no longer typists — they're orchestrators&lt;/li&gt;
&lt;li&gt;Multiple agents can work in parallel (code, test, review)&lt;/li&gt;
&lt;li&gt;Quality gates and automated validation are essential&lt;/li&gt;
&lt;li&gt;The skill is in specification and oversight, not syntax&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What it misses:&lt;/strong&gt; More on this below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Smart Coding: The Unifying Framework
&lt;/h3&gt;

&lt;p&gt;My take: &lt;strong&gt;Smart Coding isn't the opposite of either Vibe Coding or Agentic Engineering. It's the framework that encompasses both.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────┐
│              SMART CODING                   │
│         (The Meta-Framework)                │
│                                             │
│  ┌─────────────┐    ┌───────────────────┐   │
│  │ Vibe Coding │    │    Agentic        │   │
│  │ (Explore)   │───→│    Engineering    │   │
│  │             │    │    (Build)        │   │
│  └─────────────┘    └───────────────────┘   │
│         ↑                    │              │
│         └────────────────────┘              │
│    Feedback loop + Knowledge accumulation   │
└─────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Smart Coding is knowing &lt;strong&gt;when&lt;/strong&gt; to Vibe and &lt;strong&gt;when&lt;/strong&gt; to Engineer — and, crucially, how to make each session compound on the last. It's the judgment layer that neither paradigm addresses on its own.&lt;/p&gt;

&lt;p&gt;Vibe Coding answers: &lt;em&gt;how do I explore quickly?&lt;/em&gt;&lt;br&gt;
Agentic Engineering answers: &lt;em&gt;how do I build with AI agents?&lt;/em&gt;&lt;br&gt;
Smart Coding answers: &lt;em&gt;how do I decide, learn, and get better at both over time?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  What Karpathy Got Right
&lt;/h2&gt;

&lt;p&gt;Credit where it's due — Karpathy nailed several things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The 99% observation is accurate.&lt;/strong&gt;&lt;br&gt;
This isn't hyperbole. &lt;a href="https://fortune.com/2026/01/29/100-percent-of-code-at-anthropic-and-openai-is-now-ai-written-boris-cherny-roon/" rel="noopener noreferrer"&gt;Boris Cherny&lt;/a&gt;, head of Claude Code at Anthropic, says he hasn't written a line of code by hand in over two months — shipping 22-27 PRs per day, all 100% AI-generated. OpenAI researcher Roon says the same: "100%, I don't write code anymore." Anthropic's CPO Mike Krieger confirms that for most products it's "effectively 100%", with the company-wide average at 70-90%. In my own daily work across multiple Go ecosystems, I'd put it at 90-95%. My role is architecture, specification, review.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. "Engineering" is the right word.&lt;/strong&gt;&lt;br&gt;
Calling it "engineering" demands rigor. It's not "agentic vibing" or "agentic prompting." The word engineering implies standards, validation, discipline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. The timing is right.&lt;/strong&gt;&lt;br&gt;
Models in early 2026 (Claude Opus 4.6, GPT-5, Gemini 2.5) are genuinely capable of multi-step autonomous work. Agentic workflows that failed in 2024 now succeed reliably with proper guidance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. The self-awareness is refreshing.&lt;/strong&gt;&lt;br&gt;
Karpathy calling his original vibe coding tweet a "shower of thoughts throwaway" is honest. The fact that it resonated shows the community was ready for the vocabulary, not that the term was deeply considered.&lt;/p&gt;
&lt;h2&gt;
  
  
  What's Missing: The Bidirectional Learning Gap
&lt;/h2&gt;

&lt;p&gt;Neither Karpathy nor Osmani fully addresses this: &lt;strong&gt;the AI doesn't learn from you, and most developers don't systematically learn from AI.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Osmani's framework says: "Start with a plan. Direct. Review. Test. Own the codebase." All correct. But it treats AI as a stateless contractor — you give instructions, it delivers code, you review. Next session, same blank slate.&lt;/p&gt;

&lt;p&gt;This is like hiring a developer who forgets everything every morning.&lt;/p&gt;
&lt;h3&gt;
  
  
  Teaching Your AI Agent
&lt;/h3&gt;

&lt;p&gt;In my &lt;a href="https://dev.to/kolkov/smart-coding-vs-vibe-coding-engineering-discipline-in-the-age-of-ai-5b20"&gt;Smart Coding article&lt;/a&gt;, I described the &lt;strong&gt;Knowledge File Pattern&lt;/strong&gt;, a dedicated context file that your AI agent reads at the start of every session:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# PROJECT_CONTEXT.md&lt;/span&gt;

&lt;span class="gu"&gt;## Architecture&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Hexagonal architecture with ports/adapters
&lt;span class="p"&gt;-&lt;/span&gt; All external services accessed through interfaces
&lt;span class="p"&gt;-&lt;/span&gt; No business logic in HTTP handlers

&lt;span class="gu"&gt;## Conventions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Error wrapping with context, never naked errors
&lt;span class="p"&gt;-&lt;/span&gt; Structured JSON logging with request_id
&lt;span class="p"&gt;-&lt;/span&gt; Table-driven tests for validation logic

&lt;span class="gu"&gt;## Domain Knowledge&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; "Settlement" = end-of-day batch processing
&lt;span class="p"&gt;-&lt;/span&gt; Customer IDs are UUIDs, Order IDs are sequential

&lt;span class="gu"&gt;## Known Pitfalls&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Redis cluster doesn't support MULTI in our setup
&lt;span class="p"&gt;-&lt;/span&gt; Legacy API returns 200 for errors — check body
&lt;span class="p"&gt;-&lt;/span&gt; Date fields from Partner X: ISO but no timezone

&lt;span class="gu"&gt;## Lessons Learned&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Week 1: AI suggested time.Now() → added: "Use injected clock"
&lt;span class="p"&gt;-&lt;/span&gt; Week 2: AI used fmt logger → added: "Use zerolog with context"
&lt;span class="p"&gt;-&lt;/span&gt; Week 3: AI used a flat package layout → added: "Domain-driven packages"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This file is a &lt;strong&gt;living document&lt;/strong&gt; that accumulates project wisdom. Each AI mistake becomes a permanent correction. Each session starts where the last one ended.&lt;/p&gt;
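&lt;p&gt;As a rough illustration of how such a file can feed tooling, the sketch below splits a knowledge file into its &lt;code&gt;##&lt;/code&gt; sections so that a session wrapper could hand an agent only the parts it needs. The helper name &lt;code&gt;parseSections&lt;/code&gt; and the sample content are assumptions made for this example; they are not part of any agent's actual API.&lt;/p&gt;

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseSections is a hypothetical helper: it groups the "- " bullet lines of a
// Markdown knowledge file under the "## " heading they appear beneath.
func parseSections(doc string) map[string][]string {
	sections := make(map[string][]string)
	current := ""
	sc := bufio.NewScanner(strings.NewReader(doc))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "## ") {
			current = strings.TrimPrefix(line, "## ")
			continue
		}
		if current == "" {
			continue // bullets before the first section are ignored
		}
		if strings.HasPrefix(line, "- ") {
			sections[current] = append(sections[current], strings.TrimPrefix(line, "- "))
		}
	}
	return sections
}

func main() {
	// Sample content assumed for illustration only.
	doc := `# PROJECT_CONTEXT.md

## Conventions
- Error wrapping with context, never naked errors

## Known Pitfalls
- Legacy API returns 200 for errors, check body
- Date fields from Partner X: ISO but no timezone
`
	for _, pitfall := range parseSections(doc)["Known Pitfalls"] {
		fmt.Println("pitfall:", pitfall)
	}
}
```

&lt;p&gt;Keeping the file in plain Markdown with predictable headings is what makes this kind of selective loading cheap: no schema, just a convention that both humans and agents can follow.&lt;/p&gt;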

&lt;p&gt;In practice, I go further. Each project has a &lt;code&gt;STATUS.md&lt;/code&gt; that the agent reads at session start and updates on exit — current progress, near-term plans, discovered problems that need investigation. Detailed tasks go into an internal kanban broken into stages with interdependencies. There are &lt;code&gt;LINTER_RULES&lt;/code&gt; files that agents read &lt;em&gt;during&lt;/em&gt; coding and update when they discover new patterns — so code passes golangci-lint on the first try, not the fifth.&lt;/p&gt;

&lt;p&gt;Each repository also maintains a &lt;strong&gt;research knowledge base&lt;/strong&gt; — results of investigations, benchmarks, comparisons with alternative approaches, and crucially, &lt;em&gt;why&lt;/em&gt; specific design decisions were made, with the full context of the decision. For an ecosystem like GoGPU with 9 interrelated repositories, there's an ecosystem-wide research base on top of per-repo ones. When you revisit a decision six months later, you don't have to reverse-engineer your own reasoning — it's documented, with the alternatives you considered and why you rejected them.&lt;/p&gt;

&lt;p&gt;And the knowledge is &lt;strong&gt;hierarchical&lt;/strong&gt;. The root-level context knows about all repositories in the ecosystem but doesn't go deep. Each project essentially has its own agent that fully owns its repository — knows every module, every convention, every quirk. It practically lives inside the codebase. When I switch between projects, the context switches with me, and each agent picks up right where it left off.&lt;/p&gt;

&lt;p&gt;The agent doesn't just know &lt;em&gt;how&lt;/em&gt; to write code for this project — it knows &lt;em&gt;where we left off&lt;/em&gt;, &lt;em&gt;what problems we've spotted&lt;/em&gt;, and &lt;em&gt;what's next&lt;/em&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Feedback Loop Karpathy Doesn't Mention
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Session 1: AI makes 20 mistakes. You correct 20.
           → 20 corrections added to knowledge file.

Session 2: AI makes 5 mistakes (15 prevented by context).
           → 5 new corrections added.

Session 3: AI makes 1 mistake.
           → Your knowledge file is now a comprehensive project spec.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is &lt;strong&gt;bidirectional learning&lt;/strong&gt;: you learn AI's patterns and capabilities, AI (via context) learns your project's constraints and conventions. Over weeks, this compounds dramatically.&lt;/p&gt;

&lt;p&gt;Without this feedback loop, agentic engineering is like conducting an orchestra that can't remember the piece it played yesterday. You spend half your time re-explaining context instead of making progress.&lt;/p&gt;

&lt;p&gt;The exchange is asymmetric, and that's the point. &lt;strong&gt;You&lt;/strong&gt; know better: project context, business domain, team conventions, past mistakes, infrastructure quirks, stakeholder constraints. &lt;strong&gt;AI&lt;/strong&gt; knows better: syntax across languages, algorithm tradeoffs, library APIs, boilerplate patterns, cross-ecosystem solutions. Smart Coding makes this exchange &lt;strong&gt;systematic and cumulative&lt;/strong&gt;, not ad hoc and forgotten.&lt;/p&gt;

&lt;h3&gt;
  
  
  Knowledge Consolidation: Keeping Context Sharp
&lt;/h3&gt;

&lt;p&gt;There's a catch that nobody talks about. Knowledge files grow. Fast. After weeks of daily sessions, your &lt;code&gt;PROJECT_CONTEXT.md&lt;/code&gt; becomes a sprawling document with redundant entries, outdated lessons, and contradictory notes from different project phases. The AI agent starts drowning in its own context — reading hundreds of lines of accumulated wisdom where half no longer applies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You need to periodically consolidate your knowledge files.&lt;/strong&gt; Think of it as garbage collection for project context.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Week 1-4:   Knowledge file grows organically (additions only)
Week 5:     Consolidation pass
            - Remove lessons the AI no longer needs (patterns now habitual)
            - Merge duplicate entries
            - Update outdated conventions (API changed, dependency upgraded)
            - Restructure: separate "always relevant" from "situational"
            - Archive historical context that's no longer actionable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, I do a consolidation pass every few weeks — or whenever I notice the agent making mistakes it shouldn't, which often means it's losing important rules in a sea of less relevant ones. Context windows are large but not infinite, and signal-to-noise ratio matters more than volume.&lt;/p&gt;

&lt;p&gt;The consolidation itself is a form of learning. Reviewing what accumulated forces you to reflect: which patterns stuck? Which conventions evolved? What assumptions turned out wrong? It's a checkpoint for both the project and your own understanding of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; Keep a separate archive file for removed entries. You might need them if you revisit an old subsystem or onboard someone new. The active knowledge file should be lean and current — a briefing, not a history book.&lt;/p&gt;
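&lt;p&gt;A consolidation pass can be partly mechanical. The sketch below assumes each knowledge-file line has already been reviewed and flagged as still relevant or not; it then merges duplicates and splits the result into an active list and an archive list. The &lt;code&gt;Entry&lt;/code&gt; type and &lt;code&gt;consolidate&lt;/code&gt; function are hypothetical names for this example, and the relevance decision itself remains a human judgment.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical record for one knowledge-file line.
type Entry struct {
	Text   string
	Active bool // set during human review: does this rule still apply?
}

// consolidate merges duplicate entries (compared case-insensitively after
// trimming whitespace; the first occurrence wins) and splits the remainder
// into an active list and an archive list.
func consolidate(entries []Entry) (active, archive []string) {
	seen := make(map[string]bool)
	for _, e := range entries {
		text := strings.TrimSpace(e.Text)
		key := strings.ToLower(text)
		if key == "" || seen[key] {
			continue
		}
		seen[key] = true
		if e.Active {
			active = append(active, text)
		} else {
			archive = append(archive, text)
		}
	}
	return active, archive
}

func main() {
	// Sample entries assumed for illustration only.
	entries := []Entry{
		{Text: "Use injected clock", Active: true},
		{Text: "use injected clock ", Active: true}, // duplicate
		{Text: "Partner X v1 API quirks", Active: false},
	}
	active, archive := consolidate(entries)
	fmt.Println("active:", active)
	fmt.Println("archive:", archive)
}
```

&lt;p&gt;Writing the archive list to a separate file keeps removed lessons recoverable while the active file stays short.&lt;/p&gt;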

&lt;h2&gt;
  
  
  Real-World Case: Building Where Go Was Weakest
&lt;/h2&gt;

&lt;p&gt;Enough theory. What does Smart Coding look like at scale?&lt;/p&gt;

&lt;p&gt;It started last summer, when I decided to publish my internal &lt;a href="https://github.com/scigolib/hdf5" rel="noopener noreferrer"&gt;HDF5&lt;/a&gt; and &lt;a href="https://github.com/scigolib/matlab" rel="noopener noreferrer"&gt;MATLAB&lt;/a&gt; parsers as open-source libraries. These were working code extracted from private projects — battle-tested, but not shaped for public consumption. Packaging them properly (documentation, CI, tests, API design) was the first exercise.&lt;/p&gt;

&lt;p&gt;Then something clicked. I had years of accumulated private code solving problems where Go was traditionally weak — and modern AI agents (Claude Opus 4.6 in particular) made it feasible to extract, refactor, and publish at a pace I couldn't have imagined before. Without these tools, I could realistically maintain one, maybe two open-source projects alongside my regular work. With Smart Coding, I've shipped 35+.&lt;/p&gt;

&lt;p&gt;There's a personal angle here too. After COVID-19, my vision deteriorated — I developed &lt;a href="https://specialty.vision/article/what-is-ghosting-vision/" rel="noopener noreferrer"&gt;polyopia and ghosting&lt;/a&gt;, where letters and digits multiply across different depth layers on screen, like looking through layered glass at night in a subway tunnel. Not simple doubling — images shift along random axes, some closer, some further, overlapping unpredictably. Post-COVID &lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC9843575/" rel="noopener noreferrer"&gt;corneal and lens changes&lt;/a&gt; are well-documented, and mine made reading code character by character genuinely exhausting. Agentic engineering changed my relationship with the screen: I focus on code flow, architecture, and diffs — not on typing every semicolon and bracket. My eyes track the big picture while agents handle the detail work. It's not just a productivity story. For developers dealing with vision impairment, RSI, or other physical constraints, this way of working can be genuinely life-changing.&lt;/p&gt;

&lt;p&gt;I want to be honest about this: &lt;strong&gt;AI didn't design these libraries. I did.&lt;/strong&gt; The architecture, the API decisions, the choice of which problems to solve — that's decades of engineering experience. But AI turned what would have been years of solo implementation work into months. That's the force multiplier effect in action.&lt;/p&gt;

&lt;p&gt;What that looks like in practice — across a systematic effort to fill Go's blind spots:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Pattern: Identifying Architectural Gaps
&lt;/h3&gt;

&lt;p&gt;Go is excellent for servers, CLI tools, and infrastructure. But it has well-known blind spots:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Domain&lt;/th&gt;
&lt;th&gt;The Gap&lt;/th&gt;
&lt;th&gt;What Existed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPU/Graphics&lt;/td&gt;
&lt;td&gt;No unified ecosystem (vs. Rust's wgpu+naga+bevy)&lt;/td&gt;
&lt;td&gt;Individual libs (Ebiten, Gio, Fyne), all CGO-based&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regex&lt;/td&gt;
&lt;td&gt;stdlib trades speed for linear-time guarantees (single RE2-style engine)&lt;/td&gt;
&lt;td&gt;No multi-engine alternative&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ML/Deep Learning&lt;/td&gt;
&lt;td&gt;Python dominance, Rust has Burn&lt;/td&gt;
&lt;td&gt;No production Go framework&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scientific formats (HDF5, MATLAB)&lt;/td&gt;
&lt;td&gt;CGO wrappers only, no write support&lt;/td&gt;
&lt;td&gt;No pure Go read/write implementations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Race detection&lt;/td&gt;
&lt;td&gt;Built-in requires CGO&lt;/td&gt;
&lt;td&gt;Can't use on Lambda/Alpine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PDF processing&lt;/td&gt;
&lt;td&gt;Limited libraries&lt;/td&gt;
&lt;td&gt;No enterprise-grade option&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;AI agents can't identify these gaps. They don't understand ecosystem dynamics, community pain points, or the strategic value of filling a specific niche. That's architecture thinking — the human's job.&lt;/p&gt;

&lt;h3&gt;
  
  
  GoGPU: 630K+ Lines of Pure Go Graphics
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;GoGPU&lt;/a&gt; is a full GPU computing ecosystem: 630,000+ lines of pure Go across 9 repositories. It includes a &lt;a href="https://github.com/gogpu/wgpu" rel="noopener noreferrer"&gt;WebGPU implementation&lt;/a&gt; (127K LOC), a &lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;2D graphics library&lt;/a&gt; (204K LOC), a &lt;a href="https://github.com/gogpu/naga" rel="noopener noreferrer"&gt;shader compiler&lt;/a&gt; (90K LOC) translating WGSL to SPIR-V/MSL/GLSL/HLSL, a &lt;a href="https://github.com/gogpu/ui" rel="noopener noreferrer"&gt;UI toolkit&lt;/a&gt; (161K LOC), and PDF and SVG export backends. It targets five GPU backends (Vulkan, DirectX 12, Metal, GLES, Software) on three platforms, with 429+ stars and growing.&lt;/p&gt;

&lt;p&gt;Go had graphics libraries before this — &lt;a href="https://ebitengine.org/" rel="noopener noreferrer"&gt;Ebitengine&lt;/a&gt; for 2D games, &lt;a href="https://gioui.org/" rel="noopener noreferrer"&gt;Gio&lt;/a&gt; and &lt;a href="https://fyne.io/" rel="noopener noreferrer"&gt;Fyne&lt;/a&gt; for UI. But what it lacked was what Rust has built over years: a &lt;strong&gt;cohesive, integrated GPU ecosystem&lt;/strong&gt;. Rust's &lt;a href="https://github.com/gfx-rs/wgpu" rel="noopener noreferrer"&gt;wgpu&lt;/a&gt; + &lt;a href="https://github.com/gfx-rs/naga" rel="noopener noreferrer"&gt;naga&lt;/a&gt; + Bevy + Iced form an interconnected stack from low-level HAL to high-level UI — all pure Rust, all interoperable. Go had nothing comparable. Every existing option required CGO, and there was no shader compiler, no unified abstraction layer, no path from GPU compute to rendered pixels without leaving Go.&lt;/p&gt;

&lt;p&gt;GoGPU fills that gap: a pure Go stack from shader compilation to 2D graphics to UI toolkit.&lt;/p&gt;

&lt;p&gt;This was impossible to Vibe Code. The architecture decisions — how to structure a shader compiler pipeline, how to map WebGPU's memory model to Go's garbage collector, how to design cross-platform rendering abstractions across five GPU backends — require deep understanding of both GPU programming and Go's runtime characteristics.&lt;/p&gt;

&lt;p&gt;But it was also impossible without AI agents. Translating shader compiler patterns from Rust's naga to idiomatic Go? Implementing a WebGPU HAL across five backends? That's exactly where agentic engineering shines — clearly specified translation tasks with well-defined inputs and outputs, executed at a pace no single developer could match.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Smart Coding approach:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Vibe phase&lt;/strong&gt;: Explore Rust's naga codebase, understand the architecture (throwaway spikes)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture phase&lt;/strong&gt;: Design Go-idiomatic module boundaries, memory layout, API surface (human)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic phase&lt;/strong&gt;: AI agents implement well-specified components under strict review&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge accumulation&lt;/strong&gt;: Each module taught the AI about Go-specific GPU patterns, making subsequent modules faster&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Coregex: Multi-Engine Regex, 3-3000x Faster Than stdlib
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;Coregex&lt;/a&gt; (156 stars) is a multi-engine regex library inspired by Rust's &lt;a href="https://github.com/rust-lang/regex" rel="noopener noreferrer"&gt;regex crate&lt;/a&gt; architecture — multiple execution engines (Thompson NFA, Pike VM, one-pass DFA, bounded backtracker) with an intelligent meta-engine that selects the optimal strategy per pattern. It outperforms Go's stdlib by 3 to 3000x depending on the pattern.&lt;/p&gt;

&lt;p&gt;You can't Vibe Code a multi-engine regex library. The meta-engine selection logic, Thompson NFA construction, Pike VM execution, SIMD-accelerated literal search — each requires deep algorithmic understanding and careful architectural decisions about when to dispatch to which engine. But you can use AI agents to implement well-specified automaton transitions, generate comprehensive test suites from regex grammar specifications, and benchmark systematically.&lt;/p&gt;
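&lt;p&gt;To make the idea of a meta-engine concrete, here is a deliberately tiny sketch of per-pattern strategy selection. It is not Coregex's API or its selection logic: the &lt;code&gt;Matcher&lt;/code&gt; interface, &lt;code&gt;literalMatcher&lt;/code&gt; type, and &lt;code&gt;compile&lt;/code&gt; function are hypothetical, and the only two strategies are substring search and Go's standard &lt;code&gt;regexp&lt;/code&gt; package. A pattern is treated as a literal when &lt;code&gt;regexp.QuoteMeta&lt;/code&gt; leaves it unchanged, meaning it contains no metacharacters.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Matcher is the common interface every strategy in this sketch satisfies.
type Matcher interface {
	MatchString(s string) bool
}

// literalMatcher handles patterns without metacharacters via substring search.
type literalMatcher struct{ needle string }

func (m literalMatcher) MatchString(s string) bool {
	return strings.Contains(s, m.needle)
}

// compile is a hypothetical two-strategy meta-engine. It returns the chosen
// matcher together with the strategy name so callers can see the dispatch.
func compile(pattern string) (Matcher, string, error) {
	// QuoteMeta escapes every regex metacharacter; if nothing changed,
	// the pattern is a plain literal and needs no automaton at all.
	if regexp.QuoteMeta(pattern) == pattern {
		return literalMatcher{needle: pattern}, "literal", nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, "", err
	}
	return re, "stdlib", nil
}

func main() {
	for _, p := range []string{"timeout", `err(or)?s?: \d+`} {
		m, strategy, err := compile(p)
		if err != nil {
			fmt.Println("compile failed:", err)
			continue
		}
		fmt.Printf("%q uses %s, match=%v\n", p, strategy, m.MatchString("errors: 42 (timeout)"))
	}
}
```

&lt;p&gt;A real meta-engine weighs far more signals (prefix literals, pattern size, anchoring, capture groups), but the shape is the same: analyze the pattern once, then dispatch to the cheapest engine that is still correct for it.&lt;/p&gt;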

&lt;p&gt;The knowledge file for this project grew to include regex-specific patterns:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Regex Engine Conventions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; NFA states use uint32, not int (SIMD alignment)
&lt;span class="p"&gt;-&lt;/span&gt; All character class operations must be branchless
&lt;span class="p"&gt;-&lt;/span&gt; Thompson construction: no epsilon-removal optimization
  (benchmarks showed it's slower for our pattern distribution)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every session with AI was more productive than the last because the context accumulated.&lt;/p&gt;

&lt;h3&gt;
  
  
  Born: Go's Answer to Burn
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/born-ml/born" rel="noopener noreferrer"&gt;Born&lt;/a&gt; (51 stars) is a production ML framework with a PyTorch-like API, automatic differentiation, and type-safe tensors: Go's answer to Rust's &lt;a href="https://github.com/tracel-ai/burn" rel="noopener noreferrer"&gt;Burn&lt;/a&gt; framework. Where Burn brought deep learning to Rust with swappable backends and ONNX interop, Born brings it to Go with the same philosophy: single binary deployment, compile-time type safety, and zero external dependencies.&lt;/p&gt;

&lt;p&gt;Here, the &lt;strong&gt;cross-ecosystem research pattern&lt;/strong&gt; was critical. AI agents researched Burn's backend trait architecture, PyTorch's autograd implementation, JAX's tracing approach, and Tinygrad's minimalist design. But the human decided: Go's strengths (single binary deployment, generics-based type safety, goroutine parallelism) dictate a different design than either Python's dynamic typing or Rust's ownership model allows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smart prompt vs. naive prompt:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Naive: "Implement backpropagation in Go"
→ Gets a textbook implementation that ignores Go's type system

Smart: "I've studied Burn's backend trait architecture, JAX's tracing,
and PyTorch's autograd. For Go, I want backpropagation that:
- Uses Go generics for type-safe tensor operations
- Leverages goroutines for parallel gradient computation
- Produces a computational graph that can be serialized
- Follows the conventions in PROJECT_CONTEXT.md
What are the tradeoffs vs. a tape-based approach for our use case?"
→ Gets an informed implementation that fits the ecosystem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
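&lt;p&gt;For readers unfamiliar with the "tape-based approach" the smart prompt refers to, the following is a minimal sketch of reverse-mode differentiation over scalars using Go generics. It is not Born's implementation or API: the &lt;code&gt;Float&lt;/code&gt; constraint, the &lt;code&gt;Tape&lt;/code&gt; type, and its methods are hypothetical names chosen for this example, and real frameworks operate on tensors rather than single numbers.&lt;/p&gt;

```go
package main

import "fmt"

// Float is a hypothetical constraint limiting the tape to floating-point types.
type Float interface {
	~float32 | ~float64
}

// node stores one recorded value plus up to two (parent, local gradient) pairs.
type node[T Float] struct {
	value   T
	parents [2]int
	grads   [2]T
	arity   int
}

// Tape records operations in execution order so they can be replayed backwards.
type Tape[T Float] struct {
	nodes []node[T]
}

func (t *Tape[T]) push(n node[T]) int {
	t.nodes = append(t.nodes, n)
	return len(t.nodes) - 1
}

// Var registers an input value and returns its index on the tape.
func (t *Tape[T]) Var(v T) int { return t.push(node[T]{value: v}) }

// Value returns the forward value stored at index i.
func (t *Tape[T]) Value(i int) T { return t.nodes[i].value }

// Add records a+b; both local gradients are 1.
func (t *Tape[T]) Add(a, b int) int {
	sum := t.nodes[a].value + t.nodes[b].value
	return t.push(node[T]{value: sum, parents: [2]int{a, b}, grads: [2]T{1, 1}, arity: 2})
}

// Mul records a*b; the local gradient for each input is the other input's value.
func (t *Tape[T]) Mul(a, b int) int {
	va, vb := t.nodes[a].value, t.nodes[b].value
	return t.push(node[T]{value: va * vb, parents: [2]int{a, b}, grads: [2]T{vb, va}, arity: 2})
}

// Backward walks the tape in reverse and accumulates d(out)/d(node) for every node.
func (t *Tape[T]) Backward(out int) []T {
	grad := make([]T, len(t.nodes))
	grad[out] = 1
	for i := out; i >= 0; i-- {
		n := t.nodes[i]
		for k := 0; k != n.arity; k++ {
			grad[n.parents[k]] += n.grads[k] * grad[i]
		}
	}
	return grad
}

func main() {
	var tape Tape[float64]
	x, y := tape.Var(3), tape.Var(4)
	f := tape.Add(tape.Mul(x, y), x) // f = x*y + x
	g := tape.Backward(f)
	fmt.Println(tape.Value(f), g[x], g[y]) // prints: 15 5 3
}
```

&lt;p&gt;The type parameter means a &lt;code&gt;Tape[float32]&lt;/code&gt; and a &lt;code&gt;Tape[float64]&lt;/code&gt; cannot be mixed by accident, which is the kind of compile-time guarantee the smart prompt asks the agent to preserve.&lt;/p&gt;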



&lt;h3&gt;
  
  
  The Pattern Across All Projects
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/kolkov/racedetector" rel="noopener noreferrer"&gt;Racedetector&lt;/a&gt; (35 stars) — pure Go race detector built on multiple research papers: FastTrack (Flanagan &amp;amp; Freund, PLDI 2009), escape analysis integration, shadow memory, vector clocks, and AST instrumentation. 359 tests ported from Go's official race detector suite. This required studying concurrent systems theory across several academic papers and synthesizing them into a cohesive architecture — then AI accelerated the implementation of each well-specified component.&lt;/p&gt;
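&lt;p&gt;The vector-clock idea underneath that design fits in a few lines. The sketch below is a plain vector-clock comparison, not the racedetector's code and not FastTrack's epoch optimization; the &lt;code&gt;VC&lt;/code&gt; type and the &lt;code&gt;happensBefore&lt;/code&gt;, &lt;code&gt;join&lt;/code&gt;, and &lt;code&gt;concurrent&lt;/code&gt; helpers are hypothetical names, and both clocks are assumed to have the same length.&lt;/p&gt;

```go
package main

import "fmt"

// VC is a vector clock: one logical counter per goroutine, indexed by goroutine ID.
type VC []uint64

// happensBefore reports whether a is ordered before (or equal to) b:
// no component of a may exceed the same component of b.
func happensBefore(a, b VC) bool {
	for i := range a {
		if a[i] > b[i] {
			return false
		}
	}
	return true
}

// join merges b into a by taking the component-wise maximum, which is what
// a goroutine does to its clock when it acquires a lock another one released.
func join(a, b VC) {
	for i := range a {
		if b[i] > a[i] {
			a[i] = b[i]
		}
	}
}

// concurrent reports whether neither access is ordered before the other.
// Two concurrent accesses to the same address, at least one a write, are a race.
func concurrent(a, b VC) bool {
	return !(happensBefore(a, b) || happensBefore(b, a))
}

func main() {
	write := VC{1, 0}  // goroutine 0 writes x
	reader := VC{0, 1} // goroutine 1 reads x with no synchronization
	fmt.Println("unsynchronized:", concurrent(write, reader))

	// Goroutine 1 acquires the lock goroutine 0 released after its write.
	join(reader, write)
	fmt.Println("after lock handoff:", concurrent(write, reader))
}
```

&lt;p&gt;Shadow memory then amounts to storing, per memory location, the clock of the last write and of recent reads, so each new access can be checked with a comparison like the one above.&lt;/p&gt;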

&lt;p&gt;&lt;a href="https://github.com/scigolib/hdf5" rel="noopener noreferrer"&gt;HDF5&lt;/a&gt; (23 stars) — full read/write support for HDF5 2.0.0 (Format Spec v4.0), including chunked datasets, GZIP/LZF/BZIP2 compression, hyperslab selection (10-250x faster partial reads), and 100% pass rate on the official 378-file test suite. &lt;a href="https://github.com/scigolib/matlab" rel="noopener noreferrer"&gt;MATLAB&lt;/a&gt; (10 stars) — read/write for both legacy v5-v7.2 and modern HDF5-based v7.3+ formats, covering all numeric types, complex numbers, sparse matrices, structures, and cell arrays. Both are pure Go, zero CGO — the only complete implementations in the Go ecosystem. Designing the architecture for these complex binary format specifications required deep human analysis; AI agents then implemented the byte-level parsing and serialization under strict test coverage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/coregx/gxpdf" rel="noopener noreferrer"&gt;GxPDF&lt;/a&gt; — started when a friend asked me to help parse bank statements from PDFs. Every bank had its own table format for transactions and balances. What began as a quick script grew into an enterprise PDF library — because the PDF spec is 1,300 pages and "quick" doesn't exist in PDF parsing. Human identified the subset that matters for production use; AI implemented the extraction pipeline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/grpmsoft/grpm" rel="noopener noreferrer"&gt;GRPM&lt;/a&gt; — next-gen package manager for Gentoo. Here, AI agents first helped study and translate the entire &lt;a href="https://github.com/grpmsoft/grpm/tree/main/docs/pms" rel="noopener noreferrer"&gt;Gentoo PMS (Package Manager Specification)&lt;/a&gt; into structured Markdown — creating a machine-readable knowledge base from dense technical documentation. That knowledge base then became the spec for implementation: each section mapped to code, tracked via a &lt;a href="https://github.com/grpmsoft/grpm/blob/main/docs/PMS_COMPLIANCE.md" rel="noopener noreferrer"&gt;PMS compliance matrix&lt;/a&gt;. This is bidirectional learning in action — agents helped build the context, then used that same context to implement. SAT-based dependency resolution still required human CS fundamentals; but the documentation-to-implementation pipeline was pure agentic workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Every single project followed the same cycle:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Human identifies the gap and designs the architecture&lt;/li&gt;
&lt;li&gt;Vibe exploration of reference implementations&lt;/li&gt;
&lt;li&gt;Knowledge file creation with project-specific constraints&lt;/li&gt;
&lt;li&gt;Agentic implementation with growing context&lt;/li&gt;
&lt;li&gt;Bidirectional learning that compounds across sessions&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is Smart Coding. Not Vibe. Not Agentic. Both, orchestrated by engineering judgment.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architect Mindset: Why Experience Still Matters
&lt;/h2&gt;

&lt;p&gt;Karpathy wrote: &lt;em&gt;"I've never felt this much behind as a programmer."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I'd reframe that: &lt;strong&gt;you've never been more valuable as an architect.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When AI handles 95% of implementation, what remains is the 5% that determines whether the software succeeds or fails:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;System boundaries and module design&lt;/strong&gt; — AI optimizes locally, architects think globally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technology selection with full context&lt;/strong&gt; — AI knows APIs, architects know ecosystems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tradeoff evaluation&lt;/strong&gt; — AI can list options, architects can weigh them against constraints AI doesn't see&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failure mode anticipation&lt;/strong&gt; — AI builds the happy path, architects prevent the disasters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conceptual integrity&lt;/strong&gt; — AI generates code, architects maintain vision
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Traditional:  Junior → Mid → Senior → Lead → Architect
              (Years of progression)

With AI:      Every developer must think like an architect
              (AI handles the junior-to-senior implementation work)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://news.ycombinator.com/item?id=46691243" rel="noopener noreferrer"&gt;Hacker News discussion&lt;/a&gt; on agentic coding confirmed this: the consensus was that &lt;strong&gt;"you get the most value when you know exactly what you want."&lt;/strong&gt; Clear specifications matter more than model capability.&lt;/p&gt;

&lt;p&gt;But knowing what you want requires experience. And experience compounds faster when you maintain a systematic feedback loop with your AI tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Smart Coding Framework
&lt;/h2&gt;

&lt;p&gt;The workflow I use daily, distilled from months of production work:&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1: Assess (5 minutes)
&lt;/h3&gt;

&lt;p&gt;Before touching any tool:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;☐ Am I exploring or building?
☐ What are the system boundaries affected?
☐ What would I sketch on a whiteboard?
☐ Can I write a one-paragraph spec for this task?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you can't write the spec, you're not ready for Agentic Engineering. Start with a Vibe spike.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 2: Prepare Context
&lt;/h3&gt;

&lt;p&gt;Update your knowledge file with anything relevant to this session:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;☐ New constraints discovered since last session?
☐ Known pitfalls for this specific task?
☐ Conventions that must be followed?
☐ Related decisions already made?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This takes 2-3 minutes and saves 30+ minutes of correcting AI mistakes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 3: Execute (Vibe or Agentic)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Vibe mode&lt;/strong&gt; (exploration):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Time-boxed to 30-60 minutes&lt;/li&gt;
&lt;li&gt;Goal: knowledge, not code&lt;/li&gt;
&lt;li&gt;Everything produced is throwaway&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Agentic mode&lt;/strong&gt; (building):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Detailed specs per component&lt;/li&gt;
&lt;li&gt;AI agents implement, you review every diff&lt;/li&gt;
&lt;li&gt;Tests required before proceeding&lt;/li&gt;
&lt;li&gt;Knowledge file updated with lessons learned&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 4: Capture
&lt;/h3&gt;

&lt;p&gt;After every session:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;☐ What did AI get wrong? → Add to knowledge file
☐ What did AI teach me? → Add to personal notes
☐ What pattern emerged? → Document for future sessions
☐ Did my architecture assumptions hold? → Adjust if not
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The 70/30 Rule
&lt;/h3&gt;

&lt;p&gt;A practical heuristic: spend &lt;strong&gt;70% of time&lt;/strong&gt; on architecture, specification, review, and validation. Let AI accelerate the remaining &lt;strong&gt;30%&lt;/strong&gt; — the mechanical implementation.&lt;/p&gt;

&lt;p&gt;This ratio seems counterintuitive. But the 70% investment is what makes the 30% valuable. Without understanding, AI output is just random characters that happen to compile.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Terminology Doesn't Matter. The Practice Does.
&lt;/h2&gt;

&lt;p&gt;Call it Smart Coding. Call it Agentic Engineering. Call it whatever you want. The principles are the same:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;You own the architecture.&lt;/strong&gt; AI owns the implementation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You validate everything.&lt;/strong&gt; AI generates candidates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context compounds.&lt;/strong&gt; Every session builds on the last.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mode awareness matters.&lt;/strong&gt; Know when to explore and when to build.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Experience is your moat.&lt;/strong&gt; AI levels the playing field on syntax. Architecture and judgment are your edge.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Karpathy gave us a great term. The industry needed vocabulary to distinguish disciplined AI-assisted development from reckless prompt-and-pray. "Agentic Engineering" is professional, precise, and descriptive.&lt;/p&gt;

&lt;p&gt;But the term alone isn't enough. Without the bidirectional learning loop — without systematically teaching your AI agent about your project while learning from its capabilities — you're orchestrating an amnesiac. Productive today, starting from zero tomorrow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smart Coding is the practice that makes Agentic Engineering compound.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;In January 2026, I published &lt;a href="https://dev.to/kolkov/smart-coding-vs-vibe-coding-engineering-discipline-in-the-age-of-ai-5b20"&gt;Smart Coding vs Vibe Coding&lt;/a&gt; — exploring the same ideas before Karpathy coined "Agentic Engineering." The principles hold up. The vocabulary evolved. The practice continues.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How are you managing the Vibe-to-Agentic transition? Are you maintaining knowledge files? Have you found the right explore/build ratio for your work? I'd love to compare approaches — share your experience in the comments.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;I'm &lt;strong&gt;Andrey Kolkov&lt;/strong&gt; — a Full Stack developer (Go backend, Angular frontend) maintaining 35+ open source projects. I build in the spaces where Go is traditionally weakest: &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;GPU computing&lt;/a&gt;, &lt;a href="https://github.com/coregx/coregex" rel="noopener noreferrer"&gt;regex engines&lt;/a&gt; (3-3000x faster than stdlib), &lt;a href="https://github.com/born-ml/born" rel="noopener noreferrer"&gt;ML frameworks&lt;/a&gt;, and &lt;a href="https://github.com/scigolib" rel="noopener noreferrer"&gt;scientific computing&lt;/a&gt;. Each project is a daily exercise in Smart Coding — using AI as a force multiplier while maintaining engineering rigor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/kolkov" rel="noopener noreferrer"&gt;@kolkov&lt;/a&gt; | &lt;strong&gt;Projects&lt;/strong&gt;: &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;GoGPU&lt;/a&gt;, &lt;a href="https://github.com/coregx" rel="noopener noreferrer"&gt;coregx&lt;/a&gt;, &lt;a href="https://github.com/born-ml" rel="noopener noreferrer"&gt;born-ml&lt;/a&gt;, &lt;a href="https://github.com/scigolib" rel="noopener noreferrer"&gt;scigolib&lt;/a&gt;&lt;/p&gt;





</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>go</category>
    </item>
    <item>
      <title>GoGPU: Unified 2D/3D Graphics Integration in Pure Go</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Fri, 30 Jan 2026 05:02:04 +0000</pubDate>
      <link>https://forem.com/kolkov/gogpu-unified-2d3d-graphics-integration-in-pure-go-djf</link>
      <guid>https://forem.com/kolkov/gogpu-unified-2d3d-graphics-integration-in-pure-go-djf</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series: Building Go's GPU Ecosystem&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-a-pure-go-graphics-library-for-gpu-programming-2j5d"&gt;GoGPU: A Pure Go Graphics Library&lt;/a&gt; — Project announcement&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-from-idea-to-100k-lines-in-two-weeks-building-gos-gpu-ecosystem-3b2"&gt;From Idea to 100K Lines in Two Weeks&lt;/a&gt; — The journey&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/pure-go-2d-graphics-library-with-gpu-acceleration-introducing-gogpugg-538h"&gt;Pure Go 2D Graphics with GPU Acceleration&lt;/a&gt; — Introducing gg&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gpu-compute-shaders-in-pure-go-gogpugg-v0150-1cjk"&gt;GPU Compute Shaders in Pure Go&lt;/a&gt; — Compute pipelines&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/go-126-meets-2026-with-a-professional-graphics-ecosystem-9g8"&gt;Go 1.26 Meets 2026&lt;/a&gt; — Roadmap&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpugg-enterprise-2d-graphics-library-in-pure-go-1931"&gt;Enterprise 2D Graphics Library&lt;/a&gt; — gg architecture&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/kolkov/gogpu-enterprise-architecture-cross-package-gpu-integration-with-gpucontext-332"&gt;Cross-Package GPU Integration&lt;/a&gt; — gpucontext&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unified 2D/3D Graphics Integration&lt;/strong&gt; ← You are here&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Today we announce a major milestone for the &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;GoGPU ecosystem&lt;/a&gt;: &lt;strong&gt;unified 2D/3D graphics integration&lt;/strong&gt; through standardized interfaces. This release enables seamless rendering of 2D graphics (via &lt;code&gt;gg&lt;/code&gt;) into GPU-accelerated windows (via &lt;code&gt;gogpu&lt;/code&gt;) — all in Pure Go, without CGO.&lt;/p&gt;

&lt;p&gt;This is the foundation for &lt;code&gt;gogpu/ui&lt;/code&gt;, our upcoming enterprise-grade GUI toolkit.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem We Solved
&lt;/h2&gt;

&lt;p&gt;Modern applications need both 2D and 3D graphics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;UI elements&lt;/strong&gt; (text, buttons, icons) require 2D rendering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visualization&lt;/strong&gt; (charts, graphs, CAD) requires GPU acceleration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Games and simulations&lt;/strong&gt; require both&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traditionally, integrating these required:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;CGO bindings to native libraries (Cairo, Skia, Qt)&lt;/li&gt;
&lt;li&gt;Complex texture management between CPU and GPU&lt;/li&gt;
&lt;li&gt;Tight coupling between graphics and windowing code&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We solved this with an &lt;strong&gt;interface-based architecture&lt;/strong&gt;: the 2D package and the GPU package each depend on a small set of shared interfaces instead of on each other.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────┐
│                    User Application                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────────────┐        ┌──────────────────────────┐   │
│   │ gg (2D Graphics)│        │  gogpu (GPU Framework)   │   │
│   │                 │        │                          │   │
│   │  - Canvas API   │        │  - WebGPU abstraction    │   │
│   │  - Text/Fonts   │        │  - Multi-backend         │   │
│   │  - Paths/Shapes │        │  - Window management     │   │
│   └────────┬────────┘        └────────────┬─────────────┘   │
│            │                              │                 │
│            └──────────┬───────────────────┘                 │
│                       │                                     │
│            ┌──────────▼──────────┐                          │
│            │  gpucontext         │                          │
│            │  (Shared Interfaces)│                          │
│            │                     │                          │
│            │  - TextureDrawer    │                          │
│            │  - TextureCreator   │                          │
│            │  - DeviceProvider   │                          │
│            │  - EventSource      │                          │
│            └─────────────────────┘                          │
│                                                             │
└─────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: &lt;strong&gt;shared interfaces enable integration without coupling&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The TextureDrawer Interface
&lt;/h2&gt;

&lt;p&gt;At the heart of our integration is &lt;code&gt;gpucontext.TextureDrawer&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// gpucontext/texture.go&lt;/span&gt;

&lt;span class="c"&gt;// TextureDrawer provides texture drawing capabilities for 2D rendering.&lt;/span&gt;
&lt;span class="c"&gt;// This interface enables packages like ggcanvas to draw textures without&lt;/span&gt;
&lt;span class="c"&gt;// depending directly on gogpu, following the Dependency Inversion Principle.&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;TextureDrawer&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// DrawTexture draws a texture at the specified position.&lt;/span&gt;
    &lt;span class="n"&gt;DrawTexture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tex&lt;/span&gt; &lt;span class="n"&gt;Texture&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="kt"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;

    &lt;span class="c"&gt;// TextureCreator returns the texture creator associated with this drawer.&lt;/span&gt;
    &lt;span class="n"&gt;TextureCreator&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;TextureCreator&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// TextureCreator provides texture creation from raw pixel data.&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;TextureCreator&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// NewTextureFromRGBA creates a texture from RGBA pixel data.&lt;/span&gt;
    &lt;span class="n"&gt;NewTextureFromRGBA&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Texture&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This interface follows the &lt;strong&gt;Dependency Inversion Principle&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;gg&lt;/code&gt; depends on abstractions (&lt;code&gt;gpucontext.TextureDrawer&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gogpu&lt;/code&gt; implements abstractions (&lt;code&gt;Context.AsTextureDrawer()&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Neither depends on the other&lt;/li&gt;
&lt;/ul&gt;
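
&lt;p&gt;One practical consequence of this inversion is that consumer code can be exercised without a GPU. The sketch below is a minimal illustration, not code from the repository: the two interfaces are copied locally so the file compiles on its own, &lt;code&gt;Texture&lt;/code&gt; is assumed to be an empty interface (its real definition is not shown here), and &lt;code&gt;fakeDrawer&lt;/code&gt;, &lt;code&gt;fakeTexture&lt;/code&gt; and &lt;code&gt;present&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```go
package main

import "fmt"

// Texture stands in for gpucontext.Texture. The real type is not shown
// in this article, so an empty interface is assumed here.
type Texture any

// TextureCreator and TextureDrawer mirror the interfaces listed above.
type TextureCreator interface {
	NewTextureFromRGBA(width, height int, data []byte) (Texture, error)
}

type TextureDrawer interface {
	DrawTexture(tex Texture, x, y float32) error
	TextureCreator() TextureCreator
}

// fakeTexture and fakeDrawer are hypothetical test doubles with no GPU behind them.
type fakeTexture struct{ width, height int }

type fakeDrawer struct{ draws int }

// DrawTexture only counts calls; nothing is rendered.
func (d *fakeDrawer) DrawTexture(tex Texture, x, y float32) error {
	d.draws++
	return nil
}

// TextureCreator returns the drawer itself, which also creates textures.
func (d *fakeDrawer) TextureCreator() TextureCreator { return d }

// NewTextureFromRGBA validates the buffer size (4 bytes per pixel assumed).
func (d *fakeDrawer) NewTextureFromRGBA(width, height int, data []byte) (Texture, error) {
	if len(data) != width*height*4 {
		return nil, fmt.Errorf("expected %d bytes, got %d", width*height*4, len(data))
	}
	return fakeTexture{width, height}, nil
}

// present is consumer code: it knows only the interfaces, never a concrete GPU type.
func present(d TextureDrawer, width, height int, pix []byte) error {
	tex, err := d.TextureCreator().NewTextureFromRGBA(width, height, pix)
	if err != nil {
		return err
	}
	return d.DrawTexture(tex, 0, 0)
}

func main() {
	d := new(fakeDrawer)
	err := present(d, 2, 2, make([]byte, 2*2*4))
	fmt.Println("draw calls:", d.draws, "error:", err)
}
```

&lt;p&gt;Swapping &lt;code&gt;fakeDrawer&lt;/code&gt; for a real drawer would require no change to &lt;code&gt;present&lt;/code&gt;, which is the point of depending on the abstraction.&lt;/p&gt;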

&lt;h2&gt;
  
  
  Working Example
&lt;/h2&gt;

&lt;p&gt;Here's a simplified example based on our codebase (see full version at &lt;a href="https://github.com/gogpu/gogpu/blob/main/examples/gg_integration/main.go" rel="noopener noreferrer"&gt;&lt;code&gt;examples/gg_integration/main.go&lt;/code&gt;&lt;/a&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/gogpu/gg/integration/ggcanvas"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gogpu"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gogpu/gmath"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Create GPU-accelerated window&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewApp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultConfig&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithTitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"GoGPU + gg Integration via ggcanvas"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;canvas&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ggcanvas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Canvas&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnDraw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Width&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Height&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClearColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gmath&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0x1a1a2e&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

        &lt;span class="c"&gt;// Lazy initialization with GPU context&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;canvas&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
            &lt;span class="n"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ggcanvas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GPUContextProvider&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to create canvas: %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c"&gt;// Draw 2D graphics using familiar gg API&lt;/span&gt;
        &lt;span class="n"&gt;cc&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;canvas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;cc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRGB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawCircle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c"&gt;// Render to GPU window — one line!&lt;/span&gt;
        &lt;span class="n"&gt;canvas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RenderTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AsTextureDrawer&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c"&gt;// Handle window resize&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnResize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;canvas&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;canvas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Resize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key points:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;ggcanvas.New(provider, w, h)&lt;/code&gt;&lt;/strong&gt; — Creates a canvas with GPU context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;canvas.Context()&lt;/code&gt;&lt;/strong&gt; — Returns standard &lt;code&gt;*gg.Context&lt;/code&gt; for 2D drawing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;canvas.RenderTo(dc.AsTextureDrawer())&lt;/code&gt;&lt;/strong&gt; — Uploads to GPU and draws&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The full example includes animated HSV-colored circles, debug PNG export, backend logging, and comprehensive error handling.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The &lt;code&gt;ggcanvas&lt;/code&gt; package handles all complexity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU→GPU texture upload&lt;/li&gt;
&lt;li&gt;Dirty tracking (only upload when changed)&lt;/li&gt;
&lt;li&gt;Format conversion (RGBA→GPU texture)&lt;/li&gt;
&lt;li&gt;Resource cleanup&lt;/li&gt;
&lt;/ul&gt;
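
&lt;p&gt;Dirty tracking is the item on this list with the largest effect on frame cost, so here is a minimal sketch of the idea. It is an illustration under stated assumptions, not the &lt;code&gt;ggcanvas&lt;/code&gt; implementation: the &lt;code&gt;canvas&lt;/code&gt; type, its fields and its methods are hypothetical, and the upload is simulated with a counter.&lt;/p&gt;

```go
package main

import "fmt"

// canvas is a hypothetical CPU-side canvas; it is not the ggcanvas type.
type canvas struct {
	pix     []byte // RGBA pixels produced by CPU rasterization
	dirty   bool   // true when pix changed since the last upload
	uploads int    // number of simulated CPU-to-GPU uploads
}

func newCanvas(width, height int) *canvas {
	c := new(canvas)
	c.pix = make([]byte, width*height*4)
	c.dirty = true // a new canvas has never been uploaded
	return c
}

// fill overwrites every pixel and marks the canvas as changed.
func (c *canvas) fill(r, g, b, a byte) {
	color := [4]byte{r, g, b, a}
	for i := range c.pix {
		c.pix[i] = color[i%4]
	}
	c.dirty = true
}

// render uploads only when the pixels changed; the draw happens every frame.
func (c *canvas) render() {
	if c.dirty {
		c.uploads++ // placeholder for the real texture upload
		c.dirty = false
	}
	// placeholder for drawing the textured quad
}

func main() {
	c := newCanvas(4, 4)
	c.render() // first frame: upload
	c.render() // unchanged: upload skipped
	c.fill(255, 128, 0, 255)
	c.render() // changed: upload again
	fmt.Println("uploads:", c.uploads) // prints "uploads: 2"
}
```

&lt;p&gt;Three frames are rendered but only two uploads occur, because the second frame reuses the texture already on the GPU.&lt;/p&gt;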

&lt;h2&gt;
  
  
  Under the Hood
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Data Flow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User draws via gg API
         │
         ▼
gg.Context accumulates draw commands
         │
         ▼
canvas.RenderTo(dc.AsTextureDrawer()) called
         │
         ├─── cc.RenderToPixmap(pixmap)      [CPU rasterization]
         │
         ├─── texture.UpdateData(pixmap.Pix) [CPU→GPU upload]
         │
         └─── dc.DrawTexture(texture, 0, 0)  [GPU render]
         │
         ▼
Window surface (Vulkan/Metal/DX12)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  gogpu Implementation
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;gogpu.Context&lt;/code&gt; implements the interface via an adapter:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// context_texture.go&lt;/span&gt;

&lt;span class="c"&gt;// AsTextureDrawer returns an adapter implementing gpucontext.TextureDrawer.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;AsTextureDrawer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextureDrawer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;contextTextureDrawer&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;     &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;creator&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;rendererTextureCreator&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This follows the &lt;strong&gt;Adapter Pattern&lt;/strong&gt; — exposing existing functionality through a new interface without modifying the original type.&lt;/p&gt;
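
&lt;p&gt;For readers less familiar with the pattern, here is a self-contained sketch of an adapter in Go, including the compile-time assertion commonly used to confirm that the adapter satisfies its interface. All names (&lt;code&gt;Drawer&lt;/code&gt;, &lt;code&gt;engine&lt;/code&gt;, &lt;code&gt;engineDrawer&lt;/code&gt;, &lt;code&gt;asDrawer&lt;/code&gt;) are hypothetical and chosen for illustration; this is not how gogpu structures its internals.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// Drawer is a hypothetical narrow interface that a consumer package accepts.
type Drawer interface {
	Draw(name string) error
}

// engine is a hypothetical existing type whose API does not match Drawer.
type engine struct {
	log []string
}

func (e *engine) submit(kind, name string) {
	e.log = append(e.log, kind+":"+name)
}

// engineDrawer adapts *engine to Drawer by translating Draw into submit.
type engineDrawer struct {
	e *engine
}

func (a engineDrawer) Draw(name string) error {
	if name == "" {
		return errors.New("empty texture name")
	}
	a.e.submit("texture", name)
	return nil
}

// Compile-time check that the adapter satisfies the interface.
var _ Drawer = engineDrawer{}

// asDrawer plays the role that AsTextureDrawer plays above.
func (e *engine) asDrawer() Drawer { return engineDrawer{e: e} }

func main() {
	e := new(engine)
	if err := e.asDrawer().Draw("canvas"); err != nil {
		fmt.Println("draw failed:", err)
	}
	fmt.Println(e.log)
}
```

&lt;p&gt;The blank-identifier assertion costs nothing at run time and turns a missing method into a build error rather than a surprise at the call site.&lt;/p&gt;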

&lt;h2&gt;
  
  
  Platform Support
&lt;/h2&gt;

&lt;p&gt;This integration works across all gogpu-supported platforms:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Windows&lt;/td&gt;
&lt;td&gt;Vulkan, DX12&lt;/td&gt;
&lt;td&gt;Production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux (X11)&lt;/td&gt;
&lt;td&gt;Vulkan&lt;/td&gt;
&lt;td&gt;Production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux (Wayland)&lt;/td&gt;
&lt;td&gt;Vulkan&lt;/td&gt;
&lt;td&gt;Production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;Metal&lt;/td&gt;
&lt;td&gt;Production&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All platforms use &lt;strong&gt;Pure Go FFI&lt;/strong&gt; — no CGO required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance Characteristics
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;canvas.Context()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;Returns existing context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2D drawing&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;Rasterization in gg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;RenderTo()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;O(pixels)&lt;/td&gt;
&lt;td&gt;CPU→GPU texture upload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU draw&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;Single textured quad&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For static or infrequently changing UI, the CPU→GPU upload happens only when content changes (dirty tracking).&lt;/p&gt;
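
&lt;p&gt;To put the O(pixels) upload cost in concrete terms, the arithmetic below estimates the data volume if every frame were re-uploaded. It assumes 4 bytes per pixel (8-bit RGBA) and a 60 fps refresh rate; both are assumptions for illustration, as are the helper names.&lt;/p&gt;

```go
package main

import "fmt"

// uploadBytes returns the size of one full-frame upload at 4 bytes per pixel.
func uploadBytes(width, height int) int {
	return width * height * 4
}

// bytesPerSecond returns the upload volume if every frame is re-uploaded.
func bytesPerSecond(width, height, fps int) int {
	return uploadBytes(width, height) * fps
}

func main() {
	fmt.Println(uploadBytes(800, 600))         // 1920000
	fmt.Println(bytesPerSecond(800, 600, 60))  // 115200000
	fmt.Println(bytesPerSecond(1280, 720, 60)) // 221184000
}
```

&lt;p&gt;An 800x600 canvas is about 1.9 MB per upload, or roughly 115 MB per second at 60 fps, which is why skipping uploads for unchanged content matters.&lt;/p&gt;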

&lt;h2&gt;
  
  
  Roadmap: gogpu/ui
&lt;/h2&gt;

&lt;p&gt;This integration is the foundation for &lt;code&gt;gogpu/ui&lt;/code&gt;, our upcoming GUI toolkit:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Future gogpu/ui API (planned)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewApp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultConfig&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithTitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gogpu/ui Demo"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1280&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;720&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c"&gt;// Declarative UI with reactive state&lt;/span&gt;
    &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;root&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VStack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;widgets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Counter Demo"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FontSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;24&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HStack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;widgets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Button&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"-"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnClick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}),&lt;/span&gt;
            &lt;span class="n"&gt;widgets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Computed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Count: %d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="p"&gt;})),&lt;/span&gt;
            &lt;span class="n"&gt;widgets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Button&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"+"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnClick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Gap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Padding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Gap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;renderer&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ui&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRenderer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GPUContextProvider&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRoot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnDraw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClearColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;theme&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RenderTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AsTextureDrawer&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c"&gt;// Event forwarding&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnMouse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleMouse&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnKeyboard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleKeyboard&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnTouch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleTouch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Planned Features
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;Features&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Phase 1&lt;/td&gt;
&lt;td&gt;v0.1.0&lt;/td&gt;
&lt;td&gt;Core widgets, Flexbox layout, Events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 2&lt;/td&gt;
&lt;td&gt;v0.2.0&lt;/td&gt;
&lt;td&gt;Material 3 theme, Animation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 3&lt;/td&gt;
&lt;td&gt;v0.3.0&lt;/td&gt;
&lt;td&gt;Virtualization (100K+ items), A11y&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 4&lt;/td&gt;
&lt;td&gt;v1.0.0&lt;/td&gt;
&lt;td&gt;IDE docking, Multiple themes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Design Decisions
&lt;/h3&gt;

&lt;p&gt;Based on our &lt;a href="https://github.com/orgs/gogpu/discussions/18" rel="noopener noreferrer"&gt;research into 7 Rust UI frameworks&lt;/a&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Decision&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;th&gt;Rationale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Reactivity&lt;/td&gt;
&lt;td&gt;Fine-grained signals&lt;/td&gt;
&lt;td&gt;O(affected) updates only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Styling&lt;/td&gt;
&lt;td&gt;Tailwind-style builders&lt;/td&gt;
&lt;td&gt;Type-safe, IDE autocomplete&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layout&lt;/td&gt;
&lt;td&gt;Flexbox + incremental&lt;/td&gt;
&lt;td&gt;Industry standard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accessibility&lt;/td&gt;
&lt;td&gt;AccessKit schema&lt;/td&gt;
&lt;td&gt;Cross-platform standard&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
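
&lt;p&gt;To make the "fine-grained signals" row concrete, here is a minimal sketch of the idea. It is not the planned gogpu/ui API: the &lt;code&gt;Signal&lt;/code&gt; type and its &lt;code&gt;Get&lt;/code&gt;, &lt;code&gt;Set&lt;/code&gt;, and &lt;code&gt;Subscribe&lt;/code&gt; methods are hypothetical names chosen for illustration. Each signal keeps its own subscriber list, so a change notifies only the callbacks that depend on it, which is what "O(affected)" refers to.&lt;/p&gt;

```go
package main

import "fmt"

// Signal is a hypothetical reactive value: it holds a value of type T
// and the callbacks that depend on it.
type Signal[T comparable] struct {
	value T
	subs  []func(T)
}

// NewSignal returns a signal holding the initial value.
func NewSignal[T comparable](initial T) *Signal[T] {
	s := new(Signal[T])
	s.value = initial
	return s
}

// Get returns the current value.
func (s *Signal[T]) Get() T { return s.value }

// Subscribe registers a callback that runs whenever the value changes.
func (s *Signal[T]) Subscribe(fn func(T)) { s.subs = append(s.subs, fn) }

// Set stores a new value and notifies subscribers of this signal only.
// Setting an equal value is a no-op, so nothing is re-rendered.
func (s *Signal[T]) Set(v T) {
	if s.value == v {
		return
	}
	s.value = v
	for _, fn := range s.subs {
		fn(v)
	}
}

func main() {
	count := NewSignal(0)
	title := NewSignal("untitled")

	labelUpdates, titleUpdates := 0, 0
	count.Subscribe(func(int) { labelUpdates++ })
	title.Subscribe(func(string) { titleUpdates++ })

	count.Set(1)
	count.Set(1) // unchanged, no notification
	count.Set(2)

	fmt.Println(labelUpdates, titleUpdates) // prints "2 0"
}
```

&lt;p&gt;The widget bound to &lt;code&gt;title&lt;/code&gt; is never touched while &lt;code&gt;count&lt;/code&gt; changes, which is the property a signal-based toolkit relies on to avoid re-rendering the whole tree.&lt;/p&gt;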

&lt;h2&gt;
  
  
  Today's Release
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;Highlights&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;gpucontext&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;v0.4.0&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;TextureDrawer&lt;/code&gt;, &lt;code&gt;TouchEventSource&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;wgpu&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;v0.12.0&lt;/td&gt;
&lt;td&gt;BufferRowLength fix&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;naga&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;v0.9.0&lt;/td&gt;
&lt;td&gt;Shader compiler improvements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;gg&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;v0.22.1&lt;/td&gt;
&lt;td&gt;Integration + LineJoinRound fix&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;gogpu&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;v0.14.0&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;AsTextureDrawer()&lt;/code&gt; implementation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/gogpu/gogpu@v0.14.0
go get github.com/gogpu/gg@v0.22.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/gogpu/gogpu
&lt;span class="nb"&gt;cd &lt;/span&gt;gogpu
go run examples/gg_integration/main.go
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Organization&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;github.com/gogpu&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RFC Discussion&lt;/td&gt;
&lt;td&gt;&lt;a href="https://github.com/orgs/gogpu/discussions/18" rel="noopener noreferrer"&gt;gogpu/ui RFC&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gpucontext Article&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/kolkov/gogpu-enterprise-architecture-cross-package-gpu-integration-with-gpucontext-332"&gt;Cross-Package GPU Integration&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gg Architecture&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/kolkov/gogpugg-enterprise-2d-graphics-library-in-pure-go-1931"&gt;Enterprise 2D Graphics Library&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Project History&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/kolkov/gogpu-from-idea-to-100k-lines-in-two-weeks-building-gos-gpu-ecosystem-3b2"&gt;From Idea to 100K Lines&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  We Need Your Help
&lt;/h2&gt;

&lt;p&gt;Building an enterprise-grade graphics ecosystem is a massive undertaking. We're a small team, and we need the community's help:&lt;/p&gt;

&lt;h3&gt;
  
  
  Testing &amp;amp; Bug Reports
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Platform testing&lt;/strong&gt; — macOS, Linux (X11/Wayland), different GPUs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edge cases&lt;/strong&gt; — unusual window sizes, high DPI, multi-monitor setups&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance issues&lt;/strong&gt; — stuttering, memory leaks, high CPU usage&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Validation &amp;amp; Feedback
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API review&lt;/strong&gt; — Does the API feel Go-idiomatic? What's confusing?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture validation&lt;/strong&gt; — Are we making the right design decisions?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-world usage&lt;/strong&gt; — Try it in your projects and report pain points&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What We're Especially Looking For
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;What We Need&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rendering bugs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Incorrect colors, missing pixels, artifacts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Performance bottlenecks&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Profile and identify slow paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API ergonomics&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Confusing names, missing methods, rough edges&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Platform issues&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Windows/macOS/Linux-specific problems&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Integration feedback&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How well does it fit your use case?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  How to Contribute
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Star the repos&lt;/strong&gt; — Helps visibility&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File issues&lt;/strong&gt; — Even small bugs matter&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Join discussions&lt;/strong&gt; — &lt;a href="https://github.com/orgs/gogpu/discussions" rel="noopener noreferrer"&gt;github.com/orgs/gogpu/discussions&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Try the examples&lt;/strong&gt; — Report what breaks&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every bug report, performance profile, and piece of feedback helps us build the graphics library Go deserves.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With unified 2D/3D integration, the GoGPU ecosystem is ready for the next step: a production-grade GUI toolkit for Go.&lt;/p&gt;

&lt;p&gt;Our approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pure Go&lt;/strong&gt; — No CGO, easy cross-compilation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interface-based&lt;/strong&gt; — Clean architecture, testable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise patterns&lt;/strong&gt; — Proven in large-scale systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We're building the GUI toolkit Go deserves. Join the discussion at &lt;a href="https://github.com/orgs/gogpu/discussions/18" rel="noopener noreferrer"&gt;github.com/orgs/gogpu/discussions/18&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The GoGPU Team&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Follow the project:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;github.com/gogpu&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Twitter: &lt;a href="https://twitter.com/gogpu_go" rel="noopener noreferrer"&gt;@gogpu_go&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>go</category>
      <category>graphics</category>
      <category>gpu</category>
      <category>webgpu</category>
    </item>
    <item>
      <title>GoGPU Enterprise Architecture: Cross-Package GPU Integration with gpucontext</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Tue, 27 Jan 2026 16:36:53 +0000</pubDate>
      <link>https://forem.com/kolkov/gogpu-enterprise-architecture-cross-package-gpu-integration-with-gpucontext-332</link>
      <guid>https://forem.com/kolkov/gogpu-enterprise-architecture-cross-package-gpu-integration-with-gpucontext-332</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Release (January 27, 2026):&lt;/strong&gt; gogpu &lt;strong&gt;v0.12.0&lt;/strong&gt; + gg &lt;strong&gt;v0.21.0&lt;/strong&gt; — Enterprise architecture with &lt;code&gt;gpucontext&lt;/code&gt; integration. Shared GPU interfaces enable &lt;strong&gt;database/sql-like&lt;/strong&gt; dependency injection across the ecosystem.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem: Circular Dependencies
&lt;/h2&gt;

&lt;p&gt;As the GoGPU ecosystem grew to &lt;strong&gt;300K lines of Pure Go&lt;/strong&gt;, we hit a classic enterprise problem:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gogpu/gogpu (windowing, GPU init)
      ↓ depends on
gogpu/gg (2D graphics)
      ↓ depends on
gogpu/wgpu (WebGPU implementation)
      ↓ depends on
gogpu/naga (shader compiler)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



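&lt;p&gt;&lt;strong&gt;The challenge:&lt;/strong&gt; &lt;code&gt;gogpu&lt;/code&gt; already depends on &lt;code&gt;gg&lt;/code&gt;, so &lt;code&gt;gg&lt;/code&gt; cannot import &lt;code&gt;gogpu&lt;/code&gt; back without creating an import cycle, which Go forbids. How, then, can &lt;code&gt;gg&lt;/code&gt; receive a GPU device from &lt;code&gt;gogpu&lt;/code&gt;? And how will &lt;code&gt;gogpu/ui&lt;/code&gt; receive both GPU context and input events?&lt;/p&gt;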

&lt;p&gt;The answer: &lt;strong&gt;Shared interfaces in a zero-dependency package.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Introducing gpucontext
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/gogpu/gpucontext" rel="noopener noreferrer"&gt;gogpu/gpucontext&lt;/a&gt;&lt;/strong&gt; is a new package with &lt;strong&gt;zero dependencies&lt;/strong&gt; that defines shared GPU infrastructure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// gpucontext v0.2.0 — Zero dependencies!&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/gogpu/gpucontext"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Core Interfaces
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Interface&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Implemented By&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DeviceProvider&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;GPU device + queue access&lt;/td&gt;
&lt;td&gt;gogpu.App&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;EventSource&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Input events for UI&lt;/td&gt;
&lt;td&gt;gogpu.App&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;IMEController&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;IME positioning for CJK input&lt;/td&gt;
&lt;td&gt;gogpu.App (future)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This follows the &lt;strong&gt;wgpu-types pattern&lt;/strong&gt; from Rust — separating type definitions from implementation.&lt;/p&gt;




&lt;h2&gt;
  
  
  DeviceProvider: The database/sql Pattern
&lt;/h2&gt;

&lt;p&gt;Just like Go's &lt;code&gt;database/sql&lt;/code&gt; lets you swap MySQL for Postgres without changing your code, &lt;code&gt;gpucontext.DeviceProvider&lt;/code&gt; lets libraries receive GPU resources without knowing the source:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// gpucontext/device_provider.go&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;DeviceProvider&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Device&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;Device&lt;/span&gt;           &lt;span class="c"&gt;// Create GPU resources&lt;/span&gt;
    &lt;span class="n"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;Queue&lt;/span&gt;             &lt;span class="c"&gt;// Submit commands&lt;/span&gt;
    &lt;span class="n"&gt;SurfaceFormat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;TextureFormat&lt;/span&gt;  &lt;span class="c"&gt;// Match surface format&lt;/span&gt;
    &lt;span class="n"&gt;Adapter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;Adapter&lt;/span&gt;         &lt;span class="c"&gt;// GPU capabilities (optional)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  gogpu Implements DeviceProvider
&lt;/h3&gt;

&lt;p&gt;In &lt;strong&gt;gogpu v0.12.0&lt;/strong&gt;, the &lt;code&gt;App&lt;/code&gt; now provides GPU context to external libraries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gogpu"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gpucontext"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewApp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultConfig&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithTitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gpucontext Demo"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnDraw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// Get DeviceProvider for external libraries&lt;/span&gt;
        &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GPUContextProvider&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c"&gt;// All non-nil — GPU is ready!&lt;/span&gt;
        &lt;span class="n"&gt;device&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Device&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;   &lt;span class="c"&gt;// gpucontext.Device&lt;/span&gt;
        &lt;span class="n"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;     &lt;span class="c"&gt;// gpucontext.Queue&lt;/span&gt;
        &lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Adapter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// gpucontext.Adapter&lt;/span&gt;
        &lt;span class="n"&gt;format&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SurfaceFormat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// gpucontext.TextureFormat&lt;/span&gt;

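        &lt;span class="c"&gt;// Compile-time check that the provider satisfies the shared interface&lt;/span&gt;
        &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DeviceProvider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;

        &lt;span class="c"&gt;// Pass to gg, ui, or any library that accepts DeviceProvider.&lt;/span&gt;
        &lt;span class="c"&gt;// The blank assignment keeps this demo compiling while the values are unused.&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;format&lt;/span&gt;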
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; The library receiving &lt;code&gt;DeviceProvider&lt;/code&gt; doesn't need to know it came from &lt;code&gt;gogpu&lt;/code&gt;. It could come from &lt;a href="https://github.com/born-ml/born" rel="noopener noreferrer"&gt;born-ml/born&lt;/a&gt; for ML compute, or a future WebAssembly host.&lt;/p&gt;
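
&lt;p&gt;A minimal sketch of that insight follows, using local stand-in types so it runs without the real packages. The two-method &lt;code&gt;DeviceProvider&lt;/code&gt; here is a simplification of the interface above, and &lt;code&gt;windowHost&lt;/code&gt;, &lt;code&gt;computeHost&lt;/code&gt;, and &lt;code&gt;Renderer&lt;/code&gt; are hypothetical names, not part of gogpu or gg.&lt;/p&gt;

```go
package main

import "fmt"

// DeviceProvider is a simplified stand-in for gpucontext.DeviceProvider.
// The real interface returns Device, Queue, and Adapter values; strings
// are used here only to keep the sketch self-contained.
type DeviceProvider interface {
	Device() string
	SurfaceFormat() string
}

// windowHost plays the role of a windowing host such as gogpu.App.
type windowHost struct{}

func (windowHost) Device() string        { return "window-device" }
func (windowHost) SurfaceFormat() string { return "bgra8unorm" }

// computeHost plays the role of a headless host such as an ML runtime.
type computeHost struct{}

func (computeHost) Device() string        { return "compute-device" }
func (computeHost) SurfaceFormat() string { return "rgba8unorm" }

// Renderer is a hypothetical library type. It depends only on the
// interface, never on a concrete host package.
type Renderer struct {
	provider DeviceProvider
}

// NewRenderer rejects a nil provider, as a consumer library might.
func NewRenderer(p DeviceProvider) (Renderer, error) {
	if p == nil {
		return Renderer{}, fmt.Errorf("renderer: nil device provider")
	}
	return Renderer{provider: p}, nil
}

// Describe reports which device and format the renderer was given.
func (r Renderer) Describe() string {
	return r.provider.Device() + " / " + r.provider.SurfaceFormat()
}

func main() {
	for _, host := range []DeviceProvider{windowHost{}, computeHost{}} {
		r, err := NewRenderer(host)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(r.Describe())
	}
}
```

&lt;p&gt;Because &lt;code&gt;Renderer&lt;/code&gt; imports neither host, either host can be swapped in without touching the library, which is the same property &lt;code&gt;database/sql&lt;/code&gt; gives driver users.&lt;/p&gt;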




&lt;h2&gt;
  
  
  EventSource: Input Events for UI
&lt;/h2&gt;

&lt;p&gt;Building a GUI toolkit requires more than GPU access — you need input events. The &lt;code&gt;EventSource&lt;/code&gt; interface provides platform-independent input delivery:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// gpucontext/events.go&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;EventSource&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Keyboard&lt;/span&gt;
    &lt;span class="n"&gt;OnKeyPress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mods&lt;/span&gt; &lt;span class="n"&gt;Modifiers&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;OnKeyRelease&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mods&lt;/span&gt; &lt;span class="n"&gt;Modifiers&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;OnTextInput&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c"&gt;// Mouse&lt;/span&gt;
    &lt;span class="n"&gt;OnMouseMove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;OnMousePress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;button&lt;/span&gt; &lt;span class="n"&gt;MouseButton&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;OnMouseRelease&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;button&lt;/span&gt; &lt;span class="n"&gt;MouseButton&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;OnScroll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dy&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c"&gt;// Window&lt;/span&gt;
    &lt;span class="n"&gt;OnResize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;OnFocus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;focused&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c"&gt;// IME (Chinese/Japanese/Korean input)&lt;/span&gt;
    &lt;span class="n"&gt;OnIMECompositionStart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fn&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;OnIMECompositionUpdate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fn&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="n"&gt;IMEState&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;OnIMECompositionEnd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fn&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;committed&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Full IME Support for CJK Input
&lt;/h3&gt;

&lt;p&gt;Enterprise applications must support international users. The &lt;code&gt;IMEState&lt;/code&gt; struct provides everything needed for inline composition rendering:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;IMEState&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Composing&lt;/span&gt;       &lt;span class="kt"&gt;bool&lt;/span&gt;   &lt;span class="c"&gt;// Currently composing?&lt;/span&gt;
    &lt;span class="n"&gt;CompositionText&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="c"&gt;// e.g., "nihon" → "日本"&lt;/span&gt;
    &lt;span class="n"&gt;CursorPos&lt;/span&gt;       &lt;span class="kt"&gt;int&lt;/span&gt;    &lt;span class="c"&gt;// Cursor within composition&lt;/span&gt;
    &lt;span class="n"&gt;SelectionStart&lt;/span&gt;  &lt;span class="kt"&gt;int&lt;/span&gt;    &lt;span class="c"&gt;// Selection range&lt;/span&gt;
    &lt;span class="n"&gt;SelectionEnd&lt;/span&gt;    &lt;span class="kt"&gt;int&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
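
&lt;p&gt;As a rough illustration of inline composition rendering, the sketch below turns an &lt;code&gt;IMEState&lt;/code&gt; into a preview string with a cursor marker. The helper &lt;code&gt;previewWithCursor&lt;/code&gt; is hypothetical, and the sketch assumes &lt;code&gt;CursorPos&lt;/code&gt; counts runes rather than bytes; the source does not specify the unit.&lt;/p&gt;

```go
package main

import "fmt"

// IMEState mirrors the struct shown above.
type IMEState struct {
	Composing       bool
	CompositionText string
	CursorPos       int
	SelectionStart  int
	SelectionEnd    int
}

// previewWithCursor returns the composition text with a "|" marker at
// the cursor. ASSUMPTION: CursorPos counts runes, not bytes.
func previewWithCursor(s IMEState) string {
	if !s.Composing {
		return ""
	}
	runes := []rune(s.CompositionText)

	// Clamp the cursor so a stale position cannot index out of range.
	pos := s.CursorPos
	if 0 > pos {
		pos = 0
	}
	if pos > len(runes) {
		pos = len(runes)
	}
	return string(runes[:pos]) + "|" + string(runes[pos:])
}

func main() {
	state := IMEState{Composing: true, CompositionText: "日本", CursorPos: 1}
	fmt.Println(previewWithCursor(state)) // prints "日|本"
}
```

&lt;p&gt;Converting to &lt;code&gt;[]rune&lt;/code&gt; before slicing matters for CJK text: slicing the string by byte offset could split a multi-byte character in half.&lt;/p&gt;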



&lt;h3&gt;
  
  
  Using EventSource
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewApp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultConfig&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c"&gt;// Get event source&lt;/span&gt;
&lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Register callbacks&lt;/span&gt;
&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnKeyPress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mods&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Modifiers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;KeyEscape&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Quit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnMousePress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;btn&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MouseButton&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Click at (%.0f, %.0f)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnIMECompositionUpdate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IMEState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Render composition text inline&lt;/span&gt;
    &lt;span class="n"&gt;renderIMEPreview&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CompositionText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CursorPos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
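
&lt;p&gt;One benefit of consuming events through an interface is that UI code can be exercised without opening a window. The sketch below is a hypothetical in-memory event source, not part of gpucontext: &lt;code&gt;fakeEvents&lt;/code&gt;, &lt;code&gt;PressKey&lt;/code&gt;, and &lt;code&gt;Click&lt;/code&gt; are invented for illustration, the &lt;code&gt;Key&lt;/code&gt;, &lt;code&gt;Modifiers&lt;/code&gt;, and &lt;code&gt;MouseButton&lt;/code&gt; types are local stand-ins, and only two of the callbacks are implemented.&lt;/p&gt;

```go
package main

import "fmt"

// Key, Modifiers, and MouseButton are simplified stand-ins for the
// gpucontext types of the same names.
type Key int
type Modifiers int
type MouseButton int

// KeyEscape uses a hypothetical value, for illustration only.
const KeyEscape Key = 27

// fakeEvents is a hypothetical in-memory event source. It stores the
// registered callbacks and lets a test fire them by hand.
type fakeEvents struct {
	keyPress   []func(Key, Modifiers)
	mousePress []func(MouseButton, float64, float64)
}

// OnKeyPress registers a key-press callback.
func (f *fakeEvents) OnKeyPress(fn func(Key, Modifiers)) {
	f.keyPress = append(f.keyPress, fn)
}

// OnMousePress registers a mouse-press callback.
func (f *fakeEvents) OnMousePress(fn func(MouseButton, float64, float64)) {
	f.mousePress = append(f.mousePress, fn)
}

// PressKey simulates a key press with no modifiers held.
func (f *fakeEvents) PressKey(k Key) {
	for _, fn := range f.keyPress {
		fn(k, 0)
	}
}

// Click simulates a press of button 0 at the given position.
func (f *fakeEvents) Click(x, y float64) {
	for _, fn := range f.mousePress {
		fn(0, x, y)
	}
}

func main() {
	events := new(fakeEvents)

	quit := false
	events.OnKeyPress(func(k Key, _ Modifiers) {
		if k == KeyEscape {
			quit = true
		}
	})
	events.OnMousePress(func(_ MouseButton, x, y float64) {
		fmt.Printf("Click at (%.0f, %.0f)\n", x, y)
	})

	events.Click(120, 80)
	events.PressKey(KeyEscape)
	fmt.Println("quit requested:", quit)
}
```

&lt;p&gt;This fake keeps every registered callback in a slice. Whether the real &lt;code&gt;EventSource&lt;/code&gt; supports multiple callbacks per event or replaces the previous one is not stated here, so treat that as a choice of the sketch.&lt;/p&gt;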






&lt;h2&gt;
  
  
  gg Enterprise Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;gg v0.21.0&lt;/strong&gt; introduces two new packages that leverage &lt;code&gt;gpucontext&lt;/code&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  core/ — CPU Rendering Primitives
&lt;/h3&gt;

&lt;p&gt;This package is independent of the GPU and contains pure algorithms:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gg/core/
├── fixed.go          # Fixed-point math (FDot6, FDot16)
├── edge.go           # Line/curve edges
├── edge_builder.go   # Path → edges conversion
├── analytic_filler.go # Anti-aliased rendering
└── alpha_runs.go     # RLE coverage storage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key principle:&lt;/strong&gt; CPU rendering code is separate from GPU code, following Skia/Vello architecture patterns.&lt;/p&gt;
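
&lt;p&gt;For readers unfamiliar with the fixed-point types named in &lt;code&gt;fixed.go&lt;/code&gt;, the sketch below shows what a 26.6 format looks like. It assumes &lt;code&gt;FDot6&lt;/code&gt; follows Skia's &lt;code&gt;SkFDot6&lt;/code&gt; convention (6 fractional bits in a 32-bit integer); the actual definitions and method names in &lt;code&gt;gg/core&lt;/code&gt; may differ, and everything below is illustrative.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"math"
)

// FDot6 is a 26.6 fixed-point number: the low 6 bits hold the fraction,
// so one unit equals 1/64 of a pixel. ASSUMPTION: this follows Skia's
// SkFDot6 convention; gg's actual definition may differ.
type FDot6 int32

// fdot6One is the fixed-point representation of 1.0.
const fdot6One = 64

// FDot6FromFloat converts to fixed point, rounding to the nearest 1/64.
func FDot6FromFloat(f float64) FDot6 {
	return FDot6(math.Floor(f*fdot6One + 0.5))
}

// Float converts back to floating point.
func (v FDot6) Float() float64 { return float64(v) / fdot6One }

// Floor returns the integer pixel at or below v. The arithmetic shift
// keeps the result correct for negative coordinates.
func (v FDot6) Floor() int { return int(v >> 6) }

// Round returns the nearest integer pixel (halves round up).
func (v FDot6) Round() int { return int((v + fdot6One/2) >> 6) }

func main() {
	x := FDot6FromFloat(1.5)
	fmt.Println(int32(x), x.Float(), x.Floor(), x.Round()) // prints "96 1.5 1 2"
}
```

&lt;p&gt;Rasterizers favor this representation because edge stepping becomes integer addition, which gives identical results on every platform and avoids accumulating floating-point error across scanlines.&lt;/p&gt;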

&lt;h3&gt;
  
  
  render/ — GPU Integration Layer
&lt;/h3&gt;

&lt;p&gt;This package bridges gg to host applications via &lt;code&gt;gpucontext&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// gg/render/device.go&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;DeviceHandle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DeviceProvider&lt;/span&gt;

&lt;span class="c"&gt;// gg/render/gpu_renderer.go&lt;/span&gt;
&lt;span class="c"&gt;// gg receives device from host, doesn't create its own&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewGPURenderer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="n"&gt;DeviceHandle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;GPURenderer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"render: nil device handle"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;GPURenderer&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;           &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;softwareFallback&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;NewSoftwareRenderer&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Architecture:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              User Application
                    │
     ┌──────────────┼──────────────┐
     │              │              │
     ▼              ▼              ▼
  gogpu.App    gg.Context     gg.Scene
  (windowing)  (immediate)    (retained)
     │              │              │
     └──────────────┼──────────────┘
                    │
                    ▼
            gg/render package
     ┌──────────────┼──────────────┐
     │              │              │
     ▼              ▼              ▼
 DeviceHandle  RenderTarget    Renderer
 (GPU access)    (output)     (execution)
                    │
                    ▼
            gg/core package
          (CPU rasterization)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Building gogpu/ui: The Path Forward
&lt;/h2&gt;

&lt;p&gt;With &lt;code&gt;gpucontext&lt;/code&gt; providing GPU access AND input events, &lt;strong&gt;gogpu/ui&lt;/strong&gt; can now be built as a pure consumer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Future gogpu/ui integration&lt;/span&gt;
&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gogpu"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/ui"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewApp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultConfig&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="c"&gt;// ui receives BOTH GPU context AND events&lt;/span&gt;
    &lt;span class="n"&gt;uiRoot&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ui&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GPUContextProvider&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="c"&gt;// GPU for rendering&lt;/span&gt;
        &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;        &lt;span class="c"&gt;// Input for interaction&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;uiRoot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;ui&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;ui&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello, GoGPU!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Font&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ui&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Title&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;ui&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Button&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Click Me"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnClick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Clicked!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Padding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Gap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;12&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnDraw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;uiRoot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Render&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  UI Architecture Goals
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Signals-based reactivity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fine-grained updates, O(affected) not O(n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tailwind-style API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Type-safe styling, AI-friendly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Enterprise features&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Docking, virtualization, accessibility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross-platform&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Desktop (gogpu), Web (WASM), Mobile (WebView)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
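
&lt;p&gt;To make "O(affected) not O(n)" concrete, here is a minimal sketch of a signal in plain Go. The &lt;code&gt;Signal&lt;/code&gt; type and its &lt;code&gt;Get&lt;/code&gt;, &lt;code&gt;Set&lt;/code&gt;, and &lt;code&gt;Subscribe&lt;/code&gt; methods are hypothetical names chosen for illustration, not the gogpu/ui API, which is still in design. The sketch only shows the core idea: a write notifies the subscribers of that one signal instead of re-evaluating every widget.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import "fmt"

// Signal is a hypothetical reactive value; it is not part of gogpu/ui.
type Signal[T comparable] struct {
    value T
    subs  []func(T)
}

// NewSignal returns a signal holding the initial value.
func NewSignal[T comparable](initial T) *Signal[T] {
    return &amp;amp;Signal[T]{value: initial}
}

// Get returns the current value.
func (s *Signal[T]) Get() T { return s.value }

// Subscribe registers fn to run whenever the value changes.
func (s *Signal[T]) Subscribe(fn func(T)) {
    s.subs = append(s.subs, fn)
}

// Set stores v and notifies subscribers, skipping the work if v is unchanged.
func (s *Signal[T]) Set(v T) {
    if v == s.value {
        return
    }
    s.value = v
    for _, fn := range s.subs {
        fn(v)
    }
}

func main() {
    title := NewSignal("Hello")
    count := NewSignal(0)

    redraws := 0
    title.Subscribe(func(string) { redraws++ }) // stands in for a text widget
    count.Subscribe(func(n int) { fmt.Println("count is now", n) })

    count.Set(1) // only the count subscriber runs
    count.Set(1) // unchanged value: nobody runs
    fmt.Println("title redraws:", redraws)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The second &lt;code&gt;Set&lt;/code&gt; is a no-op and the title subscriber never runs, so the redraw counter stays at 0. The sketch is not safe for concurrent use; a real implementation would also need synchronization and dependency tracking.&lt;/p&gt;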




&lt;h2&gt;
  
  
  The Ecosystem Today
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;LOC&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;gogpu/gg&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.21.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~143K&lt;/td&gt;
&lt;td&gt;2D graphics + core/render packages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/wgpu" rel="noopener noreferrer"&gt;gogpu/wgpu&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;v0.10.2&lt;/td&gt;
&lt;td&gt;~95K&lt;/td&gt;
&lt;td&gt;Pure Go WebGPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/naga" rel="noopener noreferrer"&gt;gogpu/naga&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;v0.8.4&lt;/td&gt;
&lt;td&gt;~33K&lt;/td&gt;
&lt;td&gt;Shader compiler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gogpu" rel="noopener noreferrer"&gt;gogpu/gogpu&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.12.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~29K&lt;/td&gt;
&lt;td&gt;GPU framework + gpucontext integration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/gpucontext" rel="noopener noreferrer"&gt;gogpu/gpucontext&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;v0.2.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~1K&lt;/td&gt;
&lt;td&gt;Shared interfaces (zero deps)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/gogpu/ui" rel="noopener noreferrer"&gt;gogpu/ui&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;GUI toolkit (in design)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Total: ~300K lines of Pure Go.&lt;/strong&gt; No CGO. No Rust required. Just &lt;code&gt;go build&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Dependency Graph
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────┐
│                    Your Application                         │
├─────────────────────────────────────────────────────────────┤
│    gogpu/ui (future)    │   born-ml/born   │   Your App     │
├─────────────────────────────────────────────────────────────┤
│                  gogpu/gg (2D Graphics)                     │
│              core/ (CPU)    render/ (GPU integration)       │
├─────────────────────────────────────────────────────────────┤
│              gogpu/gogpu (Graphics Framework)               │
│         Windowing, Input, GPU Init, DeviceProvider          │
├─────────────────────────────────────────────────────────────┤
│    gogpu/gpucontext (Shared Interfaces — ZERO DEPS)         │
│      DeviceProvider, EventSource, IME, WebGPU types         │
├─────────────────────────────────────────────────────────────┤
│                  gogpu/wgpu (Pure Go WebGPU)                │
├─────────────────────────────────────────────────────────────┤
│            Vulkan  │  Metal  │  DX12  │  GLES               │
└─────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
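
&lt;p&gt;The layering works because the host and the consumer both depend only on a shared interface, never on each other. The single-file sketch below imitates that arrangement. Everything in it is an assumption made for illustration: the local &lt;code&gt;DeviceProvider&lt;/code&gt; interface with a &lt;code&gt;Device() any&lt;/code&gt; method is a stand-in for the real gpucontext interface, and &lt;code&gt;hostApp&lt;/code&gt;, &lt;code&gt;Renderer&lt;/code&gt;, and &lt;code&gt;Backend&lt;/code&gt; are hypothetical names.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "errors"
    "fmt"
)

// DeviceProvider stands in for the shared interface. The real one lives in
// gpucontext; its exact method set is not reproduced here.
type DeviceProvider interface {
    Device() any
}

// hostApp plays the role of the host that owns the GPU device.
type hostApp struct{ device any }

func (h hostApp) Device() any { return h.device }

// Renderer is a hypothetical consumer: it borrows the device and never creates one.
type Renderer struct{ provider DeviceProvider }

// NewRenderer rejects a nil provider instead of creating its own device.
func NewRenderer(p DeviceProvider) (*Renderer, error) {
    if p == nil {
        return nil, errors.New("nil device provider")
    }
    return &amp;amp;Renderer{provider: p}, nil
}

// Backend reports which path a draw call would take.
func (r *Renderer) Backend() string {
    if r.provider.Device() == nil {
        return "software"
    }
    return "gpu"
}

func main() {
    withGPU, err := NewRenderer(hostApp{device: "fake-device"})
    if err != nil {
        panic(err)
    }
    noGPU, err := NewRenderer(hostApp{})
    if err != nil {
        panic(err)
    }
    fmt.Println(withGPU.Backend(), noGPU.Backend())

    _, err = NewRenderer(nil)
    fmt.Println("error:", err)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In a multi-module layout the interface would sit in its own dependency-free package, so the host can implement it and the consumer can accept it without either importing the other.&lt;/p&gt;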






&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Get the latest versions&lt;/span&gt;
go get github.com/gogpu/gogpu@v0.12.0
go get github.com/gogpu/gg@v0.21.0
go get github.com/gogpu/gpucontext@v0.2.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Complete Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/gogpu/gogpu"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gogpu/gmath"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gpucontext"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewApp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultConfig&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithTitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gpucontext Integration Demo"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithSize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c"&gt;// Setup event handling&lt;/span&gt;
    &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnKeyPress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mods&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Modifiers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Key: %d, Mods: %d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mods&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnMousePress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;btn&lt;/span&gt; &lt;span class="n"&gt;gpucontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MouseButton&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Click at (%.0f, %.0f)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OnDraw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gogpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// Verify DeviceProvider&lt;/span&gt;
        &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GPUContextProvider&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Device&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c"&gt;// GPU ready — can pass to gg or other libraries&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c"&gt;// Draw demo triangle (red) on CornflowerBlue background&lt;/span&gt;
        &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawTriangleColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gmath&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CornflowerBlue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q1 2026: Stabilization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Comprehensive benchmarks across all backends&lt;/li&gt;
&lt;li&gt;Memory optimization and GPU submission batching&lt;/li&gt;
&lt;li&gt;Documentation and tutorials&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Q2 2026: gogpu/ui Foundation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Widget system with signals-based reactivity&lt;/li&gt;
&lt;li&gt;Layout engine (flexbox-inspired)&lt;/li&gt;
&lt;li&gt;Theme system with accessibility support&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Q3 2026: gogpu/ui Enterprise Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Docking and workspace management&lt;/li&gt;
&lt;li&gt;Virtualized lists for large datasets&lt;/li&gt;
&lt;li&gt;AccessKit integration for screen readers&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Join the Discussion
&lt;/h2&gt;

&lt;p&gt;We're making architectural decisions &lt;strong&gt;right now&lt;/strong&gt;. Your input shapes the future of Go graphics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/orgs/gogpu/discussions" rel="noopener noreferrer"&gt;GitHub Discussions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/ui" rel="noopener noreferrer"&gt;gogpu/ui Repository&lt;/a&gt; — Star and watch for updates&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;gogpu/gogpu:&lt;/strong&gt; &lt;a href="https://github.com/gogpu/gogpu/releases/tag/v0.12.0" rel="noopener noreferrer"&gt;https://github.com/gogpu/gogpu/releases/tag/v0.12.0&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;gogpu/gg:&lt;/strong&gt; &lt;a href="https://github.com/gogpu/gg/releases/tag/v0.21.0" rel="noopener noreferrer"&gt;https://github.com/gogpu/gg/releases/tag/v0.21.0&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;gogpu/gpucontext:&lt;/strong&gt; &lt;a href="https://github.com/gogpu/gpucontext" rel="noopener noreferrer"&gt;https://github.com/gogpu/gpucontext&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Organization:&lt;/strong&gt; &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;https://github.com/gogpu&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Enterprise-grade GPU integration. Pure Go. Zero CGO. Zero circular dependencies.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/gogpu/gogpu@v0.12.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Star the repos if you find them useful!&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the GoGPU Journey series:&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://dev.to/kolkov/gogpu-a-pure-go-graphics-library-for-gpu-programming-2j5d"&gt;GoGPU: A Pure Go Graphics Library&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/kolkov/gogpu-from-idea-to-100k-lines-in-two-weeks-building-gos-gpu-ecosystem-3b2"&gt;From Idea to 100K Lines in Two Weeks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/kolkov/gpu-compute-shaders-in-pure-go-gogpugg-v0153-4a9k"&gt;GPU Compute Shaders in Pure Go&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/kolkov/enterprise-gpu-backend-in-pure-go-gogpugg-v0200-2a3m"&gt;Enterprise GPU Backend v0.20.0&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise Architecture v0.21.0&lt;/strong&gt; ← You are here&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>go</category>
      <category>gpu</category>
      <category>graphics</category>
      <category>webgpu</category>
    </item>
    <item>
      <title>gogpu/gg: Enterprise 2D Graphics Library in Pure Go</title>
      <dc:creator>Andrey Kolkov</dc:creator>
      <pubDate>Thu, 22 Jan 2026 22:05:00 +0000</pubDate>
      <link>https://forem.com/kolkov/gogpugg-enterprise-2d-graphics-library-in-pure-go-1931</link>
      <guid>https://forem.com/kolkov/gogpugg-enterprise-2d-graphics-library-in-pure-go-1931</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Latest: v0.20.0&lt;/strong&gt; — Enterprise-grade GPU backend, color emoji, anti-aliased rendering. Part of the &lt;strong&gt;280K+ LOC Pure Go&lt;/strong&gt; graphics ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Previous:&lt;/strong&gt; &lt;a href="https://dev.to/kolkov/gpu-compute-shaders-in-pure-go-gogpugg-v0150-1cjk"&gt;GPU Compute Shaders in Pure Go (v0.15.0)&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What is gogpu/gg?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;gg&lt;/strong&gt; is a production-grade 2D graphics library for Go, designed to power:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;IDEs&lt;/strong&gt; — Syntax highlighting, code rendering, UI components&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Browsers&lt;/strong&gt; — Canvas-like rendering, WebGPU acceleration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Professional apps&lt;/strong&gt; — Vector graphics, data visualization, games&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key differentiator:&lt;/strong&gt; Pure Go. No CGO. No external dependencies. Just &lt;code&gt;go build&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/gogpu/gg@v0.20.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Feature Overview
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Capabilities&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Drawing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Rectangles, circles, ellipses, arcs, lines, polygons, stars, bezier curves&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Paths&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MoveTo, LineTo, QuadraticTo, CubicTo, ArcTo, Close + fluent PathBuilder&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Text&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;TrueType fonts, MSDF rendering, bidirectional text, color emoji&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Images&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;PNG/JPEG I/O, 7 pixel formats, affine transforms, mipmaps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compositing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;29 blend modes (Porter-Duff, Advanced, HSL), layer isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rendering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Software (CPU), GPU (Vulkan/Metal/DX12), hybrid auto-selection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Performance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SIMD optimization, parallel tile rendering, LRU caching&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
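
&lt;p&gt;For readers unfamiliar with the Porter-Duff modes named in the compositing row, the sketch below implements the standard source-over operator on premultiplied RGBA. The &lt;code&gt;RGBA&lt;/code&gt; type and &lt;code&gt;SourceOver&lt;/code&gt; function are illustrative names, not gg's API; the sketch shows only the arithmetic behind one blend mode.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import "fmt"

// RGBA is a premultiplied color with components in [0, 1].
// Illustrative type, not gg's color type.
type RGBA struct{ R, G, B, A float64 }

// SourceOver composites src over dst using the Porter-Duff "over" operator:
// out = src + dst*(1 - src.A), applied to every premultiplied channel.
func SourceOver(src, dst RGBA) RGBA {
    k := 1 - src.A
    return RGBA{
        R: src.R + dst.R*k,
        G: src.G + dst.G*k,
        B: src.B + dst.B*k,
        A: src.A + dst.A*k,
    }
}

func main() {
    halfRed := RGBA{R: 0.5, A: 0.5} // 50% opaque red, premultiplied
    white := RGBA{R: 1, G: 1, B: 1, A: 1}
    fmt.Printf("%+v\n", SourceOver(halfRed, white))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Compositing half-transparent red over opaque white yields (1, 0.5, 0.5, 1): a light pink at full alpha.&lt;/p&gt;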




&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Hello World
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/gogpu/gg"&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;512&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;// Clear background&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClearWithColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;White&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// Draw anti-aliased circle&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetHexColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#3498db"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawCircle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;// Draw stroked rectangle&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetHexColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#e74c3c"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetLineWidth&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawRectangle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;156&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;156&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stroke&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SavePNG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"output.png"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Text Rendering
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gg"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gg/text"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClearWithColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;White&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// Load font&lt;/span&gt;
    &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewFontSourceFromFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Roboto-Regular.ttf"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;// Render text&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetFont&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Face&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;48&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Black&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello, GoGPU!"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// Centered text&lt;/span&gt;
    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawStringAnchored&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Centered Text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SavePNG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text.png"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Gradients
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Linear gradient&lt;/span&gt;
&lt;span class="n"&gt;grad&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewLinearGradientBrush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;grad&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddColorStop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#ff6b6b"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;grad&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddColorStop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#4ecdc4"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;grad&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddColorStop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#45b7d1"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetFillBrush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grad&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawRectangle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SavePNG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gradient.png"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
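
&lt;p&gt;For intuition about what the three &lt;code&gt;AddColorStop&lt;/code&gt; calls describe, here is a minimal, standalone sketch of evaluating a gradient at a position &lt;code&gt;t&lt;/code&gt; by blending the two nearest stops. It is not gg's implementation: the &lt;code&gt;rgb&lt;/code&gt; and &lt;code&gt;stop&lt;/code&gt; types and the &lt;code&gt;colorAt&lt;/code&gt; helper are hypothetical names for this illustration, and it assumes plain per-channel linear interpolation, which the library may or may not use.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sort"
)

// rgb holds color channels in the 0..1 range (hypothetical type, not gg's).
type rgb struct{ R, G, B float64 }

// stop pairs an offset in 0..1 with a color (hypothetical type, not gg's).
type stop struct {
	Offset float64
	Color  rgb
}

// lerp linearly interpolates between a and b.
func lerp(a, b, t float64) float64 { return a + (b-a)*t }

// colorAt evaluates the gradient at t by blending the two stops around t.
// stops must be non-empty and sorted by Offset; t outside the range clamps.
func colorAt(stops []stop, t float64) rgb {
	// Index of the first stop whose offset is not below t.
	i := sort.Search(len(stops), func(i int) bool { return stops[i].Offset >= t })
	if i == 0 {
		return stops[0].Color
	}
	if i == len(stops) {
		return stops[len(stops)-1].Color
	}
	a, b := stops[i-1], stops[i]
	u := (t - a.Offset) / (b.Offset - a.Offset)
	return rgb{
		R: lerp(a.Color.R, b.Color.R, u),
		G: lerp(a.Color.G, b.Color.G, u),
		B: lerp(a.Color.B, b.Color.B, u),
	}
}

func main() {
	stops := []stop{
		{0, rgb{1, 0, 0}},
		{0.5, rgb{0, 1, 0}},
		{1, rgb{0, 0, 1}},
	}
	fmt.Println(colorAt(stops, 0.25))
}
```

&lt;p&gt;With stops at 0, 0.5 and 1, a position of 0.25 falls halfway between the first two stops, so the result is an even mix of their colors.&lt;/p&gt;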



&lt;h3&gt;
  
  
  Transforms
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;dc&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClearWithColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;White&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Save state, transform, draw, restore&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Push&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Translate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Rotate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pi&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// 45 degrees&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Scale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetHexColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#9b59b6"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawRectangle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c"&gt;// Restore original state&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SavePNG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"transform.png"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Canvas API
&lt;/h2&gt;

&lt;p&gt;gg provides a familiar Canvas-like API inspired by HTML5 Canvas and Cairo:&lt;/p&gt;

&lt;h3&gt;
  
  
  Drawing Primitives
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Rectangles&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawRectangle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawRoundedRectangle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;radius&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Circles and ellipses&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawCircle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;radius&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawEllipse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawArc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startAngle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endAngle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Lines&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MoveTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LineTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Curves&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;QuadraticTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CubicTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cy1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cx2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cy2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Polygons&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawPolygon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;points&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawRegularPolygon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rotation&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawStar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outerR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;innerR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;points&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Styles
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Colors&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Red&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRGB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetRGBA&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetHexColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#3498db"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Strokes&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetLineWidth&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetLineCap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LineCapRound&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="c"&gt;// Butt, Round, Square&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetLineJoin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LineJoinMiter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// Miter, Round, Bevel&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetDash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// Pattern: 5px on, 3px off&lt;/span&gt;

&lt;span class="c"&gt;// Fill and stroke&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stroke&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FillPreserve&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;   &lt;span class="c"&gt;// Fill but keep path&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StrokePreserve&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// Stroke but keep path&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
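
&lt;p&gt;The &lt;code&gt;SetDash(5, 3)&lt;/code&gt; comment describes a pattern of 5px on, 3px off. As a rough, standalone sketch of what that means for a straight line, the hypothetical &lt;code&gt;dashSegments&lt;/code&gt; helper below cycles through the pattern and returns the drawn pieces. It ignores dash offsets, joins and curves, and it is not how gg's stroker is implemented.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"math"
)

// segment is one drawn piece of a dashed line, as distances from its start.
type segment struct{ From, To float64 }

// dashSegments splits a line of the given length into drawn segments by
// cycling through pattern (on, off, on, off, ...). An empty pattern, or one
// with a non-positive entry, yields a single solid segment.
func dashSegments(length float64, pattern ...float64) []segment {
	solid := []segment{{0, length}}
	if len(pattern) == 0 {
		return solid
	}
	for _, p := range pattern {
		if 0 >= p {
			return solid
		}
	}

	var out []segment
	pos := 0.0
	for i := 0; length > pos; i++ {
		end := math.Min(pos+pattern[i%len(pattern)], length)
		if i%2 == 0 { // even steps are "on", odd steps are "off"
			out = append(out, segment{pos, end})
		}
		pos = end
	}
	return out
}

func main() {
	for _, s := range dashSegments(20, 5, 3) {
		fmt.Printf("draw from %v to %v\n", s.From, s.To)
	}
}
```

&lt;p&gt;For a 20px line the last dash is cut short at the end of the line rather than overshooting it.&lt;/p&gt;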



&lt;h3&gt;
  
  
  Clipping
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Clip to circle&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawCircle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Clip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// All subsequent drawing is clipped&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResetClip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Fluent PathBuilder
&lt;/h2&gt;

&lt;p&gt;Build complex paths with method chaining for use with the scene graph:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gg"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gg/scene"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Build path with fluent API&lt;/span&gt;
&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BuildPath&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;MoveTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;LineTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;QuadTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;250&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;CubicTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;250&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;250&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Circle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Star&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;RoundRect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;250&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
    &lt;span class="n"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Use with scene graph&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewScene&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FillNonZero&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IdentityAffine&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SolidBrush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#2ecc71"&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewPathShape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Available builder methods:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;MoveTo&lt;/code&gt;, &lt;code&gt;LineTo&lt;/code&gt;, &lt;code&gt;QuadTo&lt;/code&gt;, &lt;code&gt;CubicTo&lt;/code&gt;, &lt;code&gt;Close&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Circle&lt;/code&gt;, &lt;code&gt;Ellipse&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Rect&lt;/code&gt;, &lt;code&gt;RoundRect&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Polygon&lt;/code&gt;, &lt;code&gt;Star&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Text Rendering
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Font Loading
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// From file&lt;/span&gt;
&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewFontSourceFromFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"font.ttf"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Alternatively, from bytes (use instead of the file call above):&lt;/span&gt;
&lt;span class="c"&gt;// source, err := text.NewFontSourceFromBytes(fontBytes)&lt;/span&gt;

&lt;span class="c"&gt;// Create face at specific size&lt;/span&gt;
&lt;span class="n"&gt;face&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Face&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// 24pt&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Multi-Face (Font Fallback)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Primary font + emoji fallback&lt;/span&gt;
&lt;span class="n"&gt;mainFont&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewFontSourceFromFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Roboto.ttf"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;emojiFont&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewFontSourceFromFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"NotoColorEmoji.ttf"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;multiFace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewMultiFace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;mainFont&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Face&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;16&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewFilteredFace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;emojiFont&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Face&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;16&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RangeEmoji&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetFont&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;multiFace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello World! 🎉🚀"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Text Layout with Wrapping
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;opts&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayoutOptions&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;MaxWidth&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;WrapMode&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WrapWordChar&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c"&gt;// Word-first, char fallback&lt;/span&gt;
    &lt;span class="n"&gt;Alignment&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AlignCenter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;LineSpacing&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;layout&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayoutText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;longText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;face&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Render using glyph renderer&lt;/span&gt;
&lt;span class="n"&gt;renderer&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewGlyphRenderer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;outlines&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RenderLayout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Or access line metrics directly&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lines&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// line.Y is the baseline Y position&lt;/span&gt;
    &lt;span class="c"&gt;// line.Width, line.Ascent, line.Descent available&lt;/span&gt;
    &lt;span class="c"&gt;// line.Glyphs contains positioned glyphs&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
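&lt;p&gt;The &lt;code&gt;WrapWordChar&lt;/code&gt; comment above ("word-first, char fallback") names a common wrapping policy. The sketch below shows one way that policy can work in plain Go. It does not use gg: &lt;code&gt;wrapWordChar&lt;/code&gt; is a hypothetical helper, and it counts runes where a real layout engine would measure glyph advances against &lt;code&gt;MaxWidth&lt;/code&gt;.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// wrapWordChar breaks text into lines of at most maxCols runes.
// It prefers to break between words and splits inside a word only
// when that word alone is wider than a whole line.
func wrapWordChar(text string, maxCols int) []string {
	if 1 > maxCols {
		return nil // a line must hold at least one rune
	}
	var lines []string
	var cur []rune
	flush := func() {
		if len(cur) > 0 {
			lines = append(lines, string(cur))
			cur = nil
		}
	}
	for _, word := range strings.Fields(text) {
		w := []rune(word)
		// Character fallback: the word cannot fit on any single line.
		for len(w) > maxCols {
			flush()
			lines = append(lines, string(w[:maxCols]))
			w = w[maxCols:]
		}
		need := len(w)
		if len(cur) > 0 {
			need++ // room for the separating space
		}
		if len(cur)+need > maxCols {
			flush()
		}
		if len(cur) > 0 {
			cur = append(cur, ' ')
		}
		cur = append(cur, w...)
	}
	flush()
	return lines
}

func main() {
	for _, line := range wrapWordChar("a supercalifragilistic word wraps", 10) {
		fmt.Println(line)
	}
}
```

&lt;p&gt;This variant starts an over-long word on a fresh line before splitting it. Filling the remainder of the current line first is an equally valid choice.&lt;/p&gt;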



&lt;h3&gt;
  
  
  Color Emoji
&lt;/h3&gt;

&lt;p&gt;gg supports both bitmap (CBDT/CBLC) and vector (COLR/CPAL) color emoji:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Extract bitmap emoji (Noto Color Emoji)&lt;/span&gt;
&lt;span class="n"&gt;extractor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCBDTExtractor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cbdtData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cblcData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;glyph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;extractor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetGlyph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;glyphID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ppem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;png&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;glyph&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;// Parse vector emoji layers (Segoe UI Emoji)&lt;/span&gt;
&lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCOLRParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;colrData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cpalData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;colrGlyph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetGlyph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;glyphID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;paletteIndex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;layer&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;colrGlyph&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Layers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Render each layer with layer.Color&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Layer Compositing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  29 Blend Modes
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Porter-Duff:&lt;/strong&gt; Clear, Src, Dst, SrcOver, DstOver, SrcIn, DstIn, SrcOut, DstOut, SrcAtop, DstAtop, Xor&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advanced:&lt;/strong&gt; Multiply, Screen, Overlay, Darken, Lighten, ColorDodge, ColorBurn, HardLight, SoftLight, Difference, Exclusion&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HSL:&lt;/strong&gt; Hue, Saturation, Color, Luminosity&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PushLayer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BlendMultiply&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// Blend mode + opacity&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetHexColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#e74c3c"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawCircle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetHexColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#3498db"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawCircle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;250&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PopLayer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c"&gt;// Composite layer&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
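&lt;p&gt;Separable modes such as Multiply and Screen reduce to one function per color channel, followed by ordinary source-over compositing. The sketch below follows the formulas in the W3C Compositing and Blending specification. The &lt;code&gt;Color&lt;/code&gt; type and &lt;code&gt;composite&lt;/code&gt; function are illustrative names and are not gg's API; the &lt;code&gt;opacity&lt;/code&gt; argument plays the role of the layer opacity passed to &lt;code&gt;PushLayer&lt;/code&gt;.&lt;/p&gt;

```go
package main

import "fmt"

// Color is a non-premultiplied RGBA color with channels in [0, 1].
type Color struct{ R, G, B, A float64 }

// blendFunc maps a backdrop channel cb and a source channel cs to a blended channel.
type blendFunc func(cb, cs float64) float64

func multiply(cb, cs float64) float64 { return cb * cs }
func screen(cb, cs float64) float64   { return cb + cs - cb*cs }

// composite blends src onto dst with a separable blend function and then
// applies source-over compositing. opacity scales the source alpha.
func composite(dst, src Color, blend blendFunc, opacity float64) Color {
	as := src.A * opacity
	ab := dst.A
	ao := as + ab - as*ab // resulting alpha
	if ao == 0 {
		return Color{}
	}
	channel := func(cb, cs float64) float64 {
		// Premultiplied result: source only, backdrop only, and the overlap
		// region where the blend function applies.
		premul := as*(1-ab)*cs + ab*(1-as)*cb + as*ab*blend(cb, cs)
		return premul / ao
	}
	return Color{
		R: channel(dst.R, src.R),
		G: channel(dst.G, src.G),
		B: channel(dst.B, src.B),
		A: ao,
	}
}

func main() {
	gray := Color{R: 0.5, G: 0.5, B: 0.5, A: 1}
	red := Color{R: 1, A: 1}
	fmt.Println(composite(gray, red, multiply, 1))
	fmt.Println(composite(gray, red, screen, 1))
	fmt.Println(composite(gray, red, multiply, 0.8))
}
```

&lt;p&gt;Non-separable modes (Hue, Saturation, Color, Luminosity) need all three channels at once, so they would take a different function signature than &lt;code&gt;blendFunc&lt;/code&gt;.&lt;/p&gt;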



&lt;h3&gt;
  
  
  Alpha Masks
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Create mask from shape&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawCircle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;mask&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AsMask&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Apply mask to new context&lt;/span&gt;
&lt;span class="n"&gt;dc2&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;400&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetMask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dc2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DrawImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backgroundImage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// Only visible through mask&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
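&lt;p&gt;For readers who want to see the masking idea without gg, Go's standard library offers the same operation through &lt;code&gt;image/draw.DrawMask&lt;/code&gt;. The sketch below builds a circular alpha mask and draws a solid color through it. The helpers &lt;code&gt;circleMask&lt;/code&gt;, &lt;code&gt;solid&lt;/code&gt;, and &lt;code&gt;applyMask&lt;/code&gt; are hypothetical, and the mask has hard edges to keep the example short.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"image/draw"
)

// circleMask returns a size x size alpha mask that is opaque inside the
// circle (cx, cy, r) and transparent outside it. No anti-aliasing.
func circleMask(size, cx, cy, r int) *image.Alpha {
	m := image.NewAlpha(image.Rect(0, 0, size, size))
	for i := range m.Pix {
		dx, dy := i%m.Stride-cx, i/m.Stride-cy
		if r*r >= dx*dx+dy*dy {
			m.Pix[i] = 255
		}
	}
	return m
}

// solid returns an infinite image of one opaque color.
func solid(r, g, b uint8) *image.Uniform {
	return image.NewUniform(color.RGBA{R: r, G: g, B: b, A: 255})
}

// applyMask draws src into a new image; only pixels covered by the mask show.
func applyMask(src image.Image, mask *image.Alpha) *image.RGBA {
	dst := image.NewRGBA(mask.Bounds())
	draw.DrawMask(dst, dst.Bounds(), src, image.Point{}, mask, image.Point{}, draw.Over)
	return dst
}

func main() {
	out := applyMask(solid(52, 152, 219), circleMask(400, 200, 200, 100))
	fmt.Println(out.RGBAAt(200, 200)) // inside the circle
	fmt.Println(out.RGBAAt(10, 10))   // outside the circle
}
```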






&lt;h2&gt;
  
  
  Scene Graph (Retained Mode)
&lt;/h2&gt;

&lt;p&gt;For complex, frequently updated scenes:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gg"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gogpu/gg/scene"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewScene&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Build scene graph with layer compositing&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PushLayer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BlendNormal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// blend, alpha, clip&lt;/span&gt;

&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FillNonZero&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IdentityAffine&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SolidBrush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Red&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCircleShape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FillNonZero&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IdentityAffine&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SolidBrush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Blue&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCircleShape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;250&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PopLayer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Render to pixmap&lt;/span&gt;
&lt;span class="n"&gt;renderer&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRenderer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;renderer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GPU-optimized command batching&lt;/li&gt;
&lt;li&gt;Efficient dirty region tracking&lt;/li&gt;
&lt;li&gt;Parallel tile rendering&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Backend Architecture
&lt;/h2&gt;

&lt;p&gt;gg supports three rendering backends:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Native&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;backend/native/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pure Go via gogpu/wgpu&lt;/td&gt;
&lt;td&gt;Default, zero deps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rust&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;backend/rust/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;wgpu-native FFI&lt;/td&gt;
&lt;td&gt;Max performance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Software&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;backend/software/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CPU rasterizer&lt;/td&gt;
&lt;td&gt;Fallback&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/gogpu/gg/backend"&lt;/span&gt;

&lt;span class="c"&gt;// Auto-select best available&lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;backend&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Default&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Explicit selection: reassign b to one of these&lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;backend&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backend&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BackendNative&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;backend&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backend&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BackendRust&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c"&gt;// Requires -tags rust&lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;backend&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backend&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BackendSoftware&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  GPU Backend Features (v0.20.0)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Command Encoder&lt;/strong&gt; — State machine for GPU command recording&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Texture Management&lt;/strong&gt; — Lazy views with &lt;code&gt;sync.Once&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Buffer Mapping&lt;/strong&gt; — Async with device polling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pipeline Cache&lt;/strong&gt; — FNV-1a descriptor hashing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compute Shaders&lt;/strong&gt; — WGSL shaders for path rasterization&lt;/li&gt;
&lt;/ul&gt;
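&lt;p&gt;To make the pipeline cache bullet concrete, here is a minimal sketch of descriptor hashing with 64-bit FNV-1a from Go's &lt;code&gt;hash/fnv&lt;/code&gt; package. The &lt;code&gt;PipelineDesc&lt;/code&gt;, &lt;code&gt;Pipeline&lt;/code&gt;, and &lt;code&gt;Cache&lt;/code&gt; types are hypothetical stand-ins and their fields are assumptions; gg's real descriptors are certainly richer.&lt;/p&gt;

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// PipelineDesc is a hypothetical, simplified pipeline descriptor.
type PipelineDesc struct {
	Shader      string
	BlendMode   uint32
	SampleCount uint32
}

// Key hashes the descriptor fields with 64-bit FNV-1a.
func (d PipelineDesc) Key() uint64 {
	h := fnv.New64a()
	h.Write([]byte(d.Shader))
	h.Write([]byte{0}) // separator so field boundaries stay unambiguous
	var buf [8]byte
	binary.LittleEndian.PutUint32(buf[0:4], d.BlendMode)
	binary.LittleEndian.PutUint32(buf[4:8], d.SampleCount)
	h.Write(buf[:])
	return h.Sum64()
}

// Pipeline stands in for a GPU object that is expensive to create.
type Pipeline struct{ ID int }

// Cache maps descriptor hashes to pipelines. The zero value is ready to use.
// A robust cache would also compare descriptors on a hit to rule out collisions.
type Cache struct {
	pipelines map[uint64]Pipeline
	built     int
}

// Get returns the cached pipeline for d, building it on first use.
func (c *Cache) Get(d PipelineDesc) Pipeline {
	if c.pipelines == nil {
		c.pipelines = make(map[uint64]Pipeline)
	}
	key := d.Key()
	if p, ok := c.pipelines[key]; ok {
		return p
	}
	c.built++
	p := Pipeline{ID: c.built}
	c.pipelines[key] = p
	return p
}

func main() {
	var cache Cache
	fill := PipelineDesc{Shader: "fill", BlendMode: 1, SampleCount: 4}
	stroke := PipelineDesc{Shader: "stroke", BlendMode: 1, SampleCount: 4}
	fmt.Println(cache.Get(fill).ID, cache.Get(fill).ID, cache.Get(stroke).ID)
	fmt.Println("pipelines built:", cache.built)
}
```

&lt;p&gt;Hashing pays off when descriptors contain slices or nested structures that cannot serve as map keys directly.&lt;/p&gt;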




&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Benchmarks
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Time&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;sRGB → Linear&lt;/td&gt;
&lt;td&gt;0.16ns&lt;/td&gt;
&lt;td&gt;260x faster than math.Pow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LayerCache.Get&lt;/td&gt;
&lt;td&gt;90ns&lt;/td&gt;
&lt;td&gt;Thread-safe LRU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DirtyRegion.Mark&lt;/td&gt;
&lt;td&gt;10.9ns&lt;/td&gt;
&lt;td&gt;Lock-free atomic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MSDF lookup&lt;/td&gt;
&lt;td&gt;&amp;lt;10ns&lt;/td&gt;
&lt;td&gt;Zero-allocation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Path iteration&lt;/td&gt;
&lt;td&gt;438ns&lt;/td&gt;
&lt;td&gt;iter.Seq, 0 allocs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
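&lt;p&gt;The sRGB row compares against &lt;code&gt;math.Pow&lt;/code&gt;. A common way to avoid that call on the hot path is a 256-entry lookup table built once at startup. The sketch below shows that technique under the assumption of 8-bit input; it is not taken from gg's source, and the function names are illustrative.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"math"
)

// srgbToLinearLUT holds the linear value for every 8-bit sRGB code.
var srgbToLinearLUT = buildLUT()

// srgbToLinearExact applies the sRGB transfer function to c in [0, 1].
func srgbToLinearExact(c float64) float64 {
	if c > 0.04045 {
		return math.Pow((c+0.055)/1.055, 2.4)
	}
	return c / 12.92
}

// buildLUT evaluates the exact function once per possible input value.
func buildLUT() [256]float32 {
	var lut [256]float32
	for i := range lut {
		lut[i] = float32(srgbToLinearExact(float64(i) / 255))
	}
	return lut
}

// SRGBToLinear converts an 8-bit sRGB channel with a single table load.
func SRGBToLinear(v uint8) float32 { return srgbToLinearLUT[v] }

func main() {
	fmt.Println(SRGBToLinear(0), SRGBToLinear(128), SRGBToLinear(255))
}
```

&lt;p&gt;The table costs 1 KB and is exact for 8-bit inputs, because every possible input has its own entry.&lt;/p&gt;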

&lt;h3&gt;
  
  
  Optimizations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SIMD&lt;/strong&gt; — Vectorized color operations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel rendering&lt;/strong&gt; — Tile-based multi-threaded rasterization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LRU caching&lt;/strong&gt; — Glyph cache, texture cache, layer cache&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPU compute&lt;/strong&gt; — Path flattening, tile binning, coverage calculation&lt;/li&gt;
&lt;/ul&gt;
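&lt;p&gt;As an illustration of the LRU caching bullet, the sketch below implements a small least-recently-used cache with &lt;code&gt;container/list&lt;/code&gt; and a map. The &lt;code&gt;LRU&lt;/code&gt; type is hypothetical, stores &lt;code&gt;int&lt;/code&gt; values under &lt;code&gt;string&lt;/code&gt; keys for brevity, and is not safe for concurrent use; a thread-safe variant would guard each method with a mutex.&lt;/p&gt;

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is the payload stored in each list element.
type entry struct {
	key   string
	value int
}

// LRU is a fixed-capacity cache that evicts the least recently used entry.
type LRU struct {
	capacity int
	order    list.List                // front = most recently used
	items    map[string]*list.Element // key -> node in order
}

// NewLRU returns an empty cache that holds at most capacity entries.
func NewLRU(capacity int) *LRU {
	c := new(LRU)
	c.capacity = capacity
	c.items = make(map[string]*list.Element)
	return c
}

// Get returns the value for key and marks it as recently used.
func (c *LRU) Get(key string) (int, bool) {
	el, ok := c.items[key]
	if !ok {
		return 0, false
	}
	c.order.MoveToFront(el)
	return el.Value.(entry).value, true
}

// Put stores value under key, evicting the oldest entry when full.
func (c *LRU) Put(key string, value int) {
	if el, ok := c.items[key]; ok {
		el.Value = entry{key: key, value: value}
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(entry{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(entry).key)
	}
}

func main() {
	cache := NewLRU(2)
	cache.Put("a", 1)
	cache.Put("b", 2)
	cache.Get("a")    // "a" becomes the most recently used entry
	cache.Put("c", 3) // evicts "b"
	_, hasB := cache.Get("b")
	fmt.Println("b cached:", hasB)
}
```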




&lt;h2&gt;
  
  
  Anti-Aliased Rendering
&lt;/h2&gt;

&lt;p&gt;v0.19.0 introduced professional-grade anti-aliasing using the &lt;strong&gt;tiny-skia algorithm&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// AA is enabled by default&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Disable AA for performance&lt;/span&gt;
&lt;span class="n"&gt;dc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FillNoAA&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Implementation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;4x supersampling with coverage accumulation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AlphaRuns&lt;/code&gt; — RLE-encoded sparse alpha buffer&lt;/li&gt;
&lt;li&gt;SIMD batch blending, 16 pixels at a time&lt;/li&gt;
&lt;li&gt;Same approach as Skia, the rasterizer behind Chrome, Android, and Flutter (tiny-skia is a Rust port of a subset of Skia)&lt;/li&gt;
&lt;/ul&gt;
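&lt;p&gt;The run-length idea behind a sparse alpha buffer can be sketched in a few lines. The code below is a simplified illustration and not a port of tiny-skia's &lt;code&gt;AlphaRuns&lt;/code&gt;: the &lt;code&gt;Run&lt;/code&gt; type and both functions are hypothetical, and a single gray channel stands in for full RGBA pixels.&lt;/p&gt;

```go
package main

import "fmt"

// Run is a span of consecutive pixels that share one coverage value.
type Run struct {
	Length int
	Alpha  uint8
}

// encodeRuns compresses one scanline of per-pixel coverage into runs.
func encodeRuns(coverage []uint8) []Run {
	var runs []Run
	last := -1 // no previous alpha yet
	for _, a := range coverage {
		if int(a) == last {
			runs[len(runs)-1].Length++
			continue
		}
		runs = append(runs, Run{Length: 1, Alpha: a})
		last = int(a)
	}
	return runs
}

// blendRuns composites a solid source value over dst, one run at a time.
// Fully transparent runs are skipped without touching their pixels.
func blendRuns(dst []uint8, runs []Run, src uint8) {
	x := 0
	for _, r := range runs {
		if r.Alpha != 0 {
			a := int(r.Alpha)
			span := dst[x : x+r.Length]
			for i, d := range span {
				// Rounded integer form of src*a + dst*(1-a).
				span[i] = uint8((int(src)*a + int(d)*(255-a) + 127) / 255)
			}
		}
		x += r.Length
	}
}

func main() {
	coverage := []uint8{0, 0, 0, 128, 255, 255, 0}
	runs := encodeRuns(coverage)
	dst := make([]uint8, len(coverage))
	blendRuns(dst, runs, 255)
	fmt.Println(runs)
	fmt.Println(dst)
}
```

&lt;p&gt;Most of a typical scanline is either empty or fully covered, so long runs let the blender skip or batch large spans instead of visiting every pixel.&lt;/p&gt;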




&lt;h2&gt;
  
  
  The GoGPU Ecosystem
&lt;/h2&gt;

&lt;p&gt;gg is part of a complete Pure Go graphics stack:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;LOC&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;gogpu/gg&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;v0.20.0&lt;/td&gt;
&lt;td&gt;~122K&lt;/td&gt;
&lt;td&gt;2D graphics (this library)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gogpu/wgpu&lt;/td&gt;
&lt;td&gt;v0.10.1&lt;/td&gt;
&lt;td&gt;~95K&lt;/td&gt;
&lt;td&gt;Pure Go WebGPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gogpu/naga&lt;/td&gt;
&lt;td&gt;v0.8.4&lt;/td&gt;
&lt;td&gt;~33K&lt;/td&gt;
&lt;td&gt;WGSL shader compiler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gogpu/gogpu&lt;/td&gt;
&lt;td&gt;v0.11.1&lt;/td&gt;
&lt;td&gt;~28K&lt;/td&gt;
&lt;td&gt;Graphics framework&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Total: ~280K lines of Pure Go.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Design Principles
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pure Go&lt;/strong&gt; — No CGO, easy cross-compilation, single binary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPU-First&lt;/strong&gt; — Designed for GPU acceleration from day one&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production-Ready&lt;/strong&gt; — Enterprise-grade error handling, logging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Stability&lt;/strong&gt; — Semantic versioning, deprecation policy&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Inspired By
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;vello&lt;/strong&gt; (Rust) — GPU compute shaders, sparse strips&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;tiny-skia&lt;/strong&gt; (Rust) — Anti-aliasing, stroke expansion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;kurbo&lt;/strong&gt; (Rust) — Path algorithms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;peniko&lt;/strong&gt; (Rust) — Brush/paint system&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;skrifa/swash&lt;/strong&gt; (Rust) — Font parsing, color emoji&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/gogpu/gg@v0.20.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Requirements:&lt;/strong&gt; Go 1.25+&lt;/p&gt;

&lt;h3&gt;
  
  
  Build Tags
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Default: Native + Software backends&lt;/span&gt;
go build ./...

&lt;span class="c"&gt;# With Rust backend (requires wgpu-native)&lt;/span&gt;
go build &lt;span class="nt"&gt;-tags&lt;/span&gt; rust ./...

&lt;span class="c"&gt;# Software only&lt;/span&gt;
go build &lt;span class="nt"&gt;-tags&lt;/span&gt; nogpu ./...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Complete Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/gg/tree/main/examples/basic" rel="noopener noreferrer"&gt;&lt;code&gt;examples/basic/&lt;/code&gt;&lt;/a&gt; — Basic shapes and colors&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/gg/tree/main/examples/text" rel="noopener noreferrer"&gt;&lt;code&gt;examples/text/&lt;/code&gt;&lt;/a&gt; — Text rendering&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/gg/tree/main/examples/color_emoji" rel="noopener noreferrer"&gt;&lt;code&gt;examples/color_emoji/&lt;/code&gt;&lt;/a&gt; — Color emoji extraction&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/gg/tree/main/examples/gpu" rel="noopener noreferrer"&gt;&lt;code&gt;examples/gpu/&lt;/code&gt;&lt;/a&gt; — GPU backend usage&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gogpu/gg/tree/main/examples/scene" rel="noopener noreferrer"&gt;&lt;code&gt;examples/scene/&lt;/code&gt;&lt;/a&gt; — Scene graph&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Run Examples
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;examples/basic
go run main.go
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Documentation
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;README&lt;/a&gt;&lt;/strong&gt; — Quick start&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/gogpu/gg/blob/main/docs/ARCHITECTURE.md" rel="noopener noreferrer"&gt;ARCHITECTURE.md&lt;/a&gt;&lt;/strong&gt; — System design&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/gogpu/gg/blob/main/ROADMAP.md" rel="noopener noreferrer"&gt;ROADMAP.md&lt;/a&gt;&lt;/strong&gt; — Development milestones&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://pkg.go.dev/github.com/gogpu/gg" rel="noopener noreferrer"&gt;pkg.go.dev&lt;/a&gt;&lt;/strong&gt; — API reference&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/gogpu/gg" rel="noopener noreferrer"&gt;https://github.com/gogpu/gg&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Release v0.20.0:&lt;/strong&gt; &lt;a href="https://github.com/gogpu/gg/releases/tag/v0.20.0" rel="noopener noreferrer"&gt;https://github.com/gogpu/gg/releases/tag/v0.20.0&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GoGPU Organization:&lt;/strong&gt; &lt;a href="https://github.com/gogpu" rel="noopener noreferrer"&gt;https://github.com/gogpu&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discussions:&lt;/strong&gt; &lt;a href="https://github.com/orgs/gogpu/discussions" rel="noopener noreferrer"&gt;https://github.com/orgs/gogpu/discussions&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;122K lines. Pure Go. Production-ready 2D graphics.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get github.com/gogpu/gg@v0.20.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Star the repo if you find it useful!&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the GoGPU Journey series&lt;/em&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>gpu</category>
      <category>graphics</category>
      <category>webgpu</category>
    </item>
  </channel>
</rss>
