<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Farhan Syah</title>
    <description>The latest articles on Forem by Farhan Syah (@farhansyah).</description>
    <link>https://forem.com/farhansyah</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1900686%2Fea8f1dd1-a04d-4806-83b2-45ce96c62aa2.jpeg</url>
      <title>Forem: Farhan Syah</title>
      <link>https://forem.com/farhansyah</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/farhansyah"/>
    <language>en</language>
    <item>
      <title>What Kind of Database I Want NodeDB to Be</title>
      <dc:creator>Farhan Syah</dc:creator>
      <pubDate>Fri, 03 Apr 2026 08:19:35 +0000</pubDate>
      <link>https://forem.com/nodedb/what-kind-of-database-i-want-nodedb-to-be-2ep</link>
      <guid>https://forem.com/nodedb/what-kind-of-database-i-want-nodedb-to-be-2ep</guid>
      <description>&lt;p&gt;When I think about &lt;strong&gt;NodeDB&lt;/strong&gt;, I am not thinking about the longest feature list or the flashiest demo.&lt;/p&gt;

&lt;p&gt;I am thinking about a database I can trust before and after an application grows.&lt;/p&gt;

&lt;p&gt;In the long run, I want &lt;strong&gt;NodeDB&lt;/strong&gt; to be &lt;strong&gt;easy to use&lt;/strong&gt;, &lt;strong&gt;reliable&lt;/strong&gt; in different scenarios, and &lt;strong&gt;secure&lt;/strong&gt; enough that I do not have to keep second-guessing it. I want it to be something I can start with early, keep using later, and not feel forced to replace once the project becomes more serious.&lt;/p&gt;

&lt;p&gt;I should not have to rethink the whole stack every time product requirements change. I should not have to move data somewhere else just because a new use case shows up. I should not have to accept that one part of the database is “real” while another important part is just a workaround. If the business grows, the database should still feel like a stable base, not the next reason to re-architect.&lt;/p&gt;

&lt;p&gt;But that is far in the future. The current reality is simpler: I am still building toward it.&lt;/p&gt;

&lt;p&gt;Right now, my main concern is not polish. It is not making NodeDB look finished before it is finished. It is the foundation.&lt;/p&gt;

&lt;p&gt;I want to build enough core capability early, and build it deeply enough, that I do not spend the next few years patching around missing pieces.&lt;/p&gt;




&lt;p&gt;Many databases grow by accumulation. A feature becomes important, so it gets added. Another workload appears, so another layer gets introduced. Then another extension, another plugin, another wrapper, another sidecar. Over time, the system may cover more ground, but it does not always become more coherent.&lt;/p&gt;

&lt;p&gt;From the user side, that has a cost. Query behavior becomes uneven. Operational expectations stop being consistent. One feature feels mature, another feels awkward, another works only if you accept a few strange rules. At that point, you are not really using one clean system anymore. You are managing the boundaries between several pieces that happen to live near each other.&lt;/p&gt;

&lt;p&gt;That is one of the reasons &lt;strong&gt;PostgreSQL&lt;/strong&gt; started feeling heavy for me across multiple projects.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;PostgreSQL&lt;/strong&gt; is good. Its ecosystem is strong. I am not arguing otherwise. But extensions do not magically become one deeply integrated system just because they run around the same database core. In practice, the burden shifts to the user. You are the one stitching capabilities together, working around different limitations, and dealing with the gaps between them.&lt;/p&gt;

&lt;p&gt;I have seen a similar pattern in databases that try to unify more from the start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SurrealDB&lt;/strong&gt; has a vision I understand. But my concern is the same: I do not want a database to keep piling things on top if the foundation was not designed to carry them well. Systems should evolve, of course. That is normal. But there is still a difference between growing a system and collecting features.&lt;/p&gt;

&lt;p&gt;That difference shows up in the user experience very quickly. Some capabilities exist, but they still feel second-class. The ergonomics are weaker. The query model is thinner. Performance is less predictable. Operations feel awkward. The feature works in a demo, but once it becomes central to a real workload, you start seeing the limits.&lt;/p&gt;

&lt;p&gt;That is exactly what I want to avoid with NodeDB.&lt;/p&gt;




&lt;p&gt;I want &lt;strong&gt;NodeDB&lt;/strong&gt; to reduce re-architecture later instead of causing it. I do not want to reach the next stage of a product and realize that an important capability was treated as an afterthought, so now the stack has to be rearranged. I do not want core requirements to arrive later and collide with a design that was never meant to support them properly.&lt;/p&gt;

&lt;p&gt;That is why I care so much about feature depth early.&lt;/p&gt;

&lt;p&gt;Not because users need everything on day one. And not because I think I can build everything perfectly from the start. I cannot.&lt;/p&gt;

&lt;p&gt;What I do believe is this: if an important capability is likely to matter sooner or later, I would rather think hard about how it belongs in the system early.&lt;/p&gt;




&lt;p&gt;I am not interested in a product page that lists many features. I care about whether the database actually behaves like one cohesive system. I care about whether the features feel like they belong together. I care about whether it stays usable across different scenarios without pushing the user into constant redesign or workarounds.&lt;/p&gt;

&lt;p&gt;If a database claims to do everything, but half the capabilities feel weak, awkward, or fragile, that is not real completeness. I would rather build something deeper, even if it takes longer, than something wider and shallower.&lt;/p&gt;

&lt;p&gt;So the database needs to be dependable. It has to hold up when requirements expand. It has to help the user avoid unnecessary stack changes later.&lt;/p&gt;




&lt;p&gt;Maybe this approach is wrong in some places. It is still &lt;em&gt;my opinion&lt;/em&gt;, my bias, and my way of thinking through the problem.&lt;/p&gt;

&lt;p&gt;But if &lt;strong&gt;NodeDB&lt;/strong&gt; works, I want it to work in a way that still makes sense years later, not just in the first exciting demo.&lt;/p&gt;

&lt;p&gt;In the next post, I will go deeper into the design direction behind that idea and why so many multi-model databases still feel wrong to me.&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/NodeDB-Lab/nodedb" rel="noopener noreferrer"&gt;NodeDB&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nodedb</category>
      <category>database</category>
    </item>
    <item>
      <title>Why I'm Building NodeDB</title>
      <dc:creator>Farhan Syah</dc:creator>
      <pubDate>Thu, 02 Apr 2026 21:26:16 +0000</pubDate>
      <link>https://forem.com/nodedb/why-im-building-nodedb-4ml</link>
      <guid>https://forem.com/nodedb/why-im-building-nodedb-4ml</guid>
      <description>&lt;p&gt;For the last few years, &lt;strong&gt;PostgreSQL&lt;/strong&gt; has been my default database.&lt;/p&gt;

&lt;p&gt;Before that, I worked with &lt;strong&gt;MySQL&lt;/strong&gt;, &lt;strong&gt;MariaDB&lt;/strong&gt;, and &lt;strong&gt;MongoDB&lt;/strong&gt;. But once I spent enough time with &lt;strong&gt;PostgreSQL&lt;/strong&gt;, it became very hard to justify anything else for most projects. It gave me the relational model I wanted, plus JSON support that was good enough to remove a lot of my reasons for using MongoDB. When I needed spatial support, I could add PostGIS. When I needed time series and partitioning, I could use TimescaleDB. For a long time, that worked very well.&lt;/p&gt;

&lt;h3&gt;
  
  
  Then the workload started changing.
&lt;/h3&gt;

&lt;p&gt;Over the last two years, AI and ML stopped being side concerns and started becoming part of real application requirements. That meant vector search became relevant. PostgreSQL still looked like the right answer because &lt;code&gt;pgvector&lt;/code&gt; existed and, at first, it was good enough. But once I started using it across more serious workloads, I kept running into the same friction: scaling and performance concerns, filtering limitations, and dimension and storage constraints that mattered for my use cases at the time.&lt;/p&gt;

&lt;h4&gt;
  
  
  And vector was only one part of the problem.
&lt;/h4&gt;

&lt;p&gt;Then came graph needs. At that point, the pattern became very familiar. I could keep stretching PostgreSQL. I could handle graph logic manually at the application level. I could try more extensions. I could wire more tools together. And yes, any one of those decisions can be justified if you are working on one project and you are willing to absorb the complexity.&lt;/p&gt;

&lt;h4&gt;
  
  
  But I am not working on one project.
&lt;/h4&gt;

&lt;p&gt;I work on multiple projects every year, often with different requirements. That changes the economics completely. What looks reasonable in isolation turns into repeated operational and mental overhead when you keep doing it again and again. A couple of extensions are fine. Then you need another one. Then another workaround. Then another set of limitations, quirks, and edge cases to remember. Then offline-first and sync requirements enter the picture, and now you are adding even more surrounding tools just to make the whole thing usable.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;That was the real breaking point for me.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The problem was not that PostgreSQL stopped being good. The problem was that PostgreSQL plus extensions plus surrounding infrastructure started becoming a stack I had to keep rebuilding across projects. It worked, but the repetition was exhausting.&lt;/p&gt;




&lt;h3&gt;
  
  
  I started looking around.
&lt;/h3&gt;

&lt;p&gt;Like many people in this space, I first looked at what already existed. If someone had already built the thing I wanted, I would rather use it than build a database from scratch.&lt;/p&gt;

&lt;p&gt;I found &lt;strong&gt;SurrealDB&lt;/strong&gt;. I liked the vision. I still think the direction is compelling: fewer hops, better developer experience, a more unified model. But when I looked deeper, especially at the implementation and tradeoffs, I was not convinced. From my perspective, it felt more like a patchwork than a database designed deeply from the ground up. Even in graph support, I did not find the level of capability I expected. The idea was attractive. The execution did not give me enough confidence.&lt;/p&gt;

&lt;p&gt;Then I looked at &lt;strong&gt;ArcadeDB&lt;/strong&gt;. In many ways, I thought it was stronger. Better coding quality, better performance characteristics, more substance. But it is JVM-based, and I wanted something smaller, tighter, and better suited to the kinds of embedded, mobile, offline-first, and mixed deployment scenarios I care about.&lt;/p&gt;

&lt;p&gt;At that point, my realistic options looked like this:&lt;/p&gt;

&lt;p&gt;Stick with &lt;strong&gt;PostgreSQL&lt;/strong&gt; and keep stacking extensions. Work around another database that did not fully fit. Or accept a polyglot architecture and keep paying the integration cost.&lt;/p&gt;

&lt;p&gt;None of those felt right to me.&lt;/p&gt;

&lt;p&gt;So I chose a &lt;strong&gt;fourth&lt;/strong&gt; option: &lt;strong&gt;build my own database&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  That is how NodeDB started in 2025.
&lt;/h3&gt;

&lt;p&gt;It started as a side project, and honestly, I did not have high expectations. If it worked, it worked. If it failed, it failed. That attitude was useful because this is not the kind of project you begin with false confidence.&lt;/p&gt;

&lt;h4&gt;
  
  
  I have already scrapped the project twice.
&lt;/h4&gt;

&lt;p&gt;This current version is the third serious attempt, and I only started building it earlier this year. The first two failures were important. They forced me to understand what I was doing wrong, what I was hand-waving, and what needed to be designed properly from the beginning instead of patched later. I do not think I would have reached this version without those failures.&lt;/p&gt;

&lt;p&gt;One thing I should mention briefly: I use AI heavily in the implementation.&lt;/p&gt;

&lt;p&gt;The code is mostly written by AI, not by me typing everything manually. That is simply the practical reality: at raw throughput, it writes faster, and often better, than I do. But I am still the one directing, reviewing, rejecting, and understanding it. That part matters to me. If I am going to build a database seriously and support it in the future, I need to understand it all the way down.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;NodeDB&lt;/strong&gt; exists because I wanted something I could actually use across real projects without rebuilding the same database stack every time.&lt;/p&gt;

&lt;p&gt;I built it first to solve my own use cases, because that part is non-negotiable. If it does not solve my real problems, there is no point. But I also do not want to build a shallow personal tool that only works for me. I want to go deeper than that. I want something that can support broader use cases properly, with serious performance, serious design, and serious technical depth.&lt;/p&gt;

&lt;p&gt;Right now, &lt;strong&gt;NodeDB&lt;/strong&gt; is working for my use cases, but it is still evolving.&lt;/p&gt;

&lt;p&gt;I have already tested it in pilot projects, and for the kinds of problems I built it to solve, it is starting to prove itself. That does not mean the journey is done. Far from it. A database only becomes real when the design holds under pressure, when the tradeoffs are honest, and when the implementation can stand up over time.&lt;/p&gt;

&lt;p&gt;That is the challenge I have chosen.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Will I make it? Time will tell.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But this is the journey I am on, and I am going to share it openly: the design decisions, the mistakes, the database ideas, the tradeoffs, and the lessons I learn along the way.&lt;/p&gt;




&lt;p&gt;If you care about database engineering, multi-model systems, offline-first architecture, or the hard tradeoffs behind building a database from scratch, follow this journey.&lt;/p&gt;

&lt;p&gt;I will be sharing what works, what fails, what I have to redesign, and what I learn from trying to make &lt;strong&gt;NodeDB&lt;/strong&gt; real.&lt;/p&gt;

&lt;p&gt;If that sounds interesting, follow me here on dev.to and keep an eye on the next posts. I am just getting started.&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/NodeDB-Lab/nodedb" rel="noopener noreferrer"&gt;NodeDB&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nodedb</category>
      <category>postgres</category>
      <category>database</category>
    </item>
    <item>
      <title>numr 0.5.0: The Rust numerical computing library that doesn't make you choose</title>
      <dc:creator>Farhan Syah</dc:creator>
      <pubDate>Sat, 14 Mar 2026 20:15:39 +0000</pubDate>
      <link>https://forem.com/farhansyah/numr-050-the-rust-numerical-computing-library-that-doesnt-make-you-choose-cpp</link>
      <guid>https://forem.com/farhansyah/numr-050-the-rust-numerical-computing-library-that-doesnt-make-you-choose-cpp</guid>
      <description>&lt;p&gt;Last year, I started building &lt;strong&gt;numr&lt;/strong&gt; because I was frustrated.&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/ml-rust" rel="noopener noreferrer"&gt;
        ml-rust
      &lt;/a&gt; / &lt;a href="https://github.com/ml-rust/numr" rel="noopener noreferrer"&gt;
        numr
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      A high-performance numerical computing library for Rust with GPU acceleration, inspired by NumPy
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;numr&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Foundational numerical computing for Rust&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;numr&lt;/code&gt; provides n-dimensional tensors, linear algebra, FFT, statistics, and automatic differentiation—with native GPU acceleration across CPU, CUDA, and WebGPU backends.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;numr&lt;/code&gt; is like NumPy in Rust, with gradients, GPUs, and modern dtypes built in from day one.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What numr Is&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;A &lt;strong&gt;foundation library&lt;/strong&gt; - Mathematical building blocks for higher-level libraries and applications.&lt;/p&gt;
&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;numr IS&lt;/th&gt;
&lt;th&gt;numr is NOT&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tensor library (like NumPy's ndarray)&lt;/td&gt;
&lt;td&gt;A deep learning framework&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linear algebra (decompositions, solvers)&lt;/td&gt;
&lt;td&gt;A high-level ML API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FFT, statistics, random distributions&lt;/td&gt;
&lt;td&gt;Domain-specific&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Native GPU (CUDA + WebGPU) + autograd&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;For SciPy-equivalent functionality&lt;/strong&gt; (optimization, ODE, interpolation, signal), see &lt;a href="https://github.com/ml-rust/solvr" rel="noopener noreferrer"&gt;&lt;strong&gt;solvr&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Why numr?&lt;/h2&gt;

&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;vs NumPy&lt;/h3&gt;

&lt;/div&gt;
&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;NumPy&lt;/th&gt;
&lt;th&gt;numr&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;N-dimensional tensors&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linear algebra, FFT, stats&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Automatic differentiation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗ Need JAX/PyTorch&lt;/td&gt;
&lt;td&gt;✓ Built-in &lt;code&gt;numr::autograd&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPU acceleration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗ Need CuPy/JAX&lt;/td&gt;
&lt;td&gt;✓ Native CUDA + WebGPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Non-NVIDIA GPUs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗ None&lt;/td&gt;
&lt;td&gt;✓ AMD, Intel, Apple via WebGPU&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;…&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/ml-rust/numr" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;




&lt;p&gt;I wanted to do numerical computing in Rust — tensors, linear algebra, FFT, gradients — on GPUs. Not just NVIDIA GPUs. Any GPU. And I didn't want to glue together five incompatible crates to do it.&lt;/p&gt;

&lt;p&gt;Python didn't plan for this either. NumPy emerged organically, and it took years of bolting on CuPy, JAX, and PyTorch before Python had GPU compute and autograd — scattered across incompatible libraries.&lt;/p&gt;

&lt;p&gt;Some people say fragmentation is fine. Separate crates for separate concerns — that's the Unix philosophy. And I'd agree, if they shared conventions, types, and backends. But they don't. ndarray gives you tensors but no GPU. nalgebra gives you linear algebra but no autograd. rustfft gives you FFT but nothing else. Different types, different idioms, none of them compose.&lt;/p&gt;

&lt;p&gt;So the burden falls on you — the application developer. You're the one writing adapter layers between crates. You're the one figuring out why this tensor type doesn't work with that decomposition. And when you need GPU support or a missing operation? You're filing issues and PRs upstream, waiting for maintainers, before you can get back to building your actual application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;numr&lt;/strong&gt; takes that burden off you. One library, one tensor type, one API — tensors, linalg, FFT, statistics, autograd, GPU. &lt;strong&gt;numr&lt;/strong&gt; handles the hard parts so you can focus on building your application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One library, one API, every backend.&lt;/strong&gt; Write your code once. Run it on CPU with AVX-512. Run it on NVIDIA with native CUDA kernels. Run it on AMD, Intel, or Apple silicon through WebGPU. Same code. Same results.&lt;/p&gt;

&lt;p&gt;Today, &lt;strong&gt;numr 0.5.0&lt;/strong&gt; ships. And it's the release where it stopped being a "promising project" and became something you can actually build on.&lt;/p&gt;




&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Fused kernels — because memory bandwidth is the real bottleneck
&lt;/h3&gt;

&lt;p&gt;The single biggest performance win in GPU computing isn't faster math. It's reading memory fewer times.&lt;/p&gt;

&lt;p&gt;A naive softmax reads your tensor five times: max, subtract, exp, sum, divide. A fused softmax reads it once. For large tensors, that's not a 5x difference (the math is cheap), but it's easily 2-3x.&lt;/p&gt;
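&lt;p&gt;To make the pass-counting concrete, here is a plain-Python sketch of the naive five-pass softmax. This is an illustration of the memory-traffic argument only, not numr code:&lt;/p&gt;

```python
import math

def softmax_naive(xs):
    # Pass 1: scan for the max (numerical stability).
    m = max(xs)
    # Passes 2-3: subtract the max and exponentiate (another full read).
    exps = [math.exp(x - m) for x in xs]
    # Pass 4: scan for the sum.
    s = sum(exps)
    # Pass 5: divide every element (final read and write).
    return [e / s for e in exps]
```

&lt;p&gt;A fused kernel performs the same arithmetic while the data is still in registers, so the tensor is streamed from memory once instead of five times.&lt;/p&gt;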

&lt;p&gt;&lt;strong&gt;&lt;a href="https://crates.io/crates/numr/0.5.0" rel="noopener noreferrer"&gt;0.5.0&lt;/a&gt;&lt;/strong&gt; adds fused kernels for the operations that matter most:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GEMM epilogue&lt;/strong&gt;: matmul + bias + activation in one kernel launch. This is the inner loop of every neural network. Forward and backward.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activation-mul&lt;/strong&gt;: for gated architectures like SwiGLU that power modern LLMs. One read instead of three.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add-norm&lt;/strong&gt;: residual connection + normalization fused together. The other operation you hit in every single transformer layer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these work on CPU, CUDA, and WebGPU. All of them have backward passes for autograd.&lt;/p&gt;
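&lt;p&gt;The epilogue idea itself is simple to sketch. In this illustrative Python loop (not the actual kernel), bias and a ReLU activation are applied while each output element is still hot, so the result matrix is written exactly once instead of three times:&lt;/p&gt;

```python
def gemm_bias_relu(a, b, bias):
    # Fused GEMM epilogue sketch: matmul, then bias add and activation
    # applied to the accumulator before the output is ever stored.
    m = len(a)
    k = len(b)
    n = len(b[0])
    out = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            acc += bias[j]                # epilogue step 1: bias
            row.append(max(acc, 0.0))     # epilogue step 2: ReLU
        out.append(row)
    return out
```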




&lt;h3&gt;
  
  
  FP8 and quantized compute — because not everything needs 32 bits
&lt;/h3&gt;

&lt;p&gt;FP8 isn't just "smaller numbers." It's the difference between fitting a model in VRAM or not. Between one GPU and two.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;numr&lt;/strong&gt; now does FP8 matrix multiplication natively — E4M3 and E5M2 formats, across all backends. No external libraries. No NVIDIA-only restrictions.&lt;/p&gt;

&lt;p&gt;We also added i8×i8→i32 quantized matmul on CPU. This is what powers efficient quantized inference when you don't have a GPU.&lt;/p&gt;
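&lt;p&gt;As a rough sketch of what that means: multiply narrow 8-bit integers, accumulate in a wide integer. Illustrative Python only, not the CPU kernel:&lt;/p&gt;

```python
def qmatmul_i8(a, b):
    # i8 x i8 products accumulated in a wide accumulator (i32 in the real
    # kernel). Python ints are unbounded, so the widths are only notional.
    m, k, n = len(a), len(b), len(b[0])
    out = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0
            for p in range(k):
                acc += a[i][p] * b[p][j]  # each product fits in 16 bits
            out[i][j] = acc               # the running sum needs 32 bits
    return out
```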




&lt;h3&gt;
  
  
  2:4 structured sparsity — because half your weights are probably zero
&lt;/h3&gt;

&lt;p&gt;NVIDIA's Ampere architecture introduced hardware support for 2:4 sparsity: for every group of 4 weights, exactly 2 are zero. The hardware skips them, doubling throughput for free.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;numr 0.5.0&lt;/strong&gt; supports 2:4 structured sparsity across all backends. On CUDA, it hits the hardware fast path. On CPU and WebGPU, it uses optimized sparse kernels.&lt;/p&gt;
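&lt;p&gt;The 2:4 pattern is easy to state in code. Here is a hypothetical pruning helper (not part of numr's API) that enforces the structure on a flat weight list by keeping the two largest-magnitude values in each group of four:&lt;/p&gt;

```python
def prune_2_4(weights):
    # For every contiguous group of 4 weights, keep the 2 with the
    # largest magnitude and zero the rest: the 2:4 structured pattern
    # that sparse tensor hardware can skip over.
    out = list(weights)
    for i in range(0, len(out), 4):
        idx = sorted(range(i, min(i + 4, len(out))), key=lambda j: abs(out[j]))
        for j in idx[:-2]:   # everything except the two largest
            out[j] = 0.0
    return out
```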




&lt;h3&gt;
  
  
  Autograd that actually covers what you need
&lt;/h3&gt;

&lt;p&gt;Previous releases had autograd for basic operations. &lt;strong&gt;0.5.0&lt;/strong&gt; makes it comprehensive:&lt;/p&gt;

&lt;p&gt;conv1d, conv2d, softmax, rms_norm, layer_norm, SiLU, softplus, SwiGLU, dropout, the fused GEMM epilogue, fused add-norm, dtype cast, narrow, cat, gather — all differentiable, all with correct backward passes, all supporting second-order derivatives.&lt;/p&gt;
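&lt;p&gt;For readers who have not built reverse-mode autograd, here is a toy scalar version (deliberately not numr's API) showing mechanically what a "correct backward pass" means: each operation records its local derivatives, and gradients are accumulated back along every path:&lt;/p&gt;

```python
class Var:
    # Minimal reverse-mode autograd sketch. Each Var remembers its
    # parents and the local derivative with respect to each parent.
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent, local_gradient)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self, seed=1.0):
        # Accumulate the incoming gradient, then push it to the parents.
        # (Path enumeration is correct but slow; real engines use a
        # topological order over the graph instead.)
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)
```

&lt;p&gt;The ops listed above are the same idea at tensor scale: each one needs a hand-checked rule for pushing gradients to its inputs.&lt;/p&gt;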

&lt;p&gt;Activation checkpointing lets you trade compute for memory. Backward hooks let you trigger distributed gradient sync during backprop.&lt;/p&gt;

&lt;p&gt;This isn't an ML framework. It's the autograd engine that ML frameworks build on.&lt;/p&gt;




&lt;h3&gt;
  
  
  A CUDA backend that acts like it belongs there
&lt;/h3&gt;

&lt;p&gt;The CUDA story got serious in &lt;strong&gt;0.5.0&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Caching allocator.&lt;/strong&gt; CUDA memory allocation is expensive. The old approach (stream-ordered allocation) worked but left performance on the table. The new Rust-side caching allocator reuses memory blocks, cutting allocation overhead dramatically.&lt;/p&gt;
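&lt;p&gt;The caching idea fits in a few lines. This is an illustrative size-bucketed sketch, not the actual allocator: freed blocks go into per-size free lists and are handed back out instead of calling the expensive backend allocator again:&lt;/p&gt;

```python
class CachingAllocator:
    # Sketch of a size-bucketed caching allocator. "backend_alloc"
    # stands in for the expensive underlying call (cudaMalloc in a
    # real CUDA backend).
    def __init__(self, backend_alloc):
        self.backend_alloc = backend_alloc
        self.free_lists = {}     # maps size to a list of cached blocks
        self.backend_calls = 0

    def alloc(self, size):
        cached = self.free_lists.get(size)
        if cached:
            return cached.pop()  # reuse a cached block: no backend call
        self.backend_calls += 1
        return self.backend_alloc(size)

    def free(self, size, block):
        # Keep the block for reuse instead of returning it to the backend.
        self.free_lists.setdefault(size, []).append(block)
```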

&lt;p&gt;&lt;strong&gt;Graph capture.&lt;/strong&gt; Record a sequence of kernel launches once, replay it with zero overhead. Essential for inference serving where you run the same computation thousands of times.&lt;/p&gt;
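&lt;p&gt;Conceptually, graph capture is record-and-replay. A toy sketch of the control flow (not the CUDA graph API):&lt;/p&gt;

```python
class RecordedGraph:
    # Record a sequence of "kernel launches" once; replay them without
    # re-doing the per-launch dispatch work.
    def __init__(self):
        self.ops = []

    def capture(self, fn, *args):
        # Capture phase: remember the call instead of optimizing it live.
        self.ops.append((fn, args))

    def replay(self):
        # Replay phase: run the recorded launches back to back.
        return [fn(*args) for fn, args in self.ops]
```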

&lt;p&gt;&lt;strong&gt;GEMV fast paths.&lt;/strong&gt; When one matrix dimension is small (which happens constantly during inference — batch size 1), you don't want full tiled GEMM. Specialized GEMV kernels for transposed weight matrices avoid unnecessary work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipelined D2H copy.&lt;/strong&gt; Overlap GPU computation with data transfer back to the host. The GPU doesn't wait for the CPU, the CPU doesn't wait for the GPU.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why 0.5.0 matters
&lt;/h2&gt;

&lt;p&gt;This is where numr crosses the threshold from "interesting foundation" to "you can build real things on this." And we know because we did.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;0.5.0&lt;/strong&gt; has been validated against real downstream consumers. &lt;a href="https://github.com/ml-rust/solvr" rel="noopener noreferrer"&gt;solvr&lt;/a&gt; — a scientific computing library with optimization, ODE solvers, and interpolation — builds and runs on &lt;strong&gt;numr 0.5.0&lt;/strong&gt;. &lt;a href="https://github.com/ml-rust/boostr" rel="noopener noreferrer"&gt;boostr&lt;/a&gt; — an ML framework with attention, MoE, and Mamba blocks — builds and runs on it too. LLM inference and embedding generation work end-to-end.&lt;/p&gt;

&lt;p&gt;This isn't a library that passes unit tests in isolation. It's a library that other libraries are built on, and those libraries work.&lt;/p&gt;

&lt;p&gt;The fused kernels mean you're not leaving performance on the table. The autograd coverage means you can differentiate through realistic computation graphs. The CUDA infrastructure means GPU workloads actually perform. And all of it works the same across CPU, CUDA, and WebGPU.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://crates.io/crates/numr/0.5.0" rel="noopener noreferrer"&gt;0.5.0&lt;/a&gt;&lt;/strong&gt; unblocks new releases of &lt;a href="https://github.com/ml-rust/solvr" rel="noopener noreferrer"&gt;solvr&lt;/a&gt; (scientific computing — optimization, ODE solvers, interpolation) and &lt;a href="https://github.com/ml-rust/boostr" rel="noopener noreferrer"&gt;boostr&lt;/a&gt; (ML framework) which both build on numr.&lt;/p&gt;

&lt;p&gt;For numr itself, &lt;strong&gt;0.6.0&lt;/strong&gt; focuses on hardening: cleaning up error handling, API stability audit, and preparing for an eventual &lt;strong&gt;1.0&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;ROCm (native AMD GPU) is on the roadmap for 0.7.0+.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[dependencies]&lt;/span&gt;
&lt;span class="py"&gt;numr&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.5.0"&lt;/span&gt;

&lt;span class="c"&gt;# With GPU support&lt;/span&gt;
&lt;span class="py"&gt;numr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.5.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"cuda"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="py"&gt;numr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.5.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"wgpu"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/ml-rust/numr" rel="noopener noreferrer"&gt;github.com/ml-rust/numr&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Crates.io: &lt;a href="https://crates.io/crates/numr" rel="noopener noreferrer"&gt;crates.io/crates/numr&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;numr is Apache-2.0 licensed. Contributions welcome.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>machinelearning</category>
      <category>programming</category>
      <category>opensource</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>Farhan Syah</dc:creator>
      <pubDate>Wed, 11 Mar 2026 20:32:38 +0000</pubDate>
      <link>https://forem.com/farhansyah/-22eo</link>
      <guid>https://forem.com/farhansyah/-22eo</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/farhansyah" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1900686%2Fea8f1dd1-a04d-4806-83b2-45ce96c62aa2.jpeg" alt="farhansyah"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/farhansyah/why-i-built-rdx-bringing-modern-docs-as-code-to-the-rust-ecosystem-49f1" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Why I Built RDX: Bringing Modern "Docs-as-Code" to the Rust Ecosystem&lt;/h2&gt;
      &lt;h3&gt;Farhan Syah ・ Mar 11&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#ai&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#webdev&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#programming&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#rust&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>rust</category>
    </item>
    <item>
      <title>Why I Built RDX: Bringing Modern "Docs-as-Code" to the Rust Ecosystem</title>
      <dc:creator>Farhan Syah</dc:creator>
      <pubDate>Wed, 11 Mar 2026 20:22:11 +0000</pubDate>
      <link>https://forem.com/farhansyah/why-i-built-rdx-bringing-modern-docs-as-code-to-the-rust-ecosystem-49f1</link>
      <guid>https://forem.com/farhansyah/why-i-built-rdx-bringing-modern-docs-as-code-to-the-rust-ecosystem-49f1</guid>
      <description>&lt;p&gt;For more than 10 years, I lived and breathed the Node.js ecosystem. I built applications using Node, Bun, and especially Svelte. I loved it. I still do—I’ve never been of the opinion that JavaScript or Node.js is "bad." Tools like Astro, MDX, and SvelteKit are genuinely phenomenal.&lt;/p&gt;

&lt;p&gt;But a while ago, my work shifted. I needed more control at a lower level, which led me to Rust. I’ve been using Rust full-time for a while now, and honestly? I don’t plan on going back.&lt;/p&gt;

&lt;p&gt;However, moving to a new ecosystem always exposes what’s missing. In Rust, one of the most glaring holes is &lt;strong&gt;public-facing documentation tooling&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Don't get me wrong: &lt;code&gt;rustdoc&lt;/code&gt; and &lt;a href="http://docs.rs/" rel="noopener noreferrer"&gt;docs.rs&lt;/a&gt; are incredible. They are arguably the cleanest, best ways to document source code in the industry. But API documentation isn't product documentation. When you need to build public-facing docs—with rich tutorials, interactive API playgrounds, custom callouts, and interactive tabs—Rust falls short.&lt;/p&gt;

&lt;p&gt;You usually end up using &lt;a href="https://github.com/rust-lang/mdBook" rel="noopener noreferrer"&gt;mdBook&lt;/a&gt;, which is great but visually basic. If you want a modern, interactive documentation site that rivals Stripe or Vercel, you are forced to leave Rust and go back to Python (MkDocs) or the Node.js ecosystem (Docusaurus, Mintlify, Nextra).&lt;/p&gt;

&lt;p&gt;I wanted to keep my stack 100% Rust. I didn't want to maintain a &lt;code&gt;package.json&lt;/code&gt; just to write my documentation.&lt;/p&gt;

&lt;p&gt;I decided to build my own Static Site Generator (SSG) in Rust that runs on WebAssembly (WASM) to fully utilize Rust in the browser. But right out of the gate, I hit a massive blocker: the format.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The MDX and Markdoc Dilemma&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Standard Markdown (.md) is too limited. You can't build rich, interactive UI components with it.&lt;/p&gt;

&lt;p&gt;The industry standard is &lt;strong&gt;MDX&lt;/strong&gt;. But MDX is tightly coupled to JavaScript. It is inherently &lt;em&gt;imperative&lt;/em&gt;—it executes code. Trying to force a Rust backend to safely parse, execute, and render React-based MDX is a nightmare.&lt;/p&gt;

&lt;p&gt;Then there is &lt;strong&gt;Markdoc&lt;/strong&gt; (by Stripe). Markdoc gets the philosophy exactly right: documents shouldn't execute code; they should be &lt;em&gt;declarative data&lt;/em&gt;. But Markdoc is written entirely in TypeScript/JavaScript. Writing a Rust wrapper around a JS library, or trying to port a massive, moving TS codebase to Rust, felt counter-productive.&lt;/p&gt;

&lt;p&gt;I needed a native, high-performance implementation written in Rust.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Introducing RDX: Reactive Document eXpressions&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;I realized that before I could build the generator, I needed the language. So, I designed &lt;strong&gt;&lt;a href="https://github.com/rdx-lang/rdx" rel="noopener noreferrer"&gt;RDX&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;RDX has everything standard Markdown has, but it supports strict, declarative component rendering. It uses the familiar HTML/JSX-like syntax (&amp;lt;Notice type="warning"&amp;gt;) that authors are used to, but it fundamentally treats documents as pure data. No import statements, no JavaScript execution. Just a clean, strictly typed Abstract Syntax Tree (AST).&lt;/p&gt;
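
&lt;p&gt;To make that concrete, here is a minimal sketch of what an RDX document might look like. The &lt;code&gt;Notice&lt;/code&gt; component and its props are illustrative only; the authoritative syntax lives in the spec:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Getting Started

Install the CLI, then initialize a project.

&amp;lt;Notice type="warning"&amp;gt;
  This API is still unstable and may change between releases.
&amp;lt;/Notice&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Everything above the component tag is plain Markdown; the tag itself is parsed into typed data in the AST, never executed.&lt;/p&gt;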

&lt;p&gt;I didn't just want to build a Rust crate, though. I started by writing a &lt;strong&gt;proper, formal specification&lt;/strong&gt;. I did this so that while I was building the official Rust implementation, anyone else could read the spec and build an RDX parser in Go, Python, or Zig tomorrow.&lt;/p&gt;

&lt;p&gt;After finalizing the &lt;a href="https://github.com/rdx-lang/rdx/blob/main/SPECIFICATION.md" rel="noopener noreferrer"&gt;spec&lt;/a&gt;, I built the official tools. Today, I'm thrilled to release them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;rdx-parser&lt;/code&gt;: The core parsing engine, built on top of &lt;code&gt;pulldown-cmark&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;rdx-ast&lt;/code&gt;: The strictly typed data structures.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;rdx-schema&lt;/code&gt;: A validation engine that guarantees your authors don't use nonexistent props or components.&lt;/li&gt;
&lt;li&gt;A CLI tool to help people convert their existing MDX files to RDX, verify schemas, and more.&lt;/li&gt;
&lt;/ul&gt;
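
&lt;p&gt;As a rough sketch of how these crates fit together, the pipeline is: parse the source into an AST, then validate that AST against a schema before handing it to a renderer. The function and type names below are hypothetical, not the actual crate APIs:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Hypothetical usage sketch; see the rdx-parser docs for the real API.
let source = std::fs::read_to_string("getting-started.rdx")?;

// 1. Parse: Markdown + component tags -&amp;gt; strictly typed AST (rdx-ast).
let ast = rdx_parser::parse(&amp;amp;source)?;

// 2. Validate: reject unknown components or props (rdx-schema).
let schema = rdx_schema::Schema::from_file("components.schema.json")?;
schema.validate(&amp;amp;ast)?;

// 3. The validated AST is pure data, ready for any renderer.
&lt;/code&gt;&lt;/pre&gt;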

&lt;p&gt;Everything is open source and can be viewed here:&lt;br&gt;


&lt;/p&gt;
&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
        &lt;div class="c-embed__cover"&gt;
          &lt;a href="https://github.com/rdx-lang" class="c-link align-middle" rel="noopener noreferrer"&gt;
            &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F266820058%3Fs%3D280%26v%3D4" height="auto" class="m-0"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="c-embed__body"&gt;
        &lt;h2 class="fs-xl lh-tight"&gt;
          &lt;a href="https://github.com/rdx-lang" rel="noopener noreferrer" class="c-link"&gt;
            rdx-lang · GitHub
          &lt;/a&gt;
        &lt;/h2&gt;
          &lt;p class="truncate-at-3"&gt;
            rdx-lang has 4 repositories available. Follow their code on GitHub.
          &lt;/p&gt;
        &lt;div class="color-secondary fs-s flex items-center"&gt;
            &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.githubassets.com%2Ffavicons%2Ffavicon.svg"&gt;
          github.com
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;




&lt;p&gt;You can start writing RDX today. In fact, I've already built and published a VS Code/VSCodium extension for syntax highlighting to make authoring a breeze.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The Missing Piece: Rendering (And a Sneak Peek)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Right now, we have the parser, the AST, and the editor support. The only thing missing is the rendering software to turn these .rdx files into a beautiful website.&lt;/p&gt;

&lt;p&gt;Don't worry, I'm building that right now.&lt;/p&gt;

&lt;p&gt;I am currently developing a next-generation SSG. It will consume your RDX files and generate a documentation site that rivals Docusaurus, Mintlify, and MkDocs. The best part? It uses Rust and WASM to deliver fast build times and interactive components without ever touching npm.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Built for the AI Era&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;There is one final reason I believe RDX is the future of "Docs as Code."&lt;/p&gt;

&lt;p&gt;RDX is incredibly AI-friendly. If you ask an LLM to write MDX, it frequently hallucinates JavaScript imports or breaks the build with syntax errors. If you ask an LLM to write Markdoc, it struggles with the custom Liquid-style tags.&lt;/p&gt;

&lt;p&gt;But LLMs &lt;em&gt;excel&lt;/em&gt; at writing standard HTML tags with typed attributes. Because RDX isolates components as pure data and pairs them with rdx-schema validation, you can autonomously generate documentation via AI and validate it instantly at build time. An RDX-powered AI documentation pipeline will beat an MDX or Markdoc pipeline in stability every single time.&lt;/p&gt;

&lt;p&gt;But it will only turn out that way if the design is executed well, and if the community gets behind it.&lt;/p&gt;

&lt;p&gt;I hope RDX can become the new standard for documentation. We finally have a way to write rich, interactive content without sacrificing the safety, speed, and tooling of the Rust ecosystem.&lt;/p&gt;

&lt;p&gt;Check out the &lt;a href="https://github.com/rdx-lang/rdx" rel="noopener noreferrer"&gt;repo&lt;/a&gt;, read the &lt;a href="https://github.com/rdx-lang/rdx/blob/main/SPECIFICATION.md" rel="noopener noreferrer"&gt;spec&lt;/a&gt;, and stay tuned. The renderer is coming next.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>rust</category>
    </item>
  </channel>
</rss>
