Forem: Rasmus Ros

From Closures to an AST in a Kotlin Transform Graph

Rasmus Ros — Mon, 18 May 2026 09:20:16 +0000

kumulant is a streaming statistics library: you feed it numbers, it maintains an accumulator like a mean or a quantile sketch, and you read snapshots back. Above the accumulator sits a graph of transforms and filters that preprocesses each value before it lands in the stat: filter out negatives, log-transform latencies, take a weighted dot product of a feature vector.

The first version of that graph was Kotlin lambdas all the way down. Pre-update transforms were (Double) -> Double, filters were (Double) -> Boolean, paired transforms were (Double, Double) -> Pair<Double, Double>. A schema would look something like this (the StatSchema / by stat pattern is from the previous post):

object LatencyMetrics : StatSchema() {
    val p99 by stat(
        DDSketch(probabilities = doubleArrayOf(0.99))
            .filter { it >= 0 }
            .transform { ln(it) }
    )
}

This is the path of least resistance in Kotlin and it works fine while the only caller is in-process Kotlin code. The lambdas are typed, the call site is short, and the closure captures whatever it needs from the enclosing scope.

The wire problem

kumulant's job inside the Eignex rewrite is to back a cloud-deployed monitoring layer. The expected caller is a service that wants to author its stat config as YAML and POST it over HTTP, not link kumulant as a Kotlin dependency. Once you've decided that's the deployment shape, every closure in the graph is a problem. A (Double) -> Double doesn't serialize. You can't write a transform in YAML if transform is a JVM lambda.

The naive fix is to ship a handful of named transforms (log, sqrt, negate) and let YAML reference them by string. That works until the first user needs log(x) minus log(y) or a piecewise expression, at which point you either keep adding named cases or invent a tiny expression language. Better to invent it up front.

The AST

The redesign turns every closure-shaped slot in the graph into a sealed AST:

@Serializable
sealed interface ScalarExpr {
    fun eval(x: Double, y: Double = 0.0, v: DoubleArray = EMPTY_VECTOR): Double
}

@Serializable @SerialName("X")     data object X : ScalarExpr { ... }
@Serializable @SerialName("Const") data class Const(val v: Double) : ScalarExpr { ... }
@Serializable @SerialName("Mul")   data class Mul(val l: ScalarExpr, val r: ScalarExpr) : ScalarExpr { ... }
@Serializable @SerialName("Log")   data class Log(val a: ScalarExpr) : ScalarExpr { ... }
@Serializable @SerialName("VFold") data class VFold(val op: VFoldOp) : ScalarExpr { ... }
// ... Add, Sub, Div, Neg, Abs, Exp, Sqrt, Pow, Min, Max, IfExpr, VDot, V(index)

Mirror the same shape for BoolExpr (Gt, Lt, And, Or, Not, InRange, etc.) and VectorExpr for the cases where the output is a vector of arbitrary length, not a scalar. Each node is @Serializable with a @SerialName discriminator, so kotlinx.serialization round-trips the whole tree polymorphically. The leaves X, Y, and V(i) are placeholders that get bound to the current input when eval runs.

The call site you'd want to keep, transform { ln(it) }, would now have to read transform(Log(X)). Doable, but losing the operator syntax is a real regression. Kotlin's operator overloading recovers it:

operator fun ScalarExpr.plus(rhs: ScalarExpr): ScalarExpr = Add(this, rhs)
operator fun ScalarExpr.times(rhs: Double): ScalarExpr = Mul(this, Const(rhs))
infix fun ScalarExpr.gt(rhs: Double): BoolExpr = Gt(this, Const(rhs))
// ... one per operator, three handfuls in total

With those in scope, the user-facing API looks almost identical to the lambda version. What's underneath is the difference:

val p99 by stat(
    DDSketch(probabilities = doubleArrayOf(0.99))
        .filter(X gt 0.0)        // BoolExpr: Gt(X, Const(0.0))
        .transform(Log(X))       // ScalarExpr: Log(X)
)

That same schema serializes to YAML as a tree the user can hand-edit or templated by a deploy pipeline.

What you lose, what you gain

The loss is real: you can't drop into arbitrary Kotlin in the body of a transform. If your transform isn't expressible as a composition of the AST node types you've defined, you have to add a node. There's no escape hatch to a raw lambda for the YAML path, because the whole point is that the YAML path doesn't have a JVM to run a closure on the other side.

In exchange:

The config serializes to YAML or JSON without thinking about it.
The AST is inspectable, so you can diff two versions of a schema and tell a user what changed before a redeploy.
The runtime cost is still a closure call per node, but you can compile the AST down to a single closure at materialize time and amortize the tree walk; kumulant does this in its spec layer.
Adding a new node type is one data class, one eval impl, one serial name. Adding a new built-in to a named-transforms registry would be the same amount of code with worse composability.

The lesson I took away: as soon as a config has to cross the wire, the wire format isn't a serialization concern bolted on the side of the typed API, it's the thing the typed API has to match. Starting with closures and trying to bolt YAML on later would have meant two sources of truth and a translation layer between them; starting from the AST and letting Kotlin's operator overloading recover the ergonomics gave both surfaces from one definition.

The expression node code above lives in Eignex/kumulant under schema/Expr.kt. The serialization plumbing (polymorphic discriminator, typed-key schemas) is the Eignex/skema library, covered in more detail in the previous post.

Three Stabs at a Typed Schema DSL in Kotlin

Rasmus Ros — Wed, 13 May 2026 05:20:38 +0000

Imagine a D&D character sheet. It has typed fields (Strength 1 to 18, Class is Fighter or Wizard or Rogue) and rules between them (Halflings can't be Paladins, Hit Points depend on Class and Constitution). The blank sheet is the schema; a filled-in character is one instance of it.

Schema vs. instance: the blank sheet on the back, one filled-in Human Fighter on top.

If you only have one schema, you can just write a CharacterSheet data class with the right fields plus some validation, and call it a day. This post is about the harder version: writing the library behind the sheet, where every user brings their own. Pathfinder, 5e, Call of Cthulhu, all different fields, all different rules, all driven by your code. The type system has to help, even though you don't know any of the user schemas in advance.

A few years ago I built combo (Constraint Oriented Multi-variate Bandit Optimization), an A/B-testing tool that picks variants subject to constraints between variables. I've been splitting the rewrite into two libraries: kumulant, a streaming aggregator with just the variables; and klause, an SMT solver with the variables and the rules between them. Both face the same design question: how does a user declare a typed schema, and how do call sites read variables back without dissolving into casts and string lookups?

I'll start with kumulant since it's the smaller half. The reason to lean hard on the types: the more the compiler catches (a misnamed read, a wrong-typed access, an illegal combination), the less the user has to remember.

1. Imperative registries

Every classical solver library starts the same way: a constructor for each kind of variable, references kept as locals, constraints as objects imposed onto the Problem. Choco in Java, Z3 via its Python and Java bindings, and combo's first version all look like this.

val problem = Problem()
val budget = IntVar(problem, "budget", 1000, 4000)
val color  = IntVar(problem, "color", 0, 2)  // 0=RED, 1=GREEN, 2=BLUE

// "if color=RED, then budget ≤ 2000"
problem.impose(IfThen(XeqC(color, 0), XlteqC(budget, 2000)))

The natural upgrade in Kotlin is to hide the problem receiver inside a problem { ... } block and put each kind of variable in a sealed Variable<V, T> hierarchy so it knows its own type. This is the type-safe builder pattern that powers most Kotlin DSLs (e.g. kotlinx.html): lambdas with receivers, infix functions, and operator overloading, enough to make the body look like the original mathematical notation. That's what combo does:

val p = problem {
    val budget = int("budget", min = 1000, max = 4000)
    val color  = nominal("color", RED, GREEN, BLUE)

    impose {
        color[RED] implies budget.atMost(2000)
    }
}

User-supplied relations reference typed value literals like color[RED] against variables in scope, not strings. Underneath, it's the same problem as the bare solver, though. To read a variable at a call site you either keep the typed reference around, threading it through every function that touches it, or fall back to by-name lookup that returns a Variable<*, *> you have to cast. With nested scopes the references fan out faster than you can keep clean, and the common fallback is a Map<String, Var> keyed by name, with reads as unchecked casts.

Compiles fine, crashes at runtime.

Building the Problem is also imperative, not declarative: the body runs in order, and there's no static structure to inspect or serialize without first executing the lambda.

2. Arity-indexed products

Pivoting to kumulant here: the same schema problem shows up for streaming statistics, where each "variable" is an accumulator like Mean or Sum and call sites need typed reads of its snapshot. Next attempt: give every variable a value that carries its type with it. A call site should be able to write snap.mean and get a typed value end-to-end, no cast.

Encode the schema as a product: a tuple where each position holds a variable, and the type carries the arity.

data class Stat2<A : Stat, B : Stat>(val first: A, val second: B)         // accumulator product
data class Result2<A : Result, B : Result>(val first: A, val second: B)   // snapshot from .read()
// Stat3 / Result3 / … same shape, one per arity

operator fun <A : Stat, B : Stat> A.plus(other: B): Stat2<A, B> = Stat2(this, other)

// Per-trait accessors: one extension per (position, trait) combo
val <B : HasMean> Result2<*, B>.mean get() = second.mean
val <A : HasMean> Result2<A, *>.mean get() = first.mean

The + is defined on Stat itself. A schema is built by adding stats together; the type carries the arity, and a StatGroup wraps the schema as the runtime accumulator:

val schema = Mean("avg_ms") + Sum("total_ms")  // Stat2<Mean, Sum>
val group  = StatGroup(schema)
group.update(105.0)
group.update(80.0)

val snap = group.read()  // Result2<MeanResult, SumResult>
snap.first.mean          // Double, typed
snap.second.sum          // Double, but "second" is positional
snap.mean                // works because only one position has HasMean

// Two stats sharing a trait kills the trait extension:
val decaySchema = DecayingSum(15.minutes) + DecayingSum(1.minutes)
val decayGroup  = StatGroup(decaySchema)
decayGroup.read().sum    // ambiguous; back to .first.sum / .second.sum

The types are fully preserved: add a stat and the type changes; combine two schemas and the types fuse via +. But at the call site you write snap.first.mean, and first is the problem. Position isn't name. Reorder the stats and call sites change. And as soon as two stats share a trait (the DecayingSum + DecayingSum above), the trait extensions become ambiguous and you fall back to .first.sum / .second.sum anyway.

I built the N×M expansion with a KSP processor that generated a trait accessor per position-trait combo, and it compiled. But the abstraction leaked: every call site had to import the right extensions for the traits it read, the parameterized-instances pattern still had no clean read, and the whole thing felt like a hack. Languages with higher-kinded or dependent types make this natural (shapeless is the closest analogue on the JVM), but that's not exactly mainstream territory. Without those features you're encoding a record with positional bookkeeping. I cut it.

3. Typed-key schemas

The fix is to bundle the name and the type into one value: a heterogeneous map keyed by a typed key. Kumulant does it in two layers: a typed key as the plumbing, and a class on top of it for declaring lots of them at once. The plumbing first, a StatKey<R> paired with a GroupResult:

interface Result
interface Stat<R : Result> {
    fun update(value: Double)
    fun read(): R
}

open class StatKey<R : Result>(val name: String, val stat: Stat<R>)

@Serializable
data class GroupResult(val results: Map<String, Result>) : Result {
    operator fun <R : Result> get(key: StatKey<R>): R =
        results[key.name] as R
}

class StatGroup(val keys: List<StatKey<*>>) : Stat<GroupResult> {
    override fun update(value: Double) { keys.forEach { it.stat.update(value) } }
    override fun read(): GroupResult =
        GroupResult(keys.associate { it.name to it.stat.read() })
}

Now keys can be declared directly, and the typed get returns the right result type at the call site:

val mean  = StatKey("mean",  Mean())
val count = StatKey("count", Sum())

val group = StatGroup(listOf(mean, count))
group.update(105.0)

val snap = group.read()
snap[mean]   // MeanResult, no cast at the call site
snap[count]  // SumResult

Each StatKey<R> pairs a name with the type it indexes. The container is a Map<String, Result> underneath, but the typed get returns the declared type, so the call site never sees the cast. Compared to the imperative registry, the strings still exist, but they're bound to the key value, not typed by the user at every read. The key is the variable's identity.

Declaring keys by hand is clunky: you'd be tracking them yourself for the StatGroup, and writing each name twice (once on the property, once as a string). Kumulant uses property delegates on a singleton object instead. Same pattern as JetBrains' Exposed, minus the duplicate name.

abstract class StatSchema {
    private val _keys = mutableListOf<StatKey<*>>()
    val keys: List<StatKey<*>> get() = _keys

    fun <R : Result> stat(s: Stat<R>): PropertyDelegateProvider<StatSchema, StatKey<R>>
    // returns a delegate that registers _keys += StatKey(propertyName, s) and yields the key

    // group(schema): same idea, registers a nested StatGroup as one key
}

fun StatGroup(schema: StatSchema): StatGroup = StatGroup(schema.keys)

User schemas are objects, and every property is a by-delegate:

object HttpMetrics : StatSchema() {
    val requests  by stat(Sum().withValue(1.0))
    // tracks p50, p99, and p999 latency quantiles
    val latencyMs by stat(DDSketch(probabilities = doubleArrayOf(0.5, 0.99, 0.999)))
}

object ServiceMetrics : StatSchema() {
    val requests        by stat(Sum().withValue(1.0))
    val billableMsTotal by stat(Sum())
    val http            by group(HttpMetrics)
    val db              by group(DbMetrics)
}

val service = StatGroup(ServiceMetrics)
service.update(120.0); service.update(80.0)

val snap = service.read()
snap[ServiceMetrics.requests].sum                  // Double, typed
snap[ServiceMetrics.billableMsTotal].sum           // Double, typed
snap[ServiceMetrics.http, { requests }].sum        // dotted lookup into a nested group

Now the schema is a class. Each property is a typed StatKey<R> whose result type matches the stat that constructed it. No magic strings to sync, no references to thread between definition and use, no imperative builder to run before the schema exists; the schema declaration is the structure.

For streaming statistics, this is the design I'm currently happy with. The remaining tradeoffs sit inside individual variables (e.g. derived variables with non-invertible projections need the programmer to handle merge correctness), not in the schema design. This design doesn't address constraints between variables, though. Aggregation is fine; "variable A must always be less than variable B" or "variable C can only be set when variable D fires" has nowhere to live. Fine for kumulant, but klause's case still needs them. That's its own design problem (DSL, AST, wire format) and not one I'd want to cram into this post.

I've pulled the pattern out as Eignex/skema, now at 0.1.0 (Swedish skema means template). It's a Kotlin Multiplatform library where one definition does double duty: typed compile-time access on the producer side, and a kotlinx-serializable wire format so a consumer that doesn't share your Kotlin code can still decode the schema and walk it by name. kumulant, klause, and combo will all settle onto it eventually, just haven't gotten there yet.

If you've worked something like this out, especially in a language without dependent types, I'd love to hear about it. Combo, kumulant, and klause are at github.com/Eignex/combo, github.com/Eignex/kumulant, and github.com/Eignex/klause if you want to poke at them.

AI Isn't the Problem. The Loss Function Is.

Rasmus Ros — Sun, 03 May 2026 19:57:32 +0000

I keep seeing the same argument about AI making us dumber. It's the same argument people had about search engines, and before that books. The usual response is to point at history and say "every generation panics, every generation was wrong, relax." I think that response is half right, and the wrong half is what bothers me.

Tools change what we bother to remember. The people who'd trained their whole lives to memorize 10,000-line oral epics watched the craft die when writing showed up. Long arithmetic in your head used to be normal; calculators arrived and the payoff for keeping that skill sharp went away. Brains didn't shrink. The skills just stopped being worth practicing.

I don't memorize phone numbers anymore. I don't memorize directions. I don't even memorize the APIs of libraries I use every week. What I do instead is keep a fairly precise mental index of where things live and what query will retrieve them. That's a real cognitive trade. I gave up some recall and got back a much larger working set of pointers. Net positive, I think, but I notice the trade in a way I didn't when I was nine.

We usually keep teaching

AI tools push the same trade further. They don't just outsource recall, they outsource synthesis: the part where you actually work through a problem and end up with a model of it in your head. I notice this when I let an LLM write code I could have written myself. I get the output, but I didn't build the model, which is usually the part I wanted. The people who worry about atrophy here aren't wrong, and it's worth its own post.

One thing the prior cases got right is that society kept teaching the underlying skill anyway. Calculators didn't kill arithmetic class. Search engines didn't kill the library-science basics on how an index actually works. Some skills got canonized as core, worth practicing even after the tool that automated them arrived, because we collectively decided they mattered. Coding hadn't quite reached that status yet, but I think it would have given another decade. AI may have shown up too early for that to happen.

So the historical pattern mostly holds: tools rewire priorities, some skills fade, others grow, the panic looks silly in retrospect. Where the "relax, every generation panics" crowd gets it wrong is in assuming AI is just the next entry in that list. It might be. But the environment AI is landing in is not the environment the printing press or the early search engine landed in.

The loop is the problem

Books don't optimize you. Calculators don't optimize you. Search engines, at the lookup layer at least, were mostly trying to give you the page you asked for and then get out of the way. The dominant information channel today is none of those things. It's a feed, and the feed is an optimizer. The target variable is engagement.

Earlier tools removed friction from a specific task and let you spend the saved effort somewhere else. A feed isn't trying to remove friction from anything you'd recognize as a task. It's trying to keep you in the loop. The reward signal it's chasing (what makes you click, stay, scroll, react) is not the same signal as "this was useful to me." It's often the opposite. There's data on this now: heavy social media use predicts elevated depression and anxiety in kids and young adults, and longitudinal work finds the use comes first, not the depression.

And then you wire a generative model into the same loop. Generative AI doesn't change the objective, it just gives the loop a faster, cheaper supply tuned to whatever it already rewards.

Side-by-side diagram of the engagement loop. Left: the ranker selects content from a fixed pool of human-made posts. Right: the same loop with a generative model in place of the pool.

Adding AI to the stack

My background is in optimization. The recurring question I work on is what a product should actually be optimizing for (PhD on automating A/B testing, Eignex the side project still chasing it). So when I look at "LLMs plus a recommendation feed" it looks to me like the same loop with a much better content supply. Not really a new content medium.

The version running today doesn't even use generation in the loop. The recommender stacks at the big platforms (Meta, TikTok, YouTube) are still doing what they've done for a decade: ranking content other people uploaded. The supply pool was already effectively infinite after years of user-generated content. The change is that a growing share of what gets uploaded is now AI-made, and the existing optimizer ranks the synthetic stuff exactly like everything else.

The scarier version puts the generator inside the loop, per-user posts written for you on demand. We don't have it. The thing is, we don't need it. The pool of generated content is already absurd enough that something in it fits your viewing history, your current mood, and what you had for breakfast. The optimizer just has to find it.

A scatter plot of a content embedding space. Blue dots cluster on popular topics; red dots fill the gaps and edges, AI posts colonizing the long tail.

None of this is hypothetical. AI-generated music has racked up millions of streams on Spotify before anyone noticed it wasn't human. Facebook is saturated with generative slop: fabricated heart-warming stories, sculptures supposedly carved by a 92-year-old grandpa nobody appreciates, content farms running cheap image generators to chase engagement. The TikTok-side version is "Italian brainrot", absurd AI-generated creatures with names like Tralalero Tralala captioned with nonsense-Italian audio, pulling hundreds of millions of views.

Facebook's own VP put the dynamic in plain terms to Futurism earlier this year: "if you, as a user, are interested in a piece of content which happens to be AI-generated, the recommendations algorithm will determine that, over time, you are interested in this topic." None of this uses particularly sophisticated tech, and it's already running at scale.

This loop doesn't get out of the way like search did. It takes friction out of producing whatever the optimizer rewards. Right now that's engagement, so the system gets better at engagement. Nothing malicious has to happen for that to land badly; it's doing exactly what it was asked.

The objective is a choice

I'm not fully pessimistic about this, though.

The objective is a choice. Engagement isn't a law of physics. Somebody picked clicks or watch time because it was easy to measure and correlated with revenue. People also reach for banning AI-generated content here. That isn't it either: "the machine wrote it" isn't a stable category once the machines are this good. The thing to push on is the loss function itself (what the system is told to optimize for), and the loss function is written by people.

The irony's not lost on me that if you're reading this, it probably reached you through one of these feeds. As engineers we like to act like the loss function is handed down on stone tablets. It isn't. Somebody wrote it, and on the products I work on that somebody is me.

There is research on what "different" could look like: ranking for informational diversity, or ranking on whether users still endorse a piece of content a week later instead of whether they reacted in the first three seconds. None of it is mature, none of it has a business model behind it the way engagement does, and that's the real obstacle, not the technical side. The systems are perfectly capable of optimizing for something else. The question is whether anyone with the keys wants to. I'd rather sort it out before the next, much more capable generator gets wired into the same loop.

Building a Compact Encoder on kotlinx.serialization

Rasmus Ros — Thu, 30 Apr 2026 07:21:53 +0000

Over the past few years, I found myself occasionally writing the same boilerplate: manually packing bits of application state into tight, heavily character-limited strings. It ended up with me creating a library for it called kencode. But first it's story time, and then a little explanation of the underlying tech of why kotlinx.serialization is so cool, and THEN I'll go over kencode.

It all started with URL callback links on an integrated Search Engine Results Page (SERP). In a previous project at Theca, we had built a search engine embedded directly into a client's website. When users clicked a search result, the link first redirected to our servers so we could register telemetry for the click before finally sending them to the actual target page.

This is standard tracking infrastructure stuff. But if enough state can be encoded directly into the URL, the tracking server can bypass an expensive database lookup entirely. In this particular case, we needed to pass the query ID, the user ID, the document ID, and the exact position in the SERP. One database call is not much, but latency does matter for initial impressions.

Having a short URL here is nice, they look more professional, and there is a limit to how long URLs can be. We also want there to be no special characters in the encoded result. That includes hyphens and underscore, since that would otherwise break the word selecting logic. Try to select the entire path by double-clicking in this URL and you'll see: https://example.com/hyphen-path. But here it works just fine to select dQw...: https://www.youtube.com/watch?v=dQw4w9WgXcQ since it's a single word.

Then the same encoding problem happened again with Kubernetes pod names. I was dynamically spinning up short-lived jobs and wanted to embed trace IDs somehow. Naturally, this metadata should also be stored in Kubernetes labels so it remains queryable with kubectl. Kubernetes also imposes a strict 63-character limit on names and only allows alphanumeric characters and hyphens. Encoding efficiency becomes a limiting factor here.

Later, I ran into this encoding problem a third time while implementing stateless pagination links for that SERP. Paginating correctly through blended keyword + vector search results meant we had to carry internal ranking state from page to page. This state lived entirely inside a ?next=xxx query parameter, meaning the payload had to be compact, URL-safe, and opaque to the user.

And now, I find myself needing it a fourth time for my current project Eignex. It's an optimization engine for tuning things like model parameters or ranking weights in production. By passing chosen-value state in a token to the front-end and back, we can avoid storing it in a massive user ID to settings dict on the back-end.

I realize this is not an everyday problem, but I have now encountered it four separate times. I think the ability to pack complex state into a tiny string is a useful architectural trick. Doing it manually each time is error-prone.

Pack lots of structured data into a tight string.

This is where kencode shines. You define a data class and get strong typing directly from the decoded payload:

@Serializable
data class JobState(
    val clientId: Int,
    val batchId: Int,
    val retryCount: Int?,
    val isPriority: Boolean
)

val state = JobState(119, 210, null, true)

val encodedState = EncodedFormat.encodeToString(state)
// This encodes the object into the string:
// 03W8mJ

val decodedState = EncodedFormat.decodeFromString<JobState>(encodedState)

For comparison, the same object in other encodings:

Encoding	Length	Output
JSON	66 chars	`{"clientId":119,"batchId":210,"retryCount":null,"isPriority":true}`
Protobuf + Base64	10 chars	`CHcQ0gEgAQ`
kencode (Base62)	6 chars	`03W8mJ`

kencode is implemented as a custom format on top of kotlinx.serialization, which has quite a different approach to serialization compared to other JVM libraries. Why that is the case requires some context.

Why kotlinx.serialization?

Before libraries like modern Jackson became the standard, serializing Java objects usually involved writing manual boilerplate. If you need to support multiple formats like Protobuf in addition to JSON you will suffer. Manually crafting custom serializers for every single combination of data type and output format (the classic NxM problem) is simply not the way.

To reduce this boilerplate, runtime reflection libraries like Gson and Jackson became popular. Under the hood, when an object is serialized, these libraries inspect the class at runtime to find its fields, their types, and their values. They map these fields to sequential tokens on the fly. This makes standard JSON-focused libraries easy to use, but not necessarily easy to extend.

The sequential model of serializing makes it difficult to create formats that perform aggregate operations on the entire class. kencode relies on exactly this kind of optimization to compact the payload, like grouping all boolean fields and nullability flags into a single bitmask header.

There is also a hard performance ceiling on reflection. Reflection libraries do usually cache the reflection steps, but the issue is not the reflection itself. It's that interpreting these cached steps at runtime is inherently slower than executing statically compiled code. When a reflection library loops over the fields of your class, it essentially calls a method like serializer.write(fieldValue) over and over. Since your fields are all different types, that is a megamorphic call site which the compiler can't inline or optimize well.

This is why kotlinx.serialization takes another approach completely. Instead of relying on reflection at runtime, it generates static serializers at compile time. The approach is similar to Rust's serde framework, allowing for highly optimized serialization without resorting to manual boilerplate.

In kotlinx.serialization, when a class is annotated with @Serializable, a compiler plugin generates a custom KSerializer at build time. For the JobState class above, it produces something like:

// Generated automatically by the @Serializable compiler plugin
object JobStateSerializer : KSerializer<JobState> {

    override val descriptor: SerialDescriptor =
        buildClassSerialDescriptor("JobState") {
            element<Int>("clientId")
            element<Int>("batchId")
            element<Int?>("retryCount")
            element<Boolean>("isPriority")
        }

    override fun serialize(encoder: Encoder, value: JobState) {
        val composite = encoder.beginStructure(descriptor)
        composite.encodeIntElement(descriptor, 0, value.clientId)
        composite.encodeIntElement(descriptor, 1, value.batchId)
        composite.encodeNullableSerializableElement(
            descriptor, 2, Int.serializer(), value.retryCount
        )
        composite.encodeBooleanElement(descriptor, 3, value.isPriority)
        composite.endStructure(descriptor)
    }

    override fun deserialize(decoder: Decoder): JobState {
        // Analogous to serialize, slightly longer because of formats
        // with arbitrary ordering like JSON.
    }
}

serialize just calls typed methods on an Encoder. The KSerializer provides the data shape; the Encoder writes it. That separation is why custom formats are so convenient: a new format only has to implement an Encoder/Decoder pair, and every existing @Serializable class works with it for free.

How It Works

Let's dive into kencode. I split it into three pieces: a compact binary format, a general byte-to-text encoder, and a small composition layer that turns the whole thing into a normal string format. The binary format and text encoders can be used separately.

1. PackedFormat

PackedFormat is the biggest part of the library. It contains the logic to serialize Kotlin objects into small byte arrays.

The format assumes both sides already agree on the schema. This is a strong assumption and definitely not what you want for persistence or cross-language communication. But when the assumption holds, we save a lot of space by not encoding structural information that both sides already know.

Its other core optimizations:

Bitmask headers: boolean fields and nullability markers are packed into a compact bitset header, costing 1 bit per field instead of the usual 1 byte.
Merged nested headers: bitmask bits from nested class fields are collected into a single root-level header, eliminating per-class byte-alignment padding that would otherwise be wasted at each nesting boundary.
Variable-length integers: standard integer fields waste space because they always consume 4 or 8 bytes, even for small numbers. We shrink them using varint (LEB128) and ZigZag encodings. Varint uses the most significant bit of each byte as a "continuation flag", letting small positive numbers squeeze into a single byte. ZigZag fixes a flaw in plain varint by mapping small negative numbers to small positives (0 → 0, -1 → 1, 1 → 2, -2 → 3) so they pack tightly too. Varint is the default in kencode (and in protobuf); enum ordinals are always varint-encoded automatically.
Collection bitmaps: boolean lists and nullable element lists pack their flags into a leading bitmap rather than storing one byte per element.

Together these optimizations explain how the JobState example was compacted. The boolean and nullability flag combine in the header, and the ID integers take one and two bytes respectively.

How PackedFormat lays out the JobState example.

The header for a flat class is straightforward: one bit per boolean, one bit per nullable field (0 = null, 1 = present), packed into the smallest number of bytes. For JobState that's two bits, so a single byte. Nested classes complicate this because per-class headers waste bits to byte alignment, so kencode merges every non-nullable nested class's bits into one shared header at the very front.

This header-first layout requires writing data you haven't processed yet, which standard streaming frameworks can't do without first materialising the whole object into an intermediate tree. Because kencode knows the exact schema via the SerialDescriptor, it skips the tree: beginStructure allocates a small byte array and reserves the right number of header bits, and endStructure flushes the bitmask followed by the buffered field data.

PackedFormat is the layer that actually reduces the payload. Everything after this is really about transport.

2. The Text Layer: ASCII-Safe Codecs

Transporting byte data as text is a common operation and usually handled by Base64. In kencode, we have more ByteEncoding options. Base64 and Base64Url are there mostly for interoperability, and they're a bit faster than the base-N codecs since the encoding is just a simple bit-shuffle. Base85 is useful when density matters more than a conservative character set. The most interesting one is Base62 (also the default). It solves the problem of using non-alphanumeric characters while staying reasonably dense.

BaseRadix handles arbitrary alphabets generically. It works like this: you treat the entire array of bytes as one massive number, divide it by your base (like 62), and map the remainders to characters in your alphabet. Same underlying math as converting binary to standard decimals, just using a custom string of letters and digits. So any alphabet works. Base36 uses only lowercase, and you could also plug in the base-58 alphabet Bitcoin uses to avoid visually ambiguous characters like 0/O and I/l.

But there's a catch when implementing this. To do that base conversion math, you have to load those raw bytes into a BigInteger. As your payload gets larger, BigInteger division becomes slower, so the naïve version is O(n²). The encoder uses a trick: chunk the input in pre-defined sizes. Instead of processing the whole payload as one giant number, it slices the data into fixed chunks and converts each block individually. This reverts the solution to O(n) just like Base64. You do lose a tiny fraction of a byte to rounding overhead every time a new block starts. 32 bytes turned out to be a good sweet spot.

The chunking also needs an inverse mapping for the decoder. For a given block, encoding N bytes produces a fixed number of characters M, but because M = ceil(N * 8 / log2(base)) rounds up, multiple byte counts can land on the same character count. So we precompute a lookup that goes the other way (character count back to byte count) so decoding a partial trailing block doesn't have to guess the original length.

The asymptotic cost per input byte falls out of the alphabet size:

Codec	chars / byte	Alphabet
Base36	1.55	`0-9 a-z`
Base62	1.34	`0-9 a-z A-Z`
Base64	1.33	`0-9 a-z A-Z` + 2 symbols
Base85	1.25	85 printable ASCII, incl. punctuation

Base64 and Base62 are nearly tied, with Base64 winning by a hair because its math aligns on bit boundaries. But Base62 buys you an alphanumeric-only output, which is usually the reason you reached for it in the first place.

For a concrete example, here is The quick brown fox jumps over the lazy dog (43 bytes) in each:

Base36    (68): 23qhn8p9aco732ripmr6mhzfrtsmxcxxzjdmm3vgas1xzpdkz80fuvjknh7nfo0s6fdz
Base62    (58): k0YiLeAWe79bmxSBiGjowzAh4fSmcMsLmNNmsSowlyAaaWecFKMVGnsquH
Base64Url (58): VGhlIHF1aWNrIGJyb3duIGZveCBqdW1wcyBvdmVyIHRoZSBsYXp5IGRvZw
Base85    (54): <+ohcEHPu*CER),Dg-(AAoDo:C3=B4F!,CEATAo8BOr<&@=!2AA8c)

At this size Base62 happens to match Base64Url because of where the block rounding lands. On longer payloads Base64 edges ahead by a small constant factor, and Base85 stays the densest at the cost of a much noisier alphabet.

3. The Composition Layer: EncodedFormat

Finally there is EncodedFormat, which is the glue that combines the binary format and a chosen text codec into a single StringFormat. Between those two layers is an optional transform step for arbitrary byte manipulation.

val format = EncodedFormat {
    binaryFormat = PackedFormat { defaultEncoding = IntPacking.SIGNED }
    transform = encryptingTransform
    codec = Base62
}

val token = format.encodeToString(payload)

A PayloadTransform is just a pair of encode/decode functions on a ByteArray. You get the packed bytes, return whatever bytes you want, and the text codec runs on that. Two of them chain together with .then(...).

I mainly added this for encryption. In the Eignex case, the token rides along on the front-end between requests, so it has to be opaque. Wrapping a cipher is basically a few lines:

val encryptingTransform = object : PayloadTransform {
    override fun encode(data: ByteArray): ByteArray = cipher.encrypt(data)
    override fun decode(data: ByteArray): ByteArray = cipher.decrypt(data)
}

The same interface covers a bunch of other useful things: error-correcting codes (wrap Reed-Solomon and you get tokens that survive a couple of mangled characters), compression for larger payloads, or a CRC checksum if you're worried about users truncating tokens they pasted from a log (there's a checksum = Crc16 shorthand for that one).

PackedFormat is for dense transport, not durable storage. If you want something you can persist and evolve more comfortably over time, swap in ProtoBuf instead.

Anyway, that's kencode. Let me know if you find a fifth reason to pack state into a string. Source is at github.com/Eignex/kencode if you want to poke at it.

Building Eignex in the Open

Rasmus Ros — Mon, 27 Apr 2026 11:21:57 +0000

I've always been fascinated by applying optimization to solve real-world problems.

It is often an inherently multidisciplinary activity, and there is something deeply satisfying about taking distinct, often siloed ideas and jamming them together to create something that is fundamentally better than the sum of its parts. In my PhD thesis it was search-based optimization, multi-armed bandit algorithms, combinatorial optimization, probabilistic machine learning, and of course, software engineering.

I wrapped up my PhD thesis back in 2022. I loved the work itself, digging deep into continuous optimization and A/B testing, but I realized pretty quickly that I didn't want to stay in academia.

The environment felt incredibly results-driven, but often in the wrong way. It felt like to be successful you have to play the academic game of marketing your work, rather than the pure engineering challenge of solving a hard problem and making it robust.

I wasn't ready to stop working on optimization just because I left the university, though. I actually find this stuff fun. I wanted to keep building, but I wanted to build tools that actually work in the real world, not just in a paper.

That's basically why I started the Eignex project.

Why Open Source?

To me, open sourcing the work felt like a no-brainer. It wasn't a strategic decision I thought twice over.

First, I enjoy writing high-performance code, and it's simply more fun when other people can use it. But more importantly, there is a trust factor.

If you are building infrastructure that is going to automatically tweak parameters on a live production system, you shouldn't be doing it inside a black box. If a piece of software is going to turn knobs on my server, I want to see the code. I want to know exactly how it makes decisions and how safety constraints are enforced.

That's why all the building blocks of the core engines are public. You can audit the math yourself and contribute if you want.

The End Goal

Let's be real: making money on open source is notoriously difficult. I'm not under any illusions about that, and I'm not trying to build something to make a living out of.

The plan, though, is to eventually build a managed SaaS.

It doesn't exist yet. Right now, I'm just focusing on building the core engine from the bottom-up, one library at a time. But the long-term goal is to build a platform that handles the messy parts of running these optimization loops in production. Things like dashboards, persistent state management, and k8s setup.

If I can eventually get that managed service to a point where it covers the server bills, I'll call that a win.

For now, I'm just building.