Forem: Hugo Vantighem

Read Modify Write Is Where NoSQL Concurrency Bugs Begin.

Hugo Vantighem — Sun, 24 May 2026 12:11:46 +0000

Part 1 of 3 — the single-document case.

There's a class of bug that every backend engineer ships at least once, usually
without noticing for months. It hides inside the most innocent-looking operation:
read a document, decide something, write it back.

Take a concrete invariant: a team can hold at most 10 seats. To add a seat you
read the team document, count the seats, check count < 10, and write. A textbook
Read → Modify → Write.

Now run it twice at the same instant. Request A reads count = 9, decides "9 < 10,
fine", and writes 10. Request B, a millisecond apart, also reads count = 9,
decides "fine", and writes 10. You now have a team that thinks it has 10 seats but
actually granted 11. Neither request did anything wrong on its own. One write
silently erased the premise of the other. This is a lost update, and it's the
core anomaly of the single-document case.

T0   A reads count = 9
T1   B reads count = 9
T2   A writes count = 10   ("9 < 10, fine")
T3   B writes count = 10   ("9 < 10, fine")

Reality:        11 seats granted
Database state: 10
Invariant:      violated, silently
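
To make the timeline concrete, here is a minimal Python replay of those four steps. It is only a sketch: a plain dict stands in for the team document, and the names (team, read_count, write_if_allowed) are made up for illustration. No real database is involved, so the interleaving is forced by hand rather than by luck.

```python
# Deterministic replay of the T0..T3 interleaving, with a plain dict
# standing in for the team document (an assumption: no real database here).
MAX_SEATS = 10

team = {"_id": "team-1", "count": 9}
granted = 0  # seats actually handed out to callers


def read_count(doc):
    """Read step: return a snapshot of the current seat count."""
    return doc["count"]


def write_if_allowed(doc, seen_count):
    """Decide on the (possibly stale) snapshot, then blindly overwrite."""
    global granted
    if seen_count < MAX_SEATS:
        doc["count"] = seen_count + 1
        granted += 1
        return True
    return False


a_seen = read_count(team)        # T0: A reads 9
b_seen = read_count(team)        # T1: B reads 9
write_if_allowed(team, a_seen)   # T2: A writes 10
write_if_allowed(team, b_seen)   # T3: B writes 10, erasing A's premise

seats_in_reality = 9 + granted
print(team["count"], seats_in_reality)  # prints: 10 11
```

Both calls pass their own check, yet the stored count and the number of seats handed out disagree by one.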

Here's what teams actually reach for, and exactly what each option leaves on the
table.

The fat aggregate (atomic operators)

If you can express the whole mutation as a single atomic operator — $inc,
$push with $slice, or a conditional findAndModify — MongoDB applies it
atomically on the document. There's no read-then-write window, so no lost update.
For invariants that fit a single atomic expression, this is genuinely the right
tool, and you should reach for it first.

The catch: not every invariant fits. The moment your check needs branching ("if
the plan is free and count ≥ 5, reject") you're back to reading, deciding in
application code, and writing — and the window reopens. Embedding related data is
a perfectly good modeling choice; the trap is different. It's the temptation to keep
stretching one document's consistency boundary — folding in unrelated rules just
to keep the write atomic — which is exactly how you end up with 16 MB documents and a saturated network.

Anomaly status: ✅ lost update handled — for the subset of rules expressible as
one atomic op.
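
Here is a small sketch of what "check and increment as one step" buys you. TeamStore is a hypothetical in-memory stand-in, and its lock plays the role of the per-document atomicity the database gives you. With PyMongo the same shape would be roughly update_one({"_id": team_id, "count": {"$lt": 10}}, {"$inc": {"count": 1}}) followed by a check on modified_count, but that part is not exercised here.

```python
import threading

MAX_SEATS = 10


class TeamStore:
    """In-memory stand-in for a collection (hypothetical, for illustration).

    The lock models per-document atomicity: the check and the increment
    happen as one indivisible step.
    """

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def insert(self, doc):
        with self._lock:
            self._docs[doc["_id"]] = dict(doc)

    def get(self, doc_id):
        with self._lock:
            return dict(self._docs[doc_id])

    def inc_count_if_below(self, doc_id, limit):
        """Increment count only if it is still below limit; report success."""
        with self._lock:
            doc = self._docs[doc_id]
            if doc["count"] < limit:
                doc["count"] += 1
                return True
            return False


store = TeamStore()
store.insert({"_id": "team-1", "count": 9})
results = []  # list.append is thread-safe in CPython


def add_seat():
    results.append(store.inc_count_if_below("team-1", MAX_SEATS))


threads = [threading.Thread(target=add_seat) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

granted = sum(results)
final_count = store.get("team-1")["count"]
print(granted, final_count)  # prints: 1 10
```

Fifty concurrent callers, one winner: the condition travels with the write, so there is no stale snapshot to act on.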

The pessimistic lock (Redis)

Grab a distributed lock before the read, release after the write. It works, but
for a single document it's a sledgehammer. You've added a network round-trip, a
brand-new failure mode (the lock service), and a whole class of distributed
coordination failures (lease expiry, clock drift, fencing, split-brain), all to
guard one document the database could have guarded itself.

Anomaly status: ✅ everything — at the cost of latency and distributed coordination
failures. (Part 3 is dedicated to why that bill is steep.)
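
To show one of those coordination failures, here is a toy simulation of lease expiry. Everything in it is hypothetical: LeaseLock is not a Redis client, just a few lines that mimic a key with a TTL, and FakeClock lets the stall be replayed deterministically. The point is only that a holder who pauses past its lease can still write afterwards.

```python
class FakeClock:
    """Manually advanced clock so the scenario is deterministic."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class LeaseLock:
    """Toy lease lock (hypothetical): ownership silently lapses after ttl."""

    def __init__(self, clock, ttl):
        self._clock = clock
        self._ttl = ttl
        self._owner = None
        self._expires_at = 0.0

    def acquire(self, owner):
        free = self._owner is None or self._clock.now >= self._expires_at
        if free:
            self._owner = owner
            self._expires_at = self._clock.now + self._ttl
        return free

    def release(self, owner):
        # Only the current owner can release; a stale holder's call is a no-op.
        if self._owner == owner:
            self._owner = None


MAX_SEATS = 10
clock = FakeClock()
lock = LeaseLock(clock, ttl=5.0)
team = {"count": 9}
granted = 0

a_got_lock = lock.acquire("A")   # A takes the lock
a_seen = team["count"]           # A reads 9
clock.advance(6.0)               # A stalls past its 5 s lease

b_got_lock = lock.acquire("B")   # the lease lapsed, so B gets the lock too
b_seen = team["count"]           # B reads 9
if b_seen < MAX_SEATS:
    team["count"] = b_seen + 1   # B writes 10
    granted += 1
lock.release("B")

# A resumes, unaware its lease expired, and writes from its stale read.
if a_seen < MAX_SEATS:
    team["count"] = a_seen + 1   # A writes 10 as well
    granted += 1
lock.release("A")                # no-op: A no longer owns the lock

print(a_got_lock, b_got_lock, team["count"], granted)  # prints: True True 10 2
```

The lost update is back, lock and all. The usual countermeasure is a fencing token that the store itself checks on write, which is one more moving part to get right.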

Optimistic locking (a version field)

Carry a version on the document. Read it, run your logic, then write with a
guard: findAndModify({query: {_id, version: v}, update: {$set: {...}, $inc: {version: 1}}}). If
anyone wrote in between, version moved, your guard matches nothing, and you
retry. This is the clean default for single-document RMW that doesn't fit an
atomic operator — it kills lost update with no external system.

The catch: under contention it's a retry machine. The more concurrent writers, the
more losers re-run their logic, burning CPU and inflating tail latency.

Anomaly status: ✅ lost update — at the cost of app-side retries.
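
A minimal sketch of the version-guard loop, using the branching rule quoted earlier (free plan capped at 5 seats, 10 otherwise). VersionedStore, seat_limit and add_seat are hypothetical names, and the store is an in-memory stand-in whose replace_if_version plays the part of the guarded write. Treat it as an illustration of the pattern, not a driver-level recipe.

```python
import threading


class VersionedStore:
    """In-memory stand-in (hypothetical) with a version-guarded write."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def insert(self, doc):
        with self._lock:
            self._docs[doc["_id"]] = dict(doc, version=0)

    def get(self, doc_id):
        with self._lock:
            return dict(self._docs[doc_id])

    def replace_if_version(self, doc_id, expected_version, fields):
        """Apply fields and bump version only if nobody wrote in between."""
        with self._lock:
            doc = self._docs[doc_id]
            if doc["version"] != expected_version:
                return False
            doc.update(fields)
            doc["version"] += 1
            return True


def seat_limit(plan):
    """Branching rule decided in application code."""
    return 5 if plan == "free" else 10


def add_seat(store, team_id, max_attempts=100):
    for _ in range(max_attempts):
        doc = store.get(team_id)                      # read
        if doc["count"] >= seat_limit(doc["plan"]):   # decide
            return False
        new_fields = {"count": doc["count"] + 1}
        if store.replace_if_version(team_id, doc["version"], new_fields):
            return True                               # guarded write landed
        # Guard matched nothing: someone else wrote first, so retry.
    raise RuntimeError("gave up after too many conflicting writes")


store = VersionedStore()
store.insert({"_id": "team-1", "plan": "free", "count": 0})
results = []  # list.append is thread-safe in CPython


def worker():
    results.append(add_seat(store, "team-1"))


threads = [threading.Thread(target=worker) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

granted = sum(results)
final = store.get("team-1")
print(granted, final["count"], final["version"])  # prints: 5 5 5
```

Twenty writers, five seats granted, and the version counts exactly the writes that landed. Every loser paid for at least one wasted read-and-decide pass, which is the retry cost described above.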

Pray

Bet that two requests never touch the same document in the same millisecond. They
will. Anomaly status: ❌ lost update, in production, at 3 a.m.

The point

For a single document, you're actually well served: atomic operators or optimistic
locking close the gap cleanly, without external machinery. The single-document
case is the easy one.

The real pain begins the instant your invariant spans two documents — a
workspace budget gating a user debit, for example. There, optimistic locking stops
being sufficient: it still guards each document on its own, but it can no longer
guarantee an invariant that lives between them. And a nastier anomaly walks in —
the database stays perfectly "consistent" while your business invariant quietly
dies.
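
A quick preview of that failure, as a hand-forced interleaving. The document shapes (a workspace with a budget, one spending document per user) and every name below are assumptions made up for this sketch. Each write carries its own version guard, and each guard passes.

```python
# Three documents, each with its own version field (hypothetical shapes).
docs = {
    "workspace": {"budget": 100, "version": 0},
    "alice": {"spent": 0, "version": 0},
    "bob": {"spent": 0, "version": 0},
}


def snapshot():
    """Read step: copy every document as the request sees it."""
    return {key: dict(value) for key, value in docs.items()}


def within_budget(seen, amount):
    """Invariant that lives between documents: total spend <= budget."""
    total = seen["alice"]["spent"] + seen["bob"]["spent"]
    return total + amount <= seen["workspace"]["budget"]


def guarded_debit(user, seen, amount):
    """Per-document optimistic write: only guards the user's own document."""
    doc = docs[user]
    if doc["version"] != seen[user]["version"]:
        return False
    doc["spent"] = seen[user]["spent"] + amount
    doc["version"] += 1
    return True


a_seen = snapshot()  # request A reads everything
b_seen = snapshot()  # request B reads everything, before A writes

a_ok = within_budget(a_seen, 60) and guarded_debit("alice", a_seen, 60)
b_ok = within_budget(b_seen, 60) and guarded_debit("bob", b_seen, 60)

total_spent = docs["alice"]["spent"] + docs["bob"]["spent"]
print(a_ok, b_ok, total_spent)  # prints: True True 120
```

Neither guard fires because neither request touched the other's document. 120 spent against a budget of 100, and every document is individually valid.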

Welcome to write skew. That's part 2.

Postgres-grade Serializable at 20k+ ops/s — on a laptop. Don’t try this at home.

Hugo Vantighem — Sat, 23 May 2026 17:14:52 +0000

They didn't know it was impossible, so they did it. (Commonly attributed to Mark Twain.)

In the software industry, we've been raised with a dogma: you must choose between Massive Performance (NoSQL, eventual consistency) and Domain Rigor (SQL, strong consistency, serializable).

We are told that locks, latencies, and ACID properties are the natural enemies of speed. That if you want to scale, you have to let go of your business invariants.

I decided to test another hypothesis. And I broke the myth.

The Result: 20,000+ Validated Transactions per Second

This isn't a "fire and forget" ingestion log.

This isn't a volatile cache experiment.

What you see here is Business Transaction Durability:

Invariants validated — every business rule is checked before commit.
State persisted — every change is durably written to disk.
Strong Consistency — Serializable-level isolation.

At 20,000+ ops/s, we are not just talking about speed. We are talking about the ability to maintain absolute domain integrity under massive load.

And the kicker: this is running on a MacBook Air M3 — 8 cores, 16 GB of RAM, the same machine I write the code on. No 64-core server. No NVMe array. No datacenter rack. One laptop, fan barely audible, doing the work of a small cluster.

Why General-Purpose Databases Hit a Ceiling

Most databases are built for general cases. They treat every row the same way because they don't know your business.

This "Domain Ignorance" leads to generic row locks, MVCC bookkeeping, cross-table coordination, and massive overhead — costs you pay on every single transaction, whether your domain needs them or not.

Not Magic — Discipline

For the skeptics: this isn't sorcery. It's discipline applied to the right layer — designing the system so the hardware does exactly what it's good at, and nothing else.

I'm not reinventing the storage wheel. The foundation is Pebble, the same proven LSM-tree engine that powers CockroachDB. But the engine is just the floor. The real lever is the orchestration of the domain logic on top of it — and that's what Part 2 puts a name on.

A Note on the Benchmark Scope

I know what you're thinking. "20k+ ops/s? That must be an internal memory trick."

It isn't. To ensure these numbers reflect real-world usage, the benchmark covers the entire lifecycle of a business transaction:

Client-side serialization — the payload starts from the app.
Local communication — end-to-end roundtrip.
Server-side deserialization & parsing.
Business Invariants validation.
Disk persistence with full durability guarantees — fsync on every commit.
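
For readers who want the lifecycle above in code, here is a deliberately naive sketch. It is not the engine described in this post (that one sits on Pebble). It uses a plain append-only file, JSON as the wire format, a direct function call instead of local communication, a made-up business rule (MAX_AMOUNT), and it assumes a commit means one fsync per batch. All of those are assumptions for illustration only.

```python
import json
import os
import tempfile

MAX_AMOUNT = 1_000  # hypothetical business rule: 0 < amount <= MAX_AMOUNT


def validate(item):
    """Business invariant check, run before anything touches the disk."""
    amount = item.get("amount")
    return isinstance(amount, int) and 0 < amount <= MAX_AMOUNT


def commit_batch(log_path, raw_items):
    """Deserialize, validate, append, then fsync once for the whole batch."""
    items = [json.loads(raw) for raw in raw_items]   # server-side parsing
    if not all(validate(item) for item in items):
        return False                                 # reject: nothing written
    with open(log_path, "ab") as log:
        for item in items:
            log.write(json.dumps(item).encode("utf-8") + b"\n")
        log.flush()                # push Python's buffer to the OS
        os.fsync(log.fileno())     # ask the OS to push it to the device
    return True


log_path = os.path.join(tempfile.mkdtemp(), "commits.log")

# Client-side serialization: payloads start life as strings.
good = [json.dumps({"id": i, "amount": 10}) for i in range(1000)]
bad = good[:999] + [json.dumps({"id": 999, "amount": -5})]

ok_good = commit_batch(log_path, good)
ok_bad = commit_batch(log_path, bad)

with open(log_path, "rb") as f:
    line_count = sum(1 for _ in f)

print(ok_good, ok_bad, line_count)  # prints: True False 1000
```

The ordering is the part worth keeping: validation happens before the write, and the function only reports success after the fsync returns.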

The workload: batch=1000, payload=1KB, single-node, single laptop. Here's the run, with the system-level disk stats captured live during the bench:

[23755.87 items/s] | items=1424000 | batch=1000 | payload=1KB | durability=FSYNC-ON

Live capture during the bench (batch=1000, 1KB, fsync ON). Disk on fire, CPU bored.

Two things jump out of that stats panel — and together they're the whole point:

The disk is screaming. Sustained 100–200 MB/s with the ⚡ markers firing almost every second. This is real fsync'd traffic hitting the SSD, not a memory cache pretending to be durable. If you pulled the power cord mid-run, every committed transaction would still be there on reboot.
The CPU is bored (~18% on an 8-core M3). The compute is idle while the disk pegs out — that asymmetry is the whole story.
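
A back-of-the-envelope check on the run line, for anyone who likes to see the arithmetic. Two assumptions are baked in and not confirmed by the output itself: that "1KB" means 1024 bytes, and that a commit corresponds to one batch.

```python
# Figures copied from the benchmark output line.
items_per_s = 23755.87
items_total = 1_424_000
batch_size = 1000

payload_bytes = 1024  # assumption: "1KB" means 1024 bytes

run_seconds = items_total / items_per_s
batches_per_s = items_per_s / batch_size   # assumption: one commit per batch
ms_per_batch = 1000 / batches_per_s
payload_mb_per_s = items_per_s * payload_bytes / 1_000_000

# Ratio of the observed disk traffic (100 to 200 MB/s) to raw payload.
amplification_low = 100 / payload_mb_per_s
amplification_high = 200 / payload_mb_per_s

print(f"run length:      {run_seconds:.1f} s")
print(f"batches per sec: {batches_per_s:.1f}")
print(f"time per batch:  {ms_per_batch:.1f} ms")
print(f"raw payload:     {payload_mb_per_s:.1f} MB/s")
print(f"disk vs payload: {amplification_low:.1f}x to {amplification_high:.1f}x")
```

That works out to a run of roughly a minute, about 24 batches per second (around 42 ms each), and about 24 MB/s of raw payload. The observed 100 to 200 MB/s on disk is therefore several times the payload, which would be consistent with write-ahead logging plus LSM compaction, though that attribution is a guess and not something the capture proves.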

And this isn't the ceiling. With bigger batches the same laptop pushes further; even at batch=1, it doesn't fall off a cliff. The full envelope is Part 2.

What's Next?

This is just Part 1. In a few days, Part 2 finishes the picture and lands the real punchline: business rules aren't a tax on performance — they're the contract that lets the machine fly. And the whole thing runs on hardware your team could expense, not a cloud bill that needs board approval.

Stay tuned. The era of the "Impossible Trade-off" is over.