DEV Community

Dhanush B
Dhanush B

Posted on

91 4 5 4 4

📍The Art of Choosing the Right Database

"Should I use SQL or NoSQL? B-Trees or LSM Trees?" If you've ever felt overwhelmed choosing the right database for your app, you're not alone. Underneath every database is a rich ecosystem of storage engines and transaction protocols — and choosing right can mean the difference between blazing-fast performance or painful bottlenecks.

In this blog, we step into the world of database internals — through the lens of story — and explore how systems like MySQL, MongoDB, Cassandra, and PostgreSQL really work under the hood.


Let me tell you a story.

It all began when I was designing a backend system for a fast-growing e-commerce app. It needed to handle thousands of concurrent users, real-time product inventory updates, personalized recommendations, and a dashboard that updated faster than you could say “out of stock.” And like many developers before me, I hit the question:

“Which database should I use?”

What followed was a deep dive down the rabbit hole of database internals — a world where storage engines clash, transactions dance in delicate sync, and choices aren’t always black and white.

This post is that journey — and maybe, it’ll help you find your own answer.


📁 Behind the Scenes of a Database

At first glance, databases look simple. You insert data. You query it. Maybe update or delete a few rows.

But under the hood, it’s a wild machine — made up of layers like:

  • Transport: How your queries travel to the server
  • Query Parser & Optimizer: What your SQL actually becomes
  • Execution Engine: Where it all gets done
  • Storage Engine: The core, the vault, the thing that makes it all possible

That’s where our journey truly begins.


🍊 Storage Engine Showdown: B-Tree vs LSM Tree

🎓 The Classic Hero: B-Trees

Imagine a grand old library with neatly sorted sections and labeled drawers. That’s a B-Tree.

Efficient, organized, and time-tested. Every insert knows where to go. Every query finds what it needs quickly.

It works like this:

  • Your data is stored in sorted blocks
  • Every read is fast (like O(log n) fast)
  • Updates happen in-place — meaning some random disk I/O, but that's fine for OLTP systems

Databases like MySQL (InnoDB) and PostgreSQL love B-Trees. They're rock solid when you need strong consistency, fast lookups, and ACID transactions.

But…

🔥 The Young Disruptor: LSM Trees

Then you meet LSM Trees — Log-Structured Merge Trees.

This one doesn’t bother with in-place updates. It writes everything to memory first, and flushes it to disk in sorted chunks called SSTables. Every now and then, it cleans up by merging — a process called compaction.

It’s like writing notes on sticky pads, then compiling a clean notebook later.

The result? Blazing-fast write performance. Perfect for logs, metrics, IoT streams, and other write-heavy systems.

LSM Trees power Cassandra, RocksDB, HBase, and even parts of MongoDB.


⚖️ When You Have to Choose

It felt like I was in a Western showdown:

B-Tree Wins If… LSM Tree Wins If…
Reads are frequent Writes are frequent
You need ACID compliance Eventual consistency is okay
OLTP-style transactions Streaming or time-series data

But that wasn’t the end. A good database isn’t just about reading or writing.


🔐 The Transaction Tapestry

Remember that moment in every heist movie where timing is everything?

That’s what transactions are like. You need your operations to be Atomic, Consistent, Isolated, and Durable — aka ACID.

✪️ SQL Databases (Relational)

In systems like MySQL or PostgreSQL, this is handled with:

  • Undo logs
  • WAL (Write-Ahead Logs)
  • MVCC (Multi-Version Concurrency Control)

Everything is locked, tracked, and reversible.

🌐 NoSQL Databases

In contrast, systems like Cassandra and DynamoDB favor eventual consistency — they go for BASE (Basically Available, Soft state, Eventual consistency).

They work like eventually synced notebooks:

Updates hit one node, and others catch up in the background. Fast, distributed, but less strict.


🧵 A Thread on Concurrency

Concurrency is where things got trickier.

With B-Trees, concurrency is controlled through careful locking:

  • Shared locks, exclusive locks, even update locks
  • B-Link trees (a clever enhancement) let reads flow even during writes

With LSM Trees, it’s much more lock-free:

  • MemTables take writes concurrently
  • SSTables are immutable
  • Compaction is background work

It was like comparing a bank vault with a revolving door system.


🧬 The Hybrid Age

In the real world, no one-size-fits-all database exists.

Some systems started combining the best of both:

  • MySQL has plugins for RocksDB
  • MongoDB switched to an LSM-like engine (WiredTiger)
  • Aurora blends SQL compatibility with NoSQL performance

🧠 My Takeaway

Choosing the right database isn’t about trends — it’s about trade-offs.

Ask yourself:

  • Is your workload read-heavy or write-heavy?
  • Do you need strict transactions, or is speed more important?
  • Are you handling structured business data, or millions of streaming events?

Once you answer these, your storage engine almost picks itself.


✍️ Final Thoughts

That little "insert into users" line in your code? It kicks off a cascade of logic and engineering brilliance that spans decades.

Understanding database internals made me a better backend engineer — and hopefully, now it’ll do the same for you.


Postmark Image

"Please fix this..."

Focus on creating stellar experiences without email headaches. Postmark's reliable API and detailed analytics make your transactional emails as polished as your product.

Start free

Top comments (3)

Collapse
 
adderek profile image
Maciej Wakuła

I am happy you are learning that. However you have not even touched the tip of the iceberg.
One important correction: NOSQL which stands for not only SQL.
Learn about ACID and CAP. Once you learn fundamentals of some RDBML (ex. MySQL, postgres, Oracle) try programming language inside the DB and triggers (but only try - in most cases this is the wrong way). Read about change journal (log). Try messaging and event sourcing. Then try the other area (mongo, redis). Learn why you might wish to sacrifice part of ACID and focus on CAP. Finally go to NewSQL. Good luck

Collapse
 
matthewlefevre profile image
Matthew LeFevre

Love the story! Thank you for sharing!

Collapse
 
himanshu_code profile image
Himanshu Sorathiya

Nice
Beginner friendly explanation, loved it