Forem: Hrishikesh Dalal

The AI Ghostwriter Experiment

Hrishikesh Dalal — Sun, 17 May 2026 11:48:08 +0000

Over the past few months, I ran an experiment on this blog.

Balancing engineering projects, open-source contributions, and university means deep-focus writing is a luxury I rarely have. I wanted to test a hypothesis: Can a developer leverage LLMs to fully automate content creation without losing technical depth?

I launched my "Engineering in the Wild" & "System Design" series to find out. Here is what I learned from letting AI ghostwrite my blog, and why I’m officially pivoting my strategy.

The Workflow

It wasn’t just mindless prompting. I built a specific pipeline to ground the AI in reality:

Input: Raw notes on architectural decisions (like sync APIs vs. Pub/Sub) or details from my ongoing projects.
Generation: Prompting the LLM to structure an engaging, DEV.to-optimized article.
Output: Flawless markdown, clean code snippets, and accessible analogies.

On paper, it worked perfectly. I shipped consistently, and the articles read like polished, textbook overviews that explained complex concepts exceptionally well.

The Flaw: Sanitized Engineering

While the explanations were solid, the experiment revealed a fatal flaw in fully automated content: it completely sanitizes the reality of writing code.

Reviewing the generated articles made a few things glaringly obvious:

Loss of Signal: AI outputs the statistical average of human knowledge. It produces clean, pristine ideals. But real engineering is chaotic and messy.
Erasure of Friction: The real value of a dev blog isn't just the final architecture—it’s the friction it took to get there. AI erased the edge-case geofencing bugs in my attendance platform, the legacy codebase constraints at my enterprise internship, and the brutal code reviews from merging open-source AI modules.
The Commodity Trap: A flawless explanation of a protocol is useful, but if anyone can generate it in ten seconds, it loses its unique edge. It didn’t show how I think; it showed how the model thinks.

The Pivot: Authentic Grit

The foundational overviews the AI helped me build serve their purpose, and AI remains a phenomenal tool. But going forward, the workflow for "Engineering in the Wild" & "System Design" is evolving.

The best documentation isn’t just a flawless essay; it’s a collection of genuine experiences from the trenches of development. It’s time to bring the grit back to this blog.

Even this meta-analysis was generated by AI :) but the pivot is 100% human.

EP 15: Pub/Sub - Stop Chaining Your Services

Hrishikesh Dalal — Wed, 18 Feb 2026 11:20:06 +0000

You just launched your dream startup, "DevEats"—a food delivery app for developers who need serious fuel while debugging. Your signature item? The "Triple Stack Smash Burger with Truffle Fries."

In the beginning, life was simple. You had one backend service (a monolith). A user ordered a burger, your code saved it to the database, and life was good.

But you grew fast. Now, YOU are the owner of a scaling platform, and your code is turning into a tangled mess of spaghetti (pun intended).

Here is why you need the Pub/Sub pattern, explained through the lens of your own burger empire.

1. What is the Need? (The "Smash Burger Panic")

Let’s look at your current codebase. When a user clicks "Order Burger", your backend does this strictly in order:

Payment Service: Charge the credit card $25.
Order Service: Save the order to the database.
Notification Service: Send a receipt email.
Kitchen Service: Print the ticket for the chef.
Analytics Service: Update your admin dashboard sales graph.

The Problem:
One Friday night, your Email Provider goes down.
Because your code is sequential (A -> B -> C -> D), the entire process crashes at Step 3.

The payment went through (Step 1), but the kitchen printer never got the ticket (Step 4).

Now you have a hangry developer who paid for a Triple Stack Burger that isn't being made. They are angry, and your support inbox is flooding.

The Need:
You need a way for your system to say, "Hey, a burger was just ordered!" without caring who is listening or if the email service is currently online. You need to decouple the sender from the receivers.

2. The Solution: Pub/Sub Architecture

In a Publish/Subscribe (Pub/Sub) model, we split your architecture into three parts:

The Publisher (You/The Order Service): You simply shout a message ("Order #505: Triple Smash Burger!") into a channel. You don't know who is listening. You don't care.
The Topic (The Channel): A dedicated lane for specific types of messages (e.g., events.burger_ordered).
The Subscribers (The Listeners): Independent services that have signed up to listen to that topic.

The New Flow:

User orders the burger.
You (Order Service) publish a message to the events.burger_ordered topic.
Done. You return "Success" to the user immediately.

Meanwhile, in the background:

The Kitchen Service hears the message and instantly prints the ticket on the grill station.
The Notification Service hears the message and tries to send an email. If it fails (because the provider is down), it retries later. Crucially, it does not stop the kitchen from cooking.
The Rewards Service hears the message and adds 50 points to the user's account.
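
To make the new flow concrete, here is a minimal in-process sketch. It is a toy under stated assumptions, not how a real broker works internally: the `Broker` class, the handler names, and the `dev42` user are all made up for illustration, and delivery here is a plain synchronous loop, whereas a real broker delivers in the background.

```python
from collections import defaultdict
from typing import Callable


class Broker:
    """Toy in-process broker: one topic fans out to every subscriber."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)
        # (handler name, message) pairs kept so they could be retried later
        self.failed: list[tuple[str, dict]] = []

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        for handler in self._subscribers[topic]:
            try:
                handler(message)
            except Exception:
                # One failing subscriber must not stop the others.
                self.failed.append((handler.__name__, message))


printed_tickets: list[int] = []
reward_points: dict[str, int] = {}


def kitchen_service(msg: dict) -> None:
    printed_tickets.append(msg["order_id"])


def notification_service(msg: dict) -> None:
    raise ConnectionError("email provider is down")


def rewards_service(msg: dict) -> None:
    reward_points[msg["user"]] = reward_points.get(msg["user"], 0) + 50


broker = Broker()
for service in (kitchen_service, notification_service, rewards_service):
    broker.subscribe("events.burger_ordered", service)

broker.publish("events.burger_ordered", {"order_id": 505, "user": "dev42"})
```

Even though the notification handler raises, the kitchen ticket is printed and the reward points are added; the failed delivery is parked in `broker.failed` for a retry.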

3. When to Use It?

You shouldn't use Pub/Sub for everything. Use it when:

1-to-Many Relationships: One event (Burger Ordered) triggers multiple, unrelated actions (Email, Kitchen Display, Inventory Check).
Decoupling is Vital: You want to add a new "SMS Service" next week without rewriting your existing Order Service code.
Asynchronous Processing: You don't want the user staring at a loading spinner while your server generates a PDF invoice.

4. The Pros and Cons

✅ The Pros

Scalability: You can add 50 new subscribers (e.g., a "Surge Pricing Bot") without touching the original Publisher code.
Reliability: If one subscriber crashes (e.g., the Email service), the others (Kitchen, Analytics) keep working perfectly. The burger still gets made.
Speed: The Publisher fires the event and moves on. The user gets a "Your Order is Placed!" screen instantly.

❌ The Cons

Complexity: You are introducing a new component (a Broker like Redis, Kafka, or Google Pub/Sub) to manage.
Debugging is a Nightmare: You can't just follow a linear stack trace anymore. You have to trace messages flying across a network.
Consistency Issues: What if the message is delivered twice? (Does the customer get charged twice? Does the chef make two burgers?). You have to write code to handle these edge cases.
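
One common way to handle the "delivered twice" case is an idempotent consumer: remember which message IDs you have already processed and skip repeats. The sketch below assumes every message carries a unique `message_id` (a hypothetical field name) and keeps the seen IDs in memory; a real service would keep them in a durable store.

```python
class PaymentConsumer:
    """Charges a customer at most once per message_id."""

    def __init__(self) -> None:
        self.processed_ids: set[str] = set()
        self.charges: list[tuple[str, int]] = []

    def handle(self, message: dict) -> bool:
        """Process the message; return False if it was a duplicate."""
        message_id = message["message_id"]
        if message_id in self.processed_ids:
            return False  # already handled, do not charge again
        self.charges.append((message["user"], message["amount_cents"]))
        self.processed_ids.add(message_id)
        return True


consumer = PaymentConsumer()
order = {"message_id": "order-505", "user": "dev42", "amount_cents": 2500}

first = consumer.handle(order)
second = consumer.handle(order)  # the broker redelivers the same message
```

The second delivery is recognized and ignored, so the customer is charged once.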

5. Pub/Sub vs. Message Queue (The Confusion)

This is where most devs get tripped up. Both use "Brokers," but the intent is different.

Scenario A: Pub/Sub (The "Megaphone")

Use Case: You launch a new limited-edition Ghost Pepper Burger.
Logic: 1 Message -> Many Receivers.

Publisher: The Menu Manager Service.
Subscribers:
App Notification Service: Pushes "New Burger Alert!" to all users.
Inventory Service: Reserves spicy peppers in the warehouse.
Marketing Service: Posts automatically to Twitter/X.
Why? Everyone needs to know about this event to do their own specific job.

Scenario B: Message Queue (The "Grill Line")

Use Case: It's lunch rush. You have 500 burger orders coming in.
Logic: 1 Message -> 1 Receiver (Load Balancing).

Producer: The Order Service pushes 500 tickets to a Queue.
Consumers (Chefs): You have 5 Chefs (Chef A, B, C, D, E).
The Flow: Chef A grabs Ticket #1. Chef B grabs Ticket #2.
Why? If Chef A is grilling the burger for Ticket #1, you do not want Chef B to grill the same burger. That’s a waste of meat. You want to distribute the work, not broadcast it.
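
The difference between the two scenarios fits in a few lines. This is a simplified sketch: the function names are invented, and the round-robin hand-out in `distribute` is an assumption made to keep the output predictable (a real queue gives the next ticket to whichever consumer is free).

```python
from collections import deque


def broadcast(message: str, subscribers: list[str]) -> dict[str, str]:
    """Pub/Sub: every subscriber receives its own copy of the message."""
    return {name: message for name in subscribers}


def distribute(tickets: list[int], chefs: list[str]) -> dict[str, list[int]]:
    """Message queue: each ticket is taken by exactly one chef."""
    queue = deque(tickets)
    assignments: dict[str, list[int]] = {chef: [] for chef in chefs}
    turn = 0
    while queue:
        chef = chefs[turn % len(chefs)]
        assignments[chef].append(queue.popleft())
        turn += 1
    return assignments


alerts = broadcast(
    "New Ghost Pepper Burger",
    ["app_notifications", "inventory", "marketing"],
)
work = distribute([1, 2, 3, 4, 5, 6], ["chef_a", "chef_b"])
```

`alerts` holds three copies of the same message, while `work` splits the six tickets so that no ticket appears under two chefs.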

The Cheat Sheet

Feature	Pub/Sub	Message Queue
Philosophy	"Everyone needs to know this happened."	"Someone please do this job."
Distribution	Broadcast (One-to-All)	Point-to-Point (One-to-One)
Burger Example	"New Menu Item Launched!" (Tell everyone)	"Grill this patty" (One chef does it)

Final Words

As the owner of DevEats, switching to Pub/Sub saved your Friday night rush. When your Email Service crashed, nobody noticed. The orders kept flowing, the smash burgers kept sizzling, and the emails were just queued up and sent an hour later.

That is the power of decoupling. 🍔 Code responsibly!

The Jungle Protocol: Turning my Professional Journey into an Adaptive Quest

Hrishikesh Dalal — Sun, 01 Feb 2026 13:58:07 +0000

This is a submission for the New Year, New You Portfolio Challenge Presented by Google AI

About Me

I’m Hrishikesh Dalal, a Computer Engineering student at Sardar Patel Institute of Technology (SPIT), Mumbai. I don't just write code; I build experiences. Currently, I’m engineering a Jumanji-themed gamified portfolio that turns the standard resume into an adventure.

My focus lies in Full Stack Development (Next.js, React) and creating developer tools—I recently published envalyze and env-fault to the NPM registry to help devs manage environments better. When I'm not pushing commits to Open Source projects like Drupal or contributing to MDN Web Docs, I'm exploring the intersection of AI and law with projects like VerdictAI.

I write about system design, open source, and the reality of building software.

Portfolio

Link: https://hrishikesh-dalal-portfolio-571902073238.europe-west1.run.app

How I Built It

My Tech Stack

To build an experience this immersive, I needed a stack that balanced performance with high-end visuals:

Frontend: Next.js & Tailwind CSS
Animations: GSAP (for that cinematic feel)
3D Elements: Three.js (for the interactive globe and hero patterns)
AI: Gemini API (powering the NPC-style chatbots)

The Process: From Chaos to Clarity

Building a portfolio is a trap. You want to show off everything, but if you're not careful, you end up with a bloated mess.

1. The Affinity Mapping Phase

Initially, I was drowning in ideas from my UI/UX course. I had a laundry list: a full 3D world, a terminal, hidden easter eggs, business-centric project stats, and a Jumanji-themed game. I used Affinity Mapping to dump every "cool" idea onto a board and then ruthlessly segregated them.

2. The "Reality Check"

I had to ask the most important question: Why is someone here?
Hiring managers are busy. They might be tech-savvy, or they might not be. They might want to play a game, or they might just want to see my resume and leave in 30 seconds.

The Decision: A "Clean & Clear" landing page with a defined Call to Action (CTA) is non-negotiable. The "wow" factor should enhance the info, not hide it.

Design Breakdown: Section by Section

The Hero (The First 5 Seconds)

I treated the Hero section like a movie opening. I implemented a preloader with a cinematic transition that drops you into an interactive dot pattern. It’s meant to grab attention immediately.

The Impact (Big Numbers & Open Source)

Instead of just saying "I build tools," I used big, bold typography to showcase my NPM stats. I wanted the real-world impact of envalyze and env-fault to be unavoidable. I also dedicated space to my Open Source journey—showing my Hacktoberfest badges and my role as a project admin.

The Experience Timeline

I used a scroll-triggered timeline with GSAP micro-interactions. When you hover over an entry, there’s a slight movement—just enough to feel alive, but not enough to be distracting.

The Terminal (For the Geeks)

I built a fully functional terminal for my projects. You can change themes, run an echo, or see animations.

Pro Tip: Try typing sudo. You’ve been warned.

The Accessibility Switch: If you aren't a terminal person, there's a toggle to switch back to a standard grid. No one gets left behind.

The "Jungle" Experience

This was the hardest part to get right. My first version was just a collection of games, and the feedback from friends was brutal: "It’s disconnected. I don't see YOU in this."

I went back to the drawing board and turned it into a narrative. I added:

A Prologue & Epilogue: To give the "quest" a purpose.
Level 2 (The Jumper): While you play, the character narrates my tech stack and experience journey.
Secret Cheat Codes: Because what's a game without them?

Bringing it to life with AI

The main chatbot answers different kinds of queries and suggests sample questions you can ask to get better answers.

The game chatbot, by contrast, answers you like an in-game NPC.

The Architecture

The diagram below shows how the game chatbot is integrated to handle context-aware queries, ensuring it knows where you are in the game and who I am as a developer.

The User Interaction Layer
User Sends Message: The process begins with the user interacting with the interface.
Frontend App (NEXT JS): The core UI is built using React. It features a React Markdown Renderer, which ensures that the AI's responses—often containing code snippets or formatted text—are displayed cleanly and professionally to the user.
Security and Traffic Management
Backend API Gateway & Security Layer: Instead of calling the Gemini API directly from the frontend (which would expose your API keys), the app calls a secure backend gateway. This layer handles authentication and protects your credentials.
Load Balancer: To prevent any single server from becoming overwhelmed, a load balancer sits in the middle. It analyzes incoming requests and routes them efficiently across multiple active instances.
The Processing Core
Chat Model Instances: The system uses multiple parallel instances (Instance 1, 2, and 3) to process requests.
Least Used Routing: The load balancer is shown routing a request specifically to "Instance 1" because it is currently the "Least Used," ensuring optimal performance and low latency.
Google's Gemini API: These instances act as the bridge to Google’s infrastructure. They send the processed user prompt to the Gemini API and receive the raw AI generation in return.
The Response Loop
Forwards Response: Once a chat instance receives the data from Google, it forwards that response back through the secure backend.
Final Display: The message is sent back to the NEXT JS app, where it is rendered as Markdown, completing the loop and appearing on the user's screen.
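
"Least used" routing can be sketched as follows. This is an illustration only, not the portfolio's actual code: the class name is invented, and it assumes "least used" means "fewest requests currently in flight".

```python
class LeastUsedBalancer:
    """Routes each request to the instance with the fewest in-flight requests."""

    def __init__(self, instance_names: list[str]) -> None:
        self.active = {name: 0 for name in instance_names}

    def acquire(self) -> str:
        # min() returns the first minimum, so ties go to the earliest instance.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name: str) -> None:
        # Call this when the instance has finished its request.
        self.active[name] -= 1


balancer = LeastUsedBalancer(["instance-1", "instance-2", "instance-3"])
a = balancer.acquire()   # all idle, so the first instance is picked
b = balancer.acquire()   # instance-1 is busy, so the next idle one is picked
balancer.release(a)      # instance-1 finishes its request
c = balancer.acquire()   # instance-1 is idle again and wins the tie
```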

Easter Eggs

There is a simple game right at the start: press the "?" button to play it.

"Stay Hungry, Stay Foolish." My portfolio ends with this quote because it sums up my approach to engineering. Whether it's deep-diving into the Drupal source code or building a terminal from scratch, I’m always looking for the next thing that feels slightly impossible to build.

What I'm Most Proud Of

If you’re a developer, you know the feeling: you spend 10% of your time on the features people see and 90% of your time on the "over-engineered" systems that make you feel like a wizard. Here are the pillars of this portfolio that I obsessed over.

1. The Terminal (Because real devs use a CLI)

I didn't want a static "Projects" page; I wanted something that felt like home. I built a fully functional, web-based Terminal where you can actually interact with my work.

The Experience: It’s a working CLI where you can navigate projects, change themes, or trigger animations.
The "Sudo" Trap: I’ve hidden easter eggs for people who like to poke around—try typing sudo.
The Hybrid UI: I know recruiters aren't always looking to type commands, so I built a "safety net" toggle that instantly switches the terminal into a standard project grid.

2. Projects with a "Business First" Logic

Most portfolios just show a screenshot and a GitHub link. I decided to treat my projects like real products.

Beyond the Code: When you click into a project, I don't just list the tech stack. I break down the Business Perspective, the Impact, and the Stats.
The "Why" Factor: I included the reasoning behind my architectural choices so that a non-technical stakeholder can understand the value I created, not just the code I wrote.
Deep Dives: Each project page is designed to show the journey from a pain point to a scalable solution.

3. The Jumanji Game Engine

I turned my professional experience into a narrative quest. This wasn't just about putting a game on a website; it was about making the resume playable.

The Narrative: I added a full prologue and epilogue to make sure it felt like a real quest, not just a random mini-game.
The Career Jumper: In Level 2, the character literally narrates my tech stack and my journey through different roles while you play.
Recruiter Mode: I even made the game adaptive—a recruiter can upload a job description, and the game will fine-tune the information it highlights to match what they need.

4. The Chatbot Architecture

I’m most proud of what’s happening "under the hood" here. I built a production-ready pipeline for the Gemini-powered Chatbot.

The Load Balancer: My system uses three different chat model instances. I implemented a load balancer that routes traffic to the "least used" instance to keep things snappy, no matter the traffic.
Security & Scalability: I kept the logic on the backend. The React frontend talks to a secure API Gateway, which handles the Gemini API calls to keep my keys safe.
The NPC Persona: I used the Gemini API to give the bot a specific "In-game NPC" persona—it knows it’s in the Jungle, but it also knows my GitHub stats.

System Design: EP 14 - The Coffee Shop Guide to Understanding Message Brokers

Hrishikesh Dalal — Wed, 21 Jan 2026 11:17:47 +0000

If you are a developer moving from a monolith to microservices, you’ve probably hit The Problem. Service A calls Service B, but Service B is slow, or down, or just busy. Suddenly, Service A hangs, the user sees a spinning wheel of death, and you get paged at 3 AM.

The solution? Stop making your services talk directly. Introduce a Message Broker.

If that sounds abstract, let's talk about coffee.

The "Bad" Coffee Shop (Synchronous)

Imagine a coffee shop where the person taking your order is also the only person allowed to make your coffee.

You (The User) walk in and order a latte.
The Cashier (Service A) takes your money.
The Cashier then walks over to the espresso machine, steams the milk, pulls the shot, pours the art, and hands it to you.
Only then do they return to the register to take the next person's order.

The Result: A massive line out the door. If the espresso machine breaks, the cashier is stuck waiting, and nobody can even place an order. This is Synchronous Communication (like HTTP REST calls). It creates tight coupling and bottlenecks.

The "Good" Coffee Shop (Asynchronous with a Broker)

Now, let's look at how a Starbucks or a high-volume cafe actually works.

You (The User) order a latte.
The Cashier (Service A / Producer) takes your money and writes your order on a cup (or a ticket).
The Cashier places the cup on the Counter/Ticket Rail (The Message Broker).
The Cashier immediately turns back to you and says, "Next please!"
The Barista (Service B / Consumer) sees the cup on the rail, picks it up, and makes the coffee.

The Result: The cashier never stops working. They can take 50 orders in the time it takes the barista to make 5 drinks. The "Ticket Rail" acts as a buffer.

The Technical Translation

That "Ticket Rail" is your Message Broker (like RabbitMQ, Kafka, or AWS SQS). Here is the mapping for your next system design interview:

1. The Producer (The Cashier)

This is the service that sends the data. In the coffee shop, the cashier "produces" the order. They don't care who makes the coffee or when it gets made; they just need to know the order has been safely placed on the rail.

Dev Term: Publish

2. The Message Broker (The Ticket Rail)

This is the middleware that sits between your services. It holds the messages (orders) until a consumer is ready to take them.

The Queue: The line of cups waiting on the rail. If the baristas get overwhelmed, the queue grows, but the cups aren't lost. They are safely stored until the backlog clears.

3. The Consumer (The Barista)

This is the service that processes the data. The barista "consumes" the order from the rail.

Dev Term: Subscribe
Scaling: If the line of cups gets too long, what do you do? You don't clone the cashier; you add a second barista. Message brokers allow you to scale your processing power (Consumers) independently of your intake power (Producers).
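
Here is the coffee shop as a small runnable sketch, using Python's standard `queue` and `threading` modules as a stand-in for a real broker. The barista names and the `None` "closing time" sentinel are choices made for this example.

```python
import queue
import threading

orders = queue.Queue()   # the ticket rail
made = []                # (barista, order_id) pairs
made_lock = threading.Lock()


def barista(name: str) -> None:
    while True:
        order = orders.get()
        if order is None:        # sentinel: the shop is closing
            orders.task_done()
            break
        with made_lock:
            made.append((name, order))
        orders.task_done()


# Scaling the consumers means starting more barista threads.
baristas = [
    threading.Thread(target=barista, args=(f"barista-{i}",)) for i in range(2)
]
for worker in baristas:
    worker.start()

# The cashier only puts cups on the rail and never waits for a drink.
for order_id in range(1, 11):
    orders.put(order_id)

for _ in baristas:
    orders.put(None)             # one closing signal per barista
orders.join()
for worker in baristas:
    worker.join()
```

Every order is made exactly once, no matter which barista picks it up. Adding a third barista is a one-character change to `range(2)`; the cashier code does not change at all.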

Why use a Message Broker?

Using the coffee shop model, the benefits become obvious:

Decoupling: The cashier doesn't need to know which barista is working today. If the barista quits (Service B crashes), the cashier can keep taking orders. The cups just pile up on the rail until a new barista is hired (Service B restarts). No data is lost.
Throttling (Load Leveling): If 100 people rush the store at once, the cashier takes all the orders quickly. The baristas continue working at their normal, safe speed. They don't explode from stress; they just work through the backlog.
Asynchronous Processing: Heavy tasks (like roasting beans or baking pastries) don't block the customer from paying and leaving.

Summary

Direct HTTP calls are like a cashier making every drink themselves. It works for a lemonade stand, but it fails at scale.

A Message Broker is the ticket rail that lets your services work at their own speeds, independently and reliably.

So, the next time you're architecting a system, ask yourself: Are we building a lemonade stand, or are we building a coffee empire?

System Design - EP 13: Content Delivery Networks (CDNs)

Hrishikesh Dalal — Fri, 16 Jan 2026 16:11:48 +0000

We’ve all been there. You click a link, and the page hangs. You stare at the loading spinner, your patience draining with every second. In 2026, a slow website isn't just annoying; it’s a dealbreaker.

As developers, we know Speed = Revenue. But how do you make a website fast for a user in Tokyo when your server is sitting in a basement in New York?

Enter the CDN (Content Delivery Network).

If you've ever nodded along while someone said "just put it behind Cloudflare" but were secretly fuzzy on the details, this article is for you. Let's break down CDNs using a scenario we can all understand: The Global Pizza Empire.

The Scenario: "Uncle Tony's Pizza"

Imagine you run the world's best pizza shop, Uncle Tony's, located in New York City. Your pizza is legendary. People fly in from all over the world just to get a slice.

In technical terms, your New York shop is the Origin Server. It is the single source of truth where the pizza (content) is created.

The Problem: The Latency Delivery

One day, you decide to offer global delivery.

Customer A lives in Brooklyn (5 miles away). They get their pizza hot and fresh in 20 minutes.
Customer B lives in London (3,500 miles away). You have to bake the pizza in NY, put it on a supersonic jet, and fly it to London. It arrives 6 hours later, cold and soggy.

This delay is Latency. In the web world, even if your server (New York Kitchen) is super fast, the physical distance to the user (London Customer) creates unavoidable lag.

The Solution: The Franchise Model (The CDN)

You realize you can't defy physics. You can't make the jet faster. So, you change your strategy. You open small, reheating stations in major cities around the world: London, Tokyo, Mumbai, and Sydney.

These stations are your Edge Servers. They don't have the full kitchen setup of the NYC headquarters, but they are capable of holding inventory.

How the "Caching" Works

You can't bake every single unique pizza at these small stations. Instead, you look at your data. You realize 80% of people just order Pepperoni or Cheese.

The First Request (Cache Miss): A guy in London orders a Pepperoni pizza. The London station is empty. They call the NY HQ, get a Pepperoni pizza flown over, and deliver it. It’s slow, but now the London station keeps a stash of 50 frozen Pepperoni pizzas in their freezer.
The Second Request (Cache Hit): Ten minutes later, a girl in London orders a Pepperoni pizza. The London station checks its freezer. Bingo! They grab one, heat it up, and deliver it in 10 minutes. They didn't even call New York.
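
The miss-then-hit sequence looks like this in code. It is a deliberately tiny model: `EdgeCache` and `origin` are hypothetical names, and the cache here never expires anything.

```python
class EdgeCache:
    """Toy edge server: serve from the local store, fall back to the origin."""

    def __init__(self, fetch_from_origin) -> None:
        self._fetch = fetch_from_origin
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> str:
        if path in self._store:
            self.hits += 1            # cache hit: no call to the origin
            return self._store[path]
        self.misses += 1              # cache miss: go all the way to the origin
        body = self._fetch(path)
        self._store[path] = body      # keep a copy for the next customer
        return body


origin_calls: list[str] = []


def origin(path: str) -> str:
    origin_calls.append(path)
    return f"<contents of {path}>"


london = EdgeCache(origin)
first = london.get("/img/pepperoni.png")    # miss, fetched from the origin
second = london.get("/img/pepperoni.png")   # hit, served locally
```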

The Theory: Key Concepts in CDNs

Now that we understand the pizza model, let's look at the actual technical terms you'll need for your system design interviews or cloud configs.

1. Edge Servers

These are geographically distributed servers where CDN providers (like Cloudflare, Akamai, or Fastly) cache your content. Users are routed to the nearest edge server for faster delivery.

Pizza Translation: The local reheating stations in London or Tokyo.

2. Origin Server

This is your main web server (e.g., an AWS EC2 instance or S3 bucket). The CDN fetches content from here only if it’s not already cached at the edge server.

Pizza Translation: The main Uncle Tony's Kitchen in New York.

3. Caching

CDNs store copies of your static content (e.g., images, videos, CSS, HTML) in their edge servers. Cached content is served directly to users, reducing the need for frequent requests to the origin server.

Pizza Translation: Storing frozen Pepperoni pizzas in the local freezer so you don't have to bake a fresh one every time.

4. TTL (Time to Live)

The duration for which a file is cached on a CDN edge server.
Example: An image might have a TTL of 24 hours. This means it stays cached for 24 hours; after that, the Edge Server treats it as stale and, on the next request, fetches (or revalidates) a fresh copy from the Origin.

Pizza Translation: The expiration date on the frozen pizza. You throw it out after 24 hours to ensure quality.
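
A TTL is just an expiry timestamp stored next to the cached copy. The sketch below assumes a 24-hour TTL and uses a fake clock so the expiry can be shown without waiting a day; the class and variable names are made up for the example.

```python
class TTLCache:
    """Toy cache whose entries expire ttl_seconds after they were fetched."""

    def __init__(self, fetch, ttl_seconds: float, clock) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[str, float]] = {}  # key -> (value, expires_at)

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]                      # still fresh
        value = self._fetch(key)                 # missing or stale: refetch
        self._entries[key] = (value, now + self._ttl)
        return value


fake_now = [0.0]          # seconds; mutated below to simulate time passing
fetch_count = [0]


def fetch(key: str) -> str:
    fetch_count[0] += 1
    return f"{key} v{fetch_count[0]}"


cache = TTLCache(fetch, ttl_seconds=24 * 3600, clock=lambda: fake_now[0])
at_start = cache.get("logo.png")       # fetched from the origin
fake_now[0] = 3600.0
after_1h = cache.get("logo.png")       # still within the TTL
fake_now[0] = 24 * 3600.0
after_24h = cache.get("logo.png")      # TTL reached, fetched again
```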

5. GeoDNS

CDNs use GeoDNS to route users to the nearest edge server based on their geographic location. When a user looks up your domain, the CDN's DNS server estimates their location from the IP address of the incoming query and says, "You're in Germany? Okay, talk to the Frankfurt server, not the New York one."

Pizza Translation: The call center routing your order to the nearest branch automatically.
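
The "nearest branch" decision can be approximated with plain geometry. This is only a rough sketch: real GeoDNS relies on IP geolocation databases and network conditions rather than straight-line distance, and the edge list and coordinates below are approximate values chosen for illustration.

```python
from math import asin, cos, radians, sin, sqrt

# Approximate (latitude, longitude) of three hypothetical edge locations.
EDGES = {
    "new_york": (40.71, -74.01),
    "frankfurt": (50.11, 8.68),
    "tokyo": (35.68, 139.69),
}


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))


def nearest_edge(user_location: tuple[float, float]) -> str:
    return min(EDGES, key=lambda name: haversine_km(user_location, EDGES[name]))


berlin_user = nearest_edge((52.52, 13.40))
osaka_user = nearest_edge((34.69, 135.50))
```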

Why this Architecture Wins

1. Speed (Performance)

Your users in Tokyo aren't downloading images from New York anymore; they are downloading them from a server in Shinjuku. The data travels fewer miles, reducing the Time to First Byte (TTFB).

2. Scalability (Traffic Spikes)

Imagine it's Super Bowl Sunday. Everyone calls New York at once. The phone lines jam. The kitchen catches fire.
With the Franchise model (CDN), the New York kitchen only handles the weird custom orders. The millions of standard Pepperoni orders are handled by the thousands of local branches. Your main kitchen survives.

3. Security (DDoS Protection)

Imagine a competitor hires a million bots to prank call your New York Pizza shop so no real customers can get through.
With a CDN, these prank calls hit the thousands of local branches first. The branches are smart—they realize it's a prank and hang up before the call ever reaches the New York HQ.

System Design EP: 12 - Why Your Database Hates Your Images: A Guide to BLOBs

Hrishikesh Dalal — Wed, 14 Jan 2026 14:34:27 +0000

Ummm, BLOBs. Everyone has heard the term at some point in their development journey, but what are they?

In system design, a BLOB (Binary Large Object) is any unstructured data that doesn't fit neatly into a database row. Unlike integers or strings, the database engine doesn't "understand" this data; it just sees a massive chunk of bytes.

Examples: User avatars (JPEG), video files (MP4), PDFs, Application Logs, Backup files.

Size Range: From a few KBs (thumbnails) to TBs (genome sequencing data).

So how would you store an MP4 file in, say, MySQL? You can't drop the file itself into a row, but you can store its raw bytes (the 0s and 1s) in a BLOB column.

A single "definitive" resource on BLOBs in System Design is hard to find, so most engineers end up piecing together chapters from different sources.

What follows is a synthesis of the industry-standard approach.

The Great Debate: Database vs. Object Storage

The most common interview question and real-world decision is: "Should I store user uploads in my SQL database or on disk?"

Option A: Storing BLOBs in the Database

You store the image directly in a column (e.g., a BLOB/LONGBLOB column in MySQL or BYTEA in PostgreSQL).

Pros: ACID compliance (the image and the user profile update commit together), easy backups (one dump file).
Cons:
Performance Suicide: Databases are optimized for small, random reads/writes. Reading a 10MB image consumes the same I/O as reading thousands of user rows.
Cost: Block storage (SSD for DBs) is 3-5x more expensive than Object Storage.
Scalability: You cannot easily cache database responses at the edge (CDN) compared to static URLs.

Option B: Storing BLOBs in Object Storage (The Standard)

You store the actual file in a service like Amazon S3, Google Cloud Storage, or Azure Blob, and store only the reference URL in your database.

Pros: Infinite scalability, cheaper storage tiers, built-in redundancy, easy integration with CDNs.
Cons: Loose consistency (you might delete the database row but forget to delete the S3 file, creating "orphan" data).

The Verdict: 99% of the time, use Object Storage (Option B). Only use Option A if your files are tiny (<20KB), strictly transactional, and security is paramount (e.g., encryption keys or sensitive legal docs that must never leak via a public URL).

The Architecture Pattern

The standard design pattern for handling blobs involves decoupling metadata from storage.

The Workflow:

Client Request: User uploads a profile picture.
API Gateway: Authenticates the user.
Presigned URL: Instead of uploading to your server (which blocks your server's threads), your backend generates a "Presigned URL" from S3. This authorizes the client to upload directly to the bucket for a limited time.
Direct Upload: Client uploads the binary data directly to S3/Blob Storage.
Confirmation: S3 returns a success code. The client notifies your backend: "Upload Complete."
Metadata Save: Your backend saves the file path (e.g., s3://my-bucket/users/123/avatar.png) into the SQL database users table.
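
The idea behind a presigned URL is that the backend signs "this object key, valid until this time" with a secret, and the storage side checks that signature before accepting the upload. The sketch below shows that idea with a plain HMAC. It is a simplified stand-in, not S3's actual signing scheme: the secret, the URL layout, and the function names are all assumptions. In a real setup you would ask your storage provider's SDK to generate the URL.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-secret"  # hypothetical; known only to backend and storage


def presign(object_key: str, expires_at: int) -> str:
    """Backend side: build an upload URL that is valid until expires_at."""
    payload = f"{object_key}:{expires_at}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"/upload/{object_key}?expires={expires_at}&sig={signature}"


def verify(object_key: str, expires_at: int, signature: str, now: int) -> bool:
    """Storage side: accept the upload only if unexpired and correctly signed."""
    if now > expires_at:
        return False
    payload = f"{object_key}:{expires_at}".encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


url = presign("users/123/avatar.png", expires_at=1_000_300)
signature = url.split("sig=")[1]

ok = verify("users/123/avatar.png", 1_000_300, signature, now=1_000_000)
expired = verify("users/123/avatar.png", 1_000_300, signature, now=1_000_301)
wrong_key = verify("users/999/avatar.png", 1_000_300, signature, now=1_000_000)
```

The client never sees the secret. It only receives a URL that works for one object key and stops working after the expiry time.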

EP 11: The Legend of ShopStream: The In-Memory Revolution

Hrishikesh Dalal — Tue, 13 Jan 2026 10:15:59 +0000

Akash had fixed his database latency problems by using Redis as a cache. ShopStream was fast. But Akash was about to learn a hard lesson: Redis is not just a "dumb" cache. It is a Ferrari engine that he was using to drive to the grocery store.

Chapter 1: The Day the Lights Went Out (Persistence)

One stormy night, the power failed at the data center. The Redis server rebooted.

When the lights came back on, Akash checked the dashboard.

User Sessions: Gone. Everyone was logged out.
Shopping Carts: Empty.
Revenue Impact: Massive.

Akash panicked. "I thought Redis stores data in RAM! If RAM loses power, data is gone!"
He realized he needed Persistence. He had two options:

Option A: The Photographer (RDB - Redis Database Backup)

Akash configured Redis to take a "Snapshot" every hour.

How it works: Every hour, Redis forks a background process to save all data to a file (dump.rdb) on the hard disk.
The Trade-off: It’s compact and fast to restore. But, if the server crashes at 4:59 PM, he loses 59 minutes of data since the last snapshot at 4:00 PM.

Option B: The Stenographer (AOF - Append Only File)

This wasn't enough for "Shopping Carts." So Akash turned on AOF.

How it works: Every time a command runs (e.g., SET cart:123 "Apple"), Redis logs that command to a file on the disk immediately.
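The Trade-off: Almost no data loss! How much you can lose depends on the appendfsync setting; with the default (everysec), it is at most about one second of writes. But the file grows huge over time because it records every change, so Redis has to rewrite it periodically to compact it.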

The Solution: Akash used a hybrid approach. He used AOF for critical data (carts/sessions) and RDB for less critical cache data to balance speed and safety.
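
The difference between the two options is easy to see in a toy model. This is not how Redis implements persistence internally; `ToyStore` is a made-up class that keeps the "snapshot" and the "append-only file" in memory just to show what each one can recover after a crash.

```python
import json


class ToyStore:
    """Toy key-value store with a snapshot (RDB-like) and a command log (AOF-like)."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}
        self.snapshot = "{}"                         # stand-in for dump.rdb
        self.aof: list[tuple[str, str, str]] = []    # stand-in for the append-only file

    def set(self, key: str, value: str) -> None:
        self.data[key] = value
        self.aof.append(("SET", key, value))         # log every write as it happens

    def save_snapshot(self) -> None:
        self.snapshot = json.dumps(self.data)        # point-in-time copy of everything

    def restore_from_snapshot(self) -> dict[str, str]:
        return json.loads(self.snapshot)

    def restore_from_aof(self) -> dict[str, str]:
        rebuilt: dict[str, str] = {}
        for _command, key, value in self.aof:        # replay the log in order
            rebuilt[key] = value
        return rebuilt


store = ToyStore()
store.set("cart:123", "Apple")
store.save_snapshot()              # the 4:00 PM snapshot
store.set("cart:456", "Burger")    # written at 4:59 PM, just before the crash

from_snapshot = store.restore_from_snapshot()
from_log = store.restore_from_aof()
```

The snapshot only knows about the first cart, while replaying the log brings back both.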

Chapter 2: The Data Structure Buffet (Beyond Strings)

ShopStream was growing. Akash was building a "Live Leaderboard" for the most active shoppers.

The "Newbie" Way:
Akash was storing the leaderboard as a JSON string in Redis:
'[{"user": "Alice", "score": 10}, {"user": "Bob", "score": 5}]'
Every time Alice bought something, Akash had to:

GET the whole JSON string.
Decode it in his backend code.
Update Alice’s score.
Sort the array again.
SET the whole JSON string back to Redis.

This was slow and caused "Race Conditions" (two users updating at once).

The "Pro" Way (Redis Data Structures):
Akash realized Redis isn't just a Key-Value store; it's a Data Structure Server.

Sorted Sets (The Leaderboard Fix): He used the ZSET data type.
Command: ZADD leaderboard 10 "Alice"
Command: ZINCRBY leaderboard 5 "Alice"
Magic: Redis automatically kept the list sorted in RAM. To get the top 3 users, Akash just ran ZREVRANGE leaderboard 0 2. No sorting code needed in the application.
Hashes (The User Profile):
Instead of storing a user as a big blob of text, he used HASH.
Command: HSET user:101 name "Akash" email "a@test.com"
Benefit: He could update just the email without reading/writing the whole user object.
Lists (The Job Queue):
When a user bought an item, the system needed to send an email. Instead of making the user wait, Akash pushed the "Send Email" task into a Redis LIST.
Command: LPUSH email_queue '{"user_id": 101}'
A background worker simply monitored the list (BRPOP) and processed emails instantly.
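Here is a rough sketch of the leaderboard flow in Python. The helper functions only call `zadd`, `zincrby`, and `zrevrange`, which are the method names the redis-py client uses for the commands above. To keep the sketch runnable without a server, it ships with `InMemorySortedSets`, a hypothetical stand-in that mimics just those three calls; `record_purchase` and `top_shoppers` are illustrative names too.

```python
class InMemorySortedSets:
    """Tiny stand-in mimicking three sorted-set calls, for local runs only."""

    def __init__(self):
        self._sets = {}

    def zadd(self, name, mapping):
        self._sets.setdefault(name, {}).update(mapping)

    def zincrby(self, name, amount, value):
        scores = self._sets.setdefault(name, {})
        scores[value] = scores.get(value, 0) + amount
        return scores[value]

    def zrevrange(self, name, start, end, withscores=False):
        # Only non-negative indexes and end == -1 are handled in this sketch.
        scores = self._sets.get(name, {})
        ranked = sorted(scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
        stop = None if end == -1 else end + 1  # the end index is inclusive
        chosen = ranked[start:stop]
        return chosen if withscores else [member for member, _ in chosen]


def record_purchase(client, user, points):
    """Bump a shopper's score in one step, with no read-modify-write cycle."""
    return client.zincrby("leaderboard", points, user)


def top_shoppers(client, n):
    """Return the n highest-scoring shoppers, best first."""
    return client.zrevrange("leaderboard", 0, n - 1, withscores=True)


client = InMemorySortedSets()  # assumption: swap in a real Redis client in production
client.zadd("leaderboard", {"Alice": 10, "Bob": 5, "Carol": 12})
record_purchase(client, "Alice", 5)
print(top_shoppers(client, 2))
```

Because the increment happens inside the store, two purchases arriving at the same moment cannot overwrite each other the way two GET/SET round trips on a JSON string can.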

Chapter 3: The Traffic Jam (Single Threaded Nature)

One day, the app froze. CPU usage on the Redis server was at 100%.

Akash looked at the logs. A junior developer had run a command: KEYS * (Give me ALL keys).
Because ShopStream had 10 million keys, Redis had to iterate through every single one.

The Critical Lesson:

Redis is Single-Threaded. It processes one command at a time. It is incredibly fast (handling 100,000+ ops/sec), but if one command takes 1 second (like KEYS *), all other 99,999 requests are blocked waiting in line.

The Fix: Akash banned the KEYS command and used SCAN (which reads keys in small batches) to prevent blocking the main thread.
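The idea behind SCAN can be sketched with a toy cursor in Python. This is a simplification: in real Redis the cursor is an opaque number rather than a list index, the batch size is only a hint, and a key can occasionally be returned more than once. `toy_scan` and `iterate_all` are made-up names.

```python
def toy_scan(keys, cursor, count):
    """Return (next_cursor, batch); a next_cursor of 0 means the scan is done."""
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0
    return next_cursor, batch


def iterate_all(keys, count=3):
    """Walk the whole keyspace in small batches instead of one blocking call."""
    cursor, seen, round_trips = 0, [], 0
    while True:
        cursor, batch = toy_scan(keys, cursor, count)
        seen.extend(batch)
        round_trips += 1  # other clients' commands can run between round trips
        if cursor == 0:
            break
    return seen, round_trips


keys = [f"user:{i}" for i in range(10)]
seen, round_trips = iterate_all(keys, count=3)
print(len(seen), round_trips)
```

Each call does a small, bounded amount of work, so the single thread is free to serve other requests in between. The total work is the same as KEYS *, it is just no longer done in one uninterruptible chunk.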

Chapter 4: Too Big to Fail (Sentinel vs. Cluster)

The data grew to 500GB. But the server only had 64GB of RAM. The server crashed with an OOM (Out of Memory) error.

Akash faced the ultimate architectural choice: Scale Up or Scale Out?

Scenario A: High Availability (Sentinel)

Problem: "If my main Redis server dies, the app dies."
Solution: Akash set up Redis Sentinel.
Architecture: 1 Master Node (Write), 2 Slave Nodes (Read).
How it works: Sentinel watches the Master. If the Master dies, Sentinel automatically votes for one of the Slaves to become the new Master.
Limit: The total data size is still limited to the RAM of one node (64GB).

Scenario B: Infinite Scale (Redis Cluster)

Problem: "I have 500GB of data. I need more RAM."
Solution: Akash migrated to Redis Cluster.
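How it works: He bought 10 servers (nodes). Redis automatically split the data across them using Sharding.
Every key is hashed into one of 16,384 hash slots (CRC16 of the key, modulo 16384).
Each node owns a subset of those slots, and slots can be moved between nodes when servers are added or removed.
The Magic: The application code doesn't need to track which node has the data. A cluster-aware client library asks a node, and if the key lives elsewhere, that node replies with a redirect (MOVED) that the client follows to the right node.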
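An alphabetical split is the easy way to picture sharding, but Redis Cluster actually assigns each key to one of 16,384 hash slots using CRC16 of the key modulo 16384. A minimal Python sketch of that mapping follows. It relies on `binascii.crc_hqx` with an initial value of 0, which produces the same CRC16 variant (the check value for "123456789" is 0x31C3 in both). The function names are illustrative.

```python
import binascii

TOTAL_SLOTS = 16384


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that gets hashed.

    If the key contains a non-empty {...} section, only the text inside the
    first such section is hashed, so related keys can share a slot.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Map a key to its hash slot: CRC16(key) mod 16384."""
    data = hash_tag(key.encode("utf-8"))
    return binascii.crc_hqx(data, 0) % TOTAL_SLOTS


for key in ["cart:123", "{user1000}.following", "{user1000}.followers"]:
    print(key, key_slot(key))
```

The two `{user1000}` keys land in the same slot, which is how multi-key operations on related data stay possible in a cluster.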

The Moral of the Story

Akash learned that Redis is the "Swiss Army Knife" of backend engineering.

Persistence: Use AOF if you can't lose data; RDB for backups.
Logic: Don't pull data to your code to sort it. Push your data into Sorted Sets or Lists and let Redis do the work.
Performance: Never block the single thread.
Scaling: Use Sentinel for reliability, use Cluster for massive data size.

Summary Cheat Sheet

Feature	Concept	Use Case
String	Basic Key-Value	Caching HTML pages, Sessions.
List	Linked List	Message Queues, Timelines (Twitter feed).
Set	Unordered Unique	Storing Followers, IP Whitelists (fast lookups).
Sorted Set (ZSet)	Sorted by Score	Leaderboards, Priority Queues.
Hash	Field-Value Map	Storing Objects (User Profiles, Product Details).
Pub/Sub	Radio Station	Real-time Chat, Notification Systems.
TTL (Time to Live)	Expiry	Auto-deleting OTPs or Cache data.

EP 10: The Legend of ShopStream: The NFS (Caching)

Hrishikesh Dalal — Mon, 12 Jan 2026 07:06:34 +0000

Akash, the founder of ShopStream, was feeling good. He had broken his Monolith into Microservices and set up a smart Load Balancer to handle the traffic. Ideally, the app should have been flying.

But it wasn't.

Users were complaining again. "The 'Trending Blogs' page takes forever to load!" they screamed in the reviews.

Akash looked at the metrics. The servers were fine. The network was fine. The bottleneck was the Database.

Chapter 1: The Slow Librarian (The Problem)

Every time a user opened the "Trending Blogs" page, the system did this:

Fetch Data: It asked the MongoDB database for the top 10 blogs. (Time: 500ms)
Calculate: The backend formatted the dates and sorted the list. (Time: 100ms)
Response: The user finally got the page. (Total Time: 600ms)

This 600ms delay happened for every single user. If 10,000 people visited the page, the poor database had to answer the exact same question 10,000 times. It was like a librarian running to the basement to fetch the same book over and over again.

Akash realized: "Why am I calculating this every time? The trending blogs don't change every second!"

Chapter 2: The Cheat Sheet (Enter Caching)

Akash introduced a new layer to his architecture: The Cache.

He set up a Redis instance—a super-fast, in-memory storage. It acted like a "Cheat Sheet" or a sticky note on the librarian's desk.

The New Workflow:

User A visits the page. The system checks the Cache. It's empty (Cache Miss).
The system goes to the Database (500ms) + Calculates (100ms) = 600ms.
Crucially, before sending the response, the system saves a copy of this final result in Redis.

The Magic:

User B visits the page 1 second later.
The system checks the Cache. It finds the data! (Cache Hit).
It serves the data directly from RAM. Total Time: 20ms.

Akash had just reduced the load time by roughly 97% (from 600ms to 20ms). The database stopped sweating, and the users were happy.

Chapter 3: The Ghost of Old Data (Cache Invalidation)

Monday morning, Akash posted a breaking news blog: "50% OFF SALE!"

He refreshed the app. The new blog wasn't there.
He refreshed again. Still the old list.

The Problem: The Cache was doing its job too well. It was still serving the "Trending Blogs" list it saved yesterday. It didn't know a new blog existed.

Akash learned about Cache Invalidation. He had to tell the cache when to update. He implemented two strategies:

The "Time to Live" (TTL): He told Redis: "Only keep this data for 24 hours."
Day 1: Cache serves fast data.
Day 2 (24 hrs later): Redis automatically deletes the data.
First Request: The system is forced to go back to the Database, get fresh data (including the new blog), and save it again.
Explicit Invalidation:
For critical things like "Product Price," 24 hours is too long. Akash wrote code so that whenever he updated a price in the admin panel, the system immediately nuked that specific key from the cache.
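The read path and both invalidation strategies can be sketched with a small in-memory cache in Python. This is a stand-in for Redis, not a client for it: `TTLCache`, `get_trending`, and `query_trending_from_db` are hypothetical names, and the clock is injectable so a "day" can pass instantly.

```python
import time


class TTLCache:
    """Minimal in-memory cache with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired entries behave like a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._entries.pop(key, None)  # explicit invalidation


db_calls = 0


def query_trending_from_db():
    """Pretend this is the slow 600ms database query."""
    global db_calls
    db_calls += 1
    return ["blog-1", "blog-2", "blog-3"]


def get_trending(cache, ttl_seconds=24 * 60 * 60):
    cached = cache.get("trending_blogs")
    if cached is not None:
        return cached                       # cache hit
    fresh = query_trending_from_db()        # cache miss: go to the database
    cache.set("trending_blogs", fresh, ttl_seconds)
    return fresh


now = [0.0]
cache = TTLCache(clock=lambda: now[0])
get_trending(cache)               # miss, hits the database
get_trending(cache)               # hit, served from memory
now[0] += 24 * 60 * 60            # 24 hours pass
get_trending(cache)               # entry expired, hits the database again
cache.delete("trending_blogs")    # a new blog was published
get_trending(cache)               # forced refresh
print(db_calls)
```

Four page loads cost three database queries here only because the example is tiny. With thousands of visitors between expirations, nearly every request is a hit.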

Chapter 4: The Layers of Speed (Types of Caching)

Akash realized he could cache things in more places than just the server. He built a "Defense in Depth" strategy for speed.

1. Client-Side Cache (The User's Pocket)

Akash realized the browser was downloading the logo.png and style.css every time a user refreshed.

The Fix: He told the user's browser (via HTTP headers): "Keep this logo for a year. Don't ask me for it again."
Result: Zero network requests. Instant load.

2. CDN Cache (The Delivery Trucks)

ShopStream had users in London, but the main server was in Mumbai. Light takes time to travel that distance.

The Fix: Akash used a CDN (Content Delivery Network) like Cloudflare. He stored copies of his static files (images, videos) on servers all over the world.
Result: A user in London downloaded the images from a London server, not Mumbai.

3. Server-Side Cache (The RAM)

This was his Redis setup.

The Fix: Storing the results of heavy database queries or complex calculations in memory.
Result: The database could relax.

4. Application-Level Cache

Inside the code, Akash had a complex function that calculated shipping costs based on weight and distance.

The Fix: He used a local variable (memoization) to store the result of calculateShipping(10kg, 5km). If the code saw those inputs again, it just returned the saved number.
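In Python, one common way to get this behaviour is `functools.lru_cache` from the standard library. The pricing formula below is invented purely for illustration; the article does not say how ShopStream computes shipping.

```python
from functools import lru_cache

calls = 0  # counts how often the real calculation runs


@lru_cache(maxsize=1024)
def calculate_shipping(weight_kg: float, distance_km: float) -> float:
    global calls
    calls += 1
    # Hypothetical pricing rule: base fee plus per-kg and per-km charges.
    return round(2.0 + 0.5 * weight_kg + 0.1 * distance_km, 2)


first = calculate_shipping(10, 5)    # computed
second = calculate_shipping(10, 5)   # returned from the in-process cache
print(first, second, calls)
```

This only works because the result depends on nothing but the inputs. If the price also depended on, say, a fuel surcharge that changes daily, the cached value would go stale just like the trending list did.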

The Moral of the Story

By the end of the month, Akash looked at his dashboard.

Performance: Latency dropped from 600ms to 60ms.
Cost: He actually downgraded his database server because it had so little work to do, saving money.
Scalability: When traffic spiked, the Cache absorbed the hits, protecting the fragile database.

EP 9: The Legend of ShopStream: The Gatekeeper Chronicles

Hrishikesh Dalal — Sun, 11 Jan 2026 04:42:10 +0000

Akash, the founder of ShopStream, had successfully split his massive Monolith into Microservices. The app was flexible, and the code was clean.

But success brought a new problem: Traffic.

Chapter 1: The Crush (Why We Need a Load Balancer)

It was the day of the "Mega Summer Sale." Millions of users flooded ShopStream to buy discounted headphones.

Akash had prepared for this. He had spun up 10 identical servers for the Checkout Service. He thought he was safe.

But then, disaster struck.

All the users were trying to connect to Server #1 because that was the only IP address the app knew. Server #1 caught fire (metaphorically) and crashed. The other 9 servers sat idle, doing absolutely nothing, while users stared at error pages and timeouts.

Akash realized he didn't need just more servers; he needed a Traffic Cop.

He introduced a Load Balancer.

The Concept:
Imagine a popular nightclub. Inside, there are 5 bartenders (Servers). If everyone rushes to the first bartender, chaos ensues.
The Load Balancer is the Host at the door.

The Host stops you at the entrance.
He looks inside to see which bartender is free.
He points you to Bartender #3.
If Bartender #2 passes out (Server Crash), the Host stops sending people to him. (This is called a Health Check).

Suddenly, the traffic was spread evenly. No single server died. The system survived.

Chapter 2: The Decision Logic (Algorithms)

Now that Akash had a "Host" at the door, he had to teach it how to distribute the guests. This is where Load Balancing Algorithms came in.

1. The "Round Robin" (Taking Turns)
At first, Akash told the Host: "Just send them in order. One for Server A, one for Server B, one for Server C."

Pros: Simple and fair.
Cons: It didn't account for strength. Server A was a powerful beast, while Server C was an old laptop. Server C got overwhelmed quickly.

2. The "Least Connections" (The Smart Observer)
Akash changed the rule: "Look at who is busy. Send the new user to the server with the fewest active customers."

Pros: Perfect for when user sessions have different lengths (e.g., one user buys quickly, another browses for hours).
Cons: Slightly more complex to calculate.

3. The "IP Hash" (The Sticky Memory)
Some users complained: "I put an item in my cart on Server A, but on my next click, the Load Balancer sent me to Server B, and my cart was empty!"
Akash switched to IP Hashing. The Host now remembered the user's face (IP Address). "Ah, I remember you. You go back to Server A."

Pros: Essential for maintaining user sessions (Sticky Sessions).
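The three rules are small enough to sketch side by side in Python. The server names, connection counts, and function names are all made up, and a real load balancer would track connections itself rather than read them from a dict.

```python
import hashlib
from itertools import cycle

servers = ["server-a", "server-b", "server-c"]


def make_round_robin(pool):
    """Round robin: hand out servers in a fixed rotation."""
    rotation = cycle(pool)
    return lambda: next(rotation)


def least_connections(active_connections):
    """Least connections: pick the server with the fewest active clients."""
    return min(active_connections, key=active_connections.get)


def ip_hash(client_ip, pool):
    """IP hash: the same client IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


next_server = make_round_robin(servers)
print([next_server() for _ in range(4)])

active = {"server-a": 12, "server-b": 3, "server-c": 7}
print(least_connections(active))

print(ip_hash("203.0.113.7", servers) == ip_hash("203.0.113.7", servers))
```

One caveat on the last rule: the mapping depends on the pool size, so adding or removing a server reshuffles which server most clients stick to.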

Chapter 3: The Two Guardians (L4 vs. L7 Load Balancers)

As ShopStream grew into a global empire, Akash realized not all traffic was the same. He had to choose between two types of Load Balancers: Layer 4 (L4) and Layer 7 (L7).

The L4 Load Balancer: "The Bouncer"

Akash hired a tough, fast Bouncer for the main entrance.

How he works: He only looks at the envelope of the message (IP Address and Port). He doesn't open the letter.
The Logic: "Oh, you are coming from IP 192.168.1.5 trying to reach Port 80? Go to Server 3."
The Pros: He is incredibly fast. He doesn't waste time reading the data. He can handle millions of requests per second.
The Cons: He is "dumb." He doesn't know if you are asking for a Video or a JPG image. He just forwards traffic blindly.

The L7 Load Balancer: "The Concierge"

Inside the VIP area, Akash placed a sophisticated Concierge.

How she works: She actually opens the envelope and reads the request (HTTP Header, URL, Cookies).
The Logic:
"I see you are requesting /video/stream. I will send you to the high-performance Media Servers."
"I see you are requesting /billing/invoice. I will send you to the highly secure Finance Servers."
The Pros: She is smart. She allows for "Content-Based Routing." She can even block hackers if she sees suspicious SQL code in the URL.
The Cons: She is slower than the Bouncer because she has to read the message (Decrypt HTTPS, read headers, re-encrypt).
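The Concierge's routing decision boils down to a lookup on the request path. A minimal Python sketch, where the pool names, the route table, and the example hostname are assumptions for illustration:

```python
from urllib.parse import urlsplit

# Path prefix -> backend pool. First match wins.
ROUTES = [
    ("/video/", "media-servers"),
    ("/billing/", "finance-servers"),
]
DEFAULT_POOL = "web-servers"


def choose_pool(url: str) -> str:
    """Pick a backend pool by inspecting the URL path (a Layer 7 decision)."""
    path = urlsplit(url).path
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


print(choose_pool("https://shopstream.example/video/stream?id=7"))
print(choose_pool("https://shopstream.example/billing/invoice"))
print(choose_pool("https://shopstream.example/reviews/42"))
```

An L4 balancer cannot make this choice at all, because the path is inside the (usually encrypted) payload it never opens.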

The Moral of the Story

Akash learned that a robust system usually needs both.

L4 (The Bouncer) sits at the very edge, taking the massive hit of incoming internet traffic and distributing it quickly.
L7 (The Concierge) sits behind him, sorting that traffic into specific microservices (Checkout vs. Video vs. Reviews).

EP 8: The Legend of "ShopStream": A Tale of Two Architectures

Hrishikesh Dalal — Sat, 10 Jan 2026 16:40:36 +0000

Once upon a time, in a small, dimly lit garage, a developer named Akash had a billion-dollar idea. He wanted to build ShopStream, a revolutionary app that combined live video streaming with instant e-commerce.

Akash had a deadline. He had investors to impress in two weeks. He needed to move fast.

Chapter 1: The Golden Monolith (The Early Days)

Akash opened his code editor and created a single project folder.

He built the User Auth, the Video Player, the Payment Gateway, and the Inventory System all in one massive codebase. They shared the same database and ran on a single server.

It was a Monolith.

Why it worked:

Speed: Akash could write a function in the Video module and call it directly from the Inventory module. No network lag, no API complexity.
Simplicity: Deployment was a breeze. He simply dragged his one executable file onto the server, restarted it, and voilà—ShopStream was live.
Cost: It ran on a cheap, $5/month virtual machine.

The Lesson:

In the beginning, Monolithic is King. When you are a startup or building an MVP (Minimum Viable Product), your priority is speed of delivery. You don't know if the product will succeed, so spending months architecting complex systems is a waste of time.

Chapter 2: The Spaghetti Monster (The Growth Phase)

ShopStream became a viral hit. The user base exploded from 100 to 1,000,000 overnight.

Akash hired 50 new developers. They all jumped into that same single project folder. Suddenly, the paradise turned into a nightmare.

The Merge Conflict Hell: When the "Payment Team" tried to update the checkout logic, they accidentally overwrote code from the "Video Team."
The "Fragile Glass" Effect: One Friday, a junior developer made a typo in the Comments Section code. Because it was all one single process, the error caused a memory leak that crashed the entire application. Users couldn't comment, but worse, they couldn't buy anything either.
The Scaling Problem: On Black Friday, traffic spiked. The Video Streaming part of the app was heavy and needed 100 extra servers. The User Profile page was light and needed only one. But because it was a Monolith, Akash had to replicate the entire app 100 times, wasting massive amounts of money on memory and CPU for parts of the app that weren't being used.

The Lesson:

Monoliths struggle at Scale. When your team grows large, a single codebase becomes a bottleneck. When your traffic grows unevenly (some features are popular, others aren't), Monoliths scale inefficiently.

Chapter 3: The Great Migration (Enter Microservices)

Desperate, Akash hired a seasoned Architect named Sasha. She looked at the giant, tangled ball of code and shook her head.

"We need to break it apart," she said. "We need Microservices."

Over the next six months, they took a chainsaw to the Monolith.

They ripped out the Payment Logic and made it a standalone service with its own database.
They isolated the Video Transcoder into its own service.
They separated the Inventory System.

Now, these services talked to each other over a network (APIs), like distinct shops in a marketplace.

Why it worked:

Fault Isolation: A month later, the Comments Service crashed again. But this time, the Video Player and Payments kept running perfectly. Users just saw a "Comments loading..." spinner, but they could still watch and buy.
Independent Scaling: When the Black Friday sale hit, Sasha set the Payment Service to auto-scale to 500 servers, while leaving the User Profile Service on just two. It was precise and cost-effective.
Tech Freedom: The Data Science team wanted to write the Recommendation Engine in Python, while the core backend was in Java. With Microservices, they could do that easily.

Chapter 4: The Hidden Cost (The Reality Check)

However, life wasn't perfect for Akash and Sasha.

In the Monolith days, debugging was easy: you just looked at the logs. Now, a user would report an error, and Akash had to chase the request through five different services to find where it failed.

"Why is the site slow?" Akash asked.
"Well," Sasha replied, "Service A is calling Service B, which is waiting for Service C, and the network between them is lagging."

They had to hire a dedicated DevOps team just to manage the complexity of Kubernetes, Docker containers, and distributed tracing.

The Lesson:

Microservices are expensive. They trade development complexity for operational complexity. You don't solve the problem; you just move it. You should only pay this price if you absolutely need the scale.

The Moral of the Story

So, which architecture won? Neither. They just served different chapters of the company's life.

Stick with the Monolith if:

You are a startup or a small team (under 10-20 devs).
You are building a new product (MVP).
Your domain is simple.
You want fast iteration and easy debugging.

Switch to Microservices if:

You are a large enterprise (Netflix, Uber, Amazon scale).
You have multiple teams that need to work independently without blocking each other.
You need to scale different parts of your app drastically differently.
One part of your system crashing shouldn't take down the rest.

The Golden Rule: Start Monolithic. Stay Monolithic as long as you can. Only break it apart when the pain of managing the Monolith becomes greater than the pain of managing Microservices.

EP 7: The "Join" Tax vs. The "Storage" Tax

Hrishikesh Dalal — Fri, 02 Jan 2026 04:37:24 +0000

When we talk about SQL vs. NoSQL in the context of system design, we’re moving past syntax and getting into the "meat" of the problem: Trade-offs. In a real-world system, you aren't choosing a database because you like the query language; you’re choosing it because of how it handles traffic, consistency, and failure. Here is how to think about this like a seasoned engineer.

1. The "Join" Tax vs. The "Storage" Tax

In System Design, we care about latency.

SQL (Normalization): We minimize redundancy. If a user changes their name, you update it in one place. But the "tax" you pay is in Joins. If your dashboard needs data from five different tables, your database has to do a lot of heavy lifting at read-time to stitch that data back together.
NoSQL (Denormalization): We embrace redundancy. You might store that user’s name in five different document collections. The "tax" here is Storage and Update Complexity. Reads are lightning-fast because the data is already "pre-joined" in one document, but if the user changes their name, you might have to update five different places.

Ask yourself: Is my app read-heavy or write-heavy? If you're building a social media feed where you read the same post a million times but only write it once, NoSQL’s "read-ready" format often wins.
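Here is the same trade-off in miniature, using plain Python dicts as stand-ins for tables and documents. The data and function names are invented for illustration.

```python
# Normalized (SQL-style): the author's name is stored exactly once.
users = {1: {"name": "Alice"}}
posts = [
    {"id": 10, "author_id": 1, "text": "Hello"},
    {"id": 11, "author_id": 1, "text": "Again"},
]

# Denormalized (document-style): the name is copied into every post.
feed_docs = [
    {"id": 10, "author_id": 1, "author": "Alice", "text": "Hello"},
    {"id": 11, "author_id": 1, "author": "Alice", "text": "Again"},
]


def read_feed_normalized():
    """Pay the Join tax: look up the author for every post at read time."""
    return [{"text": p["text"], "author": users[p["author_id"]]["name"]} for p in posts]


def read_feed_denormalized():
    """No lookup needed: each document is already 'pre-joined'."""
    return [{"text": d["text"], "author": d["author"]} for d in feed_docs]


def rename_normalized(user_id, new_name):
    """One write, because the name lives in one place."""
    users[user_id]["name"] = new_name
    return 1


def rename_denormalized(user_id, new_name):
    """Pay the Storage/Update tax: touch every copy of the name."""
    touched = 0
    for doc in feed_docs:
        if doc["author_id"] == user_id:
            doc["author"] = new_name
            touched += 1
    return touched


print(read_feed_normalized() == read_feed_denormalized())
print(rename_normalized(1, "Alicia"), rename_denormalized(1, "Alicia"))
```

Both layouts return the same feed. What differs is where the work happens: on every read, or on every rename.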

2. The CAP Theorem: The Rule You Can't Break

You can’t talk system design without the CAP Theorem. It’s the ultimate reality check for distributed systems.

Consistency (C): Every node sees the same data at the same time.
Availability (A): Every request gets a response (even if it's old data).
Partition Tolerance (P): The system keeps working even if the network breaks between nodes.

In a distributed world, you must choose P. That leaves you with a choice between CP (SQL-like strictness) or AP (NoSQL-like speed).

SQL (CP): Better for banking or inventory. I’d rather the system "break" (be unavailable) than tell you that you have $100 when you actually have $0.
NoSQL (AP): Better for "likes" on a post. If one server shows 100 likes and another shows 102, the world won't end.

3. Scaling: The "Wall" vs. The "Horizon"

This is usually the biggest factor in high-level design interviews.

SQL Scaling (Vertical): You’re basically buying a bigger engine for the same car. Once you hit the limit of the biggest server available, you have to do Manual Sharding, which is a nightmare of architectural complexity. :(
NoSQL Scaling (Horizontal): These are built to be "sharded" by design. You just add more cheap servers (nodes) to the cluster. The database handles the distribution of data across those nodes automatically. :)

As we navigate the tech landscape of 2026, many of the world's most successful platforms aren't choosing one over the other; they are using both in tandem to handle different parts of their infrastructure.

Here are some use cases and real-life examples of where these databases actually live in production.

Financial Integrity and Compliance

When you are building a system where a single missing penny can cause a legal nightmare, SQL is the only real choice. Financial systems rely on ACID compliance to ensure that if a transaction starts, it either completes perfectly or fails entirely with no "middle ground."

Real-Life Example:
JPMorgan Chase uses relational databases (often heavily tuned versions of PostgreSQL or Oracle) to manage their core ledgers. They need strict schemas and strong consistency because they cannot afford "eventual consistency." If you check your balance after a deposit, it needs to be correct immediately, not "eventually" correct a few seconds later.

Social Media Feeds and Real-Time Content

Social media is the opposite of a bank ledger. It is "read-heavy" and deals with massive amounts of unstructured data like text, images, tags, and reactions. NoSQL shines here because of its ability to scale horizontally. In a system design interview, if you are asked to design a "Twitter-like" feed, you'd likely use a Document Store (like MongoDB) or a Wide-Column Store (like Cassandra). These databases allow you to store a post and all its metadata together in one "blob," making it incredibly fast to serve to millions of users at once.

Real-Life Example:
Instagram uses a hybrid approach, but they famously use a NoSQL-style architecture for their feed. When you scroll, the app isn't performing complex "joins" across ten different tables to find the photo, the caption, and the likes; it's pulling a pre-computed document from a NoSQL store that has everything ready to show you in milliseconds.

High-Speed Caching and Session Management

Sometimes you don't need a permanent home for data; you just need a place to store it for a few minutes at lightning speed. This is where Key-Value NoSQL stores (like Redis) come in. In system design, we use these for things like user sessions, shopping carts, or leaderboards. If a user logs in, you don't want to query your main SQL database every single time they click a button just to verify who they are. Instead, you store their "session token" in a fast in-memory NoSQL database.

Real-Life Example:
Gaming platforms like Riot Games (League of Legends) use NoSQL key-value stores to manage live leaderboards and player sessions. When thousands of players finish a match at the same time, the system needs to update rankings instantly without waiting for a traditional SQL database to lock tables and process the writes.

EP 6.4: What is SHARDING?

Hrishikesh Dalal — Thu, 01 Jan 2026 12:03:25 +0000

In the world of system design, there is a common saying: "Don't shard until you absolutely have to." While sharding offers virtually unlimited scaling potential, it introduces a level of operational complexity that can cripple a small engineering team.

In this article, I explore what sharding is, why it differs from simple partitioning, and the strategies for implementing it. So let us get into it.

1. What is Sharding? (We know Partitioning, so why do we need Sharding?)

To understand sharding, you must first understand Horizontal Partitioning. In partitioning, you take a massive table and split it into smaller "chunks" stored on the same server. The database engine (like PostgreSQL) handles the logic of finding which chunk holds your data.

Sharding takes this a step further. Instead of keeping those chunks on one machine, you place them on entirely different database servers, known as Shards.

The Core Difference:

Partitioning -> database manages the complexity.
Sharding -> you (the application developer) manage the complexity.

Partitioning -> Tables
Sharding -> Databases

2. The Shard Key :)

When you split a table across multiple servers, you need a rule to decide where each row goes. This is determined by the Sharding Key.

If you shard based on user_id:

Server 1 (Shard 1): Stores users with IDs 1–1000.

Server 2 (Shard 2): Stores users with IDs 1001–2000.

The Sharding Key is the most critical decision in this architecture. A poor key leads to "Hotspots", where one server is at 99% CPU while the others are idling at 5%.
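The routing rule for the user_id example can be sketched in a few lines of Python. The shard names and the `shard_for_user` helper are hypothetical; the ID ranges are the ones from the example above.

```python
import bisect

# Inclusive upper bound of each shard's user_id range, in ascending order.
RANGE_UPPER_BOUNDS = [1000, 2000]
SHARDS = ["shard-1", "shard-2"]


def shard_for_user(user_id: int) -> str:
    """Find the shard whose ID range contains user_id."""
    if user_id < 1:
        raise ValueError("user_id must be a positive integer")
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, user_id)
    if index == len(SHARDS):
        raise KeyError(f"no shard configured for user_id {user_id}")
    return SHARDS[index]


for user_id in (1, 1000, 1001, 2000):
    print(user_id, shard_for_user(user_id))
```

Notice that every brand-new user gets the highest ID so far, so all sign-up traffic lands on the last shard. That is one way a reasonable-looking key still produces a hotspot.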

3. Sharding Strategies

There is no "one size fits all" way to distribute data. Here are the four primary strategies used in the industry:

A. Range-Based Sharding

Data is divided into continuous ranges of the shard key.

Ex: Shard A (A-M), Shard B (N-Z).
Pros: Very easy to reason about and implement.
Cons: Leads to Data Skew. If you have 1 million users whose names start with 'S' and only 10 whose names start with 'C', Shard B will be overloaded while Shard A sits nearly empty.

B. Hash-Based Sharding

A hash function is applied to the shard key ($\text{shard} = \text{hash}(\text{key}) \bmod N$), where $N$ is the number of shards.

Pros: Provides a very even distribution of data across all servers.
Cons: Resharding is a nightmare. If you grow from 3 shards to 4, the result of the modulo operation changes for most keys (roughly three out of four), requiring you to move the bulk of your data to new servers.
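The resharding cost is easy to measure with a quick Python experiment. The key format and helper names are made up. SHA-256 is used only because Python's built-in `hash()` for strings changes between runs; any stable hash would do.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Hash that gives the same result on every run and every machine."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_index(key: str, num_shards: int) -> int:
    """Hash-based sharding: hash(key) mod N."""
    return stable_hash(key) % num_shards


def fraction_moved(keys, old_n, new_n):
    """Share of keys whose shard changes when N goes from old_n to new_n."""
    moved = sum(1 for k in keys if shard_index(k, old_n) != shard_index(k, new_n))
    return moved / len(keys)


keys = [f"user:{i}" for i in range(100_000)]
print(f"{fraction_moved(keys, 3, 4):.1%} of keys change shard")
```

On a uniform hash, a key keeps its shard only when hash mod 3 equals hash mod 4, which happens for 3 out of every 12 hash values, so about 75% of the data has to move. Schemes such as consistent hashing exist largely to shrink that number.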

C. Geographic/Entity-Based Sharding

Data is grouped by a logical attribute like region or country.

Ex: European users on a Dublin server; Asian users on a Singapore server.
Pros: Reduces latency for local users and helps with data residency laws (GDPR).
Cons: If your app suddenly goes viral in one specific country, that shard becomes a bottleneck.

D. Directory-Based Sharding

A separate "Lookup Service" or "Mapping Table" keeps track of which shard holds which data.

Pros: Maximum flexibility. You can move a single user from Shard 1 to Shard 5 without changing any hashing logic.
Cons: The directory itself becomes a Single Point of Failure and a performance bottleneck. Every query now requires two hits: one to the directory and one to the shard.

4. Why Sharding is Difficult :(

If sharding allows you to scale to billions of users, why do engineers avoid it?

I. Application Complexity

Since the data is on different servers, the database can't help you find it. Your application code must become "Shard Aware." You have to write logic that says: "If the user is trying to log in with ID 1505, connect to Database Server 2." This makes your codebase significantly harder to maintain.

II. The "Join" Problem

In a single database, joining two tables is easy. In a sharded environment, if the Users table is on Shard A and the Orders table is on Shard B, you cannot perform a SQL Join. You must pull the data from both servers into your application memory and "join" them manually—an operation that is slow and memory-intensive.
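A manual join of this kind looks roughly like the Python sketch below. The two lists stand in for rows already fetched from each shard over the network; the table contents and the `join_users_orders` name are invented.

```python
# Rows fetched from Shard A (Users) and Shard B (Orders).
shard_a_users = [
    {"user_id": 1, "name": "Alice"},
    {"user_id": 2, "name": "Bob"},
]
shard_b_orders = [
    {"order_id": 100, "user_id": 1, "total": 25.0},
    {"order_id": 101, "user_id": 1, "total": 10.0},
    {"order_id": 102, "user_id": 2, "total": 5.0},
]


def join_users_orders(users, orders):
    """Inner join on user_id, done in application memory."""
    users_by_id = {u["user_id"]: u for u in users}  # index one side first
    joined = []
    for order in orders:
        user = users_by_id.get(order["user_id"])
        if user is not None:  # drop orders with no matching user
            joined.append(
                {"name": user["name"], "order_id": order["order_id"], "total": order["total"]}
            )
    return joined


for row in join_users_orders(shard_a_users, shard_b_orders):
    print(row)
```

The join logic itself is short. The cost is that both result sets have to cross the network and sit in application memory before a single row can be matched.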

III. Loss of Transactional Consistency

Classic databases offer ACID properties (Atomicity, Consistency, Isolation, Durability). In sharding, a "transaction" that spans two shards is hard to guarantee: it needs a distributed protocol such as two-phase commit, which adds latency and new failure modes. Without one, if Shard A updates successfully but Shard B fails, your data is left in an inconsistent state.

5. When should you actually Shard?

Before you choose sharding, you should have already exhausted these three steps:

Vertical Scaling: Buy a bigger server with more RAM and CPU.
Read Replicas (Master-Slave): Use the architecture we discussed earlier to offload all "Read" traffic to Slaves.
Caching: Use Redis to stop 80% of requests from even hitting your database.