Forem: Daniel Lee

Design a short url for system design interview

Daniel Lee — Tue, 03 Dec 2024 19:13:16 +0000

This article is a personal note from "System Design Interview - An Insider's Guide by Alex Xu". It's intended as a memory refresher for a system design interview when you're in a hurry.

hash function

to generate a shorter URL, use a hash function to map long URLs to short URLs. For example, well-known hash functions like CRC32, MD5 and SHA-1 can be used.

hash value

to generate short URLs of length "n", find the smallest n that satisfies the "back-of-the-envelope" estimation. For example, if the system has to support 365 billion unique URLs and short URLs are built from the 62 characters in [0-9, a-z, A-Z], "n" has to satisfy 62^n >= 365 billion. Since 62^6 is about 56.8 billion and 62^7 is about 3.5 trillion, n = 7.
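The length estimate can be checked with a tiny loop. This is only a sketch: the function name `min_short_url_length` is made up for this note, and the two targets are example figures (365 billion is the figure I recall from the book).

```python
def min_short_url_length(total_urls: int, alphabet_size: int = 62) -> int:
    """Return the smallest n such that alphabet_size ** n >= total_urls."""
    n = 1
    while alphabet_size ** n < total_urls:
        n += 1
    return n


# 62^5 is about 916 million, 62^6 about 56.8 billion, 62^7 about 3.5 trillion
length_for_365_million = min_short_url_length(365_000_000)      # 5
length_for_365_billion = min_short_url_length(365_000_000_000)  # 7
```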

hash collision

to generate unique short URLs, two approaches can be taken:
1. *Base 62 conversion* - convert a unique numeric ID (base 10) into its base 62 representation and use the result as the short URL. Because every ID is unique, no collision can occur.
2. *Hash + collision resolution* - hash the long URL and keep only the first "n" characters of the hash value. If that short URL is already taken, append a predefined string to the long URL and hash again; repeat the process until there is no collision.
  - As checking the database for every candidate can be time-consuming, a "bloom filter" can be used to improve performance. A Bloom filter is a space-efficient probabilistic technique to test whether an element is a member of a set.
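Base 62 conversion is short enough to sketch in full. The alphabet order below (digits, then lowercase, then uppercase) is an assumption; any fixed order works as long as encoding and decoding agree.

```python
import string

# Assumed alphabet order: 0-9, then a-z, then A-Z (62 characters in total)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(num: int) -> str:
    """Convert a non-negative base 10 ID into its base 62 string."""
    if num < 0:
        raise ValueError("ID must be non-negative")
    if num == 0:
        return ALPHABET[0]
    digits = []
    while num > 0:
        num, remainder = divmod(num, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def base62_decode(text: str) -> int:
    """Convert a base 62 string back into the original ID."""
    num = 0
    for ch in text:
        num = num * 62 + ALPHABET.index(ch)
    return num


short = base62_encode(11157)  # "2TX", because 11157 = 2 * 62^2 + 55 * 62 + 59
```

Because the short URL is derived from a unique ID, decoding it gives back the ID that can be used to look up the long URL.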

Security strategy: zero trust model with mTLS

Daniel Lee — Tue, 26 Nov 2024 05:17:23 +0000

Prologue

When I worked at Mastercard, I had an opportunity to contribute to an organization's effort to upgrade backend services to support optional mTLS for a new client from the middle east. Due to the country policy, all the data handled within our application had to be on-premise and this was the first time I heard about "Zero Trust Architecture" which sounded pretty cool!

Terminologies explained:

"On-premise" means that software, systems, data, and infrastructure are installed and operated within an organization's own facilities, such as an office building or data centre in a certain region, etc.
"Zero trust" is a cybersecurity framework that assumes no subject in an information system is trusted by default.

What is mTLS?

Mutual Transport Layer Security (mTLS) is a method of mutual authentication built on top of the TLS encryption protocol, and it is often used in a zero trust security framework. To better understand it, we first need to know about the following three important building blocks:

Public and Private keys
- Anything encrypted with a public key can be decrypted only with the private key
- Anyone can view the public key by looking at the domain's or server's TLS certificate
TLS Certificate
- A data file containing information about the server's or domain's identity, its public key, and details about the certificate itself (issuer, expiry date, etc)
TLS handshake
- A process for verifying the TLS certificate and the server's possession of the private key

How does mTLS work?

In TLS, normally, the server has a TLS certificate and a public/private key pair while the client doesn't. TLS is established in the following manner:

Client connects to the server
Server presents its TLS certificate
Client verifies the server's certificate
Client and server exchange information over encrypted TLS connection

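On the other hand, in mTLS, both client and server have a certificate, and both sides authenticate using their public/private key pairs. mTLS is established in the following manner (compared with plain TLS, the additional steps are the client presenting its certificate, the server verifying it, and the server granting access):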

Client connects to the server
Server presents its TLS certificate
Client verifies the server's certificate
Client presents its TLS certificate
Server verifies the client's certificate
Server grants access
Client and server exchange information over encrypted TLS connection
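To make the difference concrete, here is a minimal sketch using Python's standard `ssl` module. The helper names and the optional file-path parameters are assumptions for illustration; real code would pass actual certificate, key, and root CA files and then wrap a socket with the context.

```python
import ssl


def make_mtls_server_context(certfile=None, keyfile=None, ca_file=None):
    """Server-side TLS context that also demands a client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # A plain TLS server leaves this at CERT_NONE; for mTLS the client
    # must present a certificate and it must verify successfully.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # Root certificate used to verify client certificates
        ctx.load_verify_locations(cafile=ca_file)
    if certfile:
        # The server's own certificate and private key
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return ctx


def make_mtls_client_context(certfile=None, keyfile=None, ca_file=None):
    """Client-side context: verifies the server and presents its own cert."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if certfile:
        # This is the extra piece that a plain TLS client does not have
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return ctx
```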

What's unique about mTLS?

As both clients and servers need to verify certificates, there has to be a central authority, the so-called "root" TLS certificate. With mTLS, an organization acts as its own certificate authority: the root certificate is self-signed, meaning the organization creates it itself (for example, for use within its own private network) instead of obtaining it from a public certificate authority. Thus, the certificates of authorized clients and servers have to chain back to this root certificate.

References

To learn more about mTLS, please refer to the Cloudflare blog post
To learn more about Zero Trust, please refer to the IBM Technology YouTube video

Typescript OOD concept for technical interview

Daniel Lee — Tue, 26 Nov 2024 04:23:40 +0000

Basics

A class can implement one or more interfaces.
- ex) class C implements A, B {...}
A class can extend only one base class (TypeScript does not support multiple class inheritance).
- ex) class C extends A {...}

Member visibility

public: can be accessed anywhere
protected: only visible inside the class they're declared in and its subclasses. For example,

class Greeter {
  public greet() {
    console.log("Hello, " + this.getName());
  }
  protected getName() {
    return "hi";
  }
}

class SpecialGreeter extends Greeter {
  public howdy() {
    // good: a subclass can call a protected method of its parent
    console.log("Howdy, " + this.getName());
  }
}

const g = new SpecialGreeter(); // subclass instance
g.greet(); // good, public method of the parent class
// g.getName(); // bad! compile error: a protected method cannot be called from outside the class hierarchy

private: cannot be accessed from outside the class, not even from subclasses. If other code needs to read the value, expose it via a public getter method.
static: define properties or methods on the class itself, rather than on instances of the class. It can be accessed without needing to create an instance of the class. For example

class MathUtils {
  static PI = 3.14159;
  static calculateCircleArea(radius: number): number {
    // inside a static method, `this` refers to the class itself
    return this.PI * radius * radius;
  }
}

MathUtils.PI; // 3.14159
MathUtils.calculateCircleArea(5); // 78.53975

When to use it?
- When the functionality is tied to the class rather than an object
- For constants or utility functions that should be shared
- For singleton-like pattern where one central instance is sufficient
abstract: can be applied to a class, field, or method. An abstract class cannot be instantiated directly; it serves as a base class for subclasses, which must implement all the abstract members. For example,

abstract class Base {...}
const b = new Base() // bad

class Derived extends Base {...}
const d = new Derived() // good

Design a key-value store for system design interview

Daniel Lee — Wed, 20 Nov 2024 06:25:49 +0000

When talking about a key-value store in a backend system, the very first thing that comes to my mind is a cache storage like Redis.

Caching is a technique used to avoid hitting the database multiple times for the same data, thereby serving read requests more efficiently. However, fitting everything in a hash map in memory stops working as the size of the data grows over time. Although there are ways to optimize how data is stored (via data compression, or by keeping only frequently accessed data in memory and the rest on disk), a system will eventually require a distributed key-value store to handle more traffic.

When designing a distributed key-value store system, the CAP theorem has to be considered: Consistency, Availability, and Partition tolerance. In fact, partition tolerance is practically mandatory in modern systems, as network failures cannot be avoided all the time. So, in reality, we have to make tradeoffs between consistency and availability. Here are the 3 possible combinations in theory:

CA: choose consistency and availability over partition tolerance (unrealistic)
CP: choose consistency and partition tolerance over availability
AP: choose availability and partition tolerance over consistency

Given these tradeoffs, one should further consider the following:

Data partition and replication
Consistency (strong, weak, eventual)
Inconsistency resolution
Handling failures (temporary, permanent)
Write and read paths

When we partition or replicate data, "consistent hashing" can be used to ensure minimum data movement and even distribution of data amongst key-val stores.

To ensure consistency, "Quorum consensus" can be used to guarantee consistency for both read and write operations. Quorum consensus basically states that

Given N number of replicas, W as a write quorum size, R as a read quorum size,

For a write operation to be considered successful, it must be acknowledged by at least W replicas

For a read operation to be considered successful, it must wait for responses from at least R replicas

For example, if data is replicated to server0, server1, and server2, and W = 1, the coordinator must wait for at least 1 acknowledgment from one of the servers before the write is considered successful.

If W + R > N, strong consistency is guaranteed, because every read quorum then overlaps every write quorum in at least one replica that holds the latest data.
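The overlap argument behind W + R > N can be verified by brute force for small clusters. This is just a sanity-check sketch; the function name is made up and replicas are modelled as plain integers.

```python
from itertools import combinations


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """True if every possible write set of size w shares at least one
    replica with every possible read set of size r."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


overlap_2_2 = quorums_always_overlap(3, 2, 2)  # True: 2 + 2 > 3
overlap_1_1 = quorums_always_overlap(3, 1, 1)  # False: 1 + 1 <= 3
```

Typical tunings follow from this: R = 1 and W = N favours fast reads, W = 1 and R = N favours fast writes, and W + R <= N gives up the strong consistency guarantee.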

There're 3 consistency models that can be considered:

Strong consistency: all replicas or nodes will see up-to-date data all the time
Weak consistency: subsequent read operations may not see the most up-to-date value
Eventual consistency: given enough time, all updates will be propagated and all replicas are consistent

To resolve inconsistency amongst replicas, versioning and vector clocks can be used to detect and reconcile conflicts.

Versioning technique basically treats each data modification as a new immutable version of data.
A vector clock is a set of [server, version] pairs associated with a data item. For example, D([s1,v1], [s2,v2], ..., [sn,vn]), where D is the data item, sn is the nth server, and vn is the version counter for that server. When data item D is written to server si, the system performs one of the following operations:
1. Increment vi if [si,vi] exists
2. Otherwise, create a new entry [si, 1]
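The two operations, plus the conflict check they enable, can be sketched with a plain dict. The function names are my own; a clock is modelled as a mapping from server name to version counter.

```python
def increment(clock: dict, server: str) -> dict:
    """Return a new clock with the server's counter bumped.
    Covers both rules: increment if present, otherwise start at 1."""
    updated = dict(clock)
    updated[server] = updated.get(server, 0) + 1
    return updated


def descends(a: dict, b: dict) -> bool:
    """True if version a has seen everything version b has seen."""
    return all(a.get(server, 0) >= version for server, version in b.items())


def in_conflict(a: dict, b: dict) -> bool:
    """Two versions conflict when neither descends from the other."""
    return not descends(a, b) and not descends(b, a)


d1 = increment({}, "s1")    # {"s1": 1}
d2 = increment(d1, "s1")    # {"s1": 2}
d3 = increment(d2, "s2")    # {"s1": 2, "s2": 1}
d4 = increment(d2, "s3")    # {"s1": 2, "s3": 1}
conflict = in_conflict(d3, d4)  # True: d3 and d4 are siblings of d2
```

When a conflict is detected, the client reconciles the two versions and writes the result back, which bumps the clock again.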

To handle failures (of replicas/servers/nodes), a system first needs to detect a failure. "Gossip protocol" can be used for that purpose. The following is how it works:

each node maintains a node membership list (memberID and heartbeats)
each node periodically increments its heartbeat counter and sends heartbeats to a set of random nodes (which in turn propagate to another set of nodes)
once nodes receive heartbeats, the membership list is updated. If a member's heartbeat hasn't increased for more than a predefined period, the member is considered offline

After detecting a failure, "Sloppy Quorum" can be used to recover from temporary failures.

"Quorum" means the minimum number of members of an assembly to make proceedings of that meeting valid.

It works as follows:

The system chooses the first W healthy servers for writes and the first R healthy servers for reads on the hash ring, ignoring offline servers. If a server is unavailable due to network or server failures, another server will process its requests temporarily. Then, "hinted handoff" is used to achieve data consistency: when the down server is back up, the changes are pushed back to it.

On the other hand, "Anti-Entropy protocol" can be used to recover from permanent failures.

It keeps replicas in sync by comparing each piece of data on replicas and updating each replica to the newest version.

A Merkle Tree is used to detect inconsistencies efficiently and to minimize the amount of data transferred.
- A Merkle Tree has every non-leaf node labeled with the hash of the labels or values of its child nodes. To learn more about it, check out Gaurav's Video
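A minimal sketch of how a Merkle root could be computed. Assumptions: SHA-256 as the hash, one leaf per value, and duplicating the last node when a level has an odd count; the book's version hashes buckets of keys, so treat this as an illustration only.

```python
import hashlib


def _hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(values: list) -> str:
    """Hash each value into a leaf, then repeatedly hash pairs of
    child labels until a single root label remains."""
    level = [_hash(str(v).encode()) for v in values]
    if not level:
        return _hash(b"")
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed padding rule for odd levels
        level = [
            _hash((level[i] + level[i + 1]).encode())
            for i in range(0, len(level), 2)
        ]
    return level[0]


replica_a = ["k1=v1", "k2=v2", "k3=v3", "k4=v4"]
replica_b = ["k1=v1", "k2=v2", "k3=STALE", "k4=v4"]
in_sync = merkle_root(replica_a) == merkle_root(replica_b)  # False
```

If the roots match, the replicas hold the same data. If they differ, the two sides compare child hashes and walk down only the branches that differ, so only the mismatching part needs to be transferred.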

Write Path

The write request is persisted in a commit log file
Data is saved in the memory cache
When the memory cache is full or reaches a predefined threshold, data is flushed to an SSTable (sorted-string table, a sorted list of key-value pairs) on disk

Read Path
After a read is directed to a specific node, the system first checks whether the data is in the memory cache; if it is not, the data is retrieved from disk. On disk, a "Bloom Filter" is commonly used to find out which SSTable might contain the key.
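Here is a toy Bloom filter to show why it suits this lookup. The sizes, the number of hash functions, and the trick of deriving positions from salted SHA-256 digests are all assumptions made for this sketch.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting the key with an index
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present"
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


sstable_filter = BloomFilter()
for key in ("user:1", "user:2", "user:3"):
    sstable_filter.add(key)
```

A negative answer lets the read path skip an SSTable without touching the disk. A positive answer can be a false positive, so the SSTable still has to be checked.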

Reference

System Design Interview – An insider's guide by Alex Xu

Design a consistent hashing for system design interview

Daniel Lee — Fri, 15 Nov 2024 22:44:29 +0000

Imagine that you launch a new product and it attracts a huge volume of traffic (say, 1M orders per second). Your product was designed to run on a single server, and the server couldn't keep up with the volume, causing you to lose $1M worth of orders which you could've earned.

To support a surge in demand, what kind of mechanism could you implement on the server-side to not only distribute the traffic evenly, but also never miss incoming orders, while ensuring that the other working servers are protected from being overloaded? This is where the consistent hashing algorithm comes into play.

Traditionally, a common way to distribute traffic could be done by following a simple formula:

serverIndex = hash(key) % N, where N is the number of servers

But when a server is added or removed, N changes, so serverIndex changes for most keys, and that leads to a redistribution of almost all hash keys. Think about caching servers as an example (the backend system would suffer many cache misses because the N value is different)!
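How bad the redistribution gets can be measured directly. In this sketch the MD5-based `stable_hash` is an assumption (Python's built-in `hash()` is randomized per process for strings), and the keys are made-up names.

```python
import hashlib


def stable_hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


def server_index(key: str, n: int) -> int:
    return stable_hash(key) % n


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose server changes when N goes old_n -> new_n."""
    moved = sum(1 for k in keys if server_index(k, old_n) != server_index(k, new_n))
    return moved / len(keys)


keys = [f"key{i}" for i in range(1000)]
fraction = moved_fraction(keys, 4, 5)  # expect roughly 0.8 for a uniform hash
```

Going from 4 to 5 servers, a key stays put only when its hash gives the same remainder for both 4 and 5, which happens for about 1 in 5 keys. Everything else becomes a cache miss.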

One solution you can consider is a consistent hashing algorithm.

Consistent hashing places servers and keys on a hash ring, which is formed by connecting the head and tail of a hash space, and represents each server with k virtual nodes spread around the ring. It minimizes the number of hash keys or data that have to be redistributed when servers change, and the virtual nodes spread keys more uniformly, thereby preventing a single server from being overloaded (aka, the "hotspot" key problem).

On the hash ring, servers are mapped based on server IP or name, and k virtual nodes are placed for each server. A key is stored on the first server found by going clockwise from the key's position on the ring. When a server goes down or a new server is added to the system, the affected range of the hash space can be found by going anti-clockwise on the ring from the impacted server to the server prior to it. Once that range is identified, only keys in that impacted space need to be remapped (not all!).
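A compact sketch of a hash ring with virtual nodes follows. Assumptions: MD5 as the ring hash, virtual nodes named `server#i`, and 100 virtual nodes per server; none of these come from the book.

```python
import bisect
import hashlib


class ConsistentHashRing:
    def __init__(self, virtual_nodes: int = 100):
        self.virtual_nodes = virtual_nodes
        self._ring = []  # sorted list of (position, server)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_server(self, server: str) -> None:
        for i in range(self.virtual_nodes):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove_server(self, server: str) -> None:
        self._ring = [(pos, s) for pos, s in self._ring if s != server]

    def get_server(self, key: str) -> str:
        if not self._ring:
            raise LookupError("no servers on the ring")
        # First virtual node clockwise from the key; wrap around at the end
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = ConsistentHashRing()
for name in ("s0", "s1", "s2", "s3"):
    ring.add_server(name)
keys = [f"key{i}" for i in range(1000)]
before = {k: ring.get_server(k) for k in keys}
ring.add_server("s4")
after = {k: ring.get_server(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to the new server `s4`; keys never shuffle between the existing servers, which is exactly what the modulo approach cannot offer.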

In summary, employing consistent hashing in balancing the server load has 3 benefits:

reduces the number of keys to be redistributed when a server is added or removed
makes it easy to scale horizontally as data are more evenly distributed
mitigates "hotspot" problem

Reference

System Design Interview – An insider's guide by Alex Xu

Design a rate limiter for system design interview

Daniel Lee — Thu, 31 Oct 2024 19:03:03 +0000

What is it?

A rate limiter controls the rate of traffic from a client or a server. In HTTP, it limits the number of client requests allowed to be sent over a specific time period.

Client-side vs Server-side

API requests can be limited on the client-side or server-side. Having a rate limiter implemented on the server-side allows full control of the implementation. A client-side rate limiter, however, can be manipulated or bypassed by malicious users, and we may have little control over the client implementation.

Furthermore, there's the API gateway, a cloud service that we can use as middleware when implementing the rate limiter on the server-side. It is a fully managed service that supports not only rate limiting, but also SSL termination, authentication, IP whitelisting, static content serving, etc. No reinventing the wheel!

Benefits

Prevent resource starvation (ex. DoS Attack).
In case your backend system relies on external APIs (payment, health records, etc) that charge on a per-call basis, it can reduce costs.
Prevent server overloading.

Algorithms

Token Bucket
Leaking Bucket
Fixed Window Counter
Sliding Window Log
Sliding Window Counter (hybrid: combination of Fixed Window Counter and Sliding Window Log)

Token Bucket

This algorithm basically adds a user-defined number of tokens to a bucket of fixed capacity periodically (once the bucket is full, extra tokens overflow), and each request consumes one token. If no token is available, the request is dropped.

verdict: easy to implement; space-efficient; tuning is challenging.
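A small token bucket sketch. Two assumptions to note: tokens are refilled continuously based on elapsed time rather than by a periodic job, and the current time is passed in explicitly so the behaviour is easy to follow (real code would read a monotonic clock).

```python
class TokenBucket:
    def __init__(self, capacity: int, refill_rate: float, now: float = 0.0):
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # start with a full bucket
        self.last_refill = now

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the bucket capacity
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request consumes one token
            return True
        return False          # no token left: drop the request


bucket = TokenBucket(capacity=2, refill_rate=1.0)
results = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
# results == [True, True, False, True]
```

The two knobs, bucket capacity and refill rate, are what makes tuning challenging: capacity sets how large a burst is tolerated, and refill rate sets the sustained request rate.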

Leaking Bucket

This algorithm processes requests at a fixed rate, which is ideal when a stable outflow rate is needed. It is similar to the token bucket algorithm, but requests are put into a FIFO queue instead. If the queue is full, the request is rejected.

verdict: easy to implement; space-efficient; not suitable for handling bursts of requests.

Fixed Window Counter

This algorithm divides the timeline into fixed-size time windows and assigns a counter to each window. Each request increments the counter, and once the counter reaches its threshold, new requests are dropped until a new time window starts.

verdict: spikes at the edges of a window could let through more requests than the allowed quota. For example, if 5 requests/minute is the threshold, there can be 5 requests in the second half of the 2:00:00-2:01:00 window and another 5 requests in the first half of the 2:01:00-2:02:00 window. That is 10 requests within the one minute between 2:00:30 and 2:01:30, twice the quota.

Sliding Window Log

This algorithm stores the timestamp of each request in a log. When a new request comes in, timestamps older than the start of the current window are removed from the log, and the new timestamp is added. If the log size is then larger than the allowed count, the request is rejected.

verdict: it needs a lot of memory to store timestamps, because even if a request is rejected, its timestamp may still be stored in the log.
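A sketch of the log-based variant, assuming (as described above) that the timestamp is recorded even when the request ends up rejected. Time is passed in as seconds so the example is deterministic.

```python
from collections import deque


class SlidingWindowLog:
    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()  # request timestamps, oldest first

    def allow(self, now: float) -> bool:
        # Drop timestamps that fell out of the rolling window
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        # The timestamp is logged even if the request gets rejected
        self.log.append(now)
        return len(self.log) <= self.limit


limiter = SlidingWindowLog(limit=2, window_seconds=60)
decisions = [limiter.allow(t) for t in (1, 30, 50, 100)]
# decisions == [True, True, False, True]
```

At t=100 the entries for t=1 and t=30 have expired, so the log holds t=50 (a rejected request) and t=100, which fits the limit of 2.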

Sliding Window Counter

This algorithm computes the number of requests in the rolling window to decide whether to accept or reject a request based on the following formula:

requests in the current window + (requests in the previous window * overlap % of the rolling window and previous window)

For example, the threshold is set to 7 requests/minute. 5 requests were made in the previous minute window and 3 requests have been made so far in the current minute window. A new request arrives 30% of the way into the current minute, so the rolling window overlaps the last 70% of the previous window. Then, the rolling window holds an estimated 3 + 5 * 0.7 = 6.5 requests, which is below the threshold of 7. So, the current request can go through.
verdict: it can smooth out spikes in traffic, but it only works for a not-so-strict look-back window (because this algorithm assumes that requests in the previous window are evenly distributed)
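The formula translates almost one-to-one into code. The function names are made up, and rounding the estimate down before comparing it with the limit is an assumption (as far as I remember, the book notes it can be rounded either way depending on the use case).

```python
import math


def rolling_window_estimate(current_count: int, previous_count: int,
                            overlap_fraction: float) -> float:
    """Requests in the current window plus the share of the previous
    window that the rolling window still overlaps."""
    return current_count + previous_count * overlap_fraction


def allow_request(current_count: int, previous_count: int,
                  overlap_fraction: float, limit: int) -> bool:
    estimate = rolling_window_estimate(current_count, previous_count, overlap_fraction)
    return math.floor(estimate) < limit  # assumed: round down, then compare


estimate = rolling_window_estimate(3, 5, 0.7)  # about 6.5
allowed = allow_request(3, 5, 0.7, limit=7)    # True
```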

Design Decision

Because multiple rate limiters may be required in a system, a configuration that defines a set of rules needs to be stored in a cache (for faster access than the db). A configuration rule can look something like below:

domain: messaging
descriptors:
 - key: message_type
   value: marketing
   rate_limit:
     unit: day
     requests_per_unit: 5

Rate Limiter in Distributed Systems

To handle concurrent requests, a locking mechanism can be used, but locks slow the system down. In a distributed environment there are 2 major issues to solve:

Race condition - which can be solved via Lua scripts and the sorted set data structure in Redis
Synchronization - which can be solved with a centralized data store shared by all rate limiter instances. Sticky sessions are neither scalable nor flexible, so they are not a good fix.

Graceful Rejection of Request

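When a request is rejected, the rate limiter returns HTTP 429 (Too Many Requests). In the HTTP response headers, you can also define a set of rate limiter properties (for example, how many requests remain and how long to wait before retrying) so that the rejection can be handled gracefully on the client-side.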

Reference

System Design Interview – An insider's guide by Alex Xu

How in the world can a nested for/while loop(s) have time complexity of O(N)?! for coding interview

Daniel Lee — Wed, 23 Oct 2024 01:31:45 +0000

In Python (and in general algorithm analysis), the time complexity of a nested loop depends on how the loops are structured. If you have a while loop inside a for loop, the overall complexity can still be O(N) in certain cases due to how the loops interact.

Here are a few examples that explain why a while loop inside a for loop can still result in O(N) complexity:

Example 1: Constant-Time Inner While Loop

for i in range(N):
    while some_condition:
        # Do constant-time work (O(1))
        break

In this case:

The for loop runs N times.
The while loop body executes at most once per iteration of the for loop because of the break statement.
The work done inside the while loop is constant time, so the complexity for each for loop iteration is O(1).
Therefore, the overall complexity is O(N)×O(1)=O(N).

Example 2: Inner While Loop's Work Depends on the Outer Loop

for i in range(N):
    while i < 5:
        # Do constant-time work (O(1))
        i += 1

In this case:

The for loop runs N times.
The while loop runs only a constant number of times (at most 5) for each iteration of the for loop.
Again, the complexity remains O(N).

Example 3: While Loop Consumes the Range in Total

i = 0
for _ in range(N):
    while i < N:
        # Do constant-time work (O(1))
        i += 1

In this case:

The for loop runs N times.
The while loop's condition (i < N) means that i will only increase up to N in total, across all iterations.
Even though it's nested, the while loop will increment i a total of N times overall, not N×N.
Therefore, the complexity is still O(N).

Key Takeaway:
If the number of iterations of the inner while loop is bounded by a constant for each iteration of the for loop (Examples 1 and 2), or if the total work done by the while loop across all for iterations is bounded by N (Example 3), then the combined complexity remains O(N).

The complexity would only be O(N^2) if the inner while loop ran N times for each iteration of the outer for loop. For example,

for i in range(N):
    j = 0
    while j < N:
        # Do constant-time work (O(1))
        j += 1

The for loop runs N times.
The while loop runs N times for each iteration of the for loop. Therefore, the complexity is O(N^2).
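If the argument feels too hand-wavy, counting the inner-loop steps settles it. The two helper functions below are only instrumented copies of Example 3 and the O(N^2) example, with a counter added.

```python
def count_shared_counter(n: int) -> int:
    """Example 3: the while loop shares `i` across all for iterations."""
    steps = 0
    i = 0
    for _ in range(n):
        while i < n:
            steps += 1
            i += 1
    return steps


def count_reset_counter(n: int) -> int:
    """O(N^2) example: `j` is reset for every for iteration."""
    steps = 0
    for _ in range(n):
        j = 0
        while j < n:
            steps += 1
            j += 1
    return steps


linear_steps = count_shared_counter(1000)    # 1000
quadratic_steps = count_reset_counter(1000)  # 1000000
```

The only difference between the two is where the counter is initialized: outside the for loop the inner work is shared, inside the for loop it starts over every time.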

Search optimized database (full-text search) for system design interview

Daniel Lee — Wed, 16 Oct 2024 01:16:58 +0000

Traditional databases run a table scan to find a search term in the database. This is slow and inefficient if a table stores a large dataset (1000+ rows). To improve this, a "search optimized database" can be used.

It uses indexing, tokenization and stemming to make search queries fast and efficient (by building "inverted indexes"):

Tokenization is the process of breaking a piece of text into individual words. It helps map words to the documents containing those words in the inverted index.
Stemming is the process of reducing words to their root form. For example, "running" and "runs" can be reduced to "run".

Something to note is the underlying data structure behind the search mechanism of a search optimized database: the inverted index.

It is a data structure that maps words to the documents that contain them. For example:

{
  "word1": [doc1,doc2,doc3],
  "word2": [doc2,doc3,doc6]
}
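Putting tokenization, stemming, and the inverted index together in a toy sketch. The `stem` function is a deliberately naive stand-in written for this note (real engines use proper stemmers), and the documents are made up.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Break a piece of text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(word: str) -> str:
    """Naive stemming: strip 'ing' or a trailing 's' (toy rule only)."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        if word[-1] == word[-2]:  # "runn" -> "run"
            word = word[:-1]
    elif word.endswith("s") and len(word) > 3:
        word = word[:-1]
    return word


def build_inverted_index(docs: dict) -> dict:
    """Map each stemmed word to the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[stem(token)].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


docs = {1: "She runs daily", 2: "Running is fun", 3: "Fun run today"}
index = build_inverted_index(docs)
# index["run"] == [1, 2, 3] and index["fun"] == [2, 3]
```

A search for "run" is now a single dictionary lookup instead of a scan over every document.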

Most search optimized databases also support "Fuzzy Search" out of the box as a configuration. Fuzzy Search works by leveraging the "edit distance calculation" technique, which measures how many letters have to be changed/added/removed to transform one word into another. Thus, results with minor misspellings or discrepancies relative to the search term can still be returned efficiently in case of human errors.
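The edit distance itself is a classic dynamic programming exercise. Below is a sketch of the Levenshtein distance (insert, delete, and substitute each cost 1); whether a given search engine uses exactly this variant is not something this note claims.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-letter inserts, deletes, or substitutions
    needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute (or keep when equal)
            ))
        prev = curr
    return prev[-1]


distance = edit_distance("kitten", "sitting")  # 3
```

A fuzzy search could then accept any indexed word within a small distance (say 1 or 2) of the search term.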

One of the popular search optimized databases is "Elasticsearch".

Reference

https://www.hellointerview.com/learn/system-design/in-a-hurry/key-technologies

TIL: HTTP methods are case-sensitive

Daniel Lee — Thu, 19 Sep 2024 01:04:56 +0000

When making API calls, for example using the fetch() API, keep in mind that request methods are case-sensitive, and the standard methods are defined in all UPPER-CASE, according to RFCs 7230 and 7231:

The method token is case-sensitive because it might be used as a gateway to object-based systems with case-sensitive method names.

fetch(url, { method: "PATCH" }) // or "GET", "PUT", "DELETE", etc., always in upper case

TIL: api route paths can be designed with the hyphen (-) and the dot (.)

Daniel Lee — Mon, 16 Sep 2024 17:16:03 +0000

API routes can be designed to use the hyphen and the dot for useful purposes, for instance, to pass/parse multiple pieces of information in one path segment.

Example usage of the hyphen

  Route path: /flights/:from-:to
  Request URL: http://localhost:3000/flights/LAX-SFO
  req.params: { "from": "LAX", "to": "SFO" }

Example usage of the dot

  Route path: /plantae/:genus.:species
  Request URL: http://localhost:3000/plantae/Prunus.persica
  req.params: { "genus": "Prunus", "species": "persica" }

There are other ways to pass multiple values to a route, such as repeating the same query parameter or using the array notation.

Example

  Route path: /test
  Request URL1: https://localhost:3000/test?array=a&array=b&array=c
  Request URL2: https://localhost:3000/test?array[]=a&array[]=b&array[]=c

In the code

  app.get('/test', function (req, res) {
    console.log(req.query.array); // ['a', 'b', 'c']
    res.sendStatus(200); // res.send(200) is deprecated in Express 4+
  });

Note that query strings are visible in URLs (and can end up in server logs and browser history), so sensitive data shouldn't be put into a query string.

Key components to know for system design interview

Daniel Lee — Fri, 13 Sep 2024 04:25:55 +0000

This article is intended for software engineers with prior experience in development.

How to Approach System Design Interviews?

Think like a tech lead guiding junior engineers on how to implement your design.

What interviewers want to see:

base-level understanding of system design fundamentals
back-and-forth about problem constraints and parameters of your service
well-reasoned, qualified decisions based on engineering trade-offs
unique direction your experience and decisions take them
holistic view of a system and its users

1) API

REST

APIs must be modelled based on the resources in the system. For instance, a single URL per resource, combined with HTTP verbs (GET, POST, PATCH, PUT, DELETE)
Good: versioning, structured
Bad: unneeded data also get fetched

RPC

Write code that executes on another remote machine as if it were a local call
APIs are thought of as an action/command (ex. /postAnOrder(OrderDetails order))
Good: no special syntax to be learned, space-efficient
Bad: best used only for internal communication because of timing issues (it becomes challenging to distinguish multiple concurrent communications between machines)

GraphQL

Data is structured as a graph: vertices (entities) and edges (relationships)
Good: ideal for customer-facing apps; you get exactly what you ask for; no more routing in the backend to get and modify information
Bad: generating documentation is less friendly than with REST; not suitable for aggregate data

2) Databases (SQL vs NoSQL)

SQL

composed of rows and tables
strong ACID (emphasis: strong consistency)
support powerful queries
bad: writes can be slow because B-trees have to split/merge pages/blocks.

NoSQL

nested key-val store
multiple writes can be easily handled
emphasis: eventual consistency
bad: reads might be stale for a couple of seconds (a consequence of eventual consistency; many NoSQL stores use log-structured merge-trees, which favor fast writes)

Other types

document-type (JSON)
columnar-type (good for queries that compute over the values of one column across many rows)
graph-type

3) Scaling (horizontal vs vertical)

Database scaling

utilize replicas, then shard into separate databases. Sharding uses a hash function for even distribution and retrieval of entries.

Compute Scaling

divide the processing into pieces and designate each piece as a job in a queue so that multiple computers can work together in parallel.
both approaches may introduce some latency between calls/requests.
replicas ensure the reliability of a system by avoiding a single point of failure.

4) CAP Theorem

In the real world, a distributed system cannot guarantee all three at the same time
one of the key fundamentals of distributed system design

Consistency

every node in a network will have access to the same data

Availability

even if one or more nodes are down, any client making a data request receives a response

Partition Tolerance (necessary for modern systems)

In case of a fault in a network or communication, the system will continue to work

5) Web Authentication and Basic Security

It's all about the trade-offs between total safety and total convenience
Authentication (JWT, session tokens/cookies) is about verifying identity, whereas authorization is allowing actions.
For instance, user password can be secured with hashing and salting.
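A minimal sketch of salting and hashing with Python's standard library. PBKDF2 with SHA-256 and 100,000 iterations are choices made for this example, not recommendations from the source; the function names are made up.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes = None, iterations: int = 100_000):
    """Return (salt, digest). A fresh random salt is generated when none is
    given, so identical passwords end up with different digests."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 100_000) -> bool:
    _, digest = hash_password(password, salt, iterations)
    # Constant-time comparison avoids leaking how many bytes matched
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse")
ok = verify_password("correct horse", salt, stored)  # True
bad = verify_password("wrong guess", salt, stored)   # False
```

Only the salt and the digest are stored; the plain password never is.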

6) Load Balancers

It's used to distribute traffic across machines (and to add or remove servers from rotation in case of a failure).
3 common techniques: round-robin, least connections/response time, consistent hashing.

Round-Robin

sends requests to servers one by one, in turn
can overload a server because it ignores each server's current load
ideal when servers are stable and loads are random

Least Connections/Response Time

ideal when servers have similar compute power and requests have varying connection times

Consistent Hashing

place N virtual nodes on the hash ring for each server, so that loads are distributed as evenly as possible and only part of the hash ring is affected when a server is added or removed.

7) Caching

Used to reduce the latency of expensive computations, network calls, database queries, and asset fetching.
Popular caching patterns: cache-aside, and write-through/write-back.

Cache-aside

fetch from the cache first; if not found, fetch from the database, then cache it.
data can become stale in the cache if there are frequent writes to the database. A "Time-to-Live" (TTL) on cache entries can mitigate it.
On a cache miss, checking the cache first introduces some extra latency.
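Cache-aside with a TTL can be sketched in a few lines. The class name, the `load_from_db` callback, and passing the current time explicitly are assumptions for illustration; a real setup would talk to something like Redis and a real database.

```python
class CacheAside:
    def __init__(self, load_from_db, ttl_seconds: float):
        self._load_from_db = load_from_db  # called only on a cache miss
        self._ttl = ttl_seconds
        self._cache = {}                   # key -> (value, expires_at)

    def get(self, key, now: float):
        entry = self._cache.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                # cache hit, still fresh
        value = self._load_from_db(key)    # miss or expired: go to the db
        self._cache[key] = (value, now + self._ttl)
        return value


db = {"user:1": "Daniel"}
db_reads = []


def load_from_db(key):
    db_reads.append(key)  # track how often the database is hit
    return db.get(key)


cache = CacheAside(load_from_db, ttl_seconds=30)
first = cache.get("user:1", now=0)    # miss: reads the database
second = cache.get("user:1", now=10)  # hit: served from the cache
db["user:1"] = "Dan"                  # a write makes the cached value stale
third = cache.get("user:1", now=20)   # still "Daniel" until the TTL expires
fourth = cache.get("user:1", now=31)  # expired: reloads "Dan"
```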

Write-through and write-back

The application writes data directly to the cache, and the cache updates the database: asynchronously (write-back) or synchronously (write-through)

Write-back

data is written to the cache first, then goes into a queue that writes it back to the database asynchronously.

Write-through

opposite of write-back: data is written to the cache and the database synchronously. Because the workflow is synchronous, it can slow down the whole write path.
cache eviction policy: Least Recently Used (LRU)

8) Message Queues (Pub/Sub)

beneficial if there can be a spike of traffic that potentially brings a server or a database down.
queues can send requests to multiple servers/systems instead of clients sending the same request to multiple servers/systems.
queues decouple the client from the server by eliminating the need to know the server address.

Common properties (based on implementations)

guaranteed delivery
no duplicate messages are delivered
ensure that the order of messages is maintained

9) Indexing

great for fetching a block of data from the hard disk to primary memory
can be multi-levelled
B-tree (self-balancing; keeps keys in sorted order across pages)

10) Failover (active-passive or leader-follower)

replication is used to avoid a single point of failure. It also helps a system serve global users across geographical locations/regions, and increases throughput.

leaders

machine that handles write requests to the data-store

followers

replicas of the leader that handle read requests

synchronous replication

the leader waits until the followers have acknowledged a write before the write is considered successful. It slows down writes, but guarantees that the followers have the data.

asynchronous replication

opposite of synchronous replication: the leader does not wait for the followers to acknowledge a write.
less time-consuming, but no guarantee on delivery.
most common types of replication systems: single-leader, multi-leader (multiple machines can handle writes, but each needs to catch up with writes on other machines for consistency)
to resolve concurrent write conflicts:
- keep the update with the largest client timestamp
- sticky routing: writes from the same client go to the same leader
- keep all the conflicting updates and return all of them so that they can be resolved

Storage and retrieval for system design interview

Daniel Lee — Wed, 11 Sep 2024 06:21:30 +0000

The following content is my own notes from the book, "Designing Data-Intensive Applications". The writing is intended for people who want to dash through the book quickly.

There are different ways to store data (storage engines) for transactional workloads and for analytics. The two are called OLTP (optimized for transaction processing) and OLAP (optimized for analytics).

OLTP is user-facing, meaning request volume is high, so strategies such as using an index are required to improve query performance.
On the other hand, OLAP is computationally heavy: each query may scan over millions of records because of aggregate functions (SUM, COUNT, AVG, etc). Therefore, column-oriented storage is generally desirable.

Where does it start from?

In order for humans to understand what's happening in applications or machines, we need some kind of record, which we call a log.

A log is simply output text describing the states of a machine or an application. As long as a machine operates, logs are created endlessly, and storing them is constrained by disk or RAM capacity. So, strategies such as compaction are used to avoid running out of disk space.

To make this possible, a large log file is broken down into smaller segments. Compaction throws away duplicate keys in the log, keeping only the most recent update for each key, and compacted segments are then merged to keep the storage efficient. This append-only design is generally ideal for writes.
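Compaction and merging boil down to "last write wins per key". In this sketch a segment is modelled as a list of (key, value) pairs in write order, and the output is sorted by key only to make it easy to read; both are assumptions for illustration.

```python
def compact_and_merge(segments: list) -> list:
    """Merge segments (oldest first) and keep only the most recent
    value for each key."""
    latest = {}
    for segment in segments:           # oldest segment first
        for key, value in segment:     # entries in write order
            latest[key] = value        # later writes overwrite earlier ones
    return sorted(latest.items())


old_segment = [("a", "1"), ("b", "2"), ("a", "3")]
new_segment = [("b", "4"), ("c", "5")]
merged = compact_and_merge([old_segment, new_segment])
# merged == [("a", "3"), ("b", "4"), ("c", "5")]
```

Five log entries shrink to three, and the input segments are left untouched, which is why reads can keep using them until the merged segment is ready.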

Segments are never modified after they have been written, so merging and compaction can run in the background while reads are still served fast from the old segment files. Writes continue to go to the latest segment file. Once merging is done, read requests are switched to the new merged segment, and the old segment files are removed to keep storage efficient.

There are also different ways to lay out stored data, such as column-based storage.

In a relational database schema, each row can consist of one to many columns, and not every request needs all of those column values. An indexing strategy is often used to optimize performance to some degree.

An index is a data structure to efficiently find the value for a particular key in the database. The general idea is to keep some additional metadata on the side (to help you locate the data you want). It's derived from the primary data and affects the performance of queries: reads get faster, but whenever a write happens, the indexes also need to be updated, which slows down writes. So, developers need to be mindful when creating indexes.

The most widely used indexing data structure is the B-tree, which stores key-val pairs sorted by key.

There are also many other indexing strategies, such as hash indexing. In a column-based layout, however, the values of each column (field) are stored together rather than the values of each row, and every column keeps its values in the same row order. So, to access the nth record, you simply read the nth value of each column. Repeated values within a column can be compressed with techniques such as bitmap encoding.
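To picture the column-based layout and bitmap encoding, here is a small sketch. The sample rows and function names are made up, and it assumes every row has the same fields.

```python
def to_columns(rows: list) -> dict:
    """Turn a list of row dicts into one list per column,
    keeping the same row order in every column."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def bitmap_encode(column: list) -> dict:
    """One bitmap per distinct value: bit n is 1 if row n has that value."""
    return {
        value: [1 if v == value else 0 for v in column]
        for value in sorted(set(column))
    }


rows = [
    {"product": "apple", "qty": 3},
    {"product": "pear", "qty": 1},
    {"product": "apple", "qty": 2},
]
columns = to_columns(rows)
total_qty = sum(columns["qty"])               # only the qty column is read
bitmaps = bitmap_encode(columns["product"])
# bitmaps == {"apple": [1, 0, 1], "pear": [0, 1, 0]}
```

An aggregate such as SUM(qty) touches a single column instead of every full row, which is the reason this layout suits OLAP queries.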