<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: ufraaan</title>
    <description>The latest articles on Forem by ufraaan (@ufraan).</description>
    <link>https://forem.com/ufraan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818829%2F08f2f0ce-72c8-456f-9ea8-6f07216a4d07.jpeg</url>
      <title>Forem: ufraaan</title>
      <link>https://forem.com/ufraan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ufraan"/>
    <language>en</language>
    <item>
      <title>BitTorrent Internals</title>
      <dc:creator>ufraaan</dc:creator>
      <pubDate>Sun, 03 May 2026 11:39:48 +0000</pubDate>
      <link>https://forem.com/ufraan/bittorrent-internals-52mj</link>
      <guid>https://forem.com/ufraan/bittorrent-internals-52mj</guid>
      <description>&lt;p&gt;BitTorrent is a decentralized peer-to-peer (P2P) file-sharing protocol designed for fast, efficient distribution of large files over the internet.&lt;/p&gt;

&lt;p&gt;Let's first see how we classically download files from the internet, and why we even need something like BitTorrent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqrajxismc1dy0im5d51h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqrajxismc1dy0im5d51h.png" alt="client-server" width="800" height="259"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The client requests a file from the server; the server has the file and responds. Things get interesting when the file is large or many clients ask for it at once.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Server bandwidth is limited, so as more clients connect, speed slows down.&lt;/li&gt;
&lt;li&gt;Speed of data transfer is capped by the server's upload capacity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyjlwyfcm74fug5rxy891.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyjlwyfcm74fug5rxy891.png" alt="alice-bob" width="800" height="317"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If Bob's upload speed is 60 Mbps, then no matter how fast Alice's download speed is, the overall download speed cannot exceed 60 Mbps.&lt;/p&gt;




&lt;h2&gt;
  
  
  Peer-to-Peer Network
&lt;/h2&gt;

&lt;p&gt;In a P2P network, every party participating in the network has the exact same capabilities: they are all equal peers and can initiate conversations with each other.&lt;/p&gt;

&lt;p&gt;The main highlight of P2P: even if a few nodes crash or are removed, the network keeps serving its purpose. No single point of failure.&lt;/p&gt;

&lt;p&gt;This isn't just about outages: it also applies to the core service the network provides. For example, if the network's job is to serve files and one machine goes down, other machines that hold those files keep sharing them with whoever needs them. There are no "system interruptions" as long as enough peers holding the data stay online.&lt;/p&gt;

&lt;p&gt;P2P networks come in two flavors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pure P2P&lt;/strong&gt;: No central entity. Every node can connect to every other node.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid P2P&lt;/strong&gt;: Has a central entity, used to share &lt;em&gt;metadata&lt;/em&gt; about the data across peers, not the data itself.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa38j3ajg40okcc3eq8k3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa38j3ajg40okcc3eq8k3.png" alt="pure-hybrid-p2p" width="800" height="349"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: If the central entity goes down, the network and its services are affected. This hybrid P2P architecture is what powers BitTorrent.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;BitTorrent has a central entity called a &lt;strong&gt;tracker&lt;/strong&gt;. Peers talk to each other, but to know &lt;em&gt;who&lt;/em&gt; to talk to, they first consult the tracker.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core Idea
&lt;/h2&gt;

&lt;p&gt;The core idea of BitTorrent is to download a file from multiple machines concurrently.&lt;/p&gt;

&lt;p&gt;We saw that download speed is limited by the upload capacity of the sender, be it a user, a server, or anything else. If you can download at 100 Mbps but the sender can only upload at 60 Mbps, you'll max out at 60 Mbps.&lt;/p&gt;

&lt;p&gt;But what if instead of downloading from one machine, we distributed the file across the network and connected to 50 different clients simultaneously to download in parallel? That's the idea behind BitTorrent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1s9g2rtm6cr65yfwhxz0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1s9g2rtm6cr65yfwhxz0.png" alt="p2p1" width="800" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Faster downloads.&lt;/li&gt;
&lt;li&gt;Upload load is distributed among peers. Every peer may hold some fragment of the file and can serve it to others. You still get high download speeds, but the upload burden is shared across the network.&lt;/li&gt;
&lt;li&gt;A large number of downloads puts only a small load on each peer, because it's highly distributed.&lt;/li&gt;
&lt;li&gt;Breaking a file into smaller chunks boosts concurrency.&lt;/li&gt;
&lt;/ul&gt;
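
&lt;p&gt;To put rough numbers on this, here's a back-of-the-envelope sketch in Python. It assumes an idealized network with no protocol overhead, and the function name &lt;code&gt;effective_download_mbps&lt;/code&gt; and the per-peer speeds are made up for illustration.&lt;/p&gt;

```python
def effective_download_mbps(download_capacity, peer_uploads):
    """Idealized model: you receive the sum of what the senders can upload,
    capped by your own download capacity."""
    return min(download_capacity, sum(peer_uploads))


# One sender uploading at 60 Mbps to a client that can download at 100 Mbps.
single_source = effective_download_mbps(100, [60])

# Hypothetical case: 50 peers, each sparing 5 Mbps of upload for us.
many_peers = effective_download_mbps(100, [5] * 50)

print(single_source, many_peers)
```

&lt;p&gt;With a single sender the upload side is the bottleneck; with many peers the bottleneck moves to your own download capacity.&lt;/p&gt;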




&lt;h2&gt;
  
  
  A Simplified Download Flow
&lt;/h2&gt;

&lt;p&gt;When a user wants to download a file, they sniff around the network to find peers that have the pieces. For this, they use a tracker.&lt;/p&gt;

&lt;p&gt;The user goes to the tracker and says "I want this file." The tracker responds with a list of peers that have it. The user then connects directly to those peers and downloads the file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhuudp3zxz9ntds12raoc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhuudp3zxz9ntds12raoc.png" alt="p2p2" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's say a user wants a file that has 4 chunks. They go to the tracker, the tracker responds with a list of peers sharing that file, the user asks those peers which chunks they hold, downloads each chunk from a peer that has it, and concatenates the chunks locally to get the full file.&lt;/p&gt;
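
&lt;p&gt;The flow above can be sketched as a toy, in-memory simulation. Nothing here touches a real network: the peer names, the chunk contents, and the helpers &lt;code&gt;ask_tracker&lt;/code&gt; and &lt;code&gt;download&lt;/code&gt; are all invented for illustration.&lt;/p&gt;

```python
# Toy in-memory model of the simplified flow; no real networking involved.
# Each peer maps chunk index to chunk data (all values are made up).
PEERS = {
    "peer-a": {0: b"BitT", 1: b"orre"},
    "peer-b": {1: b"orre", 2: b"nt r"},
    "peer-c": {2: b"nt r", 3: b"ocks"},
}


def ask_tracker(peers):
    """The tracker only says who is sharing the file; it never sends data."""
    return sorted(peers)


def download(num_chunks, peers):
    chunks = {}
    for name in ask_tracker(peers):
        for index, data in peers[name].items():
            # Keep each chunk from the first peer that offers it.
            chunks.setdefault(index, data)
    # Concatenate the chunks in index order to rebuild the file.
    return b"".join(chunks[i] for i in range(num_chunks))


print(download(4, PEERS))
```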




&lt;h2&gt;
  
  
  Nomenclature &amp;amp; Terminologies
&lt;/h2&gt;

&lt;p&gt;These terms are useful when analyzing BitTorrent’s behavior and algorithms.&lt;/p&gt;




&lt;h3&gt;
  
  
  1. Pieces and Blocks
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A file shared on the BitTorrent network is divided into &lt;strong&gt;pieces&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Each piece is further subdivided into &lt;strong&gt;blocks&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Data transfer happens at the &lt;strong&gt;block level&lt;/strong&gt; (one block per request).&lt;/li&gt;
&lt;li&gt;Example:

&lt;ul&gt;
&lt;li&gt;A ~16 MB piece → ~1000 blocks of 16 KB each.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;A piece is considered valid &lt;strong&gt;only once all its blocks are received&lt;/strong&gt; and its SHA1 hash matches the one recorded in the &lt;code&gt;.torrent&lt;/code&gt; file.&lt;/li&gt;

&lt;li&gt;The client reconstructs the original file by concatenating all pieces.&lt;/li&gt;

&lt;/ul&gt;
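
&lt;p&gt;Here's a small sketch of how one piece breaks down into block requests. It assumes 16 KiB blocks as in the example above; the helper name &lt;code&gt;block_requests&lt;/code&gt; is hypothetical and not taken from any particular client.&lt;/p&gt;

```python
BLOCK_SIZE = 16 * 1024  # 16 KiB, the block size used in the example above


def block_requests(piece_length, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs covering one piece, block by block."""
    requests = []
    for begin in range(0, piece_length, block_size):
        # The final block is shorter when the piece length is not an
        # exact multiple of the block size.
        length = min(block_size, piece_length - begin)
        requests.append((begin, length))
    return requests


reqs = block_requests(16 * 1024 * 1024)  # a 16 MiB piece
print(len(reqs), reqs[0], reqs[-1])
```

&lt;p&gt;With binary units, a 16 MiB piece works out to exactly 1024 blocks, which is where the "~1000" figure comes from.&lt;/p&gt;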

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1a4wft2yfdufhzjjd9c0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1a4wft2yfdufhzjjd9c0.png" alt="Pieces and Blocks" width="800" height="601"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Peer Set
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;peer set&lt;/strong&gt; is the list of peers a node can connect to for uploading/downloading.&lt;/li&gt;
&lt;li&gt;Typically obtained from a &lt;strong&gt;tracker&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Example:

&lt;ul&gt;
&lt;li&gt;If peer A receives &lt;code&gt;{C, E}&lt;/code&gt; from the tracker, it exchanges data only with C and E.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03yvyai4ita62h46bj3y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03yvyai4ita62h46bj3y.png" alt="Peer Set" width="800" height="307"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  3. Active Peer Set
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A subset of the peer set used for &lt;strong&gt;active data transfer&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Not all peers are connected simultaneously.&lt;/li&gt;
&lt;li&gt;Example:

&lt;ul&gt;
&lt;li&gt;Out of 50 peers received, only ~10 may be actively connected.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Purpose:

&lt;ul&gt;
&lt;li&gt;Limits bandwidth usage.&lt;/li&gt;
&lt;li&gt;Reduces network congestion.&lt;/li&gt;
&lt;li&gt;Improves stability of connections.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgoksj0r3dgshkjypr4y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgoksj0r3dgshkjypr4y.png" alt="Active Peer Set" width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  4. Seeders &amp;amp; Leechers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Seeder&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A peer that has the complete file.&lt;/li&gt;
&lt;li&gt;Uploads pieces to others.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Leecher&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A peer that is still downloading.&lt;/li&gt;
&lt;li&gt;May also upload already downloaded pieces.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1p62arpkr5z6job7dcd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1p62arpkr5z6job7dcd.png" alt="Seeder vs Leecher" width="620" height="840"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Impact on Performance
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;More &lt;strong&gt;seeders&lt;/strong&gt; → higher availability → faster downloads.&lt;/li&gt;
&lt;li&gt;Few seeders → bottleneck (resembles client-server model).&lt;/li&gt;
&lt;li&gt;If &lt;strong&gt;leechers ≫ seeders&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Increased contention.&lt;/li&gt;
&lt;li&gt;Slower download speeds.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  BitTorrent is Popularity-Friendly
&lt;/h2&gt;

&lt;p&gt;New and popular files will have many seeders and download faster. Old or unpopular files have fewer seeders and download slower.&lt;/p&gt;

&lt;p&gt;For example, when a new version of an operating system is released, there's a very high chance many people want to download it. Ubuntu and Debian offer official torrent distributions, and these attract many seeders, so whoever wants to download gets fast speeds.&lt;/p&gt;




&lt;h2&gt;
  
  
  Applications of BitTorrent
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Downloading Linux distributions (often faster than a single FTP or HTTP mirror), as well as large software, movies, games, etc.&lt;/li&gt;
&lt;li&gt;Sending patches to users (e.g., security patches). You can run a small BitTorrent-based system where you drop a file into one node and it automatically distributes across every machine in your network, which can then apply the patch. Large data centers use this approach to distribute security patches.&lt;/li&gt;
&lt;li&gt;Facebook has used BitTorrent to power massive deployments and distribute build artifacts across servers. Instead of thousands of servers all downloading a binary from one source, each server fetches pieces from many others and shares the pieces it already has. The network gradually converges and every node ends up with the full file.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  The Torrent File
&lt;/h2&gt;

&lt;p&gt;To download or upload any file from the torrent network, you need a &lt;code&gt;.torrent&lt;/code&gt; file. This file holds metadata about the file you want to download.&lt;/p&gt;

&lt;p&gt;For example, if you want to download Ubuntu from the torrent network, the Ubuntu ISO has a corresponding &lt;code&gt;.torrent&lt;/code&gt; file. You download that small file, which contains all the metadata, and your client then uses it to fetch the actual ISO from the network.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6tbvsg4c42tgblami7ju.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6tbvsg4c42tgblami7ju.png" alt="torrent-file" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Lifecycle of a Torrent File
&lt;/h2&gt;

&lt;p&gt;Seeders are seeding data in the network, and as long as at least 1 seeder is serving the file, the torrent is alive. Otherwise, the torrent is dead.&lt;/p&gt;

&lt;p&gt;It's therefore very important to have at least 1 seeder: otherwise nobody can download the file.&lt;/p&gt;

&lt;p&gt;What separates BitTorrent from a classic blockchain/cryptocurrency use case is that the protocol has no built-in reward for joining and staying on as a seeder. Cryptocurrency incentivizes participation in the network; BitTorrent doesn't.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj7qs1xszt86xxsps7nc9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj7qs1xszt86xxsps7nc9.png" alt="user-download-via-http" width="800" height="264"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What Does the Torrent File Hold?
&lt;/h2&gt;

&lt;p&gt;The torrent file is static: no matter when you download it, it will always have the same content.&lt;/p&gt;

&lt;p&gt;It holds metadata about the file, not the actual data.&lt;/p&gt;

&lt;p&gt;A torrent file is essentially a dictionary of key-value pairs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;announce&lt;/strong&gt;: URL of the tracker. This tells your torrent client which tracker to contact to find peers in the network.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;created by&lt;/strong&gt;: Name and version of the program that created the torrent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;creation date&lt;/strong&gt;: Creation timestamp in Unix epoch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;encoding&lt;/strong&gt;: Encoding used for strings in the &lt;code&gt;info&lt;/code&gt; dictionary. Defaults to UTF-8.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;comment&lt;/strong&gt;: Optional comment from the author.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;info&lt;/strong&gt;: A dictionary describing the file(s) of the torrent. For example, if you're downloading Ubuntu, it would contain information about the Ubuntu image itself.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;BitTorrent supports two types of downloads: single-file and multi-file. Depending on the type, the structure of the &lt;code&gt;info&lt;/code&gt; dictionary varies.&lt;/p&gt;
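
&lt;p&gt;As a rough picture of the two layouts, here are both written as plain Python dictionaries. The key names (&lt;code&gt;name&lt;/code&gt;, &lt;code&gt;length&lt;/code&gt;, &lt;code&gt;files&lt;/code&gt;, &lt;code&gt;path&lt;/code&gt;) follow the original BitTorrent specification; the tracker URL, file names, and sizes are invented, and &lt;code&gt;total_size&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
# Illustrative metadata only: the tracker URL, names, and sizes are invented.
single_file = {
    "announce": "http://tracker.example.org/announce",
    "info": {
        "name": "ubuntu.iso",
        "length": 3 * 1024 * 1024,   # size of the one file, in bytes
        "piece length": 1024 * 1024,
        "pieces": b"",               # concatenated SHA-1 hashes, omitted here
    },
}

multi_file = {
    "announce": "http://tracker.example.org/announce",
    "info": {
        "name": "album",             # top-level directory name
        "files": [
            {"length": 500, "path": ["disc1", "track1.flac"]},
            {"length": 700, "path": ["disc1", "track2.flac"]},
        ],
        "piece length": 1024 * 1024,
        "pieces": b"",
    },
}


def total_size(info):
    """Total payload size in bytes for either layout."""
    if "files" in info:
        return sum(entry["length"] for entry in info["files"])
    return info["length"]


print(total_size(single_file["info"]), total_size(multi_file["info"]))
```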

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3b77n0wi5jsgn6yovxbt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3b77n0wi5jsgn6yovxbt.png" alt="single-file-format" width="800" height="124"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0npolvip8tio987vkmv5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0npolvip8tio987vkmv5.png" alt="multi-file-format" width="800" height="317"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  File Data Information
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;info&lt;/code&gt; dictionary also stores information about the pieces:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;piece length&lt;/strong&gt;: Number of bytes in each piece.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pieces&lt;/strong&gt;: 20-byte SHA1 hash values for each piece, concatenated together.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp2r0u2id2mkqcyx8852z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp2r0u2id2mkqcyx8852z.png" alt="pieces" width="800" height="224"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since a file is split into equal-size pieces (only the last one may be shorter), &lt;code&gt;piece length&lt;/code&gt; tells you how big each one is.&lt;/p&gt;

&lt;p&gt;For example, a 1 GB file with a piece size of 1 MB would have 1024 pieces. The torrent file doesn't store the actual piece data: instead, for each piece it stores a 20-byte SHA1 hash and concatenates all of them together.&lt;/p&gt;
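
&lt;p&gt;The piece arithmetic and the hash check can be sketched in a few lines of Python using the standard &lt;code&gt;hashlib&lt;/code&gt; module. The helper names and the demo data are assumptions for illustration, not code from a real client.&lt;/p&gt;

```python
import hashlib

HASH_LEN = 20  # a SHA-1 digest is 20 bytes


def piece_count(file_size, piece_length):
    """Number of pieces, rounding up so a short last piece is counted."""
    return -(-file_size // piece_length)


def split_hashes(pieces):
    """Cut the concatenated `pieces` value into one 20-byte hash per piece."""
    return [pieces[i:i + HASH_LEN] for i in range(0, len(pieces), HASH_LEN)]


def piece_is_valid(index, data, pieces):
    """Compare a downloaded piece against its expected hash."""
    return hashlib.sha1(data).digest() == split_hashes(pieces)[index]


# Build a tiny stand-in for the `pieces` field from two made-up pieces.
demo = [b"first piece", b"second piece"]
pieces_field = b"".join(hashlib.sha1(p).digest() for p in demo)

print(piece_count(1024 ** 3, 1024 ** 2))   # 1 GiB file, 1 MiB pieces
print(len(pieces_field))                   # 2 pieces times 20 bytes
print(piece_is_valid(1, b"second piece", pieces_field))
print(piece_is_valid(1, b"tampered", pieces_field))
```

&lt;p&gt;For the 1 GB example, 1024 pieces means the &lt;code&gt;pieces&lt;/code&gt; value is 1024 × 20 = 20,480 bytes long.&lt;/p&gt;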




&lt;h3&gt;
  
  
  Torrent File Format: Bencoding
&lt;/h3&gt;

&lt;p&gt;Torrent files use a custom encoding format called &lt;strong&gt;bencoding&lt;/strong&gt;, not JSON.&lt;/p&gt;

&lt;p&gt;When you open a &lt;code&gt;.torrent&lt;/code&gt; file in a client like qBittorrent, the client first decodes the bencoded file to extract the metadata. The component that does this is called a &lt;strong&gt;bencoding decoder&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  Bencoding Specification
&lt;/h3&gt;

&lt;p&gt;Every torrent file is a bencoded dictionary. The bencoding specification supports only 4 data types: strings, integers, lists, and dictionaries.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjgehz65cn905jt1lmj8a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjgehz65cn905jt1lmj8a.png" alt="bencoding-breakdown" width="800" height="562"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So the entire torrent file is a bencoded dictionary.&lt;/p&gt;
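
&lt;p&gt;To show how little there is to the format, here's a minimal decoder sketch in Python. It relies on the standard bencoding syntax (integers as &lt;code&gt;i42e&lt;/code&gt;, strings as &lt;code&gt;4:spam&lt;/code&gt;, lists as &lt;code&gt;l...e&lt;/code&gt;, dictionaries as &lt;code&gt;d...e&lt;/code&gt;) and does no validation of malformed input, so treat it as a learning aid rather than a production parser.&lt;/p&gt;

```python
def decode(data, pos=0):
    """Decode one bencoded value starting at `pos`.

    Returns (value, next_pos). Minimal sketch: malformed input is not
    checked beyond what the slicing and int() calls reject on their own.
    """
    lead = data[pos:pos + 1]
    if lead == b"i":                       # integer: i42e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                       # list: l...e
        items = []
        pos += 1
        while data[pos:pos + 1] != b"e":
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                       # dictionary: d...e
        result = {}
        pos += 1
        while data[pos:pos + 1] != b"e":
            key, pos = decode(data, pos)
            value, pos = decode(data, pos)
            result[key] = value
        return result, pos + 1
    # string: length prefix, colon, then raw bytes, e.g. 4:spam
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


sample = b"d8:announce3:url4:infod6:lengthi3e4:name3:isoee"
value, _ = decode(sample)
print(value)
```

&lt;p&gt;Strings come back as raw bytes on purpose: fields like &lt;code&gt;pieces&lt;/code&gt; hold binary hashes that aren't valid text.&lt;/p&gt;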

&lt;p&gt;I wrote a bencoding implementation in Go and understood the format way better. (&lt;a href="https://ufraan.dev/projects/bencode-foo" rel="noopener noreferrer"&gt;https://ufraan.dev/projects/bencode-foo&lt;/a&gt;)&lt;/p&gt;




&lt;h2&gt;
  
  
  The BitTorrent Architecture
&lt;/h2&gt;

&lt;p&gt;The BitTorrent architecture consists of 4 entities:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;.torrent&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;Trackers&lt;/li&gt;
&lt;li&gt;Seeders&lt;/li&gt;
&lt;li&gt;Leechers&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  Pieces
&lt;/h3&gt;

&lt;p&gt;Whenever a file is shared on the BitTorrent network, it's not shared in its entirety. It's first broken into &lt;strong&gt;pieces&lt;/strong&gt;, which become the unit that peers advertise, exchange, and verify.&lt;/p&gt;

&lt;p&gt;The downloader gets these pieces and concatenates them locally to form the complete file. All pieces are the same length, except possibly the last one, which can be shorter when the file size isn't an exact multiple of the piece length.&lt;/p&gt;

&lt;p&gt;For example, a 3 MB file with a piece size of 1 MB creates 3 pieces: p1, p2, p3.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4y6akhnl3c0dyca8w3a5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4y6akhnl3c0dyca8w3a5.png" alt="piecesinbt" width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you join the network and download a piece from a seeder, you immediately tell the peers you're connected to: "I have this piece now. If anyone needs it, come to me instead."&lt;/p&gt;

&lt;p&gt;As each peer finishes downloading a piece, it informs its connected peers. This is the power of P2P.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft4hluv7l1fkcu8okb9oj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft4hluv7l1fkcu8okb9oj.png" alt="pieces" width="800" height="233"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Torrent File
&lt;/h3&gt;

&lt;p&gt;A metafile that holds static information about the file: filename, size, piece information, etc. It does &lt;strong&gt;not&lt;/strong&gt; hold the actual data.&lt;/p&gt;

&lt;p&gt;One critical field it holds is the &lt;strong&gt;announce&lt;/strong&gt; URL: the tracker URL. The tracker is the only central entity in the BitTorrent architecture, acting as a metadata store where you can get information about other peers and the torrent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcinuvh5f02yozsjzebsg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcinuvh5f02yozsjzebsg.png" alt="seeder v leecher" width="800" height="321"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F53gd5e25npgdr58mikhd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F53gd5e25npgdr58mikhd.png" alt="stp diagram" width="800" height="325"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each torrent is uniquely identified by an &lt;strong&gt;infohash&lt;/strong&gt;: the SHA1 hash of the bencoded &lt;code&gt;info&lt;/code&gt; dictionary in the &lt;code&gt;.torrent&lt;/code&gt; file. The &lt;code&gt;.torrent&lt;/code&gt; file itself is typically downloaded through a regular HTTP web server.&lt;/p&gt;
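
&lt;p&gt;Here's a hedged sketch of computing an infohash in Python: a small bencode encoder plus &lt;code&gt;hashlib.sha1&lt;/code&gt;. The &lt;code&gt;encode&lt;/code&gt; and &lt;code&gt;info_hash&lt;/code&gt; helpers and the sample &lt;code&gt;info&lt;/code&gt; values are made up for illustration.&lt;/p&gt;

```python
import hashlib


def encode(value):
    """Bencode ints, bytes/str, lists, and dicts (keys sorted as raw bytes)."""
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(encode(v) for v in value) + b"e"
    if isinstance(value, dict):
        pairs = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        pairs.sort(key=lambda kv: kv[0])   # bencoded dict keys are sorted
        return b"d" + b"".join(encode(k) + encode(v) for k, v in pairs) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def info_hash(info):
    """SHA-1 digest (20 bytes) of the bencoded `info` dictionary."""
    return hashlib.sha1(encode(info)).digest()


# Made-up single-file info dictionary with one placeholder piece hash.
info = {
    "name": "demo.txt",
    "length": 11,
    "piece length": 16384,
    "pieces": b"\x00" * 20,
}
print(info_hash(info).hex())
```

&lt;p&gt;One caveat: when reading a real &lt;code&gt;.torrent&lt;/code&gt; file, it's safer to hash the exact bytes of the &lt;code&gt;info&lt;/code&gt; dictionary as they appear in the file, since re-encoding only gives the same result if the original was encoded canonically.&lt;/p&gt;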




&lt;h3&gt;
  
  
  Tracker
&lt;/h3&gt;

&lt;p&gt;The tracker is the only central entity in this P2P network, and it's very lightweight.&lt;/p&gt;

&lt;p&gt;For a given torrent, the &lt;code&gt;.torrent&lt;/code&gt; file contains the tracker URL. Every peer in the network connects to this tracker to get metadata about who else is in the network.&lt;/p&gt;

&lt;p&gt;Many independent trackers exist, so no single server coordinates all of BitTorrent, but for a given &lt;code&gt;.torrent&lt;/code&gt; file your client contacts the tracker named in its &lt;code&gt;announce&lt;/code&gt; field.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: The tracker does not download or transfer files. It only holds information about peers and their distribution: that's why it's so lightweight.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The core jobs of a tracker:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Keep track of peers that hold the file.&lt;/li&gt;
&lt;li&gt;Keep track of peers that are downloading.&lt;/li&gt;
&lt;li&gt;Help peers find other peers to download content from.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A tracker is essentially a simple HTTP server that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Hands out peer information to the network.&lt;/li&gt;
&lt;li&gt;Periodically collects stats from peers.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyxrbiemdko7mgfyqmnza.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyxrbiemdko7mgfyqmnza.png" alt="architecture-breakdown-hld" width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you have a &lt;code&gt;.torrent&lt;/code&gt; file, you first extract info from it, then contact the tracker saying "I want to join your network." The tracker responds with roughly 50 peers that are part of this network.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzj9bdn8ucudf5sirpz18.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzj9bdn8ucudf5sirpz18.png" alt="peer-set-and-state" width="800" height="244"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The tracker doesn't just send info to users. Peers in the network also periodically report back to the tracker: downloaded amount, uploaded amount, which torrent they're part of, etc.&lt;/p&gt;
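
&lt;p&gt;For an HTTP tracker, both the "let me join" request and the periodic report are the same kind of GET request with query parameters. Below is a sketch that only builds the URL, without sending anything. The parameter names come from the original BitTorrent specification; the tracker URL, peer id, and the &lt;code&gt;build_announce_url&lt;/code&gt; helper are placeholders.&lt;/p&gt;

```python
from urllib.parse import urlencode


def build_announce_url(announce, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=None):
    """Build the GET URL a client would send to an HTTP tracker."""
    params = {
        "info_hash": info_hash,    # 20 raw bytes, percent-encoded by urlencode
        "peer_id": peer_id,        # 20-byte id chosen by the client
        "port": port,              # port this client listens on
        "uploaded": uploaded,      # bytes uploaded so far
        "downloaded": downloaded,  # bytes downloaded so far
        "left": left,              # bytes still needed
    }
    if event is not None:
        params["event"] = event    # "started", "completed", or "stopped"
    # Sketch only: assumes the announce URL has no query string of its own.
    return announce + "?" + urlencode(params)


url = build_announce_url(
    "http://tracker.example.org/announce",  # placeholder tracker
    info_hash=bytes(range(20)),             # stand-in for a real infohash
    peer_id=b"-XX0001-123456789012",        # made-up peer id
    port=6881, uploaded=0, downloaded=0, left=3145728, event="started",
)
print(url)
```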

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnuqu6s80wpwxfhlckmlq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnuqu6s80wpwxfhlckmlq.png" alt="peer-set-and-connetions" width="800" height="365"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiafnysebqjdk5r8cf8bs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiafnysebqjdk5r8cf8bs.png" alt="peer-set-gossip" width="800" height="529"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>computerscience</category>
      <category>distributedsystems</category>
      <category>networking</category>
    </item>
    <item>
      <title>How Twitter Served 300,000 Timelines Per Second</title>
      <dc:creator>ufraaan</dc:creator>
      <pubDate>Fri, 01 May 2026 12:59:25 +0000</pubDate>
      <link>https://forem.com/ufraan/how-twitter-served-300000-timelines-per-second-43bp</link>
      <guid>https://forem.com/ufraan/how-twitter-served-300000-timelines-per-second-43bp</guid>
      <description>&lt;p&gt;This post is a plain breakdown of how Twitter (circa 2013) handled timeline scale, why the first approach broke, and why the final solution is a hybrid.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Based on concepts from Designing Data-Intensive Applications.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;When I strip Twitter down to the basics, there are really only two product operations that matter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Post Tweet&lt;/strong&gt;: a user publishes a new message to their followers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Home Timeline&lt;/strong&gt;: a user views tweets from the people they follow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here are the numbers Twitter published (Nov 2012):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Average&lt;/th&gt;
&lt;th&gt;Peak&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Post Tweet (writes)&lt;/td&gt;
&lt;td&gt;4,600 req/sec&lt;/td&gt;
&lt;td&gt;12,000 req/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Home Timeline (reads)&lt;/td&gt;
&lt;td&gt;300,000 req/sec&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The read-to-write ratio is roughly &lt;strong&gt;65x&lt;/strong&gt;. People read way more than they write. That asymmetry is the whole story here.&lt;/p&gt;

&lt;p&gt;What surprised me the first time I studied this: &lt;strong&gt;12,000 writes/sec is not the scary part&lt;/strong&gt;. A solid relational setup can handle that. The hard part is &lt;em&gt;fan-out&lt;/em&gt;: one tweet may need to show up in millions of home timelines almost immediately.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Schema
&lt;/h2&gt;

&lt;p&gt;At the core, Twitter's model has three tables.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;tweets&lt;/code&gt;: the content&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;sender_id&lt;/th&gt;
&lt;th&gt;text&lt;/th&gt;
&lt;th&gt;timestamp&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;"Excited to announce"&lt;/td&gt;
&lt;td&gt;1000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;"Grateful. Humbled. Dehydrated."&lt;/td&gt;
&lt;td&gt;1001&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;"I didn't come this far to only come this far"&lt;/td&gt;
&lt;td&gt;1002&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;"Rejected 12 times. Hired once. Now I speak at conferences."&lt;/td&gt;
&lt;td&gt;1003&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every tweet lives here. Notice &lt;code&gt;sender_id = 12&lt;/code&gt; appears twice: rows 1 and 3 are both from alice. This table is just tweet content, nothing else.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;users&lt;/code&gt;: the profiles&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;screen_name&lt;/th&gt;
&lt;th&gt;profile_image&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;bob&lt;/td&gt;
&lt;td&gt;bob.jpg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;charlie&lt;/td&gt;
&lt;td&gt;charlie.jpg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;alice&lt;/td&gt;
&lt;td&gt;alice.jpg&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This is profile data only. No tweets, no follow graph. If I see &lt;code&gt;sender_id = 12&lt;/code&gt; in &lt;code&gt;tweets&lt;/code&gt;, I resolve &lt;code&gt;id = 12&lt;/code&gt; here and get "alice".&lt;/p&gt;

&lt;p&gt;&lt;code&gt;follows&lt;/code&gt;: the relationships&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;follower_id&lt;/th&gt;
&lt;th&gt;followee_id&lt;/th&gt;
&lt;th&gt;meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;User 100 follows bob&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;User 100 follows alice&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;101&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;User 101 follows alice&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This table stores the follow graph. One row = one relationship.&lt;/p&gt;




&lt;h2&gt;
  
  
  Follower vs. Followee
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Follower&lt;/strong&gt;: the person doing the following. If you follow alice, &lt;em&gt;you&lt;/em&gt; are the follower.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Followee&lt;/strong&gt;: the person being followed. alice, in this case, is the followee.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So when alice tweets, the question is: &lt;em&gt;"Who has &lt;code&gt;followee_id = 12&lt;/code&gt;?"&lt;/em&gt; That result is everyone whose timeline might need an update.&lt;/p&gt;
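
&lt;p&gt;A minimal sketch of that lookup in Python, using the three example rows from the &lt;code&gt;follows&lt;/code&gt; table. The &lt;code&gt;followers_of&lt;/code&gt; helper is a hypothetical name chosen for illustration, not something from Twitter's codebase.&lt;/p&gt;

```python
# Rows copied from the example follows table: (follower_id, followee_id).
FOLLOWS = [
    (100, 5),   # user 100 follows bob
    (100, 12),  # user 100 follows alice
    (101, 12),  # user 101 follows alice
]


def followers_of(followee_id, follows=FOLLOWS):
    """Return the follower ids for one followee (hypothetical helper)."""
    return [follower for follower, followee in follows if followee == followee_id]


print(followers_of(12))  # alice's followers: [100, 101]
```

&lt;p&gt;In a real database this would be an indexed lookup on &lt;code&gt;followee_id&lt;/code&gt; rather than a scan over a list.&lt;/p&gt;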




&lt;h2&gt;
  
  
  Approach 1: Query at Read Time
&lt;/h2&gt;

&lt;p&gt;Twitter's original approach was straightforward: writes go into &lt;code&gt;tweets&lt;/code&gt;, and timelines are computed on demand at read time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;tweets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;tweets&lt;/span&gt;
  &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;tweets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sender_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;follows&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;follows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;followee_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;follows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;follower_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;current_user&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;tweets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
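
&lt;p&gt;To make the cost of this query concrete, here is roughly the same logic in plain Python over the example rows from the schema section. It is only a sketch: the function name &lt;code&gt;home_timeline&lt;/code&gt; is made up, and a real database would use indexes instead of list scans.&lt;/p&gt;

```python
# Example rows from the schema section (ids and timestamps as shown there).
TWEETS = [
    {"id": 1, "sender_id": 12, "text": "Excited to announce", "timestamp": 1000},
    {"id": 2, "sender_id": 5, "text": "Grateful. Humbled. Dehydrated.", "timestamp": 1001},
    {"id": 3, "sender_id": 12, "text": "I didn't come this far to only come this far", "timestamp": 1002},
    {"id": 4, "sender_id": 8, "text": "Rejected 12 times. Hired once. Now I speak at conferences.", "timestamp": 1003},
]
USERS = {5: "bob", 8: "charlie", 12: "alice"}
FOLLOWS = [(100, 5), (100, 12), (101, 12)]  # (follower_id, followee_id)


def home_timeline(current_user):
    """Compute a home timeline on demand, mirroring the two joins in the SQL."""
    followees = {followee for follower, followee in FOLLOWS if follower == current_user}
    rows = [
        (t["timestamp"], USERS[t["sender_id"]], t["text"])
        for t in TWEETS
        if t["sender_id"] in followees
    ]
    return sorted(rows, reverse=True)  # newest first, like ORDER BY timestamp DESC


for timestamp, screen_name, text in home_timeline(100):
    print(timestamp, screen_name, text)
```

&lt;p&gt;Every call repeats the filtering and sorting from scratch, which is exactly the work that has to happen on each timeline request in this approach.&lt;/p&gt;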



&lt;h3&gt;
  
  
  Why this broke at scale
&lt;/h3&gt;

&lt;p&gt;300,000 timeline reads/sec means hammering this multi-join query constantly. Even with indexes, this gets expensive fast. That pushed Twitter to move work away from reads.&lt;/p&gt;




&lt;h2&gt;
  
  
  Approach 2: Fan-Out on Write
&lt;/h2&gt;

&lt;p&gt;The key idea: if reads outnumber writes by 65x, pay more at write time so reads are cheap.&lt;/p&gt;

&lt;p&gt;Instead of building timelines on demand, build them when tweets are created. Each user gets a precomputed cached timeline (like a mailbox), ready to read.&lt;/p&gt;

&lt;h3&gt;
  
  
  The mailbox analogy
&lt;/h3&gt;

&lt;p&gt;This is what happens when alice tweets:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The tweet is saved to the global &lt;code&gt;tweets&lt;/code&gt; table&lt;/li&gt;
&lt;li&gt;A background worker queries all of alice's followers&lt;/li&gt;
&lt;li&gt;For each follower, it &lt;strong&gt;prepends&lt;/strong&gt; the new tweet to their cached timeline&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When I open the app, the timeline is basically a cache fetch. No heavy joins in the hot read path.&lt;/p&gt;
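
&lt;p&gt;The three steps can be sketched in Python with one in-memory deque per user standing in for the cached timeline. This is an illustration under assumptions, not Twitter's implementation: the names &lt;code&gt;post_tweet&lt;/code&gt; and &lt;code&gt;home_timeline&lt;/code&gt; and the &lt;code&gt;TIMELINE_LIMIT&lt;/code&gt; value are invented for the example.&lt;/p&gt;

```python
from collections import defaultdict, deque

FOLLOWS = [(100, 5), (100, 12), (101, 12)]  # (follower_id, followee_id)
TIMELINE_LIMIT = 800  # assumed cap on cached entries, for illustration only

tweets = {}  # stands in for the global tweets table
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))  # one mailbox per user


def post_tweet(tweet_id, sender_id, text):
    """Save the tweet, then prepend its id to every follower's cached timeline."""
    tweets[tweet_id] = {"sender_id": sender_id, "text": text}            # step 1
    followers = [f for f, followee in FOLLOWS if followee == sender_id]  # step 2
    for follower in followers:                                           # step 3
        timelines[follower].appendleft(tweet_id)
    return len(followers)  # number of cache writes this tweet caused


def home_timeline(user_id):
    """Reading is a plain cache fetch: no joins."""
    return [tweets[tweet_id]["text"] for tweet_id in timelines[user_id]]


post_tweet(1, 12, "Excited to announce")
post_tweet(2, 5, "Grateful. Humbled. Dehydrated.")
print(home_timeline(100))
```

&lt;p&gt;The return value of &lt;code&gt;post_tweet&lt;/code&gt; makes the cost visible: one tweet turns into as many cache writes as the sender has followers.&lt;/p&gt;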

&lt;h3&gt;
  
  
  The math
&lt;/h3&gt;

&lt;p&gt;On average, a tweet is delivered to about &lt;strong&gt;75 followers&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;4,600 tweets/sec × 75 followers = 345,000 cache writes/sec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A cache write is a &lt;strong&gt;list prepend&lt;/strong&gt; in Redis: an in-memory operation that typically completes in microseconds, with no disk I/O. The JOIN query has to scan indexed B-trees, merge results, and sort them, which typically takes milliseconds and can be disk-bound.&lt;/p&gt;
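
&lt;p&gt;The arithmetic from this post, written out so the numbers can be checked. The peak line is my own extrapolation and assumes the same average fan-out of 75 holds at peak load, which the published numbers do not state.&lt;/p&gt;

```python
avg_tweets_per_sec = 4_600
peak_tweets_per_sec = 12_000
timeline_reads_per_sec = 300_000
avg_fanout = 75  # average number of followers a tweet is delivered to

read_write_ratio = timeline_reads_per_sec / avg_tweets_per_sec
avg_cache_writes = avg_tweets_per_sec * avg_fanout
peak_cache_writes = peak_tweets_per_sec * avg_fanout  # assumption: same fan-out at peak

print(f"read/write ratio: about {read_write_ratio:.0f}x")
print(f"average cache writes/sec: {avg_cache_writes:,}")
print(f"peak cache writes/sec (assumed fan-out): {peak_cache_writes:,}")
```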




&lt;h2&gt;
  
  
  The Full Data Pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqy7nh9yww6zkpzeb2op.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqy7nh9yww6zkpzeb2op.png" alt="data-pl"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each follower has their own dedicated timeline cache (e.g., User 1: &lt;code&gt;T7→T5→T3→T1&lt;/code&gt;, User 2: &lt;code&gt;T8→T6→T5&lt;/code&gt;). The tweet IDs differ per user because each user follows a different set of people. When you request your timeline, it's served directly from your pre-built cache. Reads are fast because all the work happened at write time.&lt;/p&gt;

&lt;p&gt;The tradeoff: one new tweet triggers N cache updates where N is the number of followers.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Celebrity Problem and the Hybrid Solution
&lt;/h2&gt;

&lt;p&gt;Fan-out on write has a ceiling. Celebrity accounts have tens of millions of followers. If one celebrity tweet triggers 30-80 million fan-out writes, queues back up and latency spikes.&lt;/p&gt;

&lt;p&gt;So the production answer is a &lt;strong&gt;hybrid&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Normal users&lt;/strong&gt;: fan-out on write (Approach 2)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Celebrities&lt;/strong&gt;: no fan-out. Their tweets stay in the global tweets table.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you request your home timeline, the system merges:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your pre-built cache (tweets from normal users you follow)&lt;/li&gt;
&lt;li&gt;A small real-time query for tweets from celebrities you follow (Approach 1)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This stays manageable because most users follow only a small number of celebrity accounts.&lt;/p&gt;
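
&lt;p&gt;A sketch of that read-time merge, assuming both sources are already sorted newest first. The celebrity names, tweet ids, and the &lt;code&gt;hybrid_timeline&lt;/code&gt; function are invented for the example; only the idea of merging a pre-built cache with a small on-demand lookup comes from the text above.&lt;/p&gt;

```python
import heapq
from itertools import islice

# Cached timeline built by fan-out on write: (timestamp, tweet_id), newest first.
cached_timeline = [(1007, "T7"), (1005, "T5"), (1003, "T3"), (1001, "T1")]

# Recent tweets per celebrity, newest first, fetched at read time.
# The account names and tweets are made up for the example.
celebrity_tweets = {
    "celeb_a": [(1006, "C2"), (1002, "C1")],
    "celeb_b": [(1004, "C3")],
}


def hybrid_timeline(cached, followed_celebrities, limit=20):
    """Merge the pre-built cache with read-time celebrity tweets, newest first."""
    sources = [cached] + [celebrity_tweets[name] for name in followed_celebrities]
    merged = heapq.merge(*sources, reverse=True)  # inputs are already sorted descending
    return [tweet_id for _, tweet_id in islice(merged, limit)]


print(hybrid_timeline(cached_timeline, ["celeb_a", "celeb_b"]))
```

&lt;p&gt;Because each input is already sorted, the merge only does work proportional to the number of entries it returns, so the read path stays cheap as long as the celebrity list is short.&lt;/p&gt;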




&lt;h2&gt;
  
  
  The Core Principle
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Do more work at write time so the common path is trivially cheap.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Twitter optimized for reads because reads dominated writes.&lt;/p&gt;

&lt;p&gt;When I design systems now, I start with one question: &lt;em&gt;what is my read/write ratio, and where should I pay cost?&lt;/em&gt; That one answer often determines the rest of the architecture.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>distributedsystems</category>
      <category>performance</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Git Under the Hood: What Actually Happens When You Commit</title>
      <dc:creator>ufraaan</dc:creator>
      <pubDate>Fri, 01 May 2026 12:51:31 +0000</pubDate>
      <link>https://forem.com/ufraan/git-under-the-hood-what-actually-happens-when-you-commit-2m3c</link>
      <guid>https://forem.com/ufraan/git-under-the-hood-what-actually-happens-when-you-commit-2m3c</guid>
      <description>&lt;p&gt;I used to just memorize git commands without understanding what was going on behind the scenes. Add, commit, push, and hope it works. Then one day I actually opened the &lt;code&gt;.git&lt;/code&gt; folder and everything clicked.&lt;/p&gt;

&lt;p&gt;This post covers the basics of how Git works internally, how to configure it properly, and how branches and merging actually function. If you are tired of blindly typing commands and want to understand what Git is actually doing, this is for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up Git Properly
&lt;/h2&gt;

&lt;p&gt;When you first install Git, it needs to know who you are. Every commit gets tagged with a name and email, so Git stores these in a file called &lt;code&gt;.gitconfig&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You can set these globally, which means every repository on your machine uses the same identity. Or you can set them locally per project. Most people go global for their name and email since those do not change.&lt;/p&gt;

&lt;p&gt;You can also change your default editor. Git loves opening Vim for commit messages, which is fine if you know Vim, but most beginners would rather just use VS Code. Run this to fix that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; core.editor &lt;span class="s2"&gt;"code --wait"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--wait&lt;/code&gt; flag is important. It tells Git to pause and wait until you close the editor window before it continues.&lt;/p&gt;

&lt;p&gt;Then set your name and email:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; user.name &lt;span class="s2"&gt;"ufraan"&lt;/span&gt;
git config &lt;span class="nt"&gt;--global&lt;/span&gt; user.email &lt;span class="s2"&gt;"ufraan1@gmail.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can check what you set by running the same commands without the values, or list everything with &lt;code&gt;git config --global --list&lt;/code&gt;. These settings live in your &lt;code&gt;.gitconfig&lt;/code&gt; file. On Linux or macOS it is at &lt;code&gt;~/.gitconfig&lt;/code&gt;. On Windows it is at &lt;code&gt;C:\Users\&amp;lt;YourUsername&amp;gt;\.gitconfig&lt;/code&gt;. Open it up and you will see your settings in plain text. SSH keys for talking to GitHub or GitLab are not stored there; they normally live in &lt;code&gt;~/.ssh&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The .gitignore File
&lt;/h2&gt;

&lt;p&gt;Git ignores nothing by default. That means build artifacts, environment files with your API keys, node_modules, and random OS metadata all show up as untracked files and get staged the moment you run &lt;code&gt;git add .&lt;/code&gt;. You do not want any of that in your repository.&lt;/p&gt;

&lt;p&gt;Create a &lt;code&gt;.gitignore&lt;/code&gt; file in your project root to tell Git what to ignore:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;node_modules/
*.log
.DS_Store
*.pyc
.env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Quick breakdown of what these cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;node_modules/&lt;/code&gt; gets massive and you can always reinstall it with &lt;code&gt;npm install&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;*.log&lt;/code&gt; files are just program output&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.DS_Store&lt;/code&gt; is macOS metadata that has nothing to do with your code&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;*.pyc&lt;/code&gt; are compiled Python files generated automatically&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.env&lt;/code&gt; usually has secrets and should never be committed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are starting a new project and do not know what to put in there, search for "gitignore generator" online. Pick your tech stack and it will give you a ready-made template.&lt;/p&gt;

&lt;p&gt;Once your &lt;code&gt;.gitignore&lt;/code&gt; is set up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add &lt;span class="nb"&gt;.&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Initial commit"&lt;/span&gt;
git log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;git add .&lt;/code&gt; command automatically respects your &lt;code&gt;.gitignore&lt;/code&gt; file, so it will not stage anything you told it to ignore.&lt;/p&gt;
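
&lt;p&gt;To get a feel for what those five patterns match, here is a rough approximation in Python. It is deliberately simplified and is not how Git implements ignore rules: it only handles plain name patterns and trailing-slash directory patterns, and the &lt;code&gt;is_ignored&lt;/code&gt; function is a made-up helper.&lt;/p&gt;

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

PATTERNS = ["node_modules/", "*.log", ".DS_Store", "*.pyc", ".env"]


def is_ignored(path, patterns=PATTERNS):
    """Rough approximation of .gitignore matching for the simple patterns above."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything inside a directory with that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif any(fnmatch(part, pattern) for part in parts):
            # Pattern without a slash: matches a file or directory name at any depth.
            return True
    return False


for path in ["src/app.py", "debug.log", "node_modules/react/index.js", ".env"]:
    print(path, is_ignored(path))
```

&lt;p&gt;For real projects, &lt;code&gt;git check-ignore -v some/path&lt;/code&gt; tells you which rule, if any, causes Git to ignore a path.&lt;/p&gt;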

&lt;h2&gt;
  
  
  What Happens Behind the Scenes
&lt;/h2&gt;

&lt;p&gt;Run &lt;code&gt;git log --oneline&lt;/code&gt; and you will see short commit hashes and your messages. You might also see something like &lt;code&gt;HEAD -&amp;gt; master&lt;/code&gt;. But what is actually happening?&lt;/p&gt;

&lt;p&gt;Every commit stores a reference to its parent, the commit that came before it. The only exceptions are the very first commit, which has no parent, and merge commits, which have more than one. Each commit also gets a hash computed from its contents, which acts like a fingerprint for that snapshot. This creates a chain of commits going all the way back to the beginning.&lt;/p&gt;
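
&lt;p&gt;Where does the hash come from? With Git's default SHA-1 object format, an object id is the SHA-1 of a short header (type and size) followed by the object's content. The sketch below reproduces that for a blob. The commit body after it is simplified and uses placeholder ids: real commits also carry author and committer lines.&lt;/p&gt;

```python
import hashlib


def git_object_id(object_type, body):
    """Hash an object the way Git's default SHA-1 format does: header, NUL byte, body."""
    header = f"{object_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


# Same result as: echo 'hello world' | git hash-object --stdin
print(git_object_id("blob", b"hello world\n"))

# A commit body names its tree and its parent, so a different parent gives a different hash.
# The tree and parent ids below are placeholders, not real objects.
commit_body = "\n".join([
    "tree " + "0" * 40,
    "parent " + "1" * 40,
    "",
    "update code for index file",
    "",
]).encode()
print(git_object_id("commit", commit_body))
```

&lt;p&gt;Because the parent's id is part of the hashed content, changing any earlier commit changes the id of every commit after it. That is what makes the chain tamper-evident.&lt;/p&gt;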

&lt;p&gt;You can see this for yourself by peeking into the &lt;code&gt;.git&lt;/code&gt; folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; .git
&lt;span class="nb"&gt;ls&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will find a few important things inside:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;HEAD&lt;/code&gt; points to the branch you are currently on. Open it and you will see something like &lt;code&gt;ref: refs/heads/master&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;objects/&lt;/code&gt; stores all your actual data, including commits and file contents&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;refs/&lt;/code&gt; stores branch pointers&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;logs/&lt;/code&gt; keeps a history of changes to HEAD and other references&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So when you run &lt;code&gt;git commit&lt;/code&gt;, Git creates a new commit object in the objects directory, updates the branch pointer in refs, and HEAD follows the current branch. That is the whole process.&lt;/p&gt;
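
&lt;p&gt;That sequence can be modeled in a few lines of Python. This is a toy model, not Git's code: dictionaries stand in for the &lt;code&gt;objects&lt;/code&gt; and &lt;code&gt;refs&lt;/code&gt; directories, the hash is simplified (no tree, author, or header), and the &lt;code&gt;commit&lt;/code&gt; function is invented for the example.&lt;/p&gt;

```python
import hashlib

objects = {}                        # stands in for .git/objects
refs = {"refs/heads/master": None}  # stands in for .git/refs
HEAD = "ref: refs/heads/master"     # contents of .git/HEAD


def commit(message):
    """Create a commit object, then move the current branch pointer to it."""
    branch = HEAD.split(" ", 1)[1]              # resolve HEAD to a branch ref
    parent = refs[branch]                       # None for the very first commit
    body = f"parent {parent}\n\n{message}\n".encode()
    commit_id = hashlib.sha1(body).hexdigest()  # simplified: no tree or author
    objects[commit_id] = {"parent": parent, "message": message}
    refs[branch] = commit_id                    # the branch moves; HEAD still names the branch
    return commit_id


first = commit("add index file")
second = commit("update code for index file")
print(HEAD, refs["refs/heads/master"] == second, objects[second]["parent"] == first)
```

&lt;p&gt;Note that &lt;code&gt;HEAD&lt;/code&gt; never changes here. It keeps naming the branch, and the branch is what advances.&lt;/p&gt;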

&lt;h2&gt;
  
  
  Understanding Git Branches
&lt;/h2&gt;

&lt;p&gt;Branches are basically parallel timelines for your project. You can work on a feature in isolation without touching the main codebase. Git creates a default branch called &lt;code&gt;master&lt;/code&gt; (or &lt;code&gt;main&lt;/code&gt; on newer setups), which is usually where the stable version lives.&lt;/p&gt;

&lt;p&gt;Let us walk through how this actually works. Create a new folder and initialize a repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;gittwo
&lt;span class="nb"&gt;cd &lt;/span&gt;gittwo
git init
git status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After &lt;code&gt;git init&lt;/code&gt;, &lt;code&gt;git status&lt;/code&gt; reports "On branch master". Create a file and track it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create an index.html file with some basic HTML&lt;/span&gt;
git status
git add index.html
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"add index file"&lt;/span&gt;
git branch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output shows &lt;code&gt;* master&lt;/code&gt;. The asterisk means that is where HEAD is pointing.&lt;/p&gt;

&lt;p&gt;Now make some changes and commit again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add index.html
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"update code for index file"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let us say you need to work on a navigation bar without affecting master. Create a branch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git branch nav-bar
git branch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will see &lt;code&gt;nav-bar&lt;/code&gt; listed below &lt;code&gt;master&lt;/code&gt;, and the asterisk is still on &lt;code&gt;master&lt;/code&gt;. You created the branch but you are not on it yet. Switch over:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout nav-bar
&lt;span class="c"&gt;# or the newer command:&lt;/span&gt;
git switch nav-bar

git branch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the asterisk moved to &lt;code&gt;nav-bar&lt;/code&gt;. Create a file on this branch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create nav-bar.html&lt;/span&gt;
git add nav-bar.html
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"add navbar to codebase"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now switch back to master:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout master
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that &lt;code&gt;nav-bar.html&lt;/code&gt; disappeared from your editor. It is not deleted. It is stored in the commit that the &lt;code&gt;nav-bar&lt;/code&gt; branch points to. This is the key thing about branches. There is only one working directory, and when you switch branches Git rewrites it to match the snapshot that branch points to. Switch back to &lt;code&gt;nav-bar&lt;/code&gt; and the file comes right back.&lt;/p&gt;
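
&lt;p&gt;A tiny model of what just happened, with made-up commit ids. A branch is only a name for one commit, and checking it out means laying that commit's snapshot into the working directory. The &lt;code&gt;checkout&lt;/code&gt; function here is a stand-in for illustration.&lt;/p&gt;

```python
# Each commit is a snapshot of the tracked files. The ids are made up.
snapshots = {
    "c2": {"index.html"},
    "c3": {"index.html", "nav-bar.html"},
}
branches = {"master": "c2", "nav-bar": "c3"}  # a branch is a name for one commit


def checkout(branch):
    """Return the files Git would place in the working directory for a branch."""
    return sorted(snapshots[branches[branch]])


print(checkout("master"))   # ['index.html']
print(checkout("nav-bar"))  # ['index.html', 'nav-bar.html']
```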

&lt;p&gt;You can always check where HEAD points:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git branch
git log &lt;span class="nt"&gt;--oneline&lt;/span&gt;
&lt;span class="c"&gt;# Shows HEAD -&amp;gt; master or HEAD -&amp;gt; nav-bar&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few useful branch commands to keep around:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git branch                  &lt;span class="c"&gt;# list all branches&lt;/span&gt;
git branch bugfix           &lt;span class="c"&gt;# create a new branch without switching&lt;/span&gt;
git switch bugfix           &lt;span class="c"&gt;# switch to a branch&lt;/span&gt;
git switch &lt;span class="nt"&gt;-c&lt;/span&gt; dark-mode     &lt;span class="c"&gt;# create and switch in one step&lt;/span&gt;
git checkout &lt;span class="nt"&gt;-b&lt;/span&gt; pink-mode   &lt;span class="c"&gt;# same thing, older syntax&lt;/span&gt;
git branch &lt;span class="nt"&gt;-d&lt;/span&gt; branch-name   &lt;span class="c"&gt;# delete a fully merged branch&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One thing to remember: commit your changes (or set them aside with &lt;code&gt;git stash&lt;/code&gt;) before switching branches. If you leave work uncommitted and switch, the changes either follow you to the new branch or Git refuses to switch because they would be overwritten. Just commit first.&lt;/p&gt;

&lt;h2&gt;
  
  
  Merging Branches
&lt;/h2&gt;

&lt;p&gt;Once you are done on a branch, you need to bring those changes back into master. That is what merging does.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fast-Forward Merge
&lt;/h3&gt;

&lt;p&gt;This is the simple case. If master has not changed since you created your branch, Git just moves the master pointer forward to your latest commit. No extra merge commit needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout master
git merge nav-bar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Non-Fast-Forward Merge
&lt;/h3&gt;

&lt;p&gt;If both master and your branch have new commits since you split off, Git cannot just fast-forward. It has to create a merge commit that combines both histories. You can also force this behavior even when a fast-forward is possible:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout master
git merge &lt;span class="nt"&gt;--no-ff&lt;/span&gt; nav-bar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--no-ff&lt;/code&gt; flag creates an explicit merge commit. This is useful if you want to keep a clear record of when a feature branch was integrated, even if a fast-forward was possible.&lt;/p&gt;
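
&lt;p&gt;The decision between the two kinds of merge comes down to one ancestry check. The sketch below models it on a toy commit graph with made-up ids. It ignores file contents and conflicts entirely, and the &lt;code&gt;merge&lt;/code&gt; function only mimics the pointer bookkeeping, not Git's actual implementation.&lt;/p&gt;

```python
# Commit graph mapping each commit to its list of parents. The ids are made up.
parents = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],  # commit made on nav-bar
}
branches = {"master": "c2", "nav-bar": "c3"}


def is_ancestor(ancestor, commit_id):
    """True if ancestor is reachable from commit_id by following parent links."""
    stack, seen = [commit_id], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return False


def merge(target, source, no_ff=False):
    """Fast-forward when possible, otherwise record a two-parent merge commit."""
    ours, theirs = branches[target], branches[source]
    if is_ancestor(theirs, ours):
        return "already up to date"
    if is_ancestor(ours, theirs) and not no_ff:
        branches[target] = theirs       # just move the pointer
        return "fast-forward"
    merge_id = f"merge-{ours}-{theirs}"
    parents[merge_id] = [ours, theirs]  # a merge commit has two parents
    branches[target] = merge_id
    return "merge commit"


print(merge("master", "nav-bar"))  # fast-forward
```

&lt;p&gt;Passing &lt;code&gt;no_ff=True&lt;/code&gt; skips the pointer-only path, which is the same trade-off &lt;code&gt;--no-ff&lt;/code&gt; makes: one extra commit in exchange for a visible record of the merge.&lt;/p&gt;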

&lt;p&gt;Let us try merging in practice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout master
git merge nav-bar &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"merge navbar"&lt;/span&gt;
git log &lt;span class="nt"&gt;--oneline&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The nav-bar commits are now part of master. You can see &lt;code&gt;nav-bar.html&lt;/code&gt; in your editor alongside the other files. Once merged, you can clean up the branch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git branch &lt;span class="nt"&gt;-d&lt;/span&gt; nav-bar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Do it again with another branch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout &lt;span class="nt"&gt;-b&lt;/span&gt; footer

&lt;span class="c"&gt;# Create footer.html&lt;/span&gt;
git add footer.html
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"add footer section to codebase"&lt;/span&gt;

git checkout master
git merge footer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now &lt;code&gt;footer.html&lt;/code&gt; appears in master as well.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resolving Merge Conflicts
&lt;/h2&gt;

&lt;p&gt;Conflicts happen when two branches modify the same part of the same file. Git does not know which version to keep, so it stops and asks you to decide.&lt;/p&gt;
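
&lt;p&gt;The rule Git applies can be sketched as a three-way comparison against the common ancestor (the "base"). The version below works on a single line for clarity. Real Git compares whole regions of a file, so treat &lt;code&gt;merge_line&lt;/code&gt; as an illustration of the rule rather than the actual algorithm.&lt;/p&gt;

```python
def merge_line(base, ours, theirs):
    """Three-way merge of one line: returns (result, conflict_flag)."""
    if ours == theirs:
        return ours, False    # both sides agree
    if ours == base:
        return theirs, False  # only their side changed
    if theirs == base:
        return ours, False    # only our side changed
    return None, True         # both changed it differently: a human must decide


# Both branches added different text in the same place: conflict.
print(merge_line("", "footer added", "footer was added successfully"))

# Only one branch touched the line: merged automatically.
print(merge_line("hello", "hello", "hello, world"))
```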

&lt;p&gt;Here is how to trigger and fix one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On master, modify index.html&lt;/span&gt;
&lt;span class="c"&gt;# Add "footer added" inside the body tag&lt;/span&gt;
git add index.html
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"add footer in index file"&lt;/span&gt;

&lt;span class="c"&gt;# Switch to footer branch&lt;/span&gt;
git checkout footer

&lt;span class="c"&gt;# Modify the same file differently&lt;/span&gt;
&lt;span class="c"&gt;# Add "footer was added successfully" inside the body tag&lt;/span&gt;
git add index.html
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"update index file with footer code"&lt;/span&gt;

&lt;span class="c"&gt;# Switch back to master and try to merge&lt;/span&gt;
git checkout master
git merge footer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Git will stop and tell you there is a conflict. Open &lt;code&gt;index.html&lt;/code&gt; and you will see something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="na"&gt;HEAD&lt;/span&gt;
&lt;span class="na"&gt;footer&lt;/span&gt; &lt;span class="na"&gt;added
=&lt;/span&gt;&lt;span class="s"&gt;======&lt;/span&gt;
&lt;span class="na"&gt;footer&lt;/span&gt; &lt;span class="na"&gt;was&lt;/span&gt; &lt;span class="na"&gt;added&lt;/span&gt; &lt;span class="na"&gt;successfully&lt;/span&gt;
&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt; footer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The section between &lt;code&gt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt; HEAD&lt;/code&gt; and &lt;code&gt;=======&lt;/code&gt; is your current branch. The section between &lt;code&gt;=======&lt;/code&gt; and &lt;code&gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt; footer&lt;/code&gt; is the incoming branch. You need to pick one, combine them, or write something entirely different. Then delete the conflict markers completely.&lt;/p&gt;

&lt;p&gt;Once the file looks right:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add index.html
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"resolve conflict in index file"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is it. The conflict is resolved and both branches are merged into master.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Git Aliases
&lt;/h2&gt;

&lt;p&gt;I use a few aliases to save myself from typing the same commands over and over. Here is what I have in my &lt;code&gt;.zshrc&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;gs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'git status'&lt;/span&gt;
&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;ga&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'git add'&lt;/span&gt;
&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;gcm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'git commit -m'&lt;/span&gt;
&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;gp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'git push'&lt;/span&gt;
&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;gpl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'git pull'&lt;/span&gt;
&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;gc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'git clone'&lt;/span&gt;
&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;gb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'git branch'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;They are nothing fancy, but &lt;code&gt;gcm&lt;/code&gt; and &lt;code&gt;gs&lt;/code&gt; alone save a ton of keystrokes over the course of a day.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;If you want to go deeper into how Git actually works, the Pro Git book is the best resource out there. It is free to read online at &lt;a href="https://git-scm.com/book/en/v2" rel="noopener noreferrer"&gt;https://git-scm.com/book/en/v2&lt;/a&gt; and covers everything from basics to advanced internals. I highly recommend it if you really want to master Git.&lt;/p&gt;

&lt;p&gt;Git stops being scary once you understand what is happening under the hood. Commits are just snapshots linked in a chain. Branches are pointers to those snapshots. Merging either moves a pointer forward or adds one commit with two parents. The &lt;code&gt;.git&lt;/code&gt; folder is not some black box. It is just files and references that you can look at whenever you want.&lt;/p&gt;

&lt;p&gt;Adios! :)&lt;/p&gt;

</description>
      <category>git</category>
    </item>
  </channel>
</rss>
