<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Shadi AJAM</title>
    <description>The latest articles on Forem by Shadi AJAM (@shadiajam).</description>
    <link>https://forem.com/shadiajam</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2157687%2F95872e44-3e83-4e9f-ad5f-35a9a4251dc2.jpeg</url>
      <title>Forem: Shadi AJAM</title>
      <link>https://forem.com/shadiajam</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/shadiajam"/>
    <language>en</language>
    <item>
      <title>Should We Rethink About IDs? A Deep Dive into "Snowflake IDs"</title>
      <dc:creator>Shadi AJAM</dc:creator>
      <pubDate>Sun, 03 Nov 2024 15:24:09 +0000</pubDate>
      <link>https://forem.com/shadiajam/should-we-rethink-about-ids-a-deep-dive-into-snowflake-ids-3j54</link>
      <guid>https://forem.com/shadiajam/should-we-rethink-about-ids-a-deep-dive-into-snowflake-ids-3j54</guid>
      <description>&lt;h2&gt;
  
  
  Let’s Start from the Beginning: Why Do We Even Use IDs?
&lt;/h2&gt;

&lt;p&gt;The short answer is "Labeling".&lt;/p&gt;

&lt;p&gt;Since ancient times, we’ve had to label everything: animals, crops, livestock, geographic regions, even military units. As civilization grew, we began recording information on paper, collecting and storing data. Over time, this amassed into our first version of "Big Data." To manage it, we invented methods to sort and index all that paper, driven by one goal: finding information faster. This need for "organization" led us to create paper files, folders, shelves, and storage containers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwi7elmpqgxa4wk9tmzx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwi7elmpqgxa4wk9tmzx.png" alt="First kind of archiving" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As we entered the computer era, we started using computers to store our files, shifting our basic labeling system into the realm of databases. In a database, each row (or file) gets a unique ID, usually an auto-incremented number that counts up from a starting value. This makes it easy to organize and find information quickly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8rv58kaez4lu1fjrygz2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8rv58kaez4lu1fjrygz2.png" alt="Computers era for archives" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Over time, as we began using distributed servers and databases, messaging and communication between devices became even more critical. Each record or message had to be unique, requiring a way to label it individually—without any duplicates—across the entire system, regardless of the device.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmckkys381ca7la7wkru.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmckkys381ca7la7wkru.png" alt="Modern datacenters" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  End of Story, Let's Break Down "Snowflake IDs"!!
&lt;/h2&gt;

&lt;p&gt;Snowflake IDs are "unique" identifiers created to solve the issue of ID duplication across distributed systems.&lt;/p&gt;

&lt;p&gt;Originally created by X (formerly Twitter) for tweet IDs, the format, or a variation of it, is also used by other major tech companies such as Instagram, Uber, GitHub and LinkedIn.&lt;/p&gt;

&lt;h3&gt;
  
  
  Snowflake ID Structure:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;A Snowflake ID is a fixed-length 64-bit (8-byte) integer, of which 63 bits are usable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A Snowflake ID is compact and efficient to store in a database as a single 64-bit integer. This small footprint suits high-performance systems: it minimizes storage space while keeping identifiers unique and time-ordered.&lt;/p&gt;

&lt;h3&gt;
  
  
  Structure Breakdown:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffxbrq1sebjnn57sqqp1m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffxbrq1sebjnn57sqqp1m.png" alt="Snowflake ID Structure" width="451" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Snowflake ID Structure
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Unused sign bit (1 bit):&lt;/strong&gt; Always 0, so the ID stays positive when stored as a signed 64-bit integer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timestamp (41 bits):&lt;/strong&gt; The time in milliseconds since a custom epoch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Center/Machine ID (10 bits):&lt;/strong&gt; Identifies the machine/device that generated the ID; up to 1024 distinct values.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sequence Number (12 bits):&lt;/strong&gt; A counter for IDs generated within the same millisecond on the same machine; up to 4096 distinct values.&lt;/li&gt;
&lt;/ul&gt;
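
&lt;p&gt;To make the layout concrete, here is a minimal Python sketch that packs the three fields into one integer and splits them back out. The names &lt;code&gt;pack&lt;/code&gt; and &lt;code&gt;unpack&lt;/code&gt; are illustrative assumptions, not part of any official library. The sketch multiplies and divides by powers of two, which for non-negative integers is equivalent to bit shifting and masking.&lt;/p&gt;

```python
# Bit widths taken from the list above; all names here are illustrative.
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12


def pack(timestamp_ms, machine_id, sequence):
    """Combine the three fields into one integer; the sign bit stays 0."""
    if timestamp_ms not in range(2 ** TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if machine_id not in range(2 ** MACHINE_BITS):
        raise ValueError("machine_id does not fit in 10 bits")
    if sequence not in range(2 ** SEQUENCE_BITS):
        raise ValueError("sequence does not fit in 12 bits")
    # Multiplying by 2**n is the same as shifting left by n bits.
    return (timestamp_ms * 2 ** MACHINE_BITS + machine_id) * 2 ** SEQUENCE_BITS + sequence


def unpack(snowflake):
    """Split an ID back into (timestamp_ms, machine_id, sequence)."""
    sequence = snowflake % 2 ** SEQUENCE_BITS
    machine_id = (snowflake // 2 ** SEQUENCE_BITS) % 2 ** MACHINE_BITS
    timestamp_ms = snowflake // 2 ** (MACHINE_BITS + SEQUENCE_BITS)
    return timestamp_ms, machine_id, sequence


example = pack(timestamp_ms=123456789, machine_id=5, sequence=42)
print(example, unpack(example))
```

&lt;p&gt;Here &lt;code&gt;timestamp_ms&lt;/code&gt; is already relative to the custom epoch; a real generator would subtract its epoch from the current time first.&lt;/p&gt;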

&lt;h3&gt;
  
  
  Real-world examples:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;LinkedIn uses Snowflake IDs in its article editor.
Let's take my article as an example.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ex4iz166k5xyd66wqls.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ex4iz166k5xyd66wqls.png" alt="Linkedin uses Snowflake IDs on article editor" width="601" height="586"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Snowflake ID: 7256902784527069184&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4og8dj16ehcksotum1fz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4og8dj16ehcksotum1fz.png" alt="Linkedin Snowflake IDs Breakdown" width="564" height="476"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The table above breaks down the Snowflake ID, showing how LinkedIn structures its identifiers. The timestamp aligns exactly with the date and time I started writing this article: "October 29 at 5:40 AM".&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;X (Twitter) uses Snowflake IDs as post IDs.
Let's take this post by Elon Musk as an example: &lt;a href="https://x.com/elonmusk/status/1851515326581916096" rel="noopener noreferrer"&gt;https://x.com/elonmusk/status/1851515326581916096&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwipgujwrzfos8djj3jo1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwipgujwrzfos8djj3jo1.png" alt="X uses Snowflake IDs as post ID" width="731" height="606"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Snowflake ID: 1851515326581916096&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomagcsbumsctdbw2wto1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomagcsbumsctdbw2wto1.png" alt="Twitter X Snowflake ID Breakdown" width="564" height="476"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this approach, X (Twitter) uses a custom epoch of "1288834974657" milliseconds, which translates to "November 4, 2010, 1:42:54.657 AM." Adding the timestamp extracted from the Snowflake ID to this epoch gives "October 30, 2024, 6:43:48.005 AM," which is when the tweet was posted.&lt;/p&gt;

&lt;p&gt;The datacenter ID identifies the machine that generated the ID, while the sequence number keeps each ID unique even when several are created on the same machine in the same millisecond.&lt;/p&gt;
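
&lt;p&gt;The timestamp arithmetic from both examples can be reproduced with a few lines of Python. This is a rough sketch: &lt;code&gt;created_at&lt;/code&gt; is a hypothetical helper name, and reading the LinkedIn ID against the plain Unix epoch is an assumption based on the breakdown above rather than on documented LinkedIn behavior.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

X_EPOCH_MS = 1288834974657  # custom epoch quoted above for X (Twitter)
UNIX_EPOCH_MS = 0           # assumption: the LinkedIn ID counts from the Unix epoch


def created_at(snowflake, epoch_ms):
    """Return the UTC time stored in the top 41 bits of a Snowflake ID."""
    # Dropping the low 22 bits (10 machine + 12 sequence) leaves the timestamp.
    timestamp_ms = snowflake // 2 ** 22
    unix_start = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return unix_start + timedelta(milliseconds=timestamp_ms + epoch_ms)


print(created_at(1851515326581916096, X_EPOCH_MS))     # the X post
print(created_at(7256902784527069184, UNIX_EPOCH_MS))  # the LinkedIn article
```

&lt;p&gt;Both results are in UTC and should land on October 30, 2024 at 6:43 AM and October 29, 2024 at 5:40 AM, matching the breakdowns above.&lt;/p&gt;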

&lt;h2&gt;
  
  
  The good, the bad and the ugly!
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F596komrmx9grr0dm21f9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F596komrmx9grr0dm21f9.png" alt="Snowflake IDs: The good, the bad and the ugly" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Good:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Small Footprint: The 64-bit structure of Snowflake IDs makes them compact and efficient for storage.&lt;/li&gt;
&lt;li&gt;Sortable: Snowflake IDs include a timestamp component, ensuring that IDs are roughly ordered by time.&lt;/li&gt;
&lt;li&gt;Usable Components: Because the components already carry meaningful data (creation time, generating machine), any part of the system can read them straight from the ID.&lt;/li&gt;
&lt;li&gt;Customizable: You can change the bit allocation for the data center ID and sequence number as needed; you have 22 bits (10 + 12) to divide however suits your system.&lt;/li&gt;
&lt;/ul&gt;
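
&lt;p&gt;To see what a different bit split buys you, here is a small sketch of the capacity arithmetic. The &lt;code&gt;capacity&lt;/code&gt; helper and the second split are hypothetical, for illustration only.&lt;/p&gt;

```python
def capacity(machine_bits, sequence_bits, timestamp_bits=41):
    """Summarise what a given bit split allows (illustrative helper)."""
    ms_per_year = 1000 * 60 * 60 * 24 * 365.25
    return {
        "machines": 2 ** machine_bits,
        "ids_per_ms_per_machine": 2 ** sequence_bits,
        "years_before_overflow": round(2 ** timestamp_bits / ms_per_year, 1),
    }


print(capacity(10, 12))  # the default split described above
print(capacity(5, 17))   # hypothetical split: fewer machines, more IDs per millisecond
```

&lt;p&gt;With the default split, a 41-bit millisecond timestamp lasts roughly 69.7 years from the chosen epoch, which is one reason to pick a recent custom epoch.&lt;/p&gt;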

&lt;h3&gt;
  
  
  The Bad:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Not Globally Unique: Snowflake IDs are unique within a single distributed system but may not be globally unique across different companies/systems.&lt;/li&gt;
&lt;li&gt;Limited Numbers for Components: The number of bits allocated for data center and machine IDs restricts the number of unique identifiers for components. For example, a 10-bit Data Center/Machine ID can hold only 1024 distinct values.&lt;/li&gt;
&lt;li&gt;Complex Configuration: Properly configuring and managing the allocation of bits and unique identifiers for data centers and machines can become complicated, especially in large distributed systems.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Ugly:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Clock Drift Issues and Dependency on Accurate Timekeeping: The system relies on each machine's clock. Clocks that are out of sync produce IDs that are not in true chronological order, and a clock that jumps backward can make a machine reissue an ID it has already generated.&lt;/li&gt;
&lt;li&gt;Potential for ID Collisions: Without careful management, for example if two generators are accidentally given the same machine ID, Snowflake ID generation can produce duplicate IDs, undermining the reliability of the system.&lt;/li&gt;
&lt;/ul&gt;
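
&lt;p&gt;One common way to guard against these problems is to refuse to issue IDs when the clock moves backward and to wait for the next millisecond when the sequence is exhausted. The class below is a minimal single-threaded sketch of that idea; it is not the original Twitter implementation, and the class name, the injectable &lt;code&gt;clock&lt;/code&gt; parameter and the default epoch are assumptions made for illustration.&lt;/p&gt;

```python
import time


class SnowflakeGenerator:
    """Minimal single-threaded sketch; not the original Twitter implementation."""

    def __init__(self, machine_id, epoch_ms=1288834974657, clock=None):
        if machine_id not in range(1024):
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self.epoch_ms = epoch_ms
        # The clock is injectable so the rollback case can be exercised in tests.
        self.clock = clock or (lambda: time.time_ns() // 1_000_000)
        self.last_ms = -1
        self.sequence = 0

    def next_id(self):
        now_ms = self.clock()
        if self.last_ms > now_ms:
            # Clock moved backward: refusing is safer than risking a duplicate.
            raise RuntimeError("clock moved backward; refusing to generate an ID")
        if now_ms == self.last_ms:
            self.sequence = (self.sequence + 1) % 4096
            if self.sequence == 0:
                # 4096 IDs already issued this millisecond: spin until the next one.
                while self.last_ms >= now_ms:
                    now_ms = self.clock()
        else:
            self.sequence = 0
        self.last_ms = now_ms
        # Same layout as above: 41-bit timestamp, 10-bit machine, 12-bit sequence.
        return ((now_ms - self.epoch_ms) * 1024 + self.machine_id) * 4096 + self.sequence


generator = SnowflakeGenerator(machine_id=1)
print(generator.next_id(), generator.next_id())
```

&lt;p&gt;A production generator would also need a lock around &lt;code&gt;next_id&lt;/code&gt; for multi-threaded use, plus some way to guarantee that no two processes share a machine ID.&lt;/p&gt;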

&lt;h2&gt;
  
  
  Snowflake IDs vs. GUIDs: A Potential Replacement?
&lt;/h2&gt;

&lt;p&gt;Ahhh no, definitely not. Comparing "GUIDs" and "Snowflake IDs" is like comparing the "Sea" and a "River": yes, at the baseline both are "water", but with huge differences.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk5l8ums9ixu2i69ycdcx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk5l8ums9ixu2i69ycdcx.png" alt="GUID vs Snowflake IDs" width="800" height="733"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GUIDs are "GLOBAL IDs": they are great when you need a "label" to be unique across the globe.&lt;/p&gt;

&lt;p&gt;Snowflake IDs are "SYSTEM IDs": they are great when every resource in your own system needs to recognize that "label".&lt;/p&gt;
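
&lt;p&gt;The size difference is easy to check. The sketch below uses Python's standard &lt;code&gt;uuid&lt;/code&gt; module for a random (version 4) GUID and reuses the X post ID from earlier as the Snowflake ID.&lt;/p&gt;

```python
import uuid

guid = uuid.uuid4()              # random 128-bit GUID/UUID
snowflake = 1851515326581916096  # the X post ID used earlier in this article

print(len(guid.bytes))                      # 16 bytes
print(len(snowflake.to_bytes(8, "big")))    # 8 bytes
print(len(str(guid)), len(str(snowflake)))  # 36 vs 19 characters as text
```

&lt;p&gt;A random GUID needs no coordination between machines, while a Snowflake ID trades that independence for half the storage and rough time ordering.&lt;/p&gt;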

&lt;h2&gt;
  
  
  Still here!? You are really interested!!!
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Here Are Some Snowflake ID References!
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Snowflake IDs Delphi Generator: &lt;a href="https://github.com/shadiajam/SnowFlakeID-Delphi" rel="noopener noreferrer"&gt;https://github.com/shadiajam/SnowFlakeID-Delphi&lt;/a&gt; "This one is mine, consider starring it ⭐"&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Online Snowflake ID Generator: &lt;a href="https://www.onlineappzone.com/snowflake-id-generator" rel="noopener noreferrer"&gt;https://www.onlineappzone.com/snowflake-id-generator&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Wikipedia: &lt;a href="https://en.wikipedia.org/wiki/Snowflake_ID" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Snowflake_ID&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>database</category>
      <category>programming</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
