<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Joud Awad</title>
    <description>The latest articles on Forem by Joud Awad (@thejoud1997).</description>
    <link>https://forem.com/thejoud1997</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1238326%2F5d65a5d6-611d-4526-9bc2-d2d8643d5226.png</url>
      <title>Forem: Joud Awad</title>
      <link>https://forem.com/thejoud1997</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/thejoud1997"/>
    <language>en</language>
    <item>
      <title>18/30 Days System Design Questions!</title>
      <dc:creator>Joud Awad</dc:creator>
      <pubDate>Sat, 23 May 2026 15:45:16 +0000</pubDate>
      <link>https://forem.com/thejoud1997/1830-days-system-design-questions-2i8c</link>
      <guid>https://forem.com/thejoud1997/1830-days-system-design-questions-2i8c</guid>
      <description>&lt;p&gt;Your Redis cache just expired on a key that 8,000 users hit every second.&lt;br&gt;
Every single one of those requests is now flying straight at your database.&lt;/p&gt;

&lt;p&gt;This is the thundering herd. You didn't have a traffic problem — you had a cache problem. Now you have both.&lt;/p&gt;

&lt;p&gt;Here's the setup:&lt;br&gt;
Service → Node.js API, 8,000 req/sec on the /feed endpoint&lt;br&gt;
Cache → Redis, TTL = 60s on the feed key&lt;br&gt;
DB → Postgres, comfortable at ~200 req/sec sustained&lt;br&gt;
What happened → TTL expired at peak traffic, all 8,000 req/sec hit Postgres simultaneously&lt;/p&gt;
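&lt;p&gt;A quick back-of-the-envelope check on those numbers, as a sketch. The constants come straight from the setup above; the variable names are only illustrative.&lt;/p&gt;

```python
# Numbers taken from the setup above; names are illustrative only.
INCOMING_RPS = 8_000      # requests per second on /feed
DB_SUSTAINED_RPS = 200    # what Postgres handles comfortably
TTL_SECONDS = 60          # TTL on the feed key

# How far over budget the database is while the key is cold.
overload_factor = INCOMING_RPS / DB_SUSTAINED_RPS

# Share of requests the cache has to absorb to keep Postgres within budget.
min_cache_hit_ratio = 1 - DB_SUSTAINED_RPS / INCOMING_RPS

# Requests that arrive during one TTL window.
requests_per_ttl_window = INCOMING_RPS * TTL_SECONDS

print(f"Overload on a cold key: {overload_factor:.0f}x")
print(f"Minimum cache hit ratio: {min_cache_hit_ratio:.1%}")
print(f"Requests per TTL window: {requests_per_ttl_window:,}")
```

&lt;p&gt;In other words, a cold key puts roughly 40 times the comfortable load on Postgres, and that risk comes back every 60 seconds.&lt;/p&gt;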

&lt;p&gt;The DB is on its knees. You have minutes before it falls over. And the next TTL expiry is in 60 seconds.&lt;/p&gt;

&lt;p&gt;What do you do?&lt;/p&gt;

&lt;p&gt;A) Mutex lock — only one request queries the DB to rebuild the cache, the rest wait behind it.&lt;br&gt;
B) Probabilistic early expiry — start randomly rebuilding the cache before the TTL actually hits zero.&lt;br&gt;
C) Request coalescing — collapse all in-flight requests for the same key into a single DB query, return the same result to all of them.&lt;br&gt;
D) Cache pre-warming — a background job rebuilds the key on a schedule, TTL never reaches zero in prod.&lt;/p&gt;
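&lt;p&gt;To make option C concrete, here is a minimal in-process sketch of request coalescing, written in Python with asyncio for brevity rather than in the Node.js of the setup. The SingleFlight class and the load_feed_from_db function are hypothetical names, and the sleep stands in for the real Postgres query.&lt;/p&gt;

```python
import asyncio


class SingleFlight:
    """Collapse concurrent loads of the same key into one loader call."""

    def __init__(self):
        self._in_flight = {}  # key -> asyncio.Task for the load in progress

    async def do(self, key, loader):
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the load and registers it.
            task = asyncio.create_task(loader())
            self._in_flight[key] = task
            # Forget the task once it finishes so later misses reload.
            task.add_done_callback(lambda _: self._in_flight.pop(key, None))
        # Every caller, first or not, awaits the same task.
        return await task


db_queries = 0


async def load_feed_from_db():
    # Hypothetical stand-in for the slow Postgres query.
    global db_queries
    db_queries += 1
    await asyncio.sleep(0.05)
    return ["post-1", "post-2"]


async def main():
    flights = SingleFlight()
    # 1,000 concurrent cache misses for the same key.
    return await asyncio.gather(
        *(flights.do("feed", load_feed_from_db) for _ in range(1000))
    )


results = asyncio.run(main())
print(db_queries, len(results))
```

&lt;p&gt;This sketch only coalesces requests inside one process. With several API instances, each instance would still send its own query on a miss, so the database sees one query per instance rather than one in total.&lt;/p&gt;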

&lt;p&gt;All four ship in production systems. Only one of them prevents the thundering herd without introducing a new failure mode under load.&lt;/p&gt;

&lt;p&gt;Pick one — A, B, C, or D — and tell me why. Full breakdown in the comments (including which answer is the senior-engineer trap that works in theory but falls apart when 8,000 requests are piling up).&lt;/p&gt;

&lt;p&gt;If your team has ever had a cache expiry take down a database, share this with them. The debate is worth more than the post.&lt;/p&gt;

&lt;p&gt;Drop your answer 👇&lt;/p&gt;

&lt;p&gt;#30DaysOfSystemDesign #SystemDesign #DistributedSystems #SoftwareArchitecture&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>distributedsystems</category>
      <category>designsystem</category>
    </item>
    <item>
      <title>1/30 Days System Design Question</title>
      <dc:creator>Joud Awad</dc:creator>
      <pubDate>Sat, 23 May 2026 09:13:45 +0000</pubDate>
      <link>https://forem.com/thejoud1997/130-days-system-design-question-5dd8</link>
      <guid>https://forem.com/thejoud1997/130-days-system-design-question-5dd8</guid>
      <description>&lt;p&gt;Your mobile app talks to 3 backend services directly.&lt;/p&gt;

&lt;p&gt;A 4th one ships next sprint. The mobile team is already drowning.&lt;/p&gt;

&lt;p&gt;Every new service means a new domain to whitelist, a new auth scheme to wire, and a new error shape to parse. You’re asked to reduce coupling before NotificationService lands.&lt;/p&gt;

&lt;p&gt;Here’s the setup:&lt;/p&gt;

&lt;p&gt;Mobile → UserService (users.api.com)&lt;/p&gt;

&lt;p&gt;Mobile → OrderService (orders.api.com)&lt;/p&gt;

&lt;p&gt;Mobile → PaymentService (payments.api.com)&lt;/p&gt;

&lt;p&gt;…and NotificationService next sprint.&lt;/p&gt;
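&lt;p&gt;One way to picture the coupling is as a table the mobile app has to carry. The sketch below is illustrative only: the three hosts come from the setup above, while the auth labels, error-shape labels, and the NotificationService host are hypothetical placeholders.&lt;/p&gt;

```python
# Hosts come from the setup above. The auth and error-shape labels are
# hypothetical placeholders: the post does not say what each service uses.
SERVICES = {
    "UserService": {"host": "users.api.com", "auth": "auth-a", "errors": "shape-a"},
    "OrderService": {"host": "orders.api.com", "auth": "auth-b", "errors": "shape-b"},
    "PaymentService": {"host": "payments.api.com", "auth": "auth-c", "errors": "shape-c"},
}


def integration_points(services):
    """Count the distinct hosts, auth schemes and error shapes the app handles."""
    return {
        field: len({entry[field] for entry in services.values()})
        for field in ("host", "auth", "errors")
    }


before = integration_points(SERVICES)

# Hypothetical entry for the service landing next sprint (host name assumed).
SERVICES["NotificationService"] = {
    "host": "notifications.example", "auth": "auth-d", "errors": "shape-d",
}
after = integration_points(SERVICES)

print(before)
print(after)
```

&lt;p&gt;Under these assumptions, every count grows by one with each new service, and all of that growth lands in the mobile client.&lt;/p&gt;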

&lt;p&gt;The client is doing routing the backend should be doing. What do you do?&lt;/p&gt;

&lt;p&gt;A) Add an API Gateway — single entry point, all services hide behind one domain.&lt;/p&gt;

&lt;p&gt;B) Build a BFF (Backend for Frontend) — a dedicated aggregation layer tailored for mobile.&lt;/p&gt;

&lt;p&gt;C) Put a Load Balancer in front of all services — single IP, distributed traffic.&lt;/p&gt;

&lt;p&gt;D) Switch to GraphQL Federation — one unified schema the client queries.&lt;/p&gt;

&lt;p&gt;Three of these are real patterns you’d use in production. Only one of them actually solves the problem in front of you.&lt;/p&gt;

&lt;p&gt;Pick one — A, B, C, or D — and tell me why. I’ll drop the full breakdown in the comments (including why two of the wrong answers are close enough to trick senior engineers).&lt;/p&gt;

&lt;p&gt;If this is the kind of tradeoff question your team argues about, share it with them. The debate is worth more than the post.&lt;/p&gt;

&lt;p&gt;Drop your answer 👇&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>distributedsystems</category>
      <category>microservices</category>
    </item>
  </channel>
</rss>
