<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Isaac Ojerumu</title>
    <description>The latest articles on Forem by Isaac Ojerumu (@ejiro).</description>
    <link>https://forem.com/ejiro</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F951291%2Fc1237531-6eb5-4795-90ad-0dbd315df283.jpg</url>
      <title>Forem: Isaac Ojerumu</title>
      <link>https://forem.com/ejiro</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ejiro"/>
    <language>en</language>
    <item>
      <title>Designing a Reliable Notification System for 1M+ Users (Push, SMS, Email)</title>
      <dc:creator>Isaac Ojerumu</dc:creator>
      <pubDate>Sun, 24 May 2026 00:54:07 +0000</pubDate>
      <link>https://forem.com/ejiro/designing-a-reliable-notification-system-for-1m-users-push-sms-email-2i39</link>
      <guid>https://forem.com/ejiro/designing-a-reliable-notification-system-for-1m-users-push-sms-email-2i39</guid>
      <description>&lt;p&gt;In fintech, notifications are not a “nice-to-have” feature.&lt;/p&gt;

&lt;p&gt;They’re part of the product’s trust layer.&lt;/p&gt;

&lt;p&gt;If a user transfers money and doesn’t get a confirmation, they panic.&lt;br&gt;
If an OTP arrives 3 minutes late, login fails.&lt;br&gt;
If price alerts come twice, users lose confidence fast.&lt;/p&gt;

&lt;p&gt;At small scale, sending notifications feels simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Application → Twilio/SendGrid → Done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But once you’re dealing with millions of users, multiple channels, retries, provider outages, and traffic spikes… notification systems become distributed systems problems.&lt;/p&gt;

&lt;p&gt;And distributed systems are mostly about handling failure gracefully.&lt;/p&gt;

&lt;p&gt;Imagine a fintech platform sending:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;transaction alerts&lt;/li&gt;
&lt;li&gt;OTPs&lt;/li&gt;
&lt;li&gt;portfolio updates&lt;/li&gt;
&lt;li&gt;price alerts&lt;/li&gt;
&lt;li&gt;security notifications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…to over 1 million users across:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Push notifications&lt;/li&gt;
&lt;li&gt;SMS&lt;/li&gt;
&lt;li&gt;Email&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The challenge isn’t just “sending messages.”&lt;/p&gt;

&lt;p&gt;The real challenge is making sure the system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;doesn’t send duplicates&lt;/li&gt;
&lt;li&gt;doesn’t silently lose messages&lt;/li&gt;
&lt;li&gt;survives provider outages&lt;/li&gt;
&lt;li&gt;scales during spikes&lt;/li&gt;
&lt;li&gt;remains observable when things go wrong&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s the architecture I’d use.&lt;/p&gt;




&lt;h1&gt;
  
  
  High-Level Architecture
&lt;/h1&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[App Service]
      │
      ▼
[Notification Queue (Redis Streams / SQS / Kafka)]
      │
      ▼
[Worker Pool]
      │
      ▼
[Provider Router]
      │
 ┌────┼───────────────────────────────┐
 ▼    ▼                               ▼
SMS  Email                           Push
 │      │                              │
 ▼      ▼                              ▼
Twilio SendGrid                      FCM/APNs
 │
 ▼
Fallback Providers
(Termii / Mailgun / Direct APNs)
      │
      ▼
[Delivery Log + Idempotency Store]
(PostgreSQL + Redis)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h1&gt;
  
  
  1. Queue-Based Architecture
&lt;/h1&gt;

&lt;p&gt;One of the biggest mistakes teams make early on is sending notifications directly from the API request cycle.&lt;/p&gt;

&lt;p&gt;That works… until traffic spikes.&lt;/p&gt;

&lt;p&gt;Imagine Black Friday, a crypto market crash, or salary payment day.&lt;br&gt;
Suddenly, millions of notifications need to go out almost at once.&lt;/p&gt;

&lt;p&gt;If your application waits for Twilio or SendGrid to respond before returning a response to the user, your entire app becomes hostage to external providers.&lt;/p&gt;

&lt;p&gt;That’s dangerous.&lt;/p&gt;

&lt;p&gt;Instead, the API should do one thing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Accept the request quickly and push a notification event into a queue.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;From there, worker processes handle delivery asynchronously.&lt;/p&gt;

&lt;p&gt;This changes the system completely.&lt;/p&gt;

&lt;p&gt;Queues give you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;horizontal scalability&lt;/li&gt;
&lt;li&gt;retry capability&lt;/li&gt;
&lt;li&gt;backpressure handling&lt;/li&gt;
&lt;li&gt;failure isolation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If providers slow down, the queue absorbs the spike instead of crashing your application.&lt;/p&gt;

&lt;p&gt;At this scale, queues stop being optional infrastructure.&lt;br&gt;
They become the safety buffer protecting the rest of your system.&lt;/p&gt;

&lt;p&gt;Recommended technologies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Redis Streams&lt;/li&gt;
&lt;li&gt;AWS SQS&lt;/li&gt;
&lt;li&gt;Kafka (especially for very high throughput systems)&lt;/li&gt;
&lt;/ul&gt;
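
&lt;p&gt;Here’s a minimal sketch of the enqueue-then-deliver flow in Python. It assumes an in-process &lt;code&gt;queue.Queue&lt;/code&gt; as a stand-in for Redis Streams, SQS, or Kafka, and the names &lt;code&gt;handle_request&lt;/code&gt; and &lt;code&gt;worker&lt;/code&gt; are hypothetical:&lt;/p&gt;

```python
import queue
import threading

# In-process stand-in for a real broker (Redis Streams, SQS, Kafka).
notification_queue = queue.Queue()
delivered = []


def handle_request(user_id, message):
    """API handler: enqueue the event and return immediately."""
    notification_queue.put({"user_id": user_id, "message": message})
    return {"status": "accepted"}


def worker():
    """Background worker: pull events and deliver them asynchronously."""
    while True:
        event = notification_queue.get()
        if event is None:  # sentinel used to stop the worker
            notification_queue.task_done()
            break
        delivered.append(event)  # a real worker would call the provider here
        notification_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()
response = handle_request(42, "Your transfer was successful")
notification_queue.put(None)
notification_queue.join()  # wait until the worker has drained the queue
```

&lt;p&gt;The request handler never talks to a provider, so a slow provider only makes the queue longer. It does not make the API slower.&lt;/p&gt;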


&lt;h1&gt;
  
  
  2. Reliability &amp;amp; Idempotency
&lt;/h1&gt;

&lt;p&gt;The hardest problem in notification systems usually isn’t failed sends.&lt;/p&gt;

&lt;p&gt;It’s duplicate sends.&lt;/p&gt;

&lt;p&gt;Users are surprisingly tolerant of delayed notifications.&lt;br&gt;
They are &lt;em&gt;not&lt;/em&gt; tolerant of receiving the same debit alert three times.&lt;/p&gt;

&lt;p&gt;Retries are where duplicates usually happen.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Worker sends SMS&lt;/li&gt;
&lt;li&gt;Provider actually succeeds&lt;/li&gt;
&lt;li&gt;Network timeout occurs before acknowledgment returns&lt;/li&gt;
&lt;li&gt;Worker assumes failure&lt;/li&gt;
&lt;li&gt;Worker retries&lt;/li&gt;
&lt;li&gt;User receives duplicate SMS&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To prevent this, every notification should carry an &lt;code&gt;idempotency_key&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Before sending, workers check:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Have we already processed this exact notification?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Example constraint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UNIQUE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;notification_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;idempotency_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is one of those small architectural decisions that saves massive operational pain later.&lt;/p&gt;

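&lt;p&gt;Even if the same event is retried or picked up by several workers, the unique constraint lets only one of them claim it, so the database becomes the final protection layer against duplicates. In the timeout scenario above, the worker should confirm the outcome with the provider before resending, because the first attempt may already have gone through.&lt;/p&gt;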
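
&lt;p&gt;One way to enforce this is to claim the notification with an insert before sending, rather than checking first and inserting later. The sketch below assumes SQLite as a stand-in for PostgreSQL, and the &lt;code&gt;notifications&lt;/code&gt; table and &lt;code&gt;claim&lt;/code&gt; function are hypothetical names:&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE notifications (
        user_id INTEGER NOT NULL,
        notification_type TEXT NOT NULL,
        idempotency_key TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending',
        UNIQUE (user_id, notification_type, idempotency_key)
    )
    """
)


def claim(user_id, notification_type, idempotency_key):
    """Return True only for the first caller that claims this notification."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO notifications "
                "(user_id, notification_type, idempotency_key) VALUES (?, ?, ?)",
                (user_id, notification_type, idempotency_key),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique constraint rejected a duplicate claim.
        return False


first = claim(7, "debit_alert", "txn-1001")
second = claim(7, "debit_alert", "txn-1001")  # simulated retry
```

&lt;p&gt;Only the worker that gets &lt;code&gt;True&lt;/code&gt; back goes on to call the provider. A retry of the same event gets &lt;code&gt;False&lt;/code&gt; and can skip the send.&lt;/p&gt;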

&lt;p&gt;Every delivery attempt should also be logged.&lt;/p&gt;

&lt;p&gt;Not just successes — everything.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;notification_attempts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;provider used&lt;/li&gt;
&lt;li&gt;response status&lt;/li&gt;
&lt;li&gt;retry count&lt;/li&gt;
&lt;li&gt;timestamps&lt;/li&gt;
&lt;li&gt;provider error payloads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because when something goes wrong in production, you want evidence, not guesses.&lt;/p&gt;
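
&lt;p&gt;As a rough sketch, the attempts table could look like this. The column names are assumptions based on the list above, and SQLite stands in for PostgreSQL:&lt;/p&gt;

```python
import json
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE notification_attempts (
        id INTEGER PRIMARY KEY,
        idempotency_key TEXT NOT NULL,
        provider TEXT NOT NULL,
        response_status TEXT NOT NULL,
        retry_count INTEGER NOT NULL,
        attempted_at REAL NOT NULL,
        error_payload TEXT
    )
    """
)


def log_attempt(key, provider, status, retry_count, error=None):
    """Record one delivery attempt, whether it succeeded or failed."""
    payload = json.dumps(error) if error is not None else None
    with conn:
        conn.execute(
            "INSERT INTO notification_attempts "
            "(idempotency_key, provider, response_status, retry_count, "
            "attempted_at, error_payload) VALUES (?, ?, ?, ?, ?, ?)",
            (key, provider, status, retry_count, time.time(), payload),
        )


log_attempt("txn-1001", "twilio", "timeout", 0, {"code": "ETIMEDOUT"})
log_attempt("txn-1001", "termii", "sent", 1)
rows = conn.execute(
    "SELECT provider, response_status, retry_count "
    "FROM notification_attempts ORDER BY id"
).fetchall()
```

&lt;p&gt;Each row is append-only, so the full history of a notification can be replayed during an incident.&lt;/p&gt;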




&lt;h1&gt;
  
  
  3. Provider Routing &amp;amp; Fallbacks
&lt;/h1&gt;

&lt;p&gt;A reality every senior engineer eventually learns:&lt;/p&gt;

&lt;p&gt;Third-party providers fail more often than you expect.&lt;/p&gt;

&lt;p&gt;Twilio can degrade.&lt;br&gt;
SendGrid can throttle requests.&lt;br&gt;
FCM can delay pushes.&lt;/p&gt;

&lt;p&gt;The mistake is designing systems that assume providers are always available.&lt;/p&gt;

&lt;p&gt;Reliable systems assume failure is normal.&lt;/p&gt;

&lt;p&gt;So instead of hardcoding a single provider, introduce a provider routing layer.&lt;/p&gt;
&lt;h2&gt;
  
  
  Example Routing Strategy
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Channel&lt;/th&gt;
&lt;th&gt;Primary&lt;/th&gt;
&lt;th&gt;Fallback&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SMS&lt;/td&gt;
&lt;td&gt;Twilio&lt;/td&gt;
&lt;td&gt;Termii&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email&lt;/td&gt;
&lt;td&gt;SendGrid&lt;/td&gt;
&lt;td&gt;Mailgun&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Push&lt;/td&gt;
&lt;td&gt;Firebase FCM&lt;/td&gt;
&lt;td&gt;Direct APNs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The worker flow becomes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Attempt primary provider&lt;/li&gt;
&lt;li&gt;Detect timeout or failure&lt;/li&gt;
&lt;li&gt;Retry intelligently&lt;/li&gt;
&lt;li&gt;Automatically fail over if necessary&lt;/li&gt;
&lt;/ol&gt;
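
&lt;p&gt;The worker flow above can be sketched as a small routing function. The provider clients here are hypothetical placeholders, not real SDK calls:&lt;/p&gt;

```python
class ProviderError(Exception):
    """Raised by a provider client when a send fails or times out."""


def send_with_failover(message, providers):
    """Try each (name, send_function) pair in order; return the name that worked."""
    errors = []
    for name, send in providers:
        try:
            send(message)
            return name
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


# Hypothetical provider clients used only for this illustration.
def flaky_primary(message):
    raise ProviderError("timeout")


sent_messages = []


def healthy_fallback(message):
    sent_messages.append(message)


used = send_with_failover(
    "Your OTP is 123456",
    [("primary", flaky_primary), ("fallback", healthy_fallback)],
)
```

&lt;p&gt;Failing over after a timeout can itself cause a duplicate if the primary actually delivered, so this layer works best together with the idempotency checks from the previous section.&lt;/p&gt;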

&lt;p&gt;Users shouldn’t notice your provider had a bad day.&lt;/p&gt;

&lt;p&gt;That’s the goal.&lt;/p&gt;


&lt;h1&gt;
  
  
  4. Retry Strategy
&lt;/h1&gt;

&lt;p&gt;Retries sound simple until they start causing damage.&lt;/p&gt;

&lt;p&gt;Bad retry systems can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;spam users&lt;/li&gt;
&lt;li&gt;overload providers&lt;/li&gt;
&lt;li&gt;amplify outages&lt;/li&gt;
&lt;li&gt;generate huge costs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A common mistake is retrying too aggressively.&lt;/p&gt;

&lt;p&gt;If Twilio is already struggling, hammering it with thousands of immediate retries only makes things worse.&lt;/p&gt;

&lt;p&gt;Instead, use exponential backoff.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Retry #1 → 30 seconds
Retry #2 → 2 minutes
Retry #3 → 10 minutes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives providers time to recover while keeping pressure manageable.&lt;/p&gt;
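
&lt;p&gt;A minimal sketch of a backoff calculation, assuming a base of 30 seconds, a growth factor of 4, and a 10-minute cap. These parameters and the &lt;code&gt;backoff_delay&lt;/code&gt; name are illustrative choices, so the third delay comes out at 8 minutes rather than the 10 minutes shown above:&lt;/p&gt;

```python
import random


def backoff_delay(attempt, base=30.0, factor=4.0, cap=600.0, jitter=0.1):
    """Delay in seconds before retry number `attempt` (1-based).

    The delay grows geometrically, is capped, and is randomised by
    plus or minus `jitter` so retries do not all fire at the same moment.
    """
    delay = min(cap, base * factor ** (attempt - 1))
    spread = delay * jitter
    return delay + random.uniform(-spread, spread)


# With jitter disabled the schedule is 30 s, 2 min, then 8 min.
schedule = [backoff_delay(n, jitter=0.0) for n in (1, 2, 3)]
```

&lt;p&gt;The jitter matters at scale: without it, thousands of failed messages would all retry at the same instant and recreate the original spike.&lt;/p&gt;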

&lt;p&gt;And after maximum retries?&lt;/p&gt;

&lt;p&gt;Move the message into a Dead Letter Queue (DLQ).&lt;/p&gt;

&lt;p&gt;That queue is basically your “something unusual happened here” bucket.&lt;/p&gt;

&lt;p&gt;At that point, engineers should be alerted.&lt;/p&gt;
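
&lt;p&gt;Here’s a small sketch of that hand-off, assuming a limit of three retries and a plain list as the dead letter queue. The &lt;code&gt;process&lt;/code&gt; function and the event fields are hypothetical:&lt;/p&gt;

```python
MAX_RETRIES = 3
dead_letter_queue = []


def process(event, send, schedule_retry):
    """Send once; on failure either schedule a retry or dead-letter the event."""
    try:
        send(event)
        return "sent"
    except Exception as exc:
        if event["retries"] >= MAX_RETRIES:
            dead_letter_queue.append({**event, "last_error": str(exc)})
            return "dead_lettered"  # this is where an alert would fire
        schedule_retry({**event, "retries": event["retries"] + 1})
        return "retry_scheduled"


def always_fails(event):
    raise RuntimeError("provider unavailable")


# Drive one event through every retry until it is dead-lettered.
pending = [{"id": "n-1", "retries": 0}]
outcomes = []
while pending:
    outcomes.append(process(pending.pop(), always_fails, pending.append))
```

&lt;p&gt;In a real system &lt;code&gt;schedule_retry&lt;/code&gt; would re-enqueue the event with a delay instead of appending it straight back to a list.&lt;/p&gt;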




&lt;h1&gt;
  
  
  5. Reconciliation Jobs
&lt;/h1&gt;

&lt;p&gt;One subtle issue in distributed systems:&lt;/p&gt;

&lt;p&gt;Sometimes providers say “accepted” even though delivery eventually fails.&lt;/p&gt;

&lt;p&gt;That creates dangerous blind spots.&lt;/p&gt;

&lt;p&gt;A notification may look successful internally while the user never actually receives it.&lt;/p&gt;

&lt;p&gt;This is why reconciliation jobs matter.&lt;/p&gt;

&lt;p&gt;Every few minutes, background jobs should scan for suspicious states:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Notifications stuck in "pending" for too long
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;→ Re-query provider APIs
→ Update delivery status
→ Retry if needed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
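
&lt;p&gt;A reconciliation pass could look roughly like this. The five-minute threshold is an assumption, and &lt;code&gt;fetch_provider_status&lt;/code&gt; is a hypothetical callable standing in for a provider’s status API:&lt;/p&gt;

```python
import time

STALE_AFTER_SECONDS = 300


def reconcile(notifications, fetch_provider_status, now=None):
    """Refresh notifications that have been 'pending' for too long.

    Returns the ids of notifications the provider reports as failed,
    so the caller can decide whether to retry them.
    """
    now = time.time() if now is None else now
    to_retry = []
    for item in notifications:
        if item["status"] != "pending":
            continue
        if now - item["created_at"] >= STALE_AFTER_SECONDS:
            item["status"] = fetch_provider_status(item["id"])
            if item["status"] == "failed":
                to_retry.append(item["id"])
    return to_retry


records = [
    {"id": "a", "status": "pending", "created_at": 0},
    {"id": "b", "status": "pending", "created_at": 900},
    {"id": "c", "status": "pending", "created_at": 100},
]
provider_view = {"a": "delivered", "c": "failed"}
retry_ids = reconcile(records, provider_view.get, now=1000)
```

&lt;p&gt;Notification &lt;code&gt;b&lt;/code&gt; is left alone because it is still inside the grace period.&lt;/p&gt;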



&lt;p&gt;These jobs quietly save systems from edge cases caused by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;webhook failures&lt;/li&gt;
&lt;li&gt;transient outages&lt;/li&gt;
&lt;li&gt;network interruptions&lt;/li&gt;
&lt;li&gt;delayed provider processing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A lot of reliability engineering is really just building systems that continuously self-correct.&lt;/p&gt;




&lt;h1&gt;
  
  
  6. User Preferences &amp;amp; Rate Limiting
&lt;/h1&gt;

&lt;p&gt;Good notification systems are not just reliable.&lt;/p&gt;

&lt;p&gt;They’re respectful.&lt;/p&gt;

&lt;p&gt;Users should control how they’re contacted.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;disable marketing SMS&lt;/li&gt;
&lt;li&gt;mute non-critical notifications after 10PM&lt;/li&gt;
&lt;li&gt;push-only preferences&lt;/li&gt;
&lt;li&gt;email digests instead of instant alerts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Simple table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;user_notification_settings
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…can dramatically improve user experience.&lt;/p&gt;
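
&lt;p&gt;As an illustration, a preference check might run before anything is queued. The field names and the 7 AM end of quiet hours are assumptions, not part of any particular schema:&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class NotificationSettings:
    """One row of a hypothetical user_notification_settings table."""
    marketing_sms: bool = True
    quiet_start: time = time(22, 0)  # 10 PM
    quiet_end: time = time(7, 0)     # assumed end of quiet hours


def should_send(settings, channel, category, critical, local_time):
    """Apply user preferences before a notification is queued for delivery."""
    if channel == "sms" and category == "marketing" and not settings.marketing_sms:
        return False
    # The quiet window wraps past midnight, so either side of it counts.
    in_quiet_hours = (
        local_time >= settings.quiet_start or settings.quiet_end > local_time
    )
    if in_quiet_hours and not critical:
        return False
    return True


prefs = NotificationSettings(marketing_sms=False)
otp_at_night = should_send(prefs, "sms", "otp", True, time(23, 30))
promo_at_noon = should_send(prefs, "sms", "marketing", False, time(12, 0))
```

&lt;p&gt;Critical messages such as OTPs and security alerts bypass quiet hours here, which is usually what users expect.&lt;/p&gt;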

&lt;p&gt;Rate limiting matters too.&lt;/p&gt;

&lt;p&gt;Without it, bugs or loops can become expensive very quickly.&lt;/p&gt;

&lt;p&gt;Imagine accidentally sending OTPs in a retry loop to thousands of users.&lt;/p&gt;

&lt;p&gt;Redis-based limits help protect against this.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Max 3 SMS/hour/user
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That protects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;users from spam&lt;/li&gt;
&lt;li&gt;providers from overload&lt;/li&gt;
&lt;li&gt;the business from runaway costs&lt;/li&gt;
&lt;/ul&gt;
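
&lt;p&gt;A fixed-window limiter for that rule can be sketched as follows. An in-memory dictionary stands in for Redis here, where the same idea is typically a counter key with an expiry:&lt;/p&gt;

```python
import time
from collections import defaultdict

LIMIT = 3
WINDOW_SECONDS = 3600

# Maps (user_id, window_number) to the count of SMS sent in that window.
_counters = defaultdict(int)


def allow_sms(user_id, now=None):
    """Return True if the user is still under the hourly SMS limit."""
    now = time.time() if now is None else now
    window = int(now // WINDOW_SECONDS)
    key = (user_id, window)
    if _counters[key] >= LIMIT:
        return False
    _counters[key] += 1
    return True


decisions = [allow_sms("user-1", now=1000) for _ in range(4)]
next_hour = allow_sms("user-1", now=1000 + WINDOW_SECONDS)
```

&lt;p&gt;A fixed window is simple but allows short bursts around the window boundary. A sliding window is stricter if that matters.&lt;/p&gt;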




&lt;h1&gt;
  
  
  7. Monitoring &amp;amp; Observability
&lt;/h1&gt;

&lt;p&gt;At scale, invisible systems are dangerous systems.&lt;/p&gt;

&lt;p&gt;You need to know:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are queues growing?&lt;/li&gt;
&lt;li&gt;Are workers failing?&lt;/li&gt;
&lt;li&gt;Which provider is slowing down?&lt;/li&gt;
&lt;li&gt;Which region is affected?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The most important metrics are usually boring operational ones.&lt;/p&gt;

&lt;h2&gt;
  
  
  Golden Signals
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;queue latency&lt;/li&gt;
&lt;li&gt;queue depth&lt;/li&gt;
&lt;li&gt;worker error rate&lt;/li&gt;
&lt;li&gt;provider response times&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then business-level metrics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;delivery success rate&lt;/li&gt;
&lt;li&gt;percentage delivered within 30 seconds&lt;/li&gt;
&lt;li&gt;provider costs&lt;/li&gt;
&lt;li&gt;user opt-out rates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And finally: alerts.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Alert if SMS failure rate exceeds 5% for 2 minutes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The earlier you detect degradation, the smaller the incident becomes.&lt;/p&gt;
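
&lt;p&gt;To make the rule concrete, here is a small sketch of the “above 5% for 2 minutes” check. In practice this would more likely live in the monitoring stack, for example as a Prometheus alerting rule with a &lt;code&gt;for&lt;/code&gt; duration. The &lt;code&gt;FailureRateAlert&lt;/code&gt; class is hypothetical:&lt;/p&gt;

```python
from collections import deque


class FailureRateAlert:
    """Fires when the failure rate stays above a threshold for a full window."""

    def __init__(self, threshold=0.05, window_seconds=120):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp, failure_rate) pairs

    def observe(self, timestamp, failure_rate):
        """Record a sample and return True if the alert should fire."""
        self.samples.append((timestamp, failure_rate))
        # Drop samples that are older than the window.
        while timestamp - self.samples[0][0] > self.window_seconds:
            self.samples.popleft()
        covers_window = timestamp - self.samples[0][0] >= self.window_seconds
        return covers_window and all(
            rate > self.threshold for _, rate in self.samples
        )


alert = FailureRateAlert()
fired = [alert.observe(t, 0.08) for t in (0, 30, 60, 90, 120)]
```

&lt;p&gt;Requiring the condition to hold for the whole window keeps a single bad scrape from paging someone.&lt;/p&gt;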




&lt;h1&gt;
  
  
  8. Testing Failure Scenarios
&lt;/h1&gt;

&lt;p&gt;The biggest difference between systems that &lt;em&gt;look&lt;/em&gt; reliable and systems that &lt;em&gt;are&lt;/em&gt; reliable is failure testing.&lt;/p&gt;

&lt;p&gt;Because everything works in happy-path demos.&lt;/p&gt;

&lt;p&gt;The real question is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;What happens when dependencies misbehave?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;One useful strategy is shadow testing.&lt;/p&gt;

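&lt;p&gt;Route a tiny percentage of production traffic through a new provider and compare the results against the primary. Each message should still go through exactly one provider, so the comparison never produces duplicate sends.&lt;/p&gt;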

&lt;p&gt;Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;compare latency&lt;/li&gt;
&lt;li&gt;compare delivery rates&lt;/li&gt;
&lt;li&gt;validate formatting consistency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Chaos testing is also incredibly valuable.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Intentionally fail 10% of Twilio requests in staging
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That sounds scary initially.&lt;/p&gt;

&lt;p&gt;But it validates whether:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;retries actually work&lt;/li&gt;
&lt;li&gt;failovers trigger correctly&lt;/li&gt;
&lt;li&gt;reconciliation jobs recover messages&lt;/li&gt;
&lt;/ul&gt;
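
&lt;p&gt;Fault injection like this can be sketched as a thin wrapper around the send function. The wrapper and exception names are hypothetical, and it is meant for staging only:&lt;/p&gt;

```python
import random


class InjectedFailure(Exception):
    """Simulated provider failure raised by the chaos wrapper."""


def with_fault_injection(send, failure_rate=0.10, rng=None):
    """Wrap a send function so a fraction of calls fail on purpose."""
    rng = rng or random.Random()

    def wrapped(message):
        if failure_rate > rng.random():
            raise InjectedFailure("injected failure for chaos test")
        return send(message)

    return wrapped


def fake_send(message):
    return "ok"


chaotic_send = with_fault_injection(fake_send, failure_rate=0.10,
                                    rng=random.Random(1))
failures = 0
for _ in range(1000):
    try:
        chaotic_send("test message")
    except InjectedFailure:
        failures += 1
```

&lt;p&gt;Roughly one call in ten fails, which is enough to exercise retries, failover, and the dead letter queue without touching the real provider.&lt;/p&gt;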

&lt;p&gt;Reliable systems are engineered through controlled failure exposure.&lt;/p&gt;




&lt;h1&gt;
  
  
  Why This Architecture Works
&lt;/h1&gt;

&lt;p&gt;What makes this architecture resilient is that it assumes bad things will happen.&lt;/p&gt;

&lt;p&gt;Because eventually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;providers fail&lt;/li&gt;
&lt;li&gt;queues spike&lt;/li&gt;
&lt;li&gt;retries happen&lt;/li&gt;
&lt;li&gt;networks become unreliable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system survives because reliability is built into the architecture itself.&lt;/p&gt;

&lt;p&gt;By combining:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;queues&lt;/li&gt;
&lt;li&gt;idempotency&lt;/li&gt;
&lt;li&gt;retries&lt;/li&gt;
&lt;li&gt;fallback providers&lt;/li&gt;
&lt;li&gt;reconciliation jobs&lt;/li&gt;
&lt;li&gt;observability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…the platform continues operating even during partial outages and heavy traffic spikes.&lt;/p&gt;

&lt;p&gt;And in fintech, reliability isn’t just infrastructure quality.&lt;/p&gt;

&lt;p&gt;It directly affects user trust.&lt;/p&gt;




&lt;h1&gt;
  
  
  Tech Stack Example
&lt;/h1&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Suggested Tech&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Queue&lt;/td&gt;
&lt;td&gt;Redis Streams / Kafka / SQS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Workers&lt;/td&gt;
&lt;td&gt;Laravel Queues / Go Workers / Node Consumers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cache&lt;/td&gt;
&lt;td&gt;Redis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;PostgreSQL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monitoring&lt;/td&gt;
&lt;td&gt;Prometheus + Grafana&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alerts&lt;/td&gt;
&lt;td&gt;PagerDuty / Slack&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMS&lt;/td&gt;
&lt;td&gt;Twilio / Termii&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email&lt;/td&gt;
&lt;td&gt;SendGrid / Mailgun&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Push&lt;/td&gt;
&lt;td&gt;Firebase FCM&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h1&gt;
  
  
  Final Thoughts
&lt;/h1&gt;

&lt;p&gt;Most notification systems work during normal traffic.&lt;/p&gt;

&lt;p&gt;That’s not the hard part.&lt;/p&gt;

&lt;p&gt;The hard part is surviving:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;provider outages&lt;/li&gt;
&lt;li&gt;duplicate retry scenarios&lt;/li&gt;
&lt;li&gt;partial failures&lt;/li&gt;
&lt;li&gt;sudden spikes in traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s where architecture starts to matter.&lt;/p&gt;

&lt;p&gt;Because users rarely remember the notifications that worked.&lt;/p&gt;

&lt;p&gt;They remember the moments when communication failed during something important.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>backend</category>
      <category>distributedsystems</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Integrating Flutterwave’s secure payment gateway into your website using PHP</title>
      <dc:creator>Isaac Ojerumu</dc:creator>
      <pubDate>Sun, 23 Oct 2022 01:43:31 +0000</pubDate>
      <link>https://forem.com/ejiro/integrating-flutterwaves-secure-payment-gateway-into-your-website-using-php-4187</link>
      <guid>https://forem.com/ejiro/integrating-flutterwaves-secure-payment-gateway-into-your-website-using-php-4187</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfotlucsuy9zhhmqe8sm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfotlucsuy9zhhmqe8sm.png" alt="Flutterwave banner" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Flutterwave is one of the leading payment gateways in Nigeria and across Africa. If your business needs to accept secure online payments from customers, I suggest you try it out. You can collect payments in USD, GBP, EUR, NGN and a host of other African currencies, and it offers many payment methods that let customers pay easily and securely. In this guide, I’ll show you how to integrate Rave, Flutterwave’s payment product, into your website. So let’s dive in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A. Create a Flutterwave account&lt;/strong&gt;&lt;br&gt;
If you have not created an account yet, go to &lt;a href="https://app.flutterwave.com/register" rel="noopener noreferrer"&gt;https://app.flutterwave.com/register&lt;/a&gt; to create one. If you already have one, log in with your account details. Navigate to the settings page and click on the API keys tab to get your public and secret keys; we are going to need them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbqyoe24r7jeb8kuf8jck.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbqyoe24r7jeb8kuf8jck.jpg" alt="Flutterwave register page" width="800" height="393"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbuijyerzb2jku4otf3a.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbuijyerzb2jku4otf3a.jpg" alt="Flutterwave settings page" width="799" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;B. Collect payment information and charge the customer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We have to collect the user’s email, the amount to be paid, and the currency if the payment will not be in naira (NGN). You can provide input fields for the user to fill in the amount and their email, or use hidden inputs if the amount is fixed or you already have the user’s info. There are four (4) ways of doing this, as listed in the Flutterwave documentation:&lt;/p&gt;

&lt;p&gt;(i) Flutterwave inline (ii) Flutterwave standard (iii) HTML checkout (iv) Direct charge&lt;/p&gt;

&lt;p&gt;Let's look at integrating by Flutterwave standard method using the PHP language. You can refer to the documentation on the other methods: &lt;a href="https://developer.flutterwave.com/docs/collecting-payments/overview" rel="noopener noreferrer"&gt;https://developer.flutterwave.com/docs/collecting-payments/overview&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flutterwave standard&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Below is a sample form page I created with HTML and CSS (bootstrap) for demonstration. Here the user is to fund his wallet on a telecom (VTU) website.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;html lang="en"&amp;gt;
&amp;lt;head&amp;gt;
    &amp;lt;meta charset="UTF-8"&amp;gt;
    &amp;lt;meta http-equiv="X-UA-Compatible" content="IE=edge"&amp;gt;
    &amp;lt;meta name="viewport" content="width=device-width, initial-scale=1.0"&amp;gt;
    &amp;lt;title&amp;gt;Fund Wallet&amp;lt;/title&amp;gt;
    &amp;lt;link rel="stylesheet" href="bootstrap.min.css" /&amp;gt;
&amp;lt;/head&amp;gt;
&amp;lt;body&amp;gt;
    &amp;lt;div class="container-xxl flex-grow-1 container-p-y"&amp;gt;
     &amp;lt;div class="row"&amp;gt;
       &amp;lt;div class="col-xl"&amp;gt;
         &amp;lt;div class="card mb-4"&amp;gt;
           &amp;lt;div class="card-header d-flex justify-content-between align-items-center"&amp;gt;
             &amp;lt;h5 class="mb-0"&amp;gt;Fund account with Flutterwave&amp;lt;/h5&amp;gt;
            &amp;lt;/div&amp;gt;
            &amp;lt;div class="card-body"&amp;gt;
              &amp;lt;form method="post" action="process_payment.php" id="form"&amp;gt;                  
                &amp;lt;div class="mb-3"&amp;gt;
                  &amp;lt;label class="form-label" for="phone-number"&amp;gt;Amount*&amp;lt;/label&amp;gt;
                   &amp;lt;input type="tel" name="amount" class="form-control" required&amp;gt;
                &amp;lt;/div&amp;gt;

                &amp;lt;div class="mb-3"&amp;gt;
                  &amp;lt;input type="email" name="email" class="form-control"  required&amp;gt;
                &amp;lt;/div&amp;gt;
                &amp;lt;div class="mb-3"&amp;gt;
                  &amp;lt;input type="tel" name="phone" class="form-control" required&amp;gt;
                &amp;lt;/div&amp;gt;

                &amp;lt;button type="submit" class="btn btn-primary action" name="pay"&amp;gt; Pay&amp;lt;/button&amp;gt;
              &amp;lt;/form&amp;gt;
            &amp;lt;/div&amp;gt;
          &amp;lt;/div&amp;gt;
        &amp;lt;/div&amp;gt;

      &amp;lt;/div&amp;gt;          
   &amp;lt;/div&amp;gt;
  &amp;lt;script src="jquery.min.js"&amp;gt;&amp;lt;/script&amp;gt;
  &amp;lt;script src="bootstrap.min.js"&amp;gt;&amp;lt;/script&amp;gt;
&amp;lt;/body&amp;gt;
&amp;lt;/html&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here the user information is submitted in a form to a backend script. I named my own script process_payment.php. This script will call the &lt;a href="https://api.ravepay.co/flwv3-pug/getpaidx/api/v2/hosted/pay" rel="noopener noreferrer"&gt;https://api.ravepay.co/flwv3-pug/getpaidx/api/v2/hosted/pay&lt;/a&gt; endpoint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?php

if (isset($_POST['transfer'])) {
  extract($_POST);

   $curl = curl_init();

    $customer_email = $email; 
    $amount_pay = $amount;
    $currency = "NGN";
    $txref = "rave" . uniqid(); // ensure you generate unique references per transaction.
    // get your public key from the dashboard.
    $PBFPubKey = "FLWPUBK-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-X"; 
    $redirect_url = "https://www.fltwvapp.herokuapp.com/redirect.php?email=$email&amp;amp;amount=$amount_pay"; // Set your own redirect URL

     curl_setopt_array($curl, array(
      CURLOPT_URL =&amp;gt; "https://api.ravepay.co/flwv3-pug/getpaidx/api/v2/hosted/pay",
      CURLOPT_RETURNTRANSFER =&amp;gt; true,
      CURLOPT_CUSTOMREQUEST =&amp;gt; "POST",
      CURLOPT_POSTFIELDS =&amp;gt; json_encode([
        'amount'=&amp;gt;$amount_pay,
        'customer_email'=&amp;gt;$customer_email,
        'currency'=&amp;gt;$currency,
        'txref'=&amp;gt;$txref,
        'PBFPubKey'=&amp;gt;$PBFPubKey,
        'redirect_url'=&amp;gt;$redirect_url,
      ]),
      CURLOPT_HTTPHEADER =&amp;gt; [
        "content-type: application/json",
        "cache-control: no-cache"
      ],
    ));

    $response = curl_exec($curl);
    $err = curl_error($curl);

    if($err){
      // there was an error contacting the rave API
      die('Curl returned error: ' . $err);
    }

    $transaction = json_decode($response);

    if(!$transaction-&amp;gt;data &amp;amp;&amp;amp; !$transaction-&amp;gt;data-&amp;gt;link){
      // there was an error from the API
      print_r('API returned error: ' . $transaction-&amp;gt;message);
    }

    // redirect to page so User can pay

    header('Location: ' . $transaction-&amp;gt;data-&amp;gt;link); 
}
?&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Assign the email and amount submitted by the POST request to variables, and validate them so that a user cannot submit anything malicious. Generate a unique transaction reference. It is advisable to store this reference, the amount, and any other information about the product or service being paid for in a database, with the transaction status set to pending. Paste in your public key and specify your redirect URL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;em&gt;Your redirect URL must be hosted on a web host&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;cURL is used to call the &lt;a href="https://api.ravepay.co/flwv3-pug/getpaidx/api/v2/hosted/pay" rel="noopener noreferrer"&gt;https://api.ravepay.co/flwv3-pug/getpaidx/api/v2/hosted/pay&lt;/a&gt; endpoint. The amount, email, currency, unique reference, your public key and redirect URL are passed to it. Use json_decode to decode the response returned by the API and store it in a transaction variable. Check that the transaction object has the data property and that it contains the link that will process the payment (this link comes from Flutterwave and is not the redirect URL you provided earlier). The user is then redirected to that link, which is the checkout page where they enter their card details.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3e9hsqdfbm0p8jeco61d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3e9hsqdfbm0p8jeco61d.jpg" alt=" " width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When the transaction is completed, Flutterwave will now redirect to the URL provided earlier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;C. Verify the payment status&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After the transaction has been completed, use the reference returned in the redirect URL to check for the status of the transaction. Below is the redirect.php file which does the verification for us.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?php

if (isset($_GET['txref'])) {
  $ref = $_GET['txref'];
  $amount = $_GET['amount']; //Get the correct amount of your product
  $email = $_GET['email'];
  $currency = "NGN"; //Correct Currency from Server

  $query = array(
    "SECKEY" =&amp;gt; "FLWSECK-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-X",
    "txref" =&amp;gt; $ref
  );

  $data_string = json_encode($query);

  $ch = curl_init('https://api.ravepay.co/flwv3-pug/getpaidx/api/v2/verify');                                                                      
  curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
  curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);                                              
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
  curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));

  $response = curl_exec($ch);

  $header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
  $header = substr($response, 0, $header_size);
  $body = substr($response, $header_size);

  curl_close($ch);

  $resp = json_decode($response, true);

  $paymentStatus = $resp['data']['status'];
  $chargeResponsecode = $resp['data']['chargecode'];
  $chargeAmount = $resp['data']['amount'];
  $chargeCurrency = $resp['data']['currency'];

  if (($chargeResponsecode == "00" || $chargeResponsecode == "0") &amp;amp;&amp;amp; ($chargeAmount == $amount)  &amp;amp;&amp;amp; ($chargeCurrency == $currency)) {
    // Transaction was successful.
    // Before giving value, check that you have not already given value for this ref
    // and that the email matches the customer who owns the product.
    // Give value and return to the success page.
    // var_dump($resp);
    header('location: success.html');
  } else {
    // Don't give value; return to the failure page
    // var_dump($resp);
    header('location: error.html');
  }
}
else {
  die('No reference supplied');
}

?&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use $_GET to collect the reference and other values passed to the redirect URL. JSON-encode your secret key (from your dashboard) together with your txRef, then POST them to the &lt;a href="https://api.ravepay.co/flwv3-pug/getpaidx/api/v2/verify" rel="noopener noreferrer"&gt;https://api.ravepay.co/flwv3-pug/getpaidx/api/v2/verify&lt;/a&gt; endpoint to verify the payment. Decode the response and check that the chargecode, amount and currency match the values you expect. Then use the txRef to update the status of the transaction, redirect the user to the success or error page, or do whatever else you need after a successful or failed transaction.&lt;/p&gt;
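
&lt;p&gt;The chargecode, amount and currency checks can also be pulled into one small function, so that the same rule is applied everywhere a payment is verified. This is only a sketch: isPaymentValid is a hypothetical name, not part of any Flutterwave library, and it assumes the decoded response has the data.chargecode, data.amount and data.currency fields that redirect.php reads. It compares the amount with a small tolerance instead of ==, which is a design choice rather than something the API requires.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?php

// Hypothetical helper (not part of any Flutterwave SDK).
// $resp is the decoded verify response; the expected values come from your own records.
function isPaymentValid($resp, $expectedAmount, $expectedCurrency) {
  if (!is_array($resp) || !isset($resp['data']) || !is_array($resp['data'])) {
    return false; // the request failed or the body was not valid JSON
  }

  $data = $resp['data'];
  $code = isset($data['chargecode']) ? (string) $data['chargecode'] : '';
  $amount = isset($data['amount']) ? (float) $data['amount'] : -1.0;
  $currency = isset($data['currency']) ? $data['currency'] : '';

  // All three must hold: success code, expected amount, expected currency
  return ($code === '00' || $code === '0')
    &amp;amp;&amp;amp; abs($amount - (float) $expectedAmount) &amp;lt; 0.01
    &amp;amp;&amp;amp; $currency === $expectedCurrency;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In redirect.php, the long if condition could then become if (isPaymentValid($resp, $amount, $currency)).&lt;/p&gt;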

&lt;p&gt;&lt;strong&gt;D. Use webhook to ensure stability&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Webhooks notify our server that a payment event has just occurred: an application that implements webhooks POSTs a message to a URL when certain things happen. Webhooks are optional. Everything will work without them, but they make our application more reliable.&lt;/p&gt;

&lt;p&gt;Suppose a user completes a payment successfully but never reaches your redirect URL, perhaps because of a poor network signal or because they closed the browser or the payment tab by mistake. Without webhooks, the only way to find out the status of that payment is to visit the dashboard and cross-check manually. With webhooks, you still learn whether the transaction succeeded or failed, so you can update your transaction table and give or deny the user access to the value being paid for whenever they come back.&lt;/p&gt;

&lt;p&gt;Another case is a user who subscribes to a service and is billed periodically. Each time the user is billed, the status of the transaction is sent to your webhook URL, so you can keep offering the service or stop if the charge failed. Webhooks also cover payments such as mobile money or USSD, where the transaction is completed outside your application.&lt;/p&gt;

&lt;p&gt;Enable webhooks from your dashboard:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkrhokbhhz67jog3hqopm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkrhokbhhz67jog3hqopm.jpg" alt=" " width="799" height="409"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Below is a sample webhook from the documentation page.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?php

// Retrieve the request's body
$body = @file_get_contents("php://input");

// Retrieve the signature sent in the request headers.
// PHP exposes the verif-hash header as $_SERVER['HTTP_VERIF_HASH'].
$signature = (isset($_SERVER['HTTP_VERIF_HASH']) ? $_SERVER['HTTP_VERIF_HASH'] : '');

/* It is a good idea to log all events received. Add code *
 * here to log the signature and body to db or file       */

if (!$signature) {
    // only a post with rave signature header gets our attention
    exit();
}

// Store the same signature on your server as an env variable and check against what was sent in the headers
$local_signature = getenv('SECRET_HASH');

// confirm the event's signature
if( $signature !== $local_signature ){
  // silently forget this ever happened
  exit();
}

http_response_code(200); // PHP 5.4 or greater
// Parse the event (a JSON string) as an object.
// Give value to your customer, but don't produce any output:
// this is a call from Rave's servers, and your customer
// does not see the response here at all.

$response = json_decode($body);
if ($response-&amp;gt;body-&amp;gt;status == 'successful') {
    # code...
    // Update your database and set the transaction status to successful
}
exit();

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
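
&lt;p&gt;The sample leaves the logging step as a comment. One possible way to fill it in is to append one JSON line per call to a file, as sketched here. The function name logWebhookEvent and the log file path are hypothetical, and the sketch records only whether a signature was present rather than the signature itself, which is a choice made here and not something the documentation prescribes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?php

// Hypothetical helper: appends one JSON line per webhook call to $logFile.
// Returns true if the line was written.
function logWebhookEvent($logFile, $signature, $body) {
  $entry = json_encode(array(
    'received_at' =&amp;gt; date('c'),           // ISO 8601 timestamp
    'has_signature' =&amp;gt; $signature !== '', // avoid writing the secret hash to disk
    'body' =&amp;gt; $body,                      // raw request body, kept as a string
  ));

  // LOCK_EX stops two simultaneous webhook calls from interleaving their lines
  return file_put_contents($logFile, $entry . PHP_EOL, FILE_APPEND | LOCK_EX) !== false;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In the webhook file, it would be called right after $signature is read and before the first exit(), for example logWebhookEvent(__DIR__ . '/webhook.log', $signature, $body).&lt;/p&gt;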



&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Your redirect page and your webhook may end up doing the same thing. In both of them, first check whether the payment status has already been updated, so that you do not give value twice.&lt;/p&gt;
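
&lt;p&gt;One way to make that check safe when the redirect and the webhook arrive at almost the same time is to let the database decide who gets to give value. The sketch below assumes a hypothetical transactions table with txref and status columns, a status of 'pending' before payment, and a PDO connection; none of these names come from Flutterwave.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?php

// Hypothetical table: transactions(txref, status).
// Returns true only for the first caller that moves the row
// from 'pending' to 'successful'; every later caller gets false.
function markPaidOnce(PDO $db, $txref) {
  $stmt = $db-&amp;gt;prepare(
    "UPDATE transactions SET status = 'successful' WHERE txref = ? AND status = 'pending'"
  );
  $stmt-&amp;gt;execute(array($txref));

  // rowCount() is the number of rows the UPDATE changed: 1 the first time, 0 afterwards
  return $stmt-&amp;gt;rowCount() === 1;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Both pages can call markPaidOnce after verifying the payment and give value only when it returns true.&lt;/p&gt;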

</description>
    </item>
  </channel>
</rss>
