<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: ohyeah</title>
    <description>The latest articles on Forem by ohyeah (@ohyeah_d04cd4c2cd46a1ad2c).</description>
    <link>https://forem.com/ohyeah_d04cd4c2cd46a1ad2c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3905431%2F7aaf17d8-ee8b-4d77-98ed-602442d3f592.jpg</url>
      <title>Forem: ohyeah</title>
      <link>https://forem.com/ohyeah_d04cd4c2cd46a1ad2c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ohyeah_d04cd4c2cd46a1ad2c"/>
    <language>en</language>
    <item>
      <title>Your Next.js health check is lying to you (and how to fix it)</title>
      <dc:creator>ohyeah</dc:creator>
      <pubDate>Fri, 01 May 2026 08:55:55 +0000</pubDate>
      <link>https://forem.com/ohyeah_d04cd4c2cd46a1ad2c/your-nextjs-health-check-is-lying-to-you-and-how-to-fix-it-1ho4</link>
      <guid>https://forem.com/ohyeah_d04cd4c2cd46a1ad2c/your-nextjs-health-check-is-lying-to-you-and-how-to-fix-it-1ho4</guid>
      <description>&lt;p&gt;I've been monitoring my own SaaS in production for the last two months, and I've watched the same bug pattern hit indie projects over and over:&lt;/p&gt;

&lt;p&gt;The app is on fire. Customers are seeing 500s. Stripe webhooks are silently failing. And yet &lt;code&gt;GET /api/health&lt;/code&gt; is cheerfully returning &lt;code&gt;200 OK&lt;/code&gt;, every minute, like nothing's wrong.&lt;/p&gt;

&lt;p&gt;The reason is almost always the same: the health check is testing the wrong thing.&lt;/p&gt;

&lt;p&gt;This post is about what a health check should &lt;em&gt;actually&lt;/em&gt; do, the three layers of "healthy", and a working Next.js 13+ implementation you can paste in.&lt;/p&gt;




&lt;h2&gt;
  
  
  The "useless 200" pattern
&lt;/h2&gt;

&lt;p&gt;Here's the health check I see most often in indie Next.js codebases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/app/api/health/route.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;GET&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This endpoint can only fail in one way: the Next.js process itself is dead. If that happens, your hosting platform was already going to know — Vercel/Render/Fly notice the process crashed before your monitor does.&lt;/p&gt;

&lt;p&gt;What this endpoint &lt;em&gt;cannot&lt;/em&gt; tell you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Did someone rename &lt;code&gt;DATABASE_URL&lt;/code&gt; to &lt;code&gt;DATABASE_POOL_URL&lt;/code&gt; in env vars and forget to update the code?&lt;/li&gt;
&lt;li&gt;Did your Supabase service-role token expire?&lt;/li&gt;
&lt;li&gt;Did the connection pool max out and start refusing connections?&lt;/li&gt;
&lt;li&gt;Did a middleware change start returning 308 redirects to &lt;code&gt;/login&lt;/code&gt; for everything?&lt;/li&gt;
&lt;li&gt;Is your background queue stuck?&lt;/li&gt;
&lt;li&gt;Is the Stripe webhook handler returning 200 but silently swallowing events?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every one of those bugs has hit a real production app I know of in the last 90 days. None of them were caught by a &lt;code&gt;return { ok: true }&lt;/code&gt; health check. All of them were eventually caught by &lt;em&gt;customer&lt;/em&gt; complaints — the worst possible monitor.&lt;/p&gt;




&lt;h2&gt;
  
  
  The three layers of "healthy"
&lt;/h2&gt;

&lt;p&gt;Before showing the fix, the mental model that makes this easier:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: shallow.&lt;/strong&gt; "Is the function reachable?" This is the useless 200. It tells you the runtime is up, nothing more.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: middle.&lt;/strong&gt; "Are my critical dependencies reachable from this function right now?" Database. Auth provider. Cache. The cheapest possible roundtrip that actually exercises auth and the connection pool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 3: deep.&lt;/strong&gt; "Is the entire system functioning?" Background workers running. Cron jobs not stuck. Queue not backed up. This is expensive and runs less often.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Most indie projects only need Layer 2.&lt;/strong&gt; Layer 1 is what you have today and it doesn't help. Layer 3 is what big shops do; you don't need it yet.&lt;/p&gt;

&lt;p&gt;The rest of this post is about doing Layer 2 correctly in Next.js.&lt;/p&gt;




&lt;h2&gt;
  
  
  The fix: a real Next.js 13+ health endpoint
&lt;/h2&gt;

&lt;p&gt;Here's the Route Handler I run on my own SaaS. It uses Supabase but the pattern is the same for any DB:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/app/api/health/route.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NextResponse&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;next/server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createSupabaseServiceRole&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/supabase/server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;runtime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;nodejs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dynamic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;force-dynamic&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;GET&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createSupabaseServiceRole&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// Cheapest possible call that exercises the connection pool + auth.&lt;/span&gt;
    &lt;span class="c1"&gt;// head: true returns no rows — microseconds, no payload.&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;profiles&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;exact&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;head&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;NextResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Cache-Control&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;no-store&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;NextResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Cache-Control&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;no-store&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are five non-obvious decisions in those 25 lines. Each one is a bug I've personally watched bite somebody.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. &lt;code&gt;runtime = "nodejs"&lt;/code&gt;, not edge
&lt;/h3&gt;

&lt;p&gt;Health checks should hit the same runtime your real traffic hits. If your app runs on the Node.js runtime (most indie SaaS), your health check should too. Otherwise you're testing a runtime your customers never use.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. &lt;code&gt;dynamic = "force-dynamic"&lt;/code&gt;
&lt;/h3&gt;

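&lt;p&gt;Without this, Next.js or your CDN can serve a cached 200 even after your DB is down. In Next.js 13 and 14, a &lt;code&gt;GET&lt;/code&gt; Route Handler that reads nothing from the request can be cached by default; Next.js 15 changed that default, but setting &lt;code&gt;dynamic&lt;/code&gt; explicitly means the behavior does not depend on the version. A cached response happily reports "healthy" while every customer request is failing. I've seen this exact bug in production, and it is hard to debug because the health check looks fine.&lt;/p&gt;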

&lt;h3&gt;
  
  
  3. &lt;code&gt;Cache-Control: no-store&lt;/code&gt; on every response
&lt;/h3&gt;

&lt;p&gt;Same reason. Belt and suspenders. CDNs respect &lt;code&gt;no-store&lt;/code&gt; even if Next.js gets it wrong.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. A real DB roundtrip — not &lt;code&gt;SELECT 1&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;SELECT 1&lt;/code&gt; works for raw Postgres, but it's a half-measure. You want a query that exercises:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connection pool acquisition (catches "pool exhausted")&lt;/li&gt;
&lt;li&gt;Auth (catches "service role token expired")&lt;/li&gt;
&lt;li&gt;A real table the app uses (catches "ran migrations on the wrong DB")&lt;/li&gt;
&lt;/ul&gt;

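&lt;p&gt;The &lt;code&gt;head: true&lt;/code&gt; count query above does all three. It transfers no rows, so the cost is roughly one network round trip plus the count itself. On a very large table an exact count is not free; if that becomes noticeable, supabase-js also accepts &lt;code&gt;count: "planned"&lt;/code&gt; or &lt;code&gt;count: "estimated"&lt;/code&gt;. Use the cheapest possible &lt;em&gt;real&lt;/em&gt; query, not the cheapest possible &lt;em&gt;fake&lt;/em&gt; query.&lt;/p&gt;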

&lt;h3&gt;
  
  
  5. &lt;code&gt;503&lt;/code&gt; on failure — not &lt;code&gt;200&lt;/code&gt; with &lt;code&gt;{ ok: false }&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is the one people get most wrong. Most upstream monitors (Kubernetes readiness and liveness probes, GCP/AWS load balancer health checks, external uptime tools) act on &lt;strong&gt;HTTP status&lt;/strong&gt;, not body content. If you return &lt;code&gt;200 { ok: false }&lt;/code&gt;, your monitor sees a successful response, the readiness probe keeps passing, and your platform never takes the bad pod out of rotation.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;503 Service Unavailable&lt;/code&gt; is the right status for "I'm running but I can't serve traffic." Use it.&lt;/p&gt;
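
&lt;p&gt;If more than one dependency is critical (the database, auth provider, and cache named under Layer 2), the same rule can be extended: run each check and return &lt;code&gt;503&lt;/code&gt; if any of them fails. The sketch below is one possible shape under that assumption, not the handler shown above; &lt;code&gt;Check&lt;/code&gt;, &lt;code&gt;Outcome&lt;/code&gt;, &lt;code&gt;runChecks&lt;/code&gt;, and &lt;code&gt;summarize&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```typescript
// Hypothetical names throughout: Check, Outcome, summarize, runChecks.
// Each check either returns (healthy) or throws/rejects (unhealthy).
type Check = { name: string; run: () => unknown };
type Outcome = { name: string; ok: boolean; error?: string };

// Pure step: turn per-dependency outcomes into an HTTP status and a body.
// Any single failure makes the whole endpoint answer 503.
function summarize(outcomes: Outcome[]) {
  const ok = outcomes.every((o) => o.ok);
  return { status: ok ? 200 : 503, body: { ok, checks: outcomes } };
}

// Run every check concurrently. Wrapping run() in an async arrow turns a
// synchronous throw into a rejection, so allSettled sees every failure.
async function runChecks(checks: Check[]) {
  const settled = await Promise.allSettled(checks.map(async (c) => c.run()));
  const outcomes = settled.map((s, i): Outcome =>
    s.status === "fulfilled"
      ? { name: checks[i].name, ok: true }
      : { name: checks[i].name, ok: false, error: String(s.reason) }
  );
  return summarize(outcomes);
}
```

&lt;p&gt;In a Route Handler, the result would map onto &lt;code&gt;NextResponse.json(result.body, { status: result.status, headers: { "Cache-Control": "no-store" } })&lt;/code&gt;. Keep the list short: every check added here is one more way for the endpoint to report 503.&lt;/p&gt;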




&lt;h2&gt;
  
  
  "But I want my health check at &lt;code&gt;/healthz&lt;/code&gt;"
&lt;/h2&gt;

&lt;p&gt;The k8s convention is &lt;code&gt;/healthz&lt;/code&gt;. Easy with a rewrite in &lt;code&gt;next.config.js&lt;/code&gt; (rewrites are a config-level feature, so this works regardless of which router you use):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;rewrites&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/healthz&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;destination&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/api/health&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now both URLs work and your liveness probe stays idiomatic.&lt;/p&gt;




&lt;h2&gt;
  
  
  Monitoring it from outside
&lt;/h2&gt;

&lt;p&gt;Here's the part that costs people the most, because they think a health check is enough by itself.&lt;/p&gt;

&lt;p&gt;It isn't. Your platform's liveness probe (Vercel internal, Kubernetes, etc.) checks the &lt;em&gt;pod&lt;/em&gt;. It does not check:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DNS — did your registrar accidentally let the domain expire?&lt;/li&gt;
&lt;li&gt;TLS — did your certificate auto-renewal silently fail?&lt;/li&gt;
&lt;li&gt;CDN edge — is Cloudflare serving stale 502s while origin is fine?&lt;/li&gt;
&lt;li&gt;The path between user and pod — region outage, BGP drama&lt;/li&gt;
&lt;li&gt;Third-party degradation — your code is fine, but Stripe/OpenAI is throwing 500s&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only way to catch these is to hit your &lt;em&gt;public URL&lt;/em&gt; from &lt;em&gt;outside your infra&lt;/em&gt;, on a schedule, from multiple regions.&lt;/p&gt;

&lt;p&gt;You can roll your own with cron-job.org + a Slack webhook in 30 minutes. Or use any external uptime monitor. I built &lt;a href="https://sitepulse.satosushi.co" rel="noopener noreferrer"&gt;SitePulse&lt;/a&gt; for this exact reason (the war story below was what kicked it off), but the stack doesn't matter. The point is: &lt;strong&gt;don't rely on your own infra to tell you your own infra is broken.&lt;/strong&gt;&lt;/p&gt;
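
&lt;p&gt;A roll-your-own probe can be very small. The sketch below assumes Node 18+ (global &lt;code&gt;fetch&lt;/code&gt; and &lt;code&gt;AbortSignal.timeout&lt;/code&gt;) and a Slack incoming webhook; &lt;code&gt;probe&lt;/code&gt;, &lt;code&gt;alertText&lt;/code&gt;, &lt;code&gt;checkAndAlert&lt;/code&gt;, and both URL parameters are placeholder names, and it has to run on a schedule from somewhere outside your own infra.&lt;/p&gt;

```typescript
// Hypothetical names: ProbeResult, probe, alertText, checkAndAlert.
type ProbeResult = { url: string; up: boolean; detail: string };

// One probe of the public URL. Anything other than a 2xx response inside
// the timeout counts as down.
async function probe(url: string, timeoutMs = 10_000) {
  try {
    const res = await fetch(url, {
      redirect: "manual", // a redirect to /login should count as a failure
      cache: "no-store",
      signal: AbortSignal.timeout(timeoutMs),
    });
    return { url, up: res.ok, detail: `HTTP ${res.status}` };
  } catch (e) {
    // DNS failure, TLS error, refused connection, and timeout all land here.
    return { url, up: false, detail: String(e) };
  }
}

// Pure step: decide what to say, if anything.
function alertText(r: ProbeResult) {
  return r.up ? null : `DOWN: ${r.url} (${r.detail})`;
}

// Probe once and post to Slack when the probe failed.
// Slack incoming webhooks accept a JSON body with a `text` field.
async function checkAndAlert(healthUrl: string, slackWebhookUrl: string) {
  const text = alertText(await probe(healthUrl));
  if (text === null) return;
  await fetch(slackWebhookUrl, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ text }),
  });
}
```

&lt;p&gt;As written, this posts on every failed run. A real setup would remember the previous result and only alert when the state changes, and would run from more than one region.&lt;/p&gt;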




&lt;h2&gt;
  
  
  The war story that made me write this
&lt;/h2&gt;

&lt;p&gt;Last year I shipped a deploy that renamed an env var. I'd updated &lt;code&gt;.env.example&lt;/code&gt;. I'd updated the code. I had not updated the production env var. The deploy went green. The 200-only &lt;code&gt;/api/health&lt;/code&gt; kept returning 200. CI passed because tests use a different config.&lt;/p&gt;

&lt;p&gt;For 41 minutes, every customer request to the affected endpoint returned 500. I noticed because someone tweeted at me.&lt;/p&gt;

&lt;p&gt;If the health check had done a real DB roundtrip with the &lt;em&gt;production&lt;/em&gt; config, it would have failed at deploy-time and the platform would have refused to promote the build. Instead it merrily reported "healthy" while 100% of real traffic broke.&lt;/p&gt;

&lt;p&gt;That bug cost me a customer. Worse, it cost me trust — they'd been one of my early users.&lt;/p&gt;

&lt;p&gt;The two-part fix (real DB query, 503 on failure) would have caught it inside the first request after deploy.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;A health check that returns &lt;code&gt;200&lt;/code&gt; without checking anything tells you the function is reachable. That's it. It's the cheapest possible information and it's almost never the information you need.&lt;/p&gt;

&lt;p&gt;A useful health check:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Hits the same runtime your customers hit (&lt;code&gt;runtime = "nodejs"&lt;/code&gt; if that's what you use)&lt;/li&gt;
&lt;li&gt;Refuses to be cached (&lt;code&gt;dynamic = "force-dynamic"&lt;/code&gt; + &lt;code&gt;Cache-Control: no-store&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Does a real, cheap roundtrip to your most fragile dependency&lt;/li&gt;
&lt;li&gt;Returns &lt;code&gt;503&lt;/code&gt; on failure, not &lt;code&gt;200&lt;/code&gt; with a flag in the body&lt;/li&gt;
&lt;li&gt;Is checked from &lt;em&gt;outside&lt;/em&gt; your infra, not just by your platform's internal probe&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Do that and you'll catch the boring bugs that take down indie SaaS — environment drift, expired tokens, silent CDN issues — before your customers do.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you found this useful, I wrote a &lt;a href="https://stackoverflow.com/questions/57956476/how-to-set-up-an-endpoint-for-health-check-on-next-js/79934561#79934561" rel="noopener noreferrer"&gt;shorter version on Stack Overflow&lt;/a&gt; covering the same patterns, and I publish more indie-SaaS-on-Vercel posts here on dev.to. The monitoring tool I built (&lt;a href="https://sitepulse.satosushi.co" rel="noopener noreferrer"&gt;SitePulse&lt;/a&gt;) is free for 5 monitors if you want the "external monitor" half of the story without writing it yourself.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>nextjs</category>
      <category>webdev</category>
      <category>devops</category>
      <category>supabase</category>
    </item>
    <item>
      <title>I built my own UptimeRobot in a weekend with Next.js 16 + Vercel Cron</title>
      <dc:creator>ohyeah</dc:creator>
      <pubDate>Thu, 30 Apr 2026 05:50:13 +0000</pubDate>
      <link>https://forem.com/ohyeah_d04cd4c2cd46a1ad2c/i-built-my-own-uptimerobot-in-a-weekend-with-nextjs-16-vercel-cron-34d9</link>
      <guid>https://forem.com/ohyeah_d04cd4c2cd46a1ad2c/i-built-my-own-uptimerobot-in-a-weekend-with-nextjs-16-vercel-cron-34d9</guid>
      <description>&lt;p&gt;I've been paying UptimeRobot for years. It works. The free tier is generous. I have no real beef with them.&lt;/p&gt;

&lt;p&gt;But every time I added a 6th monitor, the upgrade modal appeared. Every time I logged in to check a site, the dashboard nudged me toward Pro. Every time I wanted a public status page on my own domain, that was a paid feature too.&lt;/p&gt;

&lt;p&gt;Eventually I asked the question every indie dev asks at some point: how hard could this actually be?&lt;/p&gt;

&lt;p&gt;It turned out: a weekend to MVP, two weeks to ship to paying customers. Here's the architecture, the parts that surprised me, and the bugs that cost me an afternoon each.&lt;/p&gt;

&lt;h2&gt;
  
  
  The whole product, on one page
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Probe a list of URLs every minute.&lt;/strong&gt; HEAD or GET, optional body keyword check.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detect "down" reliably&lt;/strong&gt; — don't email you because of one flaky packet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email when status flips.&lt;/strong&gt; Don't email every minute the site stays down.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Render a public status page&lt;/strong&gt; at a custom slug.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bill it.&lt;/strong&gt; $9/month for 25 monitors, free up to 5.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the spec. Anything else I considered building, I asked: "would my own indie projects need this?" The answer for incident management, on-call rotations, request tracing, RUM, and Slack threading was: no. So they didn't get built.&lt;/p&gt;

&lt;h2&gt;
  
  
  The stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Next.js 16&lt;/strong&gt; (App Router) on Vercel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supabase&lt;/strong&gt; for Postgres + Auth (Tokyo region — more on this below)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vercel Cron&lt;/strong&gt; runs a single endpoint every minute&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resend&lt;/strong&gt; for alert emails&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stripe&lt;/strong&gt; Checkout + webhooks for billing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's it. No queue, no Redis, no separate worker fleet. The whole backend is one cron endpoint and a handful of Server Actions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 1-minute heartbeat
&lt;/h2&gt;

&lt;p&gt;Vercel Cron sends a GET to &lt;code&gt;/api/cron/check&lt;/code&gt; every minute. A single endpoint handles every monitor on the platform — no per-monitor crons, no fan-out queue.&lt;/p&gt;

&lt;p&gt;The flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cron tick
  → claim_due_monitors (Postgres function, atomic UPDATE over SELECT ... FOR UPDATE SKIP LOCKED)
    → process up to 200 monitors in sequential batches of 25 concurrent probes
      → fetch each URL with AbortController timeout
        → upsert check result + flip status if needed
          → enqueue alert if status transitioned
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Postgres function is the load-bearing piece. It locks rows that are due for a check, bumps their &lt;code&gt;next_check_at&lt;/code&gt;, and returns them in one round-trip. Two cron workers will never claim the same monitor in the same tick, because Postgres handles the contention for me.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- simplified&lt;/span&gt;
&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;claim_due_monitors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_limit&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="k"&gt;setof&lt;/span&gt; &lt;span class="n"&gt;monitors&lt;/span&gt;
&lt;span class="k"&gt;language&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;
&lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;begin&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;
    &lt;span class="k"&gt;update&lt;/span&gt; &lt;span class="n"&gt;monitors&lt;/span&gt;
    &lt;span class="k"&gt;set&lt;/span&gt; &lt;span class="n"&gt;next_check_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;interval_seconds&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="s1"&gt;'1 second'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;monitors&lt;/span&gt;
      &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;next_check_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;
      &lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;next_check_at&lt;/span&gt;
      &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;update&lt;/span&gt; &lt;span class="n"&gt;skip&lt;/span&gt; &lt;span class="n"&gt;locked&lt;/span&gt;
      &lt;span class="k"&gt;limit&lt;/span&gt; &lt;span class="n"&gt;p_limit&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;returning&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;for update skip locked&lt;/code&gt; is the magic. If a second cron worker ever overlaps with the first (it shouldn't here, but the function should be safe if it does), it skips rows that are already claimed instead of waiting for a lock.&lt;/p&gt;
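
&lt;p&gt;The last two steps of the flow above (flip status if needed, alert only on a transition) can be sketched as a small pure function. This is an illustration under assumptions, not the code SitePulse runs: the &lt;code&gt;MonitorState&lt;/code&gt; shape, the &lt;code&gt;applyCheck&lt;/code&gt; name, and the threshold of two consecutive failures are all hypothetical.&lt;/p&gt;

```typescript
// Hypothetical shape and threshold, chosen only for illustration.
type Status = "up" | "down";
type MonitorState = { status: Status; consecutiveFailures: number };

const FAILURES_BEFORE_DOWN = 2; // assumed value: one failed probe is not enough

// Given the stored state and the latest probe result, return the next state
// and whether an alert should be enqueued.
function applyCheck(prev: MonitorState, checkOk: boolean) {
  const consecutiveFailures = checkOk ? 0 : prev.consecutiveFailures + 1;
  const status: Status = checkOk
    ? "up"
    : consecutiveFailures >= FAILURES_BEFORE_DOWN
      ? "down"
      : prev.status; // not enough failures yet: keep the stored status
  // Alert only when the stored status actually changes, in either direction.
  return { next: { status, consecutiveFailures }, alert: status !== prev.status };
}
```

&lt;p&gt;With this shape, a single flaky probe changes nothing, the second consecutive failure flips the monitor to down and alerts once, further failures stay silent, and the first success flips it back to up with a recovery alert.&lt;/p&gt;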

&lt;h2&gt;
  
  
  Probing 25 URLs concurrently in one function
&lt;/h2&gt;

&lt;p&gt;Each tick can hit dozens of URLs. The cron route batches them with &lt;code&gt;Promise.allSettled&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CONCURRENCY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;monitors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;CONCURRENCY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;slice&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;monitors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;CONCURRENCY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;allSettled&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;processMonitor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;admin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// ...tally results, log errors&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The probe itself is just &lt;code&gt;fetch&lt;/code&gt; with three things you must get right:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;controller&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AbortController&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timeout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nx"&gt;monitor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timeout_ms&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;monitor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;monitor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;keyword&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;GET&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;monitor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;redirect&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;follow&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user-agent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SitePulseBot/1.0 (+https://sitepulse.satosushi.co)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cache-control&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;no-cache&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;no-store&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// never let Next cache a probe&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// ALWAYS drain the body, even if you don't need it.&lt;/span&gt;
  &lt;span class="c1"&gt;// Otherwise the socket stays open and the next probe pays connect cost.&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;monitor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;keyword&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arrayBuffer&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// ...record result&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;clearTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three subtle things in there:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cache: "no-store"&lt;/code&gt;&lt;/strong&gt;: depending on the version and route config, Next.js can cache &lt;code&gt;fetch&lt;/code&gt; responses in production (it was the default in Next.js 13 and 14). A cached probe reports the last known state, not the current one, so opt out explicitly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Drain the body&lt;/strong&gt;: if you don't read the response body, the underlying connection can't go back to the pool for reuse until the body is consumed or garbage-collected. Across hundreds of probes per minute, this matters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;AbortController&lt;/code&gt; for timeouts&lt;/strong&gt;: &lt;code&gt;fetch&lt;/code&gt; has no timeout option, so without an abort signal a hung server can stall a probe for far longer than your check interval. Don't let it.&lt;/li&gt;
&lt;/ol&gt;
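&lt;p&gt;As a rough sketch of what the &lt;code&gt;// ...record result&lt;/code&gt; step could look like, here is one way to tell a timeout apart from other failures. The names &lt;code&gt;classifyProbeError&lt;/code&gt;, &lt;code&gt;probeOnce&lt;/code&gt; and the outcome labels are hypothetical and only illustrate the idea; they are not taken from the production code.&lt;/p&gt;

```typescript
// Hypothetical outcome labels for a failed probe (illustrative only).
type ProbeFailure = "timeout" | "network_error";

// Safely read the `name` of whatever was thrown.
function errorName(err: unknown): string {
  if (typeof err !== "object" || err === null) return "";
  const name = (err as { name?: unknown }).name;
  return typeof name === "string" ? name : "";
}

// When controller.abort() fires, fetch rejects with an error named "AbortError".
// Anything else (DNS failure, refused connection, TLS error) counts as a network error.
function classifyProbeError(err: unknown): ProbeFailure {
  return errorName(err) === "AbortError" ? "timeout" : "network_error";
}

// Hypothetical wrapper combining the timeout, body drain and classification.
async function probeOnce(url: string, timeoutMs: number) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, { signal: controller.signal, cache: "no-store" });
    await res.arrayBuffer(); // drain the body so the connection can be reused
    return { ok: res.ok, status: res.status };
  } catch (err) {
    return { ok: false, failure: classifyProbeError(err) };
  } finally {
    clearTimeout(timer);
  }
}
```

&lt;p&gt;Keeping the classification in a small pure function means it can be unit-tested without touching the network.&lt;/p&gt;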

&lt;h2&gt;
  
  
  The 1-second latency I didn't notice for two days
&lt;/h2&gt;

&lt;p&gt;I deployed the first version and a page load felt sluggish. Not broken — just sluggish. Maybe 800ms-1.2s for the dashboard to render.&lt;/p&gt;

&lt;p&gt;Vercel Functions default to &lt;code&gt;iad1&lt;/code&gt; (Washington DC). My Supabase project is in Tokyo. Every Server Component that hit the database was making a US-east → Tokyo → US-east round trip per query. With 3-4 sequential queries per page render, that's up to a second of pure network time sitting between the user and the page. The fix goes in &lt;code&gt;vercel.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"regions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"hnd1"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One line. Pinning functions to Tokyo (&lt;code&gt;hnd1&lt;/code&gt;) drops Server Component render time to under 100ms. The lesson generalises: &lt;strong&gt;always colocate compute with your primary data store&lt;/strong&gt;, especially for Server Components, where every render is a synchronous database conversation.&lt;/p&gt;
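&lt;p&gt;The arithmetic behind that is simple enough to sketch. The function name and the round-trip figures below are assumptions chosen for illustration, not measurements from this project.&lt;/p&gt;

```typescript
// Rough model: each sequential query costs one full round trip between
// the function's region and the database's region.
function sequentialQueryLatencyMs(queryCount: number, roundTripMs: number): number {
  return queryCount * roundTripMs;
}

// Illustrative figures only (assumed, not measured).
const crossRegion = sequentialQueryLatencyMs(4, 250); // 1000 ms
const sameRegion = sequentialQueryLatencyMs(4, 5); // 20 ms

console.log({ crossRegion, sameRegion });
```

&lt;p&gt;The per-query cost is multiplied by the number of queries, so whatever the real round-trip time is, moving compute next to the database shrinks every term at once.&lt;/p&gt;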

&lt;h2&gt;
  
  
  The Server Component cookie crash
&lt;/h2&gt;

&lt;p&gt;Next.js 16's App Router gives Server Components access to cookies. Supabase's &lt;code&gt;createServerClient&lt;/code&gt; wants a &lt;code&gt;setAll&lt;/code&gt; callback so it can refresh tokens.&lt;/p&gt;

&lt;p&gt;But Server Components are read-only: cookies can only be written in a Server Action or Route Handler, not during a render. If a token refresh happens during a Server Component pass, &lt;code&gt;setAll&lt;/code&gt; throws, and in production the entire page returns a 500 with that lovely React digest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server Components render
 → Supabase tries to refresh expired token
 → setAll attempts to write cookies
 → ERR_HTTP_HEADERS_SENT-style error
 → 500 with digest 972974443
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fix is one try/catch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;getAll&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;cookieStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getAll&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="nf"&gt;setAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cookiesToSet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;cookiesToSet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(({&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
          &lt;span class="nx"&gt;cookieStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Server Component context — token will be refreshed on next request.&lt;/span&gt;
        &lt;span class="c1"&gt;// Safe to swallow.&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



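&lt;p&gt;The Supabase docs hint at this, but the examples I copied didn't have the try/catch. Swallowing the error is only safe if something that is allowed to write cookies, such as middleware or a Route Handler, refreshes the session on a later request. If your Supabase + Next.js 16 app sometimes 500s for logged-in users after a token expiry, this is probably why.&lt;/p&gt;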

&lt;h2&gt;
  
  
  The trailing whitespace bug that ate three hours
&lt;/h2&gt;

&lt;p&gt;I copied my &lt;code&gt;STRIPE_WEBHOOK_SECRET&lt;/code&gt; from the Stripe CLI output into Vercel's env var UI. Webhooks were rejected in production. Local was fine.&lt;/p&gt;

&lt;p&gt;The secret can pick up a trailing newline when you copy it from a terminal. Vercel stores it verbatim, including the newline. The HMAC your handler computes with that secret then never matches the signature in the &lt;code&gt;Stripe-Signature&lt;/code&gt; header, verification fails, and you get a 400 in your logs with no obvious cause.&lt;/p&gt;

&lt;p&gt;The fix is to never trust your clipboard:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$STRIPE_WEBHOOK_SECRET&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'\n\r\t '&lt;/span&gt; | vercel &lt;span class="nb"&gt;env &lt;/span&gt;add STRIPE_WEBHOOK_SECRET production
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same gotcha applies to any secret that ends up in a header or a signature: API tokens, basic auth credentials, JWT signing keys. If signature verification fails in prod but works locally, check whitespace before checking anything else.&lt;/p&gt;
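&lt;p&gt;To see why a single newline is fatal, here is a minimal sketch using Node's &lt;code&gt;crypto&lt;/code&gt; module. It follows Stripe's documented v1 scheme (HMAC-SHA256 over &lt;code&gt;timestamp.payload&lt;/code&gt;, keyed with the signing secret). The secret, payload, timestamp and the &lt;code&gt;sign&lt;/code&gt; helper are made-up placeholders, not real values or Stripe library calls.&lt;/p&gt;

```typescript
import { createHmac } from "node:crypto";

// HMAC-SHA256 over "timestamp.payload", hex-encoded, keyed with the signing secret.
function sign(secret: string, timestamp: number, payload: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${payload}`).digest("hex");
}

const cleanSecret = "whsec_example_only"; // hypothetical placeholder, not a real secret
const pastedSecret = cleanSecret + "\n"; // what a copy from the terminal can produce

const payload = '{"id":"evt_example"}'; // hypothetical event body
const timestamp = 1700000000;

const expected = sign(cleanSecret, timestamp, payload);
const withNewline = sign(pastedSecret, timestamp, payload);
const trimmed = sign(pastedSecret.trim(), timestamp, payload);

console.log(expected === withNewline); // false: one extra byte in the key changes the whole digest
console.log(expected === trimmed); // true: trimming restores the match
```

&lt;p&gt;A cheap belt-and-braces option is to call &lt;code&gt;.trim()&lt;/code&gt; on the value when you read it from &lt;code&gt;process.env&lt;/code&gt;, so a bad paste can't break verification even if it slips through.&lt;/p&gt;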

&lt;h2&gt;
  
  
  What I deliberately didn't build
&lt;/h2&gt;

&lt;p&gt;The list of things people expect from a "real" uptime monitor that I left out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Logs / RUM / transaction monitoring.&lt;/strong&gt; That's what Sentry and Logflare are for.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-region probing.&lt;/strong&gt; I check from one region. If Cloudflare is down in São Paulo but your site is still reachable from my probe region, the probe stays green, and you and I will both find out at the same time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On-call schedules / rotations.&lt;/strong&gt; Indie devs are a one-person rotation. If I'm asleep, the alert waits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slack / Discord / PagerDuty / OpsGenie.&lt;/strong&gt; Email + SMS covers the people I'm building for.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;5-second checks.&lt;/strong&gt; 1-minute is enough for indie projects. Sub-minute is genuinely expensive to do reliably.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cutting features wasn't a sacrifice — it was the product. The competitors I respect (UptimeRobot, BetterStack, Pingdom) all do most of what I left out, and that's exactly why their pricing pages have four columns and a "contact sales" button.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned about flat pricing
&lt;/h2&gt;

&lt;p&gt;The most-discussed part of this product hasn't been the technical stack — it's been the price.&lt;/p&gt;

&lt;p&gt;$9/month for 25 monitors. No per-seat. No per-region. No per-channel. Free up to 5 monitors.&lt;/p&gt;

&lt;p&gt;The reasoning: when I'm picking a tool for a side project, I don't have time to evaluate three pricing tiers and figure out which one I'd grow into. I want one number. Flat pricing forces the product to do less, which forces me to make better tradeoffs about what to build.&lt;/p&gt;

&lt;p&gt;It also means the product can never become "enterprise." That's fine. There are already excellent enterprise uptime monitors. There aren't enough good ones for indie devs.&lt;/p&gt;

&lt;h2&gt;
  
  
  The link
&lt;/h2&gt;

&lt;p&gt;The product is live at &lt;strong&gt;&lt;a href="https://sitepulse.satosushi.co" rel="noopener noreferrer"&gt;sitepulse.satosushi.co&lt;/a&gt;&lt;/strong&gt; — 5 monitors free, $9/mo for 25, no card to start. If you've been on UptimeRobot and want to see how it stacks up, I wrote a &lt;a href="https://sitepulse.satosushi.co/vs/uptimerobot" rel="noopener noreferrer"&gt;side-by-side comparison&lt;/a&gt; too.&lt;/p&gt;

&lt;p&gt;Not open source (that's the business), but happy to answer architecture questions in the comments. The cron-claim-and-fan-out pattern in particular has been more reliable than any queue I've shipped, and I think it generalises to a lot of "do this thing every N seconds for M users" problems where you'd otherwise reach for SQS.&lt;/p&gt;

</description>
      <category>nextjs</category>
      <category>webdev</category>
      <category>vercel</category>
      <category>indiehackers</category>
    </item>
  </channel>
</rss>
