<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Aryan Tiwari</title>
    <description>The latest articles on Forem by Aryan Tiwari (@yetanotheraryan).</description>
    <link>https://forem.com/yetanotheraryan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3856203%2Fde26fc83-fd3c-4486-9452-eeac4016cc08.jpg</url>
      <title>Forem: Aryan Tiwari</title>
      <link>https://forem.com/yetanotheraryan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/yetanotheraryan"/>
    <language>en</language>
    <item>
      <title>Your Node.js app is slow to start. You just don't know which module to blame.</title>
      <dc:creator>Aryan Tiwari</dc:creator>
      <pubDate>Wed, 01 Apr 2026 18:56:43 +0000</pubDate>
      <link>https://forem.com/yetanotheraryan/your-nodejs-app-is-slow-to-start-you-just-dont-know-which-module-to-blame-2le0</link>
      <guid>https://forem.com/yetanotheraryan/your-nodejs-app-is-slow-to-start-you-just-dont-know-which-module-to-blame-2le0</guid>
      <description>&lt;p&gt;Last month I was debugging a startup regression at work. Our Node.js service went from ~300ms boot to nearly 900ms overnight. No new features. No infra changes. Just a routine dependency bump.&lt;/p&gt;

&lt;p&gt;The usual approach? Comment out requires one by one. Bisect &lt;code&gt;package.json&lt;/code&gt;. Stare at &lt;code&gt;--cpu-prof&lt;/code&gt; output and pretend to understand V8 internals.&lt;/p&gt;

&lt;p&gt;I wanted something simpler: run one command, see which module is eating my startup time, and know if the cost is in the module itself or in everything it drags in.&lt;/p&gt;

&lt;p&gt;So I built &lt;code&gt;coldstart&lt;/code&gt; — a zero-dependency startup profiler for Node.js that instruments &lt;code&gt;Module._load&lt;/code&gt;, reconstructs the dependency tree, and shows you exactly where boot time goes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full transparency:&lt;/strong&gt; I used Claude pretty heavily while building this — for scaffolding the ESM loader hooks, generating the flamegraph HTML template, and iterating on the tree rendering logic. The core idea (patching &lt;code&gt;Module._load&lt;/code&gt; with &lt;code&gt;performance.now()&lt;/code&gt; bookends) and the architecture were mine, but AI absolutely accelerated the implementation. I think that's just how a lot of solo open source gets built now, and I'd rather be upfront about it.&lt;/p&gt;

&lt;h2&gt;The problem in 30 seconds&lt;/h2&gt;

&lt;p&gt;Node.js doesn't tell you &lt;em&gt;why&lt;/em&gt; startup is slow. You get one number — total boot time — and zero breakdown.&lt;/p&gt;

&lt;p&gt;Meanwhile:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A single &lt;code&gt;require('sequelize')&lt;/code&gt; can silently add 400ms&lt;/li&gt;
&lt;li&gt;Transitive dependencies pile up — you &lt;code&gt;require&lt;/code&gt; one thing, Node loads 300 modules&lt;/li&gt;
&lt;li&gt;Synchronous work in module scope (reading files, compiling templates, connecting to DBs) blocks the event loop before your app even starts&lt;/li&gt;
&lt;li&gt;Cached modules still add edges to the dependency graph, obscuring the real bottlenecks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This matters more than ever. If you're running on Lambda (where &lt;a href="https://speedrun.nobackspacecrew.com/blog/2025/07/21/the-fastest-node-22-lambda-coldstart-configuration.html" rel="noopener noreferrer"&gt;cold starts are now billed&lt;/a&gt;), on serverless platforms, or in containers that scale from zero — startup time is latency your users feel on the first request.&lt;/p&gt;

&lt;h2&gt;What coldstart actually does&lt;/h2&gt;

&lt;p&gt;Run it against any Node app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @yetanotheraryan/coldstart server.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You get this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;coldstart — 847ms total startup

  ┌─ express          234ms  ████████████░░░░░░░░
  │  ├─ body-parser    89ms  █████░░░░░░░░░░░░░░░
  │  ├─ qs             12ms  █░░░░░░░░░░░░░░░░░░░
  │  └─ path-to-regex   8ms  ░░░░░░░░░░░░░░░░░░░░
  ├─ sequelize        401ms  █████████████████████  ⚠ slow
  │  ├─ pg            203ms  ███████████░░░░░░░░░
  │  └─ lodash         98ms  █████░░░░░░░░░░░░░░░
  └─ dotenv             4ms  ░░░░░░░░░░░░░░░░░░░░

event loop max 42ms, p99 17ms, mean 4.3ms
modules 312 total, 59 cached
time split 286ms first-party, 503ms node_modules
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tree shows parent → child load relationships with inclusive timing (how long the whole subtree took) and bar charts colored by severity. At a glance you can see: sequelize is the problem, and within sequelize, it's &lt;code&gt;pg&lt;/code&gt; and &lt;code&gt;lodash&lt;/code&gt; doing the heavy lifting.&lt;/p&gt;

&lt;h2&gt;How it works under the hood&lt;/h2&gt;

&lt;p&gt;The core technique is straightforward — &lt;code&gt;coldstart&lt;/code&gt; monkey-patches &lt;code&gt;Module._load&lt;/code&gt; (the internal function Node calls for every &lt;code&gt;require()&lt;/code&gt;):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Before the original &lt;code&gt;_load&lt;/code&gt; runs, record &lt;code&gt;performance.now()&lt;/code&gt; and the parent module&lt;/li&gt;
&lt;li&gt;Let Node do its thing — resolve, compile, execute&lt;/li&gt;
&lt;li&gt;After &lt;code&gt;_load&lt;/code&gt; returns, record the end time&lt;/li&gt;
&lt;li&gt;Store the raw event: &lt;code&gt;{ request, resolvedPath, parentPath, startMs, endMs, cached }&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For ESM, it uses Node's &lt;code&gt;module.register()&lt;/code&gt; loader hooks (available in Node 18.19+) to capture &lt;code&gt;resolve&lt;/code&gt; and &lt;code&gt;load&lt;/code&gt; events, bridging timing data back to the main tracer through a message channel.&lt;/p&gt;
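&lt;p&gt;A minimal version of such a &lt;code&gt;load&lt;/code&gt; hook might look like this (illustrative, not coldstart's actual source; the real hooks also cover &lt;code&gt;resolve&lt;/code&gt; and error paths):&lt;/p&gt;

```javascript
// hooks.mjs — sketch of an ESM load-timing hook, registered via
// module.register(). Illustrative only.
import { performance } from 'node:perf_hooks';

let port; // MessagePort handed over from the main thread

export function initialize(data) {
  port = data.port;
}

export async function load(url, context, nextLoad) {
  const startMs = performance.now();
  const result = await nextLoad(url, context); // let Node fetch and compile
  if (port) port.postMessage({ url, startMs, endMs: performance.now() });
  return result;
}
```

&lt;p&gt;On the main-thread side you create a &lt;code&gt;MessageChannel&lt;/code&gt;, pass one port through &lt;code&gt;module.register('./hooks.mjs', import.meta.url, { data: { port }, transferList: [port] })&lt;/code&gt;, and collect the timing messages on the other port.&lt;/p&gt;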

&lt;p&gt;After your app finishes starting up, the tracer takes all those raw events and builds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A tree&lt;/strong&gt; — the actual parent → child dependency graph as loaded at runtime&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inclusive time&lt;/strong&gt; — total wall-clock time for a module and everything it pulled in&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exclusive time&lt;/strong&gt; — just the module's own initialization cost, minus children&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event loop stats&lt;/strong&gt; — max, mean, p99 blocking during startup using &lt;code&gt;perf_hooks&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A split&lt;/strong&gt; — how much time was first-party code vs &lt;code&gt;node_modules&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The distinction between inclusive and exclusive is key. A module with high inclusive but low exclusive time is just a gateway — it pulls in heavy children but isn't slow itself. High exclusive time means &lt;em&gt;that specific module&lt;/em&gt; is doing expensive work at load time.&lt;/p&gt;
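&lt;p&gt;Concretely, both numbers fall out of two passes over the raw events. A sketch (illustrative, not coldstart's actual implementation):&lt;/p&gt;

```javascript
// Sketch: deriving inclusive vs exclusive time from raw load events
// (illustrative). Each event: { resolvedPath, parentPath, startMs, endMs }
function computeTimes(events) {
  const byPath = new Map();

  // Pass 1: inclusive time is the event's own wall-clock span.
  for (const e of events) {
    const node = byPath.get(e.resolvedPath) || { inclusive: 0, childTime: 0 };
    node.inclusive += e.endMs - e.startMs;
    byPath.set(e.resolvedPath, node);
  }

  // Pass 2: charge each event's span to its parent as child time.
  for (const e of events) {
    const parent = byPath.get(e.parentPath);
    if (parent) parent.childTime += e.endMs - e.startMs;
  }

  // Exclusive = own span minus time spent loading children.
  const result = {};
  for (const [path, n] of byPath) {
    result[path] = { inclusive: n.inclusive, exclusive: n.inclusive - n.childTime };
  }
  return result;
}
```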

&lt;h2&gt;Three ways to use it&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;CLI&lt;/strong&gt; (easiest — profiles any app):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;coldstart server.js
coldstart &lt;span class="nt"&gt;--json&lt;/span&gt; server.js          &lt;span class="c"&gt;# machine-readable output&lt;/span&gt;
coldstart &lt;span class="nt"&gt;--&lt;/span&gt; node &lt;span class="nt"&gt;--inspect&lt;/span&gt; app.js  &lt;span class="c"&gt;# pass node flags through&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Programmatic API&lt;/strong&gt; (embed in your own tooling):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;monitor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;renderTextReport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@yetanotheraryan/coldstart&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;monitor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./bootstrap&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./server&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;renderTextReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;done&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Preload mode&lt;/strong&gt; (zero code changes):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node &lt;span class="nt"&gt;--require&lt;/span&gt; @yetanotheraryan/coldstart/register server.js
&lt;span class="c"&gt;# or for ESM:&lt;/span&gt;
node &lt;span class="nt"&gt;--import&lt;/span&gt; @yetanotheraryan/coldstart/register server.mjs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There's also a &lt;code&gt;renderFlamegraphHtml()&lt;/code&gt; export that generates a self-contained HTML flamegraph you can open in a browser — useful for sharing with your team or dropping into a PR description.&lt;/p&gt;

&lt;h2&gt;What I actually found at work&lt;/h2&gt;

&lt;p&gt;After running &lt;code&gt;coldstart&lt;/code&gt; on our service, the culprit was obvious in under a second: a transitive dependency three levels deep was doing synchronous file I/O at module scope to read a config file. The dependency bump had changed its initialization path.&lt;/p&gt;

&lt;p&gt;The fix was a one-line lazy &lt;code&gt;require()&lt;/code&gt; that moved the load out of the critical startup path. Boot time went back to ~320ms.&lt;/p&gt;
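&lt;p&gt;The pattern, with &lt;code&gt;zlib&lt;/code&gt; as a stand-in for the real (elided) dependency:&lt;/p&gt;

```javascript
// Before: the cost is paid at boot, whether or not the code path is ever hit
// const zlib = require('zlib');

// After: the cost is paid on first use instead
// (module name is a stand-in, not the actual dependency from the story)
let zlib;
function getZlib() {
  if (!zlib) zlib = require('zlib');
  return zlib;
}
```

&lt;p&gt;Node caches the module after the first call, so repeat calls are effectively free.&lt;/p&gt;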

&lt;p&gt;Without the tree view, I'd have been bisecting for an hour.&lt;/p&gt;

&lt;h2&gt;Why not just use &lt;code&gt;--cpu-prof&lt;/code&gt;?&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;--cpu-prof&lt;/code&gt; is great for understanding &lt;em&gt;what code is running&lt;/em&gt;, but it doesn't answer &lt;em&gt;which module load is slow&lt;/em&gt; or &lt;em&gt;what's the dependency chain that got us here&lt;/em&gt;. You get a flamegraph of V8 internals and function calls, not a map of your &lt;code&gt;require()&lt;/code&gt; tree with timing.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;coldstart&lt;/code&gt; is deliberately higher-level. It answers "which npm package is making my startup slow?" — not "which V8 builtin is hot."&lt;/p&gt;

&lt;p&gt;They're complementary. Use &lt;code&gt;coldstart&lt;/code&gt; to find the slow module, then &lt;code&gt;--cpu-prof&lt;/code&gt; if you need to understand &lt;em&gt;why&lt;/em&gt; that module is slow.&lt;/p&gt;

&lt;h2&gt;Current status &amp;amp; what's missing&lt;/h2&gt;

&lt;p&gt;Working today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CommonJS profiling&lt;/li&gt;
&lt;li&gt;ESM profiling (Node 18.19+)&lt;/li&gt;
&lt;li&gt;CLI, programmatic API, preload mode&lt;/li&gt;
&lt;li&gt;Text report, JSON report, HTML flamegraph&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not yet implemented:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dynamic &lt;code&gt;import()&lt;/code&gt; tracing&lt;/li&gt;
&lt;li&gt;Watch mode for iterating on startup optimizations&lt;/li&gt;
&lt;li&gt;CI integration (fail if startup exceeds a threshold)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's early. The API is stable enough for everyday use, but I'm still iterating on the output format and considering a few features based on what people actually need.&lt;/p&gt;

&lt;h2&gt;Try it&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @yetanotheraryan/coldstart
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or just run it once with npx:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @yetanotheraryan/coldstart your-app.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/yetanotheraryan/coldstart" rel="noopener noreferrer"&gt;github.com/yetanotheraryan/coldstart&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If this is useful to you, a star on the repo genuinely helps with discoverability. And if you run it on your app and find something interesting — I'd love to hear about it in the comments. What was your slowest module?&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Aryan — I build open source tools for Node.js on the side. You can find my other projects on &lt;a href="https://github.com/yetanotheraryan" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>javascript</category>
      <category>opensource</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
