<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Michael Stelly</title>
    <description>The latest articles on Forem by Michael Stelly (@refactory).</description>
    <link>https://forem.com/refactory</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1131803%2F5b76d7eb-5a3b-4862-a032-46fc56ef84f3.jpg</url>
      <title>Forem: Michael Stelly</title>
      <link>https://forem.com/refactory</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/refactory"/>
    <language>en</language>
    <item>
      <title>I Ran Three LLMs Entirely in the Browser to Power an AI Coaching Feature. Here's What I Measured.</title>
      <dc:creator>Michael Stelly</dc:creator>
      <pubDate>Wed, 08 Apr 2026 19:10:25 +0000</pubDate>
      <link>https://forem.com/refactory/i-ran-three-llms-entirely-in-the-browser-to-power-an-ai-coaching-feature-heres-what-i-measured-9jm</link>
      <guid>https://forem.com/refactory/i-ran-three-llms-entirely-in-the-browser-to-power-an-ai-coaching-feature-heres-what-i-measured-9jm</guid>
      <description>&lt;p&gt;I'm building &lt;a href="https://holocronparse.com" rel="noopener noreferrer"&gt;Holocron&lt;/a&gt;, a browser-based combat log analyzer for the Star Wars: The Old Republic (SWTOR) video game. The core product thesis is that parsers that stop at showing you numbers aren't useful enough. A good tool tells you &lt;em&gt;what to do about them&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The coaching layer I'm building takes ~1500 tokens of structured combat stats (spec, abilities, DPS numbers, rule-based findings) and returns ~500 tokens of plain-language guidance. It runs after parsing, entirely client-side. No server. No account. No data leaving the browser.&lt;/p&gt;

&lt;p&gt;I already had Ollama working as a local LLM provider. But Ollama requires the user to install a background service, pull a model, and make sure it's running. For a tool where frictionless entry is a design constraint, that's a real drop-off risk. So I ran a spike to find out whether &lt;code&gt;@mlc-ai/web-llm&lt;/code&gt; with WebGPU could replace that setup entirely: just open the page, wait under 30 seconds on first visit (measured 23.7s on the test hardware), and get AI coaching with zero install.&lt;/p&gt;

&lt;p&gt;This post covers the full methodology, every number I measured, and the implementation decisions I made based on the results.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Output Contract
&lt;/h2&gt;

&lt;p&gt;Before getting into models and benchmarks, it helps to understand exactly what I needed the LLM to produce. The coaching system has a strict output schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;CoachingOutput&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;narrativeSummary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// 2-3 sentence performance narrative&lt;/span&gt;
  &lt;span class="nl"&gt;additionalFindings&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;               &lt;span class="c1"&gt;// 1-3&lt;/span&gt;
    &lt;span class="nl"&gt;headline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;recommendation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                               &lt;span class="c1"&gt;// max 3 items&lt;/span&gt;
  &lt;span class="nl"&gt;additionalPositives&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;    &lt;span class="c1"&gt;// max 3 plain strings&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The schema is intentionally flat and bounded. &lt;code&gt;additionalPositives&lt;/code&gt; is an array of strings, not objects. This matters. A lot. I'll come back to it.&lt;/p&gt;

&lt;p&gt;Production validation rejects anything that doesn't conform. There's no "close enough" here.&lt;/p&gt;
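&lt;p&gt;A minimal sketch of what a conformance check along these lines could look like. This is an illustration, not Holocron's production validator: &lt;code&gt;validateCoachingOutput&lt;/code&gt; and &lt;code&gt;isDict&lt;/code&gt; are hypothetical names, and the rules are read straight off the interface comments above (priority 1-3, at most 3 findings, at most 3 plain-string positives).&lt;/p&gt;

```typescript
// Hypothetical validator for the CoachingOutput contract (names are illustrative).
type Dict = { [key: string]: unknown };

function isDict(value: unknown): value is Dict {
  if (typeof value !== "object") return false;
  if (value === null) return false;
  return !Array.isArray(value);
}

// Returns a list of problems; an empty list means the payload conforms.
function validateCoachingOutput(raw: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return ["not valid JSON"];
  }
  if (!isDict(parsed)) return ["top level must be an object"];

  const problems: string[] = [];
  if (typeof parsed.narrativeSummary !== "string") {
    problems.push("narrativeSummary must be a string");
  }

  const findings = parsed.additionalFindings;
  if (!Array.isArray(findings) || findings.length > 3) {
    problems.push("additionalFindings must be an array of at most 3 items");
  } else {
    findings.forEach((item: unknown, index: number) => {
      if (!isDict(item)) {
        problems.push(`additionalFindings[${index}] must be an object`);
        return;
      }
      const priority = item.priority;
      if (
        typeof priority !== "number" ||
        Math.floor(priority) !== priority ||
        priority > 3 ||
        1 > priority
      ) {
        problems.push(`additionalFindings[${index}].priority must be an integer 1-3`);
      }
      for (const key of ["headline", "body", "recommendation"]) {
        if (typeof item[key] !== "string") {
          problems.push(`additionalFindings[${index}].${key} must be a string`);
        }
      }
    });
  }

  const positives = parsed.additionalPositives;
  if (!Array.isArray(positives) || positives.length > 3) {
    problems.push("additionalPositives must be an array of at most 3 items");
  } else if (!positives.every((entry: unknown) => typeof entry === "string")) {
    problems.push("additionalPositives must contain plain strings, not objects");
  }
  return problems;
}
```

&lt;p&gt;Returning a list of problems instead of a boolean is a design choice for this sketch: it makes it easy to log exactly which field a model got wrong.&lt;/p&gt;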




&lt;h2&gt;
  
  
  Why WebLLM
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/mlc-ai/web-llm" rel="noopener noreferrer"&gt;WebLLM&lt;/a&gt; is an in-browser LLM inference engine built by the MLC AI team. It compiles models into a WebGPU-accelerated WASM runtime, ships a prebuilt model library hosted on HuggingFace, and exposes an OpenAI-compatible API. You load a model with &lt;code&gt;CreateMLCEngine()&lt;/code&gt;, then call &lt;code&gt;engine.chat.completions.create()&lt;/code&gt; exactly like you would with the OpenAI SDK.&lt;/p&gt;

&lt;p&gt;The two features that made it worth spiking:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Grammar-constrained generation.&lt;/strong&gt; WebLLM supports &lt;code&gt;response_format: { type: 'json_object', schema: ... }&lt;/code&gt;, implemented at the WASM layer. This isn't prompt engineering hoping the model behaves. It enforces the schema at the token sampling level. The model literally cannot produce output that violates the schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OPFS caching.&lt;/strong&gt; Model weights are cached to the Origin Private File System after the first download. A 1.3 GB model that takes 23.7 seconds to load cold takes 2.3 seconds warm. Repeat users skip the download entirely.&lt;/p&gt;
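&lt;p&gt;To make the grammar-constrained request concrete, here is a sketch of the object that would be handed to &lt;code&gt;engine.chat.completions.create()&lt;/code&gt;. The helper name &lt;code&gt;buildCoachingRequest&lt;/code&gt; and the exact schema are assumptions for illustration; the one detail taken from WebLLM itself is that &lt;code&gt;response_format.schema&lt;/code&gt; is passed as a JSON string. Length bounds (max 3 items) are left out of the schema here and assumed to be checked by validation afterwards.&lt;/p&gt;

```typescript
// JSON Schema mirroring the CoachingOutput interface (illustrative, not production).
const coachingSchema = {
  type: "object",
  properties: {
    narrativeSummary: { type: "string" },
    additionalFindings: {
      type: "array",
      items: {
        type: "object",
        properties: {
          priority: { type: "integer" },
          headline: { type: "string" },
          body: { type: "string" },
          recommendation: { type: "string" },
        },
        required: ["priority", "headline", "body", "recommendation"],
      },
    },
    // Plain strings, not objects: the field the unconstrained baseline got wrong.
    additionalPositives: { type: "array", items: { type: "string" } },
  },
  required: ["narrativeSummary", "additionalFindings", "additionalPositives"],
};

// Hypothetical helper: builds the request object for engine.chat.completions.create().
function buildCoachingRequest(systemPrompt: string, statsPrompt: string) {
  return {
    messages: [
      { role: "system" as const, content: systemPrompt },
      { role: "user" as const, content: statsPrompt },
    ],
    response_format: {
      type: "json_object" as const,
      schema: JSON.stringify(coachingSchema), // WebLLM expects a string here
    },
  };
}
```

&lt;p&gt;Keeping the schema as a plain object and stringifying it at the call site means the same object can be reused by any other provider that accepts a JSON Schema.&lt;/p&gt;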




&lt;h2&gt;
  
  
  Test Setup
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hardware:&lt;/strong&gt; Apple Silicon Mac &lt;em&gt;(Apple M3 Max, 64 GB unified memory)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Browser:&lt;/strong&gt; Chrome (WebGPU enabled)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WebLLM version:&lt;/strong&gt; 0.2.82&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmark:&lt;/strong&gt; 10 coaching prompts per model using the production SPF prompt structure (~1,500-token input, targeting ~500-token output)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quality scoring:&lt;/strong&gt; Automated 6-signal composite (0-100 scale), equally weighted: (1) narrative depth — word count and sentence structure of &lt;code&gt;narrativeSummary&lt;/code&gt;; (2) schema compliance — all required fields present and correctly typed; (3) template parroting — prompt text appearing verbatim in output; (4) ability name accuracy — capitalized phrases cross-referenced against ability names present in the input; (5) finding duplication — semantic overlap across &lt;code&gt;additionalFindings&lt;/code&gt; items; (6) actionability — presence of concrete, imperative language in &lt;code&gt;recommendation&lt;/code&gt; fields. Template parrot and hallucination counts in the side-by-side table are raw per-prompt tallies, not components of the composite score.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I tested three models, chosen to cover the quality/size/speed tradeoff space:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Llama-3.2-1B-Instruct-q4f16_1-MLC&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~0.7 GB&lt;/td&gt;
&lt;td&gt;Smallest viable instruct model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Llama-3.2-3B-Instruct-q4f16_1-MLC&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~1.3 GB&lt;/td&gt;
&lt;td&gt;Sweet spot candidate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Phi-3.5-mini-instruct-q4f16_1-MLC&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~2.0 GB&lt;/td&gt;
&lt;td&gt;Quality ceiling for this size class&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I also ran the same 10 prompts against plain Ollama (no grammar enforcement) as a baseline for each model. That comparison turned out to be the most interesting part of the whole exercise.&lt;/p&gt;




&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Llama 3.2 3B
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Target&lt;/th&gt;
&lt;th&gt;Verdict&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Download size&lt;/td&gt;
&lt;td&gt;~1.3 GB&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Acceptable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cold load time&lt;/td&gt;
&lt;td&gt;23.7s&lt;/td&gt;
&lt;td&gt;≤ 30s&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warm load time&lt;/td&gt;
&lt;td&gt;2.3s&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tokens/sec&lt;/td&gt;
&lt;td&gt;49.8&lt;/td&gt;
&lt;td&gt;≥ 10&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU memory&lt;/td&gt;
&lt;td&gt;~3.0 GB&lt;/td&gt;
&lt;td&gt;≤ 2 GB&lt;/td&gt;
&lt;td&gt;FLAG&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSON parse success&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;td&gt;≥ 9/10&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema valid&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;td&gt;≥ 9/10&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avg quality score&lt;/td&gt;
&lt;td&gt;76/100&lt;/td&gt;
&lt;td&gt;subjective&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avg latency/prompt&lt;/td&gt;
&lt;td&gt;5.8s&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Acceptable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Content quality: PASS&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Strengths: substantive narratives (avg 20+ words), zero template parroting, all findings have real body and recommendation text, references specific numbers from input in 8/10 prompts.&lt;/p&gt;

&lt;p&gt;Weaknesses: hallucinated ability names in 9/10 prompts (2-4 per prompt), duplicated findings in 4 of 10 prompts, VRAM at ~3 GB exceeds the 2 GB threshold.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;GPU memory is measured at the browser process level and includes driver and WebGPU runtime overhead beyond model weights. The 3B weights alone are ~1.6 GB at 4-bit quantization; the remainder is KV cache at 1500 token context plus browser overhead. Numbers will vary across machines and Chrome versions. The 2 GB threshold assumes a minimum-spec user running SWTOR on a machine with 8 GB unified memory: the game typically holds 3–4 GB GPU memory under load, leaving roughly 4 to 5 GB for the operating system, the browser, and the coaching model combined. Anything above 2 GB for the coaching model narrows that margin on older hardware.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ollama baseline comparison:&lt;/strong&gt; Plain Ollama 3B is 10/10 JSON valid but only &lt;strong&gt;1/10 schema valid&lt;/strong&gt;. The model consistently emits &lt;code&gt;additionalPositives&lt;/code&gt; as objects with &lt;code&gt;headline&lt;/code&gt;/&lt;code&gt;body&lt;/code&gt;/&lt;code&gt;recommendation&lt;/code&gt; fields instead of plain strings. This is a silent breaking failure. Grammar-constrained WebLLM generation is 10/10 schema valid under identical prompts. Content quality: Ollama 73/100 vs WebLLM 76/100 -- no degradation from running in-browser.&lt;/p&gt;




&lt;h3&gt;
  
  
  Llama 3.2 1B
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Target&lt;/th&gt;
&lt;th&gt;Verdict&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Download size&lt;/td&gt;
&lt;td&gt;~0.7 GB&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cold load time&lt;/td&gt;
&lt;td&gt;11.7s&lt;/td&gt;
&lt;td&gt;≤ 30s&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warm load time&lt;/td&gt;
&lt;td&gt;1.4s&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tokens/sec&lt;/td&gt;
&lt;td&gt;118.6&lt;/td&gt;
&lt;td&gt;≥ 10&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU memory&lt;/td&gt;
&lt;td&gt;~1.1 GB&lt;/td&gt;
&lt;td&gt;≤ 2 GB&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSON parse success&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;td&gt;≥ 9/10&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema valid&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;td&gt;≥ 9/10&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avg quality score&lt;/td&gt;
&lt;td&gt;70/100&lt;/td&gt;
&lt;td&gt;subjective&lt;/td&gt;
&lt;td&gt;Marginal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avg latency/prompt&lt;/td&gt;
&lt;td&gt;2.1s&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Content quality: MARGINAL FAIL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The numbers look great. 118.6 tok/s. 11.7s cold load. 1.4s warm. 0.7 GB download. Under the hood it falls apart.&lt;/p&gt;

&lt;p&gt;Template parroting in 7/10 prompts -- the model echoes prompt text like "Things the player did well that the rule engine missed" verbatim in the output. Prompt 9 returned all three &lt;code&gt;additionalPositives&lt;/code&gt; as identical copies of that string. Individual prompt scores ranged from 46 to 96. About 30% of runs would produce output that embarrasses the product. Speed doesn't offset that.&lt;/p&gt;
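&lt;p&gt;A verbatim-echo check of this kind is easy to sketch. The function below is an assumption about how such a detector could work, not the scoring code used for these numbers: &lt;code&gt;findParrotedPhrases&lt;/code&gt; is a hypothetical name, and it simply looks for instruction phrases that reappear inside any output string after normalizing case and whitespace.&lt;/p&gt;

```typescript
// Hypothetical parroting check: which instruction phrases show up verbatim in the output?
function findParrotedPhrases(instructionPhrases: string[], outputStrings: string[]): string[] {
  // Lowercase and collapse whitespace so trivial formatting differences do not hide an echo.
  const normalize = (text: string) => text.toLowerCase().replace(/\s+/g, " ").trim();
  const outputs = outputStrings.map(normalize);
  return instructionPhrases.filter((phrase) => {
    const needle = normalize(phrase);
    return outputs.some((text) => text.includes(needle));
  });
}

const echoed = findParrotedPhrases(
  ["Things the player did well that the rule engine missed"],
  ["Things the player did well that the rule engine missed", "Strong opener execution"],
);
console.log(echoed.length); // 1
```

&lt;p&gt;Run against each &lt;code&gt;additionalPositives&lt;/code&gt; entry, a non-empty result marks the prompt as parroted.&lt;/p&gt;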

&lt;p&gt;&lt;strong&gt;Ollama baseline comparison:&lt;/strong&gt; Plain Ollama 1B is 8/10 schema valid (better than the 3B, because the simpler model apparently follows field type instructions more literally). Content quality: Ollama 52/100 vs WebLLM 70/100. The grammar constraints improve structural compliance and seem to improve content quality too, but the underlying weaknesses (parroting, duplication, hallucination) persist.&lt;/p&gt;




&lt;h3&gt;
  
  
  Phi-3.5 Mini
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Target&lt;/th&gt;
&lt;th&gt;Verdict&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Download size&lt;/td&gt;
&lt;td&gt;~2.0 GB&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Large&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cold load time&lt;/td&gt;
&lt;td&gt;37.4s&lt;/td&gt;
&lt;td&gt;≤ 30s&lt;/td&gt;
&lt;td&gt;FAIL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warm load time&lt;/td&gt;
&lt;td&gt;2.4s&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tokens/sec&lt;/td&gt;
&lt;td&gt;52.5&lt;/td&gt;
&lt;td&gt;≥ 10&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU memory&lt;/td&gt;
&lt;td&gt;~2.3 GB&lt;/td&gt;
&lt;td&gt;≤ 2 GB&lt;/td&gt;
&lt;td&gt;FLAG&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSON parse success&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;td&gt;≥ 9/10&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema valid&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;td&gt;≥ 9/10&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avg quality score&lt;/td&gt;
&lt;td&gt;77/100&lt;/td&gt;
&lt;td&gt;subjective&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avg latency/prompt&lt;/td&gt;
&lt;td&gt;6.8s&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Acceptable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Content quality: PASS&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Best average quality score (77), best narrative depth, most actionable recommendations, zero template parroting. Fails the cold-load target (37.4s against the 30s threshold) and trips the VRAM flag (~2.3 GB against the 2 GB threshold). The 1-point quality delta over 3B doesn't justify the extra ~700 MB of download and the load time failure.&lt;/p&gt;




&lt;h3&gt;
  
  
  Side-by-Side
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;3B&lt;/th&gt;
&lt;th&gt;1B&lt;/th&gt;
&lt;th&gt;Phi-3.5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cold load&lt;/td&gt;
&lt;td&gt;23.7s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;11.7s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;37.4s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warm load&lt;/td&gt;
&lt;td&gt;2.3s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.4s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.4s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tok/s&lt;/td&gt;
&lt;td&gt;49.8&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;118.6&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;52.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Latency/prompt&lt;/td&gt;
&lt;td&gt;5.8s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.1s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;6.8s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Download&lt;/td&gt;
&lt;td&gt;1.3 GB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.7 GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.0 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VRAM&lt;/td&gt;
&lt;td&gt;~3.0 GB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~1.1 GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~2.3 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quality&lt;/td&gt;
&lt;td&gt;76&lt;/td&gt;
&lt;td&gt;70&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;77&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema valid&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Template parrot&lt;/td&gt;
&lt;td&gt;0/10&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;7/10&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0/10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hallucinations&lt;/td&gt;
&lt;td&gt;9/10&lt;/td&gt;
&lt;td&gt;5/10&lt;/td&gt;
&lt;td&gt;8/10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quality floor&lt;/td&gt;
&lt;td&gt;52&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;46&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Finding That Changes the Architecture
&lt;/h2&gt;

&lt;p&gt;The Ollama baseline comparison wasn't in the original spike plan. I added it as a sanity check. It turned out to be the most important data in the whole exercise.&lt;/p&gt;

&lt;p&gt;Plain Ollama 3B (no grammar enforcement) fails schema validation 90% of the time on this output contract. The model produces valid JSON. It just puts objects where the schema expects strings. &lt;code&gt;parseLlmResponse()&lt;/code&gt; rejects it.&lt;/p&gt;

&lt;p&gt;This means the existing Ollama integration, before this spike, was silently broken at the schema level for the 3B model. It would have worked fine for smaller models that happen to follow field type instructions more literally, but for the model you actually want to use for quality coaching, it would fail in production nearly every time.&lt;/p&gt;

&lt;p&gt;WebLLM's grammar-constrained generation doesn't just improve the situation. &lt;strong&gt;It &lt;em&gt;defines&lt;/em&gt; the situation.&lt;/strong&gt; Without it, you're rolling the dice on whether the model happens to output the right types.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Implication for any project using Ollama for structured output:&lt;/strong&gt; Ollama added JSON schema support to its &lt;code&gt;format&lt;/code&gt; parameter in v0.5. Use it. Ollama converts the schema into a grammar that constrains sampling, so it is far more reliable for structured output than prompt engineering alone. If you're relying on prompt engineering alone to get schema-compliant output from a small model, you're going to see silent failures in production that look like valid JSON until your validator catches them.&lt;/p&gt;
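&lt;p&gt;For reference, a sketch of what passing a schema through &lt;code&gt;format&lt;/code&gt; could look like against Ollama's &lt;code&gt;/api/chat&lt;/code&gt; endpoint. The helper name &lt;code&gt;buildOllamaChatBody&lt;/code&gt;, the trimmed schema, and the &lt;code&gt;llama3.2:3b&lt;/code&gt; model tag are assumptions for illustration; only the request shape (&lt;code&gt;model&lt;/code&gt;, &lt;code&gt;messages&lt;/code&gt;, &lt;code&gt;stream&lt;/code&gt;, &lt;code&gt;format&lt;/code&gt;) comes from the Ollama API.&lt;/p&gt;

```typescript
// Ollama's chat endpoint on the default local port.
const OLLAMA_CHAT_URL = "http://localhost:11434/api/chat";

// Trimmed schema covering the field that failed in the baseline runs (illustrative).
const positivesSchema = {
  type: "object",
  properties: {
    narrativeSummary: { type: "string" },
    additionalPositives: { type: "array", items: { type: "string" } },
  },
  required: ["narrativeSummary", "additionalPositives"],
};

// Hypothetical helper: builds the JSON body for a non-streaming chat request.
function buildOllamaChatBody(model: string, prompt: string, schema: object) {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    stream: false,
    format: schema, // a JSON Schema object rather than the plain "json" string
  };
}

const exampleBody = buildOllamaChatBody("llama3.2:3b", "...coaching prompt...", positivesSchema);
const examplePayload = JSON.stringify(exampleBody); // POST this to OLLAMA_CHAT_URL
```

&lt;p&gt;With &lt;code&gt;stream&lt;/code&gt; set to false, the reply's &lt;code&gt;message.content&lt;/code&gt; should hold the JSON string, which still goes through the same validator as every other provider.&lt;/p&gt;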




&lt;h2&gt;
  
  
  The Ability Hallucination Problem
&lt;/h2&gt;

&lt;p&gt;Every model, at every size, hallucinates ability names. The 3B invents them in 9 out of 10 prompts. The 1B in 5 out of 10. Phi-3.5 in 8 out of 10.&lt;/p&gt;

&lt;p&gt;Coaching that tells a player to "increase your uptime on Shadow Strike" when their class doesn't have an ability called Shadow Strike destroys credibility instantly. This is domain-specific and model-agnostic. The models don't have SWTOR ability databases. They pattern-match on capitalized phrases that look like they belong in a game context and generate plausible-sounding names.&lt;/p&gt;

&lt;p&gt;The mitigation I'm implementing: post-process every LLM response against the set of ability names present in the input prompt. Any capitalized phrase in the output that isn't in the known set gets flagged. Starting in &lt;code&gt;warn&lt;/code&gt; mode (log to console) before considering &lt;code&gt;strip&lt;/code&gt; mode, because I want observability into how often this fires before making a content decision that could remove legitimate text.&lt;/p&gt;
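&lt;p&gt;A minimal sketch of that post-processing pass, under stated assumptions: the function names are hypothetical, "capitalized phrase" is taken to mean a run of two or more capitalized words, and the ability names in the usage example are sample data rather than a real input prompt.&lt;/p&gt;

```typescript
// Runs of two or more capitalized words, e.g. "Shadow Strike". Single capitalized
// words are skipped so ordinary sentence starts are not flagged.
const CAPITALIZED_PHRASE = /\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b/g;

// Returns capitalized phrases in the output that are absent from the known set.
function findUnknownAbilities(outputText: string, knownAbilities: string[]): string[] {
  const known = new Set(knownAbilities.map((ability) => ability.toLowerCase()));
  const phrases: string[] = outputText.match(CAPITALIZED_PHRASE) ?? [];
  const unknown = phrases.filter((phrase) => !known.has(phrase.toLowerCase()));
  return Array.from(new Set(unknown)); // de-duplicate, keep first-seen order
}

// "warn" mode: log what fired and leave the text untouched.
function warnOnUnknownAbilities(outputText: string, knownAbilities: string[]): string {
  const unknown = findUnknownAbilities(outputText, knownAbilities);
  if (unknown.length > 0) {
    console.warn("Possible hallucinated ability names:", unknown);
  }
  return outputText;
}

const flagged = findUnknownAbilities(
  "Increase your uptime on Shadow Strike and keep using Force Breach on cooldown.",
  ["Force Breach", "Saber Strike"],
);
console.log(flagged); // ["Shadow Strike"]
```

&lt;p&gt;A pattern this simple will also flag legitimate capitalized phrases (a sentence that opens with an ability name, a boss name, a proper noun), which is one more reason to start in &lt;code&gt;warn&lt;/code&gt; mode and measure before stripping anything.&lt;/p&gt;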

&lt;p&gt;This is a reminder that domain-specific hallucination isn't solved by model size. It's solved by grounding. If you're building in a domain with specific terminology (game abilities, medical terms, legal citations), plan for a validation pass.&lt;/p&gt;




&lt;h2&gt;
  
  
  Implementation Decisions
&lt;/h2&gt;

&lt;p&gt;These are the implementation decisions the spike produced — the design I'm building toward. None of this is merged to production yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chosen model:&lt;/strong&gt; &lt;code&gt;Llama-3.2-3B-Instruct-q4f16_1-MLC&lt;/code&gt;. Meets the load-time, throughput, and schema targets; GPU memory (~3 GB against a 2 GB threshold) is the one flagged metric. Quality comparable to Ollama baseline. Zero user setup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Web Worker is non-optional.&lt;/strong&gt; &lt;code&gt;CreateWebWorkerMLCEngine&lt;/code&gt; runs all inference off the main thread. Running it on the main thread freezes the UI during the ~24 second cold load.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lazy loading.&lt;/strong&gt; The model doesn't load on page load or provider construction. It loads on the first &lt;code&gt;generateCoaching()&lt;/code&gt; call, with a progress callback wired to a UI progress bar. Repeat users hit the OPFS cache at 2.3s.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VRAM guard.&lt;/strong&gt; The 3B model uses ~3 GB GPU memory, which can conflict with the game running simultaneously on 8 GB machines. Before loading, call &lt;code&gt;navigator.gpu.requestAdapter()&lt;/code&gt; and surface a warning if the device looks constrained. Don't block the load. Just warn. Sustained inference at ~50 tok/s also has thermal and power-draw implications on a laptop running the game simultaneously; the lazy-load design keeps idle overhead at zero.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1B fast mode.&lt;/strong&gt; Exposed as an opt-in user preference (&lt;code&gt;webllmModel: '3b' | '1b'&lt;/code&gt;), persisted to localStorage. Disclosed as "Fast mode uses a smaller model. Coaching depth may be reduced." The 1B quality floor is too low to be a default, but at 118.6 tok/s and 11.7s cold load it's genuinely compelling for users who know what they're trading.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fallback chain:&lt;/strong&gt; WebLlmProvider (if WebGPU available) -&amp;gt; OllamaProvider (if localhost:11434 reachable, with schema enforcement) -&amp;gt; rule-based coaching (always available). Never let a WebLLM failure surface to the user.&lt;/p&gt;
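&lt;p&gt;The chain above can be sketched as a loop over providers that swallows failures and moves on. Everything here is illustrative: the provider shape (&lt;code&gt;name&lt;/code&gt;, &lt;code&gt;isAvailable&lt;/code&gt;, &lt;code&gt;generate&lt;/code&gt;) and &lt;code&gt;generateWithFallback&lt;/code&gt; are hypothetical, and the rule-based provider is a stand-in that only echoes its input.&lt;/p&gt;

```typescript
// Always-available last resort; its shape doubles as the provider type in this sketch.
const ruleBasedProvider = {
  name: "rule-based",
  isAvailable: async () => true,
  generate: async (statsPrompt: string) => `Rule-based coaching for: ${statsPrompt}`,
};
type CoachingProvider = typeof ruleBasedProvider;

// Tries each provider in order; errors are logged and never reach the caller
// as long as a later provider succeeds.
async function generateWithFallback(chain: CoachingProvider[], statsPrompt: string) {
  for (const provider of chain) {
    try {
      if (await provider.isAvailable()) {
        return { provider: provider.name, text: await provider.generate(statsPrompt) };
      }
    } catch (error) {
      console.warn(`${provider.name} failed, trying the next provider`, error);
    }
  }
  throw new Error("no coaching provider available");
}
```

&lt;p&gt;Because the rule-based provider sits last and always reports itself available, the final &lt;code&gt;throw&lt;/code&gt; should be unreachable in practice; it is there so a misconfigured chain fails loudly during development.&lt;/p&gt;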

&lt;p&gt;&lt;strong&gt;CSP update.&lt;/strong&gt; Model shards fetch from HuggingFace CDN. Add bounded &lt;code&gt;connect-src&lt;/code&gt; exceptions for &lt;code&gt;https://huggingface.co&lt;/code&gt; and &lt;code&gt;https://raw.githubusercontent.com&lt;/code&gt;. No wildcards.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;Test grammar enforcement before testing model quality. The schema compliance numbers are what determine whether the integration works at all. Content quality is a secondary concern. A model that produces 80/100 content but fails schema validation 50% of the time is less useful than a model that produces 70/100 content and passes validation 100% of the time.&lt;/p&gt;

&lt;p&gt;For anyone running a similar spike on a different output schema: start with structured generation enforced at the runtime level. Don't test prompt-engineering-only compliance and expect it to generalize.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;WebLLM with WebGPU is production-ready for this use case, and it's what I'm building toward. The 3B Llama model clears the load-time, throughput, and schema targets (GPU memory is the one flag), produces coaching quality that matches the Ollama baseline, and requires zero user setup. Grammar-constrained generation isn't a nice-to-have -- it's the feature that makes small-model structured output viable at all.&lt;/p&gt;

&lt;p&gt;The ability hallucination problem is real and unsolved by model size. Plan for a post-processing validation pass if your domain has specific terminology.&lt;/p&gt;

&lt;p&gt;The most useful thing I measured was the thing I almost didn't measure: what happens when you remove the grammar enforcement. The answer is that it breaks quietly and often. If I were running this spike again, I'd run the Ollama baseline first — before testing any WebLLM model. Schema compliance is a binary gate. There's no point benchmarking content quality on a model whose output your validator will reject.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Holocron is a browser-based SWTOR combat log analyzer. It's free, requires no install, and all parsing happens client-side. If you play SWTOR and want to understand your logs, try it at &lt;a href="https://holocronparse.com" rel="noopener noreferrer"&gt;holocronparse.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webgpu</category>
      <category>llm</category>
      <category>javascript</category>
      <category>react</category>
    </item>
    <item>
      <title>The React Native Upgrade Decision Framework: Predicting 10-38x Cost Multipliers</title>
      <dc:creator>Michael Stelly</dc:creator>
      <pubDate>Mon, 06 Oct 2025 18:07:23 +0000</pubDate>
      <link>https://forem.com/refactory/the-react-native-upgrade-decision-framework-predicting-10-38x-cost-multipliers-2b32</link>
      <guid>https://forem.com/refactory/the-react-native-upgrade-decision-framework-predicting-10-38x-cost-multipliers-2b32</guid>
      <description>&lt;p&gt;Deferred React Native upgrades multiply costs 10-38x. Here's how to predict which category you're in before approving budget.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use This Framework
&lt;/h2&gt;

&lt;p&gt;Use this assessment if you're:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inheriting a React Native codebase and need to understand what you're walking into&lt;/li&gt;
&lt;li&gt;Evaluating an upgrade request from your team&lt;/li&gt;
&lt;li&gt;Planning quarterly tech debt remediation&lt;/li&gt;
&lt;li&gt;Facing app store compliance deadlines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This framework emerged from 12 enterprise React Native migrations at Fortune 500 retailers and mid-market companies. It identifies the patterns that separate routine upgrades from budget-destroying rewrites.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 18-Month Inflection Point
&lt;/h2&gt;

&lt;p&gt;React Native apps deferred beyond 18 months don't just need updates—they require reverse-engineering code written by people who've left the company. Costs don't scale linearly. They multiply exponentially.&lt;/p&gt;

&lt;p&gt;Real example: An app four years out of date required $380,000 to fix. Eighteen months earlier, the same work would have cost $10,000. The difference: all three disaster conditions appeared together (ghost team, stale dependencies, no RN expertise) plus crossing four major breaking points.&lt;/p&gt;

&lt;p&gt;The pattern repeats across organizations. Understanding where your app sits on this curve determines whether you're approving a $30K sprint or a $300K strategic project.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Conditions That Predict Disaster
&lt;/h2&gt;

&lt;p&gt;Across 12 migrations, three conditions consistently appeared together in projects that became forced rewrites:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Ghost Team&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Original developers who built the app no longer work there. Nobody on the current team understands architectural decisions, custom integrations, or why certain code exists.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Stale Dependencies&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Third-party libraries haven't been updated in 18+ months. Many are abandoned. Some have security vulnerabilities with no patches available.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Capability Gap&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Current team has little or no production React Native experience (fewer than 2-3 developers who've shipped RN apps to production). They've inherited an RN app but have web-only or native-only backgrounds.&lt;/p&gt;

&lt;p&gt;When all three conditions exist simultaneously, you're in rewrite territory. The technical debt is organizational, not just code-based.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Version Gap Amplifier
&lt;/h2&gt;

&lt;p&gt;Version distance matters, but not linearly. React Native has four major breaking points where the entire ecosystem fractures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Version 0.60&lt;/strong&gt; (2019): Android compatibility overhaul&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Versions 0.68-0.70&lt;/strong&gt; (2022): New Architecture introduced&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version 0.72&lt;/strong&gt; (2023): Package reorganization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version 0.76&lt;/strong&gt; (2024): New Architecture becomes the default&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each breaking point adds significant complexity. The gap between version 0.59 and 0.70 isn't incremental: it's an architectural chasm. With React Native 0.81 now in production, teams on versions below 0.76 still have at least one of these breaking points ahead of them, and teams below 0.60 face all four, which makes the migration window even more critical.&lt;/p&gt;

&lt;p&gt;Count how many breaking points sit between your current version and target. Each one represents a major migration, not a simple update.&lt;/p&gt;
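&lt;p&gt;That count is simple enough to script. The sketch below is one possible reading, not part of the framework itself: &lt;code&gt;breakingPointsCrossed&lt;/code&gt; is a hypothetical helper, it only understands 0.x version strings, and it keys the 0.68-0.70 range to its first release (0.68), which is an assumption.&lt;/p&gt;

```typescript
// Breaking points from the list above, keyed by minor version.
// The 0.68-0.70 range is keyed to 0.68 (an assumption of this sketch).
const BREAKING_POINTS = [
  { minor: 60, label: "0.60" },
  { minor: 68, label: "0.68-0.70" },
  { minor: 72, label: "0.72" },
  { minor: 76, label: "0.76" },
];

// Extracts the minor number from a "0.x" or "0.x.y" version string.
function minorOf(version: string): number {
  const match = /^0\.(\d+)/.exec(version);
  if (match === null) throw new Error(`unsupported version: ${version}`);
  return Number(match[1]);
}

// A breaking point counts as crossed when it lies above the current version
// and at or below the target version.
function breakingPointsCrossed(current: string, target: string): string[] {
  const from = minorOf(current);
  const to = minorOf(target);
  return BREAKING_POINTS
    .filter((point) => point.minor > from)
    .filter((point) => to >= point.minor)
    .map((point) => point.label);
}

console.log(breakingPointsCrossed("0.59", "0.70")); // ["0.60", "0.68-0.70"]
```

&lt;p&gt;An app on 0.59 targeting 0.81 crosses all four, while an app already on 0.76 crosses none of the listed points.&lt;/p&gt;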

&lt;h2&gt;
  
  
  The Three Hidden Costs
&lt;/h2&gt;

&lt;p&gt;Version gaps determine technical complexity. But three cost categories determine actual budget impact:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Opportunity Cost&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Two weeks of development becomes months of calendar time. During those months: roadmaps stall, competitors ship your planned features, revenue updates sit in backlog. A $35,000 migration creates $140,000 in lost revenue opportunity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Velocity Collapse&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Upgrades create an 8-week productivity crater:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Weeks 1-2: Migration work (0% feature velocity)&lt;/li&gt;
&lt;li&gt;Weeks 3-4: Bug fixes (30% velocity)&lt;/li&gt;
&lt;li&gt;Weeks 5-6: Performance optimization (50% velocity)&lt;/li&gt;
&lt;li&gt;Weeks 7-8: Recovery (70% velocity)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A "2-week upgrade" costs $144,000 in lost productivity—3x the direct engineering cost.&lt;/p&gt;
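&lt;p&gt;The arithmetic behind the crater can be sketched as lost team-weeks of feature work. The weekly team cost is not stated in the breakdown above, so the figure computed at the end is only what the $144,000 number implies; &lt;code&gt;lostTeamWeeks&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```typescript
// The four recovery phases from the list above: duration and feature velocity.
const RECOVERY_PHASES = [
  { weeks: 2, velocityPct: 0 },  // Weeks 1-2: migration work
  { weeks: 2, velocityPct: 30 }, // Weeks 3-4: bug fixes
  { weeks: 2, velocityPct: 50 }, // Weeks 5-6: performance optimization
  { weeks: 2, velocityPct: 70 }, // Weeks 7-8: recovery
];

// Sums the share of each phase that is NOT spent on features, in team-weeks.
function lostTeamWeeks(phases: { weeks: number; velocityPct: number }[]): number {
  const lostPctWeeks = phases.reduce(
    (sum, phase) => sum + phase.weeks * (100 - phase.velocityPct),
    0,
  );
  return lostPctWeeks / 100;
}

const lostWeeks = lostTeamWeeks(RECOVERY_PHASES); // 5 team-weeks
const impliedWeeklyTeamCost = 144000 / lostWeeks; // 28800 (derived, not a stated figure)
console.log(lostWeeks, impliedWeeklyTeamCost);
```

&lt;p&gt;In other words, the 8-week window loses the equivalent of 5 full team-weeks of feature delivery; plug in your own weekly team cost to get a number for your organization.&lt;/p&gt;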

&lt;p&gt;&lt;strong&gt;Knowledge Transfer Multiplier&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
When code was written by former employees, every fix takes longer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developer on team with context: Normal speed&lt;/li&gt;
&lt;li&gt;Left within 6 months: 2.5x longer&lt;/li&gt;
&lt;li&gt;Left over a year ago: 4x longer
&lt;/li&gt;
&lt;li&gt;Multiple developers gone: 7x+ longer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If former employees wrote 60%+ of your code, multiply all estimates by 4-7x minimum.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pattern-Based Assessment
&lt;/h2&gt;

&lt;p&gt;These tiers are based on observed patterns across migrations, not mathematical formulas:&lt;/p&gt;

&lt;h3&gt;
  
  
  Manageable Upgrade
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Conditions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Team has production React Native experience&lt;/li&gt;
&lt;li&gt;Original developers available OR well-documented codebase&lt;/li&gt;
&lt;li&gt;Dependencies updated within last 12 months&lt;/li&gt;
&lt;li&gt;Crossing 0-1 major breaking points&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Primary Risk:&lt;/strong&gt; Underestimating opportunity cost&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Timeline:&lt;/strong&gt; 1-2 weeks&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Budget:&lt;/strong&gt; $10-30K&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Team:&lt;/strong&gt; 1 senior developer  &lt;/p&gt;

&lt;h3&gt;
  
  
  High-Risk Migration
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Conditions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Team has production React Native experience, BUT at least one of the following applies:

&lt;ul&gt;
&lt;li&gt;Original developers gone&lt;/li&gt;
&lt;li&gt;Stale dependencies (18+ months)&lt;/li&gt;
&lt;li&gt;Crossing 2 major breaking points&lt;/li&gt;
&lt;/ul&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Primary Risk:&lt;/strong&gt; Velocity collapse extends beyond migration window&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Timeline:&lt;/strong&gt; 4-8 weeks&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Budget:&lt;/strong&gt; $50-150K&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Team:&lt;/strong&gt; 2-3 developers plus contractor/consultant  &lt;/p&gt;

&lt;h3&gt;
  
  
  Rewrite Territory
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Conditions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All three disaster patterns present:

&lt;ul&gt;
&lt;li&gt;Ghost team (original developers gone)&lt;/li&gt;
&lt;li&gt;Stale dependencies (18+ months old)&lt;/li&gt;
&lt;li&gt;Capability gap (no production RN experience)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Often crossing 3+ major breaking points&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Primary Risk:&lt;/strong&gt; Project becomes permanent crisis mode&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Timeline:&lt;/strong&gt; 3-6 months&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Budget:&lt;/strong&gt; $300K-1M+&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Team:&lt;/strong&gt; New team or heavy contractor involvement  &lt;/p&gt;

&lt;p&gt;Team capability dominates all other factors. Without production React Native experience, costs multiply significantly regardless of version gap or dependency health.&lt;/p&gt;
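
&lt;p&gt;To make the tiers concrete, here's a rough Python sketch of the sorting logic. The &lt;code&gt;AppProfile&lt;/code&gt; fields and the &lt;code&gt;assess_tier&lt;/code&gt; function are hypothetical names. The tiers above don't cover every combination, so this sketch assumes that anything between the two extremes is high risk.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    has_rn_experience: bool        # team has production React Native experience
    original_devs_available: bool  # original developers still on the team
    well_documented: bool          # codebase is well documented
    months_since_dep_update: int   # age of the dependency set, in months
    breaking_points_crossed: int   # major breaking points between current and target


def assess_tier(app: AppProfile) -> str:
    """Map a profile onto the three tiers described above."""
    ghost_team = not app.original_devs_available
    stale_deps = app.months_since_dep_update >= 18
    capability_gap = not app.has_rn_experience

    # All three disaster patterns present.
    if ghost_team and stale_deps and capability_gap:
        return "Rewrite Territory"

    # Every condition of the manageable tier holds.
    if (
        app.has_rn_experience
        and (app.original_devs_available or app.well_documented)
        and 12 >= app.months_since_dep_update
        and 1 >= app.breaking_points_crossed
    ):
        return "Manageable Upgrade"

    # Assumption of this sketch: everything in between is high risk.
    return "High-Risk Migration"


if __name__ == "__main__":
    print(assess_tier(AppProfile(True, False, False, 20, 2)))
```

&lt;p&gt;Treat the output as a conversation starter, not a verdict. The tiers come from observed patterns, and a profile near a boundary deserves a closer look.&lt;/p&gt;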

&lt;h2&gt;
  
  
  Three Diagnostic Questions
&lt;/h2&gt;

&lt;p&gt;Ask these in your next planning meeting. The pauses reveal hidden risk.&lt;/p&gt;

&lt;h3&gt;
  
  
  Question 1: "What broke the last time we upgraded?"
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What you're testing:&lt;/strong&gt; Institutional memory&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strong answers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Specific component names and technical details&lt;/li&gt;
&lt;li&gt;Documented runbooks with time estimates&lt;/li&gt;
&lt;li&gt;Lessons learned applied to current architecture&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Weak answers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"I think it went fine"&lt;/li&gt;
&lt;li&gt;"No one on the current team was here for that"&lt;/li&gt;
&lt;li&gt;Searching through Slack or git history to find answers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What it means:&lt;/strong&gt; If your team can't describe what broke last time with specifics, they can't predict what will break this time. Add 40% to any timeline estimate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Question 2: "Who owns each of our custom integrations?"
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What you're testing:&lt;/strong&gt; Knowledge transfer risk and capability gaps&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strong answers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Names of current employees with specific ownership&lt;/li&gt;
&lt;li&gt;Recent updates to custom code (within 6 months)&lt;/li&gt;
&lt;li&gt;Clear documentation and test coverage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Weak answers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"I think [former employee] wrote that"&lt;/li&gt;
&lt;li&gt;"The build works, so we haven't touched it"&lt;/li&gt;
&lt;li&gt;Long pause followed by code history searches&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What it means:&lt;/strong&gt; Custom code for cameras, payments, and sensors fails during upgrades in ways standard features don't. If answers involve former employees, multiply estimates by 2.5x.&lt;/p&gt;

&lt;h3&gt;
  
  
  Question 3: "What's our rollback plan if this fails in production?"
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What you're testing:&lt;/strong&gt; Production risk awareness and strategic thinking&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strong answers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Specific technical approach (staged rollout to 10% of users)&lt;/li&gt;
&lt;li&gt;Monitoring plan with defined thresholds (crash rate &amp;gt;2% triggers rollback)&lt;/li&gt;
&lt;li&gt;Tested rollback procedures with documented timing (can revert in under 2 hours)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Weak answers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"We test thoroughly, so it shouldn't fail"&lt;/li&gt;
&lt;li&gt;"We'll fix issues as they come up"&lt;/li&gt;
&lt;li&gt;Blank stares or "we can roll back the app store version"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What it means:&lt;/strong&gt; Teams without rollback plans don't understand production risk. Without rollback capability, emergency fixes create new bugs, app ratings collapse, and you end up explaining to the board why routine maintenance destroyed user satisfaction.&lt;/p&gt;
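
&lt;p&gt;A strong answer is specific enough to write down as a rule. As a sketch, the example thresholds above (10% staged rollout, rollback when the crash rate goes above 2%) could look like this in Python. Measuring crash rate per session is an assumption here, and the function names are hypothetical; wire the check to whatever your monitoring actually reports.&lt;/p&gt;

```python
ROLLOUT_PERCENT = 10          # staged rollout to 10% of users
CRASH_THRESHOLD_PERCENT = 2   # a crash rate above 2% triggers rollback


def rollout_size(total_users: int) -> int:
    """Number of users in the first rollout stage."""
    return total_users * ROLLOUT_PERCENT // 100


def should_roll_back(crashed_sessions: int, total_sessions: int) -> bool:
    """True when the observed crash rate is strictly above the threshold."""
    if total_sessions == 0:
        return False  # no data yet, nothing to act on
    # Integer comparison avoids floating-point rounding at exactly 2%.
    return crashed_sessions * 100 > total_sessions * CRASH_THRESHOLD_PERCENT


if __name__ == "__main__":
    print(rollout_size(50_000), should_roll_back(25, 1_000))
```

&lt;p&gt;The point isn't the code. It's that the threshold and the audience size exist in writing before release day.&lt;/p&gt;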

&lt;h2&gt;
  
  
  Decision Matrix
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Assessment Tier&lt;/th&gt;
&lt;th&gt;Team Capability&lt;/th&gt;
&lt;th&gt;Timeline&lt;/th&gt;
&lt;th&gt;Budget&lt;/th&gt;
&lt;th&gt;Approval Level&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Manageable&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;RN production experience&lt;/td&gt;
&lt;td&gt;1-2 weeks&lt;/td&gt;
&lt;td&gt;$10-30K&lt;/td&gt;
&lt;td&gt;Team lead approval&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;High Risk&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;RN experience OR can hire expertise quickly&lt;/td&gt;
&lt;td&gt;4-8 weeks&lt;/td&gt;
&lt;td&gt;$50-150K&lt;/td&gt;
&lt;td&gt;Director approval + dedicated sprint&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rewrite&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No RN experience + ghost team + stale deps&lt;/td&gt;
&lt;td&gt;3-6 months&lt;/td&gt;
&lt;td&gt;$300K-1M+&lt;/td&gt;
&lt;td&gt;VP/CTO approval + strategic initiative&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The Two Scenarios
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scenario A: Systematic Maintenance&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
You're 3-6 months behind current, not 18+. You have team members with RN production experience. You have tested rollback procedures. The compliance deadline is annoying but manageable. Budget impact: $30-50K.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scenario B: Deferred Maintenance&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
You're 18+ months behind. Dependencies are failing. Your team has no RN production experience. The compliance deadline becomes an existential crisis requiring emergency contractor spending. Budget impact: $300K-1M+.&lt;/p&gt;

&lt;p&gt;Same external deadline. 10-30x difference in cost. The variable is whether you assessed risk before it became a crisis.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Do This Week
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Identify which tier your app occupies&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do you have all three disaster conditions? (Ghost team + stale deps + capability gap)&lt;/li&gt;
&lt;li&gt;Count breaking points between current and target version&lt;/li&gt;
&lt;li&gt;Calculate dependency age (when were top 20 dependencies last updated?)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Ask your team the three diagnostic questions&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Schedule 30 minutes this week&lt;/li&gt;
&lt;li&gt;Document the answers (especially the pauses and uncertainty)&lt;/li&gt;
&lt;li&gt;Note which questions triggered searches through old documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Make decisions based on assessment, not hope&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manageable tier: Approve as routine sprint work&lt;/li&gt;
&lt;li&gt;High-risk tier: Budget for dedicated project with external expertise&lt;/li&gt;
&lt;li&gt;Rewrite tier: Begin strategic planning for application replacement&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Technical debt accumulates silently until external forces—app store requirements, security vulnerabilities, competitor pressure—make it visible. By that point, costs have multiplied 10-38x and options have narrowed to "expensive" or "more expensive."&lt;/p&gt;

&lt;p&gt;The assessment costs nothing. The surprise costs everything.&lt;/p&gt;

&lt;p&gt;Teams that successfully navigate React Native upgrades see these patterns before they become board-level problems. They assess before committing budget. They ask revealing questions before accepting estimates. They understand true costs before approving work.&lt;/p&gt;

&lt;p&gt;The patterns repeat predictably. The breaking points don't move. The only variable is whether you see them coming in time to make strategic decisions instead of reactive ones.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Michael Stelly is a Senior Frontend Engineer specializing in React Native architecture and enterprise migrations. This framework emerged from leading upgrades at Fortune 500 retailers and analyzing migration patterns across 12 enterprise applications over seven years.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>mobiledevelopment</category>
      <category>engineeringmanagement</category>
      <category>technicaldebt</category>
    </item>
    <item>
      <title>The Walls That Turn $10k Updates Into $300k Rewrites</title>
      <dc:creator>Michael Stelly</dc:creator>
      <pubDate>Tue, 30 Sep 2025 16:05:05 +0000</pubDate>
      <link>https://forem.com/refactory/the-four-walls-that-turn-10k-updates-into-300k-rewrites-1cee</link>
      <guid>https://forem.com/refactory/the-four-walls-that-turn-10k-updates-into-300k-rewrites-1cee</guid>
      <description>&lt;h2&gt;
  
  
  The $380,000 Wake-Up Call
&lt;/h2&gt;

&lt;p&gt;A Fortune 500 retailer called me after receiving a $380,000 quote to update their React Native app. The same update would have cost $10,000 eighteen months earlier. &lt;/p&gt;

&lt;p&gt;In Part 1, I established the 18-month rule: React Native apps that go 18 months without updates cost 10x more to fix than maintain. Now I'll show you exactly why—four specific version changes that transform routine updates into bankruptcy-inducing rewrites.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cost Multiplication Formula
&lt;/h2&gt;

&lt;p&gt;After managing 12+ React Native migrations, the pattern never varies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;THE COMPOUND INTEREST OF NEGLECT
Base Update Cost: $10,000 (1 developer, 2 weeks)
Skip one critical version: $25,000-35,000
Skip two critical versions: $60,000-90,000
Skip three critical versions: $150,000+
Skip all four: Start shopping for rebuild quotes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
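
&lt;p&gt;If you want that ladder in a planning script, it reduces to a lookup. This is a small Python sketch of the figures above; &lt;code&gt;COST_BY_SKIPPED_VERSIONS&lt;/code&gt; and &lt;code&gt;estimate_for&lt;/code&gt; are illustrative names, and the numbers are the ones quoted in the block, nothing more.&lt;/p&gt;

```python
# Cost ladder from the block above, in US dollars: (low, high); None means open-ended.
COST_BY_SKIPPED_VERSIONS = {
    0: (10_000, 10_000),
    1: (25_000, 35_000),
    2: (60_000, 90_000),
    3: (150_000, None),
}


def estimate_for(skipped_critical_versions: int) -> str:
    """Return the quoted cost range for a number of skipped critical versions."""
    if 0 > skipped_critical_versions:
        raise ValueError("skipped_critical_versions must not be negative")
    if skipped_critical_versions >= 4:
        return "start shopping for rebuild quotes"
    low, high = COST_BY_SKIPPED_VERSIONS[skipped_critical_versions]
    if high is None:
        return f"${low:,}+"
    if low == high:
        return f"${low:,}"
    return f"${low:,}-${high:,}"


if __name__ == "__main__":
    for skipped in range(5):
        print(skipped, estimate_for(skipped))
```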



&lt;p&gt;Technical debt doesn't have a payment plan. It has a due date, and the penalty is bankruptcy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Version 0.60: When Google Broke Every Android App
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;THE ANDROID BREAKING POINT&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost to fix:&lt;/strong&gt; $25,000-$35,000&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time required:&lt;/strong&gt; 3-4 weeks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What breaks:&lt;/strong&gt; Every Android component&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skip penalty:&lt;/strong&gt; Multiplies next update by 3x&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business impact:&lt;/strong&gt; App won't compile until fixed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Google forced every React Native app to migrate from their old Android Support Library to AndroidX. These systems cannot coexist—it's one or the other, no exceptions. Every third-party component in your app must be updated or replaced. The tools that compile your code into an app need complete overhaul.&lt;/p&gt;

&lt;p&gt;When Bluecrew received Apple's removal notice, their React Native 0.61 app hadn't seen maintenance in 18 months—just break fixes to keep it running. No one had touched the dependencies. No one had updated the architecture. The technical debt had compounded silently until Apple forced their hand. With 90 days to comply or lose their app store presence, they faced a complete rebuild.&lt;/p&gt;

&lt;h2&gt;
  
  
  Version 0.68-0.70: The Architecture Tax
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;THE PERFORMANCE PARADOX&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost to fix:&lt;/strong&gt; $40,000-$80,000&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time required:&lt;/strong&gt; 6-8 weeks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What breaks:&lt;/strong&gt; Core app communication layer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skip penalty:&lt;/strong&gt; Performance degrades permanently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business impact:&lt;/strong&gt; App gets slower while competitors get faster&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;React Native introduced a completely new foundation system for how JavaScript communicates with native code. The cruel irony? Running both systems simultaneously—which happens by default—makes your app slower than before the "upgrade."&lt;/p&gt;

&lt;p&gt;Your custom features may need complete rewrites. Developer productivity craters as they fight through hundreds of warnings. Most third-party components don't support the new system, forcing impossible choices between features and modernization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Version 0.72: The Hidden $25,000 Reorganization
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;THE IMPORT MAZE&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost to fix:&lt;/strong&gt; $15,000-$25,000&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time required:&lt;/strong&gt; 2-3 weeks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What breaks:&lt;/strong&gt; Every internal connection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skip penalty:&lt;/strong&gt; Blocks all future updates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business impact:&lt;/strong&gt; Zero features, pure overhead&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;React Native reorganized its entire package structure. Every internal connection in your app—how your payment system talks to your user interface, how your login connects to your data—needs manual rewiring. We're talking hundreds, sometimes thousands, of connection points.&lt;/p&gt;

&lt;p&gt;You're paying senior developers to achieve nothing visible. The app works exactly the same, just with different plumbing. Try explaining that to stakeholders.&lt;/p&gt;

&lt;h2&gt;
  
  
  Version 0.76+: The Point of No Return
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;THE MANDATORY MIGRATION&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost to fix:&lt;/strong&gt; $100,000+ (migration) or $300,000+ (rebuild)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time required:&lt;/strong&gt; 3-6 months&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What breaks:&lt;/strong&gt; Everything still on old foundation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skip penalty:&lt;/strong&gt; Not an option—update or die&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business impact:&lt;/strong&gt; Feature freeze or start over&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The old foundation system is dead. No compatibility mode. No grace period. Update or watch your app get delisted from app stores.&lt;/p&gt;

&lt;p&gt;At this point, most companies face a brutal choice: spend $100,000+ forcing a migration that might fail, or spend $300,000 on a rebuild that definitely works. Neither option is good. Both could have been avoided.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bluecrew Crisis: A Cautionary Tale
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;"18 months of 'saving money' created a crisis that quarterly maintenance would have prevented."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Bluecrew's story is typical. After deploying their React Native app, they focused on growth, not maintenance. Break fixes only. The app worked, users were happy, why spend money updating what wasn't broken?&lt;/p&gt;

&lt;p&gt;For 18 months, technical debt accumulated invisibly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;React Native fell behind by 11 versions&lt;/li&gt;
&lt;li&gt;Dependencies went unmaintained&lt;/li&gt;
&lt;li&gt;Security patches were ignored&lt;/li&gt;
&lt;li&gt;The ecosystem moved forward while their app stood still&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then Apple's removal notice arrived. 90 days to comply with security requirements that their version could never meet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The timeline:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Months 1-18:&lt;/strong&gt; "The app works fine" - $0 spent on maintenance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 0:&lt;/strong&gt; Apple removal notice arrives&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 1-30:&lt;/strong&gt; Internal team discovers it's not fixable with updates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 31-60:&lt;/strong&gt; Three firms quote rebuilds ranging from $300,000 to $380,000; none will attempt updates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 61-90:&lt;/strong&gt; Emergency rebuild with specialist&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The real cost wasn't just the emergency rebuild. It was the three months of uncertainty, the risk to their business model if the app was delisted, and the opportunity cost of developers fighting fires instead of building features. While Bluecrew spent 3 months migrating, their competitors shipped 12-15 new features. That's the real cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Multiplication Effect Nobody Explains
&lt;/h2&gt;

&lt;p&gt;These versions don't add difficulty—they multiply it:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Walls to Cross&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;th&gt;Reality Check&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;One wall&lt;/td&gt;
&lt;td&gt;Manageable sprint&lt;/td&gt;
&lt;td&gt;Annoying but doable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Two walls&lt;/td&gt;
&lt;td&gt;Everything breaks twice&lt;/td&gt;
&lt;td&gt;Team considers quitting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Three walls&lt;/td&gt;
&lt;td&gt;Dependencies fight each other&lt;/td&gt;
&lt;td&gt;CTO updates resume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Four walls&lt;/td&gt;
&lt;td&gt;Cheaper to rebuild&lt;/td&gt;
&lt;td&gt;Board asks hard questions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every month you delay adds 15% to your bill. That's a higher interest rate than most credit cards.&lt;/p&gt;
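
&lt;p&gt;To see what 15% a month does over time, here's a short Python sketch. Whether the penalty compounds or simply accumulates isn't spelled out above, so the hypothetical &lt;code&gt;delayed_cost&lt;/code&gt; function supports both readings and treats compounding as the default assumption.&lt;/p&gt;

```python
MONTHLY_PENALTY = 0.15  # "every month you delay adds 15%"


def delayed_cost(base_cost: float, months_delayed: int, compound: bool = True) -> float:
    """Project a cost after a delay, with a compounding or simple monthly penalty."""
    if 0 > months_delayed:
        raise ValueError("months_delayed must not be negative")
    if compound:
        # Each month's 15% applies to the already-inflated bill.
        return base_cost * (1 + MONTHLY_PENALTY) ** months_delayed
    # Simple reading: 15% of the original bill is added each month.
    return base_cost * (1 + MONTHLY_PENALTY * months_delayed)


if __name__ == "__main__":
    print(round(delayed_cost(10_000, 18)))
    print(round(delayed_cost(10_000, 18, compound=False)))
```

&lt;p&gt;Over 18 months, the compounding reading multiplies the bill roughly 12 times, while the simple reading lands near 3.7 times. Either way, waiting is the expensive option.&lt;/p&gt;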

&lt;h2&gt;
  
  
  What This Means for Your Industry
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;E-commerce:&lt;/strong&gt; Your checkout breaks during Black Friday. Revenue drops to zero while you emergency patch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Banking:&lt;/strong&gt; You fail your next security audit. Regulators don't care about your update timeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Healthcare:&lt;/strong&gt; You lose HIPAA compliance overnight. Legal asks why this wasn't prevented.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SaaS:&lt;/strong&gt; Your mobile app becomes a competitive liability. Customers switch to competitors who maintained their apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 30-Second Assessment
&lt;/h2&gt;

&lt;p&gt;Is your React Native app already dead? Check these five indicators:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;React Native version below 0.72? &lt;strong&gt;[Yes = +1 point]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Last update more than 12 months ago? &lt;strong&gt;[Yes = +2 points]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Original developers gone? &lt;strong&gt;[Yes = +1 point]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;More than 20 outdated dependencies? &lt;strong&gt;[Yes = +2 points]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Build warnings fill your console? &lt;strong&gt;[Yes = +1 point]&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Score 3+?&lt;/strong&gt; You're already in crisis. The question isn't if you'll pay, but how much.&lt;/p&gt;
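
&lt;p&gt;The scoring is simple enough to drop into a script or a spreadsheet. Here's a minimal Python version; the indicator keys and function names are made up for this sketch, while the point values and the threshold of 3 come straight from the checklist above.&lt;/p&gt;

```python
# Points per indicator, taken from the checklist above.
INDICATOR_POINTS = {
    "rn_below_0_72": 1,
    "last_update_over_12_months": 2,
    "original_devs_gone": 1,
    "over_20_outdated_deps": 2,
    "build_warnings_fill_console": 1,
}
CRISIS_THRESHOLD = 3


def crisis_score(answers: dict[str, bool]) -> int:
    """Add up the points for every indicator answered 'yes'."""
    unknown = set(answers) - set(INDICATOR_POINTS)
    if unknown:
        raise KeyError(f"unknown indicators: {sorted(unknown)}")
    return sum(
        points
        for name, points in INDICATOR_POINTS.items()
        if answers.get(name, False)
    )


def in_crisis(answers: dict[str, bool]) -> bool:
    """True when the score reaches the crisis threshold."""
    return crisis_score(answers) >= CRISIS_THRESHOLD


if __name__ == "__main__":
    answers = {"rn_below_0_72": True, "last_update_over_12_months": True}
    print(crisis_score(answers), in_crisis(answers))
```

&lt;p&gt;Note how quickly the threshold is reached: an old version plus a year without updates already scores 3.&lt;/p&gt;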

&lt;h2&gt;
  
  
  Companies Doing It Right
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AVOIDING THE WALLS:
Walmart: Updates monthly, migration cost &amp;lt; $5k/quarter
Discord: Automated 90% of updates, 2-day turnaround  
Coinbase: Dedicated React Native maintenance team
Their secret: They never hit the walls

Meanwhile, companies in crisis:
Bluecrew: 90-day removal notice, emergency rebuild
Fortune 500 Retailer: $380,000 quote, considering abandoning mobile
Healthcare Startup: Failed HIPAA audit, mobile app offline 3 months
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Pattern I See Repeatedly
&lt;/h2&gt;

&lt;p&gt;After 12+ React Native rescues, the pattern never changes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;App launches successfully&lt;/li&gt;
&lt;li&gt;Team moves to next project&lt;/li&gt;
&lt;li&gt;"It's working fine" for 12-18 months&lt;/li&gt;
&lt;li&gt;External force (app store, security, OS update) breaks app&lt;/li&gt;
&lt;li&gt;Discovery that updates are now impossible&lt;/li&gt;
&lt;li&gt;Emergency rebuild at 10x the maintenance cost&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Bluecrew fit this pattern exactly. So did my work at Sam's Club, where their 0.61.5 app had already crossed into "archaeology" territory. The pattern is so consistent I can predict costs within 10%.&lt;/p&gt;

&lt;h2&gt;
  
  
  If You're Already in Crisis
&lt;/h2&gt;

&lt;p&gt;If you're already past the 18-month mark, don't panic. The worst decision is further delay. Even apps 24+ months behind can be saved—it just requires accepting the reality of rebuild over update. Every day you wait adds to the final bill, but starting today stops the bleeding.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Decision You Face Today
&lt;/h2&gt;

&lt;p&gt;While you're reading this, your competitors are shipping features. Instagram, Walmart, and others update their React Native apps monthly. They're not doing it for fun—they're avoiding the walls that trap you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option A: Start Quarterly Maintenance Now&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$40,000/year predictable cost&lt;/li&gt;
&lt;li&gt;Zero feature freeze&lt;/li&gt;
&lt;li&gt;Developers stay productive&lt;/li&gt;
&lt;li&gt;Competitive advantage maintained&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option B: Wait for the Crisis&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$150,000-$300,000 emergency cost&lt;/li&gt;
&lt;li&gt;3-6 month feature freeze&lt;/li&gt;
&lt;li&gt;Team burnout guaranteed&lt;/li&gt;
&lt;li&gt;Explain to board why mobile revenue stopped&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The math is unforgiving: preventive maintenance costs a fraction of emergency repairs.&lt;/p&gt;

&lt;h2&gt;
  
  
  You Can't Defer Forever
&lt;/h2&gt;

&lt;p&gt;These events force updates whether you're ready or not:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;App store security requirements:&lt;/strong&gt; 90-day compliance or delisting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payment processor updates:&lt;/strong&gt; Update or lose transaction capability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;iOS/Android annual updates:&lt;/strong&gt; Your app breaks every September&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security vulnerabilities:&lt;/strong&gt; Immediate patches or legal liability
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WARNING SIGNS YOUR NOTICE IS COMING:
□ Your app requires iOS/Android versions from 2+ years ago
□ You're still asking for permissions deprecated in iOS 14
□ Your Android target SDK is below 31 (from 2021)
□ You haven't updated since before COVID

If you checked ANY box, start planning now.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What 12 Migrations Taught Me
&lt;/h2&gt;

&lt;p&gt;Companies don't skip maintenance maliciously. They skip it because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The app works today&lt;/li&gt;
&lt;li&gt;Resources are tight&lt;/li&gt;
&lt;li&gt;Other priorities seem more urgent&lt;/li&gt;
&lt;li&gt;The accumulating debt is invisible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Until it's not. Until Apple or Google sends the notice. Until a critical security vulnerability is discovered. Until the latest iOS breaks your app.&lt;/p&gt;

&lt;p&gt;By then, the 2-day updates have become 2-month rebuilds. The $10,000 maintenance has become $100,000+ emergencies.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;You now understand the 18-month cliff (Part 1) and the four walls that multiply costs (Part 2).&lt;/p&gt;

&lt;p&gt;In Part 3, "The React Native Migration Playbook," I'll provide the tactical guide for crossing each wall—specific strategies that minimize risk and cost while keeping your app alive during migration.&lt;/p&gt;

&lt;p&gt;But first, find out where you stand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx react-native info
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Count your walls. Calculate your costs. Make your choice.&lt;/p&gt;

&lt;p&gt;The math is consistent: 18 months of deferred maintenance equals a rebuild. Not an update, not a migration—a rebuild. Bluecrew proved this. Sam's Club proved this. The next proof might be your app.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 3 coming next: The step-by-step playbook for navigating each wall while keeping your app alive.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About the series:&lt;/strong&gt; The React Native Foundations series covers the what, why, and how of maintaining healthy React Native apps. Part 1 revealed the 18-month cliff. Part 2 exposed the cost multipliers. Part 3 will show you exactly how to navigate the migration when you can't avoid it any longer.&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>mobile</category>
      <category>techdebt</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Your React Native App Has 18 Months to Live</title>
      <dc:creator>Michael Stelly</dc:creator>
      <pubDate>Mon, 22 Sep 2025 15:35:34 +0000</pubDate>
      <link>https://forem.com/refactory/your-react-native-app-has-18-months-to-live-9lc</link>
      <guid>https://forem.com/refactory/your-react-native-app-has-18-months-to-live-9lc</guid>
      <description>&lt;p&gt;The call came at 4 PM on a Tuesday: "Apple just sent us a Q4 compliance notice. Our React Native app needs to meet new security requirements by year-end. Can you help?" His team had no mobile experience, the deadline was breathing down their necks, and they needed help fast. The culprit? Their app was still running React Native 0.61 - a version so outdated that app stores were flagging it for known security vulnerabilities that would never be patched.&lt;/p&gt;

&lt;p&gt;Within 20 minutes of our first conversation, I knew we had a problem that went far deeper than a simple version bump. After our audit, they had worse news: another firm had quoted $380,000 to solve their 'simple' upgrade with a rebuild from scratch. Fortunately, I had a better plan.&lt;/p&gt;

&lt;h2&gt;
  
  
  THE $380,000 REALITY CHECK
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;App Profile:&lt;/strong&gt; 20 screens, 10k users, e-commerce&lt;br&gt;&lt;br&gt;
&lt;strong&gt;External Rebuild Quote:&lt;/strong&gt; $380,000&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Specialist Rebuild Cost:&lt;/strong&gt; $120,000 (9 months solo)&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Client Savings:&lt;/strong&gt; $260,000&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Lesson:&lt;/strong&gt; Experience matters when technical debt becomes unavoidable&lt;/p&gt;

&lt;h2&gt;
  
  
  The Technical Debt Compound Effect
&lt;/h2&gt;

&lt;p&gt;After modernizing 12+ React Native apps, I've found the point of no return: 18 months. Skip quarterly updates for longer than that, and your linear fixes become exponential problems. The math is brutal but consistent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Month 0–6:&lt;/strong&gt; Simple updates, 2 hours each quarterly release&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Month 7–12:&lt;/strong&gt; Dependencies conflict, 8 hours per update&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Month 13–18:&lt;/strong&gt; Native module incompatibilities, 40+ hours&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Month 19+:&lt;/strong&gt; Complete rebuild recommended&lt;/p&gt;
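
&lt;p&gt;As a quick reference, the same bands can be written as a lookup. This Python sketch only restates the timeline above; &lt;code&gt;expected_effort&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
def expected_effort(months_since_last_update: int) -> str:
    """Return the effort band for an app that has gone this long without updates."""
    if 0 > months_since_last_update:
        raise ValueError("months_since_last_update must not be negative")
    if 6 >= months_since_last_update:
        return "simple update, about 2 hours per quarterly release"
    if 12 >= months_since_last_update:
        return "dependency conflicts, about 8 hours per update"
    if 18 >= months_since_last_update:
        return "native module incompatibilities, 40+ hours"
    return "complete rebuild recommended"


if __name__ == "__main__":
    for months in (3, 9, 15, 24):
        print(months, "months:", expected_effort(months))
```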

&lt;p&gt;When I contracted with Sam's Club as Senior Mobile Engineer in early 2022 to lead their React Native migration for the fresh seafood department workers' app, it was stuck on version 0.61.5 - already three years behind the ecosystem. We successfully migrated to 0.67.2, but the process revealed how quickly technical debt compounds when updates are deferred.&lt;/p&gt;

&lt;h2&gt;
  
  
  Five Signs Your React Native App Will Die in 2025
&lt;/h2&gt;

&lt;p&gt;Through painful experience, I've identified five early warning signs that predict this exact scenario:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Your React Native version is below 0.72&lt;/strong&gt; (released June 21, 2023) - You're now 2+ years behind critical security patches including the Regular Expression Denial of Service (ReDoS) vulnerability that affected versions 0.59.0 to 0.62.3&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;npm outdated&lt;/code&gt; shows 20+ major version gaps&lt;/strong&gt; - When more than 50% of your dependency tree is unsupported packages, you're not looking at updates anymore - you're looking at archaeology&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Your build times have significantly degraded&lt;/strong&gt; from when you first started the project - This indicates fundamental configuration drift from modern React Native expectations&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Your Android build fails on Gradle 8+&lt;/strong&gt; due to namespace conflicts in legacy native modules - New app store requirements will eventually force this upgrade whether you're ready or not&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Console shows 15+ deprecation warnings on startup&lt;/strong&gt; - These aren't just noise - they're countdown timers to broken functionality&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I've seen all five symptoms at once exactly three times. All three required complete rewrites.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Prevention Playbook
&lt;/h2&gt;

&lt;p&gt;The React Native ecosystem's rapid evolution is both its greatest strength and its most dangerous trap. Stay current, and you ride the wave of continuous improvements. Fall behind, and you're fighting an entire ecosystem that has moved on without you.&lt;/p&gt;

&lt;h3&gt;
  
  
  QUARTERLY MAINTENANCE PLAYBOOK
&lt;/h3&gt;

&lt;p&gt;☐ Update React Native by one minor version max&lt;br&gt;&lt;br&gt;
☐ Run &lt;code&gt;npm audit fix&lt;/code&gt; for security patches&lt;br&gt;&lt;br&gt;
☐ Update React Navigation if you use it (breaking changes are common)&lt;br&gt;&lt;br&gt;
☐ Test on latest iOS/Android beta releases&lt;br&gt;&lt;br&gt;
☐ Profile app performance, document any degradation&lt;br&gt;&lt;br&gt;
☐ Remove one unused dependency minimum&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Time Investment:&lt;/strong&gt; 16–24 hours per quarter&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Version jump strategy:&lt;/strong&gt; Never skip more than two React Native minor versions. The breaking changes accumulate too quickly for safe major jumps.&lt;/p&gt;
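&lt;p&gt;As a rough illustration of the version jump rule, here is a minimal sketch that lists intermediate stops between two React Native minor versions. The helper name &lt;code&gt;planUpgradePath&lt;/code&gt; is hypothetical, and it assumes "skipping two" means each hop lands at most three minors ahead. Adjust &lt;code&gt;maxSkipped&lt;/code&gt; if you read the rule more strictly.&lt;/p&gt;

```typescript
// Hypothetical helper: lists the minor versions to stop at between two
// React Native 0.x releases, skipping at most `maxSkipped` minors per hop.
function planUpgradePath(fromMinor: number, toMinor: number, maxSkipped = 2): string[] {
  const stops: string[] = [];
  let current = fromMinor;
  while (toMinor > current) {
    // Each hop lands at most maxSkipped + 1 minors ahead, never past the target.
    current = Math.min(current + maxSkipped + 1, toMinor);
    stops.push("0." + current);
  }
  return stops;
}

// Same version span as a 0.61 to 0.67 migration.
console.log(planUpgradePath(61, 67)); // [ '0.64', '0.67' ]
```

&lt;p&gt;The output is only a planning aid. Each stop still needs its own build, test pass, and dependency check before you move to the next one.&lt;/p&gt;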

&lt;p&gt;&lt;strong&gt;Dependency hygiene:&lt;/strong&gt; Remove unused packages immediately. Every dependency is a potential failure point during updates.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Audit Checklist
&lt;/h2&gt;

&lt;p&gt;Run this 5-minute audit on your React Native app today:&lt;/p&gt;

&lt;p&gt;☐ &lt;strong&gt;Check your version:&lt;/strong&gt; Run &lt;code&gt;npm ls react-native&lt;/code&gt; to see the installed version (current is 0.81)&lt;br&gt;&lt;br&gt;
☐ &lt;strong&gt;Count outdated dependencies:&lt;/strong&gt; Run &lt;code&gt;npm outdated&lt;/code&gt; and count major version gaps&lt;br&gt;&lt;br&gt;
☐ &lt;strong&gt;Test latest tools:&lt;/strong&gt; Try building with the newest Xcode and Android Studio&lt;br&gt;&lt;br&gt;
☐ &lt;strong&gt;Measure build performance:&lt;/strong&gt; Time a clean build from &lt;code&gt;npx react-native run-android&lt;/code&gt;&lt;br&gt;&lt;br&gt;
☐ &lt;strong&gt;Count deprecation warnings:&lt;/strong&gt; How many warnings appear in your console on app startup?&lt;/p&gt;
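&lt;p&gt;Counting major version gaps by hand gets tedious, so here is a minimal sketch that tallies them from the JSON that &lt;code&gt;npm outdated --json&lt;/code&gt; prints. The function names and sample data are hypothetical, and the sketch assumes a "major version gap" means a package whose latest release is at least one major version ahead of the installed one.&lt;/p&gt;

```typescript
// Shape of one entry in `npm outdated --json` output (only the fields used here).
interface OutdatedEntry {
  current?: string; // missing when the package is not installed
  latest: string;
}

// Leading number of a version string such as "4.17.21"; NaN if unparsable.
function majorOf(version: string): number {
  return parseInt(version.split(".")[0], 10);
}

// Counts packages whose latest release is at least one major version ahead.
function countMajorGaps(report: { [name: string]: OutdatedEntry }): number {
  let gaps = 0;
  for (const name of Object.keys(report)) {
    const entry = report[name];
    if (entry.current === undefined) continue; // not installed, nothing to compare
    if (majorOf(entry.latest) > majorOf(entry.current)) gaps += 1;
  }
  return gaps;
}

// Hypothetical sample data, not taken from a real project.
const sample = {
  "pkg-a": { current: "4.4.4", latest: "5.0.0" },
  "pkg-b": { current: "4.17.20", latest: "4.17.21" },
  "pkg-c": { current: "1.2.0", latest: "3.0.1" },
};
console.log(countMajorGaps(sample)); // 2
```

&lt;p&gt;To use it on a real project, save the report with &lt;code&gt;npm outdated --json &gt; outdated.json&lt;/code&gt;, parse the file, and pass the resulting object to &lt;code&gt;countMajorGaps&lt;/code&gt;.&lt;/p&gt;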

&lt;p&gt;More than 3 red flags? You're approaching the 18-month cliff.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Choice is Yours
&lt;/h2&gt;

&lt;p&gt;The choice is stark: invest 16–24 hours per quarter in updates, or $380,000 in a rebuild. If you're seeing any of these warning signs, you have a 6-month window to act before the compound effect makes updates impossible.&lt;/p&gt;

&lt;p&gt;Start with a dependency audit today: run &lt;code&gt;npm outdated&lt;/code&gt;, then compare your current version against the latest React Native release in the React Native Upgrade Helper. Count your deprecation warnings. Measure your build times.&lt;/p&gt;

&lt;p&gt;The next 18 months will either cost you hundreds of thousands or hundreds of hours.&lt;/p&gt;

&lt;p&gt;If this sounds like your situation, don't wait - reach out today. I'm currently accepting React Native modernization audits for fall 2025, and the earlier we catch these issues, the more options you have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't become another emergency rebuild story.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;This article kicks off my React Native Foundations series, where I'll cover the "what," the "why," and the "how" of maintaining a healthy React Native ecosystem that extends the practical life of all your applications. Today covered the "what" - the reality you're facing with your apps right now.&lt;/p&gt;

&lt;p&gt;Next comes the "why": &lt;strong&gt;Foundations II: Upgrade or Perish&lt;/strong&gt;, a four-part deep dive into why stakeholders face only one real decision - when to plan the upgrade or when to decommission the app. Wait long enough, and the app stores will make that choice for you.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About Michael:&lt;/strong&gt; I've been building cross-platform mobile apps since 2011, starting with Titanium SDK and going all-in on React Native in 2018. Seven years and 12+ modernization projects later, I've helped companies including Sam's Club and Bluecrew avoid hundreds of thousands of dollars in rebuild costs by catching technical debt before it becomes a crisis. I specialize in rescuing legacy React Native applications and establishing sustainable development practices that prevent future emergencies.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://www.linkedin.com/in/refactory" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or learn more about my services at &lt;a href="https://refactory.carrd.co/" rel="noopener noreferrer"&gt;Refactory&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://medium.com/@michaelstelly/your-react-native-app-has-18-months-to-live-fe18e242ec5d" rel="noopener noreferrer"&gt;Medium&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>mobile</category>
      <category>javascript</category>
      <category>techdebt</category>
    </item>
    <item>
      <title>React Native Version Matrix: The Hidden Upgrade Path</title>
      <dc:creator>Michael Stelly</dc:creator>
      <pubDate>Sun, 21 Sep 2025 22:47:43 +0000</pubDate>
      <link>https://forem.com/refactory/react-native-version-matrix-the-hidden-upgrade-path-1p3m</link>
      <guid>https://forem.com/refactory/react-native-version-matrix-the-hidden-upgrade-path-1p3m</guid>
      <description>&lt;p&gt;&lt;em&gt;Part 1 of 4: Why "Simple" Upgrades Become Multi-Week Migrations&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I got a call from a manager whose React Native app was facing imminent removal from both Play and Apple app stores. His team had no mobile experience, and they were desperate. Within 20 minutes of our conversation, I knew this wasn't an upgrade problem—it was an archaeological dig.&lt;/p&gt;

&lt;p&gt;The Bluecrew app was running React Native 0.61, which had been released in September 2019. Four years later, in 2023, they weren't just behind; they were running a museum piece. After reviewing their codebase, I had to deliver news no manager wants to hear: this wasn't going to be an update. It was going to be a complete rewrite. The majority of their npm libraries were either outdated or completely abandoned.&lt;/p&gt;

&lt;p&gt;As Tom, their manager, later wrote: "I took over a team that had a React Native project that was desperately out of date and at risk of immediate removal from Play and Apple app stores... Mike helped us rewrite the code base entirely."&lt;/p&gt;

&lt;p&gt;The tragedy? Six weeks of systematic quarterly maintenance could have prevented the entire crisis. Instead, they needed a complete application rebuild that consumed weeks and required outside expertise.&lt;/p&gt;

&lt;p&gt;Using my upgrade complexity formula—&lt;strong&gt;Upgrade Difficulty = (Version Gap × Architectural Changes) × Dependency Decay Rate&lt;/strong&gt;—Bluecrew scored 47. Anything over 30 signals a rewrite candidate. They weren't upgrading; they were performing archaeology.&lt;/p&gt;
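&lt;p&gt;The formula is simple enough to sketch in code. The version below is only an illustration: the field names and the units suggested in the comments are assumptions, and the example inputs are hypothetical rather than Bluecrew's actual numbers.&lt;/p&gt;

```typescript
// Inputs to the upgrade difficulty formula. The units in the comments are
// assumptions; pick whatever measures you can apply consistently.
interface UpgradeInputs {
  versionGap: number;           // e.g. minor versions behind
  architecturalChanges: number; // e.g. version walls to cross
  dependencyDecayRate: number;  // e.g. share of unmaintained dependencies, 0 to 1
}

// "Anything over 30 signals a rewrite candidate."
const REWRITE_THRESHOLD = 30;

// Upgrade Difficulty = (Version Gap x Architectural Changes) x Dependency Decay Rate
function upgradeDifficulty(i: UpgradeInputs): number {
  return i.versionGap * i.architecturalChanges * i.dependencyDecayRate;
}

function isRewriteCandidate(i: UpgradeInputs): boolean {
  return upgradeDifficulty(i) > REWRITE_THRESHOLD;
}

// Hypothetical inputs chosen for illustration only.
const example = { versionGap: 10, architecturalChanges: 2, dependencyDecayRate: 0.5 };
console.log(upgradeDifficulty(example), isRewriteCandidate(example)); // 10 false
```

&lt;p&gt;Whatever units you choose, keep them the same across apps so the scores stay comparable against the threshold.&lt;/p&gt;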

&lt;p&gt;After leading React Native migrations at Sam's Club in 2022 and Bluecrew in 2023, and analyzing 12 enterprise React Native apps over seven years, I've found that React Native upgrades aren't linear progressions. They're a web of interdependencies where skipping the wrong version triggers cascade failures the changelog never mentions.&lt;/p&gt;

&lt;p&gt;I've mapped these breaking changes into a predictable matrix. Four critical version walls determine whether your upgrade takes days or months. Position your app correctly, and updates become routine. Miss the pattern, and you'll find yourself explaining to stakeholders why your "simple update" has turned into a feature freeze.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pattern Behind the Chaos
&lt;/h2&gt;

&lt;p&gt;Every React Native release contains fracture points—changes that break not just your code, but the entire ecosystem around it. The changelog calls them "improvements." Experienced developers know them as migration projects.&lt;/p&gt;

&lt;p&gt;Take a seemingly innocent changelog entry: "Migrated to AndroidX." Sounds simple enough. But experienced React Native developers know this phrase signals an ecosystem-wide fracture where every Android dependency must choose sides, build times double, and apps that compile perfectly crash on user devices due to reflection errors in native code.&lt;/p&gt;

&lt;p&gt;Or consider: "New Architecture available." What this actually delivered was two completely different architectures running simultaneously in the same app, requiring C++ expertise for modules that previously needed simple Java annotations, and event timing changes that broke carefully tuned animations.&lt;/p&gt;

&lt;p&gt;The most insidious: "Packages moved to @react-native scope." A "simple" namespacing change that broke every import in your codebase, with no predictable pattern for where packages relocated.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cascade Effect at Enterprise Scale
&lt;/h2&gt;

&lt;p&gt;Enterprise React Native upgrades reveal how these fractures compound across complex codebases. What looks like a routine version bump becomes a multi-week emergency project.&lt;/p&gt;

&lt;p&gt;One "simple" upgrade typically cascades into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Library migration and compatibility testing (3-5 days)&lt;/li&gt;
&lt;li&gt;Architecture rewrites with minimal documentation (5-7 days)&lt;/li&gt;
&lt;li&gt;Build configuration modernization (2-3 days)&lt;/li&gt;
&lt;li&gt;Platform-specific conflicts requiring specialized expertise (1-2 days)&lt;/li&gt;
&lt;li&gt;Dependency debugging with cryptic native errors (3-4 days)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each fix reveals two more problems. Updating one library breaks your upload flow. Fixing navigation breaks deep linking. Every dependency touches its own ecosystem of subdependencies with their own compatibility requirements.&lt;/p&gt;

&lt;p&gt;The result: weeks explaining to stakeholders why your "simple update" has turned into a feature freeze, while your supposedly stable app accumulates technical debt that compounds exponentially.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Four Immutable Rules
&lt;/h2&gt;

&lt;p&gt;After mapping breaking changes across 12 enterprise migrations over seven years, four patterns emerged that govern every React Native upgrade:&lt;/p&gt;

&lt;h3&gt;
  
  
  Rule 1: Walls Don't Move
&lt;/h3&gt;

&lt;p&gt;AndroidX will always live at 0.60. The New Architecture divide stays at 0.68-0.70. Package reorganization happened at 0.72. These are geological layers in React Native's history that mark fundamental shifts in how the platform works. You can't skip them, only cross them.&lt;/p&gt;
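&lt;p&gt;To make the idea of crossing walls concrete, here is a minimal sketch that reports which of the walls named above sit between your current version and your target. The helper name &lt;code&gt;wallsCrossed&lt;/code&gt; is hypothetical, and treating the New Architecture divide as starting at 0.68 is a simplification of the 0.68-0.70 range.&lt;/p&gt;

```typescript
// The three walls named in the text, keyed by the minor version where each lands.
// The New Architecture divide spans 0.68-0.70; 68 is used as its starting point.
const WALLS = [
  { minor: 60, name: "AndroidX migration" },
  { minor: 68, name: "New Architecture divide" },
  { minor: 72, name: "Package reorganization" },
];

// Returns the walls an upgrade from 0.fromMinor to 0.toMinor has to cross:
// every wall above the starting version and at or below the target.
function wallsCrossed(fromMinor: number, toMinor: number): string[] {
  return WALLS
    .filter((w) => w.minor > fromMinor)
    .filter((w) => toMinor >= w.minor)
    .map((w) => w.name);
}

console.log(wallsCrossed(61, 81));
// [ 'New Architecture divide', 'Package reorganization' ]
```

&lt;p&gt;An empty result suggests a routine upgrade. Two or more walls in one jump is a hint to plan for a migration project rather than a version bump.&lt;/p&gt;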

&lt;h3&gt;
  
  
  Rule 2: Time Compounds Everything
&lt;/h3&gt;

&lt;p&gt;A month-old React Native app requires 2 hours of updates. A year-old app requires 2 weeks. An 18-month-old app requires 2 months. The math is exponential because dependencies diverge, libraries get abandoned, and breaking changes accumulate in ways that can't be fixed incrementally.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rule 3: Dependencies Determine Destiny
&lt;/h3&gt;

&lt;p&gt;Your upgrade path isn't about React Native—it's about your slowest dependency. One abandoned package locks your entire app at an old version. That camera library pinned to React Native 0.58? Your entire app is now a 0.58 app, regardless of what version you think you're running.&lt;/p&gt;
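&lt;p&gt;One way to picture this rule: your app's real ceiling is the lowest ceiling among its dependencies. The sketch below assumes you have already worked out the highest React Native minor each package supports (from peer dependency ranges, release notes, or testing). The type, field names, and package names are all hypothetical.&lt;/p&gt;

```typescript
// Highest React Native minor each dependency is known to support.
// How this number is obtained is left open; the field name is hypothetical.
interface DependencyCeiling {
  name: string;
  maxSupportedMinor: number;
}

// Returns the dependency that caps the whole app, or undefined for an empty list.
function limitingDependency(deps: DependencyCeiling[]): DependencyCeiling | undefined {
  if (deps.length === 0) return undefined;
  // Keep whichever dependency has the lower supported minor at each step.
  return deps.reduce((lowest, d) =>
    lowest.maxSupportedMinor > d.maxSupportedMinor ? d : lowest
  );
}

// Hypothetical packages for illustration.
const deps = [
  { name: "navigation-lib", maxSupportedMinor: 81 },
  { name: "camera-lib", maxSupportedMinor: 58 },
  { name: "storage-lib", maxSupportedMinor: 74 },
];
const limit = limitingDependency(deps);
console.log(limit ? limit.name + " caps the app at 0." + limit.maxSupportedMinor : "no dependencies");
// camera-lib caps the app at 0.58
```

&lt;p&gt;The package this returns is the one to replace, fork, or patch first, because nothing else in the upgrade can move until it does.&lt;/p&gt;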

&lt;h3&gt;
  
  
  Rule 4: Position Determines Difficulty
&lt;/h3&gt;

&lt;p&gt;Some React Native versions offer 12-18 months of stability where updates are routine maintenance. Others trap you in constant update cycles where every change cascades into architectural decisions. Smart teams position themselves just after major walls and stay there until the next stable zone appears.&lt;/p&gt;

&lt;h2&gt;
  
  
  Strategic Positioning Matters
&lt;/h2&gt;

&lt;p&gt;Every React Native app sits at a specific coordinate in this version matrix. Your position determines not just your current stability, but your future upgrade path and the complexity of every dependency decision.&lt;/p&gt;

&lt;p&gt;Most upgrade guides recommend incremental updates. After managing 12 enterprise migrations over seven years, I disagree. Sometimes a clean rewrite is faster and safer than trying to bridge a multi-year technical debt gap. Bluecrew proved this—their complete rebuild took less time than attempting an "incremental" migration across multiple version walls.&lt;/p&gt;

&lt;p&gt;The cascade effect isn't random—it follows predictable patterns. Understanding these patterns is the difference between routine maintenance and emergency projects that consume entire development cycles and team credibility.&lt;/p&gt;

&lt;p&gt;Your supposedly "stable" React Native app is built on abandoned npm packages, deprecated Android APIs, and native code written by people who've moved on to other companies. Every month you wait, the foundation shifts a little more.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Path Forward
&lt;/h2&gt;

&lt;p&gt;The React Native ecosystem fractures at four predictable points—version walls where the entire ecosystem breaks and requires complete migration strategies rather than simple updates. These walls don't move. They're geological layers that every app must eventually cross.&lt;/p&gt;

&lt;p&gt;In Part 2, I'll map each critical wall in detail—what breaks, why it breaks, and the specific technical decisions that determine whether crossing them takes days or weeks. Each wall has its own failure patterns and migration requirements. Understanding them lets you position your app strategically and plan upgrades that align with business reality.&lt;/p&gt;

&lt;p&gt;The walls are predictable, but only if you know what to look for. Most teams learn this the hard way—during emergency weekend deployments, explaining to stakeholders why the "simple update" has turned into a month-long project.&lt;/p&gt;

&lt;p&gt;You don't have to learn it the hard way.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is Part 1 of a 4-part series on navigating React Native upgrades. Follow me for Parts 2-4, where I'll detail the specific walls, cascade patterns, and decision frameworks that can save your team weeks of migration pain.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About Michael:&lt;/strong&gt; I’ve been building cross-platform mobile apps since 2011, starting with Titanium SDK and going all-in on React Native in 2018. Seven years and 12+ modernization projects later, I’ve helped companies including Sam’s Club and Bluecrew avoid hundreds of thousands of dollars in rebuild costs by catching technical debt before it becomes a crisis. I specialize in rescuing legacy React Native applications and establishing sustainable development practices that prevent future emergencies.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://www.linkedin.com/in/refactory" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or learn more about my services at &lt;a href="https://refactory.carrd.co/" rel="noopener noreferrer"&gt;Refactory&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>mobile</category>
      <category>javascript</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
