<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Anders Myrmel</title>
    <description>The latest articles on Forem by Anders Myrmel (@anders_myrmel_2bc87f4df06).</description>
    <link>https://forem.com/anders_myrmel_2bc87f4df06</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3648125%2F6c6e4fe8-e39b-483b-8ef6-fc148b3c4e7f.png</url>
      <title>Forem: Anders Myrmel</title>
      <link>https://forem.com/anders_myrmel_2bc87f4df06</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/anders_myrmel_2bc87f4df06"/>
    <language>en</language>
    <item>
      <title>How I Scrape 250,000 Shopify Stores Without Getting Blocked</title>
      <dc:creator>Anders Myrmel</dc:creator>
      <pubDate>Thu, 12 Mar 2026 11:06:43 +0000</pubDate>
      <link>https://forem.com/anders_myrmel_2bc87f4df06/how-i-scrape-250000-shopify-stores-without-getting-blocked-29f9</link>
      <guid>https://forem.com/anders_myrmel_2bc87f4df06/how-i-scrape-250000-shopify-stores-without-getting-blocked-29f9</guid>
      <description>&lt;p&gt;Right-click any Shopify store. View Source. You'll see &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tags from every app they've installed, a &lt;code&gt;Shopify.theme&lt;/code&gt; object with their exact theme, and tracking pixels from every ad platform they use. None of this is hidden.&lt;/p&gt;

&lt;p&gt;I wanted to scrape all of it across 250K stores. That's the detection engine behind &lt;a href="https://storeinspect.com" rel="noopener noreferrer"&gt;StoreInspect&lt;/a&gt;, where I map the Shopify ecosystem by scanning what stores run.&lt;/p&gt;

&lt;p&gt;This post is the technical walkthrough. What worked, what didn't, and the parts that took way longer than expected.&lt;/p&gt;

&lt;h2&gt;
  
  
  The stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Rebrowser-Puppeteer (not regular Puppeteer)&lt;/li&gt;
&lt;li&gt;PostgreSQL with JSONB snapshots&lt;/li&gt;
&lt;li&gt;Webshare proxies + Tailscale SOCKS5 tunnels&lt;/li&gt;
&lt;li&gt;Detection logic bundled as a string so it runs in both Puppeteer and a Chrome Extension&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why not regular Puppeteer
&lt;/h2&gt;

&lt;p&gt;Standard Puppeteer gets flagged immediately. Shopify itself doesn't block scrapers aggressively, but Cloudflare and bot detection on individual stores will. Rebrowser is a drop-in replacement that patches the &lt;code&gt;webdriver&lt;/code&gt; property and fixes the obvious fingerprinting leaks.&lt;/p&gt;

&lt;p&gt;I also set a real viewport (1920x1080), proper Chrome 131 user agent strings, and route through residential proxies. Nothing exotic. The bar for scraping Shopify stores is low because the data is public. You just need to not look like a bot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Page loading strategy
&lt;/h2&gt;

&lt;p&gt;First mistake I made: using &lt;code&gt;networkidle2&lt;/code&gt; as the wait condition. Shopify stores have analytics scripts, chat widgets, and ad pixels that fire continuously. &lt;code&gt;networkidle2&lt;/code&gt; resolves only after 500ms with no more than two in-flight network requests, and on busy storefronts that window sometimes never arrives.&lt;/p&gt;

&lt;p&gt;Switched to &lt;code&gt;domcontentloaded&lt;/code&gt; plus a flat 5-second delay:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;waitUntil&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;domcontentloaded&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 5 seconds lets lazy-loaded scripts, GTM tags, and deferred pixels fire. Not elegant, but it catches 95%+ of what I need.&lt;/p&gt;

&lt;p&gt;I also block images, fonts, and media via request interception. I only need the HTML and scripts. This cut page load times by about 60%.&lt;/p&gt;
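
&lt;p&gt;The interception rule itself is tiny. A sketch of the filter (the helper name is mine; the post only says images, fonts, and media are blocked):&lt;/p&gt;

```typescript
// Resource types detection never needs: only the HTML and scripts matter.
const BLOCKED_TYPES = new Set(['image', 'font', 'media']);

function shouldBlockRequest(resourceType: string): boolean {
  return BLOCKED_TYPES.has(resourceType);
}

// Wired into Puppeteer it would look roughly like:
//   await page.setRequestInterception(true);
//   page.on('request', req =>
//     shouldBlockRequest(req.resourceType()) ? req.abort() : req.continue());
```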

&lt;h2&gt;
  
  
  The four detection layers
&lt;/h2&gt;

&lt;p&gt;One method doesn't catch everything, so I use four:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Script URL matching
&lt;/h3&gt;

&lt;p&gt;Most Shopify apps inject a script tag with a recognizable domain. Klaviyo loads from &lt;code&gt;static.klaviyo.com&lt;/code&gt;. Judge.me loads from &lt;code&gt;judge.me&lt;/code&gt;. Yotpo loads from &lt;code&gt;staticw2.yotpo.com&lt;/code&gt;. Match the domain and you've identified the app.&lt;/p&gt;

&lt;p&gt;This is the most reliable method. About 70% of detections come from script URLs.&lt;/p&gt;
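
&lt;p&gt;In code, this layer is just a domain-to-app map. A minimal sketch using the three signatures above (the real map covers far more apps, and the helper name is mine):&lt;/p&gt;

```typescript
// A few example signatures from the post; the production map covers 180+ apps.
const SCRIPT_SIGNATURES: { [domain: string]: string } = {
  'static.klaviyo.com': 'Klaviyo',
  'judge.me': 'Judge.me',
  'staticw2.yotpo.com': 'Yotpo',
};

function detectAppsFromScripts(scriptUrls: string[]): string[] {
  const found: string[] = [];
  for (const url of scriptUrls) {
    let host = '';
    try {
      host = new URL(url).hostname;
    } catch {
      continue; // ignore malformed src attributes
    }
    for (const domain in SCRIPT_SIGNATURES) {
      // match the domain itself or any subdomain of it
      const hit = host === domain || host.endsWith('.' + domain);
      if (hit) {
        const app = SCRIPT_SIGNATURES[domain];
        if (!found.includes(app)) found.push(app);
      }
    }
  }
  return found;
}
```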

&lt;h3&gt;
  
  
  2. JavaScript globals
&lt;/h3&gt;

&lt;p&gt;Apps set window variables when they initialize. Klaviyo sets &lt;code&gt;window.klaviyo&lt;/code&gt; and &lt;code&gt;window._learnq&lt;/code&gt;. Gorgias sets &lt;code&gt;window.GorgiasChat&lt;/code&gt;. TikTok's pixel sets &lt;code&gt;window.ttq&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I check these with &lt;code&gt;page.evaluate()&lt;/code&gt; after the page loads. Useful as a second signal when script URLs are obfuscated or loaded through a tag manager.&lt;/p&gt;
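
&lt;p&gt;A sketch of that check as a pure function, so the same logic can be tested against a stub instead of a real &lt;code&gt;window&lt;/code&gt; (signature map abbreviated; names are mine):&lt;/p&gt;

```typescript
// Example global-variable signatures from the post; the real map is larger.
const GLOBAL_SIGNATURES: { [app: string]: string[] } = {
  Klaviyo: ['klaviyo', '_learnq'],
  Gorgias: ['GorgiasChat'],
  'TikTok Pixel': ['ttq'],
};

// `win` is whatever object plays the role of `window`: the real one inside
// page.evaluate(), a plain stub object in tests.
function detectAppsFromGlobals(win: object): string[] {
  const found: string[] = [];
  for (const app in GLOBAL_SIGNATURES) {
    const names = GLOBAL_SIGNATURES[app];
    if (names.some(n => n in win)) found.push(app);
  }
  return found;
}
```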

&lt;h3&gt;
  
  
  3. DOM elements
&lt;/h3&gt;

&lt;p&gt;Some apps only inject UI. A chat bubble, a reviews widget, a popup form. CSS selectors like &lt;code&gt;.jdgm-widget&lt;/code&gt; (Judge.me) or &lt;code&gt;[data-yotpo-instance-id]&lt;/code&gt; (Yotpo) catch these.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Theme App Extension blocks
&lt;/h3&gt;

&lt;p&gt;This one took me a while to find. Shopify's Online Store 2.0 lets apps inject server-rendered blocks, and Shopify wraps them in HTML comments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- BEGIN app block: shopify://apps/judge-me-reviews/blocks/preview_badge/... --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are gold. They map directly to the app's Shopify App Store slug. A store can have zero client-side scripts from an app but still have its Theme App Extension block in the HTML. I maintain a map from Shopify slugs to app IDs to catch these.&lt;/p&gt;
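
&lt;p&gt;Extracting the slug is a one-regex job. A sketch, assuming slugs are lowercase letters, digits, and hyphens (the function name and that character class are my assumptions):&lt;/p&gt;

```typescript
// Theme App Extension blocks are wrapped in HTML comments of the form:
//   BEGIN app block: shopify://apps/judge-me-reviews/blocks/...
// The App Store slug is the first path segment after /apps/.
const APP_BLOCK_RE = /BEGIN app block: shopify:\/\/apps\/([a-z0-9-]+)/g;

function extractAppBlockSlugs(html: string): string[] {
  const slugs: string[] = [];
  for (const m of html.matchAll(APP_BLOCK_RE)) {
    if (!slugs.includes(m[1])) slugs.push(m[1]); // dedupe repeated blocks
  }
  return slugs;
}
```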

&lt;h2&gt;
  
  
  The cookie consent problem
&lt;/h2&gt;

&lt;p&gt;This one cost me a week. I was getting pixel detection rates way below what I expected. Meta Pixel was showing up on maybe 40% of stores when industry benchmarks say 50%+.&lt;/p&gt;

&lt;p&gt;The problem: cookie consent managers. OneTrust, Cookiebot, and similar tools block ad pixels from loading until the user clicks "Accept." My scraper never clicks accept, so the pixels never fire.&lt;/p&gt;

&lt;p&gt;Fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;onetrust-accept-btn-handler&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button[id*="accept"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Click accept, wait 8 seconds for GTM to process and load the blocked tags. Pixel detection accuracy went from roughly 60% to 95%.&lt;/p&gt;

&lt;p&gt;The 8-second wait feels long but it's necessary. GTM doesn't fire tags synchronously after consent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Detection bundle architecture
&lt;/h2&gt;

&lt;p&gt;I wanted the same detection code in both the Puppeteer scraper (server-side) and a Chrome Extension (client-side). The signatures for 180+ apps, 40+ pixels, and theme detection logic need to be identical.&lt;/p&gt;

&lt;p&gt;The solution: bundle everything into a single self-executing function string.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;detectionScript&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`(function() {
  const APP_SIGNATURES = { /* 180 apps */ };
  const PIXEL_SIGNATURES = { /* 40 pixels */ };
  // ... detection logic
  return { isShopify, theme, apps, pixels };
})()`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Puppeteer&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;detectionScript&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Chrome Extension (content script)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;detectionScript&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One source of truth. When I add a new app signature, both the scraper and extension pick it up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Storing results in JSONB
&lt;/h2&gt;

&lt;p&gt;Each scrape produces a snapshot stored as JSONB in PostgreSQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;store_snapshots&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;store_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;apps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;jsonb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;// [{name, category, detected_via}]&lt;/span&gt;
  &lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;jsonb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// [{name, type, pixel_id}]&lt;/span&gt;
  &lt;span class="nx"&gt;theme&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;jsonb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;// {name, type, author}&lt;/span&gt;
  &lt;span class="nx"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;jsonb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// {product_count, traffic_tier, ...}&lt;/span&gt;
  &lt;span class="nx"&gt;snapshot_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;timestamp&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why JSONB instead of normalized tables? The schema changes constantly. Every time I add a new detection field or metric, I don't want to run a migration. JSONB lets me evolve the structure without downtime.&lt;/p&gt;

&lt;p&gt;I also keep denormalized counts on the main &lt;code&gt;stores&lt;/code&gt; table (&lt;code&gt;app_count&lt;/code&gt;, &lt;code&gt;pixel_count&lt;/code&gt;, &lt;code&gt;theme_name&lt;/code&gt;) for fast filtering and sorting. The snapshots are for historical comparison.&lt;/p&gt;

&lt;h2&gt;
  
  
  Error handling at scale
&lt;/h2&gt;

&lt;p&gt;At 250K stores, every edge case happens. Password-protected stores return a &lt;code&gt;/password&lt;/code&gt; redirect. Stores with lapsed billing return 402. Dead stores return 404. Some stores infinite-redirect between &lt;code&gt;www&lt;/code&gt; and non-&lt;code&gt;www&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I classify errors into retryable and non-retryable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Don't retry these&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dns_not_found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;   &lt;span class="c1"&gt;// Domain doesn't resolve&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ssl_error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;       &lt;span class="c1"&gt;// Cert problems&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;store_closed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;    &lt;span class="c1"&gt;// 402 Payment Required&lt;/span&gt;

&lt;span class="c1"&gt;// Retry with backoff&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;timeout&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;         &lt;span class="c1"&gt;// Might be temporary&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;blocked&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;         &lt;span class="c1"&gt;// Try different proxy&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;network_error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;   &lt;span class="c1"&gt;// Transient&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;DNS failures and SSL errors get marked and skipped permanently. Timeouts and blocks get retried with proxy rotation. This keeps the scraper from wasting cycles on stores that will never respond.&lt;/p&gt;
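
&lt;p&gt;A sketch of the retry decision (the backoff base, cap, and attempt limit are my illustrative numbers, not from the post):&lt;/p&gt;

```typescript
// Error classes from the post; anything unknown is treated as retryable.
const PERMANENT = new Set(['dns_not_found', 'ssl_error', 'store_closed']);

function nextAction(errorCode: string, attempt: number) {
  if (PERMANENT.has(errorCode)) {
    return { retry: false, delayMs: 0 }; // mark and skip permanently
  }
  // Retryable errors back off exponentially: 1s, 2s, 4s, ... capped at 60s.
  const delayMs = Math.min(1000 * 2 ** attempt, 60000);
  const maxAttempts = 4;
  return { retry: maxAttempts > attempt, delayMs };
}
```

<p>For the `blocked` class, the retry would also rotate to a fresh proxy before the next attempt.</p>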

&lt;h2&gt;
  
  
  Regional domain deduplication
&lt;/h2&gt;

&lt;p&gt;Brands like &lt;code&gt;gymshark.com&lt;/code&gt; and &lt;code&gt;gymshark.co.uk&lt;/code&gt; are the same store. Without deduplication, they'd appear as separate entries with identical tech stacks.&lt;/p&gt;

&lt;p&gt;I check regional TLDs (&lt;code&gt;.co.uk&lt;/code&gt;, &lt;code&gt;.com.au&lt;/code&gt;, &lt;code&gt;.de&lt;/code&gt;, &lt;code&gt;.fr&lt;/code&gt;, etc.) against the &lt;code&gt;.com&lt;/code&gt; version. If both exist, I skip the regional variant. Simple, but it prevented thousands of duplicates.&lt;/p&gt;
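
&lt;p&gt;A sketch of the canonicalization step (TLD list abbreviated; the helper name is mine). It maps a regional domain to its would-be &lt;code&gt;.com&lt;/code&gt; twin, which the pipeline then checks for existence before skipping the variant:&lt;/p&gt;

```typescript
// Regional TLDs checked against the .com version; subset for illustration.
const REGIONAL_TLDS = ['.co.uk', '.com.au', '.de', '.fr'];

// Returns the .com domain this regional domain would duplicate, or null
// if the domain is not a recognized regional variant.
function comEquivalent(domain: string): string | null {
  for (const tld of REGIONAL_TLDS) {
    if (domain.endsWith(tld)) {
      return domain.slice(0, domain.length - tld.length) + '.com';
    }
  }
  return null;
}
```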

&lt;h2&gt;
  
  
  What I learned about Shopify's ecosystem
&lt;/h2&gt;

&lt;p&gt;After scanning 250K stores: the median store runs 1 app (usually Shop Pay) and 4 pixels (usually Shopify's own pixel plus GA4). 59% have no email marketing tool. 78% have no reviews app. For most stores, the app ecosystem goes almost untouched.&lt;/p&gt;

&lt;p&gt;The detection engine is available as a &lt;a href="https://storeinspect.com/extension" rel="noopener noreferrer"&gt;free Chrome extension&lt;/a&gt; if you want to try it on individual stores.&lt;/p&gt;




&lt;p&gt;The full dataset updates daily at &lt;a href="https://storeinspect.com/report/state-of-shopify" rel="noopener noreferrer"&gt;storeinspect.com/report/state-of-shopify&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you have questions about the scraping setup or detection patterns, drop them in the comments.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>webscraping</category>
      <category>shopify</category>
      <category>javascript</category>
    </item>
    <item>
      <title>How I Built Two Data Tools for Ecommerce: Store Inspector and ProductLair</title>
      <dc:creator>Anders Myrmel</dc:creator>
      <pubDate>Fri, 05 Dec 2025 14:43:56 +0000</pubDate>
      <link>https://forem.com/anders_myrmel_2bc87f4df06/how-i-built-two-data-tools-for-ecommerce-store-inspector-and-productlair-2l04</link>
      <guid>https://forem.com/anders_myrmel_2bc87f4df06/how-i-built-two-data-tools-for-ecommerce-store-inspector-and-productlair-2l04</guid>
      <description>&lt;p&gt;Over the last year I have been building two tools for ecommerce researchers and founders who want faster insights without spending hours digging through apps, themes, ads, or product signals. The tools are Store Inspector and ProductLair, and both came from my own frustration trying to understand what actually makes a store or product perform well.&lt;/p&gt;

&lt;p&gt;This post explains the technical thinking behind both projects and some of the challenges I ran into while building them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Store Inspector: Browser-First Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://storeinspect.com" rel="noopener noreferrer"&gt;Store Inspector&lt;/a&gt; is a free Chrome extension that analyzes any Shopify store. It detects themes, apps, pixels, and a basic lead score. Most tools rely on servers to scrape stores, which is slow and often blocked. I wanted something that runs fully in the browser to avoid rate limits and speed issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How detection works&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load the page HTML directly from the active tab&lt;/li&gt;
&lt;li&gt;Parse scripts, linked files, and HTML identifiers&lt;/li&gt;
&lt;li&gt;Match patterns for themes, apps, and pixels&lt;/li&gt;
&lt;li&gt;Score based on tech stack, tracking depth, and store structure&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Everything happens locally. Nothing is sent to my servers unless the user opens the detailed view. This makes detection extremely fast and keeps privacy simple.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Challenges&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Some apps hide their scripts behind dynamic loaders&lt;/li&gt;
&lt;li&gt;Themes often change naming conventions&lt;/li&gt;
&lt;li&gt;Pixels can appear under custom wrappers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I solved these problems by building a pattern matcher that is constantly updated and by using weighted scoring instead of binary detection.&lt;/p&gt;
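
&lt;p&gt;A sketch of what weighted scoring means here (weights, signal names, and threshold are invented for illustration; the real matcher is more involved):&lt;/p&gt;

```typescript
// Each detection signal contributes a weight instead of a hard yes/no, so a
// single weak signal (a DOM class a theme might reuse) is not enough alone.
const SIGNALS = [
  { id: 'script_url', weight: 0.6 },   // strongest signal
  { id: 'global_var', weight: 0.3 },
  { id: 'dom_element', weight: 0.2 },  // weakest, easily a false positive
];

function detectionConfidence(matched: string[]): number {
  let score = 0;
  for (const s of SIGNALS) {
    if (matched.includes(s.id)) score += s.weight;
  }
  return Math.min(score, 1);
}

// An app counts as "detected" above some threshold, e.g. 0.5.
```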

&lt;h2&gt;
  
  
  ProductLair: Data Consolidation for Product Research
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://productlair.com" rel="noopener noreferrer"&gt;ProductLair&lt;/a&gt; focuses on the product side rather than the store side. The goal is to collect signals from social platforms, ads, stores, and search trends and combine them into a single product profile.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech stack behind ProductLair&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Next.js for the frontend&lt;/li&gt;
&lt;li&gt;Supabase as the database and auth layer&lt;/li&gt;
&lt;li&gt;Serverless functions for scraping and enrichment&lt;/li&gt;
&lt;li&gt;A lightweight scoring engine for market assessment&lt;/li&gt;
&lt;li&gt;Recharts for performance summaries&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Why build it&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There are many product finder tools, but almost all of them are either outdated, low quality, or only work on TikTok. I wanted something that gives real analysis, not random product dumps. That meant manually curating products, building comparison tools, and designing an interface that feels fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interesting issues I ran into&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TikTok creative links expire faster than expected&lt;/li&gt;
&lt;li&gt;Store data needed normalization because everyone structures their titles differently&lt;/li&gt;
&lt;li&gt;Search trend scoring required smoothing to avoid spikes from small regions&lt;/li&gt;
&lt;/ul&gt;
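
&lt;p&gt;The trend-smoothing step can be as simple as a trailing moving average, so a one-day spike from a small region doesn't dominate the score (window size and method are my assumptions; the post doesn't specify them):&lt;/p&gt;

```typescript
// Trailing moving average over a daily search-trend series. Each point
// becomes the mean of itself and up to (window - 1) preceding points.
function smooth(series: number[], window: number): number[] {
  return series.map((_, i) => {
    const start = Math.max(0, i - window + 1);
    const slice = series.slice(start, i + 1);
    const sum = slice.reduce((a, b) => a + b, 0);
    return sum / slice.length;
  });
}
```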

&lt;h2&gt;
  
  
  What I learned building both tools
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Browser-based detection is far faster than remote scraping for anything involving Shopify&lt;/li&gt;
&lt;li&gt;Data is useless without normalization and scoring&lt;/li&gt;
&lt;li&gt;Users prefer one-click insights over dashboards with too many metrics&lt;/li&gt;
&lt;li&gt;Speed matters more than features for research tools&lt;/li&gt;
&lt;li&gt;Clear visual hierarchy makes or breaks a research page&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is next
&lt;/h2&gt;

&lt;p&gt;I am currently working on automated tracking for Store Inspector and deeper funnel metrics for ProductLair. Both tools are early but they already help thousands of users research faster.&lt;/p&gt;

&lt;p&gt;If you are working on ecommerce related data tools, browser extensions, or store analysis systems, I would love to hear what you are building.&lt;/p&gt;

</description>
      <category>saas</category>
      <category>webdev</category>
      <category>ecommerce</category>
      <category>analytics</category>
    </item>
  </channel>
</rss>
