<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Junne欧阳</title>
    <description>The latest articles on Forem by Junne欧阳 (@junneoyang).</description>
    <link>https://forem.com/junneoyang</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3915665%2Ff764469e-15f3-4076-8c0c-e649bbdab5c6.jpeg</url>
      <title>Forem: Junne欧阳</title>
      <link>https://forem.com/junneoyang</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/junneoyang"/>
    <language>en</language>
    <item>
      <title>I Scraped 32 Real Google SERPs to Validate My AI's Competitive Analysis. Here's What I Found.</title>
      <dc:creator>Junne欧阳</dc:creator>
      <pubDate>Fri, 08 May 2026 09:28:22 +0000</pubDate>
      <link>https://forem.com/junneoyang/i-scraped-32-real-google-serps-to-validate-my-ais-competitive-analysis-heres-what-i-found-25ig</link>
      <guid>https://forem.com/junneoyang/i-scraped-32-real-google-serps-to-validate-my-ais-competitive-analysis-heres-what-i-found-25ig</guid>
      <description>&lt;p&gt;I'm building &lt;a href="https://dance-ai.xyz/" rel="noopener noreferrer"&gt;ShenBi AI&lt;/a&gt; — an AI tool that turns Chinese short-video links (Douyin, Xiaohongshu, Kuaishou) into structured transcripts and rewrite-ready scripts. As a solo founder doing SEO myself, I needed a competitive analysis: who's ranking for &lt;code&gt;douyin transcript&lt;/code&gt; and similar long-tail queries?&lt;/p&gt;

&lt;p&gt;I asked Claude. It gave me a clean "top 10" list — ScreenApp, BibiGPT, Apify, and so on. Easy. I trusted it.&lt;/p&gt;

&lt;p&gt;Then a small voice asked: &lt;strong&gt;is this actually what Google shows real users?&lt;/strong&gt; Or is the AI returning some aggregator data (Bing? Brave? Custom Search API?) and calling it Google?&lt;/p&gt;

&lt;p&gt;I decided to find out the hard way: scrape &lt;strong&gt;32 real Google SERPs&lt;/strong&gt; across 4 region/device combinations. Here's the build, and what I found.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;AI search tools (Claude/GPT WebSearch) returned a "top 10" that &lt;strong&gt;diverged significantly&lt;/strong&gt; from what real Google actually shows.&lt;/li&gt;
&lt;li&gt;I missed &lt;strong&gt;3 major competitors entirely&lt;/strong&gt; (&lt;code&gt;turboscribe.ai&lt;/code&gt;, &lt;code&gt;aitodo.co&lt;/code&gt;, &lt;code&gt;stt.ai&lt;/code&gt;) that dominate real SERPs but didn't appear in the AI's results.&lt;/li&gt;
&lt;li&gt;The real SERP's top result &lt;strong&gt;changes by country&lt;/strong&gt; for the same keyword: for &lt;code&gt;xiaohongshu transcript generator&lt;/code&gt;, India's top 1 is Kapwing and the US's top 1 is Aitodo.&lt;/li&gt;
&lt;li&gt;AI Overview triggers in &lt;strong&gt;11 of 32&lt;/strong&gt; SERPs — asymmetric impact across regions and queries.&lt;/li&gt;
&lt;li&gt;The whole pipeline runs in ~7 minutes per quarterly snapshot. Code below.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why "AI WebSearch ≠ real Google SERP"
&lt;/h2&gt;

&lt;p&gt;When you ask Claude or ChatGPT to "search Google for X", you don't get a real Google SERP. You get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A processed/aggregated response from whatever search API the AI is wired to&lt;/li&gt;
&lt;li&gt;Snippets and titles that may be cached or rewritten&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No SERP features&lt;/strong&gt;: no AI Overview, no People Also Ask, no Discussions panels, no video carousels&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No regional variance&lt;/strong&gt;: usually a US-centric default&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For broad questions ("what's the capital of France"), this is fine. For SEO competitive analysis where you need to know &lt;strong&gt;what your competitors look like in real SERPs&lt;/strong&gt;, it's not enough.&lt;/p&gt;

&lt;p&gt;I had to scrape real Google.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Anti-Bot Problem (and Why Plain Curl Fails)
&lt;/h2&gt;

&lt;p&gt;You can't just &lt;code&gt;curl https://google.com/search?q=...&lt;/code&gt; from a script. Google flags the request in milliseconds based on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;TLS / HTTP/2 fingerprint&lt;/strong&gt; — &lt;code&gt;curl&lt;/code&gt; and &lt;code&gt;requests&lt;/code&gt; look nothing like real browsers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JavaScript challenge&lt;/strong&gt; — most SERP content is JS-rendered; static HTML is mostly an empty shell&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IP reputation&lt;/strong&gt; — datacenter IPs (AWS / GCP / Oracle Cloud / etc.) are flagged on sight&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After two &lt;code&gt;/sorry/&lt;/code&gt; redirects in 30 seconds, I knew: I needed real browser automation.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;p&gt;I ended up with this combination, &lt;strong&gt;all free except for a $5/month Clash subscription&lt;/strong&gt; (which I had anyway for general work):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CloakBrowser&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Chromium patched at the C++ source level to mask Canvas/WebGL/font/timezone fingerprinting. Drop-in Playwright API. Free.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Clash Verge&lt;/strong&gt; (with API enabled)&lt;/td&gt;
&lt;td&gt;Switch exit nodes by country (India, US, Hong Kong) for region-specific SERPs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Python + Playwright sync API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The collector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Persistent BrowserContext + manual captcha&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;First captcha per IP solved by hand → Google issues &lt;code&gt;GOOGLE_ABUSE_EXEMPTION&lt;/code&gt; cookie → subsequent queries skip the challenge&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Critical detail: &lt;strong&gt;I run a desktop UA on datacenter IPs, not a mobile one&lt;/strong&gt;. Why? A mobile UA on a datacenter IP is a contradiction Google flags instantly (mobile users don't browse from Oracle Cloud). A desktop UA passes the smell test.&lt;/p&gt;
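
&lt;p&gt;For the node switching, Clash-family cores expose a RESTful external controller where a selector group is changed with &lt;code&gt;PUT /proxies/{group}&lt;/code&gt; and a JSON body naming the node. A minimal sketch of building that request follows. The controller address, secret, group name, and node name are placeholders (assumptions, not values from this project), and &lt;code&gt;build_switch_request&lt;/code&gt; is a hypothetical helper that only constructs the request so it can be inspected before anything is sent.&lt;/p&gt;

```python
import json
import urllib.request
from urllib.parse import quote


def build_switch_request(controller, secret, group, node):
    """Build (but do not send) a request that selects `node` in proxy group `group`.

    All four arguments are placeholders for illustration; use the values
    from your own Clash configuration.
    """
    url = f"{controller.rstrip('/')}/proxies/{quote(group, safe='')}"
    body = json.dumps({"name": node}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {secret}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Hypothetical controller address, secret, group, and node names.
req = build_switch_request("http://127.0.0.1:9090", "my-secret", "Proxy", "India 01")
# Sending it would be: urllib.request.urlopen(req, timeout=10)
```

&lt;p&gt;After a switch it is worth confirming the exit IP's country before scraping, since a node's label and its actual location can differ.&lt;/p&gt;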




&lt;h2&gt;
  
  
  The Collector (simplified)
&lt;/h2&gt;

&lt;p&gt;The full collector is ~600 lines (region grouping, state recovery, scroll simulation, captcha pause), but the core flow is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;cloakbrowser&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;launch_persistent_context&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Switch Clash exit node to target country
&lt;/span&gt;&lt;span class="n"&gt;clash&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;switch_to_country&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# India
&lt;/span&gt;&lt;span class="n"&gt;clash&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify_ip_country&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_wait&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# confirm exit IP is actually India
&lt;/span&gt;
&lt;span class="c1"&gt;# 2. Launch browser with persistent profile (cookie cache survives captcha solve)
&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;launch_persistent_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;user_data_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./profile_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DESKTOP_UA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;viewport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;width&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1920&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;height&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1080&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;humanize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# CloakBrowser injects realistic mouse/click timing
&lt;/span&gt;    &lt;span class="n"&gt;locale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en-US&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Asia/Kolkata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Navigate
&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new_page&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.google.com/search?q=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;gl=in&amp;amp;hl=en&amp;amp;num=20&amp;amp;pws=0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait_for_timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# JS render
&lt;/span&gt;
&lt;span class="c1"&gt;# 4. Detect captcha
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/sorry/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;⚠️  Captcha triggered. Solve in browser, press Enter to retry...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# retry, cookie now valid
&lt;/span&gt;
&lt;span class="c1"&gt;# 5. Trigger lazy-load (scroll to bottom, then back)
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mouse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wheel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait_for_timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;() =&amp;gt; window.scrollTo(0, 0)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 6. Snapshot: HTML + full-page screenshot + metadata
&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;png&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;screenshot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;full_page&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# ... write to disk with sha256 hash for audit trail
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
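
&lt;p&gt;Step 3 interpolates the query straight into the URL, which only works if the keyword is already URL-encoded. A minimal sketch of building the same URL with the standard library, plus a stricter version of the step 4 check, could look like this. &lt;code&gt;build_search_url&lt;/code&gt; and &lt;code&gt;is_blocked&lt;/code&gt; are hypothetical helper names; the parameters mirror the ones in the snippet above (&lt;code&gt;gl&lt;/code&gt;, &lt;code&gt;hl&lt;/code&gt;, &lt;code&gt;num&lt;/code&gt;, &lt;code&gt;pws&lt;/code&gt;).&lt;/p&gt;

```python
from urllib.parse import urlencode, urlsplit


def build_search_url(query, country, lang="en", num=20):
    """Return a Google search URL with the query safely URL-encoded."""
    params = {"q": query, "gl": country, "hl": lang, "num": num, "pws": 0}
    return "https://www.google.com/search?" + urlencode(params)


def is_blocked(url):
    """True if the browser landed on Google's /sorry/ captcha page."""
    return urlsplit(url).path.startswith("/sorry/")


url = build_search_url("douyin transcript", "in")
```

&lt;p&gt;Checking the URL path rather than the whole string avoids a false positive if the text &lt;code&gt;/sorry/&lt;/code&gt; ever shows up inside a query parameter.&lt;/p&gt;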



&lt;p&gt;Then I parse the HTML with BeautifulSoup. Tip: &lt;strong&gt;mobile and desktop SERPs use different selectors&lt;/strong&gt; for organic results (&lt;code&gt;&amp;lt;h3&amp;gt;&lt;/code&gt; on desktop, &lt;code&gt;&amp;lt;div role="heading"&amp;gt;&lt;/code&gt; on mobile). My first parser missed all mobile results until I caught this.&lt;/p&gt;
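
&lt;p&gt;To make the selector difference concrete, here is a dependency-free sketch that collects result titles from either layout. It swaps BeautifulSoup for the standard library's &lt;code&gt;html.parser&lt;/code&gt; so it runs as-is, and it assumes titles sit in &lt;code&gt;&amp;lt;h3&amp;gt;&lt;/code&gt; on desktop and &lt;code&gt;&amp;lt;div role="heading"&amp;gt;&lt;/code&gt; on mobile, as described above. Real SERP markup changes often, so treat the hypothetical &lt;code&gt;TitleCollector&lt;/code&gt; class and its matching rule as illustrative rather than as the actual parser.&lt;/p&gt;

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect text inside h3 (desktop) or div[role=heading] (mobile)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._open_tag = None   # tag name of the heading being read, if any
        self._depth = 0         # nesting depth of same-named tags inside it
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if self._open_tag is None:
            is_desktop = tag == "h3"
            is_mobile = tag == "div" and dict(attrs).get("role") == "heading"
            if is_desktop or is_mobile:
                self._open_tag, self._depth, self._buf = tag, 1, []
        elif tag == self._open_tag:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag != self._open_tag:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse whitespace so nested inline tags do not leave gaps.
            text = " ".join("".join(self._buf).split())
            if text:
                self.titles.append(text)
            self._open_tag = None

    def handle_data(self, data):
        if self._open_tag is not None:
            self._buf.append(data)


def extract_titles(html):
    """Return heading texts in document order for either SERP layout."""
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    return parser.titles
```

&lt;p&gt;Calling &lt;code&gt;extract_titles(html)&lt;/code&gt; on a saved snapshot returns the titles in page order, which is enough to sanity-check that a mobile snapshot is not coming back empty.&lt;/p&gt;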




&lt;h2&gt;
  
  
  The Findings
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Finding 1: AI WebSearch ≠ real SERP
&lt;/h3&gt;

&lt;p&gt;For &lt;code&gt;douyin transcript&lt;/code&gt; — top 5 according to Claude's WebSearch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. ScreenApp
2. ScreenApp (Chinese variant)
3. Apify
4. yeschat.ai
5. TokScript
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Real Google (India, desktop, my datacenter IP):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. screenapp.io
2. apify.com
3. stt.ai           ← Claude missed this entirely
4. dupdub.com       ← Claude missed
5. turboscribe.ai   ← Claude missed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Three competitors I would have ignored&lt;/strong&gt; in my SEO strategy if I'd trusted the AI alone.&lt;/p&gt;
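
&lt;p&gt;The comparison above is a set difference once both lists are reduced to a comparable key. A small sketch using the two top-5 lists from this section follows. &lt;code&gt;brand_key&lt;/code&gt; is a hypothetical helper that reduces a name or domain to its lowercase first label, a crude rule that assumes the brand name and the domain share that label.&lt;/p&gt;

```python
def brand_key(name):
    """Reduce 'ScreenApp (Chinese variant)' or 'screenapp.io' to 'screenapp'."""
    base = name.split("(")[0].strip().lower()
    return base.split(".")[0]


ai_top5 = ["ScreenApp", "ScreenApp (Chinese variant)", "Apify", "yeschat.ai", "TokScript"]
real_top5 = ["screenapp.io", "apify.com", "stt.ai", "dupdub.com", "turboscribe.ai"]

ai_keys = {brand_key(n) for n in ai_top5}
# Domains in the real SERP that the AI list never mentioned, in rank order.
missed = [d for d in real_top5 if brand_key(d) not in ai_keys]
print(missed)
```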

&lt;h3&gt;
  
  
  Finding 2: Top 1 changes by country
&lt;/h3&gt;

&lt;p&gt;For &lt;code&gt;xiaohongshu transcript generator&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Region/Device&lt;/th&gt;
&lt;th&gt;Top 1&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;India desktop&lt;/td&gt;
&lt;td&gt;&lt;code&gt;kapwing.com&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;India mobile&lt;/td&gt;
&lt;td&gt;&lt;code&gt;kapwing.com&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US desktop&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aitodo.co&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US mobile&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aitodo.co&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The AI WebSearch had told me ScreenApp was top 1 for everything Xiaohongshu-related. &lt;strong&gt;It's not even in the top 5 for this exact keyword.&lt;/strong&gt; Different countries = different competitors.&lt;/p&gt;

&lt;h3&gt;
  
  
  Finding 3: AI Overview asymmetry
&lt;/h3&gt;

&lt;p&gt;11 of 32 SERPs triggered an AI Overview. The distribution surprised me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All 8 Hong Kong (Chinese) SERPs triggered it&lt;/li&gt;
&lt;li&gt;3 of 16 English SERPs triggered it (mostly &lt;code&gt;xiaohongshu transcript&lt;/code&gt; variants)&lt;/li&gt;
&lt;li&gt;0 of 8 &lt;code&gt;douyin transcript&lt;/code&gt; SERPs triggered&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're doing SEO in a language Google has heavily AI-Overview-ified, expect significant click-through compression. If you're doing English transcript queries, you're (currently) safe.&lt;/p&gt;

&lt;h3&gt;
  
  
  Finding 4: My GSC ranking ≠ real SERP
&lt;/h3&gt;

&lt;p&gt;Google Search Console reported an average position of 2.43 for my best keyword. In the real SERPs I scraped, &lt;strong&gt;I'm not in the top 14 anywhere.&lt;/strong&gt; Two possibilities:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;GSC averages position over 3 months of impressions, and it only counts searches where my page was actually shown, so a few top-3 appearances can pull the mean toward the top&lt;/li&gt;
&lt;li&gt;Datacenter IP sees a different SERP than real residential users&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Either way: &lt;strong&gt;don't trust GSC ranks alone for strategy.&lt;/strong&gt; Cross-validate.&lt;/p&gt;
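
&lt;p&gt;The first explanation is easier to see with numbers. GSC's average position is weighted by impressions, and a result that sits deep in the rankings is rarely shown at all, so it contributes few impressions. The figures below are invented purely for illustration (they are not data from this project), and &lt;code&gt;average_position&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
def average_position(buckets):
    """Impression-weighted mean position over (position, impressions) pairs."""
    total = sum(impressions for _, impressions in buckets)
    weighted = sum(position * impressions for position, impressions in buckets)
    return weighted / total


# Invented example: 40 impressions from a short spell at position 2,
# and a single impression from someone who paged through to position 15.
buckets = [(2, 40), (15, 1)]
avg = round(average_position(buckets), 2)
print(avg)
```

&lt;p&gt;With these made-up inputs the reported average lands near 2.3 even though the page could sit at position 15 on most days, which is why a single scraped SERP and the GSC figure can disagree.&lt;/p&gt;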




&lt;h2&gt;
  
  
  What I'd Tell Past-Me
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AI search tools are great for general questions, dangerous for SEO competitive research.&lt;/strong&gt; Always cross-validate with at least one real SERP scrape.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Datacenter IP scraping is a first-pass tool, not a final answer.&lt;/strong&gt; Google may show different results to residential vs datacenter exit nodes. For final strategy, validate with a residential proxy ($7-10 from IPRoyal for a one-shot run).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Mobile and desktop SERPs are completely different worlds.&lt;/strong&gt; If your audience is 90% mobile (mine is — India + Pakistan + Thailand), don't analyze desktop only.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Quarterly cadence matters more than perfect data.&lt;/strong&gt; Build the pipeline once, run it every 3 months, diff the results. Trends &amp;gt; single snapshots.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build in public.&lt;/strong&gt; I shared this approach in &lt;a href="https://github.com/oywt" rel="noopener noreferrer"&gt;the project's repo&lt;/a&gt; so other indie founders doing SEO can fork it. (If anyone has tips on residential proxies that don't break the bank, please share.)&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
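
&lt;p&gt;For the quarterly diff in point 4, comparing two ranked domain lists for the same keyword is enough to surface entrants, drop-outs, and rank moves. A minimal sketch, assuming each snapshot has already been parsed into an ordered list of domains; &lt;code&gt;diff_snapshots&lt;/code&gt; and the sample lists are hypothetical.&lt;/p&gt;

```python
def diff_snapshots(old, new):
    """Compare two ranked domain lists (index 0 = position 1)."""
    old_rank = {d: i + 1 for i, d in enumerate(old)}
    new_rank = {d: i + 1 for i, d in enumerate(new)}
    return {
        "entered": [d for d in new if d not in old_rank],
        "dropped": [d for d in old if d not in new_rank],
        # Positive delta means the domain moved up (toward position 1).
        "moved": {
            d: old_rank[d] - new_rank[d]
            for d in new
            if d in old_rank and old_rank[d] != new_rank[d]
        },
    }


# Hypothetical snapshots for one keyword, one quarter apart.
q1 = ["a.example", "b.example", "c.example"]
q2 = ["b.example", "a.example", "d.example"]
changes = diff_snapshots(q1, q2)
```

&lt;p&gt;Running this per keyword and per region/device profile keeps each comparison like-for-like, so a rank change is not confused with a country difference.&lt;/p&gt;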




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;I'm planning a residential IP run next month to compare against this datacenter snapshot, and probably a Bing/DuckDuckGo cross-comparison for users in countries where those have non-trivial market share.&lt;/p&gt;

&lt;p&gt;If you're an indie founder doing your own SEO, I'd love to hear how you handle competitive research. Drop a comment below, or check out &lt;a href="https://dance-ai.xyz/" rel="noopener noreferrer"&gt;ShenBi AI&lt;/a&gt; — the project that started all this.&lt;/p&gt;

</description>
      <category>seo</category>
      <category>python</category>
      <category>indiehackers</category>
      <category>buildinpublic</category>
    </item>
  </channel>
</rss>
