<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Sami</title>
    <description>The latest articles on Forem by Sami (@sami_8858131362756585e4f4).</description>
    <link>https://forem.com/sami_8858131362756585e4f4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3877584%2F63d2c24c-ec4e-457f-8a71-2b79bb969554.png</url>
      <title>Forem: Sami</title>
      <link>https://forem.com/sami_8858131362756585e4f4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/sami_8858131362756585e4f4"/>
    <language>en</language>
    <item>
      <title>Weibo's Hot Search Is the Best Real-Time Feed of Chinese Public Sentiment in 2026</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Fri, 08 May 2026 17:33:19 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/weibos-hot-search-is-the-best-real-time-feed-of-chinese-public-sentiment-in-2026-2cep</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/weibos-hot-search-is-the-best-real-time-feed-of-chinese-public-sentiment-in-2026-2cep</guid>
      <description>&lt;p&gt;Weibo's "hot search" (热搜) is the closest thing China has to a real-time barometer of public attention. It updates every few minutes, ranks topics by an opaque heat score, and is where every news cycle, celebrity scandal, and viral product launch lands first. For brands, agencies, and researchers covering China, this feed is gold, and unlike most of Weibo, it's accessible without logging in.&lt;/p&gt;

&lt;p&gt;This post is for anyone building a brand-monitoring, sentiment-tracking, or trend-discovery pipeline aimed at China.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why hot search matters
&lt;/h2&gt;

&lt;p&gt;Weibo (微博) is China's microblogging giant, with 580M+ monthly active users. The hot search ranking is driven by Weibo's own engagement signals: a topic earns a spot when search volume, post creation, and engagement spike together within a short window.&lt;/p&gt;

&lt;p&gt;That makes hot search a &lt;strong&gt;leading indicator&lt;/strong&gt; for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PR crises&lt;/strong&gt;: a brand mention can reach the top 50 within minutes of a video going viral&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product launches&lt;/strong&gt;: launches by Apple, Tesla, Xiaomi, etc. typically hit the top 20 within an hour&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cultural shifts&lt;/strong&gt;: holiday spikes, generational slang, viral memes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geopolitics&lt;/strong&gt;: state-affiliated topics surface predictably; their ranking velocity tells a story&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're tracking China for any of these use cases, polling hot search every 5–15 minutes gives you sub-news-cycle response time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you actually get
&lt;/h2&gt;

&lt;p&gt;Each hot search row exposes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;rank&lt;/strong&gt; (1–50)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;title&lt;/strong&gt; (the search term itself, in Chinese)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;hotValue&lt;/strong&gt; — an integer that approximates topical heat&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;category&lt;/strong&gt; (科技 = tech, 娱乐 = entertainment, 时尚 = fashion, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;labelName&lt;/strong&gt;, a trending-status badge: &lt;code&gt;热&lt;/code&gt; (hot), &lt;code&gt;新&lt;/code&gt; (new), &lt;code&gt;沸&lt;/code&gt; (boiling), &lt;code&gt;爆&lt;/code&gt; (exploding)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;isHot&lt;/strong&gt; flag&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;url&lt;/strong&gt; to the search results page on weibo.com&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sample row:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"rank"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"人工智能最新突破"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"科技"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"hotValue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2847562&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"labelName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"热"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"isHot"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://s.weibo.com/weibo?q=%23..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  A minimal Python pipeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;snapshot_hot_search&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hot_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c1"&gt;# Poll every 10 minutes and report each topic when it first appears or its rank changes
&lt;/span&gt;&lt;span class="n"&gt;seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;snap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;snapshot_hot_search&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;snap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;seen&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;seen&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
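            &lt;span class="c1"&gt;# Preserve the original first_seen when only the rank changed
&lt;/span&gt;            &lt;span class="n"&gt;first_seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;seen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;first_seen&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;seen&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;first_seen&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;first_seen&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;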
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] rank=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;  heat=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hotValue&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A small loop, and you have a rank-change log that a brand-mention monitor can be built on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common patterns I see customers run
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Brand watch.&lt;/strong&gt; Match new hot-search titles against a list of brand keywords. Trigger alerts when a brand name enters the top 50.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Velocity tracking.&lt;/strong&gt; Compute the rank-change velocity per topic. Topics that jump from rank 40 → 5 in under 30 minutes are early-warning signals for going viral.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Category drift.&lt;/strong&gt; Track which categories dominate hot search hour by hour. Useful for media planning and for timing ad campaigns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Cross-platform correlation.&lt;/strong&gt; Pair Weibo hot search with Bilibili trending and RedNote search to detect cross-platform memes early. The same topics tend to surface on these platforms 1–6 hours apart.&lt;/p&gt;
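
&lt;p&gt;As a rough sketch of patterns 1 and 2, assuming snapshots shaped like the sample row shown earlier (&lt;code&gt;title&lt;/code&gt;, &lt;code&gt;rank&lt;/code&gt;): the helper names &lt;code&gt;brand_hits&lt;/code&gt; and &lt;code&gt;rank_velocity&lt;/code&gt;, the keyword list, and the sample data are hypothetical and not part of the scraper.&lt;/p&gt;

```python
def brand_hits(snapshot, keywords):
    """Return rows whose title contains any brand keyword (plain substring match)."""
    return [row for row in snapshot if any(k in row["title"] for k in keywords)]


def rank_velocity(earlier, later, minutes):
    """Ranks climbed per minute for each title present in both snapshots.

    Positive values mean the topic moved toward rank 1.
    """
    before = {row["title"]: row["rank"] for row in earlier}
    return {
        row["title"]: (before[row["title"]] - row["rank"]) / minutes
        for row in later
        if row["title"] in before
    }


# Hypothetical snapshots taken 30 minutes apart
earlier = [{"title": "小米新车发布", "rank": 40}, {"title": "春节档电影", "rank": 3}]
later = [{"title": "小米新车发布", "rank": 5}, {"title": "春节档电影", "rank": 4}]

hits = brand_hits(later, ["小米", "特斯拉"])
velocity = rank_velocity(earlier, later, minutes=30)
print([row["title"] for row in hits])
print(velocity)
```

&lt;p&gt;Substring matching is the simplest possible matcher; real brand lists usually need aliases and nicknames, which this sketch leaves out.&lt;/p&gt;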

&lt;h2&gt;
  
  
  Going deeper: posts and comments
&lt;/h2&gt;

&lt;p&gt;Hot search gives you topics. To go deeper into actual conversation, pivot from a hot title to its underlying posts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# After identifying a hot topic, search posts about it
&lt;/span&gt;&lt;span class="n"&gt;posts_run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;人工智能最新突破&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That returns post-level data: text, author, like/repost/comment counts, embedded images, and post URLs. Pair with &lt;code&gt;mode: post_comments&lt;/code&gt; to harvest reactions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a hosted scraper, not raw scraping
&lt;/h2&gt;

&lt;p&gt;Weibo's public web endpoints work without login for most read paths, but they require a visitor session token (Sina Visitor System) and exponential backoff on throttling responses. A naive &lt;code&gt;requests&lt;/code&gt; script will either get throttled within 100 calls or pull empty arrays without realizing.&lt;/p&gt;
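
&lt;p&gt;For a sense of what the DIY path involves, here is a minimal exponential-backoff sketch. &lt;code&gt;fetch&lt;/code&gt; stands in for whatever function you write to call Weibo; it is hypothetical, as are the delay values. An empty list is treated as a failure so that silent empty responses get retried rather than stored.&lt;/p&gt;

```python
import random
import time


def fetch_with_backoff(fetch, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call fetch() until it returns a non-empty list, doubling the wait each time.

    fetch is a hypothetical zero-argument callable that returns a list of rows
    or raises an exception when throttled.
    """
    for attempt in range(max_attempts):
        try:
            rows = fetch()
            if rows:  # treat an empty array as a soft failure, not a result
                return rows
        except Exception as exc:  # narrow this to your HTTP client's errors
            print(f"attempt {attempt + 1} failed: {exc}")
        if attempt != max_attempts - 1:
            # 2s, 4s, 8s, ... plus up to 1s of jitter to avoid synchronized retries
            sleep(base_delay * 2 ** attempt + random.random())
    raise RuntimeError(f"no data after {max_attempts} attempts")
```

&lt;p&gt;The &lt;code&gt;sleep&lt;/code&gt; parameter exists only so the waits can be replaced in tests; in normal use the default is fine.&lt;/p&gt;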

&lt;p&gt;The &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;strong&gt;Weibo Scraper on Apify&lt;/strong&gt;&lt;/a&gt; handles session bootstrap, throttling, retries, and consistent schema across modes (&lt;code&gt;hot_search&lt;/code&gt;, &lt;code&gt;post_comments&lt;/code&gt;, &lt;code&gt;search&lt;/code&gt;, &lt;code&gt;user_posts&lt;/code&gt;). Pure HTTP — no browser, no proxy required.&lt;/p&gt;

&lt;p&gt;Pricing is pay-per-event: &lt;strong&gt;$0.005 per item&lt;/strong&gt;. 1,000 items = $5. The free Apify tier covers 1,000 items/month.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

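&lt;p&gt;&lt;strong&gt;Is hot search censored?&lt;/strong&gt; Some topics are rate-limited or removed by Weibo's moderation, so you'll see topics appear and disappear. The labelName field describes heat (hot, new, boiling, exploding) rather than moderation state, so the clearest moderation signal is a topic vanishing between snapshots.&lt;/p&gt;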

&lt;p&gt;&lt;strong&gt;Can I get historical hot search?&lt;/strong&gt; Not via Weibo directly — they don't expose archives. You build your own archive by snapshotting at intervals.&lt;/p&gt;
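
&lt;p&gt;One way to build that archive is to append every snapshot to a local SQLite table. This is a sketch under assumptions: the table layout, the &lt;code&gt;archive_snapshot&lt;/code&gt; name, and the timestamps are illustrative, and rows are assumed to carry the fields listed earlier.&lt;/p&gt;

```python
import sqlite3
from datetime import datetime, timezone


def archive_snapshot(conn, rows, taken_at=None):
    """Append one hot-search snapshot to the archive table; returns rows written."""
    taken_at = taken_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hot_search ("
        "taken_at TEXT, rank_no INTEGER, title TEXT, hot_value INTEGER, "
        "PRIMARY KEY (taken_at, title))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO hot_search VALUES (?, ?, ?, ?)",
        [(taken_at, r["rank"], r["title"], r.get("hotValue")) for r in rows],
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # use a file path for a persistent archive
sample = [{"rank": 1, "title": "人工智能最新突破", "hotValue": 2847562}]
archive_snapshot(conn, sample, taken_at="2026-05-08T17:00:00+00:00")
archive_snapshot(conn, sample, taken_at="2026-05-08T17:10:00+00:00")

# Rank history for one topic, oldest snapshot first
history = conn.execute(
    "SELECT taken_at, rank_no FROM hot_search WHERE title = ? ORDER BY taken_at",
    ("人工智能最新突破",),
).fetchall()
print(history)
```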

&lt;p&gt;&lt;strong&gt;What about session tokens?&lt;/strong&gt; They expire periodically. Hosted scrapers refresh them automatically; if you DIY, plan for re-auth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is scraping Weibo legal?&lt;/strong&gt; This accesses publicly visible data. No authentication is bypassed. Always check your local laws and Weibo's ToS.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building a Chinese intelligence stack?
&lt;/h2&gt;

&lt;p&gt;I maintain the full suite for production pipelines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;Weibo Scraper&lt;/a&gt; — &lt;em&gt;(this one)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;Bilibili Scraper&lt;/a&gt; — China's YouTube, 300M MAU&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;RedNote (Xiaohongshu) Scraper&lt;/a&gt; — lifestyle social&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/rednote-shop-scraper" rel="noopener noreferrer"&gt;RedNote Shop Scraper&lt;/a&gt; — Xiaohongshu e-commerce&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Running 50K+ items per month?&lt;/strong&gt; I offer custom output schemas, dedicated proxy pools, SLA, and volume pricing. DM me on Apify or open an Issue titled "Enterprise inquiry".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Found a bug?&lt;/strong&gt; Open an Issue and I usually ship fixes within 48 hours.&lt;/p&gt;

&lt;p&gt;A 30-second review on the Apify Store helps other users find this tool. ⭐&lt;/p&gt;

</description>
      <category>python</category>
      <category>webscraping</category>
      <category>china</category>
      <category>marketing</category>
    </item>
    <item>
      <title>Building a Xiaohongshu (RedNote) E-commerce Scraper for RedShop Product Data</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Wed, 06 May 2026 01:56:00 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/building-a-xiaohongshu-rednote-e-commerce-scraper-for-redshop-product-data-2g7d</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/building-a-xiaohongshu-rednote-e-commerce-scraper-for-redshop-product-data-2g7d</guid>
      <description>&lt;p&gt;When Xiaohongshu (RedNote / Little Red Book / 小红书) launched RedShop — its US-facing e-commerce platform — in April 2026, I noticed every existing scraper on Apify only covered the social side: posts, profiles, comments, videos. None of them touched product listings, vendor catalogs, or pricing data.&lt;/p&gt;

&lt;p&gt;So I built one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a dedicated shop scraper?
&lt;/h2&gt;

&lt;p&gt;Xiaohongshu is unusual among Chinese platforms because product listings live in a separate URL space from social posts. The all-in-one social scrapers handle the &lt;code&gt;/explore/&lt;/code&gt; posts surface. RedShop products live behind &lt;code&gt;/goods-detail/&lt;/code&gt; with a completely different structure.&lt;/p&gt;

&lt;p&gt;Trying to extract product data from a "social" scraper means hacky workarounds. A dedicated commerce-focused tool gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Structured product fields (price, sold count, SKU variants, vendor metadata)&lt;/li&gt;
&lt;li&gt;Native support for vendor/store browsing&lt;/li&gt;
&lt;li&gt;Cross-border vs domestic flagging&lt;/li&gt;
&lt;li&gt;Cleaner pricing model: charge per product, not per "result"&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What it extracts
&lt;/h2&gt;

&lt;p&gt;For each product:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;itemId, title, productUrl&lt;/li&gt;
&lt;li&gt;salePrice, originalPrice, discountPct, currency (CNY for domestic, USD for cross-border)&lt;/li&gt;
&lt;li&gt;soldCount, wantCount (popularity signals)&lt;/li&gt;
&lt;li&gt;cover, images&lt;/li&gt;
&lt;li&gt;vendor (sellerId, name, rating)&lt;/li&gt;
&lt;li&gt;category path&lt;/li&gt;
&lt;li&gt;skus (variants with prices and stock)&lt;/li&gt;
&lt;li&gt;crossBorder flag and shippingOrigin&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Three modes
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;product_search&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Search products by keyword, sort by price/sales, filter by price range&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vendor_products&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full catalog from a specific seller&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;product_detail&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Deep dive on specific product URLs (full SKU breakdown)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Real-world use cases
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DTC brands&lt;/strong&gt;: monitor your own listings and competitor pricing on China's #1 social commerce platform&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dropshippers and resellers&lt;/strong&gt;: discover trending Chinese products before they hit Amazon or Etsy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-border arbitrage&lt;/strong&gt;: identify SKUs popular in China that haven't reached Western markets yet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Investment analysts&lt;/strong&gt;: track e-commerce activity for Chinese consumer brands&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sourcing agents&lt;/strong&gt;: scout Chinese products at scale for clients in cosmetics, fashion, or home goods&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Combined with the &lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;RedNote All-in-One Scraper&lt;/a&gt; (social side), you can map products to the influencers tagging them — extremely valuable for influencer-product correlation studies.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to use it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/rednote-shop-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;product_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;skincare&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sortBy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sales&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minPrice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxPrice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;salePrice&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;currency&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (sold &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;soldCount&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Output sample
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"itemId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"642a1b3c0000000023019f7e"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Skincare Set - Hydrating Toner + Serum + Moisturizer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"salePrice"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;199.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"originalPrice"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;299.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"discountPct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;33.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"currency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CNY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"soldCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"wantCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vendor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"sellerId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BeautyBrand Official"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"rating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.8&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Beauty / Skincare / Sets"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"skus"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"spec"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Normal Skin"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;199.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"stock"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1200&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"crossBorder"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"shippingOrigin"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"China"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Pay-per-event:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;$0.0075&lt;/strong&gt; per product scraped&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$0.025&lt;/strong&gt; per vendor info record&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Typical costs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search 100 products: ~$0.75&lt;/li&gt;
&lt;li&gt;Full vendor catalog (200 products): ~$1.53&lt;/li&gt;
&lt;li&gt;5 competitor vendors with 100 products each: ~$3.88&lt;/li&gt;
&lt;/ul&gt;
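
&lt;p&gt;The typical costs above follow directly from the two per-event prices. A minimal sketch of the arithmetic, assuming one vendor info record is billed per vendor; &lt;code&gt;estimate_cost&lt;/code&gt; is an illustrative name and not part of the actor:&lt;/p&gt;

```python
PRICE_PER_PRODUCT = 0.0075  # USD per product scraped (from the price list above)
PRICE_PER_VENDOR = 0.025    # USD per vendor info record (from the price list above)


def estimate_cost(products: int, vendors: int = 0) -> float:
    """Estimated run cost in USD, before rounding.

    Assumes one vendor info record per vendor (an assumption for illustration).
    """
    return products * PRICE_PER_PRODUCT + vendors * PRICE_PER_VENDOR


print(f"{estimate_cost(100):.3f}")     # keyword search, 100 products
print(f"{estimate_cost(200, 1):.3f}")  # one vendor catalog, 200 products
print(f"{estimate_cost(500, 5):.3f}")  # 5 vendors with 100 products each
```

&lt;p&gt;The three calls correspond to the three scenarios in the list, which rounds the results to the nearest cent.&lt;/p&gt;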

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Does it work for cross-border products?&lt;/strong&gt;&lt;br&gt;
Yes — products are explicitly flagged in the output (&lt;code&gt;crossBorder: true/false&lt;/code&gt;) so you can filter domestic vs international listings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I track price changes over time?&lt;/strong&gt;&lt;br&gt;
Yes. Schedule the actor to run daily or weekly via Apify Schedules. Each run writes its results to its own dataset, so comparing runs gives you a price history for any product or vendor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does it need a proxy?&lt;/strong&gt;&lt;br&gt;
Residential proxies are recommended for reliable results. The default config uses Apify's residential pool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is there an official Xiaohongshu shop API?&lt;/strong&gt;&lt;br&gt;
No — Xiaohongshu doesn't offer a commerce API for international developers. This actor is the practical alternative.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://apify.com/zhorex/rednote-shop-scraper" rel="noopener noreferrer"&gt;RedNote Shop Scraper on Apify&lt;/a&gt; — works with Apify's free plan ($5/month credits cover hundreds of products at no cost).&lt;/p&gt;

&lt;p&gt;If you build something useful with it, drop a comment — always interested in seeing how people use commerce data.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>python</category>
      <category>ecommerce</category>
      <category>china</category>
    </item>
    <item>
      <title>Google Ads can spend up to 2x your daily budget. I built a Chrome extension that catches it before it happens.</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 30 Apr 2026 15:17:28 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/google-ads-can-spend-up-to-2x-your-daily-budget-i-built-a-chrome-extension-that-catches-it-before-j0</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/google-ads-can-spend-up-to-2x-your-daily-budget-i-built-a-chrome-extension-that-catches-it-before-j0</guid>
      <description>&lt;p&gt;If you've ever opened Google Ads and noticed your campaign spent way more than the daily budget you set, you're not imagining it. Google's documentation explicitly says they may spend up to &lt;strong&gt;twice your daily budget&lt;/strong&gt; on any given day, evening it out across the month. That's not a bug — it's how their pacing engine has always worked.&lt;/p&gt;

&lt;p&gt;What changed in March 2026: Google now aggressively targets &lt;strong&gt;100% of your monthly limit&lt;/strong&gt; — which is 30.4× your daily budget. Even with ad scheduling. So if your campaigns only run 22 days a month (weekdays only, for example), Google can push up to &lt;strong&gt;38% more spend per active day&lt;/strong&gt; than you'd expect from your daily budget setting.&lt;/p&gt;
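
&lt;p&gt;The 38% figure is the monthly limit spread across fewer active days. A quick Python check of that arithmetic, where the function name is illustrative:&lt;/p&gt;

```python
AVG_DAYS_PER_MONTH = 30.4  # monthly limit multiplier cited above


def per_active_day_multiplier(active_days: int) -> float:
    """Multiple of the daily budget that fits into each active day when the
    full monthly limit (30.4 x daily budget) is spent in `active_days` days."""
    return AVG_DAYS_PER_MONTH / active_days


extra = per_active_day_multiplier(22) - 1
print(f"{extra:.0%} more spend per active day")  # 38% more spend per active day
```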

&lt;p&gt;Most PPC managers don't notice until the damage is done. The Campaigns tab in Google Ads doesn't tell you whether you're on pace or headed for overspend. You'd need a spreadsheet, a calendar, and a calculator open in another window — or a SaaS tool that costs $49 to $749 per month.&lt;/p&gt;

&lt;p&gt;I got tired of the spreadsheet route, so I built a Chrome extension that does it inside Google Ads, in real time, free for up to 3 campaigns. I'm walking through the build because the technical approach is interesting and the pricing math versus SaaS tools is genuinely lopsided.&lt;/p&gt;

&lt;h2&gt;
  
  
  What budget pacing actually requires
&lt;/h2&gt;

&lt;p&gt;The math is simple. For each campaign:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;expected_spend_today = daily_budget × (days_elapsed_in_month / total_days_in_month)
pacing_ratio = actual_spend_today / expected_spend_today

# pacing_ratio &amp;lt; 1.10 → on pace
# pacing_ratio 1.10–1.20 → slight overspend
# pacing_ratio &amp;gt; 1.20 → overspend risk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
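
&lt;p&gt;For readers who prefer runnable code, here is a minimal Python sketch of month-to-date pacing. It assumes expected spend is the monthly limit (30.4 × daily budget, as cited earlier) prorated by the share of the month elapsed, and that today counts as an elapsed day. The function names are illustrative and are not AdPacer's actual code:&lt;/p&gt;

```python
import calendar
from datetime import date

AVG_DAYS_PER_MONTH = 30.4  # monthly limit = 30.4 x daily budget


def pacing_ratio(daily_budget: float, spend_to_date: float, today: date) -> float:
    """Month-to-date spend divided by the prorated monthly limit."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    expected = daily_budget * AVG_DAYS_PER_MONTH * today.day / days_in_month
    return spend_to_date / expected


def pacing_status(ratio: float) -> str:
    """Map a ratio to the three bands used in this post."""
    if ratio > 1.20:
        return "overspend risk"
    if ratio >= 1.10:
        return "slight overspend"
    return "on pace"


ratio = pacing_ratio(daily_budget=100.0, spend_to_date=1800.0, today=date(2026, 4, 15))
print(round(ratio, 2), pacing_status(ratio))  # 1.18 slight overspend
```

&lt;p&gt;In the example, a $100/day campaign is expected to have spent $1,520 by April 15, so $1,800 of actual spend lands in the 10–20% band.&lt;/p&gt;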



&lt;p&gt;That's the whole core logic. SaaS tools wrap this in dashboards, alerts, multi-account aggregation, and reporting. But the underlying calculation is six lines of code.&lt;/p&gt;

&lt;p&gt;The reason SaaS tools charge $49+/month isn't the math — it's the data plumbing. They connect to the Google Ads API (OAuth, refresh tokens, quota management), run server-side jobs to pull your accounts on a schedule, store results in a database, render charts. Real infrastructure cost.&lt;/p&gt;

&lt;p&gt;But here's the thing: &lt;strong&gt;your campaign data is already visible on your Google Ads screen&lt;/strong&gt;. Names, budgets, costs, statuses — the information is sitting in the DOM right there. If you're already looking at Google Ads, why does anyone need to call an API to tell you what you're already looking at?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Chrome extension approach
&lt;/h2&gt;

&lt;p&gt;I built AdPacer as a Manifest V3 Chrome extension that reads the campaign data from the Google Ads page DOM and overlays three pacing indicators directly on the interface you're already using. Architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Content script&lt;/strong&gt; runs on &lt;code&gt;ads.google.com&lt;/code&gt; URLs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MutationObserver&lt;/strong&gt; detects when the campaigns table renders or updates (Google Ads is a heavy SPA so this matters)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DOM parsing&lt;/strong&gt; extracts campaign name, daily budget, current spend per row&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pacing math&lt;/strong&gt; runs locally on the extracted values&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DOM injection&lt;/strong&gt; adds the colored pacing bars and projected-spend badges next to each campaign row&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notifications API&lt;/strong&gt; for the periodic overspend checks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Zero API calls. Zero authentication flows. Zero backend. Zero data leaves the user's browser. Everything runs in the page's content-script context.&lt;/p&gt;

&lt;p&gt;The privacy implication is meaningful: AdPacer cannot exfiltrate your Google Ads data even if it wanted to, because it makes no network requests at all. SaaS tools, however careful their privacy policies, send your campaign data to their servers as a fundamental part of how they work.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you actually see in Google Ads after install
&lt;/h2&gt;

&lt;p&gt;Three additions to the standard Campaigns tab:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Pacing bars&lt;/strong&gt; — a color-coded bar next to each campaign:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Green:&lt;/strong&gt; on pace, within 10% of expected spend&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Yellow:&lt;/strong&gt; ahead of pace, 10–20% over expected&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Red:&lt;/strong&gt; overspend risk, 20%+ over expected&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Projected end-of-month spend&lt;/strong&gt; — a badge showing what your monthly spend will be if you continue at the current daily run rate. Updates as the page data updates. No spreadsheet required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Browser notifications&lt;/strong&gt; — when any campaign crosses your threshold (configurable from 10% to 25%). Checks automatically every 30 minutes. Catch problems early instead of at month-end reconciliation.&lt;/p&gt;
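
&lt;p&gt;The projected end-of-month badge boils down to a run-rate extrapolation. A minimal Python sketch, assuming the run rate is the average daily spend so far this month; the function name is hypothetical and the extension's own implementation may differ:&lt;/p&gt;

```python
import calendar
from datetime import date


def projected_month_end_spend(spend_to_date: float, today: date) -> float:
    """Extrapolate month-to-date spend at the average daily run rate so far."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_run_rate = spend_to_date / today.day  # today counts as an elapsed day
    return daily_run_rate * days_in_month


print(projected_month_end_spend(1800.0, date(2026, 4, 15)))  # 3600.0
```

&lt;p&gt;Comparing the projection against 30.4 × the daily budget shows how far over the monthly limit a campaign would finish.&lt;/p&gt;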

&lt;p&gt;That's it. Install, open Google Ads, see your pacing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing — and why this is structured the way it is
&lt;/h2&gt;

&lt;p&gt;I deliberately wanted to make this accessible to freelancers and small teams, not enterprise-priced.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;th&gt;Limit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Free&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;td&gt;Up to 3 campaigns. All core features. No credit card, no trial expiration.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pro&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$14/mo&lt;/td&gt;
&lt;td&gt;Unlimited campaigns, custom thresholds, priority support.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$29/mo&lt;/td&gt;
&lt;td&gt;Multi-account support, PDF pacing reports, team sharing.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The free tier covers most freelancers managing 1-3 client accounts at a time, or small e-commerce teams running a couple of brand/generic/shopping campaigns. Pro is for in-house PPC managers running 5-50 campaigns. Agency is for teams managing multiple clients.&lt;/p&gt;

&lt;p&gt;For comparison with the SaaS landscape:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Lowest tier&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;TrueClicks&lt;/td&gt;
&lt;td&gt;$49/mo&lt;/td&gt;
&lt;td&gt;Broader PPC management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optmyzr&lt;/td&gt;
&lt;td&gt;$129/mo&lt;/td&gt;
&lt;td&gt;Optimization suite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WordStream&lt;/td&gt;
&lt;td&gt;$299/mo+&lt;/td&gt;
&lt;td&gt;Enterprise tier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AdPacer&lt;/td&gt;
&lt;td&gt;$0–$14/mo&lt;/td&gt;
&lt;td&gt;Pacing only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you need full PPC management — bid optimization, A/B testing, audience suggestions, the whole stack — the SaaS tools are doing a lot more than pacing. But if all you actually need is "tell me when a campaign is going to overspend," paying $49-299/month for that single feature is overkill.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who I built this for
&lt;/h2&gt;

&lt;p&gt;PPC managers running Google Ads daily who want instant budget visibility without context-switching to another tool. Freelancers managing 1-5 client accounts where SaaS pricing eats too much of the margin. Agency teams who need quick pacing checks across multiple campaigns. E-commerce advertisers watching ROAS and budget efficiency in real time.&lt;/p&gt;

&lt;p&gt;If you're an enterprise team running 200+ campaigns with complex bid strategies, this isn't for you — you probably already have an Optmyzr-class tool. If you're somewhere between "spreadsheet" and "expensive SaaS," this fills the gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it doesn't do (yet)
&lt;/h2&gt;

&lt;p&gt;Being honest about scope:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No Microsoft Ads / Bing Ads support&lt;/strong&gt; yet (Google Ads only)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Meta / TikTok Ads&lt;/strong&gt; (different DOMs, different challenges, would be a separate extension)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No historical pacing trends&lt;/strong&gt; beyond current month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No bid suggestions or campaign optimization&lt;/strong&gt; (that's a different problem space)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pacing is a single, focused use case. The extension does that one thing well rather than trying to be a half-decent everything tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install link
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://chromewebstore.google.com/detail/adpacer-%E2%80%94-budget-pacing-f/mfgliiabejphemhkhlnapbebmkfhfjfm" rel="noopener noreferrer"&gt;AdPacer on the Chrome Web Store&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Free tier covers up to 3 campaigns with no credit card and no trial expiration — install and see if it solves your problem before paying anything.&lt;/p&gt;

&lt;p&gt;If you're a PPC manager and the spending pattern Google introduced in March 2026 has been causing you headaches, this is the lowest-friction way to catch overspend before it happens. If you're a developer reading this for the technical approach: yes, the entire thing runs client-side via DOM parsing — no API key, no backend, no data leaves the browser.&lt;/p&gt;

&lt;p&gt;Happy to answer questions about either side.&lt;/p&gt;

</description>
      <category>chrome</category>
      <category>marketing</category>
      <category>productivity</category>
      <category>javascript</category>
    </item>
    <item>
      <title>How to scrape Weibo (微博) data with Python in 2026 — the Sina Visitor System and how to handle it</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 30 Apr 2026 14:58:19 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/how-to-scrape-weibo-wei-bo-data-with-python-in-2026-the-sina-visitor-system-and-how-to-handle-it-1j6g</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/how-to-scrape-weibo-wei-bo-data-with-python-in-2026-the-sina-visitor-system-and-how-to-handle-it-1j6g</guid>
      <description>&lt;p&gt;Weibo is China's Twitter — the platform where Chinese public opinion forms, brand crises break first, and government statements land. 580M+ monthly active users, mostly mainstream demographics. If you're doing China market intelligence, brand monitoring, or PR analytics, Weibo is one of the platforms you can't skip.&lt;/p&gt;

&lt;p&gt;The challenge: Weibo's developer API requires a Chinese business license, has severe rate limits, and exposes very limited data. For Western teams, web scraping is the practical option. The interesting twist is Weibo's Sina Visitor System — an auth flow that makes anonymous access possible for some endpoints but not others. Understanding which is which matters for what you can actually scrape.&lt;/p&gt;

&lt;p&gt;This article covers the technical landscape (with real Python code) and points to a hosted scraper if you'd rather skip the maintenance.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Weibo serves
&lt;/h2&gt;

&lt;p&gt;A Weibo post is structured similarly to a tweet but with longer character limits and more structured engagement signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Post text&lt;/strong&gt; (140 to 2,000 characters depending on user level)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repost chain&lt;/strong&gt; — Weibo's quote-tweet equivalent, central to virality tracking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engagement metrics&lt;/strong&gt; — &lt;code&gt;attitudes_count&lt;/code&gt; (likes), &lt;code&gt;comments_count&lt;/code&gt;, &lt;code&gt;reposts_count&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hashtags and mentions&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geolocation&lt;/strong&gt; if disclosed by user&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Author profile&lt;/strong&gt; — follower count, verification status, verified reason (e.g., "新浪科技 official Weibo")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Media&lt;/strong&gt; — images, videos&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A Weibo user profile gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User ID (numeric)&lt;/li&gt;
&lt;li&gt;Screen name (display name)&lt;/li&gt;
&lt;li&gt;Description / bio&lt;/li&gt;
&lt;li&gt;Followers / friends counts&lt;/li&gt;
&lt;li&gt;Statuses count (total posts)&lt;/li&gt;
&lt;li&gt;Verification status with reason text — this is gold for identifying official accounts vs personal vs corporate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For monitoring use cases, the metric that matters most depends on your goal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Crisis monitoring&lt;/strong&gt;: track &lt;code&gt;comments_count&lt;/code&gt; and repost velocity. A spike in either signals viral attention.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brand presence&lt;/strong&gt;: track post frequency from verified accounts in your category.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KOL identification&lt;/strong&gt;: filter by &lt;code&gt;verified=true&lt;/code&gt; + follower count above a threshold.&lt;/li&gt;
&lt;/ul&gt;
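
&lt;p&gt;As an illustration of the KOL filter, here is a minimal Python sketch over already-scraped profile records. The field names &lt;code&gt;verified&lt;/code&gt;, &lt;code&gt;followers_count&lt;/code&gt;, and &lt;code&gt;screen_name&lt;/code&gt; and the 100,000 threshold are assumptions for the example; adjust them to whatever your scraper returns:&lt;/p&gt;

```python
def find_kols(users: list[dict], min_followers: int = 100_000) -> list[dict]:
    """Keep verified accounts at or above a follower threshold, largest first."""
    kols = [
        u for u in users
        if u.get("verified") and u.get("followers_count", 0) >= min_followers
    ]
    return sorted(kols, key=lambda u: u["followers_count"], reverse=True)


# Made-up sample records, for illustration only.
sample = [
    {"screen_name": "brand_official", "verified": True, "followers_count": 2_500_000},
    {"screen_name": "casual_user", "verified": False, "followers_count": 800_000},
    {"screen_name": "small_verified", "verified": True, "followers_count": 12_000},
]
print([u["screen_name"] for u in find_kols(sample)])  # ['brand_official']
```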

&lt;h2&gt;
  
  
  The Sina Visitor System
&lt;/h2&gt;

&lt;p&gt;This is the key technical concept for scraping Weibo without a Chinese business license.&lt;/p&gt;

&lt;p&gt;When you visit Weibo without logging in, Sina automatically issues you a "visitor cookie" via what they call the Sina Visitor System (SVS). This cookie lets you access limited public data — specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hot search / trending topics&lt;/strong&gt;: full access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post comments&lt;/strong&gt;: full access for any public post&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post viewing&lt;/strong&gt;: limited&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For these endpoints, scraping is straightforward — get a visitor cookie, hit the AJAX endpoint, parse JSON.&lt;/p&gt;

&lt;p&gt;What the visitor cookie does NOT give you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search by keyword&lt;/strong&gt; (returns hot timeline as a fallback instead of true search results)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User posts beyond profile basics&lt;/strong&gt; (you get the profile, not the user's post history)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For those, you need a real logged-in cookie — specifically the &lt;code&gt;SUB&lt;/code&gt; cookie value from a logged-in browser session. We'll get to that.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 1: Build it yourself
&lt;/h2&gt;

&lt;p&gt;The Sina Visitor System flow looks roughly like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;

&lt;span class="c1"&gt;# Step 1: Hit the visitor system to get a tid (temporary ID)
&lt;/span&gt;&lt;span class="n"&gt;visitor_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://passport.weibo.com/visitor/genvisitor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;visitor_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gen_callback&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;os&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;browser&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Chrome&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fonts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;undefined&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;screenInfo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1920*1080*24&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plugins&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="s"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="c1"&gt;# The response is a JSONP-wrapped JSON. Strip the wrapper, parse, extract tid.
&lt;/span&gt;
&lt;span class="c1"&gt;# Step 2: Use tid to get the SUB visitor cookie
&lt;/span&gt;&lt;span class="n"&gt;incarnate_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://passport.weibo.com/visitor/visitor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;incarnate_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;incarnate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;c&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;100&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="c1"&gt;# Response sets cookies. Extract SUB and SUBP from response.cookies.
&lt;/span&gt;
&lt;span class="c1"&gt;# Step 3: Use those cookies to call AJAX endpoints
&lt;/span&gt;&lt;span class="n"&gt;hot_search_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://weibo.com/ajax/side/hotSearch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hot_search_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cookies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUBP&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;subp&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# data["data"]["realtime"] is the hot search list
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the rough shape. In practice you'll handle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rate limit responses (HTTP 418, 429) with exponential backoff&lt;/li&gt;
&lt;li&gt;Cookie expiration (visitor cookies last hours, not days)&lt;/li&gt;
&lt;li&gt;AJAX endpoint changes (Weibo periodically reshuffles paths)&lt;/li&gt;
&lt;li&gt;Anti-scraping fingerprint checks (less aggressive than RedNote, but still present)&lt;/li&gt;
&lt;/ul&gt;
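
&lt;p&gt;The first item, backing off on HTTP 418 and 429, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: &lt;code&gt;fetch_with_backoff&lt;/code&gt; and &lt;code&gt;backoff_delay&lt;/code&gt; are hypothetical helpers, and the base delay, cap, and jitter values are placeholders rather than limits published by Weibo:&lt;/p&gt;

```python
import random
import time
from typing import Any, Callable

RETRY_STATUSES = {418, 429}  # rate-limit statuses mentioned above


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay in seconds before retry `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * 2 ** attempt)


def fetch_with_backoff(
    fetch: Callable[[], Any],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `fetch()` until its response is not rate-limited or retries run out.

    `fetch` is any zero-argument callable returning an object with a
    `status_code` attribute, e.g. `lambda: httpx.get(url, cookies=cookies)`.
    """
    response = fetch()
    for attempt in range(max_retries):
        if response.status_code not in RETRY_STATUSES:
            break
        # Jitter spreads out retries so parallel workers do not sync up.
        sleep(backoff_delay(attempt) + random.uniform(0, 0.5))
        response = fetch()
    return response
```

&lt;p&gt;Passing the request as a callable keeps the retry logic independent of the HTTP client, so the same helper can wrap the hot search, comments, or search calls.&lt;/p&gt;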

&lt;p&gt;For the keyword-search and user-posts endpoints, you'll need a real &lt;code&gt;SUB&lt;/code&gt; cookie from a logged-in account:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Get SUB from your browser DevTools → Application → Cookies → weibo.com
# Look for the cookie named "SUB"
&lt;/span&gt;&lt;span class="n"&gt;sub_cookie&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUB=_2A25Fxxxxxx...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://weibo.com/ajax/side/searchAll&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;q&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;人工智能&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;cookies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sub_cookie&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A logged-in &lt;code&gt;SUB&lt;/code&gt; cookie typically lasts several days before expiring, depending on Weibo's session policies; visitor cookies, by contrast, expire within hours.&lt;/p&gt;

&lt;h3&gt;
  
  
  DIY cost breakdown
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Estimate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial setup (visitor system, hot search, comments)&lt;/td&gt;
&lt;td&gt;4-8 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User session cookie management&lt;/td&gt;
&lt;td&gt;1-2 hours/week&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maintenance when Weibo changes endpoints&lt;/td&gt;
&lt;td&gt;2-4 hours, every 2-3 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Proxies (not needed for most endpoints)&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Weibo is genuinely the easiest of the major Chinese platforms to scrape if you stay within visitor-system endpoints. RedNote and Bilibili both have more complex auth.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 2: Use a hosted scraper
&lt;/h2&gt;

&lt;p&gt;If you don't want to maintain visitor-system handling and cookie management, the &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; Apify actor handles it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Hot search (no cookie needed)
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hot_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (heat: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hotValue&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rank"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"人工智能最新突破"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"科技"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hotValue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2847562&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"labelName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"热"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isHot"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://s.weibo.com/weibo?q=..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-25T12:00:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
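
&lt;p&gt;Records of this shape are easy to post-process. Here is a minimal sketch, assuming each dataset item is a dict with exactly the fields shown above; the helper names &lt;code&gt;parse_scraped_at&lt;/code&gt; and &lt;code&gt;summarize&lt;/code&gt; are made up for illustration and are not part of the actor:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from datetime import datetime

# Sample record copied from the output format above; a real run would
# yield one dict like this per topic.
sample = {
    "rank": 1,
    "title": "人工智能最新突破",
    "category": "科技",
    "hotValue": 2847562,
    "labelName": "热",
    "isHot": True,
    "url": "https://s.weibo.com/weibo?q=...",
    "scrapedAt": "2026-04-25T12:00:00Z",
}


def parse_scraped_at(record):
    """Return the record's scrapedAt field as a timezone-aware datetime."""
    # fromisoformat() only accepts a trailing "Z" on Python 3.11+, so
    # normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(record["scrapedAt"].replace("Z", "+00:00"))


def summarize(record):
    """Format one topic as a single log line."""
    ts = parse_scraped_at(record)
    return f"{ts:%Y-%m-%d %H:%M} UTC  #{record['rank']}  {record['title']}  heat={record['hotValue']:,}"


print(summarize(sample))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Parsing the timestamp up front makes it simpler to compare snapshots taken at different times later on.&lt;/p&gt;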



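&lt;p&gt;For brand monitoring, search mode is what you want, but note the cookie tradeoff: without a logged-in cookie the run falls back to the hot timeline, and only a run with a cookie returns true keyword matches:&lt;br&gt;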
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Without cookie: returns hot timeline as fallback
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CeraVe&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# With cookie: returns true keyword-matched results
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CeraVe&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cookieString&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUB=your_logged_in_cookie&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
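
&lt;p&gt;Since the two calls differ only in the cookie field, one way to avoid duplicating them is a small input builder. This is a sketch under assumptions: the field names are the ones used in the calls above, while the helper &lt;code&gt;build_search_input&lt;/code&gt; and the &lt;code&gt;WEIBO_COOKIE&lt;/code&gt; environment variable are hypothetical names chosen for this example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os


def build_search_input(query, max_results=50, cookie_string=None):
    """Build the run_input dict for search mode.

    Field names (mode, searchQuery, maxResults, cookieString) are the ones
    used in the calls above; the helper itself is hypothetical.
    """
    run_input = {
        "mode": "search",
        "searchQuery": query,
        "maxResults": max_results,
    }
    # Only send the cookie when one is configured; without it the run
    # falls back to the hot timeline, as noted above.
    if cookie_string:
        run_input["cookieString"] = cookie_string
    return run_input


cookie = os.environ.get("WEIBO_COOKIE")  # assumed env var name
print(build_search_input("CeraVe", cookie_string=cookie))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The returned dict can be passed as &lt;code&gt;run_input&lt;/code&gt; to the same &lt;code&gt;call()&lt;/code&gt; shown above, and keeping the cookie in the environment keeps it out of source control.&lt;/p&gt;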



&lt;p&gt;The hosted actor handles the visitor system, exponential backoff, and rate limit recovery internally. Pricing: $5 per 1,000 results.&lt;/p&gt;

&lt;p&gt;Honest stats on the actor right now: 4 paying users, 11 free-tier users, a 92.5% success rate, and 3,768 results extracted to date. Typical response time when something breaks: a few hours.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to go DIY vs. hosted
&lt;/h2&gt;

&lt;p&gt;DIY makes sense when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're processing &amp;gt; 1M posts/month (per-result cost adds up)&lt;/li&gt;
&lt;li&gt;You have ops capacity to refresh &lt;code&gt;SUB&lt;/code&gt; cookies regularly&lt;/li&gt;
&lt;li&gt;You need to scrape behind login at scale&lt;/li&gt;
&lt;li&gt;You have specific endpoints not covered by hosted scrapers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hosted makes sense when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You don't have a dedicated scraper engineer&lt;/li&gt;
&lt;li&gt;Volume is moderate (&amp;lt; 500k posts/month)&lt;/li&gt;
&lt;li&gt;You want the visitor-system handling to be someone else's problem&lt;/li&gt;
&lt;li&gt;You're prototyping and want to validate the use case before committing&lt;/li&gt;
&lt;/ul&gt;
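
&lt;p&gt;To put rough numbers on those volume thresholds, here is a back-of-the-envelope sketch. It assumes the flat $5 per 1,000 results quoted above with no free tier or volume discount, and it deliberately leaves out DIY costs (engineering time, cookie upkeep), which you would need to estimate yourself:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;PRICE_PER_1000_RESULTS = 5.00  # USD, the hosted price quoted above


def hosted_monthly_cost(results_per_month):
    """Monthly hosted-scraper spend in USD at a flat per-result price."""
    return results_per_month / 1000 * PRICE_PER_1000_RESULTS


for volume in (50_000, 500_000, 1_000_000):
    print(f"{volume:,} results/month: ${hosted_monthly_cost(volume):,.2f}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;At 500k results a month the hosted bill is about $2,500, and at 1M it is about $5,000, which is roughly where a dedicated in-house scraper starts to be worth pricing out.&lt;/p&gt;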

&lt;h2&gt;
  
  
  What you do with the data downstream
&lt;/h2&gt;

&lt;p&gt;Sentiment analysis on Chinese text is the obvious next layer. Off-the-shelf Chinese BERT models work reasonably well for Weibo's discourse style. Weibo posts tend to be more formal than RedNote slang, so general-purpose Chinese sentiment models are more accurate on them (typically 75-85% on three-way neutral/positive/negative classification).&lt;/p&gt;

&lt;p&gt;For brand crisis detection, the signal you usually want is &lt;em&gt;velocity&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>webscraping</category>
      <category>china</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Coin-per-view: the Bilibili metric that beats subscriber count for creator vetting</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Wed, 29 Apr 2026 17:48:32 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/coin-per-view-the-bilibili-metric-that-beats-subscriber-count-for-creator-vetting-477j</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/coin-per-view-the-bilibili-metric-that-beats-subscriber-count-for-creator-vetting-477j</guid>
      <description>&lt;p&gt;If you've ever sponsored a YouTube creator and been disappointed by the ROI, you've already lived through what subscriber count actually measures: not engagement, not influence, not purchase intent. Just historical clicks on a follow button. Many of those followers stopped opening videos two years ago. Some are inactive accounts. Some followed for a single piece of content that has nothing to do with your brand.&lt;/p&gt;

&lt;p&gt;This is universally true on creator platforms, but it's especially true on Bilibili — China's YouTube. With 300M+ monthly active users skewed Gen Z and millennials, Bilibili is where Chinese creator marketing happens. And Bilibili exposes three engagement signals that YouTube doesn't, which together let you cut through the noise of follower counts and identify creators whose audiences actually engage.&lt;/p&gt;

&lt;p&gt;The single most useful one is &lt;strong&gt;coin-per-view ratio&lt;/strong&gt;. This post explains what it is, why it matters, what threshold to use, and how to compute it for any Chinese creator in a few lines of code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why follower count is a lying signal
&lt;/h2&gt;

&lt;p&gt;Three reasons follower counts mislead in creator marketing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Followers are a lagging indicator.&lt;/strong&gt; Someone followed a creator in 2023 because they liked one video. That doesn't tell you whether they still watch in 2026, whether they engage, or whether they trust the creator's recommendations enough to buy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Followers are gameable.&lt;/strong&gt; Not everyone games them, but enough creators do that you can't trust raw counts without other signals. Bot followers, follow-for-follow campaigns, and paid follower services all inflate the number, and China specifically has a robust market for them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. The follower-to-engagement ratio varies wildly.&lt;/strong&gt; A creator with 100k followers and 1M average views per video has fundamentally different audience economics than another creator with 100k followers and 5k average views per video. Both have the same "follower count" — the engagement quality is the actual signal.&lt;/p&gt;

&lt;p&gt;This is why every serious creator marketing tool talks about "engagement rate" — which on YouTube is usually computed as (likes + comments) / views. It's better than raw follower count, but on Bilibili you can do meaningfully better.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three Bilibili-native metrics
&lt;/h2&gt;

&lt;p&gt;Bilibili was designed by anime fans for anime fans, and its engagement system reflects values around quality and creator support that YouTube's flat "like" button never captured. Three metrics come back from any Bilibili video scrape:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Danmaku (弹幕)&lt;/strong&gt;: real-time scrolling comments overlaid on the video as users watch. Think livestream chat, but for pre-recorded video. The danmaku count tells you how many people were engaged enough mid-watch to type something. It's a leading indicator of viewing time and attention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Favorites (收藏)&lt;/strong&gt;: equivalent to "save for later", or saving a video to a playlist on YouTube. Strong long-term value signal: high favorites relative to views means people return to this video. Tutorials, references, and definitive content score high here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Coins (投币)&lt;/strong&gt;: Bilibili's tipping system. This is the interesting one. Each user gets a small daily allocation of coins (typically 5 per day for active users), and they can "throw" them at videos they want to support. Because coins are scarce by design (you only ever have a few to spend), coin counts are a strong signal of genuine appreciation.&lt;/p&gt;

&lt;p&gt;A user gives a coin to a video they love. They give a coin to a creator they want to keep making content. They don't give a coin to a video they passively watched and forgot. The cost is real (relative to the user's daily allocation), so the signal is real.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coin-per-view ratio: the single best signal
&lt;/h2&gt;

&lt;p&gt;If I had to pick one metric to evaluate a Bilibili creator, it would be &lt;strong&gt;median coin-per-view ratio across their last 20-30 videos&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The math is simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;coin_per_view&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;coin_count&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;view_count&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;  &lt;span class="c1"&gt;# express as percentage
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What I've found from looking at hundreds of Bilibili creators across categories:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Coin/View %&lt;/th&gt;
&lt;th&gt;Audience quality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&amp;lt; 0.5%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Passive viewers. Casual scrolling traffic, not engaged.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;0.5% – 1%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Average. Normal Bilibili content, decent audience.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1% – 2%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Strong. Genuinely engaged audience. Worth sponsoring.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&amp;gt; 2%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Exceptional. Users actively spending limited resources on this content.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Above 2% is rare. It typically indicates one of three things: (a) genuinely high-quality educational/tutorial content that people return to, (b) a creator with a deeply loyal niche audience, or (c) content that struck a strong emotional/cultural nerve.&lt;/p&gt;

&lt;p&gt;For creator vetting, my heuristic is: &lt;strong&gt;if median coin-per-view is below 1%, the audience is more passive than the follower count suggests; sponsorship ROI will probably disappoint.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How to compute this for any creator
&lt;/h2&gt;

&lt;p&gt;The data you need: a creator's recent videos with their view and coin counts. Bilibili exposes this through their public API — no auth required. You can use the open-source &lt;code&gt;bilibili-api&lt;/code&gt; Python library, or call their &lt;code&gt;/x/space/wbi/arc/search&lt;/code&gt; endpoint directly.&lt;/p&gt;

&lt;p&gt;If you'd rather skip the API integration entirely, I built a hosted scraper on Apify Store: &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt;. $5 per 1,000 results, free tier covers ~1,000 results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;statistics&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;median&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Get a creator's last 30 videos
# user_id (mid) is the number in their profile URL: space.bilibili.com/{mid}
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/bilibili-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_videos&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;userIds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;546195&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;   &lt;span class="c1"&gt;# 老番茄 (a well-known Bilibili gamer)
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;videos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;video&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;videos&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Compute coin-per-view ratio per video
&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;videos&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;views&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;viewCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;views&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# skip videos with too few views to be meaningful
&lt;/span&gt;        &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="n"&gt;coins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;coinCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coins&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;views&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

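&lt;span class="c1"&gt;# median() and max() raise on an empty list, so stop early if nothing qualified
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;SystemExit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No videos with at least 1,000 views to analyze&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Videos analyzed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;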
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Median coin-per-view: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Best video coin-per-view: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Categorize
&lt;/span&gt;&lt;span class="n"&gt;median_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;median_ratio&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EXCEPTIONAL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;median_ratio&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;STRONG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;median_ratio&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AVERAGE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PASSIVE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Audience quality: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run this against any Bilibili creator's user ID and you have a concrete answer about audience engagement quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  A workflow for vetting creators at scale
&lt;/h2&gt;

&lt;p&gt;If you're building a creator marketing program for the Chinese market, this is the workflow I've seen work for teams:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Gather candidates.&lt;/strong&gt; From competitor sponsorship lists, from category trending, or from agency recommendations. Aim for 30-50 candidates per round.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pull their recent video portfolios.&lt;/strong&gt; Use &lt;code&gt;user_videos&lt;/code&gt; mode to get the last 20-30 videos per creator.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compute aggregate metrics.&lt;/strong&gt; For each creator: median coin-per-view, median favorite-per-view, median danmaku-per-view, view consistency (standard deviation).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filter on quality threshold.&lt;/strong&gt; Drop anyone with median coin-per-view below 1%. This usually cuts the candidate list by 40-60%.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual review of the survivors.&lt;/strong&gt; Watch a sample of their videos. Check for content fit. Evaluate sponsorship history (do their sponsored posts feel native or forced?).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Negotiate from the qualified shortlist.&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
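
&lt;p&gt;Steps 3 and 4 can be sketched roughly as follows. This is an illustration under assumptions: &lt;code&gt;viewCount&lt;/code&gt; and &lt;code&gt;coinCount&lt;/code&gt; are the fields used in the script above, while &lt;code&gt;favoriteCount&lt;/code&gt; and &lt;code&gt;danmakuCount&lt;/code&gt; are assumed field names that you should check against your actual output, and the demo numbers are invented purely to exercise the functions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from statistics import median, pstdev

MIN_VIEWS = 1000           # same floor as the per-creator script above
COIN_THRESHOLD_PCT = 1.0   # step 4 cutoff


def pct_per_view(videos, field):
    """Median of field / viewCount across videos, as a percentage."""
    return median((v.get(field) or 0) / v["viewCount"] * 100 for v in videos)


def creator_metrics(videos):
    """Aggregate metrics for one creator, or None if no video clears MIN_VIEWS."""
    usable = [v for v in videos if (v.get("viewCount") or 0) &amp;gt;= MIN_VIEWS]
    if not usable:
        return None
    return {
        "videos": len(usable),
        "coin_pct": pct_per_view(usable, "coinCount"),
        # favoriteCount and danmakuCount are assumed field names
        "favorite_pct": pct_per_view(usable, "favoriteCount"),
        "danmaku_pct": pct_per_view(usable, "danmakuCount"),
        # population standard deviation of views as a rough consistency measure
        "view_stdev": pstdev(v["viewCount"] for v in usable),
    }


def shortlist(creators):
    """Keep creators whose median coin-per-view meets the threshold."""
    kept = {}
    for name, videos in creators.items():
        m = creator_metrics(videos)
        if m and m["coin_pct"] &amp;gt;= COIN_THRESHOLD_PCT:
            kept[name] = m
    return kept


# Made-up numbers purely to exercise the functions.
demo = {
    "creator_a": [
        {"viewCount": 100_000, "coinCount": 1_500, "favoriteCount": 2_000, "danmakuCount": 800},
        {"viewCount": 50_000, "coinCount": 600, "favoriteCount": 900, "danmakuCount": 300},
    ],
    "creator_b": [
        {"viewCount": 200_000, "coinCount": 400, "favoriteCount": 1_000, "danmakuCount": 500},
    ],
}
print(sorted(shortlist(demo)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Returning the full metrics dict for each survivor, rather than just a pass/fail flag, keeps the favorite and danmaku ratios at hand for the manual review step.&lt;/p&gt;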

&lt;p&gt;Total cost using a hosted scraper: ~$5-10 in scraping for a 50-creator vetting round. Compared to agency rates for the same work ($500-2000 per round), the math is obvious once you do it.&lt;/p&gt;
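
&lt;p&gt;The scraping figure is easy to sanity-check. A quick sketch, assuming one billed result per video and the $5 per 1,000 results price quoted earlier (the function name &lt;code&gt;round_cost&lt;/code&gt; is just for this example):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;PRICE_PER_1000_RESULTS = 5.00  # USD, the price quoted earlier


def round_cost(creators, videos_per_creator):
    """Scraping cost in USD, assuming one billed result per video."""
    return creators * videos_per_creator / 1000 * PRICE_PER_1000_RESULTS


for n in (20, 30):
    print(f"50 creators x {n} videos: ${round_cost(50, n):.2f}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That works out to $5.00 for 20 videos per creator and $7.50 for 30.&lt;/p&gt;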

&lt;h2&gt;
  
  
  Cross-platform creator vetting
&lt;/h2&gt;

&lt;p&gt;Bilibili is not the whole story. If you're vetting creators for a comprehensive China presence:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili&lt;/strong&gt; for video content (gaming, tech, anime, education)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RedNote (Xiaohongshu)&lt;/strong&gt; for product-discovery content (beauty, fashion, lifestyle, food)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weibo&lt;/strong&gt; for public discourse and broad reach campaigns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each platform has different engagement signals. Bilibili has coins; RedNote has saves (similarly scarce intent-to-buy signal); Weibo has reposts and verified-account hierarchy. A creator strong on one isn't necessarily strong on others.&lt;/p&gt;

&lt;p&gt;I maintain scrapers for all three on Apify Store under the &lt;a href="https://apify.com/zhorex" rel="noopener noreferrer"&gt;zhorex profile&lt;/a&gt;, with consistent output schemas across the suite. Same pricing model ($5/1000 results), same Apify infrastructure. If you're doing cross-platform creator analytics, the consistency saves integration time.&lt;/p&gt;

&lt;h2&gt;
  
  
  When this approach fails
&lt;/h2&gt;

&lt;p&gt;Two cases where coin-per-view ratio is a misleading signal:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Brand-new creators with very few videos.&lt;/strong&gt; If a creator has uploaded 3 videos and one went viral with high coins, the ratio looks artificially high. Wait until the creator has 15-20 videos before computing a median.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Livestream-focused creators.&lt;/strong&gt; Bilibili lets creators upload archived livestreams. Coin economics are different in a livestream context (gifts replace coins), so livestream-heavy creators need a different analysis.&lt;/p&gt;

&lt;p&gt;For everyone else, coin-per-view ratio is the single best signal I've found for vetting Bilibili creator quality at scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this won't tell you
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Whether the audience is geographically right for your campaign (need follower demographics, which require auth)&lt;/li&gt;
&lt;li&gt;Whether the creator has done sponsorships before that flopped (need to scrape their content for promo patterns)&lt;/li&gt;
&lt;li&gt;Whether their audience overlaps with your target customer profile (need cross-reference with other platforms)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Treat coin-per-view as the engagement-quality filter. Everything else still requires manual review or additional data sources.&lt;/p&gt;




&lt;p&gt;If you're working on creator marketing for the Chinese market and want to compare notes on what works — drop a comment. I write about Chinese platform analytics (Bilibili, RedNote, Weibo) and the build-vs-buy tradeoffs around them.&lt;/p&gt;

&lt;p&gt;Hosted Bilibili scraper: &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/bilibili-scraper&lt;/a&gt;&lt;br&gt;
Other Chinese platform scrapers in the same suite: &lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;RedNote&lt;/a&gt; for product-discovery content, &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;Weibo&lt;/a&gt; for public discourse.&lt;/p&gt;

</description>
      <category>china</category>
      <category>marketing</category>
      <category>datascience</category>
      <category>analytics</category>
    </item>
    <item>
      <title>Bilibili API Alternative: Scrape Videos &amp; Creators in 30 Lines of Python (No Browser, No Proxy)</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Wed, 29 Apr 2026 17:41:42 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/the-easiest-chinese-platform-to-scrape-in-python-in-2026-bilibili-in-under-30-lines-5216</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/the-easiest-chinese-platform-to-scrape-in-python-in-2026-bilibili-in-under-30-lines-5216</guid>
      <description>&lt;p&gt;If you've ever tried to scrape Chinese-platform data at scale, you know the landscape: TLS fingerprinting walls, residential-proxy bills, integration headaches that take weeks before you even see one row of data.&lt;/p&gt;

&lt;p&gt;Bilibili is the outlier. &lt;strong&gt;No API key. No browser. No proxy.&lt;/strong&gt; Pure HTTP. Runs in 256 MB of RAM. Genuinely the easiest Chinese platform to scrape in 2026.&lt;/p&gt;

&lt;p&gt;If you're tracking a gaming brand launch, doing creator research, building a market-trends dataset, or feeding a recommender system, Bilibili is where you should start. This post walks through how to scrape it from scratch in Python, what data you actually get, and when it makes sense to switch from DIY to a hosted scraper.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Bilibili is unusually scrape-friendly
&lt;/h2&gt;

&lt;p&gt;Bilibili (哔哩哔哩) is China's YouTube — 300M+ monthly active users, skewed Gen Z and millennials, dominant in anime, gaming, tech, and educational content. From a scraping perspective, three things make it different from RedNote and Weibo:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Public JSON endpoints.&lt;/strong&gt; Bilibili exposes JSON for video metadata, popular/trending, user info, and comments. Most don't require auth for public content.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Stable structure.&lt;/strong&gt; The endpoint shapes change rarely, typically with months between meaningful updates, so a scraper you ship tends to keep working with only occasional maintenance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Globally accessible.&lt;/strong&gt; No geo-fencing on most public content. You can hit it from a US datacenter without proxies for moderate volumes.&lt;/p&gt;

&lt;p&gt;The trade-off: the comment endpoints are throttled aggressively for cloud IPs (only the top ~3 comments come back), and search has its own constraints (more on that below). For everything else, Bilibili is a friendly target.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you can actually extract
&lt;/h2&gt;

&lt;p&gt;Per video:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;BVID, title, description, URL, duration&lt;/li&gt;
&lt;li&gt;View count, like count, &lt;strong&gt;danmaku&lt;/strong&gt; (live scrolling comments) count, &lt;strong&gt;coin&lt;/strong&gt; (tip) count, favorite count, share count, reply count&lt;/li&gt;
&lt;li&gt;Author MID, name, publish date, category, tags&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Per user/creator:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MID, name, signature/bio, level, follower count&lt;/li&gt;
&lt;li&gt;Total archive (video) count&lt;/li&gt;
&lt;li&gt;Profile URL&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Per comment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Comment ID, text, like count, reply count, created timestamp&lt;/li&gt;
&lt;li&gt;Author MID, name, avatar, level&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bilibili-specific signal worth highlighting: &lt;strong&gt;coin-per-view ratio&lt;/strong&gt;. When users "throw coins" at a video, they spend a finite daily allowance — that's a stronger quality signal than likes (which are free). Useful for creator vetting.&lt;/p&gt;
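&lt;p&gt;As a rough illustration of that ratio, here is a minimal sketch. It assumes each video carries a &lt;code&gt;stat&lt;/code&gt; object with &lt;code&gt;view&lt;/code&gt; and &lt;code&gt;coin&lt;/code&gt; counts (the &lt;code&gt;coin&lt;/code&gt; key is my assumption about the field name), and the helper names &lt;code&gt;coin_per_view&lt;/code&gt; and &lt;code&gt;rank_by_coin_ratio&lt;/code&gt; are hypothetical. The sample numbers are made up.&lt;/p&gt;

```python
def coin_per_view(stat: dict) -> float:
    """Coins per view for one video; 0.0 when the view count is missing or zero."""
    views = stat.get("view") or 0
    coins = stat.get("coin") or 0  # "coin" key is an assumed field name
    return coins / views if views else 0.0


def rank_by_coin_ratio(videos: list[dict]) -> list[dict]:
    """Return videos sorted from highest to lowest coin-per-view ratio."""
    return sorted(
        videos,
        key=lambda v: coin_per_view(v.get("stat") or {}),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = [
        {"bvid": "BV_demo_a", "stat": {"view": 200_000, "coin": 1_000}},
        {"bvid": "BV_demo_b", "stat": {"view": 50_000, "coin": 2_500}},
    ]
    for video in rank_by_coin_ratio(sample):
        print(video["bvid"], round(coin_per_view(video["stat"]), 4))
```

&lt;p&gt;Guarding against a zero view count matters in practice, because freshly published videos can show up with no views yet.&lt;/p&gt;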

&lt;h2&gt;
  
  
  A minimal Python scraper (≈30 lines)
&lt;/h2&gt;

&lt;p&gt;This pulls Bilibili's trending/popular videos as clean structured rows. No external dependencies beyond &lt;code&gt;httpx&lt;/code&gt;. No request signing. Works on the first run.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;HEADERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User-Agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Accept&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Referer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.bilibili.com/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_popular_bilibili&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.bilibili.com/x/web-interface/popular&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HEADERS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;list&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;owner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;owner&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
            &lt;span class="n"&gt;stat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
            &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bvid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bvid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;author&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;viewCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;view&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;likeCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;like&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;danmakuCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;danmaku&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;duration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;duration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.bilibili.com/video/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bvid&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;fetch_popular_bilibili&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ensure_ascii&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the whole scraper. Plain HTTP. No browser. No proxy. Returns Bilibili's currently-trending videos with the engagement metrics that matter.&lt;/p&gt;
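&lt;p&gt;One thing the fetcher does not do is tell an empty page apart from an error response. The sketch below is one way to handle that, under the assumption that the JSON envelope carries a numeric &lt;code&gt;code&lt;/code&gt; (zero on success) and a &lt;code&gt;message&lt;/code&gt; next to &lt;code&gt;data&lt;/code&gt;. Check a live response before relying on it. The names &lt;code&gt;unwrap&lt;/code&gt; and &lt;code&gt;BilibiliAPIError&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
class BilibiliAPIError(RuntimeError):
    """Raised when the response envelope reports a non-zero status code."""


def unwrap(payload: dict) -> dict:
    """Return the "data" object, or raise if the envelope signals an error.

    Assumes an envelope shaped like {"code": 0, "message": "...", "data": {...}}.
    """
    code = payload.get("code", 0)
    if code != 0:
        raise BilibiliAPIError(f"code={code} message={payload.get('message')!r}")
    return payload.get("data") or {}


if __name__ == "__main__":
    # Hand-written payloads, not real API responses.
    print(unwrap({"code": 0, "message": "0", "data": {"list": []}}))
    try:
        unwrap({"code": -1, "message": "demo failure"})
    except BilibiliAPIError as exc:
        print("error:", exc)
```

&lt;p&gt;In the fetcher, &lt;code&gt;data = unwrap(r.json())&lt;/code&gt; could stand in for the &lt;code&gt;r.json().get("data") or {}&lt;/code&gt; line, so a blocked request fails loudly instead of returning zero rows.&lt;/p&gt;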

&lt;h2&gt;
  
  
  Sample output
&lt;/h2&gt;

&lt;p&gt;The script prints a JSON list; one row from it looks like this:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bvid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BV1YXDfBUETP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AI教程入门:从零开始"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"author"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"技术老师"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"viewCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1570113&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"likeCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;182455&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"danmakuCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7466&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;767&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.bilibili.com/video/BV1YXDfBUETP"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  When DIY breaks down
&lt;/h2&gt;

&lt;p&gt;The 30-line popular fetcher is great for prototyping and tracking trends. It falls over when you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search by keyword.&lt;/strong&gt; Bilibili's search endpoints have additional request requirements that change over time. Reimplementing them yourself means maintaining signing logic and updating it every time Bilibili changes it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comments at depth.&lt;/strong&gt; Cloud IPs get throttled to ~3 comments per video. You need residential IPs or session continuity to paginate full threads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-mode pipelines.&lt;/strong&gt; Search + video detail + user-videos + comments + popular all wired together with retries, rate-limit handling, and consistent output schemas — that's a weekend's work, not 30 lines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-running jobs.&lt;/strong&gt; These need cookie refresh, exponential backoff on rate-limit responses, image URL normalization, and handling for edge cases on user pages with paid memberships.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output stability.&lt;/strong&gt; Bilibili sometimes renames a single field without notice, and your downstream pipeline breaks.&lt;/li&gt;
&lt;/ul&gt;
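
&lt;p&gt;To give a sense of the retry work in that list, here is a minimal sketch of exponential backoff with jitter. It is not the hosted Actor's implementation. &lt;code&gt;RateLimited&lt;/code&gt; and &lt;code&gt;with_backoff&lt;/code&gt; are hypothetical names, and it assumes your own fetch function raises &lt;code&gt;RateLimited&lt;/code&gt; whenever it sees a rate-limit response.&lt;/p&gt;

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical signal that the caller raises on a rate-limit response."""


def with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds, doubling the wait cap after each RateLimited."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: a random wait in [0, delay] avoids synchronized retries.
            sleep(random.uniform(0, delay))
    raise ValueError("max_attempts must be at least 1")
```

&lt;p&gt;The &lt;code&gt;sleep&lt;/code&gt; parameter is injectable so the retry logic can be tested without real waiting. Even this small piece needs decisions about caps, jitter, and which responses count as retryable.&lt;/p&gt;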

&lt;p&gt;When that's where you are, point a hosted Actor at it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hosted alternative
&lt;/h2&gt;

&lt;p&gt;I maintain the &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;strong&gt;Bilibili Scraper on Apify&lt;/strong&gt;&lt;/a&gt; — five modes (&lt;code&gt;search&lt;/code&gt;, &lt;code&gt;video_detail&lt;/code&gt;, &lt;code&gt;video_comments&lt;/code&gt;, &lt;code&gt;user_videos&lt;/code&gt;, &lt;code&gt;popular&lt;/code&gt;) in one Actor. Pure HTTP under the hood, same as above, but with all the rate-limiting + retry + schema-stability work already done — including search, which the DIY example skips.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/bilibili-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;人工智能教程&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;danmakuCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pay-per-result: $0.005 per item. The first 1,000 items each month are covered by Apify's free tier ($5/month credit).&lt;/p&gt;
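
&lt;p&gt;For budgeting, the arithmetic can be sketched as below. It assumes the whole $5 monthly credit goes to this Actor and that no other platform charges apply, which may not match your account. &lt;code&gt;monthly_cost&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
from decimal import Decimal

PRICE_PER_ITEM = Decimal("0.005")  # USD per item, from the pricing above
FREE_CREDIT = Decimal("5")         # USD per month, from the pricing above


def monthly_cost(items: int) -> Decimal:
    """Estimated out-of-pocket USD for one month, after the free credit.

    Assumes the entire credit is spent on this Actor and nothing else.
    """
    gross = PRICE_PER_ITEM * items
    return max(Decimal("0"), gross - FREE_CREDIT)


if __name__ == "__main__":
    for n in (1_000, 10_000, 50_000):
        print(n, monthly_cost(n))
```

&lt;p&gt;&lt;code&gt;Decimal&lt;/code&gt; is used instead of floats so that small per-item prices add up without rounding drift.&lt;/p&gt;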

&lt;h2&gt;
  
  
  Common questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Will Bilibili add anti-bot eventually?&lt;/strong&gt; Some endpoints already have additional request requirements; popular/trending and basic video metadata have stayed open for years. The hosted Actor tracks any changes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What about geo-restricted content?&lt;/strong&gt; Some licensed videos (anime, copyrighted music) are mainland-only. Public metadata for those still works; the streams don't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is this legal?&lt;/strong&gt; The hosted Actor only accesses Bilibili's public HTTP endpoints, which serve the same data any browser visitor sees without logging in. No authentication is bypassed. Laws on scraping vary by jurisdiction, so check the rules that apply to you.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building Chinese-platform intelligence at scale?
&lt;/h2&gt;

&lt;p&gt;If you're combining Bilibili with other Chinese platforms (Weibo, RedNote), I maintain the full suite under the same code style and schema conventions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;Bilibili Scraper&lt;/a&gt; — &lt;em&gt;(this one)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;Weibo Scraper&lt;/a&gt; — microblogging, 580M MAU&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;RedNote (Xiaohongshu) Scraper&lt;/a&gt; — lifestyle social&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/rednote-shop-scraper" rel="noopener noreferrer"&gt;RedNote Shop Scraper&lt;/a&gt; — Xiaohongshu e-commerce&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Running 1,000+ items per week?&lt;/strong&gt; I offer custom output schemas, dedicated proxy pools, SLA support, and volume discounts above 50K items/month. DM me on Apify or open an Issue with the subject "Enterprise inquiry".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Found a bug?&lt;/strong&gt; Open an Issue on the Actor page — I usually ship fixes within 48 hours.&lt;/p&gt;

&lt;p&gt;If this saved you time, a 30-second review on the Apify Store helps a lot. ⭐&lt;/p&gt;

</description>
      <category>python</category>
      <category>webscraping</category>
      <category>china</category>
      <category>datascience</category>
    </item>
    <item>
      <title>How to scrape RedNote (Xiaohongshu) with Python in 2026 — the auth/signing problem and how to handle it</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Sat, 25 Apr 2026 01:13:23 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/how-to-scrape-rednote-xiaohongshu-with-python-in-2026-the-authsigning-problem-and-how-to-3f9e</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/how-to-scrape-rednote-xiaohongshu-with-python-in-2026-the-authsigning-problem-and-how-to-3f9e</guid>
      <description>&lt;p&gt;RedNote (Xiaohongshu, 小红书, sometimes "Little Red Book" or just XHS) is the platform a lot of Western teams realized they needed to monitor in 2024-2025, when the TikTok regulatory mess in the US sent millions of users — and brand attention — toward Chinese platforms. It's now China's #1 lifestyle and product-discovery network, with 300M+ monthly active users and a search-driven discovery model that makes it different from every other Chinese social platform.&lt;/p&gt;

&lt;p&gt;The problem: there's no official public API. Western teams who try to monitor it usually end up either (a) paying enterprise vendors $20-50k/year for limited China coverage, or (b) trying to scrape it themselves and discovering that RedNote has one of the more aggressive anti-scraping stacks in Chinese social.&lt;/p&gt;

&lt;p&gt;This article walks through the actual technical challenges and shows you both DIY and hosted approaches with real Python code. I've shipped a hosted RedNote scraper on Apify that I'll mention later — but the goal here is for you to understand the problem space well enough to make an informed build-vs-buy decision, not to sell you anything.&lt;/p&gt;

&lt;h2&gt;
  
  
  What RedNote actually serves
&lt;/h2&gt;

&lt;p&gt;Before we go technical: what data does RedNote expose, and what's actually useful?&lt;/p&gt;

&lt;p&gt;A RedNote post is structured roughly like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Title&lt;/strong&gt; (often very short, sometimes empty)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Body text&lt;/strong&gt; — long-form description with product mentions, hashtags, location tags&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image carousel&lt;/strong&gt; — 1-9 images. Critical: a non-trivial portion of product info lives in image text overlays, not in the body&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engagement metrics&lt;/strong&gt; — likes, saves, comments, shares&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Author profile&lt;/strong&gt; — username, avatar, follower/following counts, bio, verification, location&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tags / categories&lt;/strong&gt; — hashtags and platform-assigned categories&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most monitoring use cases, the metric that matters more than likes is &lt;strong&gt;saves&lt;/strong&gt;. Saves on RedNote are the closest equivalent to "I want to buy this later" — they correlate with purchase intent. Likes on RedNote are casual engagement, similar to Twitter likes.&lt;/p&gt;

&lt;p&gt;Profile data is structured similarly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User ID, RedNote ID (red ID), nickname, avatar&lt;/li&gt;
&lt;li&gt;Bio / description&lt;/li&gt;
&lt;li&gt;Follower / following counts&lt;/li&gt;
&lt;li&gt;Location, gender, profile tags&lt;/li&gt;
&lt;li&gt;Total likes received across all posts&lt;/li&gt;
&lt;li&gt;Verification status&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The technical challenges (why this is harder than scraping Twitter)
&lt;/h2&gt;

&lt;p&gt;If you've scraped Western social platforms, your default toolkit is probably &lt;code&gt;httpx&lt;/code&gt; or &lt;code&gt;requests&lt;/code&gt; plus maybe a residential proxy. RedNote is going to break each of those defaults.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge 1: TLS fingerprinting
&lt;/h3&gt;

&lt;p&gt;RedNote uses TLS fingerprinting (specifically JA3/JA4) to identify and block requests that don't come from real browsers. Both &lt;code&gt;requests&lt;/code&gt; and &lt;code&gt;httpx&lt;/code&gt; present a Python-specific TLS fingerprint that RedNote's bot-detection layer recognizes immediately.&lt;/p&gt;

&lt;p&gt;The standard fix is to use &lt;code&gt;curl_cffi&lt;/code&gt;, which lets you spoof a Chrome or Safari TLS fingerprint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;curl_cffi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;curl_requests&lt;/span&gt;

&lt;span class="c1"&gt;# Spoof Chrome 120's TLS fingerprint
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;curl_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.xiaohongshu.com/explore&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;impersonate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chrome120&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This alone gets you past the first layer of detection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge 2: Request signing
&lt;/h3&gt;

&lt;p&gt;RedNote signs every API request with a value called &lt;code&gt;x-s&lt;/code&gt; (sometimes seen as &lt;code&gt;xs&lt;/code&gt;) plus other parameters like &lt;code&gt;x-t&lt;/code&gt; and &lt;code&gt;x-s-common&lt;/code&gt;. These are computed client-side from a JavaScript function in their web app.&lt;/p&gt;

&lt;p&gt;The signing function changes roughly monthly. When it changes, every scraper using the old signing logic breaks until someone reverse-engineers the new function.&lt;/p&gt;

&lt;p&gt;Here's roughly what you need to do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudo-code — actual signing logic is more complex
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_signing_headers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    The actual logic is reverse-engineered from RedNote&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s web client JS.
    This requires reading their obfuscated bundle and reproducing it in Python.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;body_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;separators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;

    &lt;span class="c1"&gt;# Real implementation involves:
&lt;/span&gt;    &lt;span class="c1"&gt;# - Specific input concatenation order
&lt;/span&gt;    &lt;span class="c1"&gt;# - Custom hashing scheme (not standard HMAC)
&lt;/span&gt;    &lt;span class="c1"&gt;# - Several "magic constants" that change when RedNote rotates
&lt;/span&gt;    &lt;span class="c1"&gt;# - Sometimes a captcha-derived token
&lt;/span&gt;
    &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url_path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;data=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;body_str&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;t=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;x_s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;md5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# This is NOT the actual algorithm
&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x_s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;# x-s-common is computed separately
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The actual signing algorithm is more complex than what I've shown. There are open-source libraries that have reverse-engineered it (&lt;code&gt;xhs-api&lt;/code&gt; and similar on GitHub). They get you most of the way there, but expect to patch them whenever RedNote rotates the signing function.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge 3: IP-level rate limiting and datacenter blocking
&lt;/h3&gt;

&lt;p&gt;RedNote blocks requests from datacenter IPs (AWS, GCP, Azure, DigitalOcean, etc.) within minutes. You need residential proxies, ideally with Chinese geolocation or at least Asia-Pacific.&lt;/p&gt;

&lt;p&gt;Even with residential IPs, there's a per-IP rate limit. Realistic throughput is around 10-20 requests per minute per IP before you start getting 412/418 errors and eventually IP bans.&lt;/p&gt;
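
&lt;p&gt;A minimal sketch of pacing and backoff under that limit, assuming a budget at the low end of the range (10 requests per minute) and treating 412/418 as throttling signals. The &lt;code&gt;fetch&lt;/code&gt; callable, the function names, and the 5-second base delay are illustrative assumptions, not values RedNote documents.&lt;/p&gt;

```python
import random
import time
from typing import Callable

THROTTLE_STATUSES = {412, 418}  # status codes the article associates with throttling


def min_interval(requests_per_minute: float) -> float:
    """Seconds to wait between requests to stay within a per-minute budget."""
    return 60.0 / requests_per_minute


def backoff_delay(attempt: int, base: float = 5.0, cap: float = 300.0) -> float:
    """Exponential backoff before jitter: base * 2**attempt, capped at `cap`."""
    return min(cap, base * (2 ** attempt))


def fetch_with_backoff(
    fetch: Callable[[], int],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call fetch() until it returns a non-throttled status or attempts run out.

    `fetch` is a hypothetical zero-argument callable that performs one
    request and returns the HTTP status code.
    """
    status = fetch()
    for attempt in range(max_attempts - 1):
        if status not in THROTTLE_STATUSES:
            break
        # Add up to 20% random jitter so parallel workers do not retry in lockstep.
        delay = backoff_delay(attempt)
        sleep(delay + random.uniform(0, delay * 0.2))
        status = fetch()
    return status
```

&lt;p&gt;Between successful requests you would still sleep for &lt;code&gt;min_interval(10)&lt;/code&gt; seconds per IP; the backoff only covers the case where the limit has already been hit.&lt;/p&gt;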

&lt;h3&gt;
  
  
  Challenge 4: SPA / dynamic rendering for some endpoints
&lt;/h3&gt;

&lt;p&gt;Search and the explore feed are loaded via AJAX after initial page load, but a few endpoints (some user pages, certain post types) only render their data in the Vue.js application state. You either need to extract data from the inline &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tag (look for &lt;code&gt;window.__INITIAL_STATE__&lt;/code&gt;) or render with Playwright.&lt;/p&gt;
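
&lt;p&gt;For the inline-script route, a rough sketch of pulling the state object out of already-downloaded HTML. It assumes the assignment is a plain object literal that is valid JSON apart from bare &lt;code&gt;undefined&lt;/code&gt; values; the function name is illustrative and the keys inside the state are not covered here.&lt;/p&gt;

```python
import json
import re

MARKER = "window.__INITIAL_STATE__"


def extract_initial_state(html: str) -> dict:
    """Return the object assigned to window.__INITIAL_STATE__ in an HTML page."""
    marker_at = html.find(MARKER)
    if marker_at == -1:
        raise ValueError("no __INITIAL_STATE__ assignment found")
    brace_at = html.find("{", marker_at)
    if brace_at == -1:
        raise ValueError("assignment has no object literal")
    # Assumption: the blob is JSON apart from bare `undefined` values,
    # which JSON does not allow, so they are mapped to null first.
    # This crude substitution would also touch a string containing ": undefined".
    blob = re.sub(r":\s*undefined\b", ": null", html[brace_at:])
    # raw_decode parses one JSON value and ignores whatever follows it
    # (the trailing semicolon and the rest of the page).
    state, _end = json.JSONDecoder().raw_decode(blob)
    return state
```

&lt;p&gt;If the blob turns out not to be JSON-compatible for a given page type, rendering with Playwright remains the fallback.&lt;/p&gt;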

&lt;h3&gt;
  
  
  Challenge 5: Login walls on certain features
&lt;/h3&gt;

&lt;p&gt;True keyword-filtered search requires login. Without login, you get the explore feed (trending/recommended content for your keyword), which is useful but not the same thing. This is a product limitation, not a scraping limitation: a scraper sees exactly what a browser in the same login state sees, so you either provide cookies from a logged-in session or accept the explore-feed fallback.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 1: Build it yourself
&lt;/h2&gt;

&lt;p&gt;If you have ops capacity to maintain it (someone who can read JavaScript and reverse-engineer signing functions monthly), DIY is feasible. Here's a minimal example using &lt;code&gt;curl_cffi&lt;/code&gt; plus an open-source signing library.&lt;/p&gt;

&lt;p&gt;First install:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;curl_cffi xhs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: &lt;code&gt;xhs&lt;/code&gt; is one of several open-source libraries on GitHub that wrap RedNote's API. Check their commit history before depending on one — the abandoned ones break monthly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;xhs&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;XhsClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;xhs.exceptions&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataFetchError&lt;/span&gt;

&lt;span class="c1"&gt;# You need to provide your own cookies and signing function URL
# The 'sign' function comes from the library's reverse-engineered JS
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;web_session&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Implementation provided by the library
&lt;/span&gt;    &lt;span class="c1"&gt;# When RedNote rotates, you'll need to update this
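&lt;/span&gt;    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;NotImplementedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plug in the library's signing function here&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;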

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;XhsClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;cookie&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;abRequestId=...; webBuild=...; xsecappid=xhs-pc-web; ...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;sign&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sign&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Get a user's posts
&lt;/span&gt;    &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5cfbc3f10000000018023ebb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_user_notes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;notes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Title: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;display_title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Likes: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;interact_info&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="si"&gt;{}&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;liked_count&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;DataFetchError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RedNote rejected the request: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Common causes:
&lt;/span&gt;    &lt;span class="c1"&gt;# - Signing function out of date (update from upstream)
&lt;/span&gt;    &lt;span class="c1"&gt;# - Cookie expired (re-login)
&lt;/span&gt;    &lt;span class="c1"&gt;# - IP throttled (rotate proxy)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cookie string comes from logging into RedNote in a browser and copying the relevant cookies from DevTools. The cookies expire, typically after a few days, so you'll need to refresh them periodically.&lt;/p&gt;

&lt;p&gt;Here's the honest cost breakdown for DIY:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cost component&lt;/th&gt;
&lt;th&gt;Estimate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial setup (researching libraries, getting first scrape working)&lt;/td&gt;
&lt;td&gt;8-16 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Residential proxy (Bright Data, Oxylabs, etc.)&lt;/td&gt;
&lt;td&gt;$50-200/month for moderate volume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-incident maintenance when RedNote rotates&lt;/td&gt;
&lt;td&gt;4-8 hours, 1-2x/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ongoing: cookie refresh, error handling&lt;/td&gt;
&lt;td&gt;1-2 hours/week&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

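&lt;p&gt;If you have a developer whose time is worth $50-100/hour, the recurring rows in the table work out to roughly $450/month all-in at the low end for moderate scraping volumes, and more than $2,500/month if every row hits its upper bound.&lt;/p&gt;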
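
&lt;p&gt;The monthly figure follows from the recurring rows of the table. Here is a small sketch of that arithmetic, assuming 52/12 weeks per month and leaving out the one-off setup hours; the function and parameter names are illustrative.&lt;/p&gt;

```python
WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a month


def monthly_cost(rate, proxy, rotation_hours, rotations, weekly_hours):
    """Recurring monthly DIY cost in dollars; the one-off setup is excluded."""
    hours = rotation_hours * rotations + weekly_hours * WEEKS_PER_MONTH
    return proxy + rate * hours


# Lowest value of every row in the table, at $50/hour.
low = monthly_cost(rate=50, proxy=50, rotation_hours=4, rotations=1, weekly_hours=1)
# Highest value of every row in the table, at $100/hour.
high = monthly_cost(rate=100, proxy=200, rotation_hours=8, rotations=2, weekly_hours=2)
print(f"low: ${low:,.0f}/month, high: ${high:,.0f}/month")
```

&lt;p&gt;With the table's lowest values this gives about $467/month and with the highest about $2,667/month, so where you land depends mostly on how often the signing rotates and how long each fix takes.&lt;/p&gt;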

&lt;h2&gt;
  
  
  Approach 2: Use a hosted scraper
&lt;/h2&gt;

&lt;p&gt;The build-vs-buy math changes if you don't have someone on the team who can read reverse-engineered JavaScript and patch signing logic. Hosted Apify Actors handle that for you.&lt;/p&gt;

&lt;p&gt;Several developers (including me) maintain RedNote scrapers on Apify. Mine is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/rednote-xiaohongshu-scraper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;$5 per 1,000 results&lt;/li&gt;
&lt;li&gt;14 paying users currently, 38 on free tier&lt;/li&gt;
&lt;li&gt;Average issue response time: 1.6 hours&lt;/li&gt;
&lt;li&gt;88.8% success rate (the gap is mostly RedNote-side transient errors)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using it from Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Search
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/rednote-xiaohongshu-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;skincare routine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filterByMinLikes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;  &lt;span class="c1"&gt;# Only return posts with 100+ likes
&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Iterate over results
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Title: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Likes: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;likes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Author: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;author&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;nickname&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;URL: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;postUrl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output JSON is mostly flat, with a list of image URLs and one nested &lt;code&gt;author&lt;/code&gt; object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"postId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"69d269310000000023017e07"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"postUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.xiaohongshu.com/explore/69d269310000000023017e07"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"normal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Morning skincare routine for dry skin"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"images"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"https://sns-webpic-qc.xhscdn.com/..."&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"likes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15234&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"author"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"575d32285e87e733f0162c0a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"nickname"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BeautyQueen"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"avatar"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://sns-avatar-qc.xhscdn.com/..."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-25T21:14:30Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
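
&lt;p&gt;The sample record keeps the author under &lt;code&gt;author&lt;/code&gt; and the image URLs in a list, so a spreadsheet or CSV export needs one small flattening step. A sketch using only the fields shown above; the column names &lt;code&gt;authorUserId&lt;/code&gt; and &lt;code&gt;authorNickname&lt;/code&gt; and the pipe separator are my own choices, not part of the actor's schema.&lt;/p&gt;

```python
import csv
import io


def flatten_post(post: dict) -> dict:
    """Flatten one result into a single-level dict suitable for a CSV row."""
    row = {key: value for key, value in post.items() if key not in ("author", "images")}
    author = post.get("author") or {}
    row["authorUserId"] = author.get("userId")
    row["authorNickname"] = author.get("nickname")
    # Join image URLs into one cell; the pipe separator is an arbitrary choice.
    row["images"] = "|".join(post.get("images") or [])
    return row


def to_csv(posts: list) -> str:
    """Serialize flattened posts to CSV text, using the first row's keys as header."""
    rows = [flatten_post(post) for post in posts]
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

&lt;p&gt;Feeding it the items from &lt;code&gt;iterate_items()&lt;/code&gt; gives one row per post; fields missing from a later record are left empty rather than raising.&lt;/p&gt;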



&lt;p&gt;This is one option among several on Apify Store. EasyApi has the most users by volume; OrbitData Labs has a different all-in-one approach. Pricing is roughly the same across them ($5/1000 ± $1). Differences are in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Output schema (some return RedNote's raw nested API response, some flatten it)&lt;/li&gt;
&lt;li&gt;Update frequency (some are abandoned and break for weeks at a time)&lt;/li&gt;
&lt;li&gt;Mode coverage (some only do search; others handle profiles, comments, videos, etc.)&lt;/li&gt;
&lt;li&gt;Issue response time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're evaluating, run a free-tier test on 2-3 of them with the same input and compare what you get back. The free tier costs you nothing.&lt;/p&gt;

&lt;h2&gt;
  
  
  When does each approach make sense?
&lt;/h2&gt;

&lt;p&gt;DIY (build it yourself):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have a Chinese-language ops team and can monitor breakages&lt;/li&gt;
&lt;li&gt;You're processing &amp;gt; 1M posts/month (the per-result cost of hosted starts to add up)&lt;/li&gt;
&lt;li&gt;You need to scrape behind login (which means you need cookies from logged-in accounts you control)&lt;/li&gt;
&lt;li&gt;You have specific data needs that no hosted scraper covers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hosted Apify actor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You don't have a dedicated scraper engineer&lt;/li&gt;
&lt;li&gt;Volume is variable or moderate (&amp;lt; 500k posts/month)&lt;/li&gt;
&lt;li&gt;You want to outsource the cat-and-mouse with RedNote's anti-bot updates&lt;/li&gt;
&lt;li&gt;You're prototyping and want to validate the approach before committing&lt;/li&gt;
&lt;/ul&gt;
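
&lt;p&gt;The volume thresholds in these lists come down to per-result pricing. A quick sketch of the arithmetic, assuming the $5 per 1,000 results quoted earlier and treating the DIY cost as a fixed monthly number (which understates it at high volume, since proxy spend grows too); the function names are illustrative.&lt;/p&gt;

```python
def hosted_cost(posts_per_month: int, price_per_thousand: float = 5.0) -> float:
    """Monthly hosted cost in dollars at a flat per-1,000-results price."""
    return posts_per_month / 1000 * price_per_thousand


def break_even_posts(diy_monthly_cost: float, price_per_thousand: float = 5.0) -> int:
    """Monthly volume at which hosted spend equals a fixed DIY monthly cost."""
    return round(diy_monthly_cost / price_per_thousand * 1000)


for volume in (50_000, 500_000, 1_000_000):
    print(f"{volume:>9,} posts/month: ${hosted_cost(volume):,.0f} hosted")
```

&lt;p&gt;Compare those numbers against your own DIY estimate, including the engineering time, rather than against the proxy bill alone.&lt;/p&gt;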

&lt;p&gt;The middle ground that often makes sense: use a hosted actor for the production data flow and build a thin DIY layer for any specific endpoints the hosted version doesn't cover. The hosted scraper carries the maintenance burden for the parts that break most often (search, profile, posts), and you keep custom DIY logic for the edges.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you do with the data downstream
&lt;/h2&gt;

&lt;p&gt;Scraping is one third of the problem. The other two:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sentiment analysis on Chinese text.&lt;/strong&gt; Pretrained Chinese BERT checkpoints (like &lt;code&gt;bert-base-chinese&lt;/code&gt; from Hugging Face) are a starting point, but they still need a sentiment classification head, and accuracy varies widely by domain. RedNote slang in particular is largely absent from the training data of general Chinese sentiment models, so fine-tuning on labeled RedNote samples gives a significant accuracy lift when accuracy matters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Image text extraction.&lt;/strong&gt; A non-trivial portion of product mentions on RedNote lives in image text overlays: users frequently show product names in the images rather than in the post body. PaddleOCR is the open-source standard for Chinese OCR. It is slow (~30 seconds per image) but reliable. It adds significant cost to a processing pipeline, but without it you will miss a measurable percentage of product mentions.&lt;/p&gt;

&lt;p&gt;Both of these are downstream of scraping — solve scraping first, then layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is scraping RedNote legal?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Public-data scraping legality varies by jurisdiction. RedNote's ToS prohibits automated access (as do most platform ToS). The Apify approach (and most public-scraping infrastructure) treats public web pages as accessible, the same way Google's crawler would. You should consult legal counsel for your specific use case. Not legal advice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How fast can I scrape RedNote?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Realistic sustained throughput per IP is around 10-20 requests per minute before triggering rate limits. With residential proxy rotation and proper backoff, you can scale this horizontally; Apify's actor handles this internally. For DIY, plan for ~1 result per second per IP as a conservative number, which at that request rate assumes each request returns several results.&lt;/p&gt;
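
&lt;p&gt;For sizing a proxy pool, the per-IP limit turns into simple arithmetic. A sketch assuming the conservative end of the range (10 requests per minute per IP) and perfectly even rotation, which real pools only approximate; the function names are illustrative.&lt;/p&gt;

```python
import math


def ips_needed(target_requests_per_minute: float, per_ip_requests_per_minute: float = 10.0) -> int:
    """Smallest number of IPs that keeps each one within its per-minute budget."""
    return math.ceil(target_requests_per_minute / per_ip_requests_per_minute)


def requests_per_day(ip_count: int, per_ip_requests_per_minute: float = 10.0) -> int:
    """Upper bound on daily requests if every IP runs at its budget around the clock."""
    return int(ip_count * per_ip_requests_per_minute * 60 * 24)


print(ips_needed(200))        # IPs for 200 requests per minute
print(requests_per_day(20))   # daily ceiling for a 20-IP pool
```

&lt;p&gt;Treat the daily figure as a ceiling: backoff pauses and banned IPs will pull the real number down.&lt;/p&gt;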

&lt;p&gt;&lt;strong&gt;Do I need a Chinese IP?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not strictly required, but Asian residential IPs have notably higher success rates than US or European residential ones. Datacenter IPs are blocked outright.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's &lt;code&gt;xsec_token&lt;/code&gt; and why does it matter?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When users share posts via the RedNote app or copy URLs, those URLs include an &lt;code&gt;xsec_token&lt;/code&gt; query parameter that authenticates the link request. Some scrapers don't handle URLs with &lt;code&gt;xsec_token&lt;/code&gt; correctly and return errors. If you're scraping URLs collected from real users, make sure your tooling supports this.&lt;/p&gt;
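
&lt;p&gt;If you collect shared links yourself, it helps to keep the token next to the post ID instead of stripping the query string. A small sketch with the standard library, assuming the post ID is the last path segment (as in the &lt;code&gt;/explore/...&lt;/code&gt; URLs in the sample output); the function name and the token value in the usage line are placeholders.&lt;/p&gt;

```python
from urllib.parse import parse_qs, urlsplit


def parse_share_url(url: str) -> dict:
    """Split a shared post URL into its post ID and xsec_token (if present)."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # Assumption: the post ID is the last path segment of the URL.
    post_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    # parse_qs returns a list per key; take the first value or None.
    token = query.get("xsec_token", [None])[0]
    return {"post_id": post_id, "xsec_token": token}


shared = "https://www.xiaohongshu.com/explore/69d269310000000023017e07?xsec_token=EXAMPLE_TOKEN"
print(parse_share_url(shared))
```

&lt;p&gt;Storing both values lets downstream tooling pass the token through unchanged rather than rebuilding a bare URL that the platform may reject.&lt;/p&gt;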

&lt;p&gt;&lt;strong&gt;Can I scrape video files from RedNote posts?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. Video URLs are returned in the post metadata. Direct download from those URLs works without authentication for public videos.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How often does the request signing change?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Roughly monthly, sometimes more frequently around major RedNote app updates. If you're doing DIY, plan to dedicate 4-8 hours per rotation to update your signing function, or rely on an actively maintained library that pushes updates fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's the difference between RedNote and Xiaohongshu?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;They're the same platform. "Xiaohongshu" (小红书) is the Chinese name and means "Little Red Book". "RedNote" is the English brand they pushed during the 2024-2025 TikTok migration to be more accessible to global users. Same app, same data, same API endpoints.&lt;/p&gt;




&lt;p&gt;If you're working on China market intelligence, brand monitoring, or competitive research and want to compare notes — drop me a comment. I write about Chinese platform scraping (Weibo, Bilibili, RedNote) and the build-vs-buy trade-offs around them.&lt;/p&gt;

&lt;p&gt;The actor I mentioned: &lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/rednote-xiaohongshu-scraper&lt;/a&gt;. Free tier covers ~1,000 results, which is enough to validate against your specific use case before committing.&lt;/p&gt;

</description>
      <category>python</category>
      <category>webscraping</category>
      <category>china</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Monitoring the Chinese Social Media Ecosystem: RedNote, Weibo &amp; Bilibili Data Pipeline</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:56:48 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/monitoring-the-chinese-social-media-ecosystem-rednote-weibo-bilibili-data-pipeline-5dcn</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/monitoring-the-chinese-social-media-ecosystem-rednote-weibo-bilibili-data-pipeline-5dcn</guid>
      <description>&lt;p&gt;If you are doing market research, brand monitoring, competitive intelligence, or academic research on China, no single platform tells the whole story. Chinese internet users spread across specialized platforms the way Western users split between Twitter, Instagram, YouTube, and Reddit — but the Chinese platforms are larger, more fragmented, and harder to access from outside the country.&lt;/p&gt;

&lt;p&gt;This article maps the three most important Chinese social platforms for data collection, explains what each one covers, and shows how to build a unified monitoring pipeline using three Apify Actors that share a common architecture: no browser, no proxy, no API keys.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Three Pillars of Chinese Social Media Intelligence
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. RedNote / Xiaohongshu (小红书) — Social Commerce &amp;amp; Lifestyle
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; China's Instagram meets Pinterest. A social commerce platform where users share product reviews, lifestyle content, travel diaries, and beauty routines. Known for its influence on Chinese consumer purchasing decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User base:&lt;/strong&gt; 200M+ monthly active users&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters for research:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product sentiment and unfiltered consumer reviews&lt;/li&gt;
&lt;li&gt;Trend detection in beauty, fashion, food, travel, and lifestyle&lt;/li&gt;
&lt;li&gt;Influencer (KOL/KOC) discovery for Chinese market campaigns&lt;/li&gt;
&lt;li&gt;Brand perception monitoring among Chinese consumers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Actor:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/rednote-scraper&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Modes:&lt;/strong&gt; Search posts, user profiles, post comments, trending topics&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Weibo (微博) — Microblogging &amp;amp; Public Opinion
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; China's Twitter. It is the platform where Chinese public opinion forms in real time. Breaking news, celebrity drama, government communications, and brand PR all happen on Weibo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User base:&lt;/strong&gt; 580M+ monthly active users&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters for research:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time trending topics with heat scores and categories&lt;/li&gt;
&lt;li&gt;Public opinion tracking on policy, brands, and international events&lt;/li&gt;
&lt;li&gt;Celebrity and KOL influence measurement&lt;/li&gt;
&lt;li&gt;Crisis monitoring and brand sentiment during PR events&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Actor:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Modes:&lt;/strong&gt; Hot search/trending, post comments, keyword search, user posts&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auth note:&lt;/strong&gt; Trending topics and post comments work without login. Search and user posts require a Weibo &lt;code&gt;SUB&lt;/code&gt; cookie (easily obtained from a browser session).&lt;/p&gt;




&lt;h3&gt;
  
  
  3. Bilibili (哔哩哔哩) — Video &amp;amp; Creator Analytics
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; China's YouTube, with a strong focus on anime, gaming, tech education, and Gen Z culture. Known for its unique &lt;strong&gt;danmaku&lt;/strong&gt; (弹幕) system: scrolling comments that overlay the video in real time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User base:&lt;/strong&gt; 300M+ monthly active users&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters for research:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creator analytics with Bilibili-specific metrics (danmaku, coins, favorites)&lt;/li&gt;
&lt;li&gt;Gaming and anime industry monitoring&lt;/li&gt;
&lt;li&gt;Chinese Gen Z content trends and preferences&lt;/li&gt;
&lt;li&gt;Tech and education content analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Actor:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Modes:&lt;/strong&gt; Search videos, video details, video comments, user/creator videos, popular/trending&lt;/p&gt;




&lt;h2&gt;
  
  
  Platform Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;RedNote&lt;/th&gt;
&lt;th&gt;Weibo&lt;/th&gt;
&lt;th&gt;Bilibili&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Platform type&lt;/td&gt;
&lt;td&gt;Social commerce&lt;/td&gt;
&lt;td&gt;Microblogging&lt;/td&gt;
&lt;td&gt;Video&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primary audience&lt;/td&gt;
&lt;td&gt;Women 18-35, consumers&lt;/td&gt;
&lt;td&gt;All demographics&lt;/td&gt;
&lt;td&gt;Gen Z, gamers, students&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content format&lt;/td&gt;
&lt;td&gt;Photos + short text&lt;/td&gt;
&lt;td&gt;Short posts (tweets)&lt;/td&gt;
&lt;td&gt;Long-form video&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Product sentiment, lifestyle trends&lt;/td&gt;
&lt;td&gt;Breaking news, public opinion&lt;/td&gt;
&lt;td&gt;Creator analytics, youth culture&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unique data&lt;/td&gt;
&lt;td&gt;Purchase intent, product reviews&lt;/td&gt;
&lt;td&gt;Hot search rankings, heat scores&lt;/td&gt;
&lt;td&gt;Danmaku counts, coin tipping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth required&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Partial (trending: no, search: yes)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MAU&lt;/td&gt;
&lt;td&gt;200M+&lt;/td&gt;
&lt;td&gt;580M+&lt;/td&gt;
&lt;td&gt;300M+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Actor&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;rednote-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;weibo-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;bilibili-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Building a Unified Pipeline
&lt;/h2&gt;

&lt;p&gt;All three actors share common design principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pure HTTP&lt;/strong&gt; — no browser needed, minimal compute&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;256MB RAM&lt;/strong&gt; — cheap to run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pay-per-event&lt;/strong&gt; — $5 per 1,000 items for all three&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON/CSV/Excel output&lt;/strong&gt; — same export formats&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apify integrations&lt;/strong&gt; — Google Sheets, S3, webhooks, Zapier, Make, n8n&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example: Daily China Brand Monitor
&lt;/h3&gt;

&lt;p&gt;Here is a practical pipeline architecture using all three actors:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Schedule daily runs for each actor via Apify Schedules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RedNote&lt;/strong&gt; — search for your brand name in Chinese:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"你的品牌名"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Weibo&lt;/strong&gt; — monitor trending mentions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hot_search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Bilibili&lt;/strong&gt; — track video mentions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"你的品牌名"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
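&lt;p&gt;As a rough sketch of Step 1, the three inputs above can be driven from a single Python helper. It assumes the official &lt;code&gt;apify-client&lt;/code&gt; package, whose client exposes &lt;code&gt;actor(...).call(run_input=...)&lt;/code&gt; and &lt;code&gt;dataset(...).iterate_items()&lt;/code&gt;. The helper names &lt;code&gt;build_daily_inputs&lt;/code&gt; and &lt;code&gt;run_all&lt;/code&gt; and the added &lt;code&gt;platform&lt;/code&gt; tag are illustrative and not part of any actor's output.&lt;/p&gt;

```python
# Sketch only: helper names and the "platform" tag are illustrative assumptions.
ACTORS = {
    "rednote": "zhorex/rednote-scraper",
    "weibo": "zhorex/weibo-scraper",
    "bilibili": "zhorex/bilibili-scraper",
}


def build_daily_inputs(brand, max_results=50):
    """Return the three run inputs from the article, keyed by platform."""
    search = {"mode": "search", "searchQuery": brand, "maxResults": max_results}
    return {
        "rednote": dict(search),
        "weibo": {"mode": "hot_search", "maxResults": max_results},
        "bilibili": dict(search),
    }


def run_all(client, brand, max_results=50):
    """Run each actor once and collect its items, tagged with the platform.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    records = []
    for platform, run_input in build_daily_inputs(brand, max_results).items():
        run = client.actor(ACTORS[platform]).call(run_input=run_input)
        for item in client.dataset(run["defaultDatasetId"]).iterate_items():
            # Tag every record so the three sources can share one table.
            records.append({**item, "platform": platform})
    return records
```

&lt;p&gt;With a real token this would be called as &lt;code&gt;run_all(ApifyClient("YOUR_API_TOKEN"), "你的品牌名")&lt;/code&gt;. Passing the client in as a parameter keeps the helper easy to test with a stub.&lt;/p&gt;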



&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Use Apify webhooks to push results to your data warehouse (S3 → Snowflake/BigQuery) or directly to Google Sheets for a lightweight dashboard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Set up alerting — use Zapier or Make to send Slack notifications when specific keywords appear in trending topics or when engagement spikes on brand-related content.&lt;/p&gt;
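&lt;p&gt;If you would rather filter in code before handing off to Zapier or Make, the keyword check in Step 3 could look roughly like the sketch below. The field name &lt;code&gt;title&lt;/code&gt; is a placeholder assumption, since the exact output schema of each actor is not shown here; check a real dataset item and adjust &lt;code&gt;text_field&lt;/code&gt; accordingly.&lt;/p&gt;

```python
def find_alerts(items, keywords, text_field="title"):
    """Return (keyword, item) pairs where a keyword occurs in the item's text.

    `text_field` is a hypothetical field name; check the actor's real output.
    Matching is a case-insensitive substring test, which also works for
    Chinese keywords because no word splitting is involved.
    """
    hits = []
    for item in items:
        text = str(item.get(text_field, "")).lower()
        for keyword in keywords:
            if keyword.lower() in text:
                hits.append((keyword, item))
    return hits
```

&lt;p&gt;Each returned pair can then be forwarded to whatever notification channel you use.&lt;/p&gt;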




&lt;h2&gt;
  
  
  Use Case Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Brand Entering the Chinese Market
&lt;/h3&gt;

&lt;p&gt;A Western consumer brand preparing to launch in China runs all three actors to understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RedNote:&lt;/strong&gt; What do Chinese consumers say about the product category? What do they value? Which competitor products are reviewed positively?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weibo:&lt;/strong&gt; Is the brand already discussed in Chinese media? What is the general sentiment?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili:&lt;/strong&gt; Are there video reviews or unboxing content for the brand or competitors?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Academic Research on Chinese Digital Culture
&lt;/h3&gt;

&lt;p&gt;A researcher studying Chinese Gen Z media consumption uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili:&lt;/strong&gt; Trending video content, danmaku engagement patterns, creator growth trajectories&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weibo:&lt;/strong&gt; Public discourse topics, hashtag trends over time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RedNote:&lt;/strong&gt; Lifestyle and consumption trends, product preference signals&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  PR Crisis Monitoring
&lt;/h3&gt;

&lt;p&gt;A multinational company monitors all three platforms for brand mentions after a negative news cycle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Weibo:&lt;/strong&gt; Real-time trending topics and comment sentiment (the fastest-moving platform)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RedNote:&lt;/strong&gt; Consumer reaction and product boycott signals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili:&lt;/strong&gt; Video commentary and creator opinion pieces&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Technical Architecture
&lt;/h2&gt;

&lt;p&gt;All three actors share:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;Pure HTTP (no Playwright/Puppeteer)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory&lt;/td&gt;
&lt;td&gt;256MB RAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Proxy&lt;/td&gt;
&lt;td&gt;Not required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API keys&lt;/td&gt;
&lt;td&gt;Not required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;JSON, CSV, XLSX, JSONL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rate limiting&lt;/td&gt;
&lt;td&gt;Built-in backoff and retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;Content in original Simplified Chinese&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Translation&lt;/td&gt;
&lt;td&gt;Pipe through Google Translate, DeepL, or Claude for English&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;All three actors use the same pricing model:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;th&gt;Cost per actor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000 items&lt;/td&gt;
&lt;td&gt;$5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000 items&lt;/td&gt;
&lt;td&gt;$50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000 items&lt;/td&gt;
&lt;td&gt;$500&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Running all three actors daily with 50 items each = 150 items/day × 30 days = 4,500 items/month ≈ &lt;strong&gt;$22.50/month&lt;/strong&gt; for comprehensive daily China monitoring across three platforms.&lt;/p&gt;
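&lt;p&gt;The same arithmetic can be parameterized to budget other schedules. This is a small sketch that assumes the flat $5 per 1,000 items rate quoted above; the function name and parameters are illustrative.&lt;/p&gt;

```python
def monthly_cost(items_per_run, actors=3, runs_per_day=1, days=30, usd_per_1000=5.0):
    """Estimate monthly spend in USD under flat per-item pricing."""
    total_items = items_per_run * actors * runs_per_day * days
    return total_items * usd_per_1000 / 1000


# 50 items x 3 actors x 30 days = 4,500 items
print(monthly_cost(50))  # prints 22.5
```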

&lt;p&gt;Apify's free plan includes $5 of monthly credits — enough to test all three actors.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Do I need a VPN to scrape Chinese platforms?&lt;/strong&gt;&lt;br&gt;
No. All three actors use public HTTP endpoints that are globally accessible. No VPN or proxy is required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is the data in Chinese?&lt;/strong&gt;&lt;br&gt;
Yes. All content is returned in the original Simplified Chinese. For English translations, pipe the output through a translation service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I combine data from all three actors?&lt;/strong&gt;&lt;br&gt;
Yes. All three output JSON with similar schemas. Use Apify's dataset API or export to a shared warehouse for unified analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is scraping Chinese social media legal?&lt;/strong&gt;&lt;br&gt;
These actors only access publicly available data through public web endpoints. No authentication is bypassed and no private data is accessed. Always review your local laws and each platform's terms of service.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RedNote:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/rednote-scraper&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weibo:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/weibo-scraper&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/bilibili-scraper&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Start with one platform, validate the data quality, then expand to all three for comprehensive Chinese social media intelligence.&lt;/p&gt;

&lt;p&gt;Built by &lt;a href="https://apify.com/zhorex" rel="noopener noreferrer"&gt;Zhorex&lt;/a&gt;, an Apify developer specializing in Chinese platform intelligence.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>china</category>
      <category>api</category>
      <category>marketing</category>
    </item>
    <item>
      <title>Automating Perplexity AI Searches for Content Research and Brand Monitoring</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:55:14 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/automating-perplexity-ai-searches-for-content-research-and-brand-monitoring-27o5</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/automating-perplexity-ai-searches-for-content-research-and-brand-monitoring-27o5</guid>
      <description>&lt;p&gt;Perplexity AI has become one of the most popular AI-powered search engines, giving users synthesized answers with cited sources instead of a list of blue links. For marketers, content strategists, and SEO professionals, a new discipline has emerged: &lt;strong&gt;Answer Engine Optimization (AEO)&lt;/strong&gt; — the practice of getting your brand mentioned and cited in AI-generated search results.&lt;/p&gt;

&lt;p&gt;The challenge is that monitoring how Perplexity answers queries about your brand, competitors, or industry requires manually searching one query at a time. If you want to track visibility across dozens or hundreds of queries on a regular schedule, you need automation.&lt;/p&gt;

&lt;p&gt;This article walks through how to automate Perplexity AI searches at scale using the &lt;a href="https://apify.com/zhorex/perplexity-ai-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/perplexity-ai-scraper&lt;/code&gt;&lt;/a&gt; Actor on Apify — no Perplexity API key required.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What:&lt;/strong&gt; Extract AI-generated answers, cited sources, and related questions from Perplexity AI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/perplexity-ai-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/perplexity-ai-scraper&lt;/code&gt;&lt;/a&gt; on Apify&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; $0.02 per query ($20 per 1,000 queries)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No API key:&lt;/strong&gt; Scrapes the public web interface directly via headless browser&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two modes:&lt;/strong&gt; Full search results, or brand mention monitoring&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why Automate Perplexity Searches?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Answer Engine Optimization (AEO)
&lt;/h3&gt;

&lt;p&gt;As more users shift from traditional search results to AI answer engines such as Perplexity, ChatGPT, and Google AI Overviews, monitoring your visibility in AI-generated answers is becoming critical. AEO comes down to four questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Is your brand mentioned&lt;/strong&gt; when users ask about your product category?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;At what position&lt;/strong&gt; does your brand appear in the AI answer?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Which competitors&lt;/strong&gt; are mentioned alongside you?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What sources&lt;/strong&gt; does Perplexity cite — and is your content among them?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Content Strategy
&lt;/h3&gt;

&lt;p&gt;Perplexity curates answers from multiple sources and cites them. By analyzing which URLs get cited for your target keywords, you can identify content gaps: if your competitors are cited but you are not, that signals where to invest in content.&lt;/p&gt;

&lt;h3&gt;
  
  
  Competitive Intelligence
&lt;/h3&gt;

&lt;p&gt;Track how Perplexity positions your competitors for your target keywords. Which brands does it recommend? How does it describe them? This gives you a direct view into how AI search engines perceive your market.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two Modes of Operation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Mode 1: Search — Extract Full AI Answers
&lt;/h3&gt;

&lt;p&gt;Submit any query and get the full AI-generated answer with all cited sources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"queries"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"best CRM software for small business 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"top project management tools for startups"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"best CRM software for small business 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"answer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"For 2026, the best CRM overall is..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sources"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"position"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Best CRM Software for 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.pcmag.com/picks/the-best-crm-software"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"domain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pcmag.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"snippet"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Customers are vital to any business..."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"relatedQuestions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Zoho CRM vs Salesforce which is better for small businesses"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Best free or low-cost CRM alternatives for startups 2026"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"totalSources"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"answerLength"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1542&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.perplexity.ai/search?q=..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-09T15:00:00+00:00"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Mode 2: Brand Monitor — Track Brand Mentions in AI Answers
&lt;/h3&gt;

&lt;p&gt;Automatically detect whether your brand appears in Perplexity's answer for a set of queries, and see which competitors are mentioned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"brand_monitor"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"brandName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HubSpot"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"queries"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"best CRM software 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"top email marketing platforms for ecommerce"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"brand"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HubSpot"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"best CRM software 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mentioned"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mentionContext"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HubSpot is the easiest all-around option for teams that want a simple start..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"position"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Mentioned early in the answer (top section)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"competitorsMentioned"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Zoho CRM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Salesforce"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Pipedrive"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sourcesCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-09T15:00:00+00:00"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
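&lt;p&gt;Once a run has produced one such record per query, a few lines of Python are enough to turn them into a visibility summary. The sketch below relies only on the &lt;code&gt;query&lt;/code&gt;, &lt;code&gt;mentioned&lt;/code&gt;, and &lt;code&gt;competitorsMentioned&lt;/code&gt; fields shown above; the function name and the keys of the returned summary are illustrative.&lt;/p&gt;

```python
from collections import Counter


def summarize_brand_monitor(items):
    """Compute mention rate, missed queries, and competitor counts."""
    total = len(items)
    mentioned = sum(1 for item in items if item.get("mentioned"))
    competitors = Counter()
    for item in items:
        competitors.update(item.get("competitorsMentioned") or [])
    return {
        "queries": total,
        "mentionRate": mentioned / total if total else 0.0,
        "missedQueries": [i.get("query") for i in items if not i.get("mentioned")],
        "topCompetitors": competitors.most_common(3),
    }
```

&lt;p&gt;Storing this summary per run gives you a simple time series for tracking visibility.&lt;/p&gt;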






&lt;h2&gt;
  
  
  Practical Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Weekly AEO Dashboard
&lt;/h3&gt;

&lt;p&gt;Schedule the actor to run weekly with your core keyword list (e.g., "best [your category] software 2026"). Track over time whether your brand is being recommended, and whether your visibility is improving after content investments.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Competitor Monitoring
&lt;/h3&gt;

&lt;p&gt;Run &lt;code&gt;brand_monitor&lt;/code&gt; mode with your top competitor names. The queries they appear for and you do not are your opportunities.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Content Gap Analysis
&lt;/h3&gt;

&lt;p&gt;For your target queries, look at which sources Perplexity cites. If your content is not cited, study what the cited sources cover that you do not. Use this to prioritize content creation.&lt;/p&gt;
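&lt;p&gt;One possible way to do this in code is to count cited domains across search-mode results and list the queries where your own domain never appears. The sketch assumes items shaped like the search output shown earlier (a &lt;code&gt;query&lt;/code&gt; plus a &lt;code&gt;sources&lt;/code&gt; list with a &lt;code&gt;domain&lt;/code&gt; on each entry); the function names are illustrative.&lt;/p&gt;

```python
from collections import Counter


def cited_domains(search_items):
    """Count how many queries cite each domain at least once."""
    counts = Counter()
    for item in search_items:
        # A set ensures a domain cited twice for one query is counted once.
        domains = {src.get("domain") for src in item.get("sources", []) if src.get("domain")}
        counts.update(domains)
    return counts


def uncited_queries(search_items, own_domain):
    """Return queries whose cited sources never include own_domain."""
    return [
        item.get("query")
        for item in search_items
        if own_domain not in {src.get("domain") for src in item.get("sources", [])}
    ]
```

&lt;p&gt;The domains with the highest counts are the ones to study first, and the uncited queries form the backlog for new content.&lt;/p&gt;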

&lt;h3&gt;
  
  
  4. Market Research at Scale
&lt;/h3&gt;

&lt;p&gt;Need AI-curated answers to dozens of research questions? Use search mode with a list of queries and get structured, sourced answers in bulk — with full attribution for verification.&lt;/p&gt;




&lt;h2&gt;
  
  
  Python and JavaScript Integration
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/perplexity-ai-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;brand_monitor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;brandName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YourBrand&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;best project management tools 2026&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mentioned&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;competitorsMentioned&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ApifyClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apify-client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zhorex/perplexity-ai-scraper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;queries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;best CRM software 2026&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;listItems&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How It Works Under the Hood
&lt;/h2&gt;

&lt;p&gt;The actor opens Perplexity.ai in a headless Chromium browser (Playwright), navigates to the search URL for each query, waits for the AI to finish generating its streamed answer, then extracts the full answer text, cited sources with URLs, and related questions. Results are pushed to a structured Apify dataset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~15-30 seconds per query (AI answer generation takes time)&lt;/li&gt;
&lt;li&gt;1024 MB RAM recommended (Playwright + Chromium)&lt;/li&gt;
&lt;li&gt;Sequential queries with 5s delay to avoid rate limiting&lt;/li&gt;
&lt;/ul&gt;
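
&lt;p&gt;As a rough planning aid, the figures above can be turned into a run-time estimate. This is a minimal sketch, assuming the 5-second delay applies only between consecutive queries; &lt;code&gt;estimate_run_seconds&lt;/code&gt; is a hypothetical helper, not part of the actor.&lt;/p&gt;

```python
def estimate_run_seconds(num_queries, per_query_range=(15, 30), delay_seconds=5):
    """Return (best_case, worst_case) wall-clock seconds for a sequential run."""
    # Assumption: one delay between each pair of consecutive queries.
    total_delay = max(num_queries - 1, 0) * delay_seconds
    fastest, slowest = per_query_range
    return (num_queries * fastest + total_delay, num_queries * slowest + total_delay)


low, high = estimate_run_seconds(20)
print(f"20 queries: roughly {low / 60:.1f} to {high / 60:.1f} minutes")
```

&lt;p&gt;Actual run times also depend on actor start-up and on how long Perplexity takes to stream each answer, so treat the result as a lower bound for scheduling.&lt;/p&gt;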




&lt;h2&gt;
  
  
  Perplexity API vs. This Actor
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Perplexity Sonar API&lt;/th&gt;
&lt;th&gt;This Actor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API key required&lt;/td&gt;
&lt;td&gt;Yes (paid subscription)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Sonar Pro pricing&lt;/td&gt;
&lt;td&gt;$0.02 per query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Fast (direct API)&lt;/td&gt;
&lt;td&gt;~15-30s per query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sources/citations&lt;/td&gt;
&lt;td&gt;Included&lt;/td&gt;
&lt;td&gt;Included&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Brand monitoring&lt;/td&gt;
&lt;td&gt;Build it yourself&lt;/td&gt;
&lt;td&gt;Built-in mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Related questions&lt;/td&gt;
&lt;td&gt;Not included&lt;/td&gt;
&lt;td&gt;Included&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The trade-off is speed: the API is faster, but this actor requires no API key or subscription.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;$0.02 per query ($20 per 1,000 queries). You can test with Apify's free tier, which includes $5 of monthly usage, enough for 250 queries.&lt;/p&gt;
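
&lt;p&gt;To budget a recurring monitoring job, the arithmetic can be sketched as below. It assumes the per-query price is the only charge and that the free credit is spent on this actor alone; the constant and function names are illustrative.&lt;/p&gt;

```python
PRICE_CENTS_PER_QUERY = 2   # $0.02 per query, from the pricing above
FREE_CREDIT_CENTS = 500     # $5 of monthly free-tier usage


def monthly_cost_cents(queries_per_day, days=30):
    """Cost in cents of running a fixed keyword list once a day."""
    return queries_per_day * days * PRICE_CENTS_PER_QUERY


def free_tier_queries():
    """Number of queries the free credit covers."""
    return FREE_CREDIT_CENTS // PRICE_CENTS_PER_QUERY


cost = monthly_cost_cents(queries_per_day=10)
print(f"10 queries/day for 30 days: ${cost / 100:.2f}")
print(f"Free tier covers {free_tier_queries()} queries")
```

&lt;p&gt;Working in whole cents avoids floating-point rounding when the numbers get large.&lt;/p&gt;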




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How is this different from the Perplexity API?&lt;/strong&gt;&lt;br&gt;
The official Perplexity API (Sonar) gives programmatic access but requires a paid API key. This Actor scrapes the free public web interface instead, so you get the answers and cited sources a visitor would see, with no API key required. The trade-off is speed (~15-30s per query).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is AEO?&lt;/strong&gt;&lt;br&gt;
Answer Engine Optimization is the practice of getting your brand mentioned in AI search results. As more users shift to AI search engines, monitoring your visibility in AI-generated answers is becoming as important as traditional SEO.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I schedule regular monitoring runs?&lt;/strong&gt;&lt;br&gt;
Yes. Use Apify Schedules to run the actor daily, weekly, or at any interval. Combine with webhooks to send Slack or email alerts when your brand visibility changes.&lt;/p&gt;
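
&lt;p&gt;One way to detect a visibility change between two scheduled runs is to compare the &lt;code&gt;mentioned&lt;/code&gt; field per query. The sketch below assumes each dataset item also carries a &lt;code&gt;query&lt;/code&gt; key, which is an assumption about the output schema; &lt;code&gt;visibility_changes&lt;/code&gt; is a hypothetical helper, and sending the alert is left out.&lt;/p&gt;

```python
def visibility_changes(previous_items, current_items):
    """Compare two runs and report queries whose 'mentioned' flag flipped."""
    # Assumption: each item has a 'query' key identifying the search prompt.
    before = {item["query"]: bool(item.get("mentioned")) for item in previous_items}
    changes = []
    for item in current_items:
        query = item["query"]
        now = bool(item.get("mentioned"))
        if query in before and before[query] != now:
            changes.append({"query": query, "was": before[query], "now": now})
    return changes


yesterday = [{"query": "best CRM software 2026", "mentioned": True}]
today = [{"query": "best CRM software 2026", "mentioned": False}]
for change in visibility_changes(yesterday, today):
    print(change)
```

&lt;p&gt;Because answers are non-deterministic, a single flip may be noise; requiring the change to persist across several runs before alerting is a reasonable refinement.&lt;/p&gt;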

&lt;p&gt;&lt;strong&gt;Are the answers always the same?&lt;/strong&gt;&lt;br&gt;
No. AI answers are non-deterministic — the same query can produce slightly different answers on different runs. This is a characteristic of AI search, not a limitation of the actor.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/perplexity-ai-scraper" rel="noopener noreferrer"&gt;https://apify.com/zhorex/perplexity-ai-scraper&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Try a single query in search mode to see the output quality, then build out your keyword list for systematic monitoring.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>seo</category>
      <category>marketing</category>
      <category>python</category>
    </item>
    <item>
      <title>Scraping Bilibili Videos and Creators for Market Research in 2026</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:53:41 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/scraping-bilibili-videos-and-creators-for-market-research-in-2026-4fpg</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/scraping-bilibili-videos-and-creators-for-market-research-in-2026-4fpg</guid>
      <description>&lt;p&gt;Bilibili (哔哩哔哩) is China's premier video platform with 300M+ monthly active users. Think YouTube, but with a younger demographic (Gen Z and millennials), a thriving anime and gaming community, and a unique feature called &lt;strong&gt;danmaku&lt;/strong&gt; (弹幕) — real-time scrolling comments that overlay the video as it plays.&lt;/p&gt;

&lt;p&gt;For anyone doing market research on Chinese youth culture, gaming audiences, tech content, or creator economics, Bilibili is one of the richest data sources available. The problem is that there is no official public Bilibili API for international developers. Bilibili's internal APIs are undocumented, require Chinese phone verification, and change frequently.&lt;/p&gt;

&lt;p&gt;This article shows how to extract Bilibili videos, comments, creator profiles, and trending content using the &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt; Actor on Apify — no API key, no browser, no proxy required.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What:&lt;/strong&gt; Extract videos, comments, creator profiles, and trending content from Bilibili&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt; on Apify — pure HTTP, 256MB RAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; $5 per 1,000 items scraped&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth:&lt;/strong&gt; None required — all endpoints are public&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unique data:&lt;/strong&gt; Danmaku counts, coin counts (投币), favorites, plus standard engagement metrics&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Makes Bilibili Data Unique
&lt;/h2&gt;

&lt;p&gt;Unlike YouTube or TikTok, Bilibili has platform-specific metrics that this actor captures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Danmaku count (弹幕)&lt;/strong&gt; — live scrolling comments overlaid on the video. High danmaku signals active community engagement, not just passive viewing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coin count (投币)&lt;/strong&gt; — Bilibili's tipping system where users "throw coins" at creators. A direct signal of audience appreciation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Favorite count (收藏)&lt;/strong&gt; — equivalent to "save" on other platforms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard metrics:&lt;/strong&gt; views, likes, shares, replies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These metrics together give a much richer picture of content performance than views alone.&lt;/p&gt;




&lt;h2&gt;
  
  
  Five Scraping Modes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Search Videos
&lt;/h3&gt;

&lt;p&gt;Search by keyword (Chinese or English). Supports sort and filter options.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"人工智能教程"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sortOrder"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pubdate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sort options: &lt;code&gt;totalrank&lt;/code&gt; (relevance), &lt;code&gt;click&lt;/code&gt; (most views), &lt;code&gt;pubdate&lt;/code&gt; (newest), &lt;code&gt;dm&lt;/code&gt; (most danmaku), &lt;code&gt;stow&lt;/code&gt; (most favorites), &lt;code&gt;scores&lt;/code&gt; (most comments).&lt;/p&gt;

&lt;p&gt;Duration filters: &lt;code&gt;short&lt;/code&gt; (&amp;lt;10min), &lt;code&gt;medium&lt;/code&gt; (10-30min), &lt;code&gt;long&lt;/code&gt; (30-60min), &lt;code&gt;verylong&lt;/code&gt; (&amp;gt;60min).&lt;/p&gt;
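
&lt;p&gt;The same buckets can be applied locally to items you have already scraped, using a video length in seconds. This is a minimal sketch; how Bilibili treats a video that sits exactly on a boundary is not stated here, so the edge handling is an assumption, and &lt;code&gt;duration_bucket&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
from bisect import bisect_right

# Upper edges in seconds for short, medium and long; anything beyond is verylong.
_EDGES = [10 * 60, 30 * 60, 60 * 60]
_LABELS = ["short", "medium", "long", "verylong"]


def duration_bucket(seconds):
    """Map a video length in seconds to one of the filter labels."""
    # Assumption: a video exactly on an edge falls into the longer bucket.
    return _LABELS[bisect_right(_EDGES, seconds)]


print(duration_bucket(167))   # short
print(duration_bucket(2000))  # long
```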

&lt;h3&gt;
  
  
  2. Video Details
&lt;/h3&gt;

&lt;p&gt;Full video info with all engagement metrics and tags.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"video_detail"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"videoUrls"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.bilibili.com/video/BV1GJ411x7h7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"BV1xx411c7mD"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Video Comments
&lt;/h3&gt;

&lt;p&gt;Extract comments with author info and likes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"video_comments"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"videoUrls"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"https://www.bilibili.com/video/BV1GJ411x7h7"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxComments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sortComments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hot"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sort options: &lt;code&gt;hot&lt;/code&gt; (top/most-liked), &lt;code&gt;time&lt;/code&gt; (newest), &lt;code&gt;likes&lt;/code&gt; (by like count).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Bilibili throttles comment pagination from datacenter IPs, returning only top/pinned comments. Full comment pagination requires residential IPs or authenticated sessions. Other modes are unaffected.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Creator/User Videos
&lt;/h3&gt;

&lt;p&gt;Get user profile info plus their recent videos.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user_videos"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"userIds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"546195"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1340190821"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Find a user's &lt;code&gt;mid&lt;/code&gt; in their profile URL: &lt;code&gt;space.bilibili.com/{mid}&lt;/code&gt;. Multiple users are processed in parallel (up to 3 concurrent).&lt;/p&gt;
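
&lt;p&gt;If you collect creators as profile links, the &lt;code&gt;mid&lt;/code&gt; can be pulled out before building the &lt;code&gt;userIds&lt;/code&gt; list. A minimal sketch, assuming URLs follow the &lt;code&gt;space.bilibili.com/{mid}&lt;/code&gt; pattern described above; &lt;code&gt;extract_mid&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
import re

# Matches the numeric id in URLs such as https://space.bilibili.com/546195
_MID_PATTERN = re.compile(r"space\.bilibili\.com/(\d+)")


def extract_mid(profile_url_or_mid):
    """Return the numeric mid as a string, or None if none is found."""
    text = profile_url_or_mid.strip()
    if text.isdigit():
        return text  # already a bare mid
    match = _MID_PATTERN.search(text)
    return match.group(1) if match else None


user_ids = [extract_mid(u) for u in ["https://space.bilibili.com/546195", "1340190821"]]
print(user_ids)  # ['546195', '1340190821']
```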

&lt;h3&gt;
  
  
  5. Popular/Trending Videos
&lt;/h3&gt;

&lt;p&gt;Trending videos, filterable by Bilibili category.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"popular"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"game"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



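&lt;p&gt;Available categories: Animation, Music, Dance, Gaming, Knowledge, Tech, Sports, Cars, Life, Food, Animal, Fashion, Entertainment. Note that the example above passes the lowercase value &lt;code&gt;game&lt;/code&gt; rather than the display name, so check the actor's input schema for the exact identifier of each category.&lt;/p&gt;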




&lt;h2&gt;
  
  
  Output Example
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Video:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"video"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"bvid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BV1YXDfBUETP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Example Video Title"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.bilibili.com/video/BV1YXDfBUETP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;167&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"durationFormatted"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2:47"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"viewCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1570113&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"likeCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;182455&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"coinCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;110535&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"favoriteCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;63471&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"shareCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;45918&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"danmakuCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7466&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"replyCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;17276&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"authorName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Creator Name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"authorMid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1340190821&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"publishDate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-08T12:00:00+00:00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Gaming"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"anime"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"review"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-10T10:00:00+00:00"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;User Profile:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;546195&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"老番茄"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fans"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20189060&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"archiveCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;652&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"verified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"profileUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://space.bilibili.com/546195"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-10T10:00:00+00:00"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Use Cases for Market Research
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gaming/anime brand monitoring&lt;/strong&gt; — Track game launches and anime reactions on China's largest anime community&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content trend analysis&lt;/strong&gt; — Identify trending topics in Chinese youth culture and Gen Z interests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creator evaluation&lt;/strong&gt; — Analyze Bilibili KOLs (Key Opinion Leaders) for partnerships and sponsorships using follower counts, engagement ratios, and content frequency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ad placement research&lt;/strong&gt; — Understand which categories and content types perform best by danmaku density, coin rates, and view-to-engagement ratios&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Academic research&lt;/strong&gt; — Study Chinese digital culture, danmaku behavior, and content consumption patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product launch monitoring&lt;/strong&gt; — Track brand mentions and competitor content in China&lt;/li&gt;
&lt;/ul&gt;
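
&lt;p&gt;Several of these use cases come down to per-view rates such as coin rate or danmaku density. The sketch below computes them from one video item, using the field names and sample values from the output example; &lt;code&gt;engagement_ratios&lt;/code&gt; is a hypothetical helper, and which ratios matter is a judgment call.&lt;/p&gt;

```python
def engagement_ratios(video):
    """Per-view rates for the Bilibili-specific counters of one video item."""
    views = video.get("viewCount") or 0
    if views == 0:
        return {}  # avoid dividing by zero for unviewed or malformed items
    counters = ["likeCount", "coinCount", "favoriteCount", "danmakuCount"]
    return {name: video.get(name, 0) / views for name in counters}


sample = {
    "viewCount": 1570113,
    "likeCount": 182455,
    "coinCount": 110535,
    "favoriteCount": 63471,
    "danmakuCount": 7466,
}
for name, rate in engagement_ratios(sample).items():
    print(f"{name}: {rate:.2%}")
```

&lt;p&gt;Comparing these rates across creators or categories is usually more informative than comparing raw counts, since view totals vary by orders of magnitude.&lt;/p&gt;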




&lt;h2&gt;
  
  
  Python and JavaScript Integration
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/bilibili-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;人工智能教程&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;viewCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;danmakuCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ApifyClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apify-client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zhorex/bilibili-scraper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;searchQuery&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;人工智能教程&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxResults&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;listItems&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;viewCount&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Technical Details
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No browser&lt;/strong&gt; — pure HTTP requests to Bilibili's public APIs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No proxy&lt;/strong&gt; — Bilibili is accessible globally (some licensed content may be geo-restricted)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No API key&lt;/strong&gt; — all endpoints are public&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;256MB RAM&lt;/strong&gt; — lightweight and efficient&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrent fetching&lt;/strong&gt; — video_detail: up to 5 in parallel; user_videos and video_comments: up to 3 in parallel&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;$0.005 per item scraped ($5 per 1,000 results). Start with Apify's free plan, which includes $5 of monthly credits.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part of the Chinese Digital Intelligence Suite
&lt;/h2&gt;

&lt;p&gt;Bilibili covers video and creator analytics. For full China market coverage, combine with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; — Weibo microblogging, trending topics, public opinion (580M+ MAU)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/rednote-scraper&lt;/code&gt;&lt;/a&gt; — RedNote/Xiaohongshu social commerce, lifestyle content (200M+ MAU)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All actors: no browser, no proxy, no API keys. Built by &lt;a href="https://apify.com/zhorex" rel="noopener noreferrer"&gt;Zhorex&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;https://apify.com/zhorex/bilibili-scraper&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Try the &lt;code&gt;popular&lt;/code&gt; mode first (no auth needed) to see what is trending on Bilibili right now.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>api</category>
      <category>python</category>
      <category>analytics</category>
    </item>
    <item>
      <title>How to Scrape Weibo Without Login in 2026: The Complete Guide</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:51:50 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/how-to-scrape-weibo-without-login-in-2026-the-complete-guide-4ge2</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/how-to-scrape-weibo-without-login-in-2026-the-complete-guide-4ge2</guid>
      <description>&lt;p&gt;Weibo (微博) is China's dominant microblogging platform — think Twitter meets Instagram, with 580M+ monthly active users. It is where Chinese public opinion forms, brands communicate, celebrities post, and news breaks. Government officials, industry leaders, and major brands all maintain active Weibo accounts.&lt;/p&gt;

&lt;p&gt;For anyone doing China market research, PR monitoring, influencer analysis, or geopolitical tracking, Weibo data is essential. The problem is that there is no official public Weibo API available for international developers. Weibo's developer platform requires a Chinese business license, imposes strict rate limits, and returns limited data.&lt;/p&gt;

&lt;p&gt;This guide walks through how to extract Weibo posts, trending topics, comments, and creator profiles using the &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; Actor on Apify — no API key, no browser, no VPN required.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What:&lt;/strong&gt; Extract posts, trending topics, comments, and user profiles from Weibo&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; on Apify — pure HTTP, no browser needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; $5 per 1,000 items scraped&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth:&lt;/strong&gt; Trending topics, post comments, and profile info work without any login. Full keyword search and user posts require a Weibo &lt;code&gt;SUB&lt;/code&gt; cookie; without one, search falls back to hot timeline posts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No VPN needed:&lt;/strong&gt; All endpoints are globally accessible&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why Weibo Data Matters
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Who&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PR &amp;amp; Communications&lt;/td&gt;
&lt;td&gt;Track brand mentions in real-time on China's public square&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Market Research&lt;/td&gt;
&lt;td&gt;Monitor what is trending among Chinese consumers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Influencer Marketing&lt;/td&gt;
&lt;td&gt;Find and evaluate KOLs by followers, engagement, verification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Competitive Intelligence&lt;/td&gt;
&lt;td&gt;Track Chinese competitor announcements and campaigns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Geopolitical Analysis&lt;/td&gt;
&lt;td&gt;Monitor public discourse on policy and international topics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Journalism&lt;/td&gt;
&lt;td&gt;Access Chinese public opinion data for reporting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Academic Research&lt;/td&gt;
&lt;td&gt;Study Chinese social media behavior and trends&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Four Scraping Modes
&lt;/h2&gt;

&lt;p&gt;The actor supports four distinct modes, each targeting a different type of Weibo data:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Hot Search / Trending Topics (no login needed)
&lt;/h3&gt;

&lt;p&gt;Get the real-time pulse of the Chinese internet. Returns trending topics with heat scores and categories.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hot_search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rank"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"人工智能最新突破"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"科技"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hotValue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2847562&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"labelName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"热"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isHot"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://s.weibo.com/weibo?q=..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-10T12:00:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
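
&lt;p&gt;As a rough sketch of what you might do with items shaped like the output above, the snippet below ranks topics by &lt;code&gt;hotValue&lt;/code&gt; and groups their titles by &lt;code&gt;category&lt;/code&gt;. The sample items and the &lt;code&gt;group_by_category&lt;/code&gt; helper are made up for illustration; only the field names come from the output above.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical sample items; only the field names mirror the output above.
sample_items = [
    {"rank": 1, "title": "人工智能最新突破", "category": "科技", "hotValue": 2847562},
    {"rank": 2, "title": "示例话题A", "category": "娱乐", "hotValue": 1500000},
    {"rank": 3, "title": "示例话题B", "category": "科技", "hotValue": 900000},
]


def group_by_category(items):
    """Group topic titles by category, hottest first within each group."""
    grouped = defaultdict(list)
    # Sort once by heat so every category list comes out in descending order
    for item in sorted(items, key=lambda i: i.get("hotValue", 0), reverse=True):
        grouped[item.get("category") or "unknown"].append(item["title"])
    return dict(grouped)


print(group_by_category(sample_items))
```

&lt;p&gt;In a real pipeline the list would come from the dataset of a &lt;code&gt;hot_search&lt;/code&gt; run rather than from hard-coded samples.&lt;/p&gt;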



&lt;h3&gt;
  
  
  2. Post Comments (no login needed)
&lt;/h3&gt;

&lt;p&gt;Extract comments from specific posts. Provide post IDs (mid) or detail URLs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"post_comments"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"postIds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"5285773987283226"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxComments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Search Posts (login optional)
&lt;/h3&gt;

&lt;p&gt;Search by keyword. Without cookies, returns hot timeline posts as a fallback. With cookies, searches the full index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"人工智能"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cookieString"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SUB=your_sub_cookie_value"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. User Posts (login needed for posts)
&lt;/h3&gt;

&lt;p&gt;Get profile info (always works without login) plus posts (requires cookies). Provide numeric user IDs or profile URLs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user_posts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"userIds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"1642634100"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cookieString"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SUB=your_sub_cookie_value"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How to Get Cookies
&lt;/h2&gt;

&lt;p&gt;For full keyword search and user posts, you need a Weibo &lt;code&gt;SUB&lt;/code&gt; cookie:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open &lt;strong&gt;weibo.com&lt;/strong&gt; in your browser and log in&lt;/li&gt;
&lt;li&gt;Open DevTools (F12) → Application → Cookies → weibo.com&lt;/li&gt;
&lt;li&gt;Copy the value of the &lt;strong&gt;SUB&lt;/strong&gt; cookie&lt;/li&gt;
&lt;li&gt;Paste it in the &lt;code&gt;cookieString&lt;/code&gt; field as: &lt;code&gt;SUB=your_value_here&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The cookie typically lasts several days before expiring.&lt;/p&gt;
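
&lt;p&gt;A minimal sketch of preparing the &lt;code&gt;cookieString&lt;/code&gt; value in code, assuming you have copied the cookie from DevTools as described above. The &lt;code&gt;build_cookie_string&lt;/code&gt; helper is hypothetical and not part of the actor; it only normalizes whitespace and adds the &lt;code&gt;SUB=&lt;/code&gt; prefix if it is missing.&lt;/p&gt;

```python
def build_cookie_string(raw_value):
    """Return the value in the SUB=... form the cookieString field expects."""
    value = raw_value.strip()
    if not value:
        raise ValueError("SUB cookie value is empty")
    # Accept either the bare value or one already prefixed with SUB=
    if value.startswith("SUB="):
        return value
    return "SUB=" + value


# Input for the search mode, using the field names from the examples above
run_input = {
    "mode": "search",
    "searchQuery": "人工智能",
    "maxResults": 50,
    "cookieString": build_cookie_string("  your_sub_cookie_value "),
}
print(run_input["cookieString"])
```

&lt;p&gt;Keeping the cookie in an environment variable or secret store, rather than in source control, is a sensible default since it grants access to your Weibo session.&lt;/p&gt;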




&lt;h2&gt;
  
  
  Python and JavaScript Examples
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hot_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ApifyClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apify-client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hot_search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxResults&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;listItems&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Technical Details
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No browser needed&lt;/strong&gt; — pure HTTP requests using httpx, runs in 256MB RAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No VPN needed&lt;/strong&gt; — Weibo endpoints are globally accessible&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic session&lt;/strong&gt; — visitor cookies obtained automatically via the Sina Visitor System&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate-limit handling&lt;/strong&gt; — exponential backoff on 418/429 errors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chinese text preserved&lt;/strong&gt; — all content returned as-is in original Simplified Chinese&lt;/li&gt;
&lt;/ul&gt;
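
&lt;p&gt;To illustrate the rate-limit point, here is a generic sketch of exponential backoff on 418/429 responses. It is not the actor's actual implementation: the function names, retry count, base delay, and jitter are all assumptions, and &lt;code&gt;send&lt;/code&gt; stands in for whatever HTTP call you make.&lt;/p&gt;

```python
import random
import time

RETRY_STATUSES = (418, 429)  # status codes named in the list above


def backoff_delays(max_retries=5, base=1.0, cap=60.0):
    """Yield capped exponential delays: base, 2*base, 4*base, ..."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def fetch_with_backoff(send, max_retries=5, base=1.0, sleep=time.sleep):
    """Call send() until it returns a status outside RETRY_STATUSES.

    send is any zero-argument callable returning (status_code, body).
    Gives up after max_retries retries and returns the last response.
    """
    status, body = send()
    for delay in backoff_delays(max_retries, base):
        if status not in RETRY_STATUSES:
            break
        # Small random jitter so parallel workers do not retry in lockstep
        sleep(delay + random.uniform(0, delay * 0.1))
        status, body = send()
    return status, body


# Demo with a fake sender: two rate-limited responses, then success
responses = iter([(418, ""), (429, ""), (200, "ok")])
result = fetch_with_backoff(lambda: next(responses), sleep=lambda s: None)
print(result)
```

&lt;p&gt;The &lt;code&gt;sleep&lt;/code&gt; parameter is injectable so the retry logic can be exercised in tests without actually waiting.&lt;/p&gt;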




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000 items&lt;/td&gt;
&lt;td&gt;$5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000 items&lt;/td&gt;
&lt;td&gt;$50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000 items&lt;/td&gt;
&lt;td&gt;$500&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each scraped item (post, comment, trending topic, or profile) counts as one result. You can start with Apify's free plan, which includes $5 of monthly credits — enough for 1,000 data points.&lt;/p&gt;
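
&lt;p&gt;The arithmetic behind the table can be sketched as follows. The rate and the free credit are the figures quoted above; the assumption that the whole free credit goes to this one actor is mine, and &lt;code&gt;monthly_cost&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
PRICE_PER_1000_USD = 5       # $5 per 1,000 items, from the table above
FREE_MONTHLY_CREDIT_USD = 5  # Apify free plan credit mentioned above


def monthly_cost(items, free_credit=FREE_MONTHLY_CREDIT_USD):
    """Estimated out-of-pocket cost in USD after the free credit is applied."""
    gross = items * PRICE_PER_1000_USD / 1000
    # Assumes the entire credit is spent on this actor and nothing else
    return round(max(0.0, gross - free_credit), 2)


for volume in (1_000, 10_000, 100_000):
    print(volume, monthly_cost(volume))
```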




&lt;h2&gt;
  
  
  Part of the Chinese Digital Intelligence Suite
&lt;/h2&gt;

&lt;p&gt;Weibo covers microblogging and public opinion, but for comprehensive China market intelligence you need the full picture:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Users&lt;/th&gt;
&lt;th&gt;What it covers&lt;/th&gt;
&lt;th&gt;Actor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Weibo&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;580M+ MAU&lt;/td&gt;
&lt;td&gt;Microblogging, trending, celebrity content&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RedNote (Xiaohongshu)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;200M+ MAU&lt;/td&gt;
&lt;td&gt;Social commerce, lifestyle, product reviews&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/rednote-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bilibili&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;300M+ MAU&lt;/td&gt;
&lt;td&gt;Video content, danmaku, creator analytics&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All three actors: no browser, no proxy, no API keys. Built by &lt;a href="https://apify.com/zhorex" rel="noopener noreferrer"&gt;Zhorex&lt;/a&gt; — the only developer on Apify specializing in Chinese platform intelligence.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is there a Weibo API?&lt;/strong&gt;&lt;br&gt;
There is no official public Weibo API available for international developers. Weibo's developer platform requires a Chinese business license and imposes strict rate limits. This scraper is an alternative that needs neither a business license nor an API key.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need a VPN?&lt;/strong&gt;&lt;br&gt;
No. The Weibo endpoints used by this actor are globally accessible without a VPN or proxy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is the content in Chinese?&lt;/strong&gt;&lt;br&gt;
Yes. Weibo is a Chinese-language platform — all content is returned in the original Simplified Chinese. If you need English translations, pipe the output through a translation API (Google Translate, DeepL, or Claude).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is scraping Weibo legal?&lt;/strong&gt;&lt;br&gt;
This scraper only accesses publicly available data through Weibo's public web endpoints. It does not bypass authentication or access private/locked accounts. Always review your local laws and Weibo's terms of service.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;The Actor page, full input schema, and a free trial run are at:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;https://apify.com/zhorex/weibo-scraper&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start with trending topics (no login needed) to see the data quality, then expand to search and user profiles as needed.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>api</category>
      <category>python</category>
      <category>china</category>
    </item>
    <item>
      <title>Scraping G2 Reviews Without Kasada Headaches: A SaaS Competitive Intelligence Pipeline With 29 Fields Per Review</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Fri, 17 Apr 2026 17:51:52 +0000</pubDate>
      <link>https://forem.com/sami_8858131362756585e4f4/scraping-g2-reviews-without-kasada-headaches-a-saas-competitive-intelligence-pipeline-with-29-16gf</link>
      <guid>https://forem.com/sami_8858131362756585e4f4/scraping-g2-reviews-without-kasada-headaches-a-saas-competitive-intelligence-pipeline-with-29-16gf</guid>
      <description>&lt;p&gt;G2 holds roughly two million verified software reviews across tens of thousands of SaaS categories. For anyone doing competitive intelligence, sales prospecting, or product research in the B2B software space, it is one of the single highest-signal public datasets on the internet. The problem is that scraping it has become almost comically painful in 2026.&lt;/p&gt;

&lt;p&gt;If you have tried hitting &lt;code&gt;g2.com&lt;/code&gt; with &lt;code&gt;requests&lt;/code&gt; lately, you already know the story. Cloudflare Turnstile, then a Kasada challenge, then a TLS fingerprint check, then a behavioral JavaScript puzzle, then an invisible CAPTCHA. Even well-configured headless browsers with residential proxies get flagged within the first dozen pages. The G2 review data is public, but getting to it at scale has turned into a cat-and-mouse game that burns engineering hours and proxy budget in equal measure.&lt;/p&gt;

&lt;p&gt;This article walks through a cleaner path: using the &lt;code&gt;zhorex/g2-reviews-scraper&lt;/code&gt; Actor on Apify to pull structured reviews, 29 fields deep, without running a single browser or rotating a single proxy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why G2 Data Is Worth the Effort
&lt;/h2&gt;

&lt;p&gt;Before getting into the scraper itself, it helps to articulate what the data is actually good for. G2 reviews are structured, long-form, and written by verified business users. That combination makes them uniquely useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sales intelligence.&lt;/strong&gt; A review that complains about vendor X's integration with Salesforce is a warm lead for vendor Y whose Salesforce integration is its headline feature.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Win/loss analysis.&lt;/strong&gt; Reviews of competitors often name the alternatives the reviewer evaluated. This is free narrative market research.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature gap detection.&lt;/strong&gt; Aggregating thousands of "what do you dislike" fields across a category surfaces the roadmap items customers actually care about.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Churn signals.&lt;/strong&gt; Negative sub-ratings on "support" or "ease of setup" for a specific competitor, trended over quarters, predict defection windows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The value is there. The delivery mechanism is the bottleneck.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem With the Official Path
&lt;/h2&gt;

&lt;p&gt;G2 does offer a paid API, but it is gated behind an enterprise contract, requires a seat license on the G2 side, and typically restricts you to data about your own product and a handful of named competitors. Pulling the full set of reviews for a category like "CRM Software" or "Marketing Automation" across all vendors is not on the menu.&lt;/p&gt;

&lt;p&gt;The community workarounds are worse. DIY scrapers hit Kasada within minutes. Proxy bills for rotating residential IPs run $500 to $2,000 a month at modest volumes. Every time G2 rolls a new JS challenge, your Playwright script breaks at 3 AM and you spend a Saturday fingerprinting headers.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We burned six weeks and about $4,000 in proxies before admitting our in-house G2 scraper was never going to be stable."&lt;br&gt;
— Growth engineer at a mid-market PLG startup&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What the Actor Does
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;g2-reviews-scraper&lt;/code&gt; Actor bypasses all of that by calling G2's public review feed directly. No browser, no proxy rotation, no Kasada bypass hacks. You give it a product URL or slug and it returns structured JSON.&lt;/p&gt;

&lt;p&gt;Feature set:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scrape reviews for any G2 product URL or slug&lt;/li&gt;
&lt;li&gt;29 fields per review including sub-ratings, reviewer job title, company size, industry, and verification status&lt;/li&gt;
&lt;li&gt;Pagination handled internally, up to the full review history of a product&lt;/li&gt;
&lt;li&gt;Filters for star rating, date range, and review language&lt;/li&gt;
&lt;li&gt;JSON, CSV, Excel, or JSONL output&lt;/li&gt;
&lt;li&gt;Runs on Apify infrastructure, so no local Node/Python setup required for the scraping itself&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;G2 Official API&lt;/th&gt;
&lt;th&gt;DIY scraper + residential proxies&lt;/th&gt;
&lt;th&gt;&lt;code&gt;zhorex/g2-reviews-scraper&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Access to competitor reviews&lt;/td&gt;
&lt;td&gt;Limited (own product plus a few named competitors)&lt;/td&gt;
&lt;td&gt;Yes, if you can keep it running&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kasada / JS challenge handling&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Your problem, breaks weekly&lt;/td&gt;
&lt;td&gt;Handled, no browser needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Setup time&lt;/td&gt;
&lt;td&gt;Weeks (contract + provisioning)&lt;/td&gt;
&lt;td&gt;Days to weeks&lt;/td&gt;
&lt;td&gt;Minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost at 10k reviews/month&lt;/td&gt;
&lt;td&gt;Custom enterprise quote&lt;/td&gt;
&lt;td&gt;~$300-600 proxies + eng time&lt;/td&gt;
&lt;td&gt;$50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sub-ratings included&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Usually not&lt;/td&gt;
&lt;td&gt;Yes (29 fields)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Export formats&lt;/td&gt;
&lt;td&gt;JSON via API&lt;/td&gt;
&lt;td&gt;Whatever you build&lt;/td&gt;
&lt;td&gt;JSON, CSV, XLSX, JSONL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maintenance burden&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Input Example
&lt;/h2&gt;

&lt;p&gt;A realistic starter config for pulling reviews across three CRM competitors:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"productUrls"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.g2.com/products/salesforce-sales-cloud/reviews"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.g2.com/products/hubspot-sales-hub/reviews"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.g2.com/products/pipedrive/reviews"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxReviewsPerProduct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"minRating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxRating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dateFrom"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-01-01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"includeSubRatings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"includeReviewerProfile"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
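
&lt;p&gt;To run that config from code rather than the Apify console, something like the following should work. It is a minimal sketch, assuming the &lt;code&gt;apify-client&lt;/code&gt; Python package and an API token stored in an &lt;code&gt;APIFY_TOKEN&lt;/code&gt; environment variable (the variable name is an assumption). The helper names &lt;code&gt;build_run_input&lt;/code&gt; and &lt;code&gt;fetch_reviews&lt;/code&gt; are hypothetical, and the input keys are copied from the example config.&lt;/p&gt;

```python
import os


def build_run_input(slugs, max_reviews=500, date_from="2024-01-01"):
    """Build an input object shaped like the example config."""
    return {
        "productUrls": [f"https://www.g2.com/products/{s}/reviews" for s in slugs],
        "maxReviewsPerProduct": max_reviews,
        "minRating": 1,
        "maxRating": 5,
        "dateFrom": date_from,
        "language": "en",
        "includeSubRatings": True,
        "includeReviewerProfile": True,
    }


def fetch_reviews(run_input, actor_id="zhorex/g2-reviews-scraper"):
    """Start a run, wait for it to finish, and return the dataset items."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env var name
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

&lt;p&gt;Calling &lt;code&gt;fetch_reviews(build_run_input(["pipedrive"]))&lt;/code&gt; would then return the review items as a list of dictionaries, billed at the usual per-review rate.&lt;/p&gt;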



&lt;h2&gt;
  
  
  Output Example
&lt;/h2&gt;

&lt;p&gt;Here is one review item, trimmed to the fields most people care about. The full object has all 29 fields.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"g2-8421930"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"productSlug"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hubspot-sales-hub"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"productName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HubSpot Sales Hub"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewTitle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Great for SMB, feels cramped above 50 reps"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"starRating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"subRatings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"easeOfUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"qualityOfSupport"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"easeOfSetup"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"meetsRequirements"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.5&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewLikes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Pipeline view is clean, sequences are easy to build, and the free tier got us started without procurement."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewDislikes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Once we hit 60 reps the reporting module struggled. Forecasting is weaker than Salesforce and custom objects are limited."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"recommendations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Fine for teams under 50. Above that, evaluate Salesforce or Dynamics."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"displayName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Verified User in Software"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"jobTitle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RevOps Manager"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"industry"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Computer Software"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"companySize"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"51-200 employees"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"isVerified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"publishedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-11-08T14:22:10Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"helpfulCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"organic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.g2.com/products/hubspot-sales-hub/reviews/hubspot-sales-hub-review-8421930"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fields include &lt;code&gt;reviewId&lt;/code&gt;, &lt;code&gt;productSlug&lt;/code&gt;, &lt;code&gt;productName&lt;/code&gt;, &lt;code&gt;starRating&lt;/code&gt;, five sub-ratings, &lt;code&gt;reviewLikes&lt;/code&gt;, &lt;code&gt;reviewDislikes&lt;/code&gt;, &lt;code&gt;recommendations&lt;/code&gt;, &lt;code&gt;problemsSolved&lt;/code&gt;, &lt;code&gt;benefitsRealized&lt;/code&gt;, reviewer display name, job title, industry, company size, region, validation status, &lt;code&gt;publishedAt&lt;/code&gt;, &lt;code&gt;updatedAt&lt;/code&gt;, &lt;code&gt;language&lt;/code&gt;, &lt;code&gt;helpfulCount&lt;/code&gt;, &lt;code&gt;source&lt;/code&gt;, &lt;code&gt;incentive&lt;/code&gt; (if the review was incentivized), and the canonical &lt;code&gt;reviewUrl&lt;/code&gt;.&lt;/p&gt;
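
&lt;p&gt;Because &lt;code&gt;subRatings&lt;/code&gt; and &lt;code&gt;reviewer&lt;/code&gt; are nested objects, most spreadsheet and warehouse workflows will want them flattened first. A small sketch, assuming items shaped like the sample above; &lt;code&gt;flatten_review&lt;/code&gt; and the underscore-joined column names are illustrative choices, not part of the Actor.&lt;/p&gt;

```python
def flatten_review(item):
    """Flatten nested subRatings and reviewer objects into one flat row."""
    row = {}
    for key, value in item.items():
        if isinstance(value, dict):
            # Prefix nested keys with the parent key, e.g. subRatings_easeOfUse.
            for sub_key, sub_value in value.items():
                row[f"{key}_{sub_key}"] = sub_value
        else:
            row[key] = value
    return row


sample = {
    "reviewId": "g2-8421930",
    "starRating": 3.5,
    "subRatings": {"easeOfUse": 4.5, "meetsRequirements": 3.5},
    "reviewer": {"jobTitle": "RevOps Manager", "isVerified": True},
}
flat = flatten_review(sample)
print(sorted(flat))
```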

&lt;h2&gt;
  
  
  Four Real Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. SaaS Sales Displacement Plays
&lt;/h3&gt;

&lt;p&gt;A sales team selling a CRM builds a nightly job that pulls all 1- and 2-star reviews for three major competitors. Each review is piped through an LLM that extracts the specific complaint and the reviewer's company. The result is a prioritized outbound list where every lead comes with a documented pain point in their own words. Open rates on personalized sequences built from real G2 complaints routinely run 2-3x generic cold outbound.&lt;/p&gt;
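
&lt;p&gt;The first half of that job could look roughly like this. It is a sketch that assumes the &lt;code&gt;minRating&lt;/code&gt; and &lt;code&gt;maxRating&lt;/code&gt; filters from the input example and the field names from the sample output. Both helper names are hypothetical, and the LLM extraction step is left out.&lt;/p&gt;

```python
def low_star_input(product_urls):
    """Input that restricts a run to 1- and 2-star reviews via the rating filters."""
    return {"productUrls": list(product_urls), "minRating": 1, "maxRating": 2}


def to_lead(review):
    """Reduce one review to the fields an outbound sequence would need."""
    reviewer = review.get("reviewer", {})
    return {
        "competitor": review.get("productName"),
        "complaint": review.get("reviewDislikes", ""),  # raw text for the LLM step
        "jobTitle": reviewer.get("jobTitle"),
        "companySize": reviewer.get("companySize"),
        "reviewUrl": review.get("reviewUrl"),
    }
```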

&lt;h3&gt;
  
  
  2. Category-Level Feature Gap Analysis
&lt;/h3&gt;

&lt;p&gt;A product manager at a marketing automation vendor scrapes every review in the "Marketing Automation" category filed in the last 12 months, roughly 40,000 reviews across 30 products. She clusters the "dislikes" text with embeddings and counts cluster frequency per vendor. The result is a heatmap showing which features are consistent weak spots across the category (great roadmap input) and which are only weak for specific competitors (great competitive collateral).&lt;/p&gt;
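
&lt;p&gt;The counting step behind that heatmap is simple once every review carries a cluster label. The sketch below assumes the labels already exist, produced by whatever embedding and clustering step you prefer (not shown). The vendor slugs and cluster names are made up for illustration.&lt;/p&gt;

```python
from collections import Counter


def cluster_counts_by_vendor(labelled_reviews):
    """Count how often each complaint cluster appears for each vendor.

    Each input is a (productSlug, cluster_label) pair; the labels are assumed
    to come from an upstream embedding + clustering step not shown here.
    """
    counts = {}
    for slug, label in labelled_reviews:
        counts.setdefault(slug, Counter())[label] += 1
    return counts


pairs = [
    ("vendor-a", "reporting"),
    ("vendor-a", "reporting"),
    ("vendor-a", "pricing"),
    ("vendor-b", "reporting"),
]
heatmap = cluster_counts_by_vendor(pairs)
print(heatmap["vendor-a"]["reporting"])  # 2
```

&lt;p&gt;A cluster that scores high for every vendor is a category-wide weak spot. One that is high for a single vendor is competitor-specific.&lt;/p&gt;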

&lt;h3&gt;
  
  
  3. Churn and Renewal Risk Signals
&lt;/h3&gt;

&lt;p&gt;A customer success team at an observability vendor schedules a recurring scrape of their own product's reviews plus the top five competitors. Any new 1- or 2-star review mentioning an integration or feature their product covers gets routed to a Slack channel. The channel doubles as an early-warning system for account risk and a real-time queue of switch-ready prospects.&lt;/p&gt;
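
&lt;p&gt;The routing rule itself can be a few lines. This sketch assumes the run was already limited to low-star reviews with the rating filters, that &lt;code&gt;reviewId&lt;/code&gt; is stable between runs, and that you persist the set of seen IDs yourself. The function name is hypothetical and posting to Slack is left out.&lt;/p&gt;

```python
def reviews_to_route(reviews, keywords, seen_ids):
    """Return unseen reviews whose dislikes text mentions any watched keyword."""
    watched = [k.lower() for k in keywords]
    hits = []
    for review in reviews:
        if review["reviewId"] in seen_ids:
            continue  # already routed on an earlier run
        text = review.get("reviewDislikes", "").lower()
        if any(k in text for k in watched):
            hits.append(review)
            seen_ids.add(review["reviewId"])
    return hits
```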

&lt;h3&gt;
  
  
  4. Private Equity Due Diligence
&lt;/h3&gt;

&lt;p&gt;A PE analyst evaluating a SaaS acquisition scrapes 5 years of G2 history for the target and three comparable vendors. The trend of monthly average star rating, sub-rating deltas, and review volume growth becomes part of the investment memo. This is one of the few ways to reality-check the seller's narrative about product quality and market position.&lt;/p&gt;
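
&lt;p&gt;The monthly rating trend is a short aggregation over &lt;code&gt;publishedAt&lt;/code&gt; and &lt;code&gt;starRating&lt;/code&gt;. A minimal sketch using only the standard library, assuming ISO 8601 timestamps like the one in the sample output; the function name and sample rows are illustrative.&lt;/p&gt;

```python
from collections import defaultdict
from statistics import mean


def monthly_average_rating(reviews):
    """Map 'YYYY-MM' to (average starRating, review count) for that month."""
    by_month = defaultdict(list)
    for review in reviews:
        month = review["publishedAt"][:7]  # first 7 chars of the ISO timestamp
        by_month[month].append(review["starRating"])
    return {m: (mean(r), len(r)) for m, r in sorted(by_month.items())}


sample = [
    {"publishedAt": "2025-10-02T09:00:00Z", "starRating": 4.0},
    {"publishedAt": "2025-11-08T14:22:10Z", "starRating": 3.5},
    {"publishedAt": "2025-11-20T08:05:00Z", "starRating": 4.5},
]
trend = monthly_average_rating(sample)
print(trend)
```

&lt;p&gt;Running the same function over the target and each comparable vendor gives the side-by-side series for the memo, with the count showing review volume growth.&lt;/p&gt;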

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;The Actor is priced at &lt;strong&gt;$0.005 per review&lt;/strong&gt; scraped, billed through Apify. Platform compute usage is negligible because there is no browser.&lt;/p&gt;

&lt;p&gt;Worked examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1,000 reviews: &lt;strong&gt;$5&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;10,000 reviews (a large product's full history): &lt;strong&gt;$50&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;100,000 reviews (a full category sweep): &lt;strong&gt;$500&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;1,000,000 reviews (multi-category enterprise pull): &lt;strong&gt;$5,000&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
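
&lt;p&gt;The same arithmetic as a reusable helper, in case you want to budget a run before starting it. The sketch assumes the flat per-review rate above and ignores platform compute. &lt;code&gt;Decimal&lt;/code&gt; is used to keep the currency math exact.&lt;/p&gt;

```python
from decimal import Decimal

PRICE_PER_REVIEW = Decimal("0.005")  # USD, the per-review rate quoted above


def run_cost(review_count):
    """Actor cost in USD for a given number of reviews (platform compute excluded)."""
    return PRICE_PER_REVIEW * review_count


for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:,} reviews: ${run_cost(n):,.0f}")
```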

&lt;p&gt;Compare that to a DIY build where a single month of residential proxies for the same volume runs $1,500 to $3,000, plus engineering time to keep the Kasada bypass alive. For most teams the break-even point is well under a week.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is scraping G2 reviews legal?&lt;/strong&gt;&lt;br&gt;
G2 reviews are publicly accessible, and the Actor only collects data that any logged-out visitor can see. In the landmark hiQ v. LinkedIn litigation, the Ninth Circuit held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act, although terms-of-service and contract claims can still apply. How you use and redistribute the data is on you: do not republish full review text as your own content, and respect GDPR if you process reviewer metadata for EU data subjects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need proxies or a Kasada solver?&lt;/strong&gt;&lt;br&gt;
No. The Actor uses G2's public review feed directly and does not trip Kasada. You do not need to supply proxies, browser fingerprints, or CAPTCHA solver tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How fresh is the data?&lt;/strong&gt;&lt;br&gt;
Reviews are scraped live at run time. If a review was published five minutes before your run, it will be included. For ongoing monitoring, schedule the Actor hourly or daily via Apify Schedules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the rate limit?&lt;/strong&gt;&lt;br&gt;
Practically speaking, you are limited by Apify concurrency and the Actor's internal pacing, not by G2 blocking. Expect roughly 500-1,000 reviews per minute per run. Runs can be parallelized across products.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I get all 29 fields or is that a premium tier?&lt;/strong&gt;&lt;br&gt;
All 29 fields are included at the flat $0.005 per review rate. There is no feature-gated premium tier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I export to my warehouse?&lt;/strong&gt;&lt;br&gt;
Apify exposes datasets as JSON, CSV, XLSX, JSONL, RSS, and HTML table, and has native integrations for S3, Google Drive, and webhooks. A common pattern is to land JSONL in S3, then bulk-load it into the warehouse, for example with &lt;code&gt;COPY INTO&lt;/code&gt; in Snowflake or a load job in BigQuery.&lt;/p&gt;
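
&lt;p&gt;If you pull items through the API instead of downloading an export, writing JSONL yourself is straightforward. A minimal sketch: &lt;code&gt;dump_jsonl&lt;/code&gt; is a hypothetical helper, and the in-memory buffer stands in for a real file or upload stream.&lt;/p&gt;

```python
import io
import json


def dump_jsonl(items, handle):
    """Write one JSON object per line to an open text handle; return the row count."""
    count = 0
    for item in items:
        # ensure_ascii=False keeps non-English review text readable in the file.
        handle.write(json.dumps(item, ensure_ascii=False) + "\n")
        count += 1
    return count


buffer = io.StringIO()  # stand-in for a real file or upload stream
written = dump_jsonl([{"reviewId": "g2-8421930", "starRating": 3.5}], buffer)
print(written, buffer.getvalue().strip())
```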

&lt;h2&gt;
  
  
  Pair It With Capterra for Full B2B Coverage
&lt;/h2&gt;

&lt;p&gt;G2 skews toward mid-market and enterprise SaaS. Capterra, owned by Gartner, leans more toward SMB and has broader coverage of vertical software (construction, healthcare, legal). For any serious competitive intel project, you want both. The companion Actor &lt;a href="https://apify.com/zhorex/capterra-reviews-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/capterra-reviews-scraper&lt;/code&gt;&lt;/a&gt; uses the same schema philosophy and pairs cleanly with this one in a single pipeline. If you are also tracking sentiment on Chinese-language software or consumer platforms, &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; covers the APAC side.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;The Actor page, full input schema, and a free trial run live at:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/g2-reviews-scraper" rel="noopener noreferrer"&gt;https://apify.com/zhorex/g2-reviews-scraper&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Drop in a product URL, run it, and you will have a clean JSON dataset in the Apify console in a couple of minutes. No proxy contract, no Kasada cat-and-mouse, no maintenance bill at 3 AM.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>api</category>
      <category>saas</category>
      <category>python</category>
    </item>
  </channel>
</rss>
