<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Double CHEN</title>
    <description>The latest articles on Forem by Double CHEN (@double_chen_70da460344c73).</description>
    <link>https://forem.com/double_chen_70da460344c73</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3506122%2F7c2ff04c-887f-404d-bd2e-c4150e6431e8.png</url>
      <title>Forem: Double CHEN</title>
      <link>https://forem.com/double_chen_70da460344c73</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/double_chen_70da460344c73"/>
    <language>en</language>
    <item>
      <title>The fingerprint layer is why your Playwright + residential proxies still get blocked</title>
      <dc:creator>Double CHEN</dc:creator>
      <pubDate>Fri, 08 May 2026 10:42:17 +0000</pubDate>
      <link>https://forem.com/double_chen_70da460344c73/the-fingerprint-layer-is-why-your-playwright-residential-proxies-still-get-blocked-4aom</link>
      <guid>https://forem.com/double_chen_70da460344c73/the-fingerprint-layer-is-why-your-playwright-residential-proxies-still-get-blocked-4aom</guid>
      <description>&lt;h2&gt;
  
  
  The thread that started this
&lt;/h2&gt;

&lt;p&gt;A couple months ago I saw a post on r/webscraping that summed up the current state of things better than I ever could:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We have scrapers, ordinary ones, browser automation… we use proxies for location based blocking, residential proxies for data centre blockers, we rotate the user agent, we have some third party unblockers too. But often, we still get captchas, and CloudFlare can get in the way too."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Four layers of evasion. Still getting challenge pages. 167 upvotes, 52 comments, archived without a real solution. I've been writing some variant of that person's scraper for the past two years, and I think there's a cleaner answer now than there was when they posted.&lt;/p&gt;

&lt;p&gt;This is a writeup of what I think the actual problem is, what I tested, and the CLI-based setup I've ended up with.&lt;/p&gt;

&lt;h2&gt;
  
  
  The four layers that don't get you past Cloudflare
&lt;/h2&gt;

&lt;p&gt;These are the evasion techniques the OP was already using:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. User agent rotation
&lt;/h3&gt;

&lt;p&gt;Changes the &lt;code&gt;User-Agent&lt;/code&gt; header per request. This only defeats early-2010s anti-bot rules. Cloudflare, DataDome, and Akamai all moved past it years ago: they don't trust the UA string at all; they build their own fingerprint.&lt;/p&gt;
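&lt;p&gt;For completeness, the technique is essentially one line, which is exactly why it stopped working (the UA strings below are illustrative placeholders):&lt;/p&gt;

```python
import random

# Illustrative pool; real rotations use current, full UA strings
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome-like UA",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Safari-like UA",
]

def rotated_headers():
    """Pick a fresh User-Agent per request; this is all the technique does."""
    return {"User-Agent": random.choice(USER_AGENTS)}
```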

&lt;h3&gt;
  
  
  2. Residential proxies
&lt;/h3&gt;

&lt;p&gt;Gets you past IP-reputation lists (data-center ASNs are pre-flagged). Useful. But the proxy exit IP is one of dozens of signals — not a primary discriminator on any WAF I've worked with.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Geo/location proxies
&lt;/h3&gt;

&lt;p&gt;Solves "this site only serves US users" or similar rate-limit-per-country patterns. Good for what it does. Doesn't affect bot detection.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Third-party unblocker services
&lt;/h3&gt;

&lt;p&gt;These are typically browser farms as a service with some stealth built in. They work well until the service itself gets fingerprinted, which eventually happens to all of them: once a service becomes popular, anti-bot vendors train their models on its traffic.&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually gets checked
&lt;/h2&gt;

&lt;p&gt;The top comments on that thread are a good inventory. I've verified each of these against my own traffic:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLS fingerprint (JA3 / JA3S)&lt;/strong&gt;&lt;br&gt;
Cipher suite order, extension list, supported groups — all sent during the TLS handshake, before any HTTP. Python &lt;code&gt;requests&lt;/code&gt; has a JA3 that screams "non-browser." &lt;code&gt;curl_cffi&lt;/code&gt; and &lt;code&gt;rnet&lt;/code&gt; both let you mimic a real browser's TLS stack, which is a big reason they work better. With browser automation you get this "for free" as long as a real browser is underneath.&lt;/p&gt;
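&lt;p&gt;For a sense of how little input JA3 needs, it is just an MD5 over a canonical serialization of ClientHello fields. A minimal sketch, with illustrative values rather than any real browser's:&lt;/p&gt;

```python
import hashlib

def ja3(tls_version, ciphers, extensions, curves, point_formats):
    """Canonical JA3: decimal field values joined by '-' within a field
    and ',' between fields, then MD5-hashed. Reordering one cipher or
    extension changes the digest, which is why non-browser TLS stacks
    stand out immediately."""
    fields = [str(tls_version)] + [
        "-".join(str(v) for v in part)
        for part in (ciphers, extensions, curves, point_formats)
    ]
    ja3_string = ",".join(fields)
    return ja3_string, hashlib.md5(ja3_string.encode()).hexdigest()

s, digest = ja3(771, [4865, 4866], [0, 23, 65281], [29, 23], [0])
print(s)  # 771,4865-4866,0-23-65281,29-23,0
```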

&lt;p&gt;&lt;strong&gt;JavaScript-level fingerprint&lt;/strong&gt;&lt;br&gt;
This is the big one. Scripts loaded on page render query:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;navigator.webdriver&lt;/code&gt; (should be &lt;code&gt;undefined&lt;/code&gt;, not &lt;code&gt;true&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;navigator.plugins&lt;/code&gt; (Playwright/Puppeteer vanilla: empty array)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;navigator.languages&lt;/code&gt; (should match headers)&lt;/li&gt;
&lt;li&gt;Canvas rendering hash (deterministic per GPU + driver, the real moat)&lt;/li&gt;
&lt;li&gt;WebGL renderer string&lt;/li&gt;
&lt;li&gt;Font list (from &lt;code&gt;document.fonts&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Audio context hash&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;window.chrome.runtime&lt;/code&gt; (missing in headless Chrome by default)&lt;/li&gt;
&lt;/ul&gt;
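&lt;p&gt;A toy version of what the consumer of those signals does with them (the dict keys and rules are illustrative, not any vendor's actual logic):&lt;/p&gt;

```python
def automation_giveaways(fp):
    """Flag the vanilla-automation tells from the list above. `fp` is a
    hypothetical dict of values collected by the challenge script."""
    flags = []
    if fp.get("navigator.webdriver") is True:
        flags.append("webdriver flag set")
    if fp.get("navigator.plugins") == []:
        flags.append("empty plugin list")
    if fp.get("navigator.languages") != fp.get("accept_language_header"):
        flags.append("languages/header mismatch")
    if "window.chrome.runtime" not in fp:
        flags.append("missing chrome.runtime")
    return flags

vanilla = {"navigator.webdriver": True, "navigator.plugins": [],
           "navigator.languages": ["en-US"],
           "accept_language_header": ["en-US"]}
print(automation_giveaways(vanilla))  # three tells from a vanilla launch
```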

&lt;p&gt;Plugin-based stealth libraries patch the top three to five of these. The WAF vendors add the remaining signals to their detection within a release cycle, and you end up in a patch loop.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Behavioral&lt;/strong&gt;&lt;br&gt;
Mouse movement curvature, keyboard interval distribution, scroll velocity, click-before-load latency. These matter for high-tier targets (DataDome enterprise, Akamai Bot Manager).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Network latency between IP and proxy&lt;/strong&gt;&lt;br&gt;
This is the gotcha. One comment on the original thread describes a month-long debug where every other signal looked clean, but the timing of browser actions vs. the measured round-trip didn't match what a single user would look like. The fix was positioning the data-center IP near the residential proxy exit.&lt;/p&gt;
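&lt;p&gt;A toy model of that check, assuming the defender compares the transport-layer round-trip against an in-page echo (the function and threshold are illustrative, not a real WAF's):&lt;/p&gt;

```python
def latency_suspicious(handshake_rtt_ms, js_echo_rtt_ms, tolerance=0.5):
    """For a single real user the two round-trips should roughly agree;
    a browser sitting far from its proxy exit adds an extra hop to one
    measurement but not the other."""
    hi = max(handshake_rtt_ms, js_echo_rtt_ms)
    lo = min(handshake_rtt_ms, js_echo_rtt_ms)
    return (hi - lo) / hi > tolerance

print(latency_suspicious(40, 45))    # False: plausible single user
print(latency_suspicious(40, 180))   # True: a hidden extra hop
```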
&lt;h2&gt;
  
  
  What I tested
&lt;/h2&gt;

&lt;p&gt;I built a small test harness: 5 target sites known to use different anti-bot tiers, and ran each stack against each site for 200 sessions over a week. I counted successful page loads (no challenge page appeared).&lt;/p&gt;
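&lt;p&gt;The harness is nothing fancy. A stripped-down sketch, with an illustrative stub standing in for a real stack and illustrative challenge markers:&lt;/p&gt;

```python
CHALLENGE_MARKERS = ("Just a moment", "cf-challenge", "captcha")  # illustrative

def run_trials(fetch, url, n=200):
    """Count page loads where no challenge marker appears in the HTML.
    `fetch` is whichever stack is under test."""
    ok = 0
    for i in range(n):
        html = fetch(url, i)
        if not any(m in html for m in CHALLENGE_MARKERS):
            ok += 1
    return ok

# Stub: this fake stack gets challenged on every 4th session
fake_fetch = lambda url, i: "Just a moment" if i % 4 == 3 else "page body"
print(run_trials(fake_fetch, "https://example.com"))  # 150
```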

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;th&gt;Light WAF&lt;/th&gt;
&lt;th&gt;Medium (reCAPTCHA)&lt;/th&gt;
&lt;th&gt;Hard (CF Bot Mgmt)&lt;/th&gt;
&lt;th&gt;Very Hard (DataDome)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Plain Playwright&lt;/td&gt;
&lt;td&gt;198/200&lt;/td&gt;
&lt;td&gt;142/200&lt;/td&gt;
&lt;td&gt;6/200&lt;/td&gt;
&lt;td&gt;0/200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Playwright + stealth plugin&lt;/td&gt;
&lt;td&gt;200/200&lt;/td&gt;
&lt;td&gt;189/200&lt;/td&gt;
&lt;td&gt;94/200&lt;/td&gt;
&lt;td&gt;2/200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Playwright + stealth + residential&lt;/td&gt;
&lt;td&gt;200/200&lt;/td&gt;
&lt;td&gt;195/200&lt;/td&gt;
&lt;td&gt;127/200&lt;/td&gt;
&lt;td&gt;11/200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Camoufox (anti-detect browser)&lt;/td&gt;
&lt;td&gt;200/200&lt;/td&gt;
&lt;td&gt;198/200&lt;/td&gt;
&lt;td&gt;173/200&lt;/td&gt;
&lt;td&gt;34/200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;browser-act CLI + residential&lt;/td&gt;
&lt;td&gt;200/200&lt;/td&gt;
&lt;td&gt;200/200&lt;/td&gt;
&lt;td&gt;191/200&lt;/td&gt;
&lt;td&gt;47/200&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The gap between plain Playwright and the stealth-aware setups on Cloudflare Bot Management was the most dramatic: 6/200 versus 94/200 or better. The gaps between the stealth setups themselves were smaller but still significant. DataDome Enterprise remains hard for everything except mobile-device-based approaches.&lt;/p&gt;
&lt;h2&gt;
  
  
  Working with browser-act CLI
&lt;/h2&gt;

&lt;p&gt;I ended up moving most of my scrapers over to browser-act. Not because it's strictly the highest-scoring option — Camoufox is very close — but because it's a CLI instead of a library. That changed how I write scrapers.&lt;/p&gt;
&lt;h3&gt;
  
  
  Install
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add browser-act/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; browser-act
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;(Uses npm's skill system; the CLI itself is a Python package that gets installed on first run.)&lt;/p&gt;
&lt;h3&gt;
  
  
  The commands that matter
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; myjob browser open &amp;lt;stealth_id&amp;gt; https://example.com
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; myjob &lt;span class="nb"&gt;wait &lt;/span&gt;stable
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; myjob solve-captcha
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; myjob get markdown
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; myjob state
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; myjob click 14
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; myjob input 7 &lt;span class="s2"&gt;"search query"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The &lt;code&gt;--session&lt;/code&gt; flag keeps your cookie jar and fingerprint persistent across calls. Log in once, reuse the session for subsequent scrapes.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;solve-captcha&lt;/code&gt; is the built-in Cloudflare Turnstile + reCAPTCHA v2 + hCaptcha solver. In my testing it returned &lt;code&gt;solved: True&lt;/code&gt; on Indeed and Product Hunt in under 2 seconds each. No 2captcha/anti-captcha account needed — the CLI handles it.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;get markdown&lt;/code&gt; is the one I didn't expect to use as much as I do. It returns an LLM-optimized markdown representation of the page, stripping navigation chrome, scripts, and ad containers. On the Product Hunt AI directory:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Raw HTML: 680,193 chars&lt;/li&gt;
&lt;li&gt;browser-act markdown: 49,272 chars&lt;/li&gt;
&lt;li&gt;Reduction: 92.8%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're running LLM-in-the-loop scraping, that's ~14x fewer input tokens per page, which compounds fast on high-volume jobs.&lt;/p&gt;
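&lt;p&gt;The arithmetic, for anyone checking:&lt;/p&gt;

```python
raw_chars, md_chars = 680_193, 49_272
print(f"{1 - md_chars / raw_chars:.2%} smaller")    # 92.76% smaller
print(f"~{raw_chars / md_chars:.0f}x fewer chars")  # ~14x fewer chars
```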
&lt;h3&gt;
  
  
  A concrete refactor
&lt;/h3&gt;

&lt;p&gt;Here's a login-then-scrape flow. Old version (playwright + stealth + retry glue):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;playwright.async_api&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;async_playwright&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;playwright_stealth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stealth_async&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;async_playwright&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;headless&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--disable-blink-features=AutomationControlled&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;user_agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Mozilla/5.0 ...&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;viewport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;width&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1920&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;height&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1080&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new_page&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;stealth_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# ... ~80 more lines: cookie banner dismissal, login form, challenge retry loop
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;New version, same flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;SESSION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;mywork
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; &lt;span class="nv"&gt;$SESSION&lt;/span&gt; browser open &lt;span class="nv"&gt;$BID&lt;/span&gt; https://target.com/login
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; &lt;span class="nv"&gt;$SESSION&lt;/span&gt; &lt;span class="nb"&gt;wait &lt;/span&gt;stable
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; &lt;span class="nv"&gt;$SESSION&lt;/span&gt; input 3 &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$USER&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; &lt;span class="nv"&gt;$SESSION&lt;/span&gt; input 4 &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PASS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; &lt;span class="nv"&gt;$SESSION&lt;/span&gt; click 5
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; &lt;span class="nv"&gt;$SESSION&lt;/span&gt; &lt;span class="nb"&gt;wait &lt;/span&gt;stable
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; &lt;span class="nv"&gt;$SESSION&lt;/span&gt; solve-captcha  &lt;span class="c"&gt;# handles CF challenge if one appears&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; &lt;span class="nv"&gt;$SESSION&lt;/span&gt; navigate https://target.com/dashboard
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; &lt;span class="nv"&gt;$SESSION&lt;/span&gt; get markdown &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; dashboard.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;My actual scraper went from ~800 lines of Python glue to ~120 lines of bash calling the CLI. Much less to maintain, and the debug story is better — each command prints its result, so I can &lt;code&gt;tee&lt;/code&gt; and inspect.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limits and caveats
&lt;/h2&gt;

&lt;p&gt;Being honest about what this doesn't do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DataDome Enterprise and Akamai Bot Manager&lt;/strong&gt; are still hard. My 47/200 on DataDome is better than other stacks in the test but not production-viable for aggressive scraping. For those targets you're looking at mobile device farms or paid bypass APIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proxy rotation&lt;/strong&gt; is not automatic; configure with &lt;code&gt;--dynamic-proxy &amp;lt;region&amp;gt;&lt;/code&gt; or &lt;code&gt;--custom-proxy &amp;lt;url&amp;gt;&lt;/code&gt;. Still need a proxy source if you're at scale.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Browser profiles&lt;/strong&gt; are stored locally. If you want to share a logged-in session across machines, you need to export/import the profile yourself. Not a one-click thing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLI-as-library&lt;/strong&gt; is a trade-off: you lose the fine-grained control of the Playwright API. For 80% of scraping flows that's fine; for the 20% where you need, say, CDP-level network interception mid-session, you'd stay with Playwright.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;The Reddit thread is accurate: UA rotation + residential proxies + unblockers aren't enough on their own because they don't touch the fingerprint layer, which is what modern WAFs actually gate on. Getting past Cloudflare Bot Management or similar means controlling:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;TLS fingerprint (use a real browser or curl-cffi)&lt;/li&gt;
&lt;li&gt;JavaScript-level signals (canvas, webdriver, plugins — patches for all of them, not just the easy few)&lt;/li&gt;
&lt;li&gt;Captcha handling (built-in solver or paid service)&lt;/li&gt;
&lt;li&gt;Session persistence (so you're not re-solving on every request)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;browser-act is one option that bundles all four; Camoufox + a captcha service is another. The exact tool matters less than recognizing that the fingerprint layer is where the game is now.&lt;/p&gt;

&lt;p&gt;Happy to hear if you've tested other setups — especially on the DataDome/Akamai side, where I think the community's collective knowledge is still thin.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>automation</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>browser-act CLI Launches Dynamic Proxy — All-in-One Browser Automation for AI Agents</title>
      <dc:creator>Double CHEN</dc:creator>
      <pubDate>Wed, 29 Apr 2026 09:19:28 +0000</pubDate>
      <link>https://forem.com/double_chen_70da460344c73/browser-act-cli-launches-dynamic-proxy-all-in-one-browser-automation-for-ai-agents-525l</link>
      <guid>https://forem.com/double_chen_70da460344c73/browser-act-cli-launches-dynamic-proxy-all-in-one-browser-automation-for-ai-agents-525l</guid>
      <description>&lt;p&gt;BrowserAct just released a major feature for browser-act CLI: built-in dynamic proxies.&lt;br&gt;
This is designed for AI Agents and enterprise-grade automation.&lt;/p&gt;

&lt;p&gt;Why it matters:&lt;/p&gt;

&lt;p&gt;Traditional browser automation requires you to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Buy from third-party proxy providers ($)&lt;/li&gt;
&lt;li&gt;Configure complex URLs and credentials&lt;/li&gt;
&lt;li&gt;Handle anti-bot detection (another service)&lt;/li&gt;
&lt;li&gt;Solve CAPTCHAs (more fees)&lt;/li&gt;
&lt;li&gt;Manage region switching (engineering overhead)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This fragmented toolchain makes automation brittle and expensive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The browser-act CLI way:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One command. Zero config. Unified billing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What this means for you:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;✅ No proxy juggling — No need to buy from third-party proxy providers or manage credentials. Switch between multiple countries with one command.&lt;br&gt;
✅ Stay undetected — Your automation looks human (stealth fingerprinting + rotating IPs), bypass Cloudflare &amp;amp; DataDome automatically. No more IP bans.&lt;br&gt;
✅ Built for your AI workflow — Run multiple regions in parallel, lower token costs, automatic human handoff when AI gets stuck.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cross-border e-commerce price monitoring&lt;/li&gt;
&lt;li&gt;Multi-account automation&lt;/li&gt;
&lt;li&gt;OpenClaw / Claude Code / Cursor powered data collection&lt;/li&gt;
&lt;li&gt;Global testing &amp;amp; compliance validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is infrastructure built for the AI era.&lt;/p&gt;

&lt;p&gt;Special Offer: ⭐ Star for 500 credits: &lt;a href="https://github.com/browser-act/skills" rel="noopener noreferrer"&gt;https://github.com/browser-act/skills&lt;/a&gt;&lt;br&gt;
📖 Full docs: &lt;a href="https://github.com/browser-act/skills/tree/main/browser-act" rel="noopener noreferrer"&gt;https://github.com/browser-act/skills/tree/main/browser-act&lt;/a&gt;&lt;/p&gt;

</description>
      <category>browseract</category>
      <category>webautomation</category>
      <category>ai</category>
      <category>devtools</category>
    </item>
    <item>
      <title>Why your scraper plateaus at 5-6 concurrent Chrome instances (and the shared-cookie trap nobody names)</title>
      <dc:creator>Double CHEN</dc:creator>
      <pubDate>Thu, 23 Apr 2026 03:35:31 +0000</pubDate>
      <link>https://forem.com/double_chen_70da460344c73/why-your-scraper-plateaus-at-5-6-concurrent-chrome-instances-and-the-shared-cookie-trap-nobody-3baa</link>
      <guid>https://forem.com/double_chen_70da460344c73/why-your-scraper-plateaus-at-5-6-concurrent-chrome-instances-and-the-shared-cookie-trap-nobody-3baa</guid>
      <description>&lt;p&gt;Someone on r/webscraping this week hit the wall I've seen a dozen projects hit:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"When I try to use multiple pages (tabs) within a single browser instance, Turnstile doesn't load properly on background or non-focused pages. Because of that, I'm forced to run one browser instance per page... I can do like 5 or 6 browsers simultaneously before throttling my CPU, avg about 30+ solves a minute."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;5-6 browsers max. 30 solves/min. And — critically — the OP can't go &lt;em&gt;up&lt;/em&gt; by using tabs, because tabs break the Turnstile flow.&lt;/p&gt;

&lt;p&gt;This is a design constraint masquerading as a performance problem. Let's name it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why tabs break Turnstile (and other CF challenges)
&lt;/h2&gt;

&lt;p&gt;Cloudflare's Turnstile widget does two things that make it hostile to multi-tab scraping in one Chrome process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;It checks &lt;code&gt;document.visibilityState&lt;/code&gt;.&lt;/strong&gt; A backgrounded tab reports &lt;code&gt;hidden&lt;/code&gt;, and the widget's challenge scripts bail out or stall waiting for a &lt;code&gt;visible&lt;/code&gt; transition. This is what the OP observed as "Turnstile doesn't load properly on non-focused pages."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cookies are shared across the entire browser profile.&lt;/strong&gt; When two tabs both start a CF challenge on the same origin, they race for the same &lt;code&gt;cf_clearance&lt;/code&gt; cookie slot. Whichever tab gets focused first writes its token; the other tab's challenge sees a mismatched state and blocks.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
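&lt;p&gt;The race is easy to model: one cookie jar per profile, shared by every tab (all names here are illustrative):&lt;/p&gt;

```python
cookie_jar = {}  # one jar per Chrome profile, shared by all tabs

def solve_challenge(token):
    """Each tab's challenge writes its clearance into the same slot."""
    cookie_jar["cf_clearance"] = token

def validate(expected_token):
    """Server-side view: the clearance a tab sends must match the
    challenge that tab actually solved."""
    return cookie_jar.get("cf_clearance") == expected_token

solve_challenge("token-A")   # tab A finishes its challenge
solve_challenge("token-B")   # tab B clobbers tab A's clearance
print(validate("token-A"))   # False: tab A now sends tab B's token
print(validate("token-B"))   # True
```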

&lt;p&gt;One of the replies on the thread nailed this in passing: &lt;em&gt;"cookies are shared in a browser between all tabs, multiple challenges can block each other."&lt;/em&gt; That's the full story: the focus requirement is only a symptom; the cookie race is the actual collision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why "5-6 browsers max" is the wrong number
&lt;/h2&gt;

&lt;p&gt;If you profile what those 5-6 Chrome processes are doing, you'll see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;~200MB RSS each&lt;/strong&gt;, most of it heap from V8 and the renderer&lt;/li&gt;
&lt;li&gt;Two or three render threads spinning waiting for page load / paint&lt;/li&gt;
&lt;li&gt;A pool of worker threads for networking and crypto&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On a 16GB / 8-core machine you're CPU-bound, not memory-bound, because every page load triggers full Chromium rendering + JS execution &lt;em&gt;for the challenge script&lt;/em&gt; — which is deliberately expensive (that's the "work" part of proof-of-work).&lt;/p&gt;

&lt;p&gt;So the real ceiling isn't "how many Chrome binaries fit in RAM" — it's "how many concurrent CF challenges can your CPU solve in parallel." At the OP's 30 solves/min on 5-6 browsers, that's ~5 solves/min per browser, or about one every 10-12 seconds. That matches what the challenge takes on a cold profile.&lt;/p&gt;
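&lt;p&gt;Quick check on those numbers:&lt;/p&gt;

```python
solves_per_min, browsers = 30, 5.5  # midpoint of "5 or 6"
per_browser = solves_per_min / browsers
print(f"{per_browser:.1f} solves/min per browser")     # 5.5
print(f"one challenge every {60 / per_browser:.0f}s")  # 11s
```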

&lt;h2&gt;
  
  
  The profile-isolation fix
&lt;/h2&gt;

&lt;p&gt;The escape is not more threads or tabs. It's &lt;strong&gt;profile isolation with cached clearance&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The idea:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pre-warm a pool of Chrome profiles (say, 20 of them) by letting each one solve a CF challenge once and storing the resulting &lt;code&gt;cf_clearance&lt;/code&gt; cookie&lt;/li&gt;
&lt;li&gt;For each scrape request, pick a profile whose clearance hasn't expired (CF clearances last ~30 min typically)&lt;/li&gt;
&lt;li&gt;Run the scrape as that profile. Because the clearance is already present, no challenge runs — you skip the 10-second proof-of-work&lt;/li&gt;
&lt;li&gt;When a profile's clearance expires, quietly re-warm it in the background&lt;/li&gt;
&lt;/ol&gt;
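&lt;p&gt;A minimal sketch of that pool logic, assuming a fixed ~30 min TTL (the class and its methods are hypothetical scaffolding, not browser-act's API; &lt;code&gt;warm&lt;/code&gt; stands in for "open the target once and let the challenge run"):&lt;/p&gt;

```python
import time

CLEARANCE_TTL = 30 * 60  # seconds; matches the ~30 min lifetime above

class ProfilePool:
    """Pre-warmed profile pool: hand out profiles whose cf_clearance
    is still valid so no challenge runs during the actual scrape."""
    def __init__(self, names):
        self.warmed_at = {n: None for n in names}  # None = never warmed

    def warm(self, name, now=None):
        """In reality: drive the browser profile through one challenge."""
        self.warmed_at[name] = time.time() if now is None else now

    def acquire(self, now=None):
        """Return a profile with valid clearance, else None (re-warm first)."""
        now = time.time() if now is None else now
        for name, t in self.warmed_at.items():
            if t is not None and CLEARANCE_TTL > now - t:
                return name
        return None

pool = ProfilePool([f"scrape-{i}" for i in range(1, 21)])
pool.warm("scrape-1", now=1000.0)
print(pool.acquire(now=1060.0))            # scrape-1: warm, no challenge
print(pool.acquire(now=1000.0 + 31 * 60))  # None: expired, re-warm first
```

In production the acquire step would also track which profiles are in use and kick off background re-warms as clearances approach expiry.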

&lt;p&gt;With this architecture the bottleneck shifts from "CF challenge compute per browser" to "network latency per page," and you can fan out to dozens of concurrent requests.&lt;/p&gt;

&lt;p&gt;Real numbers from what I've been running with this pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;20 profiles, pre-warmed&lt;/li&gt;
&lt;li&gt;~1.5s avg page load on CF-protected targets (from warm clearance)&lt;/li&gt;
&lt;li&gt;~8× throughput vs "one browser per page with fresh challenges"&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Doing it with browser-act
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;browser-act&lt;/code&gt; is a CLI that manages the profile pool for you — each &lt;code&gt;browser-id&lt;/code&gt; is an isolated profile with its own cookies and storage.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install:&lt;/span&gt;
npx skills add browser-act/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; browser-act

&lt;span class="c"&gt;# Create 20 profiles:&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;i &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;seq &lt;/span&gt;1 20&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;browser-act browser create &lt;span class="nt"&gt;--profile-name&lt;/span&gt; &lt;span class="s2"&gt;"scrape-&lt;/span&gt;&lt;span class="nv"&gt;$i&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;

&lt;span class="c"&gt;# Warm each profile by opening the target site once (runs the CF challenge):&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; warmup browser open &amp;lt;profile-id&amp;gt; https://target.site

&lt;span class="c"&gt;# Later, from your scraper, pick a warm profile and run:&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; scrape browser open &amp;lt;profile-id&amp;gt; https://target.site/page-N
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; scrape get markdown &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; page-N.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cookies persist on disk per profile, so restarting the scraper doesn't lose clearance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things worth arguing about
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;This doesn't help if your target site &lt;em&gt;rotates&lt;/em&gt; challenge policy per request (some bank / gambling sites do). That's a different regime — you need a JS solver loop&lt;/li&gt;
&lt;li&gt;20 profiles is about where you hit diminishing returns on a single machine. Past that, put them on separate instances with separate IPs — the &lt;code&gt;cf_clearance&lt;/code&gt; cookie is IP-bound&lt;/li&gt;
&lt;li&gt;If you only need ~30 pages/minute, the OP's 5-6 browser setup is fine. This matters past ~100 pages/min&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full r/webscraping thread is at &lt;a href="https://www.reddit.com/r/webscraping/comments/1smb3dm/optimised_chrome_for_multi_threading/" rel="noopener noreferrer"&gt;optimised chrome? for multi threading&lt;/a&gt;. If you're fighting the same ceiling, drop what you've tried — happy to compare.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>chrome</category>
      <category>automation</category>
      <category>performance</category>
    </item>
    <item>
      <title>Selenium keeps getting blocked by Cloudflare? Here's what the fingerprint actually catches (and how to stop triggering it)</title>
      <dc:creator>Double CHEN</dc:creator>
      <pubDate>Thu, 23 Apr 2026 03:32:53 +0000</pubDate>
      <link>https://forem.com/double_chen_70da460344c73/selenium-keeps-getting-blocked-by-cloudflare-heres-what-the-fingerprint-actually-catches-and-how-4fn9</link>
      <guid>https://forem.com/double_chen_70da460344c73/selenium-keeps-getting-blocked-by-cloudflare-heres-what-the-fingerprint-actually-catches-and-how-4fn9</guid>
      <description>&lt;p&gt;A post on r/webscraping this week asked a question that keeps coming up: &lt;em&gt;"I'm using Selenium through Chrome, need to scrape ~1M pages at ~1s/page, but every request hangs 7-8s on a Cloudflare challenge."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;At 7 seconds per page, 1M pages takes 81 days. That's not a rate-limit problem; it's a &lt;em&gt;detection&lt;/em&gt; problem, and you can't fix it with more threads.&lt;/p&gt;
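&lt;p&gt;The math on that:&lt;/p&gt;

```python
pages, secs_per_page = 1_000_000, 7
days = pages * secs_per_page / 86_400  # seconds in a day
print(f"{days:.0f} days")  # 81 days
```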

&lt;p&gt;The replies to that post are a goldmine of advice that's mostly right but doesn't explain &lt;em&gt;why&lt;/em&gt;. Let's do that.&lt;/p&gt;

&lt;h2&gt;
  
  
  The actual thing Cloudflare catches
&lt;/h2&gt;

&lt;p&gt;Selenium's ChromeDriver leaks automation in at least three observable ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;navigator.webdriver === true&lt;/code&gt;&lt;/strong&gt; — exposed by design, WebDriver spec requires it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CDP client signature&lt;/strong&gt; — ChromeDriver wraps Chrome's DevTools Protocol with a specific RPC pattern that's detectable via timing and order of &lt;code&gt;Target.*&lt;/code&gt; calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing browser UI signals&lt;/strong&gt; — Selenium launches Chrome without certain accessibility/window events that real users always generate&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One of the top comments on that Reddit thread summarized it well:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Selenium operates using a ChromeDriver or a GeckoDriver binary, which any respectable company that doesn't want bots on its website can fingerprint. That doesn't mean Selenium is broken — it just means it was not made for what you're trying to do. Selenium's purpose is automated testing."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's the right read. Selenium was designed for QA, where you &lt;em&gt;want&lt;/em&gt; the site to know you're an automated test. Cloudflare's Bot Management scores those same signals against a human baseline, and the score tanks fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the comments recommend (and what actually works)
&lt;/h2&gt;

&lt;p&gt;Filtered through real use:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;th&gt;Catch&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;undetected-chromedriver&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Patches the WebDriver flag + CDP strings&lt;/td&gt;
&lt;td&gt;Cloudflare pushes updates that re-detect it every few months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;SeleniumBase&lt;/code&gt; CDP mode&lt;/td&gt;
&lt;td&gt;Skips ChromeDriver, talks CDP directly to Chrome&lt;/td&gt;
&lt;td&gt;Works on most CF sites; still one process per browser&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;curl_cffi&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Impersonates a browser's TLS JA3 fingerprint&lt;/td&gt;
&lt;td&gt;No JS execution — breaks on sites that hydrate with React&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;nodriver&lt;/code&gt; / &lt;code&gt;zendriver&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Headful (non-headless) Chrome driven over patched CDP&lt;/td&gt;
&lt;td&gt;Good for low-scale; resource-heavy at 1M pages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Real Chrome + stealth profile&lt;/td&gt;
&lt;td&gt;Actual Chrome binary, persistent profile, cookies survive&lt;/td&gt;
&lt;td&gt;What most anti-bot services assume&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The last row is what I'll show below — and it's what I've been using.&lt;/p&gt;

&lt;h2&gt;
  
  
  The actual result
&lt;/h2&gt;

&lt;p&gt;The before-and-after captures are from the &lt;strong&gt;same browser process&lt;/strong&gt;, same machine, same IP. The only variable was the fingerprint config.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I'm doing it now
&lt;/h2&gt;

&lt;p&gt;I've been using &lt;code&gt;browser-act&lt;/code&gt; — a CLI that drives a real Chrome with a persistent stealth profile. One command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install (uses the skills package registry):&lt;/span&gt;
npx skills add browser-act/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; browser-act

&lt;span class="c"&gt;# Open a Cloudflare-protected page in a stealth session:&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; scrape browser open &amp;lt;profile-id&amp;gt; https://target.site
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; scrape get markdown &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; out.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The profile persists cookies and storage between runs, so the "warm browser" signals (history, localStorage, prior CF cookies) look human. For the r/webscraping OP's scale question (~1M pages), you'd run this with a pool of profile IDs and rotate — but that's a separate post.&lt;/p&gt;
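&lt;p&gt;The rotation idea in a minimal Python sketch. The profile IDs and session names are placeholders, and the command strings just mirror the two CLI calls shown above; treat this as an assumption about batch usage, not documented behavior:&lt;/p&gt;

```python
# Minimal round-robin profile rotation sketch.
# Profile IDs and session names are placeholders for illustration.
from itertools import cycle

profiles = cycle(["profile-a", "profile-b", "profile-c"])

def commands_for(url: str) -> list[str]:
    """Build the two browser-act calls for one URL on the next profile."""
    profile = next(profiles)
    session = f"scrape-{profile}"
    return [
        f"browser-act --session {session} browser open {profile} {url}",
        f"browser-act --session {session} get markdown",
    ]

for url in ["https://target.site/page/1", "https://target.site/page/2"]:
    print("\n".join(commands_for(url)))
```

&lt;p&gt;Each profile keeps its own warm cookies and storage, so rotating profiles (not just IPs) spreads the load without resetting the "human" signals on every request.&lt;/p&gt;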

&lt;h2&gt;
  
  
  Things worth arguing about
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;If your target is a Cloudflare &lt;strong&gt;Turnstile&lt;/strong&gt; specifically (not the full JS challenge), you're in a different regime — &lt;code&gt;curl_cffi&lt;/code&gt; + an injected widget can work, as one of the r/webscraping replies showed&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;undetected-chromedriver&lt;/code&gt; is the cheapest entry point if you already have Selenium code and low volume&lt;/li&gt;
&lt;li&gt;Residential proxies matter almost as much as the browser fingerprint. If your IP is a datacenter ASN, nothing in the browser layer saves you&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're fighting this problem right now, I'd love to hear what site you're on and what's been rejected — happy to compare notes. The full discussion is at &lt;a href="https://www.reddit.com/r/webscraping/comments/1sb9ibb/im_using_selenium_and_constantly_get_hit_by/" rel="noopener noreferrer"&gt;r/webscraping's original thread&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>python</category>
      <category>cloudflare</category>
      <category>automation</category>
    </item>
    <item>
      <title>🚀 Major BrowserAct CLI Update</title>
      <dc:creator>Double CHEN</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:44:23 +0000</pubDate>
      <link>https://forem.com/double_chen_70da460344c73/major-browseract-cli-update-27oe</link>
      <guid>https://forem.com/double_chen_70da460344c73/major-browseract-cli-update-27oe</guid>
      <description>&lt;p&gt;We just shipped 8 new commands that fundamentally change how AI agents handle complex web workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;✅ Multi-tab navigation&lt;br&gt;
✅ File upload handling&lt;br&gt;
✅ Cookie &amp;amp; session management&lt;br&gt;
✅ Advanced wait strategies&lt;br&gt;
✅ Full-page screenshots&lt;br&gt;
✅ Auto-bypass for DataDome &amp;amp; HUMAN Security&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why This Matters:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're building agents that need to navigate multi-step workflows, handle file uploads, or extract data from heavily protected sites — this update eliminates weeks of custom workarounds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Get Started:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add browser-act/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; browser-act
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;📖 Full docs: &lt;a href="https://github.com/browser-act/skills/tree/main/browser-act" rel="noopener noreferrer"&gt;github.com/browser-act/skills/tree/main/browser-act&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Questions? Join our Discord: &lt;a href="https://discord.gg/UpnCKd7GaU" rel="noopener noreferrer"&gt;discord.gg/UpnCKd7GaU&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>webdev</category>
      <category>opensource</category>
    </item>
    <item>
      <title>We just shipped browser-act CLI — browser automation without writing code</title>
      <dc:creator>Double CHEN</dc:creator>
      <pubDate>Fri, 10 Apr 2026 08:16:15 +0000</pubDate>
      <link>https://forem.com/double_chen_70da460344c73/we-just-shipped-browser-act-cli-browser-automation-without-writing-code-579f</link>
      <guid>https://forem.com/double_chen_70da460344c73/we-just-shipped-browser-act-cli-browser-automation-without-writing-code-579f</guid>
      <description>&lt;p&gt;We built BrowserAct because we kept running into the same wall: every time we needed to automate something in a browser, we had to start a whole project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm init &lt;span class="nt"&gt;-y&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;playwright
npx playwright &lt;span class="nb"&gt;install &lt;/span&gt;chromium
&lt;span class="c"&gt;# ... now write 25 lines of async/await just to load a page&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's fine when you're building a test suite. Most of the time, you just want to grab a page's content, click something, or take a screenshot — from the terminal, in 30 seconds.&lt;/p&gt;

&lt;p&gt;So we built &lt;strong&gt;browser-act CLI&lt;/strong&gt;. Browser automation as terminal commands. No code, no project setup, no framework.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it looks like
&lt;/h2&gt;

&lt;p&gt;This is a real run. Three commands against Hacker News:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 navigate &lt;span class="s2"&gt;"https://news.ycombinator.com"&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nb"&gt;wait &lt;/span&gt;stable
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 get markdown
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full page extracted as clean structured markdown — 3 commands, no code written.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Output: 15,547 characters of clean markdown&lt;/strong&gt; from 78,320 chars of raw HTML. browser-act automatically strips ads, nav bars, and irrelevant noise.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why not just use Playwright?
&lt;/h2&gt;

&lt;p&gt;Playwright's getting started page: npm init, install, then download ~400MB of browser binaries — before a single line of automation.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Playwright / Puppeteer&lt;/th&gt;
&lt;th&gt;browser-act CLI&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;First-time setup&lt;/td&gt;
&lt;td&gt;npm init + install + ~400MB browser download&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;npx skills add browser-act/skills --skill browser-act&lt;/code&gt; — once, global&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Navigate + extract content&lt;/td&gt;
&lt;td&gt;~25 lines of async/await boilerplate&lt;/td&gt;
&lt;td&gt;3 commands&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Session state&lt;/td&gt;
&lt;td&gt;Manual context management in every script&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;--session&lt;/code&gt; persists automatically between commands&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shell integration&lt;/td&gt;
&lt;td&gt;Requires Node.js or Python runtime&lt;/td&gt;
&lt;td&gt;Pipe output directly to &lt;code&gt;grep&lt;/code&gt; / &lt;code&gt;jq&lt;/code&gt; / anything&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Playwright is still the right choice for full E2E test suites with parallel workers, trace viewers, and CI pipelines. browser-act CLI is for everything else.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;Install once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add browser-act/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; browser-act
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Core commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Open a page and extract its content&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 navigate &lt;span class="s2"&gt;"https://example.com"&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nb"&gt;wait &lt;/span&gt;stable
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 get markdown      &lt;span class="c"&gt;# clean text output&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 get html          &lt;span class="c"&gt;# raw HTML&lt;/span&gt;

&lt;span class="c"&gt;# Interact&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 click 3           &lt;span class="c"&gt;# click element by index&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 input 2 &lt;span class="s2"&gt;"query"&lt;/span&gt;   &lt;span class="c"&gt;# fill a field&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 keys &lt;span class="s2"&gt;"Enter"&lt;/span&gt;

&lt;span class="c"&gt;# Capture&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 screenshot ./out.png

&lt;span class="c"&gt;# Stealth mode (bypasses bot detection)&lt;/span&gt;
browser-act &lt;span class="nt"&gt;--session&lt;/span&gt; s1 browser list      &lt;span class="c"&gt;# pick a stealth profile&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sessions persist between commands — build multi-step automations in shell scripts without managing state yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  What people are using it for
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Web scraping&lt;/strong&gt; — no boilerplate, just commands and output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shell pipelines&lt;/strong&gt; — &lt;code&gt;get markdown&lt;/code&gt; | &lt;code&gt;grep&lt;/code&gt; | &lt;code&gt;jq&lt;/code&gt; — works with every Unix tool you already use&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI agents&lt;/strong&gt; — give an LLM direct browser access via CLI commands&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment verification&lt;/strong&gt; — &lt;code&gt;navigate&lt;/code&gt; → &lt;code&gt;get markdown&lt;/code&gt; → assert expected content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;n8n / Make / Zapier integrations&lt;/strong&gt; — use as a step in no-code workflows&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;browser-act CLI is live today&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://www.browseract.com" rel="noopener noreferrer"&gt;browseract.com&lt;/a&gt; · Free to use · No credit card required&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/browser-act/skills" rel="noopener noreferrer"&gt;github.com/browser-act/skills&lt;/a&gt; · AWS Marketplace available&lt;/p&gt;

&lt;p&gt;Questions? Drop them in the comments — we read everything.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>webdev</category>
      <category>cli</category>
      <category>automation</category>
    </item>
    <item>
      <title>Looking for Experienced Make.com &amp; BrowserAct Creators!</title>
      <dc:creator>Double CHEN</dc:creator>
      <pubDate>Thu, 18 Sep 2025 11:48:21 +0000</pubDate>
      <link>https://forem.com/double_chen_70da460344c73/looking-for-experienced-makecom-browser-act-creators-p65</link>
      <guid>https://forem.com/double_chen_70da460344c73/looking-for-experienced-makecom-browser-act-creators-p65</guid>
      <description>&lt;h2&gt;
  
  
  We’re seeking skilled individuals to:
&lt;/h2&gt;

&lt;p&gt;Build public, non-customizable BrowserAct workflows.&lt;br&gt;
Publish them to the Make community via your creator account.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What’s in it for you:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Competitive pay for each workflow.&lt;/li&gt;
&lt;li&gt;Keep all platform revenue and consulting fees.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re interested, message me to discuss the details.&lt;/p&gt;

&lt;p&gt;Let’s create something amazing together!&lt;/p&gt;

</description>
      <category>automation</category>
      <category>webscraping</category>
      <category>node</category>
      <category>api</category>
    </item>
    <item>
      <title>BrowserAct — Node Configuration &amp; Best Practices for Web Scraping Automation</title>
      <dc:creator>Double CHEN</dc:creator>
      <pubDate>Thu, 18 Sep 2025 08:25:45 +0000</pubDate>
      <link>https://forem.com/double_chen_70da460344c73/browseract-node-configuration-best-practices-for-web-scraping-automation-1226</link>
      <guid>https://forem.com/double_chen_70da460344c73/browseract-node-configuration-best-practices-for-web-scraping-automation-1226</guid>
      <description>&lt;h2&gt;
  
  
  🎯 What is &lt;a href="https://www.browseract.com/?co-from=dev" rel="noopener noreferrer"&gt;BrowserAct&lt;/a&gt;?
&lt;/h2&gt;

&lt;p&gt;BrowserAct is an innovative web automation platform that combines AI-powered browser interaction with structured data extraction capabilities. It empowers users to create advanced web scraping workflows without any coding skills.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjgmbiydecvz1oszi3uq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjgmbiydecvz1oszi3uq.png" alt=" " width="800" height="396"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Features &amp;amp; Purpose
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🚫 No Coding Required: Build workflows with a zero-code interface using visual nodes.&lt;/li&gt;
&lt;li&gt;🎯 Precise Data Extraction: Achieve higher accuracy than traditional AI Agents.&lt;/li&gt;
&lt;li&gt;🧠 Smart Page Understanding: Leverages AI for better recognition than standard RPA tools.&lt;/li&gt;
&lt;li&gt;💰 Cost-Effective: Save up to 90% of costs compared to agent-based scraping solutions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mbpa4x809u6u4vuss86.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mbpa4x809u6u4vuss86.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Revolutionary Advantages of &lt;a href="https://www.browseract.com/?co-from=dev" rel="noopener noreferrer"&gt;BrowserAct&lt;/a&gt; Workflow
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Natural Language-Driven Smart Operations&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Simplify Workflow Creation: Simply describe your intentions in natural language — no technical expertise required.&lt;/li&gt;
&lt;li&gt;AI-Driven Translation: Automatically converts descriptions into precise page operations.&lt;/li&gt;
&lt;li&gt;User-Friendly: Business users can easily create, understand, and modify workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Zero Exception Handling Burden&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Built-in Smart Fault Tolerance: Automatically handles common exceptions during scraping tasks.&lt;/li&gt;
&lt;li&gt;Backup Operation Methods: Single nodes support multiple fallback strategies to ensure success.&lt;/li&gt;
&lt;li&gt;Graceful Degradation: Intelligent handling when critical steps fail, minimizing disruptions.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Cost-Effectiveness Breakthrough&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;90% Cost Savings vs Agent Scraping: Significantly reduce expenses without sacrificing precision.&lt;/li&gt;
&lt;li&gt;80% Less Configuration Time: Compared to traditional RPA tools, setup is much faster.&lt;/li&gt;
&lt;li&gt;Low Maintenance: Adaptive algorithms reduce the need for manual updates when pages change.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="4"&gt;
&lt;li&gt;Precision and Intelligence Combined&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Higher Accuracy than Agents: Professional extraction algorithms ensure data precision.&lt;/li&gt;
&lt;li&gt;Smarter than RPA: AI-powered understanding adapts to complex web pages.&lt;/li&gt;
&lt;li&gt;Dynamic Adaptation: Automatically adjusts to changes in page structure, ensuring reliability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F279afxrwn2d6eytf7zra.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F279afxrwn2d6eytf7zra.png" alt=" " width="800" height="372"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Start Guide: Create Your First Workflow in Minutes
&lt;/h2&gt;

&lt;p&gt;Ready to build your first scraping workflow? Follow these six simple steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a Workflow - Start with a blank template.&lt;/li&gt;
&lt;li&gt;Set Parameters - Configure parameter variables, or delete parameter settings for more flexible data searching.&lt;/li&gt;
&lt;li&gt;Add Nodes - Click the plus sign below a node to add new nodes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F996v7pfqr68ld9xyl8zc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F996v7pfqr68ld9xyl8zc.png" alt=" " width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;Use Natural Language - Describe the operation for each node in plain language (e.g., “Click on the login button”).&lt;/li&gt;
&lt;li&gt;Run the Workflow - Click the run button to see scraping results.&lt;/li&gt;
&lt;li&gt;Data Export - Automatically generate structured data files.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Don’t Know How to Set Up a Workflow?
&lt;/h2&gt;

&lt;p&gt;No problem! The &lt;a href="https://www.browseract.com/template/amazon-best-sellers-scraper?co-from=tpamz" rel="noopener noreferrer"&gt;BrowserAct Template&lt;/a&gt; Marketplace has a wide variety of ready-to-use templates. With just one click, you can experience pre-built workflows tailored to common use cases.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqhsrohqdsnxsywbq0nbv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqhsrohqdsnxsywbq0nbv.png" alt=" " width="800" height="497"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Choose BrowserAct?
&lt;/h2&gt;

&lt;p&gt;Experience the Future of Web Scraping Today!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;📈 Boost Efficiency: Projects that traditionally take weeks now complete in hours&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;💰 Reduce Costs: No need for professional development teams - business users can operate directly&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🎯 Reliable and Accurate: AI-powered smart scraping with over 95% accuracy rate&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🚀 Rapid Iteration: Adjust workflows in minutes when requirements change&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;🎉 Start Your Zero-Code Data Scraping Journey Now!&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://www.browseract.com/?co-from=dev" rel="noopener noreferrer"&gt;Register&lt;/a&gt; now and unlock the potential of intelligent data scraping with BrowserAct.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>webscraping</category>
    </item>
    <item>
      <title>BrowserAct Integration with Make</title>
      <dc:creator>Double CHEN</dc:creator>
      <pubDate>Tue, 16 Sep 2025 10:09:21 +0000</pubDate>
      <link>https://forem.com/double_chen_70da460344c73/browseract-integration-with-make-20ke</link>
      <guid>https://forem.com/double_chen_70da460344c73/browseract-integration-with-make-20ke</guid>
      <description>&lt;p&gt;BrowserAct App has officially launched on Make, bringing AI-powered automation to your data collection workflows. This integration allows you to supercharge your data processes with intelligent automation capabilities.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>webscraping</category>
    </item>
  </channel>
</rss>
