<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: 许映洲</title>
    <description>The latest articles on Forem by 许映洲 (@_ab214f84f83a01455a74b).</description>
    <link>https://forem.com/_ab214f84f83a01455a74b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3953937%2Fa508106a-2da4-432e-9a1e-38c1608fc027.png</url>
      <title>Forem: 许映洲</title>
      <link>https://forem.com/_ab214f84f83a01455a74b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/_ab214f84f83a01455a74b"/>
    <language>en</language>
    <item>
      <title>Automate Browser Tasks with xbrowser: A Developer's Guide to Web Automation</title>
      <dc:creator>许映洲</dc:creator>
      <pubDate>Wed, 27 May 2026 08:57:50 +0000</pubDate>
      <link>https://forem.com/_ab214f84f83a01455a74b/automate-browser-tasks-with-xbrowser-a-developers-guide-to-web-automation-4m71</link>
      <guid>https://forem.com/_ab214f84f83a01455a74b/automate-browser-tasks-with-xbrowser-a-developers-guide-to-web-automation-4m71</guid>
      <description>&lt;p&gt;Browser automation has been stuck in a rut for years. The dominant tools — Selenium, Puppeteer, Playwright — are powerful, but they're built for testing, not for real-world task automation. You want to scrape a competitor's pricing page? Write a 40-line script. Need to search Google and Bing simultaneously and compare results? That's another script. Want to chain a login flow with a data extraction step? Now you're managing async state, waiting for selectors, and praying nothing times out.&lt;/p&gt;

&lt;p&gt;I've been writing browser automation code for years, and I kept running into the same friction: too much boilerplate for tasks that should take one command. That frustration led me to &lt;a href="https://xbrowser.dev" rel="noopener noreferrer"&gt;xbrowser&lt;/a&gt;, a CLI tool designed specifically for developers and AI agents who need to get things done in a browser without writing a full test suite every time.&lt;/p&gt;

&lt;h2&gt;The Problem with Current Tools&lt;/h2&gt;

&lt;p&gt;Let's be clear — Playwright and Selenium are excellent at what they do. If you're writing end-to-end tests for a web application, they're the right choice. But when your use case shifts from "test my app" to "interact with the web," the cracks start to show:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Heavy setup&lt;/strong&gt;: You need a Node.js project, dependency installation, browser downloads, and boilerplate before you can even navigate to a page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Script-first&lt;/strong&gt;: Every task requires writing a script. There's no quick "just do this one thing" mode.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No domain helpers&lt;/strong&gt;: Want to search Google? You're navigating to google.com, typing in a selector, waiting for results, and parsing the DOM yourself.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not agent-friendly&lt;/strong&gt;: AI agents need simple, composable commands. A 50-line async script is the opposite of that.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What I wanted was something like &lt;code&gt;curl&lt;/code&gt; but for interactive browser tasks — a single command that handles the complexity and gives me the result.&lt;/p&gt;

&lt;h2&gt;Enter xbrowser&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/dyyz1993/xbrowser" rel="noopener noreferrer"&gt;xbrowser&lt;/a&gt; is an open-source (MIT) browser automation CLI that ships as a single npm package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm i &lt;span class="nt"&gt;-g&lt;/span&gt; @dyyz1993/xbrowser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the entire installation. No separate browser download, no WebDriver setup, no configuration files. It comes with a managed Chromium build that includes CDP fingerprint protection — meaning the sites you visit can't easily detect that you're running an automated browser.&lt;/p&gt;

&lt;p&gt;The tool is designed around composable commands that map to real-world tasks rather than low-level browser APIs. Let me walk through the core features.&lt;/p&gt;

&lt;h2&gt;Multi-Engine Search&lt;/h2&gt;

&lt;p&gt;Searching the web from the command line shouldn't require an API key. xbrowser handles the browser interaction for you:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Search Google&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"headless browser automation tools"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google &lt;span class="nt"&gt;--num&lt;/span&gt; 10

&lt;span class="c"&gt;# Search Bing&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"headless browser automation tools"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; bing &lt;span class="nt"&gt;--num&lt;/span&gt; 10

&lt;span class="c"&gt;# Search Baidu (for Chinese-language results)&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"无头浏览器自动化工具"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; baidu &lt;span class="nt"&gt;--num&lt;/span&gt; 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each command returns structured results with titles, URLs, and snippets. You can pipe them into &lt;code&gt;jq&lt;/code&gt; for filtering, save them to a file, or feed them directly into an AI agent's context.&lt;/p&gt;

&lt;p&gt;This is particularly useful for competitive analysis. Want to see how your brand ranks across search engines?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Find where your domain appears in Google's top 30 results&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"my product name"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google &lt;span class="nt"&gt;--num&lt;/span&gt; 30 | jq &lt;span class="s1"&gt;'.results[] | select(.url | contains("myproduct.com"))'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
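
&lt;p&gt;To put the same query through every engine, a small loop is enough. The following is a minimal sketch rather than a documented xbrowser feature: &lt;code&gt;rank_across_engines&lt;/code&gt; is a hypothetical helper, and it assumes each engine returns the same &lt;code&gt;.results[].url&lt;/code&gt; shape used above, with results listed in ranking order:&lt;/p&gt;

```shell
# rank_across_engines QUERY DOMAIN  (hypothetical helper, not part of xbrowser)
# Prints "engine: position" for the first result whose URL contains DOMAIN,
# or "engine: not found" when none of the fetched results match.
# Assumption: .results[] is ordered by rank and has a .url field on every engine.
rank_across_engines() {
  query=$1
  domain=$2
  for engine in google bing baidu; do
    # grep -n prefixes each matching line with its line number, which is the rank.
    # grep exits non-zero when nothing matches; that is an expected outcome here.
    position=$(xbrowser search "$query" --engine "$engine" --num 30 |
      jq -r '.results[].url' |
      grep -n -F "$domain" |
      head -n 1 |
      cut -d: -f1) || true
    echo "$engine: ${position:-not found}"
  done
}
```

&lt;p&gt;Calling &lt;code&gt;rank_across_engines "my product name" "myproduct.com"&lt;/code&gt; prints one line per engine, so differences in position are easy to spot at a glance.&lt;/p&gt;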



&lt;p&gt;No API keys and no OAuth flows. Just search and get results, keeping in mind that search engines may still throttle or challenge heavy automated traffic.&lt;/p&gt;

&lt;h2&gt;Web Scraping Without the Script&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;scrape&lt;/code&gt; command extracts clean, structured content from any URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Get page content as markdown&lt;/span&gt;
xbrowser scrape https://example.com/blog/my-article

&lt;span class="c"&gt;# Crawl an entire site&lt;/span&gt;
xbrowser crawl https://example.com &lt;span class="nt"&gt;--depth&lt;/span&gt; 3 &lt;span class="nt"&gt;--max-pages&lt;/span&gt; 100

&lt;span class="c"&gt;# Generate a URL sitemap&lt;/span&gt;
xbrowser map https://example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;scrape&lt;/code&gt; output is markdown by default, which means it's immediately usable — paste it into a document, feed it to an LLM, or parse it with standard text tools.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;crawl&lt;/code&gt; follows internal links and respects the depth and page limits you set, giving you a content snapshot of the site within those bounds. &lt;code&gt;map&lt;/code&gt; produces a flat list of every reachable URL, which is useful for SEO audits.&lt;/p&gt;

&lt;p&gt;Here's a practical example — auditing your own site's internal link structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Map all URLs on your site&lt;/span&gt;
xbrowser map https://mysite.com &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; sitemap.txt

&lt;span class="c"&gt;# Count outbound links on each page (scrape emits markdown, so count "](" link markers)&lt;/span&gt;
&lt;span class="nb"&gt;cat &lt;/span&gt;sitemap.txt | &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; url&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nv"&gt;count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;xbrowser scrape &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$url&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-oF&lt;/span&gt; &lt;span class="s1"&gt;']('&lt;/span&gt; | &lt;span class="nb"&gt;wc&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$url&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="nv"&gt;$count&lt;/span&gt;&lt;span class="s2"&gt; links"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
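
&lt;p&gt;Spotting orphaned pages takes inbound links rather than outbound counts: a page is an orphan candidate when it exists in your full URL inventory but never shows up as a link target. A minimal sketch, assuming two hypothetical one-URL-per-line files that no command shown here produces for you: &lt;code&gt;all-urls.txt&lt;/code&gt; (say, exported from your CMS) and &lt;code&gt;linked-urls.txt&lt;/code&gt; (link targets collected during a crawl):&lt;/p&gt;

```shell
# find_orphans ALL_URLS LINKED_URLS  (hypothetical helper, not part of xbrowser)
# Prints every URL listed in ALL_URLS that never appears in LINKED_URLS.
# Both files are assumed to contain one normalized URL per line.
find_orphans() {
  # -v inverts the match, -x requires whole-line matches,
  # -F treats patterns as fixed strings, -f reads the patterns from a file.
  grep -vxFf "$2" "$1"
}

# Example: find_orphans all-urls.txt linked-urls.txt
```

&lt;p&gt;Exact string matching means trailing slashes and URL fragments need to be normalized in both files first, otherwise linked pages can be reported as orphans.&lt;/p&gt;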



&lt;h2&gt;Chain Commands: The Power Move&lt;/h2&gt;

&lt;p&gt;This is the feature that sets xbrowser apart. Instead of writing multi-step scripts, you chain operations in a single command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Navigate, interact, and extract&lt;/span&gt;
xbrowser chain &lt;span class="s2"&gt;"goto https://news.ycombinator.com &amp;amp;&amp;amp; click '.titleline &amp;gt; a' &amp;amp;&amp;amp; scrape"&lt;/span&gt;

&lt;span class="c"&gt;# Complete login flow with data extraction&lt;/span&gt;
xbrowser chain &lt;span class="s2"&gt;"goto https://app.example.com/login &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  &amp;amp;&amp;amp; fill '#email' 'user@example.com' &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  &amp;amp;&amp;amp; fill '#password' 'my-password' &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  &amp;amp;&amp;amp; click '#login-button' &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  &amp;amp;&amp;amp; wait '#dashboard' &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  &amp;amp;&amp;amp; scrape '#dashboard'"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The chain syntax reads like natural language: go to this page, click this element, fill in that field, scrape the result. It mirrors how you'd describe the task to another person.&lt;/p&gt;

&lt;p&gt;For AI agent workflows, this is a game-changer. An agent can construct chain commands dynamically based on user intent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "Go to Hacker News, click the top story, and summarize it for me"

Agent constructs:
xbrowser chain "goto https://news.ycombinator.com &amp;amp;&amp;amp; click '.titleline &amp;gt; a:first-of-type' &amp;amp;&amp;amp; scrape"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No script generation and no debugging async code. The agent still has to pick selectors, but it only builds a chain string and executes it.&lt;/p&gt;

&lt;h2&gt;SEO and Backlink Analysis&lt;/h2&gt;

&lt;p&gt;xbrowser ships with 67+ plugins, and the SEO suite is particularly comprehensive:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Analyze backlinks for a domain&lt;/span&gt;
xbrowser seo backlinks &lt;span class="nt"&gt;--domain&lt;/span&gt; example.com

&lt;span class="c"&gt;# Check on-page SEO factors&lt;/span&gt;
xbrowser seo audit https://example.com/page

&lt;span class="c"&gt;# Analyze search engine results for a keyword&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"target keyword"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google &lt;span class="nt"&gt;--num&lt;/span&gt; 30 &lt;span class="nt"&gt;--analyze&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The backlink plugin crawls referring domains, checks link status, and reports on link quality metrics. The audit plugin checks meta tags, heading structure, image alt text, and other on-page factors.&lt;/p&gt;

&lt;p&gt;For link-building workflows, you can combine search and scraping:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Find guest post opportunities&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"write for us + web development"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google &lt;span class="nt"&gt;--num&lt;/span&gt; 20 | &lt;span class="se"&gt;\&lt;/span&gt;
  jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.results[].url'&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;read &lt;/span&gt;url&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    &lt;/span&gt;xbrowser scrape &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$url&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s2"&gt;"guidelines&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;submit&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;contribute"&lt;/span&gt;
  &lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Record and Replay&lt;/h2&gt;

&lt;p&gt;Sometimes you need to automate a complex workflow that's hard to express as a chain. That's where recording comes in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start recording (opens a visible browser window)&lt;/span&gt;
xbrowser record my-workflow

&lt;span class="c"&gt;# Do your thing — click around, fill forms, navigate&lt;/span&gt;

&lt;span class="c"&gt;# Stop recording when done&lt;/span&gt;
&lt;span class="c"&gt;# The workflow is saved as a replayable script&lt;/span&gt;

&lt;span class="c"&gt;# Replay it headlessly&lt;/span&gt;
xbrowser replay my-workflow &lt;span class="nt"&gt;--headless&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Record your workflow once in a visible browser, then replay it on a schedule or in CI. This is perfect for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Daily report generation that requires login&lt;/li&gt;
&lt;li&gt;Monitoring competitor pricing pages&lt;/li&gt;
&lt;li&gt;Regression testing without writing test code&lt;/li&gt;
&lt;/ul&gt;
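
&lt;p&gt;Scheduled replays can fail for transient reasons such as a slow page load, so it can help to wrap the replay in a retry. A minimal sketch, where &lt;code&gt;retry&lt;/code&gt; is a hypothetical helper written for this example and not part of xbrowser:&lt;/p&gt;

```shell
# retry ATTEMPTS COMMAND [ARGS...]  (hypothetical helper, not part of xbrowser)
# Runs COMMAND up to ATTEMPTS times and stops at the first success.
# Returns 0 on success, 1 if every attempt failed.
retry() {
  attempts=$1
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# Example, using only the replay invocation shown above:
# retry 3 xbrowser replay my-workflow --headless
```

&lt;p&gt;The non-zero return after the last failed attempt lets a cron wrapper or CI job mark the run as failed.&lt;/p&gt;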

&lt;h2&gt;How It Compares&lt;/h2&gt;

&lt;p&gt;Let me be straightforward about when to use what:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;xbrowser&lt;/th&gt;
&lt;th&gt;Playwright&lt;/th&gt;
&lt;th&gt;Selenium&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Installation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;npm i -g&lt;/code&gt; (one step)&lt;/td&gt;
&lt;td&gt;npm install + browser download&lt;/td&gt;
&lt;td&gt;Package install + browser driver (handled by Selenium Manager since 4.6)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CLI-first&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No (library-first)&lt;/td&gt;
&lt;td&gt;No (library-first)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Search helpers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Google/Bing/Baidu built-in&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Plugins&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;67+ built-in, including an SEO suite&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chain syntax&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;goto &amp;amp;&amp;amp; click &amp;amp;&amp;amp; scrape&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Requires script&lt;/td&gt;
&lt;td&gt;Requires script&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Record/Replay&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;td&gt;Codegen (code output)&lt;/td&gt;
&lt;td&gt;Selenium IDE (browser extension)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Anti-detection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CDP fingerprint protection&lt;/td&gt;
&lt;td&gt;Community stealth plugins&lt;/td&gt;
&lt;td&gt;External tools&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Test framework&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not designed for this&lt;/td&gt;
&lt;td&gt;Primary use case&lt;/td&gt;
&lt;td&gt;Primary use case&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key distinction: &lt;a href="https://xbrowser.dev" rel="noopener noreferrer"&gt;xbrowser&lt;/a&gt; is for &lt;strong&gt;doing things on the web&lt;/strong&gt;. Playwright and Selenium are for &lt;strong&gt;testing things on the web&lt;/strong&gt;. Different goals, different tools.&lt;/p&gt;

&lt;p&gt;If you're building an AI agent that needs to browse the web, scrape data, perform SEO analysis, or automate repetitive browser tasks, xbrowser gives you composable commands that map directly to those tasks. If you're writing integration tests for your React app, stick with Playwright.&lt;/p&gt;

&lt;h2&gt;Getting Started&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm i &lt;span class="nt"&gt;-g&lt;/span&gt; @dyyz1993/xbrowser
xbrowser &lt;span class="nt"&gt;--help&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"hello world"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three commands and you're up and running. The full documentation, plugin directory, and API reference are available at &lt;a href="https://xbrowser.dev" rel="noopener noreferrer"&gt;xbrowser.dev&lt;/a&gt;. The source code is on &lt;a href="https://github.com/dyyz1993/xbrowser" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; under the MIT license.&lt;/p&gt;

&lt;p&gt;If you're building AI agents that interact with the web, or if you're tired of writing 50-line scripts for tasks that should take one command, give it a try. Contributions and plugin submissions are welcome.&lt;/p&gt;





</description>
      <category>browser</category>
      <category>cli</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
