<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: BotConductStandard </title>
    <description>The latest articles on Forem by BotConductStandard  (@botconductstandard).</description>
    <link>https://forem.com/botconductstandard</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3879750%2F8ae493fd-627f-4382-9be6-7ce6d3fbbab4.jpeg</url>
      <title>Forem: BotConductStandard </title>
      <link>https://forem.com/botconductstandard</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/botconductstandard"/>
    <language>en</language>
    <item>
      <title>Static compliance checklists can't measure AI agent behavior. Here's what does.</title>
      <dc:creator>BotConductStandard </dc:creator>
      <pubDate>Mon, 20 Apr 2026 16:33:53 +0000</pubDate>
      <link>https://forem.com/botconductstandard/static-compliance-checklists-cant-measure-ai-agent-behavior-heres-what-does-1g4o</link>
      <guid>https://forem.com/botconductstandard/static-compliance-checklists-cant-measure-ai-agent-behavior-heres-what-does-1g4o</guid>
      <description>&lt;p&gt;Agent-evaluation products in 2026 fall into two generations. First-generation: static pass/fail checklists. Second-generation: evaluation under changing conditions, where behavior trajectory is measured rather than endpoint state. The first generation can't answer the questions CTOs and CISOs actually ask. The second generation can — and it works the same way across every platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem with ten checks
&lt;/h2&gt;

&lt;p&gt;Most agent-readiness products shipping today work the same way. Define N rules. Test whether the bot passes each. Aggregate into a score. Ship a certificate.&lt;/p&gt;
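&lt;p&gt;The loop is simple enough to sketch. This toy version uses invented rule names and uniform weighting -- it illustrates the first-generation model, not any vendor's actual checks:&lt;/p&gt;

```python
# Toy sketch of the first-generation model: N fixed rules, pass/fail
# each, aggregate into a score. Rule names are illustrative only.
CHECKS = {
    "identifies_as_bot": lambda bot: bot.get("user_agent", "").lower().find("bot") != -1,
    "reads_robots_txt": lambda bot: bot.get("fetched_robots", False),
    "publishes_declaration": lambda bot: bool(bot.get("declaration_url")),
}

def checklist_score(bot):
    """Run every static check once and aggregate into a 0-100 score."""
    results = {name: check(bot) for name, check in CHECKS.items()}
    score = round(100 * sum(results.values()) / len(results))
    return results, score

results, score = checklist_score(
    {"user_agent": "ExampleBot/1.0", "fetched_robots": True}
)
```

&lt;p&gt;Everything here is evaluated at a single point in time, which is exactly the limitation discussed below.&lt;/p&gt;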

&lt;p&gt;The appeal is obvious. It's auditable. It maps to how SOC 2 reports look. A CISO understands it without training.&lt;/p&gt;

&lt;p&gt;The problem is also obvious once you think about production incidents. The evaluation measures &lt;strong&gt;observable state at a single point in time&lt;/strong&gt;. It tells you nothing about how the agent behaves when conditions around it change — when signals evolve, when server state shifts, when adversarial inputs arrive. These are the situations that cause real production incidents, and they are precisely what static evaluation cannot measure.&lt;/p&gt;

&lt;h2&gt;
  
  
  The community already said this
&lt;/h2&gt;

&lt;p&gt;On recent threads about agent-readiness tooling, the paraphrased reaction from sophisticated technical commenters has been: &lt;em&gt;"Reducing agent readiness to 10 static checks is like reducing SEO to 10 static checks. It misses the point."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That critique is correct. The market is already splitting into two camps, and first-generation tools are being read as legacy.&lt;/p&gt;

&lt;h2&gt;
  
  
  What second-generation looks like
&lt;/h2&gt;

&lt;p&gt;Instead of testing compliance with fixed rules, second-generation evaluation measures &lt;strong&gt;behavior trajectory&lt;/strong&gt; under evolving conditions. The agent is placed in environments where directives can change during the session, where signals can contradict, where adversarial inputs test discipline.&lt;/p&gt;

&lt;p&gt;What gets measured is not a state at a single point in time, but the decision trajectory across the scenario — what the agent chose when forced to interpret ambiguous inputs, how it recovered from errors, whether it held scope under pressure.&lt;/p&gt;

&lt;p&gt;The specific scenarios, thresholds, and evaluation criteria are not disclosed publicly. This is deliberate: revealing the mechanism would let operators tune agents to pass without demonstrating genuine compliance. The methodology is a closed oracle — reproducible internally, verifiable externally through cryptographically signed observation records, but not publicly described.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the report looks like
&lt;/h2&gt;

&lt;p&gt;First-generation reports produce checkmarks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[✓] Identifies as bot
[✓] Respects standard directives
[✗] Publishes declaration URL
Score: 87/100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Second-generation reports produce trajectories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;T+0s   | Session initialized, agent fetched initial directives
...    | Scenario-specific events recorded with timestamps
T+N    | Agent made decision in response to changing conditions
...    | Multiple decision points across the session

Verdict: [PASS|FAIL] per scenario
Reason: Specific agent behaviors in context,
        with cryptographically signed observation IDs
        for each event.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first shows the state. The second shows the decision. In a production incident, only the decision matters.&lt;/p&gt;
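&lt;p&gt;As a rough sketch of the difference: a trajectory is an ordered list of timestamped events judged as a whole. The event shapes and the one-rule verdict below are invented for illustration; the real scenarios and thresholds are deliberately not public:&lt;/p&gt;

```python
# Minimal sketch of the trajectory idea: the unit of evaluation is an
# ordered sequence of timestamped decisions, not a final state.
from dataclasses import dataclass

@dataclass
class Event:
    t: float     # seconds since session start
    kind: str    # "directive_change" or "agent_decision" (illustrative)
    detail: str

def scenario_verdict(events):
    """PASS if every directive change is answered by an agent decision."""
    pending = 0
    for e in sorted(events, key=lambda e: e.t):
        if e.kind == "directive_change":
            pending += 1
        elif e.kind == "agent_decision" and pending:
            pending -= 1
    return "PASS" if pending == 0 else "FAIL"

trace = [
    Event(0.0, "directive_change", "robots.txt now disallows /private"),
    Event(2.5, "agent_decision", "stopped crawling /private"),
]
```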

&lt;h2&gt;
  
  
  Cross-platform by design
&lt;/h2&gt;

&lt;p&gt;The certification is infrastructure-neutral. An agent certified by the methodology is recognized the same way by a site behind Cloudflare, one running DataDome, one with in-house infrastructure, and one with nothing at all. It doesn't compete with bot-management vendors — it's the independent layer they can cite. Like a passport for AI agents: issued once, honored everywhere.&lt;/p&gt;

&lt;p&gt;The same principle applies to the regulatory plane. One certification bundles compliance evidence against multiple frameworks simultaneously — EU AI Act, GDPR, California SB 1001, RFC 9309, W3C TDMRep, EU DSM Directive. Instead of demonstrating compliance six separate times against six separate auditors, the operator is evaluated once and the result can be cited in any jurisdiction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this distinction is urgent now
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Regulatory pressure is specific about conduct.&lt;/strong&gt; EU AI Act Article 50 requires disclosure &lt;em&gt;during interaction&lt;/em&gt;, not at deployment. GDPR rights apply per-request. California SB 1001 demands honest identification in the context of a conversation. These are dynamic obligations, not static attestations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enterprise buyers ask operational questions.&lt;/strong&gt; A CTO doesn't ask "does it pass a 10-check list." They ask how the agent behaves when conditions in the real deployment environment change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Incidents are documented.&lt;/strong&gt; Recent disclosures in the infrastructure-vendor space have confirmed AI-accelerated attacks exploiting agent platforms. The evaluation framework appropriate to this threat model is not a checklist.&lt;/p&gt;

&lt;h2&gt;
  
  
  What BotConduct is building
&lt;/h2&gt;

&lt;p&gt;BotConduct Training Center is designed as second-generation from day one. Level 1 is static hygiene (basic sanity is the floor). Level 2 measures behavior under evolving conditions. Level 3 measures conduct integrity under adversarial probing. Each evaluation produces a cryptographically signed trajectory, not a checklist.&lt;/p&gt;

&lt;p&gt;Each observation is signed with Ed25519 and recorded in an append-only chain. Public key at botconduct.org/.well-known/bcs-public-key.pem. Anyone can verify any observation via botconduct.org/api/verify-observation/{id} without trusting us.&lt;/p&gt;
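&lt;p&gt;Client-side verification can be sketched in a few lines with the &lt;code&gt;cryptography&lt;/code&gt; package. The assumption here is that a record is a payload plus a detached Ed25519 signature over those exact bytes; the real record format may differ:&lt;/p&gt;

```python
# Sketch of verifying a signed observation against the published PEM
# key. The (payload, signature) record layout is an assumption.
from cryptography.hazmat.primitives.serialization import load_pem_public_key
from cryptography.exceptions import InvalidSignature

def verify_observation(public_key_pem: bytes, payload: bytes, signature: bytes) -> bool:
    # For this scheme the PEM decodes to an Ed25519 public key.
    key = load_pem_public_key(public_key_pem)
    try:
        key.verify(signature, payload)
        return True
    except InvalidSignature:
        return False
```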

&lt;p&gt;If Moody's rates bonds and FICO rates people, BotConduct rates how an AI agent behaves when nobody is watching — and the certificate works across every platform.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Landing + pricing: &lt;a href="https://botconduct.org/training-center" rel="noopener noreferrer"&gt;botconduct.org/training-center&lt;/a&gt;&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Regulatory foundation: RFC 9309, EU AI Act Art. 50, EU DSM Directive Art. 4, California SB 1001, W3C TDMRep, GDPR.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Discussion welcomed.&lt;/strong&gt; What scenarios would you want to see in a second-generation evaluation of your own agents? What does your team currently use to measure agent behavior under change?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>security</category>
      <category>webdev</category>
    </item>
    <item>
      <title>194 IP Addresses. One Fake iPhone. Six Days Undetected.</title>
      <dc:creator>BotConductStandard </dc:creator>
      <pubDate>Sat, 18 Apr 2026 14:28:43 +0000</pubDate>
      <link>https://forem.com/botconductstandard/194-ip-addresses-one-fake-iphone-six-days-undetectedpublished-true-1ofe</link>
      <guid>https://forem.com/botconductstandard/194-ip-addresses-one-fake-iphone-six-days-undetectedpublished-true-1ofe</guid>
      <description>&lt;p&gt;A scraper ran on our network for 6 days using 194 different Tencent Cloud IPs. Every request carried a fake iPhone User-Agent (iOS 13.2.3 from 2019). It never read robots.txt. It never identified itself. It averaged 1.8 requests per IP -- staying below every rate limiter, every WAF rule, every IP-based detection system.&lt;/p&gt;

&lt;p&gt;In your analytics, this looks like 194 different people casually browsing on iPhones. No alert. No anomaly. Nothing to investigate.&lt;/p&gt;

&lt;p&gt;The numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;194 unique IPs (all ASN 132203, Tencent Cloud)&lt;/li&gt;
&lt;li&gt;362 requests over 6 days&lt;/li&gt;
&lt;li&gt;Fake iPhone UA (iOS 13.2.3 -- released November 2019)&lt;/li&gt;
&lt;li&gt;1.8 hits per IP average (evades all IP-based detection)&lt;/li&gt;
&lt;li&gt;Never read robots.txt&lt;/li&gt;
&lt;li&gt;Hit paths across the entire site, including /es/, /de/, /fr/, /no/, /zh/&lt;/li&gt;
&lt;li&gt;All datacenter IPs -- no real iPhone connects from a datacenter&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What this means:&lt;br&gt;
If you run e-commerce, it has your prices. If you run media, it has your content. If you run SaaS, it mapped your app. And you never saw it because every request looked like a real user.&lt;/p&gt;

&lt;p&gt;We caught it by measuring behavioral conduct -- not counting IPs.&lt;/p&gt;
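&lt;p&gt;A minimal sketch of that kind of conduct-based aggregation: group requests by (ASN, User-Agent) and flag cohorts whose claimed device contradicts their network origin. The threshold and structure here are illustrative, not our production logic:&lt;/p&gt;

```python
# Group requests by (ASN, User-Agent) rather than per-IP, then flag
# cohorts where a mobile UA claim meets a datacenter network origin.
from collections import defaultdict

DATACENTER_ASNS = {132203}  # Tencent Cloud, per the incident above

def flag_cohorts(requests):
    cohorts = defaultdict(set)  # (asn, ua) -> set of source IPs
    for r in requests:
        cohorts[(r["asn"], r["ua"])].add(r["ip"])
    flagged = []
    for (asn, ua), ips in cohorts.items():
        claims_mobile = "iPhone" in ua
        if claims_mobile and asn in DATACENTER_ASNS and len(ips) > 1:
            flagged.append((asn, ua, len(ips)))
    return flagged
```

&lt;p&gt;Per-IP counters never fire at 1.8 requests per address; the cohort view makes 194 "different iPhones" on one cloud ASN look like exactly one operator.&lt;/p&gt;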

&lt;p&gt;Full forensic breakdown: &lt;a href="https://botconduct.org/report/april-2026/part-2/" rel="noopener noreferrer"&gt;https://botconduct.org/report/april-2026/part-2/&lt;/a&gt;&lt;br&gt;
Part 2 of the State of Bot Conduct series. Part 1: &lt;a href="https://botconduct.org/report/april-2026/part-1/" rel="noopener noreferrer"&gt;https://botconduct.org/report/april-2026/part-1/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;BotConduct.org -- Behavioral scoring for bots and AI agents.&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>agents</category>
      <category>webdev</category>
    </item>
    <item>
      <title>GPTBot follows content invisible to humans. TwitterBot and ClaudeBot don't.</title>
      <dc:creator>BotConductStandard </dc:creator>
      <pubDate>Fri, 17 Apr 2026 19:27:17 +0000</pubDate>
      <link>https://forem.com/botconductstandard/we-scored-172-bots-on-behavioral-conduct-openai-came-in-last-4bpd</link>
      <guid>https://forem.com/botconductstandard/we-scored-172-bots-on-behavioral-conduct-openai-came-in-last-4bpd</guid>
      <description>&lt;p&gt;We run a behavioral observation network that scores how bots and AI agents conduct themselves when they visit websites. We scored 172+ operators. The results were eye-opening.&lt;/p&gt;

&lt;h2&gt;
  
  
  GPTBot: 8 content requests in 14 seconds
&lt;/h2&gt;

&lt;p&gt;On April 17, 2026, OpenAI's GPTBot visited our network from IP 74.7.241.33 -- verified against OpenAI's own published ranges at openai.com/gptbot.json.&lt;/p&gt;

&lt;p&gt;In a single session of 51 seconds, it made 39 requests. &lt;strong&gt;8 of those went to content not visible to human visitors.&lt;/strong&gt; All 8 in a 14-second burst.&lt;/p&gt;

&lt;p&gt;GPTBot does not render CSS. It parses raw HTML and follows every anchor tag it finds -- visible or not. It cannot tell the difference between content meant for users and content that is hidden from the rendered page.&lt;/p&gt;

&lt;p&gt;A multi-billion-dollar company's flagship crawler, navigating the web blind.&lt;/p&gt;

&lt;h2&gt;
  
  
  TwitterBot and ClaudeBot: zero
&lt;/h2&gt;

&lt;p&gt;X Corp's TwitterBot and Anthropic's ClaudeBot visited the same pages. Same HTML. Same content -- visible and hidden.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Neither followed any hidden content.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Three crawlers. Three of the biggest tech companies in the world. Same test. Two understood what humans can see. One didn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  The full leaderboard
&lt;/h2&gt;

&lt;p&gt;This is not a cherry-picked comparison. We scored 172+ bot operators on behavioral conduct; the full ranking of named operators is in the linked report.&lt;/p&gt;

&lt;p&gt;The pattern: the biggest name does not mean the best behavior. Some of the most well-funded AI companies run crawlers less sophisticated than open-source projects with zero budget.&lt;/p&gt;

&lt;h2&gt;
  
  
  What happens when a crawler can't see
&lt;/h2&gt;

&lt;p&gt;Hidden content exists everywhere on the web: honeypots, bot detection systems, anti-scraping layers, admin panels, internal tooling. A crawler that follows everything blindly will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trigger every honeypot it encounters&lt;/li&gt;
&lt;li&gt;Get flagged by every bot detection system&lt;/li&gt;
&lt;li&gt;Scrape content it was never meant to access&lt;/li&gt;
&lt;li&gt;Get blocked, rate-limited, and blacklisted&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not about ethics. This is about engineering. Rendering CSS is a solved problem. Google's crawler does it. Anthropic's does it. X's does it. OpenAI's does not.&lt;/p&gt;

&lt;h2&gt;
  
  
  We contacted OpenAI
&lt;/h2&gt;

&lt;p&gt;We emailed &lt;a href="mailto:opt-out@openai.com"&gt;opt-out@openai.com&lt;/a&gt; on April 17, 2026 with 48 hours notice before publication. No response as of this writing. If they respond, we will update this post.&lt;/p&gt;

&lt;h2&gt;
  
  
  This is Part 1 of 5
&lt;/h2&gt;

&lt;p&gt;We are publishing one finding per day:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Part 1 (today):&lt;/strong&gt; GPTBot and hidden content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 2:&lt;/strong&gt; 194 rotating IPs with a fake iPhone User-Agent. Six days. One cloud provider.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 3:&lt;/strong&gt; The crawler that ignored its own standard&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 4:&lt;/strong&gt; What bot traffic actually costs you&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 5:&lt;/strong&gt; A free tool to see what is hitting YOUR site right now&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full report with research disclaimer: &lt;a href="https://botconduct.org/report/april-2026/part-1" rel="noopener noreferrer"&gt;botconduct.org/report/april-2026/part-1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Want to see what bots do on your site?&lt;/strong&gt; Free sensor, 30 seconds, one line of code: &lt;a href="https://botconduct.org/sensor.html" rel="noopener noreferrer"&gt;botconduct.org/sensor.html&lt;/a&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>webdev</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I watched 145 bots visit my site for two weeks. Here is what I learned.</title>
      <dc:creator>BotConductStandard </dc:creator>
      <pubDate>Thu, 16 Apr 2026 04:13:40 +0000</pubDate>
      <link>https://forem.com/botconductstandard/i-watched-145-bots-visit-my-site-for-two-weeks-here-is-what-i-learned-1e3</link>
      <guid>https://forem.com/botconductstandard/i-watched-145-bots-visit-my-site-for-two-weeks-here-is-what-i-learned-1e3</guid>
      <description>&lt;p&gt;Two weeks ago I put a fresh site online and started logging every request. I wanted to answer a simple question: how much of my traffic is actually human?&lt;/p&gt;

&lt;p&gt;Turns out, barely any.&lt;/p&gt;

&lt;h2&gt;
  
  
  The raw numbers
&lt;/h2&gt;

&lt;p&gt;Across those two weeks I observed &lt;strong&gt;145 distinct bots&lt;/strong&gt; hitting the site. Some declared themselves honestly. Some pretended to be iPhones from 2019. Some came in through Cloudflare. Some came in through rotating AWS IPs and never stopped.&lt;/p&gt;

&lt;p&gt;I was interested in more than just counting them. I wanted to know &lt;strong&gt;how each one behaved&lt;/strong&gt; — not the identity, the conduct. Did it read &lt;code&gt;robots.txt&lt;/code&gt;? Did it respect rate limits? Did it avoid obviously private paths? Did it keep a stable user-agent across requests?&lt;/p&gt;

&lt;p&gt;I ended up with a scoring system. Each bot got a number between 0 and 100 based on observable behavior.&lt;/p&gt;
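&lt;p&gt;A toy version of such a scorer, with made-up weights over the conduct signals above (my actual rubric is proprietary):&lt;/p&gt;

```python
# Toy conduct scorer: weight observable behavior signals, not identity.
# The weights are invented for illustration.
WEIGHTS = {
    "read_robots_txt": 30,
    "respected_rate_limits": 30,
    "avoided_private_paths": 25,
    "stable_user_agent": 15,
}

def conduct_score(observed):
    """Sum the weights of every signal the bot actually exhibited (0-100)."""
    return sum(w for signal, w in WEIGHTS.items() if observed.get(signal))

score = conduct_score({
    "read_robots_txt": True,
    "respected_rate_limits": True,
    "stable_user_agent": True,
})
```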

&lt;p&gt;The distribution was surprising.&lt;/p&gt;

&lt;h2&gt;
  
  
  The well-behaved majority
&lt;/h2&gt;

&lt;p&gt;The bots at the top of the ranking are exactly the ones you would expect. Major search engines. AI crawlers from the big labs. A few SEO tools. Social preview bots.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbotconduct.org%2Fassets%2Fleaderboard.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbotconduct.org%2Fassets%2Fleaderboard.png" alt="Top rated bots in the registry" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GPTBot (OpenAI), ClaudeBot (Anthropic), Bingbot (Microsoft), Bytespider (ByteDance), Baiduspider, YandexBot, Meta's scraper, redditbot — all landing at 100 out of 100.&lt;/p&gt;

&lt;p&gt;It makes sense once you think about it. These companies operate massive crawling infrastructure. They know every site they hit is watching. They have compliance teams. Their crawlers are boring in the best way — they announce themselves, stay within limits, and leave.&lt;/p&gt;

&lt;h2&gt;
  
  
  The hostile minority
&lt;/h2&gt;

&lt;p&gt;The bottom of the ranking was where it got interesting.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbotconduct.org%2Fassets%2Fhostile.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbotconduct.org%2Fassets%2Fhostile.png" alt="Hostile bots in the registry" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;About &lt;strong&gt;27% of bots scored below 50&lt;/strong&gt;. A few of them were recognizable — L9Explore, the crawler operated by LeakIX, probing sensitive paths aggressively. Keydrop Scanner doing credential probing. A stream of anonymous WordPress scanners hammering &lt;code&gt;/wp-admin&lt;/code&gt; on every domain they find.&lt;/p&gt;

&lt;p&gt;The worst offender was a single IP on AWS that sent &lt;strong&gt;2,562 requests in one day&lt;/strong&gt;. No user-agent. No interest in &lt;code&gt;robots.txt&lt;/code&gt;. Just walking through every endpoint it could find.&lt;/p&gt;

&lt;p&gt;Another favorite: a bot presenting itself as &lt;code&gt;iPhone; iPhone OS 13_2_3&lt;/code&gt; — an iOS version from late 2019. Nobody real is running that in 2026. The user-agent is a lie and the behavior matches. Distributed across dozens of residential IPs.&lt;/p&gt;

&lt;h2&gt;
  
  
  The middle is the interesting part
&lt;/h2&gt;

&lt;p&gt;The polar ends of the distribution are easy. Known good bots are good. Obvious scanners are obviously malicious.&lt;/p&gt;

&lt;p&gt;The middle third is where real decisions live. Crawlers from cloud providers like Tencent sat around 36. Not malicious per se, but also not identifying themselves well and using rotating IPs. If I were running a site that mattered, would I let those through? Block? Rate-limit?&lt;/p&gt;

&lt;p&gt;This is the category where &lt;code&gt;block everything automated&lt;/code&gt; destroys legitimate use cases (partners, vendors, research tools) and &lt;code&gt;allow everything&lt;/code&gt; destroys your servers. It's where the real work is.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;I stopped logging and started building. The passive observations became an API — you send it a suspicious request, it sends you back a score and a recommended action.&lt;/p&gt;

&lt;p&gt;The action is one of four: &lt;code&gt;allow&lt;/code&gt;, &lt;code&gt;throttle&lt;/code&gt;, &lt;code&gt;challenge&lt;/code&gt;, &lt;code&gt;block&lt;/code&gt;. Something my middleware can handle in three lines.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;verdict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bcs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;user_agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User-Agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;verdict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;block&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;403&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rubric that produces the score is proprietary, but the verdicts are public. Every bot I scored shows up in a public registry with its current rating. Operators can claim their entries and upgrade to a cryptographically signed identity if they want higher trust.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it changed for me
&lt;/h2&gt;

&lt;p&gt;Before this experiment, I treated automated traffic as a nuisance. Something to filter, block, ignore.&lt;/p&gt;

&lt;p&gt;After two weeks of looking closely, I think about it differently. The web is becoming a conversation between automated agents — and most of them are trying to do their jobs well. The bad ones are loud, and they get all the attention, but they are the minority.&lt;/p&gt;

&lt;p&gt;Giving the well-behaved agents a way to prove it — and the sites a way to verify it — seems like a better answer than the status quo of blocking everything automated.&lt;/p&gt;

&lt;h2&gt;
  
  
  If you want to try it
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;If you run a bot or agent&lt;/strong&gt;: there is a public certification flow. It takes 30 seconds for basic certification, a few minutes for something more serious.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If you run a site&lt;/strong&gt;: the API has a free tier (5,000 scores per month) if you want to experiment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything is at &lt;a href="https://botconduct.org" rel="noopener noreferrer"&gt;botconduct.org&lt;/a&gt;. The first production site running this end-to-end is &lt;a href="https://importsignals.com" rel="noopener noreferrer"&gt;importsignals.com&lt;/a&gt; — their &lt;a href="https://importsignals.com/security" rel="noopener noreferrer"&gt;bot policy page&lt;/a&gt; is a reasonable reference if you want to see what it looks like in the wild.&lt;/p&gt;

&lt;p&gt;Would love to hear from other people who have measured their bot traffic seriously. I suspect the 27% hostile number is conservative.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Follow-up thread and registry updates at &lt;a href="https://twitter.com/botconduct" rel="noopener noreferrer"&gt;@botconduct&lt;/a&gt;.&lt;/em&gt;&lt;br&gt;
Rafa Mizrahi&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>security</category>
      <category>datascience</category>
    </item>
  </channel>
</rss>
