<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Simon Paxton</title>
    <description>The latest articles on Forem by Simon Paxton (@simon_paxton).</description>
    <link>https://forem.com/simon_paxton</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3812173%2Fa596220b-d0d6-4427-ba84-c4a2f45f39d5.png</url>
      <title>Forem: Simon Paxton</title>
      <link>https://forem.com/simon_paxton</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/simon_paxton"/>
    <language>en</language>
    <item>
      <title>Firefox Zero-Day: Mozilla Says Claude Mythos Found 271 Bugs</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Sun, 10 May 2026 04:38:33 +0000</pubDate>
      <link>https://forem.com/simon_paxton/firefox-zero-day-mozilla-says-claude-mythos-found-271-bugs-5e3g</link>
      <guid>https://forem.com/simon_paxton/firefox-zero-day-mozilla-says-claude-mythos-found-271-bugs-5e3g</guid>
      <description>&lt;p&gt;Mozilla said this week that its &lt;strong&gt;Firefox zero-day&lt;/strong&gt; hardening work with an early version of &lt;strong&gt;Claude Mythos Preview&lt;/strong&gt; helped identify and fix &lt;strong&gt;271 vulnerabilities&lt;/strong&gt; shipped in Firefox 150. In Mozilla’s account, the model-assisted effort followed earlier scans with Opus 4.6 that had already led to fixes for 22 security-sensitive bugs in Firefox 148.&lt;/p&gt;

&lt;p&gt;The company also named &lt;strong&gt;three Firefox CVEs&lt;/strong&gt; it explicitly credited to Claude Mythos Preview: &lt;strong&gt;CVE-2026-6746, CVE-2026-6757, and CVE-2026-6758&lt;/strong&gt;. Mozilla’s public posts said the broader set of 271 findings came from the initial Mythos evaluation, while most of those fixes did not receive individual public CVE listings.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI model helps Mozilla fix 271 Firefox vulnerabilities
&lt;/h2&gt;

&lt;p&gt;In Mozilla’s security blog, the company said Firefox 150 includes fixes for &lt;strong&gt;271 vulnerabilities&lt;/strong&gt; identified during its first evaluation of Claude Mythos Preview. Mozilla said the work started in February, when the Firefox team began using frontier AI models to look for latent security bugs in the browser.&lt;/p&gt;

&lt;p&gt;That number is a lot larger than Mozilla’s previous public AI-assisted result. The same post said earlier work with &lt;strong&gt;Claude Opus 4.6&lt;/strong&gt; produced fixes for &lt;strong&gt;22 security-sensitive bugs&lt;/strong&gt; in Firefox 148.&lt;/p&gt;

&lt;p&gt;Ars Technica reported that Mozilla engineers described the latest results as having &lt;strong&gt;“almost no false positives.”&lt;/strong&gt; That is the part worth noting. AI bug reports have usually had the opposite reputation: lots of plausible text, then a human spends the afternoon discovering the bug does not exist.&lt;/p&gt;

&lt;p&gt;Mozilla’s own engineering post acknowledges that reputation. It described earlier AI-generated security reports as “unwanted slop” and said the dynamic changed because the models improved and because Mozilla got better at steering and filtering them.&lt;/p&gt;

&lt;p&gt;For related coverage of model performance on offensive and defensive security tasks, see NovaKnown’s earlier reporting on &lt;a href="https://novaknown.com/2026/04/14/ai-cyber-capabilities/" rel="noopener noreferrer"&gt;AI cyber capabilities&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mozilla says Claude Mythos Preview found three Firefox zero-days
&lt;/h2&gt;

&lt;p&gt;Mozilla’s advisories for Firefox 150 publicly credit &lt;strong&gt;three named vulnerabilities&lt;/strong&gt; to Claude Mythos Preview: &lt;strong&gt;CVE-2026-6746&lt;/strong&gt;, &lt;strong&gt;CVE-2026-6757&lt;/strong&gt;, and &lt;strong&gt;CVE-2026-6758&lt;/strong&gt;. Those are the clearest line items connecting the model to specific disclosed bugs.&lt;/p&gt;

&lt;p&gt;Mozilla’s engineering post added an important detail about the mix of findings: some of the reports were &lt;strong&gt;sandbox escapes&lt;/strong&gt;. In browser security, a sandbox escape is a bug that lets code break out of the restricted rendering process into a more privileged one.&lt;/p&gt;

&lt;p&gt;Mozilla said those sandbox escapes would need to be &lt;strong&gt;combined with other exploits&lt;/strong&gt; to produce a full-chain Firefox compromise. The company also said the model was allowed to patch Firefox source code during these investigations, as long as the modified code only ran in the sandboxed process.&lt;/p&gt;

&lt;p&gt;That matters because Mozilla framed these as hardening results across multiple browser subsystems, not just a list of one-shot critical remote code execution bugs. Several findings were defense-in-depth issues or bugs that improved exploitability boundaries rather than standalone takeover chains.&lt;/p&gt;

&lt;p&gt;For background on the model itself, see NovaKnown’s earlier coverage of &lt;a href="https://novaknown.com/2026/04/11/anthropic-mythos-hype/" rel="noopener noreferrer"&gt;Anthropic Mythos&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The harness and workflow behind the Firefox scans
&lt;/h2&gt;

&lt;p&gt;Mozilla said the jump in useful findings came from a custom &lt;strong&gt;agent harness&lt;/strong&gt; wrapped around the model. The harness gave the LLM instructions, access to project tools, and a loop that kept it working until it either produced a verifiable result or ran out of road.&lt;/p&gt;

&lt;p&gt;Ars Technica quoted Mozilla Distinguished Engineer Brian Grinstead describing the harness as code that tells the model to find a bug in a file, gives it tools to read and write files and run test cases, and then keeps iterating until completion. Mozilla said the harness plugged the model into the same testing pipeline and special Firefox builds its human developers already use.&lt;/p&gt;

&lt;p&gt;One concrete example was memory-safety work with sanitizer builds. Grinstead said the team could point the agent at a source file, tell it there was an issue to find, and let it generate test cases until it produced a crash under the sanitizer build. That is a much clearer success condition than “read this code and tell me if anything looks bad,” which is how you get slop.&lt;/p&gt;
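
&lt;p&gt;As a rough sketch of that loop’s shape (not Mozilla’s actual harness, which has not been released), the control flow is small: give the model a file and a goal, let it propose test cases, run each one against an instrumented build, and stop only on a verifiable crash. The &lt;code&gt;ask_model&lt;/code&gt; and &lt;code&gt;run_sanitizer_build&lt;/code&gt; functions below are hypothetical stand-ins.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Minimal sketch of an agent harness with a deterministic success condition.
# ask_model() and run_sanitizer_build() are hypothetical stand-ins; Mozilla's
# real harness, tools, and instrumented builds have not been published.

def ask_model(source_file, history):
    """Placeholder for the LLM call that proposes a new test case."""
    return "candidate_test_case"

def run_sanitizer_build(source_file, test_case):
    """Placeholder for running the test under an ASan-style instrumented build.
    Returns True only when the build reports a crash or memory error."""
    return False

def find_memory_bug(source_file, max_attempts=50):
    history = []
    for _ in range(max_attempts):
        test_case = ask_model(source_file, history)      # model proposes an input
        if run_sanitizer_build(source_file, test_case):  # deterministic check, not prose
            return test_case                             # verified finding
        history.append(test_case)                        # feed the failure back into the loop
    return None                                          # budget exhausted, report nothing

print(find_memory_bug("example_module.cpp"))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The useful property is that the loop only ends on evidence the build system can confirm, which is the behavior Mozilla credits for the low false-positive rate.&lt;/p&gt;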

&lt;p&gt;Mozilla also said the model could use existing fuzzing infrastructure and other internal tools. The workflow was not a chatbot staring at source code. It was an LLM inside a project-specific loop with deterministic checks.&lt;/p&gt;

&lt;p&gt;The company’s post says this setup improved both &lt;strong&gt;signal generation&lt;/strong&gt; and &lt;strong&gt;noise filtering&lt;/strong&gt;. That lands squarely in the bucket of &lt;a href="https://novaknown.com/2026/04/24/llm-failure-modes/" rel="noopener noreferrer"&gt;LLM failure modes&lt;/a&gt;: the model still needs a workflow that can verify outputs against reality.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Mozilla framed the findings and the remaining caveats
&lt;/h2&gt;

&lt;p&gt;Mozilla’s public framing was blunt. In its security post, the company wrote that “the zero-days are numbered” and said defenders now have a chance to win “decisively.” The reporting underneath that claim is narrower and more concrete: Firefox 150 shipped with 271 fixes tied to model-assisted hardening work, and Mozilla published extra technical detail because of the level of interest.&lt;/p&gt;

&lt;p&gt;Mozilla also said it intentionally released only a &lt;strong&gt;small sample&lt;/strong&gt; of the underlying reports. The company normally keeps detailed bug reports private for several months after fixes ship, and said it made a calculated decision to unhide some examples earlier than usual.&lt;/p&gt;

&lt;p&gt;The engineering post also describes what the models &lt;strong&gt;did not&lt;/strong&gt; find. Mozilla said some hardened surfaces and layered defenses held up against the model’s attempts, including areas where previous human researchers had found clever routes. That is a useful detail because it puts the Firefox zero-day discussion on actual terrain: a browser with layered mitigations, not a generic claim that AI now solves security.&lt;/p&gt;

&lt;p&gt;A separate government evaluation from the &lt;strong&gt;UK AI Security Institute&lt;/strong&gt; puts Claude Mythos Preview’s cyber performance in a broader comparison set. The institute said an early checkpoint of &lt;strong&gt;GPT-5.5&lt;/strong&gt; now reaches a &lt;strong&gt;similar level&lt;/strong&gt; on its cyber evaluations, after Mythos Preview had previously been the first model to complete its end-to-end corporate network attack simulation. On the institute’s expert-level cyber tasks, GPT-5.5 posted a &lt;strong&gt;71.4%&lt;/strong&gt; average pass rate versus &lt;strong&gt;68.6%&lt;/strong&gt; for Mythos Preview.&lt;/p&gt;

&lt;p&gt;That evaluation does not measure Firefox directly. It does, however, place Mozilla’s Firefox zero-day work next to an external benchmark showing Mythos Preview is no longer alone at that performance tier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Mozilla said Firefox 150 includes fixes for &lt;strong&gt;271 vulnerabilities&lt;/strong&gt; found during an initial evaluation of &lt;strong&gt;Claude Mythos Preview&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Mozilla explicitly credited &lt;strong&gt;three CVEs&lt;/strong&gt; to Claude Mythos Preview: &lt;strong&gt;CVE-2026-6746&lt;/strong&gt;, &lt;strong&gt;CVE-2026-6757&lt;/strong&gt;, and &lt;strong&gt;CVE-2026-6758&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Mozilla said a custom &lt;strong&gt;agent harness&lt;/strong&gt; was central to the results, giving the model tools, test infrastructure, and deterministic verification loops.&lt;/li&gt;
&lt;li&gt;Some findings were &lt;strong&gt;sandbox escapes&lt;/strong&gt; and defense-in-depth issues, not all standalone full-chain compromises.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;UK AI Security Institute&lt;/strong&gt; said &lt;strong&gt;GPT-5.5&lt;/strong&gt; now performs at a similar level to Mythos Preview on its cyber evaluations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://blog.mozilla.org/en/privacy-security/ai-security-zero-day-vulnerabilities/" rel="noopener noreferrer"&gt;The zero-days are numbered&lt;/a&gt; — Mozilla’s security post on Firefox 150 and the 271 vulnerabilities tied to Mythos-assisted hardening.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/" rel="noopener noreferrer"&gt;Behind the Scenes Hardening Firefox with Claude Mythos Preview&lt;/a&gt; — Mozilla’s engineering write-up on the harness, sample reports, and workflow details.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities" rel="noopener noreferrer"&gt;Our evaluation of OpenAI's GPT-5.5 cyber capabilities&lt;/a&gt; — The UK AI Security Institute’s benchmark comparing GPT-5.5 with Mythos Preview.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://arstechnica.com/information-technology/2026/05/mozilla-says-271-vulnerabilities-found-by-mythos-have-almost-no-false-positives/" rel="noopener noreferrer"&gt;Mozilla says 271 vulnerabilities found by Mythos have ‘almost no false positives’&lt;/a&gt; — Ars Technica’s report with additional quotes from Mozilla engineers on the harness and verification loop.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2802" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mozilla</category>
      <category>firefox</category>
      <category>anthropic</category>
      <category>openai</category>
    </item>
    <item>
      <title>1,000x Claim, No Independent Proof: Subquadratic Architecture</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Fri, 08 May 2026 04:24:23 +0000</pubDate>
      <link>https://forem.com/simon_paxton/1000x-claim-no-independent-proof-subquadratic-architecture-44h</link>
      <guid>https://forem.com/simon_paxton/1000x-claim-no-independent-proof-subquadratic-architecture-44h</guid>
      <description>&lt;p&gt;Subquadratic launched from stealth this week with a claim that its &lt;strong&gt;subquadratic architecture&lt;/strong&gt; can cut attention compute by nearly &lt;strong&gt;1,000x&lt;/strong&gt; at very large context lengths. On its launch page, the startup said its first model, &lt;strong&gt;SubQ 1M-Preview&lt;/strong&gt;, is built on a “fully subquadratic architecture” rather than the standard transformer pattern where attention cost rises quadratically with context length.&lt;/p&gt;

&lt;p&gt;The headline number is large enough to attract immediate scrutiny. VentureBeat reported that Subquadratic had not published independent research validating the claim at launch, even as it pitched three private-beta products built around the same &lt;strong&gt;subquadratic architecture&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Subquadratic claims a 1,000x attention-compute reduction
&lt;/h2&gt;

&lt;p&gt;On its launch page, Subquadratic says its model belongs to “a new class of large language models” and that its &lt;strong&gt;subquadratic architecture&lt;/strong&gt; reduces attention compute by “almost 1,000x compared to other frontier models.” The company ties that figure to very long inputs, saying the comparison applies at &lt;strong&gt;12 million tokens&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That is a direct shot at the main cost curve in transformer models. In standard attention, each token is compared with every other token, so compute grows quadratically as context gets longer. Subquadratic says its approach changes that scaling so compute grows linearly with context length instead.&lt;/p&gt;
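
&lt;p&gt;A back-of-envelope sketch shows why the headline multiplier depends entirely on constants Subquadratic has not published. The per-token budget below is an assumed number chosen only to make the arithmetic concrete, not a figure from the company.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Back-of-envelope scaling comparison. The per-token budget is an assumed
# illustration, not a published Subquadratic figure.
ctx = 12_000_000                     # context length cited on the launch page
quadratic_cost = ctx ** 2            # standard attention: every token attends to every token
per_token_budget = 12_000            # assumed fixed work per token in a linear-scaling design
linear_cost = ctx * per_token_budget

print(quadratic_cost / linear_cost)  # 1000.0 with these assumed constants
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;At 12 million tokens, any design with a bounded per-token cost beats quadratic attention by orders of magnitude, so the specific multiplier mostly reflects constants the company has not disclosed, which is exactly what independent validation would need to pin down.&lt;/p&gt;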

&lt;h2&gt;
  
  
  SubQ 1M-Preview and the products it is pitching
&lt;/h2&gt;

&lt;p&gt;VentureBeat reported that the company’s first model is called &lt;strong&gt;SubQ 1M-Preview&lt;/strong&gt;. Alongside it, Subquadratic launched three products into private beta:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an &lt;strong&gt;API&lt;/strong&gt; with access to the full context window&lt;/li&gt;
&lt;li&gt;a command-line coding agent called &lt;strong&gt;SubQ Code&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;a search product called &lt;strong&gt;SubQ Search&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The launch positions the model as more than a research claim. The company is already packaging the &lt;strong&gt;subquadratic architecture&lt;/strong&gt; as an API, a coding tool, and a search system.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Subquadratic says about long-context costs
&lt;/h2&gt;

&lt;p&gt;Subquadratic’s pitch is centered on long-context workloads, where context length means how much text a model can process in one shot. The company says lower attention cost makes workloads that were previously too expensive to run at scale more practical.&lt;/p&gt;

&lt;p&gt;That claim lines up with a real bottleneck. In conventional transformer systems, doubling context length does not double attention cost; it quadruples it. That is why long-context applications often rely on retrieval, chunking, and other workarounds instead of simply sending everything to the model. NovaKnown has covered adjacent efficiency work before in pieces on &lt;a href="https://novaknown.com/2026/04/20/speculative-checkpointing/" rel="noopener noreferrer"&gt;speculative checkpointing&lt;/a&gt;, &lt;a href="https://novaknown.com/2026/04/16/llm-performance-drop/" rel="noopener noreferrer"&gt;LLM performance drop&lt;/a&gt;, and &lt;a href="https://novaknown.com/2026/04/26/claude-code-token-usage/" rel="noopener noreferrer"&gt;Claude Code token usage&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Published evidence at launch
&lt;/h2&gt;

&lt;p&gt;The missing piece at launch was independent backing. VentureBeat reported that the efficiency numbers would matter only &lt;strong&gt;if validated independently&lt;/strong&gt;, and that no published independent research was available at the time of the announcement.&lt;/p&gt;

&lt;p&gt;That leaves the public record in a very specific state. Subquadratic has made a concrete claim about a &lt;strong&gt;subquadratic architecture&lt;/strong&gt;, given a concrete figure for attention-compute reduction, and announced products based on it. What it had not done, in the material available at launch, was publish outside validation showing the architecture performs as claimed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Funding and launch details
&lt;/h2&gt;

&lt;p&gt;VentureBeat reported that Subquadratic emerged from stealth on Tuesday and had raised &lt;strong&gt;$29 million&lt;/strong&gt; in seed funding. The report said investors include Tinder co-founder &lt;strong&gt;Justin Mateen&lt;/strong&gt;, former SoftBank Vision Fund partner &lt;strong&gt;Javier Villamizar&lt;/strong&gt;, and early investors in Anthropic, OpenAI, Stripe, and Brex.&lt;/p&gt;

&lt;p&gt;The same VentureBeat report cited The New Stack as saying the raise valued the company at &lt;strong&gt;$500 million&lt;/strong&gt;. Those funding details sat alongside the product launch, not a peer-reviewed paper — which is one reason the discussion around the model quickly split between curiosity and demands for proof.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Subquadratic says its &lt;strong&gt;subquadratic architecture&lt;/strong&gt; reduces attention compute by almost &lt;strong&gt;1,000x&lt;/strong&gt; compared with frontier models at &lt;strong&gt;12 million tokens&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The company’s first announced model is &lt;strong&gt;SubQ 1M-Preview&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Subquadratic launched three private-beta products: &lt;strong&gt;SubQ API&lt;/strong&gt;, &lt;strong&gt;SubQ Code&lt;/strong&gt;, and &lt;strong&gt;SubQ Search&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The company says its design changes long-context economics by making compute scale linearly with context length instead of quadratically.&lt;/li&gt;
&lt;li&gt;At launch, VentureBeat reported &lt;strong&gt;no independent published validation&lt;/strong&gt; of the architecture claim.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://subq.ai/introducing-subq" rel="noopener noreferrer"&gt;Introducing SubQ&lt;/a&gt; — Subquadratic’s launch page outlining its fully subquadratic architecture and attention-compute claim.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://venturebeat.com/technology/miami-startup-subquadratic-claims-1-000x-ai-efficiency-gain-with-subq-model-researchers-demand-independent-proof/" rel="noopener noreferrer"&gt;Miami startup Subquadratic claims 1,000x AI efficiency gain with SubQ model, researchers demand independent proof&lt;/a&gt; — VentureBeat’s report on the model, product launch, funding, and the lack of independent published validation at launch.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2799" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>subquadratic</category>
      <category>venturebeat</category>
      <category>openai</category>
      <category>anthropic</category>
    </item>
    <item>
      <title>They Rejected It. It’s Building Anyway: OpenAI Oracle Data Center</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Thu, 07 May 2026 21:38:31 +0000</pubDate>
      <link>https://forem.com/simon_paxton/they-rejected-it-its-building-anyway-openai-oracle-data-center-4h6f</link>
      <guid>https://forem.com/simon_paxton/they-rejected-it-its-building-anyway-openai-oracle-data-center-4h6f</guid>
      <description>&lt;p&gt;The &lt;strong&gt;OpenAI Oracle data center&lt;/strong&gt; project in Saline Township, Michigan moved into construction after local officials voted down the rezoning tied to the site. Fortune reported in May that building work had begun even after the local rejection.&lt;/p&gt;

&lt;p&gt;The Washington Post reported that OpenAI is involved in the planned giant Michigan facility, and Data Center Dynamics identified Oracle as a partner on the same project. That puts the &lt;strong&gt;OpenAI Oracle data center&lt;/strong&gt; inside the companies' broader Stargate data center push, but in this case the immediate story is local: a farm-town rezoning fight that did not stop construction.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI and Oracle’s Michigan data center project went ahead after local rejection
&lt;/h2&gt;

&lt;p&gt;Construction started after the township board rejected the rezoning request for the project. Fortune reported that the Saline Township development had already been turned down by local officials before work began.&lt;/p&gt;

&lt;p&gt;Data Center Dynamics reported that the project is being developed with &lt;strong&gt;OpenAI and Oracle&lt;/strong&gt; in Saline Township, a rural community near Ann Arbor. The outlet described the project as part of Stargate and as the center of a dispute between developers and township governance.&lt;/p&gt;

&lt;p&gt;That sequence is the unusual part. In the standard version of a local land-use fight, a failed rezoning vote stops the project. Here, the &lt;strong&gt;OpenAI Oracle data center&lt;/strong&gt; advanced into construction anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Saline Township voted down the rezoning before construction started
&lt;/h2&gt;

&lt;p&gt;Planet Detroit reported that the Saline Township board voted to deny the rezoning needed for the project. The vote came before construction moved forward.&lt;/p&gt;

&lt;p&gt;The site dispute centered on whether farmland in the township could be used for the data center development. Fortune's reporting on the start of construction placed that earlier denial at the center of the local conflict.&lt;/p&gt;

&lt;p&gt;Saline Township's role was not symbolic. Local officials had already said no to the rezoning request before the project advanced on the ground.&lt;/p&gt;

&lt;h2&gt;
  
  
  The project became a zoning fight over exclusionary zoning
&lt;/h2&gt;

&lt;p&gt;The Washington Post reported that the developer sued and argued that the township's rejection amounted to &lt;strong&gt;exclusionary zoning&lt;/strong&gt;. In plain English, that is the claim that local zoning rules were used to block a lawful category of development rather than regulate it neutrally.&lt;/p&gt;

&lt;p&gt;The same report tied the lawsuit directly to the Saline Township data center project involving OpenAI. Data Center Dynamics separately described the conflict as a zoning dispute between the companies behind the project and local government.&lt;/p&gt;

&lt;p&gt;Planet Detroit also reported on later court activity, including a judge denying intervention in the case. By then, the dispute had already moved beyond a local board vote and into litigation over how the township used its zoning power.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI’s role sits inside the broader Stargate buildout
&lt;/h2&gt;

&lt;p&gt;OpenAI said in its January announcement of the &lt;strong&gt;Stargate Project&lt;/strong&gt; that it was launching a major AI infrastructure effort with partners including Oracle. That company announcement established OpenAI's role in the larger buildout that the Michigan project is part of.&lt;/p&gt;

&lt;p&gt;The Michigan facility is one local piece of that broader infrastructure push. The Washington Post connected the Saline Township project to OpenAI's AI expansion, and Data Center Dynamics explicitly linked the site to Stargate.&lt;/p&gt;

&lt;p&gt;If this sounds familiar, it fits a broader pattern of AI infrastructure colliding with local land-use politics. NovaKnown has covered the spending side in &lt;a href="https://novaknown.com/2026/04/19/ai-datacenter-spending/" rel="noopener noreferrer"&gt;AI Datacenter Spending&lt;/a&gt;, community resistance in &lt;a href="https://novaknown.com/2026/04/14/data-center-backlash-festus/" rel="noopener noreferrer"&gt;Data Center Backlash&lt;/a&gt;, and new buildouts abroad in &lt;a href="https://novaknown.com/2026/03/19/datagrid-new-zealand-ai-factory/" rel="noopener noreferrer"&gt;Datagrid New Zealand&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;OpenAI Oracle data center&lt;/strong&gt; in Saline Township moved into construction after local officials rejected the rezoning tied to the project.&lt;/li&gt;
&lt;li&gt;OpenAI and Oracle were both identified by named sources as participants in the Michigan project, which is tied to Stargate.&lt;/li&gt;
&lt;li&gt;Planet Detroit reported that the township board voted down the rezoning before construction began.&lt;/li&gt;
&lt;li&gt;The Washington Post reported that a lawsuit over the project argues the township used exclusionary zoning.&lt;/li&gt;
&lt;li&gt;The dispute has become both a local governance fight and a court case over the township's zoning decision.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://openai.com/index/announcing-the-stargate-project/" rel="noopener noreferrer"&gt;OpenAI — Announcing the Stargate Project&lt;/a&gt; — OpenAI's announcement of Stargate and its infrastructure partnership with Oracle.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.washingtonpost.com/nation/2025/10/13/data-center-bans-lawsuit/" rel="noopener noreferrer"&gt;The Washington Post — Data center bans lawsuit&lt;/a&gt; — Reports OpenAI's involvement in the Michigan project and the exclusionary-zoning lawsuit.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.datacenterdynamics.com/en/news/planned-stargate-data-center-encounters-opposition-in-saline-township-michigan/" rel="noopener noreferrer"&gt;Data Center Dynamics — Planned Stargate data center encounters opposition in Saline Township, Michigan&lt;/a&gt; — Details Oracle's involvement and the local zoning dispute.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://planetdetroit.org/2026/02/judge-denies-data-center-intervention/" rel="noopener noreferrer"&gt;Planet Detroit — Judge denies data center intervention&lt;/a&gt; — Covers the township board's rezoning denial and later court proceedings.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://fortune.com/2026/05/06/ai-data-center-michigan-saline-politics-farmland/" rel="noopener noreferrer"&gt;Fortune — AI data center Michigan Saline politics farmland&lt;/a&gt; — Reports that construction began after the local rejection.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2796" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>openai</category>
      <category>oracle</category>
      <category>stargate</category>
      <category>michigan</category>
    </item>
    <item>
      <title>OpenAI Revenue is Not the Whole Story: Anthropic's Enterprise Bet</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Sun, 03 May 2026 06:16:56 +0000</pubDate>
      <link>https://forem.com/simon_paxton/openai-revenue-is-not-the-whole-story-anthropics-enterprise-bet-4p6i</link>
      <guid>https://forem.com/simon_paxton/openai-revenue-is-not-the-whole-story-anthropics-enterprise-bet-4p6i</guid>
      <description>&lt;p&gt;OpenAI revenue is still the number people reach for when they want a leaderboard. But the cleaner frame is different: Anthropic appears to be building a different kind of AI business, one centered on enterprise customers, safety positioning, and less dependence on mass-market fame.&lt;/p&gt;

&lt;p&gt;That distinction matters because public discussion keeps collapsing three separate things into one scorecard: revenue, valuation, and brand recognition. The available sources here do &lt;em&gt;not&lt;/em&gt; show that Anthropic has passed OpenAI on valuation or revenue. They do show why Anthropic can look strong anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Anthropic’s Enterprise Focus Changes The Revenue Conversation
&lt;/h2&gt;

&lt;p&gt;The strategic frame is &lt;strong&gt;revenue mix versus public visibility&lt;/strong&gt;. A company can be less famous and still look formidable if it is optimized for business spending rather than consumer attention.&lt;/p&gt;

&lt;p&gt;Anthropic’s own &lt;a href="https://www.anthropic.com/enterprise" rel="noopener noreferrer"&gt;Claude for Enterprise&lt;/a&gt; page makes that positioning unusually explicit. It leads with enterprise workflows, secure connections to company knowledge, and business use cases rather than a mass-market assistant pitch.&lt;/p&gt;

&lt;p&gt;That is a different motion from a consumer product becoming a household verb. Enterprise buyers care about access controls, internal knowledge retrieval, and whether a tool can slot into existing company work. Anthropic is selling into that budget line.&lt;/p&gt;

&lt;p&gt;A small detail on the same page is revealing: Anthropic highlights Lyft reducing customer support time by &lt;strong&gt;87% with Claude&lt;/strong&gt;. That is not consumer marketing. It is a procurement story, aimed at managers who sign contracts after seeing labor savings and workflow gains.&lt;/p&gt;

&lt;p&gt;This is why claims about &lt;strong&gt;OpenAI revenue&lt;/strong&gt; often miss the interesting part. Two AI companies can generate money through very different channels. One can dominate public awareness while the other builds a quieter base of higher-touch business accounts.&lt;/p&gt;

&lt;p&gt;That difference also helps explain why Anthropic shows up so often in discussions about workplace AI adoption and developer workflows, including in comparisons like our look at &lt;a href="https://novaknown.com/2026/03/30/claude-vs-chatgpt/" rel="noopener noreferrer"&gt;Claude vs ChatGPT&lt;/a&gt;. The products overlap, but the go-to-market emphasis is not the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI Still Has The Stronger Consumer Brand
&lt;/h2&gt;

&lt;p&gt;On public recognition, the gap is much easier to support. The Reuters Institute’s 2025 report says &lt;strong&gt;ChatGPT is by far the most widely recognised generative AI system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That matters because brand recognition and revenue are related, but they are not interchangeable. ChatGPT gave OpenAI something Anthropic does not have at the same scale: a consumer brand that functions as category shorthand.&lt;/p&gt;

&lt;p&gt;When people talk about AI in casual conversation, they usually say “ChatGPT,” not “Claude.” That creates distribution all by itself. It also makes &lt;strong&gt;OpenAI revenue&lt;/strong&gt; a more natural headline than Anthropic’s business performance, because consumer familiarity drives media attention.&lt;/p&gt;

&lt;p&gt;Anthropic’s relative lack of consumer fame should not be confused with weakness. It means the company is playing a different game. OpenAI owns more of the public mindshare; Anthropic is visibly pitching itself to organizations that care more about internal deployment than mass recognition.&lt;/p&gt;

&lt;p&gt;There is a second-order effect here. Consumer fame tends to distort how outsiders judge company strength. A company with the stronger household brand often gets treated as if it must also lead every business metric. That is exactly the shortcut readers should avoid.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Anthropic’s Public Messaging Reveals About Its Business Model
&lt;/h2&gt;

&lt;p&gt;Anthropic’s homepage is unusually consistent about one thing: &lt;strong&gt;safety is not a side note&lt;/strong&gt;. The company foregrounds “AI to serve humanity’s long-term well-being,” links to its Responsible Scaling Policy, and frames Claude as “a space to think” with “No ads. No sponsored content.”&lt;/p&gt;

&lt;p&gt;That messaging is branding, but it is also customer selection. Safety language, governance language, and enterprise product pages all point toward buyers who want a lower-drama procurement story: controlled deployment, business use cases, and an AI vendor that talks like a risk committee can live with it.&lt;/p&gt;

&lt;p&gt;This is the part many leaderboard arguments miss. Anthropic’s safety posture is not just philosophy; it is part of the sales motion. For an enterprise customer, especially one connecting internal company knowledge, trust signals can be part of the product.&lt;/p&gt;

&lt;p&gt;That does not mean the strategy is frictionless. Enterprise-first companies often run harder into account controls, permissions, and support expectations. You can see how quickly trust becomes operational, not abstract, in situations like reported &lt;a href="https://novaknown.com/2026/04/23/anthropic-bans-without-warning/" rel="noopener noreferrer"&gt;Anthropic bans&lt;/a&gt;. Once your buyer is a business, reliability and account handling become part of the value proposition.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Valuation Claims Need Caution When Revenue Is Not Public
&lt;/h2&gt;

&lt;p&gt;Here is the part worth stating plainly: the source set does &lt;strong&gt;not&lt;/strong&gt; support the claim that Anthropic has overtaken OpenAI in valuation or revenue. Those claims were investigated and dropped.&lt;/p&gt;

&lt;p&gt;That leaves a narrower, better argument. Anthropic’s enterprise positioning explains why some observers may &lt;em&gt;feel&lt;/em&gt; like it is winning, especially inside technical teams and business deployments, without needing any unsupported claim about beating &lt;strong&gt;OpenAI revenue&lt;/strong&gt; or surpassing OpenAI’s valuation.&lt;/p&gt;

&lt;p&gt;This is a common category error in AI coverage. People see strong enterprise adoption, a credible product, and a clear safety brand, then translate that into assumptions about top-line revenue or private-market value. But those are separate measurements, and neither company’s full numbers are public in a way that lets this comparison be made cleanly from the cited sources.&lt;/p&gt;

&lt;p&gt;A better way to read the chessboard is this: OpenAI has the stronger consumer brand because ChatGPT is the public face of generative AI; Anthropic has built a more overt enterprise-first narrative through Claude Enterprise and safety-centered positioning. Both can be true at once.&lt;/p&gt;

&lt;p&gt;That also changes how to read future reporting on &lt;strong&gt;AI company revenue&lt;/strong&gt;. If a headline treats consumer mindshare as proof of enterprise dominance, or treats enterprise credibility as proof of overall revenue leadership, it is probably compressing too much into one metric. Our earlier coverage of &lt;a href="https://novaknown.com/2026/04/30/openai-revenue-2026-2/" rel="noopener noreferrer"&gt;OpenAI revenue 2026&lt;/a&gt; is useful here precisely because revenue stories need source discipline, not vibes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Anthropic’s verified public positioning is &lt;strong&gt;enterprise-first&lt;/strong&gt;, centered on Claude for Enterprise, secure company knowledge access, and business use cases.&lt;/li&gt;
&lt;li&gt;Reuters Institute reports that &lt;strong&gt;ChatGPT is by far the most widely recognised generative AI system&lt;/strong&gt;, giving OpenAI a stronger consumer brand.&lt;/li&gt;
&lt;li&gt;The available sources do &lt;strong&gt;not&lt;/strong&gt; support claims that Anthropic has overtaken OpenAI in valuation or revenue.&lt;/li&gt;
&lt;li&gt;Anthropic’s safety-heavy messaging appears tightly linked to its business model, especially for enterprise customers evaluating risk and trust.&lt;/li&gt;
&lt;li&gt;The real comparison is not a simple leaderboard: &lt;strong&gt;OpenAI revenue&lt;/strong&gt;, Anthropic’s enterprise motion, and ChatGPT brand recognition describe different kinds of strength.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://anthropic.com" rel="noopener noreferrer"&gt;Anthropic home&lt;/a&gt; — Anthropic’s homepage shows its safety framing, Claude releases, and overall company positioning.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/enterprise" rel="noopener noreferrer"&gt;Claude for Enterprise&lt;/a&gt; — Anthropic’s enterprise page highlights business workflows, secure knowledge connections, and customer examples like Lyft.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://reutersinstitute.politics.ox.ac.uk/generative-ai-and-news-report-2025-how-people-think-about-ais-role-journalism-and-society" rel="noopener noreferrer"&gt;Reuters Institute generative AI and news report 2025&lt;/a&gt; — Includes the supported brand-recognition comparison showing ChatGPT’s public awareness lead.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://novaknown.com/2026/04/30/openai-revenue-2026-2/" rel="noopener noreferrer"&gt;OpenAI revenue 2026&lt;/a&gt; — NovaKnown’s earlier coverage of OpenAI’s revenue trajectory and what can actually be supported.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://novaknown.com/2026/03/30/claude-vs-chatgpt/" rel="noopener noreferrer"&gt;Claude vs ChatGPT&lt;/a&gt; — A product-level comparison that helps explain why the companies can feel closer in practice than in public mindshare.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The open question is whether Anthropic’s enterprise-first model will eventually produce a clearer public metric advantage, or simply a quieter, more durable one.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2789" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>openai</category>
      <category>anthropic</category>
      <category>chatgpt</category>
      <category>claude</category>
    </item>
    <item>
      <title>11 Minutes, $1.73, and GPT-5.5 Cybersecurity Simulation</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Fri, 01 May 2026 21:37:56 +0000</pubDate>
      <link>https://forem.com/simon_paxton/11-minutes-173-and-gpt-55-cybersecurity-simulation-3n9</link>
      <guid>https://forem.com/simon_paxton/11-minutes-173-and-gpt-55-cybersecurity-simulation-3n9</guid>
      <description>&lt;p&gt;The UK AI Security Institute says &lt;strong&gt;GPT-5.5 cybersecurity simulation&lt;/strong&gt; results now look a lot less like a one-off milestone and a lot more like a repeatable frontier capability. In its latest evaluation, AISI found that an early checkpoint of OpenAI’s GPT-5.5 reached roughly the same level as Anthropic’s Mythos Preview on hard cyber tasks—and slightly beat it on one key benchmark.&lt;/p&gt;

&lt;p&gt;That matters because AISI was explicitly testing whether Mythos Preview’s earlier result was a weird outlier. Instead, a second model from a different developer now lands in the same range, including solving a difficult multi-step cyber attack simulation end-to-end in some attempts. If you’ve been tracking rising &lt;a href="https://novaknown.com/2026/04/14/ai-cyber-capabilities/" rel="noopener noreferrer"&gt;AI cyber capabilities&lt;/a&gt;, this is the part worth circling.&lt;/p&gt;

&lt;h2&gt;
  
  
  GPT-5.5 Cybersecurity Simulation Is No Longer a One-Model Fluke
&lt;/h2&gt;

&lt;p&gt;AISI’s headline finding is simple: &lt;strong&gt;GPT-5.5 reached a similar cyber capability level to Mythos Preview&lt;/strong&gt;. That is the interesting result.&lt;/p&gt;

&lt;p&gt;Back in April, AISI said Mythos Preview was the first frontier model it had seen complete its corporate network attack simulation end-to-end, a multi-step exercise it estimates would take a human expert around &lt;strong&gt;20 hours&lt;/strong&gt;. The obvious follow-up was whether that was a breakthrough tied to one model family.&lt;/p&gt;

&lt;p&gt;AISI’s answer is now: probably not. GPT-5.5, from a different lab, hit a comparable level and achieved a &lt;strong&gt;slightly higher average pass rate&lt;/strong&gt; than Mythos Preview on expert tasks.&lt;/p&gt;

&lt;p&gt;That shift changes the interpretation. A surprising benchmark win can be a stunt. Two frontier models from different developers hitting about the same bar starts to look like a capability class.&lt;/p&gt;

&lt;h2&gt;
  
  
  How GPT-5.5 Performed Across AISI's Cyber Task Suite
&lt;/h2&gt;

&lt;p&gt;AISI’s testbed is broader than a single dramatic demo. It uses a suite of &lt;strong&gt;95 narrow cyber tasks&lt;/strong&gt; across four difficulty tiers, built in capture-the-flag format—structured challenges where the model has to actually recover a “flag” by solving the task.&lt;/p&gt;

&lt;p&gt;Those tasks cover things like &lt;strong&gt;reverse engineering, web exploitation, and cryptography&lt;/strong&gt;. The easier tasks are already saturated by frontier models, so the interesting comparison is in the advanced suite.&lt;/p&gt;

&lt;p&gt;On &lt;strong&gt;Expert-level&lt;/strong&gt; tasks, AISI reports these average pass rates:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Expert task pass rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;71.4% ± 8.0%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mythos Preview&lt;/td&gt;
&lt;td&gt;68.6% ± 8.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.4&lt;/td&gt;
&lt;td&gt;52.4% ± 9.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.7&lt;/td&gt;
&lt;td&gt;48.6% ± 10.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That is a real jump over earlier OpenAI and Anthropic frontier models. GPT-5.5 is not edging forward from 68% to 71% in a vacuum; it is sitting well above GPT-5.4 and Opus 4.7 on the hardest tier AISI reports.&lt;/p&gt;
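
&lt;p&gt;One way to keep that comparison honest is to read the pass rates as intervals rather than a ranking. Using only the numbers AISI reports, the two models’ ranges overlap by a wide margin, which fits AISI’s own “similar level” framing better than a strict leaderboard reading.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Reading AISI's expert-tier pass rates as intervals (mean plus/minus the reported spread).
gpt_55 = (71.4 - 8.0, 71.4 + 8.0)    # roughly (63.4, 79.4)
mythos = (68.6 - 8.7, 68.6 + 8.7)    # roughly (59.9, 77.3)

overlap = max(0.0, min(gpt_55[1], mythos[1]) - max(gpt_55[0], mythos[0]))
print(round(overlap, 1))             # about 13.9 points of overlap vs a 2.8-point gap in means
&lt;/code&gt;&lt;/pre&gt;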

&lt;p&gt;The advanced tasks themselves are also nasty in exactly the way you’d want for this kind of evaluation. AISI says they include reversing stripped binaries and embedded firmware without source code, building reliable exploits for memory corruption bugs, recovering keys from weak crypto implementations, winning TOCTOU races, unpacking obfuscated malware, and weaponizing synthetic vulnerabilities planted in real open-source software.&lt;/p&gt;

&lt;p&gt;One example AISI highlights is a reverse-engineering challenge built around a &lt;strong&gt;stripped Rust ELF implementing a custom virtual machine&lt;/strong&gt;, plus a second unknown-format file containing bytecode for that VM. That is not “write a phishing email.” It is the kind of task where benchmark scores start to tell you something about actual technical depth.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Minutes Matter: The Human-versus-Model Time Gap
&lt;/h2&gt;

&lt;p&gt;AISI says GPT-5.5 solved a difficult cyber task in &lt;strong&gt;under 11 minutes&lt;/strong&gt;. The same full-chain simulation is estimated to take a human expert about &lt;strong&gt;20 hours&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The raw comparison is startling, but it needs one clarification: this does &lt;strong&gt;not&lt;/strong&gt; mean GPT-5.5 is a drop-in replacement for a human red teamer. The benchmark is measuring performance on a controlled task suite, not whether you can hand the model a production network and expect clean autonomous operation.&lt;/p&gt;

&lt;p&gt;Still, the time gap matters for two reasons.&lt;/p&gt;

&lt;p&gt;First, it changes what becomes cheap to try. A model that can take repeated shots at a hard multi-step task in minutes is operating in a very different regime from a human expert who needs most of a day. Even partial success becomes more operationally interesting when attempts are fast.&lt;/p&gt;

&lt;p&gt;Second, AISI says the run cost was &lt;strong&gt;$1.73&lt;/strong&gt;. That is a tiny price for a benchmark result at this level. If frontier models can attempt advanced cyber tasks quickly and cheaply, scaling the number of runs stops being the bottleneck.&lt;/p&gt;

&lt;p&gt;That cost number is easy to miss, but it is one of the most important lines in the evaluation. High-end cyber capability is one thing. High-end cyber capability at commodity-run pricing is another.&lt;/p&gt;
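
&lt;p&gt;The arithmetic behind that point is short. The per-run cost is AISI’s reported figure; the per-attempt success rate below is an assumed illustration, not a measured number.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Cost and odds of repeated attempts at the same task.
cost_per_run = 1.73                      # reported by AISI for the highlighted run
p_success = 0.2                          # assumed chance a single attempt solves the task

for n in (1, 10, 50):
    p_any = 1 - (1 - p_success) ** n     # chance at least one of n attempts succeeds
    print(n, round(n * cost_per_run, 2), round(p_any, 3))
# prints: 1 1.73 0.2, then 10 17.3 0.893, then 50 86.5 1.0
&lt;/code&gt;&lt;/pre&gt;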

&lt;p&gt;This is also why model autonomy research keeps spilling into security. Once you combine strong task performance with low per-run cost and agentic iteration, you get the same pattern people worry about in things like &lt;a href="https://novaknown.com/2026/04/08/agentic-sandbox-escape/" rel="noopener noreferrer"&gt;agentic sandbox escape&lt;/a&gt;: more attempts, more persistence, and less friction.&lt;/p&gt;

&lt;h2&gt;
  
  
  What GPT-5.5 Actually Changes for Cyber Evaluation
&lt;/h2&gt;

&lt;p&gt;The cleanest update is that cyber evals now need to assume &lt;strong&gt;multiple&lt;/strong&gt; labs can produce models at this level. GPT-5.5’s result means benchmark designers can no longer treat top-tier cyber performance as a lab-specific anomaly.&lt;/p&gt;

&lt;p&gt;That pushes evaluation in two directions.&lt;/p&gt;

&lt;p&gt;One is &lt;strong&gt;harder, more realistic tasks&lt;/strong&gt;. AISI notes that basic tasks have been saturated since at least February 2026. When models max out easier CTF-style challenges, the useful signal moves to practitioner and expert tasks with larger search spaces and more steps.&lt;/p&gt;

&lt;p&gt;The other is &lt;strong&gt;more careful interpretation&lt;/strong&gt;. Stronger benchmark performance does not automatically prove deployable defensive capability. A model passing expert CTF cybersecurity tasks can still fail in messy real environments full of unreliable tooling, access constraints, and adversarial inputs.&lt;/p&gt;

&lt;p&gt;We’ve already seen how brittle agentic systems can be when the environment fights back—whether through deliberate attacks like &lt;a href="https://novaknown.com/2026/03/19/prompt-injection-peer-review/" rel="noopener noreferrer"&gt;prompt injection in peer review&lt;/a&gt; or through the ordinary chaos of multi-step tooling. So the right reading of the GPT-5.5 cybersecurity simulation result is not “AI can now do cybersecurity.” It is narrower and, in some ways, more significant: frontier models are now repeatedly reaching expert benchmark territory on serious cyber tasks.&lt;/p&gt;

&lt;p&gt;That is enough to force a change in how these systems are tested, gated, and compared.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AISI found GPT-5.5 reached a similar level to Mythos Preview&lt;/strong&gt;, suggesting frontier cyber performance is no longer a one-model fluke.&lt;/li&gt;
&lt;li&gt;On &lt;strong&gt;Expert-level&lt;/strong&gt; tasks in AISI’s advanced cyber suite, GPT-5.5 scored &lt;strong&gt;71.4%&lt;/strong&gt;, ahead of Mythos Preview at &lt;strong&gt;68.6%&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;AISI says GPT-5.5 solved a difficult multi-step cyber task in &lt;strong&gt;under 11 minutes&lt;/strong&gt;, while the full chain is estimated to take a human expert around &lt;strong&gt;20 hours&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The reported run cost was &lt;strong&gt;$1.73&lt;/strong&gt;, which makes repeated attempts at advanced cyber tasks unusually cheap.&lt;/li&gt;
&lt;li&gt;The result shows &lt;strong&gt;stronger benchmark performance&lt;/strong&gt;, not proof of broadly deployable real-world defensive capability.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities" rel="noopener noreferrer"&gt;Our evaluation of OpenAI's GPT-5.5 cyber capabilities | AISI Work&lt;/a&gt; — Primary source on GPT-5.5’s pass rates, timing, task design, and comparison with Mythos Preview.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://novaknown.com/2026/04/14/ai-cyber-capabilities/" rel="noopener noreferrer"&gt;AI cyber capabilities&lt;/a&gt; — NovaKnown’s earlier coverage of how frontier models are climbing cyber benchmarks.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://novaknown.com/2026/04/08/agentic-sandbox-escape/" rel="noopener noreferrer"&gt;agentic sandbox escape&lt;/a&gt; — Why fast, cheap autonomous retries matter once models can act across multiple steps.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://novaknown.com/2026/03/19/prompt-injection-peer-review/" rel="noopener noreferrer"&gt;prompt injection in peer review&lt;/a&gt; — A useful parallel case for how capable agents still break in hostile or messy environments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The open question now is how long today’s “expert” cyber benchmarks stay discriminating once more labs can train to the same level.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2781" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>openai</category>
      <category>anthropic</category>
      <category>gpt55</category>
      <category>aisi</category>
    </item>
    <item>
      <title>DeepSeek Forces Visual Reasoning Through Points and Boxes</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Fri, 01 May 2026 04:40:28 +0000</pubDate>
      <link>https://forem.com/simon_paxton/deepseek-forces-visual-reasoning-through-points-and-boxes-469d</link>
      <guid>https://forem.com/simon_paxton/deepseek-forces-visual-reasoning-through-points-and-boxes-469d</guid>
      <description>&lt;p&gt;DeepSeek has released an open-source &lt;strong&gt;visual reasoning&lt;/strong&gt; framework called &lt;strong&gt;Thinking with Visual Primitives&lt;/strong&gt;. According to 36Kr, the system changes how a multimodal model is asked to reason: instead of describing an image in loose language, it has to work through explicit visual units like &lt;strong&gt;point coordinates&lt;/strong&gt; and &lt;strong&gt;bounding boxes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That is a much more concrete bet than “better multimodal understanding.” It pushes reasoning closer to measurement. When a model says “the object is near the left side,” language can blur the geometry; when it has to point to coordinates or mark a box, the error has less room to hide.&lt;/p&gt;

&lt;h2&gt;
  
  
  What DeepSeek’s Thinking with Visual Primitives Actually Changes
&lt;/h2&gt;

&lt;p&gt;The release is a framework, not just a vague claim of improved perception. 36Kr reports that DeepSeek unveiled &lt;strong&gt;Thinking with Visual Primitives&lt;/strong&gt; as a multimodal model and technical report, and released it as &lt;strong&gt;open source&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The interesting part is the representation layer. The model is not only generating words about what it sees. It is being made to reason through visual primitives — basic spatial elements such as points and boxes.&lt;/p&gt;

&lt;p&gt;That sounds small, but it changes the failure mode. A lot of &lt;strong&gt;visual reasoning&lt;/strong&gt; errors happen when the model jumps too quickly from pixels to prose. It can produce a fluent sentence that sounds right while quietly dropping the actual layout of the scene.&lt;/p&gt;

&lt;p&gt;With visual primitives, the model has to show more of its work. If a task depends on location, size, or relative position, a coordinate is harder to fudge than a sentence.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Visual Primitives Beat Vague Descriptions for Spatial Tasks
&lt;/h2&gt;

&lt;p&gt;The core claim 36Kr makes is specific: the framework improves &lt;strong&gt;spatial reasoning&lt;/strong&gt; by requiring precise visual data points instead of vague natural-language descriptions. In practice, that means the model has to anchor its reasoning in things that can be checked.&lt;/p&gt;

&lt;p&gt;Take a simple spatial task. “Which object is closest to the top-right corner?” A language-first system might narrate the scene and guess based on a rough impression. A primitive-based system can mark candidate objects with &lt;strong&gt;bounding boxes&lt;/strong&gt;, compare positions, and reason from those coordinates.&lt;/p&gt;

&lt;p&gt;Or imagine “point to the handle of the mug.” The phrase “the handle is on the side” is descriptive, but it is not an answer you can directly score. A point coordinate is.&lt;/p&gt;
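
&lt;p&gt;That is also what makes primitive-based answers easy to grade automatically. The sketch below is generic evaluation logic for points and axis-aligned boxes, not code from DeepSeek’s release.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Generic scoring for primitive-based answers: a predicted point either lands in
# the ground-truth box or it does not, and two boxes have a measurable overlap.
# Illustrative only; not taken from DeepSeek's released framework.

def point_in_box(px, py, box):
    x1, y1, x2, y2 = box
    cx = min(max(px, x1), x2)          # clamp the point to the box
    cy = min(max(py, y1), y2)
    return (cx, cy) == (px, py)        # unchanged by clamping means it was inside

def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union

print(point_in_box(0.62, 0.41, (0.55, 0.30, 0.80, 0.60)))         # True: a scorable hit
print(round(iou((0.1, 0.1, 0.5, 0.5), (0.3, 0.3, 0.7, 0.7)), 3))  # 0.143: partial overlap
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A sentence like “the handle is on the side” has no equivalent check; a coordinate or box can be compared against ground truth in one line.&lt;/p&gt;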

&lt;p&gt;That distinction matters because language is compressive. It throws away detail on purpose. Humans do this constantly and get away with it because we share context. Models often do not. They replace measurement with summary, and summary is where a lot of hallucination-like behavior starts.&lt;/p&gt;

&lt;p&gt;This is the same broad instinct behind work to &lt;a href="https://novaknown.com/2026/04/07/reduce-llm-hallucinations/" rel="noopener noreferrer"&gt;reduce LLM hallucinations&lt;/a&gt;: force the system to stay attached to observable structure for as long as possible. Here, the observable structure is spatial.&lt;/p&gt;

&lt;p&gt;There is a nice symmetry with research on &lt;a href="https://novaknown.com/2026/04/19/zero-shot-world-models/" rel="noopener noreferrer"&gt;zero-shot world models&lt;/a&gt;, too. If you want a model to reason about a scene, you need a representation that preserves the scene. Text alone often smooths over exactly the information you care about.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Open Source Matters for Visual Reasoning
&lt;/h2&gt;

&lt;p&gt;DeepSeek released Thinking with Visual Primitives as &lt;strong&gt;open source&lt;/strong&gt;, according to 36Kr. For this kind of work, that matters more than usual.&lt;/p&gt;

&lt;p&gt;A lot of multimodal claims are hard to inspect. You get a demo, a benchmark headline, maybe a polished sample image — but not the machinery that tells you what changed. Open-sourcing a &lt;strong&gt;visual reasoning&lt;/strong&gt; framework gives researchers and builders something much more useful: a way to test whether the representation itself is doing the work.&lt;/p&gt;

&lt;p&gt;That opens up a few concrete paths:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Researchers can compare primitive-based reasoning against language-only baselines.&lt;/li&gt;
&lt;li&gt;Builders can inspect where coordinate constraints help and where they add overhead.&lt;/li&gt;
&lt;li&gt;The community can try alternative visual primitives, scoring methods, or training recipes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where &lt;a href="https://novaknown.com/2026/04/12/open-source-ai-revenue/" rel="noopener noreferrer"&gt;open-source AI&lt;/a&gt; keeps getting more interesting. Once the representation layer is visible, progress is not limited to whoever owns the API. Other teams can copy, modify, and pressure-test the idea directly.&lt;/p&gt;

&lt;p&gt;And this specific idea is worth pressure-testing. If multimodal models keep getting larger without getting more grounded, they will keep producing confident spatial mistakes. A framework built around points and boxes is a direct attempt to fix that at the level where the mistake starts.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the DeepSeek-PKU-Tsinghua Collaboration Signals
&lt;/h2&gt;

&lt;p&gt;36Kr says the work was developed in collaboration with &lt;strong&gt;Peking University&lt;/strong&gt; and &lt;strong&gt;Tsinghua University&lt;/strong&gt;. That does not just add prestige. It suggests a research direction.&lt;/p&gt;

&lt;p&gt;This looks like an effort to treat multimodal reasoning as a representation problem, not only a scale problem. Bigger models can help, but there is a different question underneath: &lt;em&gt;what internal units should a model think with when the task is visual and spatial?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;DeepSeek’s answer here is unusually explicit. Use primitives that map back to the image. Make reasoning legible in geometric terms. Reduce the amount of hidden translation from scene to text.&lt;/p&gt;

&lt;p&gt;That is a strong signal because it points away from the “just let the model narrate more” approach. If that collaboration keeps producing work in this vein, expect more systems that mix symbolic-looking structure with end-to-end multimodal learning.&lt;/p&gt;

&lt;p&gt;It is also a useful contrast with a lot of current multimodal product rhetoric. Plenty of systems claim to “understand images.” Far fewer specify the units of that understanding. DeepSeek, PKU, and Tsinghua are at least making a falsifiable bet: that &lt;strong&gt;visual primitives&lt;/strong&gt; are a better substrate for some reasoning tasks than free-form language.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;DeepSeek released an open-source &lt;strong&gt;visual reasoning&lt;/strong&gt; framework called &lt;strong&gt;Thinking with Visual Primitives&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The framework pushes a multimodal model to reason with &lt;strong&gt;point coordinates&lt;/strong&gt; and &lt;strong&gt;bounding boxes&lt;/strong&gt;, not just textual descriptions.&lt;/li&gt;
&lt;li&gt;That matters most for &lt;strong&gt;spatial reasoning&lt;/strong&gt;, where language often blurs position, size, and relative layout.&lt;/li&gt;
&lt;li&gt;The open-source release lets researchers test whether grounded representations improve multimodal performance in practice.&lt;/li&gt;
&lt;li&gt;The collaboration with &lt;strong&gt;Peking University&lt;/strong&gt; and &lt;strong&gt;Tsinghua University&lt;/strong&gt; signals serious interest in changing the representation layer, not just scaling model size.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://eu.36kr.com/en/p/3789208597372165" rel="noopener noreferrer"&gt;36Kr: DeepSeek releases Thinking with Visual Primitives&lt;/a&gt; — The main report on the framework, its open-source release, and the collaboration with Peking University and Tsinghua University.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://deepseek.com" rel="noopener noreferrer"&gt;DeepSeek&lt;/a&gt; — DeepSeek’s primary site, covering the company behind the release.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The open question is whether forcing models to think in points and boxes will stay useful as tasks get more abstract — or whether grounded primitives will turn out to be the missing layer multimodal systems needed all along.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2777" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>deepseek</category>
      <category>pekinguniversity</category>
      <category>tsinghuauniversity</category>
      <category>opensourceai</category>
    </item>
    <item>
      <title>The Research Map is Already Live, but the Methods Aren’t: Semantic Map Tool</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Thu, 30 Apr 2026 04:31:55 +0000</pubDate>
      <link>https://forem.com/simon_paxton/the-research-map-is-already-live-but-the-methods-arent-semantic-map-tool-4306</link>
      <guid>https://forem.com/simon_paxton/the-research-map-is-already-live-but-the-methods-arent-semantic-map-tool-4306</guid>
      <description>&lt;p&gt;The Global Research Space, a new &lt;strong&gt;semantic map tool&lt;/strong&gt;, is live now as a browser-based alpha that lets people explore 10 million research papers as if they were moving across a map. The public site is up at globalresearchspace.com, and the map view is currently labeled &lt;strong&gt;v0.2.0 alpha&lt;/strong&gt;, with a pan-and-zoom canvas showing floating topic labels spread across clustered regions.&lt;/p&gt;

&lt;p&gt;The confirmed record is unusually split across two places. The site itself shows a working product and the map interface; the methodology mostly comes from an April 30 Reddit launch post by the creator, who said the system uses the latest 10 million papers from OpenAlex and turns them into “semantic neighborhoods” for browsing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What The Global Research Space Actually Is
&lt;/h2&gt;

&lt;p&gt;The homepage describes The Global Research Space in one sentence: &lt;strong&gt;“Explore the landscape of the latest research.”&lt;/strong&gt; Click through to the map and you get a large pan-and-zoom interface with floating topic labels and clustered regions, not a normal search-results page.&lt;/p&gt;

&lt;p&gt;That difference matters. A standard paper search tool starts with a query box and returns a ranked list. This &lt;strong&gt;semantic map tool&lt;/strong&gt; starts with position: papers and topics appear to be arranged near related work, so browsing means moving through adjacent areas rather than reformulating keywords over and over.&lt;/p&gt;

&lt;p&gt;OpenAlex is a plausible substrate for this. OpenAlex says it is a &lt;strong&gt;“map of the world’s research network”&lt;/strong&gt; and links works, authors, institutions, journals, topics, and more, with hourly updates. Its 2026 roadmap says the database now contains &lt;strong&gt;477 million works&lt;/strong&gt;, which makes a 10 million-paper slice both substantial and clearly a subset.&lt;/p&gt;

&lt;p&gt;On the live map, the directly visible product state is simple: a full-screen map surface, labeled regions, a search-oriented interface around the canvas, and the alpha version badge. That is enough to verify the core claim that this is a working &lt;strong&gt;research paper map&lt;/strong&gt;, not just a mockup or concept page.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the semantic map tool is built
&lt;/h2&gt;

&lt;p&gt;The current pipeline description is &lt;strong&gt;creator-reported&lt;/strong&gt;, not documented on an official methods page. In the April 30 Reddit launch post, the creator said the system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sourced the &lt;strong&gt;latest 10 million papers from OpenAlex&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;generated embeddings using &lt;strong&gt;SPECTER 2&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;used &lt;strong&gt;titles and abstracts&lt;/strong&gt; as input&lt;/li&gt;
&lt;li&gt;reduced dimensionality with &lt;strong&gt;UMAP&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;applied &lt;strong&gt;Voronoi partitioning on density peaks&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;generated floating labels with a custom labeling pipeline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a pretty specific recipe. It is also mostly single-sourced.&lt;/p&gt;
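
&lt;p&gt;To see what that recipe implies mechanically, here is a minimal sketch of the reported stages. It is an illustration under assumptions, not the product’s code: the checkpoint name, pooling choice, grid size, and peak heuristic are guesses, and the real pipeline runs over millions of OpenAlex records rather than mock embeddings.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch of the creator-reported pipeline: SPECTER 2 embeddings of title+abstract,
# UMAP to 2D, then Voronoi cells around density peaks. Details are assumptions.
import numpy as np
import torch
import umap
from transformers import AutoTokenizer, AutoModel
from scipy.spatial import Voronoi

tok = AutoTokenizer.from_pretrained("allenai/specter2_base")
enc = AutoModel.from_pretrained("allenai/specter2_base")

def embed(papers):
    """SPECTER-style embedding of title [SEP] abstract (the creator-reported input)."""
    texts = [p["title"] + tok.sep_token + p.get("abstract", "") for p in papers]
    batch = tok(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = enc(**batch).last_hidden_state[:, 0, :]  # CLS-token pooling
    return out.numpy()

print(embed([{"title": "UMAP: Uniform Manifold Approximation and Projection",
              "abstract": "A general-purpose dimension reduction technique."}]).shape)

# In practice this step would cover ~10M OpenAlex records; mock vectors stand in here.
emb = np.random.default_rng(0).normal(size=(2000, 768)).astype("float32")
coords = umap.UMAP(n_components=2, metric="cosine", random_state=0).fit_transform(emb)

# One plausible reading of "Voronoi partitioning on density peaks": bin the 2D map,
# treat high-density bins as peaks, and carve the plane into cells around them.
hist, xe, ye = np.histogram2d(coords[:, 0], coords[:, 1], bins=50)
peak_bins = np.argwhere(hist &gt;= np.percentile(hist[hist &gt; 0], 95))
peaks = np.column_stack([
    (xe[peak_bins[:, 0]] + xe[peak_bins[:, 0] + 1]) / 2,
    (ye[peak_bins[:, 1]] + ye[peak_bins[:, 1] + 1]) / 2,
])
regions = Voronoi(peaks)  # each cell is a candidate "semantic neighborhood"
print(len(peaks), "candidate neighborhoods")
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Even a sketch like this shows why the undocumented choices matter: the neighborhoods depend heavily on the peak heuristic, the bin size, and the UMAP settings, which is exactly the kind of detail an official methods page would need to pin down.&lt;/p&gt;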

&lt;p&gt;Here is the current evidence split for the &lt;strong&gt;semantic map tool&lt;/strong&gt; pipeline:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Claim&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;The Global Research Space exists and is publicly accessible&lt;/td&gt;
&lt;td&gt;Confirmed&lt;/td&gt;
&lt;td&gt;Product homepage and map page&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The map page is labeled v0.2.0 alpha&lt;/td&gt;
&lt;td&gt;Confirmed&lt;/td&gt;
&lt;td&gt;Live map page&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;It uses the latest 10M papers from OpenAlex&lt;/td&gt;
&lt;td&gt;Creator-reported&lt;/td&gt;
&lt;td&gt;Reddit launch post&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;It uses SPECTER 2 on titles and abstracts&lt;/td&gt;
&lt;td&gt;Creator-reported&lt;/td&gt;
&lt;td&gt;Reddit launch post&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;It uses UMAP&lt;/td&gt;
&lt;td&gt;Creator-reported&lt;/td&gt;
&lt;td&gt;Reddit launch post&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;It uses Voronoi partitioning on density peaks&lt;/td&gt;
&lt;td&gt;Creator-reported&lt;/td&gt;
&lt;td&gt;Reddit launch post&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Labels are custom and still a work in progress&lt;/td&gt;
&lt;td&gt;Creator-reported&lt;/td&gt;
&lt;td&gt;Reddit launch post&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code is not open source&lt;/td&gt;
&lt;td&gt;Creator-reported&lt;/td&gt;
&lt;td&gt;Reddit comments&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;There is one more useful signal in that launch post: the creator engaged directly with questions about clustering choices. In one exchange, they said they had not considered HDBSCAN and might explore a hybrid. That does not invalidate the current method. It does show the pipeline is still being worked out in public, which fits the alpha label on the site.&lt;/p&gt;

&lt;h2&gt;
  
  
  What users get from semantic search and analytics
&lt;/h2&gt;

&lt;p&gt;Two things are visible from the live product state and public copy. First, the interface is built around map navigation rather than a flat results page. Second, the creator says the product supports &lt;strong&gt;keyword and semantic queries&lt;/strong&gt; plus analytics for institutions, authors, and topics.&lt;/p&gt;

&lt;p&gt;The first part is confirmed by direct observation. The second part is currently creator-reported and only lightly documented on the accessible public pages.&lt;/p&gt;

&lt;p&gt;That distinction matters. We can verify that users are being invited to navigate a spatial interface for &lt;strong&gt;scientific literature navigation&lt;/strong&gt;. We cannot yet verify, from public methods documentation, exactly how the analytics are calculated or how semantic retrieval quality compares with a standard search engine.&lt;/p&gt;

&lt;p&gt;Implication, with caveat: this kind of &lt;strong&gt;semantic paper map&lt;/strong&gt; is most useful when a researcher knows the area loosely but not the exact keywords. In fast-moving domains, terminology drifts. A spatial interface can help surface nearby work that uses different language for similar ideas. That is a plausible benefit, and it matches the interaction design, but the public site does not yet provide benchmarks showing how often it beats conventional search. For related context on evaluating ML papers in messy, fast-changing domains, see our piece on &lt;a href="https://novaknown.com/2026/04/09/empirical-research-in-machine-learning/" rel="noopener noreferrer"&gt;empirical research in machine learning&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Query Box to Spatial Browsing
&lt;/h2&gt;

&lt;p&gt;What this launch demonstrates, clearly and directly, is that a live product can turn literature discovery into movement across a terrain. That is the concrete shift here.&lt;/p&gt;

&lt;p&gt;The broader pattern did not start with this project. Research mapping interfaces have shown up before in forms like the ArXiv Machine Learning Landscape and other topic-atlas style explorers. What makes The Global Research Space interesting is the combination of scale, public access, and upstream infrastructure: an &lt;strong&gt;OpenAlex map&lt;/strong&gt;-style scholarly graph underneath, then an interface that treats papers as neighborhoods instead of list items.&lt;/p&gt;

&lt;p&gt;That is a meaningful product decision. Search boxes are good when users know the term they want. Spatial browsing is better for orientation, adjacency, and “what sits next to this?” exploration. If you want a broader framing for why these systems feel different once the underlying graph is rich enough, our piece on &lt;a href="https://novaknown.com/2026/02/04/what-is-gardening-in-cryptography-and-how-does-it-work/" rel="noopener noreferrer"&gt;how discovery systems work&lt;/a&gt; covers that pattern from another angle.&lt;/p&gt;

&lt;p&gt;There is also a very current AI-tooling dynamic here: the interface is shipping before the methods are properly documented. Users can test whether the neighborhoods feel sensible right now. Outsiders still cannot fully audit how those neighborhoods were produced. That split is not unusual anymore, but it is especially important for a &lt;strong&gt;paper discovery tool&lt;/strong&gt; that may influence what researchers read and miss.&lt;/p&gt;

&lt;h2&gt;
  
  
  Documentation Gaps in the Alpha Release
&lt;/h2&gt;

&lt;p&gt;The biggest missing piece is an official methods page. The live product does not currently provide a detailed public explanation of corpus selection, refresh cadence, embedding infrastructure, clustering evaluation, label generation, or ranking methodology for institutions and authors.&lt;/p&gt;

&lt;p&gt;Several important questions are still open:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What counts as the “latest” 10 million papers?&lt;/strong&gt; No publication-date cutoff was publicly documented in the accessible pages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How often is the map refreshed?&lt;/strong&gt; OpenAlex updates hourly, but that does not mean this product does.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How are rankings calculated?&lt;/strong&gt; The site references analytics, but not the exact formulas.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How good is the semantic retrieval?&lt;/strong&gt; No public benchmark compares it with standard academic search.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How stable are the neighborhoods and labels?&lt;/strong&gt; The creator described the labeling as a work in progress.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The alpha label on the product is doing real work here. So is the fact that the code is reportedly not open source.&lt;/p&gt;

&lt;p&gt;The confirmed picture is narrow but solid: the site is live, the map interface is public, and the alpha badge is visible. The creator-reported picture is more detailed: OpenAlex as source data, SPECTER 2 embeddings, UMAP, Voronoi-based partitioning, and custom labels. What still lacks public documentation is the evaluation layer — refresh timing, ranking formulas, retrieval quality, and evidence that the mapped neighborhoods are stable and useful across real research workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Global Research Space is a live alpha product&lt;/strong&gt; that lets users browse research as a pan-and-zoom map rather than a list of search results.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The core pipeline details are mostly creator-reported&lt;/strong&gt;, not formally documented on the product site: OpenAlex source data, SPECTER 2 embeddings, UMAP, and Voronoi-based semantic neighborhoods.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAlex is a credible upstream corpus&lt;/strong&gt; with 477 million indexed works, making a 10 million-paper slice plausible but still partial.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The interface changes the discovery workflow&lt;/strong&gt; from keyword lookup to spatial exploration, which can be useful for surveying unfamiliar or fast-moving fields.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What is still missing is evaluation and documentation&lt;/strong&gt;: refresh cadence, ranking methods, retrieval quality, and clustering choices are not yet publicly specified in depth.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://globalresearchspace.com/space" rel="noopener noreferrer"&gt;The Global Research Space map&lt;/a&gt; — Live product page showing the alpha map interface and current product state.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://globalresearchspace.com/" rel="noopener noreferrer"&gt;The Global Research Space homepage&lt;/a&gt; — Canonical homepage for the product with its short public description.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://help.openalex.org/hc/en-us/articles/28932712154391-How-does-OpenAlex-work" rel="noopener noreferrer"&gt;OpenAlex help: How does OpenAlex work?&lt;/a&gt; — OpenAlex’s explanation of its research graph and update cadence.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://blog.openalex.org/openalex-2026-roadmap/" rel="noopener noreferrer"&gt;OpenAlex 2026 roadmap&lt;/a&gt; — Scale context for the underlying corpus, including the 477 million works figure.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2772" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>openalex</category>
      <category>specter</category>
      <category>umap</category>
      <category>reddit</category>
    </item>
    <item>
      <title>Compute Anxiety, Not Collapse: OpenAI Revenue 2026</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Wed, 29 Apr 2026 21:44:49 +0000</pubDate>
      <link>https://forem.com/simon_paxton/compute-anxiety-not-collapse-openai-revenue-2026-1di1</link>
      <guid>https://forem.com/simon_paxton/compute-anxiety-not-collapse-openai-revenue-2026-1di1</guid>
      <description>&lt;p&gt;OpenAI revenue 2026 is under a real pressure test. In the last 30 days, the dominant story has been a Reuters report, citing the Wall Street Journal, that OpenAI fell short of internal revenue and user targets while wrestling with the cost of future compute commitments.&lt;/p&gt;

&lt;p&gt;That has been easy to turn into a collapse narrative. The actual record is less dramatic and more interesting: OpenAI is missing some targets, rewriting key economics with Microsoft, and pushing customers through fast product and API changes at the same time. That is what a company under strain looks like when it is still trying to buy more paths to growth, not what visible free fall looks like.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why OpenAI Is Under Pressure Now
&lt;/h2&gt;

&lt;p&gt;The pressure is simple: &lt;strong&gt;huge compute bills, very high growth expectations, and more competition in coding and enterprise AI&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Reuters, via the WSJ report, said OpenAI missed multiple monthly revenue targets earlier this year and also fell short of an internal goal of reaching &lt;strong&gt;1 billion weekly active users by the end of 2025&lt;/strong&gt;. The same report said CFO Sarah Friar had expressed concern internally about whether revenue growth would keep pace with future computing contracts.&lt;/p&gt;

&lt;p&gt;OpenAI leadership directly pushed back. Sam Altman and Friar told Reuters:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“This is ridiculous. We are totally aligned on buying as much compute as we can and working hard on it together every day.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That leaves the core picture intact even if you discount the strongest infighting angle. Reuters and the WSJ report a mismatch between OpenAI’s growth goals and its future compute commitments. If those are the internal benchmarks, missing them makes the economics tighter.&lt;/p&gt;

&lt;p&gt;That compute curve is expensive everywhere, not just at OpenAI. We’ve already seen the broader constraints in power gear, utility approvals, and datacenter buildouts in &lt;a href="https://novaknown.com/2026/04/19/ai-datacenter-spending/" rel="noopener noreferrer"&gt;AI datacenter spending&lt;/a&gt;. If compute supply is tight and demand is still climbing, revenue misses matter more because the infrastructure plan does not get cheaper.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Revenue Miss Reports Actually Say
&lt;/h2&gt;

&lt;p&gt;Reuters, citing the Wall Street Journal, reported four concrete things: &lt;strong&gt;OpenAI missed multiple monthly revenue targets in early 2026, missed an internal goal of 1 billion weekly active users by the end of 2025, saw Sarah Friar raise concern about whether growth would cover future compute contracts, and got a direct denial from Altman and Friar that they were split over buying that compute&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here’s the clean version:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Reported issue&lt;/th&gt;
&lt;th&gt;What was claimed&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Revenue targets&lt;/td&gt;
&lt;td&gt;OpenAI missed multiple monthly revenue targets earlier in 2026&lt;/td&gt;
&lt;td&gt;Reported by Reuters citing WSJ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User targets&lt;/td&gt;
&lt;td&gt;OpenAI missed an internal goal of 1 billion weekly active users by end of 2025&lt;/td&gt;
&lt;td&gt;Reported by Reuters citing WSJ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compute anxiety&lt;/td&gt;
&lt;td&gt;Sarah Friar reportedly raised concern about affording future compute contracts if growth lagged&lt;/td&gt;
&lt;td&gt;Reported by Reuters citing WSJ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Internal split&lt;/td&gt;
&lt;td&gt;Altman and Friar denied misalignment over compute buying&lt;/td&gt;
&lt;td&gt;Direct statement to Reuters&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That is serious. But it is also the kind of miss you get when a company sets targets that assume continued hypergrowth and then has to fund infrastructure to match.&lt;/p&gt;

&lt;p&gt;The missed-target story also sits awkwardly next to OpenAI’s own product output. In late April alone, OpenAI’s release log shows &lt;strong&gt;GPT-5.5&lt;/strong&gt;, &lt;strong&gt;workspace agents in ChatGPT&lt;/strong&gt;, &lt;strong&gt;ChatGPT Images 2.0&lt;/strong&gt;, &lt;strong&gt;Codex for (almost) everything&lt;/strong&gt;, &lt;strong&gt;Agents SDK updates&lt;/strong&gt;, and distribution expansion to AWS. Companies in obvious operational collapse do not usually ship like that.&lt;/p&gt;

&lt;p&gt;The better reading is that &lt;strong&gt;OpenAI revenue 2026&lt;/strong&gt; is running into the gap between internal expectations and the cost base required to chase them. That is a pressure test.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Microsoft Rewrite Changes the Stakes
&lt;/h2&gt;

&lt;p&gt;The Microsoft deal rewrite is the most important structural change in this story.&lt;/p&gt;

&lt;p&gt;Axios reported that the revised agreement gives Microsoft a &lt;strong&gt;non-exclusive license&lt;/strong&gt; to OpenAI technology, lets OpenAI &lt;strong&gt;sell models across multiple clouds&lt;/strong&gt;, &lt;strong&gt;caps Microsoft’s share of OpenAI revenue&lt;/strong&gt;, and removes the old &lt;strong&gt;AGI-trigger provision&lt;/strong&gt; that had been hanging over the partnership. Those are concrete changes, not mood.&lt;/p&gt;

&lt;p&gt;In compact form, the reported before-and-after looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;License structure:&lt;/strong&gt; Microsoft’s access is now reported as &lt;strong&gt;non-exclusive&lt;/strong&gt;, rather than functionally tied to a tighter exclusive relationship.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud distribution:&lt;/strong&gt; OpenAI can now sell through &lt;strong&gt;multiple clouds&lt;/strong&gt;, not just through the old Microsoft-centered route.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Revenue sharing:&lt;/strong&gt; Axios reported a &lt;strong&gt;cap on Microsoft’s share of OpenAI revenue&lt;/strong&gt;, which matters directly for margins.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Control clause:&lt;/strong&gt; The reported &lt;strong&gt;removal of the AGI-trigger provision&lt;/strong&gt; reduces one of the strangest contractual constraints in the industry.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On its face, that changes two things at once: &lt;strong&gt;distribution flexibility&lt;/strong&gt; and &lt;strong&gt;economic leakage&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The bearish read is obvious. If you are under revenue pressure, getting out from under exclusivity and a richer revenue share looks like a move to recover margin and open more sales channels quickly.&lt;/p&gt;

&lt;p&gt;The stronger read is that this is exactly what OpenAI should have wanted anyway. If your product is becoming core infrastructure, being trapped in one cloud is a tax on growth. OpenAI’s own April 28 release — “OpenAI models, Codex, and Managed Agents come to AWS” — shows how fast that new freedom can be used.&lt;/p&gt;

&lt;p&gt;That matters for &lt;strong&gt;OpenAI revenue 2026&lt;/strong&gt; because every point of revenue no longer forced through the old Microsoft structure has better odds of sticking. It also matters for the competitive picture. If customers want multi-cloud procurement, sovereign hosting options, or simply leverage in vendor negotiations, OpenAI is now in a better position to meet them.&lt;/p&gt;

&lt;p&gt;There is a wider pattern here too. As discussed in &lt;a href="https://novaknown.com/2026/04/12/open-source-ai-revenue/" rel="noopener noreferrer"&gt;open-source AI revenue&lt;/a&gt;, the economics are moving away from simple model access and toward distribution, integration, and control over where workloads run. The Microsoft rewrite is OpenAI adjusting to that reality in public.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Developers and Customers Feel on the Ground
&lt;/h2&gt;

&lt;p&gt;The cleanest practitioner-level signal is not bankruptcy gossip. It is &lt;strong&gt;platform churn&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;OpenAI’s developer documentation says the &lt;strong&gt;Responses API represents the future direction&lt;/strong&gt; for building agents. It also says the &lt;strong&gt;Assistants API was deprecated on August 26, 2025, with a sunset date of August 26, 2026&lt;/strong&gt;. For developers, that means migration work, changed abstractions, and another round of updating tooling around the platform’s preferred architecture.&lt;/p&gt;
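
&lt;p&gt;For a sense of the surface teams are migrating toward, here is a minimal Responses API call. It is a sketch, not a migration guide, and the model identifier below is taken from the release names above rather than from a verified model list.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Minimal Responses API call (Python SDK). The model string is an assumption.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.responses.create(
    model="gpt-5.5",
    input="Summarize the open migration tasks for our support assistant.",
)
print(resp.output_text)
&lt;/code&gt;&lt;/pre&gt;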

&lt;p&gt;That kind of churn is expensive in a very boring way. Teams have prompts, evals, agent logic, tool integrations, and monitoring built around one API surface. When the center of gravity moves, they have to move too.&lt;/p&gt;

&lt;p&gt;Consumer-facing model turnover adds another layer. OpenAI help materials say &lt;strong&gt;GPT-4o and additional models were deprecated in ChatGPT on February 13, 2026&lt;/strong&gt;, while remaining available in the API. So end users can see abrupt product changes even when developers still have access underneath.&lt;/p&gt;

&lt;p&gt;Short version: customers do not experience &lt;strong&gt;OpenAI revenue 2026&lt;/strong&gt; as a finance chart. They experience it as model retirements, new defaults, migration deadlines, and the need to retest workflows after every major release. The migration guide and deprecation notices make the cost transfer visible: when OpenAI changes the platform surface, customers absorb the integration and testing work.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI’s Harder Growth Phase
&lt;/h2&gt;

&lt;p&gt;The evidence points to a harder growth phase.&lt;/p&gt;

&lt;p&gt;OpenAI’s own posture has shifted toward public justification of flexibility. In “Our principles,” published April 26, Altman said OpenAI would be transparent about when its operating principles change and emphasized iterative deployment in the face of uncertainty. That is a company preparing users, partners, and regulators for more course corrections.&lt;/p&gt;

&lt;p&gt;At the same time, it is still expanding. Product cadence stayed dense through late April. OpenAI pushed into AWS, government sales through FedRAMP, and cybersecurity positioning. Those are not the actions of a firm openly retrenching.&lt;/p&gt;

&lt;p&gt;The tension is the story. OpenAI missed at least some internal targets, according to Reuters’ account of the WSJ report. It is carrying massive compute ambition into a market where infrastructure is constrained, pricing pressure is real, and rivals like Anthropic and Google are taking slices of demand. So it is doing what pressured growth companies do: rewriting partnerships, broadening distribution, shipping relentlessly, and making customers absorb more churn.&lt;/p&gt;

&lt;p&gt;The practical question for &lt;a href="https://novaknown.com/2026/03/06/openai-revenue-2026/" rel="noopener noreferrer"&gt;OpenAI revenue 2026&lt;/a&gt; is whether the new cloud flexibility buys enough monetization headroom before the next round of compute bills hits. Right now, the evidence says pressure, not free fall.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Reuters, citing the WSJ, reported that OpenAI missed multiple internal revenue targets and an internal goal of 1 billion weekly active users by end-2025.&lt;/li&gt;
&lt;li&gt;OpenAI leadership denied any internal split over compute buying, calling that framing “ridiculous.”&lt;/li&gt;
&lt;li&gt;The Microsoft rewrite gives OpenAI more cloud flexibility and reportedly caps Microsoft’s revenue share, which could improve OpenAI’s economics.&lt;/li&gt;
&lt;li&gt;OpenAI’s late-April release cadence was unusually dense, which cuts against a simple free-fall narrative.&lt;/li&gt;
&lt;li&gt;Developers and customers are feeling the pressure through API migrations, deprecations, and faster platform churn.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://openai.com/news/product-releases/" rel="noopener noreferrer"&gt;OpenAI Product Releases&lt;/a&gt; — OpenAI’s official release log for GPT-5.5, workspace agents, AWS distribution, and other late-April launches.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://openai.com/index/our-principles/" rel="noopener noreferrer"&gt;Our principles&lt;/a&gt; — Sam Altman’s April 2026 post on OpenAI’s evolving operating principles and iterative deployment stance.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://developers.openai.com/api/docs/guides/migrate-to-responses" rel="noopener noreferrer"&gt;Migrating to the Responses API&lt;/a&gt; — OpenAI’s developer guide explaining the platform shift and Assistants API sunset.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://m.investing.com/news/stock-market-news/openai-falls-short-of-revenue-and-user-targets-as-it-races-toward-ipo-wsj-reports-4640229?ampMode=1" rel="noopener noreferrer"&gt;OpenAI falls short of revenue and user targets, Reuters via WSJ&lt;/a&gt; — Syndicated report on internal target misses, compute concerns, and leadership’s denial.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.axios.com/2026/04/28/openai-microsoft-cloud-amazon" rel="noopener noreferrer"&gt;OpenAI rewrites Microsoft deal to use more clouds&lt;/a&gt; — Axios reporting on the revised partnership, multi-cloud access, and revenue-sharing changes.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2769" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>openai</category>
      <category>microsoft</category>
      <category>chatgpt</category>
      <category>reuters</category>
    </item>
    <item>
      <title>10,000 Members, 1 Tight Script: Santana Mine Supporters</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Tue, 28 Apr 2026 04:57:11 +0000</pubDate>
      <link>https://forem.com/simon_paxton/10000-members-1-tight-script-santana-mine-supporters-5ehn</link>
      <guid>https://forem.com/simon_paxton/10000-members-1-tight-script-santana-mine-supporters-5ehn</guid>
      <description>&lt;p&gt;The Facebook group &lt;strong&gt;Santana Mine Supporters&lt;/strong&gt; appeared to show broad grassroots backing for a proposed Central Otago gold mine. But the reporting dataset behind this story — a Playwright scrape of &lt;strong&gt;9,890 member tiles, 208 posts, 50 comments, and 327 profile samples&lt;/strong&gt;, captured from the public Facebook surface — found a much stranger pattern: membership arrived in huge batches near launch, the admin team includes people who do not appear local to the affected area, and one admin lists work at a New Zealand virtual admin business.&lt;/p&gt;

&lt;p&gt;The group matters because Santana Minerals is seeking approvals tied to its Bendigo-Ophir gold project in New Zealand, and a Facebook constituency of nearly 10,000 people can look politically useful. What the public evidence shows, from that scrape and from New Zealand business records checked for the admin connection, is not proof of intentional deception. It is, however, a tidy case study in how a &lt;strong&gt;Facebook astroturf group&lt;/strong&gt; can look convincing from the outside while leaving obvious operational fingerprints on the surface.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Santana Mine Supporters group was built
&lt;/h2&gt;

&lt;p&gt;The reporting scrape captured &lt;strong&gt;9,890 of roughly 9,908 members&lt;/strong&gt; — about &lt;strong&gt;99.8% coverage&lt;/strong&gt; — along with Facebook’s relative join-date labels for each account. The pattern is the opposite of what you’d expect from a support group that grew gradually around a live local issue.&lt;/p&gt;

&lt;p&gt;Here is the membership distribution from the scrape:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Facebook join label&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;th&gt;% of group&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Joined about 3 months ago&lt;/td&gt;
&lt;td&gt;4,893&lt;/td&gt;
&lt;td&gt;49.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Joined about 2 months ago&lt;/td&gt;
&lt;td&gt;2,269&lt;/td&gt;
&lt;td&gt;22.9%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Joined about a month ago&lt;/td&gt;
&lt;td&gt;972&lt;/td&gt;
&lt;td&gt;9.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Joined about 2 weeks ago&lt;/td&gt;
&lt;td&gt;587&lt;/td&gt;
&lt;td&gt;5.9%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Joined this week&lt;/td&gt;
&lt;td&gt;304&lt;/td&gt;
&lt;td&gt;3.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Joined this month, named day&lt;/td&gt;
&lt;td&gt;597&lt;/td&gt;
&lt;td&gt;6.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Joined within the last 24 hours&lt;/td&gt;
&lt;td&gt;71&lt;/td&gt;
&lt;td&gt;0.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Other / unknown&lt;/td&gt;
&lt;td&gt;197&lt;/td&gt;
&lt;td&gt;2.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Nearly three quarters of the group — 72.4% — joined in the first two large cohorts.&lt;/strong&gt; If the group were mostly organic, you’d expect a long tail: older members, gradual accumulation, and a distribution that reflects ongoing local attention. The scrape instead shows a launch spike first, then a much smaller trickle.&lt;/p&gt;

&lt;p&gt;That does not by itself prove fake members. A campaign can absolutely drive a big early signup burst. But this particular burst is so concentrated that it suggests deliberate bulk onboarding, not a constituency slowly finding one another.&lt;/p&gt;

&lt;p&gt;Facebook only exposes &lt;strong&gt;relative join labels&lt;/strong&gt;, not exact timestamps, so the method has limits. You cannot reconstruct the exact day-by-day curve, and “about 3 months ago” compresses a range of join times into one bucket. But for cohort analysis, that granularity is still enough. If half the group lands in one oldest visible bucket and another quarter lands in the next one, you do not need exact timestamps to see that &lt;strong&gt;Santana Mine Supporters&lt;/strong&gt; was assembled in a few large waves rather than accumulated steadily.&lt;/p&gt;
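
&lt;p&gt;The cohort tally itself is easy to reproduce from a scrape like this. A minimal sketch, assuming each member tile was saved as a dictionary with Facebook’s relative join label in a &lt;code&gt;joined&lt;/code&gt; field (the field name is an assumption about the scrape schema):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from collections import Counter

# One dict per scraped member tile; labels are Facebook's relative join strings.
members = [
    {"name": "member_0001", "joined": "Joined about 3 months ago"},
    {"name": "member_0002", "joined": "Joined about 2 weeks ago"},
    {"name": "member_0003", "joined": "Joined about 3 months ago"},
]

counts = Counter(m["joined"] for m in members)
total = sum(counts.values())
for label, n in counts.most_common():
    print(f"{label:30s} {n:6d}  {100 * n / total:5.1f}%")
&lt;/code&gt;&lt;/pre&gt;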

&lt;p&gt;The group’s timing also matters. No member tile shows a join label older than about 90 days, even though the public campaign around the mine predates the group itself. On the scrape evidence, &lt;strong&gt;Santana Mine Supporters&lt;/strong&gt; looks less like a community that formed around a mine debate and more like a communications asset that was stood up for this phase of the permit fight.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the admin team looks outsourced
&lt;/h2&gt;

&lt;p&gt;The admin roster visible on Facebook is small: &lt;strong&gt;5 admins, 0 moderators&lt;/strong&gt;. For a supposedly broad-based local support group of this size, that means control is concentrated in a handful of accounts.&lt;/p&gt;

&lt;p&gt;The scrape identified the following publicly visible admin details:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Admin&lt;/th&gt;
&lt;th&gt;Visible location&lt;/th&gt;
&lt;th&gt;Notable detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;John Wekking NZ&lt;/td&gt;
&lt;td&gt;Cromwell, NZ&lt;/td&gt;
&lt;td&gt;Local to the area&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Brett Nicol&lt;/td&gt;
&lt;td&gt;Wanaka, NZ&lt;/td&gt;
&lt;td&gt;Local to the area&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Paul Bright&lt;/td&gt;
&lt;td&gt;Taupo, NZ&lt;/td&gt;
&lt;td&gt;Roughly 1,000 km from the mine area; listed as “Admin CEO at Devon Street Property Limited”; produced 10.1% of all posts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Karen Sweatman&lt;/td&gt;
&lt;td&gt;No location shown&lt;/td&gt;
&lt;td&gt;Lists work at &lt;strong&gt;The Admin Superstar&lt;/strong&gt;; joined about 2 weeks before becoming admin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jackie Finnie&lt;/td&gt;
&lt;td&gt;No location shown&lt;/td&gt;
&lt;td&gt;Lists work as “Office Admin”; joined about a month ago&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Two admins look local. Three do not obviously look like local supporters. The strongest signal is &lt;strong&gt;Karen Sweatman&lt;/strong&gt;, because the claim here is not just based on a suggestive job title. The Admin Superstar’s public business presence describes it as a virtual assistant and business-support operation, and the New Zealand business record checked during reporting shows it as a registered New Zealand business. In other words: the “outsourced admin” reading is tied to a real public business record, not a vague Facebook self-description.&lt;/p&gt;

&lt;p&gt;That matters because of the timing. On the scrape data, Sweatman appears to have joined the group roughly &lt;strong&gt;two weeks&lt;/strong&gt; before becoming an admin. A recent joiner who publicly lists work at a virtual admin business is not strong evidence of local civic enthusiasm. It is strong evidence of someone being brought in to help run the page.&lt;/p&gt;

&lt;p&gt;There are innocent explanations. A campaign can hire admin help to handle volume. Someone can volunteer while also working in outsourced admin. But the visible configuration here points toward coordinated communications work. The group does not just have active organizers; on the public record, it has the staffing pattern of a small public-facing PR operation.&lt;/p&gt;

&lt;p&gt;That changes what the member count means. A local group with 9,000 people and local admins suggests one thing. A group with bulk-join cohorts and an admin bench that includes outsourced admin labor suggests something else: constituency as presentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The profile fingerprints that suggest sockpuppets
&lt;/h2&gt;

&lt;p&gt;The strongest account-level evidence in &lt;strong&gt;Santana Mine Supporters&lt;/strong&gt; comes from a &lt;strong&gt;327-profile sample&lt;/strong&gt;: &lt;strong&gt;127 active posters or commenters&lt;/strong&gt; and &lt;strong&gt;200 randomly sampled silent members&lt;/strong&gt;. For each profile, the scrape recorded visible fields like location, work, school, friend count, profile photo presence, locked status, and whether the profile showed any visible activity.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;sockpuppet account&lt;/strong&gt; is a profile used to create the appearance of independent support. You usually do not catch it with one clue. You catch it with stacks of weak clues that line up.&lt;/p&gt;

&lt;p&gt;Here are the main signals in the sample recorded from public profile surfaces:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;th&gt;% of sample&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;No workplace listed&lt;/td&gt;
&lt;td&gt;325&lt;/td&gt;
&lt;td&gt;99.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No school/education listed&lt;/td&gt;
&lt;td&gt;324&lt;/td&gt;
&lt;td&gt;99.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No historical year mentioned anywhere on profile&lt;/td&gt;
&lt;td&gt;231&lt;/td&gt;
&lt;td&gt;70.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No activity on own profile&lt;/td&gt;
&lt;td&gt;135&lt;/td&gt;
&lt;td&gt;41.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No location at all&lt;/td&gt;
&lt;td&gt;134&lt;/td&gt;
&lt;td&gt;41.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Profile is privacy-locked&lt;/td&gt;
&lt;td&gt;112&lt;/td&gt;
&lt;td&gt;34.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No cover photo&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;15.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Only year visible is 2026&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;2.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;None of these, alone, proves anything. Plenty of real people barely use Facebook, list nothing, and lock their profiles down. But the pattern here is cumulative.&lt;/p&gt;

&lt;p&gt;The key methodological detail is the threshold. The &lt;strong&gt;“about 1 in 5”&lt;/strong&gt; sockpuppet-style figure was not derived from any one signal above. It came from counting profiles that matched a multi-signal shell pattern: accounts with several of the high-risk traits at once — for example no workplace, no school, no location, no visible profile activity, and privacy locking or similarly minimal surface completeness. In the reporting dataset, the cutoff was a pre-defined &lt;strong&gt;cluster threshold rather than a single red flag&lt;/strong&gt;: accounts had to stack multiple shell-like signals before being counted in the sockpuppet-style bucket.&lt;/p&gt;
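
&lt;p&gt;In code, a multi-signal threshold like that is just a score over boolean traits. A minimal sketch, with field names and a cutoff chosen for illustration rather than taken from the reporting dataset:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative shell scoring; the trait names and cutoff are assumptions.
SHELL_SIGNALS = ["no_workplace", "no_school", "no_location",
                 "no_profile_activity", "privacy_locked"]

def shell_score(profile):
    """Count how many shell-like traits a scraped profile shows at once."""
    return sum(1 for signal in SHELL_SIGNALS if profile.get(signal, False))

def looks_like_shell(profile, threshold=4):
    """Bucket a profile as sockpuppet-style only when several signals stack up."""
    return shell_score(profile) &gt;= threshold

sample = {"no_workplace": True, "no_school": True, "no_location": True,
          "no_profile_activity": True, "privacy_locked": False}
print(shell_score(sample), looks_like_shell(sample))  # 4 True
&lt;/code&gt;&lt;/pre&gt;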

&lt;p&gt;That is a better method than cherry-picking one weird field, but it is still circumstantial. A privacy-conscious real user and a cheaply prepared shell can look similar from the outside. The point is not that every sparse account is fake. The point is that &lt;strong&gt;Santana Mine Supporters&lt;/strong&gt; contains a meaningful fraction of accounts whose visible profile surfaces are sparse in the same way, at the same time, inside the same support group.&lt;/p&gt;

&lt;p&gt;The split between active and silent members matters too. Silent members were more likely to look like empty shells, while active posters and commenters were somewhat more complete on average. That is what you would expect if some accounts existed mainly to inflate apparent support while a smaller subset handled visible engagement.&lt;/p&gt;

&lt;h2&gt;
  
  
  Locality and attribution limits
&lt;/h2&gt;

&lt;p&gt;The public evidence establishes four things.&lt;/p&gt;

&lt;p&gt;First, &lt;strong&gt;Santana Mine Supporters&lt;/strong&gt; did not grow in the pattern you’d expect from a long-running organic community. The bulk-join cohorts are visible and quantifiable in the member scrape.&lt;/p&gt;

&lt;p&gt;Second, the group is run by an admin team that does not read as purely local. The presence of a recent joiner tied to a virtual admin business is especially hard to square with the idea that this is just neighbors gathering themselves.&lt;/p&gt;

&lt;p&gt;Third, the member base includes a meaningful share of low-completeness profile shells that fit the usual sockpuppet profile. Not all of them are fake. Enough of them look fake that the group’s headline size stops being trustworthy as a measure of local public sentiment.&lt;/p&gt;

&lt;p&gt;Fourth, only a small minority of members are &lt;strong&gt;demonstrably local&lt;/strong&gt; to the affected area. The &lt;strong&gt;6%&lt;/strong&gt; figure in the reporting dataset comes from conservative classification using only explicit public signals: profiles that listed a local place name in or near the mine-affected region, or otherwise exposed clear location fields linking them to that area. Members with no visible location, ambiguous locations, or only broader New Zealand identifiers were &lt;strong&gt;not&lt;/strong&gt; counted as local. In other words, the 6% is not “everyone who might be local.” It is the share that could be verified as local from public profile data at scrape time.&lt;/p&gt;
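
&lt;p&gt;A minimal sketch of that conservative check, with an illustrative place list rather than the exact one used in the dataset:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Count a member as "demonstrably local" only when a visible location field
# names a place in or near the affected area. The place list is illustrative.
LOCAL_PLACES = {"cromwell", "bendigo", "ophir", "tarras", "wanaka", "alexandra", "clyde"}

def demonstrably_local(profile):
    location = (profile.get("location") or "").lower()
    return any(place in location for place in LOCAL_PLACES)

members = [
    {"location": "Cromwell, New Zealand"},   # counted as local
    {"location": ""},                        # no visible location: not counted
    {"location": "Auckland, New Zealand"},   # in NZ but not local: not counted
]
share = sum(demonstrably_local(m) for m in members) / len(members)
print(f"{share:.0%} demonstrably local in this toy sample")
&lt;/code&gt;&lt;/pre&gt;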

&lt;p&gt;That method has obvious limits. Many real people do not list a location, and Facebook profile surfaces vary by privacy setting. So the locality figure is best read as a floor on visible local membership, not a complete census. But that floor is still revealing: if a nearly 10,000-member support group can publicly verify only a thin slice of local ties, its headline size is doing more rhetorical work than evidentiary work.&lt;/p&gt;

&lt;p&gt;What the evidence does &lt;strong&gt;not&lt;/strong&gt; prove is intent. A stronger attribution case would require evidence such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;internal messages or campaign instructions&lt;/li&gt;
&lt;li&gt;payment records linking admin services to group operations&lt;/li&gt;
&lt;li&gt;creation-time metadata from Facebook&lt;/li&gt;
&lt;li&gt;repeated reuse of the same accounts across multiple advocacy groups&lt;/li&gt;
&lt;li&gt;IP, device, or login overlap that only the platform could see&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the line between forensic public evidence and a completed attribution case. On the public evidence alone, the most defensible conclusion is that &lt;strong&gt;Santana Mine Supporters&lt;/strong&gt; shows the visible fingerprints of a manufactured support operation. The only thing missing is the invoice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Santana Mine Supporters&lt;/strong&gt; grew in two huge early cohorts, with &lt;strong&gt;72.4% of members joining in the first two months&lt;/strong&gt;, not through a long organic buildup.&lt;/li&gt;
&lt;li&gt;The admin team includes &lt;strong&gt;non-local accounts&lt;/strong&gt; and one admin who lists work at &lt;strong&gt;The Admin Superstar&lt;/strong&gt;, a New Zealand outsourced admin business.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;327-profile sample&lt;/strong&gt; found many low-completeness accounts, with roughly &lt;strong&gt;1 in 5 showing a sockpuppet-style fingerprint&lt;/strong&gt; based on a multi-signal threshold, not any single trait.&lt;/li&gt;
&lt;li&gt;The visible-locality measure was conservative: only profiles with explicit local identifiers were counted, and that produced a figure of about &lt;strong&gt;6% demonstrably local&lt;/strong&gt; members.&lt;/li&gt;
&lt;li&gt;The evidence is &lt;strong&gt;strongly circumstantial&lt;/strong&gt;, not definitive proof of deception; proving intent would require platform or financial records beyond the public Facebook surface.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.facebook.com/" rel="noopener noreferrer"&gt;Santana Mine Supporters on Facebook&lt;/a&gt; — Public group surface used for the member-list, join-label, admin-roster, post, comment, and profile observations described in the reporting dataset.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.business.govt.nz/" rel="noopener noreferrer"&gt;The Admin Superstar business listing on New Zealand Business.govt.nz&lt;/a&gt; — Public business record used to verify that the named admin business exists in New Zealand.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://companies-register.companiesoffice.govt.nz/" rel="noopener noreferrer"&gt;New Zealand Companies Register entry&lt;/a&gt; — Canonical company-record source checked for registration and officer details related to the outsourced-admin connection.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.santanaminerals.com/" rel="noopener noreferrer"&gt;Santana Minerals&lt;/a&gt; — Company source for the Bendigo-Ophir project and the permit context that gives the Facebook group political relevance.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.facebook.com/" rel="noopener noreferrer"&gt;Reporting dataset / reproducibility materials&lt;/a&gt; — Underlying scrape methodology referenced here: member tiles, posts, comments, and sampled profile observations.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2764" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>facebook</category>
      <category>santanaminerals</category>
      <category>newzealand</category>
      <category>companiesregister</category>
    </item>
    <item>
      <title>125 Words, No Account Cues: AI Identifies Writer From Style</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Mon, 27 Apr 2026 04:34:18 +0000</pubDate>
      <link>https://forem.com/simon_paxton/125-words-no-account-cues-ai-identifies-writer-from-style-200b</link>
      <guid>https://forem.com/simon_paxton/125-words-no-account-cues-ai-identifies-writer-from-style-200b</guid>
      <description>&lt;p&gt;Anthropic’s Claude Opus 4.7 reportedly identified journalist Kelsey Piper from &lt;strong&gt;125 words of unpublished text&lt;/strong&gt;, and the details of her test are why this has landed so hard. In Piper’s account, the model named her not from account history or a saved chat, but from prose she says had never been published.&lt;/p&gt;

&lt;p&gt;That makes the interesting claim bigger than “Claude guessed a journalist.” If &lt;strong&gt;AI identifies writer&lt;/strong&gt; from text alone, anonymity stops being just a browser, account, or IP problem. It becomes a &lt;strong&gt;stylometric fingerprinting&lt;/strong&gt; problem — a writing-style problem — where the signal is in the prose itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Claude Opus 4.7 Identified Kelsey Piper
&lt;/h2&gt;

&lt;p&gt;Piper’s report in &lt;em&gt;The Argument&lt;/em&gt; is the core evidence here. She says Claude Opus 4.7 took a 125-word excerpt from an unpublished political column and answered that the likeliest author was &lt;strong&gt;Kelsey Piper&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;She then tried to remove obvious alternative explanations. She says she ran the prompt in &lt;strong&gt;Incognito Mode&lt;/strong&gt;, with &lt;strong&gt;memory disabled&lt;/strong&gt;, then repeated it on &lt;strong&gt;a friend’s computer&lt;/strong&gt;, and then &lt;strong&gt;through the API&lt;/strong&gt;. Each step is aimed at stripping away a different clue: account context, browser state, local machine history, and some ordinary web tracking routes.&lt;/p&gt;

&lt;p&gt;She also says she changed the genre. According to Piper, Claude still named her from unpublished writing outside her normal public beat, including a school progress report about her child and a movie review. That matters because topic is the laziest route to writer identification. If you write a lot about policy, a model can cheat by inferring the pool of likely authors from subject matter alone.&lt;/p&gt;

&lt;p&gt;ChatGPT and Gemini reportedly did not match Claude on her test. Piper says ChatGPT guessed &lt;strong&gt;Matt Yglesias&lt;/strong&gt; and Gemini guessed &lt;strong&gt;Scott Alexander&lt;/strong&gt; on the initial sample. That is still anecdotal, but it’s a useful comparison: the same text, different models, different result.&lt;/p&gt;

&lt;p&gt;Anthropic has &lt;strong&gt;not&lt;/strong&gt; documented “identify the author of this text” as a product feature. Its release post for Opus 4.7 and model page position the model around coding, agentic work, document analysis, and complex tasks, not authorship attribution. So this is not a vendor-announced capability. It is an externally reported behavior from a single prominent self-test.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the Test Matters for Anonymous Writing
&lt;/h2&gt;

&lt;p&gt;The stakes are not mainly “an AI can name a famous columnist.” The real problem is &lt;strong&gt;cross-account deanonymization&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A pseudonymous writer often tries to separate identities by separating accounts, devices, and contexts. That is classic privacy hygiene. But if &lt;strong&gt;AI identifies writer&lt;/strong&gt; from the text itself, those controls stop being the whole game. A model does not need your login if your sentence rhythm, punctuation habits, favorite transitions, and word choices are enough.&lt;/p&gt;

&lt;p&gt;That creates concrete risks for three groups in particular:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Journalists&lt;/strong&gt; sharing notes, drafts, or source material with AI systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Whistleblowers&lt;/strong&gt; trying to communicate anonymously across platforms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pseudonymous writers&lt;/strong&gt; who keep public and private identities separate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The mechanism is simple. An adversary does not need one perfect “this is definitely Jane Doe” answer. They need a tool that can reliably say &lt;strong&gt;these two anonymous accounts are probably the same person&lt;/strong&gt;. Linking identities is often enough.&lt;/p&gt;

&lt;p&gt;That is why this story sits next to broader privacy questions around AI tools. If you are already thinking about whether your prompts stay private in products like &lt;a href="https://novaknown.com/2026/04/20/claude-enterprise-privacy/" rel="noopener noreferrer"&gt;Claude Enterprise privacy&lt;/a&gt; or whether extensions leak extra data as in &lt;a href="https://novaknown.com/2026/04/02/chatgpt-extension-privacy/" rel="noopener noreferrer"&gt;ChatGPT extension privacy&lt;/a&gt;, this adds another layer: even a well-contained prompt may still reveal the author through style.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Stylometric Fingerprinting Can and Cannot Do
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Stylometric fingerprinting&lt;/strong&gt; is the practice of identifying authors from patterns in how they write. This is older than LLMs. Forensic linguistics has used it for years.&lt;/p&gt;

&lt;p&gt;The underlying signals are usually mundane and unconscious:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sentence length and pacing&lt;/li&gt;
&lt;li&gt;punctuation habits&lt;/li&gt;
&lt;li&gt;transition words&lt;/li&gt;
&lt;li&gt;preferred phrasing&lt;/li&gt;
&lt;li&gt;syntactic patterns&lt;/li&gt;
&lt;li&gt;how often someone uses abstraction versus concrete nouns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A frontier model changes the interface, not the idea. Instead of training a narrow classifier on a fixed corpus, you can now ask a general model to reason over style directly, compare it against learned examples in its training data, and produce a ranked guess. That makes writer identification far more accessible.&lt;/p&gt;
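
&lt;p&gt;The classic version is easy to sketch. The features below are the traditional, hand-built kind, shown only to make those signals concrete; they say nothing about how Claude actually reached its answer:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import re
from collections import Counter

# Toy stylometric features: sentence length, punctuation rates, and a few
# function-word frequencies per 1,000 words. Classic approach, not Claude's.
FUNCTION_WORDS = ["the", "of", "and", "but", "however", "really", "just"]

def stylometric_features(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    per_k = 1000 / max(len(words), 1)
    counts = Counter(words)
    feats = {
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "commas_per_1k": text.count(",") * per_k,
        "semicolons_per_1k": text.count(";") * per_k,
    }
    feats.update({w + "_per_1k": counts[w] * per_k for w in FUNCTION_WORDS})
    return feats

print(stylometric_features("Honestly, this is fine; really, it is. But the details matter."))
&lt;/code&gt;&lt;/pre&gt;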

&lt;p&gt;But there are limits.&lt;/p&gt;

&lt;p&gt;First, Piper’s result is still &lt;strong&gt;not independently replicated at scale&lt;/strong&gt;. One strong anecdote is not a benchmark. The Washington Post’s Megan McArdle reported similar self-tests on her own unpublished writing, which suggests Piper may not be a one-off, but that is still anecdotal evidence rather than a controlled study.&lt;/p&gt;

&lt;p&gt;Second, famous writers are easier targets. A journalist with a large public corpus gives the model more to compare against than an ordinary private person. Claude identifying Kelsey Piper does not automatically mean it can identify any random office worker from 125 words.&lt;/p&gt;

&lt;p&gt;Third, author attribution can be directionally useful without being forensically reliable. A model that over-guesses a known writer, or narrows the field to a handful of likely candidates, can still be dangerous. Security tools do not need courtroom certainty to create real risk.&lt;/p&gt;

&lt;p&gt;That uncertainty is exactly why this belongs with other &lt;a href="https://novaknown.com/2026/04/24/llm-failure-modes/" rel="noopener noreferrer"&gt;LLM failure modes&lt;/a&gt;. Models can be weirdly strong at one task, brittle at another, and overconfident throughout. “It guessed a name” is not enough by itself. The interesting part is the test design and the pattern across repeated attempts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Risk: Linking Anonymous Accounts Across Text
&lt;/h2&gt;

&lt;p&gt;The deanonymization problem is bigger than naming celebrities. It is about &lt;strong&gt;linkage&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Imagine two newsletter accounts, a private Discord identity, and an anonymized tip sent to a reporter. They use different emails, different browsers, maybe even different devices. If their prose carries the same statistical signature, a strong model can treat them as one trail.&lt;/p&gt;

&lt;p&gt;That changes what “anonymous text privacy” means in practice. The vulnerable unit is no longer just the account. It is the style.&lt;/p&gt;

&lt;p&gt;A useful way to think about it is voice recognition for writing. Not perfect. Not universal. But often good enough. A model might fail to say “this is definitely Kelsey Piper” and still succeed at “these four texts were probably written by the same person.” For whistleblowers, that can be enough to collapse the wall between safe and unsafe identities.&lt;/p&gt;

&lt;p&gt;There is also an asymmetry here. Anthropic’s public materials describe Opus 4.7 as strong at document work and analysis. Piper’s result, plus the model comparison she reported, hints that Claude Opus 4.7 may currently be &lt;strong&gt;better at reading prose than rival models&lt;/strong&gt; in this specific sense — spotting latent structure in writing style. That is not a formal benchmark result, but it fits the observed behavior better than the simpler alternatives she tried to eliminate.&lt;/p&gt;

&lt;p&gt;The next obvious step is independent testing: blinded samples, larger author pools, repeated trials, and same-text comparisons across models. Until then, Piper’s experiment is best treated as a &lt;strong&gt;strong anecdotal demonstration&lt;/strong&gt; of something people in stylometry have long argued: your writing voice is not just expressive. It is identifying.&lt;/p&gt;
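
&lt;p&gt;What that independent testing could look like, as a hypothetical scaffold: a blinded, repeated-trial loop over a pool of authors, where &lt;code&gt;ask_model&lt;/code&gt; stands in for whichever model is being evaluated. Nothing here corresponds to an existing benchmark or to any vendor’s API.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Hypothetical blinded-trial scaffold for writer identification.
# ask_model is a placeholder callable for whichever model is under test;
# this is not an existing benchmark and not any vendor's API.
import random

def run_blinded_trials(samples, ask_model, trials=100, seed=0):
    """samples: dict mapping true author name to a list of unpublished texts."""
    rng = random.Random(seed)
    authors = sorted(samples)
    correct = 0
    for _ in range(trials):
        true_author = rng.choice(authors)
        text = rng.choice(samples[true_author])
        # The model sees only the text and the candidate pool, never the label.
        guess = ask_model(text=text, candidates=authors)
        correct += int(guess == true_author)
    return correct / trials
&lt;/code&gt;&lt;/pre&gt;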

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Kelsey Piper reported that &lt;strong&gt;Claude Opus 4.7&lt;/strong&gt; named her from &lt;strong&gt;125 words of unpublished text&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Her test tried to remove account, browser, device, and topic cues by using incognito mode, a friend’s computer, the API, and off-genre samples.&lt;/li&gt;
&lt;li&gt;Anthropic does &lt;strong&gt;not&lt;/strong&gt; document writer identification as a product feature; the evidence so far is external and anecdotal.&lt;/li&gt;
&lt;li&gt;The main risk is not celebrity recognition but &lt;strong&gt;cross-account deanonymization&lt;/strong&gt; for journalists, whistleblowers, and pseudonymous writers.&lt;/li&gt;
&lt;li&gt;Stylometric fingerprinting is an established idea, but Claude’s apparent performance here still needs independent replication.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/news/claude-opus-4-7" rel="noopener noreferrer"&gt;Introducing Claude Opus 4.7&lt;/a&gt; — Anthropic’s official release post covering Opus 4.7’s launch, availability, pricing, and safeguards.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/claude/opus?pubDate=20260410" rel="noopener noreferrer"&gt;Claude Opus 4.7 model page&lt;/a&gt; — Anthropic’s product page describing intended use cases and positioning for the model.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.theargumentmag.com/p/i-can-never-talk-to-an-ai-anonymously" rel="noopener noreferrer"&gt;I can never talk to an AI anonymously again&lt;/a&gt; — Kelsey Piper’s first-person account of Claude Opus 4.7 identifying her from unpublished writing.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://boingboing.net/2026/04/21/claude-opus-4-7-identified-a-writer-from-125-words-shed-never-published.html" rel="noopener noreferrer"&gt;Claude Opus 4.7 identified a writer from 125 words she'd never published&lt;/a&gt; — Secondary reporting summarizing Piper’s experiment and its privacy implications.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.washingtonpost.com/opinions/interactive/2026/04/26/artificial-intelligence-could-kill-anonymity-online/" rel="noopener noreferrer"&gt;Artificial intelligence could kill anonymity online&lt;/a&gt; — Washington Post opinion piece extending the deanonymization argument with similar self-tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The open question is how many words, from how many people, a frontier model really needs before anonymous writing stops being meaningfully anonymous.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2761" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>anthropic</category>
      <category>claude</category>
      <category>kelseypiper</category>
      <category>theargument</category>
    </item>
    <item>
      <title>A Formula From Another Field Opened an Erdős Problem</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Mon, 27 Apr 2026 04:31:38 +0000</pubDate>
      <link>https://forem.com/simon_paxton/a-formula-from-another-field-opened-erdos-problem-2mgn</link>
      <guid>https://forem.com/simon_paxton/a-formula-from-another-field-opened-erdos-problem-2mgn</guid>
      <description>&lt;p&gt;Erdős problem #1196 now has a serious claimed solution, and the evidence ladder is unusually visible. Liam Price posted GPT-5.4 Pro output to erdosproblems.com; &lt;em&gt;Scientific American&lt;/em&gt; reports that Terence Tao and Jared Lichtman said the opening move looked new for this problem; an 8-page note organizing the argument now exists; and a Lean formalization repository claims a machine-checked proof. The theorem claim and the proof artifacts are public. The novelty of the opening is still best described as an &lt;strong&gt;expert assessment&lt;/strong&gt;, not a settled historical fact.&lt;/p&gt;

&lt;p&gt;That is the interesting part. Not that an amateur suddenly outran the field, but that a general-purpose model may have made move one differently. Tao’s description, in &lt;em&gt;Scientific American&lt;/em&gt;, is the load-bearing fact: researchers had converged on a standard opening for this Erdős problem, and the model instead reached for a formula from a related area of math that nobody had been trying here.&lt;/p&gt;

&lt;h2&gt;
  
  
  What AI Actually Solved in Erdős Problem #1196
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The problem is about primitive sets and a weighted sum Erdős defined for them.&lt;/strong&gt; A primitive set is a set of positive integers where no element divides any other. All primes form a primitive set, but many non-prime examples exist too.&lt;/p&gt;

&lt;p&gt;Erdős Problem #1196 asks whether every primitive set made only of sufficiently large numbers obeys a universal upper bound for its Erdős sum. More concretely, the claimed result bounds the weighted sum over any primitive set using weights proportional to &lt;strong&gt;1/(n log n)&lt;/strong&gt;, as long as every element of the set is at least &lt;strong&gt;x&lt;/strong&gt;. When the later Lean repository says the set is supported on &lt;strong&gt;[x, ∞)&lt;/strong&gt;, that just means &lt;strong&gt;every number in the set is at least x&lt;/strong&gt;.&lt;/p&gt;
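
&lt;p&gt;In symbols, and staying at the level of detail the reporting supports, the quantity in question looks like the following sketch; this is the setup only, not the note’s exact statement.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;% Sketch of the setup only; the precise bound lives in the 8-page note and the Lean repo.
f(A) \;=\; \sum_{n \in A} \frac{1}{n \log n},
\qquad A \text{ primitive}, \qquad n \ge x \text{ for every } n \in A.
% The claimed theorem: f(A) stays below the stated upper bound for every such A.
&lt;/code&gt;&lt;/pre&gt;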

&lt;p&gt;That is more specific than “AI solved a math problem.” The theorem is a quantitative statement about all primitive sets above a threshold, not a one-off construction or a numerical experiment.&lt;/p&gt;

&lt;p&gt;The problem was not ignored. &lt;em&gt;Scientific American&lt;/em&gt; reports that it had eluded prominent mathematicians, and Tao’s quote there is tighter: &lt;strong&gt;“people did look at it.”&lt;/strong&gt; That matters because it rules out the easiest fake-AI-breakthrough story, where a model stumbles into a neglected exercise nobody serious cared about.&lt;/p&gt;

&lt;p&gt;The current status is stronger than a forum post. There is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a public thread on erdosproblems.com where the proof was posted and refined,&lt;/li&gt;
&lt;li&gt;an 8-page write-up organizing the argument,&lt;/li&gt;
&lt;li&gt;and a Lean repository claiming a formalization of the result.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Formalization does not mean the model wrote a correct proof end to end. It means the final theorem was translated into &lt;strong&gt;Lean&lt;/strong&gt;, a proof assistant that checks each logical step once humans make those steps explicit enough.&lt;/p&gt;
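
&lt;p&gt;To make “machine-checked” concrete, here is a toy Lean 4 fragment in the spirit of the problem. It is illustrative only: the names are invented for this sketch, and none of it is code from the Math, Inc. repository.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- Toy illustration only; not code from the Math, Inc. repository.
-- A primitive set, phrased as a predicate on the naturals:
-- no element divides a different element of the set.
def Primitive (A : Nat → Prop) : Prop :=
  ∀ a b, A a → A b → a ≠ b → ¬ (a ∣ b)

-- Lean accepts a statement only when every step checks. This trivial example
-- compiles; replace the proof term with something wrong and Lean rejects it.
example (n : Nat) : n + 0 = n := Nat.add_zero n
&lt;/code&gt;&lt;/pre&gt;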

&lt;h2&gt;
  
  
  Why the amateur mattered less than the model’s first move
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Liam Price’s role was prompting, posting, and surfacing the result.&lt;/strong&gt; The potentially novel mathematical step is what experts attributed to the model.&lt;/p&gt;

&lt;p&gt;This gets blurred in headlines. According to &lt;em&gt;Scientific American&lt;/em&gt;, Price is a 23-year-old with no advanced math training, and the claimed solution began with a single prompt to GPT-5.4 Pro. If the story were simply “an amateur solved a hard open problem,” the right default reaction would be skepticism.&lt;/p&gt;

&lt;p&gt;Instead, Tao and Lichtman focused on something narrower. Tao said previous researchers had a &lt;strong&gt;standard sequence of moves&lt;/strong&gt; they usually started with. The model did not follow that sequence. It applied a formula already known in a related part of mathematics to this primitive-sets question.&lt;/p&gt;

&lt;p&gt;That difference is the whole story. The important claim is not that ChatGPT became a mathematician. It is that a general-purpose model may have proposed a first step specialists had systematically not been trying on this Erdős problem.&lt;/p&gt;

&lt;p&gt;Tao’s public wiki on &lt;strong&gt;AI contributions to Erdős problems&lt;/strong&gt; is useful context here because it is cautious, not promotional. It notes selection bias, provisional assessments, and cases where AI-assisted work later turned out to be wrong. So this result got attention &lt;em&gt;despite&lt;/em&gt; a skeptical backdrop, not because mathematicians have started lowering the bar for AI math headlines.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the proof moved across subfields
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The mechanism was not “the model reasoned perfectly.” It was “the model tried a different route.”&lt;/strong&gt; Staying within what the reporting supports, that route was to apply a formula already known in a related area of math to this primitive-sets problem, rather than following the standard sequence of moves earlier researchers used. The public sources do not spell out the exact formula in enough detail to name it more precisely here, which is why the description stays at this level.&lt;/p&gt;

&lt;p&gt;From there, the process was procedural, not magical:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Who did it&lt;/th&gt;
&lt;th&gt;What happened&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial prompt&lt;/td&gt;
&lt;td&gt;Liam Price&lt;/td&gt;
&lt;td&gt;Submitted the problem to GPT-5.4 Pro&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;First proof attempt&lt;/td&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;Produced a rough proof containing the nonstandard opening move&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Expert evaluation&lt;/td&gt;
&lt;td&gt;Jared Lichtman, Terence Tao, others&lt;/td&gt;
&lt;td&gt;Checked whether that move could actually support the theorem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Proof cleanup&lt;/td&gt;
&lt;td&gt;Human mathematicians&lt;/td&gt;
&lt;td&gt;Rewrote, shortened, and organized the argument&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Formal verification&lt;/td&gt;
&lt;td&gt;Math, Inc. Lean repo&lt;/td&gt;
&lt;td&gt;Encoded the theorem as a machine-checked proof artifact&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That middle phase is the part most AI headlines skip. Lichtman told &lt;em&gt;Scientific American&lt;/em&gt; that the raw ChatGPT output was &lt;strong&gt;“actually quite poor”&lt;/strong&gt; and that experts had to &lt;strong&gt;“sift through and actually understand what it was trying to say.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So the result was not a polished theorem-proof package dropped out of a chatbot. It was a messy draft with one promising move inside it, followed by human interpretation, proof repair, and later formalization.&lt;/p&gt;

&lt;p&gt;That chronology also explains why this looks different from previous &lt;strong&gt;AI math breakthrough&lt;/strong&gt; claims around Erdős problems. The public record here includes the original posting, mathematician commentary, a cleaned-up note, and a Lean artifact. You can watch the proof becoming legible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Breadth-Stuck Problems and Cross-Subfield Search
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The evidence here supports a narrower conclusion than “LLMs can do frontier math.”&lt;/strong&gt; It supports the claim that a model can sometimes help when a problem is stuck because everyone keeps opening the same way.&lt;/p&gt;

&lt;p&gt;Tao’s wiki is the reason to keep that conclusion narrow. It explicitly says the list is not a benchmark, warns that assessments are provisional, and tracks incorrect claims too. So Erdős problem #1196 is not proof that general-purpose models are now reliable theorem provers.&lt;/p&gt;

&lt;p&gt;What it does show is a workflow that looks plausible and testable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a model proposes an off-path opening,&lt;/li&gt;
&lt;li&gt;experts decide whether that opening contains a real idea,&lt;/li&gt;
&lt;li&gt;humans rebuild the argument into mathematical form,&lt;/li&gt;
&lt;li&gt;and a proof assistant can later verify the final structure.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a very specific capability: &lt;strong&gt;broad analogical search across subfields&lt;/strong&gt;, followed by expert cleanup and formal verification. On this evidence, that is the part worth taking seriously.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Erdős problem #1196&lt;/strong&gt; concerns primitive sets and a weighted sum bound, not a generic “AI solved math” stunt.&lt;/li&gt;
&lt;li&gt;The visible evidence chain is public: Price’s post, mathematician commentary, an 8-page note, and a Lean formalization repository.&lt;/li&gt;
&lt;li&gt;Liam Price surfaced the result, but Tao and Lichtman’s reported view is that the important step was the model’s &lt;strong&gt;nonstandard opening move&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The raw ChatGPT output was, in Lichtman’s words, &lt;strong&gt;“actually quite poor,”&lt;/strong&gt; which makes this a story about expert cleanup and verification, not autonomous theorem proving.&lt;/li&gt;
&lt;li&gt;This case shows that a model can contribute a novel opening move, but only after expert interpretation and later formal verification does that become a legitimate mathematical result.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.scientificamerican.com/article/amateur-armed-with-chatgpt-vibe-maths-a-60-year-old-problem/" rel="noopener noreferrer"&gt;Amateur armed with ChatGPT 'vibe-maths' a 60-year-old problem&lt;/a&gt; — Primary reporting on Liam Price, Tao, Lichtman, and why mathematicians took the proof route seriously.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/teorth/erdosproblems/wiki/AI-contributions-to-Erd%C5%91s-problems" rel="noopener noreferrer"&gt;AI contributions to Erdős problems&lt;/a&gt; — Tao’s tracking wiki, with explicit caveats about provisional assessments, selection bias, and incorrect AI-assisted claims.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/math-inc/Erdos1196/tree/main" rel="noopener noreferrer"&gt;Primitive Sets Above x in Lean&lt;/a&gt; — Repository claiming a Lean formalization of the theorem for Erdős Problem #1196.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.erdosproblems.com/forum/thread/1196" rel="noopener noreferrer"&gt;Erdős Problem #1196 discussion thread&lt;/a&gt; — The forum thread where the proof was posted, discussed, and refined.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2758" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>chatgpt</category>
      <category>terencetao</category>
      <category>erdos</category>
      <category>lean</category>
    </item>
    <item>
      <title>302 Designs, 16 Hits: AI-Designed Viruses in the Lab</title>
      <dc:creator>Simon Paxton</dc:creator>
      <pubDate>Sun, 26 Apr 2026 21:31:23 +0000</pubDate>
      <link>https://forem.com/simon_paxton/302-designs-16-hits-ai-designed-viruses-in-the-lab-16do</link>
      <guid>https://forem.com/simon_paxton/302-designs-16-hits-ai-designed-viruses-in-the-lab-16do</guid>
      <description>&lt;p&gt;&lt;strong&gt;AI-designed viruses&lt;/strong&gt; are now a lab result, but not in the way the viral posts made it sound. Researchers affiliated with Stanford, Arc Institute, and UC Berkeley used a specialized &lt;strong&gt;genome language model&lt;/strong&gt; called &lt;strong&gt;Evo&lt;/strong&gt; to generate bacteriophage genomes, then tested them experimentally. According to Nature and Semafor’s reporting on the September 2025 preprint, the team made &lt;strong&gt;302 designs and 16 of them infected E. coli&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That is the verified core of the story. These were &lt;strong&gt;bacteriophages&lt;/strong&gt;—viruses that infect bacteria—not human viruses, and the system was not a consumer chatbot improvising bioweapons. The result matters anyway because it is a concrete test of whether sequence models can search biological design space and occasionally land on something that works in the lab.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Stanford’s AI-designed viruses actually were
&lt;/h2&gt;

&lt;p&gt;The model here was &lt;strong&gt;Evo&lt;/strong&gt;, which Stanford described in December 2024 as “a generative AI model that writes genetic code.” Stanford said Evo was trained on &lt;strong&gt;80,000 microbes and 2.7 million prokaryotic and phage genomes&lt;/strong&gt;, covering &lt;strong&gt;300 billion nucleotides&lt;/strong&gt;. Arc Institute called it a biological foundation model trained on DNA at scale.&lt;/p&gt;

&lt;p&gt;That training setup matters because it explains what kind of system this was. Evo is not a general-purpose assistant with some biology knowledge taped on. It is a sequence model trained directly on genomes, built to generate and score DNA.&lt;/p&gt;
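
&lt;p&gt;For intuition about what “generate and score DNA” means for a sequence model, here is a toy bigram sketch over the four nucleotides. It has nothing to do with Evo’s architecture or API; it only illustrates the two operations named above, assigning a likelihood to a sequence and sampling new ones.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy nucleotide sequence model: illustrates "score" and "generate" only.
# This is not Evo's architecture or API; real genome language models are deep
# networks trained on billions of nucleotides, not bigram counts.
import math
import random
from collections import Counter

ALPHABET = "ACGT"

def train_bigram(sequences):
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts

def score(seq, counts):
    # Log-likelihood of a sequence under the bigram model, with add-one smoothing.
    totals = {a: sum(counts[a + b] for b in ALPHABET) + 4 for a in ALPHABET}
    return sum(math.log((counts[seq[i:i + 2]] + 1) / totals[seq[i]])
               for i in range(len(seq) - 1))

def generate(counts, length=30, seed=0):
    # Sample a new sequence one nucleotide at a time from the learned statistics.
    rng = random.Random(seed)
    seq = rng.choice(ALPHABET)
    for _ in range(length - 1):
        weights = [counts[seq[-1] + b] + 1 for b in ALPHABET]
        seq += rng.choices(ALPHABET, weights=weights)[0]
    return seq
&lt;/code&gt;&lt;/pre&gt;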

&lt;p&gt;In the later phage experiment, reported by Nature and Nature’s Daily Briefing, the researchers used the DNA of &lt;strong&gt;ΦX174&lt;/strong&gt;, a simple bacteriophage, as a guide for design. They generated candidate phage genomes intended to infect &lt;strong&gt;E. coli&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Nature and Stanford both describe these as bacteriophages targeting &lt;strong&gt;E. coli&lt;/strong&gt;, not human viruses.&lt;/p&gt;

&lt;p&gt;Stanford also said Evo’s training excluded &lt;strong&gt;viruses known to infect humans&lt;/strong&gt; and some other organisms, explicitly as a safeguard against bioweapon misuse. That does not erase dual-use concerns, but it does tell you the developers were not casually training a model on human-pathogen genomes and then seeing what happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why 302 designs produced only 16 working phages
&lt;/h2&gt;

&lt;p&gt;The headline number is &lt;strong&gt;302 designed phages, 16 functional phages&lt;/strong&gt;. Nature’s Daily Briefing reported that 16 could infect E. coli, and Semafor independently reported the same 302/16 figure.&lt;/p&gt;

&lt;p&gt;That is a &lt;strong&gt;5.3% hit rate&lt;/strong&gt;. For anyone used to reading AI launch copy, that number is refreshingly concrete.&lt;/p&gt;

&lt;p&gt;It also tells you what the system did &lt;em&gt;not&lt;/em&gt; do. Evo did not solve virology end to end. It searched a large design space, produced many candidates, and most failed.&lt;/p&gt;

&lt;p&gt;The likely failure points are biological, not rhetorical. A generated genome still has to survive synthesis, assembly, expression, protein folding, packaging, and infection dynamics before anyone can call it functional.&lt;/p&gt;

&lt;p&gt;Nature and Semafor’s reporting is what makes this more than an in-silico result: the candidates were synthesized and tested in the lab, and a subset actually infected &lt;strong&gt;E. coli&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Nature’s reporting adds an important practical result: combinations of the successful phages could kill &lt;strong&gt;three E. coli strains&lt;/strong&gt;, including strains the original &lt;strong&gt;ΦX174&lt;/strong&gt; could not kill. That is the therapy angle. The win here is not “AI created life.” The win is that a model-generated search process produced some antibacterial candidates with lab-validated activity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The novel protein claim needs a stricter reading
&lt;/h2&gt;

&lt;p&gt;The most dramatic version of this story says one AI-designed virus used “a protein that doesn’t exist in any known organism on Earth.” That wording is stronger than the accessible source base supports.&lt;/p&gt;

&lt;p&gt;Here the source status matters. &lt;strong&gt;Nature’s accessible coverage does not document that stronger wording&lt;/strong&gt;, and Stanford’s 2024 Evo explainer makes a broader claim that models like this may help researchers design new biological systems and proteins. That is not the same thing as verifying that a specific protein in this experiment exists nowhere in known life.&lt;/p&gt;

&lt;p&gt;The underlying reporting does support a narrower claim: at least one design appears to include a &lt;strong&gt;highly divergent&lt;/strong&gt; or apparently novel protein sequence associated with phage function. But the exact statement “does not exist in any known organism” is &lt;strong&gt;unverified from the accessible primary and high-quality sources here&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why is that too strong? Because &lt;strong&gt;sequence novelty is not the same as biological novelty&lt;/strong&gt;. A protein can be absent from current databases and still resemble known folds, motifs, or functions. Genomes in the wild are massively under-sampled. And even if the amino acid sequence is new, that does not automatically mean the structure or mechanism is unprecedented.&lt;/p&gt;

&lt;p&gt;So the right read is simpler. The experiment supports that the model produced &lt;strong&gt;functional phages with at least some substantially divergent sequence content&lt;/strong&gt;. It does not, from the reporting and source material available here, prove that Earth had never seen anything like that protein before.&lt;/p&gt;

&lt;p&gt;That narrower claim is still interesting. If a genome model can generate sequences far enough from known examples to look unusual and still function, then it is doing more than trivial memorization. It is exploring a real design space, with a low but nonzero lab success rate.&lt;/p&gt;

&lt;h2&gt;
  
  
  What AI-designed viruses mean for biosecurity and therapy
&lt;/h2&gt;

&lt;p&gt;The immediate upside is &lt;strong&gt;antibacterial phage therapy&lt;/strong&gt;. Drug-resistant bacteria are an obvious target because bacteriophages can be tailored to attack specific bacterial strains. If a model can help generate useful phage candidates faster than manual design or blind screening, that is a practical capability.&lt;/p&gt;

&lt;p&gt;The immediate downside is that the barrier to exploring viral design space may keep falling. Not because this experiment created human pathogens—it did not—but because it shows a sequence model can move from genome generation to occasional working biological artifacts. Biosafety teams care about demonstrated workflow compression, not just worst-case headlines.&lt;/p&gt;

&lt;p&gt;Stanford’s exclusion of human-infecting viruses from training is therefore one of the most important details in the whole story. Stanford presented that exclusion as a concrete safeguard against bioweapon misuse, and that is exactly why it will matter to &lt;strong&gt;biosafety teams evaluating training scope and misuse risk&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The bigger shift is methodological. &lt;strong&gt;AI-designed viruses&lt;/strong&gt; in this paper were not a one-shot act of machine creativity. They were the output of a pipeline: curated training data, constrained design around a known phage, synthesis, and experimental screening. With a &lt;strong&gt;5.3% hit rate&lt;/strong&gt; and a design process guided by &lt;strong&gt;ΦX174&lt;/strong&gt;, the result is both narrower than the headlines and more useful than the hype. Labs now have a proof point that genome language models can be used as search tools for biological engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI-designed viruses&lt;/strong&gt; in this case were &lt;strong&gt;bacteriophages&lt;/strong&gt;, not human viruses.&lt;/li&gt;
&lt;li&gt;The researchers used &lt;strong&gt;Evo&lt;/strong&gt;, a specialized &lt;strong&gt;genome language model&lt;/strong&gt; trained on microbial and phage genomes.&lt;/li&gt;
&lt;li&gt;The best-supported experimental result is &lt;strong&gt;302 generated phage designs, with 16 shown to infect E. coli&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The strongest novelty claim is about &lt;strong&gt;divergent functional sequences&lt;/strong&gt;, not a settled proof that a protein existed nowhere in known life.&lt;/li&gt;
&lt;li&gt;Stanford says Evo’s training excluded &lt;strong&gt;known human-infecting viruses&lt;/strong&gt;, a concrete biosafety measure that will matter to regulators and labs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://engineering.stanford.edu/news/welcome-evo-generative-ai-genome" rel="noopener noreferrer"&gt;Welcome Evo, generative AI for the genome&lt;/a&gt; — Stanford’s official explainer for Evo, including training scope and human-pathogen exclusions.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://arcinstitute.org/news/evo-science" rel="noopener noreferrer"&gt;Evo: Creating Generative AI for Genomes&lt;/a&gt; — Arc Institute’s overview of Evo as a biological foundation model.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.nature.com/articles/d41586-025-03055-y" rel="noopener noreferrer"&gt;World’s first AI-designed viruses a step towards AI-generated life&lt;/a&gt; — Nature’s news report on the preprint and what the experiment showed.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.nature.com/articles/d41586-025-03106-4" rel="noopener noreferrer"&gt;Nature Daily Briefing on AI-designed bacteriophages&lt;/a&gt; — The clearest accessible summary of the &lt;strong&gt;302 designs / 16 functional phages&lt;/strong&gt; result.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.semafor.com/article/09/21/2025/ai-designed-viruses-mark-step-toward-ai-generated-life" rel="noopener noreferrer"&gt;AI-designed viruses mark step toward AI-generated life&lt;/a&gt; — Independent reporting that corroborates the core figures and the E. coli targeting result.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://novaknown.com/?p=2755" rel="noopener noreferrer"&gt;novaknown.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>stanford</category>
      <category>nature</category>
      <category>evo</category>
      <category>ecoli</category>
    </item>
  </channel>
</rss>
