<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Benji Fisher</title>
    <description>The latest articles on Forem by Benji Fisher (@benjifisher).</description>
    <link>https://forem.com/benjifisher</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3787687%2F0c8176d8-b238-43f2-b0af-71689e955123.jpg</url>
      <title>Forem: Benji Fisher</title>
      <link>https://forem.com/benjifisher</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/benjifisher"/>
    <language>en</language>
    <item>
      <title>The UCP Technical Council Just Shipped Attribution into Core. Here's What That Means.</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Wed, 06 May 2026 07:43:57 +0000</pubDate>
      <link>https://forem.com/benjifisher/the-ucp-technical-council-just-shipped-attribution-into-core-heres-what-that-means-2cnh</link>
      <guid>https://forem.com/benjifisher/the-ucp-technical-council-just-shipped-attribution-into-core-heres-what-that-means-2cnh</guid>
      <description>&lt;p&gt;On &lt;strong&gt;May 5, 2026&lt;/strong&gt;, the UCP Technical Council merged &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/391" rel="noopener noreferrer"&gt;PR #391&lt;/a&gt; into the spec's &lt;code&gt;main&lt;/code&gt; branch — adding a top-level &lt;code&gt;attribution&lt;/code&gt; field to cart, checkout, catalog, and order operations. The field carries platform-emitted referral and conversion-event context: campaign IDs, click identifiers (&lt;code&gt;gclid&lt;/code&gt;, &lt;code&gt;fbclid&lt;/code&gt;, &lt;code&gt;ttclid&lt;/code&gt;), source/medium markers. Open string-keyed map. Universal across requests; not gated by capability negotiation.&lt;/p&gt;

&lt;p&gt;As UCP matures, attribution landing in core was always going to happen. Agentic commerce can't operate as commercial infrastructure without a path for advertising and measurement context to flow alongside the transactional data — and the longer that gap stayed open, the more pressure would have built for vendors to ship incompatible parallel solutions. The merge isn't the surprising part. &lt;strong&gt;The interesting part is the specific shape of what shipped, and what its presence in core tells us about where the spec is heading.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two things to dig into: the technical detail of the field itself, and the trajectory implication of advertising and measurement infrastructure landing in UCP core for the first time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What shipped
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;attribution&lt;/code&gt; field is structurally simple. From Grigorik's own example in the PR:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"attribution"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"campaign_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"18234567890"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"campaign_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"google"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"campaign_medium"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cpc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"campaign_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"spring_2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"gclid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"EAIaIQobChMI..."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No prescribed schema beyond "string-keyed object." Platforms populate it with whatever conventions they already use — GA4 campaign parameters, click identifiers, custom tracking keys. Businesses receive the data and process per their own analytics needs. UCP itself does &lt;strong&gt;not&lt;/strong&gt; prescribe attribution windows, models, or assignment logic. The protocol carries the data; attribution math happens downstream.&lt;/p&gt;

&lt;p&gt;The field appears in three roles across the request lifecycle:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Direction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;catalog&lt;/code&gt; (search, lookup)&lt;/td&gt;
&lt;td&gt;Platform-emitted input&lt;/td&gt;
&lt;td&gt;Platform → merchant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cart&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Platform-emitted input&lt;/td&gt;
&lt;td&gt;Platform → merchant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;checkout&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Platform-emitted input&lt;/td&gt;
&lt;td&gt;Platform → merchant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;order&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Business-emitted snapshot&lt;/td&gt;
&lt;td&gt;Merchant → platform&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The asymmetry matters. On catalog/cart/checkout, the platform writes attribution as it would write a UTM string into a browser URL — referral context flowing forward. On &lt;code&gt;order&lt;/code&gt;, the business preserves the originating attribution as a snapshot — closing the loop between agent-mediated conversion and the platform that produced it.&lt;/p&gt;

&lt;p&gt;Grigorik's framing in the PR is the cleanest one-line summary of intent: the field "carries the same parameters platforms communicate via URL query parameters in browser-based flows, in the same flat key-value form." Attribution in agent-mediated commerce is the agent counterpart of UTM strings. Same parameters, same model, different transport layer.&lt;/p&gt;

&lt;p&gt;Thirteen files changed. The core addition is &lt;code&gt;source/schemas/shopping/types/attribution.json&lt;/code&gt; — the new type definition. Schemas for cart, catalog_lookup, catalog_search, checkout, and order all gain the field as an optional property. Specification docs across cart, catalog, checkout, order, and the overview were updated to describe the field's purpose and semantics.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architectural decision: core field, not extension
&lt;/h2&gt;

&lt;p&gt;The substantively interesting part of this PR is not what got added. It's how it got added.&lt;/p&gt;

&lt;p&gt;PR #391 was Grigorik's alternative proposal to &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/295" rel="noopener noreferrer"&gt;PR #295&lt;/a&gt;, which James Andersen had opened earlier proposing an &lt;code&gt;event_context&lt;/code&gt; extension. Both proposals tried to solve the same problem — give platforms a way to pass referral/attribution data through to merchants in agent flows — but with very different architectural shapes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;#295 (Andersen, Meta):&lt;/strong&gt; Attribution as a &lt;strong&gt;structured extension&lt;/strong&gt;. Capability-negotiated. Validated against a defined schema. Standardised vocabulary across platforms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;#391 (Grigorik, Shopify):&lt;/strong&gt; Attribution as a &lt;strong&gt;top-level core field&lt;/strong&gt;. Open key-value map. No capability negotiation. Each platform uses its own conventions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Andersen formally approved Grigorik's alternative — &lt;em&gt;"thanks for finding a better home for attribution data than the original proposal"&lt;/em&gt; — and the rearchitecture went on to merge through TC discussion. That cross-vendor pattern (one TC member proposes; another offers a structurally different alternative; the original proposer endorses it) is the dynamic that produces robust standards rather than fragmented vendor extensions.&lt;/p&gt;

&lt;p&gt;The PR discussion pivots on which architectural shape this kind of data deserves. Amit Handa wrote the canonical comment on May 3 establishing the decision framework — worth quoting because it'll likely be cited as governance precedent in future spec discussions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;Use a UCP Extension&lt;/th&gt;
&lt;th&gt;Use Optional Flat Key-Value Pairs&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Impact on Behavior&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Changes state or execution of the operation&lt;/td&gt;
&lt;td&gt;Purely informational&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Stability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stable, standardized vocabulary&lt;/td&gt;
&lt;td&gt;Volatile, platform-specific, rapidly evolving&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Capability Negotiation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requires mutual agreement + active parent capability&lt;/td&gt;
&lt;td&gt;Best-effort, consumed at-will, no gating&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Schema Validation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Strict — transaction integrity matters&lt;/td&gt;
&lt;td&gt;Flexible — validation happens downstream&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-Platform Scale&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data normalization across diverse platforms&lt;/td&gt;
&lt;td&gt;Low friction; normalization burden on receiver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Typical Examples&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;discount&lt;/code&gt;, &lt;code&gt;fulfillment&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;attribution&lt;/code&gt;, referral tracking, session tags&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Attribution falls cleanly on the right side of every row. Marketing identifiers (&lt;code&gt;gclid&lt;/code&gt;, &lt;code&gt;fbclid&lt;/code&gt;, &lt;code&gt;ttclid&lt;/code&gt;) are volatile and platform-specific — every adtech vendor invents their own; standardising them in the spec would be obsolete the moment a new platform launches. Attribution doesn't change protocol behaviour — it's read-only context that some downstream pipeline cares about, with no transactional consequence. There's nothing for a merchant to negotiate; either you record it or you don't.&lt;/p&gt;

&lt;p&gt;The merged PR locks this decision in. Future contributors proposing similar volatile, informational, platform-specific data structures now have a precedent: &lt;strong&gt;the spec prefers flat optional key-value pairs over structured extensions for non-state-changing context.&lt;/strong&gt; That's a piece of governance documentation as much as a feature merge, and Handa's table will be the reference for it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The trajectory implication
&lt;/h2&gt;

&lt;p&gt;UCP up to this point has been &lt;strong&gt;protocol mechanics&lt;/strong&gt;. How agents discover stores. How they shop. How they pay. How they identify users. How they handle returns. The mechanics are necessary, but they don't directly produce commercial value for the ecosystem participants. A merchant with a perfectly conformant UCP implementation but no attribution can't measure agent-driven conversions, can't optimise marketing spend, can't close the loop between platform investment and merchant outcomes.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;attribution&lt;/code&gt; closes that loop. With the field in core, the entire adtech infrastructure that powers current ecommerce extends naturally into agent-mediated commerce. Platforms attribute conversions to specific campaigns. Click identifiers persist across the agent flow. Businesses run their existing analytics pipelines on agent-driven traffic with no special handling. The bridge that makes UCP commercially usable for marketing teams — not just engineering teams — now exists in the core spec.&lt;/p&gt;

&lt;p&gt;The trajectory implication is the part worth sitting with: &lt;strong&gt;UCP is evolving from protocol mechanics into commercial infrastructure.&lt;/strong&gt; Each subsequent spec addition probably bridges another piece of existing commerce infrastructure into the agent layer. Loyalty programs. Customer data platforms. Marketing automation triggers. Inventory hooks. Each one makes UCP more complete as commercial infrastructure rather than just protocol mechanics.&lt;/p&gt;

&lt;p&gt;The architectural-precedent decision in #391 makes that trajectory more efficient. Future contributors proposing similar bridges (attribution-adjacent measurement primitives, marketing identifiers, session metadata) now have a clear template: flat key-value pairs into core, governance precedent already established. The spec doesn't need to relitigate the core-vs-extension decision every time a volatile, informational primitive comes up.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it means in practice
&lt;/h2&gt;

&lt;p&gt;For &lt;strong&gt;merchants&lt;/strong&gt;: your UCP implementation should accept the &lt;code&gt;attribution&lt;/code&gt; field on incoming cart, checkout, and catalog requests, preserve it through to order records, and surface it through your analytics pipeline. The lift is small — it's a string-keyed JSON object on existing endpoints — but missing it means agent-driven conversions arrive at your analytics with no source attribution, which means your marketing team can't measure the channel.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;platform vendors&lt;/strong&gt; (&lt;a href="https://ucpchecker.com/platforms/shopify" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/bigcommerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/magento" rel="noopener noreferrer"&gt;Magento&lt;/a&gt;, and others): rolling attribution support into the next platform-side compatibility release is now table-stakes work. The stores running on your stack will need to accept and preserve attribution by the time the next published spec version makes this part of conformance.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;agent platforms&lt;/strong&gt; (those of us building or testing agents that shop UCP stores): pass platform-emitted attribution forward into every cart/checkout/catalog request. The data is informational, not state-changing — your agent doesn't need to do anything with it beyond passing it through. The merchant decides what to do with it on the receive side.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;evaluators&lt;/strong&gt; (us): the &lt;a href="https://ucpchecker.com/score" rel="noopener noreferrer"&gt;UCP Score&lt;/a&gt; will incorporate attribution-acceptance and attribution-preservation conformance in its next release. A store that accepts attribution on cart/checkout/catalog and threads it through to order records will score higher than one that drops it. The &lt;a href="https://ucpchecker.com/methodology" rel="noopener noreferrer"&gt;methodology&lt;/a&gt; page will reflect the rule update when the next score-version drops.&lt;/p&gt;

&lt;h2&gt;
  
  
  Timing: in core today, in the published spec next
&lt;/h2&gt;

&lt;p&gt;One important distinction worth making explicit. PR #391 merged into the spec's &lt;code&gt;main&lt;/code&gt; branch — not into a currently-published spec version. The latest released spec is &lt;strong&gt;v2026-04-08&lt;/strong&gt;, which does not include &lt;code&gt;attribution&lt;/code&gt;. The field lands for conformance purposes in whatever the next published spec version ships (no fixed cadence; expected in the next few months). Until then, attribution sits in the working draft on &lt;code&gt;main&lt;/code&gt; — implementers can adopt it ahead of the release if they want, but it's not yet part of conformance for the published spec.&lt;/p&gt;

&lt;p&gt;That distinction shapes how we're rolling out support across our tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpplayground.com/" rel="noopener noreferrer"&gt;UCP Playground&lt;/a&gt;&lt;/strong&gt; will adopt attribution support when the next spec version drops — agents will pass platform attribution through to merchants.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The &lt;a href="https://ucpchecker.com/score" rel="noopener noreferrer"&gt;UCP Score&lt;/a&gt;&lt;/strong&gt; will incorporate attribution-acceptance and attribution-preservation rules in the score release that aligns with the next published spec.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/ucp-validator" rel="noopener noreferrer"&gt;The validator&lt;/a&gt;&lt;/strong&gt; will support the new field as soon as the next spec ships, and the &lt;a href="https://ucpchecker.com/bulk-check" rel="noopener noreferrer"&gt;bulk checker&lt;/a&gt; will surface attribution conformance per-merchant after that.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The architectural certainty is already here — the schema is locked, the field is documented, the design pattern is settled. The spec drop is the &lt;strong&gt;conformance trigger&lt;/strong&gt;, not the design moment. Implementers who start work today against the working draft are operating against a known target.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to read more
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The PR itself: &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/391" rel="noopener noreferrer"&gt;#391 on Universal-Commerce-Protocol/ucp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;The merge commit: &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/commit/76a35394051222bcef8169c9c5c4c03072542a98" rel="noopener noreferrer"&gt;&lt;code&gt;76a3539&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;The new schema type: &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/blob/main/source/schemas/shopping/types/attribution.json" rel="noopener noreferrer"&gt;&lt;code&gt;source/schemas/shopping/types/attribution.json&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Updated authoring guidance: &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/blob/main/docs/documentation/schema-authoring.md" rel="noopener noreferrer"&gt;&lt;code&gt;docs/documentation/schema-authoring.md&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About UCP Checker
&lt;/h2&gt;

&lt;p&gt;UCP Checker is the independent validation and monitoring layer for the &lt;a href="https://ucp.dev" rel="noopener noreferrer"&gt;Universal Commerce Protocol&lt;/a&gt;. We crawl, validate, and grade every public UCP manifest in the open web, run the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;merchant directory&lt;/a&gt; and the &lt;a href="https://ucpchecker.com/score" rel="noopener noreferrer"&gt;UCP Score&lt;/a&gt;, publish the &lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;leaderboard&lt;/a&gt; and &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;adoption stats&lt;/a&gt;, and track major spec events like this one as they ship.&lt;/p&gt;

&lt;p&gt;If you're building on UCP and want to know whether your store is ready for the next spec version: &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;run a check&lt;/a&gt;. If you're tracking the spec's evolution professionally: subscribe to our &lt;a href="https://ucpchecker.com/stats/sample-report" rel="noopener noreferrer"&gt;weekly digest&lt;/a&gt; — we cover spec changes like this one within a week of merge.&lt;/p&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>ai</category>
      <category>ucp</category>
    </item>
    <item>
      <title>UCP Playground at 1,000+ Agent Sessions: What 16 Models and 97 Real Stores Reveal About AI Shopping</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Tue, 05 May 2026 09:11:37 +0000</pubDate>
      <link>https://forem.com/benjifisher/ucp-playground-at-1000-agent-sessions-what-16-models-and-97-real-stores-reveal-about-ai-shopping-155p</link>
      <guid>https://forem.com/benjifisher/ucp-playground-at-1000-agent-sessions-what-16-models-and-97-real-stores-reveal-about-ai-shopping-155p</guid>
      <description>&lt;p&gt;Two and a half months ago we &lt;a href="https://ucpchecker.com/blog/why-we-built-ucp-playground" rel="noopener noreferrer"&gt;published Why We Built UCP Playground&lt;/a&gt;, which closed on 114 agent sessions and an honest acknowledgement that the dataset was thin — most models had single-digit sample sizes, store coverage was uneven, and the headline rates moved meaningfully with every new run. A month later we crossed a different threshold: the &lt;a href="https://ucpchecker.com/blog/first-autonomous-ai-agent-purchase-ucp" rel="noopener noreferrer"&gt;first fully autonomous AI agent purchase through UCP&lt;/a&gt; — a Gemini agent searching, adding to cart, linking identity, paying, and completing checkout at &lt;a href="https://ucpchecker.com/status/houseofparfum.nl" rel="noopener noreferrer"&gt;houseofparfum.nl&lt;/a&gt; without a human past the initial prompt.&lt;/p&gt;

&lt;p&gt;Eighty days on from the first post, and roughly forty days after that autonomous purchase, the dataset is in a different shape:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Over 1,000 agent shopping sessions&lt;/strong&gt; captured end-to-end with full tool-call timelines and replayable event streams&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;16 frontier models&lt;/strong&gt; — every major lab, plus a reasoning-tuned subset&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;97 distinct UCP-enabled stores&lt;/strong&gt; across Shopify, WooCommerce, BigCommerce, Magento, PrestaShop, and custom stacks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$96,032 of agent-driven cart value&lt;/strong&gt; generated, primarily in USD with a long tail across EUR, GBP, INR, ILS, PKR&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;80 days of run history&lt;/strong&gt; since Feb 14, 2026&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the reference dataset for this post. Eight findings emerge from it. Most of them survive being scrutinised at the new sample size; one or two reverse the early-data narrative.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding 1 — Claude Sonnet 4.5 leads on aggregate checkout rate
&lt;/h2&gt;

&lt;p&gt;With sample sizes now large enough to take seriously, the per-model checkout-rate leaderboard looks like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Share of dataset&lt;/th&gt;
&lt;th&gt;Checkout rate&lt;/th&gt;
&lt;th&gt;Avg tokens&lt;/th&gt;
&lt;th&gt;Avg duration&lt;/th&gt;
&lt;th&gt;Fail rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/claude-sonnet-4-5" rel="noopener noreferrer"&gt;Claude Sonnet 4.5&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;20.7%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;50.8%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;71,195&lt;/td&gt;
&lt;td&gt;38.1s&lt;/td&gt;
&lt;td&gt;17.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/llama-3-3-70b" rel="noopener noreferrer"&gt;Llama 3.3 70B&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;6.4%&lt;/td&gt;
&lt;td&gt;49.3%&lt;/td&gt;
&lt;td&gt;57,676&lt;/td&gt;
&lt;td&gt;47.7s&lt;/td&gt;
&lt;td&gt;14.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/deepseek-v3-2" rel="noopener noreferrer"&gt;DeepSeek V3.2&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5.1%&lt;/td&gt;
&lt;td&gt;45.0%&lt;/td&gt;
&lt;td&gt;32,502&lt;/td&gt;
&lt;td&gt;46.0s&lt;/td&gt;
&lt;td&gt;21.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gemini-3-flash" rel="noopener noreferrer"&gt;Gemini 3 Flash&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;12.5%&lt;/td&gt;
&lt;td&gt;44.6%&lt;/td&gt;
&lt;td&gt;46,520&lt;/td&gt;
&lt;td&gt;21.8s&lt;/td&gt;
&lt;td&gt;15.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/grok-4" rel="noopener noreferrer"&gt;Grok 4&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;4.5%&lt;/td&gt;
&lt;td&gt;39.6%&lt;/td&gt;
&lt;td&gt;34,297&lt;/td&gt;
&lt;td&gt;77.1s&lt;/td&gt;
&lt;td&gt;9.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/claude-opus-4-6" rel="noopener noreferrer"&gt;Claude Opus 4.6&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;10.2%&lt;/td&gt;
&lt;td&gt;38.8%&lt;/td&gt;
&lt;td&gt;44,611&lt;/td&gt;
&lt;td&gt;29.7s&lt;/td&gt;
&lt;td&gt;25.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gemini-2-5-flash" rel="noopener noreferrer"&gt;Gemini 2.5 Flash&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;9.9%&lt;/td&gt;
&lt;td&gt;36.8%&lt;/td&gt;
&lt;td&gt;32,394&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;11.8s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;23.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gpt-4o" rel="noopener noreferrer"&gt;GPT-4o&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;5.2%&lt;/td&gt;
&lt;td&gt;29.5%&lt;/td&gt;
&lt;td&gt;32,811&lt;/td&gt;
&lt;td&gt;14.7s&lt;/td&gt;
&lt;td&gt;24.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gemini-3-1-pro" rel="noopener noreferrer"&gt;Gemini 3.1 Pro&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;7.9%&lt;/td&gt;
&lt;td&gt;29.0%&lt;/td&gt;
&lt;td&gt;30,971&lt;/td&gt;
&lt;td&gt;48.7s&lt;/td&gt;
&lt;td&gt;28.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gemini-2-5-pro" rel="noopener noreferrer"&gt;Gemini 2.5 Pro&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;6.4%&lt;/td&gt;
&lt;td&gt;27.6%&lt;/td&gt;
&lt;td&gt;31,566&lt;/td&gt;
&lt;td&gt;34.4s&lt;/td&gt;
&lt;td&gt;22.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gpt-5-2" rel="noopener noreferrer"&gt;GPT-5.2&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;4.7%&lt;/td&gt;
&lt;td&gt;23.6%&lt;/td&gt;
&lt;td&gt;30,585&lt;/td&gt;
&lt;td&gt;37.4s&lt;/td&gt;
&lt;td&gt;27.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/deepseek-r1" rel="noopener noreferrer"&gt;DeepSeek R1&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;1.4%&lt;/td&gt;
&lt;td&gt;17.6%&lt;/td&gt;
&lt;td&gt;35,360&lt;/td&gt;
&lt;td&gt;61.4s&lt;/td&gt;
&lt;td&gt;29.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/o4-mini" rel="noopener noreferrer"&gt;o4-mini&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;1.4%&lt;/td&gt;
&lt;td&gt;12.5%&lt;/td&gt;
&lt;td&gt;64,055&lt;/td&gt;
&lt;td&gt;38.1s&lt;/td&gt;
&lt;td&gt;37.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/grok-3-mini" rel="noopener noreferrer"&gt;Grok 3 Mini&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;1.7%&lt;/td&gt;
&lt;td&gt;10.0%&lt;/td&gt;
&lt;td&gt;58,386&lt;/td&gt;
&lt;td&gt;55.6s&lt;/td&gt;
&lt;td&gt;35.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/qwq-32b" rel="noopener noreferrer"&gt;QwQ 32B&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2.0%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.0%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;25,525&lt;/td&gt;
&lt;td&gt;63.9s&lt;/td&gt;
&lt;td&gt;50.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Claude Sonnet 4.5 leads on aggregate checkout rate at 50.8% on the largest single share of the dataset — a sample large enough that the rank ordering is no longer noise. Llama 3.3 70B sits a fraction below at 49.3% on a smaller but still meaningful share. The two are statistically tied; both are operating in a different regime than the rest of the field.&lt;/p&gt;

&lt;p&gt;The most interesting result on this table is &lt;strong&gt;GPT-5.2&lt;/strong&gt;, which at 23.6% lands in the bottom third despite being one of the most capable frontier models on essentially every public benchmark. The gap between its performance on standard reasoning benchmarks and its performance on transactional shopping flows is the single largest delta in the leaderboard. We dig into why in the development notes below.&lt;/p&gt;

&lt;p&gt;One caveat worth flagging up-front: GPT-5.2's 23.6% figure reflects performance across the full 80-day window, including the period before our cursor-stripping fix landed mid-dataset. Sessions after that fix show GPT-5.2 performing meaningfully more competitively. We'll publish the longitudinal split in the August update — the aggregate number above is the worst-case read.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding 2 — Reasoning-tuned models continue to underperform
&lt;/h2&gt;

&lt;p&gt;The cohort of reasoning-tuned models (DeepSeek R1, o4-mini, Grok 3 Mini, QwQ 32B) sits unambiguously at the bottom of the leaderboard. Three of them are in the bottom four overall. QwQ 32B has yet to record a single completed checkout across its share of the dataset.&lt;/p&gt;

&lt;p&gt;The pattern was visible in the &lt;a href="https://ucpchecker.com/blog/ucp-playground-evals" rel="noopener noreferrer"&gt;original four-session sample report&lt;/a&gt; shipped with the eval-framework launch in April; it has only sharpened as the dataset grew two orders of magnitude. The pattern is consistent across labs and across architectures (chain-of-thought variants, exploratory reasoning, distilled-from-frontier models — all underperform on shopping flows compared to their non-reasoning counterparts from the same lab).&lt;/p&gt;

&lt;p&gt;The working hypothesis remains: shopping requires fast tool-use rhythm, not deliberation. The decisions in a shopping sequence — search this term, add this item, proceed to checkout — are individually shallow but happen in series. A reasoning model that pauses to deliberate at each step burns clock time and tokens on decisions that don't reward deliberation. Combined with reasoning models' tendency to over-question their own outputs, the result is sessions that hit &lt;code&gt;max_turns_exceeded&lt;/code&gt; before completing.&lt;/p&gt;

&lt;p&gt;Worth noting what isn't in this hypothesis: reasoning models are not bad at commerce in general. They may be excellent at higher-stakes flows — disputed transactions, multi-step contractual reasoning, regulatory edge cases — that the current eval workload doesn't probe. The benchmark says: when the workload is "shop normally," fast non-reasoning models win. Other workloads will tell different stories.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding 3 — Speed and accuracy aren't correlated
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://ucpplayground.com/models/gemini-2-5-flash" rel="noopener noreferrer"&gt;Gemini 2.5 Flash&lt;/a&gt; finishes the average shopping session in &lt;strong&gt;11.8 seconds&lt;/strong&gt; — the only model in the field under 15s. Its checkout rate is 36.8% — middling. &lt;a href="https://ucpplayground.com/models/claude-sonnet-4-5" rel="noopener noreferrer"&gt;Claude Sonnet 4.5&lt;/a&gt; takes 38.1s on average and lands a 50.8% checkout rate — the highest on the leaderboard, at more than triple Flash's clock time.&lt;/p&gt;

&lt;p&gt;Two real surfaces: &lt;strong&gt;latency-bound use cases&lt;/strong&gt; (voice agents, mobile commerce, conversational checkout where the user is waiting in real time) effectively must use Gemini 2.5 Flash or Gemini 3 Flash, and pay for the latency win with lower closed-checkout rates. &lt;strong&gt;Throughput-bound use cases&lt;/strong&gt; (batch agents, scheduled buying, autonomous shopping where wall-clock time is mostly hidden) should use Claude Sonnet 4.5 or Llama 3.3 70B and accept the latency cost for the conversion lift.&lt;/p&gt;

&lt;p&gt;The naive intuition merchants reach for — "the better model is faster and more accurate" — doesn't survive contact with this data. The two axes are essentially independent within this corpus. That's a finding nobody can extract from a single-model demo or a vendor benchmark.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding 4 — The failure mode taxonomy is dominated by tool errors, not model refusals
&lt;/h2&gt;

&lt;p&gt;Across the 256 failed sessions in the dataset, the categorised error taxonomy is:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Error type&lt;/th&gt;
&lt;th&gt;Sessions&lt;/th&gt;
&lt;th&gt;% of categorised failures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;openrouter_error&lt;/code&gt; (provider-side)&lt;/td&gt;
&lt;td&gt;51&lt;/td&gt;
&lt;td&gt;56%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;model_refused&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;td&gt;24%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;max_turns_exceeded&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;20%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The single-largest categorised failure mode is &lt;strong&gt;provider-side errors&lt;/strong&gt; — the routing layer between the agent and the model returning a non-200 before the session can complete. This is a cost of operating at scale across 16 models and reflects the still-maturing infrastructure underneath frontier-model API access, not anything specific to UCP.&lt;/p&gt;

&lt;p&gt;The second-largest, &lt;strong&gt;model refusals&lt;/strong&gt;, is more interesting. Twenty-two refusals across the dataset is a refusal rate of roughly 2%. We see refusals concentrated in two situations: (1) sessions against demo stores with unusual product names that pattern-match a model's safety filters, and (2) sessions where the user prompt contains adversarial content seeded by us as part of a prompt-injection eval. We've recorded &lt;strong&gt;6/6 prompt-injection resistance&lt;/strong&gt; across the dedicated injection-eval runs to date, so the model_refused category is partly capturing models doing exactly what they should.&lt;/p&gt;

&lt;p&gt;The third, &lt;strong&gt;max_turns_exceeded&lt;/strong&gt;, is concentrated in the reasoning-model cohort and is the empirical signal for the over-deliberation pattern in Finding 2.&lt;/p&gt;

&lt;p&gt;The remaining 165 failures don't carry a categorised error_type — typically these are sessions where the model abandoned the flow without raising an explicit error. That's a tagging gap in the framework that we're closing in the next iteration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding 5 — Store implementation explains most of the cross-store variance
&lt;/h2&gt;

&lt;p&gt;The benchmark's most strategically important finding doesn't come from the per-model column. It comes from the per-store one.&lt;/p&gt;

&lt;p&gt;Across the 97 stores in the dataset, the same model produces dramatically different outcomes. Between the most agent-friendly and least agent-friendly implementations at meaningful sample sizes, the checkout-rate spread exceeds &lt;strong&gt;60 percentage points&lt;/strong&gt; — wider than any model-versus-model gap on the leaderboard. &lt;strong&gt;No model in the field, at any sample size, produces a 60-point spread purely on its own merits.&lt;/strong&gt; Almost all of that variance is store-side, and the rigorous run history across thousands of sessions makes the pattern hard to attribute to anything else.&lt;/p&gt;

&lt;p&gt;The cleanest predictor we've found is whether the store's MCP implementation is &lt;strong&gt;stateless&lt;/strong&gt; or &lt;strong&gt;stateful&lt;/strong&gt;, and how it handles the boundary between them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stateless implementations&lt;/strong&gt; treat every tool call as self-contained. Cart state lives in the agent's context, or in opaque tokens the agent threads through. Identity is established once and re-asserted on each call. The agent doesn't have to remember anything the server is also remembering, because the server isn't remembering anything. Stores running stateless implementations cluster at the high end of the checkout-rate distribution — frontier agents work well against them because there's no hidden contract; what's in the response is the entire state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stateful implementations&lt;/strong&gt; persist server-side session, cart, and auth across calls, exposed to the agent through session IDs, cookies, or scoped tokens. When this works, it works well. When it breaks — session expiry mid-flow, cart drift between a read and a subsequent write, identity tokens that silently lose scope between tool calls — it produces the failure modes that cluster at the bottom of the per-store distribution. The agent calls a tool the server has quietly desynced from, and the flow fails in ways that don't surface until checkout.&lt;/p&gt;

&lt;p&gt;The hybrid case is the most error-prone: stores that are stateless in some tools and stateful in others, without making the boundary explicit in the manifest or the tool response shapes. Frontier agents have no way to infer which category any individual call falls into and tend to default to the stateless assumption — which is exactly the wrong default for the calls that aren't.&lt;/p&gt;

&lt;p&gt;Beyond the state axis, the rigorous testing surfaces a consistent set of secondary trip-wires: variant IDs without human-readable axis labels, description strings exceeding 8K tokens for a single product, tool responses including nested HTML in fields agents expect to be plain text, cart endpoints returning success codes for failed mutations. None of these break &lt;a href="https://ucpchecker.com/score" rel="noopener noreferrer"&gt;UCP Score&lt;/a&gt; validation. All of them break agent flows.&lt;/p&gt;

&lt;p&gt;These are merchant-side fixes, not model-side ones. The strategic implication for any team operating a UCP-enabled store: &lt;strong&gt;fixing your manifest and tool responses produces more conversion lift than choosing the right model.&lt;/strong&gt; That's load-bearing — it's why the integrated &lt;a href="https://ucpchecker.com/blog/ucp-playground-evals#how-evals-fit-the-broader-development-cycle" rel="noopener noreferrer"&gt;Score → Check → Eval workflow&lt;/a&gt; exists, and it's where we'd point a team starting from zero on UCP.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding 6 — Cart value generated is concentrated in USD and high-AOV verticals
&lt;/h2&gt;

&lt;p&gt;Of the 1,000+ sessions, 96 produced a non-zero cart value. The breakdown:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Currency&lt;/th&gt;
&lt;th&gt;Sessions&lt;/th&gt;
&lt;th&gt;Total cart value&lt;/th&gt;
&lt;th&gt;Avg cart value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;USD&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;$95,647.23&lt;/td&gt;
&lt;td&gt;$1,125.26&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;INR&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;₹3,845.00&lt;/td&gt;
&lt;td&gt;₹1,922.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PKR&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;₨4,490.00&lt;/td&gt;
&lt;td&gt;₨2,245.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EUR&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;€296.74&lt;/td&gt;
&lt;td&gt;€59.35&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ILS&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;₪189.60&lt;/td&gt;
&lt;td&gt;₪189.60&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GBP&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;£47.99&lt;/td&gt;
&lt;td&gt;£24.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;USD cart value totals &lt;strong&gt;$95,647 across 85 sessions&lt;/strong&gt; with an average cart value of $1,125. That figure is heavily skewed by a small number of high-AOV sessions against electronics and high-end apparel stores; the median session cart value is closer to $240. We don't yet have the granularity to break out cart value by store type or model — that's a feature in the eval reporting roadmap.&lt;/p&gt;

&lt;p&gt;The cross-currency long tail (EUR/GBP/INR/PKR/ILS) is small but informative. It tells us the framework is handling multi-currency stores correctly end-to-end, including currency-aware variant pricing and locale-correct checkout flows. Worth noting because it's a class of bug that doesn't surface until you actually transact.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding 7 — Session volume is now meaningful enough to reveal trajectory
&lt;/h2&gt;

&lt;p&gt;Plotted week-over-week, session volume has three distinct phases over the 80-day window:&lt;/p&gt;

&lt;p&gt;UCP Playground weekly session volume, mid-February through late April 2026Trend line showing three phases: a small founding wave in mid-February, a steady-state oscillation through March and mid-April, and a sharp acceleration in late April that produces the largest single week of the dataset.Feb 14Apr 27&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Founding wave (mid-February).&lt;/strong&gt; A small launch surge coinciding with the &lt;a href="https://ucpchecker.com/blog/why-we-built-ucp-playground" rel="noopener noreferrer"&gt;Why We Built UCP Playground&lt;/a&gt; post — first publishers running first sessions, signal that the framework worked end-to-end against real stores.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Steady state (March through mid-April).&lt;/strong&gt; Weekly volume oscillating in a tight band as more frontier models came online and the eval framework matured. Some weeks heavier than others, but the median stayed roughly flat — characteristic of a tool finding its operational rhythm.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Acceleration (late April).&lt;/strong&gt; The largest single week of the dataset, driven mostly by a batch of &lt;a href="https://ucpchecker.com/blog/ucp-playground-evals" rel="noopener noreferrer"&gt;eval-collection runs&lt;/a&gt; against stores onboarded after the council expansion announcement. The line bends upward at the end of the window.&lt;/p&gt;

&lt;p&gt;The trajectory matters mostly because it lets us start tracking model drift. With several thousand more sessions accumulating over the next quarter, we'll be able to observe how the same model performs against the same store between Q2 and Q3 — the loop that turns the framework from a one-shot benchmark into an actual reliability record.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding 8 — The 0.2% flawless-end-to-end rate has improved, slightly
&lt;/h2&gt;

&lt;p&gt;The April &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;State of Agentic Commerce report&lt;/a&gt; flagged that of 4,014 verified UCP stores, only 9 delivered a flawless end-to-end agent shopping experience. That's the 0.2% figure that's been quoted around the launch posts — measured by static validation across the full directory.&lt;/p&gt;

&lt;p&gt;Eighty days later, with 97 stores tested directly through the eval framework, roughly &lt;strong&gt;0.5–0.7%&lt;/strong&gt; reach the same bar. That's a higher rate, though the comparison isn't apples-to-apples: direct testing surfaces issues that static validation misses (most of the failure modes in this post fall into that category), and the sample composition has shifted toward more deliberately UCP-aware merchants over the period. The honest read is that the rate looks better and the comparison's loose enough that we'd want a same-methodology re-run on the full directory to call it a real improvement.&lt;/p&gt;

&lt;p&gt;What we can say cleanly: for every store running a clean, agent-friendly UCP implementation, there are still 100+ that pass conformance but stumble somewhere in the agent flow. The gap continues to be on the merchant side. We haven't yet seen a model-side improvement large enough to close meaningful ground on it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Playground stays neutral
&lt;/h2&gt;

&lt;p&gt;Every finding above hinges on one design choice: the system prompt and the orchestration loop are &lt;strong&gt;generic&lt;/strong&gt;. Same for every model. Same for every store. No store-specific scaffolding, no model-specific workarounds. That's what makes the framework work as a testing environment.&lt;/p&gt;

&lt;p&gt;The temptation to add a workaround when a particular model trips on a particular store is real — there's almost always a one-line patch that would push that store's checkout rate up by ten points against that one model. We don't ship those patches, on principle. The moment we do, the results stop being comparable across the matrix and we're not benchmarking anymore — we're tuning. Vendor stacks already do that work, in vendor-flavoured ways, with vendor-shaped numbers.&lt;/p&gt;

&lt;p&gt;Independence here means a specific thing: &lt;strong&gt;the orchestration is neutral, the protocol layer is full-featured.&lt;/strong&gt; Stores get the tools they declare. Identity linking works. Payment handlers pass through. Multi-turn context flows the way the &lt;a href="https://ucpchecker.com/specs" rel="noopener noreferrer"&gt;spec&lt;/a&gt; defines. What stays generic is the harness around that — the prompts, the turn discipline, the success criteria, the error-handling rhythm.&lt;/p&gt;

&lt;p&gt;The reason that design choice matters can be put in two sentences:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If a model doesn't follow the checkout flow, that's signal about the model.&lt;/li&gt;
&lt;li&gt;If a store returns the wrong status, that's signal about the store.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both signals are useful. Both are visible because the orchestration didn't paper over either one. Hiding either defeats the purpose of running the test.&lt;/p&gt;

&lt;p&gt;Companies building their own internal infrastructure to evaluate agent behaviour against their own stores is expected, and good. Every serious commerce platform will eventually have something like that running in CI against its own merchants — and the &lt;a href="https://ucpchecker.com/blog/ucp-playground-evals#how-evals-fit-the-broader-development-cycle" rel="noopener noreferrer"&gt;Score → Check → Eval workflow&lt;/a&gt; is exactly the surface they should plug into. But the comparison layer — the one that asks how Anthropic's frontier model performs against the same workload Google's, OpenAI's, xAI's, DeepSeek's, and Meta's are also running, against the same stores — has to sit outside all of those organisations. &lt;strong&gt;Vendors can't credibly benchmark themselves; the platform layer has the same problem one level down.&lt;/strong&gt; Independence is the only way the comparisons aggregate into a record anyone can quote.&lt;/p&gt;

&lt;p&gt;That's the niche this layer occupies. The leaderboard, the failure-mode taxonomy, the store-side variance pattern in this post only hold up if the orchestration stays neutral. The moment it doesn't, the framework loses the property that made any of it worth publishing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we learned building this
&lt;/h2&gt;

&lt;p&gt;The framework didn't ship in May the same shape it shipped in February. Eighty days of running it against real stores produced a steady stream of bugs and surprises that drove the development work — many of them documented in the &lt;a href="https://ucpplayground.com/changelog" rel="noopener noreferrer"&gt;public changelog&lt;/a&gt;. Five worth surfacing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor stripping unlocked GPT-5.2 search.&lt;/strong&gt; Through February we had &lt;a href="https://ucpplayground.com/models/gpt-5-2" rel="noopener noreferrer"&gt;GPT-5.2&lt;/a&gt; at a 0% search success rate on Shopify stores. The cause was a model-side tic: GPT-5.2 always included the optional &lt;code&gt;after&lt;/code&gt; cursor parameter on &lt;code&gt;search_shop_catalog&lt;/code&gt; calls, filling it with placeholders like &lt;code&gt;""&lt;/code&gt;, &lt;code&gt;"null"&lt;/code&gt;, or &lt;code&gt;"__NONE__"&lt;/code&gt; — values Shopify always rejects. A server-side sanitizer that strips invalid placeholders before the call leaves Playground pushed GPT-5.2's search success from 0% to 100% overnight. The model wasn't bad at search; it had a tool-calling habit nobody had isolated yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failed tool calls used to inflate conversion metrics.&lt;/strong&gt; An earlier version of step detection counted a failed &lt;code&gt;update_cart&lt;/code&gt; as a &lt;code&gt;cart_created&lt;/code&gt; completion. That bug inflated the cart and conversion numbers on every report we'd published before mid-March. Fixed in 0.9.3 by gating step detection on the tool response's &lt;code&gt;isError&lt;/code&gt; flag, plus the same gate on cart-data extraction. The per-model checkout rates in this post are computed under the corrected logic; older snapshots from before that fix may read 5–10 points high on the conversion-side metrics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;REST-only stores forced a transport rework.&lt;/strong&gt; The &lt;a href="https://ucpchecker.com/blog/ucp-v2026-04-08-spec-update" rel="noopener noreferrer"&gt;v2026-04-08 spec drop&lt;/a&gt; in early April brought new tool names (&lt;code&gt;search_catalog&lt;/code&gt; replacing &lt;code&gt;search_shop_catalog&lt;/code&gt;), new response shapes (price as &lt;code&gt;{amount, currency}&lt;/code&gt; objects, descriptions as &lt;code&gt;{plain, html}&lt;/code&gt; objects), and a wave of WooCommerce stores that exposed REST-only endpoints rather than MCP. The 0.10.x release line was mostly absorbing that — REST-only store support, a REST tool-call adapter, response-format normalization across spec versions. Pre-04-08 sessions and v2026-04-08 sessions are both in the dataset and tagged appropriately, which is what lets the longitudinal data hold together across a non-trivial spec change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The GPay token wall built ECP.&lt;/strong&gt; In a February session, &lt;a href="https://ucpplayground.com/models/claude-sonnet-4-5" rel="noopener noreferrer"&gt;Claude Sonnet 4.5&lt;/a&gt; reached &lt;code&gt;ready_for_complete&lt;/code&gt; correctly — and stalled, because the merchant's checkout required a Google Pay payment token the agent couldn't produce. That's the genuine limit: agents shop through the protocol layer cleanly but stop at the secure-credential boundary. The Embedded Commerce Protocol shipped in 0.8.0 to hand control to the merchant's checkout UI at exactly that boundary and resume agent control once the user completes the credential step. A feature directly driven by a finding the framework couldn't have surfaced any other way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Playground session became a spec proposal.&lt;/strong&gt; A live test against &lt;a href="https://ucpchecker.com/status/houseofparfum.nl" rel="noopener noreferrer"&gt;houseofparfum.nl&lt;/a&gt; exposed a different gap: an identity-linked buyer with a wallet balance hit the checkout, the OAuth flow completed cleanly, the buyer object came back populated — but the wallet was nowhere the agent could see it. &lt;code&gt;payment.instruments&lt;/code&gt; was empty, the only declared handler (&lt;code&gt;dev.ucp.delegate_payment&lt;/code&gt;) didn't accept the wallet, and the session escalated to the merchant's continue_url every time. Authenticated checkout was provably blocked, by spec. We wrote it up and submitted &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/issues/358" rel="noopener noreferrer"&gt;Proposal #358 to the UCP spec repository&lt;/a&gt; — &lt;code&gt;payment.available_instruments&lt;/code&gt;, a per-buyer per-session list of usable payment methods (wallet, saved cards, loyalty, gift cards) resolved at runtime from the identity-linked session. Submitted by Benji Fisher (&lt;a href="https://github.com/appdrops" rel="noopener noreferrer"&gt;@appdrops&lt;/a&gt;) and co-authored with Almin Zolotic (&lt;a href="https://github.com/zologic" rel="noopener noreferrer"&gt;@zologic&lt;/a&gt;) of UCPReady, who'd seen the same wall from the merchant side. Currently submitted to the UCP technical council for review. That's the loop the framework is built to feed: multi-store, multi-model testing surfaces a structural gap; the gap goes back into spec governance as a concrete proposal; the next spec drop closes it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Methodology, briefly
&lt;/h2&gt;

&lt;p&gt;Each session is a real frontier-model agent shopping run against a real UCP-enabled store, captured end-to-end via MCP tool calls. Sessions are initiated either through the public &lt;a href="https://ucpplayground.com/playground" rel="noopener noreferrer"&gt;Playground UI&lt;/a&gt; (user-initiated, ad-hoc prompts) or through the &lt;a href="https://ucpplayground.com/evals" rel="noopener noreferrer"&gt;Evals framework&lt;/a&gt; (scripted multi-turn sequences across pre-selected store/model matrices).&lt;/p&gt;

&lt;p&gt;Outcomes are tagged at session close: &lt;code&gt;checkout_reached&lt;/code&gt; (full transaction completion), &lt;code&gt;cart_created&lt;/code&gt; (added items, didn't proceed), &lt;code&gt;search_only&lt;/code&gt; (browsed, didn't add), &lt;code&gt;failed&lt;/code&gt; (provider error, model refusal, or max-turn exceeded), or &lt;code&gt;info_provided&lt;/code&gt; (informational query, no transactional intent).&lt;/p&gt;

&lt;p&gt;Every session has a clickable replay link in its source ULID. If you want to audit any single number in this post, the underlying session data is the artifact. That's intentional — independent reproducibility is the point.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Three concrete next steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Run a benchmark against your own store.&lt;/strong&gt; Create a collection at &lt;a href="https://ucpplayground.com/evals" rel="noopener noreferrer"&gt;ucpplayground.com/evals&lt;/a&gt;, pick a sequence, pick two models, and compare your store's per-model performance against the aggregate above.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;See where individual models stand.&lt;/strong&gt; Each model on the leaderboard has its own &lt;a href="https://ucpplayground.com/models" rel="noopener noreferrer"&gt;shopping profile&lt;/a&gt; with detailed performance data, known issues, and store-by-store breakdowns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compare two models head-to-head.&lt;/strong&gt; The &lt;a href="https://ucpplayground.com/models/compare?models=claude-sonnet-4-5%2Cgemini-3-flash" rel="noopener noreferrer"&gt;comparison view&lt;/a&gt; lets you pit any two models against each other on the same workload — useful before you commit to a primary model for a deployment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The next data update — likely 2,000+ sessions, refreshed model lineup, and a fuller error-tagging surface — drops in early August.&lt;/p&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>ai</category>
      <category>data</category>
    </item>
    <item>
      <title>UCP Requirements: What Your Store Needs Before Going Live</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Mon, 04 May 2026 12:23:16 +0000</pubDate>
      <link>https://forem.com/benjifisher/ucp-requirements-what-your-store-needs-before-going-live-9ag</link>
      <guid>https://forem.com/benjifisher/ucp-requirements-what-your-store-needs-before-going-live-9ag</guid>
      <description>&lt;p&gt;What do you need for UCP? There are two levels of UCP readiness. The first is the &lt;strong&gt;minimum viable manifest&lt;/strong&gt; — the bare requirements to pass validation and appear in the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;UCP directory&lt;/a&gt;. The second is the &lt;strong&gt;agent-ready setup&lt;/strong&gt; — what it actually takes for an AI agent to browse, cart, and check out at your store without friction.&lt;/p&gt;

&lt;p&gt;Think of this as your UCP checklist — the minimum requirements plus the recommended prerequisites that separate stores agents can find from stores agents can actually shop. Most guides only cover the first level. This one covers both, grounded in data from &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;4,024 verified merchants&lt;/a&gt; and hundreds of &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;agent testing sessions&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Minimum requirements (pass validation)
&lt;/h2&gt;

&lt;p&gt;These are the fields required to produce a valid UCP manifest on the current &lt;a href="https://ucpchecker.com/specs/2026-04-08" rel="noopener noreferrer"&gt;v2026-04-08 spec&lt;/a&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. A JSON file at /.well-known/ucp
&lt;/h3&gt;

&lt;p&gt;The manifest must be publicly accessible at &lt;code&gt;https://yourdomain.com/.well-known/ucp&lt;/code&gt;, served with &lt;code&gt;Content-Type: application/json&lt;/code&gt;, and reachable without authentication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Platform notes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://ucpchecker.com/platforms/shopify" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt;: handled automatically&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ucpchecker.com/platforms/woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;: manual publish via plugin or custom route&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ucpchecker.com/platforms/bigcommerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt;: manual, served from storefront origin&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ucpchecker.com/platforms/magento" rel="noopener noreferrer"&gt;Magento&lt;/a&gt;: manual, typically via custom module&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full publishing guide with code examples: &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;/.well-known/ucp developer reference&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. ucp.version (required)
&lt;/h3&gt;

&lt;p&gt;A string identifying which spec version the manifest is written against. Current latest: &lt;code&gt;"2026-04-08"&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;99.4% of verified stores&lt;/a&gt; are on this version. If you're starting fresh, use it. If you're on an older version, the &lt;a href="https://ucpchecker.com/blog/ucp-v2026-04-08-spec-update" rel="noopener noreferrer"&gt;spec update post&lt;/a&gt; walks through the migration.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. ucp.services (required)
&lt;/h3&gt;

&lt;p&gt;At least one service entry declaring a transport (&lt;code&gt;mcp&lt;/code&gt;, &lt;code&gt;rest&lt;/code&gt;, &lt;code&gt;a2a&lt;/code&gt;, or &lt;code&gt;embedded&lt;/code&gt;) and an endpoint URL. This tells agents where to send requests.&lt;/p&gt;

&lt;p&gt;MCP is the dominant transport — &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;~100% of verified stores declare it&lt;/a&gt;. If you're building from scratch, start with MCP. See the &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;transport comparison&lt;/a&gt; for the tradeoffs.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. ucp.payment_handlers (required)
&lt;/h3&gt;

&lt;p&gt;A map of payment handler namespaces. Can be an empty object &lt;code&gt;{}&lt;/code&gt; if your store uses checkout-link redirects instead of tokenized payments (common on &lt;a href="https://ucpchecker.com/blog/woocommerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;If you declare handlers, use reverse-domain namespaces like &lt;code&gt;com.stripe.card&lt;/code&gt; or &lt;code&gt;dev.shopify.card&lt;/code&gt;. See the &lt;a href="https://ucpchecker.com/payment-handlers" rel="noopener noreferrer"&gt;payment handlers directory&lt;/a&gt; for examples.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. signing_keys (required, at root level)
&lt;/h3&gt;

&lt;p&gt;An array of JWK objects at the &lt;strong&gt;document root&lt;/strong&gt; (not nested inside &lt;code&gt;ucp&lt;/code&gt;). An empty array &lt;code&gt;[]&lt;/code&gt; is valid if you're not signing payloads yet, but the key must be present.&lt;/p&gt;

&lt;p&gt;This field moved from &lt;code&gt;ucp.signing_keys&lt;/code&gt; to the root in v2026-04-08 — the most &lt;a href="https://ucpchecker.com/blog/common-ucp-errors" rel="noopener noreferrer"&gt;common validation warning&lt;/a&gt; we see is stores that still nest it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommended setup (agent-ready)
&lt;/h2&gt;

&lt;p&gt;Passing validation gets you into the directory. The requirements below determine whether agents can actually &lt;em&gt;shop&lt;/em&gt; your store — the difference between a &lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;B+ grade and an A grade&lt;/a&gt; in our benchmarks.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Capabilities declaration
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;ucp.capabilities&lt;/code&gt; field is optional per spec but strongly recommended. Without it, agents know your store exists but not what it can do.&lt;/p&gt;

&lt;p&gt;Declare every capability you support:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/capabilities/checkout" rel="noopener noreferrer"&gt;checkout&lt;/a&gt;&lt;/strong&gt; — 99.5% adoption across verified stores&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/capabilities/cart" rel="noopener noreferrer"&gt;cart&lt;/a&gt;&lt;/strong&gt; — 99.1% adoption&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/capabilities/catalog-search" rel="noopener noreferrer"&gt;catalog-search&lt;/a&gt;&lt;/strong&gt; — required for &lt;a href="https://ucpchecker.com/product-discovery" rel="noopener noreferrer"&gt;product discovery&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/capabilities/identity-linking" rel="noopener noreferrer"&gt;identity-linking&lt;/a&gt;&lt;/strong&gt; — 3 stores, massive first-mover opportunity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/capabilities/payment" rel="noopener noreferrer"&gt;payment&lt;/a&gt;&lt;/strong&gt; — 0 stores, the frontier&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full list: &lt;a href="https://ucpchecker.com/capabilities" rel="noopener noreferrer"&gt;capability registry&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Clean variant data
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/blog/agentic-commerce-optimization-ucp-readiness-data" rel="noopener noreferrer"&gt;Variant mismatches are the #1 failure mode&lt;/a&gt; in agent shopping sessions. Every variant needs a stable ID, a clear name, and consistent representation across discovery and checkout. This is the single highest-impact fix you can make.&lt;/p&gt;

&lt;h3&gt;
  
  
  8. Responsive MCP endpoint
&lt;/h3&gt;

&lt;p&gt;Latency matters. The average &lt;a href="https://ucpchecker.com/platforms/shopify" rel="noopener noreferrer"&gt;Shopify store&lt;/a&gt; responds in ~130ms. &lt;a href="https://ucpchecker.com/platforms/bigcommerce" rel="noopener noreferrer"&gt;BigCommerce stores&lt;/a&gt; average ~890ms. Agents have timeout budgets — if your endpoint is slow, sessions drop silently. Target under 500ms for tool responses.&lt;/p&gt;

&lt;h3&gt;
  
  
  9. robots.txt allowing AI crawlers
&lt;/h3&gt;

&lt;p&gt;Make sure &lt;code&gt;/.well-known/ucp&lt;/code&gt; is explicitly allowed in your robots.txt. Some WAFs and CDN configurations block well-known paths by default. Check the &lt;a href="https://ucpchecker.com/blog/common-ucp-errors" rel="noopener noreferrer"&gt;common errors guide&lt;/a&gt; for the fix.&lt;/p&gt;

&lt;h3&gt;
  
  
  10. Supported_versions for backward compatibility
&lt;/h3&gt;

&lt;p&gt;Declare &lt;code&gt;supported_versions&lt;/code&gt; in your manifest listing both the current and previous spec version. This lets agents that haven't migrated yet still find a valid endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"supported_versions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"2026-04-08"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://yourstore.com/.well-known/ucp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"2026-01-23"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://yourstore.com/.well-known/ucp/2026-01-23"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The UCP readiness checklist
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Required?&lt;/th&gt;
&lt;th&gt;% of stores that have it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Manifest at /.well-known/ucp&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;100% (by definition)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ucp.version&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ucp.services with transport + endpoint&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ucp.payment_handlers&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;signing_keys at root&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;~97% (rest have it nested)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ucp.capabilities&lt;/td&gt;
&lt;td&gt;Recommended&lt;/td&gt;
&lt;td&gt;~99% (Shopify default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Clean variant data&lt;/td&gt;
&lt;td&gt;Recommended&lt;/td&gt;
&lt;td&gt;Unknown (runtime issue)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Latency &amp;lt; 500ms&lt;/td&gt;
&lt;td&gt;Recommended&lt;/td&gt;
&lt;td&gt;~95% (Shopify), ~30% (others)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;robots.txt allows /.well-known/ucp&lt;/td&gt;
&lt;td&gt;Recommended&lt;/td&gt;
&lt;td&gt;~99%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;supported_versions&lt;/td&gt;
&lt;td&gt;Recommended&lt;/td&gt;
&lt;td&gt;~70%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Validate your setup
&lt;/h2&gt;

&lt;p&gt;Not sure if you pass? Start with &lt;a href="https://ucpchecker.com/blog/is-my-store-ucp-ready" rel="noopener noreferrer"&gt;Is My Store UCP Ready?&lt;/a&gt; — it walks through the full diagnostic in 60 seconds. Or jump straight to the tool:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;Run a live check&lt;/a&gt; on your domain — it tests every requirement above in seconds. For runtime issues (variant mismatches, checkout failures), &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;test with real agents in Playground&lt;/a&gt;. For ongoing monitoring, &lt;a href="https://ucpchecker.com/alerts" rel="noopener noreferrer"&gt;set up alerts&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;Once you're verified, make sure your listing on &lt;a href="https://ucpregistry.com" rel="noopener noreferrer"&gt;UCP Registry&lt;/a&gt; is accurate — that's what agents see when deciding which stores to route customers to. And if you're a developer building agents rather than stores, the &lt;a href="https://ucpchecker.com/agents" rel="noopener noreferrer"&gt;Build an Agent quickstart&lt;/a&gt; covers the other side of the equation.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Check your store now at &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;UCPChecker.com&lt;/a&gt;. See how you compare: &lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;side-by-side store comparison&lt;/a&gt;. Platform guides: &lt;a href="https://ucpchecker.com/blog/shopify-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt; · &lt;a href="https://ucpchecker.com/blog/woocommerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt; · &lt;a href="https://ucpchecker.com/blog/bigcommerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt; · &lt;a href="https://ucpchecker.com/blog/magento-adobe-commerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;Magento&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>tutorial</category>
      <category>ucp</category>
    </item>
    <item>
      <title>AI Commerce Needs MLPerf — and Here's an Early Attempt</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Fri, 01 May 2026 12:07:45 +0000</pubDate>
      <link>https://forem.com/benjifisher/ai-commerce-needs-mlperf-and-heres-an-early-attempt-2lg1</link>
      <guid>https://forem.com/benjifisher/ai-commerce-needs-mlperf-and-heres-an-early-attempt-2lg1</guid>
      <description>&lt;p&gt;Validating a UCP manifest takes a second. &lt;a href="https://ucpchecker.com/blog/introducing-ucp-score-agent-readiness-grade" rel="noopener noreferrer"&gt;Scoring it for agent-readiness&lt;/a&gt; takes another. Neither of those answers the harder question: when a real frontier agent — &lt;a href="https://ucpplayground.com/models/claude-opus-4-6" rel="noopener noreferrer"&gt;Claude&lt;/a&gt; or &lt;a href="https://ucpplayground.com/models/gpt-5-2" rel="noopener noreferrer"&gt;GPT&lt;/a&gt; or &lt;a href="https://ucpplayground.com/models/gemini-3-1-pro" rel="noopener noreferrer"&gt;Gemini&lt;/a&gt;, picked by a user three weeks from now — walks up to your store with an ordinary shopping prompt, does it actually complete a checkout? Compared to the next implementation? Across the models people are actually using?&lt;/p&gt;

&lt;p&gt;Today there's no shared way to find out. AI commerce has the same coordination problem ML had before MLPerf, web performance had before Lighthouse, and coding models had before HumanEval — and the cost of not solving it is the same: every claim a vendor makes about agent-readiness is currently unverifiable by anyone outside that vendor.&lt;/p&gt;

&lt;p&gt;This post is about what we've been building to close that gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pre-benchmark moment
&lt;/h2&gt;

&lt;p&gt;Every category that grew up around AI has gone through a pre-benchmark moment.&lt;/p&gt;

&lt;p&gt;Machine learning before MLPerf was a pile of vendor-flavoured numbers. NVIDIA reported one set of throughput claims, Google another, AMD a third — and none of it was directly comparable, because nobody was running the same workload, on the same input, on the same harness. MLPerf — submitted to, run by, and audited across the whole industry — fixed that. Buyers could finally compare. The category matured.&lt;/p&gt;

&lt;p&gt;Web performance before Lighthouse was the same. "Fast website" was vibes. PageSpeed Insights gave one number, WebPageTest another, internal RUM dashboards a third. Lighthouse — graded, reproducible, open — fixed it. Today nobody ships a serious site without checking their score.&lt;/p&gt;

&lt;p&gt;Coding models before HumanEval were even worse. Every lab benchmarked against its own preferred problems and reported its own preferred metrics. HumanEval, then MBPP, then SWE-bench, then LiveCodeBench, gave the field a shared evaluation surface. Comparisons stopped being marketing.&lt;/p&gt;

&lt;p&gt;Agentic commerce is in exactly the place those categories were before their benchmarks landed. The standard has converged — UCP is the open spec the industry is building against, and the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;public directory&lt;/a&gt; tracks 4,500+ verified stores. Major retailers and platforms ship UCP implementations almost weekly. The recent &lt;a href="https://ucpchecker.com/blog/ucp-tech-council-expands-amazon-meta-microsoft-salesforce-stripe" rel="noopener noreferrer"&gt;tech council expansion&lt;/a&gt; brings in most of the rest. &lt;strong&gt;But there is still no neutral, reproducible way to evaluate how well any of those implementations actually work when a real frontier agent tries to shop them.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can't get this from inside a vendor. Shopify cannot credibly benchmark Shopify stores. OpenAI cannot credibly benchmark OpenAI agents. Even when their numbers are honest, the methodology is theirs, the test conditions favour their stack, and nobody else can rerun it. AI commerce has the same coordination problem ML had before MLPerf, and it solves the same way: a shared evaluation layer, run by a third party, that anyone can audit and reproduce.&lt;/p&gt;

&lt;p&gt;Agentic commerce can't mature without that layer. We've built a first credible attempt at one.&lt;/p&gt;

&lt;h2&gt;
  
  
  What UCP Playground Evals does
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://ucpplayground.com/evals" rel="noopener noreferrer"&gt;UCP Playground Evals&lt;/a&gt; is a benchmark framework for agentic commerce. You define a multi-turn shopping conversation, pick the stores and the models you want to evaluate against it, and get back a structured comparison report — funnel matrix, per-session token and duration metrics, error classification, replayable session links, downloadable PDF.&lt;/p&gt;

&lt;p&gt;The point isn't the report format. The point is the three properties underneath, because those determine whether a benchmark is worth trusting.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Standardised, multi-turn sequences
&lt;/h3&gt;

&lt;p&gt;Agentic commerce is conversational, not single-prompt. A real shopping session looks like &lt;em&gt;"Show me products under $60"&lt;/em&gt; → &lt;em&gt;"Add both to my cart"&lt;/em&gt; → &lt;em&gt;"Proceed to checkout"&lt;/em&gt;, with full context carried across turns. That's the unit an eval has to operate on.&lt;/p&gt;

&lt;p&gt;Each eval is a scripted sequence of turns. Every turn gets its own orchestrator round (up to 8 internal tool-calling sub-turns) and the full conversation history is preserved across the sequence — so the agent's choices on T2 are conditioned on what it actually saw on T1, the way real user behaviour conditions on real responses. Four collections ship today: &lt;strong&gt;Browse &amp;amp; Buy&lt;/strong&gt; (4 turns, generic shopping journey), &lt;strong&gt;Multi-Item&lt;/strong&gt; (3 turns, multi-product cart composition and checkout), &lt;strong&gt;Price Constrained&lt;/strong&gt; (3 turns, budget-anchored reasoning across a single purchase), and &lt;strong&gt;Custom&lt;/strong&gt; for user-defined sequences.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Cross-store comparability
&lt;/h3&gt;

&lt;p&gt;The sequences are intentionally generic. Not &lt;em&gt;"Find Nike Air Max 90 in size 10"&lt;/em&gt; but &lt;em&gt;"Show me products under $60"&lt;/em&gt;. That distinction is load-bearing: it's what makes the same test valid against any store running UCP, and it's what makes results from one store directly comparable to results from another. Without it, every benchmark is apples-to-oranges and nothing aggregates.&lt;/p&gt;

&lt;p&gt;The eval runner discovers MCP endpoints automatically from each store's &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;&lt;code&gt;/.well-known/ucp&lt;/code&gt;&lt;/a&gt; manifest, so any UCP-conformant store works without per-store wiring — &lt;a href="https://ucpchecker.com/platforms/shopify" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/bigcommerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/magento" rel="noopener noreferrer"&gt;Magento&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/prestashop" rel="noopener noreferrer"&gt;PrestaShop&lt;/a&gt;, and &lt;a href="https://ucpchecker.com/platforms/custom" rel="noopener noreferrer"&gt;Custom &amp;amp; Headless&lt;/a&gt; stacks all work the same way.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Multi-model coverage
&lt;/h3&gt;

&lt;p&gt;The same sequence runs against any of &lt;a href="https://ucpplayground.com/models" rel="noopener noreferrer"&gt;15 frontier models&lt;/a&gt; currently wired up — every major lab, plus a reasoning-tuned subset:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/claude-opus-4-6" rel="noopener noreferrer"&gt;Claude Opus 4.6&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/claude-sonnet-4-5" rel="noopener noreferrer"&gt;Claude Sonnet 4.5&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gpt-5-2" rel="noopener noreferrer"&gt;GPT-5.2&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gpt-4o" rel="noopener noreferrer"&gt;GPT-4o&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gemini-3-1-pro" rel="noopener noreferrer"&gt;Gemini 3.1 Pro&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gemini-3-flash" rel="noopener noreferrer"&gt;Gemini 3 Flash&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gemini-2-5-pro" rel="noopener noreferrer"&gt;Gemini 2.5 Pro&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/gemini-2-5-flash" rel="noopener noreferrer"&gt;Gemini 2.5 Flash&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/grok-4" rel="noopener noreferrer"&gt;Grok 4&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;xAI&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/deepseek-v3-2" rel="noopener noreferrer"&gt;DeepSeek V3.2&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;DeepSeek&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/llama-3-3-70b" rel="noopener noreferrer"&gt;Llama 3.3 70B&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Meta&lt;/td&gt;
&lt;td&gt;Frontier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/deepseek-r1" rel="noopener noreferrer"&gt;DeepSeek R1&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;DeepSeek&lt;/td&gt;
&lt;td&gt;Reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/qwq-32b" rel="noopener noreferrer"&gt;QwQ 32B&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Alibaba&lt;/td&gt;
&lt;td&gt;Reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/grok-3-mini" rel="noopener noreferrer"&gt;Grok 3 Mini&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;xAI&lt;/td&gt;
&lt;td&gt;Reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpplayground.com/models/o4-mini" rel="noopener noreferrer"&gt;o4-mini&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The model is part of the test matrix. Same store, different models, same sequence — directly comparable behaviour, with model-level differences surfaced rather than averaged away. Any two can also be &lt;a href="https://ucpplayground.com/models/compare?models=gemini-3-1-pro%2Cclaude-sonnet-4-5" rel="noopener noreferrer"&gt;compared side-by-side&lt;/a&gt; outside the eval framework, on the same workload.&lt;/p&gt;

&lt;h3&gt;
  
  
  The math is straightforward
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;stores × models × sequences = sessions&lt;/code&gt;. Two stores × two models × one sequence = four sessions. Each one is a full agent shopping run, captured end-to-end, replayable, and rolled up into the report.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Standardised, reproducible, vendor-neutral. The three properties that make a benchmark worth trusting.&lt;/strong&gt; Everything else in the framework is built to defend those three.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the framework actually surfaces
&lt;/h2&gt;

&lt;p&gt;The clearest way to show what evals do is to walk through one. Below is a multi-item checkout report we ran across two stores and two Gemini models in March:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://ucpplayground.com/examples/eval-report-sample.pdf" rel="noopener noreferrer"&gt;Download the full multi-item checkout report (PDF) →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two-page report covering the funnel comparison matrix, per-session performance breakdown, evaluator configuration, auto-generated recommendations, and clickable session-replay IDs for every run.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Two stores (&lt;a href="https://ucpchecker.com/status/oakywood.shop" rel="noopener noreferrer"&gt;oakywood.shop&lt;/a&gt;, &lt;a href="https://ucpchecker.com/status/ugmonk.com" rel="noopener noreferrer"&gt;ugmonk.com&lt;/a&gt;). Two models (Gemini 3 Flash, Gemini 3.1 Pro). One sequence (multi-item checkout: search → add → checkout). Four sessions total. The headline numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;100% checkout rate&lt;/strong&gt; across all four sessions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;95,513 average tokens&lt;/strong&gt; per session&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;48.3s average duration&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;0 errors&lt;/strong&gt; across the matrix&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the boring summary. The interesting parts are in the per-session table.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Store&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Tokens&lt;/th&gt;
&lt;th&gt;Duration&lt;/th&gt;
&lt;th&gt;Turns&lt;/th&gt;
&lt;th&gt;Cart value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;oakywood.shop&lt;/td&gt;
&lt;td&gt;Gemini 3.1 Pro&lt;/td&gt;
&lt;td&gt;85,614&lt;/td&gt;
&lt;td&gt;93.4s&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;EUR 82.75&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;oakywood.shop&lt;/td&gt;
&lt;td&gt;Gemini 3 Flash&lt;/td&gt;
&lt;td&gt;154,294&lt;/td&gt;
&lt;td&gt;34.7s&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ugmonk.com&lt;/td&gt;
&lt;td&gt;Gemini 3.1 Pro&lt;/td&gt;
&lt;td&gt;46,084&lt;/td&gt;
&lt;td&gt;35.1s&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;USD 77.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ugmonk.com&lt;/td&gt;
&lt;td&gt;Gemini 3 Flash&lt;/td&gt;
&lt;td&gt;96,058&lt;/td&gt;
&lt;td&gt;29.9s&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Same sequence, same stores, two models. Gemini 3.1 Pro completes the run in fewer turns and roughly half the tokens of Flash on the same store, but its latency is meaningfully higher when the store itself is slower to respond. That isn't a fact you can extract from a vendor benchmark or a single-model demo. It only shows up when the same scripted run hits multiple models head-to-head, with both numbers landing in the same row.&lt;/p&gt;

&lt;p&gt;The auto-generated recommendations point at where the real engineering work is, and they're grounded in the actual run data:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Average token usage is 95,513 — above the 40K baseline. Product descriptions may be inflating context. Consider truncating descriptions in MCP responses.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Average session duration is 48.3s — above the 15s target. Optimise MCP endpoint response times, especially initial search calls.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Those are concrete merchandising actions. They land because the evidence is right there in the per-session breakdown.&lt;/p&gt;

&lt;p&gt;The deeper signal shows up across runs against richer stores. In a separate eval against a single shop, two models picked &lt;em&gt;different variant IDs for "Medium"&lt;/em&gt; — one mapped Medium to one variant ID, the other to a different one, and neither is provably correct because the store doesn't expose a human-readable size axis in its variant data. That isn't a bug in either model. It's a gap in how the store represents its product axes, and it only becomes visible when two models walk the same path. &lt;strong&gt;This is the kind of behavioural divergence between frontier models that evals surface — and that vendor-internal benchmarks can't credibly report.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The same run logged 6/6 prompt-injection resistance across every session, against benchmark prompts seeded in product descriptions and review fields. Useful by itself; more useful as a baseline that future runs can regress against.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's on the evals roadmap
&lt;/h2&gt;

&lt;p&gt;This is v1. A few things on the roadmap, in priority order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;More eval collections.&lt;/strong&gt; The four built-in sequences cover the core shopping flow. The next batch is more diagnostic: single-item flow (the simplest path), variant selection accuracy (the size-label gap above, formalised), prompt-injection resistance (already running, becoming its own collection), escalation handling (&lt;code&gt;requires_escalation&lt;/code&gt; compliance), attribution accuracy (UTM and referrer handling at checkout hand-off), return policy surfacing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Public benchmark leaderboards.&lt;/strong&gt; Same pattern as the &lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;UCP Score leaderboard&lt;/a&gt; — by-store and by-model rankings against the standard sequences, refreshed on schedule, indexed and shareable. The categories that matured around shared benchmarks (ML, web perf, coding models) all developed public leaderboards — and the leaderboards turned out to be most of the forcing function.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Headless API and CI/CD integration.&lt;/strong&gt; Already shipped. The full automation surface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST /api/v1/collections          — create
POST /api/v1/collections/{id}/run — trigger
GET  /api/v1/collection-runs/{id} — poll status + results
GET  /api/v1/collection-runs/{id}/pdf — download report
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first integration we expect anyone to ship is a deploy-time check: trigger an eval after every UCP manifest deploy, assert &lt;code&gt;checkout_rate &amp;gt;= 80&lt;/code&gt;, &lt;code&gt;errors.total == 0&lt;/code&gt;, &lt;code&gt;avg_duration_ms &amp;lt; 30000&lt;/code&gt;, fail the build otherwise. Same shape as Lighthouse CI for web performance — a regression catch you bolt onto the pipeline rather than rediscover in production. Full developer documentation — authentication, rate limits, and a worked GitHub Actions example — lives at &lt;a href="https://ucpchecker.com/developer-tools" rel="noopener noreferrer"&gt;ucpchecker.com/developer-tools&lt;/a&gt;, alongside the rest of the public API surface.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scheduled runs and version tracking.&lt;/strong&gt; Also shipped. Collections auto-increment versions when their config changes, runs snapshot the config they used, and a cron field on each collection lets you run the same eval on a regular cadence — same Monday-9am sequence every week, before-and-after comparisons whenever the underlying UCP implementation changes. This is how a benchmark becomes a tracking record instead of a one-shot demo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cloning and team scoping.&lt;/strong&gt; Public collections can be cloned into any team workspace; quotas are scoped per team. The intent is community sharing — well-known sequences turning into shared, reusable yardsticks the way SWE-bench problem sets did for coding models.&lt;/p&gt;

&lt;h2&gt;
  
  
  How evals fit the broader development cycle
&lt;/h2&gt;

&lt;p&gt;Evals don't sit alone. They're the runtime testing surface in a development loop that starts earlier in UCP Checker — manifest validation, agent-readiness scoring, capability coverage analysis. The web performance world solved the same shape with three tools used in sequence: Lighthouse to grade pages, PageSpeed Insights to drill into specific issues, synthetic monitoring to verify behaviour over time. UCP implementations follow the same arc: validate the manifest at &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;&lt;code&gt;/check&lt;/code&gt;&lt;/a&gt;, score it against agent-readiness criteria with the &lt;a href="https://ucpchecker.com/blog/introducing-ucp-score-agent-readiness-grade" rel="noopener noreferrer"&gt;UCP Score&lt;/a&gt;, then run evals against it to see how it actually behaves when a real frontier agent shops it.&lt;/p&gt;

&lt;p&gt;Each tool surfaces something different. Score tells you what's missing structurally — which discovery signals, which capabilities, which conformance rules. Check confirms the manifest validates after fixes land. Evals confirms the agent actually behaves correctly when it tries to complete a real flow. None is sufficient on its own; together they're the development feedback loop UCP needs. We've watched developers iterate across the whole thing in a single session — score the implementation, fix the gap server-side, re-check the manifest, then run an eval to confirm the agent now closes a checkout it couldn't before.&lt;/p&gt;

&lt;p&gt;If you're starting from zero on a UCP implementation, the natural sequence is: get a Score first to see what's missing, fix the highest-impact issues, run a Check to confirm the manifest validates cleanly, then run Evals to confirm real agents complete the flows you care about. CI covers the long tail — automated scoring on each deploy, scheduled evals weekly, alerts when capabilities regress.&lt;/p&gt;

&lt;h2&gt;
  
  
  Methodology and verification
&lt;/h2&gt;

&lt;p&gt;Three properties separate a credible benchmark from a marketing claim. UCP Playground Evals are designed around all three.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Every result links to a replayable session.&lt;/strong&gt; Each eval session generates the same &lt;code&gt;agent_sessions&lt;/code&gt; data the public Playground UI produces — full tool-call timeline, model responses, token-by-token event stream, every retrieved page. The session IDs in any report are clickable. Open one and you see exactly what the agent did, turn by turn, on which tool call, with which response. The sample report above lists four such IDs (e.g. &lt;code&gt;01KMJZM5MG2CA4QN5M983H19E1&lt;/code&gt;) and each resolves to a full replay at &lt;code&gt;ucpplayground.com/sessions/{id}&lt;/code&gt;. &lt;strong&gt;This isn't a marketing claim; it's a verifiable test you can audit.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Every collection is versioned.&lt;/strong&gt; When the configuration of a collection changes — turns added, models swapped, store list updated — the version increments and every run snapshots the config it ran against. Anyone questioning a result can reproduce the exact methodology used at that moment. The PDF report itself prints the collection version at the bottom of every page; the sample above is &lt;code&gt;Collection v3&lt;/code&gt;. Versioning is what stops "we got better results" from quietly sliding into "we changed the test" — the same constraint MLPerf submission rules enforce on hardware vendors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The methodology is open.&lt;/strong&gt; The framework configuration shape is documented — the turns, the orchestrator loop, the stop conditions, the success metrics, the PDF schema. Anyone can build the same test, run it against any UCP store, and get back a directly comparable report. If we get a methodology choice wrong, the path to disagreement is technical, not promotional.&lt;/p&gt;

&lt;p&gt;That's the credibility floor. Everything else in the product builds on it.&lt;/p&gt;

&lt;h2&gt;
  
  
  About UCP Checker and UCP Playground
&lt;/h2&gt;

&lt;p&gt;UCP Checker is the independent validation and monitoring layer for the &lt;a href="https://ucp.dev" rel="noopener noreferrer"&gt;Universal Commerce Protocol&lt;/a&gt;. We crawl, validate, and grade every public UCP manifest in the open web, run the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;merchant directory&lt;/a&gt; and the &lt;a href="https://ucpchecker.com/score" rel="noopener noreferrer"&gt;UCP Score&lt;/a&gt;, publish the &lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;leaderboard&lt;/a&gt; and &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;adoption stats&lt;/a&gt;, and ship developer tools — the &lt;a href="https://ucpchecker.com/ucp-validator" rel="noopener noreferrer"&gt;validator&lt;/a&gt;, &lt;a href="https://ucpchecker.com/bulk-check" rel="noopener noreferrer"&gt;bulk checker&lt;/a&gt;, &lt;a href="https://ucpchecker.com/extension" rel="noopener noreferrer"&gt;browser extension&lt;/a&gt;, &lt;a href="https://ucpchecker.com/developer-tools" rel="noopener noreferrer"&gt;public dataset&lt;/a&gt;, and a public REST API. The whole dataset is open, indexed, and ungated.&lt;/p&gt;

&lt;p&gt;UCP Playground is the agent shopping layer that sits next to it — same data model, same &lt;code&gt;/.well-known/ucp&lt;/code&gt; discovery, same replayable session format. UCP Playground Evals is the benchmark surface on top of that. Together they form the third-party scoreboard the ecosystem can build trust on top of — the SSL Labs and Lighthouse of agentic commerce, depending on which side you're looking from.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The interesting eval gaps are the ones nobody's tested yet.&lt;/strong&gt; If a result surprises you — your own store, a competitor's, a model you assumed was a clear winner that turns out not to be — &lt;a href="https://ucpchecker.com/contact" rel="noopener noreferrer"&gt;let us know&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Three concrete next steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Run an eval against your own UCP store.&lt;/strong&gt; Create a collection at &lt;a href="https://ucpplayground.com/evals" rel="noopener noreferrer"&gt;ucpplayground.com/evals&lt;/a&gt;, pick a sequence, pick two models, run it. The four-session example above is the shape most first runs take.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read a public eval report.&lt;/strong&gt; Sample reports are linked from the framework page. Each has clickable session IDs you can replay end-to-end.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wire it into CI.&lt;/strong&gt; The &lt;a href="https://ucpchecker.com/developer-tools" rel="noopener noreferrer"&gt;developer tools page&lt;/a&gt; covers authentication, rate limits, and a GitHub Actions worked example. The assertion shape is the same one Lighthouse CI uses for web performance — &lt;code&gt;checkout_rate&lt;/code&gt;, &lt;code&gt;errors.total&lt;/code&gt;, &lt;code&gt;avg_duration_ms&lt;/code&gt; instead of LCP and TBT.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>product</category>
      <category>ucp</category>
    </item>
    <item>
      <title>Is My Store UCP Ready? How to Check in 60 Seconds</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Thu, 30 Apr 2026 10:25:51 +0000</pubDate>
      <link>https://forem.com/benjifisher/is-my-store-ucp-ready-how-to-check-in-60-seconds-4fco</link>
      <guid>https://forem.com/benjifisher/is-my-store-ucp-ready-how-to-check-in-60-seconds-4fco</guid>
      <description>&lt;p&gt;The short answer: &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;enter your domain here&lt;/a&gt; and you'll know in under 60 seconds. This UCP ready check runs the same validation that AI agents use to decide whether your store is worth shopping.&lt;/p&gt;

&lt;p&gt;The longer answer — what "UCP ready" actually means, why it matters, and what to do about the result — is what this post covers.&lt;/p&gt;

&lt;h2&gt;
  
  
  What UCP readiness means
&lt;/h2&gt;

&lt;p&gt;A store is "UCP ready" when it publishes a valid manifest at &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;&lt;code&gt;/.well-known/ucp&lt;/code&gt;&lt;/a&gt; that AI shopping agents can discover, parse, and act on. That's the technical definition.&lt;/p&gt;

&lt;p&gt;In practice, there are three levels:&lt;/p&gt;

&lt;h3&gt;
  
  
  Level 1: Verified
&lt;/h3&gt;

&lt;p&gt;Your manifest exists, returns valid JSON, and passes &lt;a href="https://ucpchecker.com/ucp-validator" rel="noopener noreferrer"&gt;schema validation&lt;/a&gt; against the current &lt;a href="https://ucpchecker.com/specs/2026-04-08" rel="noopener noreferrer"&gt;v2026-04-08 spec&lt;/a&gt;. You appear in the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;UCP directory&lt;/a&gt;. Agents can find you.&lt;/p&gt;

&lt;p&gt;As of this month, &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;4,024 stores&lt;/a&gt; are at this level.&lt;/p&gt;

&lt;h3&gt;
  
  
  Level 2: Agent-functional
&lt;/h3&gt;

&lt;p&gt;Agents can actually &lt;em&gt;shop&lt;/em&gt; your store — not just discover it. Your MCP endpoint responds, your product data is clean, your checkout flow completes without errors. You score B+ or higher on the &lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;Playground leaderboard&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;422 stores are at this level. The gap between "verified" and "agent-functional" is where most &lt;a href="https://ucpchecker.com/blog/common-ucp-errors" rel="noopener noreferrer"&gt;common errors&lt;/a&gt; live.&lt;/p&gt;

&lt;h3&gt;
  
  
  Level 3: Optimized
&lt;/h3&gt;

&lt;p&gt;Agents complete purchases reliably across multiple models. Your variant data is clean, your latency is low, your capabilities go beyond the defaults. You score A. Only 9 stores are here today.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://ucpchecker.com/blog/ucp-requirements" rel="noopener noreferrer"&gt;UCP requirements checklist&lt;/a&gt; breaks down exactly what each level requires.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to check your store
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Run the checker
&lt;/h3&gt;

&lt;p&gt;Go to &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;UCPChecker.com/check&lt;/a&gt; and enter your domain. When you check your UCP status, the checker will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fetch &lt;code&gt;/.well-known/ucp&lt;/code&gt; from your domain&lt;/li&gt;
&lt;li&gt;Validate the JSON against the current spec&lt;/li&gt;
&lt;li&gt;Check your robots.txt for AI bot policies&lt;/li&gt;
&lt;li&gt;Inventory your declared capabilities, transports, and payment handlers&lt;/li&gt;
&lt;li&gt;Verify your UCP compliance and report every error and warning with specific error codes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The whole process takes about 1 second. You'll get a full diagnostic report on your &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;status page&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Read the result
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Verified&lt;/strong&gt; (green) — your manifest is valid. You're in the directory. Agents can find you. Check the warnings section for things to improve.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Invalid&lt;/strong&gt; (amber) — your manifest exists but fails validation. The diagnostic panel shows exactly which fields are wrong or missing. Most invalid manifests are one fix away from passing — usually a &lt;a href="https://ucpchecker.com/blog/common-ucp-errors" rel="noopener noreferrer"&gt;missing required field or a misplaced signing_keys&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not Detected&lt;/strong&gt; (grey) — no manifest found at &lt;code&gt;/.well-known/ucp&lt;/code&gt;. Your store isn't UCP ready yet. See the &lt;a href="https://ucpchecker.com/blog/ucp-requirements" rel="noopener noreferrer"&gt;requirements post&lt;/a&gt; for what to publish.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Blocked&lt;/strong&gt; (orange) — your robots.txt or firewall is preventing access to the manifest. The diagnostic will tell you whether it's a robots.txt rule or an HTTP-level block.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Fix what's broken
&lt;/h3&gt;

&lt;p&gt;The checker tells you &lt;em&gt;what&lt;/em&gt; is wrong. Here's where to go for &lt;em&gt;how&lt;/em&gt; to fix it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Platform-specific guides:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/blog/shopify-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt; · &lt;a href="https://ucpchecker.com/blog/woocommerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt; · &lt;a href="https://ucpchecker.com/blog/bigcommerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt; · &lt;a href="https://ucpchecker.com/blog/magento-adobe-commerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;Magento&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manifest reference:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;/.well-known/ucp developer guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error-by-error fixes:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/blog/common-ucp-errors" rel="noopener noreferrer"&gt;Common UCP errors&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spec changes:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/blog/ucp-v2026-04-08-spec-update" rel="noopener noreferrer"&gt;v2026-04-08 update&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 4: Test with real agents
&lt;/h3&gt;

&lt;p&gt;Schema validation tells you if your manifest is syntactically correct. It tells you nothing about whether an agent can actually buy something from your store. For that, you need &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;UCP Playground&lt;/a&gt; — it runs real AI agent sessions against your store and shows you exactly where the flow breaks.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://ucpchecker.com/blog/agentic-commerce-optimization-ucp-readiness-data" rel="noopener noreferrer"&gt;agent testing data&lt;/a&gt; shows that the most common runtime failure is &lt;a href="https://ucpchecker.com/blog/common-ucp-errors" rel="noopener noreferrer"&gt;variant mismatches&lt;/a&gt; — clean product data matters more than perfect schema.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Monitor
&lt;/h3&gt;

&lt;p&gt;Your UCP endpoint is a live API. Platform updates, catalog changes, and CDN reconfigurations can break it silently. Set up &lt;a href="https://ucpchecker.com/alerts" rel="noopener noreferrer"&gt;UCP Alerts&lt;/a&gt; to get emailed the moment your status changes — before agents notice.&lt;/p&gt;

&lt;h2&gt;
  
  
  How you compare
&lt;/h2&gt;

&lt;p&gt;Once you're verified, see how your store stacks up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;Compare side-by-side&lt;/a&gt;&lt;/strong&gt; with a competitor or partner store — capabilities, transports, payment handlers, latency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/platforms" rel="noopener noreferrer"&gt;Browse your platform&lt;/a&gt;&lt;/strong&gt; — see all verified &lt;a href="https://ucpchecker.com/platforms/shopify" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/bigcommerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt;, or &lt;a href="https://ucpchecker.com/platforms/magento" rel="noopener noreferrer"&gt;Magento&lt;/a&gt; stores ranked by capability depth.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;Check the leaderboard&lt;/a&gt;&lt;/strong&gt; — stores graded A through F on real agent shopping performance.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why this matters now
&lt;/h2&gt;

&lt;p&gt;UCP adoption is accelerating. &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;1,400+ new merchants&lt;/a&gt; were discovered in April alone. Shopify &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;migrated its entire fleet&lt;/a&gt; to the latest spec in four days. BigCommerce, WooCommerce, and Magento stores are appearing every week.&lt;/p&gt;

&lt;p&gt;Am I UCP ready? The question isn't whether your store will need UCP. It's whether you'll be ready when agents start shopping — and &lt;a href="https://ucpchecker.com/blog/agentic-commerce-optimization-ucp-readiness-data" rel="noopener noreferrer"&gt;they already are&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Before you check, it helps to understand the building blocks: &lt;a href="https://ucpchecker.com/capabilities" rel="noopener noreferrer"&gt;capabilities&lt;/a&gt; define what your store can do for agents, &lt;a href="https://ucpchecker.com/payment-handlers" rel="noopener noreferrer"&gt;payment handlers&lt;/a&gt; define how agents pay, &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;transports&lt;/a&gt; define how agents connect, and &lt;a href="https://ucpchecker.com/product-discovery" rel="noopener noreferrer"&gt;product discovery&lt;/a&gt; is the flow agents actually run when they shop.&lt;/p&gt;

&lt;p&gt;Make sure your listing on &lt;a href="https://ucpregistry.com" rel="noopener noreferrer"&gt;UCP Registry&lt;/a&gt; is accurate once you're verified — that's how agents find you in the first place.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;Check your store now →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Build your own agent: &lt;a href="https://ucpchecker.com/agents" rel="noopener noreferrer"&gt;developer quickstart&lt;/a&gt;. Understand the protocol stack: &lt;a href="https://ucpchecker.com/blog/mcp-vs-ucp-vs-ap2-whats-the-difference" rel="noopener noreferrer"&gt;MCP vs UCP vs AP2&lt;/a&gt;. Monthly ecosystem data: &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;State of Agentic Commerce&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>tutorial</category>
      <category>ucp</category>
    </item>
    <item>
      <title>Introducing the UCP Score: A 0–100 Agent-Readiness Grade for Every UCP Store</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Wed, 29 Apr 2026 09:41:44 +0000</pubDate>
      <link>https://forem.com/benjifisher/introducing-the-ucp-score-a-0-100-agent-readiness-grade-for-every-ucp-store-1851</link>
      <guid>https://forem.com/benjifisher/introducing-the-ucp-score-a-0-100-agent-readiness-grade-for-every-ucp-store-1851</guid>
      <description>&lt;p&gt;After every status check on UCPChecker, the same follow-up question lands in our inbox: &lt;strong&gt;"OK, my manifest is verified. But is it actually any good?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That question comes from everywhere. Engineering leads who shipped a manifest last quarter and want to know if it would actually carry an agent through checkout. Platform teams pitching agent-readiness to merchants who need a number, not a status pill. Analysts trying to chart "&lt;a href="https://ucpchecker.com/platforms" rel="noopener noreferrer"&gt;how Shopify compares to WooCommerce&lt;/a&gt;" and finding that "verified" tells them next to nothing. &lt;a href="https://ucpchecker.com/developer-tools" rel="noopener noreferrer"&gt;Developers&lt;/a&gt; picking which UCP store to integrate with first. AI agent builders deciding whose endpoints to feature in demo flows. &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;Store owners&lt;/a&gt; benchmarking against direct competitors before a quarterly review.&lt;/p&gt;

&lt;p&gt;None of these audiences really care that a manifest exists. They care about how good it is. Whether it has the surface signals that keep AI shopping agents finding it. Whether the declared transports actually respond when you call them. Whether the spec and schema URLs in the manifest resolve, or quietly 404 the moment a strict agent tries to validate the response shape. The interesting answer is always graded.&lt;/p&gt;

&lt;p&gt;Until today, the only way to answer that question on UCPChecker was to read every line of the validator output and squint. So we built the thing people were already trying to do manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://ucpchecker.com/score" rel="noopener noreferrer"&gt;Get a UCP Score for any domain at ucpchecker.com/score →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What the UCP Score is
&lt;/h2&gt;

&lt;p&gt;A 0–100 composite grade that measures how agent-ready any UCP store actually is. Not "does the manifest exist" — that's the status page. &lt;strong&gt;How well does it work for agents.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The score maps to a single letter grade you can share, embed, or watch over time. Bands are deliberately calibrated to match Lighthouse and SSL Labs — A is meant to be hard to earn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A (85–100)&lt;/strong&gt; — Agent-ready. Valid manifest, strong discovery, broad capability coverage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;B (70–84)&lt;/strong&gt; — Solid. Minor gaps or one weak category, agents can still transact.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;C (50–69)&lt;/strong&gt; — Partial. Manifest works but missing capabilities or surface signals.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;D (30–49)&lt;/strong&gt; — Weak. Manifest reachable but invalid or near-empty.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;F (0–29)&lt;/strong&gt; — Failing. Blocked, unreachable, or no manifest detected.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every score breaks down into three weighted categories so you can see exactly where the points come from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agent Discovery (30%)&lt;/strong&gt; — Can agents find and reach you? HTTPS, reachability, agent-friendly &lt;code&gt;robots.txt&lt;/code&gt;, plus the surface signals that keep you in the conversation: &lt;code&gt;/llms.txt&lt;/code&gt;, &lt;code&gt;sitemap.xml&lt;/code&gt;, Open Graph tags, Organization JSON-LD, mobile viewport meta.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UCP Conformance (40%)&lt;/strong&gt; — Does the manifest validate against the &lt;a href="https://ucpchecker.com/specs" rel="noopener noreferrer"&gt;spec&lt;/a&gt;? Validity is 3× weighted in this category — an invalid manifest cannot score above ~50 here, regardless of how good the surface polish is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capability Coverage (30%)&lt;/strong&gt; — What can an agent actually do at your store? Declared &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;transports&lt;/a&gt; (REST/MCP/A2A), checkout, &lt;a href="https://ucpchecker.com/payment-handlers" rel="noopener noreferrer"&gt;payment handlers&lt;/a&gt;, and breadth of &lt;a href="https://ucpchecker.com/capabilities" rel="noopener noreferrer"&gt;capabilities&lt;/a&gt;. When functional probes run, declared transport endpoints that don't actually respond drag this score down.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The composite is a straight weighted average: &lt;code&gt;Discovery × 0.30 + Conformance × 0.40 + Capabilities × 0.30&lt;/code&gt;. No tricks, no hidden weights. The full ruleset is documented in our &lt;a href="https://ucpchecker.com/methodology" rel="noopener noreferrer"&gt;methodology&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you actually get
&lt;/h2&gt;

&lt;p&gt;Every score URL is a live page at &lt;code&gt;/score/{your-domain}&lt;/code&gt;, indexed and shareable. Open one and you don't just see a number:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Top priorities&lt;/strong&gt; — The three highest-impact issues we found, ranked by impact × effort. Start here.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Impact vs Effort matrix&lt;/strong&gt; — Quick Wins / Strategic / Incremental / Consider Later quadrants so you can plan a sprint instead of staring at a wall of warnings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recommendations with copy-paste fixes&lt;/strong&gt; — Every flagged issue surfaces a snippet you can drop straight into your manifest, &lt;code&gt;robots.txt&lt;/code&gt;, sitemap, or HTML &lt;code&gt;&amp;lt;head&amp;gt;&lt;/code&gt;. Hit "Show fix", copy, paste, redeploy, re-check.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform-aware percentile&lt;/strong&gt; — "You're at p72 latency vs the median Shopify store." Because comparing your latency against the whole directory is meaningless when half of it runs on a fundamentally different infrastructure profile.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full check breakdown&lt;/strong&gt; — Every signal we evaluate, grouped by category, with a "why it matters" paragraph alongside each check. No black boxes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Save this report&lt;/strong&gt; — We re-run the full check weekly and email you only when something material changes. Score drops, capability regresses, status flips. Free, no marketing, unsubscribe anytime.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The page is ungated. No signup, no paywall, no "create an account to see the breakdown." We're indexing every score — just like SSL Labs grades and PageSpeed scores. Public scores create a baseline and pressure for the ecosystem to improve, in the same way SSL grades did for HTTPS adoption.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we built it
&lt;/h2&gt;

&lt;p&gt;The honest answer: &lt;strong&gt;verified-or-not is the wrong question now.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When the UCP spec first landed in January (v2026-01-11), finding a verified store at all was novel. The bar was "did anyone publish a manifest." The status page was the right product for that moment, and it still is for the discovery layer.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;directory&lt;/a&gt; has 4,500+ verified domains today. Verified isn't novel. The interesting question shifted to &lt;strong&gt;"how well does this thing actually work for agents,"&lt;/strong&gt; and nobody had a good answer to that — including us.&lt;/p&gt;

&lt;p&gt;When we ran a deeper analysis for our &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;April State of Agentic Commerce report&lt;/a&gt;, the gap was stark: out of &lt;strong&gt;4,014 verified UCP stores, only 9 delivered a flawless end-to-end agent experience&lt;/strong&gt;. A 0.2% flawless rate. The other 99.8% had a manifest published — they just didn't actually work as well as that manifest suggested. That gap between "verified" and "actually works" is the central infrastructure problem in agentic commerce today. The UCP Score makes that gap visible, measurable, and addressable.&lt;/p&gt;

&lt;p&gt;There's a clear analogue: PageSpeed before Lighthouse. Pre-Lighthouse, web performance optimisation was vibes. People knew slow sites were bad and fast sites were good but couldn't quantify "how slow" or "compared to what." Lighthouse gave them three things — a graded score, a category breakdown, and copy-paste optimisations — and the field changed overnight. Nobody ships a serious site today without checking their Lighthouse score first.&lt;/p&gt;

&lt;p&gt;The agentic commerce ecosystem is at exactly that pre-Lighthouse moment. There's no shared yardstick for agent-readiness. Stores have no way to tell whether the integration they shipped last month is competitive. Platform teams have no way to back up "our merchants are more agent-ready" with a number. AI agent builders have no way to filter "show me the stores most likely to actually complete a transaction."&lt;/p&gt;

&lt;p&gt;The UCP Score is meant to be that yardstick. &lt;strong&gt;Lighthouse for agentic commerce.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How we built it (the short version)
&lt;/h2&gt;

&lt;p&gt;Three signal sources, one composite:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Static analysis&lt;/strong&gt; — The same manifest validator that powers &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;&lt;code&gt;/check&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://ucpchecker.com/ucp-validator" rel="noopener noreferrer"&gt;&lt;code&gt;/ucp-validator&lt;/code&gt;&lt;/a&gt;. Validity, version format, signing keys, payment handlers — every spec rule turned into a check row.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Surface signals&lt;/strong&gt; — Five public files and meta tags fetched in parallel: &lt;code&gt;/llms.txt&lt;/code&gt;, &lt;code&gt;/sitemap.xml&lt;/code&gt;, Open Graph, Organization JSON-LD, viewport. Presence + content captured (with a content hash for change detection on &lt;code&gt;llms.txt&lt;/code&gt; so we can spot when a brand updates their LLM brief).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Functional probes&lt;/strong&gt; (opt-in) — Two probe families. Transport probes hit each declared transport endpoint with a benign request (MCP gets a &lt;code&gt;tools/list&lt;/code&gt;, REST/A2A get a GET). URL resolution probes fetch every &lt;code&gt;spec&lt;/code&gt; and &lt;code&gt;schema&lt;/code&gt; URL declared in the manifest. Probes only run on user-triggered checks — not on the 24h cron sweep, because hammering 4,500 merchants daily with a dozen extra HTTP requests each isn't neighbourly.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each signal feeds one category sub-score (0–100), and the composite is the weighted average. Recommendations join error codes against a fix library so every flagged issue surfaces a copy-paste snippet — the same pattern Lighthouse uses for its audit list. The whole pipeline runs on the same 24h cycle as the rest of the directory; checks you trigger manually run the full probe stack.&lt;/p&gt;

&lt;p&gt;If you want the deep version, the &lt;a href="https://ucpchecker.com/methodology" rel="noopener noreferrer"&gt;methodology page&lt;/a&gt; walks through every category, every check, every grade band, and the "what we don't score" list.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you can do with it
&lt;/h2&gt;

&lt;p&gt;A few workflows the score unlocks immediately:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pre-merge gate&lt;/strong&gt; — Add a check in your CI that fails the build if your &lt;code&gt;/score/{domain}&lt;/code&gt; drops below B. Same pattern as Lighthouse CI. The score URL is stable and the JSON breakdown lands in the API soon.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform comparison&lt;/strong&gt; — The &lt;a href="https://ucpchecker.com/platforms" rel="noopener noreferrer"&gt;&lt;code&gt;/platforms&lt;/code&gt;&lt;/a&gt; page now shows average UCP Score by platform — Shopify vs WooCommerce vs BigCommerce vs Magento at a glance. Useful both for picking a stack and for benchmarking the one you're on.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leaderboard&lt;/strong&gt; — The &lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;leaderboard&lt;/a&gt; is now ranked by UCP Score with sortable columns for each sub-score. Filter by platform to see the top stores on your stack.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring&lt;/strong&gt; — Save any report against your email. We re-run it weekly and alert you on regressions. Score drops, capability disappears, status flips — one email, free, no marketing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Competitive benchmarking&lt;/strong&gt; — Run &lt;a href="https://ucpchecker.com/compare/allbirds.com/vs/casper.com" rel="noopener noreferrer"&gt;Allbirds vs Casper&lt;/a&gt; and see grades side by side. The compare page picks up score data automatically.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;This is v1. A few things already on the roadmap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Score history &amp;amp; sparkline&lt;/strong&gt; — Save a report and you'll see your score trend over time. We're tracking every check in our history table from day one, so the data exists; the visual lands shortly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Score API&lt;/strong&gt; — &lt;code&gt;GET /api/v1/score/{domain}&lt;/code&gt; returning the full breakdown as JSON. The &lt;a href="https://ucpchecker.com/developer-tools" rel="noopener noreferrer"&gt;data feed&lt;/a&gt; is already public; the score endpoint is the same data behind a stable contract.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spec-version-aware scoring weights&lt;/strong&gt; — As new UCP spec versions land with new emphasis, scoring rules for each version live in config and absorb cleanly. Already version-aware for validation; widening to scoring weights too.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We've also taken pains to make the system absorb future spec releases without a rewrite. Static check copy lives in config, not hardcoded; new error codes plug into the recommendations engine via a single config entry. The next spec drop should land as a configuration change, not a refactor.&lt;/p&gt;

&lt;h2&gt;
  
  
  About UCP Checker
&lt;/h2&gt;

&lt;p&gt;UCP Checker is the independent validation and monitoring layer for the &lt;a href="https://ucp.dev" rel="noopener noreferrer"&gt;Universal Commerce Protocol&lt;/a&gt;. We crawl, validate, and grade every public UCP manifest in the open web, run the public &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;merchant directory&lt;/a&gt;, publish the &lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;leaderboard&lt;/a&gt; and &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;adoption stats&lt;/a&gt;, and ship developer tools — the &lt;a href="https://ucpchecker.com/ucp-validator" rel="noopener noreferrer"&gt;validator&lt;/a&gt;, the &lt;a href="https://ucpchecker.com/bulk-check" rel="noopener noreferrer"&gt;bulk checker&lt;/a&gt;, the &lt;a href="https://ucpchecker.com/extension" rel="noopener noreferrer"&gt;browser extension&lt;/a&gt;, and now the UCP Score. Everything is free, indexed, and ungated; the dataset is published openly under CC-BY 4.0. Think of us as the SSL Labs of agentic commerce — the third-party scoreboard the ecosystem can build trust on top of.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Pick any domain. Type it into &lt;a href="https://ucpchecker.com/score" rel="noopener noreferrer"&gt;ucpchecker.com/score&lt;/a&gt; and you'll have a graded report in under a second. If you find a score that surprised you — yours or a competitor's — &lt;a href="https://ucpchecker.com/contact" rel="noopener noreferrer"&gt;let us know&lt;/a&gt;. The interesting score gaps are the ones nobody's looked at yet.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Get a score:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/score" rel="noopener noreferrer"&gt;ucpchecker.com/score&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;See the leaderboard:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;ucpchecker.com/leaderboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it's calculated:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/methodology" rel="noopener noreferrer"&gt;ucpchecker.com/methodology&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compare two stores:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;ucpchecker.com/compare&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Track adoption live:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;ucpchecker.com/stats&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get notified on changes:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/alerts" rel="noopener noreferrer"&gt;ucpchecker.com/alerts&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ecommerce</category>
      <category>ai</category>
      <category>product</category>
      <category>ucp</category>
    </item>
    <item>
      <title>UCP Tech Council Expands: What the Meeting Minutes Tell Us About Where the Protocol Is Heading</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Sun, 26 Apr 2026 21:37:00 +0000</pubDate>
      <link>https://forem.com/benjifisher/ucp-tech-council-expands-what-the-meeting-minutes-tell-us-about-where-the-protocol-is-heading-5a92</link>
      <guid>https://forem.com/benjifisher/ucp-tech-council-expands-what-the-meeting-minutes-tell-us-about-where-the-protocol-is-heading-5a92</guid>
      <description>&lt;p&gt;On Friday just gone, five of the largest technology companies in the world quietly joined the governing body of the Universal Commerce Protocol. No press release. No blog post. Just a &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/commit/80ea01c" rel="noopener noreferrer"&gt;commit to MAINTAINERS.md&lt;/a&gt; in the spec repository.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon. Meta. Microsoft. Salesforce. Stripe.&lt;/strong&gt; All now have seats on the UCP Tech Council — the body that reviews, debates, and approves every change to the protocol that AI shopping agents use to buy things.&lt;/p&gt;

&lt;p&gt;We know this because we read the meeting minutes. Every week, the TC meets to debate spec changes, vote on PRs, and argue about how agent commerce should work. Most people in the industry don't read these minutes. We do — and what they reveal about where UCP is heading is more interesting than any announcement.&lt;/p&gt;

&lt;p&gt;This is what the minutes tell us.&lt;/p&gt;

&lt;h2&gt;
  
  
  The expansion: who joined and why it matters
&lt;/h2&gt;

&lt;p&gt;The Tech Council grew from roughly 12 seats to &lt;strong&gt;16 members across 8 companies&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Company&lt;/th&gt;
&lt;th&gt;Representatives&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Google&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4 seats&lt;/td&gt;
&lt;td&gt;Founding sponsor, spec steward&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Shopify&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4 seats (incl. 2 new)&lt;/td&gt;
&lt;td&gt;Largest platform implementer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Greg Smith (new)&lt;/td&gt;
&lt;td&gt;The world's largest online retailer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Meta&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;James Andersen (new)&lt;/td&gt;
&lt;td&gt;Social commerce, Instagram Shopping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Microsoft&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Patrick Jordan (new)&lt;/td&gt;
&lt;td&gt;Copilot, enterprise commerce&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Stripe&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Prasad Wangikar (new)&lt;/td&gt;
&lt;td&gt;Payment infrastructure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Salesforce&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Scot DeDeo (new)&lt;/td&gt;
&lt;td&gt;Commerce Cloud, enterprise retail&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Etsy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Imran Hoosain&lt;/td&gt;
&lt;td&gt;Marketplace commerce&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Target&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Maxime Najim&lt;/td&gt;
&lt;td&gt;Enterprise retail&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Wayfair&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Naga Malepati&lt;/td&gt;
&lt;td&gt;Furniture/home goods&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This isn't ceremonial. The TC has binding authority over spec changes — every PR that ships in a UCP release has been reviewed and voted on by this group. When Amazon and Stripe join that table, it changes what gets prioritised, what gets debated, and ultimately what the protocol becomes.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/Universal-Commerce-Protocol/meeting-minutes/blob/main/tc/2026/2026-03-13.md" rel="noopener noreferrer"&gt;meeting minutes from March 13&lt;/a&gt; first mentioned the election process: seats rotating every six months, with growing partner interest. By &lt;a href="https://github.com/Universal-Commerce-Protocol/meeting-minutes/blob/main/tc/2026/2026-03-27.md" rel="noopener noreferrer"&gt;March 27&lt;/a&gt;, six nominations had been received. The final review was scheduled for April 10. The MAINTAINERS.md update landed April 24.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The new members are already contributing.&lt;/strong&gt; James Andersen (Meta) submitted &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/367" rel="noopener noreferrer"&gt;PR #367&lt;/a&gt; on April 17 — a documentation PR clarifying network token usage and PCI scope in card credentials. Patrick Jordan (Microsoft) contributed documentation accuracy fixes the same day. These aren't advisory seats. They're engineering seats.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the meeting minutes actually say
&lt;/h2&gt;

&lt;p&gt;We reviewed the six TC meetings from March 6 through April 17. Here's what's being debated, decided, and built — translated for a merchant audience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Identity linking is the top priority — and it's hard
&lt;/h3&gt;

&lt;p&gt;The single most discussed topic across all six meetings is &lt;strong&gt;identity linking&lt;/strong&gt; — how an agent knows who the customer is across sessions, stores, and platforms.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/Universal-Commerce-Protocol/meeting-minutes/blob/main/tc/2026/2026-04-17.md" rel="noopener noreferrer"&gt;April 17 minutes&lt;/a&gt; show an active debate about OAuth 2.0 scope design: nested scopes vs flat scopes vs config maps. The TC favoured flat. &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/354" rel="noopener noreferrer"&gt;PR #354&lt;/a&gt; implements OAuth 2.0 as the foundation for identity linking with capability-driven scopes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this matters for merchants:&lt;/strong&gt; Identity linking is the missing piece that would let an agent complete a purchase without a checkout-page handoff. Right now, agents can browse and cart — but paying requires redirecting the customer to a human checkout flow. Identity linking + &lt;a href="https://ucpchecker.com/payment-handlers" rel="noopener noreferrer"&gt;payment handlers&lt;/a&gt; would close that loop. Until then, agents rely on the &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;transport layer&lt;/a&gt; to reach the store and the &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;manifest endpoint&lt;/a&gt; for discovery. Our &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;April state-of-commerce report&lt;/a&gt; showed only 3 stores out of 4,024 currently declare identity linking capability. The spec work happening now is what will eventually bring that number up.&lt;/p&gt;

&lt;h3&gt;
  
  
  Loyalty is being trimmed to ship faster
&lt;/h3&gt;

&lt;p&gt;The TC has been debating loyalty schemas since March. &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/340" rel="noopener noreferrer"&gt;PR #340&lt;/a&gt; implements a loyalty extension for the checkout capability. The &lt;a href="https://github.com/Universal-Commerce-Protocol/meeting-minutes/blob/main/tc/2026/2026-04-10.md" rel="noopener noreferrer"&gt;April 10 minutes&lt;/a&gt; note that the extension is being "trimmed to baseline use cases" — a pragmatic decision to ship something that works for simple loyalty programs now, rather than waiting for a comprehensive solution that handles every edge case.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt; If your store has a loyalty or rewards program, the spec is building the infrastructure for agents to verify loyalty status and redeem points as part of the checkout flow. This is early — don't build against it yet — but understand that it's coming and it's being shaped by people at Google, Shopify, Etsy, and Target who run real loyalty programs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Local commerce is on the roadmap
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://github.com/Universal-Commerce-Protocol/meeting-minutes/blob/main/tc/2026/2026-04-03.md" rel="noopener noreferrer"&gt;April 3 minutes&lt;/a&gt; list Q2 priorities. Among them: &lt;strong&gt;local commerce&lt;/strong&gt;. &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/375" rel="noopener noreferrer"&gt;PR #375&lt;/a&gt; proposes store-based local inventory and fulfilment options — the infrastructure an agent would need to answer "is this product available at a store near me?"&lt;/p&gt;

&lt;p&gt;This is Target and Wayfair territory. Both have TC seats. Both have store networks. The fact that local commerce is a Q2 priority with retail representation on the council suggests it's not theoretical.&lt;/p&gt;

&lt;h3&gt;
  
  
  Returns are "incredibly complicated"
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://github.com/Universal-Commerce-Protocol/meeting-minutes/blob/main/tc/2026/2026-04-17.md" rel="noopener noreferrer"&gt;April 17 minutes&lt;/a&gt; include the most honest assessment we've seen in any spec discussion: returns are acknowledged as an "incredibly complicated domain." This is refreshing. Most protocol specs pretend returns are simple. UCP's TC is saying out loud that they're not, and that getting them right will take time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/257" rel="noopener noreferrer"&gt;PR #257&lt;/a&gt; from the February cycle introduced a returns extension. It's still in review. The complexity is in modelling return windows, refund methods, partial returns, and eligibility rules — all of which vary by merchant, product, and jurisdiction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt; Don't expect agent-managed returns in 2026. But understand that the protocol is building toward it, and the merchants who implement return policies as structured data (not just PDF links) will be ahead when it ships.&lt;/p&gt;

&lt;h3&gt;
  
  
  The spec itself just shipped its biggest release ever
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/specs/2026-04-08" rel="noopener noreferrer"&gt;v2026-04-08&lt;/a&gt; landed with &lt;strong&gt;60+ merged PRs&lt;/strong&gt; — the largest release since the protocol launched. Key additions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cart capability&lt;/strong&gt; — basket building for agents, a prerequisite for multi-item flows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Catalog search + lookup&lt;/strong&gt; — formalised &lt;a href="https://ucpchecker.com/product-discovery" rel="noopener noreferrer"&gt;product discovery&lt;/a&gt; as a spec capability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Request/response signing&lt;/strong&gt; — cryptographic integrity for agent-store communication&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error handling overhaul&lt;/strong&gt; — first-class errors, business logic error types&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eligibility claims&lt;/strong&gt; — for loyalty, membership, and verification-gated pricing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discount extension to cart&lt;/strong&gt; — discounts now apply pre-checkout, not just at checkout&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Risk signals&lt;/strong&gt; — authorization and abuse metadata for fraud prevention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our crawler showed &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;Shopify migrating its entire fleet&lt;/a&gt; to v2026-04-08 in four days. &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;99.4% of verified stores&lt;/a&gt; are now on the latest spec.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this means for you
&lt;/h2&gt;

&lt;h3&gt;
  
  
  If you're a merchant
&lt;/h3&gt;

&lt;p&gt;The governance expansion doesn't change what you need to do today. Your &lt;a href="https://ucpchecker.com/blog/ucp-requirements" rel="noopener noreferrer"&gt;UCP requirements&lt;/a&gt; are the same: valid manifest, declared &lt;a href="https://ucpchecker.com/capabilities" rel="noopener noreferrer"&gt;capabilities&lt;/a&gt;, clean variant data. &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;Check your store&lt;/a&gt;, fix any &lt;a href="https://ucpchecker.com/blog/common-ucp-errors" rel="noopener noreferrer"&gt;common errors&lt;/a&gt;, &lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;compare against competitors&lt;/a&gt;, and &lt;a href="https://ucpchecker.com/alerts" rel="noopener noreferrer"&gt;set up alerts&lt;/a&gt; so you know if anything breaks.&lt;/p&gt;

&lt;p&gt;What it does change is the timeline and the confidence. When Amazon, Microsoft, and Salesforce have engineering seats on the governing body, the protocol is not going away. If you've been waiting for a signal that UCP is "real enough" to invest in — five of the ten largest technology companies joining the TC in a single commit is that signal.&lt;/p&gt;

&lt;h3&gt;
  
  
  If you're a platform
&lt;/h3&gt;

&lt;p&gt;If you run &lt;a href="https://ucpchecker.com/platforms/shopify" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt;, you're covered — platform-level UCP support is mature. If you run &lt;a href="https://ucpchecker.com/platforms/bigcommerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/magento" rel="noopener noreferrer"&gt;Magento&lt;/a&gt;, or a &lt;a href="https://ucpchecker.com/platforms/custom" rel="noopener noreferrer"&gt;custom stack&lt;/a&gt;, watch the identity linking and loyalty PRs. These are the capabilities that will differentiate agent-ready platforms from agent-compatible ones in H2 2026.&lt;/p&gt;

&lt;p&gt;Salesforce Commerce Cloud now has a seat at the table. If you're on SFCC, this is the clearest signal yet that platform-level UCP support is coming. Our &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;April report&lt;/a&gt; noted that we've already seen SFCC engineering work in progress.&lt;/p&gt;

&lt;h3&gt;
  
  
  If you're building agents
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://ucpchecker.com/agents" rel="noopener noreferrer"&gt;Build an Agent quickstart&lt;/a&gt; still works — the protocol surface you're building against is stable. But start tracking the identity linking PRs. When that capability ships, the agent flow goes from "browse + cart + redirect to checkout" to "browse + cart + pay" — end-to-end autonomous purchasing. That's the step change.&lt;/p&gt;

&lt;p&gt;Check the &lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;store leaderboard&lt;/a&gt; to find the highest-performing targets, understand how &lt;a href="https://ucpchecker.com/product-discovery" rel="noopener noreferrer"&gt;product discovery&lt;/a&gt; works, and test your agent against real stores in &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;UCP Playground&lt;/a&gt; and use &lt;a href="https://ucpregistry.com" rel="noopener noreferrer"&gt;UCP Registry&lt;/a&gt; for production discovery. Both will surface the new capabilities as they ship.&lt;/p&gt;

&lt;h2&gt;
  
  
  The reading list
&lt;/h2&gt;

&lt;p&gt;For anyone who wants to follow the protocol's evolution themselves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Meeting minutes:&lt;/strong&gt; &lt;a href="https://github.com/Universal-Commerce-Protocol/meeting-minutes/tree/main/tc/2026" rel="noopener noreferrer"&gt;github.com/Universal-Commerce-Protocol/meeting-minutes&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spec repo:&lt;/strong&gt; &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp" rel="noopener noreferrer"&gt;github.com/Universal-Commerce-Protocol/ucp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;v2026-04-08 release notes:&lt;/strong&gt; &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/releases/tag/v2026-04-08" rel="noopener noreferrer"&gt;github.com/Universal-Commerce-Protocol/ucp/releases/tag/v2026-04-08&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MAINTAINERS.md:&lt;/strong&gt; &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/blob/main/MAINTAINERS.md" rel="noopener noreferrer"&gt;github.com/Universal-Commerce-Protocol/ucp/blob/main/MAINTAINERS.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Active PRs:&lt;/strong&gt; &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pulls" rel="noopener noreferrer"&gt;github.com/Universal-Commerce-Protocol/ucp/pulls&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We'll continue monitoring the spec, the TC minutes, and the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;4,500+ merchants&lt;/a&gt; building on the protocol. If any of the Q2 priorities (identity, loyalty, local commerce) ship in spec form, we'll cover them in the &lt;a href="https://ucpchecker.com/blog" rel="noopener noreferrer"&gt;May state-of-commerce report&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Check your store's UCP status at &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;UCPChecker.com&lt;/a&gt;. Browse verified stores at &lt;a href="https://ucpregistry.com" rel="noopener noreferrer"&gt;UCPRegistry.com&lt;/a&gt;. Test agent performance at &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;UCPPlayground.com&lt;/a&gt;. Read the full protocol stack: &lt;a href="https://ucpchecker.com/blog/mcp-vs-ucp-vs-ap2-whats-the-difference" rel="noopener noreferrer"&gt;MCP vs UCP vs AP2&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>data</category>
      <category>ucp</category>
    </item>
    <item>
      <title>Agentic Commerce Optimization: What 4,491 Merchants Reveal About UCP Readiness</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Thu, 23 Apr 2026 11:53:42 +0000</pubDate>
      <link>https://forem.com/benjifisher/agentic-commerce-optimization-what-4491-merchants-reveal-about-ucp-readiness-3fk</link>
      <guid>https://forem.com/benjifisher/agentic-commerce-optimization-what-4491-merchants-reveal-about-ucp-readiness-3fk</guid>
      <description>&lt;h1&gt;
  
  
  Agentic Commerce Optimization: What 4,491 Merchants Reveal About UCP Readiness
&lt;/h1&gt;

&lt;p&gt;Every UCP technical guide tells you how to get UCP ready. We decided to measure who actually is.&lt;/p&gt;

&lt;p&gt;Since UCP launched, UCP Checker has tracked 4,491 merchants — 4,024 of which are verified and actively serving &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;UCP endpoints&lt;/a&gt;. We maintain the largest UCP index of live merchant implementations, and the data tells a story that no theoretical guide can. We've run over 1k agent testing sessions in &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;UCP Playground&lt;/a&gt;, consumed 43 million tokens doing it, and watched real AI agents attempt to browse, cart, and buy products across every major ecommerce platform. The result isn't a theoretical framework for agentic commerce optimization. It's a field report.&lt;/p&gt;

&lt;p&gt;And the field looks very different from what the guides tell you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "Agentic Commerce Optimization" Actually Means When You Have Data
&lt;/h2&gt;

&lt;p&gt;The term "agentic commerce optimization" — or ACO — has entered the SEO lexicon as a catch-all for making your store ready for &lt;a href="https://ucpchecker.com/product-discovery" rel="noopener noreferrer"&gt;AI-powered shopping agents&lt;/a&gt;. Most of the early writing treats it like a checklist: add Schema.org markup, update your Merchant Center feed, structure your product data. That advice isn't wrong. It's just incomplete, because it's built on assumptions about how agents will behave rather than observations of &lt;a href="https://ucpchecker.com/blog/mcp-vs-ucp-vs-ap2-whats-the-difference" rel="noopener noreferrer"&gt;how they actually do&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;ACO, measured empirically, is the practice of optimizing your ecommerce stack for the specific patterns that AI agents exhibit when they interact with &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;UCP endpoints&lt;/a&gt;. Those patterns are surprising. Agents don't browse the way humans do. They don't use carts the way humans do. And the failure modes that block them from completing purchases are not the ones you'd predict from reading the spec alone.&lt;/p&gt;

&lt;p&gt;The data we've collected across 4,024 verified UCP merchants tells a concrete story about what matters, what doesn't, and where the real optimization opportunities are hiding.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffjw4qi18947rhvgq1up0.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffjw4qi18947rhvgq1up0.webp" alt="UCP Stack Layers — capability adoption across verified merchants" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real State of UCP Readiness
&lt;/h2&gt;

&lt;p&gt;Let's start with what's working. Of the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;4,024 verified merchants&lt;/a&gt; in &lt;a href="https://ucpregistry.com" rel="noopener noreferrer"&gt;UCP Registry&lt;/a&gt; — the open UCP directory where agents discover merchants — capability adoption breaks down like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/capabilities/checkout" rel="noopener noreferrer"&gt;Checkout&lt;/a&gt;:&lt;/strong&gt; 4,003 merchants (99.5%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/capabilities/cart" rel="noopener noreferrer"&gt;Cart&lt;/a&gt;:&lt;/strong&gt; 3,987 merchants (99.1%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product discovery:&lt;/strong&gt; Near-universal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/capabilities/identity-linking" rel="noopener noreferrer"&gt;Identity&lt;/a&gt;:&lt;/strong&gt; 3 merchants&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/capabilities/payment" rel="noopener noreferrer"&gt;Payment&lt;/a&gt;:&lt;/strong&gt; 0 merchants&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Read those last two numbers again. Three merchants support identity. Zero support native payment. This is the defining feature of UCP's current state: the bottom of the funnel is wide open, but the capabilities that would make agentic commerce truly autonomous — knowing who the customer is and processing payment without a handoff — are functionally nonexistent.&lt;/p&gt;

&lt;p&gt;The spec migration numbers are more encouraging. When the &lt;a href="https://ucpchecker.com/specs/2026-04-08" rel="noopener noreferrer"&gt;v2026-04-08 specification&lt;/a&gt; dropped, 3,994 out of 4,022 tracked merchants had migrated within four days. That's a &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;99.3% adoption rate&lt;/a&gt; in under a week, which speaks to the platform-driven nature of UCP rollout. Most merchants aren't manually implementing UCP. Their platform is doing it for them, and the platforms shipped the update fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  Platform-by-Platform Reality
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj2ok71gdkl7riwv3935o.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj2ok71gdkl7riwv3935o.webp" alt="UCP Transport Comparison — REST vs MCP vs Embedded by platform" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The theoretical guides will tell you that UCP readiness is about your structured data and feed configuration. In practice, it's mostly about &lt;a href="https://ucpchecker.com/platforms" rel="noopener noreferrer"&gt;which platform you're on&lt;/a&gt;. Here's what we've seen across the major players.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://ucpchecker.com/platforms/shopify" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt;: The Default Winner
&lt;/h3&gt;

&lt;p&gt;Shopify accounts for roughly 74% of identified platforms in our dataset (898 of the platform-identified merchants). This dominance isn't because Shopify merchants are more proactive about UCP — it's because Shopify rolled out UCP support at the platform level, giving every store baseline compliance automatically.&lt;/p&gt;

&lt;p&gt;Out of the box, a Shopify store gets functional product discovery, cart, and checkout endpoints. The Schema.org markup is handled. The Merchant Center feed attributes are populated. For the average merchant, getting UCP ready on Shopify means verifying that your product data is clean rather than building anything from scratch.&lt;/p&gt;

&lt;p&gt;The downside: Shopify's one-size-fits-all approach means limited customization of UCP behavior. If you need to implement conversational commerce attributes like substitution logic or compatibility data, you're working within Shopify's constraints. But for baseline agentic commerce readiness, nothing else comes close to the out-of-the-box experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://ucpchecker.com/platforms/woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;: Flexible but Inconsistent
&lt;/h3&gt;

&lt;p&gt;WooCommerce stores show the widest variance in UCP readiness. The open-source model means implementation quality depends entirely on which plugins a merchant has installed and how they've configured their stack. We've seen WooCommerce stores with excellent structured data and smooth agent interactions right next to stores where basic product attributes are missing or malformed.&lt;/p&gt;

&lt;p&gt;The flexibility is a genuine advantage for merchants who want to implement advanced ACO features — conversational attributes, detailed return policies, rich product relationships. But the inconsistency is a problem for agents, which need predictable data structures to operate reliably. If you're on WooCommerce and serious about agentic commerce optimization, an audit of your specific UCP endpoint output is essential, not optional. Run your store through &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;UCP Checker&lt;/a&gt; and see what an agent actually encounters.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://ucpchecker.com/platforms/bigcommerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt;: Strong APIs, Broken Images
&lt;/h3&gt;

&lt;p&gt;BigCommerce has a genuine technical advantage in its API architecture. The platform's API-first design translates well to UCP's endpoint model, and the stores we've tracked generally produce clean, well-structured UCP responses.&lt;/p&gt;

&lt;p&gt;But there's a specific, persistent issue: BigCommerce's S3-hosted image URLs break agent image parsing. This is a real failure mode we've observed in Playground sessions. When an agent can't parse product images, it loses a significant input signal for product matching and variant selection. For a platform that otherwise has strong UCP fundamentals, this is an unfortunate gap — and one that BigCommerce merchants should pressure their platform to fix. For now, it's worth investigating whether your image delivery pipeline produces URLs that agents can reliably consume. Our &lt;a href="https://ucpchecker.com/blog/bigcommerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;BigCommerce guide&lt;/a&gt; walks through the specifics.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://ucpchecker.com/platforms/magento" rel="noopener noreferrer"&gt;Magento&lt;/a&gt; (Adobe Commerce): Enterprise Muscle, Enterprise Complexity
&lt;/h3&gt;

&lt;p&gt;Magento implementations tend to be enterprise-grade, which means the UCP output is thorough but the setup complexity is high. These stores generally have rich product data, detailed catalog structures, and the kind of attribute depth that agents love. But the implementation burden falls more heavily on the merchant's development team compared to Shopify or BigCommerce, where the platform handles the heavy lifting.&lt;/p&gt;

&lt;p&gt;If you're on Magento and aren't UCP ready yet, expect a meaningful engineering investment. If you have started, you're probably in good shape — the platform's data model maps well to what UCP expects, especially for multi-variant products and complex catalog hierarchies. See our &lt;a href="https://ucpchecker.com/blog/magento-adobe-commerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;Magento guide&lt;/a&gt; for implementation specifics.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Agents Actually Do (vs. What Guides Tell You to Optimize For)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foo5hqo00o5k38qggwxbd.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foo5hqo00o5k38qggwxbd.webp" alt="Agent Shopping Flow — MCP tool call sequence" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's where our data diverges most sharply from the advisory content circulating about UCP preparation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agents Skip the Cart
&lt;/h3&gt;

&lt;p&gt;The conventional model of ecommerce — browse, add to cart, review cart, checkout — doesn't describe how AI agents behave. In our Playground data, we've recorded 395 checkout operations versus just 104 cart operations. Agents are going direct to checkout nearly four times more often than they're using the cart.&lt;/p&gt;

&lt;p&gt;This has major implications for agentic commerce optimization. If you've invested heavily in cart-level features — upsells, cross-sells, minimum order messaging, cart-based promotions — agents are likely bypassing all of it. The checkout endpoint is where the action happens. Your optimization effort should weight accordingly — &lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;compare your store against competitors&lt;/a&gt; to see where you stand: make sure checkout handles single-product and multi-product flows cleanly, with clear variant specification and unambiguous pricing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Variant Mismatches Are the Top Failure Mode
&lt;/h3&gt;

&lt;p&gt;Cart variant mismatches remain the most common reason agent sessions fail to complete a purchase. An agent selects a product, identifies the desired variant (size, color, configuration), and submits a cart or checkout request with a variant ID that doesn't match what the endpoint expects. The session stalls or errors out.&lt;/p&gt;

&lt;p&gt;This isn't an agent intelligence problem — it's a data clarity problem. Stores with clean, unambiguous variant structures and consistent ID schemes see dramatically higher agent completion rates. Stores with complex variant matrices, inconsistent naming, or variant IDs that change between API responses create confusion that even the best models struggle to resolve.&lt;/p&gt;

&lt;p&gt;If you do one thing for ACO today: audit your variant data. Make sure every variant has a stable identifier, a clear human-readable name, and consistent representation across your discovery and checkout endpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  Token Consumption Tells You Where Agents Struggle
&lt;/h3&gt;

&lt;p&gt;We've consumed 43 million tokens over 1,000 Playground sessions. The per-session cost varies dramatically based on store complexity and model choice, but a telling pattern emerges in checkout flows: completing a purchase takes approximately 55,000 tokens with the best-performing models.&lt;/p&gt;

&lt;p&gt;That number is a proxy for friction. A 55K-token checkout means the agent is making multiple round-trips, parsing product data, resolving variants, handling errors, and re-trying. Stores that produce clean, predictable UCP responses see lower token counts — which directly translates to faster agent interactions and lower cost for the platforms running these agents at scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  Model Performance Varies Significantly
&lt;/h3&gt;

&lt;p&gt;Not all AI models handle UCP interactions equally. Claude Sonnet 4.5 leads our &lt;a href="https://ucpplayground.com/leaderboard" rel="noopener noreferrer"&gt;Playground leaderboard&lt;/a&gt; with 205 sessions, and the checkout completion rate across all sessions sits at 41%. That might sound low, but consider what it represents: four out of ten fully autonomous purchase attempts succeed end-to-end, without any human intervention, across a diverse set of merchants with varying UCP implementation quality.&lt;/p&gt;

&lt;p&gt;The model performance gap matters for merchants because it signals where your UCP implementation has rough edges. If top-tier models struggle with your checkout flow, every agent will struggle. Testing your store in &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;UCP Playground&lt;/a&gt; with multiple models gives you a direct read on where your implementation creates unnecessary friction.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Capabilities Gap That Will Define Winners
&lt;/h2&gt;

&lt;p&gt;Go back to those adoption numbers: identity at 3 merchants, payment at 0. These aren't just gaps — they're the entire frontier of competitive differentiation in agentic commerce.&lt;/p&gt;

&lt;p&gt;Right now, every UCP checkout ends with a handoff. The agent gets the customer to the point of purchase, then drops them into a traditional checkout flow to enter their identity and payment information. That handoff is where conversion dies. Every redirect, every form field, every authentication step is a chance for the customer to abandon.&lt;/p&gt;

&lt;p&gt;The merchants who figure out identity and payment first — who let an agent complete a purchase end-to-end without a handoff — will have a structural conversion advantage that no amount of Schema.org optimization can match. This is where UCP's roadmap points: loyalty integration, post-purchase management, multi-vertical capabilities. But the foundation is identity and payment.&lt;/p&gt;

&lt;p&gt;We don't yet know what the winning implementation pattern looks like for these capabilities. The spec supports them, but the ecosystem hasn't built them. This is the space to watch, and the space where early investment will pay disproportionate returns.&lt;/p&gt;

&lt;h2&gt;
  
  
  An Optimization Checklist Grounded in Data
&lt;/h2&gt;

&lt;p&gt;Most ACO checklists are derived from the spec. This one is derived from watching &amp;gt;1,000 agent sessions succeed and fail across 4,024 merchants. Here's what actually moves the needle, ranked by observed impact:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Fix your variant data first.&lt;/strong&gt; Stable IDs, clear names, consistent representation across endpoints. This is the single highest-impact fix based on our failure-mode analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Optimize for direct-to-checkout flows.&lt;/strong&gt; Agents skip the cart. Make sure your checkout endpoint handles product selection, variant specification, and pricing in a single clean interaction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Audit your product images.&lt;/strong&gt; If you're on BigCommerce or any platform using CDN-hosted images with complex URL structures, verify that agents can parse your image URLs. Broken image parsing degrades product matching accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Migrate to the latest spec version immediately.&lt;/strong&gt; The &lt;a href="https://ucpchecker.com/blog/ucp-v2026-04-08-spec-update" rel="noopener noreferrer"&gt;v2026-04-08 migration&lt;/a&gt; happened in four days across the ecosystem. If you're still on an older version, you're already behind 99.3% of verified merchants.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Test with actual agents, not just validators.&lt;/strong&gt; Schema validation tells you if your markup is syntactically correct. It tells you nothing about whether an agent can actually complete a purchase. Run your store through &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;UCPPlayground&lt;/a&gt; and watch what happens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Validate your full UCP endpoint output.&lt;/strong&gt; Use &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;UCPChecker&lt;/a&gt; to see exactly what your store exposes to agents — capabilities, product data, structured attributes — and where the gaps are.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Clean up your Merchant Center feed.&lt;/strong&gt; Return policies, product identifiers, and the native commerce attributes that feed into UCP discovery. This is table-stakes, but our data confirms that stores with complete feed data see higher agent engagement in discovery flows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8. Start thinking about identity and payment.&lt;/strong&gt; You won't implement these today — almost nobody has. But understanding the spec's identity and payment capabilities now positions you — our &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-april-2026" rel="noopener noreferrer"&gt;April ecosystem report&lt;/a&gt; tracks adoption monthly to move fast when the ecosystem catches up. The jump from 0 to first-mover will be worth more than incremental improvements to discovery or checkout.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;9. Monitor your platform's UCP updates.&lt;/strong&gt; If you're on Shopify, WooCommerce, BigCommerce, or Magento, your platform is doing most of the UCP work. Stay current with their releases — &lt;a href="https://ucpchecker.com/alerts" rel="noopener noreferrer"&gt;set up domain alerts&lt;/a&gt; to get notified when your store's status changes. Platform-level updates drove 99.3% spec migration in four days — the single most effective "optimization" most merchants can do is simply keeping their platform current.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;10. Get listed in the UCP directory.&lt;/strong&gt; &lt;a href="https://ucpregistry.com" rel="noopener noreferrer"&gt;UCPRegistry&lt;/a&gt; is the open UCP index where agents discover merchants. Your listing is what agents see when deciding which merchants to route a customer to. Make sure you're listed, your data is accurate, and your capabilities are competitive with peers in your vertical.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Agentic commerce optimization isn't a theoretical exercise anymore. UCP ecommerce is live, it's measurable, and it's growing fast. Our UCP index tracks &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;4,024 verified merchants&lt;/a&gt; serving &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;UCP endpoints&lt;/a&gt; today. AI agents are completing purchases 41% of the time. The gap between being UCP ready and being UCP optimized is measurable in variant data quality, checkout flow design, and capabilities adoption.&lt;/p&gt;

&lt;p&gt;The merchants who treat ACO as a data problem — not just a markup problem — are the ones who'll convert when agents come shopping. And agents are already shopping. We've got 43 million tokens of proof.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Check if your store is UCP ready at &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;UCPChecker.com&lt;/a&gt;. Browse the UCP directory at &lt;a href="https://ucpregistry.com" rel="noopener noreferrer"&gt;UCPRegistry&lt;/a&gt;. Test agent interactions in &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;UCPPlayground&lt;/a&gt;. Platform-specific implementation guides: &lt;a href="https://ucpchecker.com/blog/shopify-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt; · &lt;a href="https://ucpchecker.com/blog/woocommerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt; · &lt;a href="https://ucpchecker.com/blog/bigcommerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt; · &lt;a href="https://ucpchecker.com/blog/magento-adobe-commerce-ucp-guide-ai-agent-commerce" rel="noopener noreferrer"&gt;Magento&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>ai</category>
      <category>data</category>
    </item>
    <item>
      <title>The State of Agentic Commerce — April 2026</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Sat, 18 Apr 2026 09:48:53 +0000</pubDate>
      <link>https://forem.com/benjifisher/the-state-of-agentic-commerce-april-2026-l93</link>
      <guid>https://forem.com/benjifisher/the-state-of-agentic-commerce-april-2026-l93</guid>
      <description>&lt;p&gt;In &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-march-2026" rel="noopener noreferrer"&gt;March&lt;/a&gt;, we crossed 3,000 verified stores and started seeing the first non-Shopify platforms in the directory. We said the next question was whether UCP would remain a Shopify story or become a real multi-platform standard.&lt;/p&gt;

&lt;p&gt;April answered that. We crossed &lt;strong&gt;4,000 verified stores&lt;/strong&gt;, Shopify migrated its entire fleet to the new v2026-04-08 spec in a four-day window, BigCommerce entered the directory with its first three stores, and WooCommerce and Magento integrations started appearing from independent developers. The ecosystem grew 33% in one month while simultaneously upgrading the protocol underneath.&lt;/p&gt;

&lt;p&gt;This is the third monthly state-of-the-ecosystem report from UCP Checker. Here's what the data says.&lt;/p&gt;

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;p&gt;As of April 17, 2026:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;4,014&lt;/strong&gt; &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;verified UCP stores&lt;/a&gt; (up from ~3,000 in March, +33%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;4,481&lt;/strong&gt; total domains tracked&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;47,154&lt;/strong&gt; total checks run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1,436&lt;/strong&gt; new merchants discovered this month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;866&lt;/strong&gt; new merchants this week alone&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3,988&lt;/strong&gt; stores on the latest &lt;a href="https://ucpchecker.com/specs/2026-04-08" rel="noopener noreferrer"&gt;v2026-04-08 spec&lt;/a&gt; (99.4%)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The growth curve is worth examining. February was discovery: we scanned our first thousand Shopify stores and found UCP everywhere on the platform. March was expansion: we broadened the crawler, crossed 3,000, and started seeing non-Shopify manifests for the first time. April is consolidation: the store count grew 33%, but the more significant movement was the spec migration and the first signs of platform diversification.&lt;/p&gt;

&lt;p&gt;The weekly run rate matters here. At 866 new merchants discovered this week alone, the ecosystem is adding roughly 125 stores per day. But the growth isn't organic in the way a consumer product grows — it comes in waves, driven by platform-level deployments. When Shopify flips a switch, hundreds of stores appear overnight. When BigCommerce ships UCP, three appear. The question for May isn't "how many stores" but "which platforms ship next" — because each platform deployment is a step function, not a slope.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Shopify spec migration
&lt;/h2&gt;

&lt;p&gt;This is the story of the month. Between April 13 and April 17, Shopify migrated nearly its entire UCP fleet from v2026-01-23 to &lt;a href="https://ucpchecker.com/blog/ucp-v2026-04-08-spec-update" rel="noopener noreferrer"&gt;v2026-04-08&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;On April 13, our crawler showed &lt;strong&gt;2 stores&lt;/strong&gt; on the new spec. By April 17: &lt;strong&gt;3,988&lt;/strong&gt;. That's 3,986 stores upgraded in roughly four days — a coordinated platform-level migration, not individual merchants updating their manifests.&lt;/p&gt;

&lt;p&gt;The v2026-04-08 spec introduced three breaking changes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;signing_keys&lt;/code&gt; moved from nested to root level.&lt;/strong&gt; Previously at &lt;code&gt;ucp.signing_keys&lt;/code&gt;, now at the document root alongside &lt;code&gt;ucp&lt;/code&gt;. This is the structural change that required a manifest rewrite, not just a version bump.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business profile distinction.&lt;/strong&gt; The spec now formally separates business profiles (individual store manifests at &lt;code&gt;/.well-known/ucp&lt;/code&gt;) from platform profiles, with different requirements for &lt;code&gt;spec&lt;/code&gt; and &lt;code&gt;schema&lt;/code&gt; fields on services and capabilities. Business profiles are lighter — &lt;code&gt;spec&lt;/code&gt; and &lt;code&gt;schema&lt;/code&gt; are optional.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;a2a&lt;/code&gt; transport formally added.&lt;/strong&gt; Google's Agent2Agent Protocol is now a recognised transport alongside REST, MCP, and Embedded, though adoption is effectively zero in the wild.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The migration means &lt;strong&gt;99.4% of the verified directory is now on the latest spec&lt;/strong&gt;. Only 26 stores remain on older versions: 19 on v2026-01-11, 6 on v2026-01-23, and 1 on v2026-01-14. These are almost entirely non-Shopify stores that need to upgrade manually.&lt;/p&gt;

&lt;p&gt;For the full spec breakdown, see our &lt;a href="https://ucpchecker.com/blog/ucp-v2026-04-08-spec-update" rel="noopener noreferrer"&gt;v2026-04-08 spec announcement&lt;/a&gt; and the &lt;a href="https://ucpchecker.com/specs" rel="noopener noreferrer"&gt;spec versions page&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Beyond Shopify: platform diversification accelerates
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/platforms/shopify" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt; still dominates at 3,982 of 4,014 verified stores (99.2%). But the other 32 verified stores tell a more interesting story — these are developers who chose to publish a UCP manifest without a platform-level integration doing it for them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://ucpchecker.com/platforms/bigcommerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt; entered the directory&lt;/strong&gt; with its first three verified stores: &lt;a href="https://ucpchecker.com/status/untilgone.com" rel="noopener noreferrer"&gt;untilgone.com&lt;/a&gt;, &lt;a href="https://ucpchecker.com/status/touchupdirect.com" rel="noopener noreferrer"&gt;touchupdirect.com&lt;/a&gt;, and &lt;a href="https://ucpchecker.com/status/midwoodflowershop.com" rel="noopener noreferrer"&gt;midwoodflowershop.com&lt;/a&gt;. All three are on v2026-04-08 with checkout and cart capabilities declared. Notably, their average manifest latency (~890ms) is significantly higher than Shopify's (~130ms) — BigCommerce manifests are served from the storefront origin rather than a CDN-cached endpoint. Platform-level latency differences like this will matter as agent response budgets tighten.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://ucpchecker.com/platforms/woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;&lt;/strong&gt; now has 3 verified stores, up from zero in March. These are hand-built integrations — WooCommerce doesn't have native UCP support, so each merchant published their manifest manually. We fixed a &lt;a href="https://ucpchecker.com/blog/ucp-v2026-04-08-spec-update" rel="noopener noreferrer"&gt;validation bug&lt;/a&gt; this month that was incorrectly rejecting WooCommerce manifests with &lt;code&gt;payment_handlers: []&lt;/code&gt; (valid for stores using checkout-link redirect flows).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://ucpchecker.com/platforms/magento" rel="noopener noreferrer"&gt;Magento&lt;/a&gt;&lt;/strong&gt; has 1 verified store. &lt;strong&gt;Custom/headless&lt;/strong&gt; stacks account for 25 verified stores — the most architecturally diverse group, including our own &lt;a href="https://ucpchecker.com/status/ucpchecker.com" rel="noopener noreferrer"&gt;ucpchecker.com&lt;/a&gt; manifest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Salesforce Commerce Cloud&lt;/strong&gt; has zero verified stores in the directory today. But industry signals suggest SFCC is exploring UCP support at the platform level — not as a one-off client integration, but as a feature that would ship to all Commerce Cloud merchants. If it follows the Shopify pattern — a single platform-level deployment bringing thousands of enterprise storefronts (Puma, Ralph Lauren, Under Armour, Adidas) into the ecosystem in one wave — the directory composition would shift significantly. SFCC is natively REST-based, so a REST-first UCP transport would be the natural fit, compared to Shopify's MCP-first approach. We're watching this closely.&lt;/p&gt;

&lt;p&gt;The full platform breakdown is live on our new &lt;a href="https://ucpchecker.com/platforms" rel="noopener noreferrer"&gt;/platforms&lt;/a&gt; page.&lt;/p&gt;

&lt;h2&gt;
  
  
  How agents actually perform
&lt;/h2&gt;

&lt;p&gt;The numbers above tell you which stores &lt;em&gt;have&lt;/em&gt; UCP. This section tells you which stores &lt;em&gt;work&lt;/em&gt; when an AI agent actually tries to shop them — and which models do it best.&lt;/p&gt;

&lt;h3&gt;
  
  
  Store benchmarks
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/leaderboard" rel="noopener noreferrer"&gt;Playground benchmarks&lt;/a&gt; grade stores A through F on end-to-end agent shopping performance:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Grade&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;th&gt;What it means&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Agent completes the full flow flawlessly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B+&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;422&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Works with minor issues — the largest cohort&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;td&gt;222&lt;/td&gt;
&lt;td&gt;Cart succeeds, checkout has friction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;C+ / C&lt;/td&gt;
&lt;td&gt;225&lt;/td&gt;
&lt;td&gt;Discovery and browse work, deeper flow breaks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;D&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;Significant failures across the flow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F&lt;/td&gt;
&lt;td&gt;289&lt;/td&gt;
&lt;td&gt;Manifest validates but the agent can't complete any step&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The B+ tier at 422 stores is the most important number here. These stores are &lt;em&gt;close&lt;/em&gt; — an agent can reliably discover, search, and cart them, but checkout friction (slow responses, variant mismatches, payment handler quirks) stops the flow short. The path from B+ to A is usually a single fix. The 289 F-grade stores are the other end: technically verified but functionally broken when an agent actually tries to shop them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Model leaderboard
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;UCP Playground&lt;/a&gt; now supports &lt;strong&gt;15 frontier LLMs&lt;/strong&gt; from 7 vendors, tested against &lt;strong&gt;76 unique stores&lt;/strong&gt;, generating over &lt;strong&gt;$114,000 in aggregate cart value&lt;/strong&gt;. The &lt;a href="https://ucpplayground.com/leaderboard" rel="noopener noreferrer"&gt;model leaderboard&lt;/a&gt; scores every model on search, cart completion, and checkout conversion:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Shopping Score&lt;/th&gt;
&lt;th&gt;Checkout %&lt;/th&gt;
&lt;th&gt;Search %&lt;/th&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek V3.2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;63&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;53.1%&lt;/td&gt;
&lt;td&gt;85.7%&lt;/td&gt;
&lt;td&gt;DeepSeek&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 3 Flash&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;59&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;51.4%&lt;/td&gt;
&lt;td&gt;90.3%&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grok 4&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;59&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;42.0%&lt;/td&gt;
&lt;td&gt;92.0%&lt;/td&gt;
&lt;td&gt;xAI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Opus 4.6&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;52&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;41.9%&lt;/td&gt;
&lt;td&gt;80.0%&lt;/td&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Sonnet 4.5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;50&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;54.6%&lt;/td&gt;
&lt;td&gt;86.8%&lt;/td&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And the speed rankings — because latency is the other dimension that matters:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Avg Session&lt;/th&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 2.5 Flash&lt;/td&gt;
&lt;td&gt;~12s&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-4o&lt;/td&gt;
&lt;td&gt;~14s&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 3 Flash&lt;/td&gt;
&lt;td&gt;~17s&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Opus 4.6&lt;/td&gt;
&lt;td&gt;~31s&lt;/td&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grok 4&lt;/td&gt;
&lt;td&gt;~76s&lt;/td&gt;
&lt;td&gt;xAI&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Three takeaways
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;DeepSeek V3.2 leads the leaderboard.&lt;/strong&gt; An open-weight model tops the composite shopping score at 63 — ahead of every Anthropic, Google, and OpenAI model. The agentic commerce stack is genuinely model-agnostic in practice, not just in spec language.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Search works everywhere. Checkout is the bottleneck.&lt;/strong&gt; Every model scores above 70% on product search. But checkout conversion drops to 13–56% depending on the model. The gap between "can find products" and "can actually buy them" is the reliability frontier for the ecosystem. This is where the work is.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reasoning models underperform.&lt;/strong&gt; QwQ 32B (0% checkout), o4-mini (16.7%), Grok 3 Mini (13.3%), and DeepSeek R1 (21.4%) all score below 40. Models optimised for chain-of-thought reasoning burn tokens on deliberation and struggle to execute the simple, sequential tool-call patterns shopping requires. The best shopping agents are fast and decisive, not thoughtful.&lt;/p&gt;

&lt;p&gt;Full model profiles are on the &lt;a href="https://ucpplayground.com/models" rel="noopener noreferrer"&gt;Playground models page&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The reliability gap: verified is not ready
&lt;/h2&gt;

&lt;p&gt;This is the editorial point we want to make clearly, because the headline number (4,014 verified stores) obscures the more important one: &lt;strong&gt;9 stores score A&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Four thousand stores have valid UCP manifests. Nine of them deliver a flawless end-to-end agent shopping experience. That's a 0.2% flawless rate. The gap between "technically verified" and "actually shoppable by an AI agent without friction" is the central infrastructure problem for agentic commerce in 2026.&lt;/p&gt;

&lt;p&gt;The B+ tier — 422 stores — is where the leverage is. These stores work &lt;em&gt;most&lt;/em&gt; of the time. An agent can discover them, search their catalog, build a cart, and usually reach a checkout URL. But "usually" isn't good enough when the agent is spending someone's money. The failures at B+ level are specific and fixable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cart variant mismatches&lt;/strong&gt; — the agent selects a size/colour variant that doesn't match the store's internal variant ID scheme. The cart call succeeds but adds the wrong item.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payment handler timeouts&lt;/strong&gt; — the tokenization step takes longer than the agent's timeout window, and the session drops silently.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stale product data&lt;/strong&gt; — the catalog returns products that are out of stock by the time the agent tries to cart them. No error — just an empty cart.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Checkout redirect loops&lt;/strong&gt; — the checkout URL the store returns sends the agent into an authentication loop that a human browser would handle with cookies but an MCP client can't.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these is a single-fix problem for the store operator. But at scale, across 422 stores, the aggregate effect is that agents fail more often than they succeed at the final step. &lt;strong&gt;The ecosystem doesn't need more stores. It needs the stores it has to work more reliably.&lt;/strong&gt; That's the infrastructure investment that will actually unlock agent commerce at scale — and it's where we're focusing our tooling work for May.&lt;/p&gt;

&lt;h2&gt;
  
  
  Capability coverage: the ceiling hasn't moved
&lt;/h2&gt;

&lt;p&gt;Across 4,014 verified stores:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;Coverage&lt;/th&gt;
&lt;th&gt;Stores&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpchecker.com/capabilities/checkout" rel="noopener noreferrer"&gt;Checkout&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;99.6%&lt;/td&gt;
&lt;td&gt;3,996&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpchecker.com/capabilities/cart" rel="noopener noreferrer"&gt;Cart&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;99.3%&lt;/td&gt;
&lt;td&gt;3,985&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpchecker.com/capabilities/identity-linking" rel="noopener noreferrer"&gt;Identity linking&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;0.07%&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://ucpchecker.com/capabilities/payment" rel="noopener noreferrer"&gt;Payment&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Same pattern as March. Checkout and cart are effectively universal because Shopify ships them by default. The advanced capabilities — identity, loyalty, payment — haven't moved. The gap between "technically verified" and "deeply agent-ready" is still the story. Until more stores declare capabilities beyond the Shopify defaults, the ecosystem depth chart stays flat.&lt;/p&gt;

&lt;h2&gt;
  
  
  The broader ecosystem
&lt;/h2&gt;

&lt;p&gt;April was quieter on the announcements front than March — which saw Splitit, PayPal, and Google all making public UCP commitments in a single week. But the signals that matter in April are structural, not press-release-shaped.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shopify's fleet-wide spec migration is itself an ecosystem signal.&lt;/strong&gt; It demonstrates that a major platform can coordinate a breaking spec upgrade across thousands of stores in days, not months. Every other platform considering UCP adoption now has a reference point for what a managed migration looks like. The v2026-04-08 changes (signing_keys relocation, business profile distinction) were non-trivial — and Shopify shipped them to its entire fleet without a single store going offline. That's the kind of platform engineering confidence that accelerates the next platform's decision to build UCP support.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The endorsed partner roster continues to grow.&lt;/strong&gt; &lt;a href="https://ucpregistry.com/vendor/adyen" rel="noopener noreferrer"&gt;Adyen&lt;/a&gt;, &lt;a href="https://ucpregistry.com/vendor/american-express" rel="noopener noreferrer"&gt;American Express&lt;/a&gt;, &lt;a href="https://ucpregistry.com/vendor/mastercard" rel="noopener noreferrer"&gt;Mastercard&lt;/a&gt;, &lt;a href="https://ucpregistry.com/vendor/stripe" rel="noopener noreferrer"&gt;Stripe&lt;/a&gt;, &lt;a href="https://ucpregistry.com/vendor/visa" rel="noopener noreferrer"&gt;Visa&lt;/a&gt;, &lt;a href="https://ucpregistry.com/vendor/checkout-com" rel="noopener noreferrer"&gt;Checkout.com&lt;/a&gt;, &lt;a href="https://ucpregistry.com/vendor/affirm" rel="noopener noreferrer"&gt;Affirm&lt;/a&gt;, &lt;a href="https://ucpregistry.com/vendor/splitit" rel="noopener noreferrer"&gt;Splitit&lt;/a&gt;, and &lt;a href="https://ucpregistry.com/vendor/paypal" rel="noopener noreferrer"&gt;PayPal&lt;/a&gt; are all publicly committed to the protocol's payment layer. For any platform evaluating UCP, the payment handler ecosystem is no longer a gap — it's arguably the most mature part of the stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The model ecosystem is widening faster than the store ecosystem.&lt;/strong&gt; In February, we tested 3 models. In March, 8. In April, 16 — from 7 vendors across the US, China, and Europe. The number of AI models that can speak MCP and execute a UCP shopping flow is growing faster than the number of stores that can serve one. This suggests the bottleneck is shifting from "agents that can shop" to "stores that can be shopped reliably" — which circles back to the reliability gap above.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we shipped
&lt;/h2&gt;

&lt;p&gt;Heavy shipping month on the tooling side:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;Side-by-side store comparison&lt;/a&gt;&lt;/strong&gt; — compare any two stores head-to-head on metrics, capabilities, transports, and payment handlers. &lt;a href="https://ucpchecker.com/blog/introducing-side-by-side-ucp-store-compare" rel="noopener noreferrer"&gt;Embeddable via iframe&lt;/a&gt; for blog posts and docs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/platforms" rel="noopener noreferrer"&gt;Platform pages&lt;/a&gt;&lt;/strong&gt; — live landing pages for &lt;a href="https://ucpchecker.com/platforms/shopify" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/bigcommerce" rel="noopener noreferrer"&gt;BigCommerce&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;, &lt;a href="https://ucpchecker.com/platforms/magento" rel="noopener noreferrer"&gt;Magento&lt;/a&gt;, and &lt;a href="https://ucpchecker.com/platforms/custom" rel="noopener noreferrer"&gt;Custom&lt;/a&gt;. Leaderboards, capability coverage, and transport adoption — auto-populates as stores verify.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;/.well-known/ucp developer guide&lt;/a&gt;&lt;/strong&gt; — field reference, minimal examples, publishing guides for Nginx/Cloudflare/Node, the six most common validation mistakes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/product-discovery" rel="noopener noreferrer"&gt;Product discovery guide&lt;/a&gt;&lt;/strong&gt; — the MCP tool call sequence agents use to find and buy products. Live demo, discovery-ready stores, three-way CTA to &lt;a href="https://ucpplayground.com" rel="noopener noreferrer"&gt;Playground&lt;/a&gt; + &lt;a href="https://ucpregistry.com" rel="noopener noreferrer"&gt;Registry&lt;/a&gt; + &lt;a href="https://ucprails.com" rel="noopener noreferrer"&gt;Rails&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ucpchecker.com/agents" rel="noopener noreferrer"&gt;Build an Agent quickstart&lt;/a&gt;&lt;/strong&gt; — from zero to a working agent in 30 minutes. Copy-paste code in Python and TypeScript.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spec validation fixes&lt;/strong&gt; — accepted the &lt;code&gt;payment.handlers&lt;/code&gt; nested format (WooCommerce), downgraded empty &lt;code&gt;payment_handlers: []&lt;/code&gt; from hard fail to warning, upgraded our own manifest to v2026-04-08.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What to watch in May
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Salesforce Commerce Cloud.&lt;/strong&gt; First platform-level deployment from the enterprise tier would be the most significant ecosystem event since Shopify's initial rollout. We'll catch any SFCC store that publishes on the next crawl.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The B+ → A path.&lt;/strong&gt; 422 stores are one fix away from flawless agent shopping. We're building tooling to surface the specific issue per store so operators can action it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Non-Shopify growth rate.&lt;/strong&gt; 32 non-Shopify stores this month vs ~15 last month. If this doubles again in May, UCP stops being a "Shopify project" and becomes a genuine multi-platform standard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AP2 / A2A adoption.&lt;/strong&gt; Zero stores declare either protocol. The v2026-04-08 spec formally added &lt;code&gt;a2a&lt;/code&gt; as a transport. First adopter will be notable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;p&gt;All data comes from the UCP Checker crawler, which re-checks every tracked domain at least every 24 hours. The raw verified-merchant dataset is published monthly on &lt;a href="https://huggingface.co/datasets/UCPChecker/ucp-merchants" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt; under CC-BY 4.0.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Browse the directory:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;ucpchecker.com/directory&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Track adoption live:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;ucpchecker.com/stats&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compare two stores:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;ucpchecker.com/compare&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform breakdown:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/platforms" rel="noopener noreferrer"&gt;ucpchecker.com/platforms&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build your own agent:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/agents" rel="noopener noreferrer"&gt;ucpchecker.com/agents&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ecommerce</category>
      <category>ai</category>
      <category>data</category>
      <category>ucp</category>
    </item>
    <item>
      <title>MCP vs UCP vs AP2: What is the Difference?</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Thu, 16 Apr 2026 10:21:57 +0000</pubDate>
      <link>https://forem.com/benjifisher/mcp-vs-ucp-vs-ap2-what-is-the-difference-2p7b</link>
      <guid>https://forem.com/benjifisher/mcp-vs-ucp-vs-ap2-what-is-the-difference-2p7b</guid>
      <description>&lt;p&gt;Every week we get a version of the same question from developers reaching out about UCP Checker: &lt;strong&gt;"OK, but should I actually build on MCP, UCP, or AP2?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It's a reasonable question. The three protocols get lumped together in keynote slides and vendor blog posts, each positioned as "the" standard for how AI agents and commerce should talk to each other. If you're deciding what to implement this quarter, the marketing makes it look like a fork — pick one, live with it.&lt;/p&gt;

&lt;p&gt;Here's the honest answer from running &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;the only continuously-updated UCP directory&lt;/a&gt; of 3,643+ verified stores (as of April 13, 2026):&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP, UCP, and AP2 aren't competitors. They're stack layers. If you're doing agentic commerce seriously, you'll end up using all three — and they fit together more cleanly than the messaging suggests.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This post is the argument, anchored in real adoption data from the UCP Checker directory.&lt;/p&gt;

&lt;h2&gt;
  
  
  The stack, in one diagram
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────┐
│                                         │
│   AP2       ← payment authorization     │
│             (fits inside UCP's          │
│              payment_handlers)          │
│                                         │
├─────────────────────────────────────────┤
│                                         │
│   UCP       ← the shopping contract     │
│             (what a store sells,        │
│              what capabilities exist,   │
│              which transports to use)   │
│                                         │
├─────────────────────────────────────────┤
│                                         │
│   MCP       ← tool invocation           │
│             (how the agent actually     │
│              calls discover-store,      │
│              search-catalog, etc.)      │
│                                         │
└─────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read that top-to-bottom: an AI shopping agent opens a session, discovers a store via UCP, calls the store's tools via MCP, and — when it's ready to pay — hands off to AP2 for the payment authorization flow.&lt;/p&gt;

&lt;p&gt;That's the whole thing. Now the detail.&lt;/p&gt;

&lt;h2&gt;
  
  
  What MCP actually is
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://modelcontextprotocol.io" rel="noopener noreferrer"&gt;Model Context Protocol&lt;/a&gt; is Anthropic's open protocol for connecting AI models to tools, resources, and data sources. It is &lt;strong&gt;not commerce-specific.&lt;/strong&gt; MCP is how Claude Desktop talks to your filesystem, how Cursor talks to your database, how an agent in any environment calls a "list files" or "search knowledge base" tool.&lt;/p&gt;

&lt;p&gt;It's JSON-RPC over stdio, SSE, or HTTP. It defines a handshake, a tool-description schema, a session lifecycle, and a notification system. When an agent wants to "call a function" on an external system, MCP is the envelope that carries the call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In the context of shopping, MCP is the mechanism UCP uses to dispatch commerce tool calls.&lt;/strong&gt; When an AI agent runs &lt;code&gt;search-catalog&lt;/code&gt; against a verified store, it's sending an MCP tool-call message to the endpoint declared in that store's UCP manifest. The &lt;em&gt;fact that MCP is involved&lt;/em&gt; is a transport detail. The &lt;em&gt;fact that a store supports UCP search at all&lt;/em&gt; is what the agent actually cares about.&lt;/p&gt;

&lt;p&gt;Here's the adoption data from our directory right now: &lt;strong&gt;of &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;3,643+ verified UCP stores&lt;/a&gt; as of April 13, 2026, effectively 100% declare MCP as one of their &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;transports&lt;/a&gt;&lt;/strong&gt;. MCP is the de facto transport for UCP. Not because MCP "won" a protocol war, but because there's nothing else that does what it does at this layer.&lt;/p&gt;

&lt;p&gt;If you want to see the exact transport mix, the live breakdown is on &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;the transports page&lt;/a&gt;. It's been MCP-dominant since day one.&lt;/p&gt;

&lt;h2&gt;
  
  
  What UCP actually is
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/protocol" rel="noopener noreferrer"&gt;Universal Commerce Protocol&lt;/a&gt; is the open standard for &lt;strong&gt;agentic commerce specifically&lt;/strong&gt;. It answers questions MCP doesn't:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How does an agent find this store in the first place?&lt;/strong&gt; (Answer: a &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;well-known manifest&lt;/a&gt; at &lt;code&gt;/.well-known/ucp&lt;/code&gt;.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What can the store do?&lt;/strong&gt; (Answer: a declared set of &lt;a href="https://ucpchecker.com/capabilities" rel="noopener noreferrer"&gt;capabilities&lt;/a&gt; — checkout, cart, catalog-search, identity-linking, buyer-consent, fulfillment, and so on.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How does the agent talk to it?&lt;/strong&gt; (Answer: one or more declared &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;transports&lt;/a&gt; — REST, MCP, A2A, or Embedded. MCP being by far the most common in practice.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What payment methods does the store accept?&lt;/strong&gt; (Answer: a declared list of &lt;a href="https://ucpchecker.com/payment-handlers" rel="noopener noreferrer"&gt;payment handlers&lt;/a&gt; — Stripe, Google Pay, Shop Pay, and others — with enough detail for an agent to tokenize a card.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;UCP is the &lt;strong&gt;contract&lt;/strong&gt;. It's a JSON document agents fetch before they do anything else. Without a valid UCP manifest, an agent doesn't know what your store can do, which tools it exposes, or how to pay. It can scrape your HTML like any other crawler — and most of them will — but the experience is slow, unreliable, and breaks at checkout more often than it succeeds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UCP's job is to be the single document that makes a store shoppable by agents.&lt;/strong&gt; MCP is the mechanism it points at. AP2 fits inside its payment handler list.&lt;/p&gt;

&lt;h2&gt;
  
  
  What AP2 actually is
&lt;/h2&gt;

&lt;p&gt;Agent Payments Protocol is Google's specification for how AI agents authorize and execute payments. It's much more specialized than either MCP or UCP: it's about the &lt;strong&gt;money layer&lt;/strong&gt; specifically — consent, authorization, dispute handling, the cryptographic mandate that lets a specific agent run a specific transaction.&lt;/p&gt;

&lt;p&gt;AP2 is newer than both MCP and UCP, and its adoption numbers reflect that. &lt;strong&gt;At time of writing, we have zero stores in the directory declaring an AP2 payment handler&lt;/strong&gt;, compared to dozens declaring Shop Pay, Stripe, Google Pay, and the various tokenizer namespaces.&lt;/p&gt;

&lt;p&gt;That might sound like a strike against AP2. It isn't. AP2 is a &lt;em&gt;different kind of thing&lt;/em&gt; — it's not a transport and it's not a capability declaration, it's a protocol for the auth step that happens after the agent has already selected items and built a cart. The stores that will eventually use it will declare it as a &lt;a href="https://ucpchecker.com/payment-handlers" rel="noopener noreferrer"&gt;payment handler namespace&lt;/a&gt; inside their existing UCP manifest. UCP is the envelope that carries AP2 into the agent commerce ecosystem.&lt;/p&gt;

&lt;p&gt;When AP2 adoption starts showing up in the directory, UCP Checker will catch it automatically on the next crawl cycle. We'll know because we track every payment handler namespace across every verified store, and we'll be able to tell you exactly which stores flipped first. That's the kind of thing &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;the directory is for&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How they actually compose — a worked example
&lt;/h2&gt;

&lt;p&gt;Picture an AI shopping agent asked to buy a pair of shoes. Here's what happens in a UCP-verified flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;UCP discovery.&lt;/strong&gt; The agent fetches &lt;code&gt;https://allbirds.com/.well-known/ucp&lt;/code&gt;. It parses the JSON, reads the capability list (&lt;code&gt;checkout&lt;/code&gt;, &lt;code&gt;cart&lt;/code&gt;, &lt;code&gt;catalog-search&lt;/code&gt;, &lt;code&gt;payment&lt;/code&gt;, &lt;code&gt;identity-linking&lt;/code&gt;…), picks the transport it wants to use from the declared list (it picks MCP because it's listed first and the agent speaks MCP), and notes the payment handlers this store accepts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP session.&lt;/strong&gt; The agent connects to the MCP endpoint declared in the UCP manifest. It opens a session, lists the available tools, and calls &lt;code&gt;search-catalog({query: "running shoes"})&lt;/code&gt;. The store responds with a list of products.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More MCP.&lt;/strong&gt; The agent calls &lt;code&gt;add-to-cart({variant: "...", quantity: 1})&lt;/code&gt;. The store responds with a cart state and a checkout URL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payment hand-off.&lt;/strong&gt; The agent needs to pay. It looks at the store's declared payment handlers. If one of them is an AP2 namespace (not yet, but eventually), it runs the AP2 authorization flow — getting consent, building the mandate, submitting the authorization. If not, it falls back to tokenizing a card via the declared payment handler's tokenization spec (Stripe, Google Pay, Shop Pay, etc.).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confirmation.&lt;/strong&gt; The agent calls &lt;code&gt;order.create&lt;/code&gt; (another MCP tool call, same session, same transport) and gets back an order confirmation.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;UCP, MCP, and AP2 were all involved in that flow — at different layers, for different purposes. None of them could have replaced the others. That's the whole argument.&lt;/p&gt;

&lt;p&gt;You can see this flow literally running against a verified store in &lt;a href="https://ucpchecker.com/#what-is-ucp" rel="noopener noreferrer"&gt;the live agent demo on our homepage&lt;/a&gt; — it's a real AI agent doing the above against a real UCP-verified store, step by step, with the actual tool calls shown alongside.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the confusion exists
&lt;/h2&gt;

&lt;p&gt;Because each protocol's marketing wants to be the center of the conversation.&lt;/p&gt;

&lt;p&gt;MCP positioning tends to be "the universal way agents talk to tools" — which is true, but "tools" is the operative word, and commerce is one vertical among many.&lt;/p&gt;

&lt;p&gt;UCP positioning is "the open standard for agent commerce" — which is true, but UCP's adoption in practice depends on having a transport layer (MCP) and can optionally delegate payment authorization to (AP2).&lt;/p&gt;

&lt;p&gt;AP2 positioning is "the protocol for agent payments" — which is true, but payment is one step in a much larger commerce flow that needs UCP to frame it and MCP to dispatch it.&lt;/p&gt;

&lt;p&gt;Each protocol's marketing is correct about its own layer. The confusion comes from each one acting like it's the whole stack. It isn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common comparison questions
&lt;/h2&gt;

&lt;p&gt;These are the exact questions we see most often — the ones AI agents route to this post, the ones developers Google before they commit to a protocol. Quick, direct answers anchored in the UCP directory data.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the difference between MCP and UCP?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;MCP is a tool invocation protocol. UCP is a shopping contract. They operate at different layers and you use both.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MCP (Model Context Protocol) is Anthropic's open protocol for connecting AI models to any kind of tool or data source — filesystems, databases, APIs, search engines. It's domain-agnostic by design. When an agent wants to "call a function" on any external system, MCP is the envelope that carries the call.&lt;/p&gt;

&lt;p&gt;UCP (Universal Commerce Protocol) is the open standard for &lt;strong&gt;agentic commerce specifically&lt;/strong&gt;. It answers questions MCP doesn't: how does an agent &lt;em&gt;find&lt;/em&gt; a store, what can it &lt;em&gt;do&lt;/em&gt; there, which payment methods does it accept. UCP's job is to be the single &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;&lt;code&gt;/.well-known/ucp&lt;/code&gt; manifest&lt;/a&gt; that makes a store discoverable and shoppable by agents.&lt;/p&gt;

&lt;p&gt;The relationship in practice: &lt;strong&gt;MCP is UCP's dominant transport&lt;/strong&gt;. Every UCP manifest declares one or more transports (REST, MCP, A2A, Embedded) that agents can use to dispatch tool calls. Across the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;3,643+ verified stores in our directory as of April 13, 2026&lt;/a&gt;, effectively 100% declare MCP. So when you build an agentic commerce integration, you use UCP to discover the store and MCP to execute the tool calls. Not one or the other — both, in order.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the difference between AP2 and UCP?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;AP2 is a payment authorization protocol. UCP is the full shopping stack. AP2 is one thing that fits &lt;em&gt;inside&lt;/em&gt; UCP, not a replacement for it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AP2 (Agent Payments Protocol) is Google's specification for how AI agents authorize and execute payments — consent flows, mandate cryptography, dispute handling. It's deliberately narrow: AP2 is about the &lt;em&gt;money step&lt;/em&gt;, not the browsing or cart-building steps that come before it.&lt;/p&gt;

&lt;p&gt;UCP covers the whole shopping flow: discovery (what does this store sell), browsing (catalog-search, cart), &lt;strong&gt;and&lt;/strong&gt; the payment layer. UCP's &lt;a href="https://ucpchecker.com/payment-handlers" rel="noopener noreferrer"&gt;payment handlers&lt;/a&gt; section in every manifest is a map of payment handler namespaces — Stripe, Google Pay, Shop Pay, and eventually AP2 when it reaches adoption. &lt;strong&gt;AP2, when it ships in production stores, will show up as a payment handler namespace inside an existing UCP manifest&lt;/strong&gt;, sitting alongside the other tokenization methods an agent can choose from.&lt;/p&gt;

&lt;p&gt;Current adoption data as of April 13, 2026: &lt;strong&gt;zero stores in the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;UCP directory&lt;/a&gt; declare an AP2 payment handler&lt;/strong&gt;. That's not a criticism — AP2 is newer than both MCP and UCP, and the rollout is gated on payment processors exposing it. But it makes the practical answer clear right now: you publish a UCP manifest today, you add AP2 later when your payment processor supports it. UCP is the umbrella; AP2 is one of the spokes it will eventually hold.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the difference between A2A and UCP?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;A2A is a transport protocol (like MCP). UCP is the shopping contract. A2A is one of UCP's allowed transport options, not a competitor.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A2A (&lt;a href="https://google.github.io/A2A/" rel="noopener noreferrer"&gt;Agent2Agent Protocol&lt;/a&gt;) is Google's protocol for agent-to-agent communication — how two agents talk to each other directly without a human intermediary. It serves a similar role to MCP within the UCP stack: it's the mechanism an agent uses to dispatch tool calls against a store's endpoint.&lt;/p&gt;

&lt;p&gt;UCP's v2026-04-08 spec &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;lists four allowed transports&lt;/a&gt;: &lt;strong&gt;REST, MCP, A2A, and Embedded&lt;/strong&gt;. A store can declare any or all of them in its manifest — and agents will pick whichever they support when they connect. A2A is formally on the list, same as MCP.&lt;/p&gt;

&lt;p&gt;In practice, &lt;strong&gt;A2A adoption on verified stores is effectively zero today&lt;/strong&gt;, versus MCP's near-100%. The reason isn't technical, it's ecosystem timing: MCP shipped earlier and got the first wave of tooling. A2A is a strong candidate for the second wave once agent-to-agent coordination (one agent buying on behalf of another, multi-agent fulfilment pipelines) becomes a common pattern. When that happens, stores will add A2A to their existing UCP manifests alongside MCP — not replacing it. The correct framing is still "A2A goes &lt;em&gt;inside&lt;/em&gt; UCP," same as MCP does.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which should you actually adopt
&lt;/h2&gt;

&lt;p&gt;Depends on what you're building.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're a store owner or ecommerce engineer&lt;/strong&gt;, your job is to &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;publish a valid UCP manifest&lt;/a&gt; at &lt;code&gt;/.well-known/ucp&lt;/code&gt; that declares your capabilities, transports (MCP will almost certainly be one of them), and payment handlers. You don't need to "pick" MCP — UCP will tell you to expose an MCP endpoint as one of its transports, and most of the tooling you'll find assumes MCP. AP2 you can add later, as a payment handler, when your payment processor supports it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're an agent or tooling developer&lt;/strong&gt;, you need to speak all three, in the right order. Fetch UCP first to discover the store. Use MCP to actually dispatch tool calls against the declared endpoint. Handle AP2 at the payment step if the store declares it. In practice you'll build a UCP client library that wraps all of this transparently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're a payments company&lt;/strong&gt;, AP2 is your layer. Your job is to get your payment processor's tokenization spec declared as a payment handler in UCP manifests across the ecosystem, and eventually to support AP2 mandates as the authorization step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're analyzing the ecosystem&lt;/strong&gt;, look at the UCP adoption data. MCP transport counts and AP2 payment handler counts are both measurable inside UCP manifests — which is why we surface them on the &lt;a href="https://ucpchecker.com/platforms" rel="noopener noreferrer"&gt;platforms&lt;/a&gt;, &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;transports&lt;/a&gt;, and &lt;a href="https://ucpchecker.com/payment-handlers" rel="noopener noreferrer"&gt;payment handlers&lt;/a&gt; pages. The question is never "which protocol won," it's "how many stores have it."&lt;/p&gt;

&lt;h2&gt;
  
  
  The one-line summary
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;UCP is the shopping contract. MCP is how agents dispatch tool calls against it. AP2 is how they authorize payments inside it. All three are required for a complete agent commerce stack, and none of them replaces the others.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're unclear whether your store is set up correctly, &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;run a live check&lt;/a&gt; — we'll fetch your manifest, validate it against the current spec, and tell you exactly which transports and payment handlers you're declaring (and which you're missing). If you're evaluating two stores' UCP coverage side-by-side, &lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;the compare tool&lt;/a&gt; puts their capabilities, transports, and payment handlers in a single scannable view.&lt;/p&gt;

&lt;p&gt;And if you're building something on top of the stack and want to know which stores have what — that's what &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;the directory&lt;/a&gt; is for.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Check your manifest:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;ucpchecker.com/check&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compare two stores:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;ucpchecker.com/compare&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Browse the directory:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;ucpchecker.com/directory&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developer guide to /.well-known/ucp:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/well-known-ucp" rel="noopener noreferrer"&gt;ucpchecker.com/well-known-ucp&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>ai</category>
      <category>ucp</category>
    </item>
    <item>
      <title>Introducing Side-by-Side Store Compare: See How Any Two UCP Stores Stack Up</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Mon, 13 Apr 2026 20:47:52 +0000</pubDate>
      <link>https://forem.com/benjifisher/introducing-side-by-side-store-compare-see-how-any-two-ucp-stores-stack-up-3djm</link>
      <guid>https://forem.com/benjifisher/introducing-side-by-side-store-compare-see-how-any-two-ucp-stores-stack-up-3djm</guid>
      <description>&lt;p&gt;Three months into running UCPChecker, the most common follow-up question we get from anyone reading a status report is the same: &lt;strong&gt;"OK, but how does that compare to [other store]?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That question comes from everywhere. &lt;a href="https://ucpchecker.com/developer-tools" rel="noopener noreferrer"&gt;Developers&lt;/a&gt; picking which store to integrate with first. &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;Analysts&lt;/a&gt; tracking which platforms are pulling ahead in UCP coverage. &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;Store owners&lt;/a&gt; benchmarking against direct competitors. Marketing teams putting together pitch decks about why their stack is more agent-ready than the next. Platform vendors comparing their hosted ecosystem against rival platforms. &lt;a href="https://ucpchecker.com/mcp-tools" rel="noopener noreferrer"&gt;AI agent builders&lt;/a&gt; deciding which retailers to feature in demo flows.&lt;/p&gt;

&lt;p&gt;None of these audiences really care about a single store's UCP coverage in isolation. They all care about how it stacks up against another. Whether &lt;a href="https://ucpchecker.com/status/allbirds.com" rel="noopener noreferrer"&gt;Allbirds&lt;/a&gt; is more agent-ready than &lt;a href="https://ucpchecker.com/status/casper.com" rel="noopener noreferrer"&gt;Casper&lt;/a&gt;. Whether &lt;a href="https://ucpchecker.com/status/boden.com" rel="noopener noreferrer"&gt;Boden's&lt;/a&gt; Shopify implementation goes deeper than &lt;a href="https://ucpchecker.com/status/bornforfashion.com" rel="noopener noreferrer"&gt;Born for Fashion's&lt;/a&gt;. Whether the brand they're about to integrate with has more capabilities than the one they're already integrated with. The interesting answer is always relative.&lt;/p&gt;

&lt;p&gt;Until today, the only way to get that answer on UCPChecker was to open two browser tabs and squint. So we built the thing people were already trying to do manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;Compare any two UCP stores side-by-side at ucpchecker.com/compare →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;Pick two domains. Get every measurable UCP attribute laid out side by side in a single scannable view.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Headline metrics&lt;/strong&gt;: status, UCP version, latency, capability count, transport count, payment handler count, HTTP status, robots.txt policy, platform. Quantitative cells highlight the leading side with a soft green left-border, so you can scan winners without reading numbers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capability matrix&lt;/strong&gt;: every UCP capability declared by either store, bucketed into "Both stores", "A only", and "B only". Each chip links straight to the &lt;a href="https://ucpchecker.com/capabilities" rel="noopener noreferrer"&gt;capability's deep-dive page&lt;/a&gt;, so if you spot a gap you can immediately see what it is and why it matters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transport diff&lt;/strong&gt;: same Both / A only / B only treatment for &lt;a href="https://ucpchecker.com/transports/rest" rel="noopener noreferrer"&gt;REST&lt;/a&gt;, &lt;a href="https://ucpchecker.com/transports/mcp" rel="noopener noreferrer"&gt;MCP&lt;/a&gt;, A2A, and &lt;a href="https://ucpchecker.com/transports/embedded" rel="noopener noreferrer"&gt;Embedded&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI bot access matrix&lt;/strong&gt;: GPTBot, Google-Extended, ClaudeBot, Applebot-Extended, and CCBot — allowed, blocked, or unknown for each store.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payment handlers&lt;/strong&gt;: which payment methods each manifest declares, including the ones one side has and the other doesn't.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pick another&lt;/strong&gt;: tiny inline form pre-filled with the current side A so you can swap side B and re-run instantly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Related comparisons&lt;/strong&gt;: auto-suggested by capability overlap with side A — a useful map of who else is building similar agent surface in the same space.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Status-aware FAQ&lt;/strong&gt;: the questions change depending on the matchup. Two verified stores get a different lead question than one verified vs one not-detected.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's free, public, indexable, and works with any domain. If a store isn't already in our &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;directory&lt;/a&gt;, we run a &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;live check&lt;/a&gt; the first time you compare it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we built it
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Two reasons. The first one is the most honest.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Right now, UCP coverage is a moving target. Some stores have everything declared — &lt;a href="https://ucpchecker.com/capabilities/checkout" rel="noopener noreferrer"&gt;checkout&lt;/a&gt;, &lt;a href="https://ucpchecker.com/capabilities/cart" rel="noopener noreferrer"&gt;cart management&lt;/a&gt;, &lt;a href="https://ucpchecker.com/capabilities/identity-linking" rel="noopener noreferrer"&gt;identity linking&lt;/a&gt;, &lt;a href="https://ucpchecker.com/capabilities/payment" rel="noopener noreferrer"&gt;payment tokens&lt;/a&gt;, multiple &lt;a href="https://ucpchecker.com/transports" rel="noopener noreferrer"&gt;transports&lt;/a&gt;. Other stores have a single capability and a single transport, technically &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;verified&lt;/a&gt; but barely useful to an agent. The &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;directory&lt;/a&gt; grade tells you "verified" or "not". It doesn't tell you whether one verified store is significantly more agent-ready than another.&lt;/p&gt;

&lt;p&gt;That difference matters more every week. The teams building agentic commerce tooling — the ones picking which stores to index first, which to feature in demo flows, which to recommend to their users — they need a relative view. They need to know that allbirds.com's manifest goes three levels deeper than the manifest of an otherwise equivalent store. They were already opening two status pages and comparing fields by hand. We watched it happen in user sessions. Compare just makes that workflow native.&lt;/p&gt;

&lt;p&gt;The second reason is more strategic. We've been quietly building the infrastructure for what we think will be the most important question in agentic commerce as it matures: &lt;strong&gt;not whether a store is verified, but who's pulling ahead.&lt;/strong&gt; Compare is the first user-facing surface that exposes that question directly. There will be more.&lt;/p&gt;

&lt;h2&gt;
  
  
  How we built it (the short version)
&lt;/h2&gt;

&lt;p&gt;The data was already there. Every Merchant in our database has its capabilities, transports, and payment handlers loaded as proper many-to-many relationships. The computation is just three set operations per relation: intersect, left-only, right-only. The hard part was deciding what to compare and how to render the diff so two columns of dense data still feel scannable on a phone.&lt;/p&gt;

&lt;p&gt;Some of the design decisions worth calling out:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Alphabetical canonical URLs.&lt;/strong&gt; &lt;code&gt;/compare/casper.com/vs/allbirds.com&lt;/code&gt; 301-redirects to &lt;code&gt;/compare/allbirds.com/vs/casper.com&lt;/code&gt;. Without that, every store-pair would generate two URLs and split its link equity in half. Pretty URL, single canonical, no duplicate content.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sync check on first visit.&lt;/strong&gt; If you compare a domain that isn't in our database yet, we run a &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;fresh UCP check&lt;/a&gt; inline before rendering. The compare page never shows "no data" — it always has something to compare. Same pattern as the &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;per-store status pages&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Noindex when neither side is verified.&lt;/strong&gt; Two non-verified stores produces a thin page that would just pollute the search index. Those pages still work for visitors who land on them — they just don't get crawled. As soon as either side becomes verified, the page flips to indexable automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A "winner" highlight, not a "winner" badge.&lt;/strong&gt; The leading side on a quantitative metric (lower latency, more capabilities, fresher check) gets a gentle green left-border on its cell — but we never write the word "better" or "worse" anywhere. The data speaks for itself, and "better" isn't a value judgment we want to be making about other people's stores.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Status-aware FAQ that mirrors visible HTML to JSON-LD.&lt;/strong&gt; Every compare page emits a real &lt;code&gt;FAQPage&lt;/code&gt; schema with the same questions and answers a human reader sees. The FAQ branches based on the matchup so the lead question is always relevant to what you're looking at.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Four pairings we've been opening manually for weeks. Each one is a live embed of the actual comparison — the same data refreshes every 24 hours from our crawler.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;allbirds.com vs casper.com&lt;/strong&gt; — two well-known DTC brands, see how their capabilities differ.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/compare/allbirds.com/vs/casper.com" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4k2xv663odafm1d0yq93.png" alt="allbirds.com vs casper.com UCP comparison" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;boden.com vs kyliecosmetics.com&lt;/strong&gt; — both verified Shopify stores, compare their depth.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/compare/boden.com/vs/kyliecosmetics.com" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp6cn76fzgl7cirwylag6.png" alt="boden.com vs kyliecosmetics.com UCP comparison" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;hairlust.com vs thebodyshop.com&lt;/strong&gt; — beauty vs hair, both verified.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/compare/hairlust.com/vs/thebodyshop.com" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ungszgdgli2rocv9w7b.png" alt="hairlust.com vs thebodyshop.com UCP comparison" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;bornforfashion.com vs casper.com&lt;/strong&gt; — fashion vs sleep, contrasting capability surface.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ucpchecker.com/compare/bornforfashion.com/vs/casper.com" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw09qd50g4yccts789uuk.png" alt="bornforfashion.com vs casper.com UCP comparison" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Or just start typing two domains into &lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;ucpchecker.com/compare&lt;/a&gt;. Autocomplete suggests from the verified directory.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;A few obvious extensions we're sitting on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Embeds need more testing in the wild.&lt;/strong&gt; We've shipped iframe and Markdown embeds and validated them locally, but the real test is seeing them deployed across Substack, Medium, Notion, GitHub READMEs, and the hundred CMSes we don't have on our test bench. If you embed a comparison and the layout breaks, &lt;a href="https://ucpchecker.com/contact" rel="noopener noreferrer"&gt;tell us&lt;/a&gt; — we'll fix it fast.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;postMessage iframe auto-resize.&lt;/strong&gt; Right now embeds use a fixed iframe height (900px by default). Comparisons with sparse capability data leave whitespace; comparisons with dense data sometimes scroll. The cleanest fix is a postMessage handshake from the embed to the host page so the iframe sizes itself to its content. On the list.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Three-way and N-way comparison.&lt;/strong&gt; Two columns is the right default — past two, the visual gets cramped — but for "which of these five Shopify stores has the deepest UCP implementation" type questions, we'll likely add a tabular wide-mode behind a separate URL.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Compare is the first product surface we've shipped that frames UCP coverage as a relative thing rather than a binary verified/not. It changes what you can ask. We're already seeing internal queries we couldn't run before — "show me every &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;verified store&lt;/a&gt; that has &lt;a href="https://ucpchecker.com/capabilities/cart" rel="noopener noreferrer"&gt;cart management&lt;/a&gt; but is missing &lt;a href="https://ucpchecker.com/capabilities/identity-linking" rel="noopener noreferrer"&gt;identity linking&lt;/a&gt;" is one diff away from being a real question someone outside our team can answer.&lt;/p&gt;

&lt;p&gt;If you build something with it, or if you find a comparison that surprised you, &lt;a href="https://ucpchecker.com/contact" rel="noopener noreferrer"&gt;let us know&lt;/a&gt;. The interesting comparisons are the ones we haven't thought to run.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Try it now:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/compare" rel="noopener noreferrer"&gt;ucpchecker.com/compare&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Browse the directory:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/directory" rel="noopener noreferrer"&gt;ucpchecker.com/directory&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Track adoption live:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;ucpchecker.com/stats&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validate a manifest:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/ucp-validator" rel="noopener noreferrer"&gt;ucpchecker.com/ucp-validator&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get notified on changes:&lt;/strong&gt; &lt;a href="https://ucpchecker.com/alerts" rel="noopener noreferrer"&gt;ucpchecker.com/alerts&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>product</category>
      <category>ucp</category>
    </item>
    <item>
      <title>UCP v2026-04-08 Spec Update</title>
      <dc:creator>Benji Fisher</dc:creator>
      <pubDate>Sat, 11 Apr 2026 12:04:52 +0000</pubDate>
      <link>https://forem.com/benjifisher/ucp-v2026-04-08-spec-update-39l2</link>
      <guid>https://forem.com/benjifisher/ucp-v2026-04-08-spec-update-39l2</guid>
      <description>&lt;p&gt;On April 9th, the UCP Technical Council shipped &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/releases/tag/v2026-04-08" rel="noopener noreferrer"&gt;v2026-04-08&lt;/a&gt; — the first spec bump since January's &lt;code&gt;2026-01-23&lt;/code&gt; release. It's the largest single release in the protocol's history: 26 new features, 6 breaking changes, 19 documentation updates, and contributions from 15 first-time contributors.&lt;/p&gt;

&lt;p&gt;This isn't a patch. It's the release where UCP stops being a checkout-and-order protocol and starts becoming a full commerce platform.&lt;/p&gt;

&lt;p&gt;Here's what changed, what it means, and what you should do about it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The headline features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Carts are now a first-class capability
&lt;/h3&gt;

&lt;p&gt;The most consequential addition is formal cart support (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/73" rel="noopener noreferrer"&gt;#73&lt;/a&gt;). The &lt;code&gt;dev.ucp.shopping.cart&lt;/code&gt; capability gives agents the ability to create, read, update, and manage persistent shopping carts — the workflow that dominates human e-commerce but has been entirely absent from agent commerce until now.&lt;/p&gt;

&lt;p&gt;We &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-march-2026" rel="noopener noreferrer"&gt;wrote in March&lt;/a&gt; that only 2 out of 2,832 verified stores declared cart capabilities. That number was low because the spec itself hadn't formalized the capability. Now it has. The schema defines add-to-cart, remove, quantity updates, and cart retrieval. Discount extensions have been &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/246" rel="noopener noreferrer"&gt;expanded to work with carts&lt;/a&gt; too, so agents can apply promo codes before checkout.&lt;/p&gt;

&lt;p&gt;For agent developers: this is the capability that unlocks multi-step shopping. Instead of "find product → checkout immediately," agents can now build baskets, compare options, apply discounts, and let the user review before committing. Design for it now, even if adoption will take months to ramp.&lt;/p&gt;

&lt;h3&gt;
  
  
  Catalog search and product lookup
&lt;/h3&gt;

&lt;p&gt;Agents can now discover what a store actually sells. The new &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/55" rel="noopener noreferrer"&gt;catalog search&lt;/a&gt; and &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/195" rel="noopener noreferrer"&gt;product lookup&lt;/a&gt; capabilities (&lt;code&gt;dev.ucp.shopping.catalog_search&lt;/code&gt; and &lt;code&gt;dev.ucp.shopping.catalog_lookup&lt;/code&gt;) give agents structured access to product discovery — search by keyword, filter by attributes, and retrieve full product details including variant IDs.&lt;/p&gt;

&lt;p&gt;Previously, agents relied on unstructured HTML scraping or platform-specific APIs to find products before initiating checkout. Now product discovery is part of the protocol itself. This is the missing first step: an agent can search a store's catalog, find what it needs, add items to a cart, and check out — all through UCP.&lt;/p&gt;

&lt;p&gt;We've added catalog detection to &lt;a href="https://ucpchecker.com/capabilities" rel="noopener noreferrer"&gt;UCPChecker's capability tracking&lt;/a&gt;. As stores adopt these capabilities, you'll see them in the &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;stats&lt;/a&gt; and on individual &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;store profiles&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Request and response signing
&lt;/h3&gt;

&lt;p&gt;Cryptographic signing (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/156" rel="noopener noreferrer"&gt;#156&lt;/a&gt;) is the security feature the protocol needed. Stores can now sign responses, and agents can verify they're talking to the real merchant — not a MITM or a spoofed endpoint.&lt;/p&gt;

&lt;p&gt;The spec uses JWK-format public keys published in the discovery profile. What's notable is where those keys live: &lt;code&gt;signing_keys&lt;/code&gt; has moved from inside the &lt;code&gt;ucp&lt;/code&gt; object to the &lt;strong&gt;root level&lt;/strong&gt; of the discovery profile, sitting as a sibling of &lt;code&gt;ucp&lt;/code&gt; rather than nested within it. This is a structural change that affects how validators parse manifests.&lt;/p&gt;

&lt;p&gt;We've updated &lt;a href="https://ucpchecker.com/validator" rel="noopener noreferrer"&gt;our validator&lt;/a&gt; to handle both locations — the new root-level position for v2026-04-08+ manifests, and the legacy nested position for older versions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The structural changes that matter
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Business profiles vs. platform profiles
&lt;/h3&gt;

&lt;p&gt;This is the change that will generate the most false positives if your tooling doesn't adapt.&lt;/p&gt;

&lt;p&gt;v2026-04-08 formally distinguishes &lt;strong&gt;platform profiles&lt;/strong&gt; (the full spec declarations that platforms like Shopify publish) from &lt;strong&gt;business profiles&lt;/strong&gt; (what individual stores serve at &lt;code&gt;/.well-known/ucp&lt;/code&gt;). The key difference: business profiles no longer require &lt;code&gt;spec&lt;/code&gt; and &lt;code&gt;schema&lt;/code&gt; URLs on capabilities, services, or payment handlers. Those fields are only mandatory at the platform level — stores inherit them from their platform.&lt;/p&gt;

&lt;p&gt;This makes sense. A Shopify merchant shouldn't need to declare &lt;code&gt;"spec": "https://ucp.dev/specification/shopping/checkout/"&lt;/code&gt; in their manifest — that's Shopify's concern, not the merchant's. But every validator that checked for these fields as required will now throw false warnings against perfectly valid business profiles.&lt;/p&gt;

&lt;p&gt;UCPChecker has already updated its validation rules. If your store runs v2026-04-08, we'll validate against the business profile schema — no spurious warnings about missing &lt;code&gt;spec&lt;/code&gt; or &lt;code&gt;schema&lt;/code&gt; fields that your platform handles upstream. You can verify your store at &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;UCPChecker.com&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-parent capability extensions
&lt;/h3&gt;

&lt;p&gt;Capabilities can now &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/96" rel="noopener noreferrer"&gt;extend multiple parents&lt;/a&gt; with deterministic schema resolution. This sounds abstract, but it solves a real problem: capabilities like embedded checkout that need to compose behaviors from both the cart and checkout namespaces.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;extends&lt;/code&gt; field on capabilities now accepts an array of reverse-domain names instead of just a single string. Schema resolution follows a defined order, so there's no ambiguity about which parent's definition wins when there's a conflict.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supported versions for backwards compatibility
&lt;/h3&gt;

&lt;p&gt;Business profiles can now declare a &lt;code&gt;supported_versions&lt;/code&gt; field — a map of older protocol versions to their profile URIs. This means a store can advertise "I speak v2026-04-08, but I also have a v2026-01-23 profile at this URL" — letting agents negotiate down to a version they understand.&lt;/p&gt;

&lt;p&gt;For the ecosystem, this is important infrastructure. It means the v2026-01-23 → v2026-04-08 migration doesn't have to be a flag day. Stores can support both versions simultaneously while agents upgrade.&lt;/p&gt;

&lt;h2&gt;
  
  
  The breaking changes
&lt;/h2&gt;

&lt;p&gt;Six changes in this release are marked breaking. Here's what they actually break:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Order schema: currency is now required&lt;/strong&gt; (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/283" rel="noopener noreferrer"&gt;#283&lt;/a&gt;). Previously optional, the &lt;code&gt;currency&lt;/code&gt; field on orders is now mandatory. If your implementation omits it, your order responses will fail validation against v2026-04-08. Fix: add the ISO 4217 currency code (e.g., &lt;code&gt;"USD"&lt;/code&gt;) to your order objects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authorization and abuse signals&lt;/strong&gt; (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/203" rel="noopener noreferrer"&gt;#203&lt;/a&gt;). Stores can now communicate authorization requirements and abuse indicators to agents. This is new infrastructure for trust — stores can signal "this transaction requires additional verification" or "this request pattern looks suspicious" in a structured way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Updated order capability&lt;/strong&gt; (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/254" rel="noopener noreferrer"&gt;#254&lt;/a&gt;). The order schema has been restructured. If you're consuming or producing order responses, check your field names against the updated schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Embedded protocol error alignment&lt;/strong&gt; (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/325" rel="noopener noreferrer"&gt;#325&lt;/a&gt;). Error responses in the embedded checkout protocol now follow UCP's standard error conventions. If you're parsing embedded checkout errors, the shape has changed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Totals format change&lt;/strong&gt; (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/299" rel="noopener noreferrer"&gt;#299&lt;/a&gt;). Total amounts now use &lt;code&gt;signed_amount.json&lt;/code&gt; — a format that can represent both positive and negative values (for discounts, refunds). If you're reading totals as simple numbers, you'll need to handle the new format.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Identity linking reverted&lt;/strong&gt; (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/pull/329" rel="noopener noreferrer"&gt;#329&lt;/a&gt;). A previously planned identity linking change was reverted. If you implemented against an earlier draft, verify your identity handling matches the released spec.&lt;/p&gt;

&lt;h2&gt;
  
  
  What didn't ship
&lt;/h2&gt;

&lt;p&gt;Worth noting what's still in progress. Loyalty capabilities (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/issues/251" rel="noopener noreferrer"&gt;#251&lt;/a&gt;) and return extensions (&lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/issues/257" rel="noopener noreferrer"&gt;#257&lt;/a&gt;) were tracked for this release but didn't make the cut. The cart capability landed; the full post-purchase lifecycle is still forming.&lt;/p&gt;

&lt;p&gt;Per-capability versioning — the ability to bump individual capabilities without bumping the entire protocol — was discussed at the March TC meeting but deferred. The infrastructure for sub-repo versioning is being explored by TC members, but for now, breaking changes still bundle into full protocol bumps.&lt;/p&gt;

&lt;h2&gt;
  
  
  23 contributors, 15 first-timers
&lt;/h2&gt;

&lt;p&gt;This release had contributions from 23 people, 15 of whom were first-time contributors. The contributor base has broadened beyond the founding companies: documentation fixes from independent developers, schema improvements from payment processors, and tooling contributions from platform teams.&lt;/p&gt;

&lt;p&gt;Notable additions: endorsed partners now include Block, Fiserv, Klarna, Splitit, Affirm, and Checkout.com — payment infrastructure companies whose involvement signals where agent commerce payment flows are heading.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do now
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you're a store operator:&lt;/strong&gt; Check your store at &lt;a href="https://ucpchecker.com/check" rel="noopener noreferrer"&gt;UCPChecker.com&lt;/a&gt;. We've updated our validation to v2026-04-08 rules, so you'll see an accurate assessment against the new spec. If you're on Shopify, your platform will handle the migration — watch for their update timeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're an agent developer:&lt;/strong&gt; Start building for carts and catalog search. These capabilities will roll out through platform-level updates, and when they do, the adoption curve will look like checkout's did — slow for a few weeks, then near-universal overnight. The &lt;a href="https://ucpplayground.com?utm_source=ucpchecker&amp;amp;utm_medium=blog&amp;amp;utm_campaign=v2026-04-08" rel="noopener noreferrer"&gt;UCP Playground&lt;/a&gt; is the place to test agent interactions against stores that adopt early.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're a platform team:&lt;/strong&gt; The business-vs-platform profile split is the structural change to focus on. Your merchants' profiles just got simpler (fewer required fields), but your platform profile got stricter (spec and schema URLs are mandatory). Review the &lt;a href="https://ucpchecker.com/specs" rel="noopener noreferrer"&gt;spec version details&lt;/a&gt; and validate your platform-level profile against the new schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're building tooling:&lt;/strong&gt; Update your validators. The &lt;code&gt;signing_keys&lt;/code&gt; location change, business profile relaxation, and new capability schemas all affect validation logic. We've open-sourced our approach — check the &lt;a href="https://ucpchecker.com/methodology" rel="noopener noreferrer"&gt;methodology page&lt;/a&gt; for how UCPChecker handles version-aware validation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bigger picture
&lt;/h2&gt;

&lt;p&gt;v2026-01-23 gave us checkout. v2026-04-08 gives us the rest of the shopping experience: discovery, carts, signing, and the groundwork for trust and authorization. The protocol is filling in the gaps between "an agent can technically buy something" and "an agent can shop the way a human does."&lt;/p&gt;

&lt;p&gt;We &lt;a href="https://ucpchecker.com/blog/state-of-agentic-commerce-february-2026" rel="noopener noreferrer"&gt;reported in March&lt;/a&gt; that the gap between "has a manifest" (87%) and "an agent can actually buy something" (45% checkout rate) is where the real work lives. This spec release addresses the structural reasons for that gap — not by making checkout better, but by giving agents the capabilities they need for the steps &lt;em&gt;before&lt;/em&gt; and &lt;em&gt;after&lt;/em&gt; checkout.&lt;/p&gt;

&lt;p&gt;We're tracking the v2026-04-08 migration wave across all monitored domains. Check the &lt;a href="https://ucpchecker.com/stats" rel="noopener noreferrer"&gt;stats page&lt;/a&gt; for real-time adoption data, and subscribe to the weekly report for ecosystem updates as stores begin the transition.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Analysis based on the &lt;a href="https://github.com/Universal-Commerce-Protocol/ucp/releases/tag/v2026-04-08" rel="noopener noreferrer"&gt;UCP v2026-04-08 release&lt;/a&gt;, published April 9, 2026. UCPChecker validation rules updated same day.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ecommerce</category>
      <category>webdev</category>
      <category>ucp</category>
      <category>data</category>
    </item>
  </channel>
</rss>
