<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Abe Wheeler</title>
    <description>The latest articles on Forem by Abe Wheeler (@abewheeler).</description>
    <link>https://forem.com/abewheeler</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3648565%2Fd0908335-f89f-46af-9aeb-c5ea99da2236.png</url>
      <title>Forem: Abe Wheeler</title>
      <link>https://forem.com/abewheeler</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/abewheeler"/>
    <language>en</language>
    <item>
      <title>How do you know budget models are smart enough for your MCP server?</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Tue, 14 Apr 2026 18:23:12 +0000</pubDate>
      <link>https://forem.com/abewheeler/how-do-you-know-budget-models-are-smart-enough-for-your-mcp-server-1di3</link>
      <guid>https://forem.com/abewheeler/how-do-you-know-budget-models-are-smart-enough-for-your-mcp-server-1di3</guid>
      <description>&lt;p&gt;We just shipped evals for sunpeak.ai&lt;/p&gt;

&lt;p&gt;The #1 thing I hear from MCP server teams: “Our tools worked great with the latest models, but we had to start from scratch when we realized the free models couldn't use them at all.”&lt;/p&gt;

&lt;p&gt;Budget models call tools differently: they misread ambiguous schemas, they pass wrong arguments, they can't chain tool calls, and you don’t find out until users complain.&lt;/p&gt;

&lt;p&gt;sunpeak evals test your MCP server across every model that matters, in one command. A typical result: 100% on GPT-4o, 40% on Gemini Flash. That 40% is a schema problem you’d never catch testing manually on ChatGPT. Fix the tool architecture and description, run it again, and watch it climb to 95%.&lt;/p&gt;

&lt;p&gt;Works with any MCP server. sunpeak connects over MCP, discovers your tools, and runs each eval case dozens of times per model so you get a real pass rate, not a single lucky result.&lt;/p&gt;
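
&lt;p&gt;A pass rate is simply the share of repeated runs that succeed. Here is a minimal sketch of that arithmetic in TypeScript; the &lt;code&gt;passRate&lt;/code&gt; helper and the sample outcomes are hypothetical and are not part of sunpeak's API.&lt;/p&gt;

```typescript
// Hypothetical helper, not part of sunpeak: turns the outcomes of repeated
// runs of one eval case on one model into a pass rate between 0 and 1.
function passRate(outcomes: boolean[]): number {
  if (outcomes.length === 0) {
    throw new Error('need at least one run to compute a pass rate');
  }
  const passed = outcomes.filter((ok) => ok).length;
  return passed / outcomes.length;
}

// Made-up data: 10 runs of the same eval case, 4 of them passing.
const sampleRuns = [true, false, false, true, false, true, false, false, true, false];
console.log(passRate(sampleRuns)); // 0.4
```

&lt;p&gt;With a single run, this same case would report either 0% or 100%, which is why repeating each case matters.&lt;/p&gt;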

&lt;p&gt;Put it in CI. Track reliability over time. Your MCP server isn’t production-ready until the cheapest model your users might connect it to can use it consistently.&lt;/p&gt;
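
&lt;p&gt;One way to apply this in CI is to gate on the weakest model rather than the average. The sketch below assumes you already have a pass rate per model; the &lt;code&gt;weakestModel&lt;/code&gt; helper, the model names, and the 0.9 threshold are illustrative assumptions, not sunpeak output or defaults.&lt;/p&gt;

```typescript
// Hypothetical shape: pass rate (0 to 1) keyed by model name.
type PassRates = { [model: string]: number };
type Candidate = { model: string; rate: number };

// Returns the model with the lowest pass rate, i.e. the one a CI gate should check.
function weakestModel(rates: PassRates): Candidate {
  let worst: Candidate | undefined;
  for (const [model, rate] of Object.entries(rates)) {
    if (worst === undefined || worst.rate > rate) {
      worst = { model, rate };
    }
  }
  if (worst === undefined) {
    throw new Error('no pass rates to check');
  }
  return worst;
}

// Made-up numbers for illustration only.
const exampleRates: PassRates = { 'model-a': 1.0, 'model-b': 0.95, 'model-c': 0.4 };
const assumedThreshold = 0.9; // assumed CI threshold, pick your own
const weakest = weakestModel(exampleRates);
const gatePassed = weakest.rate >= assumedThreshold;
console.log(weakest.model, gatePassed); // model-c false
```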

</description>
      <category>mcp</category>
      <category>chatgpt</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>🚀 MCP App Testing Framework!</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Wed, 08 Apr 2026 17:15:13 +0000</pubDate>
      <link>https://forem.com/abewheeler/mcp-app-testing-framework-4bao</link>
      <guid>https://forem.com/abewheeler/mcp-app-testing-framework-4bao</guid>
      <description>&lt;p&gt;We just shipped sunpeak.ai as a standalone testing framework for MCP Apps!&lt;/p&gt;

&lt;p&gt;If you're building MCP Apps for ChatGPT or Claude, you know the pain: deploy, open the host, start a conversation, trigger the tool, check the result. Repeat for both hosts, both themes, three display modes. That's 12 combinations per code change.&lt;/p&gt;

&lt;p&gt;sunpeak replicates the ChatGPT and Claude runtimes locally. You write Playwright tests that call tools, render resources, and assert against the output. One test file runs against both hosts automatically.&lt;/p&gt;

&lt;p&gt;What's included:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unit tests (Vitest + happy-dom)&lt;/li&gt;
&lt;li&gt;E2E tests against replicated ChatGPT and Claude runtimes&lt;/li&gt;
&lt;li&gt;Visual regression testing with screenshot baselines&lt;/li&gt;
&lt;li&gt;Live tests against real ChatGPT&lt;/li&gt;
&lt;li&gt;Works with any MCP server in any language (Python, Go, TypeScript)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Add it to an existing project with one command:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pnpm add -g sunpeak &amp;amp;&amp;amp; sunpeak test init&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;No paid host accounts. No AI credits. Runs in CI/CD.&lt;/p&gt;

&lt;p&gt;MIT licensed and open source! &lt;a href="https://sunpeak.ai/testing-framework/" rel="noopener noreferrer"&gt;https://sunpeak.ai/testing-framework/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>chatgpt</category>
      <category>webdev</category>
    </item>
    <item>
      <title>MCP Apps are hard to test</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Wed, 25 Mar 2026 16:39:12 +0000</pubDate>
      <link>https://forem.com/abewheeler/mcp-apps-are-hard-to-test-52d5</link>
      <guid>https://forem.com/abewheeler/mcp-apps-are-hard-to-test-52d5</guid>
      <description>&lt;p&gt;MCP Apps are hard to test.&lt;/p&gt;

&lt;p&gt;They run inside ChatGPT and Claude, so every code change means deploying to a real host, burning AI credits, and waiting through non-deterministic LLM responses. If you're building for both hosts, double everything.&lt;/p&gt;

&lt;p&gt;We built the &lt;a href="https://sunpeak.ai/mcp-app-inspector/" rel="noopener noreferrer"&gt;sunpeak Inspector&lt;/a&gt; to fix this.&lt;/p&gt;

&lt;p&gt;It replicates the ChatGPT and Claude MCP App runtimes on localhost. Your app renders exactly as it would inside the real hosts, with accurate display modes, themes, safe areas, and conversation chrome. One command to start:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sunpeak inspect --server URL&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;Works with any MCP server. Python, TypeScript, Go, whatever. No sunpeak project required.&lt;/p&gt;

&lt;p&gt;For development: Switch between ChatGPT and Claude from the sidebar. Toggle light/dark themes, mobile/tablet/desktop widths, and display modes. Edit tool input and output live. Changes appear instantly with HMR.&lt;/p&gt;

&lt;p&gt;For testing: The inspector doubles as the test runtime for Playwright E2E tests. Define tool states with simulation files (JSON fixtures), load them via URL, and assert against the rendered output. Test every host, theme, and display mode combination in CI/CD. No paid accounts, no API keys, no credits on your CI runners.&lt;/p&gt;
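
&lt;p&gt;To make "every combination" concrete, here is a minimal sketch that enumerates a host, theme, and display mode matrix. The display mode names are assumptions for illustration, and &lt;code&gt;buildMatrix&lt;/code&gt; is a hypothetical helper rather than sunpeak API; substitute whatever values your app supports.&lt;/p&gt;

```typescript
type Combo = { host: string; theme: string; displayMode: string };

// Builds the cartesian product so each combination can drive one test case.
function buildMatrix(hosts: string[], themes: string[], displayModes: string[]): Combo[] {
  const combos: Combo[] = [];
  for (const host of hosts) {
    for (const theme of themes) {
      for (const displayMode of displayModes) {
        combos.push({ host, theme, displayMode });
      }
    }
  }
  return combos;
}

// Illustrative values only; the display mode names are assumptions.
const matrix = buildMatrix(
  ['chatgpt', 'claude'],
  ['light', 'dark'],
  ['inline', 'fullscreen', 'pip'],
);
console.log(matrix.length); // 12
console.log(matrix[0]); // { host: 'chatgpt', theme: 'light', displayMode: 'inline' }
```

&lt;p&gt;Each entry can then be mapped to one fixture URL or one parameterized Playwright test, so adding a host or theme grows the suite without new test code.&lt;/p&gt;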

&lt;p&gt;For coding agents: Claude Code, Codex, and Cursor can run the inspector and execute Playwright tests programmatically, so they can iterate on MCP Apps without needing a human to manually test in a real host.&lt;br&gt;
sunpeak is MIT licensed and open source.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>chatgpt</category>
      <category>react</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Live Testing for Claude Connectors and ChatGPT Apps</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Wed, 18 Mar 2026 18:43:24 +0000</pubDate>
      <link>https://forem.com/abewheeler/live-testing-for-claude-connectors-and-chatgpt-apps-35lh</link>
      <guid>https://forem.com/abewheeler/live-testing-for-claude-connectors-and-chatgpt-apps-35lh</guid>
      <description>&lt;p&gt;The sunpeak simulator tests cover a lot. They replicate the ChatGPT and Claude runtimes, run display mode transitions, test themes, and validate tool invocations without any paid accounts or AI credits. For most development work, they're enough.&lt;/p&gt;

&lt;p&gt;But simulators don't catch everything. Real ChatGPT wraps your app in a nested iframe sandbox. The MCP protocol goes through ChatGPT's actual connection layer. Resource loading happens over a real network with production builds. There's a gap between "works in the simulator" and "works in ChatGPT," and the only way to close it is to test against the real thing.&lt;/p&gt;

&lt;p&gt;sunpeak 0.16.23 adds live testing: automated Playwright tests that run against real ChatGPT. You write the same kind of assertions you write for simulator tests, and sunpeak handles authentication, MCP server refresh, host-specific message formatting, and iframe traversal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Run &lt;code&gt;pnpm test:live&lt;/code&gt; with a tunnel active. sunpeak imports your browser session, starts the dev server, refreshes the MCP connection, and runs your &lt;code&gt;tests/live/*.spec.ts&lt;/code&gt; files in parallel against real ChatGPT. You write assertions against the app iframe. Everything else is automated.&lt;/p&gt;

&lt;h2&gt;What Live Tests Actually Do&lt;/h2&gt;

&lt;p&gt;A live test opens a real ChatGPT session in a browser, types a message that triggers your &lt;a href="https://sunpeak.ai/docs/mcp-apps/mcp/tools" rel="noopener noreferrer"&gt;MCP tool&lt;/a&gt;, waits for ChatGPT to call it, and then asserts against the rendered app inside the host's iframe.&lt;/p&gt;

&lt;p&gt;Here's a complete live test for an albums resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;albums tool renders photo grid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;show-albums&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Summer Slice&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;img&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Switch to dark mode without re-invoking the tool&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setColorScheme&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dark&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Summer Slice&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;live.invoke('show-albums')&lt;/code&gt; starts a new chat, sends &lt;code&gt;/{appName} show-albums&lt;/code&gt; to ChatGPT, waits for the LLM response to finish streaming, waits for the app iframe to render, and returns a Playwright &lt;code&gt;FrameLocator&lt;/code&gt; pointed at your app's content. From there, it's standard Playwright assertions.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;{ timeout: 15_000 }&lt;/code&gt; accounts for the LLM response time. ChatGPT needs to process your message, decide to call the tool, receive the result, and render the iframe. In practice this takes 5 to 10 seconds.&lt;/p&gt;

&lt;h2&gt;Prerequisites&lt;/h2&gt;

&lt;p&gt;You need three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A &lt;strong&gt;ChatGPT account&lt;/strong&gt; with MCP/Apps support (Plus or higher)&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;tunnel&lt;/strong&gt; tool like &lt;a href="https://ngrok.com/" rel="noopener noreferrer"&gt;ngrok&lt;/a&gt; or &lt;a href="https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/" rel="noopener noreferrer"&gt;Cloudflare Tunnel&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Your MCP server &lt;strong&gt;connected in ChatGPT&lt;/strong&gt; (Settings &amp;gt; Apps &amp;gt; Create, enter your tunnel URL with &lt;code&gt;/mcp&lt;/code&gt; path)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You do not need to install anything extra in your sunpeak project. Live test infrastructure ships with &lt;code&gt;sunpeak&lt;/code&gt; starting at v0.16.23. New projects scaffolded with &lt;code&gt;sunpeak new&lt;/code&gt; include example live test specs and the Playwright config.&lt;/p&gt;

&lt;h2&gt;Running Live Tests&lt;/h2&gt;

&lt;p&gt;Open two terminals:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Terminal 1: Start a tunnel&lt;/span&gt;
ngrok http 8000

&lt;span class="c"&gt;# Terminal 2: Run live tests&lt;/span&gt;
pnpm &lt;span class="nb"&gt;test&lt;/span&gt;:live
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On first run, sunpeak imports your ChatGPT session from your browser. It checks Chrome, Arc, Brave, and Edge automatically. If no valid session is found, it opens a browser window and waits for you to log in. The session is saved to &lt;code&gt;tests/live/.auth/chatgpt.json&lt;/code&gt; and reused for 24 hours.&lt;/p&gt;
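
&lt;p&gt;The 24-hour reuse rule amounts to an age check on the saved session. A minimal sketch of that logic, assuming the save time is available as a millisecond timestamp; &lt;code&gt;isSessionFresh&lt;/code&gt; is a hypothetical helper and not sunpeak's actual implementation.&lt;/p&gt;

```typescript
const SESSION_MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24 hours, matching the reuse window above

// Hypothetical helper: true while the saved session is younger than the limit.
// A save time in the future (clock skew) is treated as not fresh.
function isSessionFresh(savedAtMs: number, nowMs: number): boolean {
  const ageMs = nowMs - savedAtMs;
  return ageMs >= 0 ? SESSION_MAX_AGE_MS > ageMs : false;
}

const savedAt = Date.UTC(2026, 2, 18, 9, 0, 0); // example timestamp
const oneHourMs = 60 * 60 * 1000;
console.log(isSessionFresh(savedAt, savedAt + oneHourMs)); // true (1 hour old)
console.log(isSessionFresh(savedAt, savedAt + 25 * oneHourMs)); // false (25 hours old)
```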

&lt;p&gt;After authentication, sunpeak:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Starts &lt;code&gt;sunpeak dev --prod-resources&lt;/code&gt; (production resource builds)&lt;/li&gt;
&lt;li&gt;Navigates to ChatGPT Settings &amp;gt; Apps, finds your MCP server, and clicks Refresh&lt;/li&gt;
&lt;li&gt;Runs all &lt;code&gt;tests/live/*.spec.ts&lt;/code&gt; files fully in parallel, each in its own chat window&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The MCP refresh happens once in &lt;code&gt;globalSetup&lt;/code&gt;, before any test workers start. This means your test workers don't each individually refresh the connection, which would be slow and flaky.&lt;/p&gt;

&lt;h2&gt;The Fixture API&lt;/h2&gt;

&lt;p&gt;All live tests import from &lt;code&gt;sunpeak/test&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;test&lt;/code&gt; function provides a &lt;code&gt;live&lt;/code&gt; fixture with:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;invoke(prompt)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Starts a new chat, sends the prompt with host-specific formatting, waits for the app iframe, returns a &lt;code&gt;FrameLocator&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sendMessage(text)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Sends a message in the current chat with &lt;code&gt;/{appName}&lt;/code&gt; prefix&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sendRawMessage(text)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Sends a message without any prefix&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;startNewChat()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Opens a fresh conversation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;waitForAppIframe()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Waits for the MCP app iframe and returns a &lt;code&gt;FrameLocator&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;setColorScheme(scheme, appFrame?)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Switches to &lt;code&gt;'light'&lt;/code&gt; or &lt;code&gt;'dark'&lt;/code&gt; via &lt;code&gt;page.emulateMedia()&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;page&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Raw Playwright &lt;code&gt;Page&lt;/code&gt; object&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most tests only need &lt;code&gt;invoke&lt;/code&gt; and &lt;code&gt;setColorScheme&lt;/code&gt;. The &lt;code&gt;invoke&lt;/code&gt; method handles the full flow: new chat, message formatting (ChatGPT requires &lt;code&gt;/{appName}&lt;/code&gt; before your prompt), waiting for streaming to finish, waiting for the nested iframe to render, and returning a locator into your app's content.&lt;/p&gt;
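
&lt;p&gt;The prefixing rule is easy to picture as string formatting. A minimal sketch, where &lt;code&gt;formatChatGptPrompt&lt;/code&gt; and the app name &lt;code&gt;my-app&lt;/code&gt; are hypothetical; sunpeak applies this formatting for you inside &lt;code&gt;invoke&lt;/code&gt; and &lt;code&gt;sendMessage&lt;/code&gt;, so you would not normally write it yourself.&lt;/p&gt;

```typescript
// Hypothetical helper mirroring the "/{appName} prompt" format described above.
function formatChatGptPrompt(appName: string, prompt: string): string {
  if (appName.length === 0) {
    throw new Error('appName must not be empty');
  }
  return `/${appName} ${prompt.trim()}`;
}

console.log(formatChatGptPrompt('my-app', 'show-albums')); // /my-app show-albums
```

&lt;p&gt;This is also the difference between &lt;code&gt;sendMessage&lt;/code&gt; and &lt;code&gt;sendRawMessage&lt;/code&gt; in the table: the raw variant sends the text without the prefix.&lt;/p&gt;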

&lt;h2&gt;Theme Testing Without Re-Invocation&lt;/h2&gt;

&lt;p&gt;Sending a second message to trigger a new tool call is slow and burns credits. &lt;code&gt;setColorScheme&lt;/code&gt; avoids that by switching the browser's &lt;code&gt;prefers-color-scheme&lt;/code&gt; via Playwright's &lt;code&gt;page.emulateMedia()&lt;/code&gt;. ChatGPT propagates the change into the iframe, and your app re-renders with the new theme.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ticket card text stays readable in dark mode&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;show-ticket&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Search results not loading on mobile&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Verify status badge and assignee are visible in light mode&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;in progress&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Sarah Chen&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Switch to dark mode — common bugs: text blends into background,&lt;/span&gt;
  &lt;span class="c1"&gt;// borders disappear, badge colors lose contrast&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setColorScheme&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dark&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Same elements should still be visible with the new theme applied&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;in progress&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Sarah Chen&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Badge should still have a non-transparent background in dark mode&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;badge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;span:has-text("high")&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;badgeBg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getComputedStyle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;backgroundColor&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;badgeBg&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;not&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toBe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;rgba(0, 0, 0, 0)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Passing the app frame as the second argument makes &lt;code&gt;setColorScheme&lt;/code&gt; wait until the app's &lt;code&gt;&amp;lt;html&amp;gt;&lt;/code&gt; element carries the matching &lt;code&gt;data-theme&lt;/code&gt; attribute (here &lt;code&gt;data-theme="dark"&lt;/code&gt;). That confirms the theme propagated through the iframe boundary before your assertions run.&lt;/p&gt;

&lt;h2&gt;A Full Example&lt;/h2&gt;

&lt;p&gt;Here's a live test for a review card resource. It invokes the tool, checks the rendered content, verifies a button interaction triggers a state transition, and confirms the card re-themes correctly in dark mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;review card renders and handles approval flow&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;review-diff&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Verify the card rendered with the right content&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;h1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toHaveText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Refactor Authentication Module&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Action buttons present&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;applyButton&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Apply Changes&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;applyButton&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Theme switch: card should stay readable in dark mode&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setColorScheme&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dark&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;applyButton&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Click Apply Changes — UI transitions to accepted state&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;applyButton&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;applyButton&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;not&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text=Applying changes...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This catches real issues that simulator tests can miss: the iframe sandbox blocking a script load, a theme change not propagating through the nested iframe boundary, or a button click failing because of host-specific event handling.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Playwright Config
&lt;/h2&gt;

&lt;p&gt;The live test config is a single function call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// tests/live/playwright.config.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineLiveConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/test/config&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineLiveConfig&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This generates a full Playwright config with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;globalSetup&lt;/code&gt;&lt;/strong&gt; pointing to sunpeak's auth and MCP refresh flow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;headless: false&lt;/code&gt;&lt;/strong&gt; because chatgpt.com blocks headless browsers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anti-bot browser arguments&lt;/strong&gt; and a real Chrome user agent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2-minute timeout per test&lt;/strong&gt; (LLM responses can be slow)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1 retry&lt;/strong&gt; per test (LLM responses are non-deterministic)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fully parallel execution&lt;/strong&gt; (each test gets its own chat)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic dev server&lt;/strong&gt; with &lt;code&gt;--prod-resources&lt;/code&gt; on a dynamically allocated port&lt;/li&gt;
&lt;/ul&gt;
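
&lt;p&gt;For orientation, the defaults above correspond roughly to the hand-written Playwright config below. This is a sketch, not sunpeak's actual output: the &lt;code&gt;globalSetup&lt;/code&gt; path, the dev server command, and the fixed port are placeholder assumptions, and the anti-bot arguments and user agent are left out.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Illustrative only: not what defineLiveConfig() literally generates.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: '.',
  // Hypothetical path; sunpeak supplies its own auth and MCP refresh flow.
  globalSetup: './global-setup.ts',
  timeout: 120_000, // 2 minutes per test, since LLM responses can be slow
  retries: 1, // one retry, since LLM responses are non-deterministic
  fullyParallel: true, // each test gets its own chat
  use: {
    headless: false, // chatgpt.com blocks headless browsers
  },
  webServer: {
    // Assumed command; only the --prod-resources flag comes from the list above.
    command: 'sunpeak dev --prod-resources',
    port: 3000, // placeholder; sunpeak allocates the port dynamically
    reuseExistingServer: true,
  },
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;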

&lt;p&gt;You can pass options to customize the environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineLiveConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;colorScheme&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dark&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;viewport&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1440&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;locale&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fr-FR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;timezoneId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Europe/Paris&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;geolocation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;latitude&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;48.8566&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;longitude&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;2.3522&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;geolocation&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How It Relates to Simulator Tests
&lt;/h2&gt;

&lt;p&gt;Live tests don't replace &lt;a href="https://dev.to/blogs/complete-guide-testing-chatgpt-apps"&gt;simulator tests&lt;/a&gt;. They complement them.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Simulator (&lt;code&gt;pnpm test:e2e&lt;/code&gt;)&lt;/th&gt;
&lt;th&gt;Live (&lt;code&gt;pnpm test:live&lt;/code&gt;)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runs against&lt;/td&gt;
&lt;td&gt;Local simulator&lt;/td&gt;
&lt;td&gt;Real ChatGPT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Seconds&lt;/td&gt;
&lt;td&gt;10-30 seconds per test&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;Requires ChatGPT Plus&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI/CD&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Not recommended (needs auth)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Catches&lt;/td&gt;
&lt;td&gt;Component logic, display modes, themes, cross-host layout&lt;/td&gt;
&lt;td&gt;Real MCP connection, LLM tool invocation, iframe sandbox, production resource loading&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Use simulator tests for development and CI/CD. Use live tests before shipping, after major changes, or when debugging issues that only reproduce in the real host.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Testing Pyramid for Claude Connectors
&lt;/h2&gt;

&lt;p&gt;A &lt;a href="https://dev.to/blogs/what-are-claude-connectors"&gt;Claude Connector&lt;/a&gt; built with sunpeak now has three test tiers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Unit tests&lt;/strong&gt; (&lt;code&gt;pnpm test&lt;/code&gt;): Vitest, jsdom, fast, test component logic in isolation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simulator e2e tests&lt;/strong&gt; (&lt;code&gt;pnpm test:e2e&lt;/code&gt;): Playwright against the local &lt;a href="https://dev.to/blogs/claude-simulator-for-mcp-apps"&gt;ChatGPT and Claude simulator&lt;/a&gt;, test display modes and themes, runs in &lt;a href="https://dev.to/blogs/mcp-app-github-actions-cicd"&gt;CI/CD&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live tests&lt;/strong&gt; (&lt;code&gt;pnpm test:live&lt;/code&gt;): Playwright against real ChatGPT (with Claude coming soon), test real MCP protocol behavior and iframe rendering&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each tier catches different classes of bugs. Unit tests catch logic errors. Simulator tests catch rendering and layout issues across hosts and display modes. Live tests catch protocol and sandbox issues that only show up in the real host environment.&lt;/p&gt;

&lt;p&gt;All three are pre-configured when you run &lt;code&gt;sunpeak new&lt;/code&gt;. You don't need to set up Vitest, Playwright, or any test infrastructure yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Host-Agnostic Architecture
&lt;/h2&gt;

&lt;p&gt;The live test infrastructure is designed to support multiple hosts. The &lt;code&gt;live&lt;/code&gt; fixture resolves the correct host page object based on the Playwright project name. All host-specific DOM interaction (selectors, login flow, settings navigation, iframe nesting) lives in per-host page objects that sunpeak maintains.&lt;/p&gt;
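
&lt;p&gt;As a mental model, that lookup might resemble the sketch below. Every name in it is hypothetical (&lt;code&gt;HostPage&lt;/code&gt;, &lt;code&gt;hostPages&lt;/code&gt;, &lt;code&gt;resolveHostPage&lt;/code&gt;, and the selectors are invented for illustration) and it does not reflect sunpeak's internal code. In a real Playwright fixture, the project name would typically come from &lt;code&gt;testInfo.project.name&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical sketch: none of these names are sunpeak's real internals.
interface HostPage {
  label: string;
  // Where the host nests the app iframe (made-up selectors below).
  appFrameSelector: string;
}

// One page object per host, keyed by Playwright project name.
const hostPages: { [project: string]: HostPage } = {
  chatgpt: { label: 'ChatGPT', appFrameSelector: 'iframe[data-example="chatgpt-app"]' },
  claude: { label: 'Claude', appFrameSelector: 'iframe[data-example="claude-app"]' },
};

// Pick the page object that matches the running Playwright project.
function resolveHostPage(projectName: string): HostPage {
  const page = hostPages[projectName];
  if (page === undefined) {
    throw new Error('No host page object registered for project: ' + projectName);
  }
  return page;
}

console.log(resolveHostPage('chatgpt').label); // logs "ChatGPT"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;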

&lt;p&gt;Your test code is host-agnostic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;my resource renders&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;show me something&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;h1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This same test will run against any host that sunpeak supports. Today that's ChatGPT. When Claude live testing ships, add it with one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// tests/live/playwright.config.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineLiveConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;chatgpt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No changes to your test files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;If you have an existing sunpeak project, update to v0.16.23 or later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add sunpeak@latest &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sunpeak upgrade
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create &lt;code&gt;tests/live/playwright.config.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineLiveConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/test/config&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineLiveConfig&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the test script to &lt;code&gt;package.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scripts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"test:live"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"playwright test --config tests/live/playwright.config.ts"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Write your first live test in &lt;code&gt;tests/live/your-resource.spec.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;my tool renders correctly in ChatGPT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;your prompt here&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;your-selector&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start a tunnel, run &lt;code&gt;pnpm test:live&lt;/code&gt;, and watch Playwright drive a real ChatGPT session.&lt;/p&gt;

&lt;p&gt;New projects created with &lt;code&gt;sunpeak new&lt;/code&gt; include all of this out of the box, with example live tests for every starter resource.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>chatgpt</category>
      <category>webdev</category>
      <category>react</category>
    </item>
    <item>
      <title>Claude Simulator for MCP Apps: Test Claude Apps Locally with sunpeak</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Mon, 02 Mar 2026 15:15:24 +0000</pubDate>
      <link>https://forem.com/abewheeler/claude-simulator-for-mcp-apps-test-claude-apps-locally-with-sunpeak-2a1g</link>
      <guid>https://forem.com/abewheeler/claude-simulator-for-mcp-apps-test-claude-apps-locally-with-sunpeak-2a1g</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; sunpeak v0.15 adds a local Claude simulator. Run &lt;code&gt;sunpeak dev&lt;/code&gt;, pick Claude from the Host dropdown (or add the &lt;code&gt;?host=claude&lt;/code&gt; URL parameter), and test your MCP App in both Claude and ChatGPT from one dev server.&lt;/p&gt;

&lt;p&gt;Until now, sunpeak's local simulator only replicated the ChatGPT runtime. If you wanted to test how your &lt;a href="https://dev.to/mcp-app-framework"&gt;MCP App&lt;/a&gt; looked in Claude, you had to deploy it and connect it manually. That's fixed.&lt;/p&gt;

&lt;p&gt;sunpeak v0.15 ships with first-class Claude support. The old &lt;code&gt;ChatGPTSimulator&lt;/code&gt; is now just &lt;code&gt;Simulator&lt;/code&gt;, and both Claude and ChatGPT are registered as host shells out of the box. Switch between them with a dropdown, a URL param, or a prop.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Changed
&lt;/h2&gt;

&lt;p&gt;The simulator is now multi-host. Instead of a single ChatGPT-specific component, sunpeak uses a pluggable host shell system. Each host registers its own conversation chrome, color palette, and theme behavior. The &lt;code&gt;Simulator&lt;/code&gt; component renders whichever host you select.&lt;/p&gt;

&lt;p&gt;Two hosts ship by default:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ChatGPT&lt;/strong&gt; uses the familiar gray/white palette with the ChatGPT conversation layout.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude&lt;/strong&gt; uses a warm beige/cream palette matching claude.ai, with Claude's conversation chrome.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both implement the core MCP App protocol, but each host adds its own extras. ChatGPT supports host-specific features like file uploads and downloads on top of the standard. Claude doesn't have additional host APIs today, though sunpeak's Claude host shell does handle Claude's rendering quirks. If Claude adds host-specific capabilities in the future, they'll be built into this shell. Your resource component renders in both, wrapped in each host's chat UI, so you see exactly what your users will see.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Use It
&lt;/h2&gt;

&lt;p&gt;If you already have a sunpeak project, update to v0.15 and &lt;a href="https://github.com/Sunpeak-AI/sunpeak/releases/tag/v0.15.1" rel="noopener noreferrer"&gt;migrate your CSS&lt;/a&gt; classes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak upgrade
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run the dev server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;localhost:3000&lt;/code&gt;. You will see a Host dropdown in the simulator sidebar. Select &lt;strong&gt;Claude&lt;/strong&gt; to test your app in the Claude runtime. Select &lt;strong&gt;ChatGPT&lt;/strong&gt; to switch back.&lt;/p&gt;

&lt;p&gt;If you are starting fresh:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add &lt;span class="nt"&gt;-g&lt;/span&gt; sunpeak
sunpeak new
&lt;span class="nb"&gt;cd &lt;/span&gt;my-app &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sunpeak dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The scaffolded project uses the new &lt;code&gt;Simulator&lt;/code&gt; component by default. Both hosts are available from the first run.&lt;/p&gt;

&lt;h2&gt;
  
  
  Host Selection
&lt;/h2&gt;

&lt;p&gt;Three ways to pick a host:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sidebar dropdown.&lt;/strong&gt; The Host control appears in the sidebar when multiple hosts are registered. Click it to switch at runtime.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;URL parameter.&lt;/strong&gt; Add &lt;code&gt;?host=claude&lt;/code&gt; or &lt;code&gt;?host=chatgpt&lt;/code&gt; to the simulator URL. This is useful for bookmarking a specific host, linking teammates to a particular test configuration, or testing certain rendering states automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;defaultHost&lt;/code&gt; prop.&lt;/strong&gt; Set the initial host in code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Simulator&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/simulator&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Simulator&lt;/span&gt;
  &lt;span class="na"&gt;simulations&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;simulations&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
  &lt;span class="na"&gt;defaultHost&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"claude"&lt;/span&gt;
&lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default is &lt;code&gt;chatgpt&lt;/code&gt; if you don't specify one.&lt;/p&gt;
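
&lt;p&gt;The URL parameter is also easy to script. As a small sketch, assuming the dev server is running at the &lt;code&gt;localhost:3000&lt;/code&gt; address used above, this builds one simulator link per host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Assumes the dev server address shown earlier; adjust if yours differs.
const base = 'http://localhost:3000/';
const hosts = ['chatgpt', 'claude'];

for (const host of hosts) {
  const url = new URL(base);
  url.searchParams.set('host', host); // same ?host= parameter described above
  console.log(url.toString());
}
// logs:
// http://localhost:3000/?host=chatgpt
// http://localhost:3000/?host=claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;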

&lt;h2&gt;
  
  
  Migrating from ChatGPTSimulator
&lt;/h2&gt;

&lt;p&gt;If your project uses the old &lt;code&gt;ChatGPTSimulator&lt;/code&gt; from &lt;code&gt;sunpeak/chatgpt&lt;/code&gt;, it still works as an alias for the new simulator. Migration is not required yet, but the alias will be removed in the near future.&lt;/p&gt;

&lt;p&gt;The change is small. In your dev entry point, replace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Before&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ChatGPTSimulator&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/chatgpt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ChatGPTSimulator&lt;/span&gt; &lt;span class="na"&gt;simulations&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;simulations&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// After&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Simulator&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/simulator&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Simulator&lt;/span&gt; &lt;span class="na"&gt;simulations&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;simulations&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same simulations, same resource components, same test suite. The &lt;code&gt;Simulator&lt;/code&gt; just adds the host dropdown and Claude's rendering behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Your App Looks Like in Claude
&lt;/h2&gt;

&lt;p&gt;The Claude host shell wraps your resource component in Claude's conversation UI. The background uses Claude's warm beige and gray instead of ChatGPT's white and dark gray. User messages appear in Claude's bubble style. The toolbar and display mode controls (inline, fullscreen, picture-in-picture) work the same way.&lt;/p&gt;

&lt;p&gt;The core data flow is shared across hosts. &lt;a href="https://sunpeak.ai/docs/mcp-apps/react/use-app" rel="noopener noreferrer"&gt;&lt;code&gt;useToolData&lt;/code&gt;&lt;/a&gt; receives the tool output. &lt;code&gt;useAppState&lt;/code&gt; syncs state back to the host. &lt;code&gt;SafeArea&lt;/code&gt; handles safe rendering boundaries. These work the same in both Claude and ChatGPT.&lt;/p&gt;

&lt;p&gt;Where hosts differ is in extras. ChatGPT supports host-specific features like file uploads and downloads that go beyond the MCP App standard. Claude has its own rendering quirks that sunpeak's host shell accounts for. If Claude adds host-specific APIs later, sunpeak will surface them through the same shell system.&lt;/p&gt;

&lt;p&gt;The simulator lets you catch these differences locally instead of deploying to find out.&lt;/p&gt;

&lt;h2&gt;
  
  
  Extensible Host System
&lt;/h2&gt;

&lt;p&gt;The host shell registry is open. If a new major MCP App host appears, sunpeak can add support without changing the &lt;code&gt;Simulator&lt;/code&gt; component or your resource code. Each host registers itself with an &lt;code&gt;id&lt;/code&gt;, a &lt;code&gt;label&lt;/code&gt;, a conversation component, a &lt;a href="https://sunpeak.ai/docs/mcp-apps/styling/theme" rel="noopener noreferrer"&gt;theme function&lt;/a&gt;, and &lt;a href="https://sunpeak.ai/docs/mcp-apps/styling/css-variables" rel="noopener noreferrer"&gt;style variables&lt;/a&gt;. The simulator picks up all registered hosts automatically.&lt;/p&gt;

&lt;p&gt;For now, Claude and ChatGPT cover the two largest MCP App hosts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add &lt;span class="nt"&gt;-g&lt;/span&gt; sunpeak
sunpeak new
&lt;span class="nb"&gt;cd &lt;/span&gt;my-app &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sunpeak dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;localhost:3000&lt;/code&gt;, select Claude from the Host dropdown, and start building.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://dev.to/blogs/how-to-build-a-claude-app"&gt;How to Build a Claude App&lt;/a&gt; covers architecture and code patterns for Claude.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/blogs/chatgpt-app-tutorial"&gt;ChatGPT App Tutorial&lt;/a&gt; walks through building a resource from scratch (same steps work for Claude).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/blogs/build-mcp-app-for-chatgpt-and-claude"&gt;Building One MCP App for ChatGPT and Claude&lt;/a&gt; covers the cross-platform story.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/blogs/complete-guide-testing-chatgpt-apps"&gt;Testing guide&lt;/a&gt; covers Vitest and Playwright setup.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://sunpeak.ai/docs/quickstart" rel="noopener noreferrer"&gt;sunpeak documentation&lt;/a&gt; has the quickstart and full API reference.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Build once with &lt;a href="https://dev.to/"&gt;sunpeak&lt;/a&gt;, test locally in both Claude and ChatGPT, and ship to every &lt;a href="https://dev.to/mcp-app-framework"&gt;MCP App host&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>react</category>
      <category>webdev</category>
    </item>
    <item>
      <title>sunpeak is all-in on MCP Apps</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Wed, 11 Feb 2026 17:18:22 +0000</pubDate>
      <link>https://forem.com/abewheeler/sunpeak-is-all-in-on-mcp-apps-2lg8</link>
      <guid>https://forem.com/abewheeler/sunpeak-is-all-in-on-mcp-apps-2lg8</guid>
      <description>&lt;p&gt;MCP Apps now run in ChatGPT, Claude, Goose, and VS Code. That happened fast. Claude &lt;a href="https://blog.modelcontextprotocol.io/posts/2026-01-26-mcp-apps/" rel="noopener noreferrer"&gt;announced MCP App support&lt;/a&gt; on January 26, and ChatGPT &lt;a href="https://developers.openai.com/apps-sdk/mcp-apps-in-chatgpt" rel="noopener noreferrer"&gt;followed on February 4&lt;/a&gt;. Two weeks, two major hosts, one standard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; sunpeak's APIs are built around the MCP App standard. ChatGPT- and Claude-specific features are layered on top as optional imports. Write your app once and run it everywhere, even on localhost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why MCP-App-First
&lt;/h2&gt;

&lt;p&gt;When ChatGPT Apps launched in October 2025, they had their own proprietary SDK. Building for ChatGPT meant building &lt;em&gt;only&lt;/em&gt; for ChatGPT.&lt;/p&gt;

&lt;p&gt;That changed when OpenAI contributed to and aligned on &lt;a href="https://github.com/modelcontextprotocol/ext-apps" rel="noopener noreferrer"&gt;MCP Apps&lt;/a&gt; as the open standard. The rendering model, the iframe sandbox, the UI functionality — all of it became portable. And as of February 2026, the major hosts actually implemented it.&lt;/p&gt;

&lt;p&gt;sunpeak followed the same trajectory. We started as a ChatGPT App framework because ChatGPT was the only major host supporting embedded UIs. Now we're an MCP App framework, because the standard is real and the host list is growing.&lt;/p&gt;

&lt;p&gt;What that means in practice: sunpeak's core APIs target the MCP App interface, not any single host. At the same time, sunpeak layers in the major host-specific functionality developers need to support platforms that differ from one another, much as React Native pairs a shared core with platform-specific code for iOS and Android.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;sunpeak separates standard MCP App APIs from host-specific ones at the import level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Core APIs&lt;/strong&gt; come from the top-level &lt;code&gt;sunpeak&lt;/code&gt; import. These work everywhere:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useToolData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useHostContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useDisplayMode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;AppProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ResourceConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ResourceConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dashboard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Show analytics dashboard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;DashboardResource&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useToolData&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useHostContext&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;displayMode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useDisplayMode&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* Your UI — runs in ChatGPT, Claude, Goose, VS Code */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Host-specific APIs&lt;/strong&gt; come from subpath imports. Right now that's &lt;code&gt;sunpeak/chatgpt&lt;/code&gt; for ChatGPT-specific tooling:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ChatGPTSimulator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;buildDevSimulations&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/chatgpt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ChatGPT simulator, dev simulation builder, and any ChatGPT-only runtime features live here. They're first-class — not afterthoughts or community plugins — but they don't pollute your app code. Your resource components stay portable.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Changed, What Didn't
&lt;/h2&gt;

&lt;p&gt;If you're already using sunpeak, you'll notice that &lt;code&gt;v0.13&lt;/code&gt; changes many APIs to follow MCP App abstractions and nomenclature. Fortunately, a lot of Apps SDK knowledge carries over to the MCP App interface. Refer to the &lt;a href="https://github.com/Sunpeak-AI/sunpeak/releases/tag/v0.13.1" rel="noopener noreferrer"&gt;release notes&lt;/a&gt; for specific migration instructions, and to the &lt;a href="https://docs.sunpeak.ai/" rel="noopener noreferrer"&gt;sunpeak docs&lt;/a&gt; for a complete overview of sunpeak and MCP Apps.&lt;/p&gt;

&lt;p&gt;With these changes, your app renders in Claude (and others) today. The &lt;code&gt;sunpeak dev&lt;/code&gt; simulator at &lt;code&gt;localhost:6767&lt;/code&gt; replicates the MCP App runtime that all hosts implement, so what works locally works in production across hosts.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Host Landscape
&lt;/h2&gt;

&lt;p&gt;Here's where MCP Apps run today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ChatGPT&lt;/strong&gt; — OpenAI contributed elements of the original ChatGPT Apps protocol to MCP and now supports the open standard alongside their existing SDK.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude&lt;/strong&gt; — Anthropic's web and desktop clients render MCP Apps natively.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Goose&lt;/strong&gt; — Block's open-source AI agent supports MCP Apps.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VS Code Insiders&lt;/strong&gt; — Microsoft's editor renders MCP Apps in the chat sidebar.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More hosts will follow. The MCP App standard is under the Linux Foundation now, and the spec is actively developed at &lt;a href="https://github.com/modelcontextprotocol/ext-apps" rel="noopener noreferrer"&gt;modelcontextprotocol/ext-apps&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Platform-Specific Features Are First-Class
&lt;/h2&gt;

&lt;p&gt;MCP-App-first doesn't mean lowest-common-denominator. ChatGPT has features that Claude doesn't, and vice versa. sunpeak treats these as first-class extensions, not hacks.&lt;/p&gt;

&lt;p&gt;For ChatGPT, that means full access to OpenAI's &lt;a href="https://github.com/openai/apps-sdk-ui" rel="noopener noreferrer"&gt;apps-sdk-ui&lt;/a&gt; component library, the ChatGPT simulator for local development, and any ChatGPT-specific runtime APIs. These are maintained alongside the core framework, tested in CI, and documented.&lt;/p&gt;

&lt;p&gt;As Claude and other hosts ship their own platform-specific features, sunpeak will add corresponding subpath imports. The pattern scales: core stays portable, extensions stay organized.&lt;/p&gt;
&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;sunpeak is open source and free.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add &lt;span class="nt"&gt;-g&lt;/span&gt; sunpeak &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sunpeak new
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your app works across ChatGPT, Claude, and every other MCP App host from the first line of code.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.sunpeak.ai" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;: guides, API reference, and tutorials&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Sunpeak-AI/sunpeak" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;: source code and issue tracker&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://sunpeak.ai/chatgpt-app-framework" rel="noopener noreferrer"&gt;MCP App Framework&lt;/a&gt;: overview of sunpeak's capabilities&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>mcp</category>
      <category>chatgpt</category>
      <category>webdev</category>
      <category>react</category>
    </item>
    <item>
      <title>The Complete Guide to Testing ChatGPT Apps</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Tue, 03 Feb 2026 19:46:39 +0000</pubDate>
      <link>https://forem.com/abewheeler/the-complete-guide-to-testing-chatgpt-apps-2189</link>
      <guid>https://forem.com/abewheeler/the-complete-guide-to-testing-chatgpt-apps-2189</guid>
      <description>&lt;p&gt;Testing ChatGPT Apps presents unique challenges. Your UI runs inside ChatGPT's runtime, responds to tool invocations, and adapts to multiple display modes and themes. Without proper testing infrastructure, you're deploying blind.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Use sunpeak's built-in testing with Vitest for unit tests (&lt;code&gt;pnpm test&lt;/code&gt;) and Playwright for e2e tests (&lt;code&gt;pnpm test:e2e&lt;/code&gt;). Define states in simulation files, test across display modes with &lt;code&gt;createSimulatorUrl&lt;/code&gt;, and run everything in CI.&lt;/p&gt;

&lt;p&gt;This guide covers everything you need to test ChatGPT Apps and MCP Apps with confidence.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Testing ChatGPT Apps is Different
&lt;/h2&gt;

&lt;p&gt;ChatGPT Apps run in a specialized runtime environment. Your React components don't just render in a browser—they render inside ChatGPT's Apps SDK runtime with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ChatGPT frontend state&lt;/strong&gt; - Inline, picture-in-picture, and fullscreen display modes; light or dark theme; and so on&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool invocations&lt;/strong&gt; - ChatGPT calls your app's tools with specific inputs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend state&lt;/strong&gt; - Various possible states for users and sessions in your database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Widget state&lt;/strong&gt; - Persistent state that survives across invocations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Testing each combination manually isn't feasible; the combinatorics are brutal.&lt;br&gt;
You need automated testing that covers all these scenarios.&lt;/p&gt;
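&lt;p&gt;To get a feel for the numbers, here is a rough sketch that counts only the frontend combinations, using the display modes, themes, and device types the simulator exposes. The loop and the label format are illustrative, not part of sunpeak. Tool inputs, backend state, and widget state multiply the total further.&lt;/p&gt;

```typescript
// Illustrative only: enumerate frontend combinations a widget can render in.
const displayModes = ['inline', 'pip', 'fullscreen'] as const;
const themes = ['light', 'dark'] as const;
const deviceTypes = ['mobile', 'tablet', 'desktop', 'unknown'] as const;

const combinations: string[] = [];
for (const displayMode of displayModes) {
  for (const theme of themes) {
    for (const deviceType of deviceTypes) {
      // One label per combination, e.g. "inline/light/mobile".
      combinations.push(`${displayMode}/${theme}/${deviceType}`);
    }
  }
}

console.log(combinations.length); // prints 24 (3 x 2 x 4)
```

&lt;p&gt;That is 24 renders to check by hand for a single tool result, before any other state is varied.&lt;/p&gt;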
&lt;h2&gt;
  
  
  Setting Up Your Testing Environment
&lt;/h2&gt;

&lt;p&gt;If you're using the &lt;a href="https://sunpeak.ai" rel="noopener noreferrer"&gt;sunpeak framework&lt;/a&gt;, testing is pre-configured. Start with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add &lt;span class="nt"&gt;-g&lt;/span&gt; sunpeak &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sunpeak new
&lt;span class="nb"&gt;cd &lt;/span&gt;my-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your project includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vitest&lt;/strong&gt; configured with jsdom, React Testing Library, and jest-dom matchers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Playwright&lt;/strong&gt; configured to test against the ChatGPT simulator&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simulation files&lt;/strong&gt; in &lt;code&gt;tests/simulations/&lt;/code&gt; for deterministic states&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Unit Testing with Vitest
&lt;/h2&gt;

&lt;p&gt;Unit tests validate individual components in isolation. Run them with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm &lt;span class="nb"&gt;test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create tests alongside your components in &lt;code&gt;src/resources&lt;/code&gt; with the &lt;code&gt;.test.tsx&lt;/code&gt; extension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;render&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;screen&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@testing-library/react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
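&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;userEvent&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@testing-library/user-event&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Counter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;../src/resources/counter-resource&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;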

&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Counter&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;renders the initial count&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Counter&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;screen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeInTheDocument&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;increments when button is clicked&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Counter&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;userEvent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;screen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sr"&gt;/increment/i&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;screen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeInTheDocument&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unit tests run fast and catch component-level bugs early. They're ideal for testing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Component rendering logic&lt;/li&gt;
&lt;li&gt;User interactions within a component&lt;/li&gt;
&lt;li&gt;Props and state handling&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  End-to-End Testing with Playwright
&lt;/h2&gt;

&lt;p&gt;E2E tests validate your app running in the ChatGPT simulator. Run them with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm &lt;span class="nb"&gt;test&lt;/span&gt;:e2e
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create tests in &lt;code&gt;tests/e2e/&lt;/code&gt; with the &lt;code&gt;.spec.ts&lt;/code&gt; extension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@playwright/test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createSimulatorUrl&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sunpeak/chatgpt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;counter increments in fullscreen mode&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;createSimulatorUrl&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;simulation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;counter-show&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;displayMode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fullscreen&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;theme&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dark&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}));&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sr"&gt;/increment/i&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://docs.sunpeak.ai/api-reference/simulations/chatgpt-simulator#createsimulatorurl" rel="noopener noreferrer"&gt;&lt;code&gt;createSimulatorUrl&lt;/code&gt;&lt;/a&gt; utility generates URLs with your test configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;simulation&lt;/code&gt;&lt;/strong&gt; - Your simulation file name (sets initial state)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;displayMode&lt;/code&gt;&lt;/strong&gt; - &lt;code&gt;inline&lt;/code&gt;, &lt;code&gt;pip&lt;/code&gt;, or &lt;code&gt;fullscreen&lt;/code&gt; (tests display adaptation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;theme&lt;/code&gt;&lt;/strong&gt; - &lt;code&gt;light&lt;/code&gt; or &lt;code&gt;dark&lt;/code&gt; (tests theme handling)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;deviceType&lt;/code&gt;&lt;/strong&gt; - &lt;code&gt;mobile&lt;/code&gt;, &lt;code&gt;tablet&lt;/code&gt;, &lt;code&gt;desktop&lt;/code&gt;, or &lt;code&gt;unknown&lt;/code&gt; (tests responsive behavior)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;touch&lt;/code&gt;&lt;/strong&gt; / &lt;strong&gt;&lt;code&gt;hover&lt;/code&gt;&lt;/strong&gt; - Enable or disable touch/hover capabilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;safeAreaTop&lt;/code&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;code&gt;safeAreaBottom&lt;/code&gt;&lt;/strong&gt;, etc. - Simulate device notches and insets&lt;/li&gt;
&lt;/ul&gt;
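&lt;p&gt;To make the options concrete, here is a minimal sketch of how a helper like this could turn them into a query string. It is not sunpeak's implementation: &lt;code&gt;buildSimulatorUrl&lt;/code&gt;, the &lt;code&gt;SimulatorOptions&lt;/code&gt; type, and the query parameter names are assumptions for illustration, and the base URL assumes the dev simulator is listening on &lt;code&gt;localhost:6767&lt;/code&gt;.&lt;/p&gt;

```typescript
// Hypothetical sketch; NOT the real createSimulatorUrl from sunpeak/chatgpt.
type DisplayMode = 'inline' | 'pip' | 'fullscreen';
type Theme = 'light' | 'dark';
type DeviceType = 'mobile' | 'tablet' | 'desktop' | 'unknown';

interface SimulatorOptions {
  simulation: string;
  displayMode?: DisplayMode;
  theme?: Theme;
  deviceType?: DeviceType;
  touch?: boolean;
  hover?: boolean;
}

// Assumed base URL for the local dev simulator.
function buildSimulatorUrl(options: SimulatorOptions, base = 'http://localhost:6767'): string {
  const params = new URLSearchParams();
  Object.entries(options).forEach(([key, value]) => {
    // Skip options the caller left out so the simulator can apply its defaults.
    if (value !== undefined) params.set(key, String(value));
  });
  return `${base}/?${params.toString()}`;
}

const url = buildSimulatorUrl({
  simulation: 'counter-show',
  displayMode: 'fullscreen',
  theme: 'dark',
});
console.log(url);
```

&lt;p&gt;Encoding the whole configuration in the URL keeps each test a single &lt;code&gt;page.goto&lt;/code&gt; call, with no extra setup steps between navigation and assertions.&lt;/p&gt;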

&lt;h2&gt;
  
  
  Creating Simulation Files
&lt;/h2&gt;

&lt;p&gt;Simulation files define deterministic states for testing. Create them in &lt;code&gt;tests/simulations/{resource-name}/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"userMessage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Show me a counter starting at 5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"show_counter"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Displays an interactive counter"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"inputSchema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"initialCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"number"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"callToolRequestParams"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"initialCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"callToolResult"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Counter displayed"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"structuredContent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This simulation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shows &lt;code&gt;userMessage&lt;/code&gt; in the simulator chat interface&lt;/li&gt;
&lt;li&gt;Defines the &lt;code&gt;tool&lt;/code&gt; with its name and input schema&lt;/li&gt;
&lt;li&gt;Sets &lt;code&gt;callToolRequestParams&lt;/code&gt; with mock input accessible via &lt;code&gt;useToolInput()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Provides &lt;code&gt;callToolResult&lt;/code&gt; with mock data passed to your component via &lt;code&gt;useWidgetProps()&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
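&lt;p&gt;If you keep many simulation files, a small sanity check can catch mismatches between the mock arguments and the tool's input schema. The sketch below is hypothetical: the &lt;code&gt;Simulation&lt;/code&gt; type simply mirrors the JSON above, and &lt;code&gt;undeclaredArguments&lt;/code&gt; is not a sunpeak API.&lt;/p&gt;

```typescript
// Hypothetical type mirroring the simulation JSON above; not exported by sunpeak.
interface Simulation {
  userMessage: string;
  tool: {
    name: string;
    description: string;
    inputSchema: { type: string; properties: { [key: string]: unknown } };
  };
  callToolRequestParams: { arguments: { [key: string]: unknown } };
  callToolResult: {
    content: { type: string; text: string }[];
    structuredContent: { [key: string]: unknown };
  };
}

// Returns the mock argument names that the tool's input schema does not declare.
function undeclaredArguments(sim: Simulation): string[] {
  const declared = Object.keys(sim.tool.inputSchema.properties);
  return Object.keys(sim.callToolRequestParams.arguments).filter(
    (name) => !declared.includes(name),
  );
}

const counterShow: Simulation = {
  userMessage: 'Show me a counter starting at 5',
  tool: {
    name: 'show_counter',
    description: 'Displays an interactive counter',
    inputSchema: { type: 'object', properties: { initialCount: { type: 'number' } } },
  },
  callToolRequestParams: { arguments: { initialCount: 5 } },
  callToolResult: {
    content: [{ type: 'text', text: 'Counter displayed' }],
    structuredContent: { count: 5 },
  },
};

console.log(undeclaredArguments(counterShow)); // prints []
```

&lt;p&gt;A check like this could run as an ordinary unit test over every file in &lt;code&gt;tests/simulations/&lt;/code&gt;.&lt;/p&gt;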

&lt;p&gt;Use simulations to test specific states without manual setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Test the counter with structuredContent.count = 5&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;createSimulatorUrl&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;simulation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;counter-show&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;5&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Test a different initial state&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;createSimulatorUrl&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;simulation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;counter-initial&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Testing Across Display Modes
&lt;/h2&gt;

&lt;p&gt;ChatGPT Apps appear in three display modes. Test all of them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;displayModes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;inline&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pip&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fullscreen&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;displayMode&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;displayModes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`renders correctly in &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;displayMode&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; mode`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;createSimulatorUrl&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;simulation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;counter-show&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;displayMode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}));&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each mode has different constraints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inline&lt;/strong&gt; - Limited height, embedded in chat&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Picture-in-picture&lt;/strong&gt; - Floating window, can be repositioned&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fullscreen&lt;/strong&gt; - Maximum space, modal overlay&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your app should adapt gracefully to each.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Theme Adaptation
&lt;/h2&gt;

&lt;p&gt;Test both light and dark themes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;adapts to dark theme&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;createSimulatorUrl&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;simulation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;counter-show&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;theme&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dark&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}));&lt;/span&gt;

  &lt;span class="c1"&gt;// Verify dark theme styles are applied&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;button&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;button&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toHaveCSS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;background-color&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;rgb(255, 184, 0)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
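
&lt;p&gt;Display modes and themes multiply, so it can help to generate the combinations rather than hand-write each test. The sketch below is a plain TypeScript helper, not part of sunpeak: &lt;code&gt;buildSimulatorCases&lt;/code&gt; and the &lt;code&gt;'light'&lt;/code&gt; theme value are assumptions for illustration (the example above only shows &lt;code&gt;'dark'&lt;/code&gt;).&lt;/p&gt;

```typescript
// Hypothetical helper: expands display modes and themes into one list of cases.
// SimulatorCase and buildSimulatorCases are illustrative names only.
const displayModes = ['inline', 'pip', 'fullscreen'] as const;
const themes = ['light', 'dark'] as const; // 'light' is an assumed value

type DisplayMode = (typeof displayModes)[number];
type Theme = (typeof themes)[number];

interface SimulatorCase {
  name: string;
  displayMode: DisplayMode;
  theme: Theme;
}

function buildSimulatorCases(simulation: string): SimulatorCase[] {
  const cases: SimulatorCase[] = [];
  for (const displayMode of displayModes) {
    for (const theme of themes) {
      cases.push({
        name: `${simulation} renders in ${displayMode} mode with ${theme} theme`,
        displayMode,
        theme,
      });
    }
  }
  return cases;
}

const simulatorCases = buildSimulatorCases('counter-show');
console.log(simulatorCases.length); // 3 display modes x 2 themes = 6
```

&lt;p&gt;Each case could then feed a &lt;code&gt;test()&lt;/code&gt; call that passes &lt;code&gt;displayMode&lt;/code&gt; and &lt;code&gt;theme&lt;/code&gt; to &lt;code&gt;createSimulatorUrl&lt;/code&gt;, the same way the display-mode loop does.&lt;/p&gt;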



&lt;h2&gt;
  
  
  Running Tests in CI/CD
&lt;/h2&gt;

&lt;p&gt;Add testing to your GitHub Actions workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Test&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm/action-setup@v2&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;20'&lt;/span&gt;
          &lt;span class="na"&gt;cache&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pnpm'&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm install&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm test&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm exec playwright install --with-deps&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pnpm test:e2e&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Playwright tests automatically:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start the sunpeak dev server&lt;/li&gt;
&lt;li&gt;Wait for it to be ready&lt;/li&gt;
&lt;li&gt;Run tests against the ChatGPT simulator&lt;/li&gt;
&lt;li&gt;Shut down when complete&lt;/li&gt;
&lt;/ol&gt;
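
&lt;p&gt;A Playwright config along these lines would produce that lifecycle. Treat it as a sketch rather than sunpeak's shipped config: the &lt;code&gt;sunpeak dev&lt;/code&gt; command and port 6767 are assumed to match sunpeak's default dev server, and &lt;code&gt;./tests/e2e&lt;/code&gt; is a placeholder path.&lt;/p&gt;

```typescript
// playwright.config.ts (sketch; a scaffolded project may already ship its own)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests/e2e', // placeholder path
  use: {
    baseURL: 'http://localhost:6767', // assumed simulator address
    screenshot: 'only-on-failure', // save a screenshot for each failing test
    trace: 'on-first-retry', // record a trace when a test is retried
  },
  webServer: {
    command: 'sunpeak dev', // started before the first test, stopped after the last
    url: 'http://localhost:6767', // Playwright polls this URL until it responds
    reuseExistingServer: !process.env.CI, // reuse a running dev server locally
    timeout: 60_000, // fail if the server is not ready within 60 seconds
  },
});
```

&lt;p&gt;The &lt;code&gt;webServer&lt;/code&gt; block is what lets CI run e2e tests with a single command: there is no separate step to boot or tear down the server.&lt;/p&gt;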

&lt;h2&gt;
  
  
  Debugging Failing Tests
&lt;/h2&gt;

&lt;p&gt;When tests fail, use these debugging techniques:&lt;/p&gt;

&lt;h3&gt;
  
  
  Playwright Debug Mode
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm &lt;span class="nb"&gt;test&lt;/span&gt;:e2e &lt;span class="nt"&gt;--ui&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--ui&lt;/code&gt; flag opens Playwright's UI mode, a visual debugger where you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Step through tests&lt;/li&gt;
&lt;li&gt;Inspect the DOM at each step&lt;/li&gt;
&lt;li&gt;See screenshots and traces&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Vitest Verbose Output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="nt"&gt;--reporter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;verbose
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The verbose reporter shows detailed output, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Individual assertion results&lt;/li&gt;
&lt;li&gt;Component render output&lt;/li&gt;
&lt;li&gt;Error stack traces&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Screenshot on Failure
&lt;/h3&gt;

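&lt;p&gt;When the Playwright config sets &lt;code&gt;screenshot&lt;/code&gt; to &lt;code&gt;'only-on-failure'&lt;/code&gt;, Playwright captures a screenshot for each failing test. Find them in &lt;code&gt;test-results/&lt;/code&gt;.&lt;/p&gt;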

&lt;h2&gt;
  
  
  Testing Best Practices
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;One assertion per test.&lt;/strong&gt; Keep tests focused and easy to debug:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Good: focused test&lt;/span&gt;
&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;increment button is visible&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;createSimulatorUrl&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;simulation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;counter-show&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sr"&gt;/increment/i&lt;/span&gt; &lt;span class="p"&gt;})).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Avoid: multiple unrelated assertions&lt;/span&gt;
&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;counter works&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Too many things being tested at once&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Test behavior, not implementation.&lt;/strong&gt; Focus on what users see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Good: tests user-visible behavior&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;5&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Avoid: tests implementation details&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;component&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Use descriptive test names.&lt;/strong&gt; Make failures self-explanatory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Good: clear failure message&lt;/span&gt;
&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;displays error message when API call fails&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;

&lt;span class="c1"&gt;// Avoid: vague description&lt;/span&gt;
&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;handles error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Clean up between tests.&lt;/strong&gt; Reset state to avoid test pollution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;afterEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Reset any global state&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
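
&lt;p&gt;What counts as "global state" depends on your app. As a minimal sketch, assuming a hypothetical module-level counter store (none of these names come from sunpeak), a reset helper might look like this:&lt;/p&gt;

```typescript
// Hypothetical module-level state that tests could leak into each other.
interface CounterState {
  count: number;
}

const initialState: CounterState = { count: 0 };
let counterState: CounterState = { ...initialState };

function increment(): void {
  counterState = { count: counterState.count + 1 };
}

function currentCount(): number {
  return counterState.count;
}

// Call this from afterEach so every test starts from the same baseline.
function resetCounterState(): void {
  counterState = { ...initialState };
}

increment();
increment();
resetCounterState();
console.log(currentCount()); // prints 0
```

&lt;p&gt;In a Vitest suite you would call &lt;code&gt;resetCounterState&lt;/code&gt; inside &lt;code&gt;afterEach&lt;/code&gt;, so a test that increments the counter cannot change the starting value seen by the next one.&lt;/p&gt;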



&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;Testing is essential for shipping reliable ChatGPT Apps and MCP Apps. With sunpeak's testing infrastructure, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run unit tests with Vitest for fast feedback&lt;/li&gt;
&lt;li&gt;Run e2e tests with Playwright for full integration coverage&lt;/li&gt;
&lt;li&gt;Test across display modes, themes, and device types&lt;/li&gt;
&lt;li&gt;Integrate testing into your CI/CD pipeline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Get started with sunpeak:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add &lt;span class="nt"&gt;-g&lt;/span&gt; sunpeak &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sunpeak new
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Read the &lt;a href="https://docs.sunpeak.ai/guides/testing" rel="noopener noreferrer"&gt;testing documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Try the &lt;a href="https://sunpeak.ai/simulator" rel="noopener noreferrer"&gt;interactive simulator&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Sunpeak-AI/sunpeak" rel="noopener noreferrer"&gt;Star sunpeak on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>mcp</category>
      <category>chatgpt</category>
      <category>webdev</category>
      <category>react</category>
    </item>
    <item>
      <title>Why You Need a ChatGPT App Framework</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Thu, 29 Jan 2026 00:41:27 +0000</pubDate>
      <link>https://forem.com/abewheeler/why-you-need-a-chatgpt-app-framework-1bmc</link>
      <guid>https://forem.com/abewheeler/why-you-need-a-chatgpt-app-framework-1bmc</guid>
      <description>&lt;p&gt;ChatGPT Apps are a new UI paradigm: your code renders directly inside the ChatGPT conversation. But building them from scratch means solving the same infrastructure problems every time you start a project. A ChatGPT App framework changes that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; &lt;a href="https://sunpeak.ai" rel="noopener noreferrer"&gt;Sunpeak&lt;/a&gt; is the first ChatGPT App framework. It gives ChatGPT App developers the same developer experience that Next.js gives web developers: a simulator, components, CLI scaffolding, testing, and deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is a ChatGPT App Framework?
&lt;/h2&gt;

&lt;p&gt;A ChatGPT App framework provides the development infrastructure for building applications that use OpenAI's &lt;a href="https://developers.openai.com/apps-sdk" rel="noopener noreferrer"&gt;Apps SDK&lt;/a&gt; runtime. The Apps SDK defines &lt;em&gt;how&lt;/em&gt; ChatGPT Apps work: the protocol, the rendering model, the communication between your MCP server and ChatGPT. A framework builds on top of that to give you the tooling you actually need to develop, test, and ship apps.&lt;/p&gt;

&lt;p&gt;Think of it like the relationship between React and Next.js. React is the rendering library. Next.js gives you routing, server-side rendering, a dev server, and deployment. You &lt;em&gt;can&lt;/em&gt; build a React app without Next.js, but most teams don't, because the framework handles the infrastructure so you can focus on your product.&lt;/p&gt;

&lt;p&gt;A ChatGPT App framework does the same thing for the Apps SDK.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pain of Building Without a Framework
&lt;/h2&gt;

&lt;p&gt;If you've tried building a ChatGPT App from the official resources alone, you've hit these problems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No local testing.&lt;/strong&gt; The only way to see your app render is to connect it to the real ChatGPT, which requires a paid ChatGPT Plus or Team subscription with developer mode. Every change means tunneling your local server, refreshing ChatGPT, and waiting for the round trip.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No component library.&lt;/strong&gt; OpenAI provides &lt;a href="https://github.com/openai/apps-sdk-ui" rel="noopener noreferrer"&gt;apps-sdk-ui&lt;/a&gt;, a low-level React component library. But it gives you primitives, not production-ready components. You're rebuilding common patterns from scratch every time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No project structure.&lt;/strong&gt; Every project starts from zero. There's no standard way to organize your resources, tools, and configuration. You're making structural decisions before you've written a line of product code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No testing story.&lt;/strong&gt; You can't run automated tests against the ChatGPT interface. There's no way to verify your app renders correctly in CI. Manual testing through the ChatGPT UI is the only option.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No deployment pipeline.&lt;/strong&gt; Getting your app from local development to production means manually configuring an MCP server, setting up hosting, and wiring everything together.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a ChatGPT App Framework Gives You
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://sunpeak.ai" rel="noopener noreferrer"&gt;Sunpeak&lt;/a&gt; maps each of those pain points to a concrete solution:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Local runtime simulator.&lt;/strong&gt; Run &lt;code&gt;sunpeak dev&lt;/code&gt; and open &lt;code&gt;localhost:6767&lt;/code&gt;. You get a full ChatGPT simulator that renders your app exactly as ChatGPT would, with no paid account, no tunneling, and no round trips.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak dev
&lt;span class="c"&gt;# Simulator running at http://localhost:6767&lt;/span&gt;
&lt;span class="c"&gt;# MCP server running at http://localhost:6766&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Apps SDK UI components.&lt;/strong&gt; Sunpeak includes production-ready components built on top of OpenAI's apps-sdk-ui: cards, carousels, forms, and layouts, so you're not starting from scratch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CLI scaffolding.&lt;/strong&gt; Run &lt;code&gt;sunpeak new&lt;/code&gt; and get a working project with dependencies installed, configuration set up, and a starter app ready to modify.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add &lt;span class="nt"&gt;-g&lt;/span&gt; sunpeak &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sunpeak new
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Testing support.&lt;/strong&gt; Write tests with Vitest and Playwright that run against the simulator. Verify your UI renders correctly in CI without connecting to ChatGPT.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment via Resource Repository.&lt;/strong&gt; Sunpeak's &lt;a href="https://dev.to/blogs/introducing-sunpeak-resource-repository"&gt;Resource Repository&lt;/a&gt; gives you a deployment target for your app's resources, so you can ship without manually wiring MCP servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building With vs. Without a Framework
&lt;/h2&gt;

&lt;p&gt;Here's what the developer workflow looks like side by side:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without a framework:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Set up a new project&lt;/span&gt;
&lt;span class="nb"&gt;mkdir &lt;/span&gt;my-app &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;my-app
npm init &lt;span class="nt"&gt;-y&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; @modelcontextprotocol/sdk express
&lt;span class="c"&gt;# Manually configure MCP server...&lt;/span&gt;
&lt;span class="c"&gt;# Manually set up React rendering...&lt;/span&gt;
&lt;span class="c"&gt;# Manually build UI components...&lt;/span&gt;
&lt;span class="c"&gt;# Set up ngrok tunnel to test in ChatGPT...&lt;/span&gt;
&lt;span class="c"&gt;# Hope it renders correctly...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;With sunpeak:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Set up a new project&lt;/span&gt;
pnpm add &lt;span class="nt"&gt;-g&lt;/span&gt; sunpeak &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sunpeak new
&lt;span class="nb"&gt;cd &lt;/span&gt;my-app &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; pnpm &lt;span class="nb"&gt;install&lt;/span&gt;

&lt;span class="c"&gt;# Develop with instant feedback&lt;/span&gt;
sunpeak dev
&lt;span class="c"&gt;# Open localhost:6767, done.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The difference isn't just fewer commands. It's fewer decisions, fewer things to debug, and fewer things that can go wrong before you've started building your actual product.&lt;/p&gt;

&lt;h2&gt;
  
  
  When You Don't Need a Framework
&lt;/h2&gt;

&lt;p&gt;A framework isn't always the right choice.&lt;/p&gt;

&lt;p&gt;If you're building a simple server-side-only MCP tool that doesn't render any UI in ChatGPT, you don't need sunpeak. A plain MCP server with a few tool handlers is straightforward to set up with just the &lt;code&gt;@modelcontextprotocol/sdk&lt;/code&gt; package.&lt;/p&gt;

&lt;p&gt;If you're writing a one-off script or experimenting with the protocol, going framework-free is fine.&lt;/p&gt;

&lt;p&gt;But the moment you're building a UI that renders inside ChatGPT, especially one you plan to maintain and ship to users, a framework pays for itself immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;Sunpeak is open source and free to use.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.sunpeak.ai" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;: guides, API reference, and tutorials&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Sunpeak-AI/sunpeak" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;: source code and issue tracker&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/chatgpt-app-framework"&gt;ChatGPT App Framework&lt;/a&gt;: overview of sunpeak's capabilities
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add &lt;span class="nt"&gt;-g&lt;/span&gt; sunpeak &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sunpeak new
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>mcp</category>
      <category>webdev</category>
      <category>ai</category>
      <category>react</category>
    </item>
    <item>
      <title>Storybook for ChatGPT Apps</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Wed, 14 Jan 2026 18:31:42 +0000</pubDate>
      <link>https://forem.com/abewheeler/storybook-for-chatgpt-apps-5908</link>
      <guid>https://forem.com/abewheeler/storybook-for-chatgpt-apps-5908</guid>
      <description>&lt;p&gt;If you've built React applications, you probably know &lt;a href="https://storybook.js.org/" rel="noopener noreferrer"&gt;Storybook&lt;/a&gt;—the tool that lets you develop UI components in isolation, share them with your team, and iterate without spinning up your entire app. Today we're bringing that same workflow to ChatGPT Apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem with ChatGPT App Development
&lt;/h2&gt;

&lt;p&gt;Building ChatGPT Apps has a painful feedback loop. To see your changes, you need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build your resources&lt;/li&gt;
&lt;li&gt;Deploy / run your MCP server&lt;/li&gt;
&lt;li&gt;Refresh your ChatGPT connector&lt;/li&gt;
&lt;li&gt;Start a new ChatGPT conversation&lt;/li&gt;
&lt;li&gt;Create the right conversation state&lt;/li&gt;
&lt;li&gt;Configure the perfect state in your database to illustrate a single scenario&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's a lot of friction for checking if a button is the right shade of blue.&lt;/p&gt;

&lt;p&gt;Worse, sharing your work-in-progress with teammates or stakeholders means they need access to your MCP server, mastery of your technical data model, and the patience to navigate through the same steps.&lt;/p&gt;
&lt;h2&gt;
  
  
  Enter the sunpeak simulator
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Local Development
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsasd795sv267k7umdef1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsasd795sv267k7umdef1.png" alt="sunpeak running on localhost" width="800" height="560"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The flagship sunpeak simulator was originally for local development only. In the simulator, each resource in your app gets its own preview. Switch between &lt;code&gt;inline&lt;/code&gt;, &lt;code&gt;fullscreen&lt;/code&gt;, and &lt;code&gt;pip&lt;/code&gt; display modes instantly. Test light and dark themes. No ChatGPT account required.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This starts a local development server with hot reloading. Every save updates the preview immediately.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hosted Storybook
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsnxz2xtmas8nmw912owr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsnxz2xtmas8nmw912owr.png" alt="sunpeak running in the sunpeak resource repository" width="800" height="577"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Sunpeak Resource Repository now hosts the sunpeak simulator to run your ChatGPT App resources in an isolated environment. Think of it as a higher-level Storybook for ChatGPT Apps: you can preview every resource, test different display modes, and share a link with your teammates.&lt;/p&gt;

&lt;p&gt;Once you push your resources to the repository, your teammates can try them out at the provided link:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak push &lt;span class="nt"&gt;-t&lt;/span&gt; design-review

Pushing 4 resource&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt; to repository &lt;span class="s2"&gt;"Sunpeak-AI/sunpeak"&lt;/span&gt;...
Tags: design-review

✓ Pushed albums, 1 simulation&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;, tags: design-review
  https://app.sunpeak.ai/resources/5e57bbe6-b4a5-4895-9f10-81b667740b78
✓ Pushed carousel, 1 simulation&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;, tags: design-review
  https://app.sunpeak.ai/resources/f5304085-46d2-4b96-9173-ad865523862b
✓ Pushed map, 1 simulation&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;, tags: design-review
  https://app.sunpeak.ai/resources/95087582-be0a-45b2-80ec-16d439b380eb
✓ Pushed review, 3 simulation&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;, tags: design-review
  https://app.sunpeak.ai/resources/c329195b-23ea-4577-8116-32b52de37f13
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Share your resource URLs with your team. Designers can review the UI without touching code. Product managers can validate the flow without configuring MCP servers. Engineers can debug tool responses in isolation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Collaborate on Behavior
&lt;/h3&gt;

&lt;p&gt;The simulator isn't just for visuals or static states. You can mock tool inputs and outputs to test how your app responds to different states:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What does the app look like when a tool returns an error?&lt;/li&gt;
&lt;li&gt;How does the UI handle a slow response?&lt;/li&gt;
&lt;li&gt;Does the loading state feel right?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Configure these scenarios once and share them with your team for feedback.&lt;/p&gt;
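
&lt;p&gt;One way to think about those scenarios is as a small set of tool-call states, each mapped to what the UI should show. The sketch below is illustrative only: &lt;code&gt;ToolCallState&lt;/code&gt;, &lt;code&gt;describeView&lt;/code&gt;, and the 3-second slow threshold are assumptions, not sunpeak's simulation format.&lt;/p&gt;

```typescript
// Illustrative only: these types are not sunpeak's simulation format.
type ToolCallState =
  | { kind: 'loading'; elapsedMs: number }
  | { kind: 'error'; message: string }
  | { kind: 'success'; output: unknown };

// Assumed cutoff after which the UI adds a "still working" hint.
const SLOW_THRESHOLD_MS = 3000;

// Maps each mocked tool-call state to the view a reviewer should expect.
function describeView(state: ToolCallState): string {
  switch (state.kind) {
    case 'loading':
      return state.elapsedMs >= SLOW_THRESHOLD_MS
        ? 'spinner with slow-response hint'
        : 'spinner';
    case 'error':
      return `error banner: ${state.message}`;
    case 'success':
      return 'rendered result';
  }
}

const reviewScenarios: ToolCallState[] = [
  { kind: 'loading', elapsedMs: 200 },
  { kind: 'loading', elapsedMs: 5000 },
  { kind: 'error', message: 'Tool call failed' },
  { kind: 'success', output: { count: 5 } },
];

for (const scenario of reviewScenarios) {
  console.log(describeView(scenario));
}
```

&lt;p&gt;Writing the states down like this makes it easier to check that every one of them has a shareable preview before a design review.&lt;/p&gt;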

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;Storybook transformed frontend development by making components shareable and testable in isolation. ChatGPT Apps deserve the same treatment.&lt;/p&gt;

&lt;p&gt;With the sunpeak simulator, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Iterate faster&lt;/strong&gt;: See changes instantly without the deploy-refresh-navigate dance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collaborate earlier&lt;/strong&gt;: Get feedback on designs before they hit production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test edge cases&lt;/strong&gt;: Mock different tool responses without backend changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document behavior&lt;/strong&gt;: Create shareable previews that serve as living documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;The simulator is available now in the &lt;a href="https://app.sunpeak.ai/" rel="noopener noreferrer"&gt;sunpeak resource repository&lt;/a&gt;. If you're already using sunpeak, run &lt;code&gt;sunpeak push&lt;/code&gt; to upload your resources to the repository.&lt;/p&gt;

&lt;p&gt;New to &lt;a href="https://sunpeak.ai" rel="noopener noreferrer"&gt;sunpeak&lt;/a&gt;? Check out the &lt;a href="https://sunpeak.ai/docs" rel="noopener noreferrer"&gt;quickstart guide&lt;/a&gt; to get your first ChatGPT App running in minutes.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>webdev</category>
      <category>ai</category>
      <category>react</category>
    </item>
    <item>
      <title>MCP Needs a Browser</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Mon, 05 Jan 2026 20:29:13 +0000</pubDate>
      <link>https://forem.com/abewheeler/mcp-needs-a-browser-4ei5</link>
      <guid>https://forem.com/abewheeler/mcp-needs-a-browser-4ei5</guid>
      <description>&lt;p&gt;MCP isn’t the perfect protocol, but I’ll leave it to other people to complain about it. It has adoption, and that is all that matters: our systems &lt;em&gt;can&lt;/em&gt; be connected. Sometimes they &lt;em&gt;are&lt;/em&gt; connected. But MCP tool use has not remotely broken into the mainstream. Why?&lt;/p&gt;

&lt;p&gt;The consumer experience around MCP is horrendous.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Discovery&lt;/strong&gt;:  Imagine your parents proactively and willingly taking on the task of “connecting to the Facebook MCP server”, even through relatively simple UIs. The act of searching and the subject of the search are essentially dealbreakers for non-technical users.&lt;/p&gt;

&lt;p&gt;Even if users exceed the necessary technical bar, and even if users know exactly what they want done, they don’t know how to do it. They’re welcome to search the many lists of lists of lists of MCP servers, but it’s a lot of work and unlikely to surface trustworthy, stable results.&lt;/p&gt;

&lt;p&gt;For real, production MCP use today, we essentially rely on developers to proactively integrate MCP servers in the background so we can unwittingly use these servers via the web servers of products we’re already using. Imagine being able to use any given website only after a Google engineer found time &amp;amp; motivation to integrate it into &lt;a href="http://google.com/" rel="noopener noreferrer"&gt;google.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;MCP needs a search engine &amp;amp; proactive connection embedded in the model.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Connection&lt;/strong&gt;: Imagine if, every time you went to a website, you had to read a security notice and a privacy notice, approve a terms &amp;amp; conditions popup, and review the structure of the JSON payloads the website will be sending. This has become more true for consumers over time (thanks, EU), but the browser-server connection itself remains virtually permissionless. MCP servers are &lt;strong&gt;servers&lt;/strong&gt;, not client-side applications. Connecting to a server should be as easy as entering a URL in the browser.&lt;/p&gt;

&lt;p&gt;Obviously, seamless MCP server connection has major security implications. The models &amp;amp; their MCP clients need to be architected to be more sandboxed and trustless. Ultimately, the protection of the user &amp;amp; user data falls almost entirely within the purview of the model provider. They’ve got the users, the data, and the access to protect, and the new paradigms &amp;amp; architectures will have to flow from them.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;MCP needs to make connection more like a browser than an app store. This requires substantial protections built into the model.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use&lt;/strong&gt;: Imagine if, on a webpage, you had to manually trigger the correct sequence of API calls to deliver the proper user experience. With MCP, models are left with that impossible task: navigating invisible dependencies, edge cases, and the combinatorics of all possible tool calls. Such a task is nontrivial even for relatively simple, newer products, let alone massive, complex, legacy systems and all of the unintuitive tech debt they’ve accrued.&lt;/p&gt;

&lt;p&gt;Further, imagine if, in using a webpage, every input to and output from that page had to pass through a model. Would you use such a webpage to wire rent money? Models are non-deterministic. They can be wrong (less and less often over time, but never with zero risk). In most systems, there’s at least one action that you want to be direct-to-server and 100% deterministic.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;MCP needs to let server providers own parts of the client within the model.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All of the fundamental blockers to MCP have one thing in common: they’re totally dependent on the model provider to implement. Fortunately, OpenAI is on the right track.&lt;/p&gt;

&lt;p&gt;ChatGPT Apps bring MCP one step closer to having a “browser”, but they don’t go all the way. I suspect that this is the direction we’re heading. As with all macro trends, it will take us a while to get there.&lt;/p&gt;

&lt;p&gt;MCP is very young, ChatGPT Apps are younger, and the Apps of today are only weeks old. Everything will get a LOT better. We’re building &lt;code&gt;sunpeak&lt;/code&gt; to help. &lt;a href="https://sunpeak.ai" rel="noopener noreferrer"&gt;https://sunpeak.ai&lt;/a&gt; is the ChatGPT App framework that helps developers quickstart, build, test, and ship ChatGPT Apps. Please &lt;a href="https://github.com/Sunpeak-AI/sunpeak/" rel="noopener noreferrer"&gt;star us on GitHub&lt;/a&gt;!&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>webdev</category>
      <category>ai</category>
      <category>react</category>
    </item>
    <item>
      <title>Introducing the Sunpeak Resource Repository</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Tue, 23 Dec 2025 19:35:25 +0000</pubDate>
      <link>https://forem.com/abewheeler/introducing-the-sunpeak-resource-repository-1bbg</link>
      <guid>https://forem.com/abewheeler/introducing-the-sunpeak-resource-repository-1bbg</guid>
      <description>&lt;p&gt;Today we're launching the Sunpeak Resource Repository: think ECR (Amazon's container registry), but for ChatGPT Apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Decouple Your App from Your MCP Server?
&lt;/h2&gt;

&lt;p&gt;ChatGPT Apps are built on MCP servers, but your UI resources don't need to live alongside your server code. Decoupling them provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Generic MCP servers&lt;/strong&gt;: Keep your production MCP server generic &amp;amp; largely client-agnostic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Independent lifecycles&lt;/strong&gt;: Clearly indicate which code changes and version tags require ChatGPT App submission reviews and which are entirely MCP server-side&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team collaboration&lt;/strong&gt;: Designers and frontend devs can push UI changes without touching server infrastructure and vice versa&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Independent deployments&lt;/strong&gt;: Update your UI without redeploying your server&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Authenticate
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This opens your browser for secure OAuth authentication. Sunpeak stores local configuration in &lt;code&gt;~/.sunpeak/&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Push Resources
&lt;/h3&gt;

&lt;p&gt;After building your resources with &lt;code&gt;sunpeak build&lt;/code&gt;, push them to the repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak push
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tag your resources for versioning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak push &lt;span class="nt"&gt;-t&lt;/span&gt; v1.0.0 &lt;span class="nt"&gt;-t&lt;/span&gt; staging
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pull Resources
&lt;/h3&gt;

&lt;p&gt;Retrieve resources by tag from any directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak pull &lt;span class="nt"&gt;-r&lt;/span&gt; myorg/my-app &lt;span class="nt"&gt;-t&lt;/span&gt; prod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This downloads the JavaScript bundles and metadata files, ready for deployment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Common Workflows
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Rollback to a previous version:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak pull &lt;span class="nt"&gt;-r&lt;/span&gt; myorg/my-app &lt;span class="nt"&gt;-t&lt;/span&gt; v1.0.0
sunpeak deploy &lt;span class="c"&gt;# Shorthand for sunpeak push -t prod&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Promote staging to production:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sunpeak pull &lt;span class="nt"&gt;-r&lt;/span&gt; myorg/my-app &lt;span class="nt"&gt;-t&lt;/span&gt; staging
sunpeak deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;Ready to try it? Head to &lt;a href="https://sunpeak.ai" rel="noopener noreferrer"&gt;sunpeak.ai&lt;/a&gt; to learn more, or jump straight into the &lt;a href="https://app.sunpeak.ai" rel="noopener noreferrer"&gt;web application&lt;/a&gt; to create your account.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>webdev</category>
      <category>ai</category>
      <category>react</category>
    </item>
    <item>
      <title>Ship a ChatGPT App in 2 commands</title>
      <dc:creator>Abe Wheeler</dc:creator>
      <pubDate>Wed, 17 Dec 2025 13:37:53 +0000</pubDate>
      <link>https://forem.com/abewheeler/ship-a-chatgpt-app-in-2-commands-38i0</link>
      <guid>https://forem.com/abewheeler/ship-a-chatgpt-app-in-2-commands-38i0</guid>
      <description>&lt;p&gt;With &lt;a href="https://sunpeak.ai/" rel="noopener noreferrer"&gt;sunpeak&lt;/a&gt;, you can start and ship a ChatGPT App with two commands:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Initialize your project:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm dlx sunpeak new
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Inside your project, start your MCP server:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Your ChatGPT App UI and mock data server are now up and running.&lt;/p&gt;

&lt;p&gt;If you’re running the server on your local machine, you’ll need to expose that MCP server so ChatGPT can access it. Do so with a free account from &lt;a href="https://ngrok.com/" rel="noopener noreferrer"&gt;ngrok&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ngrok http 6766
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lastly, you need to point ChatGPT to your new app. From your ChatGPT account, proceed to: &lt;strong&gt;&lt;code&gt;User &amp;gt; Settings &amp;gt; Apps &amp;amp; Connectors &amp;gt; Create&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You need to be in &lt;strong&gt;developer mode&lt;/strong&gt; to add your App, which requires a paid account. If you don’t have a paid account, you can just develop your App locally with &lt;code&gt;pnpm dev&lt;/code&gt; instead of &lt;code&gt;pnpm mcp&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You can now connect ChatGPT to the ngrok &lt;strong&gt;&lt;code&gt;Forwarding URL&lt;/code&gt;&lt;/strong&gt; at the &lt;strong&gt;&lt;code&gt;/mcp&lt;/code&gt;&lt;/strong&gt; path (e.g. &lt;strong&gt;&lt;code&gt;https://your-random-subdomain.ngrok-free.dev/mcp&lt;/code&gt;&lt;/strong&gt;). Your App is now connected to ChatGPT! Send &lt;code&gt;/sunpeak show carousel&lt;/code&gt; to ChatGPT to see your UI in action!&lt;/p&gt;
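
&lt;p&gt;If you script this step, the URL to give ChatGPT is just the forwarding URL with the &lt;code&gt;/mcp&lt;/code&gt; path appended. A minimal TypeScript sketch, where &lt;code&gt;mcpEndpoint&lt;/code&gt; is a hypothetical helper name and the subdomain is the placeholder from the example above:&lt;/p&gt;

```typescript
// Hypothetical helper: build the MCP endpoint URL from an ngrok forwarding URL.
function mcpEndpoint(forwardingUrl: string): string {
  // new URL(path, base) resolves "/mcp" against the origin of the base URL,
  // so a trailing slash on the forwarding URL does not produce "//mcp".
  return new URL("/mcp", forwardingUrl).toString();
}

// Placeholder subdomain; use the Forwarding URL printed by your own ngrok session.
console.log(mcpEndpoint("https://your-random-subdomain.ngrok-free.dev"));
```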

</description>
      <category>mcp</category>
      <category>webdev</category>
      <category>ai</category>
      <category>react</category>
    </item>
  </channel>
</rss>
