<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: 정상록</title>
    <description>The latest articles on Forem by 정상록 (@_46ea277e677b888e0cd13).</description>
    <link>https://forem.com/_46ea277e677b888e0cd13</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3798483%2F5958ee10-90dd-45e5-815a-91d3f8196156.png</url>
      <title>Forem: 정상록</title>
      <link>https://forem.com/_46ea277e677b888e0cd13</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/_46ea277e677b888e0cd13"/>
    <language>en</language>
    <item>
      <title>OmniVoice: Open-Source TTS with 600+ Languages and Zero-Shot Voice Cloning</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Tue, 21 Apr 2026 02:32:49 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/omnivoice-open-source-tts-with-600-languages-and-zero-shot-voice-cloning-1mpn</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/omnivoice-open-source-tts-with-600-languages-and-zero-shot-voice-cloning-1mpn</guid>
      <description>&lt;h1&gt;
  
  
  OmniVoice: Open-Source TTS with 600+ Languages and Zero-Shot Voice Cloning
&lt;/h1&gt;

&lt;p&gt;The TTS landscape just shifted. On March 31, 2026, the k2-fsa team — the same group behind Kaldi and k2, with Daniel Povey as a core contributor — released &lt;strong&gt;OmniVoice&lt;/strong&gt;: an Apache 2.0 licensed TTS model with zero-shot support for 600+ languages and 40x faster-than-real-time inference.&lt;/p&gt;

&lt;p&gt;In just three weeks, it hit &lt;strong&gt;3,775 GitHub stars&lt;/strong&gt; and &lt;strong&gt;460,000+ HuggingFace downloads&lt;/strong&gt;. Here's why developers are paying attention and how to run it locally.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers That Matter
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Languages supported&lt;/td&gt;
&lt;td&gt;600+ (zero-shot)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RTF (Real-Time Factor)&lt;/td&gt;
&lt;td&gt;0.025 (40x faster than real-time)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;License&lt;/td&gt;
&lt;td&gt;Apache 2.0 (commercial use OK)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Base model&lt;/td&gt;
&lt;td&gt;Qwen3-0.6B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reference audio needed&lt;/td&gt;
&lt;td&gt;3–10 seconds (or none)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardware&lt;/td&gt;
&lt;td&gt;Runs on consumer GPUs, Apple Silicon MPS supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Compare this to commercial services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ElevenLabs Pro&lt;/strong&gt;: $22/month, limited characters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ElevenLabs Business&lt;/strong&gt;: $99/month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Azure TTS&lt;/strong&gt;: $16/million characters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google Cloud TTS&lt;/strong&gt;: $16/million characters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OmniVoice: zero cost after deployment. Unlimited usage on your own hardware.&lt;/p&gt;
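&lt;p&gt;A quick back-of-envelope check makes the trade-off concrete. The $16/million-character rate comes from the Azure/Google pricing above; the monthly GPU server cost is an assumed placeholder, so treat this purely as an illustration.&lt;/p&gt;

```python
# Break-even point vs. pay-per-character cloud TTS.
# api_cost_per_million_chars matches the Azure/Google rates quoted above;
# the $250/month GPU server figure is an assumption for illustration only.
api_cost_per_million_chars = 16.0
gpu_server_monthly_cost = 250.0  # assumed; varies with provider and card

break_even_chars = gpu_server_monthly_cost / api_cost_per_million_chars * 1_000_000
print(f"Self-hosting wins past {break_even_chars:,.0f} characters/month")
```

&lt;p&gt;At these assumed numbers, self-hosting pays for itself somewhere above 15 million characters a month; below that, the metered APIs may still win on pure dollars, though not on flexibility.&lt;/p&gt;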

&lt;h2&gt;
  
  
  Three Modes in One Model
&lt;/h2&gt;

&lt;p&gt;OmniVoice supports three inference modes through a single unified API:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Voice Cloning
&lt;/h3&gt;

&lt;p&gt;Clone a voice from a short reference audio clip. Whisper auto-transcribes the reference text if you don't provide it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;omnivoice&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OmniVoice&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;soundfile&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sf&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;OmniVoice&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;k2-fsa/OmniVoice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;device_map&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cuda:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;audio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;This voice was cloned from a 3-second reference.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ref_audio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ref.wav&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# ref_text optional - Whisper auto-transcribes
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;sf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cloned.wav&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;24000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Voice Design
&lt;/h3&gt;

&lt;p&gt;Design a voice from scratch using natural language attributes. No reference audio required.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;audio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;This is a designed voice.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instruct&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;female, low pitch, british accent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Combine attributes freely: gender, age, pitch, speech speed, accent, dialect, emotional tone.&lt;/p&gt;
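&lt;p&gt;Since &lt;code&gt;instruct&lt;/code&gt; is just a comma-separated attribute string, you can assemble it programmatically. The helper below is an illustrative convenience, not part of the OmniVoice API.&lt;/p&gt;

```python
def build_instruct(**attrs: str) -> str:
    """Join voice attributes into the comma-separated instruct string
    that generate() expects. Hypothetical helper; OmniVoice itself only
    ever sees the final string."""
    order = ("gender", "age", "pitch", "speed", "accent", "dialect", "tone")
    return ", ".join(attrs[key] for key in order if key in attrs)

build_instruct(gender="female", pitch="low pitch", accent="british accent")
# "female, low pitch, british accent"
```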

&lt;h3&gt;
  
  
  3. Auto Voice
&lt;/h3&gt;

&lt;p&gt;No voice prompt at all. Fastest mode for quick prototyping.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;audio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Quick test output.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;PyTorch must be installed first, with version pinned to 2.8.0.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# NVIDIA (CUDA 12.8)&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;&lt;span class="nv"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.8.0+cu128 &lt;span class="nv"&gt;torchaudio&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.8.0+cu128 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--extra-index-url&lt;/span&gt; https://download.pytorch.org/whl/cu128

&lt;span class="c"&gt;# Apple Silicon (M1/M2/M3)&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;&lt;span class="nv"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.8.0 &lt;span class="nv"&gt;torchaudio&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.8.0

&lt;span class="c"&gt;# OmniVoice&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;omnivoice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify GPU/MPS detection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CUDA:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cuda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_available&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MPS:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backends&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_available&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Fast Path: Web Demo
&lt;/h2&gt;

&lt;p&gt;The fastest way to validate your setup is the bundled Gradio demo.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;omnivoice-demo &lt;span class="nt"&gt;--ip&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--port&lt;/span&gt; 8001
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Navigate to &lt;code&gt;http://localhost:8001&lt;/code&gt; and test all three modes through a UI.&lt;/p&gt;

&lt;h2&gt;
  
  
  CLI Tools
&lt;/h2&gt;

&lt;p&gt;Besides the Python API, OmniVoice ships with two CLI tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Single inference&lt;/span&gt;
omnivoice-infer &lt;span class="nt"&gt;--model&lt;/span&gt; k2-fsa/OmniVoice &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--text&lt;/span&gt; &lt;span class="s2"&gt;"Hello world."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--ref_audio&lt;/span&gt; ref.wav &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt; hello.wav

&lt;span class="c"&gt;# Multi-GPU batch inference&lt;/span&gt;
omnivoice-infer-batch &lt;span class="nt"&gt;--model&lt;/span&gt; k2-fsa/OmniVoice &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--test_list&lt;/span&gt; test.jsonl &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--res_dir&lt;/span&gt; results/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Batch format (JSONL):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"clip_001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"First clip."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ref_audio"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ref.wav"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"clip_002"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Second clip."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ref_audio"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ref.wav"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Perfect for audiobook generation or large-scale narration pipelines.&lt;/p&gt;
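&lt;p&gt;For long-form jobs, the JSONL manifest is easy to script. This sketch writes one line per audiobook chapter in the batch format shown above (the chapter texts and file names are placeholders).&lt;/p&gt;

```python
import json

# Placeholder chapter texts; in practice, read these from your source files.
chapters = ["First clip.", "Second clip.", "Third clip."]

with open("test.jsonl", "w", encoding="utf-8") as f:
    for i, text in enumerate(chapters, start=1):
        row = {"id": f"clip_{i:03d}", "text": text, "ref_audio": "ref.wav"}
        f.write(json.dumps(row) + "\n")
```

&lt;p&gt;Then point &lt;code&gt;omnivoice-infer-batch --test_list test.jsonl&lt;/code&gt; at the result, as in the CLI example above.&lt;/p&gt;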

&lt;h2&gt;
  
  
  Expression Control with Inline Tokens
&lt;/h2&gt;

&lt;p&gt;Drop non-verbal tokens anywhere in the text:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;That&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s hilarious [laughter] but also a bit concerning [sigh].&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;audio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ref_audio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ref.wav&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Available tags include &lt;code&gt;[laughter]&lt;/code&gt;, &lt;code&gt;[sigh]&lt;/code&gt;, &lt;code&gt;[question-ah]&lt;/code&gt;, &lt;code&gt;[surprise-wa]&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pronunciation Override
&lt;/h2&gt;

&lt;p&gt;For homophones and proper nouns, override pronunciation directly.&lt;/p&gt;

&lt;h3&gt;
  
  
  English (CMU notation)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# "bass" as musical instrument, not low-frequency sound
&lt;/span&gt;&lt;span class="n"&gt;audio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;He plays [B EY1 S] guitar.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Chinese (Pinyin with tone numbers)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Force specific tone
&lt;/span&gt;&lt;span class="n"&gt;audio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;打ZHE2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Architecture Notes
&lt;/h2&gt;

&lt;p&gt;OmniVoice uses a &lt;strong&gt;Diffusion Language Model&lt;/strong&gt; hybrid architecture. It's neither pure diffusion nor pure autoregressive — it combines the quality benefits of diffusion with the speed advantages of LLM-style generation. The base model is Qwen3-0.6B, making it light enough for consumer hardware while leveraging the language understanding of a modern LLM.&lt;/p&gt;

&lt;p&gt;This is a different direction from previous open-source TTS projects (Bark, XTTS, F5-TTS), and it seems to be paying off in both quality and inference speed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Production Tips from Issue #44
&lt;/h2&gt;

&lt;p&gt;The community has been actively discussing real-world usage in GitHub Issue #44.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Voice Design consistency.&lt;/strong&gt; Each call produces slightly different timbre. Generate once, save the output, then reuse as &lt;code&gt;ref_audio&lt;/code&gt; to lock in a consistent voice for an entire project.&lt;/p&gt;
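&lt;p&gt;That generate-once-then-reuse pattern is a few lines of glue. Here &lt;code&gt;generate_bytes&lt;/code&gt; is a stand-in for the actual &lt;code&gt;model.generate&lt;/code&gt; call plus audio encoding; only the caching logic is the point.&lt;/p&gt;

```python
from pathlib import Path

PROJECT_VOICE = Path("project_voice.wav")

def ensure_project_voice(generate_bytes) -> Path:
    """Run Voice Design only once per project: if the saved voice file
    already exists, reuse it as ref_audio; otherwise generate and save it.
    generate_bytes is a stand-in callable returning encoded audio bytes."""
    if not PROJECT_VOICE.exists():
        PROJECT_VOICE.write_bytes(generate_bytes())
    return PROJECT_VOICE
```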

&lt;p&gt;&lt;strong&gt;Prompt caching.&lt;/strong&gt; Use &lt;code&gt;create_voice_clone_prompt&lt;/code&gt; to precompute reference audio encodings once, then reuse the cached prompt for repeated generation. Critical for throughput on long-form content.&lt;/p&gt;
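&lt;p&gt;The same idea can be expressed with &lt;code&gt;functools.lru_cache&lt;/code&gt;. Note that &lt;code&gt;encode_reference&lt;/code&gt; below is a stand-in for whatever &lt;code&gt;create_voice_clone_prompt&lt;/code&gt; actually does; its exact signature is an assumption.&lt;/p&gt;

```python
from functools import lru_cache

def encode_reference(path: str) -> str:
    # Stand-in for the expensive reference-audio encoding step
    # (create_voice_clone_prompt in the real API; signature assumed).
    return f"prompt:{path}"

@lru_cache(maxsize=32)
def cached_clone_prompt(ref_audio_path: str) -> str:
    """Encode each reference file once, then reuse the cached prompt
    across many generate() calls on long-form content."""
    return encode_reference(ref_audio_path)
```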

&lt;p&gt;&lt;strong&gt;Number normalization.&lt;/strong&gt; Raw digits like "123" can produce inconsistent output. Normalize to words ("one hundred twenty-three") using WeTextProcessing or similar before passing text to the model.&lt;/p&gt;
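&lt;p&gt;If pulling in WeTextProcessing feels heavy, even a minimal spell-out pass helps. This sketch only handles integers from 0 to 999 and is no substitute for a real text-normalization library.&lt;/p&gt;

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def spell(n: int) -> str:
    """Spell out 0-999 in English words (minimal sketch only)."""
    if n in range(20):
        return ONES[n]
    if n in range(100):
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + spell(rest) if rest else "")

def normalize_numbers(text: str) -> str:
    """Replace runs of digits with their spelled-out form before TTS."""
    return re.sub(r"\d+", lambda m: spell(int(m.group())), text)

normalize_numbers("Order 123 shipped")
# "Order one hundred twenty-three shipped"
```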

&lt;p&gt;&lt;strong&gt;Cross-lingual accent bleed.&lt;/strong&gt; If you use a Korean reference to generate English, the output carries a Korean accent. For a neutral accent in the target language, use a native-speaker reference.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who This Is For
&lt;/h2&gt;

&lt;p&gt;OmniVoice fits several developer profiles well:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Solo developers and indie hackers&lt;/strong&gt; who were paying commercial TTS subscriptions just for hobby projects&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI agent builders&lt;/strong&gt; needing voice output without vendor lock-in&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content creators&lt;/strong&gt; doing multilingual localization (YouTube, podcasts)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Voice-cloning experimenters&lt;/strong&gt; for whom a 3-second reference opens up a wide range of creative possibilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low-resource language applications&lt;/strong&gt; (the 600+ language coverage includes many languages with no good commercial option)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The k2-fsa ecosystem has a strong track record of long-term maintenance (Kaldi is still actively used 15+ years after release). That matters when you're deciding whether to build production infrastructure on a new model.&lt;/p&gt;

&lt;p&gt;If you're evaluating TTS options, OmniVoice deserves a spot in the comparison. The combination of 600+ language support, 40x real-time inference, and Apache 2.0 licensing is genuinely rare in open source TTS today.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/k2-fsa/OmniVoice" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/k2-fsa/OmniVoice" rel="noopener noreferrer"&gt;HuggingFace model card&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2604.00688" rel="noopener noreferrer"&gt;Paper on arXiv&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/k2-fsa/OmniVoice/issues/44" rel="noopener noreferrer"&gt;Issue #44: community Q&amp;amp;A&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Have you tried OmniVoice yet? Would love to hear how it compares to your current TTS setup in the comments.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>nlp</category>
      <category>opensource</category>
    </item>
    <item>
      <title>chrome-devtools-mcp: Google's Official MCP Server That Lets AI Agents Drive Chrome DevTools</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Tue, 21 Apr 2026 02:26:07 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/chrome-devtools-mcp-googles-official-mcp-server-that-lets-ai-agents-drive-chrome-devtools-1m16</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/chrome-devtools-mcp-googles-official-mcp-server-that-lets-ai-agents-drive-chrome-devtools-1m16</guid>
      <description>&lt;h1&gt;
  
  
  chrome-devtools-mcp: Google's Official MCP Server That Lets AI Agents Drive Chrome DevTools
&lt;/h1&gt;

&lt;p&gt;If you're using Claude Code, Cursor, or any MCP-capable AI agent for frontend work, there's now an official way to let your agent drive Chrome directly — and it's maintained by Google's Chrome DevTools team.&lt;/p&gt;

&lt;p&gt;The package is called &lt;code&gt;chrome-devtools-mcp&lt;/code&gt;, it's Apache-2.0 licensed, and as of April 2026 it's at v0.21.0 after 43 releases in 7 months. This isn't a one-off experiment — the Chrome team is shipping it on a roughly weekly cadence.&lt;/p&gt;

&lt;p&gt;Here's what changed for me after installing it, and what you need to know before doing the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Official Google MCP server wrapping Chrome DevTools via Puppeteer&lt;/li&gt;
&lt;li&gt;29 tools across 7 categories (input, navigation, performance, network, debugging, emulation, memory)&lt;/li&gt;
&lt;li&gt;One-line install for Claude Code: &lt;code&gt;claude mcp add chrome-devtools --scope user npx chrome-devtools-mcp@latest&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Chrome M144+ ships &lt;code&gt;--autoConnect&lt;/code&gt; — AI attaches to your already-running Chrome session (keeps logins, cookies, extensions)&lt;/li&gt;
&lt;li&gt;Not a Playwright MCP replacement. Different use case (live debugging vs deterministic E2E tests)&lt;/li&gt;
&lt;li&gt;Security-sensitive: browser content is shared with your MCP client; disable telemetry with &lt;code&gt;CI=1&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What it actually does
&lt;/h2&gt;

&lt;p&gt;Before &lt;code&gt;chrome-devtools-mcp&lt;/code&gt;, asking an AI agent to debug a web page meant a lot of manual copy-pasting. You'd screenshot the Network tab, paste console errors as strings, describe what you see. The AI was always one abstraction layer away from the actual browser state.&lt;/p&gt;

&lt;p&gt;With the MCP server installed, your agent can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Navigate pages (&lt;code&gt;navigate_page&lt;/code&gt;, &lt;code&gt;new_page&lt;/code&gt;, multi-tab control)&lt;/li&gt;
&lt;li&gt;Click, type, drag, fill forms (&lt;code&gt;click&lt;/code&gt;, &lt;code&gt;fill_form&lt;/code&gt;, &lt;code&gt;upload_file&lt;/code&gt;, &lt;code&gt;press_key&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Read the actual DevTools Performance trace (&lt;code&gt;performance_start_trace&lt;/code&gt;, &lt;code&gt;performance_analyze_insight&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Execute Lighthouse audits (&lt;code&gt;lighthouse_audit&lt;/code&gt;) — including CrUX real-user data&lt;/li&gt;
&lt;li&gt;Inspect every HTTP request natively, not via screenshot (&lt;code&gt;list_network_requests&lt;/code&gt;, &lt;code&gt;get_network_request&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Read console messages with sourcemap-resolved stack traces (&lt;code&gt;list_console_messages&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Take memory heap snapshots for leak hunting (&lt;code&gt;take_memory_snapshot&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full list is 29 tools, all surfaced via the standard MCP protocol. Any agent that speaks MCP — Claude Code, Cursor, Cline, Zed, Gemini CLI — can use them identically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Method 1: MCP server only
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add chrome-devtools &lt;span class="nt"&gt;--scope&lt;/span&gt; user npx chrome-devtools-mcp@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;--scope user&lt;/code&gt; makes it available across all your Claude Code projects. Use &lt;code&gt;--scope project&lt;/code&gt; to limit to the current workspace.&lt;/p&gt;

&lt;h3&gt;
  
  
  Method 2: Plugin (MCP + Skills)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin marketplace add ChromeDevTools/chrome-devtools-mcp
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;chrome-devtools-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The plugin bundle adds Chrome-curated Skills (prompt presets) alongside the MCP server. Higher-level commands like "run a performance audit" just work.&lt;/p&gt;

&lt;h3&gt;
  
  
  Requirements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Node.js v20.19 or newer (LTS)&lt;/li&gt;
&lt;li&gt;Chrome Stable (M144+ recommended for &lt;code&gt;--autoConnect&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;npm (comes with Node)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After installing, restart Claude Code and run &lt;code&gt;/mcp&lt;/code&gt; to verify &lt;code&gt;chrome-devtools (29 tools available)&lt;/code&gt; is listed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The feature that changed my workflow: &lt;code&gt;--autoConnect&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Chrome M144 (December 2025) added support for &lt;code&gt;--autoConnect&lt;/code&gt;. Instead of spawning a fresh Chrome instance every time, the MCP server attaches to your existing Chrome via remote debugging.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Start Chrome with remote debugging enabled&lt;/span&gt;
open &lt;span class="nt"&gt;-a&lt;/span&gt; &lt;span class="s2"&gt;"Google Chrome"&lt;/span&gt; &lt;span class="nt"&gt;--args&lt;/span&gt; &lt;span class="nt"&gt;--remote-debugging-port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;9222

&lt;span class="c"&gt;# 2. Register MCP pointing at it&lt;/span&gt;
claude mcp add chrome-devtools-live &lt;span class="nt"&gt;--scope&lt;/span&gt; &lt;span class="nb"&gt;local&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  npx chrome-devtools-mcp@latest &lt;span class="nt"&gt;--autoConnect&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this matters: your existing browser state — SSO sessions, extensions, developer tools panel position, the exact tab you were debugging — is preserved. You can manually reproduce an issue, hand off to AI mid-session, and the AI sees the same DOM you see.&lt;/p&gt;

&lt;p&gt;Playwright MCP doesn't do this. Playwright is intentionally designed for test isolation, spawning fresh contexts. Different philosophy, different use case.&lt;/p&gt;

&lt;h2&gt;
  
  
  chrome-devtools-mcp vs Playwright MCP
&lt;/h2&gt;

&lt;p&gt;This is the question that comes up every time.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concern&lt;/th&gt;
&lt;th&gt;chrome-devtools-mcp&lt;/th&gt;
&lt;th&gt;Playwright MCP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Primary purpose&lt;/td&gt;
&lt;td&gt;Live debugging, performance audits&lt;/td&gt;
&lt;td&gt;Deterministic E2E tests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Browser support&lt;/td&gt;
&lt;td&gt;Chrome only&lt;/td&gt;
&lt;td&gt;Chrome / Firefox / WebKit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance tracing&lt;/td&gt;
&lt;td&gt;Native&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lighthouse audit&lt;/td&gt;
&lt;td&gt;Built in&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CrUX real-user data&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test isolation&lt;/td&gt;
&lt;td&gt;Weak (by design)&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI/CD fit&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Rule of thumb: writing regression tests → Playwright MCP. Debugging why production is slow or a specific user hits an error → chrome-devtools-mcp. Most teams benefit from having both installed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-world workflow examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Weekly performance audit
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Every Monday, for these 5 landing pages:
1. Run Lighthouse audit (mobile and desktop)
2. Compare Core Web Vitals (LCP, CLS, INP) to last week's numbers
3. Flag any score regressions &amp;gt; 5 points
4. Diagnose the root cause in the network waterfall
5. Output a markdown report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This used to be a 2-hour manual process. Now it's a 10-minute async handoff.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pre-deploy smoke test
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;On localhost:3000:
1. Load home page — fail if any console errors
2. Complete signup form — fail if not redirected to onboarding
3. Authenticate and load dashboard — fail if any 4xx/5xx requests
4. Screenshot each step
5. Bail with error analysis on first failure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Production incident triage
&lt;/h3&gt;

&lt;p&gt;With &lt;code&gt;--autoConnect&lt;/code&gt; attached to your logged-in Chrome, you can say "reproduce the bug the customer reported on /app/settings" and the AI navigates there, clicks through, captures console + network, and returns the sourcemap-resolved stack trace.&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful flags
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--slim&lt;/code&gt; — reduces the tool surface to 3 essentials (navigate, click, snapshot). Saves MCP tokens when you don't need the full kit.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--isolated&lt;/code&gt; — runs in an isolated Chrome profile, doesn't touch your personal session.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--headless&lt;/code&gt; — for CI/CD.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--browserUrl http://localhost:9222&lt;/code&gt; — explicit version of &lt;code&gt;--autoConnect&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--channel beta|canary|dev&lt;/code&gt; — target a non-Stable Chrome channel.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--no-usage-statistics&lt;/code&gt; — opt out of Google telemetry (also respects &lt;code&gt;CI=1&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--no-performance-crux&lt;/code&gt; — skip CrUX real-user data lookups.&lt;/li&gt;
&lt;/ul&gt;
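&lt;p&gt;These flags go on the server command line wherever your MCP client defines servers. A typical entry in the common &lt;code&gt;mcpServers&lt;/code&gt; JSON shape (the file location and exact keys depend on your client; this flag combination is just an illustration):&lt;/p&gt;

```json
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["chrome-devtools-mcp@latest", "--isolated", "--headless"]
    }
  }
}
```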

&lt;h2&gt;
  
  
  Security caveats (do not skip)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Browser content is sent to your MCP client.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Anything rendered in the browser — DOM, cookies, localStorage — can end up in your AI model's context. Do not run DevTools MCP on tabs where you're logged into a bank, an internal admin panel, or anything with sensitive PII, unless you're comfortable with that data reaching your model API provider.&lt;/p&gt;

&lt;p&gt;Mitigation: use &lt;code&gt;--isolated&lt;/code&gt; to separate from your personal profile, or start Chrome with a dedicated profile via its &lt;code&gt;--user-data-dir&lt;/code&gt; switch.&lt;/p&gt;
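&lt;p&gt;In shell form, with standard Chrome switches (the binary name and profile path are placeholders):&lt;/p&gt;

```shell
# Build the launch command for a throwaway debug profile instead of your
# daily one. --remote-debugging-port and --user-data-dir are standard
# Chrome switches; the binary name and path below are placeholders.
CHROME_BIN="${CHROME_BIN:-google-chrome}"
CMD="$CHROME_BIN --remote-debugging-port=9222 --user-data-dir=$HOME/.cache/chrome-mcp-profile"
echo "$CMD"   # inspect it, then run it
```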

&lt;p&gt;&lt;strong&gt;2. Telemetry is on by default.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Google collects anonymized usage statistics. If that's not acceptable for your compliance requirements, set &lt;code&gt;CI=1&lt;/code&gt; in your shell env or append &lt;code&gt;--no-usage-statistics&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Remote debugging port must be localhost-bound.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;--autoConnect&lt;/code&gt; works by connecting to Chrome's remote debugging port (default 9222). If that port is exposed to your LAN or a public IP, anyone on the network can read your browser state or execute scripts in your session.&lt;/p&gt;

&lt;p&gt;Verify binding with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lsof &lt;span class="nt"&gt;-i&lt;/span&gt; :9222
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You want to see &lt;code&gt;127.0.0.1:9222&lt;/code&gt; or &lt;code&gt;localhost:9222&lt;/code&gt;, never &lt;code&gt;0.0.0.0:9222&lt;/code&gt;. On shared networks, avoid the port-based flow entirely and let MCP spawn its own isolated Chrome.&lt;/p&gt;
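&lt;p&gt;If &lt;code&gt;lsof&lt;/code&gt; isn't handy, the same check fits in a few lines of Python (an illustration, not part of the tool):&lt;/p&gt;

```python
import socket

def accepts_tcp(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something is accepting TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# While MCP is attached, loopback should answer and your machine's LAN
# address shouldn't (the LAN IP here is a placeholder):
#   accepts_tcp("127.0.0.1", 9222)     # expected True
#   accepts_tcp("192.168.0.10", 9222)  # expected False on a safe setup
```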

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Does it work with Chromium forks like Edge, Brave, Arc?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Officially Chrome Stable only. Chromium-based browsers may work for basic cases but performance-analysis tools can be unreliable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is the Lighthouse audit slower than CLI Lighthouse?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A bit — you're paying the AI round-trip cost. If you only need raw scores, stick with CLI Lighthouse. If you want scores + root-cause analysis + a fix, the MCP path saves time overall.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is it free?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. Apache-2.0 open source, Google-maintained. No API keys, no subscription.&lt;/p&gt;

&lt;h2&gt;
  
  
  Verdict
&lt;/h2&gt;

&lt;p&gt;If you use an MCP-capable AI agent for frontend work and you haven't installed chrome-devtools-mcp yet, you're leaving the single highest-leverage automation on the table. The install is five minutes, the learning curve is shallow, and the maintainer is the Chrome team itself.&lt;/p&gt;

&lt;p&gt;The workflows that save the most time for me: weekly Lighthouse audits with regression diagnosis, pre-deploy smoke tests, and production incident triage with &lt;code&gt;--autoConnect&lt;/code&gt;. If any of those land in your week, DevTools MCP is a straightforward win.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Official repo: &lt;a href="https://github.com/ChromeDevTools/chrome-devtools-mcp" rel="noopener noreferrer"&gt;github.com/ChromeDevTools/chrome-devtools-mcp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;npm package: &lt;a href="https://www.npmjs.com/package/chrome-devtools-mcp" rel="noopener noreferrer"&gt;npmjs.com/package/chrome-devtools-mcp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Chrome DevTools AI Agents docs: &lt;a href="https://developer.chrome.com/docs/devtools/agents" rel="noopener noreferrer"&gt;developer.chrome.com/docs/devtools/agents&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Announcement blog post: &lt;a href="https://developer.chrome.com/blog/chrome-devtools-mcp" rel="noopener noreferrer"&gt;developer.chrome.com/blog/chrome-devtools-mcp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--autoConnect&lt;/code&gt; walkthrough: &lt;a href="https://developer.chrome.com/blog/chrome-devtools-mcp-debug-your-browser-session" rel="noopener noreferrer"&gt;developer.chrome.com/blog/chrome-devtools-mcp-debug-your-browser-session&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>tooling</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Claude Cowork Goes GA — Why Live Artifacts Change How We Work With AI</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Tue, 21 Apr 2026 02:19:16 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/claude-cowork-goes-ga-why-live-artifacts-change-how-we-work-with-ai-jch</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/claude-cowork-goes-ga-why-live-artifacts-change-how-we-work-with-ai-jch</guid>
      <description>&lt;p&gt;On April 9, 2026, Anthropic released Claude Cowork to every paid plan — Pro, Max, Team, Enterprise. It started as a macOS-only research preview for Max users on January 12. Three months later, it's everywhere. And the real headline isn't the rollout. It's &lt;strong&gt;Live Artifacts&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Cowork Actually Is
&lt;/h2&gt;

&lt;p&gt;Claude Cowork is not just another desktop app. Anthropic defines it as a "collaborative workspace where Claude performs long-running tasks in the background while exchanging results with the user in real time."&lt;/p&gt;

&lt;p&gt;The difference from regular Claude.ai chat:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Claude Chat&lt;/th&gt;
&lt;th&gt;Claude Cowork&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Interaction&lt;/td&gt;
&lt;td&gt;Turn-based Q&amp;amp;A&lt;/td&gt;
&lt;td&gt;Background long-running execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;Text + static artifacts&lt;/td&gt;
&lt;td&gt;Live artifacts + progress state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context&lt;/td&gt;
&lt;td&gt;Per-session&lt;/td&gt;
&lt;td&gt;Persistent per Project&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Integrations&lt;/td&gt;
&lt;td&gt;Optional MCP&lt;/td&gt;
&lt;td&gt;Zoom, Excel, PowerPoint, Slack built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plan&lt;/td&gt;
&lt;td&gt;All tiers&lt;/td&gt;
&lt;td&gt;Paid plans only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Why now? OpenAI's Agent Mode and Google's Gemini Deep Research pushed the industry into "long-horizon autonomous AI" in Q1 2026. Anthropic responded — but with a twist. Instead of "let AI handle it alone," they chose "let AI handle it beside you."&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Live Artifacts Matter
&lt;/h2&gt;

&lt;p&gt;Artifacts themselves shipped in August 2024. Nothing new there. The shift to &lt;em&gt;Live&lt;/em&gt; Artifacts has three real implications:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. You see state changes as they happen
&lt;/h3&gt;

&lt;p&gt;When Claude runs a 10-minute task, you used to wait for the final output. Now the dashboard fills up, charts draw themselves, data streams into the artifact panel — all in real time.&lt;/p&gt;

&lt;p&gt;This isn't just UX polish. It means &lt;strong&gt;you can course-correct mid-execution&lt;/strong&gt;. "Actually, make that a bar chart instead" works.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Artifacts run in isolated VMs
&lt;/h3&gt;

&lt;p&gt;The Cowork artifact sandbox is an isolated VM that actually executes React components, Python scripts, and interactive HTML. When you ask for a dashboard, Claude generates the code &lt;em&gt;and&lt;/em&gt; the running dashboard appears simultaneously. Click a button, it responds. Change a filter, the chart updates.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. MCP integrations pipe real data in
&lt;/h3&gt;

&lt;p&gt;The Zoom MCP connector ships built-in. Meeting recordings, summaries, and action items flow straight into Live Artifacts. Excel and PowerPoint get end-to-end integration — upload a file, Claude's edits land back in the original.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Timeline That Got Us Here
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Jan 12  — Research preview (macOS + Max only)
Feb 10  — Windows support
Feb 24  — Plugin marketplace + Excel/PPT + Slack
Mar 20  — Projects integration
Mar 23  — Dispatch (assign from phone, execute on desktop)
Apr 8   — Enterprise deployment announcement
Apr 9   — General availability + enterprise governance
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three months, three expansion axes: platforms (Mac → Windows), features (plugins → Projects → Dispatch), and distribution (Max → all paid).&lt;/p&gt;

&lt;h2&gt;
  
  
  What the April 9 GA Actually Added
&lt;/h2&gt;

&lt;p&gt;Beyond rollout, this release brought enterprise-grade governance down to the Pro plan:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Role-based access control (RBAC)&lt;/strong&gt;: Scope Cowork features per team member. Sensitive projects stay locked to specific roles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Group spend limits&lt;/strong&gt;: Set API usage caps per team or project. Budget blowups get harder.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenTelemetry observability&lt;/strong&gt;: Cowork activity logs export in standard telemetry format. Plug into Datadog, New Relic, whatever you use. Essential for audit requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Analytics API&lt;/strong&gt;: Query Cowork usage patterns programmatically. Which team uses what, how often, in measurable terms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zoom MCP connector&lt;/strong&gt;: Automatic meeting data pipeline.&lt;/p&gt;

&lt;p&gt;The key point: &lt;strong&gt;this stack is available on the $20/month Pro plan&lt;/strong&gt;. Enterprise-tier governance for solo operators.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solo Operator Playbook
&lt;/h2&gt;

&lt;p&gt;If you're a one-person shop wearing multiple hats (I run Quantum Jump Club as CEO + marketer + developer + accountant), Cowork changes the math.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 1: Projects per Role
&lt;/h3&gt;

&lt;p&gt;Split Cowork Projects by job function:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;"Content"&lt;/strong&gt; Project: Blog, social, YouTube scripts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Finance"&lt;/strong&gt; Project: Revenue analysis, expense processing, budget monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"CRM"&lt;/strong&gt; Project: Lead analysis, pipeline management, customer research&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Product"&lt;/strong&gt; Project: Curriculum design, VOD planning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each Project keeps its own context. Step into "Content" and Claude already knows your brand voice, past content style, and target audience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 2: Live Artifacts for Numerical Decisions
&lt;/h3&gt;

&lt;p&gt;Monday morning: upload the revenue Excel file and ask "analyze vs last week + suggest 3 focus metrics this week."&lt;/p&gt;

&lt;p&gt;In the artifact panel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dashboard fills in real time&lt;/li&gt;
&lt;li&gt;KPI cards flag warnings by color&lt;/li&gt;
&lt;li&gt;What-if sliders generate themselves&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Drag the slider and ask "what if ad spend goes up 20%?" The AI partner runs it with you.&lt;/p&gt;
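&lt;p&gt;Under the hood, a what-if like that is just scenario arithmetic. A toy version, with every number invented for illustration:&lt;/p&gt;

```python
def project_revenue(ad_spend: float, roas: float, baseline: float) -> float:
    """Baseline revenue plus ad-driven revenue at a given return on ad spend."""
    return baseline + ad_spend * roas

# All figures are made up; Cowork's actual model is whatever your data implies.
current = project_revenue(ad_spend=1_000, roas=3.0, baseline=5_000)
plus_20 = project_revenue(ad_spend=1_200, roas=3.0, baseline=5_000)
print(f"revenue: {current:.0f} -> {plus_20:.0f}")
```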

&lt;h3&gt;
  
  
  Pattern 3: Meeting-to-Action Pipeline
&lt;/h3&gt;

&lt;p&gt;Take a customer call over Zoom. The Zoom MCP connector auto-pipes the recording, transcript, and summary into Cowork. Cowork extracts action items and renders them as a live checklist artifact. Within 30 seconds of the meeting ending, "who does what" is visualized.&lt;/p&gt;

&lt;p&gt;This differs from classic meeting automation in that the output arrives &lt;strong&gt;in an executable form&lt;/strong&gt;, not just as text.&lt;/p&gt;

&lt;h2&gt;
  
  
  Honest Limitations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Network restrictions
&lt;/h3&gt;

&lt;p&gt;The artifact sandbox is an isolated VM with restricted outbound network. GitHub issue #46243 (April 10, 2026) flags that video/audio/HLS streaming gets blocked. Anthropic confirmed this is intentional for security. Real-time media rendering use cases remain constrained.&lt;/p&gt;

&lt;h3&gt;
  
  
  Memory scope
&lt;/h3&gt;

&lt;p&gt;Context persists within a Project but doesn't cross Projects. If you want Project A's knowledge in Project B, you transfer it manually.&lt;/p&gt;

&lt;h3&gt;
  
  
  Desktop only
&lt;/h3&gt;

&lt;p&gt;Mobile supports Dispatch (task assignment) but actual Cowork execution is desktop. Field work remains limited.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost structure
&lt;/h3&gt;

&lt;p&gt;Cowork is available on Pro ($20/mo) but Live Artifacts burn tokens fast. Run long sessions often and you'll hit the monthly cap. Max ($100/mo) is more realistic for mission-critical use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started in 5 Minutes
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Update Claude Desktop to latest (macOS: Cmd+Q restart; Windows: Settings &amp;gt; Check for Updates)&lt;/li&gt;
&lt;li&gt;Confirm Pro plan or higher&lt;/li&gt;
&lt;li&gt;Click the Cowork tab in the left sidebar (enable in Settings &amp;gt; Features if missing)&lt;/li&gt;
&lt;li&gt;Create a new Project organized by role&lt;/li&gt;
&lt;li&gt;Drag an Excel file in&lt;/li&gt;
&lt;li&gt;Ask: "build a live dashboard from this data"&lt;/li&gt;
&lt;li&gt;Watch the artifact panel come alive&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's the moment the phrase "working through a chat window" starts feeling weird.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;Claude Cowork's Live Artifact update isn't a feature drop. It's a shift in how we work with AI. Chat is conversation. Cowork is collaboration.&lt;/p&gt;

&lt;p&gt;While OpenAI and Google race on autonomy, Anthropic bet on the collaborative experience. For solo operators, that bet pays off — the "Excel analyst + deck designer + action-item secretary" combo is now $100/month away.&lt;/p&gt;

&lt;p&gt;Update your Claude Desktop and open the Cowork tab. Five minutes is enough to see the shift.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;References:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://claude.com/blog/cowork-for-enterprise" rel="noopener noreferrer"&gt;Claude Cowork for Enterprise (Anthropic Blog)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://support.claude.com/en/articles/14477985-monitor-claude-cowork-activity-with-opentelemetry" rel="noopener noreferrer"&gt;Monitor Cowork with OpenTelemetry (Anthropic Help)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/product/claude-cowork" rel="noopener noreferrer"&gt;Claude Cowork Product Page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/anthropics/claude-code/issues/46243" rel="noopener noreferrer"&gt;GitHub Issue #46243 — Sandbox Network Limits&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Claude Design Launch — How Anthropic Made Figma's Stock Drop 7% in One Day</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Mon, 20 Apr 2026 11:31:09 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/claude-design-launch-how-anthropic-made-figmas-stock-drop-7-in-one-day-25j3</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/claude-design-launch-how-anthropic-made-figmas-stock-drop-7-in-one-day-25j3</guid>
      <description>&lt;h1&gt;
  
  
  Claude Design Launch — How Anthropic Made Figma's Stock Drop 7% in One Day
&lt;/h1&gt;

&lt;p&gt;On April 17, 2026, &lt;strong&gt;Anthropic shipped Claude Design&lt;/strong&gt; — a research preview that turns natural-language prompts into presentations, UI prototypes, and marketing one-pagers. Figma's stock dropped about 7% the same day.&lt;/p&gt;

&lt;p&gt;I've been poking at it for the last 48 hours. Here's what actually matters for devs and founders.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR for Engineers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Release&lt;/td&gt;
&lt;td&gt;2026-04-17, Anthropic Labs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Model&lt;/td&gt;
&lt;td&gt;Claude Opus 4.7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Render engine&lt;/td&gt;
&lt;td&gt;Canva Design Engine (partnership)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Access&lt;/td&gt;
&lt;td&gt;Claude Pro / Max / Team / Enterprise&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Killer feature&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;One-shot handoff to Claude Code&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exports&lt;/td&gt;
&lt;td&gt;PDF, PPTX, HTML, Canva, org URL, code bundle&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The Five Features That Matter
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Prompt → Visual in ~30 seconds
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"SaaS landing page, dark mode, hero with 2 CTAs,
3-tier pricing table, customer logo strip"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Opus 4.7 plans the structure and Canva's engine renders it. Editable output in about 30 seconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Interactive inline edits
&lt;/h3&gt;

&lt;p&gt;Not "regenerate the whole thing." Three tools work in parallel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inline comments ("more restrained tone here")&lt;/li&gt;
&lt;li&gt;Direct editing (text, colors, sizing)&lt;/li&gt;
&lt;li&gt;Custom sliders (spacing, contrast, font weight)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the Figma/Canva UX ported to AI. It's the reason I think Claude Design has staying power where previous AI design tools failed.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Team design system ingestion
&lt;/h3&gt;

&lt;p&gt;This is the enterprise unlock. During onboarding you point Claude at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub repo (it parses &lt;code&gt;tailwind.config.js&lt;/code&gt;, CSS variables)&lt;/li&gt;
&lt;li&gt;Figma library URL&lt;/li&gt;
&lt;li&gt;Brand guide PDF&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Claude extracts color palette, typography, spacing rules, component patterns. All subsequent generations use your tokens. No more pasting hex codes into every prompt.&lt;/p&gt;
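&lt;p&gt;For context on what "parses &lt;code&gt;tailwind.config.js&lt;/code&gt;" means in practice: a plain Tailwind config already carries palette, typography, and spacing tokens. A fragment with invented values:&lt;/p&gt;

```javascript
// tailwind.config.js -- illustrative tokens only, not a real brand's values
module.exports = {
  theme: {
    extend: {
      colors: { brand: "#5B8CFF", surface: "#0E1116" },
      fontFamily: { sans: ["Inter", "system-ui", "sans-serif"] },
      spacing: { gutter: "1.5rem" },
    },
  },
};
```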

&lt;h3&gt;
  
  
  4. Claude Code handoff
&lt;/h3&gt;

&lt;p&gt;The differentiator that Figma doesn't have:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/handoff &lt;span class="nt"&gt;--to&lt;/span&gt; claude-code &lt;span class="nt"&gt;--format&lt;/span&gt; nextjs &lt;span class="nt"&gt;--styling&lt;/span&gt; tailwind
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Design ships a bundle (design intent + structure + tokens) to Claude Code, which generates production Next.js / React / Vue. The loop closes inside Anthropic's ecosystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Multi-format export
&lt;/h3&gt;

&lt;p&gt;PDF (investors), PPTX (existing slide workflows), HTML (preview), organization URL (team review), Canva (fine-tuning), Claude Code (production).&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Actually Use It — 6 Steps
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Confirm Claude Pro+ subscription&lt;/li&gt;
&lt;li&gt;Click the &lt;strong&gt;palette icon (🎨)&lt;/strong&gt; in left nav on claude.ai&lt;/li&gt;
&lt;li&gt;(Optional, Team/Enterprise) Connect codebase + design system during onboarding&lt;/li&gt;
&lt;li&gt;Write a structured prompt (see pattern below)&lt;/li&gt;
&lt;li&gt;Iterate with inline comments / sliders&lt;/li&gt;
&lt;li&gt;Export or handoff to Claude Code&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Prompt pattern that actually works
&lt;/h3&gt;

&lt;p&gt;Five elements, in order:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Output type (landing, pitch deck, UI, one-pager)
2. Core structure (N sections, ordering)
3. Style (tone, color palette, reference brand)
4. Content (actual text, numbers, names)
5. Constraints (length, fonts, banned elements)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bad prompt: &lt;code&gt;"nice landing page"&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Good prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;B2B SaaS landing page, 1 page. Sections: (1) hero with headline
"AI automates your bookkeeping", 1-line sub, CTAs "14-day free trial"
and "Book demo"; (2) logo strip (6 logos, Notion/Linear/Figma style);
(3) 3 value props with icon + 1-liner; (4) 1 screenshot;
(5) 3-tier pricing $29/99/299; (6) 4-question FAQ; (7) footer.
Style: Linear aesthetic, dark mode, restrained typography,
NO diagonal glass effects, NO accent bars.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why Figma Stock Dropped
&lt;/h2&gt;

&lt;p&gt;Not because designers are obsolete. Because &lt;strong&gt;the "explore what to build" step — the one where founders traditionally bring wireframes to designers — just got 100x faster&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Designers still win on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Auto Layout precision&lt;/li&gt;
&lt;li&gt;Design critique and iteration&lt;/li&gt;
&lt;li&gt;Systems thinking&lt;/li&gt;
&lt;li&gt;Dev Mode handoff to existing codebases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But the &lt;em&gt;pre-designer exploration phase&lt;/em&gt; just got eaten. Founders walk into meetings with prototypes, not wireframes. PMs ship specs as working screens, not PowerPoints.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Honest Part — What r/ClaudeAI Complains About
&lt;/h2&gt;

&lt;p&gt;Reddit found the pattern fast:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Homogeneous output&lt;/strong&gt; — serif fonts, blinking status dots, color accent bars, "container soup" card layouts. Generated outputs scream "made with Claude Design."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customization ceiling&lt;/strong&gt; — without reference screenshots or design tokens, you can't escape the preset aesthetic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stock reaction hype&lt;/strong&gt; — some analysts argue the 7% drop reflects AI-disruption anxiety more than actual competitive threat. Figma is still vastly more capable per hour of work.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Workarounds I've Found
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Attach 3–5 reference screenshots&lt;/strong&gt; of brands you want to emulate (Linear, Superhuman, Stripe)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Negative prompts work&lt;/strong&gt;: &lt;code&gt;"NO: serif fonts, blinking dots, accent bars, overlapping cards"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use it as first draft&lt;/strong&gt;, then hand off to designer or Canva for final polish&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replace default fonts&lt;/strong&gt; in the generated HTML before export&lt;/li&gt;
&lt;/ul&gt;
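&lt;p&gt;That last workaround is scriptable. A throwaway sketch (the replacement stack and the regex are mine, not part of Claude Design):&lt;/p&gt;

```python
import re
from pathlib import Path

def swap_fonts(html: str, new_stack: str = "Inter, sans-serif") -> str:
    """Replace every CSS font-family declaration in html with one stack."""
    return re.sub(r"font-family\s*:\s*[^;}]+", f"font-family: {new_stack}", html)

# Usage (the file name is a placeholder for the exported HTML):
# page = Path("export.html")
# page.write_text(swap_fonts(page.read_text()))
```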

&lt;h2&gt;
  
  
  Competitive Landscape for Devs
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;When to pick&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude Design&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Founder/PM draft → Claude Code production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Figma&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Designer-led team, fine-grained control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Vercel v0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single React component, fast&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lovable / Bolt.new&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Skip design, go straight to working app&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Canva&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Marketing SNS, printed assets&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you're already in the Anthropic ecosystem (Claude Code user), the handoff alone justifies Claude Design. The friction between "here's what I want" and "here's working code" drops significantly.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Take
&lt;/h2&gt;

&lt;p&gt;The pattern I see repeatedly with AI tools is: &lt;strong&gt;they don't replace the craft, they eat the pre-craft exploration&lt;/strong&gt;. Claude Code didn't replace developers — it ate the "boilerplate and first-pass implementation" phase. Claude Design doesn't replace designers — it eats the "what should this even look like" phase.&lt;/p&gt;

&lt;p&gt;If you're a founder: ship this week. Drafts that took 3 days now take 30 seconds.&lt;/p&gt;

&lt;p&gt;If you're a designer: the scarcity shifts from "creating options" to "judging and refining options." Your leverage goes up, not down, &lt;em&gt;if you adapt&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;If you're an engineer: the handoff from Claude Design to Claude Code is the interesting part. Test whether your team can actually ship production-quality code from the bundle. I'd bet it's 70% there today and 95% in six months.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/news/claude-design-anthropic-labs" rel="noopener noreferrer"&gt;Anthropic official announcement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://techcrunch.com/2026/04/17/anthropic-launches-claude-design-a-new-product-for-creating-quick-visuals/" rel="noopener noreferrer"&gt;TechCrunch coverage&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://venturebeat.com/technology/anthropic-just-launched-claude-design-an-ai-tool-that-turns-prompts-into-prototypes-and-challenges-figma" rel="noopener noreferrer"&gt;VentureBeat analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.canva.com/newsroom/news/canva-claude-design/" rel="noopener noreferrer"&gt;Canva partnership newsroom&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Curious what other Dev.to folks think — are you shipping with this already, or waiting for the kinks to shake out?&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Claude Code Advisor: Opus Steering Sonnet Inside a Single API Call</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Mon, 20 Apr 2026 03:19:24 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/claude-code-advisor-opus-steering-sonnet-inside-a-single-api-call-4j8p</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/claude-code-advisor-opus-steering-sonnet-inside-a-single-api-call-4j8p</guid>
      <description>&lt;h1&gt;
  
  
  Claude Code Advisor: Opus Steering Sonnet Inside a Single API Call
&lt;/h1&gt;

&lt;p&gt;Anthropic shipped the &lt;strong&gt;Advisor Strategy&lt;/strong&gt; on April 9, 2026. It's not a new model. It's a pattern: a fast executor (Haiku/Sonnet) does the work, and at decision points it calls a stronger advisor (Opus) for strategic guidance — all inside a single &lt;code&gt;/v1/messages&lt;/code&gt; request, handled server-side.&lt;/p&gt;

&lt;p&gt;The numbers are the headline.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Benchmark Story
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SWE-bench Multilingual&lt;/strong&gt; (real coding tasks):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sonnet 4.6 solo: 72.1%&lt;/li&gt;
&lt;li&gt;Sonnet 4.6 + Opus advisor: &lt;strong&gt;74.8%&lt;/strong&gt; (+2.7pp)&lt;/li&gt;
&lt;li&gt;Cost: &lt;strong&gt;11.9% cheaper&lt;/strong&gt; (yes, higher quality &lt;em&gt;and&lt;/em&gt; cheaper)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;BrowseComp&lt;/strong&gt; (web research):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Haiku 4.5 solo: 19.7%&lt;/li&gt;
&lt;li&gt;Haiku 4.5 + Opus advisor: &lt;strong&gt;41.2%&lt;/strong&gt; (2x+)&lt;/li&gt;
&lt;li&gt;Cost: &lt;strong&gt;85% cheaper than Sonnet solo&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last one is the jaw-dropper. You get Haiku latency, Opus judgment, and a price point below Sonnet.&lt;/p&gt;
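&lt;p&gt;A quick sanity check on why a blend can undercut the mid-tier model: the executor sees the bulk of the tokens and the advisor only a sliver. The per-million-token prices below are placeholders, not Anthropic's published pricing; only the shape of the math matters:&lt;/p&gt;

```python
# Placeholder per-MTok prices (NOT real pricing), just to show the blended
# cost structure: cheap executor on the bulk, expensive advisor on a sliver.
PRICE = {
    "haiku":  {"in": 1.0,  "out": 5.0},
    "sonnet": {"in": 3.0,  "out": 15.0},
    "opus":   {"in": 15.0, "out": 75.0},
}

def cost(model: str, in_tok: int, out_tok: int) -> float:
    p = PRICE[model]
    return (in_tok * p["in"] + out_tok * p["out"]) / 1_000_000

# Hypothetical task: 200k in / 20k out on the executor, plus a few short
# advisor consultations (10k in / 2k out on Opus).
blended = cost("haiku", 200_000, 20_000) + cost("opus", 10_000, 2_000)
sonnet_solo = cost("sonnet", 200_000, 20_000)
print(f"blended ${blended:.2f} vs sonnet solo ${sonnet_solo:.2f}")
```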

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;The Advisor is server-side. From the client's perspective, it's just another tool in the &lt;code&gt;tools&lt;/code&gt; array:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# executor
&lt;/span&gt;    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;advisor_20260301&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# must be &amp;gt;= executor
&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_uses&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;caching&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ttl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5m&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;extra_headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic-beta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;advisor-tool-2026-03-01&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Refactor the payment module.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Per-call token breakdown in usage.iterations[]
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iterations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What happens internally:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Executor emits a &lt;code&gt;server_tool_use&lt;/code&gt; block (&lt;code&gt;name="advisor"&lt;/code&gt;, &lt;code&gt;input={}&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Anthropic's server forwards the current context to the advisor&lt;/li&gt;
&lt;li&gt;Advisor (Opus 4.7) runs extended thinking, returns guidance&lt;/li&gt;
&lt;li&gt;An &lt;code&gt;advisor_tool_result&lt;/code&gt; block lands in the stream&lt;/li&gt;
&lt;li&gt;Executor continues&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The advisor &lt;strong&gt;does not&lt;/strong&gt; call tools. The advisor &lt;strong&gt;does not&lt;/strong&gt; respond to the user. It only whispers strategy to the executor.&lt;/p&gt;
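
&lt;p&gt;The step list above can be checked programmatically. A minimal sketch, using stand-in objects rather than real SDK types — the block &lt;code&gt;type&lt;/code&gt;/&lt;code&gt;name&lt;/code&gt; fields are assumptions taken from steps 1 and 4, not a published schema:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from types import SimpleNamespace

# Stand-in content blocks mirroring steps 1 and 4 above.
# Field names are assumptions based on this post, not a published schema.
content = [
    SimpleNamespace(type="server_tool_use", name="advisor", id="toolu_01", input={}),
    SimpleNamespace(type="advisor_tool_result", tool_use_id="toolu_01"),
    SimpleNamespace(type="text", text="...executor continues..."),
]

advisor_calls = sum(
    1 for b in content
    if b.type == "server_tool_use" and b.name == "advisor"
)
print(f"advisor consulted {advisor_calls} time(s)")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;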

&lt;h2&gt;
  
  
  When to Call the Advisor
&lt;/h2&gt;

&lt;p&gt;Anthropic's official prompting guide names four moments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Before substantive work&lt;/strong&gt; — before writing, before committing to an interpretation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When the task feels complete&lt;/strong&gt; — but persist the deliverable first&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When stuck&lt;/strong&gt; — same error recurring, approach not converging&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;When considering a change of approach&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What &lt;em&gt;not&lt;/em&gt; to call it for: pure mechanical work, single-turn Q&amp;amp;A, workloads that need top-tier intelligence on every turn.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Trimming Trick
&lt;/h2&gt;

&lt;p&gt;Add this line to your system prompt:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The advisor should respond in under 100 words and use enumerated steps, not explanations."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Per Anthropic's own measurements: &lt;strong&gt;35-45% token reduction&lt;/strong&gt; on advisor calls with no quality degradation. Pair it with &lt;code&gt;max_uses&lt;/code&gt; to cap per-request spend.&lt;/p&gt;
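
&lt;p&gt;Putting the two levers together in a sketch — the tool dict mirrors the config shown earlier in this post, so treat the field names as assumptions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# The trimming line goes in the system prompt; max_uses caps spend.
ADVISOR_STYLE = (
    "The advisor should respond in under 100 words "
    "and use enumerated steps, not explanations."
)

advisor_tool = {
    "type": "advisor_20260301",   # shape from the earlier snippet
    "model": "claude-opus-4-7",
    "max_uses": 3,                # hard per-request cap on advisor calls
}

system_prompt = "You are a coding agent.\n" + ADVISOR_STYLE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;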

&lt;h2&gt;
  
  
  Claude Code Users: One Command
&lt;/h2&gt;

&lt;p&gt;If you're using Claude Code CLI, skip the SDK entirely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/advisor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The toggle flips, and every subsequent agent loop uses the Executor-Advisor pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Error Codes to Know
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Code&lt;/th&gt;
&lt;th&gt;Cause&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;max_uses_exceeded&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Per-request cap hit&lt;/td&gt;
&lt;td&gt;Raise &lt;code&gt;max_uses&lt;/code&gt; or tighten executor prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;overloaded&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Advisor capacity&lt;/td&gt;
&lt;td&gt;Retry with backoff; fall back to Opus 4.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;context_mismatch&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;You stripped &lt;code&gt;advisor_tool_result&lt;/code&gt; blocks&lt;/td&gt;
&lt;td&gt;Keep full assistant response in multi-turn&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;invalid_advisor_model&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Advisor weaker than executor&lt;/td&gt;
&lt;td&gt;Always use Opus 4.7 as advisor&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
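
&lt;p&gt;For the &lt;code&gt;overloaded&lt;/code&gt; row, the client-side handling can be sketched like this. &lt;code&gt;call&lt;/code&gt; stands in for your SDK invocation, and the &lt;code&gt;claude-opus-4-6&lt;/code&gt; fallback string is an assumption inferred from the table:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import time

# Retry with exponential backoff; on the last retry, fall back
# to the previous Opus generation as the advisor.
def call_with_retry(call, primary="claude-opus-4-7",
                    fallback="claude-opus-4-6",
                    attempts=3, base_delay=1.0):
    advisor = primary
    for attempt in range(attempts):
        try:
            return call(advisor_model=advisor)
        except RuntimeError as err:
            if "overloaded" not in str(err) or attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
            if attempt == attempts - 2:
                advisor = fallback
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;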

&lt;h2&gt;
  
  
  Customer Evidence
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bolt&lt;/strong&gt; (Eric Simmons, CEO): "Architectural decisions on complex tasks visibly improved. The plans and trajectories are completely different."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eve Legal&lt;/strong&gt; (Anuraj Pandey): "Haiku 4.5 with Opus 4.6 advisory hit frontier quality at 5x less cost on structured document extraction."&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Takeaway for Indie Developers
&lt;/h2&gt;

&lt;p&gt;The model-selection dilemma — "Opus quality but at Haiku prices" — just got structurally solved. If you're running long-horizon coding agents, multi-step research pipelines, or Computer Use flows, this pattern changes your cost curve.&lt;/p&gt;

&lt;p&gt;Three lines of code to try it. One CLI command if you're in Claude Code.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Reference&lt;/strong&gt;: &lt;a href="https://claude.com/blog/the-advisor-strategy" rel="noopener noreferrer"&gt;Anthropic Blog — The Advisor Strategy&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>llm</category>
      <category>agents</category>
    </item>
    <item>
      <title>OpenAI 'Spud' (Rumored GPT-5.5 Pro): What Employees Are Saying, What We Actually Know</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Mon, 20 Apr 2026 03:18:37 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/openai-spud-rumored-gpt-55-pro-what-employees-are-saying-what-we-actually-know-h97</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/openai-spud-rumored-gpt-55-pro-what-employees-are-saying-what-we-actually-know-h97</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pre-read disclaimer&lt;/strong&gt;: Most of what follows is pre-release rumor and speculation. "GPT-5.5 Pro" is a community-coined name, not an officially confirmed product. Benchmark numbers, release dates, and architectural details below are unverified. Treat this as an informed signal analysis, not confirmed fact.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;OpenAI finished pretraining its next frontier model on &lt;strong&gt;March 24, 2026&lt;/strong&gt;. Internal codename: &lt;strong&gt;Spud&lt;/strong&gt;. Official branding is TBD — the community has been calling it "GPT-5.5" or "GPT-6" depending on how well it benchmarks. "Pro" is a community assumption.&lt;/p&gt;

&lt;p&gt;The interesting part isn't the technical rumors. It's the language pattern. Greg Brockman called it "the big model feel" — the exact phrase OpenAI employees internally used for the GPT-3 → GPT-4 jump. Multiple employees are saying "different from anything before." That's unusual.&lt;/p&gt;

&lt;p&gt;For devs, the one practical thing to know: Spud is rumored to have &lt;strong&gt;native multi-modality at the architecture level&lt;/strong&gt;. If you have pipelines where &lt;code&gt;image → text description → LLM&lt;/code&gt;, consider refactoring to a unified context now. That design pattern may become obsolete soon.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Happened on March 24, 2026
&lt;/h2&gt;

&lt;p&gt;The Information broke the story: OpenAI completed pretraining of the next model. Key facts:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Fact&lt;/th&gt;
&lt;th&gt;Source confidence&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Codename "Spud"&lt;/td&gt;
&lt;td&gt;High (multiple corroborating reports)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Training at Stargate (Abilene, TX)&lt;/td&gt;
&lt;td&gt;High (publicly known facility)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Currently in RLHF + red team&lt;/td&gt;
&lt;td&gt;High (standard OpenAI pipeline)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Target release in "a few weeks"&lt;/td&gt;
&lt;td&gt;High (Altman direct quote)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Official name "GPT-5.5" or "GPT-6"&lt;/td&gt;
&lt;td&gt;Low (depends on benchmarks)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Suffix "Pro"&lt;/td&gt;
&lt;td&gt;Very low (community speculation)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The interesting phrasing choices:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sam Altman&lt;/strong&gt;: "A very strong model that could really accelerate the economy."&lt;/p&gt;

&lt;p&gt;He didn't use "accelerate the economy" for GPT-4. Unusual word choice from a CEO who is normally measured.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Greg Brockman&lt;/strong&gt;: "There are two years of research inside this model. It has a big model feel — it's not an incremental improvement, it's a significant change in the way we think about model development."&lt;/p&gt;

&lt;p&gt;This is the one that matters. "Big model feel" is not marketing copy. It's internal OpenAI vernacular from the GPT-3 → GPT-4 transition. Brockman using this phrase publicly is a signal choice.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Employee Leak Pattern
&lt;/h2&gt;

&lt;p&gt;What makes this different from typical hype cycles: multiple employees using the same phrasing in independent channels.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"very different from what we've seen before"&lt;/li&gt;
&lt;li&gt;"not just bigger"&lt;/li&gt;
&lt;li&gt;"changes how I think about what's possible"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One employee expressing excitement means nothing. A dozen using the same language suggests a shared internal framing. That's the current pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  The LM Arena Anomaly
&lt;/h2&gt;

&lt;p&gt;In early April, three anonymous models appeared on LM Arena for a few hours and then got yanked:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;maskingtape-alpha
gaffertape-alpha
packingtape-alpha
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All tape-related names. Same test family, almost certainly. Community consensus:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;One is likely the new image model (GPT-Image-2, which OpenAI subsequently shipped)&lt;/li&gt;
&lt;li&gt;The other two are text/multimodal variants of Spud&lt;/li&gt;
&lt;li&gt;They got pulled because someone internally noticed live testing leaked to a public benchmark&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;April 19: Multiple prod API users reported response patterns that didn't match GPT-5.4's behavior. Could be "limited live testing" of Spud.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Benchmark Picture
&lt;/h2&gt;

&lt;p&gt;This is where it gets strategic. SWE-bench Pro (code agent capability) is the benchmark that matters for enterprise AI spend in 2026:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;SWE-bench Pro&lt;/th&gt;
&lt;th&gt;Released&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.4 (OpenAI current)&lt;/td&gt;
&lt;td&gt;57.70%&lt;/td&gt;
&lt;td&gt;Public&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude Mythos (Anthropic)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;77.80%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Public&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spud (estimate)&lt;/td&gt;
&lt;td&gt;high 70s to 80s&lt;/td&gt;
&lt;td&gt;Internal&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;OpenAI is &lt;strong&gt;~20 percentage points behind Anthropic&lt;/strong&gt; on coding agent tasks right now. In enterprise AI procurement, this kind of gap is causing actual customer migrations. Anthropic has been gaining ground fast on dev tooling specifically.&lt;/p&gt;

&lt;p&gt;So the question isn't just "how good is Spud?" It's "does Spud close or exceed the Anthropic gap?" The community thinks that's the difference between "GPT-5.5" branding (close but not ahead) and "GPT-6" branding (clearly ahead).&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Rumored to Be Architecturally Different
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Native multi-modality&lt;/strong&gt;. That's the headline leak.&lt;/p&gt;

&lt;p&gt;Current approach (GPT-5.4):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[user sends image + text]
       ↓
Vision module → converts image to text description
Text module → combines with original text  
       ↓
LLM → generates response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Spud approach (rumored):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[user sends image + audio + text]
       ↓
Unified transformer (same blocks handle all modalities natively)
       ↓
Response (possibly multimodal output)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this is real, the implications for developers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pipeline architectures become obsolete.&lt;/strong&gt; If you built a multimodal app that routes &lt;code&gt;image → description → LLM&lt;/code&gt;, that's going to be dead weight.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool use quality should improve.&lt;/strong&gt; Native multimodality + longer agentic context handles complex 14+ step workflows better (this is the specific gap vs Claude Mythos).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency drops.&lt;/strong&gt; No inter-module routing overhead.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Practical recommendation: if you're building agentic systems right now, design your context as a single unified stream rather than pre-processed pipelines. That bet pays off regardless of whether it's Spud or a subsequent model that lands this capability in production.&lt;/p&gt;
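
&lt;p&gt;Concretely, the difference between the two designs looks something like this. The content-block shape follows the style current LLM APIs already use; exact field names vary by provider:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Pipeline style (the pattern argued above to become dead weight):
#   caption = vision_model(image)              # lossy text intermediate
#   answer  = llm(caption + "\n" + question)

# Unified style: hand the raw modalities to the model together.
message = {
    "role": "user",
    "content": [
        {"type": "image", "source": {"type": "base64", "data": "..."}},
        {"type": "text", "text": "What changed between these dashboards?"},
    ],
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;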

&lt;h2&gt;
  
  
  Timeline (Polymarket as of April 20, 2026)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Implied probability&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Released by April 23&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;81%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Released by April 30&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;72%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Released by May 31&lt;/td&gt;
&lt;td&gt;93%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Altman said "a few weeks" on March 24. Arithmetically: late April to mid-May. Polymarket pricing matches.&lt;/p&gt;
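
&lt;p&gt;The arithmetic, reading "a few weeks" as four to seven:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from datetime import date, timedelta

# "A few weeks" from the March 24 statement, taking "a few" as 4-7 weeks.
announced = date(2026, 3, 24)
window = (announced + timedelta(weeks=4), announced + timedelta(weeks=7))
print(window)  # April 21 to May 12 -- late April to mid-May
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;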

&lt;p&gt;Note: on April 16, a screenshot of a rumored internal release document circulated on X and was then deleted. Whether it was a genuine leak or a fake is unconfirmed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why OpenAI Shut Down Sora Quietly
&lt;/h2&gt;

&lt;p&gt;Compute reallocation. OpenAI pulled GPU capacity from Sora (video generation) and redirected it to Spud. The signal value: OpenAI is treating the Anthropic competition as existential enough to sacrifice a marquee consumer product.&lt;/p&gt;

&lt;p&gt;Also worth noting: Spud is reportedly the base model for &lt;strong&gt;two upcoming voice agent platforms&lt;/strong&gt; — not just a chatbot upgrade. OpenAI is betting the 2026-2027 product roadmap on this model.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Watch For (and What to Ignore)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Actual signals to track&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SWE-bench Pro number when it drops (deciding factor for enterprise spend)&lt;/li&gt;
&lt;li&gt;Context window announcement (agentic task quality)&lt;/li&gt;
&lt;li&gt;Pricing (affects adoption curve)&lt;/li&gt;
&lt;li&gt;API pass-through latency (native multimodality claim verification)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Noise to ignore&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Capybara tier" and other extreme rumors&lt;/li&gt;
&lt;li&gt;Specific dates floating on X before official announcement&lt;/li&gt;
&lt;li&gt;Any claim about AGI or "human-level" anything&lt;/li&gt;
&lt;li&gt;The "GPT-5.5 Pro" name until OpenAI actually uses it&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The real story isn't "there's a new model." It's "OpenAI is under actual competitive pressure from Anthropic for the first time, and their internal language is signaling they think this response is significant."&lt;/p&gt;

&lt;p&gt;We'll know within 4-6 weeks whether the hype matches reality. Until then, the most actionable thing is to start thinking about your multimodal architectures in a way that assumes unified context rather than pipeline routing.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Primary source&lt;/strong&gt;: &lt;a href="https://primeaicenter.com/gpt-5-5-review/" rel="noopener noreferrer"&gt;primeaicenter.com/gpt-5-5-review&lt;/a&gt; (most thorough English archive)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Secondary&lt;/strong&gt;: The Information (March 24 pretraining completion), Altman/Brockman X posts, Polymarket, LM Arena observation community.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Disclaimer: AI-assisted analysis of publicly available leaks and statements. Nothing above is financial or procurement advice. Pre-release rumors change frequently.&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Vercel's April 2026 Breach: A 3-Step OAuth Supply-Chain Chain Every Dev Should Read</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Mon, 20 Apr 2026 03:16:13 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/vercels-april-2026-breach-a-3-step-oauth-supply-chain-chain-every-dev-should-read-21h</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/vercels-april-2026-breach-a-3-step-oauth-supply-chain-chain-every-dev-should-read-21h</guid>
      <description>&lt;p&gt;On 2026-04-19 (updated 2026-04-20), Vercel published a Security Bulletin confirming unauthorized access to certain internal systems. The technically interesting part isn't "Vercel got hacked." It's &lt;strong&gt;how&lt;/strong&gt; they got hacked: a three-step OAuth chain that started with a third-party AI tool an employee was using.&lt;/p&gt;

&lt;p&gt;If you deploy anything on Vercel, you have homework to do this week. Let me break down what happened, what was and wasn't exposed, and the concrete steps that matter.&lt;/p&gt;

&lt;h2&gt;
  
  
  The attack chain (in one paragraph)
&lt;/h2&gt;

&lt;p&gt;A Vercel employee was using &lt;strong&gt;Context.ai&lt;/strong&gt;, a third-party AI tool. The attacker compromised &lt;strong&gt;Context.ai's Google Workspace OAuth application&lt;/strong&gt;, used it to hijack the employee's Google Workspace account, and from there rode the employee's existing OAuth grant to &lt;strong&gt;reach Vercel's internal environment&lt;/strong&gt;. No MFA bypass. No stolen password. Just a legitimate chain of OAuth consents being inherited by the wrong principal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Context.ai] --OAuth--&amp;gt; [Google Workspace] --OAuth--&amp;gt; [Vercel]
     ^                                                   ^
     |                                                   |
 initial breach                                    reached via
   (supply chain)                              inherited OAuth scopes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What got exposed, and what didn't — the &lt;code&gt;Sensitive&lt;/code&gt; flag saved some secrets
&lt;/h2&gt;

&lt;p&gt;Vercel's post is unusually precise about the blast radius, and this is where the architectural lesson lands.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Non-sensitive environment variables&lt;/strong&gt;: potentially accessed. Vercel recommends rotating all of them. This is the bucket most teams default into — API keys, DB URLs, JWT secrets, webhook signing keys, LLM credentials.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sensitive environment variables&lt;/strong&gt;: no evidence of access. These are stored encrypted and decrypted only at build time, and the value cannot be retrieved in plaintext from the dashboard after creation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same variables. Same account. Same dashboard. &lt;strong&gt;One checkbox at creation time.&lt;/strong&gt; That checkbox was the difference between "needs rotation across your entire production stack" and "probably fine."&lt;/p&gt;

&lt;h2&gt;
  
  
  The IOC you can act on right now
&lt;/h2&gt;

&lt;p&gt;Vercel and downstream security reports published this OAuth App ID:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;110671459871-30f1spbu0hptbs60cb4vsmv79i7bbvqj.apps.googleusercontent.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're a Google Workspace admin, go to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;admin.google.com → Security → Access and data control → API controls
→ App access control → Manage Third-Party App Access
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Search for that project number (&lt;code&gt;110671459871&lt;/code&gt;). If it shows up in your org, you now have a concrete incident to investigate.&lt;/p&gt;
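
&lt;p&gt;If you're scripting the check instead: the numeric prefix of a Google OAuth client ID is the project number the admin console search matches on.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Extract the project number from the published IOC.
CLIENT_ID = "110671459871-30f1spbu0hptbs60cb4vsmv79i7bbvqj.apps.googleusercontent.com"
project_number = CLIENT_ID.split("-")[0]
print(project_number)  # 110671459871
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;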

&lt;p&gt;Context.ai's OAuth app is being described as part of a larger supply-chain compromise affecting hundreds of orgs — not a Vercel-only issue.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do this week — the pragmatic version
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Day 0 (today, ~2 hours)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Search your Google Workspace for the IOC above&lt;/li&gt;
&lt;li&gt;[ ] Review Vercel Activity Log (dashboard or &lt;code&gt;vercel ls&lt;/code&gt;) for unexpected logins, deployments, env changes&lt;/li&gt;
&lt;li&gt;[ ] Upgrade Deployment Protection to at least &lt;strong&gt;Standard&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Revoke unused Deployment Protection bypass tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Day 1-7 (rotation week)
&lt;/h3&gt;

&lt;p&gt;Rotate everything that was a non-sensitive env var. Prioritize like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Priority&lt;/th&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;P0 (24h)&lt;/td&gt;
&lt;td&gt;Payment / DB&lt;/td&gt;
&lt;td&gt;Stripe, Toss, Postgres URL, Redis URL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;P0 (24h)&lt;/td&gt;
&lt;td&gt;Signing keys&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;JWT_SECRET&lt;/code&gt;, &lt;code&gt;NEXTAUTH_SECRET&lt;/code&gt;, webhook signing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;P1 (48h)&lt;/td&gt;
&lt;td&gt;LLM APIs&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;OPENAI_API_KEY&lt;/code&gt;, &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;P2 (7d)&lt;/td&gt;
&lt;td&gt;Messaging / storage&lt;/td&gt;
&lt;td&gt;SendGrid, Mailgun, S3, Cloudinary&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Standard rotation flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Issue a fresh value in the provider's console&lt;/span&gt;
&lt;span class="c"&gt;# 2. Re-add it as a Sensitive env var&lt;/span&gt;
vercel &lt;span class="nb"&gt;env &lt;/span&gt;add STRIPE_SECRET_KEY production &lt;span class="nt"&gt;--sensitive&lt;/span&gt;

&lt;span class="c"&gt;# 3. Deploy on the new value&lt;/span&gt;
git commit &lt;span class="nt"&gt;--allow-empty&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"rotate: STRIPE_SECRET_KEY"&lt;/span&gt;
git push

&lt;span class="c"&gt;# 4. After verification, revoke the old value at the provider&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For &lt;code&gt;JWT_SECRET&lt;/code&gt;-style keys, remember that rotating them invalidates every existing session. Keep a &lt;code&gt;JWT_SECRET_LEGACY&lt;/code&gt; for a 24-72h grace window and accept tokens signed with either, then drop the legacy.&lt;/p&gt;
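
&lt;p&gt;The dual-secret grace window can be sketched with stdlib HMAC, standing in for whatever JWT library you actually use: accept a signature made with either key, compare in constant time, and delete the legacy path once the window closes.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import hashlib
import hmac

JWT_SECRET = b"new-secret-after-rotation"
JWT_SECRET_LEGACY = b"old-secret-being-retired"

def sign(payload: bytes, key: bytes) -&gt; str:
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify(payload: bytes, signature: str) -&gt; bool:
    # Accept either key during the 24-72h grace window.
    return any(
        hmac.compare_digest(sign(payload, key), signature)
        for key in (JWT_SECRET, JWT_SECRET_LEGACY)
    )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;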

&lt;h3&gt;
  
  
  Day 8-30 (structural improvements)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Flip your Google Workspace default OAuth access policy to &lt;strong&gt;Restricted&lt;/strong&gt;; move known-good apps to &lt;strong&gt;Trusted&lt;/strong&gt;, block the rest.&lt;/li&gt;
&lt;li&gt;Introduce a secrets manager (1Password Secrets Automation / Doppler / Bitwarden SM) so that Vercel becomes the &lt;em&gt;injection point&lt;/em&gt;, not the source of truth.&lt;/li&gt;
&lt;li&gt;Add a recurring quarterly calendar event: "OAuth audit — Google, GitHub, Vercel, Slack, Notion."&lt;/li&gt;
&lt;li&gt;Install &lt;code&gt;gitleaks&lt;/code&gt; as a pre-commit hook to catch accidentally committed secrets.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The broader takeaway for devs
&lt;/h2&gt;

&lt;p&gt;The attack that got Vercel wasn't novel cryptography, a zero-day, or MFA bypass. It was &lt;strong&gt;OAuth consent being treated as low-stakes&lt;/strong&gt; by the end user.&lt;/p&gt;

&lt;p&gt;In 2026, "Sign in with Google" on a new AI tool is effectively deciding: &lt;em&gt;"I'm willing to let this tool's security posture become the lower bound of my entire Google-connected surface area."&lt;/em&gt; That includes Vercel, GitHub, Slack, Notion, Supabase, and anything else federated through your Google ID.&lt;/p&gt;

&lt;p&gt;If your team doesn't have an AI-tool adoption policy yet, three lines probably cover most of the risk:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Any third-party tool asking for Google Workspace OAuth requires security-owner approval.&lt;/li&gt;
&lt;li&gt;Anything beyond read-only scopes needs a written justification.&lt;/li&gt;
&lt;li&gt;OAuth grants unused for 30 days are auto-revoked.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's the price of admission for living on this much managed infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Source
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Vercel Security Bulletin (official): &lt;a href="https://vercel.com/kb/bulletin/vercel-april-2026-security-incident" rel="noopener noreferrer"&gt;https://vercel.com/kb/bulletin/vercel-april-2026-security-incident&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stay rotated, devs.&lt;/p&gt;

</description>
      <category>security</category>
      <category>devops</category>
      <category>vercel</category>
      <category>oauth</category>
    </item>
    <item>
      <title>HeyGen HyperFrames: An Open-Source Video Framework Built for AI Agents (Not Humans)</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Mon, 20 Apr 2026 03:15:53 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/heygen-hyperframes-an-open-source-video-framework-built-for-ai-agents-not-humans-mj8</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/heygen-hyperframes-an-open-source-video-framework-built-for-ai-agents-not-humans-mj8</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;HeyGen open-sourced HyperFrames under Apache 2.0 in 2026. Instead of programmable video via React components (like Remotion), you write plain HTML with &lt;code&gt;data-*&lt;/code&gt; attributes and GSAP timelines. The design goal is explicit: &lt;strong&gt;AI coding agents are the primary users&lt;/strong&gt;, not humans.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add heygen-com/hyperframes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This single command installs five slash commands into Claude Code / Cursor / Codex / Gemini CLI and turns your agent into a video editor.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Another Video Framework?
&lt;/h2&gt;

&lt;p&gt;The homepage headline is the thesis statement: &lt;strong&gt;"Now Claude Code can edit videos."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Content automation pipelines have agent-friendly tools for research, writing, and image generation. Video was the missing piece. The question HyperFrames answers is: &lt;strong&gt;"What abstraction level do AI agents handle best?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The answer, according to HeyGen: &lt;strong&gt;HTML&lt;/strong&gt;. Not JSX, not imperative timeline APIs, just HTML.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Primitive
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"root"&lt;/span&gt; &lt;span class="na"&gt;data-composition-id=&lt;/span&gt;&lt;span class="s"&gt;"root"&lt;/span&gt;
     &lt;span class="na"&gt;data-start=&lt;/span&gt;&lt;span class="s"&gt;"0"&lt;/span&gt; &lt;span class="na"&gt;data-width=&lt;/span&gt;&lt;span class="s"&gt;"1920"&lt;/span&gt; &lt;span class="na"&gt;data-height=&lt;/span&gt;&lt;span class="s"&gt;"1080"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;video&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"clip-1"&lt;/span&gt; &lt;span class="na"&gt;data-start=&lt;/span&gt;&lt;span class="s"&gt;"0"&lt;/span&gt; &lt;span class="na"&gt;data-duration=&lt;/span&gt;&lt;span class="s"&gt;"5"&lt;/span&gt; &lt;span class="na"&gt;data-track-index=&lt;/span&gt;&lt;span class="s"&gt;"0"&lt;/span&gt;
         &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"intro.mp4"&lt;/span&gt; &lt;span class="na"&gt;muted&lt;/span&gt; &lt;span class="na"&gt;playsinline&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/video&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;img&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"overlay"&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"clip"&lt;/span&gt; &lt;span class="na"&gt;data-start=&lt;/span&gt;&lt;span class="s"&gt;"2"&lt;/span&gt; &lt;span class="na"&gt;data-duration=&lt;/span&gt;&lt;span class="s"&gt;"3"&lt;/span&gt;
       &lt;span class="na"&gt;data-track-index=&lt;/span&gt;&lt;span class="s"&gt;"1"&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"logo.png"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;audio&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"bg-music"&lt;/span&gt; &lt;span class="na"&gt;data-start=&lt;/span&gt;&lt;span class="s"&gt;"0"&lt;/span&gt; &lt;span class="na"&gt;data-duration=&lt;/span&gt;&lt;span class="s"&gt;"9"&lt;/span&gt; &lt;span class="na"&gt;data-track-index=&lt;/span&gt;&lt;span class="s"&gt;"2"&lt;/span&gt;
         &lt;span class="na"&gt;data-volume=&lt;/span&gt;&lt;span class="s"&gt;"0.5"&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"music.wav"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/audio&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the full mental model. Four clip types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;video&amp;gt;&lt;/code&gt; — must be &lt;code&gt;muted&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; — static visuals&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;audio&amp;gt;&lt;/code&gt; — separated from video&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;div data-composition-id&amp;gt;&lt;/code&gt; — nested compositions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A handful of &lt;code&gt;data-*&lt;/code&gt; attributes cover timing (&lt;code&gt;data-start&lt;/code&gt;, &lt;code&gt;data-duration&lt;/code&gt;), layering (&lt;code&gt;data-track-index&lt;/code&gt;), and optional volume (&lt;code&gt;data-volume&lt;/code&gt;). A &lt;code&gt;class="clip"&lt;/code&gt; tells the framework to honor the &lt;code&gt;data-start&lt;/code&gt;/&lt;code&gt;data-duration&lt;/code&gt; window.&lt;/p&gt;

&lt;h2&gt;
  
  
  Determinism Is Non-Negotiable
&lt;/h2&gt;

&lt;p&gt;One of the seven official "must follow" rules caught my eye:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Math.random() is forbidden.&lt;/strong&gt; If you need randomness, use a seeded PRNG like &lt;code&gt;mulberry32&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That level of commitment to determinism is rare in video tooling. The reasoning is clear: agent-driven pipelines need the same input to produce identical bytes every time; otherwise you cannot put rendering in CI.&lt;/p&gt;
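
For context, `mulberry32` is a tiny public-domain 32-bit seeded PRNG that fits in a few lines. A minimal sketch (my own code, not taken from the HyperFrames docs):

```javascript
// mulberry32: a 32-bit seeded PRNG. Same seed in, same sequence out —
// which is exactly the determinism the renderer needs.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    // Map the 32-bit state to a float in [0, 1), like Math.random()
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two generators seeded identically produce identical streams,
// so a re-render yields the same "random" values every time.
const rand = mulberry32(42);
```

Swap this in anywhere you would have reached for `Math.random()`, with the seed stored in the composition so CI reproduces it.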

&lt;p&gt;Other non-negotiables:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Every timeline must register into &lt;code&gt;window.__timelines&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;video&amp;gt;&lt;/code&gt; elements must be &lt;code&gt;muted&lt;/code&gt; (audio goes into &lt;code&gt;&amp;lt;audio&amp;gt;&lt;/code&gt; tags)&lt;/li&gt;
&lt;li&gt;GSAP timeline construction must be synchronous (no &lt;code&gt;async/await/fetch&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Timed elements require &lt;code&gt;class="clip"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Never call &lt;code&gt;video.play()&lt;/code&gt; or &lt;code&gt;audio.currentTime&lt;/code&gt; from scripts — the framework owns media control&lt;/li&gt;
&lt;li&gt;Every scene needs an entrance animation&lt;/li&gt;
&lt;li&gt;Scenes need transitions between them&lt;/li&gt;
&lt;/ol&gt;
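
Rules 2 and 4 are mechanical enough to check yourself before handing off to the real tooling. An illustrative sketch of that kind of check (my own code and field names, not the built-in `hyperframes lint`):

```javascript
// Hypothetical pre-flight check over parsed clip metadata. The clip shape
// ({ id, tag, muted, dataStart, dataDuration, classes }) is my assumption,
// not the official parser output.
function lintClips(clips) {
  const errors = [];
  for (const c of clips) {
    if (c.tag === 'video') {
      // Rule: video elements must be muted; audio lives in audio tags.
      if (!c.muted) errors.push(c.id + ': video elements must be muted');
    }
    const isTimed = c.dataStart != null || c.dataDuration != null;
    if (isTimed) {
      // Rule: timed elements need class="clip" for the timing window to apply.
      if (!(c.classes || []).includes('clip')) {
        errors.push(c.id + ': timed elements need class="clip"');
      }
    }
  }
  return errors;
}
```

The official linter (`npx hyperframes lint`) covers far more than this; the point is that the rules are declarative enough to verify statically.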

&lt;h2&gt;
  
  
  Natural Language → Technical Mapping
&lt;/h2&gt;

&lt;p&gt;The prompting guide includes a mapping table that does most of the work:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Natural Language&lt;/th&gt;
&lt;th&gt;GSAP Easing&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;smooth&lt;/td&gt;
&lt;td&gt;&lt;code&gt;power2.out&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;snappy&lt;/td&gt;
&lt;td&gt;&lt;code&gt;power4.out&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;bouncy&lt;/td&gt;
&lt;td&gt;&lt;code&gt;back.out&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;springy&lt;/td&gt;
&lt;td&gt;&lt;code&gt;elastic.out&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;dramatic&lt;/td&gt;
&lt;td&gt;&lt;code&gt;expo.out&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;dreamy&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sine.inOut&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The same approach applies to caption tones: "Hype / Corporate / Tutorial / Storytelling / Social" map to specific font weights, entrance animations, and size ranges. The user describes a feeling; the framework resolves it to a technique.&lt;/p&gt;
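
The table is trivially encodable. A hypothetical resolver along those lines (function and object names are mine, not part of the framework):

```javascript
// Map mood words from the prompting guide to GSAP ease strings.
const EASE_BY_MOOD = {
  smooth: 'power2.out',
  snappy: 'power4.out',
  bouncy: 'back.out',
  springy: 'elastic.out',
  dramatic: 'expo.out',
  dreamy: 'sine.inOut',
};

// Resolve a natural-language word to an ease, falling back to a safe default.
function resolveEase(word, fallback = 'power2.out') {
  return EASE_BY_MOOD[String(word).toLowerCase()] ?? fallback;
}

// Then a tween can be driven by the user's vocabulary directly, e.g.:
//   gsap.to('#title', { y: 0, duration: 1, ease: resolveEase('snappy') });
```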

&lt;h2&gt;
  
  
  Two Prompt Modes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cold Start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;10-second product intro, fade-in title, dark background, BGM, corporate mood
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Recommended structure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Duration&lt;/li&gt;
&lt;li&gt;Aspect ratio (16:9 / 9:16 / 1:1)&lt;/li&gt;
&lt;li&gt;Mood (energetic / calm / premium / playful)&lt;/li&gt;
&lt;li&gt;Key elements&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Warm Start
&lt;/h3&gt;

&lt;p&gt;This is where HyperFrames shines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Turn this GitHub repo into a 45-second pitch video
Turn this PDF into a 30-second summary video
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent handles &lt;strong&gt;both research and production&lt;/strong&gt; in a single prompt. The &lt;code&gt;/website-to-hyperframes&lt;/code&gt; slash command is a first-class pipeline for URL → video.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Mistakes (The Debugging Cheat Sheet)
&lt;/h2&gt;

&lt;p&gt;From the official Common Mistakes doc, here are the failure modes I would not have guessed:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Animating video element dimensions
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ Freezes frame rendering&lt;/span&gt;
&lt;span class="nx"&gt;gsap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#video1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1920&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Animate a wrapper div&lt;/span&gt;
&lt;span class="nx"&gt;gsap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.video-wrapper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1920&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Timeline shorter than video
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Extend timeline with a zero-duration set&lt;/span&gt;
&lt;span class="nx"&gt;tl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;({},&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="mi"&gt;283&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Oversized images
&lt;/h3&gt;

&lt;p&gt;A 7000×5000 PNG causes ~140MB decode per frame. Keep images at 2× canvas size max.&lt;/p&gt;
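
The 140MB figure is plain RGBA arithmetic: width × height × 4 bytes per decoded frame. A quick sanity check:

```javascript
// Uncompressed RGBA decode cost: 4 bytes per pixel.
function decodedSizeMB(width, height) {
  return (width * height * 4) / 1e6;
}

decodedSizeMB(7000, 5000); // 140 MB per decoded frame
decodedSizeMB(3840, 2160); // ~33 MB — 2x a 1080p canvas, a reasonable ceiling
```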

&lt;h3&gt;
  
  
  4. Backdrop-filter stacks
&lt;/h3&gt;

&lt;p&gt;Sixteen stacked layers of &lt;code&gt;backdrop-filter: blur()&lt;/code&gt;, recalculated on every frame, will kill render performance. Cap it at 2-3 layers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;Monorepo with clean separation:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Responsibility&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;hyperframes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CLI (create, preview, lint, render)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@hyperframes/core&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Types, parser, linter, runtime, frame adapter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@hyperframes/engine&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Page → video capture (Puppeteer + FFmpeg)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@hyperframes/producer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full pipeline (capture + encode + audio mix)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@hyperframes/studio&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Browser-based composition editor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@hyperframes/player&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Embeddable &lt;code&gt;&amp;lt;hyperframes-player&amp;gt;&lt;/code&gt; web component&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@hyperframes/shader-transitions&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;WebGL shader transitions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The &lt;strong&gt;Frame Adapter&lt;/strong&gt; pattern is the extensibility story. Adapters can bring GSAP, Lottie, CSS animations, or Three.js into the render pipeline. First-mover adapters will probably shape the ecosystem.&lt;/p&gt;
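
The article doesn't spell out the adapter interface, but the shape it implies is small: an adapter owns a duration and a deterministic seek. A hypothetical sketch (all names are my guesses, not the official API):

```javascript
// Hypothetical frame adapter: the renderer steps time explicitly rather than
// letting an animation engine free-run, so any engine that can "seek to t"
// can plug in (GSAP, Lottie, CSS, Three.js).
function makeTweenAdapter({ duration, apply }) {
  return {
    duration,
    seek(t) {
      // Clamp to [0, 1]; same t in, same frame out — no wall-clock dependence.
      const progress = Math.min(Math.max(t / duration, 0), 1);
      apply(progress); // e.g. set element styles for this exact frame
    },
  };
}

// A renderer then samples it at fixed timestamps:
const frames = [];
const adapter = makeTweenAdapter({ duration: 2, apply: p => frames.push(p) });
[0, 1, 2].forEach(t => adapter.seek(t));
// frames → [0, 0.5, 1]
```

Seek-based stepping is the same trick Puppeteer-plus-FFmpeg pipelines generally rely on: capture frame N, advance virtual time, capture frame N+1.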

&lt;h2&gt;
  
  
  TTS Is Built-In
&lt;/h2&gt;

&lt;p&gt;Kokoro TTS runs locally, no API key required:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx hyperframes tts &lt;span class="nt"&gt;--text&lt;/span&gt; &lt;span class="s2"&gt;"Hello world"&lt;/span&gt; &lt;span class="nt"&gt;--voice&lt;/span&gt; af_heart &lt;span class="nt"&gt;--output&lt;/span&gt; narration.wav
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Recommended voices by use case:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product demos: &lt;code&gt;af_heart&lt;/code&gt;, &lt;code&gt;af_nova&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Tutorials: &lt;code&gt;am_adam&lt;/code&gt;, &lt;code&gt;bf_emma&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Marketing: &lt;code&gt;af_sky&lt;/code&gt;, &lt;code&gt;am_michael&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Component Registry
&lt;/h2&gt;

&lt;p&gt;Over 50 blocks are registered and installable via CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx hyperframes add flash-through-white
npx hyperframes add instagram-follow
npx hyperframes add data-chart
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Categories include social overlays, shader transitions, data visualizations, and cinematic effects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Workflow I Would Adopt
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;npx hyperframes init my-video&lt;/code&gt; (installs skill automatically)&lt;/li&gt;
&lt;li&gt;Open in Claude Code / Cursor / Codex&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/hyperframes&lt;/code&gt; with a warm start prompt pointing to source material&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;npx hyperframes preview&lt;/code&gt; for browser live reload&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Small, targeted follow-up prompts&lt;/strong&gt;: "make the title 2x larger", "add a fade-out at the end"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;npx hyperframes lint&lt;/code&gt; to catch structural issues&lt;/li&gt;
&lt;li&gt;&lt;code&gt;npx hyperframes render --preset high --output final.mp4&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Anti-Patterns to Avoid
&lt;/h2&gt;

&lt;p&gt;From the prompting guide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Asking for React/Vue components&lt;/strong&gt; — adds a translation layer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Requesting 4K/60fps&lt;/strong&gt; — 1920×1080 30fps is the sweet spot for speed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skipping the slash command&lt;/strong&gt; — the agent will fall back to generic HTML video conventions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Giant monolithic prompts&lt;/strong&gt; — targeted, iterative edits beat one-shot mega-prompts&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Node.js 22+&lt;/li&gt;
&lt;li&gt;FFmpeg&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the entire system requirement list.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;The design signals a specific bet: &lt;strong&gt;the future of content tooling is agent-primary, human-secondary&lt;/strong&gt;. Most frameworks treat agent support as a retrofit; HyperFrames treats it as the foundational design constraint. Whether or not that bet pays off, the engineering choices (HTML-first, deterministic rendering, slash-command integration) are worth studying, whichever tool you end up using.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Homepage: &lt;a href="https://hyperframes.heygen.com/" rel="noopener noreferrer"&gt;https://hyperframes.heygen.com/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Prompting guide: &lt;a href="https://hyperframes.heygen.com/guides/prompting" rel="noopener noreferrer"&gt;https://hyperframes.heygen.com/guides/prompting&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Compositions concept: &lt;a href="https://hyperframes.heygen.com/concepts/compositions" rel="noopener noreferrer"&gt;https://hyperframes.heygen.com/concepts/compositions&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Common mistakes: &lt;a href="https://hyperframes.heygen.com/guides/common-mistakes" rel="noopener noreferrer"&gt;https://hyperframes.heygen.com/guides/common-mistakes&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/heygen-com/hyperframes" rel="noopener noreferrer"&gt;https://github.com/heygen-com/hyperframes&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Building a Hermes Fleet: Reproducing a Self-Hosted Agent Stack in a Weekend (mem0 + Qdrant + Ollama + Claude Code Stop hook)</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Fri, 17 Apr 2026 01:28:53 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/building-a-hermes-fleet-reproducing-a-self-hosted-agent-stack-in-a-weekend-mem0-qdrant-ollama-3l5c</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/building-a-hermes-fleet-reproducing-a-self-hosted-agent-stack-in-a-weekend-mem0-qdrant-ollama-3l5c</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;@Mosescreates posted his Hermes fleet — six agent profiles across two machines, all writing to a shared memory layer made of mem0 + Qdrant + Ollama. I spent the weekend reproducing a stripped-down version with two profiles on a single MacBook. The architecture held.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Load-bearing pieces (don't skip):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Shared Qdrant collection&lt;/li&gt;
&lt;li&gt;Local Ollama embeddings&lt;/li&gt;
&lt;li&gt;mem0 as memory abstraction&lt;/li&gt;
&lt;li&gt;Claude Code Stop hook writing each turn&lt;/li&gt;
&lt;li&gt;Native-only OpenRouter provider pinning&lt;/li&gt;
&lt;li&gt;Local LLM fallback (Gemma 2 9B 4-bit in my case)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Everything else — second machine, launchd units, backup cron, fleet status CLIs — is nice to have.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────┐        ┌──────────────────┐
│ Claude Code    │        │ Telegram Agent   │
└────────┬───────┘        └────────┬─────────┘
         │ Stop hook               │ on every reply
         ▼                         ▼
┌────────────────────────────────────────────┐
│             mem0 (Python SDK)              │
│ LLM: OpenRouter (primary) / Ollama fallback│
│ Embedder: Ollama nomic-embed-text (768d)   │
└────────────────────┬───────────────────────┘
                     ▼
         ┌─────────────────────────┐
         │ Qdrant (Docker)         │
         │ collection: fleet_memory│
         └─────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every tool is a reader &lt;em&gt;and&lt;/em&gt; a writer of the same store. That's the whole idea.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bringing up the three services
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Qdrant&lt;/span&gt;
docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; qdrant &lt;span class="nt"&gt;-p&lt;/span&gt; 6333:6333 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/qdrant_data:/qdrant/storage"&lt;/span&gt; qdrant/qdrant

&lt;span class="c"&gt;# Ollama + embedding model&lt;/span&gt;
ollama pull nomic-embed-text
ollama serve &amp;amp;

&lt;span class="c"&gt;# mem0&lt;/span&gt;
python3.11 &lt;span class="nt"&gt;-m&lt;/span&gt; venv ~/.venv/fleet &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; ~/.venv/fleet/bin/activate
pip &lt;span class="nb"&gt;install &lt;/span&gt;mem0ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Config that actually works
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ~/.mem0/config.yaml&lt;/span&gt;
&lt;span class="na"&gt;vector_store&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;qdrant&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;127.0.0.1&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;6333&lt;/span&gt;
    &lt;span class="na"&gt;collection_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;fleet_memory&lt;/span&gt;
    &lt;span class="na"&gt;embedding_model_dims&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;768&lt;/span&gt;   &lt;span class="c1"&gt;# &amp;lt;- don't omit this&lt;/span&gt;

&lt;span class="na"&gt;llm&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;llama3.1:8b&lt;/span&gt;
    &lt;span class="na"&gt;ollama_base_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:11434&lt;/span&gt;

&lt;span class="na"&gt;embedder&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nomic-embed-text&lt;/span&gt;
    &lt;span class="na"&gt;ollama_base_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:11434&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things that differ from the original snippet:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;All three blocks&lt;/strong&gt; (&lt;code&gt;vector_store&lt;/code&gt;, &lt;code&gt;llm&lt;/code&gt;, &lt;code&gt;embedder&lt;/code&gt;) are specified. mem0's defaults are OpenAI &lt;code&gt;gpt-4o&lt;/code&gt; + &lt;code&gt;text-embedding-3-small&lt;/code&gt;. If you only override the embedder, mem0 silently still calls OpenAI for the LLM piece and you get a confusing 401.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;embedding_model_dims: 768&lt;/code&gt;&lt;/strong&gt; is explicit. nomic-embed-text returns 768-dim vectors; mem0's Qdrant default assumes 1536 (OpenAI). Missing this causes silent insert failures — see &lt;a href="https://github.com/mem0ai/mem0/issues/3441" rel="noopener noreferrer"&gt;mem0 issue #3441&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;127.0.0.1&lt;/code&gt;&lt;/strong&gt; over &lt;code&gt;localhost&lt;/code&gt;. Happy Eyeballs bit me multiple times on macOS when services resolved to both IPv6 and IPv4.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Claude Code Stop hook
&lt;/h2&gt;

&lt;p&gt;This is the piece that wires Claude Code into the shared memory. Settings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Stop"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/Users/you/.venv/fleet/bin/python3 /Users/you/.claude/hooks/mem_broadcast.py"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: use the venv Python explicitly. The hook runs with a non-login shell env, so &lt;code&gt;python3&lt;/code&gt; resolves to system Python, which doesn't have mem0 installed.&lt;/p&gt;

&lt;h3&gt;
  
  
  The hook script
&lt;/h3&gt;

&lt;p&gt;The original blogpost uses &lt;code&gt;payload.get("transcript", [])&lt;/code&gt;. The actual &lt;a href="https://docs.anthropic.com/en/docs/claude-code/hooks" rel="noopener noreferrer"&gt;Stop hook payload&lt;/a&gt; gives you &lt;code&gt;transcript_path&lt;/code&gt; — a JSONL file path. You have to open and parse it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env python3
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Claude Code Stop hook → mem0 writer with redaction + idempotency.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mem0&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Memory&lt;/span&gt;

&lt;span class="n"&gt;REDACT_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-[A-Za-z0-9_-]{20,}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ghp_[A-Za-z0-9]{36,}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer\s+[A-Za-z0-9._\-]+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;redact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pat&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;REDACT_PATTERNS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[REDACTED]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdin&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="c1"&gt;# Prevent infinite loops
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop_hook_active&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="n"&gt;transcript_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transcript_path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;transcript_path&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcript_path&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="n"&gt;turns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcript_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;turns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;

    &lt;span class="n"&gt;user_turn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;reversed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;turns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;assistant_turn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;reversed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;turns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;user_turn&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;assistant_turn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unknown&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;turn_index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;turns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;user_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;redact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_turn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)))[:&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;assistant_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;redact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_turn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)))[:&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expanduser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;~/.mem0/config.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User asked: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Assistant answered: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;assistant_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fleet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;turn_index&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;turn_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idempotency_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;turn_index&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Known Stop hook bug
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/anthropics/claude-code/issues/11786" rel="noopener noreferrer"&gt;Claude Code issue #11786&lt;/a&gt; reports a regression in v2.0.42+ where prompt-based Stop hooks can't access transcript content. Reading &lt;code&gt;transcript_path&lt;/code&gt; directly (as above) mostly dodges it, but I occasionally saw the last JSONL turn not yet flushed when the hook fired — adding &lt;code&gt;time.sleep(0.2)&lt;/code&gt; at the top of &lt;code&gt;main()&lt;/code&gt; smoothed it over in practice.&lt;/p&gt;
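&lt;p&gt;A bounded retry reads cleaner than a blind sleep. The following is a sketch of that idea, not the hook's actual code; the retry count and delay are guesses tuned to my machine, and any line that fails to parse is treated as a half-flushed final line:&lt;/p&gt;

```python
import json
import time

def read_turns(transcript_path, retries=3, delay=0.2):
    """Read JSONL turns from the Stop hook's transcript_path,
    retrying briefly in case the last line hasn't been flushed
    when the hook fires."""
    turns = []
    for _ in range(retries):
        turns = []
        clean = True
        with open(transcript_path, "r", encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    turns.append(json.loads(line))
                except json.JSONDecodeError:
                    # Likely a partially written final line; retry.
                    clean = False
        if clean:
            return turns
        time.sleep(delay)
    return turns  # best effort after exhausting retries
```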

&lt;h2&gt;
  
  
  The four-line YAML that matters most
&lt;/h2&gt;

&lt;p&gt;OpenRouter defaults to falling back to cheaper/faster providers. That's fine for a chat UI, catastrophic for an agent that writes to shared memory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;primary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openrouter&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;qwen/qwen-3-72b-instruct&lt;/span&gt;
  &lt;span class="na"&gt;provider_config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;only&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alibaba"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;allow_fallbacks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

&lt;span class="na"&gt;fallback&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gemma2:9b-instruct-q4_0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenRouter's &lt;a href="https://openrouter.ai/docs/guides/routing/provider-selection" rel="noopener noreferrer"&gt;provider routing docs&lt;/a&gt; officially support &lt;code&gt;provider.only&lt;/code&gt; + &lt;code&gt;allow_fallbacks: false&lt;/code&gt;. If the pinned provider is down, the call fails loudly instead of silently drifting to another one. Loud failure is what you want when memory consistency is on the line.&lt;/p&gt;
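&lt;p&gt;The same pinning applies to raw API calls, not just the profile YAML. A minimal sketch of the request body, carrying over the model slug and provider name from the config above; check the routing docs for the current field set before relying on it:&lt;/p&gt;

```python
import json
import urllib.request

def build_request(prompt: str) -> dict:
    """Chat-completion payload with the provider pinned and
    fallbacks disabled, mirroring the YAML profile above."""
    return {
        "model": "qwen/qwen-3-72b-instruct",
        "messages": [{"role": "user", "content": prompt}],
        "provider": {
            "only": ["alibaba"],
            "allow_fallbacks": False,  # fail loudly instead of drifting
        },
    }

def send(payload: dict, api_key: str) -> bytes:
    """POST the payload to OpenRouter's chat completions endpoint."""
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```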

&lt;h2&gt;
  
  
  Local fallback from day one
&lt;/h2&gt;

&lt;p&gt;At one point the network dropped for 30 seconds mid-conversation, and the user barely noticed: memory writes kept going against local Gemma, and recovery was silent. Pull the model once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull gemma2:9b-instruct-q4_0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
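&lt;p&gt;The routing itself is nothing fancy: try the primary, catch, call the fallback. A generic sketch of that shape (the helper name is mine, not from any library):&lt;/p&gt;

```python
def with_fallback(primary, fallback):
    """Return a callable that tries the primary provider and
    silently falls back to the local one on any failure."""
    def call(*args, **kwargs):
        try:
            return primary(*args, **kwargs)
        except Exception:
            return fallback(*args, **kwargs)
    return call
```

&lt;p&gt;Wrap the OpenRouter call as &lt;code&gt;primary&lt;/code&gt; and the local Ollama call as &lt;code&gt;fallback&lt;/code&gt;; as long as both accept the same prompt and return the same shape, callers never notice the switch.&lt;/p&gt;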



&lt;h2&gt;
  
  
  Three lessons the hard way
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Happy Eyeballs is real.&lt;/strong&gt; Pin services to &lt;code&gt;127.0.0.1&lt;/code&gt;, set &lt;code&gt;AddressFamily inet&lt;/code&gt; in &lt;code&gt;~/.ssh/config&lt;/code&gt;, use &lt;code&gt;curl -4&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotency keys aren't optional.&lt;/strong&gt; I skipped them for day one and by Sunday I had duplicate memory entries from mem0 retries after a Qdrant timeout. &lt;code&gt;session_id:turn_index&lt;/code&gt; — do it from the start.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-hosted embeddings are self-ownership.&lt;/strong&gt; Cost was a trivial amount of electricity. Privacy is total. If OpenRouter vanishes tomorrow, my memory layer is untouched.&lt;/li&gt;
&lt;/ol&gt;
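&lt;p&gt;Lesson 2 can be enforced with a tiny local ledger of keys already written. This is a sketch around a generic &lt;code&gt;write&lt;/code&gt; callable, not a mem0 API, and the ledger path is an arbitrary choice of mine:&lt;/p&gt;

```python
import json
from pathlib import Path

def write_once(key, write, ledger_path="~/.mem0/written_keys.json"):
    """Call write() only if key (e.g. "session_id:turn_index") hasn't
    been recorded yet. Returns True if a write actually happened."""
    path = Path(ledger_path).expanduser()
    seen = set()
    if path.exists():
        seen = set(json.loads(path.read_text(encoding="utf-8")))
    if key in seen:
        return False  # retry of an already-stored turn; skip it
    write()
    seen.add(key)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(sorted(seen)), encoding="utf-8")
    return True
```

&lt;p&gt;The ledger is updated after the write, so a crash in between can still duplicate one entry; for a weekend stack that is an acceptable failure mode.&lt;/p&gt;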

&lt;h2&gt;
  
  
  The qualitative part
&lt;/h2&gt;

&lt;p&gt;After roughly 10 minutes of Claude Code work, I asked my Telegram bot "what was I working on?" and it answered coherently from the shared store. I kept catching myself switching contexts without having to re-explain anything.&lt;/p&gt;

&lt;p&gt;None of this is novel tech. The work is in the wiring — making every tool a reader/writer of the same store, plus enough discipline around idempotency and redaction to trust what's in there.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;Moshe's "nothing lock-in" point hits harder after doing it. When something broke I fixed it. When I wanted a new profile I copied four lines of YAML. No platform was in the loop.&lt;/p&gt;

&lt;p&gt;Start with Qdrant + Ollama + mem0. Get the Claude Code Stop hook writing to the store. Everything else builds from there.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Credit&lt;/strong&gt;: Moshe (@Mosescreates) for the original Hermes thread. mem0 / Qdrant / Ollama / Claude Code teams for the underlying pieces. This is just a weekend reproduction.&lt;/p&gt;

&lt;p&gt;Full Korean writeup with additional troubleshooting: &lt;a href="https://qjc.app/blog/hermes-fleet-self-hosted-agent-stack" rel="noopener noreferrer"&gt;https://qjc.app/blog/hermes-fleet-self-hosted-agent-stack&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>OpenAI Agents SDK v0.14 — Sandbox Agents and the Model-Native Harness Go GA</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Fri, 17 Apr 2026 01:16:51 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/openai-agents-sdk-v014-sandbox-agents-and-the-model-native-harness-go-ga-49ej</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/openai-agents-sdk-v014-sandbox-agents-and-the-model-native-harness-go-ga-49ej</guid>
      <description>&lt;p&gt;On April 15, 2026, OpenAI &lt;a href="https://openai.com/index/the-next-evolution-of-the-agents-sdk/" rel="noopener noreferrer"&gt;announced the next evolution of the Agents SDK&lt;/a&gt; and shipped &lt;code&gt;openai-agents-python&lt;/code&gt; v0.14.0/v0.14.1 with two headline features: &lt;strong&gt;Sandbox Agents&lt;/strong&gt; and a &lt;strong&gt;model-native harness&lt;/strong&gt;. Both are now GA for all API customers.&lt;/p&gt;

&lt;p&gt;The one-sentence read: &lt;em&gt;the default shape of "an AI agent" just moved from a chatbot with tool calls to a coworker with a computer.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The three trade-offs OpenAI is trying to escape
&lt;/h2&gt;

&lt;p&gt;OpenAI frames the existing landscape as three approaches, each with a trade-off:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Model-agnostic frameworks&lt;/strong&gt; (LangChain, LlamaIndex, etc.) — flexible across models, but can't fully exploit frontier model capabilities. There's always a gap between "what the latest model can do" and "what the framework surfaces."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Provider SDKs&lt;/strong&gt; — closest to the model, but thin on harness observability. Hard to see what a production agent is actually doing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Managed agent APIs&lt;/strong&gt; — easy to deploy, but constrain execution location and data access.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The response: &lt;strong&gt;split harness and compute&lt;/strong&gt;. The harness owns the loop, memory, orchestration, observability. The sandbox owns execution, filesystem, network, tool calls. They run in different isolation domains.&lt;/p&gt;

&lt;p&gt;The primary reason is security. If model-generated code executes in the same environment as your credentials, you're one prompt injection away from exfiltration. Long-running agents also need to survive environment loss — which a clean split makes tractable.&lt;/p&gt;

&lt;h2&gt;
  
  
  What landed on the harness side
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Configurable memory&lt;/td&gt;
&lt;td&gt;Reuse lessons across runs, progressive disclosure instead of shoving everything into context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sandbox-aware orchestration&lt;/td&gt;
&lt;td&gt;Route sub-agents into isolated sandboxes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Codex-like filesystem tools&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;read&lt;/code&gt;/&lt;code&gt;write&lt;/code&gt;/&lt;code&gt;edit&lt;/code&gt; as first-class&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP support&lt;/td&gt;
&lt;td&gt;Native Model Context Protocol (Anthropic's open standard)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Skills&lt;/td&gt;
&lt;td&gt;Progressive disclosure of domain capabilities&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AGENTS.md&lt;/td&gt;
&lt;td&gt;Project-level custom instructions file&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shell tool&lt;/td&gt;
&lt;td&gt;First-class shell command execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apply patch tool&lt;/td&gt;
&lt;td&gt;First-class file patching&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Two items stand out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AGENTS.md&lt;/strong&gt; is essentially &lt;code&gt;CLAUDE.md&lt;/code&gt; with a different name. If you've used Claude Code, the concept is identical: a project-root file containing custom instructions for the agent. OpenAI officially adopting this convention is itself a signal — "project-level agent config file" is becoming a cross-provider pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP support&lt;/strong&gt; means OpenAI officially endorsed Anthropic's Model Context Protocol. When both major model providers support MCP, the tool ecosystem becomes portable. This is a bigger deal than the harness features IMO — it's infrastructure-level consolidation.&lt;/p&gt;

&lt;h2&gt;
  
  
  What landed on the sandbox side
&lt;/h2&gt;

&lt;p&gt;Seven default providers ship out of the box:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Blaxel&lt;/li&gt;
&lt;li&gt;Cloudflare&lt;/li&gt;
&lt;li&gt;Daytona&lt;/li&gt;
&lt;li&gt;E2B&lt;/li&gt;
&lt;li&gt;Modal&lt;/li&gt;
&lt;li&gt;Runloop&lt;/li&gt;
&lt;li&gt;Vercel&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Plus BYO sandbox for anyone who wants to run their own isolation layer.&lt;/p&gt;

&lt;p&gt;The most interesting piece is the &lt;strong&gt;Manifest abstraction&lt;/strong&gt; — a portable format describing an agent workspace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local file mounts&lt;/li&gt;
&lt;li&gt;Output directories&lt;/li&gt;
&lt;li&gt;Storage integrations (AWS S3, Google Cloud Storage, Azure Blob Storage, Cloudflare R2)&lt;/li&gt;
&lt;li&gt;Git repository cloning&lt;/li&gt;
&lt;/ul&gt;
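&lt;p&gt;The announcement doesn't include a canonical example, so here is a purely illustrative sketch of a workspace description covering the four capabilities above; every field name is an assumption, not the SDK's actual schema:&lt;/p&gt;

```yaml
# Hypothetical manifest sketch -- field names are illustrative only
workspace:
  mounts:
    - local: ./dataroom          # local file mount
      mode: read-only
  outputs:
    - ./output                   # output directory
  storage:
    - type: s3                   # storage integration (S3/GCS/Azure/R2)
      bucket: my-agent-artifacts
  repos:
    - url: https://github.com/openai/openai-agents-python
      ref: main                  # git repository cloning
```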

&lt;p&gt;If the sandbox container dies, the Manifest lets you reconstruct an identical workspace elsewhere. This is what unlocks the next piece:&lt;/p&gt;

&lt;h3&gt;
  
  
  Durable execution
&lt;/h3&gt;

&lt;p&gt;Agent state is externalized from the container. Snapshotting and rehydration are built in. If an existing environment fails or expires, the agent resumes from its last checkpoint in a new container.&lt;/p&gt;

&lt;p&gt;For long-running tasks — bulk translations, large refactors, overnight research — this is the difference between "I can run this" and "I can run this safely."&lt;/p&gt;

&lt;h3&gt;
  
  
  Permissions
&lt;/h3&gt;

&lt;p&gt;Per-Manifest-entry file permissions, mapped to Unix filesystem semantics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dataroom/   → read-only
output/     → writable
config/     → inaccessible
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not a convenience feature. It's a structural limit on blast radius from prompt injection. Injection isn't prevented, but what an injected instruction can &lt;em&gt;actually do&lt;/em&gt; is bounded.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scaling
&lt;/h3&gt;

&lt;p&gt;One agent run can use one sandbox or many. Sandboxes spin up only when needed, sub-agents route to isolated containers, and container-level parallelism buys speed, so cost and performance can be tuned together.&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability and pricing
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GA&lt;/strong&gt; for all OpenAI API customers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard API pricing&lt;/strong&gt; — tokens + tool usage. No new tier.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Python first&lt;/strong&gt; (v0.14.0+). TypeScript coming later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Roadmap&lt;/strong&gt;: Code mode and Subagents expanding to Python and TypeScript.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ecosystem size at the time of this release: ~14.7M monthly Python downloads, ~1.5M TypeScript downloads, 19,000+ GitHub stars, 250+ contributors.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Claude Code convergence
&lt;/h2&gt;

&lt;p&gt;If you look at the naming, the parallel with Anthropic's Claude Code is hard to miss:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;OpenAI Agents SDK&lt;/th&gt;
&lt;th&gt;Anthropic Claude Code&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AGENTS.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CLAUDE.md&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Skills&lt;/td&gt;
&lt;td&gt;Skills&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shell tool&lt;/td&gt;
&lt;td&gt;Bash tool&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apply patch tool&lt;/td&gt;
&lt;td&gt;Edit tool&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both companies converged on the same answer to "what does an AI agent need?" The answer is: shell, filesystem, project-level instructions, tool-use standard (MCP), and a sandbox to do it all safely.&lt;/p&gt;

&lt;p&gt;This matters because it redefines the default. "AI agent = chatbot with tools" is dead. "AI agent = semi-autonomous process with a computer" is the new baseline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical angles for small teams
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Wire Manifest to your data layer directly
&lt;/h3&gt;

&lt;p&gt;Connect S3 or Cloudflare R2 through a Manifest entry. The agent reads source data and writes results itself — no wrapper scripts. This collapses a lot of "data prep → run → save results" plumbing.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Safe overnight jobs
&lt;/h3&gt;

&lt;p&gt;With Durable execution, "run 50 translations overnight" is no longer a gamble. Container crashes resume from checkpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Security-by-default code agents
&lt;/h3&gt;

&lt;p&gt;Permissions structurally prevent common mistakes like "agent has access to &lt;code&gt;.env&lt;/code&gt; and leaks credentials." Solo devs especially benefit — the security layer isn't "remember to configure it," it's "the Manifest says dataroom is read-only."&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Isolated sub-agent parallelism
&lt;/h3&gt;

&lt;p&gt;Complex tasks split into sub-agents, each in its own sandbox, running in parallel. A pattern that was DIY is now production-ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trade-offs worth watching
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;TypeScript lag.&lt;/strong&gt; Python-first is painful for TS/frontend-heavy teams. TypeScript is on the roadmap but timing isn't public.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Provider lock-in risk.&lt;/strong&gt; Seven sandbox providers is great on paper, but depth of integration with any one will make migration costly. Manifest promises portability — real-world portability always has edges.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Billing predictability.&lt;/strong&gt; Three cost axes (sandbox compute + tokens + tool usage) make budget modeling harder. Monitor carefully in the early rollout.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt injection is not solved.&lt;/strong&gt; Harness/compute split limits damage. It doesn't prevent injection. Input validation, least-privilege, and audit logging are still on you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bottom line
&lt;/h2&gt;

&lt;p&gt;"AI agent" as an abstraction shifted from "a model that chats and calls tools" to "a semi-autonomous process with sandboxed compute, durable state, and Unix-style permissions." OpenAI formalized the direction Claude Code had been demonstrating, both companies are on MCP, and the practical implications for small teams are real — overnight jobs, direct data-layer access, structural security.&lt;/p&gt;

&lt;p&gt;One more stack to learn. Probably worth it.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/the-next-evolution-of-the-agents-sdk/" rel="noopener noreferrer"&gt;The next evolution of the Agents SDK — OpenAI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.com/ko-KR/index/the-next-evolution-of-the-agents-sdk/" rel="noopener noreferrer"&gt;Korean version&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.github.io/openai-agents-python/release/" rel="noopener noreferrer"&gt;Release notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/openai/openai-agents-python/" rel="noopener noreferrer"&gt;openai-agents-python on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/blog/skills-agents-sdk" rel="noopener noreferrer"&gt;Skills + AGENTS.md on the OpenAI Developers blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://community.openai.com/t/the-next-evolution-of-the-agents-sdk/1379072" rel="noopener noreferrer"&gt;Community announcement thread&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>GPT Image 2 Leak: 5 Things Developers Should Do Right Now</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Fri, 17 Apr 2026 01:16:46 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/gpt-image-2-leak-gaebaljaga-jigeum-dangjang-haeya-hal-5gaji-2pb1</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/gpt-image-2-leak-gaebaljaga-jigeum-dangjang-haeya-hal-5gaji-2pb1</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;The three anonymous models that appeared on LM Arena on April 4, 2026 (maskingtape-alpha / gaffertape-alpha / packingtape-alpha) are presumed to be GPT Image 2. The headline features are near-perfect text rendering and native 4K resolution. With DALL-E scheduled to shut down on May 12, this is the last window to prepare your API migration.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Changing
&lt;/h2&gt;

&lt;p&gt;Here are the five key GPT Image 2 upgrades, framed from a developer's point of view.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Near-perfect text rendering&lt;/strong&gt; → revision round-trips on UI mockups disappear&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native 4K&lt;/strong&gt; → 2048x2048 by default, with 4K upscaling supported&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Photorealism&lt;/strong&gt; → the leaked beach selfie that AI detectors failed to flag&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stronger world knowledge&lt;/strong&gt; → accurate rendering of real brands and places&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standalone architecture&lt;/strong&gt; → a dedicated image model, not built on GPT-4o&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Current State, in Code
&lt;/h2&gt;

&lt;p&gt;The existing OpenAI image API looks like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;images&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-image-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A modern dashboard UI with pricing cards&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1536x1024&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;quality&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;high&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;b64_json&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When GPT Image 2 officially launches, the model name and size values will likely change. The expected shape looks like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 예상 (공식 스펙 미공개)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;images&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-image-2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="c1"&gt;// maskingtape/gaffertape/packingtape 중 하나가 stable이 될 가능성&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A modern dashboard UI with pricing cards, 'Upgrade now' button&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2048x2048&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;              &lt;span class="c1"&gt;// 네이티브 4K&lt;/span&gt;
  &lt;span class="na"&gt;quality&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;standard&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hd&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;// hd가 4K 업스케일&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Migration Prep Code
&lt;/h2&gt;

&lt;p&gt;Building an adapter pattern now for DALL-E-dependent code makes the switch to the GPT Image family painless.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// image-provider.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PROVIDERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;dall-e-3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;dall-e-3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1792x1024&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;deprecated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-05-12&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-image-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-image-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1536x1024&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;deprecated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-image-2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-image-2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2048x2048&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;deprecated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;native4k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;PROVIDERS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-image-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;deprecated&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;deprecated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; deprecated on &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;deprecated&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;images&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;maxSize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;quality&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;quality&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;standard&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
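&lt;p&gt;A usage sketch for the adapter above, under one assumption: until GPT Image 2 actually ships, requesting it should fail, so we fall through an ordered model list. &lt;code&gt;generate&lt;/code&gt; here is a stand-in for the adapter's &lt;code&gt;generateImage&lt;/code&gt;, injected so the helper stays self-contained:&lt;/p&gt;

```javascript
// Try each model in order until one succeeds. Before gpt-image-2 exists
// on the API, the unknown-model error simply routes us to gpt-image-1.
async function generateWithFallback(generate, prompt, models = ["gpt-image-2", "gpt-image-1"]) {
  let lastError;
  for (const model of models) {
    try {
      return await generate(prompt, { model });
    } catch (err) {
      lastError = err; // e.g. an unknown-model error until the release lands
    }
  }
  throw lastError;
}
```

&lt;p&gt;Reorder the list once the official announcement confirms the real model ID.&lt;/p&gt;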



&lt;h2&gt;
  
  
  Prompt Strategy Changes
&lt;/h2&gt;

&lt;p&gt;With the existing GPT Image 1, text rendered inside images had to be kept short.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// BAD: 1.5 버전에서 자주 깨짐&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A billboard with the long slogan: 'The future of AI is not predicting but creating, one pixel at a time'&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// GOOD: 1.5 버전에서도 안정적&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A billboard with the text 'Future is Now' in bold white&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GPT Image 2 reportedly renders even long-form text near-perfectly, so the prompt-length constraint should loosen.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// GPT Image 2 예상: 장문 텍스트도 안정적&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A startup pitch slide titled 'Q2 Growth Metrics'. Three data cards: 'Users: +42%', 'Revenue: $1.2M MRR', 'Churn: 2.1%'. Clean dark theme, white text, purple accents.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
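&lt;p&gt;Until that holds, it can help to lint prompts before sending them. A minimal sketch, assuming a rough 30-character threshold for quoted in-image text; the threshold is a guess to tune empirically, not a documented limit:&lt;/p&gt;

```javascript
// Warn before sending a prompt whose quoted in-image text is long enough
// for GPT Image 1 to garble. Tune maxChars against your own failure cases.
function quotedTextTooLong(prompt, maxChars = 30) {
  const quoted = prompt.match(/'([^']*)'/g) || [];
  // match() keeps the surrounding quotes, so subtract 2 from each length.
  return quoted.some((q) => q.length - 2 > maxChars);
}
```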



&lt;h2&gt;
  
  
  Checklist: What to Do Now
&lt;/h2&gt;

&lt;p&gt;What developers should have ready before the release.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Audit every existing DALL-E call site (they break after 5/12)&lt;/li&gt;
&lt;li&gt;[ ] Re-check CDN and storage capacity with 4K output in mind&lt;/li&gt;
&lt;li&gt;[ ] Prepare length-tiered test cases in your prompt repository&lt;/li&gt;
&lt;li&gt;[ ] Add cost-tracking code to absorb API pricing changes&lt;/li&gt;
&lt;li&gt;[ ] Introduce an image-provider adapter layer (see the code above)&lt;/li&gt;
&lt;/ul&gt;
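&lt;p&gt;For the cost-tracking item, a small accumulator is enough to start. The per-image prices below are placeholders, not published rates:&lt;/p&gt;

```javascript
// Hypothetical per-image prices; swap in real rates once published.
const PRICE_PER_IMAGE = { "gpt-image-1": 0.04, "gpt-image-2": 0.08 };

function createCostTracker() {
  const totals = {};
  return {
    // Record `count` generated images for `model` at the table rate.
    record(model, count = 1) {
      const unit = PRICE_PER_IMAGE[model];
      if (unit === undefined) throw new Error(`unknown model: ${model}`);
      totals[model] = (totals[model] || 0) + unit * count;
    },
    // Total estimated spend across all models.
    total() {
      return Object.values(totals).reduce((sum, v) => sum + v, 0);
    },
    // Per-model breakdown, copied so callers can't mutate internal state.
    byModel() {
      return { ...totals };
    },
  };
}
```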

&lt;h2&gt;
  
  
  One Caveat
&lt;/h2&gt;

&lt;p&gt;"GPT Image 2"라는 이름 자체는 아직 OpenAI 공식 명칭이 아닙니다. 커뮤니티 추정입니다. 다만 LM Arena 등장과 A/B 트리거 보고는 팩트라 실체가 있다는 것은 확실합니다.&lt;/p&gt;

&lt;p&gt;Once the official announcement lands, just swap in the real model name and size parameters. Until then, have the migration structure ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/new-chatgpt-images-is-here/" rel="noopener noreferrer"&gt;OpenAI 공식 이미지 생성 소개&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/api/docs/guides/image-generation" rel="noopener noreferrer"&gt;OpenAI API 이미지 생성 가이드&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://getimg.ai/blog/gpt-image-2-rumours-leaks-release-date-2026" rel="noopener noreferrer"&gt;GetImg: GPT Image 2 루머 분석&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://officechai.com/ai/three-image-generation-models-named-maskingtape-gaffertape-and-packingtape-create-buzz-on-arena-rumoured-to-be-openais-gpt-image-2/" rel="noopener noreferrer"&gt;OfficeChai: LM Arena 세 모델 보고&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Claude Opus 4.7 is out — what actually changed for developers</title>
      <dc:creator>정상록</dc:creator>
      <pubDate>Fri, 17 Apr 2026 01:16:11 +0000</pubDate>
      <link>https://forem.com/_46ea277e677b888e0cd13/claude-opus-47-is-out-what-actually-changed-for-developers-k73</link>
      <guid>https://forem.com/_46ea277e677b888e0cd13/claude-opus-47-is-out-what-actually-changed-for-developers-k73</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Anthropic dropped Claude Opus 4.7 yesterday (April 16, 2026). Price is identical to 4.6 ($5/$25 per M tokens) but SWE-bench Pro rose from 53.4% → 64.3% — actually passing GPT-5.4 (57.7%). The practical changes that matter: Claude Code now defaults to a new &lt;code&gt;xhigh&lt;/code&gt; effort tier automatically, a &lt;code&gt;/ultrareview&lt;/code&gt; slash command runs dedicated code review sessions, and the new tokenizer inflates the same input by 1.0–1.35x tokens. You probably need to bump &lt;code&gt;max_tokens&lt;/code&gt; to 64k+ and retest prompts you wrote for 4.6.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Anthropic actually shipped
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Model ID:      claude-opus-4-7
Release:       2026-04-16 (GA)
Price:         $5/M input, $25/M output (unchanged from 4.6)
Availability:  Claude API · Amazon Bedrock · GCP Vertex AI · Microsoft Foundry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're used to thinking of Opus releases as marginal, 4.7 is not that kind of release.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmark deltas that aren't just noise
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                        Opus 4.6    Opus 4.7    Delta
SWE-bench Verified        —          87.6%       —
SWE-bench Pro            53.4%       64.3%      +10.9pp
CursorBench              58%         70%        +12pp
XBOW visual accuracy     54.5%       98.5%      +44pp
Databricks OfficeQA Pro   —         errors -21%   —
Rakuten SWE-Bench         —         3x tasks      —
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The one I'd focus on is XBOW. Vision accuracy going from 54.5% to 98.5% isn't benchmark farming — it's the difference between "computer-use agents are a demo" and "I can actually ship one."&lt;/p&gt;

&lt;h2&gt;
  
  
  The xhigh effort tier (and why Claude Code just got better for free)
&lt;/h2&gt;

&lt;p&gt;Effort levels used to be &lt;code&gt;low | medium | high | max&lt;/code&gt;. 4.7 adds &lt;code&gt;xhigh&lt;/code&gt; between &lt;code&gt;high&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Anthropic SDK
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# bump from 32k — xhigh produces longer reasoning
&lt;/span&gt;    &lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enabled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;budget_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;effort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xhigh&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# new tier
&lt;/span&gt;    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The invisible change: &lt;strong&gt;Claude Code now defaults every plan to xhigh automatically.&lt;/strong&gt; You don't set anything. Quality goes up; nothing in your CLI config changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  /ultrareview — a review session that actually pushes back
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/ultrareview
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs a dedicated review pass that flags bugs and design issues. Pro/Max users get 3 free sessions per month.&lt;/p&gt;

&lt;p&gt;Vercel's team noted after testing: "new behavior of starting from proofs for systems code." In practice it's more willing to say "this invariant isn't being preserved here" than prior Claude versions, which tended toward agreement.&lt;/p&gt;

&lt;h2&gt;
  
  
  Migration: things that will bite you
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Token usage goes up — sometimes a lot
&lt;/h3&gt;

&lt;p&gt;New tokenizer. Same input now uses &lt;strong&gt;1.0x–1.35x&lt;/strong&gt; tokens depending on content type. Code-heavy inputs trend toward the high end.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Before (4.6)
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# After (4.7) — needs headroom
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# xhigh reasoning expands late-turn output
&lt;/span&gt;    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anthropic claims efficiency improved at all effort levels (shorter output for same quality). My anecdotal tests back this up for coding tasks but it varies. Don't assume, measure.&lt;/p&gt;
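&lt;p&gt;Measuring is cheap: log &lt;code&gt;usage.input_tokens&lt;/code&gt; from responses to the same prompt on both model IDs and derive the ratio yourself. The arithmetic is trivial but worth standardizing; note the 64k floor below is this post's recommendation, not an API constant:&lt;/p&gt;

```javascript
// Ratio of measured token counts for the same input on 4.6 vs 4.7.
function tokenInflation(tokens46, tokens47) {
  return tokens47 / tokens46;
}

// Scale an existing max_tokens by the measured ratio, rounded up to the
// nearest 1k, and never drop below the 64k floor recommended above.
function suggestedMaxTokens(current, ratio, floor = 64000) {
  return Math.max(floor, Math.ceil((current * ratio) / 1000) * 1000);
}
```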

&lt;h3&gt;
  
  
  2. Instruction-following is now strict
&lt;/h3&gt;

&lt;p&gt;This is the hidden gotcha. 4.7 executes prompts literally. Implicit assumptions you leaned on with 4.6 — "obviously Claude will skip step X if condition Y" — no longer hold.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# A prompt that used to "just work"
"Refactor this function and add tests if needed."

# 4.7 behavior: always adds tests, even when not needed
# 4.6 behavior: would judge whether tests were warranted
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anthropic's own migration guide recommends re-tuning prompts and agentic harnesses. Plan the time.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Task Budget (public beta) — let the model see its own budget
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;task_budget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;remaining_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# beta
&lt;/span&gt;    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You tell Claude how many tokens are left. It prioritizes and gracefully winds down as the budget drops. Actually useful for long agentic loops that used to die mid-step.&lt;/p&gt;
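&lt;p&gt;The same wind-down logic is worth mirroring on the client side. A sketch under stated assumptions: &lt;code&gt;runTurn&lt;/code&gt; is a hypothetical stand-in for one agent step that reports its own output-token usage, and the 5k reserve is arbitrary:&lt;/p&gt;

```javascript
// Budget-aware agent loop: stop issuing turns once the remaining token
// budget drops below a reserve, mirroring the model-side wind-down.
async function runWithBudget(runTurn, totalBudget, reserve = 5000) {
  let remaining = totalBudget;
  const turns = [];
  while (remaining > reserve) {
    // Hand the model its remaining budget each turn (cf. task_budget above).
    const turn = await runTurn({ remaining_tokens: remaining });
    turns.push(turn);
    remaining -= turn.outputTokens;
    if (turn.done) break;
  }
  return { turns, remaining };
}
```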

&lt;h2&gt;
  
  
  Safety direction: intentionally narrower
&lt;/h2&gt;

&lt;p&gt;Worth noting because it's unusual: 4.7 is the first Anthropic model where cyber capabilities were &lt;strong&gt;deliberately scaled back&lt;/strong&gt;. A new Cyber Verification Program exists for legitimate security researchers (vuln research, pentesting, red teaming) to regain expanded access through a vetting process.&lt;/p&gt;

&lt;p&gt;If you're in security and hit capability walls you didn't see in 4.6, that's why.&lt;/p&gt;

&lt;h2&gt;
  
  
  What partners are saying (filter for patterns, not hype)
&lt;/h2&gt;

&lt;p&gt;The repeated keywords across partner quotes are what matter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"stays coherent for hours" (Cognition/Devin)&lt;/li&gt;
&lt;li&gt;"doesn't give up on hard problems" (Cognition)&lt;/li&gt;
&lt;li&gt;"passes implicit-requirement tests" (Notion Agent)&lt;/li&gt;
&lt;li&gt;"starts from proofs for systems code" (Vercel)&lt;/li&gt;
&lt;li&gt;"loop resistance, consistency, graceful error recovery" (Genspark)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Translation: the model doesn't just benchmark better, it &lt;em&gt;behaves&lt;/em&gt; differently in long-running agentic setups.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tonight's checklist
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;pip install -U anthropic&lt;/code&gt; (or update your Claude Code client)&lt;/li&gt;
&lt;li&gt;Run one existing prompt against &lt;code&gt;claude-opus-4-7&lt;/code&gt; and diff the output&lt;/li&gt;
&lt;li&gt;Bump &lt;code&gt;max_tokens&lt;/code&gt; in every API call to 64k&lt;/li&gt;
&lt;li&gt;Try &lt;code&gt;/ultrareview&lt;/code&gt; on a PR you're not sure about&lt;/li&gt;
&lt;li&gt;Before rolling 4.7 into a production agent: budget 1-2 hours for harness re-tuning&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is there a free tier?
&lt;/h3&gt;

&lt;p&gt;No. API pricing is identical to 4.6 ($5/M input, $25/M output). Claude Pro and Max plans get Claude Code access including the new xhigh default.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do I need to change my model ID?
&lt;/h3&gt;

&lt;p&gt;Yes. &lt;code&gt;claude-opus-4-7&lt;/code&gt;. The old &lt;code&gt;claude-opus-4-6&lt;/code&gt; ID still works during the deprecation window.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Mythos Preview the same thing?
&lt;/h3&gt;

&lt;p&gt;No. Mythos Preview is an unreleased Anthropic model available only via limited preview. Opus 4.7 is the strongest &lt;strong&gt;generally available&lt;/strong&gt; model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Will 4.7 break my existing Claude Code setup?
&lt;/h3&gt;

&lt;p&gt;Almost certainly not; Claude Code handles the transition. What may feel different is a jump in &lt;em&gt;quality&lt;/em&gt;, since xhigh is now on by default. Your prompts may still need re-tuning.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/news/claude-opus-4-7" rel="noopener noreferrer"&gt;Claude Opus 4.7 official announcement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7" rel="noopener noreferrer"&gt;What's New in Claude 4.7&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.claude.com/docs/en/about-claude/models/migration-guide" rel="noopener noreferrer"&gt;Migration guide (4.6 → 4.7)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2026/04/claude-opus-4.7-amazon-bedrock/" rel="noopener noreferrer"&gt;AWS announcement — Bedrock availability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.blog/changelog/2026-04-16-claude-opus-4-7-is-generally-available/" rel="noopener noreferrer"&gt;GitHub changelog — GA&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
  </channel>
</rss>
