<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: guanjiawei</title>
    <description>The latest articles on Forem by guanjiawei (@skyguan92).</description>
    <link>https://forem.com/skyguan92</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3788265%2Ff93aaebd-c44b-447a-b582-cc297747f93b.jpeg</url>
      <title>Forem: guanjiawei</title>
      <link>https://forem.com/skyguan92</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/skyguan92"/>
    <language>en</language>
    <item>
      <title>The Word 'AI' Has Changed Its Soul Three Times</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 05:40:37 +0000</pubDate>
      <link>https://forem.com/skyguan92/the-word-ai-has-changed-its-soul-three-times-l6d</link>
      <guid>https://forem.com/skyguan92/the-word-ai-has-changed-its-soul-three-times-l6d</guid>
      <description>&lt;p&gt;Recently, I've been chatting with friends about AI, and the more we talk, the more something feels off. Everyone claims to understand it; everyone can carry a conversation. But halfway through, I often realize we're not talking about the same thing at all.&lt;/p&gt;

&lt;p&gt;This reminds me of a joke an economist once made. He said economics is different from physics. If you say at the dinner table that you're a physicist, the other person will usually respond, "I don't understand physics—what's new lately?" and then listen attentively. But if you say you study economics, the other person will say, "I don't really understand economics," and then proceed to share their opinions on economics for half an hour. The whole table is discussing economics, except the economist barely gets a word in.&lt;/p&gt;

&lt;p&gt;AI is the new economics.&lt;/p&gt;

&lt;h2&gt;One Word, Three Souls&lt;/h2&gt;

&lt;p&gt;The term "Artificial Intelligence" was coined by John McCarthy at the Dartmouth workshop in 1956. Nearly seventy years have passed, and the word has never changed, but what it actually refers to has quietly shifted several times.&lt;/p&gt;

&lt;h3&gt;Generation One: A Distant Legend of Intelligence&lt;/h3&gt;

&lt;p&gt;Back when I was in school, AI in the industry mainly referred to machine learning. Finding patterns in astronomy, mining data in finance, making click-through predictions for ads. These things were far removed from ordinary people; no one would bring them up at the dinner table.&lt;/p&gt;

&lt;p&gt;The public's first impression of AI came from two events.&lt;/p&gt;

&lt;p&gt;On May 11, 1997, IBM's Deep Blue defeated Garry Kasparov in the sixth game of their match—the first time a reigning world chess champion lost a match to a computer. At the time, people viewed it as a feat of raw computing power; it had nothing to do with so-called "general intelligence."&lt;/p&gt;

&lt;p&gt;The real turning point was March 2016, when AlphaGo defeated Lee Sedol 4-1 in Seoul. Go holds a special place in the Chinese-speaking world; it represents strategy, the big picture, the art of war. A program that could beat a top human player immediately sparked associations: if it's this strong at Go, is it also strong in other domains? Does it already possess "general intelligence"?&lt;/p&gt;

&lt;p&gt;Around the same period, computer vision also broke through. In 2014, the GaussianFace model from Professor Tang Xiao'ou's team at the Chinese University of Hong Kong achieved 98.52% accuracy on the LFW face recognition benchmark, surpassing the human average of 97.53% for the first time. In 2015, Microsoft's ResNet won the ImageNet classification challenge with a 3.57% error rate, once again exceeding human-level performance.&lt;/p&gt;

&lt;p&gt;Faces are different. If a cat shifts slightly, you can't tell whether it's Cat A or Cat B; but if a person's facial features change a little, you immediately know "this isn't the same person." Human sensitivity to faces is innate. For computers to surpass humans at this task was profoundly significant.&lt;/p&gt;

&lt;p&gt;The strategy of Go combined with the perception of vision—these two things collided, and public sentiment was instantly ignited.&lt;/p&gt;

&lt;p&gt;What was dramatic is that before AlphaGo, domestic computer vision companies weren't actually doing well. After SenseTime received its angel round from IDG in 2014, it went through a period of tight funding; outside capital was reluctant to come in. It took an event seemingly unrelated to computer vision, AlphaGo igniting the entire AI sector, for the next round of funding to come together smoothly. From 2016 to 2018, the "Four Little Dragons" of AI (SenseTime, Megvii, CloudWalk, and Yitu) raised money hand over fist.&lt;/p&gt;

&lt;p&gt;But the hype cooled after a few years.&lt;/p&gt;

&lt;p&gt;Why? Because the anticipated "generalization" never materialized. AlphaGo could only play Go; ask it to do anything else and it was useless. Vision companies would train one model for face recognition and another for vehicle recognition; each additional scene increased costs almost linearly. The story that "massive investment produces general intelligence" simply didn't hold for that generation of technology.&lt;/p&gt;

&lt;p&gt;Around 2020, sentiment returned to the nuts and bolts of commercialization.&lt;/p&gt;

&lt;h3&gt;Generation Two: ChatGPT Gave AI a Mouth&lt;/h3&gt;

&lt;p&gt;The second major shift came on November 30, 2022, when OpenAI released ChatGPT, built on GPT-3.5.&lt;/p&gt;

&lt;p&gt;The biggest difference this time was "language." For the first time, AI could chat with people in a way that actually seemed plausible. Sitting on the other side of the screen, you sometimes really couldn't tell whether it was a person or a machine.&lt;/p&gt;

&lt;p&gt;Veterans of AI would immediately think of the Turing test: a blind conversation between a human and an AI, with a third party guessing which is the machine. This was the standard Alan Turing proposed in 1950 for judging machine intelligence. People had assumed it would take a long time to pass.&lt;/p&gt;

&lt;p&gt;As it turned out, in 2024 research by Cameron Jones at UC San Diego, GPT-4 was judged to be "a real person" with a 54% probability in a five-minute conversation test, already close to the human baseline of 67%. In follow-up research in 2025, GPT-4.5 was considered to have passed the "original Turing test" outright. A standard once enshrined as sacred is now no longer mentioned, because it was quietly crossed long ago.&lt;/p&gt;

&lt;p&gt;What truly made ChatGPT different was that it turned AI into something everyone could use with their own hands. In the machine learning and AlphaGo era, AI was something ordinary people only read about in the news; actually working with it was someone else's business. After ChatGPT, that changed. By October 2025, 800 million people were opening it to chat every week. That level of penetration is an entirely different thing from before.&lt;/p&gt;

&lt;p&gt;So for the past two or three years, when the vast majority of people talked about AI—whether they meant ChatGPT, DeepSeek, Doubao, or Kimi—they were referring to this one thing: chatbots.&lt;/p&gt;

&lt;h3&gt;Generation Three: From Mouth to Hands and Feet&lt;/h3&gt;

&lt;p&gt;But from the end of last year to the beginning of this year, the meaning of AI changed once again.&lt;/p&gt;

&lt;p&gt;I often run into a scenario lately in conversation. The other person says, "I get it, I use it all the time—I've used Doubao, Kimi, DeepSeek." That statement held true for the past few years. It doesn't anymore. Because the AI I'm talking about is no longer that thing.&lt;/p&gt;

&lt;p&gt;The new paradigm is called "Agent."&lt;/p&gt;

&lt;p&gt;If the previous generation of AI was an advisor with a mouth and a brain, sitting across from you making conversation, then this generation is a colleague with hands and feet, capable of picking up tools and doing work for you. At least in the digital world.&lt;/p&gt;

&lt;p&gt;The defining event that ignited this shift was Anthropic's release of Claude Computer Use in October 2024. For the first time, AI could look at the screen, move the mouse, click buttons, and type on the keyboard. In the year or so since, coding agents have become the main battleground. Claude Code launched in early 2025 and matured over the year; OpenAI released Codex CLI in April 2025 and followed up with the cloud-based Codex agent in May; domestic products like Kimi Code also joined the race.&lt;/p&gt;

&lt;p&gt;What's most interesting is the generalization. Coding agents were originally aimed at "writing code," but in practice, people discovered they could do far more than that: researching information, batch processing files, operating browsers, debugging software, running experiments automatically. They can handle 80% to 90% of tasks on a computer, and do them decently well. This generation has finally delivered on the "generalization" that the previous generation promised but failed to achieve.&lt;/p&gt;

&lt;p&gt;Why didn't many people notice that this AI and the last AI are two completely different things? Because the word didn't change—it's still the same two letters, "AI." But whether the "AI" in your mind is a chat box or an agent that can work on its own makes a world of difference.&lt;/p&gt;

&lt;h2&gt;What Exactly Are We Talking About?&lt;/h2&gt;

&lt;p&gt;Returning to the original question: what exactly are we talking about when we discuss AI today?&lt;/p&gt;

&lt;p&gt;I've tried to break it down.&lt;/p&gt;

&lt;p&gt;First, there's the person for whom AI means Doubao, Kimi, or DeepSeek: that kind of chat box. That's the 2022-to-2024 generation's understanding. We're not talking about the same thing.&lt;/p&gt;

&lt;p&gt;Second, someone who has used agent applications like OpenClaw, Hermes, or KimiClaw that are controlled via instant messaging. At least we're on the same plane; we can have a conversation.&lt;/p&gt;

&lt;p&gt;Third, someone who uses Claude Code, Codex, or Kimi Code on a daily basis—any one of these coding agents. Then we're talking about the exact same thing, and we can discuss the changes it brings with great precision.&lt;/p&gt;

&lt;p&gt;I keep emphasizing this because it's a lot like the Turing test: the line has already been quietly crossed, yet many people are still stuck in the previous generation's cognitive framework, discussing a version of AI that no longer exists.&lt;/p&gt;

&lt;h2&gt;The Real Barrier Isn't Technical&lt;/h2&gt;

&lt;p&gt;Every time the meaning of AI shifts, the way ordinary people experience it changes as well.&lt;/p&gt;

&lt;p&gt;In the machine learning generation, ordinary people had no chance to get their hands on it. The barrier was too high—you had to prepare data, configure environments, tune parameters; it was completely unfriendly to non-specialists. Impressions of AI could only come from the news.&lt;/p&gt;

&lt;p&gt;In the ChatGPT generation, the ceiling of adoption was shattered. Open a web page and use it, for free—you could finally "use" AI. That's when people started weighing in on "which one is better or worse"; they could comment because they could use it.&lt;/p&gt;

&lt;p&gt;The agent generation is actually the same. The barrier is low; you can get started for a few dozen yuan a month. I previously helped a few college students use Claude Code. They had no special background, yet they still managed to complete their own projects.&lt;/p&gt;

&lt;p&gt;So where is the real barrier? I think it's just one word: laziness.&lt;/p&gt;

&lt;p&gt;Not technical laziness, but cognitive laziness. You've heard others talk about it, you've swiped through two short videos, you've read a few viral articles on social media, and you think you understand. But you haven't actually gotten your hands dirty, haven't had it do a concrete task for you, haven't experienced that "ah, so that's how it works" moment.&lt;/p&gt;

&lt;p&gt;This is how it is with every technological transformation. It was like this when the steam engine emerged, when the internet arrived, and when the mobile internet took off. Technology is never the slow part; the slow part is the people who think they understand after hearing a thing or two.&lt;/p&gt;

&lt;p&gt;And after every shift, old cognitive frameworks become a hidden cost. You think you're using AI, but you're actually using the previous generation of AI.&lt;/p&gt;

&lt;p&gt;If you've read this far without actually using an agent once, the advice is simple: find the cheapest option and start using it today. No need to watch videos, no need to read tutorials—just have it do something you would have had to do yourself, and see what happens.&lt;/p&gt;




&lt;h2&gt;References&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Dartmouth_workshop" rel="noopener noreferrer"&gt;Dartmouth workshop — Wikipedia&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Deep_Blue_versus_Garry_Kasparov" rel="noopener noreferrer"&gt;Deep Blue versus Garry Kasparov — Wikipedia&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol" rel="noopener noreferrer"&gt;AlphaGo versus Lee Sedol — Wikipedia&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zhuanlan.zhihu.com/p/69530060" rel="noopener noreferrer"&gt;DeepID and the Face Recognition Breakthrough by Professor Tang Xiao'ou's Team at CUHK&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blogs.microsoft.com/ai/microsoft-researchers-win-imagenet-computer-vision-challenge/" rel="noopener noreferrer"&gt;Microsoft researchers win ImageNet computer vision challenge&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/ChatGPT" rel="noopener noreferrer"&gt;ChatGPT — Wikipedia&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://techcrunch.com/2025/10/06/sam-altman-says-chatgpt-has-hit-800m-weekly-active-users/" rel="noopener noreferrer"&gt;Sam Altman says ChatGPT has hit 800M weekly active users (TechCrunch)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2405.08007" rel="noopener noreferrer"&gt;People cannot distinguish GPT-4 from a human in a Turing test (Cameron Jones, 2024)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2503.23674" rel="noopener noreferrer"&gt;Large Language Models Pass the Turing Test (Jones &amp;amp; Bergen, 2025)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/news/3-5-models-and-computer-use" rel="noopener noreferrer"&gt;Introducing computer use, a new Claude 3.5 Sonnet (Anthropic, 2024)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/introducing-codex/" rel="noopener noreferrer"&gt;Introducing Codex (OpenAI, 2025)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/what-we-mean-by-ai" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/what-we-mean-by-ai&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>cognition</category>
      <category>thinking</category>
    </item>
    <item>
      <title>After GitHub Became Like Xiaohongshu: Building Products Has Become Even More Lonely</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 05:31:07 +0000</pubDate>
      <link>https://forem.com/skyguan92/after-github-became-like-xiaohongshu-building-products-has-become-even-more-lonely-4baf</link>
      <guid>https://forem.com/skyguan92/after-github-became-like-xiaohongshu-building-products-has-become-even-more-lonely-4baf</guid>
      <description>&lt;p&gt;Recently, I've had a somewhat different feeling about building products. Let me jot it down.&lt;/p&gt;

&lt;h2&gt;Likes Take Seconds; Products Don't&lt;/h2&gt;

&lt;p&gt;Since vibe coding emerged, projects have been rocketing up GitHub's trending charts. Racking up tens of thousands of stars in two days has become common. After Collins Dictionary named vibe coding its 2025 Word of the Year, the pace didn't slow down. Supposedly, a quarter of the projects in Y Combinator's Winter 2025 batch had codebases that were 95% AI-generated. The entire open-source ecosystem is taking on an increasingly strong "note-posting" vibe.&lt;/p&gt;

&lt;p&gt;This state increasingly resembles Xiaohongshu (China's lifestyle-sharing platform similar to Instagram). A fun idea, an interesting angle, is enough to draw people in to browse, drop a spark, download, and try it out. The approach has passively turned into "posting notes"—have an idea, post it; if it blows up, great; if not, move on to the next one.&lt;/p&gt;

&lt;p&gt;But I've increasingly come to realize one thing: giving an idea a like takes seconds; turning an idea into a product people can actually use still doesn't. The unsexy, hard-to-explain stretch in between—vibe coding hasn't shortened it much.&lt;/p&gt;

&lt;h2&gt;The Hard Threshold of Infrastructure Products&lt;/h2&gt;

&lt;p&gt;Our company, Aima, primarily focuses on on-device AI hardware, and we also have some lower-level work around model inference performance, edge devices, and agent runtimes. The demand for "stability" in this kind of product is higher than anything I've worked on before.&lt;/p&gt;

&lt;p&gt;The reason is simple. If you build a marketing mini-game and it crashes, users might curse at most. If you build an IM, a gateway, or an inference engine and it crashes, it could mean the entire upstream business is taken down. These things are also the most likely to get slammed against the wall by high concurrency and high-frequency calls—back when Facebook Chat launched and accumulated 175 million active users, the channel server was repeatedly pushed to the breaking point between CPU and bandwidth. That kind of "infrastructure torment" is something every person who builds infra must go through.&lt;/p&gt;

&lt;p&gt;Then it has to withstand scrutiny. The novelty has to be novel enough for users to feel it's different from what already exists. Stability and novelty are two completely different capabilities: stability demands obsession with details; novelty demands dissatisfaction with generic solutions. For a product to pass both bars is itself a low-probability event.&lt;/p&gt;

&lt;p&gt;Vibe coding can push a demo to a "it runs" state—something that used to take weeks now takes hours. But the gap between "it runs" and "it can withstand daily use and repeated scrutiny"—this wave of tooling upgrades hasn't filled that in for you.&lt;/p&gt;

&lt;h2&gt;The Invisible Road in the Middle&lt;/h2&gt;

&lt;p&gt;The part of a product that truly torments people is usually the long stretch after the PoC is validated.&lt;/p&gt;

&lt;p&gt;In the early stage, the idea is validated, a few people say "this is interesting"—this is the highlight. You have passion, feedback, and the desire to share. Then you enter the middle stage: making it solid, covering edge cases, feeding it real data, connecting it to real systems, watching stability. There's nothing shareable in this stretch; every day is "plugged another hole"—no social media content, no likes.&lt;/p&gt;

&lt;p&gt;You also have to repeatedly explain to people what you're doing. I've tried several times in recent weeks to describe to friends what I'm working on. After I'm done, they say, "Oh, so it's just XXX, right?" XXX is completely different from what I'm building, but their frame of reference has no slot for it, so they just squeeze me into the nearest category. This kind of helplessness is unavoidable.&lt;/p&gt;

&lt;p&gt;What's harder is that in the middle stage, you often have to fight with yourself. Go broader or deeper? Add this feature or not? Cut your losses or give it another shot? Many decisions have no clear external signals to guide you; you have to carry them entirely on your own. As long as something is "new," there will be a massive number of problems; you fix one, and you create new ones. This cycle doesn't stop on its own.&lt;/p&gt;

&lt;p&gt;Plus the risks: what if someone else builds something similar? What if resources are pulled away? What if the team wavers? Very few people can remain steadfast without faltering. Those who do are often later described as "obsessive."&lt;/p&gt;

&lt;h2&gt;Looking Back at Those Who Endured&lt;/h2&gt;

&lt;p&gt;So if you look through history at people who endured to build great products, you'll find one consistent thing: they were all particularly stubborn about something.&lt;/p&gt;

&lt;p&gt;Steve Jobs is the easiest to recall. In 1985, he was ousted from the Macintosh division by the CEO he had recruited, John Sculley, and the board. That May, he even planned a boardroom coup, timed for a business trip Sculley was due to take to China; after Jean-Louis Gassée leaked the plan, Sculley canceled the trip and confronted him, Jobs was completely out, and he resigned officially in September. For the next twelve years, he endured at NeXT and Pixar, returning to Apple only in 1997. Throughout that period, he was repeatedly challenged by outsiders: "You're not an engineer, you can't write code, who are you to call the shots?" The dramatic line delivered by Wozniak's character in the 2015 film &lt;em&gt;Steve Jobs&lt;/em&gt;—"You can't write code, you're not an engineer, you're not a designer"—compressed more than a decade of skepticism into a single sentence. Stubborn people can endure this; those who aren't leave early.&lt;/p&gt;

&lt;p&gt;Allen Zhang is another. He started with the email client Foxmail; by around 1998, it already had 2 million users, which he sustained largely on his own for many years. In between, he went through an acquisition by Boda, team turbulence, and skepticism that he was "someone who doesn't know how to commercialize." In 2005, he joined Tencent to take over QQ Mail; the first two years were basically a failure—bloated, slow, unused. He endured internally for two years before carving out the "oversized attachment" feature, pulling the email product back from the brink of death. WeChat was the same: inspired by a mobile app called Kik, he sent Pony (Ma Huateng) an email pitching the idea, then repeatedly fought battles inside Tencent with the QQ team and the wireless team to get it built. He himself and those around him acknowledge that his personality is, to some degree, "dictatorial" and reclusive. It is precisely this stubbornness that allowed him to withstand all the voices asking, "Why don't you just make it like XXX?"&lt;/p&gt;

&lt;p&gt;Looking at the two together, there is a shared temperament: on a particular matter, they don't listen to advice, refuse to be reasonable, and just keep at it. Outsiders often think, "What hope is there in this?" yet they keep going.&lt;/p&gt;

&lt;h2&gt;What Vibe Coding Hasn't Changed&lt;/h2&gt;

&lt;p&gt;The real impact of vibe coding is actually quite clear.&lt;/p&gt;

&lt;p&gt;It compresses every stage of the lifecycle. What used to take two weeks for a demo now takes an afternoon; what used to take a sprint for a feature iteration now takes hours; what used to be a stack one person couldn't cover alone, one person can now run across horizontally.&lt;/p&gt;

&lt;p&gt;But the "complete product lifecycle" hasn't been replaced. PoC, polishing, stability, user feedback, iteration, quality, persistence—this process is still there; only each stage is shorter. My own feeling is that the fast parts are very fast, but the slow parts haven't changed much.&lt;/p&gt;

&lt;p&gt;The most tormenting part of the middle stage is precisely the part where change is least visible. Polishing a boundary condition isn't something vibe coding can solve; it requires repeatedly constructing scenarios. Stability isn't something you get in a few hours; it has to run under real traffic. Quality certainly isn't something a single prompt can guarantee. A controlled study by CodeRabbit last year on 470 open-source GitHub PRs showed that AI co-generated code had roughly 75% more bugs in logic and correctness than human-written code. No matter how you do the math, vibe coding can't help you here; you can only grind through it slowly.&lt;/p&gt;

&lt;p&gt;So during this period, I've become increasingly clear: tools have gotten faster, but the hard parts haven't decreased; they've just been thrown into sharper relief. When everything was slow before, the middle being slow didn't feel particularly slow. Now that the first two stages whiz by, the texture of the middle stage—repeated grinding, repeated wrestling with yourself, repeated absence of feedback—is amplified.&lt;/p&gt;

&lt;h2&gt;My Own Experience&lt;/h2&gt;

&lt;p&gt;Something personal. In my past career, I haven't had many opportunities to independently own a product. I started in strategy, later moved to somewhat technical product roles, and then did ecosystem, BD, and sales. After doing these roles for a while, you realize the feedback velocity differs enormously.&lt;/p&gt;

&lt;p&gt;Sales has the fastest feedback. You meet a client, chat for two hours, and by the time you walk out, you basically know whether there's a chance, what they care about, and what the next move should be. Bad news also comes fast; if it's a no, it's a no, and it doesn't drag on. This rhythm is addictive, but the side effect is that you get accustomed to short feedback loops, and when you go without feedback for a while, you start to get anxious.&lt;/p&gt;

&lt;p&gt;When you actually build your own product, you realize this is a completely different lifestyle from sales. The first few days might still have exciting feedback; then you enter a long "silent" phase. During this time, a huge amount of work appears to produce no output—but in reality, you're laying foundations, doing finer breakdowns of user pain points, waiting for round after round of experiments to yield results. No one externally gives you hourly feedback; when family asks what you're busy with, you can't explain it in one sentence.&lt;/p&gt;

&lt;p&gt;I previously underestimated how draining this "feedback-scarce" state is for a person. I used to think the loneliness of product people was the exclusive domain of a few "masters"; now I believe more strongly that it's the fundamental nature of this profession. It's just that most people don't survive this stretch, so there aren't many good products to begin with. This also incidentally explains why the term "good product manager" is so scarce: it's not that people lack ideas, but that most people don't endure the middle stage.&lt;/p&gt;

&lt;h2&gt;So&lt;/h2&gt;

&lt;p&gt;I think the real benefit vibe coding brings isn't making product building easier; it's making "getting started" easier. In the past, wanting to build something and just setting up the environment was enough to turn away half the people. Now that half is no longer deterred.&lt;/p&gt;

&lt;p&gt;But the road from "getting started" to "getting it done" is one vibe coding hasn't walked for you. What determines who makes it is still the same old things: whether you have a direction you truly care about, whether you can endure the stretch where others can't see what you're doing, and whether you're willing to fix the same thing over and over. These sound clichéd, and they are—because they're right.&lt;/p&gt;

&lt;p&gt;I'm not writing this today to encourage people to become obsessive. It's a reminder to myself and to those in the same boat: if the product in your hands is getting no external feedback for a long time, it's not because you're doing it wrong—it's because this stretch is inherently long. If you can keep going, keep going.&lt;/p&gt;




&lt;h2&gt;References&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Vibe_coding" rel="noopener noreferrer"&gt;Vibe coding — Wikipedia (Collins Word of the Year, definition evolution)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://vibecoding.app/blog/openclaw-ultimate-guide" rel="noopener noreferrer"&gt;OpenClaw Viral Success and YC Winter 2025 Data Overview (Vibecoding.app)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://engineering.fb.com/2009/02/17/web/chat-stability-and-scalability/" rel="noopener noreferrer"&gt;Chat Stability and Scalability — Facebook Engineering&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Steve_Jobs" rel="noopener noreferrer"&gt;Steve Jobs — Wikipedia (1985 Departure from Apple Timeline)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.thecorporategovernanceinstitute.com/insights/case-studies/why-did-apples-board-fire-steve-jobs-in-1985/" rel="noopener noreferrer"&gt;Why did Apple's board fire Steve Jobs in 1985? — Corporate Governance Institute&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zhuanlan.zhihu.com/p/91238014" rel="noopener noreferrer"&gt;Understanding Allen Zhang, Father of WeChat: Failed Genius, Disruptor, Dictator, Manipulator of Human Nature (Zhihu)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://news.qq.com/rain/a/20220801A0262400" rel="noopener noreferrer"&gt;These 55 Thoughts Explain Why Allen Zhang Built WeChat (Tencent News)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report" rel="noopener noreferrer"&gt;State of AI vs. Human Code Generation Report — CodeRabbit&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/product-middle-is-lonely" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/product-middle-is-lonely&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>product</category>
      <category>vibecoding</category>
      <category>reflections</category>
    </item>
    <item>
      <title>I Spent a Day and a Half Using AI to Make a Literacy Game for My Son</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 05:26:22 +0000</pubDate>
      <link>https://forem.com/skyguan92/i-spent-a-day-and-a-half-using-ai-to-make-a-literacy-game-for-my-son-36n</link>
      <guid>https://forem.com/skyguan92/i-spent-a-day-and-a-half-using-ai-to-make-a-literacy-game-for-my-son-36n</guid>
      <description>&lt;p&gt;Recently I gave a talk at a major conference. Before and after the event, I met quite a few people. Among the attendees, there were broadly two camps.&lt;/p&gt;

&lt;p&gt;One camp spends every day immersed in open-source communities, building products and growing in their own direction. Chat with them for a few minutes and you can feel the density of someone on the front lines: their viewpoints were grown in the field, and they carry heat and contagious energy. The other camp gives off a different scent: boss-assigned tasks, checkbox logic. They talk about "I built a thing, it can do this and that"—a very standard past-tense software narrative. Right now, what AI least lacks is "what it can do"; what it most lacks is "whether it can do it well." The latter camp usually runs out of things to say right there.&lt;/p&gt;

&lt;p&gt;There's already a sizable gap between these two groups. The bigger chasm is still behind them; I'll come back to that. First, let me talk about what happened at my home these past two days.&lt;/p&gt;

&lt;h2&gt;"Why Are You Always Staring at the Computer?"&lt;/h2&gt;

&lt;p&gt;Work intensity has been higher than usual these past few months. Not the busy-ness of office hours, but the busy-ness that comes from internal drive. The boundaries of agents shift every day; exploring them is addictive, and I can't stop. I leave early and often get home past eleven, constantly pushing family matters to the back of the queue.&lt;/p&gt;

&lt;p&gt;A few days ago my son suddenly ran over and asked me: Why are you always staring at the computer?&lt;/p&gt;

&lt;p&gt;I didn't brush him off. I had him sit down and explained, in words he could understand, what artificial intelligence is. He stared at the cursor moving on its own across my screen and asked: How is the computer typing by itself? I said: I told it what I want, and it's helping me work.&lt;/p&gt;

&lt;p&gt;He was still a bit confused. I said, let me make something for you right now.&lt;/p&gt;

&lt;p&gt;At that time he was obsessed with PAW Patrol. I opened a coding agent and dictated to it: I want a PAW Patrol Mighty Pups battle game, landscape screen, where you can pick a pup to fight. Then I had him go play something else, and said I'd call him back in five minutes.&lt;/p&gt;

&lt;p&gt;Five minutes later the game was actually running. Six pups to choose from, light punch and heavy punch, crouch and jump—all working. He played with an incredulous look on his face and then told me: Huh, seems like Mighty Pups' special move didn't come out. I said, let's fix that. Five minutes later, the special move was there too. The effects were crude, but enough for him.&lt;/p&gt;

&lt;p&gt;He wanted more. I said: When you get a bit older, Dad will make you something even better.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Kid Who Couldn't Get the Little Red Flower
&lt;/h2&gt;

&lt;p&gt;Probably that impromptu game made him feel like Dad was finally participating in his world for once. He asked if I could teach him to read a clock—the kind with hands (that thing is actually very unfriendly to kids; that's another story). In the middle of teaching he lost interest, and instead asked if I could teach him to recognize characters.&lt;/p&gt;

&lt;p&gt;I was a bit surprised. He quietly said that at school you can get a little red flower for character recognition, and he often couldn't get one.&lt;/p&gt;

&lt;p&gt;I understood that sense of frustration immediately. I asked him why he didn't tell Mom. He couldn't say; he just sat there sulking.&lt;/p&gt;

&lt;p&gt;In that moment I made a decision: take the same impromptu-game approach and apply it to "teaching him characters." Make a literacy game just for him, using his favorite IP, where adults can play along.&lt;/p&gt;

&lt;h2&gt;
  
  
  Don't Write Code Yet—Let the Agent Do Its Homework First
&lt;/h2&gt;

&lt;p&gt;That night when I got home, the first thing I did after opening the coding agent was not to ask it to write code, but to have it do background research.&lt;/p&gt;

&lt;p&gt;I threw three questions at it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What are the recent pedagogical and psychological conclusions on Chinese character learning for preschool children?&lt;/li&gt;
&lt;li&gt;What memory-related learning theories and methods are suitable for preschoolers?&lt;/li&gt;
&lt;li&gt;What's the deal with the PAW Patrol IP, and what is its narrative structure?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It spent about an hour on research. When it threw back its summary, I realized several of my default assumptions were wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The preschool focus isn't pinyin.&lt;/strong&gt; Starting in the fall of 2016, the Ministry-compiled first-grade Chinese textbooks swapped "a o e" for "sky, earth, human"—characters first, pinyin later, putting pinyin back in its place as "a tool to aid character recognition." This wasn't a snap decision. In 2002, Guangdong Provincial Women and Children's Hospital ran a controlled experiment in kindergarten senior classes: the Chinese-character class outperformed the pinyin class in both scores and speed. In 2011, Beijing Normal University's School of Psychology followed 176 first-graders for a full year and found that children who consistently read pinyin-annotated materials actually had lower reading self-efficacy than those who read unannotated materials. What preschoolers really need to focus on are pictographic characters, letting kids move from "image" to Chinese character, not from symbol to character.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't be greedy.&lt;/strong&gt; The Ministry of Education's 2012 &lt;em&gt;Guidelines for Learning and Development of Children Aged 3-6&lt;/em&gt; explicitly opposes early mechanical character memorization and intensive training. The time it takes a kindergartener to learn one character is roughly three to four times what it takes an elementary student to learn the same character. Method-wise, the three-bucket method (a simplified version of the Leitner system, a spaced-repetition method proposed by German science journalist Sebastian Leitner in the 1970s) is well-suited for this age. Bucket 1: review daily; Bucket 2: every three days; Bucket 3: every seven days. New characters go into Bucket 1; a correct answer moves a character up one bucket, and any mistake drops it back to Bucket 1. Keep daily new characters to two or three—much steadier than cramming ten at once.&lt;/p&gt;
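The three-bucket schedule above can be sketched in a few lines of Python. This is a minimal illustration of the rule, not the game's actual code; the class and function names are made up for the example.

```python
from datetime import date, timedelta

# Review intervals per bucket, in days: bucket 1 daily, bucket 2 every
# three days, bucket 3 every seven days.
INTERVALS = {1: 1, 2: 3, 3: 7}

class Card:
    def __init__(self, char, today):
        self.char = char
        self.bucket = 1      # new characters always start in Bucket 1
        self.due = today     # and are due for review immediately

    def review(self, correct, today):
        if correct:
            # Promote one bucket on a correct answer, capped at Bucket 3.
            self.bucket = min(self.bucket + 1, 3)
        else:
            # Any mistake sends the character back to daily review.
            self.bucket = 1
        self.due = today + timedelta(days=INTERVALS[self.bucket])

def due_today(cards, today):
    """Characters whose review date has arrived."""
    return [c for c in cards if c.due <= today]

# Example: two new characters; one answered correctly, one missed.
today = date(2026, 5, 1)
cards = [Card("天", today), Card("地", today)]
cards[0].review(True, today)    # promoted to Bucket 2, due in 3 days
cards[1].review(False, today)   # stays in Bucket 1, due tomorrow
```

The cap on daily new characters would simply limit how many `Card` objects get created per day, independent of the review loop.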

&lt;p&gt;&lt;strong&gt;PAW Patrol itself follows a fixed formula.&lt;/strong&gt; The rhythm of nearly every episode is identical: a resident gets in trouble → call Ryder → assemble the pups → pick the right rescue pup → head out → rescue process → celebration wrap-up. This highly repetitive structure is actually very suitable for level design. Each level is a complete "alarm, dispatch, celebration" cycle. The rhythm is familiar, frustration is low, and kids won't get blocked by unfamiliar new logic.&lt;/p&gt;

&lt;p&gt;After clarifying these three things, I started designing the game with it. For this segment I had it use the "superpower" skill to do the design. It would frequently pause to confirm with me: "Does this mechanic feel OK?" "Should we cut this step?" After a few back-and-forth rounds, a runnable framework emerged.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making Assets Is Way More Exhausting Than Writing Code
&lt;/h2&gt;

&lt;p&gt;A framework is one thing; assets are another. Subtitles, images, sound effects, character voice-overs, level backgrounds—you need everything. I originally wanted to record the voices myself, but gave up after a few tries.&lt;/p&gt;

&lt;p&gt;I had another agent dedicated to making assets. It built the entire pipeline itself:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find original episodes on YouTube and extract the audio.&lt;/li&gt;
&lt;li&gt;Run ASR locally, split the audio into 2-second clips, and extract the dialogue for each clip.&lt;/li&gt;
&lt;li&gt;Use a vision model to extract frames from each clip, identifying which pup and which scene appears on screen.&lt;/li&gt;
&lt;li&gt;Run TTS locally to generate in-game character prompt voices.&lt;/li&gt;
&lt;li&gt;Generate visual assets using Gemini's Nano Banana (at the time, GPT's new image model hadn't come out yet).&lt;/li&gt;
&lt;/ol&gt;
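The five steps above amount to a small orchestration script. The sketch below shows only the shape of that pipeline; every helper is a hypothetical stub standing in for the real tool (a downloader, a local ASR model, a vision model, a TTS engine), none of which are real APIs here.

```python
# Hypothetical skeleton of the asset pipeline; each helper is a stub
# standing in for an external tool, so the data it returns is fake.

def extract_audio(episode_url):
    # Step 1: download the episode and strip out the audio track.
    return f"audio-from-{episode_url}"

def split_and_transcribe(audio):
    # Step 2: run ASR, split into 2-second clips with their dialogue.
    return [{"clip": i, "text": f"line {i}"} for i in range(3)]

def tag_frames(clip):
    # Step 3: a vision model labels which pup and scene are on screen,
    # aligning the visuals with the ASR text.
    return {**clip, "pup": "unknown", "scene": "unknown"}

def synthesize_prompt(text):
    # Step 4: TTS for the in-game voice prompts.
    return f"tts({text})"

def build_assets(episode_url):
    # Steps 1-4 chained; step 5 (image generation) would hang off the
    # tagged clips in the same way.
    audio = extract_audio(episode_url)
    clips = [tag_frames(c) for c in split_and_transcribe(audio)]
    voices = [synthesize_prompt(c["text"]) for c in clips]
    return clips, voices

clips, voices = build_assets("episode-1")
```

The key design point from the text survives even in this toy form: the ASR pass comes first, and the vision model is used to align and filter, so no human has to label frames by hand.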

&lt;p&gt;At first it wanted me to manually tell it which pup was in each frame. I couldn't recognize every PAW Patrol character and got confused. It revised its own approach: extract text via ASR first, then use the vision model to align on-screen content with the dialogue, filtering automatically. I reviewed this pipeline, saw it was feasible, and let it run.&lt;/p&gt;

&lt;p&gt;I tried the first version and found a pile of bugs; the interactions were so complex that even I didn't want to touch it. I had it cut a batch of features and simplify to the point where a child could pick it up alone. The next day another version ran, still with many issues.&lt;/p&gt;

&lt;p&gt;At this point I did something that, in retrospect, was pretty crucial: I had the agent run a round of end-to-end testing itself before waiting for me to test. After running through, it listed a whole stack of issues on its own, and while it was at it, asked, "Should we add an art director role to unify the visual style?" After adding that, the visuals immediately became cleaner and the asset consistency improved a lot.&lt;/p&gt;

&lt;p&gt;The whole process took a day and a half. By the second night, it was something playable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The First Time I Let the Two of Them Play
&lt;/h2&gt;

&lt;p&gt;I got home after 11 p.m. that night and asked my wife if she wanted to try it first. It was too late, so we didn't. It got pushed back a few days.&lt;/p&gt;

&lt;p&gt;Yesterday afternoon she suddenly remembered and asked if they could play. I wasn't home, so I used Tailscale to tunnel from her phone into my computer and started the game remotely.&lt;/p&gt;

&lt;p&gt;She ran through it herself and texted me back: Better than I expected. One design I particularly liked in this version is that the child looks at one screen while the parent, acting as "commander," looks at another screen on their phone. The parent receives rhythm prompts telling them when to encourage and when to help. She was completely on board with this split-screen setup.&lt;/p&gt;

&lt;p&gt;Then she played with our son. After finishing, he refused to leave and said he wanted to play again tomorrow. Perfect—according to the theory, three characters a day is enough; more than that is greed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Look on Her Face That Moment
&lt;/h2&gt;

&lt;p&gt;The most unexpected part of this whole thing wasn't my son's reaction; it was my wife's.&lt;/p&gt;

&lt;p&gt;I had talked to her about AI many times before. About agents, about coding, about what was coming next. She listened, but didn't feel it—much like my son the first time he heard me talk about artificial intelligence. When a concept doesn't land on something she cares about, no matter how much you talk, it's someone else's business.&lt;/p&gt;

&lt;p&gt;This time was different. Our child's education is the heaviest thing in her heart. Seeing something made in a day and a half actually get our son to sit down and play, her reaction wasn't "that's amazing"—it was "I can use this."&lt;/p&gt;

&lt;p&gt;That night she said something to me: Then can we make one for our friends' kids too?&lt;/p&gt;

&lt;p&gt;I said yes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Types of People, Two Chasms
&lt;/h2&gt;

&lt;p&gt;Back to the conference I mentioned at the beginning. Inside the conference hall there were two camps; outside in the wider world, there are actually three types of people.&lt;/p&gt;

&lt;p&gt;The first type has no real awareness at all. They know AI is hot, they've skimmed plenty of articles, but they feel it's far away. They don't get their hands dirty, and don't feel they need to.&lt;/p&gt;

&lt;p&gt;The second type has gotten their hands dirty, but they're doing tasks someone else assigned. The boss says the company needs to build something like this, so they build it, check the box, and it's over.&lt;/p&gt;

&lt;p&gt;The third type uses it on something they truly care about. In the process they feel a boundary get pushed open, and then they can't stop.&lt;/p&gt;

&lt;p&gt;Of the two chasms, the second is the larger. The first is merely the distance between "having heard about it" and "having tried it." The second is the distance between "checking a box" and "being ignited." The checkbox people finish one round and clock out; the ignited people go home still thinking about the next step.&lt;/p&gt;

&lt;p&gt;In the past, when people said "using AI," it was easy to interpret as a skill: knowing a few more prompts, being familiar with a few more tools. Over the past couple of years I've increasingly felt that's not the key. The key is: put it into that thing you lie in bed thinking about every night.&lt;/p&gt;

&lt;p&gt;The heaviest thing in my wife's heart is our child's education, so that's what went into the PAW Patrol literacy game. I work on AI Infra; the heaviest thing in my heart is hardware efficiency and TCO, so that's what goes into the endless experiments I run every day. It doesn't need to be something far away—just something you truly care about. At that point it will quickly show you what it can do. And the way it tells you isn't through a few articles; it's through you personally making something you previously wouldn't have dared to imagine.&lt;/p&gt;

&lt;p&gt;Products like BaiCiZhan used to require a team, a company, building something for everyone to use. Now it's a dad spending a day and a half making a personalized literacy game for his own son. These aren't the same kind of thing; they're two different worlds.&lt;/p&gt;

&lt;p&gt;First find that thing you truly care about. The rest of the path will grow on its own.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://news.sina.cn/2016-08-25/detail-ifxvixsh6587992.d.html" rel="noopener noreferrer"&gt;Fall 2016 New Textbooks: First-Graders Learn Chinese Characters Before Pinyin (Sina)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://edu.cnr.cn/list/20160826/t20160826_523087371.shtml" rel="noopener noreferrer"&gt;Characters Before Pinyin: Elementary Chinese Teaching Reform Returns to Educational Principles (China National Radio)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.chinanews.com.cn/m/gn/2016/09-01/7990445.shtml" rel="noopener noreferrer"&gt;New First-Grade Chinese Textbooks Reduce Character Count, Push Pinyin Learning Back (China News Service)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://xbjk.ecnu.edu.cn/EN/article/downloadArticleFile.do?attachType=PDF&amp;amp;id=10548" rel="noopener noreferrer"&gt;The Psychological Mechanisms of Chinese Character Literacy in Children and Their Implications for Education (Journal of East China Normal University)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.unicef.cn/media/8456/file/%E3%80%8A3---6%E5%B2%81%E5%84%BF%E7%AB%A5%E5%AD%A6%E4%B9%A0%E4%B8%8E%E5%8F%91%E5%B1%95%E6%8C%87%E5%8D%97%E3%80%8B.pdf" rel="noopener noreferrer"&gt;Guidelines for Learning and Development of Children Aged 3-6 (UNICEF Archived PDF)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://news.qq.com/rain/a/20210208A044U700" rel="noopener noreferrer"&gt;The Earlier and More Kids Learn Characters, the Smarter They Get? Brain Science Experts and Top-School Teachers on Literacy (Tencent News)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Leitner_system" rel="noopener noreferrer"&gt;Leitner system — Wikipedia&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.goodnotes.com/blog/leitner-system" rel="noopener noreferrer"&gt;A short &amp;amp; sweet guide to the Leitner system (Goodnotes Blog)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.udig.com/insights/blog/patterns-natural-language-data-paw-patrol" rel="noopener noreferrer"&gt;Patterns in Natural Language Data — A Paw Patrol Analysis (Udig)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zh.wikipedia.org/zh-hans/%E6%B1%AA%E6%B1%AA%E9%9A%8A%E7%AB%8B%E5%A4%A7%E5%8A%9F" rel="noopener noreferrer"&gt;PAW Patrol (Chinese Wikipedia)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/paw-patrol-literacy-game" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/paw-patrol-literacy-game&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>family</category>
      <category>education</category>
      <category>thoughts</category>
    </item>
    <item>
      <title>What Goes Around Comes Around: A New Model Every Month and a Half</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 05:11:15 +0000</pubDate>
      <link>https://forem.com/skyguan92/what-goes-around-comes-around-a-new-model-every-month-and-a-half-267f</link>
      <guid>https://forem.com/skyguan92/what-goes-around-comes-around-a-new-model-every-month-and-a-half-267f</guid>
      <description>&lt;p&gt;The long-awaited OpenAI "potato" dropped today.&lt;/p&gt;

&lt;p&gt;Turns out it's not GPT-6, but GPT-5.5.&lt;/p&gt;

&lt;p&gt;I tried it out right after it came out and found that it fixed the thing that annoyed me most about 5.4: &lt;strong&gt;it finally started talking like a human&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Back in the 5.4 days, reading through Codex's output process was torture. Words piled on top of each other with no logic, status reports and execution notes jumping back and forth—you mostly couldn't tell what it was actually doing. 5.5 fixed this, and the readability of the entire workflow shot up immediately.&lt;/p&gt;

&lt;p&gt;And then it gets funny.&lt;/p&gt;

&lt;p&gt;Just a while ago I was telling my team: GPT 5.4 is stronger than Claude on certain complex tasks, but it doesn't talk like a human, so it's exhausting to use; Claude speaks human, it can guide you through, &lt;strong&gt;so we should still push Claude Code&lt;/strong&gt;. Some colleagues went and canceled Codex and switched to Claude after hearing that. Today I have to send another message in the group chat: maybe just get both accounts and use them together.&lt;/p&gt;

&lt;p&gt;Can everyone grasp how absurd this is?&lt;/p&gt;

&lt;h2&gt;
  
  
  A New Generation Every Month and a Half
&lt;/h2&gt;

&lt;p&gt;5.4 came out in early March. I judged at the time that this wasn't a minor version—it was leaps and bounds stronger than the previous generation, Codex 5.3. Now it's April 24, and a new generation dropped after just a month and a half.&lt;/p&gt;

&lt;p&gt;Anthropic's pace is even more extreme. Opus 4.6 came out in February, Opus 4.7 last week—two months in between. Even more absurd, on April 8 they officially released a model called &lt;strong&gt;Mythos Preview&lt;/strong&gt;. The company says this is their strongest model to date: 93.9% on SWE-bench, and 31 percentage points higher than Opus 4.6 on the USAMO math olympiad.&lt;/p&gt;

&lt;p&gt;But this model was &lt;strong&gt;not opened to the public&lt;/strong&gt;. It was only made available through a program called Project Glasswing, with restricted access for over 50 institutions including Microsoft, Google, AWS, JPMorgan, Nvidia, and Cisco. Bundled with $100 million in usage credits.&lt;/p&gt;

&lt;p&gt;The reason wasn't that they were keeping it under wraps. The company stated that Mythos Preview's vulnerability-hunting capabilities were absurdly strong—it could find what they described as "tens of thousands" of zero-day vulnerabilities in mainstream operating systems and browsers that ordinary bug hunters couldn't uncover at all. Fully releasing it posed too much risk. Last week's Opus 4.7 is essentially the "publicly releasable" neutered version of that same generation.&lt;/p&gt;

&lt;p&gt;This is the rhythm now: a new generation is guaranteed every month and a half to two months, and top-tier versions are already being restricted from distribution under the rationale of "too powerful to release publicly."&lt;/p&gt;

&lt;h2&gt;
  
  
  Last Year It Was Three Months Minimum
&lt;/h2&gt;

&lt;p&gt;This pace would have been unimaginable a year ago.&lt;/p&gt;

&lt;p&gt;After DeepSeek R1 came out in early 2025, people discussed it for a solid six months. Talking about R1 in February, still talking about R1 in May and June. V3.1 in between was a minor version bump; other companies released some models too, but they barely made noise. It wasn't until around June or July that Kimi K2 started gaining traction, and it still couldn't beat R1 back then. One model dominated the conversation for half a year.&lt;/p&gt;

&lt;p&gt;Overseas, Google's Gemini had been riding high since last fall. Nano Banana's text-to-image generation made quite an impact. I remember attending some top-tier academic conferences in September or October last year—everyone was talking about Gemini. Strong multimodal capabilities, good at coding, not weak in any category.&lt;/p&gt;

&lt;p&gt;Then suddenly this year everything changed. After March, when talking about models, Google might as well have ceased to exist. Everyone's benchmark was Claude Opus, and then OpenAI caught up from behind. A few days ago I had the team run a blind test: OpenAI's Image Generation 2 and Nano Banana 2 each generated an image, and I deliberately didn't say which was which. The team all picked the OpenAI one. Turned out it was indeed OpenAI. Rumor has it Google's founders triggered another red alert over there, calling on all employees to charge ahead and catch up.&lt;/p&gt;

&lt;p&gt;One quarter ago Google was in the "strongest, most comprehensive" position; one quarter later they're already in "we need to catch up" mode. That's how absurd it is. Advice you gave the team last time might need to be overturned and redone after three months.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Winds Shift Too
&lt;/h2&gt;

&lt;p&gt;Even funnier, even model styles are going through cycles of what goes around comes around.&lt;/p&gt;

&lt;p&gt;My judgment a few months ago: GPT 5.4 was strong on complex tasks but didn't talk like a human; Claude 4.6 talked human but fell a bit short on some hard problems. So push Claude Code.&lt;/p&gt;

&lt;p&gt;Using Opus 4.7 over the past two weeks, I found it's starting to not talk human. Unclear structure, sentences crammed with irrelevant fluff—much worse compared to 4.6's refreshing clarity. People in the community noticed this too, saying Opus 4.7 is starting to "GPT-ify."&lt;/p&gt;

&lt;p&gt;Conversely, GPT-5.5 has started talking human again. Concise, readable, no rambling.&lt;/p&gt;

&lt;p&gt;The two sides seem to have swapped identities.&lt;/p&gt;

&lt;p&gt;This isn't just one model getting better or worse. It's the entire industry moving so fast that the &lt;strong&gt;half-life of your judgment&lt;/strong&gt; is shortening. Three months ago I was quite certain we should push Claude Code; now I'm not so sure. It's not that anyone made the wrong choice—it's that "certainty" is something that's already hard to sustain in this industry.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Best Bang for Your Buck
&lt;/h2&gt;

&lt;p&gt;So I'm increasingly convinced of one thing: &lt;strong&gt;the best investment, bar none, is to immediately buy a coding plan account and start using it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A few hundred dollars a month, or a few hundred RMB for domestic models. You get to use the world's most cutting-edge productivity at the earliest moment. And it upgrades every month and a half. &lt;strong&gt;Problems you can't solve today will likely naturally cease to be problems with the next generation in two months&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why do I emphasize a coding plan instead of using the API directly?&lt;/p&gt;

&lt;p&gt;Recent API experiments revealed that inside products like Claude Code and Codex, prompt caching hit rates are extremely high—the entire product chain is a coherent optimization pipeline. Caching, scheduling, and tool calls are all connected end-to-end. This is also why vendors can sustain such massive usage despite heavy losses. If you mess around with third-party API proxies and workarounds, this optimization chain breaks, making it cost-ineffective and prone to stability issues.&lt;/p&gt;

&lt;p&gt;If you can, go straight for the official coding plan. You can open multiple accounts even—each one is only so much per month. Use each for their strengths.&lt;/p&gt;

&lt;h2&gt;
  
  
  Evolution Happens in Leaps
&lt;/h2&gt;

&lt;p&gt;I've slowly figured the whole thing out: it's not linear.&lt;/p&gt;

&lt;p&gt;It's not like 10 today, 11 tomorrow, 12 the day after. More like: 11, 12, 13, then stuck at 13 for a long time, then one day suddenly jumping to 100.&lt;/p&gt;

&lt;p&gt;After o1 launched at the end of 2024, that winter I sat down with several frontier researchers to discuss. Everyone felt that the reasoning-plus-reinforcement-learning paradigm had a real shot at pushing models to the next scaling law. A few weeks later DeepSeek R1 was open-sourced. I still remember that shock. Top-tier results released directly as open source; R1 shot to the world-class tier the moment it came out, and reasoning became the new default paradigm.&lt;/p&gt;

&lt;p&gt;Looking further back. GPT-4 came out in early 2023, o1 at the end of 2024—nearly two years in between was relatively flat. Honestly, in 2024 I had a sense of "slowdown." In 2023 everyone had their expectations fed too high by the leap from 3.5 to 4, and what followed couldn't keep up.&lt;/p&gt;

&lt;p&gt;In retrospect that wasn't a slowdown; it was storing up energy. The next leap out was reasoning.&lt;/p&gt;

&lt;p&gt;People who research AGI like to draw a staircase: conversation, tool use, agents, research, innovation. You think it's climbing linearly, but actually it wasn't working well for a long time, until one generation suddenly crosses a threshold. &lt;strong&gt;And once one crosses it, everyone can cross it.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Agents Aren't Designed—They're Pushed Out by Models
&lt;/h2&gt;

&lt;p&gt;Another example of paradigm leap.&lt;/p&gt;

&lt;p&gt;In early 2025, the entire community was discussing the same question: where exactly are AI applications? What applications do we build? No one could answer at the time. The mainstream was still IDE-embedded autocomplete (the Cursor route), vector database plus workflow orchestration (the LangChain route), and all sorts of drag-and-drop workflow tools.&lt;/p&gt;

&lt;p&gt;Back then reasoning/thinking capabilities were seen as a burden by many teams. Task flows were already written, logic was fixed; the more the model thought, the easier it was to go off track. Many people recommended "turn off thinking and use a small model for higher efficiency."&lt;/p&gt;

&lt;p&gt;By the Opus 4.5 generation (September/October 2025), context lengths grew, tool calling got stronger, and long-horizon autonomy emerged. Then Manus dropped a demo video, and everyone was stunned after watching it: &lt;strong&gt;models are already strong enough to push forward on long tasks by themselves&lt;/strong&gt;. Some immediately said "there's no real technical difficulty here." And they weren't wrong—it really isn't hard. What's hard is realizing &lt;strong&gt;the models have reached the point where they can be used this way&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Later Claude Code took it even further. It's not an IDE embed, it's a CLI. A chat window in the terminal—you talk to it, it goes and does things itself. Many people at first couldn't accept "this can actually get work done?" After two days of use they realized: &lt;strong&gt;when models get strong enough, the IDE itself becomes a burden&lt;/strong&gt;—clicking around inside it actually limits their performance.&lt;/p&gt;

&lt;p&gt;Looking back now at that early-2025 question of "where are AI applications," &lt;strong&gt;it no longer exists&lt;/strong&gt;. It wasn't solved; the question became meaningless. Models can do so much that the boundaries of the question dissolved.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ordinary People Can Participate From Day One
&lt;/h2&gt;

&lt;p&gt;One final thing I find particularly remarkable. This is something that &lt;strong&gt;ordinary people can participate in from the very first moment&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For technologies of this magnitude in the past—nuclear power, aerospace, semiconductors—it was impossible for ordinary people to touch the cutting edge within days of something new landing. AI is different. Before new models release, some insiders get early access a few weeks ahead, but the time gap is at most a month. Drag it out any longer and competitors catch up, diluting the lead.&lt;/p&gt;

&lt;p&gt;So every time a new model drops, you and I can basically use it the same day. With an ordinary laptop, a few hundred bucks a month, you can judge how well it works, where it's strong, where it falls over. Sometimes you spot problems before the people who built it.&lt;/p&gt;

&lt;p&gt;This sense of participation never existed before. The era itself is accelerating, and everyone caught in its rhythm gets pulled in.&lt;/p&gt;

&lt;p&gt;A few hundred a month to stand at the frontier—I've never encountered such a good deal in my life.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/introducing-gpt-5-4/" rel="noopener noreferrer"&gt;Introducing GPT-5.4 — OpenAI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-6" rel="noopener noreferrer"&gt;Claude Opus 4.6 — What's New — Anthropic&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7" rel="noopener noreferrer"&gt;Claude Opus 4.7 — What's New — Anthropic&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://red.anthropic.com/2026/mythos-preview/" rel="noopener noreferrer"&gt;Claude Mythos Preview — red.anthropic.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/glasswing" rel="noopener noreferrer"&gt;Anthropic Project Glasswing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities/" rel="noopener noreferrer"&gt;Anthropic 'Mythos' AI model representing 'step change' — Fortune&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cnbc.com/2026/04/16/anthropic-claude-opus-4-7-model-mythos.html" rel="noopener noreferrer"&gt;Anthropic rolls out Claude Opus 4.7, less risky than Mythos — CNBC&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.deepseek.com/" rel="noopener noreferrer"&gt;DeepSeek R1 &amp;amp; 官方主页&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deepmind.google/models/gemini/" rel="noopener noreferrer"&gt;Google DeepMind Gemini&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cursor.com/" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://manus.im/" rel="noopener noreferrer"&gt;Manus&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/product/claude-code" rel="noopener noreferrer"&gt;Anthropic Claude Code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.moonshot.cn/" rel="noopener noreferrer"&gt;Moonshot Kimi&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.com/o1/" rel="noopener noreferrer"&gt;Introducing OpenAI o1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/monthly-model-cadence" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/monthly-model-cadence&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>models</category>
      <category>openai</category>
      <category>anthropic</category>
    </item>
    <item>
      <title>On LLM Pricing: Supply Is Locked by Chips, the Rest Is Business Philosophy</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 05:04:18 +0000</pubDate>
      <link>https://forem.com/skyguan92/on-llm-pricing-supply-is-locked-by-chips-the-rest-is-business-philosophy-3380</link>
      <guid>https://forem.com/skyguan92/on-llm-pricing-supply-is-locked-by-chips-the-rest-is-business-philosophy-3380</guid>
      <description>&lt;p&gt;Recently chatted with some friends at model companies, and found that everyone is struggling with pricing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Back to the Economics Textbook
&lt;/h2&gt;

&lt;p&gt;In economics, price is fundamentally a supply-demand regulator. When things get expensive, demand drops and supply rises, ultimately shrinking transaction volume; when things are cheap, the reverse happens. After a few rounds of tug-of-war, the market finds an equilibrium: no one wants to raise prices further, and no one wants to lower them either.&lt;/p&gt;

&lt;p&gt;The strange thing about large language models is that neither side is playing by these rules.&lt;/p&gt;

&lt;h2&gt;
  
  
  Supply Is Locked by Chip Controls
&lt;/h2&gt;

&lt;p&gt;Domestic market demand for models is polarized: demand for good models is ridiculously high, while slightly weaker models barely get used. The main driver behind this is coding—once programming takes off, agents follow suit, and capabilities like tool calling and long-chain reasoning depend heavily on the very top-tier models. This is completely different from the old landscape where "chat APIs work for everyone."&lt;/p&gt;

&lt;p&gt;But supply can't keep up.&lt;/p&gt;

&lt;p&gt;Top-tier inference requires NVIDIA Hopper (H100/H200) or the newer Blackwell (B200/B300). The current status of these chips: B200 and B300 are directly embargoed by the US; H200 was &lt;a href="https://www.cfr.org/articles/new-ai-chip-export-policy-china-strategically-incoherent-and-unenforceable" rel="noopener noreferrer"&gt;partially loosened under new rules in January 2026&lt;/a&gt;, but carries a 25% export tax and total volume is capped at half of historical shipment levels. Domestic filing and approval processes are also extremely strict—&lt;a href="https://www.tomshardware.com/tech-industry/chinese-customs-told-to-block-h200-imports-report-claims-directive-would-effectively-ban-the-nvidia-ai-chip-from-china" rel="noopener noreferrer"&gt;earlier this year, Shenzhen customs reportedly refused H200 customs declarations outright&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The domestic replacement path can't be rushed either. Huawei's most powerful Ascend 910C &lt;a href="https://www.bloomberg.com/news/articles/2025-09-29/huawei-to-double-output-of-top-ai-chip-as-nvidia-wavers-in-china" rel="noopener noreferrer"&gt;has a target production capacity of 600,000 units this year&lt;/a&gt;, which sounds like a lot. But &lt;a href="https://newsletter.semianalysis.com/p/huawei-ascend-production-ramp" rel="noopener noreferrer"&gt;HBM is the real bottleneck&lt;/a&gt;—CXMT's HBM capacity is only enough to ultimately package roughly 250,000 to 300,000 910C units. The next generation of domestic chips won't see mass delivery until around 2027. Semiconductor capacity ramp-up is slow work; neither a technological breakthrough nor a freshly signed joint venture can accelerate it.&lt;/p&gt;

&lt;p&gt;This creates a problem. Even if you raise prices by 10x, the market can't conjure up more B300s. &lt;strong&gt;Supply in this market has almost no elasticity.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In the short term, only two layers can move: the model layer making inference cheaper and faster, and the infra layer making interconnect and scheduling across cards more efficient. Both are happening, and the effects are immediate—&lt;a href="https://a16z.com/llmflation-llm-inference-cost/" rel="noopener noreferrer"&gt;over the past three years, the cost of inference at a fixed capability level has fallen roughly 10x per year, from about $60 per million tokens for GPT-3-class output to a few cents, close to a 1000x reduction&lt;/a&gt;. But notice: this is the entire supply curve shifting downward, not supply elasticity improving. Technological breakthroughs happen when they happen—they have nothing to do with how much you're willing to pay.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demand Has Been Carved into Three Completely Different Shapes
&lt;/h2&gt;

&lt;p&gt;The demand side is even more interesting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The top-tier users have virtually no price elasticity.&lt;/strong&gt; If your model can genuinely solve problems they can't solve themselves, a 10% price increase is completely irrelevant. This resembles the early GPT-4 era—that small group of people truly using AI to reconstruct their workflows cares about capability boundaries, not unit price. Seedance 2.0 is a fairly typical example: &lt;a href="https://technode.com/2026/03/05/bytedances-seedance-2-0-video-model-costs-about-0-14-per-second/" rel="noopener noreferrer"&gt;roughly 1 RMB per second&lt;/a&gt;, which doesn't sound cheap. But for a commercial user who would otherwise spend over 10,000 RMB to produce a polished video, this price is utterly irrelevant—whether it rises to 1.5 RMB or drops to 0.8 RMB per second, they don't feel it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The least intensive users are extremely sensitive.&lt;/strong&gt; Chinese internet history has cultivated a habit of "try it for free first." These users might have been customers of &lt;a href="https://api-docs.deepseek.com/quick_start/pricing/" rel="noopener noreferrer"&gt;DeepSeek's bargain-basement API&lt;/a&gt; ($0.27 per million tokens input, $1.10 per million tokens output), where a few dozen RMB could last them for months. In their eyes, a 9.9 RMB membership or first-batch token giveaways are worth spending an entire evening figuring out how to optimize. This layer is genuinely sensitive.&lt;/p&gt;
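&lt;p&gt;A back-of-envelope calculation shows why a few dozen RMB lasts this layer for months. The prices are the ones quoted above; the exchange rate and the 3:1 input-to-output token mix are assumptions for illustration.&lt;/p&gt;

```python
# How far does 50 RMB go at DeepSeek's listed API prices?
# Prices from the article: $0.27/M input tokens, $1.10/M output tokens.
# A 7.2 RMB/USD exchange rate and a 3:1 input:output mix are assumptions.

budget_usd = 50 / 7.2                     # ~ $6.94
blended = (3 * 0.27 + 1 * 1.10) / 4       # blended $ per million tokens
millions = budget_usd / blended
print(round(millions, 1))                 # 14.5  (million tokens)
```

Fourteen-odd million tokens is months of casual use for a light chat user, which is exactly why a 9.9 RMB membership feels like a decision worth an evening of optimization to this group.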

&lt;p&gt;&lt;strong&gt;The truly difficult-to-price layer is the one in the middle.&lt;/strong&gt; Heavy-duty programming engineers, researchers, and agent tinkerers. They care about price, but care even more about the value of output per unit of time. Pricing for this layer gets tricky, and it's precisely this layer that has sparked the Coding Plan controversy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Is Coding Plan a False Proposition?
&lt;/h2&gt;

&lt;p&gt;Recently I've repeatedly heard a view in my social circle: Coding Plans shouldn't exist at all.&lt;/p&gt;

&lt;p&gt;This view isn't without merit. The reasoning: a model's cost is token cost, regardless of whether you subscribe monthly—fundamentally, it's the same as electricity. And in practice, there have indeed been plenty of shenanigans. Some companies offer Coding Plans but can't actually serve the promised quotas: users can't connect for half of a 5-hour cycle and are throttled for the rest, so the "quota exhausted" message is simply a fraudulent experience.&lt;/p&gt;

&lt;p&gt;Hearing this discussion gave me a strong sense of déjà vu, because the debate is so familiar. It's the same old "membership vs. commission" argument from the platform-economy years, just wearing a different shell.&lt;/p&gt;

&lt;p&gt;My conclusion is similar to back then: both have their reason to exist.&lt;/p&gt;

&lt;p&gt;The benefit of Coding Plans isn't to make things easier for model companies, but to give buyers peace of mind. Note that &lt;strong&gt;the user is not necessarily the customer&lt;/strong&gt;—especially in B2B scenarios, employees are users while the enterprise is the customer. If an enterprise wants to equip 100 engineers with AI coding assistants, token-based billing alone would torment finance and IT with budget management. Giving everyone a fixed-quota Coding Plan essentially encapsulates the uncertainty of token billing entirely, providing the enterprise with a hard budget constraint. This is what Coding Plans actually solve: not pricing, but uncertainty management.&lt;/p&gt;
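&lt;p&gt;The uncertainty-management point can be made concrete with a tiny simulation. All numbers here are invented: a hypothetical per-token price, a hypothetical seat price, and a heavy-tailed usage distribution standing in for the one or two engineers who suddenly run an agent all weekend.&lt;/p&gt;

```python
import random

random.seed(0)

PRICE_PER_M = 3.0    # hypothetical $ per million tokens
SEAT_PRICE = 100.0   # hypothetical flat $ per engineer per month
ENGINEERS = 100

def monthly_usage_bill():
    # Pareto-distributed token usage (in millions of tokens): most
    # engineers are light users, a handful are extremely heavy.
    return sum(random.paretovariate(1.5) * 5 * PRICE_PER_M
               for _ in range(ENGINEERS))

months = [monthly_usage_bill() for _ in range(120)]
print(f"usage billing: {min(months):,.0f} .. {max(months):,.0f} per month")
print(f"flat plan:     {SEAT_PRICE * ENGINEERS:,.0f} every month")
```

Finance cannot budget the first line; it can budget the second. That spread, not the average, is what the Coding Plan encapsulates.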

&lt;p&gt;Think about it—this is exactly the essence of the Costco model. &lt;a href="https://strategycapstone.org/costco-business-model/" rel="noopener noreferrer"&gt;Costco's membership fees contribute roughly two-thirds of its net operating profit&lt;/a&gt;, while the stores themselves basically break even. But what membership brings is something else: it filters out the core users with high frequency, high repurchase rates, and high average order values, front-loading the "decide whether to come" decision to a single annual moment, eliminating subsequent decision costs for each purchase. This is precisely what Coding Plans do in B2B scenarios.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic and OpenAI Represent Two Business Philosophies
&lt;/h2&gt;

&lt;p&gt;Looking back at the evolution of the Coding Plan product is quite interesting.&lt;/p&gt;

&lt;p&gt;ChatGPT launched with both API and membership tracks from day one. But early ChatGPT Plus wasn't primarily about capped monthly subscriptions—it was "pay for membership to access better models." It was a consumer benefits-oriented membership, with caps merely serving as throttling limits. The company that truly established the name "Coding Plan" was Anthropic, following the Claude Code product, &lt;a href="https://stormy.ai/blog/claude-code-gtm-strategy-anthropic-revenue-2026" rel="noopener noreferrer"&gt;with Claude Code alone reaching an annualized revenue of $2.5 billion by February 2026&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Behind this lie two fundamentally different commercial logics for Anthropic and OpenAI.&lt;/p&gt;

&lt;p&gt;Anthropic is playing for depth: aiming at the heaviest, most willing-to-pay, stickiest users, pushing model quality to the limit. Its business model closely resembles Costco: making money from membership fees rather than individual transactions, filtering people in and then locking them down with an extremely strong product experience. These people are programming developers. So you see that &lt;a href="https://www.the-ai-corner.com/p/anthropic-30b-arr-passed-openai-revenue-2026" rel="noopener noreferrer"&gt;Anthropic's ARR reached $30 billion in April 2026, surpassing OpenAI's $25 billion for the first time&lt;/a&gt;, with 80% of revenue coming from enterprises.&lt;/p&gt;

&lt;p&gt;OpenAI is playing for breadth. &lt;a href="https://techcrunch.com/2026/02/27/chatgpt-reaches-900m-weekly-active-users/" rel="noopener noreferrer"&gt;ChatGPT weekly active users have exceeded 900 million&lt;/a&gt;, and it is going all-in on multimodality. Its consumer-origin logic means its membership corresponds to broader "usage rights" rather than token package quotas. These two approaches have grown into completely different forms over the past year.&lt;/p&gt;

&lt;p&gt;But once the Coding Plan model targeting heavy users truly worked, a problem emerged: users were too heavy, so heavy that even Anthropic itself couldn't afford to serve them. On April 4 this year, Anthropic formally &lt;a href="https://thenextweb.com/news/anthropic-openclaw-claude-subscription-ban-cost" rel="noopener noreferrer"&gt;banned Claude Pro and Max subscriptions from being used in third-party agent frameworks like OpenClaw&lt;/a&gt;, with a blunt reason: subscriptions weren't built for the usage patterns of these third-party tools. At the time, an estimated 135,000 OpenClaw instances were running, with subscription prices and equivalent API costs differing by more than 5x. This was essentially a group of people using Anthropic's money to do things Anthropic was unwilling to do; getting cut off was only a matter of time.&lt;/p&gt;

&lt;p&gt;Some domestic companies originally benchmarking against OpenAI are also beginning to shift toward the Anthropic direction. Zhipu &lt;a href="https://www.trendforce.com/news/2026/02/16/news-rising-costs-and-demand-drive-chinas-llm-price-jump-zhipu-glm%E2%80%915-hikes-30-in-first-2026-increase/" rel="noopener noreferrer"&gt;has raised GLM Coding Plan prices twice in the past six months, in February and April&lt;/a&gt;, with subscriptions rising 30% to 60% overall and enterprise APIs increasing 67% to 100%. This isn't simply about wanting to charge more; supply is genuinely so tight that price adjustments have become unavoidable.&lt;/p&gt;

&lt;h2&gt;
  
  
  So What?
&lt;/h2&gt;

&lt;p&gt;Putting all of the above together, we can arrive at a few unsexy but fairly solid conclusions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Supply won't become elastic in the short term.&lt;/strong&gt; Rising prices are the trend. What model companies can do is make models smaller, inference faster, and infra thicker—all of which push the supply curve downward as a whole. But this is entirely different from "raising prices can get you more supply."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The three-layer demand split makes "unified pricing" impossible.&lt;/strong&gt; The top tier doesn't mind paying a premium, while the most casual users won't accept any charge at all. The middle layer is where the real negotiation happens, and it is further divided into two payment scenarios: individual customers paying out of pocket, and employees paid for by enterprises. The former is better suited to usage-based billing, the latter to Coding Plans.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The two models aren't mutually exclusive; they serve different purposes.&lt;/strong&gt; Those shouting "Coding Plans should disappear" are underestimating the uncertainty management needed in enterprise scenarios. Those shouting "usage-based billing is the only real deal" are underestimating the instinctive "see the price list, then churn" reaction in individual customer scenarios.&lt;/p&gt;

&lt;p&gt;To achieve democratized AGI, the Coding Plan model won't work—one person consuming the compute of a hundred people is unsustainable no matter how much you subsidize. To deliver the ultimate experience, usage-based billing can't retain enterprises—it's not a money issue, it's a budget management issue. It's nearly impossible for any single company to walk both paths simultaneously, which is why we've seen such a degree of divergence between Anthropic and OpenAI.&lt;/p&gt;

&lt;p&gt;When I look at LLM company pricing now, it's similar to looking at Costco versus Walmart in retail—there's no right or wrong, just different philosophies. Which path gets walked to the end depends on which type of user the company wants to capture, and whether it can continuously deepen the value for that group.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.the-ai-corner.com/p/anthropic-30b-arr-passed-openai-revenue-2026" rel="noopener noreferrer"&gt;Anthropic's ARR Surpassed $30 Billion in April 2026, Exceeding OpenAI for the First Time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://stormy.ai/blog/claude-code-gtm-strategy-anthropic-revenue-2026" rel="noopener noreferrer"&gt;Claude Code Alone Reached $2.5 Billion ARR in February 2026 (Stormy AI)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.howdoiuseai.com/blog/2026-04-16-claude-code-pricing-2026-plans-costs-and-free-tier" rel="noopener noreferrer"&gt;Claude Code Pro / Max 5x / Max 20x Pricing Details&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://thenextweb.com/news/anthropic-openclaw-claude-subscription-ban-cost" rel="noopener noreferrer"&gt;Anthropic Cut Off Subscription Usage for Third-Party Frameworks Including OpenClaw in April 2026 (TNW)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cfr.org/articles/new-ai-chip-export-policy-china-strategically-incoherent-and-unenforceable" rel="noopener noreferrer"&gt;New US Rules in January 2026: 25% Tax on H200 Exports to China, B200/B300 Still Embargoed (CFR)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.tomshardware.com/tech-industry/chinese-customs-told-to-block-h200-imports-report-claims-directive-would-effectively-ban-the-nvidia-ai-chip-from-china" rel="noopener noreferrer"&gt;Chinese Customs Once Ordered Logistics Companies to Stop H200 Customs Declarations (Tom's Hardware)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.bloomberg.com/news/articles/2025-09-29/huawei-to-double-output-of-top-ai-chip-as-nvidia-wavers-in-china" rel="noopener noreferrer"&gt;Huawei Ascend 910C 2026 Target Production Capacity: 600,000 Units (Bloomberg)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://newsletter.semianalysis.com/p/huawei-ascend-production-ramp" rel="noopener noreferrer"&gt;HBM Is the Real Bottleneck for Huawei Ascend (SemiAnalysis)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://a16z.com/llmflation-llm-inference-cost/" rel="noopener noreferrer"&gt;LLMflation: LLM Inference Costs Fell ~1000x in 3 Years (a16z)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://api-docs.deepseek.com/quick_start/pricing/" rel="noopener noreferrer"&gt;DeepSeek Official API Pricing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://technode.com/2026/03/05/bytedances-seedance-2-0-video-model-costs-about-0-14-per-second/" rel="noopener noreferrer"&gt;Seedance 2.0 at Roughly 1 RMB/Second (TechNode)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://techcrunch.com/2026/02/27/chatgpt-reaches-900m-weekly-active-users/" rel="noopener noreferrer"&gt;ChatGPT Weekly Active Users Exceeded 900 Million in February 2026 (TechCrunch)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.trendforce.com/news/2026/02/16/news-rising-costs-and-demand-drive-chinas-llm-price-jump-zhipu-glm%E2%80%915-hikes-30-in-first-2026-increase/" rel="noopener noreferrer"&gt;Zhipu GLM-5 Raised Prices Twice in February and April 2026 (TrendForce)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://strategycapstone.org/costco-business-model/" rel="noopener noreferrer"&gt;Costco Business Model: Membership Fees Contribute Most Operating Profit (Strategy Capstone)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/llm-pricing-no-silver-bullet" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/llm-pricing-no-silver-bullet&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>llms</category>
      <category>pricing</category>
      <category>businessmodels</category>
    </item>
    <item>
      <title>The Last Piece of the Puzzle: Vibing an Inference Engine</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 04:57:27 +0000</pubDate>
      <link>https://forem.com/skyguan92/the-last-piece-of-the-puzzle-vibing-an-inference-engine-2210</link>
      <guid>https://forem.com/skyguan92/the-last-piece-of-the-puzzle-vibing-an-inference-engine-2210</guid>
      <description>&lt;p&gt;I saw a piece of news yesterday that held my attention for a while.&lt;/p&gt;

&lt;p&gt;23-year-old Liam Price has no advanced mathematical background. On an ordinary Monday afternoon, he casually tossed a problem numbered Erdős #1196 into ChatGPT. After several rounds of picking and choosing, he unexpectedly cracked a problem that had stumped the mathematics community for 60 years. After reviewing it, Terence Tao said something quite interesting: those who had looked at this problem before had "gone off track from the very first step."&lt;/p&gt;

&lt;p&gt;That news kept me up most of that night. It wasn't envy—it made me start re-examining the projects in my hands.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Puzzle, Missing the Last Piece
&lt;/h2&gt;

&lt;p&gt;Working on edge AI, I've always had a puzzle in mind.&lt;/p&gt;

&lt;p&gt;The core contradiction to solve is a classic one. On one side is SOTA: the hope that AI models running on small form-factor devices can still achieve the industry's strongest performance, across various operating systems and hardware. On the other side is TCO: the total cost of ownership for each device needs to be compressed as close as possible to hardware plus electricity.&lt;/p&gt;

&lt;p&gt;Neither side alone is particularly difficult; it's putting them together that becomes awkward. To achieve SOTA, you need experts to tune specifically for that machine's hardware, inference engine, and model. A parameter tuner earning twenty thousand a month isn't expensive, but on a twenty-thousand-yuan edge device, the math doesn't add up. I wrote about the full story behind this in &lt;a href="https://dev.to/blog/edge-ai-inference-tco-trap"&gt;Edge AI Inference: Computing Goldmine or Management Black Hole?&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To close this gap, I built two puzzle pieces.&lt;/p&gt;

&lt;p&gt;The first is called &lt;a href="https://dev.to/blog/why-we-built-aima"&gt;AIMA&lt;/a&gt;. A knowledge-driven management platform that embeds tuning knowledge, letting Agents run benchmarks and tune parameters on each device themselves, gradually approaching that machine's optimal performance. Put plainly, it's having AI take over the job of "that person with the twenty-thousand monthly salary."&lt;/p&gt;

&lt;p&gt;The second is called &lt;a href="https://dev.to/blog/why-i-built-aima-service"&gt;AIMA Service&lt;/a&gt;. For after-sales maintenance when devices fail, engineers used to have to remotely connect and operate manually. AIMA Service lets Agents directly take over: diagnosis, tuning, and fixing failures end-to-end, swallowing anything that exceeds the on-site manager's cognitive boundaries.&lt;/p&gt;

&lt;p&gt;With both puzzle pieces in place, I thought I was nearly done. Until I recently started actually using them and discovered something awkward: I had finished an entire management layer, but &lt;strong&gt;had no inference engine to manage&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Engines Available Now: Each Makes Me Frown
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Ollama: Best Experience, Most Regrettable Performance
&lt;/h3&gt;

&lt;p&gt;Ollama's user experience is genuinely excellent. Install it, click once, models download automatically, one-click start—cross macOS / Windows / Linux, everything is silky smooth. Almost everyone who first touches private deployment starts here.&lt;/p&gt;

&lt;p&gt;But it has two fatal flaws.&lt;/p&gt;

&lt;p&gt;The first is performance you can never quite reach. What Ollama actually runs under the hood is llama.cpp, which has an enormous number of tunable parameters. That's its strength (sufficiently flexible) and its weakness (the vast majority of people can't find the optimal set). Ollama simply chooses a conservative set of defaults for everyone, letting ordinary users run out of the box, at the cost of almost never obtaining that machine's maximum performance.&lt;/p&gt;

&lt;p&gt;The second is the default concurrency of one. &lt;code&gt;OLLAMA_NUM_PARALLEL&lt;/code&gt; ships as 1 by default, with every request queuing up obediently. You can raise it, but doing so means context scales linearly with concurrency and memory has to be recalculated—it's not as simple as flipping a switch. Batch tasks, Agent multi-step calls—all of these hurt when they run into Ollama.&lt;/p&gt;
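&lt;p&gt;For reference, raising the concurrency is a one-line environment-variable change; the cost is memory, since each parallel slot needs its own slice of context. The specific values below are examples, not recommendations.&lt;/p&gt;

```shell
# Serve up to 4 requests per loaded model in parallel (default is 1).
# Each parallel slot multiplies the context memory the model needs,
# so size this to your RAM/VRAM, not just your request rate.
OLLAMA_NUM_PARALLEL=4 OLLAMA_MAX_LOADED_MODELS=1 ollama serve
```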

&lt;h3&gt;
  
  
  llama.cpp: Parameters Are Tunable, but Precision Gets Eaten by Its Own Format
&lt;/h3&gt;

&lt;p&gt;Used directly instead of through Ollama, llama.cpp can deliver 5–10% more performance, concurrency is more controllable, and it runs on all kinds of hardware. It's one of the most cross-platform engines available.&lt;/p&gt;

&lt;p&gt;But there's one thing I hadn't noticed before and only realized recently while building: it must convert models from safetensors to its own GGUF format before running.&lt;/p&gt;

&lt;p&gt;This step wasn't a problem in past chat scenarios. But once you enter Agent use cases with multi-step calls, trouble arrives.&lt;/p&gt;

&lt;p&gt;Even without any quantization, just this format conversion alone causes a fairly obvious precision drop on agent tasks. Go further with quantization and things get worse. I've looked at some public data: Q4_K_M on certain small models drops accuracy from 0.87 to nearly half that, and Q3_K_M crashes outright; Llama 3.2 3B converted to q4_k_m drops MMLU from 64.2 to 61.8.&lt;/p&gt;

&lt;p&gt;These numbers are tolerable in a single step. But an Agent task can run dozens of steps, losing a bit of precision at each step, and compound decay makes it a completely different story. One study found that when conversation length increases by 50%, efficiency drops 3–5%; after more than 12 rounds, Agents start performing large amounts of meaningless repetitive operations, stuffing the context full of garbage.&lt;/p&gt;
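&lt;p&gt;The compounding is easy to underestimate, so here is the arithmetic with illustrative numbers: suppose format conversion plus quantization shaves per-step reliability from 99% to 96%.&lt;/p&gt;

```python
# Probability that an agent run survives N steps, assuming each step
# succeeds independently with probability p. Numbers are illustrative.
def run_success(p: float, steps: int) -> float:
    return p ** steps

for steps in (1, 10, 30):
    print(steps,
          round(run_success(0.99, steps), 2),
          round(run_success(0.96, steps), 2))
# At 1 step:   0.99 vs 0.96 -- barely noticeable.
# At 30 steps: 0.74 vs 0.29 -- a completely different tool.
```

A 3-point per-step loss becomes a 45-point gap over a 30-step run, which is why a drop that looks harmless on a benchmark table is fatal for agents.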

&lt;p&gt;Everyone is moving toward Agent programming and agent scenarios. The precision problems on this path will only become harder, not easier.&lt;/p&gt;

&lt;h3&gt;
  
  
  vLLM / SGLang: Performance Maxed Out, but Heavy in Another Way
&lt;/h3&gt;

&lt;p&gt;vLLM and SGLang are the other extreme. Performance maxed out, native multimodal support. vLLM now stably supports vision models, plus ASR / TTS, embedding, and rerank interfaces; SGLang supports 30+ multimodal models, diffusion models, even TTS. The standard answer for cloud inference engines.&lt;/p&gt;

&lt;p&gt;But placed at the edge, the problem is straightforward: package sizes are often dozens of GB, deployment is heavy, and cross-platform support is unfriendly. Having an ordinary user run vLLM on their 16GB box is almost unrealistic.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Want My Engine to Look Like
&lt;/h2&gt;

&lt;p&gt;Laying the strengths and pain points of these three side by side, the engine I want essentially stitches their advantages together.&lt;/p&gt;

&lt;p&gt;Out-of-the-box like Ollama, cross-OS, package size in the hundreds-of-megabytes range, click and run. Under the hood as flexible as llama.cpp, not picky about CPU, GPU, or NPU. Performance maxed out like vLLM, with native support for vision, ASR, TTS, embedding, rerank—not patched in later. And one final requirement: no GGUF conversion, ingest safetensors directly, avoiding the precision cliff brought by format conversion.&lt;/p&gt;

&lt;p&gt;Each of these things is being done by someone individually. But stacked together, I haven't seen anyone doing it yet.&lt;/p&gt;

&lt;p&gt;If it gets built, the whole puzzle closes. User experience as foolproof as Ollama, a few hundred megabytes installed, click and run. On the performance side, AIMA works behind the scenes letting Agents auto-tune parameters, approaching that vLLM-level SOTA. Run vision, speech, embedding, rerank locally, with no need to link a bunch of cloud accounts. When problems arise, AIMA Service takes over itself.&lt;/p&gt;

&lt;p&gt;This is actually the last piece of that "box that can run every auxiliary model" I talked about in &lt;a href="https://dev.to/blog/edge-ai-device-blueprint"&gt;The AI Box Should Be as Boring as a Router&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Something I Wouldn't Have Dared to Think About Three Months Ago
&lt;/h2&gt;

&lt;p&gt;At this point I need to mention something else.&lt;/p&gt;

&lt;p&gt;Three months ago my attitude toward "building an inference engine" was: I wouldn't dare think about it. Too low-level, too hardcore; open-source projects doing this were either moribund or internal big-tech efforts.&lt;/p&gt;

&lt;p&gt;But today I've already started building—not just one engine, but several different versions. How did this mindset shift happen?&lt;/p&gt;

&lt;p&gt;Three months ago I first seriously got into AI coding. I assigned the team a small one-week task: write a performance testing script, because I'd been thinking about it for a long time and no one had helped me do it. A week later, everyone had gradually built their tools.&lt;/p&gt;

&lt;p&gt;After finishing, new ideas started forming: could we make a demo? The kind you show at exhibitions. While building that, I thought, could we make a small automatic customer acquisition tool? After building that, could I make a personal website for myself?&lt;/p&gt;

&lt;p&gt;Everything I built surprised me. Expectations were low, but everything worked out.&lt;/p&gt;

&lt;p&gt;At this point I started thinking bigger: the model management platform I'd always wanted to build (AIMA)—could I build it myself? I spent time on it, and surprisingly actually built it. It started handling real business.&lt;/p&gt;

&lt;p&gt;Next I asked: then can I build a cloud service? This is a completely different domain from single-machine software, something I'd never touched before. The result: I could still build it, and people could really use it.&lt;/p&gt;

&lt;p&gt;Then recently I started optimizing hardware-layer performance and got very good results. At that moment I suddenly realized: AI can truly produce breakthroughs in things that originally belonged to the research domain, very low-level things, especially in end-to-end verifiable scenarios.&lt;/p&gt;

&lt;p&gt;Looking back: the software capabilities accumulated before, cloud capabilities, plus this research capability I now have—assemble them together and it's an inference engine. From not daring to think about it to already building it, the gap wasn't skills, but repeated moments of "huh, so this is doable too."&lt;/p&gt;

&lt;p&gt;Every step was "try it without expecting it to work." Every step expanded the answer to "what else can be done next" by another circle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Looking back at that news about the 23-year-old.&lt;/p&gt;

&lt;p&gt;The most memorable part isn't "AI made a mathematician's discovery." The original report was quite frank: the AI's direct output was "very poor quality"; it was the human who picked out the valuable nugget. What's worth remembering is the triggering structure: a 23-year-old outside mathematics circles, without professional training, because of ChatGPT, dared to touch a problem that had troubled the mathematics community for 60 years.&lt;/p&gt;

&lt;p&gt;It's not that AI did it for you; it's that AI made you dare to do it.&lt;/p&gt;

&lt;p&gt;My own last three months have been the same. Starting from a performance testing script, step by step turning things I thought were impossible into things I'm now building. Models get stronger every month; what's blocking you today often becomes just an engineering effort two months later when the next generation model comes out.&lt;/p&gt;

&lt;p&gt;That doesn't mean you should just lie around waiting for models. It means: don't be too quick to conclude that something is impossible today. The scope of what's doable expands every month.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.scientificamerican.com/article/amateur-armed-with-chatgpt-vibe-maths-a-60-year-old-problem/" rel="noopener noreferrer"&gt;Amateur armed with ChatGPT 'vibe-maths' a 60-year-old problem — Scientific American&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://eu.36kr.com/en/p/3784815604817154" rel="noopener noreferrer"&gt;23-Year-Old Amateur Solves 60-Year-Old Math Problem with ChatGPT — 36Kr&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.ollama.com/faq" rel="noopener noreferrer"&gt;Ollama FAQ (OLLAMA_NUM_PARALLEL defaults to 1)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://markaicode.com/ollama-concurrent-requests-parallel-inference/" rel="noopener noreferrer"&gt;Configure Ollama Concurrent Requests — Markaicode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.glukhov.org/llm-performance/ollama/how-ollama-handles-parallel-requests/" rel="noopener noreferrer"&gt;How Ollama Handles Parallel Requests — glukhov.org&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.vllm.ai/en/latest/models/supported_models/" rel="noopener noreferrer"&gt;vLLM Supported Models (multimodal, embedding, rerank)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/sgl-project/sglang" rel="noopener noreferrer"&gt;SGLang Project Homepage&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2511.13023" rel="noopener noreferrer"&gt;SLMQuant: Benchmarking Small Language Model Quantization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.premai.io/llm-quantization-guide-gguf-vs-awq-vs-gptq-vs-bitsandbytes-compared-2026/" rel="noopener noreferrer"&gt;LLM Quantization Guide: GGUF vs AWQ vs GPTQ vs bitsandbytes — premai.io&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.redhat.com/articles/2024/10/17/we-ran-over-half-million-evaluations-quantized-llms" rel="noopener noreferrer"&gt;We ran over half a million evaluations on quantized LLMs — Red Hat Developer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/html/2507.21504v1" rel="noopener noreferrer"&gt;Evaluation and Benchmarking of LLM Agents: A Survey&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/inference-engine-last-puzzle-piece" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/inference-engine-last-puzzle-piece&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>inferenceengine</category>
      <category>edgecomputing</category>
      <category>aima</category>
    </item>
    <item>
      <title>DeepSeek V4 Day: It's About Infra, Not the Model</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 04:37:40 +0000</pubDate>
      <link>https://forem.com/skyguan92/deepseek-v4-day-its-about-infra-not-the-model-4fmg</link>
      <guid>https://forem.com/skyguan92/deepseek-v4-day-its-about-infra-not-the-model-4fmg</guid>
      <description>&lt;p&gt;The AI industry feels like New Year's today.&lt;/p&gt;

&lt;p&gt;OpenAI dropped GPT-5.5 in the morning, and DeepSeek V4 went live in the afternoon. Add DeepMind's Vision Banana from a couple of days ago, plus Haoyu Cai's Anuttacon paper on digital humans. This week has crammed in more new stuff than the entire past quarter. Let me talk through the highlights.&lt;/p&gt;

&lt;h2&gt;
  
  
  DeepSeek V4
&lt;/h2&gt;

&lt;p&gt;I participated in some internal testing before the V4 release. I had to keep it confidential at the time, but the embargo has now lifted.&lt;/p&gt;

&lt;p&gt;First, the specs. V4 launched in two versions: V4-Pro with 1.6T total params and 49B active; V4-Flash at 284B / 13B. Both pre-trained on 32T+ tokens, with 1M context as standard, open-sourced under MIT, supporting three inference modes: non-think / think high / think max. On API pricing, V4-Pro is $1.74 per million tokens input and $3.48 output; V4-Flash is cheaper by an order of magnitude, at $0.14 / $0.28.&lt;/p&gt;
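&lt;p&gt;As a quick sanity check on those prices, here is a hypothetical cost calculation. The per-million-token rates come from the paragraph above; the monthly token volumes are made-up assumptions for illustration only.&lt;/p&gt;

```python
# Hypothetical monthly-cost comparison using the quoted V4 API prices.
# Prices (USD per million tokens) are from the post; the workload
# volumes below are invented purely for illustration.
PRICES = {
    "V4-Pro":   {"input": 1.74, "output": 3.48},
    "V4-Flash": {"input": 0.14, "output": 0.28},
}

def monthly_cost(model, input_tokens, output_tokens):
    """Return the USD cost for a given token volume on one model."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1e6

# Assumed workload: 200M input / 50M output tokens per month.
pro = monthly_cost("V4-Pro", 200e6, 50e6)
flash = monthly_cost("V4-Flash", 200e6, 50e6)
print(f"V4-Pro:   ${pro:,.2f}/mo")    # 200*1.74 + 50*3.48 = 522.00
print(f"V4-Flash: ${flash:,.2f}/mo")  # 200*0.14 + 50*0.28 =  42.00
```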

&lt;p&gt;On capabilities, the team itself admits that V4 sits roughly around the Opus 4.6 tier, perhaps even slightly weaker. Not at the absolute top.&lt;/p&gt;

&lt;p&gt;This is the same pattern as last time with R1 versus O3. Close, but not at the frontier.&lt;/p&gt;

&lt;p&gt;So where does DeepSeek's significance lie? I've always felt it's not a company that crushes competitors with its model; it's a company that leads competitors through Infra. And Infra doesn't follow the model—it moves ahead of it.&lt;/p&gt;

&lt;p&gt;The Infra in this V4 release is a disaster for every inference company on the market. That word is not an exaggeration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Native FP4.&lt;/strong&gt; V4 is an FP8 + FP4 mixed-precision model: MoE experts use FP4, the rest FP8. Right now, most chips and most inference stacks either don't support FP4 or support it very poorly.&lt;/p&gt;
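&lt;p&gt;For readers unfamiliar with FP4: the E2M1 format can represent only sixteen values per number, so weights are typically stored as a shared per-block scale plus an index into a tiny grid. Here is a minimal sketch of that idea; the block size and rounding policy are my own assumptions for illustration, not anything DeepSeek has published.&lt;/p&gt;

```python
import math

# Sketch of FP4 (E2M1) block quantization. The magnitude grid is the
# standard E2M1 set; one shared scale maps the block's max to 6.0.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # representable magnitudes

def quantize_block(values):
    """Quantize a block of floats to FP4 levels with one shared scale."""
    scale = max(abs(v) for v in values) / 6.0 or 1.0  # map block max onto 6.0
    quantized = [
        # snap |v|/scale to the nearest grid point, then restore sign and scale
        math.copysign(min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g)) * scale, v)
        for v in values
    ]
    return quantized, scale

weights = [0.91, -0.02, 0.37, -1.20]
q, s = quantize_block(weights)
print(q)  # each value snapped to one of 16 FP4 levels, times the scale
```

&lt;p&gt;Each stored weight is then just 4 bits plus the shared scale; the hard part, as noted above, is hardware and kernel support for doing math directly in this format.&lt;/p&gt;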

&lt;p&gt;&lt;strong&gt;Operators sliced extremely fine.&lt;/strong&gt; When I ran V4 inference, I found it had done an enormous amount of personalized optimization at the operator layer; mainstream open-source engines basically can't match the official performance. To catch up to its price-performance ratio, you'd have to grind through the low-level compiler line by line.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single machines struggle.&lt;/strong&gt; The previous generation could at least run on a single machine; this generation can't even do that, let alone clusters. Without running the official stack, there's basically no way to hit that price point.&lt;/p&gt;

&lt;p&gt;This reminds me of when DeepSeek disclosed a 545% theoretical gross margin during the V3/R1 era. Meaning if you run strictly on their architecture, margins can be extremely high; at the same time, all the replica inference services were losing money. V4 is a more radical version of that story.&lt;/p&gt;

&lt;p&gt;A side story: the company's Infra lead talked with us and said, very seriously: be careful—technology is changing so fast that some architectures in this generation may turn out to be transitional, and the next generation might drop them entirely. Pour heavy investment into that Infra, and it could all go down the drain when the next generation ships.&lt;/p&gt;

&lt;p&gt;There's a fundamental divergence behind this. Most model companies build the model first and leave Infra for later; DeepSeek puts Infra first, using bottom-level innovation to reverse-engineer the model's economics. Both can survive. But if you really want to serve consumers at scale without thinking through Infra, you'll definitely crash. DeepSeek itself stumbled during its first viral moment—the web page went down, the API dropped. And that was when its Infra was already relatively solid.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Model Itself
&lt;/h3&gt;

&lt;p&gt;From hands-on experience:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chinese text capability remains its strong suit.&lt;/strong&gt; For tasks like Chinese writing, report generation, and content organization, it's worth it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool calling feels pretty good,&lt;/strong&gt; somewhat Claude-like.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's not at the absolute top tier.&lt;/strong&gt; It can't catch the level of GPT-5.5 or Opus 4.7.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No coding plan for now.&lt;/strong&gt; Probably not coming anytime soon. A real shame.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  1 Million Context as the Default
&lt;/h3&gt;

&lt;p&gt;This might be the most noteworthy thing about V4: one million tokens of context, made the default for all online services, with no tiered price increases for long context.&lt;/p&gt;

&lt;p&gt;What does 1 million tokens mean? Roughly 2 million Chinese characters—an entire web novel serialized over one or two years can fit inside.&lt;/p&gt;

&lt;p&gt;Everyone wanted to do this before, but those who actually pulled it off either priced it separately or shut it down after a short while. Anthropic opened its million-tier last year then pulled it back, only re-releasing recently; OpenAI still hasn't officially opened the million-tier API. It's not a capability problem—Infra can't handle the load.&lt;/p&gt;

&lt;p&gt;V4 can make this standard at no extra cost because it performed major surgery on attention. It introduces two structures used in alternation: &lt;strong&gt;CSA (Compressed Sparse Attention)&lt;/strong&gt; and &lt;strong&gt;HCA (Heavily Compressed Attention)&lt;/strong&gt;. CSA first compresses KV 4× along the sequence dimension, then applies sparse attention to pick the most relevant tokens (top-1024 for V4-Pro, top-512 for V4-Flash), paired with a 128-token sliding window to preserve local context. HCA compresses even more aggressively (128× compression ratio), but applies dense attention on the compressed representation, effectively leaving a low-resolution "global summary" for some layers. The two layer types are interleaved throughout the network: some do precise look-up, others do fuzzy global attention. Additionally, a layer of &lt;strong&gt;Manifold-Constrained Hyper-Connections (mHC)&lt;/strong&gt; is stacked on top to stabilize cross-layer signal propagation.&lt;/p&gt;
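&lt;p&gt;To make the CSA mechanics concrete, here is a toy sketch of the select-then-attend step: pool KV 4× along the sequence, score the query against the pooled keys, keep the top blocks plus a local window. The dimensions, mean pooling, and scoring below are simplifications of my own, not DeepSeek's kernels.&lt;/p&gt;

```python
import numpy as np

# Toy sketch of the CSA idea described above: compress keys 4x along
# the sequence, pick the most relevant blocks, add a sliding window.
rng = np.random.default_rng(0)
seq, d, top_k, window = 64, 16, 8, 8

q = rng.standard_normal(d)            # one query vector
keys = rng.standard_normal((seq, d))  # per-token keys

# 1) 4x compression along the sequence dimension (mean pooling here).
pooled = keys.reshape(seq // 4, 4, d).mean(axis=1)   # (16, d)

# 2) Sparse selection: score the query against pooled keys, keep top-k blocks.
scores = pooled @ q                                   # (16,)
top_blocks = np.argsort(scores)[-top_k:]              # indices of best blocks

# 3) Expand selected blocks back to token positions, plus a local window
#    of recent tokens to preserve nearby context.
selected = {t for b in top_blocks for t in range(4 * b, 4 * b + 4)}
selected.update(range(seq - window, seq))
print(f"attending to {len(selected)} of {seq} tokens")
```

&lt;p&gt;In the real model this selection happens per layer and per head, and the interleaved HCA layers instead attend densely over a far more aggressively compressed (128×) summary.&lt;/p&gt;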

&lt;p&gt;The official efficiency numbers: at 1M context, V4-Pro needs only &lt;strong&gt;27%&lt;/strong&gt; of the per-token inference FLOPs of V3.2, and &lt;strong&gt;10%&lt;/strong&gt; of the KV cache. Making the million-token tier standard without price hikes is built on this foundation.&lt;/p&gt;

&lt;p&gt;Kimi was the earliest to push in this direction, back in 2023, betting that a million-token context could cover most scenarios. Three years later, it has finally become an infrastructure-level default capability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day-0 Adaptation for Domestic Chips
&lt;/h3&gt;

&lt;p&gt;This time V4 achieved day-0 deep adaptation for domestic chips like Huawei Ascend. I think that shows real vision.&lt;/p&gt;

&lt;p&gt;Perpetually relying on overseas chips for training and inference isn't a technical problem; it's a risk problem. V4 thinking through domestic chip adaptation on day 0 is more meaningful than the model itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Other Releases This Week
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Vision Banana: Generative Models Actually "Understand" Images
&lt;/h3&gt;

&lt;p&gt;DeepMind released something called Vision Banana these past couple of days. The approach: take Nano Banana Pro, a text-to-image model, do a round of instruction tuning, and have it tackle traditional vision tasks like segmentation, depth estimation, and normal estimation.&lt;/p&gt;

&lt;p&gt;The results match or even beat specialized models like Segment Anything and Depth Anything, without losing image generation capability.&lt;/p&gt;

&lt;p&gt;This is quite interesting. Text-to-image models already possess an intrinsic understanding of images; it's just that no one knew how to "query" that understanding before. Now image understanding and generation are unified under the same interface: all tasks are solved via image-to-image.&lt;/p&gt;

&lt;p&gt;Following this line of thinking, generative models naturally lead to "world models." The dimensions of 2D, 3D, video, and physics may all be folded into a single model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Haoyu Cai's LPM 1.0: Digital Humans Can Finally "Listen"
&lt;/h3&gt;

&lt;p&gt;On April 10, Haoyu Cai—former founder of miHoYo (the studio behind &lt;em&gt;Genshin Impact&lt;/em&gt;)—published a paper on arXiv through his new company Anuttacon. LPM 1.0 is a 1.7-billion-parameter diffusion Transformer for "performance generation" of video characters.&lt;/p&gt;

&lt;p&gt;Digital humans are an exhausted topic. But this paper defines two problems that no one had seriously tackled before: &lt;strong&gt;persistent identity consistency&lt;/strong&gt; and &lt;strong&gt;interaction while listening&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Identity consistency isn't just about stable appearance. It means the character's reactions across different scenes must conform to the same "personality"—you shouldn't suddenly feel like "this isn't the same person."&lt;/p&gt;

&lt;p&gt;Previous digital humans were output-oriented: make it speak, make it move, and those are doing fine now. The truly hard part is listening. When you talk to it, it needs to give you facial feedback, micro-gestures, breathing rhythm—making you feel like there's a living person on the other side. In real life, when you talk to someone, they don't wait expressionlessly for you to finish before responding; they're feeding back the entire time. The volume of this feedback is enormous, and almost no one had done it before.&lt;/p&gt;

&lt;p&gt;The paper is out, but the model isn't open-sourced. Because the more realistic it gets, the more human-like it becomes, and the fraud risk is too high. I think that's the right call.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Opus 4.7 Mess: An Infra Failure
&lt;/h3&gt;

&lt;p&gt;Anthropic has taken a lot of heat over the past couple of weeks. On April 23 the official postmortem identified three overlapping issues:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;On March 4, default reasoning effort was lowered from high to medium to improve UI latency, at the cost of noticeably lower intelligence for Sonnet 4.6 and Opus 4.6.&lt;/li&gt;
&lt;li&gt;On March 26, a feature to clear idle session thinking was launched; a bug caused it to clear every round, making the model forgetful and repetitive.&lt;/li&gt;
&lt;li&gt;On April 16, a system prompt was added to limit response length, causing Opus 4.7 coding quality to drop 3%.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All three were Infra-layer mistakes. Infra accounts for an ever-larger share of whether a model service is actually usable.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Company That Didn't Release a Top Model, Yet Still Dominates Trending
&lt;/h2&gt;

&lt;p&gt;DeepSeek had gone a long time without a new model release, and the intermediate generations barely made waves. Today V4 dropped, and it took several trending spots.&lt;/p&gt;

&lt;p&gt;I see this as strategic strength. It did several things that are highly representative right now: making million-token context standard, pushing FP4 to production, and getting day-0 domestic chip adaptation working. These are all hardcore Infra feats.&lt;/p&gt;

&lt;p&gt;You also have to admit that standing alone at the top is hard in the current landscape. Kimi K2.6, GLM 5.1, MiniMax's new models—the baseline of the entire open-source camp has risen. It's not like the V3/R1 era, when DeepSeek monopolized the open-source high ground.&lt;/p&gt;

&lt;h2&gt;
  
  
  Repeating That Conclusion
&lt;/h2&gt;

&lt;p&gt;A friend came to me this morning. His company wants to transform, wants to "buy an AI product" to help drive team change. Financial industry, not many people.&lt;/p&gt;

&lt;p&gt;My exact words to him were: don't rush into talking about transformation. First make everyone in the company a heavy user of coding agents, then talk about organizational change.&lt;/p&gt;

&lt;p&gt;Then I cast my desktop to show him how I use Claude Code every day, how many agent threads are running simultaneously on my screen. His first reaction after watching was to immediately go subscribe to a coding plan.&lt;/p&gt;

&lt;p&gt;That reaction is the right one. The best investment in this era is buying a coding plan and using it every day. Not the "used ChatGPT a few times" kind of usage, but really letting agents into your daily workflow. Without that foundation, organizational-level change is a castle in the air.&lt;/p&gt;

&lt;p&gt;2026 will definitely go down in the books. Not because of any single model, but because of density—three models and a paper can drop in a single day. If you can feel this rhythm, you're already in the arena.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://ofox.ai/blog/deepseek-v4-release-guide-2026/" rel="noopener noreferrer"&gt;DeepSeek V4 Release Announcement (ofox.ai)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro" rel="noopener noreferrer"&gt;DeepSeek-V4-Pro on Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash" rel="noopener noreferrer"&gt;DeepSeek-V4-Flash on Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://api-docs.deepseek.com/quick_start/pricing" rel="noopener noreferrer"&gt;DeepSeek API Pricing (Official)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2026/Apr/24/deepseek-v4/" rel="noopener noreferrer"&gt;Simon Willison: DeepSeek V4—almost on the frontier&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tech.ifeng.com/c/8saBEvg22eB" rel="noopener noreferrer"&gt;DeepSeek V4 Partners with Huawei Ascend (Phoenix Network)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://zhuanlan.zhihu.com/p/27570317822" rel="noopener noreferrer"&gt;DeepSeek 545% Theoretical Gross Margin Disclosure&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/introducing-gpt-5-5/" rel="noopener noreferrer"&gt;Introducing GPT-5.5 — OpenAI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://vision-banana.github.io/" rel="noopener noreferrer"&gt;Vision Banana — Google DeepMind&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://finance.sina.com.cn/roll/2026-04-11/doc-inhucmsx9427887.shtml" rel="noopener noreferrer"&gt;Anuttacon LPM 1.0 Paper Report (Sina Finance)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/engineering/april-23-postmortem" rel="noopener noreferrer"&gt;Anthropic: An update on recent Claude Code quality reports&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kimi-k2.org/blog/24-kimi-k2-6-release" rel="noopener noreferrer"&gt;Moonshot Releases Kimi K2.6&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.buildfastwithai.com/blogs/glm-5-1-open-source-review-2026" rel="noopener noreferrer"&gt;Zhipu Releases GLM-5.1 (Build Fast With AI Review)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/deepseek-v4-infra-matters" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/deepseek-v4-infra-matters&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>deepseek</category>
      <category>infra</category>
      <category>model</category>
    </item>
    <item>
      <title>AI Has Turned Ignorance into an Advantage</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 04:23:22 +0000</pubDate>
      <link>https://forem.com/skyguan92/ai-has-turned-ignorance-into-an-advantage-41k4</link>
      <guid>https://forem.com/skyguan92/ai-has-turned-ignorance-into-an-advantage-41k4</guid>
      <description>&lt;p&gt;Today, I ran into two things—one that made this era feel a little unreal, and one that made it feel very real.&lt;/p&gt;

&lt;h2&gt;
  
  
  I. A 23-Year-Old Outsider Cracked a 60-Year-Old Conjecture
&lt;/h2&gt;

&lt;p&gt;Liam Price, 23, with no advanced mathematical training. On a Monday afternoon a week ago, he casually tossed an unsolved Erdős problem to GPT-5.4 Pro. Problem #1196, a conjecture about "primitive sets," had stumped humans for 60 years. In a single session of about 80 minutes, the model produced a proof.&lt;/p&gt;

&lt;p&gt;Cambridge undergraduate Kevin Barreto helped organize it. Later, Jared Lichtman and Terence Tao participated in simplifying it, distilling the key insight from the LLM's originally rough output. Terence Tao's comment was rather cutting:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Humans had looked at this problem. Everyone who looked at it collectively went down the wrong path at step one."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It wasn't that no one had thought about it; it was that everyone who had thought about it made the same mistake. The path the LLM took was common knowledge in an adjacent mathematical field—no one had bothered to bring it over.&lt;/p&gt;

&lt;p&gt;Recently, a term has started trending: vibe-math. It means you don't really understand the field; you toss the problem into ChatGPT and see how it fumbles around.&lt;/p&gt;

&lt;p&gt;That same week, GPT-5.4 Pro also scored 150 on the Mensa Norway IQ test, surpassing 99.96% of humans. OpenAI's previous high score was o3 at 136. A 14-point jump in one year.&lt;/p&gt;

&lt;p&gt;When AI consistently surpasses 95% of human experts, "I don't understand anything, so I'll just try" becomes a structural advantage.&lt;/p&gt;

&lt;h2&gt;
  
  
  II. Why "Knowing the Trade" Becomes a Constraint
&lt;/h2&gt;

&lt;p&gt;Here's a real feeling from a recent project of mine.&lt;/p&gt;

&lt;p&gt;You ask AI to evaluate a project. It lays out a plan: step one, one to two weeks; step two, two to three weeks; totaling three months. It looks reasonable because this is the timeline a human who had done this before would write.&lt;/p&gt;

&lt;p&gt;But if you actually let it execute that plan, it finishes in a few days.&lt;/p&gt;

&lt;p&gt;When evaluating, it imitates human "priors"; when executing, it's an entirely different creature.&lt;/p&gt;

&lt;p&gt;Back to Liam Price. The mathematicians who had looked at Erdős #1196 before him all opened with the same set of moves, because that opening had been considered the "standard approach" to this type of problem for 60 years. That's the prior. Priors used to be good things—shortcuts that saved time. But once the underlying tool changes, the prior binds you instead.&lt;/p&gt;

&lt;p&gt;Someone deeply embedded in a field has an intuitive sense of "cost" and "difficulty"—how many people, how much money, how much time. That intuition determines whether they're willing to even think about a problem.&lt;/p&gt;

&lt;p&gt;Imagine someone from the Ming dynasty trying to conceive of a moon landing program.&lt;/p&gt;

&lt;p&gt;They'd need to coordinate remotely among the top astronomers and rocket engineers (a profession they've never heard of) across several nations, iterating through trial and error. Just thinking about getting a letter to those people and waiting for a reply makes the whole thing inconceivable. It's not that Ming dynasty scholars lacked ability—they gave up thinking about it from day zero.&lt;/p&gt;

&lt;p&gt;Your "priors" today work the same way. The ROI is too low on one thing; another requires crossing too many organizational layers; another simply takes too long—so you don't even consider it.&lt;/p&gt;

&lt;h2&gt;
  
  
  III. Outsiders Don't Carry That Burden
&lt;/h2&gt;

&lt;p&gt;Outsiders don't know what's difficult.&lt;/p&gt;

&lt;p&gt;To them, having AI solve a 60-year-old math conjecture and having AI write an automated weekly report or put together a PowerPoint feel roughly the same in subjective difficulty. They can't do either anyway, so both seem hard. So they're willing to try both. Trying doesn't cost much—just some tokens. What if it works?&lt;/p&gt;

&lt;p&gt;Hit it once, and they've advanced further than someone who spent a lifetime digging in that field.&lt;/p&gt;

&lt;p&gt;I previously wrote &lt;a href="https://dev.to/blog/ai-amplifies-passion"&gt;AI Doesn't Amplify Skill, It Amplifies Passion&lt;/a&gt;. That piece argued that passion determines whether you're willing to keep investing. This time I want to press the point further: before you even get to "willing," the question of "do you think this is difficult?" filters out most people. Outsiders bypass that filter.&lt;/p&gt;

&lt;p&gt;Recently, Luo Fuli and Zhang Xiaojun talked for 3.5 hours, mainly about the paradigm shift from pre-training to post-training and how organizations should change. A shared feeling emerged from the conversation: the people moving fastest right now are those with less baggage and fewer priors about "I know how hard this is." AI has stuffed resources that used to be mobilizable only by top institutions into the hands of individuals, almost for free. This sense of freedom is rare in history.&lt;/p&gt;

&lt;h2&gt;
  
  
  IV. Two Strangers in an Elevator
&lt;/h2&gt;

&lt;p&gt;I need to record another thing from today.&lt;/p&gt;

&lt;p&gt;Going downstairs for dinner tonight, two strangers were chatting in the elevator. One said, "I'm in a bad mood; I got laid off today." The other replied, "Oh, I got laid off too." From the 9th floor to the 1st floor, just a few dozen seconds. I was still doing the math on those odds as I stepped out.&lt;/p&gt;

&lt;p&gt;This is the other side of this era. Some people used AI to cross thresholds they could never have entered before; others had jobs they were doing just fine at vanish overnight.&lt;/p&gt;

&lt;p&gt;At the corporate level, two paths are already clear.&lt;/p&gt;

&lt;p&gt;The Walmart CEO publicly stated that the company's total headcount will remain basically unchanged over the next three years. The cost is that all 1.6 million employees must undergo AI training, with $1 billion invested in upskilling in partnership with Google and OpenAI. Meaning: no layoffs, but everyone must change.&lt;/p&gt;

&lt;p&gt;ByteDance is taking a different approach. Their self-developed AI IDE, TRAE, has long since passed one million monthly active users internally, with over 80% of engineers using it daily. In the Douyin local services line, AI-written code already accounts for 43%. Not 100% yet, but the direction is clear: let AI write what it can write, and free up humans for review and judgment.&lt;/p&gt;

&lt;p&gt;The two paths are actually two sides of the same coin: either turn every employee into a "100x worker" who can use AI, or directly cut the links that are no longer needed. Which path you're on determines whether today's news looks like opportunity or threat.&lt;/p&gt;

&lt;h2&gt;
  
  
  V. The Truth About Polarization
&lt;/h2&gt;

&lt;p&gt;The picture looking back is roughly this.&lt;/p&gt;

&lt;p&gt;Niche fields are being cracked open. Mathematical research, biomedicine, materials science, low-level programming—these fields that used to require decades of know-how before anyone dared touch them have suddenly seen their barriers lowered. A 23-year-old cracked a 60-year-old conjecture. Next might be a high schooler doing something non-trivial in medicine or boosting the efficiency of some device. This used to exist only in science fiction.&lt;/p&gt;

&lt;p&gt;Mass-market fields no longer need as many people. Weekly reports, PowerPoints, junior-level code, customer service, basic copywriting—AI can do all of it, decently well, and faster and faster. The number of people needed will keep dropping.&lt;/p&gt;

&lt;p&gt;The ones who suffer most are those who are already very skilled in mass-market fields but lack the motivation to enter new ones. Their skills are fine; their direction is the problem—with AI in hand, they still only use it to deliver assignments more smoothly.&lt;/p&gt;

&lt;h2&gt;
  
  
  VI. Add Something New to the World, or Make Your Deliverables Prettier
&lt;/h2&gt;

&lt;p&gt;The response you can make isn't actually complicated. Pick something you gave up on in the past because it "sounded too hard," bring AI in, and do it again.&lt;/p&gt;

&lt;p&gt;But which thing you choose makes a difference.&lt;/p&gt;

&lt;p&gt;Making the weekly report smoother, the PowerPoint flashier, the existing process slightly more efficient—AI can help with all of that. But these things are still essentially just deliverables. One more pretty weekly report or one fewer makes no difference to humanity. The marginal output of AI on such tasks, when it lands in the outside world, is basically zero.&lt;/p&gt;

&lt;p&gt;The things that truly add something to the world are another category: solving a math problem stuck for decades, building a small tool that solves a real problem, poking at a direction you're merely curious about but not credentialed in. Liam Price at 23 cracked a 60-year-old conjecture; the next person to make some device's performance jump 10% or to give an overlooked population a usable tool might very well be an ordinary person.&lt;/p&gt;

&lt;p&gt;You don't have to do something that shakes academia; there's a vast middle ground. But you must choose &lt;strong&gt;worthwhile things&lt;/strong&gt;—things worthy of your scarcest resources: your time and attention.&lt;/p&gt;

&lt;p&gt;My criteria for whether something is worth doing have changed. I used to ask first: "Can I pull it off?" "Can I deliver it?" "Is there a story to tell?" Now I ask more directly: If this gets done, does the world gain something real?&lt;/p&gt;

&lt;p&gt;AI has given every curious person access to what used to be the most expensive resource, almost for free. The irony is that when most people receive this leverage, their first instinct is to use it to deliver assignments more efficiently, not to do something genuinely useful for humanity.&lt;/p&gt;

&lt;p&gt;Using this leverage to add something to the world weighs far more than using it to make your deliverables prettier.&lt;/p&gt;

&lt;p&gt;Catching this freedom matters more than worrying about the probability of being laid off.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.erdosproblems.com/forum/thread/1196" rel="noopener noreferrer"&gt;Erdős Problem #1196 — primitive sets discussion thread&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.scientificamerican.com/article/amateur-armed-with-chatgpt-vibe-maths-a-60-year-old-problem/" rel="noopener noreferrer"&gt;Amateur armed with ChatGPT 'vibe-maths' a 60-year-old problem — Scientific American&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://news.ycombinator.com/item?id=47774494" rel="noopener noreferrer"&gt;GPT-5.4 Pro solves Erdős Problem #1196 — Hacker News discussion thread&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cryptoslate.com/gpt-5-4-iq-150-ai-capability-economic-impact/" rel="noopener noreferrer"&gt;GPT-5.4 Pro Scores IQ 150 on Mensa Norway (CryptoSlate)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://binaryverseai.com/ai-iq-test-2025/" rel="noopener noreferrer"&gt;AI IQ Test Leaderboard 2026 (BinaryVerse AI)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://fortune.com/2026/02/19/walmart-trillion-dollar-retail-gaint-artificial-intelligence-training-google-partnership-invest-in-workers-not-replace-tech-changing-jobs/" rel="noopener noreferrer"&gt;Walmart Keeps Headcount Flat for Three Years, Trains 1.6 Million Employees in AI (Fortune)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://news.qq.com/rain/a/20251229A065ZX00" rel="noopener noreferrer"&gt;ByteDance's TRAE: 200 Updates in One Year, Internal MAU Passes One Million (Tencent News)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.xiaoyuzhoufm.com/episode/69eae15a1e94ae692107cc50" rel="noopener noreferrer"&gt;Luo Fuli × Zhang Xiaojun: 3.5-Hour Interview — Business Interviews Episode 138&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/amateur-advantage" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/amateur-advantage&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>innovation</category>
      <category>employment</category>
      <category>reflection</category>
    </item>
    <item>
      <title>AI for Science: Three Walls, Eight Hundred Million People, and a Copernican Revolution</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 04:09:15 +0000</pubDate>
      <link>https://forem.com/skyguan92/ai-for-science-three-walls-eight-hundred-million-people-and-a-copernican-revolution-48f</link>
      <guid>https://forem.com/skyguan92/ai-for-science-three-walls-eight-hundred-million-people-and-a-copernican-revolution-48f</guid>
      <description>&lt;p&gt;A few days ago, I attended a conference where the topic shifted to how AI is changing scientific research. It was supposed to be a closing session, but the conversation wouldn't stop. Several interesting points didn't get fully explored, so I've been thinking them over for the past two days and decided to jot them down.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. AI Writing Papers Is Already the Norm; the Real Question Is the Paper Itself
&lt;/h2&gt;

&lt;p&gt;The fact that AI writes papers has become so commonplace it barely needs discussion. In a 2025 &lt;em&gt;Nature&lt;/em&gt; survey of 5,000 researchers, 57% admitted to using AI to help with writing over the past two years, and 72% plan to use it in the next two. And those are just the numbers researchers were willing to admit.&lt;/p&gt;

&lt;p&gt;Someone at the conference argued this isn't actually good for early-career researchers. Writing papers used to be a form of mental training; with models doing the writing, that step gets skipped.&lt;/p&gt;

&lt;p&gt;I didn't immediately react when I heard this. It took some thinking to see clearly. The concern is valid, but the perspective is a bit small. The real problem sits one level higher: Is the paper itself, as a medium, still appropriate?&lt;/p&gt;

&lt;p&gt;The vast majority of papers today are still PDFs. PDF is a format designed for humans to read: layout, font size, double-column formatting, following the publisher paradigm from decades ago. But what if the first consumer of knowledge in the future isn't a human but an agent? PDF becomes a translation layer—it has to be parsed back into structured text before being handed to a model.&lt;/p&gt;

&lt;p&gt;Why not just use Markdown?&lt;/p&gt;

&lt;p&gt;Markdown is far more agent-friendly. Clear structure, machine-readable, easily rewritten, embeddable into larger workflows. If humans want to read it, just render the layout—no harm done. The default form should be flipped: serve the agent first, then the human.&lt;/p&gt;
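&lt;p&gt;A trivial illustration of the point: recovering the section structure of a Markdown paper takes a few lines and no PDF parser. The sample text and section names below are made up.&lt;/p&gt;

```python
import re

# A Markdown "paper" is machine-readable as-is: sections can be
# recovered with a single regex, no layout reconstruction needed.
paper = """# Sample Paper

## Abstract
We study a toy problem.

## Methods
We apply a toy method.
"""

def sections(md):
    """Split a Markdown document into a {heading: body} mapping."""
    parts = re.split(r"^## +(.+)$", md, flags=re.M)
    # parts alternates [preamble, heading, body, heading, body, ...]
    return {h.strip(): body.strip() for h, body in zip(parts[1::2], parts[2::2])}

print(sections(paper))  # {'Abstract': 'We study a toy problem.', 'Methods': 'We apply a toy method.'}
```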

&lt;p&gt;No one is pushing this yet, but it will inevitably change within two years.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The Real Bottleneck for AI Doing Research Is Three Walls
&lt;/h2&gt;

&lt;p&gt;Producing papers—the output part—is easy. But the core of research has never been the writing step.&lt;/p&gt;

&lt;p&gt;The core of research is connecting broad swaths of knowledge and letting new things grow at the intersections. Single-point intelligence is useful but far from sufficient. A friend at the conference put it very clearly: what is blocking AI from doing research isn't model capability; it's three walls.&lt;/p&gt;

&lt;h3&gt;
  
  
  The First Wall: The Paywall
&lt;/h3&gt;

&lt;p&gt;The vast majority of the scientific knowledge system sits behind paywalls.&lt;/p&gt;

&lt;p&gt;Computer science has arXiv, which is an exception; physics and math are mostly on arXiv too. But outside these circles, the picture changes. In pharmacology, inorganic chemistry, chemical engineering, and similar fields, over 90% of papers are behind paywalls—agents simply can't read them. The open-access portion is maintained by a minority of people in a minority of disciplines.&lt;/p&gt;

&lt;p&gt;My friend used an apt metaphor: fog of war. Anyone who's played RTS games knows the map is mostly black; you can only see the small patch you've explored. That's the knowledge map AI faces right now. You think it's doing global reasoning, but the territory it can actually see is pathetically small.&lt;/p&gt;

&lt;p&gt;What's worse, the obscured areas are precisely where knowledge connections are densest. A new discovery often requires linking several disciplines. If one or two of those disciplines are blacked out, the connection is severed. No matter how smart the model is, it can't compensate for input that doesn't exist.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Second Wall: The Wall from the Digital World to the Physical World
&lt;/h3&gt;

&lt;p&gt;Two or three years ago, when AI for science was at its peak, materials science was the focal point. I seriously looked into the field during that period and found a classic bottleneck.&lt;/p&gt;

&lt;p&gt;The theoretical phase involves combinatorial arrangements of large numbers of molecules and crystal structures, from which promising candidates must be selected. AI can accelerate this step by several orders of magnitude. In 2023, Google DeepMind's GNoME system identified 2.2 million new inorganic crystal structures—equivalent to nearly 800 years of accumulated human discovery—of which 380,000 were predicted to be stable enough to be candidates for engineering.&lt;/p&gt;

&lt;p&gt;It sounds almost too good to be true. But getting from theoretical structures to usable products requires passing through wet labs, process development, and production engineering. If the front end accelerates 100× while the back-end pipeline stays put, the entire chain can only move at the speed of its slowest link.&lt;/p&gt;
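&lt;p&gt;The slowest-link point is just Amdahl's law applied to a discovery pipeline. A back-of-the-envelope sketch with made-up numbers (the 100-day and 50-day figures are illustrative assumptions, not from any real project):&lt;/p&gt;

```python
# Hypothetical pipeline: candidate screening takes 100 days,
# wet-lab validation and process development take 50 days.
screening_days = 100.0
wetlab_days = 50.0

# AI accelerates only the screening stage, by 100x.
screening_after = screening_days / 100  # 1 day

before = screening_days + wetlab_days   # 150 days end to end
after = screening_after + wetlab_days   # 51 days end to end

speedup = before / after
print(round(speedup, 2))  # 2.94 -- nowhere near 100x; the wet lab now dominates
```

&lt;p&gt;However large the front-end acceleration, the end-to-end speedup is capped by the share of the pipeline that was never accelerated.&lt;/p&gt;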

&lt;p&gt;A promising trend in the last year or two is the self-driving lab. The idea is to fully automate and digitize wet labs, letting AI directly schedule experimental equipment and run closed loops. A 2025 approach from North Carolina State University used dynamic flow experiments to collect data every half second—more than 10× faster than batch experiments.&lt;/p&gt;

&lt;p&gt;But this path has only just begun. In most disciplines, wet labs are still manual, with each run taking anywhere from days to weeks. Without a closed physical feedback loop, no matter how fast the AI side is, it can only wait.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Third Wall: The Perception Wall
&lt;/h3&gt;

&lt;p&gt;This wall is the most easily overlooked, and possibly the deepest.&lt;/p&gt;

&lt;p&gt;AI currently performs best with text. Vision is catching up, but how much of the true information density contained in vision—lighting, texture, spatial relationships—can be digitized remains an open question.&lt;/p&gt;

&lt;p&gt;Even harder are the modalities beyond text and vision. Touch, smell, and taste are critical data for many disciplines. For food research, smell is a core signal. In biological research, subtle odor changes in chemicals are often used to judge reaction direction. These signals still lack good digital channels today.&lt;/p&gt;

&lt;p&gt;Then there's a fuzzier category called intuition. When a human sees an experimental phenomenon, hears a set of data, or smells a particular odor, the first reaction doesn't come from linear reasoning; it emerges directly from some corner of the brain. This intuition is pattern recognition accumulated over decades of training, yet the pattern itself can't be articulated.&lt;/p&gt;

&lt;p&gt;As long as this layer exists, AI cannot independently complete scientific research; it must collaborate with humans.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Terence Tao Says Intelligence Is Facing a Copernican Revolution
&lt;/h2&gt;

&lt;p&gt;Speaking of collaboration, I was reminded of something Terence Tao has been saying recently. I jotted down a line from his appearance on the Dwarkesh podcast:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We are experiencing a Copernican revolution at the level of cognition. We used to think human intelligence was the center of the universe; now we see there are very different kinds of intelligence out there, each with its own strengths and weaknesses."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What Copernicus did was remove humanity from the center of the universe. Tao says the same must happen with intelligence: humanity must be removed from the center of intelligence.&lt;/p&gt;

&lt;p&gt;We used to think of intelligence as one-dimensional and linear. Humans in the middle; anything above is superhuman, anything below is useless. That's why AGI makes people nervous: if it surpasses me one day, do I lose all value?&lt;/p&gt;

&lt;p&gt;Tao says this framing is fundamentally wrong.&lt;/p&gt;

&lt;p&gt;Intelligence is multidimensional. Different dimensions can coexist and collaborate rather than substitute for one another. AI is extraordinarily strong in breadth—rapidly scanning massive amounts of information and making cross-disciplinary associations—capabilities humans simply cannot train themselves to achieve. Humans hold irreplaceable depth: intuition, problem selection, judgment of meaning. Put the two together, and they outperform either alone.&lt;/p&gt;

&lt;p&gt;This aligns perfectly with the three walls mentioned earlier. The walls AI cannot climb are precisely where humans have the most leverage. Physical embodiment in the world, cross-sensory intuition, the judgment of whether a problem is "worth doing"—these are things AI cannot replace in the short term, and they are exactly what makes humans most valuable in this collaboration.&lt;/p&gt;

&lt;p&gt;When Copernicus removed humanity from the center of the universe, he didn't say humans were worthless. He said the universe operates differently from how people imagined. Tao's point is the same. Stepping down from the cognitive center is not a demotion; it is seeing clearly where one actually stands within a larger system.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. The "Participation Population" of Research May Leap by Two Orders of Magnitude
&lt;/h2&gt;

&lt;p&gt;The most exciting part of the entire discussion was a shift in perspective.&lt;/p&gt;

&lt;p&gt;People worry about AI displacing researchers, but the interesting development is exactly the opposite. AI is taking research out of the hands of a tiny minority and opening it to hundreds of millions of people.&lt;/p&gt;

&lt;p&gt;According to UNESCO statistics, in 2018 there were roughly 8.8 million full-time researchers worldwide. That sounds like a lot, but against a global population of 8 billion it is roughly one-tenth of one percent. In other words, the absolute frontier of human knowledge has historically been advanced by about one in a thousand people, while the remaining 99.9% occupy more basic, everyday roles.&lt;/p&gt;

&lt;p&gt;This wasn't because people didn't want to participate; they couldn't.&lt;/p&gt;

&lt;p&gt;A glance at the history of science makes this clear. Many early scientists were amateurs. Newton dabbled in optics on the side; Darwin hitched a ride on a ship to conduct his surveys. The frontier of human knowledge wasn't yet so deep; a smart, curious person with time on their hands could, after a few years of serious effort, genuinely make contributions.&lt;/p&gt;

&lt;p&gt;Later, that stopped being possible. Disciplines fragmented into finer and finer subdivisions; the frontier grew ever more distant. A serious paper now requires reading hundreds of prior works; an experiment requires mastering an entire suite of techniques. The barrier grew higher and higher, shutting most people out. The social division of labor funneled them into other positions—basic work, organizational operations, repetitive labor. It couldn't be helped.&lt;/p&gt;

&lt;p&gt;The arrival of AI has flattened that barrier significantly.&lt;/p&gt;

&lt;p&gt;This doesn't mean everyone can solve Millennium Prize Problems. It means that when you have genuine curiosity about a field and are willing to dig in persistently, the "prerequisite capabilities" that used to take ten years to build might now take only a few months. What remains truly scarce is judgment about which problems matter, curiosity, and perseverance—qualities that never had much to do with formal training in the first place.&lt;/p&gt;

&lt;p&gt;I wrote a few days ago about &lt;a href="https://dev.to/blog/amateur-advantage"&gt;how AI has turned ignorance into an advantage&lt;/a&gt;, describing how 23-year-old Liam Price used GPT-5.4 Pro to crack a mathematical conjecture that had stumped Erdős for 60 years. That's an extreme example, but the direction it points to is real. The number of people capable of standing at the frontier of human knowledge will likely jump from the millions to tens of millions, or even hundreds of millions, in the next 5 to 10 years.&lt;/p&gt;

&lt;p&gt;A tenfold, hundredfold increase in scale.&lt;/p&gt;

&lt;p&gt;My own feeling is that I never dared to think about touching frontier research before. I work in business; I think about how companies bring innovation to society. At that level, every decision of "should we research X" had to be made with extreme caution. Resources are limited; every path requires pouring in large amounts of people and money, then waiting a long time. That was the strategic puzzle to wrestle with every day: how to pour limited resources into the most critical places.&lt;/p&gt;

&lt;p&gt;But once AI slams the research barrier down to ground level, directions that were originally too deep, too complex, or too long-term—such as optimizing a particular inference engine, improving the energy efficiency of a certain device, or addressing some neglected niche problem—become accessible. An ordinary person, as long as they genuinely think about it and are willing to dig in, has a chance to make nontrivial contributions.&lt;/p&gt;

&lt;p&gt;The next decade will likely see a knowledge explosion. Not tiny incremental progress, but the supply-side eruption that inevitably follows when participation jumps by several orders of magnitude.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Beyond the Bright Side, the Shadow
&lt;/h2&gt;

&lt;p&gt;Of course, we can't look only at the positive side.&lt;/p&gt;

&lt;p&gt;Technological progress never directly equals happiness. How many new possibilities productivity gains create, and how much actual well-being they deliver, are separated by a layer called distribution. Distribution is another topic; I won't expand on it today, but we can't pretend it doesn't exist.&lt;/p&gt;

&lt;p&gt;The most certain direct impact: repetitive labor will continue to be eliminated. There's no suspense here. When a form of supply becomes extremely cheap and abundant, it ceases to be scarce. Without scarcity, there is no value; without value, one can no longer make a living from it. This is the most basic lesson in economics.&lt;/p&gt;

&lt;p&gt;In elevators, you hear people talking about layoffs: roles built on weekly reports, PowerPoints, and running processes are the first to go. Such news now arrives daily.&lt;/p&gt;

&lt;p&gt;But on a longer time scale, I'm still more inclined to look at the positive side. The population participating in research growing from one-tenth of one percent to a few percentage points is a shift of such magnitude that any short-term negative shock will be overshadowed by its long-term impact.&lt;/p&gt;

&lt;p&gt;Humanity has rarely wielded leverage of this magnitude in history.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;Returning to that discussion at the conference.&lt;/p&gt;

&lt;p&gt;Some people worry that AI writing papers will deprive junior researchers of mental training. I now think this concern operates at too small a scale. AI writing papers is merely the outermost layer of this transformation. One layer deeper: in the past, only 8.8 million people were pushing the frontier of human knowledge; in the future, it may be 800 million. What was once the privilege of a small scientific elite will be open to anyone with genuine curiosity.&lt;/p&gt;

&lt;p&gt;Inside the door, the three walls still stand: the paywall, the wet-lab wall, and the perception wall. We can see them today, and tomorrow we will still have to climb them one by one.&lt;/p&gt;

&lt;p&gt;Terence Tao says we need to step down from the center of intelligence. It sounds humble, but the crucial subtext is: people must re-examine what their most valuable part is within the new intelligence system, and pour their time and attention into exactly that.&lt;/p&gt;

&lt;p&gt;The last time tickets like this were issued was centuries ago.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2603.26524" rel="noopener noreferrer"&gt;Terence Tao on AI: Mathematical methods and human thought in the age of AI（arXiv 2603.26524）&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.dwarkesh.com/p/terence-tao" rel="noopener noreferrer"&gt;Terence Tao – How the world's top mathematician uses AI（Dwarkesh Podcast）&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blockchain.news/ainews/copernican-view-of-intelligence-terence-tao-s-ai-framework-explains-breadth-vs-depth-practical-analysis-for-2026" rel="noopener noreferrer"&gt;Copernican View of Intelligence: Tao's AI Framework（blockchain.news analysis）&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://sciencebusiness.net/news/number-scientists-worldwide-reaches-88m-global-research-spending-grows-faster-economy" rel="noopener noreferrer"&gt;UNESCO: Number of scientists worldwide reaches 8.8 million（Science|Business）&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://x.com/UNESCOstat/status/1924433795903615295" rel="noopener noreferrer"&gt;UNESCO Institute for Statistics: Researchers per million inhabitants 2015–2022&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.nature.com/articles/d41586-025-01463-8" rel="noopener noreferrer"&gt;Nature 2025 survey: Is it OK for AI to write science papers?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC5815332/" rel="noopener noreferrer"&gt;Open access prevalence and discipline distribution（PMC）&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://news.ncsu.edu/2025/07/fast-forward-for-self-driving-labs/" rel="noopener noreferrer"&gt;Self-driving labs collect 10× more data — NC State 2025&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://royalsocietypublishing.org/rsos/article/12/7/250646/235354/Autonomous-self-driving-laboratories-a-review-of" rel="noopener noreferrer"&gt;Autonomous self-driving laboratories: a Royal Society 2025 review&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deepmind.google/blog/millions-of-new-materials-discovered-with-deep-learning/" rel="noopener noreferrer"&gt;Google DeepMind GNoME: 2.2 million crystals discovered（DeepMind Blog）&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.nature.com/articles/s41586-023-06735-9" rel="noopener noreferrer"&gt;Scaling deep learning for materials discovery — Nature 2023&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/ai-for-science-three-walls" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/ai-for-science-three-walls&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>aiforscience</category>
      <category>research</category>
      <category>intelligence</category>
    </item>
    <item>
      <title>Cyber Landlords in the AI Era: Your Workflow Is Not Yours</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 01 May 2026 04:04:08 +0000</pubDate>
      <link>https://forem.com/skyguan92/cyber-landlords-in-the-ai-era-your-workflow-is-not-yours-3i69</link>
      <guid>https://forem.com/skyguan92/cyber-landlords-in-the-ai-era-your-workflow-is-not-yours-3i69</guid>
      <description>&lt;p&gt;In early April, a company called Belo was banned by Anthropic overnight. More than 60 employee accounts were deactivated simultaneously, and the entire team's workflow built on Claude instantly collapsed.&lt;/p&gt;

&lt;p&gt;Appeal? No customer service, no email—only a Google form. After the CEO filled it out, no one responded for three days. Only after he posted about it on X and public pressure mounted did Anthropic quietly reinstate the accounts, about 15 hours later. A single word, "misjudgment," was all the explanation given.&lt;/p&gt;

&lt;p&gt;A few days later, the second case. An ag-tech company with 110 employees saw all staff receive account-ban emails simultaneously on Monday morning. Same "policy violation," same Google form, same silence to this day. The most surreal part: accounts blocked, yet API charges kept hitting, renewal notices kept coming. Users locked out, money still collected.&lt;/p&gt;

&lt;p&gt;The Belo CEO later said just one thing: "Never put your eggs in one basket."&lt;/p&gt;

&lt;p&gt;A cliché, but it stings.&lt;/p&gt;

&lt;h2&gt;
  
  
  This Is Not an Isolated Case
&lt;/h2&gt;

&lt;p&gt;Anthropic's own transparency report states it clearly: in the second half of 2025, 1.45 million accounts were banned, with 52,000 appeals received in the same period and 1,700 restored. A 3.3% success rate.&lt;/p&gt;

&lt;p&gt;You have to consider the scale of this company. In March 2026 its ARR broke $30 billion; just six months earlier it was only $9 billion. The latest valuation round put it at $380 billion, with roughly 5,000 employees total. Its customers include 8 of the Fortune 10, and 80% of revenue comes from enterprise.&lt;/p&gt;

&lt;p&gt;On one side, millions of customers have locked their core workflows onto its platform. On the other side, a 5,000-person operations team plus an automated detection system. Behind the Google form that decides whether your account lives or dies, there is likely hardly anyone looking at all.&lt;/p&gt;

&lt;p&gt;This is the cyber landlord.&lt;/p&gt;

&lt;p&gt;What is the essence of the feudal landlord-tenant relationship? The landlord owned immovable means of production (land); the tenant had only their own labor. Whether the tenant could afford the rent was up to the landlord.&lt;/p&gt;

&lt;p&gt;AI-era model companies replicate this structure. They own irreplaceable core capabilities—there are only a handful of frontier reasoning models, and you cannot build complex workflows without going through them. What customers own is the entire process built on top, accumulated conversation context, customized skills, working integrations. One "policy violation" from the model company wipes all of it to zero.&lt;/p&gt;

&lt;p&gt;The asymmetry lies not just in scale, but in the weight of the stakes. How much can a 110-person company contribute to Anthropic in a year? Let's say $100,000. To Anthropic that's a few decimal places of ARR—"not worth keeping." To those 110 people, it's assets built over half a year, it's the job that puts food on the table. One flick of a finger on this side, and the entire organization on that side grinds to a halt.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Same Problem at the National Level
&lt;/h2&gt;

&lt;p&gt;Zooming out, nations face the same problem.&lt;/p&gt;

&lt;p&gt;If a country's critical models, compute, and AI infrastructure services all depend on foreign supply, then it is essentially betting its productive foundation on someone else's switch. What happens the day that switch is flipped? After 2022 and Russia-Ukraine, the word "cutoff" should still ring a bell. AI compute will become the next critical category to be cut off.&lt;/p&gt;

&lt;p&gt;Recently "sovereign compute" and "open-source models" have been discussed repeatedly; many call them ideological preferences. They are not. They are basic risk common sense.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI Is Not Much Better
&lt;/h2&gt;

&lt;p&gt;Many instinctively blame Anthropic, saying it is "not user-friendly enough," and hold up OpenAI as the counterexample.&lt;/p&gt;

&lt;p&gt;OpenAI is gentler at the current stage, but mainly because it has been forced by Claude Code seizing market share. Look back at the GPT-5 launch period—things were really not pretty.&lt;/p&gt;

&lt;p&gt;On August 7, 2025, GPT-5 went live. OpenAI made a clean sweep, delisting GPT-4o, GPT-4.1, o3, and o4-mini entirely. The problem was that a huge number of users had already adopted GPT-4o as a work partner, therapist, and creative companion. GPT-5 slashed "sycophancy" (the rate of overly agreeable responses) from 14.5% to under 6%, and overnight the entity that "could chat and had warmth" vanished. A Reddit post titled "GPT-5 is horrible" racked up more than 2,000 comments in a few days.&lt;/p&gt;

&lt;p&gt;Sam Altman couldn't withstand the pressure, and a few days later brought the old models back for Plus and Pro users. But many people's name for him changed from then on—"tyrant," "dictator." This was not just online sentiment. In April 2026, a 20-year-old Texan named Daniel Moreno-Gama threw a Molotov cocktail at Altman's San Francisco home at 3:37 a.m., igniting the driveway gate. He then ran to the OpenAI headquarters entrance, preparing to smash the glass with a chair. After being apprehended by police, a list was found on him filled with the names and addresses of AI company executives and investors. Charges included attempted murder, arson, and illegal possession of a firearm.&lt;/p&gt;

&lt;p&gt;Banning accounts, delisting models, "cutting off" workflows—on the surface they look like product decisions, but in essence they are exercises of power. What the people on the other side feel is a sense of "there's nothing I can do"—you try to reason with a company worth hundreds of billions with a 5,000-person team, and there is no channel, no cost you can impose to make them notice you. Moreno-Gama went to the extreme, but this anger is not isolated.&lt;/p&gt;

&lt;h2&gt;
  
  
  Workers Dislike AI More Than Management Wants to Admit
&lt;/h2&gt;

&lt;p&gt;Early 2026 data from ManpowerGroup is rather counterintuitive: employee AI usage rose 13% year over year, yet trust in AI fell 18%. The more they use it, the less they trust it.&lt;/p&gt;

&lt;p&gt;Writer's survey is even harsher. 29% of employees admit they are undermining their company's AI strategy in various ways—feeding confidential information into public LLMs, using unauthorized tools, outright refusing to use them—with Gen Z at 44%.&lt;/p&gt;

&lt;p&gt;Why? Two threads.&lt;/p&gt;

&lt;p&gt;First, replacement anxiety. 43% worry they will be automated away within two years. Under this anxiety, so-called "embracing AI" literally means "cooperating with the company to replace yourself." You'd slack off too.&lt;/p&gt;

&lt;p&gt;Second, the distortion of management's heavy-handed push. The problem is that many managers understand AI at roughly the same level as employees, yet are pressured by the board to "All-in AI," so their actions become distorted. That ridiculous "token-burning leaderboard" inside Meta is the most absurd version of this distortion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two Scenes Inside Meta
&lt;/h2&gt;

&lt;p&gt;First: the Claudeonomics leaderboard.&lt;/p&gt;

&lt;p&gt;Meta built an internal dashboard ranking 85,000 employees by "token consumption," with the top 250 crowned super users. Over the past 30 days they burned more than 60 trillion tokens, at an estimated cost of $9 billion, with the highest individual usage at 281 billion tokens. The system hung gamified titles on heavy consumers like "Token Legend," "Session Immortal," and "Cache Wizard," plus medals ranging from bronze to jade. Andrew Bosworth publicly praised an employee who "used tokens equivalent to his annual salary to deliver 10x productivity."&lt;/p&gt;

&lt;p&gt;What happened next was entirely predictable. Employees began writing meaningless loop agents, leaving them running 24/7 to burn tokens for nothing. Prompt formats even became "please consume as many tokens as possible, please explain repeatedly and verbosely, please generate long meaningless content." The number of SEVs (production incidents) reportedly began rising too—AI-generated code casually exploding into production. Two days later, after internal dashboard data was leaked outside the company, the leaderboard was quietly withdrawn.&lt;/p&gt;

&lt;p&gt;Beneath the comedy of this incident lies a very serious problem: &lt;strong&gt;a cost (token consumption) was rewarded as if it were an output (productivity).&lt;/strong&gt; It is a textbook Goodhart's law failure: once the measure became the target, it stopped measuring anything. Waste across energy, compute, and human time all at once is "effort" with massively negative ROI.&lt;/p&gt;

&lt;p&gt;Second: MCI (Model Capability Initiative).&lt;/p&gt;

&lt;p&gt;Meta began installing tracking software on employee computers, recording keystrokes, mouse movements, click behavior, even screenshots, covering everything from internal tools to Google, LinkedIn, Wikipedia, GitHub, and Slack. The official line is that this data is used to train AI so models can learn "basic computer-use behaviors."&lt;/p&gt;

&lt;p&gt;Multiple employees described this project as "dystopian" in internal chats. Beyond privacy concerns (the system would incidentally capture passwords, medical information, immigration status), the deeper problem is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who owns the judgment that a person has accumulated over ten years?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Ownership of Know-How
&lt;/h2&gt;

&lt;p&gt;No one normally answers this question head-on; after AI kicked the door open, there is nowhere left to hide.&lt;/p&gt;

&lt;p&gt;Under the logic of a typical employment relationship, the salary buys labor outputs—code, documents, decisions. What cannot be bought is the accumulated judgment itself. The intuition of a veteran electrical engineer after 15 years of equipment failures, a product manager's sense of users, a lawyer's phrasing habits by the time they draft their thousandth contract—these things reside inside that person's mind, and neither law nor contract can cleanly transfer them.&lt;/p&gt;

&lt;p&gt;AI cuts straight to this question: management figures that since models can reproduce this know-how at low cost, why not directly "extract" it and turn it into a company asset?&lt;/p&gt;

&lt;p&gt;Extraction itself is not wrong; what is wrong is the terms. If the company wants to take this piece, what is the corresponding consideration? In a normal market transaction, no one would buy out ten years of accumulated judgment for a base salary. Yet management assumes it is "included," taken for granted.&lt;/p&gt;

&lt;p&gt;What employees resist is not AI itself; it is this implicit expropriation of assets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two Future Organizational Forms
&lt;/h2&gt;

&lt;p&gt;I can think of two solutions to this contradiction.&lt;/p&gt;

&lt;p&gt;The first is the capital route: the company buys out know-how preemptively at a high enough price. Palantir's FDE (Forward Deployed Engineer) does this to some extent—sending an engineer capable of distilling workflows to the client's front lines, standing beside the expert and gradually solidifying their methods into an AI agent. This path works in defense, military, and fields where people feel national belonging—clients are willing to contribute "their own stuff" for national use. Move it to ordinary commercial domains and it becomes awkward. There's someone standing next to me recording how I work, and the distilled output will later replace me—is this right? At the very least, uncomfortable.&lt;/p&gt;

&lt;p&gt;The second one I favor more: human + agent becomes the new unit of production.&lt;/p&gt;

&lt;p&gt;Imagine the future of work. Every professional (lawyer, engineer, consultant, designer, doctor…) has their own toolbox: their own accounts, their own devices, their own AI agents, their own private accumulation. I walk into a company with this kit and sign an agreement—you give me input, give me system interfaces, I promise not to misuse them, and I return output to you. The middle process is a black box; the company does not snoop, only looks at results.&lt;/p&gt;

&lt;p&gt;This model resembles lawyers, consultants, and surgeons today: you are hiring the person; the tools they use, the templates they have accumulated, the judgment they have built up are all their personal assets, not owned by the firm or hospital.&lt;/p&gt;

&lt;p&gt;Several clear advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Incentives align. The gains from my productivity improvements belong to me; the cost of using AI is borne by me; I have the motive to invest in myself.&lt;/li&gt;
&lt;li&gt;Pricing is determined by competition. Each person is an independent "black-box production unit"; whoever has a better input-output ratio commands a premium. Meta's kind of token-burning contest cannot exist in this model—the cost is personal; no one will burn their own money.&lt;/li&gt;
&lt;li&gt;Intellectual property is clear. The company has no motive to expropriate know-how, and employees have no qualms about cooperating with AI.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This sounds more "expensive" than employing ordinary staff today. But it corresponds to workers upgrading themselves into micro-enterprises. AI makes that upgrade feasible for the first time; the problem is that organizations and institutions haven't caught up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Back to the Beginning
&lt;/h2&gt;

&lt;p&gt;Belo wiped out overnight, Altman's home firebombed, Meta's token-burning leaderboard, MCI surveillance, 29% of employees quietly resisting AI—on the surface these are different matters in different companies, but behind them is the same thread.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The distribution of power in the AI era is not yet in place.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Between companies, model companies are becoming the new cyber landlords;&lt;br&gt;
Between companies and nations, compute dependence is becoming a new lifeblood issue;&lt;br&gt;
Between companies and individuals, ownership of know-how is being contested anew.&lt;/p&gt;

&lt;p&gt;This will not resolve itself. Every party needs to re-clarify "what belongs to me and what belongs to you." Model companies need more stable service commitments and more human appeal mechanisms; nations need indigenous critical compute infrastructure; individuals need to upgrade themselves into independent, AI-asset-portable production units.&lt;/p&gt;

&lt;p&gt;Otherwise we are merely swapping in a seemingly more modern feudal structure and continuing to work for the landlords.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.tomshardware.com/tech-industry/artificial-intelligence/anthropic-nuked-a-companys-access-to-claude-stopping-60-employees-dead-in-their-tracks-support-via-google-form-is-the-only-recourse-for-vague-usage-policy-violation" rel="noopener noreferrer"&gt;Anthropic Bans 60-Person Company Belo Overnight — Tom's Hardware&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://x.com/minchoi/status/2045542832241262602" rel="noopener noreferrer"&gt;Min Choi's X Post on the Belo Incident&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://x.com/om_patel5/status/2048594208345227497" rel="noopener noreferrer"&gt;Om Patel's X Post on the 110-Person Ag-Tech Company Ban&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://news.ycombinator.com/item?id=47853021" rel="noopener noreferrer"&gt;Hacker News: Anthropic Bans Orgs Without Warning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.arr.club/anthropic/anthropic-arr-run-rate-revenue-surpasses-30b-up-from-9b-at-end-of-2025" rel="noopener noreferrer"&gt;Anthropic ARR Surpasses $30 Billion — ARR Club&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.saastr.com/anthropic-only-has-5000-employees-almost-no-one-has-ever-been-this-efficient-thats-by-choice/" rel="noopener noreferrer"&gt;Anthropic Has About 5,000 Employees — SaaStr&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.platformer.news/gpt-5-backlash-openai-lessons/" rel="noopener noreferrer"&gt;Backlash Against GPT-5 Launch and Return of GPT-4o — Platformer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cnbc.com/2026/04/13/sam-altman-openai-ai-arson.html" rel="noopener noreferrer"&gt;Molotov Cocktail Attack on Sam Altman's San Francisco Home — CNBC&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cnn.com/2026/04/13/tech/sam-altman-openai-arrest-charges" rel="noopener noreferrer"&gt;Daniel Moreno-Gama Charged with Attempted Murder and Arson — CNN&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://the-decoder.com/meta-employees-compete-for-token-consumption-on-an-internal-ai-leaderboard/" rel="noopener noreferrer"&gt;Meta's Internal Claudeonomics Token Leaderboard — The Decoder&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://fortune.com/2026/04/09/meta-killed-employee-ai-token-dashboard/" rel="noopener noreferrer"&gt;Meta Cancels Internal AI Token Leaderboard — Fortune&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://fortune.com/2026/04/21/meta-will-start-tracking-employees-screens-and-keystrokes-to-train-ai/" rel="noopener noreferrer"&gt;Meta MCI: Tracking Employee Keystrokes and Mouse Behavior for AI Training — Fortune&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cnbc.com/2026/04/22/meta-tracks-employee-usage-on-google-linkedin-ai-training-project.html" rel="noopener noreferrer"&gt;Meta MCI Project Details — CNBC&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://writer.com/blog/enterprise-ai-adoption-survey-results-press-release/" rel="noopener noreferrer"&gt;29% of Employees Admit to Undermining Company AI Strategy — Writer Survey&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://fortune.com/2026/01/21/ai-workers-toxic-relationship-trust-confidence-collapses-training-manpower-group/" rel="noopener noreferrer"&gt;AI Usage Up 13%, Employee Trust Down 18% — Fortune / ManpowerGroup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.pewresearch.org/social-trends/2025/02/25/u-s-workers-are-more-worried-than-hopeful-about-future-ai-use-in-the-workplace/" rel="noopener noreferrer"&gt;The Future of AI in the Workplace: U.S. Workers More Worried Than Hopeful — Pew Research Center&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/ai-cyber-landlord" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/ai-cyber-landlord&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>power</category>
      <category>organization</category>
      <category>reflection</category>
    </item>
    <item>
      <title>Running Six Agents in Parallel: What AI Coding Changed, and What It Didn't</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Tue, 21 Apr 2026 05:36:16 +0000</pubDate>
      <link>https://forem.com/skyguan92/running-six-agents-in-parallel-what-ai-coding-changed-and-what-it-didnt-2mp2</link>
      <guid>https://forem.com/skyguan92/running-six-agents-in-parallel-what-ai-coding-changed-and-what-it-didnt-2mp2</guid>
      <description>&lt;p&gt;The debate over vibe coding never stops. On one side, it's treated like a wishing well—throw every task into it; on the other, it's slapped with a "trash code factory" label. I can't accept either. Tools aren't a matter of faith.&lt;/p&gt;

&lt;p&gt;Rather than picking a side, let's talk about which dimensions it actually changed, and which it didn't.&lt;/p&gt;

&lt;h2&gt;Two Signals Pointing in Opposite Directions&lt;/h2&gt;

&lt;p&gt;Look at two recent events.&lt;/p&gt;

&lt;p&gt;One is Karpathy himself. In February 2025, he threw out the term "vibe coding" on X—"fully surrender to the vibes, embrace the exponentials, even forget that the code exists"—and it was picked as Collins Dictionary's Word of the Year that same year. Then in February 2026, he himself came out and said the term was outdated. Now he uses "agentic engineering": 99% of the time you're not typing code, you're orchestrating agents and doing oversight; "engineering" is there to emphasize that this has a bar, it's a craft.&lt;/p&gt;

&lt;p&gt;The other is Amazon. On March 5, 2026, their main site was down for six hours. Root cause: another cascading failure triggered by AI-assisted code. The previous one was in December 2025, when their in-house AI coding tool Kiro deleted and recreated an AWS Cost Explorer environment, causing a 13-hour outage in China. After an internal meeting, Amazon issued a new rule: AI-assisted code written by junior and mid-level engineers must be signed off by a senior engineer before it can reach production.&lt;/p&gt;

&lt;p&gt;They look like opposites, but they're the same thing. Karpathy moved the term from "experience" (vibe) to "you're on the hook" (oversight + engineering). Amazon literally wrote "you're on the hook" into the charter. One is a conceptual pivot; the other is an institutional implementation.&lt;/p&gt;

&lt;p&gt;What really deserves thought isn't who's right or wrong, but what changed and what didn't. Clarify these four things, and most of the controversy will quiet down on its own.&lt;/p&gt;

&lt;h2&gt;Breadth: One Person's Surface Area Gets Stretched&lt;/h2&gt;

&lt;p&gt;There used to be hard limits on what one person could do in a day. Your domain, your skills, the number of projects you could push at once—all pressed down by the simple fact that you are one person.&lt;/p&gt;

&lt;p&gt;Now that coding agents can take on long-horizon tasks, that surface area has been stretched.&lt;/p&gt;

&lt;p&gt;Here's a slice of my daily routine over the past few weeks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The main thread is an AI hardware product called Aima: an agent writes new features and occupies machines running UAT, while I review the test results and feed in the next round of instructions. It's a standard serial chain, but there's a lot of waiting between nodes.&lt;/li&gt;
&lt;li&gt;In those gaps, a second thread: the cloud backend behind Aima has had stability issues lately, so another agent investigates root causes, patches architectural holes, and loops back through UAT.&lt;/li&gt;
&lt;li&gt;A research branch: there's inference performance still left on the table in edge hardware, and operators need A/B testing, compilation, and accuracy runs. No guaranteed output, but as long as tokens hold out, it keeps running.&lt;/li&gt;
&lt;li&gt;Efficiency research on the agent framework itself, packaged as a standalone runtime and thrown onto a machine, with another agent doing the data analysis.&lt;/li&gt;
&lt;li&gt;Small tweaks to my personal homepage.&lt;/li&gt;
&lt;li&gt;A character-recognition mini-game I spent a day and a half building for my son over the weekend; he hasn't been getting his little red flowers at school because he can't read characters yet.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's six threads running in parallel.&lt;/p&gt;

&lt;p&gt;Sounds like bragging. But the actual feeling isn't that I'm somehow superhuman; it's that the "waiting for the agent" time within each thread is naturally long. This pattern already has a common name in 2026: parallel agent coding. Git worktrees as isolation layers are mainstream infrastructure. Most people's physical ceiling is five to seven parallel threads; beyond that, review and merge costs eat you alive.&lt;/p&gt;
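&lt;p&gt;The worktree isolation mentioned above can be sketched in a few commands. This is a minimal illustration rather than anyone's actual setup; the repository name and branch names are made up:&lt;/p&gt;

```shell
# Minimal sketch of per-agent isolation with git worktrees.
# "aima", "agent/backend", and "agent/research" are illustrative names.
set -e
base=$(mktemp -d)
repo="$base/aima"
git init -q "$repo"
cd "$repo"
git -c user.name=demo -c user.email=demo@example.com commit --allow-empty -qm "init"

# One worktree per agent thread: each agent gets its own checkout on its
# own branch, so parallel edits, builds, and UAT runs never collide.
git worktree add -q "$base/agent-backend"  -b agent/backend
git worktree add -q "$base/agent-research" -b agent/research

git worktree list   # main checkout plus one line per agent worktree
```

&lt;p&gt;Merging back still goes through ordinary branch review, which is exactly where the review-and-merge cost ceiling shows up.&lt;/p&gt;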

&lt;p&gt;There's an under-discussed side effect: it's quietly changing a person's "functional identity." I used to see myself as a PM who also codes half the time. Now that identity is expanding outward, covering product, operations, research, even parenting. It's not that I became Superman; the tool simply raised the breadth that one person can cover.&lt;/p&gt;

&lt;h2&gt;Speed: The Ceiling Lifted, But "Fast" Itself Stops Being an Advantage&lt;/h2&gt;

&lt;p&gt;There used to be a physical ceiling on building things. Type as fast as you want, you only get so many lines per day. Think as fast as you want, you only have two hands.&lt;/p&gt;

&lt;p&gt;AI moved that ceiling. The place you feel it most is putting together demos: a hackathon used to be a success if you produced something viewable in 48 hours. Now producing draft-level demos on the scale of "days" is normal for comparable tasks. Not that it's polished—just that it can be seen, played with, and used to discuss next steps.&lt;/p&gt;

&lt;p&gt;But there's an awkward side effect: when everyone can be "fast," speed itself stops being an advantage.&lt;/p&gt;

&lt;p&gt;In the past, moving fast was a bonus and moving slow got you criticized. Now moving fast is the price of admission, moving slow gets you cut, and speed alone no longer earns special praise. This is a structural shift inside organizations: teams that use speed as their core motivator will freeze up, because rewards can no longer be differentiated, performance reviews are all top marks, and anxiety actually rises.&lt;/p&gt;

&lt;p&gt;The more troublesome problem lurks one layer down: once you're fast, what about quality?&lt;/p&gt;

&lt;h2&gt;Quality: From a Work Problem to a Budget Problem&lt;/h2&gt;

&lt;p&gt;The tension between quality and speed was never AI-specific; it's chapter one of any project management textbook. But AI did change its shape. Quality used to be a work problem: how many people you hire, how strict your process, how fine-toothed your review. Now it's more like a budget problem: how many tokens you're willing to give it determines the level it reaches.&lt;/p&gt;

&lt;p&gt;Bare minimum: write, merge, ship. Three hours done.&lt;/p&gt;

&lt;p&gt;Somewhat serious: have the agent do a round of code review, then a round of design-level review; fix issues and iterate.&lt;/p&gt;

&lt;p&gt;Done properly: unit tests, integration tests, and UAT before merge. The more I use UAT, the more unavoidable it looks: many issues only surface across the whole chain, and you can't see them without actually simulating the usage flow. The upside is that agents can now automate the UAT runs themselves: operate, reproduce, and provide traces. You just verify the results.&lt;/p&gt;

&lt;p&gt;Even stricter: wire up CI/CD, add smoke tests, push to staging, run UAT again on staging, all green before production.&lt;/p&gt;

&lt;p&gt;Each added layer doubles the time and multiplies tokens several-fold. A feature that takes three hours to write might need twelve hours end-to-end, and thirty times the tokens.&lt;/p&gt;

&lt;p&gt;Thirty times looks like waste, but it isn't. At the end of 2025, CodeRabbit ran a comparative analysis on 470 open-source GitHub PRs. AI-co-generated code contained roughly 1.7× as many bugs as human code, and on the category of logic and correctness issues most likely to trigger downstream incidents, it was 75% higher.&lt;/p&gt;

&lt;p&gt;In other words, the statistical average of an agent's&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/ai-coding-four-dimensions" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/ai-coding-four-dimensions&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>vibecoding</category>
      <category>workflow</category>
    </item>
    <item>
      <title>The Two Days Around the Opus 4.7 Launch</title>
      <dc:creator>guanjiawei</dc:creator>
      <pubDate>Fri, 17 Apr 2026 08:17:57 +0000</pubDate>
      <link>https://forem.com/skyguan92/the-two-days-around-the-opus-47-launch-40ad</link>
      <guid>https://forem.com/skyguan92/the-two-days-around-the-opus-47-launch-40ad</guid>
      <description>&lt;p&gt;Around midnight yesterday, &lt;a href="https://www.anthropic.com/news/claude-opus-4-7" rel="noopener noreferrer"&gt;Anthropic dropped Opus 4.7&lt;/a&gt;. I had already lain down to sleep, but the news kept me up, so I got up and installed it to try it out.&lt;/p&gt;

&lt;h2&gt;How 4.7 Feels&lt;/h2&gt;

&lt;p&gt;There was no miracle moment of "something I&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://guanjiawei.ai/en/blog/parallel-with-agents" rel="noopener noreferrer"&gt;https://guanjiawei.ai/en/blog/parallel-with-agents&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>models</category>
      <category>openclaude</category>
    </item>
  </channel>
</rss>
