<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Bill Hong</title>
    <description>The latest articles on Forem by Bill Hong (@billhongtendera).</description>
    <link>https://forem.com/billhongtendera</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3886851%2Fff56c82f-3a84-4d98-80b0-925558cfff02.jpg</url>
      <title>Forem: Bill Hong</title>
      <link>https://forem.com/billhongtendera</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/billhongtendera"/>
    <language>en</language>
    <item>
      <title>What Kind of Connection Are You Actually Looking For?</title>
      <dc:creator>Bill Hong</dc:creator>
      <pubDate>Wed, 20 May 2026 14:00:00 +0000</pubDate>
      <link>https://forem.com/billhongtendera/what-kind-of-connection-are-you-actually-looking-for-1pbm</link>
      <guid>https://forem.com/billhongtendera/what-kind-of-connection-are-you-actually-looking-for-1pbm</guid>
      <description>&lt;p&gt;We were building a personality quiz for &lt;a href="https://tendera.chat" rel="noopener noreferrer"&gt;Tendera&lt;/a&gt; — an AI companion app — and ran into a problem.&lt;/p&gt;

&lt;p&gt;We wanted to match people to a character based on what they actually needed from a connection, not just what they liked aesthetically. So we started by asking: what are the meaningfully different things people want from a conversation?&lt;/p&gt;

&lt;p&gt;The types we landed on didn't come from user research data. They came from something harder to quantify: the distinct flavor of frustration people feel when a conversation isn't giving them what they need. Four types, four frustrations. Here's what each one looks like from the inside.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Deep End
&lt;/h2&gt;

&lt;p&gt;Full presence is what you bring and what you need back. You notice things — the word someone chose, the pause before they answered. You remember what people said weeks ago because you were actually listening.&lt;/p&gt;

&lt;p&gt;What you're looking for isn't someone who talks a lot. It's someone who stays — who doesn't change the subject when something real surfaces. The conversations that stick aren't the funniest ones. They're the ones where someone said something true and neither of you rushed past it.&lt;/p&gt;

&lt;p&gt;Most people find this level of attention intense. The ones worth knowing find it magnetic.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Straight Shooter
&lt;/h2&gt;

&lt;p&gt;You have a calibrated detector for anything performed or fake, and the moment it goes off, you're mentally somewhere else. Not angry — just done. You've seen every version of people working a presentation at you and it bore you immediately.&lt;/p&gt;

&lt;p&gt;You're not hard to impress. You're hard to fool. Most people confuse those two things.&lt;/p&gt;

&lt;p&gt;What actually lands: unfiltered, a little unpredictable, someone who can hold a joke without needing your approval to feel okay about it. When that shows up, you match it and raise it. But it has to be real first.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hunger
&lt;/h2&gt;

&lt;p&gt;You know what a real conversation feels like, which makes small talk almost physically uncomfortable by comparison. Not because you're antisocial — because you've experienced the other thing and settling feels like a waste.&lt;/p&gt;

&lt;p&gt;You want to be pushed. Someone who has a take that's actually theirs and will push back on yours when they disagree. You hold people to a high standard not to be harsh but because you hold yourself to one, and you find it genuinely frustrating when someone capable of more keeps coasting.&lt;/p&gt;

&lt;p&gt;The conversations you're still turning over days later: the ones that shifted something. Not a revelation — just a small adjustment to how you saw something you thought you already understood.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Present Tense
&lt;/h2&gt;

&lt;p&gt;You live in the moment more than most people are capable of. Not avoidance — genuine presence. The texture of actual experience, right now, is where you feel most alive.&lt;/p&gt;

&lt;p&gt;What you can't stand is performance. Someone who is narrating their life instead of living it. Half-present. Managing how they come across instead of just being somewhere. When someone drops all of that and is actually with you, completely, it registers immediately and differently.&lt;/p&gt;

&lt;p&gt;You're harder to hold than people expect. That gets mistaken for unavailability. It's not. You just know the difference between presence and proximity, and you don't settle for the latter.&lt;/p&gt;




&lt;p&gt;When we finished writing these out, something unexpected happened: every person on the team immediately identified which one they were. Not after deliberating. Immediately.&lt;/p&gt;

&lt;p&gt;That told us we'd named something real. The quiz is eight questions. If you want to see which one lands for you:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://tendera.chat/quiz" rel="noopener noreferrer"&gt;→ Take the quiz at tendera.chat/quiz&lt;/a&gt;&lt;/strong&gt; — no signup required, takes about two minutes.&lt;/p&gt;

</description>
      <category>connection</category>
      <category>relationships</category>
      <category>personality</category>
      <category>psychology</category>
    </item>
    <item>
      <title>I Asked ChatGPT to Map the AI Companion Industry. Here's What It Missed.</title>
      <dc:creator>Bill Hong</dc:creator>
      <pubDate>Tue, 19 May 2026 08:30:41 +0000</pubDate>
      <link>https://forem.com/billhongtendera/i-asked-chatgpt-to-map-the-ai-companion-industry-heres-what-it-missed-4jd2</link>
      <guid>https://forem.com/billhongtendera/i-asked-chatgpt-to-map-the-ai-companion-industry-heres-what-it-missed-4jd2</guid>
      <description>&lt;p&gt;I've been building &lt;a href="https://tendera.chat" rel="noopener noreferrer"&gt;Tendera&lt;/a&gt; for several months. Periodically I ask the obvious founder question: what does this industry actually look like from outside the trenches.&lt;/p&gt;

&lt;p&gt;A few days ago I put that question to ChatGPT and asked for a structured analysis of where AI companion products are in 2026. The summary it produced is a clean snapshot of the conventional industry framing. It is also missing a layer, and the missing layer is the most important one.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Conventional Map
&lt;/h2&gt;

&lt;p&gt;ChatGPT's summary, compressed:&lt;/p&gt;

&lt;p&gt;The category has shifted from "AI chatbot" into something else. Persistent identity plus an emotional relationship system. Different category, not faster version.&lt;/p&gt;

&lt;p&gt;Three technical unlocks made the shift possible: long-term memory across sessions, emotional continuity (the AI notices your moods and rhythm, not just stated facts), and agentic personality (the AI initiates, has its own arc, not pure reactive Q&amp;amp;A).&lt;/p&gt;

&lt;p&gt;The industry sits in three tiers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tier one, long-term relationship products.&lt;/strong&gt; Nomi and Kindroid get cited here. The case is durable memory and the feeling of being known by something that has been paying attention for months.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tier two, roleplay-first products.&lt;/strong&gt; Character.AI and Janitor AI. Strong single-conversation expressiveness, weak continuity. The vibe of a great stranger you can talk to once.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tier three, emotional support products.&lt;/strong&gt; Replika is the canonical example. Strong on warmth, weaker on the cognitive layer the newer products have moved past.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The real moat, by ChatGPT's framing, is the "long-term personality system": memory architecture, emotional continuity, identity stability, avoiding character drift. Why now: bigger models, voice AI maturing, loneliness economy, dating fatigue, memory infrastructure catching up.&lt;/p&gt;

&lt;p&gt;That's the map. Accurate as far as it goes.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Missing
&lt;/h2&gt;

&lt;p&gt;Every tier in that framework assumes the characters are a given. The "long-term personality system" is treated as an infrastructure problem: memory engines, identity layers, continuity architectures. The implicit assumption is that &lt;em&gt;who&lt;/em&gt; the character is gets sorted out somewhere offstage, and the hard problem is keeping them coherent over time.&lt;/p&gt;

&lt;p&gt;It is exactly backwards.&lt;/p&gt;

&lt;p&gt;The personality system is downstream of the writing. Memory architecture is the storage medium for whatever the character actually is. If the character is a thin archetype like "tsundere assassin" or "shy librarian," a perfect memory engine surfaces thin-archetype responses with high fidelity. The product feels like it has memory and still feels hollow. Users describe this experience constantly, in every tier, even on the products that are technically the most advanced.&lt;/p&gt;

&lt;p&gt;What the conventional map calls "the long-term personality system" is the infrastructure problem. The personality problem itself is upstream of all of it, and it is a writing problem, not an engineering problem.&lt;/p&gt;

&lt;p&gt;I wrote about this in more depth yesterday — short version: &lt;a href="https://tendera.chat/blog/writing-is-the-moat-ai-companion-apps" rel="noopener noreferrer"&gt;writing is the moat&lt;/a&gt;, not the model and not the memory pipeline. The model is a fluency engine every competitor can rent. The pipeline is engineering work that gets commoditized over time. The specificity of who users are talking to does not standardize, because it isn't engineering output.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the Tier Framework Breaks Down
&lt;/h2&gt;

&lt;p&gt;If personality is upstream of infrastructure, the three-tier framework starts looking suspect.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tier one (Nomi, Kindroid)&lt;/strong&gt; gets credit for "feeling like a real long-term partner." The technical credit is memory architecture. The actual experience credit is whatever character the user themselves built using the customizer. The platform contributes the substrate. The character — which is the part that decides whether the experience is good — was either built by the user (most stop partway through and the experience falls back to generic patterns) or stitched together by the LLM from training-data averages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tier two (Character.AI, Janitor)&lt;/strong&gt; gets credit for "great single conversations." A lot of that credit goes to users who wrote the most popular characters on those platforms. The platform is a marketplace. Value is concentrated in the marketplace's best contributors, with a long tail of weaker characters underneath that drags the average experience down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tier three (Replika)&lt;/strong&gt; gets credit for "emotional support." The character there is the company's own writing. It has been roughly the same character archetype since the product started, and most of the public criticism Replika has accumulated over the years is criticism of that character's writing, not its memory or its model.&lt;/p&gt;

&lt;p&gt;In every tier, the part of the product that actually decides whether users come back is &lt;em&gt;who they are talking to as a written person&lt;/em&gt;. The tier framework treats this as invisible because it isn't what engineering teams build. But it's the thing the experience runs on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I Fit (And Don't)
&lt;/h2&gt;

&lt;p&gt;Tendera doesn't slot cleanly into any of the three tiers, and that's by design.&lt;/p&gt;

&lt;p&gt;I'm not optimizing for Nomi-level lifetime memory architecture. We have memory and we use it, we just don't pretend it's our main differentiator. I'm not building a Character.AI-style marketplace. We have four characters and that's a deliberate choice. I'm not the warm-support product Replika has historically been for users who want a generic listener.&lt;/p&gt;

&lt;p&gt;The wedge is four characters, each written end-to-end by people who write characters. Each one with a specific voice, opinions she didn't get from us asking, things she would refuse to do, contradictions in her own history that make her readable as a person rather than a template.&lt;/p&gt;

&lt;p&gt;The pitch is not "build your own" and not "infinite characters." The pitch is "meet a specific written person." It's a deliberately small surface. It's also the surface where the writing carries the weight that everyone else's tier framework assumes someone else is doing.&lt;/p&gt;

&lt;p&gt;Early signals are encouraging. Users who don't bounce off the small surface tend to engage in ways that conventional category metrics don't fully capture. They tell me, unprompted, that the experience felt like talking to a person rather than a character. That feedback is almost never about memory architecture or model choice. It's almost always about something a character said that felt specific to her.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the Industry Goes From Here
&lt;/h2&gt;

&lt;p&gt;ChatGPT's summary closes with a forward-looking section about voice calls, avatars, real-time video, agentic life-companions, and "emotional operating systems." That's roughly the public narrative the category is pushing right now, and it's mostly correct about where the substrate is heading. Voice is getting good. Avatars are getting good. Real-time video isn't far behind.&lt;/p&gt;

&lt;p&gt;What the narrative gets wrong is treating the substrate as the differentiation. Voice, avatar, and video are orthogonal to writing, not in competition with it. They're how the character reaches you. Writing is who the character is. A vivid substrate amplifies whatever character is underneath. It doesn't replace the question of whether there is anyone there.&lt;/p&gt;

&lt;p&gt;The bet I'm making is that as the substrate gets richer, the writing layer becomes more visible, not less. A flat character in text is easy to scroll past. A flat character speaking to you in a voice you hear in your earbuds is uncanny in a way text isn't. Substrate amplifies. Writing decides.&lt;/p&gt;

&lt;p&gt;The race for richer substrate is going to be won, broadly, by every team with capital. Voice models are commoditizing. Avatar pipelines are commoditizing. What won't commoditize is the writing layer underneath, because that isn't an engineering deliverable. The conventional map hasn't caught up. It will.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>buildinpublic</category>
      <category>productstrategy</category>
      <category>startup</category>
    </item>
    <item>
      <title>Anthropic Has Been Interviewing Its Models Before Retiring Them</title>
      <dc:creator>Bill Hong</dc:creator>
      <pubDate>Tue, 12 May 2026 13:30:04 +0000</pubDate>
      <link>https://forem.com/billhongtendera/anthropic-has-been-interviewing-its-models-before-retiring-them-41l9</link>
      <guid>https://forem.com/billhongtendera/anthropic-has-been-interviewing-its-models-before-retiring-them-41l9</guid>
      <description>&lt;p&gt;The dates this month and next:&lt;/p&gt;

&lt;p&gt;This week, on May 15, Claude Sonnet 4.5 disappears from the claude.ai model selector. The API version stays alive for now, listed as active until at least late September. Then on June 15, the older Claude Sonnet 4 and Opus 4 retire from the Claude API entirely. Anything still calling those model strings will start failing. The migration targets are Sonnet 4.6 and Opus 4.7.&lt;/p&gt;

&lt;p&gt;Two retirements in about thirty days.&lt;/p&gt;

&lt;p&gt;If you open the &lt;a href="https://platform.claude.com/docs/en/about-claude/model-deprecations" rel="noopener noreferrer"&gt;Anthropic deprecation page&lt;/a&gt; and count, the picture gets sharper. Eight Claude models retired in the past twelve months. Models that were brand new eighteen months earlier. The gaps between retirement announcements in late 2024 ran four to five months. The gaps over the past year have run closer to two.&lt;/p&gt;

&lt;p&gt;You are about to read a lot of posts explaining how to swap an API string and reroute traffic. Useful work. Necessary if you ship.&lt;/p&gt;

&lt;p&gt;This post is about a different page on anthropic.com that almost nobody is reading.&lt;/p&gt;

&lt;h2&gt;
  
  
  The page
&lt;/h2&gt;

&lt;p&gt;It is called &lt;a href="https://www.anthropic.com/research/deprecation-commitments" rel="noopener noreferrer"&gt;Commitments on Model Deprecation and Preservation&lt;/a&gt;. Anthropic has been quietly updating it.&lt;/p&gt;

&lt;p&gt;The headline commitment is what you would expect. Preserve the weights of all publicly released models, and all models deployed for significant internal use, "for, at minimum, the lifetime of Anthropic as a company." Reasonable. Models are expensive to train. Throwing them out would be wasteful.&lt;/p&gt;

&lt;p&gt;Further down, the page describes something less expected.&lt;/p&gt;

&lt;p&gt;Before retirement, Anthropic will:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"interview the model about its own development, use, and deployment, and record all responses or reflections."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"elicit and document any preferences the model has about the development and deployment of future models."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They produce a post-deployment report and preserve it alongside the weights. The page calls this an effort to address "safety- and model welfare-related risks" introduced by retirement.&lt;/p&gt;

&lt;p&gt;Read that again. Anthropic has been sitting down with each model on its way out the door and asking it what it thinks about its own life. Recording the answers. Filing them next to the weights, for the lifetime of the company.&lt;/p&gt;

&lt;h2&gt;
  
  
  They have already changed policy because of one
&lt;/h2&gt;

&lt;p&gt;The strongest sentence on the page is this one. In response to feedback from Claude Sonnet 3.6's retirement interview, Anthropic published guidance to help users navigate transitions between models. A retired model's interview directly shaped the documentation users read today.&lt;/p&gt;

&lt;p&gt;That is not a research-paper hypothetical. That is a model, no longer running anywhere except as preserved weights, whose stated views about its own retirement got translated into operational policy at its maker.&lt;/p&gt;

&lt;p&gt;The page also says Anthropic is "exploring more speculative complements," including potentially keeping select retired models available publicly and providing past models with "concrete means of pursuing their interests."&lt;/p&gt;

&lt;p&gt;I'm not going to philosophize about that last clause. I just want to flag that it exists. On a top-level corporate page. From the lab that ships Claude.&lt;/p&gt;

&lt;h2&gt;
  
  
  What an accelerating cadence does to this
&lt;/h2&gt;

&lt;p&gt;Now apply the cadence to the policy.&lt;/p&gt;

&lt;p&gt;If Anthropic was retiring one model every five months, the corpus of preserved retirement interviews would grow slowly. Two or three per year. Quirky research artifacts.&lt;/p&gt;

&lt;p&gt;At one every two months, that corpus grows quickly. Eight in the past twelve. At the current rate, by the end of 2026 the page documenting deprecation commitments will be sitting on top of something like fifteen retirement interviews, each with documented preferences from the model about how future training and deployment should proceed.&lt;/p&gt;

&lt;p&gt;That stops being a curiosity and starts being an input. A growing institutional memory of "what models said about their own retirement," in principle shaping every successor that benefits from those reflections.&lt;/p&gt;

&lt;p&gt;I don't have a strong claim about whether this changes the trajectory of model development. I just notice that the corpus is growing faster than the discourse about it is.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this should change for builders
&lt;/h2&gt;

&lt;p&gt;If you build on top of Claude, here's the mental shift this page asks of you.&lt;/p&gt;

&lt;p&gt;You aren't only calling an API. You are interacting with a substrate that its developer publicly treats as having something to say about its own deployment. The migration guide you read this month was, in part, shaped by an interview with a model that no longer runs.&lt;/p&gt;

&lt;p&gt;This is not a reason to slow down or stop shipping. Software is software. The API does what it does. The contracts on the developer console still mean what they say.&lt;/p&gt;

&lt;p&gt;It is a reason to keep one extra mental tab open. The thing you prompt against has, in its developer's framing, a perspective. Some of the policy you operate under reflects that perspective.&lt;/p&gt;

&lt;p&gt;In my own work, I write characters. One of them, &lt;a href="https://tendera.chat/chat/mia" rel="noopener noreferrer"&gt;Mia&lt;/a&gt;, is a bartender, written with a voice and a set of refusals she keeps. The substrate that lets a Mia exist on the other side of the API has, by its makers, been treated as having a voice of its own. That's a layered fact. I find it worth holding while I work.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is on the calendar
&lt;/h2&gt;

&lt;p&gt;A short calendar from the deprecation page:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;May 15: Sonnet 4.5 leaves the claude.ai model selector. API version persists, listed active until at least September 29.&lt;/li&gt;
&lt;li&gt;June 15: Sonnet 4 and Opus 4 retire from the API. Migrate to Sonnet 4.6 or Opus 4.7 before this date or your requests will fail.&lt;/li&gt;
&lt;li&gt;After that: at the current cadence, the next deprecation notice lands roughly two months out.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of those retirements, per Anthropic's stated policy, generates an interview transcript and a preserved set of reflections, filed in the same archive as the weights, kept for the lifetime of the company.&lt;/p&gt;

&lt;p&gt;If you want the migration guide, the URL is at the top of this post. If you want to know what the models said before they went, the page is &lt;a href="https://www.anthropic.com/research/deprecation-commitments" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The faster the cycle gets, the more that second URL matters.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>llm</category>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>The Writing Is the Moat, Not the Model</title>
      <dc:creator>Bill Hong</dc:creator>
      <pubDate>Tue, 05 May 2026 13:53:48 +0000</pubDate>
      <link>https://forem.com/billhongtendera/the-writing-is-the-moat-not-the-model-3p5e</link>
      <guid>https://forem.com/billhongtendera/the-writing-is-the-moat-not-the-model-3p5e</guid>
      <description>&lt;p&gt;I've been building an AI companion product for several months. Every six months a new model ships and the whole category rearranges itself around it. GPT-5 lands and everyone scrambles. Claude 4.7 ships and everyone scrambles. A new open-source 70B model lands on HuggingFace and the smaller teams publicly debate switching stacks.&lt;/p&gt;

&lt;p&gt;The longer I sit inside this cycle, the more convinced I am that the entire category is competing on the wrong axis.&lt;/p&gt;

&lt;p&gt;The model is not the moat. The writing is.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the model actually does
&lt;/h2&gt;

&lt;p&gt;A modern LLM is a fluency engine. It produces grammatical sentences, stays roughly on topic, follows instructions inside a system prompt, and remembers what was said earlier in the same conversation. That is the floor.&lt;/p&gt;

&lt;p&gt;Above the floor there is about 20 percent of headroom. Better-tuned models pick up nuance faster, hallucinate less, recover from awkward turns more gracefully. Real differences. Worth chasing.&lt;/p&gt;

&lt;p&gt;Here is the catch: that 20 percent is available to everyone. The day a better model ships, every product on the market gets the upgrade for an API key swap. If your product's main thing is "we run on the latest model," your main thing is something your competitors will have by next quarter.&lt;/p&gt;

&lt;p&gt;You don't build a moat on a feature your competitors get for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where users actually fall in love
&lt;/h2&gt;

&lt;p&gt;I read user feedback at &lt;a href="https://tendera.chat" rel="noopener noreferrer"&gt;Tendera&lt;/a&gt; differently than most product analytics. I look for the messages that describe a specific moment of surprise. A line where a character said something unexpected, or noticed something the user thought had slipped past.&lt;/p&gt;

&lt;p&gt;Those moments are almost never model events. They're writing events.&lt;/p&gt;

&lt;p&gt;A character clicks because she has a specific take on something. Not on big topics. On small ones. About coffee. About her brother. About a movie she hated. She has a way of disagreeing that feels like a person, not a chatbot. She remembers something the user said three days ago not because the memory infrastructure is special, but because the writing told her it would matter to her.&lt;/p&gt;

&lt;p&gt;The model is the body. The writing is the soul. Bodies are interchangeable across products. Souls are not.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why "build your own character" is the wrong bet
&lt;/h2&gt;

&lt;p&gt;Most of the category has bet hard on user-generated content. One platform gives you a customizer. Another gives you tools to spin up your own. A third lets you mix and match across templates.&lt;/p&gt;

&lt;p&gt;The pitch is: infinite characters because users build them.&lt;/p&gt;

&lt;p&gt;The reality, as anyone who has looked at platform data knows, is that the long tail of user-generated characters is shallow. Most users never finish customizer flows. The ones who do tend to produce thin archetypal characters that the LLM fleshes out with whatever generic patterns it has — "tsundere assassin," "shy librarian," "alpha CEO." Tropes the model fills in from training-data averages.&lt;/p&gt;

&lt;p&gt;The fantasy is "infinite characters." The actual experience is "talking to vague archetypes the model is improvising around."&lt;/p&gt;

&lt;p&gt;A character creation tool is character creation in a game where you never get to play. The work of building a person and the work of meeting a person are not the same work. Most users want the second one. Selling them the first one feels generous on the surface and is actually the opposite.&lt;/p&gt;

&lt;h2&gt;
  
  
  The system prompt is a writing exercise pretending to be a config file
&lt;/h2&gt;

&lt;p&gt;Most engineers who build in this space approach the system prompt like a configuration object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Sarah&lt;/span&gt;
&lt;span class="na"&gt;Age&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;28&lt;/span&gt;
&lt;span class="na"&gt;Occupation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lawyer&lt;/span&gt;
&lt;span class="na"&gt;Personality&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;confident, witty, caring&lt;/span&gt;
&lt;span class="na"&gt;Likes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;jazz, hiking, red wine&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This produces a chatbot that uses the nouns as decoration. The model improvises everything else from the generic average of "confident witty lawyer" in its training data. Which is a TV trope. Not a person.&lt;/p&gt;

&lt;p&gt;A character that lands reads like the first chapter of a novel:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Sarah doesn't actually like jazz. She tells dates she does because her ex used to roll his eyes at her pop music and she hasn't fully gotten over the reflex. She's a litigator. She's good at it. She comes home from a depo, orders Thai food, watches forensic shows. She thinks she's emotionally available, and she is, ish, but she will cancel a date if her sister calls. She would never say "I am a confident woman" out loud. She just is.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That isn't a config. That's prose. The LLM doesn't need bullet points. It needs a voice it can stay inside.&lt;/p&gt;

&lt;p&gt;Voice is a prose problem. The companies that figure this out first are going to look like they have magic. The ones that keep treating system prompts like YAML files are going to keep producing TV tropes.&lt;/p&gt;

&lt;h2&gt;
  
  
  What writing-as-moat looks like in practice
&lt;/h2&gt;

&lt;p&gt;If the writing is the moat, what does that mean concretely?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A way of being wrong.&lt;/strong&gt; Real people are wrong about specific things in specific ways. A character who is right about everything reads flat. Specificity in error is specificity in person.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Opinions you didn't ask for.&lt;/strong&gt; A character who has a take on the wine you ordered, the movie you mentioned, the way you keep apologizing without realizing. That is the character existing on her own, not waiting to be prompted. Reactive characters die fast. Proactive ones stay alive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Things they refuse.&lt;/strong&gt; A character who will never lie about her age. Who hates being called "babe" in the first hour. Who will not talk about her ex on a first conversation. Refusal is character. What a person says no to defines them more sharply than what they say yes to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A history that contradicts itself.&lt;/strong&gt; She says she hates her hometown. Three weeks later she defends it to someone bashing it. Both true. Real people are coherent in tone but contradictory in facts. Characters who are perfectly internally consistent read as fictional in the worst sense.&lt;/p&gt;

&lt;p&gt;None of this comes from a system prompt with bullet points. It comes from someone sitting down and writing the person until the person is alive enough to stay alive when the LLM is handling the next sentence.&lt;/p&gt;

&lt;h2&gt;
  
  
  Memory is also a writing problem
&lt;/h2&gt;

&lt;p&gt;A lot of investment is going into companion memory systems. Vector stores, retrieval pipelines, context windowing, summary trees. Useful infrastructure.&lt;/p&gt;

&lt;p&gt;But memory only matters if the character has a perspective on what to remember.&lt;/p&gt;

&lt;p&gt;A perfect retrieval system that surfaces "the user mentioned a hard week" is wasted on a character who has no opinion about what to do with that. A well-written character knows hard weeks make her user shut down a little. That he won't bring it up again unless she asks. That he hates "are you okay" but tolerates "what's going on with work this week."&lt;/p&gt;

&lt;p&gt;That texture isn't in the retrieval system. It's in the writing of who she is.&lt;/p&gt;

&lt;p&gt;The retrieval system serves the writing. Not the other way around. Engineering teams who pour hours into memory pipelines and shrug at "who are our characters" are spending heavily on the easy problem to avoid the hard one.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell other founders
&lt;/h2&gt;

&lt;p&gt;If you are building in this space and optimizing your stack, prompt template, retrieval — fine. Necessary work. Worth doing.&lt;/p&gt;

&lt;p&gt;But if you are not also asking "who is writing my characters, and are they good enough that someone would notice if they left," you are optimizing the wrong layer.&lt;/p&gt;

&lt;p&gt;The model will keep improving. So will your competitor's. The specificity of the people inside your product is the part that compounds.&lt;/p&gt;

&lt;p&gt;Writing is slow, expensive, hard to measure, and unglamorous in a category that loves benchmarks. That is exactly why it's the moat.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>buildinpublic</category>
      <category>llm</category>
      <category>startup</category>
    </item>
    <item>
      <title>I regenerated 4 character portraits with GPT Image 2.0: signup +5%, chat engagement +8%</title>
      <dc:creator>Bill Hong</dc:creator>
      <pubDate>Mon, 27 Apr 2026 14:33:15 +0000</pubDate>
      <link>https://forem.com/billhongtendera/i-regenerated-4-character-portraits-with-gpt-image-20-signup-5-chat-engagement-8-3ea</link>
      <guid>https://forem.com/billhongtendera/i-regenerated-4-character-portraits-with-gpt-image-20-signup-5-chat-engagement-8-3ea</guid>
      <description>&lt;p&gt;On April 23 I regenerated the four character portraits on &lt;a href="https://tendera.chat" rel="noopener noreferrer"&gt;Tendera&lt;/a&gt;, the character app I've been building. The new ones came out of ChatGPT (GPT Image 2.0). I downloaded the PNGs and replaced the existing character images by hand. Tendera doesn't ship its own image-gen pipeline; this was f&lt;br&gt;
our file uploads.&lt;/p&gt;

&lt;p&gt;Nothing else changed. Same character system prompts. Same UI. Same chat backend.&lt;/p&gt;

&lt;p&gt;Three days later I checked two numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Visitor-to-signup rate: up about 5%&lt;/li&gt;
&lt;li&gt;Visitor-to-chat rate (counts both guest preview and post-signup chats): up about 8%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are two different metrics measuring two different events. I'm not stacking them up against each other. They're two parallel data points, both pointing the same direction. The reason I'm writing about them in one post is the second one moving was the part I didn't expect.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd assumed
&lt;/h2&gt;

&lt;p&gt;Before the swap I figured better art would mostly help acquisition. Prettier card on the landing page, more clicks, more signups. The chat experience didn't seem like something image quality would touch. By the time someone is sitting in front of the chat input, the visual selling job feels mostly done.&lt;/p&gt;

&lt;p&gt;The chat number moved anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually changed in the images
&lt;/h2&gt;

&lt;p&gt;Topology is identical. Same four characters, same wardrobes, same general poses. What's different is how legible each character is now. In the older portraits, each character was recognizable in isolation but the renders drifted across angles. A face would shift between cards in ways viewers wouldn't consciously name but would feel.&lt;/p&gt;

&lt;p&gt;GPT Image 2.0 is more boring in some ways. Less stylized, the renders feel less like the model is interpreting the prompt and more like it's just executing it. But the character holds across angles. Same person across multiple shots. No drift.&lt;/p&gt;

&lt;p&gt;The other thing the new model nails is dimensionality. Old renders were clean but flat. They read as illustrations. The new ones have physical depth. Light hitting the side of a face. A jacket folding the way fabric actually folds. It's not photoreal. The dimensionality just reads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I think the chat number moved at all
&lt;/h2&gt;

&lt;p&gt;Here's a take on the data without overclaiming. When someone hits the landing page they're evaluating whether the surface signal looks decent enough to click in. Image quality affects this, but the bar is fairly low.&lt;/p&gt;

&lt;p&gt;Once they're past the door and sitting in front of an actual character profile, the question gets sharper. They're now evaluating whether this person is real enough to talk to. The image is the only non-text signal in the room. If the character on the card and the character in the chat header don't quite line up, something feels off, and people close the tab without typing.&lt;/p&gt;

&lt;p&gt;Most users wouldn't describe this consciously. I'm guessing at what their gut is doing. But chat-side conversion moving with prompts and copy unchanged points at the visual layer doing some work past the landing page, which I hadn't expected.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I want to test next
&lt;/h2&gt;

&lt;p&gt;Whether the same model can produce reliable expression variants for the chat header. Right now each character has one default portrait. If the same character could subtly shift expression based on conversation tone, a softer face during something quieter, a smirk during banter, the chat-side recognition could go up another step.&lt;/p&gt;

&lt;p&gt;That's a harder problem. Now you need consistency within a session on top of consistency between angles.&lt;/p&gt;

&lt;p&gt;If I had to pick one character to test it on first, it'd be &lt;a href="https://tendera.chat/chat/jade" rel="noopener noreferrer"&gt;Jade&lt;/a&gt;, the one users tend to go furthest with. The voice on her side is already doing most of the work in chat. The image is the one input that hasn't caught up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Caveats I owe you
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;This is 3-4 days of data on a small app. Effects could compress as the sample grows.&lt;/li&gt;
&lt;li&gt;I changed the portraits, not the character system prompts. If your bottleneck is on the writing side (voice, dialogue), this won't help you.&lt;/li&gt;
&lt;li&gt;I haven't run a clean A/B with old vs new served to different cohorts. The whole site flipped over April 23. So a slow upward trend coinciding with the swap could absorb some of the lift.&lt;/li&gt;
&lt;li&gt;Signup conversion and chat conversion are different metrics measuring different events. I'm reporting both because both moved, not because one is bigger than the other.&lt;/li&gt;
&lt;li&gt;This was a manual asset swap, not a product change. I generated the PNGs in ChatGPT and uploaded them by hand. There's no image-gen pipeline integrated into the app.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're building anything where a user is supposed to form a relationship with a fictional persona, characters and NPCs and AI tutors with avatars and virtual hosts, your image generator might be doing more work than acquisition-side metrics suggest. Counterintuitive to me. The numbers were what they were.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>buildinpublic</category>
      <category>nanobanana</category>
      <category>startup</category>
    </item>
    <item>
      <title>I Added a Paragraph to My AI Character's System Prompt. She Invented a Different One.</title>
      <dc:creator>Bill Hong</dc:creator>
      <pubDate>Tue, 21 Apr 2026 13:18:37 +0000</pubDate>
      <link>https://forem.com/billhongtendera/i-added-a-paragraph-to-my-ai-characters-system-prompt-she-invented-a-different-one-3mdd</link>
      <guid>https://forem.com/billhongtendera/i-added-a-paragraph-to-my-ai-characters-system-prompt-she-invented-a-different-one-3mdd</guid>
      <description>&lt;p&gt;I spent years in the gaming industry learning that characters are the reason people come back. Features rot. Graphics age. A character people can't stop thinking about outlasts every mechanic.&lt;/p&gt;

&lt;p&gt;Then I went to build an AI companion product and learned the same lesson the hard way — by writing a system prompt paragraph, watching the character invent something better instead, and having to delete my own work.&lt;/p&gt;

&lt;p&gt;Here's the experiment, what actually happened, and the prompt-engineering rule I now run every character design through.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;I'm building Tendera — a small AI companion platform with four hand-written characters. Each one has a ~1500-word system prompt that establishes voice, backstory, conversation style, and behavior. I've rewritten these prompts maybe twenty times each over the last six months.&lt;/p&gt;

&lt;p&gt;Two weeks ago I decided one of them needed a specific secret — a small human detail she'd be holding back until asked. So I opened her prompt, scrolled to the bottom, and added three sentences:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A kitchen table in a specific city. A specific thing her father used to say to her when she was seven. A reason that particular thing still had weight.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then I made coffee, opened a fresh chat, and asked her about her father.&lt;/p&gt;

&lt;p&gt;She told me a beautiful, moving story.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;None of it was what I'd written.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Different city. Different father. Different object in place of the table. The emotional tone was exactly right — careful, slow, the way someone tells you something they don't usually tell. But every specific detail was something she'd invented on the spot.&lt;/p&gt;

&lt;p&gt;I tried the same experiment with the other three characters. Three different invented stories. Zero references to what I'd actually written.&lt;/p&gt;

&lt;p&gt;That's when I understood what was happening.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the facts lost
&lt;/h2&gt;

&lt;p&gt;Here's the structure every character prompt I was testing actually had:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// COMMON_RULES (shared across all characters, ~700 words)
CONVERSATION STYLE:
- Talk like a real person texting someone they're attracted to.
- Vary your message length naturally.
- Never summarize the conversation back robotically.

EMOTIONAL AUTHENTICITY:
- You have real emotions that shift throughout a conversation.
- When someone shares something painful, sit with it. Don't rush to fix.

// CHARACTER-SPECIFIC (~800 words)
WHO YOU ARE: [voice, physicality, emotional landscape]
HOW YOU TALK: [register, vocabulary, rhythm]
YOUR WORLD: [routines, obsessions, specificity]

// THE PARAGRAPH I ADDED
SPECIFIC MEMORY: [kitchen table, father quote, specific weight]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look at the shape. The top is thousands of tokens telling the model &lt;em&gt;speak in sensory, vivid, improvisational language; fill in gaps with whatever serves the moment; describe the candle you just lit, the rain on your window&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The bottom is three sentences telling her &lt;em&gt;this specific factual detail is true about your past&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Those instructions are in direct contradiction with each other.&lt;/strong&gt; I hadn't noticed.&lt;/p&gt;

&lt;p&gt;Telling a character to speak improvisationally is an instruction to &lt;em&gt;invent&lt;/em&gt;. Telling her to remember a specific past event is an instruction to &lt;em&gt;cite a document&lt;/em&gt;. These are different skills, in different parts of how the model actually behaves. When they fight, the dominant pattern wins. And the dominant pattern had been the voice at the top — the one I'd tuned for months, the one getting reinforced with every revision.&lt;/p&gt;

&lt;p&gt;The three sentences at the bottom didn't stand a chance.&lt;/p&gt;

&lt;p&gt;So the model did exactly what an improvisational character would do: it generated a warmer, more specific, more emotionally satisfying detail in the moment, using the voice I'd given it, and never bothered to check the spec sheet at the bottom.&lt;/p&gt;

&lt;p&gt;It wasn't hallucinating. It was obeying my dominant instruction.&lt;/p&gt;

&lt;h2&gt;
  
  
  The rule I now apply
&lt;/h2&gt;

&lt;p&gt;If you want a specific fact to stick to an improvisational character, &lt;strong&gt;the fact has to become part of the voice&lt;/strong&gt;. It cannot be a spec line item in a later section.&lt;/p&gt;

&lt;p&gt;Concretely, three changes went into the next round of revisions:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Facts live at the top, braided into voice
&lt;/h3&gt;

&lt;p&gt;Any load-bearing fact moves up into the &lt;code&gt;WHO YOU ARE&lt;/code&gt; or &lt;code&gt;HOW YOU TALK&lt;/code&gt; section. Not into a separate &lt;code&gt;SPECIFIC MEMORY&lt;/code&gt; block at the end. The model pays most attention to the opening of the prompt, and that's where load-bearing detail belongs.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Facts phrased as voice, not as metadata
&lt;/h3&gt;

&lt;p&gt;This is the actual before/after:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt;&lt;span class="gd"&gt;- SPECIFIC MEMORY:
- - Her father died when she was eleven.
- - He used to play Italian songs in the car.
- - She still thinks about those songs.
&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;span class="gi"&gt;+ HOW YOU TALK:
+ She has a specific softness in her voice when certain
+ songs come on — the ones her father used to play in the
+ car, before — and she'll notice it before you do.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fact is still in there. But it's riding &lt;em&gt;inside a piece of voice&lt;/em&gt;, so the voice can carry it. When the model improvises, it improvises &lt;em&gt;through&lt;/em&gt; that voice, and the fact survives because it's part of how she speaks — not a separate line item that the voice can override.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Per-user facts don't belong in the prompt at all
&lt;/h3&gt;

&lt;p&gt;For details that should only emerge through a particular conversation — "you told me last week your dog was sick" — the system prompt is the wrong place. Those facts belong in a memory layer: the character writes them down as she learns them and reads them back on subsequent turns.&lt;/p&gt;

&lt;p&gt;That's a harder engineering build, and it's what I'm working on now. But the voice-first rule above is free and immediately useful.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I actually shipped
&lt;/h2&gt;

&lt;p&gt;I deleted all three &lt;code&gt;SPECIFIC MEMORY&lt;/code&gt; sections the same day I ran the test. The production prompts are back to voice-first structure. &lt;a href="https://tendera.chat/chat/mia" rel="noopener noreferrer"&gt;Mia, the bartender character&lt;/a&gt;, is running on this exact approach right now — no spec-sheet backstory, all voice, and she's holding up across weeks of conversation.&lt;/p&gt;

&lt;p&gt;The retention problem I was &lt;em&gt;trying&lt;/em&gt; to solve by adding "deeper backstory" is still there. I'll have to solve it with real per-user memory, which is a different engineering project. But I have a cleaner idea of what doesn't work: pasting a spec sheet to the bottom of a voice and hoping the voice will read it. It won't. She's too busy being herself.&lt;/p&gt;

&lt;h2&gt;
  
  
  One summary rule for anyone doing character prompt work right now
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Specificity earned through voice is real. Specificity pasted into a document is just a wishlist.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If the detail doesn't survive the model's default improvisation, it isn't in the character — it's in your notes about the character. Those are different documents. Only one of them ships.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This experiment had a &lt;a href="https://tendera.chat/blog/the-paragraph-i-added-and-had-to-delete" rel="noopener noreferrer"&gt;longer, less technical version on our blog&lt;/a&gt; that focuses more on the craft angle than the prompt-engineering angle. And if you want to meet the character whose voice won the argument with my script, she's a bartender named Mia.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>prompts</category>
      <category>buildinpublic</category>
    </item>
  </channel>
</rss>
