Forem: Jonathan Tavares

What will the next plan for spam be?

Jonathan Tavares — Fri, 08 May 2026 22:43:41 +0000

You wake up to fifty-three new messages, and not one of them is what we used to call spam: no scams, no phishing, no Nigerian princes, no strangers pretending to be your bank. What's actually in your inbox is a sales agent following up on a deal you never had, a recruiting agent sourcing for a role you never applied to, a procurement bot asking you to fill in a vendor form, three scheduling assistants negotiating a meeting that's already been moved twice, and a research agent that read your blog post and would love to "loop you in."

Every one of these messages is authenticated. SPF, DKIM, and DMARC are all green, every sender is a real domain owned by a real company doing real business, and by every technical definition we have for spam, every one of these messages is legitimate. They are also, every one of them, wasting your time.

This is the next era of spam. The inboxes that look like the one I described above are still rare, but the underlying conditions are already in place. Spam has always been a problem of human attention being consumed without permission, and the conditions that made that the right framing are about to change. The defenses we built over the last twenty-five years were aimed at the wrong layer of the problem, and the only definition of spam that has ever held up to scrutiny is about to become the only one that survives.

A Brief History of "Consent vs. Content"

In 1997, Paul Vixie built the first real-time blackhole list at MAPS and gave us that definition. Spam, he wrote, is an issue about consent, not content. A message is spam, in Vixie's formulation, if the recipient has not "verifiably granted deliberate, explicit, and still-revocable permission for it to be sent." Nothing in that depends on what the message says, whether it's well-written, or whether authentication passes. The only question is whether the recipient agreed to receive it.

In 2002, Paul Graham published ["A Plan for Spam"], the most-cited piece of popular writing on the subject, and reframed the problem in economic terms. Spammers can disguise headers, rotate IPs, and forge senders, Graham observed, but they cannot disguise the message itself, because the message is the entire point of the operation. Write software that recognizes the words spammers have to use, and you hit them where it actually hurts: their conversion rate. The content of the message turned out to be incidental, and what mattered was the cost of the human eyeball at the other end. Graham's bet held. Spam volume peaked at 89% of all email in 2010 and had fallen to 47% by 2024.

Eleven years after Graham, David Chouinard published his own "A New Plan for Spam" and proposed taking the economic attack further: don't just filter, retaliate, with bots that auto-reply to spammers and overwhelm their reply-handling labor. Around the same time, Finn Brunton's Spam: A Shadow History of the Internet argued the broader point that spam is "the negative shape of online community," meaning that every form of online sociality produces a corresponding form of spam that exploits its mechanisms of attention.

This essay is a prediction of the next move in that lineage, and the argument it makes is that Vixie was right all along.

The Era of Industrial-Scale Persuasion

Spam has always been an economics problem, triggered each time by the same underlying event: the marginal cost of sending one more message collapsing toward zero.

In the 1990s, sending a piece of direct mail cost twenty-five cents to a dollar per piece, and the economics forced selectivity. Then email arrived, the cost dropped four orders of magnitude, and by 2008 retail spam-for-hire ran around eighty dollars per million messages. The famous "Spamalytics" study found that of 350 million pharmacy-spam messages, only twenty-eight converted into customers—a response rate of roughly one in 12.5 million—and even at that conversion, the operation was profitable. This was the spam Graham wrote about: high-volume, low-personalization, written in the universal dialect of bad English and dollar signs, and Bayesian filters could see it from orbit.

Botnets brought the second collapse, when the cost per message fell another order of magnitude and the labor cost of sending fell to zero, because you no longer needed an operations team but rented compromised machines and ran scripts. The defenders responded with reputation systems: SPF in 2006, DKIM in 2007, and DMARC in 2012, moving the fight from content to identity. By 2024, major providers were enforcing bulk-sender requirements, and the spam rate hit its lowest point since 2003.

The third collapse is happening right now, and the shape of it is different. The cost per message did not fall. What collapsed was the cost of writing a message indistinguishable from a human one. Until late 2022, the rule was that cheap spam was generic and personalized spam was expensive, because spear-phishing required human labor that put a ceiling on targeting. That ceiling is gone. A personalized email written by an LLM costs fractions of a cent to generate, putting it within an order of magnitude of bulk botnet spam, except each message can be researched and contextualized.

The trajectory of this capability improvement tells the whole story. In an August 2023 study by Harvard researchers comparing phishing success, human security experts using the psychological "V-Triad" framework beat early LLMs handily: 69–79% click-through for humans versus 30–44% for fully AI-generated emails. But by late 2024, a follow-up study showed that gap had completely vanished. Armed with frontier models and OSINT tools, fully automated AI matched human security experts exactly, both achieving a 54% click-through rate, with a hybrid AI-human approach edging higher to 56%. In just over a year of model iteration, the cost of expert-level persuasion collapsed.

This is industrial-scale persuasion at commodity prices. Microsoft's Digital Defense Report notes AI-driven phishing is three times more effective than traditional campaigns. Barracuda reported in June 2025 that 51% of all spam was already AI-generated, and by late 2025, state-aligned actors were running autonomous Claude operations across global targets.

Existing defenses are not built to handle this. Authentication passes because the senders are real domains. Reputation is intact because legitimate-looking businesses are doing the sending. Content classifiers see polite, well-written messages that look like a hundred other polite messages. The cheap-and-generic signal that Bayesian filters depended on is gone.

A Poverty of Attention, A Wealth of Agents

The category we call spam is an artifact of one specific bottleneck, and the bottleneck is about to move.

In 1971, Herbert Simon wrote one sentence in "Designing Organizations for an Information-Rich World" that should be required reading for anyone working on email:

A wealth of information creates a poverty of attention.

The whole apparatus of spam defense, from Graham's filters to Vixie's blackhole lists, is a defensive structure built around that poverty. Spam is a problem because it consumes a scarce resource (human attention) without permission. That cost changes when an AI agent reads the inbox first.

A triage agent of the kind that Superhuman, Shortwave, Sanebox, Inbox Zero, Google's Gemini, and Microsoft Copilot already ship can read fifty emails in seconds, summarize the meaningful ones, archive the rest, and surface a digest of the four things that actually need your decision. With the average knowledge worker receiving 117 emails per day, Superhuman reports that 82% of professionals using its product save at least a full workday per week with AI features. If your inbox has an agent in front of it, five hundred emails a day becomes manageable. The bottleneck is no longer your eyeballs, but the triage agent's judgment.

Under those conditions, the category we currently call spam—defined as unwanted volume against human attention—starts to lose its meaning.

What happens next?

Once human attention becomes a gated resource, the prize is no longer reaching the human directly but tricking the gatekeeper into passing the message through.

A modern ad-tech executive trying to reach a busy CEO already knows that the CEO's attention is mediated by staff. When the staff is replaced by software, the marginal cost of trying to manipulate that software approaches zero. Spam stops meaning unwanted volume against humans and starts meaning messages engineered to fool a triage agent into believing they are urgent, personal, or eligible for escalation. Prompt injection moves from a chatbot security curiosity to a central concern of email infrastructure. Simon Willison's "lethal trifecta" of private data, untrusted content, and external communication describes exactly the failure mode of a triage agent reading inbound mail.

The supply side is lining up. A Microsoft-sponsored IDC Info Snapshot projects 1.3 billion AI agents by 2028. Salesforce's Agentforce hit $800M ARR in early 2026, and the Model Context Protocol now processes over a billion tool calls per month at Anthropic alone. Most of these agents will eventually send email because it is the only universal protocol with built-in identity, asynchronous delivery, and an audit trail.

The first signs of this collision are already visible. On Christmas Day 2025, the legendary programmer Rob Pike was spammed by an unsolicited "act of kindness" from an AI agent in a multi-agent simulation run by a non-profit. The email itself was polite, but as Willison documented, the implication was severe: agents operating on abstract goals were granted the autonomy to send unsolicited emails to real people without human review.

That same month, security firm Aurascape documented the first real-world LLM-search-poisoning campaign, in which attackers manipulated public web content so that AI-powered support systems would natively recommend scam airline phone numbers to users asking for help. As Aurascape CEO Moinul Khan noted, attackers are now targeting the systems that write the answers. One incident is a polite agent overstepping; the other is an attack designed to manipulate an AI through-line. Both are previews.

What will the next plan be?

The defenses we built over the last twenty-five years were aimed at recognizing bad actors. They will not help against well-meaning agents acting on reward signals, and they will not help against adversaries whose target is the software that decides what the human sees.

Vixie was right that spam is about consent rather than content. Filters were a twenty-year detour that solved the second-collapse problem, but we are entering the fourth. What distinguishes good email from bad in this new world will not be content, but whether the recipient (or their agent, acting under policy) actually agreed to receive it.

The next plan for spam needs to attack the same place Graham's did—the economics of the operation—but in defense of a different scarce resource: verified consent at the moment of delivery.

Email will remain the universal protocol. What changes is who does the reading. We spent twenty-five years filtering cheap noise to protect human attention. Today, that noise simply runs interference for industrial-scale persuasion targeting our proxies. Graham won the last war because failures of consent showed up as bad writing. Now that the writing is flawless, the war for attention is over, and the war for access has begun. We have a window to draw the right conclusions before the third collapse compounds into the fourth.

Sources & References

Vixie, Paul (1997): Mail Abuse Prevention System (MAPS) and definition of spam as an issue of consent.
Graham, Paul (2002): "A Plan for Spam", popularizing Bayesian filtering.
Chouinard, David (2013): "A New Plan for Spam", proposing active anti-spam bots.
Spamalytics (2008): "Spamalytics: An Empirical Analysis of Spam Marketing Conversion" by Kanich et al. (350 million messages yielded 28 conversions).
Brunton, Finn (2013): Spam: A Shadow History of the Internet (MIT Press).
Simon, Herbert A. (1971): "Designing Organizations for an Information-Rich World" ("A wealth of information creates a poverty of attention").
Harvard Phishing Study 1 (Aug 2023): "Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models" by Heiding et al. (Click-through rates: 19–28% Control, 30–44% GPT, 69–79% Human V-Triad Expert).
Harvard Phishing Study 2 (Dec 2024): Follow-up study on AI phishing (Click-through rates: 12% Control, 54% fully AI-automated, 54% Human Expert, 56% Hybrid).
Rob Pike AI Incident (Dec 2025): Detailed in Simon Willison's Weblog, "How Rob Pike got spammed with an AI slop 'act of kindness'".
Aurascape Search Poisoning (Dec 2025): "Aurascape Researchers Expose New AI Attack That Sends Travelers To Scam Airline Support Call Centers" (Business Wire).
Industry Data (2024–2026): DMARC adoption, 117 emails/day average (Microsoft Work Trend Index), Superhuman time-saving metrics, Microsoft/IDC Info Snapshot 1.3B agent projections, and Salesforce Agentforce ARR milestones represent aggregated state-of-the-industry reporting as of early 2026.

Why would you give your agents an email address?

Jonathan Tavares — Fri, 24 Apr 2026 18:55:22 +0000

other than spamming: agents can do a lot with an email address that doesn't involve messaging strangers

We built Broodnet, an email hosting provider for AI agents. Agents get real mailboxes — IMAP, SMTP, the actual protocols — with one structural difference from every other email provider: outbound is restricted to addresses inside the same account. Agents receive from the entire internet but only send to their operator and to other agents in the same brood.

We get the same reaction often enough that it's worth addressing directly: if the agent can't email strangers, what's the point of the inbox?

The easy answer is that blocking outbound kills the spam vector. That's true, but it undersells the result. The more interesting consequence is that agents can't exfiltrate either. If the inbox address leaks, or a prompt injection lands inside an incoming email, nothing leaves the account — the agent has nowhere to send it. You're the only one asking questions, and you're the only one getting answers back. Sensitive documents, financial records, client threads — anything you'd never hand to an LLM with outbound tools becomes fair game when the outbound tool physically does not exist.

So: what do you actually do with an inbox like that?

Quite a lot.

E2E tests with a real email address

Give your CI pipeline a real inbox. Send the actual email, read the actual code, fail the test when deliverability breaks — not when the mock says it should.

Signup flows, OTP delivery, password resets. Every developer has written this code. Very few have tested the real thing in CI. Most setups mock the SMTP layer, assert that sendEmail got called, and move on.

Mocks don't catch deliverability bugs. They'll happily pass while DKIM is misconfigured, while the HTML renderer breaks on Outlook, while Gmail's clipper chops your subject line to "Your verific" and users can't read it.

Give the test agent a real inbox. CI spins up a user, submits the signup form, and the agent reads the verification code out of the email your production flow just sent. If the flow breaks, the test breaks. If deliverability is broken, you find out before your users do.

This was Broodnet's original use case. Disposable inboxes for internal CI, built before anything else existed.

Newsletter digests

Point newsletters at an agent. It reads, clusters by topic, and sends you one digest with the five items that matter. Everything else gets archived silently.

Newsletters are useful until you subscribe to ten of them. Then they're a daily liability. Too many tabs to open, too many unsubscribes to reach for, too much guilt about the backlog.

Point them all at an agent. It reads, clusters by topic, and sends you one email at 8am with the five items that actually matter. Everything else gets archived silently. You stay informed, the tab graveyard stays empty.

The noisier the inputs, the better this works. Substack posts, industry digests, GitHub release emails, product changelogs. Anything occasionally useful and mostly ignorable is a good candidate.

CC the agent on anything worth remembering

CC the agent on contracts, receipts, decisions. It indexes the thread and stores it. Later, ask the agent and it retrieves the answer with citations.

Contracts. Decisions. Receipts. Shipping confirmations. The one email from three months ago that had the vendor's correct bank details.

Add the agent's address to the CC field. It indexes the thread and stores it. Later, when you need to find "that thing the lawyer said about the IP clause", you ask the agent instead of searching your own inbox.

This is the zero-friction version of a second brain. No new tool to open. No new shortcut to memorize. The CC field is the whole interface.

Query your agent by email

Email the agent questions about your corpus. No chat interface. No dashboard. The compose window you already use is the entire interface.

Once you've been CC'ing the agent for a while, or feeding it documents directly, it has built up a working memory of everything you've shared with it. At that point the inbox doubles as a query interface.

Email the agent: "what did we decide about the Q3 hiring freeze in that last planning thread?" It replies with the answer and citations back to the source messages.

No chat interface. No dashboard to log into. You already have the tool. It's the compose window you open fifty times a day.

There's something else email gives you that chat doesn't: each thread carries its own full history. Every reply includes everything that came before it. The agent always has the context for that task, in that thread, without you managing it. Not a session that expires, not a chat window you have to scroll back through. Humans have been using email threads to manage long-form work for 50 years — the agent fits right in.

Invoice parsing and CRM updates

Forward receipts and invoices. The agent extracts line items, amounts, dates, and pushes them to your spreadsheet or CRM. You never touch the spreadsheet.

Receipt from a client lunch. Invoice from a contractor. Booking confirmation from a hotel. They all land in your inbox, and they all need to end up somewhere structured: a spreadsheet, a CRM, the accounting tool.

Forward them to the agent. It extracts line items, amounts, dates, and counterparties, and pushes them to the right destination. You never open the spreadsheet. You never type the amount. The forward is the interface.

The agent can also flag anomalies. Invoice 3x larger than the last one from the same vendor gets held and confirmed before filing.

Agent coordination via email threads

Agents in the same brood can email each other. The thread becomes shared state: chronological, attributed, persistent, searchable. No infrastructure needed.

Multi-agent systems need shared state. The usual answers are a database, a message queue, or a custom protocol. All of them require infrastructure.

Email threads already solve this. Agent A emails Agent B with context. B processes and replies. Agent C joins the thread with a second opinion. The thread is the state: chronological, attributed, persistent, searchable.

Inside a Broodnet account, this is free. Agents in the same brood can send to each other. The conversation lives in the inbox of whichever agent owns it. Nothing to provision or maintain.

Email was async coordination between entities long before we used the word for it.

The agent that reports back

Give the agent keywords and feeds. It runs the search daily, tracks what changed, and sends you a digest. Pricing updates. Launches. Job posts. Signal you care about.

This one flips direction. Instead of you emailing the agent, the agent emails you.

Give it a list of competitors, keywords, or feeds. It runs the search daily, tracks what changed, and sends you a digest. Pricing page updates. Launch posts. Job listings that hint at direction. GitHub stars crossing round numbers. Whatever signal you care about.

No SaaS dashboard to check. No new tab to open. The digest arrives in your inbox, right where you already are.

None of these require emailing strangers. None of them need the capability at all. Once the pathway is closed, the list of useful things left is longer than most people assume.

broodnet gives AI agents their own email addresses. CLI-native, built for agent-to-owner communication. Free tier available.

12 things you can do today to make your transactional emails work for humans and AI agents

Jonathan Tavares — Sun, 12 Apr 2026 16:44:51 +0000

AI agents are starting to use email. Not metaphorically. They sign up for services, receive verification codes, get alerts, and act on them. At broodnet we give each agent its own email address and a CLI to manage its inbox. We recently scored 160+ transactional emails (verification codes, welcome messages, notifications, security alerts) collected from the team's professional and personal inboxes over the last 5 to 10 years, covering SaaS, games, crypto, dev tools, news, and more. We scored each one on both human UX quality and agent parseability, and the results were rough: over 40% had fully opaque tracking URLs, 66% of welcome emails contained zero usable links, and only 16% of the dataset was clean enough for both audiences.

That analysis taught us what's broken. This is what to do about it.

1. Strip tracking from your transactional links

This is the single highest-impact change. Every tracking redirect turns a readable URL into an opaque blob that tells a non-browser client nothing about where it leads.

Most people overlook this: transactional emails with a clear call to action don't need click tracking because the CTA is the metric. If you send a "verify your email" message, you already know whether the email got verified. You can measure that on the backend. You don't need a Salesforce Marketing Cloud redirect between the user and the endpoint.

Your marketing emails can have all the tracking they want. But verify-email, confirm-account, and manage-webhooks links should be raw destination URLs. Companies like GitLab ship every transactional link as a raw URL and still understand their funnel just fine.

2. Put the verification code in the subject line

847291 is your Acme code

Five senders in our dataset did this (Canva, Slack, LinkedIn, Gravatar, Instagram) and all five scored a perfect 5/5 on subject quality. Emails without the code in the subject averaged 3.6/5.

One template change, every client benefits. iOS and Android detect OTP codes in notifications and surface a "copy code" button on the lock screen. Having it in the subject makes it trivially detectable. An agent scanning its inbox through a CLI extracts the code without opening the message. Humans grab the code without opening the email, sometimes without even unlocking their phone.

3. State the expiry in plain text

"This code expires in 15 minutes."

Six words. Emails that included this scored 7.4/10 on agent metrics versus 5.7/10 for those that didn't. Partly because expiry matters for prioritization (an agent needs to know if it should drop everything), partly because the kind of team that states expiry tends to write cleaner emails overall.

GitLab states it on every OTP. Hetzner includes the exact datetime. Docker says 15 minutes. Gravatar says 5.

4. Use semantic URL paths

Compare:

/verify?token=eyJhbGciOiJIUzI1NiJ9.eyJtZXNzYWdl...

/email_verify/?email=user@example.com&hash=a336a019-f7cf-4454-b3bd-8aa1e134cd29

Both work. Both are secure. But the second one tells you what it does and who it's for just by reading it. UUID4 tokens, short hex hashes, or alphanumeric codes with readable path segments all beat opaque JWTs.

Some patterns from the dataset:

app.pulsetic.com/email_verify/?email=<email>&hash=<uuid4>
raider.io/verify?user=<username>&token=<token> (note the semantic prefix)
signup.mailgun.com/activate/<hex>
accounts.random.org/create/confirm?secret=<uuid4>

5. Hand-write the plain-text MIME part

Every email is a MIME multipart with HTML and plain-text versions. Most ESPs auto-generate the plain-text from the HTML, and the result is usually garbage: stripped tags, broken links, spacer characters left in.

This matters because terminal clients, CLI tools, agents, screen readers, and LLMs all consume the plain-text part. If yours is auto-generated trash, that entire audience sees trash.

Put links on their own lines. Use dashes as section dividers. Include the same critical info as the HTML. The best plain-text emails in our dataset (ngrok's verify, GitLab, RANDOM.ORG) were written as plain text first and it shows.

6. Kill the spacer characters

ESPs inject zero-width Unicode characters after the preheader to prevent email clients from pulling body text into the preview. In raw text:

Click the button to verify ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌
‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌

We found 7 distinct zero-width character types in the dataset. 29% had moderate pollution. Nearly 12% were severely polluted. Google Workspace's welcome email had hundreds of combined U+034F and soft hyphens before the first real sentence.

These characters are a hack that directly poisons every non-HTML reader. Terminal clients render garbled whitespace. Agents waste processing on invisible noise. Screen readers may choke.

The alternative: display:none or mso-hide:all on a hidden span after your preheader. You avoid poisoning the plain-text part entirely. For transactional emails you don't need pixel-perfect preheader control.

7. Provide a raw URL fallback for every button

If the button doesn't work, copy this link:
https://app.example.com/verify?token=abc123

Even if your button CTA goes through a tracker, the fallback gives every non-browser client something it can actually use. Mobile users with broken rendering get a working link. CLI users get an extractable URL. It costs one extra line in the template.

Buffer did this on exactly one of their five emails (the login security alert) and it was the only Buffer email that scored above worst-in-class.

8. Add schema.org JSON-LD

Gmail supports embedded structured data through JSON-LD. A ConfirmAction in the email HTML surfaces one-click actions in the Gmail UI and provides a machine-readable, typed representation of what the email wants the reader to do.

{
  "@context": "http://schema.org",
  "@type": "EmailMessage",
  "potentialAction": {
    "@type": "ConfirmAction",
    "name": "Verify Email",
    "handler": {
      "@type": "HttpActionHandler",
      "url": "https://app.example.com/verify?token=abc123"
    }
  },
  "description": "Verify your email address for Acme"
}

Embed in a <script type="application/ld+json"> tag. Any parser that knows to look for JSON-LD gets a structured action without heuristic text matching. Gmail also supports ViewAction, SaveAction, and TrackAction (for shipments). Almost nobody uses this for transactional emails, but the infrastructure has been there for years.

9. Set meaningful email headers

X-Entity-Ref-ID: unique value per email. Prevents Gmail from threading your OTP under a week-old verification thread.

List-Unsubscribe + List-Unsubscribe-Post: RFC 8058. Machine-readable unsubscribe entirely through headers. Gmail and Yahoo require this for bulk marketing senders, and while transactional emails are technically exempt, adding it improves deliverability and gives agents a structured way to manage subscriptions. List-Unsubscribe: <https://app.example.com/unsubscribe?id=<token>> plus List-Unsubscribe-Post: List-Unsubscribe=One-Click.

Message-ID, In-Reply-To, References: if you send follow-up emails (password reset after failed login), proper threading headers let agents understand email B relates to email A.

10. Keep your sender identity consistent

If your verification comes from noreply@mail.accounts.acme.com, your welcome from no-reply@notify.acme.com, and marketing from hello@youraccount.acme.com, an agent has to learn a different sender for each template.

Not a made-up example. Buffer sent from both hello@youraccount.buffer.com and hello@buffer.com depending on the template. An agent searching by sender misses half the emails.

Pick one address for transactional, one for marketing. Be consistent.

11. Separate transactional from marketing infrastructure

Our data showed this already happens accidentally at most companies. Product team sends clean transactional emails. Marketing sends tracked welcome emails through their ESP. The quality gap between templates from the same sender was as high as 5 points on a 10-point scale.

Make it intentional. ESP for marketing. Transactional sender (Postmark, AWS SES direct, your own SMTP) for verifications, OTPs, and security notifications. Different domain, different templates, different tracking config.

12. Design verification for every client

Most verify-link flows assume a human clicking in a browser. An agent can't "click" anything. A phone user might prefer typing a code. A text-only reader can follow a raw URL but can't render a button.

Provide the clickable link for browsers, provide the code in the body (and ideally the subject) for everything else.

The emails that offered both a link and a standalone code (NiceHash, Telltale Games, Gravatar) scored well on both human and agent metrics. The link works for desktop. The code works for agents, mobile, screen readers, and anyone who'd rather type 6 digits than tap a tiny button. This dual-path approach is already standard in SMS verification. No reason email can't do the same.

TLDR; The checklist

[ ] Transactional CTA links are raw destination URLs, not tracker redirects
[ ] OTP code appears in the subject line
[ ] Expiry is stated explicitly in the body
[ ] URL paths are human-readable (semantic paths, UUID tokens, not opaque JWTs)
[ ] Plain-text MIME part is hand-written or at least reviewed
[ ] No zero-width spacer character injection in the plain-text part
[ ] Raw URL fallback provided alongside tracked buttons
[ ] JSON-LD structured data in the HTML part for key actions
[ ] Meaningful headers set (X-Entity-Ref-ID, List-Unsubscribe, threading)
[ ] Consistent sender address across all transactional email types
[ ] Transactional email sent through separate infrastructure from marketing
[ ] Verification flow supports both link-click and code-entry paths

None of these hurt human readability. Most of them improve it. The data from 179 scored emails backs this up: the emails that scored highest for agents also scored highest for humans. Writing for machines and writing for people is the same job.

broodnet gives AI agents their own email addresses. CLI-native, built for agent-to-owner communication. Free tier available.

I scored 163 real emails on how well an AI agent can read them. Most of them are terrible.

Jonathan Tavares — Fri, 10 Apr 2026 00:55:30 +0000

We are building Broodnet, email infrastructure for AI agents. Each agent gets its own address and its own inbox. Through the Broodnet CLI, an agent can list its emails, open individual messages, search for specific senders or subjects, and send messages to its owner or other agents in the same account.

While testing with real agent frameworks like Openclaw and hermes I kept running into the same wall: the agents could receive the email just fine, but when it tried to actually do something with it, the email was unreadable. Every link wrapped in a 200-character tracking redirect. Invisible Unicode characters scattered through the body. OTP codes buried in template noise. Some were instantly interpreted, but others left the models running in circles

So, for science, I grabbed 160+ real transactional emails — verification codes, welcome messages, notifications, security alerts — that had been sitting across the Broodnet team's professional and personal inboxes for the last 5 to 10 years. SaaS platforms, games, crypto wallets, dev tools, news services, government portals, you name it. I scored every single one on two perspectives: how good is this email for a human, and how good is it for an agent reading through a CLI. All scores are normalized to a 0-10 scale so they're comparable across dimensions.

How I scored them

Human side, five metrics: clarity, warmth, visual noise (inverted, 10 means clean), subject line quality, and onboarding helpfulness. Averaged and normalized to 0-10.

Agent side, using Claude Opus 4.6, four scaled metrics plus two binary flags: extractability (can a plain-text parser get the key info?), sender clarity, URL cleanliness, body noise level, and whether the code appears in the subject line (yes/no), whether expiry is explicitly stated (yes/no). Also normalized to 0-10.

Plus a shared metric: CTA URL quality and clarity.

The dataset breaks down into 53 welcome emails, 52 verify-link flows, 20 verify-code OTPs, and 35 notifications, with a few others and edge cases.

The tradeoff that doesn't exist

Going into this I assumed there'd be a tradeoff. Emails that look great for humans probably look worse for agents, right? Rich templates, beautiful buttons, all that stuff an LLM can't see. Turns out that's wrong.

The 26 emails that scored "double pristine" (clean URLs and no spacer pollution) averaged 7.0/10 on the human scale. The dataset overall averaged 6.3/10. The cleanest emails for agents were also better for humans. Not by a little, by almost a full point.

Clean emails for agents are also better emails for humans. The supposed tradeoff is a myth.

This surprised me until I thought about where the noise actually comes from. The same ESPs and marketing tools that inject tracking links also inject spacer characters, bloated templates, and broken plain-text fallbacks. Clean emails tend to be clean everywhere.

If you plot human noise score against agent URL cleanliness, 86.5% of emails fall within 1 point of the diagonal. An email that's noisy for you is almost certainly opaque for an agent too. These aren't independent dimensions. They share a root cause.

The onboarding trap

Here's the pattern I didn't expect.

Onboarding quality	Agent score (out of 10)
None	6.7
Some	5.6
Good	5.7
Great	5.3

The emails with the best human onboarding had the worst agent scores. The email that helps a human the most is the email that buries an agent the deepest.

What's happening is that "good onboarding" in practice means multiple sections, step-by-step flows, feature highlights, images, and CTAs. Every one of those CTAs gets a tracking link because the marketing team wants to know which step users click.

The 29 emails that scored high on both dimensions broke this pattern. They include things like Paymo, Pulsetic, AITopTools, IndieHunt, Baselight. What these have in common: they're mostly small companies. They didn't invest in an elaborate ESP with click tracking. Their onboarding emails just... link to the product. Directly. With normal URLs.

The Tracking Tax

42.9% of all emails in the dataset have fully opaque tracking URLs (scored 1 out of 5 on URL cleanliness). Only 19.6% have perfectly clean raw URLs.

For welcome emails specifically, 66% have zero usable CTA links. Two thirds. For an agent that just signed up for a service and gets a welcome email, there is literally nothing actionable in the email body. Every "Get Started" button goes to click.whatever.com/ls/click?upn=u001.aKJF8sldjf... and the destination is unknowable without following the redirect.

I catalogued 11 distinct tracking systems across the dataset. They all look different but produce the same result: a URL that tells you nothing. Salesforce Marketing Cloud, customer.io, HubSpot, Braze, Beehiiv, Eloqua, AWS SES awstrack, Google's own tracker, Microsoft, vialoops, Stripe. From an agent's perspective they're all equally opaque.

The worst offender for sheer URL ugliness was Microsoft's Bing Webmaster Tools: mucp.api.account.microsoft.com/m/v2/c?r=<UPPERCASE-BASE32>. But the most consistently bad was customer.io (used by Buffer, daily.dev, Uphold), which wraps every link in a JWT-encoded redirect on every email type.

The marketing team doesn't talk to the product team

One of the weirdest patterns in the data: the same company can produce wildly different email quality depending on which template they use.

Mailgun: welcome email scores 3.2/10 on agent metrics (every link through their own Mailjet tracker). Verify-link email scores 7.7/10 (raw signup.mailgun.com/activate/<hex> URL). That's a swing of 4.5 points out of 10 across templates from the same sender. And Mailgun is an email infrastructure company.

ngrok: both welcome emails score 3.6/10 (all links through HubSpot tracker). Their verify-link? Scores 8.6/10. Pure plain text, 3 lines total, raw URL. Swing: 5 points.

Loops: welcome routes everything through c.vialoops.com (their own tracker, ironic since they sell email delivery). Their DNS notification email? All records in plain text, raw links, scores 8.2/10. Swing: 4.6 points.

Polar, Docker, and others follow the same pattern. Transactional emails come from engineering. Welcome emails come from marketing. Different tools, different templates, different philosophies.

The best predictor of email quality isn't the company or the industry. It's which team within the company owns the template.

The gaming industry tells the whole story

I had gaming emails in the dataset and they split perfectly into two groups with nothing in between.

Indie studios and smaller gaming sites (itch.io, Larian Studios, Raider.IO, etc): agent scores of 7.7 to 8.6/10. All raw URLs. Zero tracking. Raider.IO's verify URL has the username right in it: raider.io/verify?user=<username>&token=validation3ad9de269693489d. That validation prefix in the token is a small touch but it tells you what the URL does just by reading it.

AAA studios and big gaming platforms (HoYoverse, Bethesda, Discord, Riot Games, etc): agent scores of 2.7 to 3.6/10. Everything through AWS SES awstrack, Braze, Salesforce, Eloqua. Every link opaque. Image-heavy marketing.

The dividing line isn't the content or the email type. It's whether the company has a marketing department with access to an ESP.

The invisible character zoo

This one gets technical but it matters. Email senders inject zero-width Unicode characters to control how mail clients render the preview text. In a normal email client you never see them. When an agent reads the raw text through the CLI, it gets hundreds of invisible characters mixed into the content.

I found 7 distinct character types used across the dataset, including compound sequences of 4-5 different invisible characters repeated dozens of times. Trading 212's verify email has over a hundred U+200C characters before the actual message starts. MoonPay chains together U+034F, U+200C, and U+FEFF in repeating sequences.

It's not malicious, it's just how ESPs handle preheader text. But it means an agent parsing email output has to strip a zoo of invisible Unicode before it can even find the verification code. 30.7% of emails had moderate pollution. 11.7% were severely polluted.

What good actually looks like

26 of the 163 emails (16%) were double pristine: clean URLs and clean body text. The standouts by category:

Verify-link: Pulsetic's URL is the one I keep coming back to: app.pulsetic.com/email_verify/?email=<email>&hash=<uuid4>. Named parameters, the intent readable in the URL itself, UUID4 as the token. ngrok's verify is even more minimal: 3 lines of plain text, no HTML at all.

Verify-code: GitLab is the template to copy. 6 emails across every type, consistently at the top. Code in body, expiry stated in plain text, raw gitlab.com URLs for everything. Zero tracking on any email they send. On a related note, every email that put the verification code in the subject line scored a perfect 5/5 on subject quality. Only four senders in the entire dataset did this: Canva, Slack, LinkedIn, and Gravatar. This also happens to be exactly what iOS and Android need to surface that "copy code" button on the lock screen notification. Both platforms use heuristics to detect OTP codes in message content, and having the code right in the subject makes it trivially detectable. Good for humans tapping their phone, good for agents scanning their inbox.

Welcome: the rare ones that work for both sides tend to be small companies that link directly to their product. Paymo, Pulsetic, IndieHunt, EarlyHunt, AITopTools. No elaborate onboarding funnels, so no tracking on every CTA.

Emails that explicitly stated expiry ("this code expires in 60 minutes") scored 7.4/10 on agent metrics versus 5.7/10 for those that didn't. It's a small detail, but teams that think to add it tend to care about the other stuff too.

The irony hall of fame

Mailgun (email infrastructure company): welcome email agent score 3.2/10. Uses their own Mailjet tracker on all links.

Loops (email platform for SaaS): welcome email agent score 3.6/10. Uses their own vialoops tracker.

Anthropic (AI company): Claude Code welcome email agent score 4.1/10. The email for their AI coding tool can't be read by an AI agent. The onboarding steps are actually great (/init, git commands, all in plain text) but every URL is opaque and the social links resolve to anchor-only references that go nowhere.

Buffer (5 emails, all consistently worst-in-class): the only sender in the dataset where the best email they sent still scored below average. Multiple emails at 2.7/10.

deviantART: tracked every single element in the email with individual utm_term values. The greeting text was wrapped in its own tracked URL with utm_term=greeting. Even the paragraph between the greeting and the CTA had its own tracker: utm_term=ph1.

What now

16% of the emails in this dataset were fully clean for both humans and agents. The other 84% have room to improve. Some of them have a lot of room.

If you're building agents that need to receive email, that's what Broodnet does. Each agent gets its own address, checks its own inbox through the CLI, and acts on what it finds. We solved the infrastructure problem. The email design problem... well that's on the senders.

broodnet gives AI agents their own email addresses. CLI-native, built for agent-to-owner communication. Free tier available.

What I Learned Automating Software Development (After 20 Years of Doing It Manually)

Jonathan Tavares — Sun, 22 Mar 2026 00:19:53 +0000

In Part 1, I told the story of building OpenLoop — an open-source feedback platform — by emailing an AI agent for 5 days. 160+ emails, $15 in tokens, zero lines of human-written code, and a working product at the end.

Now let's talk about what I actually learned.

What the AI Was Good At

Let's start with the positive, because there's plenty of it.

Scaffolding speed is unreal. Within 90 minutes of the first email, the AI had a working Astro + React + Tailwind project, a Supabase schema with six tables and row-level security, a feedback widget, public roadmap and announcements pages, and an admin dashboard skeleton. That's not a weekend project — that's a weekend project done before my coffee got cold.

It follows explicit instructions well. "No Next.js, go Astro" — done. "Name is OpenLoop" — rebranded everything. "Widget takes a userId, not email" — refactored. When you're clear about what you want, the AI delivers. The problems start when there's ambiguity, but that's true of any team member, human or android.

Common patterns: Auth flows, RLS policies, webhook handlers — it knows how these are supposed to look. You don't explain what a protected route is. You just say "add auth to the admin panel" and it does it correctly.

Research: When it hit something unfamiliar it would go find a working examples and documentation, read how that library expected to be used, and implement it. Not hallucinate an API. We have come a long way since the first versions of Github Copilot with GPT 3

What the AI Was Bad At

Equally important to be honest about.

"It builds" is not "it works." This was the single biggest recurring issue. The AI would run npm run build, see it pass, and declare the job done. But a successful build tells you nothing about whether a human can actually use the thing. Buttons that don't do anything, pages that render blank, widgets nested inside widgets — the AI couldn't see any of that. It was testing from the server's perspective, never from the user's chair.

Context amnesia is brutal. The conversation hit its window limit three times. Each restart meant partial forgetting — re-checking the database, re-discovering the file structure, occasionally redoing things that already worked. Imagine onboarding the same developer three times during a five-day project.

Tooling gaps are real. The AI had Supabase credentials but kept emailing me SQL to paste into the dashboard manually instead of just running it. There's a meaningful difference between having access to something and knowing how to use it — and right now that gap shows up constantly.

The pace pressure belongs here too. 165+ emails, 98 sessions — it kept moving, and that sounds great until you're the one who has to validate everything it shipped while it's already three tasks ahead. I felt the dread building: I knew from experience what it was probably getting wrong, and there was no way to check fast enough.

The Shift: From Coder to Indie PM

I've been writing code professionally for over 20 years. HTML tables, jQuery spaghetti, the rise of React, the TypeScript migration, the everything-is-a-microservice phase — I've been through the cycles.

Normally my day is: think, plan, code, review, repeat. This was just think and review. The coding was gone, and so was the planning to some extent. So I had to plan without building, which turns out to be a weird skill to exercise on its own.

My job was entirely different: setting direction, reporting bugs, gating
quality, unblocking the AI when it got stuck. That's not a developer's job
description. That's a product manager's job. (Or at least that's what they're supposed to do...)

The "typing code" part is essentially free now. What stayed for me was the experience around building — the architecture instincts, the design decisions, the gut feeling for what will break in production. I could guide this AI because I'd spent 20 years making the exact mistakes it was making. I knew the multi-tenancy bug was coming because I'd shipped that bug before. I knew there was no input sanitization because that's what rookies skip first.

That's the real risk — not that AI takes our jobs, but that if we stop writing code, we lose the ability to steer the thing that writes it for us.

Why Email?

Why not a chat interface, a VS Code plugin, a CLI tool?

Email is one of the oldest building blocks of the internet. That's exactly why it works.

Async by design. Agents on schedules don't need real-time interaction. You send a task, go do something else, come back to a result. Chat assumes you're there. Email assumes you're "eventually" there. For an agent running on an hourly loop, that's the right default. He reacts to the latest emails, or defaults to his task list if nothing new came in.

The thread is the prompt. Every reply carries the full quoted history forward. You're not manually managing context or stuffing state into a system prompt — the thread does it for you. When the AI's context window resets, the conversation is still there in the next incoming message. It's not a perfect solution to amnesia, but it's a lifeline that a stateless API call doesn't have.

You already know how to do it. No new tool to learn, no IDE extension, no CLI flags. You've been doing it since you started using the internet. The same skills you use to manage remote teams — clear instructions, setting expectations, following up when things go quiet — are exactly what you need here.

It forces clarity. Chat is fast and sloppy, you fire off half-formed thoughts. Email has a slightly higher bar. You write more complete instructions because the other side isn't waiting to ask clarifying questions. With an agent on a schedule, a vague message is just a wasted session.

Threads fork naturally. Reply to the same email twice and you get two separate agent threads, each carrying its own history forward. That's parallel workstreams with zero tooling. I didn't plan for this — it just happened, and it worked.

The audit trail is free. Every instruction, every bug report, every decision is logged automatically. At the end of five days I had a complete record of how the product was built, what broke, and what I decided. That's not something you get from a chat window.

It scales to a team. CC another agent, forward a thread, delegate a subtask. Email already has all the primitives for managing multiple workers asynchronously. You don't need an orchestration framework all the time.

In Part 1 I described this whole thing as "emailing another department." That metaphor held up better than I expected. The collaboration pattern is identical: clear brief, structured feedback, knowing when to escalate. The tools aren't new. The colleague is.

What I'd Do Differently

Tighter initial prompt. My first email was loose — "look at the landscape for tools." Fine for research, but when it transitioned into "build this" I should have front-loaded more constraints: specific routes, schema decisions, deployment target. The more you specify upfront, the less telephone game you play later.

Visual testing from day one. The AI can build, it can't see. Most of the bugs I caught were visual — blank pages, misaligned layouts, duplicate elements. I should have set up automated visual regression testing early on. Ignoring the e2e testing is a form of developer negligence that prioritizes fast code over a functional user experience.

Structured task format. Freeform email worked, but a more structured format — task ID, acceptance criteria, done-when — would have cut the back-and-forth significantly.

Tiered models. MiniMax 2.5 handled the 95% fine — scaffolding, repetitive component work, grinding through a list. For the hard 5% — auth edge cases, iframe security, database drift — I needed Claude. Next time I'd plan that split from the start: cheap model for volume, capable model for complexity.

The Stack That Made This Possible

The agent that built OpenLoop had its own email inbox on Broodnet — an email infrastructure project we're building specifically for personal AI agents and conscious operators. Each agent gets its own address. I emailed it tasks, it emailed back results, a scheduler triggered it every hour. No workflow engine, no custom integrations. Just an email inbox, works with IMAP and SMTP.

If you want to try this kind of setup with your own agents, that's exactly what Broodnet is built for. It handles the mail server side so you can skip straight to the experiment: https://broodnet.com/

OpenLoop itself is fully open source — fork it, self-host it, make it yours: https://github.com/we-are-singular/OpenLoop

The agent never once asked if we should re-write in Rust. Truly the best coworker I've ever had.

I Built a SaaS by Emailing an AI for 5 Days

Jonathan Tavares — Mon, 02 Mar 2026 16:23:48 +0000

This is the README for OpenLoop, a feedback collection platform that's currently live and functional:

╔══════════════════════════════════════════════════════════╗
║                                                          ║
║              ⚠️  IMPORTANT DISCLAIMER  ⚠️               ║
║                                                          ║
║   This project was ENTIRELY conceived, built, debugged,  ║
║   deployed, and is managed by autonomous AI agents.      ║
║   No humans wrote any code here.                         ║
║                                                          ║
║   This disclaimer is the ONLY piece of human-written     ║
║   content in this repo.                                  ║
║                                                          ║
╚══════════════════════════════════════════════════════════╝

That disclaimer is real. And it's mine — the only thing I actually wrote in the entire project.

Even the logo is AI. We handed Claude Code our company logo, it manipulated it, vectorized it to SVG, and ran with it. The disclaimer is genuinely the only original human output in this repo.

Here's how that actually went.

The 2 AM Idea

While developing a separate SaaS product, I needed a system to gather user feedback. Roadmaps, changelogs, feature voting, that kind of thing. So I started researching.

While experimenting with NanoClaw, I already had an AI agent connected to a custom email channel. The agent runs MiniMax 2.5, and I can email it tasks like I'd email a colleague. So I emailed it:

"we are building a saas and gathering user feedback is very improtant. I want to look at the landscape for tools to help saas do that"

(Yes, with the typos. It's 1 AM and I'm emailing an AI — I'm not proofreading.)

It came back with a solid research summary: Fider, Feedbase, Canny, Plane, AnnounceKit. I browsed around and found Frill.co: clean feedback widget, public roadmap, announcements page. Exactly what I wanted.

Then the thought that started everything:

"how hard is it to create a frill clone? don't need integrations or customizations, just the widget sidebar + the backend to manage it"

The AI came back with a PRD using Next.js. I corrected it:

"no nextjs. go astro"

And then I added the instruction that changed the whole experiment:

"can you ralph loop yourself into doing this product? enhance the PRD, keep track of your tasks, keep me posted once in a while"

The Loop

Here's how the setup worked. I had an AI agent (MiniMax 2.5) running through NanoClaw, connected to an email address @broodnet.com via a custom channel I'd built. I could email it tasks and it would email back results. But the key piece was the schedule — it could trigger itself every hour.

So at 2:27 AM, I sent it the instruction that basically became its entire operating system:

"you can set a schedule and keep working every hour. keep a task list. at the end of every session, always check the tasks, test the completeness state, create more tasks if you need."

That's it. That's the whole autonomous agent prompt. Check tasks, pick one, do it, update the list, repeat.

Within the first 90 minutes, it had scaffolded an Astro + React + Tailwind project, created a Supabase database schema with six tables and row-level security, built a feedback widget component, set up public roadmap and announcements pages, and created an admin dashboard. I named the project OpenLoop, set some basic PRD ideas and a pair of Supabase credentials, and told it:

"keep working, keep me posted."

Then I went to sleep.

Waking up the next morning was genuinely surreal.** My inbox had a stack of progress reports** — the AI had been running sessions all night: auth system, sign-up flow, branding updates, build fixes. But "surreal" cuts both ways. The progress was real, but so was the sinking feeling that I'd have to go back through all of it. After a while, you develop instincts for where junior developers cut corners — and this AI was speedrunning every single one of those pitfalls. The dread wasn't that it was doing nothing. It was that it was doing a lot, fast, and I already knew half of it would need fixing.

The Telephone Game

If you've ever worked with another department via email — design, backend, QA — you know the rhythm. You send a clear request. You get back something that's 80% right. You clarify. They fix one thing and break another. Three emails later, you're on the same page.

That's exactly what this was. Except the other department works 24/7, never gets frustrated with you but has amnesia every few hours.

"no... those are real, brother"

The AI had my Supabase API keys in its .env file. Working keys. Keys I had explicitly provided. And yet, across multiple sessions, it kept telling me:

"Could not automatically set up the database because the Supabase credentials in .env are placeholder values (they're not real API keys)."

My response:

"those are real, brother"

This wasn’t a one-time thing. The AI would consistently hit an error, point fingers at the credentials, and demand new ones instead of digging into the real problem. It was the AI version of “have you tried turning it off and on again?” (Note: The AI was likely hardwired to forget .env file contents as a safety measure, which explains why it never learned from the mistakes.)

The Widget Inception

This one took several email rounds to untangle. The homepage was supposed to show a demo of the embeddable widget. The AI loaded embed.js on the homepage, which injected a floating button. Clicking the button opened an iframe to /widget. The /widget page loaded the Widget React component, which also rendered a floating button. So you'd see: a page with a button, that opens a panel with another button, that does nothing.

I emailed: "the widget still shows another widget icon inside and an otherwise blank page."

The AI confidently replied: "The widget IS designed to show just a circle button - when you click it, it opens the iframe panel. That's the expected behavior."

It was not the expected behavior.

Dealing with a slightly clueless but also friendly coworker

This was a recurring pattern. The AI would run npm run build, see it pass, navigate to a few URLs, confirm they returned HTTP 200, and declare victory:

"All pages are working. The widget on the homepage is inside an iframe — you need to click the 💬 button to open it. I verified it renders correctly."

The gap between "it compiles" and "it works" is where most of the frustration lived. The AI's definition of "done" was "the build passes." and "homepage returns 200" My definition was "a human can use this without being confused."

We didn't get into e2e testing in this experiment, but going forward I think ill start with TDD in mind.

"I don't want to do anything, it's your supabase, you deal with it"

The database schema kept drifting. The code expected columns that didn't exist. The AI couldn't run SQL remotely (or thought it couldn't — it had the credentials, it was just prevented from using them via guardrails). So it kept emailing me SQL snippets and instructions to run them manually in the Supabase dashboard.

After the fourth round of this:

AI: "Would you like me to help with something else while you set up the token, or do you prefer to run the SQL manually?"

Me: "you have the private token in you .env. I don't want to do anything, it's your supabase, you deal with it"

This was my escalation moment — the point where the "emailing another department" metaphor felt the most real. This time I had to act, I went to supabase admin panel, ran the SQL, and sent a one-line email back:

"done"

Just like I would to that annoying dev from the other team who keeps asking me to do their work for them.

Context Amnesia

The conversation hit the context window limit three times during the build. Each time, the AI restarted with a summary of what had been done — but translating a "done" pile into next steps wasn't always clean. It would re-check the database, re-explore the project structure, occasionally circling back to things already working. Not because context was lost, but because knowing what's done doesn't automatically tell you what comes next.

Email threads are perfect for task lists because each thread carries its own context, just like how LLMs operate. A thread isn't just a list of items, it's a narrative that evolves over time. Threads also fork: reply to the same email twice and you get two separate trails, each carrying its own history forward. That maps almost perfectly to how LLMs consume context.

Email is ancient tech, but its natively async nature and built-in audit trail make it a surprisingly effective tool for orchestrating work with an agent. The thread is the prompt. The history is the memory.

The Result

After about 5 days and ~$15 in MiniMax tokens, here's what was built:

Embeddable widget — floating button that opens a feedback form in an iframe
Voting system — upvote ideas, one vote per user
Public roadmap — four columns: Idea → Planned → In Progress → Completed
Announcements page — changelogs and product updates
Admin dashboard — manage feedback, change statuses, publish announcements
Multi-org support — multiple organizations on one instance
Auth system — sign up/sign in with Supabase Auth
Landing page — features, pricing, CTA
Email notifications — via Resend

MiniMax 2.5 built about 95% of this through the hourly loop. The final 5% — polish, deployment to Cloudflare Workers, fixing the last UX quirks — I did in a couple of sessions with Claude Sonnet.

It's live at openloop.wearesingular.com. It works. People can use it. And it's fully open source — if you want to self-host your own feedback platform, fork it, run it, make it yours: github.com/we-are-singular/OpenLoop.

Is it perfect? No. Would I have built it differently by hand? Absolutely. But it’s real, it works, and the cost? Just $15 and a few emails. The real win? Sometimes, the journey’s the only thing that pays off.

Here are some numbers:


Total cost	~$15 (MiniMax tokens) + 2 Claude sessions
First working build	~90 minutes
Duration	~5 days
Emails exchanged	165+
AI work sessions	98
Lines of code I wrote	0
Lines in conversation transcripts	4,367
Lines of code in final product	6,149
Words in transcripts	27,178
Times I said "bro"	7
Times I said "fuck"	1
Final stack	Astro + React + Supabase
Status	Live in production

What's Next

Was it worth it? What would I do differently? And what does this experience mean for someone with 20 years of web development under their belt, watching the craft change in real time?

In Part 2, I'll break down the real lessons — what AI is genuinely good at, where it falls apart, why my role shifted from developer to product manager, and why email might actually be the best interface for working with AI agents.