Forem: Matt Cretzman

Master Library + Client Forks: A Deployment Pattern for Multi-Client AI Skills

Matt Cretzman — Fri, 08 May 2026 20:32:19 +0000

A fractional CMO walks into Monday morning with five clients on her roster. Each one sells something different. Each one targets a different ICP. Each one has a different competitor set, a different industry, a different incumbent vendor problem. And every Monday she has to brief herself before five different sales conversations.

The math gets ugly fast. Most fractional consultants run three to five clients at a time. Average monthly retainers for fractional sales leaders climbed to $9,651 in 2024. 72% of CEOs plan to increase their use of fractional executives in the next twelve months.

Which means every fractional operator has the same operational problem: how do you build leverage across multiple clients without flattening them all into the same generic playbook?

You don't want a separate, hand-built workflow for each client. That doesn't scale. You also don't want one generic workflow that ignores everything that makes each client different. That doesn't deliver.

The answer is a deployment pattern most operators haven't named yet. Master Library + Client Forks.

The pattern

Inside Skill Refinery, every account can have multiple libraries. A library is a folder. Each library can hold any number of skills — structured pieces of expert IP you invoke from ChatGPT, Claude, or Copilot via MCP.

Two layers:

Layer 1 — Master Library. One library called Master Skills (Blueprints). Holds the canonical, generalized version of every skill. Industry-agnostic, ICP-agnostic, offer-agnostic. The blueprint.

Layer 2 — Client Libraries. One library per client, named after the client. Contents start identical to the Master — a clone — then get personalized. Same skill structure, but tuned to the client's industry, ICP, regulatory framework, offer language, named accounts and competitors.

The Master is the source of truth. The Client Library is the working copy.

If you've worked in software, this is the same shape as a base class with subclasses, or a config file with environment overrides. The methodology lives in one place. Personalization lives in another. Inheritance is implicit but disciplined.

Why this beats the alternatives

Two wrong-shape patterns most operators stumble into:

One big library. Everything goes in one library. Skills accumulate conditional logic — if client is in healthcare, do X; if financial services, do Y. Within six months they're bloated, fragile, impossible to reason about. Every change risks breaking three other clients' workflows.

Skill per client. Every skill gets cloned once per client and lives in isolation. No master. When you learn something new, you have to update five skills manually. You won't. Drift sets in. Quality varies wildly across clients.

Master Library + Client Forks fixes both. The master is the single place where methodology lives. Forks are the only place where personalization lives. They never get confused.

Implementation — three steps

1. Create the Master Library
   Name: "Master Skills (Blueprints)"
   Access: internal
   Audience: staff
   Description: "Canonical generalized skills. Forked into client 
                 libraries for personalization."

2. Drop generalized skills into the master
   - Strip client-specific naming
   - Strip tool dependencies that aren't universal
   - Generic enough for any operator to pick up Monday morning

3. Create one library per client. Clone master skills into each.
   Library name = client name
   Personalize:
     - Industry context
     - ICP definition
     - Named competitors / incumbents
     - Offer language
     - Tone-of-voice rules

Onboarding a new client onto every skill in the master takes about thirty minutes. Linear scaling with client count.

The leverage moment

The reason the pattern matters is what happens at month three.

You'll learn something about how to run a sales intelligence briefing that you didn't know on day one. Maybe LinkedIn job descriptions are the highest-signal incumbent vendor check ever invented. Maybe Form 4 SEC filings are the most underused data point in B2B prep.

When that learning hits, you don't update five client skills.

You update the master. Once.

Then you propagate the relevant change to client libraries that need it — and only the ones that need it. Some changes are universal. Some only matter for clients in specific industries.

Compound leverage. Every engagement makes the master better. Every master improvement makes every future engagement better. The skills don't degrade. They get sharper over time.

The IP shift

The pattern works because of something most consultants get wrong about IP.

Your IP isn't a single document or a single methodology. It's the set of repeatable, generalized patterns plus the specific knowledge of how to deploy them in a given context.

Generalized methodology without context is a textbook.

Context without generalized methodology is tribal knowledge that walks out the door when you do.

A master library captures the methodology. A client library captures the context. Together they're an actual operating system for fractional work.

You're not selling hours anymore. You're selling deployments of your skill stack.

What to build first

Pick one skill. The one you do most often. The one you'd hate to lose if you had a memory wipe.

For most fractionals that's pre-meeting research, weekly client check-in synthesis, or outbound sequence drafting.

Generalize it into a master skill. Fork it into a single client library for one engagement. Run it for two weeks. See whether the master gets sharper. See whether the fork stays personalized. See whether the second client onboarding takes thirty minutes instead of three hours.

The pattern compounds quietly until it doesn't. By client four or five it's the only way you can imagine running the practice.

The tools are off the shelf. Skill Refinery, an AI like Claude or ChatGPT, the libraries you set up yourself in the SR admin UI. No engineering required.

Keep Building,
Matt

Originally published at blog.mattcretzman.com.

About the author: Matt Cretzman builds AI agent systems through Stormbreaker Digital. Ventures include Skill Refinery. Writing at mattcretzman.com.

I Replaced a $400/Month Podcast Booking Tool in One Session. Here's the Full System.

Matt Cretzman — Sat, 25 Apr 2026 15:00:01 +0000

I was paying PodPitch $400 a month to book me on podcasts. It did three things: find shows, write pitches, send emails. Three things I already had tools for.

So I cancelled the subscription and built the replacement in a single session with Claude Code.

274 podcasts discovered. 49 bespoke pitches drafted. A 3-touch email sequence loaded into 20 warm sender inboxes. Zero additional monthly cost.

Here's how the whole system works — and why I think most SaaS tools are just a tax on not knowing how to wire APIs together.

The SaaS Decomposition Framework

Before I wrote a line of code, I decomposed PodPitch into its three layers:

Data source: Where do they find podcasts? Podcast databases with search APIs. Podcast Index is free. Apple's iTunes Search API is free. Between them, they cover virtually every active show.

Intelligence layer: How do they generate pitches? AI writes them. I already pay for Claude, which writes significantly better than any template engine.

Action layer: How do they send emails? Through an email sender. I already pay for SmartLead with 20 warm inboxes.

PodPitch wasn't selling me a product. It was selling me glue between three services I already own. And glue is exactly what Claude Code builds.

The Architecture

The system runs as a five-stage pipeline:

Monday 6am:     DISCOVER  → Search Podcast Index + iTunes APIs
Daily 6:30am:   SCORE     → Claude evaluates relevance + audience fit
Daily 7am:      DRAFT     → Claude writes bespoke pitch per podcast
Daily 7:05am:   PUSH      → Telegram cards with approve/edit/reject buttons
Daily 8am:      OUTREACH  → Approved pitches load into SmartLead

Each stage has its own database state machine. A podcast moves through: discovered → scored → pitched → outreach_sent → replied → booked. Failures at any stage don't block the others. If scoring fails on one podcast, the other 49 still draft and send.

The Discovery Engine

This is where most people would reach for a paid API. I used two free ones.

Podcast Index (podcastindex.org) is an open podcast directory with a free API. HMAC authentication, full-text search, episode metadata. 25 results per query.

iTunes Search API is Apple's podcast search. No authentication. No rate limit documentation. 50 results per query. For the keyword "knowledge ownership," Podcast Index returned 1 result. iTunes returned 14. Same keyword, 15x the coverage.

I run 30 keywords across both APIs every Monday. The first run discovered 274 podcasts. After deduplication by RSS URL, 273 were unique.

But here's the part that surprised me: the RSS feed is the most underrated enrichment source on the internet.

The 60% Email Discovery Rate

Every podcast has an RSS feed. Most people treat it as a syndication pipe. I treat it as an enrichment goldmine.

The <itunes:email> tag inside the <itunes:owner> block contains the podcast host's email address. Not a generic inbox. The host's actual email. And it exists in roughly 60% of all podcast RSS feeds.

Out of 274 podcasts discovered, my RSS parser pulled 80 host emails at zero API cost. For the remaining 40%, I fall back to Hunter.io domain search on the podcast's website URL. That caught another 34.

Final tally: 114 out of 274 podcasts had a verified host email. 78% discovery rate. 80 of those emails cost me nothing.

Why the Pitches Actually Work

Here's a typical cold podcast pitch:

"Hi, I'm an expert in X and I'd love to be on your show. I think your audience would benefit from hearing about Y."

Delete. Every host gets 50 of these a week. They all sound the same because they are the same — template engines with variable slots.

Here's what my system generates instead:

The pitch references 2-3 of the host's recent episode titles by name. It connects a specific theme from my book to what their audience already cares about, based on those episodes. The subject line is under 60 characters. The body is under 250 words. It reads like someone who actually listened to the show.

That's because the pipeline stores the last 5 episodes per podcast — titles, descriptions, publication dates. When Claude drafts the pitch, it has real context. Not "your show about entrepreneurship" but "your episode with Alyssa Schneider on building on your own terms."

And here's the quality control feature I didn't expect: out of 51 podcasts that passed scoring, Claude declined to pitch one of them. The model returned a decline_reason explaining why the fit was bad. That one declined pitch is worth more than the 50 it wrote — it means the system has taste.

The Approval Flow

I could have wired this to auto-send. I didn't.

Every pitch gets sent to a dedicated Telegram bot (@podpitch_owt_bot) as a card with the full context: podcast name, host info, scores, subject line, pitch body, referenced episodes. Four buttons: Approve, Edit, Reject, Skip.

My approval workflow is two minutes on my phone while I'm waiting for coffee. Scroll through the cards, tap Approve on the good ones, Reject the bad fits, Edit anything that needs a tweak.

Approved pitches flow into SmartLead automatically at 8am the next morning. Three-touch sequence: Day 0 is the bespoke pitch, Day 4 is a short nudge, Day 9 is a different angle using the $2.4M extraction stat from the book. Twenty warm inboxes rotate the sends for deliverability.

The Numbers

Here's what the first run produced:

274 podcasts discovered across 30 keywords from 2 free APIs
51 passed Claude scoring (18.6% pass rate — tight filter, by design)
49 bespoke pitches drafted with episode-specific references
114 host emails found (80 from RSS at zero cost, 34 from Hunter)
20 warm sender inboxes attached to a 3-touch sequence
$0/month incremental cost vs $200-400/month for PodPitch

The pipeline runs automatically. Discovery on Mondays. Scoring, drafting, and loading daily. I review pitch cards on my phone. Approved pitches send the next morning.

What This Really Means

I didn't build a podcast booking tool. I decomposed a SaaS product into its component APIs and wired them together with AI.

The data source was free (Podcast Index + iTunes). The intelligence layer was Claude, which I already pay for. The action layer was SmartLead, which I already pay for. The enrichment was 60% free (RSS parsing) and 40% Hunter, which I already pay for. The approval UX was a Telegram bot, which is free.

PodPitch's value proposition was holding those pieces together. That's the part AI coding assistants do best.

This pattern applies to any SaaS tool you're paying for. Decompose it: What's the data source? What's the intelligence layer? What's the action layer? If you already pay for each piece individually, you're paying the vendor to be glue. And glue just got automated.

The expert owns the delivery. On their terms.

Keep Building,
— Matt

My Content Pipeline Broke on Day Two. Here's What I Built to Make It Unbreakable.

Matt Cretzman — Fri, 24 Apr 2026 15:00:01 +0000

If you're automating Substack, Medium, Notion, or any ProseMirror-based editor with Playwright, you've probably hit this: keyboard.type() produces raw markdown characters instead of formatted text.

[link text](https://url) shows up as literal square brackets. 1. triggers an auto-formatted list. **bold** stays as asterisks.

The fix is six lines.

The Problem

Rich text editors like ProseMirror don't parse markdown on keyboard input. When you call page.keyboard.type('[text](url)'), the editor receives individual keystrokes. It has no idea you meant a hyperlink.

Medium's editor is worse — it watches for formatting triggers. Type 1. at the start of a line and it auto-converts to an ordered list. Type - and you get bullets. Character-by-character input fires every trigger.

The Fix: HTML Clipboard Paste

ProseMirror (and Medium, and most contenteditable editors) do interpret HTML — but only on paste operations via ClipboardEvent.

Karpathy Just Described the Future of Knowledge Delivery. We Already Built It.

Matt Cretzman — Sat, 04 Apr 2026 17:24:20 +0000

Yesterday, Andrej Karpathy posted a breakdown of how he's using LLMs to build "LLM Knowledge Bases." Raw documents go into a directory. An LLM compiles them into structured, interlinked markdown. The LLM maintains the knowledge base. Health checks catch inconsistencies. Queries get filed back. Knowledge compounds.

Then he said: "I think there is room here for an incredible new product instead of a hacky collection of scripts."

I've been building that product. It's called Skill Refinery. Here's the technical story of what it takes to go from personal wiki to production platform.

Same Thesis, Different Architecture

Karpathy's insight: LLMs are knowledge compilers, not answer machines. Use them to process raw source material into structured knowledge they can reason over.

Our extraction engine does exactly this. Books, courses, training recordings, coaching frameworks, technical manuals — any format. The engine processes raw expert IP into structured skill cards. But going from Obsidian on one machine to a multi-tenant, multi-platform production system introduces five architectural problems a personal wiki never faces.

Problem 1: IP Protection at the Architecture Level

Karpathy's wiki contains public research. Nobody's livelihood depends on it. When you extract knowledge from published authors or proprietary enterprise IP, the architecture has to enforce protection by design.

Our approach: source documents are processed by the extraction engine to produce skill cards. After extraction, source files are done — they never re-enter an AI context window during delivery. Skill cards enter context transiently during MCP calls but are excluded from model training under enterprise API contracts.

This isn't a policy. It's an architectural constraint. The system physically cannot serve source material after extraction is complete.

Problem 2: Multi-Platform Delivery via MCP

Karpathy's output is markdown files in Obsidian. For a product, knowledge needs to reach users wherever they work.

We deliver through MCP (Model Context Protocol) — the open standard supported by Anthropic, OpenAI, and Microsoft. One extraction produces skill cards that work inside Claude, ChatGPT, and Microsoft Copilot. The MCP server handles authentication, skill resolution, and access control per request.

Internal members authenticate with sk_org_ prefixed key-based identity resolution. External clients use static client_ prefixed keys. No OAuth sessions needed for internal delivery — the key resolves identity and permissions on every call.

Problem 3: Multi-Tenant Access Control

One user, one wiki, no permissions needed. An enterprise with seven business units, each with different knowledge scopes, needs granular access.

We just shipped division-level skill visibility. Business unit assignments on skill cards control which teams see which knowledge. Changes take effect on the next MCP call — no reconnect required. The middleware resolves the requesting user's organization and division, then filters the skill response accordingly.

Problem 4: Monetization Infrastructure

Expert knowledge has value. Karpathy's system is free because it's personal. A platform needs payment rails.

We use Stripe Connect with the platform as merchant of record. Experts set pricing on Creator Storefronts. Subscribers pay through the platform. Revenue splits are automated. The skill card itself carries the monetization metadata — tier, pricing, access level — so the MCP server can enforce paid vs. free at the protocol level.

Problem 5: Extraction That Scales

Karpathy manually feeds documents into his raw/ directory and prompts the LLM to compile. That works for 100 articles.

At 50 experts with hundreds of skill cards each, you need a pipeline. Our extraction engine processes source material in under 10 minutes per source. The pipeline identifies frameworks, decision trees, and expert reasoning patterns, then structures them into modular skill cards with metadata for classification, audience tagging, and tier assignment.

The extraction methodology itself is institutional knowledge — refined through processing 125+ books from a single expert. Each extraction improves the pipeline.

The Category

I coined the term Knowledge Delivery System (KDS) months ago. The LMS manages courses. The KMS manages documents. The KDS delivers answers.

Karpathy just validated the thesis from the research side. The technical community is going to build toward this whether we exist or not. The question is whether the infrastructure is purpose-built or cobbled together from scripts.

If you're interested in the architecture: skillrefinery.ai

Full KDS framework: The LMS Is Dead. Long Live the KDS.

Originally published at blog.mattcretzman.com.

About the author: Matt Cretzman builds AI agent systems through Stormbreaker Digital. Ventures include Skill Refinery. Writing at mattcretzman.com.

Building In-Context Messaging Into MCP Tool Responses

Matt Cretzman — Sun, 22 Mar 2026 23:20:24 +0000

MCP (Model Context Protocol) is pull-based. The client — Claude, ChatGPT via custom Actions, Copilot — initiates every request. There's no server-push mechanism.

That means if you want to deliver a message to a user inside their AI tool, you can't push it. You need to wait for them to show up.

At Skill Refinery, we built a message queue layer that does exactly this. When an expert broadcasts a message, it enters a queue. The next time a subscriber's AI tool makes any tool call to our MCP server, pending messages are appended to the response.

From the subscriber's side, it feels like getting a notification the moment they open their AI. Here's the architecture pattern and the design decisions that make it work.

The Problem

Knowledge platforms are passive. Subscribers ask, the platform answers. Experts have no way to reach their subscribers except through saturated external channels — email (121 messages per day for the average worker), push notifications (46 per day per smartphone user), or social media (algorithm-dependent reach).

If your MCP server is already handling tool calls from subscribers, you already have a delivery channel. You just need to build the plumbing.

The Architecture Pattern

Four components:

1. Message queue table. Stores pending broadcasts with sender info, message type, content, expiration timestamp, and a dismissable boolean.

2. Delivery tracking table. Records which subscribers received which messages, keyed by a SHA-256 hash of the subscriber's MCP key — never the raw key. Tracks delivery count per subscriber-message pair.

3. Middleware layer. Sits between the tool call handler and the response. On every tool call: query pending messages for this subscriber, filter by rate limits and delivery caps, append qualifying messages to the response payload, insert delivery records in batch.

4. Subscriber control tool. A dedicated MCP tool that lets subscribers manage their preferences through natural language — dismiss a message, opt out of a sender, opt out of a message type, or nuclear opt-out of all broadcasts.

The Design Decisions

These aren't technical afterthoughts. They define the product experience.

Knowledge answer first, messages after. The subscriber asked a question. Answer it. Then append messages. Messages never hijack the primary interaction. This is the foundational principle — flip the order and you destroy the core product experience.

Pull-based delivery, not push. True server-push doesn't exist in MCP today. Rather than waiting for Streamable HTTP transport to mature and for client implementations to support server-initiated notifications, build on what works now. The infrastructure is the same either way — when push arrives, you flip the delivery channel without rebuilding the queue.

Max 2 messages per tool call. Regardless of how many senders are active for a given subscriber. A subscriber with 10 expert subscriptions doesn't get hit with 10 messages at once.

Daily cap per subscriber. Even across multiple sessions and multiple tool calls, a subscriber sees a maximum number of unique messages in a 24-hour window. The system protects the experience at the subscriber level, not the sender level.

Auto-dismiss after 3 deliveries. If a subscriber has seen the same message across three separate tool calls and hasn't engaged (clicked, responded, or explicitly dismissed), the message is automatically marked dismissed. This prevents stale-message pile-up.

SHA-256 hashed keys for delivery tracking. Delivery records need to associate messages with subscribers, but storing raw MCP keys in a delivery table is a security risk. Hash the key, use the hash for lookup, never persist the raw value.

Cross-Platform Rendering

This is the biggest unknown. Claude, ChatGPT, and Copilot all handle appended text in tool responses differently. Some render it cleanly. Some truncate. Some prioritize the "answer" portion and collapse additional content.

The approach: keep messages plain text, self-contained, and short. No markdown formatting assumptions. No HTML. Each message should be legible as a standalone paragraph even if the AI wraps it differently than expected.

As a fallback, build a dedicated check_messages tool. If a platform consistently drops appended content, subscribers on that platform can explicitly ask their AI to check for pending messages. It's a less elegant experience but a reliable backup.

Fan-Out at Scale

An org admin broadcasting to 16,000 members needs to insert 16,000 delivery records. Doing this synchronously on the broadcast call would time out.

The approach: batch inserts of 500 rows per transaction. Queue the fan-out asynchronously. Analytics updates (delivery counts, open tracking) happen in background jobs that never block the tool response.

The tool call response path must stay fast. Any work that doesn't directly affect the subscriber's current response goes async.

Rate Limiting as Product Design

Rate limits on this system aren't about server protection. They're product design decisions that define the subscriber experience.

Experts get a limited number of sends per day. Org admins get more, because internal operational messages have higher urgency. The per-subscriber caps (messages per tool call, messages per day) override sender limits — the subscriber experience is always the ceiling.

Getting these numbers wrong in either direction is a product failure, not a technical one. Too restrictive and senders can't reach their audience. Too permissive and subscribers opt out of everything.

Message Types and Expiration

Different message types have different lifespans. Announcements expire quickly — they're time-sensitive by nature. Compliance or action-required messages persist longer. Content notifications sit in between.

Org compliance messages support a dismissable: false flag. The message keeps delivering until its expiration date, regardless of how many times the subscriber has seen it. This directly supports use cases where a company needs confirmation that staff received a critical update.

The Reusable Pattern

This pattern generalizes to any MCP server:

Add a message queue table to your database
Build middleware that checks for pending messages on every tool call
Append qualifying messages to the response, after the primary content
Track deliveries with hashed subscriber keys
Expose a subscriber-control tool for preference management
Rate limit at the subscriber level, not just the sender level

The infrastructure is lightweight — four database tables, a middleware function, and one additional MCP tool. No external services required.

The harder part isn't the code. It's the product decisions: what are the right rate limits? When does a message become stale? How do you balance sender value against subscriber trust? Those decisions are the product.

If you're building MCP servers and delivering knowledge to subscribers, you already have the delivery channel. The message queue just makes it a communication channel.

We're building this at Skill Refinery right now. If you're an expert, coach, or consultant who wants to deliver knowledge — and soon, messages — through your subscribers' AI tools, that's where you set up. If you're building in the MCP space and want to talk integration or partnership, grab time on my calendar.

I'm Matt Cretzman. I build AI agent systems through Stormbreaker Digital. Ventures include Skill Refinery, TextEvidence, LeadStorm AI. Writing at mattcretzman.com.

Originally published at blog.mattcretzman.com.

82% of B2B Buyers Think Your Reps Are Unprepared. Here's the AI Stack That Fixes It.

Matt Cretzman — Sat, 07 Mar 2026 22:28:39 +0000

Your reps are walking into calls underprepared. Not because they're lazy — because doing pre-call research right takes one to two hours minimum. Most reps don't have that time, so they do ten minutes on LinkedIn and call it done.

The traditional alternatives are expensive: a dedicated research analyst runs $50–$150 an hour. Enterprise sales intelligence platforms run $15,000–$40,000 a year. Most companies don't invest there. So the gap stays wide, call after call.

Here's the question worth sitting with: how much more revenue could your sales team close if every rep had a full AI-powered intelligence briefing before every call — delivered in four minutes?

That's not hypothetical. The system exists. Here's the full stack.

The Problem Is Bigger Than Most Teams Admit

82% of B2B decision-makers believe sales reps are often unprepared for their calls. Only 16% of reps met quota in 2024, down from 53% in 2012. And 63% of B2B losses happen before needs assessment — in discovery, before the rep has even gotten to pitch.

The intelligence gap is where deals go to die.

What the System Delivers

A structured PDF briefing in the rep's inbox in under four minutes:

Company overview, subsidiary map, tech stack analysis
Contact verification with CRM conflict detection (Apollo vs HubSpot cross-reference)
Org map and decision-maker hypothesis
Commercial triggers — facility expansions, leadership changes, regulatory filings
Hiring signal analysis and what it signals about operational priorities
Verbatim, account-specific discovery questions (not generic frameworks)
Objection prep for this account type
CRM data quality flags and pre-meeting action items

The Stack

Clay — Enrichment layer. Pulls from Apollo and Hunter.io, runs waterfalls, flags conflicts between data sources. When Apollo says "Director of Planning and Inventory" and your CRM says "Plant Manager," Clay surfaces that conflict before it costs you the meeting.

n8n — Orchestration engine. Workflow logic, data routing, error handling when a source returns null versus bad data. The connective tissue.

Exa — Real-time web intelligence. Recent news, press releases, company announcements from the last 90 days. Multiple source confirmation before treating something as a verified trigger.

Perplexity — Synthesized company research. The 30-minute analyst writeup, automated.

Regulatory databases — FDA enforcement records pulled programmatically via the public API. USDA/FSIS flagged as manual action item when dairy or meat processing is involved.

Claude — Synthesis layer. Not formatting — analysis. Every enriched source feeds a structured prompt that produces account-specific discovery questions, objection prep, and interpretive analysis of hiring patterns and tech stack signals. This is the layer that turns data into a usable briefing.

HubSpot — Existing relationship context. CRM data cross-referenced against external sources. CRM data is almost always partially wrong — the system catches that before the rep walks in.

The On-Demand Layer

Initial version: automated briefings for scheduled HubSpot meetings. Useful, but it missed all the calls where reps were most likely to wing it — cold outreach, inbound callbacks, trade show follow-ups.

The on-demand fix: a form. Company name, domain, contact name. Submit. Four minutes later the briefing is in the inbox.

This is the architectural decision that changes actual field behavior. Automated-only systems improve preparation for meetings reps were already planning carefully. On-demand changes preparation for the calls where reps were going to skip it.

The Form-to-Workflow Trigger

One thing worth noting for anyone building similar systems: the form field names must match the downstream workflow exactly.

When I wired the on-demand form to the n8n pipeline, the product spec listed five field names that had changed during development of the underlying workflow. None of those mismatches would have thrown an error. The system would have run, the PDF would have delivered, and every data-dependent section would have populated with [DATA GAP] because all the queries were empty strings.

The failure mode that looks like success is the most dangerous one.

Before writing the submission handler, I pulled the live workflow and read the actual field names from the source. Two minutes. That's the check.

The Numbers

Personalized outreach gets 32% higher response rates than generic. 52% of sales teams using AI tools report 10–25% pipeline growth.

Direct math: average B2B win rate is 21%. A 5-point improvement from better pre-call intelligence = 24% relative increase in closed revenue from the same pipeline. On a $5M target, that's $1.2M without a single new lead.

Implementation

Stack is off the shelf. The value is in the wiring and the synthesis prompt architecture — getting Claude to generate verbatim, account-specific questions rather than generic frameworks requires prompt iteration.

If you want to build this, the architecture decisions are all here. If you want it installed, reach out.

→ mattcretzman.com | Stormbreaker Digital

Originally published at blog.mattcretzman.com.

About the author: Matt Cretzman builds AI agent systems through Stormbreaker Digital. Ventures include TextEvidence, LeadStorm AI, Skill Refinery. Writing at mattcretzman.com.

CigarSnap: From $1.08 Reddit Scrape to Live App in 48 Hours

Matt Cretzman — Sat, 07 Mar 2026 20:32:18 +0000

On December 27, 2024, I heard Greg Isenberg break down mobile apps generating $50K+ in monthly recurring revenue using AI. By the time the episode ended, I had a thesis. Seventy-two hours later, I had a production app with 15+ features.

This is how CigarSnap went from a podcast moment to a live platform — and why the real money isn't where you'd expect.

The Isenberg Framework: Five Criteria Before a Single Line of Code

I don't build things because they sound cool. I build things that survive contact with a framework. Isenberg's five-criteria filter for AI mobile app opportunities is how I evaluate ideas before writing anything:

The audience actively spends money.
A repeating problem exists.
The solution involves photo or video input.
Accuracy matters enough to pay for.
Existing tools are weak.

Cigars scored on all five.

Premium cigar enthusiasts — mostly men, 35 to 65, high disposable income — already spend hundreds on their hobby. They buy cigars repeatedly, visit lounges regularly, and want to know what they're smoking. A photo-based identification tool solves a real, recurring problem. And the existing apps? Manual databases where you type in what you're smoking. No AI. No image recognition. Desktop-first designs from a decade ago.

The market was begging for something modern.

$1.08 for 3,522 Qualified Leads

Before building anything, I needed to know the audience was real and reachable. So I ran an Apify scrape on Reddit's r/cigars community — 221,000+ members, actively posting, deeply engaged.

Total cost: $1.08.

That scrape returned 3,522 qualified users. Real people talking about cigars daily, recommending brands, sharing humidor photos, reviewing blends. Not emails purchased from a list broker. Actual enthusiasts who'd raised their hand by participating in the community.

Market validation doesn't have to be expensive. It has to be honest.

The 48-Hour Sprint

With the framework validated and the audience confirmed, I went to Replit Agent and started building.

I wrote approximately 12 complete, copy-paste-ready prompts — each one a full feature specification with database schemas, API routes, UI components, and integration points. Not vague descriptions. Production-ready specs that an AI coding assistant could execute without guessing.

In 48 hours, I shipped:

AI cigar identification via Claude Vision API
A digital humidor with collection tracking
A smoking journal with the "Three Thirds" flavor education system — the app calculates smoke time based on ring gauge and length, then prompts tasting notes at 33% and 66% to teach flavor transitions
A social community feed
A lounge finder with 138+ Texas venues mapped
A gamification system with 47 achievement badges across 10 categories
Daily leaderboards with a "Scan of the Day" algorithm
Shareable cards for social media
A referral engine
RevenueCat subscription integration

That's not a prototype. That's a product.

Don Carlos: The AI Concierge Who Remembers Your Humidor

Most AI integrations in consumer apps feel like an afterthought — a chatbot bolted onto the side. I wanted something different.

Don Carlos is a branded AI persona powered by Claude Sonnet. Named after the Arturo Fuente Don Carlos line, he's a distinguished gentleman who greets you with Spanish time-of-day greetings — "Buenas tardes, señor" — and has full visibility into your humidor and tasting history.

He doesn't just answer questions. He knows what you've smoked, what you rated highly, and what's sitting in your humidor right now. He suggests pairings, identifies cigars from photos when direct scanning fails, and delivers recommendations with butler-like hospitality: "Welcome home, sir. Your collection awaits."

The goal was to make Don Carlos feel like a knowledgeable friend at a premium lounge — warm, opinionated, genuinely helpful. Not a generic chatbot. A character.

Branded AI personas are the future of consumer apps. If your AI layer doesn't have a personality, you're leaving engagement on the table.

The Trojan Horse: Why the Consumer App Isn't the Business

Here's where most people get the CigarSnap story wrong. They think it's a consumer app. It is — on the surface. But the consumer app is the tip of the spear.

The real revenue comes from B2B lounge partnerships.

Think about it from a lounge owner's perspective. CigarSnap's lounge finder is driving foot traffic to their venue. Users are checking in, scanning cigars, leaving reviews, sharing experiences. That data — which lounges are getting traffic, what cigars people are smoking there, when they visit — is incredibly valuable to the lounge operator.

I built a Partner Portal where lounge owners can claim their profile, manage inventory, post offers to nearby users, and access analytics. Pricing tiers run from free to $149/month.

The consumer app creates the audience. The B2B portal monetizes it. I call it the Trojan Horse Strategy — the free consumer experience generates the foot traffic data that lounges can't ignore, converting free listings into paid subscriptions at $49 to $149 per month.

The consumer app is the moat. The B2B revenue is the castle.

The Self-Building Asset Library

One more thing that makes CigarSnap defensible: when a user scans a cigar that doesn't have an image in the database, the system calls Google's Gemini to generate a photorealistic product photo. Studio lighting, wood background, accurate band details. That image gets stored and served to every future user who scans the same cigar.

The community is unknowingly building a premium image library that would cost thousands to replicate manually. Cost to us: approximately $20 to $80 per 1,000 photorealistic images.

Every scan makes the platform better. Every user adds to an asset that competitors would need significant investment to match.

Current State and What's Next

The iOS app has been submitted to the Apple App Store. The web platform is live at web.cigarsnap.app. The B2B Partner Portal is operational. Go-to-market is underway with Reddit community targeting, influencer outreach, and a DFW hyper-local launch playbook.

Total elapsed time from podcast inspiration to production platform: under 30 days. Solo founder.

The web platform also runs on Stripe instead of Apple's in-app purchases — which means capturing 97% of subscription revenue versus 70% through the App Store. That's not a minor detail when you're building a real business.

CigarSnap started as a hypothesis tested against a framework, validated with a $1.08 scrape, and built in a 48-hour sprint. The consumer app is live. The B2B model is ready. And every user who scans a cigar makes the whole thing more valuable.

That's the kind of flywheel I like building.

Originally published at mattcretzman.com

I Built a 15-Feature AI App in 48 Hours as a Solo Founder. Here's the Exact Method.

Matt Cretzman — Mon, 23 Feb 2026 06:12:25 +0000

On December 27, 2024, I heard Greg Isenberg break down AI mobile apps generating $50K+ MRR on his podcast. He laid out five criteria for picking a market: (1) audience actively spends money, (2) repeating problem, (3) solution involves photo/video input, (4) accuracy matters enough to pay for, (5) existing tools are weak.

I immediately thought: cigars.

$57 billion global market. 5–7% annual growth. The top competitor — Cigar Scanner, 150K users — had been pulled from the App Store. Cigar Dojo had 29K members on a desktop-first platform with zero AI. Every existing app was basically a manual database where you type in what you're smoking.

By December 28, I had a functional app with AI-powered cigar identification, a digital humidor, tasting journal, smoke session tracker, social feed, achievement badges, subscription billing, and a referral engine. Fifteen major features. Forty-eight hours. One person.

This post breaks down exactly how I did it — the architecture, the methodology, and the prompt engineering approach that made the speed possible.

The Stack

Mobile: React Native / Expo (iOS + Android)
Web: Next.js 14
Backend: Supabase (PostgreSQL + Row Level Security + Auth + Storage)
AI — Cigar Identification: Claude Vision API
AI — Concierge Chat: Claude Sonnet
AI — Image Generation: Gemini / Imagen 3.0
Payments: RevenueCat (mobile), Stripe (web)
Development: Replit Agent + Claude for architecture/prompts

One database. One API. Three interfaces (mobile app, web app, B2B partner portal).

The Method: AI-Prompt Sequencing

Here's the thing that made 48 hours possible. I didn't build features one by one. I didn't even write code directly for the first several hours.

Instead, I used Claude to architect the entire product — market validation, feature specs for every screen, database schema, monetization model, and go-to-market strategy. That gave me a comprehensive blueprint.

Then came the part that changed how I build everything: I had Claude design a series of complete, sequenced prompts that I could feed directly to Replit Agent.

Each prompt was a self-contained module. Not "build me a social feed." Each one included:

Database schema (exact tables, columns, types, RLS policies)
API routes (endpoints, request/response shapes, error handling)
UI components (screens, state management, user flows)
Integration points with previously built modules

Approximately 12 prompts. Ordered by dependency — authentication first, then core data models, then features that depend on those models, then features that depend on other features.

The key insight: each prompt assumed the previous ones were already implemented. So prompt #7 (badges system) could reference the database tables created in prompt #3 (humidor) and prompt #5 (smoke sessions) without re-specifying them.

I fed each prompt to Replit Agent in sequence, tested, fixed edge cases, and moved to the next one.

The AI Integration Layer

Cigar Identification (Claude Vision)

The core feature. User photographs a cigar band. The image goes to Claude Vision with a structured prompt requesting JSON output:

{
  "brand": "Arturo Fuente",
  "product_line": "Don Carlos",
  "vitola": "Robusto",
  "country_of_origin": "Dominican Republic",
  "wrapper_type": "Cameroon",
  "ring_gauge": 50,
  "length_inches": 5.25,
  "strength": "Medium-Full",
  "price_range_single": "$12-18",
  "price_range_box": "$180-240",
  "confidence": 0.92
}

Confidence scores determine the UX path. High confidence → show results directly. Lower confidence → suggest manual verification or route to the AI concierge for a second opinion.

Don Carlos AI Concierge (Claude Sonnet)

This is where it gets interesting. Don Carlos is a branded AI persona — a distinguished gentleman with deep cigar knowledge, warmth, and cultural sophistication. But the real power is context injection.

Every conversation receives:

The user's complete humidor inventory
Their tasting history and ratings
Previous smoke sessions
Their stated flavor preferences

So when a user asks "what should I smoke tonight?" Don Carlos isn't giving generic recommendations. He's looking at what's actually in your humidor, what you've rated highly, and what you haven't tried yet.

He can also identify cigars from photos when direct scanning is inconclusive, suggest drink pairings based on flavor profiles, and maintain conversation persistence across sessions.

Self-Building Image Library (Gemini/Imagen 3.0)

This one's my favorite piece of architecture. When a scan identifies a cigar that doesn't have an existing image in the database, the system generates a photorealistic product photo using Gemini.

The prompt specifies studio lighting, dark wood background, accurate band details, and elegant composition. The image gets stored in Supabase Storage and cached for all future users who scan the same cigar.

Users unknowingly build a premium image database. Cost: roughly $20–80 per 1,000 photorealistic images. Competitors would need to photograph thousands of cigars manually to replicate what the community builds for us passively.

A generation logging table tracks requests, success rates, costs, and provides a premium gating option — free users see existing images, premium users trigger generation for cigars not yet in the library.

The Feature That Came From a Dumb Observation

Here's a product insight that had nothing to do with AI.

Cigar smoking is a 45–90 minute ritual where people are literally just sitting there. That's an insanely long potential session time that most consumer apps would kill for.

This led to Smoke Session Mode — a companion experience with a real-time animated burning cigar that progresses as time passes, complete with ash buildup and smoke wisps.

The session timer isn't arbitrary. It uses a formula:

baseTime = ringGauge × lengthInches / 30
adjustedTime = baseTime × strengthMultiplier

So a thick, long, full-bodied cigar gets a longer estimated session than a slim mild one. Personalized to the exact cigar you scanned.

At 33% and 66% through the session, the app prompts you with Three Thirds flavor education — explaining how the flavor profile shifts as you smoke through the first, second, and third portions. Most cigar smokers don't know this. Now they learn it in real time, while smoking.

Ambient mode dims the UI to show just the burning cigar and timer. Haptic feedback fires at phase transitions. Session history tracks everything with ratings and notes.

No competitor has anything like this. And it emerged from a simple observation about session length, not from a feature spec.

The Gamification Layer

47 badges across 10 categories. This sounds excessive until you understand the retention strategy.

The design principle: every celebration should feel earned. Confetti for logging in is cringe. Confetti for breaking a 30-day smoking streak? That hits different.

Badge categories span collection milestones (humidor size), exploration (trying cigars from different countries), social engagement (sharing, reviewing), session commitment (completing full smoke sessions), and knowledge (identifying cigars correctly on first scan).

The leaderboard runs a "Scan of the Day" algorithm that surfaces interesting scans — rare cigars, high-confidence identifications, first-time-scanned brands — rather than just ranking by volume.

The Business Model Nobody Sees

The consumer app is a Trojan horse. The real revenue is B2B.

Consumer subscriptions ($6.99/week or $49.99/year on mobile, $9.99/month or $79.99/year on web) provide baseline revenue. But the real play is lounge partnerships.

CigarSnap drives foot traffic to lounges through the lounge finder with check-ins and foot traffic analytics. Free listings prove value. Once a lounge sees 50 check-ins a month from CigarSnap users, the conversation about a $49–149/month Preferred Partner listing sells itself.

138 Texas lounges mapped at launch. B2B revenue projections for DFW alone: $2,470/month conservative, $7,920/month aggressive.

The consumer app is top of funnel. B2B is the real money.

What I'd Do Differently

Testing between prompts. I should have written integration tests after each prompt module instead of doing a big QA pass at the end. Some dependency issues between modules would've been caught earlier.

The web rebuild. The initial build was mobile-first through Replit Agent. The Next.js web app came later as a separate prompt sequence. I should have designed both interfaces from the start with a shared component library rather than rebuilding UI components.

Scope discipline. I almost built a CRM inside CigarSnap because a lounge owner told me about her $25K/year CRM spend. I had to physically stop myself. CigarSnap is a lead source that feeds into a CRM — it's not a CRM itself. Knowing what NOT to build is harder than building.

The Numbers

Metric	Value
Idea to functional MVP	~48 hours
Features at launch	15+
Full ecosystem build (mobile + web + B2B portal)	Under 30 days
Texas lounges mapped	138+
Reddit leads scraped for validation	3,522 at $1.08 total cost
Achievement badges	47 across 10 categories
AI image generation cost	$20–80 per 1,000 images
Global cigar market	$57B+

About Me

I'm Matt Cretzman. I build AI agent systems that run entire business functions — and companies that depend on them working.

CigarSnap is one of seven ventures I'm currently running. The others span legal tech (TextEvidence), AI coaching (Skill Refinery), B2B lead generation (LeadStorm AI), EdTech (HeyBaddie), meeting management (MyPRQ), and the AI marketing agency that started it all (Stormbreaker Digital).

I write about the technical side of building these at mattcretzman.com and share the less-filtered version on Substack.

If you're building with AI agents, I'd love to hear what your stack looks like. Drop a comment or find me on LinkedIn or GitHub.