<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ayoola Solomon</title>
    <description>The latest articles on Forem by Ayoola Solomon (@ayoolasolomon).</description>
    <link>https://forem.com/ayoolasolomon</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F58080%2Fcb137a0e-7d34-4dcd-8b38-2822fb2d4143.jpeg</url>
      <title>Forem: Ayoola Solomon</title>
      <link>https://forem.com/ayoolasolomon</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ayoolasolomon"/>
    <language>en</language>
    <item>
      <title>How I Designed a Modular, Event-Driven Architecture for Real-Time Voice AI</title>
      <dc:creator>Ayoola Solomon</dc:creator>
      <pubDate>Wed, 19 Nov 2025 09:49:54 +0000</pubDate>
      <link>https://forem.com/ayoolasolomon/how-i-designed-a-modular-event-driven-architecture-for-real-time-voice-ai-3d6g</link>
      <guid>https://forem.com/ayoolasolomon/how-i-designed-a-modular-event-driven-architecture-for-real-time-voice-ai-3d6g</guid>
      <description>&lt;p&gt;&lt;strong&gt;Most voice AI systems today are built as a fixed chain:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;STT → LLM → TTS → Audio Output.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This works for demos, but falls apart the moment you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Custom business logic&lt;/li&gt;
&lt;li&gt;CRM integrations&lt;/li&gt;
&lt;li&gt;Multi-agent routing&lt;/li&gt;
&lt;li&gt;Knowledge lookups&lt;/li&gt;
&lt;li&gt;Scheduling flows&lt;/li&gt;
&lt;li&gt;Post-call actions&lt;/li&gt;
&lt;li&gt;Pipeline branching&lt;/li&gt;
&lt;li&gt;Swappable providers (Claude vs GPT, Deepgram vs Whisper, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So for &lt;strong&gt;EchoStack&lt;/strong&gt;, I scrapped the idea of a “voice bot pipeline” entirely and built a &lt;strong&gt;voice automation platform&lt;/strong&gt; powered by an &lt;strong&gt;event-driven orchestration layer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here’s how the architecture works — and why it has completely changed what’s possible with real-time AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  LiveKit Only Handles Ingress &amp;amp; Egress
&lt;/h2&gt;

&lt;p&gt;Not STT.&lt;br&gt;
Not LLM.&lt;br&gt;
Not TTS.&lt;/p&gt;

&lt;p&gt;Just pure audio transport:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Mic → LiveKit → EchoStack  
EchoStack → LiveKit → User Speaker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside EchoStack, every audio frame becomes an event:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;processing.livekit.audio_frame
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes the audio layer fully modular and independent of AI logic.&lt;/p&gt;
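
&lt;p&gt;For illustration, a frame event might carry an envelope like this (the field names are hypothetical, not EchoStack’s actual schema):&lt;/p&gt;

```python
# Illustrative event envelope for one audio frame. Field names here are
# hypothetical; the real EchoStack schema may differ.
audio_frame_event = {
    "type": "processing.livekit.audio_frame",
    "session_id": "sess-123",            # which call this frame belongs to
    "seq": 42,                           # ordering for reassembly
    "sample_rate_hz": 16000,             # PCM sample rate
    "payload": b"...raw pcm bytes...",   # opaque to the routing layer
}
```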

&lt;h2&gt;
  
  
  Everything Inside EchoStack Is a Connector
&lt;/h2&gt;

&lt;p&gt;A connector can be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deepgram STT&lt;/li&gt;
&lt;li&gt;WhisperX&lt;/li&gt;
&lt;li&gt;AssemblyAI&lt;/li&gt;
&lt;li&gt;Claude&lt;/li&gt;
&lt;li&gt;GPT-4o&lt;/li&gt;
&lt;li&gt;Llama 3&lt;/li&gt;
&lt;li&gt;ElevenLabs&lt;/li&gt;
&lt;li&gt;Azure Neural TTS&lt;/li&gt;
&lt;li&gt;HubSpot&lt;/li&gt;
&lt;li&gt;Salesforce&lt;/li&gt;
&lt;li&gt;Zendesk&lt;/li&gt;
&lt;li&gt;Calendly&lt;/li&gt;
&lt;li&gt;A custom HTTP API&lt;/li&gt;
&lt;li&gt;A knowledge search&lt;/li&gt;
&lt;li&gt;A database entry&lt;/li&gt;
&lt;li&gt;Or even another AI agent
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"consumes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"processing.deepgram.text"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"produces"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"processing.claude.agent_message"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;EchoStack&lt;/strong&gt; uses this to decide where events flow next.&lt;/p&gt;

&lt;p&gt;In effect, it’s a real-time analogue of Zapier or LangGraph.&lt;/p&gt;
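
&lt;p&gt;As a sketch, consumes-based routing can be a simple fan-out loop. This is a hypothetical illustration (not EchoStack’s actual internals), assuming each handler returns a list of (event_type, payload) pairs it produces:&lt;/p&gt;

```python
# Minimal consumes/produces fan-out. Hypothetical sketch: each connector
# registers the event types it consumes plus a handler; whatever the
# handler produces is published back onto the bus.
class EventBus:
    def __init__(self):
        self.connectors = []

    def register(self, consumes, handler):
        self.connectors.append((consumes, handler))

    def publish(self, event_type, payload):
        # Deliver to every connector that consumes this type, then
        # recursively publish whatever those connectors produce.
        for consumes, handler in self.connectors:
            if event_type in consumes:
                for produced_type, out in handler(payload):
                    self.publish(produced_type, out)
```

&lt;p&gt;Swapping a provider is then just registering a different handler under the same event types.&lt;/p&gt;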

&lt;h2&gt;
  
  
  Pipelines Are Just Manifests
&lt;/h2&gt;

&lt;p&gt;Instead of hardcoded logic, pipelines are defined like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"pipeline"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"ingress.livekit.audio_frame → deepgram.stt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"deepgram.stt → claude.agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"claude.agent → elevenlabs.tts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"elevenlabs.tts → egress.livekit.audio_chunk"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No code.&lt;br&gt;
No wiring.&lt;br&gt;
Just declarative routing.&lt;/p&gt;

&lt;p&gt;Want to swap Deepgram for Whisper?&lt;br&gt;
Edit one line.&lt;/p&gt;

&lt;p&gt;Want to add sentiment analysis between STT and LLM?&lt;br&gt;
Add one rule.&lt;/p&gt;

&lt;p&gt;Want multi-agent routing?&lt;br&gt;
Add a router connector.&lt;/p&gt;
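
&lt;p&gt;Compiling such a manifest into a routing table is cheap. A hypothetical sketch, assuming each rule is a "source → target" string like the ones above:&lt;/p&gt;

```python
# Compile "source → target" pipeline rules into a routing table that maps
# each stage to its downstream stages. Hypothetical sketch, not the
# actual EchoStack loader.
def compile_routes(pipeline):
    routes = {}
    for rule in pipeline:
        src, dst = rule.split(" → ")
        routes.setdefault(src, []).append(dst)
    return routes
```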
&lt;h2&gt;
  
  
  Multi-Playbook Orchestration (The Real Game-Changer)
&lt;/h2&gt;

&lt;p&gt;Traditional voice agents can only run one flow.&lt;/p&gt;

&lt;p&gt;EchoStack can run many — and switch between them in real time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LeadQualifier.json  
MeetingBooker.json  
FAQBot.json  
SupportAgent.json  
CRMLogger.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the user says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I want to book a meeting.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The routing connector switches the playbook:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;processing.deepgram.text → intent.router → meeting_booker.playbook
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is impossible in a linear voice bot pipeline, but trivial in an event system.&lt;/p&gt;
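
&lt;p&gt;A minimal intent router can be sketched like this (keyword matching stands in for a real classifier, and the playbook names are illustrative):&lt;/p&gt;

```python
# Hypothetical intent router: simple keyword matching stands in for a
# real intent classifier. Returns the playbook to activate.
INTENT_PLAYBOOKS = {
    "book a meeting": "meeting_booker.playbook",
    "talk to support": "support_agent.playbook",
}

def route(transcript, default="faq_bot.playbook"):
    text = transcript.lower()
    for phrase, playbook in INTENT_PLAYBOOKS.items():
        if phrase in text:
            return playbook
    return default
```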

&lt;h2&gt;
  
  
  Real-Time Streaming (STT, LLM, TTS)
&lt;/h2&gt;

&lt;p&gt;Because everything is async events, the system supports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Streaming STT transcripts&lt;/li&gt;
&lt;li&gt;Streaming LLM tokens (Claude / GPT-4o)&lt;/li&gt;
&lt;li&gt;Streaming TTS audio chunks&lt;/li&gt;
&lt;li&gt;Barge-in and interruption&lt;/li&gt;
&lt;li&gt;Live agent escalation&lt;/li&gt;
&lt;li&gt;Parallel processing&lt;/li&gt;
&lt;li&gt;Multi-agent collaboration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example LLM output stream event:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;processing.claude.agent_message.partial
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example TTS stream:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;processing.elevenlabs.audio_chunk.stream
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user hears responses as they are generated — not after the full LLM response.&lt;/p&gt;
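
&lt;p&gt;One way to sketch that handoff: buffer streamed LLM tokens and flush a TTS request per sentence, so audio playback starts while the model is still generating (a hypothetical sketch, assuming token events arrive as strings):&lt;/p&gt;

```python
# Buffer streamed LLM tokens and yield a sentence-sized chunk for TTS as
# soon as one completes. Hypothetical sketch of the streaming handoff.
def chunk_for_tts(token_events):
    buf = ""
    for tok in token_events:
        buf += tok
        if buf.rstrip().endswith((".", "?", "!")):
            yield buf.strip()   # hand a full sentence to the TTS stage
            buf = ""
    if buf.strip():
        yield buf.strip()       # flush any trailing fragment
```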

&lt;h2&gt;
  
  
  Full Pipeline Simulation (No LiveKit Needed)
&lt;/h2&gt;

&lt;p&gt;This is my favorite feature.&lt;/p&gt;

&lt;p&gt;EchoStack can simulate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Audio → STT&lt;/li&gt;
&lt;li&gt;STT → LLM&lt;/li&gt;
&lt;li&gt;LLM → TTS&lt;/li&gt;
&lt;li&gt;TTS → Egress&lt;/li&gt;
&lt;li&gt;All connector interactions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without touching real providers.&lt;/p&gt;

&lt;p&gt;It uses a mock runtime registry to generate realistic fake outputs.&lt;/p&gt;

&lt;p&gt;This allows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Visual debugging&lt;/li&gt;
&lt;li&gt;Step-by-step replay&lt;/li&gt;
&lt;li&gt;Educational demos&lt;/li&gt;
&lt;li&gt;Test-driven development&lt;/li&gt;
&lt;li&gt;Predictable QA&lt;/li&gt;
&lt;li&gt;“Dry runs” before deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is something even Retell &amp;amp; Vapi don’t have today.&lt;/p&gt;
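
&lt;p&gt;A mock registry can be as simple as a name-to-handler map, so an entire pipeline dry-runs with canned outputs (a hypothetical sketch, not the real registry):&lt;/p&gt;

```python
# Hypothetical mock runtime registry: map connector names to canned
# handlers so a pipeline can be simulated without calling any provider.
MOCK_REGISTRY = {
    "deepgram.stt": lambda audio: "mock transcript",
    "claude.agent": lambda text: "mock reply to: " + text,
    "elevenlabs.tts": lambda text: b"mock-audio-bytes",
}

def simulate(stages, payload):
    trace = []
    for stage in stages:
        payload = MOCK_REGISTRY[stage](payload)
        trace.append((stage, payload))   # record each hop for replay
    return trace
```

&lt;p&gt;The recorded trace is what makes step-by-step replay and predictable QA possible.&lt;/p&gt;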

&lt;h2&gt;
  
  
  And It Scales Like a Distributed System
&lt;/h2&gt;

&lt;p&gt;Because everything is events:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each connector is a worker&lt;/li&gt;
&lt;li&gt;Workers scale horizontally&lt;/li&gt;
&lt;li&gt;Backpressure is manageable&lt;/li&gt;
&lt;li&gt;Failures can be contained&lt;/li&gt;
&lt;li&gt;Retries &amp;amp; fallbacks are simple&lt;/li&gt;
&lt;li&gt;Pipelines can fork or merge&lt;/li&gt;
&lt;li&gt;Multi-agent flows work naturally&lt;/li&gt;
&lt;li&gt;Audioless connectors (CRM, DB, API) blend seamlessly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It behaves like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zapier&lt;/li&gt;
&lt;li&gt;AWS EventBridge&lt;/li&gt;
&lt;li&gt;LangGraph&lt;/li&gt;
&lt;li&gt;Airflow&lt;/li&gt;
&lt;li&gt;N8N&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…but optimized for real-time audio.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Unlocks for Businesses
&lt;/h2&gt;

&lt;p&gt;This is where the architecture stops being “cool tech” and becomes actual value:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lead qualification&lt;/li&gt;
&lt;li&gt;After-hours support&lt;/li&gt;
&lt;li&gt;Customer triage&lt;/li&gt;
&lt;li&gt;Booking assistants&lt;/li&gt;
&lt;li&gt;Helpdesk automation&lt;/li&gt;
&lt;li&gt;Sales follow-ups&lt;/li&gt;
&lt;li&gt;Knowledge Q&amp;amp;A&lt;/li&gt;
&lt;li&gt;Order tracking&lt;/li&gt;
&lt;li&gt;Multi-agent escalation&lt;/li&gt;
&lt;li&gt;CRM syncing&lt;/li&gt;
&lt;li&gt;Custom playbooks per industry&lt;/li&gt;
&lt;li&gt;Complex routing between AI tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don’t just deploy “a bot.”&lt;/p&gt;

&lt;p&gt;You deploy a &lt;strong&gt;network of intelligent voice automations&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;Voice AI is moving fast, but most of what exists today is still:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;rigid&lt;/li&gt;
&lt;li&gt;non-composable&lt;/li&gt;
&lt;li&gt;difficult to integrate&lt;/li&gt;
&lt;li&gt;tied to single vendors&lt;/li&gt;
&lt;li&gt;non-debuggable&lt;/li&gt;
&lt;li&gt;non-portable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By making the entire system event-driven and connector-based, EchoStack becomes:&lt;/p&gt;

&lt;p&gt;A real-time automation platform where voice is the entry point — not the limitation.&lt;/p&gt;

&lt;p&gt;If you’re into real-time systems, LiveKit, STT/LLM/TTS pipelines, or voice automation, I’d love to exchange ideas.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>startup</category>
      <category>automation</category>
    </item>
    <item>
      <title>Inside the Manifest: How We Make Voice-AI Playbooks Deployable</title>
      <dc:creator>Ayoola Solomon</dc:creator>
      <pubDate>Mon, 10 Nov 2025 21:51:59 +0000</pubDate>
      <link>https://forem.com/ayoolasolomon/inside-the-manifest-how-we-make-voice-ai-playbooks-deployable-5431</link>
      <guid>https://forem.com/ayoolasolomon/inside-the-manifest-how-we-make-voice-ai-playbooks-deployable-5431</guid>
      <description>&lt;p&gt;Most businesses don’t want another AI “demo.”&lt;br&gt;
They want &lt;strong&gt;deployable outcomes&lt;/strong&gt; — like qualifying a lead, booking a meeting, or handling an after-hours call automatically.&lt;/p&gt;

&lt;p&gt;At EchoStack, we wanted to make these outcomes &lt;strong&gt;as easy to launch as deploying code&lt;/strong&gt; — and that’s how the &lt;em&gt;manifest&lt;/em&gt; was born.&lt;/p&gt;


&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;Voice-AI tools today can hold great conversations, but they often stop there.&lt;br&gt;
To &lt;em&gt;actually&lt;/em&gt; move data into CRMs, book meetings, or send follow-ups, you need engineers stitching APIs together.&lt;/p&gt;

&lt;p&gt;That doesn’t scale for non-technical teams.&lt;/p&gt;


&lt;h3&gt;
  
  
  Our Solution — The Manifest
&lt;/h3&gt;

&lt;p&gt;Each playbook in EchoStack (e.g., &lt;em&gt;Lead Qualifier&lt;/em&gt;, &lt;em&gt;After-Hours Support&lt;/em&gt;) is powered by a &lt;strong&gt;manifest&lt;/strong&gt; — a single file that describes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what the playbook does&lt;/li&gt;
&lt;li&gt;which tools it connects to&lt;/li&gt;
&lt;li&gt;and what should happen when key events occur.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s like &lt;strong&gt;Terraform&lt;/strong&gt;, but for voice-driven business outcomes.&lt;/p&gt;


&lt;h3&gt;
  
  
  Example Manifest (Simplified)
&lt;/h3&gt;

&lt;p&gt;Here’s a simplified version of what a playbook looks like under the hood:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lead-qualifier"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Voice Lead Qualifier"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Qualifies inbound leads via voice and syncs results to CRM."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"connectors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"twilio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hubspot"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"calendly"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"events"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"call.started"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"start_conversation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"lead.qualified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"create_contact"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"meeting.booked"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"schedule_event"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn’t code — it’s a &lt;strong&gt;declarative contract&lt;/strong&gt; between the voice experience and the business stack.&lt;br&gt;
Our platform reads this file, provisions the right connectors, and handles orchestration automatically.&lt;/p&gt;




&lt;h3&gt;
  
  
  Why It Matters
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No-code deployment:&lt;/strong&gt; Business teams can launch playbooks instantly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version control:&lt;/strong&gt; Every manifest can be tracked, forked, and redeployed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extensibility:&lt;/strong&gt; Developers can author new playbooks using familiar patterns (JSON, events, connectors).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It turns &lt;strong&gt;AI workflows into deployable building blocks&lt;/strong&gt; — reusable, composable, and measurable.&lt;/p&gt;




&lt;h3&gt;
  
  
  What’s Next
&lt;/h3&gt;

&lt;p&gt;We’re expanding the manifest system to support richer event schemas and multi-step orchestration.&lt;br&gt;
Soon, teams will be able to chain multiple playbooks together — stacking outcomes like Lego blocks.&lt;/p&gt;




&lt;h3&gt;
  
  
  Closing Thought
&lt;/h3&gt;

&lt;p&gt;If you’ve ever written Terraform for infrastructure or YAML for CI/CD, imagine doing that —&lt;br&gt;
but for &lt;em&gt;voice automation&lt;/em&gt;.&lt;br&gt;
That’s what EchoStack’s manifests make possible.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Curious to see a manifest in action?&lt;/strong&gt;&lt;br&gt;
We’re opening early access for developers building voice-AI workflows — you can join at &lt;a href="https://getechostack.com" rel="noopener noreferrer"&gt;getechostack.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>startup</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Designing Deployable Voice-AI Playbooks</title>
      <dc:creator>Ayoola Solomon</dc:creator>
      <pubDate>Tue, 28 Oct 2025 08:54:07 +0000</pubDate>
      <link>https://forem.com/ayoolasolomon/designing-deployable-voice-ai-playbooks-p95-300-ms-preflight-bluegreen-55eh</link>
      <guid>https://forem.com/ayoolasolomon/designing-deployable-voice-ai-playbooks-p95-300-ms-preflight-bluegreen-55eh</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This is a &lt;strong&gt;design/engineering write-up&lt;/strong&gt; for our EchoStack pivot. We’re packaging Voice-AI &lt;strong&gt;playbooks&lt;/strong&gt; (like &lt;em&gt;After-hours Answering&lt;/em&gt; and &lt;em&gt;Lead Qualifier → Auto-Book&lt;/em&gt;) into &lt;strong&gt;deployable solutions&lt;/strong&gt; with no-code setup, safe rollouts, and KPI tiles.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Status:&lt;/strong&gt; Early Access only — we’re validating integrations and rollout safety with a small group before opening signups.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why playbooks (not tool soup)
&lt;/h2&gt;

&lt;p&gt;Teams don’t buy models; they buy &lt;strong&gt;outcomes&lt;/strong&gt;: fewer missed calls, more booked meetings, lower average handle time (AHT). The hard parts are &lt;strong&gt;barge-in latency&lt;/strong&gt; and &lt;strong&gt;safe deployment&lt;/strong&gt;, not the LLM itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Latency budget we hold ourselves to (p95 targets)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;ASR partials: &lt;strong&gt;60–90 ms&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;LLM first token: &lt;strong&gt;80–120 ms&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;TTS first audio: &lt;strong&gt;50–80 ms&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Network buffers: &lt;strong&gt;40–60 ms&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;→ Goal: &lt;strong&gt;&amp;lt; 300 ms p95&lt;/strong&gt; end-to-end (barge-in friendly).&lt;/p&gt;
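
&lt;p&gt;Summing the stage budgets is a useful sanity check: the low ends total 230 ms and the high ends total 350 ms, so the 300 ms goal only holds when most stages land near the lower half of their range. This is plain arithmetic on the targets listed above:&lt;/p&gt;

```python
# Sanity-check the p95 latency budget by summing the low and high ends
# of each stage's target range (numbers taken from the list above).
BUDGET_MS = {
    "asr_partials": (60, 90),
    "llm_first_token": (80, 120),
    "tts_first_audio": (50, 80),
    "network_buffers": (40, 60),
}

low = sum(lo for lo, hi in BUDGET_MS.values())    # best case: 230 ms
high = sum(hi for lo, hi in BUDGET_MS.values())   # worst case: 350 ms
# The 300 ms end-to-end goal sits between the two, so the worst case of
# every stage at once would blow the budget.
```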

&lt;h2&gt;
  
  
  Rollout safety (what we’re building)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;preflight → plan → apply (blue) → smoke test → switch (green) → rollback&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Preflight&lt;/strong&gt; checks scopes, latency probes, and config drift.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan&lt;/strong&gt; shows a human-readable diff.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apply&lt;/strong&gt; deploys to an inactive slot (blue).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Switch&lt;/strong&gt; flips traffic; &lt;strong&gt;Rollback&lt;/strong&gt; is one click.&lt;/li&gt;
&lt;/ul&gt;
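
&lt;p&gt;The rollout sequence can be modeled as a tiny state machine (a hypothetical sketch of the steps above, where any failing step triggers rollback):&lt;/p&gt;

```python
# Hypothetical rollout state machine: each step must succeed before the
# next runs; any failure stops the rollout and triggers rollback.
STEPS = ["preflight", "plan", "apply_blue", "smoke_test", "switch_green"]

def rollout(run_step):
    done = []
    for step in STEPS:
        if not run_step(step):
            return done, "rolled_back"   # traffic never switched
        done.append(step)
    return done, "live"
```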

&lt;h2&gt;
  
  
  Integration surface (first adapters)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Telephony: Twilio/Plivo/SIP&lt;/li&gt;
&lt;li&gt;Voice agent: Retell (others later)&lt;/li&gt;
&lt;li&gt;Calendar: Calendly/Google&lt;/li&gt;
&lt;li&gt;CRM: HubSpot/Salesforce (Sheets fallback)&lt;/li&gt;
&lt;li&gt;Helpdesk: Zendesk (optional)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What exists today
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Data model + manifests for two playbooks&lt;/li&gt;
&lt;li&gt;No-code configuration flow (internal)&lt;/li&gt;
&lt;li&gt;Preflight → plan → apply skeleton&lt;/li&gt;
&lt;li&gt;KPI tiles (self-serve %, AHT, booked meetings) wired to session events&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What we’re validating next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Region-aware routing under load&lt;/li&gt;
&lt;li&gt;Failure modes during blue/green switches&lt;/li&gt;
&lt;li&gt;Adapter ergonomics (CRM/calendar/telephony edge cases)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Looking for feedback
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Are these &lt;strong&gt;p95 targets&lt;/strong&gt; realistic for your use case?&lt;/li&gt;
&lt;li&gt;What &lt;strong&gt;minimum logs/SLAs&lt;/strong&gt; make you comfortable with rollout?&lt;/li&gt;
&lt;li&gt;Which &lt;strong&gt;adapter combo&lt;/strong&gt; should be prioritized?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;More context:&lt;/strong&gt; &lt;a href="https://getechostack.com/playbooks" rel="noopener noreferrer"&gt;https://getechostack.com/playbooks&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Early Access (no public demo yet):&lt;/strong&gt; &lt;a href="https://getechostack.com/contact?subject=Early%20Access" rel="noopener noreferrer"&gt;https://getechostack.com/contact?subject=Early%20Access&lt;/a&gt;&lt;/p&gt;

</description>
      <category>voiceai</category>
      <category>architecture</category>
      <category>nocode</category>
      <category>saas</category>
    </item>
    <item>
      <title>From building a voice AI widget to mapping the entire Voice AI ecosystem (Introducing echostack)</title>
      <dc:creator>Ayoola Solomon</dc:creator>
      <pubDate>Mon, 13 Oct 2025 19:04:15 +0000</pubDate>
      <link>https://forem.com/ayoolasolomon/from-building-a-voice-ai-widget-to-mapping-the-entire-voice-ai-ecosystem-introducing-echostack-ceo</link>
      <guid>https://forem.com/ayoolasolomon/from-building-a-voice-ai-widget-to-mapping-the-entire-voice-ai-ecosystem-introducing-echostack-ceo</guid>
      <description>&lt;p&gt;Hey everyone,&lt;/p&gt;

&lt;p&gt;I’m Solomon — the creator of &lt;a href="https://getechospace.com/" rel="noopener noreferrer"&gt;GetEchoSpace&lt;/a&gt;, a voice AI widget that lets any website host real-time audio conversations for support, live shopping, or community.&lt;/p&gt;

&lt;p&gt;While building it, I constantly had to combine tools for ASR, text-to-speech, and LLMs — juggling APIs from different vendors and testing pipelines just to get a working flow.&lt;/p&gt;

&lt;p&gt;At some point, it hit me:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Everyone building in voice AI is reinventing the same workflows from scratch.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;There are incredible voice AI tools out there — from OpenAI’s speech APIs to ElevenLabs, Whisper, Speechmatics, and more.&lt;br&gt;
But there’s no central place to discover, compare, and see how they connect in real-world setups.&lt;/p&gt;

&lt;p&gt;Builders like me spend hours figuring out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;which ASR integrates best with Twilio,&lt;/li&gt;
&lt;li&gt;how to pass data between TTS and LLMs,&lt;/li&gt;
&lt;li&gt;and how to deploy these flows in production.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Enter echostack
&lt;/h2&gt;

&lt;p&gt;So I started building &lt;a href="https://getechostack.com" rel="noopener noreferrer"&gt;echostack&lt;/a&gt; — a public directory of voice AI tools and ready-made “stacks.”&lt;/p&gt;

&lt;p&gt;Think of it as Zapier templates or Stack Overflow for voice AI workflows.&lt;br&gt;
Each stack shows how to combine tools (e.g., Retell + OpenAI + Twilio + GCP ASR) to achieve real outcomes — like multilingual dubbing, customer triage bots, or AI-powered voice assistants.&lt;/p&gt;

&lt;p&gt;The goal:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;help developers and AI builders spend less time wiring tools, and more time shipping value.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Tech Behind the MVP
&lt;/h3&gt;

&lt;p&gt;The MVP is built with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Next.js 15 (App Router)&lt;/li&gt;
&lt;li&gt;TypeScript + Tailwind&lt;/li&gt;
&lt;li&gt;Supabase (for data)&lt;/li&gt;
&lt;li&gt;Zapier &amp;amp; n8n export support planned for v0.2&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What’s Live Now
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figkl65113pmirp4him11.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figkl65113pmirp4him11.png" alt="stack detail page" width="800" height="932"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can explore:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Featured voice AI tools&lt;/li&gt;
&lt;li&gt;Early “stacks” (like multilingual dubbing or real-time triage bots)&lt;/li&gt;
&lt;li&gt;Newsletter signup for updates as new stacks drop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://getechostack.com" rel="noopener noreferrer"&gt;https://getechostack.com&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  I’d love your feedback
&lt;/h3&gt;

&lt;p&gt;If you’re building with Voice-AI or integrating ASR/TTS/LLM tools, I’d love to hear:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What workflows or “stacks” you’d want to see next&lt;/li&gt;
&lt;li&gt;Which tools are must-haves for you&lt;/li&gt;
&lt;li&gt;Whether you prefer no-code or code-level examples&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What’s Next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Expand to more tools and stacks&lt;/li&gt;
&lt;li&gt;Add semantic search and tagging&lt;/li&gt;
&lt;li&gt;Support Zapier/n8n exports&lt;/li&gt;
&lt;li&gt;Launch the curated Voice-AI Stacks Newsletter&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If that sounds interesting, you can check it out or share feedback directly on &lt;a href="https://getechostack.com" rel="noopener noreferrer"&gt;echostack&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>voiceai</category>
      <category>indiehackers</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
