<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Aaron VanSledright</title>
    <description>The latest articles on Forem by Aaron VanSledright (@avansledright).</description>
    <link>https://forem.com/avansledright</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3817240%2Fb7f5c68f-432d-4d27-91d4-6c7a7b572314.jpg</url>
      <title>Forem: Aaron VanSledright</title>
      <link>https://forem.com/avansledright</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/avansledright"/>
    <language>en</language>
    <item>
      <title>I built a WordPress Plugin For Generating Images With Nano Banana</title>
      <dc:creator>Aaron VanSledright</dc:creator>
      <pubDate>Thu, 19 Mar 2026 16:49:09 +0000</pubDate>
      <link>https://forem.com/avansledright/i-built-a-wordpress-plugin-for-generating-images-with-nano-banana-187j</link>
      <guid>https://forem.com/avansledright/i-built-a-wordpress-plugin-for-generating-images-with-nano-banana-187j</guid>
      <description>&lt;p&gt;AI is every where. Accept it. Anyway, I had a random thought last night about having a WordPress plugin that allows you to generate images on the fly for your posts. Pictures increase engagement on posts so, what if we just inline Nano Banana directly into Gutenberg?&lt;/p&gt;

&lt;p&gt;This morning I built the plugin: a simple call to Google’s Gemini API (via AI Studio) wrapped in a Gutenberg block.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Type your prompt&lt;/li&gt;
&lt;li&gt;Choose your model&lt;/li&gt;
&lt;li&gt;Hit generate&lt;/li&gt;
&lt;li&gt;Insert&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Simple!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypxad6u9etwbwkn3v91o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypxad6u9etwbwkn3v91o.png" alt=" " width="769" height="366"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once the image is inserted into the post, the block converts into a standard image block, so it’s as easy to manage as any other image.&lt;/p&gt;

&lt;p&gt;I submitted the plugin to the official WordPress repository but it takes a while to get approved. So, if you want to add it to your own WordPress instance feel free to message me and I’ll give you access to the repository!&lt;/p&gt;

</description>
      <category>wordpress</category>
      <category>gemini</category>
      <category>webdev</category>
    </item>
    <item>
      <title>I Built an AI-Powered WordPress Hosting Platform — Here's Why</title>
      <dc:creator>Aaron VanSledright</dc:creator>
      <pubDate>Fri, 13 Mar 2026 21:14:52 +0000</pubDate>
      <link>https://forem.com/avansledright/i-built-an-ai-powered-wordpress-hosting-platform-heres-why-3aik</link>
      <guid>https://forem.com/avansledright/i-built-an-ai-powered-wordpress-hosting-platform-heres-why-3aik</guid>
      <description>&lt;p&gt;I'm a cloud architect by day. I've spent 6+ years designing infrastructure on AWS for enterprise clients — the kind of environments where uptime, security, and scalability aren't optional.&lt;/p&gt;

&lt;p&gt;But I kept noticing something. The people who needed solid web hosting the most — small business owners, freelancers, entrepreneurs — were stuck choosing between two bad options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cheap shared hosting&lt;/strong&gt; that's slow, insecure, and breaks at the worst possible times&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Expensive managed platforms&lt;/strong&gt; that charge a premium for infrastructure that costs them a fraction of what you pay&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So I built something in between.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Does
&lt;/h2&gt;

&lt;p&gt;The platform spins up a fully configured WordPress site on its own dedicated AWS infrastructure. No shared servers, no noisy neighbors. Each site gets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Its own isolated environment running on AWS&lt;/li&gt;
&lt;li&gt;SSL certificates configured automatically&lt;/li&gt;
&lt;li&gt;AI-generated themes so you're not starting from a blank screen&lt;/li&gt;
&lt;li&gt;A site that's ready to go in under a minute&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI piece isn't a gimmick — it generates a custom WordPress theme based on your business, so you skip the hours of digging through starter themes and tweaking settings before you can even start adding content.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture (For the Curious)
&lt;/h2&gt;

&lt;p&gt;Under the hood, the platform uses a &lt;strong&gt;Golden AMI strategy&lt;/strong&gt;. Instead of bootstrapping a fresh server every time (installing WordPress, configuring Apache, setting up the database, etc.), I pre-bake all of that into a machine image. When a new site is requested, it launches from that image with everything already in place.&lt;/p&gt;

&lt;p&gt;This brings deployment time down to about &lt;strong&gt;30–45 seconds&lt;/strong&gt; from request to live site.&lt;/p&gt;

&lt;p&gt;Each tenant gets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A dedicated EC2 instance launched from the Golden AMI&lt;/li&gt;
&lt;li&gt;MariaDB running locally (keeps costs low and latency minimal)&lt;/li&gt;
&lt;li&gt;Let's Encrypt SSL via automated provisioning&lt;/li&gt;
&lt;li&gt;Infrastructure managed entirely with Terraform&lt;/li&gt;
&lt;/ul&gt;
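&lt;p&gt;As a rough sketch of the provisioning step (instance names, tags, and the AMI ID below are illustrative, not the platform’s actual code), launching a tenant from the Golden AMI comes down to a single &lt;code&gt;run_instances&lt;/code&gt; call:&lt;/p&gt;

```python
# Hypothetical sketch: launch one isolated tenant instance from a pre-baked
# Golden AMI. "ec2" is a boto3 EC2 client, e.g. boto3.client("ec2").
def launch_tenant(ec2, ami_id, tenant_id, instance_type="t3.small"):
    """Start a dedicated instance for one tenant and return its instance ID."""
    resp = ec2.run_instances(
        ImageId=ami_id,              # Golden AMI with WordPress pre-installed
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Tenant", "Value": tenant_id}],
        }],
    )
    return resp["Instances"][0]["InstanceId"]
```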

&lt;p&gt;The AI theme generation runs through &lt;strong&gt;Amazon Bedrock&lt;/strong&gt; (Claude Sonnet), which generates a complete WordPress theme — &lt;code&gt;style.css&lt;/code&gt;, template files, &lt;code&gt;functions.php&lt;/code&gt; — based on the business type and preferences provided during setup.&lt;/p&gt;
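&lt;p&gt;A minimal sketch of that generation call (the model ID, prompt wording, and JSON file-map convention are my assumptions, not the platform’s actual implementation):&lt;/p&gt;

```python
import json

# Hedged sketch of theme generation through the Bedrock Converse API.
# "bedrock" is a boto3 bedrock-runtime client.
def generate_theme(bedrock, business_description):
    prompt = (
        "Generate a complete WordPress theme for this business: "
        f"{business_description}. Return a JSON object mapping file names "
        "(style.css, index.php, functions.php) to full file contents."
    )
    resp = bedrock.converse(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # illustrative
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    text = resp["output"]["message"]["content"][0]["text"]
    return json.loads(text)  # dict of file name to generated contents
```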

&lt;h2&gt;
  
  
  Why Not Just Use [Insert Platform Here]?
&lt;/h2&gt;

&lt;p&gt;Fair question. Here's the honest answer:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;vs. Shared hosting (Bluehost, GoDaddy, etc.):&lt;/strong&gt; Those environments pack hundreds of sites onto the same server. Performance degrades, security is shared-risk, and you have zero control. This platform gives each site its own isolated infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;vs. Managed WordPress (WP Engine, Kinsta, Flywheel):&lt;/strong&gt; Great products, but they charge $30–$100+/month for what is fundamentally commodity infrastructure with a management layer on top. This platform delivers a comparable experience at a lower price point because the architecture is designed to be cost-efficient from the ground up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;vs. Page builders (Wix, Squarespace):&lt;/strong&gt; Those aren't WordPress. If you want the flexibility of the WordPress ecosystem — plugins, themes, WooCommerce, full code access — you need actual WordPress hosting. This gives you that without the setup headache.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned Building It
&lt;/h2&gt;

&lt;p&gt;A few things that might be useful if you're building a similar platform:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Golden AMIs changed everything.&lt;/strong&gt; I originally prototyped with a bootstrap-on-launch approach (user-data scripts installing WordPress, configuring Apache, etc.). It was slow (~5 minutes) and fragile. Pre-baking everything into an AMI cut that to under a minute and eliminated an entire category of deployment failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MariaDB on EC2 &amp;gt; Aurora for this use case.&lt;/strong&gt; Aurora is amazing for multi-tenant databases at scale, but for a platform where each tenant has their own instance, running MariaDB locally is dramatically cheaper and simpler. Per-tenant cost dropped to around $20/month, which makes even the lowest pricing tier profitable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI theme generation needs guardrails.&lt;/strong&gt; The first version would occasionally generate themes with broken PHP or CSS that didn't render correctly. Adding validation steps and a structured prompt template with explicit file-by-file output instructions fixed about 95% of those issues.&lt;/p&gt;
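&lt;p&gt;A toy version of such a guardrail (the real checks are more thorough, and the required-file set here is an assumption) might simply verify that the expected files exist and that &lt;code&gt;style.css&lt;/code&gt; carries the theme header WordPress requires:&lt;/p&gt;

```python
# Sketch of a post-generation validation pass over the AI's output.
# WordPress themes need style.css with a "Theme Name:" header; the rest of
# this file set is an illustrative assumption.
REQUIRED_FILES = {"style.css", "index.php", "functions.php"}

def validate_theme(files):
    """files: dict of file name to generated contents. Returns a list of errors."""
    errors = []
    for name in sorted(REQUIRED_FILES - set(files)):
        errors.append(f"missing required file: {name}")
    css = files.get("style.css", "")
    if "Theme Name:" not in css:
        errors.append("style.css is missing the 'Theme Name:' header")
    return errors
```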

&lt;p&gt;&lt;strong&gt;Let's Encrypt automation is non-negotiable.&lt;/strong&gt; Manual SSL setup is a support nightmare. Automating certificate provisioning and renewal during the deployment process eliminated what would have been a constant stream of support tickets.&lt;/p&gt;
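&lt;p&gt;In spirit, the provisioning step reduces to an unattended &lt;code&gt;certbot&lt;/code&gt; run during deployment (the flags shown are standard certbot options; the helper itself is a hypothetical sketch, with renewal left to &lt;code&gt;certbot renew&lt;/code&gt; on a timer):&lt;/p&gt;

```python
import subprocess

# Hypothetical sketch of unattended certificate provisioning with certbot.
# --non-interactive and --agree-tos let it run with no human in the loop.
def provision_cert(domain, email, dry_run=True):
    cmd = [
        "certbot", "--apache",
        "-d", domain,
        "-m", email,
        "--non-interactive", "--agree-tos",
    ]
    if dry_run:
        return cmd            # let callers inspect the command without running it
    subprocess.run(cmd, check=True)
    return cmd
```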

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The platform is live and I'm onboarding early users. The roadmap includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One-click staging environments&lt;/strong&gt; — clone your site for testing before pushing changes live&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Automated backups with one-click restore&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A theme marketplace&lt;/strong&gt; where AI-generated themes can be saved, shared, and reused&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WooCommerce quick-start&lt;/strong&gt; — pre-configured e-commerce setup with AI-generated product pages&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try It / Get In Touch
&lt;/h2&gt;

&lt;p&gt;If you're interested in checking it out, the platform is running under &lt;a href="https://45squared.com" rel="noopener noreferrer"&gt;45Squared&lt;/a&gt;. I'm actively looking for early adopters and feedback.&lt;/p&gt;

&lt;p&gt;If you're a developer building something similar, happy to chat architecture — drop a comment or reach out.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Aaron, a cloud architect and the founder of 45Squared. I build tools that make AWS infrastructure accessible to people who shouldn't have to think about infrastructure.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>wordpress</category>
      <category>aws</category>
      <category>webdev</category>
    </item>
    <item>
      <title>I Replaced My Agent Framework With Markdown Files and 140 Lines of Python</title>
      <dc:creator>Aaron VanSledright</dc:creator>
      <pubDate>Wed, 11 Mar 2026 21:58:41 +0000</pubDate>
      <link>https://forem.com/avansledright/i-replaced-my-agent-framework-with-markdown-files-and-140-lines-of-python-3323</link>
      <guid>https://forem.com/avansledright/i-replaced-my-agent-framework-with-markdown-files-and-140-lines-of-python-3323</guid>
      <description>&lt;p&gt;Every AI agent framework I tried added complexity I didn't need. LangChain, CrewAI, AutoGen — they're powerful, but for deploying a Slack bot that answers questions using a few tools, I was pulling in hundreds of dependencies to do something &lt;code&gt;boto3&lt;/code&gt; already handles natively.&lt;/p&gt;

&lt;p&gt;So I built something different: a Terraform module where agent behavior lives in markdown files, tools are plain Python functions, and the entire runtime engine is ~140 lines of code with no dependencies beyond &lt;code&gt;boto3&lt;/code&gt;, which the Lambda runtime already provides.&lt;/p&gt;

&lt;p&gt;I open-sourced it: &lt;a href="https://github.com/AIOpsCrew/terraform-module-markdown-agent" rel="noopener noreferrer"&gt;terraform-module-markdown-agent&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem With Agent Frameworks
&lt;/h2&gt;

&lt;p&gt;Most agent frameworks want to own your entire stack. You get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Heavyweight dependencies&lt;/strong&gt; — hundreds of packages for what amounts to a loop calling an LLM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Framework lock-in&lt;/strong&gt; — custom decorators, base classes, and abstractions that couple your business logic to the framework&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment friction&lt;/strong&gt; — designed for containers or servers, not serverless&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Opaque behavior&lt;/strong&gt; — hard to debug when the agent does something unexpected because the prompt is buried in framework internals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're running agents on AWS Lambda with Bedrock, you already have &lt;code&gt;boto3&lt;/code&gt;. The &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-call.html" rel="noopener noreferrer"&gt;Bedrock Converse API&lt;/a&gt; handles tool use natively. The framework is mostly just getting in the way.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Idea: Markdown as Configuration
&lt;/h2&gt;

&lt;p&gt;What if agent behavior were just a markdown file?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;support-agent&lt;/span&gt;
&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1.0.0&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Handles&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;customer&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;support&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;queries"&lt;/span&gt;
&lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;support&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;customer&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Support Agent&lt;/span&gt;

&lt;span class="gu"&gt;## When to Use&lt;/span&gt;
Activated for all customer-facing support requests.

&lt;span class="gu"&gt;## Process&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Greet the customer
&lt;span class="p"&gt;2.&lt;/span&gt; Use &lt;span class="sb"&gt;`search_docs`&lt;/span&gt; to find relevant documentation
&lt;span class="p"&gt;3.&lt;/span&gt; If the issue requires escalation, use &lt;span class="sb"&gt;`create_ticket`&lt;/span&gt;
&lt;span class="p"&gt;4.&lt;/span&gt; Summarize the resolution

&lt;span class="gu"&gt;## Guardrails&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Never share internal pricing or roadmap details
&lt;span class="p"&gt;-&lt;/span&gt; Always confirm before creating tickets
&lt;span class="p"&gt;-&lt;/span&gt; Keep responses under 3 paragraphs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This markdown file &lt;em&gt;is&lt;/em&gt; the system prompt. The frontmatter provides metadata for routing. The sections give the LLM structured instructions. You can read it, diff it, review it in a PR — no code changes needed to adjust agent behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Runtime Works
&lt;/h2&gt;

&lt;p&gt;The engine is a simple loop:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load the skill markdown file as the system prompt&lt;/li&gt;
&lt;li&gt;Append any shared rules (company context, formatting guidelines)&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;bedrock-runtime.converse()&lt;/code&gt; with the user message and tool specs&lt;/li&gt;
&lt;li&gt;If the model wants to use a tool, route it to the handler function&lt;/li&gt;
&lt;li&gt;Feed the tool result back and loop&lt;/li&gt;
&lt;li&gt;Return the final text response&lt;/li&gt;
&lt;/ol&gt;
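&lt;p&gt;Condensed, the loop above looks roughly like this (a sketch, not the module’s actual engine; the model ID is illustrative, and retries and safety limits are omitted):&lt;/p&gt;

```python
def run_loop(client, system_prompt, messages, tool_specs, handler, max_turns=10):
    """Minimal Bedrock Converse tool loop. client is a bedrock-runtime client."""
    for _ in range(max_turns):
        resp = client.converse(
            modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # illustrative
            system=[{"text": system_prompt}],
            messages=messages,
            toolConfig={"tools": tool_specs},
        )
        msg = resp["output"]["message"]
        messages.append(msg)
        if resp["stopReason"] != "tool_use":
            # Final answer: return the first text block
            return next(b["text"] for b in msg["content"] if "text" in b)
        # Route each requested tool call to its handler, feed results back
        results = []
        for block in msg["content"]:
            if "toolUse" in block:
                use = block["toolUse"]
                out = handler(use["name"], use["input"])
                results.append({"toolResult": {
                    "toolUseId": use["toolUseId"],
                    "content": [{"text": out}],
                }})
        messages.append({"role": "user", "content": results})
    return "Stopped: max turns reached."
```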

&lt;p&gt;Here's the actual function signature:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;runtime.engine&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;run_agent&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;run_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;skill_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;support-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I can&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t log in to my account&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tool_specs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_tool_specs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tool_handler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_handler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;conversation_history&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The full engine handles Bedrock throttling with exponential backoff, safe error messages (no internal details leaked to users), a max-turns safety limit, and S3 or local filesystem skill loading. And it does all of this in ~140 lines using only &lt;code&gt;boto3&lt;/code&gt;.&lt;/p&gt;
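&lt;p&gt;The throttling handling amounts to a retry wrapper along these lines (a generic sketch; the real engine catches Bedrock’s throttling errors specifically rather than every exception):&lt;/p&gt;

```python
import time

def with_backoff(fn, retries=4, base_delay=0.5, sleep=time.sleep):
    """Retry fn() with exponential backoff; re-raise after the final attempt."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))   # 0.5s, 1s, 2s, ...
```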

&lt;h2&gt;
  
  
  Tools Are Just Functions
&lt;/h2&gt;

&lt;p&gt;No decorators. No base classes. Define a JSON schema for Bedrock's tool spec, write a Python function, register it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tools/specs/support.py
&lt;/span&gt;&lt;span class="n"&gt;SUPPORT_TOOL_SPECS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;toolSpec&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search the knowledge base&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputSchema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Search query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;required&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tools/support.py
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_docs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Your actual search logic here
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;my_search_index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tools/registry.py
&lt;/span&gt;&lt;span class="n"&gt;TOOL_HANDLERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;search_docs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;inp&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The registry is a dictionary. The spec is JSON. The handler is a function. You can test each piece independently.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multi-Agent Delegation
&lt;/h2&gt;

&lt;p&gt;A coordinator skill can delegate to specialized sub-skills:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;coordinator&lt;/span&gt;
&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1.0.0&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Routes requests to specialized agents&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Coordinator&lt;/span&gt;

&lt;span class="gu"&gt;## Process&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Analyze the user's request
&lt;span class="p"&gt;2.&lt;/span&gt; Delegate to &lt;span class="sb"&gt;`support-agent`&lt;/span&gt; for customer issues
&lt;span class="p"&gt;3.&lt;/span&gt; Delegate to &lt;span class="sb"&gt;`ops-agent`&lt;/span&gt; for infrastructure questions
&lt;span class="p"&gt;4.&lt;/span&gt; Handle general conversation directly
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;delegate_to_skill&lt;/code&gt; tool handles the routing. Recursion depth is limited (default: 3 levels) to prevent infinite loops between skills.&lt;/p&gt;
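&lt;p&gt;The depth limit can be as simple as threading a counter through each hop (a hypothetical sketch; &lt;code&gt;run_skill&lt;/code&gt; stands in for the module’s actual skill executor):&lt;/p&gt;

```python
# Sketch of depth-limited delegation between skills. run_skill is any
# callable that executes the named skill and may delegate again with depth + 1.
MAX_DEPTH = 3

def delegate_to_skill(run_skill, skill_name, user_input, depth=0):
    if depth >= MAX_DEPTH:
        return f"Delegation refused: depth limit ({MAX_DEPTH}) reached."
    return run_skill(skill_name, user_input, depth + 1)
```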

&lt;h2&gt;
  
  
  What Terraform Deploys
&lt;/h2&gt;

&lt;p&gt;The module provisions everything you need:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lambda Function + Layer&lt;/td&gt;
&lt;td&gt;Agent runtime&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IAM Role&lt;/td&gt;
&lt;td&gt;Least-privilege Bedrock + DynamoDB access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Gateway (optional)&lt;/td&gt;
&lt;td&gt;HTTP endpoint for Slack webhooks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB Table (optional)&lt;/td&gt;
&lt;td&gt;Thread-based conversation memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EventBridge Rules (optional)&lt;/td&gt;
&lt;td&gt;Scheduled agent tasks (cron)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"agent"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"github.com/45squaredLLC/terraform-module-markdown-agent"&lt;/span&gt;

  &lt;span class="nx"&gt;name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"support-agent"&lt;/span&gt;
  &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"prod"&lt;/span&gt;

  &lt;span class="nx"&gt;source_dir&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${path.module}/src"&lt;/span&gt;
  &lt;span class="nx"&gt;layer_path&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${path.module}/dist/layer.zip"&lt;/span&gt;
  &lt;span class="nx"&gt;bedrock_model_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us.anthropic.claude-sonnet-4-5-20250929-v1:0"&lt;/span&gt;

  &lt;span class="nx"&gt;ssm_parameter_prefixes&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/support-agent/slack/*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="nx"&gt;enable_api_gateway&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;enable_memory_table&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;terraform apply&lt;/code&gt; and you have a working agent with an HTTPS endpoint, conversation memory, and IAM policies scoped to exactly what it needs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conversation Memory
&lt;/h2&gt;

&lt;p&gt;DynamoDB stores conversation history per Slack thread:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Partition key&lt;/strong&gt;: &lt;code&gt;THREAD#{thread_id}&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sort key&lt;/strong&gt;: &lt;code&gt;MSG#{timestamp}#{uuid}&lt;/code&gt; (collision-safe)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TTL&lt;/strong&gt;: Auto-expires after 30 days (configurable)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cap&lt;/strong&gt;: 100 messages per thread to stay within context windows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The runtime loads history automatically when processing a message in an existing thread. No session management code needed.&lt;/p&gt;
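&lt;p&gt;The key scheme above can be sketched as a small item builder (attribute names here are my assumptions; only the &lt;code&gt;THREAD#&lt;/code&gt; and &lt;code&gt;MSG#&lt;/code&gt; shapes and the 30-day TTL come from the design):&lt;/p&gt;

```python
import time
import uuid

TTL_DAYS = 30

def memory_item(thread_id, role, text, now=None):
    """Build one DynamoDB item for a conversation message."""
    now = time.time() if now is None else now
    return {
        "pk": f"THREAD#{thread_id}",
        "sk": f"MSG#{int(now * 1000):013d}#{uuid.uuid4()}",  # collision-safe
        "role": role,
        "text": text,
        "expires_at": int(now) + TTL_DAYS * 86400,           # DynamoDB TTL attr
    }
```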

&lt;h2&gt;
  
  
  Scheduled Agents
&lt;/h2&gt;

&lt;p&gt;Need an agent that runs on a cron schedule? EventBridge handles it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;scheduled_tasks&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"daily-report"&lt;/span&gt;
    &lt;span class="nx"&gt;schedule_expression&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cron(0 13 * * ? *)"&lt;/span&gt;
    &lt;span class="nx"&gt;input&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"scheduled"&lt;/span&gt;
      &lt;span class="nx"&gt;task&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"daily-report"&lt;/span&gt;
      &lt;span class="nx"&gt;slack_channel&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"C123ABC"&lt;/span&gt;
      &lt;span class="nx"&gt;prompt&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Generate the daily operations summary"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same agent, same skills, same tools — just triggered by a schedule instead of a Slack message.&lt;/p&gt;

&lt;h2&gt;
  
  
  Project Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;src/
├── orchestrator/
│   ├── handler.py        # Lambda entry point
│   └── agent.py          # Wires skills + tools
├── runtime/              # Provided by the module
│   ├── engine.py         # ~140-line Bedrock Converse loop
│   ├── handler.py        # Slack event handling
│   ├── memory.py         # DynamoDB conversation store
│   └── delegation.py     # Skill-to-skill routing
├── skills/
│   ├── coordinator.md    # Entry point skill
│   └── support-agent.md  # Domain skill
├── rules/
│   └── formatting.md     # Shared context
└── tools/
    ├── registry.py       # Tool routing
    ├── specs/
    │   └── support.py    # Tool JSON schemas
    └── support.py        # Tool implementations
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Changing agent behavior = editing a markdown file. Adding a tool = writing a function + JSON schema. No framework upgrades, no breaking API changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security
&lt;/h2&gt;

&lt;p&gt;A few things I cared about getting right:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;IAM scoping&lt;/strong&gt;: Policies are locked to the deployment region and specific resource ARNs. Bedrock access is limited to Anthropic models only.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skill validation&lt;/strong&gt;: Skill names are regex-validated to prevent path traversal. S3-loaded skills are size-limited to 1MB.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool error isolation&lt;/strong&gt;: Internal errors return only the exception type to the model — no stack traces or secrets leak into responses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slack verification&lt;/strong&gt;: HMAC-SHA256 signature verification runs before any event processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSM least-privilege&lt;/strong&gt;: Lambda can only read the specific SSM parameter prefixes you declare.&lt;/li&gt;
&lt;/ul&gt;
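&lt;p&gt;The Slack check is the standard scheme from Slack's docs: Slack signs the string &lt;code&gt;v0:{timestamp}:{raw body}&lt;/code&gt; with your signing secret, and you recompute and compare. A minimal stdlib version of that check (not the module's exact code) looks like:&lt;/p&gt;

```python
import hashlib
import hmac
import time

def verify_slack_signature(signing_secret: str, timestamp: str, body: str,
                           signature: str, tolerance: int = 300) -> bool:
    """Verify Slack's X-Slack-Signature header before processing an event."""
    # Reject stale requests to block replay attacks.
    if abs(time.time() - int(timestamp)) > tolerance:
        return False
    # Slack signs "v0:<timestamp>:<raw request body>" with HMAC-SHA256.
    basestring = f"v0:{timestamp}:{body}".encode()
    expected = "v0=" + hmac.new(signing_secret.encode(), basestring,
                                hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(expected, signature)
```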

&lt;h2&gt;
  
  
  When To Use This (and When Not To)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Good fit:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slack bots and chat agents on AWS&lt;/li&gt;
&lt;li&gt;Agents with a handful of well-defined tools&lt;/li&gt;
&lt;li&gt;Teams that want agent behavior in version-controlled markdown&lt;/li&gt;
&lt;li&gt;Serverless-first deployments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Look elsewhere if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need multi-model orchestration (different LLMs per step)&lt;/li&gt;
&lt;li&gt;Your agent requires complex stateful workflows with branching&lt;/li&gt;
&lt;li&gt;You're not on AWS or don't want Bedrock&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone the example&lt;/span&gt;
git clone https://github.com/AIOpsCrew/terraform-module-markdown-agent
&lt;span class="nb"&gt;cd &lt;/span&gt;terraform-module-markdown-agent/examples/slack-bot

&lt;span class="c"&gt;# Build the Lambda layer&lt;/span&gt;
bash ../../scripts/build_layer.sh &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Deploy&lt;/span&gt;
terraform init
terraform apply
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The example includes a working Slack bot with &lt;code&gt;get_time&lt;/code&gt; and &lt;code&gt;get_weather&lt;/code&gt; tools. Swap the skills and tools for your use case.&lt;/p&gt;




&lt;p&gt;The repo is Apache 2.0 licensed. If you're building agents on AWS and tired of fighting frameworks, give it a look: &lt;a href="https://github.com/AIOpsCrew/terraform-module-markdown-agent" rel="noopener noreferrer"&gt;github.com/AIOpsCrew/terraform-module-markdown-agent&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Questions or feedback? Drop a comment or open an issue.&lt;/p&gt;

</description>
      <category>python</category>
      <category>terraform</category>
      <category>ai</category>
      <category>agents</category>
    </item>
    <item>
      <title>How We Cut 500 Unnecessary Contact Center Transfers With a $48 AWS Architecture Change</title>
      <dc:creator>Aaron VanSledright</dc:creator>
      <pubDate>Tue, 10 Mar 2026 16:18:21 +0000</pubDate>
      <link>https://forem.com/avansledright/how-we-cut-500-unnecessary-contact-center-transfers-with-a-48-aws-architecture-change-3036</link>
      <guid>https://forem.com/avansledright/how-we-cut-500-unnecessary-contact-center-transfers-with-a-48-aws-architecture-change-3036</guid>
      <description>&lt;p&gt;Most Amazon Lex failures aren't Lex failures.&lt;/p&gt;

&lt;p&gt;They're speech-to-text failures that Lex gets blamed for.&lt;/p&gt;

&lt;p&gt;I want to walk through a real production problem we solved recently — a contact center bot that worked perfectly in testing and fell apart the moment real customers picked up the phone.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;A client was running Amazon Lex as the front line of their customer-facing voice bot. Thousands of calls per month — routing inquiries, collecting account info, resolving common requests without human agents.&lt;/p&gt;

&lt;p&gt;In QA: flawless.&lt;/p&gt;

&lt;p&gt;In production: chaos.&lt;/p&gt;

&lt;p&gt;Callers were phoning in from cars, construction sites, busy restaurants, and airports. Background noise was destroying speech-to-text accuracy. Lex couldn't match the right intent. Callers got stuck in retry loops, gave up, or got dumped to a live agent — exactly the outcome the bot was built to prevent.&lt;/p&gt;

&lt;p&gt;The client estimated &lt;strong&gt;~5% of all calls&lt;/strong&gt; were being unnecessarily transferred to human agents due to noisy transcriptions. At 10,000 calls per month, that's &lt;strong&gt;500 avoidable transfers&lt;/strong&gt; — each one consuming agent time, increasing wait queues, and frustrating customers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Happens
&lt;/h2&gt;

&lt;p&gt;The default Amazon Lex architecture bundles speech-to-text (STT) and natural language understanding (NLU) into a single pipeline. You send audio in, Lex gives you an intent back. Clean and simple.&lt;/p&gt;

&lt;p&gt;The problem is that Lex's built-in STT isn't optimized for real-world telephony noise. It's designed for reasonably clean audio. The moment you introduce background noise — wind, traffic, restaurant ambience — transcription quality degrades, and bad transcriptions produce wrong intents or no match at all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before (default architecture):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Audio → Lex (STT + NLU)
          ↓
   Garbled transcription
          ↓
   Wrong intent matched
          ↓
   Agent transfer ❌
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix isn't to retrain your bot. The fix is to separate the concerns.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Decouple STT From NLU
&lt;/h2&gt;

&lt;p&gt;Amazon Transcribe is purpose-built for telephony audio. It uses a separate acoustic model trained on phone-quality audio with background noise, and it significantly outperforms Lex's built-in STT in noisy environments.&lt;/p&gt;

&lt;p&gt;The architecture change is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Route audio to Amazon Transcribe&lt;/strong&gt; instead of Lex directly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get clean text back&lt;/strong&gt; from Transcribe&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pass that clean text to Lex&lt;/strong&gt; via &lt;code&gt;RecognizeText&lt;/code&gt; (NLU only — no STT)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda orchestrates&lt;/strong&gt; the handoff between the two services&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;After (decoupled architecture):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Audio → Transcribe (STT)
          ↓
      Clean text
          ↓
  Lex RecognizeText (NLU only)
          ↓
   Correct intent matched ✅
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Lambda function sitting in the middle looks roughly like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="n"&gt;transcribe_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;transcribe-streaming&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;lex_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lexv2-runtime&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_utterance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bot_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bot_alias_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;locale_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Step 1: Transcribe audio to text
&lt;/span&gt;    &lt;span class="n"&gt;transcription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;transcribe_audio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;clean_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;transcripts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;transcript&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 2: Send clean text to Lex for intent matching
&lt;/span&gt;    &lt;span class="n"&gt;lex_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lex_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recognize_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;botId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bot_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;botAliasId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bot_alias_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;localeId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;locale_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sessionId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;clean_text&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;lex_response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The actual streaming implementation uses &lt;code&gt;StartStreamTranscription&lt;/code&gt; for real-time audio — the above is simplified for clarity.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Observability: Don't Ship Blind
&lt;/h2&gt;

&lt;p&gt;One thing we added alongside the architecture change was proper CloudWatch instrumentation. The original setup had almost no visibility into &lt;em&gt;why&lt;/em&gt; calls were failing — just that they were.&lt;/p&gt;

&lt;p&gt;We added custom metrics for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Transcription confidence scores per utterance&lt;/li&gt;
&lt;li&gt;Intent match rate vs. fallback rate&lt;/li&gt;
&lt;li&gt;Utterances that hit the noise threshold and triggered a retry&lt;/li&gt;
&lt;li&gt;Transfer rate by hour of day (useful for spotting shift patterns)&lt;/li&gt;
&lt;/ul&gt;
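&lt;p&gt;Each of these is a one-liner with CloudWatch's &lt;code&gt;put_metric_data&lt;/code&gt; API. As a sketch — the namespace, metric name, and dimensions below are illustrative, not the client's actual ones — the per-utterance confidence metric might be built like this:&lt;/p&gt;

```python
from datetime import datetime, timezone

def transcription_confidence_metric(confidence: float, bot_name: str) -> dict:
    """Build one CloudWatch metric datum for an utterance's transcription confidence."""
    return {
        "MetricName": "TranscriptionConfidence",  # illustrative name
        "Dimensions": [{"Name": "Bot", "Value": bot_name}],
        "Timestamp": datetime.now(timezone.utc),
        "Value": confidence,
        "Unit": "Percent",
    }

# Inside the Lambda handler, the datum is published with the standard API:
#   boto3.client("cloudwatch").put_metric_data(
#       Namespace="ContactCenter/VoiceBot",  # illustrative namespace
#       MetricData=[transcription_confidence_metric(87.5, "support-bot")],
#   )
```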

&lt;p&gt;This gave the client's ops team actual dashboards to monitor bot health in real time — something they'd never had before.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Unnecessary agent transfers&lt;/td&gt;
&lt;td&gt;~500/month&lt;/td&gt;
&lt;td&gt;Near zero&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent time wasted&lt;/td&gt;
&lt;td&gt;$1,000+/month&lt;/td&gt;
&lt;td&gt;Recovered&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Additional AWS cost&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;~$48/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Added latency per utterance&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;100–400ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The 100–400ms latency increase from adding Transcribe in the loop was imperceptible to callers. We monitored it closely for the first two weeks post-deploy and received zero complaints.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Pattern Is Good For
&lt;/h2&gt;

&lt;p&gt;This decoupled STT + NLU pattern is worth knowing about any time you're running Lex in environments where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Callers are mobile (driving, outside, in transit)&lt;/li&gt;
&lt;li&gt;Your customer base includes call centers or field workers&lt;/li&gt;
&lt;li&gt;You're seeing high fallback/retry rates that don't correlate with bad intents&lt;/li&gt;
&lt;li&gt;You have multilingual requirements (Transcribe has broader language support than Lex's built-in STT)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's also a cleaner architecture for testing — you can unit test your NLU layer independently of audio input, which makes bot development significantly faster.&lt;/p&gt;
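&lt;p&gt;Concretely: because &lt;code&gt;recognize_text&lt;/code&gt; takes plain text, the NLU layer can be exercised with any stub object that has a &lt;code&gt;recognize_text&lt;/code&gt; method — no audio fixtures, no AWS calls. The helper and fake client here are illustrative, but the &lt;code&gt;sessionState.intent.name&lt;/code&gt; response shape matches the real Lex V2 runtime:&lt;/p&gt;

```python
def match_intent(lex_client, bot_id: str, bot_alias_id: str,
                 locale_id: str, session_id: str, text: str) -> str:
    """Send already-transcribed text to Lex and return the matched intent name."""
    response = lex_client.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId=locale_id,
        sessionId=session_id,
        text=text,
    )
    return response["sessionState"]["intent"]["name"]

# In a unit test, any stub with a recognize_text method will do.
class FakeLexClient:
    def recognize_text(self, **kwargs):
        name = "CheckBalance" if "balance" in kwargs["text"].lower() else "FallbackIntent"
        return {"sessionState": {"intent": {"name": name}}}
```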

&lt;h2&gt;
  
  
  Cost Breakdown
&lt;/h2&gt;

&lt;p&gt;Amazon Transcribe Streaming is billed per second of audio transcribed (~$0.024/min). At 10,000 calls averaging 3 minutes of active speech:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;10,000 calls × 3 min × $0.024 = ~$720/month
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But you're already paying for Lex's built-in STT in the per-request pricing. The net delta ends up around &lt;strong&gt;$48/month&lt;/strong&gt; for this client's volume — a rounding error compared to the agent time recovered.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Amazon Lex's built-in STT struggles with real-world background noise&lt;/li&gt;
&lt;li&gt;Decouple STT (Amazon Transcribe) from NLU (Lex RecognizeText) using Lambda&lt;/li&gt;
&lt;li&gt;Add CloudWatch metrics so you can actually see what's happening&lt;/li&gt;
&lt;li&gt;500 fewer transfers/month, $1,000+ saved, $48 in additional AWS costs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full case study with architecture diagrams is on the &lt;a href="https://45squared.com/case-study-eliminating-noisy-caller-failures-in-an-amazon-lex-contact-center/" rel="noopener noreferrer"&gt;45Squared blog&lt;/a&gt;. The technical deep dive including the full Lambda implementation is on &lt;a href="https://aiopscrew.com/blog/eliminating-noisy-calls-with-amazon-lex.html" rel="noopener noreferrer"&gt;AIOPSCrew.com&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Building on Amazon Connect or Lex and running into similar issues? I do fixed-scope &lt;a href="https://45squared.com/sprints" rel="noopener noreferrer"&gt;Architecture Sprints&lt;/a&gt; — production-ready in 2 weeks, fixed price, no retainer. Feel free to reach out.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>architecture</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
