<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Alex Vega</title>
    <description>The latest articles on Forem by Alex Vega (@alex_vega_77).</description>
    <link>https://forem.com/alex_vega_77</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3861883%2F11f03d5a-394a-4526-935d-5692a18fba9c.png</url>
      <title>Forem: Alex Vega</title>
      <link>https://forem.com/alex_vega_77</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/alex_vega_77"/>
    <language>en</language>
    <item>
      <title>Why We Ditched Bedrock Agents for Nova Pro and Built a Custom Orchestrator</title>
      <dc:creator>Alex Vega</dc:creator>
      <pubDate>Sun, 05 Apr 2026 07:22:01 +0000</pubDate>
      <link>https://forem.com/alex_vega_77/why-we-ditched-bedrock-agents-for-nova-pro-and-built-a-custom-orchestrator-364h</link>
      <guid>https://forem.com/alex_vega_77/why-we-ditched-bedrock-agents-for-nova-pro-and-built-a-custom-orchestrator-364h</guid>
      <description>&lt;p&gt;We're building a healthcare prior authorization platform. If you've never dealt with prior auths, congratulations, you've been spared one of the most soul-crushing workflows in American healthcare. Our platform tries to make it less painful. One of our core features is an AI assistant that helps clinical staff review denial cases, check patient eligibility, and generate appeal letters.&lt;/p&gt;

&lt;p&gt;We wanted to use Amazon Nova Pro as the foundation model for this particular feature. The reasoning was simple: it's AWS's own model. AWS removes most calls-per-minute limitations on their own models, so you're not fighting throttling issues or provisioned throughput caps. With third-party models on Bedrock you can run into rate limits that require you to request increases or provision dedicated capacity. Nova is just there. No friction, no access requests, no surprise throttling at 2pm on a Tuesday.&lt;/p&gt;

&lt;p&gt;To be clear, we still use Claude and other models within Bedrock for different parts of the platform. Our denial letter analysis runs on Claude. Document parsing uses a different model. You pick the right model for the job. But for a conversational agent that needs to call tools reliably at scale without worrying about throttling, Nova Pro made sense.&lt;/p&gt;

&lt;p&gt;The obvious choice for orchestration was Bedrock Agents with Action Groups. AWS literally built this for agentic workflows. Define your tools, let the agent decide when to call them, get results back. The marketing material makes it sound like you're about 15 minutes away from production.&lt;/p&gt;

&lt;p&gt;We were not 15 minutes away from production.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: 16 Versions, Zero Reliable Tool Calls
&lt;/h2&gt;

&lt;p&gt;We started with Bedrock Agents using Claude 3 Sonnet. Our agent had 6 tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;getCaseContext&lt;/strong&gt; pulls patient demographics, ICD-10/CPT codes, denial details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;detectMissingInfo&lt;/strong&gt; scans appeal letters for placeholder gaps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;updatePatientInfo&lt;/strong&gt; updates demographics from conversational input&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;regenerateAppeal&lt;/strong&gt; re-generates appeal letters with updated data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;search_for_payer&lt;/strong&gt; looks up insurance payers via Stedi MCP&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eligibility_check&lt;/strong&gt; runs real-time X12 270/271 eligibility verification&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We built 16 versions of the Bedrock Agent. Sixteen. v1 through v16:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;v1-v9: Various prompt iterations trying to force tool calling&lt;/li&gt;
&lt;li&gt;v10-v13: Simplified instructions, upgraded to Claude 3.5 Sonnet&lt;/li&gt;
&lt;li&gt;v14-v16: Added session attributes, made parameters optional&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result every time: the agent responded conversationally instead of calling tools. "I'd be happy to look up that case for you!" Great. So look it up. Call the tool. That's what it's there for.&lt;/p&gt;

&lt;p&gt;It would not call the tool.&lt;/p&gt;

&lt;p&gt;No matter how explicit the prompt, no matter how we structured the Action Groups, the agent preferred to have a nice chat about calling tools instead of actually calling them. Like a coworker who replies to every Slack message with "great question, let me look into that" and then never looks into it.&lt;/p&gt;

&lt;p&gt;Then we tried switching to Nova Pro as the agent model.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Error That Changed Everything
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dependencyFailedException: Dependency resource: received model 
timeout/error exception from Bedrock
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Beautiful. Nova Pro with Bedrock Agents and Action Groups throws a &lt;code&gt;dependencyFailedException&lt;/code&gt;. The error message is misleading. It sounds like you misconfigured something. You didn't. We triple-checked. Then quadruple-checked. Then started questioning our career choices.&lt;/p&gt;

&lt;p&gt;What's actually happening: Nova Pro can't generate responses fast enough within Bedrock Agents' internal timeout window when Action Groups are involved. The agent framework has its own timeout for model responses that's separate from your Lambda timeout, and Nova Pro doesn't consistently meet it.&lt;/p&gt;

&lt;p&gt;This isn't just us. Other builders have reported the same issue on &lt;a href="https://repost.aws/questions/QUwyrhlEugTH-DV_dH5lG3Ag/issue-with-bedrock-agents-and-lambda-function-model-timeout-error" rel="noopener noreferrer"&gt;AWS re:Post&lt;/a&gt; and &lt;a href="https://dev.to/aws-builders/how-to-solve-dependencyfailedexception-on-a-multi-agent-in-aws-bedrock-3fdi"&gt;DEV Community&lt;/a&gt;. The recommended fix in those threads? Switch to Claude Sonnet.&lt;/p&gt;

&lt;p&gt;Sure. Or, hear me out, we could just build the thing ourselves.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Converse API + Custom Tool Orchestration
&lt;/h2&gt;

&lt;p&gt;Here's the plot twist nobody saw coming: Nova Pro works perfectly fine with the Bedrock Converse API for tool calling. The problem is specifically with the Agents framework wrapping it. The layers of abstraction that are supposed to make your life easier are the exact thing breaking it.&lt;/p&gt;

&lt;p&gt;So we cut out the middleman.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User &amp;gt; Bedrock Agent &amp;gt; Action Groups &amp;gt; Lambda functions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We built:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User &amp;gt; Next.js API &amp;gt; Lambda Orchestrator &amp;gt; Converse API (Nova Pro)
                                         &amp;gt; Tool Execution
                                         &amp;gt; Converse API (final response)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Less magic, more control, actually works.&lt;/p&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;Define tools as Converse API tool specifications:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;TOOLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;toolSpec&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;getCaseContext&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Retrieves comprehensive case information...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputSchema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;required&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="c1"&gt;# ... more tools
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Call Nova Pro with tool definitions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock_runtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;converse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;amazon.nova-pro-v1:0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;toolConfig&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;inferenceConfig&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;maxTokens&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check if Nova wants to use a tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;stopReason&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tool_use&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;tool_uses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;toolUse&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execute the tools yourself and return results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tool_use&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tool_uses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tool_use&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;toolUse&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;tool_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tool_use&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;toolUse&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;tool_use_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tool_use&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;toolUse&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;toolUseId&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;tool_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;toolResult&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;toolUseId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_use_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Pass results back to Nova for final response
&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_results&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;final_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock_runtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;converse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;amazon.nova-pro-v1:0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;toolConfig&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;inferenceConfig&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;maxTokens&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Two Converse API calls. One to get tool selections, one to get the final response after tool execution. No agent framework, no Action Groups, no &lt;code&gt;dependencyFailedException&lt;/code&gt;. No existential crisis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tool Routing: Local + MCP
&lt;/h2&gt;

&lt;p&gt;Our &lt;code&gt;call_tool&lt;/code&gt; function routes to different backends depending on the tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;clinic_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;case_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;case_context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Stedi MCP tools (insurance payer search + eligibility)
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;search_for_payer&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;eligibility_check&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;call_stedi_mcp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Pre-fetched case context (no API call needed)
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;getCaseContext&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;case_context&lt;/span&gt;

    &lt;span class="c1"&gt;# All other tools call our Next.js API with service auth
&lt;/span&gt;    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_service_token&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# ... HTTP request to our API endpoints
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the Stedi MCP integration, we call their MCP server directly using JSON-RPC 2.0:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call_stedi_mcp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jsonrpc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;method&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools/call&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;params&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arguments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MCP_SERVER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;STEDI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;30.0&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing fancy. It just works. Which, after 16 versions of an agent that preferred small talk over tool calls, felt like an unreasonable amount of progress.&lt;/p&gt;

&lt;h2&gt;
  
  
  What We Gained
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Reliable tool calling.&lt;/strong&gt; Nova Pro calls tools correctly every time with the Converse API. Temperature 0.0 makes it deterministic. In 9 prompt iterations we had a production-ready system, compared to 16 failed iterations with Bedrock Agents. Sometimes less abstraction is more.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full prompt control.&lt;/strong&gt; Our system prompt is 80+ lines with a decision tree that tells Nova exactly when to call which tool. With Bedrock Agents, you're working within their prompt scaffolding. With the Converse API, the prompt is yours and yours alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debuggability.&lt;/strong&gt; Every tool call is logged in CloudWatch with input and output. When something goes wrong, we can see exactly what Nova requested and what the tool returned. With Bedrock Agents, you're reading tea leaves in CloudTrail trying to figure out why your agent decided to go rogue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance.&lt;/strong&gt; No agent overhead. The Lambda has a 120-second timeout and 512MB memory. Typical response time is 3-5 seconds including tool execution. The irony of the agent framework being slower than doing it yourself is not lost on us.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost control.&lt;/strong&gt; Two Converse API calls per interaction plus tool execution costs. That's it. No agent session fees. Our CFO appreciates this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Nova Pro has quirks with thinking tags.&lt;/strong&gt; If you add instructions like "Never output in thinking section," Nova can silently fail without selecting a tool. We spent a fun afternoon figuring that one out. We handle it with a regex cleanup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;
&lt;span class="n"&gt;final_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;thinking&amp;gt;.*?&amp;lt;/thinking&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;final_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DOTALL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Hyphens in tool names can cause issues&lt;/strong&gt; with some Bedrock configurations. We use camelCase: &lt;code&gt;getCaseContext&lt;/code&gt;, not &lt;code&gt;get-case-context&lt;/code&gt;. This one cost us about two hours and a significant amount of coffee.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pre-fetch context before invoking the orchestrator.&lt;/strong&gt; Our Next.js API fetches the case context before calling the Lambda, so &lt;code&gt;getCaseContext&lt;/code&gt; returns instantly from cached data instead of making another round trip. This is the kind of optimization that seems obvious in retrospect and absolutely was not obvious at the time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Service account authentication matters.&lt;/strong&gt; The Lambda uses a Cognito service account with cached tokens (1-hour TTL) to call our API endpoints. Token refresh adds about 200ms when the cache expires. We spent way too long on this part but it's solid now.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use This Pattern
&lt;/h2&gt;

&lt;p&gt;Use the Converse API directly instead of Bedrock Agents when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need Nova Pro specifically and hit the &lt;code&gt;dependencyFailedException&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;You need deterministic, reliable tool calling&lt;/li&gt;
&lt;li&gt;You want full control over prompt engineering and tool routing&lt;/li&gt;
&lt;li&gt;You're integrating with external MCP servers or APIs&lt;/li&gt;
&lt;li&gt;You need clear debugging and logging of tool executions&lt;/li&gt;
&lt;li&gt;You want to minimize latency and cost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use Bedrock Agents when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're using Claude models (they work reliably with Agents)&lt;/li&gt;
&lt;li&gt;You need multi-turn agent sessions with built-in memory&lt;/li&gt;
&lt;li&gt;You want AWS to handle the orchestration loop&lt;/li&gt;
&lt;li&gt;Your tools are simple Lambda functions without complex routing&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    +-----------------------+
                    |    Next.js Frontend    |
                    |  (Auth + Case Fetch)   |
                    +-----------+-----------+
                                |
                    +-----------v-----------+
                    |  Lambda Orchestrator   |
                    |  (Python 3.11, 120s)   |
                    +-----------+-----------+
                                |
              +-----------------+------------------+
              |                                    |
    +---------v---------+              +-----------v-----------+
    |  Bedrock Converse  |              |     Tool Execution    |
    |  Nova Pro v1       |              |                       |
    |  (tool selection   |              |  - Case Context       |
    |   + final response)|              |  - Detect Missing     |
    +--------------------+              |  - Update Patient     |
                                       |  - Regen Appeal       |
                                       |  - Stedi MCP (payer)  |
                                       |  - Stedi MCP (elig)   |
                                       +-----------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entire orchestrator is a single Lambda function, about 550 lines of Python. No frameworks, no SDKs beyond boto3 and urllib3. It does one thing well: route between Nova Pro and our tools.&lt;/p&gt;

&lt;p&gt;We've been running this in production since November 2025 for healthcare prior authorization workflows. It handles denial case analysis, patient eligibility verification via Stedi's MCP server, and appeal letter generation.&lt;/p&gt;

&lt;p&gt;If you're hitting the same Bedrock Agents limitations with Nova Pro, stop fighting it. The Converse API is right there. Sometimes the best framework is no framework.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Alex Vega builds AI-powered prior authorization tools for healthcare providers at EasyPA.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;aws&lt;/code&gt; &lt;code&gt;bedrock&lt;/code&gt; &lt;code&gt;ai&lt;/code&gt; &lt;code&gt;healthcare&lt;/code&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>architecture</category>
      <category>aws</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
