Why We Ditched Bedrock Agents for Nova Pro and Built a Custom Orchestrator

Alex Vega — Sun, 05 Apr 2026 07:22:01 +0000

We're building a healthcare prior authorization platform. If you've never dealt with prior auths, congratulations, you've been spared one of the most soul-crushing workflows in American healthcare. Our platform tries to make it less painful. One of our core features is an AI assistant that helps clinical staff review denial cases, check patient eligibility, and generate appeal letters.

We wanted to use Amazon Nova Pro as the foundation model for this particular feature. The reasoning was simple: it's AWS's own model. In our experience, AWS is far more generous with requests-per-minute quotas on its own models, so you're not fighting throttling or provisioned throughput caps. With third-party models on Bedrock, you can run into rate limits that force you to request quota increases or provision dedicated capacity. Nova is just there. No friction, no access requests, no surprise throttling at 2pm on a Tuesday.

To be clear, we still use Claude and other models within Bedrock for different parts of the platform. Our denial letter analysis runs on Claude. Document parsing uses a different model. You pick the right model for the job. But for a conversational agent that needs to call tools reliably at scale without worrying about throttling, Nova Pro made sense.

The obvious choice for orchestration was Bedrock Agents with Action Groups. AWS literally built this for agentic workflows. Define your tools, let the agent decide when to call them, get results back. The marketing material makes it sound like you're about 15 minutes away from production.

We were not 15 minutes away from production.

The Problem: 16 Versions, Zero Reliable Tool Calls

We started with Bedrock Agents using Claude 3 Sonnet. Our agent had 6 tools:

getCaseContext: pulls patient demographics, ICD-10/CPT codes, denial details
detectMissingInfo: scans appeal letters for placeholder gaps
updatePatientInfo: updates demographics from conversational input
regenerateAppeal: re-generates appeal letters with updated data
search_for_payer: looks up insurance payers via Stedi MCP
eligibility_check: runs real-time X12 270/271 eligibility verification

We built 16 versions of the Bedrock Agent. Sixteen. v1 through v16:

v1-v9: Various prompt iterations trying to force tool calling
v10-v13: Simplified instructions, upgraded to Claude 3.5 Sonnet
v14-v16: Added session attributes, made parameters optional

The result every time: the agent responded conversationally instead of calling tools. "I'd be happy to look up that case for you!" Great. So look it up. Call the tool. That's what it's there for.

It would not call the tool.

No matter how explicit the prompt, no matter how we structured the Action Groups, the agent preferred to have a nice chat about calling tools instead of actually calling them. Like a coworker who replies to every Slack message with "great question, let me look into that" and then never looks into it.

Then we tried switching to Nova Pro as the agent model.

The Error That Changed Everything

dependencyFailedException: Dependency resource: received model 
timeout/error exception from Bedrock

Beautiful. Nova Pro with Bedrock Agents and Action Groups throws a dependencyFailedException. The error message is misleading. It sounds like you misconfigured something. You didn't. We triple-checked. Then quadruple-checked. Then started questioning our career choices.

What's actually happening, as best we can tell: Nova Pro can't generate responses fast enough within Bedrock Agents' internal timeout window when Action Groups are involved. The agent framework appears to have its own timeout for model responses that's separate from your Lambda timeout, and Nova Pro doesn't consistently meet it.

This isn't just us. Other builders have reported the same issue on AWS re:Post and DEV Community. The recommended fix in those threads? Switch to Claude Sonnet.

Sure. Or, hear me out, we could just build the thing ourselves.

The Solution: Converse API + Custom Tool Orchestration

Here's the plot twist nobody saw coming: Nova Pro works perfectly fine with the Bedrock Converse API for tool calling. The problem is specifically with the Agents framework wrapping it. The layers of abstraction that are supposed to make your life easier are the exact thing breaking it.

So we cut out the middleman.

Instead of:

User > Bedrock Agent > Action Groups > Lambda functions

We built:

User > Next.js API > Lambda Orchestrator > Converse API (Nova Pro)
                                         > Tool Execution
                                         > Converse API (final response)

Less magic, more control, actually works.

How It Works

Define tools as Converse API tool specifications:

TOOLS = [
    {
        "toolSpec": {
            "name": "getCaseContext",
            "description": "Retrieves comprehensive case information...",
            "inputSchema": {
                "json": {
                    "type": "object",
                    "properties": {},
                    "required": []
                }
            }
        }
    },
    # ... more tools
]
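
The getCaseContext spec above takes no arguments. For a tool that does, the input schema carries ordinary JSON Schema properties. Here is a minimal sketch for updatePatientInfo. The helper make_tool_spec and the field names (firstName, lastName, dateOfBirth) are hypothetical placeholders for illustration, not our real schema:

```python
def make_tool_spec(name, description, properties=None, required=None):
    """Build one Converse API tool entry from a JSON Schema fragment."""
    return {
        "toolSpec": {
            "name": name,
            "description": description,
            "inputSchema": {
                "json": {
                    "type": "object",
                    "properties": properties or {},
                    "required": required or [],
                }
            },
        }
    }


# Hypothetical parameters, for illustration only.
UPDATE_PATIENT_INFO = make_tool_spec(
    "updatePatientInfo",
    "Updates patient demographics from conversational input.",
    properties={
        "firstName": {"type": "string", "description": "Patient first name"},
        "lastName": {"type": "string", "description": "Patient last name"},
        "dateOfBirth": {"type": "string", "description": "Date of birth, YYYY-MM-DD"},
    },
    required=[],  # all optional: the model sends only what the user supplied
)
```

Leaving required empty mirrors the "made parameters optional" approach we tried in v14-v16, except that here it is a choice rather than a workaround.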

Call Nova Pro with tool definitions:

response = bedrock_runtime.converse(
    modelId='amazon.nova-pro-v1:0',
    messages=messages,
    system=system_prompt,  # must be a list of content blocks, e.g. [{'text': '...'}]
    toolConfig={'tools': TOOLS},
    inferenceConfig={
        'maxTokens': 2000,
        'temperature': 0.0
    }
)

Check if Nova wants to use a tool:

if response['stopReason'] == 'tool_use':
    tool_uses = [
        block for block in response['output']['message']['content'] 
        if 'toolUse' in block
    ]

Execute the tools yourself and return results:

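# Keep the assistant turn that requested the tools; it must precede the results
assistant_message = response['output']['message']
tool_results = []

for tool_use in tool_uses:
    tool_name = tool_use['toolUse']['name']
    tool_input = tool_use['toolUse']['input']
    tool_use_id = tool_use['toolUse']['toolUseId']

    # clinic_id, case_id, and case_context are assumed to be in scope
    # from the incoming request
    result = call_tool(tool_name, tool_input, clinic_id, case_id, case_context)

    tool_results.append({
        "toolResult": {
            "toolUseId": tool_use_id,
            "content": [{"json": result}]
        }
    })

# Pass results back to Nova for final response
messages.append(assistant_message)
messages.append({"role": "user", "content": tool_results})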

final_response = bedrock_runtime.converse(
    modelId='amazon.nova-pro-v1:0',
    messages=messages,
    system=system_prompt,
    toolConfig={'tools': TOOLS},
    inferenceConfig={'maxTokens': 1000, 'temperature': 0.0}
)

That's it. Two Converse API calls. One to get tool selections, one to get the final response after tool execution. No agent framework, no Action Groups, no dependencyFailedException. No existential crisis.
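
Two calls cover the common case. If the model asks for another tool after seeing the first results, the same pattern generalizes to a bounded loop. This is a minimal sketch, not our production code: run_tool_loop, execute_tool, and max_rounds are hypothetical names, and client is assumed to be anything with a converse method shaped like the boto3 bedrock-runtime client.

```python
def run_tool_loop(client, messages, system, tools, execute_tool,
                  model_id="amazon.nova-pro-v1:0", max_rounds=5):
    """Call converse until the model stops requesting tools or max_rounds is hit."""
    for _ in range(max_rounds):
        response = client.converse(
            modelId=model_id,
            messages=messages,
            system=system,
            toolConfig={"tools": tools},
            inferenceConfig={"maxTokens": 2000, "temperature": 0.0},
        )
        assistant_message = response["output"]["message"]
        messages.append(assistant_message)

        if response["stopReason"] != "tool_use":
            # Plain answer: join the text blocks and return.
            return "".join(
                block["text"] for block in assistant_message["content"]
                if "text" in block
            )

        # Run every requested tool and send all results back in one user turn.
        tool_results = []
        for block in assistant_message["content"]:
            if "toolUse" not in block:
                continue
            tool_use = block["toolUse"]
            result = execute_tool(tool_use["name"], tool_use["input"])
            tool_results.append({
                "toolResult": {
                    "toolUseId": tool_use["toolUseId"],
                    "content": [{"json": result}],
                }
            })
        messages.append({"role": "user", "content": tool_results})

    raise RuntimeError(f"No final answer after {max_rounds} tool rounds")
```

The max_rounds cap is the part worth keeping even if you change everything else: it stops a confused model from burning through your Lambda timeout one tool call at a time.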

Tool Routing: Local + MCP

Our call_tool function routes to different backends depending on the tool:

def call_tool(tool_name, tool_input, clinic_id, case_id, case_context=None):
    # Stedi MCP tools (insurance payer search + eligibility)
    if tool_name in ['search_for_payer', 'eligibility_check']:
        return call_stedi_mcp(tool_name, tool_input)

    # Pre-fetched case context (no API call needed)
    if tool_name == 'getCaseContext':
        return case_context

    # All other tools call our Next.js API with service auth
    token = get_service_token()
    # ... HTTP request to our API endpoints
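
One thing the router above does not show is what happens when a tool raises. The Converse API lets a toolResult carry a status of "error", so the model can see the failure and respond to it instead of the whole request crashing. A sketch under that assumption follows; build_tool_result is a hypothetical helper, not part of the listing above:

```python
def build_tool_result(tool_use_id, run_tool):
    """Run a zero-argument callable and wrap its outcome as a toolResult block."""
    try:
        result = run_tool()
        return {
            "toolResult": {
                "toolUseId": tool_use_id,
                "content": [{"json": result}],
                "status": "success",
            }
        }
    except Exception as exc:  # report any tool failure back to the model
        return {
            "toolResult": {
                "toolUseId": tool_use_id,
                "content": [{"text": f"Tool failed: {exc}"}],
                "status": "error",
            }
        }
```

You would call it with the routing wrapped in a lambda, for example build_tool_result(tool_use_id, lambda: call_tool(tool_name, tool_input, clinic_id, case_id, case_context)).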

For the Stedi MCP integration, we call their MCP server directly using JSON-RPC 2.0:

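import json
import time
import urllib3

http = urllib3.PoolManager()  # module-level, so warm Lambda invocations reuse it
# MCP_SERVER and STEDI_API_KEY are defined elsewhere in the module

def call_stedi_mcp(tool_name, tool_input):
    payload = {
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": tool_input},
        "id": int(time.time() * 1000)  # millisecond timestamp as the request id
    }

    response = http.request('POST', MCP_SERVER,
        body=json.dumps(payload),
        headers={
            'Authorization': STEDI_API_KEY,
            'Content-Type': 'application/json'
        },
        timeout=30.0
    )
    return json.loads(response.data)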
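
call_stedi_mcp returns the raw JSON-RPC envelope. Assuming the server follows the standard MCP tools/call response shape (a result object with a content list and an optional isError flag, or a top-level error object for protocol failures), a hypothetical helper to unwrap it might look like this. The exact payload Stedi returns is not shown in this post, so treat the field handling as an assumption:

```python
def unwrap_mcp_result(rpc_response):
    """Reduce a JSON-RPC tools/call response to a small dict for a toolResult."""
    # Protocol-level failure: JSON-RPC reports it under a top-level "error" key.
    if "error" in rpc_response:
        err = rpc_response["error"]
        return {"ok": False, "error": err.get("message", "unknown JSON-RPC error")}

    result = rpc_response.get("result", {})
    # Collect the text parts of the MCP content list.
    text = "\n".join(
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    )
    # Tool-level failure: MCP flags it with isError inside the result.
    if result.get("isError"):
        return {"ok": False, "error": text}
    return {"ok": True, "text": text}
```

Returning a plain dict in every branch keeps the value safe to drop into the "json" content of a toolResult.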

Nothing fancy. It just works. Which, after 16 versions of an agent that preferred small talk over tool calls, felt like an unreasonable amount of progress.

What We Gained

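Reliable tool calling. Nova Pro calls tools correctly every time with the Converse API, at least across our workload. Temperature 0.0 keeps the output as close to deterministic as the model allows (it minimizes sampling randomness, though it is not a strict guarantee of identical responses). It took 9 prompt iterations to reach a production-ready system, compared to 16 failed iterations with Bedrock Agents. Sometimes less abstraction is more.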

Full prompt control. Our system prompt is 80+ lines with a decision tree that tells Nova exactly when to call which tool. With Bedrock Agents, you're working within their prompt scaffolding. With the Converse API, the prompt is yours and yours alone.

Debuggability. Every tool call is logged in CloudWatch with input and output. When something goes wrong, we can see exactly what Nova requested and what the tool returned. With Bedrock Agents, you're reading tea leaves in CloudTrail trying to figure out why your agent decided to go rogue.
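
A minimal sketch of that kind of logging, assuming the standard logging module (Lambda forwards its output to CloudWatch Logs). The helper logged_call and the log field names are hypothetical, not our exact format:

```python
import json
import logging
import time

logger = logging.getLogger("orchestrator")
logger.setLevel(logging.INFO)


def logged_call(tool_name, tool_input, run_tool):
    """Run a tool and emit one JSON log line with its input, output, and duration."""
    started = time.monotonic()
    record = {"event": "tool_call", "tool": tool_name, "input": tool_input}
    try:
        output = run_tool()
        record["output"] = output
        return output
    except Exception as exc:
        record["error"] = str(exc)
        raise
    finally:
        record["duration_ms"] = round((time.monotonic() - started) * 1000)
        # default=str keeps non-serializable values from breaking the log line
        logger.info(json.dumps(record, default=str))
```

One JSON object per line means you can filter by tool name or error in CloudWatch Logs Insights instead of grepping free text.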

Performance. No agent overhead. The Lambda has a 120-second timeout and 512MB memory. Typical response time is 3-5 seconds including tool execution. The irony of the agent framework being slower than doing it yourself is not lost on us.

Cost control. Two Converse API calls per interaction plus tool execution costs. That's it. No agent session fees. Our CFO appreciates this.

Lessons Learned

Nova Pro has quirks with thinking tags. If you add instructions like "Never output in thinking section," Nova can silently fail without selecting a tool. We spent a fun afternoon figuring that one out. Rather than forbid the thinking section in the prompt, we let Nova emit it and strip it from the final text with a regex:

import re
final_text = re.sub(r'<thinking>.*?</thinking>', '', final_text, flags=re.DOTALL)

Hyphens in tool names can cause issues with some Bedrock configurations. We use camelCase: getCaseContext, not get-case-context. This one cost us about two hours and a significant amount of coffee.
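
A cheap way to catch this before deploying is to lint the tool list at import time. The pattern below is an assumed team convention (a letter followed by letters, digits, or underscores, up to 64 characters), deliberately stricter than whatever any given Bedrock configuration accepts. It is not a documented Bedrock rule, and check_tool_names is a hypothetical helper:

```python
import re

# Assumed convention, not an official Bedrock constraint.
SAFE_TOOL_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,63}")


def check_tool_names(tools):
    """Return the tool names that break the convention (empty list means OK)."""
    names = [tool["toolSpec"]["name"] for tool in tools]
    return [name for name in names if not SAFE_TOOL_NAME.fullmatch(name)]
```

Both getCaseContext and search_for_payer pass this check; get-case-context does not.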

Pre-fetch context before invoking the orchestrator. Our Next.js API fetches the case context before calling the Lambda, so getCaseContext returns instantly from cached data instead of making another round trip. This is the kind of optimization that seems obvious in retrospect and absolutely was not obvious at the time.

Service account authentication matters. The Lambda uses a Cognito service account with cached tokens (1-hour TTL) to call our API endpoints. Token refresh adds about 200ms when the cache expires. We spent way too long on this part but it's solid now.
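
The caching side of this is small. Here is a sketch, assuming a fetch_token callable that performs the actual Cognito call and returns a token string. TokenCache, fetch_token, and the 60-second early-refresh margin are all hypothetical choices for illustration:

```python
import time


class TokenCache:
    """Cache a service token and refresh it shortly before it expires."""

    def __init__(self, fetch_token, ttl_seconds=3600, skew_seconds=60,
                 clock=time.monotonic):
        self._fetch_token = fetch_token  # hypothetical callable returning a token
        self._ttl = ttl_seconds
        self._skew = skew_seconds        # refresh this many seconds early
        self._clock = clock              # injectable so tests can control time
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            self._token = self._fetch_token()
            self._expires_at = now + self._ttl
        return self._token
```

Created once at module level, an instance like this survives across warm Lambda invocations, so only cold starts and expirations pay the refresh cost.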

When to Use This Pattern

Use the Converse API directly instead of Bedrock Agents when:

You need Nova Pro specifically and hit the dependencyFailedException
You need deterministic, reliable tool calling
You want full control over prompt engineering and tool routing
You're integrating with external MCP servers or APIs
You need clear debugging and logging of tool executions
You want to minimize latency and cost

Use Bedrock Agents when:

You're using Claude models (they work reliably with Agents)
You need multi-turn agent sessions with built-in memory
You want AWS to handle the orchestration loop
Your tools are simple Lambda functions without complex routing

The Architecture

                    +-----------------------+
                    |   Next.js Frontend    |
                    |  (Auth + Case Fetch)  |
                    +-----------+-----------+
                                |
                    +-----------v-----------+
                    |  Lambda Orchestrator  |
                    |  (Python 3.11, 120s)  |
                    +-----------+-----------+
                                |
              +-----------------+------------------+
              |                                    |
    +---------v---------+              +-----------v-----------+
    | Bedrock Converse  |              |    Tool Execution     |
    | Nova Pro v1       |              |                       |
    | (tool selection   |              |  - Case Context       |
    |  + final response)|              |  - Detect Missing     |
    +-------------------+              |  - Update Patient     |
                                       |  - Regen Appeal       |
                                       |  - Stedi MCP (payer)  |
                                       |  - Stedi MCP (elig)   |
                                       +-----------------------+

The entire orchestrator is a single Lambda function, about 550 lines of Python. No frameworks, no SDKs beyond boto3 and urllib3. It does one thing well: route between Nova Pro and our tools.

We've been running this in production since November 2025 for healthcare prior authorization workflows. It handles denial case analysis, patient eligibility verification via Stedi's MCP server, and appeal letter generation.

If you're hitting the same Bedrock Agents limitations with Nova Pro, stop fighting it. The Converse API is right there. Sometimes the best framework is no framework.

Alex Vega builds AI-powered prior authorization tools for healthcare providers at EasyPA.

Tags: aws bedrock ai healthcare
